Follow dynamically-built URLs #124

Closed
roniemartinez opened this issue Mar 29, 2022 · 0 comments · Fixed by #146

Use case

Sometimes an ID or slug appears only as text inside an element, not as a link, yet it is enough to build a URL that can be "followed".

<div id="project-id">Project: eca514fc</div>

The project ID eca514fc can be extracted and used to build a URL, for example https://example.com/projects/eca514fc.
There should be an option to follow this URL from within a decorated function.
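The extraction step described above can be sketched with the standard library alone. This is only an illustration under stated assumptions: `ProjectIdParser` and `build_project_url` are hypothetical names, not part of this project, and the parser assumes the target element has no nested tags.

```python
from html.parser import HTMLParser
from typing import Optional


class ProjectIdParser(HTMLParser):
    """Collects the text of the element whose id is "project-id".

    Hypothetical helper for illustration; assumes the element is flat
    (no child tags), as in the <div> shown in the issue.
    """

    def __init__(self) -> None:
        super().__init__()
        self._inside = False
        self.text = ""

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if dict(attrs).get("id") == "project-id":
            self._inside = True

    def handle_endtag(self, tag):
        # Any closing tag ends collection (flat-element assumption).
        self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.text += data


def build_project_url(html: str) -> Optional[str]:
    """Return the project URL built from the embedded ID, or None if absent."""
    parser = ProjectIdParser()
    parser.feed(html)
    parser.close()
    # str.removeprefix requires Python 3.9 or later.
    project_id = parser.text.strip().removeprefix("Project:").strip()
    if not project_id:
        return None
    return f"https://example.com/projects/{project_id}"
```

Called on the `<div>` above, `build_project_url` yields the example URL; on markup without a `project-id` element it returns `None`.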

Solution

Implement the solution proposed in #62:

@select(css="#project-id")  # ID selector, matching <div id="project-id"> above
def get_link(element, scraper):  # <-- pass scraper object
    # Strip the "Project:" label to get the bare ID (str.removeprefix needs Python 3.9+)
    project_id = element.text_content().removeprefix("Project:").strip()
    url = f"https://example.com/projects/{project_id}"

    scraper.follow_url(url)  # <-- add to the URLs that will be scraped by the scraper

    return {"project_id": project_id}
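One way to picture what `follow_url` would need to do is a de-duplicating queue that handlers can append to while the scrape is running. The sketch below is an assumption-laden illustration, not the implementation from #146: `SketchScraper`, its `run` method, and the injected `fetch_text` callable are all hypothetical, and the handler receives plain text instead of a real element.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Optional, Set


class SketchScraper:
    """Hypothetical stand-in for the scraper object; not the project's real class."""

    def __init__(self, start_urls: List[str]) -> None:
        self.queue: Deque[str] = deque()
        self.seen: Set[str] = set()
        for url in start_urls:
            self.follow_url(url)

    def follow_url(self, url: str) -> None:
        """Queue a URL for scraping unless it has already been queued."""
        if url not in self.seen:
            self.seen.add(url)
            self.queue.append(url)

    def run(self, fetch_text: Callable[[str], str], handler) -> List[Dict[str, str]]:
        """Visit queued URLs in order; handlers may queue more while running."""
        results = []
        while self.queue:
            url = self.queue.popleft()
            result = handler(fetch_text(url), self)
            if result is not None:
                results.append(result)
        return results


def get_link(text: str, scraper: SketchScraper) -> Optional[Dict[str, str]]:
    """Simplified handler: takes the element text directly (assumption)."""
    if not text.startswith("Project:"):
        return None
    project_id = text.removeprefix("Project:").strip()
    scraper.follow_url(f"https://example.com/projects/{project_id}")
    return {"project_id": project_id}


# Fake pages instead of network access, so the sketch runs offline.
pages = {
    "https://example.com/": "Project: eca514fc",
    "https://example.com/projects/eca514fc": "Details page",
}
scraper = SketchScraper(["https://example.com/"])
results = scraper.run(pages.__getitem__, get_link)
```

Tracking seen URLs keeps a handler that runs on many pages from queuing the same project page repeatedly.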

Final solution: #146

roniemartinez added the enhancement (New feature or request) label Mar 29, 2022
roniemartinez self-assigned this Apr 30, 2022
roniemartinez added this to the Alpha milestone Apr 30, 2022