This repository has been archived by the owner on May 3, 2019. It is now read-only.

Error when scraping captions #8

Open
joaanna opened this issue Jul 3, 2017 · 4 comments

Comments

joaanna commented Jul 3, 2017

Hey, so far crawling followers has gone smoothly, but I have two issues:

  1. I get this error when I try to crawl the captions:
    python instagramcrawler.py -d data -q 'viralnova365' -c -n 10
    dir_prefix: data, query: viralnova365, crawl_type: photos, number: 10, caption: True
    posts: 1660, number: 10
    Scraping photo links...
    Number of photo_links: 25
    Scraping captions...
    Traceback (most recent call last):
    File "instagramcrawler.py", line 297, in <module>
    main()
    File "instagramcrawler.py", line 293, in main
    caption=args.caption)
    File "instagramcrawler.py", line 85, in crawl
    self.click_and_scrape_captions(number)
    File "instagramcrawler.py", line 161, in click_and_scrape_captions
    FIREFOX_FIRST_POST_PATH).click()
    File "/InstagramCrawler/crawl/lib/python3.4/site-packages/selenium/webdriver/remote/webdriver.py", line 313, in find_element_by_xpath
    return self.find_element(by=By.XPATH, value=xpath)
    File "InstagramCrawler/crawl/lib/python3.4/site-packages/selenium/webdriver/remote/webdriver.py", line 791, in find_element
    'value': value})['value']
    File "InstagramCrawler/crawl/lib/python3.4/site-packages/selenium/webdriver/remote/webdriver.py", line 256, in execute
    self.error_handler.check_response(response)
    File "InstagramCrawler/crawl/lib/python3.4/site-packages/selenium/webdriver/remote/errorhandler.py", line 194, in check_response
    raise exception_class(message, screen, stacktrace)
    selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element: //a[contains(@class, '_8mlbc _vbtk2 _t5r8b')]
  2. Also, I would like to crawl all the images, but it never downloads the number specified by -n. Do you have any suggestions?
Owner

tzuhsial commented Jul 4, 2017

Hi @joaanna,
Thank you for telling me!
I'll look into this when I have time...

Owner

tzuhsial commented Jul 7, 2017

@joaanna
I think I fixed the path to the caption, so captions should be crawlable again.
(I guess I'll have to do this every time Instagram updates.)
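
One way to make the locator less sensitive to Instagram's generated class names (such as '_8mlbc _vbtk2 _t5r8b') might be to match on the link target instead. The sketch below is only an illustration, not the crawler's actual code: it assumes post links have an href starting with "/p/", and the names FirstPostLinkFinder and find_first_post_href are made up for this example. It uses the standard library's HTML parser on a page source string so it can run without a browser.

```python
from html.parser import HTMLParser


class FirstPostLinkFinder(HTMLParser):
    """Remember the href of the first <a> whose href starts with a prefix."""

    def __init__(self, href_prefix="/p/"):
        super().__init__()
        self.href_prefix = href_prefix  # assumption: post links look like /p/<id>/
        self.first_href = None

    def handle_starttag(self, tag, attrs):
        # Only anchors matter, and only the first match is kept.
        if tag != "a" or self.first_href is not None:
            return
        href = dict(attrs).get("href")
        if href and href.startswith(self.href_prefix):
            self.first_href = href


def find_first_post_href(page_source, href_prefix="/p/"):
    """Return the first matching href in page_source, or None if there is none."""
    finder = FirstPostLinkFinder(href_prefix)
    finder.feed(page_source)
    return finder.first_href


# Hypothetical markup: the class names differ per link, the href shape does not.
sample = """
<div class="_abc123">
  <a class="_zzz" href="/explore/">Explore</a>
  <a class="_8mlbc _vbtk2 _t5r8b" href="/p/AAA111/">first</a>
  <a class="_different" href="/p/BBB222/">second</a>
</div>
"""

print(find_first_post_href(sample))
```

With Selenium, the same idea would presumably be an XPath along the lines of //a[starts-with(@href, '/p/')], though that is untested against the live site and would still break if the URL layout changes.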

As for the number issue,
I am still looking for a robust way to detect whether new posts have loaded.
Any help is appreciated!
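
A possible approach, sketched here under assumptions rather than taken from the crawler: keep scrolling while the number of posts on the page grows, and give up after a few consecutive scrolls that load nothing new. The function scroll_until_loaded, its parameters, and the FakePage stand-in are all hypothetical names for this example; the two callables are where the real Selenium calls would go.

```python
import time


def scroll_until_loaded(count_items, scroll_once, target, max_stalls=3, pause=0.0):
    """Scroll until `target` items are present or loading appears to stall.

    count_items: callable returning how many posts are currently on the page
    scroll_once: callable that triggers one scroll (or "load more") action
    max_stalls:  consecutive scrolls without new posts before giving up
    pause:       seconds to wait after each scroll so new posts can load
    """
    stalls = 0
    seen = count_items()
    while seen < target and stalls < max_stalls:
        scroll_once()
        time.sleep(pause)
        current = count_items()
        if current > seen:
            # New posts appeared, so reset the stall counter.
            stalls = 0
            seen = current
        else:
            stalls += 1
    return seen


class FakePage:
    """Stand-in for a browser page that loads posts in fixed-size batches."""

    def __init__(self, total, batch=12):
        self.total = total
        self.batch = batch
        self.loaded = min(batch, total)

    def count(self):
        return self.loaded

    def scroll(self):
        self.loaded = min(self.total, self.loaded + self.batch)


page = FakePage(total=30)
print(scroll_until_loaded(page.count, page.scroll, target=25))
```

In the crawler, count_items could wrap something like len(driver.find_elements_by_xpath(...)) and scroll_once could wrap driver.execute_script("window.scrollTo(0, document.body.scrollHeight);"), with a non-zero pause. The stall limit is a heuristic: a slow connection can look the same as the end of the feed, so it may need tuning.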

@anfiallos

Hi. I have the same problem: an error with the value of
FIREFOX_FIRST_POST_PATH.
Any suggestions, please?

@anakmalank

Hi, I got this problem too:
selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element: //div[contains(@class, '_8mlbc _vbtk2 _t5r8b')]
