Tumblr list index out of range #129

Closed
eoop opened this issue Dec 2, 2018 · 4 comments

eoop commented Dec 2, 2018

Hello. It appears that Tumblr blogs with more than 1166 items throw this particular index error. I have the sqlite archive enabled. I've attached the verbose output below.

tumblr: Traceback
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/gallery_dl/job.py", line 51, in run
    for msg in self.extractor:
  File "/usr/local/lib/python3.7/site-packages/gallery_dl/extractor/tumblr.py", line 113, in items
    yield self._prepare_image(url, post)
  File "/usr/local/lib/python3.7/site-packages/gallery_dl/extractor/tumblr.py", line 161, in _prepare_image
    post["hash"] = parts[1] if parts[1] != "inline" else parts[2]
IndexError: list index out of range

Contributor

Hrxn commented Dec 2, 2018

If by 1166 items you mean the number of posts on a certain blog (postcount), then this is not the issue; I have already tried it with blogs containing far more posts in the past. And unless they have fundamentally changed their site in the last few weeks (which is not the case, as far as I can tell), I can guarantee you that this should work just as before.

So the crux has to be the specific blog you are trying to process, or maybe rather just one specific post on said blog. Can you provide a link so that someone (like me) can try to reproduce the behavior shown here? Or is there some reason not to share this example?

Author

eoop commented Dec 2, 2018

Gotcha. Here’s the blog: http://tingtongten.tumblr.com (NSFW)

Owner

mikf commented Dec 2, 2018

The offending post is http://tingtongten.tumblr.com/post/118653991558.
It contains an image whose URL doesn't follow the usual Tumblr pattern, but links directly to its source (http://i.gyazo.com/d033577d05901a8da1b0847628b7c10e.png).
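
To see why such a URL trips the extractor, here is a minimal sketch. It assumes that `parts` in the traceback comes from splitting the image's file name on underscores; that is a guess based on the failing line, not something taken from the gallery-dl source. The helper `name_parts` and the Tumblr-style file name are made-up placeholders.

```python
import posixpath
from urllib.parse import urlsplit


def name_parts(url):
    """Split the file name (without extension) of `url` on underscores."""
    filename = posixpath.basename(urlsplit(url).path)
    name, _ext = posixpath.splitext(filename)
    return name.split("_")


# Hypothetical Tumblr-style file name: "tumblr_<hash>_<width>.<ext>"
tumblr_url = "https://example.invalid/tumblr_abc123_1280.png"
# The direct source link from the offending post
gyazo_url = "http://i.gyazo.com/d033577d05901a8da1b0847628b7c10e.png"

print(name_parts(tumblr_url))  # three parts, so parts[1] exists
print(name_parts(gyazo_url))   # a single part, so parts[1] raises IndexError
```

Under that assumption, any image whose file name contains no underscore would hit the same error, regardless of how many posts the blog has.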

mikf closed this as completed in 9563641 Dec 2, 2018
Contributor

Hrxn commented Dec 2, 2018

mikf being very fast again.. 😄

At least I'm here to confirm 9563641 as working:

PS E:\Test\Temp> gallery-dl -d . "http://tingtongten.tumblr.com/post/118653991558"
[tumblr][error] An unexpected error occurred: IndexError - list index out of range. Please run gallery-dl again with the --verbose flag, copy its output and report this issue on https://github.com/mikf/gallery-dl/issues .
PS E:\Test\Temp> pip --quiet install --upgrade "https://github.com/mikf/gallery-dl/archive/master.zip"
PS E:\Test\Temp> gallery-dl -d . "http://tingtongten.tumblr.com/post/118653991558"
* .\Tumblr\_Posts\tingtongten_118653991558_1_this-was-a-perfect-setup-with-your-icon-waggle.png
PS E:\Test\Temp>

Note: The \Tumblr\_Posts\ part in the output is caused by my custom config. Others will probably see something different here.

But I have one more question about the changes in that commit:

try:
    post["hash"] = parts[1] if parts[1] != "inline" else parts[2]
except IndexError:
    post["hash"] = ""

What happens if a blog has more than one of these "problematic" posts, so that gallery-dl encounters the IndexError several times and sets post["hash"] = "" for each of them? This alone is not an issue, obviously, but what if someone does not use the default settings for archive and/or filename output (like archive_fmt = "{id}_{num}"), but relies on {hash} being used for the archive file?
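
The concern can be illustrated with a small sketch. It models the archive as a plain Python set rather than gallery-dl's actual sqlite archive, and the `"{hash}"` archive format, the helper names, and the post IDs are all made up for illustration.

```python
def archive_key(fmt, post):
    """Build an archive key from a format string and a post's metadata."""
    return fmt.format(**post)


def download_all(posts, fmt):
    """Return the posts that would be downloaded; skip keys already seen."""
    seen = set()  # stand-in for the sqlite archive
    downloaded = []
    for post in posts:
        key = archive_key(fmt, post)
        if key in seen:
            continue  # treated as "already archived"
        seen.add(key)
        downloaded.append(post)
    return downloaded


# Two made-up posts whose hash fell back to the empty string
posts = [
    {"id": 1, "num": 1, "hash": ""},
    {"id": 2, "num": 1, "hash": ""},
]

by_id = download_all(posts, "{id}_{num}")  # both posts are kept
by_hash = download_all(posts, "{hash}")    # second post collides with the first
print(len(by_id), len(by_hash))
```

With the ID-based format both posts get distinct keys, while a hash-only format gives every "problematic" post the same empty key, so all but the first would be skipped as already archived.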

mikf added a commit that referenced this issue Dec 4, 2018
While a filename might not be a real 'hash', or comparable to what
Tumblr usually provides, it is still better than an empty string.
At least as long as "alternatives" in format strings aren't implemented.
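
A rough sketch of the idea described in this commit message, not the actual diff: when the name has no Tumblr-style second part, fall back to the file name itself instead of an empty string. The function name `extract_hash`, the underscore splitting, and the sample names are assumptions made for illustration.

```python
def extract_hash(name):
    """Return the hash-like part of an image name, or the name as fallback."""
    parts = name.split("_")
    try:
        return parts[1] if parts[1] != "inline" else parts[2]
    except IndexError:
        # No Tumblr-style parts: use the whole name instead of ""
        return name


print(extract_hash("tumblr_abc123_1280"))                # hypothetical Tumblr name
print(extract_hash("d033577d05901a8da1b0847628b7c10e"))  # name from the gyazo link
```

Because distinct externally hosted images usually have distinct file names, this keeps {hash}-based archive keys from collapsing into one empty value.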
mikf added the bug label Dec 14, 2018