Lots of missing articles in wikivoyage_en_all_novid_2019-05.zim #724

Closed
Jaifroid opened this issue May 8, 2019 · 8 comments

Comments

@Jaifroid
Collaborator

Jaifroid commented May 8, 2019

Many articles are missing. To see an example, load the page for Singapore, and also open the equivalent page on en.wikivoyage.org. You will see that half of the "district" links are missing, e.g. for Marina Bay, Little India, Sentosa and Harbourfront, etc. These articles are simply missing from the ZIM (they are not in the index either). Similarly on the Africa page, overview articles are missing for North Africa, West Africa, Central Africa, East Africa, East African Islands, etc.

This looks like a bad encode, and I suggest this ZIM should be pulled before it makes its way into Wikivoyage-specific clients.

@ISNIT0
Contributor

ISNIT0 commented May 8, 2019

@kelson42 Not sure if it will make any difference, but it seems the mentioned article was not made with 1.8.6

@kelson42 kelson42 added this to the 1.9 milestone May 9, 2019
@kelson42 kelson42 added the bug label May 9, 2019
@kelson42
Collaborator

kelson42 commented May 9, 2019

@ISNIT0 If it was not made with 1.8.6, then it was made with 1.8.5, which AFAIK should not have changed anything fundamental in how articles are retrieved. As with #173, we really need to make sure that all the content is there. I thought you had fixed that after the missing JS code in 1.8.1 or 1.8.2!
@Jaifroid Thx for reporting. I will remove the file.

@ISNIT0
Contributor

ISNIT0 commented May 9, 2019

I don't think this is anything to do with JavaScript, but I'm looking into the missing articles

@ISNIT0
Contributor

ISNIT0 commented May 9, 2019

I think this is likely to be related to the pageId .replace(/ /g, '_')

@Jaifroid
Collaborator Author

Jaifroid commented May 9, 2019

You're probably right, @ISNIT0 , because I notice that the article Singapore/Little India has an illegal URL in wikivoyage_en_all_novid_2019-05.zim:
Singapore/Little India (URLs should not have spaces). This is the dirEntry for the article:

[image: dirEntry for Singapore/Little India]

By contrast, the same article in wikivoyage_en_all_novid_2019-04.zim has the more reasonable "Singapore/Little_India" (underscore).
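
To make the mismatch concrete, here is a minimal sketch of how a space in the stored URL would make an article look missing. It is not the scraper's actual code: `toArticleUrl` and `findMissingTargets` are hypothetical helpers, and the two entry lists are illustrative stand-ins for the 2019-04 and 2019-05 ZIMs. They assume only what is reported above, namely that links are resolved against the underscore form of the title.

```javascript
// Hypothetical helper (not from the scraper's source): turn a page title
// into the underscore form expected as the article URL in the ZIM.
function toArticleUrl(pageId) {
  return pageId.replace(/ /g, '_');
}

// Hypothetical check: return the link targets whose normalised URL has no
// matching entry among the URLs actually stored in the ZIM.
function findMissingTargets(linkTargets, entryUrls) {
  const known = new Set(entryUrls);
  return linkTargets.filter((target) => !known.has(toArticleUrl(target)));
}

// Illustrative entry lists, assumed from the two dirEntry URLs quoted above.
const entries2019_04 = ['Singapore', 'Singapore/Little_India'];
const entries2019_05 = ['Singapore', 'Singapore/Little India'];

const links = ['Singapore/Little India'];

// With the underscore URL the link resolves, so nothing is reported missing.
console.log(findMissingTargets(links, entries2019_04));

// With the space left in the stored URL, the lookup fails and the article
// appears to be missing even though its content was written to the file.
console.log(findMissingTargets(links, entries2019_05));
```

If this is what happened, the fix would be to apply the same normalisation wherever the URL is written as well as wherever it is looked up.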

@Jaifroid

This comment has been minimized.

@kelson42
Collaborator

@Jaifroid I have hidden your last comment, as this is IMO a separate bug. I have opened a ticket with the same comment here: #726. Thank you for reporting that problem too.

@ISNIT0
Copy link
Contributor

ISNIT0 commented May 10, 2019

This seems to have been inadvertently fixed by #725


3 participants