[PROD] Ingest Appeal Docs #2007
We could use such API endpoints instead of scraping:
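For reference, a minimal sketch of pulling appeal documents from an API endpoint rather than scraping. The base URL is a placeholder, and the `appeal_document` path plus the DRF-style pagination fields (`results`/`next`) are assumptions based on the endpoint name mentioned in this thread.

```python
# Sketch only: base URL, endpoint path and pagination fields are assumptions.
import requests

BASE_URL = "https://prod.example.org/api/v2"  # placeholder for the prod API host

def fetch_appeal_documents(base_url=BASE_URL, limit=100):
    """Yield appeal documents from a paginated 'appeal_document' list endpoint."""
    url = f"{base_url}/appeal_document/?limit={limit}"
    while url:
        resp = requests.get(url, timeout=30)
        resp.raise_for_status()
        data = resp.json()
        yield from data.get("results", [])
        url = data.get("next")  # follow DRF-style pagination until exhausted
```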
@arunissun, would you mind comparing the staging and prod data of the 'appeal_document' endpoint?
There may be some differences; prod and staging counts:
@tovari I will check the staging and prod data for appeal documents.
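A rough sketch of how such a comparison could be scripted; both hostnames below are placeholders, and the `count` field assumes a DRF-style list response.

```python
# Sketch only: hostnames are placeholders; 'count' assumes a DRF-style list API.
import requests

PROD_URL = "https://prod.example.org/api/v2/appeal_document/"        # placeholder
STAGING_URL = "https://staging.example.org/api/v2/appeal_document/"  # placeholder

def get_count(url):
    """Return the total record count reported by the list endpoint."""
    resp = requests.get(url, params={"limit": 1}, timeout=30)
    resp.raise_for_status()
    return resp.json().get("count")

prod_count = get_count(PROD_URL)
staging_count = get_count(STAGING_URL)
print(f"prod: {prod_count}, staging: {staging_count}, diff: {prod_count - staging_count}")
```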
Thanks @arunissun for the analysis. Looks like the scraper doesn't work well anymore on prod (there are missing docs from 2024).
Issue
Recently the `ingest_appeal_docs` job has not been running properly.
Without a header hack (using personal cookie data), scraping www.ifrc.org/appeals/ returns:
'reason': 'Forbidden',
'status': 403
(Also, `pip install brotlipy` is needed to successfully decompress the data.)
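For illustration, a sketch of the kind of header hack described above: sending browser-like headers (and, per the report, personal cookie data) so the request is not rejected with 403. All header values are placeholders, and the brotlipy note reflects that requests/urllib3 can only decode Brotli (`br`) responses when a brotli package is installed.

```python
# Sketch only: header values are placeholders, not the scraper's actual settings.
import requests

url = "https://www.ifrc.org/appeals/"
headers = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
    "Accept": "text/html,application/xhtml+xml",
    "Accept-Encoding": "gzip, deflate, br",
    # "Cookie": "...",  # the report says personal cookie data was needed; not a sustainable fix
}

resp = requests.get(url, headers=headers, timeout=30)
# With `pip install brotlipy` (or `brotli`) installed, requests/urllib3 can
# transparently decode a Brotli-encoded ('br') response body.
print(resp.status_code, len(resp.text))
```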
The bigger issue is: why is this failure invisible in the cronjob items? Normally an erroneous run should show up there and raise a big warning message.
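One way to make such a failure visible would be to wrap the job so any exception is recorded as an erroneous run before being re-raised. This is only a sketch; `record_cronjob_result` and the status strings are hypothetical names, not the project's actual cronjob API.

```python
# Sketch only: record_cronjob_result and the status strings are hypothetical.
import logging

logger = logging.getLogger(__name__)

def record_cronjob_result(name, status, message=""):
    """Hypothetical hook; in the real project this would create a cronjob item."""
    level = logging.ERROR if status == "ERROR" else logging.INFO
    logger.log(level, "cronjob %s finished with status=%s %s", name, status, message)

def run_ingest_appeal_docs(ingest):
    """Run the ingest callable and make sure failures surface as erroneous runs."""
    try:
        ingest()
    except Exception as exc:
        # Record the failure so it appears in the cronjob items with a warning
        # instead of disappearing silently, then re-raise.
        record_cronjob_result("ingest_appeal_docs", "ERROR", str(exc))
        raise
    record_cronjob_result("ingest_appeal_docs", "SUCCESS")
```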
@batpad @thenav56