Update WikiPageviews and spec for new API behavior #5508

Closed
ragesoss opened this issue Oct 17, 2023 · 3 comments · Fixed by #5513

Comments

@ragesoss
Member

We have a handful of specs in wiki_pageviews_spec.rb that recently started failing. It looks like this is because the behavior of the API has changed when handling requests for which no data is available, so our code needs to be updated to handle the current behavior.

The easiest way to get started will be to run that spec, then figure out what changes are needed to make the specs pass.

@gabina
Member

gabina commented Oct 17, 2023

I'm working on this

@gabina
Member

gabina commented Oct 18, 2023

These are the three failing tests:

1) WikiPageviews.views_for_article for an unviewed article returns an empty hash
     Failure/Error: raise PageviewApiError, response
     
     WikiPageviews::PageviewApiError:
       {"detail":"The date(s) you used are valid, but we either do not have data for those date(s), or the project you asked for is not loaded yet. Please check documentation for more information.","method":"get","status":404,"title":"Not Found","type":"about:blank","uri":"/metrics/pageviews/per-article/fr.wikisource/all-access/user/Voyages,_aventures_et_combats%2FChapitre_18/daily/2017040100/2017050100"}

2) WikiPageviews.average_views_for_article for an article that does not exist returns 0
   Failure/Error: raise PageviewApiError, response
   
   WikiPageviews::PageviewApiError:
     {"detail":"The date(s) you used are valid, but we either do not have data for those date(s), or the project you asked for is not loaded yet. Please check documentation for more information.","method":"get","status":404,"title":"Not Found","type":"about:blank","uri":"/metrics/pageviews/per-article/en.wikipedia/all-access/user/THIS_IS_NOT_A_REAL_ARTICLE/daily/2023082900/2023101700"}

3) WikiPageviews.average_views_for_article for an article that exist but has no view data returns 0
   Failure/Error: raise PageviewApiError, response
   
   WikiPageviews::PageviewApiError:
     {"detail":"The date(s) you used are valid, but we either do not have data for those date(s), or the project you asked for is not loaded yet. Please check documentation for more information.","method":"get","status":404,"title":"Not Found","type":"about:blank","uri":"/metrics/pageviews/per-article/fr.wikisource/all-access/user/Voyages,_aventures_et_combats%2FChapitre_18/daily/2023082900/2023101700"}

You can make one of the requests for the failing tests using this link.

As described in the issue, it looks like the pageviews API changed its behavior and now responds with something like this when the article doesn't exist or has no views in the given period:

{
    "detail": "The date(s) you used are valid, but we either do not have data for those date(s), or the project you asked for is not loaded yet. Please check documentation for more information.",
    "method": "get",
    "status": 404,
    "title": "Not Found",
    "type": "about:blank",
    "uri": "/metrics/pageviews/per-article/fr.wikisource/all-access/user/Voyages,_aventures_et_combats%2FChapitre_18/daily/2017040100/2017050100"
}

I wasn't able to find anything about this specific about:blank type in the docs.
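For context, about:blank is the default problem type defined in RFC 7807 ("Problem Details for HTTP APIs"), where it means the problem carries no semantics beyond the HTTP status code. That may be why there is no dedicated documentation page for it, though whether the pageviews API intentionally follows that RFC is an assumption here. A minimal sketch of inspecting those fields, using the body from the second failing spec above; generic_not_found? is a hypothetical helper, not something that exists in WikiPageviews:

```ruby
require 'json'

# Body copied from the failing spec output for THIS_IS_NOT_A_REAL_ARTICLE.
body = <<~JSON
  {
    "detail": "The date(s) you used are valid, but we either do not have data for those date(s), or the project you asked for is not loaded yet. Please check documentation for more information.",
    "method": "get",
    "status": 404,
    "title": "Not Found",
    "type": "about:blank",
    "uri": "/metrics/pageviews/per-article/en.wikipedia/all-access/user/THIS_IS_NOT_A_REAL_ARTICLE/daily/2023082900/2023101700"
  }
JSON

# Hypothetical helper: true when the body uses the generic about:blank
# problem type together with a 404 status.
def generic_not_found?(data)
  data['type'] == 'about:blank' && data['status'] == 404
end

data = JSON.parse(body)
puts generic_not_found?(data)
```

Because about:blank says nothing beyond the status code, checking type and status alone would also match any other generic 404, so the detail text is still the more specific signal.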

I think a good approach to fix the tests could be to modify parse_results in the WikiPageviews class (link to method) to avoid raising PageviewApiError for these cases. @ragesoss do you mind checking this approach? If you're ok with this, I'll do a PR.

  def parse_results(response)
    return unless response
    data = Utils.parse_json(response)
    return data['items'] if data['items']
    # As of October 2017, the data type is https://www.mediawiki.org/wiki/HyperSwitch/errors/not_found
    return no_results if %r{errors/not_found}.match?(data['type'])
    return no_results if no_data_available_response?(data) # this is the new case
    raise PageviewApiError, response
  end

  # As of October 2023, we started to see 404 Not Found responses with the about:blank type
  # and a specific detail message when no data is available for the request.
  def no_data_available_response?(data)
    no_data_available_detail = 'The date(s) you used are valid, but we either do not have data '\
                               'for those date(s), or the project you asked for is not loaded yet.'\
                               ' Please check documentation for more information.'
    data['status'] == 404 && data['detail'] == no_data_available_detail
  end
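
To try the idea outside the app, here is a rough standalone sketch of the same flow. It makes several assumptions: JSON.parse stands in for Utils.parse_json, the PageviewApiError class is redefined locally, and no_results is a stand-in that returns nil (the real method in WikiPageviews may return something else). The sample payloads are made up for illustration, apart from the detail text.

```ruby
require 'json'

# Local stand-in for WikiPageviews::PageviewApiError.
class PageviewApiError < StandardError; end

NO_DATA_DETAIL = 'The date(s) you used are valid, but we either do not have data ' \
                 'for those date(s), or the project you asked for is not loaded yet. ' \
                 'Please check documentation for more information.'

# Stand-in: assumed to return nil here; the real no_results may differ.
def no_results
  nil
end

# True for the 404 "no data" body quoted in the failing specs.
def no_data_available_response?(data)
  data['status'] == 404 && data['detail'] == NO_DATA_DETAIL
end

def parse_results(response)
  return unless response
  data = JSON.parse(response) # stand-in for Utils.parse_json
  return data['items'] if data['items']
  return no_results if %r{errors/not_found}.match?(data['type'])
  return no_results if no_data_available_response?(data)
  raise PageviewApiError, response
end

# Illustrative payloads (only the detail text comes from the real API output).
no_data = JSON.generate('status' => 404, 'type' => 'about:blank', 'detail' => NO_DATA_DETAIL)
with_items = JSON.generate('items' => [{ 'views' => 3 }])
other_error = JSON.generate('status' => 500, 'type' => 'about:blank', 'detail' => 'Server error')

p parse_results(no_data) # => nil
p parse_results(with_items)
begin
  parse_results(other_error)
rescue PageviewApiError
  puts 'raised PageviewApiError'
end
```

Matching on the exact detail string is brittle if the API rewords the message, but it keeps unrelated 404s raising PageviewApiError instead of being silently treated as "no data".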

@ragesoss
Member Author

Yes, that sounds like the right strategy @gabina.
