
Unique subscriber counts #562

Merged
merged 21 commits into master from unique_subscriber_counts
Apr 5, 2019

Conversation

@OwlHute
Contributor

OwlHute commented Apr 3, 2019

Closes #588

I have:

  • Formatted any Python files with black
  • Brought the branch up to date with master
  • Added any relevant Github labels
  • Added tests for any new additions
  • Added or updated any relevant documentation
  • Added an Architectural Decision Record (ADR), if appropriate
  • Added an MPLv2 License Header if appropriate
  • Updated the Changelog

Description

Description of what this PR does, and what it changes.

@OwlHute requested review from greenape and maxalbert April 3, 2019 12:33
Example
-------
>>> unique_subscriber_counts( { "start_date" : "2019-03-20", "end_date" : "2019-03-31", "aggregation_unit" : "admin3" } )
[ ] # TODO - Fill this in
Contributor

Accidentally forgot to delete this TODO item?

Contributor Author

I wasn't sure what goes there. Is there a systematic way to find out? I assume I can try actually running the code and extracting the relevant string from the output.

Contributor

Yes, you'd need to actually run it and copy & paste the output.

Note that in order to be able to successfully run it, you will need to change the dates to something between 2016-01-01 and 2016-01-07 because the data in the test database is between those dates.
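
For illustration, the filled-in example might end up looking something like this (a sketch: the output shown is hypothetical, and assumes that, like the other flowclient helpers, the function returns a query-specification dict):

>>> unique_subscriber_counts(
...     start_date="2016-01-01", end_date="2016-01-07", aggregation_unit="admin3"
... )
{'query_kind': 'unique_subscriber_counts', 'start_date': '2016-01-01', 'end_date': '2016-01-07', 'aggregation_unit': 'admin3'}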

@greenape added the enhancement, FlowAPI, FlowClient and FlowMachine labels Apr 3, 2019
@maxalbert
Contributor

The additions to the API spec (in the file tests/flowmachine_server_tests/test_server.test_api_spec_of_flowmachine_query_schemas.approved.txt) look correct, but they are in the "wrong" order (the keys in the file are sorted alphabetically so that the result is reproducible for the tests).

I assume you edited this file manually? While this is possible, it's easier to let ApprovalTests auto-generate it for you. When you run the relevant integration test (e.g. via cd integration_tests && pipenv run pytest -svx tests/flowmachine_server_tests/test_server.py) it will open a diff tool and allow you to automatically accept the changes.

However, currently it is configured to use opendiff, which I'm not sure is available on Windows. You will want to install some diff tool (see here for some choices) and temporarily change the setting in integration_tests/tests/conftest.py (near the bottom) from opendiff to whatever tool you installed. Apologies that this is a bit clunky at the moment - we should look into supporting a variety of tools out of the box (which we can do in a separate PR once we have got it working for you on Windows).
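
(As an aside, a minimal sketch of why the alphabetical ordering matters; this illustrates the idea rather than FlowKit's actual serialisation code, and the api_spec fragment here is made up:)

import json

# Hypothetical fragment of the query-schema spec (not the real FlowKit spec).
api_spec = {
    "unique_subscriber_counts": {"type": "object"},
    "daily_location": {"type": "object"},
}

# Sorting the keys makes the serialised output byte-for-byte reproducible,
# which is what the approval test compares against the approved file.
print(json.dumps(api_spec, sort_keys=True, indent=2))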

@maxalbert
Contributor

@OwlHute FYI, in PR #566 I added a bit of documentation about the approvaltests-based tests and added a toplevel approvaltests_diff_reporters.json file which allows you to configure the diff tool more easily. See here for the addition to the docs about this.

@codecov

codecov bot commented Apr 4, 2019

Codecov Report

Merging #562 into master will decrease coverage by <.01%.
The diff coverage is 91.66%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #562      +/-   ##
==========================================
- Coverage   92.74%   92.74%   -0.01%     
==========================================
  Files         115      116       +1     
  Lines        6038     6062      +24     
  Branches      672      673       +1     
==========================================
+ Hits         5600     5622      +22     
- Misses        312      314       +2     
  Partials      126      126
Impacted Files                                            Coverage Δ
flowauth/backend/flowauth/models.py                       93.29% <ø> (ø) ⬆️
flowclient/flowclient/client.py                           92.34% <100%> (+0.08%) ⬆️
...e/server/query_schemas/unique_subscriber_counts.py     100% <100%> (ø)
...ine/core/server/query_schemas/flowmachine_query.py     45.94% <33.33%> (-1.12%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c330e5c...3e50c9b. Read the comment docs.

@OwlHute
Contributor Author

OwlHute commented Apr 4, 2019

Re Max's flowmachine_server_tests comment, for now I have manually moved the unique_subscriber_counts entries into alphabetical order, as I have had trouble running the integration_tests:

Firstly, I have established that these tests definitely won't run in Windows, with or without using the FlowKit integration test pipenv environment, because two third-party modules they use, "pglast" and "ujson", are Unix-only. (The only other Unix dependency I've come across in FlowKit is the use of "pwd", for dealing with the Unix password database.)

I am able to run them (without missing-module failures) in "Ubuntu for Windows" in their FlowKit development pipenv environment, which is good. However, this fails with hordes of errors, all of which I think are due to the same thing: a database port not being configured, or its config not being picked up by the test scripts. Here is a sample error message:

  conn = _connect(dsn, connection_factory=connection_factory, **kwasync)

E   sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) could not connect to server: Connection refused
E       Is the server running on host "localhost" (127.0.0.1) and accepting
E       TCP/IP connections on port 9997?

Also, I see that after this commit one or more (?) of the automated CI tests have failed. But I can't see anything relating to Unique Subscriber Counts in the failure summary, or in any of the voluminous waffle following it! I'll copy and paste the full test output spiel and search for any relevant failure reason(s). But at first sight it looks as if the failures may be unrelated to my changes!

@greenape
Member

greenape commented Apr 4, 2019

@OwlHute There's a readme file that explains how to run the integration tests, located in the integration tests folder. Note that you'll need/want to use Pipenv, because various settings are in .env files that pipenv and docker both pick up automatically. You should be able to run the tests using pipenv install && pipenv run run-tests from the integration tests folder.
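
Spelled out, that is:

cd integration_tests
pipenv install
pipenv run run-tests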

The test that is currently failing (which I believe @maxalbert has just fixed for you), is the autogenerated api schema.

Once the tests are all passing, we'll also want to look at the test coverage to assess whether you need to add some more tests to cover the new code.

@maxalbert
Contributor

Thanks for the changes @OwlHute! The toplevel keys are now in the correct order. The nested keys were still in non-alphabetical order; for the sake of expediency I just pushed a change to fix this (you will want to run git pull to bring your local branch up to date).

Apologies for the test run failures. This is due to #467, I'll submit a PR to fix this in a minute. In the meantime, you should be able to run the tests by setting the following environment variables explicitly in your shell (you will need to run pipenv shell first to activate the pipenv environment, otherwise the integration_tests/.env file overwrites them again).

cd integration_tests
pipenv shell
export FLOWDB_PORT=9000
export REDIS_PORT=6379
export FLOWMACHINE_PORT=5555

Are you able to run the tests with these settings?

If you want to run just the API spec test (either for speed, or to avoid failures with Unix-only modules) you can do it like this (from within the integration tests pipenv environment as above):

pytest -svx tests/flowmachine_server_tests/test_server.py 

@maxalbert
Contributor

@greenape Is there a need for any more test coverage or are all the changes here covered by existing tests?

@OwlHute
Contributor Author

OwlHute commented Apr 4, 2019

@max, many thanks for these port settings. I ran the test in the pipenv environment in Ubuntu, and the output looks a lot better, in that it is much shorter! But there are still a few errors around connections, e.g.:

          sock.connect(socket_address)

E ConnectionRefusedError: [Errno 111] Connection refused

E ConnectionRefusedError: [Errno 111] Connection refused

/home/jr/.local/share/virtualenvs/integration_tests-sEjdA8oo/lib/python3.7/site-packages/redis/connection.py:538: ConnectionRefusedError

:::

/home/jr/.local/share/virtualenvs/integration_tests-sEjdA8oo/lib/python3.7/site-packages/redis/connection.py:497: ConnectionError
------------------------------ Captured log setup ------------------------------
connection.py 103 INFO {"event":"Couldn't get username for application name, using 'flowmachine'","logger":"flowmachine.core.connection","level":"info","timestamp":"2019-04-04T08:54:32.554599Z"}
init.py 213 INFO Logger created with level DEBUG

Member

@greenape left a comment

Thanks.

Needs tests, because this drops the coverage a little too much.

There are two approaches: the first is to add a Tavern-style test, as seen here https://github.com/Flowminder/FlowKit/tree/master/integration_tests/tests/flowapi_tests, plus a test for constructing the dict in the FlowClient tests, here: https://github.com/Flowminder/FlowKit/tree/master/flowclient/tests/unit

The alternative would be to add a case to the parameterised test_run_query test here: https://github.com/Flowminder/FlowKit/blob/master/integration_tests/tests/test_queries.py which would hit both code paths (a sketch of what that might look like follows below).
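
A minimal sketch of what that added case might look like (the test body and the run_query_and_wait helper are hypothetical stand-ins, not FlowKit's actual test code):

import pytest

@pytest.mark.parametrize(
    "query_kind, params",
    [
        (
            "unique_subscriber_counts",
            {
                "start_date": "2016-01-01",
                "end_date": "2016-01-07",
                "aggregation_unit": "admin3",
            },
        ),
    ],
)
def test_run_query(query_kind, params):
    # run_query_and_wait is a hypothetical helper standing in for the real
    # test body, which submits the query via the API and polls until the
    # result is ready.
    result = run_query_and_wait(query_kind, params)
    assert result is not None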

    @property
    def _flowmachine_query_obj(self):
        """
        Return the underlying flowmachine daily_location object.
Member

Suggested change
Return the underlying flowmachine daily_location object.
Return the underlying flowmachine UniqueSubscriberCounts object.

Contributor Author

I've implemented the suggested alternative, but that _flowmachine_query_obj() method was already present and the only change I made was to fix the object name in the header. So that won't change the test coverage. Presumably something must be added to actually call _flowmachine_query_obj()?! (I assumed some test would run it automatically, as its name looks pretty generic.)

Contributor

@maxalbert Apr 4, 2019

@OwlHute This particular suggestion about changing daily_location to UniqueSubscriberCounts in the docstring is independent of @greenape's comment about code coverage and the need to add an additional test. (It may look related because GitHub displays it right underneath, but it has nothing to do with it.)

Example
-------
>>> unique_subscriber_counts( { "start_date" : "2019-03-20", "end_date" : "2019-03-31", "aggregation_unit" : "admin3" } )
[ ] # TODO - Fill this in
Member

The example needs to be filled in as mentioned, and should actually work when run (the args are wrapped in a dict here, so this would error when run?)

},
),
(
"unique_subscribers_count",
Contributor

Why was this added twice? It seems to be an exact copy of the previous one?

Contributor Author

DOH! Looking at what I thought was one previous test case, I assumed it was the input arguments and the expected output arguments. But on closer inspection I see it is two previous test cases, and only the input arguments are required. I've removed the duplicate entry now and re-committed and pushed to GitHub.

@maxalbert
Contributor

Thanks for the changes @OwlHute. This looks (almost) good, but we need a changelog entry, please - see the checkbox in the PR template at the top.

(Ideally we'd also want a Github issue that this PR closes, but we can add that separately.)

@OwlHute
Contributor Author

OwlHute commented Apr 4, 2019

I have a horrible feeling I may have morphed the name "unique_subscriber_counts" to "unique_subscribers_count" in the added test. Checking now...

@OwlHute
Contributor Author

OwlHute commented Apr 4, 2019

Fixed!



def unique_subscriber_counts(
    start_date: str, stop_date: str, aggregation_unit: str
Contributor

Sorry, I missed this earlier: this parameter should actually be named end_date rather than stop_date. (This was caught by the most recent integration test you added, which is why they are currently failing.)
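
For clarity, the corrected signature would be as follows (the body is a sketch, assuming the helper builds the query-specification dict that flowclient sends to the API):

def unique_subscriber_counts(
    start_date: str, end_date: str, aggregation_unit: str
) -> dict:
    # Sketch: build the query-specification dict; the real function may
    # perform additional validation.
    return {
        "query_kind": "unique_subscriber_counts",
        "start_date": start_date,
        "end_date": end_date,
        "aggregation_unit": aggregation_unit,
    }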

@maxalbert
Contributor

This looks good now, as far as I can tell. Happy with the test coverage @greenape?

@greenape
Member

greenape commented Apr 4, 2019

Hm. Looks like it needs a case added to one or both of test_get_query_params or test_get_query_kind in the integration tests, in order to hit the get_obj_type method.

@maxalbert
Contributor

@OwlHute Thanks for adding the extra test. It failed because the assert statement contained a hard-coded "daily_location" but your new test was against unique_subscriber_counts. I just pushed a change to fix this, so hopefully it will all pass now.
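
(Illustrative only; the exact assert and response structure here are hypothetical:)

# Before: the expected query kind was hard-coded, so the new case failed.
assert reply["query_kind"] == "daily_location"

# After: compare against the parametrised query kind instead.
assert reply["query_kind"] == query_kind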

…ise the action handlers (rather than specific queries)
@maxalbert
Contributor

I'm happy to merge this now. The code coverage is still ever so slightly reduced, but if I'm interpreting the codecov reports correctly then this seems to be a code path that's also not hit by other queries, so I'm happy to deal with this separately.

@maxalbert
Contributor

FYI, I have removed the tests in test_action_get_params.py and test_action_get_query_kind.py again because they are not meant to test specific queries (but rather the action handler functions), see #584.

@maxalbert added the ready-to-merge label Apr 5, 2019
@maxalbert dismissed greenape's stale review April 5, 2019 13:44

Test coverage seems to be sufficient now, apart from a small drop that will be investigated separately.

@maxalbert merged commit 5c283c6 into master Apr 5, 2019
@maxalbert deleted the unique_subscriber_counts branch April 5, 2019 14:50