Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add SpatialAggregate to api #599

Closed
2 of 4 tasks
greenape opened this issue Apr 8, 2019 · 11 comments
Closed
2 of 4 tasks

Add SpatialAggregate to api #599

greenape opened this issue Apr 8, 2019 · 11 comments
Labels
FlowAPI Issues related to the FlowKit API FlowAuth Issues related to FlowAuth FlowClient Issues related to FlowClient FlowMachine Issues related to FlowMachine

Comments

@greenape
Copy link
Member

greenape commented Apr 8, 2019

Modal locations and daily locations are presently dealt with by calling aggregate() on any query created by the api. This implicitly returns them wrapped in SpatialAggregate. This should be made explicit, and the hidden call to aggregate() removed. This also paves the way for a broader classification of things-that-are-aggregated, which I think might warrant being a marshmallow spec in itself (probably allOf).

  • Add marshmallow spec for SpatialAggregate
  • Remove call to aggregate()
  • Update client
  • Update claims checking in FlowAPI
@greenape greenape added FlowClient Issues related to FlowClient FlowAuth Issues related to FlowAuth FlowMachine Issues related to FlowMachine FlowAPI Issues related to the FlowKit API labels Apr 8, 2019
@greenape
Copy link
Member Author

greenape commented Apr 9, 2019

One thing to consider here is that modal_location operates on daily_location specs, so this presumably needs to be a separate function in FlowClient. Also means a bare daily_location query spec should no longer be allowable, and means any examples using modal_location or daily_location will need to be updated.

@greenape
Copy link
Member Author

greenape commented Apr 9, 2019

#524

@OwlHute
Copy link
Contributor

OwlHute commented Apr 9, 2019

Am I right in thinking the idea is to make this aggregation process user-callable, so an API user can use it more generally on any pair of applicable data sets one of which is a subscriber list, and produce a sort of aggregated "product" of the sets?

@greenape
Copy link
Member Author

greenape commented Apr 9, 2019

Think that's in reference to #600 @OwlHute?

@OwlHute
Copy link
Contributor

OwlHute commented Apr 9, 2019

Oh, I didn't know there was a #600 until you mentioned it! Yes, I guess my previous question applies more to that, but I'm still a bit hazy about the overall goal and the specific interface changes required for #599 and #600.

If you think I should be able to work these out exactly from the summaries given so far, please let me know and I'll study the relevant code further and come up with something. But otherwise I'm assuming you plan to provide a more detailed requirements along those lines. (In other words, I don't want to be waiting for something that won't be provided!)

@greenape
Copy link
Member Author

greenape commented Apr 9, 2019

FlowAPI is an aggregates API - it should return only data aggregated to a geographic region. As a temporary hack to ensure that, this happens:

# FIXME: currently we also aggregate the query here, but this is not

Most things don't have an aggregate() method (which does: SpatialAggregate(the_original_object), but some do. Rather than have this weird implicit thing happening, which leads to things like #603, it would be better to be explicit that this is happening, and remove the sneaky call to aggregate().

For a lot of the query types we have, there's no natural 'default' spatial unit to aggregate to, which is where #600 comes in - that takes something with a subscriber identifier column, and another thing which assigns a location to each subscriber, and gives you a spatial aggregate.

@OwlHute
Copy link
Contributor

OwlHute commented Apr 9, 2019

OK thanks, but when the call to aggregate() is removed from the cases where it currently occurs, won't that mean the relevant output is no longer aggregated?! Or is the idea to aggregate at source so to speak at the point where the data is initially obtained, and have the API output it to the caller like any other aggregated-at-the-point-of-acquisition output? (If that makes sense)

@greenape
Copy link
Member Author

greenape commented Apr 9, 2019

Yes. The API requests that previously went:

{"query_kind": "daily_location" ...}

Should now go:

{"query_kind":"spatial_aggregate", "locations":{"query_kind": "daily_location" ...}}

@OwlHute
Copy link
Contributor

OwlHute commented Apr 9, 2019

If that is the only change, it sounds pretty simple. So I'm guessing other changes are also required (including obviously changes to the relevant test cases)

@OwlHute
Copy link
Contributor

OwlHute commented Apr 9, 2019

Following class added in flowmachine/flowmachine/core/server/query_schemas/custom_fields.py, preserving alphabetic order of existing classes:

class SpatialAggregate(fields.List):
    """
    List representing a spatial aggregate subset.
    """

    def __init__(self, required=False, allow_none=True, validate=None, **kwargs):
        if validate is not None:
            raise ValueError(
                "The SpatialAggregate field provides its own validation "
                "and thus does not accept a the 'validate' argument."
            )

        super().__init__(
            required=required, allow_none=allow_none, validate=OneOf([None]), **kwargs
        )

@greenape
Copy link
Member Author

Not sure I follow?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
FlowAPI Issues related to the FlowKit API FlowAuth Issues related to FlowAuth FlowClient Issues related to FlowClient FlowMachine Issues related to FlowMachine
Projects
None yet
Development

No branches or pull requests

2 participants