API usage by external users #1154

tovari · 2021-08-10T15:10:18Z

The #1026 implementation lets us track API usage on each endpoint. The dashboard also tracks the API calls made by the frontend, which makes it impossible to isolate external API usage.
We also need the ability to track only the external API calls.

gulfaraz · 2021-08-20T13:51:47Z

I counted the API requests made from each IP address and found that all go-api requests appear to come from the same IP address, 0.0.0.0.

The above chart was created using the below query with data from the last 7 days.

requests
| summarize hits = count() by client_IP
| render columnchart

I investigated further by making some GET requests for sample logs. There are client details which the current analytics system captures. The screenshots below are direct API hits made from my laptop in The Netherlands.

There appears to be some masking going on which removes my info before the request reaches the Django app.

@batpad my guess is this is happening either in the docker network layer or the load balancer (as you suggested)

Alternatively, a less graceful approach is to explicitly tag each call made from the go-frontend in fetch. Then in the go-api we capture this as a custom dimension.
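
A minimal sketch of what the go-api side of that tagging could look like, assuming the go-frontend adds a custom header to every fetch call. The header name `X-Go-Client`, its value `go-frontend`, and the function name are all hypothetical; the returned label is what would be recorded as the custom dimension.

```python
from typing import Mapping

# Hypothetical header and value the go-frontend would send with every fetch call.
FRONTEND_HEADER = "X-Go-Client"
FRONTEND_VALUE = "go-frontend"


def classify_request_source(headers: Mapping[str, str]) -> str:
    """Return "frontend" when the request carries the frontend tag, else "external"."""
    # HTTP header names are case-insensitive, so normalise before comparing.
    normalised = {name.lower(): value for name, value in headers.items()}
    tag = normalised.get(FRONTEND_HEADER.lower(), "")
    return "frontend" if tag.strip() == FRONTEND_VALUE else "external"
```

Any client can send the same header, so a tag like this is only good enough for splitting traffic in analytics, not for access control.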

nanometrenat · 2021-08-22T07:06:28Z

FYI, previous ticket #572 (comment) covers what logs are available on the Django servers. I don't have access to the IM mailbox anymore to check, but I'm pretty sure we managed to get a list of IP addresses from those logs at that time (Feb 2020). The problem we had was that we couldn't differentiate between API calls from the user's browser (i.e. from using the site) and API calls made by other means.

batpad · 2021-08-23T07:08:18Z

@gulfaraz do we know exactly where these logs are derived from? This would make sense for logs that were being emitted by the Django app. However, in these cases where something is masking the originating IP address, there "should" always be an X-Forwarded-For header added that contains the real IP address. From rough reading online, it seems like the Azure logs should use the X-Forwarded-For header to determine the actual Client IP when available, but of course, this is somehow not working for us.

This would take a bit more investigation - it could possibly be one of a few different things:

In the best case scenario, just a change in filter to use the X-Forwarded-For header to determine the actual Client IP
In the more likely case, the X-Forwarded-For header is either not being applied correctly, or being dropped by the web-server
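
To tell these two cases apart, it may help to see what the app would report if it trusted X-Forwarded-For. A minimal sketch, assuming the proxy appends addresses in the usual `client, proxy1, proxy2` order; the function name and the `remote_addr` fallback argument are illustrative and not part of go-api.

```python
from typing import Mapping, Optional


def client_ip(headers: Mapping[str, str], remote_addr: Optional[str]) -> Optional[str]:
    """Prefer the left-most X-Forwarded-For entry, falling back to the socket address."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    forwarded = normalised.get("x-forwarded-for", "")
    # The header is a comma-separated list; the original client is conventionally first.
    candidates = [part.strip() for part in forwarded.split(",") if part.strip()]
    return candidates[0] if candidates else remote_addr
```

If this returns the proxy address for real requests, the header is missing by the time the request reaches the app. Note that the left-most entry is supplied by the client and can be forged unless the outermost proxy overwrites it.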

If the logs above are parsing the access logs generated by the gunicorn server running the application, it seems like it might require some config to get it to log the X-Forwarded-For IP rather than the proxy IP: https://docs.gunicorn.org/en/stable/deploy.html
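
If that turns out to be the case, a gunicorn config along these lines could log both addresses side by side. This is a sketch only: it assumes the app is started with a `gunicorn.conf.py` and that logging to stdout suits the deployment, neither of which is confirmed in this thread.

```python
# gunicorn.conf.py (sketch; the file location and stdout logging are assumptions)

# Write the access log to stdout so the container runtime can collect it.
accesslog = "-"

# Gunicorn's default format starts with %(h)s, the remote address, which is the
# proxy when the app sits behind one. Log the X-Forwarded-For request header
# next to it so the two values can be compared.
access_log_format = (
    '%(h)s %({x-forwarded-for}i)s %(l)s %(u)s %(t)s '
    '"%(r)s" %(s)s %(b)s "%(f)s" "%(a)s"'
)
```

Comparing the first two fields of each log line shows whether the header arrives at gunicorn at all.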

Not 100% sure of the best way to debug this - I guess a starting point would be knowing exactly where that chart is trying to read the Client IP from, and work backwards from there.

gulfaraz · 2021-08-26T11:27:08Z

> do we know exactly where these logs are derived from?
> a starting point would be knowing exactly where that chart is trying to read the Client IP from, and work backwards from there

Azure uses the IP address of each API request to look up client_City, client_StateOrProvince, and client_CountryOrRegion, using GeoLite2 from MaxMind.

> This would make sense for logs that were being emitted by the Django app. However, in these cases where something is masking the originating IP address, there "should" always be an X-Forwarded-For header added that contains the real IP address. From rough reading online, it seems like the Azure logs should use the X-Forwarded-For header to determine the actual Client IP when available, but of course, this is somehow not working for us.

Looks like the server drops the X-Forwarded-For header to maintain user privacy. The IP address isn't collected locally when the X-Forwarded-For header is set.

> In the best case scenario, just a change in filter to use the X-Forwarded-For header to determine the actual Client IP
>
> In the more likely case, the X-Forwarded-For header is either not being applied correctly, or being dropped by the web-server

Azure may be masking the IP address. I suggest disabling any masking on Azure's side before trying the above actions.

I tried to disable masking using these steps but I don't seem to have the required permissions.

batpad · 2021-08-27T12:02:09Z

@gulfaraz - this is some solid digging into this.

It would be nice to rule out Azure masking the IP address. It definitely seems like these logs are all coming from Azure and it's not parsing logs being emitted by the django app, so I don't think this is a django issue.

The Azure masking seems the most likely to me :( - if we can definitely rule out Azure masking the IP, then am happy to get on a call or so to try and delve into this more - definitely a mystery I'm quite interested in solving as well, thanks much for digging into this.

tovari added the feature label Aug 10, 2021

tovari assigned gulfaraz Aug 10, 2021

gulfaraz removed their assignment Sep 30, 2021

nanometrenat mentioned this issue Apr 11, 2024

Ability to track API requests and their source (frontend/human/other) #2069

Open
