berkeley: API endpoint /data_objects/study/{study_id} not returning correct results #689

Closed
Tracked by #577
aclum opened this issue Sep 19, 2024 · 8 comments
Assignees
Labels
bug Something isn't working

Comments

@aclum
Contributor

aclum commented Sep 19, 2024

Describe the bug
This endpoint is returning an empty array b/c alldocs doesn't exist in berkeley

To Reproduce
Steps to reproduce the behavior:

  1. curl -X 'GET' \
    'https://api-berkeley.microbiomedata.org/data_objects/study/nmdc%3Asty-11-33fbta56' \
    -H 'accept: application/json'
  2. Compare the empty response body to the results on production.

Expected behavior
Response json body should contain the same records as production.
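
The comparison in step 2 could be scripted. The following is only a minimal sketch and is not part of the repository: the production base URL and all helper names are assumptions, and the code relies only on the endpoint returning a JSON array, as described above.

```python
import json
import urllib.parse
import urllib.request

# The Berkeley base URL comes from the curl command above.
# The production base URL is an assumption and may need adjusting.
BERKELEY_API = "https://api-berkeley.microbiomedata.org"
PRODUCTION_API = "https://api.microbiomedata.org"


def endpoint_url(base_url: str, study_id: str) -> str:
    """Build the /data_objects/study/{study_id} URL, percent-encoding the ID."""
    encoded_id = urllib.parse.quote(study_id, safe="")
    return f"{base_url}/data_objects/study/{encoded_id}"


def fetch_body(url: str) -> bytes:
    """Fetch the raw response body (performs a network request)."""
    request = urllib.request.Request(url, headers={"accept": "application/json"})
    with urllib.request.urlopen(request) as response:
        return response.read()


def summarize(body: bytes) -> dict:
    """Report the body size in bytes and the number of top-level array items."""
    parsed = json.loads(body)
    return {"bytes": len(body), "items": len(parsed)}


def main() -> None:
    """Print a size summary for both deployments (hypothetical entry point)."""
    study_id = "nmdc:sty-11-33fbta56"
    for name, base in [("berkeley", BERKELEY_API), ("production", PRODUCTION_API)]:
        print(name, summarize(fetch_body(endpoint_url(base, study_id))))
```

Calling `main()` prints one summary per deployment; an empty array on Berkeley shows up as `{"bytes": 2, "items": 0}`.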

Example user story template:
AS A {user|persona|system},
[INSTEAD OF {current condition}]
I EXPECTED {result}
[SO THAT {value or justification}]
[NO LATER THAN {best by date}]

Screenshots
If applicable, add screenshots to help explain your problem.

Acceptance Criteria
Response body is ~51K in size.

Example scenario-based template:
Given (some given context or precondition), when (I take this action), then (this will be the specific outcome).

Additional context
Consider adding test coverage for this endpoint.

cc @sujaypatil96 because ncbi export code also needs alldocs

@aclum aclum added the bug Something isn't working label Sep 19, 2024
@eecavanna eecavanna changed the title from "berkeley find_data_objects_for_study_data_objects_study__study_id__get endpoint not returning the correct results." to "berkeley: API endpoint /data_objects/study/{study_id} not returning correct results" Sep 19, 2024
@sujaypatil96 sujaypatil96 self-assigned this Sep 19, 2024
@eecavanna
Collaborator

Once #694 has been merged in, I expect this bug to also be squashed.

@eecavanna
Collaborator

On second thought, I think—even with the alldocs collection present—there is still a bug here specific to when using the Berkeley schema.

The endpoint code uses the Biosample's part_of field to identify biosamples associated with the specified Study. In the Berkeley schema, there is no part_of slot on the Biosample class. There is an associated_studies field, though. CC: @sujaypatil96

@eecavanna
Collaborator

eecavanna commented Sep 20, 2024

Locally, updating the endpoint's code to use associated_studies made the difference between a result of [] and a result whose content-length is 42452.

image

-     biosamples = mdb.biosample_set.find({"part_of": study["id"]}, ["id"])
+     biosamples = mdb.biosample_set.find({"associated_studies": study["id"]}, ["id"])

image

Note: This is with dump /global/cfs/projectdirs/m3408/nmdc-mongodumps/dump_nmdc-berkeley_2024-09-19_20-20-01 loaded in my local Mongo server.
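
To illustrate why the one-line change matters, here is a minimal in-memory sketch. It is not the endpoint code: `matches_equality` is a hypothetical helper that mimics how a MongoDB equality filter `{field: value}` behaves for a non-null value (it matches when the field equals the value or is an array containing it), and the sample Biosample documents are invented for illustration.

```python
def matches_equality(document: dict, field: str, value) -> bool:
    """Mimic a MongoDB equality filter {field: value} on a top-level field.

    Hypothetical helper: covers only non-null values on top-level fields.
    """
    if field not in document:
        return False  # A missing field never equals a non-null value.
    stored = document[field]
    if isinstance(stored, list):
        # Array fields match if any element equals the value
        # (or if the whole array equals the value).
        return value in stored or stored == value
    return stored == value


# Invented documents shaped like Berkeley-schema Biosamples (no part_of slot).
biosamples = [
    {"id": "nmdc:bsm-example-1", "associated_studies": ["nmdc:sty-11-33fbta56"]},
    {"id": "nmdc:bsm-example-2", "associated_studies": ["nmdc:sty-example-other"]},
]
study_id = "nmdc:sty-11-33fbta56"

# Old filter: no document has part_of, so nothing matches.
old_ids = [b["id"] for b in biosamples if matches_equality(b, "part_of", study_id)]
# New filter: matches documents whose associated_studies contains the study ID.
new_ids = [
    b["id"] for b in biosamples if matches_equality(b, "associated_studies", study_id)
]

print(old_ids)  # []
print(new_ids)  # ['nmdc:bsm-example-1']
```

With no matching biosamples, the rest of the endpoint has nothing to look up data objects for, which is consistent with the empty `[]` result reported above.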

@eecavanna
Collaborator

Now that #694 has been merged in, I'll open a PR having the above code change in it.

@ssarrafan
Contributor

@sujaypatil96 @eecavanna who needs to review this?

@eecavanna
Collaborator

I merged it in without review (other than self-review) just now in the interest of time.

I would still like for @sujaypatil96 to review it when he gets a chance—and he can open a ticket (or message me on Slack) in case he has concerns about its contents. It is not urgent, from my perspective.

@eecavanna
Collaborator

With the endpoint fix in place, and with the alldocs collection in place, the endpoint now returns a non-empty response:

image

The response (when downloaded as a file via Swagger UI) is 55 KB, which (in this context) I consider to be "in the ballpark" of the 51 KB @aclum put in the issue description. For example, maybe some workflows have added more data objects. 🤷

image

@aclum
Contributor Author

aclum commented Sep 24, 2024

Confirmed fixed in berkeley

4 participants