Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API Endpoint to Download TSV Files for donors by Donor ID #1062

Merged
merged 6 commits into from
Oct 6, 2023
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
37 changes: 37 additions & 0 deletions src/clinical/api/clinical-api.ts
Original file line number Diff line number Diff line change
Expand Up @@ -36,7 +36,10 @@ import {
ClinicalErrorsApiBody,
ClinicalSearchApiBody,
CompletionState,
DonorDataApiBody,
} from './types';
import { loggerFor } from '../../logger';
const L = loggerFor(__filename);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could Log, or Logger do the same job, while being a bit more declarative than a single capital letter l?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes and it would be better. I'm being consistent with the existing pattern in this repo. I would - recommend a separate action to rename them all.


// TODO: Update value type to match mongo schema
export const completionFilters: Record<CompletionState, any> = {
Expand Down Expand Up @@ -140,6 +143,40 @@ class ClinicalController {
res.send(zip.toBuffer());
}

/**
* Download Clinical Data selected by Donor ID.
* Requires Full Read Access.
* Used by API for a user to download clinical data for specific donors and/or files.
* @param req
* @param res
* @returns
*/
@HasFullReadAccess()
async getDonorDataByIdAsTsvsInZip(req: Request, res: Response) {
const bodyParseResult = DonorDataApiBody.safeParse(req.body);
if (!bodyParseResult.success) {
return ControllerUtils.badRequest(
res,
`Invalid data in request body: ${JSON.stringify(bodyParseResult.error)}`,
);
}
const { donorIds } = bodyParseResult.data;

const donorEntityData = await service.getDonorEntityData(donorIds);

const date = currentDateFormatted();
const fileName = `filename=Donor_Clinical_Data_${date}.zip`;
Comment on lines +167 to +168
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is sharing the naming convention used for for downloads requested by program submitters. I'm open to suggestions to improve the provided filename. Perhaps a timestamp is a good idea in addition to the current date.

Copy link
Member

@justincorrigible justincorrigible Oct 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aside from the filename format itself, would it make sense to abstract this into an env variable?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aside from the filename format itself, would it make sense to abstract this into an env variable?

I don't see a reason to have different filenames per env, what's your reason? If you want filename constanta centralized into a constants file I can do that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tbh, I'm just used to abstracting constants like this one, because that makes it easier for these values to be updated, without having to rebuild the code "just for a string change".

If you believe this doesn't need to be abstracted, I of course trust your insight.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good reason. will make the change.

res
.status(200)
.contentType('application/zip')
.attachment(fileName)
.setHeader('content-disposition', fileName);

const zip = createClinicalZipFile(donorEntityData);

res.send(zip.toBuffer());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

funny, I'd have imagined you'd get an error/warning here saying that you had already sent a response in line 170.
obviously I must be mistaken.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

previous use of res didn't send data, just set some headers and status.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah... I was under the impression res.status sends a response.
time for me to do some reading and update my understanding.

Copy link
Member

@justincorrigible justincorrigible Oct 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

follow up: https://expressjs.com/en/api.html#res.status
looks like I've been away from coding for too long, starting to forget how to 😢
...I miss creating things

}

@HasProgramReadAccess((req: Request) => req.params.programId)
async getProgramClinicalEntityData(req: Request, res: Response) {
const programShortName: string = req.params.programId;
Expand Down
8 changes: 8 additions & 0 deletions src/clinical/api/types.ts
Original file line number Diff line number Diff line change
Expand Up @@ -34,3 +34,11 @@ export const ClinicalSearchApiBody = zod.object({
submitterDonorIds: zod.array(zod.string().nonempty()).default([]),
});
export type ClinicalSearchApiBody = zod.infer<typeof ClinicalSearchApiBody>;

/**
* Download Donor Data by ID
*/
export const DonorDataApiBody = zod.object({
donorIds: zod.array(zod.number().positive()).min(1),
});
export type DonorDataApiBody = zod.infer<typeof ClinicalDataApiBody>;
39 changes: 32 additions & 7 deletions src/clinical/clinical-service.ts
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@ import {
ClinicalErrorsResponseRecord,
EntityAlias,
aliasEntityNames,
allEntityNames,
} from '../common-model/entities';
import { Errors } from '../utils';
import { patchCoreCompletionWithOverride } from '../submission/submission-to-clinical/stat-calculator';
Expand All @@ -53,13 +54,14 @@ export type ClinicalDonorEntityQuery = {
completionState?: {};
};

// Returns paginated Donor + Entity Submission Data
export type PaginatedClinicalQuery = ClinicalDonorEntityQuery & {
export type PaginationQuery = {
page: number;
pageSize?: number;
sort: string;
};

export type PaginatedClinicalQuery = ClinicalDonorEntityQuery & PaginationQuery;
Comment on lines -57 to +63
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had a need to separate the a worker thread task to accept Pagination parameters separate from the Clinical Donor Entity query data, so we can separate out that data for reference to its own type PaginationQuery.


// GQL Query Arguments
// Submitted Data Search Bar
export type ClinicalSearchVariables = {
Expand Down Expand Up @@ -122,6 +124,10 @@ export async function getDonors(programId: string) {
return await donorDao.findBy({}, 999);
}

export async function getDonorsByIds(donorIds: number[]) {
return await donorDao.findByDonorIds(donorIds);
joneubank marked this conversation as resolved.
Show resolved Hide resolved
}

export async function findDonorId(submitterId: string, programId: string) {
const donor = await findDonorBySubmitterId(submitterId, programId);
if (!donor) {
Expand Down Expand Up @@ -219,27 +225,46 @@ export const getPaginatedClinicalData = async (
query: PaginatedClinicalQuery,
) => {
if (!programId) throw new Error('Missing programId!');
const start = new Date().getTime() / 1000;

const allSchemasWithFields = await dictionaryManager.instance().getSchemasWithFields();
// Get all donors + records for given entity
const { donors, totalDonors } = await donorDao.findByPaginatedProgramId(programId, query);

const taskToRun = WorkerTasks.ExtractEntityDataFromDonors;
const taskArgs = [donors as Donor[], totalDonors, allSchemasWithFields, query];
const taskArgs = [donors as Donor[], totalDonors, allSchemasWithFields, query.entityTypes, query];
Comment on lines 233 to +234
Copy link
Member Author

@joneubank joneubank Sep 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the ExtractEntityDataFromDonors task was updated to require the PaginatedQuery in place of the ClinicalDonorEntityQuery since the new request by DonorId doesn't include much of that request data. The only additional information used from that query type was the list of entities to include in the request, which is now passed as a separate argument query.entityTypes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the type for query implies the property (entityTypes) is guaranteed to be present.
is it safe to assume its actual presence is validated on runtime elsewhere?


// Return paginated data
const data = await runTaskInWorkerThread<{ clinicalEntities: ClinicalEntityData[] }>(
taskToRun,
taskArgs,
);

const end = new Date().getTime() / 1000;
L.debug(`getPaginatedClinicalData took ${end - start}s`);

return data;
};

export const getDonorEntityData = async (donorIds: number[]) => {
const allSchemasWithFields = await dictionaryManager.instance().getSchemasWithFields();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to do anything about this one, just felt the need to acknowledge out loud how nervous it makes me to have these long chains without ?. 😆

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

long optional chaining makes me nervous too.
I actually really like how dictionaryManager does this - throws an error if there's no instance - so you could actually just catch the error and return ... empty array of donorIds in that case?

that node pattern of throw/catch actual errors vs just retuning nulls/undefined is used a lot in this repo and I think it works well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re ? part of this.... getSchemaWIthFields will never actually be run on undefined, an error will be thrown instead

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

welp! sounds like that requires a lot of "insider" knowledge on that dictionaryManager.

try+catch sure does sound desirable, but I guess my non-evident point is that this line is a bit "code golf"ey, and could use being broken into separate lines 🤷‍♂️

Copy link
Member Author

@joneubank joneubank Oct 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't need insider knowledge... the type system informs us of what the output will be. if we can trust the type system then we're safe here.

Copy link
Member

@justincorrigible justincorrigible Oct 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh fun! didn't realise TS (intellisense) was visible in PR reviews at Github.com

must get myself that browser extension!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sarcasm acknowledged. agreed that the lack of introspection in the browser code review is limiting.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be fair, we could review PRs by testing them locally... like we used to back in the day.

Modern tooling has become a crutch for what could be more declarative; which proves hard to keep up in the review process. How do you suggest we improve that?

// Get all donors + records for given entity
const donors = await getDonorsByIds(donorIds);
const totalDonors = donors.length;

const paginationQuery: PaginationQuery = {
page: 0,
sort: 'donorId',
pageSize: totalDonors,
};

const taskToRun = WorkerTasks.ExtractEntityDataFromDonors;
const taskArgs = [donors, donors.length, allSchemasWithFields, allEntityNames, paginationQuery];

// Return paginated data
const data = await runTaskInWorkerThread<{ clinicalEntities: ClinicalEntityData[] }>(
taskToRun,
taskArgs,
);
return data.clinicalEntities;
joneubank marked this conversation as resolved.
Show resolved Hide resolved
};

export const getClinicalSearchResults = async (
programId: string,
query: ClinicalDonorEntityQuery,
Expand Down
13 changes: 12 additions & 1 deletion src/clinical/donor-repo.ts
Original file line number Diff line number Diff line change
Expand Up @@ -91,6 +91,7 @@ export interface DonorRepository {
programId: string,
query: ClinicalDonorEntityQuery,
): Promise<DeepReadonly<{ donors: Donor[] }>>;
findByDonorIds(donorIds: number[]): Promise<DeepReadonly<Donor[]>>;
deleteByProgramId(programId: string): Promise<void>;
deleteByProgramIdAndDonorIds(programId: string, donorIds: number[]): Promise<void>;
findByProgramAndSubmitterId(
Expand Down Expand Up @@ -392,7 +393,17 @@ export const donorDao: DonorRepository = {
submitterId: { $in: submitterIds },
programId: programId,
});
const mapped = result.map((d: DonorDocument) => {
const mapped = result.map(d => {
return MongooseUtils.toPojo(d) as Donor;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you, using as? le wut!

Copy link
Member Author

@joneubank joneubank Oct 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Old code pattern in the db repo module, there are better TS ways to do this now using mongoose. Following convention of the file instead of investing time in a re-write of the DB management.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

screenshotted to use without context in the future (context provider is undefineed) :p

});
return F(mapped);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what on earth is F!?

...I know it's just the established pattern, and it's obviously a function, but what does it do!?
without having to change the whole app, at least the import could be renamed in this file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rewrites the type as DeepReadonly<T>. Again, following the service convention, I do not recommend adopting this pattern elsewhere.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

import { F as Freeze } should help... but even then, what's freezing? 😮‍💨

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't update/extend properties dynamically
if I remember correctly it's a very light lib around the native Object.freeze() https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Object/freeze

},

async findByDonorIds(donorIds: number[]): Promise<DeepReadonly<Donor[]>> {
const result = await DonorModel.find({
donorId: { $in: donorIds },
});
const mapped = result.map(d => {
return MongooseUtils.toPojo(d) as Donor;
});
return F(mapped);
Expand Down
35 changes: 23 additions & 12 deletions src/clinical/service-worker-thread/tasks.ts
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,11 @@

import { DeepReadonly } from 'deep-freeze';
import _, { isEmpty } from 'lodash';
import { ClinicalEntitySchemaNames, aliasEntityNames } from '../../common-model/entities';
import {
ClinicalEntitySchemaNames,
EntityAlias,
aliasEntityNames,
} from '../../common-model/entities';
import {
calculateSpecimenCompletionStats,
dnaSampleFilter,
Expand All @@ -29,7 +33,7 @@ import {
getSampleRegistrationDataFromDonor,
} from '../../common-model/functions';
import { notEmpty } from '../../utils';
import { PaginatedClinicalQuery, ClinicalDonorEntityQuery } from '../clinical-service';
import { ClinicalDonorEntityQuery, PaginationQuery } from '../clinical-service';
import {
ClinicalEntityData,
ClinicalInfo,
Expand Down Expand Up @@ -117,13 +121,14 @@ const mapEntityDocuments = (
entity: EntityClinicalInfo,
donorCount: number,
schemas: any,
query: PaginatedClinicalQuery,
entityTypes: EntityAlias[],
paginationQuery: PaginationQuery,
completionStats: CompletionDisplayRecord[],
) => {
const { entityName, results } = entity;

// Filter, Paginate + Sort
const { page, pageSize = results.length, sort, entityTypes } = query;
const { page, pageSize = results.length, sort } = paginationQuery;
const relevantSchemaWithFields = schemas.find((s: any) => s.name === entityName);
const entityInQuery = isEntityInQuery(entityName, entityTypes);

Expand Down Expand Up @@ -239,21 +244,20 @@ function extractEntityDataFromDonors(
donors: Donor[],
totalDonors: number,
schemasWithFields: any,
query: PaginatedClinicalQuery,
entityTypes: EntityAlias[],
paginationQuery: PaginationQuery,
) {
let clinicalEntityData: EntityClinicalInfo[] = [];

donors.forEach(d => {
Object.values(ClinicalEntitySchemaNames).forEach(entity => {
const isQueriedEntity = isEntityInQuery(entity, query.entityTypes);
const isRelatedEntity = getRequiredDonorFieldsForEntityTypes(query.entityTypes).includes(
entity,
);
const isQueriedEntity = isEntityInQuery(entity, entityTypes);
const isRelatedEntity = getRequiredDonorFieldsForEntityTypes(entityTypes).includes(entity);
const requiresSampleRegistration =
entity === ClinicalEntitySchemaNames.REGISTRATION &&
(query.entityTypes.includes('donor') || query.entityTypes.includes('sampleRegistration'));
(entityTypes.includes('donor') || entityTypes.includes('sampleRegistration'));
const requiresSpecimens =
entity === ClinicalEntitySchemaNames.SPECIMEN && query.entityTypes.includes('donor');
entity === ClinicalEntitySchemaNames.SPECIMEN && entityTypes.includes('donor');

const isRequiredEntity =
isQueriedEntity || isRelatedEntity || requiresSampleRegistration || requiresSpecimens;
Expand Down Expand Up @@ -287,7 +291,14 @@ function extractEntityDataFromDonors(

const clinicalEntities: ClinicalEntityData[] = clinicalEntityData
.map((entity: EntityClinicalInfo) =>
mapEntityDocuments(entity, totalDonors, schemasWithFields, query, completionStats),
mapEntityDocuments(
entity,
totalDonors,
schemasWithFields,
entityTypes,
paginationQuery,
completionStats,
),
)
.filter(notEmpty);

Expand Down
2 changes: 1 addition & 1 deletion src/common-model/entities.ts
Original file line number Diff line number Diff line change
Expand Up @@ -80,7 +80,7 @@ export const aliasEntityNames: Record<ClinicalEntitySchemaNames, EntityAlias> =
biomarker: 'biomarker',
};

export const queryEntityNames = Object.values(aliasEntityNames);
export const allEntityNames = Object.values(aliasEntityNames);

export interface ClinicalEntityErrorRecord extends dictionaryEntities.SchemaValidationError {
entityName: ClinicalEntitySchemaNames;
Expand Down
5 changes: 1 addition & 4 deletions src/logger.ts
Original file line number Diff line number Diff line change
Expand Up @@ -40,13 +40,10 @@ export interface Logger {
}

const winstonLogger = winston.createLogger(logConfiguration);
if (process.env.LOG_LEVEL == 'debug') {
if (process.env.LOG_LEVEL === 'debug') {
console.log('logger configured: ', winstonLogger);
}
export const loggerFor = (fileName: string): Logger => {
if (process.env.LOG_LEVEL == 'debug') {
console.debug('creating logger for', fileName);
}
const source = fileName.substring(fileName.indexOf('argo-clinical'));
return {
error: (msg: string, err: Error): void => {
Expand Down
65 changes: 36 additions & 29 deletions src/resources/swagger.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -842,44 +842,43 @@ paths:
'200':
description: 'returns new entity exception after deletion'

/clinical/donors:
get:
deprecated: true
/clinical/donors/tsv:
post:
tags:
- Deprecated
summary: Get all program donors from db
parameters:
- $ref: '#/components/parameters/QueryProgramId'
- Clinical Data
summary: Download TSVs for donors by ID.
description: This request downloads all clinical data for donors rquested by ID. The response is a zip containing all clinical data as TSVs. Requires full system read permission to access.
requestBody:
description: JSON object with list of donor IDs
required: true
content:
application/json:
schema:
type: object
properties:
donorIds:
description: List of donor IDs, as numbers, to be included in request. At least one Donor ID minimum is required.
type: array
items:
type: numbers
default: []
responses:
'401':
$ref: '#/components/responses/UnauthorizedError'
'403':
$ref: '#/components/responses/ForbiddenError'
'400':
description: 'User input error. Either the request body was invalid, or some of the requested Donor IDs were not found.'
'500':
$ref: '#/components/responses/ServerError'
'200':
description: 'List of donors for the given program'
description: 'Zip file with clinical data for requested donors.'
content:
application/json:
application/zip:
schema:
type: array
items:
type: object
delete:
tags:
- Test
summary: Test endpoint to delete program donors in db (disabled in prod)
parameters:
- $ref: '#/components/parameters/QueryProgramId'
responses:
'401':
$ref: '#/components/responses/UnauthorizedError'
'403':
$ref: '#/components/responses/ForbiddenError'
'500':
$ref: '#/components/responses/ServerError'
'200':
description: 'all donors deleted'
type: string
format: binary

/clinical/program/{programId}/donor/{donorId}:
get:
tags:
Expand Down Expand Up @@ -1068,6 +1067,16 @@ paths:
description: Returns a zip file with commited clinical entity data in tsvs, filtered by request query. Exported data may not be valid with current dictionary.
parameters:
- $ref: '#/components/parameters/PathProgramId'
- in: query
name: page
schema:
type: integer
description: Paginated Results page number to return
- in: query
name: pageSize
schema:
type: integer
description: Number of paginated results per page
requestBody:
description: JSON object containing search filters
required: true
Expand Down Expand Up @@ -1359,8 +1368,6 @@ tags:
description: APIs to do the icgc import
- name: Admin
description: Admin operations collected for convenience
- name: Test
description: Protected endpoints that run in dev only, not in prod.
- name: Deprecated
description: Deprecated enpoints
components:
Expand Down
8 changes: 4 additions & 4 deletions src/routes/data.ts
Original file line number Diff line number Diff line change
Expand Up @@ -45,10 +45,10 @@ router.post(
wrapAsync(clinicalApi.getProgramClinicalSearchResults),
);
router.post('/program/:programId/clinical-errors', wrapAsync(clinicalApi.getProgramClinicalErrors));
router.post(
'/program/:programId/clinical-tsv',
wrapAsync(clinicalApi.getSpecificClinicalDataAsTsvsInZip),
);
router.post('/program/:programId/clinical-tsv', wrapAsync(clinicalApi.getDonorDataByIdAsTsvsInZip));

// Download TSVs for Donors by Donor ID
router.post('/donors/tsv', wrapAsync(clinicalApi.getDonorDataByIdAsTsvsInZip));

// Get TSV Data
router.get(
Expand Down