[BACK-43] Dedup hash val #195

Closed
wants to merge 33 commits into from
Changes from 24 commits
33 commits
0b906be
include _deduplicator for upload type
jh-bate Aug 24, 2023
da75a44
add _deduplicator to output
jh-bate Aug 24, 2023
69c7645
deduplicator for all types
jh-bate Aug 30, 2023
f63537d
remove empty hash
jh-bate Aug 31, 2023
47b7e21
bson format
jh-bate Aug 31, 2023
90c4a47
omit deduplicator
jh-bate Aug 31, 2023
298d146
Merge branch 'master' into add-platform-deduplication-hash
jh-bate Sep 5, 2023
639286a
test mongo changes
jh-bate Sep 5, 2023
50759a8
_deduplicator for all types
jh-bate Sep 5, 2023
b81bf9e
generate hash as per platform
jh-bate Sep 5, 2023
d8217eb
formatting
jh-bate Sep 5, 2023
d2103b0
update to use the registered idFields for each data type
jh-bate Sep 6, 2023
c000e04
register for each individual type
jh-bate Sep 7, 2023
f804770
formatting
jh-bate Sep 7, 2023
e57f538
sanity tests for datum hash
jh-bate Sep 11, 2023
549aef6
test fixes
jh-bate Sep 11, 2023
8a8ff60
formatting
jh-bate Sep 12, 2023
60e3601
add tests that are direct comparison with what platform returns
jh-bate Sep 14, 2023
32473a2
fix test
jh-bate Sep 14, 2023
d58fa99
[BACK-43] Add platform style conversion (#193)
jh-bate Sep 18, 2023
c11f9d2
fix conversion rounding
jh-bate Sep 20, 2023
6528cb4
only upload has name and version _deduplicator detail
jh-bate Sep 25, 2023
80cc1b9
do not truncate value
jh-bate Oct 4, 2023
0d89f0b
add test for truncated val used in hash
jh-bate Oct 4, 2023
572050b
naming updates from review
jh-bate Oct 10, 2023
24a6d7a
updates to error if type not registered for platform hash
jh-bate Oct 11, 2023
4d8b48a
remove unreachable code
jh-bate Oct 11, 2023
986c7fa
Merge branch 'master' into dedup-hash-val
jh-bate Nov 7, 2023
21625df
tweak to provide upload destination
jh-bate Nov 29, 2023
a42583d
Merge branch 'master' into dedup-hash-val
jh-bate Nov 30, 2023
5da25ed
sp
jh-bate Dec 1, 2023
6ab36c1
Merge branch 'master' into dedup-hash-val
jh-bate Apr 4, 2024
618c957
remove commented out code for summary tests
jh-bate Apr 9, 2024
15 changes: 11 additions & 4 deletions lib/misc.js
@@ -1,15 +1,15 @@
/*
* == BSD2 LICENSE ==
* Copyright (c) 2014, Tidepool Project
*
*
* This program is free software; you can redistribute it and/or modify it under
* the terms of the associated License, which is identical to the BSD 2-Clause
* License as published by the Open Source Initiative at opensource.org.
*
*
* This program is distributed in the hope that it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
* FOR A PARTICULAR PURPOSE. See the License for more details.
*
*
* You should have received a copy of the License along with this program; if
* not, you can obtain one from Tidepool Project at tidepool.org.
* == BSD2 LICENSE ==
@@ -45,7 +45,7 @@ var except = amoeba.except;
* @param fields an array of values to be concatenated together into a unique string
* @returns {string} the base32 encoded hash of the delimited-concatenation of the provided fields (also known as a "unique" id)
*/
exports.generateId = function(fields) {
exports.generateId = function (fields) {
var hasher = crypto.createHash('sha1');

for (var i = 0; i < fields.length; ++i) {
@@ -65,3 +65,10 @@ exports.generateId = function(fields) {
return base32hex.encodeBuffer(hasher.digest(), { paddingChar: '-' });
};

exports.generateHash = function (identityFields) {
const identityString = identityFields.join('|').trim();
return crypto
.createHash('sha256')
.update(identityString, 'binary')
.digest('base64');
};
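
The new `generateHash` joins the identity fields with `|`, trims the joined string, and returns the base64-encoded SHA-256 digest. The following is a minimal standalone sketch of that behavior using only Node's built-in `crypto` module. The function body is copied from the diff; the field values are hypothetical and chosen only for illustration.

```javascript
'use strict';

const crypto = require('crypto');

// Same logic as the generateHash added to lib/misc.js in this diff.
function generateHash(identityFields) {
  // trim() only strips whitespace at the ends of the joined string,
  // not around each individual field.
  const identityString = identityFields.join('|').trim();
  return crypto
    .createHash('sha256')
    // 'binary' is Node's alias for latin1 input encoding.
    .update(identityString, 'binary')
    .digest('base64');
}

// Hypothetical identity values, in the order _userId, deviceId, time, type, units, value.
const fields = ['user123', 'device-abc', '2023-08-24T10:00:00Z', 'cbg', 'mmol/L', '5.5'];

const hash = generateHash(fields);
const again = generateHash(fields.slice());
const other = generateHash([...fields.slice(0, 5), '5.6']);

console.log(hash.length);    // 44 (a 32-byte digest is 44 base64 characters with padding)
console.log(hash === again); // true: same fields, same hash
console.log(hash === other); // false: changing one field changes the hash
```

Because the fields are joined positionally, the order in which they are registered matters: the same values in a different order would produce a different hash.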
3 changes: 1 addition & 2 deletions lib/schema/basal.js
@@ -17,14 +17,13 @@

'use strict';

var util = require('util');

var _ = require('lodash');

var schema = require('./schema.js');

var idFields = ['type', 'deliveryType', 'deviceId', 'time'];
schema.registerIdFields('basal', idFields);
schema.registerFieldsForDuplicator('basal', ['deliveryType']);

var mismatchedSeries = 'basal/mismatched-series';

1 change: 1 addition & 0 deletions lib/schema/bloodKetone.js
@@ -21,6 +21,7 @@ var schema = require('./schema.js');

var idFields = ['type', 'deviceId', 'time'];
schema.registerIdFields('bloodKetone', idFields);
schema.registerFieldsForDuplicator('bloodKetone', ['units', 'value']);

module.exports = schema.makeHandler('bloodKetone', {
schema: {
3 changes: 1 addition & 2 deletions lib/schema/bolus.js
@@ -17,14 +17,13 @@

'use strict';

var util = require('util');

var _ = require('lodash');

var schema = require('./schema.js');

var idFields = ['type', 'subType', 'deviceId', 'time'];
schema.registerIdFields('bolus', idFields);
schema.registerFieldsForDuplicator('bolus', ['subType']);

module.exports = function(streamDAO){
return schema.makeSubHandler(
1 change: 1 addition & 0 deletions lib/schema/cbg.js
@@ -21,6 +21,7 @@ var schema = require('./schema.js');

var idFields = ['type', 'deviceId', 'time'];
schema.registerIdFields('cbg', idFields);
schema.registerFieldsForDuplicator('cbg', ['units','value']);

module.exports = schema.makeHandler('cbg', {
schema: {
2 changes: 1 addition & 1 deletion lib/schema/cgmSettings.js
@@ -18,9 +18,9 @@
'use strict';

var schema = require('./schema.js');

var idFields = ['type', 'time', 'deviceId'];
schema.registerIdFields('cgmSettings', idFields);
schema.registerFieldsForDuplicator('cgmSettings');

var lowHighAlertsSchema = {
enabled: schema.isBoolean,
1 change: 1 addition & 0 deletions lib/schema/deviceEvent.js
@@ -29,6 +29,7 @@ var UNKNOWN_PREV = 'status/unknown-previous';

var idFields = ['type', 'subType', 'time', 'deviceId'];
schema.registerIdFields('deviceEvent', idFields);
schema.registerFieldsForDuplicator('deviceEvent',['subType']);

var statusReasons = ['manual', 'automatic'];

77 changes: 77 additions & 0 deletions lib/schema/duplicate.js
@@ -0,0 +1,77 @@
'use strict';
Contributor

[nit] Why is this file called duplicate.js? Perhaps platform.js or similar, to indicate that it is the platform deduplicator? That way, when it is used, it would read platform.registerFieldsForDuplicator, which clearly shows it is for platform.

Contributor

Or perhaps include "platform" in the exported function names so it is clear that they are for platform deduplication.


var amoeba = require('amoeba');
var except = amoeba.except;
var misc = require('../misc.js');
var schema = require('./schema.js');

var idHashFieldMap = {};

const getIdHashFields = function (type) {
var retVal = idHashFieldMap[type];
if (retVal == null) {
throw except.IAE('No known hashFields for type[%s]', type);
}
return retVal;
};

exports.registerFieldsForDuplicator = function (type, idHashFields = []) {
if (idHashFieldMap[type] == null) {
idHashFieldMap[type] = [
'_userId',
'deviceId',
'time',
'type',
...idHashFields,
];
} else {
throw except.IAE(
'Id hash fields for type[%s] already defined[%j], cannot set[%j]',
type,
idHashFieldMap[type],
idHashFields
);
}
};
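
The registration function above prepends four common fields to whatever type-specific fields are passed in and refuses to register the same type twice. A standalone sketch of that pattern follows. It is not the module itself: a plain `Error` stands in for amoeba's `except.IAE`, and the `BASE_FIELDS` constant is a name introduced here for illustration.

```javascript
'use strict';

// Common fields that the diff prepends for every type.
// BASE_FIELDS is a hypothetical name; the diff inlines this list.
const BASE_FIELDS = ['_userId', 'deviceId', 'time', 'type'];
const idHashFieldMap = {};

function registerFieldsForDuplicator(type, idHashFields = []) {
  if (idHashFieldMap[type] != null) {
    // The diff throws except.IAE here; a plain Error is used as a stand-in.
    throw new Error(`Id hash fields for type[${type}] already defined`);
  }
  idHashFieldMap[type] = [...BASE_FIELDS, ...idHashFields];
}

registerFieldsForDuplicator('cbg', ['units', 'value']);
registerFieldsForDuplicator('food'); // no extra fields: only the common four

console.log(idHashFieldMap.cbg);  // [ '_userId', 'deviceId', 'time', 'type', 'units', 'value' ]
console.log(idHashFieldMap.food); // [ '_userId', 'deviceId', 'time', 'type' ]

// Registering the same type a second time is rejected.
let duplicateRejected = false;
try {
  registerFieldsForDuplicator('cbg', ['value']);
} catch (err) {
  duplicateRejected = true;
}
console.log(duplicateRejected); // true
```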

exports.generateHash = function (datum) {
if (typeof datum === 'string') {
return datum;
}
var idHashFields = idHashFieldMap[datum.type];
if (!idHashFields) {
return '';
Contributor

Does the caller check for an empty string and skip writing the platform deduplicator hash? I don't think we want any empty strings in the database.

}
var vals = new Array(idHashFields.length);
for (var i = 0; i < idHashFields.length; ++i) {
let val = datum[idHashFields[i]];
if (val == null) {
throw except.IAE(
"Can't generate hash, field[%s] didn't exist on datum of type[%s]",
idHashFields[i],
datum.type
);
}
if (idHashFields[i] === 'time') {
const dateTime = new Date(val);
// NOTE: platform `time` is being returned minus millis so we need to do the same here
val = dateTime.toISOString().split('.')[0] + 'Z';
Contributor

Huh, I didn't know this. And it has the Z timezone at the end?

Contributor

Do we need to convert timezones here?

}
if (idHashFields[i] === 'value') {
if (
datum.type === 'smbg' ||
datum.type === 'bloodKetone' ||
datum.type === 'cbg'
) {
// NOTE: platform `value` precision is being used so that the hash will be the same
if (val.toString().length > 7) {
Contributor

I think we only want to do this for mg/dL per our platform discussion.

val = schema.convertMgToMmolPrecision(val);
}
}
}
vals[i] = String(val);
}
return misc.generateHash(vals);
};

exports.getIdHashFields = getIdHashFields;
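
The review questions about the trailing `Z` and timezones can be checked in isolation. `Date.prototype.toISOString` always formats in UTC as `YYYY-MM-DDTHH:mm:ss.sssZ`, so splitting on `.` drops the milliseconds and the `Z` has to be appended again. The sketch below wraps the expression from the diff in a helper; the name `truncateToSeconds` and the input timestamps are hypothetical.

```javascript
'use strict';

// Same expression as in generateHash: drop fractional seconds, re-append 'Z'.
// truncateToSeconds is a hypothetical helper name used only for this sketch.
function truncateToSeconds(time) {
  return new Date(time).toISOString().split('.')[0] + 'Z';
}

// Input already in UTC, with milliseconds.
const withMillis = truncateToSeconds('2023-08-24T10:15:30.123Z');

// Input with a +02:00 offset: toISOString converts it to UTC.
// The fractional part is truncated, not rounded (.999 does not become :31).
const withOffset = truncateToSeconds('2023-08-24T12:15:30.999+02:00');

console.log(withMillis); // 2023-08-24T10:15:30Z
console.log(withOffset); // 2023-08-24T10:15:30Z
```

Under these assumptions no separate timezone conversion is needed for inputs that carry an explicit offset, since both forms normalize to the same UTC string. Inputs without any offset are a different case and are not covered by this sketch.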
1 change: 1 addition & 0 deletions lib/schema/food.js
@@ -21,6 +21,7 @@ var schema = require('./schema.js');

var idFields = ['type', 'deviceId', 'time'];
schema.registerIdFields('food', idFields);
schema.registerFieldsForDuplicator('food');

module.exports = schema.makeHandler('food', {
schema: {
1 change: 1 addition & 0 deletions lib/schema/insulin.js
@@ -21,6 +21,7 @@ var schema = require('./schema.js');

var idFields = ['type', 'deviceId', 'time'];
schema.registerIdFields('insulin', idFields);
schema.registerFieldsForDuplicator('insulin');

module.exports = schema.makeHandler('insulin', {
schema: {
1 change: 1 addition & 0 deletions lib/schema/note.js
@@ -21,6 +21,7 @@ var schema = require('./schema.js');

var idFields = ['type', 'time', 'creatorId', 'text', 'deviceId'];
schema.registerIdFields('note', idFields);
schema.registerFieldsForDuplicator('note');
Contributor

FYI - I don't think platform handles this data type. If Uploader sends this data type, we'll need to update platform. Please note this in the narrative.

Member

Uploader does not use note, but I think Tidepool Mobile and Blip do? Then again, they're probably both using Platform, right?


module.exports = function(streamDAO){
return schema.makeHandler('note', {
1 change: 1 addition & 0 deletions lib/schema/physicalActivity.js
@@ -21,6 +21,7 @@ var schema = require('./schema.js');

var idFields = ['type', 'deviceId', 'time'];
schema.registerIdFields('physicalActivity', idFields);
schema.registerFieldsForDuplicator('physicalActivity');

var durationHoursSchema = {
value: schema.and(
1 change: 1 addition & 0 deletions lib/schema/pumpSettings.js
@@ -23,6 +23,7 @@ var schema = require('./schema.js');

var idFields = ['type', 'deviceId', 'time'];
schema.registerIdFields('pumpSettings', idFields);
schema.registerFieldsForDuplicator('pumpSettings');

function forEachItem(obj, fn) {
if (Array.isArray(obj)) {
1 change: 1 addition & 0 deletions lib/schema/reportedState.js
@@ -21,6 +21,7 @@ var schema = require('./schema.js');

var idFields = ['type', 'deviceId', 'time'];
schema.registerIdFields('reportedState', idFields);
schema.registerFieldsForDuplicator('reportedState');

var stateArraySchema = schema.isArrayWithValueSchema({
state: schema.in('alcohol', 'cycle', 'hyperglycemiaSymptoms', 'hypoglycemiaSymptoms', 'illness', 'stress'),
12 changes: 12 additions & 0 deletions lib/schema/schema.js
@@ -24,6 +24,7 @@ var except = amoeba.except;
var util = require('util');

var misc = require('../misc.js');
var duplicate = require('./duplicate.js');

exports.validDeviceTime = function(val) {
if (!/^(\d{4}-\d\d-\d\dT\d\d:\d\d:\d\d)$/.test(val)) {
@@ -252,6 +253,13 @@ exports.convertMgToMmol = function(mg) {
return mg / 18.01559;
};

exports.convertMgToMmolPrecision = function(mgValue) {
const mmolLToMgdLConversionFactor = 18.01559;
const mmolLToMgdLPrecisionFactor = 100000.0;
let mmolVal = parseInt(mgValue / mmolLToMgdLConversionFactor * mmolLToMgdLPrecisionFactor + 0.5);
return mmolVal / mmolLToMgdLPrecisionFactor;
};

exports.convertUnits = function(datum) {
var fields = Array.prototype.slice.call(arguments, 1);
var normalUnits = exports.normalizeUnitName(datum.units);
@@ -513,3 +521,7 @@ exports.makeId = function(datum) {

return exports.generateId(datum, idFields);
};

exports.generateHash = duplicate.generateHash;
Contributor

Perhaps include "platform" in the export names so it is clear they are for the platform deduplication.

exports.registerFieldsForDuplicator = duplicate.registerFieldsForDuplicator;
exports.getIdHashFields = duplicate.getIdHashFields;
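
The `convertMgToMmolPrecision` function added in this file divides by the conversion factor, scales by 100000, adds 0.5, and truncates, which rounds the result to five decimal places. The sketch below reproduces that arithmetic standalone so the rounding can be followed step by step. The function body mirrors the diff; the input values 100 and 180 are hypothetical examples.

```javascript
'use strict';

// Mirrors convertMgToMmolPrecision from lib/schema/schema.js in this diff.
function convertMgToMmolPrecision(mgValue) {
  const mmolLToMgdLConversionFactor = 18.01559;
  const mmolLToMgdLPrecisionFactor = 100000.0;
  // parseInt stringifies its argument and drops everything after the decimal
  // point, so adding 0.5 first gives round-half-up for these positive values.
  const scaled = parseInt(
    (mgValue / mmolLToMgdLConversionFactor) * mmolLToMgdLPrecisionFactor + 0.5
  );
  return scaled / mmolLToMgdLPrecisionFactor;
}

// 100 / 18.01559 is roughly 5.550748; scaled it is roughly 555074.8,
// which becomes 555075 after adding 0.5 and truncating.
const rounded100 = convertMgToMmolPrecision(100);

// 180 / 18.01559 is roughly 9.991346; scaled it is roughly 999134.6,
// which becomes 999135 after adding 0.5 and truncating.
const rounded180 = convertMgToMmolPrecision(180);

console.log(rounded100); // 5.55075
console.log(rounded180); // 9.99135
```

Both results have five decimal places, which is the precision the in-code note says is needed for the hash to match the platform's `value`.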
13 changes: 7 additions & 6 deletions lib/schema/smbg.js
@@ -1,15 +1,15 @@
/*
* == BSD2 LICENSE ==
* Copyright (c) 2014, Tidepool Project
*
*
* This program is free software; you can redistribute it and/or modify it under
* the terms of the associated License, which is identical to the BSD 2-Clause
* License as published by the Open Source Initiative at opensource.org.
*
*
* This program is distributed in the hope that it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
* FOR A PARTICULAR PURPOSE. See the License for more details.
*
*
* You should have received a copy of the License along with this program; if
* not, you can obtain one from Tidepool Project at tidepool.org.
* == BSD2 LICENSE ==
@@ -21,15 +21,16 @@ var schema = require('./schema.js');

var idFields = ['type', 'deviceId', 'time', 'value'];
schema.registerIdFields('smbg', idFields);
schema.registerFieldsForDuplicator('smbg', ['units', 'value']);

module.exports = schema.makeHandler('smbg', {
schema: {
deviceTime: schema.validDeviceTime,
value: schema.isNumber,
units: schema.in('mmol/L', 'mmol/l', 'mg/dL', 'mg/dl'),
subType: schema.ifExists(schema.in('manual', 'linked', ''))
subType: schema.ifExists(schema.in('manual', 'linked', '')),
},
transform: function(datum, cb) {
transform: function (datum, cb) {
return cb(null, schema.convertUnits(datum, 'value'));
}
},
});
1 change: 1 addition & 0 deletions lib/schema/upload.js
@@ -21,6 +21,7 @@ var schema = require('./schema.js');

var idFields = ['type', 'deviceId', 'time'];
schema.registerIdFields('upload', idFields);
schema.registerFieldsForDuplicator('upload');

module.exports = schema.makeHandler('upload', {
schema: {
1 change: 1 addition & 0 deletions lib/schema/urineKetone.js
@@ -21,6 +21,7 @@ var schema = require('./schema.js');

var idFields = ['type', 'deviceId', 'time'];
schema.registerIdFields('urineKetone', idFields);
schema.registerFieldsForDuplicator('urineKetone');
Contributor

FYI - I don't think platform handles this data type. If Uploader sends this data type, we'll need to update platform. Please note this in the narrative.

Member

Uploader does not use urineKetone, only bloodKetone


module.exports = schema.makeHandler('urineKetone', {
schema: {
1 change: 1 addition & 0 deletions lib/schema/wizard.js
@@ -22,6 +22,7 @@ var settings = require('./pumpSettings.js');

var idFields = ['type', 'deviceId', 'time'];
schema.registerIdFields('wizard', idFields);
schema.registerFieldsForDuplicator('wizard');

var recommendedSchema = schema.and(
schema.isObject,