first pr for medperf demo #59

Draft
wants to merge 4 commits into
base: cf_policy_medperf
3 changes: 3 additions & 0 deletions .gitignore
@@ -5,3 +5,6 @@ dist
*.egg-info
*.ipynb_checkpoints
*.pyc
quick_*.sh
settings.sh
hfmodels-contract/
37 changes: 37 additions & 0 deletions medperf-contract/CMakeLists.txt
@@ -0,0 +1,37 @@
# Copyright 2024 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# This is necessary to get at the definitions necessary
# for the std::string class
INCLUDE(exchange_common)
LIST(APPEND WASM_LIBRARIES ${EXCHANGE_LIB})
LIST(APPEND WASM_INCLUDES ${EXCHANGE_INCLUDES})


INCLUDE(medperf_common.cmake)
LIST(APPEND WASM_LIBRARIES ${MEDPERF_LIB})
LIST(APPEND WASM_INCLUDES ${MEDPERF_INCLUDES})

ADD_LIBRARY(${MEDPERF_LIB} STATIC ${MEDPERF_SOURCES})
TARGET_INCLUDE_DIRECTORIES(${MEDPERF_LIB} PUBLIC ${MEDPERF_INCLUDES})

SET_PROPERTY(TARGET ${MEDPERF_LIB} APPEND_STRING PROPERTY COMPILE_OPTIONS "${WASM_BUILD_OPTIONS}")
SET_PROPERTY(TARGET ${MEDPERF_LIB} APPEND_STRING PROPERTY LINK_OPTIONS "${WASM_LINK_OPTIONS}")
SET_TARGET_PROPERTIES(${MEDPERF_LIB} PROPERTIES EXCLUDE_FROM_ALL TRUE)

BUILD_CONTRACT(medperf_token_object contracts/token_object.cpp)

# -----------------------------------------------------------------
INCLUDE(Python)
BUILD_WHEEL(medperf medperf_token_object)
32 changes: 32 additions & 0 deletions medperf-contract/MANIFEST
@@ -0,0 +1,32 @@
MANIFEST.in
./setup.py
./pdo/medperf/wsgi/provision_token_issuer.py
./pdo/medperf/wsgi/__init__.py
./pdo/medperf/wsgi/add_endpoint.py
./pdo/medperf/wsgi/provision_token_object.py
./pdo/medperf/wsgi/info.py
./pdo/medperf/wsgi/process_capability.py
./pdo/medperf/operations/use_dataset.py
./pdo/medperf/operations/__init__.py
./pdo/medperf/plugins/medperf_token_object.py
./pdo/medperf/plugins/__init__.py
./pdo/medperf/plugins/medperf_guardian.py
./pdo/medperf/__init__.py
./pdo/medperf/resources/resources.py
./pdo/medperf/resources/__init__.py
./pdo/medperf/common/guardian_service.py
./pdo/medperf/common/capability_keystore.py
./pdo/medperf/common/secrets.py
./pdo/medperf/common/__init__.py
./pdo/medperf/common/endpoint_registry.py
./pdo/medperf/common/utility.py
./pdo/medperf/common/capability_keys.py
./pdo/medperf/scripts/__init__.py
./pdo/medperf/scripts/guardianCLI.py
./pdo/medperf/scripts/scripts.py
./scripts/gs_stop.sh
./scripts/gs_start.sh
./scripts/gs_status.sh
./context/tokens.toml
./etc/medperf.toml
./etc/guardian_service.toml
4 changes: 4 additions & 0 deletions medperf-contract/MANIFEST.in
@@ -0,0 +1,4 @@
recursive-include ../build/medperf-contract *.b64
recursive-include etc *.toml
recursive-include context *.toml
recursive-include scripts *.sh
124 changes: 124 additions & 0 deletions medperf-contract/PROTOCOL.md
@@ -0,0 +1,124 @@
## Workflow

```mermaid
sequenceDiagram
participant E as Experiment Committee
participant PDO as PDO Contract (SGX)
participant G as Digital Guardian Service
participant D as Data Owner
participant M as MedPerf Server
%% participant MO as Model Owner
note over G, M: experiment-data association (assume finished)
note over G, M: model-data association (assume finished)
note over M, PDO: assume the state data is synchronized
loop Attestation
PDO -> G:
end
D ->> PDO: create token_issuer and call cmd_mint_token
PDO ->> PDO: generate new identity token_issuer and new contract token.token_object
D ->> PDO: transfer token.token_object to new owner token_experiment1
PDO ->> PDO: generate new identity token_experiment1 and new contract token.token_experiment1
note over E, PDO: ownership transferred to Experiment Committee
note over M: notification about token/dataset availability
box Assume Trusted
participant M
end
box SGX Enclave
participant G
end
E ->> PDO: request inference on a specific model_id and experiment_id (cmd_use_dataset)
activate PDO
PDO ->> PDO: policy evaluation
note right of PDO: policy: <br> experiment_id = experiment_id' <br> model_id in {model_id}
PDO ->> E: return capabilities, encoded with urls for dockerfile and weights of the granted model.identity of
deactivate PDO
E ->> G: request access to service with capability (storage)
E ->> D:
activate G
note over G: pull/download/deploy mlcube (the model)
G ->> G: run inference over the dataset
G ->> D: return result (basic verification with hash)
D ->> E: return results
deactivate G
```


## Protocols

From the dataset owner's side:
+ The dataset is identified by its hash.
+ Tokens are minted with a hash-based binding to the dataset.
+ The guardian service of the token exposes an API to callers.
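
A minimal sketch of how a dataset identifier could be derived from a hash, assuming the dataset is a directory of files and that SHA-256 is the digest. Neither choice is stated in this PR, and `dataset_hash` is a hypothetical helper name, not part of the contract or of MedPerf.

```python
import hashlib
from pathlib import Path


def dataset_hash(root: str) -> str:
    """Return a SHA-256 hex digest over every file under root.

    Assumption: files are visited in sorted path order so the result
    is reproducible across runs on the same directory tree.
    """
    digest = hashlib.sha256()
    base = Path(root)
    for path in sorted(p for p in base.rglob("*") if p.is_file()):
        # Bind the relative file name as well as the content,
        # so renaming a file also changes the identifier.
        digest.update(path.relative_to(base).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
    return digest.hexdigest()
```

The resulting hex string could then serve as the `dataset_id` that a token is bound to at mint time.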


## Core functions

+ Allow the data owner to tokenize a dataset (into a PDO contract)
  + with a default policy that checks the registration information from MedPerf
  + ownership is transferred to the experiment committee
  + generates a capability that grants access to the guardian service

+ Allow the experiment committee to initialize the use of data by interacting with the PDO contract
  + capabilities are published on the MedPerf server (or PDO state/ledger?) for download

+ Allow the data owner to host a guardian service, which exposes a (local) interface to initialize the test
  + the data owner downloads the capability from the server (or ledger)
  + the data owner feeds the capability to the guardian service to initialize the experiment
  + the guardian service publishes experiment results to the server (or ledger)
  + allows access control?



### Guardian service

Datasets are hosted behind the service.
<!-- The service can be called from the -->
<!-- + called from the model owner -- one model inference over the dataset -->
<!-- + called from the experiment committee (TBA) -->

The guardian service is co-located with the dataset. It provides a WSGI API to process capabilities. A capability encodes `{model_id, dataset_id, url_to_docker, url_to_weights}`. After receiving a capability, the guardian service:
1. pulls/builds the Docker image from `url_to_docker`
2. downloads the weights from `url_to_weights`
3. brings the model up and runs it over `dataset_id`
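
The three steps above can be sketched as follows. This is an illustration only: it assumes the capability arrives as plain JSON (the real capability is presumably protected), and the three callables `pull_image`, `fetch_weights`, and `run_model` are hypothetical stand-ins for the Docker and download logic, which this PR page does not show.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Capability:
    model_id: str
    dataset_id: str
    url_to_docker: str
    url_to_weights: str


def parse_capability(raw: str) -> Capability:
    """Decode a JSON capability; a missing field raises KeyError."""
    fields = json.loads(raw)
    return Capability(
        model_id=fields["model_id"],
        dataset_id=fields["dataset_id"],
        url_to_docker=fields["url_to_docker"],
        url_to_weights=fields["url_to_weights"],
    )


def process_capability(
    raw: str,
    pull_image: Callable[[str], str],
    fetch_weights: Callable[[str], str],
    run_model: Callable[[str, str, str], dict],
) -> dict:
    """Run the three guardian steps in order and return the model's result."""
    cap = parse_capability(raw)
    image = pull_image(cap.url_to_docker)             # step 1: pull/build the image
    weights = fetch_weights(cap.url_to_weights)       # step 2: download the weights
    return run_model(image, weights, cap.dataset_id)  # step 3: run over the dataset
```

Passing the side-effecting steps in as callables keeps the ordering logic testable without Docker or network access.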

There is no execution integrity for now.
<!-- dataset encryption -->



### Contract methods

New contract methods under the class `ww::medperf::token_object`:

`initialize`: public method that mints the token for the dataset; this is the actual method behind `cmd_mint_token`.
1. Set up the key-value store (KVS) for the experiment/model/dataset info.
2. Store the registered metadata from the MedPerf service (synthetic for the PoC).
3. Set a `max_evaluation`.

| Keys | Values |
| ---- | ---- |
| experiment_id | identifier for the experiment |
| model_id | identifier for the model |
| {urls} | URLs to model assets |
| dataset_id | hash of the dataset |
| max_evaluation | maximum number of models allowed to evaluate |
| cur_evaluation | 0 |
| approved_capability | {} |
| TBA | |
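
The table above can be read as the initial state written at mint time. The following Python sketch mirrors it; the real state lives in the C++ contract's key-value store, so `TokenState`, this `initialize` signature, and the `max_evaluation >= 1` check are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TokenState:
    experiment_id: str
    model_id: str
    urls: dict               # URLs to model assets
    dataset_id: str          # hash of the dataset
    max_evaluation: int      # upper bound on evaluations
    cur_evaluation: int = 0  # starts at zero, as in the table
    approved_capability: dict = field(default_factory=dict)


def initialize(experiment_id: str, model_id: str, urls: dict,
               dataset_id: str, max_evaluation: int) -> TokenState:
    """Build the initial state; rejecting a non-positive limit is an assumption."""
    if max_evaluation < 1:
        raise ValueError("max_evaluation must be at least 1")
    # Copy urls so later changes by the caller do not alter the stored state.
    return TokenState(experiment_id, model_id, dict(urls), dataset_id, max_evaluation)
```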

`get_dataset_info`: public method that returns the non-secret KVS information associated with the dataset token.

A capability is linked to an identity (invoked by the data owner only).

<!-- `get_capability`:
1. called by model owner -->

`use_dataset(model_id, dataset_id, experiment_id)`: invoked only by the TO
1. Check that `model_id`, `dataset_id`, and `experiment_id` are in the KV storage.
2. Increment `cur_evaluation`; fail if `max_evaluation` has already been reached.
3. Invoke `get_capability` and return the secretly encoded {urls}.

Multiple models may be requested in one call.

`get_capability`: invoked only by the TO; parses the capability KV.
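
The checks listed for `use_dataset` can be sketched in Python as below. Several points are assumptions: "TO" is read as the token owner, the state is a plain dictionary with hypothetical keys (`owner`, `models`), the limit is checked before the counter is incremented, and the capability is returned unencrypted, whereas the contract is described as returning it secretly encoded.

```python
def get_capability(state: dict, model_id: str) -> dict:
    """Assemble the capability fields for one registered model."""
    urls = state["models"][model_id]
    return {
        "model_id": model_id,
        "dataset_id": state["dataset_id"],
        "url_to_docker": urls["docker"],
        "url_to_weights": urls["weights"],
    }


def use_dataset(state: dict, caller: str, model_id: str,
                dataset_id: str, experiment_id: str) -> dict:
    """Evaluate the policy and return a capability, or raise PermissionError."""
    if caller != state["owner"]:
        raise PermissionError("only the token owner may use the dataset")
    if experiment_id != state["experiment_id"] or dataset_id != state["dataset_id"]:
        raise PermissionError("experiment or dataset does not match the token")
    if model_id not in state["models"]:
        raise PermissionError("model is not registered for this token")
    if state["cur_evaluation"] >= state["max_evaluation"]:
        raise PermissionError("max_evaluation reached")
    state["cur_evaluation"] += 1  # count this evaluation before granting access
    return get_capability(state, model_id)
```

With `max_evaluation` set to 1, the first call succeeds and the second raises, which matches the failure case described in the README's testing workflow.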
40 changes: 40 additions & 0 deletions medperf-contract/README.md
@@ -0,0 +1,40 @@
<!---
Licensed under Creative Commons Attribution 4.0 International License
https://creativecommons.org/licenses/by/4.0/
--->

# PDO-enhanced MedPerf Workflow
This document describes an enhanced workflow for Federated Evaluation (FE) in machine learning. It aims to provide a policy-enforced framework for digital asset usage based on tokenization. The framework builds on the FE workflow from [MedPerf](https://github.com/mlcommons/medperf).

## Note

The initial version of this project covers only the protection of the dataset. Other digital assets, including the model (weights) and experiment results, are not covered for now.


## Test with the MedPerf tutorial
The existing test simulates the PDO-enhanced workflow by running both components simultaneously. The PDO command accesses the MedPerf database directly to simulate the information updates.

Set up the PDO test environment and make sure [MedPerf](https://github.com/mlcommons/medperf) is installed and built. Then set the environment variables `MEDPERF_SQLITE_PATH`, `MEDPERF_VENV_PATH`, and `MEDPERF_HOME`.

Run [easy_test.sh](./easy_test.sh) to start the test.
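
Before running the script, it may help to confirm that the three variables are set. The helper below is a hypothetical convenience, not part of this PR; it only checks that each variable is present and non-empty and makes no assumption about what the paths should contain.

```python
import os

# Variable names taken from the setup instructions above.
REQUIRED = ("MEDPERF_SQLITE_PATH", "MEDPERF_VENV_PATH", "MEDPERF_HOME")


def missing_variables(environ=os.environ) -> list:
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED if not environ.get(name)]
```

Calling `missing_variables()` with no argument inspects the current process environment; an empty list means all three variables are defined.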




<!-- ```c++
ww::medperf::token_object::initilization()
ww::medperf::token_object::use_dataset()
ww::medperf::token_object::get
``` -->


## Testing workflow
+ Create the token issuer and mint the token for the dataset.
+ The token issuer transfers the token to model_owner1.
+ model_owner1 invokes `cmd_use_dataset` with a specific `model_id` and `dataset_id`.
  + Get the capability for one model and increase the count.
  + Call the service to run inference.
+ model_owner1 invokes `cmd_use_dataset` again with a specific `model_id` and `dataset_id`.
  + `max_evaluation` is exceeded, so the call fails.

For a detailed description of the whole protocol, see [PROTOCOL.md](./PROTOCOL.md).
54 changes: 54 additions & 0 deletions medperf-contract/context/tokens.toml
@@ -0,0 +1,54 @@
# Copyright 2024 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# -----------------------------------------------------------------
# token ${token}
# -----------------------------------------------------------------
[token.${token}.asset_type]
module = "pdo.exchange.plugins.asset_type"
identity = "token_type"
source = "${ContractFamily.Exchange.asset_type.source}"
name = "${token}"
description = "asset type for ${token} token objects"
link = "http://"

[token.${token}.vetting]
module = "pdo.exchange.plugins.vetting"
identity = "token_vetting"
source = "${ContractFamily.Exchange.vetting.source}"
asset_type_context = "@{..asset_type}"

[token.${token}.guardian]
module = "pdo.medperf.plugins.medperf_guardian"
url = "${url}"
identity = "${..token_issuer.identity}"
token_issuer_context = "@{..token_issuer}"
service_only = true

[token.${token}.token_issuer]
module = "pdo.exchange.plugins.token_issuer"
identity = "token_issuer"
source = "${ContractFamily.Exchange.token_issuer.source}"
token_object_context = "@{..token_object}"
vetting_context = "@{..vetting}"
guardian_context = "@{..guardian}"
description = "issuer for token ${token}"
count = 1

[token.${token}.token_object]
module = "pdo.medperf.plugins.medperf_token_object"
identity = "${..token_issuer.identity}"
source = "${ContractFamily.medperf.token_object.source}"
token_issuer_context = "@{..token_issuer}"
data_guardian_context = "@{..guardian}"
80 changes: 80 additions & 0 deletions medperf-contract/contracts/token_object.cpp
@@ -0,0 +1,80 @@
/* Copyright 2023 Intel Corporation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

#include <string>
#include <stddef.h>
#include <stdint.h>

#include "Dispatch.h"

#include "Cryptography.h"
#include "KeyValue.h"
#include "Environment.h"
#include "Message.h"
#include "Response.h"
#include "Types.h"
#include "Util.h"
#include "Value.h"
#include "WasmExtensions.h"

#include "contract/base.h"
#include "contract/attestation.h"
#include "exchange/issuer_authority_base.h"
#include "exchange/token_object.h"
#include "medperf/token_object.h"

// -----------------------------------------------------------------
// METHOD: initialize_contract
// -----------------------------------------------------------------
bool initialize_contract(const Environment& env, Response& rsp)
{
ASSERT_SUCCESS(rsp, ww::exchange::token_object::initialize_contract(env),
"failed to initialize the base contract");

return rsp.success(true);
}

// -----------------------------------------------------------------
// -----------------------------------------------------------------
contract_method_reference_t contract_method_dispatch_table[] = {

CONTRACT_METHOD2(get_verifying_key, ww::contract::base::get_verifying_key),
CONTRACT_METHOD2(initialize, ww::medperf::token_object::initialize),

// issuer methods
CONTRACT_METHOD2(get_asset_type_identifier, ww::exchange::issuer_authority_base::get_asset_type_identifier),
CONTRACT_METHOD2(get_issuer_authority, ww::exchange::issuer_authority_base::get_issuer_authority),
CONTRACT_METHOD2(get_authority, ww::exchange::issuer_authority_base::get_authority),

// from the attestation contract
CONTRACT_METHOD2(get_contract_metadata, ww::contract::attestation::get_contract_metadata),
CONTRACT_METHOD2(get_contract_code_metadata, ww::contract::attestation::get_contract_code_metadata),

// use the asset
CONTRACT_METHOD2(get_dataset_info, ww::medperf::token_object::get_dataset_info),
CONTRACT_METHOD2(use_dataset, ww::medperf::token_object::use_dataset),
CONTRACT_METHOD2(get_capability, ww::medperf::token_object::get_capability),
CONTRACT_METHOD2(owner_test, ww::medperf::token_object::owner_test),
CONTRACT_METHOD2(update_policy, ww::medperf::token_object::update_policy),

// object transfer, escrow & claim methods
CONTRACT_METHOD2(transfer, ww::exchange::token_object::transfer),
CONTRACT_METHOD2(escrow, ww::exchange::token_object::escrow),
CONTRACT_METHOD2(escrow_attestation, ww::exchange::token_object::escrow_attestation),
CONTRACT_METHOD2(release, ww::exchange::token_object::release),
CONTRACT_METHOD2(claim, ww::exchange::token_object::claim),

{ NULL, NULL }
};
6 changes: 6 additions & 0 deletions medperf-contract/easy_test.sh
@@ -0,0 +1,6 @@
source "${PDO_SOURCE_ROOT}/build/common-config.sh"
source "${PDO_INSTALL_ROOT}/bin/activate"
make -C "${PDO_CONTRACTS_ROOT}"
make -C "${PDO_CONTRACTS_ROOT}" install
"${PDO_CONTRACTS_ROOT}/medperf-contract/test/mp_test.sh" --host "${PDO_HOSTNAME}" --ledger "${PDO_LEDGER_URL}"
deactivate