The Python package braincube_connector provides a tool for data scientists to access their data on Braincube directly from Python.
Install with pip:
pip install braincube_connector
Since version 2.2.0, the authentication uses a personal access token (PAT).
To create a PAT, go to your braincube personal menu > Account > Access tokens > +/Add.
The scopes of the token should include BRAINCUBE and SSO_READ.
Then two options exist to pass the PAT to the braincube_connector:
- Using a configuration dictionary when creating a client:
from braincube_connector import client
client.get_instance(config_dict={"api_key":"<my_personal_access_token>", "domain":"mybraincube.com"})
- Using a configuration file:
from braincube_connector import client
client.get_instance(config_file="myfile.json")
myfile.json
{"api_key":"<my_personal_access_token>", "domain":"mybraincube.com"}
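To keep the token out of source code, the configuration file can be generated once from a value read elsewhere (for example an environment variable). The following is a minimal sketch using only the standard library; the helper name write_config is an illustrative assumption and is not part of braincube_connector.

```python
import json
from pathlib import Path


def write_config(path, token, domain):
    """Write a PAT configuration file (hypothetical helper, not a library function)."""
    # Same two keys as the myfile.json example above.
    config = {"api_key": token, "domain": domain}
    path = Path(path)
    path.write_text(json.dumps(config, indent=2))
    # Restrict the file to the current user, since it contains a secret.
    path.chmod(0o600)
    return config
```

Calling write_config("myfile.json", token, "mybraincube.com") produces a file that can then be passed to client.get_instance(config_file="myfile.json").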
The braincube_connector used to support only OAuth2 authentication. Now that the PAT is available, this is no longer the method we recommend, because the OAuth2 token is obtained with the braincube-token-getter, which is not under active development.
However, if you still want to use this method, you need to set up the configuration file (or dictionary) as follows:
config.json
{
"client_id": "app id",
"client_secret": "app key",
"domain": "mybraincube.com",
"verify": true,
"oauth2_token": "token value"
}
By default the connector searches for a PAT and uses the oauth2_token when the PAT is not present in the dictionary.
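The precedence described above can be pictured with a few lines of plain Python. This is only a sketch of the documented behaviour, assuming the configuration is a dictionary; select_token is a hypothetical name and this is not the library's actual code.

```python
def select_token(config):
    """Return (kind, value) for the credential a configuration would use.

    Illustrative only: mirrors the documented precedence (PAT first,
    OAuth2 token as a fallback).
    """
    if config.get("api_key"):
        return ("api_key", config["api_key"])
    if config.get("oauth2_token"):
        return ("oauth2_token", config["oauth2_token"])
    raise KeyError("the configuration needs 'api_key' or 'oauth2_token'")
```

With both keys present, the PAT wins; the OAuth2 token is only returned when api_key is missing or empty.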
Here is a list of the settings available in the configuration file:
- domain (optional if sso_base_url and braincube_base_url exist): The domain of the braincube to access.
- sso_base_url (optional if domain exists): The base URL of the SSO used to check the validity of your access token.
- braincube_base_url (optional if domain exists): The base URL of the Braincube API used to fetch data from.
- api_key (optional if oauth2_token exists): A personal access token generated in the braincube account configuration.
- oauth2_token (optional if api_key exists): An OAuth2 token obtained with the braincube-token-getter. Used only when api_key does not exist.
- verify (optional, default is True): If False, the requests do not verify the SSL certificate.
Setting verify to False must be done with care: it is a security threat (see the requests documentation).
The client_id and client_secret settings from the previous section are used only by the braincube_token_getter when requesting a new OAuth2 token.
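The "optional if" rules in the settings list can be checked before creating a client. The sketch below encodes those rules as stated in this README; check_config is a hypothetical helper, and the library may perform its own, different validation.

```python
def check_config(config):
    """Return a list of problems found in a configuration dictionary.

    Hypothetical pre-flight check based on the settings list above.
    """
    problems = []
    has_domain = "domain" in config
    has_urls = "sso_base_url" in config and "braincube_base_url" in config
    if not (has_domain or has_urls):
        problems.append("set 'domain', or both 'sso_base_url' and 'braincube_base_url'")
    if "api_key" not in config and "oauth2_token" not in config:
        problems.append("set 'api_key' or 'oauth2_token'")
    # verify defaults to True when it is absent.
    if not isinstance(config.get("verify", True), bool):
        problems.append("'verify' must be a boolean")
    return problems
```

An empty list means the dictionary satisfies the documented rules; it does not guarantee that the token itself is valid.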
If the client is not initialized manually or if no configuration is passed to get_instance, the package creates a client instance from one of these two files, ./config.json or ~/.braincube/config.json (in this priority order), when they exist.
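To see which file would be picked up on a given machine, the lookup order can be reproduced with the standard library. This is a sketch of the documented priority order only; find_default_config is a hypothetical helper and not the function the package uses internally.

```python
import os
from pathlib import Path

# The two documented locations, in priority order.
DEFAULT_CANDIDATES = (
    Path("config.json"),
    Path(os.path.expanduser("~/.braincube/config.json")),
)


def find_default_config(candidates=DEFAULT_CANDIDATES):
    """Return the first candidate path that exists as a file, or None."""
    for candidate in candidates:
        candidate = Path(candidate)
        if candidate.is_file():
            return candidate
    return None
```

A result of None means neither file exists, in which case a configuration has to be passed explicitly to get_instance.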
A client can be initialized manually from a custom configuration file:
from braincube_connector import client
client.get_instance(config_file="pathto/config.json")
Note: If the client is not initialized manually, the package creates a client instance from one of these two files, ./config.json or ~/.braincube/config.json (in this priority order), if they exist.
The connector gives access to different entities (described in more detail in the following sections) that share multiple methods:
- <entity>.get_name(): Returns the name of the entity.
- <entity>.get_bcid(): Returns the bcId identifier of the entity.
- <entity>.get_uuid(): Returns the braincube unique uuid identifier of the entity.
To obtain a list of all the available Braincube entities with a client:
from braincube_connector import braincube
braincube.get_braincube_list()
Or to select a specific Braincube entity from its name:
bc = braincube.get_braincube("demo")
The list of all the memory bases available within a Braincube is obtained with:
mb_list = bc.get_memory_base_list()
Note: A braincube can contain many memory bases, hence get_memory_base_list allows paginated requests, for example bc.get_memory_base_list(page=0).
To select a single memory base, use its bcId:
mb = bc.get_memory_base(20)
Variable descriptions are linked to a memory base:
var_desc = mb.get_variable(bcid="2000034")
For multiple variable descriptions:
mb.get_variable_list(page=0)
Note: Similarly to memory bases, providing no argument to get_variable_list retrieves all the descriptions available in the memory base.
The type of a variable is obtained with the function get_type:
var_desc.get_type()
DataGroups are obtained from a memory base:
datagroup = mb.get_datagroup(bcid="10")
The list of the available datagroups can also be obtained with mb.get_datagroup_list().
A datagroup is a container that includes multiple variables. They are accessed with:
datagroup.get_variable_ids() # Gets the variable bcIds
datagroup.get_variable_list() # Gets the list of VariableDescription objects.
An event is a predefined set of conditions in braincube. It is accessed as follows:
event = mb.get_event(bcid="10")
event_list = mb.get_event_list()
Events are useful because you can access the conditions they contain in order to create new filters for a get_data function:
event.get_conditions()
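As a sketch of how event conditions could be reused, the helper below gathers the conditions of several events into a single filters list. It assumes that get_conditions() returns a list of condition dictionaries in the same format as the filters parameter, which this README suggests but does not state explicitly. filters_from_events and FakeEvent are hypothetical names; FakeEvent only stands in for a real event object so that the example runs on its own.

```python
def filters_from_events(events, extra_filters=()):
    """Concatenate the conditions of several events into one filters list."""
    filters = []
    for event in events:
        # Assumption: get_conditions() returns a list of filter dictionaries.
        filters.extend(event.get_conditions())
    filters.extend(extra_filters)
    return filters


class FakeEvent:
    """Stand-in for a braincube event, used for illustration only."""

    def __init__(self, conditions):
        self._conditions = list(conditions)

    def get_conditions(self):
        return list(self._conditions)


event = FakeEvent([{"BETWEEN": ["mb20/d2000002", 0, 10]}])
my_filters = filters_from_events([event], [{"LESS": ["mb20/d2000003", 10]}])
```

With real objects, the resulting list would be passed as the filters argument of a get_data call.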
The job description contains the settings used to build an analysis and gives a proxy to access these parameters easily. A JobDescription is obtained from a memory base as follows:
job_desc = mb.get_job(bcid="573")
job_list = mb.get_job_list(page=0)
The properties are accessed with the following methods:
- get_conditions: Gets a list of the conditions used to select the job variables.
  job_desc.get_conditions()
  job_desc.get_conditions(combine=True)  # Merge the conditions into one
  job_desc.get_conditions(include_events=True)  # Includes the conditions from the job's events
- get_variable_ids: Gets a list of the variables involved in the job, including the target variables and the influence variables.
  job_desc.get_variable_ids()
- get_events: Gets a list of the event objects used by the job.
  job_desc.get_events()
- get_categories: Gets a list of conditions used to categorise a job's data as good or bad. You may have a middle category; it is an old categorisation which will not be used anymore.
  job_desc.get_categories()
- get_data: When a job is created on braincube, a separate copy of the data is made. For now, this copy is not available from the webservices. However, the get_data method collects the job's data from the memory base using the same filters as when the job was created. Be aware that these data might differ from the job's data if the memory base has been updated since the job was created. As with the get_data methods of other objects, a filters parameter is available to add filters to the job's conditions.
  job_desc.get_data()
The job rule descriptions are obtained with the methods get_rule or get_rule_list, either from a job or a memory base. The only difference is that, for a memory base, get_rule_list gets all the rules existing in the memory base, whereas for a job it gets only the rules specific to that job.
rule = job_desc.get_rule(bcid="200")
rule_list = job_desc.get_rule_list()
To access a RuleDescription object's metadata, you can call the get_metadata function:
rule.get_metadata()
A memory base can also request the data for a custom set of variable ids. Adding filters restricts the returned data to a desired subset of the data. The method is called as follows:
data = mb.get_data(["2000001", "2000034"], filters=my_filters, label_type="name", dataframe=True)
The output format is a dictionary, or a pandas DataFrame when the dataframe parameter is set to True. The keys/column labels are the variable bcIds or names depending on whether label_type is set to "bcid" or "name" respectively.
Note: By default the dates are not parsed to datetime objects in order to speed up the get_data function, but it is possible to enable the parsing:
from braincube_connector import parameters
parameters.set_parameter({"parse_date": True})
The get_data methods have the option to restrict the data that are collected by using a set of filters. The filters parameter must be a list of conditions (even for a single condition):
object.get_data(filters=[{"BETWEEN": ["mb20/d2000002",0,10]},{"BETWEEN": ["mb20/d2000003", -1, 1]}])
Here is a selection of the most common types of filters:
- Equals to: Selects the data when a variable is equal to a given value.
  { "EQUALS": [ "mb20/d2000002", 2.0] }
- Between: Selects the data when a variable belongs to a range.
  { "BETWEEN": [ "mb20/d2000003", -1, 1] }
- Lower than: Selects the data when a variable is lower than a certain value.
  { "LESS": [ "mb20/d2000003", 10] }
  Note: The LESS_EQUALS filter also exists.
- Greater than: Selects the data when a variable is greater than a certain value.
  { "GREAT": [ "mb20/d2000003", 10] }
  Note: The GREAT_EQUALS filter also exists.
- Not: The NOT condition creates the opposite of an existing condition.
  { "NOT": [{"filter":...}] }
- And gate: It is possible to combine filters using an AND gate.
  { "AND": [{"filter1":...}, {"filter2":...}] }
  Notes:
  - An AND filter can only host two conditions. In order to join more than two filters, multiple AND conditions should be nested one into another.
  - When multiple filters are provided in the filters parameter of get_data, they are joined together within the function using AND gates.
- Or gate: Similar to AND but uses an OR gate.
  { "OR": [{"filter1":...}, {"filter2":...}] }
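Because an AND or OR gate hosts only two conditions, joining three or more filters by hand means writing the nesting manually. The sketch below builds the nested dictionary with functools.reduce. All helper names (variable_ref, equals, between, combine) are hypothetical and not part of braincube_connector, and the "mb<memory base bcId>/d<variable bcId>" reference format is inferred from the examples in this README rather than from a specification.

```python
from functools import reduce


def variable_ref(mb_bcid, var_bcid):
    """Build a reference such as 'mb20/d2000002' (format inferred from the examples)."""
    return f"mb{mb_bcid}/d{var_bcid}"


def equals(ref, value):
    return {"EQUALS": [ref, value]}


def between(ref, low, high):
    return {"BETWEEN": [ref, low, high]}


def combine(gate, filters):
    """Nest filters pairwise, since a gate can only host two conditions."""
    filters = list(filters)
    if not filters:
        raise ValueError("at least one filter is required")
    # reduce folds from the left: gate(gate(f1, f2), f3), and so on.
    return reduce(lambda left, right: {gate: [left, right]}, filters)


nested = combine("AND", [
    equals(variable_ref(20, 2000002), 2.0),
    between(variable_ref(20, 2000003), -1, 1),
    {"LESS": [variable_ref(20, 2000003), 10]},
])
```

Here the first two filters end up in an inner AND gate, which is itself combined with the third filter. The result is a single condition, so it would be passed to get_data as filters=[nested].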
The braincube_connector provides a simple interface for the most common features of the braincube webservices or braindata, but it is not exhaustive.
If you need to access another endpoint of the braincube webservices or braindata, the request_ws function of the library can help you. The function uses the configuration passed at client creation to manage the authentication.
from braincube_connector import client
client.get_instance(config_dict={...})
json_result = client.request_ws("braincube/demo/braindata/mb20/simple")
Most braincube requests return JSON, but for a few of them it might be better to deactivate the parsing by setting the response_as_json parameter to False. In that case, request_ws returns the response object.
response = client.request_ws("braincube/demo/braindata/mb20/simple", response_as_json=False)
The library parameters can be set to custom values:
from braincube_connector import parameters
# Change the request pagination size to 10
parameters.set_parameter({"page_size": 10})
# Parse dates to datetime objects
parameters.set_parameter({"parse_date": True})
# The Braincube database stores multiple names (`tag`, `standard`, or `local`) for a variable
# By default `standard` is used, but you can change it as follows:
parameters.set_parameter({"VariableDescription_name_key": "tag"})