A machine learning flow is a graphical representation of how data is processed. You build it in the Flow Editor to prepare or shape data, train or deploy a model, or transform data and export it back to a database table or a file in object storage.
Technology: SPSS Modeler, feature selection, auto classifier, data audit, field operations, data visualization
- Sign up to IBM Data Science Experience (DSX): https://datascience.ibm.com/
- Sign in to DSX
- On DSX, create a new project
  - Click `Projects`, and select `View All Projects`
  - Click the `New` button to create a new project
  - On the `New Project` page, input `Bank Churn` as the project name
  - In the `Target container` field, input `churn` as the container name
  - Click the `Create` button
- Download dataset `bank-churn.csv` from GitHub
  - Use a new browser tab to access the dataset: https://github.com/mlhubca/lab/blob/master/bank-churn/bank-churn.csv
  - Right-click the `Raw` button on the toolbar, and select `Save Link As...` or `Save Content As...` (depending on your browser)
- Upload dataset `bank-churn.csv` to your project
  - On DSX, open your project
  - Click the `Add to project` dropdown and select `Data asset` from the dropdown menu
  - On your right-hand panel, select the `Load` tab
  - Drop the file `bank-churn.csv` into the box, or browse to `bank-churn.csv`, and add the file to the project
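
As an alternative to right-clicking the `Raw` button in the download step above, the file can be fetched with a script. The sketch below rewrites the GitHub page URL into its `raw.githubusercontent.com` form; the helper names `to_raw_url` and `download_dataset` are illustrative, and it assumes the file is still published at that location.

```python
from urllib.parse import urlsplit
from urllib.request import urlretrieve

# Page URL given in the lab instructions.
BLOB_URL = "https://github.com/mlhubca/lab/blob/master/bank-churn/bank-churn.csv"


def to_raw_url(blob_url: str) -> str:
    """Convert a github.com '/blob/' page URL to its raw.githubusercontent.com form."""
    parts = urlsplit(blob_url)
    owner, repo, marker, *rest = parts.path.lstrip("/").split("/")
    if parts.netloc != "github.com" or marker != "blob":
        raise ValueError(f"not a GitHub blob URL: {blob_url}")
    return "https://raw.githubusercontent.com/" + "/".join([owner, repo, *rest])


def download_dataset(dest: str = "bank-churn.csv") -> str:
    """Fetch the CSV to a local file (needs network access) and return its path."""
    path, _headers = urlretrieve(to_raw_url(BLOB_URL), dest)
    return path


print(to_raw_url(BLOB_URL))
```

Calling `download_dataset()` saves the file in the current directory, ready to be uploaded to the project as described above.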
Steps
- Add a new flow using the `New flow` button, or select "SPSS Modeler flow" from the "Add to project" dropdown
- On the Create Flow page,
  - Specify a name, e.g. `Bank Churn Flow`
  - Select the `IBM SPSS Modeler` runtime
  - Click "Create Flow"
- Drag and drop node `bank-churn.csv` from the Files list to the flow
- Click the Palette icon (first icon on the toolbar) to show the node palette
- Add a `Data Audit` node from the `Outputs` list on the palette
- Connect the file `bank-churn.csv` node to the `Data Audit` node
- Run the `Data Audit` node to generate output
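
To get a feel for what a data audit reports, here is a rough stand-alone sketch in plain Python. It is not the SPSS Modeler implementation: it only counts valid and missing values and computes basic statistics per column. The sample rows and the `AGE` and `BALANCE` columns are made up for illustration; only `CUST_ID` and `CHURN` are field names from this lab.

```python
import csv
import io
import statistics

# Hypothetical sample data; empty cells stand for missing values.
SAMPLE_CSV = """CUST_ID,AGE,BALANCE,CHURN
1001,34,1200.50,0
1002,51,,1
1003,29,310.00,0
1004,,8750.25,1
"""


def audit(rows):
    """Summarise each column: valid/missing counts plus min, max and mean.

    Assumes every column is numeric and has at least one non-empty value.
    """
    report = {}
    for name in rows[0].keys():
        values = [row[name] for row in rows]
        present = [float(v) for v in values if v != ""]
        report[name] = {
            "valid": len(present),
            "missing": len(values) - len(present),
            "min": min(present),
            "max": max(present),
            "mean": statistics.fmean(present),
        }
    return report


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for name, stats in audit(rows).items():
    print(name, stats)
```
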
- Add a `Filter` node from the `Field Operations` list on the palette
- Connect the file `bank-churn.csv` node to the `Filter` node
- Open the `Filter` node, and select columns `CUST_ID`, `TwitterID` and `CHURN_LABLE` (to be filtered out)
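
Outside the Flow Editor, the effect of the Filter step can be pictured as dropping three fields from every record. The sketch below is a minimal illustration; the sample records and the `AGE` field are hypothetical, while the dropped field names are copied as written in this lab.

```python
# Fields removed by the Filter step (spelled as in the lab text).
DROPPED_FIELDS = {"CUST_ID", "TwitterID", "CHURN_LABLE"}


def filter_fields(rows, dropped=DROPPED_FIELDS):
    """Return copies of the rows without the dropped fields."""
    return [{k: v for k, v in row.items() if k not in dropped} for row in rows]


# Hypothetical records; values are made up for illustration.
rows = [
    {"CUST_ID": "1001", "TwitterID": "a1", "AGE": "34", "CHURN": "0", "CHURN_LABLE": "No"},
    {"CUST_ID": "1002", "TwitterID": "b2", "AGE": "51", "CHURN": "1", "CHURN_LABLE": "Yes"},
]

filtered = filter_fields(rows)
print(list(filtered[0]))
```

Identifiers and the text label are removed because they either carry no predictive signal or duplicate the target, which is presumably why the lab filters them before modeling.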
- Add a `Type` node from the `Field Operations` list on the palette
- Connect the `Filter` node to the `Type` node
- Open the `Type` node, and add all columns to the Types list
- Locate the `CHURN` field, and
  - Change Measure from `Range` to `Flag`
  - Change Role from `Input` to `Target`
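
The Type settings above amount to per-field metadata: every field has a measure and a role, and only `CHURN` is changed. A minimal sketch of that idea follows; the `FieldType` class and `build_types` helper are hypothetical names, the `AGE` and `BALANCE` fields are invented, and the `Range`/`Input` defaults for the other fields are an assumption.

```python
from dataclasses import dataclass


@dataclass
class FieldType:
    measure: str = "Range"  # assumed default: continuous numeric field
    role: str = "Input"     # assumed default: predictor


def build_types(field_names, target="CHURN"):
    """Give every field the default type, then mark the target as a Flag."""
    types = {name: FieldType() for name in field_names}
    types[target] = FieldType(measure="Flag", role="Target")
    return types


types = build_types(["AGE", "BALANCE", "CHURN"])
inputs = [name for name, t in types.items() if t.role == "Input"]
print(types["CHURN"], inputs)
```
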
- Add a `Feature Selection` node from the `Modeling` list on the palette
- Connect the `Feature Selection` node to the `Type` node (note that the Feature Selection node is renamed to `CHURN`)
- Run node `CHURN` (Feature Selection). When the execution completes, a new model node `CHURN` is created
- Add a `Data Audit` node from the `Outputs` list on the palette
- Connect the new model node `CHURN` to the `Data Audit` node
- Run the `Data Audit` node to generate output
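
Feature selection ranks input fields by how useful they are for predicting the target. The sketch below uses the absolute Pearson correlation with the `CHURN` flag as a simple stand-in score; this is a swapped-in technique for illustration, not the measure SPSS Modeler computes. All column names except `CHURN` and all values are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def rank_features(columns, target):
    """Sort features by absolute correlation with the target, strongest first."""
    scores = {name: abs(pearson(vals, target)) for name, vals in columns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical data: AGE tracks churn closely, BALANCE barely, CONSTANT not at all.
columns = {
    "AGE": [25, 32, 47, 51, 38, 60],
    "BALANCE": [500, 300, 900, 100, 700, 400],
    "CONSTANT": [1, 1, 1, 1, 1, 1],
}
churn = [0, 0, 1, 1, 0, 1]

ranked = rank_features(columns, churn)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

A constant field scores zero here, which mirrors the general idea of screening out fields that cannot help the model.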
- Add a `Partition` node from the `Field Operations` list on the palette
- Connect the new model node `CHURN` to the `Partition` node
- Open the `Partition` node, and change the Training and Test partitions to a ratio of 80/20
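
An 80/20 partition simply sets aside one fifth of the records for testing. The following sketch shows one way to do this reproducibly with a seeded shuffle; the `partition` helper and the seed value are illustrative, and the Partition node may assign records differently.

```python
import random


def partition(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of the rows and split it into training and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


rows = list(range(100))  # stand-in for 100 customer records
train, test = partition(rows)
print(len(train), len(test))  # 80 20
```
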
- Add an `Auto Classifier` node from the `Modeling` list on the palette
- Connect the `Auto Classifier` node to the `Partition` node (note that the Auto Classifier node is automatically renamed to `CHURN`)
- Run node `CHURN` (Auto Classifier). When the execution completes, a new model node `CHURN` is created automatically.
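
The idea behind an automatic classifier is to train several candidate models and rank them by how well they predict held-out data. The toy sketch below does this with three deliberately simple candidates (a majority-class baseline and two one-field threshold rules); these are stand-ins chosen for brevity, not the algorithms SPSS Modeler trains. Field names other than `CHURN` and all values are hypothetical.

```python
from collections import Counter


def majority_model(train):
    """Always predict the most common label seen in training."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label


def stump_model(feature):
    """Build a trainer for a one-field rule split at the midpoint of the class means."""
    def fit(train):
        pos = [x[feature] for x, y in train if y == 1]
        neg = [x[feature] for x, y in train if y == 0]
        mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
        threshold = (mean_pos + mean_neg) / 2
        high_is_churn = mean_pos >= mean_neg
        return lambda x: int((x[feature] >= threshold) == high_is_churn)
    return fit


def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)


def auto_classify(candidates, train, test):
    """Train every candidate and rank by test accuracy, best first."""
    results = [(name, accuracy(fit(train), test)) for name, fit in candidates.items()]
    return sorted(results, key=lambda r: r[1], reverse=True)


# Hypothetical (features, CHURN) records.
train = [
    ({"AGE": 25, "BALANCE": 500}, 0), ({"AGE": 32, "BALANCE": 400}, 0),
    ({"AGE": 38, "BALANCE": 700}, 0), ({"AGE": 47, "BALANCE": 900}, 1),
    ({"AGE": 51, "BALANCE": 100}, 1),
]
test = [
    ({"AGE": 29, "BALANCE": 450}, 0), ({"AGE": 58, "BALANCE": 200}, 1),
    ({"AGE": 44, "BALANCE": 800}, 1), ({"AGE": 35, "BALANCE": 150}, 0),
]
candidates = {
    "majority": majority_model,
    "stump_age": stump_model("AGE"),
    "stump_balance": stump_model("BALANCE"),
}

ranking = auto_classify(candidates, train, test)
for name, acc in ranking:
    print(name, acc)
```
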
- Add an `Analysis` node from the `Outputs` list on the palette
- Connect the new model node `CHURN` to the `Analysis` node
- Run the `Analysis` node to generate output
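
An analysis of a classifier usually compares predicted values with actual ones. As a rough illustration of that comparison, the sketch below tallies a confusion matrix and the overall accuracy; the label lists are made up, and the real Analysis output may contain more or different detail.

```python
from collections import Counter


def analyze(actual, predicted):
    """Count (actual, predicted) pairs and compute overall accuracy."""
    matrix = Counter(zip(actual, predicted))
    correct = sum(n for (a, p), n in matrix.items() if a == p)
    return {"matrix": matrix, "accuracy": correct / len(actual)}


# Hypothetical CHURN labels: 1 = churned, 0 = stayed.
actual = [0, 0, 1, 1, 0, 1, 0, 0]
predicted = [0, 1, 1, 0, 0, 1, 0, 0]

result = analyze(actual, predicted)
print("accuracy:", result["accuracy"])
for (a, p), n in sorted(result["matrix"].items()):
    print(f"actual={a} predicted={p}: {n}")
```
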
- Run the whole flow by clicking the `Run` button on the toolbar