VFLBench

Repository containing scripts supporting the manuscript "VFLBench: A Practical Benchmark for Vertical Federated Learning in Smart Manufacturing", submitted to the 19th International Conference on Computer Aided Systems Theory (Eurocast 2024).

Datasets

Hydraulic System I (HySys I)

This dataset was obtained experimentally using a hydraulic test rig, which consists of two circuits interconnected via an oil tank. The system cyclically repeats constant load cycles, during which various process values are measured. The aim is to develop a regression model for predicting valve conditions. To replicate a vertical setup, the features are divided into two blocks based on the rig's configuration, with each block assigned to a different data holder. After the split, feature reduction was applied separately to each data holder's private data.
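The vertical split described above can be sketched in a few lines of Python. This is only an illustration, not the repository's preprocessing code: the column names, the `cycle_id` key, and the holder names are hypothetical and chosen for readability.

```python
from typing import Dict, List

Row = Dict[str, float]


def vertical_split(rows: List[Row], blocks: Dict[str, List[str]], id_key: str) -> Dict[str, List[Row]]:
    """Give each data holder only its own feature columns plus the shared sample ID."""
    parts: Dict[str, List[Row]] = {holder: [] for holder in blocks}
    for row in rows:
        for holder, columns in blocks.items():
            part = {id_key: row[id_key]}
            part.update({column: row[column] for column in columns})
            parts[holder].append(part)
    return parts


# Hypothetical measurements: two load cycles, four made-up sensor columns.
rows = [
    {"cycle_id": 0, "pressure_1": 151.2, "flow_1": 8.9, "temp_2": 35.6, "vibration_2": 0.53},
    {"cycle_id": 1, "pressure_1": 150.8, "flow_1": 8.7, "temp_2": 36.1, "vibration_2": 0.55},
]

# Hypothetical assignment of columns to the two data holders.
blocks = {
    "holder_a": ["pressure_1", "flow_1"],
    "holder_b": ["temp_2", "vibration_2"],
}

parts = vertical_split(rows, blocks, id_key="cycle_id")
print(parts["holder_a"][0])
print(parts["holder_b"][0])
```

Both holders keep the same sample identifier, so their rows can be aligned during training, but neither sees the other's feature columns.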

Source: UC Irvine Machine Learning Repository

Hydraulic System II (HySys II)

Derived from the same source as HySys I, this dataset uses the same feature split. The task differs, however: the goal is to predict the binary stable flag.

Source: UC Irvine Machine Learning Repository

Steel Fatigue Strength (SFS)

This dataset includes various experimental conditions during steel preparation, such as chemical composition, upstream processing details, and heat treatment. The target variable is fatigue strength. Features are vertically divided into two blocks and allocated to two hypothetical data holders: the first block contains chemical composition and upstream details, while the second block includes heat treatment information.

Source: Kaggle

Simulated Multistage Process (SMP)

This synthetic dataset is generated using a multistage process simulator. It emulates a three-stage process, assuming that three distinct manufacturing companies control and possess data from each specific stage. The primary goal of the data federation is to construct a predictive model for the output quality of the final stage.

Vertical Federated Learning algorithms

Privacy-preserving Partial Least Squares (P3LS)

A federated version of Partial Least Squares (PLS), which is a technique commonly used for monitoring and controlling manufacturing processes. P3LS involves a PLS algorithm based on singular value decomposition (SVD) and incorporates removable, randomly generated masks provided by a trusted authority to protect each data holder's private information.
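To give some intuition for why a random mask need not disturb an SVD-based PLS computation, the sketch below uses one well-known property: if every party multiplies its data by the same orthogonal matrix over the sample dimension, the cross-product X^T y is unchanged. This is a generic illustration and not the P3LS protocol itself, whose masking scheme may differ; the Householder construction, the toy data, and all names are assumptions.

```python
import math
import random


def householder(v):
    """Return the orthogonal matrix Q = I - 2 v v^T / (v^T v)."""
    n = len(v)
    norm_sq = sum(x * x for x in v)
    return [
        [(1.0 if i == j else 0.0) - 2.0 * v[i] * v[j] / norm_sq for j in range(n)]
        for i in range(n)
    ]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(a):
    return [list(column) for column in zip(*a)]


rng = random.Random(0)
n_samples = 4

# A trusted authority (role name taken from the text) draws one orthogonal mask.
mask = householder([rng.gauss(0.0, 1.0) for _ in range(n_samples)])

x_a = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0], [-2.0, 1.5]]  # toy features of one holder
y = [[1.0], [0.0], [2.0], [-1.0]]  # toy response held by another party

masked_x = matmul(mask, x_a)
masked_y = matmul(mask, y)

plain_cross = matmul(transpose(x_a), y)
masked_cross = matmul(transpose(masked_x), masked_y)

print(plain_cross)
print(masked_cross)
```

The two printed cross-products agree up to floating-point error, although the masked rows no longer resemble the original samples.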

Privacy-preserving Symbolic Regression (PPSR)

A privacy-preserving variant of Symbolic Regression (SR). PPSR employs Secure Multiparty Computation to allow parties to collaboratively build SR models in a vertical scenario without disclosing private data.
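The text does not specify which Secure Multiparty Computation primitives PPSR uses. As a minimal sketch of the general idea, the example below shows additive secret sharing, a common building block, in which parties compute a sum without any of them seeing another party's input. The field modulus, function names, and inputs are assumptions for illustration.

```python
import secrets

PRIME = 2**61 - 1  # field modulus (a Mersenne prime), chosen for illustration


def share(value: int, n_parties: int) -> list:
    """Split a value into n random shares that sum to it modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def reconstruct(shares: list) -> int:
    """Recombine shares into the underlying value."""
    return sum(shares) % PRIME


# Hypothetical private inputs, one per party.
private_inputs = [12, 30, 7]
n_parties = len(private_inputs)

# Each party splits its input and sends one share to every party.
all_shares = [share(value, n_parties) for value in private_inputs]

# Party j adds up the shares it received; a single share reveals nothing on its own.
local_sums = [sum(shares[j] for shares in all_shares) % PRIME for j in range(n_parties)]

# Only the combination of all local sums reveals the total.
total = reconstruct(local_sums)
print(total)
```

Because each individual share is uniformly random, a party learns the final sum but nothing about the other parties' inputs beyond what the sum itself implies.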

SecureBoost

A federated learning algorithm that extends the gradient boosting framework, specifically XGBoost, to enable collaborative model training across multiple parties without sharing raw data. It ensures data confidentiality by performing secure aggregation of local computations utilizing Homomorphic Encryption, thereby preventing sensitive information leakage.
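The aggregation step can be illustrated with a toy additively homomorphic scheme. The sketch below assumes a Paillier-style cryptosystem (the text only says "Homomorphic Encryption") with a deliberately tiny key and a fixed-point encoding for gradients; all names and parameters are assumptions, and the code is not suitable for real use.

```python
import math
import secrets

# Toy key built from two known Mersenne primes. Far too small to be secure.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N_SQ = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p - 1, q - 1)
MU = pow(LAM, -1, N)  # valid for the generator g = n + 1 (needs Python 3.8+)
SCALE = 10**6  # fixed-point scaling for float gradients


def encrypt(value: float) -> int:
    """Encrypt a float as a fixed-point integer under the public key N."""
    message = round(value * SCALE) % N
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (pow(N + 1, message, N_SQ) * pow(r, N, N_SQ)) % N_SQ


def add_encrypted(c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % N_SQ


def decrypt(ciphertext: int) -> float:
    """Decrypt with the private key and undo the fixed-point encoding."""
    message = ((pow(ciphertext, LAM, N_SQ) - 1) // N) * MU % N
    if message > N // 2:  # upper half of the range encodes negative values
        message -= N
    return message / SCALE


# One party encrypts per-sample gradients (hypothetical values).
gradients = [0.25, -0.5, 0.125]
ciphertexts = [encrypt(g) for g in gradients]

# Another party sums them without being able to read any single gradient.
aggregate = ciphertexts[0]
for ciphertext in ciphertexts[1:]:
    aggregate = add_encrypted(aggregate, ciphertext)

total = decrypt(aggregate)
print(total)
```

Only the key holder can decrypt the aggregate, so the party doing the summation works on ciphertexts throughout.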

Split Neural Network (SplitNN)

Split learning divides the network structure so that each party retains only a portion of it. These smaller structures combine to form a complete network model. During training, parties perform forward or backward calculations on their local structures and transfer the results to the next party. In this way, multiple data holders jointly train a model until it converges. Differential Privacy may additionally be applied during this process to strengthen privacy protection.
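The forward and backward exchange can be sketched with a deliberately small model: two data holders each own a linear bottom model over their feature block, and a label owner combines the received activations. This is a minimal pure-Python sketch, not the repository's SplitNN implementation; the class names, toy data, and hyperparameters are assumptions, and no Differential Privacy is applied.

```python
class BottomModel:
    """Linear model held by one data holder; it sees only its own feature block."""

    def __init__(self, n_features: int):
        self.weights = [0.0] * n_features
        self._last_input = None

    def forward(self, x):
        self._last_input = x
        return sum(w * xi for w, xi in zip(self.weights, x))

    def backward(self, grad_activation: float, lr: float):
        # Only the gradient of the activation is received, never the label.
        self.weights = [
            w - lr * grad_activation * xi for w, xi in zip(self.weights, self._last_input)
        ]


class TopModel:
    """Held by the label owner; sums the received activations and adds a bias."""

    def __init__(self):
        self.bias = 0.0

    def forward(self, activations):
        return sum(activations) + self.bias

    def backward(self, prediction: float, target: float, lr: float) -> float:
        grad = 2.0 * (prediction - target)  # derivative of the squared error
        self.bias -= lr * grad
        return grad  # same gradient for every activation, since they are summed


def mean_squared_error(party_a, party_b, top, data):
    errors = [
        (top.forward([party_a.forward(x_a), party_b.forward(x_b)]) - y) ** 2
        for x_a, x_b, y in data
    ]
    return sum(errors) / len(errors)


# Toy data: holder A has two features, holder B has one, the label owner has y.
features = [
    ([1.0, 0.0], [2.0]),
    ([0.0, 1.0], [1.0]),
    ([1.0, 1.0], [0.0]),
    ([2.0, -1.0], [1.0]),
    ([-1.0, 0.5], [-1.0]),
]
data = [(x_a, x_b, 1.0 * x_a[0] - 2.0 * x_a[1] + 0.5 * x_b[0] + 0.3) for x_a, x_b in features]

party_a, party_b, top = BottomModel(2), BottomModel(1), TopModel()
initial_loss = mean_squared_error(party_a, party_b, top, data)

lr = 0.05
for _ in range(200):
    for x_a, x_b, y in data:
        activations = [party_a.forward(x_a), party_b.forward(x_b)]  # sent to label owner
        prediction = top.forward(activations)
        grad = top.backward(prediction, y, lr)  # sent back to both holders
        party_a.backward(grad, lr)
        party_b.backward(grad, lr)

final_loss = mean_squared_error(party_a, party_b, top, data)
print(initial_loss, final_loss)
```

Raw features never leave a data holder: only scalar activations travel forward and only activation gradients travel back.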

VFLBench in action

Testing Environment

Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz (64 cores)
384 GB System Memory

Prerequisites

Install Docker
Create vflbench image using the Dockerfile: docker build . -t vflbench

Run the paper experiments

Open the command line
Move to /paper_experiments
Execute bash run_all.sh to test all methods, or execute the individual shell script (e.g., bash run_splitnn.sh) to test each method separately

Contribution guidance

See the contributing document.
