New Diff Model for White-box Model Validation with Statistical Model Checking and Process Mining

Overview

diff_heu_heu.py is a Python command-line application that applies the Heuristics Miner algorithm to two event logs, builds a diff model from the resulting "old" and "new" process models, and generates visualizations highlighting their differences. It is intended for comparing process models mined from simulated event logs.

Figure: Illustrative example of the new diff model.

Definition

Consider two versions of a model, referred to as the 1st model and the 2nd model, each obtained by simulating potentially distinct variants of a formal specification. Unlike the original diff model, which relied on automatically generated graphical representations of the procedural part of the model, the new diff model is derived directly from the simulated event logs of the two model variants under comparison.

Let:

L1 be the event log obtained from simulating the 1st model.
L2 be the event log obtained from simulating the 2nd model.

Each log L_i contains sequences of events, with each event including at least a case ID, a timestamp, and an activity name. If the underlying formalism distinguishes between states and activities, a preprocessing step merges states and transitions into a unified set of activities for process discovery. Otherwise, if only activities are available, no such merging is required.
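The minimal log structure described above can be sketched as follows. This is an illustration only, not the preprocessing performed by diff_heu_heu.py: the column names case_id, timestamp, and activity, the numeric timestamps, and the helper traces_from_csv are all assumptions made for the example.

```python
import csv
import io
from collections import defaultdict


def traces_from_csv(handle):
    """Group events by case ID and order each case by timestamp.

    Assumes (hypothetically) columns named case_id, timestamp, activity,
    with numeric timestamps such as simulation time.
    """
    events = defaultdict(list)
    for row in csv.DictReader(handle):
        events[row["case_id"]].append((float(row["timestamp"]), row["activity"]))
    # A stable sort on the timestamp keeps the file order for simultaneous events.
    return {
        case: [activity for _, activity in sorted(evts, key=lambda e: e[0])]
        for case, evts in events.items()
    }


# Tiny in-memory log; a real log would be opened from a CSV file instead.
sample = io.StringIO(
    "case_id,timestamp,activity\n"
    "c1,2.0,b\n"
    "c1,1.0,a\n"
    "c2,1.0,a\n"
)
traces = traces_from_csv(sample)
print(traces)
```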

We apply the Heuristics Miner (HM) algorithm to each log separately, obtaining two Heuristics Nets (HNs):

H1 = (N1, E1, freq1) from L1
H2 = (N2, E2, freq2) from L2

Here:

N_i is a set of nodes representing discovered activities, as well as special start and end nodes.
E_i is a set of edges capturing directly-follows relationships among nodes in N_i.
freq_i assigns frequencies to each edge in E_i.
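To make the triple (N_i, E_i, freq_i) concrete, the sketch below counts directly-follows pairs in a toy log. It is a simplification: the Heuristics Miner additionally computes dependency measures and filters edges by thresholds, which this example does not do. The function name directly_follows and the dictionary-based log format are assumptions for illustration.

```python
from collections import Counter


def directly_follows(log):
    """Count directly-follows pairs in a log given as {case_id: [activity, ...]}.

    Each trace is assumed to be ordered by timestamp already. Artificial
    "start" and "end" nodes are added around every trace.
    """
    nodes, freq = set(), Counter()
    for trace in log.values():
        path = ["start", *trace, "end"]
        nodes.update(path)
        freq.update(zip(path, path[1:]))
    return nodes, set(freq), freq


log = {"c1": ["a", "b", "c"], "c2": ["a", "c"]}
nodes, edges, freq = directly_follows(log)
print(freq[("start", "a")], freq[("a", "b")], freq[("a", "c")])
```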

The new diff model D is defined as:

D = (N_D, E_D, lN, lE)

Where:

N_D = N1 ∪ N2 (the union of the nodes from both HNs).
E_D = E1 ∪ E2 (the union of all edges).
lN : N_D → { common, 1st-only, 2nd-only } labels each node based on whether it appears in both models (common), only in the 1st model (1st-only), or only in the 2nd model (2nd-only).
lE : E_D → { common, 1st-only, 2nd-only } labels each edge similarly.
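The definition of D can be written down directly with Python sets. The following is a minimal sketch under the assumption that each Heuristics Net is available as a set of node names and a set of (source, target) edge pairs; the function names are hypothetical and do not come from diff_heu_heu.py.

```python
def membership_label(in_first, in_second):
    """Map membership in the two models to a diff label."""
    if in_first and in_second:
        return "common"
    return "1st-only" if in_first else "2nd-only"


def build_diff_model(n1, e1, n2, e2):
    """Return (N_D, E_D, lN, lE) for two nets given as node and edge sets."""
    n_d = n1 | n2
    e_d = e1 | e2
    l_n = {n: membership_label(n in n1, n in n2) for n in n_d}
    l_e = {e: membership_label(e in e1, e in e2) for e in e_d}
    return n_d, e_d, l_n, l_e


# Toy example: the 2nd model replaces activity "b" with "d".
n1 = {"start", "a", "b", "end"}
e1 = {("start", "a"), ("a", "b"), ("b", "end")}
n2 = {"start", "a", "d", "end"}
e2 = {("start", "a"), ("a", "d"), ("d", "end")}

n_d, e_d, l_n, l_e = build_diff_model(n1, e1, n2, e2)
print(l_n["a"], l_n["b"], l_n["d"])  # common 1st-only 2nd-only
```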

In the new diff model:

Black nodes and edges (common) appear in both H1 and H2.
Red nodes and edges (1st-only) represent behaviors and transitions present only in the 1st model.
Blue nodes and edges (2nd-only) indicate behaviors and transitions introduced in or unique to the 2nd model.

This new diff model allows direct comparison of two mined models without relying on a known procedural representation.
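The colour convention can be illustrated by emitting Graphviz DOT text from the two labelling functions. This is a sketch only and not the rendering code used by diff_heu_heu.py; the helper to_dot and the toy labels are assumptions.

```python
COLORS = {"common": "black", "1st-only": "red", "2nd-only": "blue"}


def to_dot(l_n, l_e):
    """Render node and edge labels as a DOT digraph with diff colours."""
    lines = ["digraph diff {"]
    for node in sorted(l_n):
        lines.append(f'  "{node}" [color={COLORS[l_n[node]]}];')
    for src, dst in sorted(l_e):
        lines.append(f'  "{src}" -> "{dst}" [color={COLORS[l_e[(src, dst)]]}];')
    lines.append("}")
    return "\n".join(lines)


# Toy labels: "b" exists only in the 1st model, "d" only in the 2nd.
l_n = {"a": "common", "b": "1st-only", "d": "2nd-only"}
l_e = {("a", "b"): "1st-only", ("a", "d"): "2nd-only"}

dot_source = to_dot(l_n, l_e)
print(dot_source)
```

Saving the printed text to a file such as diff.dot and running `dot -Tpdf diff.dot -o diff.pdf` would produce a PDF with the same colouring.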

Features

Pre-processing Logs: Cleans and formats event logs from CSV files.
Heuristics Miner: Applies the Heuristics Miner algorithm to discover process models.
Difference Analysis: Compares old and new process models to identify differences.
Visualization: Generates PDF graphs highlighting the differences between models.

Experiments Paper:

Roberto Casaluce, Max Tschaikowski, Andrea Vandin: White-Box Validation of Collective Adaptive Systems by Statistical Model Checking and Process Mining. ISoLA (1) 2024: 204-222

Prerequisites

Python: Ensure you have Python 3.6 or higher installed. You can download Python from python.org.
Graphviz: This tool requires Graphviz to generate visualizations.

Installation

1. Clone the Repository

First, clone this repository or download the diff_heu_heu.py script to your local machine.

git clone https://github.com/rcasaluce/diff_heu_heu.git
cd diff_heu_heu

2. Set Up a Virtual Environment

It's recommended to use a virtual environment to manage dependencies. Below are instructions for creating and activating a virtual environment on different operating systems.

Windows

Open Command Prompt:

Press Win + R, type cmd, and press Enter.
Navigate to the Project Directory:
```
cd path\to\diff_heu_heu
```
Create a Virtual Environment:
```
python -m venv venv
```
Activate the Virtual Environment:
```
venv\Scripts\activate
```

macOS and Linux

Open Terminal.
Navigate to the Project Directory:
```
cd path/to/diff_heu_heu
```
Create a Virtual Environment:
```
python3 -m venv venv
```
Activate the Virtual Environment:
```
source venv/bin/activate
```

3. Install Python Dependencies

With the virtual environment activated, install the required Python libraries using pip.

pip install -r requirements.txt

Alternatively, if a requirements.txt file is not provided, install the dependencies manually:

pip install pm4py pandas numpy graphviz pydotplus pygraphviz

Note: If you encounter issues installing pygraphviz, ensure that Graphviz is properly installed on your system and that the Graphviz binaries are accessible via your system's PATH.

4. Install Graphviz

Graphviz is required for generating the visualization PDFs.

Windows

Download Graphviz:

Download the Graphviz installer from the Graphviz Download Page.
Install Graphviz:

Run the installer and follow the on-screen instructions.
Add Graphviz to PATH:
- Open the Start Menu, search for "Environment Variables," and select "Edit the system environment variables."
- Click on "Environment Variables."
- Under "System variables," find and select the Path variable, then click "Edit."
- Click "New" and add the path to the Graphviz bin directory (e.g., C:\Program Files\Graphviz\bin).
- Click "OK" to save changes.
Verify Installation:

Open Command Prompt and run:
```
dot -V
```
You should see the Graphviz version information.

macOS

Using Homebrew:

If you have Homebrew installed, you can install Graphviz with:
```
brew install graphviz
```
Verify Installation:

Open Terminal and run:
```
dot -V
```
You should see the Graphviz version information.

Linux

Using APT (Debian/Ubuntu):

sudo apt-get update
sudo apt-get install graphviz

Using YUM (CentOS/RHEL):
```
sudo yum install graphviz
```
Verify Installation:

Open Terminal and run:
```
dot -V
```
You should see the Graphviz version information.
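The same check can be done from Python, which may help when debugging PATH problems inside the virtual environment. This is an optional sketch using only the standard library; the helper name graphviz_version is hypothetical.

```python
import shutil
import subprocess


def graphviz_version():
    """Return the `dot -V` banner, or None if `dot` is not on PATH."""
    dot = shutil.which("dot")
    if dot is None:
        return None
    result = subprocess.run(
        [dot, "-V"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    # The banner is read from stderr, falling back to stdout if stderr is empty.
    return (result.stderr or result.stdout).strip()


version = graphviz_version()
print(version or "Graphviz 'dot' not found on PATH")
```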

Usage

Command-Line Arguments

The script accepts the following command-line arguments:

--file_path_old: (Required) Path to the CSV event log of the first (old) model, e.g. first_model.csv.
--file_path_new: (Required) Path to the CSV event log of the second (new) model, e.g. second_model.csv.
--output_full: (Optional) Filename, without the .pdf extension, for the complete differences PDF. Default: complete_differences.
--output_filtered_full: (Optional) Filename, without the .pdf extension, for the filtered complete differences PDF. Default: filtered_differences.
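For orientation, an interface like the one listed above could be declared with argparse as sketched below. The flag names and defaults are taken from this README; the help strings and the structure are assumptions and may differ from the actual code in diff_heu_heu.py.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented command-line flags."""
    parser = argparse.ArgumentParser(
        description="Compare two process models mined from event logs."
    )
    parser.add_argument("--file_path_old", required=True,
                        help="CSV event log of the first (old) model")
    parser.add_argument("--file_path_new", required=True,
                        help="CSV event log of the second (new) model")
    parser.add_argument("--output_full", default="complete_differences",
                        help="output name for the complete differences PDF")
    parser.add_argument("--output_filtered_full", default="filtered_differences",
                        help="output name for the filtered differences PDF")
    return parser


# Parse an explicit argument list so the example runs without a shell.
args = build_parser().parse_args(
    ["--file_path_old", "first_model.csv", "--file_path_new", "second_model.csv"]
)
print(args.output_full, args.output_filtered_full)
```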

Running the Script

Ensure that your virtual environment is activated and that all dependencies are installed.

Basic Usage

python diff_heu_heu.py \
    --file_path_old "path/to/first_model.csv" \
    --file_path_new "path/to/second_model.csv"

Specifying Output Filenames

python diff_heu_heu.py \
    --file_path_old "path/to/first_model.csv" \
    --file_path_new "path/to/second_model.csv" \
    --output_full "complete_differences" \
    --output_filtered_full "filtered_differences"

This will generate:

complete_differences.pdf
filtered_differences.pdf

Run Experiments

Paper: Roberto Casaluce, Max Tschaikowski, Andrea Vandin: White-Box Validation of Collective Adaptive Systems by Statistical Model Checking and Process Mining. ISoLA (1) 2024: 204-222

Assuming your CSV files are located in ./logs/, run:

python diff_heu_heu.py \
    --file_path_old "./logs/robot_main_first.csv" \
    --file_path_new "./logs/robot_main_second.csv" \
    --output_full "complete_differences" \
    --output_filtered_full "filtered_differences"

Output

After execution, the script will generate the following PDF files in the current directory (or in the specified output path):

complete_differences.pdf: Visualizes the complete differences between the old and new process models.
filtered_differences.pdf: Visualizes the filtered differences.

Please cite this work using:


@inproceedings{DBLP:conf/isola/CasaluceTV24,
  author       = {Roberto Casaluce and
                  Max Tschaikowski and
                  Andrea Vandin},
  editor       = {Tiziana Margaria and
                  Bernhard Steffen},
  title        = {White-Box Validation of Collective Adaptive Systems by Statistical
                  Model Checking and Process Mining},
  booktitle    = {Leveraging Applications of Formal Methods, Verification and Validation.
                  REoCAS Colloquium in Honor of Rocco De Nicola - 12th International
                  Symposium, ISoLA 2024, Crete, Greece, October 27-31, 2024, Proceedings,
                  Part {I}},
  series       = {Lecture Notes in Computer Science},
  volume       = {15219},
  pages        = {204--222},
  publisher    = {Springer},
  year         = {2024},
  url          = {https://doi.org/10.1007/978-3-031-73709-1\_13},
  doi          = {10.1007/978-3-031-73709-1\_13},
  timestamp    = {Tue, 22 Oct 2024 21:07:33 +0200},
  biburl       = {https://dblp.org/rec/conf/isola/CasaluceTV24.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}

License

This project is licensed under Apache License 2.0. See the LICENSE file for details.
