This repo contains the source code of the paper "Dependency-aware Self-training for Entity Alignment", which has been accepted at WSDM 2023.

Download the data from this Dropbox directory, decompress it, and put it under `STEA_code/` as shown in the folder structure below.
📌 The code has been tested. Feel free to create issues if you cannot run it successfully. Thanks!
```
STEA_code/
|- datasets/
|- OpenEA/
|- scripts/
|- stea/
|- Dual_AMN/
|- GCN-Align/
|- RREA/
|- environment.yml
|- README.md
```
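For example, assuming the downloaded archive is named `STEA_data.zip` (a hypothetical name; use the actual file name from Dropbox), it can be placed as follows:

```bash
# Hypothetical archive name -- substitute the actual file downloaded from Dropbox.
cd STEA_code/
unzip ~/Downloads/STEA_data.zip -d .   # should yield the datasets/ folder shown above
```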
After you run a script, the program will automatically create an `output/` folder, which stores the evaluation results.
The configurations of our devices are as follows:

- The experiments on the 15K datasets were run on a GPU server configured with an Intel(R) Xeon(R) Gold 6128 3.40GHz CPU, 128GB memory, 3 NVIDIA GeForce RTX 2080 Ti GPUs, and Ubuntu 20.04.
- The experiments on the 100K datasets were run on a computing cluster running CentOS 7.8.2003, which allocated us 200GB memory and 2 NVIDIA V100 SXM2 GPUs.

As a basic configuration, a 12GB GPU should suffice for the 15K datasets, and a 32GB GPU for the 100K datasets.
`cd` to the project directory first. Then, run the following command to install the major environment packages:

```bash
conda env create -f environment.yml
```

Activate the env via `conda activate stea`, and then install the package `graph-tool`:

```bash
conda install -c conda-forge graph-tool==2.29
```
(Installing this package can be slow, so be patient.)
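As an optional sanity check (not part of the original instructions), you can verify that `graph-tool` installed correctly:

```bash
# Should print the installed graph-tool version, e.g. 2.29.
python -c "import graph_tool; print(graph_tool.__version__)"
```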
With the environment installed above, you can run STEA for Dual-AMN, RREA, and GCN-Align.
If you also want to run STEA for AliNet, please also install the following packages with `pip`:

```bash
pip install igraph
pip install python-Levenshtein
pip install dataclasses
```
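Again as an optional check, the extra packages can be verified with the snippet below (note that `python-Levenshtein` is imported as `Levenshtein`):

```bash
# Both imports should succeed without errors.
python -c "import igraph, Levenshtein; print('AliNet dependencies OK')"
```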
Shell scripts with parameter settings are provided under the `scripts/` folder. Brief descriptions (an example invocation follows the list):

- `run_{Self-training_method}_w_{EA_Model}.sh`: Run a certain self-training method with a certain EA model. You can set the name of the dataset, the annotation amount, and other settings as you need.
- `run_analyze_paramK.sh`: Analyze the sensitivity to the hyperparameter `K`.
- `run_analyze_norm_minmax.sh`: Replace the softmax-based normalisation module with a MinMax scaler, for analyzing the necessity of our normalisation module.
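For instance, a hypothetical instantiation of the naming pattern above (the concrete script names under `scripts/` may differ) could be run as:

```bash
# "stea" and "RREA" are placeholders for a self-training method and an EA model.
cd scripts/
bash run_stea_w_RREA.sh
```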
For each task, the evaluation results, as well as some other outputs, can be found in a corresponding folder under the `output/` directory.
Note: AliNet runs much slower than the other EA models, so you may want to explore the self-training methods with the other EA models first.
We are happy to hear from you if you have any problems running our code, or find inconsistencies between your results and those reported in the paper.
Please cite this paper if you use the released code in your work.
```
@inproceedings{DBLP:conf/wsdm/0025LHZ23,
  author    = {Bing Liu and
               Tiancheng Lan and
               Wen Hua and
               Guido Zuccon},
  editor    = {Tat{-}Seng Chua and
               Hady W. Lauw and
               Luo Si and
               Evimaria Terzi and
               Panayiotis Tsaparas},
  title     = {Dependency-aware Self-training for Entity Alignment},
  booktitle = {Proceedings of the Sixteenth {ACM} International Conference on Web
               Search and Data Mining, {WSDM} 2023, Singapore, 27 February 2023 -
               3 March 2023},
  pages     = {796--804},
  publisher = {{ACM}},
  year      = {2023},
  url       = {https://doi.org/10.1145/3539597.3570370},
  doi       = {10.1145/3539597.3570370},
  timestamp = {Fri, 24 Feb 2023 13:56:00 +0100},
  biburl    = {https://dblp.org/rec/conf/wsdm/0025LHZ23.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
We used the source code of RREA, Dual-AMN, OpenEA, and GCN-Align.