Official Code of NeurIPS2024 paper
Entity Alignment with Noisy Annotations from Large Language Models

The framework of LLM4EA.

Table of Content

Environment setup
Quick start
Ablations
Simulations
Customization
Bibtex
Acknowledgement
License

Environment setup

Step1. Install the required packages by running the following command:

pip install -r requirements.txt

Step2. Download the dataset from here and put it in the data folder.

Step3. Specify the gpt-api-key in the config.py file with your openai API key.

Quick start

LLM4EA: Run the following command to run llm4ea on D-W-15k dataset

python infer.py --dataset_name D-W-15K

Baseline: Run the following command to run dual-amn on D-W-15K dataset

python infer-baseline.py --dataset_name D-W-15K

Note: to facilatate reproducibility, we provide the annotated pseudo-labels generated during experiments, this quick start by default load the saved pseudo-labels. To run the actual experiment, please specify the argment --load_chk False in the command.

Ablations

There are three optional scripts: infer-baseline.py, infer-active-only.py, and infer-lr-only.py, which are variants of the infer.py script.

The infer-baseline.py script deactivates both the label refinement and active learning components of the framework, directly training the base EA model, Dual-AMN. This corresponds to the Dual-AMN baseline in the main table.
The infer-active-only.py script deactivates the label refinement component of the model. This corresponds to the w/o LR ablation setting in the paper.
The infer-lr-only.py script deactivates the active learning component of the model. This corresponds to the w/o Act ablation setting in the paper.

Simulations

If you have no access to an OpenAI API, you can run the simulation by running the following command:

python infer.py --dataset_name D-Y-15K --simulate --tpr 0.5

here, the arguement --tpr specifies the true positive rate for the synthesized pseudo-labels.

Customization

You may customize this framework to your dataset/task by revising the prompts. For instance, some dataset may not contain the entity names and rely on entity attributes to perform alignment, you may customize the self.messages and the self.choose function in annotator.py->Annotator.

Bibtex

If you find this work helpful, please cite our paper:

@inproceedings{
chen2024entity,
title={Entity Alignment with Noisy Annotations from Large Language Models},
author={Shengyuan Chen and Qinggang Zhang and Junnan Dong and Wen Hua and Qing Li and Xiao Huang},
booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
year={2024},
}

Acknowledgement

The code is based on PRASE and Dual-AMN, the dataset is from OpenEA benchmark, preprocessed by using the dump file wikidatawiki-20160801-abstract.xml from wikdata. The OpenEA dataset is licensed under the GPLv3 License.

License

This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.

Name		Name	Last commit message	Last commit date
Latest commit History 23 Commits
common		common
data		data
ea		ea
objects		objects
tools		tools
LICENSE.txt		LICENSE.txt
annotator.py		annotator.py
config.py		config.py
infer-active-only.py		infer-active-only.py
infer-baseline.py		infer-baseline.py
infer-lr-only.py		infer-lr-only.py
infer.py		infer.py
llm4ea.png		llm4ea.png
probabilisticReasoning.py		probabilisticReasoning.py
readme.md		readme.md
requirements.txt		requirements.txt
run.sh		run.sh
str_match_mrr.py		str_match_mrr.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Official Code of NeurIPS2024 paper
Entity Alignment with Noisy Annotations from Large Language Models

The framework of LLM4EA.

Table of Content

Environment setup

Quick start

Ablations

Simulations

Customization

Bibtex

Acknowledgement

License

About

Releases

Packages

Languages

License

chensyCN/llm4ea_official

Folders and files

Latest commit

History

Repository files navigation

Official Code of NeurIPS2024 paper Entity Alignment with Noisy Annotations from Large Language Models

The framework of LLM4EA.

Table of Content

Environment setup

Quick start

Ablations

Simulations

Customization

Bibtex

Acknowledgement

License

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Official Code of NeurIPS2024 paper
Entity Alignment with Noisy Annotations from Large Language Models

Packages