This is a repository for Incremental Learning with Large Language Models.
- It supports both generative and discriminative models from the transformers library.
- It supports using accelerate for distributed data parallelism and model parallelism.
- It supports using wandb for logging.
- Instance-Incremental Learning
- Class-Incremental Learning
- Task-Incremental Learning
- Continual Instruction Tuning (Coming soon!)
- Continual Knowledge Editing (Coming soon!)
- Text Classification
- Intent Classification
- Relation Extraction
- Named Entity Recognition
More baselines will be released in the future!
- SEQ (Sequential Finetuning)
- ExperienceReplay
- PEFT (including LoRA and PromptTuning)
- LAMOL (ICLR 2020)
- LAMOL_KD (arXiv)
- L2KD (EMNLP 2020)
- AdapterCL (EMNLP 2021)
- PCLL (EMNLP 2022)
- LFPT5 (ICLR 2022)
- ProgPrompt (ICLR 2023)
- SEQ* (ACL 2024)
- ExtendNER (AAAI 2021)
- SelfTrain (EMNLP 2022)
- CFNER (EMNLP 2022)
- SpanKL (AAAI 2023)
- DLD (SIGIR 2023)
- RDP (CIKM 2023)
- CPFD (EMNLP 2023)
- OCILNER (ACL 2023)
- ICE (ACL 2023 findings)
- IS3 (ACL 2024 findings)
- Concept-1K (The raw and the preprocessed Concept-1K are included in dataset/concept_1k, dataset/concept_1k_task10, dataset/concept_1k_task1).
- Topic3datasets (agnews, dbpedia, yahoo)
- CLINC150
- Banking77
- FewRel
- TACRED
- Few-NERD
- Ontonotes5
- I2B2
The config file of SEQ (sequential fine-tuning) can be found in SEQ_full.yaml in the config directory.
The config file of SEQ* can be found in SEQ_pre_warm_fix.yaml in the same directory.
Note that the classifier type (linear or cosine linear) is not specified in the config files because we set it in the script. An example can be found at https://github.com/zzz47zzz/codebase-for-incremental-learning-with-llm/blob/main/reproduce_shell/exp-CIL-sota/SOTA-CIL-Intent-discriminative-banking77_task7.sh
.
├── main_CL.py # This is the Python file to execute for running all experiments
├── utils # This folder contains all basic files for incremental learning
│ ├── backbone.py # This file loads backbone models from the transformers library
│ ├── buffer.py # This file defines the replay buffer
│ ├── classifier.py # This file loads Linear/CosineLinear classifiers
│ ├── wrapmodel.py # This file wraps the model for using DeepSpeed with accelerate
│ ├── dataformat_preprocess.py # This file preprocesses the raw datasets into continual learning datasets
│ ├── dataloader.py # This file prepares the input for language models
│ ├── dataset.py # This file defines the format for different datasets for continual learning
│ ├── download_backbones.py # This file downloads models in advance to avoid network problems
│ ├── evaluation.py # This file defines the evaluation process for various tasks
│ ├── factory.py # This file loads the various models from the ./models folder
│ ├── logger.py # This file defines the logger
│ ├── metric.py # This file defines the evaluation metric for continual learning
│ ├── optimizer.py # This file defines the optimizer for different models
│ ├── prompt.py # This file defines the prompt used for different tasks
│ ├── probing.py # This file computes the probing performance
│ └── config.py # This file defines general parameters and settings for the experiments
├── config # This folder contains the hyper-parameters for each method on each dataset
├── dataset # This folder contains datasets for continual learning
├── models # This folder contains models for continual learning
└── experiments # This folder contains log data for each run
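As described above, utils/factory.py resolves a method name to a model class under the ./models folder. A minimal, hypothetical sketch of this dynamic-loading pattern is shown below (the repository's actual implementation may differ; a stdlib class is resolved here so the sketch runs anywhere):

```python
import importlib

# Hypothetical sketch of a model factory like utils/factory.py: a method name
# maps to a module, and the class is resolved dynamically at runtime.
def get_class(module_name, class_name):
    module = importlib.import_module(module_name)
    return getattr(module, class_name)

# Resolving a stdlib class the same way a factory would resolve a model class.
OrderedDict = get_class("collections", "OrderedDict")
print(OrderedDict([("a", 1)])["a"])  # prints 1
```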
pip install -r requirement.txt
Check the support_dataset_list in utils/dataformat_preprocess.py and select the dataset you want to experiment with.
Then, download the raw dataset to the folder dataset/{dataset-name}. For example, download clinc150 to the folder dataset/clinc150. The raw datasets can be downloaded here. We note that the raw data of Concept-1K is in dataset/concept_1k. The preprocessed Concept-1K for 10-step incremental learning is in dataset/concept_1k_task10. The whole Concept-1K is in dataset/concept_1k_task1.
Next, execute preprocess_dataset.sh. It will automatically preprocess the default datasets for reproducing results ('topic3datasets', 'clinc150', 'banking77', 'fewrel', 'tacred', 'conll2003', 'fewnerd', 'i2b2', 'ontonotes5') and create new folders in dataset/{dataset-for-continual-learning-name} automatically (e.g., banking77_task7). If you do not need to customize the datasets, you can skip to Step 3.
To customize the datasets, you can run utils/dataformat_preprocess.py with your own parameters (e.g., random seed, number of tasks). This process will create a new target folder dataset/{dataset-for-continual-learning-name}. In the target folder, two JSON files, continual_data.json and continual_config.json, will be saved. For example, you can prepare the clinc150 and fewrel datasets by running
python utils/dataformat_preprocess.py --dataset clinc150 --seed 1
and
python utils/dataformat_preprocess.py --dataset fewrel --seed 1
The program will create target folders dataset/clinc150_task15 and dataset/fewrel_task8.
For NER datasets such as ontonotes5, you can run the following command:
python utils/dataformat_preprocess.py --dataset ontonotes5 --seed 1 --base_task_entity 8 --incremental_task_entity 2 --seen_all_labels False
The program will create a target folder dataset/ontonotes5_task6_base8_inc2. We note that fixing the random seed ensures that exactly the same datasets are generated on different devices. Finally, the preprocessed datasets clinc150_task15, fewrel_task8, and ontonotes5_task6_base8_inc2 are ready for continual learning!
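The reason a fixed seed yields identical datasets everywhere: the label order is shuffled with a seeded RNG before being chunked into tasks, so the same seed produces the same partition on any machine. A minimal sketch of this idea (hypothetical function and parameter names, not the repository's actual code):

```python
import random

def split_tasks(labels, num_tasks, seed):
    # Hypothetical sketch: shuffle the label order deterministically with a
    # seeded RNG, then chunk the shuffled labels into equal-sized tasks.
    rng = random.Random(seed)
    shuffled = labels[:]
    rng.shuffle(shuffled)
    size = len(shuffled) // num_tasks
    return [shuffled[i * size:(i + 1) * size] for i in range(num_tasks)]

# The same seed reproduces exactly the same task splits, run after run.
a = split_tasks(list(range(150)), num_tasks=15, seed=1)
b = split_tasks(list(range(150)), num_tasks=15, seed=1)
print(a == b)  # prints True
```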
The YAML file contains the hyper-parameters for each method. For example, the hyper-parameters of SEQ* with and without pre-allocating future classifiers for generative backbones under the CIL setting are defined in config/CIL/generative_backbones/clinc150_task15/SEQ_pre_warm_fix.yaml and config/CIL/generative_backbones/clinc150_task15/SEQ_warm_fix.yaml respectively.
The scripts for reproducing the probing study are in the folder reproduce_shell/exp-probing.
The scripts for reproducing the probing study with different pre-training steps are in the folder reproduce_shell/exp-probing-pretraining.
The scripts for reproducing the experiments of comparing SEQ* with SOTA methods are in the folder reproduce_shell/exp-sota.
If you want to run an experiment, execute main_CL.py. For example, you can run the SEQ method on the clinc150_task15 dataset with bert-base-cased using the following command:
python main_CL.py --exp_prefix {your-experiment-name} --cfg './config/clinc150_task15/SEQ_full.yaml' --backbone bert-base-cased --classifier Linear --training_epochs 5
If you want to use wandb for logging (see here for more help):
python main_CL.py --is_wandb True --wandb_project {your-project-name} --wandb_entity {your-entity-name} --exp_prefix {your-experiment-name} --cfg './config/clinc150_task15/SEQ_full.yaml' --backbone bert-base-cased --classifier Linear --training_epochs 5
If you want to use accelerate for data/model parallel (see here for more help):
accelerate launch --config_file {your-accelerate-config-file} main_CL.py --is_wandb True --wandb_project {your-project-name} --wandb_entity {your-entity-name} --exp_prefix {your-experiment-name} --cfg './config/clinc150_task15/SEQ_full.yaml' --backbone bert-base-cased --classifier Linear --training_epochs 5
Please refer to utils/config.py for more general parameters and models/{model-name}.py for more model-specific parameters.
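In the commands above, flags such as --backbone and --classifier override the defaults loaded from the YAML config. A hypothetical, stdlib-only sketch of this override pattern (the repository's real option handling lives in utils/config.py and may differ):

```python
import argparse

# Hypothetical defaults, standing in for values loaded from a YAML config file.
defaults = {"backbone": "bert-base-cased", "classifier": "Linear", "training_epochs": 3}

# Register one flag per config key; any flag given on the command line
# overrides the config-file default.
parser = argparse.ArgumentParser()
for key, value in defaults.items():
    parser.add_argument(f"--{key}", type=type(value), default=value)

args = parser.parse_args(["--training_epochs", "5"])
print(args.backbone, args.classifier, args.training_epochs)  # prints: bert-base-cased Linear 5
```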
The results on the CIL and TIL scenarios.
If you have questions about this repository, please feel free to contact me at [email protected].
If you find this repository useful, please consider citing our paper.
@misc{zheng2023learn,
title={Learn or Recall? Revisiting Incremental Learning with Pre-trained Language Models},
author={Junhao Zheng and Shengjie Qiu and Qianli Ma},
year={2023},
eprint={2312.07887},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
@article{qiu2024incremental,
title={Incremental Sequence Labeling: A Tale of Two Shifts},
author={Qiu, Shengjie and Zheng, Junhao and Liu, Zhen and Luo, Yicheng and Ma, Qianli},
journal={arXiv preprint arXiv:2402.10447},
year={2024}
}
@misc{zheng2024concept1k,
title={Concept-1K: A Novel Benchmark for Instance Incremental Learning},
author={Junhao Zheng and Shengjie Qiu and Qianli Ma},
year={2024},
eprint={2402.08526},
archivePrefix={arXiv},
primaryClass={cs.LG}
}