forked from jerryji1993/DNABERT
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
1 parent 3d3ada1, commit c9b6ccb
Showing 240 changed files with 51 additions and 10 deletions.
Binary file added (+2.65 KB): examples/data_process_template/__pycache__/process_pretrain_data.cpython-36.pyc
Binary file added (+2.65 KB): examples/data_process_template/__pycache__/process_pretrain_data.cpython-38.pyc
    @@ -8,3 +8,4 @@ statsmodels
     biopython
     pandas
     pybedtools
    +sentencepiece==0.1.91
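The hunk above adds `sentencepiece` with an exact version pin, while the other listed packages stay unpinned. As a minimal sketch (not part of the commit), the following Python splits requirement lines of the simple `name` or `name==version` form into pinned and unpinned entries. The helper name `split_pins` is hypothetical, and the parser deliberately ignores extras, environment markers and other specifier operators.

```python
def split_pins(lines):
    """Separate 'name==version' entries from bare package names."""
    pinned, unpinned = {}, []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if "==" in line:
            name, version = line.split("==", 1)
            pinned[name.strip()] = version.strip()
        else:
            unpinned.append(line)
    return pinned, unpinned


# The four requirement lines visible in the hunk.
requirements = ["biopython", "pandas", "pybedtools", "sentencepiece==0.1.91"]
pinned, unpinned = split_pins(requirements)
print(pinned)
print(unpinned)
```

A check like this only reports what the file declares; it says nothing about which versions are actually installed.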
    @@ -7,7 +7,7 @@ Author: Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Sam Shleifer,
     Author-email: [email protected]
     License: Apache
     Description: # DNABERT
    -This repository includes the implementation of 'DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome'. Please cite our paper if you use the models or codes. The repo is still under developing, so please kindly let us know if there is any issue.
    +This repository includes the implementation of 'DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome'. Please cite our paper if you use the models or codes. The repo is still actively under development, so please kindly report if there is any issue encountered.

     In this package, we provides resources including: source codes of the DNABERT model, usage examples, pre-trained models, fine-tuned models and visulization tool. This package is still under development, as more features will be included gradually. Training of DNABERT consists of general-purposed pre-training and task-specific fine-tuning. As a contribution of our project, we released the pre-trained models in this repository. We extended codes from [huggingface](https://github.com/huggingface/transformers) and adapted them to the DNA scenario.

    @@ -79,7 +79,7 @@ Description: # DNABERT
     export SOURCE=PATH_TO_DNABERT_REPO
     export OUTPUT_PATH=output$KMER

    -python run_pretraining.py \
    +python run_pretrain.py \
     --output_dir $OUTPUT_PATH \
     --model_type=dna \
     --tokenizer_name=dna$KMER \
    @@ -268,7 +268,7 @@ Description: # DNABERT
     --min_n_motif 3 \
     --align_all_ties \
     --save_file_dir $MOTIF_PATH \
    -    --verbose \
    +    --verbose
     ```

     The script will generate a .txt file and a weblogo .png file for each motif under `MOTIF_PATH`.
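The last hunk above drops the trailing backslash after `--verbose`. In a POSIX-style shell, a backslash at the end of a line joins that line with the next one, so a backslash on the final option leaves the command waiting for more input when it is pasted. The following Python is a simplified model of that joining rule, not a shell parser: it ignores quoting, and the function name `join_continuations` and the `<waiting for more input>` marker are illustrative assumptions.

```python
def join_continuations(text):
    """Merge lines ending in a backslash with the line that follows."""
    logical, pending = [], ""
    for line in text.splitlines():
        stripped = line.rstrip()
        if not stripped and not pending:
            continue  # skip blank lines between commands
        if stripped.endswith("\\"):
            # Continuation: drop the backslash and keep accumulating.
            pending += stripped[:-1].strip() + " "
        else:
            logical.append((pending + stripped.strip()).strip())
            pending = ""
    if pending:
        # A dangling backslash on the final line: the command is incomplete.
        logical.append(pending.strip() + " <waiting for more input>")
    return logical


# Tail of the command before and after the change in the hunk.
before = "--save_file_dir $MOTIF_PATH \\\n--verbose \\"
after = "--save_file_dir $MOTIF_PATH \\\n--verbose"

print(join_continuations(before))
print(join_continuations(after))
```

With the old text the model reports an unfinished command; with the new text the options form one complete logical line.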
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/configuration_albert.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/configuration_auto.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/configuration_bart.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/configuration_bert.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/configuration_camembert.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/configuration_ctrl.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/configuration_distilbert.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/configuration_flaubert.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/configuration_gpt2.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/configuration_mmbt.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/configuration_openai.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/configuration_roberta.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/configuration_t5.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/configuration_transfo_xl.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/configuration_utils.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/configuration_xlm.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/configuration_xlm_roberta.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/configuration_xlnet.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/modeling_albert.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/modeling_camembert.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/modeling_distilbert.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/modeling_encoder_decoder.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/modeling_flaubert.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/modeling_openai.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/modeling_roberta.cpython-37.pyc
Empty file modified (100644 → 100755): src/transformers/__pycache__/modeling_tf_albert.cpython-37.pyc
Empty file modified (100644 → 100755): src/transformers/__pycache__/modeling_tf_camembert.cpython-37.pyc
Empty file modified (100644 → 100755): src/transformers/__pycache__/modeling_tf_distilbert.cpython-37.pyc
Empty file modified (100644 → 100755): src/transformers/__pycache__/modeling_tf_openai.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/modeling_tf_pytorch_utils.cpython-37.pyc
Empty file modified (100644 → 100755): src/transformers/__pycache__/modeling_tf_roberta.cpython-37.pyc
Empty file modified (100644 → 100755): src/transformers/__pycache__/modeling_tf_transfo_xl.cpython-37.pyc
Empty file modified (100644 → 100755): src/transformers/__pycache__/modeling_tf_transfo_xl_utilities.cpython-37.pyc
Empty file modified (100644 → 100755): src/transformers/__pycache__/modeling_tf_xlm_roberta.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/modeling_transfo_xl.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/modeling_transfo_xl_utilities.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/modeling_xlm_roberta.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/tokenization_albert.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/tokenization_auto.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/tokenization_bart.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/tokenization_bert.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/tokenization_bert_japanese.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/tokenization_camembert.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/tokenization_ctrl.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/tokenization_distilbert.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/tokenization_dna.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/tokenization_flaubert.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/tokenization_gpt2.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/tokenization_openai.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/tokenization_roberta.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/tokenization_t5.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/tokenization_transfo_xl.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/tokenization_utils.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/tokenization_xlm.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/tokenization_xlm_roberta.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/__pycache__/tokenization_xlnet.cpython-37.pyc
Empty file modified (100644 → 100755): src/transformers/convert_albert_original_tf_checkpoint_to_pytorch.py
Empty file modified (100644 → 100755): src/transformers/convert_bart_original_pytorch_checkpoint_to_pytorch.py
Empty file modified (100644 → 100755): src/transformers/convert_bert_pytorch_checkpoint_to_original_tf.py
Empty file modified (100644 → 100755): src/transformers/convert_roberta_original_pytorch_checkpoint_to_pytorch.py
Binary file modified (+0 Bytes, 100%): src/transformers/data/metrics/__pycache__/__init__.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/data/processors/__pycache__/__init__.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/data/processors/__pycache__/glue.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/data/processors/__pycache__/squad.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/data/processors/__pycache__/utils.cpython-37.pyc
Binary file modified (+0 Bytes, 100%): src/transformers/data/processors/__pycache__/xnli.cpython-37.pyc
Empty file modified (100644 → 100755): src/transformers/dnabert-config/bert-config-3/special_tokens_map.json
Empty file modified (100644 → 100755): src/transformers/dnabert-config/bert-config-3/tokenizer_config.json
Empty file modified (100644 → 100755): src/transformers/dnabert-config/bert-config-4/special_tokens_map.json
Empty file modified (100644 → 100755): src/transformers/dnabert-config/bert-config-4/tokenizer_config.json
Empty file modified (100644 → 100755): src/transformers/dnabert-config/bert-config-5/special_tokens_map.json
Empty file modified (100644 → 100755): src/transformers/dnabert-config/bert-config-5/tokenizer_config.json
Empty file modified (100644 → 100755): src/transformers/dnabert-config/bert-config-6/special_tokens_map.json
Empty file modified (100644 → 100755): src/transformers/dnabert-config/bert-config-6/tokenizer_config.json