Enzyme-Optimal-Condition-Analysis

We investigate the optimal condition of an enzyme by directly analyzing the sequence. We propose an embedding method to represent the amino acids and the construct information as vectors in the latent space.

File descriptions:

1.change_to_csv.java: Convert .fas data to .csv data

2.change_to_number.java: Convert string data in the .csv file to real numbers or a one-hot encoding

3.Split_train_test.java: Split the data into a training set and a test set proportionally

4.create_probability.java: Generate the sampling ratio of positive and negative samples

5.create_samples.java: Generate positive and negative samples according to the sampling ratio

6.protein_learning.java: Embedding learning

7.check_matrix.java: Evaluate the accuracy of the model
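
As a rough illustration of what the first three steps involve, here is a minimal Java sketch. It is not the repository's code: the class and method names, the 21-symbol alphabet (20 standard amino acids plus a gap character), the all-zero vector for unknown symbols, and the seeded shuffle are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch of steps 1-3; none of these names come from the repository.
class PreprocessingSketch {

    // Assumed alphabet: 20 standard amino acids plus '-' for alignment gaps.
    static final String ALPHABET = "ACDEFGHIKLMNPQRSTVWY-";

    // Step 1 (assumed form): parse FASTA text into header -> sequence,
    // joining sequences that are wrapped over several lines.
    static Map<String, String> parseFasta(String text) {
        Map<String, String> records = new LinkedHashMap<>();
        String header = null;
        for (String raw : text.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty()) continue;
            if (line.startsWith(">")) {
                header = line.substring(1);
                records.put(header, "");
            } else if (header != null) {
                records.merge(header, line, String::concat);
            }
        }
        return records;
    }

    // Step 2 (assumed form): one-hot encode a single residue.
    // Symbols outside ALPHABET map to an all-zero vector.
    static int[] oneHot(char residue) {
        int[] vector = new int[ALPHABET.length()];
        int index = ALPHABET.indexOf(Character.toUpperCase(residue));
        if (index >= 0) vector[index] = 1;
        return vector;
    }

    // Step 3 (assumed form): shuffle with a fixed seed, then cut at trainFraction.
    // Returns two lists: element 0 is the training set, element 1 the test set.
    static <T> List<List<T>> split(List<T> rows, double trainFraction, long seed) {
        List<T> shuffled = new ArrayList<>(rows);
        Collections.shuffle(shuffled, new Random(seed));
        int cut = (int) Math.round(shuffled.size() * trainFraction);
        List<List<T>> parts = new ArrayList<>();
        parts.add(new ArrayList<>(shuffled.subList(0, cut)));
        parts.add(new ArrayList<>(shuffled.subList(cut, shuffled.size())));
        return parts;
    }

    public static void main(String[] args) {
        // Made-up input, only to exercise the three helpers.
        String fasta = ">seq1\nACDE\nFG\n>seq2\nKL-M\n";
        Map<String, String> records = parseFasta(fasta);
        for (Map.Entry<String, String> e : records.entrySet()) {
            System.out.println(e.getKey() + "," + e.getValue()); // one CSV row per record
        }
        System.out.println(Arrays.toString(oneHot('C')));
        List<List<String>> parts = split(new ArrayList<>(records.keySet()), 0.5, 42L);
        System.out.println("train=" + parts.get(0) + " test=" + parts.get(1));
    }
}
```

The fixed seed only makes the split reproducible between runs; the actual scripts may split, encode, or handle unknown residues differently.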

Citation

Li, X., Dou, Z., Sun, Y. et al. A sequence embedding method for enzyme optimal condition analysis. BMC Bioinformatics 21, 512 (2020). https://doi.org/10.1186/s12859-020-03851-5