Team TaoFuFa #5

Skyquek · 2020-04-14T15:57:53Z

In this project, we use a genetic algorithm and deep learning to discover existing or new drugs that could bind to the COVID-19 main protease (6LU7) and have a logP value lower than 5.

Skyquek · 2020-04-14T16:43:48Z

Conversion

How to use

* You may want to finish reading the repo before starting.

Initial generation

Run 'Initial Network.ipynb' to get generation 0 SMILES.

For every generation (including initial)

Run 'Evaluation and Refinement-localGA.ipynb'

Sharding the sdf files

In PyRx, load the sdf files and minimize the molecules, then export them to the pdbqt file format.
Copy the .pdbqt files to a separate folder and run the sharding script (/scripts/folderSplitter/shard.ps1).
Copy everything from the /scripts/binding folder to each generated folder.
The folder is ready for distribution.
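
For readers without PowerShell, the sharding step can be approximated in Python. This is only a sketch of the idea and not a port of shard.ps1, whose exact behavior is not shown here; the function name `shard_folder`, the `shard_NNN` folder naming, and the `shard_size` parameter are assumptions made for illustration.

```python
import shutil
from pathlib import Path
from typing import List


def shard_folder(source: Path, dest: Path, shard_size: int) -> List[Path]:
    """Copy the .pdbqt files in `source` into numbered subfolders of `dest`.

    Hypothetical helper: each subfolder receives at most `shard_size` ligands,
    so the folders can be handed to different machines.
    """
    if shard_size < 1:
        raise ValueError("shard_size must be at least 1")

    ligands = sorted(source.glob("*.pdbqt"))
    shards: List[Path] = []
    for start in range(0, len(ligands), shard_size):
        # Folder names such as shard_000, shard_001 are an assumed convention.
        shard_dir = dest / f"shard_{start // shard_size:03d}"
        shard_dir.mkdir(parents=True, exist_ok=True)
        for ligand in ligands[start:start + shard_size]:
            shutil.copy2(ligand, shard_dir / ligand.name)
        shards.append(shard_dir)
    return shards
```

Each returned folder would still need the contents of /scripts/binding copied into it before distribution.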

Computing the binding affinity

Open PowerShell in each folder and run the binding.ps1 file (copied from /scripts/binding).
The validation results should be in the output folder.
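
As a rough illustration of what this step involves, the sketch below builds one AutoDock Vina command per ligand in a shard folder. It is not the contents of binding.ps1, which is not reproduced here. The file names `receptor.pdbqt` and `conf.txt`, the `_out.pdbqt` suffix, and the function name `build_vina_commands` are assumptions; only the `--receptor`, `--ligand`, `--config`, and `--out` options of the `vina` command line are relied on.

```python
from pathlib import Path
from typing import List


def build_vina_commands(folder: Path,
                        receptor: str = "receptor.pdbqt",
                        config: str = "conf.txt") -> List[List[str]]:
    """Return one `vina` argument list per ligand found in `folder`.

    Hypothetical layout: the receptor and config file sit next to the ligands,
    and docked poses are written to an `output` subfolder.
    """
    out_dir = folder / "output"
    commands: List[List[str]] = []
    for ligand in sorted(folder.glob("*.pdbqt")):
        if ligand.name == receptor:
            continue  # the receptor itself is not a ligand
        commands.append([
            "vina",
            "--receptor", str(folder / receptor),
            "--ligand", str(ligand),
            "--config", str(folder / config),
            "--out", str(out_dir / f"{ligand.stem}_out.pdbqt"),
        ])
    return commands
```

Once the `output` folder exists, each argument list could be passed to `subprocess.run(cmd, check=True)`.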

Consolidation

When processing is done, consolidate all the files from each output folder into a single folder, then copy the files in /scripts/conversion to it.
Open PowerShell in that folder and run the /scripts/conversion/conversion.ps1 file.
The computed results are consolidated in the results.csv file in the same folder.
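
The pdbqt-to-csv compilation can be pictured with the following Python sketch. It assumes the docked files come from AutoDock Vina, which records each pose's score on a `REMARK VINA RESULT:` line with the top-ranked pose first. The CSV column names and the function names are illustrative assumptions and may differ from what conversion.ps1 actually writes.

```python
import csv
from pathlib import Path
from typing import Optional


def best_affinity(pdbqt_text: str) -> Optional[float]:
    """Return the affinity (kcal/mol) of the first pose, or None if absent."""
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            # The first number after the colon is the binding affinity.
            return float(line.split(":", 1)[1].split()[0])
    return None


def consolidate(output_dir: Path, csv_path: Path) -> int:
    """Write one CSV row per docked ligand and return the number of rows."""
    rows = []
    for path in sorted(output_dir.glob("*.pdbqt")):
        score = best_affinity(path.read_text())
        if score is not None:
            rows.append((path.stem, score))

    with csv_path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        # Column names are assumed for this sketch.
        writer.writerow(["ligand", "binding_affinity_kcal_per_mol"])
        writer.writerows(rows)
    return len(rows)
```

Files without a result line are skipped rather than written with an empty score, which makes failed docking runs easy to spot by comparing the row count with the number of ligands.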

A generation is now complete. Return to step 2 ("For every generation") to obtain the next generation.

Post processing (optional, but recommended to do for last gen)

Run 'Final Results.ipynb' to visualize the data and filter for the best molecules.
A file will be created at generations/master_results_table_final.csv. This file can be validated by bioinformatics.

Skyquek and others added 30 commits April 11, 2020 12:51

Initial Commit

b04f5a2

Update ReadMe.md

43d93c2

tag teammate

9812884

Add Li Ho part

fa373e1

Added binding script and configuration

59f9d40

Update README.md - Rephrased some sentences and added some details

41fbdef

Update README.md

6ca96af

Update README.md

64a39a4

Resolve merge issues pr/1

5d43e67

Merge branch 'pr/1'

97b475a

Updated README.md

ac1814e

Did some formatting change and content addition

Ad the results of training to the folders for submission

2b393ee

Merge branch 'master' of https://github.com/Skyquek/fch-drug-discovery

6a42128

Fixed number change to variable

f0168a9

Formatting test

44ad842

Merge branch 'formattingTest'

5de9bb6

Updated README.md

e0925a6

Update README.md

f12e281

Checkpoint

Add Covid-19 png Images

5d5ff1f

Merge branch 'master' of https://github.com/Skyquek/fch-drug-discovery

de0f0cd

Add covid-19 png image in readme

a44b6b7

Change Image Size

9c392f8

remove align

0d7fd0f

Auto stash before merge of "master" and "origin/master"

1bd4eba

Fixed image not displaying

fd250ef

Correct the image broken link

9073ff0

Merge branch 'master' of https://github.com/Skyquek/fch-drug-discovery

8414eb3

Updated README.md

1c54bc8

Completed proofreading for section "Comparisons to the Original Repositories"

Merge branch 'readmeDrafting'

820e9ea

Added scripts

082223f

Update script for vina, added script for pdbqt->csv compilation, and a powershell folder splitter

kwongtn added 7 commits April 14, 2020 23:01

Disaster recovery

b2eb4c8

Update README.md - Added "Computational Implementations"

3f66b5a

Updated README.md - Added links to presentation and YouTube video

600c61e

Added presentation slides

c1666c5

Merge branch 'readmeDrafting'

a844dd4

Update README.md - Added future work

ac685d0

By Janson

Update README.md - Name correction

d0f4035

kwongtn and others added 22 commits April 15, 2020 00:44

Update README.md - Added steps on how to use

0823738

Updated to show slides

a870518

Update README.md - Fixed typo

0e74f4b

Add gen13 to gen 17

79b7228

Update to newest generation

Add Methodology Graph

4b454a8

Graph that show comparison between old approaches vs new approaches

Merge branch 'readmeResult' into gen13-17-Quek

0d93ddf

Update Readme.md and update results

88ce0ba

Edit final_gen.sdf and the readme result

fixed extension typo

f14af87

png -> PNG

Remove Flow_of_Covid.pptx

21a70cf

Remove the old power points file

Add few future work

d8a044e

Add ranked based selection and tournament selection to choose the best fit candidate.

Add Violin Plot

fd376a4

Add violin plot

Change violin plot width to 100%

0e3c446

Change typo figure 2 to figure 3

6382ff3

Clean up some code on gen 17 and add more graph in readme

82386fb

- Investigate on the gen15 Ans: Due to filter, our gen 15 looks weird - Add more graph in readme

Inverse figure 4 and figure 5 label

b8d67df

Merge remote-tracking branch 'origin/gen13-17-Quek'

317f559

Corrected some typo

adef2fb

Merge remote-tracking branch 'origin/gen13-17-Quek'

bc99e77

Bug Fixed and Retrained to Gen 10

eee226d

Fixed the bug on Primary Key duplication

Merge remote-tracking branch 'origin/gen13-17-Quek' into gen13-17-Quek

0312afa

Update new Results - Merge branch 'origin/GreenDeployment' to master

f0c3cc4

Fixed typo & update tables

014873a
