This project provides a Python-based automation pipeline for downloading haplotypic information from the IRRI SNP-Seek database. It is designed to assist bioinformaticians, plant breeders, rice researchers, and enthusiasts in efficiently retrieving large amounts of data.
Manually downloading genotypic and haplotypic information for thousands of loci from the IRRI SNP-Seek database is repetitive and time-consuming, often taking up to a week. This project aims to cut that to approximately 72 hours by automating the browser with Selenium WebDriver in Python.
The project includes four key components:
- Two CSV files
- A Python script
- A Chrome WebDriver executable file
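As a rough illustration of how the CSV inputs and the Python script could fit together, the sketch below reads a list of loci from a CSV file and splits it into batches for downloading. The contents of the project's CSV files are not described here, so the column name `locus_id`, the sample locus identifiers, and the helper names `read_loci` and `batches` are all assumptions made for this example.

```python
import csv
import io
from typing import Iterator, List, TextIO


def read_loci(handle: TextIO, column: str = "locus_id") -> List[str]:
    """Return the non-empty, de-duplicated values of `column`, in file order.

    The column name is hypothetical; adjust it to match the real CSV header.
    """
    reader = csv.DictReader(handle)
    seen = set()
    loci = []
    for row in reader:
        value = (row.get(column) or "").strip()
        if value and value not in seen:
            seen.add(value)
            loci.append(value)
    return loci


def batches(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# An in-memory file stands in for one of the CSV inputs (placeholder data).
sample = io.StringIO(
    "locus_id\n"
    "LOC_Os01g01010\n"
    "LOC_Os01g01019\n"
    "LOC_Os01g01010\n"  # duplicate, dropped by read_loci
)
loci = read_loci(sample)
for batch in batches(loci, size=1):
    print(batch)
```

Working in small batches is one way to make a long-running download resumable: if the connection drops, only the current batch needs to be repeated.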
To utilize this automation pipeline, you will need:
- Python installed on your system for script execution. Python can be downloaded from the official Python website.
- The Chrome WebDriver executable file, compatible with your version of Chrome. This can be downloaded from the ChromeDriver website.
- A stable, high-speed internet connection to ensure uninterrupted downloads.
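The requirements above can be checked before starting a long run. The following is a minimal sketch, not the project's own script: the function names `preflight` and `make_driver`, the `./chromedriver` path, and the minimum Python version are assumptions for illustration. The driver setup uses the Selenium 4 style of passing a `Service` object, and Selenium is imported lazily so the check itself runs even when the package is missing.

```python
import importlib.util
import os
import sys
from typing import List, Tuple


def preflight(driver_path: str, min_python: Tuple[int, int] = (3, 8)) -> List[str]:
    """Return a list of human-readable problems; an empty list means ready.

    `min_python` is an assumed lower bound, not a documented requirement.
    """
    problems = []
    if sys.version_info < min_python:
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ expected, "
            f"found {sys.version_info.major}.{sys.version_info.minor}"
        )
    if importlib.util.find_spec("selenium") is None:
        problems.append("selenium is not installed (try: pip install selenium)")
    if not os.path.isfile(driver_path):
        problems.append(f"ChromeDriver not found at {driver_path}")
    elif not os.access(driver_path, os.X_OK):
        problems.append(f"ChromeDriver at {driver_path} is not executable")
    return problems


def make_driver(driver_path: str):
    """Start Chrome through Selenium using the given ChromeDriver binary."""
    # Imported here so that preflight() works before Selenium is installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    return webdriver.Chrome(service=Service(executable_path=driver_path))


# Hypothetical location of the ChromeDriver executable.
issues = preflight("./chromedriver")
for issue in issues:
    print("-", issue)
```

If `preflight` returns no problems, `make_driver("./chromedriver")` would open a Chrome window under Selenium's control. A ChromeDriver build that does not match the installed Chrome version typically fails at that point, which is why the compatibility note above matters.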
Detailed usage instructions will be provided, covering environment setup, running the Python script, and downloading the required data.
Link for Demo:
- Ajay Kumar - PhD Computer Science, Missouri - ajaykumarmizzou
This project is licensed under the MIT License. See the LICENSE file for details.
- IRRI South Asia Hub, Hyderabad, IN.
- Inspiration: If agriculture fails, everything else fails.