stock-ml

End-to-end solution to quickly evaluate approaches to maximize profits with algorithmic trading.

Here, "evaluate" does not mean how accurately the machine learning model predicts the future, but how much money it makes:

It's not whether you're right or wrong that's important, but how much money you make when you're right and how much you lose when you're wrong.

(WARREN BUFFETT or GEORGE SOROS)
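To make the distinction concrete, here is a minimal sketch with made-up numbers. The returns, the two "models" and the helper functions `accuracy` and `total_return` are all hypothetical and not part of this project. It assumes simple summed returns without compounding or trading costs.

```python
# Hypothetical daily returns of one stock: four small gains, one large loss.
daily_returns = [0.01, 0.01, 0.01, 0.01, -0.10]

# Positions taken by two hypothetical models: +1 = long, -1 = short.
positions_a = [1, 1, 1, 1, 1]      # right on 4 of 5 days
positions_b = [-1, -1, -1, 1, -1]  # right on 2 of 5 days


def accuracy(positions, returns):
    """Fraction of days on which the position matched the sign of the return."""
    hits = sum(1 for p, r in zip(positions, returns) if p * r > 0)
    return hits / len(returns)


def total_return(positions, returns):
    """Sum of per-day returns earned by the positions (no compounding, no costs)."""
    return sum(p * r for p, r in zip(positions, returns))


for name, positions in [("A", positions_a), ("B", positions_b)]:
    print(
        f"model {name}: accuracy={accuracy(positions, daily_returns):.0%}, "
        f"total return={total_return(positions, daily_returns):+.2%}"
    )
```

Model A is right more often but loses money because its single miss is large. Model B is wrong most of the time and still ends up ahead.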

Why

An efficient workflow and reproducibility are extremely important in every machine learning project, because they allow you to:

quickly iterate on new approaches (trading strategies or machine learning models) and compare the results faster
gain confidence in the results using backtests
save time and resources

How

The core of the tool is built on machine learning models for stock data prediction. The whole system, however, also consists of other, no less important parts:

~~data ingestion~~ (partially manual process: see notebooks\data_load.ipynb)
data transformation (⚙)
feature selection (⚙)
machine learning model selection (⚙)
machine learning model training
stock trading strategy evaluation
~~deployment and maintenance~~ (not part of this project)

(⚙) - these parts are easily configurable and interchangeable using a Dependency Injection framework
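The idea of interchangeable parts can be sketched in plain Python. This is only an illustration of the principle: it does not use the project's Dependency Injection framework, and every class name, registry key and config key below is hypothetical.

```python
from statistics import mean


# Hypothetical interchangeable components; none of these names come from the project.
class IdentityTransform:
    def apply(self, prices):
        return list(prices)


class ReturnsTransform:
    def apply(self, prices):
        # Simple period-over-period returns.
        return [curr / prev - 1 for prev, curr in zip(prices, prices[1:])]


class MeanModel:
    def predict(self, features):
        return mean(features)


class LastValueModel:
    def predict(self, features):
        return features[-1]


# Maps the names used in a config to the classes that implement them.
REGISTRY = {
    "identity": IdentityTransform,
    "returns": ReturnsTransform,
    "mean": MeanModel,
    "last_value": LastValueModel,
}


def build_pipeline(config):
    """Instantiate the components named in the config and wire them together."""
    transform = REGISTRY[config["transform"]]()
    model = REGISTRY[config["model"]]()

    def run(prices):
        return model.predict(transform.apply(prices))

    return run


config = {"transform": "returns", "model": "last_value"}
pipeline = build_pipeline(config)
print(pipeline([100.0, 110.0, 121.0]))
```

Swapping the transformation or the model then means changing one value in `config`, not the code that runs the pipeline.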

What

python src/experiment.py

The main starting point of the system. It uses:

configurable (see: Hydra) pipelines and transformers that process data into features for the ML model
- some of the feature transformations use indicators implemented in TA-Lib Technical Analysis Library
training of ML-models (scikit-learn and LightGBM)
- with cross-validation using scikit-learn's TimeSeriesSplit, which splits time series data into consecutive train/test intervals
trading strategy
- scanning all the available stocks and selecting the most promising ones, according to the implemented strategy
- re-allocating portfolio according to the selected stocks and their weight-allocations
- 📌 with custom risk model
- backtested with customized experiments, based on QSTrader
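The time-series cross-validation listed above can be illustrated with a small sketch. It only shows how scikit-learn's `TimeSeriesSplit` partitions chronologically ordered samples. The toy data and the choice of `n_splits=3` are assumptions, not the project's settings.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Twelve hypothetical observations in chronological order.
X = np.arange(12).reshape(-1, 1)

splitter = TimeSeriesSplit(n_splits=3)
splits = [(train.tolist(), test.tolist()) for train, test in splitter.split(X)]

for fold, (train_idx, test_idx) in enumerate(splits):
    # Every test index comes after every training index, so no future data leaks in.
    assert max(train_idx) < min(test_idx)
    print(f"fold {fold}: train={train_idx} test={test_idx}")
```

Unlike a shuffled k-fold split, each fold trains only on observations that precede its test interval, and the training window grows from fold to fold.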
Here's a sample result:

Setup

HINT: the setup still needs to be tested more thoroughly with different versions of Python and of the associated libraries and their dependencies. The quick setup that should work:

python -m pip install -r ./requirements.txt

Sometimes it needs to be installed manually:

conda create --name sml39 python=3.9
conda activate sml39
conda install ipython jupyterlab

python -m pip install -r ./requirements.txt
conda install pytables
conda install -c conda-forge yfinance 
conda install -c conda-forge lightgbm=3.2.1
pip install pillow==9.0.0

Data

To run this project, you need to provide your own data from a data provider of your choice and make it available, e.g. in HDF5 format, possibly updating config.yaml. The attached notebooks\data_load.ipynb may help here.

The reason this data was not included in this repository is trivial: I would prefer to avoid lawsuits.
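As a rough illustration of the HDF5 step, the sketch below writes a small price table with pandas and reads it back. The column names, the file name `stock_data.h5` and the key `prices` are hypothetical; the layout this project actually expects is defined by its config.yaml and notebooks\data_load.ipynb. It assumes pandas with PyTables installed.

```python
import os
import tempfile

import pandas as pd

# Hypothetical daily prices; real data must come from your own provider.
prices = pd.DataFrame(
    {
        "open": [10.0, 10.4, 10.6],
        "close": [10.5, 10.6, 10.2],
        "volume": [1000, 1200, 900],
    },
    index=pd.to_datetime(["2021-01-04", "2021-01-05", "2021-01-06"]),
)
prices.index.name = "date"

with tempfile.TemporaryDirectory() as tmp_dir:
    store_path = os.path.join(tmp_dir, "stock_data.h5")  # hypothetical file name
    # Write the table under a hypothetical key, then load it again.
    prices.to_hdf(store_path, key="prices", mode="w")
    loaded = pd.read_hdf(store_path, key="prices")

print(loaded)
```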

References

Stefan Jansen: Machine Learning for Algorithmic Trading, Second Edition (highly recommended book) and the associated GitHub repository.
QSTrader - backtesting library of my choice. It allows rapid prototyping and provides performance statistics for analyzing results.
