
YSDA-CPU-inference

Quantized inference on CPU (int8 / int4 / mixed precision)

The aim of this project is to investigate whether int8 inference can deliver a speedup over fp16/fp32 inference on CPU (in particular, the hardware must offer efficient INT8 compute units for quantization to be profitable).
You can find more details in the project presentation.
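
To make the idea concrete, below is a minimal sketch of per-tensor int8 quantization with libtorch's C++ frontend. The scale and zero point values are illustrative assumptions for the sketch, not taken from this project:

```cpp
#include <torch/torch.h>
#include <iostream>

int main() {
    // An fp32 tensor standing in for a layer's weights or activations.
    torch::Tensor x = torch::randn({4, 4});

    // Quantize to int8 with an (illustrative) affine scheme:
    // q = round(x / scale) + zero_point, clamped to the int8 range.
    double scale = 0.05;     // assumed value for the sketch
    int64_t zero_point = 0;  // symmetric quantization around zero
    torch::Tensor q = torch::quantize_per_tensor(x, scale, zero_point, torch::kQInt8);

    // int_repr() exposes the raw int8 storage; dequantize() maps back to fp32.
    std::cout << "int8 values:\n" << q.int_repr() << "\n";
    std::cout << "max abs round-trip error: "
              << (q.dequantize() - x).abs().max().item<float>() << "\n";
    return 0;
}
```

The round-trip error printed at the end is exactly the quantization noise that the speedup from int8 arithmetic has to justify.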

C++ config

In this branch we use a libtorch built directly from source, as described in Building libtorch using CMake.
We built it in Debug mode; to reproduce this, run the following commands in the /cpp folder.

Warning

The full build requires a little under 23 GB of disk space and about 14 GB of RAM.

```bash
git clone -b main --recurse-submodules https://github.com/pytorch/pytorch.git
mkdir pytorch-build
cd pytorch-build
cmake -DBUILD_SHARED_LIBS:BOOL=ON -DCMAKE_BUILD_TYPE:STRING=Debug -DPYTHON_EXECUTABLE:PATH=`which python3` -DCMAKE_INSTALL_PREFIX:PATH=../pytorch-install ../pytorch
cmake --build . --target install
```

Then, also in the /cpp folder, run

```bash
mkdir build
cd build
cmake ..
make
```
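
Once the build succeeds, the binary can be driven in the usual libtorch way. As a hedged sketch (the actual entry point, model path, and input shape in this repository may differ), loading and running a serialized TorchScript model looks like this:

```cpp
#include <torch/script.h>
#include <iostream>
#include <vector>

int main(int argc, const char* argv[]) {
    if (argc != 2) {
        std::cerr << "usage: ./app <path-to-exported-model.pt>\n";
        return 1;
    }

    // Load a model previously exported from Python with torch.jit.script/trace.
    torch::jit::script::Module module;
    try {
        module = torch::jit::load(argv[1]);
    } catch (const c10::Error& e) {
        std::cerr << "error loading the model: " << e.what() << "\n";
        return 1;
    }

    module.eval();
    torch::NoGradGuard no_grad;  // inference only, no autograd bookkeeping

    // The input shape is an assumption for the sketch
    // (e.g. one 224x224 RGB image).
    std::vector<torch::jit::IValue> inputs;
    inputs.push_back(torch::randn({1, 3, 224, 224}));

    torch::Tensor output = module.forward(inputs).toTensor();
    std::cout << "output shape: " << output.sizes() << "\n";
    return 0;
}
```

Timing this forward pass for an fp32 model against its quantized counterpart is the basic measurement the project's question rests on.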

Useful links

Additional topic-related papers from experienced researchers
