LightLDA
This document shows how to install and use LightLDA.
git clone --recursive https://github.com/Microsoft/lightlda
- Windows
Open windows/LightLDA.sln using Visual Studio 2013 and build all the projects.
- Linux (Tested on Ubuntu 14.04)
Run $ sh ./build.sh to install the program.
The quick guidelines below are for your reference. For more detailed instructions about the command line arguments, run $ ./lightlda --help
-
Preprocess
LightLDA takes a specific binary format as its input. To run LightLDA, first convert your data to that format. A tool is provided to convert LibSVM-format data to the LightLDA format, so for simplicity you can prepare your dataset in LibSVM format first. The following steps assume your dataset is in LibSVM format.
- Count the dataset meta information.
./example/get_meta.py input.libsvm output.word_tf_file
- Split your LibSVM data into several parts.
- Convert each part from LibSVM format to the binary format used by LightLDA, passing the word_tf file produced by the meta information step.
./bin/dump_binary input.libsvm.part_id input.word_tf_file part_id
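No command is given above for the splitting step, so here is a minimal Python sketch of one way to do it. It assumes the LibSVM file holds one document per line and that the order of documents across parts does not matter. The function name split_libsvm and the round-robin strategy are illustrative choices, not part of LightLDA. The output names follow the input.libsvm.part_id pattern used in the dump_binary command.

```python
def split_libsvm(input_path, num_parts):
    """Split a line-oriented LibSVM file into `num_parts` files.

    Hypothetical helper, not shipped with LightLDA. Lines are dealt out
    round-robin, so every document (one line) stays intact. Part files are
    named `<input_path>.<part_id>` with part_id starting at 0.
    Returns the list of part file paths.
    """
    if num_parts < 1:
        raise ValueError("num_parts must be at least 1")

    part_paths = ["%s.%d" % (input_path, i) for i in range(num_parts)]
    outputs = [open(path, "w") for path in part_paths]
    try:
        with open(input_path) as source:
            for line_no, line in enumerate(source):
                # Line k goes to part k mod num_parts.
                outputs[line_no % num_parts].write(line)
    finally:
        for out in outputs:
            out.close()
    return part_paths
```

Calling split_libsvm("input.libsvm", 4) would write input.libsvm.0 through input.libsvm.3, each of which can then be passed to the conversion step.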
-
Training on single machine
We provide examples that illustrate how to use LightLDA to train topic models on a single machine. For a quick start, run in PowerShell (Windows)
$ ./example/nytimes.ps1
or in Bash (Linux)
$ ./example/nytimes.sh
-
Training in a distributed setting with MPI
Running MPI-based distributed LightLDA is quite similar to the single-machine setting: prepare a machine list file and launch LightLDA through mpiexec.
$ mpiexec -machinefile machine_list lightlda lda_arguments
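As a rough illustration of the launch step, the Python sketch below writes a machine list file and assembles the mpiexec command line. It assumes the common convention of one host name per line in the machine file; the exact format depends on your MPI implementation, so check its documentation. The helper names and any host names you pass are hypothetical, and the LightLDA arguments are left to the caller.

```python
def write_machine_list(path, hosts):
    """Write one host name per line to `path`.

    Assumption: the MPI launcher accepts a plain list of host names.
    """
    with open(path, "w") as f:
        for host in hosts:
            f.write(host + "\n")


def build_mpiexec_command(machine_file, lightlda_binary, lda_arguments):
    """Return the argv list for the mpiexec invocation.

    The result mirrors: mpiexec -machinefile <file> <lightlda> <arguments...>
    Nothing is executed here; the list can be handed to subprocess.run.
    """
    return ["mpiexec", "-machinefile", machine_file, lightlda_binary] + list(lda_arguments)
```

Building the command as a list rather than a single string avoids shell quoting problems when paths contain spaces.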