Update xlearn notebook #1006

Merged 2 commits on Dec 10, 2019
4 changes: 0 additions & 4 deletions docker/Dockerfile
@@ -42,10 +42,6 @@ RUN mkdir ${HOME}/.jupyter && \
# CPU Stage
FROM base AS cpu

# Setup Conda environment
RUN apt-get update && \
apt-get install -y python-pip && \
pip install cmake
RUN python recommenders/scripts/generate_conda_file.py --name base


8 changes: 4 additions & 4 deletions notebooks/02_model/fm_deep_dive.ipynb
@@ -38,7 +38,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"FM is an algorithm that uses factorization in prediction tasks with data set of high sparsity. The algorithm was original proposed in [\\[1\\]](https://www.csie.ntu.edu.tw/~b97053/paper/Rendle2010FM.pdf). Traditionally, the algorithms such as SVM failed in dealing with highly sparse data that is usually seen in many contemporary problems, e.g., click-through rate prediction, recommendation, etc. FM handles the problem by modeling not just first-order linear components for predicting the label, but also the cross-product of the feature variables in order to capture more generalized correlation between variables and label. "
"FM is an algorithm that uses factorization in prediction tasks with data set of high sparsity. The algorithm was original proposed in [\\[1\\]](https://www.csie.ntu.edu.tw/~b97053/paper/Rendle2010FM.pdf). Traditionally, the algorithms such as SVM do not perform well in dealing with highly sparse data that is usually seen in many contemporary problems, e.g., click-through rate prediction, recommendation, etc. FM handles the problem by modeling not just first-order linear components for predicting the label, but also the cross-product of the feature variables in order to capture more generalized correlation between variables and label. "
]
},
{
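For reference, the two-way FM model described in this cell (as given in the Rendle paper linked above) scores an input $\mathbf{x} \in \mathbb{R}^n$ as

$$\hat{y}(\mathbf{x}) = w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle \mathbf{v}_i, \mathbf{v}_j \rangle x_i x_j$$

where $w_0$ is the global bias, $w_i$ are the first-order weights, and $\mathbf{v}_i \in \mathbb{R}^k$ are the factorization vectors whose dot products model the pairwise feature interactions.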
@@ -89,7 +89,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Compared to using fixed parameter for the high-order interaction components, using the factorized vectors increase generalization as well as expressiveness of the model. In addition to this, the computation complexity of the model is $O(kn)$ where $k$ and $n$ are the dimensionalities of the factorization vector and input feature vector, respectively. In practice, usually a two-way FM model is used, i.e., only the second-order feature interactions are considered, to favor computational efficiency without loss of model performance."
"Compared to using fixed parameter for the high-order interaction components, using the factorized vectors increase generalization as well as expressiveness of the model. In addition to this, the computation complexity of the equation (above) is $O(kn)$ where $k$ and $n$ are the dimensionalities of the factorization vector and input feature vector, respectively (see [the paper](https://www.csie.ntu.edu.tw/~b97053/paper/Rendle2010FM.pdf) for detailed discussion). In practice, usually a two-way FM model is used, i.e., only the second-order feature interactions are considered to favor computational efficiency."
]
},
{
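The $O(kn)$ cost is not obvious from the naive double sum, which looks like $O(kn^2)$; it follows from the reformulation of the pairwise term given in the Rendle paper, restated here:

$$\sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle \mathbf{v}_i, \mathbf{v}_j \rangle x_i x_j = \frac{1}{2} \sum_{f=1}^{k} \left[ \left( \sum_{i=1}^{n} v_{i,f} x_i \right)^2 - \sum_{i=1}^{n} v_{i,f}^2 x_i^2 \right]$$

so the interaction term can be evaluated with a single pass over the non-zero features for each of the $k$ factors.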
@@ -182,7 +182,7 @@
"|[libfm](https://github.com/srendle/libfm)|C++|Implementation of FM algorithm|-|\n",
"|[libffm](https://github.com/ycjuan/libffm)|C++|Original implemenation of FFM algorithm. It is handy in model building, but does not support Python interface|-|\n",
"|[xlearn](https://github.com/aksnzhy/xlearn)|C++ with Python interface|More computationally efficient compared to libffm without loss of modeling effectiveness|[notebook](https://github.com/microsoft/recommenders/blob/master/notebooks/02_model/fm_deep_dive.ipynb)|\n",
"|[Vowpal Wabbit FM](https://github.com/VowpalWabbit/vowpal_wabbit/wiki/Matrix-factorization-example)|Online library with estimator API|Easy to use by calling API, but flexibility and configurability are limited|[notebook](https://github.com/microsoft/recommenders/blob/master/notebooks/02_model/vowpal_wabbit_deep_dive.ipynb) / [utilities](https://github.com/microsoft/recommenders/tree/master/reco_utils/recommender/vowpal_wabbit)\n",
"|[Vowpal Wabbit FM](https://github.com/VowpalWabbit/vowpal_wabbit/wiki/Matrix-factorization-example)|Online library with estimator API|Easy to use by calling API|[notebook](https://github.com/microsoft/recommenders/blob/master/notebooks/02_model/vowpal_wabbit_deep_dive.ipynb) / [utilities](https://github.com/microsoft/recommenders/tree/master/reco_utils/recommender/vowpal_wabbit)\n",
"|[microsoft/recommenders xDeepFM](https://github.com/microsoft/recommenders/blob/master/reco_utils/recommender/deeprec/models/xDeepFM.py)|Python|Support flexible interface with different configurations of FM and FM extensions, i.e., LR, FM, and/or CIN|[notebook](https://github.com/microsoft/recommenders/blob/master/notebooks/00_quick_start/xdeepfm_criteo.ipynb) / [utilities](https://github.com/microsoft/recommenders/blob/master/reco_utils/recommender/deeprec/models/xDeepFM.py)|"
]
},
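Since this PR updates the xlearn notebook, the sketch below illustrates how the xlearn Python interface listed in the table above is typically driven for an FFM model; the file paths and hyperparameter values are placeholders, not taken from the notebook.

```python
import xlearn as xl

# Create an FFM model; xlearn also exposes create_fm() and create_linear()
ffm_model = xl.create_ffm()

# Training/validation files in libffm format (field:index:value);
# the paths are illustrative placeholders
ffm_model.setTrain("./train.ffm")
ffm_model.setValidate("./valid.ffm")

# Example hyperparameters: binary classification, learning rate, L2 penalty,
# latent dimension k, number of epochs, AUC as the evaluation metric
param = {"task": "binary", "lr": 0.2, "lambda": 0.002,
         "k": 4, "epoch": 10, "metric": "auc"}

# Train and persist the model, then score a test file
ffm_model.fit(param, "./model.out")
ffm_model.setTest("./test.ffm")
ffm_model.predict("./model.out", "./output.txt")
```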
@@ -872,7 +872,7 @@
"metadata": {
"celltoolbar": "Tags",
"kernelspec": {
"display_name": "Python (reco_base)",
"display_name": "Python 3.6 (Recommender)",
"language": "python",
"name": "reco_base"
},