Merging the extra_requires examples into the core package #1987

Merged 4 commits on Sep 1, 2023.
README.md: 10 additions & 13 deletions
@@ -18,11 +18,11 @@ Here you can find the package documentation: https://microsoft-recommenders.read

This repository contains examples and best practices for building recommendation systems, provided as Jupyter notebooks. The examples detail our learnings on five key tasks:

- [Prepare Data](examples/01_prepare_data): Preparing and loading data for each recommender algorithm
- [Prepare Data](examples/01_prepare_data): Preparing and loading data for each recommender algorithm.
- [Model](examples/00_quick_start): Building models using various classical and deep learning recommender algorithms such as Alternating Least Squares ([ALS](https://spark.apache.org/docs/latest/api/python/_modules/pyspark/ml/recommendation.html#ALS)) or eXtreme Deep Factorization Machines ([xDeepFM](https://arxiv.org/abs/1803.05170)).
- [Evaluate](examples/03_evaluate): Evaluating algorithms with offline metrics
- [Model Select and Optimize](examples/04_model_select_and_optimize): Tuning and optimizing hyperparameters for recommender models
- [Operationalize](examples/05_operationalize): Operationalizing models in a production environment on Azure
- [Evaluate](examples/03_evaluate): Evaluating algorithms with offline metrics.
- [Model Select and Optimize](examples/04_model_select_and_optimize): Tuning and optimizing hyperparameters for recommender models.
- [Operationalize](examples/05_operationalize): Operationalizing models in a production environment on Azure.

Several utilities are provided in [recommenders](recommenders) to support common tasks such as loading datasets in the format expected by different algorithms, evaluating model outputs, and splitting training/test data. Implementations of several state-of-the-art algorithms are included for self-study and customization in your own applications. See the [Recommenders documentation](https://readthedocs.org/projects/microsoft-recommenders/).

@@ -40,16 +40,16 @@ We recommend [conda](https://docs.conda.io/projects/conda/en/latest/glossary.htm
conda create -n <environment_name> python=3.9
conda activate <environment_name>

# 3. Install the recommenders package with examples
pip install recommenders[examples]
# 3. Install the core recommenders package. It can run all the CPU notebooks.
pip install recommenders

# 4. create a Jupyter kernel
python -m ipykernel install --user --name <environment_name> --display-name <kernel_name>

# 5. Clone this repo within vscode or using command:
git clone https://github.com/microsoft/recommenders.git
# 5. Clone this repo within VSCode or using command line:
git clone https://github.com/recommenders-team/recommenders.git

# 6. Within VS Code:
# 6. Within VSCode:
# a. Open a notebook, e.g., examples/00_quick_start/sar_movielens.ipynb;
# b. Select Jupyter kernel <kernel_name>;
# c. Run the notebook.
@@ -58,14 +58,11 @@ git clone https://github.com/microsoft/recommenders.git
For more information about setup on other platforms (e.g., Windows and macOS) and different configurations (e.g., GPU, Spark and experimental features), see the [Setup Guide](SETUP.md).

In addition to the core package, several extras are also provided, including:
+ `[examples]`: Needed for running examples.
+ `[gpu]`: Needed for running GPU models.
+ `[spark]`: Needed for running Spark models.
+ `[dev]`: Needed for development for the repo.
+ `[all]`: `[examples]`|`[gpu]`|`[spark]`|`[dev]`
+ `[all]`: `[gpu]`|`[spark]`|`[dev]`
+ `[experimental]`: Models that are not thoroughly tested and/or may require additional steps in installation.
+ `[nni]`: Needed for running models integrated with [NNI](https://nni.readthedocs.io/en/stable/).
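
With this change, `examples` is no longer an extra; its dependencies ship with the core package. A quick way to confirm which extras a local installation exposes is sketched below (it assumes `recommenders` has already been pip-installed in the active environment):

```python
# Sketch: list the extras declared by an installed recommenders distribution.
# Assumes `pip install recommenders` has already been run in this environment.
from importlib.metadata import metadata

meta = metadata("recommenders")
print(meta.get_all("Provides-Extra"))
# Expected to include gpu, spark, dev, all, experimental and nni,
# with no `examples` entry after this PR.
```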


## Algorithms

SETUP.md: 19 additions & 8 deletions
@@ -1,23 +1,34 @@
# Setup Guide

The repo, including this guide, is tested on Linux. Where applicable, we document differences in [Windows](#windows-specific-instructions) and [macOS](#macos-specific-instructions) although
The repo, including this guide, is tested on Linux. Where applicable, we document differences in [Windows](#windows-specific-instructions) and [MacOS](#macos-specific-instructions) although
such documentation may not always be up to date.

## Extras

In addition to the pip installable package, several extras are provided, including:
+ `[examples]`: Needed for running examples.
+ `[gpu]`: Needed for running GPU models.
+ `[spark]`: Needed for running Spark models.
+ `[dev]`: Needed for development.
+ `[all]`: `[examples]`|`[gpu]`|`[spark]`|`[dev]`
+ `[all]`: `[gpu]`|`[spark]`|`[dev]`
+ `[experimental]`: Models that are not thoroughly tested and/or may require additional steps in installation.
+ `[nni]`: Needed for running models integrated with [NNI](https://nni.readthedocs.io/en/stable/).


## Setup for Core Package

Follow the [Getting Started](./README.md#Getting-Started) section in the [README](./README.md) to install the package and run the examples.

## Setup for GPU

```bash
# 1. Make sure CUDA is installed.

# 2. Follow Steps 1-5 in the Getting Started section in README.md to install the package and Jupyter kernel, adding the gpu extra to the pip install command:
pip install recommenders[gpu]

# 3. Within VSCode:
# a. Open a notebook with a GPU model, e.g., examples/00_quick_start/wide_deep_movielens.ipynb;
# b. Select Jupyter kernel <kernel_name>;
# c. Run the notebook.
```
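
A minimal sanity check for the GPU setup above might look like the sketch below, assuming TensorFlow was pulled in by the `[gpu]` extra and the check is run inside the same conda environment:

```python
# Sketch: confirm that TensorFlow (installed via recommenders[gpu]) can see a GPU.
# The output depends on the local CUDA/cuDNN installation.
import tensorflow as tf

print(tf.__version__)
print(tf.config.list_physical_devices("GPU"))  # expect at least one device on a working CUDA setup
```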

## Setup for Spark

@@ -26,9 +37,9 @@ Follow the [Getting Started](./README.md#Getting-Started) section in the [README
# sudo apt-get install openjdk-11-jdk

# 2. Follow Steps 1-5 in the Getting Started section in README.md to install the package and Jupyter kernel, adding the spark extra to the pip install command:
pip install recommenders[examples,spark]
pip install recommenders[spark]

# 3. Within VS Code:
# 3. Within VSCode:
# a. Open a notebook with a Spark model, e.g., examples/00_quick_start/als_movielens.ipynb;
# b. Select Jupyter kernel <kernel_name>;
# c. Run the notebook.
@@ -99,7 +110,7 @@ The `xlearn` package has dependency on `cmake`. If one uses the `xlearn` related

For Spark features to work, make sure Java and Spark are installed and the respective environment variables such as `JAVA_HOME`, `SPARK_HOME` and `HADOOP_HOME` are set properly. Also make sure the environment variables `PYSPARK_PYTHON` and `PYSPARK_DRIVER_PYTHON` are set to the same Python executable.

## macOS-Specific Instructions
## MacOS-Specific Instructions

We recommend using [Homebrew](https://brew.sh/) to install the dependencies on macOS, including conda (please remember to add conda's path to `$PATH`). One may also need to install lightgbm with Homebrew before pip installing the package.

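For the Spark environment variables called out in the platform-specific notes above, a quick check that they are set and that a local session starts could look like this sketch (it assumes the `[spark]` extra is installed):

```python
# Sketch: print the Spark-related environment variables and start a local session.
import os
from pyspark.sql import SparkSession

for var in ("JAVA_HOME", "SPARK_HOME", "HADOOP_HOME", "PYSPARK_PYTHON", "PYSPARK_DRIVER_PYTHON"):
    print(var, "=", os.environ.get(var, "<not set>"))

spark = SparkSession.builder.master("local[*]").appName("recommenders-smoke-test").getOrCreate()
print(spark.version)
spark.stop()
```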
setup.py: 10 additions & 14 deletions
@@ -8,10 +8,10 @@
import sys
import time

# workround for enabling editable user pip installs
# Workround for enabling editable user pip installs
site.ENABLE_USER_SITE = "--user" in sys.argv[1:]

# version
# Version
here = Path(__file__).absolute().parent
version_data = {}
with open(here.joinpath("recommenders", "__init__.py"), "r") as f:
@@ -29,7 +29,7 @@
install_requires = [
"numpy>=1.19", # 1.19 required by tensorflow 2.6
"pandas>1.0.3,<2",
"scipy>=1.0.0,<1.11.0", #FIXME: We limit <1.11.0 until #1954 is fixed
"scipy>=1.0.0,<1.11.0", # FIXME: We limit <1.11.0 until #1954 is fixed
"tqdm>=4.31.1,<5",
"matplotlib>=2.2.2,<4",
"scikit-learn>=0.22.1,<1.0.3",
@@ -49,17 +49,15 @@
"pandera[strategies]>=0.6.5", # For generating fake datasets
"scikit-surprise>=1.0.6",
"scrapbook>=0.5.0,<1.0.0",
"hyperopt>=0.1.2,<1",
"ipykernel>=4.6.1,<7",
"jupyter>=1,<2",
"locust>=1,<2",
"papermill>=2.1.2,<3",
]

# shared dependencies
extras_require = {
"examples": [
"hyperopt>=0.1.2,<1",
"ipykernel>=4.6.1,<7",
"jupyter>=1,<2",
"locust>=1,<2",
"papermill>=2.1.2,<3",
],
"gpu": [
"nvidia-ml-py3>=7.352.0",
# TensorFlow compiled with CUDA 11.2, cudnn 8.1
@@ -80,17 +78,15 @@
"pytest-mock>=3.6.1", # for access to mock fixtures in pytest
],
}
# for the brave of heart
# For the brave of heart
extras_require["all"] = list(set(sum([*extras_require.values()], [])))

# the following dependencies need additional testing
# The following dependencies need additional testing
extras_require["experimental"] = [
# xlearn requires cmake to be pre-installed
"xlearn==0.40a1",
# VW C++ binary needs to be installed manually for some code to work
"vowpalwabbit>=8.9.0,<9",
]
extras_require["nni"] = [
# nni needs to be upgraded
"nni==1.5",
]
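The `[all]` extra above is simply the de-duplicated union of the remaining extras now that the example dependencies live in `install_requires`. A standalone sketch of that idiom, using illustrative placeholder pins rather than the real ones, is:

```python
# Sketch of the union idiom used for extras_require["all"] in setup.py.
# The version pins below are placeholders for illustration only.
extras_require = {
    "gpu": ["nvidia-ml-py3>=7.352.0"],
    "spark": ["pyspark>=3,<4"],
    "dev": ["pytest>=3.6.4", "pytest-cov>=2.12.1"],
}

# Flatten every extra's dependency list, de-duplicate, and expose the result as [all].
extras_require["all"] = list(set(sum([*extras_require.values()], [])))
print(sorted(extras_require["all"]))
```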
@@ -196,7 +196,7 @@ def create_run_config(
)

# install recommenders
reco_extras = "dev,examples"
reco_extras = "dev"
if add_gpu_dependencies and add_spark_dependencies:
conda_dep.add_channel("conda-forge")
conda_dep.add_conda_package(conda_pkg_jdk)
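
The `reco_extras` string above presumably ends up in the pip requirement that the AzureML run configuration installs. A hypothetical sketch of that flow is shown below; the flag values and the final requirement string are illustrative, since the surrounding code is not shown in the diff:

```python
# Hypothetical sketch: build the extras string and add the pip requirement to an
# AzureML CondaDependencies object. Flag values and the requirement string are
# illustrative; the actual logic in this file is not shown in the diff.
from azureml.core.conda_dependencies import CondaDependencies

add_gpu_dependencies, add_spark_dependencies = True, False
reco_extras = "dev"
if add_gpu_dependencies:
    reco_extras += ",gpu"
if add_spark_dependencies:
    reco_extras += ",spark"

conda_dep = CondaDependencies()
conda_dep.add_pip_package(f"recommenders[{reco_extras}]")  # e.g. recommenders[dev,gpu]
```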