kernel_regression

A Rust rewrite of statsmodels' kernel regression functions, for speed.

The main motivation is interpolating missing values across a large number of time series. Vanilla interpolation approaches can yield estimates with too much variance or be vulnerable to data quality issues.

The single-threaded Nadaraya-Watson fit is about 150 times faster than the Python alternative. There is also a multi-threaded implementation of the local polynomial estimator.
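
For readers unfamiliar with the estimator, here is a minimal pure-Python sketch of what a local-constant (Nadaraya-Watson) fit computes: a kernel-weighted average of the training responses. The Gaussian kernel and the function names `gaussian_kernel` and `nadaraya_watson` are assumptions made for illustration; this is not the package's implementation.

```python
import math


def gaussian_kernel(z):
    """Standard Gaussian density, used here as the smoothing kernel (assumed)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def nadaraya_watson(x_train, y_train, x_new, bw):
    """Local-constant estimate: a kernel-weighted average of y_train."""
    predictions = []
    for x0 in x_new:
        # Weight each training point by its scaled distance to x0.
        weights = [gaussian_kernel((x0 - xi) / bw) for xi in x_train]
        total = sum(weights)
        predictions.append(sum(w * yi for w, yi in zip(weights, y_train)) / total)
    return predictions


x_train = [0.0, 1.0, 2.0, 3.0]
y_train = [1.0, 3.0, 2.0, 5.0]
print(nadaraya_watson(x_train, y_train, [1.5], bw=0.5))
```

The sketch does not guard against all weights underflowing to zero, which can happen when a query point lies far from every training point relative to the bandwidth.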

At the moment, multivariate kernels over continuous and unordered variables are handled.

Thanks to the performance improvement, cross-validation and fitting of multivariate local polynomials are within reach.
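
To illustrate how continuous and unordered variables can be combined, the sketch below builds a product kernel weight from a Gaussian kernel on the continuous variable and an Aitchison-Aitken kernel on the unordered one. This is a common pairing (and, to our understanding, the default in statsmodels); whether this package uses exactly these kernels is an assumption, and the helper names are hypothetical.

```python
import math


def gaussian_kernel(z):
    """Standard Gaussian density for the continuous variable (assumed kernel)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def aitchison_aitken(u, u0, lam, n_levels):
    """Weight 1 - lam on a matching category, lam / (n_levels - 1) otherwise."""
    return 1.0 - lam if u == u0 else lam / (n_levels - 1)


def product_weight(row, target, bw, n_levels):
    """Product kernel weight for one (continuous, unordered) observation."""
    (x, u), (x0, u0) = row, target
    return gaussian_kernel((x - x0) / bw[0]) * aitchison_aitken(u, u0, bw[1], n_levels)


# Weights of a few training rows relative to the query point (1.0, 2).
rows = [(1.0, 1), (3.2, 3), (2.5, 2), (1.2, 2), (4.3, 1)]
weights = [product_weight(r, (1.0, 2), bw=[1.0, 0.2], n_levels=3) for r in rows]
print(weights)
```

With a small unordered bandwidth, rows sharing the query's category dominate the weighted average; as the bandwidth approaches `(n_levels - 1) / n_levels`, the category is effectively ignored.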

How to compile the Python wheel (Linux)

Make sure Rust, along with Cargo, is accessible in your terminal.
The models are currently developed for Python 3.11.
Install maturin in your Python environment with

pip install maturin

At the root of the py-kernel-regression module, run

maturin develop --release

This will compile the wheel and install py-kernel-regression in your Python environment. If you only want to build the wheel, use

maturin build --release

instead.
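
After `maturin develop --release`, a quick way to confirm that the module landed in the active environment is to look it up without importing it. The helper name `is_installed` is hypothetical; the module name `py_kernel_regression` is the one used in the examples below.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("py_kernel_regression"):
    print("py_kernel_regression is available")
else:
    print("py_kernel_regression not found; was maturin run in this environment?")
```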

Example usage

Basic example

import numpy as np
from py_kernel_regression import KernelReg as KR

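# One bandwidth per column of X_train.
bw = [1.0, 0.2]
x_train = np.array([1.0, 3.2, 2.5, 1.2, 4.3])[:, None]  # continuous variable
u_train = np.array([1, 3, 2, 2, 1])[:, None]  # unordered variable
X_train = np.concatenate([x_train, u_train], axis=1)
Y_train = np.array([9.0, 9.0, 10.0, 3.0, 4.0])

# Points at which to predict, with the same column layout as X_train.
x_new = np.array([[1.0, 2.0], [2.2, 3.0], [2.6, 2.0]])

# "c" marks a continuous column, "u" an unordered one.
lc_output = KR(bw, ["c", "u"], "loc_constant").fit_predict(Y_train, X_train, x_new)

print(lc_output)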

Leave-one-out cross-validation

import numpy as np
from py_kernel_regression import KernelReg as KR

n = 1000
x = np.linspace(0, 100, n)
y = 2.0 * np.sin(x * np.pi / 50)

exog = x[:, None]
endog = y.ravel()

y_hat = KR([10.0], ["c"], "loc_linear").fit_predict(endog, exog, exog)

grid_bw = [0.5, 1.0, 5.0, 10.0, 25.0, 50.0, 100.0]
losses = []
for bw in grid_bw:
    loss = KR([bw], ["c"], "loc_linear").leave_one_out(endog, exog, "rmse")
    losses.append(loss)
 
 
# Refit with the bandwidth that minimises the leave-one-out loss.
best_bw = grid_bw[int(np.argmin(losses))]
y_hat = KR([best_bw], ["c"], "loc_linear").fit_predict(endog, exog, exog)
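
For reference, the following pure-Python sketch shows the idea behind leave-one-out cross-validation: each point is predicted from all the others, and the bandwidth with the lowest resulting RMSE is kept. It uses a local-constant estimator with a Gaussian kernel for brevity, whereas the example above uses `"loc_linear"`; the function names are hypothetical, and that the package's `"rmse"` loss is computed exactly this way is an assumption.

```python
import math


def nw_predict(x_train, y_train, x0, bw):
    """Local-constant (Nadaraya-Watson) prediction at x0, Gaussian kernel."""
    weights = [math.exp(-0.5 * ((x0 - xi) / bw) ** 2) for xi in x_train]
    return sum(w * yi for w, yi in zip(weights, y_train)) / sum(weights)


def leave_one_out_rmse(x, y, bw):
    """RMSE of predictions where each point is predicted without itself."""
    squared_errors = []
    for i in range(len(x)):
        # Drop observation i from the training set before predicting it.
        x_rest = x[:i] + x[i + 1:]
        y_rest = y[:i] + y[i + 1:]
        squared_errors.append((y[i] - nw_predict(x_rest, y_rest, x[i], bw)) ** 2)
    return math.sqrt(sum(squared_errors) / len(x))


x = [i * 0.5 for i in range(60)]
y = [2.0 * math.sin(x_i * math.pi / 15.0) for x_i in x]

grid_bw = [0.25, 0.5, 1.0, 2.0, 5.0]
losses = [leave_one_out_rmse(x, y, bw) for bw in grid_bw]
best_bw = grid_bw[losses.index(min(losses))]
print(best_bw)
```

The naive loop costs on the order of n squared kernel evaluations per bandwidth, which is why a fast compiled implementation makes grid searches like this practical on larger series.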

Notes

Error handling still needs to be improved (or, in places, actually implemented). Right now it is very unclear why or where a failure occurs (e.g. when too many variables are declared).
