Intel(R) Extension for Scikit-learn*
Intel(R) Extension for Scikit-learn speeds up scikit-learn through drop-in patching. The acceleration comes from the Intel(R) oneAPI Data Analytics Library (oneDAL), so data scientists and machine learning users can keep working with the familiar framework while it runs faster.
⚠️ Intel(R) Extension for Scikit-learn contains the scikit-learn patching functionality originally available in the daal4py package. All future updates to the patching will be available in Intel(R) Extension for Scikit-learn only. Please use this package instead of daal4py.
The full latest scikit-learn test suite is run with Intel(R) Extension for Scikit-learn.
👀 Follow us on Medium
We publish blogs on Medium, so follow us to learn tips and tricks for more efficient data analysis with the help of Intel(R) Extension for Scikit-learn. Here are our latest blogs:
- From Hours to Minutes: 600x Faster SVM
- Improve the Performance of XGBoost and LightGBM Inference
- Accelerate Kaggle Challenges Using Intel AI Analytics Toolkit
- Accelerate Your scikit-learn Applications
- Accelerate Linear Models for Machine Learning
- Accelerate K-Means Clustering
🔗 Important links
- Documentation
- scikit-learn API and patching
- Building from Sources
- About Intel(R) oneAPI Data Analytics Library
💬 Support
Report issues, ask questions, and provide suggestions by submitting an issue on GitHub.
You may also reach out to the project maintainers privately at onedal.maintainers@intel.com
🛠 Installation
Intel(R) Extension for Scikit-learn is available on the Python Package Index and in the Intel channel on Anaconda Cloud.
    # PyPI (recommended by default)
    pip install scikit-learn-intelex
    # Anaconda Cloud from Intel channel (recommended for Intel® Distribution for Python users)
    conda install scikit-learn-intelex -c intel
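After installing, it can be useful to confirm that the `sklearnex` module (the import name used in the examples in this README) is visible to the current interpreter. The following is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of the package.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("sklearnex"):
    print("sklearnex is available")
else:
    print("sklearnex not found; try: pip install scikit-learn-intelex")
```

Because `find_spec` only locates the module, this check does not trigger patching or load oneDAL.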
ℹ️ Supported configurations
📦 PyPI channel
OS / Python version | Python 3.6 | Python 3.7 | Python 3.8 | Python 3.9 |
---|---|---|---|---|
Linux | [CPU, GPU] | [CPU, GPU] | [CPU, GPU] | ❌ |
Windows | [CPU, GPU] | [CPU, GPU] | [CPU, GPU] | ❌ |
macOS | [CPU] | [CPU] | [CPU] | ❌ |
📦 Anaconda Cloud: Intel channel
OS / Python version | Python 3.6 | Python 3.7 | Python 3.8 | Python 3.9 |
---|---|---|---|---|
Linux | ❌ | [CPU, GPU] | ❌ | ❌ |
Windows | ❌ | [CPU, GPU] | ❌ | ❌ |
macOS | ❌ | [CPU] | ❌ | ❌ |
You can build the package from sources as well.
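The support tables above can be encoded as a small lookup to check a target environment before installing. This is only a sketch: `SUPPORTED` and `supported_devices` are hypothetical names, the channel keys `"pypi"` and `"conda-intel"` are labels chosen for this example, and the data simply mirrors the two tables.

```python
# Hypothetical lookup; the entries mirror the support tables in this README.
_ALL = ["CPU", "GPU"]
_CPU = ["CPU"]

SUPPORTED = {
    "pypi": {
        "linux": {"3.6": _ALL, "3.7": _ALL, "3.8": _ALL},
        "windows": {"3.6": _ALL, "3.7": _ALL, "3.8": _ALL},
        "macos": {"3.6": _CPU, "3.7": _CPU, "3.8": _CPU},
    },
    "conda-intel": {
        "linux": {"3.7": _ALL},
        "windows": {"3.7": _ALL},
        "macos": {"3.7": _CPU},
    },
}


def supported_devices(channel, os_name, python_version):
    """Return the devices listed for a combination, or [] if it is unsupported."""
    return list(SUPPORTED.get(channel, {}).get(os_name, {}).get(python_version, []))


print(supported_devices("pypi", "linux", "3.8"))
print(supported_devices("conda-intel", "windows", "3.8"))
```

An empty list corresponds to a ❌ cell; in that case building from sources is the remaining option.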
⚡️ Get Started
Intel CPU optimizations patching
    import numpy as np
    from sklearnex import patch_sklearn
    # Apply the patch before importing anything from sklearn
    patch_sklearn()
    from sklearn.cluster import DBSCAN
    X = np.array([[1., 2.], [2., 2.], [2., 3.],
                  [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
    clustering = DBSCAN(eps=3, min_samples=2).fit(X)
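In the example above, `patch_sklearn()` is called before `from sklearn.cluster import DBSCAN`. The toy sketch below illustrates why that order matters for drop-in patching in general: a name bound before a module attribute is replaced keeps pointing at the old object. Everything here (`toy_lib`, `slow_fit`, `fast_fit`, `patch_toy_lib`) is a hypothetical stand-in, and this is not a description of how the package is implemented internally.

```python
import types

# A stand-in "library" module with an original implementation (hypothetical).
toy_lib = types.ModuleType("toy_lib")


def slow_fit(data):
    return ("original", sorted(data))


def fast_fit(data):
    return ("accelerated", sorted(data))


toy_lib.fit = slow_fit


def patch_toy_lib():
    """Swap the module attribute, the basic idea behind a drop-in patch."""
    toy_lib.fit = fast_fit


# A name bound before patching keeps pointing at the original function...
early_fit = toy_lib.fit
patch_toy_lib()
# ...while a lookup made after patching sees the replacement.
late_fit = toy_lib.fit

print(early_fit([3, 1, 2])[0])  # original
print(late_fit([3, 1, 2])[0])   # accelerated
```

Both functions return the same result; only the implementation behind the name differs, which is what makes the patch "drop-in".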
Intel GPU optimizations patching
    import numpy as np
    from sklearnex import patch_sklearn
    from daal4py.oneapi import sycl_context
    # Apply the patch before importing anything from sklearn
    patch_sklearn()
    from sklearn.cluster import DBSCAN
    X = np.array([[1., 2.], [2., 2.], [2., 3.],
                  [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
    # Computations inside this context are offloaded to the GPU
    with sycl_context("gpu"):
        clustering = DBSCAN(eps=3, min_samples=2).fit(X)
🚀 Scikit-learn patching
Speedups of Intel(R) Extension for Scikit-learn over the original Scikit-learn |
---|
Technical details: float type: float64; HW: Intel(R) Xeon(R) Platinum 8280 CPU @ 2.70GHz, 2 sockets, 28 cores per socket; SW: scikit-learn 0.23.1, Intel® oneDAL (2021.1 Beta 10) |
Intel(R) Extension for Scikit-learn patching affects the performance of the specific Scikit-learn functionality listed below. When unsupported parameters are used, the package falls back to the original Scikit-learn. These limitations are described below. If the patching does not cover your scenarios, submit an issue on GitHub.
🔥 Applying the patching will impact the following existing scikit-learn algorithms:
Task | Functionality | Parameters support | Data support |
---|---|---|---|
Classification | SVC | All parameters except kernel = 'poly' and 'sigmoid'. | No limitations. |
Classification | RandomForestClassifier | All parameters except warm_start = True, ccp_alpha != 0, and criterion != 'gini'. | Multi-output and sparse data are not supported. |
Classification | KNeighborsClassifier | All parameters except a metric other than 'euclidean' or 'minkowski' with p = 2. | Multi-output and sparse data are not supported. |
Classification | LogisticRegression / LogisticRegressionCV | All parameters except a solver other than 'lbfgs' or 'newton-cg', class_weight != None, and sample_weight != None. | Only dense data is supported. |
Regression | RandomForestRegressor | All parameters except warm_start = True, ccp_alpha != 0, and criterion != 'mse'. | Multi-output and sparse data are not supported. |
Regression | KNeighborsRegressor | All parameters except a metric other than 'euclidean' or 'minkowski' with p = 2. | Sparse data is not supported. |
Regression | LinearRegression | All parameters except normalize != False and sample_weight != None. | Only dense data is supported; #observations should be >= #features. |
Regression | Ridge | All parameters except normalize != False, solver != 'auto', and sample_weight != None. | Only dense data is supported; #observations should be >= #features. |
Regression | ElasticNet | All parameters except sample_weight != None. | Multi-output and sparse data are not supported; #observations should be >= #features. |
Regression | Lasso | All parameters except sample_weight != None. | Multi-output and sparse data are not supported; #observations should be >= #features. |
Clustering | KMeans | All parameters except precompute_distances and sample_weight != None. | No limitations. |
Clustering | DBSCAN | All parameters except a metric other than 'euclidean' or 'minkowski' with p = 2. | Only dense data is supported. |
Dimensionality reduction | PCA | All parameters except svd_solver != 'full'. | No limitations. |
Dimensionality reduction | TSNE | All parameters except a metric other than 'euclidean' or 'minkowski' with p = 2. | Sparse data is not supported. |
Unsupervised | NearestNeighbors | All parameters except a metric other than 'euclidean' or 'minkowski' with p = 2. | Sparse data is not supported. |
Other | train_test_split | All parameters are supported. | Only dense data is supported. |
Other | assert_all_finite | All parameters are supported. | Only dense data is supported. |
Other | pairwise_distances | Only metric = 'cosine' and 'correlation' are supported. | Only dense data is supported. |
Other | roc_auc_score | Parameters average, sample_weight, max_fpr and multi_class are not supported. | No limitations. |
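To make the table easier to reason about, two of its rows can be written as predicates that guess which implementation a call would use. This is a simplified reading of the DBSCAN and LinearRegression rows only; the function names and the returned labels are hypothetical, and the real dispatch logic inside the package may check additional conditions.

```python
def metric_supported(metric, p=2):
    """True for 'euclidean', or 'minkowski' with p = 2 (the metrics the table allows)."""
    return metric == "euclidean" or (metric == "minkowski" and p == 2)


def dbscan_implementation(metric, p=2, is_sparse=False):
    """Guess the implementation for a DBSCAN call, following the DBSCAN row."""
    if metric_supported(metric, p) and not is_sparse:
        return "oneDAL"
    return "scikit-learn"  # fallback to the original implementation


def linear_regression_implementation(n_samples, n_features, normalize=False,
                                     has_sample_weight=False, is_sparse=False):
    """Guess the implementation for LinearRegression, following its row."""
    supported = (
        not normalize
        and not has_sample_weight
        and not is_sparse
        and n_samples >= n_features
    )
    return "oneDAL" if supported else "scikit-learn"


print(dbscan_implementation("euclidean"))
print(dbscan_implementation("manhattan"))
print(linear_regression_implementation(n_samples=10, n_features=50))
```

In practice, the verbose mode described in this README is the authoritative way to see which implementation actually ran.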
⚠️ We support optimizations for the last four versions of scikit-learn. The latest release, Intel(R) Extension for Scikit-learn 2021.2, supports scikit-learn 0.21.X, 0.22.X, 0.23.X and 0.24.X.
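A quick way to check a scikit-learn version string against the series listed above is sketched below. The names `SUPPORTED_SERIES` and `is_supported_sklearn` are hypothetical, and the set only reflects the 2021.2 statement in this README.

```python
import re

# Series listed for Intel(R) Extension for Scikit-learn 2021.2 in this README.
SUPPORTED_SERIES = {(0, 21), (0, 22), (0, 23), (0, 24)}


def is_supported_sklearn(version):
    """Return True if the major.minor part of `version` is a listed series."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        return False
    return (int(match.group(1)), int(match.group(2))) in SUPPORTED_SERIES


print(is_supported_sklearn("0.23.1"))
print(is_supported_sklearn("0.20.4"))
```

With scikit-learn installed, the string to pass would be `sklearn.__version__`.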
📜 Intel(R) Extension for Scikit-learn verbose
To find out which implementation of the algorithm is currently used (Intel(R) Extension for Scikit-learn or original Scikit-learn), set the environment variable:
- On Linux and macOS:
    export SKLEARNEX_VERBOSE=INFO
- On Windows:
    set SKLEARNEX_VERBOSE=INFO
For example, for DBSCAN you get one of these messages, depending on which implementation is used:
    INFO: sklearn.cluster.DBSCAN.fit: uses Intel(R) oneAPI Data Analytics Library solver
    INFO: sklearn.cluster.DBSCAN.fit: uses original Scikit-learn solver
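The variable can also be set from Python for the current process, and the messages can be classified programmatically. The sketch below assumes the two message formats shown above; `LOG_PATTERN` and `implementation_used` are hypothetical names, and whether the variable must be set before `sklearnex` is imported is an assumption made here to be on the safe side.

```python
import os
import re

# Equivalent of `export`/`set` for this process and its children.
# Assumption: set it before importing sklearnex so the setting is picked up.
os.environ["SKLEARNEX_VERBOSE"] = "INFO"

# Matches lines of the form "INFO: <call>: uses <name> solver".
LOG_PATTERN = re.compile(r"^INFO: (?P<call>[\w.]+): uses (?P<solver>.+) solver$")


def implementation_used(line):
    """Return (call, 'oneDAL' or 'scikit-learn') for a verbose line, else None."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    accelerated = "oneAPI Data Analytics Library" in match.group("solver")
    return match.group("call"), "oneDAL" if accelerated else "scikit-learn"


print(implementation_used(
    "INFO: sklearn.cluster.DBSCAN.fit: uses original Scikit-learn solver"
))
```

A helper like this can be used to scan captured logs and spot calls that fell back to the original implementation.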
Built Distributions
Hashes for scikit_learn_intelex-2021.2.2-py38-none-win_amd64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 6676c457f97831effcdae98c110ada8716e735991ce7eabccc535cf1ef425730 |
MD5 | 6b48888a12bef2fb8ceff48170772567 |
BLAKE2b-256 | 8767568b50cdbadcc6e7b4128e86337714efb5cee9b4c28585a8675efb7d8fa5 |
Hashes for scikit_learn_intelex-2021.2.2-py38-none-manylinux1_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | bbd2cdd9f917f9ede2d6d90772cd12cd3e3cbfb615078556659c4fff9622075d |
MD5 | f5c136fded72285b0f4088c7b8cb09e6 |
BLAKE2b-256 | 59600a044089cb03643b1bdf0e972da65c961d066ae1ad1a7a1b0d4fa0f03269 |
Hashes for scikit_learn_intelex-2021.2.2-py38-none-macosx_10_15_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 3276ff970ad12c92d0277c7b4a038a058652c546c0b9645ea47061f69dff34de |
MD5 | 0fe2cd5ee04855fd2060b725e7bb9546 |
BLAKE2b-256 | 4497151c733d1fca2637603f87263471f2f2bda95787f5d6e8df31ef0272aeaa |
Hashes for scikit_learn_intelex-2021.2.2-py37-none-win_amd64.whl
Algorithm | Hash digest |
---|---|
SHA256 | bdfc5f7a4d1ec6a32fb72d8974f565ef356d3b9b92c7abb7574a00cec926d4e1 |
MD5 | 2ffdeb350f1f96fab2c213fdae9bf06a |
BLAKE2b-256 | 95292fd709d0f26cac0e4e566945e6c3ada5aeec91dc62974b743adfe4908dfc |
Hashes for scikit_learn_intelex-2021.2.2-py37-none-manylinux1_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 6aa075f1c2edb39b42f33ed79147d3559a9a05c8e346b0267a7db70ebdf52a3e |
MD5 | e1d47ee6e74c3f7d29319d0dfe89f5c2 |
BLAKE2b-256 | 25db2aa43fea7a89d6190901372cfe5867444487a7f700b3a1250c617f12122d |
Hashes for scikit_learn_intelex-2021.2.2-py37-none-macosx_10_15_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | be12f6e9a2e42fc0a2daa635db8796260060ca9ddc2cf957bac70e7ba0122b1d |
MD5 | 9c2a99e1d16fc044e058de41d2849512 |
BLAKE2b-256 | ba72225d7f4ce8a0066a7e5175dbe7f01496051810406ee91a7a16b2cf094c7d |
Hashes for scikit_learn_intelex-2021.2.2-py36-none-win_amd64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 96cea0b33c5a33c04b7f949832e80a9601190cdaa7b79b6d768492a422a94160 |
MD5 | fdd2db918c518de71af3373b1354d9b3 |
BLAKE2b-256 | 3e0d49fb764aead36bb3c1b47fd8ec34755dfab8e422c8b71281d67537ee35eb |
Hashes for scikit_learn_intelex-2021.2.2-py36-none-manylinux1_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 35a830ab236d516fa011b41e7d1a8960ff4940f204245357d6232469cdb614c2 |
MD5 | db2a06569fe4eff262f65e52bc0f5d75 |
BLAKE2b-256 | 20130c51e8380e48840f3e2850572421ffd8c61ae83c9b424c35cc63ed5594bb |
Hashes for scikit_learn_intelex-2021.2.2-py36-none-macosx_10_15_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | e1b1febe0b8096af94e9bf2c1fff6c334e3624425220e35d3a1d58dca741211c |
MD5 | 6ab57ddbd71055b9b75442feb34e1288 |
BLAKE2b-256 | 1ca0a8eb2dea69c4c9d2a8881bc3a0813f665994b747d5fa86f61f4a8aa76307 |
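The published digests can be used to verify a downloaded wheel before installing it. The following is a minimal sketch using the standard `hashlib` module; `sha256_of` and `matches_published_hash` are hypothetical helper names, and the file path is assumed to point at a wheel you have already downloaded.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published value."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, for `scikit_learn_intelex-2021.2.2-py38-none-win_amd64.whl` the expected value would be the SHA256 digest `6676c457f97831effcdae98c110ada8716e735991ce7eabccc535cf1ef425730` listed above.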