This package speeds up sparse matrix multiplication followed by selection of the top-n multiplication results.

Project description

sparse_dot_topn:

sparse_dot_topn provides a fast way to perform a sparse matrix multiplication followed by selection of the top-n multiplication results.

Comparing very large feature vectors and picking the best matches often comes down, in practice, to a sparse matrix multiplication followed by selecting the top-n multiplication results. This package implements a customized Cython function for this purpose. Compared with doing the same thing with SciPy and NumPy functions, our Cython approach improves the speed by about 40% and reduces memory consumption.
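To make the "multiply, then select the top-n" pattern concrete, here is a minimal sketch of what doing it with plain SciPy and NumPy might look like. This is not the package's Cython code and not the contents of example/comparison.py; the function name topn_per_row_baseline and its parameters are hypothetical, chosen only for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix


def topn_per_row_baseline(a, b, ntop, lower_bound=0.0):
    """Hypothetical baseline: compute a @ b in full, then keep at most
    ntop values per row that are greater than lower_bound."""
    product = (a @ b).tocsr()  # the full product is materialized first
    indptr, indices, data = [0], [], []
    for row in range(product.shape[0]):
        start, end = product.indptr[row], product.indptr[row + 1]
        row_cols = product.indices[start:end]
        row_vals = product.data[start:end]
        # Drop values that do not exceed the threshold.
        keep = row_vals > lower_bound
        row_cols, row_vals = row_cols[keep], row_vals[keep]
        # Order the survivors by value, largest first, and keep ntop of them.
        order = np.argsort(-row_vals)[:ntop]
        indices.extend(row_cols[order])
        data.extend(row_vals[order])
        indptr.append(len(indices))
    return csr_matrix(
        (np.array(data, dtype=float),
         np.array(indices, dtype=np.int64),
         np.array(indptr, dtype=np.int64)),
        shape=product.shape,
    )


# Tiny hand-made inputs so the result is easy to check by eye.
a_small = csr_matrix(np.array([[1.0, 0.0, 2.0],
                               [0.0, 3.0, 0.0]]))
b_small = csr_matrix(np.array([[0.5, 0.0, 1.0, 0.0],
                               [0.0, 0.2, 0.0, 0.0],
                               [0.1, 0.0, 0.0, 4.0]]))
top2 = topn_per_row_baseline(a_small, b_small, ntop=2, lower_bound=0.01)
print(top2.toarray())
```

The sketch builds the complete product before discarding most of it, which is the kind of extra work and memory a fused multiply-and-select routine can avoid.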

This package is made by the ING Wholesale Banking Advanced Analytics team. A blog post by the team explains the implementation.

Example

import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse import rand
from sparse_dot_topn import awesome_cossim_topn

N = 10
a = rand(100, 1000000, density=0.005, format='csr')
b = rand(1000000, 200, density=0.005, format='csr')

# Use standard implementation

c = awesome_cossim_topn(a, b, N, 0.01)

# Use parallel implementation with 4 threads

d = awesome_cossim_topn(a, b, N, 0.01, use_threads=True, n_jobs=4)
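The example above does not show how to read the matches back out. The sketch below assumes the result is a SciPy CSR matrix with one row per row of a and one column per column of b; that return type is an assumption here, not something stated in this description. To stay independent of the package, it uses a small hand-built matrix as a stand-in for c, and the helper name matches_per_row is hypothetical.

```python
import numpy as np
from scipy.sparse import csr_matrix


def matches_per_row(result):
    """Yield (row, column, score) triples from a CSR matrix,
    with the highest score first within each row."""
    result = csr_matrix(result)
    for row in range(result.shape[0]):
        start, end = result.indptr[row], result.indptr[row + 1]
        cols = result.indices[start:end]
        scores = result.data[start:end]
        for k in np.argsort(-scores):
            yield row, int(cols[k]), float(scores[k])


# Hand-built stand-in for a top-n result (assumed layout: rows of a, columns of b).
stand_in = csr_matrix(np.array([[0.0, 0.0, 1.0, 8.0],
                                [0.0, 0.6, 0.0, 0.0]]))
triples = list(matches_per_row(stand_in))
for row, col, score in triples:
    print(row, col, score)
```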

Code that compares this method with calling SciPy and NumPy functions directly is in example/comparison.py.

Dependency and Install

Install numpy and cython before installing this package. Then,

pip install sparse_dot_topn

Uninstall

pip uninstall sparse_dot_topn

Download files

Source Distribution

sparse_dot_topn-0.2.9.tar.gz (106.5 kB view details)

Uploaded Source

Built Distributions

sparse_dot_topn-0.2.9-cp37-cp37m-macosx_10_14_x86_64.whl (61.1 kB view details)

Uploaded: CPython 3.7m, macOS 10.14+ x86-64

sparse_dot_topn-0.2.9-cp27-cp27m-macosx_10_14_intel.whl (69.1 kB view details)

Uploaded: CPython 2.7m, macOS 10.14+ Intel (x86-64, i386)

File details

Details for the file sparse_dot_topn-0.2.9.tar.gz.

File metadata

  • Download URL: sparse_dot_topn-0.2.9.tar.gz
  • Size: 106.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/42.0.2 requests-toolbelt/0.9.1 tqdm/4.43.0 CPython/3.7.5

File hashes

Hashes for sparse_dot_topn-0.2.9.tar.gz
Algorithm    Hash digest
SHA256       721f2bc4615c6050084f96518322f1dcb2b82a1016a4d9a1fe508d351d992136
MD5          3e34239b85785ebdd64d6fabf13d2d67
BLAKE2b-256  70d52a3a52acd89344f0c45cae320bd41ee49573caec656834b98c5ea48669b7

File details

Details for the file sparse_dot_topn-0.2.9-cp37-cp37m-macosx_10_14_x86_64.whl.

File metadata

  • Download URL: sparse_dot_topn-0.2.9-cp37-cp37m-macosx_10_14_x86_64.whl
  • Size: 61.1 kB
  • Tags: CPython 3.7m, macOS 10.14+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/42.0.2 requests-toolbelt/0.9.1 tqdm/4.43.0 CPython/3.7.5

File hashes

Hashes for sparse_dot_topn-0.2.9-cp37-cp37m-macosx_10_14_x86_64.whl
Algorithm    Hash digest
SHA256       f4c706dc8bf23b7c69aa4dfeb958884b36d65c2a9a5ca80eda5cfb8ae32e17a9
MD5          900479b8674ecb43f9a1fdb489b0dfb5
BLAKE2b-256  ade9c2dacfb88b81a6309459d99423b27063b8cbd7e6f881a8af26cef1c91dc1

File details

Details for the file sparse_dot_topn-0.2.9-cp27-cp27m-macosx_10_14_intel.whl.

File metadata

  • Download URL: sparse_dot_topn-0.2.9-cp27-cp27m-macosx_10_14_intel.whl
  • Size: 69.1 kB
  • Tags: CPython 2.7m, macOS 10.14+ Intel (x86-64, i386)
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/42.0.2 requests-toolbelt/0.9.1 tqdm/4.43.0 CPython/3.7.5

File hashes

Hashes for sparse_dot_topn-0.2.9-cp27-cp27m-macosx_10_14_intel.whl
Algorithm    Hash digest
SHA256       19414d4172f4ea78f44f9b95847fbc5f68c50cae6ced99e623173628941fe100
MD5          d899b0402e84e854666f91d01a819bd9
BLAKE2b-256  41e2e628290b7f6aa3477c8ba051c3ba4c63ee90591936d4f2722ba3857c1608
