
This package speeds up a sparse matrix multiplication followed by selection of the top-n multiplication results.


sparse_dot_topn:

sparse_dot_topn provides a fast way to perform a sparse matrix multiplication followed by selection of the top-n multiplication results.

Comparing very large feature vectors and picking the best matches often comes down, in practice, to a sparse matrix multiplication followed by selecting the top-n multiplication results. This package implements a customized Cython function for this purpose. Compared with doing the same with SciPy and NumPy functions, our Cython approach improves speed by about 40% and reduces memory consumption.
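For orientation, the operation described above can be sketched with plain SciPy and NumPy. This is a minimal baseline under stated assumptions, not the package's Cython implementation: the helper name naive_dot_topn and its parameters are hypothetical, and it assumes ntop is at least 1. It materializes the full product before trimming each row, which is the intermediate cost a fused multiply-and-select routine can avoid.

```python
import numpy as np
from scipy.sparse import csr_matrix


def naive_dot_topn(a, b, ntop, lower_bound=0.0):
    """Multiply a by b, then keep at most ntop values per row above lower_bound.

    Hypothetical baseline helper (not part of sparse_dot_topn); assumes ntop >= 1.
    """
    product = csr_matrix(a.dot(b))  # the full product is built first
    rows, cols, vals = [], [], []
    for i in range(product.shape[0]):
        start, end = product.indptr[i], product.indptr[i + 1]
        row_vals = product.data[start:end]
        row_cols = product.indices[start:end]
        # Drop entries that do not exceed the lower bound
        keep = row_vals > lower_bound
        row_vals, row_cols = row_vals[keep], row_cols[keep]
        # If more than ntop entries remain, keep only the ntop largest
        if row_vals.size > ntop:
            top = np.argpartition(row_vals, -ntop)[-ntop:]
            row_vals, row_cols = row_vals[top], row_cols[top]
        rows.extend([i] * row_vals.size)
        cols.extend(row_cols.tolist())
        vals.extend(row_vals.tolist())
    return csr_matrix(
        (np.asarray(vals, dtype=float),
         (np.asarray(rows, dtype=int), np.asarray(cols, dtype=int))),
        shape=product.shape,
    )


# Tiny illustrative input (made up for this sketch)
a_small = csr_matrix(np.array([[1, 0, 2], [0, 3, 0]]))
b_small = csr_matrix(np.array([[1, 2, 0, 0], [0, 1, 1, 0], [3, 0, 1, 4]]))
top2 = naive_dot_topn(a_small, b_small, ntop=2)
```

On the small input, each row of top2 holds at most two non-zero entries, the two largest of that row of the product.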

This package was made by the ING Wholesale Banking Advanced Analytics team. This blog explains how we implemented it.

Example

    from scipy.sparse import rand
    from sparse_dot_topn import awesome_cossim_topn

    # Number of top results to keep per row of the product
    N = 5
    # Two random sparse matrices in CSR format (0.5% non-zero entries)
    a = rand(100, 1000000, density=0.005, format='csr')
    b = rand(1000000, 200, density=0.005, format='csr')

    # Multiply a by b, keeping per row the N largest results above 0.01
    c = awesome_cossim_topn(a, b, N, 0.01)

Code that compares our method with calling SciPy and NumPy functions directly is available in example/comparison.py.
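To read the matches out of a result, one option is to walk the CSR arrays row by row. The sketch below assumes the result is a scipy.sparse CSR matrix (as c in the example is understood to be); the helper name top_matches is hypothetical, and a small hand-made matrix stands in for a real result so that the snippet runs without the package installed.

```python
import numpy as np
from scipy.sparse import csr_matrix


def top_matches(c, row):
    """Return the (column, score) pairs stored for one row, best score first.

    Hypothetical helper for illustration; not part of sparse_dot_topn.
    """
    c = csr_matrix(c)
    start, end = c.indptr[row], c.indptr[row + 1]
    cols = c.indices[start:end]
    scores = c.data[start:end]
    # Sort the stored entries of this row by descending score
    order = np.argsort(-scores, kind="stable")
    return [(int(cols[k]), float(scores[k])) for k in order]


# Stand-in for a top-n result matrix (values made up for this sketch)
demo = csr_matrix(np.array([[0.0, 0.9, 0.2], [0.5, 0.0, 0.0]]))
matches_row0 = top_matches(demo, 0)
matches_row1 = top_matches(demo, 1)
```

Sorting inside the helper means the caller does not have to rely on the order in which a row's entries happen to be stored.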

Dependencies and Installation

Install numpy and cython before installing this package, then run:

    pip install sparse_dot_topn

Uninstall

    pip uninstall sparse_dot_topn
