Python Framework for Topic Modeling
Project description
Gensim is a Python framework for unsupervised learning from raw, unstructured digital texts. It learns the hidden (latent) structure of a corpus. Once that structure is found, documents can be expressed succinctly in terms of it and queried for topical similarity.
Gensim includes the following features:
- Memory independence: there is no need for the whole text corpus (or any intermediate term-document matrices) to reside fully in RAM at any one time.
- Provides implementations for several popular topic inference algorithms, including Latent Semantic Analysis (LSA, LSI) and Latent Dirichlet Allocation (LDA), and makes adding new ones simple.
- Contains I/O wrappers and converters around several popular data formats.
- Allows similarity queries across documents in their latent, topical representation.
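The similarity queries named in the last feature can be illustrated with a minimal plain-Python sketch. It is not gensim's API: the function names `cosine` and `rank_by_similarity` and the topic weights are hypothetical, and it assumes each document has already been reduced to a sparse list of (topic id, weight) pairs.

```python
import math


def cosine(vec_a, vec_b):
    """Cosine similarity of two sparse vectors given as (id, weight) lists."""
    a, b = dict(vec_a), dict(vec_b)
    dot = sum(weight * b.get(idx, 0.0) for idx, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, documents):
    """Return (document index, similarity) pairs, most similar first."""
    scores = [(idx, cosine(query, doc)) for idx, doc in enumerate(documents)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical topic weights for three documents over two latent topics.
docs = [
    [(0, 0.9), (1, 0.1)],
    [(0, 0.2), (1, 0.8)],
    [(1, 1.0)],
]
query = [(0, 1.0)]  # a query that is entirely about topic 0
ranking = rank_by_similarity(query, docs)
```

Because the comparison happens in topic space rather than word space, two documents can rank as similar without sharing any vocabulary, provided they load on the same topics.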
The principal design objectives behind gensim are:
- Straightforward interfaces and low API learning curve for developers, facilitating modifications and rapid prototyping.
- Memory independence with respect to the size of the input corpus; all intermediate steps and algorithms operate in a streaming fashion, processing one document at a time.
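The streaming objective can be sketched in plain Python as an iterable that yields one sparse bag-of-words vector per document, so only the current document is held in memory. This is an illustration under stated assumptions, not gensim's own corpus class: `StreamingCorpus`, its `open_source` argument, and the one-document-per-line input format are all hypothetical.

```python
import io
from collections import Counter


class StreamingCorpus:
    """Yield one sparse bag-of-words vector per line of a text source."""

    def __init__(self, open_source):
        # open_source: callable returning a fresh iterator over lines,
        # so the corpus can be traversed more than once.
        self.open_source = open_source
        self.token2id = {}

    def __iter__(self):
        for line in self.open_source():
            counts = Counter(line.lower().split())
            vector = []
            for token, count in counts.items():
                # Assign the next free integer id to unseen tokens.
                token_id = self.token2id.setdefault(token, len(self.token2id))
                vector.append((token_id, count))
            yield sorted(vector)


# An in-memory stand-in for a large file; a real source could be
# `lambda: open(path, encoding="utf-8")` for some hypothetical path.
text = "human machine interface\nmachine learning machine\n"
corpus = StreamingCorpus(lambda: io.StringIO(text))

# Consume the stream one document at a time without storing the vectors.
document_count = sum(1 for _ in corpus)
```

Any algorithm written against such an iterable only ever sees one document at a time, which is what makes the memory footprint independent of corpus size. The vocabulary mapping still grows with the number of distinct tokens.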
File details
Details for the file gensim-0.2.tar.gz.
File metadata
- Download URL: gensim-0.2.tar.gz
- Upload date:
- Size: 119.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | df51d7a3e254e0d6e7d3998d46bd0da6f13352106594985b9bd277b000838d1b |
| MD5 | 4a34f3623134d21222faa8a4c5035d3e |
| BLAKE2b-256 | b4c74813ac45df446d4fa92575e8c863cdae68a79d11775caafcda3cdd665904 |
File details
Details for the file gensim-0.2-py2.5.egg.
File metadata
- Download URL: gensim-0.2-py2.5.egg
- Upload date:
- Size: 130.7 kB
- Tags: Egg
- Uploaded using Trusted Publishing? No
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 86aae1eab2409436b1a7790d402a7c6e816adda98082b22c69a61ff06d2147da |
| MD5 | 6cd22bc391fb8e7620b6d5aa0b316a5a |
| BLAKE2b-256 | 496980e1aa54ff72384ae13a4501d819d8eeb13e7fd6d36aad595998b22979ce |
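A downloaded file can be checked against the published SHA256 digest with Python's standard `hashlib` module. The sketch below is one way to do that; the helper names `sha256_of_file` and `matches_published_digest` and the local file path in the usage comment are hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read chunk by chunk so large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage, assuming the archive was saved locally under its original name:
# matches_published_digest(
#     "gensim-0.2.tar.gz",
#     "df51d7a3e254e0d6e7d3998d46bd0da6f13352106594985b9bd277b000838d1b",
# )
```

A `False` result means the local file differs from the one the digest was computed for, for example because of a truncated download.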