Industrial-strength Natural Language Processing (NLP) in Python

Project description

spaCy: Industrial-strength NLP

spaCy is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest research, and was designed from day one to be used in real products. spaCy comes with pretrained statistical models and word vectors, and currently supports tokenization for 60+ languages. It features state-of-the-art speed, convolutional neural network models for tagging, parsing and named entity recognition and easy deep learning integration. It's commercial open-source software, released under the MIT license.

💫 Version 2.3 out now! Check out the release notes here.

📖 Documentation

spaCy 101: New to spaCy? Here's everything you need to know!
Usage Guides: How to use spaCy and its features.
New in v2.3: New features, backwards incompatibilities and migration guide.
API Reference: The detailed reference for spaCy's API.
Models: Download statistical language models for spaCy.
Universe: Libraries, extensions, demos, books and courses.
Changelog: Changes and version history.
Contribute: How to contribute to the spaCy project and code base.

💬 Where to ask questions

The spaCy project is maintained by @honnibal and @ines, along with core contributors @svlandeg and @adrianeboyd. Please understand that we won't be able to provide individual support via email. We also believe that help is much more valuable if it's shared publicly, so that more people can benefit from it.

Type Platforms
🚨 Bug Reports: GitHub Issue Tracker
🎁 Feature Requests: GitHub Issue Tracker
👩‍💻 Usage Questions: Stack Overflow · Gitter Chat · Reddit User Group
🗯 General Discussion: Gitter Chat · Reddit User Group

Features

  • Non-destructive tokenization
  • Named entity recognition
  • Support for 60+ languages
  • Pretrained statistical models and word vectors
  • State-of-the-art speed
  • Easy deep learning integration
  • Part-of-speech tagging
  • Labelled dependency parsing
  • Syntax-driven sentence segmentation
  • Built-in visualizers for syntax and NER
  • Convenient string-to-hash mapping
  • Export to numpy data arrays
  • Efficient binary serialization
  • Easy model packaging and deployment
  • Robust, rigorously evaluated accuracy

📖 For more details, see the facts, figures and benchmarks.

Install spaCy

For detailed installation instructions, see the documentation.

  • Operating system: macOS / OS X · Linux · Windows (Cygwin, MinGW, Visual Studio)
  • Python version: Python 2.7, 3.5+ (only 64 bit)
  • Package managers: pip · conda (via conda-forge)
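
The requirements listed above can be checked from Python before installing. The following is a minimal sketch using only the standard library; the helper name `check_spacy_requirements` is hypothetical and not part of spaCy, and it only encodes the version and 64-bit constraints stated in the list.

```python
import struct
import sys


def check_spacy_requirements(version_info=None, pointer_bits=None):
    """Return a list of problems; an empty list means the interpreter qualifies.

    Hypothetical helper, not part of spaCy. It encodes the requirements
    listed above: Python 2.7 or 3.5+, 64-bit only.
    """
    if version_info is None:
        version_info = sys.version_info
    if pointer_bits is None:
        # Size of a C pointer in bits: 64 on a 64-bit interpreter.
        pointer_bits = struct.calcsize("P") * 8

    problems = []
    major_minor = tuple(version_info[:2])
    if not (major_minor == (2, 7) or major_minor >= (3, 5)):
        problems.append("unsupported Python version: %d.%d" % major_minor)
    if pointer_bits != 64:
        problems.append("need a 64-bit Python, found %d-bit" % pointer_bits)
    return problems


# Check the interpreter that is running this script.
print(check_spacy_requirements() or "interpreter meets the listed requirements")
```

The optional arguments exist only so the check can be exercised with other version tuples; in normal use the function is called without arguments.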

pip

Using pip, spaCy releases are available as source packages and binary wheels (as of v2.0.13).

pip install spacy

To install additional data tables for lemmatization and normalization in spaCy v2.2+ you can run pip install spacy[lookups] or install spacy-lookups-data separately. The lookups package is needed to create blank models with lemmatization data for v2.2+ plus normalization data for v2.3+, and to lemmatize in languages that don't yet come with pretrained models and aren't powered by third-party libraries.

When using pip it is generally recommended to install packages in a virtual environment to avoid modifying system state:

python -m venv .env
source .env/bin/activate
pip install spacy
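
To confirm that the environment was actually activated before running pip, the interpreter can be asked directly. This is a small sketch, not something spaCy provides; `in_virtual_environment` is a hypothetical name.

```python
import sys


def in_virtual_environment():
    """Return True if the running interpreter belongs to a virtual environment."""
    # The venv module makes sys.prefix differ from sys.base_prefix, while
    # older releases of the virtualenv tool set sys.real_prefix instead.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or base_prefix != sys.prefix


print("virtual environment active:", in_virtual_environment())
```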

conda

Thanks to our great community, we've finally re-added conda support. You can now install spaCy via conda-forge:

conda install -c conda-forge spacy

For the feedstock including the build recipe and configuration, check out this repository. Improvements and pull requests to the recipe and setup are always appreciated.

Updating spaCy

Some updates to spaCy may require downloading new statistical models. If you're running spaCy v2.0 or higher, you can use the validate command to check whether your installed models are compatible and, if they are not, print details on how to update them:

pip install -U spacy
python -m spacy validate

If you've trained your own models, keep in mind that your training and runtime inputs must match. After updating spaCy, we recommend retraining your models with the new version.

📖 For details on upgrading from spaCy 1.x to spaCy 2.x, see the migration guide.

Download models

As of v1.7.0, models for spaCy can be installed as Python packages. This means that they're a component of your application, just like any other module. Models can be installed using spaCy's download command, or manually by pointing pip to a path or URL.

Documentation
Available Models: Detailed model descriptions, accuracy figures and benchmarks.
Models Documentation: Detailed usage instructions.

# download best-matching version of specific model for your spaCy installation
python -m spacy download en_core_web_sm

# pip install .tar.gz archive from path or URL
pip install /Users/you/en_core_web_sm-2.2.0.tar.gz
pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.0/en_core_web_sm-2.2.0.tar.gz
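
The archive names in the commands above follow a `<model name>-<version>.tar.gz` pattern, and the example URL repeats that pair in its path. The sketch below splits such a file name and rebuilds the URL. The URL template is inferred from the single example above and is an assumption, as are the helper names `split_archive_name` and `release_url`; neither is part of spaCy.

```python
import os
import re

# Template inferred from the example URL above (an assumption, not a
# documented guarantee about how releases are laid out).
RELEASE_URL = (
    "https://github.com/explosion/spacy-models/releases/download/"
    "{name}-{version}/{name}-{version}.tar.gz"
)

# Model name, a hyphen, a version starting with a digit, then ".tar.gz".
ARCHIVE_RE = re.compile(
    r"^(?P<name>[a-z][a-z0-9_]*)-(?P<version>[0-9][0-9a-z.]*)\.tar\.gz$"
)


def split_archive_name(path_or_filename):
    """Split 'en_core_web_sm-2.2.0.tar.gz' into ('en_core_web_sm', '2.2.0')."""
    filename = os.path.basename(path_or_filename)
    match = ARCHIVE_RE.match(filename)
    if match is None:
        raise ValueError("not a model archive name: %r" % filename)
    return match.group("name"), match.group("version")


def release_url(name, version):
    """Build a download URL from the inferred template."""
    return RELEASE_URL.format(name=name, version=version)


name, version = split_archive_name("/Users/you/en_core_web_sm-2.2.0.tar.gz")
print(name, version)
print(release_url(name, version))
```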

Loading and using models

To load a model, use spacy.load() with the model name, a shortcut link or a path to the model data directory.

import spacy
nlp = spacy.load("en_core_web_sm")
doc = nlp("This is a sentence.")

You can also import a model directly via its full name and then call its load() method with no arguments.

import spacy
import en_core_web_sm

nlp = en_core_web_sm.load()
doc = nlp("This is a sentence.")
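
When the model package is not installed, spacy.load() raises an OSError. A script that should still run in that case can fall back to a blank pipeline, which only tokenizes. The following is a minimal sketch of that pattern; `load_or_blank` is a hypothetical helper, and returning None when spaCy itself is missing is a choice made here for illustration.

```python
def load_or_blank(name, lang="en"):
    """Load an installed model, or fall back to a tokenizer-only pipeline.

    Hypothetical helper, not part of spaCy. Returns None if spaCy itself
    cannot be imported.
    """
    try:
        import spacy
    except ImportError:
        return None
    try:
        return spacy.load(name)
    except OSError:
        # The model package is not installed: a blank pipeline still
        # tokenizes, but has no tagger, parser or entity recognizer.
        return spacy.blank(lang)


nlp = load_or_blank("en_core_web_sm")
if nlp is not None:
    doc = nlp("This is a sentence.")
    print([token.text for token in doc])
```

With a blank pipeline, annotations such as part-of-speech tags and entities stay empty, so code that depends on them should check which pipeline it received.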

📖 For more info and examples, check out the models documentation.

Compile from source

The other way to install spaCy is to clone its GitHub repository and build it from source. This is the usual approach if you want to make changes to the code base. You'll need a development environment consisting of a Python distribution including header files, a compiler, pip, virtualenv and git. The compiler is the trickiest part, and how to set it up depends on your system. See the notes on Ubuntu, macOS / OS X and Windows below for details.

# make sure you are using the latest pip
python -m pip install -U pip
git clone https://github.com/explosion/spaCy
cd spaCy

python -m venv .env
source .env/bin/activate
export PYTHONPATH=`pwd`
pip install -r requirements.txt
python setup.py build_ext --inplace

Compared to a regular install via pip, requirements.txt additionally installs developer dependencies such as Cython. For more details and instructions, see the documentation on compiling spaCy from source and the quickstart widget to get the right commands for your platform and Python version.

Ubuntu

Install system-level dependencies via apt-get. When building for Python 3, the headers come from python3-dev rather than python-dev:

sudo apt-get install build-essential python-dev git

macOS / OS X

Install a recent version of Xcode, including the so-called "Command Line Tools". macOS and OS X ship with Python and git preinstalled.

Windows

Install a version of the Visual C++ Build Tools or Visual Studio Express that matches the version that was used to compile your Python interpreter. For official distributions these are VS 2008 (Python 2.7), VS 2010 (Python 3.4) and VS 2015 (Python 3.5).

Run tests

spaCy comes with an extensive test suite. To run the tests, you'll usually want to clone the repository and build spaCy from source. This also installs the required development dependencies and test utilities defined in requirements.txt.

Alternatively, you can find out where spaCy is installed and run pytest on that directory. Don't forget to also install the test utilities via spaCy's requirements.txt:

python -c "import os; import spacy; print(os.path.dirname(spacy.__file__))"
pip install -r path/to/requirements.txt
python -m pytest <spacy-directory>
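
The first command above imports spaCy to find its location. As an alternative sketch, the standard library can look up the install directory without importing the package, which also gives a clear answer when it is not installed. `package_directory` is a hypothetical helper name.

```python
import importlib.util
import os


def package_directory(package_name):
    """Return the install directory of a top-level package, or None if absent."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or spec.origin is None:
        # Not installed, or a namespace package without a single location.
        return None
    return os.path.dirname(spec.origin)


# Prints the directory to pass to pytest, or None if spaCy is not installed.
print(package_directory("spacy"))
```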

See the documentation for more details and examples.


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

spacy-2.3.0.tar.gz (6.0 MB): Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.
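
As a rough illustration of that format, a wheel file name is a hyphen-separated sequence of distribution, version, an optional build tag, Python tag, ABI tag and platform tag. The sketch below splits a name into those fields; `parse_wheel_name` and `WheelName` are hypothetical names, and for real tooling a maintained packaging library is the safer choice.

```python
from collections import namedtuple

WheelName = namedtuple(
    "WheelName", "distribution version build python abi platform"
)


def parse_wheel_name(filename):
    """Split a wheel file name into its hyphen-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name: %r" % filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        distribution, version, python, abi, platform = parts
        build = None  # the build tag is optional and usually absent
    elif len(parts) == 6:
        distribution, version, build, python, abi, platform = parts
    else:
        raise ValueError("unexpected number of fields in %r" % filename)
    return WheelName(distribution, version, build, python, abi, platform)


wheel = parse_wheel_name("spacy-2.3.0-cp38-cp38-win_amd64.whl")
print(wheel.python, wheel.abi, wheel.platform)
```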

spacy-2.3.0-cp38-cp38-win_amd64.whl (9.6 MB): CPython 3.8, Windows x86-64
spacy-2.3.0-cp38-cp38-manylinux1_x86_64.whl (9.9 MB): CPython 3.8
spacy-2.3.0-cp38-cp38-macosx_10_9_x86_64.whl (10.2 MB): CPython 3.8, macOS 10.9+ x86-64
spacy-2.3.0-cp37-cp37m-win_amd64.whl (9.4 MB): CPython 3.7m, Windows x86-64
spacy-2.3.0-cp37-cp37m-manylinux1_x86_64.whl (10.0 MB): CPython 3.7m
spacy-2.3.0-cp37-cp37m-macosx_10_9_x86_64.whl (10.1 MB): CPython 3.7m, macOS 10.9+ x86-64
spacy-2.3.0-cp36-cp36m-win_amd64.whl (9.4 MB): CPython 3.6m, Windows x86-64
spacy-2.3.0-cp36-cp36m-manylinux1_x86_64.whl (10.0 MB): CPython 3.6m
spacy-2.3.0-cp36-cp36m-macosx_10_9_x86_64.whl (10.2 MB): CPython 3.6m, macOS 10.9+ x86-64

File details

Details for the file spacy-2.3.0.tar.gz.

File metadata

  • Download URL: spacy-2.3.0.tar.gz
  • Upload date:
  • Size: 6.0 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.46.1 CPython/3.7.7

File hashes

Hashes for spacy-2.3.0.tar.gz
Algorithm Hash digest
SHA256 5b0ae84b5d27e49f09b97384d806b74c70bc5d937551d45fc1f12adfce20315b
MD5 f0c49a32aeadf509a971779ed1c479fd
BLAKE2b-256 63049309749d00a44447c9e93510a3ccb21a37a36a23dbe8b35d09d4d2110094

See more details on using hashes here.
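
A downloaded file can be compared against its published SHA256 digest with the standard library. The following is a minimal sketch; the helper names `sha256_of_file` and `matches_sha256` are hypothetical, and the demonstration runs on a throwaway temporary file rather than on a real download.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path, expected_hex):
    """Compare a file's digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Demonstration on a temporary file with known contents.
payload = b"example payload"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
demo_path = tmp.name
demo_ok = matches_sha256(demo_path, hashlib.sha256(payload).hexdigest())
os.remove(demo_path)
print("demo file verified:", demo_ok)
```

For the source distribution, the expected value would be the SHA256 digest listed above for spacy-2.3.0.tar.gz, passed together with the path of the downloaded archive.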

File details

Details for the file spacy-2.3.0-cp38-cp38-win_amd64.whl.

File metadata

  • Download URL: spacy-2.3.0-cp38-cp38-win_amd64.whl
  • Upload date:
  • Size: 9.6 MB
  • Tags: CPython 3.8, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.46.1 CPython/3.7.7

File hashes

Hashes for spacy-2.3.0-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 52746167bfae7b96143bed3501e05f53be1c8c3b88399821c925b8f493fe07c5
MD5 c1b1b48e4d6c4d044228b609bcf55850
BLAKE2b-256 a06af8f8998150c7271aaacae7f33f1c3cec4ffeba255b9060ec087c032e6ab2

File details

Details for the file spacy-2.3.0-cp38-cp38-manylinux1_x86_64.whl.

File metadata

  • Download URL: spacy-2.3.0-cp38-cp38-manylinux1_x86_64.whl
  • Upload date:
  • Size: 9.9 MB
  • Tags: CPython 3.8
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.46.1 CPython/3.7.7

File hashes

Hashes for spacy-2.3.0-cp38-cp38-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 fef690d04fff75ed7f2cef9864c8b7cb4abc61981835598265c74e7df073efef
MD5 0453f5c645cb973bc3b075d4609bc426
BLAKE2b-256 5a6b2a8ac06d35e57857ca4c367596425189162e2f29cc89152e727d0e613458

File details

Details for the file spacy-2.3.0-cp38-cp38-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: spacy-2.3.0-cp38-cp38-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 10.2 MB
  • Tags: CPython 3.8, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.46.1 CPython/3.7.7

File hashes

Hashes for spacy-2.3.0-cp38-cp38-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 0024e192c4a7f0cfe831ae97596622018ecdae5a3f6ab77278fb4cc61bcd6a98
MD5 42252f5d262e9034a7b749680856adfe
BLAKE2b-256 706dfd946ed35f82cdc79867480b6b8bcc44c156242a0cb8912357d766a87fc3

File details

Details for the file spacy-2.3.0-cp37-cp37m-win_amd64.whl.

File metadata

  • Download URL: spacy-2.3.0-cp37-cp37m-win_amd64.whl
  • Upload date:
  • Size: 9.4 MB
  • Tags: CPython 3.7m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.46.1 CPython/3.7.7

File hashes

Hashes for spacy-2.3.0-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 f79629c449079157f6bb0bbc279c0ee558959d543b87d39a1ffd91062d96ceb5
MD5 573cd7586e73b5e97dd1b3178e2e27f2
BLAKE2b-256 ed92f4380cf87495c9d5039adbff4980af457d48a6a1932276f15c02cd8fc183

File details

Details for the file spacy-2.3.0-cp37-cp37m-manylinux1_x86_64.whl.

File metadata

  • Download URL: spacy-2.3.0-cp37-cp37m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 10.0 MB
  • Tags: CPython 3.7m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.46.1 CPython/3.7.7

File hashes

Hashes for spacy-2.3.0-cp37-cp37m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 1a125996322fc0ae54aaea34620722452da04eca78c97c78c17cd55bbc5782df
MD5 512e7bbc00ac3a5501ae4c0686084e89
BLAKE2b-256 3753336f849003c88a868275959c1195c26f1bbbae777e8544b5e9fe3dd35b90

File details

Details for the file spacy-2.3.0-cp37-cp37m-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: spacy-2.3.0-cp37-cp37m-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 10.1 MB
  • Tags: CPython 3.7m, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.46.1 CPython/3.7.7

File hashes

Hashes for spacy-2.3.0-cp37-cp37m-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 5d384dbba9fb9b1bf9b78e1e0541d3e8f3b2b8cb2ead3ece47ebf69631d4278d
MD5 47388679cf765eb82bef3e1c0023db44
BLAKE2b-256 0ffce0db914ba5f05ba83a387239444b172cdda866b8d68f7f16b6c438525295

File details

Details for the file spacy-2.3.0-cp36-cp36m-win_amd64.whl.

File metadata

  • Download URL: spacy-2.3.0-cp36-cp36m-win_amd64.whl
  • Upload date:
  • Size: 9.4 MB
  • Tags: CPython 3.6m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.46.1 CPython/3.7.7

File hashes

Hashes for spacy-2.3.0-cp36-cp36m-win_amd64.whl
Algorithm Hash digest
SHA256 4f0911c8855cf9ba7cdb1dc67761ef272e8f032865d559d4de45dcf8804c8e44
MD5 6e77cbccb59ca7bc310ef9554ffbfb71
BLAKE2b-256 7b9515843ecb582680004e93353f6f07f702b9330afd73e7f39157bb21edb6a5

File details

Details for the file spacy-2.3.0-cp36-cp36m-manylinux1_x86_64.whl.

File metadata

  • Download URL: spacy-2.3.0-cp36-cp36m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 10.0 MB
  • Tags: CPython 3.6m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.46.1 CPython/3.7.7

File hashes

Hashes for spacy-2.3.0-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 af3f3270cddca06ff697117e3b877a36915d7154b7c1a7b0413b8d50cf469de4
MD5 056ffde44b2c7317961d9bcc7dadab76
BLAKE2b-256 31c7e66e2af1cfa418c3a3917c116c4e00ccffa546f18f59e6acd7953d833c5c

File details

Details for the file spacy-2.3.0-cp36-cp36m-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: spacy-2.3.0-cp36-cp36m-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 10.2 MB
  • Tags: CPython 3.6m, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.46.1 CPython/3.7.7

File hashes

Hashes for spacy-2.3.0-cp36-cp36m-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 0ceb0bf227f7d3c2343b73701a2a5685108e818ee6dce98de83a46667c8bf9b2
MD5 ddce3c09266310e85c0f6a2ad41b1e17
BLAKE2b-256 7b7a4adddf033ab9430e508fada347d01a7085734c1e294d554d8148d431ed79
