
Industrial-strength Natural Language Processing (NLP) in Python


spaCy: Industrial-strength NLP

spaCy is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest research, and was designed from day one to be used in real products. spaCy comes with pretrained statistical models and word vectors, and currently supports tokenization for 60+ languages. It features state-of-the-art speed, convolutional neural network models for tagging, parsing and named entity recognition and easy deep learning integration. It's commercial open-source software, released under the MIT license.

💫 Version 2.3 out now! Check out the release notes here.

🌙 Version 3.0 (nightly) out now! Check out the release notes here.


📖 Documentation

  • spaCy 101: New to spaCy? Here's everything you need to know!
  • Usage Guides: How to use spaCy and its features.
  • New in v2.3: New features, backwards incompatibilities and migration guide.
  • API Reference: The detailed reference for spaCy's API.
  • Models: Download statistical language models for spaCy.
  • Universe: Libraries, extensions, demos, books and courses.
  • Changelog: Changes and version history.
  • Contribute: How to contribute to the spaCy project and code base.

💬 Where to ask questions

The spaCy project is maintained by @honnibal and @ines, along with core contributors @svlandeg and @adrianeboyd. Please understand that we won't be able to provide individual support via email. We also believe that help is much more valuable if it's shared publicly, so that more people can benefit from it.

  • 🚨 Bug Reports: GitHub Issue Tracker
  • 🎁 Feature Requests: GitHub Issue Tracker
  • 👩‍💻 Usage Questions: Stack Overflow · Gitter Chat · Reddit User Group
  • 🗯 General Discussion: Gitter Chat · Reddit User Group

Features

  • Non-destructive tokenization
  • Named entity recognition
  • Support for 60+ languages
  • Pretrained statistical models and word vectors
  • State-of-the-art speed
  • Easy deep learning integration
  • Part-of-speech tagging
  • Labelled dependency parsing
  • Syntax-driven sentence segmentation
  • Built-in visualizers for syntax and NER
  • Convenient string-to-hash mapping
  • Export to numpy data arrays
  • Efficient binary serialization
  • Easy model packaging and deployment
  • Robust, rigorously evaluated accuracy

📖 For more details, see the facts, figures and benchmarks.
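
As a small illustration of the first feature in the list, the sketch below checks that tokenization is non-destructive: joining each token's text with its trailing whitespace gives back the input exactly. It uses a blank English pipeline, so no pretrained model is needed. The helper name roundtrip and the sample text are made up for this example, and the spaCy import is skipped if the library is not installed.

```python
import importlib.util


def roundtrip(text, lang="en"):
    """Tokenize `text` with a blank pipeline and rebuild it from the tokens."""
    import spacy  # imported lazily so this file also loads without spaCy

    nlp = spacy.blank(lang)  # tokenizer only, no statistical model
    doc = nlp(text)
    # text_with_ws is the token text plus any trailing whitespace
    return "".join(token.text_with_ws for token in doc)


# Hypothetical sample with irregular spacing
SAMPLE = "Hello,  world! It's  non-destructive. "

if importlib.util.find_spec("spacy") is not None:
    print("round trip preserved the text:", roundtrip(SAMPLE) == SAMPLE)
else:
    print("spaCy is not installed; run `pip install spacy` first.")
```

Because the original string can always be recovered, character offsets computed on the tokens stay valid for the raw input.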

Install spaCy

For detailed installation instructions, see the documentation.

  • Operating system: macOS / OS X · Linux · Windows (Cygwin, MinGW, Visual Studio)
  • Python version: Python 2.7, 3.5+ (only 64 bit)
  • Package managers: pip · conda (via conda-forge)

pip

Using pip, spaCy releases are available as source packages and binary wheels (as of v2.0.13). Before you install spaCy and its dependencies, make sure that pip and setuptools are up to date.

pip install -U pip setuptools
pip install spacy

For installation on Python 3.5, where binary wheels are not provided for the most recent versions of the dependencies, you can prefer older binary wheels over newer source packages with --prefer-binary:

pip install spacy --prefer-binary

To install additional data tables for lemmatization and normalization in spaCy v2.2+ you can run pip install spacy[lookups] or install spacy-lookups-data separately. The lookups package is needed to create blank models with lemmatization data for v2.2+ plus normalization data for v2.3+, and to lemmatize in languages that don't yet come with pretrained models and aren't powered by third-party libraries.

When using pip it is generally recommended to install packages in a virtual environment to avoid modifying system state:

python -m venv .env
source .env/bin/activate
pip install spacy
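
After installing, it can help to confirm that the interpreter really is running inside the virtual environment and that spaCy is importable from it. The following is a minimal sketch using only the standard library; the function names in_virtualenv and spacy_version are made up for this example, and the virtual environment check assumes an environment created with python -m venv (or a recent virtualenv), which sets sys.base_prefix.

```python
import importlib.util
import sys


def in_virtualenv():
    """True if sys.prefix differs from the base interpreter's prefix."""
    base = getattr(sys, "base_prefix", sys.prefix)
    return sys.prefix != base


def spacy_version():
    """Return the installed spaCy version string, or None if not installed."""
    if importlib.util.find_spec("spacy") is None:
        return None
    import spacy

    return spacy.__version__


print("virtual environment:", in_virtualenv())
print("spaCy version:", spacy_version() or "not installed")
```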

conda

Thanks to our great community, we've finally re-added conda support. You can now install spaCy via conda-forge:

conda install -c conda-forge spacy

For the feedstock including the build recipe and configuration, check out this repository. Improvements and pull requests to the recipe and setup are always appreciated.

Updating spaCy

Some updates to spaCy may require downloading new statistical models. If you're running spaCy v2.0 or higher, you can use the validate command to check if your installed models are compatible and if not, print details on how to update them:

pip install -U spacy
python -m spacy validate
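
To make the idea of model compatibility concrete, here is a rough sketch of checking a spaCy version against a version range such as ">=2.3.0,<2.4.0". It is not how the validate command is implemented; it assumes, for illustration only, that a model declares its supported range as a comma-separated string of this form, and the helper names parse_version and satisfies are hypothetical.

```python
import operator
import re

# Comparison operators supported by this simplified checker
OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_version(version):
    """Turn '2.3.3' into (2, 3, 3), padding short versions with zeros."""
    numbers = [int(part) for part in re.findall(r"\d+", version)[:3]]
    return tuple(numbers + [0] * (3 - len(numbers)))


def satisfies(version, spec):
    """Check `version` against every clause of a range like '>=2.3.0,<2.4.0'."""
    current = parse_version(version)
    for clause in spec.split(","):
        match = re.match(r"\s*(>=|<=|==|>|<)\s*(\S+)", clause)
        if match is None:
            raise ValueError("unsupported version clause: {!r}".format(clause))
        op, bound = match.groups()
        if not OPS[op](current, parse_version(bound)):
            return False
    return True


print(satisfies("2.3.3", ">=2.3.0,<2.4.0"))
print(satisfies("2.4.0", ">=2.3.0,<2.4.0"))
```

For real projects, python -m spacy validate remains the supported way to check installed models.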

If you've trained your own models, keep in mind that your training and runtime inputs must match. After updating spaCy, we recommend retraining your models with the new version.

📖 For details on upgrading from spaCy 1.x to spaCy 2.x, see the migration guide.

Download models

As of v1.7.0, models for spaCy can be installed as Python packages. This means that they're a component of your application, just like any other module. Models can be installed using spaCy's download command, or manually by pointing pip to a path or URL.

  • Available Models: Detailed model descriptions, accuracy figures and benchmarks.
  • Models Documentation: Detailed usage instructions.

# download best-matching version of specific model for your spaCy installation
python -m spacy download en_core_web_sm

# pip install .tar.gz archive from path or URL
pip install /Users/you/en_core_web_sm-2.2.0.tar.gz
pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.0/en_core_web_sm-2.2.0.tar.gz
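
The release URL above follows a regular pattern: the package name and version appear once as the release tag and once as the archive file name. A short sketch that rebuilds such a URL, assuming the pattern shown above holds for the model and version you need (the function name model_archive_url is made up for this example):

```python
BASE_URL = "https://github.com/explosion/spacy-models/releases/download"


def model_archive_url(name, version):
    """Build the .tar.gz URL for a model release, following the pattern above."""
    package = "{}-{}".format(name, version)
    # The release tag and the archive file name share the same "name-version" string
    return "{}/{}/{}.tar.gz".format(BASE_URL, package, package)


url = model_archive_url("en_core_web_sm", "2.2.0")
print(url)
```

The resulting string can be passed to pip install, which is handy when pinning a model version in a requirements file.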

Loading and using models

To load a model, use spacy.load() with the model name, a shortcut link or a path to the model data directory.

import spacy
nlp = spacy.load("en_core_web_sm")
doc = nlp("This is a sentence.")

You can also import a model directly via its full name and then call its load() method with no arguments.

import spacy
import en_core_web_sm

nlp = en_core_web_sm.load()
doc = nlp("This is a sentence.")

📖 For more info and examples, check out the models documentation.
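
If the model package might not be installed, spacy.load() raises an OSError. The sketch below catches that error and falls back to a blank pipeline that only tokenizes. The helper name load_pipeline and the fallback behavior are choices made for this example, not something spaCy does for you, and the spaCy import is skipped if the library is missing.

```python
import importlib.util


def load_pipeline(name="en_core_web_sm", fallback_lang="en"):
    """Load a model package, or fall back to a blank tokenizer-only pipeline."""
    import spacy  # imported lazily so this file also loads without spaCy

    try:
        return spacy.load(name)
    except OSError:
        # Model package not found: a blank pipeline still tokenizes text
        return spacy.blank(fallback_lang)


if importlib.util.find_spec("spacy") is not None:
    nlp = load_pipeline()
    doc = nlp("This is a sentence.")
    print([token.text for token in doc])
else:
    print("spaCy is not installed; run `pip install spacy` first.")
```

Keep in mind that a blank pipeline has no tagger, parser or entity recognizer, so attributes that depend on those components stay empty.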

Compile from source

The other way to install spaCy is to clone its GitHub repository and build it from source. This is the usual approach if you want to make changes to the code base. You'll need a development environment consisting of a Python distribution including header files, a compiler, pip, virtualenv and git. The compiler is the trickiest part, and how to set it up depends on your system. See the notes on Ubuntu, macOS / OS X and Windows for details.

# make sure you are using the latest pip
python -m pip install -U pip
git clone https://github.com/explosion/spaCy
cd spaCy

python -m venv .env
source .env/bin/activate
export PYTHONPATH=`pwd`
pip install -r requirements.txt
python setup.py build_ext --inplace

Compared to regular install via pip, requirements.txt additionally installs developer dependencies such as Cython. For more details and instructions, see the documentation on compiling spaCy from source and the quickstart widget to get the right commands for your platform and Python version.

Ubuntu

Install system-level dependencies via apt-get:

sudo apt-get install build-essential python-dev git

macOS / OS X

Install a recent version of Xcode, including the so-called "Command Line Tools". macOS and OS X ship with Python and git preinstalled.

Windows

Install a version of the Visual C++ Build Tools or Visual Studio Express that matches the version that was used to compile your Python interpreter. For official distributions these are VS 2008 (Python 2.7), VS 2010 (Python 3.4) and VS 2015 (Python 3.5).

Run tests

spaCy comes with an extensive test suite. In order to run the tests, you'll usually want to clone the repository and build spaCy from source. This will also install the required development dependencies and test utilities defined in the requirements.txt.

Alternatively, you can find out where spaCy is installed and run pytest on that directory. Don't forget to also install the test utilities via spaCy's requirements.txt:

python -c "import os; import spacy; print(os.path.dirname(spacy.__file__))"
pip install -r path/to/requirements.txt
python -m pytest <spacy-directory>

See the documentation for more details and examples.


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • spacy-2.3.3.tar.gz (5.8 MB): Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.
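
A wheel file name encodes the distribution, version, an optional build tag, and the Python, ABI and platform tags, separated by hyphens. The following sketch splits a name into those parts so you can see which file matches your interpreter; the names WheelName and parse_wheel_name are made up for this example, and it does no validation beyond counting the fields.

```python
from collections import namedtuple

WheelName = namedtuple(
    "WheelName", "distribution version build python abi platform"
)


def parse_wheel_name(filename):
    """Split 'name-version[-build]-python-abi-platform.whl' into its fields."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: {!r}".format(filename))
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        dist, version, python, abi, platform = parts
        build = None  # the build tag is optional and usually absent
    elif len(parts) == 6:
        dist, version, build, python, abi, platform = parts
    else:
        raise ValueError("unexpected wheel name: {!r}".format(filename))
    return WheelName(dist, version, build, python, abi, platform)


info = parse_wheel_name("spacy-2.3.3-cp37-cp37m-manylinux2014_x86_64.whl")
print(info.python, info.abi, info.platform)
```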

  • spacy-2.3.3-cp39-cp39-win_amd64.whl (9.4 MB): CPython 3.9, Windows x86-64
  • spacy-2.3.3-cp39-cp39-manylinux2014_x86_64.whl (10.3 MB): CPython 3.9
  • spacy-2.3.3-cp39-cp39-macosx_10_9_x86_64.whl (10.1 MB): CPython 3.9, macOS 10.9+ x86-64
  • spacy-2.3.3-cp38-cp38-win_amd64.whl (9.7 MB): CPython 3.8, Windows x86-64
  • spacy-2.3.3-cp38-cp38-manylinux2014_x86_64.whl (10.5 MB): CPython 3.8
  • spacy-2.3.3-cp38-cp38-macosx_10_9_x86_64.whl (10.2 MB): CPython 3.8, macOS 10.9+ x86-64
  • spacy-2.3.3-cp37-cp37m-win_amd64.whl (9.5 MB): CPython 3.7m, Windows x86-64
  • spacy-2.3.3-cp37-cp37m-manylinux2014_x86_64.whl (10.4 MB): CPython 3.7m
  • spacy-2.3.3-cp37-cp37m-macosx_10_9_x86_64.whl (10.2 MB): CPython 3.7m, macOS 10.9+ x86-64
  • spacy-2.3.3-cp36-cp36m-win_amd64.whl (9.5 MB): CPython 3.6m, Windows x86-64
  • spacy-2.3.3-cp36-cp36m-manylinux2014_x86_64.whl (10.4 MB): CPython 3.6m
  • spacy-2.3.3-cp36-cp36m-macosx_10_9_x86_64.whl (10.3 MB): CPython 3.6m, macOS 10.9+ x86-64

File details

Details for the file spacy-2.3.3.tar.gz.

File metadata

  • Download URL: spacy-2.3.3.tar.gz
  • Upload date:
  • Size: 5.8 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.0 setuptools/47.1.0 requests-toolbelt/0.9.1 tqdm/4.53.0 CPython/3.7.9

File hashes

Hashes for spacy-2.3.3.tar.gz
Algorithm Hash digest
SHA256 799fa5fc172ff0a5bc8eb5dfcd1db200747c114320d2dc40060594a71efa3e53
MD5 8b233987697bd9da78577ddd0d00255e
BLAKE2b-256 8f108c463b664a84326a32bec7fc8910d68ba069e69ff076e34932b005954934

See more details on using hashes here.
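
The SHA256 digests listed on this page can be used to check a downloaded file before installing it. A minimal sketch with the standard library's hashlib follows; the function names sha256_of and matches_digest are made up for this example, and the demo hashes a small temporary file rather than a real spaCy archive.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


# Demo on a throwaway file; for a real check, pass the downloaded archive
# and the SHA256 value listed for it on this page.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example payload")
try:
    expected = hashlib.sha256(b"example payload").hexdigest()
    print(matches_digest(tmp.name, expected))
finally:
    os.remove(tmp.name)
```

For the source distribution, the call would take the path to spacy-2.3.3.tar.gz and the SHA256 value from the table above.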

File details

Details for the file spacy-2.3.3-cp39-cp39-win_amd64.whl.

File metadata

  • Download URL: spacy-2.3.3-cp39-cp39-win_amd64.whl
  • Upload date:
  • Size: 9.4 MB
  • Tags: CPython 3.9, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.0 setuptools/47.1.0 requests-toolbelt/0.9.1 tqdm/4.53.0 CPython/3.7.9

File hashes

Hashes for spacy-2.3.3-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 e96fc3e27c42b3ce5e7b114ab060ade83fc3760a86b03d407bb999a6133fb2ae
MD5 334146ca02a0f550aa3d0753abd67616
BLAKE2b-256 8de007782a2ad144d79d16bcefb43669ec18df7b0675c04728874673f3c0190c


File details

Details for the file spacy-2.3.3-cp39-cp39-manylinux2014_x86_64.whl.

File metadata

  • Download URL: spacy-2.3.3-cp39-cp39-manylinux2014_x86_64.whl
  • Upload date:
  • Size: 10.3 MB
  • Tags: CPython 3.9
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.0 setuptools/47.1.0 requests-toolbelt/0.9.1 tqdm/4.53.0 CPython/3.7.9

File hashes

Hashes for spacy-2.3.3-cp39-cp39-manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 0b432ef51b9e230015b735f1aac4233da36086de015e49fe87a29ce344f5d128
MD5 2c8ec4d7408ab88522d03940a0270d8c
BLAKE2b-256 fa39c8a9fb2bf72e00f29597cd3d00c104f469a83517fe9455695a06fef1bf7f


File details

Details for the file spacy-2.3.3-cp39-cp39-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: spacy-2.3.3-cp39-cp39-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 10.1 MB
  • Tags: CPython 3.9, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.0 setuptools/47.1.0 requests-toolbelt/0.9.1 tqdm/4.53.0 CPython/3.7.9

File hashes

Hashes for spacy-2.3.3-cp39-cp39-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 f4989a66f12e096aa7d3145010227e347dbd28c7233d9aa842e3450c9ca665ed
MD5 c3472010c1417ba60cefcfe8a66f063d
BLAKE2b-256 8def4aaad138d9038a7a49501100fb3ff8be3c5506a179c1d57b53785677cb91


File details

Details for the file spacy-2.3.3-cp38-cp38-win_amd64.whl.

File metadata

  • Download URL: spacy-2.3.3-cp38-cp38-win_amd64.whl
  • Upload date:
  • Size: 9.7 MB
  • Tags: CPython 3.8, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.0 setuptools/47.1.0 requests-toolbelt/0.9.1 tqdm/4.53.0 CPython/3.7.9

File hashes

Hashes for spacy-2.3.3-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 b2f60e406bbc086a6e8376bccd9ed4e217a06037a3b52ad5efd5a3deff164865
MD5 5889018699247dc342d558c5f65e8968
BLAKE2b-256 8bb28c40013312a324e4b4633db541438c97599df9685cbfb5c5f0209113e106


File details

Details for the file spacy-2.3.3-cp38-cp38-manylinux2014_x86_64.whl.

File metadata

  • Download URL: spacy-2.3.3-cp38-cp38-manylinux2014_x86_64.whl
  • Upload date:
  • Size: 10.5 MB
  • Tags: CPython 3.8
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.0 setuptools/47.1.0 requests-toolbelt/0.9.1 tqdm/4.53.0 CPython/3.7.9

File hashes

Hashes for spacy-2.3.3-cp38-cp38-manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 c4ff005113bffbf7b2841c0e09bd520688b858ea2010209c551d5726b1034b7e
MD5 9f4578941da0ab37ccbb5dab3e0d0662
BLAKE2b-256 f5e5f8d75a11f46721d669b1c5a112e309ec6ddcec6896c7afe6cae4bc948f8f


File details

Details for the file spacy-2.3.3-cp38-cp38-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: spacy-2.3.3-cp38-cp38-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 10.2 MB
  • Tags: CPython 3.8, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.0 setuptools/47.1.0 requests-toolbelt/0.9.1 tqdm/4.53.0 CPython/3.7.9

File hashes

Hashes for spacy-2.3.3-cp38-cp38-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 070ae51898fb1ab34b50dd37bf9767dcebb92662c523d5d5ea7a2246b752c9cf
MD5 d40f598f59ba1d375f560c0c4a2e70a3
BLAKE2b-256 a1e30df920603672c31b1aaa5ad27c278fbd70a7994d0f8bee21317e0799df32


File details

Details for the file spacy-2.3.3-cp37-cp37m-win_amd64.whl.

File metadata

  • Download URL: spacy-2.3.3-cp37-cp37m-win_amd64.whl
  • Upload date:
  • Size: 9.5 MB
  • Tags: CPython 3.7m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.0 setuptools/47.1.0 requests-toolbelt/0.9.1 tqdm/4.53.0 CPython/3.7.9

File hashes

Hashes for spacy-2.3.3-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 7e7cba1077b021adb5d309cf77756e3421bf6964f652bdf89e86cf1b43b250b0
MD5 b3731394d7fb6d593115da2d59c53f46
BLAKE2b-256 9f7e0ed7bab0455376ac5220264de3c10649a5d5aae74161343c824517bd6e27


File details

Details for the file spacy-2.3.3-cp37-cp37m-manylinux2014_x86_64.whl.

File metadata

  • Download URL: spacy-2.3.3-cp37-cp37m-manylinux2014_x86_64.whl
  • Upload date:
  • Size: 10.4 MB
  • Tags: CPython 3.7m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.0 setuptools/47.1.0 requests-toolbelt/0.9.1 tqdm/4.53.0 CPython/3.7.9

File hashes

Hashes for spacy-2.3.3-cp37-cp37m-manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 8cb47a526a7e26eda548d17ffc498e671bb6b3eb70a2de8f88b42a41eed81c8d
MD5 a4ab3dbc11759b335cfa5c06fe433825
BLAKE2b-256 7d7432fcd45e7ad726314bd50c89fa6503f660efef1c283b9eec0b4528fc985c


File details

Details for the file spacy-2.3.3-cp37-cp37m-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: spacy-2.3.3-cp37-cp37m-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 10.2 MB
  • Tags: CPython 3.7m, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.0 setuptools/47.1.0 requests-toolbelt/0.9.1 tqdm/4.53.0 CPython/3.7.9

File hashes

Hashes for spacy-2.3.3-cp37-cp37m-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 f1b37866c6fff23ccf49e56448ad927c0e14070ce1c0f3f55f694c6d51b40e94
MD5 9f060dd257c7146f73f142528915e109
BLAKE2b-256 5a14a5ebe4b2845306349cb6baad61f33e0de7b953a6e77ff5d805755711e8e0


File details

Details for the file spacy-2.3.3-cp36-cp36m-win_amd64.whl.

File metadata

  • Download URL: spacy-2.3.3-cp36-cp36m-win_amd64.whl
  • Upload date:
  • Size: 9.5 MB
  • Tags: CPython 3.6m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.0 setuptools/47.1.0 requests-toolbelt/0.9.1 tqdm/4.53.0 CPython/3.7.9

File hashes

Hashes for spacy-2.3.3-cp36-cp36m-win_amd64.whl
Algorithm Hash digest
SHA256 45681c37579dae6040a485bf434e27398180fdfa2fa4ba21d25c6bf3337dba7f
MD5 2b86ad7c6284e50157268e1d05859f91
BLAKE2b-256 b60c77298ed4a31979e495514b36304a9967fe55b5f3f03f4241565089c46ba0


File details

Details for the file spacy-2.3.3-cp36-cp36m-manylinux2014_x86_64.whl.

File metadata

  • Download URL: spacy-2.3.3-cp36-cp36m-manylinux2014_x86_64.whl
  • Upload date:
  • Size: 10.4 MB
  • Tags: CPython 3.6m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.0 setuptools/47.1.0 requests-toolbelt/0.9.1 tqdm/4.53.0 CPython/3.7.9

File hashes

Hashes for spacy-2.3.3-cp36-cp36m-manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 1b178f0c1ce42a92168c2eba16b1337ac54a709537eab3a4351698d0bfce4203
MD5 1260255495c64f777ec464c7f959b47c
BLAKE2b-256 5a7900c319ca024ac62881fe1b6c6062e2652971954b2aba075f9fdaceedbadb


File details

Details for the file spacy-2.3.3-cp36-cp36m-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: spacy-2.3.3-cp36-cp36m-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 10.3 MB
  • Tags: CPython 3.6m, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.0 setuptools/47.1.0 requests-toolbelt/0.9.1 tqdm/4.53.0 CPython/3.7.9

File hashes

Hashes for spacy-2.3.3-cp36-cp36m-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 3d88f7504b9c8021796d30618eb9fa3a3185535af352ce982152e3aeb5ef73a3
MD5 6475e6fdc16a6aa5e3ec7f386816e999
BLAKE2b-256 0de095451b167736e202b0790b22ac1d82ceaacdc707b4602f39cb7ee0074f70

