Industrial-strength Natural Language Processing (NLP) in Python

spaCy: Industrial-strength NLP

spaCy is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest research and was designed from day one to be used in real products. spaCy comes with pre-trained statistical models and word vectors, and currently supports tokenization for 50+ languages. It features state-of-the-art speed, convolutional neural network models for tagging, parsing and named entity recognition, and easy deep learning integration. It's commercial open-source software, released under the MIT license.

💫 Version 2.1 out now! Check out the release notes here.

📖 Documentation

Documentation
spaCy 101       New to spaCy? Here's everything you need to know!
Usage Guides    How to use spaCy and its features.
New in v2.1     New features, backwards incompatibilities and migration guide.
API Reference   The detailed reference for spaCy's API.
Models          Download statistical language models for spaCy.
Universe        Libraries, extensions, demos, books and courses.
Changelog       Changes and version history.
Contribute      How to contribute to the spaCy project and code base.

💬 Where to ask questions

The spaCy project is maintained by @honnibal and @ines, along with core contributors @svlandeg and @adrianeboyd. Please understand that we won't be able to provide individual support via email. We also believe that help is much more valuable if it's shared publicly, so that more people can benefit from it.

Type                    Platforms
🚨 Bug Reports          GitHub Issue Tracker
🎁 Feature Requests     GitHub Issue Tracker
👩‍💻 Usage Questions      Stack Overflow · Gitter Chat · Reddit User Group
🗯 General Discussion   Gitter Chat · Reddit User Group

Features

  • Non-destructive tokenization
  • Named entity recognition
  • Support for 50+ languages
  • Pre-trained statistical models and word vectors
  • State-of-the-art speed
  • Easy deep learning integration
  • Part-of-speech tagging
  • Labelled dependency parsing
  • Syntax-driven sentence segmentation
  • Built-in visualizers for syntax and NER
  • Convenient string-to-hash mapping
  • Export to numpy data arrays
  • Efficient binary serialization
  • Easy model packaging and deployment
  • Robust, rigorously evaluated accuracy

📖 For more details, see the facts, figures and benchmarks.

Install spaCy

For detailed installation instructions, see the documentation.

  • Operating system: macOS / OS X · Linux · Windows (Cygwin, MinGW, Visual Studio)
  • Python version: Python 2.7, 3.5+ (64-bit only)
  • Package managers: pip · conda (via conda-forge)
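
The Python requirements above can also be checked from code. The following is a minimal sketch, not part of spaCy: the helper name is_supported_python is hypothetical, and it encodes only the version and 64-bit constraints listed above.

```python
import struct
import sys


def is_supported_python(version_info=sys.version_info,
                        pointer_bits=struct.calcsize("P") * 8):
    """Return True for 64-bit Python 2.7 or 3.5+ (hypothetical helper).

    pointer_bits is the size of a C pointer in bits, which is 64 on a
    64-bit interpreter and 32 on a 32-bit one.
    """
    major, minor = version_info[0], version_info[1]
    version_ok = (major, minor) == (2, 7) or (major, minor) >= (3, 5)
    return version_ok and pointer_bits == 64


if is_supported_python():
    print("This interpreter meets the requirements listed above.")
else:
    print("This interpreter does not meet the requirements listed above.")
```

Passing explicit arguments, for example is_supported_python((3, 4, 0), 64), makes it possible to check a target environment other than the one running the script.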

pip

Using pip, spaCy releases are available as source packages and binary wheels (as of v2.0.13).

pip install spacy

When using pip it is generally recommended to install packages in a virtual environment to avoid modifying system state:

python -m venv .env
source .env/bin/activate
pip install spacy

conda

Thanks to our great community, we've finally re-added conda support. You can now install spaCy via conda-forge:

conda config --add channels conda-forge
conda install spacy

For the feedstock including the build recipe and configuration, check out this repository. Improvements and pull requests to the recipe and setup are always appreciated.

Updating spaCy

Some updates to spaCy may require downloading new statistical models. If you're running spaCy v2.0 or higher, you can use the validate command to check whether your installed models are compatible and, if not, print details on how to update them:

pip install -U spacy
python -m spacy validate
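
Before or after an update, it can help to see which versions are actually installed. The sketch below is an illustration only and does not replace the validate command: it assumes Python 3.8+ (for importlib.metadata, which is newer than the Python versions listed above), the helper name installed_version is hypothetical, and en_core_web_sm is just an example model package.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None.

    Hypothetical helper; it only reads package metadata and does not
    check whether a model is compatible with the installed spaCy.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Models are installed as Python packages, so they can be queried like spaCy itself.
for package in ("spacy", "en_core_web_sm"):
    print(package, installed_version(package) or "not installed")
```

The compatibility decision itself is left to python -m spacy validate.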

If you've trained your own models, keep in mind that your training and runtime inputs must match. After updating spaCy, we recommend retraining your models with the new version.

📖 For details on upgrading from spaCy 1.x to spaCy 2.x, see the migration guide.

Download models

As of v1.7.0, models for spaCy can be installed as Python packages. This means that they're a component of your application, just like any other module. Models can be installed using spaCy's download command, or manually by pointing pip to a path or URL.

Documentation
Available Models       Detailed model descriptions, accuracy figures and benchmarks.
Models Documentation   Detailed usage instructions.

# download best-matching version of specific model for your spaCy installation
python -m spacy download en_core_web_sm

# out-of-the-box: download best-matching default model
python -m spacy download en

# pip install .tar.gz archive from path or URL
pip install /Users/you/en_core_web_sm-2.2.0.tar.gz
pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.0/en_core_web_sm-2.2.0.tar.gz
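
The archive URL above follows a visible pattern: the release tag and the file name are both the model name joined to the version. A minimal sketch that builds such a URL is shown below. The helper name model_archive_url is hypothetical, and the pattern is inferred from the single example above, so it is an assumption that other models and versions follow it.

```python
def model_archive_url(name, version):
    """Build a .tar.gz URL for a model release (hypothetical helper).

    Assumes the layout seen in the example above:
    .../releases/download/<name>-<version>/<name>-<version>.tar.gz
    """
    tag = "{}-{}".format(name, version)
    return (
        "https://github.com/explosion/spacy-models/releases/download/"
        "{tag}/{tag}.tar.gz".format(tag=tag)
    )


print(model_archive_url("en_core_web_sm", "2.2.0"))
```

The resulting string can be passed to pip install in the same way as the URL shown above.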

Loading and using models

To load a model, use spacy.load() with the model name, a shortcut link or a path to the model data directory.

import spacy
nlp = spacy.load("en_core_web_sm")
doc = nlp(u"This is a sentence.")

You can also import a model directly via its full name and then call its load() method with no arguments.

import spacy
import en_core_web_sm

nlp = en_core_web_sm.load()
doc = nlp(u"This is a sentence.")
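
In application code, it is often useful to handle the case where spaCy or the model is not installed yet. The following is a minimal sketch under stated assumptions: the helper name load_model_or_none is hypothetical, and it assumes that spacy.load() raises OSError when the model cannot be found, which is the behaviour in spaCy v2.

```python
def load_model_or_none(name="en_core_web_sm"):
    """Return a loaded pipeline, or None if spaCy or the model is missing.

    Hypothetical helper, not part of spaCy's API.
    """
    try:
        import spacy
    except ImportError:
        return None  # spaCy itself is not installed
    try:
        return spacy.load(name)
    except OSError:
        return None  # model package or shortcut link not found


nlp = load_model_or_none("en_core_web_sm")
if nlp is None:
    print("Model not available; try: python -m spacy download en_core_web_sm")
else:
    doc = nlp(u"This is a sentence.")
    for token in doc:
        # Text, coarse-grained part-of-speech tag and dependency label
        print(token.text, token.pos_, token.dep_)
```

Returning None keeps the sketch short; a real application might prefer to re-raise with a clearer message.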

📖 For more info and examples, check out the models documentation.

Support for older versions

If you're using an older version (v1.6.0 or below), you can still download and install the old models from within spaCy using python -m spacy.en.download all or python -m spacy.de.download all. The .tar.gz archives are also attached to the v1.6.0 release. To download and install the models manually, unpack the archive, drop the contained directory into spacy/data and load the model via spacy.load('en') or spacy.load('de').

Compile from source

The other way to install spaCy is to clone its GitHub repository and build it from source. This is the usual route if you want to make changes to the code base. You'll need a development environment consisting of a Python distribution including header files, a compiler, pip, virtualenv and git. Setting up the compiler is the trickiest part, and how to do it depends on your system. See the notes on Ubuntu, macOS / OS X and Windows for details.

# make sure you are using the latest pip
python -m pip install -U pip
git clone https://github.com/explosion/spaCy
cd spaCy

python -m venv .env
source .env/bin/activate
export PYTHONPATH=`pwd`
pip install -r requirements.txt
python setup.py build_ext --inplace

Compared to regular install via pip, requirements.txt additionally installs developer dependencies such as Cython. For more details and instructions, see the documentation on compiling spaCy from source and the quickstart widget to get the right commands for your platform and Python version.

Ubuntu

Install system-level dependencies via apt-get:

sudo apt-get install build-essential python-dev git

macOS / OS X

Install a recent version of Xcode, including the so-called "Command Line Tools". macOS and OS X ship with Python and git preinstalled.

Windows

Install a version of the Visual C++ Build Tools or Visual Studio Express that matches the version that was used to compile your Python interpreter. For official distributions these are VS 2008 (Python 2.7), VS 2010 (Python 3.4) and VS 2015 (Python 3.5).

Run tests

spaCy comes with an extensive test suite. To run the tests, you'll usually want to clone the repository and build spaCy from source. This will also install the required development dependencies and test utilities defined in requirements.txt.

Alternatively, you can find out where spaCy is installed and run pytest on that directory. Don't forget to also install the test utilities via spaCy's requirements.txt:

python -c "import os; import spacy; print(os.path.dirname(spacy.__file__))"
pip install -r path/to/requirements.txt
python -m pytest <spacy-directory>
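
The first and last commands above can be combined in a small script. This is a sketch only: the helper names package_dir and run_package_tests are hypothetical, it assumes Python 3 (for importlib.util.find_spec), and it assumes pytest and the test utilities are already installed.

```python
import importlib.util
import os
import subprocess
import sys


def package_dir(name):
    """Return the directory of an installed top-level package, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.origin:
        return None
    return os.path.dirname(spec.origin)


def run_package_tests(name):
    """Run pytest on the package directory and return pytest's exit code."""
    directory = package_dir(name)
    if directory is None:
        raise RuntimeError("{} is not installed".format(name))
    return subprocess.call([sys.executable, "-m", "pytest", directory])


# Locating the package does not import it; prints None if spaCy is missing.
print(package_dir("spacy"))
```

Calling run_package_tests("spacy") then corresponds to running python -m pytest on the printed directory.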

See the documentation for more details and examples.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

spacy-2.2.0.dev17.tar.gz (5.8 MB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.
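
As a rough guide to the file names listed here, a wheel name has the form name-version-pythontag-abitag-platformtag.whl, with an optional build tag after the version. The sketch below splits a file name into those parts. It is a simplified illustration, the helper name parse_wheel_filename is hypothetical, and pip's own matching logic is more involved.

```python
def parse_wheel_filename(filename):
    """Split a wheel file name into its tags (hypothetical helper)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name: {!r}".format(filename))
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel file name: {!r}".format(filename))
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "build": parts[2] if len(parts) == 6 else None,
        # Compressed tag sets are separated by dots
        "python": python_tag.split("."),
        "abi": abi_tag.split("."),
        "platforms": platform_tag.split("."),
    }


info = parse_wheel_filename("spacy-2.2.0.dev17-cp37-cp37m-win_amd64.whl")
print(info["version"], info["python"], info["platforms"])
```

For the macOS wheels, the platforms entry contains one item per dot-separated platform tag, which is why those file names are so long.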

spacy-2.2.0.dev17-cp37-cp37m-win_amd64.whl (9.3 MB view details)

Uploaded CPython 3.7m · Windows x86-64

spacy-2.2.0.dev17-cp37-cp37m-manylinux1_x86_64.whl (10.2 MB view details)

Uploaded CPython 3.7m

spacy-2.2.0.dev17-cp37-cp37m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl (14.0 MB view details)

Uploaded CPython 3.7m · macOS 10.10+ Intel (x86-64, i386) · macOS 10.10+ x86-64 · macOS 10.6+ Intel (x86-64, i386) · macOS 10.9+ Intel (x86-64, i386) · macOS 10.9+ x86-64

spacy-2.2.0.dev17-cp36-cp36m-win_amd64.whl (9.3 MB view details)

Uploaded CPython 3.6m · Windows x86-64

spacy-2.2.0.dev17-cp36-cp36m-manylinux1_x86_64.whl (10.2 MB view details)

Uploaded CPython 3.6m

spacy-2.2.0.dev17-cp36-cp36m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl (14.3 MB view details)

Uploaded CPython 3.6m · macOS 10.10+ Intel (x86-64, i386) · macOS 10.10+ x86-64 · macOS 10.6+ Intel (x86-64, i386) · macOS 10.9+ Intel (x86-64, i386) · macOS 10.9+ x86-64

spacy-2.2.0.dev17-cp35-cp35m-win_amd64.whl (9.2 MB view details)

Uploaded CPython 3.5m · Windows x86-64

spacy-2.2.0.dev17-cp35-cp35m-manylinux1_x86_64.whl (10.1 MB view details)

Uploaded CPython 3.5m

spacy-2.2.0.dev17-cp27-cp27mu-manylinux1_x86_64.whl (10.2 MB view details)

Uploaded CPython 2.7mu

File details

Details for the file spacy-2.2.0.dev17.tar.gz.

File metadata

  • Download URL: spacy-2.2.0.dev17.tar.gz
  • Upload date:
  • Size: 5.8 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/39.0.1 requests-toolbelt/0.9.1 tqdm/4.32.2 CPython/3.6.6

File hashes

Hashes for spacy-2.2.0.dev17.tar.gz
Algorithm Hash digest
SHA256 999162b4e0b39474479029e0b6d5b35e98a30bfdb5aa90d9bb74320f4ce272ff
MD5 03371d8a1a9d63fa3570c7114845ddb5
BLAKE2b-256 fa963ac1c7e5aed5704d10cccf8439a0b1d9c7027bb0f9bd6d263ca3b1e27d79

See more details on using hashes here.
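
A downloaded file can be checked against the SHA256 digest published for it. The following is a minimal sketch using Python's standard hashlib module. The helper names sha256_hexdigest and matches_published_sha256 are hypothetical, and the local path in the commented example is an assumption about where the file was saved.

```python
import hashlib


def sha256_hexdigest(path, chunk_size=1 << 20):
    """Return the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_sha256(path, expected_hex):
    """Compare a file's digest with a published one, ignoring case and spaces."""
    return sha256_hexdigest(path) == expected_hex.strip().lower()


# Example (hypothetical local path; digest copied from the table above):
# matches_published_sha256(
#     "spacy-2.2.0.dev17.tar.gz",
#     "999162b4e0b39474479029e0b6d5b35e98a30bfdb5aa90d9bb74320f4ce272ff",
# )
```

Reading in chunks keeps memory use flat for archives of several megabytes, such as the ones listed here.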

File details

Details for the file spacy-2.2.0.dev17-cp37-cp37m-win_amd64.whl.

File metadata

  • Download URL: spacy-2.2.0.dev17-cp37-cp37m-win_amd64.whl
  • Upload date:
  • Size: 9.3 MB
  • Tags: CPython 3.7m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.20.1 setuptools/39.0.1 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.6.6

File hashes

Hashes for spacy-2.2.0.dev17-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 a4586e422e5bbb26b260359919e6c0dc6f92392df43bef9f16d872aff974e81d
MD5 771a0b042922b6b20c4745dc847196fe
BLAKE2b-256 86211c7d3cd8ddc3733ffe3e63d45ce2a1adc31c720181ea974bd75b92441c14

File details

Details for the file spacy-2.2.0.dev17-cp37-cp37m-manylinux1_x86_64.whl.

File metadata

  • Download URL: spacy-2.2.0.dev17-cp37-cp37m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 10.2 MB
  • Tags: CPython 3.7m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.20.1 setuptools/39.0.1 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.6.6

File hashes

Hashes for spacy-2.2.0.dev17-cp37-cp37m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 6d777277d34ce8d3a7052686faf1fd5fb3a5d66539c500cc477b68e053e9b652
MD5 4066566676a0700fcf1eef008f9f6073
BLAKE2b-256 3aa0561023ea270d0a17cca00e74b7650a5fe2666bb3f296140cd44e4e01e801

File details

Details for the file spacy-2.2.0.dev17-cp37-cp37m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl.

File metadata

File hashes

Hashes for spacy-2.2.0.dev17-cp37-cp37m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl
Algorithm Hash digest
SHA256 612d02b07c45868986f73efc3e99f926de676ae220b116752a15a8e1f8e7b747
MD5 1a1781036e61cb2c9ae71c19bc5ebd01
BLAKE2b-256 e5da1330d1727bcaa17fe320575296d872b5874538a795b0be73bb82cce06b44

File details

Details for the file spacy-2.2.0.dev17-cp36-cp36m-win_amd64.whl.

File metadata

  • Download URL: spacy-2.2.0.dev17-cp36-cp36m-win_amd64.whl
  • Upload date:
  • Size: 9.3 MB
  • Tags: CPython 3.6m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.20.1 setuptools/39.0.1 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.6.6

File hashes

Hashes for spacy-2.2.0.dev17-cp36-cp36m-win_amd64.whl
Algorithm Hash digest
SHA256 894265ebb5c155236d3878645fec56c86a27bb9b790e37eae817a82a3b428158
MD5 39cb91724c9dd4d3b45cdd7b6928bcc2
BLAKE2b-256 786d808e400990ee68a71658edc3dda0413906c981848ac86961a2a67ad4dcdf

File details

Details for the file spacy-2.2.0.dev17-cp36-cp36m-manylinux1_x86_64.whl.

File metadata

  • Download URL: spacy-2.2.0.dev17-cp36-cp36m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 10.2 MB
  • Tags: CPython 3.6m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.20.1 setuptools/39.0.1 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.6.6

File hashes

Hashes for spacy-2.2.0.dev17-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 45ce2f8a55b8b27cfb7a689b524515e1cdb686f9582d64cf2882aa1590dd01b2
MD5 6e826eea9fb894105e7a6425e03a02b3
BLAKE2b-256 46a912741c73ebf9711b0e4555609410dcd0afa5021bb8ed7c704050e4029b27

File details

Details for the file spacy-2.2.0.dev17-cp36-cp36m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl.

File metadata

File hashes

Hashes for spacy-2.2.0.dev17-cp36-cp36m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl
Algorithm Hash digest
SHA256 f0aeb840211f3dea098f69c257a4372ded255d2fcdba4ca3ea697238b087a556
MD5 7bca7f773f38756a0cdbf5ddc2e05fba
BLAKE2b-256 1dc25021338403b30a9415e3e69fed321a2e2d5f114f457da9833a0a833b72da

File details

Details for the file spacy-2.2.0.dev17-cp35-cp35m-win_amd64.whl.

File metadata

  • Download URL: spacy-2.2.0.dev17-cp35-cp35m-win_amd64.whl
  • Upload date:
  • Size: 9.2 MB
  • Tags: CPython 3.5m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.20.1 setuptools/39.0.1 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.6.6

File hashes

Hashes for spacy-2.2.0.dev17-cp35-cp35m-win_amd64.whl
Algorithm Hash digest
SHA256 c213ceef5892e376cb3c4ce59334d16665785402bbe6db16759a37a4e9a62b9b
MD5 caf3c6262b8c34663dd58bc9ff1a5923
BLAKE2b-256 6664271c0fb66c2b127576525781f12163503841542cd931ce6a433247f93bc6

File details

Details for the file spacy-2.2.0.dev17-cp35-cp35m-manylinux1_x86_64.whl.

File metadata

  • Download URL: spacy-2.2.0.dev17-cp35-cp35m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 10.1 MB
  • Tags: CPython 3.5m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.20.1 setuptools/39.0.1 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.6.6

File hashes

Hashes for spacy-2.2.0.dev17-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 13781328a24f3401710098e6b178b1a711797d58cdc15b3c660bdeb7fa30528b
MD5 da40e8e2f9f5cabd5ff3dac91aa23205
BLAKE2b-256 934ba7c179eb592a781a85fe55351a36bcd0ff9be37d28609d041cc86aaeb0e9

File details

Details for the file spacy-2.2.0.dev17-cp27-cp27mu-manylinux1_x86_64.whl.

File metadata

  • Download URL: spacy-2.2.0.dev17-cp27-cp27mu-manylinux1_x86_64.whl
  • Upload date:
  • Size: 10.2 MB
  • Tags: CPython 2.7mu
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.20.1 setuptools/39.0.1 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.6.6

File hashes

Hashes for spacy-2.2.0.dev17-cp27-cp27mu-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 c70490635c27a25eb892c49002bfba0056db0f933b00e62f9ca22b56e1b901d1
MD5 6821f9b6d17239c106becd112583e822
BLAKE2b-256 e48924caf39324218032c30df7357782f84ff30f98c530af00a6c0cafbc31717
