Industrial-strength Natural Language Processing (NLP) with Python and Cython

<a href="https://explosion.ai"><img src="https://explosion.ai/assets/img/logo.svg" width="125" height="125" align="right" /></a>

# spaCy: Industrial-strength NLP

spaCy is a library for advanced Natural Language Processing in Python and
Cython. It's built on the very latest research, and was designed from day one
to be used in real products. spaCy comes with
[pre-trained statistical models](https://spacy.io/models) and word vectors, and
currently supports tokenization for **30+ languages**. It features the
**fastest syntactic parser** in the world, convolutional
**neural network models** for tagging, parsing and **named entity recognition**
and easy **deep learning** integration. It's commercial open-source software,
released under the MIT license.

💫 **Version 2.1 out now!** [Check out the release notes here.](https://github.com/explosion/spaCy/releases)

[![Travis Build Status](https://img.shields.io/travis/explosion/spaCy/master.svg?style=flat-square&logo=travis)](https://travis-ci.org/explosion/spaCy)
[![Appveyor Build Status](https://img.shields.io/appveyor/ci/explosion/spaCy/master.svg?style=flat-square&logo=appveyor)](https://ci.appveyor.com/project/explosion/spaCy)
[![Current Release Version](https://img.shields.io/github/release/explosion/spacy.svg?style=flat-square)](https://github.com/explosion/spaCy/releases)
[![pypi Version](https://img.shields.io/pypi/v/spacy.svg?style=flat-square)](https://pypi.python.org/pypi/spacy)
[![conda Version](https://img.shields.io/conda/vn/conda-forge/spacy.svg?style=flat-square)](https://anaconda.org/conda-forge/spacy)
[![Python wheels](https://img.shields.io/badge/wheels-%E2%9C%93-4c1.svg?longCache=true&style=flat-square&logo=python&logoColor=white)](https://github.com/explosion/wheelwright/releases)
[![spaCy on Twitter](https://img.shields.io/twitter/follow/spacy_io.svg?style=social&label=Follow)](https://twitter.com/spacy_io)

## 📖 Documentation

| Documentation | |
| --- | --- |
| [spaCy 101] | New to spaCy? Here's everything you need to know! |
| [Usage Guides] | How to use spaCy and its features. |
| [New in v2.0] | New features, backwards incompatibilities and migration guide. |
| [API Reference] | The detailed reference for spaCy's API. |
| [Models] | Download statistical language models for spaCy. |
| [Universe] | Libraries, extensions, demos, books and courses. |
| [Changelog] | Changes and version history. |
| [Contribute] | How to contribute to the spaCy project and code base. |

[spaCy 101]: https://spacy.io/usage/spacy-101
[New in v2.0]: https://spacy.io/usage/v2#migrating
[Usage Guides]: https://spacy.io/usage/
[API Reference]: https://spacy.io/api/
[Models]: https://spacy.io/models
[Universe]: https://spacy.io/universe
[Changelog]: https://spacy.io/usage/#changelog
[Contribute]: https://github.com/explosion/spaCy/blob/master/CONTRIBUTING.md

## 💬 Where to ask questions

The spaCy project is maintained by [@honnibal](https://github.com/honnibal)
and [@ines](https://github.com/ines). Please understand that we won't be able
to provide individual support via email. We also believe that help is much more
valuable if it's shared publicly, so that more people can benefit from it.

* **Bug Reports**: [GitHub Issue Tracker]
* **Usage Questions**: [Stack Overflow] · [Gitter Chat] · [Reddit User Group]
* **General Discussion**: [Gitter Chat] · [Reddit User Group]

[GitHub Issue Tracker]: https://github.com/explosion/spaCy/issues
[Stack Overflow]: http://stackoverflow.com/questions/tagged/spacy
[Gitter Chat]: https://gitter.im/explosion/spaCy
[Reddit User Group]: https://www.reddit.com/r/spacynlp

## Features

* **Fastest syntactic parser** in the world
* **Named entity** recognition
* Non-destructive **tokenization**
* Support for **30+ languages**
* Pre-trained [statistical models](https://spacy.io/models) and word vectors
* Easy **deep learning** integration
* Part-of-speech tagging
* Labelled dependency parsing
* Syntax-driven sentence segmentation
* Built-in **visualizers** for syntax and NER
* Convenient string-to-hash mapping
* Export to numpy data arrays
* Efficient binary serialization
* Easy **model packaging** and deployment
* State-of-the-art speed
* Robust, rigorously evaluated accuracy

📖 **For more details, see the
[facts, figures and benchmarks](https://spacy.io/usage/facts-figures).**
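
As a small illustration of what "non-destructive" tokenization in the list above means, the sketch below checks that the original text can be rebuilt from its tokens. It assumes Python 3 with spaCy v2.x installed and uses `spacy.blank('en')`, so no statistical model is required. The helper name `roundtrip` is hypothetical, and it returns `None` when spaCy cannot be imported.

```python
def roundtrip(text):
    """Tokenize `text` and rebuild it from the tokens.

    Hypothetical helper for illustration; returns None if spaCy is missing.
    """
    try:
        import spacy
    except ImportError:
        return None
    nlp = spacy.blank("en")  # tokenizer only, no statistical model needed
    doc = nlp(text)
    # text_with_ws is the token text plus any trailing whitespace
    return "".join(token.text_with_ws for token in doc)


text = "Hello,  world! This is   spaCy."
rebuilt = roundtrip(text)
# Whitespace and punctuation should be preserved exactly
print(rebuilt is None or rebuilt == text)
```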

## Install spaCy

For detailed installation instructions, see the
[documentation](https://spacy.io/usage).

* **Operating system**: macOS / OS X · Linux · Windows (Cygwin, MinGW, Visual Studio)
* **Python version**: Python 2.7, 3.4+ (only 64 bit)
* **Package managers**: [pip] · [conda] (via `conda-forge`)

[pip]: https://pypi.python.org/pypi/spacy
[conda]: https://anaconda.org/conda-forge/spacy
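
To check the requirements above before installing, a minimal sketch like the following may help. It only uses the standard library; the function names `is_64bit` and `supported_python` are hypothetical, and the version rule simply encodes "Python 2.7 or 3.4+" as stated in the list.

```python
import struct
import sys


def is_64bit():
    """Return True if the running interpreter is a 64-bit build."""
    # Size of a C pointer in bytes, times 8, gives the interpreter word size
    return struct.calcsize("P") * 8 == 64


def supported_python(version_info=sys.version_info):
    """Return True for Python 2.7 or Python 3.4 and later."""
    major, minor = version_info[0], version_info[1]
    return (major, minor) == (2, 7) or (major, minor) >= (3, 4)


print("64-bit interpreter:", is_64bit())
print("Supported Python version:", supported_python())
```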

### pip

Using pip, spaCy releases are available as source packages and binary wheels
(as of `v2.0.13`).

```bash
pip install spacy
```

When using pip it is generally recommended to install packages in a virtual
environment to avoid modifying system state:

```bash
python -m venv .env
source .env/bin/activate
pip install spacy
```
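
If you are unsure whether the virtual environment is actually active, a quick check from Python can confirm it. This is a sketch using only the standard library; the helper name `in_virtualenv` is hypothetical. It relies on `sys.base_prefix` (set by `venv`) and `sys.real_prefix` (set by older versions of `virtualenv`).

```python
import sys


def in_virtualenv():
    """Return True if the interpreter appears to run inside a virtual environment."""
    # Older virtualenv releases expose the original prefix as sys.real_prefix
    if hasattr(sys, "real_prefix"):
        return True
    # venv leaves sys.base_prefix pointing at the base installation
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print("Inside a virtual environment:", in_virtualenv())
```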

### conda

Thanks to our great community, we've finally re-added conda support. You can now
install spaCy via `conda-forge`:

```bash
conda config --add channels conda-forge
conda install spacy
```

For the feedstock including the build recipe and configuration,
check out [this repository](https://github.com/conda-forge/spacy-feedstock).
Improvements and pull requests to the recipe and setup are always appreciated.

### Updating spaCy

Some updates to spaCy may require downloading new statistical models. If you're
running spaCy v2.0 or higher, you can use the `validate` command to check if
your installed models are compatible and if not, print details on how to update
them:

```bash
pip install -U spacy
python -m spacy validate
```

If you've trained your own models, keep in mind that your training and runtime
inputs must match. After updating spaCy, we recommend **retraining your models**
with the new version.

📖 **For details on upgrading from spaCy 1.x to spaCy 2.x, see the
[migration guide](https://spacy.io/usage/v2#migrating).**

## Download models

As of v1.7.0, models for spaCy can be installed as **Python packages**.
This means that they're a component of your application, just like any
other module. Models can be installed using spaCy's `download` command,
or manually by pointing pip to a path or URL.

| Documentation | |
| --- | --- |
| [Available Models] | Detailed model descriptions, accuracy figures and benchmarks. |
| [Models Documentation] | Detailed usage instructions. |

[Available Models]: https://spacy.io/models
[Models Documentation]: https://spacy.io/docs/usage/models

```bash
# out-of-the-box: download best-matching default model
python -m spacy download en

# download best-matching version of specific model for your spaCy installation
python -m spacy download en_core_web_lg

# pip install .tar.gz archive from path or URL
pip install /Users/you/en_core_web_sm-2.0.0.tar.gz
```

### Loading and using models

To load a model, use `spacy.load()` with the model's shortcut link:

```python
import spacy
nlp = spacy.load('en')
doc = nlp(u'This is a sentence.')
```

If you've installed a model via pip, you can also `import` it directly and
then call its `load()` method:

```python
import spacy
import en_core_web_sm

nlp = en_core_web_sm.load()
doc = nlp(u'This is a sentence.')
```
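
Because models are regular Python packages, you can also check whether one is installed before trying to load it. The following is a minimal sketch, assuming Python 3; the helper name `load_model_package` is hypothetical, and it relies only on the package-level `load()` method shown above.

```python
import importlib
import importlib.util


def load_model_package(name):
    """Import an installed model package by name and call its load() method.

    Hypothetical helper for illustration; raises ImportError with a hint
    if the package is not installed.
    """
    if importlib.util.find_spec(name) is None:
        raise ImportError(
            "Model package '{0}' is not installed. "
            "Try: python -m spacy download {0}".format(name)
        )
    return importlib.import_module(name).load()


try:
    nlp = load_model_package("en_core_web_sm")
    doc = nlp("This is a sentence.")
    print([token.text for token in doc])
except ImportError as err:
    print(err)
```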

📖 **For more info and examples, check out the
[models documentation](https://spacy.io/docs/usage/models).**

### Support for older versions

If you're using an older version (`v1.6.0` or below), you can still download
and install the old models from within spaCy using `python -m spacy.en.download all`
or `python -m spacy.de.download all`. The `.tar.gz` archives are also
[attached to the v1.6.0 release](https://github.com/explosion/spaCy/tree/v1.6.0).
To download and install the models manually, unpack the archive, drop the
contained directory into `spacy/data` and load the model via `spacy.load('en')`
or `spacy.load('de')`.

## Compile from source

The other way to install spaCy is to clone its
[GitHub repository](https://github.com/explosion/spaCy) and build it from
source. This is the usual approach if you want to make changes to the code base.
You'll need a development environment consisting of a Python distribution
including header files, a compiler,
[pip](https://pip.pypa.io/en/latest/installing/),
[virtualenv](https://virtualenv.pypa.io/) and [git](https://git-scm.com).
Setting up the compiler is the trickiest part, and how to do it depends on your
system. See the notes on Ubuntu, macOS / OS X and Windows below for details.

```bash
# make sure you are using the latest pip
python -m pip install -U pip
git clone https://github.com/explosion/spaCy
cd spaCy

python -m venv .env
source .env/bin/activate
export PYTHONPATH=`pwd`
pip install -r requirements.txt
python setup.py build_ext --inplace
```

Compared to a regular install via pip, [requirements.txt](requirements.txt)
additionally installs developer dependencies such as Cython. For more details
and instructions, see the documentation on
[compiling spaCy from source](https://spacy.io/usage/#source) and use the
[quickstart widget](https://spacy.io/usage/#section-quickstart) to get
the right commands for your platform and Python version.

### Ubuntu

Install system-level dependencies via `apt-get`:

```bash
sudo apt-get install build-essential python-dev git
```

### macOS / OS X

Install a recent version of [Xcode](https://developer.apple.com/xcode/),
including the so-called "Command Line Tools". macOS and OS X ship with Python
and git preinstalled.

### Windows

Install a version of the [Visual C++ Build Tools](https://visualstudio.microsoft.com/visual-cpp-build-tools/) or
[Visual Studio Express](https://www.visualstudio.com/vs/visual-studio-express/)
that matches the version that was used to compile your Python
interpreter. For official distributions these are VS 2008 (Python 2.7),
VS 2010 (Python 3.4) and VS 2015 (Python 3.5).
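
To find out which compiler your Python interpreter was built with, you can ask the interpreter itself. The sketch below uses only the standard library `platform` module; the helper name `build_info` is hypothetical. On official Windows builds the compiler string contains an `MSC v.` number, for example `MSC v.1500` for VS 2008, `MSC v.1600` for VS 2010 and `MSC v.1900` for VS 2015.

```python
import platform


def build_info():
    """Collect basic facts about how the running interpreter was built."""
    return {
        "python": platform.python_version(),      # e.g. '3.6.6'
        "compiler": platform.python_compiler(),   # e.g. 'MSC v.1900 64 bit (AMD64)'
        "bits": platform.architecture()[0],       # e.g. '64bit'
    }


for key, value in sorted(build_info().items()):
    print("{}: {}".format(key, value))
```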

## Run tests

spaCy comes with an [extensive test suite](spacy/tests). In order to run the
tests, you'll usually want to clone the repository and build spaCy from source.
This will also install the required development dependencies and test utilities
defined in the `requirements.txt`.

Alternatively, you can find out where spaCy is installed and run `pytest` on
that directory. Don't forget to also install the test utilities via spaCy's
`requirements.txt`:

```bash
python -c "import os; import spacy; print(os.path.dirname(spacy.__file__))"
pip install -r path/to/requirements.txt
python -m pytest <spacy-directory>
```

See [the documentation](https://spacy.io/usage/#tests) for more details and
examples.
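
As an alternative to the one-liner above, the installation directory can be located without importing the package itself, which can be handy if the import fails. This is a minimal sketch assuming Python 3; the helper name `package_dir` is hypothetical.

```python
import importlib.util
import os


def package_dir(name):
    """Return the directory of an installed package, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    # For a package, origin points at its __init__ file
    return os.path.dirname(spec.origin)


spacy_dir = package_dir("spacy")
if spacy_dir is None:
    print("spaCy does not appear to be installed in this environment.")
else:
    print("Run the tests with: python -m pytest " + spacy_dir)
```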
