Wrappers for including pre-trained transformers in spaCy pipelines
spaCy-wrap: For Wrapping fine-tuned transformers in spaCy pipelines
spaCy-wrap is a minimal library for wrapping fine-tuned transformers from the Hugging Face model hub in your spaCy pipeline, so that existing models can be used within spaCy workflows.
As far as possible, it follows an API similar to that of spacy-transformers.
Installation
Installing spacy-wrap is simple using pip:
pip install spacy_wrap
There is no need to install from GitHub, as the version on PyPI should always match the one on GitHub.
Example
The following example shows how you can quickly add a fine-tuned transformer model from the Hugging Face model hub. Here we use the sentiment model by Barbieri et al. (2020) to classify whether a tweet is positive, negative, or neutral. We add this model to a blank English pipeline:
import spacy
import spacy_wrap

nlp = spacy.blank("en")

config = {
    "doc_extension_trf_data": "clf_trf_data",  # document extension for the forward pass
    "doc_extension_prediction": "sentiment",  # document extension for the prediction
    "labels": ["negative", "neutral", "positive"],
    "model": {
        "name": "cardiffnlp/twitter-roberta-base-sentiment",  # the model name or path of the Hugging Face model
    },
}

transformer = nlp.add_pipe("classification_transformer", config=config)

doc = nlp("spaCy is a wonderful tool")

print(doc._.clf_trf_data)
# TransformerData(wordpieces=...
print(doc._.sentiment)
# 'positive'
print(doc._.sentiment_prob)
# {'prob': array([0.004, 0.028, 0.969], dtype=float32), 'labels': ['negative', 'neutral', 'positive']}
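The `sentiment_prob` output above pairs a probability array with a list of labels. As a minimal sketch of how such a dictionary could be reduced to the most probable label and its score, the helper below (`top_label` is a hypothetical name, not part of spacy-wrap) works on any mapping with `prob` and `labels` keys. A plain list stands in for the NumPy array so the snippet runs without loading the model.

```python
from typing import Mapping, Sequence, Tuple


def top_label(prediction: Mapping[str, Sequence]) -> Tuple[str, float]:
    """Return the most probable label and its probability.

    `prediction` is assumed to have the same shape as the
    `doc._.sentiment_prob` output shown above:
    {'prob': <sequence of floats>, 'labels': <sequence of str>}.
    """
    probs = [float(p) for p in prediction["prob"]]
    labels = list(prediction["labels"])
    if len(probs) != len(labels):
        raise ValueError("'prob' and 'labels' must have the same length")
    # Index of the largest probability.
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Values copied from the example output above.
example = {
    "prob": [0.004, 0.028, 0.969],
    "labels": ["negative", "neutral", "positive"],
}
label, prob = top_label(example)
print(label, prob)  # positive 0.969
```

With a real pipeline, the same helper could presumably be called as `top_label(doc._.sentiment_prob)`.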
These pipelines can also easily be applied to multiple documents using nlp.pipe,
as one would expect from a spaCy component:
docs = nlp.pipe(
    [
        "I hate wrapping my own models",
        "Isn't there a tool for this?",
        "spacy-wrap is great for wrapping models",
    ]
)

for doc in docs:
    print(doc._.sentiment)
# 'negative'
# 'neutral'
# 'positive'
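When many documents are processed this way, it is often handy to summarise the predicted labels. The following is a small sketch, not part of spacy-wrap: `label_shares` is a hypothetical helper, and a list of strings stands in for the labels that would come from `doc._.sentiment`, so the snippet runs without downloading the model.

```python
from collections import Counter
from typing import Dict, Iterable


def label_shares(labels: Iterable[str]) -> Dict[str, float]:
    """Return the fraction of documents assigned to each label."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


# Stand-in predictions; with a real pipeline this could be
# label_shares(doc._.sentiment for doc in nlp.pipe(texts)).
predicted = ["negative", "neutral", "positive", "positive"]
print(label_shares(predicted))
# {'negative': 0.25, 'neutral': 0.25, 'positive': 0.5}
```

Passing a generator keeps the streaming behaviour of nlp.pipe, since the documents are consumed one at a time.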
More Examples
Here is a second example, in which we add the DaNLP hate speech detection model for Danish to a blank Danish pipeline:
import spacy
import spacy_wrap

nlp = spacy.blank("da")

config = {
    "doc_extension_trf_data": "clf_trf_data",  # document extension for the forward pass
    "doc_extension_prediction": "hate_speech",  # document extension for the prediction
    "labels": ["Not hate Speech", "Hate speech"],
    "model": {
        "name": "DaNLP/da-bert-hatespeech-detection",  # the model name or path of the Hugging Face model
    },
}

transformer = nlp.add_pipe("classification_transformer", config=config)

doc = nlp("Senile gamle idiot")  # old senile idiot

doc._.clf_trf_data
# TransformerData(wordpieces=...
doc._.hate_speech
# "Hate speech"
doc._.hate_speech_prob
# {'prob': array([0.013, 0.987], dtype=float32), 'labels': ['Not hate Speech', 'Hate speech']}
📖 Documentation
Documentation | |
---|---|
🔧 Installation | Installation instructions for spacy-wrap. |
📰 News and changelog | New additions, changes and version history. |
🎛 Documentation | The reference for spacy-wrap's API. |
💬 Where to ask questions
Type | |
---|---|
🚨 FAQ | FAQ |
🚨 Bug Reports | GitHub Issue Tracker |
🎁 Feature Requests & Ideas | GitHub Issue Tracker |
👩💻 Usage Questions | GitHub Discussions |
🗯 General Discussion | GitHub Discussions |