
Sequence labeling library using Keras.

Project description

*anaGo* is a state-of-the-art library for sequence labeling using Keras.

anaGo can perform named-entity recognition (NER), part-of-speech tagging (POS tagging), semantic role labeling (SRL), and so on.

Feature Support

anaGo provides the following features:

* training a model for your own task without any deep learning expertise
* defining your own model
* downloading pre-trained models for many tasks (e.g. NER, POS tagging, etc.)

Install

To install anaGo, simply run:

$ pip install anago

or install from the repository:

$ git clone https://github.com/Hironsan/anago.git
$ cd anago
$ pip install -r requirements.txt

Get Started

Import

First, import the necessary modules:

import os
import anago
from anago.data.reader import load_data_and_labels, load_word_embeddings
from anago.data.preprocess import prepare_preprocessor
from anago.config import ModelConfig, TrainingConfig

These cover data loading, preprocessing, and model/training configuration.

Then set the parameters used later:

DATA_ROOT = 'data/conll2003/en/ner'
SAVE_ROOT = './models'  # trained model
LOG_ROOT = './logs'     # checkpoint, tensorboard
embedding_path = './data/glove.6B/glove.6B.100d.txt'
model_config = ModelConfig()
training_config = TrainingConfig()

Loading data

After importing the modules, load the training, validation and test data:

train_path = os.path.join(DATA_ROOT, 'train.txt')
valid_path = os.path.join(DATA_ROOT, 'valid.txt')
test_path = os.path.join(DATA_ROOT, 'test.txt')
x_train, y_train = load_data_and_labels(train_path)
x_valid, y_valid = load_data_and_labels(valid_path)
x_test, y_test = load_data_and_labels(test_path)
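load_data_and_labels reads CoNLL-style files: one token per line with its tag, and a blank line separating sentences. As a rough sketch of what parsing that format involves (the helper name parse_conll is mine, not anaGo's API):

```python
def parse_conll(lines):
    """Parse CoNLL-style lines into (sentences, label_sequences).

    Each non-blank line holds a token and its tag separated by
    whitespace; a blank line ends the current sentence.
    """
    sents, labels = [], []
    words, tags = [], []
    for line in lines:
        line = line.strip()
        if not line:
            if words:
                sents.append(words)
                labels.append(tags)
                words, tags = [], []
            continue
        parts = line.split()
        words.append(parts[0])   # the token is the first column
        tags.append(parts[-1])   # the tag is the last column
    if words:  # flush the last sentence if there is no trailing blank line
        sents.append(words)
        labels.append(tags)
    return sents, labels

# A two-sentence toy sample in CoNLL format.
sample = """EU B-ORG
rejects O
German B-MISC
call O

Peter B-PER
Blackburn I-PER
"""
x, y = parse_conll(sample.splitlines())
```

Here x holds lists of tokens and y the matching lists of tags, which is the shape the later training steps expect.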

After reading the data, prepare the preprocessor and load the pre-trained word embeddings:

p = prepare_preprocessor(x_train, y_train)
embeddings = load_word_embeddings(p.vocab_word, embedding_path, model_config.word_embedding_size)
model_config.vocab_size = len(p.vocab_word)
model_config.char_vocab_size = len(p.vocab_char)
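The embedding file above is in GloVe's text format: each line is a word followed by the components of its vector. A minimal sketch of reading that format into a word-to-vector lookup (the function name load_glove is mine; anaGo's load_word_embeddings additionally aligns the vectors with the preprocessor's vocabulary):

```python
def load_glove(lines):
    """Read GloVe-format lines into a {word: vector} dict.

    Each line is: word v1 v2 ... vd, space-separated.
    """
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(' ')
        vectors[parts[0]] = [float(v) for v in parts[1:]]
    return vectors

# Two toy 3-dimensional vectors in GloVe's text format.
sample = ["the 0.1 0.2 0.3", "house -0.5 0.0 0.25"]
emb = load_glove(sample)
```

Words in the vocabulary that are missing from the embedding file are typically initialized to zero or random vectors before the matrix is handed to the model.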

Now we are ready for training :)

Training a model

Let’s train a model. To train a model, use *Trainer*, which manages the whole training process. Create a Trainer instance and pass the training and validation data to its train method:

trainer = anago.Trainer(model_config, training_config, checkpoint_path=LOG_ROOT, save_path=SAVE_ROOT,
                        preprocessor=p, embeddings=embeddings)
trainer.train(x_train, y_train, x_valid, y_valid)

If training is progressing normally, a progress bar like the following is displayed:

...
Epoch 3/15
702/703 [============================>.] - ETA: 0s - loss: 60.0129 - f1: 89.70
703/703 [==============================] - 319s - loss: 59.9278
Epoch 4/15
702/703 [============================>.] - ETA: 0s - loss: 59.9268 - f1: 90.03
703/703 [==============================] - 324s - loss: 59.8417
Epoch 5/15
702/703 [============================>.] - ETA: 0s - loss: 58.9831 - f1: 90.67
703/703 [==============================] - 297s - loss: 58.8993
...

Evaluation for a model

To evaluate the trained model, use *Evaluator*. Create an Evaluator instance and pass the test data to its eval method:

weights = os.path.join(SAVE_ROOT, 'model_weights.h5')

evaluator = anago.Evaluator(model_config, weights, save_path=SAVE_ROOT, preprocessor=p)
evaluator.eval(x_test, y_test)

After evaluation, the F1 score is printed:

- f1: 90.67
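The reported f1 is an entity-level score: an entity counts as correct only if its span and type both match the gold annotation exactly. A minimal sketch of that computation over sets of entity spans (the inputs are hypothetical, not anaGo's internal representation):

```python
def entity_f1(gold, pred):
    """Entity-level F1 over (start, end, type) spans.

    A predicted entity is a true positive only if an identical
    span with the same type appears in the gold set.
    """
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# One of two predictions is correct (precision 1/2);
# one of three gold entities is found (recall 1/3).
gold = [(0, 2, 'PER'), (5, 6, 'LOC'), (8, 9, 'ORG')]
pred = [(0, 2, 'PER'), (5, 6, 'ORG')]
f1 = entity_f1(gold, pred)
```

This is why entity-level F1 is stricter than per-token accuracy: getting part of a multi-token entity right earns no credit.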

Tagging a sentence

To tag arbitrary text, use *Tagger*. Create a Tagger instance and pass text to its tag method:

weights = os.path.join(SAVE_ROOT, 'model_weights.h5')
tagger = anago.Tagger(model_config, weights, save_path=SAVE_ROOT, preprocessor=p)

Let’s try tagging a sentence, “President Obama is speaking at the White House.” We can do it as follows:

>>> sent = 'President Obama is speaking at the White House.'
>>> print(tagger.tag(sent))
[('President', 'O'), ('Obama', 'PERSON'), ('is', 'O'),
 ('speaking', 'O'), ('at', 'O'), ('the', 'O'),
 ('White', 'LOCATION'), ('House', 'LOCATION'), ('.', 'O')]
>>> print(tagger.get_entities(sent))
{'PERSON': ['Obama'], 'LOCATION': ['White House']}
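As the White/House example shows, get_entities groups consecutive tokens sharing the same non-O tag into a single entity. A minimal sketch of that grouping over (token, tag) pairs (the function name group_entities is mine, not anaGo's API):

```python
def group_entities(pairs):
    """Group consecutive tokens with the same non-O tag
    into a {tag: [entity, ...]} dict."""
    entities = {}
    current_words, current_tag = [], None
    for word, tag in pairs + [('', 'O')]:  # sentinel flushes the last run
        if tag != current_tag:
            if current_tag is not None and current_tag != 'O':
                entities.setdefault(current_tag, []).append(' '.join(current_words))
            current_words, current_tag = [], tag
        current_words.append(word)
    return entities

pairs = [('President', 'O'), ('Obama', 'PERSON'), ('is', 'O'),
         ('speaking', 'O'), ('at', 'O'), ('the', 'O'),
         ('White', 'LOCATION'), ('House', 'LOCATION'), ('.', 'O')]
ents = group_entities(pairs)
```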

Reference

This library uses a bidirectional LSTM + CRF model, based on *Neural Architectures for Named Entity Recognition* by Lample, Guillaume, et al., NAACL 2016.
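In that architecture, the bidirectional LSTM produces a per-token score for each tag, and the CRF layer combines those emission scores with learned tag-transition scores, decoding the globally best tag sequence with the Viterbi algorithm. A minimal sketch of Viterbi decoding over toy scores (pure Python, not anaGo's implementation):

```python
def viterbi(emissions, transitions):
    """Return the highest-scoring tag sequence.

    emissions[t][j]   -- score of tag j at position t (from the BiLSTM)
    transitions[i][j] -- score of moving from tag i to tag j (from the CRF)
    """
    n_tags = len(emissions[0])
    score = list(emissions[0])  # score[j]: best path ending in tag j
    back = []                   # back-pointers for path recovery
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(n_tags):
            best_i = max(range(n_tags),
                         key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j]
                             + emissions[t][j])
            pointers.append(best_i)
        score = new_score
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(range(n_tags), key=lambda j: score[j])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

# Toy example: 2 tags over 3 tokens; transitions discourage tag changes,
# so the sequence-level optimum can differ from per-token argmaxes.
emissions = [[2.0, 0.5], [0.5, 1.0], [1.5, 0.2]]
transitions = [[0.0, -1.0], [-1.0, 0.0]]
best = viterbi(emissions, transitions)
```

Because the transition scores penalize switching tags, the decoder keeps tag 0 at the middle token even though tag 1 has the higher emission score there.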
