A tool for converting raw text to slot labels
Project description
bert_slot_tokenizer
Version 0.2
bert_slot_tokenizer is a tool that converts the slots in slot-filling tasks into other formats.
Environment:
- Python 3
- Python 2
Installation:
pip install bert-slot-tokenizer
Supported formats:
Usage:
from bert_slot_tokenizer import SlotConverter
vocab_path = 'tests/test_data/example_vocab.txt'
# you can find an example vocab file here --> https://github.com/DevRoss/bert-slot-tokenizer/blob/master/tests/test_data/example_vocab.txt
sc = SlotConverter(vocab_path, do_lower_case=True)
text = 'Too YOUNG, too simple, sometimes naive! 蛤蛤+1s'
slot = {'name': '蛤蛤', 'time': '+1s'}
output_text, iob_slot = sc.convert2iob(text, slot)
print(output_text)
# ['too', 'young', ',', 'too', 'simple', ',', 'some', '##times', 'na', '##ive', '!', '蛤', '蛤', '+', '1', '##s']
print(iob_slot)
# ['O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'B-name', 'I-name', 'B-time', 'I-time', 'I-time']
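To make the IOB output above easier to read, here is a minimal sketch of how slot values can be aligned with an already tokenized sentence. It is not the library's implementation: `tag_iob` and `slot_tokens` are hypothetical names, and the sketch assumes each slot value has already been split into tokens by the same tokenizer as the sentence.

```python
def tag_iob(tokens, slot_tokens):
    """Label each token with 'O', 'B-<slot>' or 'I-<slot>'.

    tokens: list of tokens for the whole sentence.
    slot_tokens: dict mapping a slot name to the token list of its value
                 (assumed to come from the same tokenizer as `tokens`).
    """
    tags = ['O'] * len(tokens)
    for name, value in slot_tokens.items():
        n = len(value)
        if n == 0:
            continue
        # Slide a window over the sentence and look for the slot value.
        for start in range(len(tokens) - n + 1):
            window_is_free = all(tag == 'O' for tag in tags[start:start + n])
            if tokens[start:start + n] == value and window_is_free:
                tags[start] = 'B-' + name           # first token of the slot
                for i in range(start + 1, start + n):
                    tags[i] = 'I-' + name           # remaining tokens of the slot
                break  # this sketch tags only the first occurrence
    return tags


tokens = ['too', 'young', ',', 'too', 'simple', ',', 'some', '##times',
          'na', '##ive', '!', '蛤', '蛤', '+', '1', '##s']
slot_tokens = {'name': ['蛤', '蛤'], 'time': ['+', '1', '##s']}
print(tag_iob(tokens, slot_tokens))
```

For the tokens shown, the sketch yields the same tag sequence as the `iob_slot` output above. Tagging only the first occurrence is a simplification chosen for this example.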
Closing notes:
Thanks to BERT for advancing the field of NLP
Thanks to open source
PRs and issues are welcome
Contact: devross1997@gmail.com
Download files
Source Distribution
bert_slot_tokenizer-0.2.0.tar.gz (11.9 kB)
Built Distribution
Hashes for bert_slot_tokenizer-0.2.0.tar.gz
Algorithm | Hash digest
---|---
SHA256 | f63aae491be5dd1574c1df3c46cbefbb43d8a6953991df425353923df15eeea9
MD5 | f286e09099758f527143a81f5271c61f
BLAKE2b-256 | 16dbec92e4b77ee0098a89f1e5e91e7af8a60ac9a02dc89e0863b4e9b290efba
Hashes for bert_slot_tokenizer-0.2.0-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 45874e13c80ec1094a83d4d9781a7125abbc350578cea599defa1b0a62902beb
MD5 | f9258e011811e2a8599648886fc75664
BLAKE2b-256 | d453eb4923e54609b1cc84de77c1361bd76c5970b6b39b7656a94f33a4187dc1