unbabel-comet

High-quality Machine Translation Evaluation

Project description

Quick Installation

Detailed usage examples and instructions can be found in the Full Documentation.

Simple installation from PyPI

Pre-release of version 1.0:

pip install unbabel-comet==1.0.0rc3

To develop locally, install Poetry and run the following commands:

git clone https://github.com/Unbabel/COMET
cd COMET
poetry install

Scoring MT outputs:

Via Bash:

Examples from WMT20:

echo -e "Dem Feuer konnte Einhalt geboten werden\nSchulen und Kindergärten wurden eröffnet." >> src.de
echo -e "The fire could be stopped\nSchools and kindergartens were open" >> hyp.en
echo -e "They were able to control the fire.\nSchools and kindergartens opened" >> ref.en

comet-score -s src.de -t hyp.en -r ref.en

You can select another model/metric with the --model flag. For reference-free (QE-as-a-metric) models, you don't need to pass a reference.

comet-score -s src.de -t hyp.en --model wmt21-comet-qe-da
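If you call comet-score from a Python script, the argument list can be assembled programmatically. The sketch below is only an illustration: `build_comet_score_command` is a hypothetical helper, not part of COMET, and it uses only the `-s`, `-t`, `-r` and `--model` flags shown above.

```python
from typing import List, Optional


def build_comet_score_command(
    source: str,
    hypothesis: str,
    reference: Optional[str] = None,
    model: Optional[str] = None,
) -> List[str]:
    """Assemble a comet-score argument list (hypothetical helper)."""
    command = ["comet-score", "-s", source, "-t", hypothesis]
    if reference is not None:
        # Reference-based models take -r; reference-free (QE) models skip it.
        command += ["-r", reference]
    if model is not None:
        command += ["--model", model]
    return command


# Reference-based scoring with the default model.
with_reference = build_comet_score_command("src.de", "hyp.en", reference="ref.en")

# Reference-free scoring with a QE model from the Model Zoo.
reference_free = build_comet_score_command(
    "src.de", "hyp.en", model="wmt20-comet-qe-da"
)

print(" ".join(with_reference))
print(" ".join(reference_free))
```

The resulting list can be handed to `subprocess.run(command, check=True)` once comet-score is installed.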

Following the work on Uncertainty-Aware MT Evaluation, you can use the --mc_dropout flag to get a variance/uncertainty value for each segment score. A high value means that the metric is less confident in that prediction.

comet-score -s src.de -t hyp.en -r ref.en --mc_dropout 30
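To give an intuition for what such an uncertainty value captures, here is a minimal sketch of the general Monte Carlo dropout idea: score the same segment several times with dropout active and look at the spread of the scores. This is not COMET's implementation. `make_toy_scorer` is a hypothetical stand-in that simulates a noisy model, and the exact statistic that COMET reports is not specified here.

```python
import random
import statistics
from typing import Callable, List, Tuple


def mc_dropout_estimate(
    stochastic_score: Callable[[], float], passes: int
) -> Tuple[float, float]:
    """Run several stochastic passes; return (mean, standard deviation)."""
    scores: List[float] = [stochastic_score() for _ in range(passes)]
    return statistics.mean(scores), statistics.pstdev(scores)


def make_toy_scorer(base: float, noise: float, seed: int) -> Callable[[], float]:
    """Hypothetical stand-in for a model whose dropout stays on at inference."""
    rng = random.Random(seed)
    return lambda: base + rng.gauss(0.0, noise)


# Two simulated segments with the same average score but different stability.
confident = mc_dropout_estimate(make_toy_scorer(0.6, 0.01, seed=0), passes=30)
uncertain = mc_dropout_estimate(make_toy_scorer(0.6, 0.20, seed=0), passes=30)

print("confident segment (mean, std):", confident)
print("uncertain segment (mean, std):", uncertain)
```

The second segment shows a much larger spread, which is the situation the flag is meant to surface.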

Languages Covered:

All the above-mentioned models are built on top of XLM-R, which covers the following languages:

Afrikaans, Albanian, Amharic, Arabic, Armenian, Assamese, Azerbaijani, Basque, Belarusian, Bengali, Bengali Romanized, Bosnian, Breton, Bulgarian, Burmese, Catalan, Chinese (Simplified), Chinese (Traditional), Croatian, Czech, Danish, Dutch, English, Esperanto, Estonian, Filipino, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Hausa, Hebrew, Hindi, Hindi Romanized, Hungarian, Icelandic, Indonesian, Irish, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Kurdish (Kurmanji), Kyrgyz, Lao, Latin, Latvian, Lithuanian, Macedonian, Malagasy, Malay, Malayalam, Marathi, Mongolian, Nepali, Norwegian, Oriya, Oromo, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Sanskrit, Scottish Gaelic, Serbian, Sindhi, Sinhala, Slovak, Slovenian, Somali, Spanish, Sundanese, Swahili, Swedish, Tamil, Tamil Romanized, Telugu, Telugu Romanized, Thai, Turkish, Ukrainian, Urdu, Urdu Romanized, Uyghur, Uzbek, Vietnamese, Welsh, Western Frisian, Xhosa, Yiddish.

Thus, results for language pairs containing uncovered languages are unreliable!
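A simple guard can flag uncovered languages before scoring. The sketch below is an assumption-laden illustration: `COVERED_LANGUAGES` holds only a small subset of the list above (extend it with the full list), and `uncovered_languages` is a hypothetical helper, not part of COMET.

```python
from typing import FrozenSet, Iterable, List

# Illustrative subset of the covered languages listed above.
COVERED_LANGUAGES: FrozenSet[str] = frozenset(
    {"English", "German", "Portuguese", "Chinese (Simplified)", "Swahili"}
)


def uncovered_languages(
    pair: Iterable[str], covered: FrozenSet[str] = COVERED_LANGUAGES
) -> List[str]:
    """Return the languages of a pair that are missing from the covered set."""
    return [language for language in pair if language not in covered]


for pair in [("German", "English"), ("English", "Maltese")]:
    missing = uncovered_languages(pair)
    if missing:
        # Maltese does not appear in the list above, so this pair is flagged.
        print(pair, "-> scores may be unreliable, not covered:", missing)
    else:
        print(pair, "-> both languages are covered")
```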

Scoring within Python:

COMET implements the PyTorch Lightning model interface, so inference runs through a Lightning trainer. The `predict` method used here takes the batch size and the number of GPUs as arguments.

from comet import download_model, load_from_checkpoint

# Fetch the pretrained checkpoint and load the model from it.
model_path = download_model("wmt20-comet-da")
model = load_from_checkpoint(model_path)

# Each sample holds the source, the MT output, and the reference translation.
data = [
    {
        "src": "Dem Feuer konnte Einhalt geboten werden",
        "mt": "The fire could be stopped",
        "ref": "They were able to control the fire."
    },
    {
        "src": "Schulen und Kindergärten wurden eröffnet.",
        "mt": "Schools and kindergartens were open",
        "ref": "Schools and kindergartens opened"
    }
]
# gpus=1 assumes that one GPU is available.
predictions, system_score = model.predict(data, batch_size=8, gpus=1)
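When sources, translations and references live in parallel lists (for example, lines read from src.de, hyp.en and ref.en), the `data` list can be built with a small helper. This is a minimal sketch: `build_samples` is a hypothetical function, not part of COMET, and leaving out the "ref" key for reference-free models is an assumption that mirrors the CLI behaviour described earlier.

```python
from typing import Dict, List, Optional, Sequence


def build_samples(
    sources: Sequence[str],
    translations: Sequence[str],
    references: Optional[Sequence[str]] = None,
) -> List[Dict[str, str]]:
    """Zip parallel segments into the list-of-dicts layout used by predict."""
    if len(sources) != len(translations):
        raise ValueError("sources and translations must have the same length")
    if references is not None and len(references) != len(sources):
        raise ValueError("references must have the same length as sources")

    samples: List[Dict[str, str]] = []
    for index, (src, mt) in enumerate(zip(sources, translations)):
        sample = {"src": src, "mt": mt}
        if references is not None:
            # Only reference-based models need the "ref" field.
            sample["ref"] = references[index]
        samples.append(sample)
    return samples


data = build_samples(
    ["Dem Feuer konnte Einhalt geboten werden",
     "Schulen und Kindergärten wurden eröffnet."],
    ["The fire could be stopped", "Schools and kindergartens were open"],
    ["They were able to control the fire.", "Schools and kindergartens opened"],
)
print(data)
```

The length checks catch misaligned files early, before any segment is paired with the wrong reference.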

Model Zoo:

Model	Description
`wmt20-comet-da`	RECOMMENDED: Regression model built on top of XLM-R (large), trained on DA from WMT17 to WMT19. This model was presented at the WMT20 Metrics shared task: Rei et al., 2020. Same as `wmt-large-da-estimator-1719` from previous versions.
`emnlp20-comet-rank`	Translation Ranking model built on top of XLM-R (base), trained with DARR from WMT17 and WMT18. This model was presented at EMNLP20: Rei et al., 2020.

Note: Scores between models are not comparable! Each model learns its own distribution and the scale might differ.

QE-as-a-metric:

Model	Description
`wmt20-comet-qe-da`	Reference-free Regression model built on top of XLM-R (large), trained on DA from WMT17 to WMT19. This model was presented at the WMT20 Metrics shared task: Rei et al., 2020. Same as `wmt-large-qe-estimator-1719` from previous versions.

Train your own Metric:

Instead of using pretrained models, you can train your own model with the following command:

comet-train --cfg configs/models/{your_model_config}.yaml

Tensorboard:

Launch tensorboard with:

tensorboard --logdir="lightning_logs/"

unittest:

To run the toolkit tests, run the following commands:

coverage run --source=comet -m unittest discover
coverage report -m

Publications

Project details

Release history

2.2.2

Mar 13, 2024

2.2.1

Jan 8, 2024

2.2.0

Oct 23, 2023

2.1.1

Oct 13, 2023

2.1.0

Sep 21, 2023

2.0.2

Aug 3, 2023

2.0.1

Apr 5, 2023

2.0.0

Mar 13, 2023

1.1.3

Oct 4, 2022

1.1.2

Jun 6, 2022

1.1.1

Jun 1, 2022

1.1.0

Apr 2, 2022

1.0.1

Nov 19, 2021

1.0.0

Nov 19, 2021

1.0.0rc9 pre-release

Oct 21, 2021

1.0.0rc8 pre-release

Oct 18, 2021

1.0.0rc7 pre-release

Oct 18, 2021

1.0.0rc6 pre-release

Sep 28, 2021

1.0.0rc5 pre-release

Sep 4, 2021

1.0.0rc4 pre-release

Aug 16, 2021

This version

1.0.0rc3 pre-release

Aug 15, 2021

1.0.0rc2 pre-release

Aug 10, 2021

1.0.0rc1 pre-release

Jul 27, 2021

0.1.0

Mar 11, 2021

0.0.7

Feb 9, 2021

0.0.6.post2

Nov 25, 2020

0.0.6.post1

Nov 24, 2020

0.0.6

Nov 21, 2020

0.0.4

Oct 8, 2020

0.0.3

Sep 22, 2020

0.0.2

Sep 22, 2020

0.0.1 yanked

Sep 22, 2020

Reason this release was yanked:

missing MANIFEST with reqs

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

unbabel-comet-1.0.0rc3.tar.gz (27.3 kB)

Uploaded Aug 15, 2021 Source

Built Distribution

unbabel_comet-1.0.0rc3-py3-none-any.whl (44.5 kB)

Uploaded Aug 15, 2021 Python 3

Hashes for unbabel-comet-1.0.0rc3.tar.gz
Algorithm	Hash digest
SHA256	`0ad69ff5a3ccefb01d71850b1b0096dc4c959cd4776f005f767b51f9cf0de891`
MD5	`265df9145a1ee048349136e37a85745a`
BLAKE2b-256	`b97eeb80dafddb3c9d17ba931bedcda30387197b87c84220386ca90d1731fb2a`

Hashes for unbabel_comet-1.0.0rc3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`51060e8b703b973fc46442bb1ecb72451ece0cb9e87388988d4bb4c405ae70dd`
MD5	`5cf7ec41bbe74daf1b851ef8e2a7e29c`
BLAKE2b-256	`6896fcb3fe689b3b366159d323fec11ad23eb9d06c71189bc10a0ff65c0750c3`