C Transformers

Python bindings for the Transformer models implemented in C/C++ using the GGML library.

Supported Models

Models               Model Type
GPT-2                gpt2
GPT-J, GPT4All-J     gptj
GPT-NeoX, StableLM   gpt_neox
Dolly V2             dolly-v2
StarCoder            starcoder

More models coming soon.
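
The model_type values above are what from_pretrained expects. As a sketch, a local StarCoder model (the file path here is hypothetical) would be loaded as:

from ctransformers import AutoModelForCausalLM

# '/path/to/ggml-starcoder.bin' is a hypothetical local model file.
llm = AutoModelForCausalLM.from_pretrained('/path/to/ggml-starcoder.bin', model_type='starcoder')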

Installation

pip install ctransformers

Usage

It provides a unified interface for all models:

from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained('/path/to/ggml-gpt-2.bin', model_type='gpt2')

print(llm('AI is going to'))

If you are getting an illegal instruction error, try using lib='avx' or lib='basic':

llm = AutoModelForCausalLM.from_pretrained('/path/to/ggml-gpt-2.bin', model_type='gpt2', lib='avx')

It provides a generator interface for more control:

tokens = llm.tokenize('AI is going to')

for token in llm.generate(tokens):
    print(llm.detokenize(token))

This allows you to use a custom tokenizer.
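
For example, here is a minimal sketch that pairs the generator interface with a Hugging Face tokenizer. It assumes the transformers package is installed and that the tokenizer's vocabulary matches the GGML model's (as it does for GPT-2):

from transformers import AutoTokenizer
from ctransformers import AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained('gpt2')  # the custom tokenizer
llm = AutoModelForCausalLM.from_pretrained('/path/to/ggml-gpt-2.bin', model_type='gpt2')

# Tokenize with the custom tokenizer instead of llm.tokenize:
tokens = tokenizer.encode('AI is going to')

generated = []
for token in llm.generate(tokens):
    generated.append(token)
    if len(generated) >= 32:  # cap the output length for this example
        break

# Detokenize with the custom tokenizer instead of llm.detokenize:
print(tokenizer.decode(generated))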

It also provides access to the low-level C API. See the Documentation section below.

Hugging Face Hub

It can be used with models hosted on the Hub:

llm = AutoModelForCausalLM.from_pretrained('marella/gpt-2-ggml')

If a model repo has multiple model files (.bin files), specify a model file using:

llm = AutoModelForCausalLM.from_pretrained('marella/gpt-2-ggml', model_file='ggml-model.bin')

It can be used with your own models uploaded on the Hub. For a better user experience, upload only one model per repo.

To use it with your own model, add a config.json file to your model repo specifying the model_type:

{
  "model_type": "gpt2"
}

You can also specify additional parameters under task_specific_params.text-generation:

{
  "model_type": "gpt2",
  "task_specific_params": {
    "text-generation": {
      "top_k": 40,
      "top_p": 0.95,
      "temperature": 0.8,
      "repetition_penalty": 1.1,
      "last_n_tokens": 64
    }
  }
}
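
These values serve as generation defaults when the model is loaded from the Hub. As a sketch, assuming the repo's config.json contains the parameters above, per-call arguments still take precedence:

llm = AutoModelForCausalLM.from_pretrained('marella/gpt-2-ggml')

# Uses top_k=40, temperature=0.8 etc. from config.json:
print(llm('AI is going to'))

# Per-call arguments override the config.json defaults:
print(llm('AI is going to', temperature=0.3))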

See marella/gpt-2-ggml for a minimal example and marella/gpt-2-ggml-example for a full example.

LangChain

LangChain is a framework for developing applications powered by language models. A LangChain LLM object can be created using:

from ctransformers.langchain import CTransformers

llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2')

print(llm('AI is going to'))

If you are getting an illegal instruction error, try using lib='avx' or lib='basic':

llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2', lib='avx')

It can also be used with models hosted on the Hugging Face Hub:

llm = CTransformers(model='marella/gpt-2-ggml')

Additional parameters can be passed using the config parameter:

config = {'max_new_tokens': 256, 'repetition_penalty': 1.1}

llm = CTransformers(model='marella/gpt-2-ggml', config=config)

It can be used with other LangChain modules:

from langchain import PromptTemplate, LLMChain

template = """Question: {question}

Answer:"""

prompt = PromptTemplate(template=template, input_variables=['question'])

llm_chain = LLMChain(prompt=prompt, llm=llm)

print(llm_chain.run('What is AI?'))

Documentation

Parameters

Name                 Type   Default  Description
top_k                int    40       The top-k sampling parameter.
top_p                float  0.95     The top-p sampling parameter.
temperature          float  0.8      The temperature parameter.
repetition_penalty   float  1.0      The repetition penalty parameter.
last_n_tokens        int    64       Number of last tokens to use for repetition penalty.
seed                 int    Random   Seed for sampling tokens.
max_new_tokens       int    256      Maximum number of new tokens to generate.
reset                bool   True     Whether to reset the model state before evaluating a new prompt.
batch_size           int    8        Batch size for evaluating tokens.
threads              int    Auto     Number of threads to use.
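
These parameters can be passed as keyword arguments to LLM.__call__ (and most of them to LLM.generate and LLM.sample, documented below) to override the defaults for a single call:

# Override sampling defaults for one call:
print(llm('AI is going to', max_new_tokens=128, temperature=0.5, seed=42))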

class AutoModelForCausalLM


classmethod AutoModelForCausalLM.from_pretrained

from_pretrained(
    model_path_or_repo_id: 'str',
    model_type: 'Optional[str]' = None,
    model_file: 'Optional[str]' = None,
    config: 'Optional[AutoConfig]' = None,
    lib: 'Optional[str]' = None,
    **kwargs
) → LLM

class LLM

method LLM.__init__

__init__(
    model_path: str,
    model_type: str,
    config: Optional[ctransformers.llm.Config] = None,
    lib: Optional[str] = None
)

property LLM.config

property LLM.model_path

property LLM.model_type

method LLM.detokenize

detokenize(tokens: Union[Sequence[int], int]) → str

method LLM.eval

eval(
    tokens: Sequence[int],
    batch_size: Optional[int] = None,
    threads: Optional[int] = None
) → None

method LLM.generate

generate(
    tokens: Sequence[int],
    top_k: Optional[int] = None,
    top_p: Optional[float] = None,
    temperature: Optional[float] = None,
    repetition_penalty: Optional[float] = None,
    last_n_tokens: Optional[int] = None,
    seed: Optional[int] = None,
    batch_size: Optional[int] = None,
    threads: Optional[int] = None,
    reset: Optional[bool] = None
) → Generator[int, NoneType, NoneType]

method LLM.is_eos_token

is_eos_token(token: int) → bool

method LLM.reset

reset() → None

method LLM.sample

sample(
    top_k: Optional[int] = None,
    top_p: Optional[float] = None,
    temperature: Optional[float] = None,
    repetition_penalty: Optional[float] = None,
    last_n_tokens: Optional[int] = None,
    seed: Optional[int] = None
) → int

method LLM.tokenize

tokenize(text: str) → List[int]
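
As a minimal sketch, the lower-level methods above can be combined into a manual generation loop (continuing with the llm object from the Usage section), roughly mirroring what the higher-level interfaces do:

tokens = llm.tokenize('AI is going to')

llm.reset()  # start from a fresh model state
llm.eval(tokens)  # evaluate the prompt

generated = []
while len(generated) < 32:  # cap the output length for this example
    token = llm.sample()
    if llm.is_eos_token(token):
        break
    generated.append(token)
    llm.eval([token])  # feed the sampled token back into the model

print(llm.detokenize(generated))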

method LLM.__call__

__call__(
    prompt: str,
    max_new_tokens: Optional[int] = None,
    top_k: Optional[int] = None,
    top_p: Optional[float] = None,
    temperature: Optional[float] = None,
    repetition_penalty: Optional[float] = None,
    last_n_tokens: Optional[int] = None,
    seed: Optional[int] = None,
    batch_size: Optional[int] = None,
    threads: Optional[int] = None,
    reset: Optional[bool] = None
) → str

License

MIT
