C Transformers
Python bindings for the Transformer models implemented in C/C++ using the GGML library.
Supported Models
Models | Model Type |
---|---|
GPT-2 | gpt2 |
GPT-J, GPT4All-J | gptj |
GPT-NeoX, StableLM | gpt_neox |
Dolly V2 | dolly-v2 |
StarCoder | starcoder |
More models coming soon.
Installation
pip install ctransformers
Usage
It provides a unified interface for all models:
from ctransformers import AutoModelForCausalLM
llm = AutoModelForCausalLM.from_pretrained('/path/to/ggml-gpt-2.bin', model_type='gpt2')
print(llm('AI is going to'))
If you are getting an illegal instruction error, try using lib='avx' or lib='basic':
llm = AutoModelForCausalLM.from_pretrained('/path/to/ggml-gpt-2.bin', model_type='gpt2', lib='avx')
It provides a generator interface for more control:
tokens = llm.tokenize('AI is going to')
for token in llm.generate(tokens):
    print(llm.detokenize(token))
This allows you to use a custom tokenizer.
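For example, here is a minimal sketch that pairs generate() with an external tokenizer, assuming the Hugging Face transformers package is installed and its gpt2 tokenizer matches the model's vocabulary:
from transformers import AutoTokenizer

# Assumption: the external tokenizer's vocabulary matches the GGML model.
tokenizer = AutoTokenizer.from_pretrained('gpt2')
tokens = tokenizer.encode('AI is going to')
for token in llm.generate(tokens):
    print(tokenizer.decode([token]), end='', flush=True)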
It also provides access to the low-level C API. See the Documentation section below.
Hugging Face Hub
It can be used with models hosted on the Hub:
llm = AutoModelForCausalLM.from_pretrained('marella/gpt-2-ggml')
If a model repo has multiple model files (.bin files), specify a model file using:
llm = AutoModelForCausalLM.from_pretrained('marella/gpt-2-ggml', model_file='ggml-model.bin')
It can also be used with your own models uploaded to the Hub. For a better user experience, upload only one model per repo.
To use it with your own model, add a config.json file to your model repo specifying the model_type:
{
    "model_type": "gpt2"
}
You can also specify additional parameters under task_specific_params.text-generation:
{
    "model_type": "gpt2",
    "task_specific_params": {
        "text-generation": {
            "top_k": 40,
            "top_p": 0.95,
            "temperature": 0.8,
            "repetition_penalty": 1.1,
            "last_n_tokens": 64
        }
    }
}
See marella/gpt-2-ggml for a minimal example and marella/gpt-2-ggml-example for a full example.
LangChain
LangChain is a framework for developing applications powered by language models. A LangChain LLM object can be created using:
from ctransformers.langchain import CTransformers
llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2')
print(llm('AI is going to'))
If you are getting an illegal instruction error, try using lib='avx' or lib='basic':
llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2', lib='avx')
It can also be used with models hosted on the Hugging Face Hub:
llm = CTransformers(model='marella/gpt-2-ggml')
Additional parameters can be passed using the config parameter:
config = {'max_new_tokens': 256, 'repetition_penalty': 1.1}
llm = CTransformers(model='marella/gpt-2-ggml', config=config)
It can be used with other LangChain modules:
from langchain import PromptTemplate, LLMChain
template = """Question: {question}
Answer:"""
prompt = PromptTemplate(template=template, input_variables=['question'])
llm_chain = LLMChain(prompt=prompt, llm=llm)
print(llm_chain.run('What is AI?'))
Documentation
Parameters
Name | Type | Description | Default |
---|---|---|---|
top_k | int | The top-k sampling parameter. | 40 |
top_p | float | The top-p sampling parameter. | 0.95 |
temperature | float | The temperature parameter. | 0.8 |
repetition_penalty | float | The repetition penalty parameter. | 1.0 |
last_n_tokens | int | Number of last tokens to use for repetition penalty. | 64 |
seed | int | Seed for sampling tokens. | Random |
max_new_tokens | int | Maximum number of new tokens to generate. | 256 |
reset | bool | Whether to reset the model state before evaluating a new prompt. | True |
batch_size | int | Batch size for evaluating tokens. | 8 |
threads | int | Number of threads to use. | Auto |
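These generation parameters can also be overridden per call; for example, a short sketch using the parameter names from the LLM.__call__ signature below:
# Override the default sampling parameters for a single call.
print(llm('AI is going to', max_new_tokens=64, temperature=0.5, top_k=20))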
class AutoModelForCausalLM
classmethod AutoModelForCausalLM.from_pretrained
from_pretrained(
    model_path_or_repo_id: 'str',
    model_type: 'Optional[str]' = None,
    model_file: 'Optional[str]' = None,
    config: 'Optional[AutoConfig]' = None,
    lib: 'Optional[str]' = None,
    **kwargs
) → LLM
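For example, a Hub model can be loaded with an explicit model file and library variant (repo and file names taken from the examples above):
llm = AutoModelForCausalLM.from_pretrained(
    'marella/gpt-2-ggml',
    model_file='ggml-model.bin',  # pick one file from a multi-file repo
    lib='avx',                    # fall back to AVX if AVX2 is unsupported
)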
class LLM
method LLM.__init__
__init__(
    model_path: str,
    model_type: str,
    config: Optional[ctransformers.llm.Config] = None,
    lib: Optional[str] = None
)
property LLM.config
property LLM.model_path
property LLM.model_type
method LLM.detokenize
detokenize(tokens: Union[Sequence[int], int]) → str
method LLM.eval
eval(
    tokens: Sequence[int],
    batch_size: Optional[int] = None,
    threads: Optional[int] = None
) → None
method LLM.generate
generate(
    tokens: Sequence[int],
    top_k: Optional[int] = None,
    top_p: Optional[float] = None,
    temperature: Optional[float] = None,
    repetition_penalty: Optional[float] = None,
    last_n_tokens: Optional[int] = None,
    seed: Optional[int] = None,
    batch_size: Optional[int] = None,
    threads: Optional[int] = None,
    reset: Optional[bool] = None
) → Generator[int, NoneType, NoneType]
method LLM.is_eos_token
is_eos_token(token: int) → bool
method LLM.reset
reset() → None
method LLM.sample
sample(
    top_k: Optional[int] = None,
    top_p: Optional[float] = None,
    temperature: Optional[float] = None,
    repetition_penalty: Optional[float] = None,
    last_n_tokens: Optional[int] = None,
    seed: Optional[int] = None
) → int
method LLM.tokenize
tokenize(text: str) → List[int]
method LLM.__call__
__call__(
    prompt: str,
    max_new_tokens: Optional[int] = None,
    top_k: Optional[int] = None,
    top_p: Optional[float] = None,
    temperature: Optional[float] = None,
    repetition_penalty: Optional[float] = None,
    last_n_tokens: Optional[int] = None,
    seed: Optional[int] = None,
    batch_size: Optional[int] = None,
    threads: Optional[int] = None,
    reset: Optional[bool] = None
) → str
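Putting the low-level methods together, here is a minimal sketch of a manual generation loop; the higher-level generate() and __call__ methods provide this behavior for you:
llm.reset()                                # start from a fresh model state
llm.eval(llm.tokenize('AI is going to'))   # evaluate the prompt
for _ in range(32):                        # sample up to 32 new tokens
    token = llm.sample()
    if llm.is_eos_token(token):
        break
    print(llm.detokenize(token), end='', flush=True)
    llm.eval([token])                      # feed the sampled token back in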
License
MIT