LLM plugin for running models using llama.cpp

These details have not been verified by PyPI

Project links

Project description

llm-llama-cpp

LLM plugin for running models using llama.cpp

Installation

Install this plugin in the same environment as llm.

llm install llm-llama-cpp

The plugin has an additional dependency on llama-cpp-python which needs to be installed separately.

If you have a C compiler available on your system you can install that like so:

llm install llama-cpp-python

You could also try installing one of the wheels made available in their latest release on GitHub. Find the URL to the wheel for your platform, if one exists, and run:

llm install https://...

If you are on an Apple Silicon Mac you can try this command, which should compile the package with METAL support for running on your GPU:

CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1 llm install llama-cpp-python

Running a GGUF model directly

The quickest way to try this plugin out is to download a GGUF file and execute that using the gguf model with the -o path PATH option:

For example, download the una-cybertron-7b-v2-bf16.Q8_0.gguf file from TheBloke/una-cybertron-7B-v2-GGUF and execute it like this:

llm -m gguf \
  -o path una-cybertron-7b-v2-bf16.Q8_0.gguf \
  'Instruction: Five reasons to get a pet walrus
Response:'

The output starts like this:

Walruses are fascinating animals that possess unique qualities that can captivate and entertain you for hours on end. Getting the chance to be around one regularly would ensure that you'll never run out of interesting things to learn about them, whether from an educational or personal standpoint.

Pet walruses can help alleviate depression and anxiety, as they require constant care and attention. Nurturing a relationship with these intelligent creatures provides comfort and fulfillment, fostering a sense of purpose in your daily life. Moreover, their playful nature encourages laughter and joy, promoting overall happiness. [...]

Adding models

You can also add or download models to execute them directly using the -m option.

This tool should work with any model that works with llama.cpp.

The plugin can download models for you. Try running this command:

llm llama-cpp download-model \
  https://huggingface.co/TheBloke/Llama-2-7b-Chat-GGUF/resolve/main/llama-2-7b-chat.Q6_K.gguf \
  --alias llama2-chat --alias l2c --llama2-chat

This will download the Llama 2 7B Chat GGUF model file (this one is 5.53GB), save it and register it with the plugin - with two aliases, llama2-chat and l2c.

The --llama2-chat option configures it to run using a special Llama 2 Chat prompt format. You should omit this for models that are not Llama 2 Chat models.

If you have already downloaded a llama.cpp compatible model you can tell the plugin to read it from its current location like this:

llm llama-cpp add-model path/to/llama-2-7b-chat.Q6_K.gguf \
  --alias l27c --llama2-chat

The model filename (minus the .gguf extension) will be registered as its ID for executing the model.

You can also set one or more aliases using the --alias option.

You can see a list of models you have registered in this way like this:

llm llama-cpp models

Models are registered in a models.json file. You can find the path to that file in order to edit it directly like so:

llm llama-cpp models-file

For example, to edit that file in Vim:

vim "$(llm llama-cpp models-file)"

To find the directory with downloaded models, run:

llm llama-cpp models-dir

Here's how to change to that directory:

cd "$(llm llama-cpp models-dir)"

Running a prompt through a model

Once you have downloaded and added a model, you can run a prompt like this:

llm -m llama-2-7b-chat.Q6_K 'five names for a cute pet skunk'

Or if you registered an alias you can use that instead:

llm -m llama2-chat 'five creative names for a pet hedgehog'

More models to try

Llama 2 7B

This model is Llama 2 7B GGML without the chat training. You'll need to prompt it slightly differently:

llm llama-cpp download-model \
  https://huggingface.co/TheBloke/Llama-2-7B-GGUF/resolve/main/llama-2-7b.Q6_K.gguf \
  --alias llama2

Try prompts that expect to be completed by the model, for example:

llm -m llama2 'Three fancy names for a posh albatross are:'

Llama 2 Chat 13B

This model is the Llama 2 13B Chat GGML model - a 10.7GB download:

llm llama-cpp download-model \
  'https://huggingface.co/TheBloke/Llama-2-13B-chat-GGUF/resolve/main/llama-2-13b-chat.Q6_K.gguf'\
  -a llama2-chat-13b --llama2-chat

Llama 2 Python 13B

This model is the Llama 2 13B Python GGML model - a 9.24GB download:

llm llama-cpp download-model \
  'https://huggingface.co/TheBloke/CodeLlama-13B-Python-GGUF/resolve/main/codellama-13b-python.Q5_K_M.gguf'\
  -a llama2-python-13b --llama2-chat

Options

The following options are available:

-o verbose 1 - output more verbose logging
-o max_tokens 100 - max tokens to return. Defaults to 4000.
-o no_gpu 1 - remove the default `n_gpu_layers=1`` argument, which should disable GPU usage
-o n_gpu_layers 10 - increase the n_gpu_layers argument to a higher value (the default is 1)
-o n_ctx 1024 - set the n_ctx argument to 1024 (the default is 4000)

For example:

llm chat -m llama2-chat-13b -o n_ctx 1024

These are mainly provided to support experimenting with different ways of executing the underlying model.

Development

To set up this plugin locally, first checkout the code. Then create a new virtual environment:

cd llm-llama-cpp
python3 -m venv venv
source venv/bin/activate

Now install the dependencies and test dependencies:

pip install -e '.[test]'

To run the tests:

pytest

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.3

Dec 9, 2023

0.2b1 pre-release

Sep 28, 2023

0.2b0 pre-release

Sep 22, 2023

0.1a0 pre-release

Aug 1, 2023

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

llm-llama-cpp-0.3.tar.gz (11.5 kB view details)

Uploaded Dec 9, 2023 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

llm_llama_cpp-0.3-py3-none-any.whl (11.6 kB view details)

Uploaded Dec 9, 2023 Python 3

File details

Details for the file llm-llama-cpp-0.3.tar.gz.

File metadata

Download URL: llm-llama-cpp-0.3.tar.gz
Upload date: Dec 9, 2023
Size: 11.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.11.7

File hashes

Hashes for llm-llama-cpp-0.3.tar.gz
Algorithm	Hash digest
SHA256	`8de7ab99fc62510e96ff8d5ca084aebf7b8f6051eae114409a7c35cd6d26c557`
MD5	`12a81286f02b77e3dbfa9ba29b10d08c`
BLAKE2b-256	`5c3a0163b63b1cce4b52877fd29bd25ec1b1335f2757c785d3f022b0f5c44997`

See more details on using hashes here.

File details

Details for the file llm_llama_cpp-0.3-py3-none-any.whl.

File metadata

Download URL: llm_llama_cpp-0.3-py3-none-any.whl
Upload date: Dec 9, 2023
Size: 11.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.11.7

File hashes

Hashes for llm_llama_cpp-0.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a67d30ca7724a202b96d6d1f6dd9cbf1a6e44bde03db03820a487282eb857588`
MD5	`1bc71704c36b1dd7d01f53c24c24a8e8`
BLAKE2b-256	`129d696c621382c439ccff25535e3b09fad5c990cfae11ec4bdf65c2eaecd690`

See more details on using hashes here.

llm-llama-cpp 0.3

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

llm-llama-cpp

Installation

Running a GGUF model directly

Adding models

Running a prompt through a model

More models to try

Llama 2 7B

Llama 2 Chat 13B

Llama 2 Python 13B

Options

Development

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes