
Chroma.


Chroma - the open-source search engine for AI.
The fastest way to build Python or JavaScript LLM apps that search over your data!

Discord | License | Docs | Homepage

pip install chromadb # python client
# for javascript, npm install chromadb!
# for client-server mode, chroma run --path /chroma_db_path

Chroma Cloud

Our hosted service, Chroma Cloud, powers serverless vector, hybrid, and full-text search. It's extremely fast, cost-effective, scalable and painless. Create a DB and try it out in under 30 seconds with $5 of free credits.

Get started with Chroma Cloud

API

The core API is only 4 functions (run our 💡 Google Colab):

import chromadb
# setup Chroma in-memory, for easy prototyping. Can add persistence easily!
client = chromadb.Client()

# Create collection. get_collection, get_or_create_collection, delete_collection also available!
collection = client.create_collection("all-my-documents")

# Add docs to the collection. Can also update and delete. Row-based API coming soon!
collection.add(
    # Chroma handles tokenization, embedding, and indexing automatically.
    # You can skip that and add your own embeddings as well.
    documents=["This is document1", "This is document2"],
    metadatas=[{"source": "notion"}, {"source": "google-docs"}],  # filter on these!
    ids=["doc1", "doc2"],  # unique for each doc
)

# Query/search 2 most similar results. You can also .get by id
results = collection.query(
    query_texts=["This is a query document"],
    n_results=2,
    # where={"metadata_field": "is_equal_to_this"}, # optional filter
    # where_document={"$contains":"search_string"}  # optional filter
)
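
A query returns its results grouped per query text: each field holds one inner list for every entry in query_texts. The sketch below shows one way to turn that shape into per-document rows. It runs on a hand-written dictionary that mimics the result of the query above, so it needs no Chroma installation. The helper name flatten_query_results and the distance values are illustrative assumptions, not part of Chroma's API or real output.

```python
from typing import Any


def flatten_query_results(results: dict[str, Any], query_index: int = 0) -> list[dict[str, Any]]:
    """Turn one query's column-oriented results into a list of per-document rows."""
    rows = []
    for position, doc_id in enumerate(results["ids"][query_index]):
        row = {"id": doc_id}
        # Optional fields may be absent or None depending on what was requested.
        for key in ("documents", "metadatas", "distances"):
            column = results.get(key)
            if column is not None:
                # "documents" -> "document", "metadatas" -> "metadata", and so on.
                row[key[:-1]] = column[query_index][position]
        rows.append(row)
    return rows


# Hand-written stand-in for the value returned by collection.query(...).
# The distances are placeholders chosen for illustration only.
sample_results = {
    "ids": [["doc1", "doc2"]],
    "documents": [["This is document1", "This is document2"]],
    "metadatas": [[{"source": "notion"}, {"source": "google-docs"}]],
    "distances": [[0.25, 0.5]],
}

rows = flatten_query_results(sample_results)
for row in rows:
    print(row["id"], row["distance"], row["metadata"]["source"])
```

With a real collection, passing the results variable from the query above in place of sample_results should work the same way, assuming the default fields are returned.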

Learn about all features on our Docs

Features

  • Simple: Fully-typed, fully-tested, fully-documented == happiness
  • Integrations: 🦜️🔗 LangChain (python and js), 🦙 LlamaIndex and more soon
  • Dev, Test, Prod: the same API that runs in your Python notebook scales to your cluster
  • Feature-rich: Queries, filtering, regex and more
  • Free & Open Source: Apache 2.0 Licensed

Use case: ChatGPT for ______

For example, the "Chat your data" use case:

  1. Add documents to your database. You can pass in your own embeddings, embedding function, or let Chroma embed them for you.
  2. Query relevant documents with natural language.
  3. Compose documents into the context window of an LLM like GPT-4 for additional summarization or analysis.
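
Step 3 can be sketched as a small function that packs retrieved documents into a prompt. This is a minimal sketch, not a recommended prompt format: the name build_prompt, the prompt wording, and the character budget max_chars (a rough stand-in for a token limit) are all assumptions made for illustration.

```python
def build_prompt(question: str, documents: list[str], max_chars: int = 2000) -> str:
    """Pack retrieved documents into a prompt, stopping before max_chars is exceeded."""
    context_parts = []
    used = 0
    for index, text in enumerate(documents, start=1):
        entry = f"[{index}] {text}"
        # Stop adding documents once the context budget would be exceeded.
        if used + len(entry) > max_chars:
            break
        context_parts.append(entry)
        used += len(entry)
    context = "\n".join(context_parts)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# In practice these would be the documents returned by a query (step 2).
retrieved = ["This is document1", "This is document2"]
prompt = build_prompt("What do the documents say?", retrieved)
print(prompt)
```

The resulting string is what you would send to the LLM of your choice. Documents are added in retrieval order, so the most similar ones survive when the budget is tight.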

Embeddings?

What are embeddings?

  • Read the guide from OpenAI
  • Literal: Embedding something turns it from image/text/audio into a list of numbers. 🖼️ or 📄 => [1.2, 2.1, ....]. This process makes documents "understandable" to a machine learning model.
  • By analogy: An embedding represents the essence of a document. This enables documents and queries with the same essence to be "near" each other and therefore easy to find.
  • Technical: An embedding is the latent-space position of a document at a layer of a deep neural network. For models trained specifically to embed data, this is the last layer.
  • A small example: Suppose you search your photos for "famous bridge in San Francisco". Embedding this query and comparing it to the embeddings of your photos and their metadata should return photos of the Golden Gate Bridge.

Chroma allows you to store these vectors or embeddings and search by nearest neighbors rather than by substrings like a traditional database. By default, Chroma uses Sentence Transformers to embed for you but you can also use OpenAI embeddings, Cohere (multilingual) embeddings, or your own.
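
To make "search by nearest neighbors" concrete, here is a toy sketch in plain Python. The three-dimensional vectors are made up for illustration (real embeddings have hundreds of dimensions and come from a model), and the brute-force cosine-distance ranking is a simplification: Chroma's own index and distance metric may differ.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity; smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query: list[float], vectors: dict[str, list[float]], n_results: int = 2) -> list[str]:
    """Rank every stored vector by distance to the query and keep the closest ones."""
    ranked = sorted(vectors, key=lambda name: cosine_distance(query, vectors[name]))
    return ranked[:n_results]


# Made-up embeddings: the first axis loosely stands for "bridge-ness".
photo_embeddings = {
    "golden_gate.jpg": [0.9, 0.1, 0.0],
    "bay_bridge.jpg": [0.7, 0.3, 0.1],
    "cat.jpg": [0.0, 0.2, 0.9],
}
query_embedding = [1.0, 0.0, 0.0]  # pretend embedding of "famous bridge in San Francisco"

print(nearest(query_embedding, photo_embeddings))
```

The two bridge photos rank ahead of the cat photo because their vectors point in nearly the same direction as the query, even though no substring of the query appears in any file name.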

Get involved

Chroma is a rapidly developing project. We welcome PR contributors and ideas for how to improve the project.

Release Cadence

We currently release new tagged versions of the PyPI and npm packages on Mondays. Hotfixes go out at any time during the week.

License

Apache 2.0



