The rchroma package provides an R interface to ChromaDB, a vector database for storing and querying embeddings. This vignette demonstrates the basic usage of the package.
Before using rchroma, you need a running ChromaDB instance. The easiest way to get started is with the package's Docker helper functions:
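A minimal sketch of that step, assuming the helpers are named `chroma_docker_run()` and `chroma_docker_stop()` (verify against the package index with `help(package = "rchroma")`) and that Docker is installed and running on your machine:

```r
library(rchroma)

# Assumed helper name: start a ChromaDB container through Docker
chroma_docker_run()

# Assumed helper name: stop the container again once you are finished
# chroma_docker_stop()
```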
This will start a ChromaDB server on http://localhost:8000.
For other installation methods and configuration options, please refer to the ChromaDB documentation.
First, we need to establish a connection to ChromaDB:
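A minimal sketch, assuming the connection function is `chroma_connect()` and that its defaults point at a server on http://localhost:8000; `heartbeat()` is likewise an assumed name for a liveness check:

```r
library(rchroma)

# Assumed API: connect to the local ChromaDB server with default settings
client <- chroma_connect()

# Assumed API: confirm that the server responds before continuing
heartbeat(client)
```

The resulting `client` object is what calls such as `add_documents()` and `query()` take as their first argument.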
Collections are the main way to organize your data in ChromaDB:
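A minimal sketch, assuming `create_collection()` and `list_collections()` exist under these names and take the client as their first argument, in the same way `add_documents()` does later in this vignette:

```r
# `client` is the connection object created in the previous step

# Assumed API: create the collection used in the rest of this vignette
create_collection(client, "my_collection")

# Assumed API: list the existing collections to confirm it was created
list_collections(client)
```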
Documents are the basic unit of data in ChromaDB. Each document consists of text content and its associated embedding:
```r
# Documents and their embeddings
docs <- c(
  "apple fruit",
  "banana fruit",
  "carrot vegetable"
)
embeddings <- list(
  c(1.0, 0.0, 0.0), # apple
  c(0.8, 0.2, 0.0), # banana (similar to apple)
  c(0.0, 0.0, 1.0)  # carrot (different)
)

# Add documents to the collection
add_documents(
  client,
  "my_collection",
  documents = docs,
  ids = c("doc1", "doc2", "doc3"),
  embeddings = embeddings
)
```
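To see why the query that follows should rank "apple" first, the toy embeddings can be compared directly in base R. The sketch below is independent of rchroma: `toy_embeddings`, `sq_l2()`, and `cosine_sim()` are illustrative names, and which metric the server actually applies depends on how the collection is configured, so both squared L2 distance and cosine similarity are shown.

```r
# Same toy vectors as above, named for readability (illustrative only)
toy_embeddings <- list(
  apple  = c(1.0, 0.0, 0.0),
  banana = c(0.8, 0.2, 0.0),
  carrot = c(0.0, 0.0, 1.0)
)
query_vec <- c(1.0, 0.0, 0.0)

# Squared Euclidean distance: smaller means more similar
sq_l2 <- function(a, b) sum((a - b)^2)

# Cosine similarity: larger means more similar
cosine_sim <- function(a, b) {
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

distances <- vapply(toy_embeddings, sq_l2, numeric(1), b = query_vec)
similarities <- vapply(toy_embeddings, cosine_sim, numeric(1), b = query_vec)

# Two nearest neighbours under each measure
nearest_by_distance <- names(sort(distances))[1:2]
nearest_by_cosine <- names(sort(similarities, decreasing = TRUE))[1:2]
```

Under both measures the two nearest neighbours of the query vector are "apple" and then "banana", which is what a query with `n_results = 2` would be expected to return.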
```r
# Query similar documents using embeddings
results <- query(
  client,
  "my_collection",
  query_embeddings = list(c(1.0, 0.0, 0.0)), # should match apple best
  n_results = 2
)
```

You can update or delete documents as needed:
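A minimal sketch, assuming `update_documents()` and `delete_documents()` exist under these names and mirror the argument names of `add_documents()`; the replacement text and embedding are made up for illustration:

```r
# Assumed API: replace the text and embedding stored under an existing id
update_documents(
  client,
  "my_collection",
  ids = "doc1",
  documents = "apple fruit (updated)",
  embeddings = list(c(0.9, 0.1, 0.0))
)

# Assumed API: remove a document by id
delete_documents(client, "my_collection", ids = "doc3")
```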