LanceDB Vector Store
In this notebook we show how to use LanceDB to perform vector search in LlamaIndex.
import logging
import sys
# Uncomment to see debug logs
# logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
# logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
from llama_index import SimpleDirectoryReader, Document, StorageContext
from llama_index.indices.vector_store import VectorStoreIndex
from llama_index.vector_stores import LanceDBVectorStore
import textwrap
Setup OpenAI
The first step is to configure the OpenAI API key. It will be used to create embeddings for the documents loaded into the index.
import os
os.environ['OPENAI_API_KEY'] = ""
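If you prefer not to hardcode the key, a minimal alternative sketch (the prompt text here is our own, not part of the original notebook) reads it from the existing environment or asks for it interactively:

import os
from getpass import getpass

# Prompt for the key only if it is not already set in the environment
if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass("OpenAI API key: ")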
Loading documents
Load the documents stored in paul_graham_essay/data using the SimpleDirectoryReader.
documents = SimpleDirectoryReader('../paul_graham_essay/data').load_data()
print('Document ID:', documents[0].doc_id, 'Document Hash:', documents[0].doc_hash)
Document ID: 0355fc6a-22a6-4959-93f2-4fbdbe08c213 Document Hash: 77ae91ab542f3abb308c4d7c77c9bc4c9ad0ccd63144802b7cbe7e1bb3a4094e
Create the index
Here we create an index backed by LanceDB using the documents loaded previously. LanceDBVectorStore takes a few arguments; a parameterized sketch follows the list.
uri (str, required): Location where LanceDB will store its files.
table_name (str, optional): The table name where the embeddings will be stored. Defaults to "vectors".
nprobes (int, optional): The number of probes used. A higher number makes search more accurate but also slower. Defaults to 20.
refine_factor (int, optional): Refine the results by reading extra elements and re-ranking them in memory. Defaults to None.
More details can be found in the LanceDB docs.
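As a hedged illustration of those optional arguments, here is a hypothetical fully parameterized store (the values simply restate the defaults plus a placeholder refine_factor, and are not tuning advice):

# Hypothetical configuration exercising the optional arguments above
vector_store = LanceDBVectorStore(
    uri='/tmp/lancedb',    # required: where LanceDB stores its files
    table_name='vectors',  # optional: table holding the embeddings (the default)
    nprobes=20,            # optional: more probes = more accurate but slower (the default)
    refine_factor=10,      # optional: re-rank 10x extra candidates in memory (placeholder)
)

The minimal construction used in this notebook is below.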
vector_store = LanceDBVectorStore(uri='/tmp/lancedb')
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total embedding token usage: 20729 tokens
Query the index
We can now ask questions using our index.
query_engine = index.as_query_engine()
response = query_engine.query("Who is the author?")
INFO:llama_index.token_counter.token_counter:> [retrieve] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [retrieve] Total embedding token usage: 5 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total LLM token usage: 1828 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total embedding token usage: 0 tokens
print(textwrap.fill(str(response), 100))
The author is Paul Graham.
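The response object also exposes the retrieved context behind the answer. A small sketch, assuming the standard Response helpers in llama_index:

# Summarize the source nodes that were retrieved to produce the answer
print(response.get_formatted_sources())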
response = query_engine.query("What did the author do growing up?")
INFO:llama_index.token_counter.token_counter:> [retrieve] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [retrieve] Total embedding token usage: 8 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total LLM token usage: 1917 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total embedding token usage: 0 tokens
print(textwrap.fill(str(response), 100))
The author grew up writing short stories and programming on the IBM 1401. He also nagged his father
to buy him a TRS-80 microcomputer, which he used to write simple games, a program to predict how
high his model rockets would fly, and a word processor. He also studied philosophy in college, but
eventually switched to AI.
Appending data
You can also add data to an existing index.
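If the index object were still in scope, a minimal in-place sketch (assuming the standard insert API; not executed here) would be:

# Hypothetical: insert a new document into the existing index in place
index.insert(Document("The sky is blue"))

The cells below instead create a fresh dataset and then add the essay documents to that same uri: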
del index
index = VectorStoreIndex.from_documents([Document("The sky is blue")], uri="/tmp/new_dataset")
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total embedding token usage: 4 tokens
query_engine = index.as_query_engine()
response = query_engine.query("Who is the author?")
print(textwrap.fill(str(response), 100))
INFO:llama_index.token_counter.token_counter:> [retrieve] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [retrieve] Total embedding token usage: 5 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total LLM token usage: 1823 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total embedding token usage: 0 tokens
The author is Paul Graham.
index = VectorStoreIndex.from_documents(documents, uri="/tmp/new_dataset")
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total embedding token usage: 20729 tokens
query_engine = index.as_query_engine()
response = query_engine.query("Who is the author?")
print(textwrap.fill(str(response), 100))
INFO:llama_index.token_counter.token_counter:> [retrieve] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [retrieve] Total embedding token usage: 5 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total LLM token usage: 1823 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total embedding token usage: 0 tokens
The author is Paul Graham.
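Since the dataset now contains both the short document and the essay, a hypothetical follow-up query against the newly added content would look like:

response = query_engine.query("What color is the sky?")
print(textwrap.fill(str(response), 100))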