Tair Vector Store
This notebook gives a quick demo of TairVectorStore.
import os
import sys
import logging
import textwrap
import warnings
warnings.filterwarnings("ignore")
# stop huggingface warnings
os.environ["TOKENIZERS_PARALLELISM"] = "false"
# Uncomment to see debug logs
# logging.basicConfig(stream=sys.stdout, level=logging.INFO)
# logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
from llama_index import GPTVectorStoreIndex, SimpleDirectoryReader, Document
from llama_index.vector_stores import TairVectorStore
from IPython.display import Markdown, display
Setup OpenAI
First, set the OpenAI API key. It gives access to OpenAI for embeddings and for ChatGPT.
import os
os.environ["OPENAI_API_KEY"] = "sk-<your key here>"
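Hardcoding the key in a notebook cell makes it easy to leak when the notebook is shared. As a minimal sketch of an alternative, the helper below reuses a key already present in the environment and prompts for one only if it is missing. The name ensure_openai_key and its parameters are hypothetical, introduced here for illustration.

```python
import os
from getpass import getpass


def ensure_openai_key(env=os.environ, prompt=getpass):
    """Return the OpenAI key from env, prompting once if it is missing."""
    key = env.get("OPENAI_API_KEY")
    if not key:
        # getpass hides the typed value, so the key never appears in cell output
        key = prompt("OpenAI API key: ")
        env["OPENAI_API_KEY"] = key
    return key
```

Calling ensure_openai_key() once at the top of the notebook would then replace the hardcoded assignment.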
Read in a dataset
# load documents
documents = SimpleDirectoryReader("../data/paul_graham").load_data()
print("Document ID:", documents[0].doc_id, "Document Hash:", documents[0].doc_hash)
Build index from documents
Let's build a vector index with GPTVectorStoreIndex, using TairVectorStore as its backend. Replace tair_url with the actual URL of your Tair instance.
from llama_index.storage.storage_context import StorageContext
tair_url = (
    "redis://{username}:{password}@r-bp****************.redis.rds.aliyuncs.com:{port}"
)
vector_store = TairVectorStore(
    tair_url=tair_url, index_name="pg_essays", overwrite=True
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = GPTVectorStoreIndex.from_documents(documents, storage_context=storage_context)
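The tair_url above is a template with {username}, {password}, and {port} placeholders. A small sketch of one way to fill it in is shown below. The helper build_tair_url, the example host, and the default port 6379 are assumptions made for illustration, not part of TairVectorStore. Credentials are percent-encoded so that characters such as @ or : do not break the URL. Redis-style URL parsers generally decode these escapes, but check this against your client.

```python
from urllib.parse import quote


def build_tair_url(username, password, host, port=6379):
    """Assemble a redis:// URL, percent-encoding the credentials."""
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"redis://{user}:{pwd}@{host}:{port}"


# Hypothetical values for illustration only.
print(build_tair_url("demo_user", "p@ss:word/1", "tair.example.com"))
# prints: redis://demo_user:p%40ss%3Aword%2F1@tair.example.com:6379
```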
Query the data
Now we can use the index as a knowledge base and ask it questions.
query_engine = index.as_query_engine()
response = query_engine.query("What did the author learn?")
print(textwrap.fill(str(response), 100))
response = query_engine.query("What was a hard moment for the author?")
print(textwrap.fill(str(response), 100))
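The two queries above repeat the same query-then-wrap pattern. As a rough sketch, it could be folded into a helper that takes any object with a query method. The name ask_all is hypothetical and is not part of llama_index.

```python
import textwrap


def ask_all(query_engine, questions, width=100):
    """Run each question through query_engine.query and return wrapped answers."""
    answers = {}
    for question in questions:
        response = query_engine.query(question)
        # str(response) matches how the notebook prints responses above
        answers[question] = textwrap.fill(str(response), width)
    return answers
```

With the query_engine from this notebook, ask_all(query_engine, ["What did the author learn?", "What was a hard moment for the author?"]) would return both answers keyed by question.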
Deleting documents
To delete a document from the index, use the delete method.
document_id = documents[0].doc_id
document_id
info = vector_store.client.tvs_get_index("pg_essays")
print("Number of documents", int(info["data_count"]))
vector_store.delete(document_id)
info = vector_store.client.tvs_get_index("pg_essays")
print("Number of documents", int(info["data_count"]))
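The before-and-after check above can be wrapped in a small function so that the two counts come back together. This is a sketch under assumptions: the helper names are hypothetical, and it assumes tvs_get_index returns a mapping with a data_count entry (as used in this notebook) or None when the index does not exist.

```python
def data_count(client, index_name):
    """Return the count reported for index_name, or None if the index is missing."""
    info = client.tvs_get_index(index_name)
    if info is None:  # assumption: a missing index yields None
        return None
    return int(info["data_count"])


def delete_and_report(vector_store, index_name, ref_doc_id):
    """Delete ref_doc_id and return the (before, after) counts."""
    before = data_count(vector_store.client, index_name)
    vector_store.delete(ref_doc_id)
    after = data_count(vector_store.client, index_name)
    return before, after
```

For example, delete_and_report(vector_store, "pg_essays", document_id) would perform the same steps as the cell above.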
Deleting index
Delete the entire index using the delete_index method.
vector_store.delete_index()
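# The Tair client has no _index_exists method; tvs_get_index is assumed to return None once the index is gone.
print("Check index existence:", vector_store.client.tvs_get_index("pg_essays") is not None)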