Milvus Vector Store

In this notebook we show a quick demo of using the MilvusVectorStore.

import logging
import sys

# Uncomment to see debug logs
# logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
# logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

from llama_index import GPTVectorStoreIndex, SimpleDirectoryReader, Document
from llama_index.vector_stores import MilvusVectorStore
from IPython.display import Markdown, display
import textwrap

Set up OpenAI

Let's begin by adding the OpenAI API key. This will allow us to access OpenAI for embeddings and to use ChatGPT.

import os
os.environ["OPENAI_API_KEY"] = "sk-"
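Hardcoding the key is fine for a quick local demo, but for anything shared it is safer to prompt for it or read it from the environment. A minimal sketch using the standard library's getpass:

import getpass

# Prompt for the key instead of hardcoding it in the notebook
os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API key: ")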

Generate our data

With our LLM set, let's start using the Milvus index. As a first example, let's generate a document from the file found in the paul_graham_essay/data folder. This folder contains a single essay from Paul Graham, titled What I Worked On. To generate the documents we will use the SimpleDirectoryReader.

# load documents
documents = SimpleDirectoryReader('../paul_graham_essay/data').load_data()
print('Document ID:', documents[0].doc_id, 'Document Hash:', documents[0].doc_hash)
Document ID: 933666c4-2833-475a-a3d5-d279a0c174fa Document Hash: 77ae91ab542f3abb308c4d7c77c9bc4c9ad0ccd63144802b7cbe7e1bb3a4094e

Create an index across the data

Now that we have a document, we can create an index and insert the document. For the index we will use a GPTVectorStoreIndex backed by a MilvusVectorStore. MilvusVectorStore takes in a few arguments (a configuration sketch follows the list):

  • collection_name (str, optional): The name of the collection where data will be stored. Defaults to "llamalection".

  • index_params (dict, optional): The index parameters for Milvus, if none are provided an HNSW index will be used. Defaults to None.

  • search_params (dict, optional): The search parameters for a Milvus query. If none are provided, default params will be generated. Defaults to None.

  • dim (int, optional): The dimension of the embeddings. If it is not provided, collection creation will be done on first insert. Defaults to None.

  • host (str, optional): The host address of Milvus. Defaults to "localhost".

  • port (int, optional): The port of Milvus. Defaults to 19530.

  • user (str, optional): The username for RBAC. Defaults to "".

  • password (str, optional): The password for RBAC. Defaults to "".

  • use_secure (bool, optional): Use https. Defaults to False.

  • overwrite (bool, optional): Whether to overwrite existing collection with same name. Defaults to False.
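As an illustration, a MilvusVectorStore pointed at an explicit host with custom index and search parameters might look like the sketch below. The collection name and the HNSW values (M, efConstruction, ef) are illustrative assumptions, not tuned recommendations; 1536 is the dimension of OpenAI's text-embedding-ada-002 embeddings.

# A configuration sketch with explicit parameters; values are illustrative
custom_vector_store = MilvusVectorStore(
    collection_name="paul_graham_essay",  # assumed name for this demo
    dim=1536,  # dimension of OpenAI text-embedding-ada-002 embeddings
    host="localhost",
    port=19530,
    index_params={
        "index_type": "HNSW",
        "metric_type": "IP",
        "params": {"M": 8, "efConstruction": 64},
    },
    search_params={"metric_type": "IP", "params": {"ef": 64}},
    overwrite=True,
)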

# Create an index over the documents
from llama_index.storage.storage_context import StorageContext


vector_store = MilvusVectorStore(overwrite=True)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = GPTVectorStoreIndex.from_documents(documents, storage_context=storage_context)
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total embedding token usage: 17617 tokens

Query the data

Now that we have our document stored in the index, we can ask questions of the index. The index will use its stored data as the knowledge base for ChatGPT.

query_engine = index.as_query_engine()
response = query_engine.query("What did the author learn?")
INFO:llama_index.token_counter.token_counter:> [query] Total LLM token usage: 4028 tokens
INFO:llama_index.token_counter.token_counter:> [query] Total embedding token usage: 6 tokens
print(textwrap.fill(str(response), 100))
  The author learned that working on things that are not prestigious can be a good thing, as it can
lead to discovering something real and avoiding the wrong track. The author also learned that
ignorance can be beneficial, as it can lead to discovering something new and unexpected. The author
also learned the importance of working hard, even at the parts of the job they don't like, in order
to set an example for others. The author also learned the value of unsolicited advice, as it can be
beneficial in unexpected ways, such as when Robert Morris suggested that the author should make sure
Y Combinator wasn't the last cool thing they did.
response = query_engine.query("What was a hard moment for the author?")
INFO:llama_index.token_counter.token_counter:> [query] Total LLM token usage: 4072 tokens
INFO:llama_index.token_counter.token_counter:> [query] Total embedding token usage: 9 tokens
print(textwrap.fill(str(response), 100))
 A hard moment for the author was when he was dealing with urgent problems during YC and about 60%
of them had to do with Hacker News, a news aggregator he had created. He was overwhelmed by the
amount of work he had to do to keep Hacker News running, and it was taking away from his ability to
focus on other projects. He was also haunted by the idea that his own work ethic set the upper bound
for how hard everyone else worked, so he felt he had to work very hard. He was also dealing with
disputes between cofounders, figuring out when people were lying to them, and fighting with people
who maltreated the startups. On top of this, he was given unsolicited advice from Robert Morris to
make sure Y Combinator wasn't the last cool thing he did, which made him consider quitting.
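The Markdown and display imports from the first cell can be used to render a response more readably in the notebook; for example:

# Render the response as bold Markdown instead of plain text
display(Markdown(f"<b>{response}</b>"))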

This next test shows that overwriting removes the previous data.

vector_store = MilvusVectorStore(overwrite=True)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = GPTVectorStoreIndex.from_documents([Document("The answer is ten.")], storage_context=storage_context)
query_engine = index.as_query_engine()
res = query_engine.query("Who is the author?")

print(flush=True)
print("Res:", res, flush=True)
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total embedding token usage: 5 tokens
INFO:llama_index.token_counter.token_counter:> [query] Total LLM token usage: 44 tokens
INFO:llama_index.token_counter.token_counter:> [query] Total embedding token usage: 5 tokens
Res: 
The author is unknown.

The next test shows adding additional data to an already existing index.

# Deleting the local index object does not remove the data from Milvus;
# the query engine created above still holds a reference, so it keeps working
del index

print(query_engine.query("What is the answer."))

vector_store = MilvusVectorStore(overwrite=False)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = GPTVectorStoreIndex.from_documents(documents, storage_context=storage_context)
query_engine = index.as_query_engine()
print(query_engine.query("What is the answer?"))
print(query_engine.query("Who is the author?"))
INFO:llama_index.token_counter.token_counter:> [query] Total LLM token usage: 44 tokens
INFO:llama_index.token_counter.token_counter:> [query] Total embedding token usage: 5 tokens
The answer is ten.
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total embedding token usage: 17617 tokens
INFO:llama_index.token_counter.token_counter:> [query] Total LLM token usage: 41 tokens
INFO:llama_index.token_counter.token_counter:> [query] Total embedding token usage: 5 tokens
Ten.
INFO:llama_index.token_counter.token_counter:> [query] Total LLM token usage: 3720 tokens
INFO:llama_index.token_counter.token_counter:> [query] Total embedding token usage: 5 tokens
The author of the text is Paul Graham, co-founder of Y Combinator.
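When you are done experimenting, you may want to remove the collection from Milvus entirely. One way to do this, assuming pymilvus is installed and the default collection name was used, is sketched below.

# Cleanup sketch: drop the collection directly via pymilvus
from pymilvus import connections, utility

connections.connect(host="localhost", port="19530")
utility.drop_collection("llamalection")  # default collection name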