Document Store
- class llama_index.storage.docstore.BaseDocumentStore
- abstract delete_document(doc_id: str, raise_error: bool = True) None
Delete a document from the store.
- get_node(node_id: str, raise_error: bool = True) Node
Get node from docstore.
- Parameters
node_id (str) – node id
raise_error (bool) – raise error if node_id not found
- llama_index.storage.docstore.DocumentStore
alias of
SimpleDocumentStore
- class llama_index.storage.docstore.KVDocumentStore(kvstore: BaseKVStore, namespace: Optional[str] = None)
Document (Node) store.
NOTE: at the moment, this store is primarily used to store Node objects. Each node will be assigned an ID.
The same docstore can be reused across index structures. This allows you to reuse the same storage for multiple index structures; otherwise, each index would create a docstore under the hood.
This will use the same docstore for multiple index structures.
- Parameters
kvstore (BaseKVStore) – key-value store
namespace (str) – namespace for the docstore
- add_documents(docs: Sequence[BaseDocument], allow_update: bool = True) None
Add a document to the store.
- Parameters
docs (List[BaseDocument]) – documents
allow_update (bool) – allow update of docstore from document
- delete_document(doc_id: str, raise_error: bool = True) None
Delete a document from the store.
- property docs: Dict[str, BaseDocument]
Get all documents.
- Returns
documents
- Return type
Dict[str, BaseDocument]
- document_exists(doc_id: str) bool
Check if document exists.
- get_document(doc_id: str, raise_error: bool = True) Optional[BaseDocument]
Get a document from the store.
- Parameters
doc_id (str) – document id
raise_error (bool) – raise error if doc_id not found
- get_document_hash(doc_id: str) Optional[str]
Get the stored hash for a document, if it exists.
- get_node(node_id: str, raise_error: bool = True) Node
Get node from docstore.
- Parameters
node_id (str) – node id
raise_error (bool) – raise error if node_id not found
- get_node_dict(node_id_dict: Dict[int, str]) Dict[int, Node]
Get node dict from docstore given a mapping of index to node ids.
- Parameters
node_id_dict (Dict[int, str]) – mapping of index to node ids
- get_nodes(node_ids: List[str], raise_error: bool = True) List[Node]
Get nodes from docstore.
- Parameters
node_ids (List[str]) – node ids
raise_error (bool) – raise error if node_id not found
- set_document_hash(doc_id: str, doc_hash: str) None
Set the hash for a given doc_id.
- class llama_index.storage.docstore.MongoDocumentStore(mongo_kvstore: MongoDBKVStore, namespace: Optional[str] = None)
Mongo Document (Node) store.
A MongoDB store for Document and Node objects.
- Parameters
mongo_kvstore (MongoDBKVStore) – MongoDB key-value store
namespace (str) – namespace for the docstore
- add_documents(docs: Sequence[BaseDocument], allow_update: bool = True) None
Add a document to the store.
- Parameters
docs (List[BaseDocument]) – documents
allow_update (bool) – allow update of docstore from document
- delete_document(doc_id: str, raise_error: bool = True) None
Delete a document from the store.
- property docs: Dict[str, BaseDocument]
Get all documents.
- Returns
documents
- Return type
Dict[str, BaseDocument]
- document_exists(doc_id: str) bool
Check if document exists.
- classmethod from_host_and_port(host: str, port: int, db_name: Optional[str] = None, namespace: Optional[str] = None) MongoDocumentStore
Load a MongoDocumentStore from a MongoDB host and port.
- classmethod from_uri(uri: str, db_name: Optional[str] = None, namespace: Optional[str] = None) MongoDocumentStore
Load a MongoDocumentStore from a MongoDB URI.
- get_document(doc_id: str, raise_error: bool = True) Optional[BaseDocument]
Get a document from the store.
- Parameters
doc_id (str) – document id
raise_error (bool) – raise error if doc_id not found
- get_document_hash(doc_id: str) Optional[str]
Get the stored hash for a document, if it exists.
- get_node(node_id: str, raise_error: bool = True) Node
Get node from docstore.
- Parameters
node_id (str) – node id
raise_error (bool) – raise error if node_id not found
- get_node_dict(node_id_dict: Dict[int, str]) Dict[int, Node]
Get node dict from docstore given a mapping of index to node ids.
- Parameters
node_id_dict (Dict[int, str]) – mapping of index to node ids
- get_nodes(node_ids: List[str], raise_error: bool = True) List[Node]
Get nodes from docstore.
- Parameters
node_ids (List[str]) – node ids
raise_error (bool) – raise error if node_id not found
- set_document_hash(doc_id: str, doc_hash: str) None
Set the hash for a given doc_id.
- class llama_index.storage.docstore.SimpleDocumentStore(simple_kvstore: Optional[SimpleKVStore] = None, name_space: Optional[str] = None)
Simple Document (Node) store.
An in-memory store for Document and Node objects.
- Parameters
simple_kvstore (SimpleKVStore) – simple key-value store
name_space (str) – namespace for the docstore
- add_documents(docs: Sequence[BaseDocument], allow_update: bool = True) None
Add a document to the store.
- Parameters
docs (List[BaseDocument]) – documents
allow_update (bool) – allow update of docstore from document
- delete_document(doc_id: str, raise_error: bool = True) None
Delete a document from the store.
- property docs: Dict[str, BaseDocument]
Get all documents.
- Returns
documents
- Return type
Dict[str, BaseDocument]
- document_exists(doc_id: str) bool
Check if document exists.
- classmethod from_persist_dir(persist_dir: str = './storage', namespace: Optional[str] = None) SimpleDocumentStore
Create a SimpleDocumentStore from a persist directory.
- Parameters
persist_dir (str) – directory to persist the store
namespace (Optional[str]) – namespace for the docstore
- classmethod from_persist_path(persist_path: str, namespace: Optional[str] = None) SimpleDocumentStore
Create a SimpleDocumentStore from a persist path.
- Parameters
persist_path (str) – Path to persist the store
namespace (Optional[str]) – namespace for the docstore
- get_document(doc_id: str, raise_error: bool = True) Optional[BaseDocument]
Get a document from the store.
- Parameters
doc_id (str) – document id
raise_error (bool) – raise error if doc_id not found
- get_document_hash(doc_id: str) Optional[str]
Get the stored hash for a document, if it exists.
- get_node(node_id: str, raise_error: bool = True) Node
Get node from docstore.
- Parameters
node_id (str) – node id
raise_error (bool) – raise error if node_id not found
- get_node_dict(node_id_dict: Dict[int, str]) Dict[int, Node]
Get node dict from docstore given a mapping of index to node ids.
- Parameters
node_id_dict (Dict[int, str]) – mapping of index to node ids
- get_nodes(node_ids: List[str], raise_error: bool = True) List[Node]
Get nodes from docstore.
- Parameters
node_ids (List[str]) – node ids
raise_error (bool) – raise error if node_id not found
- persist(persist_path: str = './storage/docstore.json') None
Persist the store.
- set_document_hash(doc_id: str, doc_hash: str) None
Set the hash for a given doc_id.