Knowledge Graph Index

Building the Knowledge Graph Index

KG-based data structures.

llama_index.indices.knowledge_graph.GPTKnowledgeGraphIndex

alias of KnowledgeGraphIndex

class llama_index.indices.knowledge_graph.KGTableRetriever(index: KnowledgeGraphIndex, query_keyword_extract_template: Optional[Prompt] = None, max_keywords_per_query: int = 10, num_chunks_per_query: int = 10, include_text: bool = True, retriever_mode: Optional[KGRetrieverMode] = KGRetrieverMode.KEYWORD, similarity_top_k: int = 2, **kwargs: Any)

KG Table Retriever.

Arguments are shared among subclasses.

Parameters
  • query_keyword_extract_template (Optional[QueryKGExtractPrompt]) – A Query KG Extraction Prompt (see Prompt Templates).

  • refine_template (Optional[RefinePrompt]) – A Refinement Prompt (see Prompt Templates).

  • text_qa_template (Optional[QuestionAnswerPrompt]) – A Question Answering Prompt (see Prompt Templates).

  • max_keywords_per_query (int) – Maximum number of keywords to extract from query.

  • num_chunks_per_query (int) – Maximum number of text chunks to query.

  • include_text (bool) – Use the document text source from each relevant triplet during queries.

  • retriever_mode (KGRetrieverMode) – Specifies whether to use keywords, embeddings, or both to find relevant triplets. Should be one of "keyword", "embedding", or "hybrid".

  • similarity_top_k (int) – The number of top embeddings to use (if embeddings are used).

retrieve(str_or_query_bundle: Union[str, QueryBundle]) List[NodeWithScore]

Retrieve nodes given query.

Parameters

str_or_query_bundle (QueryType) – Either a query string or a QueryBundle object.
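
As a rough illustration of how max_keywords_per_query and num_chunks_per_query bound a keyword-mode lookup, here is a minimal pure-Python sketch. It does not use llama_index: keyword_retrieve, the whitespace-based keyword extraction, and the subject-to-triplets table are all assumptions made for illustration. The real retriever extracts keywords with an LLM prompt and caps text chunks rather than triplets.

```python
from typing import Dict, List, Tuple

Triplet = Tuple[str, str, str]


def keyword_retrieve(
    query: str,
    table: Dict[str, List[Triplet]],
    max_keywords_per_query: int = 10,
    num_chunks_per_query: int = 10,
) -> List[Triplet]:
    """Hypothetical keyword lookup over a subject -> triplets table."""
    # Stand-in for LLM keyword extraction: lowercase words, punctuation stripped.
    keywords: List[str] = []
    for word in query.split():
        cleaned = word.strip(".,?!").lower()
        if cleaned and cleaned not in keywords:
            keywords.append(cleaned)
    keywords = keywords[:max_keywords_per_query]

    # Collect matching triplets in keyword order, skipping duplicates.
    results: List[Triplet] = []
    for keyword in keywords:
        for triplet in table.get(keyword, []):
            if triplet not in results:
                results.append(triplet)
    return results[:num_chunks_per_query]


table = {
    "alice": [("alice", "works at", "acme"), ("alice", "knows", "bob")],
    "bob": [("bob", "lives in", "paris")],
}
hits = keyword_retrieve("Where does Alice work?", table, num_chunks_per_query=1)
```

Lowering either limit trims the result: a small max_keywords_per_query ignores later query terms, while a small num_chunks_per_query truncates the matches that are returned.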

class llama_index.indices.knowledge_graph.KnowledgeGraphIndex(nodes: Optional[Sequence[Node]] = None, index_struct: Optional[KG] = None, kg_triple_extract_template: Optional[Prompt] = None, max_triplets_per_chunk: int = 10, include_embeddings: bool = False, **kwargs: Any)

Knowledge Graph Index.

Builds a KG by extracting triplets from the input text, then uses the KG at query time.

Parameters
  • kg_triple_extract_template (KnowledgeGraphPrompt) – The prompt to use for extracting triplets.

  • max_triplets_per_chunk (int) – The maximum number of triplets to extract.
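
To make the role of max_triplets_per_chunk concrete, the sketch below parses a block of text into (subject, relationship, object) triplets and stops at the limit. This is an illustration only: parse_triplets and the parenthesised, comma-separated response format are assumptions, not the library's actual parsing code.

```python
import re
from typing import List, Tuple

Triplet = Tuple[str, str, str]


def parse_triplets(response: str, max_triplets_per_chunk: int = 10) -> List[Triplet]:
    """Parse lines such as "(Alice, works at, Acme)" into triplets."""
    triplets: List[Triplet] = []
    for line in response.splitlines():
        # Stop as soon as the per-chunk limit is reached.
        if len(triplets) >= max_triplets_per_chunk:
            break
        match = re.search(r"\(([^()]*)\)", line)
        if match is None:
            continue
        parts = [part.strip() for part in match.group(1).split(",")]
        # Keep only well-formed entries with three non-empty fields.
        if len(parts) != 3 or not all(parts):
            continue
        triplets.append((parts[0], parts[1], parts[2]))
    return triplets


sample = "(Alice, works at, Acme)\nnot a triplet\n(Bob, knows)\n(Bob, lives in, Paris)"
parsed = parse_triplets(sample, max_triplets_per_chunk=2)
```

Malformed lines are skipped rather than raising, which is one reasonable choice when the text comes from a language model.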

add_node(keywords: List[str], node: Node) None

Add node.

Used for manual insertion of nodes (keyed by keywords).

Parameters
  • keywords (List[str]) – Keywords to index the node.

  • node (Node) – Node to be indexed.

delete_nodes(doc_ids: List[str], delete_from_docstore: bool = False, **delete_kwargs: Any) None

Delete a list of nodes from the index.

Parameters

doc_ids (List[str]) – A list of doc_ids of the nodes to delete.

delete_ref_doc(ref_doc_id: str, delete_from_docstore: bool = False, **delete_kwargs: Any) None

Delete a document and its nodes by using ref_doc_id.

classmethod from_documents(documents: Sequence[Document], storage_context: Optional[StorageContext] = None, service_context: Optional[ServiceContext] = None, **kwargs: Any) IndexType

Create index from documents.

Parameters

documents (Optional[Sequence[BaseDocument]]) – List of documents to build the index from.

get_networkx_graph() Any

Get networkx representation of the graph structure.

NOTE: This function requires networkx to be installed.

NOTE: This is a beta feature.
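
As a rough picture of the kind of graph structure a set of triplets describes, the sketch below builds a directed, labelled adjacency mapping with plain dictionaries. It deliberately avoids networkx and is not the library's implementation; triplets_to_adjacency is a hypothetical helper.

```python
from typing import Dict, List, Tuple

Triplet = Tuple[str, str, str]
Adjacency = Dict[str, List[Tuple[str, str]]]


def triplets_to_adjacency(triplets: List[Triplet]) -> Adjacency:
    """Map each entity to its outgoing (relationship, object) edges."""
    adjacency: Adjacency = {}
    for subject, relationship, obj in triplets:
        edges = adjacency.setdefault(subject, [])
        if (relationship, obj) not in edges:
            edges.append((relationship, obj))
        # Register the object as a node even if it has no outgoing edges.
        adjacency.setdefault(obj, [])
    return adjacency


graph = triplets_to_adjacency(
    [
        ("alice", "knows", "bob"),
        ("alice", "knows", "bob"),
        ("bob", "lives in", "paris"),
    ]
)
```

Each key is a node and each (relationship, object) pair is a labelled edge, so the same data could be loaded into a graph library as nodes and directed edges.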

property index_id: str

Get the index ID.

insert(document: Document, **insert_kwargs: Any) None

Insert a document.

property ref_doc_info: Dict[str, RefDocInfo]

Retrieve a dict mapping of ingested documents and their nodes+metadata.

refresh(documents: Sequence[Document], **update_kwargs: Any) List[bool]

Refresh an index with documents that have changed.

This allows users to save LLM and Embedding model calls, while only updating documents that have any changes in text or extra_info. It will also insert any documents that previously were not stored.
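
The change detection described above can be sketched with content hashes. This is a simplified illustration, not the library's code: the refresh function, the (doc_id, text) tuples, and the stored_hashes dictionary are hypothetical, and only the text is hashed here, whereas the description above also covers extra_info. The sketch assumes the returned list holds one flag per input document.

```python
import hashlib
from typing import Dict, List, Tuple

Doc = Tuple[str, str]  # (doc_id, text); hypothetical stand-in for Document


def _digest(text: str) -> str:
    """Return a stable fingerprint of the document text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def refresh(stored_hashes: Dict[str, str], documents: List[Doc]) -> List[bool]:
    """Return one flag per document: True if it was inserted or updated."""
    refreshed: List[bool] = []
    for doc_id, text in documents:
        new_hash = _digest(text)
        # A missing entry (new document) and a changed hash both count.
        changed = stored_hashes.get(doc_id) != new_hash
        if changed:
            # A real index would re-run its expensive model calls here.
            stored_hashes[doc_id] = new_hash
        refreshed.append(changed)
    return refreshed


store: Dict[str, str] = {}
first = refresh(store, [("a", "hello"), ("b", "world")])
second = refresh(store, [("a", "hello"), ("b", "world!"), ("c", "new")])
```

On the second call only the changed document "b" and the previously unseen document "c" are flagged, so unchanged documents cost nothing beyond a hash comparison.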

refresh_ref_docs(documents: Sequence[Document], **update_kwargs: Any) List[bool]

Refresh an index with documents that have changed.

This allows users to save LLM and Embedding model calls, while only updating documents that have any changes in text or extra_info. It will also insert any documents that previously were not stored.

set_index_id(index_id: str) None

Set the index id.

NOTE: if you decide to set the index_id on the index_struct manually, you will need to explicitly call add_index_struct on the index_store to update the index store.

Parameters

index_id (str) – Index id to set.

update(document: Document, **update_kwargs: Any) None

Update a document and its corresponding nodes.

This is equivalent to deleting the document and then inserting it again.

Parameters
  • document (Union[BaseDocument, BaseIndex]) – document to update

  • insert_kwargs (Dict) – kwargs to pass to insert

  • delete_kwargs (Dict) – kwargs to pass to delete

update_ref_doc(document: Document, **update_kwargs: Any) None

Update a document and its corresponding nodes.

This is equivalent to deleting the document and then inserting it again.

Parameters
  • document (Union[BaseDocument, BaseIndex]) – document to update

  • insert_kwargs (Dict) – kwargs to pass to insert

  • delete_kwargs (Dict) – kwargs to pass to delete

upsert_triplet(triplet: Tuple[str, str, str]) None

Insert triplets.

Used for manual insertion of KG triplets (in the form of (subject, relationship, object)).

Parameters

triplet (Tuple[str, str, str]) – Knowledge triplet, in the form (subject, relationship, object).

upsert_triplet_and_node(triplet: Tuple[str, str, str], node: Node) None

Upsert KG triplet and node.

Calls both upsert_triplet and add_node. The behavior is idempotent: if the node already exists, only the triplet is added.

Parameters
  • triplet (Tuple[str, str, str]) – Knowledge triplet, in the form (subject, relationship, object).

  • node (Node) – Node to be indexed.
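
The relationship between upsert_triplet, add_node, and upsert_triplet_and_node can be sketched with a small in-memory structure. TinyKG is a hypothetical stand-in, not the library's index struct, and keying the node by the triplet's subject and object is an assumption made for this example.

```python
from typing import Dict, List, Set, Tuple

Triplet = Tuple[str, str, str]


class TinyKG:
    """Hypothetical in-memory stand-in for a KG table."""

    def __init__(self) -> None:
        # subject -> list of (relationship, object) pairs
        self.rel_map: Dict[str, List[Tuple[str, str]]] = {}
        # keyword -> ids of the nodes indexed under that keyword
        self.keyword_to_nodes: Dict[str, Set[str]] = {}

    def upsert_triplet(self, triplet: Triplet) -> None:
        subject, relationship, obj = triplet
        edges = self.rel_map.setdefault(subject, [])
        if (relationship, obj) not in edges:
            edges.append((relationship, obj))

    def add_node(self, keywords: List[str], node_id: str) -> None:
        for keyword in keywords:
            self.keyword_to_nodes.setdefault(keyword, set()).add(node_id)

    def upsert_triplet_and_node(self, triplet: Triplet, node_id: str) -> None:
        subject, _, obj = triplet
        self.upsert_triplet(triplet)
        # Assumption: the node is keyed by the triplet's subject and object.
        self.add_node([subject, obj], node_id)


kg = TinyKG()
kg.upsert_triplet_and_node(("alice", "knows", "bob"), "node-1")
kg.upsert_triplet_and_node(("alice", "knows", "bob"), "node-1")  # no change
```

Because both underlying operations skip entries that already exist, repeating the same call leaves the structure unchanged, which is what makes the combined operation idempotent in this sketch.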