Structured Store Index

Structured store indices.

class llama_index.indices.struct_store.GPTNLPandasQueryEngine(index: GPTPandasIndex, instruction_str: Optional[str] = None, output_processor: Optional[Callable] = None, pandas_prompt: Optional[Prompt] = None, output_kwargs: Optional[dict] = None, head: int = 5, verbose: bool = False, **kwargs: Any)

GPT Pandas query.

Convert natural language queries into Pandas Python code.

Parameters
  • index (GPTPandasIndex) – The GPTPandasIndex wrapping the Pandas dataframe to query.

  • instruction_str (Optional[str]) – Instruction string to use.

  • output_processor (Optional[Callable[[str], str]]) – Output processor. A callable that takes in the output string, pandas DataFrame, and any output kwargs and returns a string.

  • pandas_prompt (Optional[PandasPrompt]) – Pandas prompt to use.

  • head (int) – Number of rows to show in the table context.
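
Example (a minimal sketch, not taken from the library docs: it assumes the 0.6-era import paths shown on this page, a configured LLM such as an OpenAI key in the environment, and a small illustrative dataframe):

    import pandas as pd

    from llama_index.indices.struct_store import GPTNLPandasQueryEngine, GPTPandasIndex

    # Small illustrative dataframe.
    df = pd.DataFrame(
        {
            "city": ["Tokyo", "Berlin", "Toronto"],
            "population": [13_960_000, 3_645_000, 2_930_000],
        }
    )

    index = GPTPandasIndex(df=df)
    query_engine = GPTNLPandasQueryEngine(index=index, verbose=True)

    # The engine asks the LLM for Pandas code, executes it against df,
    # and returns the processed output as the response.
    response = query_engine.query("Which city has the highest population?")
    print(response)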

class llama_index.indices.struct_store.GPTNLStructStoreQueryEngine(index: GPTSQLStructStoreIndex, text_to_sql_prompt: Optional[Prompt] = None, context_query_kwargs: Optional[dict] = None, synthesize_response: bool = True, response_synthesis_prompt: Optional[Prompt] = None, **kwargs: Any)

GPT natural language query engine over a structured database.

Given a natural language query, we first translate it into a SQL query, then run that SQL over the GPTSQLStructStoreIndex. No LLM calls are made during the SQL execution itself. NOTE: this query engine cannot work with composed indices - if the index contains subindices, those subindices will not be queried.

Parameters
  • index (GPTSQLStructStoreIndex) – A GPT SQL Struct Store Index

  • text_to_sql_prompt (Optional[Prompt]) – A Text to SQL Prompt to use for the query. Defaults to DEFAULT_TEXT_TO_SQL_PROMPT.

  • context_query_kwargs (Optional[dict]) – Keyword arguments for the context query. Defaults to {}.

  • synthesize_response (bool) – Whether to synthesize a response from the query results. Defaults to True.

  • response_synthesis_prompt (Optional[Prompt]) – A Response Synthesis Prompt to use for the query. Defaults to DEFAULT_RESPONSE_SYNTHESIS_PROMPT.
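
Example (a minimal sketch, assuming index is a GPTSQLStructStoreIndex over a populated city_stats table as in the GPTSQLStructStoreIndex example further below, and that an LLM is configured; where the generated SQL is surfaced in the response metadata may vary by version):

    from llama_index.indices.struct_store import GPTNLStructStoreQueryEngine

    query_engine = GPTNLStructStoreQueryEngine(index=index)

    # The natural language question is converted to SQL, the SQL is executed,
    # and (since synthesize_response=True by default) the rows are summarized.
    response = query_engine.query("Which city has the highest population?")
    print(response)

    # The generated SQL is typically available in the response metadata.
    print((response.extra_info or {}).get("sql_query"))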

property service_context: ServiceContext

Get service context.

class llama_index.indices.struct_store.GPTPandasIndex(df: DataFrame, nodes: Optional[Sequence[Node]] = None, index_struct: Optional[PandasStructTable] = None, **kwargs: Any)

Base GPT Pandas Index.

The GPTPandasStructStoreIndex is an index that stores a Pandas dataframe under the hood. Currently index "construction" is not supported.

During query time, the user specifies a natural language query, which is converted into Pandas code and executed against the dataframe.

Parameters

df (pd.DataFrame) – Pandas dataframe to use. See Structured Index Configuration for more details.
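
Example (a minimal sketch, assuming a local CSV file at a hypothetical path and a configured LLM; in the 0.6-era API, as_query_engine() on a pandas index yields the natural-language Pandas query engine described above):

    import pandas as pd

    from llama_index.indices.struct_store import GPTPandasIndex

    # Hypothetical CSV; any dataframe works since the index just wraps it.
    df = pd.read_csv("titanic_train.csv")

    index = GPTPandasIndex(df=df)
    query_engine = index.as_query_engine(verbose=True)
    response = query_engine.query("What is the correlation between survival and age?")
    print(response)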

classmethod from_documents(documents: Sequence[Document], storage_context: Optional[StorageContext] = None, service_context: Optional[ServiceContext] = None, **kwargs: Any) → IndexType

Create index from documents.

Parameters

documents (Optional[Sequence[BaseDocument]]) – List of documents to build the index from.

property index_id: str

Get the index ID.

insert(document: Document, **insert_kwargs: Any) → None

Insert a document.

refresh(documents: Sequence[Document], **update_kwargs: Any) → List[bool]

Refresh an index with documents that have changed.

This allows users to save LLM and embedding model calls by only updating documents whose text or extra_info has changed. It will also insert any documents that were not previously stored.

set_index_id(index_id: str) → None

Set the index id.

NOTE: if you decide to set the index_id on the index_struct manually, you will need to explicitly call add_index_struct on the index_store to update the index store.

Parameters

index_id (str) – Index id to set.

update(document: Document, **update_kwargs: Any) → None

Update a document.

This is equivalent to deleting the document and then inserting it again.

Parameters
  • document (Union[BaseDocument, BaseGPTIndex]) – document to update

  • insert_kwargs (Dict) – kwargs to pass to insert

  • delete_kwargs (Dict) – kwargs to pass to delete

class llama_index.indices.struct_store.GPTSQLStructStoreIndex(nodes: Optional[Sequence[Node]] = None, index_struct: Optional[SQLStructTable] = None, service_context: Optional[ServiceContext] = None, sql_database: Optional[SQLDatabase] = None, table_name: Optional[str] = None, table: Optional[Table] = None, ref_doc_id_column: Optional[str] = None, sql_context_container: Optional[SQLContextContainer] = None, **kwargs: Any)

Base GPT SQL Struct Store Index.

The GPTSQLStructStoreIndex is an index that uses a SQL database under the hood. During index construction, the data can be inferred from unstructured documents given a schema extract prompt, or it can be pre-loaded in the database.

During query time, the user can either specify a raw SQL query or a natural language query to retrieve their data.

Parameters
  • documents (Optional[Sequence[DOCUMENTS_INPUT]]) – Documents to index. NOTE: in the SQL index, this is an optional field.

  • sql_database (Optional[SQLDatabase]) – SQL database to use, including table names to specify. See Structured Index Configuration for more details.

  • table_name (Optional[str]) – Name of the table to use for extracting data. Either table_name or table must be specified.

  • table (Optional[Table]) – SQLAlchemy Table object to use. Specifying the Table object explicitly, instead of the table name, allows you to pass in a view. Either table_name or table must be specified.

  • sql_context_container (Optional[SQLContextContainer]) – SQL context container. Can be generated from a SQLContextContainerBuilder. See Structured Index Configuration for more details.
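
Example (a minimal sketch of the pre-populated-database path, assuming SQLAlchemy is installed and the 0.6-era import locations; the city_stats table and its columns are illustrative):

    from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine

    from llama_index import SQLDatabase
    from llama_index.indices.struct_store import GPTSQLStructStoreIndex

    # Create an in-memory SQLite database with a single illustrative table.
    engine = create_engine("sqlite:///:memory:")
    metadata_obj = MetaData()
    city_stats_table = Table(
        "city_stats",
        metadata_obj,
        Column("city_name", String(16), primary_key=True),
        Column("population", Integer),
        Column("country", String(16), nullable=False),
    )
    metadata_obj.create_all(engine)

    sql_database = SQLDatabase(engine, include_tables=["city_stats"])

    # Data is assumed to already live in the table, so no nodes/documents are passed.
    index = GPTSQLStructStoreIndex(
        nodes=[],
        sql_database=sql_database,
        table_name="city_stats",
    )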

classmethod from_documents(documents: Sequence[Document], storage_context: Optional[StorageContext] = None, service_context: Optional[ServiceContext] = None, **kwargs: Any) → IndexType

Create index from documents.

Parameters

documents (Optional[Sequence[BaseDocument]]) – List of documents to build the index from.
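
Example (a minimal sketch of the document-driven path, reusing the sql_database and city_stats table from the previous example; rows are extracted from the free-text documents into the table during construction, which costs LLM calls):

    from llama_index import Document
    from llama_index.indices.struct_store import GPTSQLStructStoreIndex

    documents = [
        Document(text="Tokyo has a population of about 13.9 million and is in Japan."),
        Document(text="Berlin has a population of about 3.6 million and is in Germany."),
    ]

    index = GPTSQLStructStoreIndex.from_documents(
        documents,
        sql_database=sql_database,
        table_name="city_stats",
    )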

property index_id: str

Get the index ID.

insert(document: Document, **insert_kwargs: Any) → None

Insert a document.

refresh(documents: Sequence[Document], **update_kwargs: Any) → List[bool]

Refresh an index with documents that have changed.

This allows users to save LLM and embedding model calls by only updating documents whose text or extra_info has changed. It will also insert any documents that were not previously stored.
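
Example (a minimal sketch of the refresh pattern, assuming the documents carry stable doc_ids so changes can be detected; the returned list has one boolean per input document):

    from llama_index import Document

    doc_tokyo = Document(
        text="Tokyo has a population of about 14.0 million and is in Japan.",
        doc_id="tokyo",
    )
    doc_berlin = Document(
        text="Berlin has a population of about 3.6 million and is in Germany.",
        doc_id="berlin",
    )

    # Only documents whose text or extra_info changed are re-processed.
    refreshed = index.refresh([doc_tokyo, doc_berlin])
    print(refreshed)  # e.g. [True, False]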

set_index_id(index_id: str) → None

Set the index id.

NOTE: if you decide to set the index_id on the index_struct manually, you will need to explicitly call add_index_struct on the index_store to update the index store.

Parameters

index_id (str) – Index id to set.

update(document: Document, **update_kwargs: Any) → None

Update a document.

This is equivalent to deleting the document and then inserting it again.

Parameters
  • document (Union[BaseDocument, BaseGPTIndex]) – document to update

  • insert_kwargs (Dict) – kwargs to pass to insert

  • delete_kwargs (Dict) – kwargs to pass to delete

class llama_index.indices.struct_store.GPTSQLStructStoreQueryEngine(index: GPTSQLStructStoreIndex, sql_context_container: Optional[SQLContextContainerBuilder] = None, **kwargs: Any)

GPT SQL query engine over a structured database.

Runs raw SQL over a GPTSQLStructStoreIndex. No LLM calls are made here. NOTE: this query engine cannot work with composed indices - if the index contains subindices, those subindices will not be queried.
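
Example (a minimal sketch, assuming index is the GPTSQLStructStoreIndex built above; the query string is executed verbatim against the underlying database):

    from llama_index.indices.struct_store import GPTSQLStructStoreQueryEngine

    query_engine = GPTSQLStructStoreQueryEngine(index)
    response = query_engine.query(
        "SELECT city_name, population FROM city_stats ORDER BY population DESC"
    )
    print(response)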

class llama_index.indices.struct_store.SQLContextContainerBuilder(sql_database: SQLDatabase, context_dict: Optional[Dict[str, str]] = None, context_str: Optional[str] = None)

SQLContextContainerBuilder.

Build a SQLContextContainer that can be passed to the SQL index during index construction or during query-time.

NOTE: if context_str is specified, it will be used as the context instead of context_dict.

Parameters
  • sql_database (SQLDatabase) – SQL database

  • context_dict (Optional[Dict[str, str]]) – Mapping from table name to context string.

  • context_str (Optional[str]) – Global context string; if provided, it is used instead of context_dict.

build_context_container(ignore_db_schema: bool = False) → SQLContextContainer

Build the SQL context container.
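
Example (a minimal sketch, reusing the sql_database from the GPTSQLStructStoreIndex example above; the hand-written table description is illustrative):

    from llama_index.indices.struct_store import (
        GPTSQLStructStoreIndex,
        SQLContextContainerBuilder,
    )

    city_stats_text = (
        "This table gives information regarding the population and country of a given city."
    )
    context_builder = SQLContextContainerBuilder(
        sql_database, context_dict={"city_stats": city_stats_text}
    )
    context_container = context_builder.build_context_container()

    # Pass the container to the index at construction time.
    index = GPTSQLStructStoreIndex(
        nodes=[],
        sql_database=sql_database,
        table_name="city_stats",
        sql_context_container=context_container,
    )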

derive_index_from_context(index_cls: Type[BaseGPTIndex], ignore_db_schema: bool = False, **index_kwargs: Any) → BaseGPTIndex

Derive index from context.

classmethod from_documents(documents_dict: Dict[str, List[BaseDocument]], sql_database: SQLDatabase, **context_builder_kwargs: Any) → SQLContextContainerBuilder

Build context from documents.

query_index_for_context(index: BaseGPTIndex, query_str: Union[str, QueryBundle], query_tmpl: Optional[str] = 'Please return the relevant tables (including the full schema) for the following query: {orig_query_str}', store_context_str: bool = True, **index_kwargs: Any) → str

Query index for context.

A simple wrapper around the index.query call which injects a query template to specifically fetch table information, and can store a context_str.

Parameters
  • index (BaseGPTIndex) – index data structure

  • query_str (QueryType) – query string

  • query_tmpl (Optional[str]) – query template

  • store_context_str (bool) – store context_str
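
Example (a minimal sketch combining derive_index_from_context and query_index_for_context; it assumes the sql_database from the earlier examples, a configured LLM and embedding model, and that GPTVectorStoreIndex is importable from the top-level package in this version):

    from llama_index import GPTVectorStoreIndex
    from llama_index.indices.struct_store import SQLContextContainerBuilder

    context_builder = SQLContextContainerBuilder(sql_database)

    # Build a vector index over the table schemas so only relevant tables
    # are pulled into the context at query time.
    table_schema_index = context_builder.derive_index_from_context(GPTVectorStoreIndex)

    query_str = "Which city has the highest population?"
    context_builder.query_index_for_context(
        table_schema_index, query_str, store_context_str=True
    )

    # The stored context_str ends up in the container, which can then be passed
    # to the SQL index (or its query engine) as sql_context_container.
    context_container = context_builder.build_context_container()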