Vector Store
Vector stores.
- class llama_index.vector_stores.AwaDBVectorStore(table_name: str = 'llamaindex_awadb', log_and_data_dir: Optional[str] = None, **kwargs: Any)
AwaDB vector store.
In this vector store, embeddings are stored within an AwaDB table.
During query time, the index uses AwaDB to query for the top k most similar nodes.
- Parameters
table_name (str) – Name of the AwaDB table to use. Defaults to 'llamaindex_awadb'.
log_and_data_dir (Optional[str]) – Directory for AwaDB logs and data. Defaults to None.
- add(nodes: List[BaseNode]) List[str]
Add nodes to AwaDB.
- Parameters
nodes – List[BaseNode]: list of nodes with embeddings
- Returns
Added node ids
- async adelete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes using ref_doc_id. NOTE: this is not implemented for all vector stores. If not implemented, it will just call delete synchronously.
- async aquery(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Asynchronously query vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call query synchronously.
- async async_add(nodes: List[BaseNode]) List[str]
Asynchronously add nodes with embedding to vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call add synchronously.
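The async methods above share one contract: a store that does not override them falls back to the synchronous implementation. A minimal sketch of that pattern (hypothetical class names, not the llama_index source):

```python
import asyncio
from typing import List


class BaseVectorStore:
    """Sketch of the sync/async contract; names here are illustrative only."""

    def add(self, nodes: List[str]) -> List[str]:
        raise NotImplementedError

    async def async_add(self, nodes: List[str]) -> List[str]:
        # Default: fall back to the synchronous add. Stores with a real
        # async client override this coroutine instead.
        return self.add(nodes)


class InMemoryStore(BaseVectorStore):
    def __init__(self) -> None:
        self._nodes: List[str] = []

    def add(self, nodes: List[str]) -> List[str]:
        self._nodes.extend(nodes)
        return nodes


store = InMemoryStore()
print(asyncio.run(store.async_add(["node-1", "node-2"])))  # -> ['node-1', 'node-2']
```

Because the fallback still runs on the event loop thread, it offers no concurrency benefit; it only keeps the async API uniform across stores.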
- property client: Any
Get AwaDB client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes using ref_doc_id.
- Parameters
ref_doc_id (str) – The doc_id of the document to delete.
- Returns
None
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query index for top k most similar nodes.
- Parameters
query – vector store query
- Returns
Query results
- Return type
VectorStoreQueryResult
- class llama_index.vector_stores.BagelVectorStore(collection: Any, **kwargs: Any)
Vector store for Bagel.
- add(nodes: List[BaseNode], **kwargs: Any) List[str]
Add a list of nodes with embeddings to the vector store.
- Parameters
nodes – List of nodes with embeddings.
kwargs – Additional arguments.
- Returns
List of document ids.
- async adelete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes using ref_doc_id. NOTE: this is not implemented for all vector stores. If not implemented, it will just call delete synchronously.
- async aquery(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Asynchronously query vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call query synchronously.
- async async_add(nodes: List[BaseNode]) List[str]
Asynchronously add nodes with embedding to vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call add synchronously.
- property client: Any
Get the Bagel cluster.
- delete(ref_doc_id: str, **kwargs: Any) None
Delete a document from the vector store.
- Parameters
ref_doc_id – Reference document id.
kwargs – Additional arguments.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query the vector store.
- Parameters
query – Query to run.
kwargs – Additional arguments.
- Returns
Query result.
- class llama_index.vector_stores.CassandraVectorStore(session: Any, keyspace: str, table: str, embedding_dimension: int, ttl_seconds: Optional[int] = None, insertion_batch_size: int = 20)
Cassandra Vector Store.
An abstraction of a Cassandra table with vector-similarity search. Documents and their embeddings are stored in a Cassandra table, and a vector-capable index is used for searches. The table does not need to exist beforehand: if necessary, it will be created behind the scenes.
All Cassandra operations are done through the cassIO library.
- Parameters
session (cassandra.cluster.Session) – the Cassandra session to use
keyspace (str) – name of the Cassandra keyspace to work in
table (str) – table name to use. If it does not exist, it will be created.
embedding_dimension (int) – length of the embedding vectors in use.
ttl_seconds (Optional[int]) – expiration time for inserted entries. Default is no expiration.
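Per the constructor signature, insertion_batch_size caps how many rows go into each write round-trip. A pure-Python sketch of the chunking pattern (illustrative only, not the cassIO client):

```python
from typing import Iterable, List


def batched(items: List[str], batch_size: int) -> Iterable[List[str]]:
    """Yield successive fixed-size chunks, as a store might before bulk-writing."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


rows = [f"node-{i}" for i in range(45)]
print([len(b) for b in batched(rows, 20)])  # -> [20, 20, 5]
```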
- add(nodes: List[BaseNode]) List[str]
Add nodes to index.
- Args
nodes: List[BaseNode]: list of nodes with embeddings
- async adelete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes using ref_doc_id. NOTE: this is not implemented for all vector stores. If not implemented, it will just call delete synchronously.
- async aquery(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Asynchronously query vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call query synchronously.
- async async_add(nodes: List[BaseNode]) List[str]
Asynchronously add nodes with embedding to vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call add synchronously.
- property client: Any
Return the underlying cassIO vector table object.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes using ref_doc_id.
- Parameters
ref_doc_id (str) – The doc_id of the document to delete.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query index for top k most similar nodes.
Supported query modes: "default" (most similar vectors) and "mmr".
- Parameters
query (VectorStoreQuery) –
the basic query definition. Defines:
mode (VectorStoreQueryMode): one of the supported modes
query_embedding (List[float]): query embedding to search against
similarity_top_k (int): top k most similar nodes
mmr_threshold (Optional[float]): the 0-to-1 MMR lambda. If present, takes precedence over the kwargs parameter. Ignored except for MMR queries.
- Args for query.mode == "mmr" (ignored otherwise):
- mmr_threshold (Optional[float]): the 0-to-1 lambda for MMR. Note that in principle mmr_threshold could come in the query.
- mmr_prefetch_factor (Optional[float]): factor applied to top_k for the prefetch pool size. Defaults to 4.0.
- mmr_prefetch_k (Optional[int]): prefetch pool size. This cannot be passed together with mmr_prefetch_factor.
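The MMR lambda trades relevance against redundancy over a prefetched candidate pool. A hedged pure-Python sketch of the greedy re-ranking (not the cassIO implementation; dot products stand in for the store's similarity):

```python
from typing import List


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def mmr_rerank(query: List[float], pool: List[List[float]],
               top_k: int, mmr_lambda: float) -> List[int]:
    """Greedy MMR: mmr_lambda=1.0 is plain similarity ranking,
    lower values increasingly penalize candidates similar to ones
    already selected."""
    selected: List[int] = []
    candidates = list(range(len(pool)))
    while candidates and len(selected) < top_k:
        def score(i: int) -> float:
            relevance = dot(query, pool[i])
            redundancy = max((dot(pool[i], pool[j]) for j in selected), default=0.0)
            return mmr_lambda * relevance - (1 - mmr_lambda) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


query = [1.0, 0.0]
pool = [[0.9, 0.1], [0.89, 0.12], [0.1, 0.9]]  # first two are near-duplicates
print(mmr_rerank(query, pool, top_k=2, mmr_lambda=1.0))  # -> [0, 1]
print(mmr_rerank(query, pool, top_k=2, mmr_lambda=0.3))  # -> [0, 2]
```

With lambda 1.0 the two near-duplicate vectors both rank; at 0.3 the redundancy penalty swaps in the diverse third vector.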
- class llama_index.vector_stores.ChatGPTRetrievalPluginClient(endpoint_url: str, bearer_token: Optional[str] = None, retries: Optional[Retry] = None, batch_size: int = 100, **kwargs: Any)
ChatGPT Retrieval Plugin Client.
In this client, we make use of the endpoints defined by ChatGPT.
- Parameters
endpoint_url (str) – URL of the ChatGPT Retrieval Plugin.
bearer_token (Optional[str]) – Bearer token for the ChatGPT Retrieval Plugin.
retries (Optional[Retry]) – Retry object for the ChatGPT Retrieval Plugin.
batch_size (int) – Batch size for the ChatGPT Retrieval Plugin.
- async adelete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes using ref_doc_id. NOTE: this is not implemented for all vector stores. If not implemented, it will just call delete synchronously.
- async aquery(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Asynchronously query vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call query synchronously.
- async async_add(nodes: List[BaseNode]) List[str]
Asynchronously add nodes with embedding to vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call add synchronously.
- property client: None
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes using ref_doc_id.
- Parameters
ref_doc_id (str) – The doc_id of the document to delete.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Get nodes for response.
- pydantic model llama_index.vector_stores.ChromaVectorStore
Chroma vector store.
In this vector store, embeddings are stored within a ChromaDB collection.
During query time, the index uses ChromaDB to query for the top k most similar nodes.
- Parameters
chroma_collection (chromadb.api.models.Collection.Collection) – ChromaDB collection instance
Show JSON schema
{ "title": "ChromaVectorStore", "description": "Chroma vector store.\n\nIn this vector store, embeddings are stored within a ChromaDB collection.\n\nDuring query time, the index uses ChromaDB to query for the top\nk most similar nodes.\n\nArgs:\n chroma_collection (chromadb.api.models.Collection.Collection):\n ChromaDB collection instance", "type": "object", "properties": { "stores_text": { "title": "Stores Text", "default": true, "type": "boolean" }, "is_embedding_query": { "title": "Is Embedding Query", "default": true, "type": "boolean" }, "flat_metadata": { "title": "Flat Metadata", "default": true, "type": "boolean" }, "host": { "title": "Host", "type": "string" }, "port": { "title": "Port", "type": "string" }, "ssl": { "title": "Ssl", "type": "boolean" }, "headers": { "title": "Headers", "type": "object", "additionalProperties": { "type": "string" } }, "collection_kwargs": { "title": "Collection Kwargs", "type": "object" } }, "required": [ "ssl" ] }
- Fields
collection_kwargs (Dict[str, Any])
flat_metadata (bool)
headers (Optional[Dict[str, str]])
host (Optional[str])
is_embedding_query (bool)
port (Optional[str])
ssl (bool)
stores_text (bool)
- field collection_kwargs: Dict[str, Any] [Optional]
- field flat_metadata: bool = True
- field headers: Optional[Dict[str, str]] = None
- field host: Optional[str] = None
- field is_embedding_query: bool = True
- field port: Optional[str] = None
- field ssl: bool [Required]
- field stores_text: bool = True
- add(nodes: List[BaseNode]) List[str]
Add nodes to index.
- Args
nodes: List[BaseNode]: list of nodes with embeddings
- async adelete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes using ref_doc_id. NOTE: this is not implemented for all vector stores. If not implemented, it will just call delete synchronously.
- async aquery(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Asynchronously query vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call query synchronously.
- async async_add(nodes: List[BaseNode]) List[str]
Asynchronously add nodes to vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call add synchronously.
- classmethod class_name() str
Get class name.
- classmethod construct(_fields_set: Optional[SetStr] = None, **values: Any) Model
Creates a new model setting __dict__ and __fields_set__ from trusted or pre-validated data. Default values are respected, but no other validation is performed. Behaves as if Config.extra = "allow" was set since it adds all passed values.
- copy(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, update: Optional[DictStrAny] = None, deep: bool = False) Model
Duplicate a model, optionally choose which fields to include, exclude and change.
- Parameters
include – fields to include in the new model
exclude – fields to exclude from the new model; as with values, this takes precedence over include
update – values to change/add in the new model. Note: the data is not validated before creating the new model; you should trust this data
deep – set to True to make a deep copy of the model
- Returns
new model instance
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes using ref_doc_id.
- Parameters
ref_doc_id (str) – The doc_id of the document to delete.
- dict(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, by_alias: bool = False, skip_defaults: Optional[bool] = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False) DictStrAny
Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.
- classmethod from_dict(data: Dict[str, Any], **kwargs: Any) Self
- classmethod from_json(data_str: str, **kwargs: Any) Self
- classmethod from_orm(obj: Any) Model
- classmethod from_params(collection_name: str, host: Optional[str] = None, port: Optional[str] = None, ssl: bool = False, headers: Optional[Dict[str, str]] = None, collection_kwargs: Optional[dict] = None, **kwargs: Any) ChromaVectorStore
- json(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, by_alias: bool = False, skip_defaults: Optional[bool] = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False, encoder: Optional[Callable[[Any], Any]] = None, models_as_dict: bool = True, **dumps_kwargs: Any) unicode
Generate a JSON representation of the model; include and exclude arguments as per dict().
encoder is an optional function to supply as default to json.dumps(); other arguments as per json.dumps().
- classmethod parse_file(path: Union[str, Path], *, content_type: unicode = None, encoding: unicode = 'utf8', proto: Protocol = None, allow_pickle: bool = False) Model
- classmethod parse_obj(obj: Any) Model
- classmethod parse_raw(b: Union[str, bytes], *, content_type: unicode = None, encoding: unicode = 'utf8', proto: Protocol = None, allow_pickle: bool = False) Model
- persist(persist_path: str, fs: Optional[AbstractFileSystem] = None) None
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query index for top k most similar nodes.
- Parameters
query_embedding (List[float]) – query embedding
similarity_top_k (int) – top k most similar nodes
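The top-k contract above (a query embedding in, the k most similar node ids and similarities out) can be sketched in pure Python; the real store delegates this work to ChromaDB, and these names are illustrative:

```python
from typing import List, Tuple


def top_k_by_similarity(query_embedding: List[float],
                        embeddings: List[List[float]],
                        ids: List[str],
                        similarity_top_k: int) -> Tuple[List[str], List[float]]:
    """Return ids and scores of the top-k entries by inner product."""
    scores = [sum(q * e for q, e in zip(query_embedding, emb)) for emb in embeddings]
    order = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
    top = order[:similarity_top_k]
    return [ids[i] for i in top], [scores[i] for i in top]


ids, sims = top_k_by_similarity(
    [1.0, 0.0],
    [[0.2, 0.9], [0.8, 0.1], [0.5, 0.5]],
    ["n1", "n2", "n3"],
    similarity_top_k=2,
)
print(ids)   # -> ['n2', 'n3']
print(sims)  # -> [0.8, 0.5]
```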
- classmethod schema(by_alias: bool = True, ref_template: unicode = '#/definitions/{model}') DictStrAny
- classmethod schema_json(*, by_alias: bool = True, ref_template: unicode = '#/definitions/{model}', **dumps_kwargs: Any) unicode
- to_dict(**kwargs: Any) Dict[str, Any]
- to_json(**kwargs: Any) str
- classmethod update_forward_refs(**localns: Any) None
Try to update ForwardRefs on fields based on this Model, globalns and localns.
- classmethod validate(value: Any) Model
- property client: Any
Return client.
- class llama_index.vector_stores.CognitiveSearchVectorStore(search_or_index_client: Any, id_field_key: str, chunk_field_key: str, embedding_field_key: str, metadata_string_field_key: str, doc_id_field_key: str, filterable_metadata_field_keys: Optional[Union[List[str], Dict[str, str], Dict[str, Tuple[str, MetadataIndexFieldType]]]] = None, index_name: Optional[str] = None, index_mapping: Optional[Callable[[Dict[str, str], Dict[str, Any]], Dict[str, str]]] = None, index_management: IndexManagement = IndexManagement.NO_VALIDATION, embedding_dimensionality: int = 1536, **kwargs: Any)
- add(nodes: List[BaseNode]) List[str]
Add nodes to the index associated with the configured search client.
- Args
nodes: List[BaseNode]: nodes with embeddings
- async adelete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes using ref_doc_id. NOTE: this is not implemented for all vector stores. If not implemented, it will just call delete synchronously.
- async aquery(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Asynchronously query vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call query synchronously.
- async async_add(nodes: List[BaseNode]) List[str]
Asynchronously add nodes with embedding to vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call add synchronously.
- property client: Any
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete documents from the Cognitive Search index whose doc_id_field_key field equals ref_doc_id.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query vector store.
- class llama_index.vector_stores.DeepLakeVectorStore(dataset_path: str = 'llama_index', token: Optional[str] = None, read_only: Optional[bool] = False, ingestion_batch_size: int = 1024, ingestion_num_workers: int = 4, overwrite: bool = False, exec_option: str = 'python', verbose: bool = True, **kwargs: Any)
The DeepLake Vector Store.
In this vector store, we store the text, its embedding, and a few pieces of its metadata in a DeepLake dataset. This implementation allows the use of an already existing DeepLake dataset if it was created by this vector store. It also supports creating a new one if the dataset doesn't exist or if overwrite is set to True.
- add(nodes: List[BaseNode]) List[str]
Add the embeddings and their nodes into DeepLake.
- Parameters
nodes (List[BaseNode]) – List of nodes with embeddings to insert.
- Returns
List of ids inserted.
- Return type
List[str]
- async adelete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes using ref_doc_id. NOTE: this is not implemented for all vector stores. If not implemented, it will just call delete synchronously.
- async aquery(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Asynchronously query vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call query synchronously.
- async async_add(nodes: List[BaseNode]) List[str]
Asynchronously add nodes with embedding to vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call add synchronously.
- property client: Any
Get client.
- Returns
DeepLake vectorstore dataset.
- Return type
Any
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes using ref_doc_id.
- Parameters
ref_doc_id (str) – The doc_id of the document to delete.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query index for top k most similar nodes.
- Parameters
query (VectorStoreQuery) – VectorStoreQuery class input. It has the following attributes: 1. query_embedding (List[float]): query embedding 2. similarity_top_k (int): top k most similar nodes
- Returns
VectorStoreQueryResult
- class llama_index.vector_stores.DocArrayHnswVectorStore(work_dir: str, dim: int = 1536, dist_metric: Literal['cosine', 'ip', 'l2'] = 'cosine', max_elements: int = 1024, ef_construction: int = 200, ef: int = 10, M: int = 16, allow_replace_deleted: bool = True, num_threads: int = 1)
Class representing a DocArray HNSW vector store.
This class is a lightweight Document Index implementation provided by DocArray. It stores vectors on disk in hnswlib, and stores all other data in SQLite.
- add(nodes: List[BaseNode]) List[str]
Adds nodes to the vector store.
- Parameters
nodes (List[BaseNode]) – List of nodes with embeddings.
- Returns
List of document IDs added to the vector store.
- Return type
List[str]
- async adelete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes using ref_doc_id. NOTE: this is not implemented for all vector stores. If not implemented, it will just call delete synchronously.
- async aquery(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Asynchronously query vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call query synchronously.
- async async_add(nodes: List[BaseNode]) List[str]
Asynchronously add nodes with embedding to vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call add synchronously.
- property client: Any
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Deletes a document from the vector store.
- Parameters
ref_doc_id (str) – Document ID to be deleted.
**delete_kwargs (Any) – Additional arguments to pass to the delete method.
- num_docs() int
Retrieves the number of documents in the index.
- Returns
The number of documents in the index.
- Return type
int
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Queries the vector store and retrieves the results.
- Parameters
query (VectorStoreQuery) – Query for the vector store.
- Returns
Result of the query from vector store.
- Return type
VectorStoreQueryResult
- class llama_index.vector_stores.DocArrayInMemoryVectorStore(index_path: Optional[str] = None, metric: Literal['cosine_sim', 'euclidian_dist', 'sgeuclidean_dist'] = 'cosine_sim')
Class representing a DocArray In-Memory vector store.
This class is a document index provided by DocArray that stores documents in memory.
- add(nodes: List[BaseNode]) List[str]
Adds nodes to the vector store.
- Parameters
nodes (List[BaseNode]) – List of nodes with embeddings.
- Returns
List of document IDs added to the vector store.
- Return type
List[str]
- async adelete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes using ref_doc_id. NOTE: this is not implemented for all vector stores. If not implemented, it will just call delete synchronously.
- async aquery(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Asynchronously query vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call query synchronously.
- async async_add(nodes: List[BaseNode]) List[str]
Asynchronously add nodes with embedding to vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call add synchronously.
- property client: Any
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Deletes a document from the vector store.
- Parameters
ref_doc_id (str) – Document ID to be deleted.
**delete_kwargs (Any) – Additional arguments to pass to the delete method.
- num_docs() int
Retrieves the number of documents in the index.
- Returns
The number of documents in the index.
- Return type
int
- persist(persist_path: str, fs: Optional[AbstractFileSystem] = None) None
Persists the in-memory vector store to a file.
- Parameters
persist_path (str) – The path to persist the index.
fs (fsspec.AbstractFileSystem, optional) – Filesystem to persist to. (doesn't apply)
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Queries the vector store and retrieves the results.
- Parameters
query (VectorStoreQuery) – Query for the vector store.
- Returns
Result of the query from vector store.
- Return type
VectorStoreQueryResult
- class llama_index.vector_stores.ElasticsearchStore(index_name: str, es_client: Optional[Any] = None, es_url: Optional[str] = None, es_cloud_id: Optional[str] = None, es_api_key: Optional[str] = None, es_user: Optional[str] = None, es_password: Optional[str] = None, text_field: str = 'content', vector_field: str = 'embedding', batch_size: int = 200, distance_strategy: Optional[Literal['COSINE', 'DOT_PRODUCT', 'EUCLIDEAN_DISTANCE']] = 'COSINE')
Elasticsearch vector store.
- Parameters
index_name – Name of the Elasticsearch index.
es_client – Optional. Pre-existing AsyncElasticsearch client.
es_url – Optional. Elasticsearch URL.
es_cloud_id – Optional. Elasticsearch cloud ID.
es_api_key – Optional. Elasticsearch API key.
es_user – Optional. Elasticsearch username.
es_password – Optional. Elasticsearch password.
text_field – Optional. Name of the Elasticsearch field that stores the text.
vector_field – Optional. Name of the Elasticsearch field that stores the embedding.
batch_size – Optional. Batch size for bulk indexing. Defaults to 200.
distance_strategy – Optional. Distance strategy to use for similarity search. Defaults to "COSINE".
- Raises
ConnectionError – If the AsyncElasticsearch client cannot connect to Elasticsearch.
ValueError – If neither es_client nor es_url nor es_cloud_id is provided.
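The three distance_strategy values correspond to standard similarity/distance functions. A pure-Python sketch for orientation (Elasticsearch computes these server-side; this is not the store's implementation):

```python
import math
from typing import Callable, Dict, List

Vector = List[float]


def dot_product(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def cosine(a: Vector, b: Vector) -> float:
    # Dot product normalized by both magnitudes.
    return dot_product(a, b) / (math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b)))


def euclidean_distance(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


STRATEGIES: Dict[str, Callable[[Vector, Vector], float]] = {
    "COSINE": cosine,
    "DOT_PRODUCT": dot_product,
    "EUCLIDEAN_DISTANCE": euclidean_distance,
}

a, b = [1.0, 0.0], [0.0, 1.0]
print(STRATEGIES["COSINE"](a, b))              # -> 0.0
print(STRATEGIES["DOT_PRODUCT"](a, b))         # -> 0.0
print(STRATEGIES["EUCLIDEAN_DISTANCE"](a, b))  # -> 1.4142135623730951
```

Note that the first two are similarities (higher is closer) while the third is a distance (lower is closer).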
- add(nodes: List[BaseNode], *, create_index_if_not_exists: bool = True) List[str]
Add nodes to Elasticsearch index.
- Parameters
nodes – List of nodes with embeddings.
create_index_if_not_exists – Optional. Whether to create the Elasticsearch index if it doesn't already exist. Defaults to True.
- Returns
List of node IDs that were added to the index.
- Raises
ImportError – If the elasticsearch["async"] python package is not installed.
BulkIndexError – If AsyncElasticsearch async_bulk indexing fails.
- async adelete(ref_doc_id: str, **delete_kwargs: Any) None
Async delete node from Elasticsearch index.
- Parameters
ref_doc_id – ID of the node to delete.
delete_kwargs – Optional. Additional arguments to pass to AsyncElasticsearch delete_by_query.
- Raises
Exception – If AsyncElasticsearch delete_by_query fails.
- async aquery(query: VectorStoreQuery, custom_query: Optional[Callable[[Dict, Optional[VectorStoreQuery]], Dict]] = None, es_filter: Optional[List[Dict]] = None, **kwargs: Any) VectorStoreQueryResult
Asynchronously query index for top k most similar nodes.
- Parameters
query (VectorStoreQuery) – vector store query
custom_query – Optional. Custom query function that takes in the es query body and returns a modified query body. This can be used to add additional query parameters to the AsyncElasticsearch query.
es_filter – Optional. AsyncElasticsearch filter to apply to the query. If a filter is provided in the query, this filter will be ignored.
- Returns
Result of the query.
- Return type
VectorStoreQueryResult
- Raises
Exception – If the AsyncElasticsearch query fails.
- async async_add(nodes: List[BaseNode], *, create_index_if_not_exists: bool = True) List[str]
Asynchronous method to add nodes to Elasticsearch index.
- Parameters
nodes – List of nodes with embeddings.
create_index_if_not_exists – Optional. Whether to create the AsyncElasticsearch index if it doesn't already exist. Defaults to True.
- Returns
List of node IDs that were added to the index.
- Raises
ImportError – If the elasticsearch python package is not installed.
BulkIndexError – If AsyncElasticsearch async_bulk indexing fails.
- property client: Any
Get async elasticsearch client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete node from Elasticsearch index.
- Parameters
ref_doc_id – ID of the node to delete.
delete_kwargs – Optional. Additional arguments to pass to Elasticsearch delete_by_query.
- Raises
Exception – If Elasticsearch delete_by_query fails.
- static get_user_agent() str
Get user agent for elasticsearch client.
- query(query: VectorStoreQuery, custom_query: Optional[Callable[[Dict, Optional[VectorStoreQuery]], Dict]] = None, es_filter: Optional[List[Dict]] = None, **kwargs: Any) VectorStoreQueryResult
Query index for top k most similar nodes.
- Parameters
query_embedding (List[float]) – query embedding
custom_query – Optional. Custom query function that takes in the es query body and returns a modified query body. This can be used to add additional query parameters to the Elasticsearch query.
es_filter – Optional. Elasticsearch filter to apply to the query. If a filter is provided in the query, this filter will be ignored.
- Returns
Result of the query.
- Return type
VectorStoreQueryResult
- Raises
Exception – If the Elasticsearch query fails.
- class llama_index.vector_stores.EpsillaVectorStore(client: Any, collection_name: str = 'llama_collection', db_path: Optional[str] = './storage', db_name: Optional[str] = 'llama_db', dimension: Optional[int] = None, overwrite: bool = False, **kwargs: Any)
The Epsilla Vector Store.
In this vector store, we store the text, its embedding, and a few pieces of its metadata in an Epsilla collection. This implementation allows the use of an already existing collection. It also supports creating a new one if the collection does not exist or if overwrite is set to True.
As a prerequisite, you need to install the pyepsilla package and have a running Epsilla vector database (for example, through our docker image). See the following documentation for how to run an Epsilla vector database: https://epsilla-inc.gitbook.io/epsilladb/quick-start
- Parameters
client (Any) – Epsilla client to connect to.
collection_name (Optional[str]) – Which collection to use. Defaults to "llama_collection".
db_path (Optional[str]) – The path where the database will be persisted. Defaults to "./storage".
db_name (Optional[str]) – Give a name to the loaded database. Defaults to "llama_db".
dimension (Optional[int]) – The dimension of the embeddings. If not provided, collection creation will be done on first insert. Defaults to None.
overwrite (Optional[bool]) – Whether to overwrite an existing collection with the same name. Defaults to False.
- Returns
Vector store that supports add, delete, and query.
- Return type
EpsillaVectorStore
- add(nodes: List[BaseNode]) List[str]
Add nodes to Epsilla vector store.
- Args
nodes: List[BaseNode]: list of nodes with embeddings
- Returns
List of ids inserted.
- Return type
List[str]
- async adelete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes using ref_doc_id. NOTE: this is not implemented for all vector stores. If not implemented, it will just call delete synchronously.
- async aquery(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Asynchronously query vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call query synchronously.
- async async_add(nodes: List[BaseNode]) List[str]
Asynchronously add nodes with embedding to vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call add synchronously.
- client() Any
Return the Epsilla client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes using ref_doc_id.
- Parameters
ref_doc_id (str) – The doc_id of the document to delete.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query index for top k most similar nodes.
- Parameters
query (VectorStoreQuery) – query.
- Returns
Vector store query result.
- class llama_index.vector_stores.FaissVectorStore(faiss_index: Any)
Faiss Vector Store.
Embeddings are stored within a Faiss index.
During query time, the index uses Faiss to query for the top k embeddings, and returns the corresponding indices.
- Parameters
faiss_index (faiss.Index) – Faiss index instance
- add(nodes: List[BaseNode]) List[str]
Add nodes to index.
NOTE: in the Faiss vector store, we do not store text in Faiss.
- Args
nodes: List[BaseNode]: list of nodes with embeddings
- async adelete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes using ref_doc_id. NOTE: this is not implemented for all vector stores. If not implemented, it will just call delete synchronously.
- async aquery(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Asynchronously query vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call query synchronously.
- async async_add(nodes: List[BaseNode]) List[str]
Asynchronously add nodes with embedding to vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call add synchronously.
- property client: Any
Return the faiss index.
- delete(ref_doc_id: str, **delete_kwargs: Any) None
Delete nodes using ref_doc_id.
- Parameters
ref_doc_id (str) – The doc_id of the document to delete.
- persist(persist_path: str = './storage/vector_store.json', fs: Optional[AbstractFileSystem] = None) None
Save to file.
This method saves the vector store to disk.
- Parameters
persist_path (str) – The path to save the file to.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult
Query index for top k most similar nodes.
- Parameters
query_embedding (List[float]) – query embedding
similarity_top_k (int) – top k most similar nodes
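Since Faiss stores only vectors and returns integer positions, a wrapping store has to map those positions back to node ids. A hedged pure-Python sketch of that bookkeeping (a brute-force search stands in for faiss.Index.search; all names are illustrative):

```python
from typing import Dict, List

# Faiss holds only the vectors; text and node ids live elsewhere.
index_to_node_id: Dict[int, str] = {}
vectors: List[List[float]] = []


def add(node_id: str, embedding: List[float]) -> None:
    # Record which Faiss position corresponds to which node id.
    index_to_node_id[len(vectors)] = node_id
    vectors.append(embedding)


def query_top_k(query: List[float], k: int) -> List[str]:
    # Brute-force inner-product ranking in place of faiss.Index.search.
    ranked = sorted(range(len(vectors)),
                    key=lambda i: sum(q * v for q, v in zip(query, vectors[i])),
                    reverse=True)
    return [index_to_node_id[i] for i in ranked[:k]]


add("doc-a", [1.0, 0.0])
add("doc-b", [0.0, 1.0])
print(query_top_k([0.9, 0.1], k=1))  # -> ['doc-a']
```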
- class llama_index.vector_stores.LanceDBVectorStore(uri: str, table_name: str = 'vectors', nprobes: int = 20, refine_factor: Optional[int] = None, **kwargs: Any)
The LanceDB Vector Store.
Stores text and embeddings in LanceDB. The vector store will open an existing LanceDB dataset or create the dataset if it does not exist.
- Parameters
uri (str, required) – Location where LanceDB will store its files.
table_name (str, optional) – The table name where the embeddings will be stored. Defaults to 'vectors'.
nprobes (int, optional) – The number of probes used. A higher number makes search more accurate but also slower. Defaults to 20.
refine_factor (int, optional) – Refine the results by reading extra elements and re-ranking them in memory. Defaults to None.
- Raises
ImportError – Unable to import lancedb.
- Returns
VectorStore that supports creating LanceDB datasets and querying them.
- Return type
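The refine_factor parameter above can be read as a two-stage search: fetch a larger approximate candidate pool, then re-rank it exactly in memory. A rough pure-Python sketch of that idea (not LanceDB's actual implementation):

```python
# Illustrative sketch of a "refine" step: take k * refine_factor candidates
# from a cheap approximate index, re-rank them with exact L2 distance, and
# keep only the top k. All names here are invented for the example.
def l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def refine(query, candidates, k, refine_factor):
    # candidates: list of (id, vector) pairs returned by the approximate pass.
    pool = candidates[: k * refine_factor]
    pool.sort(key=lambda pair: l2(query, pair[1]))  # exact re-ranking
    return [node_id for node_id, _ in pool[:k]]
```

A larger refine_factor widens the pool and improves recall at the cost of more exact distance computations.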
- async adelete(ref_doc_id: str, **delete_kwargs: Any) → None
Delete nodes using ref_doc_id. NOTE: this is not implemented for all vector stores. If not implemented, it will just call delete synchronously.
- async aquery(query: VectorStoreQuery, **kwargs: Any) → VectorStoreQueryResult
Asynchronously query vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call query synchronously.
- async async_add(nodes: List[BaseNode]) → List[str]
Asynchronously add nodes with embeddings to vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call add synchronously.
- property client: None
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) → None
Delete nodes using ref_doc_id.
- Parameters
ref_doc_id (str) – The doc_id of the document to delete.
- query(query: VectorStoreQuery, **kwargs: Any) → VectorStoreQueryResult
Query index for top k most similar nodes.
- pydantic model llama_index.vector_stores.MetadataFilters
Metadata filters for vector stores.
Currently only supports exact match filters. TODO: support more advanced expressions.
JSON schema:
{ "title": "MetadataFilters", "description": "Metadata filters for vector stores.\n\nCurrently only supports exact match filters.\nTODO: support more advanced expressions.", "type": "object", "properties": { "filters": { "title": "Filters", "type": "array", "items": { "$ref": "#/definitions/ExactMatchFilter" } } }, "required": [ "filters" ], "definitions": { "ExactMatchFilter": { "title": "ExactMatchFilter", "description": "Exact match metadata filter for vector stores.\n\nValue uses Strict* types, as int, float and str are compatible types and were all\nconverted to string before.\n\nSee: https://docs.pydantic.dev/latest/usage/types/#strict-types", "type": "object", "properties": { "key": { "title": "Key", "type": "string" }, "value": { "title": "Value", "anyOf": [ { "type": "integer" }, { "type": "number" }, { "type": "string" } ] } }, "required": [ "key", "value" ] } } }
- field filters: List[ExactMatchFilter] [Required]
- classmethod construct(_fields_set: Optional[SetStr] = None, **values: Any) → Model
Creates a new model setting __dict__ and __fields_set__ from trusted or pre-validated data. Default values are respected, but no other validation is performed. Behaves as if Config.extra = 'allow' was set since it adds all passed values.
- copy(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, update: Optional[DictStrAny] = None, deep: bool = False) → Model
Duplicate a model, optionally choosing which fields to include, exclude and change.
- Parameters
include – fields to include in the new model
exclude – fields to exclude from the new model; as with values, this takes precedence over include
update – values to change/add in the new model. Note: the data is not validated before creating the new model; you should trust this data
deep – set to True to make a deep copy of the model
- Returns
new model instance
- dict(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, by_alias: bool = False, skip_defaults: Optional[bool] = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False) → DictStrAny
Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.
- classmethod from_dict(filter_dict: Dict) → MetadataFilters
Create MetadataFilters from a JSON-style dict.
- classmethod from_orm(obj: Any) → Model
- json(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, by_alias: bool = False, skip_defaults: Optional[bool] = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False, encoder: Optional[Callable[[Any], Any]] = None, models_as_dict: bool = True, **dumps_kwargs: Any) → unicode
Generate a JSON representation of the model, with include and exclude arguments as per dict().
encoder is an optional function to supply as default to json.dumps(); other arguments as per json.dumps().
- classmethod parse_file(path: Union[str, Path], *, content_type: unicode = None, encoding: unicode = 'utf8', proto: Protocol = None, allow_pickle: bool = False) → Model
- classmethod parse_obj(obj: Any) → Model
- classmethod parse_raw(b: Union[str, bytes], *, content_type: unicode = None, encoding: unicode = 'utf8', proto: Protocol = None, allow_pickle: bool = False) → Model
- classmethod schema(by_alias: bool = True, ref_template: unicode = '#/definitions/{model}') → DictStrAny
- classmethod schema_json(*, by_alias: bool = True, ref_template: unicode = '#/definitions/{model}', **dumps_kwargs: Any) → unicode
- classmethod update_forward_refs(**localns: Any) → None
Try to update ForwardRefs on fields based on this Model, globalns and localns.
- classmethod validate(value: Any) → Model
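The shape that from_dict() expects can be made concrete with a small stand-in class. ExactMatchFilterSketch below is a hypothetical stand-in for the real ExactMatchFilter, used only to illustrate how each key/value pair of a flat dict becomes one exact-match filter:

```python
from dataclasses import dataclass
from typing import Dict, List, Union

# Hypothetical stand-in mirroring the documented ExactMatchFilter fields
# (key plus an int/float/str value). Not the library's own class.
@dataclass
class ExactMatchFilterSketch:
    key: str
    value: Union[int, float, str]

# Rough illustration of the from_dict semantics: one filter per dict entry.
def filters_from_dict(filter_dict: Dict) -> List[ExactMatchFilterSketch]:
    return [ExactMatchFilterSketch(key=k, value=v) for k, v in filter_dict.items()]

filters = filters_from_dict({"author": "alice", "year": 2023})
```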
- class llama_index.vector_stores.MetalVectorStore(api_key: str, client_id: str, index_id: str)
- add(nodes: List[BaseNode]) → List[str]
Add nodes to index.
- Parameters
nodes – List[BaseNode]: list of nodes with embeddings.
- async adelete(ref_doc_id: str, **delete_kwargs: Any) → None
Delete nodes using ref_doc_id. NOTE: this is not implemented for all vector stores. If not implemented, it will just call delete synchronously.
- async aquery(query: VectorStoreQuery, **kwargs: Any) → VectorStoreQueryResult
Asynchronously query vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call query synchronously.
- async async_add(nodes: List[BaseNode]) → List[str]
Asynchronously add nodes with embeddings to vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call add synchronously.
- property client: Any
Return Metal client.
- delete(ref_doc_id: str, **delete_kwargs: Any) → None
Delete nodes using ref_doc_id.
- Parameters
ref_doc_id (str) – The doc_id of the document to delete.
- query(query: VectorStoreQuery, **kwargs: Any) → VectorStoreQueryResult
Query vector store.
- class llama_index.vector_stores.MilvusVectorStore(uri: str = 'http://localhost:19530', token: str = '', collection_name: str = 'llamalection', dim: Optional[int] = None, embedding_field: str = 'embedding', doc_id_field: str = 'doc_id', similarity_metric: str = 'IP', consistency_level: str = 'Strong', overwrite: bool = False, text_key: Optional[str] = None, **kwargs: Any)
The Milvus Vector Store.
In this vector store we store the text, its embedding and its metadata in a Milvus collection. This implementation allows the use of an already existing collection. It also supports creating a new one if the collection doesn't exist or if overwrite is set to True.
- Parameters
uri (str, optional) – The URI to connect to, in the form of 'http://address:port'.
token (str, optional) – The token for login. Empty if not using RBAC; if using RBAC it will most likely be 'username:password'.
collection_name (str, optional) – The name of the collection where data will be stored. Defaults to 'llamalection'.
dim (int, optional) – The dimension of the embedding vectors for the collection. Required if creating a new collection.
embedding_field (str, optional) – The name of the embedding field for the collection. Defaults to DEFAULT_EMBEDDING_KEY.
doc_id_field (str, optional) – The name of the doc_id field for the collection. Defaults to DEFAULT_DOC_ID_KEY.
similarity_metric (str, optional) – The similarity metric to use; currently supports IP and L2.
consistency_level (str, optional) – Which consistency level to use for a newly created collection. Defaults to 'Strong'.
overwrite (bool, optional) – Whether to overwrite an existing collection with the same name. Defaults to False.
text_key (str, optional) – The key under which text is stored in the passed collection. Used when bringing your own collection. Defaults to None.
- Raises
ImportError – Unable to import pymilvus.
MilvusException – Error communicating with Milvus; more can be found in logging under Debug.
- Returns
Vector store that supports add, delete, and query.
- Return type
MilvusVectorStore
- add(nodes: List[BaseNode]) → List[str]
Add the embeddings and their nodes into Milvus.
- Parameters
nodes (List[BaseNode]) – List of nodes with embeddings to insert.
- Raises
MilvusException – Failed to insert data.
- Returns
List of ids inserted.
- Return type
List[str]
- async adelete(ref_doc_id: str, **delete_kwargs: Any) → None
Delete nodes using ref_doc_id. NOTE: this is not implemented for all vector stores. If not implemented, it will just call delete synchronously.
- async aquery(query: VectorStoreQuery, **kwargs: Any) → VectorStoreQueryResult
Asynchronously query vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call query synchronously.
- async async_add(nodes: List[BaseNode]) → List[str]
Asynchronously add nodes with embeddings to vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call add synchronously.
- property client: Any
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) → None
Delete nodes using ref_doc_id.
- Parameters
ref_doc_id (str) – The doc_id of the document to delete.
- Raises
MilvusException – Failed to delete the doc.
- query(query: VectorStoreQuery, **kwargs: Any) → VectorStoreQueryResult
Query index for top k most similar nodes.
- Parameters
query_embedding (List[float]) – query embedding
similarity_top_k (int) – top k most similar nodes
doc_ids (Optional[List[str]]) – list of doc_ids to filter by
node_ids (Optional[List[str]]) – list of node_ids to filter by
output_fields (Optional[List[str]]) – list of fields to return
embedding_field (Optional[str]) – name of embedding field
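The query semantics described above (restrict candidates by doc_ids, then rank by the default "IP" inner-product metric) can be sketched in pure Python. This is an illustration of the behavior only, not Milvus's implementation; the row layout is invented for the example.

```python
# Inner product, the default "IP" similarity metric (higher is more similar).
def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))

# Sketch: optionally filter by doc_ids, then return the top k doc_ids by IP.
def top_k_ip(query_embedding, rows, similarity_top_k, doc_ids=None):
    # rows: list of dicts with "doc_id" and "embedding" keys (example layout).
    if doc_ids is not None:
        rows = [r for r in rows if r["doc_id"] in doc_ids]
    ranked = sorted(
        rows,
        key=lambda r: inner_product(query_embedding, r["embedding"]),
        reverse=True,
    )
    return [r["doc_id"] for r in ranked[:similarity_top_k]]
```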
- class llama_index.vector_stores.MyScaleVectorStore(myscale_client: Optional[Any] = None, table: str = 'llama_index', database: str = 'default', index_type: str = 'MSTG', metric: str = 'cosine', batch_size: int = 32, index_params: Optional[dict] = None, search_params: Optional[dict] = None, service_context: Optional[ServiceContext] = None, **kwargs: Any)
MyScale Vector Store.
In this vector store, embeddings and docs are stored within an existing MyScale cluster.
During query time, the index uses MyScale to query for the top k most similar nodes.
- Parameters
myscale_client (httpclient) – clickhouse-connect httpclient of an existing MyScale cluster.
table (str, optional) – The name of the MyScale table where data will be stored. Defaults to 'llama_index'.
database (str, optional) – The name of the MyScale database where data will be stored. Defaults to 'default'.
index_type (str, optional) – The type of the MyScale vector index. Defaults to 'MSTG'.
metric (str, optional) – The metric type of the MyScale vector index. Defaults to 'cosine'.
batch_size (int, optional) – The size of documents to insert. Defaults to 32.
index_params (dict, optional) – The index parameters for MyScale. Defaults to None.
search_params (dict, optional) – The search parameters for a MyScale query. Defaults to None.
service_context (ServiceContext, optional) – Vector store service context. Defaults to None.
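The default "cosine" metric above is ordinary cosine similarity. MyScale computes it server-side; the formula is pinned down here only for reference:

```python
import math

# Cosine similarity: dot product divided by the product of the vector norms.
# Returns 1.0 for identical directions, 0.0 for orthogonal vectors.
def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```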
- add(nodes: List[BaseNode]) → List[str]
Add nodes to index.
- Parameters
nodes – List[BaseNode]: list of nodes with embeddings
- async adelete(ref_doc_id: str, **delete_kwargs: Any) → None
Delete nodes using ref_doc_id. NOTE: this is not implemented for all vector stores. If not implemented, it will just call delete synchronously.
- async aquery(query: VectorStoreQuery, **kwargs: Any) → VectorStoreQueryResult
Asynchronously query vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call query synchronously.
- async async_add(nodes: List[BaseNode]) → List[str]
Asynchronously add nodes with embeddings to vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call add synchronously.
- property client: Any
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) → None
Delete nodes using ref_doc_id.
- Parameters
ref_doc_id (str) – The doc_id of the document to delete.
- drop() → None
Drop the MyScale index and table.
- query(query: VectorStoreQuery, **kwargs: Any) → VectorStoreQueryResult
Query index for top k most similar nodes.
- Parameters
query (VectorStoreQuery) – query
- class llama_index.vector_stores.Neo4jVectorStore(username: str, password: str, url: str, embedding_dimension: int, database: str = 'neo4j', index_name: str = 'vector', node_label: str = 'Chunk', embedding_node_property: str = 'embedding', text_node_property: str = 'text', distance_strategy: str = 'cosine', retrieval_query: str = '', **kwargs: Any)
- async adelete(ref_doc_id: str, **delete_kwargs: Any) → None
Delete nodes using ref_doc_id. NOTE: this is not implemented for all vector stores. If not implemented, it will just call delete synchronously.
- async aquery(query: VectorStoreQuery, **kwargs: Any) → VectorStoreQueryResult
Asynchronously query vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call query synchronously.
- async async_add(nodes: List[BaseNode]) → List[str]
Asynchronously add nodes with embeddings to vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call add synchronously.
- property client: Any
Get client.
- create_new_index() → None
This method constructs a Cypher query and executes it to create a new vector index in Neo4j.
- database_query(query: str, params: Optional[dict] = None) → List[Dict[str, Any]]
This method sends a Cypher query to the connected Neo4j database and returns the results as a list of dictionaries.
- Parameters
query (str) – The Cypher query to execute.
params (dict, optional) – Dictionary of query parameters. Defaults to {}.
- Returns
List of dictionaries containing the query results.
- Return type
List[Dict[str, Any]]
- delete(ref_doc_id: str, **delete_kwargs: Any) → None
Delete nodes using ref_doc_id.
- query(query: VectorStoreQuery, **kwargs: Any) → VectorStoreQueryResult
Query vector store.
- retrieve_existing_index() → bool
Check if the vector index exists in the Neo4j database and return its embedding dimension.
This method queries the Neo4j database for existing indexes and attempts to retrieve the dimension of the vector index with the specified name. If the index exists, its dimension is returned. If the index doesn't exist, None is returned.
- Returns
The embedding dimension of the existing index if found.
- Return type
int or None
- class llama_index.vector_stores.OpensearchVectorClient(endpoint: str, index: str, dim: int, embedding_field: str = 'embedding', text_field: str = 'content', method: Optional[dict] = None, **kwargs: Any)
Object encapsulating an OpenSearch index that has vector search enabled.
If the index does not yet exist, it is created during init. Therefore, the underlying index is assumed to either: 1) not exist yet, or 2) have been created by previous usage of this class.
- Parameters
endpoint (str) – URL (http/https) of the OpenSearch endpoint
index (str) – Name of the OpenSearch index
dim (int) – Dimension of the vector
embedding_field (str) – Name of the field in the index to store the embedding array in
text_field (str) – Name of the field to grab text from
method (Optional[dict]) – OpenSearch "method" JSON obj for configuring the KNN index. This includes engine, metric, and other config params. Defaults to: {"name": "hnsw", "space_type": "l2", "engine": "faiss", "parameters": {"ef_construction": 256, "m": 48}}
**kwargs – Optional arguments passed to the OpenSearch client from opensearch-py.
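The default KNN "method" object quoted above, written out as a Python dict for readability. A dict of this shape can be passed as the method argument (or omitted to accept the default):

```python
# Default KNN method configuration, as documented above: an HNSW graph index
# using the faiss engine with L2 distance.
default_method = {
    "name": "hnsw",
    "space_type": "l2",
    "engine": "faiss",
    "parameters": {
        "ef_construction": 256,  # build-time candidate list size
        "m": 48,                 # max neighbors per graph node
    },
}
```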
- delete_doc_id(doc_id: str) → None
Delete a document.
- Parameters
doc_id (str) – document id
- knn(query_embedding: List[float], k: int, filters: Optional[MetadataFilters] = None) → VectorStoreQueryResult
Do knn search.
If there are no filters, do approximate knn search. If there are (pre-)filters, do an exhaustive exact knn search using "painless scripting".
Note that approximate knn search does not support pre-filtering.
- Parameters
query_embedding – Vector embedding to query.
k – Maximum number of results.
filters – Optional filters to apply before the search. Supports filter-context queries documented at https://opensearch.org/docs/latest/query-dsl/query-filter-context/
- Returns
Up to k docs closest to query_embedding
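The exact, pre-filtered path described above can be sketched in pure Python: apply exact-match metadata filters first, then rank the survivors by exact L2 distance. The real store runs this server-side via painless scripting; the doc layout and filter representation below are invented for the example.

```python
# Sketch of exact knn with metadata pre-filtering (not the OpenSearch
# implementation). docs: list of dicts with "embedding" and "metadata" keys;
# filters: list of (key, value) exact-match pairs.
def knn_with_filters(query_embedding, docs, k, filters=None):
    if filters:
        docs = [
            d for d in docs
            if all(d["metadata"].get(key) == value for key, value in filters)
        ]
    # Exhaustive exact search: squared L2 distance over every surviving doc.
    ranked = sorted(
        docs,
        key=lambda d: sum((x - y) ** 2 for x, y in zip(query_embedding, d["embedding"])),
    )
    return ranked[:k]
```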
- class llama_index.vector_stores.OpensearchVectorStore(client: OpensearchVectorClient)
Elasticsearch/Opensearch vector store.
- Parameters
client (OpensearchVectorClient) – Vector index client to use for data insertion/querying.
- add(nodes: List[BaseNode]) → List[str]
Add nodes to index.
- Parameters
nodes – List[BaseNode]: list of nodes with embeddings.
- async adelete(ref_doc_id: str, **delete_kwargs: Any) → None
Delete nodes using ref_doc_id. NOTE: this is not implemented for all vector stores. If not implemented, it will just call delete synchronously.
- async aquery(query: VectorStoreQuery, **kwargs: Any) → VectorStoreQueryResult
Asynchronously query vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call query synchronously.
- async async_add(nodes: List[BaseNode]) → List[str]
Asynchronously add nodes with embeddings to vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call add synchronously.
- property client: Any
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) → None
Delete nodes using ref_doc_id.
- Parameters
ref_doc_id (str) – The doc_id of the document to delete.
- query(query: VectorStoreQuery, **kwargs: Any) → VectorStoreQueryResult
Query index for top k most similar nodes.
- Parameters
query_embedding (List[float]) – query embedding
- pydantic model llama_index.vector_stores.PGVectorStore
JSON schema:
{ "title": "PGVectorStore", "description": "Abstract vector store protocol.", "type": "object", "properties": { "stores_text": { "title": "Stores Text", "default": true, "type": "boolean" }, "is_embedding_query": { "title": "Is Embedding Query", "default": true, "type": "boolean" }, "connection_string": { "title": "Connection String", "type": "string" }, "async_connection_string": { "title": "Async Connection String", "type": "string" }, "table_name": { "title": "Table Name", "type": "string" }, "embed_dim": { "title": "Embed Dim", "type": "integer" }, "hybrid_search": { "title": "Hybrid Search", "type": "boolean" }, "text_search_config": { "title": "Text Search Config", "type": "string" }, "debug": { "title": "Debug", "type": "boolean" }, "flat_metadata": { "title": "Flat Metadata", "default": false, "type": "boolean" } }, "required": [ "connection_string", "async_connection_string", "table_name", "embed_dim", "hybrid_search", "text_search_config", "debug" ] }
- Fields
async_connection_string (str)
connection_string (str)
debug (bool)
embed_dim (int)
hybrid_search (bool)
is_embedding_query (bool)
stores_text (bool)
table_name (str)
text_search_config (str)
- field async_connection_string: str [Required]
- field connection_string: str [Required]
- field debug: bool [Required]
- field embed_dim: int [Required]
- field hybrid_search: bool [Required]
- field is_embedding_query: bool = True
- field stores_text: bool = True
- field table_name: str [Required]
- field text_search_config: str [Required]
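PGVectorStore requires both a sync and an async connection string. A hedged sketch of building the pair is below; the driver suffixes (plain postgresql for sync, asyncpg for async) follow a common SQLAlchemy convention and are an assumption here, not something this reference guarantees:

```python
# Hypothetical helper for building the two connection strings the model
# requires. The URL scheme/driver choices are assumptions for illustration.
def build_connection_strings(user, password, host, port, database):
    base = f"{user}:{password}@{host}:{port}/{database}"
    return {
        "connection_string": f"postgresql://{base}",
        "async_connection_string": f"postgresql+asyncpg://{base}",
    }
```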
- class Select(*entities: _ColumnsClauseArgument[Any])
Represents a SELECT statement.
The Select object is normally constructed using the select() function. See that function for details.
See also
select()
tutorial_selecting_data - in the 2.0 tutorial
- add_columns(*entities: _ColumnsClauseArgument[Any]) → Select[Any]
Return a new select() construct with the given entities appended to its columns clause.
E.g.:
my_select = my_select.add_columns(table.c.new_column)
The original expressions in the columns clause remain in place. To replace the original expressions with new ones, see the method Select.with_only_columns().
- Parameters
*entities – column, table, or other entity expressions to be added to the columns clause
See also
Select.with_only_columns() - replaces existing expressions rather than appending.
orm_queryguide_select_multiple_entities - ORM-centric example
- add_cte(*ctes: CTE, nest_here: bool = False) → Self
Add one or more CTE constructs to this statement.
This method will associate the given CTE constructs with the parent statement such that they will each be unconditionally rendered in the WITH clause of the final statement, even if not referenced elsewhere within the statement or any sub-selects.
The optional nest_here parameter, when set to True, will have the effect that each given CTE will render in a WITH clause rendered directly along with this statement, rather than being moved to the top of the ultimate rendered statement, even if this statement is rendered as a subquery within a larger statement.
This method has two general uses. One is to embed CTE statements that serve some purpose without being referenced explicitly, such as the use case of embedding a DML statement such as an INSERT or UPDATE as a CTE inline with a primary statement that may draw from its results indirectly. The other is to provide control over the exact placement of a particular series of CTE constructs that should remain rendered directly in terms of a particular statement that may be nested in a larger statement.
E.g.:
from sqlalchemy import table, column, select
t = table('t', column('c1'), column('c2'))
ins = t.insert().values({"c1": "x", "c2": "y"}).cte()
stmt = select(t).add_cte(ins)
Would render:
WITH anon_1 AS (INSERT INTO t (c1, c2) VALUES (:param_1, :param_2)) SELECT t.c1, t.c2 FROM t
Above, the "anon_1" CTE is not referred towards in the SELECT statement; however, it still accomplishes the task of running an INSERT statement.
Similarly, in a DML-related context, the PostgreSQL Insert construct can be used to generate an "upsert":
from sqlalchemy import table, column
from sqlalchemy.dialects.postgresql import insert
t = table("t", column("c1"), column("c2"))
delete_statement_cte = t.delete().where(t.c.c1 < 1).cte("deletions")
insert_stmt = insert(t).values({"c1": 1, "c2": 2})
update_statement = insert_stmt.on_conflict_do_update(
    index_elements=[t.c.c1],
    set_={"c1": insert_stmt.excluded.c1, "c2": insert_stmt.excluded.c2},
).add_cte(delete_statement_cte)
print(update_statement)
The above statement renders as:
WITH deletions AS (DELETE FROM t WHERE t.c1 < %(c1_1)s) INSERT INTO t (c1, c2) VALUES (%(c1)s, %(c2)s) ON CONFLICT (c1) DO UPDATE SET c1 = excluded.c1, c2 = excluded.c2
New in version 1.4.21.
- Parameters
*ctes – zero or more CTE constructs. Changed in version 2.0: Multiple CTE instances are accepted.
nest_here – if True, the given CTE or CTEs will be rendered as though they specified the nesting flag as True when they were added to this HasCTE. Assuming the given CTEs are not referenced in an outer-enclosing statement as well, the CTEs given should render at the level of this statement when this flag is given. New in version 2.0.
See also
- alias(name: Optional[str] = None, flat: bool = False) → Subquery
Return a named subquery against this SelectBase.
For a SelectBase (as opposed to a FromClause), this returns a Subquery object which behaves mostly the same as the Alias object that is used with a FromClause.
Changed in version 1.4: The SelectBase.alias() method is now a synonym for the SelectBase.subquery() method.
- as_scalar() → ScalarSelect[Any]
Deprecated since version 1.4: The SelectBase.as_scalar() method is deprecated and will be removed in a future release. Please refer to SelectBase.scalar_subquery().
- property c: ReadOnlyColumnCollection[str, KeyedColumnElement[Any]]
Deprecated since version 1.4: The SelectBase.c and SelectBase.columns attributes are deprecated and will be removed in a future release; these attributes implicitly create a subquery that should be explicit. Please call SelectBase.subquery() first in order to create a subquery, which then contains this attribute. To access the columns that this SELECT object SELECTs from, use the SelectBase.selected_columns attribute.
- column(column: _ColumnsClauseArgument[Any]) → Select[Any]
Return a new select() construct with the given column expression added to its columns clause.
Deprecated since version 1.4: The Select.column() method is deprecated and will be removed in a future release. Please use Select.add_columns().
E.g.:
my_select = my_select.column(table.c.new_column)
See the documentation for Select.with_only_columns() for guidelines on adding / replacing the columns of a Select object.
- property column_descriptions: Any
Return a plugin-enabled "column descriptions" structure referring to the columns which are SELECTed by this statement.
This attribute is generally useful when using the ORM, as an extended structure which includes information about mapped entities is returned. The section queryguide_inspection contains more background.
For a Core-only statement, the structure returned by this accessor is derived from the same objects that are returned by the Select.selected_columns accessor, formatted as a list of dictionaries which contain the keys name, type and expr, which indicate the column expressions to be selected:
>>> stmt = select(user_table)
>>> stmt.column_descriptions
[
    {'name': 'id', 'type': Integer(), 'expr': Column('id', Integer(), ...)},
    {'name': 'name', 'type': String(length=30), 'expr': Column('name', String(length=30), ...)}
]
Changed in version 1.4.33: The Select.column_descriptions attribute returns a structure for a Core-only set of entities, not just ORM-only entities.
See also
UpdateBase.entity_description - entity information for an insert(), update(), or delete()
queryguide_inspection - ORM background
- property columns_clause_froms: List[FromClause]
Return the set of FromClause objects implied by the columns clause of this SELECT statement.
New in version 1.4.23.
See also
Select.froms - "final" FROM list taking the full statement into account
Select.with_only_columns() - makes use of this collection to set up a new FROM list
- compare(other: ClauseElement, **kw: Any) → bool
Compare this ClauseElement to the given ClauseElement.
Subclasses should override the default behavior, which is a straight identity comparison.
**kw are arguments consumed by subclass compare() methods and may be used to modify the criteria for comparison (see ColumnElement).
- compile(bind: Optional[Union[Engine, Connection]] = None, dialect: Optional[Dialect] = None, **kw: Any) → Compiled
Compile this SQL expression.
The return value is a Compiled object. Calling str() or unicode() on the returned value will yield a string representation of the result. The Compiled object also can return a dictionary of bind parameter names and values using the params accessor.
- Parameters
bind – A Connection or Engine which can provide a Dialect in order to generate a Compiled object. If the bind and dialect parameters are both omitted, a default SQL compiler is used.
column_keys – Used for INSERT and UPDATE statements, a list of column names which should be present in the VALUES clause of the compiled statement. If None, all columns from the target table object are rendered.
dialect – A Dialect instance which can generate a Compiled object. This argument takes precedence over the bind argument.
compile_kwargs – optional dictionary of additional parameters that will be passed through to the compiler within all "visit" methods. This allows any custom flag to be passed through to a custom compilation construct, for example. It is also used for the case of passing the literal_binds flag through:
from sqlalchemy.sql import table, column, select
t = table('t', column('x'))
s = select(t).where(t.c.x == 5)
print(s.compile(compile_kwargs={"literal_binds": True}))
See also
faq_sql_expression_string
- correlate(*fromclauses: Union[Literal[None, False], _FromClauseArgument]) → Self
Return a new Select which will correlate the given FROM clauses to that of an enclosing Select.
Calling this method turns off the Select object's default behavior of "auto-correlation". Normally, FROM elements which appear in a Select that encloses this one via its WHERE clause, ORDER BY, HAVING or columns clause will be omitted from this Select object's FROM clause. Setting an explicit correlation collection using the Select.correlate() method provides a fixed list of FROM objects that can potentially take place in this process.
When Select.correlate() is used to apply specific FROM clauses for correlation, the FROM elements become candidates for correlation regardless of how deeply nested this Select object is, relative to an enclosing Select which refers to the same FROM object. This is in contrast to the behavior of "auto-correlation" which only correlates to an immediate enclosing Select. Multi-level correlation ensures that the link between enclosed and enclosing Select is always via at least one WHERE/ORDER BY/HAVING/columns clause in order for correlation to take place.
If None is passed, the Select object will correlate none of its FROM entries, and all will render unconditionally in the local FROM clause.
- Parameters
*fromclauses – one or more FromClause or other FROM-compatible constructs such as an ORM mapped entity to become part of the correlate collection; alternatively pass a single value None to remove all existing correlations.
See also
Select.correlate_except()
tutorial_scalar_subquery
- correlate_except(*fromclauses: Union[Literal[None, False], _FromClauseArgument]) → Self
Return a new Select which will omit the given FROM clauses from the auto-correlation process.
Calling Select.correlate_except() turns off the Select object's default behavior of "auto-correlation" for the given FROM elements. An element specified here will unconditionally appear in the FROM list, while all other FROM elements remain subject to normal auto-correlation behaviors.
If None is passed, or no arguments are passed, the Select object will correlate all of its FROM entries.
- Parameters
*fromclauses – a list of one or more FromClause constructs, or other compatible constructs (i.e. ORM-mapped classes) to become part of the correlate-exception collection.
See also
Select.correlate()
tutorial_scalar_subquery
- corresponding_column(column: KeyedColumnElement[Any], require_embedded: bool = False) Optional[KeyedColumnElement[Any]] ο
Given a
_expression.ColumnElement
, return the exported_expression.ColumnElement
object from the_expression.Selectable.exported_columns
collection of this_expression.Selectable
which corresponds to that original_expression.ColumnElement
via a common ancestor column.- Parameters
column β the target
_expression.ColumnElement
to be matched.require_embedded β only return corresponding columns for the given
_expression.ColumnElement
, if the given_expression.ColumnElement
is actually present within a sub-element of this_expression.Selectable
. Normally the column will match if it merely shares a common ancestor with one of the exported columns of this_expression.Selectable
.
See also
_expression.Selectable.exported_columns
- the_expression.ColumnCollection
that is used for the operation._expression.ColumnCollection.corresponding_column()
- implementation method.
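As a quick illustration (the table and column names here are hypothetical), corresponding_column() can map a column of an original table to the column exported by a subquery that wraps it:

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, select

# hypothetical table for illustration
metadata = MetaData()
user = Table(
    "user",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String),
)

# wrap a SELECT in a subquery; its .c collection exports new columns
subq = select(user).subquery()

# corresponding_column() resolves the original user.c.name to the
# subquery's exported column via their common ancestor
assert subq.corresponding_column(user.c.name) is subq.c.name
```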
- cte(name: Optional[str] = None, recursive: bool = False, nesting: bool = False) CTE
Return a new _expression.CTE, or Common Table Expression instance.
Common table expressions are a SQL standard whereby SELECT statements can draw upon secondary statements specified along with the primary statement, using a clause called "WITH". Special semantics regarding UNION can also be employed to allow "recursive" queries, where a SELECT statement can draw upon the set of rows that have previously been selected.
CTEs can also be applied to DML constructs UPDATE, INSERT and DELETE on some databases, both as a source of CTE rows when combined with RETURNING, as well as a consumer of CTE rows.
SQLAlchemy detects _expression.CTE objects, which are treated similarly to _expression.Alias objects, as special elements to be delivered to the FROM clause of the statement as well as to a WITH clause at the top of the statement.
For special prefixes such as PostgreSQL "MATERIALIZED" and "NOT MATERIALIZED", the _expression.CTE.prefix_with() method may be used to establish these.
Changed in version 1.3.13: Added support for prefixes. In particular - MATERIALIZED and NOT MATERIALIZED.
- Parameters
name – name given to the common table expression. Like _expression.FromClause.alias(), the name can be left as None in which case an anonymous symbol will be used at query compile time.
recursive – if True, will render WITH RECURSIVE. A recursive common table expression is intended to be used in conjunction with UNION ALL in order to derive rows from those already selected.
nesting – if True, will render the CTE locally to the statement in which it is referenced. For more complex scenarios, the HasCTE.add_cte() method using the :paramref:`.HasCTE.add_cte.nest_here` parameter may also be used to more carefully control the exact placement of a particular CTE.
New in version 1.4.24.
See also
HasCTE.add_cte()
The following examples include two from PostgreSQL's documentation at https://www.postgresql.org/docs/current/static/queries-with.html, as well as additional examples.
Example 1, non recursive:
from sqlalchemy import (Table, Column, String, Integer,
                        MetaData, select, func)

metadata = MetaData()

orders = Table('orders', metadata,
    Column('region', String),
    Column('amount', Integer),
    Column('product', String),
    Column('quantity', Integer)
)

regional_sales = select(
    orders.c.region,
    func.sum(orders.c.amount).label('total_sales')
).group_by(orders.c.region).cte("regional_sales")

top_regions = select(regional_sales.c.region).\
    where(
        regional_sales.c.total_sales >
        select(
            func.sum(regional_sales.c.total_sales) / 10
        )
    ).cte("top_regions")

statement = select(
    orders.c.region,
    orders.c.product,
    func.sum(orders.c.quantity).label("product_units"),
    func.sum(orders.c.amount).label("product_sales")
).where(orders.c.region.in_(
    select(top_regions.c.region)
)).group_by(orders.c.region, orders.c.product)

result = conn.execute(statement).fetchall()
Example 2, WITH RECURSIVE:
from sqlalchemy import (Table, Column, String, Integer,
                        MetaData, select, func)

metadata = MetaData()

parts = Table('parts', metadata,
    Column('part', String),
    Column('sub_part', String),
    Column('quantity', Integer),
)

included_parts = select(
    parts.c.sub_part, parts.c.part, parts.c.quantity
).\
    where(parts.c.part == 'our part').\
    cte(recursive=True)

incl_alias = included_parts.alias()
parts_alias = parts.alias()

included_parts = included_parts.union_all(
    select(
        parts_alias.c.sub_part,
        parts_alias.c.part,
        parts_alias.c.quantity
    ).\
    where(parts_alias.c.part == incl_alias.c.sub_part)
)

statement = select(
    included_parts.c.sub_part,
    func.sum(included_parts.c.quantity).label('total_quantity')
).\
    group_by(included_parts.c.sub_part)

result = conn.execute(statement).fetchall()
Example 3, an upsert using UPDATE and INSERT with CTEs:
from datetime import date

from sqlalchemy import (MetaData, Table, Column, Integer,
                        Date, select, literal, and_, exists)

metadata = MetaData()

visitors = Table('visitors', metadata,
    Column('product_id', Integer, primary_key=True),
    Column('date', Date, primary_key=True),
    Column('count', Integer),
)

# add 5 visitors for the product_id == 1
product_id = 1
day = date.today()
count = 5

update_cte = (
    visitors.update()
    .where(and_(visitors.c.product_id == product_id,
                visitors.c.date == day))
    .values(count=visitors.c.count + count)
    .returning(literal(1))
    .cte('update_cte')
)

upsert = visitors.insert().from_select(
    [visitors.c.product_id, visitors.c.date, visitors.c.count],
    select(literal(product_id), literal(day), literal(count))
        .where(~exists(update_cte.select()))
)

connection.execute(upsert)
Example 4, Nesting CTE (SQLAlchemy 1.4.24 and above):
value_a = select(
    literal("root").label("n")
).cte("value_a")

# A nested CTE with the same name as the root one
value_a_nested = select(
    literal("nesting").label("n")
).cte("value_a", nesting=True)

# Nesting CTEs takes ascendency locally
# over the CTEs at a higher level
value_b = select(value_a_nested.c.n).cte("value_b")

value_ab = select(value_a.c.n.label("a"), value_b.c.n.label("b"))
The above query will render the second CTE nested inside the first, shown with inline parameters below as:
WITH value_a AS
    (SELECT 'root' AS n),
value_b AS
    (WITH value_a AS
        (SELECT 'nesting' AS n)
    SELECT value_a.n AS n
    FROM value_a)
SELECT value_a.n AS a, value_b.n AS b
FROM value_a, value_b
The same CTE can be set up using the HasCTE.add_cte() method as follows (SQLAlchemy 2.0 and above):

value_a = select(
    literal("root").label("n")
).cte("value_a")

# A nested CTE with the same name as the root one
value_a_nested = select(
    literal("nesting").label("n")
).cte("value_a")

# Nesting CTEs takes ascendency locally
# over the CTEs at a higher level
value_b = (
    select(value_a_nested.c.n).
    add_cte(value_a_nested, nest_here=True).
    cte("value_b")
)

value_ab = select(value_a.c.n.label("a"), value_b.c.n.label("b"))
Example 5, Non-Linear CTE (SQLAlchemy 1.4.28 and above):
edge = Table(
    "edge",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("left", Integer),
    Column("right", Integer),
)

root_node = select(literal(1).label("node")).cte(
    "nodes", recursive=True
)

left_edge = select(edge.c.left).join(
    root_node, edge.c.right == root_node.c.node
)
right_edge = select(edge.c.right).join(
    root_node, edge.c.left == root_node.c.node
)

subgraph_cte = root_node.union(left_edge, right_edge)

subgraph = select(subgraph_cte)
The above query will render 2 UNIONs inside the recursive CTE:
WITH RECURSIVE nodes(node) AS (
        SELECT 1 AS node
    UNION
        SELECT edge."left" AS "left"
        FROM edge JOIN nodes ON edge."right" = nodes.node
    UNION
        SELECT edge."right" AS "right"
        FROM edge JOIN nodes ON edge."left" = nodes.node
)
SELECT nodes.node FROM nodes
See also
_orm.Query.cte()
- ORM version of_expression.HasCTE.cte()
.
- distinct(*expr: _ColumnExpressionArgument[Any]) Self
Return a new _expression.select() construct which will apply DISTINCT to its columns clause.
- Parameters
*expr – optional column expressions. When present, the PostgreSQL dialect will render a DISTINCT ON (<expressions>) construct.
Deprecated since version 1.4: Using *expr in other dialects is deprecated and will raise _exc.CompileError in a future version.
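A minimal sketch (table name invented) of the plain DISTINCT form:

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, select

metadata = MetaData()
# hypothetical table for illustration
employee = Table(
    "employee",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("department", String),
)

# DISTINCT applied to the columns clause
stmt = select(employee.c.department).distinct()
# SQL: SELECT DISTINCT employee.department FROM employee
```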
- except_(*other: _SelectStatementForCompoundArgument) CompoundSelect
Return a SQL EXCEPT of this select() construct against the given selectables provided as positional arguments.
- Parameters
*other – one or more elements with which to create the EXCEPT.
Changed in version 1.4.28: multiple elements are now accepted.
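For example (hypothetical table), EXCEPT keeps the rows of the first SELECT that do not appear in the second:

```python
from sqlalchemy import Column, Integer, MetaData, Table, select

metadata = MetaData()
# hypothetical table for illustration
num = Table("num", metadata, Column("n", Integer))

s1 = select(num.c.n).where(num.c.n > 0)
s2 = select(num.c.n).where(num.c.n > 10)

# rows matching s1 but not s2
stmt = s1.except_(s2)
```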
- except_all(*other: _SelectStatementForCompoundArgument) CompoundSelect
Return a SQL EXCEPT ALL of this select() construct against the given selectables provided as positional arguments.
- Parameters
*other – one or more elements with which to create the EXCEPT ALL.
Changed in version 1.4.28: multiple elements are now accepted.
- execution_options(**kw: Any) Self
Set non-SQL options for the statement which take effect during execution.
Execution options can be set at many scopes, including per-statement, per-connection, or per execution, using methods such as _engine.Connection.execution_options() and parameters which accept a dictionary of options such as :paramref:`_engine.Connection.execute.execution_options` and :paramref:`_orm.Session.execute.execution_options`.
The primary characteristic of an execution option, as opposed to other kinds of options such as ORM loader options, is that execution options never affect the compiled SQL of a query, only things that affect how the SQL statement itself is invoked or how results are fetched. That is, execution options are not part of what's accommodated by SQL compilation nor are they considered part of the cached state of a statement.
The _sql.Executable.execution_options() method is generative, as is the case for the method as applied to the _engine.Engine and _orm.Query objects, which means when the method is called, a copy of the object is returned, which applies the given parameters to that new copy, but leaves the original unchanged:

statement = select(table.c.x, table.c.y)
new_statement = statement.execution_options(my_option=True)

An exception to this behavior is the _engine.Connection object, where the _engine.Connection.execution_options() method is explicitly not generative.
The kinds of options that may be passed to _sql.Executable.execution_options() and other related methods and parameter dictionaries include parameters that are explicitly consumed by SQLAlchemy Core or ORM, as well as arbitrary keyword arguments not defined by SQLAlchemy, which means the methods and/or parameter dictionaries may be used for user-defined parameters that interact with custom code, which may access the parameters using methods such as _sql.Executable.get_execution_options() and _engine.Connection.get_execution_options(), or within selected event hooks using a dedicated execution_options event parameter such as :paramref:`_events.ConnectionEvents.before_execute.execution_options` or _orm.ORMExecuteState.execution_options, e.g.:

from sqlalchemy import event

@event.listens_for(some_engine, "before_execute")
def _process_opt(conn, statement, multiparams, params, execution_options):
    "run a SQL function before invoking a statement"

    if execution_options.get("do_special_thing", False):
        conn.exec_driver_sql("run_special_function()")
Within the scope of options that are explicitly recognized by SQLAlchemy, most apply to specific classes of objects and not others. The most common execution options include:
:paramref:`_engine.Connection.execution_options.isolation_level` - sets the isolation level for a connection or a class of connections via an
_engine.Engine
. This option is accepted only by_engine.Connection
or_engine.Engine
.:paramref:`_engine.Connection.execution_options.stream_results` - indicates results should be fetched using a server side cursor; this option is accepted by
_engine.Connection
, by the :paramref:`_engine.Connection.execute.execution_options` parameter on_engine.Connection.execute()
, and additionally by_sql.Executable.execution_options()
on a SQL statement object, as well as by ORM constructs like_orm.Session.execute()
.:paramref:`_engine.Connection.execution_options.compiled_cache` - indicates a dictionary that will serve as the SQL compilation cache for a
_engine.Connection
or_engine.Engine
, as well as for ORM methods like_orm.Session.execute()
. Can be passed asNone
to disable caching for statements. This option is not accepted by_sql.Executable.execution_options()
as it is inadvisable to carry along a compilation cache within a statement object.:paramref:`_engine.Connection.execution_options.schema_translate_map` - a mapping of schema names used by the Schema Translate Map feature, accepted by
_engine.Connection
,_engine.Engine
,_sql.Executable
, as well as by ORM constructs like_orm.Session.execute()
.
See also
_engine.Connection.execution_options()
:paramref:`_engine.Connection.execute.execution_options`
:paramref:`_orm.Session.execute.execution_options`
orm_queryguide_execution_options - documentation on all ORM-specific execution options
- exists() Exists ο
Return an
_sql.Exists
representation of this selectable, which can be used as a column expression.
The returned object is an instance of
_sql.Exists
.See also
_sql.exists()
tutorial_exists - in the 2.0 style tutorial.
New in version 1.4.
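A short sketch (table names invented): a correlated SELECT is turned into an EXISTS column expression and used in a WHERE clause:

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, select

metadata = MetaData()
# hypothetical tables for illustration
user = Table("user", metadata,
             Column("id", Integer, primary_key=True),
             Column("name", String))
address = Table("address", metadata,
                Column("id", Integer, primary_key=True),
                Column("user_id", Integer))

# turn a SELECT into an EXISTS column expression
has_address = (
    select(address.c.id)
    .where(address.c.user_id == user.c.id)
    .exists()
)

# users having at least one address
stmt = select(user.c.name).where(has_address)
```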
- property exported_columns: ReadOnlyColumnCollection[str, ColumnElement[Any]]ο
A _expression.ColumnCollection that represents the "exported" columns of this _expression.Selectable, not including _sql.TextClause constructs.
The "exported" columns for a _expression.SelectBase object are synonymous with the _expression.SelectBase.selected_columns collection.
New in version 1.4.
See also
_expression.Select.exported_columns
_expression.Selectable.exported_columns
_expression.FromClause.exported_columns
- fetch(count: _LimitOffsetType, with_ties: bool = False, percent: bool = False) Self ο
Return a new selectable with the given FETCH FIRST criterion applied.
This is a numeric value which usually renders as a FETCH {FIRST | NEXT} [ count ] {ROW | ROWS} {ONLY | WITH TIES} expression in the resulting select. This functionality is currently implemented for Oracle, PostgreSQL, MSSQL.
Use _sql.GenerativeSelect.offset() to specify the offset.
Note
The _sql.GenerativeSelect.fetch() method will replace any clause applied with _sql.GenerativeSelect.limit().
New in version 1.4.
- Parameters
count – an integer COUNT parameter, or a SQL expression that provides an integer result. When percent=True this will represent the percentage of rows to return, not the absolute value. Pass None to reset it.
with_ties – When True, the WITH TIES option is used to return any additional rows that tie for the last place in the result set according to the ORDER BY clause. The ORDER BY may be mandatory in this case. Defaults to False.
percent – When True, count represents the percentage of the total number of selected rows to return. Defaults to False.
See also
_sql.GenerativeSelect.limit()
_sql.GenerativeSelect.offset()
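A hedged sketch (table name invented): FETCH FIRST … WITH TIES, compiled against a dialect that implements it:

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, select
from sqlalchemy.dialects import postgresql

metadata = MetaData()
# hypothetical table for illustration
score = Table("score", metadata,
              Column("player", String),
              Column("points", Integer))

# top 3 rows, plus any rows tied with third place on the ORDER BY
stmt = (
    select(score)
    .order_by(score.c.points.desc())
    .fetch(3, with_ties=True)
)

# FETCH FIRST requires a supporting dialect, e.g. PostgreSQL
compiled = stmt.compile(dialect=postgresql.dialect())
```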
- filter(*criteria: _ColumnExpressionArgument[bool]) Self ο
A synonym for the
_sql.Select.where()
method.
- filter_by(**kwargs: Any) Self ο
Apply the given filtering criterion as a WHERE clause to this select.
- from_statement(statement: ReturnsRowsRole) ExecutableReturnsRows ο
Apply the columns which this
Select
would select onto another statement.This operation is plugin-specific and will raise a not supported exception if this
_sql.Select
does not select from plugin-enabled entities.The statement is typically either a
_expression.text()
or_expression.select()
construct, and should return the set of columns appropriate to the entities represented by thisSelect
.See also
orm_queryguide_selecting_text - usage examples in the ORM Querying Guide
- property froms: Sequence[FromClause]ο
Return the displayed list of
_expression.FromClause
elements.Deprecated since version 1.4.23: The
_expression.Select.froms
attribute is moved to the_expression.Select.get_final_froms()
method.
- get_children(**kw: Any) Iterable[ClauseElement] ο
Return immediate child
visitors.HasTraverseInternals
elements of thisvisitors.HasTraverseInternals
.This is used for visit traversal.
**kw may contain flags that change the collection that is returned, for example to return a subset of items in order to cut down on larger traversals, or to return child items from a different context (such as schema-level collections instead of clause-level).
- get_execution_options() _ExecuteOptions ο
Get the non-SQL options which will take effect during execution.
New in version 1.3.
See also
Executable.execution_options()
- get_final_froms() Sequence[FromClause] ο
Compute the final displayed list of
_expression.FromClause
elements.This method will run through the full computation required to determine what FROM elements will be displayed in the resulting SELECT statement, including shadowing individual tables with JOIN objects, as well as full computation for ORM use cases including eager loading clauses.
For ORM use, this accessor returns the post compilation list of FROM objects; this collection will include elements such as eagerly loaded tables and joins. The objects will not be ORM enabled and do not work as a replacement for the _sql.Select.select_froms() collection; additionally, the method does not perform well for an ORM enabled statement as it will incur the full ORM construction process.
To retrieve the FROM list that's implied by the "columns" collection passed to the _sql.Select originally, use the _sql.Select.columns_clause_froms accessor.
To select from an alternative set of columns while maintaining the FROM list, use the
_sql.Select.with_only_columns()
method and pass the :paramref:`_sql.Select.with_only_columns.maintain_column_froms` parameter.New in version 1.4.23: - the
_sql.Select.get_final_froms()
method replaces the previous_sql.Select.froms
accessor, which is deprecated.See also
_sql.Select.columns_clause_froms
- get_label_style() SelectLabelStyle ο
Retrieve the current label style.
New in version 1.4.
- group_by(_GenerativeSelect__first: Union[Literal[None, _NoArg.NO_ARG], _ColumnExpressionOrStrLabelArgument[Any]] = _NoArg.NO_ARG, *clauses: _ColumnExpressionOrStrLabelArgument[Any]) Self
Return a new selectable with the given list of GROUP BY criterion applied.
All existing GROUP BY settings can be suppressed by passing None.
e.g.:

stmt = select(table.c.name, func.max(table.c.stat)).\
    group_by(table.c.name)

- Parameters
*clauses – a series of _expression.ColumnElement constructs which will be used to generate a GROUP BY clause.
See also
tutorial_group_by_w_aggregates - in the unified_tutorial
tutorial_order_by_label - in the unified_tutorial
- having(*having: _ColumnExpressionArgument[bool]) Self ο
Return a new
_expression.select()
construct with the given expression added to its HAVING clause, joined to the existing clause via AND, if any.
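A minimal sketch (table name invented) combining group_by() and having() to filter on an aggregate:

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, func, select

metadata = MetaData()
# hypothetical table for illustration
orders = Table("orders", metadata,
               Column("region", String),
               Column("amount", Integer))

# keep only regions whose total amount exceeds 100
stmt = (
    select(orders.c.region, func.sum(orders.c.amount).label("total"))
    .group_by(orders.c.region)
    .having(func.sum(orders.c.amount) > 100)
)
```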
- inherit_cache: Optional[bool] = None
Indicate if this HasCacheKey instance should make use of the cache key generation scheme used by its immediate superclass.
The attribute defaults to None, which indicates that a construct has not yet taken into account whether or not it's appropriate for it to participate in caching; this is functionally equivalent to setting the value to False, except that a warning is also emitted.
This flag can be set to True on a particular class, if the SQL that corresponds to the object does not change based on attributes which are local to this class, and not its superclass.
See also
compilerext_caching - General guidelines for setting the HasCacheKey.inherit_cache attribute for third-party or user defined SQL constructs.
- property inner_columns: _SelectIterableο
An iterator of all
_expression.ColumnElement
expressions which would be rendered into the columns clause of the resulting SELECT statement.This method is legacy as of 1.4 and is superseded by the
_expression.Select.exported_columns
collection.
- intersect(*other: _SelectStatementForCompoundArgument) CompoundSelect
Return a SQL INTERSECT of this select() construct against the given selectables provided as positional arguments.
- Parameters
*other – one or more elements with which to create the INTERSECT.
Changed in version 1.4.28: multiple elements are now accepted.
**kwargs – keyword arguments are forwarded to the constructor for the newly created _sql.CompoundSelect object.
- intersect_all(*other: _SelectStatementForCompoundArgument) CompoundSelect
Return a SQL INTERSECT ALL of this select() construct against the given selectables provided as positional arguments.
- Parameters
*other – one or more elements with which to create the INTERSECT ALL.
Changed in version 1.4.28: multiple elements are now accepted.
**kwargs – keyword arguments are forwarded to the constructor for the newly created _sql.CompoundSelect object.
- is_derived_from(fromclause: Optional[FromClause]) bool
Return True if this ReturnsRows is "derived" from the given FromClause.
An example would be an Alias of a Table, which is derived from that Table.
- join(target: _JoinTargetArgument, onclause: Optional[_OnClauseArgument] = None, *, isouter: bool = False, full: bool = False) Self ο
Create a SQL JOIN against this _expression.Select object's criterion and apply generatively, returning the newly resulting _expression.Select.
E.g.:

stmt = select(user_table).join(address_table,
                user_table.c.id == address_table.c.user_id)

The above statement generates SQL similar to:

SELECT user.id, user.name
FROM user JOIN address ON user.id = address.user_id
Changed in version 1.4:
_expression.Select.join()
now creates a_sql.Join
object between a_sql.FromClause
source that is within the FROM clause of the existing SELECT, and a given target_sql.FromClause
, and then adds this_sql.Join
to the FROM clause of the newly generated SELECT statement. This is completely reworked from the behavior in 1.3, which would instead create a subquery of the entire _expression.Select and then join that subquery to the target.
This is a backwards incompatible change as the previous behavior was mostly useless, producing an unnamed subquery rejected by most databases in any case. The new behavior is modeled after that of the very successful
_orm.Query.join()
method in the ORM, in order to support the functionality of_orm.Query
being available by using a_sql.Select
object with an_orm.Session
.See the notes for this change at change_select_join.
- Parameters
target β target table to join towards
onclause β ON clause of the join. If omitted, an ON clause is generated automatically based on the
_schema.ForeignKey
linkages between the two tables, if one can be unambiguously determined, otherwise an error is raised.isouter β if True, generate LEFT OUTER join. Same as
_expression.Select.outerjoin()
.full β if True, generate FULL OUTER join.
See also
tutorial_select_join - in the /tutorial/index
orm_queryguide_joins - in the queryguide_toplevel
_expression.Select.join_from()
_expression.Select.outerjoin()
- join_from(from_: _FromClauseArgument, target: _JoinTargetArgument, onclause: Optional[_OnClauseArgument] = None, *, isouter: bool = False, full: bool = False) Self ο
Create a SQL JOIN against this _expression.Select object's criterion and apply generatively, returning the newly resulting _expression.Select.
E.g.:

stmt = select(user_table, address_table).join_from(
    user_table, address_table, user_table.c.id == address_table.c.user_id
)

The above statement generates SQL similar to:

SELECT user.id, user.name, address.id, address.email, address.user_id
FROM user JOIN address ON user.id = address.user_id
New in version 1.4.
- Parameters
from_ β the left side of the join, will be rendered in the FROM clause and is roughly equivalent to using the
Select.select_from()
method.target β target table to join towards
onclause β ON clause of the join.
isouter β if True, generate LEFT OUTER join. Same as
_expression.Select.outerjoin()
.full β if True, generate FULL OUTER join.
See also
tutorial_select_join - in the /tutorial/index
orm_queryguide_joins - in the queryguide_toplevel
_expression.Select.join()
- label(name: Optional[str]) Label[Any] ο
Return a "scalar" representation of this selectable, embedded as a subquery with a label.
See also
_expression.SelectBase.scalar_subquery()
.
- lateral(name: Optional[str] = None) LateralFromClause ο
Return a LATERAL alias of this
_expression.Selectable
.The return value is the
_expression.Lateral
construct also provided by the top-level_expression.lateral()
function.See also
tutorial_lateral_correlation - overview of usage.
- limit(limit: _LimitOffsetType) Self ο
Return a new selectable with the given LIMIT criterion applied.
This is a numerical value which usually renders as a LIMIT expression in the resulting select. Backends that don't support LIMIT will attempt to provide similar functionality.
Note
The _sql.GenerativeSelect.limit() method will replace any clause applied with _sql.GenerativeSelect.fetch().
- Parameters
limit – an integer LIMIT parameter, or a SQL expression that provides an integer result. Pass None to reset it.
See also
_sql.GenerativeSelect.fetch()
_sql.GenerativeSelect.offset()
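A minimal sketch (table name invented) combining limit() and offset() for simple pagination:

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, select

metadata = MetaData()
# hypothetical table for illustration
item = Table("item", metadata,
             Column("id", Integer, primary_key=True),
             Column("name", String))

# page 3 with 20 rows per page
page, per_page = 3, 20
stmt = (
    select(item)
    .order_by(item.c.id)
    .limit(per_page)
    .offset((page - 1) * per_page)
)
```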
- offset(offset: _LimitOffsetType) Self ο
Return a new selectable with the given OFFSET criterion applied.
This is a numeric value which usually renders as an OFFSET expression in the resulting select. Backends that don't support OFFSET will attempt to provide similar functionality.
- Parameters
offset – an integer OFFSET parameter, or a SQL expression that provides an integer result. Pass None to reset it.
See also
_sql.GenerativeSelect.limit()
_sql.GenerativeSelect.fetch()
- options(*options: ExecutableOption) Self ο
Apply options to this statement.
In the general sense, options are any kind of Python object that can be interpreted by the SQL compiler for the statement. These options can be consumed by specific dialects or specific kinds of compilers.
The most commonly known kind of option are the ORM level options that apply βeager loadβ and other loading behaviors to an ORM query. However, options can theoretically be used for many other purposes.
For background on specific kinds of options for specific kinds of statements, refer to the documentation for those option objects.
Changed in version 1.4: - added
Executable.options()
to Core statement objects towards the goal of allowing unified Core / ORM querying capabilities.See also
loading_columns - refers to options specific to the usage of ORM queries
relationship_loader_options - refers to options specific to the usage of ORM queries
- order_by(_GenerativeSelect__first: Union[Literal[None, _NoArg.NO_ARG], _ColumnExpressionOrStrLabelArgument[Any]] = _NoArg.NO_ARG, *clauses: _ColumnExpressionOrStrLabelArgument[Any]) Self ο
Return a new selectable with the given list of ORDER BY criteria applied.
e.g.:
stmt = select(table).order_by(table.c.id, table.c.name)

Calling this method multiple times is equivalent to calling it once with all the clauses concatenated. All existing ORDER BY criteria may be cancelled by passing None by itself. New ORDER BY criteria may then be added by invoking _orm.Query.order_by() again, e.g.:

# will erase all ORDER BY and ORDER BY new_col alone
stmt = stmt.order_by(None).order_by(new_col)
- Parameters
*clauses β a series of
_expression.ColumnElement
constructs which will be used to generate an ORDER BY clause.
See also
tutorial_order_by - in the unified_tutorial
tutorial_order_by_label - in the unified_tutorial
- outerjoin(target: _JoinTargetArgument, onclause: Optional[_OnClauseArgument] = None, *, full: bool = False) Self ο
Create a left outer join.
Parameters are the same as that of
_expression.Select.join()
.Changed in version 1.4:
_expression.Select.outerjoin()
now creates a_sql.Join
object between a_sql.FromClause
source that is within the FROM clause of the existing SELECT, and a given target_sql.FromClause
, and then adds this_sql.Join
to the FROM clause of the newly generated SELECT statement. This is completely reworked from the behavior in 1.3, which would instead create a subquery of the entire _expression.Select and then join that subquery to the target.
This is a backwards incompatible change as the previous behavior was mostly useless, producing an unnamed subquery rejected by most databases in any case. The new behavior is modeled after that of the very successful
_orm.Query.join()
method in the ORM, in order to support the functionality of_orm.Query
being available by using a_sql.Select
object with an_orm.Session
.See the notes for this change at change_select_join.
See also
tutorial_select_join - in the /tutorial/index
orm_queryguide_joins - in the queryguide_toplevel
_expression.Select.join()
- outerjoin_from(from_: _FromClauseArgument, target: _JoinTargetArgument, onclause: Optional[_OnClauseArgument] = None, *, full: bool = False) Self ο
Create a SQL LEFT OUTER JOIN against this _expression.Select object's criterion and apply generatively, returning the newly resulting _expression.Select.
Usage is the same as that of _selectable.Select.join_from().
- params(_ClauseElement__optionaldict: Optional[Mapping[str, Any]] = None, **kwargs: Any) Self ο
Return a copy with
_expression.bindparam()
elements replaced.Returns a copy of this ClauseElement with
_expression.bindparam()
elements replaced with values taken from the given dictionary:

>>> clause = column('x') + bindparam('foo')
>>> print(clause.compile().params)
{'foo':None}
>>> print(clause.params({'foo':7}).compile().params)
{'foo':7}
- prefix_with(*prefixes: _TextCoercedExpressionArgument[Any], dialect: str = '*') Self ο
Add one or more expressions following the statement keyword, i.e. SELECT, INSERT, UPDATE, or DELETE. Generative.
This is used to support backend-specific prefix keywords such as those provided by MySQL.
E.g.:
stmt = table.insert().prefix_with("LOW_PRIORITY", dialect="mysql")

# MySQL 5.7 optimizer hints
stmt = select(table).prefix_with(
    "/*+ BKA(t1) */", dialect="mysql")
Multiple prefixes can be specified by multiple calls to
_expression.HasPrefixes.prefix_with()
.- Parameters
*prefixes β textual or
_expression.ClauseElement
construct which will be rendered following the INSERT, UPDATE, or DELETE keyword.dialect β optional string dialect name which will limit rendering of this prefix to only that dialect.
- reduce_columns(only_synonyms: bool = True) Select ο
Return a new
_expression.select()
construct with redundantly named, equivalently-valued columns removed from the columns clause.
"Redundant" here means two columns where one refers to the other either based on foreign key, or via a simple equality comparison in the WHERE clause of the statement. The primary purpose of this method is to automatically construct a select statement with all uniquely-named columns, without the need to use table-qualified labels as _expression.Select.set_label_style() does.
When columns are omitted based on foreign key, the referred-to column is the one that's kept. When columns are omitted based on WHERE equivalence, the first column in the columns clause is the one that's kept.
- Parameters
only_synonyms β when True, limit the removal of columns to those which have the same name as the equivalent. Otherwise, all columns that are equivalent to another are removed.
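A hedged sketch (table names invented): two same-named columns linked by a foreign key and an equality in the WHERE clause, where reduce_columns() should keep only the referred-to column:

```python
from sqlalchemy import Column, ForeignKey, Integer, MetaData, Table, select

metadata = MetaData()
# hypothetical tables for illustration
parent = Table("parent", metadata,
               Column("id", Integer, primary_key=True))
child = Table("child", metadata,
              Column("id", Integer, ForeignKey("parent.id"),
                     primary_key=True))

# both tables expose an "id" column; the FK and the WHERE clause
# make them equivalent, so the duplicate is dropped
stmt = (
    select(parent, child)
    .where(parent.c.id == child.c.id)
    .reduce_columns()
)
```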
- replace_selectable(old: FromClause, alias: Alias) Self ο
Replace all occurrences of
_expression.FromClause
βoldβ with the given_expression.Alias
object, returning a copy of this_expression.FromClause
.Deprecated since version 1.4: The
Selectable.replace_selectable()
method is deprecated, and will be removed in a future release. Similar functionality is available via the sqlalchemy.sql.visitors module.
- scalar_subquery() ScalarSelect[Any] ο
Return a "scalar" representation of this selectable, which can be used as a column expression.
The returned object is an instance of
_sql.ScalarSelect
.Typically, a select statement which has only one column in its columns clause is eligible to be used as a scalar expression. The scalar subquery can then be used in the WHERE clause or columns clause of an enclosing SELECT.
Note that the scalar subquery differentiates from the FROM-level subquery that can be produced using the
_expression.SelectBase.subquery()
method.See also
tutorial_scalar_subquery - in the 2.0 tutorial
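A minimal sketch (table name invented): a one-column SELECT becomes a scalar subquery usable inside a WHERE clause:

```python
from sqlalchemy import Column, Integer, MetaData, Table, func, select

metadata = MetaData()
# hypothetical table for illustration
account = Table("account", metadata,
                Column("id", Integer, primary_key=True),
                Column("balance", Integer))

# single-column SELECT, usable as a column expression
max_balance = select(func.max(account.c.balance)).scalar_subquery()

# accounts holding the maximum balance
stmt = select(account.c.id).where(account.c.balance == max_balance)
```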
- select(*arg: Any, **kw: Any) Select ο
Deprecated since version 1.4: The
_expression.SelectBase.select()
method is deprecated and will be removed in a future release; this method implicitly creates a subquery that should be explicit. Please call_expression.SelectBase.subquery()
first in order to create a subquery, which then can be selected.
- select_from(*froms: _FromClauseArgument) Self ο
Return a new
_expression.select()
construct with the given FROM expression(s) merged into its list of FROM objects.E.g.:
table1 = table('t1', column('a'))
table2 = table('t2', column('b'))
s = select(table1.c.a).\
    select_from(
        table1.join(table2, table1.c.a == table2.c.b)
    )

The "from" list is a unique set on the identity of each element, so adding an already present
_schema.Table
or other selectable will have no effect. Passing a_expression.Join
that refers to an already present_schema.Table
or other selectable will have the effect of concealing the presence of that selectable as an individual element in the rendered FROM list, instead rendering it into a JOIN clause.While the typical purpose of
_expression.Select.select_from()
is to replace the default, derived FROM clause with a join, it can also be called with individual table elements, multiple times if desired, in the case that the FROM clause cannot be fully derived from the columns clause:select(func.count('*')).select_from(table1)
- selected_columnsο
A _expression.ColumnCollection representing the columns that this SELECT statement or similar construct returns in its result set, not including _sql.TextClause constructs.
This collection differs from the _expression.FromClause.columns collection of a _expression.FromClause in that the columns within this collection cannot be directly nested inside another SELECT statement; a subquery must be applied first which provides for the necessary parenthesization required by SQL.
For a _expression.select() construct, the collection here is exactly what would be rendered inside the "SELECT" statement, and the _expression.ColumnElement objects are directly present as they were given, e.g.:
col1 = column('q', Integer)
col2 = column('p', Integer)
stmt = select(col1, col2)
Above, stmt.selected_columns would be a collection that contains the col1 and col2 objects directly. For a statement that is against a _schema.Table or other _expression.FromClause, the collection will use the _expression.ColumnElement objects that are in the _expression.FromClause.c collection of the from element.
A use case for the _sql.Select.selected_columns collection is to allow the existing columns to be referenced when adding additional criteria, e.g.:
def filter_on_id(my_select, id):
    return my_select.where(my_select.selected_columns['id'] == id)

stmt = select(MyModel)

# adds "WHERE id=:param" to the statement
stmt = filter_on_id(stmt, 42)
Note
The _sql.Select.selected_columns collection does not include expressions established in the columns clause using the _sql.text() construct; these are silently omitted from the collection. To use plain textual column expressions inside of a _sql.Select construct, use the _sql.literal_column() construct.
New in version 1.4.
- self_group(against: Optional[OperatorType] = None) Union[SelectStatementGrouping, Self] ο
Apply a "grouping" to this _expression.ClauseElement.
This method is overridden by subclasses to return a "grouping" construct, i.e. parenthesis. In particular it's used by "binary" expressions to provide a grouping around themselves when placed into a larger expression, as well as by _expression.select() constructs when placed into the FROM clause of another _expression.select(). (Note that subqueries should normally be created using the _expression.Select.alias() method, as many platforms require nested SELECT statements to be named.)
As expressions are composed together, the application of self_group() is automatic - end-user code should never need to use this method directly. Note that SQLAlchemy's clause constructs take operator precedence into account - so parenthesis might not be needed, for example, in an expression like x OR (y AND z) - AND takes precedence over OR.
The base self_group() method of _expression.ClauseElement just returns self.
- set_label_style(style: SelectLabelStyle) Self ο
Return a new selectable with the specified label style.
There are three "label styles" available: _sql.SelectLabelStyle.LABEL_STYLE_DISAMBIGUATE_ONLY, _sql.SelectLabelStyle.LABEL_STYLE_TABLENAME_PLUS_COL, and _sql.SelectLabelStyle.LABEL_STYLE_NONE. The default style is _sql.SelectLabelStyle.LABEL_STYLE_TABLENAME_PLUS_COL.
In modern SQLAlchemy, there is not generally a need to change the labeling style, as per-expression labels are more effectively used by making use of the _sql.ColumnElement.label() method. In past versions, _sql.LABEL_STYLE_TABLENAME_PLUS_COL was used to disambiguate same-named columns from different tables, aliases, or subqueries; the newer _sql.LABEL_STYLE_DISAMBIGUATE_ONLY now applies labels only to names that conflict with an existing name, so that the impact of this labeling is minimal.
The rationale for disambiguation is mostly so that all column expressions are available from a given _sql.FromClause.c collection when a subquery is created.
New in version 1.4: the _sql.GenerativeSelect.set_label_style() method replaces the previous combination of .apply_labels(), .with_labels() and use_labels=True methods and/or parameters.
See also
_sql.LABEL_STYLE_DISAMBIGUATE_ONLY
_sql.LABEL_STYLE_TABLENAME_PLUS_COL
_sql.LABEL_STYLE_NONE
_sql.LABEL_STYLE_DEFAULT
- slice(start: int, stop: int) Self ο
Apply LIMIT / OFFSET to this statement based on a slice.
The start and stop indices behave like the arguments to Python's built-in range() function. This method provides an alternative to using LIMIT/OFFSET to get a slice of the query.
For example,
stmt = select(User).order_by(User.id).slice(1, 3)
renders as
SELECT users.id AS users_id, users.name AS users_name
FROM users ORDER BY users.id
LIMIT ? OFFSET ?
(2, 1)
Note
The _sql.GenerativeSelect.slice() method will replace any clause applied with _sql.GenerativeSelect.fetch().
New in version 1.4: Added the _sql.GenerativeSelect.slice() method generalized from the ORM.
See also
_sql.GenerativeSelect.limit()
_sql.GenerativeSelect.offset()
_sql.GenerativeSelect.fetch()
- subquery(name: Optional[str] = None) Subquery ο
Return a subquery of this _expression.SelectBase.
A subquery is, from a SQL perspective, a parenthesized, named construct that can be placed in the FROM clause of another SELECT statement.
Given a SELECT statement such as:
stmt = select(table.c.id, table.c.name)
The above statement might look like:
SELECT table.id, table.name FROM table
The subquery form by itself renders the same way, however when embedded into the FROM clause of another SELECT statement, it becomes a named sub-element:
subq = stmt.subquery()
new_stmt = select(subq)
The above renders as:
SELECT anon_1.id, anon_1.name
FROM (SELECT table.id, table.name FROM table) AS anon_1
Historically, _expression.SelectBase.subquery() is equivalent to calling the _expression.FromClause.alias() method on a FROM object; however, as a _expression.SelectBase object is not directly a FROM object, the _expression.SelectBase.subquery() method provides clearer semantics.
New in version 1.4.
- suffix_with(*suffixes: _TextCoercedExpressionArgument[Any], dialect: str = '*') Self ο
Add one or more expressions following the statement as a whole.
This is used to support backend-specific suffix keywords on certain constructs.
E.g.:
stmt = select(col1, col2).cte().suffix_with(
    "cycle empno set y_cycle to 1 default 0",
    dialect="oracle",
)
Multiple suffixes can be specified by multiple calls to _expression.HasSuffixes.suffix_with().
- Parameters
*suffixes β textual or _expression.ClauseElement construct which will be rendered following the target clause.
dialect β Optional string dialect name which will limit rendering of this suffix to only that dialect.
- union(*other: _SelectStatementForCompoundArgument) CompoundSelect ο
Return a SQL UNION of this select() construct against the given selectables provided as positional arguments.
- Parameters
*other β one or more elements with which to create a UNION.
Changed in version 1.4.28: multiple elements are now accepted.
**kwargs β keyword arguments are forwarded to the constructor for the newly created _sql.CompoundSelect object.
- union_all(*other: _SelectStatementForCompoundArgument) CompoundSelect ο
Return a SQL UNION ALL of this select() construct against the given selectables provided as positional arguments.
- Parameters
*other β one or more elements with which to create a UNION.
Changed in version 1.4.28: multiple elements are now accepted.
**kwargs β keyword arguments are forwarded to the constructor for the newly created _sql.CompoundSelect object.
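The UNION / UNION ALL behavior above can be sketched as follows (table names t1 and t2 are illustrative):

```python
from sqlalchemy import column, select, table

t1 = table("t1", column("x"))
t2 = table("t2", column("x"))

# union_all() accepts one or more selectables as positional arguments
# (multiple elements accepted as of 1.4.28) and returns a CompoundSelect.
stmt = select(t1.c.x).union_all(select(t2.c.x))
print(stmt)
```

`union()` is used identically and renders `UNION` instead of `UNION ALL`.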
- unique_params(_ClauseElement__optionaldict: Optional[Dict[str, Any]] = None, **kwargs: Any) Self ο
Return a copy with _expression.bindparam() elements replaced.
Same functionality as _expression.ClauseElement.params(), except adds unique=True to affected bind parameters so that multiple statements can be used.
- where(*whereclause: _ColumnExpressionArgument[bool]) Self ο
Return a new _expression.select() construct with the given expression added to its WHERE clause, joined to the existing clause via AND, if any.
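The AND-joining behavior can be sketched as follows (table and column names are illustrative):

```python
from sqlalchemy import column, select, table

t = table("t", column("a"), column("b"))

# Successive where() calls are joined to the existing criteria via AND.
stmt = select(t).where(t.c.a == 1).where(t.c.b == 2)
print(stmt)
```

Compiling the statement shows both criteria combined into a single WHERE clause.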
- property whereclause: Optional[ColumnElement[Any]]ο
Return the completed WHERE clause for this _expression.Select statement.
This assembles the current collection of WHERE criteria into a single _expression.BooleanClauseList construct.
New in version 1.4.
- with_for_update(*, nowait: bool = False, read: bool = False, of: Optional[_ForUpdateOfArgument] = None, skip_locked: bool = False, key_share: bool = False) Self ο
Specify a FOR UPDATE clause for this _expression.GenerativeSelect.
E.g.:
stmt = select(table).with_for_update(nowait=True)
On a database like PostgreSQL or Oracle, the above would render a statement like:
SELECT table.a, table.b FROM table FOR UPDATE NOWAIT
On other backends, the nowait option is ignored and instead would produce:
SELECT table.a, table.b FROM table FOR UPDATE
When called with no arguments, the statement will render with the suffix FOR UPDATE. Additional arguments can then be provided which allow for common database-specific variants.
- Parameters
nowait β boolean; will render FOR UPDATE NOWAIT on Oracle and PostgreSQL dialects.
read β boolean; will render LOCK IN SHARE MODE on MySQL, FOR SHARE on PostgreSQL. On PostgreSQL, when combined with nowait, will render FOR SHARE NOWAIT.
of β SQL expression or list of SQL expression elements (typically _schema.Column objects or a compatible expression; for some backends may also be a table expression) which will render into a FOR UPDATE OF clause; supported by PostgreSQL, Oracle, some MySQL versions and possibly others. May render as a table or as a column depending on backend.
skip_locked β boolean; will render FOR UPDATE SKIP LOCKED on Oracle and PostgreSQL dialects or FOR SHARE SKIP LOCKED if read=True is also specified.
key_share β boolean; will render FOR NO KEY UPDATE, or if combined with read=True will render FOR KEY SHARE, on the PostgreSQL dialect.
- with_hint(selectable: _FromClauseArgument, text: str, dialect_name: str = '*') Self ο
Add an indexing or other executional context hint for the given selectable to this _expression.Select or other selectable object.
The text of the hint is rendered in the appropriate location for the database backend in use, relative to the given _schema.Table or _expression.Alias passed as the selectable argument. The dialect implementation typically uses Python string substitution syntax with the token %(name)s to render the name of the table or alias. E.g. when using Oracle, the following:
select(mytable).\
    with_hint(mytable, "index(%(name)s ix_mytable)")
Would render SQL as:
select /*+ index(mytable ix_mytable) */ ... from mytable
The dialect_name option will limit the rendering of a particular hint to a particular backend. Such as, to add hints for both Oracle and Sybase simultaneously:
select(mytable).\
    with_hint(mytable, "index(%(name)s ix_mytable)", 'oracle').\
    with_hint(mytable, "WITH INDEX ix_mytable", 'mssql')
See also
_expression.Select.with_statement_hint()
- with_only_columns(*entities: _ColumnsClauseArgument[Any], maintain_column_froms: bool = False, **_Select__kw: Any) Select[Any] ο
Return a new _expression.select() construct with its columns clause replaced with the given entities.
By default, this method is exactly equivalent to as if the original _expression.select() had been called with the given entities. E.g. a statement:
s = select(table1.c.a, table1.c.b)
s = s.with_only_columns(table1.c.b)
should be exactly equivalent to:
s = select(table1.c.b)
In this mode of operation, _sql.Select.with_only_columns() will also dynamically alter the FROM clause of the statement if it is not explicitly stated. To maintain the existing set of FROMs including those implied by the current columns clause, add the :paramref:`_sql.Select.with_only_columns.maintain_column_froms` parameter:
s = select(table1.c.a, table2.c.b)
s = s.with_only_columns(table1.c.a, maintain_column_froms=True)
The above parameter performs a transfer of the effective FROMs in the columns collection to the _sql.Select.select_from() method, as though the following were invoked:
s = select(table1.c.a, table2.c.b)
s = s.select_from(table1, table2).with_only_columns(table1.c.a)
The :paramref:`_sql.Select.with_only_columns.maintain_column_froms` parameter makes use of the _sql.Select.columns_clause_froms collection and performs an operation equivalent to the following:
s = select(table1.c.a, table2.c.b)
s = s.select_from(*s.columns_clause_froms).with_only_columns(table1.c.a)
- Parameters
*entities β column expressions to be used.
maintain_column_froms β boolean parameter that will ensure the FROM list implied from the current columns clause will be transferred to the _sql.Select.select_from() method first.
New in version 1.4.23.
- with_statement_hint(text: str, dialect_name: str = '*') Self ο
Add a statement hint to this _expression.Select or other selectable object.
This method is similar to _expression.Select.with_hint() except that it does not require an individual table, and instead applies to the statement as a whole.
Hints here are specific to the backend database and may include directives such as isolation levels, file directives, fetch directives, etc.
See also
_expression.Select.with_hint()
_expression.Select.prefix_with() - generic SELECT prefixing which also can suit some database-specific HINT syntaxes such as MySQL optimizer hints
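A minimal sketch of a statement-level hint (the hint text is a MySQL-style optimizer hint chosen purely for illustration; hint contents are backend-specific):

```python
from sqlalchemy import column, select, table

t = table("t", column("a"))

# The hint applies to the statement as a whole; with dialect_name left at
# its default of '*', the hint text is included for any backend.
stmt = select(t.c.a).with_statement_hint("/*+ MAX_EXECUTION_TIME(1000) */")
print(stmt)
```

To restrict the hint to one backend, pass e.g. `dialect_name="mysql"`.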
- async adelete(ref_doc_id: str, **delete_kwargs: Any) None ο
Delete nodes with the given ref_doc_id. NOTE: this is not implemented for all vector stores. If not implemented, it will just call delete synchronously.
- async aquery(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult ο
Asynchronously query vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call query synchronously.
- async async_add(nodes: List[BaseNode]) List[str] ο
Asynchronously add nodes to vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call add synchronously.
- classmethod class_name() str ο
Get class name.
- async close() None ο
- classmethod construct(_fields_set: Optional[SetStr] = None, **values: Any) Model ο
Creates a new model setting __dict__ and __fields_set__ from trusted or pre-validated data. Default values are respected, but no other validation is performed. Behaves as if Config.extra = "allow" was set, since it adds all passed values.
- copy(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, update: Optional[DictStrAny] = None, deep: bool = False) Model ο
Duplicate a model, optionally choose which fields to include, exclude and change.
- Parameters
include β fields to include in new model
exclude β fields to exclude from new model, as with values this takes precedence over include
update β values to change/add in the new model. Note: the data is not validated before creating the new model: you should trust this data
deep β set to True to make a deep copy of the model
- Returns
new model instance
- delete(ref_doc_id: str, **delete_kwargs: Any) None ο
Delete nodes with the given ref_doc_id.
- dict(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, by_alias: bool = False, skip_defaults: Optional[bool] = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False) DictStrAny ο
Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.
- classmethod from_dict(data: Dict[str, Any], **kwargs: Any) Self ο
- classmethod from_json(data_str: str, **kwargs: Any) Self ο
- classmethod from_orm(obj: Any) Model ο
- classmethod from_params(host: Optional[str] = None, port: Optional[str] = None, database: Optional[str] = None, user: Optional[str] = None, password: Optional[str] = None, table_name: str = 'llamaindex', connection_string: Optional[str] = None, async_connection_string: Optional[str] = None, hybrid_search: bool = False, text_search_config: str = 'english', embed_dim: int = 1536, debug: bool = False) PGVectorStore ο
Create a PGVectorStore, building the connection string from database parameters.
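A hedged sketch of using from_params, per the signature above. All connection values are placeholders, and a running PostgreSQL server with the pgvector extension is required; this will not run without one.

```python
from llama_index.vector_stores import PGVectorStore

# Placeholder connection details -- substitute your own.
vector_store = PGVectorStore.from_params(
    host="localhost",
    port="5432",
    database="vectordb",
    user="postgres",
    password="password",      # placeholder credential
    table_name="llamaindex",  # default table name per the signature
    embed_dim=1536,           # must match your embedding model's dimension
)
```

The resulting store then accepts nodes via add() and serves similarity lookups via query().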
- json(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, by_alias: bool = False, skip_defaults: Optional[bool] = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False, encoder: Optional[Callable[[Any], Any]] = None, models_as_dict: bool = True, **dumps_kwargs: Any) unicode ο
Generate a JSON representation of the model, include and exclude arguments as per dict().
encoder is an optional function to supply as default to json.dumps(), other arguments as per json.dumps().
- classmethod parse_file(path: Union[str, Path], *, content_type: unicode = None, encoding: unicode = 'utf8', proto: Protocol = None, allow_pickle: bool = False) Model ο
- classmethod parse_obj(obj: Any) Model ο
- classmethod parse_raw(b: Union[str, bytes], *, content_type: unicode = None, encoding: unicode = 'utf8', proto: Protocol = None, allow_pickle: bool = False) Model ο
- persist(persist_path: str, fs: Optional[AbstractFileSystem] = None) None ο
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult ο
Query vector store.
- classmethod schema(by_alias: bool = True, ref_template: unicode = '#/definitions/{model}') DictStrAny ο
- classmethod schema_json(*, by_alias: bool = True, ref_template: unicode = '#/definitions/{model}', **dumps_kwargs: Any) unicode ο
- to_dict(**kwargs: Any) Dict[str, Any] ο
- to_json(**kwargs: Any) str ο
- classmethod update_forward_refs(**localns: Any) None ο
Try to update ForwardRefs on fields based on this Model, globalns and localns.
- classmethod validate(value: Any) Model ο
- property client: Anyο
Get client.
- pydantic model llama_index.vector_stores.PineconeVectorStoreο
Pinecone Vector Store.
In this vector store, embeddings and docs are stored within a Pinecone index.
During query time, the index uses Pinecone to query for the top k most similar nodes.
- Parameters
pinecone_index (Optional[pinecone.Index]) β Pinecone index instance
insert_kwargs (Optional[Dict]) β insert kwargs during upsert call.
add_sparse_vector (bool) β whether to add sparse vector to index.
tokenizer (Optional[Callable]) β tokenizer to use to generate sparse vectors
Show JSON schema
{ "title": "PineconeVectorStore", "description": "Pinecone Vector Store.\n\nIn this vector store, embeddings and docs are stored within a\nPinecone index.\n\nDuring query time, the index uses Pinecone to query for the top\nk most similar nodes.\n\nArgs:\n pinecone_index (Optional[pinecone.Index]): Pinecone index instance\n insert_kwargs (Optional[Dict]): insert kwargs during `upsert` call.\n add_sparse_vector (bool): whether to add sparse vector to index.\n tokenizer (Optional[Callable]): tokenizer to use to generate sparse", "type": "object", "properties": { "stores_text": { "title": "Stores Text", "default": true, "type": "boolean" }, "is_embedding_query": { "title": "Is Embedding Query", "default": true, "type": "boolean" }, "flat_metadata": { "title": "Flat Metadata", "default": true, "type": "boolean" }, "api_key": { "title": "Api Key", "type": "string" }, "index_name": { "title": "Index Name", "type": "string" }, "environment": { "title": "Environment", "type": "string" }, "namespace": { "title": "Namespace", "type": "string" }, "insert_kwargs": { "title": "Insert Kwargs", "type": "object" }, "add_sparse_vector": { "title": "Add Sparse Vector", "type": "boolean" }, "text_key": { "title": "Text Key", "type": "string" }, "batch_size": { "title": "Batch Size", "type": "integer" } }, "required": [ "add_sparse_vector", "text_key", "batch_size" ] }
- Fields
add_sparse_vector (bool)
api_key (Optional[str])
batch_size (int)
environment (Optional[str])
flat_metadata (bool)
index_name (Optional[str])
insert_kwargs (Optional[Dict])
is_embedding_query (bool)
namespace (Optional[str])
stores_text (bool)
text_key (str)
- field add_sparse_vector: bool [Required]ο
- field api_key: Optional[str] = Noneο
- field batch_size: int [Required]ο
- field environment: Optional[str] = Noneο
- field flat_metadata: bool = Trueο
- field index_name: Optional[str] = Noneο
- field insert_kwargs: Optional[Dict] = Noneο
- field is_embedding_query: bool = Trueο
- field namespace: Optional[str] = Noneο
- field stores_text: bool = Trueο
- field text_key: str [Required]ο
- add(nodes: List[BaseNode]) List[str] ο
Add nodes to index.
- Args
nodes: List[BaseNode]: list of nodes with embeddings
- async adelete(ref_doc_id: str, **delete_kwargs: Any) None ο
Delete nodes with the given ref_doc_id. NOTE: this is not implemented for all vector stores. If not implemented, it will just call delete synchronously.
- async aquery(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult ο
Asynchronously query vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call query synchronously.
- async async_add(nodes: List[BaseNode]) List[str] ο
Asynchronously add nodes to vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call add synchronously.
- classmethod class_name() str ο
Get class name.
- classmethod construct(_fields_set: Optional[SetStr] = None, **values: Any) Model ο
Creates a new model setting __dict__ and __fields_set__ from trusted or pre-validated data. Default values are respected, but no other validation is performed. Behaves as if Config.extra = "allow" was set, since it adds all passed values.
- copy(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, update: Optional[DictStrAny] = None, deep: bool = False) Model ο
Duplicate a model, optionally choose which fields to include, exclude and change.
- Parameters
include β fields to include in new model
exclude β fields to exclude from new model, as with values this takes precedence over include
update β values to change/add in the new model. Note: the data is not validated before creating the new model: you should trust this data
deep β set to True to make a deep copy of the model
- Returns
new model instance
- delete(ref_doc_id: str, **delete_kwargs: Any) None ο
Delete nodes with the given ref_doc_id.
- Parameters
ref_doc_id (str) β The doc_id of the document to delete.
- dict(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, by_alias: bool = False, skip_defaults: Optional[bool] = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False) DictStrAny ο
Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.
- classmethod from_dict(data: Dict[str, Any], **kwargs: Any) Self ο
- classmethod from_json(data_str: str, **kwargs: Any) Self ο
- classmethod from_orm(obj: Any) Model ο
- classmethod from_params(api_key: Optional[str] = None, index_name: Optional[str] = None, environment: Optional[str] = None, namespace: Optional[str] = None, insert_kwargs: Optional[Dict] = None, add_sparse_vector: bool = False, tokenizer: Optional[Callable] = None, text_key: str = 'text', batch_size: int = 100, **kwargs: Any) PineconeVectorStore ο
- json(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, by_alias: bool = False, skip_defaults: Optional[bool] = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False, encoder: Optional[Callable[[Any], Any]] = None, models_as_dict: bool = True, **dumps_kwargs: Any) unicode ο
Generate a JSON representation of the model, include and exclude arguments as per dict().
encoder is an optional function to supply as default to json.dumps(), other arguments as per json.dumps().
- classmethod parse_file(path: Union[str, Path], *, content_type: unicode = None, encoding: unicode = 'utf8', proto: Protocol = None, allow_pickle: bool = False) Model ο
- classmethod parse_obj(obj: Any) Model ο
- classmethod parse_raw(b: Union[str, bytes], *, content_type: unicode = None, encoding: unicode = 'utf8', proto: Protocol = None, allow_pickle: bool = False) Model ο
- persist(persist_path: str, fs: Optional[AbstractFileSystem] = None) None ο
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult ο
Query index for top k most similar nodes.
- Parameters
query_embedding (List[float]) β query embedding
similarity_top_k (int) β top k most similar nodes
- classmethod schema(by_alias: bool = True, ref_template: unicode = '#/definitions/{model}') DictStrAny ο
- classmethod schema_json(*, by_alias: bool = True, ref_template: unicode = '#/definitions/{model}', **dumps_kwargs: Any) unicode ο
- to_dict(**kwargs: Any) Dict[str, Any] ο
- to_json(**kwargs: Any) str ο
- classmethod update_forward_refs(**localns: Any) None ο
Try to update ForwardRefs on fields based on this Model, globalns and localns.
- classmethod validate(value: Any) Model ο
- property client: Anyο
Return Pinecone client.
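A hedged sketch of constructing the store via from_params, per the signature above. The API key, index name, environment, and namespace are placeholders; a real Pinecone account and an existing index are required for this to connect.

```python
from llama_index.vector_stores import PineconeVectorStore

# Placeholder credentials and index details -- substitute your own.
vector_store = PineconeVectorStore.from_params(
    api_key="YOUR_API_KEY",
    index_name="quickstart",
    environment="us-west1-gcp",
    namespace="my-namespace",  # optional partition within the index
    batch_size=100,            # default per the signature above
)
```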
- pydantic model llama_index.vector_stores.QdrantVectorStoreο
Qdrant Vector Store.
In this vector store, embeddings and docs are stored within a Qdrant collection.
During query time, the index uses Qdrant to query for the top k most similar nodes.
- Parameters
collection_name (str) β name of the Qdrant collection
client (Optional[Any]) β QdrantClient instance from qdrant-client package
Show JSON schema
{ "title": "QdrantVectorStore", "description": "Qdrant Vector Store.\n\nIn this vector store, embeddings and docs are stored within a\nQdrant collection.\n\nDuring query time, the index uses Qdrant to query for the top\nk most similar nodes.\n\nArgs:\n collection_name: (str): name of the Qdrant collection\n client (Optional[Any]): QdrantClient instance from `qdrant-client` package", "type": "object", "properties": { "stores_text": { "title": "Stores Text", "default": true, "type": "boolean" }, "is_embedding_query": { "title": "Is Embedding Query", "default": true, "type": "boolean" }, "flat_metadata": { "title": "Flat Metadata", "default": false, "type": "boolean" }, "collection_name": { "title": "Collection Name", "type": "string" }, "url": { "title": "Url", "type": "string" }, "api_key": { "title": "Api Key", "type": "string" }, "batch_size": { "title": "Batch Size", "type": "integer" }, "client_kwargs": { "title": "Client Kwargs", "type": "object" } }, "required": [ "collection_name", "batch_size" ] }
- Fields
api_key (Optional[str])
batch_size (int)
client_kwargs (dict)
collection_name (str)
flat_metadata (bool)
is_embedding_query (bool)
stores_text (bool)
url (Optional[str])
- field api_key: Optional[str] = Noneο
- field batch_size: int [Required]ο
- field client_kwargs: dict [Optional]ο
- field collection_name: str [Required]ο
- field flat_metadata: bool = Falseο
- field is_embedding_query: bool = Trueο
- field stores_text: bool = Trueο
- field url: Optional[str] = Noneο
- add(nodes: List[BaseNode]) List[str] ο
Add nodes to index.
- Args
nodes: List[BaseNode]: list of nodes with embeddings
- async adelete(ref_doc_id: str, **delete_kwargs: Any) None ο
Delete nodes with the given ref_doc_id. NOTE: this is not implemented for all vector stores. If not implemented, it will just call delete synchronously.
- async aquery(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult ο
Asynchronously query vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call query synchronously.
- async async_add(nodes: List[BaseNode]) List[str] ο
Asynchronously add nodes to vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call add synchronously.
- classmethod class_name() str ο
Get class name.
- classmethod construct(_fields_set: Optional[SetStr] = None, **values: Any) Model ο
Creates a new model setting __dict__ and __fields_set__ from trusted or pre-validated data. Default values are respected, but no other validation is performed. Behaves as if Config.extra = "allow" was set, since it adds all passed values.
- copy(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, update: Optional[DictStrAny] = None, deep: bool = False) Model ο
Duplicate a model, optionally choose which fields to include, exclude and change.
- Parameters
include β fields to include in new model
exclude β fields to exclude from new model, as with values this takes precedence over include
update β values to change/add in the new model. Note: the data is not validated before creating the new model: you should trust this data
deep β set to True to make a deep copy of the model
- Returns
new model instance
- delete(ref_doc_id: str, **delete_kwargs: Any) None ο
Delete nodes with the given ref_doc_id.
- Parameters
ref_doc_id (str) β The doc_id of the document to delete.
- dict(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, by_alias: bool = False, skip_defaults: Optional[bool] = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False) DictStrAny ο
Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.
- classmethod from_dict(data: Dict[str, Any], **kwargs: Any) Self ο
- classmethod from_json(data_str: str, **kwargs: Any) Self ο
- classmethod from_orm(obj: Any) Model ο
- classmethod from_params(collection_name: str, url: Optional[str] = None, api_key: Optional[str] = None, client_kwargs: Optional[dict] = None, batch_size: int = 100, **kwargs: Any) QdrantVectorStore ο
Create a connection to a remote Qdrant vector store from a config.
- json(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, by_alias: bool = False, skip_defaults: Optional[bool] = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False, encoder: Optional[Callable[[Any], Any]] = None, models_as_dict: bool = True, **dumps_kwargs: Any) unicode ο
Generate a JSON representation of the model, include and exclude arguments as per dict().
encoder is an optional function to supply as default to json.dumps(), other arguments as per json.dumps().
- classmethod parse_file(path: Union[str, Path], *, content_type: unicode = None, encoding: unicode = 'utf8', proto: Protocol = None, allow_pickle: bool = False) Model ο
- classmethod parse_obj(obj: Any) Model ο
- classmethod parse_raw(b: Union[str, bytes], *, content_type: unicode = None, encoding: unicode = 'utf8', proto: Protocol = None, allow_pickle: bool = False) Model ο
- persist(persist_path: str, fs: Optional[AbstractFileSystem] = None) None ο
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult ο
Query index for top k most similar nodes.
- Parameters
query (VectorStoreQuery) β query
- classmethod schema(by_alias: bool = True, ref_template: unicode = '#/definitions/{model}') DictStrAny ο
- classmethod schema_json(*, by_alias: bool = True, ref_template: unicode = '#/definitions/{model}', **dumps_kwargs: Any) unicode ο
- to_dict(**kwargs: Any) Dict[str, Any] ο
- to_json(**kwargs: Any) str ο
- classmethod update_forward_refs(**localns: Any) None ο
Try to update ForwardRefs on fields based on this Model, globalns and localns.
- classmethod validate(value: Any) Model ο
- property client: Anyο
Return the Qdrant client.
- class llama_index.vector_stores.RedisVectorStore(index_name: str, index_prefix: str = 'llama_index', prefix_ending: str = '/vector', index_args: Optional[Dict[str, Any]] = None, metadata_fields: Optional[List[str]] = None, redis_url: str = 'redis://localhost:6379', overwrite: bool = False, **kwargs: Any)ο
- add(nodes: List[BaseNode]) List[str] ο
Add nodes to the index.
- Parameters
nodes (List[BaseNode]) – List of nodes with embeddings
- Returns
List of ids of the documents added to the index.
- Return type
List[str]
- Raises
ValueError – If the index already exists and overwrite is False.
- async adelete(ref_doc_id: str, **delete_kwargs: Any) None ο
Delete nodes with the given ref_doc_id. NOTE: this is not implemented for all vector stores. If not implemented, it will just call delete synchronously.
- async aquery(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult ο
Asynchronously query vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call query synchronously.
- async async_add(nodes: List[BaseNode]) List[str] ο
Asynchronously add nodes with embedding to vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call add synchronously.
- property client: RedisTypeο
Return the Redis client instance.
- delete(ref_doc_id: str, **delete_kwargs: Any) None ο
Delete nodes with the given ref_doc_id.
- Parameters
ref_doc_id (str) – The doc_id of the document to delete.
- delete_index() None ο
Delete the index and all documents.
- persist(persist_path: str, fs: Optional[AbstractFileSystem] = None, in_background: bool = True) None ο
Persist the vector store to disk.
- Parameters
persist_path (str) – Path to persist the vector store to. (doesn't apply)
in_background (bool, optional) – Persist in background. Defaults to True.
fs (fsspec.AbstractFileSystem, optional) – Filesystem to persist to. (doesn't apply)
- Raises
redis.exceptions.RedisError – If there is an error persisting the index to disk.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult ο
Query the index.
- Parameters
query (VectorStoreQuery) – query object
- Returns
query result
- Return type
VectorStoreQueryResult
- Raises
ValueError – If query.query_embedding is None.
redis.exceptions.RedisError – If there is an error querying the index.
redis.exceptions.TimeoutError – If there is a timeout querying the index.
ValueError – If no documents are found when querying the index.
- class llama_index.vector_stores.RocksetVectorStore(collection: str, client: Optional[Any] = None, text_key: str = 'text', embedding_col: str = 'embedding', metadata_col: str = 'metadata', workspace: str = 'commons', api_server: Optional[str] = None, api_key: Optional[str] = None, distance_func: DistanceFunc = DistanceFunc.COSINE_SIM)ο
- class DistanceFunc(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)ο
- add(nodes: List[BaseNode]) List[str] ο
Stores vectors in the collection.
- Parameters
nodes (List[BaseNode]) – List of nodes with embeddings
- Returns
Stored node IDs (List[str])
- async adelete(ref_doc_id: str, **delete_kwargs: Any) None ο
Delete nodes with the given ref_doc_id. NOTE: this is not implemented for all vector stores. If not implemented, it will just call delete synchronously.
- async aquery(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult ο
Asynchronously query vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call query synchronously.
- async async_add(nodes: List[BaseNode]) List[str] ο
Asynchronously add nodes with embedding to vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call add synchronously.
- property client: Anyο
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None ο
Deletes nodes stored in the collection by their ref_doc_id.
- Parameters
ref_doc_id (str) – The ref_doc_id of the document whose nodes are to be deleted
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult ο
Gets nodes relevant to a query.
- Parameters
query (llama_index.vector_stores.types.VectorStoreQuery) – The query
similarity_col (Optional[str]) – The column to select the cosine similarity as (default: "_similarity")
- Returns
query results (llama_index.vector_stores.types.VectorStoreQueryResult)
- classmethod with_new_collection(dimensions: Optional[int] = None, **rockset_vector_store_args: Any) RocksetVectorStore ο
Creates a new collection and returns its RocksetVectorStore.
- Parameters
dimensions (Optional[int]) – The length of the vectors to enforce in the collection's ingest transformation. By default, the collection will do no vector enforcement.
collection (str) – The name of the collection to be created
client (Optional[Any]) – Rockset client object
workspace (str) – The workspace containing the collection to be created (default: "commons")
text_key (str) – The key to the text of nodes (default: llama_index.vector_stores.utils.DEFAULT_TEXT_KEY)
embedding_col (str) – The DB column containing embeddings (default: llama_index.vector_stores.utils.DEFAULT_EMBEDDING_KEY)
metadata_col (str) – The DB column containing node metadata (default: "metadata")
api_server (Optional[str]) – The Rockset API server to use
api_key (Optional[str]) – The Rockset API key to use
distance_func (RocksetVectorStore.DistanceFunc) – The metric to measure vector relationship (default: RocksetVectorStore.DistanceFunc.COSINE_SIM)
- class llama_index.vector_stores.SimpleVectorStore(data: Optional[SimpleVectorStoreData] = None, fs: Optional[AbstractFileSystem] = None, **kwargs: Any)ο
Simple Vector Store.
In this vector store, embeddings are stored within a simple, in-memory dictionary.
- Parameters
simple_vector_store_data_dict (Optional[dict]) – data dict containing the embeddings and doc_ids. See SimpleVectorStoreData for more details.
- async adelete(ref_doc_id: str, **delete_kwargs: Any) None ο
Delete nodes with the given ref_doc_id. NOTE: this is not implemented for all vector stores. If not implemented, it will just call delete synchronously.
- async aquery(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult ο
Asynchronously query vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call query synchronously.
- async async_add(nodes: List[BaseNode]) List[str] ο
Asynchronously add nodes with embedding to vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call add synchronously.
- property client: Noneο
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None ο
Delete nodes with the given ref_doc_id.
- Parameters
ref_doc_id (str) – The doc_id of the document to delete.
- classmethod from_persist_dir(persist_dir: str = './storage', fs: Optional[AbstractFileSystem] = None) SimpleVectorStore ο
Load from persist dir.
- classmethod from_persist_path(persist_path: str, fs: Optional[AbstractFileSystem] = None) SimpleVectorStore ο
Create a SimpleVectorStore from a persist path.
- get(text_id: str) List[float] ο
Get embedding.
- persist(persist_path: str = './storage/vector_store.json', fs: Optional[AbstractFileSystem] = None) None ο
Persist the SimpleVectorStore to a file path.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult ο
Get nodes for response.
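The behavior described above can be sketched with a stdlib-only stand-in: embeddings live in a plain dictionary keyed by node id, and query returns the top k ids by cosine similarity. The class and method bodies below are illustrative, not the actual SimpleVectorStore implementation:

```python
import math

class InMemoryVectorStore:
    """Illustrative stand-in: embeddings in a plain dict, keyed by node id."""

    def __init__(self):
        self._data = {}  # node_id -> embedding

    def add(self, embeddings):
        # embeddings: dict of node_id -> list[float]; returns added ids.
        self._data.update(embeddings)
        return list(embeddings)

    def get(self, text_id):
        # Mirrors get(text_id) above: look up a stored embedding.
        return self._data[text_id]

    def query(self, query_embedding, similarity_top_k=1):
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            return dot / (math.hypot(*a) * math.hypot(*b))
        ranked = sorted(
            self._data.items(),
            key=lambda item: cosine(query_embedding, item[1]),
            reverse=True,
        )
        return [node_id for node_id, _ in ranked[:similarity_top_k]]

store = InMemoryVectorStore()
store.add({"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.7, 0.7]})
print(store.query([1.0, 0.1], similarity_top_k=2))  # "a" ranks first
```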
- class llama_index.vector_stores.SupabaseVectorStore(postgres_connection_string: str, collection_name: str, dimension: int = 1536, **kwargs: Any)ο
Supabase Vector.
In this vector store, embeddings are stored in a Postgres table using pgvector.
During query time, the index uses pgvector/Supabase to query for the top k most similar nodes.
- Parameters
postgres_connection_string (str) – postgres connection string
collection_name (str) – name of the collection to store the embeddings in
- add(nodes: List[BaseNode]) List[str] ο
Add nodes to index.
- Parameters
nodes – List[BaseNode]: list of nodes with embeddings
- async adelete(ref_doc_id: str, **delete_kwargs: Any) None ο
Delete nodes with the given ref_doc_id. NOTE: this is not implemented for all vector stores. If not implemented, it will just call delete synchronously.
- async aquery(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult ο
Asynchronously query vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call query synchronously.
- async async_add(nodes: List[BaseNode]) List[str] ο
Asynchronously add nodes with embedding to vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call add synchronously.
- property client: Noneο
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None ο
Delete doc.
- Parameters
ref_doc_id (str) – document id
- get_by_id(doc_id: str) list ο
Get row ids by doc id.
- Parameters
doc_id (str) – document id
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult ο
Query index for top k most similar nodes.
- Parameters
query (VectorStoreQuery) – query
- class llama_index.vector_stores.TairVectorStore(tair_url: str, index_name: str, index_type: str = 'HNSW', index_args: Optional[Dict[str, Any]] = None, overwrite: bool = False, **kwargs: Any)ο
- add(nodes: List[BaseNode]) List[str] ο
Add nodes to the index.
- Parameters
nodes (List[BaseNode]) – List of nodes with embeddings
- Returns
List of ids of the documents added to the index.
- Return type
List[str]
- async adelete(ref_doc_id: str, **delete_kwargs: Any) None ο
Delete nodes with the given ref_doc_id. NOTE: this is not implemented for all vector stores. If not implemented, it will just call delete synchronously.
- async aquery(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult ο
Asynchronously query vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call query synchronously.
- async async_add(nodes: List[BaseNode]) List[str] ο
Asynchronously add nodes with embedding to vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call add synchronously.
- property client: Tairο
Return the Tair client instance.
- delete(ref_doc_id: str, **delete_kwargs: Any) None ο
Delete a document.
- Parameters
ref_doc_id (str) – document id
- delete_index() None ο
Delete the index and all documents.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult ο
Query the index.
- Parameters
query (VectorStoreQuery) – query object
- Returns
query result
- Return type
VectorStoreQueryResult
- Raises
ValueError – If query.query_embedding is None.
- class llama_index.vector_stores.TimescaleVectorStore(service_url: str, table_name: str, num_dimensions: int = 1536, time_partition_interval: Optional[timedelta] = None)ο
- async adelete(ref_doc_id: str, **delete_kwargs: Any) None ο
Delete nodes with the given ref_doc_id. NOTE: this is not implemented for all vector stores. If not implemented, it will just call delete synchronously.
- async aquery(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult ο
Asynchronously query vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call query synchronously.
- async async_add(embedding_results: List[BaseNode]) List[str] ο
Asynchronously add nodes with embedding to vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call add synchronously.
- property client: Anyο
Get client.
- delete(ref_doc_id: str, **delete_kwargs: Any) None ο
Delete nodes with the given ref_doc_id.
- query(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult ο
Query vector store.
- class llama_index.vector_stores.VectorStoreQuery(query_embedding: Optional[List[float]] = None, similarity_top_k: int = 1, doc_ids: Optional[List[str]] = None, node_ids: Optional[List[str]] = None, query_str: Optional[str] = None, output_fields: Optional[List[str]] = None, embedding_field: Optional[str] = None, mode: VectorStoreQueryMode = VectorStoreQueryMode.DEFAULT, alpha: Optional[float] = None, filters: Optional[MetadataFilters] = None, mmr_threshold: Optional[float] = None, sparse_top_k: Optional[int] = None)ο
Vector store query.
- class llama_index.vector_stores.VectorStoreQueryResult(nodes: Optional[Sequence[BaseNode]] = None, similarities: Optional[List[float]] = None, ids: Optional[List[str]] = None)ο
Vector store query result.
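The query/result pair above is the interface contract every store in this module implements. A hedged sketch of the same shape using stdlib dataclasses (local stand-ins, not the actual llama_index classes, and only a subset of the documented fields):

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence

@dataclass
class VectorStoreQuery:
    # Subset of the fields documented above; defaults match the signature.
    query_embedding: Optional[List[float]] = None
    similarity_top_k: int = 1
    doc_ids: Optional[List[str]] = None
    query_str: Optional[str] = None

@dataclass
class VectorStoreQueryResult:
    # Parallel lists: nodes, their similarity scores, and their ids.
    nodes: Optional[Sequence[object]] = None
    similarities: Optional[List[float]] = None
    ids: Optional[List[str]] = None

query = VectorStoreQuery(query_embedding=[0.1, 0.2], similarity_top_k=3)
result = VectorStoreQueryResult(ids=["n1"], similarities=[0.92])
```

A store's `query()` consumes the first shape and produces the second; the optional fields let callers pass only what a given backend supports.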
- pydantic model llama_index.vector_stores.WeaviateVectorStoreο
Weaviate vector store.
In this vector store, embeddings and docs are stored within a Weaviate collection.
During query time, the index uses Weaviate to query for the top k most similar nodes.
- Parameters
weaviate_client (weaviate.Client) – WeaviateClient instance from weaviate-client package
index_name (Optional[str]) – name for Weaviate classes
Show JSON schema
{ "title": "WeaviateVectorStore", "description": "Weaviate vector store.\n\nIn this vector store, embeddings and docs are stored within a\nWeaviate collection.\n\nDuring query time, the index uses Weaviate to query for the top\nk most similar nodes.\n\nArgs:\n weaviate_client (weaviate.Client): WeaviateClient\n instance from `weaviate-client` package\n index_name (Optional[str]): name for Weaviate classes", "type": "object", "properties": { "stores_text": { "title": "Stores Text", "default": true, "type": "boolean" }, "is_embedding_query": { "title": "Is Embedding Query", "default": true, "type": "boolean" }, "index_name": { "title": "Index Name", "type": "string" }, "url": { "title": "Url", "type": "string" }, "text_key": { "title": "Text Key", "type": "string" }, "auth_config": { "title": "Auth Config", "type": "object" }, "client_kwargs": { "title": "Client Kwargs", "type": "object" } }, "required": [ "index_name", "text_key" ] }
- Fields
auth_config (Dict[str, Any])
client_kwargs (Dict[str, Any])
index_name (str)
is_embedding_query (bool)
stores_text (bool)
text_key (str)
url (Optional[str])
- field auth_config: Dict[str, Any] [Optional]ο
- field client_kwargs: Dict[str, Any] [Optional]ο
- field index_name: str [Required]ο
- field is_embedding_query: bool = Trueο
- field stores_text: bool = Trueο
- field text_key: str [Required]ο
- field url: Optional[str] = Noneο
- add(nodes: List[BaseNode]) List[str] ο
Add nodes to index.
- Parameters
nodes – List[BaseNode]: list of nodes with embeddings
- async adelete(ref_doc_id: str, **delete_kwargs: Any) None ο
Delete nodes with the given ref_doc_id. NOTE: this is not implemented for all vector stores. If not implemented, it will just call delete synchronously.
- async aquery(query: VectorStoreQuery, **kwargs: Any) VectorStoreQueryResult ο
Asynchronously query vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call query synchronously.
- async async_add(nodes: List[BaseNode]) List[str] ο
Asynchronously add nodes to vector store. NOTE: this is not implemented for all vector stores. If not implemented, it will just call add synchronously.
- classmethod class_name() str ο
Get class name.
- classmethod construct(_fields_set: Optional[SetStr] = None, **values: Any) Model ο
Creates a new model setting __dict__ and __fields_set__ from trusted or pre-validated data. Default values are respected, but no other validation is performed. Behaves as if Config.extra = 'allow' was set, since it adds all passed values.
- copy(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, update: Optional[DictStrAny] = None, deep: bool = False) Model ο
Duplicate a model, optionally choose which fields to include, exclude and change.
- Parameters
include – fields to include in new model
exclude – fields to exclude from new model, as with values this takes precedence over include
update – values to change/add in the new model. Note: the data is not validated before creating the new model: you should trust this data
deep – set to True to make a deep copy of the model
- Returns
new model instance
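The copy() semantics above (update applied without re-validation, deep=True duplicating nested data) can be approximated for illustration with stdlib tools; the `StoreConfig` dataclass here is a hypothetical stand-in, not a pydantic model:

```python
import copy
from dataclasses import dataclass, replace

@dataclass
class StoreConfig:
    index_name: str
    client_kwargs: dict

original = StoreConfig(index_name="idx", client_kwargs={"timeout": 30})

# Shallow copy with a changed field -- like model.copy(update={...}):
# the new value is not validated, and nested objects are shared.
updated = replace(original, index_name="idx2")

# Deep copy -- like model.copy(deep=True): nested dicts are duplicated,
# so mutating the clone leaves the original untouched.
cloned = copy.deepcopy(original)
cloned.client_kwargs["timeout"] = 5
print(original.client_kwargs["timeout"])  # unchanged: 30
```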
- delete(ref_doc_id: str, **delete_kwargs: Any) None ο
Delete nodes with the given ref_doc_id.
- Parameters
ref_doc_id (str) – The doc_id of the document to delete.
- dict(*, include: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, exclude: Optional[Union[AbstractSetIntStr, MappingIntStrAny]] = None, by_alias: bool = False, skip_defaults: Optional[bool] = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False) DictStrAny ο
Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.
- classmethod from_dict(data: Dict[str, Any], **kwargs: Any) Self ο
- classmethod from_json(data_str: str, **kwargs: Any)