Embeddings
Users have a few options to choose from when it comes to embeddings.
- OpenAIEmbedding: the default embedding class. Defaults to "text-embedding-ada-002".
- HuggingFaceEmbedding: a generic wrapper around HuggingFace's transformers models.
- OptimumEmbedding: support for usage and creation of ONNX models from Optimum and HuggingFace.
- InstructorEmbedding: a wrapper around Instructor embedding models.
- LangchainEmbedding: a wrapper around Langchain's embedding models.
- GoogleUnivSentEncoderEmbedding: a wrapper around Google's Universal Sentence Encoder.
- AdapterEmbeddingModel: an adapter around any embedding model.
OpenAIEmbedding
- pydantic model llama_index.embeddings.openai.OpenAIEmbedding
OpenAI class for embeddings.
- Parameters
mode (str): Mode for embedding. Defaults to OpenAIEmbeddingMode.TEXT_SEARCH_MODE. Options are:
- OpenAIEmbeddingMode.SIMILARITY_MODE
- OpenAIEmbeddingMode.TEXT_SEARCH_MODE
model (str): Model for embedding. Defaults to OpenAIEmbeddingModelType.TEXT_EMBED_ADA_002. Options are:
- OpenAIEmbeddingModelType.DAVINCI
- OpenAIEmbeddingModelType.CURIE
- OpenAIEmbeddingModelType.BABBAGE
- OpenAIEmbeddingModelType.ADA
- OpenAIEmbeddingModelType.TEXT_EMBED_ADA_002
deployment_name (Optional[str]): Optional deployment of the model. Defaults to None. If this value is not None, mode and model are ignored. Only available when using AzureOpenAI.
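The precedence rule for deployment_name can be pictured with a small sketch. This is not the library's code: `resolve_engine`, the two string constants, and the fallback format are hypothetical and exist only to illustrate that an explicit deployment overrides mode and model.

```python
from typing import Optional

# Hypothetical stand-ins for the enum members named above; the real string
# values are not given in this section.
TEXT_SEARCH_MODE = "text_search"
TEXT_EMBED_ADA_002 = "text-embedding-ada-002"


def resolve_engine(
    mode: str = TEXT_SEARCH_MODE,
    model: str = TEXT_EMBED_ADA_002,
    deployment_name: Optional[str] = None,
) -> str:
    """Return the identifier that would be sent to the API (illustrative only)."""
    if deployment_name is not None:
        # An explicit Azure deployment wins; mode and model are ignored.
        return deployment_name
    # Assumption: without a deployment, mode and model together select the
    # engine. The "model (mode)" format is made up for this sketch.
    return f"{model} ({mode})"


print(resolve_engine(deployment_name="my-azure-deployment"))
print(resolve_engine())
```

The point of the sketch is only the branch order: the deployment check comes first, so whatever is passed for mode and model has no effect once deployment_name is set.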
JSON schema:
{ "title": "OpenAIEmbedding", "description": "OpenAI class for embeddings.\n\nArgs:\n mode (str): Mode for embedding.\n Defaults to OpenAIEmbeddingMode.TEXT_SEARCH_MODE.\n Options are:\n\n - OpenAIEmbeddingMode.SIMILARITY_MODE\n - OpenAIEmbeddingMode.TEXT_SEARCH_MODE\n\n model (str): Model for embedding.\n Defaults to OpenAIEmbeddingModelType.TEXT_EMBED_ADA_002.\n Options are:\n\n - OpenAIEmbeddingModelType.DAVINCI\n - OpenAIEmbeddingModelType.CURIE\n - OpenAIEmbeddingModelType.BABBAGE\n - OpenAIEmbeddingModelType.ADA\n - OpenAIEmbeddingModelType.TEXT_EMBED_ADA_002\n\n deployment_name (Optional[str]): Optional deployment of model. Defaults to None.\n If this value is not None, mode and model will be ignored.\n Only available for using AzureOpenAI.", "type": "object", "properties": { "model_name": { "title": "Model Name", "description": "The name of the embedding model.", "default": "unknown", "type": "string" }, "embed_batch_size": { "title": "Embed Batch Size", "description": "The batch size for embedding calls.", "default": 10, "type": "integer" }, "callback_manager": { "title": "Callback Manager" }, "deployment_name": { "title": "Deployment Name", "type": "string" }, "additional_kwargs": { "title": "Additional Kwargs", "description": "Additional kwargs for the OpenAI API.", "type": "object" }, "api_key": { "title": "Api Key", "description": "The OpenAI API key.", "type": "string" }, "api_type": { "title": "Api Type", "description": "The OpenAI API type.", "type": "string" }, "api_base": { "title": "Api Base", "description": "The base URL for OpenAI API.", "type": "string" }, "api_version": { "title": "Api Version", "description": "The API version for OpenAI API.", "type": "string" } }, "required": [ "api_base", "api_version" ] }
- Config
arbitrary_types_allowed: bool = True
- Fields
- Validators
_validate_callback_manager » callback_manager
- field additional_kwargs: Dict[str, Any] [Optional]
Additional kwargs for the OpenAI API.
- field api_base: str [Required]
The base URL for OpenAI API.
- field api_key: str = None
The OpenAI API key.
- field api_type: str = None
The OpenAI API type.
- field api_version: str [Required]
The API version for OpenAI API.
- field deployment_name: Optional[str] = None
- classmethod class_name() -> str
Get class name.
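The schema for this class lists embed_batch_size ("The batch size for embedding calls", default 10). As a rough illustration of what such a setting usually controls, the sketch below splits a list of texts into batches of at most that size. The helper name `iter_batches` is hypothetical, and how the library actually batches its API calls is not described in this section.

```python
from typing import Iterator, List, Sequence


def iter_batches(
    texts: Sequence[str], embed_batch_size: int = 10
) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts`, each at most `embed_batch_size` long."""
    if embed_batch_size <= 0:
        raise ValueError("embed_batch_size must be positive")
    for start in range(0, len(texts), embed_batch_size):
        yield list(texts[start:start + embed_batch_size])


texts = [f"document {i}" for i in range(25)]
batch_sizes = [len(batch) for batch in iter_batches(texts)]
print(batch_sizes)  # [10, 10, 5]
```

With 25 texts and the default size of 10, three embedding calls would be needed under this scheme; a larger batch size trades fewer calls for larger requests.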
HuggingFaceEmbedding
- pydantic model llama_index.embeddings.huggingface.HuggingFaceEmbedding
JSON schema:
{ "title": "HuggingFaceEmbedding", "description": "Base class for embeddings.", "type": "object", "properties": { "model_name": { "title": "Model Name", "description": "The name of the embedding model.", "default": "unknown", "type": "string" }, "embed_batch_size": { "title": "Embed Batch Size", "description": "The batch size for embedding calls.", "default": 10, "type": "integer" }, "callback_manager": { "title": "Callback Manager" }, "tokenizer_name": { "title": "Tokenizer Name", "description": "Tokenizer name from HuggingFace.", "type": "string" }, "max_length": { "title": "Max Length", "description": "Maximum length of input.", "type": "integer" }, "pooling": { "title": "Pooling", "description": "Pooling strategy. One of ['cls', 'mean'].", "type": "string" }, "query_instruction": { "title": "Query Instruction", "description": "Instruction to prepend to query text.", "type": "string" }, "text_instruction": { "title": "Text Instruction", "description": "Instruction to prepend to text.", "type": "string" }, "cache_folder": { "title": "Cache Folder", "description": "Cache folder for huggingface files.", "type": "string" } }, "required": [ "tokenizer_name", "max_length", "pooling" ] }
- Config
arbitrary_types_allowed: bool = True
- Fields
- Validators
_validate_callback_manager » callback_manager
- field cache_folder: Optional[str] = None
Cache folder for huggingface files.
- field max_length: int [Required]
Maximum length of input.
- field pooling: str [Required]
Pooling strategy. One of ['cls', 'mean'].
- field query_instruction: Optional[str] = None
Instruction to prepend to query text.
- field text_instruction: Optional[str] = None
Instruction to prepend to text.
- field tokenizer_name: str [Required]
Tokenizer name from HuggingFace.
- classmethod class_name() -> str
Get class name.
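The pooling field accepts 'cls' or 'mean'. The following sketch shows what these two strategies conventionally mean for transformer outputs, using plain Python lists instead of tensors. It is an illustration under stated assumptions, not the class's implementation: the function names are hypothetical, and the use of an attention mask to skip padding in mean pooling is a common convention rather than something this section confirms.

```python
from typing import List

Vector = List[float]


def cls_pool(token_vectors: List[Vector]) -> Vector:
    """'cls' pooling: use the first token's vector ([CLS] in BERT-style models)."""
    return list(token_vectors[0])


def mean_pool(token_vectors: List[Vector], attention_mask: List[int]) -> Vector:
    """'mean' pooling: average the vectors of the tokens the mask keeps."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Four token vectors of dimension 2; the last one is padding (mask = 0).
token_vectors = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [9.0, 9.0]]
attention_mask = [1, 1, 1, 0]

print(cls_pool(token_vectors))                   # [1.0, 0.0]
print(mean_pool(token_vectors, attention_mask))  # [3.0, 2.0]
```

Which strategy works better generally depends on how the underlying model was trained, so it is worth checking the model card before choosing a value for pooling.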
OptimumEmbedding
- pydantic model llama_index.embeddings.huggingface_optimum.OptimumEmbedding
JSON schema:
{ "title": "OptimumEmbedding", "description": "Base class for embeddings.", "type": "object", "properties": { "model_name": { "title": "Model Name", "description": "The name of the embedding model.", "default": "unknown", "type": "string" }, "embed_batch_size": { "title": "Embed Batch Size", "description": "The batch size for embedding calls.", "default": 10, "type": "integer" }, "callback_manager": { "title": "Callback Manager" }, "folder_name": { "title": "Folder Name", "description": "Folder name to load from.", "type": "string" }, "max_length": { "title": "Max Length", "description": "Maximum length of input.", "type": "integer" }, "pooling": { "title": "Pooling", "description": "Pooling strategy. One of ['cls', 'mean'].", "type": "string" }, "query_instruction": { "title": "Query Instruction", "description": "Instruction to prepend to query text.", "type": "string" }, "text_instruction": { "title": "Text Instruction", "description": "Instruction to prepend to text.", "type": "string" }, "cache_folder": { "title": "Cache Folder", "description": "Cache folder for huggingface files.", "type": "string" } }, "required": [ "folder_name", "max_length", "pooling" ] }
- Config
arbitrary_types_allowed: bool = True
- Fields
- Validators
_validate_callback_manager » callback_manager
- field cache_folder: Optional[str] = None
Cache folder for huggingface files.
- field folder_name: str [Required]
Folder name to load from.
- field max_length: int [Required]
Maximum length of input.
- field pooling: str [Required]
Pooling strategy. One of ['cls', 'mean'].
- field query_instruction: Optional[str] = None
Instruction to prepend to query text.
- field text_instruction: Optional[str] = None
Instruction to prepend to text.
- classmethod class_name() -> str
Get class name.
- classmethod create_and_save_optimum_model(model_name_or_path: str, output_path: str, export_kwargs: Optional[dict] = None) -> None
InstructorEmbedding
- pydantic model llama_index.embeddings.instructor.InstructorEmbedding
JSON schema:
{ "title": "InstructorEmbedding", "description": "Base class for embeddings.", "type": "object", "properties": { "model_name": { "title": "Model Name", "description": "The name of the embedding model.", "default": "unknown", "type": "string" }, "embed_batch_size": { "title": "Embed Batch Size", "description": "The batch size for embedding calls.", "default": 10, "type": "integer" }, "callback_manager": { "title": "Callback Manager" }, "query_instruction": { "title": "Query Instruction", "description": "Instruction to prepend to query text.", "type": "string" }, "text_instruction": { "title": "Text Instruction", "description": "Instruction to prepend to text.", "type": "string" }, "cache_folder": { "title": "Cache Folder", "description": "Cache folder for huggingface files.", "type": "string" } } }
- Config
arbitrary_types_allowed: bool = True
- Fields
- Validators
_validate_callback_manager » callback_manager
- field cache_folder: Optional[str] = None
Cache folder for huggingface files.
- field query_instruction: Optional[str] = None
Instruction to prepend to query text.
- field text_instruction: Optional[str] = None
Instruction to prepend to text.
- classmethod class_name() -> str
Get class name.
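The query_instruction and text_instruction fields are described as instructions to prepend to the query text and to the document text. A minimal sketch of that idea follows. The helper `apply_instruction`, the single-space separator, and the example instruction strings are all assumptions made for illustration; this section does not say how the class joins the instruction and the text.

```python
from typing import Optional


def apply_instruction(text: str, instruction: Optional[str]) -> str:
    """Prepend `instruction` to `text`; return `text` unchanged if none is set."""
    if not instruction:
        return text
    # Assumption: instruction and text are joined by a single space.
    return f"{instruction.strip()} {text.strip()}"


# Hypothetical instruction strings, chosen only for this example.
query_instruction = "Represent the question for retrieving documents:"
text_instruction = "Represent the document for retrieval:"

print(apply_instruction("What is an embedding?", query_instruction))
print(apply_instruction("An embedding maps text to a vector.", text_instruction))
print(apply_instruction("No instruction configured.", None))
```

Keeping separate instructions for queries and documents lets the same model embed the two kinds of input differently, which is the usual motivation for instruction-tuned embedding models.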
LangchainEmbedding
- pydantic model llama_index.embeddings.langchain.LangchainEmbedding
External embeddings (taken from Langchain).
- Parameters
langchain_embedding (langchain.embeddings.Embeddings): Langchain embeddings class.
JSON schema:
{ "title": "LangchainEmbedding", "description": "External embeddings (taken from Langchain).\n\nArgs:\n langchain_embedding (langchain.embeddings.Embeddings): Langchain\n embeddings class.", "type": "object", "properties": { "model_name": { "title": "Model Name", "description": "The name of the embedding model.", "default": "unknown", "type": "string" }, "embed_batch_size": { "title": "Embed Batch Size", "description": "The batch size for embedding calls.", "default": 10, "type": "integer" }, "callback_manager": { "title": "Callback Manager" } } }
- Config
arbitrary_types_allowed: bool = True
- Fields
- Validators
_validate_callback_manager » callback_manager
- classmethod class_name() -> str
Get class name.
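To make the wrapper idea concrete, here is a self-contained sketch of delegating to an object that follows Langchain's Embeddings interface (embed_documents and embed_query). It is not the LangchainEmbedding class itself: `FakeLangchainEmbeddings` is a toy stand-in so the example runs without Langchain installed, and `LangchainAdapterSketch` with its method names `get_query_embedding` and `get_text_embedding` is an assumption about the shape such a wrapper might take.

```python
from typing import List


class FakeLangchainEmbeddings:
    """Toy stand-in exposing the two methods of Langchain's Embeddings interface."""

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # Toy "embedding": [character count, number of spaces].
        return [[float(len(t)), float(t.count(" "))] for t in texts]

    def embed_query(self, text: str) -> List[float]:
        return self.embed_documents([text])[0]


class LangchainAdapterSketch:
    """Hypothetical wrapper; not the library's LangchainEmbedding."""

    def __init__(self, langchain_embedding: FakeLangchainEmbeddings) -> None:
        self._langchain_embedding = langchain_embedding

    def get_query_embedding(self, query: str) -> List[float]:
        # Queries are delegated to the wrapped object's embed_query.
        return self._langchain_embedding.embed_query(query)

    def get_text_embedding(self, text: str) -> List[float]:
        # A single document is embedded as a one-element batch.
        return self._langchain_embedding.embed_documents([text])[0]


adapter = LangchainAdapterSketch(FakeLangchainEmbeddings())
print(adapter.get_query_embedding("hello world"))  # [11.0, 1.0]
print(adapter.get_text_embedding("abc"))           # [3.0, 0.0]
```

In real use, the object passed as langchain_embedding would be an instance of a Langchain embeddings class rather than the toy stand-in shown here.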
GoogleUnivSentEncoderEmbedding
- pydantic model llama_index.embeddings.google.GoogleUnivSentEncoderEmbedding
JSON schema:
{ "title": "GoogleUnivSentEncoderEmbedding", "description": "Base class for embeddings.", "type": "object", "properties": { "model_name": { "title": "Model Name", "description": "The name of the embedding model.", "default": "unknown", "type": "string" }, "embed_batch_size": { "title": "Embed Batch Size", "description": "The batch size for embedding calls.", "default": 10, "type": "integer" }, "callback_manager": { "title": "Callback Manager" } } }
- Config
arbitrary_types_allowed: bool = True
- Fields
- Validators
_validate_callback_manager » callback_manager
- classmethod class_name() -> str
Get class name.