OpenLLM#

pydantic model llama_index.llms.openllm.OpenLLM#

OpenLLM LLM.

Show JSON schema
{
   "title": "OpenLLM",
   "description": "OpenLLM LLM.",
   "type": "object",
   "properties": {
      "callback_manager": {
         "title": "Callback Manager"
      },
      "system_prompt": {
         "title": "System Prompt",
         "description": "System prompt for LLM calls.",
         "type": "string"
      },
      "messages_to_prompt": {
         "title": "Messages To Prompt"
      },
      "completion_to_prompt": {
         "title": "Completion To Prompt"
      },
      "output_parser": {
         "title": "Output Parser"
      },
      "pydantic_program_mode": {
         "default": "default",
         "allOf": [
            {
               "$ref": "#/definitions/PydanticProgramMode"
            }
         ]
      },
      "query_wrapper_prompt": {
         "title": "Query Wrapper Prompt"
      },
      "model_id": {
         "title": "Model Id",
         "description": "Given Model ID from HuggingFace Hub. This can be either a pretrained ID or local path. This is synonymous to HuggingFace's '.from_pretrained' first argument",
         "type": "string"
      },
      "model_version": {
         "title": "Model Version",
         "description": "Optional model version to save the model as.",
         "type": "string"
      },
      "model_tag": {
         "title": "Model Tag",
         "description": "Optional tag to save to BentoML store.",
         "type": "string"
      },
      "prompt_template": {
         "title": "Prompt Template",
         "description": "Optional prompt template to pass for this LLM.",
         "type": "string"
      },
      "backend": {
         "title": "Backend",
         "description": "Optional backend to pass for this LLM. By default, it will use vLLM if vLLM is available in local system. Otherwise, it will fallback to PyTorch.",
         "enum": [
            "vllm",
            "pt"
         ],
         "type": "string"
      },
      "quantize": {
         "title": "Quantize",
         "description": "Optional quantization methods to use with this LLM. See OpenLLM's --quantize options from `openllm start` for more information.",
         "enum": [
            "awq",
            "gptq",
            "int8",
            "int4",
            "squeezellm"
         ],
         "type": "string"
      },
      "serialization": {
         "title": "Serialization",
         "description": "Optional serialization methods for this LLM to be save as. Default to 'safetensors', but will fallback to PyTorch pickle `.bin` on some models.",
         "enum": [
            "safetensors",
            "legacy"
         ],
         "type": "string"
      },
      "trust_remote_code": {
         "title": "Trust Remote Code",
         "description": "Optional flag to trust remote code. This is synonymous to Transformers' `trust_remote_code`. Default to False.",
         "type": "boolean"
      },
      "class_name": {
         "title": "Class Name",
         "type": "string",
         "default": "OpenLLM"
      }
   },
   "required": [
      "model_id",
      "serialization",
      "trust_remote_code"
   ],
   "definitions": {
      "PydanticProgramMode": {
         "title": "PydanticProgramMode",
         "description": "Pydantic program mode.",
         "enum": [
            "default",
            "openai",
            "llm",
            "guidance",
            "lm-format-enforcer"
         ],
         "type": "string"
      }
   }
}

Config
  • arbitrary_types_allowed: bool = True

Fields
Validators
  • _validate_callback_manager » callback_manager

  • set_completion_to_prompt » completion_to_prompt

  • set_messages_to_prompt » messages_to_prompt

field backend: Optional[Literal['vllm', 'pt']] = None#

Optional backend to use for this LLM. By default, vLLM is used if it is available on the local system; otherwise, it falls back to PyTorch.

field model_id: str [Required]#

Model ID from the HuggingFace Hub. This can be either a pretrained model ID or a local path. It is synonymous with the first argument of HuggingFace's `.from_pretrained`.

field model_tag: Optional[str] = None#

Optional tag to use when saving to the BentoML store.

field model_version: Optional[str] = None#

Optional model version to save the model as.

field prompt_template: Optional[str] = None#

Optional prompt template to use for this LLM.

field quantize: Optional[Literal['awq', 'gptq', 'int8', 'int4', 'squeezellm']] = None#

Optional quantization method to use with this LLM. See OpenLLM's `--quantize` option for `openllm start` for more information.

field serialization: Literal['safetensors', 'legacy'] [Required]#

Optional serialization method to save this LLM with. Defaults to 'safetensors', but will fall back to PyTorch pickle `.bin` for some models.

field trust_remote_code: bool [Required]#

Optional flag to trust remote code. This is synonymous with Transformers' `trust_remote_code`. Defaults to False.
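
For reference, a minimal usage sketch for a locally loaded model. The model ID, backend, and quantization values below are purely illustrative, and the import path may differ between llama_index versions:

from llama_index.llms.openllm import OpenLLM

# The model ID is a hypothetical example; any HuggingFace Hub ID or local path works.
llm = OpenLLM(
    model_id="HuggingFaceH4/zephyr-7b-alpha",
    backend="vllm",               # optional; defaults to vLLM when available, else PyTorch
    quantize="int4",              # optional; one of awq, gptq, int8, int4, squeezellm
    serialization="safetensors",  # default serialization format
    trust_remote_code=False,
)

response = llm.complete("What is OpenLLM?")
print(response.text)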

async achat(messages: Sequence[ChatMessage], **kwargs: Any) → Any#

Async chat endpoint for LLM.

async acomplete(*args: Any, **kwargs: Any) → Any#

Async completion endpoint for LLM.

astream_chat(messages: Sequence[ChatMessage], **kwargs: Any) → Any#

Async streaming chat endpoint for LLM.

astream_complete(*args: Any, **kwargs: Any) → Any#

Async streaming completion endpoint for LLM.
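
The async variants mirror their synchronous counterparts. A short sketch, assuming the `llm` instance constructed above:

import asyncio

async def main() -> None:
    # acomplete awaits a single completion response
    completion = await llm.acomplete("Tell me a joke.")
    print(completion.text)

    # astream_complete resolves to an async generator of partial responses
    async for chunk in await llm.astream_complete("Count to five."):
        print(chunk.delta, end="")

asyncio.run(main())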

chat(messages: Sequence[ChatMessage], **kwargs: Any) → Any#

Chat endpoint for LLM.
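
A short chat sketch, assuming the `llm` instance constructed above (the ChatMessage import path may vary by llama_index version):

from llama_index.llms import ChatMessage

messages = [
    ChatMessage(role="system", content="You are a concise assistant."),
    ChatMessage(role="user", content="What is BentoML?"),
]
chat_response = llm.chat(messages)
print(chat_response.message.content)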

classmethod class_name() → str#

Get the class name, used as a unique ID in serialization.

This provides a key that makes serialization robust against actual class name changes.

complete(*args: Any, **kwargs: Any) → Any#

Completion endpoint for LLM.

stream_chat(messages: Sequence[ChatMessage], **kwargs: Any) → Any#

Streaming chat endpoint for LLM.

stream_complete(*args: Any, **kwargs: Any) → Any#

Streaming completion endpoint for LLM.
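
Streaming endpoints yield partial responses as they are generated. A sketch with the `llm` instance from above:

# Each chunk is a partial completion response; .delta holds the newly generated text.
for chunk in llm.stream_complete("Write a haiku about model serving."):
    print(chunk.delta, end="", flush=True)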

property metadata: LLMMetadata#

LLM metadata.
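
The metadata property can be inspected directly, for example:

meta = llm.metadata
# LLMMetadata includes fields such as model_name and context_window.
print(meta.model_name, meta.context_window)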

pydantic model llama_index.llms.openllm.OpenLLMAPI#

OpenLLM Client interface. This is useful when interacting with a remote OpenLLM server.

Show JSON schema
{
   "title": "OpenLLMAPI",
   "description": "OpenLLM Client interface. This is useful when interacting with a remote OpenLLM server.",
   "type": "object",
   "properties": {
      "callback_manager": {
         "title": "Callback Manager"
      },
      "system_prompt": {
         "title": "System Prompt",
         "description": "System prompt for LLM calls.",
         "type": "string"
      },
      "messages_to_prompt": {
         "title": "Messages To Prompt"
      },
      "completion_to_prompt": {
         "title": "Completion To Prompt"
      },
      "output_parser": {
         "title": "Output Parser"
      },
      "pydantic_program_mode": {
         "default": "default",
         "allOf": [
            {
               "$ref": "#/definitions/PydanticProgramMode"
            }
         ]
      },
      "query_wrapper_prompt": {
         "title": "Query Wrapper Prompt"
      },
      "address": {
         "title": "Address",
         "description": "OpenLLM server address. This could either be set here or via OPENLLM_ENDPOINT",
         "type": "string"
      },
      "timeout": {
         "title": "Timeout",
         "description": "Timeout for sending requests.",
         "type": "integer"
      },
      "max_retries": {
         "title": "Max Retries",
         "description": "Maximum number of retries.",
         "type": "integer"
      },
      "api_version": {
         "title": "Api Version",
         "description": "OpenLLM Server API version.",
         "enum": [
            "v1"
         ],
         "type": "string"
      },
      "class_name": {
         "title": "Class Name",
         "type": "string",
         "default": "OpenLLM_Client"
      }
   },
   "required": [
      "timeout",
      "max_retries",
      "api_version"
   ],
   "definitions": {
      "PydanticProgramMode": {
         "title": "PydanticProgramMode",
         "description": "Pydantic program mode.",
         "enum": [
            "default",
            "openai",
            "llm",
            "guidance",
            "lm-format-enforcer"
         ],
         "type": "string"
      }
   }
}

Config
  • arbitrary_types_allowed: bool = True

Fields
Validators
  • _validate_callback_manager » callback_manager

  • set_completion_to_prompt » completion_to_prompt

  • set_messages_to_prompt » messages_to_prompt

field address: Optional[str] = None#

OpenLLM server address. This can be set here or via the OPENLLM_ENDPOINT environment variable.

field api_version: Literal['v1'] [Required]#

OpenLLM Server API version.

field max_retries: int [Required]#

Maximum number of retries.

field timeout: int [Required]#

Timeout for sending requests.
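
A minimal sketch for connecting to a remote server started with `openllm start`. The address below is only illustrative; alternatively, set the OPENLLM_ENDPOINT environment variable:

from llama_index.llms.openllm import OpenLLMAPI

remote_llm = OpenLLMAPI(address="http://localhost:3000")

print(remote_llm.complete("What is OpenLLM?").text)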

async achat(messages: Sequence[ChatMessage], **kwargs: Any) → Any#

Async chat endpoint for LLM.

async acomplete(*args: Any, **kwargs: Any) → Any#

Async completion endpoint for LLM.

astream_chat(messages: Sequence[ChatMessage], **kwargs: Any) → Any#

Async streaming chat endpoint for LLM.

astream_complete(*args: Any, **kwargs: Any) → Any#

Async streaming completion endpoint for LLM.

chat(messages: Sequence[ChatMessage], **kwargs: Any) → Any#

Chat endpoint for LLM.

classmethod class_name() → str#

Get the class name, used as a unique ID in serialization.

This provides a key that makes serialization robust against actual class name changes.

complete(*args: Any, **kwargs: Any) → Any#

Completion endpoint for LLM.

stream_chat(messages: Sequence[ChatMessage], **kwargs: Any) → Any#

Streaming chat endpoint for LLM.

stream_complete(*args: Any, **kwargs: Any) → Any#

Streaming completion endpoint for LLM.

property metadata: LLMMetadata#

LLM metadata.