Choosing the right Large Language Model (LLM) is one of the first decisions to make when building any LLM application over your data.

LLMs are a core component of LlamaIndex. They can be used as standalone modules or plugged into other core LlamaIndex modules (indices, retrievers, query engines). They are always used during the response synthesis step (i.e. after retrieval). Depending on the type of index being used, LLMs may also be used during index construction, insertion, and query traversal.

LlamaIndex provides a unified interface for defining LLM modules, whether they come from OpenAI, Hugging Face, or LangChain, so that you don’t have to write the boilerplate code for the LLM interface yourself. This interface consists of the following (more details below):

  • Support for text completion and chat endpoints (details below)

  • Support for streaming and non-streaming endpoints

  • Support for synchronous and asynchronous endpoints
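To make the three axes above concrete (completion vs. chat, streaming vs. non-streaming, synchronous vs. asynchronous), here is a minimal, library-free sketch. `EchoLLM` and this `ChatMessage` are hypothetical stand-ins written for illustration, not LlamaIndex classes, and the method names other than `complete` are assumptions chosen to suggest the shape of such an interface.

```python
import asyncio
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class ChatMessage:
    """Hypothetical stand-in for a chat message: a role plus its text."""
    role: str
    content: str


class EchoLLM:
    """Toy 'model' that upper-cases its input. Illustration only."""

    def complete(self, prompt: str) -> str:
        # Text completion, synchronous, non-streaming:
        # one prompt in, the full text out.
        return prompt.upper()

    def chat(self, messages: List[ChatMessage]) -> ChatMessage:
        # Chat: a list of role-tagged messages in, one assistant message out.
        last_user = [m for m in messages if m.role == "user"][-1]
        return ChatMessage(role="assistant", content=self.complete(last_user.content))

    def stream_complete(self, prompt: str) -> Iterator[str]:
        # Streaming: yield the response piece by piece instead of all at once.
        for word in self.complete(prompt).split():
            yield word

    async def acomplete(self, prompt: str) -> str:
        # Asynchronous: same result, but awaitable so callers can overlap requests.
        await asyncio.sleep(0)  # placeholder for a non-blocking network call
        return self.complete(prompt)


llm = EchoLLM()
completion = llm.complete("paul graham is")
reply = llm.chat([ChatMessage("system", "be brief"), ChatMessage("user", "hello")])
chunks = list(llm.stream_complete("paul graham is"))
async_completion = asyncio.run(llm.acomplete("paul graham is"))
```

The point of a unified interface is that calling code like the last five lines stays the same when the toy class is swapped for a real provider-backed one.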

Usage Pattern

The following code snippet shows how you can get started using LLMs.

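# Assumes an OpenAI API key is available, e.g. via the OPENAI_API_KEY environment variable.
from llama_index.llms import OpenAI

# non-streaming: the call returns once the full completion is available
resp = OpenAI().complete('Paul Graham is ')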