Prompts
Concept
Prompting is the fundamental input that gives LLMs their expressive power. LlamaIndex uses prompts to build the index, do insertion, perform traversal during querying, and to synthesize the final answer.
LlamaIndex uses a set of default prompt templates that work well out of the box.
In addition, there are prompts written and used specifically for chat models like gpt-3.5-turbo.
Users may also provide their own prompt templates to further customize the behavior of the framework. The best method for customizing is to copy the default prompt you want to change and use it as the base for any modifications.
Usage Pattern
Defining a custom prompt
Defining a custom prompt is as simple as creating a format string:
from llama_index.prompts import PromptTemplate
template = (
    "We have provided context information below. \n"
    "---------------------\n"
    "{context_str}"
    "\n---------------------\n"
    "Given this information, please answer the question: {query_str}\n"
)
qa_template = PromptTemplate(template)
# you can create a text prompt (for the completion API)
prompt = qa_template.format(context_str=..., query_str=...)

# or easily convert to message prompts (for the chat API)
messages = qa_template.format_messages(context_str=..., query_str=...)
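For example, filling the template with concrete values (the context and question here are purely illustrative):

# hypothetical values, for illustration only
prompt = qa_template.format(
    context_str="Acme Corp was founded in 2019 by Jane Doe.",
    query_str="Who founded Acme Corp?",
)
print(prompt)
# We have provided context information below.
# ---------------------
# Acme Corp was founded in 2019 by Jane Doe.
# ---------------------
# Given this information, please answer the question: Who founded Acme Corp?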
Note: you may see references to legacy prompt subclasses such as QuestionAnswerPrompt and RefinePrompt. These have been deprecated and are now type aliases of PromptTemplate. You can directly specify PromptTemplate(template) to construct custom prompts, but you still have to make sure the template string contains the expected parameters (e.g. {context_str} and {query_str}) when replacing a default question-answer prompt.
You can also define a template from chat messages:
from llama_index.prompts import ChatPromptTemplate, ChatMessage, MessageRole
message_templates = [
    ChatMessage(content="You are an expert system.", role=MessageRole.SYSTEM),
    ChatMessage(
        content="Generate a short story about {topic}",
        role=MessageRole.USER,
    ),
]
chat_template = ChatPromptTemplate(message_templates=message_templates)
# you can create message prompts (for the chat API)
messages = chat_template.format_messages(topic=...)

# or easily convert to a text prompt (for the completion API)
prompt = chat_template.format(topic=...)
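Again with an illustrative topic value:

messages = chat_template.format_messages(topic="a robot learning to paint")
# messages is a list of ChatMessage objects, roughly:
#   [system] You are an expert system.
#   [user]   Generate a short story about a robot learning to paint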
Passing custom prompts into the pipeline
Since LlamaIndex is a multi-step pipeline, it's important to identify the operation that you want to modify and pass in the custom prompt at the right place.
At a high level, prompts are used in 1) index construction, and 2) query engine execution.
The most commonly used prompts will be the text_qa_template and the refine_template.

- text_qa_template: used to get an initial answer to a query using retrieved nodes.
- refine_template: used when the retrieved text does not fit into a single LLM call with response_mode="compact" (the default), or when more than one node is retrieved using response_mode="refine". The answer from the first LLM call is inserted as existing_answer, and the LLM must update or repeat the existing answer based on the new context (see the sketch after this list).
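As a sketch, here is what a custom refine template might look like. The variable names ({query_str}, {existing_answer}, {context_msg}) mirror the default refine prompt at the time of writing; if you replace a default prompt, double-check its expected variables first.

from llama_index.prompts import PromptTemplate

# minimal sketch of a custom refine prompt; the wording is illustrative,
# but the three variables must match what the refine step fills in
custom_refine = PromptTemplate(
    "The original question is: {query_str}\n"
    "We have an existing answer: {existing_answer}\n"
    "Here is additional context:\n"
    "---------------------\n"
    "{context_msg}\n"
    "---------------------\n"
    "Refine the existing answer using this context if it is relevant; "
    "otherwise, repeat the existing answer.\n"
)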
Modify prompts used in index construction
Different indices use different types of prompts during construction (some don't use prompts at all).
For instance, TreeIndex uses a summary prompt to hierarchically summarize the nodes, and KeywordTableIndex uses a keyword extract prompt to extract keywords.
There are two equivalent ways to override the prompts:

1. via the default nodes constructor:

index = TreeIndex(nodes, summary_template=<custom_prompt>)

2. via the documents constructor:

index = TreeIndex.from_documents(docs, summary_template=<custom_prompt>)
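For example, a minimal sketch of a custom summary prompt (the default summary template fills in a {context_str} variable; verify against the default prompt you are replacing):

from llama_index.prompts import PromptTemplate

# illustrative wording; {context_str} receives the node text to summarize
custom_summary = PromptTemplate(
    "Summarize the following text in a few sentences:\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "SUMMARY: "
)
index = TreeIndex.from_documents(docs, summary_template=custom_summary)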
For more details on which index uses which prompts, please visit Index class references.
Modify prompts used in query engine
More commonly, prompts are used at query-time (i.e. for executing a query against an index and synthesizing the final response).
There are also two equivalent ways to override the prompts:

1. via the high-level API:

query_engine = index.as_query_engine(
    text_qa_template=<custom_qa_prompt>,
    refine_template=<custom_refine_prompt>,
)
2. via the low-level composition API:

from llama_index import get_response_synthesizer
from llama_index.query_engine import RetrieverQueryEngine

retriever = index.as_retriever()
synth = get_response_synthesizer(
    text_qa_template=<custom_qa_prompt>,
    refine_template=<custom_refine_prompt>,
)
query_engine = RetrieverQueryEngine(retriever, synth)
The two approaches above are equivalent, where 1 is essentially syntactic sugar for 2 and hides away the underlying complexity. You might want to use 1 to quickly modify some common parameters, and use 2 to have more granular control.
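Putting it together, a minimal end-to-end sketch using the high-level API (this assumes a local ./data directory of documents and reuses the qa_template defined earlier; the query is illustrative):

from llama_index import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("./data").load_data()  # hypothetical data directory
index = VectorStoreIndex.from_documents(documents)

# pass the custom QA prompt at query time
query_engine = index.as_query_engine(text_qa_template=qa_template)
response = query_engine.query("Who founded Acme Corp?")  # illustrative query
print(response)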
For more details on which classes use which prompts, please visit Query class references.
Check out the reference documentation for a full set of all prompts.