LLMPredictor

Our LLMPredictor is a wrapper around LangChain's LLMChain that allows easy integration into LlamaIndex.

Wrapper functions around an LLM chain.
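
For illustration, here is a minimal sketch of wrapping a LangChain LLM in an LLMPredictor and handing it to an index. The GPTSimpleVectorIndex and SimpleDirectoryReader usage, the "data" directory, and the exact constructor and query signatures are assumptions that vary across LlamaIndex versions:

from langchain.llms import OpenAI

from llama_index import GPTSimpleVectorIndex, LLMPredictor, SimpleDirectoryReader

# Wrap a LangChain LLM; the predictor drives it through an LLMChain internally.
llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="text-davinci-003"))

# Pass the predictor to an index so that all LLM calls go through it.
# (Older releases accept llm_predictor in the index constructor; newer ones
# route it through a ServiceContext instead.)
documents = SimpleDirectoryReader("data").load_data()
index = GPTSimpleVectorIndex(documents, llm_predictor=llm_predictor)
print(index.query("What does this collection of documents describe?"))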

Our MockLLMPredictor is used to predict token usage without making real LLM calls. See the Cost Analysis How-To for more information.

Mock chain wrapper.

class llama_index.token_counter.mock_chain_wrapper.MockLLMPredictor(max_tokens: int = 256, llm: Optional[BaseLLM] = None)

Mock LLM Predictor.
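
As a sketch of the cost-analysis workflow referenced above, a MockLLMPredictor can stand in for the real LLM so that building an index only counts tokens instead of issuing API calls. The GPTTreeIndex and SimpleDirectoryReader usage and the "data" directory are assumptions; see the Cost Analysis How-To for the exact recipe for your version:

from llama_index import GPTTreeIndex, SimpleDirectoryReader
from llama_index.token_counter.mock_chain_wrapper import MockLLMPredictor

# The mock predictor never calls a real LLM; it emits placeholder responses,
# capping each mocked completion at max_tokens, while tallying token usage.
llm_predictor = MockLLMPredictor(max_tokens=256)

documents = SimpleDirectoryReader("data").load_data()
index = GPTTreeIndex(documents, llm_predictor=llm_predictor)

# last_token_usage reflects the most recent operation; total_tokens_used
# accumulates across all calls made through this predictor.
print(llm_predictor.last_token_usage)
print(llm_predictor.total_tokens_used)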

async apredict(prompt: Prompt, **prompt_args: Any) → Tuple[str, str]

Async predict the answer to a query.

Parameters

prompt (Prompt) – Prompt to use for prediction.

Returns

Tuple of the predicted answer and the formatted prompt.

Return type

Tuple[str, str]
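
A sketch of calling apredict directly; the QuestionAnswerPrompt import path and template variables are assumptions, and a real LLMPredictor backed by a LangChain OpenAI LLM is used here since async prediction ultimately relies on the underlying LLM:

import asyncio

from langchain.llms import OpenAI

from llama_index import LLMPredictor
from llama_index.prompts.prompts import QuestionAnswerPrompt

async def main() -> None:
    predictor = LLMPredictor(llm=OpenAI(temperature=0))
    # Template variables ({context_str}, {query_str}) are filled from **prompt_args.
    prompt = QuestionAnswerPrompt(
        "Context: {context_str}\nQuestion: {query_str}\nAnswer: "
    )
    answer, formatted_prompt = await predictor.apredict(
        prompt,
        context_str="LlamaIndex wraps LangChain LLMs.",
        query_str="What does LlamaIndex wrap?",
    )
    print(answer)

asyncio.run(main())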

get_llm_metadata() → LLMMetadata

Get LLM metadata.

property last_token_usage: int

Get the last token usage.

property llm: BaseLanguageModel

Get LLM.

predict(prompt: Prompt, **prompt_args: Any) → Tuple[str, str]

Predict the answer to a query.

Parameters

prompt (Prompt) – Prompt to use for prediction.

Returns

Tuple of the predicted answer and the formatted prompt.

Return type

Tuple[str, str]
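
The synchronous counterpart works the same way and also applies to MockLLMPredictor, which substitutes a mocked completion; the prompt class and template below are assumptions for illustration:

from llama_index.prompts.prompts import QuestionAnswerPrompt
from llama_index.token_counter.mock_chain_wrapper import MockLLMPredictor

predictor = MockLLMPredictor(max_tokens=256)
prompt = QuestionAnswerPrompt("Context: {context_str}\nQuestion: {query_str}\nAnswer: ")

# predict() returns both the (here mocked) answer and the fully formatted prompt string.
answer, formatted_prompt = predictor.predict(
    prompt,
    context_str="LlamaIndex wraps LangChain LLMs.",
    query_str="What does LlamaIndex wrap?",
)
print(answer)
print(formatted_prompt)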

stream(prompt: Prompt, **prompt_args: Any) → Tuple[Generator, str]

Stream the answer to a query.

NOTE: this is a beta feature. We will try to build or use better abstractions for response handling.

Parameters

prompt (Prompt) – Prompt to use for prediction.

Returns

Tuple of the response generator and the formatted prompt.

Return type

Tuple[Generator, str]
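
A sketch of consuming the streamed response; it assumes a predictor whose underlying LangChain LLM supports streaming (OpenAI with streaming=True here), since streaming ultimately depends on the wrapped LLM:

from langchain.llms import OpenAI

from llama_index import LLMPredictor
from llama_index.prompts.prompts import QuestionAnswerPrompt

predictor = LLMPredictor(llm=OpenAI(temperature=0, streaming=True))
prompt = QuestionAnswerPrompt("Context: {context_str}\nQuestion: {query_str}\nAnswer: ")

# The first element is a generator over response chunks; iterate to consume
# the answer incrementally. The second element is the formatted prompt.
response_gen, formatted_prompt = predictor.stream(
    prompt,
    context_str="LlamaIndex wraps LangChain LLMs.",
    query_str="What does LlamaIndex wrap?",
)
for chunk in response_gen:
    print(chunk, end="")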

property total_tokens_used: int

Get the total tokens used so far.