Building a Multi-PDF Agent using Query Pipelines and HyDE¶
In this example, we show you how to build a multi-PDF agent that can reason across multiple tools, each one corresponding to a RAG pipeline with HyDE over a document.
Author: https://github.com/DoganK01
Install Dependencies¶
%pip install llama-index-llms-openai
%pip install llama-index
%pip install pyvis
%pip install arize-phoenix[evals]
%pip install llama-index-callbacks-arize-phoenix
Download Data and Do Imports¶
!mkdir -p 'data/10k/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/uber_2021.pdf' -O 'data/10k/uber_2021.pdf'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/lyft_2021.pdf' -O 'data/10k/lyft_2021.pdf'
import os
import logging
import sys
logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.indices.query.query_transform import HyDEQueryTransform
from llama_index.core.query_engine import TransformQueryEngine
from IPython.display import Markdown, display
from llama_index.core import (
SimpleDirectoryReader,
VectorStoreIndex,
StorageContext,
load_index_from_storage,
)
from llama_index.core.tools import QueryEngineTool, ToolMetadata
# define global callback setting
from llama_index.core.settings import Settings
from llama_index.core.callbacks import CallbackManager
Setup Observability¶
callback_manager = CallbackManager()
Settings.callback_manager = callback_manager
# setup Arize Phoenix for logging/observability
import phoenix as px
import llama_index.core
px.launch_app()
llama_index.core.set_global_handler("arize_phoenix")
os.environ["OPENAI_API_KEY"] = "sk-"
Setup Multi Doc HyDE Query Engine / Tool¶
We setup HyDE Query engines and their tools for our multi doc system.
HyDE, short for Hypothetical Document Embeddings, is an innovative retrieval technique aimed at bolstering the efficiency of document retrieval processes. This method operates by crafting a hypothetical document tailored to an incoming query, which is subsequently embedded. The resulting embedding is leveraged to efficiently retrieve real documents exhibiting similarities to the hypothetical counterpart.
try:
storage_context = StorageContext.from_defaults(
persist_dir="./storage/lyft"
)
lyft_index = load_index_from_storage(storage_context)
storage_context = StorageContext.from_defaults(
persist_dir="./storage/uber"
)
uber_index = load_index_from_storage(storage_context)
index_loaded = True
except:
index_loaded = False
if not index_loaded:
# load data
lyft_docs = SimpleDirectoryReader(
input_files=["./data/10k/lyft_2021.pdf"]
).load_data()
uber_docs = SimpleDirectoryReader(
input_files=["./data/10k/uber_2021.pdf"]
).load_data()
# build index
lyft_index = VectorStoreIndex.from_documents(lyft_docs)
uber_index = VectorStoreIndex.from_documents(uber_docs)
# persist index
lyft_index.storage_context.persist(persist_dir="./storage/lyft")
uber_index.storage_context.persist(persist_dir="./storage/uber")
lyft_engine = lyft_index.as_query_engine(similarity_top_k=3)
uber_engine = uber_index.as_query_engine(similarity_top_k=3)
hyde = HyDEQueryTransform(include_original=True)
lyft_hyde_query_engine = TransformQueryEngine(lyft_engine, hyde)
uber_hyde_query_engine = TransformQueryEngine(uber_engine, hyde)
query_engine_tools = [
QueryEngineTool(
query_engine=lyft_hyde_query_engine,
metadata=ToolMetadata(
name="lyft_10k",
description=(
"Provides information about Lyft financials for year 2021. "
"Use a detailed plain text question as input to the tool."
),
),
),
QueryEngineTool(
query_engine=uber_hyde_query_engine,
metadata=ToolMetadata(
name="uber_10k",
description=(
"Provides information about Uber financials for year 2021. "
"Use a detailed plain text question as input to the tool."
),
),
),
]
Setup ReAct Agent Pipeline¶
What is ReAct Agent¶
ReAct is a technique that enables LLMs to reason and perform task-specific actions. It combines chain-of-thought reasoning with action planning. It enables LLMs to create reasoning tracks and task-specific actions, strengthening the synergy between them using memory.
The ReACT agent model refers to a framework that integrates the reasoning capabilities of LLMs with the ability to take actionable steps, creating a more sophisticated system that can understand and process information, evaluate situations, take appropriate actions, communicate responses, and track ongoing situations.
Reasoning Loop : The reasoning loop allows data agents to select and interact with tools in response to an input task.
Memory: LLMs, with access to memory, can store and retrieve data, ideal for apps tracking state or accessing multiple sources. Memory retains past interactions, enabling seamless reference to earlier conversation points. This integration involves allocating memory slots for relevant information and leveraging retrieval mechanisms during conversations. By recalling stored data, LLMs enhance contextual responses and integrate external sources, enriching user experiences.
Steps of the ReAct agent we will create¶
- Takes in agent inputs
- Calls ReAct prompt using LLM to generate next action/tool (or returns a response).
- If tool/action is selected, call tool pipeline to execute tool + collect response (In this case, our tools are HyDE Query Engine tools for both documents).
- If response is generated, get response.
An
AgentInputComponent
that allows you to convert the agent inputs (Task, state dictionary) into a set of inputs for the query pipeline.An
AgentFnComponent
: a general processor that allows you to take in the current Task, state, as well as any arbitrary inputs, and returns an output. In this cookbook we define a function component to format the ReAct prompt. However, you can put this anywhere
Note that any function passed into AgentFnComponent
and AgentInputComponent
MUST include task and state as input variables, as these are inputs passed from the agent.
Note that the output of an agentic query pipeline MUST be Tuple[AgentChatResponse, bool]
.
Task and State¶
Task: It contains the information required to fulfill the query requested by the user. User input, memory, metadatas, global states over time.
State: Some informations like memory
Agent Input Component¶
Generates inputs for the given task.
from llama_index.core.agent.react.types import (
ActionReasoningStep,
ObservationReasoningStep,
ResponseReasoningStep,
)
from llama_index.core.agent import Task, AgentChatResponse
from llama_index.core.query_pipeline import (
AgentInputComponent,
AgentFnComponent,
CustomAgentComponent,
QueryComponent,
ToolRunnerComponent,
)
from llama_index.core.llms import MessageRole
from typing import Dict, Any, Optional, Tuple, List, cast
## Agent Input Component
## This is the component that produces agent inputs to the rest of the components
## Can also put initialization logic here.
def agent_input_fn(task: Task, state: Dict[str, Any]) -> Dict[str, Any]:
"""Agent input function.
Returns:
A Dictionary of output keys and values. If you are specifying
src_key when defining links between this component and other
components, make sure the src_key matches the specified output_key.
"""
# initialize current_reasoning
if "current_reasoning" not in state:
state["current_reasoning"] = []
reasoning_step = ObservationReasoningStep(observation=task.input)
state["current_reasoning"].append(reasoning_step)
return {"input": task.input}
agent_input_component = AgentInputComponent(fn=agent_input_fn)
Define Agent Prompt¶
Here we define the agent component that generates a ReAct prompt, and after the output is generated from the LLM, parses into a structured object.
After the input is received, LLM is called with the ReAct agent prompt.
ReActChatFormatter
basically generates a fully formatted react prompt using ReAct Prompting (Chain-Of-Thought + Acting)
method
from llama_index.core.agent import ReActChatFormatter
from llama_index.core.query_pipeline import InputComponent, Link
from llama_index.core.llms import ChatMessage
from llama_index.core.tools import BaseTool
## define prompt function
def react_prompt_fn(
task: Task, state: Dict[str, Any], input: str, tools: List[BaseTool]
) -> List[ChatMessage]:
# Add input to reasoning
chat_formatter = ReActChatFormatter()
return chat_formatter.format(
tools,
chat_history=task.memory.get() + state["memory"].get_all(),
current_reasoning=state["current_reasoning"],
)
react_prompt_component = AgentFnComponent(
fn=react_prompt_fn, partial_dict={"tools": query_engine_tools}
)
Define Agent Output Parser + Tool Pipeline¶
Once the LLM gives an output, we have a decision tree:
If an answer is given, then we’re done. Process the output
If an action is given, we need to execute the specified tool with the specified args, and then process the output.
Tool calling can be done via the ToolRunnerComponent
module. This is a simple wrapper module that takes in a list of tools, and can be “executed” with the specified tool name (every tool has a name) and tool action.
We implement this overall module OutputAgentComponent
that subclasses CustomAgentComponent
.
perse_react_output_fn
function simply parses the ReAct prompt got from react_prompt_fn
into the reasoning step.
In this case, the ReAct Agent choose whatever go with a tool or done with tools and simply gets the output that will be fit in chat response for agents (AgentChatResponse
).
The run_tool_fn
function simply runs a tool if it is selected.
Finally, the incoming output is edited in accordance with the Agent output format by applying the process_agent_response_fn
function.
from typing import Set, Optional
from llama_index.core.agent.react.output_parser import ReActOutputParser
from llama_index.core.llms import ChatResponse
from llama_index.core.agent.types import Task
def parse_react_output_fn(
task: Task, state: Dict[str, Any], chat_response: ChatResponse
):
"""Parse ReAct output into a reasoning step."""
output_parser = ReActOutputParser()
reasoning_step = output_parser.parse(chat_response.message.content)
return {"done": reasoning_step.is_done, "reasoning_step": reasoning_step}
parse_react_output = AgentFnComponent(fn=parse_react_output_fn)
def run_tool_fn(
task: Task, state: Dict[str, Any], reasoning_step: ActionReasoningStep
):
"""Run tool and process tool output."""
tool_runner_component = ToolRunnerComponent(
query_engine_tools, callback_manager=task.callback_manager
)
tool_output = tool_runner_component.run_component(
tool_name=reasoning_step.action,
tool_input=reasoning_step.action_input,
)
observation_step = ObservationReasoningStep(observation=str(tool_output))
state["current_reasoning"].append(observation_step)
# TODO: get output
return {"response_str": observation_step.get_content(), "is_done": False}
run_tool = AgentFnComponent(fn=run_tool_fn)
def process_response_fn(
task: Task, state: Dict[str, Any], response_step: ResponseReasoningStep
):
"""Process response."""
state["current_reasoning"].append(response_step)
response_str = response_step.response
# Now that we're done with this step, put into memory
state["memory"].put(ChatMessage(content=task.input, role=MessageRole.USER))
state["memory"].put(
ChatMessage(content=response_str, role=MessageRole.ASSISTANT)
)
return {"response_str": response_str, "is_done": True}
process_response = AgentFnComponent(fn=process_response_fn)
def process_agent_response_fn(
task: Task, state: Dict[str, Any], response_dict: dict
):
"""Process agent response."""
return (
AgentChatResponse(response_dict["response_str"]),
response_dict["is_done"],
)
process_agent_response = AgentFnComponent(fn=process_agent_response_fn)
Stitch together Agent Query Pipeline¶
We can now stitch together the top-level agent pipeline: agent_input -> react_prompt -> llm -> react_output.
The last component is the if-else component that calls sub-components.
from llama_index.core.query_pipeline import QueryPipeline as QP
qp = QP(verbose=True)
from llama_index.core.query_pipeline import QueryPipeline as QP
from llama_index.llms.openai import OpenAI
qp.add_modules(
{
"agent_input": agent_input_component,
"react_prompt": react_prompt_component,
"llm": OpenAI(model="gpt-4-1106-preview"),
"react_output_parser": parse_react_output,
"run_tool": run_tool,
"process_response": process_response,
"process_agent_response": process_agent_response,
}
)
# link input to react prompt to parsed out response (either tool action/input or observation)
qp.add_chain(["agent_input", "react_prompt", "llm", "react_output_parser"])
# add conditional link from react output to tool call (if not done)
qp.add_link(
"react_output_parser",
"run_tool",
condition_fn=lambda x: not x["done"],
input_fn=lambda x: x["reasoning_step"],
)
# add conditional link from react output to final response processing (if done)
qp.add_link(
"react_output_parser",
"process_response",
condition_fn=lambda x: x["done"],
input_fn=lambda x: x["reasoning_step"],
)
# whether response processing or tool output processing, add link to final agent response
qp.add_link("process_response", "process_agent_response")
qp.add_link("run_tool", "process_agent_response")
Visualize Query Pipeline¶
from pyvis.network import Network
net = Network(notebook=True, cdn_resources="in_line", directed=True)
net.from_nx(qp.clean_dag)
print(net)
{ "Nodes": [ "agent_input", "react_prompt", "llm", "react_output_parser", "run_tool", "process_response", "process_agent_response" ], "Edges": [ { "src_key": null, "dest_key": null, "condition_fn": null, "input_fn": null, "width": 1, "from": "agent_input", "to": "react_prompt", "arrows": "to" }, { "src_key": null, "dest_key": null, "condition_fn": null, "input_fn": null, "width": 1, "from": "react_prompt", "to": "llm", "arrows": "to" }, { "src_key": null, "dest_key": null, "condition_fn": null, "input_fn": null, "width": 1, "from": "llm", "to": "react_output_parser", "arrows": "to" }, { "src_key": null, "dest_key": null, "width": 1, "from": "react_output_parser", "to": "run_tool", "arrows": "to" }, { "src_key": null, "dest_key": null, "width": 1, "from": "react_output_parser", "to": "process_response", "arrows": "to" }, { "src_key": null, "dest_key": null, "condition_fn": null, "input_fn": null, "width": 1, "from": "run_tool", "to": "process_agent_response", "arrows": "to" }, { "src_key": null, "dest_key": null, "condition_fn": null, "input_fn": null, "width": 1, "from": "process_response", "to": "process_agent_response", "arrows": "to" } ], "Height": "600px", "Width": "100%", "Heading": "" }
# Save the network as "agent_dat.html"
net.write_html("agent_dag.html")
from IPython.display import display, HTML
# Read the contents of the HTML file
with open("agent_dag.html", "r") as file:
html_content = file.read()
# Display the HTML content
display(HTML(html_content))
Setup Agent Worker around our Query Engines¶
from llama_index.core.agent import QueryPipelineAgentWorker
from llama_index.core.callbacks import CallbackManager
agent_worker = QueryPipelineAgentWorker(qp)
agent = agent_worker.as_agent(
callback_manager=CallbackManager([]), verbose=True
)
Run the Agent¶
# start task
task = agent.create_task(
"What was Uber's Management's Report on Internal Control over Financial Reporting?"
)
step_output = agent.run_step(task.task_id)
> Running step 26c623b9-0864-45d9-9f91-f893a4696727. Step input: What was Uber's Management's Report on Internal Control over Financial Reporting? > Running module agent_input with input: state: {'sources': [], 'memory': ChatMemoryBuffer(token_limit=3000, tokenizer_fn=functools.partial(<bound method Encoding.encode of <Encoding 'cl100k_base'>>, allowed_special='all'), chat_store=SimpleChatSto... task: task_id='aa7707d1-a35a-4d96-b2cc-ded765a3a3e2' input="What was Uber's Management's Report on Internal Control over Financial Reporting?" memory=ChatMemoryBuffer(token_limit=3000, tokenizer_fn=functool... > Running module react_prompt with input: input: What was Uber's Management's Report on Internal Control over Financial Reporting? > Running module llm with input: messages: [ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='\nYou are designed to help with a variety of tasks, from answering questions to providing summaries to other types of analyses.\n\n## Too... > Running module react_output_parser with input: chat_response: assistant: Thought: I need to use the uber_10k tool to find the specific section about Uber's Management's Report on Internal Control over Financial Reporting for the year 2021. Action: uber_10k Actio... > Running module run_tool with input: reasoning_step: thought="I need to use the uber_10k tool to find the specific section about Uber's Management's Report on Internal Control over Financial Reporting for the year 2021." action='uber_10k' action_input={... > Running module process_agent_response with input: response_dict: {'response_str': 'Observation: {\'output\': ToolOutput(content="Uber\'s Management\'s Report on Internal Control over Financial Reporting stated that they excluded The Drizly Group, Inc. and TupeloPar...
print(step_output)
Observation: {'output': ToolOutput(content="Uber's Management's Report on Internal Control over Financial Reporting stated that they excluded The Drizly Group, Inc. and TupeloParent, Inc. from their assessment of internal control over financial reporting as of December 31, 2021 due to their acquisition by the company during 2021. The report also mentioned that Drizly and TupeloParent were excluded from the audit of internal control over financial reporting.", tool_name='uber_10k', raw_input={'input': "What was Uber's Management's Report on Internal Control over Financial Reporting?"}, raw_output=Response(response="Uber's Management's Report on Internal Control over Financial Reporting stated that they excluded The Drizly Group, Inc. and TupeloParent, Inc. from their assessment of internal control over financial reporting as of December 31, 2021 due to their acquisition by the company during 2021. The report also mentioned that Drizly and TupeloParent were excluded from the audit of internal control over financial reporting.", source_nodes=[NodeWithScore(node=TextNode(id_='931833d8-5d9e-4f37-bd0b-1ffeb58f0256', embedding=None, metadata={'page_label': '73', 'file_name': 'uber_2021.pdf', 'file_path': 'data/10k/uber_2021.pdf', 'file_type': 'application/pdf', 'file_size': 1880483, 'creation_date': '2024-03-12', 'last_modified_date': '2024-03-12'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='239d870b-d23e-4805-8761-5e83f826f10a', node_type=<ObjectType.DOCUMENT: '4'>, metadata={'page_label': '73', 'file_name': 'uber_2021.pdf', 'file_path': 'data/10k/uber_2021.pdf', 'file_type': 'application/pdf', 'file_size': 1880483, 'creation_date': '2024-03-12', 'last_modified_date': '2024-03-12'}, hash='0037c6d1cdc56230c931100529a1d35ea0d556331ddae62b0b342c2403104e69'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='733e2fd3-fded-4e9d-8698-a16782d23a57', node_type=<ObjectType.TEXT: '1'>, metadata={'page_label': '73', 'file_name': 'uber_2021.pdf', 'file_path': 'data/10k/uber_2021.pdf', 'file_type': 'application/pdf', 'file_size': 1880483, 'creation_date': '2024-03-12', 'last_modified_date': '2024-03-12'}, hash='3c2f360445d5ff1069578c4a7ba2867bd32389a5ab93ecd5ee1c8998a7c1f5fa'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='4876454e-2f82-4460-937a-fe7398fb5974', node_type=<ObjectType.TEXT: '1'>, metadata={}, hash='bb5f14b14770778aa72510551a86642ba81040186ee787609ae76877639d2590')}, text='Our audits also included evaluating the accounting principles used and significantestimates\n made by management, as well as evaluating the overall presentation of the consolidated financial statements. Our audit of internal control over financialreporting\n included obtaining an understanding of internal control over financial reporting, assessing the risk that a material weakness exists, and testing andevaluating\n the design and operating effectiveness of internal control based on the assessed risk. Our audits also included performing such other procedures as weconsidered necessary in th\ne circumstances. We believe that our audits provide a reasonable basis for our opinions.As\n described in Management’s Report on Internal Control over Financial Reporting, management has excluded The Drizly Group, Inc. (“Drizly”) and TupeloParent, Inc. (“Transplace”) from its assessment of internal control over financial reporting as of December 3\n1, 2021 because they were acquired by the Company inpurchase\n business combinations during 2021. We have also excluded Drizly and Transplace from our audit of internal control over financial reporting. Drizly andTransplace\n are wholly-owned subsidiaries whose total assets and total revenues excluded from management’s assessment and our audit of internal control overfinancial\n reporting collectively represent approximately 3% and 4%, respectively, of the related consolidated financial statement amounts as of and for the yearended December 31, 2021.\nDefinition and Limitations of Internal Control over Financial Repor\ntingA\n company’s internal control over financial reporting is a process designed to provide reasonable assurance regarding the reliability of financial reporting and thepreparation\n of financial statements for external purposes in accordance with generally accepted accounting principles. A company’s internal control over financialreporting includes those policies and procedures that (i) pertain to the maintenance of records that, in reasonable detail, accurately an\nd fairly reflect the transactionsand\n dispositions of the assets of the company; (ii) provide reasonable assurance that transactions are recorded as necessary to permit preparation of financialstatements in accordance with\n generally accepted accounting principles, and that receipts and expenditures of the company are being made only in accordance withauthorizations of management\n and directors of the company; and (iii) provide reasonable assurance regarding71', start_char_idx=3733, end_char_idx=6274, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n'), score=0.9047882048131767), NodeWithScore(node=TextNode(id_='733e2fd3-fded-4e9d-8698-a16782d23a57', embedding=None, metadata={'page_label': '73', 'file_name': 'uber_2021.pdf', 'file_path': 'data/10k/uber_2021.pdf', 'file_type': 'application/pdf', 'file_size': 1880483, 'creation_date': '2024-03-12', 'last_modified_date': '2024-03-12'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='239d870b-d23e-4805-8761-5e83f826f10a', node_type=<ObjectType.DOCUMENT: '4'>, metadata={'page_label': '73', 'file_name': 'uber_2021.pdf', 'file_path': 'data/10k/uber_2021.pdf', 'file_type': 'application/pdf', 'file_size': 1880483, 'creation_date': '2024-03-12', 'last_modified_date': '2024-03-12'}, hash='0037c6d1cdc56230c931100529a1d35ea0d556331ddae62b0b342c2403104e69'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='5c1073c7-33f4-4f95-bb85-515619f4135a', node_type=<ObjectType.TEXT: '1'>, metadata={'page_label': '72', 'file_name': 'uber_2021.pdf', 'file_path': 'data/10k/uber_2021.pdf', 'file_type': 'application/pdf', 'file_size': 1880483, 'creation_date': '2024-03-12', 'last_modified_date': '2024-03-12'}, hash='5e414926a8e5d3a1ad31529426a96d04ce0fd55f12624297c511c6fa00845bb3'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='931833d8-5d9e-4f37-bd0b-1ffeb58f0256', node_type=<ObjectType.TEXT: '1'>, metadata={}, hash='87fae4857bb55d49c262875bb681de6578c8bf807af889f1cfdd6edc819c7ab2')}, text="Report of Independent Registered Public Accounting FirmTo the Board of Directors and Stockhold\ners of Uber Technologies, Inc.Opinions on the Financial Statements and Internal Control over Financial Reporting\nWe\n have audited the accompanying consolidated balance sheets of Uber Technologies, Inc. and its subsidiaries (the “Company”) as of December 31, 2021 and2020,\n and the related consolidated statements of operations, of comprehensive loss, of redeemable non-controlling interests and equity and of cash flows for eachof\n the three years in the period ended December 31, 2021, including the related notes and financial statement schedule listed in the accompanying index(collectively\n referred to as the “consolidated financial statements”). We also have audited the Company's internal control over financial reporting as of December31,\n 2021, based on criteria established in Internal Control - Integrated Framework (2013) issued by the Committee of Sponsoring Organizations of the TreadwayCommission (COSO).\nIn our\n opinion, the consolidated financial statements referred to above present fairly, in all material respects, the financial position of the Company as of December31,\n 2021 and 2020, and the results of its operations and its cash flows for each of the three years in the period ended December 31, 2021 in conformity withaccounting\n principles generally accepted in the United States of America. Also in our opinion, the Company maintained, in all material respects, effective internalcontrol over financia\nl reporting as of December 31, 2021, based on criteria established in Internal Control - Integrated Framework (2013) issued by the COSO.Changes in Accounting Principles\nAs discussed in\n Note 1 to the consolidated financial statements, the Company changed the manner in which it accounts for convertible instruments and contracts inan entity’s own equity in 2021 and the \nmanner in which it accounts for leases in 2019.Basis for Opinions\nThe\n Company's management is responsible for these consolidated financial statements, for maintaining effective internal control over financial reporting, and forits\n assessment of the effectiveness of internal control over financial reporting, included in Management’s Report on Internal Control over Financial Reportingappearing under Item\n 9A. Our responsibility is to express opinions on the Company’s consolidated financial statements and on the Company's internal control overfinancial\n reporting based on our audits. We are a public accounting firm registered with the Public Company Accounting Oversight Board (United States)(PCAOB)\n and are required to be independent with respect to the Company in accordance with the U.S. federal securities laws and the applicable rules andregulations of the Securi\nties and Exchange Commission and the PCAOB.We\n conducted our audits in accordance with the standards of the PCAOB. Those standards require that we plan and perform the audits to obtain reasonableassurance\n about whether the consolidated financial statements are free of material misstatement, whether due to error or fraud, and whether effective internalcontrol over financial reporti\nng was maintained in all material respects.Our\n audits of the consolidated financial statements included performing procedures to assess the risks of material misstatement of the consolidated financialstatements,\n whether due to error or fraud, and performing procedures that respond to those risks. Such procedures included examining, on a test basis, evidenceregarding\n the amounts and disclosures in the consolidated financial statements. Our audits also included evaluating the accounting principles used and significantestimates\n made by management, as well as evaluating the overall presentation of the consolidated financial statements. Our audit of internal control over financialreporting\n included obtaining an understanding of internal control over financial reporting, assessing the risk that a material weakness exists, and testing andevaluating\n the design and operating effectiveness of internal control based on the assessed risk. Our audits also included performing such other procedures as weconsidered necessary in th\ne circumstances. We believe that our audits provide a reasonable basis for our opinions.As\n described in Management’s Report on Internal Control over Financial Reporting, management has excluded The Drizly Group, Inc. (“Drizly”) and TupeloParent, Inc.", start_char_idx=0, end_char_idx=4599, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n'), score=0.8978082427817372), NodeWithScore(node=TextNode(id_='8bd26e38-840b-46ee-88d3-7bfe8f4285da', embedding=None, metadata={'page_label': '306', 'file_name': 'uber_2021.pdf', 'file_path': 'data/10k/uber_2021.pdf', 'file_type': 'application/pdf', 'file_size': 1880483, 'creation_date': '2024-03-12', 'last_modified_date': '2024-03-12'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='64b95267-2f6a-4c1f-8d8f-9de7fe76f4c8', node_type=<ObjectType.DOCUMENT: '4'>, metadata={'page_label': '306', 'file_name': 'uber_2021.pdf', 'file_path': 'data/10k/uber_2021.pdf', 'file_type': 'application/pdf', 'file_size': 1880483, 'creation_date': '2024-03-12', 'last_modified_date': '2024-03-12'}, hash='df98f8c54de64315e8021891795f855846ef72199e1261097b125f88c8178f86'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='75ed4d3d-893c-4466-8dc7-687f1934419b', node_type=<ObjectType.TEXT: '1'>, metadata={'page_label': '305', 'file_name': 'uber_2021.pdf', 'file_path': 'data/10k/uber_2021.pdf', 'file_type': 'application/pdf', 'file_size': 1880483, 'creation_date': '2024-03-12', 'last_modified_date': '2024-03-12'}, hash='2137604806ace0fa5b98cfb9734001992adc7b29e39ae0c0a7825cb9f6d5602e'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='cd51a0cc-80a7-421b-ae34-c7b914f997e6', node_type=<ObjectType.TEXT: '1'>, metadata={}, hash='3249f19608f14fd995c4e7168eeaecbb6a5663275b96522764a0d0521c921ae7')}, text='Exhibit 31.2CERTIFICATION OF PR\nINCIPAL FINANCIAL OFFICERPURSUANT TO EXCHANGE A\nCT RULES 13a-14(a) AND 15d-14(a)AS ADOPTED PURSUANT TO SECTI\nON 302 OF THE SARBANES-OXLEY ACT OF 2002I, Nelson Chai, certify that:\n1.\nI have reviewed this Annual Report on For m 10-K of Uber Technologies, Inc.;2.\nBased on my knowledge, this report does not contain any untrue statement of a material fact or omit to state a material fact necessary to make thestatements made, in l\night of the circumstances under which such statements were made, not misleading with respect to the period covered by this report;3.\nBased on my knowledge, the financia l statements, and other financial information included in this report, fairly present in all material respects thefinancial condition, r\nesults of operations and cash flows of the registrant as of, and for, the periods presented in this report;4.\nThe registrant’s other certifying officer and I are responsible for establishing and maintaining disclosure controls and procedures (as defined in ExchangeAct Rules 13a-15(e) and 15d-15(e)\n) and internal control over financial reporting (as defined in Exchange Act Rules 13a-15(f) and 15d-15(f)) for theregistrant and have:\n(a)\nDesigned such disclosure controls and procedures, or caused such disclosure controls and procedures to be designed under our supervision, toensure that materi\nal information relating to the registrant, including its consolidated subsidiaries, is made known to us by others within thoseentities, particul\narly during the period in which this report is being prepared;(b)\nDesigned such internal contro l over financial reporting, or caused such internal control over financial reporting to be designed under oursupervision, to provide reasonab\nle assurance regarding the reliability of financial reporting and the preparation of financial statements forexternal purposes in acc\nordance with generally accepted accounting principles;(c)\nEvaluated the effec tiveness of the registrant’s disclosure controls and procedures and presented in this report our conclusions about theeffectiveness of the d\nisclosure controls and procedures, as of the end of the period covered by this report based on such evaluation; and(d)\nDisclosed in this report any ch ange in the registrant’s internal control over financial reporting that occurred during the registrant’s most recentfiscal quarter (the registrant’s f\nourth fiscal quarter in the case of an annual report) that has materially affected, or is reasonably likely tomaterially affect, the registrant’s inter\nnal control over financial reporting; and5.\nThe registrant’s other certifying officer and I have disclosed, based on our most recent evaluation of internal control over financial reporting, to theregistrant’s auditors and\n the audit committee of the registrant’s board of directors (or persons performing the equivalent functions):(a)\nAll significant defic iencies and material weaknesses in the design or operation of internal control over financial reporting which are reasonablylikely to adversely affect the\n registrant’s ability to record, process, summarize and report financial information; and(b)\nAny fraud, whether or not mater ial, that involves management or other employees who have a significant role in the registrant’s internal controlover financial reporting.\nDate:\nFebruary 24, 2022 By: /s/ Nelson Chai Nelson Chai\nChief Financial Officer\n(Principal Financial Offic\ner)', start_char_idx=0, end_char_idx=3440, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n'), score=0.876161937500491)], metadata={'931833d8-5d9e-4f37-bd0b-1ffeb58f0256': {'page_label': '73', 'file_name': 'uber_2021.pdf', 'file_path': 'data/10k/uber_2021.pdf', 'file_type': 'application/pdf', 'file_size': 1880483, 'creation_date': '2024-03-12', 'last_modified_date': '2024-03-12'}, '733e2fd3-fded-4e9d-8698-a16782d23a57': {'page_label': '73', 'file_name': 'uber_2021.pdf', 'file_path': 'data/10k/uber_2021.pdf', 'file_type': 'application/pdf', 'file_size': 1880483, 'creation_date': '2024-03-12', 'last_modified_date': '2024-03-12'}, '8bd26e38-840b-46ee-88d3-7bfe8f4285da': {'page_label': '306', 'file_name': 'uber_2021.pdf', 'file_path': 'data/10k/uber_2021.pdf', 'file_type': 'application/pdf', 'file_size': 1880483, 'creation_date': '2024-03-12', 'last_modified_date': '2024-03-12'}}))}
# start task
task = agent.create_task("What was Lyft's revenue growth in 2021?")
step_output = agent.run_step(task.task_id)
> Running step ee9eff5f-a4be-4b76-bae6-d05d2af41ecd. Step input: What was Lyft's revenue growth in 2021? > Running module agent_input with input: state: {'sources': [], 'memory': ChatMemoryBuffer(token_limit=3000, tokenizer_fn=functools.partial(<bound method Encoding.encode of <Encoding 'cl100k_base'>>, allowed_special='all'), chat_store=SimpleChatSto... task: task_id='7a038afc-3ead-4b0c-a924-41cd4f465270' input="What was Lyft's revenue growth in 2021?" memory=ChatMemoryBuffer(token_limit=3000, tokenizer_fn=functools.partial(<bound method Encoding.encode of... > Running module react_prompt with input: input: What was Lyft's revenue growth in 2021? > Running module llm with input: messages: [ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='\nYou are designed to help with a variety of tasks, from answering questions to providing summaries to other types of analyses.\n\n## Too... > Running module react_output_parser with input: chat_response: assistant: Thought: I need to use the lyft_10k tool to find out the revenue growth for Lyft in 2021. Action: lyft_10k Action Input: {"input": "What was Lyft's revenue growth in 2021?"} > Running module run_tool with input: reasoning_step: thought='I need to use the lyft_10k tool to find out the revenue growth for Lyft in 2021.' action='lyft_10k' action_input={'input': "What was Lyft's revenue growth in 2021?"} > Running module process_agent_response with input: response_dict: {'response_str': 'Observation: {\'output\': ToolOutput(content="Lyft\'s revenue increased by 36% in 2021 compared to the prior year.", tool_name=\'lyft_10k\', raw_input={\'input\': "What was Lyft\'s r...
step_output = agent.run_step(task.task_id)
> Running step 279c7a46-ce9d-4202-bec4-01d5c1ed50bd. Step input: None > Running module agent_input with input: state: {'sources': [], 'memory': ChatMemoryBuffer(token_limit=3000, tokenizer_fn=functools.partial(<bound method Encoding.encode of <Encoding 'cl100k_base'>>, allowed_special='all'), chat_store=SimpleChatSto... task: task_id='7a038afc-3ead-4b0c-a924-41cd4f465270' input="What was Lyft's revenue growth in 2021?" memory=ChatMemoryBuffer(token_limit=3000, tokenizer_fn=functools.partial(<bound method Encoding.encode of... > Running module react_prompt with input: input: What was Lyft's revenue growth in 2021? > Running module llm with input: messages: [ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='\nYou are designed to help with a variety of tasks, from answering questions to providing summaries to other types of analyses.\n\n## Too... > Running module react_output_parser with input: chat_response: assistant: Thought: The user has repeated the question, but the tool has already provided the answer. Answer: Lyft's revenue increased by 36% in 2021 compared to the prior year. > Running module process_response with input: response_step: thought='The user has repeated the question, but the tool has already provided the answer.' response="Lyft's revenue increased by 36% in 2021 compared to the prior year." is_streaming=False > Running module process_agent_response with input: response_dict: {'response_str': "Lyft's revenue increased by 36% in 2021 compared to the prior year.", 'is_done': True}
step_output.is_last
True
print(step_output)
Uber's Management's Report on Internal Control over Financial Reporting for the year ended December 31, 2021, stated that management excluded The Drizly Group, Inc. ("Drizly") and TupeloParent, Inc. ("Transplace") from its assessment of internal control over financial reporting. This exclusion was due to their acquisition by the company during 2021. Drizly and Transplace were wholly-owned subsidiaries whose total assets and total revenues collectively represented approximately 3% and 4%, respectively, of the related consolidated financial statement amounts for the year.
response = agent.finalize_response(task.task_id)
print(str(response))
Uber's Management's Report on Internal Control over Financial Reporting for the year ended December 31, 2021, stated that management excluded The Drizly Group, Inc. ("Drizly") and TupeloParent, Inc. ("Transplace") from its assessment of internal control over financial reporting. This exclusion was due to their acquisition by the company during 2021. Drizly and Transplace were wholly-owned subsidiaries whose total assets and total revenues collectively represented approximately 3% and 4%, respectively, of the related consolidated financial statement amounts for the year.