Jan 21, 2025
Jon Reifschneider, Cofounder & CEO
One of the most promising (and most talked about) applications of large language models is the development of AI agents. Agentic AI is at the core of our technology approach at Inquisite, and through our work we have developed an appreciation of the opportunities and limits of what agents can do today. In this post we share our perspective on the use of AI agents, particularly for R&D workflows, and we describe our own approach to implementing agents within the Inquisite platform and some of our lessons learned.
What are agents?
A good place to start is to define the term "agent" in the context of modern AI. This is more nuanced than it sounds, since researchers and companies use the term in a variety of different ways. Some organizations use the term to refer to fully autonomous systems capable of operating independently with no human oversight, while others use it for AI workflows that involve chained LLM calls or the use of tools to assist an LLM in answering a query.
A number of researchers have recently proposed using the term "agentic AI" rather than AI agents, to indicate that there is a spectrum of agent-like behavior rather than a binary agent / no agent differentiation. At Inquisite we subscribe to this view. We use agentic AI within our Assistants, but we structure it in a way that our human users retain control and stay within the decision-making loop on the decisions that matter.
Why agents?
There are good reasons not to use agentic AI; three of the main ones are the 1) added cost and latency, 2) additional system complexity, and 3) decreased reliability that come with agents. We subscribe to Occam's Razor, the principle that the simplest solution is usually the best, and so we implement agentic AI within Inquisite only when it is necessary to accomplish a specific objective for our users. When a problem can be solved with chained LLM calls or a prescriptive workflow involving LLMs, that is typically preferable for the reasons mentioned above.
However, in certain cases it is impossible to prescribe a defined workflow that solves the problem effectively across a variety of use cases, and this is where agents can add value.
Ingredients of agentic AI
There are three main ingredients of AI agents: query planning/reasoning, memory, and tool use.
Query planning
The key ingredient in an AI agent is the ability to plan a sequence of actions to take in response to a user's query or instruction. This sequence of actions is constrained by the set of actions available to the agent, which might include things like the ability to search the web, write and execute code, perform SQL queries, retrieve data from APIs, etc. A certain level of reasoning is required in order to build an effective plan, and thus agents will typically use higher-end LLMs that excel in reasoning for this capability.
The planning capability is often iterative, meaning that the agent will build and execute a plan, but if the result does not match the desired output the agent may restart its action loop by developing a new plan to execute, taking into account what it learned from the previous attempt.
Tool use
The ability to plan may be central to an AI agent, but a plan without the ability to execute it is worthless to the user. In order to execute the plan, agents require the ability to perform actions. These actions may involve searching various sources for information, writing and executing code, or performing actions on behalf of a user such as scheduling a meeting, responding to an email, or booking a flight.
The ability to perform actions is typically referred to as "tool use", meaning the LLM can use certain tools made available to it to perform its work. A key decision in building agents is to define the set of tools which will be made available to the agent to use. The selection of tools depends on the set of use cases one expects the agent to be able to handle. In practice, tools are made available to an agent by writing code functions which the agent can call as needed as part of its plan. The query planning step of its work involves determining which functions to call, in which order, and with which parameters.
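A common way to implement this is a simple registry mapping tool names to functions, with a dispatcher that executes whichever call the planner selects. The sketch below is illustrative only; the tool names, placeholder bodies, and dispatcher are assumptions, not Inquisite's actual implementation.

```python
# Hypothetical tool registry: each tool is a plain Python function the
# agent may call. Bodies are placeholders standing in for real searches.

def search_web(q: str) -> str:
    return f"results for: {q}"

def run_sql(query: str) -> str:
    return f"rows from: {query}"

TOOLS = {
    "search_web": search_web,
    "run_sql": run_sql,
}

def call_tool(name: str, **params):
    """Dispatch a planned step to the matching function. The planning
    LLM chooses `name` and `params`; the host code performs the call."""
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**params)
```

Keeping the registry explicit is also a safety boundary: the agent can only ever invoke functions the developer has deliberately exposed.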
Memory
Agents need to retain both a short-term and long-term memory in order to perform their work. Long-term memory is typically implemented using a database, often via what is referred to as retrieval-augmented generation (RAG). In RAG, information is stored in a database and encoded into numeric vectors called embeddings, which represent the meaning of each piece of information. Agents can search over the information by comparing a vector representing a query with the vectors for each piece of stored information and finding the most relevant matches.
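The comparison step is usually a similarity score between vectors, most commonly cosine similarity. Here is a toy sketch of that retrieval step; the three-dimensional "embeddings" and document titles are made up for illustration, whereas a real system would use an embedding model producing vectors with hundreds or thousands of dimensions, stored in a vector database.

```python
import math

def cosine(a, b):
    """Cosine similarity: dot product of the vectors divided by the
    product of their magnitudes. 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy store: (document, embedding) pairs with made-up 3-d vectors.
STORE = [
    ("protein folding notes",     [0.9, 0.1, 0.0]),
    ("battery chemistry review",  [0.1, 0.8, 0.2]),
    ("clinical trial design memo", [0.0, 0.2, 0.9]),
]

def retrieve(query_vec, k=1):
    """Return the k stored documents most similar to the query vector."""
    ranked = sorted(STORE, key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]
```

In practice the query vector comes from running the user's query through the same embedding model used to index the documents, so that semantically related text lands near it in vector space.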
Short-term memory is also needed to retain information as the agent progresses through its sequence of actions. Many agents act iteratively, reverting back to planning when the output is deemed not to meet the requirements. This iterative behavior also requires the agent to retain memory of what it has previously tried and what the result was.
Looped Execution
This iterative behavior is a key strength of agents - rather than simply accepting and returning a sub-optimal result to the user at the end of a chain of LLM calls, agents have the ability to return to the start and try alternative paths, seeking to find a better output for the user.
Image source: Anthropic
However, this looped execution pattern also has drawbacks - most notably that each iteration of the loop requires significant additional cost and latency. Guardrails must be designed to manage costs so that the unit economics of the system make sense for all parties.
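One simple form such guardrails can take is an iteration cap plus a cost budget wrapped around the plan-execute-evaluate loop. The sketch below is an assumption-laden illustration, not a description of any particular product: `plan`, `execute`, and `evaluate` are hypothetical stand-ins for the LLM-backed steps, and the per-iteration cost is a made-up figure.

```python
MAX_ITERATIONS = 3   # guardrail: cap on replanning loops
COST_BUDGET = 0.10   # guardrail: illustrative spend ceiling, in dollars

def run_agent(query, plan, execute, evaluate, cost_per_iteration=0.03):
    """Iterate plan -> execute -> evaluate until the result is accepted
    or a guardrail (iteration cap or cost budget) stops the loop."""
    spent = 0.0
    feedback = None
    history = []  # short-term memory of attempts and their outcomes
    for attempt in range(1, MAX_ITERATIONS + 1):
        if spent + cost_per_iteration > COST_BUDGET:
            break  # stop before exceeding the budget
        spent += cost_per_iteration
        steps = plan(query, feedback)          # (re)plan, using feedback
        result = execute(steps)                # run the planned tool calls
        ok, feedback = evaluate(query, result) # did it meet requirements?
        history.append((steps, result, feedback))
        if ok:
            return result, attempt, spent
    # Guardrail hit: return the best effort so far rather than looping on.
    return (history[-1][1] if history else None), len(history), spent
```

Recording each attempt in `history` is also what gives the planner its short-term memory: the feedback passed back into `plan` lets the next iteration avoid repeating a failed approach.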
Inquisite's approach to agentic AI
Our Assistants follow the agentic AI design pattern:
We implement query planning, allowing an LLM to develop a customized plan for each user query rather than trying to force-fit queries to a set of rules. Planning often involves attempting to understand the user's intent, determining where to search for relevant information, and formulating a set of search queries to find the information needed.
Assistants have access to limited sets of tools to use, most of which enable them to search for information in various places. Assistants have the ability to perform searches across different databases and APIs as needed for their task.
Assistants retain both long-term memory using our graph database as well as short-term memory during execution.
These design characteristics enable our Assistants to be flexible in responding to users' queries. However, we take a very practical approach to the level of flexibility we provide each Assistant. We believe that the simplest solution is best, and we add complexity only when needed to provide clear user benefit. We constrain the set of actions available and limit the degree of iterative behavior for each Assistant based on the task it performs. We also seek to get as much specific information as possible from the user on what they are seeking, rather than attempting to guess at it using LLMs. This enables our Assistants to have a high degree of reliability in performing their tasks, while also protecting against compounding errors and high costs due to the iterative behavior.
We also firmly believe that agents must be driven by humans, rather than acting autonomously with no oversight. This is particularly true in our space, where agents provide critical information and perform critical tasks that have high impact on our users' organizations. This belief is reflected in our use of the term "Assistants", rather than agents. We design our Assistants to augment our human users, not replace them with automation. By doing this, we can greatly accelerate their workflows while still giving them control over the process, and free our users up to focus their time on moving the core science forward.