Demystifying AI knowledge agents

A plain English guide that explains how they work and what they can do for your organization
Michelle Greer|Jan 23, 2025

When you read about AI agents, you probably hear a lot about automation or augmentation of work, usually accompanied by graphics of human-like robots.

While science fiction can be fun, this portrayal insinuates that agents exist only in the future. In reality, from drug discovery to helping employees navigate their workplace, AI is being used in enterprises right now. AI agents can not only summarize data, but can also autonomously take action on it on our behalf. To perform work, agents need access to a handful of capabilities, which we'll walk through in this post.

In this blog series on demystifying AI agents, we’ll break down how these capabilities work in practice. Once you understand how agents work and what they do, you’ll see how they can be useful not in some distant utopian future, but today. We’re going to start with a common use case: knowledge agents.

What are knowledge agents and why are they useful?

Enterprises generate and manage petabytes of data, and most of this data is unstructured. From blog posts to company guidebooks and intranets, it can be really hard for people to even track changes to a particular subject at an organization, much less act on what they find. And if you do end up becoming the subject matter expert on a customer, product, or topic, you often have to spend countless hours explaining that topic to others instead of getting better at your craft. At today’s pace of business, it just doesn’t scale.

Enter AI. A knowledge agent can do what a single human can’t. It can synthesize and interpret massive amounts of unstructured data to actually generate context-aware, nuanced answers as well as act on them.

Who wouldn’t want this kind of help at work? Knowledge agents can be incredibly useful in the following ways:

  • Save people time: Knowledge agents don’t just deliver search results. They provide specific answers drawn from unstructured data that lives across any number of sources. This means employees can find accurate answers in seconds instead of stumbling through various documents.
  • Generate net new content: Writer’s block is real. Knowledge agents can augment your team by generating unique content based on the data provided, such as an email, thoughtful summary, blog post, instructions, rap battle response, etc.
  • Take action based on these answers: Integrate your individual knowledge agent into a more robust agent that can complete complex tasks: e.g. “find any negative reviews of Product X across all of these surveys, provide specific feedback items that we could use to improve, and then post this in the #feedback Slack channel.”

What kind of data can knowledge agents interpret?

Knowledge agents can help people make sense of a lot of information. This information could be hosted in one place like a giant knowledge base, or it can live across many different sources like Slack, Salesforce, and Confluence. Knowledge agents can track subjects, customers, products, competitors, reviews, or any topic in real time. If you can provide the content, an AI agent can learn about it – even if topics are scattered across many different data sources. They can make sense of volumes of unstructured text and images from PDFs or web pages, as well as data from SaaS tools like Salesforce, Google Docs, or Slack.

There are countless ideas for knowledge agents, and we’ll look at some common use cases later in this post.

When we think of AI, we often think of text outputs. But any kind of content can be interpreted by AI agents. Knowledge agents can be multi-modal, meaning they can interpret not only text, but video, images, and audio inputs as well.

How do knowledge agents work?

AI agents rely on Large Language Models (LLMs) like OpenAI’s GPT, Google’s Gemini, Mistral, or Meta’s Llama to interpret and generate responses. While LLMs can seem almost human-like, they learn language differently than we do. LLMs rely on sophisticated machine learning algorithms to understand patterns, context, and nuance in language, and they use training data to build complex neural networks. Some agents leverage public LLMs like GPT, Claude, or Mistral at their core. Many AI teams decide to build their own LLMs instead. This is more time consuming, but ensures that an LLM can securely use inputs as training data to improve accuracy.

Knowledge agents and LLMs aren’t magic. They tokenize content, i.e. break each word or piece of content down into parts. Then they convert those parts into something computers understand: numbers. In this case, agents convert text into lists of numbers, which are called vectors.
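
To make that concrete, here is a tiny sketch using OpenAI’s open-source tiktoken tokenizer. This is purely illustrative: every LLM ships its own tokenizer, and the exact token boundaries and IDs vary by model.

```python
# pip install tiktoken
import tiktoken

# Load a tokenizer used by many recent OpenAI models (an assumption;
# other models use different tokenizers and vocabularies).
enc = tiktoken.get_encoding("cl100k_base")

text = "Knowledge agents tokenize content before turning it into vectors."
token_ids = enc.encode(text)

print(token_ids)                              # a list of integers, one per token
print([enc.decode([t]) for t in token_ids])   # the text fragment behind each token
```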

So in order to build a knowledge agent, someone or something has to break down your private data into contextual vectors, store those vectors in a vector database, and then keep that data up to date. Let’s walk through what this looks like:

Let’s assume you want an HR assistant that answers company policy questions based on the tome that is your online company procedural guidebook. You decide to build an AI agent instead of relying on search, because it will be able to deliver helpful summaries much faster as well as understand more conversational questions or instructions. At the core of this agent are an LLM and Retrieval-Augmented Generation (RAG), an automated capability of the Squid platform. How does it work?

  • Cleaning: First, you have to give your company handbook to an AI agent. Ideally this data is correctly formatted, deduplicated, and does not include irrelevant data, like HTML tags, employee comments, etc.
  • Contextualizing: Once your data is clean, it helps to tag this data with metadata or descriptions, which is just data about data. Metadata gives each part of your data context and improves its accuracy and usefulness.
  • Chunking: This unstructured data must be broken down into tokens and grouped into manageable pieces that fit within an LLM’s token limits, a process called chunking.
  • Embedding: Then, these chunks are converted into vectors in a process called embedding. Embeddings are numerical representations of text that capture the semantic meaning of words or sentences. The goal of embedding is to map similar meanings to vectors that are close together in space. For example: the word “employee” would likely be mapped closely to “company”, but not to “snacks.”
  • Loading: These embeddings are then loaded into a vector database so they can be securely accessed by LLMs.
  • Updating: There needs to be a way to ensure data is always up to date. This can be accomplished with some sort of automated or manual trigger. Some content, like stock prices or customer status, must be current up to the minute or even the second; other content, like manuals, doesn’t change very often. (A minimal sketch of the chunking, embedding, and loading steps follows this list.)
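
Here is a minimal sketch of the chunking, embedding, and loading steps above. It assumes the OpenAI embeddings API and uses a plain Python list as a stand-in for a real vector database; the handbook text, chunk size, and model name are placeholders, and a managed RAG pipeline like Squid’s automates these steps for you.

```python
# pip install openai numpy
from openai import OpenAI
import numpy as np

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Placeholder content standing in for cleaned handbook sections.
handbook_sections = [
    "If a water pipe bursts, shut off the main valve and call facilities at x4455.",
    "Employees accrue paid time off at 1.5 days per month.",
    "Visitors must sign in at the front desk and wear a badge at all times.",
]

def chunk(text: str, max_chars: int = 500) -> list[str]:
    """Naive chunking by size; real pipelines usually split on headings or sentences."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

chunks = [c for section in handbook_sections for c in chunk(section)]

# Embedding: turn each chunk into a vector that captures its meaning.
response = client.embeddings.create(model="text-embedding-3-small", input=chunks)
vectors = np.array([item.embedding for item in response.data])

# Loading: an in-memory "vector database" of (chunk, vector) pairs.
vector_store = list(zip(chunks, vectors))
print(f"Stored {len(vector_store)} chunks, each as a {vectors.shape[1]}-dimensional vector")
```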

A knowledge agent needs instructions, which can also be vectorized. This includes what to do and not do, a tone of voice, etc. It needs a way to test and fine-tune results, a process called evaluation. It needs a setting that controls how deterministic or creative its responses should be, called temperature. AI agent builders need to secure their data by integrating with an Identity Provider (IdP) like Okta, typically via a protocol such as OAuth, so that underlying data permissions are respected. And finally, the AI agent needs a user interface.
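
To make the instructions and temperature pieces concrete, here is a small illustrative sketch using the OpenAI chat API. The system prompt, model name, and temperature value are assumptions you would tune during evaluation, not a prescription from any particular platform.

```python
from openai import OpenAI

client = OpenAI()

# Instructions: what the agent should and shouldn't do, and its tone of voice.
SYSTEM_INSTRUCTIONS = (
    "You are an HR assistant. Answer only from the provided handbook excerpts. "
    "If the answer isn't in the excerpts, say you don't know. Be concise and calm."
)

completion = client.chat.completions.create(
    model="gpt-4o-mini",   # placeholder model name
    temperature=0.2,       # low temperature = more deterministic, less creative answers
    messages=[
        {"role": "system", "content": SYSTEM_INSTRUCTIONS},
        {"role": "user", "content": "How many vacation days do I get?"},
    ],
)
print(completion.choices[0].message.content)
```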

So what happens when you ask a question to a knowledge agent?

So now the fun part. Imagine a field employee at a company facility encounters a leaking pipe, which she knows could flood the building. She is scared but isn’t sure what the proper procedure is for this encounter, and she needs an answer quickly. She asks your knowledge agent:

“What do I do if a water pipe bursts in our facility?”

What / do / I / do / if / a / water / pipe / bursts / in / our / facility / ?

Your knowledge agent’s embedding model would break each word and punctuation mark in this sentence into tokens. These tokens are then converted into a numerical embedding vector, and the agent does a similarity search against the vector database to find the relevant chunks. In this case, the words “water,” “pipe,” and “bursts” provide helpful context. The AI agent then formulates a tokenized LLM prompt that includes the retrieved text, the query, and the agent’s instructions. The LLM processes the prompt, then creates an answer, which is detokenized and delivered back in natural language.
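
Continuing the earlier sketches (run them first so `client`, `vector_store`, and `SYSTEM_INSTRUCTIONS` exist), query time might look roughly like this. The cosine-similarity ranking, top-3 cutoff, and prompt layout are illustrative assumptions, not any product’s actual implementation.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

question = "What do I do if a water pipe bursts in our facility?"

# 1. Embed the question with the same model used for the handbook chunks.
q_vec = np.array(
    client.embeddings.create(model="text-embedding-3-small", input=[question])
    .data[0].embedding
)

# 2. Similarity search: rank stored chunks by how close they are to the question.
ranked = sorted(vector_store, key=lambda pair: cosine_similarity(q_vec, pair[1]), reverse=True)
context = "\n".join(chunk for chunk, _ in ranked[:3])  # top 3 chunks as context

# 3. Assemble the prompt: retrieved text + the user's query + the agent's instructions.
answer = client.chat.completions.create(
    model="gpt-4o-mini",
    temperature=0.2,
    messages=[
        {"role": "system", "content": SYSTEM_INSTRUCTIONS},
        {"role": "user", "content": f"Handbook excerpts:\n{context}\n\nQuestion: {question}"},
    ],
)
print(answer.choices[0].message.content)
```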

What other benefits do knowledge agents have?

When we think of a knowledge agent, it’s easy to fall into the trap of seeing it as a glorified search engine or a ChatGPT wrapper. But knowledge agents can be much more than this. Here are some examples of what you can do:

  • Solve complex problems: When we deliver instructions to humans, there is often some deductive reasoning that helps us come to the right solution. Using chain of thought, AI agents can break down a problem into steps, or decipher ambiguous answers.
  • Support multiple mediums: Agents can have multi-modal LLM support, meaning they support more than just text. An agent can recognize and compare images, video, or audio data too. Perhaps the poor employee above also encounters a chemical leak she doesn’t even recognize. She can snap a picture, send it to the agent, and ask “What is this???”
    The AI agent will not only recognize what’s in her image, but also pick up on the urgency indicated by the three question marks.
  • Take action on data: Do you want to build not just a knowledge expert, but an assistant? By leveraging AI functions in agents, knowledge agents can integrate with other systems like CRMs, messaging apps, and more, and take action on a user’s behalf.
    Take our leaky pipe example. Our poor employee can not only read that she should get inside her vehicle and call a specialized maintenance team for help; the AI agent can also be programmed to deliver a pre-canned message to that team at the click of a button (see the sketch after this list). Thanks, helpful AI agent!
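
As a rough sketch of that “take action” idea (an illustration only, not a specific platform’s API), the agent can be handed a function it is allowed to call, such as posting a pre-canned alert to a Slack incoming webhook you configure.

```python
# pip install openai requests
import json
import requests
from openai import OpenAI

client = OpenAI()
SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/..."  # placeholder webhook you configure

def notify_maintenance(message: str) -> None:
    """Post a pre-canned alert to the maintenance team's Slack channel."""
    requests.post(SLACK_WEBHOOK_URL, json={"text": message})

# Describe the function so the LLM can decide when to call it.
tools = [{
    "type": "function",
    "function": {
        "name": "notify_maintenance",
        "description": "Alert the maintenance team about a facilities emergency.",
        "parameters": {
            "type": "object",
            "properties": {"message": {"type": "string"}},
            "required": ["message"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "A water pipe just burst in building B. Get help."}],
    tools=tools,
)

# If the model chose to call the tool, execute it on the user's behalf.
for call in response.choices[0].message.tool_calls or []:
    if call.function.name == "notify_maintenance":
        notify_maintenance(json.loads(call.function.arguments)["message"])
```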

What are use cases that people are tackling with knowledge agents?

We’ve already talked about support and HR agents. Here are some other common use cases for knowledge agents:

  • Sales and field enablement: Educating sales and field teams can mean the difference between making revenue numbers and not. An AI agent can not only answer questions about products and competitors, but can also quiz employees and walk through potential scenarios on calls.
  • Legal assistants: Do all contracts adhere to company and compliance policies? AI agents can scan countless documents and ensure they adhere to all appropriate terms and conditions.
  • Fraud prevention: AI agents can protect your company from risk and lost revenue by scanning documents and even images for red flags in real-time. They can compare two different kinds of documents, e.g. a potential fraudulent item and a baseline, to see if they match up.
  • Budget creation and tracking: AI agents that connect to your ERP can understand budgets at scale, compare plan vs. actual, create new budgets, monitor current ones, and generate reports automatically.

Get started with AI agents

AI agents can solve real world problems for your organization. If you’d like the benefits of AI agents without having to set up AI infrastructure, data connectors, or RAG engines, contact the team at Squid AI to discuss your use case. Squid AI offers configurable agents that can automate or augment a variety of tasks.

Get a free consultation