Summary
In this article, we explore an innovative method for creating persistent, intelligent AI agents through structured conversations with large language models (LLMs). By capturing insightful dialogues and transforming them into persistent entities, we can build customised AI partners that retain knowledge and apply it to various tasks. This user-friendly approach enables anyone to develop tailored AI solutions without the need for complex algorithms or extensive datasets, simply by engaging in natural conversations with LLMs.
Introduction
Have you ever had an incredibly insightful conversation with an AI, only to struggle to recreate that same level of brilliance later on? If you’ve found yourself lost in a sea of chat logs, desperately trying to replicate one amazing interaction you had with a large language model (LLM), the following approach may be the solution. By capturing and structuring those illuminating dialogues, you can transform fleeting conversations into lasting value.
So, how does it work?
Instead of repetitively entering the same prompts, hoping the AI will recreate that “perfect” interaction, this method allows you to transform great dialogues into persistent value. By capturing conversations where you naturally taught, collaborated with, or guided the AI, you’re not just chatting – you’re building an empowered team. You’ll create AI partners that retain what they’ve learned and can apply those skills whenever needed, rather than starting from scratch each time. These AI partners, or ‘agents,’ act like virtual assistants or collaborators, retaining the knowledge and skills you’ve imparted to them through your conversations.
Grounded in Conversation
The foundation is the conversation itself. We start by transforming dialogues with LLMs into AI agents. This agentic approach is grounded in the power of open discussion, keeping things simple: no complex algorithms or massive datasets, just a back-and-forth exchange.
For each agent you want on your team, you’ll have an in-depth conversation shaping the quality of output you need from it. These conversations between you and a large language model are facilitated by a lightweight proxy that stores the dialogues locally. A simple Python script lets you chat with the model while saving those interactions for later use. Think of it as a proxy that helps mold the discussions into fully fledged agents. It provides useful commands such as loading previous dialogues for deeper follow-up exchanges and saving new conversations to form additional agents.
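To make this concrete, here’s a minimal sketch of what such a proxy could look like. The command names, helper functions, message format, and the `call_llm` placeholder are assumptions of mine rather than the author’s actual script:

```python
import json
from pathlib import Path

def call_llm(messages: list[dict]) -> str:
    """Placeholder for whichever chat API you use (OpenAI, Gemini, etc.)."""
    raise NotImplementedError("Plug in your LLM client here.")

def save_history(messages: list[dict], path: str) -> None:
    """Persist a conversation locally so it can act as an agent's memory."""
    Path(path).write_text(json.dumps(messages, indent=2))

def load_history(path: str) -> list[dict]:
    """Load a stored conversation back in for deeper follow-up exchanges."""
    return json.loads(Path(path).read_text())

# A tiny interactive loop with three proxy commands: /save, /load, /quit.
history: list[dict] = []
while True:
    text = input("> ").strip()
    if text == "/quit":
        break
    elif text.startswith("/save "):
        save_history(history, text.split(" ", 1)[1])
    elif text.startswith("/load "):
        history = load_history(text.split(" ", 1)[1])
    else:
        history.append({"role": "user", "content": text})
        reply = call_llm(history)
        history.append({"role": "assistant", "content": reply})
        print(reply)
```

The JSON file written by /save is exactly the kind of stored dialogue that acts as an agent’s “memory” below.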
Those recorded dialogues become an agent’s “memory.” Just as we can load them back into the proxy for further discussion, we can also integrate them into scripts built around the agents themselves. These scripts orchestrate agentic collaboration, conducting actions and facilitating responses as you interact with the different agents.
It’s akin to teaching an AI through examples, building its knowledge base through a series of interactive conversations. This approach is not only intuitive but immensely powerful, allowing you to create highly customized agents that reflect the nuances of human expertise and decision-making.
Few-shot Learning and Cost Efficiency
The concept aligns with few-shot learning, which is similar to teaching someone a new skill by showing them a few examples and letting them learn from those demonstrations. In the dialogue, you collaboratively work with the AI to develop the right set of instructions for the task at hand. Together you generate output examples (shots). In a way, your dialogue occasionally takes a meta approach, allowing the model to reflect on its own output and the structure of the dialogue.
For large, complex tasks that you want an agent to handle in the future, the underlying dialogue can become quite extensive. Incorporating such a lengthy dialogue into automated scripts could become costly. The solution is to feed the entire stored dialogue to an LLM with a large context window, like Gemini 1.5 Pro, and use a tailored prompt to reduce noise from the initial conversation. This new prompt, ideally with one or more examples, captures the core structure and intent from the user in a well-crafted form.
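As a rough sketch of that distillation step – assuming the google-generativeai Python client and the model name shown, and with prompt wording that is purely illustrative – it could look like this:

```python
import json
import google.generativeai as genai

# Assumes the google-generativeai package is installed and an API key is set.
genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-pro")

# Load the full stored dialogue that defines the agent.
with open("workflow_explains_complex_topic.json") as f:
    dialogue = json.load(f)

distill_prompt = (
    "Below is a long teaching conversation. Rewrite it as a single, "
    "well-structured prompt that preserves the instructions, tone, and "
    "quality bar, and include one or two representative output examples "
    "(shots). Remove small talk and dead ends.\n\n"
    + json.dumps(dialogue, indent=2)
)

# One call over the whole dialogue; the result is a compact prompt that is
# far cheaper to embed in automated scripts than the full history.
compact_prompt = model.generate_content(distill_prompt).text
print(compact_prompt)
```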
Why it Works
The way we actually interact with LLMs via chat is counterintuitive. Each time we ask a new question, the entire history, including the new query, is sent to the model again. The LLM has no inherent memory of the conversation – it only understands the context by reading through it again and again, like a sort of conversational Groundhog Day.
By storing the entire conversation locally and having the ability to load it back in, we’re essentially replicating a normal chat session with an LLM. However, storing it gives us the option to go into the history, remove parts, or manually add elements when needed. This isn’t a typical part of the process but can be a useful capability.
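Reusing the `load_history` and `call_llm` helpers from the proxy sketch above, editing the stored history before replaying it might look like this (the file name and the specific edits are hypothetical):

```python
# Because the model is stateless, every request replays the stored history,
# which also means we are free to edit that history before sending it.
history = load_history("journalist_duo_agent.json")

# Remove an unhelpful detour (here, hypothetically, the third exchange) ...
del history[4:6]

# ... or manually inject an instruction the agent should "remember".
history.append({"role": "user", "content": "From now on, keep answers under 200 words."})
history.append({"role": "assistant", "content": "Understood."})

# The next question simply rides on top of the edited history.
history.append({"role": "user", "content": "Draft a punchy opening paragraph."})
print(call_llm(history))
```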
Anthropomorphising is Encouraged
I’ve seen good results when prompting the AI to take on specific personas or roles. For example, I created an agent consisting of a journalistic duo, Sarah and David, who excel at writing punchy dialogue. According to my still-unnamed assessment agent, they perform on par with professionals at The New York Times.
This assessment agent, let’s call him Michael B. for now, interacts quite naturally with Sarah and David. I’ve automated parts of their exchanges via a DialogManager script, and they collaborate through two to three iterations to produce dialogue that may even surpass The New York Times’ quality – though validating that is beyond the scope of this post.
For me, it’s much easier to approach this when I view the agents in their primordial stage as distinct roles, and then treat them as named teammates when using them in scripts or in normal dialogues for specific tasks.
Practical Steps in Developing Your First Agent
First step:
To get started with creating your own AI agent, first identify the output you want the agent to produce. For example, you might want an agent that can write creative fiction or analyse financial data.
Once you have a clear idea of the desired output, you can begin a conversation with a language model like ChatGPT, providing examples and guidance to shape the agent’s capabilities. During this interactive dialogue, focus on demonstrating the kind of output you expect, whether it’s well-crafted stories or insightful analyses.
Through your conversation and examples, the AI agent will gradually learn and develop the ability to produce the desired output on its own. Once you’re satisfied with the agent’s performance during the conversation, you can save that dialogue history, which becomes the agent’s ‘memory’ or knowledge base.
Going forward, you can load that saved dialogue into a script or application, allowing you to interact with your customised AI agent and receive the type of output you trained it to produce, whether it’s creative writing, data analysis, troubleshooting, or any other skill you’ve imparted.
Second step:
Keep your agent highly cohesive: let it produce a focused output, not too big and not too small. Where that line falls depends on how well the output meets the quality standard you need. Iterate to find the right scope for the current capabilities of large language models.
For complex tasks, it can be helpful to break down the process into smaller steps. You can create additional AI agents that evaluate the output at each step, based on the criteria you’ve set. This way, you’ll have a team of specialised agents working together, building upon each other’s work to gradually achieve the desired final output.
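As an illustration, a step-wise setup could pair a writer agent with a checker agent that evaluates each draft against your criteria. The agent objects, their `respond` method, and the APPROVED convention below are assumptions made for the sake of the sketch:

```python
MAX_ROUNDS = 3

def refine_with_checker(writer, checker, task: str) -> str:
    """Hypothetical loop: the writer drafts, the checker critiques, repeat."""
    draft = writer.respond(task)
    for _ in range(MAX_ROUNDS):
        verdict = checker.respond(f"Evaluate this draft against the agreed criteria:\n{draft}")
        if "APPROVED" in verdict:  # a convention established in the checker's own dialogue
            return draft
        draft = writer.respond(f"Revise the draft using this feedback:\n{verdict}")
    return draft
```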
Practical Implementation: Code You Can Build Around the Concept
Getting started is relatively straightforward:
- Create a Python script with two modes: an interactive mode (for chatting with your chosen LLM API) and a command mode (for saving and loading dialogues, among other utilities).
- Build a DialogManager script that leverages the above, allowing you to load agent histories, send messages directly, and facilitate exchanges between agents.
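To give an idea of the second script, here is a bare-bones sketch of what a DialogManager could look like; the Agent class, the `model.chat` placeholder, and the method names are my assumptions, not the author’s implementation:

```python
class Agent:
    """Minimal stand-in for a loaded agent: an id plus its stored dialogue."""
    def __init__(self, agent_id: str, history: list[dict]):
        self.id = agent_id
        self.history = history


class DialogManager:
    """Bare-bones sketch: holds agents and routes messages between them."""

    def __init__(self, model):
        self.model = model      # the LLM client shared by all agents
        self.agents = {}        # agent id -> Agent

    def add_dialog(self, agent: Agent) -> None:
        """Register a loaded agent so it can take part in exchanges."""
        self.agents[agent.id] = agent

    def send(self, agent_id: str, text: str) -> str:
        """Append a user message to one agent's history and return its reply."""
        agent = self.agents[agent_id]
        agent.history.append({"role": "user", "content": text})
        reply = self.model.chat(agent.history)  # placeholder for the real API call
        agent.history.append({"role": "assistant", "content": reply})
        return reply

    def exchange(self, from_agent: str, to_agent: str, text: str) -> str:
        """Facilitate one hand-off: pass one agent's output to another agent."""
        return self.send(to_agent, self.send(from_agent, text))
```

A real implementation would also handle multi-turn exchanges between agents, as in the Sarah-and-David example above.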
You can opt for a local version integrated with a note-taking app like Obsidian, use a cloud setup such as my Google Cloud Colab Enterprise notebooks for collaboration, or build a Streamlit GUI to integrate it more deeply with your own research tooling.
Bringing agents and code together
Below is a sample code snippet that collects a very detailed summary from an agent, often spanning more than one model response from Gemini 1.5 Pro:
```python
message_file_path = "/Users/Yours/Inbox/2024-06-13/Asana’s Head of AI AI Literacy a Priority/transcript/Asana’s Head of AI_ AI Literacy a Priority.txt"

model = initialize_model()
dialog_manager = DialogManager(model)

# Load agents:
topic_agent = load_dialog("workflow_explains_complex_topic.json")
dialog_manager.add_dialog(topic_agent)

checker_agent = load_dialog("response_completion_checker_agent.json")
dialog_manager.add_dialog(checker_agent)

# Collecting summary notes
summary_notes = aggregate_responses(
    topic_agent.id,
    checker_agent.id,
    f"file {message_file_path}",
)

print(summary_notes)
```
Code snippet explained:
The two agents involved in this simple scenario are each specialized in their job, helping to produce an output in the form of a note – a note that might contain several responses from the underlying LLM.
Even with this small sample setup of agents working together, you can imagine abstracting the coordination effort into a new function or class for a very concrete implementation in your own use case.
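For instance, the snippet above could be wrapped into a single helper; the function name and signature here are purely illustrative:

```python
def summarize_transcript(transcript_path: str) -> str:
    """Hypothetical wrapper around the snippet above: set up the agents
    once and return the aggregated summary notes for a transcript."""
    model = initialize_model()
    dialog_manager = DialogManager(model)

    topic_agent = load_dialog("workflow_explains_complex_topic.json")
    dialog_manager.add_dialog(topic_agent)

    checker_agent = load_dialog("response_completion_checker_agent.json")
    dialog_manager.add_dialog(checker_agent)

    return aggregate_responses(
        topic_agent.id,
        checker_agent.id,
        f"file {transcript_path}",
    )
```

With that in place, `summary_notes = summarize_transcript(message_file_path)` is the only line your calling code needs.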
Let me know if you need any other details!
Cheers, author.
You can contact the author at:
Rene Luijk
LinkedIn: https://www.linkedin.com/in/reneluijk/
