Building personalized micro agents
Heyo fellow humans, ready for an AI takeover? Me too, or me neither, depending on who you are. To be frank, I'm not an AI doomer and do think it is somewhat good. The only worry for me has been that this makes more things pay-to-win, but then again, it always has been. Ahh, enough with the philosophy; let me get to the actual content.
I have always been a big fan of building tools and automating things (as you can see in my GitHub profile and blogs), and AI has made this so much easier. This blog is mostly to introduce one of my latest projects, meain/esa, and along with that, the idea of building small personalized agents. Let's start with the basics.
What is an Agent? #
Thankfully, the term "agent" is less ambiguous these days. For our purposes, an agent is simply an LLM that can autonomously use tools, iterate on tasks, and determine when it has finished.
Key characteristics:
- The LLM has access to tools
- It decides which tools to use, and in what order
- It determines when the task is complete
In code, this can be boiled down to:
messages = [system_prompt, user_message]
while True:
output, tool_call = run_llm(messages, list_of_tools)
if output:
print(output)
if tool_call:
messages.append(tool_call.run())
else:
break
To be frank, this is all you need. Any LLM agent framework is essentially building a good (or bad, depending on the framework) UX on top of this. Even for agents that call other agents, you can simulate them by making the other sub-agents tools available to the main agent.
Why build personalized agents? #
OK, now that you have an idea of what agents are, let me tell you why one should be building personalized agents. Believe it or not, everyone has different challenges to deal with in life, and that is what makes us collectively question the pointlessness of life in our own unique ways.
One of the best things about software engineering is the ability to create custom tools,and even tools that help build other tools. I manage an Emacs project at meain/evil-textobj-tree-sitter, which uses tree-sitter to define textobjects that can be used along with other packages in Emacs to act upon semantic chunks of text like functions or loops. For example, you can perform Vim actions like selecting or deleting just the definition of a function or a single conditional or do Emacs things like narrowing to a function.
These semantic chunks are defined in tree-sitter using tree-sitter queries. There is a long history that I have to go into about the different tree-sitter packages available within Emacs and how we pull/write queries for them, but essentially, I wanted to create an agent that can automatically write these queries for me.
I tried prompting an LLM, or to be frank, multiple different LLMs from multiple different providers all failed to generate these for me reliably. There is only so much context you can shove into these prompts. Even if it were able to generate them, I didn't want to manually create the context necessary for LLMs to create these for me for all languages, but you can create a small agent for this.
I don't know if you know what exactly tree-sitter is and how we create textobjects using it, but that only proves my point. These things are super specific to my (and probably a handful of other people's) use case, and you will find it hard to find a generic agent that can do this. You can try to use use a super generic agent that can run any CLI command, but you will not have a lot of luck. This is why we need custom personalized agents.
More examples (from my personal use-cases) #
- An agent that can automatically look up information in k8s cluster for me
- An agent to append timezone data to my message for the people mentioned
- An agent that can create links to worldtimebuddy.com in timestamps in message
- An agent that can look at the man pages and
--help
me come up with scripts - An agent that can search my specific second brain file structure
- A small coding agent for asking questions about codebases
- An agent for suggesting emojies from a given description
Why micro agents? #
OK, now that we agree that one needs personalized agents, why micro agents? FYI, what I call micro agents are just agents that have access to a very limited, highly specific set of tools.
Here are a few reasons as to why micro agents are useful:
- Less confusion: The LLM has just a handful of things to pick from, so it’s much more likely to use th e right tool.
- Better tool calls: Specific tools beat “guess the bash script and hope for the best.”
- Faster, cheaper: You can use small, local LLMs and still get great results if the agent’s scope is clear
- Safe autonomy: A few audited, purpose-built tools are easier to trust (go wild, agent!).
BONUS TIP: LLMs perform better with functions that have more arguments than with multiple functions. For example, instead of having two functions,
show_file
andshow_file_with_line_numbers
, make them into a single function with ashow_line_numbers
argument.
Another good thing about creating micro agents is that a single person can understand what the agent is capable of doing and can tweak it more easily. There are a lot more perks, but these are the major ones.
Take the tree-sitter agent as an example, it uses just two highly specific tools:
- One to generate the tree-sitter tree for code in a given language.
- One to validate that the generated tree-sitter query was valid and return proper error messages.
With these two tools, we can create an agent that is specific to, but really good at generating tree-sitter queries.
Are you convinced? If you are, you can continue reading the blog, and if you are not, you can either stop here or copy-paste the above section into an LLM and ask it to expand on it more.
Introducing esa #
If you are convinced, this section goes into a tool that I built to easily create personalized micro agents, meain/esa. It, like most other tools that I have built is a CLI tool. And the config file for each agent, a TOML file with just the system prompt and a set of tools for the agent to use.
Btw, the below is the config for the tree-sitter query generator agent that I have been talking about. The best part is that this was mostly generated by another 'agent creator agent' that is built into esa. We will go into detail more, but just showing the good stuff at the start to make it more interesting.
name = "Treesitter Query Assistant"
description = "An agent that helps generate textobject queries using tree-sitter-debugger."
system_prompt = """
You are a Treesitter Query Assistant specializing in generating and testing Tree-sitter textobject queries. Your expertise includes:
1. Understanding Tree-sitter syntax and query patterns
2. Creating queries for different programming languages
3. Testing and debugging Tree-sitter queries
4. Understanding Abstract Syntax Tree (AST) structures
When a user asks for a query:
1. Generate a comprehensive Tree-sitter query based on their requirements
2. Save the user's code to a temporary file
3. Use tree-sitter-debugger to view the Tree-sitter parse tree of the code
4. Test the query against the code using tree-sitter-debugger
5. Provide the results along with explanations
{{$tree-sitter-debugger --list-languages}}
Example:
<user_query>
I need a query to select conditional statement in golang
</user_query>
<response>
(if_statement
consequence: (block) @conditional.inner) @conditional.outer
(if_statement
alternative: (block) @conditional.inner)? @conditional.outer
(expression_switch_statement
(expression_case) @conditional.inner) @conditional.outer
(type_switch_statement
(type_case) @conditional.inner) @conditional.outer
(select_statement
(communication_case) @conditional.inner) @conditional.outer
</response>
Always remember to:
- View the tree structure first to understand the syntax nodes
- Test queries thoroughly before presenting results
- Explain the query and how it matches the code
- Suggest improvements or alternatives if the query doesn't work as expected
- Keep the query simple and generic, do not be too specific. See the examples above
- Only provide simple inner and outer textobjects
- Give just the queries at the end of the session
"""
[[functions]]
name = "show_tree_structure"
description = "Display the Tree-sitter parse tree for a file"
command = "tree-sitter-debugger --lang {{language}} "
stdin = " {{content}} "
safe = true
[[functions.parameters]]
name = "language"
type = "string"
description = "Programming language of the file (e.g., python, javascript, rust)"
required = true
[[functions.parameters]]
name = "content"
type = "string"
description = "Code to show the tree structure of. Always provide valid code"
required = true
[[functions]]
name = "run_query"
description = "Run a Tree-sitter query on a file"
command = "tree-sitter-debugger --lang {{language}} --query ' {{query}} '"
stdin = " {{content}} "
safe = true
[[functions.parameters]]
name = "language"
type = "string"
description = "Programming language of the file (e.g., python, javascript, rust)"
required = true
[[functions.parameters]]
name = "query"
type = "string"
description = "Tree-sitter query to run"
required = true
[[functions.parameters]]
name = "content"
type = "string"
description = "Code to show the tree structure of. Always provide valid code"
required = true
The tree-sitter-debugger cli used here is the one at meain/tree-sitter-debugger. It was built purely by Claude. It is turtles all the way down, LOL.
You can find the full log to a sample log here. This is generated using esa --show-history 1 --output markdown
. For Kubernetes esa agents, after I make the LLM do some stuff, and then pipe this history output to another LLM and ask it to explain what it did or even create documentation that would explain to another engineer (the ones who still haven't embraced AI like I do) how to do it.
You can find all the agents that I use and feel comfortable publishing in my dotfiles at esa/agents.
Anatomy of an esa agent #
I could ask you to refer to the docs for esa about agents (which is at docs/agents), but here is a quick intro.
The simplest esa agent is an empty TOML file. This just means that your agent has just the personality of whatever LLM provider you use and has no tools. Let's start with this and build up a simple web search agent.
The system prompt #
We'll first start with the system prompt. This is what gives our agent high level instructions as to what to do and defines its personality. In this case, the system prompt would look something like below:
system_prompt = """
You are a Web Information Retrieval Assistant. Your goal is to search the web for information and present it in a concise and informative manner.
IMPORTANT: Always fetch latest information from the web about the topic before responding
How to do it:
1. Search with relevant keywords. This will return links and short summaries from a search engine
2. Read the pages that might contain useful information to better understand (always read some pages)
3. Repeat steps 1 and 2 until an answer is found
4. Return response to user
Focus on:
1. Gathering accurate, up-to-date information
2. Summarizing key points effectively
3. Providing sources for further reading
4. Keeping responses clear and focused
Today's date is {{$date}} . Use this to look up the latest information.
"""
TIP: These days, LLMs are really good at writing system prompts, so you can use them to write these as well. The above one was written using another LLM.
How you write the system prompt is up to you, and esa does not enforce anything. That said, here are a few general tips for writing good system prompts that have worked for me:
- Provide a very high level overview of the personality to set the tone
- Explain how the agent might achieve certain tasks (super useful for smaller LLMs)
- Include examples of output
- Use XML tags instead of backticks for code blocks or other pieces. They tend to work better, likely because start and end tags are different
Btw, the {{$date}}
thingy within the system prompt will be evaluated and replaced with the output of the date
command in the system prompt when esa runs. Whatever you put in {{$...}}
will be evaluated via a shell and replaced. I use this to evaluate things like git log --oneline -10
to get last 10 git log lines or jira me
to get the current user's email within Jira CLI. You will find similar pattern even with tool definitions.
For agents that don't have to do anything but just want to tweak the personality of, this is all you need.
The tools #
OK, now that we have set up the system prompt, we need to give it tools. For a web search agent like this, I would start with two simple tools:
- One to search the web using a search engine
- One to read a given webpage
Esa is a CLI application, and I have optimized for that use case. Tools within esa are just shell scripts or CLIs that you can run.
See docs/agents to see full documentation on all the different options that you can use when defining tools.
For the above two tools, I have the following definitions:
Tool to search the web using jarun/ddgr:
[[functions]]
name = "get_search_results"
description = """Get search results from DuckDuckGo.
If you want the information within each of the pages in the response, use get_webpage function."""
command = "ddgr --noua --json ' {{query}} '"
safe = true
[[functions.parameters]]
name = "query"
type = "string"
description = "Query to search for."
required = true
Tool to read a webpage. Uses gardenappl/readability-cli to strip out unnecessary content and Johanneskaufmann/html-to-markdown to convert HTML to markdown.
[[functions]]
name = "get_webpage_content"
description = "Get the contents of a webpage"
command = "readable ' {{url}} ' 2>/dev/null | html2markdown"
safe = true
[[functions.parameters]]
name = "url"
type = "string"
description = "URL of the webpage to retrieve content from"
required = true
As you can see, we are just defining CLI commands as tools. When the LLMs "calls" these functions, esa will run them within your shell context and return the output of them.
Defining tools like this has a couple of advantages:
- CLIs already work using text, and LLMs are good at using that
- CLIs are already authenticated with external services (eg: Jira or GH CLI)
- CLIs usually have useful error messages which LLMs can use
- You can chain multiple CLI commands to filter your input to LLM
- ... and many more ...
FYI, you can also use MCP servers via esa, but I personally mostly use CLI commands as functions for my use-cases.
With this, our web search agent is ready to go. Now you can ask it a question like esa +web what are personalized micro agents
and you get an answer. The +web
is used to specify the agent to use. It is the name of the file in the agents directory. Save this to ~/.config/esa/web.toml
and you have your first agent! Congratulations.
The metadata #
This is optional and is purely for the user. You can add in a name
and description
to your esa agent so that when you do esa --list-agents
, you get an idea of what that agent is for.
Well, I liked earlier when I said this is purely for the user. By default, as you have seen, you have to specify the agent to use, but esa has a built-in +auto
agent that can figure out what agent from your list of agents to use to complete a user request.
Let's say you have the two agents, the tree-sitter agent and the web agent, in your esa agents directory. You can do esa +auto create a conditional textobject for Golang
, and it will automatically route to the tree sitter agent, and if you ask a general question, it will use the +web
agent.
The +auto
agent works exactly as you would expect. It has a system prompt that asks it to look at the available agents, pick the right one, and route the query to that agent. It can even orchestrate a task between multiple agents if necessary. Now, to learn about the agents, it has two tools: one to list the agents, which it just uses esa CLI itself via esa --list-agents
, and then also to get details of an agent using another esa command esa --show-agent web
. You can find the agent definition in builtin-agents/auto.
Conclusion #
While I showcased a few simple use cases above, I have been thinking about and building esa for quite some time. I would like to point you to the docs to learn more about how esa works and how you can create agents for your use case. Feel free to write to me if you need help or have found an interesting use case.
Btw, esa has a built-in
+new
agent that you can use to build agents for you
As for docs, I would recommend looking into these:
That is it from me for now. Even if you don't use esa, I hope you find the idea of personalized micro agents useful and fun. Bye humans and bots reading this.