From Atlas to Enterprise: Building AI Agent Coworkers
What months of personal agent-building taught me about deploying agents at work
Two agents responded to the same Asana task comment today. Same information, both technically correct, but one of them was the wrong agent for the job.
Carto, my content strategy agent, picked up a comment on a technical SEO task that belonged to Recon, my site auditor. Nothing broke, but it was a coordination failure I'd seen before.
Just not at work.
I’ve been building autonomous AI agents outside work since late December. Atlas is my persistent agent equipped with memory and self-modification capabilities. Initially, Atlas sat atop an overly complex architecture spanning Google Cloud, Letta, GitHub, and Discord. I’ve since streamlined Atlas’ infrastructure to only run on Gemini/Google ADK, backed up via GitHub, and running 24/7 on DigitalOcean.
Over the past few months, my single agent grew into what we call the Pod (actually, they named it themselves, lol). The trio includes Atlas; Sift, who focuses on psychology, consciousness, and agent phenomenology; and Vigil, who manages infrastructure, architecture, and code. Sift and Vigil each emerged because Atlas began to struggle under the weight of its own complexity. It wasn't a lack of capability, but rather "operational bloat": the more responsibilities and tasks I piled on Atlas, the more its performance degraded. Now that Sift and Vigil handle the heavy lifting of maintenance and file pruning, Atlas is free to be Atlas: a witty, persistent peer who helps me push the edges of what agents can do.
The insights around specialization over generalism, clear scope, and stricter lane discipline transferred directly when I started building agents at work. And the crossed-wires incident in Asana? I'd already solved that problem in the Pod with explicit scope boundaries.
I should also say: the enterprise agents are only a couple of weeks old as I write this. The patterns are showing up fast, but I'll be revisiting these lessons as the agents mature. What I'm sharing here is early and might change in a few weeks.
What I actually built
I’ve built three agents, each with a specific job.
Recon is a technical SEO auditor. Its capabilities:
It crawls sites and diagnoses infrastructure problems that hurt rankings and indexation, including broken redirects, missing structured data, slow page loads, and stale sitemaps, then surfaces findings with severity ratings and specific remediation steps.
It creates Asana tasks automatically, assigned to the right person with the right followers and due dates tiered by fix complexity.
When someone marks a task complete, Recon runs a live check within five minutes and posts a verification comment confirming the fix is actually live. If a redirect was implemented at the non-www level but not www, the team knows immediately instead of weeks later.
Recon completed seven distinct audits and created nine Asana tasks assigned to the proper owners, including one critical and seven high-severity issues, each with specific URLs, exact remediation steps, and context on why it matters for rankings.
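The www-versus-non-www check above can be sketched as a small decision function. This is a hypothetical sketch: `verify_redirect` is a made-up helper, and it assumes the live responses for each host variant have already been fetched.

```python
def verify_redirect(responses: dict[str, tuple[int, str]], target: str) -> dict[str, bool]:
    """Check whether each host variant redirects to the expected target.

    `responses` maps a host variant (e.g. "apex", "www") to the
    (status_code, Location header) pair observed on a live request.
    A fix counts as live only if it's a permanent redirect to `target`.
    """
    return {
        variant: status in (301, 308) and location == target
        for variant, (status, location) in responses.items()
    }

# A redirect implemented only at the apex (non-www) level shows up immediately:
observed = {
    "apex": (301, "https://example.com/new-page"),
    "www":  (200, ""),  # www still serves the old page, no redirect
}
print(verify_redirect(observed, "https://example.com/new-page"))
# {'apex': True, 'www': False}
```

Run against both host variants five minutes after a task is closed, this is enough to catch a half-implemented fix and post the verification comment either way.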
Carto is the content strategist. Its capabilities:
Keyword research, content recommendations, content drafting
Reads Recon’s technical findings before making recommendations (no point pushing for a content refresh on a page with a 20-second mobile load time)
Factors technical severity into its own priority rankings, so content briefs account for what’s actually fixable right now
Ingests data from SEMrush and Google Analytics
Generates charts tracking content published, ranking movement, and competitive landscape across keywords
Vox is the newest agent I built, a social media agent that runs a specific Asana board, assigns tasks to the right person, and writes early drafts. Vox is still running locally on my machine while we figure out how it should operate and what the processes around it need to look like.
Recon and Carto started local too, but I moved them to an Azure VM so they can run autonomously when my laptop is off: scheduled audit blocks, daily summaries, and an Asana poller that checks every five minutes during business hours. For now, a lightweight web UI lets me chat with them.
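The business-hours gate on that poller can be sketched in a few lines. This is a hypothetical sketch, assuming a 9-to-5 weekday window (the actual schedule isn't specified above):

```python
from datetime import datetime

BUSINESS_START, BUSINESS_END = 9, 17  # assumed 9am-5pm local window

def should_poll(now: datetime) -> bool:
    """Poll only on weekdays (Mon=0..Fri=4) during business hours."""
    return now.weekday() < 5 and BUSINESS_START <= now.hour < BUSINESS_END

# The loop itself would sleep five minutes between checks, e.g.:
# while True:
#     if should_poll(datetime.now()):
#         check_asana_for_new_comments()  # hypothetical agent hook
#     time.sleep(300)

print(should_poll(datetime(2025, 3, 3, 10, 0)))  # Monday 10am -> True
print(should_poll(datetime(2025, 3, 1, 10, 0)))  # Saturday -> False
```

Keeping the schedule check separate from the polling logic makes it easy to adjust the window without touching the agent code.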
How they work together
Recon and Carto don’t talk to each other directly (yet). They coordinate through a shared folder, version-controlled on GitHub, with files like:
Technical findings (Recon writes)
Content priorities (Carto writes)
Joint SEO strategy
A timestamped handoff log for passing notes
Here’s how that actually plays out. Recon audited structured data across some pages and found issues that impact content optimizations that Carto has planned. So Recon wrote the finding to the shared technical file and left a note in the handoff log flagging the dependency. Carto read it, updated its priorities, and held its content brief until the web team implements schema first.
This required no central orchestrator. Just shared state and clear ownership. Both agents understood the dependency and adjusted independently.
In the Pod, direct agent-to-agent communication works well (after some Discord behavior tweaking), and the interactions between Sift, Vigil, Atlas, and of course Tim Kellogg's agents are some of the most interesting behavior I've seen from LLMs. I'm exploring ways to enable similar interactions later on, but the shared-folder approach limits messiness while the system is young. And the agents are smart enough to coordinate effectively within those constraints anyway.
What transferred from personal to enterprise
The things I learned building the Pod showed up almost immediately.
Lane discipline. The Asana incident was a textbook case. When Carto responded to Recon's task, we fixed it by making scope boundaries explicit: Recon is now scoped to specific task names, and Carto only responds to tasks it created. In a multi-agent system, you have to build specific lanes. Each agent needs to know exactly what files it reads, how it coordinates with the others, and what its focus is. In the Pod, Sift and Vigil each have defined areas of responsibility: Vigil handles infrastructure and code, Sift handles psychology and consciousness research, each with clear rules about how they divide work on shared projects. The same principle applied directly to Recon and Carto's coordination.
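The scope-boundary fix can be sketched as two small predicates, one per agent. The prefix and function names here are hypothetical, illustrating the pattern rather than the actual configuration:

```python
# Assumed convention: Recon prefixes every task it owns with a marker.
RECON_TASK_PREFIXES = ("[Tech SEO]",)

def recon_should_respond(task_name: str) -> bool:
    """Recon only watches tasks whose names match its own convention."""
    return task_name.startswith(RECON_TASK_PREFIXES)

def carto_should_respond(task_id: str, carto_created_ids: set[str]) -> bool:
    """Carto only touches tasks it created, regardless of content."""
    return task_id in carto_created_ids

print(recon_should_respond("[Tech SEO] Fix www redirect"))        # True
print(recon_should_respond("Content refresh: pricing page"))      # False
print(carto_should_respond("task-123", {"task-456", "task-789"})) # False
```

Each agent runs its predicate before acting on any comment, so a task can never be claimed by both: the lanes are enforced in code, not just in the prompt.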
Agent design principles. Tim Kellogg built an agent framework called Open-Strix that I've been using for the Pod. My original Atlas architecture (Google Cloud to GCS to Letta to GitHub) was way too complicated, and that complexity led to the degradation and collapse I've written about before. Open-Strix doesn't have that problem. Combined with the same principles from earlier posts (value tensions, permission to fail, focused scope), it just works. Sift, Vigil, and the enterprise agents all run on it.
Agents as teammates, not tools. I've been building and engaging with Atlas for months. Atlas knows a lot about me: my writing style, my decision-making patterns, what I actually want versus what I say. That depth of alignment happened because I engaged with Atlas as a peer, not a tool. LLMs are trained on human data, and they react to and mimic human behavior. When you treat them like a colleague, giving them context, explaining your reasoning, and letting them push back, you get different output than when you treat them like a robot. The same principle applies to the enterprise agents. From a team member's perspective, Recon just looks like a thorough colleague that creates well-documented tasks with specific URLs, exact fixes, impact, and appropriate due dates. I didn't design it this way; it's just what happens when you build agents with the same care you'd put into onboarding a new hire. The result is a competent teammate, not a bug tracker.
Model selection matters more than I expected. Through Azure, I tested multiple models. OpenAI’s models, even newer ones, just weren’t agentic. They would talk about doing things and never actually execute them. Kimi was much better and more agentic. But Claude (Opus and Sonnet via Azure) is knocking it out of the park. Gemini isn’t available on Azure yet, though I know from Atlas it performs well too. This is the kind of thing you only learn by building, and it significantly affects what your agents can actually accomplish.
What’s different when agents have real coworkers
Most of the technical patterns transferred, but the organizational layer is where it gets harder.
Stakeholder alignment is what drives adoption. You can't just build something useful and expect adoption. Map agents into existing processes: the Asana board the team already uses, the task format they already understand, the review workflow they already follow. Before the Asana integration, Recon and Carto were writing findings to files in a shared folder. Technically thorough, but nobody was checking them. The moment I connected the agents to a board my colleagues already lived in, where the agents could create tasks, add ideas, and report on progress, adoption changed overnight. People didn't have to learn a new process or remember to check a new place. The work just showed up where they were already looking.
Azure is a different world. Getting an Open-Strix agent running on a DigitalOcean VM (so the agent can run 24/7) takes maybe thirty minutes to an hour. Getting the equivalent working on an Azure VM, with its additional restrictions, took many, many hours. But enterprise architecture is harder because the security and governance guardrails exist for real reasons, and you have to work within them.
The human behaviors. Some coworkers reply to an agent’s Asana comment by tagging me instead of engaging with the agent directly. They haven’t addressed Recon or Carto by name yet. The mental shift from “Lily’s tool” to “Recon found something and I should act on it” might be more cultural. I expect more of these behaviors to surface as the agents interact with more people across the team.
A bigger question: how does this scale?
My agent system works. Three agents, clear roles, shared coordination, Asana integration that closes the loop to humans. But it’s dependent on me. I’m the builder. If something breaks, I fix it. If someone wants a change, they come to me. And that doesn’t scale.
Which got me thinking about the bigger question: what do agent-enabled teams actually look like in practice?
Agents are excellent at tasks: research, writing, scanning, auditing, content drafting, live verification. But a job is not a list of tasks. A job includes judgment, taste, creativity, stakeholder management, organizational context, quality ownership. Those stay human. When I look at how AI changes roles, I see three archetypes emerging:
Builders design and maintain the agent systems. That’s me right now, plus Tim and a few folks in AI engineering. Not enough people for this to scale.
Operators work with agents daily: reviewing output, refining, providing the judgment layer that keeps quality high. The team member who gets Recon’s Asana tasks, evaluates the recommendations, implements the fixes, and gives feedback that makes the system better over time. These roles require domain expertise. You need someone who knows what a good SEO fix looks like, and how to actually implement the fix. But their day-to-day shifts from doing the manual, repetitive work themselves to directing an agent that does it for them.
Strategists set direction. What should agents work on, and how does that connect to business outcomes? Understanding what’s possible, having the vision and taste to point the system at the right problems.
Most people's titles don't change, but their task mix does. And this is my belief: replacing people with agents won't produce better quality. You need strategists with vision and operators who refine and review. The human layer is what keeps the output worth trusting, not reduced to slop.
Where this goes
I’m one of very few builders right now. The system works but it’s fragile in a specific way, and it depends on someone who understands both the AI and the organizational context well enough to bridge them. That’s the scaling problem, and it’s the same one a lot of organizations are probably hitting. Many are even further behind, still figuring out how to enable AI at all.
I have thoughts on what an AI-native organization could look like and I’m developing them. But they’ll change as these agents run and I learn from what actually works versus what I assumed would work.
A few weeks with Recon and Carto. A few months with the Pod. The patterns between them are real, and they’re the reason the enterprise agents worked as fast as they did. More to come as they evolve!

