The Gap Between What AI Can Do and What Companies Can Do With AI
Why AI transformation starts at the wrong layer of the org, and the structural changes that enable it.
Aaron Sterling tagged me on Bluesky this week with a question I’ve been thinking a lot about lately: if AI doesn’t have clear ROI as a product, as many studies are showing, and isn’t measurably increasing employee productivity, why are companies still going all-in on it?
The framing assumes the technology isn’t delivering. From my experience building personal and work agents and watching them produce amazing outputs every day, AI is delivering. The gap is between what AI can produce and what companies do with it.
McKinsey’s recent piece on AI transformation argues that adoption fails because adjacent upstream and downstream processes are left unchanged. An AI solution might predict equipment failures days in advance, but if maintenance still follows the original calendar-based schedule, nothing gets fixed sooner.
Tim Kellogg has also made an excellent version of this argument from the engineering POV: the productivity is real, the scaling isn’t, and the missing piece is the organizational connective tissue that turns isolated AI gains into something that compounds. I know this argument isn’t new. MIT, BCG, and other experts have been making some version of it for a while now.
I want to add to this conversation from my marketing ops perspective: a framework I recently came across that makes the diagnosis easier to understand, and a set of observations that show where the gaps actually lie.
Two kinds of AI
Nathaniel Whittemore, host of The AI Daily Brief podcast, draws a useful distinction between efficiency AI and opportunity AI. Efficiency AI makes existing things faster: automating a process, summarizing a doc, drafting a first pass of an email. These add some value, but rarely the kind that moves the bottom line. Opportunity AI uses the technology to do things that weren’t possible before. Acquiring customers you couldn’t reach, entering new markets, running campaigns at a scale that wasn’t achievable. For me, opportunity AI has meant building tools that would have required an engineer.
McKinsey’s data tracks this. The companies showing meaningful EBITDA gains from AI (about 20 percent on average across the 20 firms they studied) aren’t winning on efficiency. They’re winning on opportunity. They concentrate their efforts on one to three business domains and reinvent them. Most companies are still early on this maturity curve, deploying Claude or Copilot across the org and building a foundation.
And to be fair, tools alone do produce real value, and I’ve seen this first-hand. Individual workers get faster, drafts come together quicker, and time gets saved daily. But those gains tend to stay trapped at the individual level, and they don’t compound into something that shows up on a P&L. Companies seeing the meaningful EBITDA gains are doing structural work on top of the tools.
This matters because speed has become a competitive differentiator. Disruptors are moving fast because they’ve gone past the foundation and started reimagining what’s possible, instead of just doing things more efficiently. To gain actual ROI from AI, companies have to figure out how to get from efficiency into opportunity.
But why aren’t companies even reaping the benefits of efficiency AI? I think the answer is structural, and there’s a model that explains it well.
The Waterline Model
Molly Graham wrote a piece in Lenny’s Newsletter on the Waterline Model — a framework she learned leading wilderness expeditions and now uses to diagnose why teams aren’t working. (The framework overlaps with cybernetics and the Viable System Model, which Tim Kellogg writes about often; I’m using the Waterline Model to frame this post.) The model puts four layers under any team or organization:
Layer 1: Structure: vision, goals, role clarity, org design
Layer 2: Dynamics: how decisions get made, how conflict gets resolved, how information flows day-to-day
Layer 3: Interpersonal: trust, friction, alignment between specific people
Layer 4: Individual: skills, stress, confidence, life circumstances
Her rule of thumb is “snorkel before you scuba.” Start at the top. Most team problems that look like individual underperformance trace back to structure or dynamics being broken, and you can’t fix that by replacing the person.
She built the model for team diagnosis, and I think it maps almost perfectly onto why enterprise AI transformation stalls.
Actual AI transformation requires change at every layer.
New goals and role definitions.
New decision-making norms and accountability.
New trust patterns between humans and agents.
New skills and adaptability at the individual level.
Pre-AI processes were built for a world where a lot of production was slow, costly, and approval-heavy. Now an agent can produce a draft in seconds, but the surrounding workflow still moves at the old pace. That mismatch is structural, not technical.
And the order here is important. A lot of enterprises are applying AI transformation in the reverse of the order Molly’s model prescribes: ChatGPT for everyone, prompt training, hackathons. That’s work at the individual layer, with the hope it propagates upward through the waterline. The model says to start with structure, and most deployments probably never get beyond the individual layer.
What this looks like in practice
I’ve built several agents into our marketing function. They audit, plan, draft, and help us execute work. They’re producing useful output every day. But they’re also exposing where the structural and dynamics layers of the org need to be reworked.
The agents are generating findings, drafts, and recommendations at a pace the surrounding workflows weren’t designed to handle. And what they’re producing isn’t wrong or low quality. The recommendations are quite good. The bottleneck is everywhere else: review cycles, publishing steps, stakeholder approvals, work happening in places the agents can’t see, and competing priorities for the people who’d do the actual implementation. The ratio of what the agents can generate to what we can actually execute is probably 10 to 1. The structural and dynamics layers weren’t designed for that pace.
The agents are surfacing where the redesign needs to happen. What I’ve come to believe is that an AI-enabled function needs three modes of work.
Building: Designing and maintaining the agents and the infrastructure they run on. This could be custom agents, turnkey agents, or Copilot, depending on the use case.
Operating: Directing agents day-to-day, like submitting briefs, reviewing output, providing the feedback that makes the agents better. It’s domain expertise applied to ensuring quality and relevance.
Strategizing: Setting direction by deciding what the agents should be working on at all, what success looks like, and what to prioritize.
This is also a shift in the dynamics layer, in how existing people work. A campaign manager shifts from writing a v1 draft to operating an agent that drafts content for them to review. A strategist shifts from setting strategy for human work to setting strategy for what’s possible when humans and agents work together. One person can hold all three modes (I do, for now, for several agents I’ve built). But to scale, all three modes have to exist and connect across more than one or a few people.
Many organizations are investing in building, but the structural change is making space for all three modes, and the dynamics shift is getting the loop between them running at the pace the agents are setting. The agents themselves aren’t the problem: they’ve made the gap visible, and building more of them won’t close it. Designing the structure and dynamics around what they can already do will.
So why are companies going all-in?
Companies are going all-in on AI because the capability is visible, the demos are convincing, and the cost of being late looks higher than the cost of being wrong. I don’t think that bet is crazy. What gets missed is that going all-in on tools and capability is a different kind of bet than going all-in on the difficult work that determines whether the tools deliver.
And whether AI actually delivers depends on the structural redesign behind it. That work is less visible than building agents. It’s process redesign, role redefinition, and getting people to change how they work day-to-day.
This type of change management was already the hardest discipline in enterprise transformation pre-AI. AI makes it harder by adding new capabilities every week, fears about job displacement, and a learning curve for leaders being asked to make decisions about a technology that’s still defining itself. The companies that will see real progress are the ones who recognize this is a structural and dynamics problem, not just a tooling decision, and work the organizational layers accordingly.
Ramp through a Waterline lens
Benjamin Levick at Ramp published a piece last month with incredible numbers and outcomes. 99.5% of the team active on AI tools. 1,500 apps implemented on their internal platform in six weeks. Non-engineers accounting for 12% of all human-initiated code in their production codebase.
Benjamin doesn’t use the Waterline framework in his piece, but read through that lens, the lessons are hard to miss. The four layers of the organization look to have been aligned before the AI rollout, and the top layers carried the rest.
Structure. Ramp’s CEO got on stage and made becoming the most productive company in the world a stated company priority. AI proficiency moved into hiring screens, onboarding, and performance expectations. A small central team owned the platforms; functional teams owned the spokes. The org wasn’t reorganized around AI; the existing structure was already pointed in a direction where AI could diffuse.
Dynamics. Ramp describes their culture as impatient, allergic to inefficiency, and curious about new tools. People try things without asking permission. That’s a dynamics layer that was forward-leaning before AI was the conversation, which meant the cultural cost of trying AI was low. They built on top of it: a Slack channel with over 1,000 people, weekly office hours, public building, demos at all-hands, competitive contagion across teams. The dynamics didn’t have to be invented for AI; they had to be pointed at it.
Interpersonal and individual. These layers followed almost on their own. When the structure says “this is a priority” and the dynamics say “trying things is rewarded,” individuals don’t have to fight the system to learn. The 99.5% active usage and the non-engineers creating production code aren’t the cause of Ramp’s transformation; they’re the visible result of an organization whose layers were already aligned.
Enterprises starting at the bottom layer are unlikely to replicate this. Starting from individual contributors and hoping it propagates upward won’t lead to transformational success. Without alignment at the top, the bottom layers have to push uphill, and most of the energy gets spent on resistance rather than results.
Where this leads
The next phase to drive enterprise AI transformation is redesigning the structure and dynamics layers around the capabilities that already exist. Focus the agents on problems that will make a large impact. Make space for the building, operating, and strategizing modes to coexist. Solve the last-mile gaps where only humans can act. Accept that pace will be governed by the slowest layer of the organization.
What this changes for me is that I’m spending less time building new agents and more time on the processes and workflows around them:
What data the agents have access to and how it’s structured. Imperfect data is the starting point; perfect data doesn’t exist.
Who reviews their output and how. Where the review process can get faster.
How feedback gets back to the agents so each iteration is better than the last.
Where humans step in and where they step back. What work can shift and where humans add even more value.
The end-to-end workflow. Where humans are involved, where agents take over, what to prioritize, and what’s actually ready to deploy.
The temptation is to keep building because that’s what’s visible, but the harder work is everything that has to happen for the building to actually result in something.
Of course, none of this works without the right tech and infrastructure underneath. The agents need tools to run on, data to work with, secure access to models, and connections into the systems they’re meant to help. None of these pieces are easy, but they’re mostly the foundation, not the answer to the challenges of AI transformation. Even the best setup won’t tell you what to point the agents at, who’s going to direct them, or how to get the team working with them well.
The question Aaron asked won’t be answered by the labs. It’ll be answered by those of us figuring out how to do AI transformation from the top down.

