From Artifact to Production: 3 Ways to Productionize AI at Work
What it actually takes to move AI out of the chat and onto your team, and where it goes from there.
A couple of days ago, I had Claude help me analyze my credit card statements with the live artifacts feature. I wanted to play around with the capability and use it to see where I was paying the most, what subscriptions I could cut, and what my spending looked like over the last few months. About 15 minutes after I started, I had a working dashboard with categories, totals, recurring charges, and a couple of useful charts. It was genuinely useful analysis, and I wanted to show it to my husband.
But I couldn’t send it to him directly. I could screenshot it, copy out the summary, or describe one of the charts in a text. But the dashboard, the interactivity, the part that made it useful, was an artifact in a chat open on my laptop. It wasn’t portable.
That’s pretty much the same problem a lot of us are trying to solve at work right now, scaled up.
A lot of people are using Claude, Gemini, and ChatGPT as thought partners, or Copilot inside the office suite. I do this every day. I lean on Claude to stress-test strategies, build proofs of concept, draft frameworks. But almost none of it is shareable in the form it’s produced, because it lives only on my screen.
This post is about what I’ve been doing to bridge that gap on my team and how to turn one-person artifacts into things the team can actually use. And then a thought on where this goes when you scale it past one team to the whole org, which gets into really exciting territory.
The bridge problem
To get past individual AI artifacts and chats, the AI’s output has to be externalized into a workflow, a tool, a system someone else can use without sitting next to you.
The easiest way to do that is to embed it where the team already works: a button in the CRM, a short form, or a folder in SharePoint that updates without anyone doing it manually. The leverage is in pulling the AI out of the chat and into the actual workflow.
That sounds simple, but it isn’t, because what many people try first is more AI: a bigger model, or something more autonomous, more agent-y. Most of the time, though, the answer isn’t more AI. It’s automating the workflow around the AI.
When something is worth productionizing
Not every Claude artifact should be a production tool. The credit card dashboard example is fine as a one-off. I don’t need a permanent credit card monitoring system and can use existing apps to solve ongoing monitoring. Even if I did, the act of thinking through it with Claude was most of the value. A lot of strategy and POC work is the same.
So the first question to ask isn’t how do I build this for the team. It’s should I?
Will the team need this same kind of work next month, with different inputs? Will multiple people want it? Is it expensive to redo from scratch every time someone asks? If the answer is no, then it doesn’t need to be scaled for the team. If the answer is yes, you’re looking at productionization (I think this is a word LOL), which means thinking about three things.
Data integration. What does this need on an ongoing basis, and where does that data live? If the inputs are pasteable and one-off, you don’t have a production tool, you have a prompt. If the inputs are recurring and pulled from authoritative sources (your CRM, your CMS, public APIs, your data warehouse), you need to wire those connections so the tool can pull what it needs. The data layer is the part of productionization that takes the most build time.
Output and audience. Where does the result go, and who is it for? An email? A Slack message? A SharePoint folder? An update to a CRM record? A dashboard? A great markdown file that no one opens is no better than a Claude artifact you couldn’t share.
Ownership. Someone has to build it, and someone has to maintain it. I know this because I get asked for updates on my tools regularly. A few of us on the team are managing this today. (I also talk about the scaling problem this creates in my last post.)
A small exercise to make this concrete: Write down the top three things your team manually pulls every week, and where each one lives. CRM exports, dashboards, third-party platforms, a Google Sheet someone updates by hand on Friday, whatever it is. That list contains your data layer.
Then for each one, ask the second question: where would the result need to land for the team to actually use it? An email? A Slack message? Sometimes the best answer is the same place the manual work was already happening. If it’s a dashboard your team checks every Friday, automating the update inside that dashboard means there’s nothing new to open. This second question is usually the difference between something that gets adopted and something that doesn’t.
The three tiers
Once you’ve determined that something is worth productionizing, the next step is deciding which tier of workflow it belongs to before you start building.
Tier 1: Automated workflow. Deterministic inputs, templated outputs. AI is optional or completely absent. The value is in the automation and standardization of the process.
Tier 2: Workflow with AI. A structured pipeline where AI handles a specific step nothing else can. Triggers, data fetching, formatting, delivery, etc. are all deterministic. The AI steps do things that actually need intelligence within this workflow.
Tier 3: Agent. Continuous worker with judgment. Runs whether you trigger it or not. Makes decisions, takes actions, handles ambiguity inside a defined scope.
If you choose the wrong type, you waste time and resources building something the team won’t adopt. So the right move is starting from the pain point, not deciding you need an agent before you understand what’s actually needed.
Here are three examples of what I built to make the tiering less abstract.
Tier 1: Banner Generator. Production tools don’t have to use AI.
I built a tool that produces brand-aligned display banners for the campaign team. The inputs are a form: a few context fields, the headline and subheadline copy, a CTA URL, the sizes you want, the persona, the industry. The outputs are a folder in SharePoint and an email with the banner files ready to drop into the ad platform.
The form has an AI mode that generates the copy via Azure OpenAI, and a free-form mode that lets the user type their own copy. Both produce the same set of properly sized, brand-aligned banners. Both deliver the same way.
Most of the team uses free-form.
That surprised me at first. I built the AI mode carefully and the copy it generates is good. But the team mostly skips it, because they already know what messages they want to write. They have a campaign concept and a positioning line. Running it through AI mode means an extra review cycle to make sure the AI’s interpretation matches what they actually wanted. Skipping AI mode means they get to the deliverable faster.
So the AI mode is optional, and that’s the point. You don’t have to add AI onto a workflow just because you can. Sometimes the production capability is the form, the brand-approved design, and the automated delivery.
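The two-mode flow can be sketched as a single branch. Everything here is illustrative, not the actual tool’s API: the field names are made up, and the `generate_copy` stub stands in for the Azure OpenAI call.

```python
from dataclasses import dataclass, field

@dataclass
class BannerRequest:
    """Hypothetical shape of one banner job coming off the form."""
    headline: str = ""
    subheadline: str = ""
    cta_url: str = ""
    sizes: list = field(default_factory=lambda: ["300x250", "728x90"])
    use_ai_copy: bool = False  # free-form mode is the default

def generate_copy(context: str) -> tuple[str, str]:
    # Stand-in for the Azure OpenAI call that drafts headline/subheadline.
    return (f"AI headline about {context}", f"AI subheadline about {context}")

def build_request(context: str, headline: str, subheadline: str,
                  use_ai_copy: bool) -> BannerRequest:
    # AI mode drafts the copy; free-form mode passes the user's text through
    # untouched. Both paths feed the same rendering and delivery steps.
    if use_ai_copy:
        headline, subheadline = generate_copy(context)
    return BannerRequest(headline=headline, subheadline=subheadline,
                         use_ai_copy=use_ai_copy)

req = build_request("cloud security", "Secure your stack", "Start today",
                    use_ai_copy=False)
print(req.headline)  # free-form copy arrives exactly as typed
```

The design point is that the branch is tiny and everything downstream is shared, which is why making AI optional costs almost nothing.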
Once I had the generator running, I ran into a different kind of problem. With only character limits on the headline and subheadline, the auto-wrap was breaking text in awkward places: words splitting across lines, weird spacing on the smaller sizes. But because I’d built it myself, I could fix it in a few hours. I added per-line input fields mapped to how the headline and subheadline actually display, plus a live preview page so users could see exactly how each size would render before they hit submit.
That’s the part of building custom that’s underrated. An off-the-shelf tool is mostly one-size-fits-all, and a lot of customizations depend on the vendor’s roadmap. When you’ve built the tool yourself, you can implement the fix the same afternoon. That iteration speed is real production value.
Tier 2: The Analysis Dossier. When the workflow is structured but the process needs intelligence.
I’ve written about this one a few times already. The short version: sellers used to spend hours doing account research, pulling a 10-K, scanning the earnings call, finding the org chart, cross-referencing the engagement history, building a quick deck.
The Analysis Dossier is the workflow I built to automate all of that. The seller hits a button in their CRM, or fills out a three-field form. They get back a 10-section dossier with a company overview, org chart, strategic priorities, earnings call analysis, technographic profile, relevant insights, engagement history, discovery questions, value props, and an auto-generated PowerPoint slide. Right to their inbox.
Adoption is sitting around 80% of the team. It’s that high not just because of the quality of the output, but because I put it where they already work. They don’t open a new tool, log into a chat window, paste data, prompt the AI, copy the output. They click one button or fill three fields. The friction of using it is lower than the friction of not using it.
When the user hits the button or submits the form, the workflow runs six steps:
The trigger fires a webhook.
Zapier orchestrates the rest.
Services pull the inputs, including SEC filings, an earnings API, an org chart provider, engagement history from the CRM.
Azure OpenAI steps synthesize that multi-source input into the structured 10-section format.
A markdown-to-HTML formatter turns the synthesis into an email report.
The email lands in the requestor’s inbox, and SharePoint and the CRM get auto-updated with the artifact for archival (which also feeds other workflows and reports).
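The six steps above can be sketched as a small pipeline. Every function here is a stand-in: the real build runs on Zapier webhooks, SEC and earnings APIs, and Azure OpenAI, none of which are called in this sketch, but the shape, including graceful handling of a failed data source, is the same.

```python
def fetch_inputs(company: str) -> dict:
    # Step 3: pull from each source. A failed source degrades gracefully
    # instead of killing the run (the rate-limit/downtime handling).
    sources = {
        "sec_filings": lambda c: f"10-K summary for {c}",
        "earnings": lambda c: f"earnings highlights for {c}",
        "org_chart": lambda c: f"org chart for {c}",
        "crm_history": lambda c: f"engagement history for {c}",
    }
    inputs = {}
    for name, fetch in sources.items():
        try:
            inputs[name] = fetch(company)
        except Exception:
            inputs[name] = None  # noted in the report, not fatal
    return inputs

def synthesize(inputs: dict) -> str:
    # Step 4: the one step that genuinely needs an LLM (stubbed here).
    present = [k for k, v in inputs.items() if v]
    return f"10-section dossier built from: {', '.join(present)}"

def to_html(markdown: str) -> str:
    # Step 5: markdown-to-HTML formatting for the email body.
    return f"<html><body><p>{markdown}</p></body></html>"

def run_dossier(company: str) -> str:
    # Steps 1-2 (webhook + Zapier orchestration) collapse to a call here.
    report = synthesize(fetch_inputs(company))
    return to_html(report)  # Step 6 would email and archive this

print(run_dossier("Acme Corp"))
```

Notice how little of the code is the AI step: one function out of four, which is roughly the 20% claim in practice.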
The AI work was the most straightforward piece. The harder part was the data layer. That’s where most of the build time went. The SEC information, the earnings API, the org chart provider, the CRM setup, the email formatter code, the failure handling for when one of those data sources is rate-limited or down. AI is doing about 20% of the work and getting most of the credit.
The reason this is Tier 2 and not Tier 1 is that the synthesis step needs AI. You can’t template a strategic priorities summary across thousands of companies. You can’t deterministically map an earnings transcript to a company’s top challenges. That’s a job for an LLM. But everything around the model, like the triggers, the data fetching, the formatting, the delivery, the documenting, is a deterministic, reliable workflow.
Same principle as Tier 1: only put AI where it actually needs to be. The difference here is that the process at this scale genuinely needs it. Tier 2 isn’t AI with a wrapper. It’s a workflow that uses AI at exactly the steps that need an LLM.
Tier 3: The SEO agent. The harness is the lever, not the model.
Tier 3 is for work that needs continuous judgment, not just one-shot synthesis or regular outputs. SEO is a great example because SEO work is never complete. Search rankings shift constantly, new competitor pages appear, algorithm changes hit. A page that ranked third last quarter ranks ninth this month, and you don’t know why until you look. Running that work as a process you trigger weekly is technically possible, but the surface area is too broad for a single workflow and the cadence is too unpredictable for a button to trigger the work.
So, I built an SEO agent that produces:
Visibility reports
Keyword gap analysis
Page-level optimization briefs
Full blog drafts with SEO targeting
It’s integrated into the team’s workflow via Asana and picks up tickets, drafts, posts comments, hands work back. It runs where the team already works.
The SEO agent uses some really important agent principles I learned from building my personal agent, Atlas, which I’ve written about a lot. Experimenting with Atlas taught me that the model is about 20% of what makes an agent work, and the 80% is the conditions around it, which is basically what’s called the agent harness.
A harness is the structure around the AI model that gives it memory, scope, and feedback. The work agents run on open-strix, an open-source agent harness. There’s a lot in it, but the five pieces that matter most for productionizing AI are:
Memory and identity blocks. Structured files the agent reads from and writes to. They hold both who the agent is (identity, communication style, current focus) and where it is in its work (where it left off, what decisions were made, what state it’s in).
Data and context. The reference material the agent works with, such as content files, case studies, brand guidelines, directories it has access to, data a user gives it. This is what gives its output specific grounding instead of generic answers.
Skills and tools. Defined capabilities like running SEO analysis, drafting a brief, posting an Asana comment, generating a chart, doing a webpage teardown. The agent doesn’t reinvent or create the workflow each time. It picks the skill that fits.
Schedule. When the agent runs, on what cadence, with what triggers. The SEO agent has a morning kick-off block, regular work blocks that run every few hours, an end-of-day summary, a weekly visibility check, and an automated poller for Asana that pings the agent automatically when there’s a new comment to respond to.
Journal. A running record of what the agent did, what it decided, and where it got stuck. This is the part that lets the system learn, and lets me debug it without reading raw logs.
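The five pieces can be sketched as plain data plus a single work cycle. The file names, skill names, and skill-selection logic here are all hypothetical, and the memory and journal are in-process dicts where a real harness would use structured files the agent reads from and writes to.

```python
from datetime import datetime

# 1. Memory and identity: who the agent is and where it left off.
memory = {
    "identity": "SEO agent: concise, data-driven, posts to Asana",
    "state": {"last_task": None},
}

# 2. Data and context: reference material that grounds the output.
context = {"brand_guidelines": "guidelines.md", "content_dir": "content/"}

# 3. Skills: defined capabilities the agent picks from, not improvises.
skills = {
    "visibility_report": lambda task: f"visibility report for {task}",
    "keyword_gap": lambda task: f"keyword gap analysis for {task}",
    "optimization_brief": lambda task: f"optimization brief for {task}",
}

# 5. Journal: a running record of what was done and decided.
journal = []

def pick_skill(task: str) -> str:
    # Toy selection rule: match a skill's first word against the task.
    for name in skills:
        if name.split("_")[0] in task:
            return name
    return "visibility_report"  # default skill

def work_cycle(task: str) -> str:
    skill = pick_skill(task)
    result = skills[skill](task)
    memory["state"]["last_task"] = task  # memory: where it left off
    journal.append({"time": datetime.now().isoformat(),
                    "task": task, "skill": skill})  # journal entry
    return result

# 4. Schedule: a cron job or Asana poller would call work_cycle on a
# cadence; here we invoke one cycle by hand.
print(work_cycle("keyword research for pricing page"))
```

The point of the sketch is that none of this is model logic. It is bookkeeping around the model, and that bookkeeping is what makes the output repeatable.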
Without these foundations, the agent breaks down:
Without memory and identity blocks, the agent forgets who it is and where it left off.
Without good data and context, the agent produces generic output.
Without defined skills, it improvises and you get a different result each time.
Without a schedule, it doesn’t run when it should and progress slows.
Without a journal, you can’t tell what it actually did.
The agent runs on the same model as the Claude web UI. But what it produces feels more like the work of a real coworker: specific, personalized, high quality. That’s the harness, not the model.
This is why the move from a Claude artifact to a production agent isn’t about using a smarter model. The conditions around it (memory, skills, schedule, journal) are the production part. You can swap the model and the output barely changes. Remove the harness and the whole thing falls apart.
The SEO agent is Tier 3 because the work genuinely needs continuous judgment. If I’d tried to build banner production with this kind of agent, I’d have made something fragile, complex, and unnecessary.
The horizon: From team to org.
All three of those tools (the banner generator, the analysis dossier, the SEO agent) operate at the team level. They’re production tools that teams use, owned and maintained by a few builders (or sometimes just me). That’s where most of my AI work lives right now.
The next horizon is the same idea applied across the organization.
I wrote in my last post about Benjamin Levick at Ramp and what happened when his company built a platform for thousands of people to use AI in their actual work. What Ramp found that worked was a structured model: a small group of builders with centralized governance owns the platform and the core tools. The operators close to the business make the customizations that fit how their teams actually work and know what good looks like. Builders can implement structural changes quickly when operators flag what’s needed.
Here’s what that could look like in practice for a marketing campaign launch where the roles coordinate through the same platform:
The ABM manager writes a campaign brief into the platform. Emails, banners, and assets get generated.
The demand gen strategist sees a new channel needs to be added and describes the new banner type and sizes needed.
A builder adds the capability into the banner tool. New sizes generate.
Campaign Ops reviews and refines the auto-created programs in the marketing automation platform.
A manager reviews, edits, and approves the copy. Campaign Ops schedules the sends and launch.
None of those steps wait on a vendor’s roadmap. Operators modify their own workflows because they’re markdown files, prompts, and configurations they can update directly. When structural work is needed, like new banner sizes or types, a builder can implement it quickly on the same platform. And with protocols like MCP and the headless agent products vendors like Salesforce are now building, the last mile of execution can start moving onto the platform too.
The ability to build and tune the right tool for the team and workflows you have can unlock huge value across the organization. But prerequisites have to exist first:
The data layer. Strategy docs, content libraries, brand guidelines, pricing and product info, all somewhere accessible like SharePoint or Notion.
Developers and IT. Partner with those teams on access, connections, and the infrastructure that can support team-level building.
Governance. Decisions about who owns what, how data flows, and how the org keeps track of what people are building so agent sprawl doesn’t become its own problem.
This is not easy work at this scale. But I don’t think you have to do it all at once. Every tier you implement at the team level is a building block toward the platform, and so are the data layer, the workflows, and the governance around who owns what. The next version doesn’t come from trying to put everything perfectly into place. It comes from building quickly with what you have, learning fast, and iterating.
What I've built across the three tiers gets you some tools that save the team time. A platform gets you many more tools and agents, each tuned to a specific function, with the people closest to the work building and updating them in real time.
The bigger opportunity here is making how the company works, with its internal processes, strategy, and institutional knowledge, into a platform the whole org uses to build, run, and update its own workflows. The end result isn’t a few great tools. It’s strategy, tools, and automation built across the company by the people doing the work.
I’m continuing to think about how to scale what we’ve built into that next layer, and what that looks like. But to frame this for now:
Individual AI. Me using Claude as a thought partner, like analyzing my credit card and spending habits.
Efficiency AI. The three tiers I walked through to embed workflows across the team.
Opportunity AI. A system where people can build, iterate, and launch for themselves within the structure builders and strategists set.
That last one is where the real upside is. Not just speed or efficiency, but an organization that builds at the rate it thinks. The architecture and governance for that scale is what I'm thinking through next — more to come on that soon!

