What building an AI product taught me about selling one

With the tech industry and its customers on the precipice of a transformational shift, I wanted to understand that shift from the customer side: to build something real, feel the trade-offs first-hand, and understand where the value sits. To achieve this, I built rendus.ai, a working AI media and workflow platform. This case study is what I learnt from doing it. I write all of this as someone who has spent two decades selling and scaling enterprise software, not as a founder pitching a product.

This page is what the build taught me about adoption, trust, measurement and messaging: the parts that decide whether a deal closes and a rollout sticks. The build itself — architecture, what broke, what I’d do differently — lives on a separate page for the technical stakeholders who want it.

For enterprise technology leaders, the distinction between what the technology can do and whether it gets adopted is critical. Buyers are no longer evaluating AI solely on model capability. They are evaluating whether it can be integrated, governed, measured and trusted inside real business processes. The question is no longer “What can the model do?” but “What has to change inside the organisation for this capability to create value?”

That reinforced a lesson that has held throughout two decades in enterprise software: successful technology adoption is rarely limited by capability alone. It succeeds when organisations can integrate it into workflows, operating models and decision-making.

What I built

I built rendus.ai, a working AI media and workflow product: multi-provider model orchestration, a node-based workflow engine, scheduled publishing, a credit ledger and organisation-level permissions. The live demo is up; this was a learning project, not a startup launch. The point was the build, not the business.

See what I built, how I built it, and what broke →

My 5 learnings

Five learnings stood out for me. Together they point to the same conclusion: the challenge is less in choosing the correct AI model, and more in integrating that capability into the way organisations actually work.

1 Validation becomes critical as models improve

More capable models make errors harder to spot.

Models have improved faster than people’s ability to validate them. Improved fluency is now masking errors more effectively, making inaccuracy harder to spot. The risk is the false positive: an answer that reads as correct, passes a human glance, and is wrong.

This was one of the first practical lessons from the build. As the models improved, I found myself trusting the output more while spending less time challenging it. That turned out to be exactly the wrong instinct.

This is also where the variance problem lives. Ask the same model the same question twice and you’ll often get two different answers. Users expect software; AI gives them weather.

Bridging that gap through validation, structured outputs, retries and evals is core work for any team running AI in production. It is also why I moved towards an adversarial coding approach: one model generated, another challenged, and I made the final judgement. The model can be probabilistic. The product cannot.
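To make the validation-and-retry idea concrete, here is a minimal sketch rather than the rendus.ai implementation. It assumes the model is asked for JSON and that `call_model` is any function that takes a prompt and returns text. The schema in `REQUIRED_FIELDS`, the function names and the retry limit are all hypothetical.

```python
import json
from typing import Callable

# Hypothetical schema: the fields and types we expect in every response.
REQUIRED_FIELDS = {"title": str, "tags": list}


def validate(raw: str) -> dict:
    """Parse model output and check it against the expected shape."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected_type):
            raise ValueError(f"field {field!r} missing or not {expected_type.__name__}")
    return data


def generate_validated(
    call_model: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> dict:
    """Call the model until its output validates, up to max_attempts times."""
    last_error = None
    for _ in range(max_attempts):
        raw = call_model(prompt)
        try:
            return validate(raw)
        except ValueError as exc:
            last_error = exc  # remember why this attempt was rejected
    raise RuntimeError(f"no valid output after {max_attempts} attempts: {last_error}")
```

The design choice is that the product only ever sees output that passed `validate`, or an explicit failure it can handle. The model stays probabilistic; the contract around it does not.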

Validation protects individual outputs. Evals measure whether the system is improving over time. Both become more important as models become more capable.

2 Context is the real multiplier

The same model performs very differently depending on what it knows.

The model doesn’t remember you, and it doesn’t know your system. Every conversation starts cold, every session forgets the last one, and the model only knows what you put in front of it right now. Waiting for the next model release to fix this is a strategy that has been failing for two years. Even Anthropic’s own engineering team, building long-running coding agents on Claude, found that out-of-the-box context handling wasn’t enough.¹

The people getting the most out of AI aren’t the ones with the best prompts. They’re the ones with the best filing systems: vaults, notes, plans, decision logs. They feed them back in as the persistent layer the model itself lacks. For me, that meant written plans in Markdown, organised in Obsidian, and fed into AI sessions deliberately. The model is rented. Your notes aren’t.

My Obsidian vault. Each dot represents a Markdown file: planning, implementation and architecture docs.

The same logic applies within a session. A model with rich context writes code that fits your system. A model with thin context writes code that works for the file you pasted and nothing else. The same prompt produces brilliant work in one project and mediocre work in another. The prompt is the same; the context isn’t. Every conversation with AI is a context budget, and most of the value comes from how well you spend it.
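One way to picture spending a context budget is a simple packing step: rank the notes, then include them in priority order until the budget runs out. The sketch below is illustrative only. The `Note` fields, the priority convention and the four-characters-per-token estimate are assumptions, not a description of any particular tool.

```python
from dataclasses import dataclass


@dataclass
class Note:
    path: str
    text: str
    priority: int  # hypothetical convention: lower number = more important


def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly four characters per token.
    return max(1, len(text) // 4)


def pack_context(notes: list[Note], budget_tokens: int) -> list[Note]:
    """Pick notes in priority order, skipping any that no longer fit the budget."""
    chosen = []
    remaining = budget_tokens
    for note in sorted(notes, key=lambda n: n.priority):
        cost = estimate_tokens(note.text)
        if cost <= remaining:
            chosen.append(note)
            remaining -= cost
    return chosen
```

A real setup would use the provider's tokenizer and a smarter ranking, but the trade-off is the same: every note included is budget that another note cannot use.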

AI without context can fix local problems while degrading system coherence. A hundred individually correct changes can still produce a worse architecture. Surfacing the right context at the right moment is the real work.

3 Organisations need briefs, not prompts

The shift is from telling systems what to do to telling them what you want.

AI usage has moved through three eras. Transactional was the first: one question, one answer, a smarter Google. Multi-step came next: chained prompts, templates and workflows where one output becomes another’s input. The third era, agentic, is the one most products are still catching up to: you describe an outcome and the system designs its own path to it.

The shift that matters isn’t “from chat to agents”. It’s from instructing (telling the machine the steps) to briefing (telling it the outcome). For most of computing history, humans translated intent into instructions the machine could execute. Agents invert that contract.

As systems become more autonomous, validation, governance and evaluation become more important, not less. Memory, context, evals and judgement don’t become optional. They become non-negotiable.

4 Outcomes matter more than outputs

The question is not whether AI works, but whether it creates value.

There are two questions you have to answer about any AI system: did the AI do the thing, and does the thing matter?

The first is a technical question: test sets, evals, sampling and LLM judges. It tells you whether the system works.

The second is a business question: hours saved, deals closed, tickets deflected and P&L moved. It tells you whether the work made a difference.

Most teams answer the first and forget the second. The result is a system that passes every test and changes nothing on the bottom line. The AI works perfectly; nothing happens.

The teams that win measure both outputs and outcomes, and they don’t confuse “the model behaved” with “the business benefited.”
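The two questions can be kept apart in code as well as in reporting. This is a minimal sketch under stated assumptions: `cases` is a hypothetical list of prompt and expected-answer pairs, `system` is any function from prompt to answer, and the business metric is whatever number the team already tracks, such as handling minutes per ticket.

```python
from statistics import mean
from typing import Callable


def output_pass_rate(cases: list[tuple[str, str]], system: Callable[[str], str]) -> float:
    """Technical question: share of test cases where the system gave the expected answer."""
    passed = sum(1 for prompt, expected in cases if system(prompt) == expected)
    return passed / len(cases)


def outcome_delta(before: list[float], after: list[float]) -> float:
    """Business question: change in the average of a business metric after rollout."""
    return mean(after) - mean(before)
```

A system can score 1.0 on the first function and 0 on the second. Reporting them as separate numbers makes that gap visible instead of letting a good eval score stand in for business impact.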

5 Judgement is where advantage remains

Technology scales. Good decisions do not.

Models are commoditising. Anyone can call the API. The same intelligence is increasingly available to everyone, and that levelling is happening faster than most companies are prepared for.

The differentiator is no longer the model itself. It is the decisions around it: where to apply it, what to trust, what to measure, what to automate and what to leave to humans.

The model is available to everyone. The decisions about how to use it are not.

What this means for enterprise adoption of AI

Enterprise adoption of AI is becoming less a model-capability problem and more an organisational one. The technology is evolving faster than most organisations can absorb it, and absorption, not access, increasingly determines whether AI investment pays off.

The engineering challenge quickly stops being “can the model generate the output?” and becomes a set of operational questions familiar to anyone who has sold or implemented enterprise software:

Operational question	What it’s really about
Validating outputs	Trust
Evaluating performance	Continuous improvement
Maintaining context	Organisational memory
Managing permissions	Governance
Governing autonomy	Accountability
Recovering from failure	Reliability
Tracking usage and cost	Economics
Integrating into workflow	Adoption

At enterprise scale, it is not about credit ledgers. It is about identity, audit, governance, procurement and operational ownership. The model is the part you rent. The system around it is what you own. That ownership is where the cost, risk and value sit.

Three patterns stood out most clearly.

Reliability and trust. AI does not fail like traditional software. It can be wrong, inconsistent, delayed or plausibly incorrect. That makes validation, auditability, fallback paths and human oversight essential. The first time an AI system is confidently wrong in production, trust becomes the problem.

Integration and ownership. AI that sits outside the flow of work remains an experiment. AI that connects to systems of record, business processes, permissions and reporting becomes operational. Success depends less on model capability and more on clear ownership: who approves access, who measures outcomes, who handles exceptions and who remains accountable. Without that, the pilot succeeds and the rollout dies in committee.

Economics and identity. Cost, latency and permissions shape what is actually deployable. Enterprises care less about benchmark performance and more about whether outcomes can be delivered at a cost and risk level they can defend; the CFO doesn’t sign a contract whose unit economics flex with the next model release. As agents become more common, identity moves from a sign-up problem to an every-call problem: who is acting, on whose behalf, and with what permissions?
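The "every-call" framing of identity can be sketched as a context object that travels with each agent action and is checked before anything happens. Everything named here is hypothetical: the `CallContext` fields, the `posts:publish` scope and the `publish_post` action are illustrations, not an existing API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallContext:
    agent_id: str           # who is acting
    on_behalf_of: str       # which user delegated the work
    scopes: frozenset[str]  # what that delegation permits


class PermissionDenied(Exception):
    pass


def authorize(ctx: CallContext, required_scope: str) -> None:
    """Check permissions on every call, not once at sign-up."""
    if required_scope not in ctx.scopes:
        raise PermissionDenied(
            f"{ctx.agent_id} acting for {ctx.on_behalf_of} lacks {required_scope!r}"
        )


def publish_post(ctx: CallContext, post_id: str) -> str:
    authorize(ctx, "posts:publish")  # hypothetical scope name
    # The return value doubles as an audit record: who acted, for whom, on what.
    return f"{ctx.agent_id} published {post_id} for {ctx.on_behalf_of}"
```

Because the context is required by every action, the three questions (who is acting, on whose behalf, with what permissions) have an answer at each call rather than only at login.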

None of these challenges exist in isolation. They share the same underlying infrastructure: identity, observability, governance and operating models. The organisations that move fastest on AI will not necessarily be the ones with the best models. They will be the ones that integrate these capabilities into the way the business actually works.

What this means for selling AI

Doing this project has shaped how I think about go-to-market going forward.

For technology companies, AI capability is rapidly becoming table stakes. Buyers know the models are powerful. They can see the demos themselves. The harder challenge is helping organisations turn that capability into measurable outcomes.

The pattern I increasingly see is product teams shipping AI features faster than customers can operationalise them. Feature velocity collides with the slower realities of governance, workflow integration, ownership, pricing and ROI measurement. Shipping the feature is often the straightforward part. Creating adoption is harder.

That changes the role of GTM. The companies that succeed will not simply sell access to AI. They will help customers integrate it into workflows, establish ownership, measure value and build trust in the outcomes. They will understand not just what the product can do, but who owns the problem, why it matters, how value is measured and what blocks adoption inside the customer. Adoption becomes part of the sales motion, not something left for post-sales teams to solve later.

That is ultimately what this project reinforced for me. Building rendus.ai taught me what the system around the model looks like from the inside. Two decades in enterprise software have taught me what it takes to drive adoption, scale technology and create measurable outcomes at the other end.

Capability comes from AI. Value comes from the operating model around it.

Sources

¹ Anthropic Engineering, “Effective harnesses for long-running agents,” November 2025. https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents
