8 min readJohn McBride

What Running an Enterprise Gen AI Hub Teaches You About Real AI Adoption

Lessons from running Spark AI: 46 models behind one interface, a weekly-audited model registry, and why assistant builders beat chat windows.

aienterprisegovernanceadoption

For the past couple of years I've been building and running an internal Gen AI hub for a large organization. We call it Spark AI. It started as a safe place for employees to use a chatbot without pasting company data into a public website. It turned into something much bigger: a single interface sitting in front of 46 models across 6 providers, with dozens of purpose-built tools layered on top.

Running a platform like this teaches you things you can't learn from vendor decks or LinkedIn hot takes. Most of what I believed about enterprise AI adoption on day one turned out to be wrong. Here's what actually held up.

## One interface beats six logins

The first lesson is structural. If your employees need separate accounts for each AI provider, adoption dies at the password reset screen.

We put everything behind one front door. Anthropic, OpenAI, Google, and three other providers all live behind the same login, the same UI patterns, the same access controls. A user who wants to draft a document doesn't think about which API is serving the request. They pick a tool and go.

This matters more than it sounds. The model market churns constantly. Providers leapfrog each other every few months. When your users are bound to one vendor's chat app, every market shift is a migration project. When they're behind your own interface, a market shift is a config change. We've swapped the default model under a tool more than once and nobody noticed except the people watching quality go up.

The other benefit is negotiating posture. You're never hostage to a single provider's pricing or outage. The week one provider has a bad day, traffic quietly shifts to another.

## The model registry is the most boring thing that saved us

Here's the unglamorous part nobody talks about: model IDs rot.

Providers deprecate models on their own schedule, sometimes with short notice. A tool that calls a retired model ID doesn't fail loudly in a staging environment where someone's watching. It fails in production, for a user, in the middle of their workday. And with 46 models in play, hand-tracking that is a losing game.

Our fix was a model registry: one file that is the single source of truth for every model ID the platform is allowed to call. No tool hardcodes a model string. Everything resolves through the registry.

Then we went one step further and made an automated agent audit that registry every week. It checks what we have listed against what the providers actually offer, flags deprecations, flags new releases worth evaluating, and reports out. A weekly cadence sounds slow until you realize the alternative is finding out about a deprecation from an angry user.

If you take one technical idea from this post, take this one. The registry costs almost nothing to build. Not having it costs you a small outage every time a provider cleans house, multiplied across every tool you've shipped.

## Assistant builders beat chat windows

When we launched, I assumed the general chat interface would be the workhorse. Give people a powerful model and a text box, and they'll figure it out.

Some did. Most didn't. A blank chat window puts the entire burden of prompting, context, and workflow design on the user, every single session. The people who got value were the ones already comfortable with the technology. Everyone else tried it twice, got a mediocre answer, and went back to their old way of working.

What changed adoption was letting people build their own assistants. We shipped an assistant builder: pick a model, write instructions once, attach your own knowledge and documents, save it, share it. Suddenly the prompt engineering happens one time, by the person who understands the job, and then the whole team benefits.

A contracts analyst doesn't want to re-explain contract review every morning. She wants an assistant that already knows how she works. Once she builds it, the chat window problem disappears, because the context lives in the assistant instead of in her head.

The same logic drove our move from chat to purpose-built tools. Document chat with podcast generation. An image studio. A prompt engineering trainer. Each one is a narrow front end over the same model pool, designed around a job instead of around a conversation. Tools with a clear job get used. Generic chat gets sampled.

## Governance that doesn't kill momentum

The standard enterprise failure mode is a six-month review process for every AI use case. By the time legal signs off, the team that asked has lost interest and the model they evaluated is deprecated. The opposite failure mode is a free-for-all where nobody knows what's running or what it costs.

We ended up with a middle path, and it rests on a few principles.

**Enforce access in the database, not the interface.** Hiding a button in the UI is decoration. Real gating happens at the data layer, where it can't be bypassed by someone who knows how to open developer tools. Sensitive tools are visible only to the groups that should have them, and the enforcement is server-side.

**Put hard budget caps on autonomous work.** Anything that runs without a human watching gets a spend ceiling, and the system enforces it mid-execution. We don't tally costs after the fact and wince. If a job hits its cap, it stops, right then. This single rule is what made leadership comfortable letting agents run at all.

**Keep a kill switch.** There is one control that pauses all autonomous activity on the platform, instantly. We've rarely needed it. Its existence is the point. It converts "what if something goes wrong" from an objection into a procedure.

**A human approves anything that ships.** Agents can draft plans, write content, even build software. None of it reaches production without a person signing off. This isn't distrust of the models. It's the line that lets you move fast everywhere else, because everyone knows where the hard stop is.

**Open the intake to everyone.** Any employee can submit a request for something they want built. It lands in a queue, gets triaged, and a human approves the plan before work starts. This did two things we didn't fully predict: it surfaced use cases nobody on the platform team would have thought of, and it made the whole organization feel like the hub belonged to them rather than to IT.

The pattern across all five: govern the boundaries, not the activity. Decide where the hard limits are, enforce them mechanically, and then get out of the way inside those limits.

## Adoption is a curriculum, not a launch

One more thing I underestimated: the gap between "the tool exists" and "people use the tool" is enormous, and it doesn't close on its own.

We built a guided tour as the literal first card on the platform. We link out to free external AI courses. We maintain a transparency page that explains, in plain language, how the platform handles security and data. None of this is sophisticated engineering. All of it moved adoption more than any model upgrade did.

The honest version is that your most enthusiastic users need almost nothing from you, and your average user needs far more than you think. Build for the average user. The enthusiasts will be fine.

## Practical takeaways

If you're standing up an AI capability inside your organization, here's the short list I'd hand you:

1. **Put one interface in front of all providers.** Own the front door so model churn becomes a config change instead of a migration.
2. **Build a model registry on day one.** Single source of truth for every model ID, nothing hardcoded, audited on a schedule — automate the audit if you can.
3. **Ship an assistant builder early.** Per-user, per-team assistants with attached knowledge convert casual users into daily users in a way raw chat never will.
4. **Govern boundaries mechanically.** Database-level access control, hard budget caps enforced mid-run, a kill switch, and human approval before anything ships. Then stop adding process.
5. **Open the request queue to everyone.** Your best use cases are sitting in the heads of people who will never attend your AI steering committee.
6. **Invest in onboarding like it's a feature.** Because it is. A guided tour and a plain-language security page outperform a model upgrade.

None of this requires exotic technology. It requires treating the hub as a product with users, not a procurement line item. Two years in, that's the real lesson: the models keep getting better on their own. The platform around them is the part you have to get right.