Most of what gets written about building with AI assumes you’re starting from nothing. A clean repo, an empty database, a model, and a chat box. That has almost never been my situation. I’ve spent most of my career on enterprise platforms, Salesforce systems that a business has been running on for years, and adding AI to one of those is a different kind of work than the tutorials prepare you for.
I want to write down what actually changes, because the gap between a demo and something you can leave running on a system of record is wide, and most of it has nothing to do with the model.
The data comes with rules attached
On a greenfield app, data is just something you fetch and render. On a platform like Salesforce, the data shows up wrapped in rules that were written long before your feature existed. A record has an owner. It has a sharing model. It has field-level security, validation rules, and triggers that run when it changes.
The first time I connected an LLM to read records, I had to stop and ask a basic question: should this model be able to see a record the user in front of it can’t? The answer is obviously no. But it’s easy to build it the wrong way, because the convenient path is to give your AI service a powerful account and let it read everything. The moment you do that, you’ve put the platform’s most sensitive data behind a component that doesn’t understand who’s asking.
So the first thing I check on any of this work is whether the retrieval runs as the user, inside the same permissions the platform already enforces. If the user can’t see it, the model shouldn’t either. You don’t want to reinvent access control next to a system that already has a good one.
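To make that concrete, here is a minimal sketch of retrieval that runs as the user and not as a powerful service account. Everything in it is hypothetical: the `Record` shape, the `can_read` rule, and the in-memory `store` stand in for whatever the platform really enforces. On a real platform you would lean on the platform’s own enforcement and not re-implement it like this.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    record_id: str
    owner_id: str
    fields: dict
    # Hypothetical stand-in for the platform's sharing model.
    shared_with: set = field(default_factory=set)


def can_read(user_id, record):
    # Assumption: a user sees a record they own or one shared with them.
    return user_id == record.owner_id or user_id in record.shared_with


def fetch_as_user(user_id, store, record_ids):
    """Return only the records this user is allowed to see."""
    visible = []
    for record_id in record_ids:
        record = store.get(record_id)
        if record is not None and can_read(user_id, record):
            visible.append(record)
    return visible
```

The point of the shape is that the user’s identity is a required argument on the read path. There is no code path where the model’s retrieval runs without one.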
Pulling the right context is most of the work
A model can only reason about what you put in front of it. On a platform, the information that actually answers a question is spread across related records: an account, its contacts, its open cases, its history. Gathering that, in a way that’s both relevant and allowed, turned out to be most of the effort. The prompt was the last small piece.
You run into two problems quickly. Give the model too little and it answers in generalities, ignoring the specific record the user is looking at, and the whole thing feels bolted on. Give it too much and you’ve quietly pulled in related records the user may not be cleared to see, which is a data leak dressed up as a summary. Getting this right, narrowing to what’s relevant and permitted, is the part worth slowing down for.
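A rough sketch of that narrowing step, under stated assumptions: the `Related` type, the `RELEVANT_KINDS` table, and the `max_items` cap are all invented for illustration, and a real relevance rule would be more involved. What it shows is the order of operations: filter by what the user may read first, then by what the task needs, then cap the size.

```python
from dataclasses import dataclass


@dataclass
class Related:
    record_id: str
    kind: str           # e.g. "contact", "case"
    text: str
    readers: frozenset  # hypothetical: user ids allowed to read this record


# Assumption: each task declares which kinds of related record it needs.
RELEVANT_KINDS = {"summarize_account": ("contact", "case")}


def build_context(user_id, task, related, max_items=5):
    """Build prompt context from records that are both permitted and relevant."""
    wanted = RELEVANT_KINDS.get(task, ())
    # Permission check comes first, so nothing unreadable reaches later steps.
    permitted = [r for r in related if user_id in r.readers]
    relevant = [r for r in permitted if r.kind in wanted]
    return "\n".join(f"[{r.kind}] {r.text}" for r in relevant[:max_items])
```

Two users asking the same question about the same account can get different context, and that is the intended behavior.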
The platform has limits, and they show up under load
Enterprise platforms are shared by a lot of tenants, so they protect themselves with limits: on how many queries you run, how long a transaction takes, how many API calls you make. Your AI feature wants to fan out across related records and then wait on a model call that might take ten seconds. That runs straight into those limits the first time a large account comes through.
The version that works does the AI work off to the side, not inside the request the user is waiting on. Queue it, use the platform’s async tools, and let the user’s save complete on its own. A model call has no business sitting in the middle of a user’s transaction.
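Here is a small sketch of that split, using Python’s standard `queue` module as a stand-in for the platform’s async tools. The `save_record` and `run_worker` names, the dict-based `db`, and the `summarize` callback are all hypothetical. On a real platform the worker would be whatever background job mechanism it provides.

```python
import queue


def save_record(db, record_id, fields, jobs):
    """Commit the user's save and enqueue the AI work; never call the model here."""
    db[record_id] = dict(fields)
    jobs.put(record_id)
    return record_id


def run_worker(db, jobs, summarize):
    """Drain the queue outside any user transaction. Returns how many jobs ran."""
    processed = 0
    while True:
        try:
            record_id = jobs.get_nowait()
        except queue.Empty:
            return processed
        # The slow model call happens here, off the user's request path.
        db[record_id]["ai_summary"] = summarize(db[record_id])
        jobs.task_done()
        processed += 1
```

The save returns as soon as the record is written and the job is queued. If the model is slow or down, the user’s transaction is unaffected and the job can be retried later.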
Whatever the AI writes, someone has to answer for
This is the part that separates a consumer app from a system of record. If a chatbot suggests something wrong in a consumer app, you shrug. On a platform, the records are what the business runs on, and often what an auditor looks at later. “The AI did it” is not an answer anyone wants to give in that room.
So there are a few things I treat as non-negotiable once an AI feature can write back. The change has to be attributable: you should be able to tell that this update came from the assistant, and with what input. It has to be reversible, because a confident wrong answer needs to be as easy to undo as it was to make. And anything truly consequential gets a person in the loop before it commits. Drafting a value for review is fine. Silently changing a customer’s record is not.
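Those three rules fit in a few lines. This is only a sketch: `ProposedChange`, `apply_change`, `undo`, and the list-based `audit_log` are hypothetical names, and a real system would use the platform’s own history and approval features where they exist.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ProposedChange:
    record_id: str
    field_name: str
    new_value: object
    prompt: str                        # the input that produced the suggestion
    consequential: bool
    approved_by: Optional[str] = None  # set by a human reviewer, never by the model


def apply_change(db, audit_log, change):
    """Commit an assistant-authored change; returns the audit entry index."""
    # Person in the loop: consequential writes need an approver first.
    if change.consequential and change.approved_by is None:
        raise PermissionError("consequential change needs human approval")
    record = db[change.record_id]
    # Attributable and reversible: log the source, the input, and the old value.
    audit_log.append({
        "record_id": change.record_id,
        "field": change.field_name,
        "had_field": change.field_name in record,
        "old": record.get(change.field_name),
        "new": change.new_value,
        "source": "assistant",
        "prompt": change.prompt,
        "approved_by": change.approved_by,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    record[change.field_name] = change.new_value
    return len(audit_log) - 1


def undo(db, audit_log, entry_index):
    """Restore the field to its state before the logged change."""
    entry = audit_log[entry_index]
    record = db[entry["record_id"]]
    if entry["had_field"]:
        record[entry["field"]] = entry["old"]
    else:
        record.pop(entry["field"], None)
```

Because the old value is captured at write time, undoing a change is one call and does not depend on anyone remembering what the record used to say.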
None of this is exotic. It’s the same discipline a platform already runs on, applied to a new and less predictable component. If you come into this work treating the platform’s rules as the requirements rather than friction to get around, most of the hard decisions make themselves.