Thesis on Enterprise Agents
Enterprise AI spend crossed from experiment to line item sometime in the last eighteen months. Every Fortune 500 company is now trying, one way or another, to become AI-native, and the ever-increasing inference bill reflects it. Yet returns plateau at the point where agents move from demo to production. We think the bottleneck holding back AI agents is not necessarily the quality of their foundation models, but how AI deployment is being approached. The following are our three theses on the future of enterprise AI agents.
Persistent Memory will matter more than Raw Intelligence
The argument for persistent agentic memory is probably best phrased by Sarah Guo: a token spent reasoning over your company's data is worth more than a token spent reasoning over the open internet. The standard workaround for agent memory today is retrieval: surface the right documents at query time and hope the model connects the dots. We think RAG is a necessary layer of the agentic harness, but there is more to explore in models that natively absorb the institutional knowledge of the enterprise they power. Retrieval treats organizational knowledge as a search problem. A good employee doesn't reread the company wiki before every meeting. They accumulate understanding (decisions, failures, preferences, relationships), and that's what makes them worth more in month twelve than in week one.
An intelligent agent needs both the ability to search scattered documents across data warehouses and the ability to remember and embed the information that shapes how an organization operates. We achieve this through data-efficient RL at scale, post-training models on the enterprise's own data so that institutional knowledge lives in the system, not in a prompt.
The right model for the right task beats the best model for everything
There is no best model, only a frontier, and that frontier is not consolidating. It is more fragmented than ever.
Different model families have task-specific strengths, and choosing the wrong model for a task type has real cost and quality consequences. The price spread is enormous: multiple orders of magnitude separate frontier reasoning models from efficient workhorses. Yet most enterprises route every task, whether meeting summaries, contract analysis, or email drafts, to a single frontier model, and the result is an AI bill that grows faster than AI output.
Intelligent model routing fixes this. We are pushing for an orchestration layer that understands each task's actual requirements (reasoning depth, latency, cost ceiling, data sensitivity) and dispatches accordingly. In practice, this means fast, cheap models for the high-volume work that dominates most enterprise token consumption, and more expensive reasoning models only for workflows that genuinely require heavy token consumption. In our deployments, model routing cuts inference cost significantly while also reducing latency, because tasks land on models suited to them.
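To make the idea concrete, here is a minimal sketch of what such a dispatch rule could look like. It is an illustration under stated assumptions, not a description of our production router: the model names, tiers, latencies, and prices are all hypothetical, and the policy shown (pick the cheapest model that meets every requirement) is just one simple choice among many.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ModelSpec:
    name: str                    # hypothetical model identifier
    reasoning_tier: int          # 1 = lightweight, 3 = deep reasoning
    latency_ms: int              # typical response latency (illustrative)
    cost_per_1k_tokens: float    # illustrative price, not a real quote
    handles_sensitive: bool      # cleared to process sensitive data


@dataclass(frozen=True)
class Task:
    name: str
    min_reasoning_tier: int
    max_latency_ms: int
    max_cost_per_1k_tokens: float
    sensitive: bool


def route(task: Task, catalog: Sequence[ModelSpec]) -> ModelSpec:
    """Return the cheapest model that satisfies every task requirement."""
    eligible = [
        m for m in catalog
        if m.reasoning_tier >= task.min_reasoning_tier
        and m.latency_ms <= task.max_latency_ms
        and m.cost_per_1k_tokens <= task.max_cost_per_1k_tokens
        and (m.handles_sensitive or not task.sensitive)
    ]
    if not eligible:
        raise LookupError(f"no model satisfies the requirements of {task.name!r}")
    return min(eligible, key=lambda m: m.cost_per_1k_tokens)


# Hypothetical catalog: every entry below is made up for illustration.
CATALOG = [
    ModelSpec("small-fast", 1, 300, 0.0002, True),
    ModelSpec("mid-general", 2, 1200, 0.003, True),
    ModelSpec("frontier-reasoner", 3, 8000, 0.06, False),
    ModelSpec("frontier-private", 3, 9000, 0.09, True),
]

summary = Task("meeting summary", 1, 2000, 0.01, sensitive=False)
contract = Task("contract analysis", 3, 15000, 0.10, sensitive=True)

print(route(summary, CATALOG).name)   # small-fast
print(route(contract, CATALOG).name)  # frontier-private
```

The meeting summary lands on the cheapest model because nothing about it demands more, while the contract analysis is forced onto a deep-reasoning model cleared for sensitive data. A real router would likely estimate requirements from the task itself rather than take them as inputs, but the cost logic is the same.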
The deeper consequence of routing, applied well, is that it changes who owns the customer relationship. When the orchestration layer picks the model, model providers become interchangeable suppliers. The strategic position shifts to the layer that decides where tokens get spent. In a world where token economics determine AI ROI, the router is the margin.
Forward-deployed engineering will outperform one-size-fits-all platforms
AI agents look more promising in demos than in deployment because they are not natively embedded in the workflows the enterprise actually runs on. A sales team's workflows and vocabulary are nothing like the finance team's, and that finance team in turn works nothing like the finance team next door. A one-size-fits-all platform fits no one, and configuration UIs don't close that gap, because the people who know the workflows don't know the platform, and the people who know the platform have never seen the workflows.
Palantir made forward-deployed engineering fashionable, and we think it should be the default mode for how agentic SaaS companies work with their customers. Agents are far more context-sensitive than dashboards. An agent that misunderstands a team's workflow doesn't just sit unused; it makes mistakes. To actually ease the coordination costs they were designed to reduce, AI deployments must be tailored to specific operational quirks. For example, an agent should recognize that a sales team fragments its data across HubSpot and disparate tracking tools, and find ways to reduce that friction through automation.
Prioritizing forward-deployed engineering does not mean the product isn't ready for frictionless adoption. Rather, the path to frictionless AI adoption starts with engineering talent. In complex enterprise environments, agents aren't "plug-and-play"; they are "deploy-and-integrate," requiring experts who can bridge the gap between model capabilities and messy internal workflows. The "friction" isn't a bug in the agent but a reality of the business, and it takes human talent to map the AI to that reality.
Where the future lies
ERP needed implementation armies. Cloud needed solutions architects. Each time, the durable franchises were built by companies that took deployment seriously. The models will keep getting better. The moat was never benchmark scores. It's accumulated context, an economic layer that makes every token earn its keep, and the organizational knowledge of how to make agents work inside real companies.
That's where we're placing our bets.
Eragon is the AI operating system for enterprises. Book a demo now.