When OpenAI launched Sora 2 as a consumer app, it sent a wake-up call through the startup world. The infrastructure layer, once content to provide APIs, was now competing directly with the applications built on top of it. Anthropic followed with Claude Teams. The farms weren’t just selling ingredients anymore; they were opening restaurants.
For founders, this raises an existential question: How do you build defensible AI businesses when your infrastructure providers are also your competitors?
The answer isn’t in better models or more compute. It’s in something the giants can’t easily replicate: exclusive access to scarce data, earned through trust.
The Permission Economy
Marc Andrusko and Alex Rampell from a16z frame this perfectly: while model companies will always have bigger models and more distribution, startups can win in “walled gardens of data”—domains where information is proprietary, regulated, and valuable precisely because it’s restricted.
Think of it this way: ChatGPT can scrape the public internet. It cannot access your company’s Slack history, your hospital’s patient records, or a law firm’s case files. That permission gap is your moat.
Two companies exemplify this strategy:
VLex, a Spanish legal tech company, spent two decades digitizing fragmented court decisions and statutes across regional jurisdictions—building what became Europe’s most comprehensive legal database. When generative AI arrived, VLex didn’t need to compete on model quality. It competed on completeness. A lawyer crafting a brief needs every relevant precedent. Miss one case, lose the argument. General-purpose models can’t guarantee that coverage; VLex can.
OpenEvidence pursued the same approach in medicine. While WebMD and forums flood the internet with health information, clinicians need peer-reviewed research that sits behind the paywalls of publishers like Elsevier. OpenEvidence spent years building partnerships to access vetted medical literature. The result is an AI that answers clinical questions with evidence-backed precision, something that would be impossible without that proprietary corpus.
The pattern is clear: defensibility comes from owning datasets that are hard to access, constantly updated, and impossible to replicate through public scraping.
Where the Gardens Still Need Planting
The opportunity extends far beyond law and medicine. Consider:
Supply Chain & Logistics: Shipping manifests, customs filings, and trucking records are fragmented globally. No one owns the complete picture. Build it, and you enable predictive trade finance and geopolitical risk modeling.
Municipal Government Records: Permits, zoning applications, environmental studies—scattered across thousands of local jurisdictions. Consolidate them, and you unlock AI for real estate development and infrastructure planning.
Climate Data: Emissions tracking, carbon intensity metrics, and local climate risk data sit in government agencies, NGOs, and scientific institutions—often in PDFs. There’s no Bloomberg for climate yet. That’s the opportunity.
Frontier Sciences: Fields like synthetic biology and quantum materials publish in disparate journals without centralized aggregation. Structure that research, and you accelerate R&D cycles.
These aren’t markets where you out-compute OpenAI. They’re markets where you out-trust them by building relationships that grant exclusive access.
The Trust Equation
Peter Thiel famously said competition is for losers—monopolies capture value. But in AI, the new monopolies aren’t built on better algorithms. They’re built on institutional trust and regulatory permission.
Healthcare providers won’t share patient data with a chatbot. Law firms won’t upload privileged communications to a public API. Manufacturers won’t expose supply chain details to competitors. These gatekeepers control access, and access creates moats.
This dynamic mirrors what Ben Thompson calls “aggregation theory in reverse.” Traditional aggregators (Google, Facebook) commoditized suppliers and owned distribution. AI infrastructure companies own compute and models—but they don’t own the restricted data that enterprises protect most fiercely.
Your defensibility isn’t in training bigger models. It’s in being the trusted partner with permission to access what others cannot.
The Groundwork Required
Building walled gardens isn’t easy. It requires:
- Patient relationship-building: VLex spent 20 years. OpenEvidence negotiated licensing deals across medical publishers. This isn’t a weekend hackathon; it’s institutional sales.
- Regulatory navigation: Healthcare, finance, and government data come with compliance requirements. Your moat is partly legal—competitors can’t replicate your partnerships without similar approvals.
- Data structuring: Raw access isn’t enough. Fragmented PDFs and unstructured records need normalization. Your AI advantage comes from curated, structured datasets, not just volume.
- Continuous updating: Static datasets decay. Your moat deepens when you’re the only one maintaining live feeds of regulatory changes, new case law, or emerging research.
This upfront investment is the barrier to entry—and the reason incumbents often ignore these opportunities. Niches feel too small. Integration feels too complex. But AI economics make formerly marginal verticals suddenly viable.
The Strategic Imperative
As Elad Gil notes in High Growth Handbook, the best defense against giants isn’t trying to beat them at their own game. It’s finding the dimension where their scale doesn’t matter—or where it actively works against them.
Google can’t build trust with every hospital system. OpenAI can’t negotiate data partnerships with thousands of municipalities. Anthropic can’t secure licensing from every niche scientific journal.
But you can. And once you do, you’re not just building a product. You’re building a data monopoly that compounds over time.
The Path Forward
The AI infrastructure wars are real, but they’re not your war. Let OpenAI and Anthropic fight over who has the smartest model. Your battle is different:
- Where is valuable data currently fragmented?
- Who controls access, and why would they trust you?
- What workflows unlock once that data is structured?
- How do you maintain exclusivity as you scale?
Answer these, and you’re not building an AI company. You’re building a permission-based monopoly in a domain where trust and access matter more than compute.
The farms may be opening restaurants. But they can’t farm land they don’t own.
Ready to explore your walled garden opportunity? Join M Studio’s monthly Founders Meeting to map domains where permission creates defensibility—and learn how to build trust-based moats in AI-native markets.