The software industry is undergoing a major shift. Valuations for SaaS companies have dropped sharply, with EV/revenue multiples falling from a peak of 6.7x in 2021 to 3.1x by late 2025. Investors are no longer rewarding growth alone. Instead, they’re focusing on companies with strong data infrastructure that delivers lasting value. Here’s what’s driving this change:
- AI Automation: AI has made software development faster and cheaper, reducing the value of standalone software features.
- Data Platforms: Companies with proprietary, high-quality data are commanding higher valuations, while traditional SaaS businesses are struggling.
- Market Trends: Analytics & Data Management was the only SaaS category to see valuation growth in 2025, with multiples rising 11%, while other categories like Security and ERP saw steep declines.
The key takeaway? Software is becoming a commodity, and the real edge lies in owning exclusive, high-value data that competitors can’t replicate.
Quick Highlights:
- AI Impact: Generative AI is lowering development costs, pressuring companies that rely on basic software functionality.
- Valuation Trends: AI-driven platforms with proprietary data trade at 12–20x revenue multiples, while generic SaaS solutions fall below 5x.
- Investor Focus: Companies that embed data generation into workflows are better positioned for long-term success.
To thrive in this new landscape, businesses must shift from selling software features to building defensible data systems. Investors are already prioritizing these factors when evaluating opportunities.

SaaS Valuation Shift: Software vs Data-Driven Companies 2021-2026
Why Software Is Commoditizing
AI Tools Have Made Software Development Faster and Cheaper
The barriers to creating software have all but disappeared. AI-assisted development tools have made it possible for non-engineers to build functional products, a trend encapsulated by "vibe coding" – Collins Dictionary’s Word of the Year in 2025. What once required large teams and years of iteration can now be achieved in a fraction of the time.
Generative AI has drastically reduced costs for traditional software companies by automating workflows that previously took years to refine. These tools allow repetitive tasks to be completed by autonomous agents at roughly 1% of their historical cost. As TheMeridiem Team observes:
The market has stopped valuing software companies primarily on their growth rates… It’s started valuing them on whether their core business model survives AI displacement.
This shift has disrupted companies built to centralize workflows or boost human efficiency. Customers no longer depend on these established processes, and investors are punishing businesses that merely package third-party APIs without offering proprietary data or distinct value. These so-called "AI wrappers" have seen their valuations drop to 2–3x revenue, while application software multiples contracted by 41% between early 2025 and early 2026, declining from 5.8x to 3.4x EV/NTM Revenue. In 2025, AI-referenced targets accounted for 72% of all SaaS M&A transactions, with premiums reserved for assets offering genuine technical advantages. The dramatic drop in software development costs is driving this broader market re-pricing trend.
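The compression figures above are internally consistent – a move from 5.8x to 3.4x EV/NTM revenue is a roughly 41% decline, as a quick check shows:

```python
# Sanity check on the multiple-compression figures cited above.
peak_multiple = 5.8    # EV/NTM revenue, early 2025
trough_multiple = 3.4  # EV/NTM revenue, early 2026

decline = (peak_multiple - trough_multiple) / peak_multiple
print(f"Application software multiple compression: {decline:.0%}")
```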
SaaS Multiples Have Stabilized at Pre-Boom Levels
As the market evolves, a clear line has emerged between companies leveraging defensible data and those relying solely on software features. This shift has brought SaaS multiples back to their pre-boom levels, highlighting a broader move from valuing software functionality to prioritizing data infrastructure. By early 2026, only 13% of public software companies were growing at rates above 20%, a stark contrast to the 44% seen during the 2021 peak.
Valuations now vary widely between AI-driven platforms and traditional SaaS applications. Since the launch of ChatGPT, companies specializing in foundational infrastructure and data platforms – dubbed "AI Darlings" – have seen their value skyrocket by 513%. Meanwhile, horizontal application software has dropped 14% over the same period. Median public AI market cap-to-revenue multiples now exceed 10x, while traditional SaaS companies have dipped below 5x.
As AI tools make software creation more accessible, investors are reevaluating traditional software business models. The focus has shifted from growth rates to a deeper question: can a company’s core business model survive the wave of AI-driven automation? Relying on software functionality alone is no longer enough; the true competitive edge now lies in proprietary data and infrastructure.

What Makes Data Defensible
The 4 Requirements for Data Defensibility
Simply collecting data doesn’t automatically create a competitive edge. Investors draw a clear line between raw data and data that can truly hold its ground. What sets defensible data apart comes down to four key factors.
- Proprietary generation: This is data that competitors can’t easily replicate. It’s not something they can get by throwing more computing power at open-source models or scraping public sources. Instead, it comes from exclusive access to unique data streams that are tough to duplicate.
- Compounding value: With every interaction, the system improves, creating a growing advantage that competitors can’t easily overcome.
- Workflow embedding: The data collection process is deeply integrated into essential enterprise workflows. Pulling it out or duplicating it would be so costly and complex that it’s nearly impossible.
- Domain specificity: The data is fine-tuned for a specific industry or use case. This makes it difficult to apply outside that niche, limiting its utility to competitors in other markets.
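As a rough illustration – not an established industry rubric – the four requirements can be treated as a pass/fail checklist. The class name, field names, and all-four threshold below are assumptions made for this sketch:

```python
from dataclasses import dataclass

@dataclass
class DataAsset:
    # The four defensibility criteria, scored as simple booleans.
    proprietary_generation: bool  # can't be replicated via compute or scraping
    compounding_value: bool       # improves with every interaction
    workflow_embedded: bool       # removal would break core enterprise workflows
    domain_specific: bool         # tuned to one industry or use case

    def is_defensible(self) -> bool:
        # All four criteria must hold; a single miss leaves competitors an opening.
        return all([
            self.proprietary_generation,
            self.compounding_value,
            self.workflow_embedded,
            self.domain_specific,
        ])

# Example: a product wrapping a third-party API, embedded in a workflow
# but with no exclusive or domain-specific data of its own.
wrapper = DataAsset(False, False, True, False)
print(wrapper.is_defensible())  # False
```

The all-or-nothing check mirrors the article's argument: scale or embedding alone reads as a scale effect, not a moat.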
These four factors explain why companies with proprietary data assets tied to foundational LLMs and GenAI often achieve revenue multiples between 12–20x. On the other hand, companies that simply layer their services on top of third-party APIs struggle to reach even 3x revenue. A notable example is IBM’s 2026 acquisition of Confluent for $12.65 billion at a 9.6x revenue multiple. Confluent’s real-time data streaming was critical for enterprise AI, acting as a “nervous system” for managing proprietary operational telemetry – something competitors couldn’t replicate (Source: Windsor Drake AI Software Valuation Report, 2026).
Now that we’ve outlined what makes data defensible, let’s explore why so many data strategies fall short of these benchmarks.
Why Most Data Moats Fail
Even with the four criteria in mind, many companies stumble by confusing sheer volume with a real competitive advantage. Research from a16z points to a common issue: many so-called data advantages are just scale effects, not actual network effects. Scale effects can be matched by competitors with enough investment in acquiring customers or gathering data. True defensibility requires either scarcity – exclusive access to data sources – or telemetry scale that can’t be replicated simply by spending more money.
Data strategies often fail when companies rely too heavily on third-party APIs without adding any proprietary value. This approach compresses valuations to as low as 2–3x revenue. Another pitfall is high inference costs, which eat into margins and make the business resemble a low-margin service operation rather than a high-margin software company. Additionally, if the training data isn’t backed by documented consent or a clear chain of custody, investors heavily discount its value due to regulatory risks. Windsor Drake’s 2026 valuation analysis underscores this point:
If you’re wrapping someone else’s API and calling it a product, you’re fighting for the 3x slot.
This shift is already visible in public markets. AI-native platforms with proprietary data and strong safety measures trade at median market cap-to-revenue multiples above 10x. Meanwhile, traditional SaaS companies have dipped below 5x. The message is clear: software functionality alone no longer cuts it. The real edge lies in the data layer, and that data must meet all four defensibility criteria to hold its value.
How Investors Should Adjust Their Evaluation Framework
How to Assess Proprietary Data Generation
When evaluating companies in today’s data-driven world, investors need to shift their focus. Instead of asking, "What does this software do?", the question should be, "What data does this company have exclusive control over?" It’s all about assessing who owns the data and how it’s generated. Companies that create unique first-party data – whether through sensors, proprietary workflows, or custom authoring tools – stand apart from those that merely process external files or rely on third-party APIs.
A key consideration is whether competitors, even with significant resources, could replicate this proprietary data. Take AwanTunai, for example, an Indonesian fintech company. By embedding its proprietary ERP system into MSME supply chains, it captured real-time transaction data directly from wholesalers. This unique approach enabled them to maintain a 3% non-performing loan rate during the COVID-19 pandemic, while the broader industry faced rates as high as 20–30% (Source: Insignia Business Review, 2025). That’s the power of first-party data in action.
Another example is H2OK, a Flybridge portfolio company. Their IoT platform for industrial liquid optimization generates proprietary datasets through its sensor systems. Unlike competitors relying on public or third-party data sources, H2OK’s data is exclusive and compounds in value over time (Source: Flybridge, 2025). The operational costs of switching away from their system are steep, as doing so would mean losing the critical data infrastructure that powers key processes.
The ultimate question for investors is clear: Does this company control the entry point where work gets done? As Elvia Perez from Las Olas Venture Capital explains:
Don’t just integrate around workflows. Own the entry point where work gets done.
By owning the entry point, companies can transform into systems of intelligence, rather than remaining as supplementary tools. This focus on proprietary data generation forms the backbone of why enterprises are increasingly favoring vertical solutions over horizontal ones, a trend explored further below.
Why Enterprises Are Concentrating Spend on Vertical Solutions
As software becomes more commoditized, enterprises are gravitating toward platforms offering measurable data advantages. According to QED Investors and PYMNTS (Jan 2026), businesses are now channeling budgets into vertical solutions with proven performance, moving away from speculative horizontal AI vendors. This shift reflects a broader trend: companies are prioritizing products that deliver clear ROI through domain-specific data, rather than generic tools.
The numbers back this up. In 2025, Analytics & Data Management was the only SaaS category to see valuation growth, with median EV/TTM revenue multiples rising 11% year-over-year (Source: SEG 2026 Annual SaaS Report). Furthermore, AI-related targets made up roughly 72% of the 2,698 SaaS M&A transactions that year. Buyers are paying a premium for businesses with data infrastructure that grows in value – not just for software features that can be easily copied.
Vertical solutions excel because they consolidate fragmented data into cohesive, high-value datasets. For instance, in healthcare, platforms that unify payroll, regulatory, and clinical data create something far more valuable than the sum of these isolated parts. This data unification – the ability to integrate and structure domain-specific information – gives vertical solutions a defensibility that horizontal tools can’t achieve without years of integration work.
For investors, the takeaway is straightforward: companies that generate proprietary data as a byproduct of their workflows will command higher valuations. As Alex Iskold from 2048 Ventures puts it:
Purely algorithmic AI, without a proprietary data set, has no moat and is not defensible long term.
Valuing companies solely on their software capabilities misses the bigger picture. The real competitive advantage lies in data-driven solutions, particularly those tied to vertical markets. This reinforces the idea that data ownership and integration are the new cornerstones of defensibility in today’s market.
How M Studio Builds Data-Driven Defensibility

Structuring Operational Data into Proprietary Assets
M Studio takes a unique approach to building defensibility by embedding data collection directly into everyday operations. By working closely with founders, they design workflows that naturally generate exclusive, proprietary data. Unlike methods that try to add defensibility after the fact, M Studio ensures that data capture is baked into the core processes. This means that every transaction enhances the value of the data, making it harder for competitors to replicate simply by applying computing power to publicly available models. The result? Higher switching costs and a data infrastructure that becomes irreplaceable.
To maintain the value of these datasets, M Studio prioritizes documented consent and clear data provenance. Without these, datasets can face significant valuation cuts. In fact, by 2025, AI-referenced targets made up about 72% of all SaaS M&A transactions, highlighting how critical true AI differentiation has become. Buyers are now far more cautious about distinguishing between meaningful AI capabilities and superficial "AI-washing." You can learn more about these operational frameworks by joining our AI Acceleration Newsletter.
M Studio’s infrastructure is designed to support over 200 AI models, including GPT-4, Claude, and Gemini. This approach gives organizations the flexibility to choose the best model for their specific needs while avoiding vendor lock-in. As AI technology advances, this adaptability ensures that the data infrastructure remains strong and continues to provide a competitive edge. This strategy sets the stage for AI systems that not only improve operational efficiency but also drive revenue growth.
Building AI Systems That Scale Revenue Operations
After structuring operational data into proprietary assets, the next step is turning that data into scalable value through AI-powered systems. M Studio creates tools that automate revenue operations while simultaneously generating more proprietary data. By deeply integrating these tools into essential workflows, they become indispensable, further strengthening defensibility.
Cost efficiency is a major focus. M Studio’s systems are designed to cut compute costs by 40–50%, which is crucial for protecting margins as businesses scale (Windsor Drake, 2026). This ensures that as revenue increases, unit economics improve rather than deteriorate into less profitable territory. The AI systems not only enhance revenue operations but also reinforce the value of the proprietary data, meeting the defensibility standards M Studio emphasizes.
The strategy also targets industries that are both data-rich and heavily regulated, where proprietary datasets can command premium valuations. Companies leveraging exclusive datasets and unique algorithms have achieved valuations of 12–20× revenue, far exceeding the 3.1× median for traditional SaaS businesses (Source: Aventis Advisors, Jan 2026). Embedding AI into critical workflows makes these systems essential, boosting net revenue retention to over 120% and justifying higher valuation multiples (Aventis Advisors, Jan 2026).
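Net revenue retention above 100% means a customer cohort's revenue grows even with zero new logos, which is why the 120%+ figure supports higher multiples. A small illustration, with an assumed starting cohort and the NRR figure cited above:

```python
nrr = 1.20                     # 120% net revenue retention, per the figure above
cohort_revenue = 1_000_000.0   # assumed starting cohort ARR (illustrative)

# With NRR above 100%, existing customers alone compound revenue each year.
for year in range(3):
    cohort_revenue *= nrr

print(round(cohort_revenue))   # the cohort grows ~73% over three years
```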
Through its venture studio approach, M Studio links this robust data infrastructure to capital markets. This positions companies as owners of valuable, defensible data assets that grow in worth as they scale, rather than just as software providers.
Conclusion
The software market has undergone a dramatic shift. Public software valuations have dropped significantly, with multiples compressing from a high of 6.7x in 2021 to around 3.1x by late 2025 (Aventis Advisors, Jan 2026). Meanwhile, the Analytics & Data Management segment has seen an 11% valuation increase year-over-year (SEG 2026 Annual SaaS Report). This trend highlights a clear pivot: the market now prioritizes data infrastructure over standalone software capabilities.
Investors who focus solely on software features risk misjudging the true value of a company. The critical question has evolved from "what does the software do?" to "what proprietary data does this company generate, and how does that data grow in value?" As Manish Sood, CEO and Founder of Reltio, explains:
Unified, real-time, trustworthy data is the context that powers the shift to agentic AI.
Without this foundational data layer, software risks becoming little more than a commodity wrapper for third-party APIs, often leading to lower revenue multiples of just 2–3x.
The contrast in valuations is striking. Companies with proprietary datasets – especially in foundational AI – can command revenue multiples of 12–20x. On the other hand, generic enterprise applications typically fall into the 3–6x range. This underscores the importance of investing in businesses that generate and leverage proprietary data, as software alone no longer guarantees differentiation or value. For investors, these evolving market dynamics necessitate a fresh approach to portfolio strategy.
Next Steps for Investors
In 2026, building a strong portfolio starts with prioritizing data infrastructure. Investors should scrutinize whether a company’s operations produce proprietary data, determine if that data compounds in value over time, and ensure that switching costs are deeply embedded rather than superficial. To stay informed, consider subscribing to the AI Acceleration Newsletter for insights into structuring operational data into defensible assets.
M Studio focuses on bridging data infrastructure with capital distribution. Using our venture studio approach, we identify businesses that naturally create robust data moats, design AI systems to turn that data into proprietary assets, and connect those assets to institutional capital. The companies that will achieve premium valuations in 2027 and beyond are being built now – on the strength of proprietary data, not just software features.
FAQs
How can I verify a company’s data is truly proprietary?
To determine if a company’s data is genuinely proprietary, start by examining how it’s generated. Look for processes that are difficult to replicate, such as those involving exclusive partnerships or specialized algorithms. Consider whether the data becomes more valuable over time, integrates deeply into workflows (making it costly for users to switch), and is customized for a particular field or industry. Additionally, assess any barriers in place, like exclusive access agreements or regulatory protections, that make duplication challenging or impossible.
What are the best signals that a data moat will compound over time?
The most telling signs of a growing data moat are its proprietary nature, unique value creation that’s hard to replicate, and seamless integration into key workflows, which builds switching costs. Data that’s exclusive, tough to duplicate, delivers lasting value, and produces actionable insights stands out as especially strong. Furthermore, the increasing reliance on data-driven platforms by businesses underscores the need for scalable, high-quality data assets that can maintain a competitive edge over time.
How should data defensibility change my valuation and underwriting model?
Data defensibility changes the game for valuations, moving the focus away from just software functionality to emphasizing proprietary, hard-to-duplicate data assets. As software becomes more of a commodity, traditional metrics like EV/Revenue lose their reliability. Instead, it’s more effective to evaluate factors like:
- The exclusivity of the data: How unique and irreplaceable is the dataset?
- Its ability to grow in value over time: Does the data improve or expand with use?
- Switching costs: How difficult is it for users or competitors to move away from this data?
These aspects provide a clearer picture of competitive strength in an AI-driven world, where data moats tend to hold up better than software features alone.
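One way to operationalize this is a tiered multiple keyed to the three factors above. This is a sketch only: the function, tier boundaries, and the specific multiples are illustrative assumptions drawn from the ranges cited in this article, not a formal underwriting model.

```python
def estimate_ev(revenue: float, exclusive: bool, compounding: bool,
                high_switching_costs: bool) -> float:
    """Rough EV estimate from revenue and the three defensibility factors.

    Multiples are illustrative, taken from ranges cited in the article:
    ~12-20x for proprietary-data platforms, ~3-6x for generic enterprise
    apps, ~2-3x for thin API wrappers.
    """
    criteria_met = sum([exclusive, compounding, high_switching_costs])
    if criteria_met == 3:
        multiple = 12.0   # low end of the proprietary-data range
    elif criteria_met == 2:
        multiple = 5.0    # generic enterprise application range
    else:
        multiple = 2.5    # "AI wrapper" territory
    return revenue * multiple

# A $10M-revenue company meeting all three factors vs. a thin wrapper:
print(estimate_ev(10_000_000, True, True, True))     # 120000000.0
print(estimate_ev(10_000_000, False, False, False))  # 25000000.0
```

The spread between the two outputs is the point: identical revenue, a nearly 5x gap in implied value, driven entirely by the data factors rather than the software.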
Related Blog Posts
- The Post-AI Exit Strategy: Building for Acquisition from Day One
- Your Data Is Worth More Than Your Software – You Just Can’t See It Yet
- Five Questions Every Investor Should Ask Before Backing an AI Company
- Why VCs Are Pricing Companies on Data Defensibility (And What That Means for Your $1M ARR Business)



