Executive Summary
For fifteen years, B2B data has been a race to the bottom: everyone buys the same firmographics from the same scraped sources. The data advantage that once set revenue teams apart is now a commodity, and commodities don't win deals.
The teams that win next aren't the ones with more data. They're the ones with data nobody else can manufacture: bespoke data assets built through proprietary acquisition, entity resolution, validation, and complex workflows. Enigma, Windfall, and LeadGenius show the model. The moat isn't the data — it's the machine that produces it.
For the last fifteen years, B2B data has been a race to the bottom. Everyone has the same firmographics. Everyone has the same job titles, the same company sizes, the same funding announcements pulled from the same press releases. The "data advantage" that powered a generation of revenue teams has quietly become a commodity.
Here's the uncomfortable truth: if an LLM can find it by reading a press release, so can your competitor's LLM. The moment data is easy to acquire, it stops being an advantage and becomes table stakes. The teams that win the next decade aren't the ones with more data. They're the ones with data nobody else can get.
That data has a name: bespoke data. It's defined by what it takes to build it. This is the thesis behind the bespoke era of go-to-market: as we've argued before, you can't build a bespoke GTM engine on generic inputs.
The Difference Between Aggregation and Assets
Most "data providers" are aggregators. They point a crawler at the open web, normalize what comes back, and resell it. It's a fine business, but it produces no moat — because the inputs are public and the process is replicable by anyone with a scraper and a weekend.
A genuine data asset is different. It's the output of a defensible pipeline:
- Proprietary acquisition — access to raw signals that aren't sitting on a public page: transaction feeds, panels, licensed partnerships, original collection.
- Entity resolution — the hard, unglamorous work of stitching messy records to a single real-world person, household, or company across dozens of conflicting sources.
- Validation and enrichment — multi-step verification that turns a raw signal into something you'd actually bet a campaign on.
- Complex workflows — the orchestration of all of the above, run continuously, at scale, with quality controls that took years to tune.
This is the part an LLM can't shortcut. A language model is brilliant at finding and summarizing what already exists in text. It is useless at manufacturing a dataset that has never been written down. You cannot prompt your way to credit card transaction data. You have to build the machine that produces it.
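To make the entity resolution step concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the record fields (`source`, `name`, `postal_code`), the suffix list, and the matching rule (same normalized name plus same postal code) are hypothetical. Production systems typically rely on fuzzy matching, many more attributes, and years of tuning, so treat this as a toy version of the idea rather than anyone's actual pipeline.

```python
import re
from collections import defaultdict

# Hypothetical list of legal suffixes to ignore when comparing company names.
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "corp", "co", "company", "incorporated"}


def normalize_name(name: str) -> str:
    """Lowercase, replace punctuation with spaces, strip trailing legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def resolve_entities(records):
    """Group records that share a normalized name and a postal code.

    Assumption: exact match on (normalized name, postal code) is a good
    enough identity key for this toy example.
    """
    clusters = defaultdict(list)
    for record in records:
        key = (normalize_name(record["name"]), record["postal_code"])
        clusters[key].append(record)
    return list(clusters.values())


# Made-up records from three hypothetical sources.
records = [
    {"source": "feed_a", "name": "Acme Bakery, Inc.", "postal_code": "94110"},
    {"source": "feed_b", "name": "ACME BAKERY", "postal_code": "94110"},
    {"source": "feed_c", "name": "Acme Bakery LLC", "postal_code": "10001"},
]

clusters = resolve_entities(records)
for cluster in clusters:
    print([r["source"] for r in cluster])
```

The first two records collapse into one business and the third stays separate because its postal code differs. The hard part in practice is every case this rule gets wrong: misspellings, relocations, franchises that share a name, and sources that disagree.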
Three Companies That Built Real Moats
The best way to understand bespoke data is to look at companies that have done the hard thing. Each of the three below sells a dataset that simply cannot be reconstructed by scraping the web — because the underlying signal isn't on the web at all.
Enigma — Transaction Data as a Business Signal
Most data tells you what a company says about itself. Enigma works with real transaction signals to estimate revenue, growth trajectory, and operational health for small and mid-sized businesses — telling you what a business actually does.
Why it's a moat: Card transaction patterns aren't published anywhere. No press release says "this restaurant's revenue grew 14% quarter over quarter." Turning raw, noisy financial signals into reliable revenue estimates for millions of businesses requires entity resolution, modeling, and validation. That pipeline is the product, and the open web doesn't contain the input.
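As a rough illustration of the arithmetic behind a "grew 14% quarter over quarter" signal, the sketch below sums already-resolved transactions by calendar quarter and computes the growth rate. It does not describe how Enigma works. The field names (`date`, `amount_cents`) and all figures are made up, and it assumes the observed transactions cover the whole business, which real pipelines cannot assume and have to model.

```python
from collections import defaultdict
from datetime import date


def quarter_of(d: date) -> tuple:
    """Map a date to a (year, quarter) pair, e.g. 2024-04-20 -> (2024, 2)."""
    return (d.year, (d.month - 1) // 3 + 1)


def quarterly_totals(transactions):
    """Sum transaction amounts (integer cents) per calendar quarter."""
    totals = defaultdict(int)
    for txn in transactions:
        totals[quarter_of(txn["date"])] += txn["amount_cents"]
    return dict(totals)


def qoq_growth(totals, prev, curr):
    """Quarter-over-quarter growth as a fraction of the previous quarter."""
    return (totals[curr] - totals[prev]) / totals[prev]


# Made-up transactions for a single, already-resolved business.
transactions = [
    {"date": date(2024, 1, 15), "amount_cents": 4_000_000},
    {"date": date(2024, 3, 2), "amount_cents": 6_000_000},
    {"date": date(2024, 4, 20), "amount_cents": 5_400_000},
    {"date": date(2024, 6, 9), "amount_cents": 6_000_000},
]

totals = quarterly_totals(transactions)
growth = qoq_growth(totals, (2024, 1), (2024, 2))
print(f"{growth:.0%}")
```

The calculation is the easy part. The defensible work is upstream: getting the transactions at all, attaching them to the right business, and correcting for the share of revenue the feed never sees.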
Windfall — Household Net Worth
Windfall builds a view of household net worth and wealth attributes that goes far beyond the crude "income range" guesses most providers offer. It answers a question almost nobody else answers accurately: how much is this person actually worth?
Why it's a moat: Net worth isn't disclosed. It has to be constructed: assembled from a wide range of signals, resolved to the right individual and household, and continuously refreshed as circumstances change. The defensibility lives in the entity resolution (people move, marry, change names) and the modeling that converts scattered indicators into a reliable estimate. You can't scrape a number that exists nowhere in published form.
LeadGenius — Contact-Level Signals Beyond the Firmographic
LeadGenius specializes in hard-to-find, contact-level B2B data: verified physical addresses, granular technology usage at the individual level, and other attributes that require human-in-the-loop research combined with automated workflows.
Why it's a moat: A physical address tied to a specific decision-maker, or a confirmed read on what technology a particular person (not just their company) uses, doesn't come from a single source. It comes from a blended pipeline — automated collection, targeted research, and a validation layer that confirms the signal is real before it ships. That combination of machine scale and human verification is expensive and slow to build, which is exactly why it's defensible. It's the same thinking behind contact-level technographics: mapping who does what, with what, and where — not just which logo owns a license.
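One way to picture a validation layer that "confirms the signal is real before it ships" is a simple gate: a value is released only when enough independent sources agree and a human researcher has confirmed it. The sketch below is hypothetical and is not LeadGenius's process. The `Observation` fields, the source labels, and the two-source threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One claim about one contact from one source (hypothetical schema)."""
    contact_id: str
    attribute: str
    value: str
    source: str
    human_verified: bool = False


def validated_values(observations, min_sources=2):
    """Return {(contact_id, attribute): value} for values that pass the gate.

    Gate (an assumption): at least `min_sources` distinct sources report
    the same value, and at least one of them is human-verified.
    """
    support = {}
    for obs in observations:
        key = (obs.contact_id, obs.attribute, obs.value)
        sources, verified = support.get(key, (set(), False))
        support[key] = (sources | {obs.source}, verified or obs.human_verified)

    shipped = {}
    for (contact_id, attribute, value), (sources, verified) in support.items():
        if verified and len(sources) >= min_sources:
            shipped[(contact_id, attribute)] = value
    return shipped


# Made-up observations: c1 is confirmed twice, c2 has a single unverified sighting.
observations = [
    Observation("c1", "crm_tool", "ToolA", "crawler"),
    Observation("c1", "crm_tool", "ToolA", "researcher", human_verified=True),
    Observation("c2", "crm_tool", "ToolB", "crawler"),
]

shipped = validated_values(observations)
print(shipped)
```

Only the `c1` value ships; the single unverified sighting for `c2` is held back. This sketch does not handle two conflicting values that both pass the gate, which a real pipeline would need to resolve explicitly.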
What These Three Have in Common
None of them are in the business of finding data. They're in the business of making it. Each one:
- starts from a signal that isn't publicly written down,
- invests heavily in entity resolution to attach that signal to the right real-world entity,
- runs validation steps so the output is trustworthy enough to act on, and
- operates a continuous, complex workflow that competitors would need years and serious capital to reproduce.
That's the template for a defensible data asset. The IP is the pipeline. The moat is the accumulated investment in acquisition, resolution, and validation. The output is a dataset that powers plays no one else can run.
Why This Matters Now
The arrival of cheap, capable LLMs has paradoxically made bespoke data more valuable, not less. When everyone can instantly summarize the public web, the public web stops being a differentiator. The teams that stand out will be the ones holding signals the models can't reach — because those signals were never published in the first place.
The GTM implication is direct. A playbook built on commodity firmographics produces commodity results: the same accounts, the same messaging, the same conversion rates as everyone else fishing in the same pond. A playbook built on bespoke data produces specialized plays — finding businesses by their real revenue trajectory, targeting individuals by genuine net worth, reaching contacts at verified addresses with messaging tied to their actual tech stack.
That's not a marginal edge. It's the difference between competing on execution and competing on access to truths your competitors can't see.
The Takeaway
The future of B2B data isn't more data. It's better-protected data: assets built behind real IP and technology moats, manufactured through proprietary acquisition, entity resolution, validation, and complex workflows that no LLM can shortcut.
The press release is free. Everyone has it. The advantage now belongs to the companies that build the machine that produces what the press release never said.



