Startup Data Infrastructure

The Data Layer for Venture Intelligence

Access comprehensive startup data via API or curated feeds. Organizations, domains, embeddings, and similarity scores, all in one place.

Powering data infrastructure for select venture teams

20M+

Organizations

Company profiles enriched with firmographics, social links, and growth signals.

150M+

Domains

Websites crawled and indexed with content, links, and similarity scores.

500k/day

Pages processed

Real-time web crawling capacity across 850+ curated sources.

Data Sources

Where we gather intelligence

850+ curated sources across six signal categories.

Launch Platforms

2,500/week

Catch new products the moment they launch, across all major platforms in one unified feed

Developer Platforms

15,000/week

Find technical founders, trending repositories, and the next breakout devtool before other VCs notice

Funding Data

800/week

Track every round from pre-seed to Series D. Know who's raising, how much, and from whom

Accelerators & Incubators

500/week

Access vetted dealflow from top programs worldwide, including YC, Techstars, and 100+ others

News Outlets

3,000/week

Stay informed without reading 200+ publications. Know when companies make headlines

Company Data

50,000/week

Automatic firmographic updates. Track headcount changes, website updates, and growth signals

The Scale

Enterprise-grade data infrastructure

Built for funds that need comprehensive market coverage.

20M+

Organizations

Company profiles with firmographics and growth metrics

150M+

Domains

Websites crawled, indexed, and continuously monitored

100M+

Embeddings

Semantic vectors for similarity and search

1B+

Similarity Pairs

Pre-computed company relationships

850+

Sources

Curated accelerators, VCs, and industry outlets

500k+

Pages/Day

Real-time crawling and content extraction

Architecture

Built for scale and reliability

Enterprise-grade infrastructure with ML at the core.

Continuous Sync

Real-time crawling keeps data fresh across all sources

Semantic Embeddings

ML-powered similarity using modern transformer models

Structured Schemas

Consistent data models with typed fields and validation

Historical Snapshots

Track changes over time with versioned records

Entity Resolution

Deduplicated records with cross-source linking

Privacy-First

One-way data flow. No PII in exports.
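As a rough illustration of how embedding-based similarity works in general, here is a minimal sketch using cosine similarity over toy vectors. The company names and three-dimensional vectors are made up for the example, and cosine similarity is an assumed metric; the page does not state which model or metric is actually used.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def most_similar(query, candidates, top_k=3):
    """Rank candidate names by similarity to the query vector, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy data: real embeddings have hundreds of dimensions.
embeddings = {
    "acme-analytics": [0.9, 0.1, 0.0],
    "datawise": [0.8, 0.2, 0.1],
    "greengrid": [0.0, 0.1, 0.9],
}
ranked = most_similar([1.0, 0.0, 0.0], embeddings, top_k=2)
```

Pre-computing such scores for pairs of companies is what makes "similar company" lookups cheap at query time.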

Data Products

Choose your integration depth

From real-time API access to custom enterprise solutions.

Real-time API

REST endpoints for search, organizations, domains, and similarity queries.

Sub-second response times. 99.9% uptime.

View Documentation

Data Feeds

Curated datasets by region and industry. Weekly or daily delivery.

JSON, Parquet, or CSV. S3 or direct download.

Explore Feeds

Custom Solutions

Bespoke data pipelines, custom enrichment, and dedicated support.

White-glove onboarding. SLA guarantees.

Contact Us
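To give a feel for the Real-time API option, here is a minimal sketch of how request URLs for the four endpoint families (search, organizations, domains, similarity) might be assembled. The base URL, paths, and query parameter names are hypothetical placeholders, not the documented API; check the actual documentation for the real ones.

```python
from urllib.parse import urlencode

# Hypothetical placeholder; the real base URL is not given on this page.
BASE_URL = "https://api.example.com/v1"


def build_url(resource, **params):
    """Build a GET URL for a resource, with query parameters in sorted order."""
    query = urlencode(sorted(params.items()))
    url = f"{BASE_URL}/{resource}"
    return f"{url}?{query}" if query else url


# Parameter names (q, limit, domain) are assumptions for illustration only.
search_url = build_url("search", q="climate tech", limit=10)
similar_url = build_url("similarity", domain="example.com")
orgs_url = build_url("organizations")
```

Sorting the parameters keeps the generated URLs deterministic, which helps with caching and with tests.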

Data Feeds

Pre-built for your investment thesis

Curated datasets updated weekly. Or request a custom feed.

DACH Startups

Germany, Austria, and Switzerland tech ecosystem

country in (DE, AT, CH) · ml:fundability > 0.5

European Climate Tech

Sustainability and clean energy ventures across Europe

region = Europe · sector = Climate · ml:fundability > 0.5

Pre-Seed AI/ML

Early-stage AI companies before their first round

funding is null · sector = AI/ML · ml:fundability > 0.7

US B2B SaaS

Enterprise software companies in the United States

country = US · sector = SaaS · ml:fundability > 0.5

Technical Founders

Developer-first companies with open source traction

github_stars > 100 · sector = Devtools · ml:fundability > 0.6

Accelerator Graduates

Recent cohorts from YC, Techstars, and top programs

accelerator is not null · graduated_at > 2024

Series A Pipeline

Seed-funded companies showing Series A signals

last_round = Seed · headcount_growth > 50% · ml:fundability > 0.8

Nordic Healthtech

Healthcare and life sciences in the Nordics

country in (SE, NO, DK, FI) · sector = Healthcare · ml:fundability > 0.5

UK Deep Tech

Advanced technology ventures in the United Kingdom

country = UK · sector = Deep Tech · ml:fundability > 0.6

Southeast Asian Fintech

Financial technology startups across ASEAN

region = ASEAN · sector = Fintech · ml:fundability > 0.5

Latin American Tech

High-growth tech startups in LATAM markets

region = LATAM · headcount > 10 · ml:fundability > 0.5

Custom Feed

Request a bespoke dataset tailored to your thesis

Request Custom Feed →
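The feed criteria above read naturally as predicates over company records. The sketch below shows two of them, DACH Startups and Pre-Seed AI/ML, applied to sample data. It assumes that the criteria on a feed are combined with AND and that records are dictionaries keyed by the field names shown in the rules; the sample records are invented for illustration.

```python
def dach_startups(record):
    """Feed rule: country in (DE, AT, CH) and ml:fundability > 0.5."""
    score = record.get("ml:fundability")
    return (
        record.get("country") in {"DE", "AT", "CH"}
        and score is not None
        and score > 0.5
    )


def pre_seed_ai(record):
    """Feed rule: funding is null, sector = AI/ML, ml:fundability > 0.7."""
    score = record.get("ml:fundability")
    return (
        record.get("funding") is None
        and record.get("sector") == "AI/ML"
        and score is not None
        and score > 0.7
    )


# Hypothetical sample records; real records carry many more fields.
records = [
    {"name": "A", "country": "DE", "sector": "AI/ML", "funding": None, "ml:fundability": 0.82},
    {"name": "B", "country": "FR", "sector": "AI/ML", "funding": None, "ml:fundability": 0.75},
    {"name": "C", "country": "CH", "sector": "SaaS", "funding": "Seed", "ml:fundability": 0.40},
]

dach = [r["name"] for r in records if dach_startups(r)]
pre_seed = [r["name"] for r in records if pre_seed_ai(r)]
```

A custom feed would follow the same pattern: one predicate per thesis, evaluated against each record.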

What You Can Build

From data to decisions

Technical building blocks for venture intelligence.

01

Dealflow Agents

Build autonomous agents that source, filter, and rank deals matching your thesis.

02

Competitive Monitoring

Track competitor portfolios, market movements, and emerging players in your sectors.

03

Portfolio Support

Enrich portfolio company data, identify synergies, and surface partnership opportunities.

04

Portfolio Tracking

Monitor portfolio health with growth signals, hiring trends, and market positioning.

05

Market Mapping

Build comprehensive market landscapes with company clusters and competitive dynamics.

06

CRM Enrichment

Keep your pipeline fresh with automated company data updates and new signals.

07

Custom Dashboards

Power internal deal analytics and reporting with structured, queryable data.

08

Thesis Validation

Test investment theses against real market data and historical patterns.

Built by a Fund CTO

Karl Lorey

Co-founder and former CTO at First Momentum, a European VC fund, where I built the data infrastructure that powered our sourcing. After years of solving this problem in-house, I'm making enterprise-grade deal sourcing accessible to every fund.

10+ years building crawlers, ML models, and data pipelines. I know what works because I've lived the problem.

Ready to power your data infrastructure?

Request access to discuss API access, data feeds, or custom solutions.

Currently accepting select venture and growth equity teams.