
Protege, a New York-based platform specializing in the secure and ethical exchange of proprietary data for AI training, closed a $25 million Series A funding round. This round brings the company’s total funding to $35 million, following a $10 million seed round in September 2024.
Founded in early 2024, Protege addresses a critical bottleneck in AI development: access to high-quality, proprietary training data. The platform connects data holders—such as organizations in healthcare and media—with AI developers, enabling compliant, governed data sharing for model training. Protege emphasizes ethical sourcing, curating multimodal datasets (e.g., video, audio, clinical notes, and medical images) aligned with specific use cases, regulatory standards, and research goals. It also helps data holders monetize underutilized assets while ensuring data security and intellectual property protection.
The founding team includes:
- Bobby Samuels (CEO and Co-Founder), who has highlighted the platform’s role in overcoming data access frictions.
- Travis May (Co-Founder), CEO of Shaper Capital and former CEO of LiveRamp and Datavant, bringing expertise in data ecosystems.
- Engy Ziedan (Chief Scientific Officer), focusing on dataset curation.
- Richard Ho (CTO), overseeing technical infrastructure.
Protege operates as a marketplace for AI training data, with an expansive catalog that includes over 300,000 hours of video content, 500,000 hours of audio, billions of clinical notes, and hundreds of millions of medical images. The company has expanded into verticals like Audio & Speech and Motion Capture, launched in August 2025.
Funding Round Details
The Series A was led by Footwork, with participation from existing investors including CRV (which led the seed round), Bloomberg Beta, Flex Capital, Shaper Capital, and Liquid 2 Ventures. The valuation was not publicly disclosed. This investor group reflects confidence in Protege’s traction, with Footwork’s Nikhil Basu Trivedi noting the company’s execution across healthcare, media, and frontier AI labs.
| Funding Round | Amount | Lead Investor | Key Participants | Total Funding Post-Round |
| --- | --- | --- | --- | --- |
| Seed | $10 million | CRV | SV Angel, Liquid 2 Ventures, Bloomberg Beta, Flex Capital, Adam D’Angelo, Travis May | $10 million |
| Series A | $25 million | Footwork | CRV, Bloomberg Beta, Flex Capital, Shaper Capital, Liquid 2 Ventures | $35 million |

Use of Funds and Strategic Implications
The capital will support product enhancements, expansion into new verticals, and strengthened partnerships with enterprise customers and data providers. This aligns with Protege’s growth trajectory: the company has scaled its business 20x in 2025, partnering with over 100 data holders and generating tens of millions in revenue for them through data licensing. Key achievements include collaborations with developers of foundational AI models and acquisitions such as Calliope Networks in December 2024, which unlocked premium video data.
In the broader AI market, where training data scarcity hinders progress—particularly for specialized, real-world applications—Protege’s model positions it as a key enabler. Competitors in data labeling or synthetic data (e.g., Scale AI) focus on different aspects, but Protege’s emphasis on proprietary, multimodal sources differentiates it, especially in regulated sectors like healthcare. The funding underscores investor bets on data infrastructure as a foundational layer for AI advancement, amid rising demands for ethical and compliant data practices.
With this infusion, Protege aims to accelerate its role in unlocking proprietary datasets, potentially driving breakthroughs in AI for fields like oncology and media. The company’s scientist-led approach and focus on governance could help it capture a larger share of the emerging AI data market, which is projected to grow rapidly as models require more diverse, high-fidelity inputs. Expansion plans include deeper integrations and new partnerships, building on recent ones with Gradient Health for medical imaging and Syndesis Health for global healthcare data.
