
Baseten, founded in 2019 and headquartered in San Francisco, provides an AI inference platform that enables developers to deploy, scale, and manage open-source, custom, and fine-tuned models in production. The company’s Series D funding represents a significant milestone, pushing its total capital raised to $285 million and achieving a $2.15 billion valuation. This round arrives at a pivotal moment for AI infrastructure, as enterprises increasingly prioritize efficient inference—the process of running trained models to generate outputs—over training, especially with the narrowing quality gap between open- and closed-source models. Research suggests that inference now accounts for up to 90% of AI operational costs, making platforms like Baseten’s, which promise 40% cost reductions and 99.99% uptime, highly attractive. The funding validates Baseten’s multi-cloud approach, which aggregates GPU resources from providers like AWS and Google Cloud to mitigate shortages and ensure reliability.
Funding Details
The $150 million infusion, led by BOND—a venture firm focused on long-term innovation—includes strong backing from prior supporters and strategic newcomers. CapitalG, Alphabet’s independent growth fund, signals potential synergies with Google Cloud, where Baseten already partners for hybrid deployments. Premji Invest, known for tech investments in emerging markets, and Scribble Ventures add diverse expertise in scaling global operations. Existing investors like IVP and Spark Capital, who co-led the prior Series C, underscore continuity amid Baseten’s hypergrowth: its customer base now exceeds 100 enterprises (e.g., Writer, Descript, Abridge, Gamma), with hundreds of smaller users, and zero reported churn.
Pre-money valuation stood at $2 billion, a premium that reflects investor confidence in Baseten’s proprietary Inference Stack, which incorporates custom kernels, speculative decoding, and advanced caching for modalities like LLMs, image generation, and text-to-speech. CEO Tuhin Srivastava emphasized the round’s role in “investing heavily in R&D” to push inference limits, including features like Baseten Chains for compound AI systems, which improve GPU utilization sixfold and halve latency.
Strategic Implications
This funding accelerates Baseten’s expansion into global regions with region-locked deployments, addressing compliance and latency needs. It also bolsters hiring—team size has tripled year-over-year to around 109 employees, with hires from tech giants like Google and Uber—to withstand talent competition from model builders and hedge funds. In a market where GPU scarcity can cause downtime, Baseten’s elastic scaling across clouds provides a defensible moat, evidenced by integrations with TensorRT and partnerships with AWS and GCP.
The evidence leans toward sustained momentum for Baseten, as AI adoption shifts from prototyping to production-scale applications. However, challenges persist: securing Nvidia GPUs remains a sector-wide bottleneck, and the company must navigate evolving open-source dynamics, such as DeepSeek’s cost-efficient models, which Baseten quickly supported to drive inbound demand.
Historical Context
Baseten’s funding trajectory reflects the AI boom’s maturation:
| Round | Date | Amount Raised | Lead Investors | Valuation (Post-Money) | Key Use of Funds |
| --- | --- | --- | --- | --- | --- |
| Seed | 2019 | $8M | Greylock, South Park Commons | Undisclosed | Initial platform development |
| Series A | April 2022 | $12M | Greylock | Undisclosed | Engineering and go-to-market expansion |
| Series B | March 2024 | $40M | IVP, Spark Capital | Over $200M | Operations scaling, multi-cloud features |
| Series C | February 2025 | $75M | IVP, Spark Capital | $825M | Product R&D, geographic growth, team buildout |
| Series D | September 2025 | $150M | BOND | $2.15B | Model research, infrastructure, developer tools |
Total funding pre-Series D: $135M. The progression from seed to unicorn in six years highlights investor belief in inference as AI’s “biggest bottleneck,” with revenue growing 5-6x annually and features like hybrid deployments and Baseten Embeddings Inference (BEI) delivering 2x throughput gains.
Baseten’s platform, as described on its site, emphasizes “inference-optimized infra,” training without MLOps overhead, real-time audio streaming for voice agents, and dedicated deployments for LLMs such as Llama and Qwen. This aligns with the funding’s focus on GenAI customizations, positioning Baseten against competitors like Together AI and Replicate.
In summary, the Series D not only fuels Baseten’s technical edge but also cements its role in enabling “mission-critical AI workloads,” with potential for further rounds as inference demand surges. The company’s emphasis on reliability—blazing-fast cold starts and granular autoscaling—addresses real pain points, fostering empathetic partnerships with builders racing to market.
Baseten, an AI infrastructure pioneer since its 2019 inception by co-founders Tuhin Srivastava (CEO), Amir Haghighat, Philip Howes, and Pankaj Gupta, has emerged as a cornerstone in the deployment of generative AI models. Specializing in inference—the computationally intensive phase where models process queries and deliver responses—Baseten addresses a critical yet often overlooked challenge in the AI lifecycle. As enterprises and startups alike grapple with scaling AI-native products, the company’s platform offers a seamless blend of cloud-native infrastructure, applied performance research, and developer-centric tooling.

The Series D Funding Round: Mechanics and Milestones
Baseten’s $150 million Series D round catapults the company into unicorn territory with a post-money valuation of $2.15 billion, up from $825 million in its February 2025 Series C. Led by BOND, a San Francisco-based venture firm renowned for backing transformative tech like Stripe and Airtable, the round attracted a robust syndicate. Existing backers—Conviction Partners, 01 Advisors (rebranded from 01a, featuring former Twitter executives Adam Bain and Dick Costolo), IVP, Spark Capital, and Greylock—reaffirmed their commitment, while newcomers CapitalG (Alphabet’s growth arm), Premji Invest (Azim Premji’s family office with global technology holdings), and Scribble Ventures (led by Kevin and Elizabeth Weil) injected fresh capital and strategic depth.
This infusion brings Baseten’s aggregate funding to $285 million across five rounds, a testament to its trajectory from a seed-stage MLOps tool to a full-stack inference powerhouse. The $2.15 billion post-money valuation represents a roughly 160% uplift from the Series C’s $825 million, driven by metrics like sixfold fiscal-year revenue growth (ending January 2025) and a customer roster boasting category leaders such as Abridge (healthcare AI), Gamma (presentation tools), Writer (enterprise LLMs), Descript (audio editing), and Patreon (creator economy). With over 100 enterprise clients and hundreds of SMBs, Baseten reports near-zero churn, attributing this to its 40% average inference cost savings and superior performance benchmarks, including 2x higher throughput for embeddings via its proprietary Baseten Embeddings Inference (BEI).
Proceeds are strategically allocated: a significant portion targets R&D in model performance, incorporating cutting-edge techniques like speculative decoding and custom kernels tailored for evolving hardware (e.g., Nvidia’s latest GPUs). Infrastructure enhancements will expand multi-region, multi-cloud capabilities, ensuring 99.99% uptime and sub-second cold starts—critical for real-time applications like voice agents and transcription. Team scaling is another priority; from roughly 50 employees after the Series C, Baseten aims to triple headcount again, drawing talent from Palantir, Atlassian, and Confluent to counter fierce competition for AI specialists. A geographic push includes more “local B10ers” (Baseten engineers) for region-specific support, aligning with compliance demands in Europe and Asia.
Investor quotes underscore the round’s rationale. BOND’s involvement highlights Baseten’s “visionary” approach to AI’s app layer, where inference bottlenecks could stifle adoption. CapitalG’s entry suggests synergies with Google Cloud Marketplace integrations, where Baseten launched hybrid mode in 2024 for flexible workloads. Spark’s Will Reed noted, “Inference is make-or-break for AI products at scale,” echoing Srivastava’s view that “speed, reliability, and cost-efficiency are non-negotiables.” This funding arrives amid AI’s inflection: open-source breakthroughs (e.g., DeepSeek’s R1 rivaling OpenAI’s o1 at fraction-of-cost training) have democratized models, but production deployment remains a pain point, with inference comprising 80-90% of lifecycle costs per industry analyses.
Baseten’s Platform and Competitive Edge
At its core, Baseten’s Inference Stack is engineered for “mission-critical” workloads, differentiating through a trifecta of research, infrastructure, and DevEx. The platform supports deployment anywhere—Baseten Cloud, self-hosted, or hybrid—across providers like AWS, GCP, and Azure, aggregating GPUs to evade shortages that plague single-cloud setups. Features like Baseten Chains enable granular control for compound AI (e.g., chaining LLMs with embeddings), yielding 6x GPU efficiency and 50% latency reductions. Modality-specific optimizations shine: a customized Whisper for transcription (which the company bills as the fastest and most accurate on the market), real-time audio streaming for low-latency voice AI, and dedicated deployments for LLMs like Llama 3, Qwen, and DeepSeek, promising higher throughput at lower latency than rivals.
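The compound-AI pattern described above can be sketched in plain Python. This is NOT the Baseten Chains API, just a hypothetical illustration of the shape of such a pipeline: an embedding step feeds a retrieval step, which feeds a generation step, and each step could in principle run on separately autoscaled hardware.

```python
# Hypothetical compound-AI pipeline: embed -> retrieve -> generate.
# All three "models" are trivial stand-ins for illustration only.

def embed(text):
    # Toy embedding: character-frequency vector over a tiny alphabet.
    return [text.count(c) for c in "abcde"]

def retrieve(query_vec, corpus):
    # Pick the document whose toy embedding is closest to the query's.
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(corpus, key=lambda doc: dist(embed(doc), query_vec))

def generate(query, context_doc):
    # Stand-in for an LLM call conditioned on retrieved context.
    return f"Answer to {query!r} using context {context_doc!r}"

corpus = ["abc", "aab", "cde"]
result = generate("abc?", retrieve(embed("abc"), corpus))
print(result)  # → Answer to 'abc?' using context 'abc'
```

The design point a system like Chains addresses is that the embedding and generation steps have very different hardware profiles, so orchestrating them as separate, independently scaled units uses GPUs far more efficiently than running the whole pipeline on one box.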
Forward-deployed engineering—hands-on support from prototype to production—fosters deep partnerships, with customers praising the “delightful” workflows that abstract away CI/CD complexities. Observability tools provide AI-specific metrics, far beyond traditional DevOps, ensuring transparency in multi-model pipelines. Security (SOC 2, HIPAA) and global capacity further solidify trust, powering millions of end-user inferences daily without downtime.
Competitively, Baseten carves a niche against Together AI (Salesforce-backed, model-focused), Replicate (developer-friendly but less enterprise-scale), and Modal Labs (serverless compute). Its multi-cloud elasticity and claimed 10% latency edge in embeddings position it, per company announcements, as a premier platform for GenAI apps. Market tailwinds include the narrowing gap between open- and closed-source models, spurring demand for cost-efficient serving—Baseten claims top-tier performance at “a fraction of OpenAI’s cost.”
Funding History and Growth Trajectory
Baseten’s ascent mirrors AI’s evolution from niche ML tooling to ubiquitous infrastructure. The 2019 seed ($8M, co-led by Greylock’s Sarah Guo and South Park Commons) funded core abstractions like Truss, an open-source packaging standard. The 2022 Series A ($12M, Greylock-led) expanded to full-stack apps, attracting angels like OpenAI’s Greg Brockman and Figma’s Dylan Field. Series B in March 2024 ($40M, IVP/Spark) valued it over $200M, enabling multi-cloud and TensorRT integrations amid LLM hype.
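Truss, mentioned above, standardizes how a model is packaged for serving. As a rough illustration of the documented convention (a `Model` class exposing `load` and `predict` hooks; consult the Truss docs for the current interface, and note the toy model here is purely hypothetical), a minimal `model/model.py` might look like:

```python
# Sketch of the Model class shape Truss expects in model/model.py.
# Details hedged: this follows the documented load/predict convention,
# with a trivial stand-in model instead of real weights.

class Model:
    def __init__(self, **kwargs):
        self._model = None

    def load(self):
        # Runs once at deploy time: load weights or clients here.
        # Stand-in "model": reverses its input string.
        self._model = lambda text: text[::-1]

    def predict(self, model_input):
        # Called per request with the deserialized request body.
        return {"output": self._model(model_input["text"])}

m = Model()
m.load()
print(m.predict({"text": "hello"}))  # → {'output': 'olleh'}
```

Separating `load` (expensive, once per replica) from `predict` (cheap, per request) is what lets a serving platform autoscale replicas and keep cold starts predictable.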
The February 2025 Series C ($75M, IVP/Spark co-led) at an $825M valuation followed DeepSeek’s emergence, fueling features like self-hosted deployments and Chains. Revenue surged 5-6x, and the team grew to 109, including GitHub and Uber alumni. Each round has layered capabilities: from prototype-to-production autoscaling to advanced caching, reflecting iterative refinement.
| Metric | Pre-Series D (2025) | Post-Series D Projection |
| --- | --- | --- |
| Total Funding | $135M | $285M |
| Valuation | $825M | $2.15B |
| Employees | ~109 | 200+ (targeted tripling) |
| Revenue Growth | 6x YoY (FY ended Jan 2025) | Sustained hypergrowth |
| Customers | 100+ enterprises, 100s SMBs | Expanded global footprint |
| Key Features Added | Hybrid mode, BEI (2x throughput) | Enhanced R&D for speculative decoding, region-locking |
This table illustrates compounding value: early rounds built foundations; later ones scaled for enterprise reliability. Total investors now exceed 30, including angels like Mustafa Suleyman (DeepMind co-founder).
Market Context and Future Outlook
The AI inference market, projected to exceed $50 billion by 2028, is ripe for disruption as training costs plummet (e.g., DeepSeek reportedly training competitive models at a fraction of U.S. labs’ costs). Yet production hurdles—GPU scarcity, latency spikes, inflated costs—persist, with cloud maintenance disruptions affecting even well-funded firms. Baseten’s response: a “blazing-fast” network with 99.99% uptime, multi-region scaling, and optimizations that cut costs by 40% while boosting speed.
Challenges include talent wars and hardware evolution; Baseten counters with “applied performance research” and partnerships (e.g., Google Cloud for broader access). Future bets include deeper compound-AI support, expansions into fine-tuning and evals, and a potential IPO trajectory, as hinted in Crunchbase predictions. With customers that have collectively raised billions, Baseten’s “zero churn” and inbound interest from DeepSeek switchers signal product-market fit. As Srivastava notes, “Inference is the biggest challenge left to solve”—this round equips Baseten to lead that charge, potentially redefining AI’s app layer.
In a landscape of hype and hurdles, Baseten’s measured ascent—rooted in developer empathy and technical rigor—positions it for enduring impact, bridging open-source innovation with enterprise-scale delivery.
