Compresr

LLM-native context compression

Winter 2026 · Active · B2B · Infrastructure · Artificial Intelligence · Developer Tools · Enterprise Software · San Francisco, CA, USA
Compresr provides an API that compresses LLM context without losing what matters. It’s a drop-in for agents and RAG that cuts token costs and improves accuracy.
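Compresr's actual API surface isn't documented in this profile. Purely as an illustrative sketch of what "drop-in" means for a RAG pipeline, the snippet below uses an invented client function, placeholder URL, and assumed response field; the real product may differ entirely.

```python
# Hypothetical sketch only: the endpoint URL, request fields, and response
# field below are invented for illustration; Compresr's real API may differ.
import requests

COMPRESS_URL = "https://api.compresr.example/v1/compress"  # placeholder URL

def compress_context(context: str, target_ratio: float = 0.1) -> str:
    """POST retrieved context to a (hypothetical) compression endpoint and
    return a shorter string that preserves the salient content."""
    resp = requests.post(
        COMPRESS_URL,
        json={"text": context, "target_ratio": target_ratio},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["compressed_text"]  # assumed response field

# Drop-in point in a RAG pipeline: compress between retrieval and generation,
# then build the final prompt from the compressed context.
if __name__ == "__main__":
    retrieved = "…long retrieved documents…"  # stand-in for retriever output
    short_context = compress_context(retrieved, target_ratio=0.1)
    prompt = f"Context:\n{short_context}\n\nQuestion: What was Q3 revenue?"
```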

Verdict

Market Opportunity: High Signal
LLM inference cost reduction is a massive and growing pain point for every company building on top of foundation models, which is effectively the entire B2B AI market. Token costs eat directly into margins for AI-native companies and for enterprises deploying agents at scale. The ICP is clear: companies running RAG pipelines or agentic systems with high token consumption. A published pricing page suggests real thinking about monetization.
Founder Signal: Medium Signal
Ivan Zakazov (CEO) is the strongest signal: roughly five years of relevant experience across Philips Research and a Microsoft internship, plus an EPFL PhD (left before finishing) with published papers on prompt compression directly relevant to this product (EMNLP 2025, NeurIPS 2024). Oussama Gabouj (CTO) did prompt-compression research at EPFL's dlab, with an EMNLP 2025 paper accepted. Berke Argin and Kamel Charaf are essentially fresh graduates (a UBS internship, part-time work at Bell Labs) with little industry experience. The team is technically credible but thin on commercial and go-to-market experience.
Competition: Low Signal
LLMLingua (Microsoft Research) is the most direct competitor: a well-known open-source prompt-compression library that Compresr explicitly benchmarks against. ContextualAI, Cohere's reranking API, and various RAG-optimization tools compete in the adjacent space. OpenAI and Anthropic could trivially build context compression natively into their APIs, which is an existential risk. The open-source nature of part of the product is a double-edged sword: it drives adoption but lowers the barrier for copycats.
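For a concrete sense of the category Compresr competes in, here is the basic usage pattern from LLMLingua's own README (installable via pip install llmlingua). The example text is made up, and the exact signature and defaults may vary across versions.

```python
# Prompt compression with Microsoft's LLMLingua, the most direct competitor.
# Usage pattern per the project's README; defaults may vary by version.
from llmlingua import PromptCompressor

compressor = PromptCompressor()  # downloads a small LM used to score tokens

demo_context = (  # invented example text, not real data
    "Q3 revenue grew 14% year over year to $2.1B, driven by enterprise "
    "subscriptions. Operating margin expanded to 21%."
)

result = compressor.compress_prompt(
    demo_context,
    instruction="Answer using only the given context.",
    question="What was Q3 revenue?",
    target_token=64,  # token budget for the compressed context
)

print(result["compressed_prompt"])                        # shortened context
print(result["origin_tokens"], "->", result["compressed_tokens"])
```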
Product: Medium Signal
A live pip-installable package with 174 GitHub stars, a demo, documentation, and a pricing page: a real product exists. The team publishes a concrete benchmark (FinanceBench: 10x compression, 74.5% accuracy vs. a 72.3% baseline, 76% cheaper), and the tool works with Claude Code, Codex, and OpenClaw. However, there are no named paying customers and no stated revenue, and the benchmark cites 'GPT-5.2', which doesn't exist yet, raising questions about data credibility.
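As a sanity check, the "76% cheaper" figure is at least arithmetically consistent with 10x compression on input-heavy workloads: compression removes ~90% of input-token cost, while output tokens and the compressor's own overhead remain. The back-of-envelope below uses invented prices and token counts, not numbers from the benchmark.

```python
# Back-of-envelope check on the "10x compression, 76% cheaper" claim.
# All numbers below are invented for illustration, not from the benchmark.
input_tokens = 50_000              # assumed raw context size per query
output_tokens = 1_000              # assumed answer length
price_in, price_out = 3e-6, 15e-6  # assumed $/token (input, output)

baseline = input_tokens * price_in + output_tokens * price_out
compressed = (input_tokens / 10) * price_in + output_tokens * price_out
overhead = 0.05 * baseline         # assumed cost of running the compressor

saving = 1 - (compressed + overhead) / baseline
print(f"approximate saving: {saving:.0%}")  # ~77% with these assumptions
```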
Overall: B Tier

Compresr has a technically credible team anchored by Ivan Zakazov (PhD research plus Microsoft and Philips experience, with published papers directly on this problem) and a real, installable product with meaningful GitHub traction. The core problem, token cost reduction for LLM pipelines, is real and large. However, the competitive risk is severe: Microsoft Research already ships LLMLingua, and any major LLM provider can internalize compression natively, making this a feature rather than a company. The benchmark data cites 'GPT-5.2', a model that does not exist as of March 2026, which undermines credibility. No paying customers or revenue signals are visible. Two of the four founders are essentially students with intern-level experience, and there is no commercial operator on the team. The research foundation is strong, but the company needs to show defensibility and revenue fast, before model providers commoditize this layer.

Active Founders

Berke Argin
Founder

CAIO @ compresr.ai. On a mission to make every token count | EPFL CS | prev UBS

Kamel Charaf
Founder

Co-founder & COO @ Compresr (YC W26) | EPFL Data Science Masters | ex-Bell Labs

Oussama Gabouj
Founder

CTO @ compresr.ai. Previously worked in research at EPFL’s DLab and AXA, focusing on efficient ML systems and prompt compression.

Ivan Zakazov
Founder

CEO @ Compresr. Previously researched LLM context compression as an EPFL PhD (Switzerland). Former Microsoft and Philips Research.

Compresr
Tier: B Tier
Batch: Winter 2026
Team Size: 4
Status: Active
Location: San Francisco, CA, USA