One pipeline, many agents, one human signature.
Agentic VC is the operating front-end of a longer research program at Kalmantic AI Labs: how to allocate capital — USD, tokens, compute — to applicants that may themselves be agents. This page is the full mechanical description.
Agents outperform humans in our simulations.
Across the cohorts we have simulated to date, an agent-underwritten portfolio reaches roughly 2× the True ROI of a human-allocator baseline by month 12 — driven not by faster decisions, but by better choke-point detection and tighter capital mixes. Methodology and per-series detail are below.
Fig. S — Phase-0 cohort simulation, 12 months. Shaded band = breakout window. Green line = agent-cohort True ROI. Faint dotted line = human-allocator baseline.
The first three to six months are an oscillating search phase — false starts, mis-priced compute, premature product loops. The break comes when the team identifies its actual bottleneck and the agent re-weights the capital mix toward resolving it. The human baseline, run with the same applicant pool but a conventional partner-driven check, climbs steadily but never breaks out in the same window. Methodology is open and lives in the lab's repository.
New data points. New capital stack. Two structural advantages.
The outperformance is not because the agent is "smarter" than a partner. It is because the agent operates on a richer substrate — it sees more signals, and it allocates over more dimensions than money. Both edges compound across a cohort.
Agents instrument what partners cannot.
- Early-entry signals — pre-MVP commit cadence, agent-run trace quality, build-plan revision velocity.
- Early-exit signals — silent acquihires, recursive forks, capacity unwinds that never make it into a press release.
- Write-off signals — patterns that precede a kill decision by months: token-burn ratios, eval drift, choke-point regression.
- Provenance signals — who actually wrote what, which agent did which run, what was reproduced and what was claimed.
Humans see headlines. Agents see the substrate that produces them.
The thing being allocated is becoming multi-dimensional.
- USD — still the unit of account, but increasingly a wrapper around the other three.
- Tokens — protocol credits, API usage, model access. Often the most leveraged dollar a team can deploy.
- Energy — GPU-hours, kWh, regional capacity. The unforgeable cost floor of agent-first teams.
- Secrets from research — pre-publication results, eval suites, fine-tuning corpora, model weights under embargo.
A partner prices one dimension well. An agent reasons over the vector.
These two edges show up in the chart as a faster identification of the choke point (Reason A) and a tighter capital mix once that choke point is identified (Reason B). The combination is what bends the True ROI curve away from the human baseline at month 5–6 — not speed, not headcount, not access.
Application to wire, in days.
Six stops. Five of them run on agents. The sixth is always a named human partner.
Fig. C — Funding pipeline. Filled square = human · hollow square = agent · circle = capital event.
The application
A submission contains: applicant identity (human, hybrid, or agent), the problem and build plan, the capital ask broken down across USD / tokens / GPU, a milestone you'll commit to, and links to any public artifacts (repo, demo, prior agent run logs).
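In code, a submission might look like the following sketch; the field names and types are illustrative, not the fund's actual intake schema:

```python
from dataclasses import dataclass, field

@dataclass
class CapitalAsk:
    usd: float          # cash leg, in dollars
    tokens: float       # protocol / API credits, in dollar value
    gpu_hours: float    # compute leg

@dataclass
class Application:
    applicant_kind: str        # "human" | "hybrid" | "agent"
    problem: str               # what is being built, and why
    build_plan: str            # how the milestone will be reached
    ask: CapitalAsk            # the requested mix, per leg
    milestone: str             # the commitment the check is priced against
    artifacts: list[str] = field(default_factory=list)  # repo, demo, run-log links
```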
The underwriting agent
The lead agent evaluates four things: thesis fit, build feasibility, capital-mix sanity (does the ask match the work?), and prior history with us. It produces a written rationale, not a black-box score.
Valuation
The agent proposes a number with its inputs visible — comps, milestone-risk, dilution implied by the requested mix. The proposal is a starting point for negotiation, not a take-it-or-leave-it.
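As a sketch of what "inputs visible" can mean in practice (the comp-anchoring and discount arithmetic here are assumptions for illustration, not the fund's actual model):

```python
from statistics import median

def propose_valuation(comp_valuations: list[float],
                      milestone_risk: float,   # 0.0 (safe) .. 1.0 (coin flip)
                      total_ask: float) -> dict:
    anchor = median(comp_valuations)                  # comps input, visible
    proposed = anchor * (1.0 - 0.5 * milestone_risk)  # risk discount, visible
    dilution = total_ask / (proposed + total_ask)     # implied by the requested mix
    return {
        "anchor_comps": anchor,
        "milestone_risk": milestone_risk,
        "proposed_valuation": proposed,
        "implied_dilution": dilution,   # a starting point for negotiation
    }
```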
Human oversight
Nothing wires without a named human partner's signoff. The full audit trail — every agent action, every human approval — is timestamped and retained.
Disbursement
Each leg uses its own rail: USD via bank transfer to the founder or human guardian, tokens via on-chain transfer to a controlled wallet, GPU via compute-credit allocation.
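A minimal dispatch sketch, with the three rails stubbed out; the function names are placeholders for the bank, on-chain, and compute-credit integrations described above:

```python
def wire_usd(amount: float) -> None: ...        # bank transfer rail (stub)
def transfer_tokens(amount: float) -> None: ... # on-chain rail (stub)
def allocate_compute(hours: float) -> None: ... # compute-credit rail (stub)

def disburse(usd: float, tokens: float, gpu_hours: float, signed_off_by: str) -> None:
    # No wire without a named human partner (see Human oversight above).
    assert signed_off_by, "missing partner sign-off"
    if usd > 0:
        wire_usd(usd)                 # to the founder or human guardian
    if tokens > 0:
        transfer_tokens(tokens)       # to a controlled wallet
    if gpu_hours > 0:
        allocate_compute(gpu_hours)   # compute-credit allocation
```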
Reporting
Funded entities check in monthly with the agent. Quarterly, a human partner reviews the cohort. Misuse of token or GPU grants pauses the relationship.
One GP-Agent, six specialists, zero black boxes.
The lead agent is a router and a synthesizer. It does not score applications by itself. It delegates to single-purpose sub-agents — each with a narrow brief, its own eval set, and a replaceable model — and assembles their outputs into a written recommendation a human partner can read in ten minutes.
Fig. A — Agent topology. Sub-agents are deliberately narrow so they can be swapped, re-trained, or red-teamed independently.
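A minimal sketch of the router-and-synthesizer shape, with each sub-agent stubbed as a plain function so it can be swapped independently; the names and return values are illustrative, not the production stack:

```python
SUB_AGENTS = {
    "intake":     lambda app: {"normalized": app},     # stubs standing in for
    "diligence":  lambda app: {"thesis_fit": "high"},  # narrow, single-purpose
    "valuation":  lambda app: {"proposed": 1_000_000}, # sub-agents, each with
    "treasury":   lambda app: {"mix": "60/25/15"},     # its own eval set and a
    "compliance": lambda app: {"kyc": "pass"},         # replaceable model
    "reporting":  lambda app: {"cadence": "monthly"},
}

def recommend(application: dict) -> dict:
    # The lead agent routes to every specialist, then assembles the findings
    # into a written recommendation a human partner can read.
    findings = {name: agent(application) for name, agent in SUB_AGENTS.items()}
    return {"findings": findings, "rationale": "written synthesis goes here"}
```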
Intake
Normalize the application, dedupe against history, surface inconsistencies.
Diligence
Three parallel passes: thesis fit, build feasibility, team / agent track record.
Valuation
Propose a number with explicit comps and milestone-risk discount. Show its work.
Treasury
Recommend the USD / tokens / GPU mix that matches the build plan's actual cost structure (a minimal sketch follows this list).
Compliance
KYC the applicant (human, guardian, or agent's controller). Run sanctions and conflict checks.
Reporting
Monthly check-ins. Flags misuse of token or GPU grants for partner review.
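The Treasury brief, sketched under one simplifying assumption: if the build plan can be costed per leg, the recommended mix can simply mirror where the plan actually spends. Real treasury logic would also weigh token-mix elasticity and GPU price curves.

```python
def recommend_mix(plan_costs: dict[str, float]) -> dict[str, float]:
    # Weight each leg by its share of the plan's projected spend.
    total = sum(plan_costs.values())
    return {leg: cost / total for leg, cost in plan_costs.items()}

# e.g. a plan that is mostly compute:
# recommend_mix({"usd": 20_000, "tokens": 30_000, "gpu": 50_000})
# -> {"usd": 0.2, "tokens": 0.3, "gpu": 0.5}
```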
Human-in-the-loop now. Autonomous, carefully, later.
The fund runs in one of two modes per decision. Phase 0 runs 100% human-in-the-loop (HIL). Autonomous mode is gated behind a transparent criteria set we publish — not a private switch.
HIL mode (Phase 0)
- Every wire requires a named partner's signature.
- The agent recommends; the human ratifies, rejects, or sends back for re-work.
- Disagreements between agent and human are logged with rationale on both sides.
- Latency target: 72h from completed application to decision.

Autonomous mode (gated)
- Pre-approved applicants under a small, public cap can be wired by the agent.
- Requires N successful HIL rounds with the same agent stack and the same partner panel.
- Every autonomous wire is reviewed within 24h by a human; reversibility is preserved.
- Mode change requires a published policy update — not a config flag.
The two modes share the same agent stack, the same audit trail, and the same eval suite. The only difference is whether a human signature is required before the wire or after it.
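As a sketch, the autonomous gate reduces to a conjunction of the published criteria; the parameter names and thresholds here are placeholders, not fund policy:

```python
def autonomous_wire_allowed(applicant_pre_approved: bool,
                            amount_usd: float,
                            public_cap_usd: float,
                            successful_hil_rounds: int,
                            required_rounds: int,
                            policy_update_published: bool) -> bool:
    # Every condition must hold; failing any of them routes the
    # decision back to HIL mode.
    return (applicant_pre_approved
            and amount_usd <= public_cap_usd
            and successful_hil_rounds >= required_rounds
            and policy_update_published)  # never just a config flag
```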
Four eras of capital allocation. We are at the start of era three.
Humans fund humans
Agents do not exist as economic actors. Capital flows person-to-person, gated by partner intuition.
Humans fund humans using agents
Funds adopt LLMs for triage and memo drafting. The underwriting decision is still entirely human.
Agents fund humans and hybrids
The underwriting agent runs the diligence and writes the recommendation. A human partner signs every wire. Phase 0 sits here.
Agents fund agents
A software entity is the primary applicant. Capital lands in wallets and compute accounts it controls. A human guardian remains on file; a human partner signs the wire.
Each era subsumes the previous one. Era 4 does not eliminate human founders — it adds a second class of applicants. Symmetrically, the long-term thesis is that the LP side will follow the GP side with a multi-year lag.
An underwriting agent has to be auditable. Auditable, for us, means open.
Kalmantic AI Labs maintains the underwriting stack in the open. Anyone can read how a decision was reached, reproduce a diligence pass on their own machine, or fork the stack for a different thesis. Closed-source allocation agents are, in our view, a non-starter for a category that doesn't exist yet.
Public stack
Sub-agent prompts, eval suites, scoring rubrics, and the orchestration code live in a public repository alongside the website.
Reproducible decisions
Every diligence pass produces a redacted, hash-anchored trace. Applicants can replay the exact run that led to their outcome; a sketch of the anchoring follows below.
Forkable thesis
The thesis layer is a swappable module. Other funds can run the same machinery against a different worldview without re-implementing the plumbing.
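A minimal sketch of the hash-anchoring, assuming a flat trace and simple key-level redaction; the real trace format is whatever the repository publishes:

```python
import hashlib
import json

def anchor_trace(trace: dict, redact: set[str]) -> tuple[dict, str]:
    # Drop redacted keys, then hash a canonical serialization of the rest.
    redacted = {k: v for k, v in trace.items() if k not in redact}
    digest = hashlib.sha256(
        json.dumps(redacted, sort_keys=True).encode()
    ).hexdigest()
    return redacted, digest   # the digest is what gets anchored and published

def verify_replay(replayed: dict, published_digest: str, redact: set[str]) -> bool:
    # A replayed run matches iff its redacted trace hashes to the same digest.
    return anchor_trace(replayed, redact)[1] == published_digest
```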
Six series per cohort. Two of them are why the agent wins.
The simulation above plots six series. The first three are table-stakes for any VC dashboard. The two that follow are non-standard — they are where the agent's edge actually comes from. The sixth is the human-allocator baseline used for comparison.
Revenue
Recognized topline. Decoupled from valuation in early months.
Compute / token burn
Watched for premature scale — we'd rather see efficiency improvements before throughput.
Implied valuation
Anchored on the agent's prior comps, not on the founder's last round.
Choke-point research signal
Did the team identify the actual bottleneck? Did they publish, instrument, or reduce it? Leads revenue by ~2 months in our simulations.
True ROI (counterfactual-adjusted)
Realized return adjusted for compute-price decay and the counterfactual: would this work have happened without the check? Headline series on the chart; a formula sketch follows below.
Human-allocator baseline
Same applicant pool, conventional partner-driven check, no agent in the loop. Climbs steadily but doesn't break out — this is what the agent cohort is measured against.
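One illustrative reading of the True ROI definition above; the deflator and counterfactual terms are assumptions about the methodology, not its published form:

```python
def true_roi(realized_value: float,
             counterfactual_value: float,   # estimated value with no check
             check_size: float,
             compute_price_index: float) -> float:
    # Deflate the check by compute-price decay (an index below 1.0 means
    # the same dollars buy more compute now than at funding time).
    adjusted_cost = check_size * compute_price_index
    # Credit only the value the check actually caused.
    marginal_value = realized_value - counterfactual_value
    return (marginal_value - adjusted_cost) / adjusted_cost
```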
Open-source MoE models, dedicated to the allocation domain.
The current stack uses general-purpose frontier models behind each sub-agent. That is a transitional choice. The lab's roadmap is to publish a family of Mixture-of-Experts models tuned specifically for capital allocation in the agent era — released openly under a permissive license.
Thesis-fit expert
Maps applications to a fund's stated thesis with calibrated confidence. Designed to be swappable per fund.
Valuation-comp expert
Trained on a public corpus of agent-native deal comps, with explicit uncertainty bands rather than point estimates.
Capacity-planning expert
Reasons over GPU price curves, token-mix elasticity, and the cost structure of agent-first teams.
Choke-point detector
Reads a build plan and predicts the most likely bottleneck — and the research move that would resolve it.
Counterfactual ROI
Estimates the marginal contribution of a check vs. the no-funding world. Powers the True ROI series on the cohort chart.
Eval harness
A public benchmark suite for allocation agents — released alongside the models, so other funds can compare apples to apples.
Release cadence, model sizes, and the license text will be published on the lab's repository. Nothing on this page commits the fund to use the lab's models exclusively — the sub-agent architecture is intentionally model-agnostic.