RIVER What / The framework

RIVER, defined.

RIVER is a framework for measuring and operating the full value chain from idea to impact in organizations that practice progressive, controlled, and reversible release. It treats framework, artifact, and metrics as a single coherent operating discipline, not a dashboard.

The definition

The framework, on its own terms.

Definition
RIVER names, types, and measures the segments of the value chain past deploy, organized around release delta as the unit of analysis. No other framework operates those segments release by release.

RIVER encompasses the four DORA metrics as the reference instrumentation for the commit-to-deploy segment; there is no RIVER metric that replaces a DORA metric. What RIVER adds is a per-release operating artifact, and measurement scoped to that artifact, for the segments past deploy.

RIVER is philosophy-coupled to the separation of deploy and release. It does not make sense in organizations that ship in a single binary event from commit to user. It is tool-neutral in expression: it can be instrumented on any stack that produces the required data.

Worldview

The five commitments RIVER is built on.

These commitments describe where the industry's most capable organizations already operate, and they prescribe the direction for organizations moving toward that practice. They are not negotiable within the framework.

01
Release is progressive and controlled.
A staged exposure of functionality across cohorts over time, with explicit control at each stage. The question is not whether a release happened but where it has reached.
02
Targeting is the unit of control.
Release is to a segment, percentage, tier, geography, customer class, or experimental arm. "Production" is a destination, not the unit at which release decisions are made.
03
Reversibility is native.
A release can be pulled back in seconds without a redeploy. The cost of trying a release is approximately zero; the cost of being wrong is the time spent in the degraded state.
04
Experimentation is intrinsic to release.
The same mechanism that exposes a feature can measure whether it worked. Shipping and learning are one act.
05
Release is a product decision, not just an engineering one.
Targeting rules, success criteria, and cohort definitions are co-owned by product and engineering, with operations accountable for guardrails.
Release delta

The unit of analysis.

The central abstraction of RIVER is the release delta: a declared, structured artifact created before a release begins.

Release delta / R-2026-0142 / Declared 7 May 2026, 14:22 UTC
Type: Growth
Hypothesis: Surfacing onboarding-step progress in the dashboard navigation will lift the share of new workspace admins who reach Day-7 activation.
Success signal: Day-7 activation rate among exposed admins rises by 6 percentage points or more over the holdout, sustained across a 14-day rolling window.
Target cohort: New workspace admins created in the trailing 30 days. Exposure progresses 5%, then 25%, then 100%, conditional on guardrails.
Horizon: 21 days from start of exposure.

Release delta is the unit of analysis: team-level, product-line-level, and organization-level rollups aggregate over deltas, not over deploys, flags, tickets, or story points. The act of declaring a delta before shipping is coercive, and the coercion is the point. Declaration forces three questions teams routinely avoid: what we believe will happen, how we will know, and by when.
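As a rough illustration, the declared artifact can be modeled as a small immutable record. This is a minimal sketch, not a RIVER schema: the class name, field names, and the `exposure_start` value are assumptions made for the example, and only the card contents come from the delta shown above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ReleaseDelta:
    """Illustrative record of a declared release delta (field names are assumptions)."""

    delta_id: str
    declared_at: datetime
    delta_type: str
    hypothesis: str
    success_signal: str
    target_cohort: str
    exposure_stages: tuple[float, ...]  # share of the cohort exposed at each stage
    horizon: timedelta  # counted from the start of exposure

    def declared_before(self, exposure_start: datetime) -> bool:
        """True when the declaration predates the start of exposure."""
        return self.declared_at < exposure_start

    def evaluation_due(self, exposure_start: datetime) -> datetime:
        """Moment by which the delta is judged: exposure start plus horizon."""
        return exposure_start + self.horizon


delta = ReleaseDelta(
    delta_id="R-2026-0142",
    declared_at=datetime(2026, 5, 7, 14, 22, tzinfo=timezone.utc),
    delta_type="Growth",
    hypothesis="Onboarding-step progress in the dashboard navigation lifts Day-7 activation.",
    success_signal="Day-7 activation rate rises by 6 percentage points or more over the holdout.",
    target_cohort="New workspace admins created in the trailing 30 days.",
    exposure_stages=(0.05, 0.25, 1.0),
    horizon=timedelta(days=21),
)

# Hypothetical exposure start; the card above does not state one.
exposure_start = datetime(2026, 5, 11, 9, 0, tzinfo=timezone.utc)

print(delta.declared_before(exposure_start))
print(delta.evaluation_due(exposure_start).date())
```

Freezing the record mirrors the idea that a declaration is fixed before shipping; whether a real implementation permits amendments is outside what this section specifies.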

The baseline, declared.

A success signal is itself a structured component: it declares the metric, the direction and magnitude of expected movement, the time window, and the baseline the movement is judged against. The baseline is the component teams most often leave implicit, and it is the one a declaration cannot survive without. Every delta declares its baseline at declaration time.

RIVER is design-aware and non-prescriptive about how the comparison is made. Evaluation methods span a spectrum, from randomized experimental arms through guarded progressive rollouts to purely observational attribution, and stronger designs produce stronger attribution. The framework names the spectrum; it does not mandate a position on it. Four baseline types cover the spectrum, ordered by the strength of attribution they support.

Concurrent Comparison
Judged against a group not exposed during the same window: a holdout, an experimental arm, a matched cohort. Strongest attribution; the comparison shares the delta's time window.
Forecast Comparison
Judged against a declared projection of the metric in the absence of the release, recorded at declaration. Strong when the forecast is honest; the projection is the contestable element.
Historical Comparison
Judged against the metric's own past: a trailing average, a prior period, a pre/post split. Weakest comparative attribution; still legitimate RIVER practice, named as what it is.
Absolute Threshold
Judged against a fixed value with no comparison group: error rate below a ceiling, latency under a bound. The native baseline for guardrail-driven, Risk Reduction, and Platform/Enablement deltas.

The ordering carries the claim: attribution is only as strong as the baseline it is judged against. An organization's distribution across baseline types is a fact about its evidentiary practice, not a ranking of its teams.
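The structure of a success signal and its baseline can be sketched as follows. Everything named here is hypothetical: the `BaselineType` ranks simply follow the order in which the four types are listed, `SuccessSignal` and `signal_met` are illustrative names, and the observed values are invented for the example. The check covers a single window and does not model the "sustained across a rolling window" condition.

```python
from dataclasses import dataclass
from enum import IntEnum


class BaselineType(IntEnum):
    """Ranks follow the listed order; a higher value means stronger attribution."""

    ABSOLUTE_THRESHOLD = 1
    HISTORICAL_COMPARISON = 2
    FORECAST_COMPARISON = 3
    CONCURRENT_COMPARISON = 4


@dataclass(frozen=True)
class SuccessSignal:
    """Illustrative success signal: metric, direction, magnitude, window, baseline."""

    metric: str
    direction: int  # +1 if the metric should rise, -1 if it should fall
    magnitude: float  # required movement, in the metric's own units
    window_days: int
    baseline_type: BaselineType


def signal_met(signal: SuccessSignal, observed: float, baseline: float) -> bool:
    """Compare observed movement against the declared baseline and magnitude."""
    movement = (observed - baseline) * signal.direction
    return movement >= signal.magnitude


# Concurrent comparison: exposed cohort vs. holdout, in percentage points.
activation = SuccessSignal(
    metric="day7_activation_rate_pct",
    direction=+1,
    magnitude=6.0,
    window_days=14,
    baseline_type=BaselineType.CONCURRENT_COMPARISON,
)

# Absolute threshold: the baseline is a fixed ceiling and no movement is required.
error_ceiling = SuccessSignal(
    metric="error_rate_pct",
    direction=-1,
    magnitude=0.0,
    window_days=7,
    baseline_type=BaselineType.ABSOLUTE_THRESHOLD,
)

print(signal_met(activation, observed=41.0, baseline=34.0))
print(signal_met(activation, observed=39.5, baseline=34.0))
print(signal_met(error_ceiling, observed=0.8, baseline=1.0))
```

Keeping the baseline type on the signal itself means a rollup can report an organization's distribution across baseline types without inspecting how each comparison was computed.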

Delta taxonomy

Six types, each measured against its own standard.

Not every release is supposed to move revenue. The type is assigned at declaration time.

Growth
Acquire, activate, or convert users. Success: movement in funnel metrics within the exposed cohort.
Retention / Engagement
Deepen usage among existing users. Success: frequency, depth, or stickiness in the exposed cohort.
Monetization
Expand revenue per user, unlock new revenue, or convert free to paid. Success: revenue movement within the exposed cohort.
Experience / Quality
Improve the existing experience without changing what the product does. Success: satisfaction, ticket reduction, task completion.
Platform / Enablement
Make future work faster, safer, more reliable. The user is another engineer or team. Success: downstream velocity, incident rate, internal adoption.
Risk Reduction
Compliance, security, resilience, regulatory exposure. Success: reduction in the targeted risk, not growth.
Adoption, layered

Three signals, not one.

The word "adoption" in common use collapses three distinct phenomena with different time horizons and different evidentiary value.

Figure: Adoption signals / resolution and strength. Horizontal axis: time to resolve (hours, days, weeks, months). Vertical axis: strength. First-use (visibility signal): has the user touched it at all? Sustained-use (workflow signal): is it used repeatedly over weeks? Value-realization (primary RIVER metric): did it produce the outcome it was built for? Earlier signals resolve faster; later signals carry more weight. All three signals are measured within the exposed cohort, not the total user base.

First-use resolves in hours to days; it is evidence of visibility, not value. Sustained-use resolves in weeks; the feature has found a place in real workflows. Value-realization resolves in weeks to months; the user has completed the action the feature was built to enable, defined per-delta by the declared success signal.
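One way to keep the three signals separate is to compute each as a share of the exposed cohort. The sketch below assumes user-ID sets are available for each signal; the function name and the rate definitions are assumptions for illustration, not Glossary definitions.

```python
def adoption_rates(
    exposed: set[str],
    first_use: set[str],
    sustained_use: set[str],
    value_realized: set[str],
) -> dict[str, float]:
    """Return each adoption signal as a share of the exposed cohort.

    Users outside the exposed cohort are ignored, so the denominator is
    always the exposed cohort rather than the total user base.
    """
    if not exposed:
        raise ValueError("exposed cohort is empty")
    n = len(exposed)
    return {
        "first_use": len(first_use & exposed) / n,
        "sustained_use": len(sustained_use & exposed) / n,
        "value_realization": len(value_realized & exposed) / n,
    }


# Hypothetical user IDs; "z" touched the feature but was never in the exposed cohort.
rates = adoption_rates(
    exposed={"a", "b", "c", "d"},
    first_use={"a", "b", "c", "z"},
    sustained_use={"a", "b"},
    value_realized={"a"},
)
print(rates)
```

Intersecting with the exposed set before counting is the design choice that matters here: it prevents activity from users outside the cohort from inflating any of the three rates.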

Metric families

Seven families, mapped to the value chain.

The asymmetry with DORA's four is intentional. Named representative metrics are version-one candidates, subject to empirical refinement. Definitions live in the Glossary.

01
Exposure
How long deployed code sits unreleased and how quickly it moves to first user exposure.
Feature Dark Time
02
Cohort Progression
How exposure moves through its target cohort and whether cohorts advance smoothly or are blocked.
Rollout Velocity
03
Reversal
How often releases are reversed through kill-switches, flag-offs, or targeting rollbacks. The release-layer analog of change failure rate, which DORA does not measure.
Release Reversal Rate
04
Guardrail
How often automated guardrails paused or reversed a release in response to a monitored metric.
Guarded Release Activation Rate
05
Experiment-Linked
How systematically releases are tied to declared hypotheses and to outcome movement after the fact. These metrics measure RIVER's own adoption.
Hypothesis Attachment Rate
06
Adoption
First-use, sustained-use, and value-realization within the exposed cohort, defined per-delta.
Value-Realization Rate
07
Impact
Whether adopted features moved the business metrics they were built to move. The framework's terminal measurement.
Outcome Realization Rate
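Two of the representative metrics can be sketched from the family descriptions alone. The formulas below are assumed readings, not the Glossary definitions: Feature Dark Time is taken as the interval from deploy to first user exposure, and Release Reversal Rate as reversed releases divided by all releases in the period. Function names and sample values are hypothetical.

```python
from datetime import datetime, timedelta


def feature_dark_time(deployed_at: datetime, first_exposed_at: datetime) -> timedelta:
    """Assumed reading: time deployed code sat unreleased before first exposure."""
    if first_exposed_at < deployed_at:
        raise ValueError("first exposure cannot precede deploy")
    return first_exposed_at - deployed_at


def release_reversal_rate(was_reversed: list[bool]) -> float:
    """Assumed reading: share of releases in the period that were reversed."""
    if not was_reversed:
        raise ValueError("no releases in the period")
    return sum(was_reversed) / len(was_reversed)


# Hypothetical timestamps and outcomes for illustration only.
dark = feature_dark_time(
    deployed_at=datetime(2026, 5, 4, 10, 0),
    first_exposed_at=datetime(2026, 5, 7, 16, 0),
)
rate = release_reversal_rate([False, True, False, False])

print(dark)
print(rate)
```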
Maturity ladder

Five levels. Teams climb into them.

RIVER is a measurement framework and a maturity practice. Teams do not adopt RIVER in a single act; they climb into it over quarters or years through five levels: Deploy, Control, Declare, Prove, Learn. The transition from Control to Declare is the hardest: it is the point where measurement stops being instrumentation and starts being ceremony.

The ladder is a language, not a ranking. An organization that can say "we are at Control on most teams and Declare on two pilot teams" has a more tractable conversation about what to invest in next than one that can only say "we are working on our metrics."
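A statement like the one quoted above can be produced from a simple mapping of teams to levels. This is a minimal sketch; the `Level` enum mirrors the five named levels, while `summarize` and the team names are hypothetical.

```python
from collections import Counter
from enum import IntEnum


class Level(IntEnum):
    """The five levels of the ladder, numbered as in the text."""

    DEPLOY = 1
    CONTROL = 2
    DECLARE = 3
    PROVE = 4
    LEARN = 5


def summarize(team_levels: dict[str, Level]) -> dict[str, int]:
    """Count teams per level, in ladder order, omitting levels with no teams."""
    counts = Counter(team_levels.values())
    return {level.name.title(): counts[level] for level in Level if counts[level]}


# Hypothetical teams for illustration.
teams = {
    "payments": Level.CONTROL,
    "search": Level.CONTROL,
    "platform": Level.CONTROL,
    "growth": Level.DECLARE,
    "onboarding": Level.DECLARE,
}
print(summarize(teams))
```

The summary reports a distribution rather than a single score, which matches the use of the ladder as a shared vocabulary instead of a ranking.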

Figure: The ascent / five levels of practice, with impact per release rising toward the higher levels. 1 Deploy (delivery hygiene), 2 Control (deploy ≠ release), 3 Declare (delta declared), 4 Prove (outcomes proven), 5 Learn (learning loop, compounds).

1
Deploy / Delivery hygiene

The team ships reliably. Delivery hygiene is in place: deployment frequency, lead time for changes, change failure rate, and time to restore are tracked and trending in the right direction. This is the foundation of a good release practice, not a deficient state to be escaped.

2
Control / Deploy ≠ release

Deploy and release are separate events. The team rolls out progressively, targets specific cohorts, and can reverse a release in seconds without a redeploy. Release-layer metrics exist but aren't consistent across teams or releases.

3
Declare / Delta, ahead of ship

Before a release ships, the team states what it expects to happen: hypothesis, success signal, target cohort, time horizon. Some releases get measured against what was declared. The discipline is emerging, not yet uniform.

4
Prove / Systematic attribution

Outcome attribution is systematic. Most releases carry a declared success signal and are evaluated against it after the fact. The team builds a durable record of which deltas realized and which didn't. Not every hypothesis will hold; the record's credibility depends on its honesty.

5
Learn / The compounding loop

Evidence from realized and unrealized deltas feeds the next cycle. The team's predictions sharpen over time, not just its measurements. This is the compounding loop: the "Evolution" in RIVER. It is rare in current industry practice.

Value chain / framework-wide: Idea → Commit → Deploy → Release → Adoption → Impact