Branno

The experimentation primitive. Branno runs multi-armed bandit experiments over any decision the platform makes — a price, a ui_view, an offer, a config value — allocates each cohort to an arm with cohort-stable assignment, attributes outcomes back to the arm that earned them, and converges on a winner. It closes the config-as-code loop: a Radiant proposal can ship as a live experiment and merge itself once a variant wins.

Status: Live and wired end-to-end into pricing. Hober calls allocate when a line is added to an order and attributes the realized revenue back when the order closes (durably, through the outbox relay). A continuous-reward Normal-Gamma allocator now sits alongside the binary Thompson one. An auto-convergence detector runs hourly on the platform scheduler and settles experiments that clear their confidence threshold; for an experiment started from a Radiant proposal, that settlement merges the winning variant with no human in the loop. Demerzel surfaces the convergence verdict per experiment.

What it owns

Branno owns the lifecycle of an experiment: which variants compete (Arms), how a given cohort is assigned to one of them (Allocation), how outcomes accrue against each arm, and when the experiment converges. The hot path is two calls — allocate at the decision point, attribute when the outcome lands — both designed to be cheap and idempotent.

Branno does not own the thing being tested. The asset lives in Radiant; the metric definition lives in Pelorat; the decision surfaces are Hardin pricing, Magnifico campaigns, and render-runtime ui_views. Branno only decides which arm and records how it did.

Concepts

Experiment: The container. Carries a decisionKind (PRICE, UI_VIEW, OFFER, CONFIG), a decisionPoint (the opaque route key the allocator matches on), an allocatorKind, the outcomeMetric + outcomeDirection (MAXIMIZE / MINIMIZE), and convergence guards (minSamplesPerArm, confidenceThresholdPermille). State machine: DRAFT → RECRUITING → CONVERGED, with TERMINATED as the emergency branch. A killSwitch falls all traffic back to control.
Arm: One competing variant. Exactly one arm per experiment is the control (isControl). Each arm carries a decision-kind-specific payload (the price, the ui_view id, the offer) and accumulates sufficient statistics — observations, rewardSum, rewardSumsq — that the allocator's posterior reads from directly.
Allocation: A persisted cohort → arm assignment. A UNIQUE(experimentId, cohortKind, cohortKey) constraint is the whole trick: the same cohort (CUSTOMER, ORDER, SESSION, or DEVICE) always resolves to the same arm, so a customer never sees the price flicker between page loads. The row is also where the outcome lands (outcomeValue, outcomeSourceType, outcomeSourceId).
Allocators: Pluggable assignment strategies: Thompson sampling (Beta-Bernoulli posterior draw, for binary rewards — the default), Normal-Gamma Thompson (Gaussian posterior for continuous rewards like revenue-per-order — scale-invariant, no hyperparameters), UCB1 (upper-confidence-bound exploration), ε-greedy, and fixed-split (deterministic hash buckets per targetSharePermille). Status drives behaviour: DRAFT allocates nothing, RECRUITING runs the strategy, CONVERGED serves the winner, TERMINATED / killSwitch serves control.
Attribution: Push model: a caller POSTs an outcome with the cohort key and a source reference; Branno finds the matching Allocation, stamps the outcome, and bumps the arm's statistics. Idempotent on (outcomeSourceType, outcomeSourceId) so a redelivered event never double-counts, and an attributionWindowSeconds guard drops outcomes that land too late to be credited.
Convergence & the Radiant handoff: An experiment converges in one of two ways: an operator settles it by hand, or the auto-convergence detector does. The detector is an hourly scheduler job that, for Thompson-family experiments, checks that every arm has reached minSamplesPerArm and that the leader's win-probability (estimated from a Monte-Carlo sample of the posteriors) clears confidenceThresholdPermille. Either way, if the experiment was started from a Radiant proposal (sourceProposalId), Branno calls back into Radiant: a winning variant merges its payload as a new AssetVersion; a winning control closes the proposal as "bandit lost". The proposal sat APPROVED the whole time; the bandit, not a human, decided the merge.
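To make the Thompson allocator and the convergence check concrete, here is a minimal sketch of both for binary rewards. It is an illustration under stated assumptions, not Branno's implementation: the uniform Beta(1, 1) prior, the 10,000-draw Monte-Carlo budget, the MAXIMIZE direction, and every function and class name below are hypothetical; only observations, rewardSum, minSamplesPerArm, and confidenceThresholdPermille come from the concepts above.

```python
from __future__ import annotations

import random
from dataclasses import dataclass


@dataclass
class ArmStats:
    arm_id: str
    observations: int   # number of attributed outcomes
    reward_sum: float   # for binary rewards, the number of successes


def sample_posterior(arm: ArmStats, rng: random.Random) -> float:
    """Draw one plausible success rate from the arm's Beta posterior."""
    successes = arm.reward_sum
    failures = arm.observations - arm.reward_sum
    # Assumption: uniform Beta(1, 1) prior; the source does not state the prior.
    return rng.betavariate(1 + successes, 1 + failures)


def thompson_pick(arms: list[ArmStats], rng: random.Random) -> str:
    """Thompson sampling: one posterior draw per arm, serve the best draw."""
    return max(arms, key=lambda a: sample_posterior(a, rng)).arm_id


def win_probabilities(arms: list[ArmStats], rng: random.Random,
                      draws: int = 10_000) -> dict[str, float]:
    """Monte-Carlo estimate of how often each arm has the best posterior draw."""
    wins = {a.arm_id: 0 for a in arms}
    for _ in range(draws):
        wins[thompson_pick(arms, rng)] += 1
    return {arm_id: count / draws for arm_id, count in wins.items()}


def check_convergence(arms: list[ArmStats], min_samples_per_arm: int,
                      confidence_threshold_permille: int,
                      rng: random.Random) -> str | None:
    """Return the winning arm id, or None while either guard is unmet."""
    if any(a.observations < min_samples_per_arm for a in arms):
        return None
    probs = win_probabilities(arms, rng)
    leader = max(probs, key=lambda arm_id: probs[arm_id])
    if probs[leader] * 1000 >= confidence_threshold_permille:
        return leader
    return None
```

The same posterior draw serves both purposes: a single draw per arm picks who gets the next cohort, and many repeated draws estimate the leader's win-probability. A MINIMIZE experiment would take the smallest draw instead of the largest.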

API surface

All endpoints are versioned under /branno/v1/, return RFC 7807 problem details on error, and read tenantId from the bearer token. The lifecycle endpoints are operator-facing; allocate and attribute are the hot path.

Branno endpoints in the Foundation API reference: OpenAPI 3.1 schema for Branno with request/response shapes, parameters, and a try-it client.

Quick reference

Method	Path	Purpose
POST / GET	`/branno/v1/experiments`	Create a `DRAFT` experiment, or list experiments by status.
GET / PATCH / DELETE	`/branno/v1/experiments/{id}`	Fetch, edit (`DRAFT` only), or delete an experiment.
POST / GET	`/branno/v1/experiments/{id}/arms`	Add or list arms (edits gated to `DRAFT`; exactly one control).
POST	`/branno/v1/experiments/{id}/start`	Transition `DRAFT → RECRUITING` (requires ≥2 arms, one control).
POST	`/branno/v1/experiments/{id}/converge`	Settle on a winning arm; fires the Radiant handoff if started from a proposal.
POST	`/branno/v1/experiments/{id}/terminate`	Emergency stop — flips the killSwitch and falls all traffic back to control.
POST	`/branno/v1/allocate`	Hot path: resolve the active experiment for a `decisionPoint` + cohort, return the arm + payload + `allocationId`.
POST	`/branno/v1/attribute`	Hot path: stamp an outcome onto an allocation and bump the arm's statistics.
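The lifecycle rules in the table (edits gated to DRAFT, exactly one control, at least two arms before start) can be sketched as a small state machine. This is an in-memory illustration, not the service's code: the class names, the LifecycleError type, and the choice to allow terminate only from RECRUITING are assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Status(Enum):
    DRAFT = "DRAFT"
    RECRUITING = "RECRUITING"
    CONVERGED = "CONVERGED"
    TERMINATED = "TERMINATED"


class LifecycleError(Exception):
    """Hypothetical error type; the real API returns RFC 7807 problem details."""


@dataclass
class Arm:
    arm_id: str
    is_control: bool = False


@dataclass
class Experiment:
    status: Status = Status.DRAFT
    kill_switch: bool = False
    arms: list[Arm] = field(default_factory=list)
    winner: Optional[str] = None

    def add_arm(self, arm: Arm) -> None:
        if self.status is not Status.DRAFT:
            raise LifecycleError("arms can only be edited in DRAFT")
        if arm.is_control and any(a.is_control for a in self.arms):
            raise LifecycleError("an experiment has exactly one control")
        self.arms.append(arm)

    def start(self) -> None:
        if self.status is not Status.DRAFT:
            raise LifecycleError("only a DRAFT experiment can start")
        controls = sum(a.is_control for a in self.arms)
        if len(self.arms) < 2 or controls != 1:
            raise LifecycleError("start requires >= 2 arms, exactly one control")
        self.status = Status.RECRUITING

    def converge(self, winning_arm_id: str) -> None:
        if self.status is not Status.RECRUITING:
            raise LifecycleError("only a RECRUITING experiment can converge")
        if winning_arm_id not in {a.arm_id for a in self.arms}:
            raise LifecycleError("winner must be one of the experiment's arms")
        self.status = Status.CONVERGED
        self.winner = winning_arm_id

    def terminate(self) -> None:
        # Assumption: terminate is only valid while RECRUITING.
        if self.status is not Status.RECRUITING:
            raise LifecycleError("only a RECRUITING experiment can terminate")
        self.kill_switch = True  # all traffic falls back to control
        self.status = Status.TERMINATED
```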

Example: allocate, then attribute

This is the live pricing loop. When a line is added to a Hober order, Hober asks Branno which price arm the order gets — keyed on the order, so re-adding the same item never flickers the price:

POST /branno/v1/allocate
Content-Type: application/json
Authorization: Bearer <token>

{
  "decisionPoint": "hardin.price.pv_01J8K2…",
  "cohortKind":    "ORDER",
  "cohortKey":     "ord_01JAZB…"
}
→ 200 OK
{
  "experimentId": "expt_01JAZB…",
  "armId":        "arm_01JAZC…",
  "allocationId": "aloc_01JAZD…",
  "isControl":    false,
  "payload":      { "priceCents": 5400 }
}
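A rough sketch of why the same cohort key keeps resolving to the same arm: the first allocate persists the assignment, and every later call reads it back. Here a dictionary stands in for the Allocation table and its UNIQUE(experimentId, cohortKind, cohortKey) constraint, and a fixed-split hash bucket stands in for whichever allocator the experiment uses. The SHA-256 hashing scheme and all function names are assumptions for illustration only.

```python
from __future__ import annotations

import hashlib

# Hypothetical in-memory stand-in for the Allocation table, keyed the same
# way as the UNIQUE(experimentId, cohortKind, cohortKey) constraint.
_allocations: dict[tuple[str, str, str], str] = {}


def bucket_permille(experiment_id: str, cohort_kind: str, cohort_key: str) -> int:
    """Map a cohort to a stable bucket in [0, 1000)."""
    raw = f"{experiment_id}:{cohort_kind}:{cohort_key}".encode("utf-8")
    digest = hashlib.sha256(raw).digest()
    return int.from_bytes(digest[:8], "big") % 1000


def fixed_split_pick(arms: list[tuple[str, int]], bucket: int) -> str:
    """arms is [(arm_id, target_share_permille), ...] with shares summing to 1000."""
    upper = 0
    for arm_id, share in arms:
        upper += share
        if bucket < upper:
            return arm_id
    raise ValueError("target shares must cover all 1000 buckets")


def allocate(experiment_id: str, cohort_kind: str, cohort_key: str,
             arms: list[tuple[str, int]]) -> str:
    """Return the persisted arm for this cohort, assigning one on first sight."""
    key = (experiment_id, cohort_kind, cohort_key)
    if key not in _allocations:
        # First call for this cohort: pick an arm and persist the choice.
        _allocations[key] = fixed_split_pick(arms, bucket_permille(*key))
    # Later calls never re-run the allocator, so the arm cannot flicker.
    return _allocations[key]
```

Because the stored row wins over any fresh computation, the assignment stays put even if the arm shares change after the first allocation.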

When the order closes, an outbox subscriber attributes the realized line revenue back to the arm. Idempotent on the source reference, so a redelivered close never double-counts:

POST /branno/v1/attribute
Content-Type: application/json
Authorization: Bearer <token>

{
  "decisionPoint":     "hardin.price.pv_01J8K2…",
  "cohortKind":        "ORDER",
  "cohortKey":         "ord_01JAZB…",
  "outcomeValue":      54.00,
  "outcomeSourceType": "hober_order",
  "outcomeSourceId":   "ord_01JAZB…"
}
→ 202 Accepted
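The attribution rules (find the allocation, drop late outcomes, never count a source reference twice, bump the arm's sufficient statistics) might look roughly like this in memory. It is a sketch, not the service's code: the Attributor class, its string return values, and the order in which the guards run are all assumptions; the field names mirror the concepts described earlier.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class ArmStats:
    observations: int = 0
    reward_sum: float = 0.0
    reward_sumsq: float = 0.0


@dataclass
class Allocation:
    arm_id: str
    allocated_at: float  # epoch seconds
    outcome_value: Optional[float] = None


class Attributor:
    """Hypothetical in-memory model of the attribute call."""

    def __init__(self, attribution_window_seconds: float) -> None:
        self.window = attribution_window_seconds
        self.allocations: dict[tuple[str, str], Allocation] = {}
        self.arms: dict[str, ArmStats] = {}
        self.seen_sources: set[tuple[str, str]] = set()

    def attribute(self, cohort_kind: str, cohort_key: str, value: float,
                  source_type: str, source_id: str, now: float) -> str:
        source = (source_type, source_id)
        if source in self.seen_sources:
            return "duplicate"  # redelivered event: no double-count
        allocation = self.allocations.get((cohort_kind, cohort_key))
        if allocation is None:
            return "no_allocation"
        if now - allocation.allocated_at > self.window:
            return "too_late"  # outside attributionWindowSeconds
        self.seen_sources.add(source)
        allocation.outcome_value = value
        stats = self.arms.setdefault(allocation.arm_id, ArmStats())
        stats.observations += 1
        stats.reward_sum += value
        stats.reward_sumsq += value * value
        return "recorded"
```

In a real store the duplicate check would more likely be a unique index on (outcomeSourceType, outcomeSourceId) than a set, so that concurrent redeliveries are rejected by the database rather than by application code.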

How it fits with the rest

flowchart LR
  R[Radiant proposal] -- banditExperimentId, on approve --> Br(Branno)
  DP[Decision point] -- allocate --> Br
  Br -- arm payload --> DP
  Out[Mallow / Hober outcome] -- attribute --> Br
  Br -- converge: winning arm --> R
  Pel[Pelorat] -. defines outcome metric .-> Br

Branno sits one layer above the things it tests and is wired to Radiant through shared interfaces in both directions: an approved Radiant proposal carrying a banditExperimentId hands off to Branno to start the experiment instead of merging immediately, and a converged Branno experiment hands back to Radiant to complete or close that same proposal. Pelorat names the outcome metric; Magnifico, Hardin, and render-runtime are the decision surfaces whose payloads the arms carry. Nothing inside Branno knows what a "price" or a "ui_view" means — the payload is opaque, and the primitive at the decision point interprets it. The live decision surface today is Hardin pricing through Hober: a per-variant price experiment allocates when a line is added and attributes realized revenue at order close, so the continuous-reward allocator learns from dollars, not clicks.
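For the continuous-reward case, one way to draw a plausible mean revenue for an arm from its running sums is sketched below. It assumes the standard noninformative reference prior p(μ, τ) ∝ 1/τ, which is one reading of "no hyperparameters"; the source does not say which prior Branno uses. The function name and the choice to return infinity for arms with fewer than two observations are likewise assumptions.

```python
import math
import random


def sample_mean_reward(observations: int, reward_sum: float,
                       reward_sumsq: float, rng: random.Random) -> float:
    """Draw one plausible mean reward from the arm's Normal-Gamma posterior."""
    n = observations
    if n < 2:
        # Assumption: with too little data, force exploration of this arm.
        return math.inf
    mean = reward_sum / n
    # Sum of squared deviations, recovered from the running sums.
    # The floor guards against zero or slightly negative values from rounding.
    ss = max(reward_sumsq - n * mean * mean, 1e-12)
    # Precision tau ~ Gamma(shape=(n - 1) / 2, rate=ss / 2).
    # random.gammavariate takes a scale, which is 1 / rate.
    tau = rng.gammavariate((n - 1) / 2, 2 / ss)
    # Mean given tau ~ Normal(sample mean, variance 1 / (n * tau)).
    return rng.gauss(mean, 1 / math.sqrt(n * tau))
```

A Thompson step would call this once per arm and serve the arm with the highest draw (lowest for MINIMIZE). Under this prior, multiplying every reward by a constant multiplies the draw by the same constant, so the arm ranking does not depend on whether revenue is recorded in dollars or cents.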