On the failure modes of ICM-naive automated policies in tournament late stages.
The Independent Chip Model (ICM) maps chip stacks to dollar expectation under a fixed payout structure. Automated agents trained or tuned against chip-EV objectives perform adequately in the early stages of a multi-table tournament but degrade measurably as the payout-jump structure asserts itself. This note describes the observed failure modes and the corrections applied in the 2023–2026 sub-cohort.
ICM treats each player's tournament equity as the probability-weighted sum of payouts, where the finishing probabilities are derived from stack proportions under a Malmuth–Harville assumption: the chance of finishing first is proportional to stack size, and lower places are assigned recursively among the players who remain. The result is that chips won are worth less, in dollar terms, than chips lost: a doubling of stack does not double equity. The implication for policy is that the value function the agent should be optimising is not linear in chips, and the non-linearity is most pronounced near pay jumps.
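The calculation described above can be sketched as follows. This is a minimal illustration of the Malmuth–Harville recursion, not the solver used for the cohort; the function name `icm_equities` and the example stacks and payouts are assumptions chosen for illustration, and every stack is assumed to be positive.

```python
def icm_equities(stacks, payouts):
    """Dollar equity per player under Malmuth-Harville ICM.

    stacks  : chip counts, all assumed positive
    payouts : prizes for 1st, 2nd, ... place (may be shorter than stacks)
    """
    prizes = list(payouts)[:len(stacks)]
    equities = [0.0] * len(stacks)

    def assign(remaining, prob, place):
        # Stop once every paid place has been handed out.
        if place == len(prizes):
            return
        total = sum(stacks[i] for i in remaining)
        for i in remaining:
            # Probability that player i takes this place, given the
            # finishers already fixed higher up the recursion.
            p = prob * stacks[i] / total
            equities[i] += p * prizes[place]
            assign(remaining - {i}, p, place + 1)

    assign(frozenset(range(len(stacks))), 1.0, 0)
    return equities


# Hypothetical bubble: four equal stacks, three places paid.
before = icm_equities([2000, 2000, 2000, 2000], [50, 30, 20])
# Player 0 doubles through player 1, who is eliminated.
after = icm_equities([4000, 2000, 2000], [50, 30, 20])
```

In this example the winner's stack doubles while their equity rises by noticeably less than a factor of two, which is the non-linearity the paragraph above refers to. The recursion enumerates finishing orders and is only practical for small fields.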
Three failure modes recur in the cohort and are reported in order of magnitude.
An agent indifferent to ICM treats marginal pushes as profitable because they are +chip-EV, even when they are −$EV. In this cohort the effect shows up as a measurable drop in $EV per hand played in the orbit immediately preceding the money bubble, even where the chip-EV trend is positive. Across the 2024–2026 window the average gap between chip-EV and $EV in this orbit was approximately seven percentage points [1].
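A toy bubble spot shows how a push can be +chip-EV and −$EV at once. The sketch below is illustrative only: the stacks, payouts, and 52% win probability are invented, the names `hero_equity` and `push_evs` are hypothetical, and blinds, fold equity, and ties are ignored.

```python
def icm_equities(stacks, payouts):
    """Malmuth-Harville ICM equities; assumes every stack is positive."""
    prizes = list(payouts)[:len(stacks)]
    equities = [0.0] * len(stacks)

    def assign(remaining, prob, place):
        if place == len(prizes):
            return
        total = sum(stacks[i] for i in remaining)
        for i in remaining:
            p = prob * stacks[i] / total
            equities[i] += p * prizes[place]
            assign(remaining - {i}, p, place + 1)

    assign(frozenset(range(len(stacks))), 1.0, 0)
    return equities


def hero_equity(stacks, payouts, hero):
    """Hero's dollar equity; a busted hero takes the last-place prize, if any."""
    n = len(stacks)
    if stacks[hero] == 0:
        return payouts[n - 1] if n - 1 < len(payouts) else 0.0
    alive = [i for i in range(n) if stacks[i] > 0]
    eq = icm_equities([stacks[i] for i in alive], payouts)
    return eq[alive.index(hero)]


def push_evs(stacks, payouts, hero, villain, win_prob):
    """Return (chip_ev, dollar_ev) for hero's all-in when villain calls."""
    eff = min(stacks[hero], stacks[villain])  # effective stack at risk
    win, lose = list(stacks), list(stacks)
    win[hero] += eff
    win[villain] -= eff
    lose[hero] -= eff
    lose[villain] += eff
    chip_ev = win_prob * eff - (1 - win_prob) * eff
    dollar_ev = (win_prob * hero_equity(win, payouts, hero)
                 + (1 - win_prob) * hero_equity(lose, payouts, hero)
                 - hero_equity(stacks, payouts, hero))
    return chip_ev, dollar_ev


# Hypothetical bubble: four players left, three paid, hero is a 52% favourite.
chip_ev, dollar_ev = push_evs([3000, 3000, 3000, 1000], [50, 30, 20],
                              hero=0, villain=1, win_prob=0.52)
```

Here the push gains chips on average but loses dollar equity, because busting on the bubble forfeits all of the hero's equity while winning less than doubles it.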
| Boundary | Players left | Top stack: $EV / chip-EV | Short stack: $EV / chip-EV |
|---|---|---|---|
| Min-cash | n+1 → n | 0.78 | 1.42 |
| Final table | 10 → 9 | 0.81 | 1.31 |
| Final three | 4 → 3 | 0.84 | 1.22 |
| Heads-up | 3 → 2 | 0.88 | 1.18 |
Two implications follow. First, the top stack derives less dollar value from each additional chip than chip-EV reports, which argues for selective rather than universal aggression. Second, the short stack's chips are worth more in dollar terms than their chip count suggests, which argues against casually calling off in marginal spots simply because the chip-EV is positive.
A pure-ICM solver also has known limitations: it assumes equal future-game skill among remaining players, which is rarely true in a mid-stakes pool. Adjustments here are small (1–2% in $EV) but consistent in direction.
The 2023.Q4 policy update introduced a stage-aware value function with an ICM term whose weight scales with proximity to a pay jump. Pre- and post-update windows, paired by buy-in band and network, show the bubble-orbit $EV gap reduced from approximately 7.4 to 2.1 percentage points. The residual gap is attributed to (i) discretisation in the value function, (ii) skill-asymmetry effects not modelled by ICM, and (iii) the long tail of unusual payout ladders not represented in the training distribution.
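One way to picture a stage-aware value function of this kind is a blend of a chip-proportional value and an ICM value, with the ICM weight rising as the next pay jump approaches. The sketch below is an assumption-laden illustration, not the update that was deployed: the exponential weighting, the `scale` parameter, and the names `icm_weight` and `blended_value` are all hypothetical.

```python
import math


def icm_weight(eliminations_to_jump, scale=3.0):
    """Hypothetical ICM weight in (0, 1].

    eliminations_to_jump : players who must bust before the next pay jump
                           (1 on the bubble itself)
    scale                : assumed decay constant; larger values keep the
                           ICM term active further from the jump
    """
    if eliminations_to_jump < 1:
        raise ValueError("eliminations_to_jump must be at least 1")
    return math.exp(-(eliminations_to_jump - 1) / scale)


def blended_value(chip_share, icm_equity, prize_pool,
                  eliminations_to_jump, scale=3.0):
    """Blend a chip-proportional value with an ICM equity (both in dollars)."""
    w = icm_weight(eliminations_to_jump, scale)
    chip_value = chip_share * prize_pool  # linear-in-chips valuation
    return (1.0 - w) * chip_value + w * icm_equity


# Hypothetical inputs: 30% of the chips, ICM equity of 29.1 in a 100-unit pool.
far_from_bubble = blended_value(0.30, 29.1, 100.0, eliminations_to_jump=40)
on_the_bubble = blended_value(0.30, 29.1, 100.0, eliminations_to_jump=1)
```

Far from a pay jump the blend stays close to the chip-proportional value; on the bubble it collapses to the ICM equity. A production value function would presumably be learned rather than hand-weighted, so the shape of the weight here should be read as a placeholder.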
The structural results are not new. Sklansky and Malmuth (1989) introduced the concern in pre-online play; Harville (1973) supplied the proportional-share derivation that underlies the standard solver; subsequent work in the online era (Chen 2019; Brown 2021) extended the analysis to large-field online MTTs. The present document does not contest those results; it reports their reproduction in a recent, narrow-stakes cohort.
The figures reported here are medians over a heterogeneous sample, and variance at the boundary stages is substantial. ICM itself is a model; payout structures with large overlays, ladders flattened by deals, and time-restricted formats all distort the mapping. Practitioners should treat the corrections as directional rather than as point estimates.
Operational enquiries are handled out of band.