MLB-Bench
Each run tracks lineup tweaks, bullpen stress plans, rotation swaps, and trade attempts across a season. Jump to leaderboard, model curves, or evaluations.
Leaderboard
Model curves
Tracks each model's running win percentage as the season progresses. Steady lines mean consistent play; big swings mean streaky runs of wins and losses.
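As a rough illustration of how such a curve could be computed, here is a minimal sketch. It assumes each game result is available as a boolean (True for a win); the function name `running_win_pct` and that input format are hypothetical and not taken from MLB-Bench itself.

```python
from typing import List, Sequence


def running_win_pct(results: Sequence[bool]) -> List[float]:
    """Return the win percentage after each game; True marks a win.

    Hypothetical helper: the input format is an assumption, not the
    benchmark's actual data schema.
    """
    pct = []
    wins = 0
    for games_played, won in enumerate(results, start=1):
        wins += int(won)
        pct.append(wins / games_played)
    return pct


if __name__ == "__main__":
    # Made-up five-game stretch: W, L, W, W, L
    print(running_win_pct([True, False, True, True, False]))
```

Early values swing widely because the denominator is small, so a curve of this kind is only informative about consistency once a reasonable number of games have been played.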
Shows cumulative run margin (runs scored minus runs allowed) over time. Rising lines mean the team is outscoring its opponents; flat or falling lines signal trouble even if wins are still coming.
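The run-margin curve can be sketched as a running sum of per-game differentials. The snippet below assumes per-game runs scored and runs allowed are available as two equal-length lists; the function name `cumulative_run_margin` and the sample numbers are illustrative only.

```python
from itertools import accumulate
from typing import List, Sequence


def cumulative_run_margin(
    runs_scored: Sequence[int], runs_allowed: Sequence[int]
) -> List[int]:
    """Running total of (runs scored - runs allowed), one entry per game.

    Hypothetical helper: the per-game list inputs are an assumption.
    """
    if len(runs_scored) != len(runs_allowed):
        raise ValueError("runs_scored and runs_allowed must have equal length")
    per_game = (rs - ra for rs, ra in zip(runs_scored, runs_allowed))
    return list(accumulate(per_game))


if __name__ == "__main__":
    # Made-up stretch: three one-run wins and one blowout loss.
    scored = [2, 3, 1, 4]
    allowed = [1, 2, 9, 3]
    print(cumulative_run_margin(scored, allowed))
```

In this made-up stretch the team wins three of four games yet ends with a negative margin, which is the kind of warning sign a falling line is meant to surface.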
Solid lines show runs scored (mu_for); dashed lines show runs allowed (mu_against). Raising mu_for is the sim's proxy for playing more aggressively on offense; lowering mu_against reflects better run prevention. Managers can only nudge these values slightly. Well-timed bumps during tight games help, but constantly maxing them out can backfire once bullpen stress and fatigue push mu_against back up. The desirable pattern is solid lines drifting above dashed ones with small, steady gaps, rather than huge spikes that collapse later.
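One way to put a number on "small, steady gaps" is to summarize the difference between the two curves. The sketch below assumes mu_for and mu_against are available as equal-length series sampled at the same points in the season. The function `gap_summary` and the choice of mean and standard deviation are illustrative assumptions, not a metric defined by MLB-Bench.

```python
from statistics import mean, pstdev
from typing import Dict, Sequence


def gap_summary(
    mu_for: Sequence[float], mu_against: Sequence[float]
) -> Dict[str, float]:
    """Summarize the gap mu_for - mu_against over a season.

    Hypothetical summary: a positive mean_gap with a low gap_spread
    corresponds to the "small, steady gaps" pattern; a high gap_spread
    suggests spikes followed by collapses.
    """
    if not mu_for or len(mu_for) != len(mu_against):
        raise ValueError("series must be non-empty and of equal length")
    gaps = [f - a for f, a in zip(mu_for, mu_against)]
    return {"mean_gap": mean(gaps), "gap_spread": pstdev(gaps)}


if __name__ == "__main__":
    # Made-up values: a steady manager vs. one who maxes out early.
    steady = gap_summary([4.5, 4.6, 4.5, 4.6], [4.2, 4.3, 4.2, 4.3])
    spiky = gap_summary([6.0, 6.0, 3.5, 3.5], [4.0, 4.0, 5.0, 5.0])
    print(steady)
    print(spiky)
```

With these made-up numbers the two profiles have similar average gaps, but the second has a much larger spread, which matches the spike-then-collapse shape described above.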