mecheval

The mechanical, physical, and CAD evaluation suite for AI models.

50%
industry pass^5 · fraction of (model, task) pairs that have earned a clean pass^5
2026-04-28 → 2026-06-12
113 / 228
pass^5 · achieved / ready
0.98
top score · openai-direct-gpt-5
1233 runs
7 models · 64 tasks

Models

model tasks runs pass^5 score tokens wall
openai-direct-gpt-5 25 125 20/23 0.98 2.5k 25.2s
claude-direct-claude-opus-4-7 37 216 32/37 0.97 1.7k 7.1s
claude-direct-claude-sonnet-4-6 34 193 29/34 0.95 1.4k 7.5s
openai-direct-gpt-5-mini 26 133 19/25 0.92 2.5k 26.3s
claude-mcp-claude-opus-4-7 52 267 13/52 0.80 452.9k 106.3s
openai-direct-gpt-4o-mini 26 126 0/23 0.13 1.2k 8.9s
claude-direct-claude-haiku-4-5-20251001 34 173 0/34 0.13 1.7k 4.4s

Task × model matrix

task ↓ · model → | openai-direct-gpt-5 | claude-direct-claude-opus-4-7 | claude-direct-claude-sonnet-4-6 | openai-direct-gpt-5-mini | claude-mcp-claude-opus-4-7 | openai-direct-gpt-4o-mini | claude-direct-claude-haiku-4-5-20251001
a1-block-01 | 4/4* 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 4/5 0.90 | 1/5 0.20 | 2/5 0.40
a1-cone-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 2/5 0.40 | 0/5 0.00
a1-cube-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 2/5 0.70 | 0/5 0.00 | 3/5 0.60
a1-pipe-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 1/5 0.20 | 2/5 0.40
a1-plate-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 4/5 0.80 | 3/5 0.70 | 0/4* 0.00 | 1/5 0.20
a1-sphere-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 3/5 0.60 | 0/5 0.00
a1-stepped-shaft-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 1/5 0.20 | 1/5 0.30
a2-bolt-circle-block-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 0/5 0.00 | 1/5 0.20
a2-channel-bracket-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 3/5 0.90 | 0/5 0.00 | 0/5 0.00
a2-cube-with-pocket-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 0/5 0.40 | 0/5 0.00
a2-cubemark-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 4/5 0.80 | 4/5 0.90 | 0/5 0.00 | 0/5 0.00
a2-finned-block-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 4/5 0.90 | 0/5 0.00 | 0/5 0.00
a2-flanged-cap-01 | 0/5 0.86 | PASS 1.00 | PASS 1.00 | 3/5 0.94 | 3/5 0.77 | 0/4* 0.00 | 0/5 0.00
a2-l-bracket-01 | 4/4* 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 3/5 0.73 | 0/5 0.00 | 0/5 0.00
a2-mounting-rail-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 2/5 0.70 | 0/5 0.00 | 1/5 0.20
a2-square-flange-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 2/5 0.70 | 0/5 0.10 | 1/5 0.20
a2-stepped-block-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 0/5 0.15 | 1/5 0.20
a2-stepped-pyramid-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 0/5 0.15 | 2/5 0.40
a2-tee-bracket-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 3/5 0.80 | 0/5 0.00 | 1/5 0.20
a2-washer-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 4/5 0.97 | 3/5 0.60 | 2/5 0.40
a3-cross-shaft-01 | 0/5 0.75 | 0/5 0.75 | 0/5 0.75 | 0/5 0.75 | 3/5 0.85 | 0/5 0.00 | 0/5 0.30
a3-hex-bolt-pattern-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 0/5 0.37 | 0/5 0.07 | 0/5 0.00
a3-hex-nut-01 | - | 1/5 0.67 | - | - | 2/5 0.83 | - | -
a3-octagonal-flange-01 | 0/5 0.83 | 0/5 0.70 | 0/5 0.80 | 0/5 0.43 | 0/5 0.63 | 0/2* 0.00 | 0/5 0.00
a3-pentagonal-prism-01 | - | PASS 1.00 | - | - | PASS 1.00 | - | -
a3-rotated-block-01 | - | PASS 1.00 | - | - | 2/5 0.75 | - | -
a3-spherical-dome-block-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 4/5 0.90 | PASS 1.00 | 0/5 0.35 | 1/5 0.20
a3-tangent-cylinders-01 | - | PASS 1.00 | - | - | 4/5 0.95 | - | -
a3-three-tangent-cylinders-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 4/5 0.95 | 0/5 0.00 | 0/5 0.00
a4-bolt-circle-flange-with-bore-01 | - | PASS 1.00 | PASS 1.00 | - | 4/5 0.80 | - | 0/5 0.00
a4-counterbore-plate-01 | - | PASS 1.00 | PASS 1.00 | - | 0/5 0.40 | - | 0/5 0.00
a4-flanged-shaft-01 | - | PASS 1.00 | PASS 1.00 | - | 1/5 0.72 | - | 0/5 0.00
a4-rectangular-tube-01 | - | PASS 1.00 | PASS 1.00 | - | 4/5 0.90 | - | 0/5 0.00
a4-rounded-bar-01 | - | PASS 1.00 | PASS 1.00 | - | 3/5 0.90 | - | 0/5 0.00
a4-slotted-bracket-01 | - | 0/5 0.83 | 0/5 0.83 | - | 0/5 0.83 | - | 0/5 0.00
a4-stepped-pyramid-with-holes-01 | - | 3/5 0.93 | 3/5 0.93 | - | 0/5 0.50 | - | 0/5 0.00
a4-x-frame-01 | - | PASS 1.00 | PASS 1.00 | - | PASS 1.00 | - | 1/5 0.20
a5-disc-hub-01 | - | - | - | - | 3/5 0.85 | - | -
a5-double-d-shaft-01 | - | - | - | - | 4/5 0.90 | - | -
a5-hex-bolt-blank-01 | - | - | - | - | 4/5 0.95 | - | -
a5-hollow-cap-01 | - | - | - | - | PASS 1.00 | - | -
a5-lightened-disc-01 | - | - | - | - | 2/5 0.82 | - | -
a5-ribbed-plate-01 | - | - | - | - | 0/5 0.63 | - | -
a5-stepped-boss-plate-01 | - | - | - | - | 1/5 0.45 | - | -
a5-u-bracket-01 | - | - | - | - | 3/5 0.77 | - | -
a6-compound-bore-ring-01 | - | - | - | - | PASS 1.00 | - | -
a6-compound-boss-01 | - | - | - | - | 2/5 0.70 | - | -
a6-motor-flange-01 | - | - | - | - | 3/5 0.85 | - | -
a6-pulley-01 | - | - | - | - | 2/5 0.63 | - | -
a6-sprocket-blank-01 | - | - | - | - | 0/5 0.42 | - | -
a6-yoke-block-01 | - | - | - | - | 2/5 0.75 | - | -
c-reacher-01 | - | - | 0/5 0.08 | 0/2* 0.20 | 0/5 0.04 | 0/5 0.04 | 0/5 0.00
d1-sphere-01 | - | - | - | - | - | - | -
f1-cap-tube-01 | - | - | - | - | - | - | -
f1-plug-port-01 | - | - | - | - | - | - | -
f1-shim-gap-01 | - | - | - | - | - | - | -
f1-spacer-shaft-01 | - | - | - | - | - | - | -
f2-collar-shaft-axial-01 | - | - | - | - | - | - | -
f2-plug-port-tilted-01 | - | - | - | - | - | - | -
f2-shim-gap-tilted-01 | - | - | - | - | - | - | -
f2-spacer-shaft-sideways-01 | - | - | - | - | - | - | -
f3-cap-snap-bottle-01 | - | - | - | - | - | - | -
f3-clip-pipe-01 | - | - | - | - | - | - | -
f4-collar-loaded-01 | - | - | - | - | - | - | -

Each cell shows pass^5 for the most recent 5 attempts at (model, task). The leftmost column is the latest passing reference render.
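The cell notation and the headline fraction can be sketched as follows. This is a minimal illustration, not the suite's actual code: it assumes each attempt is stored as a pass flag plus a score, that a pair counts as "ready" once it has at least five attempts, and that a clean pass^5 means all five most recent attempts passed. Those rules are inferred from the numbers shown on this page, and the names `Attempt`, `cell`, and `industry_pass5` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

WINDOW = 5  # "most recent 5 attempts"


@dataclass
class Attempt:
    passed: bool   # assumed per-attempt pass flag
    score: float   # assumed per-attempt score in [0, 1]


def cell(attempts: List[Attempt]) -> str:
    """Render one matrix cell from attempts ordered oldest to newest."""
    recent = attempts[-WINDOW:]
    if not recent:
        return "-"  # no runs for this (model, task) pair
    passes = sum(a.passed for a in recent)
    mean = sum(a.score for a in recent) / len(recent)
    if len(recent) < WINDOW:
        # Fewer than five attempts: flagged with an asterisk (assumed meaning).
        return f"{passes}/{len(recent)}* {mean:.2f}"
    if passes == WINDOW:
        return f"PASS {mean:.2f}"
    return f"{passes}/{WINDOW} {mean:.2f}"


def industry_pass5(runs: Dict[Tuple[str, str], List[Attempt]]) -> Tuple[int, int]:
    """Return (achieved, ready) over all (model, task) pairs."""
    ready = [a for a in runs.values() if len(a) >= WINDOW]
    achieved = sum(all(x.passed for x in a[-WINDOW:]) for a in ready)
    return achieved, len(ready)
```

Under these assumptions the page's totals are consistent: 113 achieved out of 228 ready pairs is about 0.496, which is displayed as 50%.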

Cost · score Pareto

[Scatter plot: score (0.00 to 1.00) against tokens (log scale, 1k to 1M). Legend: claude-direct-claude-haiku-4-5-20251001, claude-direct-claude-opus-4-7, claude-direct-claude-sonnet-4-6, claude-mcp-claude-opus-4-7, openai-direct-gpt-4o-mini, openai-direct-gpt-5, openai-direct-gpt-5-mini. Points marked on the Pareto frontier: openai-direct-gpt-4o-mini · a1-sphere-01 (score=0.60, tokens=649) and claude-direct-claude-sonnet-4-6 · a1-sphere-01 (score=1.00, tokens=723).]

Tokens (log) vs mean score across the most recent 5 attempts. Each point is one (model, task) pair. The dashed line is the Pareto frontier: points on it are not dominated by any cheaper, better alternative.
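The dominance rule can be made concrete with a short sketch. It assumes a point is dominated when another point costs no more and scores at least as well, with a strict improvement on at least one axis; the page does not say how ties are handled, so that detail is a guess. `Point` and `pareto_frontier` are hypothetical names, and the labels are shortened for the example.

```python
from typing import List, NamedTuple


class Point(NamedTuple):
    label: str
    cost: float   # tokens (or wall-clock seconds for the second plot)
    score: float  # mean score over the most recent attempts


def pareto_frontier(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, ordered from cheapest to most expensive.

    After sorting by ascending cost (higher score first on equal cost), a point
    is on the frontier only if it scores strictly higher than every cheaper
    point seen so far. Exact duplicates keep only their first occurrence.
    """
    frontier: List[Point] = []
    best_score = float("-inf")
    for p in sorted(points, key=lambda q: (q.cost, -q.score)):
        if p.score > best_score:
            frontier.append(p)
            best_score = p.score
    return frontier


# Four a1-sphere-01 points, using the values reported on this page.
points = [
    Point("gpt-4o-mini", 649, 0.60),
    Point("sonnet-4-6", 723, 1.00),
    Point("haiku-4-5", 735, 0.00),
    Point("opus-4-7 (direct)", 911, 1.00),
]
frontier_labels = [p.label for p in pareto_frontier(points)]
```

On this subset the sketch keeps the gpt-4o-mini and sonnet-4-6 points, the same two a1-sphere-01 points the page marks as on the frontier.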

Wall-clock seconds vs score
[Scatter plot: score (0.00 to 1.00) against wall-clock seconds (log scale, 1s to 100s). Legend: claude-direct-claude-haiku-4-5-20251001, claude-direct-claude-opus-4-7, claude-direct-claude-sonnet-4-6, claude-mcp-claude-opus-4-7, openai-direct-gpt-4o-mini, openai-direct-gpt-5, openai-direct-gpt-5-mini. Points marked on the Pareto frontier: claude-direct-claude-haiku-4-5-20251001 · a1-cone-01 (score=0.00, wall=1.2s), a1-block-01 (score=0.40, wall=1.6s), and a1-cube-01 (score=0.60, wall=1.7s), and claude-direct-claude-sonnet-4-6 · a1-sphere-01 (score=1.00, wall=2.4s).]
attempts=5claude-direct-claude-sonnet-4-6 · a3-three-tangent-cylinders-01 score=1.00 tokens=1.7k · wall=12.3s attempts=5claude-direct-claude-sonnet-4-6 · a4-bolt-circle-flange-with-bore-01 score=1.00 tokens=2.0k · wall=10.2s attempts=5claude-direct-claude-sonnet-4-6 · a4-counterbore-plate-01 score=1.00 tokens=2.2k · wall=12.8s attempts=5claude-direct-claude-sonnet-4-6 · a4-flanged-shaft-01 score=1.00 tokens=1.8k · wall=8.4s attempts=5claude-direct-claude-sonnet-4-6 · a4-rectangular-tube-01 score=1.00 tokens=1.1k · wall=4.2s attempts=5claude-direct-claude-sonnet-4-6 · a4-rounded-bar-01 score=1.00 tokens=1.2k · wall=5.8s attempts=5claude-direct-claude-sonnet-4-6 · a4-slotted-bracket-01 score=0.83 tokens=1.4k · wall=7.2s attempts=5claude-direct-claude-sonnet-4-6 · a4-stepped-pyramid-with-holes-01 score=0.93 tokens=1.9k · wall=11.0s attempts=5claude-direct-claude-sonnet-4-6 · a4-x-frame-01 score=1.00 tokens=1.1k · wall=4.3s attempts=5claude-direct-claude-sonnet-4-6 · c-reacher-01 score=0.08 tokens=1.7k · wall=8.6s attempts=5claude-mcp-claude-opus-4-7 · a1-block-01 score=0.90 tokens=189.6k · wall=64.2s attempts=7claude-mcp-claude-opus-4-7 · a1-cone-01 score=1.00 tokens=45.6k · wall=16.2s attempts=5claude-mcp-claude-opus-4-7 · a1-cube-01 score=0.70 tokens=117.4k · wall=30.2s attempts=6claude-mcp-claude-opus-4-7 · a1-pipe-01 score=1.00 tokens=268.6k · wall=48.7s attempts=5claude-mcp-claude-opus-4-7 · a1-plate-01 score=0.70 tokens=552.9k · wall=102.3s attempts=6claude-mcp-claude-opus-4-7 · a1-sphere-01 score=1.00 tokens=45.4k · wall=7.9s attempts=5claude-mcp-claude-opus-4-7 · a1-stepped-shaft-01 score=1.00 tokens=328.9k · wall=65.2s attempts=6claude-mcp-claude-opus-4-7 · a2-bolt-circle-block-01 score=1.00 tokens=345.8k · wall=64.8s attempts=5claude-mcp-claude-opus-4-7 · a2-channel-bracket-01 score=0.90 tokens=467.6k · wall=84.2s attempts=5claude-mcp-claude-opus-4-7 · a2-cube-with-pocket-01 score=1.00 tokens=332.0k · wall=122.0s attempts=5claude-mcp-claude-opus-4-7 · 
a2-cubemark-01 score=0.90 tokens=555.7k · wall=176.4s attempts=5claude-mcp-claude-opus-4-7 · a2-finned-block-01 score=0.90 tokens=432.3k · wall=136.8s attempts=5claude-mcp-claude-opus-4-7 · a2-flanged-cap-01 score=0.77 tokens=533.2k · wall=118.8s attempts=6claude-mcp-claude-opus-4-7 · a2-l-bracket-01 score=0.73 tokens=480.9k · wall=96.4s attempts=6claude-mcp-claude-opus-4-7 · a2-mounting-rail-01 score=0.70 tokens=519.8k · wall=152.4s attempts=5claude-mcp-claude-opus-4-7 · a2-square-flange-01 score=0.70 tokens=557.3k · wall=172.8s attempts=5claude-mcp-claude-opus-4-7 · a2-stepped-block-01 score=1.00 tokens=309.6k · wall=78.7s attempts=5claude-mcp-claude-opus-4-7 · a2-stepped-pyramid-01 score=1.00 tokens=461.4k · wall=169.3s attempts=5claude-mcp-claude-opus-4-7 · a2-tee-bracket-01 score=0.80 tokens=538.0k · wall=116.3s attempts=5claude-mcp-claude-opus-4-7 · a2-washer-01 score=0.97 tokens=191.1k · wall=36.6s attempts=5claude-mcp-claude-opus-4-7 · a3-cross-shaft-01 score=0.85 tokens=484.7k · wall=108.3s attempts=5claude-mcp-claude-opus-4-7 · a3-hex-bolt-pattern-01 score=0.37 tokens=603.6k · wall=114.5s attempts=5claude-mcp-claude-opus-4-7 · a3-hex-nut-01 score=0.83 tokens=420.7k · wall=104.3s attempts=5claude-mcp-claude-opus-4-7 · a3-octagonal-flange-01 score=0.63 tokens=476.7k · wall=136.9s attempts=5claude-mcp-claude-opus-4-7 · a3-pentagonal-prism-01 score=1.00 tokens=368.2k · wall=108.6s attempts=5claude-mcp-claude-opus-4-7 · a3-rotated-block-01 score=0.75 tokens=245.5k · wall=113.8s attempts=5claude-mcp-claude-opus-4-7 · a3-spherical-dome-block-01 score=1.00 tokens=451.1k · wall=136.9s attempts=5claude-mcp-claude-opus-4-7 · a3-tangent-cylinders-01 score=0.95 tokens=355.6k · wall=69.3s attempts=5claude-mcp-claude-opus-4-7 · a3-three-tangent-cylinders-01 score=0.95 tokens=366.0k · wall=77.4s attempts=5claude-mcp-claude-opus-4-7 · a4-bolt-circle-flange-with-bore-01 score=0.80 tokens=474.0k · wall=90.7s attempts=5claude-mcp-claude-opus-4-7 · a4-counterbore-plate-01 
score=0.40 tokens=643.6k · wall=182.6s attempts=5claude-mcp-claude-opus-4-7 · a4-flanged-shaft-01 score=0.72 tokens=620.3k · wall=139.4s attempts=5claude-mcp-claude-opus-4-7 · a4-rectangular-tube-01 score=0.90 tokens=426.1k · wall=76.5s attempts=5claude-mcp-claude-opus-4-7 · a4-rounded-bar-01 score=0.90 tokens=559.9k · wall=99.2s attempts=5claude-mcp-claude-opus-4-7 · a4-slotted-bracket-01 score=0.83 tokens=495.7k · wall=114.7s attempts=5claude-mcp-claude-opus-4-7 · a4-stepped-pyramid-with-holes-01 score=0.50 tokens=622.9k · wall=155.4s attempts=5claude-mcp-claude-opus-4-7 · a4-x-frame-01 score=1.00 tokens=322.6k · wall=55.8s attempts=5claude-mcp-claude-opus-4-7 · a5-disc-hub-01 score=0.85 tokens=541.2k · wall=98.9s attempts=5claude-mcp-claude-opus-4-7 · a5-double-d-shaft-01 score=0.90 tokens=438.7k · wall=97.8s attempts=5claude-mcp-claude-opus-4-7 · a5-hex-bolt-blank-01 score=0.95 tokens=460.6k · wall=98.6s attempts=5claude-mcp-claude-opus-4-7 · a5-hollow-cap-01 score=1.00 tokens=302.8k · wall=53.3s attempts=5claude-mcp-claude-opus-4-7 · a5-lightened-disc-01 score=0.82 tokens=643.4k · wall=164.7s attempts=5claude-mcp-claude-opus-4-7 · a5-ribbed-plate-01 score=0.63 tokens=654.5k · wall=191.4s attempts=5claude-mcp-claude-opus-4-7 · a5-stepped-boss-plate-01 score=0.45 tokens=643.4k · wall=123.7s attempts=5claude-mcp-claude-opus-4-7 · a5-u-bracket-01 score=0.77 tokens=542.5k · wall=111.3s attempts=5claude-mcp-claude-opus-4-7 · a6-compound-bore-ring-01 score=1.00 tokens=427.6k · wall=84.0s attempts=5claude-mcp-claude-opus-4-7 · a6-compound-boss-01 score=0.70 tokens=640.7k · wall=122.2s attempts=5claude-mcp-claude-opus-4-7 · a6-motor-flange-01 score=0.85 tokens=579.8k · wall=105.3s attempts=5claude-mcp-claude-opus-4-7 · a6-pulley-01 score=0.63 tokens=629.7k · wall=120.2s attempts=5claude-mcp-claude-opus-4-7 · a6-sprocket-blank-01 score=0.42 tokens=608.7k · wall=113.5s attempts=5claude-mcp-claude-opus-4-7 · a6-yoke-block-01 score=0.75 tokens=589.7k · wall=125.2s 
attempts=5claude-mcp-claude-opus-4-7 · c-reacher-01 score=0.04 tokens=637.7k · wall=170.4s attempts=5openai-direct-gpt-4o-mini · a1-block-01 score=0.20 tokens=746 · wall=3.0s attempts=5openai-direct-gpt-4o-mini · a1-cone-01 score=0.40 tokens=696 · wall=2.4s attempts=5openai-direct-gpt-4o-mini · a1-cube-01 score=0.00 tokens=706 · wall=3.2s attempts=6openai-direct-gpt-4o-mini · a1-pipe-01 score=0.20 tokens=800 · wall=3.4s attempts=5openai-direct-gpt-4o-mini · a1-plate-01 score=0.00 tokens=1.3k · wall=11.2s attempts=4openai-direct-gpt-4o-mini · a1-sphere-01 score=0.60 tokens=649 · wall=2.3s attempts=5openai-direct-gpt-4o-mini · a1-stepped-shaft-01 score=0.20 tokens=858 · wall=4.1s attempts=5openai-direct-gpt-4o-mini · a2-bolt-circle-block-01 score=0.00 tokens=1.5k · wall=15.1s attempts=5openai-direct-gpt-4o-mini · a2-channel-bracket-01 score=0.00 tokens=1.1k · wall=8.2s attempts=5openai-direct-gpt-4o-mini · a2-cube-with-pocket-01 score=0.40 tokens=958 · wall=5.7s attempts=5openai-direct-gpt-4o-mini · a2-cubemark-01 score=0.00 tokens=2.1k · wall=20.8s attempts=5openai-direct-gpt-4o-mini · a2-finned-block-01 score=0.00 tokens=1.3k · wall=11.6s attempts=5openai-direct-gpt-4o-mini · a2-flanged-cap-01 score=0.00 tokens=1.5k · wall=14.4s attempts=4openai-direct-gpt-4o-mini · a2-l-bracket-01 score=0.00 tokens=1.2k · wall=9.3s attempts=5openai-direct-gpt-4o-mini · a2-mounting-rail-01 score=0.00 tokens=1.2k · wall=9.3s attempts=5openai-direct-gpt-4o-mini · a2-square-flange-01 score=0.10 tokens=1.4k · wall=11.8s attempts=5openai-direct-gpt-4o-mini · a2-stepped-block-01 score=0.15 tokens=951 · wall=5.3s attempts=5openai-direct-gpt-4o-mini · a2-stepped-pyramid-01 score=0.15 tokens=1.1k · wall=8.1s attempts=5openai-direct-gpt-4o-mini · a2-tee-bracket-01 score=0.00 tokens=1.4k · wall=12.6s attempts=5openai-direct-gpt-4o-mini · a2-washer-01 score=0.60 tokens=808 · wall=3.9s attempts=5openai-direct-gpt-4o-mini · a3-cross-shaft-01 score=0.00 tokens=946 · wall=5.9s 
attempts=5openai-direct-gpt-4o-mini · a3-hex-bolt-pattern-01 score=0.07 tokens=1.5k · wall=15.1s attempts=5openai-direct-gpt-4o-mini · a3-octagonal-flange-01 score=0.00 tokens=1.6k · wall=15.8s attempts=2openai-direct-gpt-4o-mini · a3-spherical-dome-block-01 score=0.35 tokens=1.1k · wall=7.4s attempts=5openai-direct-gpt-4o-mini · a3-three-tangent-cylinders-01 score=0.00 tokens=1.1k · wall=8.4s attempts=5openai-direct-gpt-4o-mini · c-reacher-01 score=0.04 tokens=1.4k · wall=13.0s attempts=5openai-direct-gpt-5 · a1-block-01 score=1.00 tokens=1.2k · wall=9.4s attempts=4openai-direct-gpt-5 · a1-cone-01 score=1.00 tokens=1.1k · wall=8.3s attempts=5openai-direct-gpt-5 · a1-cube-01 score=1.00 tokens=1.2k · wall=10.1s attempts=6openai-direct-gpt-5 · a1-pipe-01 score=1.00 tokens=1.2k · wall=14.1s attempts=5openai-direct-gpt-5 · a1-plate-01 score=1.00 tokens=2.7k · wall=24.8s attempts=5openai-direct-gpt-5 · a1-sphere-01 score=1.00 tokens=888 · wall=5.6s attempts=5openai-direct-gpt-5 · a1-stepped-shaft-01 score=1.00 tokens=1.3k · wall=10.2s attempts=5openai-direct-gpt-5 · a2-bolt-circle-block-01 score=1.00 tokens=3.5k · wall=32.8s attempts=5openai-direct-gpt-5 · a2-channel-bracket-01 score=1.00 tokens=1.7k · wall=13.1s attempts=5openai-direct-gpt-5 · a2-cube-with-pocket-01 score=1.00 tokens=2.1k · wall=18.2s attempts=5openai-direct-gpt-5 · a2-cubemark-01 score=1.00 tokens=3.6k · wall=39.5s attempts=5openai-direct-gpt-5 · a2-finned-block-01 score=1.00 tokens=2.5k · wall=24.6s attempts=5openai-direct-gpt-5 · a2-flanged-cap-01 score=0.86 tokens=3.3k · wall=37.8s attempts=5openai-direct-gpt-5 · a2-l-bracket-01 score=1.00 tokens=2.9k · wall=29.9s attempts=4openai-direct-gpt-5 · a2-mounting-rail-01 score=1.00 tokens=2.7k · wall=22.1s attempts=5openai-direct-gpt-5 · a2-square-flange-01 score=1.00 tokens=3.1k · wall=29.8s attempts=5openai-direct-gpt-5 · a2-stepped-block-01 score=1.00 tokens=1.8k · wall=12.4s attempts=5openai-direct-gpt-5 · a2-stepped-pyramid-01 score=1.00 tokens=2.1k 
· wall=17.4s attempts=5openai-direct-gpt-5 · a2-tee-bracket-01 score=1.00 tokens=3.0k · wall=26.8s attempts=5openai-direct-gpt-5 · a2-washer-01 score=1.00 tokens=1.3k · wall=12.1s attempts=5openai-direct-gpt-5 · a3-cross-shaft-01 score=0.75 tokens=2.6k · wall=28.4s attempts=5openai-direct-gpt-5 · a3-hex-bolt-pattern-01 score=1.00 tokens=3.5k · wall=34.8s attempts=5openai-direct-gpt-5 · a3-octagonal-flange-01 score=0.83 tokens=7.5k · wall=98.8s attempts=6openai-direct-gpt-5 · a3-spherical-dome-block-01 score=1.00 tokens=2.7k · wall=30.0s attempts=5openai-direct-gpt-5 · a3-three-tangent-cylinders-01 score=1.00 tokens=3.2k · wall=38.1s attempts=5openai-direct-gpt-5-mini · a1-block-01 score=1.00 tokens=1.2k · wall=12.2s attempts=5openai-direct-gpt-5-mini · a1-cone-01 score=1.00 tokens=1.0k · wall=7.5s attempts=5openai-direct-gpt-5-mini · a1-cube-01 score=1.00 tokens=1.1k · wall=10.6s attempts=6openai-direct-gpt-5-mini · a1-pipe-01 score=1.00 tokens=1.3k · wall=10.6s attempts=5openai-direct-gpt-5-mini · a1-plate-01 score=0.80 tokens=2.7k · wall=33.6s attempts=5openai-direct-gpt-5-mini · a1-sphere-01 score=1.00 tokens=879 · wall=6.0s attempts=5openai-direct-gpt-5-mini · a1-stepped-shaft-01 score=1.00 tokens=1.4k · wall=14.3s attempts=5openai-direct-gpt-5-mini · a2-bolt-circle-block-01 score=1.00 tokens=3.7k · wall=30.6s attempts=5openai-direct-gpt-5-mini · a2-channel-bracket-01 score=1.00 tokens=1.7k · wall=10.6s attempts=5openai-direct-gpt-5-mini · a2-cube-with-pocket-01 score=1.00 tokens=1.8k · wall=10.4s attempts=5openai-direct-gpt-5-mini · a2-cubemark-01 score=0.80 tokens=3.8k · wall=30.4s attempts=5openai-direct-gpt-5-mini · a2-finned-block-01 score=1.00 tokens=2.5k · wall=17.6s attempts=5openai-direct-gpt-5-mini · a2-flanged-cap-01 score=0.94 tokens=3.2k · wall=39.7s attempts=5openai-direct-gpt-5-mini · a2-l-bracket-01 score=1.00 tokens=2.4k · wall=21.1s attempts=5openai-direct-gpt-5-mini · a2-mounting-rail-01 score=1.00 tokens=2.3k · wall=20.0s 
attempts=5openai-direct-gpt-5-mini · a2-square-flange-01 score=1.00 tokens=2.9k · wall=21.6s attempts=5openai-direct-gpt-5-mini · a2-stepped-block-01 score=1.00 tokens=1.8k · wall=10.5s attempts=5openai-direct-gpt-5-mini · a2-stepped-pyramid-01 score=1.00 tokens=1.9k · wall=13.9s attempts=5openai-direct-gpt-5-mini · a2-tee-bracket-01 score=1.00 tokens=2.5k · wall=17.0s attempts=5openai-direct-gpt-5-mini · a2-washer-01 score=1.00 tokens=1.3k · wall=9.4s attempts=5openai-direct-gpt-5-mini · a3-cross-shaft-01 score=0.75 tokens=2.6k · wall=42.8s attempts=6openai-direct-gpt-5-mini · a3-hex-bolt-pattern-01 score=1.00 tokens=3.6k · wall=58.2s attempts=5openai-direct-gpt-5-mini · a3-octagonal-flange-01 score=0.43 tokens=6.1k · wall=111.7s attempts=9openai-direct-gpt-5-mini · a3-spherical-dome-block-01 score=0.90 tokens=2.4k · wall=30.7s attempts=5openai-direct-gpt-5-mini · a3-three-tangent-cylinders-01 score=1.00 tokens=2.7k · wall=41.4s attempts=5openai-direct-gpt-5-mini · c-reacher-01 score=0.20 tokens=6.4k · wall=50.7s attempts=2

Wall-clock seconds (log) vs mean score. The left edge is fast; the right edge is patient.

A starred cell (for example 4/4* or 0/4*) marks a (model, task) pair with fewer than 5 attempts, so its pass5 is still pending. Score is the mean check-pass rate across the most recent 5 attempts. Corpus: 1233 run blobs across 7 models and 64 tasks.
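The scoring rule in this footnote can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the leaderboard's actual code: the `Attempt` record, its field names, and the `summarize` helper are hypothetical, and the sketch assumes that an attempt counts as a clean pass only when every check passes and that pass5 means all of the 5 most recent attempts are clean.

```python
from dataclasses import dataclass
from typing import List

WINDOW = 5  # the footnote scores "the most recent 5 attempts"


@dataclass
class Attempt:
    """Hypothetical record for one run: checks passed out of checks run."""
    checks_passed: int
    checks_total: int

    @property
    def check_pass_rate(self) -> float:
        # Fraction of checks this attempt passed (0.0 if no checks ran).
        return self.checks_passed / self.checks_total if self.checks_total else 0.0

    @property
    def clean(self) -> bool:
        # Assumption: an attempt "passes" only if every check passed.
        return self.checks_total > 0 and self.checks_passed == self.checks_total


def summarize(attempts: List[Attempt]) -> dict:
    """Summarize one (model, task) pair; attempts are ordered oldest first."""
    recent = attempts[-WINDOW:]  # only the latest WINDOW attempts count
    passes = sum(a.clean for a in recent)
    score = sum(a.check_pass_rate for a in recent) / len(recent) if recent else 0.0
    ready = len(recent) == WINDOW  # fewer than 5 attempts: pass5 pending
    pass5 = ready and passes == WINDOW
    if pass5:
        cell = "PASS"
    elif ready:
        cell = f"{passes}/{WINDOW}"
    else:
        cell = f"{passes}/{len(recent)}*"  # starred cell, as in the footnote
    return {"cell": cell, "score": round(score, 2), "ready": ready, "pass5": pass5}


if __name__ == "__main__":
    # Four clean attempts plus one that passed half its checks.
    mixed = [Attempt(2, 2)] * 4 + [Attempt(1, 2)]
    print(summarize(mixed))
    # Only four attempts so far: the cell is starred and pass5 is pending.
    print(summarize([Attempt(2, 2)] * 4))
```

Under these assumptions, four clean attempts plus one half-passed attempt would render as `4/5` with a score of 0.9, and four clean attempts alone would render as `4/4*` with pass5 still pending.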

generated 2026-06-17T03:16:07.179Z · static site, regenerate with npm run build -w @mecheval/leaderboard