mecheval
The mechanical, physical, and CAD evaluation suite for AI models.
pass^5: 50% (113 / 228 · achieved / ready)
Top score: 0.98 · openai-direct-gpt-5
1233 runs · 7 models · 64 tasks
Models
| model | tasks | runs | pass^5 | score | tokens | wall |
|---|---|---|---|---|---|---|
| openai-direct-gpt-5 | 25 | 125 | 20/23 | 0.98 | 2.5k | 25.2s |
| claude-direct-claude-opus-4-7 | 37 | 216 | 32/37 | 0.97 | 1.7k | 7.1s |
| claude-direct-claude-sonnet-4-6 | 34 | 193 | 29/34 | 0.95 | 1.4k | 7.5s |
| openai-direct-gpt-5-mini | 26 | 133 | 19/25 | 0.92 | 2.5k | 26.3s |
| claude-mcp-claude-opus-4-7 | 52 | 267 | 13/52 | 0.80 | 452.9k | 106.3s |
| openai-direct-gpt-4o-mini | 26 | 126 | 0/23 | 0.13 | 1.2k | 8.9s |
| claude-direct-claude-haiku-4-5-20251001 | 34 | 173 | 0/34 | 0.13 | 1.7k | 4.4s |
Task × model matrix
| task ↓ · model → | openai-direct-gpt-5 | claude-direct-claude-opus-4-7 | claude-direct-claude-sonnet-4-6 | openai-direct-gpt-5-mini | claude-mcp-claude-opus-4-7 | openai-direct-gpt-4o-mini | claude-direct-claude-haiku-4-5-20251001 | expected |
|---|---|---|---|---|---|---|---|---|
| a1-block-01 | 4/4* 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 4/5 0.90 | 1/5 0.20 | 2/5 0.40 | |
| a1-cone-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 2/5 0.40 | 0/5 0.00 | |
| a1-cube-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 2/5 0.70 | 0/5 0.00 | 3/5 0.60 | |
| a1-pipe-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 1/5 0.20 | 2/5 0.40 | |
| a1-plate-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 4/5 0.80 | 3/5 0.70 | 0/4* 0.00 | 1/5 0.20 | |
| a1-sphere-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 3/5 0.60 | 0/5 0.00 | |
| a1-stepped-shaft-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 1/5 0.20 | 1/5 0.30 | |
| a2-bolt-circle-block-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 0/5 0.00 | 1/5 0.20 | |
| a2-channel-bracket-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 3/5 0.90 | 0/5 0.00 | 0/5 0.00 | |
| a2-cube-with-pocket-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 0/5 0.40 | 0/5 0.00 | |
| a2-cubemark-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 4/5 0.80 | 4/5 0.90 | 0/5 0.00 | 0/5 0.00 | |
| a2-finned-block-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 4/5 0.90 | 0/5 0.00 | 0/5 0.00 | |
| a2-flanged-cap-01 | 0/5 0.86 | PASS 1.00 | PASS 1.00 | 3/5 0.94 | 3/5 0.77 | 0/4* 0.00 | 0/5 0.00 | |
| a2-l-bracket-01 | 4/4* 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 3/5 0.73 | 0/5 0.00 | 0/5 0.00 | |
| a2-mounting-rail-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 2/5 0.70 | 0/5 0.00 | 1/5 0.20 | |
| a2-square-flange-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 2/5 0.70 | 0/5 0.10 | 1/5 0.20 | |
| a2-stepped-block-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 0/5 0.15 | 1/5 0.20 | |
| a2-stepped-pyramid-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 0/5 0.15 | 2/5 0.40 | |
| a2-tee-bracket-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 3/5 0.80 | 0/5 0.00 | 1/5 0.20 | |
| a2-washer-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 4/5 0.97 | 3/5 0.60 | 2/5 0.40 | |
| a3-cross-shaft-01 | 0/5 0.75 | 0/5 0.75 | 0/5 0.75 | 0/5 0.75 | 3/5 0.85 | 0/5 0.00 | 0/5 0.30 | |
| a3-hex-bolt-pattern-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 0/5 0.37 | 0/5 0.07 | 0/5 0.00 | |
| a3-hex-nut-01 | — | 1/5 0.67 | — | — | 2/5 0.83 | — | — | |
| a3-octagonal-flange-01 | 0/5 0.83 | 0/5 0.70 | 0/5 0.80 | 0/5 0.43 | 0/5 0.63 | 0/2* 0.00 | 0/5 0.00 | |
| a3-pentagonal-prism-01 | — | PASS 1.00 | — | — | PASS 1.00 | — | — | |
| a3-rotated-block-01 | — | PASS 1.00 | — | — | 2/5 0.75 | — | — | |
| a3-spherical-dome-block-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 4/5 0.90 | PASS 1.00 | 0/5 0.35 | 1/5 0.20 | |
| a3-tangent-cylinders-01 | — | PASS 1.00 | — | — | 4/5 0.95 | — | — | |
| a3-three-tangent-cylinders-01 | PASS 1.00 | PASS 1.00 | PASS 1.00 | PASS 1.00 | 4/5 0.95 | 0/5 0.00 | 0/5 0.00 | |
| a4-bolt-circle-flange-with-bore-01 | — | PASS 1.00 | PASS 1.00 | — | 4/5 0.80 | — | 0/5 0.00 | |
| a4-counterbore-plate-01 | — | PASS 1.00 | PASS 1.00 | — | 0/5 0.40 | — | 0/5 0.00 | |
| a4-flanged-shaft-01 | — | PASS 1.00 | PASS 1.00 | — | 1/5 0.72 | — | 0/5 0.00 | |
| a4-rectangular-tube-01 | — | PASS 1.00 | PASS 1.00 | — | 4/5 0.90 | — | 0/5 0.00 | |
| a4-rounded-bar-01 | — | PASS 1.00 | PASS 1.00 | — | 3/5 0.90 | — | 0/5 0.00 | |
| a4-slotted-bracket-01 | — | 0/5 0.83 | 0/5 0.83 | — | 0/5 0.83 | — | 0/5 0.00 | |
| a4-stepped-pyramid-with-holes-01 | — | 3/5 0.93 | 3/5 0.93 | — | 0/5 0.50 | — | 0/5 0.00 | |
| a4-x-frame-01 | — | PASS 1.00 | PASS 1.00 | — | PASS 1.00 | — | 1/5 0.20 | |
| a5-disc-hub-01 | — | — | — | — | 3/5 0.85 | — | — | |
| a5-double-d-shaft-01 | — | — | — | — | 4/5 0.90 | — | — | |
| a5-hex-bolt-blank-01 | — | — | — | — | 4/5 0.95 | — | — | |
| a5-hollow-cap-01 | — | — | — | — | PASS 1.00 | — | — | |
| a5-lightened-disc-01 | — | — | — | — | 2/5 0.82 | — | — | |
| a5-ribbed-plate-01 | — | — | — | — | 0/5 0.63 | — | — | |
| a5-stepped-boss-plate-01 | — | — | — | — | 1/5 0.45 | — | — | |
| a5-u-bracket-01 | — | — | — | — | 3/5 0.77 | — | — | |
| a6-compound-bore-ring-01 | — | — | — | — | PASS 1.00 | — | — | |
| a6-compound-boss-01 | — | — | — | — | 2/5 0.70 | — | — | |
| a6-motor-flange-01 | — | — | — | — | 3/5 0.85 | — | — | |
| a6-pulley-01 | — | — | — | — | 2/5 0.63 | — | — | |
| a6-sprocket-blank-01 | — | — | — | — | 0/5 0.42 | — | — | |
| a6-yoke-block-01 | — | — | — | — | 2/5 0.75 | — | — | |
| c-reacher-01 | — | — | 0/5 0.08 | 0/2* 0.20 | 0/5 0.04 | 0/5 0.04 | 0/5 0.00 | |
| d1-sphere-01 | — | — | — | — | — | — | — | — |
| f1-cap-tube-01 | — | — | — | — | — | — | — | — |
| f1-plug-port-01 | — | — | — | — | — | — | — | — |
| f1-shim-gap-01 | — | — | — | — | — | — | — | — |
| f1-spacer-shaft-01 | — | — | — | — | — | — | — | — |
| f2-collar-shaft-axial-01 | — | — | — | — | — | — | — | — |
| f2-plug-port-tilted-01 | — | — | — | — | — | — | — | — |
| f2-shim-gap-tilted-01 | — | — | — | — | — | — | — | — |
| f2-spacer-shaft-sideways-01 | — | — | — | — | — | — | — | — |
| f3-cap-snap-bottle-01 | — | — | — | — | — | — | — | — |
| f3-clip-pipe-01 | — | — | — | — | — | — | — | — |
| f4-collar-loaded-01 | — | — | — | — | — | — | — | — |
Each cell shows pass^5 for the most recent 5 attempts at (model, task), followed by the mean score for those attempts. The `expected` column holds the latest passing reference render.
Cost · score Pareto
Tokens (log) vs mean score across the most recent 5 attempts. Each point is one (model, task) pair. The dashed line is the Pareto frontier: no point on it is dominated by a cheaper, better alternative.
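As an illustration of the frontier described in this caption, here is a minimal sketch of one common way to extract non-dominated points from (tokens, score) pairs. It is not taken from mecheval's code: the function name `pareto_frontier` and the tuple layout are assumptions. The sample data are the rounded per-model aggregates from the Models table, not the per-(model, task) points the plot uses.

```python
from typing import List, Tuple

# (tokens, mean_score); this layout is an assumption for illustration only.
Point = Tuple[float, float]


def pareto_frontier(points: List[Point]) -> List[Point]:
    """Return the points that no cheaper-or-equal point matches or beats on score."""
    frontier: List[Point] = []
    best_score = float("-inf")
    # Walk from cheapest to most expensive; at equal cost, visit the best score first.
    for tokens, score in sorted(points, key=lambda p: (p[0], -p[1])):
        # A point survives only if it beats every cheaper-or-equal point seen so far.
        if score > best_score:
            frontier.append((tokens, score))
            best_score = score
    return frontier


# Rounded (tokens, score) aggregates copied from the Models table.
models = [
    (2500, 0.98),    # openai-direct-gpt-5
    (1700, 0.97),    # claude-direct-claude-opus-4-7
    (1400, 0.95),    # claude-direct-claude-sonnet-4-6
    (2500, 0.92),    # openai-direct-gpt-5-mini
    (452900, 0.80),  # claude-mcp-claude-opus-4-7
    (1200, 0.13),    # openai-direct-gpt-4o-mini
    (1700, 0.13),    # claude-direct-claude-haiku-4-5-20251001
]
frontier = pareto_frontier(models)
```

On these aggregates the sketch keeps four points: the cheapest model, then each model that improves the score at a higher token cost. The same function would apply unchanged to wall-clock seconds in place of tokens.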
Wall-clock seconds vs score
Wall-clock seconds (log) vs mean score. The left edge is fast; the right edge is patient.
An asterisk (for example 4/4* or 0/2*) marks a (model, task) pair with fewer than 5 attempts, so pass^5 is still pending. Score is the mean check-pass rate across the most recent 5 attempts. Corpus: 1233 run blobs across 7 models and 64 tasks.
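The cell notation can be sketched as follows. This is a guess at the logic based on the table contents, not mecheval's actual implementation: the `Attempt` record shape and the function name `summarize_cell` are hypothetical, and reading `PASS` as "all 5 recent attempts passed" is an inference from the matrix.

```python
from statistics import mean
from typing import List, Tuple

# (passed, check_pass_rate) for one attempt, oldest first; a hypothetical record shape.
Attempt = Tuple[bool, float]


def summarize_cell(attempts: List[Attempt], k: int = 5) -> str:
    """Format one (model, task) cell from its attempt history."""
    if not attempts:
        return "—"  # no runs recorded for this pair
    recent = attempts[-k:]  # only the most recent k attempts count
    passes = sum(1 for passed, _ in recent if passed)
    score = mean(rate for _, rate in recent)  # mean check-pass rate
    if len(recent) < k:
        label = f"{passes}/{len(recent)}*"  # asterisk: pass^5 still pending
    elif passes == k:
        label = "PASS"  # assumed meaning: all k recent attempts passed
    else:
        label = f"{passes}/{k}"
    return f"{label} {score:.2f}"


cell_partial = summarize_cell([(True, 1.0)] * 4 + [(False, 0.5)])
cell_pending = summarize_cell([(True, 1.0)] * 4)
cell_pass = summarize_cell([(True, 1.0)] * 5)
```

Under these assumptions, four full passes followed by one attempt that clears half of its checks yields `4/5 0.90`, and four attempts in total yields `4/4* 1.00`.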