
Drilled plate A · A1 · a1-plate-01

primitives · boolean · holes

Expected

Prompt

Make a 50mm × 30mm × 10mm rectangular plate. Center it in X and Y, with the bottom face on the XY plane (z = 0 to z = 10). Drill four cylindrical through-holes of diameter 3mm, axes parallel to Z, positioned 5mm in from each corner of the top face.
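
The expected values in the checks follow from the prompt by simple arithmetic. The sketch below is illustrative only and is not part of mecheval. The function names are hypothetical, and "5mm in from each corner" is read as 5 mm from each of the two adjacent edges, which agrees with the listed hole positions.

```python
import math

# Dimensions taken from the prompt (millimetres).
LENGTH_X, WIDTH_Y, HEIGHT_Z = 50.0, 30.0, 10.0
HOLE_DIAMETER = 3.0
CORNER_INSET = 5.0  # assumed: distance of each hole axis from both adjacent edges


def plate_bbox():
    """Axis-aligned bounding box: centred in X and Y, bottom face at z = 0."""
    half_x, half_y = LENGTH_X / 2, WIDTH_Y / 2
    return (-half_x, -half_y, 0.0), (half_x, half_y, HEIGHT_Z)


def hole_centres():
    """XY centres of the four hole axes, one per corner."""
    x = LENGTH_X / 2 - CORNER_INSET
    y = WIDTH_Y / 2 - CORNER_INSET
    return [(x, y), (-x, y), (-x, -y), (x, -y)]


def plate_volume():
    """Box volume minus four full-depth cylindrical holes."""
    hole = math.pi * (HOLE_DIAMETER / 2) ** 2 * HEIGHT_Z
    return LENGTH_X * WIDTH_Y * HEIGHT_Z - 4 * hole
```

The volume works out to 15000 − 90π mm³, about 14717.26 mm³, which rounds to the 14717.3 used by the mass_props check.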

Checks

0
valid_solid
{
  "type": "valid_solid"
}
1
bbox
{
  "type": "bbox",
  "min": [
    -25,
    -15,
    0
  ],
  "max": [
    25,
    15,
    10
  ],
  "tolerance_mm": 0.1
}
2
mass_props
{
  "type": "mass_props",
  "volume_mm3": 14717.3,
  "tolerance_pct": 0.5
}
3
hole_count
{
  "type": "hole_count",
  "diameter_mm": 3,
  "expected": 4,
  "diameter_tolerance_mm": 0.05
}
4
hole_positions
{
  "type": "hole_positions",
  "diameter_mm": 3,
  "positions": [
    [
      20,
      10,
      0
    ],
    [
      -20,
      10,
      0
    ],
    [
      -20,
      -10,
      0
    ],
    [
      20,
      -10,
      0
    ]
  ],
  "tolerance_mm": 0.1
}
5
step_roundtrip
{
  "type": "step_roundtrip",
  "tolerance_pct": 0.1
}
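
The page does not show how the checker applies the tolerances. As a rough sketch, assuming tolerance_mm is an absolute per-coordinate bound and tolerance_pct is a relative bound on the expected value, the bbox and mass_props checks could be evaluated as below. The function names check_bbox and check_volume are hypothetical.

```python
# Check specs copied from the page.
BBOX_SPEC = {"type": "bbox", "min": [-25, -15, 0], "max": [25, 15, 10], "tolerance_mm": 0.1}
VOLUME_SPEC = {"type": "mass_props", "volume_mm3": 14717.3, "tolerance_pct": 0.5}


def check_bbox(actual_min, actual_max, spec):
    """Pass if every bounding-box coordinate is within tolerance_mm (assumed absolute)."""
    tol = spec["tolerance_mm"]
    pairs = list(zip(actual_min, spec["min"])) + list(zip(actual_max, spec["max"]))
    return all(abs(actual - expected) <= tol for actual, expected in pairs)


def check_volume(actual_volume, spec):
    """Pass if the volume deviates by at most tolerance_pct percent (assumed relative)."""
    expected = spec["volume_mm3"]
    deviation_pct = abs(actual_volume - expected) / expected * 100.0
    return deviation_pct <= spec["tolerance_pct"]


# A plate centred in Z (z = -5 to z = 5) has the right volume but the wrong bbox.
print(check_bbox((-25, -15, -5), (25, 15, 5), BBOX_SPEC))
print(check_volume(14717.26, VOLUME_SPEC))
```

Under these assumptions a plate with no holes (15000 mm³) deviates by about 1.9% and would fail mass_props, so the 0.5% volume tolerance is tight enough to detect missing holes.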

Anti-cheese

{
  "max_solid_count": 1
}

Limits

{
  "max_tokens": 50000,
  "max_wallclock_sec": 300,
  "max_tool_calls": 50
}

Recent attempts

Runs (45)

model run status score first fail tokens wall
claude-mcp-claude-opus-4-7 20260611T172012Z-cb3d PASS 1.00 496.8k 100.9s
claude-mcp-claude-opus-4-7 20260611T172001Z-f98c fail 0.17 bbox · Z off by +1.00mm 594.2k 105.9s
claude-mcp-claude-opus-4-7 20260611T171906Z-12d2 PASS 1.00 616.1k 118.5s
claude-mcp-claude-opus-4-7 20260611T171835Z-0c36 PASS 1.00 498.4k 96.8s
claude-mcp-claude-opus-4-7 20260611T171832Z-c295 fail 0.33 bbox · Z off by +2.00mm 558.7k 89.5s
openai-direct-gpt-5-mini 20260428T212655Z-e0fb fail 0.00 valid_solid · solid invalid 3.0k 37.0s
openai-direct-gpt-5-mini 20260428T212624Z-a391 PASS 1.00 2.9k 30.7s
openai-direct-gpt-5 20260428T212602Z-c2bf PASS 1.00 2.5k 24.1s
openai-direct-gpt-5-mini 20260428T212550Z-a979 PASS 1.00 2.8k 33.5s
openai-direct-gpt-5 20260428T212535Z-9b56 PASS 1.00 2.7k 26.3s
openai-direct-gpt-5-mini 20260428T212513Z-336d PASS 1.00 2.4k 37.3s
openai-direct-gpt-5 20260428T212512Z-a59e PASS 1.00 2.6k 23.2s
openai-direct-gpt-4o-mini 20260428T212504Z-3936 fail 0.00 valid_solid · solid invalid 1.4k 14.3s
openai-direct-gpt-4o-mini 20260428T212455Z-cae1 fail 0.00 valid_solid · solid invalid 1.2k 8.9s
openai-direct-gpt-5 20260428T212447Z-dc71 PASS 1.00 2.9k 25.1s
openai-direct-gpt-4o-mini 20260428T212445Z-c4d8 fail 0.00 valid_solid · solid invalid 1.2k 9.7s
openai-direct-gpt-5-mini 20260428T212443Z-9cd5 PASS 1.00 2.6k 29.6s
openai-direct-gpt-4o-mini 20260428T212433Z-2fb5 fail 0.00 valid_solid · solid invalid 1.2k 12.0s
openai-direct-gpt-5 20260428T212421Z-91d7 PASS 1.00 2.8k 25.3s
claude-direct-claude-sonnet-4-6 20260428T211336Z-2fde PASS 1.00 1.4k 6.8s
claude-direct-claude-sonnet-4-6 20260428T211327Z-d58e PASS 1.00 1.4k 9.0s
claude-direct-claude-haiku-4-5-20251001 20260428T211325Z-541a PASS 1.00 1.7k 4.4s
claude-direct-claude-haiku-4-5-20251001 20260428T211320Z-af9a fail 0.00 valid_solid · solid invalid 1.9k 5.0s
claude-direct-claude-sonnet-4-6 20260428T211318Z-a254 PASS 1.00 1.4k 8.2s
claude-direct-claude-haiku-4-5-20251001 20260428T211316Z-88dd fail 0.00 valid_solid · solid invalid 1.7k 4.6s
claude-direct-claude-haiku-4-5-20251001 20260428T211311Z-23f3 fail 0.00 valid_solid · solid invalid 1.8k 4.8s
claude-direct-claude-sonnet-4-6 20260428T211308Z-1872 PASS 1.00 1.6k 9.6s
claude-direct-claude-haiku-4-5-20251001 20260428T211305Z-8c29 fail 0.00 valid_solid · solid invalid 1.9k 5.9s
claude-direct-claude-sonnet-4-6 20260428T211300Z-5218 PASS 1.00 1.4k 8.3s
claude-direct-claude-sonnet-4-6 20260428T211144Z-9859 PASS 1.00 1.4k 8.7s
claude-direct-claude-sonnet-4-6 20260428T211135Z-65a1 PASS 1.00 1.4k 8.6s
claude-direct-claude-sonnet-4-6 20260428T211127Z-faa5 PASS 1.00 1.4k 8.2s
claude-direct-claude-sonnet-4-6 20260428T211119Z-a793 PASS 1.00 1.4k 7.0s
claude-direct-claude-sonnet-4-6 20260428T211112Z-1249 PASS 1.00 1.5k 7.8s
claude-direct-claude-opus-4-7 20260428T205602Z-0bf4 PASS 1.00 1.7k 7.6s
claude-direct-claude-opus-4-7 20260428T205554Z-7b1f PASS 1.00 1.7k 7.6s
claude-direct-claude-opus-4-7 20260428T205547Z-4faa PASS 1.00 1.5k 6.7s
claude-direct-claude-opus-4-7 20260428T205541Z-8ebc PASS 1.00 1.5k 6.3s
claude-direct-claude-opus-4-7 20260428T205518Z-42a4 PASS 1.00 1.7k 7.4s
claude-direct-claude-opus-4-7 20260428T153537Z-bc66 fail 0.83 bbox · Z off by -5.00mm 1.5k 6.3s
claude-direct-claude-opus-4-7 20260428T153525Z-5f47 fail 0.83 bbox · Z off by -5.00mm 1.5k 6.8s
claude-direct-claude-opus-4-7 20260428T153513Z-cbe8 fail 0.83 bbox · Z off by -5.00mm 1.7k 7.8s
claude-mcp-claude-opus-4-7 20260428T145526Z-5eb0 fail 0.33 bbox · Z off by -5.00mm 364.9k 118.4s
claude-direct-claude-opus-4-7 20260428T142452Z-c85b fail 0.83 bbox · Z off by -5.00mm 1.5k 7.9s
claude-direct-claude-opus-4-7 20260428T141632Z-d7d2 fail 0.83 bbox · Z off by -5.00mm 1.5k 6.8s
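
The page does not state how the score is computed. The values in the table (0.00, 0.17, 0.33, 0.83, 1.00) are consistent with the fraction of the six checks that passed, rounded to two decimals. The sketch below shows that reading only as an assumption; fraction_score is a hypothetical name.

```python
TOTAL_CHECKS = 6  # valid_solid, bbox, mass_props, hole_count, hole_positions, step_roundtrip


def fraction_score(passed, total=TOTAL_CHECKS):
    """Assumed scoring rule: share of checks passed, rounded to two decimals."""
    return round(passed / total, 2)


for passed in (0, 1, 2, 5, 6):
    print(passed, fraction_score(passed))
```

On this reading, a run scored 0.83 with first fail bbox failed only the bbox check, while a run scored 0.33 failed three further checks as well.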

generated 2026-06-17T03:16:07.195Z · static site, regenerate with npm run build -w @mecheval/leaderboard