Cube with rectangular pocket A · A2 · a2-cube-with-pocket-01

cube · pocket · boolean · difference

Prompt

Start with a 40mm cube, centered in X and Y with the bottom face on the XY plane (so it spans x in [-20, 20], y in [-20, 20], z in [0, 40]). Cut a rectangular pocket into the top face: remove material in x in [-10, 10], y in [-10, 10], z in [30, 40]. The pocket is open at the top, closed on the four sides and the bottom (10mm of cube remains underneath the pocket). Output a single solid.
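The intervals in the prompt are enough to derive the numbers used by the checks below. As a minimal sketch in plain Python (no CAD kernel; the names `CUBE`, `POCKET`, `box_volume`, `contains`, `expected_volume`, and `expected_bbox` are illustrative and not part of mecheval), the expected volume and bounding box can be computed like this:

```python
# Axis-aligned boxes as ((xmin, xmax), (ymin, ymax), (zmin, zmax)), in mm.
# Values are copied from the prompt; all names here are hypothetical helpers.
CUBE = ((-20, 20), (-20, 20), (0, 40))
POCKET = ((-10, 10), (-10, 10), (30, 40))


def box_volume(box):
    """Volume of an axis-aligned box: product of its three extents."""
    volume = 1
    for lo, hi in box:
        volume *= hi - lo
    return volume


def contains(outer, inner):
    """True if `inner` lies entirely within `outer` on every axis."""
    return all(
        o_lo <= i_lo and i_hi <= o_hi
        for (o_lo, o_hi), (i_lo, i_hi) in zip(outer, inner)
    )


def expected_volume():
    # The simple subtraction is only valid because the pocket is fully
    # inside the cube, so the whole pocket volume is actually removed.
    assert contains(CUBE, POCKET)
    return box_volume(CUBE) - box_volume(POCKET)


def expected_bbox():
    # The pocket is interior to the cube's extents, so cutting it
    # leaves the bounding box unchanged.
    mins = [lo for lo, _ in CUBE]
    maxs = [hi for _, hi in CUBE]
    return mins, maxs
```

The cube contributes 40 × 40 × 40 = 64000 mm³ and the pocket removes 20 × 20 × 10 = 4000 mm³, which leaves the 60000 mm³ that the `mass_props` check expects.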

Checks

0
valid_solid
{
  "type": "valid_solid"
}
1
bbox
{
  "type": "bbox",
  "min": [
    -20,
    -20,
    0
  ],
  "max": [
    20,
    20,
    40
  ],
  "tolerance_mm": 0.05
}
2
mass_props
{
  "type": "mass_props",
  "volume_mm3": 60000,
  "tolerance_pct": 0.1
}
3
step_roundtrip
{
  "type": "step_roundtrip",
  "tolerance_pct": 0.1
}
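
The page does not show how the harness applies the tolerances in the `bbox` and `mass_props` checks. The sketch below is one plausible reading, not the mecheval implementation: it assumes `tolerance_mm` is an absolute per-coordinate bound and `tolerance_pct` is a percentage of the expected value. The function names `bbox_ok` and `volume_ok` are hypothetical.

```python
def bbox_ok(actual_min, actual_max, expected_min, expected_max, tolerance_mm):
    """Assumed semantics: every corner coordinate must be within
    tolerance_mm (absolute) of the expected coordinate."""
    pairs = list(zip(actual_min, expected_min)) + list(zip(actual_max, expected_max))
    return all(abs(a - e) <= tolerance_mm for a, e in pairs)


def volume_ok(actual_mm3, expected_mm3, tolerance_pct):
    """Assumed semantics: the volume error must not exceed
    tolerance_pct percent of the expected volume."""
    allowed = expected_mm3 * tolerance_pct / 100.0
    return abs(actual_mm3 - expected_mm3) <= allowed
```

Under these assumptions, a 0.1% tolerance on 60000 mm³ allows roughly ±60 mm³, so a solid cube with no pocket (64000 mm³) would fail `mass_props`. Likewise, a hypothetical cube spanning x in [0, 40] instead of [-20, 20] would fail `bbox` with a 20 mm offset in X.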

Anti-cheese

{
  "max_solid_count": 1
}

Limits

{
  "max_tokens": 30000,
  "max_wallclock_sec": 180,
  "max_tool_calls": 30
}

Recent attempts

Runs (35)

model  run  status  score  first fail  tokens  wall
claude-mcp-claude-opus-4-7 20260611T172847Z-fea2 PASS 1.00 428.7k 202.9s
claude-mcp-claude-opus-4-7 20260611T172816Z-eae1 PASS 1.00 350.8k 104.2s
claude-mcp-claude-opus-4-7 20260611T172809Z-8070 PASS 1.00 288.0k 188.4s
claude-mcp-claude-opus-4-7 20260611T172755Z-9b69 PASS 1.00 244.9k 51.7s
claude-mcp-claude-opus-4-7 20260611T172652Z-1a7e PASS 1.00 347.4k 62.7s
openai-direct-gpt-5 20260428T215518Z-2456 PASS 1.00 1.9k 14.3s
openai-direct-gpt-5 20260428T215509Z-5125 PASS 1.00 1.4k 9.7s
openai-direct-gpt-5 20260428T215445Z-1870 PASS 1.00 2.7k 23.8s
openai-direct-gpt-5 20260428T215415Z-1e51 PASS 1.00 3.0k 29.7s
openai-direct-gpt-5 20260428T215401Z-0bb6 PASS 1.00 1.6k 13.6s
openai-direct-gpt-5-mini 20260428T215143Z-51b8 PASS 1.00 1.8k 10.9s
openai-direct-gpt-5-mini 20260428T215133Z-c4d2 PASS 1.00 1.8k 10.1s
openai-direct-gpt-5-mini 20260428T215123Z-d2a4 PASS 1.00 1.7k 9.8s
openai-direct-gpt-5-mini 20260428T215112Z-a503 PASS 1.00 1.9k 11.5s
openai-direct-gpt-5-mini 20260428T215102Z-b263 PASS 1.00 1.7k 9.6s
openai-direct-gpt-4o-mini 20260428T213711Z-9f1e fail 0.50 bbox · X off by +20.00mm 959 6.3s
openai-direct-gpt-4o-mini 20260428T213705Z-37ac fail 0.75 bbox · X off by +20.00mm 957 6.4s
openai-direct-gpt-4o-mini 20260428T213659Z-ebfb fail 0.00 valid_solid · solid invalid 959 5.3s
openai-direct-gpt-4o-mini 20260428T213654Z-7530 fail 0.00 valid_solid · solid invalid 958 5.8s
openai-direct-gpt-4o-mini 20260428T213649Z-5eaf fail 0.75 bbox · X off by +20.00mm 957 4.8s
claude-direct-claude-sonnet-4-6 20260428T212232Z-1b2f PASS 1.00 1.0k 4.3s
claude-direct-claude-sonnet-4-6 20260428T212228Z-a4ae PASS 1.00 1.0k 4.1s
claude-direct-claude-sonnet-4-6 20260428T212222Z-3c88 PASS 1.00 1.0k 6.2s
claude-direct-claude-sonnet-4-6 20260428T212217Z-4289 PASS 1.00 1.0k 4.4s
claude-direct-claude-sonnet-4-6 20260428T212213Z-c1dc PASS 1.00 1.0k 4.4s
claude-direct-claude-opus-4-7 20260428T212048Z-8693 PASS 1.00 1.3k 4.1s
claude-direct-claude-opus-4-7 20260428T212044Z-7e1e PASS 1.00 1.3k 4.0s
claude-direct-claude-opus-4-7 20260428T212040Z-21d0 PASS 1.00 1.3k 4.3s
claude-direct-claude-opus-4-7 20260428T212036Z-9151 PASS 1.00 1.3k 4.0s
claude-direct-claude-opus-4-7 20260428T212032Z-a580 PASS 1.00 1.3k 4.1s
claude-direct-claude-haiku-4-5-20251001 20260428T211859Z-1d2a fail 0.00 valid_solid · solid invalid 1.2k 2.5s
claude-direct-claude-haiku-4-5-20251001 20260428T211857Z-29c0 fail 0.00 valid_solid · solid invalid 1.1k 2.3s
claude-direct-claude-haiku-4-5-20251001 20260428T211854Z-89dc fail 0.00 valid_solid · solid invalid 1.1k 2.5s
claude-direct-claude-haiku-4-5-20251001 20260428T211852Z-9fb2 fail 0.00 valid_solid · solid invalid 1.1k 2.4s
claude-direct-claude-haiku-4-5-20251001 20260428T211849Z-a438 fail 0.00 valid_solid · solid invalid 1.1k 2.7s

generated 2026-06-17T03:16:07.204Z · static site, regenerate with npm run build -w @mecheval/leaderboard