Stepped square pyramid (three tiers) A · A2 · a2-stepped-pyramid-01

pyramid · stacked · union · multi-feature

Prompt

Make a stepped square pyramid built from three stacked rectangular blocks. All tiers are centered on the Z axis. Bottom tier: 40mm × 40mm × 10mm tall (x in [-20, 20], y in [-20, 20], z in [0, 10]). Middle tier sits directly on top: 28mm × 28mm × 10mm tall (x in [-14, 14], y in [-14, 14], z in [10, 20]). Top tier sits directly on top of the middle: 16mm × 16mm × 10mm tall (x in [-8, 8], y in [-8, 8], z in [20, 30]). Fuse all three into a single solid.
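As a sanity check on the prompt's coordinates, the three tiers can be modelled as plain axis-aligned boxes. This is a minimal sketch in pure Python, not a CAD solution: the `Box` class and the helpers `centered_tier`, `union_bbox`, and `stacked_flush` are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned block described by its low and high corners, in mm."""
    lo: tuple
    hi: tuple


def centered_tier(side, z0, height):
    """Square tier centered on the Z axis, spanning z0 .. z0 + height."""
    half = side / 2
    return Box((-half, -half, z0), (half, half, z0 + height))


# The three tiers from the prompt, bottom to top.
tiers = [
    centered_tier(40, 0, 10),
    centered_tier(28, 10, 10),
    centered_tier(16, 20, 10),
]


def union_bbox(boxes):
    """Bounding box enclosing every box in the list."""
    lo = tuple(min(b.lo[i] for b in boxes) for i in range(3))
    hi = tuple(max(b.hi[i] for b in boxes) for i in range(3))
    return lo, hi


def stacked_flush(boxes):
    """True if each tier starts exactly where the one below it ends."""
    return all(upper.lo[2] == lower.hi[2] for lower, upper in zip(boxes, boxes[1:]))


print(union_bbox(tiers))
print(stacked_flush(tiers))
```

The enclosing box runs from (-20, -20, 0) to (20, 20, 30), and each tier rests flush on the one below, so the tiers share faces without gaps or overlap.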

Checks

0
valid_solid
{
  "type": "valid_solid"
}
1
bbox
{
  "type": "bbox",
  "min": [
    -20,
    -20,
    0
  ],
  "max": [
    20,
    20,
    30
  ],
  "tolerance_mm": 0.05
}
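One plausible reading of the `bbox` check is a per-coordinate comparison against `min` and `max` within `tolerance_mm`. The sketch below assumes that interpretation; the function `bbox_matches` is a hypothetical name, and the harness's actual comparison is not shown on this page.

```python
def bbox_matches(actual_min, actual_max, expected_min, expected_max, tolerance_mm):
    """True if every corner coordinate is within tolerance_mm of the expected value."""
    pairs = list(zip(actual_min, expected_min)) + list(zip(actual_max, expected_max))
    return all(abs(a - e) <= tolerance_mm for a, e in pairs)


expected_min, expected_max = (-20, -20, 0), (20, 20, 30)

# Small numerical noise stays inside the 0.05 mm tolerance.
close = bbox_matches((-20.01, -20.0, 0.0), (20.0, 20.02, 30.0),
                     expected_min, expected_max, 0.05)

# A model built from the corner instead of the center is shifted by 20 mm in X.
shifted = bbox_matches((0, -20, 0), (40, 20, 30),
                       expected_min, expected_max, 0.05)

print(close, shifted)
```

Under this reading, a solid whose bottom tier spans x in [0, 40] instead of [-20, 20] fails the check, which would be consistent with a first-fail message such as "X off by +20.00mm".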
2
mass_props
{
  "type": "mass_props",
  "volume_mm3": 26400,
  "tolerance_pct": 0.1
}
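The expected volume follows from the tier dimensions in the prompt: because the tiers only touch on faces, the fused volume is the sum of the three block volumes. The sketch below reproduces that arithmetic and shows one assumed interpretation of `tolerance_pct` as a percentage of the expected value; `within_pct` is a hypothetical helper, not the harness's own function.

```python
# (width, depth, height) of each tier in mm, bottom to top.
tier_dims = [(40, 40, 10), (28, 28, 10), (16, 16, 10)]

# 16000 + 7840 + 2560 mm^3; no overlap, so the union volume is the plain sum.
volume_mm3 = sum(w * d * h for w, d, h in tier_dims)


def within_pct(actual, expected, tolerance_pct):
    """True if actual deviates from expected by at most tolerance_pct percent."""
    return abs(actual - expected) <= expected * tolerance_pct / 100


print(volume_mm3)
print(within_pct(26420, 26400, 0.1))
print(within_pct(26430, 26400, 0.1))
```

With a 0.1% tolerance the allowed deviation is about 26.4 mm³, so a measured volume of 26420 mm³ would pass and 26430 mm³ would not.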
3
step_roundtrip
{
  "type": "step_roundtrip",
  "tolerance_pct": 0.1
}

Anti-cheese

{
  "max_solid_count": 1
}

Limits

{
  "max_tokens": 30000,
  "max_wallclock_sec": 180,
  "max_tool_calls": 30
}

Recent attempts

Runs (35)

model  run  status  score  first fail  tokens  wall
claude-mcp-claude-opus-4-7 20260611T175927Z-48b7 PASS 1.00 585.4k 224.1s
claude-mcp-claude-opus-4-7 20260611T175915Z-a50d PASS 1.00 584.9k 242.4s
claude-mcp-claude-opus-4-7 20260611T175751Z-b343 PASS 1.00 326.3k 95.7s
claude-mcp-claude-opus-4-7 20260611T175729Z-3883 PASS 1.00 347.6k 106.2s
claude-mcp-claude-opus-4-7 20260611T175726Z-e452 PASS 1.00 462.7k 178.0s
openai-direct-gpt-5 20260428T215635Z-684e PASS 1.00 2.5k 24.6s
openai-direct-gpt-5 20260428T215622Z-33aa PASS 1.00 2.1k 13.2s
openai-direct-gpt-5 20260428T215607Z-378e PASS 1.00 2.0k 15.0s
openai-direct-gpt-5 20260428T215553Z-40b2 PASS 1.00 1.9k 13.7s
openai-direct-gpt-5 20260428T215533Z-b05e PASS 1.00 2.0k 20.6s
openai-direct-gpt-5-mini 20260428T215252Z-a730 PASS 1.00 1.9k 11.3s
openai-direct-gpt-5-mini 20260428T215241Z-d236 PASS 1.00 1.9k 11.2s
openai-direct-gpt-5-mini 20260428T215228Z-110f PASS 1.00 1.9k 13.0s
openai-direct-gpt-5-mini 20260428T215206Z-0046 PASS 1.00 2.0k 22.0s
openai-direct-gpt-5-mini 20260428T215154Z-03db PASS 1.00 2.0k 11.8s
openai-direct-gpt-4o-mini 20260428T213749Z-8163 fail 0.75 bbox · X off by +20.00mm 1.2k 9.2s
openai-direct-gpt-4o-mini 20260428T213743Z-c9d0 fail 0.00 valid_solid · solid invalid 1.1k 6.2s
openai-direct-gpt-4o-mini 20260428T213736Z-b140 fail 0.00 valid_solid · solid invalid 1.1k 7.3s
openai-direct-gpt-4o-mini 20260428T213728Z-961b fail 0.00 valid_solid · solid invalid 1.1k 7.8s
openai-direct-gpt-4o-mini 20260428T213718Z-08af fail 0.00 valid_solid · solid invalid 1.1k 10.0s
claude-direct-claude-sonnet-4-6 20260428T212259Z-f435 PASS 1.00 1.2k 5.8s
claude-direct-claude-sonnet-4-6 20260428T212253Z-d937 PASS 1.00 1.2k 5.4s
claude-direct-claude-sonnet-4-6 20260428T212248Z-1c49 PASS 1.00 1.2k 5.6s
claude-direct-claude-sonnet-4-6 20260428T212242Z-965b PASS 1.00 1.2k 5.3s
claude-direct-claude-sonnet-4-6 20260428T212236Z-7d66 PASS 1.00 1.2k 5.9s
claude-direct-claude-opus-4-7 20260428T212114Z-ee56 PASS 1.00 1.5k 5.2s
claude-direct-claude-opus-4-7 20260428T212108Z-de73 PASS 1.00 1.5k 6.1s
claude-direct-claude-opus-4-7 20260428T212103Z-9c86 PASS 1.00 1.5k 4.9s
claude-direct-claude-opus-4-7 20260428T212057Z-c52b PASS 1.00 1.5k 5.4s
claude-direct-claude-opus-4-7 20260428T212052Z-9d54 PASS 1.00 1.5k 5.0s
claude-direct-claude-haiku-4-5-20251001 20260428T211914Z-ddb7 PASS 1.00 1.4k 3.5s
claude-direct-claude-haiku-4-5-20251001 20260428T211911Z-6be9 PASS 1.00 1.4k 2.8s
claude-direct-claude-haiku-4-5-20251001 20260428T211908Z-090c fail 0.00 valid_solid · solid invalid 1.4k 3.1s
claude-direct-claude-haiku-4-5-20251001 20260428T211905Z-2cc9 fail 0.00 valid_solid · solid invalid 1.5k 3.1s
claude-direct-claude-haiku-4-5-20251001 20260428T211902Z-1171 fail 0.00 valid_solid · solid invalid 1.4k 3.2s

generated 2026-06-17T03:16:07.220Z · static site, regenerate with npm run build -w @mecheval/leaderboard