
Three-step pyramid with bolt holes through base A · A4 · a4-stepped-pyramid-with-holes-01

stepped · pyramid · bolt-holes · boolean · advanced

Expected

Prompt

Make a stepped pyramid composed of three concentric square layers stacked along Z, fused into a single solid. Base layer: 80mm × 80mm × 10mm, centered in X and Y, bottom face on the XY plane (z = 0 to z = 10). Middle layer: 60mm × 60mm × 10mm, centered, sitting on top of the base layer (z = 10 to z = 20). Top layer: 40mm × 40mm × 10mm, centered, sitting on top of the middle layer (z = 20 to z = 30). The base layer has four bolt holes of diameter 6mm, axes parallel to Z, drilled through the full 10mm thickness of the base only (z = 0 to z = 10), located at (30, 30), (-30, 30), (-30, -30), (30, -30). Output a single solid.
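The layer stack and hole placement described in the prompt can be restated as a point-membership test, which makes the intended geometry easy to spot-check. The sketch below is plain Python rather than a CAD script, and the names `LAYERS`, `HOLES`, and `inside` are hypothetical helpers, not part of mecheval.

```python
import math

# (half_width_mm, z_min_mm, z_max_mm) for each square layer, taken from the prompt.
LAYERS = [(40.0, 0.0, 10.0), (30.0, 10.0, 20.0), (20.0, 20.0, 30.0)]

HOLE_RADIUS = 3.0  # 6 mm diameter
HOLE_Z_MIN, HOLE_Z_MAX = 0.0, 10.0  # holes exist in the base layer only
HOLES = [(30.0, 30.0), (-30.0, 30.0), (-30.0, -30.0), (30.0, -30.0)]


def inside(x, y, z):
    """Return True if the point (x, y, z) lies in the material of the target solid."""
    in_layer = any(
        abs(x) <= half and abs(y) <= half and z0 <= z <= z1
        for half, z0, z1 in LAYERS
    )
    if not in_layer:
        return False
    # Material is removed only inside a hole cylinder within the base thickness.
    if HOLE_Z_MIN <= z < HOLE_Z_MAX:
        for hx, hy in HOLES:
            if math.hypot(x - hx, y - hy) < HOLE_RADIUS:
                return False
    return True


print(inside(30, 30, 5))   # hole axis inside the base: no material
print(inside(29, 29, 15))  # middle layer directly above that hole: material
```

One consequence of the stated dimensions: the hole centers coincide with the corners of the 60 mm middle layer at (±30, ±30), so the middle layer covers one quadrant of each hole's top opening at z = 10.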

Checks

0
valid_solid
{
  "type": "valid_solid"
}
1
bbox
{
  "type": "bbox",
  "min": [
    -40,
    -40,
    0
  ],
  "max": [
    40,
    40,
    30
  ],
  "tolerance_mm": 0.1
}
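As an illustration of how a bounding-box check with a tolerance could be evaluated, here is a minimal sketch. The function `bbox_within` is hypothetical; the page does not show how mecheval implements this check, so the per-coordinate comparison is an assumption.

```python
def bbox_within(actual_min, actual_max, expected_min, expected_max, tol_mm):
    """Return True if every bbox coordinate is within tol_mm of its expected value."""
    pairs = list(zip(actual_min, expected_min)) + list(zip(actual_max, expected_max))
    return all(abs(actual - expected) <= tol_mm for actual, expected in pairs)


EXPECTED_MIN = [-40, -40, 0]
EXPECTED_MAX = [40, 40, 30]
TOL_MM = 0.1

# A correctly stacked pyramid matches the expected extents.
print(bbox_within([-40, -40, 0], [40, 40, 30], EXPECTED_MIN, EXPECTED_MAX, TOL_MM))

# A solid that contains only the 10 mm base layer is 20 mm short in Z and fails.
print(bbox_within([-40, -40, 0], [40, 40, 10], EXPECTED_MIN, EXPECTED_MAX, TOL_MM))
```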
2
mass_props
{
  "type": "mass_props",
  "volume_mm3": 114869.03,
  "tolerance_pct": 0.5
}
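The expected volume follows directly from the prompt's dimensions: the three layer volumes minus four cylindrical holes through the 10 mm base. The short calculation below reproduces it; the variable names are illustrative only.

```python
import math

# (width_mm, depth_mm, height_mm) for the base, middle, and top layers.
layers = [(80, 80, 10), (60, 60, 10), (40, 40, 10)]
layer_volume = sum(w * d * h for w, d, h in layers)  # 64000 + 36000 + 16000

# Four 6 mm diameter holes through the 10 mm base thickness.
hole_volume = 4 * math.pi * (6 / 2) ** 2 * 10

expected_volume = layer_volume - hole_volume
tolerance_mm3 = expected_volume * 0.5 / 100  # tolerance_pct = 0.5

print(round(expected_volume, 2))  # 114869.03
print(round(hole_volume, 2))      # 1130.97
print(round(tolerance_mm3, 2))    # 574.35
```

Because the combined hole volume (about 1131 mm³) is larger than the allowed deviation (about 574 mm³), a solid that omits all four holes would fall outside the mass_props tolerance.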
3
hole_count
{
  "type": "hole_count",
  "diameter_mm": 6,
  "expected": 4,
  "diameter_tolerance_mm": 0.05
}
4
hole_positions
{
  "type": "hole_positions",
  "diameter_mm": 6,
  "positions": [
    [
      30,
      30,
      0
    ],
    [
      -30,
      30,
      0
    ],
    [
      -30,
      -30,
      0
    ],
    [
      30,
      -30,
      0
    ]
  ],
  "tolerance_mm": 0.15
}
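A position check of this kind could be evaluated by pairing each expected center with a detected hole center within the tolerance. The sketch below is an assumption about one reasonable approach, not the mecheval implementation: `positions_match` is a hypothetical helper, it compares X and Y only, and it ignores the order in which holes are reported.

```python
import math


def positions_match(found, expected, tol_mm):
    """Return True if each expected hole center has a distinct found center within tol_mm (XY only)."""
    if len(found) != len(expected):
        return False
    remaining = list(found)
    for ex, ey, *_ in expected:
        hit = next(
            (p for p in remaining if math.hypot(p[0] - ex, p[1] - ey) <= tol_mm),
            None,
        )
        if hit is None:
            return False
        remaining.remove(hit)  # each found center may satisfy only one expected center
    return True


EXPECTED = [(30, 30, 0), (-30, 30, 0), (-30, -30, 0), (30, -30, 0)]

# Same holes in a different order, with small numerical noise.
found_ok = [(-30.02, -29.99, 0), (30.0, -30.01, 0), (29.98, 30.0, 0), (-30.0, 30.03, 0)]
print(positions_match(found_ok, EXPECTED, 0.15))
```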
5
step_roundtrip
{
  "type": "step_roundtrip",
  "tolerance_pct": 0.5
}

Anti-cheese

{
  "max_solid_count": 1
}

Limits

{
  "max_tokens": 40000,
  "max_wallclock_sec": 240,
  "max_tool_calls": 40
}

Recent attempts

Runs (20)

model · run · status · score · first fail · tokens · wall
claude-mcp-claude-opus-4-7 20260611T223337Z-b49a fail 0.33 bbox · Z off by -19.80mm 663.2k 118.8s
claude-mcp-claude-opus-4-7 20260611T223156Z-ddf8 fail 0.83 step_roundtrip · STEP drift on 1/1 solid 620.2k 190.4s
claude-mcp-claude-opus-4-7 20260611T223141Z-518c fail 0.50 mass_props · volume off by 1.5% 588.9k 260.1s
claude-mcp-claude-opus-4-7 20260611T223136Z-b182 fail 0.33 bbox · Z off by -18.00mm 626.8k 120.1s
claude-mcp-claude-opus-4-7 20260611T223029Z-33fa fail 0.50 mass_props · volume off by 1.0% 615.4k 87.8s
claude-direct-claude-sonnet-4-6 20260429T120501Z-977a fail 0.83 step_roundtrip · STEP drift on 1/1 solid 1.8k 9.5s
claude-direct-claude-opus-4-7 20260429T120459Z-ee4c PASS 1.00 2.2k 9.8s
claude-direct-claude-opus-4-7 20260429T120450Z-ff11 PASS 1.00 2.1k 8.8s
claude-direct-claude-sonnet-4-6 20260429T120449Z-2893 PASS 1.00 1.9k 12.4s
claude-direct-claude-opus-4-7 20260429T120440Z-2613 fail 0.83 step_roundtrip · STEP drift on 1/1 solid 2.2k 9.4s
claude-direct-claude-sonnet-4-6 20260429T120438Z-e0b6 fail 0.83 step_roundtrip · STEP drift on 1/1 solid 1.9k 10.9s
claude-direct-claude-opus-4-7 20260429T120430Z-af74 PASS 1.00 2.2k 10.6s
claude-direct-claude-sonnet-4-6 20260429T120426Z-bbdc PASS 1.00 1.9k 11.5s
claude-direct-claude-opus-4-7 20260429T120420Z-7572 fail 0.83 step_roundtrip · STEP drift on 1/1 solid 2.2k 9.8s
claude-direct-claude-sonnet-4-6 20260429T120415Z-10c6 PASS 1.00 1.9k 10.8s
claude-direct-claude-haiku-4-5-20251001 20260429T120342Z-dbf3 fail 0.00 valid_solid · solid invalid 2.3k 6.2s
claude-direct-claude-haiku-4-5-20251001 20260429T120336Z-2fa3 fail 0.00 valid_solid · solid invalid 2.2k 5.9s
claude-direct-claude-haiku-4-5-20251001 20260429T120329Z-c74b fail 0.00 valid_solid · solid invalid 2.2k 6.1s
claude-direct-claude-haiku-4-5-20251001 20260429T120323Z-21da fail 0.00 valid_solid · solid invalid 2.4k 6.5s
claude-direct-claude-haiku-4-5-20251001 20260429T120316Z-5316 fail 0.00 valid_solid · solid invalid 2.5k 6.6s

generated 2026-06-17T03:16:07.255Z · static site, regenerate with npm run build -w @mecheval/leaderboard