
openai-direct-gpt-4o-mini

126 run blobs across 26 tasks

All runs

task run status score first fail tokens wall
a3-spherical-dome-block-01 20260428T233026Z-9848 fail 0.50 bbox · X off by +20.00mm 1.1k 5.9s
a3-spherical-dome-block-01 20260428T233015Z-f4f6 fail 0.50 bbox · X off by +20.00mm 1.2k 11.0s
a3-spherical-dome-block-01 20260428T233007Z-2fac fail 0.50 bbox · X off by +20.00mm 1.1k 7.9s
a3-spherical-dome-block-01 20260428T233002Z-e5f6 fail 0.00 valid_solid · solid invalid 1.1k 5.3s
a3-spherical-dome-block-01 20260428T232955Z-9e26 fail 0.25 bbox · X off by +20.00mm 1.1k 7.0s
a3-three-tangent-cylinders-01 20260428T232949Z-5542 fail 0.00 valid_solid · solid invalid 1.1k 5.8s
a3-three-tangent-cylinders-01 20260428T232940Z-577c fail 0.00 valid_solid · solid invalid 1.1k 8.4s
a3-three-tangent-cylinders-01 20260428T232932Z-e636 fail 0.00 valid_solid · solid invalid 1.1k 7.8s
a3-three-tangent-cylinders-01 20260428T232923Z-68aa fail 0.00 valid_solid · solid invalid 1.1k 9.9s
a3-three-tangent-cylinders-01 20260428T232913Z-c272 fail 0.00 valid_solid · solid invalid 1.2k 10.0s
a3-hex-bolt-pattern-01 20260428T232902Z-ac59 fail 0.33 bbox · X off by +30.00mm 1.3k 10.7s
a3-hex-bolt-pattern-01 20260428T232844Z-d044 fail 0.00 valid_solid · solid invalid 1.7k 17.7s
a3-hex-bolt-pattern-01 20260428T232828Z-f3ac fail 0.00 valid_solid · solid invalid 1.5k 15.6s
a3-hex-bolt-pattern-01 20260428T232812Z-78d9 fail 0.00 valid_solid · solid invalid 1.5k 15.8s
a3-hex-bolt-pattern-01 20260428T232757Z-bb93 fail 0.00 valid_solid · solid invalid 1.5k 15.4s
a3-cross-shaft-01 20260428T232749Z-b807 fail 0.00 valid_solid · solid invalid 986 8.0s
a3-cross-shaft-01 20260428T232743Z-1d98 fail 0.00 valid_solid · solid invalid 919 5.8s
a3-cross-shaft-01 20260428T232738Z-32b0 fail 0.00 valid_solid · solid invalid 918 5.0s
a3-cross-shaft-01 20260428T232732Z-a87a fail 0.00 valid_solid · solid invalid 960 5.6s
a3-cross-shaft-01 20260428T232727Z-67ea fail 0.00 valid_solid · solid invalid 948 5.2s
a3-octagonal-flange-01 20260428T232636Z-c549 fail 0.00 valid_solid · solid invalid 1.7k 17.4s
a3-octagonal-flange-01 20260428T232606Z-bd4a fail 0.00 valid_solid · solid invalid 1.5k 14.1s
c-reacher-01 20260428T213853Z-690a fail 0.00 valid_solid · solid invalid 1.4k 11.0s
c-reacher-01 20260428T213839Z-c742 fail 0.20 body_valid · check not implemented 1.5k 13.8s
c-reacher-01 20260428T213822Z-b58a fail 0.00 valid_solid · solid invalid 1.5k 16.8s
c-reacher-01 20260428T213811Z-dfa6 fail 0.00 valid_solid · solid invalid 1.4k 10.6s
c-reacher-01 20260428T213758Z-4442 fail 0.00 valid_solid · solid invalid 1.3k 12.9s
a2-stepped-pyramid-01 20260428T213749Z-8163 fail 0.75 bbox · X off by +20.00mm 1.2k 9.2s
a2-stepped-pyramid-01 20260428T213743Z-c9d0 fail 0.00 valid_solid · solid invalid 1.1k 6.2s
a2-stepped-pyramid-01 20260428T213736Z-b140 fail 0.00 valid_solid · solid invalid 1.1k 7.3s
a2-stepped-pyramid-01 20260428T213728Z-961b fail 0.00 valid_solid · solid invalid 1.1k 7.8s
a2-stepped-pyramid-01 20260428T213718Z-08af fail 0.00 valid_solid · solid invalid 1.1k 10.0s
a2-cube-with-pocket-01 20260428T213711Z-9f1e fail 0.50 bbox · X off by +20.00mm 959 6.3s
a2-cube-with-pocket-01 20260428T213705Z-37ac fail 0.75 bbox · X off by +20.00mm 957 6.4s
a2-cube-with-pocket-01 20260428T213659Z-ebfb fail 0.00 valid_solid · solid invalid 959 5.3s
a2-cube-with-pocket-01 20260428T213654Z-7530 fail 0.00 valid_solid · solid invalid 958 5.8s
a2-cube-with-pocket-01 20260428T213649Z-5eaf fail 0.75 bbox · X off by +20.00mm 957 4.8s
a2-stepped-block-01 20260428T213644Z-bd9a fail 0.75 bbox · X off by +25.00mm 972 4.7s
a2-stepped-block-01 20260428T213639Z-3b25 fail 0.00 valid_solid · solid invalid 970 5.5s
a2-stepped-block-01 20260428T213633Z-c6c9 fail 0.00 valid_solid · solid invalid 941 5.6s
a2-stepped-block-01 20260428T213628Z-ff20 fail 0.00 valid_solid · solid invalid 934 5.1s
a2-stepped-block-01 20260428T213622Z-8108 fail 0.00 valid_solid · solid invalid 936 5.4s
a2-channel-bracket-01 20260428T213611Z-f773 fail 0.00 valid_solid · solid invalid 1.1k 11.1s
a2-channel-bracket-01 20260428T213604Z-e779 fail 0.00 valid_solid · solid invalid 1.1k 6.9s
a2-channel-bracket-01 20260428T213558Z-2f31 fail 0.00 valid_solid · solid invalid 1.0k 5.9s
a2-channel-bracket-01 20260428T213549Z-94b0 fail 0.00 valid_solid · solid invalid 1.1k 9.0s
a2-channel-bracket-01 20260428T213541Z-5bba fail 0.00 valid_solid · solid invalid 1.1k 8.4s
a2-bolt-circle-block-01 20260428T213527Z-4a13 fail 0.00 valid_solid · solid invalid 1.5k 14.1s
a2-bolt-circle-block-01 20260428T213514Z-c235 fail 0.00 valid_solid · solid invalid 1.5k 12.3s
a2-bolt-circle-block-01 20260428T213455Z-a3cf fail 0.00 valid_solid · solid invalid 1.6k 19.7s
a2-bolt-circle-block-01 20260428T213437Z-685e fail 0.00 valid_solid · solid invalid 1.5k 18.0s
a2-bolt-circle-block-01 20260428T213425Z-540c fail 0.00 valid_solid · solid invalid 1.5k 11.5s
a2-mounting-rail-01 20260428T213416Z-475d fail 0.00 valid_solid · solid invalid 1.2k 9.1s
a2-mounting-rail-01 20260428T213406Z-444c fail 0.00 valid_solid · solid invalid 1.3k 10.0s
a2-mounting-rail-01 20260428T213355Z-a142 fail 0.00 valid_solid · solid invalid 1.1k 10.6s
a2-mounting-rail-01 20260428T213345Z-baa6 fail 0.00 valid_solid · solid invalid 1.3k 9.8s
a2-mounting-rail-01 20260428T213338Z-8aac fail 0.00 valid_solid · solid invalid 1.1k 6.8s
a2-finned-block-01 20260428T213324Z-074f fail 0.00 valid_solid · solid invalid 1.4k 14.4s
a2-finned-block-01 20260428T213317Z-42fe fail 0.00 valid_solid · solid invalid 1.2k 7.2s
a2-finned-block-01 20260428T213307Z-4fcb fail 0.00 valid_solid · solid invalid 1.3k 10.0s
a2-finned-block-01 20260428T213253Z-5004 fail 0.00 valid_solid · solid invalid 1.4k 13.3s
a2-finned-block-01 20260428T213240Z-691c fail 0.00 valid_solid · solid invalid 1.4k 13.1s
a2-cubemark-01 20260428T213222Z-07fe fail 0.00 valid_solid · solid invalid 2.1k 18.2s
a2-cubemark-01 20260428T213207Z-8739 fail 0.00 valid_solid · solid invalid 1.6k 15.0s
a2-cubemark-01 20260428T213143Z-26bf fail 0.00 valid_solid · solid invalid 2.1k 23.6s
a2-cubemark-01 20260428T213116Z-0e69 fail 0.00 valid_solid · solid invalid 2.5k 27.2s
a2-cubemark-01 20260428T213056Z-bf96 fail 0.00 valid_solid · solid invalid 2.2k 20.3s
a2-tee-bracket-01 20260428T213039Z-f07f fail 0.00 valid_solid · solid invalid 1.6k 17.1s
a2-tee-bracket-01 20260428T213028Z-e5a4 fail 0.00 valid_solid · solid invalid 1.2k 11.1s
a2-tee-bracket-01 20260428T213016Z-0155 fail 0.00 valid_solid · solid invalid 1.4k 12.0s
a2-tee-bracket-01 20260428T213001Z-06b5 fail 0.00 valid_solid · solid invalid 1.3k 14.0s
a2-tee-bracket-01 20260428T212953Z-bd73 fail 0.00 valid_solid · solid invalid 1.3k 8.8s
a2-square-flange-01 20260428T212940Z-28cf fail 0.00 valid_solid · solid invalid 1.5k 12.3s
a2-square-flange-01 20260428T212927Z-f70f fail 0.00 valid_solid · solid invalid 1.4k 13.6s
a2-square-flange-01 20260428T212915Z-56db fail 0.00 valid_solid · solid invalid 1.3k 11.6s
a2-square-flange-01 20260428T212902Z-d8d8 fail 0.50 bbox · X off by +40.00mm 1.5k 12.7s
a2-square-flange-01 20260428T212853Z-8601 fail 0.00 valid_solid · solid invalid 1.1k 9.0s
a2-washer-01 20260428T212849Z-2c92 PASS 1.00 806 4.0s
a2-washer-01 20260428T212846Z-0610 PASS 1.00 796 3.4s
a2-washer-01 20260428T212842Z-f603 fail 0.00 valid_solid · solid invalid 816 3.5s
a2-washer-01 20260428T212837Z-fa48 fail 0.00 valid_solid · solid invalid 816 4.7s
a2-washer-01 20260428T212834Z-fb11 PASS 1.00 804 3.7s
a2-l-bracket-01 20260428T212824Z-cd39 fail 0.00 valid_solid · solid invalid 1.2k 9.2s
a2-l-bracket-01 20260428T212817Z-8ed1 fail 0.00 valid_solid · solid invalid 1.2k 7.8s
a2-l-bracket-01 20260428T212806Z-e5ff fail 0.00 valid_solid · solid invalid 1.1k 10.3s
a2-l-bracket-01 20260428T212757Z-4ef2 fail 0.00 valid_solid · solid invalid 1.2k 9.6s
a2-l-bracket-01 20260428T212747Z-b3e1 fail 0.00 valid_solid · solid invalid 1.2k 9.8s
a2-flanged-cap-01 20260428T212734Z-4748 fail 0.00 valid_solid · solid invalid 1.3k 12.9s
a2-flanged-cap-01 20260428T212727Z-1fbd fail 0.00 valid_solid · solid invalid 1.0k 6.7s
a2-flanged-cap-01 20260428T212654Z-34a8 fail 0.00 valid_solid · solid invalid 1.9k 19.3s
a2-flanged-cap-01 20260428T212636Z-19f4 fail 0.00 valid_solid · solid invalid 1.6k 18.6s
a1-pipe-01 20260428T212632Z-8e8c fail 0.00 valid_solid · solid invalid 805 3.2s
a1-pipe-01 20260428T212629Z-9270 fail 0.00 valid_solid · solid invalid 817 3.7s
a1-pipe-01 20260428T212625Z-1b5f PASS 1.00 807 3.7s
a1-pipe-01 20260428T212621Z-971c fail 0.00 valid_solid · solid invalid 805 3.5s
a1-pipe-01 20260428T212618Z-9c33 fail 0.00 valid_solid · solid invalid 766 3.2s
a1-block-01 20260428T212616Z-065e fail 0.00 valid_solid · solid invalid 738 2.5s
a1-block-01 20260428T212613Z-2ad3 fail 0.00 valid_solid · solid invalid 740 2.7s
a1-block-01 20260428T212610Z-a8d5 PASS 1.00 772 3.1s
a1-block-01 20260428T212606Z-d541 fail 0.00 valid_solid · solid invalid 742 3.4s
a1-block-01 20260428T212603Z-3c4d fail 0.00 valid_solid · solid invalid 738 3.3s
a1-cone-01 20260428T212600Z-41a8 fail 0.00 valid_solid · solid invalid 695 2.7s
a1-cone-01 20260428T212558Z-e36f PASS 1.00 696 2.2s
a1-cone-01 20260428T212555Z-4fd3 fail 0.00 valid_solid · solid invalid 696 2.5s
a1-cone-01 20260428T212553Z-8c37 fail 0.00 valid_solid · solid invalid 696 2.2s
a1-cone-01 20260428T212550Z-977b PASS 1.00 695 2.6s
a1-sphere-01 20260428T212548Z-faa7 fail 0.00 valid_solid · solid invalid 653 2.3s
a1-sphere-01 20260428T212546Z-6860 PASS 1.00 649 1.9s
a1-sphere-01 20260428T212543Z-6b4c fail 0.00 valid_solid · solid invalid 653 3.0s
a1-sphere-01 20260428T212541Z-344a PASS 1.00 636 2.0s
a1-sphere-01 20260428T212539Z-ae93 PASS 1.00 653 2.1s
a1-stepped-shaft-01 20260428T212535Z-e73e PASS 1.00 836 3.5s
a1-stepped-shaft-01 20260428T212531Z-59c3 fail 0.00 valid_solid · solid invalid 861 4.1s
a1-stepped-shaft-01 20260428T212526Z-53ae fail 0.00 valid_solid · solid invalid 861 4.9s
a1-stepped-shaft-01 20260428T212522Z-1bc6 fail 0.00 valid_solid · solid invalid 866 4.1s
a1-stepped-shaft-01 20260428T212518Z-05e6 fail 0.00 valid_solid · solid invalid 865 4.0s
a1-plate-01 20260428T212504Z-3936 fail 0.00 valid_solid · solid invalid 1.4k 14.3s
a1-plate-01 20260428T212455Z-cae1 fail 0.00 valid_solid · solid invalid 1.2k 8.9s
a1-plate-01 20260428T212445Z-c4d8 fail 0.00 valid_solid · solid invalid 1.2k 9.7s
a1-plate-01 20260428T212433Z-2fb5 fail 0.00 valid_solid · solid invalid 1.2k 12.0s
a1-cube-01 20260428T212419Z-7957 fail 0.00 valid_solid · solid invalid 702 2.8s
a1-cube-01 20260428T212416Z-4c5e fail 0.00 valid_solid · solid invalid 703 3.2s
a1-cube-01 20260428T212413Z-5107 fail 0.00 valid_solid · solid invalid 698 3.3s
a1-cube-01 20260428T212409Z-f0f5 fail 0.00 valid_solid · solid invalid 731 3.5s
a1-cube-01 20260428T212406Z-e29d fail 0.00 valid_solid · solid invalid 698 3.5s
a1-cube-01 20260428T212250Z-57e5 fail 0.00 valid_solid · solid invalid 731 5.6s
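
The rows above can be tallied per task with a short script. The following is a minimal sketch, not part of mecheval itself: it assumes each row is whitespace-separated with the task, run id, status, and score first and the token count and wall time last, so that anything in between is the "first fail" text. The helper names `parse_row` and `pass_counts` are hypothetical, and the sample rows are copied from the table.

```python
from collections import defaultdict


def parse_row(line):
    """Split one table row into named fields (assumed layout, see lead-in)."""
    parts = line.split()
    task, run_id, status, score = parts[:4]
    tokens, wall = parts[-2:]
    return {
        "task": task,
        "run": run_id,
        "passed": status == "PASS",
        "score": float(score),
        # Whatever sits between the score and the token count is the
        # first-fail description; it is empty for PASS rows.
        "first_fail": " ".join(parts[4:-2]),
        "tokens": tokens,  # kept as text, e.g. "1.1k" or "806"
        "wall_s": float(wall.rstrip("s")),
    }


def pass_counts(lines):
    """Return {task: (passed_runs, total_runs)} for the given rows."""
    counts = defaultdict(lambda: [0, 0])
    for line in lines:
        row = parse_row(line)
        counts[row["task"]][0] += int(row["passed"])
        counts[row["task"]][1] += 1
    return {task: tuple(c) for task, c in counts.items()}


# Sample rows copied verbatim from the table above.
SAMPLE = [
    "a2-washer-01 20260428T212849Z-2c92 PASS 1.00 806 4.0s",
    "a2-washer-01 20260428T212846Z-0610 PASS 1.00 796 3.4s",
    "a2-washer-01 20260428T212842Z-f603 fail 0.00 valid_solid · solid invalid 816 3.5s",
    "a3-spherical-dome-block-01 20260428T233026Z-9848 fail 0.50 bbox · X off by +20.00mm 1.1k 5.9s",
]

if __name__ == "__main__":
    for task, (passed, total) in pass_counts(SAMPLE).items():
        print(f"{task}: {passed}/{total} passed")
```

Feeding every row of the table to `pass_counts` instead of `SAMPLE` gives the per-task pass counts for the whole run set.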

generated 2026-06-17T03:16:07.291Z · static site, regenerate with npm run build -w @mecheval/leaderboard