From 0% to 36% on Day 1 of ARC-AGI-3

The Agentica SDK by Symbolica achieves an unverified competition score of 36.08% on the publicly available dataset of 25 ARC-AGI-3 games [1], passing 113 out of 182 playable levels and completing 7 out of 25 games [2]. This uses the same harness we previously published for solving the 3 public pre-release ARC-AGI-3 games.

Our implementation outperforms the CoT baselines of 0.2% (Opus 4.6 Max) and 0.3% (GPT 5.4 High) at a far lower cost: Agentica scores 36.08% for $1,005, versus Opus 4.6's 0.25% for $8,900.
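To put the cost comparison in concrete terms, the short sketch below works through the arithmetic using only the four figures quoted above. The "cost per percentage point" metric is an illustrative assumption introduced here, not a measure used by ARC Prize or in the published results.

```python
# Figures quoted in the text above.
agentica = {"score_pct": 36.08, "cost_usd": 1005}
opus_cot = {"score_pct": 0.25, "cost_usd": 8900}


def cost_per_point(run):
    """Dollars spent per percentage point of score (illustrative metric)."""
    return run["cost_usd"] / run["score_pct"]


# How many times more the CoT baseline cost, and how many times higher Agentica scored.
cost_ratio = opus_cot["cost_usd"] / agentica["cost_usd"]
score_ratio = agentica["score_pct"] / opus_cot["score_pct"]

print(f"CoT cost / Agentica cost:   {cost_ratio:.2f}x")
print(f"Agentica score / CoT score: {score_ratio:.2f}x")
print(f"Agentica $ per point: {cost_per_point(agentica):,.2f}")
print(f"Opus CoT $ per point: {cost_per_point(opus_cot):,.2f}")
```

On these numbers, the CoT run costs roughly nine times as much while scoring about two orders of magnitude lower.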

Check out the code on GitHub: symbolica-ai/ARC-AGI-3-Agents

Figure 1. A comparison of score and cost per task on the ARC-AGI-3 public eval set between Chain of Thought (CoT) models and the Agentica ARC-AGI-3 agent for Opus 4.6 (120k) High. For details on the cost per task for Agentica Opus 4.6 (120k) High, see the code.

Gallery - Games Won

Game	Score	Actions	Result
CN04	97.6%	118	WIN
LP85	84.16%	273	WIN
AR25	83.28%	516	WIN
FT09	77.59%	123	WIN

Score Breakdown - All Games

Legend: Beat human baseline | Game won | Game ended

Game	L1	L2	L3	L4	L5	L6	L7	L8	L9	L10	Score
CN04	20	19	22	21	35	—					97.60
LP85	17	11	18	18	23	153	19	13			84.16
AR25	50	30	97	28	73	84	106	47			83.28
FT09	3	7	14	21	21	56					77.59
CD82	60	36	57	14	16	20					70.15
TR87	42	32	39	29	43	3,962					69.21
TU93	17	18	23	45	81	62	14	91	48		67.87
KA59	37	56	37	52	27	113	59				65.33
SB26	18	221	15	20	17	19	67	203			49.35
M0R0	25	43	121	12	61	—					40.06
RE86	24	37	61	132	66	280	263	—			35.54
SU15	16	232	17	105	136	90	27	150	—		35.17
S5I5	33	72	77	141	365	—	—	—			23.85
WA30	39	58	86	80	132	—	—	—	—		22.22
SC25	78	9	30	42	—	—					18.42
VC33	11	15	29	143	—	—	—				17.14
DC22	94	99	114	128	—	—					15.56
G50T	69	180	467	—	—	—	—				8.70
LS20	26	387	251	213	212	502	—				7.13
LF52	23	137	246	174	928	—	—	—	—	—	5.36
R11L	4	432	—	—	—	—					4.76
TN36	57	69	528	—	—	—	—				1.31
SK48	74	72	266	—	—	—	—	—			1.21
SP80	28	120	—	—	—	—					0.73
BP35	48	10	—	—	—	—	—	—	—		0.22
Overall											36.08
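The Overall figure is consistent with an unweighted mean of the 25 per-game scores in the table. The source does not state the aggregation rule, so the sketch below is a consistency check under that assumption, not the official scoring formula.

```python
from statistics import mean

# Per-game scores copied from the Score column of the table above.
scores = {
    "CN04": 97.60, "LP85": 84.16, "AR25": 83.28, "FT09": 77.59, "CD82": 70.15,
    "TR87": 69.21, "TU93": 67.87, "KA59": 65.33, "SB26": 49.35, "M0R0": 40.06,
    "RE86": 35.54, "SU15": 35.17, "S5I5": 23.85, "WA30": 22.22, "SC25": 18.42,
    "VC33": 17.14, "DC22": 15.56, "G50T": 8.70, "LS20": 7.13, "LF52": 5.36,
    "R11L": 4.76, "TN36": 1.31, "SK48": 1.21, "SP80": 0.73, "BP35": 0.22,
}

# Assumption: every game carries equal weight in the overall score.
overall = mean(scores.values())

print(f"{len(scores)} games, mean score {overall:.2f}")
```

Under equal weighting, the mean rounds to the reported 36.08.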

References

[1] ARC Prize Foundation. ARC-AGI-3: A New Challenge for Frontier Agentic Intelligence. ARC Prize Foundation.

[2] ARC Prize. ARC-AGI-3. ARC Prize.

Appendix

A note on scoring

The human baseline scores available via the ARC-AGI-3 API state that the game cn04 has 6 levels in total. This does not match the number of levels in the corresponding game as served by the API.
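This mismatch appears to account for the gap between the level cells in the score table and the 182 playable levels quoted in the introduction. The sketch below counts the level cells (numbers or dashes) in each row of the table. It assumes that each cell corresponds to one level in the human baseline data, and that the sixth cn04 cell is the only one with no playable counterpart. Both are assumptions made for illustration, not statements from the API documentation.

```python
# Level cells (numbers or dashes) per game row in the score table.
level_slots = {
    "CN04": 6, "LP85": 8, "AR25": 8, "FT09": 6, "CD82": 6,
    "TR87": 6, "TU93": 9, "KA59": 7, "SB26": 8, "M0R0": 6,
    "RE86": 8, "SU15": 9, "S5I5": 8, "WA30": 9, "SC25": 6,
    "VC33": 7, "DC22": 6, "G50T": 7, "LS20": 7, "LF52": 10,
    "R11L": 6, "TN36": 7, "SK48": 8, "SP80": 6, "BP35": 9,
}

total_slots = sum(level_slots.values())

# Assumption: the sixth cn04 level listed in the baseline data is not playable.
unplayable = 1
playable = total_slots - unplayable

print(f"level cells in table: {total_slots}, playable levels: {playable}")
```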