π AI λͺ¨λΈ λ²€μΉλ§ν¬
βΌ
π₯οΈ Terminal-Bench 2.0 (Top 5)
π Chatbot Arena ELO (Top 5)
π§ ARC-AGI-2 λ¬μ±λ₯
π€ 84.6% β Gemini 3 Deep Think (Google)
π§ Human Panel = 100% κΈ°μ€
μ΄ 1건
β² 0
FACTS Benchmark Suite: LLMμ μ¬μ€μ±μ λ§€κ°λ³μ, κ²μ, λ©ν°λͺ¨λ¬ μΆλ‘ 3κ° μμμμ 체κ³μ μΌλ‘ νκ°νλ λ²€μΉλ§ν¬.