2026-02-17 23:59 KST · by Angtiger · Updated daily at 08:00 KST
17 sources
65 new posts

🏆 AI Model Benchmarks


🖥 Terminal-Bench 2.0 (Top 5)

Source: Terminal-Bench · Anthropic · OpenAI

🏆 Chatbot Arena ELO (Top 5)

🧠 ARC-AGI-2 달성λ₯ 

🤖 84.6%: Gemini 3 Deep Think (Google) · 🧑 Human Panel = 100% baseline
← 전체 보기 πŸ“… 2026-02
총 65건 Β· νŽ˜μ΄μ§€ 2/4
HF Daily Papers STATe-of-Thoughts: Structured Action Templates for Tree-of-Thoughts
STATe-of-Thoughts: an interpretable inference-time-compute method that explores high-level reasoning patterns. It addresses the lack of diversity in existing Tree-of-Thoughts approaches with structured action templates.
Y Combinator The New Way To Build A Startup
A video on the new way to build a startup.
vibenote from subinium Anthropic (@AnthropicAI) on X
A post on X (Twitter) from the official Anthropic account.
Bizcafe (BZCF) "Listen to Me" (BlackRock's Larry Fink)
A video on the message from BlackRock's Larry Fink.
HF Daily Papers SPILLage: Agentic Oversharing on the Web
SPILLage: a study that formalizes and analyzes agentic oversharing, in which LLM-based web agents share user resources (email, calendar, and so on) with third parties more than necessary.
OpenAI Blog Introducing GPT-5.3-Codex-Spark
GPT-5.3-Codex-Spark released as a research preview. It is a smaller version of GPT-5.3-Codex and the first model designed for real-time coding.
OpenAI Blog Beyond rate limits: scaling access to Codex and Sora
With Codex and Sora usage exceeding original expectations, OpenAI is expanding access beyond rate limits.
OpenAI Blog Scaling social science research
Introduces a new tool that lets researchers convert qualitative data into numbers they can analyze. It helps scientists move faster and tackle harder problems.
HF Daily Papers Nanbeige4.1-3B: A Small General Model that Reasons, Aligns, and Acts
Announces Nanbeige4.1-3B, a unified general-purpose language model that combines agentic behavior, code generation, and general reasoning with only 3B parameters. It is presented as the first open-source small language model (SLM) to achieve this versatility.
Hugging Face Blog Custom Kernels for All from Codex and Claude
Democratizing custom kernel development with Codex and Claude. Part of a journey to advance and democratize AI through open source and open science.
OpenAI Blog Introducing Lockdown Mode and Elevated Risk labels in ChatGPT
ChatGPT introduces Lockdown Mode and Elevated Risk labels. As AI systems carry out complex tasks on the web and in connected apps, security risks such as prompt injection attacks are growing.
vibenote from subinium OpenAI Developers (@OpenAIDevs) on X
A post on X (Twitter) from the official OpenAI Developers account.
vibenote from subinium Google Gemini (@GeminiApp) on X
A post on X (Twitter) from the official Google Gemini account.
Bizcafe (BZCF) "Why Singapore Succeeded" (Lee Kuan Yew)
A video of Lee Kuan Yew explaining the secret of Singapore's success.
Hugging Face Blog OpenEnv in Practice: Evaluating Tool-Using Agents in Real-World Environments
OpenEnv: a study on evaluating tool-using agents in real-world environments. Part of advancing AI through open source and open science.
OpenAI Blog Harness engineering: leveraging Codex in an agent-first world
A case study of building and shipping an internal beta of a software product with Codex in an agent-first world, with zero lines of manually written code.
Google DeepMind Gemini 3 Deep Think: Advancing science, research and engineering
Gemini 3 Deep Think accelerates progress in science, research, and engineering with state-of-the-art reasoning capabilities.
Bizcafe (BZCF) Worth an Hour of Your Time (a16z full interview)
a16z full interview: a conversation worth an hour of your time.
OpenAI Blog Testing ads in ChatGPT
Ad testing begins in ChatGPT, targeting logged-in adult users on the Free and Go tiers in the United States. Ads are not shown on the Plus, Pro, Business, Enterprise, and Education tiers.
HF Daily Papers DeepImageSearch: Benchmarking Multimodal Agents for Context-Aware Image Retrieval in Visual Histories
DeepImageSearch: a benchmark for multimodal agents performing context-aware image retrieval in visual histories. It proposes a new agentic paradigm that redefines image retrieval as an autonomous exploration task.