2026-02-17 21:23 KST · by Angtiger · updated daily at 10:00 KST
5 sources
1 new post
← 전체 보기 πŸ“… 2025-12-09
총 1건
Google DeepMind FACTS Benchmark Suite: Systematically evaluating the factuality of large language models
FACTS Benchmark Suite: a benchmark that systematically evaluates the factuality of LLMs across three areas: parametric knowledge, search, and multimodal reasoning.