2605.26001v1 May 25, 2026 cs.CL

AI-Assisted Systematization for Evaluating GenAI Systems

Hussein Mozannar
Microsoft
Solon Barocas
Alexandra Chouldechova
Dhruv Agarwal
Emily Sheng
C. Atalla
J. Garcia-Gathright
Hannah Washington
Hanna M. Wallach

Evaluating generative AI (GenAI) systems is challenging because many targets of evaluation are broad, contested concepts, such as "reasoning," "fairness," or "creativity." When these concepts are left underspecified, it becomes unclear what should be measured or how evaluation results should be interpreted. This problem reflects a missing step: systematization, that is, moving from a broad background concept to an explicit, structured account of the concept in measurable terms. Because systematization is cognitively demanding and resource-intensive, we investigate whether AI assistance can support this process. To enable AI-assisted systematization and assess its quality, we introduce a concept spec, which is a structured representation of a systematized concept, and a validation worksheet. We then develop two AI-assisted systematizers: a direct, zero-shot approach and a multi-agent approach that more closely mirrors manual systematization approaches from the existing literature. We use these systematizers to produce concept specs for two concepts, hate-based rhetoric and digital empathy, and evaluate the resulting concept specs on content validity and information recoverability.

