2605.04759v1 May 06, 2026 cs.CL

Gyan: An Explainable Neuro-Symbolic Language Model

V. Srinivasan

Citations: 227

h-index: 6

Vishaal Jatav

Citations: 12

h-index: 2

A. Chandrababu

Citations: 13

h-index: 2

Geetika Sharma

Citations: 18

h-index: 2

Original Abstract

Transformer-based pre-trained large language models have become ubiquitous. There is increasing evidence that, even with large-scale pre-training, these models do not capture the complete compositional context, and certainly not the full human-analogous context. Moreover, by the very nature of the architecture, these models hallucinate, are difficult to maintain, are not easily interpretable, and require enormous compute resources for training and inference. Here, we describe Gyan, an explainable language model based on a novel non-transformer architecture, without any of these limitations. Gyan achieves SOTA performance on three widely cited data sets and superior performance on two proprietary data sets. The novel architecture decouples the language model from knowledge acquisition and representation. The model draws on rhetorical structure theory, semantic role theory, and knowledge-based computational linguistics. Gyan's meaning representation structure captures the complete compositional context and attempts to mimic humans by expanding the context to a 'world model'. AI model adoption depends critically on trust and transparency, especially in mission-critical use cases. Collectively, our results demonstrate that it is possible to create models that are trustworthy and reliable for mission-critical tasks. We believe our work has tremendous potential for guiding the development of transparent and trusted architectures for language models.

0 Citations

0 Influential

3 Altmetric

15.0 Score
