2603.16660v1 Mar 17, 2026 cs.CL

Can Linguistically Related Languages Guide LLM Translation in Low-Resource Settings?

A. Ramasethu

Niyathi Allu

Rohin Garg

Harshwardhan Fartale

Dun Li Chan

Abstract

Large Language Models (LLMs) have achieved strong performance across many downstream tasks, yet their effectiveness in extremely low-resource machine translation remains limited. Standard adaptation techniques typically rely on large-scale parallel data or extensive fine-tuning, which are infeasible for the long tail of underrepresented languages. In this work, we investigate a more constrained question: in data-scarce settings, to what extent can linguistically similar pivot languages and few-shot demonstrations provide useful guidance for on-the-fly adaptation in LLMs? We study a data-efficient experimental setup that combines linguistically related pivot languages with few-shot in-context examples, without any parameter updates, and evaluate translation behavior under controlled conditions. Our analysis shows that while pivot-based prompting can yield improvements in certain configurations, particularly in settings where the target language is less well represented in the model's vocabulary, the gains are often modest and sensitive to few-shot example construction. For closely related or better represented varieties, we observe diminishing or inconsistent gains. Our findings provide empirical guidance on how and when inference-time prompting and pivot-based examples can be used as a lightweight alternative to fine-tuning in low-resource translation settings.
