Oral presentation at NAACL 2025

The CORE research group has had one paper accepted as an oral presentation at NAACL 2025, one of the leading conferences on natural language processing.

The paper “Large Language Models Share Representations of Latent Grammatical Concepts Across Typologically Diverse Languages” by Jannik Brinkmann, Chris Wendler, Christian Bartelt, and Aaron Mueller has been accepted as an oral presentation at the main conference of NAACL 2025. The Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL, Ranking A) is one of the top-tier conferences on natural language processing and will be held in Albuquerque, New Mexico (USA), from April 29 to May 4, 2025.

Abstract: Human bilinguals often use similar brain regions to process multiple languages, depending on when they learned their second language and their proficiency. In large language models (LLMs), how are multiple languages learned and encoded? In this work, we explore the extent to which LLMs share representations of morphosyntactic concepts such as grammatical number, gender, and tense across languages. We train sparse autoencoders on Llama-3-8B and Aya-23-8B, and demonstrate that abstract grammatical concepts are often encoded in feature directions shared across many languages. We use causal interventions to verify the multilingual nature of these representations; specifically, we show that ablating only multilingual features decreases classifier performance to near-chance across languages. We then use these features to precisely modify model behavior in a machine translation task; this demonstrates both the generality and selectivity of these features' roles in the network. Our findings suggest that even models trained predominantly on English data can develop robust, cross-lingual abstractions of morphosyntactic concepts.
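For readers unfamiliar with the core technique, the sketch below illustrates the general idea of a sparse autoencoder over LLM hidden states and of ablating selected features, as mentioned in the abstract. It is a minimal, generic PyTorch sketch, not the authors' code: all dimensions, names, and hyperparameters are illustrative assumptions.

```python
# Minimal sketch of a sparse autoencoder (SAE) over LLM hidden states,
# plus feature ablation. Illustrative only: dimensions, names, and the
# L1 penalty weight are assumptions, not the paper's actual setup.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_features: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)
        self.decoder = nn.Linear(d_features, d_model)

    def forward(self, h: torch.Tensor):
        f = torch.relu(self.encoder(h))   # sparse, non-negative feature activations
        h_hat = self.decoder(f)           # reconstruction of the hidden state
        return h_hat, f

def sae_loss(h, h_hat, f, l1_coeff: float = 1e-3):
    # Reconstruction error plus an L1 sparsity penalty on the features.
    return ((h - h_hat) ** 2).mean() + l1_coeff * f.abs().mean()

def ablate_features(sae, h, feature_ids):
    # Causal intervention: zero out selected (e.g. multilingual) features
    # and decode, yielding a hidden state with those directions removed.
    _, f = sae(h)
    f = f.clone()
    f[:, feature_ids] = 0.0
    return sae.decoder(f)

# Toy usage; real hidden states would come from a model such as Llama-3-8B.
h = torch.randn(16, 4096)                 # batch of residual-stream activations
sae = SparseAutoencoder(d_model=4096, d_features=32768)
h_hat, f = sae(h)
loss = sae_loss(h, h_hat, f)
h_ablated = ablate_features(sae, h, feature_ids=[3, 17])
```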

The full paper is available at https://arxiv.org/abs/2501.06346.