Conference Programme – Day 3

LREC 2026: Program
Friday, 15 May, 2026
09:00 - 10:40    Session O29: Infrastructures, Policy and Legal Issues II - Auditorium Illes Balears
Chair: Karen Fort
09:00 - 09:20    Mitigating Misinterpretation in Policy Documents through Automated Language Understanding
Momojit Biswas, Anka Chandrahas Tummepalli, Preethu Rose Anish
TCS Research
09:20 - 09:40    Sovereign AI-based Public Services Are Viable and Affordable
António Branco1, Luis M. S. Gomes2, Rodrigo Santos1, Eduardo Santos1, João Ricardo Silva1, Nuno Marques1, Madalena Rodrigues1
1University of Lisbon, 2Faculdade de Ciencias da Universidade de Lisboa
09:40 - 10:00    A Typology of Synthetic Datasets for Dialogue Processing in Clinical Contexts
Steven Bedrick1, A. Seza Dogruoz2, Sergiu Nisioi3
1Oregon Health & Science University, 2Universiteit Gent, 3Human Language Technologies Research Center, University of Bucharest
10:00 - 10:20    Text+: A National Hub Including Legacy Language Data
Florian Barth1, Christoph Draxler2, Jennifer Ecker3, Stefan Fischer4, Philippe Genêt5, Alina Hemmer6, Timm Lehmberg7, Thorsten Trippel8, Andreas Witt3, Arden Zimmermann5, Claus Zinn9
1University of Göttingen, 2Institute of Phonetics and Speech Processing, LMU Munich, 3Leibniz Institute for the German Language, 4Universität des Saarlandes, 5Deutsche Nationalbibliothek, 6University of Hamburg, 7Academy of Science and Humanities in Hamburg, 8Leibniz-Institut für Deutsche Sprache, 9University of Tübingen
10:20 - 10:40    Can NLP Tackle Hate Speech in the Real World? Stakeholder-Informed Feedback and Survey on Counterspeech
Tanvi Dinkar1, Aiqi Jiang1, Simona Frenda2, Poppy Gerrard-Abbott3, Nancie A Gunson2, Gavin Abercrombie1, Ioannis Konstas2
1Heriot Watt University, 2Heriot-Watt University, 3University of Edinburgh/Heriot-Watt University
09:00 - 10:40    Session O30: Opinion and Argument Mining, Sentiment Analysis - Auditorium Mallorca
Chair: Diana Inkpen
09:00 - 09:20    Towards Complex Debate Understanding: Predicting Claim Impact Scores through the Modelling of Claim Interactions
Maxime Brouat1, Mihai Surdeanu2, Srdjan Vesic1, Eduardo Blanco2
1CRIL CNRS Univ. Artois, 2University of Arizona
09:20 - 09:40    Is There Anything More Deceptive than an Obvious Fact? Investigating Implicitness in User-Generated Argumentative Text
Ekaterina Sviridova1, Elena Cabrio2, Serena Villata3
1Université Côte d'Azur, 2Université Côte d'Azur, Inria, CNRS, I3S, 3Université Côte d'Azur, CNRS, Inria, I3S
09:40 - 10:00    Best-Worst Scaling of Hype in Biomedical Research: Building an Intensity Lexicon of Promotional Adjectives
Neil Millar1, Dipesh Satav1, Bojan Batalo2, Erica K. Shimomoto3, Ryosuke L Ohniwa1
1University of Tsukuba, 2AIST, 3National Institute of Advanced Industrial Science and Technology
10:00 - 10:20    Trust Me, I Can Convince You: The Contextualized Argument Appraisal Framework and the ContArgA Corpus
Lynn Greschner, Sabine Weber, Roman Klinger
University of Bamberg
10:20 - 10:40    Towards Clinical Applications of NLP: Detecting Emotion Regulation via Emotional Categories and Expression Modes in French Transcriptions
Salome Klein1, Amalia Todirascu2, Hélène Vassiliadou3
1UR 1339/LiLPa & FRLC (University of Strasbourg), 2LiLPa, University of Strasbourg, 3University of Strasbourg
09:00 - 10:40    Session O31: Bias, Offensive and Non-inclusive Language - Menorca 1
Chair: Penny Labropoulou
09:00 - 09:20    R.U.Psycho? A Framework for Robust Unified Psychometric Testing of Language Models
Julian Schelb1, Orr Borin2, David Garcia1, Andreas Spitz1
1University of Konstanz, 2Recosys
09:20 - 09:40    Code-switching as a Bias Indicator in LLMs: "the Consequences Are Not the Same Para Nosotros"
Fanny Ducel1, Aurélie Névéol2, Vidit Khazanchi3, Loïc Leclere4, Arthur Pedrini4, Léa Bouchet5, Benjamin Caissial5, Karen Fort6
1LISN, Université Paris-Saclay, 2Université Paris Saclay, CNRS, LISN, 3LORIA, 4Université de Lorraine, LORIA, 5Université de Lorraine, 6Université de Lorraine / LORIA
09:40 - 10:00    Exploration of How Hate Is Framed on Social Media
Rakshitha Rao Ailneni and Sanda Harabagiu
University of Texas at Dallas
10:00 - 10:20    Are Social Biases in LLMs Consistent across Generative Tasks? A Case Study for Basque
Muitze Zulaika1, Xabier Saralegi1, Julia Shershneva2, Lia Gonzalez2, Arkaitz Fullaondo2
1Orai NLP Technologies, 2University of the Basque Country (EHU)
10:20 - 10:40    Fine-grained Narrative Classification in Biased News Articles
Zeba Afroz1, Harsh Vardhan1, Pawan Bhakuni2, Aanchal Punia3, Rajdeep Kumar4, Md. Shad Akhtar1
1Indraprastha Institute of Information Technology, Delhi, 2Bharat Electronics Ghaziabad, 3Bharat Electronics, 4Bharat Electronics Limited
09:00 - 10:40    Session O32: Speech Resources, Processing, Applications - Eivissa 1
Chair: Nikola Ljubešić
09:00 - 09:20    A Shoal of Voices: Parallel Read Speech from Professional Swedish Narrators
Christina Tånnander1, Jim O'Regan2, Jens Edlund3
1KTH Speech, Music and Hearing, MTM, 2KTH Royal Institute of Technology, 3KTH Speech, Music and Hearing
09:20 - 09:40    Deep Learning-Based Multi-Aspect Pronunciation Assessment for Individuals with Down Syndrome
David Fernández-García, César González-Ferreras, Valentín Cardeñoso-Payo, Mario Corrales-Astorgano
Universidad de Valladolid
09:40 - 10:00    WikIPA: Integrating WikiPron and Lingua Libre for Multilingual IPA Transcription
Pierluigi Cassotti1, Jacob Lee Suchardt2, Domenico De Cristofaro3
1University of Gothenburg, 2Leipzig University, 3Free University of Bozen
10:00 - 10:20    How Pragmatics Shape Articulation: A Computational Case Study in STEM ASL Discourse
Saki Imai1, Lee Kezar2, Laurel Aichler3, Mert Inan1, Erin Walker4, Alicia Wooten3, Lorna Cobban Quandt3, Malihe Alikhani1
1Northeastern University, 2University of Southern California, 3Gallaudet University, 4University of Pittsburgh
10:20 - 10:40    Setting the Stage for Disfluency: Implications of Contextual Task Framing Effects for the Design of Listening Tasks
Ambika Kirkland1 and Jens Edlund2
1KTH Royal Institute of Technology, 2KTH Speech, Music and Hearing
09:00 - 10:40    Session P8.1.1: Machine Translation I - Poster Area 2
Chair: Oscar Araque
ACAData: Parallel Dataset of Academic Data for Machine Translation
Iñaki Lacunza1, Javier Garcia Gilabert2, Francesca De Luca Fornaciari3, Javier Aula-Blasco1, Aitor Gonzalez-Agirre4, Maite Melero5, Marta Villegas1
1Barcelona Supercomputing Center, 2Barcelona Super Computing Center, 3BSC Barcelona Supercomputing Center, 4Barcelona Supercomputing Center (BSC), 5AI Institute - Barcelona Supercomputing Center
A Single Model Ensemble Framework for Neural Machine Translation Using Pivot Translation
Seokjin Oh1, Keonwoong Noh2, Woohwan Jung2
1SK Siltron, 2Korea University
Gender Disambiguation in Machine Translation: Diagnostic Evaluation in Decoder-Only Architectures
Chiara Manna, Hosein Mohebbi, Afra Alishahi, Frederic Blain, Eva Vanmassenhove
Tilburg University
Building a One-Million-Pair Bokmål–Nynorsk Translation Corpus: A Quality-First Harvesting and Cleaning Pipeline
Per E Kummervold1, Thea Tollersrud2, Angelina Zanardi2
1The National Library of Norway, 2National Library of Norway
New Trends for Modern Machine Translation with Large Reasoning Models
Sinuo Liu1, Chenyang Lyu2, Minghao Wu3, Zifu Shang2, Longyue Wang4, Weihua Luo2, Kaifu Zhang2
1University of Edinburgh, 2Alibaba Group, 3Monash University, 4Tencent AI Lab
MaitH 1.0: A Parallel Corpus and Baseline for Low-Resource Maithili-Hindi Translation
Kamanksha Prasad Dubey1, Chandresh Maurya2, Kumar Padmanabh3
1Indian Institute of Technology, 2IIT Indore, 3EBTIC (Etisalat British Telecom Innovation Center, Khalifa University)
NRD: A Hybrid Disentanglement Framework for Mitigating Interference in Multilingual Machine Translation
Jiarui Zhang1 and Yifan Deng2
1Institute of Information Engineering, 2University of Chinese Academy of Sciences
Linguistic and Demographic Factors in an Online Free Translation Task
Tyler Lee, Irina Stenger, Tania Avgustinova
Saarland University
Biases in Translation: Assessing Opinion Distortion in Machine Translated Texts
Nazanin Shafiabadi1 and François Yvon2
1Sorbonne University and ISIR, 2ISIR CNRS & Sorbonne Université
When Translations Surprise: Human Awareness of Predictability in Translations
Cristian García-Romero1, Miquel Esplà-Gomis2, Felipe Sanchez-Martinez2
1University of Alicante, 2Universitat d'Alacant
Bidirectional Chinese and English Passive Sentences Dataset for Machine Translation
Xinyue Ma1, Pol Pastells2, Mireia Farrus1, Mariona Taule2
1Universitat de Barcelona, 2University of Barcelona
CoTERM: A Consistency-Oriented Term Metric for MT System Evaluation
Amir Hazem1 and Kyo Kageura2
1RCAST, The University of Tokyo, 2University of Tokyo
SiniticMTError: A Machine Translation Dataset with Error Annotations for Sinitic Languages
Hannah Liu1, Junghyun Min2, Annie En-Shiun Lee3, Ethan Yue Heng Cheung1, Shou-Yi Hung1, Elsie Chan1, Shiyao Qian1, Runtong Liang1, Kimlan Huynh1, Wing Yu Yip1, York Hay Ng1, Tsz Fung Yau4, Ka Ieng Charlotte Lo1, You-Wei Wu5, Richard Tzong-Han Tsai6
1University of Toronto, 2Georgetown University, 3Ontario Tech University, University of Toronto, 4Scotiabank, 5National Central University, 6Academia Sinica
Ancient Greek to Modern Greek Machine Translation: A Novel Benchmark and Fine-Tuning Experiments on LLMs and NMT Models
Spyridon Mavromatis1, Sokratis Sofianopoulos2, Prokopis Prokopidis3, Maria Giagkou3
1Institute for Speech and Language Processing, Athena Research Center & National and Kapodistrian University of Athens, 2Researcher, 3ILSP/Athena RC
CzechDocs: A Multiway Parallel Dataset of Formatted Documents for Minority Languages in Czechia
Josef Jon1 and Ondřej Bojar2
1Charles University, 2Charles University, MFF UFAL
09:00 - 10:40    Session P8.1.2: Machine Translation II - Poster Area 2
Chair: Sahar Ghannay
Linguistic Knowledge-Infused Fine-Tuning for Mitigating Gender Bias in Machine Translation
Luis Ernesto Garcia Estrada1, Audrey Mash2, Carlos Escolano3, Maite Melero2, Christine Basta4
1Universidad Politecnica de Catalunya, 2BSC, 3Universitat Politècnica de Catalunya, Barcelona Supercomputing Center, 4Alexandria University
What Triggers My Model? Contrastive Explanations Inform Gender Choices by Translation Models
Janiça Hackenbuchner
LT3, Ghent University
ViKhoMT: A Vietnamese–K'Ho Neural Machine Translation Dataset and Evaluation for Community Health Communication
Tram Truong1, Vinh Nguyen2, Dang Van Thin1, Ngan Nguyen3
1University of Information Technology, Vietnam National University Ho Chi Minh City, 2None, 3University of Information Technology, Vietnam National University Hochiminh City
Hindsight Quality Prediction Experiments in Multi-Candidate Human-Post-Edited Machine Translation
Malik Marmonier, Benoît Sagot, Rachel Bawden
Inria
PETra: A Multilingual Corpus of Pragmatic Explicitation in Translation
Doreen Osmelak1, Koel Dutta Chowdhury2, Uliana Sentsova1, Cristina España-Bonet3, Josef van Genabith4
1Saarland University, 2Saarland Informatics Campus, Saarland University, 3BSC/DFKI GmbH, 4DFKI
A Dataset for Probing Translationese Preferences in English-to-Swedish Translation
Jenny Kunz1, Anja Jarochenko2, Marcel Bollmann2
1Linköping University, 2Linköping University
STAR-IL: A Dataset for Style-Aware Machine Translation of Product Reviews in Indian Languages
Ketaki Shetye1, Dipti Misra Sharma2, Parameswari Krishnamurthy3
1International Institute of Information Technology, 2IIIT, Hyderabad, 3Assistant Professor, IIIT Hyderabad
Cultural and Knowledge Biases in LLMs through the Lens of Entity-Aware Machine Translation
Lu Xu, Luca Moroni, Roberto Navigli
Sapienza University of Rome
Referenceless Evaluation of Machine Translation Models by Ranking Performance in Romanian to English Translate-train Settings
Mihail Feraru, Alexandra Diaconu, Bogdan Dumitru Alexe
University of Bucharest
Every Word Presented in Context: Syntactic Coverage as Objective for Low-Resource Machine Translation with Large Language Models
Samuel Frontull and Thomas Ströhle
University of Innsbruck
Multilingual KokoroChat: A Multi-LLM Ensemble Translation Method for Creating a Multilingual Counseling Dialogue Dataset
Ryoma Suzuki1, Zhiyang Qi2, Michimasa Inaba1
1The University of Electro-Communications, 2The University of Tokyo
NepTam: A Nepali-Tamang Parallel Corpus and Baseline Machine Translation Experiments
Rupak Raj Ghimire1, Bipesh Subedi2, Balaram Prasain3, Prakash Poudyal1, Praveen Acharya4, Nischal Karki1, Rupak Tiwari1, Rishikesh Kumar Sharma1, Jenny Poudel1, Bal Krishna Bal5
1Kathmandu University, 2Department of Computer Science and Engineering, Kathmandu University, 3Central Department of Linguistics, Tribhuvan University, 4Dublin City University, 5Department of Computer Science and Engineering, Kathmandu University, Nepal
Scoring the Translation: On Target Automatic Keyword-Based Evaluation of Machine Translation in the Sports Domain
Steinþór Steingrímsson1 and Einar Sigurdsson2
1The Árni Magnússon Institute for Icelandic Studies, 2University of Pennsylvania
Towards Improving Multimodal Machine Translation with LLMs: A Focus on Indic Languages
Amulya Ratna Dash1, Chirag Wadhwa2, Yashvardhan Sharma3
1Birla Institute of Technology & Science, Pilani, 2Birla Institute of Technology and Science, Pilani, Pilani campus, 3Birla Institute of Technology and Science
09:00 - 10:40    Session P8.2: Multilinguality and Translation Aids - Poster Area 2
Chair: Jan Niehues
Parallel Sentence Filtering for Low-Resource Language Pairs: A Case Study for Upper Sorbian, German, and Czech
Ruiyang Jiang1, Shu Okabe2, Alexander Fraser3
1Technical University of Munich, 2TUM Heilbronn, 3Ludwig-Maximilians-Universität München
OpenSubtitles2024: A Massively Parallel Dataset of Movie Subtitles for MT Development and Evaluation
Joerg Tiedemann and Hengyu Luo
University of Helsinki
CREST: Universal Safety Guardrails through Cluster-Guided Cross-Lingual Transfer
Lavish Bansal and Naman Mishra
Repello AI
Semantic Alignment across Ancient Egyptian Language Stages via Normalization-Aware Multitask Learning
He Huang
Ludwig Maximilian University of Munich
Conditioning LLMs to Generate Code-Switched Text
Maite Heredia1, Gorka Labaka2, Jeremy Barnes3, Aitor Soroa4
1HiTZ Basque Center for Language Technology - Ixa NLP Group, University of the Basque Country UPV/EHU, 2HiTZ Center - Ixa, University of the Basque Country (UPV/EHU), 3University of the Basque Country EHU/UPV, 4HiTZ Center - Ixa, University of the Basque Country UPV/EHU
Are the LLMs Capable of Maintaining at Least the Language Genus?
Sandra Mitrović1, David Kletz2, Ljiljana Dolamic3, Fabio Rinaldi4
1SUPSI - IDSIA, 2Supsi, IDSIA, 3armasuisse S&T, 4IDSIA USI-SUPSI, Dalle Molle Institute for Artificial Intelligence
Gender Bias in MT for a Genderless Language: New Benchmarks for Basque
Amaia Murillo1, Olatz Perez-de-Viñaspre2, Naiara Perez3
1HiTZ Center, University of the Basque Country UPV/EHU, 2HiTZ Center - Ixa, University of the Basque Country UPV/EHU, 3University of the Basque Country
Optimizing Multilingual LLMs via Federated Learning: A Study of Client Language Composition
Aleix Sant1, Jordi Luque2, Carlos Escolano3
1Telefónica Innovación Digital, 2Telefonica Research, 3Universitat Politècnica de Catalunya, Barcelona Supercomputing Center
Multilingual Target-Stance Extraction
Ethan Leigh Mines1 and Bonnie J Dorr2
1The University of Florida, 2University of Florida
MUNIChus: MUltilingual News Image Captioning Benchmark
Yuji Chen1, Alistair Plum2, Hansi Hettiarachchi1, Diptesh Kanojia3, Saroj Basnet4, Marcos Zampieri4, Tharindu Ranasinghe1
1Lancaster University, 2University of Luxembourg, 3University of Surrey, 4George Mason University
GlossMATE: Multi-Agent Translator Explanations for Glosses
Changbing Yang1, Patrick Littell2, Gabriel Bernier-Colborne3, Yanfei Lu4, Mengzhe Geng3
1University of British Columbia, 2National Research Council of Canada, 3National Research Council Canada, 4University of Toronto
Diagnosing Translated Benchmarks: An Automated Quality Assurance Study of the EU20 Benchmark Suite
Klaudia Thellmann, Bernhard Stadler, Michael Färber
TU Dresden
Resource-Lean Lexicon Induction for German Dialects
Robert Litschko1, Barbara Plank1, Diego Frassinelli2
1LMU Munich, 2CIS, LMU Munich
09:00 - 10:40    Session P8.3: Multimodality - Poster Area 2
Chair: Tomek Strzalkowski
FENCE: A Financial and Multimodal Jailbreak Detection Dataset
Mirae Kim, Seonghun Jeong, Youngjun Kwak
Kakaobank
Evaluating Multimodal Large Language Models on Vertically Written Japanese Text
Keito Sasagawa1, Shuhei Kurita2, Daisuke Kawahara1
1Waseda University, 2National Institute of Informatics
ProMQA-Assembly: Multimodal Procedural QA Dataset on Assembly
Kimihiro Hasegawa1, Wiradee Imrattanatrai2, Masaki Asada2, Susan E Holm1, Yuran Wang1, Xuanang Zhou3, Ken Fukuda4, Teruko Mitamura1
1Carnegie Mellon University, 2National Institute of Advanced Industrial Science and Technology, 3CMU, 4AIRC/AIST
K-MIND: Korean Multimodal INteraction Data for Dyadic Conversation Analysis
Jae Hee Yang1, Yuha Shin2, Saim Shin1, Je Woo Kim1, Jin Yea Jang1
1Korea Electronics Technology Institute, 2MaumAI
Do Multimodal LLMs Understand Order? Measuring the Fragility of Multimodal Reasoning under Input Order Perturbations
Sheng-Lun Wei1, Yu-Ling Liao2, Hen-Hsen Huang3, Hsin-Hsi Chen1
1National Taiwan University, 2National Taiwan University, Taiwan, 3Institute of Information Science, Academia Sinica
Early Fusion with Contrastive Learning: A Lightweight Alternative for Multi-modal Classification
Felix Wernlein1, Abhik Jana2, Sandipan Sikdar1
1Leibniz University Hannover, 2IIT Bhubaneswar
Multimodal Entrainment and Feedback in Online Group Meetings
Patrizia Paggio1, Manex Agirrezabal1, Giulia Di Cristina2, Bart Jongejan1, Costanza Navarretta1
1University of Copenhagen, 2University of Turin
MMCIG: Multimodal Cover Image Generation for Text-only Documents and Its Dataset Construction via Pseudo-labeling
Hyeyeon Kim1, Sungwoo Han2, Jingun Kwon3, Hidetaka Kamigaito4, Manabu Okumura5
1Department of Artificial Intelligence, Chungnam National University, 2Chungnam National University, Department of Artificial Intelligence, GILAB, 3Chungnam National University, 4Nara Institute of Science and Technology, 5Tokyo Institute of Technology
Multimodal Reference by Means of the Pronoun We and Hand Gestures in a Novel Corpus of Parliamentary Opening Debates
Costanza Navarretta
University of Copenhagen
Multimodal Large Language Models for Low-Resource Languages: A Case Study for Basque
Lukas Arana1, Julen Etxaniz1, Ander Salaberria1, Gorka Azkune2
1HiTZ Center - Ixa, University of the Basque Country UPV/EHU, 2University of Basque Country
Real-Time Generation of Game Video Commentary with Multimodal LLMs: Pause-Aware Decoding Approaches
Anum Afzal1, Yuki Saito2, Hiroya Takamura3, Katsuhito Sudoh4, Shinnosuke Takamichi5, Graham Neubig6, Florian Matthes7, Tatsuya Ishigaki8
1Technical University of Munich, 2The University of Tokyo, 3The National Institute of Advanced Industrial Science and Technology (AIST), 4Nara Women's University, 5Keio University, 6Carnegie Mellon University, 7Technische Universität München, 8National Institute of Advanced Industrial Science and Technology (AIST)
ARB: A Comprehensive Arabic Multimodal Reasoning Benchmark
Sara Ghaboura1, Shubham Patle1, Ketan More1, Wafa Hamad Mohamed Alghallabi1, Omkar Thawakar1, Jorma Laaksonen2, Hisham Cholakkal1, Salman Khan1, Rao Anwer1
1Mohamed bin Zayed University of AI, 2Aalto University
Event Chronography in Multi-modal Data: The BME Method for Quantitative Analyses
Anaïs Claire Murat, Maria Koutsombogera, Carl Vogel
Trinity College Dublin
CANVAS: A Multimodal Dataset of Chinese Textbook Images for Bias and Representation Analysis
Haotian Zhu, Kefan Yu, Min Li
University of Washington
MM-Conv: A Multimodal Dataset and Benchmark for Context-Aware Grounding in 3D Dialogue
Anna Deichler1, Jim O'Regan1, Fethiye Irmak Dogan1, Anna Klezovich1, Lubos Marcinek1, Iolanda Leite1, Jonas Beskow2
1KTH Royal Institute of Technology, 2KTH Speech, Music and Hearing
Erase Persona, Forget Lore: Benchmarking Multimodal Copyright Unlearning in Large Vision Language Models
June Hyoung Kwon, Jungmin Yun, Youngbin Kim
Chung-Ang University
DREAM: A Multicultural Multimodal Dataset Linking Dialogues and Realistic Image Sequences
Juan Mallo1, Marcos Estecha-Garitagoitia1, Ricardo Cordoba2, Luis Fernando D'Haro3
1Universidad Politécnica de Madrid, 2Speech Technology Group. Information Processing and Telecommunications Center. Universidad Politécnica de Madrid, 3Speech Technology and Machine Learning Group, E.T.S.I. Telecomunicación, Universidad Politécnica de Madrid
Multimodal Task Interference: A Benchmark and Analysis of History-Target Mismatch in Multimodal LLMs
Masayuki Kawarada1, Tatsuya Ishigaki2, Hiroya Takamura3
1CyberAgent/National Institute of Advanced Industrial Science and Technology, 2National Institute of Advanced Industrial Science and Technology (AIST), 3The National Institute of Advanced Industrial Science and Technology (AIST)
09:00 - 10:40    Session P8.4: Cross-modality - Poster Area 2
Chair: Anton Ingason
Can Video LLMs See Through Illusions? Video-Illusion QA Benchmark Dataset
Souto Ohira1, Tosho Hirasawa2, Mamoru Komachi1
1Hitotsubashi University, 2OMRON SINIC X Corporation
To Skip, to Swap or to Not Swap? Identifying Step Transition Types in Instructional Manuals
Hsiu-Yu Yang1, Michael Roth2, Andreas Bulling3, Carina Silberer3
1Institute for Natural Language Processing, Stuttgart University, 2University of Technology Nuremberg, 3University of Stuttgart
Fruitcakes and Cupcakes Emerging from Noise: The ComposiGen Dataset of Compounds and Their Compositionality
Jule Godbersen1, Sinan Cem Kurtyigit2, Emma Raimundo Schulz3, Tonmoy Rakshit3, Diego Frassinelli4, Sabine Schulte im Walde3, Carina Silberer3
1Saarland University, 2Technical University of Munich, 3University of Stuttgart, 4CIS, LMU Munich
Large Language Models' Internal Perception of Symbolic Music
Andrew Shin and Kunitake Kaneko
Keio University
Entity Image and Mixed-Modal Image Retrieval Datasets
Cristian-Ioan Blaga1, Paul Suganthan G C1, Sahil Dua1, Krishna Srinivasan2, Enrique Alfonseca2, Peter Dornbach1, Tom Duerig1, Imed Zitouni2, Zhe Dong3
1Google, 2, 3Microsoft
Generating Sign Language Poses from HamNoSys and Natural Language Descriptions
Santiago Máximo1 and Luis Chiruzzo2
1Universidad de la República, 2Universidad de la Republica
Evaluating Discriminability of Vision-Language Models
Masayasu Muraoka1 and Naoaki Okazaki2
1IBM Research - Tokyo, 2Institute of Science Tokyo
Seeing the Other Side: Diagnostic Tasks for Viewpoint Reasoning in Vision–Language Models
Makoto Takenaka1 and Hitomi Yanaka2
1Mitsubishi Electric, 2The University of Tokyo
Multi-modal, Multi-task, Multi-criteria Automatic Evaluation with Vision Language Models
Masanari Oi1, Masahiro Kaneko2, Naoaki Okazaki1, Nakamasa Inoue1
1Institute of Science Tokyo, 2MBZUAI
Challenges in Image-Caption Association in Portuguese: Evaluating the CLIP Model on the FM30K Dataset
Vitória Colonetti Benedet, Gustavo Lopes Tamiosso, Rafael Oleques Nunes, Dennis Giovani Balreira
UFRGS
A Large-Scale Instruction-Tuning Dataset and Models for Slovenian Vision-Language Tasks
Matej Martinc1 and Domen Vreš2
1Jozef Stefan Institute, 2Univerza v Ljubljani
A Parallel Cross-Lingual Benchmark for Multimodal Idiomaticity Understanding
Dilara Torunoğlu-Selamet1, Doğukan Arslan1, Rodrigo Wilkens2, Wei He2, Doruk Eryiğit3, Thomas Pickard4, Adriana S Pagano5, Aline Villavicencio6, Gülşen Eryiğit1, Ágnes Abuczki7, Aida Cardoso8, Alesia Lazarenka9, Dina Almassova10, Amália Mendes11, Anna Kanellopoulou12, Antoni Brosa-Rodriguez13, Baiba Valkovska14, Beata Wojtowicz15, Bolette Pedersen16, Carlos Manuel Hidalgo-Ternero17, Chaya Liebeskind18, Danka Jokić19, Diego Alves20, Eleni Triantafyllidi12, Erik Velldal21, Fred Philippy22, Giedre Valunaite Oleskeviciene23, Ieva Rizgeliene24, Inguna Skadina25, Irina Lobzhanidze26, Isabell Stinessen Haugen27, Jauza Akbar Krito28, Jelena M. Marković29, Johanna Monti30, Josue Alejandro Sauca31, Kaja Dobrovoljc32, Kingsley O Ugwuanyi33, Laura Rituma34, Lilja Øvrelid35, Maha Tufail Agro36, Manzura Abjalova37, Maria Chatzigrigoriou38, María del Mar Sánchez Ramos39, Marija Pendevska40, Masoumeh Seyyedrezaei41, Mehrnoush Shamsfard42, Momina Ahsan43, Muhammad Ahsan Riaz Khan44, Nathalie Carmen Hau Norman16, Nilay Erdem Ayyıldız45, Nina Hosseini-Kivanani46, Noémi Ligeti-Nagy47, Numaan Naeem43, Olha Kanishcheva48, Olha Yatsyshyna49, Daniil Orel43, Petra Giommarelli50, Petya Osenova51, Radovan Garabik52, Regina E Semou53, Rozane Rebechi54, Salsabila Zahirah Pranida43, Samia Touileb27, Sanni Nimb55, Sarfraz Ahmad44, Sarvinoz Sharipova56, Shahar Golan57, Shaoxiong Ji58, Sopuruchi Christian Aboh59, Srdjan Sucur29, Stella Markantonatou60, Sussi Olsen61, Vahide Tajalli42, Veronika Lipp47, Voula Giouli62, Yelda Yeşildal Eraydın63, Zahra Saaberi64, Zhuohan Xie43
1Istanbul Technical University, 2University of Exeter, 3Istanbul Technical University NLP Group, 4University of Sheffield, 5Federal University of Minas Gerais, 6University of Exeter, UK, 7Károli Gáspár University of the Reformed Church in Hungary, 8Centro de Linguística da Universidade Nova de Lisboa, 9Tesi srl, 10Nazarbayev University, 11University of Lisbon - Centre of Linguistics, School of Arts and Humanities, 12Aristotle University of Thessaloniki, 13Universitat Rovira i Virgili, 14IMCS, University of Latvia, 15University of Warsaw, 16University of Copenhagen, 17Researcher, 18Jerusalem College of Technology, Lev Academic Center, 19University of Belgrade, 20Saarland University, 21University of Oslo, 22University of Luxembourg, 23Mykolas Romeris University, 24Vilnius University Institute of Data Science and Digital Technologies, 25Tilde/ Institute of Mathematics and Computer Science, University of Latvia, 26Ilia State University, 27University of Bergen, 28Universitas Gadjah Mada, 29University of East Sarajevo, 30"L'Orientale" University of Naples, 31Internacional University of Valencia, 32University of Ljubljana & Jozef Stefan Institute, 33SOAS University of London, 34Institute of Mathematics and Computer Science, University of Latvia, 35Dept of Informatics, University of Oslo, 36Mohamed bin Zayed University of Artificial Intelligence, 37Alisher Navo'i Tashkent State Uzbek Language and Literature, 38National and Kapodistrian University of Athens, 39University of Alcalá, 40St. Cyrillus and Methodius University, 41Istinye University, 42Faculty of Computer Science and Engineering, Shahid Beheshti University, 43MBZUAI, 44Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), 45Assoc. Prof., 46RTL & University of Luxembourg, 47ELTE Research Centre for Linguistics, 48Heidelberg University, 49Ternopil Volodymyr Hnatiuk National Pedagogical University, 50University of Pisa, 51Sofia University "St. Kl. Ohridski" and IICT-BAS, 52L. Stur Institute of Linguistics, Slovak Academy of Sciences, 53NKUA, 54Universidade Federal do Rio Grande do Sul, 55Society for Danish Language and Literature (DSL), 56Samarkand State Institute of Foreign Languages, 57Jerusalem College of Technology, 58University of Turku and ELLIS Institute Finland, 59English and Communication, The Hong Kong Polytechnic University, 60ILSP/ATHENA RESEARCH CENTER, 61UCPH, NorS, Centre for Language Technology, 62Aristotle University of Thessaloniki / ILSP, ATHENA RC, 63Dr., 64NLP Lab, Shahid Beheshti University, Tehran, Iran
Which Way Does Time Flow? A Psychophysics-Grounded Evaluation for Vision–Language Models
Shiho Matta1, Lis Kanashiro Pereira2, Peitao Han3, Shigeru Kitazawa3, Fei Cheng1
1Kyoto University, 2NICT, 3The University of Osaka
I Came, I Saw, I Explained: Benchmarking Multimodal LLMs on Figurative Meaning in Memes
Shijia Zhou1, Saif M. Mohammad2, Barbara Plank3, Diego Frassinelli4
1Ludwig Maximilian University of Munich, 2National Research Council Canada, 3LMU Munich, 4CIS, LMU Munich
DEJIMA: A Novel Large-scale Japanese Dataset for Image Captioning and Visual Question Answering
Toshiki Katsube1, Fukuhara Taiga1, Kenichiro Ando2, Yusuke Mukuta1, Kohei Uehara1, Tatsuya Harada1
1The University of Tokyo, 2RIKEN
CLEVR-3D-DeRef
Mary Lynn Martin, Martha Palmer, Maria Leonor Pacheco
University of Colorado Boulder
09:00 - 10:40    Session P8.5: Sign Languages - Poster Area 2
Chair: Dominik Schlechtweg
Bridging Text-to-Sign Translation via Codebook-Oriented Pretraining
Ninlawat Phuangchoke and Chantri Polprasert
Asian Institute of Technology (AIT)
A Resource and Evaluation Method for Phonological Continuity in Japanese Sign Language
Jundai Inoue1, Daisuke Hara2, Makoto Miwa2
1Knowledge and Data Engineering Lab, Toyota Technological Institute at Japan, 2Toyota Technological Institute
Sentiment Analysis of German Sign Language Fairy Tales
Fabrizio Nunnari1, Siddhant Jain1, Patrick Gebhard2
1German Research Center for Artificial Intelligence (DFKI), 2DFKI
A Critical Study of Automatic Evaluation in Sign Language Translation
Shakib Yazdani1, Yasser Hamidullah2, Cristina España-Bonet3, Eleftherios Avramidis4, Josef van Genabith2
1German Research Center for Artificial Intelligence (DFKI), 2DFKI, 3BSC/DFKI GmbH, 4Alangu; German Research Center for Artificial Intelligence (DFKI)
How Much Data Is Enough Data? A New Motion Capture Corpus for Probabilistic Sign Language Generation
Anna Klezovich1, Johanna Mesch2, Gustav Eje Henter3, Jonas Beskow4
1Division of Speech, Music and Hearing, KTH, 2Stockholm University, 3KTH Royal Institute of Technology, 4KTH Speech, Music and Hearing
Decomposing Sign Language Movements: A Multi-Band Visualization Method for Articulatory Analysis
Antonio F. G. Sevilla and José María Lahoz-Bengoechea
Universidad Complutense de Madrid
10:40 - 11:00    Coffee Break
11:00 - 12:40    Session O33: Psycholinguistics, Cognitive Linguistics and Linguistic Theories - Auditorium Illes Balears
Chair: Jelke Bloem
11:00 - 11:20    Implicit Bias in Peer Review: Through the Lens of Language Abstraction
Xulang Zhang, Rui Mao, Erik Cambria
Nanyang Technological University
11:20 - 11:40    The PARLO Dementia Corpus: A German Multi-Center Resource for Alzheimer's Disease
Franziska Braun1, Christopher Witzl2, Florian Hönig3, Elmar Nöth4, Tobias Bocklet2, Korbinian Riedhammer5
1Technische Hochschule Nürnberg Georg Simon Ohm, 2Technische Hochschule Nürnberg, 3KST Institut GmbH, Bad Emstal, 4Friedrich-Alexander-University Erlangen-Nuremberg, 5Technische Hochschule Nuernberg Georg Simon Ohm
11:40 - 12:00    Lexical and Discourse Semantics in a Reading-time Corpus of English
Jakub Dotlacil1, Laia Colina Fortuny1, Li Kloostra1, Johan Bos2
1Utrecht University, 2University of Groningen
12:00 - 12:20    Semantic Capacity in Language Learners and LLMs: A Case Study of Quantifier Scope
Shaohua Fang, Yue Li, Yan Cong
Purdue University
12:20 - 12:40    How Long Does a Quick Kiss Take? Studying Event Duration of Light Verb Constructions Using Explicit Word Embeddings
Lin de Huybrecht and Geraint A. Wiggins
Vrije Universiteit Brussel
11:00 - 12:40    Session O34: Opinion and Argument Mining - Auditorium Mallorca
Chair: Valerio Basile
11:00 - 11:20    Disambiguation of Emotion Annotations by Contextualizing Events in Plausible Narratives
Johannes Schaefer1 and Roman Klinger2
1Fundamentals of Natural Language Processing, 2University of Bamberg
11:20 - 11:40    Identifying Contexts of Distress in College Students' Reddit Posts: A Comparative Study of Classical NLP and Large Language Models
Carine Graff and Nikhil Krishnaswamy
Colorado State University
11:40 - 12:00    TiC-MuFormer: Time-Aware Caption-Integrated Multimodal Transformers for User-Level Mental Health Modeling
Georgios Tsoumplekas, Yannis Spyridis, Vasileios Argyriou
Kingston University
12:00 - 12:20    Improving Neural Argumentative Stance Classification in Controversial Topics with Emotion-Lexicon Features
Mohammad Yeghaneh Abkenar1, Weixing Wang2, Manfred Stede1, Mark A Finlayson3, Davide Picca4, Panagiotis Ioannidis5
1University of Potsdam, 2Hasso Plattner Institute, 3Florida International University, 4University of Lausanne, 5PI Squared Insights
12:20 - 12:40    Emotion Transcription in Conversation: A Benchmark for Capturing Subtle and Complex Emotional States through Natural Language
Yoshiki Tanaka1, Ryuichi Uehara1, Koji Inoue2, Michimasa Inaba1
1The University of Electro-Communications, 2Kyoto University
11:00 - 12:40    Session O35: Parsing - Menorca 1
Chair: Menno van Zaanen
11:00 - 11:20    SETUP: Sentence-level English-To-Uniform Meaning Representation Parser
Emma Markle, Javier Gutierrez Bach, Shira Wein
Amherst College
11:20 - 11:40    This One or That One? A Study on Accessibility via Demonstratives with Multimodal Large Language Models
Yu Wang1, Emmanuele Chersoni2, Chu-Ren Huang3
1The Hong Kong Polytechnic University, 2Hong Kong Polytechnic University, 3The Hong Kong Polytechnic University
11:40 - 12:00    AMR Parsing beyond English: An Experiment on Bulgarian, French, Hungarian and Ukrainian
Ivaylo Mitov1, Tadzhat Marharian1, Zsofia F Hauk1, Samba FALL1, Maxime Amblard2, Bruno Guillaume3
1Institut des sciences du Digital, Management & Cognition, 2Université de Lorraine, 3LORIA / Inria Nancy Grand-Est
12:00 - 12:20    Semantic Parsing for Evaluating Large Language Models: Separating Linguistic Abilities with YARN
Rémi DE VERGNETTE1 and Maxime Amblard2
1Université de Lorraine, CNRS, Inria, LORIA, F-53999 Nancy, France, 2Université de Lorraine
12:20 - 12:40    Two Ojibwe Constraint Grammars: Morphological Disambiguation and Dependency Parsing
Matthias Diederichsen and Christopher Hammerly
University of British Columbia
11:00 - 12:40    Session O36: Multimodality and Speech - Eivissa 1
Chair: Jonas Beskow
11:00 - 11:20    Multimodal LLMs Do Not Compose Skills Optimally across Modalities
Paula Ontalvilla1, Aitor Ormazabal2, Gorka Azkune3
1HiTZ Center - Ixa, University of the Basque Country (UPV/EHU), 2University of the Basque Country, 3University of Basque Country
11:20 - 11:40    Code-Switching in End-to-End Automatic Speech Recognition: A Systematic Literature Review
Maha Tufail Agro1, Atharva A Kulkarni2, Karima Kadaoui1, Zeerak Talat3, Hanan Aldarmaki2
1Mohamed bin Zayed University of Artificial Intelligence, 2MBZUAI, 3University of Edinburgh
11:40 - 12:00    MUStReason: A Benchmark for Diagnosing Pragmatic Reasoning in VideoLMs for Multimodal Sarcasm Detection
Anisha Saha1, Varsha Suresh2, Timothy Hospedales3, Vera Demberg2
1Max Planck Institute for Informatics, Saarland Informatics Campus, 2Saarland University, 3University of Edinburgh
12:00 - 12:20    Human-Centered Multimodal Fusion for Sexism Detection in Memes with Eye-Tracking, Heart Rate, and EEG Signals
Iván Arcos Gabaldón, Paolo Rosso, Elena Gomis Vicent
Universitat Politècnica de València, UPV
12:20 - 12:40    Nos_Brais-GL: A FAIR Galician TTS Corpus for Neural Speech Synthesis
Adina Ioana Vladu1, Antonio Moscoso Sánchez2, Carmen Magariños3, María Perez Lago1, Elisa Fernández Rei1
1Instituto da Lingua Galega, Universidade de Santiago de Compostela, 2Instituto da Lingua Galega, Centro Singular en Tecnoloxías Intelixentes, Universidade de Santiago de Compostela, 3Instituto da Lingua Galega, Departamento de Electrónica e Computación, Universidade de Santiago de Compostela
11:00 - 12:40    Session P9.1: Natural Language Generation - Poster Area 1
Chair: Victoria Arranz
DR-CUP: A Dataset on Real-time Commentary in U.S. Presidential Debates
Yu-Yu Chang1, Huan-Wen Ho1, Chung-Chi Chen2, Ming-Hung Wang3
1National Chung Cheng University, 2National Institute of Advanced Industrial Science and Technology, 3National Chung Cheng University
Russian Generative Spelling, Punctuation and Capitalization Correction
Nikita Martynov1, Danil Astafurov2, Ulyana Isaeva1, Ivan Vasil'yevich Maksimov3, Joqsan Azocar4, Dmitrii Kosenko4, Alena Fenogenova5
1SaluteDevices, 2ITMO University, 3Moscow Institute of Physics and Technology, 4MIPT, 5SberAI
Using Multimodal and Language-Agnostic Sentence Embeddings for Abstractive Summarization
Chaimae Chellaf El Hammoud1, Salima Mdhaffar2, Yannick Estève3, Stéphane Huet4
1Avignon, 2Avignon university, 3LIA - Avignon Université, 4Université d'Avignon
Gradient-Controlled Decoding: A Safety Guardrail for LLMs with Dual-Anchor Steering
Purva Chiniya1, Kevin Joseph Scaria2, Sagar Chaturvedi1
1Amazon, 2Amazon.com
The Chronicles of RiDiC: Generating Datasets with Controlled Popularity Distribution for Long-form Factuality Evaluation
Pavel Braslavski1, Dmitrii Iarosh2, Nikita Sergeevich Sushko3, Andrey Sakhovskiy4, Vasily Konovalov5, Elena Tutubalina6, Alexander Panchenko7
1HSE University, 2Skolkovo Institute of Science and Technology, Russia, 3Independent Researcher, 4Sber AI, Russia; Skoltech, Russia, 5Affiliation, 6HSE University, Russia and Kazan Federal University, Russia and AIRI, Russia and Insilico Medicine Hong Kong, Hong Kong, 7S-NLP
MeteoGalEus: An Iberian Multilingual Weather Dataset in Galician, Euskera, and Spanish
Ainhoa Vivel-Couso1, Nella Zabrina Pramata2, David Robredo3, Aitor Soroa4, Jose Maria Alonso-Moral1
1University of Santiago de Compostela, 2University of Basque Country, 3Universidade de Santiago de Compostela, 4HiTZ Center - Ixa, University of the Basque Country UPV/EHU
RadTimeline: Timeline Summarization for Longitudinal Radiological Lung Findings
Sitong Zhou, Meliha Yetisgen, Mari Ostendorf
University of Washington
InstructSum: A Benchmark to Evaluate Instruction-Following Capability of Large Language Models in Summarization
Kosuke Nishida1, Kyosuke Nishida2, Itsumi Saito3
1NTT, Inc., 2NTT Human Informatics Laboratories, 3Tohoku University
NOVELSUM: Evaluating Long-Form Summary Generation for Historical Scandinavian Novels
Ali Al-Laith, Alexander Conroy, Kirstine Nielsen Degn, Jens Bjerring-Hansen, Daniel Hershcovich
University of Copenhagen
Evaluating Large Language Models for Text-to-Gloss Translation in Kazakh-Russian Sign Language: A Pilot Study
Zhanibek Kozhirbayev and Alfarabi Imashev
National Laboratory Astana, Nazarbayev University
HotelCheckSpan: A Benchmark Dataset for LLM Faithfulness
Patricia Schmidtova1, Ondrej Dusek1, Saad Mahamood2
1Charles University, 2Shopware
11:00 - 12:40    Session P9.2.1: Machine Learning II - Poster Area 1
Chair: Udo Kruschwitz
Procrustes Analysis for Improving Language Model Merging
Olivier Ferret
CEA-List
MetaCORA: A Meta-Learned Curriculum for Adversarial and Contrastive Robustness in Speech Recognition
Yuqian Dai, Chun Fai Chan, Ying Ki Wong, Tsz Ho Pun
Logistics and Supply Chain MultiTech R&D Centre Limited
Insights from Transfer Learning Experiments with Word-in-Context and Word Sense Disambiguation Models
Alp Mujko and Dominik Schlechtweg
University of Stuttgart
Joint Identification and Induction of Semantic Frames with Scalable Semi-Supervised Graph Clustering
Fabian Barteld1, Steffen Remus2, Saba Anwar2, Julian Stawecki1, Alexander Ziem1, Chris Biemann2
1Heinrich Heine University Düsseldorf, 2Universität Hamburg
Low-Rank Compression of Language Models via Differentiable Rank Selection
Sidhant Sundrani, Francesco Tudisco, Pasquale Minervini
University of Edinburgh
Self-supervised Data Augmentation for Text Classification in Low-Data Settings
Deyu Ding1, Mengying Wang2, Andreas Spitz2
1Southern University of Science and Technology, 2University of Konstanz
Distribution-aware Low-bitwidth Quantization for Large Language Models
Bao Tan Duy Huynh, Takashi Tsunakawa, Masafumi Nishida
Shizuoka University
TG-ASR: Translation-Guided Learning with Parallel Gated Cross Attention for Low-Resource Automatic Speech Recognition
ChengYeh Yang1, Chien-Chun Wang1, Li-Wei Chen2, Hung-Shin Lee2, Hsin-Min Wang3, Berlin Chen1
1National Taiwan Normal University, 2United Link Co., Ltd., 3Institute of Information Science, Academia Sinica
Harnessing Synergy in Context and Emoji for Joint Detection of Harmful Online Content in Multi-turn Conversations
Feiyan Hu, Ciara Anne Byrne, Jiang Zhou, Rena Maycock, Mark Langan
Chirp
Dynamic Layer Selection for Efficient Tone Recognition in Self-Supervised Speech Models
Saint Germes B. BENGONO OBIANG, Norbert TSOPZE, Paulin MELATAGIA YONTA
University of Yaounde 1
Intent Recognition in Speech-to-Text Processing in the Context of Natural Interaction with Cognitive Assistive Systems
Behnam Ensan1, Magnus Jung2, Matthias Busch2, Andreas Wendemuth3
1Chair of Cognitive Systems, Otto-von-Guericke-University Magdeburg, 2doctoral candidate, 3Professor for Cognitive Systems, University Magdeburg
Merging Continual Pretraining Models for Domain-Specialized LLMs: A Case Study in Finance
Kentaro Ueda1, François Portet2, Hirohiko Suwa1, Keiichi Yasumoto1
1Nara Institute of Science and Technology, 2Université Grenoble Alpes
Phonetic-based Ranking for Improved Pseudo-Labeling in Low-Resource ASR
Marco Matassoni1, Roberto Gretter1, Falavigna Daniele1, Mohamed Nabih Ali Mohamed Nawar1, Alessio Brutti1, Matteo Negri1, Mauro Cettolo1, Marco Gaido2, Sara Papi1, Luisa Bentivogli1
1Fondazione Bruno Kessler, 2Fondazione Bruno Kessler, University of Trento
Privacy-Preserving Information Extraction with Local LLMs: A Comparative Study on Dutch Debt Collection Letters
Beyza Celep, Natalia Amat-Lefort, Joost Visser
Leiden University
11:00 - 12:40    Session P9.2.2: Machine Learning III - Poster Area 1
Chair: Marko Tadić
Forewarned Is Forearmed: When Non-Sequential Embedding Turns into an Anomaly Detector
Elys Allesiardo, Antoine Caubrière, Valentin Vielzeuf
Orange Research
A Joint Detection Framework for Latvian Loanwords and Calques Using Monolingual Data
Yelingyun Zhang, Atis Kapenieks, Marina Platonova
Riga Technical University
Pantagruel: Unified Self-Supervised Encoders for French Text and Speech
Phuong-Hang Le1, Valentin Pelloin2, Arnault Chatelain3, Maryem Bouziane4, Mohammed Ghennai5, Qianwen Guan6, Kirill Milintsevich7, Salima Mdhaffar8, Aidan Mannion9, Nils Defauw10, Shuyue Gu6, Alexandre Daniel Audibert11, Marco Dinarelli12, Yannick Estève13, Lorraine Goeuriot9, Steffen Lalande7, Nicolas Hervé2, Maximin Coavoux14, François Portet15, Étienne Ollion16, Marie Candito17, Maxime Peyrard5, Solange Rossato12, Benjamin Lecouteux18, Aurélie Nardy19, Gilles Sérasset11, Vincent Segonne20, Solène Evain5, Diandra Fabre5, Didier Schwab21
1Saclay AI, 2INA, 3CREST (Ecole Polytechnique, ENSAE, CNRS), 4Avignon Université, LIA, 5Univ. Grenoble Alpes, CNRS, Grenoble INP, LIG, 6Université Paris Cité, 7Institut national de l'audiovisuel, 8Avignon university, 9LIG, Université Grenoble Alpes, 10Univ. Grenoble Alpes, CNRS, Grenoble INP, 11Université Grenoble Alpes, 12LIG, 13LIA - Avignon Université, 14CNRS, Univ Grenoble Alpes, 15Univ Grenoble Alpes, Laboratoire d'Informatique de Grenoble, 16CNRS-CREST, 17LLF, Université Paris Cité, 18LIG/GETALP, 19Lidilem, 20IRISA - Université Bretagne Sud, 21Univ. Grenoble Alpes
Merge and Conquer: Instructing Multilingual Models by Adding Target Language Weights
Eneko Valero1, Maria Ribalta i Albado1, Oscar Sainz1, Naiara Perez2, German Rigau3
1University of the Basque Country (UPV/EHU), 2University of the Basque Country, 3UPV/EHU
SemiAdapt: Semi-Supervised and Efficient LoRA-Based Domain Adaptation for Low-Resource Irish Machine Translation with Transformers
Josh Mcgiff and Nikola S. Nikolov
University of Limerick
Data Selection Effects on Self-Supervised Learning of Audio Representations for French Audiovisual Broadcasts
Valentin Pelloin1, Lina Bekkali2, Reda Dehak3, David Doukhan4
1INA, 2École nationale des ponts et chaussées (ENPC), 3EPITA, 4Institut national de l'audiovisuel (Ina)
SENS-ASR: Semantic Embedding Injection in Neural-transducer for Streaming Automatic Speech Recognition
Youness Dkhissi1, Valentin Vielzeuf2, Elys Allesiardo1, Anthony Larcher3
1Orange Innovation, 2Orange Research, 3Université du Mans - LIUM
Efficient Financial Language Understanding via Distillation with Synthetic Data
Wen-Fong (Xavier) Huang and Edwin Simpson
University of Bristol
Rubric-Guided Fine-tuning of SpeechLLMs for Multi-Aspect, Multi-Rater L2 Reading-Speech Assessment
Aditya Kamlesh Parikh1, Cristian Tejedor-García2, Catia Cucchiarini3, Helmer Strik4
1Radboud University, 2CLST, Radboud University, 3Radboud University Nijmegen/Nederlandse Taalunie, 4Centre for Language and Speech Technology (CLST), Centre for Language Studies (CLS), Radboud University Nijmegen
Leveraging Semi-Supervised Learning for Multimodal Hate Speech Data Annotation and Detection
Rathi Adarshi Rammohan1, Zhao Ren1, Dominik Puchała2, Aleksandra Świderska2, Dennis Küster1, Tanja Schultz1
1University of Bremen, 2University of Warsaw
Lexicalized Constituency Parsing for Middle Dutch: Low-resource Training and Cross-Domain Generalization
Yiming Liang1 and Fang Zhao2
1Universiteit Gent, 2Université Paris Cité & Laboratoire de Linguistique Formelle
Quantifying the Accuracy and Cost Impact of Design Decisions in Budget-Constrained Agentic LLM Search
Kyle A McCleary and James M Ghawaly
Louisiana State University
Reason-to-Learn (R2L): Multi-Agent Knowledge Distillation for Lightweight LLMs in Sentiment Analysis
Le-Huy Tu1, Quan Nguyen2, Vincent NGUYEN3, Johanna Bjorklund4, Xuan-Son Vu5
1DopikAI JSC., 2Umeå University, 3University of Orleans, INSA CVL, LIFO EA, France, 4Umea University, 5Lund University and DeepTensor AB
PRiSM: Partial Ranking via Inter-layer Semantic Measurement for Efficient Fine-tuning of Language Models
Aldrin Kabya Biswas1, Md Fahim2, Md. Ashraful Amin1, Amin Ahsan Ali1, AKM Mahbubur Rahman1
1Center for Computational & Data Sciences, Independent University, Bangladesh, 2Center for Computational & Data Sciences at Independent University, Bangladesh (IUB)
11:00 - 12:40    Session P9.3.1: Language Modeling and LRs III - Poster Area 1
Chair: Monica Monachini
SEFL: A Framework for Generating Synthetic Educational Assignment Feedback with LLM Agents
Mike Zhang1, Amalie Pernille Dilling2, Léon Gondelman2, Niels Erik Ruan Lyngdorf2, Euan D Lindsay2, Johannes Bjerva3
1University of Copenhagen, 2Aalborg University, 3Department of Computer Science, Aalborg University
LGSE: Lexically Grounded Subword Embedding Initialization for Low-Resource Language Adaptation
Hailay Kidu Teklehaymanot1, Dren Fazlija2, Wolfgang Nejdl1
1L3S Research Center, 2L3S Research Center, Leibniz University Hannover
A Cheap Lunch: Synthetic Annotation With Reduced Human Effort for Medical Text Mining
Shutao Chen and Piek T.J.M. Vossen
Vrije Universiteit Amsterdam
Supervised Contrastive Fine-Tuning for Active Few-Shot Learning
Zirui Zhang, Lei Ge, Shengyu Qiao
Information Engineering University
Simulating Student Interactions for Virtual Pretesting with In-Context Learning
Arthur Thuy1, Luca Benedetto2, Ekaterina Loginova3, Dries F. Benoit1
1Ghent University, 2University of Cambridge, Institut Polytechnique de Paris, 3Dedalus Healthcare
An Exploration-Analysis-Disambiguation Reasoning Framework for Word Sense Disambiguation with Low-Parameter LLMs
Deshan Koshala Sumanathilaka, Nicholas Micallef, Julian Hough
Swansea University
Building Effective Japanese Medical LLMs with an Open Recipe for Domain Adaptation through Continued Pre-training
Akiko Aizawa1, Yuki Arase2, Fei Cheng3, Jiahao Huang4, Zhiyi Huang2, Junfeng Jiang4, Teruhito Kanazawa1, Daisuke Kawahara5, Kazuma Kobayashi1, Takashi Kodama3, Sadao Kurohashi3, Yusuke Oda1, Yuma Tsuta1, Zhen Wan3, Zhishen Yang1, Rio Yokota2
1National Institute of Informatics, 2Institute of Science Tokyo, 3Kyoto University, 4University of Tokyo, 5Waseda University
New Encoders for German Trained from Scratch: Comparing ModernGBERT with Converted LLM2Vec Models
Julia Wunderle1, Anton Ehrmanntraut2, Jan Pfister3, Fotis Jannidis2, Andreas Hotho4
1University of Wuerzburg, 2Universität Würzburg, 3Julius-Maximilians-Universität Würzburg (JMU), 4University of Würzburg
Arabic ChartSumm: An English-to-Arabic Benchmark for Metadata-to-Text Summarization
Passant Elchafei1 and Amany Fashwan2
1Ulm University, Germany, 2Phonetics and Linguistics Department, Faculty of Arts, Alexandria University, Alexandria
Introducing a Bangla Sentence – Gloss Pair Dataset for Bangla Sign Language Translation and Research
Neelavro Saha, Rafi Shahriyar, Nafis Ashraf Roudra, Saadman Sakib, Annajiat Alim Rasel
BRAC University
Language Models as Semantic Augmenters for Sequential Recommenders
Mahsa Valizadeh, Xiangjue Dong, Rui Tuo, James Caverlee
Texas A&M University
Efficient Adaptation of English Language Models for Morphologically Rich and Underrepresented Languages: The Case of Arabic
Ahmed Samy Eldamaty1, Mohamed Maher Zenhom Abdelrahman2, Mohamed Mostafa Ibrahim Elbehery1, Mariam Ashraf1, Radwa Elshawi2
1Giza Systems, 2University of Tartu
11:00 - 12:40    Session P9.3.2: Language Modeling and LRs IV - Poster Area 1
Chair: Simonetta Montemagni
GhostWriter: Hidden AI-Generated Texts over Multiple Languages, Domains and Generators
Manuel Schaaf1, Kevin Bönisch2, Alexander Mehler1
1Goethe-University Frankfurt am Main, 2Text Technology Lab, Goethe-University
Using LLMs to Extract Instances of Schematic Constructions from Unannotated L2 Learner Corpora
Jelena Kallas1, Ahto Kiil2, Heete Sahkai1, Geda Paulsen3, Kertu Saul4
1Institute of the Estonian Language, 2University of Tartu, 3Institute of the Estonian Language, Uppsala University, 4Institute of the Estonian Language, University of Tartu
Corruption-Based Data Augmentation for Arabic Essay Scoring: A Preliminary Study on the Organization Trait
May Saed Bashendy and Tamer Elsayed
Qatar University
Structured Prompting for Arabic Essay Proficiency: A Trait-Centric Evaluation Approach
Salim Al Mandhari1, Hieu Pham Dinh2, Mo El-Haj2, Paul Rayson1
1Lancaster University, 2VinUniversity
ManufactuBERT: Efficient Continual Pretraining for Manufacturing
Robin Armingaud and Romaric Besancon
CEA LIST
Śmigiel Dataset: Laying Foundations for Investigating Machine-Generated Text Detection in Polish
Jakub Strebeyko1, Alina Wróblewska2, Piotr Przybyła3
1University of Warsaw, Warsaw, Poland, 2Institute of Computer Science, Polish Academy of Sciences, 3Universitat Pompeu Fabra
Extracting Medical Image-Related Entities from Spanish Electronic Health Records Using NER Methods
Alexander Platas1, Marcos Merino1, Elena Zotova1, Montse Cuadros1, Karen López-Linares1, Mikel Pérez de Mendiola2, María Gálvez2, Cristina Barba2, Antón Asla2
1Vicomtech, 2Serikat
A Novel Synthetic Dataset for Few-Shot Legal Relation Extraction in German
Shiva Banasaz Nouri1, Elena Leitner2, Julian Moreno-Schneider2, Georg Rehm2
1TU Berlin, 2DFKI
LLM-Based Data Generation and Clinical Skills Evaluation for Low-Resource French OSCEs
Tian Huang1, Tom Bourgeade2, Irina Illina3
1LORIA, University of Lorraine, 2LORIA - INRIA, University of Lorraine, 3LORIA/INRIA
Instruction-Tuned Urdu LLMs: Efficient Adaptation of Llama Models and Evaluation Resources for Urdu
Munief Hassan Tahir1, Sana Shams2, Sarmad Hussain3, Miriam Butt4
1Al Khawarizmi Institute of Computer Science, 2Al-Khawarizmi Institute of Computer Science, University of Engineering and Technology, 3Center for Language Engineering, KICS, UET, 4University of Konstanz
Is Biomedical Specialization Still Worth It? Insights from Domain-Adaptive Language Modelling with a New French Health Corpus
Aidan Mannion1, Cécile Macaire1, Armand Violle2, Stéphane Ohayon2, Xavier Tannier3, Didier Schwab4, Lorraine Goeuriot1, François Portet5
1LIG, Université Grenoble Alpes, 2LIMICS, Sorbonne Université, INSERM, 3Limics, Sorbonne Université, 4Univ. Grenoble Alpes, 5Univ Grenoble Alpes, Laboratoire d'Informatique de Grenoble
TildeOpen LLM: Leveraging Curriculum Learning to Achieve Equitable Language Representation
Toms Bergmanis, Ingus Jānis Pretkalniņš, Martins Kronis, Davis Nicmanis, Jeļizaveta Jelinska, Roberts Rozis, Rinalds Vīksna, Marcis Pinnis
Tilde
Common Sense vs. Morality: The Curious Case of Narrative Focus Bias in LLMs
Saugata Purkayastha1, Pranav Kushare1, Pragya Paramita Pal1, Sukannya Purkayastha2
1Saarland University, 2TU Darmstadt
11:00 - 12:40    Session P9.3.3: Language Modeling and LRs V - Poster Area 1
Chair: Lluis Padro
"Emphasizing the Commendable": A Study of Homogenized Transitive Verb Constructions in Machine Generated Peer Reviews
Hing-Yuet Fung1, Chi-kiu Lo2, Samuel Larkin3
1Independent Researcher, 2National Research Council of Canada, 3National Research Council Canada
CoDAE: Adapting Large Language Models for Education via Chain-of-Thought Data Augmentation
Shuzhou Yuan1, William LaCroix2, Hardik Ghoshal3, Ercong Nie4, Michael Färber3
1Dresden University of Technology, 2Saarland University, 3TU Dresden, 4Centre for Information and Language Processing, LMU Munich
Synthetic Instruction Generation for Low-Resource Nordic Languages: Viability and Limitations in LLM Instruction-Tuning
Mathias Stenlund1, Annika Simonsen1, Lars Bungum2, Jan Ebert3, Jiangtao Wang3, Oleg Filatov3, Hemanadhan Myneni1, Morris Riedel1, Hafsteinn Einarsson1
1University of Iceland, 2NTNU, 3Jülich Supercomputing Centre
AYN: A Tiny Yet Competitive Indian Legal Language Model Pretrained from Scratch
Mitodru Niyogi1, Eric Gaussier2, Arnab Bhattacharya3
1CNRS, 2Univ. Grenoble Alpes, 3Dept. of Computer Science and Engineering, IIT Kanpur
Low-Resource Dialect Adaptation of Large Language Models: A French Dialect Case-Study
Eeham Khan1, Firas Saidani2, Owen Van Esbroeck1, Richard Khoury2, Leila Kosseim1
1Concordia University, 2Université de Laval
Reformulate and Create, Don't Translate: Creating Natural Prompts for Underserved Languages
Annika Simonsen1, Mathias Stenlund2, Lars Bungum3, Marc Daníel Skipstað Volhardt2, Hafsteinn Einarsson2
1The University of Iceland, 2University of Iceland, 3Norwegian University of Science and Technology
Generating High Quality Synthetic Data for Dutch Medical Conversations
Cecilia Kuan1, Aditya Kamlesh Parikh1, Henk van den Heuvel2
1Radboud University, 2CLS/CLST, Radboud University Nijmegen
DeepICD-R1: Medical Reasoning through Hierarchical Rewards and Unsupervised Distillation
Tom Röhr1, Thomas Maximilian Josef Steffek1, Roman Teucher2, Keno Bressem3, Alexei Figueroa1, Paul Grundmann1, Peter Troeger1, Felix Alexander Gers1, Alexander Löser1
1Berliner Hochschule für Technik (BHT), 2Fraunhofer Research Engineer, 3Department of Diagnostic and Interventional Radiology, School of Medicine, University Hospital Rechts der Isar, Technical University of Munich
SynthLLM: An LLM-based Scalable Synthetic Data Generation Pipeline for Low-Resource Languages
Solmaz Panahi1, Vasudevan Nedumpozhimana2, John Kelleher3
1Maynooth University, 2TU Dublin, 3Trinity College Dublin
Persona-Conditioned Generation of Patient Self-Reports from EHRs
Yuexin Wu1, Jianming Wei2, Vasile Rus1
1University of Memphis, 2University Medical Center Utrecht
SocialStep: Fast Prediction of Social Determinants of Health
Paul Landes1, Adam Richard Cross2, Jimeng Sun3
1University of Illinois at Chicago, 2University of Illinois College of Medicine Peoria, 3University of Illinois Urbana-Champaign
Dynamically Acquiring Text Content to Enable the Classification of Lesser-known Entities for Real-world Tasks
Fahmida Alam and Ellen Riloff
University of Arizona
RILEC: Detection and Generation of L1 Russian Interference Errors in English Learner Texts
Darya Kharlamova1 and Irina Proskurina2
1National Research University Higher School of Economics, 2Laboratoire Hubert Curien, UMR CNRS 5516, Saint-Etienne, France, Université Claude Bernard Lyon 1, Université Lumière Lyon 2, ERIC, 69100, Villeurbanne, France
12:40 - 14:10    Lunch Break
14:10 - 14:55    Keynote 2 - Dan Jurafsky: The Social Failures of Language Models as Conversational Partners - Auditorium Illes Balears
Chair: Nancy Ide
14:55 - 15:00    Short Break (5 min)
15:00 - 16:40    Session O37: Evaluation, Validation, Quality Assurance - Auditorium Illes Balears
Chair: Joakim Nivre
15:00 - 15:20    Critical Foreign Policy Decision (CFPD) Benchmark: Measuring Diplomatic Preferences of Large Language Models
Benjamin Jensen1, Ian J Reynolds1, Yasir Atalan1, Michael Garcia2, Austin Woo2, Anthony Chen2, Trevor Howarth2
1Center for Strategic and International Studies, 2Scale AI
15:20 - 15:40    CrisisCL: A Domain Incremental Learning Benchmark for Crisis Management
Paul Le Van Kiem1, Romain Meunier1, Farah Benamara2, Véronique MORICEAU3
1IRIT, 2University of Toulouse, 3IRIT, Université de Toulouse
15:40 - 16:00    Towards Consistent Detection of Cognitive Distortions: LLM-Based Annotation and Dataset-Agnostic Evaluation
Neha Sharma1, Navneet Agarwal2, Kairit Sirts1
1University of Tartu, 2EXAI, University of Tartu
16:00 - 16:20    LLMs as Annotators: Evaluating Model–Human Alignment in Detecting Contentious Language in Historical Corpora
Yahui Zhao1, Clemencia Siro2, Laura Hollink1
1Centrum Wiskunde & Informatica (CWI), 2Centrum Wiskunde & Informatica
16:20 - 16:40    Widespread Gender and Pronoun Bias in Moral Judgments across LLMs
Gustavo Lucius Fernandes, Jeiverson Santos, Pedro O.S Vaz-de-Melo
UFMG
15:00 - 16:40    Session O38: Knowledge Discovery and Representation - Auditorium Mallorca
Chair: Gilles Sérasset
15:00 - 15:20    Frame2KG: A Benchmark and Evaluation Toolkit for Interpretable Frame-to-Graph Generation
Lewis N Watson, Carl Strathearn, Kenny Mitchell, Yanchao Yu
Edinburgh Napier University
15:20 - 15:40    Injecting Structured Biomedical Knowledge into Language Models: Continual Pretraining vs. GraphRAG
Jaafer Klila1, Sondes Bannour Souihi2, Rahma Boujelbane3, Nasredine Semmar4, Lamia Hadrich-Belguith5
1PhD student, 2CEA, 3FSEGS, 4CEA LIST, 5ANLP Research Group, MIRACL Lab, FSEGS, Sfax University
15:40 - 16:00    Linguistic Knowledge Graphs for Sense Prediction: A Case-study on Latin
Eleonora Ghizzota1, Paola Marongiu2, Pierpaolo Basile3, Stefano Ferilli4, Barbara McGillivray5
1University of Bari Aldo Moro, 2CNR-ILC, Istituto di Linguistica Computazionale 'A. Zampolli', 3Department of Computer Science, University of Bari Aldo Moro, 4Università degli Studi di Bari, 5King's College London
16:00 - 16:20    ACID: On the Perception of Online Classism
Arianna Muti1, Elisa Bassignana2, Amanda Cercas Curry1, Federica Durante3, Dirk Hovy1, Debora Nozza1
1Bocconi University, 2IT University of Copenhagen, 3Università Milano Bicocca
16:20 - 16:40    The Spectrum of Sentiment: Optimistic, Pessimistic, and Neutral Voices in Online Depression Discourse
Stefana Arina Tabusca1, Ana-Maria Bucur2, Liviu P. Dinu1
1University of Bucharest, 2Università della Svizzera italiana
15:00 - 16:40    Session O39: Applications Involving LRs and Evaluation III - Menorca 1
Chair: Andreas Hotho
15:00 - 15:20    A Benchmark Dataset and Comparative Evaluation of Phonemized and Romanized Urdu for Text-to-Speech
M Kaab Bin Shahid1 and Muhammed Izharuddin2
1University of Stuttgart, 2Aligarh Muslim University
15:20 - 15:40    S-VoCAL: A Dataset and Evaluation Framework for Inferring Speaking Voice Character Attributes in Literature
Abigail Berthe-Pardo1, Gaspard Michel2, Elena V. Epure2, Christophe Cerisara3
1Université de Lorraine, CNRS, LORIA, 2Deezer Research, 3Universite de Lorraine, CNRS, LORIA
15:40 - 16:00    BankMathBench: A Benchmark for Numerical Reasoning in Banking Scenarios
Yunseung Lee1, Subin Kim2, Youngjun Kwak2, Jaegul Choo3
1KakaoBank Corp., 2Kakaobank, 3Korea Advanced Institute of Science and Technology
16:00 - 16:20    ParliaBench: An Evaluation and Benchmarking Framework for LLM-Generated Parliamentary Speech
Marios Koniaris, Argyro Tsipi, Panayiotis Tsanakas
National Technical University of Athens
16:20 - 16:40    Simple Additions, Substantial Gains: Expanding Scripts, Languages, and Lineage Coverage in URIEL+
Mason Shipton1, York Hay Ng2, Aditya Khan2, Phuong H Hoang2, Xiang Lu3, A. Seza Dogruoz4, Annie En-Shiun Lee5
1Ontario Tech University, 2University of Toronto, 3University of Michigan, 4Universiteit Gent, 5Ontario Tech University, University of Toronto
15:00 - 16:40    Session O40: Multimodality, Cross-modality - Eivissa 1
Chair: Costanza Navarretta
15:00 - 15:20    SciClaimEval: Cross-modal Claim Verification in Scientific Papers
Xanh Ho1, Yun-Ang Wu2, Sunisth Kumar3, Tian Cheng Xia4, Florian Boudin5, Andre Greiner-Petter6, Akiko Aizawa1
1National Institute of Informatics, 2National Taiwan University, 3University of Tokyo, 4University of Bologna, 5Inria, LS2N, Nantes Université, 6University of Goettingen
15:20 - 15:40    Localizing Events in Space: Comparing Humans and AI Models
Derrick Eui Gyu Kim, Kenneth Lai, James Pustejovsky
Brandeis University
15:40 - 16:00    STRUDEL: Unrolling a Benchmark for Evaluating Vision-Language Models on Structured Diagram Understanding across Domains
Daniel Steinigen, Lucie Flek, Sebastian Houben
Fraunhofer IAIS
16:00 - 16:20    VG-CoT: Towards Trustworthy Visual Reasoning via Grounded Chain-of-Thought
Byeonggeuk Lim, Kyeonghyun Kim, Jungmin Yun, Youngbin Kim
Chung-ang University
16:20 - 16:40    VectorEdits: A Dataset and Benchmark for Instruction-Based Editing of Vector Graphics
Josef Kuchar1, Marek Kadlcik2, Michal Spiegel3, Michal Stefanik1
1Masaryk University, 2Faculty of Informatics, Masaryk University, 3Kempelen Institute of Intelligent Technologies
15:00 - 16:40    Session P10.1: Social Media - Poster Area 2
Chair: Thorsten Trippel
ViWikiFC: Fact-Checking for Vietnamese Wikipedia-Based Textual Knowledge Source
Hung Tuan Le1, Long Truong To1, Manh Trong Nguyen1, Kiet Van Nguyen2
1University of Information Technology, HCM VNU, 2University of Information Technology, VNU-HCM
Automated Extraction of Answer Candidates for Question Generation
Claudia Preda1, Mihai Dascalu1, Stefan Ruseti2, Danielle S. McNamara3
1National University of Science and Technology POLITEHNICA Bucharest, 2University Politehnica of Bucharest, 3Arizona State University
Green Bots versus Red Bots: Evaluating Large Language Models for Simulating Persuasion Dynamics in Online Influence Campaigns
Majd Eddin Al Ali1, Filip Mihai Muntean2, Lucia Donatelli1, Jurriaan van Diggelen3
1Vrije Universiteit Amsterdam, 2Vrije Universiteit, 3TNO
Towards Expectation Detection in Language: A Case Study on Treatment Expectations in Reddit
Aswathy Velutharambath1 and Amelie Wührl2
1University of Stuttgart, 2IT University of Copenhagen
Empathy Speaks in Metaphors: The Empathy-Metaphor Corpus of Figurative Language in Empathetic Text
Gyeongeun Lee1 and Natalie Parde2
1University of Illinois at Chicago, 2University of Illinois Chicago
A Computational Diachronic Analysis of Gen Z Mental Health Discourse: A Large-scale Reddit Corpus Study from Pre- to Post-COVID
Felix Mao
Rye Country Day School
"Oat Milk Vegan Chocolate Taste Great!": Monitoring the Food Transition Debate in Reddit
Greta Zella1, Jan Willem Bolderdijk2, Saskia Peels1, Gerry Wakker1, Tommaso Caselli3
1University of Groningen, 2University of Amsterdam, University of Groningen, 3Rijksuniversiteit Groningen
ClimateChat-300K: A Multi-Modal Facebook Dataset for Understanding Diverse Perspectives in Climate Communication
Wajdi Zaghouani1, Md. Rafiul Biswas2, Mabrouka Bessghaier1, Shimaa Amer Ibrahim1, George Mikros2
1Northwestern University Qatar, 2Hamad Bin Khalifa University
HateMirage: An Explainable Multi-Dimensional Dataset for Decoding Faux Hate and Subtle Online Abuse
Sai Kartheek Reddy Kasu1, Shankar Biradar2, Sunil Saumya3, Md. Shad Akhtar4
1Student, 2Assistant Professor, 3Indian Institute of Information Technology Dharwad, 4Indraprastha Institute of Information Technology, Delhi
MindSET: Advancing Mental Health Benchmarking through Large-Scale Social Media Data
Saad Mankarious1, Edward Kempa2, Daniel Wiechmann3, Elma Kerz4, Yu Qiao5, Ayah Zirikly6
1Cornell College, 2University of Florida, Department of Computer and Information Science and Engineering, 3Institute for Logic Language and Computation, 4Exaia Technologies, 5RWTH Aachen University, 6Johns Hopkins University
A Corpus of Misunderstood Irony on Turkish Social Media
Çağrı Çöltekin and Güliz Güneş
University of Tübingen
15:00 - 16:40    Session P10.2.1: Linguistics and Psycholinguistics I - Poster Area 2
Chair: Aditya Kamlesh Parikh
A Corpus of Joint EEG and Self-Paced Reading of Natural Dutch Texts
Sara Møller Østergaard, Lenneke Doris Lichtenberg, Laura Boon, Bruno Nicenboim
Tilburg University
Evaluation Drift in LLM Personality Induction: Are We Moving the Goalpost?
Prateek Kumar Rajput, Yewei Song, Iyiola Emmanuel Olatunji, Jacques Klein, Tegawendé Bissyande
University of Luxembourg
A Multi-Dialectal, Longitudinal Corpus of Human-AI Hybrid Language Production
Qiao Gan1, Jonathan Dunn2, Andrea Nini3, Benjamin Adams1
1University of Canterbury, 2University of Illinois Urbana-Champaign, 3University of Manchester
Semantic Information: A Difference That Makes a Difference
J. Nathanael Philipp1, Max Kölbl2, Michael Richter3
1Sächsische Akademie der Wissenschaften zu Leipzig, 2Osaka University, 3Leipzig University
Modeling the Memory-Surprisal Trade-Off over Time: Communicative Efficiency Decreases with Lexico-Grammatical Change in Scientific English
Julius Steuer1, Marie-Pauline Krielke2, Stefania Degaetano-Ortlieb2, Elke Teich3, Dietrich Klakow2
1Heidelberg Institute for Theoretical Studies, 2Saarland University, 3Universität des Saarlandes
Mechanistic Interpretability Meets Cognitive Linguistics: Modelling Locative Image Schemas in the Circuit Framework
Mattia Proietti1, Afra Alishahi2, Grzegorz Chrupała2, Alessandro Lenci3
1Università di Pisa, 2Tilburg University, 3University of Pisa
Variation Is the Norm: Embracing Sociolinguistics in NLP
Anne-Marie Lutgen1, Alistair Plum1, Verena Blaschke2, Barbara Plank2, Christoph Purschke1
1University of Luxembourg, 2LMU Munich
Appraisal Theory-Informed Emotion Prediction
Xiaowei Wang1, Jayant Teotia2, Rui Mao3, Wandeep Kaur Ratan Singh1, Sabrina Binti Tiun1, Erik Cambria4
1Universiti Kebangsaan Malaysia, 2NTU, 3Ruimao Tech, 4Nanyang Technological University
The Evolution of Philosophy: A Metaphorical Cognition Perspective
Rui Mao1, Dapeng Chen2, Zihao Huang3, Xulang Zhang3, Erik Cambria3
1Ruimao Tech, 2Jiangsu Open University, 3Nanyang Technological University
15:00 - 16:40    Session P10.2.2: Linguistics and Psycholinguistics II - Poster Area 2
Chair: Cecilia Kuan
Predicting States of Understanding in Explanatory Interactions Using Cognitive Load-Related Linguistic Cues
Yu Wang1, Olcay Türk1, Angela Grimminger2, Hendrik Buschmeier1
1Bielefeld University, 2Paderborn University
Figurative Language in Alzheimer's Discourse: Linguistic and Neural Alignment in Clinical Narratives
Diana Kylymnyk1, Vitória Hilgert Tomasel2, Helena Caseli3, Edward Watkins4, Aline Villavicencio5, Rodrigo Wilkens4
1Department of Computer Science and Psychology, University of Exeter, 2Federal University of Sao Carlos, 3Federal University of São Carlos, 4University of Exeter, 5University of Exeter, UK
Prompting Instruction-tuned LLMs for Semantic Similarity Values
Xander Akiko Snelder, Yunchong Huang, Jelke Bloem
University of Amsterdam
Rethinking Evaluation in Retrieval-Augmented Personalized Dialogue: A Cognitive and Linguistic Perspective
Tianyi Zhang1 and David Traum2
1University of Southern California, 2University of Southern California Institute for Creative Technologies
Evaluating Multimodal Large Language Model Narrative Interpretation through the Lens of Appraisal Theory
Jayant Teotia1, Xiaowei Wang2, Xulang Zhang3, Rui Mao3, Erik Cambria3
1NTU, 2Universiti Kebangsaan Malaysia, 3Nanyang Technological University
Mapping Liberty Metaphors across Cultures and Time
Sidney Suen1, Rui Mao1, Kenneth Kwok2, Erik Cambria1
1Nanyang Technological University, 2Agency for Science, Technology and Research
The Sensorimotor Norms for the Chinese Classifiers
Yimei Shao1, Yu-Yin Hsu1, Chu-Ren Huang2
1The Hong Kong Polytechnic University, 2The Hong Kong Polytechnic University
DeepQuestion: Systematic Generation of Real-World Challenges for Evaluating LLMs Performance
Ali Khoramfar, Ali Ramezani, Mohammad Mahdi Mohajeri, Mohammad Javad Dousti, Majid Nili Ahmadabadi, Heshaam Faili
University of Tehran
Pragmatic Modelling in Language Learning: Caregiver Question-Answer Feedback in Child-Directed Dialogue
Maryam Bala1, Johannes Heim2, Elspeth Edelstein2, Arabella Sinclair3
1University of Southampton, 2University of Aberdeen, 3University College London
15:00 - 16:40    Session P10.3.1: Parsing and Tagging I - Poster Area 2
Chair: Patrick Paroubek
Modular Approach to Automating Morphological Components in Grammar Engineering
Ekaterina Voloshina1 and Krasimir Angelov2
1University of Gothenburg, Chalmers University of Technology, 2University of Gothenburg and Chalmers University of Technology
MorfFlex: Handling Rich Morphology
Jaroslava Hlaváčová1, Marie Mikulová2, Barbora Štěpánková3, Milan Straka3, Jan Hajič2
1CUNI, 2Charles University, 3Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics
Using Valency Inheritance in Building a Valency Lexicon
Václava Kettnerová1, Veronika Kolářová1, Jiří Mírovský2, Michal Olbrich2
1Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics, 2Charles University
From CHAT to Coded CoNLL-U: A Reproducible Pipeline for the Syntactic Annotation and Querying of Child Language Data
Achim Stein
University of Stuttgart
TækTåk: Syntactic Analysis of Language Use on Danish TikTok
Thea Kristensen and Rob van der Goot
IT University of Copenhagen
Adaptive Chunking: Optimizing Chunking-Method Selection for RAG
Paulo Roberto de Moura Júnior, Jean Lelong, Annabelle Blangero
Ekimetrics
Do Large Language Models Grasp the Grammar? Evidence from Grammar-Book-Guided Probing in Luxembourgish
Lujun Li1, Yewei Song1, Lama Sleem1, Yiqun Wang1, Yangjie Xu1, Cedric Lothritz2, Niccolo' Gentile3, Radu State1, Tegawendé F. Bissyandé1, Jacques Klein1
1University of Luxembourg, 2Luxembourg Institute of Science and Technology (LIST), 3Foyer S.A.
Survey of Tools for Manual Linguistic Annotation: Supporting Diversity through Interactive Exploration
Ludovica Pannitto1, Kaja Dobrovoljc2, Bruno Guillaume3
1LILEC - University of Bologna, 2University of Ljubljana & Jozef Stefan Institute, 3LORIA / Inria Nancy Grand-Est
TextLens & LeTTuce: Automated Corpus Annotation and Multilingual Tagging as a Service
Cynthia Van Hee1, Jonas Doumen2, Vincent Prins3, Pranaydeep Singh4, Vincent Vandeghinste3, Els Lefever5
1LT3, Language and Translation Technology Team (Ghent University), 2KU Leuven, imec research group itec, 3Instituut voor de Nederlandse Taal, 4LT3, University of Ghent, 5LT3, Ghent University
The Corpus of Contemporary Polish — a New Reference Corpus with Rich Syntactic Annotations
Witold Kieraś1, Małgorzata Marciniak2, Marcin Woliński1, Katarzyna Krasnowska-Kieraś1, Marek Łaziński1
1Institute of Computer Science, Polish Academy of Sciences, 2Institute of Computer Science Polish Academy of Sciences
Prague Dependency Treebank - Consolidated 2.0: Enriching a Complex Annotation Scheme
Marie Mikulová1, Jiří Mírovský1, Milan Straka2, Pavlína Synková1, Jan Štěpánek3, Barbora Štěpánková2, Jan Hajič1
1Charles University, 2Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics, 3Charles University in Prague, Faculty of Mathematics and Physics, UFAL
Meet UD_Czech-PDTC: A Large and Genre-Rich Treebank in Universal Dependencies
Marie Mikulová1, Barbora Štěpánková2, Daniel Zeman3, Jan Štěpánek4, Milan Straka2, Jan Hajič1
1Charles University, 2Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics, 3Charles University, Faculty of Mathematics and Physics, 4Charles University in Prague, Faculty of Mathematics and Physics, UFAL
Encoding Logical Relations of Chinese Complex Sentences within the Universal Dependencies Framework
Hongpu Zhu and Hongzhi Xu
Shanghai International Studies University
Unsupervised Labelling of Mutation Triggers in Welsh
Nicolás Gutiérrez-Rolón and Fernando Alva-Manchego
Cardiff University
15:00 - 16:40    Session P10.3.2: Parsing and Tagging II - Poster Area 2
Chair: Amy Isard
UzUDT: Uzbek Universal Dependencies Treebank
Sanatbek Gayratovich Matlatipov1 and Mersaid Aripov2
1Dr, 2Professor
BRAGD: Constrained Multi-Label POS Tagging for Faroese
Annika Simonsen1, Barbara Scalvini2, Uni Johannesen2, Iben Nyholm Debess2, Hafsteinn Einarsson3, Vésteinn Snæbjarnarson4
1The University of Iceland, 2University of the Faroe Islands, 3University of Iceland, 4University of Copenhagen
Syntactic Sugar for Syntactic Queries: Sequential Representations for Dependency Queries
Niklas Deworetzki1 and Arianna Masciolini2
1Department of Computer Science and Engineering, Chalmers University of Technology and University of Gothenburg, 2Språkbanken Text; Department of Swedish, Multilingualism, Language Technology; University of Gothenburg
Context Is (Almost) Everything: Llama-3 on Structured Output and AMR Parsing
Maja Buljan1, Stephan Oepen2, Lilja Øvrelid3
1Language Technology Group (LTG), University of Oslo, 2Universitetet i Oslo, 3Dept of Informatics, University of Oslo
Towards the Morphological Annotation of North Markian (Low German)
Christian Chiarcos
University of Augsburg
Cross-Dataset Inconsistencies in Morphological Annotation: Evidence from Universal Dependencies
Vlasta Ohlídalová
Masaryk University
Improving Latvian Morphosyntactic Parsing with Pretrained Encoders and Analyzer-Constrained Decoding
Arturs Znotins
Institute of Mathematics and Computer Science, University of Latvia
CommonMorph: Participatory Morphological Documentation Platform
Aso Mahmudi1, Sina Ahmadi2, Kemal Maulana Kurniawan3, Rico Sennrich2, Eduard H Hovy3, Ekaterina Vylomova3
1The University of Melbourne, 2University of Zurich, 3University of Melbourne
Datasets for Verb Alternations across Languages: BLM Templates and Data Augmentation Strategies
Giuseppe Samo1 and Paola Merlo2
1IDIAP, 2University of Geneva
A Large and Balanced Multi-Domain Arabic Corpus Annotated for Morphology, Syntax, and Readability
Khalid N. Elmadani1, Adel Mahmoud Wizani2, Hanada Taha Thomure3, Nizar Habash1
1New York University Abu Dhabi, 2University of Turin, 3Zayed University
The DELPH-IN Grammary: A Curated Repository of Grammars and Treebanks
Francis Bond1 and Dan Flickinger2
1Palacky University, 2Stanford University
Morphemes without Borders: Evaluating Root–Pattern Morphology in Arabic Tokenizers and LLMs
Yara Yousif Alakeel1, Chatrine Qwaider2, Hanan Aldarmaki2, Sawsan Alqahtani1
1SDAIA, 2MBZUAI
15:00 - 16:40    Session P10.4.1: Lexicon and Semantics II - Poster Area 2
Chair: António Branco
APODICTUS: Automatic Processing of DICTionary Update candidateS
Felix Blessing1, Johannes S. Sax1, Julian Kaufmann1, Wei Zhao2, Nikolay Arefyev3, Dominik Schlechtweg1
1University of Stuttgart, 2University of Aberdeen, 3University of Oslo
A Test Collection for Part-of-Speech Tagging and Word Sense Disambiguation
Robert Krovetz
Lexical Research
Creating a Hybrid Rule and Neural Network Based Semantic Tagger Using Silver Standard Data: The PyMUSAS Framework for Multilingual Semantic Annotation
Andrew Moore1, Paul Rayson1, Dawn Archer2, Tim Czerniak3, Dawn Knight4, Daisy Monika Lal1, Gearóid Ó Donnchadha5, Mícheál J. Ó Meachair6, Scott Piao1, Elaine Uí Dhonnchadha3, Johanna Vuorinen5, Yan Yabo7, Xiaobin Yang7
1Lancaster University, 2Manchester Metropolitan University, 3Trinity College Dublin, 4Cardiff University, 5independent researcher, 6Fiontar & Scoil na Gaeilge, Dublin City University, 7Hubei University
Scare Quotes as Markers of "Questionable" Word Usages and Misalignment in Conversation: An Annotation Study
Aina Garí Soler1, Juan Carlos Zevallos Huaco2, Matthieu Labeau3, Chloé Clavel4
1PSL University, INRIA Paris, 2Independent Researcher, 3Telecom Paris, 4INRIA
Modeling Clinical Uncertainty in Radiology Reports: From Explicit Uncertainty Markers to Implicit Reasoning Pathways
Paloma Rabaey1, Jong Hak Moon2, Jung-Oh Lee3, Min Gwan Kim4, Hangyul Yoon2, Thomas Demeester1, Edward Choi2
1Ghent University, 2KAIST, 3Mount Sinai Hospital, 4Seoul National University Hospital
ArabDiscrim: A Decade-Long Arabic Facebook Corpus on Racism and Discrimination
Wajdi Zaghouani1, Shimaa Amer Ibrahim1, Mabrouka Bessghaier1, Houda Bouamor2
1Northwestern University Qatar, 2Carnegie Mellon University in Qatar
DAMETA: An LLM Benchmark for Danish Metaphor Interpretation with Systematically Varied Distractors
Nina Skovgaard Schneidermann1, Sanni Nimb2, Nathalie Carmen Hau Norman1, Sussi Olsen3, Bolette Pedersen1
1University of Copenhagen, 2Society for Danish Language and Literature (DSL), 3UCPH, NorS, Centre for Language Technology
A New Semantic Artifact Based Framework for Studying and Documenting Algospeak and Related Phenomena
Fahad Khan1, Elisa Gugliotta2, Elisa Squadrito3, Maura Tarquini2, Francesca Frontini4
1Istituto di Linguistica Computazionale "Antonio Zampolli", CNR, 2Università degli Studi di Sassari, 3Università di Macerata, 4CNR- Istituto di Linguistica Computazionale "A. Zampolli" - ILC Consiglio Nazionale delle Ricerche
Creating a High Quality Abstract Meaning Representation Dataset Automatically
Johannes Heinecke1, Asadullah Munshi2, Frédéric Herledan2, Geraldine Damnati1
1Orange Innovation, 2Orange
Towards a Comprehensive English Wordnet-Wikidata Mapping
John P. McCrae1, Johann Bergh2, Krasimir Angelov3
1Insight Center for Data Analytics, National University of Ireland Galway, 2Lingolutions, 3University of Gothenburg and Chalmers University of Technology
AmDi - Ambiguous Words Diachronic Dataset
Felix Thielen1 and Kai Kugler2
1Trier University, 2Trier University
15:00 - 16:40    Session P10.4.2: Lexicon and Semantics III - Poster Area 2
Chair: Gerard De Melo
GerVLPro: A CEFR-Graded Vocabulary List of L2 Learners' Productive Vocabulary in German
Noah-Manuel Michael1, Anna Huelsing2, Andrea Horbach3
1Kiel University, 2CAU, 3CAU Kiel / Leibniz Institute for Science and Mathematics Education
Building Bridges between Student and Curricular Language: Creating a Corpus of Abstract Meaning Representations for the Classroom
Kristin Wright-Bettner1, Zheng Cai2, Zekun Zhao3, James H. Martin1, Jeffrey Flanigan4, Martha Palmer5
1University of Colorado Boulder, 2The University of Colorado, 3University of California, Santa Cruz, 4UC Santa Cruz, 5University of Colorado
Mu'jam Arriyadh: A Comprehensive Lexicon for Contemporary Arabic Language
Afrah A. Altamimi1, Abdulrahman Alosaimy2, Halah Munif Alharbi3, Hawra Aljasim3, Muneera Alhoshan4, Amal Almazrua5, Hanan Alharbi3, Abdulrahman Saeed Alshehri1, Bayan M Almuqhim3, Maryam H Algarny3, Yahya A Asiri6, Abdullah I. Alharbi7, Saleh Zaidan Albalawi3, Fawziah Mohammed Asiri1, Sara Ali Alhifthi8, Abdullah Alfaifi5
1KSGAAL, 2King Salman Academy for Arabic Language / Imam Mohammed Bin Saud Islamic University, 3King Salman Global Academy for Arabic Language, 4King Salman Global Academy for Arabic Language, 5KSAA, 6King Salman Global Academy of Arabic Language, 7King Salman Global Academy for Arabic, 8Saudi Arabia
The Romanian Corpus Annotated with Multiword Expressions. PARSEME-Ro Version 2.0
Verginica Barbu Mititelu1, Mihaela Cristescu2, Elena Irimia1, Carmen Mîrzea Vasile2
1Research Institute for Artificial Intelligence, Romanian Academy, 2University of Bucharest
Missing Links: LLM-Augmentation of Event Triggers of State Changes in the OpenPI Dataset
Kyeongmin Rim1 and James Pustejovsky2
1Department of Computer Science, Brandeis University, 2Brandeis University
VUPMC: A New Political Metaphor Corpus in Mandarin Chinese
Xiaojuan Tan
VU Amsterdam
Not All Disneys Are the Same: Making Coreference Metonymy-Aware
Bingyang Ye, Jingxuan Tu, James Pustejovsky
Brandeis University
JSTS-Neg: Japanese Semantic Textual Similarity Dataset for Evaluating Negation Understanding Ability
Reiko Yuasa, Yoshihide Kato, Shigeki Matsubara
Nagoya University
Few-shot Prompting or Supervised Tuning? A Comparative Study of LLMs for Linguistically Distant Language Pairs in BDI
Deepen Naorem1, Sanasam Ranbir Singh2, Telem Joyson Singh3, Priyankoo Sarmah4
1Indian Institute of Technology, Guwahati, 2Indian Institute of Technology, 3IIT Guwahati, 4Indian Institute of Technology Guwahati
When Structure Matters: Cross-Lingual Hyperbolic Embeddings for Chinese and English Wordnets
Mao-Chang Ku1, Da-Chen Lian2, Pin-Er Chen1, Po-Ya Angela Wang1, Wei-Ling Chen1, Shu-Kai Hsieh2
1National Taiwan University, 2Graduate Institute of Linguistics, National Taiwan University
16:40 - 17:00    Coffee Break
17:00 - 18:20    LREC 2026 Closing Ceremony - Auditorium Illes Balears
Chair: Stelios Piperidis
20:00    LREC 2026 GALA Dinner
End of Day 3