Conference Programme – Day 3

Day 3
Friday, 15 May, 2026
09:00 - 10:40    Session O29: Infrastructures, Policy and Legal Issues II - Room 1
09:00 - 09:20  Mitigating Misinterpretation in Policy Documents through Automated Language Understanding
Momojit Biswas, Anka Chandrahas Tummepalli, Preethu Rose Anish
TCS Research
09:20 - 09:40  Sovereign AI-based Public Services Are Viable and Affordable
António Branco1, Luis Gomes2, Rodrigo Santos1, Eduardo Santos1, João Ricardo Silva1, Nuno Marques1, Madalena Rodrigues1
1University of Lisbon, 2Faculdade de Ciencias da Universidade de Lisboa
09:40 - 10:00  A Typology of Synthetic Datasets for Dialogue Processing in Clinical Contexts
Steven Bedrick1, A. Seza Dogruoz2, Sergiu Nisioi3
1Oregon Health & Science University, 2Universiteit Gent, 3Human Language Technologies Research Center, University of Bucharest
10:00 - 10:20  Text+: A National Hub Including Legacy Language Data
Florian Barth1, Christoph Draxler2, Jennifer Ecker3, Stefan Fischer4, Philippe Genêt5, Alina Hemmer6, Timm Lehmberg7, Thorsten Trippel8, Andreas Witt3, Arden Zimmermann5, Claus Zinn9
1University of Göttingen, 2Institute of Phonetics and Speech Processing, LMU Munich, 3Leibniz Institute for the German Language, 4Universität des Saarlandes, 5Deutsche Nationalbibliothek, 6University of Hamburg, 7Academy of Science and Humanities in Hamburg, 8Leibniz-Institut für Deutsche Sprache, 9University of Tübingen
10:20 - 10:40  Can NLP Tackle Hate Speech in the Real World? Stakeholder-Informed Feedback and Survey on Counterspeech
Tanvi Dinkar1, Aiqi Jiang1, Simona Frenda2, Poppy Gerrard-Abbott3, Nancie Gunson2, Gavin Abercrombie1, Ioannis Konstas2
1Heriot-Watt University, 2Heriot-Watt University, 3University of Edinburgh/Heriot-Watt University
09:00 - 10:40    Session O30: Opinion and Argument Mining, Sentiment Analysis - Room 2
09:00 - 09:20  Towards Complex Debate Understanding: Predicting Claim Impact Scores through the Modelling of Claim Interactions
Maxime Brouat1, Mihai Surdeanu2, Srdjan Vesic1, Eduardo Blanco2
1CRIL CNRS Univ. Artois, 2University of Arizona
09:20 - 09:40  Is There Anything More Deceptive than an Obvious Fact? Investigating Implicitness in User-Generated Argumentative Text
Ekaterina Sviridova1, Elena Cabrio2, Serena Villata3
1Université Côte d'Azur, 2Université Côte d'Azur, Inria, CNRS, I3S, 3Université Côte d'Azur, CNRS, Inria, I3S
09:40 - 10:00  Best-Worst Scaling of Hype in Biomedical Research: Building an Intensity Lexicon of Promotional Adjectives
Neil Millar1, Dipesh Satav1, Bojan Batalo2, Erica K. Shimomoto3, Ryosuke Ohniwa1
1University of Tsukuba, 2AIST, 3National Institute of Advanced Industrial Science and Technology
10:00 - 10:20  Trust Me, I Can Convince You: The Contextualized Argument Appraisal Framework and the ContArgA Corpus
Lynn Greschner, Sabine Weber, Roman Klinger
University of Bamberg
10:20 - 10:40  Towards Clinical Applications of NLP: Detecting Emotion Regulation via Emotional Categories and Expression Modes in French Transcriptions
Salome Klein1, Amalia Todirascu2, Hélène Vassiliadou3
1UR 1339/LiLPa & FRLC (University of Strasbourg), 2LiLPa, University of Strasbourg, 3University of Strasbourg
09:00 - 10:40    Session O31: Bias, Offensive and Non-inclusive Language - Room 3
09:00 - 09:20  R.U.Psycho? A Framework for Robust Unified Psychometric Testing of Language Models
Julian Schelb1, Orr Borin2, David Garcia1, Andreas Spitz1
1University of Konstanz, 2Recosys
09:20 - 09:40  Code-switching as a Bias Indicator in LLMs: "the Consequences Are Not the Same Para Nosotros"
Fanny Ducel1, Aurélie Névéol2, Vidit Khazanchi3, Loïc Leclere4, Arthur Pedrini4, Léa Bouchet5, Benjamin Caissial5, Karen Fort6
1LISN, Université Paris-Saclay, 2Université Paris Saclay, CNRS, LISN, 3LORIA, 4Université de Lorraine, LORIA, 5Université de Lorraine, 6Sorbonne Université and LORIA
09:40 - 10:00  Exploration of How Hate Is Framed on Social Media
Rakshitha Rao Ailneni and Sanda Harabagiu
University of Texas at Dallas
10:00 - 10:20  Are Social Biases in LLMs Consistent across Generative Tasks? A Case Study for Basque
Muitze Zulaika1, Xabier Saralegi1, Julia Shershneva2, Lia Gonzalez2, Arkaitz Fullaondo2
1Orai NLP Technologies, 2University of the Basque Country (EHU)
10:20 - 10:40  Fine-grained Narrative Classification in Biased News Articles
Zeba Afroz1, Harsh Vardhan1, Pawan Bhakuni2, Aanchal Punia3, Rajdeep Kumar4, Md. Shad Akhtar1
1Indraprastha Institute of Information Technology, Delhi, 2Bharat Electronics Ghaziabad, 3Bharat Electronics, 4Bharat Electronics Limited
09:00 - 10:40    Session O32: Speech Resources, Processing, Applications - Room 4
09:00 - 09:20  A Shoal of Voices: Parallel Read Speech from Professional Swedish Narrators
Christina Tånnander1, Jim O'Regan2, Jens Edlund3
1KTH Speech, Music and Hearing, MTM, 2KTH Royal Institute of Technology, 3KTH Speech, Music and Hearing
09:20 - 09:40  Deep Learning-Based Multi-Aspect Pronunciation Assessment for Individuals with Down Syndrome
David Fernández-García, César González-Ferreras, Valentín Cardeñoso-Payo, Mario Corrales-Astorgano
Universidad de Valladolid
09:40 - 10:00  WikIPA: Integrating WikiPron and Lingua Libre for Multilingual IPA Transcription
Pierluigi Cassotti1, Jacob Suchardt2, Domenico De Cristofaro3
1University of Gothenburg, 2Leipzig University, 3Free University of Bozen
10:00 - 10:20  How Pragmatics Shape Articulation: A Computational Case Study in STEM ASL Discourse
Saki Imai1, Lee Kezar2, Laurel Aichler3, Mert Inan1, Erin Walker4, Alicia Wooten3, Lorna Quandt3, Malihe Alikhani1
1Northeastern University, 2University of Southern California, 3Gallaudet University, 4University of Pittsburgh
10:20 - 10:40  Setting the Stage for Disfluency: Implications of Contextual Task Framing Effects for the Design of Listening Tasks
Ambika Kirkland1 and Jens Edlund2
1KTH Royal Institute of Technology, 2KTH Speech, Music and Hearing
09:00 - 10:40    Session P8.1.1: Machine Translation I - Poster Area
  ACAData: Parallel Dataset of Academic Data for Machine Translation
Iñaki Lacunza1, Javier Garcia Gilabert2, Francesca De Luca Fornaciari3, Javier Aula-Blasco1, Aitor Gonzalez-Agirre4, Maite Melero1, Marta Villegas1
1Barcelona Supercomputing Center, 2Barcelona Supercomputing Center, 3BSC Barcelona Supercomputing Center, 4Barcelona Supercomputing Center (BSC)
  A Single Model Ensemble Framework for Neural Machine Translation Using Pivot Translation
Seokjin Oh1, Keonwoong Noh2, Woohwan Jung3
1SK Siltron, 2Korea University, 3Hanyang University
  Gender Disambiguation in Machine Translation: Diagnostic Evaluation in Decoder-Only Architectures
Chiara Manna, Hosein Mohebbi, Afra Alishahi, Frederic Blain, Eva Vanmassenhove
Tilburg University
  Building a One-Million-Pair Bokmål–Nynorsk Translation Corpus: A Quality-First Harvesting and Cleaning Pipeline
Per Kummervold1, Thea Tollersrud2, Angelina Zanardi2
1The National Library of Norway, 2National Library of Norway
  New Trends for Modern Machine Translation with Large Reasoning Models
Sinuo Liu1, Chenyang Lyu2, Minghao Wu3, Zifu Shang2, Longyue Wang4, Weihua Luo2, Kaifu Zhang2
1University of Edinburgh, 2Alibaba Group, 3Monash University, 4Tencent AI Lab
  MaitH 1.0: A Parallel Corpus and Baseline for Low-Resource Maithili-Hindi Translation
Kamanksha Dubey1, Chandresh Maurya2, Kumar Padmanabh3
1Indian Institute of Technology, 2IIT Indore, 3EBTIC (Etisalat British Telecom Innovation Center, Khalifa University)
  NRD: A Hybrid Disentanglement Framework for Mitigating Interference in Multilingual Machine Translation
Jiarui Zhang1 and Yifan Deng2
1Institute of Information Engineering, 2University of Chinese Academy of Sciences
  Linguistic and Demographic Factors in Online Free Translation Task
Tyler Lee, Irina Stenger, Tania Avgustinova
Saarland University
  Biases in Translation: Assessing Opinion Distortion in Machine Translated Texts
Nazanin Shafiabadi1 and François Yvon2
1Sorbonne University and ISIR, 2ISIR CNRS & Sorbonne Université
  When Translations Surprise: Human Awareness of Predictability in Translations
Cristian García-Romero1, Miquel Esplà-Gomis2, Felipe Sanchez-Martinez2
1University of Alicante, 2Universitat d'Alacant
  Bidirectional Chinese and English Passive Sentences Dataset for Machine Translation
Xinyue Ma1, Pol Pastells2, Mireia Farrus1, Mariona Taule2
1Universitat de Barcelona, 2University of Barcelona
  CoTERM: A Consistency-Oriented Term Metric for MT System Evaluation
Amir Hazem1 and Kyo Kageura2
1RCAST, The University of Tokyo, 2University of Tokyo
  SiniticMTError: A Machine Translation Dataset with Error Annotations for Sinitic Languages
Hannah Liu1, Junghyun Min2, Annie Lee1, Ethan Yue Heng Cheung1, Shou-Yi Hung1, Elsie Chan1, Shiyao Qian1, Runtong Liang1, Kimlan Huynh1, Wing Yu Yip1, York Hay Ng1, Tsz Fung Yau3, Ka Ieng Charlotte Lo1, You-Wei Wu4, Richard Tzong-Han Tsai5
1University of Toronto, 2Georgetown University, 3Scotiabank, 4National Central University, 5Academia Sinica
  Ancient Greek to Modern Greek Machine Translation: A Novel Benchmark and Fine-Tuning Experiments on LLMs and NMT Models
Spyridon Mavromatis1, Sokratis Sofianopoulos2, Prokopis Prokopidis3, Maria Giagkou3
1Institute for Speech and Language Processing, Athena Research Center & National and Kapodistrian University of Athens, 2Researcher, 3ILSP/Athena RC
09:00 - 10:40    Session P8.1.2: Machine Translation II - Poster Area
  Linguistic Knowledge-Infused Fine-Tuning for Mitigating Gender Bias in Machine Translation
Luis Ernesto Garcia Estrada1, Audrey Mash2, Carlos Escolano3, Maite Melero2, Christine Basta4
1Universidad Politecnica de Catalunya, 2BSC, 3Universitat Politècnica de Catalunya, Barcelona Supercomputing Center, 4Alexandria University
  What Triggers My Model? Contrastive Explanations Inform Gender Choices by Translation Models
Janiça Hackenbuchner
Ghent University
  ViKhoMT: A Vietnamese–K'Ho Neural Machine Translation Dataset and Evaluation for Community Health Communication
Tram Truong1, Vinh Nguyen2, Dang Thin1, Ngan Nguyen3
1University of Information Technology, Vietnam National University Ho Chi Minh City, 2None, 3University of Information Technology, Vietnam National University Hochiminh City
  Hindsight Quality Prediction Experiments in Multi-Candidate Human-Post-Edited Machine Translation
Malik Marmonier, Benoît Sagot, Rachel Bawden
Inria
  PETra: A Multilingual Corpus of Pragmatic Explicitation in Translation
Doreen Osmelak1, Koel Dutta Chowdhury2, Uliana Sentsova1, Cristina España-Bonet3, Josef van Genabith4
1Saarland University, 2Saarland Informatics Campus, Saarland University, 3BSC/DFKI GmbH, 4DFKI
  A Dataset for Probing Translationese Preferences in English-to-Swedish Translation
Jenny Kunz1, Anja Jarochenko2, Marcel Bollmann2
1Linköping University, 2Linköping University
  STAR-IL: A Dataset for Style-Aware Machine Translation of Product Reviews in Indian Languages
Ketaki Shetye1, Dipti Sharma2, Parameswari Krishnamurthy3
1International Institute of Information Technology, 2IIIT, Hyderabad, 3Assistant Professor, IIIT Hyderabad
  Cultural and Knowledge Biases in LLMs through the Lens of Entity-Aware Machine Translation
Lu Xu, Luca Moroni, Roberto Navigli
Sapienza University of Rome
  Referenceless Evaluation of Machine Translation Models by Ranking Performance in Romanian to English Translate-train Settings
Mihail Feraru, Alexandra Diaconu, Bogdan Alexe
University of Bucharest
  Every Word Presented in Context: Syntactic Coverage as Objective for Low-Resource Machine Translation with Large Language Models
Samuel Frontull and Thomas Ströhle
University of Innsbruck
  Multilingual KokoroChat: A Multi-LLM Ensemble Translation Method for Creating a Multilingual Counseling Dialogue Dataset
Ryoma Suzuki, Zhiyang Qi, Michimasa Inaba
The University of Electro-Communications
  NepTam: A Nepali-Tamang Parallel Corpus and Baseline Machine Translation Experiments
Rupak Raj Ghimire1, Bipesh Subedi2, Balaram Prasain3, Prakash Poudyal1, Praveen Acharya4, Nischal Karki1, Rupak Tiwari1, Rishikesh Sharma1, Jenny Poudel1, Bal Krishna Bal5
1Kathmandu University, 2Department of Computer Science and Engineering, Kathmandu University, 3Central Department of Linguistics, Tribhuvan University, 4Dublin City University, 5Department of Computer Science and Engineering, Kathmandu University, Nepal
  Scoring the Translation: On Target Automatic Keyword-Based Evaluation of Machine Translation in the Sports Domain
Steinthor Steingrimsson1 and Einar Sigurdsson2
1The Arni Magnusson Institute for Icelandic Studies, 2University of Pennsylvania
  Towards Improving Multimodal Machine Translation with LLMs: A Focus on Indic Languages
Amulya Ratna Dash1, Chirag Wadhwa2, Yashvardhan Sharma3
1Birla Institute of Technology & Science, Pilani, 2Birla Institute of Technology and Science, Pilani, Pilani campus, 3Birla Institute of Technology and Science
09:00 - 10:40    Session P8.2: Multilinguality and Translation Aids - Poster Area
  Parallel Sentence Filtering for Low-Resource Language Pairs: A Case Study for Upper Sorbian, German, and Czech
Ruiyang Jiang1, Shu Okabe2, Alexander Fraser3
1Technical University of Munich, 2TUM Heilbronn, 3Ludwig-Maximilians-Universität München
  OpenSubtitles2024: A Massively Parallel Dataset of Movie Subtitles for MT Development and Evaluation
Joerg Tiedemann and Hengyu Luo
University of Helsinki
  CREST: Universal Safety Guardrails through Cluster-Guided Cross-Lingual Transfer
Lavish Bansal and Naman Mishra
Repello AI
  Semantic Alignment across Ancient Egyptian Language Stages via Normalization-Aware Multitask Learning
He Huang
Ludwig Maximilian University of Munich
  Conditioning LLMs to Generate Code-Switched Text
Maite Heredia1, Gorka Labaka2, Jeremy Barnes3, Aitor Soroa4
1HiTZ Basque Center for Language Technology - Ixa NLP Group, University of the Basque Country UPV/EHU, 2HiTZ Center - Ixa, University of the Basque Country (UPV/EHU), 3University of the Basque Country EHU/UPV, 4HiTZ Center - Ixa, University of the Basque Country UPV/EHU
  Are the LLMs Capable of Maintaining at Least the Language Genus?
Sandra Mitrovic1, David Kletz2, Ljiljana Dolamic3, Fabio Rinaldi4
1SUPSI - IDSIA, 2Supsi, IDSIA, 3armasuisse S&T, 4IDSIA, Swiss AI Institute
  Gender Bias in MT for a Genderless Language: New Benchmarks for Basque
Amaia Murillo1, Olatz Perez-de-Viñaspre2, Naiara Perez3
1HiTZ Center, University of the Basque Country UPV/EHU, 2HiTZ Center - Ixa, University of the Basque Country UPV/EHU, 3University of the Basque Country
  Optimizing Multilingual LLMs via Federated Learning: A Study of Client Language Composition
Aleix Sant1, Jordi Luque2, Carlos Escolano3
1Telefónica Innovación Digital, 2Telefonica Research, 3Universitat Politècnica de Catalunya, Barcelona Supercomputing Center
  Multilingual Target-Stance Extraction
Ethan Mines1 and Bonnie Dorr2
1The University of Florida, 2University of Florida
  MUNIChus: MUltilingual News Image Captioning Benchmark
Yuji Chen1, Alistair Plum2, Hansi Hettiarachchi1, Diptesh Kanojia3, Saroj Basnet4, Marcos Zampieri4, Tharindu Ranasinghe1
1Lancaster University, 2University of Luxembourg, 3University of Surrey, 4George Mason University
  GlossMATE: Multi-Agent Translator Explanations for Glosses
Changbing Yang1, Patrick Littell2, Gabriel Bernier-Colborne3, Yanfei Lu4, Mengzhe Geng3
1University of British Columbia, 2National Research Council of Canada, 3National Research Council Canada, 4University of Toronto
  Diagnosing Translated Benchmarks: An Automated Quality Assurance Study of the EU20 Benchmark Suite
Klaudia Thellmann, Bernhard Stadler, Michael Färber
TU Dresden
  Resource-Lean Lexicon Induction for German Dialects
Robert Litschko1, Barbara Plank1, Diego Frassinelli2
1LMU Munich, 2CIS, LMU Munich
09:00 - 10:40    Session P8.3: Multimodality - Poster Area
  FENCE: A Financial and Multimodal Jailbreak Detection Dataset
Mirae Kim, Seonghun Jeong, Youngjun Kwak
Kakaobank
  Evaluating Multimodal Large Language Models on Vertically Written Japanese Text
Keito Sasagawa1, Shuhei Kurita2, Daisuke Kawahara1
1Waseda University, 2National Institute of Informatics
  ProMQA-Assembly: Multimodal Procedural QA Dataset on Assembly
Kimihiro Hasegawa1, Wiradee Imrattanatrai2, Masaki Asada2, Susan Holm1, Yuran Wang1, Xuanang Zhou3, Ken Fukuda4, Teruko Mitamura1
1Carnegie Mellon University, 2National Institute of Advanced Industrial Science and Technology, 3CMU, 4AIRC/AIST
  K-MIND: Korean Multimodal INteraction Data for Dyadic Conversation Analysis
Jae Hee Yang1, Yuha Shin2, Saim Shin1, Je Woo Kim1, Jin Yea Jang1
1Korea Electronics Technology Institute, 2MaumAI
  Do Multimodal LLMs Understand Order? Measuring the Fragility of Multimodal Reasoning under Input Order Perturbations
Sheng-Lun Wei1, Yu-Ling Liao2, Hen-Hsen Huang3, Hsin-Hsi Chen1
1National Taiwan University, 2National Taiwan University, Taiwan, 3Institute of Information Science, Academia Sinica
  Early Fusion with Contrastive Learning: A Lightweight Alternative for Multi-modal Classification
Felix Wernlein1, Abhik Jana2, Sandipan Sikdar1
1Leibniz University Hannover, 2IIT Bhubaneswar
  Multimodal Entrainment and Feedback in Online Group Meetings
Patrizia Paggio1, Manex Agirrezabal1, Giulia Di Cristina2, Bart Jongejan1, Costanza Navarretta1
1University of Copenhagen, 2University of Turin
  MMCIG: Multimodal Cover Image Generation for Text-only Documents and Its Dataset Construction via Pseudo-labeling
Hyeyeon Kim1, Sungwoo Han2, Jingun Kwon3, Hidetaka Kamigaito4, Manabu Okumura5
1Department of Artificial Intelligence, Chungnam National University, 2Chungnam National University, Department of Artificial Intelligence, GILAB, 3Chungnam National University, 4Nara Institute of Science and Technology, 5Tokyo Institute of Technology
  Multimodal Reference by Means of the Pronoun We and Hand Gestures in a Novel Corpus of Parliamentary Opening Debates
Costanza Navarretta
University of Copenhagen
  Multimodal Large Language Models for Low-Resource Languages: A Case Study for Basque
Lukas Arana1, Julen Etxaniz1, Ander Salaberria1, Gorka Azkune2
1HiTZ Center - Ixa, University of the Basque Country UPV/EHU, 2University of Basque Country
  Real-Time Generation of Game Video Commentary with Multimodal LLMs: Pause-Aware Decoding Approaches
Anum Afzal1, Yuki Saito2, Hiroya Takamura3, Katsuhito Sudoh4, Shinnosuke Takamichi5, Graham Neubig6, Florian Matthes7, Tatsuya Ishigaki8
1Technical University of Munich, 2The University of Tokyo, 3The National Institute of Advanced Industrial Science and Technology (AIST), 4Nara Women's University, 5Keio University, 6Carnegie Mellon University, 7Technische Universität München, 8National Institute of Advanced Industrial Science and Technology (AIST)
  ARB: A Comprehensive Arabic Multimodal Reasoning Benchmark
Sara Ghaboura1, Shubham Patle1, Ketan More1, Wafa Alghallabi1, Omkar Thawakar1, Jorma Laaksonen2, Hisham Cholakkal1, Salman Khan1, Rao Anwer1
1Mohamed bin Zayed University of AI, 2Aalto University
  Event Chronography in Multi-modal Data: The BME Method for Quantitative Analyses
Anaïs Murat, Maria Koutsombogera, Carl Vogel
Trinity College Dublin
  CANVAS: A Multimodal Dataset of Chinese Textbook Images for Bias and Representation Analysis
Haotian Zhu, Kefan Yu, Min Li
University of Washington
  MM-Conv: A Multimodal Dataset and Benchmark for Context-Aware Grounding in 3D Dialogue
Anna Deichler1, Jim O'Regan1, Fethiye Irmak Dogan1, Anna Klezovich1, Lubos Marcinek1, Iolanda Leite1, Jonas Beskow2
1KTH Royal Institute of Technology, 2KTH Speech, Music and Hearing
  Erase Persona, Forget Lore: Benchmarking Multimodal Copyright Unlearning in Large Vision Language Models
June Hyoung Kwon, Jungmin Yun, Youngbin Kim
Chung-Ang University
  DREAM: A Multicultural Multimodal Dataset Linking Dialogues and Realistic Image Sequences
Juan Mallo1, Marcos Estecha-Garitagoitia1, Ricardo Cordoba2, Luis Fernando D'Haro3
1Universidad Politécnica de Madrid, 2Speech Technology Group, Dept. of Electronic Engineering, Universidad Politécnica de Madrid, 3Speech Technology and Machine Learning Group, E.T.S.I. Telecomunicación, Universidad Politécnica de Madrid
  Multimodal Task Interference: A Benchmark and Analysis of History-Target Mismatch in Multimodal LLMs
Masayuki Kawarada1, Tatsuya Ishigaki2, Hiroya Takamura3
1CyberAgent/National Institute of Advanced Industrial Science and Technology, 2National Institute of Advanced Industrial Science and Technology (AIST), 3The National Institute of Advanced Industrial Science and Technology (AIST)
09:00 - 10:40    Session P8.4: Cross-modality - Poster Area
  Can Video LLMs See through Illusions? Benchmark Dataset and Comprehensive Analysis
Souto Ohira1, Tosho Hirasawa2, Mamoru Komachi1
1Hitotsubashi University, 2OMRON SINIC X Corporation
  To Skip, to Swap or to Not Swap? Identifying Step Transition Types in Instructional Manuals
Hsiu-Yu Yang1, Michael Roth2, Andreas Bulling3, Carina Silberer3
1Institute for Natural Language Processing, Stuttgart University, 2University of Technology Nuremberg, 3University of Stuttgart
  Fruitcakes and Cupcakes Emerging from Noise: The ComposiGen Dataset of Compounds and Their Compositionality
Jule Godbersen1, Sinan Kurtyigit2, Emma Raimundo Schulz3, Tonmoy Rakshit3, Diego Frassinelli4, Sabine Schulte im Walde3, Carina Silberer3
1Saarland University, 2Technical University of Munich, 3University of Stuttgart, 4CIS, LMU Munich
  Large Language Models' Internal Perception of Symbolic Music
Andrew Shin and Kunitake Kaneko
Keio University
  Entity Image and Mixed-Modal Image Retrieval Datasets
Cristian-Ioan Blaga1, Paul Suganthan G C1, Sahil Dua1, Krishna Srinivasan2, Enrique Alfonseca2, Peter Dornbach1, Tom Duerig1, Imed Zitouni2, Zhe Dong3
1Google, 2, 3Microsoft
  Generating Sign Language Poses from HamNoSys and Natural Language Descriptions
Santiago Máximo1 and Luis Chiruzzo2
1Universidad de la República, 2Universidad de la República
  Evaluating Discriminability of Vision-Language Models
Masayasu Muraoka1 and Naoaki Okazaki2
1IBM Research - Tokyo, 2Institute of Science Tokyo
  Seeing the Other Side: Diagnostic Tasks for Viewpoint Reasoning in Vision–Language Models
Makoto Takenaka1 and Hitomi Yanaka2
1Mitsubishi Electric, 2The University of Tokyo
  Multi-modal, Multi-task, Multi-criteria Automatic Evaluation with Vision Language Models
Masanari Oi1, Masahiro Kaneko2, Naoaki Okazaki1, Nakamasa Inoue1
1Institute of Science Tokyo, 2MBZUAI
  Challenges in Image-Caption Association in Portuguese: Evaluating the CLIP Model on the FM30K Dataset
Vitória Colonetti Benedet, Gustavo Lopes Tamiosso, Rafael Oleques Nunes, Dennis Giovani Balreira
UFRGS
  A Large-Scale Instruction-Tuning Dataset and Models for Slovenian Vision-Language Tasks
Matej Martinc1 and Domen Vreš2
1Jozef Stefan Institute, 2Univerza v Ljubljani
  A Parallel Cross-Lingual Benchmark for Multimodal Idiomaticity Understanding
Dilara Torunoglu-Selamet1, Dogukan Arslan1, Rodrigo Wilkens2, Wei He2, Doruk Eryigit3, Thomas Pickard4, Adriana Pagano5, Aline Villavicencio6, Gülsen Eryigit1, Ágnes Abuczki7, Aida Cardoso8, Alesia Lazarenka9, Dina Almassova10, Amalia Mendes11, Anna Kanellopoulou12, Antoni Brosa-Rodriguez13, Baiba Valkovska14, Beata Wojtowicz15, Bolette Pedersen16, Carlos Manuel Hidalgo-Ternero17, Chaya Liebeskind18, Danka Jokic19, Diego Alves20, Eleni Triantafyllidi12, Erik Velldal21, Fred Philippy22, Giedre Valunaite Oleskeviciene23, Ieva Rizgeliene24, Inguna Skadina25, Irina Lobzhanidze26, Isabell Haugen27, Jauza Akbar Krito28, Jelena Markovic29, Johanna Monti30, Josue Sauca31, Kaja Dobrovoljc32, Kingsley Ugwuanyi33, Laura Rituma34, Lilja Øvrelid35, Maha Tufail Agro36, Manzura Abjalova37, Maria Chatzigrigoriou38, María del Mar Sánchez Ramos39, Marija Pendevska40, Masoumeh Seyyedrezaei41, Mehrnoush Shamsfard42, Momina Ahsan43, Muhammad Ahsan Khan44, Nathalie Norman16, Nilay Erdem Ayyildiz45, Nina Hosseini-Kivanani46, Noémi Ligeti-Nagy47, Numaan Naeem43, Olha Kanishcheva48, Olha Yatsyshyna49, Daniil Orel43, Petra Giommarelli50, Petya Osenova51, Radovan Garabik52, Regina Semou53, Rozane Rebechi54, Salsabila Zahirah Pranida43, Samia Touileb27, Sanni Nimb55, Sarfraz Ahmad44, Sarvinoz Sharipova56, Shahar Golan57, Shaoxiong Ji58, Sopuruchi Aboh59, Srdjan Sucur29, Stella Markantonatou60, Sussi Olsen61, Vahide Tajalli42, Veronika Lipp47, Voula Giouli62, Yelda Yesildal Eraydin63, Zahra Saaberi64, Zhuohan Xie43
1Istanbul Technical University, 2University of Exeter, 3Istanbul Technical University NLP Group, 4University of Sheffield, 5Federal University of Minas Gerais, 6University of Exeter, UK, 7Károli Gáspár University of the Reformed Church in Hungary, 8Centro de Linguística da Universidade Nova de Lisboa, 9Tesi srl, 10Nazarbayev University, 11University of Lisbon - Centre of Linguistics, School of Arts and Humanities, 12Aristotle University of Thessaloniki, 13Universitat Rovira i Virgili, 14IMCS, University of Latvia, 15University of Warsaw, 16University of Copenhagen, 17Researcher, 18Jerusalem College of Technology, Lev Academic Center, 19University of Belgrade, 20Saarland University, 21University of Oslo, 22University of Luxembourg, 23Mykolas Romeris University, 24Vilnius University Institute of Data Science and Digital Technologies, 25Tilde / Institute of Mathematics and Computer Science, University of Latvia, 26Ilia State University, 27University of Bergen, 28Universitas Gadjah Mada, 29University of East Sarajevo, 30"L'Orientale" University of Naples, 31International University of Valencia, 32University of Ljubljana, 33SOAS University of London, 34Institute of Mathematics and Computer Science, University of Latvia, 35Dept of Informatics, University of Oslo, 36Mohamed bin Zayed University of Artificial Intelligence, 37Alisher Navo'i Tashkent State Uzbek Language and Literature, 38National and Kapodistrian University of Athens, 39University of Alcalá, 40St. Cyrillus and Methodius University, 41Istinye University, 42Faculty of Computer Science and Engineering, Shahid Beheshti University, 43MBZUAI, 44Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), 45Assoc. Prof., 46RTL & University of Luxembourg, 47ELTE Research Centre for Linguistics, 48Heidelberg University, 49Ternopil Volodymyr Hnatiuk National Pedagogical University, 50University of Pisa, 51Sofia University "St. Kl. Ohridski" and IICT-BAS, 52L. Stur Institute of Linguistics, Slovak Academy of Sciences, 53NKUA, 54Universidade Federal do Rio Grande do Sul, 55Society for Danish Language and Literature (DSL), 56Samarkand State Institute of Foreign Languages, 57Jerusalem College of Technology, 58University of Turku and ELLIS Institute Finland, 59English and Communication, The Hong Kong Polytechnic University, 60ILSP/R.C. "Athena", 61UCPH, NorS, Centre for Language Technology, 62Aristotle University of Thessaloniki / ILSP, ATHENA RC, 63Dr., 64NLP Lab, Shahid Beheshti University, Tehran, Iran
  Which Way Does Time Flow? A Psychophysics-Grounded Evaluation for Vision–Language Models
Shiho Matta1, Lis Kanashiro Pereira2, Peitao Han3, Shigeru Kitazawa3, Fei Cheng1
1Kyoto University, 2NICT, 3The University of Osaka
  I Came, I Saw, I Explained: Benchmarking Multimodal LLMs on Figurative Meaning in Memes
Shijia Zhou1, Saif Mohammad2, Barbara Plank3, Diego Frassinelli4
1Ludwig Maximilian University of Munich, 2National Research Council Canada, 3LMU Munich, 4CIS, LMU Munich
  DEJIMA: A Novel Large-scale Japanese Dataset for Image Captioning and Visual Question Answering
Toshiki Katsube1, Fukuhara Taiga1, Kenichiro Ando2, Yusuke Mukuta1, Kohei Uehara1, Tatsuya Harada1
1The University of Tokyo, 2RIKEN
  CLEVR-3D-DeRef
Mary Martin, Martha Palmer, Maria Pacheco
University of Colorado Boulder
09:00 - 10:40    Session P8.5: Sign Languages - Poster Area
  Bridging Text-to-Sign Translation via Codebook-Oriented Pretraining
Ninlawat Phuangchoke and Chantri Polprasert
Asian Institute of Technology (AIT)
  A Resource and Evaluation Method for Phonological Continuity in Japanese Sign Language
Jundai Inoue1, Daisuke Hara2, Makoto Miwa2
1Knowledge and Data Engineering Lab, Toyota Technological Institute at Japan, 2Toyota Technological Institute
  Sentiment Analysis of German Sign Language Fairy Tales
Fabrizio Nunnari1, Siddhant Jain1, Patrick Gebhard2
1German Research Center for Artificial Intelligence (DFKI), 2DFKI
  A Critical Study of Automatic Evaluation in Sign Language Translation
Shakib Yazdani1, Yasser Hamidullah2, Cristina España-Bonet3, Eleftherios Avramidis4, Josef van Genabith2
1German Research Center for Artificial Intelligence (DFKI), 2DFKI, 3BSC/DFKI GmbH, 4Alangu; German Research Center for Artificial Intelligence (DFKI)
  How Much Data Is Enough Data? A New Motion Capture Corpus for Probabilistic Sign Language Generation
Anna Klezovich1, Johanna Mesch2, Gustav Eje Henter3, Jonas Beskow4
1Division of Speech, Music and Hearing, KTH, 2Stockholm University, 3KTH Royal Institute of Technology, 4KTH Speech, Music and Hearing
  Decomposing Sign Language Movements: A Multi-Band Visualization Method for Articulatory Analysis
Antonio F. G. Sevilla and José María Lahoz-Bengoechea
Universidad Complutense de Madrid
10:40 - 11:00    Coffee Break
11:00 - 12:40    Session O33: Psycholinguistics, Cognitive Linguistics and Linguistic Theories - Room 1
11:00 - 11:20  Implicit Bias in Peer Review: Through the Lens of Language Abstraction
Xulang Zhang, Rui Mao, Erik Cambria
Nanyang Technological University
11:20 - 11:40  The PARLO Dementia Corpus: A German Multi-Center Resource for Alzheimer's Disease
Franziska Braun1, Christopher Witzl2, Florian Hönig3, Elmar Nöth4, Tobias Bocklet2, Korbinian Riedhammer5
1Technische Hochschule Nürnberg Georg Simon Ohm, 2Technische Hochschule Nürnberg, 3KST Institut GmbH, Bad Emstal, 4Friedrich-Alexander-University Erlangen-Nuremberg, 5Technische Hochschule Nuernberg Georg Simon Ohm
11:40 - 12:00  Lexical and Discourse Semantics in a Reading-time Corpus of English
Jakub Dotlacil1, Laia Fortuny1, Li Kloostra1, Johan Bos2
1Utrecht University, 2University of Groningen
12:00 - 12:20  Semantic Capacity in Language Learners and LLMs: A Case Study of Quantifier Scope
Shaohua Fang, Yue Li, Yan Cong
Purdue University
11:00 - 12:40    Session O34: Opinion and Argument Mining - Room 2
11:00 - 11:20  Disambiguation of Emotion Annotations by Contextualizing Events in Plausible Narratives
Johannes Schaefer1 and Roman Klinger2
1Fundamentals of Natural Language Processing, 2University of Bamberg
11:20 - 11:40  Identifying Contexts of Distress in College Students' Reddit Posts: A Comparative Study of Classical NLP and Large Language Models
Carine Graff and Nikhil Krishnaswamy
Colorado State University
11:40 - 12:00  TiC-MuFormer: Time-Aware Caption-Integrated Multimodal Transformers for User-Level Mental Health Modeling
Georgios Tsoumplekas, Yannis Spyridis, Vasileios Argyriou
Kingston University
12:00 - 12:20  Improving Neural Argumentative Stance Classification in Controversial Topics with Emotion-Lexicon Features
Mohammad Yeghaneh Abkenar1, Weixing Wang2, Manfred Stede1, Mark Finlayson3, Davide Picca4, Panagiotis Ioannidis5
1University of Potsdam, 2Hasso Plattner Institute, 3FIU, 4University of Lausanne, 5PI Squared Insights
12:20 - 12:40  Emotion Transcription in Conversation: A Benchmark for Capturing Subtle and Complex Emotional States through Natural Language
Yoshiki Tanaka1, Ryuichi Uehara1, Koji Inoue2, Michimasa Inaba1
1The University of Electro-Communications, 2Kyoto University
11:00 - 12:40    Session O35: Parsing - Room 3
11:00 - 11:20  SETUP: Sentence-level English-To-Uniform Meaning Representation Parser
Emma Markle, Javier Gutierrez Bach, Shira Wein
Amherst College
11:20 - 11:40  This One or That One? A Study on Accessibility via Demonstratives with Multimodal Large Language Models
Yu Wang1, Emmanuele Chersoni2, Chu-Ren Huang3
1The Hong Kong Polytechnic University, 2Hong Kong Polytechnic University, 3The Hong Kong Polytechnic University
11:40 - 12:00  AMR Parsing beyond English: An Experiment on Bulgarian, French, Hungarian and Ukrainian
Ivaylo Mitov1, Tadzhat Marharian1, Zsofia Hauk1, Samba FALL1, Maxime Amblard2, Bruno Guillaume3
1Institut des sciences du Digital, Management & Cognition, 2Université de Lorraine, 3LORIA / Inria Nancy Grand-Est
12:00 - 12:20  Semantic Parsing for Evaluating Large Language Models: Separating Linguistic Abilities with YARN
Rémi DE VERGNETTE1 and Maxime Amblard2
1Université de Lorraine, CNRS, Inria, LORIA, F-53999 Nancy, France, 2Université de Lorraine
12:20 - 12:40  Two Ojibwe Constraint Grammars: Morphological Disambiguation and Dependency Parsing
Matthias Diederichsen and Christopher Hammerly
University of British Columbia
11:00 - 12:40    Session O36: Multimodality and Speech - Room 4
11:00 - 11:20  Multimodal LLMs Do Not Compose Skills Optimally across Modalities
Paula Ontalvilla1, Aitor Ormazabal2, Gorka Azkune3
1HiTZ Center - Ixa, University of the Basque Country (UPV/EHU), 2University of the Basque Country, 3University of Basque Country
11:20 - 11:40  Code-Switching in End-to-End Automatic Speech Recognition: A Systematic Literature Review
Maha Tufail Agro1, Atharva Kulkarni2, Karima Kadaoui1, Zeerak Talat3, Hanan Aldarmaki2
1Mohamed bin Zayed University of Artificial Intelligence, 2MBZUAI, 3University of Edinburgh
11:40 - 12:00  MUStReason: A Benchmark for Diagnosing Pragmatic Reasoning in VideoLMs for Multimodal Sarcasm Detection
Anisha Saha1, Varsha Suresh2, Timothy Hospedales3, Vera Demberg2
1Max Planck Institute for Informatics, Saarland Informatics Campus, 2Saarland University, 3University of Edinburgh
12:00 - 12:20  Human-Centered Multimodal Fusion for Sexism Detection in Memes with Eye-Tracking, Heart Rate, and EEG Signals
Iván Arcos Gabaldón, Paolo Rosso, Elena Gomis Vicent
Universitat Politècnica de València, UPV
12:20 - 12:40  Nos_Brais-GL: A FAIR Galician TTS Corpus for Neural Speech Synthesis
Adina Vladu1, Antonio Moscoso Sánchez2, Carmen Magariños3, María Perez Lago1, Elisa Fernández Rei1
1Instituto da Lingua Galega, Universidade de Santiago de Compostela, 2Instituto da Lingua Galega, Centro Singular en Tecnoloxías Intelixentes, Universidade de Santiago de Compostela, 3Instituto da Lingua Galega, Departamento de Electrónica e Computación, Universidade de Santiago de Compostela
11:00 - 12:40    Session P9.1: Natural Language Generation - Poster Area
  DR-CUP: A Dataset on Real-time Commentary in U.S. Presidential Debates
Yu-Yu Chang1, Huan-Wen Ho1, Chung-Chi Chen2, Ming-Hung Wang3
1National Chung Cheng University, 2National Institute of Advanced Industrial Science and Technology, 3National Chung Cheng University
  Russian Generative Spelling, Punctuation and Capitalization Correction
Nikita Martynov1, Danil Astafurov2, Ulyana Isaeva1, Ivan Maksimov3, Joqsan Azocar4, Dmitrii Kosenko4, Alena Fenogenova5
1SaluteDevices, 2ITMO University, 3Moscow Institute of Physics and Technology, 4MIPT, 5SberAI
  Using Multimodal and Language-Agnostic Sentence Embeddings for Abstractive Summarization
Chaimae Chellaf El Hammoud1, Salima Mdhaffar2, Yannick Estève3, Stéphane Huet4
1Avignon, 2Avignon university, 3LIA - Avignon Université, 4Université d'Avignon
  Gradient-Controlled Decoding: A Safety Guardrail for LLMs with Dual-Anchor Steering
Purva Chiniya1, Kevin Scaria2, Sagar Chaturvedi1
1Amazon, 2Amazon.com
  The Chronicles of RiDiC: Generating Datasets with Controlled Popularity Distribution for Long-form Factuality Evaluation
Pavel Braslavski1, Dmitrii Iarosh2, Nikita Sushko3, Andrey Sakhovskiy4, Vasily Konovalov5, Elena Tutubalina6, Alexander Panchenko7
1HSE University, 2Skolkovo Institute of Science and Technology, Russia, 3Independent Researcher, 4Sber AI, Russia; Skoltech, Russia, 5Affiliation, 6HSE University, Russia and Kazan Federal University, Russia and AIRI, Russia and Insilico Medicine Hong Kong, Hong Kong, 7S-NLP
  MeteoGalEus: An Iberian Multilingual Weather Dataset in Galician, Euskera, and Spanish
Ainhoa Vivel-Couso1, Nella Zabrina Pramata2, David Robredo3, Aitor Soroa4, Jose Maria Alonso-Moral1
1University of Santiago de Compostela, 2University of Basque Country, 3Universidade de Santiago de Compostela, 4HiTZ Center - Ixa, University of the Basque Country UPV/EHU
  RadTimeline: Timeline Summarization for Longitudinal Radiological Lung Findings
Sitong Zhou, Meliha Yetisgen, Mari Ostendorf
University of Washington
  InstructSum: A Benchmark to Evaluate Instruction-Following Capability of Large Language Models in Summarization
Kosuke Nishida1, Kyosuke Nishida2, Itsumi Saito3
1NTT, 2NTT Human Informatics Laboratories, 3Tohoku University
  NOVELSUM: Evaluating Long-Form Summary Generation for Historical Scandinavian Novels
Ali Al-Laith, Alexander Conroy, Kirstine Degn, Jens Bjerring-Hansen, Daniel Hershcovich
University of Copenhagen
  Evaluating Large Language Models for Text-to-Gloss Translation in Kazakh-Russian Sign Language: A Pilot Study
Zhanibek Kozhirbayev1 and Alfarabi Imashev2
1National Laboratory Astana, Nazarbayev University, 2Nazarbayev University
  HotelCheckSpan: A Benchmark Dataset for LLM Faithfulness
Patricia Schmidtova1, Ondrej Dusek1, Saad Mahamood2
1Charles University, 2Shopware
11:00 - 12:40    Session P9.2.1: Machine Learning II - Poster Area
  Procrustes Analysis for Improving Language Model Merging
Olivier Ferret
CEA List
  MetaCORA: A Meta-Learned Curriculum for Adversarial and Contrastive Robustness in Speech Recognition
Yuqian Dai, Chun Fai Chan, Ying Ki Wong, Tsz Ho Pun
Logistics and Supply Chain MultiTech R&D Centre Limited
  Insights from Transfer Learning Experiments with Word-in-Context and Word Sense Disambiguation Models
Alp Mujko and Dominik Schlechtweg
University of Stuttgart
  Joint Identification and Induction of Semantic Frames with Scalable Semi-Supervised Graph Clustering
Fabian Barteld1, Steffen Remus2, Saba Anwar2, Julian Stawecki1, Alexander Ziem1, Chris Biemann2
1Heinrich Heine University Düsseldorf, 2Universität Hamburg
  Low-Rank Compression of Language Models via Differentiable Rank Selection
Sidhant Sundrani, Francesco Tudisco, Pasquale Minervini
University of Edinburgh
  Self-supervised Data Augmentation for Text Classification in Low-Data Settings
Deyu Ding1, Mengying Wang2, Andreas Spitz2
1Southern University of Science and Technology, 2University of Konstanz
  Distribution-aware Low-bitwidth Quantization for Large Language Models
Bao Huynh, Takashi Tsunakawa, Masafumi Nishida
Shizuoka University
  TG-ASR: Translation-Guided Learning with Parallel Gated Cross Attention for Low-Resource Automatic Speech Recognition
ChengYeh Yang1, Chien-Chun Wang1, Li-Wei Chen2, Hung-Shin Lee2, Hsin-Min Wang3, Berlin Chen1
1National Taiwan Normal University, 2United Link Co., Ltd., 3Institute of Information Science, Academia Sinica
  Harnessing Synergy in Context and Emoji for Joint Detection of Harmful Online Content in Multi-turn Conversations
Feiyan Hu, Ciara Byrne, Jiang Zhou, Rena Maycock, Mark Langan
Chirp
  Dynamic Layer Selection for Efficient Tone Recognition in Self-Supervised Speech Models
Saint Germes BENGONO OBIANG, Norbert TSOPZE, Paulin MELATAGIA YONTA
University of Yaounde 1
  Intent Recognition in Speech-to-Text Processing in the Context of Natural Interaction with Cognitive Assistive Systems
Behnam Ensan1, Magnus Jung1, Matthias Busch1, Andreas Wendemuth2
1doctoral candidate, 2Professor for Cognitive Systems, University Magdeburg
  Merging Continual Pretraining Models for Domain-Specialized LLMs: A Case Study in Finance
Kentaro Ueda1, François Portet2, Hirohiko Suwa1, Keiichi Yasumoto1
1Nara Institute of Science and Technology, 2Université Grenoble Alpes
  Phonetic-based Ranking for Improved Pseudo-Labeling in Low-Resource ASR
Marco Matassoni1, Roberto Gretter1, Falavigna Daniele1, Mohamed Nabih Ali Mohamed Nawar1, Alessio Brutti1, Matteo Negri1, Mauro Cettolo1, Marco Gaido2, Sara Papi1, Luisa Bentivogli1
1Fondazione Bruno Kessler, 2Fondazione Bruno Kessler, University of Trento
  Privacy-Preserving Information Extraction with Local LLMs: A Comparative Study on Dutch Debt Collection Letters
Beyza Celep, Natalia Amat-Lefort, Joost Visser
Leiden University
11:00 - 12:40    Session P9.2.2: Machine Learning III - Poster Area
  Forewarned Is Forearmed: When Non-Sequential Embedding Turns into an Anomaly Detector
Elys Allesiardo, Antoine Caubrière, Valentin Vielzeuf
Orange Research
  A Joint Detection Framework for Latvian Loanwords and Calques Using Monolingual Data
Yelingyun Zhang, Atis Kapenieks, Marina Platonova
Riga Technical University
  Pantagruel: Unified Self-Supervised Encoders for French Text and Speech
Phuong-Hang Le1, Valentin Pelloin2, Arnault Chatelain3, Maryem Bouziane4, Mohammed Ghennai5, Qianwen Guan6, Kirill Milintsevich7, Salima Mdhaffar8, Aidan Mannion9, Nils Defauw10, Shuyue Gu6, Alexandre Audibert11, Marco Dinarelli12, Yannick Estève13, Lorraine Goeuriot9, Steffen Lalande7, Nicolas Hervé2, Maximin Coavoux14, François Portet15, Étienne Ollion16, Marie Candito17, Maxime Peyrard5, Solange Rossato12, Benjamin Lecouteux18, Aurélie Nardy19, Gilles Sérasset11, Vincent Segonne20, Solène Evain5, Diandra Fabre5, Didier Schwab21
1Saclay AI, 2INA, 3CREST (Ecole Polytechnique, ENSAE, CNRS), 4Avignon Université, LIA, 5Univ. Grenoble Alpes, CNRS, Grenoble INP, LIG, 6Université Paris Cité, 7Institut national de l'audiovisuel, 8Avignon university, 9LIG, Université Grenoble Alpes, 10Univ. Grenoble Alpes, CNRS, Grenoble INP, 11Université Grenoble Alpes, 12LIG, 13LIA - Avignon Université, 14CNRS, Univ Grenoble Alpes, 15Univ Grenoble Alpes, Laboratoire d'Informatique de Grenoble, 16CNRS-CREST, 17LLF, Université Paris Cité, 18LIG/GETALP, 19Lidilem, 20IRISA - Université Bretagne Sud, 21Univ. Grenoble Alpes
  Merge and Conquer: Instructing Multilingual Models by Adding Target Language Weights
Eneko Valero1, Maria Ribalta i Albado1, Oscar Sainz1, Naiara Perez2, German Rigau3
1University of the Basque Country (UPV/EHU), 2University of the Basque Country, 3UPV/EHU
  SemiAdapt: Semi-Supervised and Efficient LoRA-Based Domain Adaptation for Low-Resource Irish Machine Translation with Transformers
Josh Mcgiff and Nikola Nikolov
University of Limerick
  Data Selection Effects on Self-Supervised Learning of Audio Representations for French Audiovisual Broadcasts
Valentin Pelloin1, Lina Bekkali2, Reda Dehak3, David Doukhan4
1INA, 2École nationale des ponts et chaussées (ENPC), 3EPITA, 4Institut national de l'audiovisuel (Ina)
  SENS-ASR: Semantic Embedding Injection in Neural-transducer for Streaming Automatic Speech Recognition
Youness Dkhissi1, Valentin Vielzeuf2, Elys Allesiardo1, Anthony Larcher3
1Orange Innovation, 2Orange Research, 3Université du Mans - LIUM
  Efficient Financial Language Understanding via Distillation with Synthetic Data
Wen-Fong (Xavier) Huang and Edwin Simpson
University of Bristol
  Rubric-Guided Fine-tuning of SpeechLLMs for Multi-Aspect, Multi-Rater L2 Reading-Speech Assessment
Aditya Kamlesh Parikh1, Cristian Tejedor-García2, Catia Cucchiarini3, Helmer Strik4
1Radboud University, 2CLST, Radboud University, 3Radboud University Nijmegen/Nederlandse Taalunie, 4Centre for Language and Speech Technology (CLST), Centre for Language Studies (CLS), Radboud University Nijmegen
  Leveraging Semi-Supervised Learning for Multimodal Hate Speech Data Annotation and Detection
Rathi Adarshi Rammohan1, Zhao Ren1, Dominik Puchala2, Aleksandra Swiderska2, Dennis Küster1, Tanja Schultz1
1University of Bremen, 2University of Warsaw
  Lexicalized Constituency Parsing for Middle Dutch: Low-resource Training and Cross-Domain Generalization
Yiming Liang1 and Fang Zhao2
1Universiteit Gent, 2Université Paris Cité & Laboratoire de Linguistique Formelle
  Quantifying the Accuracy and Cost Impact of Design Decisions in Budget-Constrained Agentic LLM Search
Kyle McCleary and James Ghawaly
Louisiana State University
  Reason-to-Learn (R2L): Multi-Agent Knowledge Distillation for Lightweight LLMs in Sentiment Analysis
Le-Huy Tu1, Quan Nguyen2, Vincent NGUYEN3, Johanna Bjorklund4, Xuan-Son Vu5
1DopikAI JSC., 2Umeå University, 3University of Orleans, INSA CVL, LIFO EA, France, 4Umea University, 5Lund University and DeepTensor AB
  PRiSM: Partial Ranking via Inter-layer Semantic Measurement for Efficient Fine-tuning of Language Models
Aldrin Biswas1, Md Fahim2, Md. Amin1, Amin Ali1, AKM Rahman1
1Center for Computational & Data Sciences, Independent University, Bangladesh, 2Center for Computational & Data Sciences at Independent University, Bangladesh (IUB)
11:00 - 12:40    Session P9.3.1: Language Modeling and LRs III - Poster Area
  Beyond Literal Meaning: How LLMs Interpret Yemeni Proverbs
Nasser Thmer1, Ali Al-Laith2, Muhammad Shoaib1
1UET LAHORE, 2University of Copenhagen
  SEFL: A Framework for Generating Synthetic Educational Assignment Feedback with LLM Agents
Mike Zhang1, Amalie Dilling2, Léon Gondelman2, Niels Lyngdorf2, Euan Lindsay2, Johannes Bjerva3
1University of Copenhagen, 2Aalborg University, 3Department of Computer Science, Aalborg University
  LGSE: Lexically Grounded Subword Embedding Initialization for Low-Resource Language Adaptation
Hailay Kidu Teklehaymanot1, Dren Fazlija2, Wolfgang Nejdl1
1L3S Research Center, 2L3S Research Center, Leibniz University Hannover
  A Cheap Lunch: Synthetic Annotation with Minimal Human Effort for Medical Text Mining
Shutao Chen and Piek Vossen
Vrije Universiteit Amsterdam
  Supervised Contrastive Fine-Tuning for Active Few-Shot Learning
Zirui Zhang, Lei Ge, Shengyu Qiao
Information Engineering University
  Simulating Student Interactions for Virtual Pretesting with In-Context Learning
Arthur Thuy1, Luca Benedetto2, Ekaterina Loginova3, Dries Benoit1
1Ghent University, 2University of Cambridge, Institut Polytechnique de Paris, 3Dedalus Healthcare
  An Exploration-Analysis-Disambiguation Reasoning Framework for Word Sense Disambiguation with Low-Parameter LLMs
Deshan Sumanathilaka, Nicholas Micallef, Julian Hough
Swansea University
  Building Effective Japanese Medical LLMs with an Open Recipe for Domain Adaptation through Continued Pre-training
Akiko Aizawa1, Yuki Arase2, Fei Cheng3, Jiahao Huang4, Zhiyi Huang2, Junfeng Jiang4, Teruhito Kanazawa1, Daisuke Kawahara5, Kazuma Kobayashi1, Takashi Kodama3, Sadao Kurohashi3, Yusuke Oda1, Yuma Tsuta1, Zhen Wan3, Zhishen Yang1, Rio Yokota2
1National Institute of Informatics, 2Institute of Science Tokyo, 3Kyoto University, 4University of Tokyo, 5Waseda University
  New Encoders for German Trained from Scratch: Comparing ModernGBERT with Converted LLM2Vec Models
Julia Wunderle1, Anton Ehrmanntraut2, Jan Pfister3, Fotis Jannidis2, Andreas Hotho4
1University of Wuerzburg, 2Universität Würzburg, 3Julius-Maximilians-Universität Würzburg (JMU), 4University of Würzburg
  Arabic ChartSumm: An English-to-Arabic Benchmark for Metadata-to-Text Summarization
Passant Elchafei1 and Amany Fashwan2
1Ulm University, Germany, 2Phonetics and Linguistics Department, Faculty of Arts, Alexandria University, Alexandria
  Introducing a Bangla Sentence – Gloss Pair Dataset for Bangla Sign Language Translation and Research
Neelavro Saha, Rafi Shahriyar, Nafis Roudra, Saadman Sakib, Annajiat Rasel
BRAC University
  Language Models as Semantic Augmenters for Sequential Recommenders
Mahsa Valizadeh, Xiangjue Dong, Rui Tuo, James Caverlee
Texas A&M University
  Efficient Adaptation of English Language Models for Morphologically Rich and Underrepresented Languages: The Case of Arabic
Ahmed Eldamaty1, Mohamed Abdelrahman2, Mohamed Elbehery1, Mariam Ashraf1, Radwa Elshawi2
1Giza Systems, 2University of Tartu
11:00 - 12:40    Session P9.3.2: Language Modeling and LRs IV - Poster Area
  GhostWriter: Hidden AI-Generated Texts over Multiple Languages, Domains and Generators
Manuel Schaaf1, Kevin Bönisch2, Alexander Mehler1
1Goethe-University Frankfurt am Main, 2Text Technology Lab, Goethe-University
  Using LLMs to Extract Instances of Schematic Constructions from Unannotated L2 Learner Corpora
Jelena Kallas1, Ahto Kiil2, Heete Sahkai1, Geda Paulsen3, Kertu Saul4
1Institute of the Estonian Language, 2University of Tartu, 3Institute of the Estonian Language, Uppsala University, 4Institute of the Estonian Language, University of Tartu
  Corruption-Based Data Augmentation for Arabic Essay Scoring: A Preliminary Study on the Organization Trait
May Bashendy and Tamer Elsayed
Qatar University
  Structured Prompting for Arabic Essay Proficiency: A Trait-Centric Evaluation Approach
Salim Al Mandhari1, Hieu Pham Dinh2, Mo El-Haj2, Paul Rayson1
1Lancaster University, 2VinUniversity
  ManufactuBERT: Efficient Continual Pretraining for Manufacturing
Robin Armingaud and Romaric Besancon
CEA LIST
  Smigiel Dataset: Laying Foundations for Investigating Machine-Generated Text Detection in Polish
Jakub Strebeyko1, Alina Wróblewska2, Piotr Przybyla3
1University of Warsaw, Warsaw, Poland, 2Institute of Computer Science, Polish Academy of Sciences, 3Universitat Pompeu Fabra
  Extracting Medical Image-Related Entities from Spanish Electronic Health Records Using NER Methods
Alexander Platas1, Marcos Merino1, Elena Zotova1, Montse Cuadros1, Karen López-Linares1, Mikel Pérez de Mendiola2, María Gálvez2, Cristina Barba2, Antón Asla2
1Vicomtech, 2Serikat
  A Novel Synthetic Dataset for Few-Shot Legal Relation Extraction in German
Shiva Banasaz Nouri1, Elena Leitner2, Julian Moreno-Schneider2, Georg Rehm2
1TU Berlin, 2DFKI
  LLM-Based Data Generation and Clinical Skills Evaluation for Low-Resource French OSCEs
Tian Huang1, Tom Bourgeade2, Irina Illina3
1LORIA, University of Lorraine, 2LORIA - INRIA, University of Lorraine, 3LORIA/INRIA
  Instruction-Tuned Urdu LLMs: Efficient Adaptation of Llama Models and Evaluation Resources for Urdu
Munief Tahir1, Sana Shams2, Sarmad Hussain3, Miriam Butt4
1Al Khawarizmi Institute of Computer Science, 2Al-Khawarizmi Institute of Computer Science, University of Engineering and Technology, 3Center for Language Engineering, KICS, UET, 4University of Konstanz
  Is Biomedical Specialization Still Worth It? Insights from Domain-Adaptive Language Modelling with a New French Health Corpus
Aidan Mannion1, Cécile Macaire1, Armand Violle2, Stéphane Ohayon2, Xavier Tannier3, Didier Schwab4, Lorraine Goeuriot1, François Portet5
1LIG, Université Grenoble Alpes, 2LIMICS, Sorbonne Université, INSERM, 3Limics, Sorbonne Université, 4Univ. Grenoble Alpes, 5Univ Grenoble Alpes, Laboratoire d'Informatique de Grenoble
  TildeOpen LLM: Leveraging Curriculum Learning to Achieve Equitable Language Representation
Toms Bergmanis, Ingus Pretkalninš, Martins Kronis, Davis Nicmanis, Jelizaveta Jelinska, Roberts Rozis, Rinalds Viksna, Marcis Pinnis
Tilde
  Common Sense vs. Morality: The Curious Case of Narrative Focus Bias in LLMs
Saugata Purkayastha1, Pranav Kushare1, Pragya Pal1, Sukannya Purkayastha2
1Saarland University, 2TU Darmstadt
11:00 - 12:40    Session P9.3.3: Language Modeling and LRs V - Poster Area
  "Emphasizing the Commendable": A Study of Homogenized Transitive Verb Constructions in Machine Generated Peer Reviews
Hing-Yuet Fung1, Chi-kiu Lo2, Samuel Larkin3
1Independent Researcher, 2National Research Council of Canada, 3National Research Council Canada
  CoDAE: Adapting Large Language Models for Education via Chain-of-Thought Data Augmentation
Shuzhou Yuan1, William LaCroix2, Hardik Ghoshal3, Ercong Nie4, Michael Färber3
1Dresden University of Technology, 2Saarland University, 3TU Dresden, 4Centre for Information and Language Processing, LMU Munich
  Synthetic Instruction Generation for Low-Resource Nordic Languages: Viability and Limitations in LLM Instruction-Tuning
Mathias Stenlund1, Annika Simonsen1, Lars Bungum2, Jan Ebert3, Jiangtao Wang3, Oleg Filatov3, Hemanadhan Myneni1, Morris Riedel1, Hafsteinn Einarsson1
1University of Iceland, 2NTNU, 3Jülich Supercomputing Centre
  AYN: A Tiny Yet Competitive Indian Legal Language Model Pretrained from Scratch
Mitodru Niyogi1, Eric Gaussier2, Arnab Bhattacharya3
1CNRS, 2Univ. Grenoble Alpes, 3Dept. of Computer Science and Engineering, IIT Kanpur
  Low-Resource Dialect Adaptation of Large Language Models: A French Dialect Case-Study
Eeham Khan1, Firas Saidani2, Owen Van Esbroeck1, Richard Khoury2, Leila Kosseim1
1Concordia University, 2Université de Laval
  Reformulate and Create, Don't Translate: Creating Natural Prompts for Underserved Languages
Annika Simonsen1, Mathias Stenlund2, Lars Bungum3, Marc Volhardt2, Hafsteinn Einarsson2
1The University of Iceland, 2University of Iceland, 3Norwegian University of Science and Technology
  Generating High Quality Synthetic Data for Dutch Medical Conversations
Cecilia Kuan1, Aditya Kamlesh Parikh1, Henk van den Heuvel2
1Radboud University, 2CLS/CLST, Radboud University Nijmegen
  DeepICD-R1: Medical Reasoning through Hierarchical Rewards and Unsupervised Distillation
Tom Röhr1, Thomas Steffek1, Roman Teucher2, Keno Bressem3, Alexei Figueroa1, Paul Grundmann1, Peter Troeger1, Felix Gers1, Alexander Löser1
1Berliner Hochschule für Technik (BHT), 2Fraunhofer Research Engineer, 3Department of Diagnostic and Interventional Radiology, School of Medicine, University Hospital Rechts der Isar, Technical University of Munich
  SynthLLM: An LLM-based Scalable Synthetic Data Generation Pipeline for Low-Resource Languages
Solmaz Panahi1, Vasudevan Nedumpozhimana2, John Kelleher3
1Maynooth University, 2TU Dublin, 3Trinity College Dublin
  Persona-Conditioned Generation of Patient Self-Reports from EHRs
Yuexin Wu1, Jianming Wei2, Vasile Rus1
1University of Memphis, 2University Medical Center Utrecht
  SocialStep: Fast Prediction of Social Determinants of Health
Paul Landes1, Adam Cross2, Jimeng Sun3
1University of Illinois at Chicago, 2University of Illinois College of Medicine Peoria, 3University of Illinois Urbana-Champaign
  Dynamically Acquiring Text Content to Enable the Classification of Lesser-known Entities for Real-world Tasks
Fahmida Alam and Ellen Riloff
University of Arizona
  RILEC: Detection and Generation of L1 Russian Interference Errors in English Learner Texts
Darya Kharlamova1 and Irina Proskurina2
1National Research University Higher School of Economics, 2Laboratoire Hubert Curien, UMR CNRS 5516, Saint-Etienne, France, Université Claude Bernard Lyon 1, Université Lumière Lyon 2, ERIC, 69100, Villeurbanne, France
12:40 - 14:10    Lunch Break
14:10 - 14:55    Keynote Speaker: Dan Jurafsky - Room 1
14:55 - 15:00    Short Break (5 min)
15:00 - 16:40    Session O37: Evaluation, Validation, Quality Assurance - Room 1
15:00 - 15:20  Critical Foreign Policy Decision (CFPD) Benchmark: Measuring Diplomatic Preferences of Large Language Models
Benjamin Jensen1, Ian Reynolds1, Yasir Atalan1, Michael Garcia2, Austin Woo2, Anthony Chen2, Trevor Howarth2
1Center for Strategic and International Studies, 2Scale AI
15:20 - 15:40  CrisisCL: A Domain Incremental Learning Benchmark for Crisis Management
Paul Le Van Kiem1, Romain Meunier1, Farah Benamara2, Véronique MORICEAU3
1IRIT, 2University of Toulouse, 3IRIT, Université de Toulouse
15:40 - 16:00  Towards Consistent Detection of Cognitive Distortions: LLM-Based Annotation and Dataset-Agnostic Evaluation
Neha Sharma1, Navneet Agarwal2, Kairit Sirts1
1University of Tartu, 2EXAI, University of Tartu
16:00 - 16:20  LLMs as Annotators: Evaluating Model–Human Alignment in Detecting Contentious Language in Historical Corpora
Yahui Zhao1, Clemencia Siro2, Laura Hollink1
1Centrum Wiskunde & Informatica (CWI), 2Centrum Wiskunde & Informatica
16:20 - 16:40  Widespread Gender and Pronoun Bias in Moral Judgments across LLMs
Gustavo Fernandes, Jeiverson Santos, Pedro O.S. Vaz-de-Melo
UFMG
15:00 - 16:40    Session O38: Knowledge Discovery and Representation - Room 2
15:00 - 15:20  Frame2KG: A Benchmark and Evaluation Toolkit for Interpretable Frame-to-Graph Generation
Lewis Watson, Carl Strathearn, Kenny Mitchell, Yanchao Yu
Edinburgh Napier University
15:20 - 15:40  Injecting Structured Biomedical Knowledge into Language Models: Continual Pretraining vs. GraphRAG
Jaafer Klila1, Sondes Bannour Souihi2, Rahma Boujelbane3, Nasredine Semmar4, Lamia Hadrich-Belguith5
1PhD student, 2CEA, 3FSEGS, 4CEA LIST, 5ANLP Research Group, MIRACL Lab, FSEGS, Sfax University
15:40 - 16:00  Linguistic Knowledge Graphs for Sense Prediction: A Case-study on Latin
Eleonora Ghizzota1, Paola Marongiu2, Pierpaolo Basile3, Stefano Ferilli4, Barbara McGillivray5
1University of Bari Aldo Moro, 2CNR-ILC, Istituto di Linguistica Computazionale 'A. Zampolli', 3Department of Computer Science, University of Bari Aldo Moro, 4Università degli Studi di Bari, 5King's College London
16:00 - 16:20  ACID: On the Perception of Online Classism
Arianna Muti1, Elisa Bassignana2, Amanda Cercas Curry1, Federica Durante3, Dirk Hovy1, Debora Nozza1
1Bocconi University, 2IT University of Copenhagen, 3Università Milano Bicocca
16:20 - 16:40  The Spectrum of Sentiment: Optimistic, Pessimistic, and Neutral Voices in Online Depression Discourse
Stefana Tabusca1, Ana-Maria Bucur2, Liviu Dinu1
1University of Bucharest, 2Università della Svizzera italiana
15:00 - 16:40    Session O39: Applications Involving LRs and Evaluation III - Room 3
15:00 - 15:20  A Benchmark Dataset and Comparative Evaluation of Phonemized and Romanized Urdu for Text-to-Speech
M Kaab Bin Shahid1 and Muhammed Izharuddin2
1University of Stuttgart, 2Aligarh Muslim University
15:20 - 15:40  S-VoCAL: A Dataset and Evaluation Framework for Inferring Speaking Voice Character Attributes in Literature
Abigail Berthe-Pardo1, Gaspard Michel2, Elena Epure2, Christophe Cerisara3
1Université de Lorraine, CNRS, LORIA, 2Deezer Research, 3Universite de Lorraine, CNRS, LORIA
15:40 - 16:00  BankMathBench: A Benchmark for Numerical Reasoning in Banking Scenarios
Yunseung Lee1, Subin Kim2, Youngjun Kwak2, Jaegul Choo3
1KakaoBank Corp., 2Kakaobank, 3Korea Advanced Institute of Science and Technology
16:00 - 16:20  TR-TEB: Turkish Text Embedding Benchmark
Omer Arslan, Atalay Celik, Yusuf Aslan, Hasan Durkaya, Mustafa Zenginoglu, Musa Yilmaz, Merve Kantarci, Mehmet Haklidir
TUBITAK BILGEM
16:20 - 16:40  Simple Additions, Substantial Gains: Expanding Scripts, Languages, and Lineage Coverage in URIEL+
Mason Shipton1, York Hay Ng2, Aditya Khan2, Phuong Hoang2, Xiang Lu3, A. Seza Dogruoz4, Annie Lee2
1Ontario Tech University, 2University of Toronto, 3University of Michigan, 4Universiteit Gent
15:00 - 16:40    Session O40: Multimodality, Cross-modality - Room 4
15:00 - 15:20  SciClaimEval: Cross-modal Claim Verification in Scientific Papers
Xanh Ho1, Yun-Ang Wu2, Sunisth Kumar3, Tian Cheng Xia4, Florian Boudin5, Andre Greiner-Petter6, Akiko Aizawa1
1National Institute of Informatics, 2National Taiwan University, 3University of Tokyo, 4University of Bologna, 5Nantes University, 6University of Goettingen
15:20 - 15:40  Localizing Events in Space: Comparing Humans and AI Models
Derrick Eui Gyu Kim, Kenneth Lai, James Pustejovsky
Brandeis University
15:40 - 16:00  STRUDEL: Unrolling a Benchmark for Evaluating Vision-Language Models on Structured Diagram Understanding across Domains
Daniel Steinigen, Lucie Flek, Sebastian Houben
Fraunhofer IAIS
16:00 - 16:20  VG-CoT: Towards Trustworthy Visual Reasoning via Grounded Chain-of-Thought
Byeonggeuk Lim, Kyeonghyun Kim, Jungmin Yun, Youngbin Kim
Chung-ang University
16:20 - 16:40  VectorEdits: A Dataset and Benchmark for Instruction-Based Editing of Vector Graphics
Josef Kuchar1, Marek Kadlcik2, Michal Spiegel3, Michal Stefanik1
1Masaryk University, 2Faculty of Informatics, Masaryk University, 3Kempelen Institute of Intelligent Technologies
15:00 - 16:40    Session P10.1: Social Media - Poster Area
  ViWikiFC: Fact-Checking for Vietnamese Wikipedia-Based Textual Knowledge Source
Hung Le1, Long To1, Manh Nguyen1, Kiet Nguyen2
1University of Information Technology, HCM VNU, 2University of Information Technology, VNU-HCM
  Automated Extraction of Answer Candidates for Question Generation
Claudia Preda1, Mihai Dascalu1, Stefan Ruseti2, Danielle McNamara3
1National University of Science and Technology POLITEHNICA Bucharest, 2University Politehnica of Bucharest, 3Arizona State University
  Green Bots versus Red Bots: Evaluating Large Language Models for Simulating Persuasion Dynamics in Online Influence Campaigns
Majd Al Ali1, Filip Muntean2, Lucia Donatelli1, Jurriaan van Diggelen3
1Vrije Universiteit Amsterdam, 2Vrije Universiteit, 3TNO
  Towards Expectation Detection in Language: A Case Study on Treatment Expectations in Reddit
Aswathy Velutharambath1 and Amelie Wührl2
1University of Stuttgart, University of Bamberg, 2IT University of Copenhagen
  Empathy Speaks in Metaphors: The Empathy-Metaphor Corpus of Figurative Language in Empathetic Text
Gyeongeun Lee and Natalie Parde
University of Illinois at Chicago
  A Computational Diachronic Analysis of Gen Z Mental Health Discourse: A Large-scale Reddit Corpus Study from Pre- to Post-COVID
Felix Mao
Rye Country Day School
  "Oat Milk Vegan Chocolate Taste Great!": Monitoring the Food Transition Debate in Reddit
Greta Zella1, Jan Willem Bolderdijk2, Saskia Peels1, Gerry Wakker1, Tommaso Caselli3
1University of Groningen, 2University of Amsterdam, University of Groningen, 3Rijksuniversiteit Groningen
  ClimateChat-300K: A Multi-Modal Facebook Dataset for Understanding Diverse Perspectives in Climate Communication
Wajdi Zaghouani1, Md. Rafiul Biswas2, Mabrouka Bessghaier1, Shimaa Ibrahim1, George Mikros2
1Northwestern University Qatar, 2Hamad Bin Khalifa University
  HateMirage: An Explainable Multi-Dimensional Dataset for Decoding Faux Hate and Subtle Online Abuse
Sai Kartheek Reddy Kasu1, Shankar Biradar2, SUNIL SAUMYA3, Md. Shad Akhtar4
1Student, 2Assistant Professor, 3INDIAN INSTITUTE OF INFORMATION TECHNOLOGY DHARWAD, 4Indraprastha Institute of Information Technology, Delhi
  MindSET: Advancing Mental Health Benchmarking through Large-Scale Social Media Data
Saad Mankarious1, Edward Kempa2, Daniel Wiechmann3, Elma Kerz4, Yu Qiao5, Ayah Zirikly6
1Cornell College, 2University of Florida, Department of Computer and Information Science and Engineering, 3Institute for Logic Language and Computation, 4Exaia Technologies, 5RWTH Aachen University, 6Johns Hopkins University
  A Corpus of Misunderstood Irony on Turkish Social Media
Çagri Çöltekin and Güliz Günes
University of Tübingen
15:00 - 16:40    Session P10.2.1: Linguistics and Psycholinguistics I - Poster Area
  A Corpus of Joint EEG and Self-Paced Reading of Natural Dutch Texts
Sara Østergaard, Lenneke Lichtenberg, Laura Boon, Bruno Nicenboim
Tilburg University
  How Long Does a Quick Kiss Take? Studying Event Duration of Light Verb Constructions Using Explicit Word Embeddings
Lin de Huybrecht and Geraint Wiggins
Vrije Universiteit Brussel
  Evaluation Drift in LLM Personality Induction: Are We Moving the Goalpost?
Prateek Rajput, Yewei Song, Iyiola Olatunji, Jacques Klein, Tegawendé Bissyande
University of Luxembourg
  A Multi-Dialectal, Longitudinal Corpus of Human-AI Hybrid Language Production
Qiao Gan1, Jonathan Dunn2, Andrea Nini3, Benjamin Adams1
1University of Canterbury, 2University of Illinois Urbana-Champaign, 3University of Manchester
  Semantic Information: A Difference That Makes a Difference
J. Nathanael Philipp1, Max Kölbl2, Michael Richter3
1Sächsische Akademie der Wissenschaften zu Leipzig, 2Osaka University, 3Leipzig University
  Modeling the Memory-Surprisal Trade-Off over Time: Communicative Efficiency Decreases with Lexico-Grammatical Change in Scientific English
Julius Steuer1, Marie-Pauline Krielke2, Stefania Degaetano-Ortlieb2, Elke Teich3, Dietrich Klakow2
1Heidelberg Institute for Theoretical Studies, 2Saarland University, 3Universität des Saarlandes
  Mechanistic Interpretability Meets Cognitive Linguistics: Modelling Locative Image Schemas in the Circuit Framework
Mattia Proietti1, Afra Alishahi2, Grzegorz Chrupala2, Alessandro Lenci3
1Università di Pisa, 2Tilburg University, 3University of Pisa
  Variation Is the Norm: Embracing Sociolinguistics in NLP
Anne-Marie Lutgen1, Alistair Plum1, Verena Blaschke2, Barbara Plank2, Christoph Purschke1
1University of Luxembourg, 2LMU Munich
  Appraisal Theory-Informed Emotion Prediction
Xiaowei Wang1, Jayant Teotia2, Rui Mao3, Wandeep Ratan Singh1, Sabrina Tiun1, Erik Cambria4
1Universiti Kebangsaan Malaysia, 2NTU, 3Ruimao Tech, 4Nanyang Technological University
  The Evolution of Philosophy: A Metaphorical Cognition Perspective
Rui Mao1, Dapeng Chen2, Zihao Huang3, Xulang Zhang3, Erik Cambria3
1Ruimao Tech, 2Jiangsu Open University, 3Nanyang Technological University
15:00 - 16:40    Session P10.2.2: Linguistics and Psycholinguistics II - Poster Area
  Predicting States of Understanding in Explanatory Interactions Using Cognitive Load-Related Linguistic Cues
Yu Wang1, Olcay Türk1, Angela Grimminger2, Hendrik Buschmeier1
1Bielefeld University, 2Paderborn University
  Figurative Language in Alzheimer's Discourse: Linguistic and Neural Alignment in Clinical Narratives
Diana Kylymnyk1, Vitória Tomasel2, Helena Caseli3, Edward Watkins4, Aline Villavicencio5, Rodrigo Wilkens4
1Department of Computer Science and Psychology, University of Exeter, 2Federal University of Sao Carlos, 3Federal University of São Carlos, 4University of Exeter, 5University of Exeter, UK
  Prompting Instruction-tuned LLMs for Semantic Similarity Values
Xander Snelder, Yunchong Huang, Jelke Bloem
University of Amsterdam
  Towards Dynamic Metaphor Identification: Evaluating GPT O-Series Models on Five Metaphoricity Cues in U.S. Trade Corpora
Berkay Bas1, Jelke Bloem1, Xiaojuan Tan2
1University of Amsterdam, 2VU Amsterdam
  Rethinking Evaluation in Retrieval-Augmented Personalized Dialogue: A Cognitive and Linguistic Perspective
Tianyi Zhang1 and David Traum2
1University of Southern California, 2University of Southern California Institute for Creative Technologies
  Evaluating Multimodal Large Language Model Narrative Interpretation through the Lens of Appraisal Theory
Jayant Teotia1, Xiaowei Wang2, Xulang Zhang3, Rui Mao3, Erik Cambria3
1NTU, 2Universiti Kebangsaan Malaysia, 3Nanyang Technological University
  Mapping Liberty Metaphors across Cultures and Time
Sidney Suen1, Rui Mao1, Kenneth Kwok2, Erik Cambria1
1Nanyang Technological University, 2Agency for Science, Technology and Research
  The Sensorimotor Norms for the Chinese Classifiers
Yimei Shao1, Yu-Yin Hsu1, Chu-Ren Huang2
1The Hong Kong Polytechnic University, 2The Hong Kong Polytechnic University
  DeepQuestion: Systematic Generation of Real-World Challenges for Evaluating LLMs Performance
Ali Khoramfar, Ali Ramezani, Mohammad Mahdi Mohajeri, Mohammad Javad Dousti, Majid Nili Ahmadabadi, Heshaam Faili
University of Tehran
  Pragmatic Modelling in Language Learning: Caregiver Question-Answer Feedback in Child-Directed Dialogue
Maryam Bala1, Johannes Heim2, Elspeth Edelstein2, Arabella Sinclair3
1University of Southampton, 2University of Aberdeen, 3University College London
15:00 - 16:40    Session P10.3.1: Parsing and Tagging I - Poster Area
  Modular Approach to Automating Morphological Components in Grammar Engineering
Ekaterina Voloshina1 and Krasimir Angelov2
1University of Gothenburg, Chalmers University of Technology, 2University of Gothenburg and Chalmers University of Technology
  MorfFlex: Handling Rich Morphology
Jaroslava Hlavácová1, Marie Mikulová2, Barbora Štepánková3, Milan Straka3, Jan Hajic2
1CUNI, 2Charles University, 3Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics
  Using Valency Inheritance in Building a Valency Lexicon
Václava Kettnerová1, Veronika Kolárová1, Jirí Mírovský2, Michal Olbrich2
1Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics, 2Charles University
  From CHAT to Coded CoNLL-U: A Reproducible Pipeline for the Syntactic Annotation and Querying of Child Language Data
Achim Stein
University of Stuttgart
  TækTåK: Syntactic Analysis of Language Use on Danish TikTok
Thea Kristensen and Rob van der Goot
IT University of Copenhagen
  Adaptive Chunking: Optimizing Chunking-Method Selection for RAG
Paulo de Moura Júnior, Jean Lelong, Annabelle Blangero
Ekimetrics
  Do Large Language Models Grasp the Grammar? Evidence from Grammar-Book-Guided Probing in Luxembourgish
Lujun LI1, Yewei Song1, Lama Sleem1, Yiqun Wang1, Yangjie Xu1, Cedric LOTHRITZ2, Niccolo' Gentile3, Radu State1, Tegawendé Bissyandé1, Jacques Klein1
1University of Luxembourg, 2Luxembourg Institute of Science and Technology (LIST), 3Foyer S.A.
  Survey of Tools for Manual Linguistic Annotation: Supporting Diversity through Interactive Exploration
Ludovica Pannitto1, Kaja Dobrovoljc2, Bruno Guillaume3
1LILEC - University of Bologna, 2University of Ljubljana, 3LORIA / Inria Nancy Grand-Est
  TextLens & LeTTuce: Automated Corpus Annotation and Multilingual Tagging as a Service
Cynthia Van Hee1, Jonas Doumen2, Vincent Prins3, Pranaydeep Singh4, Vincent Vandeghinste3, Els Lefever5
1LT3, Language and Translation Technology Team (Ghent University), 2KU Leuven, imec research group itec, 3Instituut voor de Nederlandse Taal, 4LT3, University of Ghent, 5LT3, Ghent University
  The Corpus of Contemporary Polish — a New Reference Corpus with Rich Syntactic Annotations
Witold Kieras1, Malgorzata Marciniak2, Marcin Wolinski1, Katarzyna Krasnowska-Kieras1, Marek Lazinski1
1Institute of Computer Science, Polish Academy of Sciences, 2Institute of Computer Science PAS
  Prague Dependency Treebank - Consolidated 2.0: Enriching a Complex Annotation Scheme
Marie Mikulová1, Jirí Mírovský1, Milan Straka2, Pavlína Synková1, Jan Štepánek3, Barbora Štepánková2, Jan Hajic1
1Charles University, 2Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics, 3Charles University in Prague, Faculty of Mathematics and Physics, UFAL
  Meet UD_Czech-PDTC: A Large and Genre-Rich Treebank in Universal Dependencies
Marie Mikulová1, Barbora Štepánková2, Daniel Zeman3, Jan Štepánek4, Milan Straka2, Jan Hajic1
1Charles University, 2Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics, 3Charles University, Faculty of Mathematics and Physics, 4Charles University in Prague, Faculty of Mathematics and Physics, UFAL
  Encoding Logical Relations of Chinese Complex Sentences within the Universal Dependencies Framework
Hongpu Zhu and Hongzhi Xu
Shanghai International Studies University
  Unsupervised Labelling of Mutation Triggers in Welsh
Nicolás Gutiérrez-Rolón and Fernando Alva-Manchego
Cardiff University
15:00 - 16:40    Session P10.3.2: Parsing and Tagging II - Poster Area
  UzUDT: Uzbek Universal Dependencies Treebank
Sanatbek Matlatipov1 and Mersaid Aripov2
1Dr, 2Professor
  BRAGD: Constrained Multi-Label POS Tagging for Faroese
Annika Simonsen1, Barbara Scalvini2, Uni Johannesen2, Iben Debess2, Hafsteinn Einarsson3, Vésteinn Snæbjarnarson4
1The University of Iceland, 2University of the Faroe Islands, 3University of Iceland, 4University of Copenhagen
  Syntactic Sugar for Syntactic Queries: Sequential Representations for Dependency Queries
Niklas Deworetzki1 and Arianna Masciolini2
1Department of Computer Science and Engineering, Chalmers University of Technology and University of Gothenburg, 2University of Gothenburg
  Context Is (Almost) Everything: Llama-3 on Structured Output and AMR Parsing
Maja Buljan1, Stephan Oepen2, Lilja Øvrelid3
1Language Technology Group (LTG), University of Oslo, 2Universitetet i Oslo, 3Dept of Informatics, University of Oslo
  Towards the Morphological Annotation of North Markian (Low German)
Christian Chiarcos
University of Augsburg
  Cross-Dataset Inconsistencies in Morphological Annotation: Evidence from Universal Dependencies
Vlasta Ohlídalová
Masaryk University
  Improving Latvian Morphosyntactic Parsing with Pretrained Encoders and Analyzer-Constrained Decoding
Arturs Znotins
Institute of Mathematics and Computer Science, University of Latvia
  CommonMorph: Participatory Morphological Documentation Platform
Aso Mahmudi1, Sina Ahmadi2, Kemal Kurniawan3, Rico Sennrich2, Eduard Hovy3, Ekaterina Vylomova3
1The University of Melbourne, 2University of Zurich, 3University of Melbourne
  Datasets for Verb Alternations across Languages: BLM Templates and Data Augmentation Strategies
Giuseppe Samo1 and Paola Merlo2
1IDIAP, 2University of Geneva
  A Large and Balanced Multi-Domain Arabic Corpus Annotated for Morphology, Syntax, and Readability
Khalid Elmadani1, Adel Mahmoud Wizani2, Hanada Taha Thomure3, Nizar Habash1
1New York University Abu Dhabi, 2University of Turin, 3Zayed University
  The DELPH-IN Grammary: A Curated Repository of Grammars and Treebanks
Francis Bond1 and Dan Flickinger2
1Palacky University, 2Stanford University
  Morphemes without Borders: Evaluating Root–Pattern Morphology in Arabic Tokenizers and LLMs
Yara Alakeel1, Chatrine Qwaider2, Hanan Aldarmaki2, Sawsan Alqahtani1
1SDAIA, 2MBZUAI
  Universal NER v2: Towards a Massively Multilingual Named Entity Recognition Benchmark
Terra Blevins1, Stephen Mayhew2, Marek Suppa3, Hila Gonen4, Shachar Mirkin5, Vasile Pais6, Kaja Dobrovoljc7, Voula Giouli8, Jun Kevin9, Eugene Jang1, Eungseo Kim10, Jeongyeon Seo11, Xenophon Gialis12, Yuval Pinter13
1Northeastern University, 2Duolingo, 3Comenius University in Bratislava, 4UBC, 5Alpinference, 6Research Institute for Artificial Intelligence, Romanian Academy, 7University of Ljubljana, 8Aristotle University of Thessaloniki / ILSP, ATHENA RC, 9Universitas Pelita Harapan, 10Seoul National University, 11Independent Researcher, 12Democritus University of Thrace, 13Ben-Gurion University of the Negev
15:00 - 16:40    Session P10.4.1: Lexicon and Semantics II - Poster Area
  APODICTUS: Automatic Processing of DICTionary Update candidateS
Felix Blessing1, Johannes Sax1, Julian Kaufmann1, Wei Zhao2, Nikolay Arefyev3, Dominik Schlechtweg1
1University of Stuttgart, 2University of Aberdeen, 3University of Oslo
  A Test Collection for Part-of-Speech Tagging and Word Sense Disambiguation
Robert Krovetz
Lexical Research
  Creating a Hybrid Rule and Neural Network Based Semantic Tagger Using Silver Standard Data: The PyMUSAS Framework for Multilingual Semantic Annotation
Andrew Moore1, Paul Rayson1, Dawn Archer2, Tim Czerniak3, Dawn Knight4, Daisy Lal1, Gearóid Ó Donnchadha5, Mícheál Ó Meachair6, Scott Piao1, Elaine Uí Dhonnchadha3, Johanna Vuorinen5, Yan Yabo7, Xiaobin Yang7
1Lancaster University, 2Manchester Metropolitan University, 3Trinity College Dublin, 4Cardiff University, 5independent researcher, 6Fiontar & Scoil na Gaeilge, Dublin City University, 7Hubei University
  Scare Quotes as Markers of "Questionable" Word Usages and Misalignment in Conversation: An Annotation Study
Aina Garí Soler1, Juan Carlos Zevallos Huaco2, Matthieu Labeau3, Chloé Clavel4
1PSL University, INRIA Paris, 2Independent Researcher, 3Telecom Paris, 4INRIA
  Modeling Clinical Uncertainty in Radiology Reports: From Explicit Uncertainty Markers to Implicit Reasoning Pathways
Paloma Rabaey1, Jong Hak Moon2, Jung-Oh Lee3, Min Gwan Kim4, Hangyul Yoon2, Thomas Demeester1, Edward Choi2
1Ghent University, 2KAIST, 3Mount Sinai Hospital, 4Seoul National University Hospital
  ArabDiscrim: A Decade-Long Arabic Facebook Corpus on Racism and Discrimination
Wajdi Zaghouani1, Shimaa Ibrahim1, Mabrouka Bessghaier1, Houda Bouamor2
1Northwestern University Qatar, 2Carnegie Mellon University in Qatar
  DAMETA: An LLM Benchmark for Danish Metaphor Interpretation with Systematically Varied Distractors
Nina Schneidermann1, Sanni Nimb2, Nathalie Norman1, Sussi Olsen3, Bolette Pedersen1
1University of Copenhagen, 2Society for Danish Language and Literature (DSL), 3UCPH, NorS, Centre for Language Technology
  A New Semantic Artifact Based Framework for Studying and Documenting Algospeak and Related Phenomena
Fahad Khan1, Elisa Gugliotta2, Elisa Squadrito3, Maura Tarquini2, Francesca Frontini4
1Istituto di Linguistica Computazionale "Antonio Zampolli", CNR, 2Università degli Studi di Sassari, 3Università di Macerata, 4Istituto di Linguistica Computazionale "A. Zampolli" - ILC Consiglio Nazionale delle Ricerche - CNR
  Creating a High Quality Abstract Meaning Representation Dataset Automatically
Johannes Heinecke1, Asadullah Munshi2, Frédéric Herledan2, Geraldine Damnati1
1Orange Innovation, 2Orange
  Towards a Comprehensive English Wordnet-Wikidata Mapping
John P. McCrae1, Johann Bergh2, Krasimir Angelov3
1Insight Center for Data Analytics, National University of Ireland Galway, 2Lingolutions, 3University of Gothenburg and Chalmers University of Technology
  AmDi - Ambiguous Words Diachronic Dataset
Felix Thielen1 and Kai Kugler2
1Trier University, 2Trier University
15:00 - 16:40    Session P10.4.2: Lexicon and Semantics III - Poster Area
  GerVLPro: A CEFR-Graded Vocabulary List of L2 Learners' Productive Vocabulary in German
Noah-Manuel Michael1, Anna Huelsing2, Andrea Horbach3
1Kiel University, 2CAU, 3CAU Kiel / Leibniz Institute for Science and Mathematics Education
  Building Bridges between Student and Curricular Language: Creating a Corpus of Abstract Meaning Representations for the Classroom
Kristin Wright-Bettner1, Zheng Cai2, Zekun Zhao3, James H. Martin1, Jeffrey Flanigan4, Martha Palmer5
1University of Colorado Boulder, 2The University of Colorado, 3University of California, Santa Cruz, 4UC Santa Cruz, 5University of Colorado
  Mu'jam Arriyadh: A Comprehensive Lexicon for Contemporary Arabic Language
Afrah Altamimi1, Abdulrahman Alosaimy2, Halah Alharbi3, Hawra Aljasim3, Muneera Alhoshan4, Amal Almazrua5, Hanan Alharbi3, Abdulrahman Alshehri1, Bayan Almuqhim3, Maryam Algarny3, Yahya Asiri6, Abdullah I. Alharbi7, Saleh Albalawi3, Fawziah Asiri1, Sara Alhifthi8, Abdullah Alfaifi5
1KSGAAL, 2King Salman Academy for Arabic Language / Imam Mohammed Bin Saud Islamic University, 3King Salman Global Academy for Arabic Language, 4King Salman Global Academy for Arabic Language, 5KSAA, 6King Salman Global Academy of Arabic Language, 7King Salman Global Academy for Arabic, 8Saudi Arabia
  The Romanian Corpus Annotated with Multiword Expressions. PARSEME-Ro Version 2.0
Verginica Barbu Mititelu1, Mihaela Cristescu2, Elena Irimia3, Carmen Vasile2
1RACAI, 2University of Bucharest, 3Research Institute for Artificial Intelligence, Romanian Academy (RACAI)
  Missing Links: LLM-Augmentation of Event Triggers of State Changes in the OpenPI Dataset
Kyeongmin Rim1 and James Pustejovsky2
1Department of Computer Science, Brandeis University, 2Brandeis University
  VUPMC: A New Political Metaphor Corpus in Mandarin Chinese
Xiaojuan Tan
VU Amsterdam
  Not All Disneys Are the Same: Making Coreference Metonymy-Aware
Bingyang Ye, Jingxuan Tu, James Pustejovsky
Brandeis University
  JSTS-Neg: Japanese Semantic Textual Similarity Dataset for Evaluating Negation Understanding Ability
Reiko Yuasa, Yoshihide Kato, Shigeki Matsubara
Nagoya University
  Few-shot Prompting or Supervised Tuning? A Comparative Study of LLMs for Linguistically Distant Language Pairs in BDI
Deepen Naorem1, Sanasam Ranbir Singh2, Telem Joyson Singh3, Priyankoo Sarmah4
1Indian Institute of Technology, Guwahati, 2Indian Institute of Technology, 3IIT Guwahati, 4Indian Institute of Technology Guwahati
  When Structure Matters: Cross-Lingual Hyperbolic Embeddings for Chinese and English Wordnets
Mao-Chang Ku1, Da-Chen Lian2, Pin-Er Chen1, Po-Ya Angela Wang1, Wei-Ling Chen1, Shu-Kai HSIEH2
1National Taiwan University, 2Graduate Institute of Linguistics, National Taiwan University
16:40 - 17:00    Coffee Break
17:00 - 18:20    LREC 2026 Closing Ceremony - Room 1
20:00    LREC 2026 GALA Dinner
                         End of Day 3