| Day 3 |
| 09:00 - 10:40 | Session O29: Infrastructures, Policy and Legal Issues II - Room 1 |
| 09:00 - 09:20 |
Mitigating Misinterpretation in Policy Documents through Automated Language Understanding
Momojit Biswas, Anka Chandrahas Tummepalli, Preethu Rose Anish TCS Research |
| 09:20 - 09:40 |
Sovereign AI-based Public Services Are Viable and Affordable
António Branco1, Luis Gomes2, Rodrigo Santos1, Eduardo Santos1, João Ricardo Silva1, Nuno Marques1, Madalena Rodrigues1 1University of Lisbon, 2Faculdade de Ciencias da Universidade de Lisboa |
| 09:40 - 10:00 |
A Typology of Synthetic Datasets for Dialogue Processing in Clinical Contexts
Steven Bedrick1, A. Seza Dogruoz2, Sergiu Nisioi3 1Oregon Health & Science University, 2Universiteit Gent, 3Human Language Technologies Research Center, University of Bucharest |
| 10:00 - 10:20 |
Text+: A National Hub Including Legacy Language Data
Florian Barth1, Christoph Draxler2, Jennifer Ecker3, Stefan Fischer4, Philippe Genêt5, Alina Hemmer6, Timm Lehmberg7, Thorsten Trippel8, Andreas Witt3, Arden Zimmermann5, Claus Zinn9 1University of Göttingen, 2Institute of Phonetics and Speech Processing, LMU Munich, 3Leibniz Institute for the German Language, 4Universität des Saarlandes, 5Deutsche Nationalbibliothek, 6University of Hamburg, 7Academy of Science and Humanities in Hamburg, 8Leibniz-Institut für Deutsche Sprache, 9University of Tübingen |
| 10:20 - 10:40 |
Can NLP Tackle Hate Speech in the Real World? Stakeholder-Informed Feedback and Survey on Counterspeech
Tanvi Dinkar1, Aiqi Jiang1, Simona Frenda2, Poppy Gerrard-Abbott3, Nancie Gunson2, Gavin Abercrombie1, Ioannis Konstas2 1Heriot Watt University, 2Heriot-Watt University, 3University of Edinburgh/Heriot-Watt University |
| 09:00 - 10:40 | Session O30: Opinion and Argument Mining, Sentiment Analysis - Room 2 |
| 09:00 - 09:20 |
Towards Complex Debate Understanding: Predicting Claim Impact Scores through the Modelling of Claim Interactions
Maxime Brouat1, Mihai Surdeanu2, Srdjan Vesic1, Eduardo Blanco2 1CRIL CNRS Univ. Artois, 2University of Arizona |
| 09:20 - 09:40 |
Is There Anything More Deceptive than an Obvious Fact? Investigating Implicitness in User-Generated Argumentative Text
Ekaterina Sviridova1, Elena Cabrio2, Serena Villata3 1Université Côte d'Azur, 2Université Côte d'Azur, Inria, CNRS, I3S, 3Université Côte d'Azur, CNRS, Inria, I3S |
| 09:40 - 10:00 |
Best-Worst Scaling of Hype in Biomedical Research: Building an Intensity Lexicon of Promotional Adjectives
Neil Millar1, Dipesh Satav1, Bojan Batalo2, Erica K. Shimomoto3, Ryosuke Ohniwa1 1University of Tsukuba, 2AIST, 3National Institute of Advanced Industrial Science and Technology |
| 10:00 - 10:20 |
Trust Me, I Can Convince You: The Contextualized Argument Appraisal Framework and the ContArgA Corpus
Lynn Greschner, Sabine Weber, Roman Klinger University of Bamberg |
| 10:20 - 10:40 |
Towards Clinical Applications of NLP: Detecting Emotion Regulation via Emotional Categories and Expression Modes in French Transcriptions
Salome Klein1, Amalia Todirascu2, Hélène Vassiliadou3 1UR 1339/LiLPa & FRLC (University of Strasbourg), 2LiLPa, University of Strasbourg, 3University of Strasbourg |
| 09:00 - 10:40 | Session O31: Bias, Offensive and Non-inclusive Language - Room 3 |
| 09:00 - 09:20 |
R.U.Psycho? A Framework for Robust Unified Psychometric Testing of Language Models
Julian Schelb1, Orr Borin2, David Garcia1, Andreas Spitz1 1University of Konstanz, 2Recosys |
| 09:20 - 09:40 |
Code-switching as a Bias Indicator in LLMs: "the Consequences Are Not the Same Para Nosotros"
Fanny Ducel1, Aurélie Névéol2, Vidit Khazanchi3, Loïc Leclere4, Arthur Pedrini4, Léa Bouchet5, Benjamin Caissial5, Karen Fort6 1LISN, Université Paris-Saclay, 2Université Paris Saclay, CNRS, LISN, 3LORIA, 4Université de Lorraine, LORIA, 5Université de Lorraine, 6Sorbonne Universite and LORIA |
| 09:40 - 10:00 |
Exploration of How Hate Is Framed on Social Media
Rakshitha Rao Ailneni and Sanda Harabagiu University of Texas at Dallas |
| 10:00 - 10:20 |
Are Social Biases in LLMs Consistent across Generative Tasks? A Case Study for Basque
Muitze Zulaika1, Xabier Saralegi1, Julia Shershneva2, Lia Gonzalez2, Arkaitz Fullaondo2 1Orai NLP Technologies, 2University of the Basque Country (EHU) |
| 10:20 - 10:40 |
Fine-grained Narrative Classification in Biased News Articles
Zeba Afroz1, Harsh Vardhan1, Pawan Bhakuni2, Aanchal Punia3, Rajdeep Kumar4, Md. Shad Akhtar1 1Indraprastha Institute of Information Technology, Delhi, 2Bharat Electronics Ghaziabad, 3Bharat Electronics, 4Bharat Electronics Limited |
| 09:00 - 10:40 | Session O32: Speech Resources, Processing, Applications - Room 4 |
| 09:00 - 09:20 |
A Shoal of Voices: Parallel Read Speech from Professional Swedish Narrators
Christina Tånnander1, Jim O'Regan2, Jens Edlund3 1KTH Speech, Music and Hearing, MTM, 2KTH Royal Institute of Technology, 3KTH Speech, Music and Hearing |
| 09:20 - 09:40 |
Deep Learning-Based Multi-Aspect Pronunciation Assessment for Individuals with Down Syndrome
David Fernández-García, César González-Ferreras, Valentín Cardeñoso-Payo, Mario Corrales-Astorgano Universidad de Valladolid |
| 09:40 - 10:00 |
WikIPA: Integrating WikiPron and Lingua Libre for Multilingual IPA Transcription
Pierluigi Cassotti1, Jacob Suchardt2, Domenico De Cristofaro3 1University of Gothenburg, 2Leipzig University, 3Free University of Bozen |
| 10:00 - 10:20 |
How Pragmatics Shape Articulation: A Computational Case Study in STEM ASL Discourse
Saki Imai1, Lee Kezar2, Laurel Aichler3, Mert Inan1, Erin Walker4, Alicia Wooten3, Lorna Quandt3, Malihe Alikhani1 1Northeastern University, 2University of Southern California, 3Gallaudet University, 4University of Pittsburgh |
| 10:20 - 10:40 |
Setting the Stage for Disfluency: Implications of Contextual Task Framing Effects for the Design of Listening Tasks
Ambika Kirkland1 and Jens Edlund2 1KTH Royal Institute of Technology, 2KTH Speech, Music and Hearing |
| 09:00 - 10:40 | Session P8.1.1: Machine Translation I - Poster Area |
|
ACAData: Parallel Dataset of Academic Data for Machine Translation
Iñaki Lacunza1, Javier Garcia Gilabert2, Francesca De Luca Fornaciari3, Javier Aula-Blasco1, Aitor Gonzalez-Agirre4, Maite Melero1, Marta Villegas1 1Barcelona Supercomputing Center, 2Barcelona Super Computing Center, 3BSC Barcelona Supercomputing Center, 4Barcelona Supercomputing Center (BSC) |
|
A Single Model Ensemble Framework for Neural Machine Translation Using Pivot Translation
Seokjin Oh1, Keonwoong Noh2, Woohwan Jung3 1SK Siltron, 2Korea University, 3Hanyang University |
|
Gender Disambiguation in Machine Translation: Diagnostic Evaluation in Decoder-Only Architectures
Chiara Manna, Hosein Mohebbi, Afra Alishahi, Frederic Blain, Eva Vanmassenhove Tilburg University |
|
Building a One-Million-Pair Bokmål-Nynorsk Translation Corpus: A Quality-First Harvesting and Cleaning Pipeline
Per Kummervold1, Thea Tollersrud2, Angelina Zanardi2 1The National Library of Norway, 2National Library of Norway |
|
New Trends for Modern Machine Translation with Large Reasoning Models
Sinuo Liu1, Chenyang Lyu2, Minghao Wu3, Zifu Shang2, Longyue Wang4, Weihua Luo2, Kaifu Zhang2 1University of Edinburgh, 2Alibaba Group, 3Monash University, 4Tencent AI Lab |
|
MaitH 1.0: A Parallel Corpus and Baseline for Low-Resource Maithili-Hindi Translation
Kamanksha Dubey1, Chandresh Maurya2, Kumar Padmanabh3 1Indian Institute of Technology, 2IIT Indore, 3EBTIC (Etisalat British Telecom Innovation Center, Khalifa University) |
|
NRD: A Hybrid Disentanglement Framework for Mitigating Interference in Multilingual Machine Translation
Jiarui Zhang1 and Yifan Deng2 1Institute of Information Engineering, 2University of Chinese Academy of Sciences |
|
Linguistic and Demographic Factors in Online Free Translation Task
Tyler Lee, Irina Stenger, Tania Avgustinova Saarland University |
|
Biases in Translation: Assessing Opinion Distortion in Machine Translated Texts
Nazanin Shafiabadi1 and François Yvon2 1Sorbonne University and ISIR, 2ISIR CNRS & Sorbonne Université |
|
When Translations Surprise: Human Awareness of Predictability in Translations
Cristian García-Romero1, Miquel Esplà-Gomis2, Felipe Sanchez-Martinez2 1University of Alicante, 2Universitat d'Alacant |
|
Bidirectional Chinese and English Passive Sentences Dataset for Machine Translation
Xinyue Ma1, Pol Pastells2, Mireia Farrus1, Mariona Taule2 1Universitat de Barcelona, 2University of Barcelona |
|
CoTERM: A Consistency-Oriented Term Metric for MT System Evaluation
Amir Hazem1 and Kyo Kageura2 1RCAST, The University of Tokyo, 2University of Tokyo |
|
SiniticMTError: A Machine Translation Dataset with Error Annotations for Sinitic Languages
Hannah Liu1, Junghyun Min2, Annie Lee1, Ethan Yue Heng Cheung1, Shou-Yi Hung1, Elsie Chan1, Shiyao Qian1, Runtong Liang1, Kimlan Huynh1, Wing Yu Yip1, York Hay Ng1, Tsz Fung Yau3, Ka Ieng Charlotte Lo1, You-Wei Wu4, Richard Tzong-Han Tsai5 1University of Toronto, 2Georgetown University, 3Scotiabank, 4National Central University, 5Academia Sinica |
|
Ancient Greek to Modern Greek Machine Translation: A Novel Benchmark and Fine-Tuning Experiments on LLMs and NMT Models
Spyridon Mavromatis1, Sokratis Sofianopoulos2, Prokopis Prokopidis3, Maria Giagkou3 1Institute for Speech and Language Processing, Athena Research Center & National and Kapodistrian University of Athens, 2Researcher, 3ILSP/Athena RC |
| 09:00 - 10:40 | Session P8.1.2: Machine Translation II - Poster Area |
|
Linguistic Knowledge-Infused Fine-Tuning for Mitigating Gender Bias in Machine Translation
Luis Ernesto Garcia Estrada1, Audrey Mash2, Carlos Escolano3, Maite Melero2, Christine Basta4 1Universidad Politecnica de Catalunya, 2BSC, 3Universitat Politècnica de Catalunya, Barcelona Supercomputing Center, 4Alexandria University |
|
What Triggers My Model? Contrastive Explanations Inform Gender Choices by Translation Models
Janiça Hackenbuchner Ghent University |
|
ViKhoMT: A Vietnamese-K'Ho Neural Machine Translation Dataset and Evaluation for Community Health Communication
Tram Truong1, Vinh Nguyen2, Dang Thin1, Ngan Nguyen3 1University of Information Technology, Vietnam National University Ho Chi Minh City, 2None, 3University of Information Technology, Vietnam National University Hochiminh City |
|
Hindsight Quality Prediction Experiments in Multi-Candidate Human-Post-Edited Machine Translation
Malik Marmonier, Benoît Sagot, Rachel Bawden Inria |
|
PETra: A Multilingual Corpus of Pragmatic Explicitation in Translation
Doreen Osmelak1, Koel Dutta Chowdhury2, Uliana Sentsova1, Cristina España-Bonet3, Josef van Genabith4 1Saarland University, 2Saarland Informatics Campus, Saarland University, 3BSC/DFKI GmbH, 4DFKI |
|
A Dataset for Probing Translationese Preferences in English-to-Swedish Translation
Jenny Kunz1, Anja Jarochenko2, Marcel Bollmann2 1Linkoping University, 2Linköping University |
|
STAR-IL: A Dataset for Style-Aware Machine Translation of Product Reviews in Indian Languages
Ketaki Shetye1, Dipti Sharma2, Parameswari Krishnamurthy3 1International Institute of Information Technology, 2IIIT, Hyderabad, 3Assistant Professor, IIIT Hyderabad |
|
Cultural and Knowledge Biases in LLMs through the Lens of Entity-Aware Machine Translation
Lu Xu, Luca Moroni, Roberto Navigli Sapienza University of Rome |
|
Referenceless Evaluation of Machine Translation Models by Ranking Performance in Romanian to English Translate-train Settings
Mihail Feraru, Alexandra Diaconu, Bogdan Alexe University of Bucharest |
|
Every Word Presented in Context: Syntactic Coverage as Objective for Low-Resource Machine Translation with Large Language Models
Samuel Frontull and Thomas Ströhle University of Innsbruck |
|
Multilingual KokoroChat: A Multi-LLM Ensemble Translation Method for Creating a Multilingual Counseling Dialogue Dataset
Ryoma Suzuki, Zhiyang Qi, Michimasa Inaba The University of Electro-Communications |
|
NepTam: A Nepali-Tamang Parallel Corpus and Baseline Machine Translation Experiments
Rupak Raj Ghimire1, Bipesh Subedi2, Balaram Prasain3, Prakash Poudyal1, Praveen Acharya4, Nischal Karki1, Rupak Tiwari1, Rishikesh Sharma1, Jenny Poudel1, Bal Krishna Bal5 1Kathmandu University, 2Department of Computer Science and Engineering, Kathmandu University, 3Central Department of Linguistics, Tribhuvan University, 4Dublin City University, 5Department of Computer Science and Engineering, Kathmandu University, Nepal |
|
Scoring the Translation: On Target Automatic Keyword-Based Evaluation of Machine Translation in the Sports Domain
Steinthor Steingrimsson1 and Einar Sigurdsson2 1The Arni Magnusson Institute for Icelandic Studies, 2University of Pennsylvania |
|
Towards Improving Multimodal Machine Translation with LLMs: A Focus on Indic Languages
Amulya Ratna Dash1, Chirag Wadhwa2, Yashvardhan Sharma3 1Birla Institute of Technology & Science, Pilani, 2Birla Institute of Technology and Science, Pilani, Pilani campus, 3Birla Institute of Technology and Science |
| 09:00 - 10:40 | Session P8.2: Multilinguality and Translation Aids - Poster Area |
|
Parallel Sentence Filtering for Low-Resource Language Pairs: A Case Study for Upper Sorbian, German, and Czech
Ruiyang Jiang1, Shu Okabe2, Alexander Fraser3 1Technical University of Munich, 2TUM Heilbronn, 3Ludwig-Maximilians-Universität München |
|
OpenSubtitles2024: A Massively Parallel Dataset of Movie Subtitles for MT Development and Evaluation
Joerg Tiedemann and Hengyu Luo University of Helsinki |
|
CREST: Universal Safety Guardrails through Cluster-Guided Cross-Lingual Transfer
Lavish Bansal and Naman Mishra Repello AI |
|
Semantic Alignment across Ancient Egyptian Language Stages via Normalization-Aware Multitask Learning
He Huang Ludwig Maximilian University of Munich |
|
Conditioning LLMs to Generate Code-Switched Text
Maite Heredia1, Gorka Labaka2, Jeremy Barnes3, Aitor Soroa4 1HiTZ Basque Center for Language Technology - Ixa NLP Group, University of the Basque Country UPV/EHU, 2HiTZ Center - Ixa, University of the Basque Country (UPV/EHU), 3University of the Basque Country EHU/UPV, 4HiTZ Center - Ixa, University of the Basque Country UPV/EHU |
|
Are the LLMs Capable of Maintaining at Least the Language Genus?
Sandra Mitrovic1, David Kletz2, Ljiljana Dolamic3, Fabio Rinaldi4 1SUPSI - IDSIA, 2Supsi, IDSIA, 3armasuisse S&T, 4IDSIA, Swiss AI Institute |
|
Gender Bias in MT for a Genderless Language: New Benchmarks for Basque
Amaia Murillo1, Olatz Perez-de-Viñaspre2, Naiara Perez3 1HiTZ Center, University of the Basque Country UPV/EHU, 2HiTZ Center - Ixa, University of the Basque Country UPV/EHU, 3University of the Basque Country |
|
Optimizing Multilingual LLMs via Federated Learning: A Study of Client Language Composition
Aleix Sant1, Jordi Luque2, Carlos Escolano3 1Telefónica Innovación Digital, 2Telefonica Research, 3Universitat Politècnica de Catalunya, Barcelona Supercomputing Center |
|
Multilingual Target-Stance Extraction
Ethan Mines1 and Bonnie Dorr2 1The University of Florida, 2University of Florida |
|
MUNIChus: MUltilingual News Image Captioning Benchmark
Yuji Chen1, Alistair Plum2, Hansi Hettiarachchi1, Diptesh Kanojia3, Saroj Basnet4, Marcos Zampieri4, Tharindu Ranasinghe1 1Lancaster University, 2University of Luxembourg, 3University of Surrey, 4George Mason University |
|
GlossMATE: Multi-Agent Translator Explanations for Glosses
Changbing Yang1, Patrick Littell2, Gabriel Bernier-Colborne3, Yanfei Lu4, Mengzhe Geng3 1University of British Columbia, 2National Research Council of Canada, 3National Research Council Canada, 4University of Toronto |
|
Diagnosing Translated Benchmarks: An Automated Quality Assurance Study of the EU20 Benchmark Suite
Klaudia Thellmann, Bernhard Stadler, Michael Färber TU Dresden |
|
Resource-Lean Lexicon Induction for German Dialects
Robert Litschko1, Barbara Plank1, Diego Frassinelli2 1LMU Munich, 2CIS, LMU Munich |
| 09:00 - 10:40 | Session P8.3: Multimodality - Poster Area |
|
FENCE: A Financial and Multimodal Jailbreak Detection Dataset
Mirae Kim, Seonghun Jeong, Youngjun Kwak Kakaobank |
|
Evaluating Multimodal Large Language Models on Vertically Written Japanese Text
Keito Sasagawa1, Shuhei Kurita2, Daisuke Kawahara1 1Waseda University, 2National Institute of Informatics |
|
ProMQA-Assembly: Multimodal Procedural QA Dataset on Assembly
Kimihiro Hasegawa1, Wiradee Imrattanatrai2, Masaki Asada2, Susan Holm1, Yuran Wang1, Xuanang Zhou3, Ken Fukuda4, Teruko Mitamura1 1Carnegie Mellon University, 2National Institute of Advanced Industrial Science and Technology, 3CMU, 4AIRC/AIST |
|
K-MIND: Korean Multimodal INteraction Data for Dyadic Conversation Analysis
Jae Hee Yang1, Yuha Shin2, Saim Shin1, Je Woo Kim1, Jin Yea Jang1 1Korea Electronics Technology Institute, 2MaumAI |
|
Do Multimodal LLMs Understand Order? Measuring the Fragility of Multimodal Reasoning under Input Order Perturbations
Sheng-Lun Wei1, Yu-Ling Liao2, Hen-Hsen Huang3, Hsin-Hsi Chen1 1National Taiwan University, 2National Taiwan University, Taiwan, 3Institute of Information Science, Academia Sinica |
|
Early Fusion with Contrastive Learning: A Lightweight Alternative for Multi-modal Classification
Felix Wernlein1, Abhik Jana2, Sandipan Sikdar1 1Leibniz University Hannover, 2IIT Bhubaneswar |
|
Multimodal Entrainment and Feedback in Online Group Meetings
Patrizia Paggio1, Manex Agirrezabal1, Giulia Di Cristina2, Bart Jongejan1, Costanza Navarretta1 1University of Copenhagen, 2University of Turin |
|
MMCIG: Multimodal Cover Image Generation for Text-only Documents and Its Dataset Construction via Pseudo-labeling
Hyeyeon Kim1, Sungwoo Han2, Jingun Kwon3, Hidetaka Kamigaito4, Manabu Okumura5 1Department of Artificial Intelligence, Chungnam National University, 2Chungnam National University, Department of Artificial Intelligence, GILAB, 3Chungnam National University, 4Nara Institute of Science and Technology, 5Tokyo Institute of Technology |
|
Multimodal Reference by Means of the Pronoun We and Hand Gestures in a Novel Corpus of Parliamentary Opening Debates
Costanza Navarretta University of Copenhagen |
|
Multimodal Large Language Models for Low-Resource Languages: A Case Study for Basque
Lukas Arana1, Julen Etxaniz1, Ander Salaberria1, Gorka Azkune2 1HiTZ Center - Ixa, University of the Basque Country UPV/EHU, 2University of Basque Country |
|
Real-Time Generation of Game Video Commentary with Multimodal LLMs: Pause-Aware Decoding Approaches
Anum Afzal1, Yuki Saito2, Hiroya Takamura3, Katsuhito Sudoh4, Shinnosuke Takamichi5, Graham Neubig6, Florian Matthes7, Tatsuya Ishigaki8 1Technical University of Munich, 2The University of Tokyo, 3The National Institute of Advanced Industrial Science and Technology (AIST), 4Nara Women's University, 5Keio University, 6Carnegie Mellon University, 7Technische Universität München, 8National Institute of Advanced Industrial Science and Technology (AIST) |
|
ARB: A Comprehensive Arabic Multimodal Reasoning Benchmark
Sara Ghaboura1, Shubham Patle1, Ketan More1, Wafa Alghallabi1, Omkar Thawakar1, Jorma Laaksonen2, Hisham Cholakkal1, Salman Khan1, Rao Anwer1 1Mohamed bin Zayed University of AI, 2Aalto University |
|
Event Chronography in Multi-modal Data: The BME Method for Quantitative Analyses
Anaïs Murat, Maria Koutsombogera, Carl Vogel Trinity College Dublin |
|
CANVAS: A Multimodal Dataset of Chinese Textbook Images for Bias and Representation Analysis
Haotian Zhu, Kefan Yu, Min Li University of Washington |
|
MM-Conv: A Multimodal Dataset and Benchmark for Context-Aware Grounding in 3D Dialogue
Anna Deichler1, Jim O'Regan1, Fethiye Irmak Dogan1, Anna Klezovich1, Lubos Marcinek1, Iolanda Leite1, Jonas Beskow2 1KTH Royal Institute of Technology, 2KTH Speech, Music and Hearing |
|
Erase Persona, Forget Lore: Benchmarking Multimodal Copyright Unlearning in Large Vision Language Models
June Hyoung Kwon, Jungmin Yun, Youngbin Kim Chung-Ang University |
|
DREAM: A Multicultural Multimodal Dataset Linking Dialogues and Realistic Image Sequences
Juan Mallo1, Marcos Estecha-Garitagoitia1, Ricardo Cordoba2, Luis Fernando D'Haro3 1Universidad Politécnica de Madrid, 2Speech Technology Group. Dept. of Electronic Engineering. Universidad Politecnica de Madrid, 3Speech Technology and Machine Learning Group, E.T.S.I. Telecomunicación, Universidad Politécnica de Madrid |
|
Multimodal Task Interference: A Benchmark and Analysis of History-Target Mismatch in Multimodal LLMs
Masayuki Kawarada1, Tatsuya Ishigaki2, Hiroya Takamura3 1CyberAgent/National Institute of Advanced Industrial Science and Technology, 2National Institute of Advanced Industrial Science and Technology (AIST), 3The National Institute of Advanced Industrial Science and Technology (AIST) |
| 09:00 - 10:40 | Session P8.4: Cross-modality - Poster Area |
|
Can Video LLMs See through Illusions? Benchmark Dataset and Comprehensive Analysis
Souto Ohira1, Tosho Hirasawa2, Mamoru Komachi1 1Hitotsubashi University, 2OMRON SINIC X Corporation |
|
To Skip, to Swap or to Not Swap? Identifying Step Transition Types in Instructional Manuals
Hsiu-Yu Yang1, Michael Roth2, Andreas Bulling3, Carina Silberer3 1Institute for Natural Language Processing, Stuttgart University, 2University of Technology Nuremberg, 3University of Stuttgart |
|
Fruitcakes and Cupcakes Emerging from Noise: The ComposiGen Dataset of Compounds and Their Compositionality
Jule Godbersen1, Sinan Kurtyigit2, Emma Raimundo Schulz3, Tonmoy Rakshit3, Diego Frassinelli4, Sabine Schulte im Walde3, Carina Silberer3 1Saarland University, 2Technical University of Munich, 3University of Stuttgart, 4CIS, LMU Munich |
|
Large Language Models' Internal Perception of Symbolic Music
Andrew Shin and Kunitake Kaneko Keio University |
|
Entity Image and Mixed-Modal Image Retrieval Datasets
Cristian-Ioan Blaga1, Paul Suganthan G C1, Sahil Dua1, Krishna Srinivasan2, Enrique Alfonseca2, Peter Dornbach1, Tom Duerig1, Imed Zitouni2, Zhe Dong3 1Google, 2, 3Microsoft |
|
Generating Sign Language Poses from HamNoSys and Natural Language Descriptions
Santiago Máximo1 and Luis Chiruzzo2 1Universidad de la República, 2Universidad de la Republica |
|
Evaluating Discriminability of Vision-Language Models
Masayasu Muraoka1 and Naoaki Okazaki2 1IBM Research - Tokyo, 2Institute of Science Tokyo |
|
Seeing the Other Side: Diagnostic Tasks for Viewpoint Reasoning in Vision-Language Models
Makoto Takenaka1 and Hitomi Yanaka2 1Mitsubishi Electric, 2The University of Tokyo |
|
Multi-modal, Multi-task, Multi-criteria Automatic Evaluation with Vision Language Models
Masanari Oi1, Masahiro Kaneko2, Naoaki Okazaki1, Nakamasa Inoue1 1Institute of Science Tokyo, 2MBZUAI |
|
Challenges in Image-Caption Association in Portuguese: Evaluating the CLIP Model on the FM30K Dataset
Vitória Colonetti Benedet, Gustavo Lopes Tamiosso, Rafael Oleques Nunes, Dennis Giovani Balreira UFRGS |
|
A Large-Scale Instruction-Tuning Dataset and Models for Slovenian Vision-Language Tasks
Matej Martinc1 and Domen Vreš2 1Jozef Stefan Institute, 2Univerza v Ljubljani |
|
A Parallel Cross-Lingual Benchmark for Multimodal Idiomaticity Understanding
Dilara Torunoglu-Selamet1, Dogukan Arslan1, Rodrigo Wilkens2, Wei He2, Doruk Eryigit3, Thomas Pickard4, Adriana Pagano5, Aline Villavicencio6, Gülsen Eryigit1, Ágnes Abuczki7, Aida Cardoso8, Alesia Lazarenka9, Dina Almassova10, Amalia Mendes11, Anna Kanellopoulou12, Antoni Brosa-Rodriguez13, Baiba Valkovska14, Beata Wojtowicz15, Bolette Pedersen16, Carlos Manuel Hidalgo-Ternero17, Chaya Liebeskind18, Danka Jokic19, Diego Alves20, Eleni Triantafyllidi12, Erik Velldal21, Fred Philippy22, Giedre Valunaite Oleskeviciene23, Ieva Rizgeliene24, Inguna Skadina25, Irina Lobzhanidze26, Isabell Haugen27, Jauza Akbar Krito28, Jelena Markovic29, Johanna Monti30, Josue Sauca31, Kaja Dobrovoljc32, Kingsley Ugwuanyi33, Laura Rituma34, Lilja Øvrelid35, Maha Tufail Agro36, Manzura Abjalova37, Maria Chatzigrigoriou38, María del Mar Sánchez Ramos39, Marija Pendevska40, Masoumeh Seyyedrezaei41, Mehrnoush Shamsfard42, Momina Ahsan43, Muhammad Ahsan Khan44, Nathalie Norman16, Nilay Erdem Ayyildiz45, Nina Hosseini-Kivanani46, Noémi Ligeti-Nagy47, Numaan Naeem43, Olha Kanishcheva48, Olha Yatsyshyna49, Daniil Orel43, Petra Giommarelli50, Petya Osenova51, Radovan Garabik52, Regina Semou53, Rozane Rebechi54, Salsabila Zahirah Pranida43, Samia Touileb27, Sanni Nimb55, Sarfraz Ahmad44, Sarvinoz Sharipova56, Shahar Golan57, Shaoxiong Ji58, Sopuruchi Aboh59, Srdjan Sucur29, Stella Markantonatou60, Sussi Olsen61, Vahide Tajalli42, Veronika Lipp47, Voula Giouli62, Yelda Yesildal Eraydin63, Zahra Saaberi64, Zhuohan Xie43 1Istanbul Technical University, 2University of Exeter, 3Istanbul Technical University NLP Group, 4University of Sheffield, 5Federal University of Minas Gerais, 6University of Exeter, UK, 7Károli Gáspár University of the Reformed Church in Hungary, 8Centro de Linguística da Universidade Nova de Lisboa, 9Tesi srl, 10Nazarbayev University, 11University of Lisbon - Centre of Linguistics, School of Arts and Humanities, 12Aristotle University of Thessaloniki, 13Universitat Rovira i Virgili, 14IMCS, University of Latvia, 15University of Warsaw, 16University of Copenhagen, 17Researcher, 18Jerusalem College of Technology, Lev Academic Center, 19University of Belgrade, 20Saarland University, 21University of Oslo, 22University of Luxembourg, 23Mykolas Romeris University, 24Vilnius University Institute of Data Science and Digital Technologies, 25Tilde/ Institute of Mathematics and Computer Science, University of Latvia, 26Ilia State University, 27University of Bergen, 28Universitas Gadjah Mada, 29University of East Sarajevo, 30"L'Orientale" University of Naples, 31Internacional University of Valencia, 32University of Ljubljana, 33SOAS University of London, 34Institute of Mathematics and Computer Science, University of Latvia, 35Dept of Informatics, University of Oslo, 36Mohamed bin Zayed University of Artificial Intelligence, 37Alisher Navo'i Tashkent State Uzbek Language and Literature, 38National and Kapodistrian University of Athens, 39University of Alcalá, 40St. Cyrillus and Methodius University, 41Istinye University, 42Faculty of Computer Science and Engineering, Shahid Beheshti University, 43MBZUAI, 44Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), 45Assoc. Prof., 46RTL & University of Luxembourg, 47ELTE Research Centre for Linguistics, 48Heidelberg University, 49Ternopil Volodymyr Hnatiuk National Pedagogical University, 50University of Pisa, 51Sofia University "St. Kl. Ohridski" and IICT-BAS, 52L. Stur Institute of Linguistics, Slovak Academy of Sciences, 53NKUA, 54Universidade Federal do Rio Grande do Sul, 55Society for Danish Language and Literature (DSL), 56Samarkand State Institute of Foreign Languages, 57Jerusalem College of Technology, 58University of Turku and ELLIS Institute Finland, 59English and Communication, The Hong Kong Polytechnic University, 60ILSP/R.C. "Athena", 61UCPH, NorS, Centre for Language Technology, 62Aristotle University of Thessaloniki / ILSP, ATHENA RC, 63Dr., 64NLP Lab, Shahid Beheshti University, Tehran, Iran |
|
Which Way Does Time Flow? A Psychophysics-Grounded Evaluation for Vision-Language Models
Shiho Matta1, Lis Kanashiro Pereira2, Peitao Han3, Shigeru Kitazawa3, Fei Cheng1 1Kyoto University, 2NICT, 3The University of Osaka |
|
I Came, I Saw, I Explained: Benchmarking Multimodal LLMs on Figurative Meaning in Memes
Shijia Zhou1, Saif Mohammad2, Barbara Plank3, Diego Frassinelli4 1Ludwig Maximilian University of Munich, 2National Research Council Canada, 3LMU Munich, 4CIS, LMU Munich |
|
DEJIMA: A Novel Large-scale Japanese Dataset for Image Captioning and Visual Question Answering
Toshiki Katsube1, Fukuhara Taiga1, Kenichiro Ando2, Yusuke Mukuta1, Kohei Uehara1, Tatsuya Harada1 1The University of Tokyo, 2RIKEN |
|
CLEVR-3D-DeRef
Mary Martin, Martha Palmer, Maria Pacheco University of Colorado Boulder |
| 09:00 - 10:40 | Session P8.5: Sign Languages - Poster Area |
|
Bridging Text-to-Sign Translation via Codebook-Oriented Pretraining
Ninlawat Phuangchoke and Chantri Polprasert Asian Institute of Technology (AIT) |
|
A Resource and Evaluation Method for Phonological Continuity in Japanese Sign Language
Jundai Inoue1, Daisuke Hara2, Makoto Miwa2 1Knowledge and Data Engineering Lab, Toyota Technological Institute at Japan, 2Toyota Technological Institute |
|
Sentiment Analysis of German Sign Language Fairy Tales
Fabrizio Nunnari1, Siddhant Jain1, Patrick Gebhard2 1German Research Center for Artificial Intelligence (DFKI), 2DFKI |
|
A Critical Study of Automatic Evaluation in Sign Language Translation
Shakib Yazdani1, Yasser Hamidullah2, Cristina España-Bonet3, Eleftherios Avramidis4, Josef van Genabith2 1German Research Center for Artificial Intelligence (DFKI), 2DFKI, 3BSC/DFKI GmbH, 4Alangu; German Research Center for Artificial Intelligence (DFKI) |
|
How Much Data Is Enough Data? A New Motion Capture Corpus for Probabilistic Sign Language Generation
Anna Klezovich1, Johanna Mesch2, Gustav Eje Henter3, Jonas Beskow4 1Division of Speech, Music and Hearing, KTH, 2Stockholm University, 3KTH Royal Institute of Technology, 4KTH Speech, Music and Hearing |
|
Decomposing Sign Language Movements: A Multi-Band Visualization Method for Articulatory Analysis
Antonio F. G. Sevilla and José María Lahoz-Bengoechea Universidad Complutense de Madrid |
| 10:40 - 11:00 | Coffee Break |
| 11:00 - 12:40 | Session O33: Psycholinguistics, Cognitive Linguistics and Linguistic Theories - Room 1 |
| 11:00 - 11:20 |
Implicit Bias in Peer Review: Through the Lens of Language Abstraction
Xulang Zhang, Rui Mao, Erik Cambria Nanyang Technological University |
| 11:20 - 11:40 |
The PARLO Dementia Corpus: A German Multi-Center Resource for Alzheimer's Disease
Franziska Braun1, Christopher Witzl2, Florian Hönig3, Elmar Nöth4, Tobias Bocklet2, Korbinian Riedhammer5 1Technische Hochschule Nürnberg Georg Simon Ohm, 2Technische Hochschule Nürnberg, 3KST Institut GmbH, Bad Emstal, 4Friedrich-Alexander-University Erlangen-Nuremberg, 5Technische Hochschule Nuernberg Georg Simon Ohm |
| 11:40 - 12:00 |
Lexical and Discourse Semantics in a Reading-time Corpus of English
Jakub Dotlacil1, Laia Fortuny1, Li Kloostra1, Johan Bos2 1Utrecht University, 2University of Groningen |
| 12:00 - 12:20 |
Semantic Capacity in Language Learners and LLMs: A Case Study of Quantifier Scope
Shaohua Fang, Yue Li, Yan Cong Purdue University |
| 11:00 - 12:40 | Session O34: Opinion and Argument Mining - Room 2 |
| 11:00 - 11:20 |
Disambiguation of Emotion Annotations by Contextualizing Events in Plausible Narratives
Johannes Schaefer1 and Roman Klinger2 1Fundamentals of Natural Language Processing, 2University of Bamberg |
| 11:20 - 11:40 |
Identifying Contexts of Distress in College Students' Reddit Posts: A Comparative Study of Classical NLP and Large Language Models
Carine Graff and Nikhil Krishnaswamy Colorado State University |
| 11:40 - 12:00 |
TiC-MuFormer: Time-Aware Caption-Integrated Multimodal Transformers for User-Level Mental Health Modeling
Georgios Tsoumplekas, Yannis Spyridis, Vasileios Argyriou Kingston University |
| 12:00 - 12:20 |
Improving Neural Argumentative Stance Classification in Controversial Topics with Emotion-Lexicon Features
Mohammad Yeghaneh Abkenar1, Weixing Wang2, Manfred Stede1, Mark Finlayson3, Davide Picca4, Panagiotis Ioannidis5 1University of Potsdam, 2Hasso Plattner Institute, 3FIU, 4University of Lausanne, 5PI Squared Insights |
| 12:20 - 12:40 |
Emotion Transcription in Conversation: A Benchmark for Capturing Subtle and Complex Emotional States through Natural Language
Yoshiki Tanaka1, Ryuichi Uehara1, Koji Inoue2, Michimasa Inaba1 1The University of Electro-Communications, 2Kyoto University |
| 11:00 - 12:40 | Session O35: Parsing - Room 3 |
| 11:00 - 11:20 |
SETUP: Sentence-level English-To-Uniform Meaning Representation Parser
Emma Markle, Javier Gutierrez Bach, Shira Wein Amherst College |
| 11:20 - 11:40 |
This One or That One? A Study on Accessibility via Demonstratives with Multimodal Large Language Models
Yu Wang1, Emmanuele Chersoni2, Chu-Ren Huang3 1The Hong Kong Polytechnic University, 2Hong Kong Polytechnic University, 3The Hong Kong Polytechnic University |
| 11:40 - 12:00 |
AMR Parsing beyond English: An Experiment on Bulgarian, French, Hungarian and Ukrainian
Ivaylo Mitov1, Tadzhat Marharian1, Zsofia Hauk1, Samba FALL1, Maxime Amblard2, Bruno Guillaume3 1Institut des sciences du Digital, Management & Cognition, 2Université de Lorraine, 3LORIA / Inria Nancy Grand-Est |
| 12:00 - 12:20 |
Semantic Parsing for Evaluating Large Language Models: Separating Linguistic Abilities with YARN
Rémi DE VERGNETTE1 and Maxime Amblard2 1Université de Lorraine, CNRS, Inria, LORIA, F-53999 Nancy, France, 2Université de Lorraine |
| 12:20 - 12:40 |
Two Ojibwe Constraint Grammars: Morphological Disambiguation and Dependency Parsing
Matthias Diederichsen and Christopher Hammerly University of British Columbia |
| 11:00 - 12:40 | Session O36: Multimodality and Speech - Room 4 |
| 11:00 - 11:20 |
Multimodal LLMs Do Not Compose Skills Optimally across Modalities
Paula Ontalvilla1, Aitor Ormazabal2, Gorka Azkune3 1HiTZ Center - Ixa, University of the Basque Country (UPV/EHU), 2University of the Basque Country, 3University of Basque Country |
| 11:20 - 11:40 |
Code-Switching in End-to-End Automatic Speech Recognition: A Systematic Literature Review
Maha Tufail Agro1, Atharva Kulkarni2, Karima Kadaoui1, Zeerak Talat3, Hanan Aldarmaki2 1Mohamed bin Zayed University of Artificial Intelligence, 2MBZUAI, 3University of Edinburgh |
| 11:40 - 12:00 |
MUStReason: A Benchmark for Diagnosing Pragmatic Reasoning in VideoLMs for Multimodal Sarcasm Detection
Anisha Saha1, Varsha Suresh2, Timothy Hospedales3, Vera Demberg2 1Max Planck Institute for Informatics, Saarland Informatics Campus, 2Saarland University, 3University of Edinburgh |
| 12:00 - 12:20 |
Human-Centered Multimodal Fusion for Sexism Detection in Memes with Eye-Tracking, Heart Rate, and EEG Signals
Iván Arcos Gabaldón, Paolo Rosso, Elena Gomis Vicent Universitat Politècnica de València, UPV |
| 12:20 - 12:40 |
Nos_Brais-GL: A FAIR Galician TTS Corpus for Neural Speech Synthesis
Adina Vladu1, Antonio Moscoso Sánchez2, Carmen Magariños3, María Perez Lago1, Elisa Fernández Rei1 1Instituto da Lingua Galega, Universidade de Santiago de Compostela, 2Instituto da Lingua Galega, Centro Singular en Tecnoloxías Intelixentes, Universidade de Santiago de Compostela, 3Instituto da Lingua Galega, Departamento de Electrónica e Computación, Universidade de Santiago de Compostela |
| 11:00 - 12:40 | Session P9.1: Natural Language Generation - Poster Area |
|
DR-CUP: A Dataset on Real-time Commentary in U.S. Presidential Debates
Yu-Yu Chang1, Huan-Wen Ho1, Chung-Chi Chen2, Ming-Hung Wang3 1National Chung Cheng University, 2National Institute of Advanced Industrial Science and Technology, 3National Chung Cheng University |
|
Russian Generative Spelling, Punctuation and Capitalization Correction
Nikita Martynov1, Danil Astafurov2, Ulyana Isaeva1, Ivan Maksimov3, Joqsan Azocar4, Dmitrii Kosenko4, Alena Fenogenova5 1SaluteDevices, 2ITMO University, 3Moscow Institute of Physics and Technology, 4MIPT, 5SberAI |
|
Using Multimodal and Language-Agnostic Sentence Embeddings for Abstractive Summarization
Chaimae Chellaf El Hammoud1, Salima Mdhaffar2, Yannick Estève3, Stéphane Huet4 1Avignon, 2Avignon university, 3LIA - Avignon Université, 4Université d'Avignon |
|
Gradient-Controlled Decoding: A Safety Guardrail for LLMs with Dual-Anchor Steering
Purva Chiniya1, Kevin Scaria2, Sagar Chaturvedi1 1Amazon, 2Amazon.com |
|
The Chronicles of RiDiC: Generating Datasets with Controlled Popularity Distribution for Long-form Factuality Evaluation
Pavel Braslavski1, Dmitrii Iarosh2, Nikita Sushko3, Andrey Sakhovskiy4, Vasily Konovalov5, Elena Tutubalina6, Alexander Panchenko7 1HSE University, 2Skolkovo Institute of Science and Technology, Russia, 3Independent Researcher, 4Sber AI, Russia; Skoltech, Russia, 5Affiliation, 6HSE University, Russia and Kazan Federal University, Russia and AIRI, Russia and Insilico Medicine Hong Kong, Hong Kong, 7S-NLP |
|
MeteoGalEus: An Iberian Multilingual Weather Dataset in Galician, Euskera, and Spanish
Ainhoa Vivel-Couso1, Nella Zabrina Pramata2, David Robredo3, Aitor Soroa4, Jose Maria Alonso-Moral1 1University of Santiago de Compostela, 2University of Basque Country, 3Universidade de Santiago de Compostela, 4HiTZ Center - Ixa, University of the Basque Country UPV/EHU |
|
RadTimeline: Timeline Summarization for Longitudinal Radiological Lung Findings
Sitong Zhou, Meliha Yetisgen, Mari Ostendorf University of Washington |
|
InstructSum: A Benchmark to Evaluate Instruction-Following Capability of Large Language Models in Summarization
Kosuke Nishida1, Kyosuke Nishida2, Itsumi Saito3 1NTT, 2NTT Human Informatics Laboratories, 3Tohoku University |
|
NOVELSUM: Evaluating Long-Form Summary Generation for Historical Scandinavian Novels
Ali Al-Laith, Alexander Conroy, Kirstine Degn, Jens Bjerring-Hansen, Daniel Hershcovich University of Copenhagen |
|
Evaluating Large Language Models for Text-to-Gloss Translation in Kazakh-Russian Sign Language: A Pilot Study
Zhanibek Kozhirbayev1 and Alfarabi Imashev2 1National Laboratory Astana, Nazarbayev University, 2Nazarbayev University |
|
HotelCheckSpan: A Benchmark Dataset for LLM Faithfulness
Patricia Schmidtova1, Ondrej Dusek1, Saad Mahamood2 1Charles University, 2Shopware |
| 11:00 - 12:40 | Session P9.2.1: Machine Learning II - Poster Area |
|
Procrustes Analysis for Improving Language Model Merging
Olivier Ferret CEA List |
|
MetaCORA: A Meta-Learned Curriculum for Adversarial and Contrastive Robustness in Speech Recognition
Yuqian Dai, Chun Fai Chan, Ying Ki Wong, Tsz Ho Pun Logistics and Supply Chain MultiTech R&D Centre Limited |
|
Insights from Transfer Learning Experiments with Word-in-Context and Word Sense Disambiguation Models
Alp Mujko and Dominik Schlechtweg University of Stuttgart |
|
Joint Identification and Induction of Semantic Frames with Scalable Semi-Supervised Graph Clustering
Fabian Barteld1, Steffen Remus2, Saba Anwar2, Julian Stawecki1, Alexander Ziem1, Chris Biemann2 1Heinrich Heine University Düsseldorf, 2Universität Hamburg |
|
Low-Rank Compression of Language Models via Differentiable Rank Selection
Sidhant Sundrani, Francesco Tudisco, Pasquale Minervini University of Edinburgh |
|
Self-supervised Data Augmentation for Text Classification in Low-Data Settings
Deyu Ding1, Mengying Wang2, Andreas Spitz2 1Southern University of Science and Technology, 2University of Konstanz |
|
Distribution-aware Low-bitwidth Quantization for Large Language Models
Bao Huynh, Takashi Tsunakawa, Masafumi Nishida Shizuoka University |
|
TG-ASR: Translation-Guided Learning with Parallel Gated Cross Attention for Low-Resource Automatic Speech Recognition
ChengYeh Yang1, Chien-Chun Wang1, Li-Wei Chen2, Hung-Shin Lee2, Hsin-Min Wang3, Berlin Chen1 1National Taiwan Normal University, 2United Link Co., Ltd., 3Institute of Information Science, Academia Sinica |
|
Harnessing Synergy in Context and Emoji for Joint Detection of Harmful Online Content in Multi-turn Conversations
Feiyan Hu, Ciara Byrne, Jiang Zhou, Rena Maycock, Mark Langan Chirp |
|
Dynamic Layer Selection for Efficient Tone Recognition in Self-Supervised Speech Models
Saint Germes BENGONO OBIANG, Norbert TSOPZE, Paulin MELATAGIA YONTA University of Yaounde 1 |
|
Intent Recognition in Speech-to-Text Processing in the Context of Natural Interaction with Cognitive Assistive Systems
Behnam Ensan1, Magnus Jung1, Matthias Busch1, Andreas Wendemuth2 1doctoral candidate, 2Professor for Cognitive Systems, University Magdeburg |
|
Merging Continual Pretraining Models for Domain-Specialized LLMs: A Case Study in Finance
Kentaro Ueda1, François Portet2, Hirohiko Suwa1, Keiichi Yasumoto1 1Nara Institute of Science and Technology, 2Université Grenoble Alpes |
|
Phonetic-based Ranking for Improved Pseudo-Labeling in Low-Resource ASR
Marco Matassoni1, Roberto Gretter1, Falavigna Daniele1, Mohamed Nabih Ali Mohamed Nawar1, Alessio Brutti1, Matteo Negri1, Mauro Cettolo1, Marco Gaido2, Sara Papi1, Luisa Bentivogli1 1Fondazione Bruno Kessler, 2Fondazione Bruno Kessler, University of Trento |
|
Privacy-Preserving Information Extraction with Local LLMs: A Comparative Study on Dutch Debt Collection Letters
Beyza Celep, Natalia Amat-Lefort, Joost Visser Leiden University |
| 11:00 - 12:40 | Session P9.2.2: Machine Learning III - Poster Area |
|
Forewarned Is Forearmed: When Non-Sequential Embedding Turns into an Anomaly Detector
Elys Allesiardo, Antoine Caubrière, Valentin Vielzeuf Orange Research |
|
A Joint Detection Framework for Latvian Loanwords and Calques Using Monolingual Data
Yelingyun Zhang, Atis Kapenieks, Marina Platonova Riga Technical University |
|
Pantagruel: Unified Self-Supervised Encoders for French Text and Speech
Phuong-Hang Le1, Valentin Pelloin2, Arnault Chatelain3, Maryem Bouziane4, Mohammed Ghennai5, Qianwen Guan6, Kirill Milintsevich7, Salima Mdhaffar8, Aidan Mannion9, Nils Defauw10, Shuyue Gu6, Alexandre Audibert11, Marco Dinarelli12, Yannick Estève13, Lorraine Goeuriot9, Steffen Lalande7, Nicolas Hervé2, Maximin Coavoux14, François Portet15, Étienne Ollion16, Marie Candito17, Maxime Peyrard5, Solange Rossato12, Benjamin Lecouteux18, Aurélie Nardy19, Gilles Sérasset11, Vincent Segonne20, Solène Evain5, Diandra Fabre5, Didier Schwab21 1Saclay AI, 2INA, 3CREST (Ecole Polytechnique, ENSAE, CNRS), 4Avignon Université, LIA, 5Univ. Grenoble Alpes, CNRS, Grenoble INP, LIG, 6Université Paris Cité, 7Institut national de l'audiovisuel, 8Avignon university, 9LIG, Université Grenoble Alpes, 10Univ. Grenoble Alpes, CNRS, Grenoble INP, 11Université Grenoble Alpes, 12LIG, 13LIA - Avignon Université, 14CNRS, Univ Grenoble Alpes, 15Univ Grenoble Alpes, Laboratoire d'Informatique de Grenoble, 16CNRS-CREST, 17LLF, Université Paris Cité, 18LIG/GETALP, 19Lidilem, 20IRISA - Université Bretagne Sud, 21Univ. Grenoble Alpes |
|
Merge and Conquer: Instructing Multilingual Models by Adding Target Language Weights
Eneko Valero1, Maria Ribalta i Albado1, Oscar Sainz1, Naiara Perez2, German Rigau3 1University of the Basque Country (UPV/EHU), 2University of the Basque Country, 3UPV/EHU |
|
SemiAdapt: Semi-Supervised and Efficient LoRA-Based Domain Adaptation for Low-Resource Irish Machine Translation with Transformers
Josh Mcgiff and Nikola Nikolov University of Limerick |
|
Data Selection Effects on Self-Supervised Learning of Audio Representations for French Audiovisual Broadcasts
Valentin Pelloin1, Lina Bekkali2, Reda Dehak3, David Doukhan4 1INA, 2École nationale des ponts et chaussées (ENPC), 3EPITA, 4Institut national de l'audiovisuel (Ina) |
|
SENS-ASR: Semantic Embedding Injection in Neural-transducer for Streaming Automatic Speech Recognition
Youness Dkhissi1, Valentin Vielzeuf2, Elys Allesiardo1, Anthony Larcher3 1Orange Innovation, 2Orange Research, 3Université du Mans - LIUM |
|
Efficient Financial Language Understanding via Distillation with Synthetic Data
Wen-Fong (Xavier) Huang and Edwin Simpson University of Bristol |
|
Rubric-Guided Fine-tuning of SpeechLLMs for Multi-Aspect, Multi-Rater L2 Reading-Speech Assessment
Aditya Kamlesh Parikh1, Cristian Tejedor-García2, Catia Cucchiarini3, Helmer Strik4 1Radboud University, 2CLST, Radboud University, 3Radboud University Nijmegen/Nederlandse Taalunie, 4Centre for Language and Speech Technology (CLST), Centre for Language Studies (CLS), Radboud University Nijmegen |
|
Leveraging Semi-Supervised Learning for Multimodal Hate Speech Data Annotation and Detection
Rathi Adarshi Rammohan1, Zhao Ren1, Dominik Puchala2, Aleksandra Swiderska2, Dennis Küster1, Tanja Schultz1 1University of Bremen, 2University of Warsaw |
|
Lexicalized Constituency Parsing for Middle Dutch: Low-resource Training and Cross-Domain Generalization
Yiming Liang1 and Fang Zhao2 1Universiteit Gent, 2Université Paris Cité & Laboratoire de Linguistique Formelle |
|
Quantifying the Accuracy and Cost Impact of Design Decisions in Budget-Constrained Agentic LLM Search
Kyle McCleary and James Ghawaly Louisiana State University |
|
Reason-to-Learn (R2L): Multi-Agent Knowledge Distillation for Lightweight LLMs in Sentiment Analysis
Le-Huy Tu1, Quan Nguyen2, Vincent NGUYEN3, Johanna Bjorklund4, Xuan-Son Vu5 1DopikAI JSC., 2Umeå University, 3University of Orleans, INSA CVL, LIFO EA, France, 4Umea University, 5Lund University and DeepTensor AB |
|
PRiSM: Partial Ranking via Inter-layer Semantic Measurement for Efficient Fine-tuning of Language Models
Aldrin Biswas1, Md Fahim2, Md. Amin1, Amin Ali1, AKM Rahman1 1Center for Computational & Data Sciences, Independent University, Bangladesh, 2Center for Computational & Data Sciences at Independent University, Bangladesh (IUB) |
| 11:00 - 12:40 | Session P9.3.1: Language Modeling and LRs III - Poster Area |
|
Beyond Literal Meaning: How LLMs Interpret Yemeni Proverbs
Nasser Thmer1, Ali Al-Laith2, Muhammad Shoaib1 1UET LAHORE, 2University of Copenhagen |
|
SEFL: A Framework for Generating Synthetic Educational Assignment Feedback with LLM Agents
Mike Zhang1, Amalie Dilling2, Léon Gondelman2, Niels Lyngdorf2, Euan Lindsay2, Johannes Bjerva3 1University of Copenhagen, 2Aalborg University, 3Department of Computer Science, Aalborg University |
|
LGSE: Lexically Grounded Subword Embedding Initialization for Low-Resource Language Adaptation
Hailay Kidu Teklehaymanot1, Dren Fazlija2, Wolfgang Nejdl1 1L3S Research Center, 2L3S Research Center, Leibniz University Hannover |
|
A Cheap Lunch: Synthetic Annotation with Minimal Human Effort for Medical Text Mining
Shutao Chen and Piek Vossen Vrije Universiteit Amsterdam |
|
Supervised Contrastive Fine-Tuning for Active Few-Shot Learning
Zirui Zhang, Lei Ge, Shengyu Qiao Information Engineering University |
|
Simulating Student Interactions for Virtual Pretesting with In-Context Learning
Arthur Thuy1, Luca Benedetto2, Ekaterina Loginova3, Dries Benoit1 1Ghent University, 2University of Cambridge, Institut Polytechnique de Paris, 3Dedalus Healthcare |
|
An Exploration-Analysis-Disambiguation Reasoning Framework for Word Sense Disambiguation with Low-Parameter LLMs
Deshan Sumanathilaka, Nicholas Micallef, Julian Hough Swansea University |
|
Building Effective Japanese Medical LLMs with an Open Recipe for Domain Adaptation through Continued Pre-training
Akiko Aizawa1, Yuki Arase2, Fei Cheng3, Jiahao Huang4, Zhiyi Huang2, Junfeng Jiang4, Teruhito Kanazawa1, Daisuke Kawahara5, Kazuma Kobayashi1, Takashi Kodama3, Sadao Kurohashi3, Yusuke Oda1, Yuma Tsuta1, Zhen Wan3, Zhishen Yang1, Rio Yokota2 1National Institute of Informatics, 2Institute of Science Tokyo, 3Kyoto University, 4University of Tokyo, 5Waseda University |
|
New Encoders for German Trained from Scratch: Comparing ModernGBERT with Converted LLM2Vec Models
Julia Wunderle1, Anton Ehrmanntraut2, Jan Pfister3, Fotis Jannidis2, Andreas Hotho4 1University of Wuerzburg, 2Universität Würzburg, 3Julius-Maximilians-Universität Würzburg (JMU), 4University of Würzburg |
|
Arabic ChartSumm: An English-to-Arabic Benchmark for Metadata-to-Text Summarization
Passant Elchafei1 and Amany Fashwan2 1Ulm University, Germany, 2Phonetics and Linguistics Department, Faculty of Arts, Alexandria University, Alexandria |
|
Introducing a Bangla Sentence Gloss Pair Dataset for Bangla Sign Language Translation and Research
Neelavro Saha, Rafi Shahriyar, Nafis Roudra, Saadman Sakib, Annajiat Rasel BRAC University |
|
Language Models as Semantic Augmenters for Sequential Recommenders
Mahsa Valizadeh, Xiangjue Dong, Rui Tuo, James Caverlee Texas A&M University |
|
Efficient Adaptation of English Language Models for Morphologically Rich and Underrepresented Languages: The Case of Arabic
Ahmed Eldamaty1, Mohamed Abdelrahman2, Mohamed Elbehery1, Mariam Ashraf1, Radwa Elshawi2 1Giza Systems, 2University of Tartu |
| 11:00 - 12:40 | Session P9.3.2: Language Modeling and LRs IV - Poster Area |
|
GhostWriter: Hidden AI-Generated Texts over Multiple Languages, Domains and Generators
Manuel Schaaf1, Kevin Bönisch2, Alexander Mehler1 1Goethe-University Frankfurt am Main, 2Text Technology Lab, Goethe-University |
|
Using LLMs to Extract Instances of Schematic Constructions from Unannotated L2 Learner Corpora
Jelena Kallas1, Ahto Kiil2, Heete Sahkai1, Geda Paulsen3, Kertu Saul4 1Institute of the Estonian Language, 2University of Tartu, 3Institute of the Estonian Language, Uppsala University, 4Institute of the Estonian Language, University of Tartu |
|
Corruption-Based Data Augmentation for Arabic Essay Scoring: A Preliminary Study on the Organization Trait
May Bashendy and Tamer Elsayed Qatar University |
|
Structured Prompting for Arabic Essay Proficiency: A Trait-Centric Evaluation Approach
Salim Al Mandhari1, Hieu Pham Dinh2, Mo El-Haj2, Paul Rayson1 1Lancaster University, 2VinUniversity |
|
ManufactuBERT: Efficient Continual Pretraining for Manufacturing
Robin Armingaud and Romaric Besancon CEA LIST |
|
Smigiel Dataset: Laying Foundations for Investigating Machine-Generated Text Detection in Polish
Jakub Strebeyko1, Alina Wróblewska2, Piotr Przybyla3 1University of Warsaw, Warsaw, Poland, 2Institute of Computer Science, Polish Academy of Sciences, 3Universitat Pompeu Fabra |
|
Extracting Medical Image-Related Entities from Spanish Electronic Health Records Using NER Methods
Alexander Platas1, Marcos Merino1, Elena Zotova1, Montse Cuadros1, Karen López-Linares1, Mikel Pérez de Mendiola2, María Gálvez2, Cristina Barba2, Antón Asla2 1Vicomtech, 2Serikat |
|
A Novel Synthetic Dataset for Few-Shot Legal Relation Extraction in German
Shiva Banasaz Nouri1, Elena Leitner2, Julian Moreno-Schneider2, Georg Rehm2 1TU Berlin, 2DFKI |
|
LLM-Based Data Generation and Clinical Skills Evaluation for Low-Resource French OSCEs
Tian Huang1, Tom Bourgeade2, Irina Illina3 1LORIA, University of Lorraine, 2LORIA - INRIA, University of Lorraine, 3LORIA/INRIA |
|
Instruction-Tuned Urdu LLMs: Efficient Adaptation of Llama Models and Evaluation Resources for Urdu
Munief Tahir1, Sana Shams2, Sarmad Hussain3, Miriam Butt4 1Al Khawarizmi Institute of Computer Science, 2Al-Khawarizmi Institute of Computer Science, University of Engineering and Technology, 3Center for Language Engineering, KICS, UET, 4University of Konstanz |
|
Is Biomedical Specialization Still Worth It? Insights from Domain-Adaptive Language Modelling with a New French Health Corpus
Aidan Mannion1, Cécile Macaire1, Armand Violle2, Stéphane Ohayon2, Xavier Tannier3, Didier Schwab4, Lorraine Goeuriot1, François Portet5 1LIG, Université Grenoble Alpes, 2LIMICS, Sorbonne Université, INSERM, 3Limics, Sorbonne Université, 4Univ. Grenoble Alpes, 5Univ Grenoble Alpes, Laboratoire d'Informatique de Grenoble |
|
TildeOpen LLM: Leveraging Curriculum Learning to Achieve Equitable Language Representation
Toms Bergmanis, Ingus Pretkalnin, Martins Kronis, Davis Nicmanis, Jelizaveta Jelinska, Roberts Rozis, Rinalds Viksna, Marcis Pinnis Tilde |
|
Common Sense vs. Morality: The Curious Case of Narrative Focus Bias in LLMs
Saugata Purkayastha1, Pranav Kushare1, Pragya Pal1, Sukannya Purkayastha2 1Saarland University, 2TU Darmstadt |
| 11:00 - 12:40 | Session P9.3.3: Language Modeling and LRs V - Poster Area |
|
"Emphasizing the Commendable": A Study of Homogenized Transitive Verb Constructions in Machine Generated Peer Reviews
Hing-Yuet Fung1, Chi-kiu Lo2, Samuel Larkin3 1Independent Researcher, 2National Research Council of Canada, 3National Research Council Canada |
|
CoDAE: Adapting Large Language Models for Education via Chain-of-Thought Data Augmentation
Shuzhou Yuan1, William LaCroix2, Hardik Ghoshal3, Ercong Nie4, Michael Färber3 1Dresden University of Technology, 2Saarland University, 3TU Dresden, 4Centre for Information and Language Processing, LMU Munich |
|
Synthetic Instruction Generation for Low-Resource Nordic Languages: Viability and Limitations in LLM Instruction-Tuning
Mathias Stenlund1, Annika Simonsen1, Lars Bungum2, Jan Ebert3, Jiangtao Wang3, Oleg Filatov3, Hemanadhan Myneni1, Morris Riedel1, Hafsteinn Einarsson1 1University of Iceland, 2NTNU, 3Jülich Supercomputing Centre |
|
AYN: A Tiny Yet Competitive Indian Legal Language Model Pretrained from Scratch
Mitodru Niyogi1, Eric Gaussier2, Arnab Bhattacharya3 1CNRS, 2Univ. Grenoble Alpes, 3Dept. of Computer Science and Engineering, IIT Kanpur |
|
Low-Resource Dialect Adaptation of Large Language Models: A French Dialect Case-Study
Eeham Khan1, Firas Saidani2, Owen Van Esbroeck1, Richard Khoury2, Leila Kosseim1 1Concordia University, 2Université de Laval |
|
Reformulate and Create, Don't Translate: Creating Natural Prompts for Underserved Languages
Annika Simonsen1, Mathias Stenlund2, Lars Bungum3, Marc Volhardt2, Hafsteinn Einarsson2 1The University of Iceland, 2University of Iceland, 3Norwegian University of Science and Technology |
|
Generating High Quality Synthetic Data for Dutch Medical Conversations
Cecilia Kuan1, Aditya Kamlesh Parikh1, Henk van den Heuvel2 1Radboud University, 2CLS/CLST, Radboud University Nijmegen |
|
DeepICD-R1: Medical Reasoning through Hierarchical Rewards and Unsupervised Distillation
Tom Röhr1, Thomas Steffek1, Roman Teucher2, Keno Bressem3, Alexei Figueroa1, Paul Grundmann1, Peter Troeger1, Felix Gers1, Alexander Löser1 1Berliner Hochschule für Technik (BHT), 2Fraunhofer Research Engineer, 3Department of Diagnostic and Interventional Radiology, School of Medicine, University Hospital Rechts der Isar, Technical University of Munich |
|
SynthLLM: An LLM-based Scalable Synthetic Data Generation Pipeline for Low-Resource Languages
Solmaz Panahi1, Vasudevan Nedumpozhimana2, John Kelleher3 1Maynooth University, 2TU Dublin, 3Trinity College Dublin |
|
Persona-Conditioned Generation of Patient Self-Reports from EHRs
Yuexin Wu1, Jianming Wei2, Vasile Rus1 1University of Memphis, 2University Medical Center Utrecht |
|
SocialStep: Fast Prediction of Social Determinants of Health
Paul Landes1, Adam Cross2, Jimeng Sun3 1University of Illinois at Chicago, 2University of Illinois College of Medicine Peoria, 3University of Illinois Urbana-Champaign |
|
Dynamically Acquiring Text Content to Enable the Classification of Lesser-known Entities for Real-world Tasks
Fahmida Alam and Ellen Riloff University of Arizona |
|
RILEC: Detection and Generation of L1 Russian Interference Errors in English Learner Texts
Darya Kharlamova1 and Irina Proskurina2 1National Research University Higher School of Economics, 2Laboratoire Hubert Curien, UMR CNRS 5516, Saint-Etienne, France, Université Claude Bernard Lyon 1, Université Lumière Lyon 2, ERIC, 69100, Villeurbanne, France |
| 12:40 - 14:10 | Lunch Break |
| 14:10 - 14:55 | Keynote Speaker: Dan Jurafsky - Room 1 |
| 14:55 - 15:00 | Short Break (5 min) |
| 15:00 - 16:40 | Session O37: Evaluation, Validation, Quality Assurance - Room 1 |
| 15:00 - 15:20 |
Critical Foreign Policy Decision (CFPD) Benchmark: Measuring Diplomatic Preferences of Large Language Models
Benjamin Jensen1, Ian Reynolds1, Yasir Atalan1, Michael Garcia2, Austin Woo2, Anthony Chen2, Trevor Howarth2 1Center for Strategic and International Studies, 2Scale AI |
| 15:20 - 15:40 |
CrisisCL: A Domain Incremental Learning Benchmark for Crisis Management
Paul Le Van Kiem1, Romain Meunier1, Farah Benamara2, Véronique MORICEAU3 1IRIT, 2University of Toulouse, 3IRIT, Université de Toulouse |
| 15:40 - 16:00 |
Towards Consistent Detection of Cognitive Distortions: LLM-Based Annotation and Dataset-Agnostic Evaluation
Neha Sharma1, Navneet Agarwal2, Kairit Sirts1 1University of Tartu, 2EXAI, University of Tartu |
| 16:00 - 16:20 |
LLMs as Annotators: Evaluating Model-Human Alignment in Detecting Contentious Language in Historical Corpora
Yahui Zhao1, Clemencia Siro2, Laura Hollink1 1Centrum Wiskunde & Informatica (CWI), 2Centrum Wiskunde & Informatica |
| 16:20 - 16:40 |
Widespread Gender and Pronoun Bias in Moral Judgments across LLMs
Gustavo Fernandes, Jeiverson Santos, Pedro O.S Vaz-de-Melo UFMG |
| 15:00 - 16:40 | Session O38: Knowledge Discovery and Representation - Room 2 |
| 15:00 - 15:20 |
Frame2KG: A Benchmark and Evaluation Toolkit for Interpretable Frame-to-Graph Generation
Lewis Watson, Carl Strathearn, Kenny Mitchell, Yanchao Yu Edinburgh Napier University |
| 15:20 - 15:40 |
Injecting Structured Biomedical Knowledge into Language Models: Continual Pretraining vs. GraphRAG
Jaafer Klila1, Sondes Bannour Souihi2, Rahma Boujelbane3, Nasredine Semmar4, Lamia Hadrich-Belguith5 1PhD student, 2CEA, 3FSEGS, 4CEA LIST, 5ANLP Research Group, MIRACL Lab, FSEGS, Sfax University |
| 15:40 - 16:00 |
Linguistic Knowledge Graphs for Sense Prediction: A Case-study on Latin
Eleonora Ghizzota1, Paola Marongiu2, Pierpaolo Basile3, Stefano Ferilli4, Barbara McGillivray5 1University of Bari Aldo Moro, 2CNR-ILC, Istituto di Linguistica Computazionale 'A. Zampolli', 3Department of Computer Science, University of Bari Aldo Moro, 4Università degli Studi di Bari, 5King's College London |
| 16:00 - 16:20 |
ACID: On the Perception of Online Classism
Arianna Muti1, Elisa Bassignana2, Amanda Cercas Curry1, Federica Durante3, Dirk Hovy1, Debora Nozza1 1Bocconi University, 2IT University of Copenhagen, 3Università Milano Bicocca |
| 16:20 - 16:40 |
The Spectrum of Sentiment: Optimistic, Pessimistic, and Neutral Voices in Online Depression Discourse
Stefana Tabusca1, Ana-Maria Bucur2, Liviu Dinu1 1University of Bucharest, 2Università della Svizzera italiana |
| 15:00 - 16:40 | Session O39: Applications Involving LRs and Evaluation III - Room 3 |
| 15:00 - 15:20 |
A Benchmark Dataset and Comparative Evaluation of Phonemized and Romanized Urdu for Text-to-Speech
M Kaab Bin Shahid1 and Muhammed Izharuddin2 1University of Stuttgart, 2Aligarh Muslim University |
| 15:20 - 15:40 |
S-VoCAL: A Dataset and Evaluation Framework for Inferring Speaking Voice Character Attributes in Literature
Abigail Berthe-Pardo1, Gaspard Michel2, Elena Epure2, Christophe Cerisara3 1Université de Lorraine, CNRS, LORIA, 2Deezer Research, 3Universite de Lorraine, CNRS, LORIA |
| 15:40 - 16:00 |
BankMathBench: A Benchmark for Numerical Reasoning in Banking Scenarios
Yunseung Lee1, Subin Kim2, Youngjun Kwak2, Jaegul Choo3 1KakaoBank Corp., 2Kakaobank, 3Korea Advanced Institute of Science and Technology |
| 16:00 - 16:20 |
TR-TEB: Turkish Text Embedding Benchmark
Omer Arslan, Atalay Celik, Yusuf Aslan, Hasan Durkaya, Mustafa Zenginoglu, Musa Yilmaz, Merve Kantarci, Mehmet Haklidir TUBITAK BILGEM |
| 16:20 - 16:40 |
Simple Additions, Substantial Gains: Expanding Scripts, Languages, and Lineage Coverage in URIEL+
Mason Shipton1, York Hay Ng2, Aditya Khan2, Phuong Hoang2, Xiang Lu3, A. Seza Dogruoz4, Annie Lee2 1Ontario Tech University, 2University of Toronto, 3University of Michigan, 4Universiteit Gent |
| 15:00 - 16:40 | Session O40: Multimodality, Cross-modality - Room 4 |
| 15:00 - 15:20 |
SciClaimEval: Cross-modal Claim Verification in Scientific Papers
Xanh Ho1, Yun-Ang Wu2, Sunisth Kumar3, Tian Cheng Xia4, Florian Boudin5, Andre Greiner-Petter6, Akiko Aizawa1 1National Institute of Informatics, 2National Taiwan University, 3University of Tokyo, 4University of Bologna, 5Nantes University, 6University of Goettingen |
| 15:20 - 15:40 |
Localizing Events in Space: Comparing Humans and AI Models
Derrick Eui Gyu Kim, Kenneth Lai, James Pustejovsky Brandeis University |
| 15:40 - 16:00 |
STRUDEL: Unrolling a Benchmark for Evaluating Vision-Language Models on Structured Diagram Understanding across Domains
Daniel Steinigen, Lucie Flek, Sebastian Houben Fraunhofer IAIS |
| 16:00 - 16:20 |
VG-CoT: Towards Trustworthy Visual Reasoning via Grounded Chain-of-Thought
Byeonggeuk Lim, Kyeonghyun Kim, Jungmin Yun, Youngbin Kim Chung-ang University |
| 16:20 - 16:40 |
VectorEdits: A Dataset and Benchmark for Instruction-Based Editing of Vector Graphics
Josef Kuchar1, Marek Kadlcik2, Michal Spiegel3, Michal Stefanik1 1Masaryk University, 2Faculty of Informatics, Masaryk University, 3Kempelen Institute of Intelligent Technologies |
| 15:00 - 16:40 | Session P10.1: Social Media - Poster Area |
|
ViWikiFC: Fact-Checking for Vietnamese Wikipedia-Based Textual Knowledge Source
Hung Le1, Long To1, Manh Nguyen1, Kiet Nguyen2 1University of Information Technology, HCM VNU, 2University of Information Technology, VNU-HCM |
|
Automated Extraction of Answer Candidates for Question Generation
Claudia Preda1, Mihai Dascalu1, Stefan Ruseti2, Danielle McNamara3 1National University of Science and Technology POLITEHNICA Bucharest, 2University Politehnica of Bucharest, 3Arizona State University |
|
Green Bots versus Red Bots: Evaluating Large Language Models for Simulating Persuasion Dynamics in Online Influence Campaigns
Majd Al Ali1, Filip Muntean2, Lucia Donatelli1, Jurriaan van Diggelen3 1Vrije Universiteit Amsterdam, 2Vrije Universiteit, 3TNO |
|
Towards Expectation Detection in Language: A Case Study on Treatment Expectations in Reddit
Aswathy Velutharambath1 and Amelie Wührl2 1University of Stuttgart, University of Bamberg, 2IT University of Copenhagen |
|
Empathy Speaks in Metaphors: The Empathy-Metaphor Corpus of Figurative Language in Empathetic Text
Gyeongeun Lee and Natalie Parde University of Illinois at Chicago |
|
A Computational Diachronic Analysis of Gen Z Mental Health Discourse: A Large-scale Reddit Corpus Study from Pre- to Post-COVID
Felix Mao Rye Country Day School |
|
"Oat Milk Vegan Chocolate Taste Great!": Monitoring the Food Transition Debate in Reddit
Greta Zella1, Jan Willem Bolderdijk2, Saskia Peels1, Gerry Wakker1, Tommaso Caselli3 1University of Groningen, 2University of Amsterdam, University of Groningen, 3Rijksuniversiteit Groningen |
|
ClimateChat-300K: A Multi-Modal Facebook Dataset for Understanding Diverse Perspectives in Climate Communication
Wajdi Zaghouani1, Md. Rafiul Biswas2, Mabrouka Bessghaier1, Shimaa Ibrahim1, George Mikros2 1Northwestern University Qatar, 2Hamad Bin Khalifa University |
|
HateMirage: An Explainable Multi-Dimensional Dataset for Decoding Faux Hate and Subtle Online Abuse
Sai Kartheek Reddy Kasu1, Shankar Biradar2, Sunil Saumya3, Md. Shad Akhtar4 1Student, 2Assistant Professor, 3Indian Institute of Information Technology Dharwad, 4Indraprastha Institute of Information Technology, Delhi |
|
MindSET: Advancing Mental Health Benchmarking through Large-Scale Social Media Data
Saad Mankarious1, Edward Kempa2, Daniel Wiechmann3, Elma Kerz4, Yu Qiao5, Ayah Zirikly6 1Cornell College, 2University of Florida, Department of Computer and Information Science and Engineering, 3Institute for Logic Language and Computation, 4Exaia Technologies, 5RWTH Aachen University, 6Johns Hopkins University |
|
A Corpus of Misunderstood Irony on Turkish Social Media
Çagri Çöltekin and Güliz Günes University of Tübingen |
| 15:00 - 16:40 | Session P10.2.1: Linguistics and Psycholinguistics I - Poster Area |
|
A Corpus of Joint EEG and Self-Paced Reading of Natural Dutch Texts
Sara Østergaard, Lenneke Lichtenberg, Laura Boon, Bruno Nicenboim Tilburg University |
|
How Long Does a Quick Kiss Take? Studying Event Duration of Light Verb Constructions Using Explicit Word Embeddings
Lin de Huybrecht and Geraint Wiggins Vrije Universiteit Brussel |
|
Evaluation Drift in LLM Personality Induction: Are We Moving the Goalpost?
Prateek Rajput, Yewei Song, Iyiola Olatunji, Jacques Klein, Tegawendé Bissyande University of Luxembourg |
|
A Multi-Dialectal, Longitudinal Corpus of Human-AI Hybrid Language Production
Qiao Gan1, Jonathan Dunn2, Andrea Nini3, Benjamin Adams1 1University of Canterbury, 2University of Illinois Urbana-Champaign, 3University of Manchester |
|
Semantic Information: A Difference That Makes a Difference
J. Nathanael Philipp1, Max Kölbl2, Michael Richter3 1Sächsische Akademie der Wissenschaften zu Leipzig, 2Osaka University, 3Leipzig University |
|
Modeling the Memory-Surprisal Trade-Off over Time: Communicative Efficiency Decreases with Lexico-Grammatical Change in Scientific English
Julius Steuer1, Marie-Pauline Krielke2, Stefania Degaetano-Ortlieb2, Elke Teich3, Dietrich Klakow2 1Heidelberg Institute for Theoretical Studies, 2Saarland University, 3Universität des Saarlandes |
|
Mechanistic Interpretability Meets Cognitive Linguistics: Modelling Locative Image Schemas in the Circuit Framework
Mattia Proietti1, Afra Alishahi2, Grzegorz Chrupala2, Alessandro Lenci3 1Università di Pisa, 2Tilburg University, 3University of Pisa |
|
Variation Is the Norm: Embracing Sociolinguistics in NLP
Anne-Marie Lutgen1, Alistair Plum1, Verena Blaschke2, Barbara Plank2, Christoph Purschke1 1University of Luxembourg, 2LMU Munich |
|
Appraisal Theory-Informed Emotion Prediction
Xiaowei Wang1, Jayant Teotia2, Rui Mao3, Wandeep Ratan Singh1, Sabrina Tiun1, Erik Cambria4 1Universiti Kebangsaan Malaysia, 2NTU, 3Ruimao Tech, 4Nanyang Technological University |
|
The Evolution of Philosophy: A Metaphorical Cognition Perspective
Rui Mao1, Dapeng Chen2, Zihao Huang3, Xulang Zhang3, Erik Cambria3 1Ruimao Tech, 2Jiangsu Open University, 3Nanyang Technological University |
| 15:00 - 16:40 | Session P10.2.2: Linguistics and Psycholinguistics II - Poster Area |
|
Predicting States of Understanding in Explanatory Interactions Using Cognitive Load-Related Linguistic Cues
Yu Wang1, Olcay Türk1, Angela Grimminger2, Hendrik Buschmeier1 1Bielefeld University, 2Paderborn University |
|
Figurative Language in Alzheimer's Discourse: Linguistic and Neural Alignment in Clinical Narratives
Diana Kylymnyk1, Vitória Tomasel2, Helena Caseli3, Edward Watkins4, Aline Villavicencio5, Rodrigo Wilkens4 1Department of Computer Science and Psychology, University of Exeter, 2Federal University of São Carlos, 3Federal University of São Carlos, 4University of Exeter, 5University of Exeter, UK |
|
Prompting Instruction-tuned LLMs for Semantic Similarity Values
Xander Snelder, Yunchong Huang, Jelke Bloem University of Amsterdam |
|
Towards Dynamic Metaphor Identification: Evaluating GPT O-Series Models on Five Metaphoricity Cues in U.S. Trade Corpora
Berkay Bas1, Jelke Bloem1, Xiaojuan Tan2 1University of Amsterdam, 2VU Amsterdam |
|
Rethinking Evaluation in Retrieval-Augmented Personalized Dialogue: A Cognitive and Linguistic Perspective
Tianyi Zhang1 and David Traum2 1University of Southern California, 2University of Southern California Institute for Creative Technologies |
|
Evaluating Multimodal Large Language Model Narrative Interpretation through the Lens of Appraisal Theory
Jayant Teotia1, Xiaowei Wang2, Xulang Zhang3, Rui Mao3, Erik Cambria3 1NTU, 2Universiti Kebangsaan Malaysia, 3Nanyang Technological University |
|
Mapping Liberty Metaphors across Cultures and Time
Sidney Suen1, Rui Mao1, Kenneth Kwok2, Erik Cambria1 1Nanyang Technological University, 2Agency for Science, Technology and Research |
|
The Sensorimotor Norms for the Chinese Classifiers
Yimei Shao1, Yu-Yin Hsu1, Chu-Ren Huang2 1The Hong Kong Polytechnic University, 2The Hong Kong Polytechnic University |
|
DeepQuestion: Systematic Generation of Real-World Challenges for Evaluating LLMs Performance
Ali Khoramfar, Ali Ramezani, Mohammad Mahdi Mohajeri, Mohammad Javad Dousti, Majid Nili Ahmadabadi, Heshaam Faili University of Tehran |
|
Pragmatic Modelling in Language Learning: Caregiver Question-Answer Feedback in Child-Directed Dialogue
Maryam Bala1, Johannes Heim2, Elspeth Edelstein2, Arabella Sinclair3 1University of Southampton, 2University of Aberdeen, 3University College London |
| 15:00 - 16:40 | Session P10.3.1: Parsing and Tagging I - Poster Area |
|
Modular Approach to Automating Morphological Components in Grammar Engineering
Ekaterina Voloshina1 and Krasimir Angelov2 1University of Gothenburg, Chalmers University of Technology, 2University of Gothenburg and Chalmers University of Technology |
|
MorfFlex: Handling Rich Morphology
Jaroslava Hlavácová1, Marie Mikulová2, Barbora Štěpánková3, Milan Straka3, Jan Hajic2 1CUNI, 2Charles University, 3Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics |
|
Using Valency Inheritance in Building a Valency Lexicon
Václava Kettnerová1, Veronika Kolárová1, Jirí Mírovský2, Michal Olbrich2 1Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics, 2Charles University |
|
From CHAT to Coded CoNLL-U: A Reproducible Pipeline for the Syntactic Annotation and Querying of Child Language Data
Achim Stein University of Stuttgart |
|
TækTåK: Syntactic Analysis of Language Use on Danish TikTok
Thea Kristensen and Rob van der Goot IT University of Copenhagen |
|
Adaptive Chunking: Optimizing Chunking-Method Selection for RAG
Paulo de Moura Júnior, Jean Lelong, Annabelle Blangero Ekimetrics |
|
Do Large Language Models Grasp the Grammar? Evidence from Grammar-Book-Guided Probing in Luxembourgish
Lujun LI1, Yewei Song1, Lama Sleem1, Yiqun Wang1, Yangjie Xu1, Cedric LOTHRITZ2, Niccolo' Gentile3, Radu State1, Tegawendé Bissyandé1, Jacques Klein1 1University of Luxembourg, 2Luxembourg Institute of Science and Technology (LIST), 3Foyer S.A. |
|
Survey of Tools for Manual Linguistic Annotation: Supporting Diversity through Interactive Exploration
Ludovica Pannitto1, Kaja Dobrovoljc2, Bruno Guillaume3 1LILEC - University of Bologna, 2University of Ljubljana, 3LORIA / Inria Nancy Grand-Est |
|
TextLens & LeTTuce: Automated Corpus Annotation and Multilingual Tagging as a Service
Cynthia Van Hee1, Jonas Doumen2, Vincent Prins3, Pranaydeep Singh4, Vincent Vandeghinste3, Els Lefever5 1LT3, Language and Translation Technology Team (Ghent University), 2KU Leuven, imec research group itec, 3Instituut voor de Nederlandse Taal, 4LT3, University of Ghent, 5LT3, Ghent University |
|
The Corpus of Contemporary Polish: A New Reference Corpus with Rich Syntactic Annotations
Witold Kieras1, Malgorzata Marciniak2, Marcin Wolinski1, Katarzyna Krasnowska-Kieras1, Marek Lazinski1 1Institute of Computer Science, Polish Academy of Sciences, 2Institute of Computer Science PAS |
|
Prague Dependency Treebank - Consolidated 2.0: Enriching a Complex Annotation Scheme
Marie Mikulová1, Jirí Mírovský1, Milan Straka2, Pavlína Synková1, Jan Štěpánek3, Barbora Štěpánková2, Jan Hajic1 1Charles University, 2Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics, 3Charles University in Prague, Faculty of Mathematics and Physics, UFAL |
|
Meet UD_Czech-PDTC: A Large and Genre-Rich Treebank in Universal Dependencies
Marie Mikulová1, Barbora Štěpánková2, Daniel Zeman3, Jan Štěpánek4, Milan Straka2, Jan Hajic1 1Charles University, 2Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics, 3Charles University, Faculty of Mathematics and Physics, 4Charles University in Prague, Faculty of Mathematics and Physics, UFAL |
|
Encoding Logical Relations of Chinese Complex Sentences within the Universal Dependencies Framework
Hongpu Zhu and Hongzhi Xu Shanghai International Studies University |
|
Unsupervised Labelling of Mutation Triggers in Welsh
Nicolás Gutiérrez-Rolón and Fernando Alva-Manchego Cardiff University |
| 15:00 - 16:40 | Session P10.3.2: Parsing and Tagging II - Poster Area |
|
UzUDT: Uzbek Universal Dependencies Treebank
Sanatbek Matlatipov1 and Mersaid Aripov2 1Dr, 2Professor |
|
BRAGD: Constrained Multi-Label POS Tagging for Faroese
Annika Simonsen1, Barbara Scalvini2, Uni Johannesen2, Iben Debess2, Hafsteinn Einarsson3, Vésteinn Snæbjarnarson4 1The University of Iceland, 2University of the Faroe Islands, 3University of Iceland, 4University of Copenhagen |
|
Syntactic Sugar for Syntactic Queries: Sequential Representations for Dependency Queries
Niklas Deworetzki1 and Arianna Masciolini2 1Department of Computer Science and Engineering, Chalmers University of Technology and University of Gothenburg, 2University of Gothenburg |
|
Context Is (Almost) Everything: Llama-3 on Structured Output and AMR Parsing
Maja Buljan1, Stephan Oepen2, Lilja Øvrelid3 1Language Technology Group (LTG), University of Oslo, 2Universitetet i Oslo, 3Dept of Informatics, University of Oslo |
|
Towards the Morphological Annotation of North Markian (Low German)
Christian Chiarcos University of Augsburg |
|
Cross-Dataset Inconsistencies in Morphological Annotation: Evidence from Universal Dependencies
Vlasta Ohlídalová Masaryk University |
|
Improving Latvian Morphosyntactic Parsing with Pretrained Encoders and Analyzer-Constrained Decoding
Arturs Znotins Institute of Mathematics and Computer Science, University of Latvia |
|
CommonMorph: Participatory Morphological Documentation Platform
Aso Mahmudi1, Sina Ahmadi2, Kemal Kurniawan3, Rico Sennrich2, Eduard Hovy3, Ekaterina Vylomova3 1The University of Melbourne, 2University of Zurich, 3University of Melbourne |
|
Datasets for Verb Alternations across Languages: BLM Templates and Data Augmentation Strategies
Giuseppe Samo1 and Paola Merlo2 1IDIAP, 2University of Geneva |
|
A Large and Balanced Multi-Domain Arabic Corpus Annotated for Morphology, Syntax, and Readability
Khalid Elmadani1, Adel Mahmoud Wizani2, Hanada Taha Thomure3, Nizar Habash1 1New York University Abu Dhabi, 2University of Turin, 3Zayed University |
|
The DELPH-IN Grammary: A Curated Repository of Grammars and Treebanks
Francis Bond1 and Dan Flickinger2 1Palacky University, 2Stanford University |
|
Morphemes without Borders: Evaluating Root-Pattern Morphology in Arabic Tokenizers and LLMs
Yara Alakeel1, Chatrine Qwaider2, Hanan Aldarmaki2, Sawsan Alqahtani1 1SDAIA, 2MBZUAI |
|
Universal NER v2: Towards a Massively Multilingual Named Entity Recognition Benchmark
Terra Blevins1, Stephen Mayhew2, Marek Suppa3, Hila Gonen4, Shachar Mirkin5, Vasile Pais6, Kaja Dobrovoljc7, Voula Giouli8, Jun Kevin9, Eugene Jang1, Eungseo Kim10, Jeongyeon Seo11, Xenophon Gialis12, Yuval Pinter13 1Northeastern University, 2Duolingo, 3Comenius University in Bratislava, 4UBC, 5Alpinference, 6Research Institute for Artificial Intelligence, Romanian Academy, 7University of Ljubljana, 8Aristotle University of Thessaloniki / ILSP, ATHENA RC, 9Universitas Pelita Harapan, 10Seoul National University, 11Independent Researcher, 12Democritus University of Thrace, 13Ben-Gurion University of the Negev |
| 15:00 - 16:40 | Session P10.4.1: Lexicon and Semantics II - Poster Area |
|
APODICTUS: Automatic Processing of DICTionary Update candidateS
Felix Blessing1, Johannes Sax1, Julian Kaufmann1, Wei Zhao2, Nikolay Arefyev3, Dominik Schlechtweg1 1University of Stuttgart, 2University of Aberdeen, 3University of Oslo |
|
A Test Collection for Part-of-Speech Tagging and Word Sense Disambiguation
Robert Krovetz Lexical Research |
|
Creating a Hybrid Rule and Neural Network Based Semantic Tagger Using Silver Standard Data: The PyMUSAS Framework for Multilingual Semantic Annotation
Andrew Moore1, Paul Rayson1, Dawn Archer2, Tim Czerniak3, Dawn Knight4, Daisy Lal1, Gearóid Ó Donnchadha5, Mícheál Ó Meachair6, Scott Piao1, Elaine Uí Dhonnchadha3, Johanna Vuorinen5, Yan Yabo7, Xiaobin Yang7 1Lancaster University, 2Manchester Metropolitan University, 3Trinity College Dublin, 4Cardiff University, 5independent researcher, 6Fiontar & Scoil na Gaeilge, Dublin City University, 7Hubei University |
|
Scare Quotes as Markers of "Questionable" Word Usages and Misalignment in Conversation: An Annotation Study
Aina Garí Soler1, Juan Carlos Zevallos Huaco2, Matthieu Labeau3, Chloé Clavel4 1PSL University, INRIA Paris, 2Independent Researcher, 3Telecom Paris, 4INRIA |
|
Modeling Clinical Uncertainty in Radiology Reports: From Explicit Uncertainty Markers to Implicit Reasoning Pathways
Paloma Rabaey1, Jong Hak Moon2, Jung-Oh Lee3, Min Gwan Kim4, Hangyul Yoon2, Thomas Demeester1, Edward Choi2 1Ghent University, 2KAIST, 3Mount Sinai Hospital, 4Seoul National University Hospital |
|
ArabDiscrim: A Decade-Long Arabic Facebook Corpus on Racism and Discrimination
Wajdi Zaghouani1, Shimaa Ibrahim1, Mabrouka Bessghaier1, Houda Bouamor2 1Northwestern University Qatar, 2Carnegie Mellon University in Qatar |
|
DAMETA: An LLM Benchmark for Danish Metaphor Interpretation with Systematically Varied Distractors
Nina Schneidermann1, Sanni Nimb2, Nathalie Norman1, Sussi Olsen3, Bolette Pedersen1 1University of Copenhagen, 2Society for Danish Language and Literature (DSL), 3UCPH, NorS, Centre for Language Technology |
|
A New Semantic Artifact Based Framework for Studying and Documenting Algospeak and Related Phenomena
Fahad Khan1, Elisa Gugliotta2, Elisa Squadrito3, Maura Tarquini2, Francesca Frontini4 1Istituto di Linguistica Computazionale "Antonio Zampolli", CNR, 2Università degli Studi di Sassari, 3Università di Macerata, 4Istituto di Linguistica Computazionale "A. Zampolli" - ILC Consiglio Nazionale delle Ricerche - CNR |
|
Creating a High Quality Abstract Meaning Representation Dataset Automatically
Johannes Heinecke1, Asadullah Munshi2, Frédéric Herledan2, Geraldine Damnati1 1Orange Innovation, 2Orange |
|
Towards a Comprehensive English Wordnet-Wikidata Mapping
John P. McCrae1, Johann Bergh2, Krasimir Angelov3 1Insight Center for Data Analytics, National University of Ireland Galway, 2Lingolutions, 3University of Gothenburg and Chalmers University of Technology |
|
AmDi - Ambiguous Words Diachronic Dataset
Felix Thielen1 and Kai Kugler2 1Trier University, 2Trier University |
| 15:00 - 16:40 | Session P10.4.2: Lexicon and Semantics III - Poster Area |
|
GerVLPro: A CEFR-Graded Vocabulary List of L2 Learners' Productive Vocabulary in German
Noah-Manuel Michael1, Anna Huelsing2, Andrea Horbach3 1Kiel University, 2CAU, 3CAU Kiel / Leibniz Institute for Science and Mathematics Education |
|
Building Bridges between Student and Curricular Language: Creating a Corpus of Abstract Meaning Representations for the Classroom
Kristin Wright-Bettner1, Zheng Cai2, Zekun Zhao3, James H. Martin1, Jeffrey Flanigan4, Martha Palmer5 1University of Colorado Boulder, 2The University of Colorado, 3University of California, Santa Cruz, 4UC Santa Cruz, 5University of Colorado |
|
Mu'jam Arriyadh: A Comprehensive Lexicon for Contemporary Arabic Language
Afrah Altamimi1, Abdulrahman Alosaimy2, Halah Alharbi3, Hawra Aljasim3, Muneera Alhoshan4, Amal Almazrua5, Hanan Alharbi3, Abdulrahman Alshehri1, Bayan Almuqhim3, Maryam Algarny3, Yahya Asiri6, Abdullah I. Alharbi7, Saleh Albalawi3, Fawziah Asiri1, Sara Alhifthi8, Abdullah Alfaifi5 1KSGAAL, 2King Salman Academy for Arabic Language / Imam Mohammed Bin Saud Islamic University, 3King Salman Global Academy for Arabic Language, 4King Salman Global Academy for Arabic Language, 5KSAA, 6King Salman Global Academy of Arabic Language, 7King Salman Global Academy for Arabic, 8Saudi Arabia |
|
The Romanian Corpus Annotated with Multiword Expressions. PARSEME-Ro Version 2.0
Verginica Barbu Mititelu1, Mihaela Cristescu2, Elena Irimia3, Carmen Vasile2 1RACAI, 2University of Bucharest, 3Research Institute for Artificial Intelligence, Romanian Academy (RACAI) |
|
Missing Links: LLM-Augmentation of Event Triggers of State Changes in the OpenPI Dataset
Kyeongmin Rim1 and James Pustejovsky2 1Department of Computer Science, Brandeis University, 2Brandeis University |
|
VUPMC: A New Political Metaphor Corpus in Mandarin Chinese
Xiaojuan Tan VU Amsterdam |
|
Not All Disneys Are the Same: Making Coreference Metonymy-Aware
Bingyang Ye, Jingxuan Tu, James Pustejovsky Brandeis University |
|
JSTS-Neg: Japanese Semantic Textual Similarity Dataset for Evaluating Negation Understanding Ability
Reiko Yuasa, Yoshihide Kato, Shigeki Matsubara Nagoya University |
|
Few-shot Prompting or Supervised Tuning? A Comparative Study of LLMs for Linguistically Distant Language Pairs in BDI
Deepen Naorem1, Sanasam Ranbir Singh2, Telem Joyson Singh3, Priyankoo Sarmah4 1Indian Institute of Technology, Guwahati, 2Indian Institute of Technology, 3IIT Guwahati, 4Indian Institute of Technology Guwahati |
|
When Structure Matters: Cross-Lingual Hyperbolic Embeddings for Chinese and English Wordnets
Mao-Chang Ku1, Da-Chen Lian2, Pin-Er Chen1, Po-Ya Angela Wang1, Wei-Ling Chen1, Shu-Kai HSIEH2 1National Taiwan University, 2Graduate Institute of Linguistics, National Taiwan University |
| 16:40 - 17:00 | Coffee Break |
| 17:00 - 18:20 | LREC 2022 Closing Ceremony - Room 1 |
| 20:00 | LREC 2022 GALA Dinner |
| End of Day 3 |