| Day 2 |
| 09:00 - 10:40 | Session O13: Digital Humanities and Related Corpora - Room 1 |
| 09:00 - 09:20 |
ATLAS: Article Tracking, Linking, and Analysis of Swedish Encyclopedias
Albin Andersson, Salam Jonasson, Fredrik Wastring, Pierre Nugues Lund University |
| 09:20 - 09:40 |
Evaluating Embedding Models on Danish Historical Newspapers: A Corpus and Benchmark Resource
Alie Lassche1, Pascale Feldkamp1, Yuri Bizzoni2, Katrine Baunvig3, Kristoffer Nielbo1, Johan Heinsen4 1Center for Humanities Computing, Aarhus University, 2Aarhus University, 3Center for Grundtvig Studies, Aarhus University, 4Aalborg University |
| 09:40 - 10:00 |
Leveraging Linguistic Similarity for Low-Resource Speech Transcription
Valentina Fedchenko1 and Eric Jordan2 1ERTIM, 2LACITO |
| 10:00 - 10:20 |
A Corpus of Persuasion Techniques in Slavic Languages
Jakub Piskorski1, Dimitar Dimitrov2, Marina Ernst3, Jacek Haneczok4, Michal Marcinczuk5, Arkadiusz Modzelewski6, Roman Yangarber7 1Polish Academy of Sciences, 2University of Sofia "St. Kliment Ohridski", 3University of Koblenz, 4Erste Group IT, 5CodeNLP, 6Polish-Japanese Academy of Information Technology, 7University of Helsinki |
| 10:20 - 10:40 |
GePaDeSE: A New Resource for Clause-Level Aspect in German Parliamentary Debates
Julian Schlenker1, Ines Rehbein1, Lilly Brauner2, Florian Ertz3, Ines Reinig4, Simone Paolo Ponzetto1 1University of Mannheim, 2University of Heidelberg, 3University of Göttingen, 4Mannheim University |
| 09:00 - 10:40 | Session O14: Lexicon - Room 2 |
| 09:00 - 09:20 |
FrameNet Semantic Role Classification by Analogy
Van Duy Ngo1, Stergos Afantenos2, Emiliano Lorini3, Miguel Couceiro4 1IRIT, University of Toulouse, 2IRIT and CNRS, University of Toulouse, 3IRIT and CNRS, University of Toulouse, 4University of Lorraine, CNRS, Loria |
| 09:20 - 09:40 |
CEFR-Annotated WordNet: LLM-Based Proficiency-Guided Semantic Database for Language Learning
Masato Kikuchi1, Masatsugu Ono2, Toshioki Soga3, Tetsu Tanabe4, Tadachika Ozono1 1Nagoya Institute of Technology, 2Kitami Institute of Technology, 3Chitose Institute of Science and Technology, 4Hokkaido University |
| 09:40 - 10:00 |
Towards a Gold Standard for Adjectival Hypernymy: Enriching the Open English WordNet with a Hybrid Approach
Lorenzo Augello1, John P. McCrae2, Marco Passarotti3 1Università Cattolica del Sacro Cuore, Milan, Italy, 2Insight Center for Data Analytics, National University of Ireland Galway, 3Università Cattolica del Sacro Cuore |
| 10:00 - 10:20 |
PREMOVE in LiLa: Integrating Latin Preverbed Motion Verbs with WordNet and VerbNet
Andrea Farina1, Marco Passarotti2, Francesco Mambrini2, Matteo Pellegrini3, Eleonora Litta4, Giovanni Moretti2 1King's College London, 2Università Cattolica del Sacro Cuore, 3University of Surrey, 4Università Cattolica del Sacro Cuore, Milano |
| 10:20 - 10:40 |
From Incidents to Framing: A Dutch and English Frame-semantic Corpus and Lexicon
Piek Vossen, Pia Sommerauer, Levi Remijnse Vrije Universiteit Amsterdam |
| 09:00 - 10:40 | Session O15: Multilinguality, Machine Translation - Room 3 |
| 09:00 - 09:20 |
AI Safety Lost in Translation: Evaluating the Effectiveness of English-Italian Cross-Lingual LLM Safety Alignment
Alessio Wu1 and Martim Brandao2 1King's College London, 2Waseda University |
| 09:20 - 09:40 |
Semantic Label Drift in Cross-Cultural Translation
Mohsinul Kabir1, Tasnim Ahmed2, Md Mezbaur Rahman3, Polydoros Giannouris1, Sophia Ananiadou1 1University of Manchester, 2Queen's University, 3University of Illinois Chicago |
| 09:40 - 10:00 |
Chain-of-Thought Reasoning Improves Context-Aware Translation with Large Language Models
Shabnam Ataee, Hugo Huart, Andrei Popescu-Belis HEIG-VD / HES-SO |
| 10:00 - 10:20 |
Adja-French Parallel Corpus: A New Resource for Machine Translation of a West African Under-Resourced Language
Josue Godeme and Rolando Coto-Solano Dartmouth College |
| 10:20 - 10:40 |
Goldfish: Monolingual Language Models for 350 Languages
Tyler Chang1, Catherine Arnett2, Zhuowen Tu1, Benjamin Bergen1 1UC San Diego, 2EleutherAI |
| 09:00 - 10:40 | Session O16: Natural Language Generation and Summarization - Room 4 |
| 09:00 - 09:20 |
Dynaword: From One-shot to Continuously Developed Datasets
Kenneth Enevoldsen1, Kristian Jensen2, Jan Kostkan1, Balázs Szabó1, Márton Kardos1, Kirsten Vad1, Johan Heinsen1, Andrea Núñez3, Gianluca Barmina3, Jacob Nielsen3, Rasmus Larsen2, Rob van der Goot4, Peter Vahlstrup1, Per Dalum1, Desmond Elliott5, Lukas Poech3, Peter Schneider-Kamp3, Kristoffer Nielbo6 1Aarhus University, 2The Alexandra Institute, 3University of Southern Denmark, 4IT University of Copenhagen, 5University of Copenhagen, 6Center for Humanities Computing, Aarhus University |
| 09:20 - 09:40 |
From Bones to Rocks: A Systematic Evaluation of Specialized Definition Generation for Portuguese
Rafael Oleques Nunes, Dennis Giovani Balreira, Joel Luís Carbonera UFRGS |
| 09:40 - 10:00 |
Beyond Lemmas and Syntax: Comparing Human and LLM-Generated Scientific Abstracts
Sergei Bagdasarov and Diego Alves Saarland University |
| 10:00 - 10:20 |
Systematic Multi-Aspect Evaluation of Time Series-Based Report Generation: The Case of Financial Analysis from Stock Data
Elizabeth Fons1, Elena Kochkina2, Rachneet Kaur3, Zhen Zeng4, Berowne Hlavaty5, Charese Smiley6, Svitlana Vyetrenko7, Manuela Veloso2 1J.P. Morgan AI Research, 2JPMorgan Chase, 3J.P. Morgan Chase, 4JP Morgan Chase, 5J.P Morgan Chase, 6JPMorgan AI Research, 7J.P Morgan AI Research |
| 10:20 - 10:40 |
Bangla Key2Text: Text Generation from Keywords for a Low Resource Language
Tonmoy Talukder1 and G M Shahariar2 1Ahsanullah University of Science and Technology, 2University of California, Riverside |
| 09:00 - 10:40 | Session P4.1.1: Bias, Offensive Content, Guardrails I - Poster Area |
|
Towards Reliable AI Fairness: Challenges in Implementing Neuron Steering for Bias Mitigation
Ismael Garrido-Munoz1, Arturo Montejo-Raez1, Fernando Martínez-Santiago2 1Universidad de Jaen, 2University of Jaén at Spain |
|
From Body to Mind: Analyzing Gender Representation in Spanish Generative Language Models
Ismael Garrido-Munoz1, Fernando Martínez-Santiago2, Arturo Montejo-Raez1 1Universidad de Jaen, 2University of Jaén at Spain |
|
Incivility and Rigidity: Evaluating the Risks of Fine-Tuning LLMs for Political Argumentation
Svetlana Churina and Kokil Jaidka National University of Singapore |
|
EsBBQ and CaBBQ: The Spanish and Catalan Bias Benchmarks for Question Answering
Valle Ruiz-Fernández1, Mario Mina2, Júlia Falcão2, Luis Antonio Vasquez Reina2, Anna Salles2, Aitor Gonzalez-Agirre1, Olatz Perez-de-Viñaspre3 1Barcelona Supercomputing Center (BSC), 2Barcelona Supercomputing Center, 3HiTZ Center - Ixa, University of the Basque Country UPV/EHU |
|
ToxSyn-PT: A Synthetic Fine-Grained Dataset of Minority-Targeted Toxic Language in Portuguese
Iago Brito1, Julia Dollis2, Fernanda Farber3, Diogo Fernandes4, Arlindo Galvão Filho5 1Ceia NLP - UFG, 2CEIA - NLP, 3AKCIT, 4Federal University of Goiás, 5Federal University of Goiás |
|
A Benchmark for Testing Robustness under Controlled Reference Bias in MT
Ahrii Kim1 and Seong-heum Kim2 1None, 2Soongsil University |
|
AnswerCarefully: Creating a Dataset for LLM Safety in Japanese
Hisami Suzuki1, Satoru Katsumata2, Takashi Kodama1, Tetsuro Takahashi3, Kouta Nakayama1, Satoshi Sekine4 1National Institute of Informatics, 2Retrieva, Inc., 3Kagoshima University, 4NII, LLMC |
|
A Dutch Benchmark to Assess Social Bias in LLMs within a Hiring Decision Setting
Renate Burema1, Anne Schuth2, Christopher Spelt3, Dong Nguyen4 1Ministry of the Interior and Kingdom Relations, 2DPG Media, 3Rijksoverheid, 4Utrecht University |
|
PBBQ: A Persian Bias Benchmark Dataset Curated with Human-AI Collaboration for Large Language Models
Farhan Farsi1, Shayan Bali2, Fatemeh Valeh3, Parsa Ghofrani1, Alireza Pakniat1, Seyedkian Kashfipour4, Amir H. Payberah5 1Amirkabir University of Technology, 2King's College London, 3Amirkabir University of Technology (Tehran Polytechnic), 4Graduate Student, 5KTH Royal Institute of Technology |
|
Contextualizing Toxicity: An Annotation Framework for Unveiling Pragmatics in Conversations of Online Discussion Forums
Yingxue Fu1 and Anais Ollagnier2 1Centre Inria d'Université Cote d'Azur, 2Universite Cote d'Azur, Inria, CNRS, I3S |
|
How Far Can Bias Go? Tracing Bias from Pre-Training Data to Alignment
Marion Thaler1, Abdullatif Köksal2, Alina Leidinger3, Anna Korhonen4, Hinrich Schütze2 1Ludwig-Maximilians-Universität München, 2CIS, LMU Munich, 3ILLC, University of Amsterdam, 4Language Technology Lab, University of Cambridge |
| 09:00 - 10:40 | Session P4.1.2: Bias, Offensive Content, Guardrails II - Poster Area |
|
Robust Bias Evaluation with FilBBQ: A Filipino Bias Benchmark for Question-Answering Language Models
Lance Calvin Gamboa, Yue Feng, Mark Lee University of Birmingham |
|
Uncovering Hidden Violent Tendencies in LLMs: A Demographic Analysis via Behavioral Vignettes
Quintin Myers1 and Yanjun Gao2 1University of Colorado Anschutz, 2University of Colorado |
|
Exploring Social Bias in Slovenia: The EEC-SL Dataset
Jaya Caporusso1, Damar Hoogland2, Boshko Koloski3, Matthew Purver4, Senja Pollak1, Špela Vintar1 1Jožef Stefan Institute, 2Newcastle University, 3Jozef Stefan Institute, 4Queen Mary University of London |
|
The MISOMEM-Val Dataset for Identifying Human Values in Misogynistic Memes
Rakshitha Rao Ailneni and Sanda Harabagiu University of Texas at Dallas |
|
ConGA: Guidelines for Contextual Gender Annotation. A Framework for Annotating Gender in Machine Translation
Argentina Rescigno1, Eva Vanmassenhove2, Johanna Monti3 1University of Pisa, 2Tilburg University, 3"L'Orientale" University of Naples |
|
University Speaking for Everyone: Assessing Changes in Italian Higher Education Statutes toward Gender-Inclusive Language
Sebastiano Vecellio Salto1, Camilla Casula2, Alessio Palmero Aprosio3, Sara Tonelli4 1Fondazione Bruno Kessler, 2University of Trento / Fondazione Bruno Kessler, 3University of Trento, 4FBK |
|
Breaking the Benchmark: Revealing LLM Bias via Minimal Contextual Augmentation
Kaveh Eskandari Miandoab1, Mahammed Kamruzzaman2, Arshia Gharooni3, Gene Kim2, Vasanth Sarathy1, Ninareh Mehrabi4 1Tufts University, 2University of South Florida, 3Independent researcher, 4Meta |
|
TryggLLM: A Benchmark for Evaluating LLM Safety in Norwegian
Samia Touileb, Truls Pedersen, Isabell Haugen University of Bergen |
|
KOCOH: Korean Context-Dependent Hate Speech Dataset
Eunah Park and Sanghoun Song Korea University |
|
Towards Fair Speech Recognition: Mitigating Demographic Bias in End-to-End ASR Systems
Maliha Jahan1, Thomas Thebaud1, Zsuzsanna Fagyal2, Jesus Villalba1, Mark Hasegawa-Johnson3, Laureano Moro Velazquez1, Najim Dehak1 1Johns Hopkins University, 2University of Illinois Urbana-Champaign, 3University of Illinois |
| 09:00 - 10:40 | Session P4.2.1: Evaluation, Validation I - Poster Area |
|
RuBIN: A Russian Benchmark for Evaluating LLMs with Cultural Insights
Polina Lazukova and Irina Piontkovskaya Huawei Noah's Ark Lab |
|
Evaluating Phonetically Weighted and Unweighted Distance Measures in Dialectometry
Alfred Lameli Research Center Deutscher Sprachatlas |
|
Piecing Together Cross-Document Coreference Resolution Datasets: Systematic Dataset Analysis and Unification
Anastasia Zhukova1, Terry Lima Ruas2, Jan Philip Wahle3, Bela Gipp1 1University of Goettingen, 2University of Gottingen, 3University of Göttingen |
|
Spotlights and Blindspots: Evaluating Machine-Generated Text Detection
Kevin Stowe1 and Kailash Patil2 1Educational Testing Services (ETS), 2Pindrop |
|
JAPAS: A Benchmark and Neural Approach for Japanese Patent Support Relation Extraction
Katsuki Chousa1 and Ryosuke Sugiura2 1NTT, 2NTT, inc. |
|
A Teacher-Student Approach to Creating Verified Synthetic Clarification and Correction Dialogues for TableQA Tasks
Christian Poelitz1 and Nick McKenna2 1Microsoft Research, 2GitHub Applied Science |
|
Persona-Aware Evaluation of Cognitive Bias in LLMs: From Benchmark to Applied Decision-Making
Katsumasa Yoshikawa1, Junya Takayama2, Takato Yamazaki3 1Dai-ichi Life Holdings, Inc., 2SB Intuitions, 3SB Intuitions Corporation |
|
ArtistMus: A Globally Diverse, Artist-Centric Benchmark for Retrieval-Augmented Music Question Answering
Daeyong Kwon, SeungHeon Doh, Juhan Nam KAIST |
|
MATA (మాట): Mindful Assessment of the Telugu Abilities of Large Language Models
Chalamalasetti Kranti1 and Sowmya Vajjala2 1University of Potsdam, 2National Research Council |
|
Estonian Native Large Language Model Benchmark
Helena Grete Lillepalu and Tanel Alumäe Tallinn University of Technology |
|
Indirect Question Answering in English, German and Bavarian: A Challenging Task for High- and Low-Resource Languages Alike
Miriam Winkler, Verena Blaschke, Barbara Plank LMU Munich |
|
Benchmarking Large Language Models for Chinese and Japanese IMEs: Phonetic-to-Character Generation and Textual Error Correction
Yuchun Zou1, Tedd Lee2, Xiaodi Fan3, Jun Li4 1CUNY Graduate Center, 2CUNY Hunter College, 3Meta Inc., 4CUNY Queens College and Graduate Center |
|
DaLA: Danish Linguistic Acceptability Evaluation Guided by Real World Errors
Gianluca Barmina1, Nathalie Norman2, Peter Schneider-Kamp1, Lukas Poech1 1University of Southern Denmark, 2University of Copenhagen |
|
KCIF: Knowledge-Conditioned Instruction Following
Rudra Murthy1, Praveen Venkateswaran1, Prince Kumar1, Danish Contractor2 1IBM, 2IBM Research |
|
GAIN: A Benchmark for Goal-Aligned Decision-Making of Large Language Models under Imperfect Norms
Masayuki Kawarada1, Kodai Watanabe2, Soichiro Murakami3 1CyberAgent/National Institute of Advanced Industrial Science and Technology, 2CyberAgent, Inc., 3CyberAgent, Inc. |
|
Can LLMs Evaluate What They Cannot Annotate? Revisiting LLM Reliability in Hate Speech Detection
Paloma Piot1, David Otero2, Patricia Martin-Rodilla3, Javier Parapar2 1Universidade da Coruna, 2Universidade da Coruña, 3IEGPS |
| 09:00 - 10:40 | Session P4.2.2: Evaluation, Validation II - Poster Area |
|
PersianMedQA: Evaluating Large Language Models on a Persian-English Bilingual Medical Question Answering Benchmark
Mohammad Javad Ranjbar Kalahroodi1, Amirhossein Sheikholselami1, Sepehr Karimi Arpanahi1, Sepideh Ranjbar Kalahroodi2, Heshaam Faili1, Azadeh Shakery1 1University of Tehran, 2Shahid Beheshti University of Medical Sciences |
|
HatePrototypes: Interpretable and Transferable Representations for Implicit and Explicit Hate Speech Detection
Irina Proskurina1, Marc-Antoine Carpentier2, Julien Velcin3 1Laboratoire Hubert Curien, UMR CNRS 5516, Saint-Etienne, France, Université Claude Bernard Lyon 1, Université Lumière Lyon 2, ERIC, 69100, Villeurbanne, France, 2École centrale de Lyon, 3Ecole Centrale de Lyon, LIRIS CNRS UMR 5205, France |
|
Investigating Memorization in Language Models Trained via Knowledge Distillation
Maarten Mäcking1 and Michaela Regneri2 1University of Hamburg, 2Universität Hamburg |
|
Redefining Evaluation Standards: A Unified Framework for Evaluating the Korean Capabilities of Language Models
Hanwool Lee1, Dasol Choi2, Sooyong Kim3, Ilgyun Jung4, Sangwon Baek5, Guijin Son2, Inseong Hwang6, Naeun Lee3, Seunghyeok Hong7 1Shinhan Securities, 2Yonsei University, 3MODULABS, 4Korea University, 5Catius, 6Seoul National University of Science and Technology, 7Hankuk University of Foreign Studies |
|
Cross-Lingual Stability and Bias in Instruction-Tuned Language Models for Humanitarian NLP
Poli Nemkova1, Amrit Adhikari1, Matthew Pearson2, Vamsi Krishna Sadu1, Albert Mark1 1University of North Texas, 2Davidson College |
|
Counting on Consensus: Selecting the Right IAA Metric for NLP Annotation and Evaluation
Joseph James University of Sheffield |
|
Quadratic Weighted Kappa Is Not Enough for Evaluating Automated Essay Scoring Models
Salam Albatarni and Tamer Elsayed Qatar University |
|
Evaluating the Homogeneity of Keyphrase Prediction Models
Mael Houbre1, Florian Boudin2, Beatrice Daille3 1Ministerial Agency of Artificial Intelligence in Defense, 2Nantes University, 3Nantes Université- LS2N |
|
A Taxonomy of Safety: Harmonizing LLM Benchmarks in a Fragmented Landscape
Shadi Rastegar1, Viktor Hangya2, Fabian Kuech2, Darina Gold2 1IIS Fraunhofer, 2Fraunhofer IIS |
|
Consistency of LLMs to Comparative Statements in Mathematical Reasoning Tasks
Aidan San1, Daniel Son1, Xiaodong Liu2, Yangfeng Ji1 1University of Virginia, 2Microsoft Research |
|
PersianAnonymizer: Evaluating LLM-Labeled Training for Efficient NER-based Anonymization in Persian
Mohammad Hossein Shalchian1, Mostafa Amiri2, Amir Mahdi Sadeghzadeh1 1Sharif University of Technology, 2University of Tehran |
|
How Many Samples Do We Need? A Toolkit for Power-Aware Evaluation Design
Angelo Basile1, Areg Mikael Sarvazyan2, José González3 1Universitat Politecnica de Valencia, 2Symanto Research, 3TransPerfect |
|
Of Words and Meaning: A Grammatical and Semantic Benchmark for Faroese LLM Understanding
Iben Debess1, Barbara Scalvini1, Bolette Pedersen2 1University of the Faroe Islands, 2University of Copenhagen |
|
TURING: Evaluating Human Abilities to Identify AI-Generated Texts
Natalia Kalashnikova, Nicolas De Bufala, Sophie Fayad, Laurent Cervoni TALAN |
|
JamC-QA: A Multiple-Choice Question Answering Benchmark for Japan-Specific Knowledge
Teruaki Oka, Tomohide Shibata, Nao Yoshida SB Intuitions Corp. |
| 09:00 - 10:40 | Session P4.2.3: Evaluation, Validation III - Poster Area |
|
Evaluating Text Style Transfer: A Nine-language Benchmark for Text Detoxification
Vitaly Protasov1, Nikolay Babakov2, Daryna Dementieva3, Alexander Panchenko4 1Independent Researcher, 2Centro Singular de Investigación en Tecnoloxías Intelixentes (CiTIUS), Universidade de Santiago de Compostela, 3Technical University of Munich, 4S-NLP |
|
Irish-BLiMP: A Linguistic Benchmark for Evaluating Human and Language Model Performance in a Low-Resource Setting
Josh Mcgiff1, Tung Tran2, William Mulcahy1, Dáibhidh Ó Luinín1, Jake Dalzell3, Róisín Ní Bhroin4, Adam Burke4, Barry O'Sullivan2, Hoang Nguyen2, Nikola Nikolov1 1University of Limerick, 2University College Cork, 3Prifysgol Aberystwyth University, 4Independent |
|
EduBench: A Portuguese Benchmark for Open-Ended Discursive Question Answering
Pedro Paiola1, Luís Gabriel Mendes1, Bruno Monchelato1, André Schuck1, Gabriel Garcia1, Douglas Rodrigues1, Helena Caseli2, João Papa1 1São Paulo State University, 2Federal University of São Carlos |
|
DialectalArabicMMLU: Benchmarking Dialectal Capabilities in Arabic and Multilingual Language Models
Malik Altakrori1, Nizar Habash2, Teresa Lynn3, Younes Samih1, Abed Alhakim Freihat4, Kirill Chirkunov3, Muhammed AbuOdeh3, Radu Florian5, Preslav Nakov4, Alham Fikri Aji3 1IBM Research AI, 2New York University Abu Dhabi, 3MBZUAI, 4Mohamed bin Zayed University of Artificial Intelligence, 5IBM Research |
|
SemBench: A Universal Semantic Framework for LLM Evaluation
Mikel Zubillaga1, Naiara Perez2, Oscar Sainz3, German Rigau4 1HiTZ Center - Ixa, University of the Basque Country UPV/EHU, 2University of the Basque Country, 3University of the Basque Country (UPV/EHU), 4UPV/EHU |
|
EL-MIA: Quantifying Membership Inference Risks of Sensitive Entities in LLMs
Ali Satvaty1, Suzan Verberne2, Fatih Turkmen3 1University of Groningen, 2LIACS, Leiden University, 3Associate Professor University of Groningen |
|
Same Meaning, Different Scores: Lexical and Syntactic Sensitivity in LLM Evaluation
Bogdan Kostic1, Conor Fallon1, Julian Risch2, Alexander Loeser3 1Berliner Hochschule für Technik, 2deepset, 3Beuth-University of Applied Sciences Berlin |
|
The Potential for Misleading Results in Text Sanitisation with Standard Evaluation Metrics
Dan Zhang1 and Mark Anderson2 1Norwegian University of Science and Technology, 2Norsk Regnesentral |
|
Mind the Language Gap: Assessing LLM Safety in Italian
Elena Marafatto and Roberto Navigli Sapienza University of Rome |
|
Bulgarian Massive Multitask Language Understanding Benchmark
Svetla Koeva1, Ivelina Stoyanova2, Dimiter Georgiev3, Svetlozara Leseva4, Valentina Stefanova5, Maria Todorova6, Tsvetana Dimitrova5, Hristina Kukova2, Mihaela Moskova5, Tinko Tinchev5 1Institute for Bulgarian Language "Prof. Lyubomir Andreychin", Bulgarian Academy of Sciences, 2Department of Computational Linguistics, IBL - BAS, 3Department of Computational Linguistics, IBL - BAS, 4Department of Computational Linguistics, Institute for Bulgarian - BAS, 5Institute for Bulgarian Language, 6Bulgarian Academy of Sciences |
|
PHEB: An European Portuguese High School-Level LLM Benchmark
Diogo Tavares1, Rafael Ferreira1, Afonso Simplício1, Gonçalo Vinagre1, Ana Condez1, Inês Calvo2, Inês Vieira1, David Semedo3, Joao Magalhaes3 1NOVA School of Science and Technology, 2, 3Universidade NOVA de Lisboa |
|
S-GRADES -- Studying Generalization of Student Response Assessments in Diverse Evaluative Settings
Tasfia Seuti and Sagnik Ray Choudhury University of North Texas |
|
Who Benchmarks the Benchmarks? A Case Study of LLM Evaluation in Icelandic
Finnur Ingimundarson1, Steinunn Rut Friðriksdóttir2, Bjarki Ármannsson3, Iris Nowenstein2, Steinþór Steingrímsson3 1University of Zurich, 2University of Iceland, 3The Árni Magnússon Institute for Icelandic Studies |
|
Is This Idea Novel? An Automated Benchmark for Judgment of Research Ideas
Tim Schopf1 and Michael Färber2 1National Institute of Informatics (NII), 2TU Dresden |
|
Questionnaire Meets LLM: A Benchmark and Empirical Study of Structural Skills for Understanding Questions and Responses
Duc-Hai Nguyen1, Vijayakumar Nanjappan2, Barry O'Sullivan2, Hoang Nguyen2 1Insight Research Ireland Centre for Data Analytics, School of Computer Science and Information Technology, University College Cork, Ireland, 2University College Cork |
|
Assessing the Effectiveness of LLMs in Delivering Cognitive Behavioral Therapy
Navdeep Singh Bedi1, Ana-Maria Bucur1, Noriko Kando2, Fabio Crestani3 1Università della Svizzera italiana, 2National Institute of Informatics, 3Università della Svizzera Italiana (USI) |
| 10:40 - 11:00 | Coffee Break |
| 11:00 - 12:40 | Session O17: Evaluation, Validation IV - Room 1 |
| 11:00 - 11:20 |
Transcription Accuracy in the Icelandic Gigaword Corpus: Evaluating Automatic and Manual Annotation
Johanna Mechler, Lilja Stefánsdóttir, Anton Ingason University of Iceland |
| 11:20 - 11:40 |
Benchmark Data Contamination in Underrepresented Languages: A Comprehensive Analysis Using Brazilian Data
Iriedson Vilar1, David Maia2, João Brunet3, Fabio Morais1, Leandro Marinho4 1Federal University of Campina Grande (UFCG), 2IFPB, 3Federal University of Campina Grande, 4UFCG |
| 11:40 - 12:00 |
TTSVowelViz: A Tool for Visualising Text-to-Speech Model Training via Vowel Spaces
Pasindu Udawatta1, Jesin James1, Balamurali B T2, Catherine Watson1, Ake Nicholas1, Binu Abeysinghe1 1University of Auckland, 2Singapore University of Technology and Design |
| 12:00 - 12:20 |
A Sociophonetic Analysis of Racial Bias in Commercial ASR Systems Using the Pacific Northwest English Corpus
Michael Scott, Siyu Liang, Alicia Wassink, Gina-Anne Levow University of Washington |
| 12:20 - 12:40 |
ParliaBench: An Evaluation and Benchmarking Framework for LLM-Generated Parliamentary Speech
Marios Koniaris, Argyro Tsipi, Panayiotis Tsanakas National Technical University of Athens |
| 11:00 - 12:40 | Session O18: Lexicon and Semantics I - Room 2 |
| 11:00 - 11:20 |
PARSEME 2.0 Multilingual Corpus of Multiword Expressions
Agata Savary1, Manon Scholivet2, Carlos Ramisch3, Takuya Nakamura4, Eric Bilinski5, Sara Stymne6, Voula Giouli7, Stella Markantonatou8, Vasile Pais9, Maria Mitrofan10, Louis Estève11, Bruno Guillaume12, Verginica Barbu Mititelu10, Jaka Cibej13, Roberto Díaz Hernández14, Victoria Fendel15, Polona Gantar13, Olha Kanishcheva16, Cvetana Krstev17, Chaya Liebeskind18, Irina Lobzhanidze19, Aleksandra Markovic20, Gunta Nepore-Berzkalne21, Adriana Pagano22, Mehrnoush Shamsfard23, Ranka Stankovic24, Vahide Tajalli23, Carole Tiberius25, Aakanksha Padhye26 1Paris-Saclay University, 2Universite Paris Saclay CNRS, 3Aix Marseille University, CNRS, LIS, 4LISN, Universite Paris-Saclay, CNRS/LIGM, Universite Gustave-Eiffel, CNRS, 5Universite Paris Saclay, CNRS, LISN, 6Uppsala University, 7Aristotle University of Thessaloniki / ILSP, ATHENA RC, 8ILSP/R.C. "Athena", 9Research Institute for Artificial Intelligence, Romanian Academy, 10RACAI, 11Université Paris-Saclay, CNRS, LISN, 12LORIA / Inria Nancy Grand-Est, 13University of Ljubljana, 14University of Jaén, 15University of Oxford, 16Heidelberg University, 17Association for Language Resources and Technologies, 18Jerusalem College of Technology , Lev Academic Center, 19Ilia State University, 20The Institute for the Serbian language of SASA, 21Institute of Mathematics and Computer Science, University of Latvia, 22Federal University of Minas Gerais, 23Faculty of Computer Science and Engineering, Shahid Beheshti University, 24University of Belgrade - Faculty of Mining and Geology, 25Instituut voor de Nederlandse Taal, 26Indian Institute of Technology Delhi |
| 11:20 - 11:40 |
Multi-SimLex for Dutch: Benchmarking Embedding- and Prompt-Based Model Performance on Semantic Similarity
Lizzy Brans1 and Jelke Bloem2 1Utrecht University, 2University of Amsterdam |
| 11:40 - 12:00 |
MultiCoS: A Multilingual Dataset of Connective Semantics with Context-Sentence Compatibility
Anne Mucha, Ciyang Qing, Wataru Uegaki University of Edinburgh |
| 12:00 - 12:20 |
Adverbs Revisited: Enhancing WordNet Coverage of Adverbs with a Supersense Taxonomy
Jooyoung Lee1, Jader Camboim de Sá2, Cedric Pruski2 1Brown University, 2Luxembourg Institute of Science and Technology |
| 12:20 - 12:40 |
Introducing PerMet 1.0: A Metaphor-Annotated Corpus for Persian
Mohammad Saeid Miri Allameh Tabataba'i University |
| 11:00 - 12:40 | Session O19: Multilinguality, Machine Translation Evaluation - Room 3 |
| 11:00 - 11:20 |
KinyCOMET: Automatic Evaluation of Machine Translation Systems for Kinyarwanda--English
Prince Mazimpaka1, Jan Nehring2, Samuel Rutunda3, Cristina España-Bonet4 1University of Rwanda, 2C4IR, 3Digital Umuganda, 4DFKI |
| 11:20 - 11:40 |
Multiway Parallel Corpus in Forced Migration Domain for Multilingual Machine Translation
Fatemeh Azadi1, Samuel Larkin1, Chi-kiu Lo2 1National Research Council Canada, 2National Research Council of Canada |
| 11:40 - 12:00 |
Context-8: A Data Set for Evaluating Context Sensitivity in Machine Translation
Dongyue Wang and Kyo Kageura University of Tokyo |
| 12:00 - 12:20 |
AssamLegalTrans: A Parallel Corpus, Benchmark and Analysis for English-Assamese Machine Translation of Legal Judgments
Telem Joyson Singh1, Hemanta Baruah2, Sanasam Ranbir Singh2, Anindita Talukdar1, Nasrin Shahnaz1, Okram Jimmy Singh1, Priyankoo Sarmah2, Pallav Dutta1, Sukumar Nandi2, Pranab Duara3 1IIT Guwahati, 2Indian Institute of Technology Guwahati, 3Gauhati High Court |
| 12:20 - 12:40 |
Coordinate Structure Extraction for Patent Claims Using Multilingual LLMs
Tsukasa Ishimaru1, Takehito Utsuro1, Masaaki Nagata2 1University of Tsukuba, 2NTT, Inc. |
| 11:00 - 12:40 | Session O20: Discourse and Pragmatics II - Room 4 |
| 11:00 - 11:20 |
Human Label Variation in Implicit Discourse Relation Recognition
Frances Yung1, Daniil Ignatev2, Merel Scholman2, Vera Demberg1, Massimo Poesio3 1Saarland University, 2Utrecht University, 3Queen Mary University of London and University of Utrecht |
| 11:20 - 11:40 |
Conversational Implicatures through the Lens of LLMs
Agnese Lombardi and Alessandro Lenci University of Pisa |
| 11:40 - 12:00 |
The Emergence of the Pragmatic Dimension in Instructed-LMs
Davide Mazzaccara1 and Raffaella Bernardi2 1CIMeC, University of Trento, 2Free University of Bozen-Bolzano |
| 12:00 - 12:20 |
Distributed Partial Information Puzzles: Examining Common Ground Construction under Epistemic Asymmetry
Yifan Zhu1, Mariah Bradford2, Kenneth Lai1, Timothy Obiso1, Videep Venkatesha2, James Pustejovsky1, Nikhil Krishnaswamy2 1Brandeis University, 2Colorado State University |
| 12:20 - 12:40 |
Grounded Misunderstandings in Asymmetric Dialogue: A Perspectivist Annotation Scheme for MapTask
Nan Li1, Albert Gatt1, Massimo Poesio2 1Utrecht University, 2Queen Mary University of London and University of Utrecht |
| 11:00 - 12:40 | Session P5.1.1: Inference, Reasoning, Question Answering II - Poster Area |
|
Assessing LLM Reasoning through Implicit Causal Chain Discovery in Climate Discourse
Liesbeth Allein1, Nataly Pineda-Castañeda2, Andrea Rocci2, Marie-Francine Moens1 1KU Leuven, 2Università della Svizzera italiana |
|
AccurateRAG: A Framework for Building Accurate Retrieval-Augmented Question-Answering Applications
Linh The Nguyen, Chi Tran, Dung Ngoc Nguyen, Van-Cuong Pham, Hoang Ngo, Dat Quoc Nguyen Qualcomm AI Research |
|
VideoEvent: Leveraging Relevance and LLMs for Video Question Answering
Chen-Chen Lin, Ming-Han Lee, KunRu Wu, Yu-Chee Tseng National Yang Ming Chiao Tung University |
|
MORQA: Benchmarking Evaluation Metrics for Medical Open-Ended Question Answering
Wen-wai Yim1, Asma Ben Abacha1, Zixuan Yu2, Robert Doerning2, Fei Xia2, Meliha Yetisgen2 1Microsoft, 2University of Washington |
|
LegalRikai: Open Benchmark -- A Benchmark for Complex Japanese Corporate Legal Tasks
Shogo Fujita1, Yuji Naraki2, Yiqing Zhu1, Shinsuke Mori3 1LegalOn Technologies, Inc., 2Cierpa & Company, 3Kyoto University |
|
Integrating Arithmetic Learning Improves Mathematical Reasoning in Smaller Models
Neeraj Gangwar1, Suma Bhat2, Nickvash Kani2 1University of Illinois Urbana-Champaign, 2University of Illinois at Urbana-Champaign |
|
mSCoRe: A Multilingual and Scalable Benchmark for Skill-based Commonsense Reasoning
Nghia Ngo1, Franck Dernoncourt2, Thien Nguyen1 1University of Oregon, 2Adobe Research |
|
A Binary Problem in Binary QA: Diverse LLMs or Diverse Question Interpretations? That Is the Ensembling Question
Rafael Rosales1 and Santiago Miret2 1Intel, 2Lila Sciences |
|
ObfusQAte: A Proposed Framework to Evaluate LLM Robustness on Obfuscated Factual Question Answering
Shubhra Ghosh1, Abhilekh Borah2, Aditya Guru3, Kripabandhu Ghosh4 1Indian Institutes of Technology, Patna, 2Manipal University Jaipur, India, 3Manipal University Jaipur, 4Indian Institute of Science Education and Research- Kolkata (IISER-K) |
|
POLAR: A Corpus of Questions, Responses and Argumentation in Polish Political Radio Discourse
Daniel Ziembicki1, Aleksandra Zwierzchowska2, Ewelina Sobol3, Katarzyna Przerada3 1University of Warsaw, Department of Formal Linguistics, 2Institute of Computer Science Polish Academy of Sciences, 3No affiliation |
|
MHTS: Multi-Hop Tree Structure Framework for Generating Difficulty-Controllable QA Datasets for RAG Evaluation
Jeongsoo Lee1, Daeyong Kwon2, Kyohoon Jin1, JunNyeong Jeong1, Minwoo Sim1, Minwoo Kim1 1DATUMO, 2KAIST |
|
CareMedEval Dataset: Evaluating Critical Appraisal and Reasoning in the Biomedical Field
Doria Bonzi1, Alexandre Guiggi2, Frederic Bechet3, Carlos Ramisch4, Benoit Favre5 1LORIA, 2Université Grenoble-Alpes, 3Aix Marseille Universite - LIS/CNRS, 4Aix Marseille University, CNRS, LIS, 5Aix-Marseille University LIS/CNRS |
|
LongTailQA: Benchmarking LLMs and RAG Models on Disambiguated Long-Tail Entities
William Xion1, Uwe Hadler2, Tim Cofala3, Maximilian Idahl4, Soumyadeep Roy5, Wolfgang Nejdl1 1L3S Research Center, 2L3S Research Centre, 3L3S Research Center, Leibniz Universität Hannover, 4L3S Research Center, Leibniz University Hannover, 5Stanford University |
|
CRaFT: An Explanation-Based Framework for Evaluating Cultural Reasoning in Multilingual Language Models
Shehenaz Hossain1 and Haithem Afli2 1ADAPT Centre, MTU, 2ADAPT Centre, Munster Technological University |
|
HEAD-QA v2: Expanding a Healthcare Benchmark for Reasoning
Alexis Correa1, Carlos Gómez-Rodríguez1, David Vilares2 1Universidade da Coruña, 2Universidade da Coruña, CITIC |
|
Beyond MCQ: An Open-Ended Arabic Cultural QA Benchmark with Dialect Variants
Hunzalah Hassan Bhatti1 and Firoj Alam2 1Qatar Computing Research Institute, 2Qatar Computing Research Institute, HBKU |
|
Automatic Inter-document Multi-hop Scientific QA Generation
Seungmin Lee1, Dongha Kim2, Yuni Jeon1, Junyoung Koh1, Min Song1 1Yonsei University, 2Yonsei University |
|
CRiT-QA: Evaluating Multi-hop Reasoning with Counterfactual Chains and Distractor Traps
Jungmin Yun, June Hyoung Kwon, Youngbin Kim Chung-Ang University |
|
TARAZ: Persian Short-Answer Question Benchmark for Cultural Evaluation of Language Models
Reihaneh Iranmanesh, Saeedeh Davoudi, Pasha Abrishamchian, Ophir Frieder, Nazli Goharian Georgetown University Information Retrieval Lab |
| 11:00 - 12:40 | Session P5.1.2: Inference, Reasoning, Question Answering III - Poster Area |
|
Benchmarking Mathematical Reasoning in a Low-Resource Language: Structured Prompting and Evaluation in Basque
Inigo Martinez-Criado1, Aitor Soroa2, Jeremy Barnes1 1University of the Basque Country EHU/UPV, 2HiTZ Center - Ixa, University of the Basque Country UPV/EHU |
|
Assessing the Difficulty of Inference Types in Natural Language Inference for Clinical Trials
Mathilde Aguiar1, Pierre Zweigenbaum2, Nona Naderi3 1Université Paris-Saclay, CNRS, Laboratoire Interdisciplinaire des Sciences du Numérique, 91400, Orsay, France, 2LISN, CNRS, Université Paris-Saclay, 3Université Paris-Saclay |
|
Reasoning Graph-Structured Question Answering: Datasets and Insights from LLM Benchmarking
Khin Yone1, Devasha Trivedi2, Anish Pahilajani2, Jincen Shuai1, Samyak Rajesh Jain1, Ryan Rossi3, Nesreen Ahmed4, Franck Dernoncourt3, Yu Wang5, Namyong Park6 1University of California, Santa Cruz, 2UC Santa Cruz, 3Adobe Research, 4Cisco, 5University of Oregon, 6Carnegie Mellon University |
|
JBE-QA: Japanese Bar Exam QA Dataset for Assessing Legal Domain Knowledge
Zhihan Cao1, Fumihito Nishino2, Hiroaki Yamada1, Ha Thanh Nguyen3, Yusuke Miyao4, Ken Satoh2 1Institute of Science Tokyo, 2Center for Juris-informatics, ROIS-DS, 3National Institute of Informatics, 4University of Tokyo |
|
A Diagnostic Benchmark for Sweden-Related Factual Knowledge
Jenny Kunz Linkoping University |
|
GeoBenchmark: Probing Large Language Models for Geo-Spatial Knowledge
Ayomide Abayomi1, Jose G. Moreno2, Karim Radouane3, Lynda Tamine4 1IRIT/Université Jean Monnet, 2Paul Sabatier University - IRIT, 3University of Toulouse, 4IRIT |
|
FactOReS: Fact-checking with an Evidence-based Open Resource in Spanish
Nagore Bravo1, Jaione Bengoetxea2, Iker García-Ferrero3, Alba Bonet Jover4, Estela Saquete4, Rodrigo Agerri5 1HiTZ Center, University of the Basque Country, 2HiTZ Center - Ixa, University of the Basque Country UPV/EHU, 3Multiverse Computing, 4University of Alicante, 5HiTZ Center - Ixa, University of the Basque Country EHU |
|
Stands to Reason: Investigating the Effect of Reasoning on Idiomaticity Detection
Dylan Phelps1, Rodrigo Wilkens2, Edward Gow-Smith3, Thomas Pickard3, Maggie Mi3, Marco Idiart4, Aline Villavicencio5 1The University of Sheffield, 2University of Exeter, 3University of Sheffield, 4Federal University of Rio Grande do Sul, 5University of Exeter, UK |
|
ESG-QA: Building a Dataset for Question Answering on Environmental, Social, and Governance Pillars
Gabriel Assis1, Ayrton Surica1, Pedro Kroll1, Gabriela Mendes2, Darian Rabbani2, Edson Bollis2, Lucas Francisco Pellicer3, Aline Paes1 1Institute of Computing, Universidade Federal Fluminense, 2Instituto de Ciência e Tecnologia Itaú, 3Universidade de São Paulo (USP) |
|
Enhancing and Evaluating Tabular Models on the Fly via Synthetic Question-Answer Generation
Jorge Osés Grijalba1, Eugenio Martínez Cámara1, L. Alfonso Ureña-López2, Jose Camacho-Collados3 1University of Jaén, 2University of Jaen, 3Cardiff University |
|
VIVID: A Culturally Grounded Benchmark Exposing the Figurative Language Gap in Vietnamese NLP
Tu Do1, Nhat Nguyen1, Tung Tran2, Hoang Nguyen2, Tu Phuong1, Long Dang1 1Posts and Telecommunications Institute of Technology, 2University College Cork |
|
Assessing Logical Coherence of LLMs via Fine-Grained NLI
Jon Apaolaza Larraya1, Begoña Altuna2, Aitor Soroa1, Inigo Lopez-Gazpio1 1HiTZ Basque Center for Language Technology - Ixa NLP Group - University of the Basque Country UPV/EHU, 2GOI institute, Basque Summer University (UEU) |
|
Counter-Hypothesis Generation: Towards Evaluating How LLMs Reason about Alternatives
Marzieh Abdolmaleki1, Aaron Maladry2, Veronique Hoste1, Els Lefever1 1LT3, Ghent University, 2Ghent University |
|
LFQA-HP-1M: A Large-Scale Human Preference Dataset for Long-Form Question Answering
Rafid Ishrak Jahan1, FAHMID SHAHRIAR IQBAL2, Sagnik Ray Choudhury2 1University of North Texas, Department of Computer Science and Engineering, 2University of North Texas |
|
Orthographic Constraint Satisfaction and Human Difficulty Alignment in Large Language Models
Bryan Tuck and Rakesh Verma University of Houston |
|
LIT-RAGBench: Benchmarking Generator Capabilities of Large Language Models in Retrieval-Augmented Generation
Koki Itai, Shunichi Hasegawa, Yuta Yamamoto, Gouki Minegishi, Masaki Otsuki neoAI Inc. |
|
Investigating Reasoning with Hypotheses: The RIP2 Corpus
Ella Schad, Clara Seyfried, Chris Reed University of Dundee |
|
Can Multimodal LLMs Generate Pedagogical Questions?
Thomas Gerald1, Sahar Ghannay2, Julie Lascar2, Paul Lerner3, Anne Vilnat4 1CNRS, Université Paris Saclay, LISN, 2CNRS, LISN, 3Sorbonne Université, CNRS, ISIR, 4LIMSI et Université Paris-Saclay |
|
The Riddle of Reflection: Evaluating Reasoning and Self-Awareness in Multilingual LLMs Using Indian Riddles
Abhinav P M1, Ojasva Saxena2, Oswald C3, Parameswari Krishnamurthy4 1International Institute of Information Technology, 2IIT Delhi, 3National Institute of Technology Tiruchirappalli, 4Assistant Professor, IIIT Hyderabad |
| 11:00 - 12:40 | Session P5.2.1: Speech Resources and Processing I - Poster Area |
|
Using Songs to Improve Kazakh Automatic Speech Recognition
Rustem Yeshpanov Independent Researcher |
|
Southern Kurdish Speech Recognition Resources and Benchmarking
Mohammad Mohammadamini1 and Marie Tahon2 1Le Mans University, 2LIUM / Le Mans University |
|
MASA: A Novel Multimodal Foundation Model for L2 Speaking Assessment in Picture-description Scenarios
Bi-Cheng Yan, Fu-An Chao, Hong-Yun Lin, Berlin Chen National Taiwan Normal University |
|
Tools for Estimating the Perceived Level of Phonetic Reduction
Nigel Ward1, Javier Vazquez1, Emma (Danny) Boushka1, Oliver Niebuhr2 1University of Texas at El Paso, 2University of Southern Denmark |
|
FalAR: A Large-scale Speaker-Annotated European Portuguese Speech Corpus of Parliamentary Sessions
Francisco Teixeira1, Carlos Carvalho2, Mariana Julião2, Catarina Botelho1, Rubén Solera-Ureña1, Sérgio Paulo1, Thomas Rolland1, Ben Peters1, Isabel Trancoso3, Alberto Abad4 1INESC-ID, 2INESC-ID/Instituto Superior Técnico, Universidade de Lisboa, 3INESC-ID / IST Univ. Lisbon, 4INESC-ID/IST |
|
English to Central Kurdish Speech Translation: Corpus Creation, Evaluation, and Orthographic Standardization
Mohammad Mohammadamini1, Daban Jaff2, Josep Crego3, Marie Tahon4, Antoine LAURENT5 1Le Mans University, 2Koya University, 3CHAPSVISION, 4LIUM / Le Mans University, 5LIUM - Laboratoire Informatique Université du Mans |
|
Automatic Prediction of Prominence and Boundary Strength from Text
Pauline Mas1, Kévin Vythelingum2, Jonathan Chevelu3, Marion Ouédraogo2, Damien Lolive4, Olivier Rosec2 1Voxygen, University of Rennes, IRISA, 2Voxygen, 3Univ Rennes, CNRS, IRISA, 4UBS, CNRS, IRISA |
|
SOMVOICE: A First Dataset to Study the Effects of Sleep Deprivation on Voice Characteristics of Healthy French Speakers
Vincent P. Martin1, Jean-Luc Rouas2, Colleen Beaumard3, Pierre Philip4 1Univ. Lorraine CNRS, Inria, LORIA, 2LaBRI CNRS UMR 5800 Univ. Bordeaux, 3Univ. Bordeaux, CNRS, Bordeaux INP, LaBRI - UMR 5800, SANPSY - UMR 6033, 4Univ. Bordeaux, SANPSY, UMR 6033 |
|
Automatic Prediction of Child Speech Fluency with Game-Based Data from German Preschoolers
Valentin Kany, Bernd Möbius, Jürgen Trouvain Saarland University |
|
Selective Augmentation: Improving Universal Automatic Phonetic Transcription via G2P Bootstrapping
Tobias Bystrich1, Julia Pritzen2, Christoph Schmidt2, Claudia Wich-Reif3 1University of Bonn, Fraunhofer Institute IAIS, 2Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS), 3University of Bonn |
|
AURORA Model of Formant-to-tongue Inversion for Didactic and Clinical Applications
Patrycja Strycharczuk1 and Sam Kirkham2 1University of Manchester, 2Lancaster University |
|
Investigating the Role of Synthetic Data Augmentation and Training Strategies on Improving Low-Resource Language ASR
Yun Hao, Reihaneh Amooie, Wietse de Vries, Rik van Noord, Martijn Wieling University of Groningen |
|
AutoRPT: A Tool for Bootstrapping Prosodic Annotation
Seth Heiney, Thomas Hicks, Sally Little, Fernanda Lourenco, Kai Retana, Eliana Stevens, Jonathan Howell Montclair State University |
|
J-CHAT: Japanese Large-scale Spoken Dialogue Corpus for Spoken Dialogue Language Modeling
Wataru Nakata1, Kentaro Seki1, Hitomi Yanaka1, Yuki Saito1, Shinnosuke Takamichi2, Hiroshi Saruwatari1 1The University of Tokyo, 2Keio University |
|
ViMedCSS: A Vietnamese Medical Code-Switching Speech Dataset & Benchmark
Tung Nguyen1, Nhu Vo1, Giang Son Nguyen2, Duy Hoang1, Chien Huynh1, Inigo Jauregi Unanue3, Massimo Piccardi3, Wray Buntine4, Dung Le5 1VinUniversity, 2Nanyang Technological University, 3University of Technology Sydney, 4CECS, VinUniversity, 5College of Engineering and Computer Science, VinUniversity |
| 11:00 - 12:40 | Session P5.2.2: Speech Resources and Processing II - Poster Area |
|
Towards Privacy-Preserving Fine-Tuning: Anonymization of Aphasic Speech for Effective ASR
Sebastian Hofstetter and Timo Baumann Ostbayerische Technische Hochschule Regensburg |
|
ParlaSpeech 3.0: Richly Annotated Spoken Parliamentary Corpora of Croatian, Czech, Polish, and Serbian
Nikola Ljubešić1, Peter Rupnik2, Ivan Porupski1, Taja Kuzman Pungeršek1 1Jožef Stefan Institute, 2Jožef Stefan Institute |
|
LexiPhon: A Collection of Phonetically Transcribed Lexicons from Wikipedia
Amanda Doucette, Timothy J. O'Donnell, Morgan Sonderegger McGill University |
|
ROG: A Multi-Layer Manually Annotated Corpus of Spoken Slovenian
Kaja Dobrovoljc1, Darinka Verdonik2, Jaka Cibej1, Peter Rupnik3, Nikola Ljubešić4 1University of Ljubljana, 2University of Maribor, 3Jožef Stefan Institute, 4Jožef Stefan Institute |
|
Building a Dataset for French Accent Classification Evaluation: Are We There Yet?
Diandra Fabre1, Mathieu Avanzi2, François Portet3 1Univ. Grenoble Alpes, CNRS, Grenoble INP, LIG, 2Université de Neuchâtel, 3Univ Grenoble Alpes, Laboratoire d'Informatique de Grenoble |
|
M3-SLU: Evaluating Speaker-Attributed Reasoning in Multimodal Large Language Models
Yejin Kwon, TAEWOO KANG, Hyunsoo Yoon, Chang Ouk Kim Yonsei University |
|
Medispeech: A French Reading and Spontaneous Speech Corpus for Sleepiness Estimation
Colleen Beaumard1, Vincent P. Martin2, Charles Brazier3, Julien Coelho4, Jean-Luc Rouas5, Pierre Philip6 1Univ. Bordeaux, CNRS, Bordeaux INP, LaBRI - UMR 5800, SANPSY - UMR 6033, 2Univ. Lorraine CNRS, Inria, LORIA, 3Univ. Bordeaux, Bordeaux INP, LaBRI, CNRS - UMR 5800, 4SANPSY CNRS UMR 6033, Univ. Bordeaux, CHU Bordeaux, University Department of Sleep Medicine, 5Univ. Bordeaux, Bordeaux INP, LaBRI CNRS - UMR 5800, 6SANPSY CNRS - UMR 6033, Univ. Bordeaux, CHU Bordeaux, University Department of Sleep Medicine |
|
StarDrinks: An English and Korean Test Set for SLU Evaluation in a Drink Ordering Scenario
Marcely Zanon Boito, Caroline Brun, Inyoung Kim, Denys PROUX, Salah Ait-Mokhtar, Nikolaos Lagos, Jean-Luc Meunier, Ioan Calapodescu NAVER LABS Europe |
|
Audio-Lyrics Alignment Dataset for Italian Arias
Pushkar Jajoria1, Arianna Graciotti2, Giovanna Casali3, Jesujoba Alabi1, Rodolfo Delmonte4, Angelo Pompilio3, Rocco Tripodi5, James McDermott6, Dietrich Klakow1 1Saarland University, 2University of Groningen, 3University of Bologna, 4Ca' Foscari University Venice now retired, 5Ca' Foscari University of Venice, Department of Environmental science, Informatics and Statistics, 6University of Galway |
|
A Semi-Automatic Workflow for Transcribing and Annotating Broadcast News
Christoph Draxler1, Sven Grawunder2, Jürgen Trouvain3, Felicitas Kleber4 1Institute of Phonetics and Speech Processing, LMU Munich, 2Max Planck Institute for Evolutionary Anthropology, Department of Linguistics, Leipzig, 3Saarland University, 4Department of Language Science and Technology, Saarland University |
|
The Added Value of Metadata and Annotations: Evidence from Two Large-Scale, Naturalistic Corpus Studies
Anisia Popescu1, Johanna Cronenberg2, Ioana Vasilescu3, Ioana Chitoran4, Lori Lamel5, Martine Adda-Decker6 1Université Paris 8 - Saint Denis, 2LPP, CNRS, 3LISN CNRS, 4Universite de Paris, 5LISN, CNRS, 6LPP (Lab. Phonétique & Phonologie) / LIMSI-CNRS |
|
CS-YODAS: A Mined Dataset of In-the-Wild Code-Switched Speech
Brian Yan1, Qingzheng Wang1, Matthew Wiesner2, Anuj Diwan3, Olga Iakovenko4, Alex Polok5, Injy Hamed6, Shuichiro Shimizu7, Iris Emerman8, Thomas Hain9, David R. Mortensen10, Peter Viechnicki2, Shinji Watanabe1 1Carnegie Mellon University, 2Johns Hopkins University, 3University of Texas at Austin, 4Connex AI, 5Brno University of Technology, 6Mohamed bin Zayed University of Artificial Intelligence, 7Kyoto University, 8n/a, 9University of Sheffield, 10Language Technologies Institute, Carnegie Mellon University |
|
The Limits of Data Scaling: Sub-token Utilization and Acoustic Saturation in Multilingual ASR
Siyu Liang1, Nicolas Ballier2, Gina-Anne Levow1, Richard Wright1 1University of Washington, 2ALTAE, Université Paris Cité |
|
AusKidTalk: Developing Transcription Guidelines for Continuous Australian English Child Speech
Tuende Szalay1, Zheng Nan2, Renata Huang3, Mostafa Shahin2, Sirojan Tharmakulasingam2, Kirrie Ballard1, Beena Ahmed2 1The University of Sydney, 2The University of New South Wales, 3Macquarie University |
| 11:00 - 12:40 | Session P5.2.3: Speech Resources and Processing III - Poster Area |
|
spINAch: A Diachronic Corpus of French Broadcast Speech Controlled for Speakers' Age and Gender
Simon Devauchelle1, David Doukhan2, Remi Uro3, Lucas Ondel4, Valentin Pelloin5, Olympia Imbert-Brégégère2, Véronique Lefort2, Kévin Picard2, Emeline Seignobos2, Albert Rilliard1 1Universite Paris Saclay, CNRS, LISN, 2Institut national de l'audiovisuel (Ina), 3Laboratoire d'Intelligence Artificielle et Sémantique des Données, Université Paris 8 (EA4383), 4LISN, CNRS, 5INA |
|
SALAN: A Massive ASR Dataset for the Languages of Niger
Mamadou K KEITA1, Christopher Homan1, Emily Prud'hommeaux2, Abdoulaye SAKO3, Seydou Diallo4 1Rochester Institute of Technology, 2Boston College, 3ESEO, 4DAUST |
|
Listening for Ideology: Automatic Analysis of Character Speech in Historical Nazi Propaganda Films
Nicolas Ruth, Manuel Burghardt, Andreas Niekler Computational Humanities Group, Leipzig University |
|
Supplementary Resources and Analysis for Automatic Speech Recognition Systems Trained on the Loquacious Dataset
Nick Rossenbach1, Robin Schmitt2, Tina Raissi1, Simon Berger2, Larissa Kleppel1, Ralf Schlüter2 1RWTH Aachen University, 2RWTH Aachen University, AppTek.ai |
|
WhiteHouse: Translation of the Casablanca Corpus for Multi-dialectal Arabic Speech Translation
Fethi Bougares1, Salima Mdhaffar2, Yannick Estève3 1LIUM- Le Mans Université, 2Avignon university, 3LIA - Avignon Université |
|
ToneSwiper: Facilitating Manual ToDI-annotation of Dutch Prosody
Matthijs Westera1 and Ariëlle Reitsema2 1Leiden Universiteit, 2Leiden University |
|
IMaSC: A Malayalam Speech Corpus for High-Quality Text-to-Speech Synthesis
Deepa Gopinath1, Thennal D K2, Vrinda Nair3, Swaraj S4, Sachin G4 1College of Engineering Trivandrum (CET), 2Independent Researcher, 3APJ Abdul Kalam Technological University, 4International Centre for Free and Open Source Solutions (ICFOSS) |
|
Speak in Context: Multilingual ASR with Speech-Context Alignment via Contrastive Learning
Yuchen Zhang1, Haralambos Mouratidis2, Ravi Shekhar2 1University of Essex, 2University of Essex |
|
Task-Lens: Cross-Task Utility Based Speech Dataset Profiling for Low-Resource Indian Languages
Swati Sharma1, Divya Sharma1, Anubha Gupta2 1Indraprastha Institute of Information Technology, Delhi, 2IIIT Delhi |
|
Introducing MELI: The Mandarin-English Language Interview Corpus
Suyuan Liu and Molly Babel University of British Columbia |
|
PhonemeDF: A Synthetic Speech Dataset for Audio Deepfake Detection and Naturalness Evaluation
Vamshi Nallaguntla1, Aishwarya Fursule1, Shruti Kshirsagar1, Anderson Avila2 1Wichita State University, 2Institut national de la recherche scientifique |
|
How Much Data for Stable Formant Values? Pipeline for Convergence Detection Based on Read Speech
Kayla Sward1, Johan Sjons1, Axel Ekstrom2 1Department of Linguistics and Philology, Uppsala University, 2Speech, Music & Hearing, KTH Royal Institute of Technology |
|
MUSCAT: MUltilingual, SCientific ConversATion Benchmark
Supriti Sinhamahapatra1, Thai-Binh Nguyen1, Yigit Oguz1, Enes Ugan1, Jan Niehues1, Alexander Waibel2 1Karlsruhe Institute of Technology, 2Carnegie Mellon University |
| 12:40 - 13:20 | Antonio Zampolli Prize Winner Talk - Room 1 |
| 13:20 - 14:45 | Lunch Break |
| 14:45 - 15:15 | Invited Local Speaker - Room 1 |
| 15:15 - 15:20 | Short Break (5 min) |
| 15:20 - 17:00 | Session O21: Evaluation, Validation V - Room 1 |
| 15:20 - 15:40 |
Towards a Diagnostic and Predictive Evaluation Methodology for Sequence Labeling Tasks
Elena Alvarez-Mellado and Julio Gonzalo UNED School of Computer Science |
| 15:40 - 16:00 |
Memorization or Lucky Guesses: Detecting Short Sequences from Copyrighted Dutch News in LLM Output
Joris Veerbeek1, Kas Berendsen1, Alessandra Polimeno2, Antal van den Bosch1 1Utrecht University, 2DANS |
| 16:00 - 16:20 |
When Numbers Tell Half the Story: Human-Metric Alignment in Topic Model Evaluation
Thibault Prouteau1, Francis Lareau2, Nicolas Dugue3, Jean-Charles Lamirel4, Christophe Malaterre5 1Université de Lorraine, LORIA, CNRS, 2Computer Science Department, Université du Québec à Montréal, 3LIUM, Le Mans Universite, 4LORIA, 5Department of Philosophy & CIRST, Université du Québec à Montréal |
| 16:20 - 16:40 |
Detecting Hallucinations in Authentic LLM-Human Interactions
Yujie Ren, Niklas Gruhlke, Anne Lauscher University of Hamburg |
| 16:40 - 17:00 |
Issue Detection and Category Classification in Domain-Specific Technical Logbooks
Afshin Karimi1, Ingmar Hartl1, Henrik Tuennermann1, Anne Lauscher2 1DESY, 2University of Hamburg |
| 15:20 - 17:00 | Session O22: Information Extraction and Text Mining III - Room 2 |
| 15:20 - 15:40 |
Once upon a Kernel: Extracting Important Events from Narratives
Anshu Sharma1, Miguel Castiblanco-Melendez1, Alejandro Morales1, Mark Finlayson2 1Florida International University, 2FIU |
| 15:40 - 16:00 |
Temporal Expression Recognition in Legal Transcripts
Elizabeth Goldstein1 and Maria Berger2 1ORRO AI Genius, 2Ruhr University Bochum |
| 16:00 - 16:20 |
Multilingual, Multimodal Pipeline for Creating Authentic and Structured Fact-Checked Claim Dataset
Z. Melce Hüsünbeyi1, Virginie Mouilleron2, Leonie Uhling1, Daniel Foppe1, Tatjana Scheffler3, Djamé Seddah2 1Ruhr-University Bochum, 2Inria, 3Ruhr University Bochum |
| 16:20 - 16:40 |
A Study on Building Efficient Zero-Shot Relation Extraction Models
Hugo THOMAS1, Caio Corro2, Guillaume Gravier3, Pascale Sébillot4 1IRISA, RENNES, 2Irisa, INSA Rennes, 3Univ Rennes, CNRS, Inria, IRISA - UMR 6074, France, 4Univ Rennes, INSA Rennes, CNRS, Inria, IRISA - UMR 6074 |
| 16:40 - 17:00 |
Beyond Catalogue Counts: The Dataset Visibility Asymmetry in Low-Resource Multilingual NLP
Zhiyin Tan1 and Changxu Duan2 1L3S Research Center, 2Technische Universität Darmstadt |
| 15:20 - 17:00 | Session O23: Simplification, Plain Language - Room 3 |
| 15:20 - 15:40 |
BLooP: Zero-Shot Abstractive Summarization Using Large Language Models with Bigram Lookahead Promotion
Varun Iyer1 and Cornelia Caragea2 1University of Illinois Chicago, 2University of Illinois at Chicago |
| 15:40 - 16:00 |
OasisSimp: An Open-source Asian-English Sentence Simplification Dataset
Hannah Liu1, Murphy Tian1, Iqra Ali2, Haonan Gao3, Qiaoyiwen Wu1, Blair Yang4, Uthayasanker Thayasivam5, Annie Lee1, Pakawat Nakwijit2, Surangika Ranathunga6, Ravi Shekhar7 1University of Toronto, 2Queen Mary University of London, 3Yale University, 4Coolwei AI Lab, 5University of Moratuwa, 6Massey University, 7University of Essex |
| 16:00 - 16:20 |
Fully Automated Identification of Lexical Alignment and Preference-Stage Shifts in Large Language Models
Thomas Stephan Juzek, Xiaoyang Ming, Jose Hernandez Florida State University |
| 16:20 - 16:40 |
How Much Noise Can BERT Handle? Insights from Multilingual Sentence Difficulty Detection?
Nouran Khallaf and Serge Sharoff University of Leeds |
| 16:40 - 17:00 |
Comparing Reading Behavior across Reader Expertise and Text Complexity: Insights from the French Eye-Tracking Corpus (FETA)
Oksana Ivchenko1 and Natalia Grabar2 1University of Lille, 2CNRS STL UMR8163, Université de Lille |
| 15:20 - 17:00 | Session O24: Machine Learning I - Room 4 |
| 15:20 - 15:40 |
Scaling LLM Reasoning from Minimal Labels: A Semi-Supervised Framework with a Lightweight Verifier
Keizo Kato1, Chenhui Chu2, Yugo Murawaki2, Sadao Kurohashi2 1Fujitsu Limited, 2Kyoto University |
| 15:40 - 16:00 |
PARL: Prompt-based Agents for Reinforcement Learning
Yarik Menchaca Resendiz1 and Roman Klinger2 1University of Stuttgart, 2University of Bamberg |
| 16:00 - 16:20 |
SPQ: An Ensemble Technique for Large Language Model Compression
Jiamin Yao and Eren Gultepe Southern Illinois University Edwardsville |
| 16:20 - 16:40 |
FPSC: A Sustainable Pipeline for Building a Faroese Parliamentary Speech Corpus
Dávid í Lág1, Barbara Scalvini1, Carlos Hernandez Mena2, Jon Gudnason3 1University of the Faroe Islands, 2Barcelona Supercomputing Center, 3Reykjavik University |
| 16:40 - 17:00 |
Efficient Dialect-Aware Modeling and Conditioning for Low-Resource Taiwanese Hakka Speech Processing
Peng An-Ci1, Kuan-Tang Huang1, Tien-Hong Lo1, Hung-Shin Lee2, Hsin-Min Wang3, Berlin Chen1 1National Taiwan Normal University, 2United Link Co., Ltd., 3Academia Sinica |
| 15:20 - 17:00 | Session P6.1.1: Corpora and Treebanks IV - Poster Area |
|
Construction of Japanese Prefectural Assembly Minutes Datasets across Three Electoral Terms: Comparative Analysis of 2011, 2015, and 2019 Four-Year Periods
Keiichi Takamaru1, Hokuto Ototake2, Yuzu Uchida3, Yasutomo Kimura4 1Utsunomiya Kyowa University, 2Fukuoka University, 3Hokkai-Gakuen University, 4Otaru University of Commerce |
|
EDDA-Coordinata: An Annotated Dataset of Historical Geographic Coordinates
Ludovic Moncla1, Pierre Nugues2, Thierry Joliveau3, Katherine McDonough4 1LIRIS, INSA Lyon, 2Lund University, 3UJM/CNRS UMR EVS, 4Lancaster University |
|
Mental Health Disorder Detection beyond Social Media: A Systematic Review of Available Datasets
Sadiya Sayara Chowdhury Puspo1, Ana-Maria Bucur2, Stevie Chancellor3, Özlem Uzuner1, Marcos Zampieri1 1George Mason University, 2Università della Svizzera italiana, 3University of Minnesota |
|
German Counseling Grounding-Act Corpus (GRACO)
Milena Belosevic Bielefeld University |
|
Presenting the Prague Discourse Treebank 4.0
Jiří Mírovský and Pavlína Synková Charles University |
|
Evaluation of Co-Speech Gesture Tracking Techniques in Naturalistic Interactions
Victoria Ivanova and Naomi Harte Trinity College Dublin |
|
Voices across Decades: A Multimodal Diachronic Corpus of German Bundestag Debates (GerParlDia-MM)
Ingo Siegert Otto von Guericke University Magdeburg |
|
MultiWikiQA: A Reading Comprehension Benchmark in 300+ Languages
Dan Smart Alexandra Institute |
|
SALOMO: An Annotation Tool for Complex Annotation Tasks with a Large Number of Labels
Tim Menzner University of Coburg |
|
VietJobs: A Vietnamese Job Advertisement Dataset
Hieu Pham Dinh, Hung Nguyen Huy, Mo El-Haj VinUniversity |
|
A Resource on Dialogical Moves in Native and Non-Native Academic Writers of English
Giulia D'Agostino1, Narjes Sheikh Asadi1, Elena Musi2 1Universita' della Svizzera italiana, 2University of Liverpool |
|
A Corpus-Based Profiling of Regional English Variants in Global Media: Insights from Olympic Journalism
Felix Mao Rye Country Day School |
|
JFC-Recipe: A Dataset for Nutrient Estimation from Japanese User-Generated Cooking Recipes
Keisuke Shirai1, Yoko Yamakata2, Hirotaka Kameko1, Akiko Sunto3, Jun Harashima4, Shinsuke Mori1 1Kyoto University, 2The University of Tokyo, 3Kanagawa University of Human Services, 4LY Corporation |
|
Annotating Conversational Phases and Communication Techniques: A Corpus of German Teacher-Parent Counseling Conversations
Tobias Hallmen1, Kathrin Gietl2, Karoline Hillesheim2, Annemarie Friedrich2, Elisabeth André2 1Chair for Human-Centered Artificial Intelligence, University of Augsburg, 2University of Augsburg |
|
RO-ABSA: A Romanian Dataset and Baselines for Aspect-Based Sentiment Analysis
Gheorghe Alina, Andrei Claudia, Ionescu Elena, Ruseti Stefan, Dascalu Mihai Politehnica University of Bucharest |
|
The Moral Foundations Reddit Corpus
Jackson Trager1, Alireza S. Ziabari1, Elnaz Rahmati1, Aida Mostafazadeh Davani2, Preni Golazizian1, Farzan Karimi-Malekabadi1, Ali Omrani3, Zhihe Li1, Brendan Kennedy4, Georgios Chochlakis1, Nils Karl Reimer5, Melissa Reyes1, Kesley Cheng1, Mellow Wei1, Christina Merrifield1, Arta Khosravi1, Evans Alvarez1, Morteza Dehghani1 1University of Southern California, 2Google, 3Snap, 4Pacific Northwest National Laboratory, 5University of California Santa Barbara |
| 15:20 - 17:00 | Session P6.1.2: Corpora and Treebanks V - Poster Area |
|
From Rosetta to Match-Up: A Paired Corpus of Linguistic Puzzles with Human and LLM Benchmarks
Neh Majmudar1, Anne Huang2, Jinfan Frank Hu2, Elena Filatova3 1PhD Student, 2High School, 3City University of New York (CUNY) |
|
Tracing How Annotators Think: Augmenting Preference Judgments with Reading Processes
Karin de Langis, William Walker, Khanh Le, Dongyeop Kang University of Minnesota |
|
CodeClarity: A Framework and Benchmark for Evaluating Multilingual Code Summarization
Madhurima Chakraborty1, Drishti Sharma2, Maryam Sikander2, Eman Nisar2 1University of California, Riverside, 2Cohere Labs Community |
|
A Longitudinal, Multinational, and Multilingual Corpus of News Coverage of the Russo-Ukrainian War
Dikshya Mohanty, Taisiia Sabadyn, Jelwin Rodrigues, Chenlu Wang, Abhishek Kalugade, Ritwik Banerjee Stony Brook University |
|
SKILL-IR-Discourse: A Large, Annotated Corpus of Argumentation and Domain Discourse on International Relations
Magdalena Wolska1, Matti Wiegmann2, Sassan Gholiagha3, Mitja Sienknecht3, Dora Kiesel1, Irene Lopez Garcia1, Patrick Riehmann4, Bernd Fröhlich1, Katrin Girgensohn3, Jürgen Neyer3, Benno Stein1 1Bauhaus-Universität Weimar, 2University of Kassel, 3Europa-Universität Viadrina, 4Jönköping University |
|
Building Multimodal Corpora Using Microtask Pipelines and Local Annotators
Helmiina Hotti1, Raul Vazquez1, Anna-Kaisa Jokipohja1, Timo Kalliokoski1, Henna Paakki1, Rosa Suviranta1, Tuomo Hiippala2 1University of Helsinki, 2Department of Languages, University of Helsinki |
|
Beyond Fake News Detection: A Community-based Study of the Multicultural Nature of Information Disorder
Sara Gemelli1, Giulia Di Cristina2, Yiran Zhang3, Md Azizul Hoque3, Alberto De La Torre Solís4, Mohamad Behboudi Eshkiki2, Nikolai Efimov2, Mariia Everstova2, Caterina Cappello2, Maziar Kianimoghadam Jouneghani2, Payam Latifi2, Yashar Mahboudi2, Farzaneh Mohseni2, Dario Placenti5, Tommaso Caselli6, Manuela Sanguinetti7, Aurora Scarpellini8, Chiara Zanchi9, Usman Naseem10, Marco Antonio Stranisci2, Simona Frenda11 1University of Pavia, University of Bergamo, 2University of Turin, 3Macquarie University, 4Universidad de Huelva, 5Politecnico di Torino, 6Rijksuniversiteit Groningen, 7University of Cagliari, Department of Mathematics and Computer Science, 8Università di Torino, 9University of Pavia, 10University of Sydney, 11Heriot-Watt University |
|
FreeTxt-Vi: A Benchmarked Vietnamese-English Toolkit for Segmentation, Sentiment, and Summarisation
Hung Nguyen1, Mo El-Haj1, Paul Rayson2, Dawn Knight3 1VinUniversity, 2Lancaster University, 3Cardiff University |
|
The Patrologia Graeca Corpus: OCR, Annotation, and Open Release of Noisy Nineteenth-Century Polytonic Greek Editions
Chahan Vidal-Gorène1 and Bastien Kindt2 1Ecole nationale des chartes-PSL University, Centre Jean Mabillon, LIPN, Calfa, 2UCLouvain/Institut Orientaliste |
|
National Library as Corpus: DeLiKo-2025@DNB a Very Large Corpus of German-language Contemporary Literature
Marc Kupietz1, Nils Diewald2, Philippe Genêt3, Andreas Witt1 1Leibniz Institute for the German Language, 2IDS Mannheim, 3Deutsche Nationalbibliothek |
|
Multi-party Conversational Corpus of L1 and L2 for Speech Alignment Research (Teams-SK): Methodological Approach
Stefan Benus1, Viktor Gatial2, Erik György2, Mária Hricková2, Martin Kaimír2, Zuzana Kozáciková2, Lucia Mareková2, Róbert Sabo3, Marian Trnka3, Erik Vráb2 1Constantine the Philosopher University in Nitra, Institute of Informatics, SAS, Bratislava, 2Constantine the Philosopher University in Nitra, 3Institute of Informatics, SAS, Bratislava |
|
Is Semi-Automatic Transcription Useful in Corpus Creation? Preliminary Considerations on the KIParla Corpus
Martina Simonotti1, Ludovica Pannitto2, Eleonora Zucchini3, Silvia Ballarè4, Caterina Mauri2 1DIT - University of Bologna, 2LILEC - University of Bologna, 3Masaryk University, 4FICLIT - University of Bologna |
|
Open Korean Historical Corpus: A Millennia-Scale Diachronic Collection of Public Domain Texts
Seyoung Song1, Nawon Kim2, Songeun Chae1, Kiwoong Park1, Jiho Jin1, Haneul Yoo1, Kyunghyun Cho3, Alice Oh1 1KAIST, 2Department of Sinographic Literatures, Korea University, 3New York University |
|
NAIST LIFE STORY: A Seven-Year Crowdsourced Dataset of Japanese Emotion-related Episodes
Kazuhiro Ito1, Junko Hayashi2, Hiroyuki Nagai1, Shoko Wakamiya2, Eiji ARAMAKI3 1NARA Institute of Science and Technology, 2NAIST, 3NAIST, Japan |
|
Audience Engagement with Arabic Women's Social Empowerment and Wellbeing: A Decadal Corpus
Wajdi Zaghouani1, Mabrouka Bessghaier1, Md. Rafiul Biswas2, Shimaa Ibrahim1 1Northwestern University Qatar, 2Hamad Bin Khalifa University |
|
ArPoMeme: An Annotated Arabic Multimodal Dataset for Political Ideology and Polarization
Wajdi Zaghouani1, Kais Attia2, Md. Rafiul Biswas3, Fadhl Eryani4 1Northwestern University Qatar, 2Freelance, 3Hamad Bin Khalifa University, 4University of Tübingen |
| 15:20 - 17:00 | Session P6.1.3: Corpora and Treebanks VI - Poster Area |
|
JobArabi: An Arabic Corpus and Analysis of Job Announcements from Social Media
Wajdi Zaghouani1, Shimaa Ibrahim1, Mabrouka Bessghaier1, Houda Bouamor2 1Northwestern University Qatar, 2Carnegie Mellon University in Qatar |
|
ParaCLEAN: Improving Translation Quality through Systematic Parallel Data Cleaning
Audrey Mash, Ella Bohman, Maite Melero BSC |
|
DReUD: Discourse Relations in Universal Dependencies
Jiří Mírovský and Pavlína Synková Charles University |
|
MultiGraSCCo: A Multilingual Anonymization Benchmark with Annotations of Personal Identifiers
Ibrahim Baroud1, Christoph Otto2, Vera Czehmann3, Christine Hovhannisyan4, Lisa Raithel5, Sebastian Möller6, Roland Roller7 1Technische Universität Berlin, 2University of Potsdam, 3German Research Center for Artificial Intelligence (DFKI) and Technical University of Berlin, 4Quality & Usability Lab, Technische Universität Berlin; Department of Psychology, Humboldt-Universität zu Berlin, 5Technische Universitaet Berlin, BIFOLD, DFKI GmbH, 6Quality and Usability Lab, TU Berlin, 7DFKI SLT Lab |
|
Structured Legal Document Generation in India: A Model-Agnostic Wrapper Approach with VidhikDastaavej
Shubham Nigam1, Deepak Patnaik Balaramamahanthi2, Noel Shallum3, Kripabandhu Ghosh4, Arnab Bhattacharya5 1University of Birmingham, 2Indian Institute of Technology, Kanpur, 3Symbiosis Law School Pune, 4Indian Institute of Science Education and Research- Kolkata (IISER-K), 5Dept. of Computer Science and Engineering, IIT Kanpur |
|
PolyglotQL: A Pipeline for Multilingual Text-to-SPARQL Dataset Generation
Julio Perez1, Fabio Barth2, Georg Rehm2 1Technical University of Berlin, 2DFKI |
|
Building and Annotating a Large Comparable Corpus for Studying the Semantic Quantification - Chinese, French, Japanese, Korean
Raoul Blin1, Jinnam Choi2, Wu Qishen3, Yuxin Zhang4, Soonhee Hwang5, Takahiro Morita6, Alexander Delaporte1, Ilaine Wang7, Chang Liu7 1CNRS-CRLAO, 2CLLE, Université Jean-Jaurès, 3Paris Nanterre, 4Sorbonne Nouvelle, 5Hongik University, 6Kyoto University, 7INALCO |
|
Towards the Generation and Application of Dynamic Web-Based Visualization of UIMA-based Annotations for Big-Data Corpora with the Help of Unified Dynamic Annotation Visualizer
Thiemo Dahmann1, Julian Schneider1, Philipp Stephan1, Giuseppe Abrami1, Alexander Mehler2 1Goethe University Frankfurt, 2Goethe-University Frankfurt am Main |
|
The MultiplEYE Text Corpus: Towards a Diverse and Ever-Expanding Multilingual Text Corpus
Ramune Kaspere1, Anna Bondar2, Sergiu Nisioi3, Maja Stegenwallner-Schütz4, Hanne B. Søndergaard Knudsen5, Ana Matic6, Eva Pavlinušić Vilus2, Dorota Klimek-Jankowska7, Chiara Tschirner2, Not Battesta Soliva2, Deborah Jakobi2, Cui Ding2, Dima Abu Romi8, Cengiz Acarturk9, Matilda Agdler2, Anton Alexandru10, Mohd Faizan Ansari11, Annalisa Arcidiacono12, Elizabete Barisa13, Ana Bautista14, Lisa Beinborn15, Yevgeni Berzak16, Nedeljka Bjelanovic17, Anna Bothmann18, Jan Brasser2, Caterina Cacioli19, Anila Çepani20, Ilze Ceple13, Adelina Cerpja21, Dalí Chirino22, Jan Chromý23, Alessandro Corona Mendozza24, Iria de-Dios-Flores25, Nazik Dinçtopal Deniz26, Ana Došen6, Kristian Elersic27, Inmaculada Fajardo28, Zigmunds Freibergs29, Angelina Ganebnaya13, Shan Gao2, Jéssica Gomes30, Annjo Greenall31, Alba Haveriku32, Miao He33, Anamaria Hodivoianu10, Yu-Yin Hsu34, Amanda Isaksen31, Andreia Janeiro30, Kristine Jensen de López5, Aleksandar Jevremovic35, Vojislav Jovanovic36, Hanna Kedzierska7, Nik Kharlamov5, Sara Kosutar37, Nelda Kote32, Vanja Kovic36, Izabela Krejtz38, Thyra Krosness2, Oleksandra Kuvshynova10, Eilam Lavy39, Ella Lion16, Marta Lockiewicz40, Kaidi Lõo29, Paula Luegi30, Mircea Mihai Marin10, Clara Martin41, Svitlana Matvieieva42, Diane Mézière43, Xavier Mínguez-López28, Valeriia Modina44, Jurgita Motiejuniene1, Marie-Luise Müller45, Tolgonai Nasipbek kyzy46, Jamal Abdul Nasir47, Johanne Nedergård24, Aysegül Özkan48, Patrizia Paggio24, Marijan Palmovic6, Maria Christina Panagiotopoulou2, Alberto Parola24, Helena Pérez49, Klaudia Petersen50, Anja Podlesek27, Eva Pospíšilová51, Marta Praulina13, Mikuláš Preininger52, Loredana Punga53, Diego Rossini46, Špela Rot54, Habib Sani Yahaya55, Irina A. Sekerina44, Anne Skadina13, Jordi Solé-Casals56, Lonneke van der Plas46, Saara M. Varjopuro43, Spyridoula Varlokosta57, João Veríssimo30, Oskari Juhapekka Virtanen43, Nemanja Vracar58, Mila Vulchanova31, Ahmad Wali10, Peizheng Wu2, Nilgün Yücel59, Stefan Frank22, Nora Hollenstein2, Lena Jäger2 1Kaunas University of Technology, 2University of Zurich, 3Human Language Technologies Research Center, University of Bucharest, 4University of Koblenz, 5Aalborg University, 6University of Zagreb, 7University of Wroclaw, 8Technion - Israel Institute of Technology, 9Cognitive Science Department, Jagiellonian University, 10University of Bucharest, 11Silesian University of Technology, 12University of Bergen, 13University of Latvia, 14Basque Center on Cognition, Brain and Language; University of the Basque Country, 15University of Goettingen, 16Technion - Israel Institute of Technology, 17Institute for Literature and Arts, 18University College London, 19Università di Firenze, 20University of Tirana, 21Institute of Linguistic and Literature, Academy of Sciences of Albania, 22Radboud University, 23Charles University (Prague), 24University of Copenhagen, 25Universitat Pompeu Fabra, 26Bogaziçi University, 27University of Ljubljana, 28University of Valencia, 29University of Tartu, 30University of Lisbon, 31Norwegian University of Science & Technology, 32Polytechnic University of Tirana, 33University of Konstanz, 34The Hong Kong Polytechnic University, 35Singidunum University, 36University of Belgrade, 37UiT The Arctic University of Norway, 38SWPS University, 39The Hebrew University of Jerusalem, 40University of Gdansk, 41Basque Center on Cognition, Brain and Language; Ikerbasque Basque Foundation for Science, 42Dragomanov Ukrainian State University, 43University of Turku, 44City University of New York, 45Leibniz Institute for Psychology, 46Università della Svizzera italiana, 47University of Galway, 48Baskent University, 49University of Santiago de Compostela, 50Copenhagen University, 51Charles University, 52Czech Academy of Sciences, 53West University of Timișoara, 54St. Stanislav's Institution, 55Gozak Media, 56University of Vic - Central University of Catalonia, 57National and Kapodistrian University of Athens, 58University of Padua, 59Marmara University |
|
Sanskrit Travelogue: A Large-Scale Unified and Annotated Corpus of Sanskrit Texts
Giacomo De Luca1, Danilo Croce2, Roberto Basili2 1University of Tor Vergata, 2University of Roma, Tor Vergata |
|
The Foggia Occupator Corpus: Digitisation, Annotation, and Computational Analysis of an Occupation-Era Newspaper (1945-1946)
Michele Ciletti University of Foggia |
|
SiDiaC-v.2.0: Sinhala Diachronic Corpus Version 2.0
Nevidu Jayatilleke1, Nisansa de Silva2, Uthpala Sooriya-Arachchi3, Gagani Kulathilaka3, Azra Safrullah3, Johan Sofalas3 1Department of Computer Science & Engineering, University of Moratuwa, 2University of Moratuwa, 3Informatics Institute of Technology |
|
ShAnEL-2: A Multilingual Benchmarking Dataset for Short-Answer Language Learning Exercises
Jasper Degraeuwe1 and Thomas Moerman2 1Ghent University, 2Ghent University, LT3 |
|
The Swedish Parliamentary Motions Corpus 1867-2024
Robert Borges1, Fredrik Mohammadi Norén2, Lotta Åberg Brorsson3, Väinö Yrjänäinen4, Hanna Bäck5, Robert Klemmensen5, Måns Magnusson4 1Uppsala University, 2School of Arts and Communication, Malmö University, 3The Riksdag Library, 4Department of Statistics, Uppsala University, 5Department of Political Science, Lund University |
|
The Swedish Benchmark of Linguistic Minimal Pairs
Johan Sjons1, Fredrik Heinat2, Murathan Kurfali3 1Department of Linguistics and Philology, Uppsala University, 2Språk- och litteraturcentrum, Lund University, 3RISE Research Institutes of Sweden |
|
Exploring the Transfer of Irony Explanation Generation from English to Dutch
Aaron Maladry1, Els Lefever2, Cynthia Van Hee3, Veronique Hoste2 1Ghent University, 2LT3, Ghent University, 3LT3, Language and Translation Technology Team (Ghent University) |
|
DIDECO: An Annotated Dataset for Intent Detection in Digital Communications
Senaid Popovic1, Damien Riquet2, Maxime Meyer2, Fabien Lauer3, Yannick Parmentier3 1Université de Lorraine, 2Hornetsecurity, 3LORIA |
| 15:20 - 17:00 | Session P6.1.4: Corpora and Treebanks VII - Poster Area |
|
GUMBridge: A Corpus for Varieties of Bridging Anaphora
Lauren Levine and Amir Zeldes Georgetown University |
|
Beyond Transcripts: Iterative Peer-Editing with Audio Unlocks High-Quality Human Summaries of Conversational Speech
Kaavya Chaparala1, Thomas Thebaud2, Jesus Villalba Lopez2, Laureano Moro-Velazquez2, Peter Viechnicki2, Najim Dehak2 1Johns Hopkins, 2Johns Hopkins University |
|
SEEM-CZ: Annotation and Classification of Epistemic Markers in Czech
Barbora Štěpánková1, Michal Novák2, Tomáš Musil3, Lucie Polakova3 1Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics, 2Charles University, Faculty of Mathematics and Physics, 3Charles University |
|
When Words Don't Mean What They Say: Figurative Understanding in Bengali Idioms
Adib Sakhawat1, Shamim Parveen2, Md Ruhul Amin2, Tahera Khatun3, Shamim Mahmud2, Md Saiful Islam4 1Islamic University of Technology, 2Govt. Teachers' Training College, Rajshahi, 3Rajshahi Govt. Girl's High School, Helenabad, Rajshahi, 4Govt. Teachers' Training College, Rajshahi |
|
Human vs LLM in Conversational Repair Annotation: A New Resource and Comparative Study
Anh Ngo1, Nicolas Rollet2, Catherine Pelachaud3, Chloé Clavel4 1Inria, 2ALMAnaCH, INRIA Paris; Télécom Paris, SES, Institut Polytechnique de Paris, I3-CNRS, 3CNRS, ISIR, Sorbonne University, 4ALMAnaCH, INRIA Paris; Télécom Paris, LTCI, Institut Polytechnique de Paris |
|
GPT-NL Public Corpus: A Permissively Licensed, Dutch-First Dataset for LLM Pre-training
Jesse Van Oort1, Frank Brinkkemper2, Erik de Graaf1, Bram Vanroy3, Saskia Lensink1 1TNO, 2GPT-NL, 3Instituut voor de Nederlandse Taal & KU Leuven |
|
Estonian WinoGrande Dataset: Comparative Analysis of LLM Performance on Human and Machine Translation
Marii Ojastu, Hele-Andra Kuulmets, Aleksei Dorkin, Marika Borovikova, Dage Särg, Kairit Sirts University of Tartu |
|
GENIUS Keylog Corpus - a German High School Student Corpus with Keystroke Logging Data
Nils-Jonathan Schaller1, Thorben Jansen1, Lars Höft1, Hannah Pünjer1, Andrea Horbach2 1Leibniz Institute for Science and Mathematics Education, 2CAU Kiel / Leibniz Institute for Science and Mathematics Education |
|
OTA-BOUN: A Historical Turkish Dependency Treebank
Tarik Tiras1, Nureddin Ünal1, Ada Cengiz1, Ece Yurtseven2, Esma Tasdemir3, Saziye Ozates4 1Bogaziçi University, 2Robert College, 3Medeniyet University, 4Bogazici University |
|
TCMPHal: A Large-scale Dataset for Hallucination Detection in Traditional Chinese Medicine Pharmacy
Nijia Han1, Zimu Wang2, Ziwen Xie1, Wei Wang1, Jia Meng1, John Moraros1, Shuihua Wang1 1Xi'an Jiaotong-Liverpool University, 2University of Liverpool |
|
AraREQ: A Dataset and End-to-End System for Conflict Detection and Resolution in Software Requirements
Tymaa Hammouda1, Alaa Aljabari1, Nagham Hamad1, Mustafa Jarrar2 1Birzeit University, 2Hamad Bin Khalifa University |
|
MAD: A Corpus of Multilingual Argumentative Deliberation
Eimear Maguire, Ella Schad, Jacky Visser, Chris Reed, John Lawrence University of Dundee |
|
Infox-QC: A Quebec-Focused French Corpus for Misinformation Detection and AI Robustness Assessment
Moetaz Doghmane1, Hazem Amamou2, Thiziri Sefsaf3, Alan Davoust4, Anderson Avila1 1Institut national de la recherche scientifique, 2Student, 3INRS, 4Université du Québec en Outaouais |
|
unarXive 2024: A Large-Scale Scientific Corpus for Citation-Aware Retrieval and Generation
Ines Besrour and Michael Färber TU Dresden |
|
EPIC-EuroParl-UdS: Information-Theoretic Perspectives on Translation and Interpreting
Maria Kunilovskaya1 and Christina Pollklaesener2 1Saarland University, 2Hildesheim University |
|
FeedFetcher: A Resilient Web Feed Downloader for Corpus Construction
Ondrej Herman1, Jan Kraus2, Vit Suchomel3 1Masaryk University, 2Lexical Computing, 3Natural Language Processing Centre, Masaryk University |
|
Human-in-the-Loop Mass Transcription and Ground Truth Annotation for Challenging Historical Documents
Norbert Fischer and Frank Puppe Julius-Maximilians-Universität Würzburg |
| 17:00 - 17:20 | Coffee Break |
| 17:20 - 19:00 | Session O25: Corpora, Treebanks and Annotation - Room 1 |
| 17:20 - 17:40 |
CoMMA, a Large-scale Corpus of Multilingual Medieval Archives
Thibault Clérice1, Simon Gabay2, Malamatenia Vlachou-Efstathiou3, Ariane Pinche4, Benoît Sagot5 1ALMAnaCH, Inria, 2Université de Genève, 3Ecole nationale des ponts et chaussées, 4CNRS, 5Inria |
| 17:40 - 18:00 |
Conversion of the Clark Hall Dictionary of Old English to TEI with RDF: An End-to-end Pipeline for Lexicographic Resource Retrodigitization
Sergei Stoliarov1, Maxim Ionov2, Fahad Khan3, Marina Buzzoni1, Francesca Frontini4 1Ca' Foscari University of Venice, 2University of Zaragoza, 3Istituto di Linguistica Computazionale "Antonio Zampolli", CNR, 4Istituto di Linguistica Computazionale "A. Zampolli" - ILC Consiglio Nazionale delle Ricerche - CNR |
| 18:00 - 18:20 |
AMORES: A Spanish Language Resource for an Extended Set of Moral Foundations
Oscar Araque1, Daniel Molina2, Anny Alvarez Nogales3, Carlos A. Iglesias3 1Universidad Politecnica de Madrid, 2SocialInnolabs, 3Universidad Politécnica de Madrid |
| 18:20 - 18:40 |
The Moralization Corpus: Frame-Based Annotation and Analysis of Moralizing Speech Acts across Diverse Text Genres
Maria Becker, Mirko Sommer, Lars Tapken, Yi Wan Teh, Bruno Brocai Heidelberg University |
| 18:40 - 19:00 |
Targum - A Multilingual New Testament Translation Corpus
Maciej Rapacz and Aleksander Smywinski-Pohl AGH University of Kraków |
| 17:20 - 19:00 | Session O26: Named Entity Recognition, Speech Resources - Room 2 |
| 17:20 - 17:40 |
Trigger Warnings Are Grounded in a Shared Vocabulary: A Corpus Analysis with User-Generated Labels
Sebastian Heineking1, Matti Wiegmann1, Magdalena Wolska2, Benno Stein2, Martin Potthast3 1University of Kassel, 2Bauhaus-Universität Weimar, 3University of Kassel, hessian.AI, and ScaDS.AI |
| 17:40 - 18:00 |
ENEIDE: A High Quality Silver Standard Dataset for Named Entity Recognition and Linking in Historical Italian
Cristian Santini1, Sebastian Barzaghi2, Paolo Sernani1, Emanuele Frontoni1, Laura Melosi1, Mehwish Alam3 1University of Macerata, 2University of Bologna, 3Telecom Paris, Polytechnic Institute of Paris |
| 18:00 - 18:20 |
YoNER: A New Yorùbá Multi-domain Named Entity Recognition Dataset
Peace Falola1, Jesujoba Alabi2, Solomon Akinola1, Folashade Ogunajo3, Emmanuel Alabi1, David Ifeoluwa Adelani4 1University of Ibadan, 2Saarland University, 3Atiba University, 4McGill University / MILA |
| 18:20 - 18:40 |
Linking Rationale to Decision on Internet Standards: A Retrieval-Based Approach Using Synthetic Data
Jie Bian and Michael Welzl University of Oslo |
| 18:40 - 19:00 |
The GELATO Dataset for Legislative NER
Matthew Flynn, Timothy Obiso, Sam Newman Brandeis University |
| 17:20 - 19:00 | Session O27: Simplification, Plain Language and Assistive Technologies - Room 3 |
| 17:20 - 17:40 |
Controllable Sentence Simplification in Italian: Fine-Tuning Large Language Models on Automatically Generated Resources
Michele Papucci1, Giulia Venturi2, Felice Dell'Orletta3 1ItaliaNLP Lab @ CNR-ILC, Università di Pisa, 2Institute of Computational Linguistics "Antonio Zampolli" (ILC-CNR), 3ItaliaNLP Lab @ Institute for Computational Linguistics "Antonio Zampolli", ILC - CNR |
| 17:40 - 18:00 |
Evaluating LLM-based Text Simplification for German: Effects on Post-Editing Effort, Quality Ratings, and User Comprehension
Luisa Carrer1, Andreas Säuberli2, Martin Kappus3, Lukas Fischer4, Sarah Ebling5 1School of Applied Linguistics, ZHAW Zurich University of Applied Sciences, 2LMU Munich, 3Zurich University of Applied Sciences, 4Department of Computational Linguistics, University of Zurich, 5University of Zurich |
| 18:00 - 18:20 |
Reading Time in the Wild: An Assessment of Readability Predictors Based on Naturally-Observed Reading Times
Sijbren van Vaals, Rik van Noord, Malvina Nissim University of Groningen |
| 18:20 - 18:40 |
Document-Level Text Simplification in Estonian Using Large Language Models
Meeri-Ly Muru1 and Eduard Barbu2 1National Library of Estonia, 2Institute of Computer Science |
| 18:40 - 19:00 |
A Human-in/on-the-Loop Framework for Accessible Text Generation
Lourdes Moreno and Paloma Martínez Universidad Carlos III de Madrid |
| 17:20 - 19:00 | Session O28: Applications Involving LRs and Evaluation II - Room 4 |
| 17:20 - 17:40 |
Automatic Analysis of Collaboration through Human Conversational Data Resources: A Review
Yi Yu1, Maria Boritchev2, Chloé Clavel3 1Inria Paris, University of Sorbonne, 2Télécom Paris, Institut Polytechnique de Paris, 3INRIA |
| 17:40 - 18:00 |
Benchmarking Arabic Authorship Attribution and Style Transfer with Large Language Models
Injy Hamed1, Bashar Alhafni2, Nizar Habash3, Thamar Solorio2 1Mohamed bin Zayed University of Artificial Intelligence, 2MBZUAI, 3New York University Abu Dhabi |
| 18:00 - 18:20 |
ADHD-Lang: A Large-Scale Social Media Dataset for Verbal Behavior and Digital Phenotyping in Adult ADHD
Daniel Wiechmann1, Elma Kerz2, Edward Kempa3, Yu Qiao2 1Institute for Logic Language and Computation, 2Exaia Technologies, 3University of Florida, Department of Computer and Information Science and Engineering |
| 18:20 - 18:40 |
SynBullying: A Multi-LLM Synthetic Conversational Dataset for Cyberbullying Detection
Arefeh Kazemi1, Hamza Qadeer1, Joachim Wagner2, Hossein Hosseini3, Sri Balaaji Natarajan Kalaivendan1, Brian Davis1 1Dublin City University, 2ADAPT Centre, Dublin City University, 3University of Isfahan |
| 18:40 - 19:00 |
The Multilingual Euphemism Benchmark: Datasets and Baselines for Pragmatic Language Understanding
Whitney Poh1, Julia Sammartino1, Jasper Andrew1, Witold Kieras2, Natalia Zawadzka-Paluektau2, Iryna Dilai3, Libby Barak1, Jing Peng1, Anna Feldman1 1Montclair State University, 2Institute of Computer Science, Polish Academy of Sciences, 3National University of Lviv |
| 17:20 - 19:00 | Session P7.1: Document Classification - Poster Area |
|
Advancing Retrieval-Augmented Generation for Persian: Development of Language Models, Comprehensive Benchmarks, and Best Practices for Optimization
Sara Bourbour Hosseinbeigi1, Mohammad Hossein Shalchian2, Sina Asghari3, Mohammad Ali Seif Kashani4, Mohammad Amin Abbasi5 1Department of Industrial and Systems Engineering, Tarbiat Modares University, 2Sharif University of Technology, 3Department of Computer Science, Iran University of Science and Technology, 4Department of Computer Engineering, Sharif University of Technology, 5Department of Computer Engineering, Iran University of Science and Technology |
|
Corpus and Baselines for Distinguishing Authentic, AI-Generated, and AI-Enhanced Resumes
Andrea Loizidou1, Anshu Sharma1, Adrian Esquivel2, Mark Finlayson3, Mustafa Ocal1 1Florida International University, 2TECKpert Inc., 3FIU |
|
Mute Cods: A Multilingual Telegram Dataset with Benchmark Models for Conspiracy Theory Detection
Katarina Laken1, Erik Marino2, Paloma Piot3, Davide Bassi4, Søren Fomsgaard5, Michele Maggini6, Renata Vieira7, Marcos Garcia8, Sara Tonelli9 1Fondazione Bruno Kessler, 2Universidade de Évora, 3Universidade da Coruna, 4Citius - Universidade de Santiago de Compostela, 5University of Caen, 6Centro Singular de Investigación en Tecnoloxías Intelixentes da USC, 7Évora University, 8Universidade de Santiago de Compostela, 9FBK |
|
Push and Pull: Training Sentence Encoders with Contrastive Losses for Distance-Based Multi-Label Text Classification
Jens Van Nooten1 and Andriy Kosar2 1University of Antwerp, 2Textgain |
|
PRIVaThe: An Annotated Dataset of Multi-Objectives Web Search Sessions
Claire Ibarboure1, Ludovic Tanguy2, Franck Amadieu1, Josiane Mothe3 1CLLE, UT2J, University of Toulouse & CNRS, 2CLLE: University of Toulouse & CNRS, 3INSPE, UT2J, University of Toulouse, CLLE & CNRS |
|
Towards Safer Calls for Everyone: Designing a Benchmark Dataset for Evaluating Voice Phishing Detection Models
Joeun Kang1, Gyuri Choi1, Chanhyuk Yoon2, Yongbin Jeong2, Younggyun Hahm3, Shea Husband1, Hansaem Kim1 1Yonsei University, 2Teddy Sum, 3Teddysum |
|
Learning Long-Document Embeddings via ChunkContext Entailment
Waheed Ahmed Abro1, Naïm Es-Sebbani2, Zied Bouraoui2 1SDAIA-KFUPM Joint Research Center for Artificial Intelligence, 2CRIL-CNRS & University of Artois |
|
Scientific Article Section Classification (SASC) Dataset
Nicolau Duran-Silva1, Julian Moreno-Schneider2, César Parra-Rojas3, Georg Rehm2 1SIRIS Lab, Research Division of SIRIS Academic & Universitat Pompeu Fabra, 2DFKI, 3SIRIS Lab, Research Division of SIRIS Academic |
|
JMTEB and JMTEB-lite: Japanese Massive Text Embedding Benchmark and Its Lightweight Version
Shengzhe Li1, Masaya Ohagi1, Ryokan Ri2, Akihiko Fukuchi1, Tomohide Shibata1, Daisuke Kawahara3 1SB Intuitions Corp., 2Google DeepMind, 3Waseda University |
|
Construction of a Japanese RAG Benchmark Using Synthetic Documents on Non-existent Entities and Events
Shengzhe Li1, Masaya Ohagi1, Hayato Tsukagoshi2, Akihiko Fukuchi1, Tomohide Shibata1, Daisuke Kawahara3 1SB Intuitions Corp., 2Nagoya University, 3Waseda University |
|
C4: A Multilingual Benchmark for Retrieval-Augmented Generation Based on the Catechism of the Catholic Church and Its Compendium
Pius von Däniken1, Mark Cieliebak2, Jan Deriu2 1Zurich University of Applied Sciences ZHAW, 2Zurich University of Applied Sciences |
| 17:20 - 19:00 | Session P7.2.1: Information Extraction and Text Mining IV - Poster Area |
|
Contrastively Pre-trained Event Embeddings with Schema-free LLM Annotations
Frank Mtumbuka and Steven Schockaert Cardiff University |
|
A Dataset of Psychiatric Hospital Notes with Temporal Information Annotations
Timothy Miller1, Gaby Dinh2, David Harris2, WonJin Yoon3, Spencer Thomas2, Boyu Ren4, Meihua Hall5, Guergana Savova1 1Boston Children's Hospital and Harvard Medical School, 2Boston Children's Hospital, 3Boston Children's Hospital, Harvard University, 4Mass General Brigham, 5McLean Hospital, HMS |
|
Format Matters: A Critical Evaluation of Output Formats for Prompting LLMs in SLU and NER
Pierre Lepagnol1, Sahar Ghannay2, Thomas Gerald3, Christophe Servan4, Sophie Rosset5 1LISN - Université Paris-Saclay - SCIAM, 2CNRS, LISN, 3CNRS, Université Paris Saclay, LISN, 4AMIAD - CNRS, LISN, 5Université Paris-Saclay, CNRS, LISN |
|
Identifying Imaging Follow-Up in Radiology Reports: A Comparative Analysis of Traditional ML and LLM Approaches
Namu Park1, Giridhar Kaushik Ramachandran2, Kevin Lybarger3, Fei Xia4, Özlem Uzuner3, Martin Gunn4, Meliha Yetisgen4 1University of Washington, Seattle, 2Novartis Institutes for BioMedical Research, 3George Mason University, 4University of Washington |
|
Efficient Topic Extraction via Graph-Based Labeling: A Lightweight Alternative to Deep Models
Salma Mekaoui1, Hiba Sofyan2, Imane Benchrif2, Imane Amaaz2, Ilham Chaker3, Arsalane Zarghili3, Nikola Nikolov1 1University of Limerick, 2Euromed University of Fez | School of Digital Engineering and Artificial Intelligence, 3Faculty of Sciences and Technology, University Sidi Mohamed Ben Abdellah |
|
From Noise to Signal: When Outliers Seed New Topics
Evangelia Zve1, Gauvain Bourgne2, Benjamin Icard2, Jean-Gabriel Ganascia2 1LIP6 - Sorbonne University, Infopro Digital, 2LIP6 - Sorbonne University |
|
Explore Political Discourse with Transformers. Emergent Paradigmatic and Syntagmatic Representations.
Laurent Vanni1 and Damon Mayaffre2 1UMR 7320 BCL - Univ. cote d'azur - CNRS - France, 2UMR 7320 BCL - Univ. cote d'azur - CNRS - France |
|
The Growing Gains and Pains of Iterative Web Corpora Crawling: Insights from South Slavic CLASSLA-web 2.0 Corpora
Taja Kuzman Pungeršek1, Peter Rupnik2, Vit Suchomel3, Nikola Ljubešić1 1Jožef Stefan Institute, 2Jožef Stefan Institute, 3Natural Language Processing Centre, Masaryk University |
|
MaritimEmails: A Synthetic Dataset for Maritime Chartering Correspondence
Kevin Bruendler and Simon Clematide University of Zurich |
|
eSciBench: An Extensible Scientific PDF Extraction Benchmark
Noah Tremblay Taillon1 and Philippe Langlais2 1DIRO/RALI, 2University of Montreal |
|
Vrittanta-AS: Dataset Development and Benchmarking for Event Trigger Detection and Classification in Assamese
Chaitanya Kirti, Dhrubajyoti Pathak, Ashish Anand, Prithwijit Guha Indian Institute of Technology Guwahati |
|
From Facts to Hypotheses: Joint Detection of Biomedical Relations and Epistemic Commitment Using LLMs
Aleksandra Gabryszak1, Phuc Tran Truong1, Arne Binder1, Nikola Milosevic2, Felix-Sebastian Keese3, Astrid Rheinländer3, Philippe Thomas4 1German Research Center for Artificial Intelligence (DFKI), 2Bayer A.G., 3Bayer AG, 4German Research Center for Artificial Intelligence |
|
SciLaD: A Large-Scale, Transparent, Reproducible Dataset for Natural Scientific Language Processing
Luca Foppiano1, Sotaro Takeshita2, Pedro Ortiz Suarez3, Ekaterina Borisova4, Raia Abu Ahmad4, Malte Ostendorff5, Fabio Barth6, Julian Moreno-Schneider6, Georg Rehm6 1ScienciaLAB, DFKI, Inria, 2University of Mannheim, 3Common Crawl Foundation, 4German Research Center for Artificial Intelligence (DFKI), 5German Research Center for Artificial Intelligence, 6DFKI |
|
CausalSense: Leveraging Common Sense Knowledge and LLMs for Joint Event Extraction and Relation Classification
Youssra Rebboud1, Pasquale Lisena2, Raphael Troncy2 1EURECOM, Sophia Antipolis, 2EURECOM |
| 17:20 - 19:00 | Session P7.2.2: Information Extraction and Text Mining V - Poster Area |
|
Large Language Models Are Good Term Extractors: A Systematic Evaluation
Ayla Rigouts Terryn Université de Montréal, Mila |
|
A Large-Scale Dataset for Linking-Based Geocoding
Hibiki Nakatani1, Yuichiro Yasui2, Ryosuke Wakamoto2, Masayuki Ishii2, Tetsuhisa Suizu1, Hiroki Ouchi1, Taro Watanabe1 1Nara Institute of Science and Technology, 2Nikkei Inc. |
|
FiNERVINER: Fine-grained Named Entity Recognition for Vulnerable Languages of India's North Eastern Region
Prachuryya Kaushik and Ashish Anand Indian Institute of Technology Guwahati |
|
APTFiNER: Annotation Preserving Translation for Fine-grained Named Entity Recognition
Prachuryya Kaushik1, Adittya Gupta1, Ajanta Maurya1, Gautam Sharma2, V. Saradhi3, Ashish Anand1 1Indian Institute of Technology Guwahati, 2Indian Institute of Technology, Guwahati, 3Associate Professor |
|
RelEx-PT: A Portuguese Sentence-Level Relation Extraction Dataset
Tomás Pinto1, Catarina Silva2, Hugo Goncalo Oliveira3 1University of Coimbra, CISUC/LASI, DEI, 2University of Coimbra, 3CISUC, DEI, University of Coimbra |
|
Benchmarking Portuguese Open Information Extraction
Gabriel Silva, Mário Rodrigues, António Teixeira, Marlene Amorim Universidade de Aveiro |
|
A Scalable Pipeline for Novelty Detection in Skill Extraction Using Large Language Models
Gian Seifert1 and Simon Clematide2 1University of Zürich, 2University of Zurich |
|
Do LLMs Judge Distantly Supervised Named Entity Labels Well? Constructing the JudgeWEL Dataset
Alistair Plum1, Laura Bernardy1, Tharindu Ranasinghe2 1University of Luxembourg, 2Lancaster University |
|
From Articles to Premises: Building PrimeFacts, an Extraction Methodology and Resource for Fact-Checking Evidence
Premtim Sahitaj1, Jawan Kolanowski2, Ariana Sahitaj3, Veronika Solopova3, Max Upravitelev3, Daniel Röder4, Iffat Maab5, Junichi Yamagishi5, Sebastian Möller3, Vera Schmitt3 1Technical University of Berlin, 2Harz University of Applied Sciences, Faculty of Automation and Computer Science, 3Technische Universität Berlin, 4Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI), Speech and Language Technology Lab, 5National Institute of Informatics, Digital Content and Media Sciences Research Division, Tokyo |
|
EpiGator: LLM-based Tracker of Infectious Outbreaks
Yiheng Wu, Jue Hou, Trangcasanchai Sathianpong, Lidia Pivovarova, Roman Yangarber University of Helsinki |
|
Relation Extraction across Entire Books to Reconstruct Community Networks: The AffilKG Datasets
Erica Cai1, Sean Mcquade2, Kevin Young1, Brendan O'Connor1 1University of Massachusetts Amherst, 2Northwestern University |
|
Vrittanta-EN: A Benchmark Dataset for Event Trigger Detection and Classification Advancing Event Understanding in English Narrative Discourse
Chaitanya Kirti, Ashish Anand, Prithwijit Guha Indian Institute of Technology Guwahati |
|
MUC-4 Revisited: Document-level Event Analysis beyond Span-based Arguments
Helene Olsen1, Erik Velldal1, Lilja Øvrelid2 1University of Oslo, 2Dept of Informatics, University of Oslo |
| 17:20 - 19:00 | Session P7.3: Knowledge Representation and Graphs - Poster Area |
|
Historical Medical Knowledge Graphs and Ontologies from the Medical History of British India Corpus (1850-1950)
Mehrdad Almasi and Tugce Karatas University of Luxembourg |
|
Graph-TempCZ: A Graph Representation of Software Mentions for Predicting Software Usage in Scientific Publications
Congfeng Cao1, Pengyu Zhang2, Jelke Bloem2 1Institute for Logic, Language and Computation, University of Amsterdam, 2University of Amsterdam |
|
Automatic Suggestions Help Extending Eventive Ontology: A Case Study on SynSemClass
Jana Strakova1, Eva Fucíková2, Zdenka Uresova2, Jan Hajic2 1Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics, 2Charles University |
|
JPPB: Automatic Construction of a Soft-Labeled Japanese Patient Phrase Bank for Symptom Normalization
Tomohiro Nishiyama1, Mana Kuramoto1, Shoko Wakamiya2, Eiji Aramaki3 1Nara Institute of Science and Technology, 2NAIST, 3NAIST, Japan |
|
How I Met Your Snowclone: Unsupervised Discovery of Snowclone Patterns in Large Datasets
Julien Bezançon1, Gaël Lejeune2, Marceau Hernandez3 1Sorbonne Université, 2STIH, Sorbonne Université, 3CERES, STIH, Sorbonne Université |
|
HOME-KGQA: A Benchmark Dataset for Multimodal Knowledge Graph Question Answering on Household Daily Activities
Shusaku Egami1, Aoi Ohta1, Tomoki Tsujimura2, Masaki Asada2, Tatsuya Ishigaki1, Ken Fukuda3, Masahiro Hamasaki1, Hiroya Takamura4 1National Institute of Advanced Industrial Science and Technology (AIST), 2National Institute of Advanced Industrial Science and Technology, 3AIRC/AIST, 4The National Institute of Advanced Industrial Science and Technology (AIST) |
|
Extending the Semantic Layer of the CompL-it Italian Lexicon: Traits, Semantic Types, and Definitions
Emiliano Giovannetti1, Andrea Bellandi2, Simone Marchi3, Mafalda Papini3 1Istituto di Linguistica Computazionale "A. Zampolli" - CNR, 2Institute for Computational Linguistics - CNR, 3Cnr-Istituto di Linguistica Computazionale "A. Zampolli" |
|
Integrating Knowledge Graph with Large Language Models for Multi-hop Question Generation
Yllias Chali and Al Hasib Mahamud University of Lethbridge |
|
LocalGovPL: A Corpus of Speaker-Attributed Polish Local Government Transcripts
Dariusz Czerski1 and Maciej Ogrodniczuk2 1Institute of Computer Science, Polish Academy of Sciences, 2Institute of Computer Science, Polish Academy of Sciences |
|
The Amharic DBpedia Chapter: A Knowledge Graph for a Low-Resource Language
Hizkiel Alemayehu1, Tilahun Abedissa Taffa2, Meti Bayissa3, Andargachew Zewge3, Hamada Zahera4, Ricardo Usbeck5, Axel-Cyrille Ngonga Ngomo4 1University of Paderborn, 2University of Hamburg, 3Addis Ababa University, 4Paderborn University, 5Leuphana University Lueneburg |
|
Cygnet: Refactoring the Open Multilingual Wordnet
Rowan Maudslay1 and Francis Bond2 1University of Cambridge, 2Palacky University |
|
Masrad: Arabic Terminology Management Corpora with Semi-Automatic Construction
Mahdi Nasser1, Laura Sayah1, Fadi Zaraket2 1Arab Center for Research and Policy Studies, 2American University of Beirut |
| 17:20 - 19:00 | Session P7.4: Opinion, Sentiment, Emotion Analysis - Poster Area |
|
SentiMalti: A Maltese Sentiment Analysis Dataset and Models
Ian Caruana, Matthew Vella, Fabio Zammit, Kurt Micallef, Claudia Borg University of Malta |
|
Multilingual Structured Sentiment Analysis for Environmental Sustainability
Muhammad Okky Ibrohim1, Tommaso Caselli2, Cristina Bosco3, Valerio Basile1 1University of Turin, 2Rijksuniversiteit Groningen, 3Dipartimento di Informatica - Università di Torino |
|
LLM-as-an-Annotator: Training Lightweight Models with LLM-Annotated Examples for Aspect Sentiment Tuple Prediction
Nils Constantin Hellwig, Jakob Fehle, Udo Kruschwitz, Christian Wolff University of Regensburg |
|
Extending Czech Aspect-Based Sentiment Analysis with Opinion Terms: Dataset and LLM Benchmarks
Jakub Šmíd1, Pavel Priban1, Pavel Kral2 1University of West Bohemia, Faculty of Applied Sciences, 2University of West Bohemia, Dept. of Computer Science and Engineering |
|
AnnoABSA: A Web-Based Annotation Tool for Aspect-Based Sentiment Analysis with Retrieval-Augmented Suggestions
Nils Constantin Hellwig, Jakob Fehle, Udo Kruschwitz, Christian Wolff University of Regensburg |
|
Zero-Shot to Full-Resource: Cross-lingual Transfer Strategies for Aspect-Based Sentiment Analysis
Jakob Fehle, Nils Constantin Hellwig, Udo Kruschwitz, Christian Wolff University of Regensburg |
|
LoveHate: Stance Detection and Generation for Multiple Topics in User-generated Comments in Russian and English
Natalia Evgrafova, Veronique Hoste, Els Lefever LT3, Ghent University |
|
From Trial by Fire to Sleep like a Baby: A Lexicon of Anxiety Associations for 20K English Multi-Word Expressions
Saif Mohammad National Research Council Canada |
|
Entity-Level Sentiment Analysis with Sentence Relevance Detection
Egil Rønningstad1, Roman Klinger2, Lilja Øvrelid3, Erik Velldal1 1University of Oslo, 2University of Bamberg, 3Dept of Informatics, University of Oslo |
|
Enhancing Multi-Label Emotion Analysis and Corresponding Intensities for Ethiopian Languages
Tadesse Destaw Belay1, Dawit Gete2, Abinew Ali Ayele3, Olga Kolesnikova4, Iqra Ameer5, Grigori Sidorov6, Seid Muhie Yimam7 1Instituto Politécnico Nacional (IPN), Centro de Investigación en Computación (CIC), 2Wollo University, 3Bahir Dar University, 4Centro de Investigacion en Computacion del Instituto Politecnico Nacional, 5The Pennsylvania State University, 6CIC-IPN, 7University of Hamburg |
|
A Japanese Dataset for Aspect-based Sentiment Polarity Classification and Emotion Intensity Estimation
Kentaro Hanafusa1, Kota Manabe1, Yuki Maeda1, Daisuke Maekawa1, Tomoyuki Kajiwara2, Hideaki Hayashi3, Yuta Nakashima4, Hajime Nagahara4 1Ehime University, 2Ehime University / The University of Osaka, 3The University of Osaka, 4Osaka University |
| 17:20 - 19:00 | Session P7.5: Argument Mining and Emotion Classification - Poster Area |
|
Assessing the Persuasive Effect of AI-Generated Image Support of Arguments
Mackwyn Quadras1, Manfred Stede1, Henning Wachsmuth2 1University of Potsdam, 2Leibniz University Hannover |
|
CIARAM: Class Imbalance Aware Generative Framework for Relational Argument Mining
Nilmadhab Das1, Sayan Pal2, V. Saradhi3, Ashish Anand4 1Research Scholar, 2Masters Scholar, 3Associate Professor, 4Indian Institute of Technology Guwahati |
|
Surfacing Subtle Stereotypes: A Multilingual, Debate-Oriented Evaluation of Modern LLMs
Muhammed Saeed1, Muhammad Abdul-Mageed2, Shady Shehata3 1PhD Student TU Dresden, 2The University of British Columbia, 3University of Waterloo |
|
Prompt-Based Stance Control in German: An Evaluation of LLMs for Experimental Research on Attitude Change
Florian Omiecienski1, Cornelia Sindermann2, Agnieszka Falenska3 1Universität Stuttgart - IMS, 2Psychological Assessment, Psychology of Individual Differences, and Psychological Methods, Charlotte Fresenius Hochschule University of Psychology, Heidelberg, Germany; Computational Digital Psychology, Interchange Forum for Reflecting on Intelligent Systems, University of Stuttgart, 3IMS, University of Stuttgart |
|
CoSt-BR: A Language Resource for Conversational Stance Detection
Felipe da Fonseca1, Ivandré Paraboni2, Luciano Digiampietri1 1University of São Paulo, 2University of São Paulo |
|
Less Is More? The Role of Demographic Author Information in Emotion Classification of Ambiguous Text
Sabine Weber, Lynn Greschner, Roman Klinger University of Bamberg |
|
Big Five Personality Prediction through Emotion-Conditioned Representations and Learnable Psycholinguistic Mapping
Lorenzo Zangari, Antonin Schnyder, Davide Picca University of Lausanne |
|
SENSEI-ASG: A Challenging Dataset for Argument Summary Graph Parsing
Jonathan Clayton1, Marco Damonte2, Robert Gaizauskas1 1University of Sheffield, 2Amazon |
|
Categorical Emotions or Appraisals - Which Emotion Model Explains Argument Convincingness Better?
Lynn Greschner, Meike Bauer, Sabine Weber, Roman Klinger University of Bamberg |
|
Creation of the Estonian Subjectivity Dataset: Assessing the Degree of Subjectivity on a Scale
Karl Gustav Gailit1, Kadri Muischnek2, Kairit Sirts1 1University of Tartu, 2associate professor |
| End of Day 2 |