| 1 |
ATLAS: Article Tracking, Linking, and Analysis of Swedish Encyclopedias |
Albin Kent Andersson, Salam Jonasson, Fredrik Wastring and Pierre Nugues |
| 2 |
Beyond Literal Meaning: How LLMs Interpret Yemeni Proverbs |
Nasser Thmer, Ali Al-Laith and Muhammad Shoaib |
| 5 |
Detecting Risky Behavior Related to Alcohol and Drug Use within Adolescents' Private Messenger Conversations |
Jaromír Plhák, Michaela Lebedíková, Ondrej Sotolar and David Smahel |
| 12 |
A Corpus of Joint EEG and Self-Paced Reading of Natural Dutch Texts |
Sara Møller Østergaard, Lenneke Doris Lichtenberg, Laura Boon and Bruno Nicenboim |
| 15 |
Construction of Japanese Prefectural Assembly Minutes Datasets across Three Electoral Terms: Comparative Analysis of 2011, 2015, and 2019 Four-Year Periods |
Keiichi Takamaru, Hokuto Ototake, Yuzu Uchida and Yasutomo Kimura |
| 20 |
EDDA-Coordinata: An Annotated Dataset of Historical Geographic Coordinates |
Ludovic Moncla, Pierre Nugues, Thierry Joliveau and Katherine McDonough |
| 23 |
CLASE: A Hybrid Method for Chinese Legalese Stylistic Evaluation |
Yiran Rex Ma, Yuxiao Ye and Huiyuan Xie |
| 24 |
RuBIN: A Russian Benchmark for Evaluating LLMs with Cultural Insights |
Polina Lazukova and Irina Piontkovskaya |
| 27 |
Mental Health Disorder Detection beyond Social Media: A Systematic Review of Available Datasets |
Sadiya Sayara Chowdhury Puspo, Ana-Maria Bucur, Stevie Chancellor, Özlem Uzuner and Marcos Zampieri |
| 37 |
CoMMA, a Large-scale Corpus of Multilingual Medieval Archives |
Thibault Clérice, Simon Gabay, Malamatenia Vlachou-Efsthatiou, Ariane Pinche and Benoît Sagot |
| 39 |
MaltiSent: A Maltese Sentiment Analysis Dataset and Models |
Ian Caruana, Matthew Vella, Fabio Zammit, Kurt Micallef and Claudia Borg |
| 40 |
Contrastively Pre-trained Event Embeddings with Schema-free LLM Annotations |
Frank Mtumbuka and Steven Schockaert |
| 41 |
DR-CUP: A Dataset on Real-time Commentary in U.S. Presidential Debates |
Yu-Yu Chang, Huan-Wen Ho, Chung-Chi Chen and Ming-Hung Wang |
| 45 |
Mining Naturally Romanized Seed Corpora without Romanizations |
Adrian Benton, Alexander Gutkin, Christo Kirov and Brian Roark |
| 47 |
Beyond Generic Responses: Target-Aware Strategies for Countering Hate Speech |
Yen-Yu Chang, Daryna Dementieva and Alexander Fraser |
| 48 |
Assessing LLM Reasoning through Implicit Causal Chain Discovery in Climate Discourse |
Liesbeth Allein, Nataly Pineda-Castañeda, Andrea Rocci and Marie-Francine Moens |
| 55 |
Historical Medical Knowledge Graphs and Ontologies from the Medical History of British India Corpus (1850-1950) |
Mehrdad Almasi and Tugce Karatas |
| 56 |
Multilingual Structured Sentiment Analysis for Environmental Sustainability |
Muhammad Okky Ibrohim, Tommaso Caselli, Cristina Bosco and Valerio Basile |
| 57 |
LLM-as-an-Annotator: Training Lightweight Models with LLM-Annotated Examples for Aspect Sentiment Tuple Prediction |
Nils Constantin Hellwig, Jakob Fehle, Udo Kruschwitz and Christian Wolff |
| 58 |
A Dataset of Psychiatric Hospital Notes with Temporal Information Annotations |
Timothy A. Miller, Gaby Dinh, David Harris, Wonjin Yoon, Spencer Thomas, Boyu Ren, Meihua Hall and Guergana Savova |
| 59 |
AccurateRAG: A Framework for Building Accurate Retrieval-Augmented Question-Answering Applications |
Linh The Nguyen, Chi Tran, Dung Ngoc Nguyen, Van-Cuong Pham, Hoang Ngo and Dat Quoc Nguyen |
| 60 |
Procrustes Analysis for Improving Language Model Merging |
Olivier Ferret |
| 61 |
Towards Reliable AI Fairness: Challenges in Implementing Neuron Steering for Bias Mitigation |
Ismael Garrido-Munoz, Arturo Montejo-Raez and Fernando Martínez-Santiago |
| 62 |
From Body to Mind: Analyzing Gender Representation in Spanish Generative Language Models |
Ismael Garrido-Munoz, Fernando Martínez-Santiago and Arturo Montejo-Raez |
| 64 |
From Press to Pixels: Evolving Urdu Text Recognition |
Samee Arif and Sualeha Farid |
| 65 |
German Counseling Grounding-Act Corpus (GRACO) |
Milena Belosevic |
| 67 |
Dynaword: From One-shot to Continuously Developed Datasets |
Kenneth Enevoldsen, Kristian Nørgaard Jensen, Jan Kostkan, Balázs Szabó, Márton Kardos, Kirsten Vad, Johan Heinsen, Andrea Blasi Núñez, Gianluca Barmina, Jacob Nielsen, Rasmus Larsen, Rob van der Goot, Peter Vahlstrup, Per Møldrup Dalum, Desmond Elliott, Lukas Galke Poech, Peter Schneider-Kamp and Kristoffer Nielbo |
| 69 |
HalleluBERT - Let Every Token That Has Meaning Bear Its Weight |
Raphael Scheible-Schmitt |
| 70 |
Sentiment Analysis and Language Models for Kwanyama |
Ndapa Nakashole |
| 73 |
Can Video LLMs See through Illusions? Benchmark Dataset and Comprehensive Analysis |
Souto Ohira, Tosho Hirasawa and Mamoru Komachi |
| 76 |
Modular Approach to Automating Morphological Components in Grammar Engineering |
Ekaterina Voloshina and Krasimir Angelov |
| 78 |
SEFL: A Framework for Generating Synthetic Educational Assignment Feedback with LLM Agents |
Mike Zhang, Amalie Pernille Dilling, Léon Gondelman, Niels Erik Ruan Lyngdorf, Euan D. Lindsay and Johannes Bjerva |
| 79 |
Presenting the Prague Discourse Treebank 4.0 |
Jiří Mírovský and Pavlína Synková |
| 82 |
Using LLMs and AI in the Language Services Industry: An Overview |
Todor Lazarov |
| 85 |
AI Safety Lost in Translation: Evaluating the Effectiveness of English-Italian Cross-Lingual LLM Safety Alignment |
Alessio Wu and Martim Brandao |
| 92 |
From Bones to Rocks: A Systematic Evaluation of Specialized Definition Generation for Portuguese |
Rafael Oleques Nunes, Dennis Giovani Balreira and Joel Luís Carbonera |
| 93 |
VideoEvent: Leveraging Relevance and LLMs for Video Question Answering |
Chen-Chen Lin, Ming-Han Lee, KunRu Wu and Yu-Chee Tseng |
| 94 |
Russian Generative Spelling, Punctuation and Capitalization Correction |
Nikita Martynov, Danil Astafurov, Ulyana Isaeva, Ivan Vasil'yevich Maksimov, Joqsan Azocar, Dmitrii Kosenko and Alena Fenogenova |
| 97 |
Using Songs to Improve Kazakh Automatic Speech Recognition |
Rustem Yeshpanov |
| 98 |
Southern Kurdish Speech Recognition Resources and Benchmarking |
Mohammad Mohammadamini and Marie Tahon |
| 99 |
Report-based Recommendations for Policy Making and Agency Operations: Dataset and LLM Evaluation |
Aleksandra Edwards, Thomas Edwards, Jose Camacho-Collados and Alun Preece |
| 101 |
Semantic Label Drift in Cross-Cultural Translation |
Mohsinul Kabir, Tasnim Ahmed, Md Mezbaur Rahman, Polydoros Giannouris and Sophia Ananiadou |
| 105 |
Implicit Bias in Peer Review: Through the Lens of Language Abstraction |
Xulang Zhang, Rui Mao and Erik Cambria |
| 108 |
ConceptKT: A Benchmark for Concept-Level Deficiency Prediction in Knowledge Tracing |
Yu-Chen Kang, Yu-Chien Tang and An-Zi Yen |
| 109 |
Evaluation of Co-Speech Gesture Tracking Techniques in Naturalistic Interactions |
Victoria Ivanova and Naomi Harte |
| 111 |
BLooP: Zero-Shot Abstractive Summarization Using Small Language Models with a Bi-gram Lookahead Promotion |
Varun Iyer and Cornelia Caragea |
| 112 |
SETUP: Sentence-level English-To-Uniform Meaning Representation Parser |
Emma Markle, Javier Gutierrez Bach and Shira Wein |
| 113 |
MetaCORA: A Meta-Learned Curriculum for Adversarial and Contrastive Robustness in Speech Recognition |
Yuqian Dai, Chun Fai Chan, Ying Ki Wong and Tsz Ho Pun |
| 118 |
Advancing Retrieval-Augmented Generation for Persian: Development of Language Models, Comprehensive Benchmarks, and Best Practices for Optimization |
Sara Bourbour Hosseinbeigi, Mohammad Hossein Shalchian, Sina Asghari, Mohammad Ali Seif Kashani and Mohammad Amin Abbasi |
| 124 |
Incivility and Rigidity: Evaluating the Risks of Fine-Tuning LLMs for Political Argumentation |
Svetlana Churina and Kokil Jaidka |
| 129 |
MASA: A Novel Multimodal Foundation Model for L2 Speaking Assessment in Picture-description Scenarios |
Bi-Cheng Yan, Fu-An Chao, Hong-Yun H.Y. Lin and Berlin Chen |
| 130 |
A Benchmark Dataset and Comparative Evaluation of Phonemized and Romanized Urdu for Text-to-Speech |
M Kaab Bin Shahid and Muhammed Izharuddin |
| 131 |
Voices across Decades: A Multimodal Diachronic Corpus of German Bundestag Debates (GerParlDia-MM) |
Ingo Siegert |
| 132 |
Robust Bias Evaluation with FilBBQ: A Filipino Bias Benchmark for Question-Answering Language Models |
Lance Calvin Lim Gamboa, Yue Feng and Mark Lee |
| 134 |
Evaluating Phonetically Weighted and Unweighted Distance Measures in Dialectometry |
Alfred Lameli |
| 135 |
To Skip, to Swap or to Not Swap? Identifying Step Transition Types in Instructional Manuals |
Hsiu-Yu Yang, Michael Roth, Andreas Bulling and Carina Silberer |
| 136 |
This One or That One? A Bilingual Study on Accessibility via Demonstratives with Multimodal Large Language Models |
Yu Wang, Emmanuele Chersoni and Chu-Ren Huang |
| 142 |
Extending Czech Aspect-Based Sentiment Analysis with Opinion Terms: Dataset and LLM Benchmarks |
Jakub Šmíd, Pavel Priban and Pavel Kral |
| 143 |
Piecing Together Cross-Document Coreference Resolution Datasets: Systematic Dataset Analysis and Unification |
Anastasia Zhukova, Terry Lima Ruas, Jan Philip Wahle and Bela Gipp |
| 144 |
Using Multimodal and Language-Agnostic Sentence Embeddings for Abstractive Summarization |
Chaimae Chellaf el Hammoud, Salima Mdhaffar, Yannick Estève and Stéphane Huet |
| 147 |
MorfFlex: Handling Rich Morphology |
Jaroslava Hlaváčová, Marie Mikulová, Barbora Štěpánková, Milan Straka and Jan Hajič |
| 153 |
AMR Parsing beyond English: An Experiment on Bulgarian, French, Hungarian and Ukrainian |
Ivaylo Mitov, Tadzhat Marharian, Zsofia F. Hauk, Samba Fall, Maxime Amblard and Bruno Guillaume |
| 154 |
An LLM-Based Assistant for Debt Waiver Court Procedures |
Lluis Padro, Daniel Ferrés, Roser Saurí and Mireia Artigot |
| 158 |
AnnoABSA: A Web-Based Annotation Tool for Aspect-Based Sentiment Analysis with Retrieval-Augmented Suggestions |
Nils Constantin Hellwig, Jakob Fehle, Udo Kruschwitz and Christian Wolff |
| 159 |
Neural Network-assisted Analysis of Tube Vocal Tract Models |
Runhui Song, Johan Sjons and Axel G. Ekstrom |
| 160 |
Spotlights and Blindspots: Evaluating Machine-Generated Text Detection |
Kevin Stowe and Kailash Patil |
| 166 |
Open-access Dataset on Acceptability Ratings of Korean Clausal Constructions by Humans and GPT Models |
Gyu-Ho Shin, Soo-Hwan Lee and Chanyoung Lee |
| 170 |
JAPAS: A Benchmark and Neural Approach for Japanese Patent Support Relation Extraction |
Katsuki Chousa and Ryosuke Sugiura |
| 174 |
A Teacher-Student Approach to Creating Verified Synthetic Clarification and Correction Dialogues for TableQA Tasks |
Christian Poelitz and Nick McKenna |
| 175 |
Fruitcakes and Cupcakes Emerging from Noise: The ComposiGen Dataset of Compounds and Their Compositionality |
Jule Godbersen, Sinan Cem Kurtyigit, Emma Raimundo Schulz, Tonmoy Rakshit, Diego Frassinelli, Sabine Schulte im Walde and Carina Silberer |
| 178 |
ACAData: Parallel Dataset of Academic Data for Machine Translation |
Iñaki Lacunza, Javier Garcia Gilabert, Francesca De Luca Fornaciari, Javier Aula-Blasco, Aitor Gonzalez-Agirre, Maite Melero and Marta Villegas |
| 179 |
Persona-Aware Evaluation of Cognitive Bias in LLMs: From Benchmark to Applied Decision-Making |
Katsumasa Yoshikawa, Junya Takayama and Takato Yamazaki |
| 183 |
A Shoal of Voices: Parallel Read Speech from Professional Swedish Narrators |
Christina Tånnander, Jim O'Regan and Jens Edlund |
| 185 |
CEFR Level Prediction for Short Russian L2 Texts: Evaluating Classifiers and Instruction-Based LLMs |
Anna Glazkova, Antonina Laposhina and Dmitry Morozov |
| 186 |
Don't Teach Minerva: Guiding LLMs through Complex Syntax for Faithful Latin Translation with RAG |
Sergio Torres Aguilar |
| 188 |
MultiWikiQA: A Reading Comprehension Benchmark in 300+ Languages |
Dan Saattrup Smart |
| 190 |
Assessing the Persuasive Effect of AI-Generated Image Support of Arguments |
Mackwyn Quadras, Manfred Stede and Henning Wachsmuth |
| 193 |
Large Language Models' Internal Perception of Symbolic Music |
Andrew Shin and Kunitake Kaneko |
| 196 |
S-VoCAL: A Dataset and Evaluation Framework for Inferring Speaking Voice Character Attributes in Literature |
Abigail Berthe-Pardo, Gaspard Michel, Elena V. Epure and Christophe Cerisara |
| 197 |
CIARAM: Class Imbalance Aware Generative Framework for Relational Argument Mining |
Nilmadhab Das, Sayan Pal, V. V. Saradhi and Ashish Anand |
| 199 |
Insights from Transfer Learning Experiments with Word-in-Context and Word Sense Disambiguation Models |
Alp Mujko and Dominik Schlechtweg |
| 200 |
LGSE: Lexically Grounded Subword Embedding Initialization for Low-Resource Language Adaptation |
Hailay Kidu Teklehaymanot, Dren Fazlija and Wolfgang Nejdl |
| 202 |
In-Distribution Steering: Balancing Control and Coherence in Language Model Generation |
Arthur Vogels, Benjamin Wong, Yann Choho, Annabelle Blangero and Milan Bhan |
| 203 |
EsBBQ and CaBBQ: The Spanish and Catalan Bias Benchmarks for Question Answering |
Valle Ruiz-Fernández, Mario Mina, Júlia Falcão, Luis Antonio Vasquez Reina, Anna Salles, Aitor Gonzalez-Agirre and Olatz Perez-de-Viñaspre |
| 204 |
Large Language Models for Citation Function Classification |
Daniel Vodička, Pavel Kral, Christophe Cerisara and Jakub Šmíd |
| 205 |
Central Kurdish TTS and Its Application in Speech to Text Translation |
Mohammad Mohammadamini, Meysam Shamsi and Marie Tahon |
| 208 |
Entity Image and Mixed-Modal Image Retrieval Datasets |
Cristian-Ioan Blaga, Paul Suganthan G C, Sahil Dua, Krishna Srinivasan, Enrique Alfonseca, Peter Dornbach, Tom Duerig, Imed Zitouni and Zhe Dong |
| 210 |
Towards Complex Debate Understanding: Predicting Claim Impact Scores through the Modelling of Claim Interactions |
Maxime Marcel Brouat, Mihai Surdeanu, Srdjan Vesic and Eduardo Blanco |
| 211 |
Tools for Estimating the Perceived Level of Phonetic Reduction |
Nigel Ward, Javier Vazquez, Emma (Danny) R. Boushka and Oliver Niebuhr |
| 212 |
FENCE: A Financial and Multimodal Jailbreak Detection Dataset |
Mirae Kim, Seonghun Jeong and Youngjun Kwak |
| 215 |
Topic-Initiator: A Proactive Chatbot with Personalized Topic RAG for Enhancing Willingness to Converse |
Kazuya Matsuo, Atsushi Otsuka, Narichika Nomoto and Makoto Naka |
| 218 |
A Single Model Ensemble Framework for Neural Machine Translation Using Pivot Translation |
Seokjin Oh, Keonwoong Noh and Woohwan Jung |
| 221 |
ArtistMus: A Globally Diverse, Artist-Centric Benchmark for Retrieval-Augmented Music Question Answering |
Daeyong Kwon, SeungHeon Doh and Juhan Nam |
| 223 |
MORQA: Benchmarking Evaluation Metrics for Medical Open-Ended Question Answering |
Wen-wai Yim, Asma Ben Abacha, Zixuan Yu, Robert Doerning, Fei Xia and Meliha Yetisgen |
| 224 |
LegalRikai: Open Benchmark – a Benchmark for Complex Japanese Corporate Legal Tasks |
Shogo Fujita, Yuji Naraki, Yiqing Zhu and Shinsuke Mori |
| 225 |
Talk2Ref: A Dataset for Reference Prediction from Scientific Talks |
Frederik Yannick Broy, Maike Züfle and Jan Niehues |
| 226 |
MuSaG: A Multimodal German Sarcasm Dataset with Full-Modal Annotations |
Aaron Robert Scott, Maike Züfle and Jan Niehues |
| 228 |
KinyCOMET: Automatic Evaluation of Machine Translation Systems for Kinyarwanda–English |
Prince Chris Mazimpaka, Jan Nehring, Samuel Rutunda and Cristina España-Bonet |
| 232 |
A Cheap Lunch: Synthetic Annotation with Minimal Human Effort for Medical Text Mining |
Shutao Chen and Piek T.J.M. Vossen |
| 234 |
Parallel Sentence Filtering for Low-Resource Language Pairs: A Case Study for Upper Sorbian, German, and Czech |
Ruiyang Jiang, Shu Okabe and Alexander Fraser |
| 236 |
Improving Multilingual Language Models by Aligning Representations through Steering |
Omar Mohamed Mahmoud, Buddhika Laknath Semage, Thommen George Karimpanal and Santu Rana |
| 239 |
Supervised Contrastive Fine-Tuning for Active Few-Shot Learning |
Zirui Zhang and Lei Ge |
| 241 |
Deep Learning-Based Multi-Aspect Pronunciation Assessment for Individuals with Down Syndrome |
David Fernández-García, César González-Ferreras, Valentín Cardeñoso-Payo and Mario Corrales-Astorgano |
| 244 |
Joint Identification and Induction of Semantic Frames with Scalable Semi-Supervised Graph Clustering |
Fabian Barteld, Steffen Remus, Saba Anwar, Julian Stawecki, Alexander Ziem and Chris Biemann |
| 249 |
MATA (మాట): Mindful Assessment of the Telugu Abilities of Large Language Models |
Chalamalasetti Kranti and Sowmya Vajjala |
| 250 |
SALOMO: An Annotation Tool for Complex Annotation Tasks with a Large Number of Labels |
Tim Menzner |
| 251 |
FalAR: A Large-scale Speaker-Annotated European Portuguese Speech Corpus of Parliamentary Sessions |
Francisco Teixeira, Carlos Carvalho, Mariana Julião, Catarina Botelho, Rubén Solera-Ureña, Sérgio Paulo, Thomas Rolland, Ben Peters, Isabel Trancoso and Alberto Abad |
| 252 |
IREKIER: An Easy Read Corpus for Basque and Spanish |
Jesús Calleja and Thierry Etchegoyhen |
| 253 |
Towards a Diagnostic and Predictive Evaluation Methodology for Sequence Labeling Tasks |
Elena Alvarez-Mellado and Julio Gonzalo |
| 259 |
Forewarned Is Forearmed: When Non-sequential Embedding Turns into an Anomaly Detector |
Elys Allesiardo, Antoine Caubrière and Valentin Vielzeuf |
| 260 |
Integrating Arithmetic Learning Improves Mathematical Reasoning in Smaller Models |
Neeraj Gangwar, Suma Bhat and Nickvash Kani |
| 262 |
The Speech-LLM Takes It All: A Truly Fully End-to-End Spoken Dialogue State Tracking Approach |
Nizar El Ghazal, Antoine Caubrière and Valentin Vielzeuf |
| 263 |
QuALA-NL: Question & Answer with Legal Attribution in Dutch |
Romy A.N. van Drie, Roos M. Bakker, Daan L. Di Scala and Maaike de Boer |
| 268 |
Low-Rank Compression of Language Models via Differentiable Rank Selection |
Sidhant Sundrani, Francesco Tudisco and Pasquale Minervini |
| 270 |
Simulating Student Interactions for Virtual Pretesting with In-Context Learning |
Arthur Thuy, Luca Benedetto, Ekaterina Loginova and Dries F. Benoit |
| 271 |
An Exploration-Analysis-Disambiguation Reasoning Framework for Word Sense Disambiguation with Low-Parameter LLMs |
Deshan Koshala Sumanathilaka, Nicholas Micallef and Julian Hough |
| 272 |
Off the Hamster Wheel: Rethinking Dialogue Research through a Meta-Analysis of the ACL Anthology 2024 |
Amandine Decker, Maxime Amblard and Ellen Breitholtz |
| 273 |
Using Valency Inheritance in Building a Valency Lexicon |
Václava Kettnerová, Veronika Kolářová, Jiří Mírovský and Michal Olbrich |
| 275 |
Self-supervised Data Augmentation for Classification in Low-Data Settings |
Deyu Ding, Mengying Wang and Andreas Spitz |
| 277 |
Gender Disambiguation in Machine Translation: Diagnostic Evaluation in Decoder-Only Architectures |
Chiara Manna, Hosein Mohebbi, Afra Alishahi, Frederic Blain and Eva Vanmassenhove |
| 280 |
Icelandic Math Eval: A Competitive Mathematics Benchmark for Large Language Models |
Hafsteinn Einarsson, Jökull Ari Haraldsson, Ívar Armin Derayat, Sigrún Helga Lund and Benedikt Steinar Magnússon |
| 282 |
MazeEval: A Benchmark for Testing Sequential Decision-Making in Language Models |
Hafsteinn Einarsson |
| 283 |
TigerCoder: A Novel Suite of LLMs for Code Generation in Bangla |
Nishat Raihan, Antonios Anastasopoulos and Marcos Zampieri |
| 286 |
VDAct 2.0: Scaling Video-Grounded Dialogue for Event-driven Activity Understanding with LLM-Assisted Filtering |
Wiradee Imrattanatrai, Masaki Asada, Kimihiro Hasegawa, Ken Fukuda and Teruko Mitamura |
| 289 |
Multi-dimensional Evaluation of Character-Authentic Dialogue Models Learned from Question-Answer Data |
Atsushi Otsuka, Kazuya Matsuo, Kenta Hama, Masahiro Mizukami, Tsunehiro Arimoto, Hiroaki Sugiyama, Makoto Naka and Narichika Nomoto |
| 290 |
Mitigating Misinterpretation in Policy Documents through Automated Language Understanding |
Momojit Biswas, Anka Chandrahas Tummepalli and Preethu Rose Anish |
| 292 |
Evaluating Multimodal Large Language Models on Vertically Written Japanese Text |
Keito Sasagawa, Shuhei Kurita and Daisuke Kawahara |
| 293 |
Estonian Native Large Language Model Benchmark |
Helena Grete Lillepalu and Tanel Alumäe |
| 297 |
Distribution-aware Low-bitwidth Quantization for Large Language Models |
Bao Tan Duy Huynh, Takashi Tsunakawa and Masafumi Nishida |
| 298 |
CoachLah: A Singlish–English Parallel Corpus of Health Coaching Conversations with Behavior Goal Annotations |
Iva Bojic, Mathieu Ravaut, Stephanie Hilary Xinyi Ma, Doreen Tan, Andy Hau Yan Ho and Andy Khong |
| 301 |
TG-ASR: Translation-Guided Learning with Parallel Gated Cross Attention for Low-Resource Automatic Speech Recognition |
ChengYeh Yang, Chien-Chun Wang, Li-Wei Chen, Hung-Shin Lee, Hsin-Min Wang and Berlin Chen |
| 303 |
Format Matters: A Critical Evaluation of Output Formats for Prompting LLMs in SLU and NER |
Pierre Lepagnol, Christophe Servan, Sahar Ghannay, Thomas Gerald and Sophie Rosset |
| 304 |
Building a One‑Million‑Pair Bokmål–Nynorsk Translation Corpus |
Per E. Kummervold, Thea Tollersrud and Angelina Zanardi |
| 305 |
Authors Use Trigger Warnings with Lexical Consistency: A Corpus Analysis with User-Generated Labels |
Sebastian Heineking, Matti Wiegmann, Magdalena Wolska, Benno Stein and Martin Potthast |
| 307 |
mSCoRe: A Multilingual and Scalable Benchmark for Skill-based Commonsense Reasoning |
Nghia Trung Ngo, Franck Dernoncourt and Thien Huu Nguyen |
| 309 |
Adapting Pretrained Models to Endangered Languages in Japan: A Comparative Study on Ryukyuan and Ainu Speech Recognition |
Kohei Matsuura, Takanori Ashihara and Tatsuya Kawahara |
| 317 |
Indirect Question Answering in English, German and Bavarian: A Challenging Task for High- and Low-Resource Languages Alike |
Miriam Winkler, Verena Blaschke and Barbara Plank |
| 318 |
Empathy in Greek Exam-Related Support Conversations: A Comparative Evaluation of LLM Responses |
Panagiota Kyriazi and Prokopis Prokopidis |
| 321 |
English to Central Kurdish Speech Translation: Corpus Creation, Evaluation, and Orthographic Standardization |
Mohammad Mohammadamini, Daban Jaff, Josep Crego, Marie Tahon and Antoine Laurent |
| 323 |
Corpus and Baselines for Distinguishing Authentic, AI-Generated, and AI-Enhanced Resumes |
Andrea Loizidou, Anshu Kiran Sharma, Adrian Esquivel, Mark A. Finlayson and Mustafa Ocal |
| 324 |
Evaluation of Two Leading Polish Language Models in a Real-world RAG Scenario |
Szymon Bartanowicz and Krzysztof Jassem |
| 325 |
CorpusClues: Scalable Unsupervised Similarity Search for Historical Texts Using MinHash-LSH |
Paulien Lemay, Klaas Bentein and Els Lefever |
| 327 |
Explainable AI for Ethical Counter Speech Generation in Hate Speech Mitigation |
Ashiful Islam Ridoy, Mohammed Faisal, Yogesh Kumar, Md Mamun-Ur Rashid, Marina Ernst and Frank Hopfgartner |
| 328 |
Bangla Key2Text: Text Generation from Keywords for a Low Resource Language |
Tonmoy Talukder and G M Shahariar |
| 329 |
ViX-Ray: A Vietnamese Chest X-Ray Dataset for Vision-Language Models |
Duy Vu Minh Nguyen, Chinh Thanh Truong, Trần Hoàng Phúc, Hung Tuan Le, Nguyen Van-Thanh Dat, Trung Hieu Pham and Kiet Van Nguyen |
| 332 |
Creating Task-Specific Speech Recognition Datasets from Scratch for Low-Resource Languages: Assessing the Impact of Token Sequence Overlap |
Adwoa Asantewaa Bremang, Dennis Asamoah Owusu, Victor Quagraine and Leanne M.M. Annor-Adjaye |
| 333 |
Transcription Accuracy in the Icelandic Gigaword Corpus: Evaluating Automatic and Manual Annotation |
Johanna Mechler, Lilja Björk Stefánsdóttir and Anton Karl Ingason |
| 334 |
Radio Haiti-Inter: A Large-Scale Annotated Corpus of Spoken Haitian Creole |
William N. Havard, Rayan Ziane, Mélissa Menclé, Maximin Coavoux, Benjamin Lecouteux and Emmanuel Schang |
| 335 |
Benchmarking Large Language Models for Text Input in Chinese and Japanese |
Yuchun Zou, Tedd Lee, Xiaodi Fan and Jun Li |
| 336 |
Automatic Prediction of Prominence and Boundary Strength from Text |
Pauline Mas, Kévin Vythelingum, Jonathan Chevelu, Marion Ouédraogo, Damien Lolive and Olivier Rosec |
| 337 |
Enhancing Clinical Trial Analysis through Large Language Models for Multi-Evidence Natural Language Inference |
Shobanapriyan Chandrasegaran and Amal Htait |
| 338 |
DaLA: Danish Linguistic Acceptability Evaluation Guided by Real World Errors |
Gianluca Barmina, Nathalie Carmen Hau Norman, Peter Schneider-Kamp and Lukas Galke Poech |
| 339 |
How Long Does a Quick Kiss Take? Studying Event Duration of Light Verb Constructions Using Explicit Word Embeddings |
Lin de Huybrecht and Geraint A. Wiggins |
| 340 |
REMIND: Input Loss Landscapes Reveal Residual Memorization in Post-Unlearning LLMs |
Liran Cohen, Yaniv Nemcovesky and Avi Mendelson |
| 341 |
ViWikiFC: Fact-Checking for Vietnamese Wikipedia-Based Textual Knowledge Source |
Hung Tuan Le, Long Truong To, Manh Trong Nguyen and Kiet Van Nguyen |
| 342 |
Harnessing Synergy in Context and Emoji for Joint Detection of Harmful Online Content in Multi-turn Conversations |
Feiyan Hu, Ciara Anne Byrne, Jiang Zhou, Rena Maycock and Mark Langan |
| 344 |
Surfacing Subtle Stereotypes: A Multilingual, Debate-Oriented Evaluation of Modern LLMs |
Muhammed Yahia Gaffar Saeed, Muhammad Abdul-Mageed and Shady Shehata |
| 346 |
From CHAT to Coded CoNLL-U: A Reproducible Pipeline for the Syntactic Annotation and Querying of Child Language Data |
Achim Stein |
| 349 |
Prompt-Based Stance Control in German: An Evaluation of LLMs for Experimental Research on Attitude Change |
Florian Omiecienski, Cornelia Sindermann and Agnieszka Falenska |
| 352 |
Synthetic Function Demonstrations Improve Generation in Low-Resource Programming Languages |
Nick McKenna, Xinnuo Xu, Jack Williams, Nicholas C. Wilson, Benjamin Van Durme and Christian Poelitz |
| 356 |
FRASE: Frame-based Structured Representations for Generalizable SPARQL Query Generation |
Papa Abdou Karim Karou Diallo and Amal Zouaq |
| 359 |
Graph-TempCZ: A Graph Representation of Software Mentions for Predicting Software Usage in Scientific Publications |
Congfeng Cao, Pengyu Zhang and Jelke Bloem |
| 360 |
Identifying Imaging Follow-Up in Radiology Reports: A Comparative Analysis of Traditional ML and LLM Approaches |
Namu Park, Giridhar Kaushik Ramachandran, Kevin Lybarger, Fei Xia, Özlem Uzuner, Martin Gunn and Meliha Yetisgen |
| 362 |
Gradient-Controlled Decoding: A Safety Guardrail for LLMs with Dual-Anchor Steering |
Purva Chiniya, Kevin Joseph Scaria and Sagar Chaturvedi |
| 363 |
GUMBridge: A Corpus for Varieties of Bridging Anaphora |
Lauren Levine and Amir Zeldes |
| 364 |
KCIF: Knowledge-Conditioned Instruction Following |
Rudra Murthy, Praveen Venkateswaran, Prince Kumar and Danish Contractor |
| 365 |
GAIN: A Benchmark for Goal-Aligned Decision-Making of Large Language Models under Imperfect Norms |
Masayuki Kawarada, Kodai Watanabe and Soichiro Murakami |
| 366 |
Building Effective Japanese Medical LLMs with an Open Recipe for Domain Adaptation through Continued Pre-training |
Akiko Aizawa, Yuki Arase, Fei Cheng, Jiahao Huang, Zhiyi Huang, Junfeng Jiang, Teruhito Kanazawa, Daisuke Kawahara, Kazuma Kobayashi, Takashi Kodama, Sadao Kurohashi, Yusuke Oda, Yuma Tsuta, Zhen Wan, Zhishen Yang and Rio Yokota |
| 367 |
J-ClinicalBench: A Benchmark for Evaluating Large Language Models on Practical Clinical Tasks in Japanese |
Seiji Shimizu, Tomohiro Nishiyama, Hisada Shohei, Yamato Himi, Shoko Wakamiya, Yuki Yanagisawaw, Masami Tsuchiya, Satoko Hori and Eiji Aramaki |
| 369 |
A Mental State Extraction Dataset for Theory-of-Mind-based Reasoning in Emotional Support Conversations |
Seulgi Kim and Harksoo Kim |
| 370 |
Scaling LLM Reasoning from Minimal Labels: A Semi-Supervised Framework with a Lightweight Verifier |
Keizo Kato, Chenhui Chu, Yugo Murawaki and Sadao Kurohashi |
| 371 |
SouDeC: Source Detection and Classification in Czech |
Jiří Mírovský and Barbora Hladka |
| 372 |
BankMathBench: A Benchmark for Numerical Reasoning in Banking Scenarios |
Yunseung Lee, Subin Kim, Youngjun Kwak and Jaegul Choo |
| 377 |
Constructing a Japanese Claim Decomposition Dataset for Fact-Checking of LLM-Generated Texts |
Miwa Masano, Ribeka Keyaki, Atsushi Keyaki, Rei Minamoto, Kaito Horio, Hirokazu Kiyomaru, Kouta Nakayama, Hideyuki Tachibana and Daisuke Kawahara |
| 378 |
Do Language Models Encode Semantic Relations? Probing and Sparse Feature Analysis |
Andor Diera and Ansgar Scherp |
| 380 |
SCoRE: Benchmarking Long-Chain Reasoning in Commonsense Scenarios |
Weidong Zhan, Yue Wang, Nan Hu, Liming Xiao, Jingyuan Ma, Yuhang Qin, Zheng Li, Yixin Yang, Sirui Deng, Jinkun Ding, Qingxiu Dong, Wenhan Ma, Rui Li, Weilin Luo, Qun Liu and Zhifang Sui |
| 383 |
VietJobs: A Vietnamese Job Advertisement Dataset |
Hieu Pham Dinh, Hung Nguyen Huy and Mo El-Haj |
| 385 |
Can LLMs Evaluate What They Cannot Annotate? Revisiting LLM Reliability in Hate Speech Detection |
Paloma Piot, David Otero, Patricia Martin-Rodilla and Javier Parapar |
| 387 |
To Predict or Not to Predict? Towards Reliable Uncertainty Estimation in the Presence of Noise |
Nouran Khallaf and Serge Sharoff |
| 388 |
A Resource on Dialogical Moves in Native and Non-Native Academic Writers of English |
Giulia D'Agostino, Narjes Sheikh Asadi and Elena Musi |
| 389 |
Construction and Analysis of Japanese Parent-Child Dialogic Reading Corpus for Conversational Agents |
Yuko Nakagi, Yuya Chiba, Sanae Fujita and Shoko Araki |
| 390 |
SOMVOICE: A First Dataset to Study the Effects of Sleep Deprivation on Voice Characteristics of Healthy French Speakers |
Vincent P. Martin, Jean-Luc Rouas, Colleen Beaumard and Pierre Philip |
| 391 |
A Binary Problem in Binary QA: Diverse LLMs or Diverse Question Interpretations? That Is the Ensembling Question |
Rafael Rosales and Santiago Miret |
| 392 |
Meta-Prompting Follow-Ups for Unsupervised Dialogue Evaluation Using Open-Source Large Language Models |
Gaetano Cimino, Chuyuan Li, Giuseppe Carenini and Vincenzo Deufemia |
| 393 |
PerHalluEval: Persian Hallucination Evaluation Benchmark for Large Language Models |
Mohammad Hosseini, Kimia Hosseini, Shayan Bali, Zahra Zanjani and Saeedeh Momtazi |
| 396 |
A Systematic Comparison of Large Language Models for Data Annotation in NER Tasks |
Muhammad Uzair Ul Haq, Davide Rigoni and Alessandro Sperduti |
| 397 |
Memorization or Lucky Guesses: Detecting Short Copyrighted News Sequences in LLM Output |
Joris Veerbeek, Kas Berendsen, Alessandra Polimeno and Antal van den Bosch |
| 399 |
ACLBot: A Knowledge Graph-Driven Assistant for ACL Anthology Research |
Jan Buchmann, Steven Lynden and Kristiina Jokinen |
| 400 |
PARL: Prompt-based Agents for Reinforcement Learning |
Yarik Menchaca Resendiz and Roman Klinger |
| 402 |
Generating Sign Language Poses from HamNoSys and Natural Language Descriptions |
Santiago Máximo and Luis Chiruzzo |
| 405 |
Efficient Topic Extraction via Graph-Based Labeling: A Lightweight Alternative to Deep Models |
Salma Mekaoui, Hiba Sofyan, Imane Benchrif, Imane Amaaz, Ilham Chaker, Arsalane Zarghili and Nikola S. Nikolov |
| 409 |
A Corpus-Based Profiling of Regional English Variants in Global Media: Insights from Olympic Journalism |
Felix Mao |
| 411 |
New Trends for Modern Machine Translation with Large Reasoning Models |
Sinuo Liu, Chenyang Lyu, Minghao Wu, Zifu Shang, Longyue Wang, Weihua Luo and Kaifu Zhang |
| 414 |
This House Debates AI: Evaluating a Language Model in Oxford-Style Debates against Human Experts |
Umberto Belluzzo, Kobi Hackenburg, Hannah Rose Kirk, Scott Hale and Paul Röttger |
| 415 |
Automatic Prediction of Child Speech Fluency with Game-Based Data from German Preschoolers |
Valentin Kany, Bernd Möbius and Jürgen Trouvain |
| 417 |
ADAB: Arabic Dataset for Automated Politeness Benchmarking - a Large-Scale Resource for Computational Sociopragmatics |
Hend Al-Khalifa, Nadia Ghezaiel, Maria Bounnit, Hend Hamed Alhazmi, Noof Abdullah Alfear, Reem Fahad Alqifari, Ameera Masoud Almasoud and Sharefah Ahmed Al-Ghamdi |
| 420 |
TækTåK: Syntactic Analysis of Language Use on Danish TikTok |
Thea Kristensen and Rob van der Goot |
| 424 |
Mute Cods: A Multilingual Telegram Dataset with Benchmark Models for Conspiracy Theory Detection |
Katarina Laken, Erik Bran Marino, Paloma Piot, Davide Bassi, Søren Kirkegaard Fomsgaard, Michele Joshua
Maggini, Renata Vieira, Marcos Garcia and Sara Tonelli |
| 426 |
Selective Augmentation: Improving Universal Automatic Phonetic Transcription via G2P Bootstrapping |
Tobias Bystrich, Julia Maria Pritzen, Christoph Andreas Schmidt and Claudia Wich-Reif |
| 427 |
Adaptive Chunking: Optimizing Chunking-Method Selection for RAG |
Paulo Roberto de Moura Júnior, Jean Lelong and Annabelle Blangero |
| 429 |
PersianMedQA: Evaluating Large Language Models on a Persian-English Bilingual Medical Question Answering
Benchmark |
Mohammad Javad Ranjbar Kalahroodi, Amirhossein Sheikholselami, Sepehr Karimi Arpanahi, Sepideh Ranjbar
Kalahroodi, Heshaam Faili and Azadeh Shakery |
| 435 |
AURORA Model of Formant-to-tongue Inversion for Didactic and Clinical Applications |
Patrycja Strycharczuk and Sam Kirkham |
| 438 |
BURMESE-SAN: Burmese NLP Benchmark for Evaluating Large Language Models |
Thura Aung, Jann Railey Montalan, Jian Gang Ngui and Peerat Limkonchotiwat |
| 440 |
GRDD+: An Extended Greek Dialectal Dataset with Cross-Architecture Fine-tuning Evaluation |
Stergios Chatzikyriakidis, Dimitrios Papadakis, Sevasti Ioanna Papaioannou and Erofili Psaltaki |
| 441 |
Same-Language Subtitles for Low-resource Languages: A Case of Bundelkhandi |
Anirudh Pradhan, Ayushi Pandey, Divyansh Kushwaha, Akshita Tiwary and Vivek Seshadri |
| 442 |
Fill-in-the-Blanks: Generating Pseudonyms for English and Swedish Texts with RoBERTa and Qwen |
Maria Irena Szawerna and Jacob Lee Suchardt |
| 443 |
Evaluating Social Intelligence in LLMs via Japanese Honorifics in Business Emails: A Social Semiotic System
Perspective |
Muxuan Liu, Tatsuya Ishigaki, Yusuke Miyao, Hiroya Takamura and Ichiro Kobayashi |
| 447 |
New Encoders for German Trained from Scratch: Comparing ModernGBERT with Converted LLM2Vec Models |
Julia Wunderle, Anton Ehrmanntraut, Jan Pfister, Fotis Jannidis and Andreas Hotho |
| 452 |
Evaluating Discriminability of Vision-Language Models |
Masayasu Muraoka and Naoaki Okazaki |
| 454 |
ProMQA-Assembly: Multimodal Procedural QA Dataset on Assembly |
Kimihiro Hasegawa, Wiradee Imrattanatrai, Masaki Asada, Susan E. Holm, Yuran Wang, Xuanang Zhou, Ken Fukuda
and Teruko Mitamura |
| 455 |
JFC-Recipe: A Dataset for Nutrient Estimation from Japanese User-Generated Cooking Recipes |
Keisuke Shirai, Yoko Yamakata, Hirotaka Kameko, Akiko Sunto, Jun Harashima and Shinsuke Mori |
| 456 |
Investigating the Role of Synthetic Data Augmentation and Training Strategies on Improving Low-Resource
Language ASR |
Yun Hao, Reihaneh Amooie, Wietse de Vries, Rik van Noord and Martijn Wieling |
| 457 |
From Noise to Signal: When Outliers Seed New Topics |
Evangelia Zve, Gauvain Bourgne, Benjamin Icard and Jean-Gabriel Ganascia |
| 462 |
The Chronicles of RiDiC: Generating Datasets with Controlled Popularity Distribution for Long-form
Factuality Evaluation |
Pavel Braslavski, Dmitrii Iarosh, Nikita Sergeevich Sushko, Andrey Sakhovskiy, Vasily Konovalov, Elena
Tutubalina and Alexander Panchenko |
| 463 |
Once upon a Kernel: Extracting Important Events from Narratives |
Anshu Kiran Sharma, Miguel Castiblanco-Melendez, Alejandro Morales and Mark A. Finlayson |
| 464 |
R.U.Psycho? A Framework for Robust Unified Psychometric Testing of Language Models |
Julian Schelb, Orr Borin, David Garcia and Andreas Spitz |
| 469 |
The Sufficiency-Conciseness Trade-off in LLM Self-Explanation from an Information Bottleneck Perspective |
Ali Zahedzadeh and Behnam Bahrak |
| 470 |
HatePrototypes: Interpretable and Transferable Representations for Implicit and Explicit Hate Speech
Detection |
Irina Proskurina, Marc-Antoine Carpentier and Julien Velcin |
| 471 |
Annotating Conversational Phases and Communication Techniques: A Corpus of German Teacher-Parent Counseling
Conversations |
Tobias Hallmen, Kathrin Gietl, Karoline Hillesheim, Annemarie Friedrich and Elisabeth André |
| 472 |
Automated Extraction of Answer Candidates for Question Generation |
Claudia Preda, Mihai Dascalu, Stefan Ruseti and Danielle S. McNamara |
| 476 |
AnonSub: A Massively Parallel Dataset of Movie Subtitles for MT Development and Evaluation |
Joerg Tiedemann and Hengyu Luo |
| 477 |
CoSt-BR: A Language Resource for Conversational Stance Detection |
Felipe Penhorate Carvalho da Fonseca, Luciano Antônio Digiampietri and Ivandré Paraboni |
| 478 |
RO-ABSA: A Romanian Dataset and Baselines for Aspect-Based Sentiment Analysis |
Gheorghe Andreea Alina, Andrei Claudia, Ionescu Elena, Ruseti Stefan and Dascalu Mihai |
| 481 |
Arabic ChartSumm: An English-to-Arabic Benchmark for Metadata-to-Text Summarization |
Passant Elchafei and Amany Fashwan |
| 482 |
MaitH 1.0: A Parallel Corpus and Baseline for Low-Resource Maithili-Hindi Translation |
Kamanksha Prasad Dubey, Chandresh Maurya and Kumar Padmanabh |
| 483 |
Beyond Transcripts: Iterative Peer-Editing with Audio Unlocks High-Quality Human Summaries of Conversational
Speech |
Kaavya Chaparala, Thomas Thebaud, Jesus Villalba Lopez, Laureano Moro-Velazquez, Peter Viechnicki and Najim
Dehak |
| 484 |
Node-Level Uncertainty Estimation in LLM-Generated SQL |
Hilaf Hasson and Ruocheng Guo |
| 490 |
Frame2KG: A Benchmark and Evaluation Toolkit for Interpretable Frame-to-Graph Generation |
Lewis N. Watson, Carl Strathearn, Kenneth Mitchell and Yanchao Yu |
| 491 |
Automatic Suggestions Help Extending Eventive Ontology: A Case Study on SynSemClass |
Jana Strakova, Eva Fučíková, Zdenka Uresova and Jan Hajič |
| 492 |
Explore Political Discourse with Transformers. Emergent Paradigmatic and Syntagmatic Representations. |
Laurent Vanni and Damon Mayaffre |
| 493 |
PAIR: A Pilot Dataset for Dual Perspective-based Video-Grounded Dialogue and Reconciliation |
Lewis N. Watson, Carl Strathearn, Kenneth Mitchell and Yanchao Yu |
| 494 |
K-MIND: Korean Multimodal INteraction Data for Dyadic Conversation Analysis |
Jae Hee Yang, Yuha Shin, Saim Shin, Je Woo Kim and Jin Yea Jang |
| 495 |
The PARLO Dementia Corpus: A German Multi-Center Resource for Alzheimer's Disease |
Franziska Braun, Florian Hönig, Elmar Nöth, Tobias Bocklet and Korbinian Riedhammer |
| 496 |
Investigating Memorization in Language Models Trained via Knowledge Distillation |
Maarten Mäcking and Michaela Regneri |
| 497 |
A Joint Detection Framework for Latvian Loanwords and Calques Using Monolingual Data |
Yelingyun Zhang, Atis Kapenieks and Marina Platonova |
| 498 |
Introducing a Bangla Sentence – Gloss Pair Dataset for Bangla Sign Language Translation and Research |
Neelavro Saha, Rafi Shahriyar, Nafis Ashraf Roudra, Saadman Sakib and Annajiat Alim Rasel |
| 499 |
The Moral Foundations Reddit Corpus |
Jackson P. Trager, Alireza S. Ziabari, Elnaz Rahmati, Aida Mostafazadeh Davani, Preni Golazizian, Farzan
Karimi-Malekabadi, Ali Omrani, Zhihe Li, Brendan Kennedy, Nils Karl Reimer, Melissa Reyes, Kesley Cheng,
Mellow Wei, Christina Merrifield, Arta Khosravi, Evans Alvarez and Morteza Dehghani |
| 503 |
CREST: Universal Safety Guardrails through Cluster-Guided Cross-Lingual Transfer |
Lavish Bansal and Naman Mishra |
| 506 |
Language Models as Semantic Augmenters for Sequential Recommenders |
Mahsa Valizadeh, Xiangjue Dong, Rui Tuo and James Caverlee |
| 507 |
Multi-Scale Model Compression via Nested Matrix Learning |
Xiangjue Dong, Aditya Anantharaman, Hemant Pugaliya and Kai Zhong |
| 509 |
Efficient Adaptation of English Language Models for Morphologically Rich and Underrepresented Languages: The
Case of Arabic |
Ahmed Samy Eldamaty, Mohamed Maher Zenhom Abdelrahman, Mohamed Mostafa Ibrahim Elbehery, Mariam Ashraf and
Radwa Elshawi |
| 511 |
AutoRPT: A Tool for Bootstrapping Prosodic Annotation |
Seth Heiney, Thomas Hicks, Sally Little, Fernanda Lourenco, Kai Retana, Eliana Stevens and Jonathan Howell |
| 513 |
SPQ: An Ensemble Technique for Large Language Model Compression |
Jiamin Yao and Eren Gultepe |
| 514 |
Redefining Evaluation Standards: A Unified Framework for Evaluating the Korean Capabilities of Language
Models |
Hanwool Lee, Dasol Choi, Sooyong Kim, Ilgyun Jung, Sangwon Baek, Guijin Son, Inseong Hwang, Naeun Lee and
Seunghyeok Hong |
| 515 |
From Rosetta to Match-Up: A Paired Corpus of Linguistic Puzzles with Human and LLM Benchmarks |
Neh Majmudar, Anne Huang, Jinfan Frank Hu and Elena Filatova |
| 518 |
Dynamic Layer Selection for Efficient Tone Recognition in Self-Supervised Speech Models |
Saint Germes B. Bengono Obiang, Norbert Tsopze and Paulin Melatagia Yonta |
| 519 |
Tracing How Annotators Think: Augmenting Preference Judgments with Reading Processes |
Karin Johanna Denton de Langis, William Walker, Khanh Chi Le and Dongyeop Kang |
| 523 |
Benchmark Data Contamination in Underrepresented Languages: A Comprehensive Analysis Using Brazilian Data |
Iriedson Souto Maior de Moraes Vilar, David Candeia Maia, João Brunet, Fabio Morais and Leandro Balby
Marinho |
| 524 |
CodeClarity: A Framework and Benchmark for Evaluating Multilingual Code Summarization |
Madhurima Chakraborty, Drishti Sharma, Maryam Sikander and Eman Nisar |
| 526 |
Cross-Lingual Stability and Bias in Instruction-Tuned Language Models for Humanitarian NLP |
Poli Nemkova, Amrit Adhikari, Matthew Pearson, Vamsi Krishna Sadu and Albert V. Mark |
| 527 |
A Longitudinal, Multinational, and Multilingual Corpus of News Coverage of the Russo-Ukrainian War |
Dikshya Mohanty, Taisiia Sabadyn, Jelwin Rodrigues, Chenlu Wang, Abhishek Kalugade and Ritwik Banerjee |
| 528 |
FrameNet Semantic Role Classification by Analogy |
Van Duy Ngo, Stergos Afantenos, Emiliano Lorini and Miguel Couceiro |
| 529 |
Seeing the Other Side: Diagnostic Tasks for Viewpoint Reasoning in Vision–Language Models |
Makoto Takenaka and Hitomi Yanaka |
| 531 |
ToxSyn: Reducing Bias in Hate Speech Detection via Synthetic Minority Data in Brazilian Portuguese |
Iago Alves Brito, Julia Soares Dollis, Fernanda Bufon Farber, Diogo Fernandes and Arlindo R. Galvão Filho |
| 534 |
Chulalongkorn Corpus of Spoken Thai |
Pittayawat Pittayaporn, Cathryn Yang, Sujinat Jitwiriyanont and James Kirby |
| 535 |
Do Large Language Models Grasp the Grammar? Evidence from Grammar-Book-Guided Probing in Luxembourgish |
Lujun Li, Yewei Song, Lama Sleem, Yiqun Wang, Yangjie Xu, Cedric Lothritz, Niccolò Gentile, Radu State,
Tegawendé F. Bissyandé and Jacques Klein |
| 536 |
J-CHAT: Japanese Large-scale Spoken Dialogue Corpus for Spoken Dialogue Language Modeling |
Wataru Nakata, Kentaro Seki, Hitomi Yanaka, Yuki Saito, Shinnosuke Takamichi and Hiroshi Saruwatari |
| 540 |
Counting on Consensus: Selecting the Right IAA Metric for NLP Annotation and Evaluation |
Joseph H. F. James |
| 541 |
Do Multimodal LLMs Understand Order? Measuring the Fragility of Multimodal Reasoning under Input Order
Perturbations |
Sheng-Lun Wei, Yu-Ling Liao, Hen-Hsen Huang and Hsin-Hsi Chen |
| 543 |
NRD: A Hybrid Disentanglement Framework for Mitigating Interference in Multilingual Machine Translation |
Jiarui Zhang and Yifan Deng |
| 544 |
APODICTUS: Automatic Processing of DICTionary Update candidateS |
Felix Blessing, Johannes S. Sax, Julian Kaufmann, Wei Zhao, Kate Wild, Iona Ogilvie, Nikolay Arefyev and
Dominik Schlechtweg |
| 545 |
Towards Privacy-Preserving Fine-Tuning: Anonymization of Aphasic Speech for Effective ASR |
Sebastian Hofstetter and Timo Baumann |
| 546 |
Outgroup Bias in Large Language Models Arising from Social Identity Adoption |
Wenchao Dong, Assem Zhunis, Dongyoung Jeong, Hyojin Chin, Jiyoung Han and Meeyoung Cha |
| 548 |
CONVERSE: Annotation Scheme and Dataset for Multimodal Conversational Engagement Analysis in Human-Human and
Human-Robot Interaction |
Ekaterina Torubarova, Oskar Ljung, Julia Uddén and André Pereira |
| 550 |
Green Bots versus Red Bots: Evaluating Large Language Models for Simulating Persuasion Dynamics in Online
Influence Campaigns |
Majd Eddin Al Ali, Filip Mihai Muntean, Lucia Donatelli and Jurriaan van Diggelen |
| 551 |
ObfusQAte: A Proposed Framework to Evaluate LLM Robustness on Obfuscated Factual Question Answering |
Shubhra Ghosh, Abhilekh Borah, Aditya Kumar Guru and Kripabandhu Ghosh |
| 555 |
SKILL-IR-Discourse: A Large, Annotated Corpus of Argumentation and Domain Discourse on International
Relations |
Magdalena Wolska, Matti Wiegmann, Sassan Gholiagha, Mitja Sienknecht, Dora Kiesel, Irene Lopez Garcia,
Patrick Riehmann, Bernd Fröhlich, Katrin Girgensohn, Jürgen Neyer and Benno Stein |
| 557 |
Bridging Text-to-Sign Translation via Codebook-Oriented Pretraining |
Ninlawat Phuangchoke and Chantri Polprasert |
| 558 |
Toward Generalized Cross-Lingual Hateful Language Detection with Web-Scale Data and Ensemble LLM Annotations |
Dang Hai Dang, Jelena Mitrović and Michael Granitzer |
| 561 |
Intent Recognition in Speech-to-Text Processing in the Context of Natural Interaction with Cognitive
Assistive Systems |
Behnam Ensan, Magnus Jung, Matthias Busch and Andreas Wendemuth |
| 562 |
Can LLMs Faithfully Explain Themselves in Low-Resource Languages? A Case Study on Emotion Detection in
Persian |
Mobina Mehrazar, Mohammad Amin Yousefi, Parisa Beygi and Behnam Bahrak |
| 564 |
Nepal Script Text Recognition from Ancient Artifacts: Challenges and Opportunities |
Swornim Nakarmi, Sarin Sthapit, Arya Shakya, Sahil Ratna Tuladhar, Bal Krishna Bal and Rajani Chulyadyo |
| 566 |
Linguistic and Demographic Factors in a Free Translation Task from Polish to English |
Irina Stenger |
| 567 |
A Typologically Grounded Evaluation Framework for Word Order and Morphology Sensitivity in Multilingual
Masked LMs |
Anna Feldman, Libby Barak and Jing Peng |
| 568 |
Multi-modal, Multi-task, Multi-criteria Automatic Evaluation with Vision Language Models |
Masanari Oi, Masahiro Kaneko, Naoaki Okazaki and Nakamasa Inoue |
| 570 |
Zero-Shot to Full-Resource: Cross-lingual Transfer Strategies for Aspect-Based Sentiment Analysis |
Jakob Fehle, Nils Constantin Hellwig, Udo Kruschwitz and Christian Wolff |
| 572 |
A Benchmark for Testing Robustness under Controlled Reference Bias in MT |
Ahrii Kim and Seong-heum Kim |
| 578 |
Semantic Alignment across Egyptian Language Stages via Normalization-Aware Multitask Learning |
He Huang |
| 579 |
Evaluation of Document-Level Text Simplification in Japanese |
Iori Yamashita, Hikari Tanaka, Hajime Kiyama, Kexin Bian, Zhousi Chen and Mamoru Komachi |
| 581 |
A Resource and Evaluation Method for Phonological Continuity in Japanese Sign Language |
Jundai Inoue, Daisuke Hara and Makoto Miwa |
| 587 |
From Generation to Evaluation: A Resource for Error-Categorized Question Generation from Video Transcripts |
Joshua Berger, Markos Stamatakis, Anett Hoppe, Ralph Ewerth and Christian Wartena |
| 588 |
Push and Pull: Training Sentence Encoders with Contrastive Losses for Distance-Based Multi-Label Text
Classification |
Jens Van Nooten and Andriy Kosar |
| 589 |
BenCSSMark: Towards an Open, Collective Benchmark for Computational Social Sciences |
Etienne Ollion, Arnault Chatelain, Qianwen Guan, Diandra Fabre, Marie Candito, Lorraine Goeuriot, Emile
Chapuis, Abdelkrim Beloued, Nicolas Hervé and Didier Schwab |
| 591 |
Biases in Translation: Assessing Opinion Distortion in Machine Translated Texts |
Nazanin Shafiabadi and François Yvon |
| 593 |
ParlaSpeech 3.0: Richly Annotated Spoken Parliamentary Corpora of Croatian, Czech, Polish, and Serbian |
Nikola Ljubešić, Peter Rupnik, Ivan Porupski and Taja Kuzman Pungeršek |
| 594 |
Evaluation Drift in LLM Personality Induction: Are We Moving the Goalpost? |
Prateek Kumar Rajput, Yewei Song, Iyiola Emmanuel Olatunji, Jacques Klein and Tegawendé Bissyandé |
| 596 |
Is There Anything More Deceptive than an Obvious Fact? Investigating Implicitness in User-generated
Argumentative Text |
Ekaterina Sviridova, Elena Cabrio and Serena Villata |
| 597 |
Challenges in Image-Caption Association in Portuguese: Evaluating the CLIP Model on the FM30K Dataset |
Vitória Colonetti Benedet, Gustavo Lopes Tamiosso, Rafael Oleques Nunes and Dennis Giovani Balreira |
| 599 |
JLLMSafety: Creating a Dataset for LLM Safety in Japanese |
Hisami Suzuki, Satoru Katsumata, Takashi Kodama, Tetsuro Takahashi, Kouta Nakayama and Satoshi Sekine |
| 602 |
Critical Foreign Policy Decision (CFPD) Benchmark: Measuring Diplomatic Preferences of Large Language Models |
Benjamin Jensen, Ian J. Reynolds, Yasir Atalan, Michael Garcia, Austin Woo, Anthony Chen and Trevor Howarth |
| 606 |
Chain-of-Thought Reasoning Improves Context-Aware Translation with Large Language Models |
Shabnam Ataee and Andrei Popescu-Belis |
| 608 |
The Growing Gains and Pains of Iterative Web Corpora Crawling: Insights from South Slavic CLASSLA-web 2.0
Corpora |
Taja Kuzman Pungeršek, Peter Rupnik, Vit Suchomel and Nikola Ljubešić |
| 611 |
Beyond Lemmas and Syntax: Comparing Human and LLM-Generated Scientific Abstracts |
Sergei Bagdasarov and Diego Alves |
| 613 |
Building Multimodal Corpora Using Microtask Pipelines and Local Annotators |
Helmiina Hotti, Raul Vazquez, Anna-Kaisa Jokipohja, Timo Kalliokoski, Henna Paakki, Rosa Suviranta and Tuomo
Hiippala |
| 614 |
From Behavior to Geometry: A Causal and Geometric Analysis of LoRA-Based Domain Adaptation |
Yizhe Wang, Zhenhua Ling and Liu He |
| 617 |
Using LLMs for Automatic Discipline Annotation in a Diachronic Corpus of English Scientific Papers |
Sergei Bagdasarov, Diego Alves, Stefan Fischer and Elke Teich |
| 618 |
When Translations Surprise: Human Awareness of Predictability in Translations |
Cristian García-Romero, Miquel Esplà-Gomis and Felipe Sanchez-Martinez |
| 619 |
MaritimEmails: A Synthetic Dataset for Maritime Chartering Correspondence |
Simon Clematide and Kevin Bruendler |
| 620 |
Is One Dataset Enough for Evaluation? Studying Generalizability of Automated Essay Scoring Models |
Sohaila Eltanbouly, Marwan Sayed and Tamer Elsayed |
| 621 |
Quadratic Weighted Kappa Is Not Enough for Evaluating Automated Essay Scoring |
Salam Albatarni and Tamer Elsayed |
| 623 |
CEFR-Annotated WordNet: LLM-Based Proficiency-Guided Semantic Database for Language Learning |
Masato Kikuchi, Masatsugu Ono, Toshioki Soga, Tetsu Tanabe and Tadachika Ozono |
| 625 |
LuxBorrow: From Pompier to Pompjee, Tracing Borrowing in Luxembourgish |
Nina Hosseini-Kivanani and Fred Philippy |
| 627 |
Towards Expectation Detection in Language: A Case Study on Treatment Expectations in Reddit |
Aswathy Velutharambath and Amelie Wührl |
| 628 |
SEEM-CZ: Annotation and Classification of Epistemic Markers in Czech |
Barbora Štěpánková, Tomáš Musil, Lucie Polakova and Michal Novák |
| 629 |
Why So Separate: Analyzing In-Context Learning from a Vector Space Perspective |
Tobias Kalmbach and Sandipan Sikdar |
| 630 |
Frame Semantic Patterns for Identifying Underreporting of Notifiable Events in Healthcare: The Case of
Gender-Based Violence |
Lívia Dutra, Arthur Lorenzi, Lais Berno, Franciany Campos, Karoline Biscardi, Kenneth Brown, Marcelo
Viridiano, Frederico Belcavello, Ely E. Matos, Olivia Guaranha, Erik Santos, Sofia Reinach and Tiago Timponi
Torrent |
| 631 |
HiFi-KPI: A Dataset for Hierarchical KPI Extraction from Earnings Filings |
Rasmus Thyge Aavang Jensen, Giovanni Rizzi, Rasmus Tjalk-Bøggild, Alexandre Iolov, Mike Zhang and Johannes
Bjerva |
| 632 |
Early Fusion with Contrastive Learning: A Lightweight Alternative for Multi-modal Classification |
Felix Wernlein, Abhik Jana and Sandipan Sikdar |
| 633 |
eSciBench: An Extensible Scientific PDF Extraction Benchmark |
Noah Tremblay Taillon and Philippe Langlais |
| 636 |
GhostWriter: Hidden AI-Generated Texts over Multiple Languages, Domains and Generators |
Manuel Schaaf, Kevin Bönisch and Alexander Mehler |
| 637 |
Evaluating Embedding Models on Danish Historical Newspapers: A Corpus and Benchmark Resource |
Alie Lassche, Pascale Feldkamp, Yuri Bizzoni, Katrine Baunvig, Kristoffer Nielbo and Johan Heinsen |
| 638 |
Empathy Speaks in Metaphors: The Empathy-Metaphor Corpus of Figurative Language in Empathetic Text |
Gyeongeun Lee and Natalie Parde |
| 639 |
Bidirectional Chinese and English Passive Sentences Dataset for Machine Translation |
Xinyue Ma, Pol Pastells, Mireia Farrus and Mariona Taule |
| 640 |
Beyond Fake News Detection: A Community-based Study of the Multicultural Nature of Information Disorder |
Sara Gemelli, Giulia Di Cristina, Mohamad Mojtaba Behboudi Eshkiki, Caterina Maria Cappello, Tommaso Caselli,
Alberto De La Torre Solís, Nikolai Efimov, Mariia Everstova, Md Azizul Hoque, Maziar Kianimoghadam Jouneghani,
Payam Latifi, Yashar Mahboudi, Farzaneh Mohseni, Usman Naseem, Dario Placenti, Manuela Sanguinetti, Aurora
Scarpellini, Chiara Zanchi, Yiran Zhang, Marco Antonio Stranisci and Simona Frenda |
| 645 |
Evaluating the Homogeneity of Keyphrase Prediction Models |
Mael Houbre, Florian Boudin and Beatrice Daille |
| 647 |
LexiPhon: A Collection of Phonetically Transcribed Lexicons from Wikipedia |
Amanda Doucette, Timothy J. O'Donnell and Morgan Sonderegger |
| 649 |
"Emphasizing the Commendable": A Study of Homogenized Transitive Verb Constructions in Machine Generated
Peer Reviews |
Hing-Yuet Fung, Chi-kiu Lo and Samuel Larkin |
| 650 |
A Test Collection for Part-of-Speech Tagging and Word Sense Disambiguation |
Robert Krovetz |
| 651 |
Best-Worst Scaling of Hype in Biomedical Research: Building an Intensity Lexicon of Promotional Adjectives |
Neil Millar, Dipesh Satav, Bojan Batalo, Erica K. Shimomoto and Ryosuke L. Ohniwa |
| 654 |
A Dutch Benchmark to Assess Social Bias in LLMs within a Hiring Decision Setting |
Renate Burema, Anne Schuth, Christopher Spelt and Dong Nguyen |
| 655 |
FreeTxt-Vi: A Benchmarked Vietnamese-English Toolkit for Segmentation, Sentiment, and Summarisation |
Hung Huy Nguyen, Mo El-Haj, Paul Rayson and Dawn Knight |
| 656 |
Towards a Gold Standard for Adjectival Hypernymy: Enriching the Open English WordNet with a Hybrid Approach |
Lorenzo Augello, John P. McCrae and Marco Passarotti |
| 657 |
Creating a Hybrid Rule and Neural Network Based Semantic Tagger Using Silver Standard Data: The PyMUSAS
Framework for Multilingual Semantic Annotation |
Andrew Moore, Paul Rayson, Dawn Archer, Tim Czerniak, Dawn Knight, Daisy Monika Lal, Gearóid Ó Donnchadha,
Mícheál J. Ó Meachair, Scott Piao, Elaine Uí Dhonnchadha, Johanna Vuorinen, Yan Yabo and Xiaobin Yang |
| 664 |
Prerequisites for Advancing Automatic Speech Recognition in Breton |
Morgan Grobol, Alice Millour, Wassim Zemouri, Yuna Drapier and Mélanie Jouitteau |
| 667 |
The Patrologia Graeca Corpus: OCR, Annotation, and Open Release of Noisy Nineteenth-Century Polytonic Greek
Editions |
Chahan Vidal-Gorène and Bastien Kindt |
| 668 |
CoDAE: Adapting Large Language Models for Education via Chain-of-Thought Data Augmentation |
Shuzhou Yuan, William LaCroix, Hardik Ghoshal, Ercong Nie and Michael Faerber |
| 670 |
Using LLMs to Extract Instances of Schematic Constructions from Unannotated L2 Learner Corpora |
Jelena Kallas, Ahto Kiil, Heete Sahkai, Geda Paulsen and Kertu Saul |
| 671 |
Predicting Topic (Co-)Occurrence Using Topic Networks Built from the Project Gutenberg Corpus |
Bhuvanesh Verma and Alexander Mehler |
| 672 |
Voices and Echoes in Fictional Dialogue: A Study of Linguistic Coordination in Literary Texts |
Ioana-Roxana Boriceanu, Alina Iacob and Liviu P. Dinu |
| 673 |
COCOA: Creation and Exploratory Investigation of a COrpus of Claims frOm NLP Articles |
Clémentine Bleuze, Fanny Ducel, Maxime Amblard and Karen Fort |
| 675 |
Are LLMs Good Text Diacritizers? An Arabic and Yoruba Case Study |
Hawau Olamide Toyin, Samar Mohamed Magdy and Hanan Aldarmaki |
| 677 |
Code-switching as a Bias Indicator in LLMs: "the Consequences Are Not the Same Para Nosotros" |
Fanny Ducel, Aurélie Névéol, Vidit Khazanchi, Loïc Leclere, Arthur Pedrini, Léa Bouchet, Benjamin Caissial
and Karen Fort |
| 678 |
ROG: A Multi-Layer Manually Annotated Corpus of Spoken Slovenian |
Kaja Dobrovoljc, Darinka Verdonik, Jaka Čibej, Peter Rupnik and Nikola Ljubešić |
| 679 |
PBBQ: A Persian Bias Benchmark Dataset Curated with Human-AI Collaboration for Large Language Models |
Farhan Farsi, Shayan Bali, Fatemeh Valeh, Parsa Ghofrani, Alireza Pakniat, Seyedkian Kashfipour and Amir H.
Payberah |
| 680 |
Vrittanta-AS: Dataset Development and Benchmarking for Event Trigger Detection and Classification in
Assamese |
Chaitanya Kirti, Dhrubajyoti Pathak, Ashish Anand and Prithwijit Guha |
| 682 |
National Library as Corpus: DeLiKo-2025@DNB – a Very Large Corpus of German-language Contemporary Literature |
Marc Kupietz, Nils Diewald, Philippe Genêt and Andreas Witt |
| 684 |
A Taxonomy of Safety: Harmonizing LLM Benchmarks in a Fragmented Landscape |
Shadi Rastegarmoghadam Cheraghi, Viktor Hangya, Fabian Kuech and Darina Gold |
| 685 |
Multi-party Conversational Corpus of L1 and L2 for Speech Alignment Research (Teams-SK): Methodological
Approach |
Stefan Benus, Viktor Gatial, Erik Gyorgy, Mária Hricková, Martin Kažimír, Zuzana Kozáčiková, Lucia Mareková,
Róbert Sabo, Marian Trnka and Erik Vráb |
| 686 |
PREMOVE in LiLa: Integrating Latin Preverbed Motion Verbs with WordNet and VerbNet |
Andrea Farina, Marco Passarotti, Francesco Mambrini, Matteo Pellegrini, Eleonora Litta and Giovanni Moretti |
| 687 |
Scare Quotes as Markers of "Questionable" Word Usages and Misalignment in Conversation: An Annotation Study |
Aina Garí Soler, Juan Carlos Zevallos Huaco, Matthieu Labeau and Chloé Clavel |
| 689 |
Merging Continual Pretraining Models for Domain-Specialized LLMs: A Case Study in Finance |
Kentaro Ueda, François Portet, Hirohiko Suwa and Keiichi Yasumoto |
| 691 |
Integrating Services, Platforms and Resources into a National Infrastructure Cluster for FAIR Language and
Cultural Data |
Giulia Pedonese, Daniele Melaccio, Michele Mallia, Monica Monachini, Francesca Frontini, Valeria Quochi,
Fahad Khan, Angelo Mario Del Grosso, Federico Boschetti and Riccardo Del Gratta |
| 694 |
From Incidents to Framing: A Dutch and English Frame-semantic Corpus and Lexicon |
Piek T.J.M. Vossen, Pia Sommerauer and Levi Remijnse |
| 695 |
FineDialFact: A Benchmark for Fine-Grained Dialogue Fact Verification |
Xiangyan Chen, Yufeng Li, Yujian Gan, Arkaitz Zubiaga and Matthew Purver |
| 696 |
Small LLMs for Medical NLP: A Systematic Analysis of Few-Shot, Constraint Decoding, Fine-Tuning and
Continual Pre-Training in Italian |
Pietro Ferrazzi, Mattia Franzin, Alberto Lavelli and Bernardo Magnini |
| 697 |
Automatic Analysis of Collaboration through Human Conversational Data Resources: A Review |
Yi Yu, Maria Boritchev and Chloé Clavel |
| 698 |
POLAR: A Corpus of Questions, Responses and Argumentation in Polish Political Radio Discourse |
Daniel Ziembicki, Aleksandra Zwierzchowska, Ewelina Sobol and Katarzyna Anna Przerada |
| 699 |
A Large-Scale Instruction-Tuning Dataset and Models for Slovenian Vision-Language Tasks |
Matej Martinc and Domen Vreš |
| 700 |
PARSEME 2.0 Multilingual Corpus of Multiword Expressions |
Agata Savary, Manon Scholivet, Carlos Ramisch, Takuya Nakamura, Eric Bilinski, Sara Stymne, Voula Giouli,
Stella Markantonatou, Vasile Pais, Maria Mitrofan, Louis Estève, Verginica Barbu Mititelu, Jaka Čibej, Roberto
Díaz Hernández, Victoria Fendel, Polona Gantar, Olha Kanishcheva, Cvetana Krstev, Chaya Liebeskind, Irina
Lobzhanidze, Aleksandra M. Marković, Gunta Nešpore-Bērzkalne, Adriana S. Pagano, Mehrnoush Shamsfard, Ranka
Stanković, Vahide Tajalli and Carole Tiberius |
| 702 |
Is Semi-Automatic Transcription Useful in Corpus Creation? Preliminary Considerations on the KIParla Corpus |
Martina Simonotti, Ludovica Pannitto, Eleonora Zucchini, Silvia Ballarè and Caterina Mauri |
| 703 |
LoveHate: Stance Detection and Generation for Multiple Topics in User-generated Comments in Russian and
English |
Natalia Evgrafova, Veronique Hoste and Els Lefever |
| 705 |
When Numbers Tell Half the Story: Human-Metric Alignment in Topic Model Evaluation |
Thibault Prouteau, Francis Lareau, Jean-Charles Lamirel, Christophe Malaterre and Nicolas Dugue |
| 706 |
An Extreme Multi-label Text Classification (XMTC) Library Dataset: What If We Took "Use of Practical AI in
Digital Libraries" Seriously? |
Jennifer D'Souza, Sameer Sadruddin, Maximilian Kaehler, Andrea Salfinger, Luca Zaccagna, Francesca Incitti,
Lauro Snidaro and Osma Suominen |
| 707 |
Contextualizing Toxicity: An Annotation Framework for Unveiling Pragmatics in Conversations of Online
Discussion Forums |
Yingxue Fu and Anais Ollagnier |
| 709 |
Consistency of LLMs to Comparative Statements in Mathematical Reasoning Tasks |
Aidan W. San, Daniel Juyoung Son, Xiaodong Liu and Yangfeng Ji |
| 710 |
OasisSimp: An Open-source Asian-English Sentence Simplification Dataset |
Hannah Liu, Murphy Tian, Iqra Ali, Haonan Gao, Qiaoyiwen Wu, Blair Yang, Uthayasanker Thayasivam, Annie Lee,
Pakawat Nakwijit, Surangika Dayani Ranathunga and Ravi Shekhar |
| 711 |
A Corruption-Based Data Augmentation for Arabic Essay Scoring: A Preliminary Study on the Organization Trait |
May Saed Bashendy and Tamer Elsayed |
| 713 |
Survey of Tools for Manual Linguistic Annotation: Supporting Diversity through Interactive Exploration |
Ludovica Pannitto, Kaja Dobrovoljc and Bruno Guillaume |
| 714 |
Building a Dataset for French Accent Classification Evaluation: Are We There Yet? |
Diandra Fabre, Mathieu Avanzi and François Portet |
| 716 |
A Multi-Dialectal, Longitudinal Corpus of Human-AI Hybrid Language Production |
Qiao Gan, Jonathan Dunn, Andrea Nini and Benjamin Adams |
| 718 |
Open Korean Historical Corpus: A Millennia-Scale Diachronic Collection of Public Domain Texts |
Seyoung Song, Nawon Kim, Songeun Chae, Kiwoong Park, Jiho Jin, Haneul Yoo, Kyunghyun Cho and Alice Oh |
| 720 |
M3-SLU: Evaluating Speaker-Attributed Reasoning in Multimodal Large Language Models |
Yejin Kwon, Taewoo Kang, Hyunsoo Yoon and Chang Ouk Kim |
| 721 |
From Trial by Fire to Sleep like a Baby: A Lexicon of Anxiety Associations for 20K English Multi-Word
Expressions |
Saif M. Mohammad |
| 722 |
A Computational Diachronic Analysis of Gen Z Mental Health Discourse: A Large-scale Reddit Corpus Study from
Pre- to Post-COVID |
Felix Mao |
| 725 |
Ramsa: A Large Sociolinguistically Rich Emirati Arabic Speech Corpus for ASR and TTS |
Rania Al-Sabbagh |
| 727 |
TR-TEB: Turkish Text Embedding Benchmark |
Ömer Arslan, Atalay Çelik, Yusuf Aslan, Hasan Fatih Durkaya, Mustafa Furkan Zenginoğlu, Musa Alperen Yılmaz,
Merve Gül Kantarcı and Mehmet Haklıdır |
| 730 |
JPPB: Automatic Construction of a Soft-Labeled Japanese Patient Phrase Bank for Symptom Normalization |
Tomohiro Nishiyama, Mana Kuramoto, Shoko Wakamiya and Eiji Aramaki |
| 734 |
Seven Years of Japanese Emotion-related Episodes: A Crowdsourced Dataset |
Kazuhiro Ito, Junko Hayashi, Hiroyuki Nagai, Shoko Wakamiya and Eiji Aramaki |
| 737 |
"Oat Milk Vegan Chocolate Taste Great!": Monitoring the Food Transition Debate in Reddit |
Greta Zella, Jan Willem Bolderdijk, Saskia Peels, Gerry Wakker and Tommaso Caselli |
| 738 |
AnonTool & AnonDataset: Automated Corpus Annotation and Multilingual Tagging as a Service |
Cynthia Van Hee, Jonas Doumen, Vincent Prins, Pranaydeep Singh, Vincent Vandeghinste and Els Lefever |
| 739 |
ClimateChat-300K: A Multi-Modal Facebook Dataset for Understanding Diverse Perspectives in Climate
Communication |
Wajdi Zaghouani, Md. Rafiul Biswas, Mabrouka Bessghaier, Shimaa Amer Ibrahim and George Mikros |
| 741 |
Methods for Entity-Level Sentiment Analysis |
Egil Rønningstad, Roman Klinger, Lilja Øvrelid and Erik Velldal |
| 743 |
CoTERM: A Consistency-Oriented Term Metric for MT System Evaluation |
Amir Hazem and Kyo Kageura |
| 745 |
Less Is More? The Role of Demographic Author Information in Emotion Classification of Ambiguous Text |
Sabine Weber, Lynn Greschner and Roman Klinger |
| 746 |
AraHopeCorpus: Annotation Guidelines and Dataset for Hope Speech in Arabic Social Media Crisis Discourse |
Esra'a Ahmad Sharqawi and Wajdi Zaghouani |
| 747 |
Modeling Clinical Uncertainty in Radiology Reports: From Explicit Uncertainty Markers to Implicit Reasoning
Pathways |
Paloma Rabaey, Jong Hak Moon, Jung-Oh Lee, Min Gwan Kim, Hangyul Yoon, Thomas Demeester and Edward Choi |
| 749 |
Audience Engagement with Arabic Women's Social Empowerment and Wellbeing: A Decadal Corpus |
Wajdi Zaghouani, Mabrouka Bessghaier, Md. Rafiul Biswas and Shimaa Amer Ibrahim |
| 750 |
PersianAnonymizer: Evaluating LLM-Labeled Training for Efficient NER-based Anonymization in Persian |
Mohammad Hossein Shalchian, Mostafa Amiri and Amir Mahdi Sadeghzadeh |
| 751 |
Multimodal Entrainment and Feedback in Online Group Meetings |
Patrizia Paggio, Manex Agirrezabal, Giulia Di Cristina, Bart Jongejan and Costanza Navarretta |
| 752 |
Representing Multimodality in Terminology Resources |
Federica Vezzani |
| 754 |
Cohesion-6K: An Arabic Dataset for Analyzing Social Cohesion and Conflict in Online Discourse |
Aisha Ali Al-Athba and Wajdi Zaghouani |
| 756 |
How Far Can Bias Go? Tracing Bias from Pre-Training Data to Alignment |
Marion Thaler, Abdullatif Köksal, Alina Leidinger, Anna Korhonen and Hinrich Schütze |
| 757 |
How Many Samples Do We Need? A Toolkit for Power-Aware Evaluation Design |
Angelo Basile, Areg Mikael Sarvazyan and José Ángel González |
| 760 |
Integrating TEI, NER/NEL, Textometry, and Linked Data for a Semantically Enriched Interview Corpus |
Ranka Stanković, Tamara Vučenović, Biljana Rujević, Mihailo Škorić and Milica Ikonić Nešić |
| 761 |
ArabDiscrim: A Decade-Long Arabic Facebook Corpus on Racism and Discrimination |
Wajdi Zaghouani, Shimaa Amer Ibrahim and Houda Bouamor |
| 762 |
UniSkill: A Dataset for Matching University Curricula to Professional Competencies |
Nurlan Musazade, József Mezei and Mike Zhang |
| 763 |
Trust Me, I Can Convince You: The Contextualized Argument Appraisal Framework and the ContArgA Corpus |
Lynn Greschner, Sabine Weber and Roman Klinger |
| 764 |
HumaniCA: A Benchmark Resource for the Detection of Users' Ascription of Humanness to Conversational Agents |
Sabrina Villata, Amon Rapp, Luigi Di Caro and Federica Cena |
| 767 |
Assessing the Political Fairness of Multilingual LLMs: A Case Study Based on a 21-Way Multiparallel EuroParl
Dataset |
Paul Lerner and François Yvon |
| 768 |
ArPoMeme: An Annotated Arabic Multimodal Dataset for Political Ideology and Polarization |
Wajdi Zaghouani, Kais Attia, Md. Rafiul Biswas and Fadhl Eryani |
| 769 |
JobArabi: An Arabic Corpus and Analysis of Job Announcements from Social Media |
Wajdi Zaghouani, Shimaa Amer Ibrahim and Houda Bouamor |
| 770 |
Uhura: A Benchmark for Evaluating Scientific Question Answering and Truthfulness in Low-Resource African
Languages |
Edward Thomas Bayes, Israel Abebe Azime, Jesujoba Alabi, Jonas Kgomo, Tyna Eloundou, Elizabeth Proehl, Kai
Chen, Imaan Khadir, Naome A. Etori, Shamsuddeen Hassan Muhammad, Choice Mpanza, Igneciah Pocia IP Thete,
Dietrich Klakow and David Ifeoluwa Adelani |
| 771 |
Semantic Information: A Difference That Makes a Difference |
J. Nathanael Philipp, Max Kölbl and Michael Richter |
| 772 |
Structured Prompting for Arabic Essay Proficiency: A Trait-Centric Evaluation Approach |
Salim Al Mandhari, Hieu Pham Dinh, Mo El-Haj and Paul Rayson |
| 773 |
ForumOccitania: A Corpus of User-Generated Content for Multiple Occitan Varieties |
Oriane Nédey, Juliette Janes, Rachel Bawden, Thibault Clérice and Benoît Sagot |
| 774 |
MMCIG: Multimodal Cover Image Generation for Text-only Documents and Its Dataset Construction via
Pseudo-labeling |
Hyeyeon Kim, Sungwoo Han, Jingun Kwon, Hidetaka Kamigaito and Manabu Okumura |
| 775 |
When Words Don't Mean What They Say: Figurative Understanding in Bengali Idioms |
Adib Sakhawat, Shamim Ara Parveen, Md Ruhul Amin, Tahera Khatun, Shamim Al Mahmud and Md Saiful Islam |
| 779 |
Simple Additions, Substantial Gains: Expanding Scripts, Languages, and Lineage Coverage in URIEL+ |
Mason Shipton, York Hay Ng, Aditya Khan, Phuong H. Hoang, Xiang Lu, A. Seza Dogruoz and Annie Lee |
| 781 |
Medispeech: A French Reading and Spontaneous Speech Corpus for Sleepiness Estimation |
Colleen Beaumard, Vincent P. Martin, Charles Brazier, Julien Coelho, Jean-Luc Rouas and Pierre Philip |
| 782 |
ParaCLEAN: Improving Translation Quality through Systematic Parallel Data Cleaning |
Audrey Mash, Ella Paulina Bohman and Maite Melero |
| 783 |
Lexical and Discourse Semantics in a Reading-time Corpus of English |
Jakub Dotlacil, Laia Colina Fortuny, Li Kloostra and Johan Bos |
| 784 |
SiniticMTError: A Machine Translation Dataset with Error Annotations for Sinitic Languages |
Hannah Liu, Junghyun Min, Annie Lee, Ethan Yue Heng Cheung, Shou-Yi Hung, Elsie Chan, Shiyao Qian, Runtong
Liang, Kimlan Huynh, Wing Yu Yip, York Hay Ng, Tsz Fung Yau, Ka Ieng Charlotte Lo, You-Wei Wu and Richard
Tzong-Han Tsai |
| 785 |
DReUD: Discourse Relations in Universal Dependencies |
Jiří Mírovský and Pavlína Synková |
| 787 |
Of Words and Meaning: A Grammatical and Semantic Benchmark for Faroese LLM Understanding |
Iben Nyholm Debess, Barbara Scalvini and Bolette Pedersen |
| 789 |
The Corpus of Contemporary Polish — a New Reference Corpus with Rich Syntactic Annotations |
Witold Kieras, Małgorzata Marciniak, Marcin Woliński, Katarzyna Krasnowska-Kieraś and Marek Łaziński |
| 793 |
Phonetic-based Ranking for Improved Pseudo-Labeling in Low-Resource ASR |
Marco Matassoni, Roberto Gretter, Daniele Falavigna, Mohamed Nabih Ali Mohamed Nawar, Alessio Brutti, Luisa
Bentivogli, Mauro Cettolo, Marco Gaido, Sara Papi and Matteo Negri |
| 796 |
Conditioning LLMs to Generate Code-Switched Text |
Maite Heredia, Gorka Labaka, Jeremy Barnes and Aitor Soroa |
| 797 |
Reference-free Evaluation of Named Entity Recognition and Linking over OCRed Historical Texts |
Adam Jatowt, Thi Hong Hanh Tran, Ahmed Hamdi, Mickael Coustaty and Antoine Doucet |
| 798 |
Multimodal LLMs Do Not Compose Skills Optimally across Modalities |
Paula Ontalvilla, Aitor Ormazabal and Gorka Azkune |
| 799 |
Injecting Structured Biomedical Knowledge into Language Models: Continual Pretraining vs. GraphRAG |
Jaafer Klila, Sondes Souihi, Rahma Boujelbane, Nasredine Semmar and Lamia Hadrich-Belguith |
| 800 |
From Facts to Hypotheses: Joint Detection of Biomedical Relations and Epistemic Commitment Using LLMs |
Aleksandra Gabryszak, Phuc Tran Truong, Arne Binder, Nikola Milosevic, Felix-Sebastian Keese, Astrid
Rheinländer and Philippe Thomas |
| 801 |
TURING: Evaluating Human Abilities to Identify AI-Generated Texts |
Natalia Kalashnikova, Nicolas De Bufala, Sophie Fayad and Laurent Cervoni |
| 802 |
MultiGraSCCo: A Multilingual Anonymization Benchmark with Annotations of Personal Identifiers |
Ibrahim Baroud, Christoph Otto, Vera Czehmann, Christine Hovhannisyan, Lisa Raithel, Sebastian Möller and
Roland Roller |
| 803 |
Structured Legal Document Generation in India: A Model-Agnostic Wrapper Approach with VidhikDastaavej |
Shubham Kumar Nigam, Deepak Patnaik Balaramamahanthi, Noel Shallum, Kripabandhu Ghosh and Arnab Bhattacharya |
| 805 |
ManufactuBERT: Efficient Continual Pretraining for Manufacturing |
Robin Armingaud and Romaric Besancon |
| 806 |
MHTS: Multi-Hop Tree Structure Framework for Generating Difficulty-Controllable QA Datasets for RAG
Evaluation |
Jeongsoo Lee, Daeyong Kwon, Kyohoon Jin, JunNyeong Jeong, Minwoo Sim and Minwoo Kim |
| 809 |
Prague Dependency Treebank - Consolidated 2.0: Enriching a Complex Annotation Scheme |
Marie Mikulová, Jiří Mírovský, Milan Straka, Pavlína Synková, Jan Štěpánek, Barbora Štěpánková and Jan Hajič |
| 810 |
Controllable Sentence Simplification in Italian: Fine-Tuning Large Language Models on Automatically
Generated Resources |
Michele Papucci, Giulia Venturi and Felice Dell'Orletta |
| 813 |
PolyglotQL: A Pipeline for Multilingual Text-to-SPARQL Dataset Generation |
Julio Perez, Fabio Barth and Georg Rehm |
| 818 |
Towards Clinical Applications of NLP: Detecting Emotion Regulation via Emotional Categories and Expression
Modes in French Transcriptions |
Salome Klein, Amalia Todirascu and Hélène Vassiliadou |
| 819 |
StarDrinks: An English and Korean Test Set for SLU Evaluation in a Drink Ordering Scenario |
Marcely Zanon Boito, Caroline Brun, Inyoung Kim, Denys M. Proux, Salah Ait-Mokhtar, Nikolaos Lagos, Jean-Luc
Meunier and Ioan Calapodescu |
| 821 |
SciLaD: A Large-Scale, Transparent, Reproducible Dataset for Natural Scientific Language Processing |
Luca Foppiano, Sotaro Takeshita, Pedro Ortiz Suarez, Ekaterina Borisova, Raia Abu Ahmad, Malte Ostendorff,
Fabio Barth, Julian Moreno-Schneider and Georg Rehm |
| 822 |
Echoes of the Troubadours: A Corpus of Troubadour Poetry for Stylometric Analysis and Authorship Attribution |
Loic De Langhe, Orphee De Clercq and Veronique Hoste |
| 824 |
Voice, Bias, and Coreference: An Interpretability Study of Gender in Speech Translation |
Lina Conti, Dennis Fucci, Marco Gaido, Matteo Negri, Guillaume Wisniewski and Luisa Bentivogli |
| 826 |
Detecting Hallucinations in Authentic LLM–Human Interactions |
Yujie Ren, Niklas Gruhlke and Anne Lauscher |
| 827 |
Code-Switching in End-to-End Automatic Speech Recognition: A Systematic Literature Review |
Maha Tufail Agro, Atharva A. Kulkarni, Karima Kadaoui, Zeerak Talat and Hanan Aldarmaki |
| 828 |
Audio-Lyrics Alignment Dataset for Italian Arias |
Pushkar Jajoria, Arianna Graciotti, Giovanna Casali, Jesujoba Alabi, Rodolfo Delmonte, Angelo Pompilio,
Rocco Tripodi, James McDermott and Dietrich Klakow |
| 830 |
A Semi-Automatic Workflow for Transcribing and Annotating Broadcast News |
Christoph Draxler, Sven Grawunder, Jürgen Trouvain and Felicitas Kleber |
| 831 |
PRIVaThe: An Annotated Dataset of Multi-Objectives Web Search Sessions |
Claire Ibarboure, Ludovic Tanguy, Franck Amadieu and Josiane Mothe |
| 832 |
CausalSense: Leveraging Common Sense Knowledge and LLMs for Joint Event Extraction and Relation
Classification |
Youssra Rebboud, Pasquale Lisena and Raphael Troncy |
| 833 |
A Dataset of Wolof Ajami Manuscripts for HTR and OCR |
Oreen Yousuf, Elhadji Djibril Diagne, Christian Høgel, Beata Megyesi and Joakim Nivre |
| 834 |
Sovereign AI-based Public Services Are Viable and Affordable |
António Branco, Luis M. S. Gomes, Rodrigo Santos, Eduardo Santos, João Ricardo Silva, Nuno Marques and
Madalena Rodrigues |
| 836 |
DAMETA: An LLM Benchmark for Danish Metaphor Interpretation with Systematically Varied Distractors |
Nina Skovgaard Schneidermann, Sanni Nimb, Nathalie Carmen Hau Norman, Sussi Olsen and Bolette Pedersen |
| 838 |
A Fine-tuned ASR Model for Historical American Dialect Recordings |
Steven Coats |
| 839 |
WikIPA: Integrating WikiPron and Lingua Libre for Multilingual IPA Transcription |
Pierluigi Cassotti, Jacob Lee Suchardt and Domenico De Cristofaro |
| 841 |
TDMulti: A Tunisian Dialect-Modern Standard Arabic Multitask Corpus with a Context-Aware Cross-Attention
BERT Model |
Roua Torjmen and Kais Haddar |
| 842 |
Multimodal Reference by Means of the Pronoun We and Hand Gestures in a Novel Corpus of Parliamentary Opening
Debates |
Costanza Navarretta |
| 843 |
Parallel Corpus Filtering Based on Semantic Similarity and Surface Dissimilarity for Japanese Text
Simplification with LLMs |
Daisuke Maekawa, Tomoyuki Kajiwara and Takashi Ninomiya |
| 844 |
Conversion of the Clark Hall Dictionary of Old English to TEI with RDF: An End-to-end Pipeline for
Lexicographic Resource Retrodigitization |
Sergei Stoliarov, Maxim Ionov, Fahad Khan, Marina Buzzoni and Francesca Frontini |
| 845 |
The Added Value of Metadata and Annotations: Evidence from Two Large-Scale, Naturalistic Corpus Studies |
Anisia Popescu, Johanna Cronenberg, Ioana Vasilescu, Ioana Chitoran, Lori Lamel and Martine Adda-Decker |
| 847 |
Explainable Semantic Textual Similarity via Dissimilar Span Detection |
Diego Miguel Lozano, Daryna Dementieva and Alexander Fraser |
| 848 |
CareMedEval Dataset: Evaluating Critical Appraisal and Reasoning in the Biomedical Field |
Doria Bonzi, Alexandre Guiggi, Frederic Bechet, Carlos Ramisch and Benoit Favre |
| 852 |
Do Language Models Know Theo Has a Wife? Investigating the Proviso Problem |
Tara Azin, Daniel Dumitrescu, Diana Inkpen and Raj Singh |
| 854 |
Explaining Explanations: Interpretability Methods for Discourse Analysis of Transformer Attention Maps |
Louis Escouflaire, Jérémie Bogaert, Antonin Descampe, Cédrick Fairon and Francois-Xavier Standaert |
| 855 |
ŚMigiel Dataset: Laying Foundations for Investigating Machine-Generated Text Detection in Polish |
Jakub Strebeyko, Alina Wróblewska and Piotr Przybyła |
| 856 |
LongTailQA: Benchmarking LLMs and RAG Models on Disambiguated Long-Tail Entities |
William Xion, Uwe Hadler, Tim Cofala, Maximilian Idahl, Soumyadeep Roy and Wolfgang Nejdl |
| 857 |
Building and Annotating a Large Comparable Corpus for Studying the Semantic Quantification - Chinese,
French, Japanese, Korean |
Raoul Blin, Jinnam Choi, Wu Qishen, Yuxin Zhang, Soonhee Hwang, Takahiro Morita, Alexander Delaporte, Ilaine
Wang and Chang Liu |
| 858 |
The Megrelian Language Corpus (MLC): Creation, Annotation, and Initial Steps toward a UD Treebank |
Irina Lobzhanidze, Rusudan Gersamia and Tamar Gogia |
| 859 |
Towards Reliable Evaluation of Emotional Text Generation in LLMs: Human vs. Automatic Metrics |
Sadegh Jafari, Els Lefever and Veronique Hoste |
| 860 |
A New Semantic Artifact Based Framework for Studying and Documenting Algospeak and Related Phenomena |
Anas Fahad Khan, Elisa Gugliotta, Elisa Squadrito, Maura Tarquini and Francesca Frontini |
| 863 |
CRaFT: An Explanation-Based Framework for Evaluating Cultural Reasoning in Multilingual Language Models |
Shehenaz Hossain and Haithem Afli |
| 864 |
Gretino: A Greek and Latin Dataset to Benchmark Retrieval Systems in Classical Languages |
Hawau Olamide Toyin, Federico Iezzi, Elia Scapini, Giulio Federico and Giovanni Puccetti |
| 867 |
HEAD-QA v2: Expanding a Healthcare Benchmark for Reasoning |
Alexis Correa, Carlos Gómez-Rodríguez and David Vilares |
| 868 |
Benchmarking Mathematical Reasoning in a Low-Resource Language: Structured Prompting and Evaluation in
Basque |
Inigo Martinez-Criado, Aitor Soroa and Jeremy Barnes |
| 869 |
How Pragmatics Shape Articulation: A Computational Case Study in STEM ASL Discourse |
Saki Imai, Lee Kezar, Laurel Aichler, Mert Inan, Erin Walker, Alicia Wooten, Lorna Cobban Quandt and Malihe
Alikhani |
| 870 |
Multimodal Large Language Models for Low-Resource Languages: A Case Study for Basque |
Lukas Arana, Julen Etxaniz, Ander Salaberria and Gorka Azkune |
| 871 |
Privacy-Preserving Information Extraction with Local LLMs: A Comparative Study on Dutch Debt Collection
Letters |
Beyza Celep, Natalia Amat-Lefort and Joost Visser |
| 873 |
Real-Time Generation of Game Video Commentary with Multimodal LLMs: Pause-Aware Decoding Approaches |
Anum Afzal, Yuki Saito, Hiroya Takamura, Katsuhito Sudoh, Shinnosuke Takamichi, Graham Neubig, Florian
Matthes and Tatsuya Ishigaki |
| 875 |
Towards the Generation and Application of Dynamic Web-Based Visualization of UIMA-based Annotations for
Big-Data Corpora with the Help of Unified Dynamic Annotation Visualizer |
Thiemo Dahmann, Julian Schneider, Philipp Stephan, Giuseppe Abrami and Alexander Mehler |
| 876 |
A Parallel Cross-Lingual Benchmark for Multimodal Idiomaticity Understanding |
Dilara Torunoğlu-Selamet, Dogukan Arslan, Rodrigo Wilkens, Wei He, Doruk Eryiğit, Thomas Pickard, Adriana S.
Pagano, Aline Villavicencio, Gülşen Eryiğit, Ágnes Abuczki, Aida Cardoso, Alesia Lazarenka, Dina Almassova,
Amalia Mendes, Anna Kanellopoulou, Antoni Brosa-Rodriguez, Baiba Valkovska, Beata Wojtowicz, Bolette Pedersen,
Carlos Manuel Hidalgo-Ternero, Chaya Liebeskind, Danka Jokić, Diego Alves, Eleni Triantafyllidi, Erik Velldal,
Fred Philippy, Giedre Valunaite Oleskeviciene, Ieva Rizgeliene, Inguna Skadina, Irina Lobzhanidze, Isabell
Stinessen Haugen, Jauza Akbar Krito, Jelena M. Marković, Johanna Monti, Josue Alejandro Sauca, Kaja Dobrovoljc,
Kingsley O. Ugwuanyi, Laura Rituma, Lilja Øvrelid, Maha Tufail Agro, Manzura Abjalova, Maria Chatzigrigoriou,
María del Mar Sánchez Ramos, Marija Pendevska, Masoumeh Seyyedrezaei, Mehrnoush Shamsfard, Momina Ahsan,
Muhammad Ahsan Riaz Khan, Nathalie Carmen Hau Norman, Nilay Erdem Ayyıldız, Nina Hosseini-Kivanani, Noémi
Ligeti-Nagy, Numaan Naeem, Olha Kanishcheva, Olha Yatsyshyna, Daniil Orel, Petra Giommarelli, Petya Osenova,
Radovan Garabik, Regina E. Semou, Rozane Rebechi, Salsabila Zahirah Pranida, Samia Touileb, Sanni Nimb,
Sarfraz Ahmad, Sarvinoz Nematkhonova, Shahar Golan, Shaoxiong Ji, Sopuruchi Christian Aboh, Srdjan Sucur, Stella
Markantonatou, Sussi Olsen, Vahide Tajalli, Veronika Lipp, Voula Giouli, Yelda Yeşildal Eraydın, Zahra Saaberi
and Zhuohan Xie |
| 877 |
Uncovering Hidden Violent Tendencies in LLMs: A Demographic Analysis via Behavioral Vignettes |
Quintin Myers and Yanjun Gao |
| 880 |
Multiway Parallel Corpus in Forced Migration Domain for Multilingual Machine Translation |
Fatemeh Azadi, Samuel Larkin and Chi-kiu Lo |
| 881 |
TempPerturb-Eval: On the Joint Effects of Internal Temperature and External Perturbations in RAG Robustness |
Yongxin Zhou, Philippe Mulhem and Didier Schwab |
| 883 |
CS-YODAS: A Mined Dataset of In-the-Wild Code-Switched Speech |
Brian Yan, Qingzheng Wang, Matthew Wiesner, Anuj Diwan, Olga Iakovenko, Alex Polok, Injy Hamed, Shuichiro
Shimizu, Iris Emerman, Thomas Hain, David R. Mortensen, Peter Viechnicki and Shinji Watanabe |
| 886 |
Assessing the Difficulty of Inference Types in Natural Language Inference for Clinical Trials |
Mathilde Aguiar, Pierre Zweigenbaum and Nona Naderi |
| 888 |
FPSC: A Sustainable Pipeline for Building a Faroese Parliamentary Speech Corpus |
Dávid í Lág, Barbara Scalvini, Carlos Daniel Hernandez Mena and Jon Gudnason |
| 889 |
The Anonymized Text Corpus: Towards a Diverse and Ever Expanding Multilingual Text Corpus |
Ramunė Kasperė, Anna Bondar, Sergiu Nisioi, Maja Stegenwallner-Schütz, Hanne Bruun Søndergaard Knudsen, Ana
Matić Škorić, Eva Pavlinušić Vilus, Dorota Klimek-Jankowska, Chiara Tschirner, Not Battesta Soliva, Deborah
Noemie Jakobi, Cui Ding, Cengiz Acarturk, Matilda Agdler, Marius Anton, Annalisa Arcidiacono, Elizabete Barisa,
Ana Bautista, Lisa Beinborn, Nedeljka Bjelanović, Anna Isabelle Bothmann, Jan Brasser, Caterina Cacioli, Anila
Cepani, Ilze Ceple, Adelina Cerpja, Dali Chirino, Nazik Dinçtopal Deniz, Ana Došen, Inmaculada
Fajardo Bravo, Zigmunds Freibergs, Angelina Ganebnaya, Shan Gao, Miao He, Anamaria Hodivoianu, Habib Sani
Yahaya, Aleksandar Jevremovic, Hanna Kędzierska, Nik Kharlamov, Sara Kosutar, Vanja Kovic, Johanne Sofie Krog
Nedergård, Thyra Sigyn Hedda Krosness, Oleksandra Kuvshynova, Mirce Mihai Marin, Ella Lion, Marta
Lockiewicz, Kaidi Lõo, Paula Luegi, Clara D. Martin, Svitlana A. Matvieieva, Diane Mézière, Xavier Minguez,
Jurgita Motiejūnienė, Tolgonai Nasipbek Kyzy, Jamal Abdul Nasir, Vojislav Jovanovic, Ayşegül Ozkan, Patrizia
Paggio, Marijan Palmovic, Alberto Parola, Klaudia Laura Nivi Helene Petersen, Anja Podlesek, Eva Pospíšilová,
Marta Praulina, Mikuláš Preininger, Diego Rossini, Špela Rot, Jéssica Daniela Santos Gomes, Irina Sekerina,
Anne Gabija Skadina, Jordi Solé Casals, Nilgun Yucel, Spyridoula Varlokosta, João Veríssimo, Ahmad Mustapha Wali,
Peizheng Wu, Yu-Yin Hsu, Stefan Frank, Nora Hollenstein and Lena Jäger |
| 890 |
Sanskrit Travelogue: A Large-Scale Unified and Annotated Corpus of Sanskrit Texts |
Giacomo De Luca, Danilo Croce and Roberto Basili |
| 891 |
The Foggia Occupator Corpus: Digitisation, Annotation, and Computational Analysis of an Occupation-Era
Newspaper (1945–1946) |
Michele Ciletti |
| 893 |
A Typology of Synthetic Datasets for Dialogue Processing in Clinical Contexts |
Steven Bedrick, A. Seza Dogruoz and Sergiu Nisioi |
| 895 |
The Limits of Data Scaling: Sub-token Utilization and Acoustic Saturation in Multilingual ASR |
Siyu Liang, Nicolas Ballier, Gina-Anne Levow and Richard Wright |
| 897 |
MeteoGalEus: An Iberian Multilingual Weather Dataset in Galician, Euskera, and Spanish |
Ainhoa Vivel-Couso, Nella Zabrina Pramata, David Robredo, Aitor Soroa and Jose Maria Alonso-Moral |
| 899 |
Language Models Are Borrowing-Blind: A Multilingual Evaluation of Loanword Identification across 10
Languages |
Merilin Sousa Silva and Sina Ahmadi |
| 900 |
TTSVowelViz: A Tool for Visualising Text-to-Speech Model Training via Vowel Spaces |
Pasindu Udawatta, Jesin James, Balamurali B T, Catherine Inez Watson, Ake Nicholas and Binu Nisal Abeysinghe |
| 901 |
Automatic Suggestions of Supplements in the Herculaneum Papyri: Language Models and RESTful API |
Angelo Mario Del Grosso, Gabriele Giannessi, Simone Zenzaro and Federico Boschetti |
| 902 |
RadTimeline: Timeline Summarization for Longitudinal Radiological Lung Findings |
Sitong Zhou, Meliha Yetisgen and Mari Ostendorf |
| 904 |
NegNLI-BR: A Brazilian Portuguese Benchmark for Negation in Natural Language Inference |
Matheus Westhelle and Viviane Moreira |
| 909 |
Large Language Models Are Good Term Extractors: A Systematic Evaluation |
Ayla Rigouts Terryn |
| 910 |
BIS Reasoning 1.0: The First Large-Scale Japanese Benchmark for Belief-Inconsistent Syllogistic Reasoning |
Ha Thanh Nguyen, Hideyuki Tachibana, Chaoran Liu, Qianying Liu, Su Myat Noe, Koichi Takeda and Sadao
Kurohashi |
| 911 |
Enhancing Multi-Label Emotion Analysis and Corresponding Intensities for Ethiopian Languages |
Tadesse Destaw Belay, Dawit Ketema Gete, Abinew Ali Ayele, Olga Kolesnikova, Iqra Ameer, Grigori Sidorov and
Seid Muhie Yimam |
| 912 |
Bridging the Domain Divide: Supervised vs. Zero-Shot Clinical Section Segmentation from MIMIC-III to
Obstetrics |
Baris Karacan, Barbara Di Eugenio and Patrick Thornton |
| 914 |
Confabulations from ACL Publications (CAP): A Dataset for Scientific Hallucination Detection |
Federica Gamba, Aman Sinha, Timothee Mickus, Raul Vazquez, Patanjali Bhamidipati, Claudio Savelli, Ahana
Chattopadhyay, Laura A. Zanella, Yash Kankanampati, Binesh Arakkal Remesh, Aryan Ashok Chandramania, Rohit
Agarwal, Chuyuan Li, Ioana Buhnila and Radhika Mamidi |
| 916 |
JamC-QA: A Multiple-Choice Question Answering Benchmark for Japan-Specific Knowledge |
Teruaki Oka, Tomohide Shibata and Nao Yoshida |
| 918 |
Evaluating Text Style Transfer: A Nine-language Benchmark for Text Detoxification |
Vitaly Protasov, Nikolay Babakov, Daryna Dementieva and Alexander Panchenko |
| 920 |
Reasoning Graph-Structured Question Answering: Datasets and Insights from LLM Benchmarking |
Khin Yone, Devasha Trivedi, Anish Pahilajani, Jincen Shuai, Samyak Rajesh Jain, Ryan Rossi, Nesreen K.
Ahmed, Franck Dernoncourt, Yu Wang and Namyong Park |
| 922 |
FSHealth: A Filipino Sign Language Corpus for Healthcare Communication |
Derek Roi Bautista, Mia Bernice Cruz, Peter Paul Flaminiano and Jocelynn Cu |
| 924 |
A Large-Scale Dataset for Linking-Based Geocoding |
Hibiki Nakatani, Yuichiro Yasui, Ryosuke Wakamoto, Masayuki Ishii, Tetsuhisa Suizu, Hiroki Ouchi and Taro
Watanabe |
| 925 |
FiNERVINER: Fine-grained Named Entity Recognition for Vulnerable Languages of India's North Eastern Region |
Prachuryya Kaushik and Ashish Anand |
| 926 |
InstructSum: A Benchmark to Evaluate Instruction-Following Capability of Large Language Models in
Summarization |
Kosuke Nishida, Kyosuke Nishida and Itsumi Saito |
| 928 |
JBE-QA: Japanese Bar Exam QA Dataset for Assessing Legal Domain Knowledge |
Zhihan Cao, Fumihito Nishino, Hiroaki Yamada, Ha Thanh Nguyen, Yusuke Miyao and Ken Satoh |
| 930 |
Unified Encoders for French Speech and Text |
Phuong-Hang Le, Valentin Pelloin, Arnault Chatelain, Maryem Bouziane, Mohammed Ghennai, Qianwen Guan, Kirill
Milintsevich, Salima Mdhaffar, Aidan Mannion, Nils Defauw, Shuyue Gu, Alexandre Daniel Audibert, Marco
Dinarelli, Yannick Estève, Lorraine Goeuriot, Steffen Lalande, Nicolas Hervé, Maximin Coavoux, François
Portet, Étienne Ollion, Marie Candito, Maxime Peyrard, Solange Rossato, Benjamin Lecouteux, Aurélie Nardy, Gilles
Sérasset, Vincent Segonne, Solène Evain, Diandra Fabre and Didier Schwab |
| 934 |
Introducing PerMet 1.0: A Metaphor-Annotated Corpus for Persian |
Mohammad Saeid Miri |
| 935 |
AusKidTalk: Developing Transcription Guidelines for Continuous Australian English Child Speech |
Tuende Szalay, Zheng Nan, Renata Huang, Mostafa Shahin, Sirojan Tharmakulasingam, Kirrie Ballard and Beena
Ahmed |
| 936 |
Creating a High Quality Abstract Meaning Representation Dataset Automatically |
Johannes Heinecke, Asadullah Munshi, Frédéric Herledan and Geraldine Damnati |
| 940 |
HateMirage: An Explainable Multi-Dimensional Dataset for Decoding Faux Hate and Subtle Online Abuse |
Sai Kartheek Reddy Kasu, Shankar Biradar, Sunil Saumya and Md. Shad Akhtar |
| 948 |
Text+: A National Hub Including Legacy Language Data |
Florian Barth, Philippe Genêt, Christoph Draxler, Jennifer Ecker, Stefan Fischer, Alina Hemmer, Timm
Lehmberg, Thorsten Trippel, Andreas Witt, Claus Zinn and Arden Zimmermann |
| 950 |
Question and Response Dynamics in Public Service Encounters |
Wassiliki Siskou, Ingrid Espinoza, Laurin Friedrich, Steffen Eckhard and Annette Hautli-Janisz |
| 952 |
Modeling the Memory-Surprisal Trade-Off over Time: Communicative Efficiency Decreases with
Lexico-Grammatical Change in Scientific English |
Julius Steuer, Marie-Pauline Krielke, Stefania Degaetano-Ortlieb, Elke Teich and Dietrich Klakow |
| 953 |
Cross-Lingual and Cross-Cultural Transfer of Talk Move Classification to German Science Classrooms |
Christian Wartena, Christian Schumburg, Andreas Nehring, Marcel Ebert, Friederike Korneck, David Schmitt,
Marie Irmer and Birgit Neuhaus |
| 955 |
SDC: Sinhala Diachronic Corpus |
Nevidu Jayatilleke, Nisansa de Silva, Gagani Kasundhi Kulathilaka, Azra Safrullah and Johan Nevin Sofalas |
| 956 |
ShAnEL-2: A Multilingual Benchmarking Dataset for Short-Answer Language Learning Exercises |
Jasper Degraeuwe and Thomas Moerman |
| 957 |
IHPP: A Paragraph-Level Dataset for Investigating the Pragmatics of Hyperpartisan Italian News |
Michele Joshua Maggini, Davide Bassi, Angelo Valente, Gaël Dias and Pablo Gamallo |
| 958 |
Which Way Does Time Flow? A Psychophysics-Grounded Evaluation for Vision–Language Models |
Shiho Matta, Lis Kanashiro Pereira, Peitao Han, Fei Cheng and Shigeru Kitazawa |
| 959 |
NOVELSUM: Evaluating Long-Form Summary Generation for Historical Scandinavian Novels |
Ali Al-Laith, Alexander Conroy, Kirstine Nielsen Degn, Jens Bjerring-Hansen and Daniel Hershcovich |
| 960 |
TuniSpeech-21H: A Tunisian Arabic Speech Corpus for Automatic Speech Recognition |
Mohamed Ali Sghaier, Mohamed Lazhar Bellagha and Mounir Zrigui |
| 961 |
Extracting Medical Image-Related Entities from Spanish Electronic Health Records Using NER Methods |
Alexander Platas, Marcos Merino, Elena Zotova, Montse Cuadros and Karen López-Linares |
| 964 |
Ancient Greek to Modern Greek Machine Translation: A Novel Benchmark and Fine-tuning Experiments on LLMs and
NMT Models |
Spyridon Mavromatis, Sokratis Sofianopoulos, Prokopis Prokopidis and Maria Giagkou |
| 966 |
ENEIDE: A High Quality Silver Standard Dataset for Named Entity Recognition and Linking in Historical
Italian |
Cristian Santini, Sebastian Barzaghi, Paolo Sernani, Emanuele Frontoni, Laura Melosi and Mehwish Alam |
| 967 |
Are the LLMs Capable of Maintaining at Least the Language Genus? |
Sandra Mitrović, David Kletz, Ljiljana Dolamic and Fabio Rinaldi |
| 969 |
A Historical Database for the Study of Obstruent-Lateral Palatalization in Ibero-Romance |
Andrea García Covelo |
| 970 |
The Swedish Parliamentary Motions Corpus 1867-2024 |
Robert Borges, Fredrik Mohammadi Norén, Lotta Åberg Brorsson, Väinö Yrjänäinen, Hanna Bäck, Robert
Klemmensen and Måns Magnusson |
| 972 |
Big Five Personality Prediction through Emotion-Conditioned Representations and Learnable Psycholinguistic
Mapping |
Antonin Schnyder, Lorenzo Zangari and Davide Picca |
| 973 |
ARB: A Comprehensive Arabic Multimodal Reasoning Benchmark |
Sara Ghaboura, Shubham Patle, Ketan More, Wafa Hamad Mohamed Alghallabi, Omkar Thawakar, Jorma Laaksonen,
Hisham Cholakkal, Salman Khan and Rao Anwer |
| 976 |
APTFiNER: Annotation Preserving Translation for Fine-grained Named Entity Recognition |
Prachuryya Kaushik, Adittya Gupta, Ajanta Maurya, Gautam Sharma, V. V. Saradhi and Ashish Anand |
| 977 |
The Swedish Benchmark of Linguistic Minimal Pairs |
Johan Sjons, Fredrik Heinat and Murathan Kurfali |
| 979 |
RelEx-PT: A Portuguese Sentence-Level Relation Extraction Dataset |
Tomás Pinto, Catarina Silva and Hugo Goncalo Oliveira |
| 980 |
Designing LLM Agents for User-Centered Language Service Selection |
Ryoichiro Ogawa, Donghui Lin and Fumito Uwano |
| 981 |
Merge and Conquer: Instructing Multilingual Models by Adding Target Language Weights |
Eneko Valero, Maria Ribalta i Albado, Oscar Sainz, Naiara Perez and German Rigau |
| 982 |
A Novel Synthetic Dataset for Few-Shot Legal Relation Extraction in German |
Shiva Banasaz Nouri, Elena Leitner, Julian Moreno-Schneider and Georg Rehm |
| 983 |
Reasoning over Object Descriptions Improves Coreference Resolution in Task-Based Dialogue Systems |
Oier Ijurco and Oier Lopez de Lacalle |
| 984 |
[CorpusX - ANONYMIZED]: A Diachronic Corpus of French Broadcast Speech Controlled for Speakers' Age and
Gender |
Simon Devauchelle, David Doukhan, Remi Uro, Lucas Ondel, Valentin Pelloin, Olympia Imbert-Brégégère,
Véronique Lefort, Kévin Picard, Emeline Seignobos and Albert Rilliard |
| 987 |
PrePPER: A Preference Pattern-based Profiling Framework for Explainable Recommendation |
Taisuke Usumi, Takeharu Eda, Akiko Masaki, Akira Sakamoto and Sanae Muramatsu |
| 989 |
Benchmarking Portuguese Open Information Extraction |
Gabriel Silva, Mário Rodrigues, António Teixeira and Marlene Amorim |
| 990 |
MUStReason: A Benchmark for Diagnosing Pragmatic Reasoning in VideoLMs for Multimodal Sarcasm Detection |
Anisha Saha, Varsha Suresh, Timothy Hospedales and Vera Demberg |
| 992 |
Exploring the Transfer of Irony Explanation Generation from English to Dutch |
Aaron Maladry, Els Lefever, Cynthia Van Hee and Veronique Hoste |
| 996 |
Towards Safer Calls for Everyone: Designing a Benchmark Dataset for Evaluating Voice Phishing Detection
Models |
Joeun Kang, Gyuri Choi, Chanhyuk Yoon, Yongbin Jeong, Younggyun Hahm and Hansaem Kim |
| 997 |
Benchmarking Arabic Authorship Attribution and Style Transfer with Large Language Models |
Injy Hamed, Bashar Alhafni, Nizar Habash and Thamar Solorio |
| 1001 |
ViMedCSS: A Vietnamese Medical Code-Switching Speech Dataset & Benchmark |
Tung X. Nguyen, Nhu Vo, Giang Son Nguyen, Duy Mai Hoang, Chien Dinh Huynh, Inigo Jauregi Unanue, Massimo
Piccardi, Wray Buntine and Dung D. Le |
| 1002 |
SemiAdapt and SemiLoRA: Efficient Domain Adaptation for Transformer-based Low-Resource Language Translation
with a Case Study on Irish |
Josh McGiff and Nikola S. Nikolov |
| 1004 |
A Discourse-based Tool Series for Logical Validation of LLMs |
Boris Galitsky |
| 1005 |
DIDECO: An Annotated Dataset for Intent Detection in Digital Communications |
Senaid Popovic, Damien Riquet, Maxime Meyer, Yannick Parmentier and Fabien Lauer |
| 1006 |
Linguistic Knowledge-Infused Fine-Tuning for Mitigating Gender Bias in Machine Translation |
Luis Ernesto Garcia Estrada, Audrey Mash, Carlos Escolano, Maite Melero and Christine Basta |
| 1008 |
Towards a Comprehensive English Wordnet-Wikidata Mapping |
John P. McCrae, Johann Bergh, Krasimir Angelov and Joerg Waitelonis |
| 1013 |
Human Label Variation in Implicit Discourse Relation Recognition |
Frances Yung, Daniil Ignatev, Merel Scholman, Vera Demberg and Massimo Poesio |
| 1014 |
Fully Automated Identification of Lexical Alignment and Preference-Stage Shifts in Large Language Models |
Thomas Stephan Juzek, Xiaoyang Ming and Jose A. Hernandez |
| 1017 |
LLM-Based Data Generation and Clinical Skills Evaluation for Low-Resource French OSCEs |
Tian Huang, Tom Bourgeade and Irina Illina |
| 1018 |
Exploring Social Bias in Slovenia: The EEC-SL Dataset |
Jaya Caporusso, Damar Hoogland, Boshko Koloski, Matthew Purver, Senja Pollak and Špela Vintar |
| 1019 |
Issue Detection and Category Classification in Domain-Specific Technical Logbooks |
Afshin Karimi, Ingmar Hartl, Henrik Tuennermann and Anne Lauscher |
| 1021 |
Human vs LLM in Conversational Repair Annotation: A New Resource and Comparative Study |
Anh Ngo, Nicolas Rollet, Catherine Pelachaud and Chloé Clavel |
| 1022 |
A Comprehensive Full-Form Lexicon for Arabic NLP and Speech Technology |
Yannis Haralambous and Jack Halpern |
| 1023 |
What Triggers My Model? Contrastive Explanations Inform Gender Choices by Translation Models |
Janiça Hackenbuchner |
| 1024 |
How Much Noise Can BERT Handle? Insights from Multilingual Sentence Difficulty Detection |
Nouran Khallaf and Serge Sharoff |
| 1025 |
A Diagnostic Benchmark for Sweden-Related Factual Knowledge |
Jenny Kunz |
| 1026 |
GePaDeU - A Multi-layer Corpus of German Parliamentary Debates with Rich Semantic and Pragmatic Annotations
|
Ines Rehbein, Julian Schlenker, Lars Ostertag and Simone Paolo Ponzetto |
| 1027 |
Detecting Potentially Under-annotated Explicit Discourse Connectives in the Penn Discourse Treebank (PDTB-3)
with LLMs |
Yueh-Ting Chuang, Prathyusha Jwalapuram, Bonnie Webber and Xixian Liao |
| 1028 |
SALAN: A Massive ASR Dataset for the Languages of Niger |
Mamadou K Keita, Christopher Homan, Emily Prud'hommeaux, Abdoulaye Sako and Seydou Diallo |
| 1031 |
Can LLMs Understand Punchlines? LLMs' Narrative Understanding Evaluation with Short-shorts |
Jiashi Cheng and Takehito Utsuro |
| 1033 |
Evaluating the Impact of Question Wording Variation on the Answer Consistencies of Large Language Models
|
Junya Takayama, Masaya Ohagi, Tomoya Mizumoto and Katsumasa Yoshikawa |
| 1034 |
Instruction-Tuned Urdu LLMs: Efficient Adaptation of Llama Models and Evaluation Resources for Urdu |
Munief Hassan Tahir, Sana Shams, Sarmad Hussain and Miriam Butt |
| 1035 |
Is Biomedical Specialization Still Worth It? Insights from Domain-Adaptive Language Modelling with a New
French Health Corpus |
Aidan Mannion, Cécile Macaire, Armand Violle, Stéphane Ohayon, Xavier Tannier, Didier Schwab, Lorraine
Goeuriot and François Portet |
| 1040 |
Multi-SimLex for Dutch: Benchmarking Embedding- and Prompt-Based Model Performance on Semantic Similarity
|
Lizzy Brans and Jelke Bloem |
| 1041 |
Common European Language Data Space: Development, Current Status, and Future Perspectives |
Stelios Piperidis, Penny Labropoulou, Dimitrios Galanis, Khalid Choukri, Andrejs Vasiļjevs, Mitos Deligiannis,
Katerina Gkirtzou, Dimitris Gkoumas, Athanasia Kolovou, Leon Voukoutis, Kanella Pouli, Maria Giagkou, Maria
Gavriilidou, Katrin Marheinecke, Elena Leitner, Simon Ostermann, Stefania Raccioppa, Kossay Talmoudi, Victoria
Arranz, Valérie Mapelli, Helene Mazo, Fernanda González Campo, Shi Yu, Aivars Bērziņš, Andis Lagzdiņš and
Georg Rehm |
| 1042 |
MUCH: A Multilingual Claim Hallucination Benchmark |
Jérémie Dentan, Alexi Stanislas Canesse, Davide Buscaldi, Aymen Shabou and Sonia Vanier |
| 1045 |
MedInjection-FR: Exploring the Role of Native, Synthetic, and Translated Data in Biomedical Instruction
Tuning
|
Ikram Belmadani, Oumaima el Khettari, Pacome Constant Dit Beaufils, Benoit Favre and Richard Dufour |
| 1053 |
EMMT: A Simultaneous Eye-tracking, 4-Electrode EEG and Audio Corpus for Multi-modal Reading and Translation
|
Sunit Bhattacharya, Věra Kloudová, Vilem Zouhar and Ondřej Bojar |
| 1055 |
GPT-NL Public Corpus: A Permissively Licensed, Dutch-First Dataset for LLM Pre-training |
Jesse J. Van Oort, Frank Brinkkemper, Erik de Graaf and Bram Vanroy |
| 1056 |
Context-8: A Data Set for Evaluating Context Sensitivity in Machine Translation |
Dongyue Wang and Kyo Kageura |
| 1059 |
Evaluating the Impact of Source Diversity for RAG in Historical Research |
Ruhi Umesh Mahadeshwar, Andreas van Cranenburgh, Tommaso Caselli and Malvina Nissim |
| 1060 |
Learning Long-Document Embeddings via Chunk–Context Entailment |
Waheed Ahmed Abro, Naim Es-Sebbani and Zied Bouraoui |
| 1062 |
Irish-BLiMP: A Linguistic Benchmark for Evaluating Human and Language Model Performance in a Low-Resource
Setting |
Josh Mcgiff, Tung Khanh Tran, William Mulcahy, Dáibhidh Ó Luinín, Jake Dalzell, Róisín Ní Bhroin, Adam Burke,
Barry O'Sullivan, Hoang D. Nguyen and Nikola S. Nikolov |
| 1063 |
Steering LLMs toward Korean Local Speech: Iterative Refinement Framework for Faithful Dialect Translation
|
Keunhyeung Park, Seunguk Yu and Youngbin Kim |
| 1064 |
Scientific Article Section Classification (SASC) Dataset |
Nicolau Duran-Silva, Julian Moreno-Schneider, César Parra-Rojas and Georg Rehm |
| 1066 |
AgriChain: Visually-Grounded Expert-Verified Reasoning for Interpretable Agricultural Vision–Language Models
|
Hazza Mahmood, Yongqiang Yu and Rao Anwer |
| 1067 |
Is Clinical Text Enough? A Multimodal Study on Mortality Prediction in Heart Failure Patients |
Oumaima El Khettari, Virgile Barthet, Guillaume Hocquet, Joconde Weller, Emmanuel Morin and Pierre
Zweigenbaum
|
| 1068 |
ViKhoMT: A Vietnamese–K'Ho Neural Machine Translation Dataset and Evaluation for Community Health
Communication |
Tram Truong, Vinh Nguyen, Dang Van Thin and Ngan Nguyen |
| 1071 |
How I Met Your Snowclone: Unsupervised Discovery of Snowclone Patterns in Large Datasets |
Julien Bezançon, Gaël Lejeune and Marceau Hernandez |
| 1072 |
HOME-KGQA: A Benchmark Dataset for Multimodal Knowledge Graph Question Answering on Household Daily
Activities
|
Shusaku Egami, Aoi Ohta, Tomoki Tsujimura, Masaki Asada, Tatsuya Ishigaki, Ken Fukuda, Masahiro Hamasaki and
Hiroya Takamura |
| 1076 |
Human-Centered Multimodal Fusion for Sexism Detection in Memes with Eye-Tracking, Heart Rate, and EEG
Signals
|
Iván Arcos Gabaldón, Paolo Rosso and Elena Gomis Vicent |
| 1078 |
Mechanistic Interpretability Meets Cognitive Linguistics: Modelling Locative Image Schemas in the Circuit
Framework |
Mattia Proietti, Afra Alishahi, Grzegorz Chrupała and Alessandro Lenci |
| 1080 |
LombardoGraphia: Automatic Classification of Lombard Orthography Variants |
Edoardo Signoroni and Pavel Rychly |
| 1081 |
The Impact of Tokenization Algorithms on Hungarian Language Model Performance |
Mátyás Osváth, Máté Norbert Molnár and Noémi Ligeti-Nagy |
| 1082 |
XXX: A Strategic Framework for Digital Sovereignty and Linguistic Inclusion of Basque in the Era of AI |
Victoria Arranz, Sara Arregi, Leire Barañano and Aitor García-Pablos |
| 1084 |
SPOT: An Annotated French Corpus and Benchmark for Detecting Critical Interventions in Online Conversations
|
Manon Berriche, Célia Nouri, Chloé Clavel and Jean-Philippe Cointet |
| 1086 |
Meenz Bleibt Meenz, but Large Language Models Do Not Speak the Dialect of Mainz |
Minh Duc Bui, Manuel Mager, Peter Herbert Kann and Katharina von der Wense |
| 1087 |
Knowledge-Infused Hierarchy-Aware Emotion Recognition in Code-mixed Mental Health Counseling Conversations
|
Aseem Srivastava, Kushagra Mittal, Anusha Tiwari and Md. Shad Akhtar |
| 1089 |
EduBench: A Portuguese Benchmark for Open-Ended Discursive Question Answering |
Pedro Henrique Paiola, Luís Gabriel Damiati Mendes, Bruno de Oliveira Monchelato, André da Fonseca Schuck,
Gabriel Lino Garcia, Douglas Rodrigues, Helena de Medeiros Caseli and João Paulo Papa |
| 1090 |
Temporal Expression Recognition in Legal Transcripts |
Elizabeth J. Goldstein and Maria Berger |
| 1091 |
JMTEB and JMTEB-lite: Japanese Massive Text Embedding Benchmark and Its Lightweight Version |
Shengzhe Li, Masaya Ohagi, Ryokan Ri, Akihiko Fukuchi, Tomohide Shibata and Daisuke Kawahara |
| 1093 |
Estonian WinoGrande Dataset: Comparative Analysis of LLMs Performance on Human and Machine Translation |
Marii Ojastu, Hele-Andra Kuulmets, Aleksei Dorkin, Marika Borovikova, Dage Särg and Kairit Sirts |
| 1094 |
Variation Is the Norm: Embracing Sociolinguistics in NLP |
Anne-Marie Lutgen, Alistair Plum, Verena Blaschke, Barbara Plank and Christoph Purschke |
| 1097 |
Disambiguation of Emotion Annotations by Contextualizing Events in Plausible Narratives |
Johannes Schaefer and Roman Klinger |
| 1098 |
Extending the Semantic Layer of the CompL-it Italian Lexicon: Traits, Semantic Types, and Definitions |
Emiliano Giovannetti, Andrea Bellandi, Simone Marchi and Mafalda Papini |
| 1099 |
Data Selection Effects on Self-Supervised Learning of Audio Representations for French Audiovisual
Broadcasts
|
Valentin Pelloin, Lina Bekkali, Reda Dehak and David Doukhan |
| 1102 |
Bootstrapping NLP for Sakha: Named Entity Recognition and Sentiment Analysis in an Extremely Low-Resource
Setting |
Mariia Everstova, Nikolai Efimov and Valerio Basile |
| 1103 |
Automatic Essay Scoring and Feedback Generation in Basque Language Learning |
Ekhi Azurmendi, Xabier Arregi and Oier Lopez de Lacalle |
| 1104 |
Lightweight Cross-Lingual Federated Prompt Tuning for Low-Resource Languages |
Ubaid Azam, Imran Razzak and Shoaib Jameel |
| 1106 |
[Anonymized]: Leveraging Curriculum Learning to Achieve Equitable Language Representation |
Toms Bergmanis, Ingus Jānis Pretkalniņš, Martins Kronis, Davis Nicmanis, Jeļizaveta Jelinska, Roberts Rozis,
Rinalds Vīksna and Mārcis Pinnis |
| 1107 |
SENS-ASR: Semantic Embedding Injection in Neural-transducer for Streaming Automatic Speech Recognition |
Youness Dkhissi, Valentin Vielzeuf, Elys Allesiardo and Anthony Larcher |
| 1110 |
A German High School Student Corpus with Keystroke Logging Data |
Nils-Jonathan Schaller, Thorben Jansen, Lars Höft, Hannah Pünjer and Andrea Horbach |
| 1112 |
Reading Dynamics and Comprehension in Cognitive Aging: A Multimodal Language Resource |
Claudia Marzi, Noemi Boni, Alice Todesco, Andrea Nadalini, Giorgia Albertin, Cristina Dolciotti, Paolo
Bongioanni, Marcello Ferro, Fabio Tamburini, Gloria Gagliardi and Vito Pirrelli |
| 1113 |
Efficient Financial Language Understanding via Distillation with Synthetic Data |
Wen-Fong (Xavier) Huang and Edwin Simpson |
| 1114 |
Gender Bias in MT for a Genderless Language: New Benchmarks for Basque |
Amaia Murillo, Olatz Perez-de-Viñaspre and Naiara Perez |
| 1117 |
DialectalArabicMMLU: Benchmarking Dialectal Capabilities in Arabic and Multilingual Language Models |
Malik H. Altakrori, Nizar Habash, Teresa Lynn, Younes Samih, Abed Alhakim Freihat, Kirill Chirkunov,
Muhammed AbuOdeh, Radu Florian, Preslav Nakov and Alham Fikri Aji |
| 1118 |
GeoBenchmark: Probing Large Language Models for Geo-Spatial Knowledge |
Ayomide Abayomi, Jose G. Moreno, Karim Radouane and Lynda Tamine |
| 1120 |
Meet UD_Czech-PDTC: A Large and Genre-Rich Treebank in Universal Dependencies |
Marie Mikulová, Barbora Štěpánková, Daniel Zeman, Jan Štěpánek, Milan Straka and Jan Hajič |
| 1123 |
Encoding Logical Relations of Chinese Complex Sentences within the Universal Dependencies Framework |
Hongpu Zhu and Hongzhi Xu |
| 1126 |
Sentiment Analysis of German Sign Language Fairy Tales |
Fabrizio Nunnari, Siddhant Jain and Patrick Gebhard |
| 1131 |
Hindsight Quality Prediction Experiments in Multi-Candidate Human-Post-Edited Machine Translation |
Malik Marmonier, Benoît Sagot and Rachel Bawden |
| 1135 |
A Sociophonetic Analysis of Racial Bias in Commercial ASR Systems Using the Pacific Northwest English Corpus
|
Michael Scott, Siyu Liang, Alicia Wassink and Gina-Anne Levow |
| 1138 |
ADHD-Lang: A Large-Scale Social Media Dataset for Verbal Behavior and Digital Phenotyping in Adult ADHD |
Daniel Wiechmann, Elma Kerz, Edward Kempa and Yu Qiao |
| 1139 |
AmDi - Ambiguous Words Diachronic Dataset |
Kai Kugler and Felix Thielen |
| 1141 |
A Scalable Pipeline for Novelty Detection in Skill Extraction Using Large Language Models |
Gian Seifert and Simon Clematide |
| 1142 |
FactOReS: Fact-checking with an Evidence-based Open Resource in Spanish |
Nagore Bravo, Jaione Bengoetxea, Iker García-Ferrero, Alba Bonet Jover, Estela Saquete and Rodrigo Agerri
|
| 1143 |
SemBench: A Universal Semantic Framework for LLM Evaluation |
Mikel Zubillaga, Naiara Perez, Oscar Sainz and German Rigau |
| 1145 |
OTA-BOUN: A Historical Turkish Dependency Treebank |
Tarık Emre Tıraş, Nureddin Cüneyd Ünal, Ada Cengiz, Ece Yurtseven, Esma F. Bilgin Taşdemir and Saziye Betul
Ozates |
| 1146 |
Appraisal Theory-Informed Emotion Prediction |
Xiaowei Wang, Jayant Teotia, Rui Mao, Wandeep Kaur Ratan Singh, Sabrina Binti Tiun and Erik Cambria |
| 1149 |
Stands to Reason: Investigating the Effect of Reasoning on Idiomaticity Detection |
Dylan Phelps, Rodrigo Wilkens, Edward Gow-Smith, Thomas M. R. Pickard, Maggie Mi and Aline Villavicencio
|
| 1150 |
TCMPHal: A Large-scale Dataset for Hallucination Detection in Traditional Chinese Medicine Pharmacy |
Nijia Han, Zimu Wang, Ziwen Xie, Wei Wang, Jia Meng, John Moraros and Shuihua Wang |
| 1151 |
User Profiling for Specification-Sensitive Recommendations with Large Language Model Prompting |
Chih-Yu Chien, An-Zi Yen, Hen-Hsen Huang and Hsin-Hsi Chen |
| 1152 |
A Parallel Corpus of the Parable of the Prodigal Son: Building a Resource for Documenting Language Varieties
in Metropolitan France |
Lucence Ing, Juliette Janes, Sven Ködel and Benoît Sagot |
| 1153 |
A Recipe for Adapting Multilingual Embedders to OCR-Error Robustness and Historical Texts |
Andrianos Michail, Stylianos Psychias, Juri Opitz and Simon Clematide |
| 1157 |
Evaluating LLM-based Text Simplification for German: Effects on Post-Editing Effort, Quality Ratings, and
User Comprehension |
Luisa Carrer, Andreas Säuberli, Martin Kappus, Lukas Fischer and Sarah Ebling |
| 1158 |
ESG-QA: Building a Dataset for Question Answering on Environmental, Social, and Governance Pillars |
Gabriel Assis, Ayrton Surica, Pedro Kroll, Gabriela Aires Mendes, Darian Rabbani, Edson Bollis, Lucas
Francisco Amaral Orosco Pellicer and Aline Paes |
| 1160 |
Enhancing and Evaluating Tabular Models on the Fly via Synthetic Question–Answer Generation |
Jorge Osés Grijalba, Eugenio Martínez Cámara, L. Alfonso Ureña-López and Jose Camacho-Collados |
| 1164 |
Phrase-Level Segmentation on Medieval Corpora for Aligning Multilingual Texts |
Lucence Ing, Matthias Gille Levenson and Carolina Macedo |
| 1165 |
AraREQ: A Dataset and End-to-End Conflict Detection and Resolution in Software Requirements |
Tymaa Hasanain Hammouda, Mustafa Jarrar, Alaa Aljabari and Nagham Fahim Hamad |
| 1168 |
Common Sense vs. Morality: The Curious Case of Narrative Focus Bias in LLMs |
Saugata Purkayastha, Pranav Kushare, Pragya Paramita Pal and Sukannya Purkayastha |
| 1170 |
A Critical Study of Automatic Evaluation in Sign Language Translation |
Shakib Yazdani, Yasser Hamidullah, Cristina España-Bonet, Eleftherios Avramidis and Josef van Genabith |
| 1171 |
Do LLMs Judge Distantly Supervised Named Entity Labels Well? Constructing the JudgeWEL Dataset |
Alistair Plum, Laura Maria Bernardy and Tharindu Ranasinghe |
| 1172 |
SWE-QA: A Dataset and Benchmark for Complex Code Understanding |
Laila Elkoussy and Julien Perez |
| 1173 |
The MISOMEM-Val Dataset for Identifying Human Values in Misogynistic Memes |
Rakshitha Rao Ailneni and Sanda Harabagiu |
| 1174 |
A Multilingual Human Annotated Corpus of Original and Easy-to-Read Texts to Support Access to Democratic
Participatory Processes |
Verena Riegler, Stefan Bott, Horacio Saggion, Almudena Rascón Alcaina and Nouran Khallaf |
| 1175 |
Listening for Ideology: Automatic Analysis of Character Speech in Historical Nazi Propaganda Films |
Nicolas Ruth and Manuel Burghardt |
| 1176 |
A Corpus for Personalized Dialogue Breakdown Repair in Japanese Open-Domain Conversations |
Kazuya Tsubokura, Yurie Iribe and Norihide Kitaoka |
| 1178 |
SyntaxGym for French: Resource, Annotation, and Evaluation of French and Multilingual LLMs |
Tatiana Bladier, Henri-José Deulofeu and Alexis Nasr |
| 1179 |
Evaluating Large Language Models for Text-to-Gloss Translation in Kazakh-Russian Sign Language: A Pilot
Study
|
Zhanibek Kozhirbayev and Alfarabi Imashev |
| 1182 |
Linguistic Knowledge Graphs for Sense Prediction: A Case-study on Latin |
Eleonora Ghizzota, Paola Marongiu, Pierpaolo Basile, Stefano Ferilli and Barbara McGillivray |
| 1184 |
CrisisCL: A Domain Incremental Learning Benchmark for Crisis Management |
Paul le van Kiem, Romain Meunier, Farah Benamara and Véronique Moriceau |
| 1185 |
Building the AURIS Corpus of Reference and Information Structure |
Christian Chiarcos, Christian Fäth, Tabea Gröger and Quentin Alastair Frey |
| 1188 |
MAD: A Corpus of Multilingual Argumentative Deliberation |
Eimear Maguire, Ella Schad, Jacky Visser, Chris Reed and John Lawrence |
| 1189 |
Unsupervised Labelling of Mutation Triggers in Welsh |
Nicolás Gutiérrez-Rolón and Fernando Alva-Manchego |
| 1190 |
Investigating How LLMs Propagate Female Stereotypes: Comparing What Models Say via Prompts with What They
Represent in Their Embeddings |
Andrea Valderrey Nuñez and Jelke Bloem |
| 1192 |
AMORES: A Spanish Language Resource for an Extended Set of Moral Foundations |
Oscar Araque, Daniel Molina, Anny D. Alvarez Nogales and Carlos A. Iglesias |
| 1193 |
PragExTra: A Multilingual Corpus of Pragmatic Explicitation in Translation |
Doreen Osmelak, Koel Dutta Chowdhury, Uliana Sentsova, Cristina España-Bonet and Josef van Genabith |
| 1194 |
Supplementary Resources and Analysis for Automatic Speech Recognition Systems Trained on the Loquacious
Dataset |
Nick Rossenbach, Robin Schmitt, Tina Raissi, Simon Berger, Larissa Kleppel and Ralf Schlüter |
| 1196 |
RAGE: Roman and Greek Emotions |
Frederick Riemenschneider, Jonathan D. Geiger, Thomas Kuhn-Treichel and Anette Frank |
| 1197 |
WhiteHouse: Translation of the Casablanca Corpus for Multi-dialectal Arabic Speech Translation |
Fethi Bougares, Salima Mdhaffar and Yannick Estève |
| 1198 |
The Evolution of Philosophy: A Metaphorical Cognition Perspective |
Rui Mao, Dapeng Chen, Zihao Huang, Xulang Zhang and Erik Cambria |
| 1201 |
Synthetic Instruction Generation for Low-Resource Nordic Languages: Viability and Limitations in LLM
Instruction-Tuning |
Mathias Stenlund, Annika Simonsen, Lars Bungum, Jan Ebert, Jiangtao Wang, Oleg Filatov, Hemanadhan Myneni,
Morris Riedel and Hafsteinn Einarsson |
| 1202 |
ACID: On the Perception of Online Classism |
Arianna Muti, Elisa Bassignana, Amanda Cercas Curry, Federica Durante, Dirk Hovy and Debora Nozza |
| 1206 |
UzUDT: Uzbek Universal Dependencies Treebank |
Sanatbek Gayratovich Matlatipov and Mersaid Aripov |
| 1212 |
Leveraging Linguistic Similarity for Low-Resource Speech Transcription |
Valentina Fedchenko and Eric Jordan |
| 1213 |
EPOP: A Benchmark Corpus for Assessing NLP Models on Structured Information Extraction in Plant Health |
Robert Bossy, Marine Courtin, Xinzhi Yao, Marie Grosdidier, Isabelle Pieretti, Sandy Duperier and Claire
Nedellec |
| 1214 |
From Variance to Invariance: Qualitative Content Analysis for Narrative Graph Annotation |
Junbo Huang, Max Weinig, Ulrich Fritsche and Ricardo Usbeck |
| 1218 |
Rubric-Guided Fine-tuning of SpeechLLMs for Multi-Aspect, Multi-Rater L2 Reading-Speech Assessment |
Aditya Kamlesh Parikh, Cristian Tejedor-García, Catia Cucchiarini and Helmer Strik |
| 1222 |
Evaluating Style Embeddings for Machine-Generated Text Detection |
Noé Durandard, Saurabh Dhawan and Thierry Poibeau |
| 1225 |
A Dataset of Historical Medical Periodicals Annotated with Textual Genre |
Vera Danilova and Sara Stymne |
| 1226 |
A Dataset for Evaluating ASR on Specialized Vocabulary |
Emily Haubert Klering, Eduardo Gabriel Cortes, Tatjana Chernenko, Mariana Vargas Trarbach, Gabriel de Oliveira
Ramos, Sandro José Rigo, Maitê Dupont, Ana Luiza Treichel Vianna, Gabriela Krause dos Santos, Vinicius
Meirelles Pereira, Denis Andrei de Araujo and Rafael Kunst |
| 1227 |
VIVID: A Culturally Grounded Benchmark Exposing the Figurative Language Gap in Vietnamese NLP |
Tu Tran Do, Nhat Ngoc Nguyen, Tung Khanh Tran, Hoang D. Nguyen, Tu Minh Phuong and Long Hoang Dang |
| 1228 |
AYN: A Tiny Yet Competitive Indian Legal Language Model Pretrained from Scratch |
Mitodru Niyogi, Eric Gaussier and Arnab Bhattacharya |
| 1229 |
Can NLP Tackle Hate Speech in the Real World? Stakeholder-Informed Feedback and Survey on Counterspeech |
Tanvi Dinkar, Aiqi Jiang, Simona Frenda, Poppy Gerrard-Abbott, Nancie A. Gunson, Gavin Abercrombie and
Ioannis Konstas |
| 1230 |
Preserving Endangered Linguistic Heritage: Developing a Corpus for the Study of Contact-induced Changes in
Corfioto |
Giorgio Maria Di Nunzio and Georgios Vardakis |
| 1232 |
Towards Consistent Detection of Cognitive Distortions: LLM-Based Annotation and Dataset-Agnostic Evaluation
|
Neha Sharma, Navneet Agarwal and Kairit Sirts |
| 1233 |
Construction of a Japanese RAG Benchmark Using Synthetic Documents on Non-existent Entities and Events |
Shengzhe Li, Masaya Ohagi, Hayato Tsukagoshi, Akihiko Fukuchi, Tomohide Shibata and Daisuke Kawahara |
| 1236 |
Assessing Logical Coherence of LLMs via Fine-Grained NLI |
Jon Felix Apaolaza Larraya, Begoña Altuna, Aitor Soroa and Inigo Lopez-Gazpio |
| 1237 |
Developing Zila: A Spoken Language Resource for the Endangered Slovenian Gail Valley Dialect |
Andrej Zgank, Gregor Donaj, Urh Kolaric, Usi Sereinig, Tatjana Koren-Zwitter, Sanja Boto, Sabina
Zwitter-Grilc, Jasna Vidinic and Darinka Verdonik |
| 1239 |
Counter-Hypothesis Generation: Towards Evaluating How LLMs Reason about Alternatives |
Marzieh Abdolmaleki, Aaron Maladry, Veronique Hoste and Els Lefever |
| 1240 |
HistoriQA-ThirdRepublic: Multi-Hop Question Answering Corpus for Historical Research, Parliamentary Debates
from the French Third Republic (1870-1940) |
Aurelien Pellet, Marie Anna Puren and Julien Perez |
| 1241 |
C4: A Multilingual Benchmark for Retrieval-Augmented Generation Based on the Catechism of the Catholic
Church and Its Compendium |
Pius von Däniken, Mark Cieliebak and Jan Deriu |
| 1242 |
ConGA: Guidelines for Contextual Gender Annotation. A Framework for Annotating Gender in Machine Translation
|
Argentina Anna Rescigno, Eva Vanmassenhove and Johanna Monti |
| 1243 |
Low-Resource Dialect Adaptation of Large Language Models: A French Dialect Case-Study |
Eeham Khan, Firas Saidani, Owen Van Esbroeck, Richard Khoury and Leila Kosseim |
| 1244 |
Modeling the Human Lexicon under Temperature Variations: Linguistic Factors, Diversity and Typicality in LLM
Word Associations |
Maria A. Rodriguez, Marie Candito and Richard Huyghe |
| 1245 |
GerVLPro: A CEFR-Graded Vocabulary List of L2 Learners' Productive Vocabulary in German |
Noah-Manuel Michael, Anna Huelsing and Andrea Horbach |
| 1246 |
Automated Anomaly Detection for Ensuring the Quality of Fieldwork Data: Assessing Community-Driven
Documentation of an Under-Resourced Language at the Myanmar Border |
Kellen Parker van Dam, Abishek Stephen and Keen Thaam |
| 1247 |
The Moralization Corpus: Frame-Based Annotation and Analysis of Moralizing Speech Acts across Diverse Text
Genres |
Maria Becker, Mirko Sommer, Lars Tapken, Yi Wan Teh and Bruno Brocai |
| 1248 |
Infox-QC: A Quebec-Focused French Corpus for Misinformation Detection and AI Robustness Assessment |
Moetaz Doghmane, Hazem Amamou, Thiziri Sefsaf, Alan Davoust and Anderson Raymundo Avila |
| 1249 |
From Articles to Premises: Building PrimeFacts, an Extraction Methodology and Resource for Fact-Checking
Evidence |
Premtim Sahitaj, Jawan Kolanowski, Ariana Sahitaj, Veronika Solopova, Max Upravitelev, Daniel Röder, Iffat
Maab, Junichi Yamagishi, Sebastian Möller and Vera Schmitt |
| 1250 |
Paragraph Segmentation Revisited: Towards a Standard Task for Structuring Speech |
Fabian Retkowski and Alexander Waibel |
| 1251 |
HybridCodeAuthorship: A Benchmark Dataset for Line-Level Code Authorship Detection |
Luke S. Patterson, Li Wang and Adam Faulkner |
| 1253 |
CorEGe-PT: Compiling a Large Corpus of Academic Texts in Portuguese |
Tanara Zingano Kuhn, José Matos, Bruno Neves, Daniela Pereira, Elisabete Cação, Ivo Simões, Jacinto Estima,
Delfim Leão and Hugo Goncalo Oliveira |
| 1256 |
SLURP-TN: Resource for Tunisian Dialect Spoken Language Understanding |
Haroun Elleuch, Salima Mdhaffar, Yannick Estève and Fethi Bougares |
| 1257 |
Optimizing Multilingual LLMs via Federated Learning: A Study of Linguistic Composition in Clients |
Aleix Sant, Jordi Luque and Carlos Escolano |
| 1258 |
ReTaT: A Unified Benchmark for Relation Extraction across Text and Table |
Nathalie Aussenac-Gilles, Mohamed Ettaleb, Mouna Kamel, Véronique Moriceau, Raphael Troncy, Yoan Chabot,
Thibault Ehrhart and Fanfu Wei |
| 1260 |
Event Chronography in Multi-modal Data: The BME Method for Quantitative Analyses |
Anaïs Claire Murat, Maria Koutsombogera and Carl Vogel |
| 1261 |
Multilingual, Multimodal Pipeline for Creating Authentic and Structured Fact-Checked Claim Dataset |
Z. Melce Hüsünbeyi, Virginie Mouilleron, Leonie Uhling, Daniel Foppe, Tatjana Scheffler and Djamé Seddah
|
| 1262 |
YoNER: A New Yorùbá Multi-domain Named Entity Recognition Dataset |
Peace Busola Falola, Jesujoba Alabi, Solomon O. Akinola, Folashade T. Ogunajo, Emmanuel Oluwadunsin Alabi
and David Ifeoluwa Adelani |
| 1264 |
Efficient Dialect-Aware Modeling and Conditioning for Low-Resource Taiwanese Hakka Speech Processing |
Peng An-Ci, Kuan-Tang Huang, Tien-Hong Lo, Hung-Shin Lee, Hsin-Min Wang and Berlin Chen |
| 1269 |
Constructing and Annotating Historical Multilingual Parallel Text Collections on the TEITOK Platform |
Maarten Janssen, Anna Jouravel and Piroska Lendvai |
| 1271 |
A Multi-Label Neural POS Tagger for Faroese with Constrained Loss |
Annika Simonsen, Barbara Scalvini, Uni Johannesen, Iben Nyholm Debess, Hafsteinn Einarsson and Vésteinn
Snæbjarnarson |
| 1272 |
Analysing Lightweight Large Language Models for Biomedical Named Entity Recognition on Diverse Output Formats
|
Pierre Epron, Mehwish Alam and Adrien Coulet |
| 1276 |
EL-MIA: Quantifying Membership Inference Risks of Sensitive Entities in LLMs |
Ali Satvaty, Suzan Verberne and Fatih Turkmen |
| 1278 |
A Dataset for Probing Translationese Preferences in English-to-Swedish Translation |
Jenny Kunz, Anja Jarochenko and Marcel Bollmann |
| 1279 |
To Eat and beyond: A FrameNet-Inspired Annotation of Food and Its Uses over Time |
Teresa Paccosi, Gauri Bhagwat and Marieke van Erp |
| 1282 |
Conversational Implicatures through the Lens of LLMs |
Agnese Lombardi and Alessandro Lenci |
| 1283 |
Syntactic Sugar for Syntactic Queries: Sequential Representations for Dependency Queries |
Niklas Deworetzki and Arianna Masciolini |
| 1284 |
To Overfit or Not to Overfit? An Evaluation of HTR Workflow on 17th-18th Century French Corpus |
Marine Tiger |
| 1285 |
Toward Conversational Hungarian Speech Recognition: Introducing the BEA-Large and BEA-Dialogue Datasets |
Máté Gedeon, Piroska Zsófia Barta, Peter Mihajlik, Tekla Etelka Graczi, Anna Kohári and Katalin Mády |
| 1289 |
UnarXive 2024: A Large-Scale Scientific Corpus for Citation-Aware Retrieval and Generation |
Michael Faerber and Ines Besrour |
| 1290 |
Nawatl Context-Free Grammars for Natural Language Processing |
Juan-Manuel Torres-Moreno, Martha Lorena Avendaño Garrido, Ligia Quintana Torres, Miguel Figueroa-Saavedra,
Juan Jose Guzman Landa and Graham Ranger |
| 1292 |
Reformulate and Create, Don't Translate: Creating Natural Prompts for Underserved Languages |
Annika Simonsen, Mathias Stenlund, Lars Bungum, Marc Daníel Skipstað Volhardt and Hafsteinn Einarsson |
| 1293 |
Building Bridges between Student and Curricular Language: Creating a Corpus of Abstract Meaning
Representations for the Classroom |
Kristin Wright-Bettner, Zheng Cai, Zekun Zhao, James H. Martin, Jeffrey Flanigan and Martha Palmer |
| 1294 |
Clinical Corpus Development: Legal Compliance and Semantic Enrichment |
Justin Hofenbitzer, Christina Lohr, Andrea Riedel, Rebekka Kiser, Aliaksandra Shutsko, Abanoub Abdelmalak,
Peter Klügl, Jutta Romberg, Sarah Riepenhausen, Miriam Schechner, Jakob Faller, Frank Meineke, Luise Modersohn,
Markus Löffler, Juliane Fluck, Udo Hahn, Stefan Schulz and Martin Boeker |
| 1295 |
CAL: A Comprehensive Lexicon for Contemporary Arabic Language |
Afrah A. Altamimi, Abdulrahman Alosaimy, Halah Munif Alharbi, Hawra Aljasim, Muneera Alhoshan, Amal Almazrua,
Hanan Alharbi, Abdulrahman Saeed Alshehri, Bayan M. Almuqhim, Maryam H. Algarny, Yahya A. Asiri, Abdullah I.
Alharbi, Saleh Zaidan Albalawi, Fawziah Mohammed Asiri, Sara Ali Alhifthi and Abdullah Alfaifi |
| 1296 |
A Corpus of Persuasion Techniques in Slavic Languages |
Jakub Piskorski, Dimitar Iliyanov Dimitrov, Marina Ernst, Jacek Haneczok, Michal Marcinczuk, Arkadiusz
Modzelewski and Roman Yangarber |
| 1298 |
The Romanian Corpus Annotated with Multiword Expressions. PARSEME-Ro Version 2.0 |
Verginica Barbu Mititelu, Mihaela Cristescu, Elena Irimia and Carmen Mîrzea Vasile |
| 1299 |
ToneSwiper: Facilitating Manual ToDI-annotation of Dutch Prosody |
Matthijs Westera and Ariëlle Reitsema |
| 1301 |
MyChat: A Text-based Dialogue Corpus Rich in Conversational Features |
Mai Hoang Dao, Catherine Lai and Peter Bell |
| 1303 |
STAR-IL: A Dataset for Style-Aware Machine Translation of Product Reviews in Indian Languages |
Ketaki Shetye, Dipti Misra Sharma and Parameswari Krishnamurthy |
| 1304 |
Automatic Segmentation of Classical Tibetan Texts into Autochthonous and Allochthonous Regions |
Guy Bilitski, Lev Shechter, Sonam Jamtsho, Nir Marciano, Nicola Bajetta, Rebecca Sunden, Omri Drori, Kai Golan
Hashiloni, Orr Zwebner, Asaf Shina, Orna Almogi, Dorji Wangchuk and Kfir Bar |
| 1306 |
Cultural and Knowledge Biases in LLMs through the Lens of Entity-Aware Machine Translation |
Lu Xu, Luca Moroni and Roberto Navigli |
| 1307 |
Multilingual Target-Stance Extraction |
Ethan Leigh Mines and Bonnie J. Dorr |
| 1309 |
Saudi ASWAT: A Large-Scale Corpus of Spontaneous Saudi Arabic Speech |
Abdullah I. Alharbi, Afrah A. Altamimi, Muneera Alhoshan, Amal Almazrua, Halah Munif Alharbi, Bayan M.
Almuqhim, Hawra Aljasim, Abdulrahman Alosaimy, Yahya A. Asiri and Abdullah Alfaifi |
| 1313 |
Reading Time in the Wild: An Assessment of Readability Predictors Based on Naturally-Observed Reading Times
|
Sijbren van Vaals, Rik van Noord and Malvina Nissim |
| 1315 |
Context Is (Almost) Everything: Llama-3 on Structured Output and AMR Parsing |
Maja Buljan, Stephan Oepen and Lilja Øvrelid |
| 1316 |
EPIC-EuroParl-UdS for Information-Theoretic Analysis of Translation and Interpreting |
Maria Kunilovskaya and Christina Pollklaesener |
| 1318 |
Semantic Parsing for Evaluating Large Language Models: Separating Linguistic Abilities with YARN |
Rémi de Vergnette and Maxime Amblard |
| 1319 |
Physical Commonsense Reasoning for Lower-Resourced Languages and Dialects: A Study on Basque |
Jaione Bengoetxea, Itziar Gonzalez-Dios and Rodrigo Agerri |
| 1323 |
Predicting Different States of Understanding in Explanatory Interactions Using Cognitive Load Related
Linguistic Cues |
Yu Wang, Olcay Türk, Angela Grimminger and Hendrik Buschmeier |
| 1324 |
Leveraging Semi-Supervised Learning for Multimodal Hate Speech Data Annotation and Detection |
Rathi Adarshi Rammohan, Zhao Ren, Dominik Puchała, Aleksandra Świderska, Dennis Küster and Tanja Schultz
|
| 1325 |
WISTERIA: Weak Implicit Signal-based Temporal Relation Extraction with Attention |
Duy Dao Do, Anaïs Halftermeyer and Thi Bich Hanh Dao |
| 1331 |
Comparing Reading Behavior across Reader Expertise and Text Complexity: Insights from the French
Eye-Tracking Corpus (FETA) |
Oksana Ivchenko and Natalia Grabar |
| 1332 |
Object Realisation in Spoken Guadeloupan French: Evaluating NLP Models for an Under-Resourced Variety |
Amalia Canes Nápoles and Sophie Repp |
| 1334 |
Generating High Quality Synthetic Data for Dutch Medical Conversations |
Cecilia Kuan, Aditya Kamlesh Parikh and Henk van den Heuvel |
| 1335 |
Conversational Assistants to Support Patients with Heart Failure: Comparing a Neurosymbolic Architecture
with GPT |
Anuja Tayal, Devika Salunke, Barbara Di Eugenio, Paula G. Allen-Meares, Eulalia P. Abril, Olga
Garcia-Bedoya, Carolyn A. Dickens and Andrew D. Boyd |
| 1336 |
Same Meaning, Different Scores: Lexical and Syntactic Sensitivity in LLM Evaluation |
Bogdan Kostić, Conor Fallon, Julian Risch and Alexander Loeser |
| 1340 |
SciCiteVal: A Multi-Domain Dataset for Scientific Citation Verification |
Qinyue Liu, Yongxin Zhou and Cyril Labbe |
| 1341 |
Automating FAIRness: A FAIRification Tool within the Language Resources Infrastructure |
Daniele Melaccio and Monica Monachini |
| 1344 |
DeepICD-R1: Medical Reasoning through Hierarchical Rewards and Unsupervised Distillation |
Tom Röhr, Thomas Maximilian Josef Steffek, Roman Teucher, Keno Bressem, Peter Troeger, Felix Alexander Gers
and Alexander Löser |
| 1345 |
University Speaking for Everyone: Assessing Changes in Italian Higher Education Statutes toward
Gender-Inclusive Language |
Sebastiano Vecellio Salto, Camilla Casula, Alessio Palmero Aprosio and Sara Tonelli |
| 1348 |
IMaSC: A Malayalam Speech Corpus for High-Quality Text-to-Speech Synthesis |
Deepa P. Gopinath, Thennal D K, Vrinda V. Nair, Swaraj K. S and Sachin G |
| 1350 |
Integrating Knowledge Graph with Large Language Models for Multi-hop Question Generation |
Yllias Chali and Al Hasib Mahamud |
| 1352 |
There Is No Spoon: Existential Presupposition in Large Language Models |
Marie-Léontine Wörgötter, Shikai Lai and Sebastian Schuster |
| 1357 |
Adja-French Parallel Corpus: A New Resource for Machine Translation of a West African Under-Resourced
Language |
Josue Frejus Godeme and Rolando Coto-Solano |
| 1358 |
I Came, I Saw, I Explained: Benchmarking LLMs on Figurative Meaning in Memes |
Shijia Zhou, Saif M. Mohammad, Barbara Plank and Diego Frassinelli |
| 1361 |
FAME: Fictional Actors for Multilingual Erasure |
Claudio Savelli, Moreno La Quatra, Alkis Koudounas and Flavio Giobergia |
| 1364 |
LLMs as Annotators: Evaluating Model–Human Alignment in Detecting Contentious Language in Historical Corpora |
Yahui Zhao, Clemencia Siro and Laura Hollink |
| 1365 |
SynBullying: A Multi-LLM Synthetic Conversational Dataset for Cyberbullying Detection |
Arefeh Kazemi, Hamza Qadeer, Joachim Wagner, Hossein Hosseini, Sri Balaaji Natarajan Kalaivendan and Brian
Davis |
| 1366 |
RuznamceNER: A Named Entity Recognition Dataset for Ottoman Turkish |
Esma Fatıma Bilgin Tasdemir, Dilara Zeynep Gürer and Saziye Betul Ozates |
| 1367 |
Figurative Language in Alzheimer's Discourse: Linguistic and Neural Alignment in Clinical Narratives |
Diana Kylymnyk, Vitória Hilgert Tomasel, Rodrigo Wilkens, Helena Caseli, Edward Watkins and Aline
Villavicencio |
| 1368 |
Goldfish: Monolingual Language Models for 350 Languages |
Tyler A. Chang, Catherine Arnett, Zhuowen Tu and Benjamin Bergen |
| 1369 |
The Potential for Misleading Results in Text Sanitisation with Standard Evaluation Metrics |
Dan Zhang and Mark Anderson |
| 1371 |
SommBench: Assessing Sommelier Expertise of Language Models |
William Brach, Tomas Bedej, Jacob Nielsen, Jacob Pichna, Juraj Bedej, Eemeli Saarensilta, Julie Dupouy,
Gianluca Barmina, Andrea Blasi Núñez, Peter Schneider-Kamp, Kristian Košťál, Michal Ries and Lukas Galke Poech |
| 1372 |
Prompting Instruction-tuned LLMs for Semantic Similarity Values |
Xander Akiko Snelder, Yunchong Huang and Jelke Bloem |
| 1373 |
GePaDeSE: A New Resource for Clause-Level Aspect in German Parliamentary Debates |
Julian Schlenker, Ines Rehbein, Lilly Brauner, Florian Ertz, Ines Reinig and Simone Paolo Ponzetto |
| 1374 |
AfriStereo: A Culturally Grounded Dataset for Evaluating Stereotypical Bias in Large Language Models |
Yann Le Beux, Oluchi Audu, Oche David Ankeli, Dhananjay Balakrishnan, Melissah Weya, Marie Daniella
Ralaiarinosy and Ignatius Ezeani |
| 1376 |
Common Voice for Pakistan: Developing an Open Speech Corpus for Low-Resource Pakistani Languages |
Meesum Alam and Francis Tyers |
| 1377 |
Systematic Multi-Aspect Evaluation of Time Series-Based Report Generation: The Case of Financial Analysis
from Stock Data |
Elizabeth Fons, Elena Kochkina, Rachneet Kaur, Zhen Zeng, Berowne Hlavaty, Charese Smiley, Svitlana
Vyetrenko and Manuela Veloso |
| 1378 |
MUNIChus: MUltilingual News Image Captioning Benchmark |
Claire Chen, Alistair Plum, Hansi Hettiarachchi, Diptesh Kanojia, Saroj Kumar Basnet, Marcos Zampieri and
Tharindu Ranasinghe |
| 1379 |
Missing Links: LLM-Augmentation of Event Triggers of State Changes in the OpenPI Dataset |
Kyeongmin Rim and James Pustejovsky |
| 1384 |
Reason2Decide: Rationale-Driven Multi-Task Learning |
H M Quamran Hasan, Housam Khalifa Bashier, Jiayi Dai, Mi-Young Kim and Randy Goebel |
| 1385 |
Identifying Contexts of Distress in College Students' Reddit Posts: A Comparative Study of Classical NLP and
Large Language Models |
Carine Graff and Nikhil Krishnaswamy |
| 1387 |
Speak in Context: Multilingual ASR with Speech–Context Alignment via Contrastive Learning |
Yuchen Zhang, Haralambos Mouratidis and Ravi Shekhar |
| 1391 |
Mind the Language Gap: Assessing LLM Safety in Italian |
Elena Marafatto and Roberto Navigli |
| 1392 |
FeedFetcher: A Resilient Web Feed Downloader for Corpus Construction |
Ondřej Herman, Jan Kraus and Vit Suchomel |
| 1396 |
Disentangling Approaches to Conversation Disentanglement: Fine-Tune or Learn from Scratch? |
Debaditya Pal, Anton Leuski, Ron Artstein, David Traum and Kallirroi Georgila |
| 1397 |
SynthLLM: An LLM-based Scalable Synthetic Data Generation Pipeline for Low-Resource Languages |
Solmaz Panahi, Vasudevan Nedumpozhimana and John Kelleher |
| 1399 |
Scripting History: A Diachronic Urdu Text and Image Corpus from the 18th to 19th Centuries |
Sana Shams, Sahar Rauf, Asad Mustafa, Muhammad Zeeshan Javed, Qurat-ul-Ain Akram, Sarmad Hussain and Miriam
Butt |
| 1400 |
Towards Dynamic Metaphor Identification: Evaluating GPT O-Series Models on Five Metaphoricity Cues in U.S.
Trade Corpora |
Berkay Bas, Jelke Bloem and Xiaojuan Tan |
| 1401 |
LocalGovPL: A Corpus of Speaker-Attributed Polish Local Government Transcripts |
Dariusz Czerski and Maciej Ogrodniczuk |
| 1404 |
Setting the Stage for Disfluency: Implications of Contextual Task Framing Effects for the Design of
Listening Tasks |
Ambika Kirkland and Jens Edlund |
| 1407 |
Comparing Approaches to Automatic Summarization in Less-Resourced Languages |
Chester Palen-Michel and Constantine Lignos |
| 1408 |
VUPMC: A New Political Metaphor Corpus in Mandarin Chinese |
Xiaojuan Tan |
| 1411 |
CzechDocs: A Multiway Parallel Dataset of Formatted Documents for Minority Languages in Czechia |
Josef Jon, Miroslav Hrabal and Ondřej Bojar |
| 1415 |
TiC-MuFormer: Time-Aware Caption-Integrated Multimodal Transformers for User-Level Mental Health Modeling |
Georgios Tsoumplekas, Yannis Spyridis and Vasileios Argyriou |
| 1417 |
Ragability Benchmark: A Dataset and Library to Test LLMs on Inter-context Conflicts |
Stephanie Gross, Johann Petrak and Brigitte Krenn |
| 1419 |
Amulwe Kimün: A Community-Grounded Demo, Resource, and ASR Baseline for Mapuzugun |
Cristian Eduardo Ahumada Oliva and Fatiha Sadat |
| 1420 |
Semantic Capacity in Language Learners and LLMs: A Case Study of Quantifier Scope |
Shaohua Fang, Yue Li and Yan Cong |
| 1421 |
Exploration of How Hate Is Framed on Social Media |
Rakshitha Rao Ailneni and Sanda Harabagiu |
| 1422 |
Bulgarian Massive Multitask Language Understanding Benchmark |
Svetla Peneva Koeva, Ivelina Stoyanova, Dimitar Georgiev, Svetlozara Leseva, Valentina Stefanova, Maria
Todorova, Tsvetana Ivanova Dimitrova, Hristina Kukova, Mihaela Moskova and Tinko Tinchev |
| 1423 |
Development of Serbian QA Datasets through Prompt-Based Generation and Human Validation |
Jovana Rađenović, Olivera Kitanović and Ranka Stanković |
| 1425 |
Not All Disneys Are the Same: Making Coreference Metonymy-Aware |
Bingyang Ye, Jingxuan Tu and James Pustejovsky |
| 1426 |
How Much Data Is Enough Data? A New Motion Capture Corpus for Probabilistic Sign Language Generation |
Anna Klezovich, Johanna Mesch, Gustav Eje Henter and Jonas Beskow |
| 1427 |
The Emergence of the Pragmatic Dimension in Instructed-LMs |
Davide Mazzaccara and Raffaella Bernardi |
| 1430 |
Lexicalized Constituency Parsing for Middle Dutch: Low-resource Training and Cross-Domain Generalization |
Yiming Liang and Fang Zhao |
| 1432 |
An Enhanced Pipeline for the Manzini-Savoia Corpus |
Achille Fusco, Greta Mazzaggio and Carlo Zoli |
| 1433 |
Linking Rationale to Decision on Internet Standards: A Retrieval-Based Approach Using Synthetic Data |
Jie Bian, Étienne Simon, Andrey Kutuzov, Egil Rønningstad and Michael Welzl |
| 1436 |
Decomposing Sign Language Movements: A Multi-Band Visualization Method for Articulatory Analysis |
Antonio F. G. Sevilla and José María Lahoz-Bengoechea |
| 1437 |
Rethinking Evaluation in Retrieval-Augmented Personalized Dialogue: A Cognitive and Linguistic Perspective |
Tianyi Zhang and David Traum |
| 1441 |
SciClaimEval: Cross-modal Claim Verification in Scientific Papers |
Xanh Ho, Yun-Ang Wu, Sunisth Kumar, Tian Cheng Xia, Florian Boudin, Andre Greiner-Petter and Akiko Aizawa |
| 1442 |
PsihoRo: Depression and Anxiety Romanian Text Corpus |
Alexandra Ciobotaru, Ana-Maria Bucur and Liviu P. Dinu |
| 1443 |
Improving Neural Argumentative Stance Classification in Controversial Topics with Emotion-Lexicon Features |
Mohammad Yeghaneh Abkenar, Weixing Wang, Manfred Stede, Mark A. Finlayson, Davide Picca and Panagiotis
Ioannidis |
| 1444 |
Are Social Biases in LLMs Consistent across Generative Tasks? A Case Study for Basque |
Muitze Zulaika, Xabier Saralegi, Julia Shershneva, Lia Gonzalez and Arkaitz Fullaondo |
| 1445 |
Evaluating Gender and Pronoun Bias in LLM Moral Judgments |
Gustavo Lucius Fernandes, Jeiverson Santos and Pedro O.S Vaz-de-Melo |
| 1446 |
LFQA-HP-1M: A Large-Scale Human Preference Dataset for Long-Form Question Answering |
Rafid Ishrak Jahan, Fahmid Shahriar Iqbal and Sagnik Ray Choudhury |
| 1449 |
PHEB: An European Portuguese High School-Level LLM Benchmark |
Diogo C. Tavares, Rafael Ferreira, Afonso Simplício, Gonçalo Martins, Ana Carolina Condez, Inês Calvo, Inês
Vieira, David Semedo and Joao Magalhaes |
| 1451 |
Aligned Parallel Corpus of the Vedic Saṁhitās for Machine Translation |
Yuzuki Tsukagoshi and Ikki Ohmukai |
| 1453 |
The Multilingual Euphemism Benchmark: Datasets and Baselines for Pragmatic Language Understanding |
Whitney Poh, Julia Sammartino, Jasper Andrew, Witold Kieras, Natalia Zawadzka-Paluektau, Iryna Dilai, Libby
Barak, Jing Peng and Anna Feldman |
| 1457 |
MekongPhon: A Large-Scale Parallel IPA Corpus for Lao and Khmer |
Ammon Shurtz, Christian Richardson and Stephen D. Richardson |
| 1459 |
MindSET: Advancing Mental Health Benchmarking through Large-Scale Social Media Data |
Saad Mankarious, Ayah Zirikly, Daniel Wiechmann, Elma Kerz, Yu Qiao and Edward Kempa |
| 1464 |
Distributed Partial Information Puzzles: Examining Common Ground Construction under Epistemic Asymmetry |
Yifan Zhu, Mariah Bradford, Kenneth Lai, Timothy Obiso, Videep Venkatesha, James Pustejovsky and Nikhil
Krishnaswamy |
| 1466 |
Persona-Conditioned Generation of Patient Self-Reports from EHRs |
Yuexin Wu, Jianming Wei and Vasile Rus |
| 1467 |
Towards the Morphological Annotation of North Markian (Low German) |
Christian Chiarcos |
| 1471 |
CorSpell: Introducing a Semiautomatic Tool for Spelling Normalization in Brazilian Portuguese |
Juliana Schoffen, Dennis Giovani Balreira, Elisa Marchioro Stumpf, Larissa Goulart, Tanara Zingano Kuhn,
Gabriel Ricci Pazzinato, Isadora Dahmer Hanauer, José Henrique de Souza Silva, Luiza Sarmento Divino and
Marine Matte |
| 1472 |
GlossMATE: Multi-Agent Translator Explanations for Glosses |
Changbing Yang, Patrick Littell, Gabriel Bernier-Colborne, Yanfei Lu and Mengzhe Geng |
| 1473 |
Task-Lens: Cross-Task Utility Based Speech Dataset Profiling for Low-Resource Indian Languages |
Swati Sharma, Divya V. Sharma and Anubha Gupta |
| 1474 |
Meta4XNLI-ptBR: Brazilian Portuguese Extension of Meta4XNLI Corpus |
Karina Mayumi Johansson, Fernanda Malheiros Assi, Isabella Leite Pereira da Silva, Rafael Vinícius Passador,
Isabela Cristina Rodrigues, Aline Paes and Helena Caseli |
| 1476 |
DEJIMA: A Novel Large-scale Japanese Dataset for Image Captioning and Visual Question Answering |
Toshiki Katsube, Fukuhara Taiga, Kenichiro Ando, Yusuke Mukuta, Kohei Uehara and Tatsuya Harada |
| 1477 |
RespondeoQA: A Benchmark for Bilingual Latin-English Question Answering |
Marisa Hudspeth, Patrick J. Burns and Brendan O'Connor |
| 1483 |
SocialStep: Fast Prediction of Social Determinants of Health |
Paul Landes, Adam Richard Cross and Jimeng Sun |
| 1485 |
Transformer-Enabled Diachronic Analysis of Vedic Sanskrit: Neural Methods for Quantifying Types of Language
Change |
Ananth A. Hariharan and David R. Mortensen |
| 1487 |
Localizing Events in Space: Comparing Humans and AI Models |
Derrick Eui Gyu Kim, Kenneth Lai and James Pustejovsky |
| 1488 |
FormosanMT: A Multilingual Parallel Corpus of the Formosan Language Family |
Hunter Scheppat, Joshua K. Hartshorne, Sema Koc, Éric le Ferrand and Emily Prud'hommeaux |
| 1492 |
ParliaBench: An Evaluation and Benchmarking Framework for LLM-Generated Parliamentary Speech |
Marios Koniaris, Argyro Tsipi and Panayiotis Tsanakas |
| 1494 |
LitTx: A New Treatment Relation Extraction Dataset |
Yuhang Jiang, Md Sultan Al Nahian, Li Hao Richie Xu, Rani Chikkanna and Ramakanth Kavuluru |
| 1495 |
More than "Oh": Grounding Observable Events with Grunts in Multimodal Dialogue |
Richard A. Brutti and James Pustejovsky |
| 1496 |
FIBER: Factual Inference Bias Evaluation Resource |
Evren Ayberk Munis, Deniz Yilmaz, Arianna Muti and Cagri Toraman |
| 1497 |
The Construction of a Mixe Variant Parallel Corpus |
Ivan Vladimir Meza Ruiz, Zacarias Delfino Marquez, Martha Elba Ramírez Andrés, Victoriano Santiago Cayetano,
Jonathan Santiago Antonio and Carlos Daniel Hernández Mena |
| 1501 |
Quantifying the Accuracy–Cost Impact of Design Decisions in Budget-Constrained Agentic LLM Search |
Kyle A. McCleary and James M. Ghawaly |
| 1503 |
Introducing MELI: The Mandarin-English Language Interview Corpus |
Suyuan Liu and Molly Babel |
| 1507 |
Evaluating Multilingual Transformer Models for Lemmatization in Nepali: A Low-Resource Case Study |
Sunil Regmi, Sundeep Dawadi and Bal Krishna Bal |
| 1512 |
Evaluating Multimodal Large Language Model Narrative Interpretation through the Lens of Appraisal Theory |
Jayant Teotia, Xiaowei Wang, Xulang Zhang, Rui Mao and Erik Cambria |
| 1513 |
SENSEI-ASG: A Challenging Dataset for Argument Summary Graph Parsing |
Jonathan Clayton, Marco Damonte and Robert Gaizauskas |
| 1514 |
Cross-Dataset Inconsistencies in Morphological Annotation: Evidence from Universal Dependencies |
Vlasta Ohlídalová |
| 1515 |
COME-ALPs: Coreference Annotation with MErging Heuristics Using ALignment-based Projection in Parallel
Corpora |
Gabriela Nicole Gonzalez Saez, Mariam Nakhle, Illia Kholosha, Rachel Atherly and Marco Dinarelli |
| 1518 |
EpiGator: LLM-based Tracker of Infectious Outbreaks |
Yiheng Wu, Trangcasanchai Sathianpong, Jue Hou, Lidia Pivovarova and Roman Yangarber |
| 1519 |
PhonemeDF: A Synthetic Speech Dataset for Audio Deepfake Detection and Naturalness Evaluation |
Vamshi Nallaguntla, Aishwarya R. Fursule, Shruti Kshirsagar and Anderson Raymundo Avila |
| 1520 |
Human-in-the-Loop Mass Transcription and Ground Truth Annotation for Challenging Historical Documents |
Norbert Fischer and Frank Puppe |
| 1521 |
Diacritic Restoration for Low-Resource Indigenous Languages: Case Study with Bribri and Cook Islands Māori |
Rolando Coto-Solano, Daisy Li, Manoela Teleginski Ferraz, Olivia Sasse, Cha Krupka, Sharid Loáiciga and
Sally Akevai Tenamu Nicholas |
| 1524 |
MedPT: A Massive Medical Question Answering Dataset for Brazilian-Portuguese Speakers |
Fernanda Bufon Farber, Iago Alves Brito, Julia Soares Dollis, Pedro Schindler Freire Brasil Ribeiro, Rafael
Teixeira Sousa and Arlindo R. Galvão Filho |
| 1526 |
MEUR: A Benchmark for Evaluating Vision-Language Models on Multimodal Event Understanding and Reasoning |
Zimu Wang, Yuqi Wang, Tong Chen, Changyu Zeng, Hongbin Na, Nijia Han, Fuyu Xing, Qi Chen, Qiufeng Wang, Anh
Nguyen, Shuihua Wang, Ling Chen, Jionglong Su, Haiyang Zhang and Wei Wang |
| 1527 |
Two Ojibwe Constraint Grammars: Morphological Disambiguation and Dependency Parsing |
Matthias Diederichsen and Christopher Hammerly |
| 1528 |
Breaking the Benchmark: Revealing LLM Bias via Minimal Contextual Augmentation |
Kaveh Eskandari Miandoab, Mahammed Kamruzzaman, Arshia Gharooni, Gene Louis Kim, Vasanth Sarathy and Ninareh
Mehrabi |
| 1529 |
High-Order Question Generation in a Multilingual Educational Context |
Suna Şeyma Uçar, Itziar Aldabe, Nora Aranberri and Orphee De Clercq |
| 1531 |
A Modern Online Learning Platform for ʻŌlelo Hawaiʻi Classrooms |
Christian Castro, Keneth Martin, Winston Wu and William H. Wilson |
| 1532 |
Comparing Traditional and LLM-based Approaches for Automated Scoring of Dutch Writing Products |
Joni Kruijsbergen |
| 1535 |
Referenceless Evaluation of Machine Translation Models by Ranking Performance in Romanian to English
Translate-train Settings |
Mihail Feraru, Alexandra Diaconu and Bogdan Dumitru Alexe |
| 1536 |
MzansiText and MzansiLM: An Open Corpus and Decoder-Only Language Model for South African Languages |
Anri M. Lombard, Temi Aina, Ethan Wolff, Elan Norvick, Sbonelo Gumede, Simbarashe Mawere, Francois Meyer and
Jan Buys |
| 1537 |
Very Large-Scale Multilingual Resources for LLM and MT Research. Mono- and Bi-lingual Data, Multilingual
Evaluation, and Pre-Trained Models |
Stephan Oepen, Nikolay Arefyev, Marta Bañón, Laurie V. Burchell, Mariia Fedorova, Ona de Gibert, Jan Hajič,
Andrey Kutuzov, Vladislav Mikhailov, Gema Ramírez-Sánchez, Joerg Tiedemann and Jaume Zaragoza |
| 1540 |
S-GRADES – Studying Generalization of Student Response Assessments in Diverse Evaluative Settings |
Tasfia Seuti and Sagnik Ray Choudhury |
| 1541 |
Orthographic Constraint Satisfaction and Human Difficulty Alignment in Large Language Models |
Bryan E. Tuck and Rakesh Verma |
| 1545 |
Categorical Emotions or Appraisals - Which Emotion Model Explains Argument Convincingness Better? |
Lynn Greschner, Meike Bauer, Sabine Weber and Roman Klinger |
| 1548 |
Improving Latvian Morphosyntactic Parsing through Continued Pretraining and Analyzer-Guided Decoding |
Arturs Znotins |
| 1549 |
Building Collaborative Speech Corpora for Low-Resource Languages: The Galician Dataset in Mozilla Common
Voice |
Adina Ioana Vladu, Elisa Fernández Rei and María Pérez Lago |
| 1551 |
Dynamic Model Switching to Mitigate Outdated Knowledge in Large Language Models |
Ramakrishna Pinninti, Sabyasachi Kamila, Adam Jatowt, Ayan Mazumder and Mohammed Hasanuzzaman |
| 1552 |
The GELATO Dataset for Legislative NER |
Matthew Flynn, Timothy Obiso and Sam Newman |
| 1553 |
Glossed Data in Northern Interior Salish |
Anna Stacey |
| 1554 |
Frame-Guided Synthetic Claim Generation for Automatic Fact-Checking Using High-Volume Tabular Data |
Jacob Devasier, Akshith Putta, Qing Wang, Alankrit Moses and Chengkai Li |
| 1557 |
Diagnosing Translated Benchmarks: An Automated Quality Assurance Study of the EU20 Benchmark Suite |
Klaudia Thellmann, Bernhard Stadler and Michael Faerber |
| 1560 |
Augmenting LLM Reasoning with Dynamic Notes Writing for Complex QA |
Rishabh Maheshwary, Masoud Hashemi, Khyati Mahajan, Shiva Krishna Reddy Malay, Sai Rajeswar Mudumba, Sathwik
Tejaswi Madhusudhan, Spandana Gella and Vikas Yadav |
| 1561 |
TryggLLM: A Benchmark for Evaluating LLM Safety in Norwegian |
Samia Touileb, Truls Pedersen and Isabell Stinessen Haugen |
| 1563 |
CommonMorph: Participatory Morphological Documentation Platform |
Aso Mahmudi, Sina Ahmadi, Kemal Maulana Kurniawan, Rico Sennrich, Eduard H. Hovy and Ekaterina Vylomova |
| 1564 |
A Bilingual Bimodal Benchmark for Arabic-English NLP across Grammatical Correction, Essay Scoring,
Morphological Tagging, and Speech Recognition |
Bashar Alhafni, Injy Hamed, Fadhl Eryani, David Palfreyman and Nizar Habash |
| 1565 |
The Amharic DBpedia Chapter: A Knowledge Graph for a Low-Resource Language |
Hizkiel Mitiku Alemayehu, Tilahun Abedissa Taffa, Meti Adane Bayissa, Andargachew Asfaw Zewge, Hamada
Zahera, Ricardo Usbeck and Axel-Cyrille Ngonga Ngomo |
| 1568 |
Evaluating the Adaptability of Large Language Models to Linguistic Variation |
Ziyan Xu, Alice Millour, Carlos-Emiliano Gonzalez-Gallardo and Jean-Yves Antoine |
| 1570 |
Reason-to-Learn (R2L): Multi-Agent Knowledge Distillation for Lightweight LLMs in Sentiment Analysis |
Le-Huy Tu, Quan Nguyen, Vincent Nguyen, Johanna Bjorklund and Xuan-Son Vu |
| 1571 |
Information Asymmetry across Language Varieties: A Case Study on Cantonese–Mandarin and Bavarian–German QA |
Siyao Peng, Renhao Pei, Verena Blaschke, Robert Litschko and Barbara Plank |
| 1572 |
Relation Extraction across Entire Books to Reconstruct Community Networks: The AffilKG Datasets |
Erica Cai, Sean Mcquade, Kevin Young and Brendan O'Connor |
| 1574 |
Developing a Guideline for the Labovian-Structural Analysis of Oral Narratives in Japanese |
Amane Watahiki, Tomoki Doi, Akari Kikuchi, Hiroshi Ohata, Yuki I. Nakata, Takuya Niikawa, Taiga Shinozaki
and Hitomi Yanaka |
| 1575 |
Targum - a Multilingual New Testament Translation Corpus |
Maciej Rapacz and Aleksander Smywiński-Pohl |
| 1576 |
"Decode the Law": Towards Legal Text Simplification with Large Language Models |
Mohammed Danish Rabbani, Subhadeep Roy, Sayantan Mitra and Tulika Saha |
| 1581 |
Faithful Medical Dialogue Generation Using Homo-Heterogeneous Exemplar-based In-Context Knowledge Grounding |
Priyanshu Priya, Hardik Goyal and Asif Ekbal |
| 1582 |
KOCOH: Korean Context-Dependent Hate Speech Dataset |
Eunah Park and Sanghoun Song |
| 1584 |
Fine-grained Narrative Classification in Biased News Articles |
Zeba Afroz, Harsh Vardhan, Pawan Bhakuni, Aanchal Punia, Rajdeep Kumar and Md. Shad Akhtar |
| 1586 |
Judging Instruction Responses in a Low-Resource Language: A Case Study on Basque |
David Ponce, Harritxu Gete, Thierry Etchegoyhen, Irune Zubiaga and Aitor Soroa |
| 1588 |
Datasets for Verb Alternations across Languages: BLM Templates and Data Augmentation Strategies |
Giuseppe Samo and Paola Merlo |
| 1590 |
Generation of Instruction and Preference Dataset for Improving Japanese Instruction Following in LLMs |
Kei Moriyama, Takashi Kodama and Kouta Nakayama |
| 1592 |
Mapping Liberty Metaphors across Cultures and Time |
Sidney Suen, Rui Mao, Kenneth Kwok and Erik Cambria |
| 1596 |
STRUDEL: Unrolling a Benchmark for Evaluating Vision-Language Models on Structured Diagram Understanding
across Domains |
Daniel Steinigen, Lucie Flek and Sebastian Houben |
| 1597 |
Proffiliadur: Welsh Language Text Profiling Toolkit |
Nicolás Gutiérrez-Rolón, Jonathan Davies, Tomos Williams, Dawn Knight and Fernando Alva-Manchego |
| 1598 |
Ithaca Revisited: Benchmarking a Domain-Specific Model for Epigraphy in the Age of LLMs |
Alessandro Locaputo, Andrea Brunello, Nicola Saccomanno, Paraskevi Platanou and Giuseppe Serra |
| 1600 |
From Print to Digital and beyond: The Retrodigitization of a Historical Dictionary of Italian as a Hybrid
Lexical Resource |
Marco Biffi, Sebastiana Cucurullo, Manuel Favaro, Elisa Guadagnini, Simonetta Montemagni and Eva Sassolini |
| 1601 |
Learning through News: Bridging the Gap between Algorithmic Recommendation and Human Curation |
Florian Debaene, Loic De Langhe, Orphee De Clercq and Veronique Hoste |
| 1602 |
The Sensorimotor Norms for the Chinese Classifiers |
Yimei Shao, Yu-Yin Hsu and Chu-Ren Huang |
| 1603 |
Who Benchmarks the Benchmarks? A Case Study of LLM Evaluation in Icelandic |
Finnur Ágúst Ingimundarson, Steinunn Rut Friðriksdóttir, Bjarki Ármannsson, Iris Nowenstein and Steinþór
Steingrímsson |
| 1604 |
German General Social Survey Personas: A Survey-Derived Persona Prompt Collection for Population-Aligned LLM
Studies |
Jens Rupprecht, Leon Froehling, Claudia Wagner and Markus Strohmaier |
| 1605 |
A Large and Balanced Multi-Domain Arabic Corpus Annotated for Morphology, Syntax, and Readability |
Khalid N. Elmadani, Adel Mahmoud Wizani, Hanada Taha Thomure and Nizar Habash |
| 1606 |
CANVAS: A Multimodal Dataset of Chinese Textbook Images for Bias and Representation Analysis |
Haotian Zhu, Kefan Yu and Min Li |
| 1608 |
The Spectrum of Sentiment: Optimistic, Pessimistic, and Neutral Voices in Online Depression Discourse |
Stefana Arina Tabusca, Ana-Maria Bucur and Liviu P. Dinu |
| 1609 |
MultiCoS: A Multilingual Dataset of Connective Semantics with Context–Sentence Compatibility |
Wataru Uegaki, Anne Mucha and Ciyang Qing |
| 1611 |
MaskedVerbalizer: Automatic Verbalizer Construction for Few-Shot Text Classification in Low-Resource
Right-to-Left Languages |
Faizad Ullah, Furqan Sikandar, Areeba Waqar, Faizan Ali, Muhammad Sohaib Ayub, Mubashar Mushtaq and Asim
Karim |
| 1618 |
How Much Data for Stable Formant Values? Pipeline for Convergence Detection Based on Read Speech |
Kayla Sward, Johan Sjons and Axel G. Ekstrom |
| 1620 |
Slovene Morphological and Word Formation Segmentation: A Novel Dataset and Evaluation |
Marko Pranjić, Boris Kern, Ines Voršič and Senja Pollak |
| 1622 |
Refusal Steering: Fine-grained Control over LLM Refusal Behaviour for Sensitive Topics |
Iker García-Ferrero, David Montero and Roman Orus |
| 1623 |
Cygnet: Refactoring the Open Multilingual Wordnet |
Rowan Hall Maudslay and Francis Bond |
| 1625 |
Towards Fair Speech Recognition: Mitigating Demographic Bias in End-to-End ASR Systems |
Maliha Jahan, Thomas Thebaud, Zsuzsanna Fagyal, Jesus Villalba, Mark Hasegawa-Johnson, Laureano Moro
Velazquez and Najim Dehak |
| 1629 |
RBR: RAG-Based Open-Domain Question Answering Using a Ranking Approach to Document Retrieval |
Priyatam Naravajhula and Vincent Ng |
| 1631 |
Evaluation of Failure Communication Strategies for Trust Repair in Human-AI Collaboration |
Stina Klein, Alexandru Wurm, Elisabeth Andre and Matthias Kraus |
| 1633 |
What Are LLMs Doing to Scientific Communication? Measuring Changes in Writing Practices and Reading
Experience |
Filip Miletić and Neele Falk |
| 1634 |
MUSCAT: MUltilingual, SCientific ConversATion Benchmark |
Supriti Sinhamahapatra, Thai-Binh Nguyen, Yiğit Oğuz, Enes Yavuz Ugan, Jan Niehues and Alexander Waibel |
| 1635 |
Every Word Presented in Context: Syntactic Coverage as Objective for Low-Resource Machine Translation with
LLMs |
Samuel Frontull and Thomas Ströhle |
| 1636 |
LIT-RAGBench: Benchmarking Generator Capabilities of Large Language Models in Retrieval-Augmented
Generation |
Koki Itai, Shunichi Hasegawa, Yuta Yamamoto, Gouki Minegishi and Masaki Otsuki |
| 1637 |
JSTS-Neg: Japanese Semantic Textual Similarity Dataset for Evaluating Negation Understanding Ability |
Reiko Yuasa, Yoshihide Kato and Shigeki Matsubara |
| 1638 |
Beyond MCQ: An Open-Ended Arabic Cultural QA Benchmark with Dialect Variants |
Hunzalah Hassan Bhatti and Firoj Alam |
| 1640 |
Sentence-Level Back-Transliteration of Romanized Indian Languages: Performance Analysis and Challenges |
Saurabh Kumar, Dhruvkumar Babubhai Kakadiya, Sanasam Ranbir Singh and Sukumar Nandi |
| 1642 |
Multi-Session Client-Centered Treatment Outcome Evaluation in Psychotherapy |
Hongbin Na, Tao Shen, Shumao Yu and Ling Chen |
| 1646 |
CLEVR-3D-DeRef |
Mary Lynn Martin, Martha Palmer and Maria Leonor Pacheco |
| 1647 |
GeneFRDebate: Generated French Debates from News Articles with Industrial-Expert Summaries |
Rim Abrougui, Guillaume Lechien, Elisabeth Savatier and Benoît Laurent |
| 1649 |
AmbiCoRefVis: A Tool for Visualizing Coreferential Ambiguity |
Patrick Paetzold, Lukas Beiske, Mark-Matthias Zymla, Massimo Poesio, Miriam Butt, Daniel Weiskopf and Oliver
Deussen |
| 1650 |
Multilingual KokoroChat: A Multi-LLM Ensemble Translation Method for Creating a Multilingual Counseling
Dialogue Dataset |
Ryoma Suzuki, Zhiyang Qi and Michimasa Inaba |
| 1651 |
Document-Level Text Simplification in Estonian Using Large Language Models |
Meeri-Ly Muru and Eduard Barbu |
| 1655 |
Fables-DTR: A Corpus of Fables Annotated for Discourse and Temporal Relations |
Purificação Moura Silvano, António Leal, Aleksandra Tomaszewska, Maciej Ogrodniczuk, Luís Filipe Cunha,
Evelin Amorim and Joana Gomes |
| 1661 |
Towards Reward Modeling for AI Tutors in Math Mistake Remediation |
Kseniia Petukhova and Ekaterina Kochmar |
| 1662 |
A Study on Building Efficient Zero-Shot Relation Extraction Models |
Hugo Thomas, Caio Corro, Pascale Sébillot and Guillaume Gravier |
| 1663 |
Resource-Lean Lexicon Induction for German Dialects |
Robert Litschko, Barbara Plank and Diego Frassinelli |
| 1667 |
NepTam: A Nepali-Tamang Parallel Corpus and Baseline Machine Translation Experiments |
Rupak Raj Ghimire, Bipesh Subedi, Balaram Prasain, Prakash Poudyal, Praveen Acharya, Nischal Karki, Rupak
Tiwari, Rishikesh Kumar Sharma, Jenny Poudel and Bal Krishna Bal |
| 1668 |
Cross-Corpus CEFR Classification through Artificial Learners Perplexities |
Bernardo Stearns, John P. McCrae and Thomas Gaillat |
| 1669 |
Vrittanta-EN: A Benchmark Dataset for Event Trigger Detection and Classification Advancing Event
Understanding in English Narrative Discourse |
Chaitanya Kirti, Ashish Anand and Prithwijit Guha |
| 1671 |
Creation of the Estonian Subjectivity Dataset: Assessing the Degree of Subjectivity on a Scale |
Karl Gustav Gailit, Kadri Muischnek and Kairit Sirts |
| 1672 |
Investigating Proactivity in Multimodal Task-Guidance Dialogues |
Sofia Brenna, Elisabetta Jezek, Matthias Kraus and Bernardo Magnini |
| 1673 |
A Benchmark Corpus for the Diagnostic Assessment of Content in L2 English Speech |
Kosuke Doi, Justin Vasselli and Taro Watanabe |
| 1677 |
Recovering Registers from Leveled Wordlists |
Yo Ehara |
| 1678 |
MUC-4 Revisited: Document-level Event Analysis beyond Span-based Arguments |
Helene Bøsei Olsen, Erik Velldal and Lilja Øvrelid |
| 1681 |
CEFR-Cymraeg: A Dataset and Baseline Models for Language Proficiency Assessment in Welsh |
Eeshan Waqar, Jonathan Davies, Dawn Knight and Fernando Alva-Manchego |
| 1683 |
Emotion Transcription in Conversation: A Benchmark for Capturing Subtle and Complex Emotional States through
Natural Language |
Yoshiki Tanaka, Ryuichi Uehara, Koji Inoue and Michimasa Inaba |
| 1685 |
Singlish to English Translation with Precision: A Dataset and Language Detection-Driven Masked Modeling for
Singlish to English Translation |
Sujit Kumar, Gerome Kusuma Ang, Stephanie Hilary Xinyi Ma, Andy Hau Yan Ho and Andy Khong |
| 1689 |
Dynamically Acquiring Text Content to Enable the Classification of Lesser-known Entities for Real-world
Tasks |
Fahmida Alam and Ellen Riloff |
| 1690 |
Insights from Romanized Manipuri Social Media Text: A Transliteration Corpus and Variation Analysis |
Maisang Kamei Salice, Sanasam Ranbir Singh and Priyankoo Sarmah |
| 1693 |
Investigating Reasoning with Hypotheses: The RIP2 Corpus |
Ella Schad, Clara Seyfried and Chris Reed |
| 1694 |
Speak, Point, Look: A Multimodal Benchmark for Context-Aware Grounding in 3D Dialogue |
Anna Deichler, Jim O'Regan, Fethiye Irmak Dogan, Anna Klezovich, Lubos Marcinek, Iolanda Leite and Jonas
Beskow |
| 1695 |
DeepQuestion: Systematic Generation of Real-World Challenges for Evaluating LLMs Performance |
Ali Khoramfar, Ali Ramezani, Mohammad Mahdi Mohajeri, Mohammad Javad Dousti, Majid Nili Ahmadabadi and
Heshaam Faili |
| 1698 |
Probing Discrete Speech Tokens of Spoken Language Models |
Sven Naber, Pranav Singh, Alberto Saponaro, Ioanna Karagianni, Julia Koch and Ngoc Thang Vu |
| 1702 |
MELD: Melding Diverse Multilingual and Multi-Domain Datasets for Named Entity Recognition Evaluation |
Kevin Glocker and Marco Kuhlmann |
| 1704 |
LLMs in Ottoman Turkish: From MLM to NER |
Enes Yılandiloğlu |
| 1706 |
AsmLegalMT: A Parallel Corpus, Benchmark and Analysis for English-Assamese Machine Translation of Legal
Judgments |
Telem Joyson Singh, Hemanta Baruah, Sanasam Ranbir Singh, Anindita Talukdar, Nasrin Shahnaz, Okram Jimmy
Singh, Priyankoo Sarmah, Pallav Kumar Dutta and Sukumar Nandi |
| 1708 |
DiscoRAG: A Discourse-Aware Agent for Query-Based Summarization of Long Documents |
Alexander Chernyavskiy, Lidiia Ostyakova and Dmitry Ilvovsky |
| 1709 |
Erase Persona, Forget Lore: Benchmarking Multimodal Copyright Unlearning in Large Vision Language Models |
June Hyoung Kwon, Jungmin Yun and Youngbin Kim |
| 1710 |
Is This Idea Novel? An Automated Benchmark for Judgment of Research Ideas |
Tim Schopf and Michael Faerber |
| 1711 |
Appeal, Align, Divide? Stance Detection for Group-Directed Messages in German Parliamentary Debates |
Ines Rehbein, Maris Leander Buttmann, Julian Schlenker and Simone Paolo Ponzetto |
| 1712 |
FinER-ABSA: A Benchmark for Implicit and Explicit Entity Recognition and Aspect-Based Sentiment Analysis in
Financial News |
Pachara Akkanwanich, Pavorn Thongyoo, Mahannop Thabua, Konlakorn Wongpatikaseree and Natthawut
Kertkeidkachorn |
| 1714 |
Questionnaire Meets LLM: A Benchmark and Empirical Study of Structural Skills for Understanding Questions
and Responses |
Duc-Hai Nguyen, Vijayakumar Nanjappan, Barry O'Sullivan and Hoang D. Nguyen |
| 1717 |
MUSIA: Multilingual Story Illustration Corpus for Cross-Cultural Alignment and Generation |
Krishna Tewari, Supriya Chanda, Nirmit Patil and Sukomal Pal |
| 1719 |
SloPal: A 60-Million-Word Slovak Parliamentary Corpus with Aligned Speech and Fine-Tuned ASR Models |
Erik Božík and Marek Suppa |
| 1720 |
Dialectal Filtering: Synthesizing Kurdish Corpora for Low-Resource Varieties by Utilizing "Noise" in Large
Textual Data |
Christian Schuler, Raman Ahmad, Ānrán Wáng, Daniil Gurgurov, Timo Baumann, Simon Ostermann and Josef van
Genabith |
| 1723 |
DREAM: A Multicultural Multimodal Dataset Linking Dialogues and Realistic Image Sequences |
Luis Fernando D'Haro and Juan Mallo |
| 1724 |
MUDiC: A Dataset for Multi-User Dialogue and Collaboration in Chatbot Interaction |
Nicolas Wagner, Cristina Luna Jimenez, Elisabeth Andre, Wolfgang Minker and Stefan Ultes |
| 1725 |
EthiQuest: LLM-Powered Ethical Questionnaire Generation for Research Review |
Ishank Kapania, Radhika Mamidi and Rahul Mishra |
| 1726 |
VG-CoT: Towards Trustworthy Visual Reasoning via Grounded Chain-of-Thought |
Byeonggeuk Lim, Kyeonghyun Kim, Jungmin Yun and Youngbin Kim |
| 1727 |
Coordinate Structure Extraction for Patent Claims Using Multilingual LLMs |
Tsukusa Ishimaru, Takehito Utsuro and Masaaki Nagata |
| 1729 |
VectorEdits: A Dataset and Benchmark for Instruction-Based Editing of Vector Graphics |
Josef Kuchar, Marek Kadlcik, Michal Spiegel and Michal Stefanik |
| 1730 |
Augmented Keyphrase Extraction from Slovak Scientific Documents |
Dávid Števaňák and Marek Suppa |
| 1731 |
Adverbs Revisited: Enhancing WordNet Coverage of Adverbs with a Supersense Taxonomy |
Jooyoung Lee, Jader Martins Camboim de Sá and Cedric Pruski |
| 1732 |
Scoring the Translation: On Target Automatic Keyword-Based Evaluation of Machine Translation in the Sports
Domain |
Steinthor Steingrimsson and Einar Sigurdsson |
| 1734 |
PRiSM: Partial Ranking via Inter-layer Semantic Measurement for Efficient Fine-tuning of Language Models |
Aldrin Kabya Biswas, MD Fahim, Md. Ashraful Amin, Amin Ahsan Ali and Akm Mahbubur Rahman |
| 1735 |
A Corpus of Misunderstood Irony on Turkish Social Media |
Çağrı Çöltekin and Güliz Güneş |
| 1738 |
Multimodal Task Interference: A Benchmark and Analysis of History-Target Mismatch in Multimodal LLMs |
Masayuki Kawarada, Tatsuya Ishigaki and Hiroya Takamura |
| 1741 |
The DELPH-IN Grammar Codex: A Curated Repository of Grammars and Treebanks |
Francis Bond and Dan Flickinger |
| 1743 |
Morphemes without Borders: Evaluating Root–Pattern Morphology in Arabic Tokenizers and LLMs |
Yara Yousif Alakeel, Chatrine Qwaider, Hanan Aldarmaki and Sawsan Alqahtani |
| 1744 |
When Consistency Becomes Bias: Interviewer Effects in Semi-Structured Clinical Interviews |
Hasindri Sankalpana Watawana, Sergio Gastón Burdisso, Diego Aaron Moreno-Galvan, Fernando Sanchez-Vega,
Adrian Pastor Lopez Monroy, Petr Motlicek and Esau Villatoro-Tello |
| 1748 |
StoryCCDial: Collecting and Analyzing Human–Human Co-Creation Dialogues for Personalized Creative Support |
Natsumi Ezure and Michimasa Inaba |
| 1750 |
Few-shot Prompting or Supervised Tuning? A Comparative Study of LLMs for Linguistically Distant Language
Pairs in BDI |
Deepen Naorem, Sanasam Ranbir Singh, Telem Joyson Singh and Priyankoo Sarmah |
| 1752 |
HotelCheckSpan: A Benchmark Dataset for LLM Faithfulness |
Patricia Schmidtova, Ondrej Dusek and Saad Mahamood |
| 1753 |
Anonymized: A FAIR Galician TTS Corpus for Neural Speech Synthesis |
Adina Ioana Vladu, Antonio Moscoso Sánchez, Carmen Magariños, María Perez Lago and Elisa Fernández Rei |
| 1756 |
Grounded Misunderstandings in Asymmetric Dialogue: A Perspectivist Annotation Scheme for MapTask |
Nan Li, Albert Gatt and Massimo Poesio |
| 1758 |
Beyond Catalogue Counts: Quantifying Visibility Bias in Low-Resource Multilingual NLP |
Zhiyin Tan and Changxu Duan |
| 1761 |
Assessing the Effectiveness of LLMs in Delivering Cognitive Behavioral Therapy |
Navdeep Singh Bedi, Ana-Maria Bucur, Noriko Kando and Fabio Crestani |
| 1762 |
Automatic Speech Recognition for Documenting Endangered Languages: Case Study of Ikema Miyakoan |
Chihiro Taguchi, Yukinori Takubo and David Chiang |
| 1763 |
When Structure Matters: Cross-Lingual Hyperbolic Embeddings for Chinese and English Wordnets |
Mao-Chang Ku, Da-Chen Lian, Pin-Er Chen, Po-Ya Angela Wang, Wei-Ling Chen and Shu-Kai Hsieh |
| 1765 |
Adaptive Method for Self-Supervised Learning Models on Automatic Dialect Speech Recognition Based on Shared
Knowledge of Japanese Dialects and Standard Japanese |
Naoru Asakawa, Naoki Takahashi, Atsuhiko Kai and Seiichi Nakagawa |
| 1766 |
Can Multimodal LLMs Generate Pedagogically Relevant Questions? |
Thomas Gerald, Paul Lerner, Anne Vilnat, Sahar Ghannay and Julie Lascar |
| 1767 |
The Riddle of Reflection: Evaluating Reasoning and Self-Awareness in Multilingual LLMs Using Indian Riddles |
Abhinav P M, Ojasva Saxena, Oswald C and Parameswari Krishnamurthy |
| 1768 |
Automatic Inter-document Multi-hop QA Generation |
Seungmin Lee, Dongha Kim, Yuni Jeon, Junyoung Koh and Min Song |
| 1772 |
MASRAD: Arabic Terminology Management Corpora with Semi-Automatic Construction |
Mahdi Nasser, Laura Sayah and Fadi Zaraket |
| 1774 |
HOTATE: A Japanese Dialogue Corpus Annotated with Responses of Private Thoughts and Public Statements |
Yuko Toda, Daisuke Maekawa, Kota Manabe, Eito Yoneyama, Kanade Nonomura, Yuki Fujiwara and Tomoyuki Kajiwara |
| 1776 |
Towards Improving Multimodal Machine Translation with LLMs: A Focus on Indic Languages |
Amulya Ratna Dash, Chirag Wadhwa and Yashvardhan Sharma |
| 1778 |
Pragmatic Modelling in Language Learning: Caregiver Question-Answer Feedback in Child-Directed Dialogue |
Maryam Bala, Johannes Heim, Elspeth Edelstein and Arabella Sinclair |
| 1779 |
CRiT-QA: A Dataset for Evaluating LLMs on Counterfactual and Distractor-based Multi-hop Reasoning |
June Hyoung Kwon and Youngbin Kim |
| 1780 |
A Japanese Dataset for Aspect-based Sentiment Polarity and Emotion Intensity Estimation |
Kentaro Hanafusa, Kota Manabe, Yuki Meda, Daisuke Maekawa, Tomoyuki Kajiwara, Hideaki Hayashi, Yuta
Nakashima and Hajime Nagahara |
| 1782 |
RILEC: Detection and Generation of L1 Russian Interference Errors in English Learner Texts |
Darya Kharlamova and Irina Proskurina |
| 1783 |
LegitimNarrate: A Dataset for Analyzing Legitimation Mechanisms in Crowdfunding Narratives |
Asmaa Lagrid, Sebastien Fournier, Benedicte Aldebert, Ali Ghods, Gael Leboeuf and Daisy Bertrand |
| 1785 |
Universal NER v2: Towards a Massively Multilingual Named Entity Recognition Benchmark |
Terra Blevins, Stephen Mayhew, Marek Suppa, Hila Gonen, Shachar Mirkin, Vasile Pais, Kaja Dobrovoljc, Voula
Giouli, Jun Kevin, Enes Yılandiloğlu, Eugene Jang, Eungseo Kim, Jeongyeon Seo, Xenophon Gialis and Yuval
Pinter |
| 1787 |
A Human-in/on-the-Loop Framework for Accessible Text Generation |
Lourdes Moreno and Paloma Martínez |
| 1790 |
DATASHI: A Parallel English–Tashlhiyt Corpus for Orthography Normalization and Low-Resource Language
Processing |
Nasser-Eddine Monir and Zakaria Baou |
| 1794 |
TARAZ: Persian Short-Answer Question Benchmark for Cultural Evaluation of Language Models |
Reihaneh Iranmanesh, Saeedeh Davoudi, Pasha Abrishamchian, Ophir Frieder and Nazli Goharian |