Keynote and Invited Speakers

Keynote 1

Nancy F. Chen

The Long Arc of Language Resources: From Annotation to Alignment to Grounding

Language resources are the backbone of AI: they train models, structure linguistic analysis, and benchmark technological progress. Their evolution mirrors—and actively shapes—the trajectory of computational linguistics, speech technology, natural language processing, and artificial intelligence.
This talk traces the long arc of language resources across successive eras—from curated linguistic annotations to large-scale datasets enabling statistical learning, to representation learning and multimodal pretraining, and to alignment, where data shapes not only what models learn but also how they behave in society. Building on this arc, we argue that the next phase is grounding: anchoring language technologies to perception, interaction, cultural and social contexts, and domain knowledge.
Across these eras, language resources have served both as the fuel for modeling and as reflections of the scientific priorities and institutional forces of their time—from early academic efforts to coordinated infrastructures such as DARPA, Linguistic Data Consortium, European Language Resources Association, and O-COCOSDA, and more recently to industry-driven AI development. As machine learning scales, boundaries across languages, modalities, and cultures are increasingly blurred, bringing speech, text, and multimodal interaction into a shared modeling space—while also surfacing deeper challenges in evaluation, representation, and responsible deployment.
We illustrate this evolution through examples spanning multicultural alignment, grounding, and real-world deployment. These include MERaLiON, the first multimodal large language model for Southeast Asia; SingaKids AI Tutor, a multilingual agent supporting children learning Malay, Mandarin, and Tamil; and Siu Dai, a telebot that assists chronic disease patients in lifestyle management.
Ultimately, language resources do more than fuel models—they ground AI in the realities of human perception, interaction, and culture.

Bio

Dr. Nancy F. Chen is an ISCA Fellow (2025), AAIA Fellow (2025), and inaugural A*STAR Fellow (2023). She serves as Multimodal Generative AI Group Leader, Deputy Head (Research) for the Aural and Language Intelligence Department, AI for Education Program Head at I2R (Institute for Infocomm Research), and Principal Investigator at CFAR (Centre for Frontier AI Research) at A*STAR. Her research advances multimodal, multilingual large language models for agentic reasoning, inclusive and trustworthy AI, efficient modeling, AI-augmented learning, and AI governance. AI innovations from her lab have led to commercial spin-offs and been deployed at Singapore’s Ministry of Education. Dr. Chen has won 10 Best Paper Awards and 10 Professional Awards; delivered over 40 international keynotes and distinguished lectures; participated in 10 expert panels; and co-organized 20 international conferences, including serving as Program Chair for NeurIPS 2025 and ICLR 2023. Her service roles include the Advisory Board for IJCAI-ECAI (2026), the Board of Governors for APSIPA (2024-2026), IEEE SPS Distinguished Lecturer (2023), and Board Member of ISCA (2021-2024). She won the 2025 Asia Women Tech Award and was honored as one of the Singapore 100 Women in Tech (2021). Her work has received broad media coverage, featured in over seven news and media outlets across English, Chinese, and Tamil. Dr. Chen has long advised government and industry on AI and emerging technologies, beginning at MIT Lincoln Laboratory during her PhD at MIT and Harvard.

Keynote 2

Dan Jurafsky

The Social Failures of Language Models as Conversational Partners

Language models are increasingly used in conversation for information, advice, and emotional support. In this talk I’ll summarize studies in our lab showing that models fail in systematic ways as social interlocutors. We find that language models are socially sycophantic, linguistically overconfident, overly anthropomorphic, and epistemically self-centered. We then show that these flaws have real consequences for users: people interacting with models experience overreliance, distorted judgment, and reduced personal responsibility. I’ll discuss datasets and metrics, explore mitigations, and call for design, evaluation, and accountability mechanisms to protect user well-being.

Bio

Dan Jurafsky is Professor of Linguistics, Professor of Computer Science, and Reynolds Professor in Humanities at Stanford University. He is an award-winning teacher, a MacArthur Fellow, the recipient of the Richard C. Atkinson Prize in Psychological and Cognitive Sciences from the National Academy of Sciences, a member of the American Academy of Arts and Sciences, and a Fellow of the Association for Computational Linguistics, the American Association for the Advancement of Science, and the Linguistic Society of America. Together with his students and other colleagues, he studies and teaches about natural language processing and large language models and their applications to the cognitive, linguistic, and social sciences and to social good. His books include the widely used co-authored online textbook “Speech and Language Processing” and the 2014 international bestseller and James Beard Award nominee, “The Language of Food”.

Local Invited Speaker

Nicolau Dols

Catalan in Majorca: main features, challenges, and research tools

Catalan has been the language of the Balearics since the 13th century. After 1715, Catalan was gradually replaced by Spanish in official usage, while remaining the main everyday language in a mostly monolingual society. This situation was reversed at the end of the 20th century, when Catalan regained official status (together with Spanish), although by then monolingual Catalan speakers no longer existed. I will give a general overview of Catalan, focusing on the challenges it faces to maintain its vitality and relevance. In addition, issues related to digital resources for research on Catalan, both written and oral, will be presented, with special attention to corpora, dictionaries, and grammars.

Bio

Nicolau Dols (Palma, Majorca, 1967) is a full professor of Catalan at the Universitat de les Illes Balears (Majorca), and a member of the Institut d’Estudis Catalans, whose Philological Section (the official academy for the Catalan language) he currently chairs. His research interests lie mainly in phonology and, more generally, in grammar. He co-authored (with Max W. Wheeler and Alan Yates) Catalan. A Comprehensive Grammar (Routledge, 1999) and (with Richard M. Mansell) Catalan. An Essential Grammar (Routledge, 2017). His research has also addressed issues in sociolinguistics, language standardisation, translation, and constructed languages. His latest work is Language Recoding and Transcoding (co-edited with Jordi M. Antolí Martínez; Peter Lang, 2025), in which his own chapter focuses on a bidirectional model of phonology. He leads the team responsible for the oral Catalan language corpus at the Institut d’Estudis Catalans.