T1
Scaling LLM Training: Efficient Pre-training & Fine-tuning on AI Accelerators
Youngsuk Park, Tao Yu, Leonard Lausen, Rajarshi Saha, Jiaji Huang, Saleh Soltan, Yida Wang, Mingyi Hong, George Karypis
Schedule: August 16th AM – Full morning
The rise of powerful foundation models, particularly large language models (LLMs) built on Transformer architectures, has ushered in a new era of Generative AI, transforming various industries. These models have enabled a wide range of applications, including question answering, customer support, image and video generation, and code completion. However, modern LLMs consist of billions of parameters trained on trillions of tokens, making their development challenging in resource-constrained environments.
This tutorial provides a comprehensive exploration of deep learning training techniques optimized for AI accelerators, enabling faster, memory-efficient, yet robust training of models with billions of parameters. We begin with an overview of Transformer architectures, deep learning optimization strategies, and system- and hardware-level techniques. We then discuss system optimization techniques such as fast attention computation and fault-tolerant training at scale. Leveraging modern deep learning frameworks, we illustrate the principles of scaling laws that enable the training of LLMs with hundreds of billions of parameters. Next, we delve into low-precision training methods (e.g., FP8 and FP4), highlighting techniques such as numerical error handling through scaling and stochastic rounding. Finally, we examine fine-tuning approaches, such as low-rank adaptation combined with sparsity and quantization, which enable efficient model updates by modifying only a small subset of parameters.
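The stochastic rounding mentioned above can be illustrated in a few lines. The following is a minimal NumPy sketch of the core idea (unbiased rounding to a coarse grid), not the tutorial's actual FP8/FP4 recipe; the function name and step-size parameterization are illustrative.

```python
import numpy as np

def stochastic_round(x, step, rng=None):
    """Round each value to a multiple of `step`, rounding up with
    probability equal to the fractional remainder (unbiased in expectation)."""
    rng = np.random.default_rng(0) if rng is None else rng
    scaled = np.asarray(x, dtype=np.float64) / step
    floor = np.floor(scaled)
    frac = scaled - floor                      # probability of rounding up
    rounded = floor + (rng.random(scaled.shape) < frac)
    return rounded * step

# Round-to-nearest maps 0.3 to 0.0 at step 1.0, silently losing small
# updates; stochastic rounding preserves them in expectation.
vals = np.full(100_000, 0.3)
print(stochastic_round(vals, 1.0).mean())  # close to 0.3
```

This unbiasedness is why stochastic rounding helps low-precision training: small gradient updates are not systematically rounded away.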
T3
Evaluating LLM-based Agents: Foundations, Best Practices and Open Challenges
Roy Bar-Haim, Arman Cohan, Lilach Eden, Michal Shmueli-Scheuer, Asaf Yehudai
https://github.com/Asaf-Yehudai/LLM-Agent-Evaluation-Survey/blob/main/Tutorials/IJCAI-tutorial-2025.md
Schedule: August 16th PM – Full afternoon
The rapid advancement of Large Language Model (LLM)-based agents has sparked a growing interest in their evaluation, bringing forth both challenges and opportunities. This tutorial provides a comprehensive introduction to evaluating LLM-based agents, catering to participants from diverse backgrounds with little prior knowledge of agents, LLMs, metrics, or benchmarks.
We will establish foundational concepts and explore key benchmarks that measure critical agentic capabilities, including planning, tool use, self-reflection, and memory. We will examine evaluation strategies tailored to various agent types, ranging from web-based and software-engineering agents to conversational and scientific applications. We will also cover benchmarks and leaderboards that evaluate generalist agents over diverse skill sets. Additionally, we will review prominent developer frameworks for agent evaluation. Finally, we will present emerging trends in the field, identify current limitations, and propose directions for future research.
T6
Beyond Text: Advanced Retrieval Augmented Generation for Complex and Multimodal Data
Liang Zhao, Chao Huang
https://cs.emory.edu/~lzhao41/projects/ijcai2025-rag
Schedule: August 17th AM – After coffee break
Retrieval-Augmented Generation (RAG) is a cutting-edge framework that combines retrieval-based methods with generative models to enhance the accuracy and relevance of responses by retrieving relevant information from a knowledge base before generating answers. Its significance lies in its ability to handle complex, knowledge-intensive tasks like question answering, document summarization, and conversational AI, making it a powerful tool for applications in healthcare, finance, education, and more. As RAG rapidly evolves, it is being applied to increasingly diverse domains, requiring it to handle broader types of data, including text, images, tables, graphs, and time-series data. However, this expansion introduces challenges such as cross-modal retrieval, unified representation learning, data fusion, scalability, noise handling, and evaluation. Addressing these challenges is urgent to ensure RAG’s effectiveness in real-world applications, where data is often heterogeneous, dynamic, and imperfect, and to unlock its full potential across a wide range of industries and use cases.
This tutorial will cover a broad range of topics in the recent progress of retrieval-augmented generation, reviewing the fundamental concepts and algorithms of RAG, new research frontiers and technical advances in RAG for complex data, and corresponding applications and evaluations. In addition, rich tutorial materials will be introduced to help the audience gain a systematic understanding beyond our recently published survey paper and open-source repositories of state-of-the-art RAG algorithms.
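As background, the retrieve-then-generate loop at the heart of RAG can be sketched with a toy bag-of-words retriever. This is an illustrative sketch, not code from the tutorial: the function names are hypothetical, production systems use dense embeddings rather than word overlap, and the assembled prompt would be passed to an LLM.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    num = sum(a[t] * b.get(t, 0) for t in a)
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def retrieve(query, docs, k=1):
    """Return the k documents most similar to the query."""
    q = Counter(query.lower().split())
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(d.lower().split())), reverse=True)
    return ranked[:k]

docs = [
    "RAG retrieves relevant passages before generating an answer.",
    "Time-series data records measurements over time.",
]
context = retrieve("how does RAG generate answers", docs, k=1)
prompt = f"Context: {context[0]}\nQuestion: how does RAG generate answers?"
# `prompt` would now be sent to a generative model for a grounded answer.
```

The cross-modal challenges the tutorial addresses arise exactly where this picture breaks down: when "documents" are images, tables, graphs, or time series, both the similarity function and the unified representation become nontrivial.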
T7
Neuroevolution of Intelligent Agents
Risto Miikkulainen
https://www.cs.utexas.edu/~risto/talks/ijcai25-tutorial/
Schedule: August 17th PM – Before coffee break
Neuroevolution, or optimization of neural networks through population-based search, allows constructing intelligent agents in domains where gradients are not available and significant exploration is needed to find good solutions. This tutorial introduces participants to the basics of neuroevolution, reviews example application areas, and provides hands-on experience through a colab exercise.
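As a taste of the subject matter, a minimal population-based search loop, on a hypothetical toy fitness function, looks like this. It is a sketch of the general idea, not the tutorial's colab exercise: only scalar fitness scores are used, never gradients.

```python
import numpy as np

def fitness(w):
    """Toy black-box task: only a scalar score per candidate, no gradients.
    (In neuroevolution, w would encode a neural network's weights.)"""
    target = np.array([1.0, -2.0, 0.5])
    return -np.sum((w - target) ** 2)

rng = np.random.default_rng(0)
pop = rng.normal(size=(20, 3))                      # initial population
for gen in range(100):
    scores = np.array([fitness(w) for w in pop])
    elite = pop[np.argsort(scores)[-5:]]            # keep the top 5 unchanged
    children = elite[rng.integers(0, 5, 15)] + 0.1 * rng.normal(size=(15, 3))
    pop = np.vstack([elite, children])              # elites + mutated offspring

best = pop[np.argmax([fitness(w) for w in pop])]    # converges toward target
```

Real neuroevolution systems add crossover, speciation, and indirect encodings on top of this selection-and-mutation skeleton, but the core loop is the same.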
T8
AI Meets Algebra: Foundations and Frontiers
Yuka Hashimoto, Eren Mehmet Kiral, Yivan Zhang, Chao Li
https://algebra4ai.github.io/
Schedule: August 16th AM – Full morning
This tutorial explores how advanced algebraic structures—such as C*-algebras, Lie groups, category theory, and tensor networks—can enhance machine learning by improving model efficiency, interpretability, and generalization. Moving beyond standard linear algebra, we present operator-algebraic tools for analyzing structured data, symmetry-based methods for learning distributions on manifolds, categorical frameworks for representation learning and reinforcement learning, and tensor networks for efficient computation. Designed for AI researchers and practitioners with a solid foundation in linear algebra and ML, the tutorial combines theoretical insights with practical applications, covering recent developments across kernel methods, neural networks, probabilistic modeling, and quantum AI.
T10
Principles of Self-supervised Learning in the Foundation Model Era
Yisen Wang, Yifei Wang
https://sites.google.com/view/ijcai25-ssl
Schedule: August 16th PM – Full afternoon
This tutorial aims to bridge the gap by providing a comprehensive overview of self-supervised learning (SSL) principles and methodologies utilized in foundation models. It covers 1) representative SSL methodologies in foundation models, 2) the theoretical principles and frameworks for analyzing SSL methods, and 3) advanced SSL topics and phenomena, such as equivariant SSL, in-context learning, scaling laws, and feature interpretability. The tutorial concludes with a panel discussion featuring prominent researchers, addressing future directions and theoretical underpinnings of SSL in foundation models. This systematic tutorial aims to equip attendees with a solid understanding of modern SSL techniques and their foundational principles, fostering further advancements in the field.
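To ground the contrastive family of SSL methods the tutorial covers, here is a minimal NumPy sketch of an InfoNCE-style loss: two "views" of the same examples should match row-for-row and repel all other rows. It is illustrative only and omits the encoders, augmentations, and large batches of real SSL pipelines.

```python
import numpy as np

def info_nce(z1, z2, tau=0.1):
    """Contrastive loss: row i of z1 is the positive for row i of z2;
    all other rows act as negatives."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / tau                           # pairwise similarities
    logits -= logits.max(axis=1, keepdims=True)        # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))                # positives on the diagonal

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
aligned = info_nce(z, z + 0.01 * rng.normal(size=(8, 16)))   # matched views
shuffled = info_nce(z, rng.permutation(z))                   # mismatched views
# Matched views yield a much lower loss than mismatched ones.
```

Much of the theory surveyed in the tutorial asks what representations minimizing such objectives provably capture.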
T12
Advances in Time-Series Anomaly Detection
Qinghua Liu, Paul Boniol, Themis Palpanas, John Paparrizos
Schedule: August 17th AM – Full morning
Recent advances in data collection technology, accompanied by the ever-rising volume and velocity of streaming data, underscore the vital need for time series analytics. In this regard, time-series anomaly detection has been an important activity, entailing various applications in fields such as cyber security, financial markets, law enforcement, and health care. While traditional literature on anomaly detection is centered on statistical measures, the increasing number of machine learning algorithms in recent years calls for a structured, general characterization of the research methods for time-series anomaly detection.
In this tutorial, we present a process-centric taxonomy for time-series anomaly detection methods, systematically categorizing traditional statistical approaches and contemporary machine learning techniques. Beyond this taxonomy, we conduct a meta-analysis of the existing literature to identify broad research trends. Given the absence of a one-size-fits-all anomaly detector, we also introduce emerging trends for time-series anomaly detection. Furthermore, we review commonly used evaluation measures and benchmarks, followed by an analysis of benchmark results to provide insights into the impact of different design choices on model performance. Through these contributions, we aim to provide a holistic perspective on time-series anomaly detection and highlight promising avenues for future investigation.
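As one example of the classic statistical methods such a taxonomy covers, a sliding-window z-score detector flags points that deviate sharply from their recent history. The sketch below (with synthetic data, and hypothetical names) is a common baseline, not a method proposed by the presenters.

```python
import numpy as np

def zscore_anomalies(x, window=20, threshold=3.0):
    """Flag points more than `threshold` standard deviations from the
    mean of the preceding `window` observations."""
    x = np.asarray(x, dtype=float)
    flags = np.zeros(len(x), dtype=bool)
    for i in range(window, len(x)):
        hist = x[i - window:i]
        mu, sigma = hist.mean(), hist.std()
        if sigma > 0 and abs(x[i] - mu) > threshold * sigma:
            flags[i] = True
    return flags

rng = np.random.default_rng(0)
series = rng.normal(0, 1, 200)
series[120] += 10.0                 # inject a point anomaly
print(np.flatnonzero(zscore_anomalies(series)))  # includes index 120
```

Note the baseline's limits, which motivate the machine learning methods in the taxonomy: it detects isolated spikes but misses contextual and subsequence anomalies, and the flagged spike contaminates subsequent windows.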
T14
Federated Compositional and Bilevel Optimization
Hongchang Gao, Xinwen Zhang
https://hcgao.github.io/tutorial_ijcai2025.html
Schedule: August 17th PM – Before coffee break
Federated learning has attracted significant attention in recent years, resulting in the development of numerous methods. However, most of these methods focus solely on traditional minimization problems and fail to address new learning paradigms in machine learning. Therefore, this tutorial focuses on the learning paradigms that can be formulated as stochastic compositional optimization (SCO) and stochastic bilevel optimization (SBO) problems, as they cover a wide variety of machine learning models beyond traditional minimization, such as model-agnostic meta-learning, imbalanced data classification, contrastive self-supervised learning, graph neural networks, and neural architecture search. The compositional and bilevel structures bring unique computation and communication challenges to federated learning. To address these challenges, a series of federated compositional optimization and federated bilevel optimization methods have been developed in the past few years, but these advances have not been widely disseminated. Thus, this tutorial aims to introduce the unique challenges, recent advances, and practical applications of federated SCO and SBO. The audience will gain a deeper understanding of federated SCO and SBO algorithms and learn how to apply them to real-world applications.
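For reference, the two problem classes have the following standard textbook formulations (generic forms, not specific to any method covered in the tutorial):

```latex
% Stochastic compositional optimization (SCO): an outer stochastic
% function composed with an inner stochastic function.
\min_{x}\; f\bigl(g(x)\bigr),
\qquad g(x) = \mathbb{E}_{\xi}\bigl[g_{\xi}(x)\bigr],
\quad f(y) = \mathbb{E}_{\zeta}\bigl[f_{\zeta}(y)\bigr].

% Stochastic bilevel optimization (SBO): the outer objective depends on
% the minimizer of an inner problem.
\min_{x}\; \mathbb{E}\bigl[f\bigl(x, y^{*}(x)\bigr)\bigr]
\quad \text{s.t.} \quad
y^{*}(x) = \operatorname*{arg\,min}_{y}\; \mathbb{E}\bigl[g(x, y)\bigr].
```

The federated difficulty is visible in the formulas: an unbiased stochastic gradient of the composition (or of the outer problem through \(y^{*}(x)\)) cannot be formed from one local sample, which is what forces the specialized communication schemes the tutorial surveys.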
T17
Deep Learning for Graph Anomaly Detection
Hezhe Qiao, Hanghang Tong, Bo An, Irwin King, Charu Aggarwal, Guansong Pang
https://sites.google.com/view/ijcai-tutorial-on-ad/home
Schedule: August 18th AM – Full morning
Deep learning for graph anomaly detection (DLGAD), which aims to identify rare observations in graphs, has attracted rapidly increasing attention in recent years due to its significance in a wide range of high-impact application domains, such as abusive review and malicious behavior detection in online shopping, web attack detection, and suspicious activity detection in online/offline financial services. In this tutorial, we will present a comprehensive introduction to DLGAD from three key technical perspectives: graph neural network (GNN) backbones, proxy task design, and graph anomaly measures. For each perspective, we will review the inherent challenges, key intuitions, and underlying assumptions, and discuss the objective functions, advantages, and disadvantages of state-of-the-art DLGAD methods.
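To make the "graph anomaly measure" perspective concrete, one simple measure scores a node by how dissimilar its features are to those of its neighbors. The NumPy sketch below uses toy data and hypothetical names; it illustrates the idea of an affinity-based measure, not any specific method from the tutorial.

```python
import numpy as np

def affinity_anomaly_scores(adj, feats):
    """Score each node by 1 - mean cosine similarity to its neighbors:
    nodes unlike their neighborhood score high."""
    feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    sims = feats @ feats.T                      # pairwise cosine similarities
    scores = np.zeros(len(adj))
    for i, row in enumerate(adj):
        nbrs = np.flatnonzero(row)
        scores[i] = (1 - sims[i, nbrs].mean()) if len(nbrs) else 0.0
    return scores

# Toy graph: nodes 0-3 share similar features; node 4 is an outlier.
adj = np.array([[0, 1, 1, 0, 1],
                [1, 0, 1, 1, 0],
                [1, 1, 0, 1, 1],
                [0, 1, 1, 0, 1],
                [1, 0, 1, 1, 0]])
feats = np.array([[1.0, 0.1], [0.9, 0.2], [1.1, 0.0], [1.0, 0.15], [-1.0, 5.0]])
scores = affinity_anomaly_scores(adj, feats)    # node 4 scores highest
```

DLGAD methods replace the raw features here with GNN-learned representations and the fixed similarity with learned measures, which is where the proxy-task and backbone perspectives come in.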
T20
Supervised Algorithmic Fairness in Distribution Shifts
Dong Li, Chen Zhao, Xintao Wu
https://sites.google.com/view/ijcai25-tutorial-fairness/home
Schedule: August 18th AM – Before coffee break
Supervised fairness-aware machine learning under distribution shifts is an emerging field that addresses the challenge of maintaining equitable and unbiased predictions when faced with changes in data distributions from source to target domains. In real-world applications, machine learning models are often trained on a specific dataset but deployed in environments where the data distribution may shift over time due to various factors. This shift can result in classifiers exhibiting poor generalization with low accuracy and making unfair predictions, which disproportionately impact certain groups identified by sensitive attributes, such as race and gender. In this tutorial, we begin by providing a comprehensive summary of various types of distribution shifts organized into two main categories and briefly discuss the factors contributing to the emergence of unfair outcomes by supervised learning models under such shifts. Then, we conduct a thorough examination of existing methods for maintaining algorithmic fairness based on these shifts and highlight six commonly used approaches in the literature. Furthermore, we introduce frameworks for fairness-aware generalization under distribution shifts, including the latest research developments. Finally, we explore the intersections with related research fields, address significant challenges, and propose potential directions for future studies.
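As a concrete reference point for the fairness notions involved, the demographic parity gap, one of the most common group fairness metrics, can be computed directly from predictions and a sensitive attribute. The function name and data below are hypothetical, for illustration only.

```python
import numpy as np

def demographic_parity_gap(y_pred, group):
    """Absolute difference in positive-prediction rates between two groups
    defined by a binary sensitive attribute (0 = fully parity-fair)."""
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    rate_a = y_pred[group == 0].mean()
    rate_b = y_pred[group == 1].mean()
    return abs(rate_a - rate_b)

# Hypothetical predictions: group 1 receives far fewer positive outcomes.
y_pred = np.array([1, 1, 1, 0, 1, 0, 0, 0, 0, 0])
group  = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
print(demographic_parity_gap(y_pred, group))  # 0.8
```

A gap measured this way on source data offers no guarantee after deployment, which is precisely the distribution-shift problem the tutorial addresses.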
T21
Fairness in Large Language Models: A Tutorial
Zichong Wang, Avash Palikhe, Zhipeng Yin, Jiale Zhang, Wenbin Zhang
https://fairness-llms-tutorial.github.io/
Schedule: August 18th PM – Before coffee break
Large Language Models (LLMs) have demonstrated remarkable success across various domains in recent years. However, despite their promising performance on many real-world tasks, most of these models lack fairness considerations, potentially leading to discriminatory outcomes against marginalized demographic groups and individuals. Many recent publications have explored ways to mitigate bias in LLMs. Nevertheless, a comprehensive understanding of the root causes of bias, their effects, and the limitations of LLMs from the perspective of fairness is still in its early stages. To bridge this gap, this tutorial provides a systematic overview of recent advances in fair LLMs, beginning with real-world case studies, followed by an analysis of bias causes. We then explore fairness concepts specific to LLMs, summarizing bias evaluation strategies and algorithms designed to promote fairness. Finally, we analyze bias in LLM datasets and discuss current research challenges and open questions in the field. All tutorial resources are publicly accessible at https://github.com/lavinWong/fairness-in-large-language-models.
T22
Human-Centric and Multimodal Evaluation for Explainable AI: Moving Beyond Benchmarks
Kang Hao Cheong, Shiyu Hu, Jie Zhao, Yongbao Wu
https://sites.google.com/view/ijcai25-trusteval
Schedule: August 18th PM – Before coffee break
This tutorial addresses the critical limitations of conventional static evaluation metrics in AI by introducing dynamic, interactive, and human-centric methodologies. Through in-depth discussions on individual-level visual assessments that compare AI performance with human perception, and group-level evaluations in multi-agent and networked systems, we establish a foundation for evaluation frameworks that capture reasoning, adaptability, and long-term performance. The tutorial extends these methodologies to practical applications in AI for Education, where quantitative data and qualitative feedback are integrated to assess AI teaching agents and adaptive learning platforms. Participants will gain actionable insights into developing reliable, explainable, and ethical AI systems across diverse domains, bridging theoretical innovations with real-world case studies.
T23
Computational Pathology Foundation Models: Datasets, Adaptation Strategies, and Evaluations
Dong Li, Guihong Wan, Chen Zhao, Xintao Wu, Yevgeniy Semenov
https://sites.google.com/view/ijcai25-tutorial-cpath/home
Schedule: August 18th AM – After coffee break
Computational Pathology Foundation Models (CPathFMs) have emerged as a transformative approach for automating histopathological analysis by leveraging self-supervised learning on large-scale, unlabeled whole-slide images (WSIs). These models, categorized into uni-modal and multi-modal frameworks, facilitate tasks such as segmentation, classification, biomarker discovery, and prognosis prediction. However, the development of CPathFMs faces significant challenges, including limited dataset availability, domain-specific adaptation requirements, and the absence of standardized evaluation benchmarks. This tutorial will provide a comprehensive overview of the current state of CPathFMs, covering key datasets, adaptation strategies such as contrastive learning and multi-modal integration, and a taxonomy of evaluation tasks. We will discuss how these models are trained, fine-tuned, and assessed, addressing the critical gaps in generalization, bias mitigation, and clinical applicability. Additionally, we will explore emerging research directions in fairness, transparency, security, and standardization of evaluation protocols. This tutorial will serve as an essential resource for researchers, clinicians, and AI practitioners looking to advance the field of AI-driven computational pathology.