Tutorials

T1
Scaling LLM Training: Efficient Pre-training & Fine-tuning on AI Accelerators
The rise of powerful foundation models, particularly large language models (LLMs) built on Transformer architectures, has ushered in a new era of Generative AI, transforming various industries. These models have enabled a wide range of applications, including question answering, customer support, image and video generation, and code completion. However, modern LLMs consist of billions of parameters trained on trillions of tokens, making their development challenging in resource-constrained environments. This tutorial provides a comprehensive exploration of deep learning training techniques optimized for AI accelerators, enabling faster, memory-efficient, yet robust training at the scale of billions of model parameters. We begin with an overview of Transformer architectures, deep learning optimization strategies, and system- and hardware-level techniques. We then discuss system optimization techniques, such as fast attention computation and fault-tolerant training at scale. Leveraging modern deep learning frameworks, we illustrate the principles of scaling laws that enable the training of LLMs with hundreds of billions of parameters. Next, we delve into low-precision training methods (e.g., FP8 and FP4), highlighting techniques such as numerical error handling through scaling and stochastic rounding. Finally, we examine fine-tuning approaches, such as low-rank adaptation combined with sparsity and quantization, which enable efficient model updates by modifying only a small subset of parameters.
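
To give a flavor of the low-precision techniques above, here is a minimal NumPy sketch of stochastic rounding (our illustration, not code from the tutorial; real FP8/FP4 training uses hardware-specific formats and per-tensor scaling):

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_round(x, step):
    """Unbiased rounding to the grid {k * step}: E[result] == x.
    Round-to-nearest would systematically flush small gradient
    updates to zero at FP8/FP4 resolutions; stochastic rounding
    preserves them in expectation."""
    scaled = np.asarray(x, dtype=float) / step
    low = np.floor(scaled)
    p_up = scaled - low                      # probability of rounding up
    up = rng.random(scaled.shape) < p_up
    return (low + up) * step

# Sanity check: averaging many unbiased roundings recovers the input.
x = np.full(100_000, 0.3)
print(stochastic_round(x, step=1.0).mean())  # ~0.30; round() would give 0.0
```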

Schedule: August 16th AM – Full morning
T2
LLM-based Role-Playing from the Perspective of Hallucinations
https://ijcai-roleplay.github.io/
Role-playing tasks in NLP require deliberate hallucinations to create immersive fictional scenarios, unlike conventional tasks that prioritize factual accuracy. LLMs must generate narratives adhering to explicit boundaries (e.g., personas) and implicit constraints (e.g., temporal knowledge), balancing creative storytelling with strict limitations. This "controlled hallucination" must avoid both under-hallucination (rigid interactions) and excessive deviation (factual inconsistencies). This tutorial connects theory and practice, exploring how to achieve controlled hallucination without sacrificing reliability. It examines methods across the LLM lifecycle, including pre-training reinforcement, fine-tuning for coherence, preference alignment for boundary adherence, and decoding adjustments for rule-bound creativity. By emphasizing real-world applications, it shows how to develop role-playing agents that blend imaginative storytelling with user-defined safeguards.

Schedule: August 29th AM – Full morning
T3
Evaluating LLM-based Agents: Foundations, Best Practices and Open Challenges
https://github.com/Asaf-Yehudai/LLM-Agent-Evaluation-Survey/blob/main/Tutorials/IJCAI-tutorial-2025.md
The rapid advancement of Large Language Model (LLM)-based agents has sparked a growing interest in their evaluation, bringing forth both challenges and opportunities. This tutorial provides a comprehensive introduction to evaluating LLM-based agents, catering to participants from diverse backgrounds with little prior knowledge of agents, LLMs, metrics, or benchmarks. We will establish foundational concepts and explore key benchmarks that measure critical agentic capabilities, including planning, tool use, self-reflection, and memory. We will examine evaluation strategies tailored to various agent types, ranging from web-based and software engineering to conversational and scientific applications. We will also cover benchmarks and leaderboards that evaluate generalist agents over diverse skill sets. Additionally, we will review prominent developer frameworks for agent evaluation. Finally, we will present emerging trends in the field, identify current limitations, and propose directions for future research.

Schedule: August 16th PM – Full afternoon
T4
Empowering LLMs with Logical Reasoning: Challenges, Solutions, and Opportunities
https://sites.google.com/view/ijcai25-tutorial-logicllm
Large language models (LLMs) have achieved remarkable successes on various tasks. However, recent studies have found that significant challenges remain in the logical reasoning abilities of LLMs, which can be grouped into two aspects: (1) Logical question answering: LLMs often fail to generate the correct answer to a complex logical problem that requires sophisticated deductive, inductive, or abductive reasoning over a collection of premises and constraints. (2) Logical consistency: LLMs are prone to producing responses that contradict themselves across different questions. For example, Macaw, a state-of-the-art question-answering LLM, answers "Yes" to both "Is a magpie a bird?" and "Does a bird have wings?" but answers "No" to "Does a magpie have wings?". In this tutorial, we comprehensively introduce the most cutting-edge methods under a newly proposed taxonomy. Specifically, for accurately answering complex logic questions, previous methods can be categorized by their reliance on external solvers, prompting, or fine-tuning. For avoiding logical contradictions, we discuss the concepts of and solutions for various logical consistencies, including implication, negation, transitivity, and factuality consistency, as well as their composites. In addition, we review commonly used benchmark datasets and evaluation metrics, and discuss promising research directions, such as extending to modal logic to account for uncertainty and developing efficient algorithms that simultaneously satisfy multiple logical consistencies.
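
The magpie example above is an instance of implication (transitivity) consistency; one way to formalize it (notation ours, not the tutorial's) is:

```latex
\forall x\,\big(\mathrm{Magpie}(x) \rightarrow \mathrm{Bird}(x)\big),\quad
\forall x\,\big(\mathrm{Bird}(x) \rightarrow \mathrm{Wings}(x)\big)
\;\;\vdash\;\;
\forall x\,\big(\mathrm{Magpie}(x) \rightarrow \mathrm{Wings}(x)\big)
```

A model that affirms both premises but denies the conclusion violates this entailment, regardless of the factual status of any individual answer.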

Schedule: August 29th PM – Full afternoon
T5
Large Language Models for Recommendation
https://zhengzhi-1997.github.io/LLM4Rec-Tutorial/
In this tutorial, we aim to provide a comprehensive review and discussion of the intersection between LLMs and recommender systems. By offering a clear overview of recent advances and research directions, the tutorial equips both academic researchers and industry practitioners with the knowledge necessary to leverage LLMs in creating more effective and trustworthy recommender systems.

Schedule: August 29th PM – Before coffee break
T6
Beyond Text: Advanced Retrieval Augmented Generation for Complex and Multimodal Data
https://cs.emory.edu/~lzhao41/projects/ijcai2025-rag
Retrieval-Augmented Generation (RAG) is a cutting-edge framework that combines retrieval-based methods with generative models, enhancing the accuracy and relevance of responses by retrieving relevant information from a knowledge base before generating answers. Its significance lies in its ability to handle complex, knowledge-intensive tasks like question answering, document summarization, and conversational AI, making it a powerful tool for applications in healthcare, finance, education, and more. As RAG rapidly evolves, it is being applied to increasingly diverse domains, requiring it to handle broader types of data, including text, images, tables, graphs, and time-series data. However, this expansion introduces challenges such as cross-modal retrieval, unified representation learning, data fusion, scalability, noise handling, and evaluation. Addressing these challenges is urgent to ensure RAG's effectiveness in real-world applications, where data is often heterogeneous, dynamic, and imperfect, and to unlock its full potential across a wide range of industries and use cases. This tutorial will cover a broad range of topics on recent progress in retrieval-augmented generation, reviewing the fundamental concepts and algorithms of RAG, new research frontiers and technical advances in RAG for complex data, and the corresponding applications and evaluations. In addition, rich tutorial materials will be provided to help the audience gain a systematic understanding beyond our recently published survey paper and open-source repositories of state-of-the-art RAG algorithms.
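
As a reference point for the retrieve-then-generate pattern described above, here is a minimal sketch (our illustration; `embed` is a hypothetical stand-in for a real embedding model, and `generate` wraps whatever LLM call you use):

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical stand-in for a real embedding model (hash-seeded noise)."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=128)
    return v / np.linalg.norm(v)

def retrieve(query: str, corpus: list[str], k: int = 3) -> list[str]:
    """Rank corpus passages by cosine similarity to the query embedding."""
    q = embed(query)
    return sorted(corpus, key=lambda p: -float(embed(p) @ q))[:k]

def rag_answer(query: str, corpus: list[str], generate) -> str:
    """Retrieve-then-generate: condition the generator on retrieved context."""
    context = "\n".join(retrieve(query, corpus))
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    return generate(prompt)  # `generate` wraps any LLM call you have
```

Multimodal RAG extends the same loop by swapping the text embedder for cross-modal encoders and the string corpus for images, tables, or graph fragments; the retrieval and conditioning structure stays the same.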

Schedule: August 17th AM – After coffee break
T7
Neuroevolution of Intelligent Agents
https://www.cs.utexas.edu/~risto/talks/ijcai25-tutorial/
Neuroevolution, or the optimization of neural networks through population-based search, allows constructing intelligent agents in domains where gradients are not available and significant exploration is needed to find good solutions. This tutorial introduces participants to the basics of neuroevolution, reviews example application areas, and provides hands-on experience through a Colab exercise.
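
For a concrete sense of the mechanism, here is a minimal (mu, lambda)-style neuroevolution loop in NumPy (our sketch with a toy fitness function, not the tutorial's Colab exercise):

```python
import numpy as np

rng = np.random.default_rng(0)

def policy(params, obs):
    """Tiny one-layer network mapping a 4-d observation to an action score."""
    W, b = params
    return np.tanh(obs @ W + b)

def fitness(params):
    """Toy stand-in for an episode return; note it needs no gradients."""
    obs = rng.normal(size=(32, 4))
    return -float(np.mean((policy(params, obs) - 1.0) ** 2))

def mutate(params, sigma=0.1):
    """Gaussian perturbation of every weight -- the only search operator here."""
    return [p + rng.normal(scale=sigma, size=p.shape) for p in params]

# (mu, lambda)-style loop: keep the 5 best genomes, refill by mutation.
pop = [[rng.normal(scale=0.5, size=(4, 1)), np.zeros(1)] for _ in range(20)]
for gen in range(50):
    pop.sort(key=fitness, reverse=True)      # fitness is re-sampled each call
    elites = pop[:5]
    pop = elites + [mutate(elites[i % 5]) for i in range(15)]

print("best fitness:", fitness(pop[0]))
```

Real systems add crossover, topology evolution (as in NEAT), and parallel fitness evaluation, but the select-and-perturb loop above is the core.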

Schedule: August 17th PM – Before coffee break
T8
AI Meets Algebra: Foundations and Frontiers
https://algebra4ai.github.io/
This tutorial explores how advanced algebraic structures—such as C*-algebras, Lie groups, category theory, and tensor networks—can enhance machine learning by improving model efficiency, interpretability, and generalization. Moving beyond standard linear algebra, we present operator-algebraic tools for analyzing structured data, symmetry-based methods for learning distributions on manifolds, categorical frameworks for representation learning and reinforcement learning, and tensor networks for efficient computation. Designed for AI researchers and practitioners with a solid foundation in linear algebra and ML, the tutorial combines theoretical insights with practical applications, covering recent developments across kernel methods, neural networks, probabilistic modeling, and quantum AI.

Schedule: August 16th AM – Full morning
T9
Multimodal Large Language Model for Visually Rich Document Understanding
https://github.com/yihaoding/mllm_vrdiu
The rising demand for Visually Rich Document (VRD) understanding spans various domains such as business, law, and medicine, aiming to enhance the efficiency of document-related tasks. Multimodal Large Language Models (MLLMs) are particularly suited for VRD understanding (VRDU) tasks, as they possess extensive real-world knowledge, enabling zero-shot or few-shot learning. This tutorial will provide a systematic overview of LLM/MLLM-driven VRDU frameworks, analyzing current technical trends. It will begin by introducing common document understanding tasks, along with associated benchmarks and techniques. Following this, a summary of existing LLM/MLLM-based VRDU frameworks will cover training strategies, inference methods, and data preparation. To solidify understanding, two hands-on labs will be provided, ensuring participants grasp not only theoretical concepts but also practical applications using state-of-the-art techniques to address real-world challenges.

Schedule: August 29th AM – Full morning
T10
Principles of Self-supervised Learning in the Foundation Model Era
https://sites.google.com/view/ijcai25-ssl
This tutorial aims to bridge the gap between the practice and theory of self-supervised learning (SSL) by providing a comprehensive overview of the SSL principles and methodologies utilized in foundation models. It covers 1) representative SSL methodologies in foundation models, 2) theoretical principles and frameworks for analyzing SSL methods, and 3) advanced SSL topics and phenomena, such as equivariant SSL, in-context learning, scaling laws, and feature interpretability. The tutorial concludes with a panel discussion featuring prominent researchers, addressing future directions and the theoretical underpinnings of SSL in foundation models. This systematic treatment equips attendees with a solid understanding of modern SSL techniques and their foundational principles, fostering further advancements in the field.

Schedule: August 16th PM – Full afternoon
T11
Gradient-Based Multi-Objective Deep Learning
https://gradnexus.github.io/IJCAI25_tutorial/
The evaluation of deep learning models often involves navigating trade-offs among multiple criteria. This tutorial provides a structured overview of gradient-based multi-objective optimization (MOO) for deep learning models. We begin with the foundational theory, systematically exploring three core solution strategies: identifying a single balanced solution, finding a discrete set of Pareto optimal solutions, and learning a continuous Pareto set. We will cover their algorithmic details, convergence, and generalization. The second half of the tutorial focuses on applying MOO to Large Language Models (LLMs). We will demonstrate how MOO offers a principled framework for fine-tuning and aligning LLMs, effectively navigating trade-offs between multiple objectives. Through practical demonstrations of state-of-the-art methods, participants will gain hands-on insight into how these techniques behave in practice. The session will conclude by discussing emerging challenges and future research directions, equipping attendees to tackle multi-objective problems in their work.
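
To illustrate the "single balanced solution" strategy, here is a sketch of the two-task closed form of the multiple-gradient descent algorithm (MGDA), which descends along the min-norm point of the convex hull of the task gradients (our illustration, assuming exactly two objectives):

```python
import numpy as np

def mgda_direction(g1, g2):
    """Two-task MGDA closed form: the min-norm point
    gamma*g1 + (1-gamma)*g2 in the convex hull of the gradients.
    Descending along it cannot increase either objective
    (to first order)."""
    diff = g1 - g2
    denom = float(diff @ diff)
    if denom == 0.0:
        return g1                            # gradients coincide
    gamma = float(np.clip((g2 - g1) @ g2 / denom, 0.0, 1.0))
    return gamma * g1 + (1.0 - gamma) * g2

# Toy usage: two conflicting quadratics f1 = ||x - 1||^2, f2 = ||x + 1||^2.
x = np.array([3.0, -2.0])
for _ in range(200):
    g1, g2 = 2.0 * (x - 1.0), 2.0 * (x + 1.0)
    x -= 0.1 * mgda_direction(g1, g2)
print(x)  # ~[0.5, 0.5]: a Pareto-stationary point between the two optima
```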

Schedule: August 29th PM – Full afternoon
T12
Advances in Time-Series Anomaly Detection
Recent advances in data collection technology, accompanied by the ever-rising volume and velocity of streaming data, underscore the vital need for time-series analytics. In this regard, time-series anomaly detection has been an important activity, with applications in fields such as cyber security, financial markets, law enforcement, and health care. While the traditional literature on anomaly detection is centered on statistical measures, the growing number of machine learning algorithms in recent years calls for a structured, general characterization of the research methods for time-series anomaly detection. In this tutorial, we present a process-centric taxonomy for time-series anomaly detection methods, systematically categorizing traditional statistical approaches and contemporary machine learning techniques. Beyond this taxonomy, we conduct a meta-analysis of the existing literature to identify broad research trends. Given the absence of a one-size-fits-all anomaly detector, we also introduce emerging trends for time-series anomaly detection. Furthermore, we review commonly used evaluation measures and benchmarks, followed by an analysis of benchmark results to provide insights into the impact of different design choices on model performance. Through these contributions, we aim to provide a holistic perspective on time-series anomaly detection and highlight promising avenues for future investigation.
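
As a baseline for the statistical measures mentioned above, a rolling z-score detector is about as simple as time-series anomaly detection gets (our sketch; the window and threshold are illustrative choices, not values from the tutorial):

```python
import numpy as np

def rolling_zscore_anomalies(x, window=50, threshold=3.0):
    """Flag points whose deviation from the trailing-window mean exceeds
    `threshold` standard deviations -- a classic statistical baseline."""
    x = np.asarray(x, dtype=float)
    flags = np.zeros(len(x), dtype=bool)
    for i in range(window, len(x)):
        hist = x[i - window:i]
        mu, sigma = hist.mean(), hist.std()
        if sigma > 0 and abs(x[i] - mu) > threshold * sigma:
            flags[i] = True
    return flags

# Toy usage: a sine wave with one injected spike.
t = np.linspace(0, 20, 500)
series = np.sin(t)
series[300] += 5.0
print(np.flatnonzero(rolling_zscore_anomalies(series)))  # ~[300]
```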

Schedule: August 17th AM – Full morning
T13
GUI Agents with Foundation Models: Data Resource, Framework and Application
https://yuqi-zhou.github.io/GUI-Agent-with-Foundation-Models.github.io/
This tutorial aims to provide a structured overview of the latest work in the field of GUI agents. We deconstruct GUI agent ecosystems into three critical pillars: multimodal data resources, algorithmic frameworks, and applications. Data resources, such as user instructions, user interface (UI) screenshots, and behavior traces, form the cornerstone for the architecture design of GUI agents; frameworks orchestrate the power of foundation models, knowledge bases, and tools to enable intelligent and reliable decision-making; applications represent the concrete setups for domain-optimized implementations. The current state of these aspects reflects the maturity of the field and highlights future research priorities. To this end, we organize this tutorial as a review and lecture around these three key areas: data resources, frameworks, and applications.

Schedule: August 29th AM – Full morning
T14
Federated Compositional and Bilevel Optimization
https://hcgao.github.io/tutorial_ijcai2025.html
Federated learning has attracted significant attention in recent years, resulting in the development of numerous methods. However, most of these methods focus solely on traditional minimization problems and fail to address new learning paradigms in machine learning. Therefore, this tutorial focuses on the learning paradigms that can be formulated as stochastic compositional optimization (SCO) and stochastic bilevel optimization (SBO) problems, as they cover a wide variety of machine learning models beyond traditional minimization, such as model-agnostic meta-learning, imbalanced data classification, contrastive self-supervised learning, graph neural networks, and neural architecture search. The compositional and bilevel structures bring unique computation and communication challenges to federated learning. To address these challenges, a series of federated compositional optimization and federated bilevel optimization methods have been developed in the past few years. However, these advances have not been widely disseminated. Thus, this tutorial aims to introduce the unique challenges, recent advances, and practical applications of federated SCO and SBO. The audience will benefit by gaining a deeper understanding of federated SCO and SBO algorithms and learning how to apply them to real-world applications.
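
For reference, the standard forms of the two problem classes (notation ours) are:

```latex
% Stochastic compositional optimization (SCO):
\min_{x}\; f\big(g(x)\big),
\qquad f(\cdot)=\mathbb{E}_{\xi}\!\left[f_{\xi}(\cdot)\right],\;\;
        g(\cdot)=\mathbb{E}_{\zeta}\!\left[g_{\zeta}(\cdot)\right].

% Stochastic bilevel optimization (SBO):
\min_{x}\; F\big(x,\,y^{*}(x)\big)
\quad\text{s.t.}\quad
y^{*}(x)\in\arg\min_{y}\, G(x,\,y).
```

The inner expectation in SCO and the inner arg min in SBO are what make plain sample-average gradients biased, which is the root of the extra computation and communication cost in the federated setting.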

Schedule: August 17th PM – Before coffee break
T15
Multi-Modal Generative AI in Dynamic and Open Environment
https://mn.cs.tsinghua.edu.cn/ijcai25-aigc/
This tutorial explores recent advancements in multi-modal generative AI, including multi-modal large language models (MLLMs) and diffusion models, and highlights the emerging challenges of applying them to dynamic and open environments; generalizable post-training techniques and unified multi-modal understanding-and-generation frameworks will be discussed.

Schedule: August 29th AM – Before coffee break
T16
A Tutorial on Bandit Learning in Matching Markets
https://sites.google.com/view/matchingtutorial
Matching markets are fundamental in economics and game theory, where the goal is to compute matchings that achieve a desired objective. Stable matching is an important problem in this field that characterizes the equilibrium state among agents. Because agents usually have uncertain preferences, bandit learning has recently attracted substantial research attention for this problem. This line of work mainly focuses on the algorithms' stable regret, which characterizes the utility of agents, and incentive compatibility, which characterizes the robustness of the system. This tutorial comprehensively introduces the latest advancements and pioneering results on this problem, with particular emphasis on these two metrics. To investigate the application of bandit algorithms to other areas of mechanism design, the tutorial also broadens its scope by introducing the bandit learning problem in auctions.
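
One common starting point in this literature is a centralized scheme that runs agent-proposing deferred acceptance on the agents' UCB indices. The sketch below is our simplified illustration (Gaussian rewards, known arm-side preferences), not any specific published algorithm:

```python
import math
import numpy as np

rng = np.random.default_rng(0)
n_agents = n_arms = 3
true_means = rng.uniform(size=(n_agents, n_arms))  # agents' unknown preferences
arm_prefs = [list(rng.permutation(n_agents)) for _ in range(n_arms)]  # known

counts = np.ones((n_agents, n_arms))  # pseudo-count avoids divide-by-zero
means = np.zeros((n_agents, n_arms))  # empirical mean reward estimates

def deferred_acceptance(index):
    """Agent-proposing Gale-Shapley, with agents ranking arms by UCB index."""
    proposals = [list(np.argsort(-index[i])) for i in range(n_agents)]
    match = {}  # arm -> agent
    free = list(range(n_agents))
    while free:
        i = free.pop()
        a = proposals[i].pop(0)
        if a not in match:
            match[a] = i
        elif arm_prefs[a].index(i) < arm_prefs[a].index(match[a]):
            free.append(match[a])
            match[a] = i
        else:
            free.append(i)
    return match

for t in range(1, 2001):
    ucb = means + np.sqrt(3.0 * math.log(t + 1) / (2.0 * counts))
    for a, i in deferred_acceptance(ucb).items():
        reward = float(rng.normal(true_means[i, a], 0.1))  # noisy observation
        counts[i, a] += 1
        means[i, a] += (reward - means[i, a]) / counts[i, a]
```

Stable regret then compares each agent's accumulated reward against what it would earn under its stable match in the true (unknown) preference profile.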

Schedule: August 29th PM – After coffee break
T17
Deep Learning for Graph Anomaly Detection
https://sites.google.com/view/ijcai-tutorial-on-ad/home
Deep learning for graph anomaly detection (DLGAD), which aims to identify rare observations in graphs, has attracted rapidly increasing attention in recent years due to its significance in a wide range of high-impact application domains, such as abusive review and malicious behavior detection in online shopping, web attack detection, and suspicious activity detection in online/offline financial services. In this tutorial, we will present a comprehensive introduction to DLGAD from three key technical perspectives: graph neural network (GNN) backbones, proxy task design, and graph anomaly measures. For each of these perspectives, we will review the inherent challenges, key intuitions, and underlying assumptions, and discuss the objective functions, advantages, and disadvantages of state-of-the-art DLGAD methods.

Schedule: August 18th AM – Full morning
T18
Towards Low-Distortion Graph Representation Learning
https://magic-group-buaa.github.io/IJCAI25_tutorial/
Low-distortion graph representation learning aims to address the challenge of preserving the complex structures and topological properties of graphs in low-dimensional representations. Graph distortion manifests in three primary ways: noisy and perturbed structure, missing and tampered topological properties, and attribution fallacies in the structural distribution. In this tutorial, we will examine the development of low-distortion graph representation learning through three key perspectives: information theory, geometry, and causality. Specifically, we cover three major aspects: information-theoretic graph representation learning, geometry-guided graph representation learning, and invariance-guided graph representation learning. In addition, we will discuss future directions, such as advanced theories, low-distortion approaches for graph large language models and graph foundation models, and novel applications in AI for Science.

Schedule: August 29th PM – Full afternoon
T19
Beyond Graph Distribution Shifts: LLMs, Adaptation, and Generalization
https://ood-generalization.com/ijcai2025Tutorial.htm
Graph machine learning has witnessed rapid progress across both academia and industry. However, most existing methods are developed under the in-distribution (I.D.) hypothesis, which assumes that training and testing graph data are drawn from the same distribution. In real-world applications—ranging from dynamic knowledge graphs to evolving biomedical networks—this assumption is frequently violated, resulting in severe performance degradation under distribution shifts. Addressing this challenge has become a key focus in recent years, leading to the development of novel paradigms that move beyond the I.D. setting. This tutorial presents a comprehensive overview of three emerging and synergistic directions for tackling distribution shifts in graph learning. First, we highlight Graph LLMs, which combine the representational power of large language models with graph structures to enable flexible, in-context, and few-shot learning on graphs. Second, we introduce adaptation techniques for both GNNs and Graph LLMs, including graph neural architecture search and continual learning strategies for evolving data. Third, we cover generalization methods that incorporate causality and invariance principles to build robust graph models under unseen distributions. These advances are reshaping the future of graph machine learning and are of central interest to the IJCAI community across machine learning, data mining, and real-world AI deployment.

Schedule: August 29th AM – After coffee break
T20
Supervised Algorithmic Fairness in Distribution Shifts
https://sites.google.com/view/ijcai25-tutorial-fairness/home
Supervised fairness-aware machine learning under distribution shifts is an emerging field that addresses the challenge of maintaining equitable and unbiased predictions when faced with changes in data distributions from source to target domains. In real-world applications, machine learning models are often trained on a specific dataset but deployed in environments where the data distribution may shift over time due to various factors. This shift can result in classifiers exhibiting poor generalization with low accuracy and making unfair predictions, which disproportionately impact certain groups identified by sensitive attributes, such as race and gender. In this tutorial, we begin by providing a comprehensive summary of various types of distribution shifts organized into two main categories and briefly discuss the factors contributing to the emergence of unfair outcomes by supervised learning models under such shifts. Then, we conduct a thorough examination of existing methods for maintaining algorithmic fairness based on these shifts and highlight six commonly used approaches in the literature. Furthermore, we introduce frameworks for fairness-aware generalization under distribution shifts, including the latest research developments. Finally, we explore the intersections with related research fields, address significant challenges, and propose potential directions for future studies.

Schedule: August 18th AM – Before coffee break
T21
Fairness in Large Language Models: A Tutorial
https://fairness-llms-tutorial.github.io/
Large Language Models (LLMs) have demonstrated remarkable success across various domains in recent years. However, despite their promising performance on many real-world tasks, most of these algorithms lack fairness considerations, potentially leading to discriminatory outcomes against marginalized demographic groups and individuals. Many recent publications have explored ways to mitigate bias in LLMs. Nevertheless, a comprehensive understanding of the root causes of bias, their effects, and the limitations of LLMs from the perspective of fairness is still in its early stages. To bridge this gap, this tutorial provides a systematic overview of recent advances in fair LLMs, beginning with real-world case studies, followed by an analysis of bias causes. We then explore fairness concepts specific to LLMs, summarizing bias evaluation strategies and algorithms designed to promote fairness. Finally, we analyze bias in LLM datasets and discuss current research challenges and open questions in the field. All tutorial resources are publicly accessible at https://github.com/lavinWong/fairness-in-large-language-models.

Schedule: August 18th PM – Before coffee break
T22
Human-Centric and Multimodal Evaluation for Explainable AI: Moving Beyond Benchmarks
https://sites.google.com/view/ijcai25-trusteval
This tutorial addresses the critical limitations of conventional static evaluation metrics in AI by introducing dynamic, interactive, and human-centric methodologies. Through in-depth discussions on individual-level visual assessments that compare AI performance with human perception, and group-level evaluations in multi-agent and networked systems, we establish a foundation for evaluation frameworks that capture reasoning, adaptability, and long-term performance. The tutorial extends these methodologies to practical applications in AI for Education, where quantitative data and qualitative feedback are integrated to assess AI teaching agents and adaptive learning platforms. Participants will gain actionable insights into developing reliable, explainable, and ethical AI systems across diverse domains, bridging theoretical innovations with real-world case studies.

Schedule: August 18th PM – Before coffee break
T23
Computational Pathology Foundation Models: Datasets, Adaptation Strategies, and Evaluations
https://sites.google.com/view/ijcai25-tutorial-cpath/home
Computational Pathology Foundation Models (CPathFMs) have emerged as a transformative approach for automating histopathological analysis by leveraging self-supervised learning on large-scale, unlabeled whole-slide images (WSIs). These models, categorized into uni-modal and multi-modal frameworks, facilitate tasks such as segmentation, classification, biomarker discovery, and prognosis prediction. However, the development of CPathFMs faces significant challenges, including limited dataset availability, domain-specific adaptation requirements, and the absence of standardized evaluation benchmarks. This tutorial will provide a comprehensive overview of the current state of CPathFMs, covering key datasets, adaptation strategies such as contrastive learning and multi-modal integration, and a taxonomy of evaluation tasks. We will discuss how these models are trained, fine-tuned, and assessed, addressing the critical gaps in generalization, bias mitigation, and clinical applicability. Additionally, we will explore emerging research directions in fairness, transparency, security, and standardization of evaluation protocols. This tutorial will serve as an essential resource for researchers, clinicians, and AI practitioners looking to advance the field of AI-driven computational pathology.

Schedule: August 18th AM – After coffee break