Survey track accepted papers (Guangzhou)

734: Toward Robust Non-Transferable Learning: A Survey and Benchmark
Authors: Ziming Hong, Yongli Xiang, Tongliang Liu
Location: Guangzhou | Day: TBD
Over the past decades, researchers have primarily focused on improving the generalization abilities of models, with limited attention given to regulating such generalization. However, the ability of models to generalize to unintended data (e.g., harmful or unauthorized data) can be exploited by malicious adversaries in unforeseen ways, potentially resulting in violations of model ethics. Non-transferable learning (NTL), a task aimed at reshaping the generalization abilities of deep learning models, was proposed to address these challenges. While numerous methods have been proposed in this field, a comprehensive review of existing progress and a thorough analysis of current limitations remain lacking. In this paper, we bridge this gap by presenting the first comprehensive survey on NTL and introducing NTLBench, the first benchmark to evaluate NTL performance and robustness within a unified framework. Specifically, we first introduce the task settings, general framework, and criteria of NTL, followed by a summary of NTL approaches. Furthermore, we emphasize the often-overlooked issue of robustness against various attacks that can destroy the non-transferable mechanism established by NTL. Experiments conducted with NTLBench confirm the robustness limitations of existing NTL methods. Finally, we discuss the practical applications of NTL, along with its future directions and associated challenges.
4460: A Survey on Temporal Interaction Graph Representation Learning: Progress, Challenges, and Opportunities
Authors: Pengfei Jiao, Hongjiang Chen, Xuan Guo, Zhidong Zhao, Dongxiao He, Di Jin
Location: Guangzhou | Day: TBD
Temporal interaction graphs (TIGs), defined by sequences of timestamped interaction events, have become ubiquitous in real-world applications due to their capability to model complex dynamic system behaviors. As a result, temporal interaction graph representation learning (TIGRL) has garnered significant attention in recent years. TIGRL aims to embed nodes in TIGs into low-dimensional representations that effectively preserve both structural and temporal information, thereby enhancing the performance of downstream tasks such as classification, prediction, and clustering within constantly evolving data environments. In this paper, we begin by introducing the foundational concepts of TIGs and emphasizing the critical role of temporal dependencies. We then propose a comprehensive taxonomy of state-of-the-art TIGRL methods, systematically categorizing them based on the types of information utilized during the learning process to address the unique challenges inherent to TIGs. To facilitate further research and practical applications, we curate sources of datasets and benchmarks, providing valuable resources for empirical investigations. Finally, we examine key open challenges and explore promising research directions in TIGRL, laying the groundwork for future advancements that have the potential to shape the evolution of this field.
4603: A Survey of Structural Entropy: Theory, Methods, and Applications
Authors: Dingli Su, Hao Peng, Yicheng Pan, Angsheng Li
Location: Guangzhou | Day: TBD
Classical information theory, a cornerstone of artificial intelligence, is fundamentally limited by its local perspective, often analyzing pairwise interactions while ignoring the larger, hierarchical architecture of complex systems. Structural entropy (SE) presents a paradigm shift, extending Shannon entropy to quantify information on a global scale and measure the uncertainty embedded in a system’s organizational hierarchy. Although its applications have broadened significantly across diverse AI domains from its origins in community detection, a systematic synthesis of its theory, computational methods, and applications is currently lacking.
This survey provides a comprehensive overview of SE to fill this critical void in the literature. We offer a detailed examination of its theoretical foundations, computational frameworks, and key learning paradigms, with a focus on its integration with graph learning and reinforcement learning. Through an exploration of its diverse applications, we highlight the power of SE to advance graph-based analysis and modeling. Finally, we discuss key challenges and future research opportunities for incorporating SE principles into the development of more interpretable and theoretically grounded AI systems.
5438: A Survey of Optimization Modeling Meets LLMs: Progress and Future Directions
Authors: Ziyang Xiao, Jingrong Xie, Lilin Xu, Shisi Guan, Jingyan Zhu, Xiongwei Han, Xiaojin Fu, WingYin Yu, Han Wu, Wei Shi, Qingcan Kang, Jiahui Duan, Tao Zhong, Mingxuan Yuan, Jia Zeng, Yuan Wang, Gang Chen, Dongxiang Zhang
Location: Guangzhou | Day: TBD
By virtue of its great utility in solving real-world problems, optimization modeling has been widely employed for optimal decision-making across various sectors, but it requires substantial expertise from operations research professionals. With the advent of large language models (LLMs), new opportunities have emerged to automate the procedure of mathematical modeling. This survey presents a comprehensive and timely review of recent advancements that cover the entire technical stack, including data synthesis and fine-tuning for the base model, inference frameworks, benchmark datasets, and performance evaluation. In addition, we conducted an in-depth analysis of the quality of the benchmark datasets, which were found to have surprisingly high error rates. We cleaned the datasets and constructed a new leaderboard with fair performance evaluation in terms of base LLM models and datasets. We also built an online portal that integrates resources of cleaned datasets, code, and a paper repository to benefit the community. Finally, we identify limitations in current methodologies and outline future research opportunities.
7388: Grounding Creativity in Physics: A Brief Survey of Physical Priors in AIGC
Authors: Siwei Meng, Yawei Luo, Ping Liu
Location: Guangzhou | Day: TBD
Recent advancements in AI-generated content have significantly improved the realism of 3D and 4D generation. However, most existing methods prioritize appearance consistency while neglecting underlying physical principles, leading to artifacts such as unrealistic deformations, unstable dynamics, and implausible object interactions. Incorporating physics priors into generative models has become a crucial research direction to enhance structural integrity and motion realism. This survey provides a review of physics-aware generative methods, systematically analyzing how physical constraints are integrated into 3D and 4D generation. First, we examine recent work on incorporating physical priors into static and dynamic 3D generation, categorizing methods based on representation types, including vision-based, NeRF-based, and Gaussian Splatting-based approaches. Second, we explore emerging techniques in 4D generation, focusing on methods that model temporal dynamics with physical simulations. Finally, we conduct a comparative analysis of major methods, highlighting their strengths, limitations, and suitability for different materials and motion dynamics. By presenting an in-depth analysis of physics-grounded AIGC, this survey aims to bridge the gap between generative models and physical realism, providing insights that inspire future research in physically consistent content generation.
8251: One-shot Federated Learning Methods: A Practical Guide
Authors: Xiang Liu, Zhenheng Tang, Xia Li, Yijun Song, Sijie Ji, Zemin Liu, Bo Han, Linshan Jiang, Jialin Li
Location: Guangzhou | Day: TBD
One-shot Federated Learning (OFL) is a distributed machine learning paradigm that constrains client-server communication to a single round, addressing privacy and communication overhead issues associated with multiple rounds of data exchange in traditional Federated Learning (FL). OFL shows practical potential for integration with future approaches that require collaborative model training, such as large language models (LLMs). However, current OFL methods face two major challenges: data heterogeneity and model heterogeneity, which result in subpar performance compared to conventional FL methods. Worse still, despite numerous studies addressing these limitations, a comprehensive summary is still lacking. To address these gaps, this paper presents a systematic analysis of the challenges faced by OFL and thoroughly reviews the current methods. We also offer an innovative categorization method and analyze the trade-offs of various techniques. Additionally, we discuss the most promising future directions and the technologies that should be integrated into the OFL field. This work aims to provide guidance and insights for future research.
8291: An Empirical Study of Federated Prompt Learning for Vision Language Model
Authors: Zhihao Wang, Wenke Huang, Tian Chen, Zekun Shi, Guancheng Wan, Yu Qiao, Bin Yang, Jian Wang, Bing Li, Mang Ye
Location: Guangzhou | Day: TBD
The Vision Language Model (VLM) excels in aligning vision and language representations, and prompt learning has emerged as a key technique for adapting such models to downstream tasks. However, the application of prompt learning with VLM in federated learning (FL) scenarios remains underexplored. This paper systematically investigates the behavioral differences between language prompt learning (LPT) and vision prompt learning (VPT) under data heterogeneity challenges, including label skew and domain shift. We conduct extensive experiments to evaluate the impact of various FL and prompt configurations, such as client scale, aggregation strategies, and prompt length, to assess the robustness of Federated Prompt Learning (FPL). Furthermore, we explore strategies for enhancing prompt learning in complex scenarios where label skew and domain shift coexist, including leveraging both prompt types when computational resources allow. Our findings offer practical insights into optimizing prompt learning in federated settings, contributing to the broader deployment of VLMs in privacy-preserving environments.
8323: Towards Cross-Modality Modeling for Time Series Analytics: A Survey in the LLM Era
Authors: Chenxi Liu, Shaowen Zhou, Qianxiong Xu, Hao Miao, Cheng Long, Ziyue Li, Rui Zhao
Location: Guangzhou | Day: TBD
The proliferation of edge devices has generated an unprecedented volume of time series data across different domains, motivating a variety of well-customized methods. Recently, Large Language Models (LLMs) have emerged as a new paradigm for time series analytics by leveraging the shared sequential nature of textual data and time series. However, a fundamental cross-modality gap between time series and LLMs exists, as LLMs are pre-trained on textual corpora and are not inherently optimized for time series. Many recent proposals are designed to address this issue. In this survey, we provide an up-to-date overview of LLM-based cross-modality modeling for time series analytics. We first introduce a taxonomy that classifies existing approaches into four groups based on the type of textual data employed for time series modeling. We then summarize key cross-modality strategies, e.g., alignment and fusion, and discuss their applications across a range of downstream tasks. Furthermore, we conduct experiments on multimodal datasets from different application domains to investigate effective combinations of textual data and cross-modality strategies for enhancing time series analytics. Finally, we suggest several promising directions for future research. This survey is designed for a range of professionals, researchers, and practitioners interested in LLM-based time series modeling.
8361: Federated Low-Rank Adaptation for Foundation Models: A Survey
Authors: Yiyuan Yang, Guodong Long, Qinghua Lu, Liming Zhu, Jing Jiang, Chengqi Zhang
Location: Guangzhou | Day: TBD
Effectively leveraging private datasets remains a significant challenge in developing foundation models. Federated Learning (FL) has recently emerged as a collaborative framework that enables multiple users to fine-tune these models while mitigating data privacy risks. Meanwhile, Low-Rank Adaptation (LoRA) offers a resource-efficient alternative for fine-tuning foundation models by dramatically reducing the number of trainable parameters. This survey examines how LoRA has been integrated into federated fine-tuning for foundation models—an area we term FedLoRA—by focusing on three key challenges: distributed learning, heterogeneity, and efficiency. We further categorize existing work based on the specific methods used to address each challenge. Finally, we discuss open research questions and highlight promising directions for future investigation, outlining the next steps for advancing FedLoRA.
8432: Connector-S: A Survey of Connectors in Multi-modal Large Language Models
Authors: Xun Zhu, Zheng Zhang, Xi Chen, Yiming Shi, Miao Li, Ji Wu
Location: Guangzhou | Day: TBD
With the rapid advancements in multi-modal large language models (MLLMs), connectors play a pivotal role in bridging diverse modalities and enhancing model performance. However, the design and evolution of connectors have not been comprehensively analyzed, leaving gaps in understanding how these components function and hindering the development of more powerful connectors. In this survey, we systematically review the current progress of connectors in MLLMs and present a structured taxonomy that categorizes connectors into atomic operations (mapping, compression, mixture of experts) and holistic designs (multi-layer, multi-encoder, multi-modal scenarios), highlighting their technical contributions and advancements. Furthermore, we discuss several promising research frontiers and challenges, including high-resolution input, dynamic compression, guide information selection, combination strategy, and interpretability. This survey is intended to serve as a foundational reference and a clear roadmap for researchers, providing valuable insights into the design and optimization of next-generation connectors to enhance the performance and adaptability of MLLMs.
8549: A Survey of Pathology Foundation Model: Progress and Future Directions
Authors: Conghao Xiong, Hao Chen, Joseph J. Y. Sung
Location: Guangzhou | Day: TBD
Computational pathology, which involves analyzing whole slide images for automated cancer diagnosis, relies on multiple instance learning, where performance depends heavily on the feature extractor and aggregator. Recent Pathology Foundation Models (PFMs), pretrained on large-scale histopathology data, have significantly enhanced both the extractor and aggregator, but they lack a systematic analysis framework. In this survey, we present a hierarchical taxonomy organizing PFMs through a top-down philosophy applicable to foundation model analysis in any domain: model scope, model pretraining, and model design. Additionally, we systematically categorize PFM evaluation tasks into slide-level, patch-level, multimodal, and biological tasks, providing comprehensive benchmarking criteria. Our analysis identifies critical challenges in both PFM development (pathology-specific methodology, end-to-end pretraining, data-model scalability) and utilization (effective adaptation, model maintenance), paving the way for future directions in this promising field. Resources referenced in this survey are available at https://github.com/BearCleverProud/AwesomeWSI.
8585: A Survey on Bandit Learning in Matching Markets
Authors: Shuai Li, Zilong Wang, Fang Kong
Location: Guangzhou | Day: TBD
The two-sided matching market problem has attracted extensive research in both computer science and economics due to its wide-ranging applications in multiple fields. In various online matching platforms, market participants often have unclear preferences. As a result, a growing area of research focuses on the online scenario, in which participants on one side (players) gradually figure out their unknown preferences through multiple rounds of interaction with participants on the other side (arms). This survey comprehensively reviews and systematically organizes the abundant literature on bandit learning in matching markets. It covers not only existing theoretical achievements but also various other related aspects. Based on the current research, several distinct directions for future study have emerged. We are convinced that delving deeper into these directions could potentially yield theoretical algorithms that are more suitable for real-world situations.
8621: Generative Multi-Agent Collaboration in Embodied AI: A Systematic Review
Authors: Di Wu, Xian Wei, Guang Chen, Hao Shen, Bo Jin
Location: Guangzhou | Day: TBD
Embodied multi-agent systems (EMAS) have attracted growing attention for their potential to address complex, real-world challenges in areas such as logistics and robotics. Recent advances in foundation models pave the way for generative agents capable of richer communication and adaptive problem-solving. This survey provides a systematic examination of how EMAS can benefit from these generative capabilities. We propose a taxonomy that categorizes EMAS by system architectures and embodiment modalities, emphasizing how collaboration spans both physical and virtual contexts. The central building blocks (perception, planning, communication, and feedback) are then analyzed to illustrate how generative techniques bolster system robustness and flexibility. Through concrete examples, we demonstrate the transformative effects of integrating foundation models into embodied, multi-agent frameworks. Finally, we discuss challenges and future directions, underlining the significant promise of EMAS to reshape the landscape of AI-driven collaboration.
8640: Understanding PII Leakage in Large Language Models: A Systematic Survey
Authors: Shuai Cheng, Zhao Li, Shu Meng, Mengxia Ren, Haitao Xu, Shuai Hao, Chuan Yue, Fan Zhang
Location: Guangzhou | Day: TBD
Large Language Models (LLMs) have demonstrated exceptional success across a variety of tasks, particularly in natural language processing, leading to their growing integration into numerous facets of daily life. However, this widespread deployment has raised substantial privacy concerns, especially regarding personally identifiable information (PII), which can be directly associated with specific individuals. The leakage of such information presents significant real-world privacy threats. In this paper, we conduct a systematic investigation into existing research on PII leakage in LLMs, encompassing commonly utilized PII datasets, evaluation metrics, and current studies on both PII leakage attacks and defensive strategies. Finally, we identify unresolved challenges in the current research landscape and suggest future research directions.
8684: Tensor Network: from the Perspective of AI4Science and Science4AI
Authors: Junchi Yan, Yehui Tang, Xinyu Ye, Hao Xiong, Xiaoqiu Zhong, Yuhan Wang, Yuan Qi
Location: Guangzhou | Day: TBD
Tensor networks have been a promising numerical tool for computational problems across science and AI. Given their rapid development, especially at the intersection of AI and science, this paper presents a compact review of both their applications and their recent technical developments, including open-source tools. Specifically, we observe that tensor networks play a functional role in matrix compression and representation, information fusion, and quantum-inspired algorithms, which we generally regard as Science4AI in our survey. On the other hand, there is an emerging line of research on tensor networks in AI4Science, such as learning quantum many-body physics using, e.g., neural network quantum states. Importantly, we unify tensorization methodologies across classical and modern architectures, show in particular how tensorization bridges low-order parameter spaces to high-dimensional representations without exponential parameter growth, and point out their potential use in scientific computing. We conclude the paper with an outlook on future trends.
8721: Graph Neural Networks for Databases: A Survey
Authors: Ziming Li, Youhuan Li, Yuyu Luo, Guoliang Li, Chuxu Zhang
Location: Guangzhou | Day: TBD
Graph neural networks (GNNs) are powerful deep learning models for graph-structured data, demonstrating remarkable success across diverse domains. Recently, the database (DB) community has increasingly recognized the potential of GNNs, prompting a surge of research focused on improving database systems through GNN-based approaches. However, despite notable advances, there is still no comprehensive review of how GNNs can improve DB systems.
Therefore, this survey aims to bridge this gap by providing a structured and in-depth overview of GNNs for DB systems. Specifically, we propose a new taxonomy that classifies existing methods into two key categories: (1) Relational Databases, which includes tasks like performance prediction, query optimization, and Text-to-SQL, and (2) Graph Databases, addressing challenges like efficient graph query processing and graph similarity computation. We systematically review key methods in each category, highlighting their contributions and practical implications.
Finally, we suggest promising avenues for integrating GNNs into database systems.
8770: Safety of Embodied Navigation: A Survey
Authors: Zixia Wang, Jia Hu, Ronghui Mu
Location: Guangzhou | Day: TBD
As large language models (LLMs) continue to advance and gain influence, the development of embodied AI has accelerated, drawing significant attention, particularly in navigation scenarios. Embodied navigation requires an agent to perceive, interact with, and adapt to its environment while moving toward a specified target in unfamiliar settings. However, the integration of embodied navigation into critical applications raises substantial safety concerns. Given their deployment in dynamic, real-world environments, ensuring the safety of such systems is critical. This survey provides a comprehensive analysis of safety in embodied navigation from multiple perspectives, encompassing attack strategies, defense mechanisms, and evaluation methodologies. Beyond conducting a comprehensive examination of existing safety challenges, mitigation technologies, and various datasets and metrics that assess effectiveness and robustness, we explore unresolved issues and future research directions in embodied navigation safety. These include potential attack methods, mitigation strategies, more reliable evaluation techniques, and the implementation of verification frameworks. By addressing these critical gaps, this survey aims to provide valuable insights that can guide future research toward the development of safer and more reliable embodied navigation systems. Furthermore, the findings of this study have broader implications for enhancing societal safety and increasing industrial efficiency.
8788: Large Language Models for Causal Discovery: Current Landscape and Future Directions
Authors: Guangya Wan, Yunsheng Lu, Yuqi Wu, Mengxuan Hu, Sheng Li
Location: Guangzhou | Day: TBD
Causal discovery (CD) and Large Language Models (LLMs) have emerged as transformative fields in artificial intelligence that have evolved largely independently. While CD specializes in uncovering cause-effect relationships from data, and LLMs excel at natural language processing and generation, their integration presents unique opportunities for advancing causal understanding. This survey examines how LLMs are transforming CD across three key dimensions: direct causal extraction from text, integration of domain knowledge into statistical methods, and refinement of causal structures. We systematically analyze approaches that leverage LLMs for CD tasks, highlighting their innovative use of metadata and natural language for causal inference. Our analysis reveals both LLMs’ potential to enhance traditional CD methods and their current limitations as imperfect expert systems. We identify key research gaps, outline evaluation frameworks and benchmarks for LLM-based causal discovery, and advocate future research efforts for leveraging LLMs in causality research. As the first comprehensive examination of the synergy between LLMs and CD, this work lays the groundwork for future advances in the field.
8835: A Survey on the Feedback Mechanism of LLM-based AI Agents
Authors: Zhipeng Liu, Xuefeng Bai, Kehai Chen, Xinyang Chen, Xiucheng Li, Yang Xiang, Jin Liu, Hong-Dong Li, Yaowei Wang, Liqiang Nie, Min Zhang
Location: Guangzhou | Day: TBD
Large language models (LLMs) are increasingly being adopted to develop general-purpose AI agents. However, it remains challenging for these LLM-based AI agents to efficiently learn from feedback and iteratively optimize their strategies. To address this challenge, tremendous efforts have been dedicated to designing diverse feedback mechanisms for LLM-based AI agents. To provide a comprehensive overview of this rapidly evolving field, this paper presents a systematic review of these studies, offering a holistic perspective on the feedback mechanisms in LLM-based AI agents. We begin by discussing the construction of LLM-based AI agents, introducing a generalized framework that encapsulates much of the existing work. Next, we delve into the exploration of feedback mechanisms, categorizing them into four distinct types: internal feedback, external feedback, multi-agent feedback, and human feedback. Additionally, we provide an overview of evaluation protocols and benchmarks specifically tailored for LLM-based AI agents. Finally, we highlight the significant challenges and identify potential directions for future studies. The relevant papers are summarized and will be consistently updated at https://github.com/kevinson7515/Agents-Feedback-Mechanisms.
8879: How to Enable LLM with 3D Capacity? A Survey of Spatial Reasoning in LLM
Authors: Jirong Zha, Yuxuan Fan, Xiao Yang, Chen Gao, Xinlei Chen
Location: Guangzhou | Day: TBD
3D spatial understanding is essential in real-world applications such as robotics, autonomous vehicles, virtual reality, and medical imaging. Recently, Large Language Models (LLMs), having demonstrated remarkable success across various domains, have been leveraged to enhance 3D understanding tasks, showing potential to surpass traditional computer vision methods. In this survey, we present a comprehensive review of methods integrating LLMs with 3D spatial understanding. We propose a taxonomy that categorizes existing methods into three branches: image-based methods deriving 3D understanding from 2D visual data, point cloud-based methods working directly with 3D representations, and hybrid modality-based methods combining multiple data streams. We systematically review representative methods along these categories, covering data representations, architectural modifications, and training strategies that bridge textual and 3D modalities. Finally, we discuss current limitations, including dataset scarcity and computational challenges, while highlighting promising research directions in spatial perception, multi-modal fusion, and real-world applications.
8886: A Unifying Perspective on Model Reuse: From Small to Large Pre-Trained Models
Authors: Da-Wei Zhou, Han-Jia Ye
Location: Guangzhou | Day: TBD
Machine learning has rapidly progressed, resulting in a vast repository of both general and specialized models that address diverse practical needs. Reusing pre-trained models (PTMs) from public model zoos has emerged as an effective strategy, leveraging rich model resources and reshaping traditional machine learning workflows. These PTMs encapsulate valuable inductive biases beneficial for downstream tasks. Well-designed reuse strategies enable models to be adapted beyond their original scope, enhancing both performance and efficiency in target machine learning systems. This survey offers a unifying perspective on model reuse, establishing connections across various domains and presenting a novel taxonomy that encompasses the full lifecycle of PTM utilization—including selection from model zoos, adaptation techniques, and related areas such as model representation learning. We delve into the similarities and distinctions between reusing specialized and general PTMs, providing insights into their respective advantages and limitations. Furthermore, we discuss key challenges, emerging trends, and future directions in model reuse, aiming to guide research and practice in the era of large-scale pre-trained models. A comprehensive list of papers about model reuse is available at https://github.com/LAMDA-Model-Reuse/Awesome-Model-Reuse.
8902: A Comprehensive and Systematic Review for Deep Learning-Based De Novo Peptide Sequencing
Authors: Jun Xia, Jingbo Zhou, Shaorong Chen, Tianze Ling, Stan Z. Li
Location: Guangzhou | Day: TBD
Tandem mass spectrometry (MS/MS) has revolutionized the field of proteomics, enabling the high-throughput identification of proteins. However, one of the central challenges in mass spectrometry-based proteomics remains peptide identification, especially in the absence of a comprehensive peptide database. While traditional database search methods compare observed mass spectra to pre-existing protein databases, they are limited by the availability and completeness of these databases. De novo peptide sequencing, which derives peptide sequences directly from mass spectra, has emerged as a crucial approach in such cases. In recent years, deep learning methods have made significant strides in this domain. These methods train deep neural networks for translating mass spectra into peptide sequences without relying on any pre-constructed databases.
Despite significant progress, this field still lacks a comprehensive and systematic review. In this paper, we provide the first review of deep learning-based de novo peptide sequencing techniques from the perspectives of data types, model architectures, decoding strategies, and evaluation metrics. We also identify key challenges and highlight promising avenues for future research, providing a valuable resource for both the AI and scientific communities.
8905: Neuro-Symbolic Artificial Intelligence: Towards Improving the Reasoning Abilities of Large Language Models
Authors: Xiao-Wen Yang, Jie-Jing Shao, Lan-Zhe Guo, Bo-Wen Zhang, Zhi Zhou, Lin-Han Jia, Wang-Zhou Dai, Yu-Feng Li
Location: Guangzhou | Day: TBD
Large Language Models (LLMs) have shown promising results across various tasks, yet their reasoning capabilities remain a fundamental challenge. Developing AI systems with strong reasoning capabilities is regarded as a crucial milestone in the pursuit of Artificial General Intelligence (AGI) and has garnered considerable attention from both academia and industry. Various techniques have been explored to enhance the reasoning capabilities of LLMs, with neuro-symbolic approaches being a particularly promising direction. This paper comprehensively reviews recent developments in neuro-symbolic approaches for enhancing LLM reasoning. We first present a formalization of reasoning tasks and give a brief introduction to the neuro-symbolic learning paradigm. Then, we discuss neuro-symbolic methods for improving the reasoning capabilities of LLMs from three perspectives: Symbolic->LLM, LLM->Symbolic, and LLM+Symbolic. Finally, we discuss several key challenges and promising future directions. We have also released a GitHub repository including papers and resources related to this survey: https://github.com/LAMDASZ-ML/Awesome-LLM-Reasoning-with-NeSy.
9000: Deep Learning for Multivariate Time Series Imputation: A Survey
Authors: Jun Wang, Wenjie Du, Yiyuan Yang, Linglong Qian, Wei Cao, Keli Zhang, Wenjia Wang, Yuxuan Liang, Qingsong Wen
Location: Guangzhou | Day: TBD
Missing values are ubiquitous in multivariate time series (MTS) data, posing significant challenges for accurate analysis and downstream applications. In recent years, deep learning-based methods have successfully handled missing data by leveraging complex temporal dependencies and learned data distributions. In this survey, we provide a comprehensive summary of deep learning approaches for multivariate time series imputation (MTSI) tasks. We propose a novel taxonomy that categorizes existing methods based on two key perspectives: imputation uncertainty and neural network architecture. Furthermore, we summarize existing MTSI toolkits with a particular emphasis on the PyPOTS Ecosystem, which provides an integrated and standardized foundation for MTSI research. Finally, we discuss key challenges and future research directions, offering insights for further MTSI research. This survey aims to serve as a valuable resource for researchers and practitioners in the field of time series analysis and missing data imputation tasks. A well-maintained MTSI paper and tool list is available at https://github.com/WenjieDu/Awesome_Imputation.
9246: Reward Models in Deep Reinforcement Learning: A Survey
Authors: Rui Yu, Shenghua Wan, Yucen Wang, Chen-Xiao Gao, Le Gan, Zongzhang Zhang, De-Chuan Zhan
Location: Guangzhou | Day: TBD
In reinforcement learning (RL), agents continually interact with the environment and use the feedback to refine their behavior. To guide policy optimization, reward models are introduced as proxies of the desired objectives, such that when the agent maximizes the accumulated reward, it also fulfills the task designer’s intentions. Recently, significant attention from both academic and industrial researchers has focused on developing reward models that not only align closely with the true objectives but also facilitate policy optimization. In this survey, we provide a comprehensive review of reward modeling techniques within the RL literature. We begin by outlining the background and preliminaries in reward modeling. Next, we present an overview of recent reward modeling approaches, categorizing them based on the source, the mechanism, and the reward learning paradigm. Building on this understanding, we discuss various applications of these reward modeling techniques and review methods for evaluating reward models. Finally, we conclude by highlighting promising research directions in reward modeling. Altogether, this survey covers both established and emerging methods, addressing the absence of a systematic review of reward models in the current literature.
9258: Multimodal Retina Image Analysis Survey: Datasets, Tasks and Methods
Authors: Hongwei Sheng, Heming Du, Xin Shen, Sen Wang, Xin Yu
Location: Guangzhou | Day: TBD
Retina images provide a noninvasive view of the central nervous system and microvasculature, making them essential for clinical applications. Changes in the retina often indicate both ophthalmic and systemic diseases, aiding diagnosis and early intervention. While deep learning algorithms have advanced retina image analysis, a comprehensive review of related datasets, tasks, and benchmarking is still lacking. In this survey, we systematically categorize existing retina image datasets based on their available data modalities and review the tasks these datasets support in multimodal retina image analysis. We also explain key evaluation metrics used in various retina image analysis benchmarks. By thoroughly examining current datasets and methods, we highlight the challenges and limitations of existing benchmarks and discuss potential research topics in the field. We hope this work will guide future retina analysis methods and promote the shared use of existing data across different tasks.