Long Chen

The Hong Kong University of Science and Technology

Towards Efficient Multimodal Reasoning Models

Today’s pretrained foundation models have demonstrated astonishing abilities in different applications. Hundreds of foundation models have been proposed during the past few years. Although significant progress has been achieved, there are still several challenges in designing stronger but more efficient foundation models. In this talk, I will share some recent work on building efficient multimodal reasoning models.

Hong Kong University of Science and Technology (HKUST). He is leading the research group: LONG Group (https://long-group.cse.ust.hk/). Before joining HKUST, he was a postdoctoral research scientist at Columbia University. He obtained his Ph.D. degree from Zhejiang University, and he was also a visiting student at NTU & NUS. His primary research interests are Computer Vision, Machine Learning

Yali Du

King’s College London

Towards Cooperative AI Agents

From collaborative industrial robots to personal AI assistants, the integration of AI into our daily lives highlights the critical need for effective and reliable coordination among agents, as well as between agents and humans. This challenge centers on creating agents that not only align with user intentions but also possess the flexibility to adapt to evolving circumstances, such as the introduction of novel agents. The pursuit of multi-agent cooperation extends beyond individual interactions to encompass broader societal considerations. In this talk, I will discuss the challenges of cooperative AI and our contributions to multi-agent cooperation, human-AI coordination, and cooperative alignment.

Josiah Hanna

University of Wisconsin–Madison

Deploying Reinforcement Learning with Confidence via Active and Offline Policy Evaluation

Recent years have seen a surge of interest in reinforcement learning (RL) as a powerful method for enabling AI agents to learn how to act so as to achieve the goals set by their designers. In practice, a crucial question in RL applications is how to decide when a learned policy is performant enough for deployment and, just as importantly, when a learned policy should not be deployed. In this talk, I will describe recent work from my group on methods that aim to enable RL practitioners to answer this question and thus to enable the use of RL in domains where extensive testing of learned policies is difficult or impossible. I will first talk about a line of work in my group on offline policy evaluation (OPE), or predicting the performance of an untested policy using data from previously used policies. The key novelty in these works is to leverage state abstraction and representation learning to scale OPE methods to more complex domains such as robot control. Then I will discuss a line of work on active data collection for data-efficient evaluation of RL policies. In this line of work, we have shown how to adaptively collect data in order to effectively evaluate an RL policy with as few real-world interactions as possible. Taken together, these lines of work are an important step toward instilling confidence in decision-making trained with RL.

Zuozhu Liu

Roberto Martin-Martin

Reuth Mirsky

Department of Computer Science, Tufts University

Agents, Autonomy and Disobedience

Human-AI and human-robot interaction often frames artificial agents as obedient assistants that are designed to follow instructions and meet expectations. But what if this paradigm is limiting the true potential of collaborative AI?

I challenge the assumption that autonomy should always be constrained by compliance. I present a scale of autonomy for AI agents and use it to argue that intelligent disobedience can be not only beneficial but essential to cooperation. I will use guide dogs as inspiration to discuss several exciting manifestations of agency and intelligent disobedience in AI and robotics: reasoning about other agents, initiating an interaction, teaching teammates, and more.

Marynel Vázquez

Assistant Prof., Computer Science Department, Yale University

The Quest for Generalizable Robot Autonomy in Situated Human-Robot Interactions

Robots hold significant promise for contributing to social good across various domains. For example, robots may help us learn new skills, assist us in completing tasks or provide emotional support. To be successful in these endeavors in real-world human environments, robots need to be robust to changes in their social contexts, such as changes in individual users, group interactions, the physical environment, etc.

I will describe two lines of research critical to achieving more generalizable robot autonomy in situated human-robot interactions. First, I will describe a unified perspective for reasoning about social contexts in HRI that exploits the underlying relational structure of the data. This perspective is motivated by a need to computationally model various aspects of social contexts in HRI and, ultimately, aims to enable more generalizable social robot behavior policies. Second

Quanming Yao

Department of Electronic Engineering, Tsinghua University

Structure-Aware Learning: Evolving Topological Learning Techniques for Vertical Domains

The inherent structure within data across various vertical domains, from molecular biology to knowledge graphs, offers a powerful scaffold for machine learning. This talk will explore the evolution of topological learning techniques, spanning from classical graph-based models to the forefront of multi-agent systems. We will begin with an introduction to Graph Neural Networks (GNNs), a widely used architecture for modeling complex topological structure in tasks like molecular property prediction and knowledge graph learning. Next, we examine the integration of Large Language Models (LLMs) into topological learning, a paradigm that unifies structured and textual data. By leveraging the capabilities of LLMs, this paradigm enables interpretable inference over complex knowledge graphs. Finally, we will explore the latest advancements, in which multi-agent systems with optimizable topological structures are designed to solve complex tasks. Overall, this presentation outlines the recent progression of topological learning, from GNNs to LLMs and agents, showcasing a powerful paradigm for building sophisticated and adaptable AI solutions for science and industry.

Chuxu Zhang

Associate Professor of Computer Science and Engineering, University of Connecticut

Graph Machine Learning: Effectiveness, Efficiency, and Safety

Graph data is ubiquitous in real-world applications, and graph machine learning has emerged as a transformative force in advancing AI over the past decade. In this talk, I will present my research in graph machine learning, centered around three key dimensions: effectiveness, efficiency, and safety. I will discuss the development of models and algorithms that not only deliver strong predictive performance but also promote scalability and trustworthiness. I will also showcase how these methods are applied across diverse domains—including healthcare, social media, recommender systems, and natural language processing—to address pressing societal challenges through principled and impactful model design.
