Special Track on AI4Tech: AI Enabling Critical Technologies Papers

6758: CycSeq: Leveraging Cyclic Data Generation for Accurate Perturbation Prediction in Single-Cell RNA-Seq

Authors: Yicheng Liu, Sai Wu, Tianyun Zhang, Chang Yao, Ning Shen

Location: Montreal | Day: August 20th | Time: 10:00 | Session: AI4Tech (1/3)

Understanding and predicting the effects of cellular perturbations using single-cell sequencing technology remains a critical and challenging problem in biotechnology. In this work, we introduce CycSeq, a deep learning framework that leverages cyclic data generation and recent advances in neural architectures to predict single-cell responses under specified perturbations across multiple cell lines, while also generating the corresponding single-cell expression profiles. Specifically, CycSeq addresses the challenge of learning heterogeneous perturbation responses from unpaired single-cell gene expression data by generating pseudo-pairs through cyclic data generation. Experimental results demonstrate that CycSeq outperforms existing methods in perturbation prediction tasks, as evaluated using computational metrics such as R-squared and MAE. Furthermore, CycSeq employs a unified architecture that integrates information from multiple cell lines, enabling robust predictions even for long-tail cell lines with limited training data. The source code is publicly available at https://github.com/yczju/cycseq.
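
For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of their standard definitions. It is not taken from the CycSeq code base: the function names and the toy expression values are hypothetical, and the abstract does not say how the authors aggregate these metrics over genes or cells.

```python
def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy values standing in for per-gene expression levels.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

mae_value = mae(observed, predicted)
r2_value = r_squared(observed, predicted)
```

A lower MAE and an R-squared closer to 1 both indicate predictions closer to the observed profile.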

7596: Eye-See-You: Reverse Pass-Through VR and Head Avatars

Authors: Ankan Dash, Jingyi Gu, Guiling Wang, Chen Chen

Location: Montreal | Day: August 22nd | Time: 10:00 | Session: AI4Tech (3/3)

Virtual Reality (VR) headsets, while integral to the evolving digital ecosystem, present a critical challenge: the occlusion of users’ eyes and portions of their faces, which hinders visual communication and may contribute to social isolation. To address this, we introduce RevAvatar, an innovative framework that leverages AI methodologies to enable reverse pass-through technology, fundamentally transforming VR headset design and interaction paradigms. RevAvatar integrates state-of-the-art generative models and multimodal AI techniques to reconstruct high-fidelity 2D facial images and generate accurate 3D head avatars from partially observed eye and lower-face regions. This framework represents a significant advancement in AI4Tech by enabling seamless interaction between virtual and physical environments, fostering immersive experiences such as VR meetings and social engagements. Additionally, we present VR-Face, a novel dataset comprising 200,000 samples designed to emulate diverse VR-specific conditions, including occlusions, lighting variations, and distortions. By addressing fundamental limitations in current VR systems, RevAvatar exemplifies the transformative synergy between AI and next-generation technologies, offering a robust platform for enhancing human connection and interaction in virtual environments.

7774: Parameter-Efficient Fine-Tuning with Circulant and Diagonal Vectors

Authors: Xinyu Ding, Lexuan Chen, Siyu Liao, Zhongfeng Wang

Location: Montreal | Day: August 21st | Time: 15:00 | Session: AI4Tech (2/3)

Foundation models have achieved tremendous success in different domains. However, their huge computation and storage complexity makes these models difficult to fine-tune and less applicable in practice. A recent study shows that training in the Fourier domain can be an effective fine-tuning method in terms of both model performance and number of trainable parameters. In this work, we propose to further reduce the complexity through factorization into a product of interleaved circulant and diagonal matrices. In addition, we address the case of non-square fine-tuning weights by partitioning the circulant matrix into blocks. Our method avoids constructing the weight change matrix and uses the 1D fast Fourier transform (FFT) instead of the 2D FFT. Experimental results show that our method achieves similar or better performance across various tasks with far fewer floating-point operations (FLOPs) and trainable parameters.
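
The link between circulant structure and the 1D FFT can be illustrated with a short NumPy sketch. This is a generic demonstration, not the authors' implementation: the function names are hypothetical, and only a single circulant-times-diagonal factor is shown. It relies on the standard fact that multiplying by a circulant matrix equals circular convolution with its first column, so the n x n matrix never has to be built.

```python
import numpy as np


def circulant_matvec(c, x):
    """Multiply the circulant matrix with first column c by x using 1D FFTs."""
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))


def circulant_diag_apply(c, d, x):
    """Compute C @ diag(d) @ x while storing only the vectors c and d."""
    return circulant_matvec(c, d * x)


def dense_circulant(c):
    """Build the explicit circulant matrix (used here only as a check)."""
    n = len(c)
    return np.array([[c[(i - j) % n] for j in range(n)] for i in range(n)])


rng = np.random.default_rng(0)
n = 8
c, d, x = rng.normal(size=n), rng.normal(size=n), rng.normal(size=n)

fast = circulant_diag_apply(c, d, x)
dense = dense_circulant(c) @ np.diag(d) @ x
max_abs_err = float(np.max(np.abs(fast - dense)))
```

In this sketch the factor is parameterized by 2n numbers (c and d) rather than n squared, and the product costs O(n log n) instead of O(n^2).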

8493: Diversity-Aware Reinforcement Learning for de novo Drug Design

Authors: Hampus Gummesson Svensson, Christian Tyrchan, Ola Engkvist, Morteza Haghir Chehreghani

Location: Montreal | Day: August 22nd | Time: 10:00 | Session: AI4Tech (3/3)

Fine-tuning a pre-trained generative model has demonstrated good performance in generating promising drug molecules. The fine-tuning task is often formulated as a reinforcement learning problem, where previous methods efficiently learn to optimize a reward function to generate potential drug molecules. Nevertheless, in the absence of an adaptive update mechanism for the reward function, the optimization process can become stuck in local optima. The efficacy of the optimal molecule in a local optimization may not translate to usefulness in the subsequent drug optimization process or as a potential standalone clinical candidate. Therefore, it is important to generate a diverse set of promising molecules. Prior work has modified the reward function by penalizing structurally similar molecules, primarily focusing on finding molecules with higher rewards. To date, no study has comprehensively examined how different adaptive update mechanisms for the reward function influence the diversity of generated molecules. In this work, we investigate a wide range of intrinsic motivation methods and strategies to penalize the extrinsic reward, and how they affect the diversity of the set of generated molecules. Our experiments reveal that combining structure- and prediction-based methods generally yields better results in terms of diversity.

8604: The Graph’s Apprentice: Teaching an LLM Low-Level Knowledge for Circuit Quality Estimation

Authors: Reza Moravej, Saurabh Bodhe, Zhanguang Zhang, Didier Chételat, Dimitrios Tsaras, Yingxue Zhang, Hui-Ling Zhen, Jianye Hao, Mingxuan Yuan

Location: Montreal | Day: August 20th | Time: 10:00 | Session: AI4Tech (1/3)

Logic synthesis is a crucial phase in the circuit design process, responsible for transforming hardware description language (HDL) designs into optimized netlists. However, traditional logic synthesis methods are computationally intensive, restricting their iterative use in refining chip designs. Recent advancements in large language models (LLMs), particularly those fine-tuned on programming languages, present a promising alternative. This work proposes augmenting LLMs with predictor networks trained to estimate circuit quality directly from HDL code. To enhance performance, the model is regularized using embeddings from graph neural networks (GNNs) trained on Look-Up Table (LUT) graphs, thereby incorporating lower-level circuit insights. The proposed method demonstrates superior performance compared to existing graph-based RTL-level estimation techniques on the established benchmark OpenABCD, while providing instant feedback on HDL code quality.

8620: DL-KDD: Dual-Lightness Knowledge Distillation for Action Recognition in the Dark

Authors: Chi-Jui Chang, Oscar Tai-Yuan Chen, Vincent S. Tseng

Location: Montreal | Day: August 21st | Time: 15:00 | Session: AI4Tech (2/3)

Human action recognition in dark videos is a challenging task for computer vision due to the low quality of videos filmed in the dark. Recent studies have focused on applying dark enhancement methods to improve the visibility of the video. However, such video processing results in the loss of critical information in the original (un-enhanced) video. Conversely, traditional two-stream methods are capable of learning information from both original and enhanced videos, but they can lead to a significant increase in computational cost. To address these challenges, we propose a novel knowledge-distillation-based framework, named Dual-Lightness KnowleDge Distillation (DL-KDD), which simultaneously resolves the aforementioned issues by enabling a student model to obtain both original features and light-enhanced knowledge without additional complexity, thus improving the performance of the model and avoiding extra computational cost. Through comprehensive evaluations, the proposed DL-KDD, with only original video required as input during the inference phase, significantly outperforms state-of-the-art methods on the widely used dark video datasets. The results highlight the excellence of our proposed knowledge-distillation-based framework for dark video human action recognition.
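
As background on the general technique, the sketch below shows a classic logit-based knowledge distillation loss, in which a student is trained on the ground-truth label and on a teacher's softened predictions. It is an assumption-laden illustration rather than the DL-KDD objective: the abstract does not specify the loss, and the function names, `alpha`, and `temperature` are hypothetical choices. In the setting described above, the teacher would see light-enhanced video and the student only the original video.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      alpha=0.5, temperature=2.0):
    """Weighted sum of cross-entropy on the label and KL(teacher || student)."""
    hard = -math.log(softmax(student_logits)[label])
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(pt * math.log(pt / ps) for pt, ps in zip(p_teacher, p_student))
    # The temperature**2 factor is the usual rescaling of the soft term.
    return (1 - alpha) * hard + alpha * temperature ** 2 * soft


# Hypothetical logits for a 3-class action recognition problem.
student = [1.0, 0.5, -0.5]
teacher = [2.0, 0.0, -1.0]
loss = distillation_loss(student, teacher, label=0)
```

Because the teacher is needed only to compute the training loss, inference runs the student alone, which matches the abstract's point about avoiding extra computational cost.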

8622: DeCo: Defect-Aware Modeling with Contrasting Matching for Optimizing Task Assignment in Online IC Testing

Authors: Lo Pang-Yun Ting, Yu-Hao Chiang, Yi-Tung Tsai, Hsu-Chao Lai, Kun-Ta Chuang

Location: Montreal | Day: August 21st | Time: 15:00 | Session: AI4Tech (2/3)

In the semiconductor industry, integrated circuit (IC) processes play a vital role, as the rising complexity and market expectations necessitate improvements in yield. Identifying IC defects and assigning IC testing tasks to the right engineers improves efficiency and reduces losses. While current studies emphasize fault localization or defect classification, they overlook the integration of defect characteristics, historical failures, and the insights from engineer expertise, which restrains their effectiveness in improving IC handling. To leverage AI for these challenges, we propose DeCo, an innovative approach for optimizing task assignment in IC testing. DeCo constructs a novel defect-aware graph from IC testing reports, capturing co-failure relationships to enhance defect differentiation, even with scarce defect data. Additionally, it formulates defect-aware representations for engineers and tasks, reinforced by local and global structure modeling on the defect-aware graph. Finally, a contrasting-based assignment mechanism pairs testing tasks with QA engineers by considering their skill level and current workload, thus promoting an equitable and efficient job dispatch. Experiments on a real-world dataset demonstrate that DeCo achieves the highest task-handling success rates in different scenarios, exceeding 80%, while also maintaining balanced workloads on both scarce and expanded defect data. Moreover, case studies reveal that DeCo can assign tasks to potentially capable engineers, even for defects unfamiliar to them, highlighting its potential as an AI-driven solution for real-world IC failure analysis and task handling.

8634: CoFinDiff: Controllable Financial Diffusion Model for Time Series Generation

Authors: Yuki Tanaka, Ryuji Hashimoto, Takehiro Takayanagi, Zhe Piao, Yuri Murayama, Kiyoshi Izumi

Location: Montreal | Day: August 22nd | Time: 10:00 | Session: AI4Tech (3/3)

The generation of synthetic financial data is a critical technology in the financial domain, addressing challenges posed by limited data availability. Traditionally, statistical models have been employed to generate synthetic data. However, these models fail to capture the stylized facts commonly observed in financial data, limiting their practical applicability. Recently, machine learning models have been introduced to address the limitations of statistical models; however, controlling synthetic data generation remains challenging. We propose CoFinDiff (Controllable Financial Diffusion model), a synthetic financial data generation model based on conditional diffusion models that accept conditions about the synthetic time series. By incorporating conditions derived from price data into the conditional diffusion model via cross-attention, CoFinDiff learns the relationships between the conditions and the data, generating synthetic data that align with arbitrary conditions. Experimental results demonstrate that: (i) synthetic data generated by CoFinDiff capture stylized facts; (ii) the generated data accurately meet specified conditions for trends and volatility; (iii) the diversity of the generated data surpasses that of the baseline models; and (iv) models trained on CoFinDiff-generated data achieve improved performance in a deep hedging task.

8668: MolHFCNet: Enhancing Molecular Graph Representations with Hierarchical Feature Combining and Hybrid Pretraining

Authors: Duy-Long Nguyen, Duc-Luong Ho-Viet, Anh-Thu Ngo-Tran, Quang H. Nguyen, Binh P. Nguyen

Location: Montreal | Day: August 21st | Time: 15:00 | Session: AI4Tech (2/3)

Efficient molecular property prediction is crucial in bioinformatics and cheminformatics, with applications in drug discovery, materials science, and chemical engineering. This paper introduces MolHFCNet, a graph neural network designed to enhance molecular representation learning. At its core, the n-Hierarchical Features Combining (n-HFC) module aggregates information across multiple hierarchical feature spaces, effectively capturing both local and global graph structures. Unlike conventional models, n-HFC maintains computational complexity comparable to a single full-dimensional graph layer while supporting either 2D or 3D molecular graphs, ensuring flexibility across tasks. Furthermore, we propose a novel graph pretraining strategy that integrates predictive and contrastive learning, enabling the model to capture local chemical interactions and global molecular contexts for robust embeddings. Experimental results on benchmark datasets demonstrate MolHFCNet’s superior accuracy and efficiency compared to state-of-the-art methods, highlighting the potential of high-order hierarchical feature learning for advancing molecular graph analysis. Our code is available at https://github.com/ndlongvn/MolHFCNet.

8887: DeepFeatIoT: Unifying Deep Learned, Randomized, and LLM Features for Enhanced IoT Time Series Sensor Data Classification in Smart Industries

Authors: Muhammad Sakib Khan Inan, Kewen Liao

Location: Montreal | Day: August 21st | Time: 15:00 | Session: AI4Tech (2/3)

Internet of Things (IoT) sensors are ubiquitous technologies deployed across smart cities, industrial sites, and healthcare systems. They continuously generate time series data that enable advanced analytics and automation in industries. However, challenges such as the loss or ambiguity of sensor metadata, heterogeneity in data sources, varying sampling frequencies, inconsistent units of measurement, and irregular timestamps make raw IoT time series data difficult to interpret, undermining the effectiveness of smart systems. To address these challenges, we propose a novel deep learning model, DeepFeatIoT, which integrates learned local and global features with non-learned randomized convolutional kernel-based features and features from large language models (LLMs). This straightforward yet unique fusion of diverse learned and non-learned features significantly enhances IoT time series sensor data classification, even in scenarios with limited labeled data. Our model’s effectiveness is demonstrated through its consistent and generalized performance across multiple real-world IoT sensor datasets from diverse critical application domains, outperforming state-of-the-art benchmark models. These results highlight DeepFeatIoT’s potential to drive significant advancements in IoT analytics and support the development of next-generation smart systems.

8919: CogTwin: A Hybrid Cognitive Architecture Framework for Adaptable and Cognitive Digital Twins

Authors: Sukanya Mandal, Noel E. O’Connor

Location: Montreal | Day: August 21st | Time: 15:00 | Session: AI4Tech (2/3)

Current Digital Twin (DT) technology lacks the cognitive capabilities needed for true autonomy and intelligent adaptation. This paper introduces CogTwin, a hybrid cognitive architecture framework for developing Cognitive Digital Twins (CDTs). CogTwin integrates a 50 ms cognitive cycle inspired by human cognition, dual knowledge graphs (static Domain Knowledge Repository (DKR) and dynamic Internal Knowledge Graph (DIKG)), a hybrid attention mechanism, and self-healing capabilities. Combining symbolic, sub-symbolic, and neuro-symbolic AI, CogTwin enables real-time learning and decision-making. Simulated smart city scenarios, including traffic incident management and power outage response, demonstrate CogTwin’s potential. Preliminary performance evaluations of the pseudocode suggest feasibility of the target 50 ms cycle. The architecture also incorporates explainable AI (XAI) for transparency and human-CogTwin collaboration. CogTwin contributes towards a unified theory of cognition for DTs, laying the groundwork for more sophisticated and autonomous CDTs.

8958: AI4Contracts: LLM & RAG-Powered Encoding of Financial Derivative Contracts

Authors: Maruf Ahmed Mridul, Ian Sloyan, Aparna Gupta, Oshani Seneviratne

Location: Montreal | Day: August 20th | Time: 10:00 | Session: AI4Tech (1/3)

Large Language Models (LLMs) and Retrieval Augmented Generation (RAG) are reshaping how AI systems extract and organize information from unstructured text. A key challenge is designing AI methods that can incrementally extract, structure, and validate information while preserving hierarchical and contextual relationships. We introduce CDMizer, a template-driven, LLM- and RAG-based framework for structured text transformation. By leveraging depth-based retrieval and hierarchical generation, CDMizer ensures a controlled, modular process that aligns generated outputs with predefined schemas. Its template-driven approach guarantees syntactic correctness, schema adherence, and improved scalability, addressing key limitations of direct generation methods. Additionally, we propose an LLM-powered evaluation framework to assess the completeness and accuracy of structured representations. Demonstrated in the transformation of Over-the-Counter (OTC) financial derivative contracts into the Common Domain Model (CDM), CDMizer establishes a scalable foundation for AI-driven document understanding, structured synthesis, and automated validation in broader contexts.

9047: NSF-MAP: Neurosymbolic Multimodal Fusion for Robust and Interpretable Anomaly Prediction in Assembly Pipelines

Authors: Chathurangi Shyalika, Renjith Prasad, Fadi El Kalach, Revathy Venkataramanan, Ramtin Zand, Ramy Harik, Amit Sheth

Location: Montreal | Day: August 21st | Time: 15:00 | Session: AI4Tech (2/3)

In modern assembly pipelines, identifying anomalies is crucial in ensuring product quality and operational efficiency. Conventional single-modality methods fail to capture the intricate relationships required for precise anomaly prediction in complex predictive environments with abundant data and multiple modalities. This paper proposes a neurosymbolic AI and fusion-based approach for multimodal anomaly prediction in assembly pipelines. We introduce a time series and image-based fusion model that leverages decision-level fusion techniques. Our research builds upon three primary novel approaches in multimodal learning: time series and image-based decision-level fusion modeling, transfer learning for fusion, and knowledge-infused learning. We evaluate the novel method using our derived and publicly available multimodal dataset and conduct comprehensive ablation studies to assess the impact of our preprocessing techniques and fusion model compared to traditional baselines. The results demonstrate that a neurosymbolic AI-based fusion approach that uses transfer learning can effectively harness the complementary strengths of time series and image data, offering a robust and interpretable approach for anomaly prediction in assembly pipelines with enhanced performance. The datasets, code to reproduce the results, supplementary materials, and demo are available at https://github.com/ChathurangiShyalika/NSF-MAP.

9061: Physics-based Generative Models for Geometrically Consistent and Interpretable Wireless Channel Synthesis

Authors: Satyavrat Wagle, Akshay Malhotra, Shahab Hamidi-Rad, Aditya Sant, David J. Love, Christopher G. Brinton

Location: Montreal | Day: August 20th | Time: 10:00 | Session: AI4Tech (1/3)

In recent years, machine learning (ML) methods have become increasingly popular in wireless communication systems for several applications. A critical bottleneck for designing ML systems for wireless communications is the availability of realistic wireless channel datasets, which are extremely resource-intensive to produce. To this end, the generation of realistic wireless channels plays a key role in the subsequent design of effective ML algorithms for wireless communication systems. Generative models have been proposed to synthesize channel matrices, but outputs produced by such methods may not correspond to geometrically viable channels and do not provide any insight into the scenario being generated. In this work, we aim to address both these issues by integrating established parametric, physics-based geometric channel (PPGC) modeling frameworks with generative methods to produce realistic channel matrices with interpretable representations in the parameter domain. We show that the generative model converges to prohibitively suboptimal stationary points when learning the underlying prior directly over the parameters due to the non-convex PPGC model. To address this limitation, we propose a linearized reformulation of the problem to ensure smooth gradient flow during generative model training, while also providing insights into the underlying physical environment. We evaluate our model against prior baselines by comparing the generated, scenario-specific samples in terms of the 2-Wasserstein distance and through its utility when used for downstream compression tasks.
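
To make the evaluation metric concrete, here is a minimal sketch of the 2-Wasserstein distance in the simplest possible setting: two one-dimensional empirical samples of equal size, where the optimal coupling pairs the sorted values. This is an illustrative simplification only. The abstract does not describe how the authors compute the distance over channel samples, which are high-dimensional, and the function name and toy data are hypothetical.

```python
import math


def wasserstein2_1d(a, b):
    """2-Wasserstein distance between two equal-size 1D empirical samples."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal sample counts")
    sorted_a, sorted_b = sorted(a), sorted(b)
    mean_sq = sum((x - y) ** 2 for x, y in zip(sorted_a, sorted_b)) / len(a)
    return math.sqrt(mean_sq)


# Hypothetical toy samples: the second set is the first shifted by 0.5.
real = [0.0, 1.0, 2.0, 3.0]
shifted = [x + 0.5 for x in reversed(real)]

distance = wasserstein2_1d(real, shifted)
```

A pure shift of the distribution by a constant yields a distance equal to that constant, and identical samples yield zero, regardless of the order in which the samples are listed.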

9166: Generating Grounded Responses to Counter Misinformation via Learning Efficient Fine-Grained Critiques

Authors: Xiaofei Xu, Xiuzhen Zhang, Ke Deng

Location: Montreal | Day: August 22nd | Time: 10:00 | Session: AI4Tech (3/3)

Fake news and misinformation pose a significant threat to society, making efficient mitigation essential. However, manual fact-checking is costly and lacks scalability. Large Language Models (LLMs) offer promise in automating counter-response generation to mitigate misinformation, but a critical challenge lies in their tendency to hallucinate non-factual information. Existing models mainly rely on LLM self-feedback to reduce hallucination, but this approach is computationally expensive. In this paper, we propose MisMitiFact, Misinformation Mitigation grounded in Facts, an efficient framework for generating fact-grounded counter-responses at scale. MisMitiFact generates simple critique feedback to refine LLM outputs, ensuring responses are grounded in evidence. We develop lightweight, fine-grained critique models trained on data sourced from readily available fact-checking sites to identify and correct errors in key elements such as numerals, entities, and topics in LLM generations. Experiments show that MisMitiFact generates counter-responses of comparable quality to LLMs’ self-feedback while using significantly smaller critique models. Importantly, it achieves a ~5x increase in feedback generation throughput, making it highly suitable for cost-effective, large-scale misinformation mitigation. Code and additional results are available at https://github.com/xxfwin/MisMitiFact.