Research Projects

Core Research Achievements & Open Source Initiatives

The STAIR research group works on Scalable and Trustworthy AI Research, spanning large-scale graph mining, trustworthy foundation models, and scalable algorithms. Our mission is to bridge large language models (LLMs) and big graph mining to tackle real-world challenges. Below are our key research projects and contributions to the AI community.

Context Engineering for Large Language Models: A Comprehensive Survey

A comprehensive survey of context engineering, the systematic optimization of information payloads for LLMs. It establishes a unified framework for context-aware AI systems and reveals critical research gaps in long-form generation capabilities.

EagleMine: Beyond outliers and on to micro-clusters: Vision-guided Anomaly Detection

EagleMine is a novel tree-based mining approach to recognize and summarize the micro-clusters in the histogram.

spartan2: an open-source graph and time series mining package, under active development, based on sparse tensors/matrices and sequential analysis.

spartan2 is a collection of data mining algorithms for big graphs and time series, supporting three basic tasks: anomaly detection, forecasting, and summarization.

SpecGreedy: Unified Dense Subgraph Detection (Best Student DM paper)

Fast algorithms based on spectral theory for unified dense subgraph detection in large graphs.
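
For readers new to dense subgraph detection, the sketch below shows the classic greedy peeling heuristic for the densest-subgraph problem. It is not SpecGreedy itself, only a well-known baseline that illustrates the kind of density objective (edges per node) such methods optimize. The function name greedy_densest_subgraph and the toy graph are illustrative assumptions.

```python
from collections import defaultdict


def greedy_densest_subgraph(edges):
    """Classic greedy peeling (a baseline, not SpecGreedy).

    Repeatedly removes the minimum-degree node and remembers the node set
    with the highest density |E| / |V| seen along the way.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    nodes = set(adj)
    num_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    best_density, best_nodes = 0.0, set()

    while nodes:
        density = num_edges / len(nodes)
        if density > best_density:
            best_density, best_nodes = density, set(nodes)
        # Peel the node with the smallest remaining degree.
        victim = min(nodes, key=lambda n: len(adj[n]))
        for nbr in adj[victim]:
            adj[nbr].discard(victim)
        num_edges -= len(adj[victim])
        nodes.remove(victim)
        del adj[victim]

    return best_nodes, best_density


# Toy graph (hypothetical): a 4-clique {a, b, c, d} with a sparse tail d-e-f.
toy_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"),
             ("b", "d"), ("c", "d"), ("d", "e"), ("e", "f")]
dense_nodes, dense_score = greedy_densest_subgraph(toy_edges)
print(sorted(dense_nodes), dense_score)  # ['a', 'b', 'c', 'd'] 1.5
```

On the toy graph, peeling strips the tail nodes first and returns the 4-clique, whose density of 6 edges over 4 nodes is the highest seen.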

FlowScope: Spotting Money Laundering Based on Graphs

A scalable algorithm for detecting money laundering in financial networks using multipartite graph modeling to trace complete fund flows from source to destination accounts.

CatchCore: Catching hierarchical dense subtensor

CatchCore is a novel framework for detecting hierarchical dense cores in multi-aspect data (i.e., tensors).

BeatGAN: Anomalous Rhythm Detection using Adversarially Generated Time Series

An unsupervised anomaly detection algorithm for time series data using adversarial generation, specifically designed for detecting anomalous patterns in rhythmic sequences like ECG readings.

EigenPulse: Detecting surges in large streaming graphs with row augmentation

EigenPulse is a streaming algorithm that detects surges within sliding windows of a graph stream in real time.
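
To make the idea of surge detection over sliding windows concrete, here is a minimal count-based sketch. It is not the EigenPulse algorithm: it simply flags a window whose edge count sits far above the counts of the preceding windows. The function name detect_surges, its parameters, and the threshold rule are all illustrative assumptions.

```python
from collections import deque
from statistics import mean, pstdev


def detect_surges(timestamps, window, stride, history=5, threshold=3.0):
    """Flag sliding windows whose edge count is far above recent history.

    A simple count-based baseline (not EigenPulse).
    timestamps: sorted event times, one per edge in the stream.
    window, stride: sliding-window length and step, in the same time unit.
    Returns a list of (window_start, count) for flagged windows.
    """
    if not timestamps:
        return []
    surges = []
    recent = deque(maxlen=history)  # counts of the most recent windows
    start, end_time = timestamps[0], timestamps[-1]
    lo = hi = 0
    while start <= end_time:
        # Two pointers: timestamps[lo:hi] covers [start, start + window).
        while lo < len(timestamps) and timestamps[lo] < start:
            lo += 1
        while hi < len(timestamps) and timestamps[hi] < start + window:
            hi += 1
        count = hi - lo
        if len(recent) == history:
            mu, sigma = mean(recent), pstdev(recent)
            # Floor sigma at 1.0 so a perfectly flat history is not oversensitive.
            if count > mu + threshold * max(sigma, 1.0):
                surges.append((start, count))
        recent.append(count)
        start += stride
    return surges


# Hypothetical stream: one edge per time unit, with a burst of 30 edges at t = 20.
stream = list(range(20)) + [20] * 30 + list(range(21, 30))
print(detect_surges(stream, window=2, stride=2))  # [(20, 31)]
```

Because each timestamp is visited once by each pointer, the sketch runs in a single pass over the stream, which is the property that makes window-based detectors usable in real time.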

HoloScope: Topology-and-Spike Aware Fraud Detection

A holistic fraud detection system that leverages graph topology, temporal spikes, and rating deviations to accurately identify fraudulent user groups with sub-quadratic time complexity.