Huaibo Huang

UniPrefill: Universal Long-Context Prefill Acceleration via Block-wise Dynamic Sparsification

May 07, 2026

Advancing Vision Transformer with Enhanced Spatial Priors

Apr 20, 2026

StableIDM: Stabilizing Inverse Dynamics Model against Manipulator Truncation via Spatio-Temporal Refinement

Apr 20, 2026

Are GUI Agents Focused Enough? Automated Distraction via Semantic-level UI Element Injection

Apr 09, 2026

MVPBench: A Multi-Video Perception Evaluation Benchmark for Multi-Modal Video Understanding

Mar 24, 2026

Think 360°: Evaluating the Width-centric Reasoning Capability of MLLMs Beyond Depth

Mar 24, 2026

Tuning Real-World Image Restoration at Inference: A Test-Time Scaling Paradigm for Flow Matching Models

Mar 23, 2026

GenVideoLens: Where LVLMs Fall Short in AI-Generated Video Detection?

Mar 19, 2026

Random Wins All: Rethinking Grouping Strategies for Vision Tokens

Feb 28, 2026

UniWeTok: An Unified Binary Tokenizer with Codebook Size $\mathit{2^{128}}$ for Unified Multimodal Large Language Model

Feb 15, 2026