Shuchang Zhou

LEGO: LoRA-Enabled Generator-Oriented Framework for Synthetic Image Detection

May 06, 2026

MASRA: MLLM-Assisted Semantic-Relational Consistent Alignment for Video Temporal Grounding

May 05, 2026

Enhancing Self-Supervised Talking Head Forgery Detection via a Training-Free Dual-System Framework

May 05, 2026

Step-Audio-AQAA: a Fully End-to-End Expressive Large Audio Language Model

Jun 10, 2025

DialogueReason: Rule-Based RL Sparks Dialogue Reasoning in LLMs

May 11, 2025

Step-Video-TI2V Technical Report: A State-of-the-Art Text-Driven Image-to-Video Generation Model

Mar 14, 2025

InPK: Infusing Prior Knowledge into Prompt for Vision-Language Models

Feb 27, 2025

Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction

Feb 18, 2025

CoSER: Coordinating LLM-Based Persona Simulation of Established Roles

Feb 13, 2025

UniScene: Unified Occupancy-centric Driving Scene Generation

Dec 06, 2024