Yubo Ma

SketchFaceGS: Real-Time Sketch-Driven Face Editing and Generation with Gaussian Splatting

Apr 21, 2026
Via arXiv

ClinConsensus: A Consensus-Based Benchmark for Evaluating Chinese Medical LLMs across Difficulty Levels

Mar 03, 2026
Via arXiv

EMemBench: Interactive Benchmarking of Episodic Memory for VLM Agents

Jan 23, 2026
Via arXiv

PLawBench: A Rubric-Based Benchmark for Evaluating LLMs in Real-World Legal Practice

Jan 23, 2026
Via arXiv

Towards Storage-Efficient Visual Document Retrieval: An Empirical Study on Reducing Patch-Level Embeddings

Jun 05, 2025
Via arXiv

MTR-Bench: A Comprehensive Benchmark for Multi-Turn Reasoning Evaluation

May 26, 2025
Via arXiv

Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks

Apr 26, 2025
Via arXiv

Synergistic Weak-Strong Collaboration by Aligning Preferences

Apr 22, 2025
Via arXiv

InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model

Jan 21, 2025
Via arXiv

Long Context vs. RAG for LLMs: An Evaluation and Revisits

Dec 27, 2024
Via arXiv