andy-yang

andy-yang-1

AI & ML interests

None yet

Recent Activity

authored a paper 2 days ago

BlendServe: Optimizing Offline Inference for Auto-regressive Large Models with Resource-aware Batching

authored a paper 2 days ago

Radial Attention: $O(n\log n)$ Sparse Attention with Energy Decay for Long Video Generation

authored a paper 2 days ago

SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse-Linear Attention

authored 10 papers 2 days ago

BlendServe: Optimizing Offline Inference for Auto-regressive Large Models with Resource-aware Batching

Paper • 2411.16102 • Published Nov 25, 2024

Radial Attention: $O(n\log n)$ Sparse Attention with Energy Decay for Long Video Generation

Paper • 2506.19852 • Published Jun 24, 2025 • 42

SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse-Linear Attention

Paper • 2509.24006 • Published Sep 28, 2025 • 118

Barbarians at the Gate: How AI is Upending Systems Research

Paper • 2510.06189 • Published Oct 7, 2025 • 9

StreamDiffusionV2: A Streaming System for Dynamic and Interactive Video Generation

Paper • 2511.07399 • Published Nov 10, 2025 • 1

WorldModelBench: Judging Video Generation Models As World Models

Paper • 2502.20694 • Published Feb 28, 2025

Jet-RL: Enabling On-Policy FP8 Reinforcement Learning with Unified Training and Rollout Precision Flow

Paper • 2601.14243 • Published Jan 20 • 23

Quant VideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization

Paper • 2602.02958 • Published Feb 3 • 34

SVG-EAR: Parameter-Free Linear Compensation for Sparse Video Generation via Error-aware Routing

Paper • 2603.08982 • Published 4 days ago • 15

Flash-KMeans: Fast and Memory-Efficient Exact K-Means

Paper • 2603.09229 • Published 3 days ago • 58

authored 6 papers 10 months ago

S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Paper • 2311.03285 • Published Nov 6, 2023 • 31

Rethinking Benchmark and Contamination for Language Models with Rephrased Samples

Paper • 2311.04850 • Published Nov 8, 2023

Post-Training Sparse Attention with Double Sparsity

Paper • 2408.07092 • Published Aug 11, 2024

Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity

Paper • 2502.01776 • Published Feb 3, 2025 • 3

Twilight: Adaptive Attention Sparsity with Hierarchical Top-$p$ Pruning

Paper • 2502.02770 • Published Feb 4, 2025 • 1

Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation

Paper • 2505.18875 • Published May 24, 2025 • 42
