Multimedia Processing and Coding
Applications of LLMs in video coding, compression, and multimedia processing for efficient data representation

Research on Large Language Models in Multimedia Processing and Coding

Revolutionizing Video Compression with AI
Harnessing Multimodal LLMs for Efficient Video Coding

Auditory LLMs for Speech Quality Evaluation
Advancing automated speech assessment through large language models

Accelerating Video LLMs with Dynamic Token Compression
Solving the efficiency bottleneck in video processing models

Error-Resilient Image Compression
Making neural image codecs robust against packet loss

Bridging Modalities in AI
Understanding Connectors in Multi-modal LLMs

Rethinking Position Embeddings for Video LLMs
Enhancing video understanding with VRoPE architecture

Real-Time Video Intelligence: ReKV
Enabling streaming video question-answering without reprocessing

STORM: Revolutionizing Long Video Analysis
Token-Efficient Processing for Multimodal LLMs

Next-Gen Video Understanding
Boosting Inference Efficiency with Image Packing & AoE Architecture

Making Video LLMs Faster & Lighter
1.x-Bit KV Cache Quantization for Memory Efficiency

Dynamic Token Representation for Video LLMs
Overcoming efficiency barriers in video processing for large language models

Efficient Text-to-Video Generation
Making Video AI Accessible for Resource-Limited Devices

Next-Gen Video Compression
GIViC: Reimagining Video Compression with Generative AI
