Lianyu Hu is a PhD graduate of Tianjin University, China, where he was supervised by Prof. Wei Feng. During his graduate studies, he worked closely with Prof. Shenglan Liu. His research interests include Multimodal Learning, Embodied Intelligence, and Video Understanding.
🔥 News
- We release TennisExpert, which aims to provide high-quality, real-time commentary for tennis matches. It first provides a large-scale, fine-grained, and diverse tennis dataset, termed TennisVL, to help MLLMs understand tennis. An expert model, termed TennisExpert, is then trained and achieves better performance than Gemini-3.0-pro and GPT-5.2.
- We release LightVLM, a highly efficient method for large vision-language models with a two-stage design. It improves model efficiency by first merging visual tokens in the encoding stage and then compressing the KV cache in the decoding stage. It achieves about 2× throughput across different benchmarks and a 3.21× throughput boost when generating longer sequences.
Selected Publications ($\dagger$ denotes Corresponding Author)
Technical Report
- Lianyu Hu, Wei Feng, Liqing Gao, Zekang Liu, Liang Wan. CorrNet+: Sign Language Recognition and Translation via Spatial-Temporal Correlation, 2024.04. [code].
- Lianyu Hu, Tongkai Shi, Liqing Gao, Zekang Liu, Wei Feng. Improving Continuous Sign Language Recognition with Adapted Image Models, 2024.04. [code].
Preprint
- Zhaoyu Liu, Xi Weng, Lianyu Hu$\dagger$, Zhe Hou, Kan Jiang, Jin Song Dong, Yang Liu. TennisExpert: Towards Expert-Level Analytical Sports Video Understanding. 2026.03. [code].
- Kailin Lyu, Kangyi Wu, Pengna Li, Xiuyu Hu, Qingyi Si, Cui Miao, Ning Yang, Zihang Wang, Long Xiao, Lianyu Hu$\dagger$, Jingyuan Sun$\dagger$, Ce Hao. HiMemVLN: Enhancing Reliability of Open-Source Zero-Shot Vision-and-Language Navigation with Hierarchical Memory System. 2026.03. [code].
- Lianyu Hu, Fanhua Shang, Liang Wan, Wei Feng. LightVLM: Acceleraing Large Multimodal Models with Pyramid Token Merging and KV Cache Compression. 2025.09.
Accepted Publications
- Lianyu Hu, Liqing Gao, Fanhua Shang, Liang Wan, Wei Feng. iLLaVA: An Image is Worth Fewer Than 1/3 Input Tokens in Large Multimodal Models. ICLR 2026. [code].
- Tongkai Shi, Lianyu Hu$\dagger$, Fanhua Shang, Liqing Gao, Wei Feng. GReg: Geometry-Aware Region Refinement for Sign Language Video Generation. ICCV 2025.
- Lianyu Hu, Tongkai Shi, Wei Feng, Fanhua Shang, Liang Wan. Deep Correlated Prompting for Visual Recognition with Missing Modalities. NeurIPS 2024. [code].
- Tongkai Shi, Lianyu Hu, Fanhua Shang, Jichao Feng, Peidong Liu, Wei Feng. Pose-Guided Fine-Grained Sign Language Video Generation. ECCV 2024. [code].
- Lianyu Hu, Liqing Gao, Zekang Liu, Wei Feng. Spatial Temporal Aggregation for Efficient Continuous Sign Language Recognition. IEEE Transactions on Emerging Topics in Computational Intelligence.
- Lianyu Hu, Liqing Gao, Zekang Liu, Wei Feng. Dynamic Spatial-Temporal Aggregation for Skeleton-Aware Sign Language Recognition. COLING 2024. [code].
- Lianyu Hu, Liqing Gao, Zekang Liu, Chi-Man Pun, Wei Feng. COMMA: Co-Articulated Multi-Modal Learning. AAAI 2024. [code].
- Lianyu Hu, Liqing Gao, Zekang Liu, Wei Feng. Scalable Frame Resolution for Efficient Continuous Sign Language Recognition. Pattern Recognition.
- Lianyu Hu, Liqing Gao, Zekang Liu, Chi-Man Pun, Wei Feng. AdaBrowse: Adaptive Video Browser for Efficient Continuous Sign Language Recognition. ACMMM 2023 (Oral). [code].
- Lianyu Hu, Shenglan Liu, Wei Feng. Skeleton-Based Action Recognition with Local Dynamic Spatial-Temporal Aggregation. Expert Systems with Applications. [code]. (Previous name: Spatial Temporal Graph Attention Network for Skeleton-Based Action Recognition)
- Lianyu Hu, Liqing Gao, Zekang Liu, Wei Feng. Continuous Sign Language Recognition with Correlation Network. CVPR 2023. [code].
- Lianyu Hu, Liqing Gao, Zekang Liu, Wei Feng. Self-Emphasizing Network for Continuous Sign Language Recognition. AAAI 2023 (Oral). [code].
- Lianyu Hu, Liqing Gao, Zekang Liu, Wei Feng. Temporal Lift Pooling for Continuous Sign Language Recognition. ECCV 2022. [code].
- Lianyu Hu, Liqing Gao, Zekang Liu, Wei Feng. HFNet: A Novel Model for Human Focused Sports Action Recognition. ACMMM 2020 Workshop.
Honors and Awards
- 2025.06, Outstanding Graduate
- 2024.12, Outstanding Student Pacesetter (ten per year)
- 2024.10, National Scholarship
- 2023.10, National Scholarship
Education
- 2021-2025, PhD in Computer Science and Technology, Tianjin University
- 2018-2021, MEng in Computer Science and Technology, Dalian University of Technology
- 2014-2018, BSc in Electronics and Information Engineering, Dalian University of Technology