Guang Yang

PhD Student

University of Washington



I’m a third-year PhD student at the Paul G. Allen School of Computer Science & Engineering, University of Washington, advised by Prof. Noah Smith.

My research interests center on:

  • multimodal learning, with a focus on both understanding and generating content from images and videos;
  • multimodality in music, particularly images (pictures of music scores), symbolic data (structured music scores), and their relationship with natural language.

Please feel free to contact me for potential collaborations.

Previously, I was an undergraduate in the Yao Class at Tsinghua University.

Recent Publications

(2025). LEGATO: Large-scale End-to-end Generalizable Approach to Typeset OMR.

(2025). MMMG: a Comprehensive and Reliable Evaluation Suite for Multitask Multimodal Generation.

(2024). Toward a More Complete OMR Solution. Proceedings of the 25th International Society for Music Information Retrieval Conference.

(2023). Video Event Extraction via Tracking Visual States of Arguments. Proceedings of the AAAI Conference on Artificial Intelligence.

(2023). Parameter-efficient fine-tuning of large-scale pre-trained language models. Nature Machine Intelligence.