Specializations
- Natural language processing
- Multimodal models, including vision language models, speech, and diffusion systems
- Multilingually and multiculturally representative systems
Biography
Michael Saxon is a Siegel Postdoctoral Fellow with the Tech Policy Lab and the Information School at the University of Washington. His research sits at the intersection of generative model benchmarking, multimodality, and AI ethics. He is particularly interested in the difficult evaluation questions that arise in multimodal systems, and in developing methods that make systems performant and authentically user-responsive across languages and cultures. Saxon earned his bachelor’s in Electrical Engineering and his master’s in Computer Engineering at Arizona State University, and his Ph.D. in Computer Science at the University of California, Santa Barbara, where he was advised by William Wang.
Education
- Ph.D., Computer Science, University of California, Santa Barbara, 2025
- MS, Computer Engineering, Arizona State University, 2020
- BS, Electrical Engineering, Arizona State University, 2018
Awards
- Neal Fenzi—Resonant Founder Fellowship - University of California, Santa Barbara, 2024
- Rising Star in Generative AI - UMass Amherst Rising Stars Workshop, 2024
- Outstanding Reviewer Award - ACL 2023, 2023
- Center for Responsible Machine Learning Fellowship - University of California, Santa Barbara, 2020
- Graduate Division Central Fellowship - University of California, Santa Barbara, 2020
- National Science Foundation Graduate Research Fellowship - National Science Foundation (NSF GRFP), 2020
Publications and Contributions
- Conference Paper. Can Vision Language Models Understand Mimes? (2025). Findings of the Association for Computational Linguistics
- Conference Paper. Culture is Everywhere: A Call for Intentionally Cultural Evaluation (2025). Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2025)
- Conference Paper. Do You Know About My Nation? Investigating Multilingual Language Models’ Cultural Literacy Through Factual Knowledge (2025). Conference on Empirical Methods in Natural Language Processing (EMNLP 2025)
- Conference Paper. TC-Bench: Benchmarking Temporal Compositionality in Text-to-Video and Image-to-Video Generation (2025). Findings of the Association for Computational Linguistics
- Conference Paper. ThoughtTerminator: Benchmarking, Calibrating, and Mitigating Overthinking in Reasoning Models (2025). Second Conference on Language Modeling (COLM 2025)
- Conference Paper. VSP: Assessing the dual challenges of perception and reasoning in spatial planning tasks for VLMs (2025). Proc. International Conference on Computer Vision (ICCV 2025)
- Journal Article, Academic Journal (2024). Transactions of the Association for Computational Linguistics (TACL), 12 (2024), pp. 484-506
- Conference Paper (2024). Proceedings of the Conference on Language Modeling (COLM 2024)
- Conference Paper (2024). Proceedings of the Empirical Methods in Natural Language Processing Conference (EMNLP 2024)
- Conference Paper (2024). Proceedings of the 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024)
- Conference Paper (2024). Proceedings of the 38th Annual Conference on Neural Information Processing Systems (NeurIPS 2024)
- Conference Paper (2023). Proceedings of the Eleventh International Conference on Learning Representations (ICLR 2023)
- Conference Paper (2023). Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023)
- Conference Workshop Paper (2023). Interspeech 2023 Show and Tell
- Conference Paper (2023). Proceedings of the 37th Annual Conference on Neural Information Processing Systems (NeurIPS 2023)
- Conference Paper (2023). Proceedings of the Empirical Methods in Natural Language Processing Conference (EMNLP 2023)
- Conference Paper (2023). Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023)
- Conference Paper. PECO: Examining Single Sentence Label Leakage in Natural Language Inference Datasets (2023). Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023), pp. 3061-3074
- Conference Paper (2023). Proceedings of the Eleventh International Conference on Learning Representations (ICLR 2023)
- Conference Paper (2022). Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2022)
- Conference Paper (2022). Proceedings of the 36th edition of the Association for the Advancement of Artificial Intelligence Conference (AAAI 2022)
- Conference Paper (2021). Proceedings of the Thirty-Fifth Annual Conference on Neural Information Processing Systems (NeurIPS 2021)
- Conference Paper (2021). Interspeech 2021, pp. 4738-4742
- Conference Paper (2021). Findings of the Association for Computational Linguistics (ACL 2021)
- Conference Paper (2021). Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2021), pp. 2023-2037
- Journal Article, Academic Journal (2020). IEEE Transactions on Audio, Speech, and Language Processing, pp. 2511-2522
- Conference Paper (2020). Interspeech 2020, pp. 4273-4277
- Conference Paper (2020). Interspeech 2020, pp. 2532-2536
- Conference Paper (2019). Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (IEEE ICASSP 2019), pp. 6520-6524
- Conference Paper (2019). Interspeech 2019, pp. 2528-2532
- Conference Workshop Paper. Word Pair Convolutional Model for Happy Moment Classification (2019). AAAI AffCon Workshop 2019, pp. 111-119
- Conference Paper (2016). Proceedings of the IEEE 66th Electronic Components and Technology Conference (ECTC 2016), pp. 2222-2227
Presentations
- How to nitpick multimodal evaluations (2025). CVPR 2025 - Virtual
- Multilingual multimodal evaluation: how and why (2025). Google Translate Research - Mountain View, CA
- Rigorous measurement in text-to-image systems (2024). Georgetown University - Washington, DC
- Rigorous measurement in text-to-image systems (2024). UMD CLIP Seminar - College Park, MD
- Rigorous measurement in text-to-image systems (2024). Stanford SALT Group - Palo Alto, CA
- Disparities in Text-to-Image Model Conceptual Knowledge Across Languages (2023). 2023 ACM Conference on Fairness, Accountability, and Transparency (FAccT) - Chicago, IL
- Rigorous measurement in text-to-image systems (2023). Arizona State University - Tempe, AZ
- Rigorous measurement in text-to-image systems (2023). USC Information Sciences Institute - Marina Del Rey, CA