- VideoPrism: A Foundational Visual Encoder for Video Understanding
Long Zhao*, Nitesh B. Gundavarapu*, Liangzhe Yuan*, Hao Zhou*, Shen Yan#, Jennifer J. Sun#, Luke Friedman#, Rui Qian#, Tobias Weyand, Yue Zhao, Rachel Hornung, Florian Schroff, Ming-Hsuan Yang, David A. Ross, Huisheng Wang, Hartwig Adam, Mikhail Sirotenko+, Ting Liu+, Boqing Gong+. (*equal primary contributions; #equal core technical contributions; +equal senior contributions)
International Conference on Machine Learning (ICML), 2024. [arXiv] [Blog Post] [GitHub]
- VideoGLUE: Video General Understanding Evaluation of Foundation Models
Liangzhe Yuan*, Nitesh B. Gundavarapu*, Long Zhao*, Hao Zhou*, Yin Cui, Lu Jiang, Xuan Yang, Menglin Jia, Tobias Weyand, Luke Friedman, Mikhail Sirotenko, Huisheng Wang, Florian Schroff, Hartwig Adam, Ming-Hsuan Yang, Ting Liu, Boqing Gong. (*equal technical contributions)
Transactions on Machine Learning Research (TMLR), 2024. [arXiv] [GitHub]