I am a third-year CS PhD student at UC Santa Barbara, advised by William Wang in the Natural Language Processing Group. I obtained my bachelor’s degree from Chu Kochen Honors College, Zhejiang University.
My research focuses on developing advanced multimodal models that can enhance their intelligence through interactions with humans and the real world.
News! Check out our Vision Arena demo on Hugging Face! You can chat directly with large multimodal models (GPT-4V, Gemini Pro Vision, LLaVA-NeXT 34B, Qwen-VL-Chat, etc.) or compare them side by side.