Interpretable learning (currently focused on time series), Deep Learning, Natural Language Processing
Efficient Multimodal Learning, Reinforcement Learning, Multimodal Large Language Model
GUI agent, Agentic RL, Scene Understanding (Scene Graph Generation), Vision Language
Multimodal Large Language Models, New Environment Adaptation for Intelligent Agents, GUI Agent Design and Optimization
Scene Understanding, Multimodal Large Language Models, Natural Language Understanding
Scene graph generation, Multimodal large model, Visual language reasoning
Open-world Learning, Generalized Category Discovery, Domain Adaptation and Generalization
Scene Understanding (Scene Graph Generation), Vision Language, Graph Neural Networks
Out-of-Distribution Detection and Generalization, Uncertainty estimation, Learning from noisy labels