AVTrustBench: Assessing and Enhancing Reliability and Robustness in Audio-Visual LLMs.
Sanjoy Chowdhury, Sayan Nag, Subhrajyoti Dasgupta, Yaoting Wang, Mohamed Elhoseiny, Ruohan Gao, Dinesh Manocha.
(2025). Accepted by The IEEE International Conference on Computer Vision (ICCV 2025).
On Path to Multimodal Generalist: General-level and General-bench.
Hao Fei*, Yuan Zhou*, Juncheng Li*, Xiangtai Li*, Qingshan Xu*, Bobo Li*, Shengqiong Wu*, Yaoting Wang,
Junbao Zhou, Jiahao Meng, Qingyu Shi, Zhiyuan Zhou, Liangtao Shi, Minghe Gao, Daoan Zhang, Zhiqi Ge,
Siliang Tang, Kaihang Pan, Yaobo Ye, Haobo Yuan, Tao Zhang, Weiming Wu, Tianjie Ju, Zixiang Meng, Shilin Xu,
Liyu Jia, Wentao Hu, Meng Luo, Jiebo Luo, Tat-Seng Chua, Shuicheng Yan, Hanwang Zhang.
(2025). Accepted by The 42nd International Conference on Machine Learning (ICML 2025 Oral).
Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes with Natural Language.
Yaoting Wang*, Peiwen Sun*, Dongzhan Zhou, Guangyao Li, Honggang Zhang, Di Hu^.
(2024). Accepted by The 18th European Conference on Computer Vision (ECCV 2024).
Can Textual Semantics Mitigate Sounding Object Segmentation Preference?
Yaoting Wang*, Peiwen Sun*, Yuanchao Li, Honggang Zhang, Di Hu^.
(2024). Accepted by The 18th European Conference on Computer Vision (ECCV 2024).
Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation.
Juncheng Ma, Peiwen Sun, Yaoting Wang, Di Hu^.
(2024). Accepted by The 18th European Conference on Computer Vision (ECCV 2024).
Prompting Segmentation with Sound is Generalizable Audio-Visual Source Localizer.
Yaoting Wang*, Weisong Liu*, Guangyao Li, Jian Ding, Di Hu^, Xi Li.
(2023). Accepted by The 38th AAAI Conference on Artificial Intelligence (AAAI 2024).
Scaling Up Mobile Service Selection in Edge Computing Environment with Cuckoo Optimization Algorithm.
Ming Zhu, Feilong Yu, Xiukun Yan, Jing Li, Yaoting Wang.
(2021). Accepted by The 2021 IEEE International Conference on Services Computing (SCC 2021).
Cross-Attention is Not Enough: Incongruity-Aware Dynamic Hierarchical Fusion for Multimodal Affect Recognition.
Yaoting Wang*, Yuanchao Li*, Paul Pu Liang, Louis-Philippe Morency, Peter Bell, Catherine Lai.
(2023). arXiv.