Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes with Natural Language.
Yaoting Wang*, Peiwen Sun*, Dongzhan Zhou, Guangyao Li, Honggang Zhang, Di Hu^.
(2024). Accepted by The 18th European Conference on Computer Vision (ECCV 2024).
Can Textual Semantics Mitigate Sounding Object Segmentation Preference?
Yaoting Wang*, Peiwen Sun*, Yuanchao Li, Honggang Zhang, Di Hu^.
(2024). Accepted by The 18th European Conference on Computer Vision (ECCV 2024).
Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation.
Juncheng Ma, Peiwen Sun, Yaoting Wang, Di Hu^.
(2024). Accepted by The 18th European Conference on Computer Vision (ECCV 2024).
Prompting Segmentation with Sound is Generalizable Audio-Visual Source Localizer.
Yaoting Wang*, Weisong Liu*, Guangyao Li, Jian Ding, Di Hu^, Xi Li. (2023).
Accepted by The 38th AAAI Conference on Artificial Intelligence (AAAI 2024).
Scaling Up Mobile Service Selection in Edge Computing Environment with Cuckoo Optimization Algorithm.
Ming Zhu, Feilong Yu, Xiukun Yan, Jing Li, Yaoting Wang. (2021).
Accepted by 2021 IEEE International Conference on Services Computing (SCC 2021).
Pre-print
Cross-Attention is Not Enough: Incongruity-Aware Dynamic Hierarchical Fusion for Multimodal Affect Recognition.
Yaoting Wang*, Yuanchao Li*, Paul Pu Liang, Louis-Philippe Morency, Peter Bell, Catherine Lai. (2023).
arXiv.