Zimbabwean Sign Language Vocabulary Actions

Incorporating Scene Graphs into Pre-trained Vision-Language Models for Multimodal Open-vocabulary Action Recognition

Abstract: This paper presents Action-SGFA, a novel action feature alignment approach to learn unified joint embeddings across four action modalities incorporating scene graph (SG) comprehension. A new ...

IEEE

Vision-Language Adaptive Clustering and Meta-Adaptation for Unsupervised Few-Shot Action Recognition

Abstract: Unsupervised few-shot action recognition is a practical but challenging task, which adapts knowledge learned from unlabeled videos to novel action classes with only limited labeled data.

