We introduce the task of automatic human action co-occurrence identifica...
Joint vision-language models have shown great performance over a diverse...
Recent progress in large language models has enabled the deployment of m...
We aim to investigate the performance of current OCR systems on low reso...
We consider the task of temporal human action localization in lifestyle
...
We aim to automatically identify human action reasons in online videos. ...
Inspiration moves a person to see new possibilities and transforms the w...
We consider the task of identifying human actions visible in online vide...