It’s happened to all of us: you find the perfect model for your needs — a bracket, a box, a cable clip, but it only comes in ...
In this work, we propose the Structurebased Pseudo Label generation (SPL) framework for the zero-shot video sentence localization task, which learns with only video data without any annotation. We ...