This repository contains implementation details for “TACOS: Temporally-aligned Audio CaptiOnS for Language-Audio Pretraining”. TACOS is a dataset with strong captions, i.e., textual descriptions of ...