Pipeline to generate summaries of YouTube videos, using Whisper-Small for transcription and BART-LARGE-XSUM for summarisation. BART has been fine-tuned on the popular CNN/Daily Mail dataset, which lends itself to summarisation tasks. Initially, we attempted to fine-tune GPT-2 for the summarisation task, but found it performed poorly: as a decoder-only generative transformer, it can only produce a summary by continuing the input text one token at a time, whereas BART is an encoder-decoder model that reads the whole transcript before generating an abstractive summary. For more info on the choice of summarisation model, see this article. We use the HuggingFace Transformers library to abstract away some of the PyTorch code via its pipeline submodule.
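
The transcription and summarisation steps described above could be wired together roughly as follows. This is a minimal sketch, not the project's actual code: the model IDs `openai/whisper-small` and `facebook/bart-large-xsum` are the stock Hub checkpoints and are assumed here because the fine-tuned BART weights are not named, and the function names `build_pipelines` and `summarise_audio` are hypothetical.

```python
from typing import Any, Callable, Tuple

# Assumed stock checkpoints; swap in the fine-tuned BART weights if available.
ASR_MODEL = "openai/whisper-small"
SUMMARY_MODEL = "facebook/bart-large-xsum"


def build_pipelines() -> Tuple[Any, Any]:
    """Load both Hugging Face pipelines (models are downloaded on first use)."""
    from transformers import pipeline  # imported lazily: heavy dependency

    # chunk_length_s lets the ASR pipeline handle audio longer than 30 seconds.
    asr = pipeline("automatic-speech-recognition", model=ASR_MODEL, chunk_length_s=30)
    summariser = pipeline("summarization", model=SUMMARY_MODEL)
    return asr, summariser


def summarise_audio(audio_path: str, asr: Callable, summariser: Callable) -> str:
    """Transcribe an audio file, then summarise the transcript."""
    # The ASR pipeline returns a dict with the transcript under "text".
    transcript = asr(audio_path)["text"].strip()
    # truncation=True cuts the transcript to the model's maximum input length,
    # so very long videos are summarised from their opening portion only.
    result = summariser(transcript, truncation=True)
    # The summarisation pipeline returns a list with one dict per input.
    return result[0]["summary_text"].strip()


# Example usage (hypothetical file name):
#   asr, summariser = build_pipelines()
#   print(summarise_audio("video_audio.mp3", asr, summariser))
```

Passing the two pipelines in as arguments keeps model loading separate from the summarisation logic, so the models are loaded once and reused across videos.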
On some Google Colab instances, the yt-dlp package may not work. This is an issue with the package itself and cannot be resolved at this time. If you encounter this, I recommend trying the pytube package instead.
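
One way to handle this automatically is to try yt-dlp first and fall back to pytube if it fails. The sketch below is an illustration under assumptions rather than the project's own download code: the helper names (`download_with_yt_dlp`, `download_with_pytube`, `download_audio`) and the output file layout are hypothetical, and both packages are assumed to be installed.

```python
from typing import Callable, Sequence


def download_with_yt_dlp(url: str, out_dir: str = ".") -> str:
    """Download the best available audio stream with yt-dlp; return the file path."""
    import yt_dlp  # imported lazily so a missing package triggers the fallback

    opts = {
        "format": "bestaudio/best",
        "outtmpl": f"{out_dir}/%(id)s.%(ext)s",
        "quiet": True,
    }
    with yt_dlp.YoutubeDL(opts) as ydl:
        info = ydl.extract_info(url, download=True)
        return ydl.prepare_filename(info)


def download_with_pytube(url: str, out_dir: str = ".") -> str:
    """Download an audio-only stream with pytube; return the file path."""
    from pytube import YouTube

    stream = YouTube(url).streams.filter(only_audio=True).first()
    if stream is None:
        raise RuntimeError("no audio-only stream found")
    return stream.download(output_path=out_dir)


def download_audio(
    url: str,
    downloaders: Sequence[Callable[[str], str]] = (
        download_with_yt_dlp,
        download_with_pytube,
    ),
) -> str:
    """Try each downloader in order and return the first successful file path."""
    errors = []
    for downloader in downloaders:
        try:
            return downloader(url)
        except Exception as exc:  # broad on purpose: any failure moves to the next option
            errors.append(f"{downloader.__name__}: {exc}")
    raise RuntimeError("all downloaders failed: " + "; ".join(errors))
```

Calling `download_audio(url)` returns the path of the downloaded audio file, which can then be passed to the transcription step.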