sairam-kakarla/TSLG

TSLG

Requirements

bs4

Data Source

The corpus of songs was collected from a lyrics hosting platform. It contains songs written by Shri Seetharama Sastry.

Corpus Details

Parameter                      Measure
Song count                     369
Word count                     47993
Average words per song         130
Average characters per word    6
Unique words                   19077
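
As an illustration of how figures like these could be derived, here is a minimal sketch. It assumes the corpus is available as a list of lyric strings and that words are split on whitespace; the README does not state how words were actually counted, so the function name `corpus_stats` and the tokenization are assumptions.

```python
def corpus_stats(songs):
    """Summarize a corpus given as a list of lyric strings.

    Assumption: words are whitespace-separated tokens. The actual
    tokenization used for the table above is not documented.
    """
    words = [word for song in songs for word in song.split()]
    word_count = len(words)
    song_count = len(songs)
    return {
        "songs": song_count,
        "words": word_count,
        # Guard against empty input to avoid division by zero.
        "avg_words_per_song": word_count / song_count if song_count else 0.0,
        "avg_chars_per_word": (
            sum(len(word) for word in words) / word_count if word_count else 0.0
        ),
        "unique_words": len(set(words)),
    }
```

Calling `corpus_stats(list_of_lyrics)` returns a dictionary whose keys correspond to the rows of the table.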

Modules

  • song_scrapper.py scrapes the lyrics of individual songs.
  • song_url_scrapper.py scrapes the URLs of the song lyrics webpages.

⚠️ Wait at least 2 seconds between requests to the forum.
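
One way to respect this delay is sketched below. It is not taken from the repository's scripts: `fetch_all`, `MIN_DELAY_SECONDS`, and the default fetcher are hypothetical names, and the sketch uses only the standard library rather than the project's own request code.

```python
import time
import urllib.request

# Assumed minimum pause between requests, matching the warning above.
MIN_DELAY_SECONDS = 2.0


def _default_fetch(url):
    """Download a page with the standard library (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def fetch_all(urls, fetch=_default_fetch, delay=MIN_DELAY_SECONDS, sleep=time.sleep):
    """Fetch each URL in order, pausing `delay` seconds between requests."""
    pages = []
    for index, url in enumerate(urls):
        if index > 0:
            # Pause before every request except the first one.
            sleep(delay)
        pages.append(fetch(url))
    return pages
```

The `fetch` and `sleep` parameters are injectable so the loop can be exercised without network access; the returned HTML strings could then be passed to bs4 for parsing.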