
Twitch Chat Downloader 🗒️


tcd is a multi-threaded Twitch Chat Downloader built in Rust 🦀.

Usage: tcd.exe [OPTIONS] --channel <CHANNEL>

Options:
  -c, --channel <CHANNEL>      The channel(s) to download
  -i, --client-id <CLIENT_ID>  The Twitch client ID to use in the request headers
  -f, --format <FORMAT>        Used with --output or --stdout [default: csv] [possible values: json, csv]
  -l, --limit <LIMIT>          Downloads the first n videos from each channel
  -o, --output <OUTPUT>        If specified, pipes data to the file
  -p, --postgres [<POSTGRES>]  The PostgreSQL connection string [default: DATABASE_URL env]
  -q, --quiet                  Whether to print download progress
  -s, --stdout                 If specified, pipes data to stdout
  -t, --threads <THREADS>      The number of threads to use [default: 10]
  -h, --help                   Print help information
  -V, --version                Print version information

Pipe the chat messages of the first 5 videos of Atrioc, Linkus7, and Aspecticor to the file hitman.csv:

tcd --channel atrioc --channel linkus7 --channel aspecticor --limit 5 --output hitman.csv

Building from source

# build the binary
cargo build --release

# execute the binary
target/release/tcd -c atrioc

Generating datasets

Some pre-made dataset scripts are located in the queries directory. You can run these with cargo run -p queries --example <name>.
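For example, a hedged invocation (the example name message_counts is a placeholder; substitute the name of a script in the queries directory):

# run a pre-made dataset script from the queries directory
# ("message_counts" is a placeholder name)
cargo run -p queries --example message_counts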

Using pre-made datasets

Pre-made datasets can be downloaded from the datasets branch of the repository.

Piping data to a database

tcd supports saving data directly to a PostgreSQL database. First, apply the Prisma schema with the following commands:

# apply schema.prisma to the database
# note: this WILL wipe all database content
cargo prisma migrate dev --name init

# generate the Prisma client
cargo prisma generate

Alternatively, execute the SQL statements in migration.sql against your database. Then set the DATABASE_URL environment variable (a .env file works too), or supply the connection URL with --postgres <url>.
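As a minimal sketch, assuming a local PostgreSQL instance (the connection string below is a placeholder):

# option 1: set the connection string in the environment (or in a .env file)
export DATABASE_URL=postgres://user:password@localhost:5432/tcd

# option 2: pass the connection string on the command line
tcd --channel atrioc --postgres postgres://user:password@localhost:5432/tcd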

Output format

Data piped to a file or stdout is written in one of the following formats:

--format csv

channel_id,video_id,comment_id,commenter_id,created_at,text
23211159,1642642569,3f445ae2-2f6e-4256-b367-df8132454786,157032028,"2022-11-03 21:25:22.754 +00:00","poggies"

--format json

[
  {
    "channelId": "i64",
    "videoId": "i64",
    "commentId": "string",
    "commenterId": "i64",
    "createdAt": "string",
    "text": "string"
  },
  {
    "channelId": 23211159,
    "videoId": 1642642569,
    "commentId": "3f445ae2-2f6e-4256-b367-df8132454786",
    "commenterId": 157032028,
    "createdAt": "2022-11-03 21:25:22.754 +00:00",
    "text": "poggies"
  }
]
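For instance, JSON output like the above could be produced with a command along these lines (the channel and limit are illustrative):

tcd --channel atrioc --limit 1 --format json --stdout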