jell0720/dcard-crawler

For academic use: crawls posts and comments on Dcard for data analysis.

For the thesis, see https://hdl.handle.net/11296/gazcm2 or https://etd.lis.nsysu.edu.tw/ETD-db/ETD-search-c/view_etd?URN=etd-0805119-133659

Setup

See https://github.com/jk195417/dcard-crawler/wiki#建立實驗環境 (the wiki section on building the experiment environment).

Install gems

$ bundle install

Edit credentials

$ EDITOR="sublime --wait" bin/rails credentials:edit

Create PostgreSQL database

$ rails db:setup

Set up text analysis services (optional)

Usage

Rake tasks

rails dcard:get_comments # fetch new comments on posts that have already been crawled
rails dcard:get_posts # continuously fetch new posts (non-school)

Sidekiq

$ sidekiq # 20 threads, perform jobs in default queue
$ sidekiq -C ./config/sidekiq_baidu.yml # 5 threads, perform jobs in baidu queue
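
The second command above points Sidekiq at a separate config file. As a rough illustration of what such a file could contain, the sketch below parses a hypothetical YAML document with Ruby's standard library and prints the two settings mentioned in the comment (5 threads, the baidu queue). The YAML contents and the constant names are assumptions for illustration only; the repository's actual config/sidekiq_baidu.yml may differ.

```ruby
require "yaml"

# Hypothetical contents of a Sidekiq config file such as config/sidekiq_baidu.yml.
# This is an assumption, not a copy of the repository's file.
SIDEKIQ_BAIDU_YAML = <<~YAML
  concurrency: 5
  queues:
    - baidu
YAML

# Parse the YAML into a plain Hash with string keys.
SIDEKIQ_BAIDU_CONFIG = YAML.safe_load(SIDEKIQ_BAIDU_YAML)

# "concurrency" is the number of worker threads; "queues" lists the queues to poll.
threads = SIDEKIQ_BAIDU_CONFIG.fetch("concurrency")
queues  = SIDEKIQ_BAIDU_CONFIG.fetch("queues")

puts "threads: #{threads}"
puts "queues:  #{queues.join(', ')}"
```

Keeping the baidu queue in its own process with a lower thread count is one way to throttle calls to an external service without slowing down jobs in the default queue.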
