Onetail/textAnalytics

Text Analytics System

Basic Environment

  • System: Ubuntu 16.04.4 LTS
  • Python: 3.5.2
  • Kafka: 0.11.0.2 (Scala 2.11 build, kafka_2.11-0.11.0.2)
  • Network: Alibaba Cloud (behind the Great Firewall)

Flow

  • Create a Kafka topic
sudo ./kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic topicName
  • Write the Python web crawlers with pykafka
  • Build a TF-IDF model for each article
  • Pipe the output to the Python text analytics
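
As a rough illustration of the crawler-to-Kafka step, the sketch below publishes crawled articles to the topic and reads them back with pykafka. It is not the repository's crawler.py or mqtool.py: the JSON message layout, the helper names (encode_article, decode_article, publish, consume), and the broker address 127.0.0.1:9092 are assumptions made for this example. Only the topic name topicName comes from the command above.

```python
import json


def encode_article(category, title, body):
    """Serialize one article to UTF-8 JSON bytes (Kafka messages are bytes).

    The field names are hypothetical, chosen only for this sketch.
    """
    payload = {"category": category, "title": title, "body": body}
    return json.dumps(payload, ensure_ascii=False).encode("utf-8")


def decode_article(raw):
    """Inverse of encode_article: bytes -> dict."""
    return json.loads(raw.decode("utf-8"))


def publish(articles, hosts="127.0.0.1:9092", topic_name=b"topicName"):
    """Send each article dict to the topic with a synchronous producer.

    hosts is an assumed broker address; adjust it to your setup.
    """
    # Imported here so the helpers above work without pykafka installed.
    from pykafka import KafkaClient

    client = KafkaClient(hosts=hosts)
    topic = client.topics[topic_name]
    with topic.get_sync_producer() as producer:
        for article in articles:
            producer.produce(encode_article(**article))


def consume(hosts="127.0.0.1:9092", topic_name=b"topicName", timeout_ms=5000):
    """Yield decoded articles until no message arrives for timeout_ms."""
    from pykafka import KafkaClient

    client = KafkaClient(hosts=hosts)
    topic = client.topics[topic_name]
    consumer = topic.get_simple_consumer(consumer_timeout_ms=timeout_ms)
    for message in consumer:
        if message is not None:
            yield decode_article(message.value)
```

With a broker running, publish([{"category": "politics", "title": "...", "body": "..."}]) followed by a loop over consume() would round-trip the articles. A synchronous producer is used here for simplicity; it waits for each message to be acknowledged, which is slower but easier to reason about than the asynchronous one.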

Usage

  • Get the latest news for the politics or entertainment category
python3 crawler.py politics 
python3 crawler.py entertainment
  • Get messages from Kafka
python3 mqtool.py 
  • Analyze the messages
python3 nlp.py politics 
python3 nlp.py entertainment
  • Schedule automatic runs with crontab
crontab -e
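
The Flow section mentions building a TF-IDF model for each article. The following is a minimal, standard-library-only sketch of that idea, not the code in nlp.py. The function names and the simple lowercase alphanumeric tokenizer are assumptions for illustration. Articles in Chinese would likely need a proper word segmenter instead.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Naive tokenizer (assumption): lowercase runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    weight = (term count / document length) * log(N / document frequency)
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Number of documents in which each term appears at least once.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def top_terms(doc_weights, k=3):
    """Highest-weighted terms of one document; ties are broken alphabetically."""
    ranked = sorted(doc_weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


if __name__ == "__main__":
    articles = [
        "the senate passed the bill",
        "the film won the award",
        "the senate debated",
    ]
    for weights in tf_idf(articles):
        print(top_terms(weights))
```

Terms that occur in every article get a weight of zero under this unsmoothed formula, so common words drop out of the ranking without a stop-word list.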
