Skip to content
/ distl Public

A chrome extension that extracts relevant text from a webpage and performs text summarization and sentiment analysis. Made with Judy Chen, Henry Du, and Jerry Feng at Hack the North 2016.

License

Notifications You must be signed in to change notification settings

dqian96/distl

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

distl

distl provides the client with a “distilled” version of articles as they browse the web. The client has to do very little - distl automatically extracts relevant text from a webpage and gives back a summarized version with sentiment analysis.

Components:

HTML/CSS/JS Frontend:

  • Beautifully designed
  • Filters out ads from relevant article text
  • Sends text to backend
  • Recieves and displays summarized text and sentiment analysis
  • Sends user ratings on text analysis to backend

Node.js Backend:

  • Recieves source text from frontend and caches it
  • Performs text summarization using a Node.js plugin
  • Sends text to Google Natural Language API for sentiment analysis
  • Aggregates results and sends analysis back to frontend
  • Recieves user rating on analysis and sends the cached sourced text to Google Storage

Google Cloud Platform:

  • Hosts the Node.js backend using Google App Engine
  • A VM created by Google Compute Engine uses CRON to schedule PySpark jobs on a Google Hadoop cluster
  • Spark apps analyze a day's worth of source text and user ratings (stored in Google Storage) for patterns (i.e. most frequently visited site, number of of dislike v.s. likes for a particular site, etc.)

About

A chrome extension that extracts relevant text from a webpage and performs text summarization and sentiment analysis. Made with Judy Chen, Henry Du, and Jerry Feng at Hack the North 2016.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published