Group projects for the USFQ class in Data Mining using Python

Nicolas749/DataMiningProjects

 
 


Project 1: Word counting using Python

Parallelism and Distributed computing

Notes:

MapReduce.py is the working file.

  • Counting works and is parallelized correctly; what is still missing is a way to show the parallelism graphically.
  • Implement checkpoints to keep track of each step of the process. For example, prompt "Continue? Y/N" after the split, map, and reduce steps, so that each stage is explicitly shown to work.
  • Consider making it truly distributed, for example by using your own computers as a cluster.
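
As a starting point for the first two notes, here is a minimal sketch of a split/map/reduce word count with optional "Continue? Y/N" checkpoints. It is not the contents of MapReduce.py: the function names (`split_text`, `map_chunk`, `reduce_counts`, `checkpoint`, `word_count`), the use of `multiprocessing.Pool`, and the sample text are all assumptions made for illustration. Each map call reports the process ID and elapsed time of the worker that ran it, which is one simple, text-only way to make the parallelism visible before building a real graph. Note that the pool does not guarantee that every chunk lands on a different worker.

```python
import os
import time
from collections import Counter
from multiprocessing import Pool


def split_text(text, n_chunks):
    """Split step: divide the lines of `text` into at most n_chunks groups."""
    lines = text.splitlines()
    size = max(1, -(-len(lines) // n_chunks))  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def map_chunk(lines):
    """Map step: count words in one chunk and record which process did it."""
    start = time.perf_counter()
    counts = Counter(word.lower() for line in lines for word in line.split())
    return os.getpid(), time.perf_counter() - start, counts


def reduce_counts(partials):
    """Reduce step: merge the per-chunk counters into one total."""
    total = Counter()
    for counts in partials:
        total.update(counts)
    return total


def checkpoint(step, interactive):
    """Pause after a step; returns False if the user answers N."""
    if not interactive:
        return True
    return input(f"{step} done. Continue? Y/N ").strip().lower() != "n"


def word_count(text, workers=2, interactive=False):
    """Run split, map, and reduce, stopping early if a checkpoint is declined."""
    chunks = split_text(text, workers)
    if not checkpoint("Split", interactive):
        return None
    with Pool(workers) as pool:
        mapped = pool.map(map_chunk, chunks)
    # One line per chunk: which process handled it and how long it took.
    for pid, elapsed, counts in mapped:
        print(f"worker pid={pid} words={sum(counts.values())} time={elapsed:.6f}s")
    if not checkpoint("Map", interactive):
        return None
    total = reduce_counts(counts for _, _, counts in mapped)
    checkpoint("Reduce", interactive)
    return total


if __name__ == "__main__":
    # Hypothetical sample input; replace with the contents of a real file.
    sample = "the quick brown fox\nthe lazy dog\nThe end"
    print(word_count(sample).most_common(3))
```

Calling `word_count(text, interactive=True)` enables the prompts after each step. The `(pid, elapsed)` pairs returned by the map step could later be fed to a plotting library to draw one bar per worker, which would cover the graphical part of the first note.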
