A search engine which can hold 100 trillion lines of log data.
Updated May 22, 2017 - Go
Fast, efficient, and scalable distributed map/reduce system with DAG execution, in memory or on disk, written in pure Go; runs standalone or distributed.
Fundamentals of Spark with Python (using PySpark), code examples
Data science and Big Data with Python
Kubernetes-native platform to run massively parallel data/streaming jobs
Demonstration of using Python to process the Common Crawl dataset with the mrjob framework
Efficient transducers for Julia
Inverted indexer, web crawler, sort, search, and Porter stemmer written in Python for information retrieval.
Efficient and scalable parallelism using the Message Passing Interface (MPI) to handle big data and computationally intensive problems.
Creating an Inverted Index of words occurring in a large set of documents extracted from web pages using Hadoop MapReduce and Google Dataproc
RedisGears Python client
Parallelized Base functions
The core parallel and shared memory library used by Hack, Flow, and Pyre
Python 2.7 code and learning notes for Spark 2.1.1
Web Application Message Async Server and WAMP/MQTT bridge
Code for paper "Locally Distributed Deep Learning Inference on Edge Device Clusters"
A Java Stochastic Dynamic Programming Library