GoMR is a super-fast, super-simple, super-easy-to-debug MapReduce framework for Go. It was written to deploy MapReduce jobs without dealing with the JVM, for easier debugging and better performance, and to write code in Go!
See examples/wordcount/parallel for the canonical wordcount MapReduce program. To build, cd into the directory and run go build. Then, run with ./parallel <textfile>.
To write jobs for GoMR, we first need to create an object that satisfies the
interfaces found in gomr.go. Namely:
type Mapper interface {
	Map(in <-chan interface{}, out chan<- interface{})
}

type Partitioner interface {
	Partition(in <-chan interface{}, outs []chan interface{}, wg *sync.WaitGroup)
}

type Reducer interface {
	Reduce(in <-chan interface{}, out chan<- interface{}, wg *sync.WaitGroup)
}

type Job interface {
	Mapper
	Partitioner
	Reducer
}
Second, we need to supply data to the input channel of the mapper. We can do
this manually, or use one of the handy methods found in input.go:
inMapChans, outChan := gomr.RunLocal(m, r, wc)
gomr.TextFileParallel(os.Args[1], inMapChans)
Here, m and r are the numbers of mappers and reducers, and wc is an object satisfying the Job interface.