Example source code accompanying O'Reilly's "Hadoop: The Definitive Guide" by Tom White

Mona19/hadoop-book

Example code for "Hadoop: The Definitive Guide, Third Edition" by Tom White.
Copyright (C) 2011 Tom White, 978-1-449-31152-0

https://www.hadoopbook.com/
https://oreilly.com/catalog/9781449311520/

The code is hosted at https://github.com/tomwhite/hadoop-book/. You can find code
for the first edition at https://github.com/tomwhite/hadoop-book/tree/1e, and
for the second edition at https://github.com/tomwhite/hadoop-book/tree/2e.

This version of the code has been tested with:
 * Hadoop 1.2.1/0.22.0/0.23.x/2.2.0
 * Avro 1.5.4
 * Pig 0.9.1
 * Hive 0.8.0
 * HBase 0.90.4/0.94.15
 * ZooKeeper 3.4.2
 * Sqoop 1.4.0-incubating
 * MRUnit 0.8.0-incubating

Before running the examples you need to install Hadoop, Pig, Hive, HBase,
ZooKeeper, and Sqoop (as appropriate) as explained in the book.

You also need to install Maven.

Then you can build the code with:

% mvn package -DskipTests

Hadoop 1.2.1 is used by default. To build against a different version, set the
hadoop.version property, e.g.

% mvn package -DskipTests -Dhadoop.version=1.2.0

There are profiles for different Hadoop major versions and distributions,
defined in hadoop-meta/pom.xml and selected with the hadoop.distro property.
For example, to use the default version of Hadoop 2:

% mvn package -DskipTests -Dhadoop.distro=apache-2

Again, you can specify hadoop.version to use a particular Hadoop 2 version:

% mvn package -DskipTests -Dhadoop.distro=apache-2 -Dhadoop.version=2.1.1-beta
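
As a rough sketch of how the two properties combine, the shell function below
prints the Maven command line for a given distro and version instead of running
Maven. The name mvn_build_cmd is a hypothetical helper for illustration and is
not part of this repository; only the hadoop.distro and hadoop.version
properties come from the instructions above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): prints the Maven command
# for an optional hadoop.distro and an optional hadoop.version.
mvn_build_cmd() {
  distro="$1"    # e.g. apache-2; empty selects the default profile
  version="$2"   # e.g. 2.1.1-beta; empty keeps the profile's default version
  cmd="mvn package -DskipTests"
  if [ -n "$distro" ]; then
    cmd="$cmd -Dhadoop.distro=$distro"
  fi
  if [ -n "$version" ]; then
    cmd="$cmd -Dhadoop.version=$version"
  fi
  printf '%s\n' "$cmd"
}

# Print the commands rather than running them, so the sketch has no side effects.
mvn_build_cmd "" ""
mvn_build_cmd "" "1.2.0"
mvn_build_cmd "apache-2" ""
mvn_build_cmd "apache-2" "2.1.1-beta"
```

Leaving an argument empty omits the matching -D flag, so the profile or version
default applies.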

You should then be able to run the examples from the book.

For chapter names for "Hadoop: The Definitive Guide", see
https://github.com/tomwhite/hadoop-book/wiki/Chapter-Numbers-By-Edition
