Apache Tez

Apache Tez is a generic data-processing pipeline engine, envisioned as a low-level engine for higher-level abstractions such as Apache Hadoop MapReduce, Apache Pig, and Apache Hive.

At its heart, Tez is very simple and has just two components:

The data-processing pipeline engine, into which one can plug input, processing, and output implementations to perform arbitrary data processing. Every 'task' in Tez has the following:

Input to consume key/value pairs from.
Processor to process them.
Output to collect the processed key/value pairs.
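The three parts of a task can be pictured with a small, self-contained Java sketch. The interfaces `KVInput`, `KVProcessor`, and `KVOutput` and the class `TaskSketch` are hypothetical names invented for this illustration; they are not the Tez API, and a real task would use the classes shipped in tez-api and tez-runtime-library.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical interfaces for illustration only; these are NOT the Tez API.
interface KVInput<K, V> {
    Iterator<Map.Entry<K, V>> read();          // source of key/value pairs
}

interface KVOutput<K, V> {
    void write(K key, V value);                // sink for processed pairs
}

interface KVProcessor<K, V, K2, V2> {
    void run(KVInput<K, V> in, KVOutput<K2, V2> out);
}

final class TaskSketch {
    /** Runs one 'task': consume pairs from an input, process them, collect the results. */
    static List<Map.Entry<String, Integer>> runTask(List<Map.Entry<String, String>> records) {
        List<Map.Entry<String, Integer>> collected = new ArrayList<>();

        // Input: hands out the given records one by one.
        KVInput<String, String> input = records::iterator;

        // Output: collects every processed pair into a list.
        KVOutput<String, Integer> output = (k, v) -> collected.add(Map.entry(k, v));

        // Processor: maps each value to its length (an arbitrary example transformation).
        KVProcessor<String, String, String, Integer> processor = (in, out) -> {
            Iterator<Map.Entry<String, String>> it = in.read();
            while (it.hasNext()) {
                Map.Entry<String, String> e = it.next();
                out.write(e.getKey(), e.getValue().length());
            }
        };

        processor.run(input, output);
        return collected;
    }

    public static void main(String[] args) {
        System.out.println(runTask(List.of(Map.entry("a", "hello"), Map.entry("b", "tez"))));
    }
}
```

Because the processor only sees the two interfaces, any of the three parts can be swapped without touching the others, which is the pluggability the description above refers to.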

A master for the data-processing application, with which one can assemble the arbitrary data-processing 'tasks' described above into a task-DAG to process data as desired. The generic master is implemented as an Apache Hadoop YARN ApplicationMaster.
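To make the task-DAG idea concrete, here is a minimal sketch of how tasks might be wired into a DAG and ordered for execution. The class `DagSketch` and its methods are hypothetical and do not come from Tez; the ordering uses Kahn's topological sort as a stand-in, and makes no claim about how the actual ApplicationMaster schedules work.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a task-DAG; NOT the Tez DAG API.
final class DagSketch {
    // task name -> tasks that consume its output
    private final Map<String, List<String>> downstream = new LinkedHashMap<>();
    // task name -> number of upstream tasks it waits on
    private final Map<String, Integer> pendingInputs = new LinkedHashMap<>();

    DagSketch addTask(String name) {
        downstream.putIfAbsent(name, new ArrayList<>());
        pendingInputs.putIfAbsent(name, 0);
        return this;
    }

    /** Declares that the output of 'from' feeds the input of 'to'. */
    DagSketch addEdge(String from, String to) {
        addTask(from);
        addTask(to);
        downstream.get(from).add(to);
        pendingInputs.merge(to, 1, Integer::sum);
        return this;
    }

    /** Returns an order in which every task runs after all of its upstream tasks. */
    List<String> executionOrder() {
        Map<String, Integer> remaining = new LinkedHashMap<>(pendingInputs);
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : remaining.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());         // tasks with no upstream can start at once
            }
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String task = ready.poll();
            order.add(task);
            for (String next : downstream.get(task)) {
                // one fewer upstream to wait for; ready when none remain
                if (remaining.merge(next, -1, Integer::sum) == 0) {
                    ready.add(next);
                }
            }
        }
        if (order.size() != downstream.size()) {
            throw new IllegalStateException("cycle detected: not a DAG");
        }
        return order;
    }

    public static void main(String[] args) {
        DagSketch dag = new DagSketch()
                .addEdge("scan", "join")
                .addEdge("lookup", "join")
                .addEdge("join", "aggregate");
        System.out.println(dag.executionOrder());
    }
}
```

In this sketch, "join" becomes ready only after both "scan" and "lookup" have finished, which is the kind of multi-input dependency a task-DAG can express directly.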

