Apache Tez

Apache Tez is a generic data-processing pipeline engine, envisioned as a low-level engine for higher-level abstractions such as Apache Hadoop MapReduce, Apache Pig, and Apache Hive.

At its heart, Tez is very simple and has just two components:

The data-processing pipeline engine, into which one can plug input, processing, and output implementations to perform arbitrary data processing. Every 'task' in Tez has the following:

Input to consume key/value pairs from.
Processor to process them.
Output to collect the processed key/value pairs.

A master for the data-processing application, with which one can assemble the arbitrary data-processing 'tasks' described above into a task DAG to process data as desired. The generic master is implemented as an Apache Hadoop YARN ApplicationMaster.
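
The two components above can be illustrated with a minimal, self-contained sketch. It does not use the Tez API: the `Input`, `Output`, `Processor`, and `Dag` types below are hypothetical stand-ins invented for this illustration, and the whole "master" is reduced to an in-memory loop that runs tasks in dependency order. It only shows the shape of the idea, namely tasks that consume key/value pairs, process them, and collect the results, wired together as a DAG.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TaskDagSketch {

    // Hypothetical stand-ins for the three parts of a task (not Tez classes).
    interface Input { List<Map.Entry<String, Integer>> read(); }
    interface Output { void collect(Map.Entry<String, Integer> pair); }
    interface Processor { void process(Input input, Output output); }

    /** A tiny in-memory task DAG: named tasks plus directed edges between them. */
    static class Dag {
        private final Map<String, Processor> tasks = new LinkedHashMap<>();
        private final Map<String, List<String>> downstream = new HashMap<>();
        private final Map<String, Integer> inDegree = new HashMap<>();

        Dag addTask(String name, Processor processor) {
            tasks.put(name, processor);
            downstream.put(name, new ArrayList<>());
            inDegree.put(name, 0);
            return this;
        }

        Dag addEdge(String from, String to) {
            downstream.get(from).add(to);
            inDegree.merge(to, 1, Integer::sum);
            return this;
        }

        /** Runs every task in topological order and returns each task's output. */
        Map<String, List<Map.Entry<String, Integer>>> run(
                List<Map.Entry<String, Integer>> sourceData) {
            Map<String, List<Map.Entry<String, Integer>>> inputs = new HashMap<>();
            Map<String, List<Map.Entry<String, Integer>>> outputs = new LinkedHashMap<>();
            Map<String, Integer> remaining = new HashMap<>(inDegree);
            Deque<String> ready = new ArrayDeque<>();

            // Tasks without upstream edges read the source data directly.
            for (String name : tasks.keySet()) {
                inputs.put(name, new ArrayList<>());
                if (remaining.get(name) == 0) {
                    inputs.get(name).addAll(sourceData);
                    ready.add(name);
                }
            }

            while (!ready.isEmpty()) {
                String name = ready.poll();
                List<Map.Entry<String, Integer>> in = inputs.get(name);
                List<Map.Entry<String, Integer>> collected = new ArrayList<>();
                tasks.get(name).process(() -> in, collected::add);
                outputs.put(name, collected);

                // Feed this task's output to its downstream tasks.
                for (String next : downstream.get(name)) {
                    inputs.get(next).addAll(collected);
                    if (remaining.merge(next, -1, Integer::sum) == 0) {
                        ready.add(next);
                    }
                }
            }
            if (outputs.size() != tasks.size()) {
                throw new IllegalStateException("task graph contains a cycle");
            }
            return outputs;
        }
    }

    /** Two-task example: double every value, then sum the values per key. */
    static Map<String, List<Map.Entry<String, Integer>>> demo() {
        Processor doubler = (in, out) -> {
            for (Map.Entry<String, Integer> p : in.read()) {
                out.collect(Map.entry(p.getKey(), p.getValue() * 2));
            }
        };
        Processor summer = (in, out) -> {
            Map<String, Integer> totals = new TreeMap<>();
            for (Map.Entry<String, Integer> p : in.read()) {
                totals.merge(p.getKey(), p.getValue(), Integer::sum);
            }
            totals.forEach((k, v) -> out.collect(Map.entry(k, v)));
        };
        return new Dag()
                .addTask("double", doubler)
                .addTask("sum", summer)
                .addEdge("double", "sum")
                .run(List.of(Map.entry("a", 1), Map.entry("b", 2), Map.entry("a", 3)));
    }

    public static void main(String[] args) {
        System.out.println(demo().get("sum")); // prints [a=8, b=4]
    }
}
```

In this sketch the edge simply copies one task's collected output into the next task's input. A real engine would also have to handle partitioning, data movement between machines, and scheduling on a cluster, none of which is modelled here.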

