# Welcome to the HDPCD Repository

You can use this repository to prepare for the Hortonworks Data Platform Certified Developer (HDPCD) certification. The exam objectives are published at https://hortonworks.com/services/training/certification/exam-objectives/#hdpcd

The following objectives are tested in this certification:

## DATA INGESTION
- Import data from a table in a relational database into HDFS (see the sketch after this list)
- Import the results of a query from a relational database into HDFS
- Import a table from a relational database into a new or existing Hive table
- Insert or update data from HDFS into a table in a relational database
- Given a Flume configuration file, start a Flume agent
- Given a configured sink and source, configure a Flume memory channel with a specified capacity
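
The ingestion objectives above revolve around Sqoop and Flume. Below is a minimal sketch of the corresponding commands; the JDBC URL, credentials, database, table names, and file paths are placeholders I chose for illustration, not values from the exam or this repository.

```bash
# Import a table from a relational database into HDFS
sqoop import \
  --connect jdbc:mysql://dbhost/salesdb \
  --username hdpcd --password-file /user/hdpcd/.db_password \
  --table customers \
  --target-dir /user/hdpcd/customers

# Import the results of a query (Sqoop requires the $CONDITIONS token)
sqoop import \
  --connect jdbc:mysql://dbhost/salesdb \
  --username hdpcd --password-file /user/hdpcd/.db_password \
  --query 'SELECT id, name FROM customers WHERE active = 1 AND $CONDITIONS' \
  --split-by id \
  --target-dir /user/hdpcd/active_customers

# Import a table directly into a Hive table
sqoop import \
  --connect jdbc:mysql://dbhost/salesdb \
  --username hdpcd --password-file /user/hdpcd/.db_password \
  --table customers \
  --hive-import --hive-table default.customers

# Insert or update rows in a relational table from data in HDFS
sqoop export \
  --connect jdbc:mysql://dbhost/salesdb \
  --username hdpcd --password-file /user/hdpcd/.db_password \
  --table customer_updates \
  --export-dir /user/hdpcd/customer_updates \
  --update-key id --update-mode allowinsert

# Given a Flume configuration file, start an agent named "agent1".
# A memory channel with a specified capacity is set in that file, e.g.:
#   agent1.channels.c1.type = memory
#   agent1.channels.c1.capacity = 1000
flume-ng agent --name agent1 --conf /etc/flume/conf \
  --conf-file /home/hdpcd/flume/agent1.conf
```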

## DATA TRANSFORMATION
- Write and execute a Pig script (a sketch follows this list)
- Load data into a Pig relation without a schema
- Load data into a Pig relation with a schema
- Load data from a Hive table into a Pig relation
- Use Pig to transform data into a specified format
- Transform data to match a given Hive schema
- Group the data of one or more Pig relations
- Use Pig to remove records with null values from a relation
- Store the data from a Pig relation into a folder in HDFS
- Store the data from a Pig relation into a Hive table
- Sort the output of a Pig relation
- Remove the duplicate tuples of a Pig relation
- Specify the number of reduce tasks for a Pig MapReduce job
- Join two datasets using Pig
- Perform a replicated join using Pig
- Run a Pig job using Tez
- Within a Pig script, register a JAR file of User Defined Functions
- Within a Pig script, define an alias for a User Defined Function
- Within a Pig script, invoke a User Defined Function
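
The transformation objectives map onto a single Pig script run on Tez. The sketch below is illustrative only: the input path, schema, Hive table names, and the UDF jar/class are placeholder assumptions, and the Hive table written by HCatStorer must already exist.

```bash
cat > /home/hdpcd/clean_orders.pig <<'EOF'
-- Register a UDF jar and define an alias for a UDF (placeholder names)
REGISTER /home/hdpcd/myudfs.jar;
DEFINE TO_UPPER com.example.pig.ToUpper();

-- Load data with a schema; HCatLoader reads an existing Hive table
orders      = LOAD '/user/hdpcd/orders' USING PigStorage(',')
              AS (id:int, customer:chararray, amount:double);
hive_orders = LOAD 'default.orders' USING org.apache.hive.hcatalog.pig.HCatLoader();

-- Remove records with null values, then remove duplicate tuples
clean = FILTER orders BY id IS NOT NULL AND amount IS NOT NULL;
uniq  = DISTINCT clean;

-- Group the relation; PARALLEL sets the number of reduce tasks
grouped = GROUP uniq BY customer PARALLEL 2;

-- Invoke the UDF and aggregate, then sort the output
totals = FOREACH grouped GENERATE TO_UPPER(group) AS customer,
                                  SUM(uniq.amount) AS total;
sorted = ORDER totals BY total DESC;

-- Store into an HDFS folder and into a Hive table (via HCatalog)
STORE sorted INTO '/user/hdpcd/order_totals' USING PigStorage(',');
STORE sorted INTO 'default.order_totals' USING org.apache.hive.hcatalog.pig.HCatStorer();
EOF

# Run the Pig job on Tez; -useHCatalog is needed for HCatLoader/HCatStorer
pig -x tez -useHCatalog /home/hdpcd/clean_orders.pig
```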

## DATA ANALYSIS
- Write and execute a Hive query (a sketch follows this list)
- Define a Hive-managed table
- Define a Hive external table
- Define a partitioned Hive table
- Define a bucketed Hive table
- Define a Hive table from a select query
- Define a Hive table that uses the ORCFile format
- Create a new ORCFile table from the data in an existing non-ORCFile Hive table
- Specify the storage format of a Hive table
- Specify the delimiter of a Hive table
- Load data into a Hive table from a local directory
- Load data into a Hive table from an HDFS directory
- Load data into a Hive table as the result of a query
- Load a compressed data file into a Hive table
- Update a row in a Hive table
- Delete a row from a Hive table
- Insert a new row into a Hive table
- Join two Hive tables
- Run a Hive query using Tez
- Run a Hive query using vectorization
- Output the execution plan for a Hive query
- Use a subquery within a Hive query
- Output data from a Hive query that is totally ordered across multiple reducers
- Set a Hadoop or Hive configuration property from within a Hive query

I hope you find this repository useful. You can visit my LinkedIn profile at https://www.linkedin.com/in/milindjagre/