About this project: This project shows the performance of players across the different teams and the best 11 players of the tournament.
Data source: The data was web scraped from the ESPN website using Bright Data and came in JSON file format. A Jupyter Notebook was used to convert the JSON files into dataframes and then export those dataframes as CSV files for further data analysis.
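The JSON-to-CSV conversion can be sketched roughly as follows. This is a minimal illustration, not the notebook used in the project: the JSON layout and the field names (`match`, `battingSummary`, `batsmanName`, `runs`, `balls`) are assumptions, and the real scraped files may be structured differently.

```python
import io
import json

import pandas as pd

# Hypothetical sample standing in for one scraped JSON file.
# The real structure and field names may differ.
raw_json = """
[
  {"match": "Team A Vs Team B", "battingSummary": [
      {"batsmanName": "Player One", "runs": "34", "balls": "20"},
      {"batsmanName": "Player Two", "runs": "5", "balls": "9"}
  ]}
]
"""

records = json.loads(raw_json)

# Flatten the nested per-match list into one row per batsman,
# carrying the match name along as an extra column.
batting_df = pd.json_normalize(records, record_path="battingSummary", meta=["match"])

# Write the dataframe as CSV. An in-memory buffer is used here so the
# sketch runs anywhere; in practice a file path would be passed instead.
buffer = io.StringIO()
batting_df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
```

Each scraped file would go through the same load, flatten, and export steps, producing one CSV per table for Power BI to import.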
### Data Cleaning: For data cleaning, I used Power Query Editor. I applied the following transformation steps:
- renaming the files
- using first row as a header
- splitting the name into two columns
- removing duplicate records
- adding a custom column 'stage' to categorize the game into qualifier or Super12
- changing the datatype of stage column to text
- renaming column Bowling Team to team
- creating a ball column from the overs column by splitting (for example, 2.5 overs becomes 2 overs and 5 balls)
- replacing null values with 0
- renaming columns 4s to fours, 6s to sixes, and team innings to team
- renaming column 'out/not_out' to 'out', where out=1 and not_out=0
- changing datatype of 'balls' to whole number
- removing the '(' and any text after the name from batsman_name
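These steps were done in Power Query Editor. For readers who prefer to stay in the notebook, a few of them could be approximated in pandas as below. This is only a sketch under assumptions: the sample rows are made up, and the overs split uses arithmetic on numeric values rather than Power Query's text split.

```python
import pandas as pd

# Hypothetical batting rows; column names follow the steps listed above,
# the values are invented for illustration.
batting = pd.DataFrame({
    "batsman_name": ["Player One (c)", "Player Two", "Player Two"],
    "4s": [None, 2, 2],
    "6s": [1, 0, 0],
    "out/not_out": ["out", "not_out", "not_out"],
})

# Remove duplicate records.
batting = batting.drop_duplicates().reset_index(drop=True)

# Replace null values with 0 in the count columns.
batting[["4s", "6s"]] = batting[["4s", "6s"]].fillna(0)

# Rename columns, then encode out=1 and not_out=0.
batting = batting.rename(columns={"4s": "fours", "6s": "sixes", "out/not_out": "out"})
batting["out"] = batting["out"].map({"out": 1, "not_out": 0})

# Drop the "(" and everything after it from the player name.
batting["batsman_name"] = (
    batting["batsman_name"].str.split("(", regex=False).str[0].str.strip()
)

# Hypothetical bowling rows: split overs such as 2.5 into 2 overs and 5 balls.
bowling = pd.DataFrame({"overs": [2.5, 4.0]})
whole_overs = bowling["overs"].astype(int)
bowling["balls"] = ((bowling["overs"] - whole_overs) * 10).round().astype(int)
bowling["overs"] = whole_overs
```

Rounding before the integer conversion guards against floating-point noise in the fractional part of the overs value.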
Power BI: Imported the data from CSV files, built the data model, applied transformations in Power Query Editor, created DAX measures, and designed the dashboard using bookmarks, tooltips, a matrix, a scatter chart, an area chart, buttons, slicers, and other features.
This image shows the top opening batsmen and their batting average, strike rate, boundaries scored, and number of balls faced.
This image shows the best 11 players of the T20 Cricket World Cup and their performance.
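The batting average and strike rate shown on the dashboard follow the usual cricket definitions: runs divided by dismissals, and runs per 100 balls faced. The project computes its metrics with DAX measures; the pandas sketch below only illustrates the arithmetic, and the sample rows and the `runs` column are assumptions.

```python
import pandas as pd

# Hypothetical cleaned batting rows, one row per innings.
batting = pd.DataFrame({
    "batsman_name": ["Player One", "Player One", "Player Two"],
    "runs": [40, 20, 30],
    "balls": [30, 10, 25],
    "out": [1, 0, 1],  # 1 = dismissed, 0 = not out
})

# Aggregate per player across all innings.
totals = batting.groupby("batsman_name")[["runs", "balls", "out"]].sum()

# Batting average = total runs / number of dismissals.
# Players who were never dismissed get NaN instead of a division by zero.
dismissals = totals["out"].where(totals["out"] > 0)
totals["batting_avg"] = totals["runs"] / dismissals

# Strike rate = runs scored per 100 balls faced.
totals["strike_rate"] = totals["runs"] / totals["balls"] * 100
```

Aggregating before dividing matters: averaging per-innings strike rates would weight a short innings the same as a long one.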