✔ Accuracy and precision are not interchangeable. Accuracy is being true to intention (how close a measured value is to the true value), while precision is being true to itself (how close repeated measurements are to one another).
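To make the distinction concrete, here is a minimal sketch. The function names, the reference value, and the measurement lists are all made up for illustration; accuracy is approximated by the gap between the average measurement and the true value, and precision by the spread of the repeated measurements.

```python
from statistics import mean, stdev

def accuracy_error(measurements, true_value):
    """Gap between the average measurement and the true value (smaller = more accurate)."""
    return abs(mean(measurements) - true_value)

def precision_spread(measurements):
    """Sample standard deviation of repeated measurements (smaller = more precise)."""
    return stdev(measurements)

# Hypothetical data: the true value and both measurement sets are invented.
true_value = 10.0
precise_but_inaccurate = [12.0, 12.1, 11.9, 12.0]  # tightly clustered, but off target
accurate_but_imprecise = [8.0, 12.0, 9.0, 11.0]    # centered on target, but scattered

for name, data in [("precise/inaccurate", precise_but_inaccurate),
                   ("accurate/imprecise", accurate_but_imprecise)]:
    print(name, accuracy_error(data, true_value), precision_spread(data))
```

The first set has a small spread but a large error; the second has no average error but a large spread.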
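✔ Probability and likelihood are different terms. Probability is the chance of particular outcomes given a fixed data distribution; likelihood measures how well a candidate distribution explains outcomes that have already been observed, and is used to find the most plausible distribution given those outcomes.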
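A small coin-flip sketch may help here. It assumes a binomial model, and the numbers (10 flips, 7 heads, the candidate values of `p`) are invented for illustration. The same formula is read in two directions: with `p` fixed it gives a probability of an outcome, and with the outcome fixed it gives the likelihood of each candidate `p`.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Chance of exactly k successes in n independent trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Probability: the distribution is fixed (fair coin), the outcome varies.
prob_7_heads = binomial_pmf(7, 10, 0.5)

# Likelihood: the outcome is fixed (7 heads in 10 flips), the candidate p varies.
candidates = [0.3, 0.5, 0.7, 0.9]  # hypothetical candidate parameters
likelihoods = {p: binomial_pmf(7, 10, p) for p in candidates}
best_p = max(likelihoods, key=likelihoods.get)

print(prob_7_heads)  # probability of the outcome under a fair coin
print(best_p)        # candidate that best explains the observed outcome
```

Among these candidates the likelihood peaks at `p = 0.7`, which matches the observed proportion of heads.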
DESCRIPTIVE STATISTICS: For summarizing and describing the sample data at hand, without generalizing beyond it
INFERENTIAL STATISTICS: For drawing conclusions about the larger population from the sample
Depending on your goal and on whether the data meets parametric assumptions (parametric or non-parametric), you select a test.
If the goal is to quantify an association between two variables, we check Pearson correlation for parametric data and Spearman correlation for non-parametric data. If the goal is to predict a target from one or more variables, we perform simple regression (one predictor) or multiple regression (two or more predictors) for parametric data. If we have to compare unpaired (independent) groups, we perform an unpaired t-test (2 groups) or one-way ANOVA (more than 2 groups) for parametric data, and the Mann-Whitney test (2 groups) or the Kruskal-Wallis test (more than 2 groups) for non-parametric data.
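As an illustration of the correlation choice above, here is a standard-library sketch of both coefficients. The helper names and the sample data are assumptions made for this example; in practice a statistics library would normally be used instead. Spearman is computed as Pearson applied to ranks, with ties sharing their average rank.

```python
from math import sqrt
from statistics import mean

def pearson(x, y):
    """Pearson correlation: strength of the linear relationship."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks (monotonic relationship)."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: y grows monotonically with x, but not linearly.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
print(pearson(x, y), spearman(x, y))
```

On this data Spearman is essentially 1 because the ordering is preserved perfectly, while Pearson is lower because the relationship is not a straight line.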
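Parametric test:-
Assumption: Data follows a known distribution, usually the normal distribution
Non-parametric test:-
No assumption about the shape of the underlying distribution (other assumptions, such as independent observations, still apply)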
HYPOTHESIS TESTS: Depending on datatypes and data sample, hypothesis testing is carried out.
![0](https://private-user-images.githubusercontent.com/101544669/295154661-d947dc46-e799-4647-af50-2da545412af9.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjAyMTY5NzYsIm5iZiI6MTcyMDIxNjY3NiwicGF0aCI6Ii8xMDE1NDQ2NjkvMjk1MTU0NjYxLWQ5NDdkYzQ2LWU3OTktNDY0Ny1hZjUwLTJkYTU0NTQxMmFmOS5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjQwNzA1JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI0MDcwNVQyMTU3NTZaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT03N2Q3MjEyOGM4MDhkMzA1YTExMDNiMTJkOWY2NmJhZTBhZTQyMjc0MGQ2YjA5ODdmNWU4NTQyZTA0NTVjZmUzJlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCZhY3Rvcl9pZD0wJmtleV9pZD0wJnJlcG9faWQ9MCJ9.oPXo13Zi4qd2IFSrydTrsDJDPvpVYb_4Zs1S0bm2oJ0)
Data can also be classified by sensitivity, for privacy, security, risk management and regulatory compliance purposes: public, internal, confidential and restricted.
For more: https://en.wikipedia.org/wiki/F-test https://en.wikipedia.org/wiki/Analysis_of_variance
Mode: Number that occurs most often in a dataset.
Median: Middle value when a dataset is ordered from least to greatest (the mean of the two middle values when the number of values is even).
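Both measures are available in Python's standard `statistics` module. The datasets below are invented for illustration.

```python
from statistics import median, mode, multimode

data = [7, 3, 5, 3, 9, 3, 5]  # hypothetical dataset

most_common = mode(data)             # value that occurs most often
middle = median(data)                # middle value of the sorted data
even_middle = median([1, 2, 3, 4])   # even count: mean of the two middle values
all_modes = multimode([1, 1, 2, 2, 3])  # every value tied for most frequent

print(most_common, middle, even_middle, all_modes)
```

`multimode` is worth knowing because a dataset can have more than one mode, and `mode` returns only the first one it encounters.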
A violin plot shows the shape (density distribution) of the data, which a boxplot does not, so it is especially useful for exploring skewed or multi-peaked data.
![vp](https://private-user-images.githubusercontent.com/101544669/334660984-eb349bbc-acf3-47e8-ab49-dea45666401e.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjAyMTY5NzYsIm5iZiI6MTcyMDIxNjY3NiwicGF0aCI6Ii8xMDE1NDQ2NjkvMzM0NjYwOTg0LWViMzQ5YmJjLWFjZjMtNDdlOC1hYjQ5LWRlYTQ1NjY2NDAxZS5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjQwNzA1JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI0MDcwNVQyMTU3NTZaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT03OGM2NGE4MmM2MTdmNmUxZGJiYjc1NWYzYTJhYWNkYzRjMWE4NDYzODNmMjRmZTI3OTdlNTA4ODQ1ZTEyZDY2JlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCZhY3Rvcl9pZD0wJmtleV9pZD0wJnJlcG9faWQ9MCJ9.YlH8c67SwHSfcpbnVVWIWBA3z5-BcfExUvKJUbxBxH4)
Variables that follow right-skewed or left-skewed distributions can undergo power transformations (for example Box-Cox or Yeo-Johnson) to bring them closer to a normal distribution.
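The effect of such a transformation can be sketched with the standard library alone. The helper names and the dataset are hypothetical; `power_transform` follows the Box-Cox form with a hand-picked exponent `lam` (no fitting is done here), and it requires strictly positive values. Exponents below 1 (with the log as the `lam == 0` case) pull in a long right tail; exponents above 1 are the usual choice for a long left tail.

```python
from math import log
from statistics import mean

def skewness(values):
    """Moment-based skewness: positive = long right tail, negative = long left tail."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def power_transform(values, lam):
    """Box-Cox style transform: log(x) when lam == 0, else (x**lam - 1) / lam."""
    if lam == 0:
        return [log(v) for v in values]
    return [(v ** lam - 1) / lam for v in values]

# Hypothetical right-skewed data: a few large values stretch the right tail.
right_skewed = [1, 2, 2, 3, 3, 4, 5, 8, 20, 50]
transformed = power_transform(right_skewed, 0)  # log transform

print(skewness(right_skewed), skewness(transformed))
```

The skewness of the log-transformed values is noticeably closer to zero than that of the raw values.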
MEASURES OF DISPERSION: Range, quartile deviation and interquartile range (quartile deviation is half of the interquartile range), variance, standard deviation
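The measures listed above can be computed with the standard `statistics` module. The dataset is made up, population formulas (`pvariance`, `pstdev`) are used, and quartiles are taken with the `inclusive` method; other quartile conventions give slightly different values.

```python
from statistics import pstdev, pvariance, quantiles

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical dataset

data_range = max(data) - min(data)                      # largest minus smallest
q1, q2, q3 = quantiles(data, n=4, method="inclusive")   # quartile cut points
iqr = q3 - q1                                           # interquartile range
quartile_deviation = iqr / 2                            # half of the IQR
variance = pvariance(data)                              # mean squared deviation from the mean
std_dev = pstdev(data)                                  # square root of the variance

print(data_range, iqr, quartile_deviation, variance, std_dev)
```

For sample rather than population estimates, `variance` and `stdev` from the same module divide by `n - 1` instead of `n`.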
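Statistical models:-
Discriminative (L), which models the conditional distribution of the target given the inputs, and Generative (R), which models the joint distribution of inputs and target (non-conditional)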
![mod](https://private-user-images.githubusercontent.com/101544669/326362237-64651d9a-486f-49ae-91a9-7b3749bdf42b.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjAyMTY5NzYsIm5iZiI6MTcyMDIxNjY3NiwicGF0aCI6Ii8xMDE1NDQ2NjkvMzI2MzYyMjM3LTY0NjUxZDlhLTQ4NmYtNDlhZS05MWE5LTdiMzc0OWJkZjQyYi5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjQwNzA1JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI0MDcwNVQyMTU3NTZaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT0zZmFjOThjMzMyMTRjNTg2YjFiN2RiODVjNjRlNjBkZDlkNDYyMzE0MTExZjdhOWYxN2RiNGE5YmU2MjNiYTVhJlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCZhY3Rvcl9pZD0wJmtleV9pZD0wJnJlcG9faWQ9MCJ9.Codk3NQ7Pbf5jB_qUlVV48Iw4de4a8yToBDLda71MME)