Some data science projects in R, including reports in Markdown format, code, output plots, and models.
This report explores questions about Charles Dickens' novels:
- top 20 most used words in Dickens’ books
- word cloud of Dickens’ books
- five most used names in Dickens’ books
- emotion (positive and negative words) time series in Les Misérables
- most used verbs for men and women in Dickens’ books
- unigram vs. bigram distributions per chapter and per book
- comparison of unigram and bigram distributions between Austen's and Dickens' novels
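
The word and bigram counts behind several of these questions can be sketched in base R. This is only a minimal illustration on a single toy sentence, not the code used in the report; the variable names (`tokens`, `unigram_counts`, `bigram_counts`) and the simple regex tokenizer are assumptions made for the example.

```r
# Toy text standing in for a novel; a real analysis would load full books.
text <- "It was the best of times, it was the worst of times."

# Lowercase, split on anything that is not a letter or apostrophe,
# and drop empty tokens.
tokens <- strsplit(tolower(text), "[^a-z']+")[[1]]
tokens <- tokens[nzchar(tokens)]

# Unigram counts, most frequent first.
unigram_counts <- sort(table(tokens), decreasing = TRUE)

# Bigrams: pair each token with the token that follows it.
bigrams <- paste(head(tokens, -1), tail(tokens, -1))
bigram_counts <- sort(table(bigrams), decreasing = TRUE)

# Show the 20 most frequent entries of each (fewer here, since the text is tiny).
print(head(unigram_counts, 20))
print(head(bigram_counts, 20))
```

Running the same counts per chapter or per book, and then comparing the resulting frequency tables, is one way to approach the distribution comparisons listed above.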
This data consists of death records from the 1990s to 2017. In this report I've illustrated some of its features to better understand the data:
- correlation of main attributes
- ratio of murder/suicide based on race, sex, age and education.
I then trained a generalized linear model using H2O and plotted its false positives and false negatives, both overall and broken down by race and sex. Further plots, such as the ROC curve and accuracy versus threshold, help in understanding the model. Next, I tuned the depth of the model and, based on the best depth, tuned the other parameters. The model is later used in a Shiny app to help judges better predict whether a death was caused by murder or suicide.
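
To make the error analysis concrete, here is a small base R sketch of counting false positives and false negatives, overall and per group, and of computing accuracy across decision thresholds. It does not use H2O and is not the report's code; the `scored` data frame, its column names, and its values are invented purely for illustration.

```r
# Hypothetical scored records: prob is the predicted probability of
# "murder" (actual = 1) as opposed to "suicide" (actual = 0).
scored <- data.frame(
  actual = c(1, 0, 1, 0, 1, 0, 0, 1),
  prob   = c(0.9, 0.2, 0.4, 0.6, 0.8, 0.1, 0.7, 0.3),
  sex    = c("M", "F", "M", "M", "F", "F", "M", "F")
)

# Count false positives and false negatives at a given threshold.
error_counts <- function(df, threshold = 0.5) {
  predicted <- as.integer(df$prob >= threshold)
  c(false_pos = sum(predicted == 1 & df$actual == 0),
    false_neg = sum(predicted == 0 & df$actual == 1))
}

# Overall errors, then the same counts broken down by sex
# (a matrix with one column per group).
overall <- error_counts(scored)
by_sex <- sapply(split(scored, scored$sex), error_counts)

# Accuracy as a function of the decision threshold.
thresholds <- seq(0.05, 0.95, by = 0.1)
accuracy <- sapply(thresholds, function(t) {
  mean(as.integer(scored$prob >= t) == scored$actual)
})

print(overall)
print(by_sex)
plot(thresholds, accuracy, type = "b",
     xlab = "Threshold", ylab = "Accuracy")
```

Breaking the errors down by group in this way shows whether the model makes more mistakes for some groups than others, which matters for a tool intended to support judges.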
In this report you can see:
- top mobile device manufacturers
- evolution of mobile dimensions over time
- Box plot of device thickness based on headphone jack feature
- evolution of device PPI over time
- how to filter old Nokia phones!
- which devices can float on water?
- correlation of battery capacity and weight
- Price changes of Samsung series over time
- Competition of companies and the amount of their production over time
- Growth rate and changes of price, RAM and camera size
- Mobile phone area change chart
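
Two of these questions reduce to short calculations, sketched below in base R. The `phones` data frame, its column names, and its values are hypothetical, and treating each device as a rectangular box is a simplifying assumption of this example, not necessarily the method used in the report.

```r
# Hypothetical device table; names and numbers are illustrative only.
phones <- data.frame(
  device      = c("A", "B", "C", "D"),
  length_mm   = c(150, 140, 160, 110),
  width_mm    = c(70, 68, 75, 50),
  thick_mm    = c(8, 9, 7.5, 20),
  weight_g    = c(170, 150, 190, 90),
  battery_mah = c(3000, 2500, 4000, 900)
)

# Volume in cm^3 (assuming a rectangular box), then density in g/cm^3.
phones$volume_cm3 <- phones$length_mm * phones$width_mm * phones$thick_mm / 1000
phones$density <- phones$weight_g / phones$volume_cm3

# A device floats if its average density is below that of water (about 1 g/cm^3).
floaters <- as.character(phones$device[phones$density < 1])

# Pearson correlation between battery capacity and weight.
battery_weight_cor <- cor(phones$battery_mah, phones$weight_g)

print(floaters)
print(battery_weight_cor)
```

Real devices are not perfect boxes, so a density computed this way underestimates the true density and the float test is only a rough screen.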
You can also check out this post