Martin Johnsson
Scripting for data analysis with R is a PhD course set to be given in 2017. This repository contains the exercises and homework assignments, which are still under development.
### 1. A crash course in R
Before the seminar, I'll send out instructions on how to set up R and RStudio so everyone can work along on their own laptop.

* Why do data analysis with a scripting language
* The RStudio interface
* Using R as a calculator
* Working interactively and writing code
* Getting help
* Reading and looking at data
* Installing useful packages
* A first graph with ggplot2

Homework for next time: The Unicorn Dataset, exercises in reading data, descriptive statistics, linear models and a few statistical graphs.
### 2. Programming for data analysis

* Programming languages one may encounter in science
* Common concepts and code examples
* Data structures in R
* Vectors
* Data frames
* Functions
* Control flow

Homework for next time: The Unicorn Expression Dataset, exercises in data wrangling and more interesting graphs.
### 3. Working with moderately large data

* Exercise follow-up
* More about functions
* Lists
* Objects
* Functional and imperative programming
* Doing things many times: loops and plyr
* Simulating data
* Working on a cluster

Final homework: Design analysis by simulation: pick a data analysis project that you care about; simulate data based on a model and reasonable effect size; implement the data analysis; and apply it to simulated data with and without effects to estimate power and other design characteristics. This ties together skills from all seminars.
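
To give a rough idea of what design analysis by simulation can look like, here is a minimal sketch in R. It is not the course's reference solution. The two-group design, the sample size, the effect size, and the function names `simulate_data`, `analyse` and `rejection_rate` are all assumptions made up for illustration; a real project would substitute its own model and analysis.

```r
# Illustrative assumptions: a two-group comparison with normally distributed
# outcomes, 20 individuals per group, and an effect of 0.8 standard deviations.
set.seed(1)

# Simulate one dataset from the assumed model.
simulate_data <- function(n, effect, sd = 1) {
  data.frame(
    group = rep(c("control", "treatment"), each = n),
    y = c(rnorm(n, mean = 0, sd = sd),
          rnorm(n, mean = effect, sd = sd))
  )
}

# The data analysis under evaluation: fit a linear model and
# return the p-value for the group difference.
analyse <- function(data) {
  fit <- lm(y ~ group, data = data)
  summary(fit)$coefficients["grouptreatment", "Pr(>|t|)"]
}

# Simulate and analyse many datasets, then report the proportion
# of them in which the group difference was significant.
rejection_rate <- function(n, effect, n_sim = 1000, alpha = 0.05) {
  p_values <- replicate(n_sim, analyse(simulate_data(n, effect)))
  mean(p_values < alpha)
}

# With an effect: the rejection rate estimates power.
power <- rejection_rate(n = 20, effect = 0.8)

# Without an effect: the rejection rate estimates the false positive rate.
false_positive_rate <- rejection_rate(n = 20, effect = 0)

print(power)
print(false_positive_rate)
```

If the analysis behaves as intended, the rate without an effect should land near the chosen significance threshold. Rerunning `rejection_rate` over a range of `n` or `effect` values shows how power changes with the design.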