mrtnj/scripting_for_data_analysis

Scripting for data analysis

Martin Johnsson

Scripting for data analysis with R is a PhD course set to be given in 2017. This repository contains the exercises and homework assignments, which are still under development.

Course plan

### 1. A crash course in R

Before the seminar I'll send out instructions on how to set up R and RStudio so everyone can work along on their own laptop.

- Why do data analysis with a scripting language
- The RStudio interface
- Using R as a calculator
- Working interactively and writing code
- Getting help
- Reading and looking at data
- Installing useful packages
- A first graph with ggplot2

Homework for next time: The Unicorn Dataset, exercises in reading data, descriptive statistics, linear models and a few statistical graphs.

### 2. Programming for data analysis

- Programming languages one may encounter in science
- Common concepts and code examples
- Data structures in R
- Vectors
- Data frames
- Functions
- Control flow

Homework for next time: The Unicorn Expression Dataset, exercises in data wrangling and more interesting graphs.

### 3. Working with moderately large data

- Exercise followup
- More about functions
- Lists
- Objects
- Functional and imperative programming
- Doing things many times, loops and plyr
- Simulating data
- Working on a cluster

Final homework: Design analysis by simulation: pick a data analysis project that you care about; simulate data based on a model and reasonable effect size; implement the data analysis; and apply it to simulated data with and without effects to estimate power and other design characteristics. This ties together skills from all seminars.
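To give a feel for what design analysis by simulation could look like, here is a minimal sketch in base R. Everything specific in it is an assumption made for illustration and not part of the course material: the two-group design, the effect size of 0.8 standard deviations, the 20 individuals per group, and the function names `simulate_data`, `analyse` and `estimate_rejection_rate`.

```r
# Hypothetical design: two groups with a normally distributed outcome.
# effect_size is the difference in group means (in units of sd).
simulate_data <- function(n_per_group, effect_size, sd = 1) {
  data.frame(
    group = rep(c("control", "treatment"), each = n_per_group),
    y = c(rnorm(n_per_group, mean = 0, sd = sd),
          rnorm(n_per_group, mean = effect_size, sd = sd))
  )
}

# The data analysis: a linear model, returning the p-value
# for the difference between treatment and control.
analyse <- function(data) {
  model <- lm(y ~ group, data = data)
  summary(model)$coefficients["grouptreatment", "Pr(>|t|)"]
}

# Simulate and analyse many datasets; return the proportion of
# simulations where the p-value falls below alpha.
estimate_rejection_rate <- function(n_sim, n_per_group, effect_size,
                                    alpha = 0.05) {
  p_values <- replicate(n_sim,
                        analyse(simulate_data(n_per_group, effect_size)))
  mean(p_values < alpha)
}

set.seed(1)  # make the simulation reproducible

# With an effect: the rejection rate estimates power.
power <- estimate_rejection_rate(n_sim = 1000, n_per_group = 20,
                                 effect_size = 0.8)

# Without an effect: the rejection rate estimates the false positive rate.
false_positive_rate <- estimate_rejection_rate(n_sim = 1000, n_per_group = 20,
                                               effect_size = 0)

print(power)
print(false_positive_rate)
```

In a real project, `simulate_data` and `analyse` would be swapped for a model and an analysis that fit the study in question; the surrounding loop stays much the same.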
