Skip to content

Commit

Permalink
I think most variables and models are now complete. Data is local, no…
Browse files Browse the repository at this point in the history
…t yet online
  • Loading branch information
MielHostens committed Jun 23, 2019
0 parents commit f25936f
Show file tree
Hide file tree
Showing 70 changed files with 3,878 additions and 0 deletions.
5 changes: 5 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
.Rproj.user
.Rhistory
.RData
.Ruserdata
*.csv
27 changes: 27 additions & 0 deletions DataImport.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
#data manipulation
if (!require("dplyr")) {
install.packages("dplyr", dependencies = TRUE)
library(dplyr)
}

if(!(exists('AllDataRaw') && is.data.frame(get('AllDataRaw')))) {
AllDataRaw <- read.csv2(file = "../Data/TableauExportv2.csv",
header = T,
strip.white = TRUE,
dec = ".",
sep = ',', na.strings = c('', 'NA')
)
oldColumns <- names(AllDataRaw)
newColumns <- gsub("\\.", "", oldColumns, perl=TRUE)
#Strange name for AnimalId
newColumns[1] <- "AnimalId"
#Duplicate name for DaysInMilk
newColumns[19] <- "DaysInMilkBin"
names(AllDataRaw) <- newColumns
AllDataRaw <- AllDataRaw %>%
dplyr::arrange(
HerdId,
AnimalId,
Date
)
}
113 changes: 113 additions & 0 deletions Models/Decay.Rmd
Original file line number Diff line number Diff line change
@@ -0,0 +1,113 @@
---
title: "R Notebook"
output:
github_document:
toc: true
toc_depth: 3
editor_options:
chunk_output_type: inline
---

```{r include=FALSE}
#data manipulation
if (!require("dplyr")) {
install.packages("dplyr", dependencies = TRUE)
library(dplyr)
}
#glmer
if (!require("lme4")){install.packages("lme4", dependencies = TRUE)
library(lme4)
}
#lsmeans
if (!require("lsmeans")){install.packages("lsmeans", dependencies = TRUE)
library(lsmeans)
}
#lsmeans
if (!require("lubridate")){install.packages("lsmeans", dependencies = TRUE)
library(lubridate)
}
```

# Raw Data import

```{r cache= TRUE}
source('../DataImport.R')
```

# Data manipulation

```{r}
#We inspect the quantile ranges
quantile(AllDataRaw$DaysPregnant)
AllData <- AllDataRaw %>% dplyr::filter(
LactationNumber == 1,
DaysPregnant <= 283, #We drop all above 75th percentile because no interest at this stage, missing inseminations?
M305 > 0 #No missing M305 calculations
) %>%
dplyr::mutate(
Date = mdy_hms(Date), #reformat ordering date
Year = year(mdy_hms(CalvingDate)),
Month = month(mdy_hms(CalvingDate)),
DaysPregnantQuantile = case_when(
DaysPregnant < 275 ~ "0-25th Pct",
TRUE ~ "25-75 Pct"
)
) %>%
dplyr::arrange(
HerdId,
AnimalId,
Date
) %>%
dplyr::group_by(
AnimalId,
HerdId,
DaysPregnantQuantile,
Year,
Month,
CalvingDate
) %>%
summarise(
Value = as.integer(last(Decay))
)
```

# Model M305

## Base model

```{r}
baseline <- lmer(
Value ~ 1 + (1 | HerdId),
data = AllData
)
qqnorm(residuals(baseline, type = 'pearson'))
```

## Full model

```{r message=FALSE, warning=FALSE, paged.print=FALSE}
GLM <- lmer(
Value ~
DaysPregnantQuantile + Year + Month
+ (1 | HerdId),
data = AllData
)
qqnorm(residuals(GLM))
```

## Comparison of baseline and nested model

```{r}
anova(GLM,baseline, test="Chisq")
```

## Least square means

```{r}
lsmeans(GLM, pairwise~DaysPregnantQuantile, type = "response", adjust="tukey")
```

132 changes: 132 additions & 0 deletions Models/Decay.html

Large diffs are not rendered by default.

140 changes: 140 additions & 0 deletions Models/Decay.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,140 @@
R Notebook
================

- [Raw Data import](#raw-data-import)
- [Data manipulation](#data-manipulation)
- [Model M305](#model-m305)
- [Base model](#base-model)
- [Full model](#full-model)
- [Comparison of baseline and nested model](#comparison-of-baseline-and-nested-model)
- [Least square means](#least-square-means)

Raw Data import
===============

``` r
source('../DataImport.R')
```

Data manipulation
=================

``` r
#We inspect the quantile ranges

quantile(AllDataRaw$DaysPregnant)
```

## 0% 25% 50% 75% 100%
## 150 275 278 283 297

``` r
AllData <- AllDataRaw %>% dplyr::filter(
LactationNumber == 1,
DaysPregnant <= 283, #We drop all above 75th percentile because no interest at this stage, missing inseminations?
M305 > 0 #No missing M305 calculations
) %>%
dplyr::mutate(
Date = mdy_hms(Date), #reformat ordering date
Year = year(mdy_hms(CalvingDate)),
Month = month(mdy_hms(CalvingDate)),
DaysPregnantQuantile = case_when(
DaysPregnant < 275 ~ "0-25th Pct",
TRUE ~ "25-75 Pct"
)
) %>%
dplyr::arrange(
HerdId,
AnimalId,
Date
) %>%
dplyr::group_by(
AnimalId,
HerdId,
DaysPregnantQuantile,
Year,
Month,
CalvingDate
) %>%
summarise(
Value = as.integer(last(Decay))
)
```

Model M305
==========

Base model
----------

``` r
baseline <- lmer(
Value ~ 1 + (1 | HerdId),
data = AllData
)
qqnorm(residuals(baseline, type = 'pearson'))
```

![](Decay_files/figure-markdown_github/unnamed-chunk-4-1.png)

Full model
----------

``` r
GLM <- lmer(
Value ~
DaysPregnantQuantile + Year + Month
+ (1 | HerdId),
data = AllData
)
qqnorm(residuals(GLM))
```

![](Decay_files/figure-markdown_github/unnamed-chunk-5-1.png)

Comparison of baseline and nested model
---------------------------------------

``` r
anova(GLM,baseline, test="Chisq")
```

## refitting model(s) with ML (instead of REML)

## Data: AllData
## Models:
## baseline: Value ~ 1 + (1 | HerdId)
## GLM: Value ~ DaysPregnantQuantile + Year + Month + (1 | HerdId)
## Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
## baseline 3 -Inf -Inf Inf -Inf
## GLM 6 -Inf -Inf Inf -Inf 3

Least square means
------------------

``` r
lsmeans(GLM, pairwise~DaysPregnantQuantile, type = "response", adjust="tukey")
```

## Warning in vcov.merMod(object, correlation = FALSE): Computed variance-covariance matrix problem: not a positive definite matrix;
## returning NA matrix

## Note: D.f. calculations have been disabled because the number of observations exceeds 3000.
## To enable adjustments, set emm_options(pbkrtest.limit = 10881) or larger,
## but be warned that this may result in large computation time and memory use.

## Note: D.f. calculations have been disabled because the number of observations exceeds 3000.
## To enable adjustments, set emm_options(lmerTest.limit = 10881) or larger,
## but be warned that this may result in large computation time and memory use.

## $lsmeans
## DaysPregnantQuantile lsmean SE df asymp.LCL asymp.UCL
## 0-25th Pct 0 NA Inf NA NA
## 25-75 Pct 0 NA Inf NA NA
##
## Degrees-of-freedom method: asymptotic
## Confidence level used: 0.95
##
## $contrasts
## contrast estimate SE df z.ratio p.value
## 0-25th Pct - 25-75 Pct 0 NA Inf NA NA
13 changes: 13 additions & 0 deletions Models/Decay_cache/markdown_github/__packages
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
base
methods
datasets
utils
grDevices
graphics
stats
dplyr
Matrix
lme4
emmeans
lsmeans
lubridate
Binary file not shown.
Binary file not shown.
Binary file not shown.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading

0 comments on commit f25936f

Please sign in to comment.