# Scale equating and linking
**Abstract**
This chapter focuses on a framework for making scales comparable across groups, countries and time points. It describes the theory of scale equating, which proposes that two or more indices, with some indicators in common but not necessarily the same, can be made comparable via modelling. The chapter focuses on one form of equating, Item Response Theory equating, because it is widely used and because it fits more naturally with the whole rationale of the book. Several examples are used to illustrate the main concepts and their implementation in **R**.
## Intuition to scale equating
The previous chapter introduced the sources affecting the comparability of different poverty indices and presented the concept and empirical implementation of measurement invariance (MI). In poverty research, households are ranked according to some indicators of (low) material and social deprivation using survey or census data. The indicators in question have a structure (unidimensional or multidimensional, see Chapter \@ref(Chapter-1)) for the general population. One problem is that such a measurement model might not be adequate for a different population (such as another country) or for a different year. We saw in the previous chapter that in statistical terms this means that the measurement model is not equivalent across groups. We now know that, once an MI assessment has been conducted, poverty can be compared on the factor using the alignment method. This, however, might not be fully satisfactory for policymakers, as the values of a standardised latent variable make little practical sense. Furthermore, it is unclear how to use these values to set a poverty line.
Another critical problem is that MI is adequate when scales have the same items; it does not solve the problem of working with scales that have different items or that have been upgraded in line with living standards. Ideally, once measurement invariance is assessed, researchers would like to put everything into a meaningful metric and be able to compare measures whose contents might have changed. Moreover, one question in poverty research is how the severity of poverty is affected by changes in living standards and how this can be made tractable using the available data. Thus, poverty measures need to be equated or equivalised in some way so that we can make meaningful comparisons across groups or time.
Scale or test equating is the answer from measurement theory to the problem of working with scales that are not necessarily equivalent across countries, time points, etc. The academic concern with making scores comparable has a history of more than 90 years [@Holland2007]. The modern development of the theory of scale equating can be traced back to the 1980s [@Haebara1980; @Petersen1983; @Kolen1988]. However, after the publication of @Kolen2004's seminal book, this framework has become more accessible through software development, has been implemented in several fields and has been under constant development [@Davier2010; @Gonzalez2017].
Educational testing has been the field at the forefront of scale equating. Admission to university often depends on the score of a test. But what if this test favours some students? And what if different versions of the test are implemented? This poses a challenge for the admissions office, for two students might have different scores just because the tests had different difficulties, i.e. one was easier than the other.
How does this problem translate to poverty measurement? There are two good examples of this happening to our scales. According to @Townsend1979, poverty is relative across time and space. The *space* used to identify the dimensions and indicators of poverty in the early 20th century would look very different from the *space* of things and activities necessary to live with dignity in the 21st century. For example, overcrowding and access to electricity were very good indicators of poverty in London, but these days this is no longer the case. That means that the two measures will have different deprivation indicators. So, what does it mean that someone in the early 20th century had a deprivation score equal to 6 relative to someone who has a score equal to 3 today? Is it possible to compare how severely deprived they are? And how might this affect the prevalence rate?
This is a very crude example, but it helps to illustrate some of the challenges when comparing poverty over time. In poverty research, however, one common belief is that using the same indicators is sufficient to make valid comparisons across groups. Leaving aside the data collection and sampling issues associated with MI, we should think carefully about this assumption. Imagine that we use indicators from the early 20th century to measure poverty today. We then find two people with the same deprivation pattern: they lack electricity, cook with charcoal, and live in an overcrowded house. Hence, they have the same deprivation scores. In the early 20th century it would not have been very difficult to find such conditions in a small random sample. These days, however, this situation would denote acute material deprivation. Yet these two people will have the same deprivation score, so without equating they would be deemed equally deprived. How sensible is this conclusion? Well, not very.
Scale equating aims at adjusting for the differences in severity between tests and groups. The process of equating, given the response patterns, will notice that being deprived of the items in question is *more difficult* (in the Item Response Theory sense) today than 100 years ago. So, based on difficulty, the contemporary scale will be adjusted in terms of the early 20th century's scale. If the equating process is successful, a score equal to 3 today will mean something more severe than a score equal to 3 100 years ago, i.e. the equated score will be higher. The inflation of today's score is the result of equating, and it will give us a lot of useful information to adjust our conclusions about poverty and the severity of deprivation.
The second example focuses on the practical use of equating. Statistical offices tend to update their indices over time, or we have countries with two different measures but some common indicators. That is, they drop some indicators and include some new ones. This just adds to the frustration of policymakers and academics because comparability is lost. One way to tackle this problem is scale equating. Figure \@ref(fig:changeEQ) displays a common situation in which the indicators have changed from measure A (Poverty A) to measure B (Poverty B). There are seven items that appear in both measures, two items that were dropped and another two that were included.
```{r changeEQ, echo = FALSE, message=FALSE, fig.align='center', fig.cap='This figure shows a case in which the contents of a measure have been modified. The items common to the two measures are shown in red.', fig.pos='H', fig.height=3, fig.width=3}
knitr::include_graphics("Diagram_EQ_1.png")
```
One crucial aspect of the example above is that if we have two tests with a common subset of items, then we can assess not only the effect of updating the scale; we can also link both measures and make them comparable across time points. We will see that for scale equating to work we need to identify as many **anchors** as we can, i.e. items that respect measurement invariance across tests. In this case we have seven potential candidates. In the following we formally introduce the concepts and methods used to conduct scale equating. We will, of course, discuss some limitations of this approach, as it might not be sensible or desirable in all situations.
## Theory of scale equating {#Theoryequating}
### Workflow in scale equating
Drawing upon @Davier2010, before formally introducing scale equating we will present some of the stages involved in the process:
1. Two or more poverty indices (test forms, in the vocabulary of scale equating): We will have two indices and one research question: how to measure and compare the latent poverty levels of the population regardless of which index we use.
2. Detecting confounding differences: The task is to measure latent poverty levels while separating the effect of latent poverty from the confounding effect of using indices whose indicators capture more or less acute deprivation. This implies figuring out the differences between the estimated severities/difficulties of the indicators and the underlying latent level of poverty.
3. Modelling the data-generating process: Looking at different modelling alternatives, checking their assumptions and assessing how sensible a given model is. There are several methods: observed-score equating, IRT equating, kernel equating, etc.
4. Error in equating: All models are wrong, and thus it is vital to assess the extent to which the results of our model are useless or reasonable given the errors of the parameters being estimated.
### Theory of IRT scale equating
In this edition of the book we will focus on one of the most widely used forms of equating, Item Response Theory (IRT) equating, because it is a framework that has already been used for reliability analysis and because it fits naturally with the kind of thinking behind this book: poverty is a concept whose measurement is based on reflective models, where deprivation is an effect of poverty. Thus, people have a latent level of poverty that produces the observed deprivation scores.
Chapter \@ref(Chapter-2) underlined the importance of explicitly working with models and blueprints in multidimensional poverty measurement. We proposed in equation \@ref(eq:mainmodel) a very simple model that looks as follows:
\begin{equation}
(\#eq:mainmodel)
\mathscr F = \{\mathscr X, F_{\theta} : \theta \in \Theta\}
\end{equation}
where the variables $x_1,...,x_n$ follow a certain distribution $F_{\theta}$, which is indexed by a parameter $\theta$ defined in the parameter space $\Theta$. $\mathscr F$ is a family of all probability distributions on $\mathscr X$, which is just the set of all possible observed data.
We will borrow the notation of @Gonzalez2017 to formulate the equating model based on Item Response Theory, as this is the method we will use in this book and it is consistent with the methods and rationale we have followed so far.
For example, Item Response Theory (IRT) models, which are widely used in test equating, take the following general form:
\begin{equation}
(\#eq:irtmodel)
\mathscr F = \{\{0,1\}, \mathrm{Bernoulli}[\pi(\alpha_i, \omega_j)] : (\alpha_i, \omega_j) \in \mathbb{R} \times \mathbb{R}\}
\end{equation}
For a two-parameter IRT model we would have $x_{ij}$ denoting a binary deprivation indicator of an individual $i$ whose poverty is measured with index $j$. That means the probability of being deprived is given by $(x_{ij} | \theta_i, a_j, b_j) \sim \mathrm{Bernoulli}$, where $\theta_i$ is the person's latent poverty, $a_j$ is the discrimination of the item and $b_j$ is the severity of the item. $\pi()$ is the item characteristic curve (ICC).
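Although the software will handle this for us below, it is useful to see the ICC written out. A common parameterisation of the two-parameter logistic ICC, with a scaling constant $D$ (often set to 1.7 to approximate the normal ogive; we pass this same value to `irt.link()` later in the chapter), is:
\begin{equation}
(\#eq:icc2pl)
\pi(\theta_i; a_j, b_j) = \frac{1}{1 + \exp[-Da_j(\theta_i - b_j)]}
\end{equation}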
The key parameters in this example are thus $\theta_i$, $a_j$ and $b_j$. In the case that we have two indices ($j=2$), we would like to link the parameters of the first index ($X$) with those of the second index ($Y$). This can be achieved by estimating an IRT model for each index, extracting the parameters in question and then applying a linear transformation to convert the IRT scores:
\begin{equation}
(\#eq:irtequating1)
\theta_{yi} = A\theta_{xi} + B
\end{equation}
To relate the parameters between the two tests we use the following:
\begin{equation}
(\#eq:irtequating2)
a_{yj} = a_{xj} / A
\end{equation}
\begin{equation}
(\#eq:irtequating3)
b_{yj} = Ab_{xj} + B
\end{equation}
where $A$ and $B$ are the equating coefficients, which need to be estimated using methods such as mean-sigma, mean-geometric mean, Haebara and Stocking-Lord [@Gonzalez2017; @Davier2010]. Once these coefficients have been estimated, the next step consists in putting index $X$ in terms of index $Y$. This process assumes that scores are equated through the latent level of poverty, which relates to the observed score. There are two perspectives for doing so: observed-score equating and true-score equating [@Kolen2004]. We explain these two very briefly.
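Before turning to those two perspectives, a minimal sketch may help to make the estimation of the constants concrete. The simplest method, mean-sigma, derives $A$ and $B$ from the means and standard deviations of the anchor-item severities; all parameter values below are hypothetical, purely for illustration:
```{r}
# Mean-sigma method: choose A and B so that the transformed anchor severities
# of index X have the same mean and SD as the anchor severities of index Y.
b_x <- c(0.8, 1.2, 1.9, 2.5) # hypothetical anchor severities, index X
b_y <- c(1.0, 1.5, 2.1, 2.9) # hypothetical anchor severities, index Y
A <- sd(b_y) / sd(b_x)
B <- mean(b_y) - A * mean(b_x)
A * b_x + B # severities of X rescaled onto the metric of Y (cf. the b transformation above)
```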
Observed-score equating is based on the actual marginal score distributions (i.e. imagine a histogram of a deprivation score). This kind of equating uses the IRT model to adjust the score distributions across parallel forms, i.e. indices. To do so, the distribution of observed scores is calculated at each specified value of the latent poverty level (remember that the metric of this variable is $\theta \sim (0,\sigma^2)$). This is then summed/integrated across all levels of latent poverty to produce a marginal score distribution. Finally, equipercentile equating is applied to establish the relationship between the two observed scores.
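A minimal sketch of this construction, under assumed two-parameter item values (all hypothetical), is the following. The conditional score distribution at each value of $\theta$ is that of a sum of independent Bernoulli indicators, which is then averaged over a normal quadrature for $\theta$:
```{r}
# Conditional and marginal observed-score distributions implied by a 2PL model
icc <- function(theta, a, b, D = 1.7) 1 / (1 + exp(-D * a * (theta - b)))
score_dist <- function(p) { # distribution of a sum of independent Bernoullis
  f <- 1
  for (p_i in p) f <- c(f * (1 - p_i), 0) + c(0, f * p_i)
  f
}
a <- c(1.2, 0.9, 1.5); b <- c(0.5, 1.0, 2.0) # hypothetical item parameters
theta <- seq(-5, 5, length.out = 41) # quadrature points for latent poverty
w <- dnorm(theta); w <- w / sum(w) # normal weights
cond <- sapply(theta, function(t) score_dist(icc(t, a, b))) # P(score | theta)
marginal <- as.vector(cond %*% w) # marginal P(score = 0, 1, 2, 3)
marginal
```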
True-score equating draws on the idea of a true score from Classical Test Theory (CTT). The whole idea is that people with the same level of latent poverty should have an equivalent true score, regardless of the differences between tests. First, the latent poverty level associated with a given true score on index A is found via the Newton-Raphson method. Then the true score of index A is mapped onto index B by evaluating the true-score function of index B at that same latent poverty level. This procedure is typically applied for each integer value of the deprivation score.
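The following sketch illustrates true-score equating with hypothetical parameters for an index A and an index B, reusing the `icc()` function defined above (for simplicity we use base R's `uniroot()` root-finder in place of Newton-Raphson):
```{r}
# True score: expected deprivation score at a given latent poverty level
true_score <- function(theta, a, b) sum(icc(theta, a, b))
a_A <- c(1.2, 0.9, 1.5); b_A <- c(0.5, 1.0, 2.0) # hypothetical parameters, index A
a_B <- c(1.1, 1.0, 1.4); b_B <- c(0.2, 0.8, 1.6) # hypothetical parameters, index B
# Step 1: find the latent level whose true score on A equals an integer score, say 2
theta_star <- uniroot(function(t) true_score(t, a_A, b_A) - 2, c(-6, 6))$root
# Step 2: evaluate the true-score function of B at that same latent level
true_score(theta_star, a_B, b_B) # a score of 2 on A, expressed in the metric of B
```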
## Data and designs in test equating
One of the major challenges in equating is disentangling the differences in severity between tests from the differences in latent poverty levels across the populations subject to measurement. Equating aims to correct differences in severity across tests, so we need a design that helps us assess the extent to which we are comparing populations with similar levels of latent poverty.
The simplest design consists in applying, for example, two different indices to the same population, so that we can compare the latent values of poverty measured by each scale. However, this is not very useful because we often want to compare different populations or time points. One common solution to this problem is to use an **anchor** scale, i.e. the scores on the items common to both populations. This will tell us the difference in latent poverty levels between the two populations. It is essential that this **anchor** scale reflects the severity of the whole test. Below we briefly describe the different designs. We note, however, that only the last two designs are likely to make sense in poverty research, and we will focus on these two.
### Single group design
This design uses a unique population in which we implement two scales, A and B. If we measure each person in the sample using both forms, we can easily equate the tests, as we are measuring the same levels of latent poverty.
### Equivalent groups design
Instead of applying the two measures to the same sample, the equivalent groups design applies one scale to a random, representative sample of the population and the second scale to a different, equally representative sample.
### Non Equivalent Groups with Anchor test Design
The simplest case of the Non Equivalent Groups with Anchor test Design (NEAT) consists in having two populations, sampled independently. Poverty is measured using different scales (A and B), but a common **anchor** sub-index is included in both A and B. In practice, we might have two countries with different indices but with a subset of common items. As we discussed in Section \@ref(Theoryequating), it is not enough to use the same items; we must make certain that the items respect measurement invariance. Another example of a NEAT design is when we have two measures at two different time points with common items in $t_1$ and $t_2$. Our real-data example uses data from the UK Family Resources Survey (FRS) to illustrate how the child deprivation index can be equated despite four indicators being dropped from the original measure. In this version of the book we will focus on this kind of design.
### Non Equivalent Groups with Covariates Design
The Non Equivalent Groups with Covariates Design (NEC) is an alternative to the NEAT design when there are no **anchors** but we have some ancillary data. A vector of auxiliary data $Z$ is used to adjust the differences across indices. This variant is useful inasmuch as we have a good set of predictors of poverty, since we will be adjusting for differences in latent poverty. One of the most recent developments in equating is Bayesian Nonparametric Equating (BNP), which also allows for the inclusion of covariates [@Gonzalez2017].
## Example of IRT equating (NEAT design) with simulated data in R
To introduce IRT equating we will use two simulated indices (**A** and **B**) that capture the simplest situation in equating and in poverty measurement. Both indices comprise 15 binary indicators ($n=5000$). The first 11 indicators are the same, in that they are measurement invariant and have the same severities and discrimination values. The remaining four indicators have different severities: the indicators of index A are more severe than those of index B. This is a situation in which two very similar indices are applied to the same population, with no changes in the underlying level of poverty, but one index is more severe than the other, so it is not possible to compare the observed deprivation scores directly. It could also depict a situation in which two countries with similar living standards are compared using similar indices.
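Readers without access to **Mplus** can generate data with a similar structure directly in R. The following is a minimal sketch under a two-parameter response model; the discrimination and severity values are illustrative and are not the ones used to produce the `.dat` files below:
```{r eval=FALSE}
set.seed(3454367)
n <- 5000
theta <- rnorm(n) # latent poverty; the same population answers both indices
a <- rep(1.2, 15) # common discriminations
b_A <- c(seq(0.5, 2.5, length.out = 11), rep(2.8, 4)) # index A: last four items more severe
b_B <- c(seq(0.5, 2.5, length.out = 11), rep(2.2, 4)) # index B: last four items less severe
sim_index <- function(theta, a, b, D = 1.7) {
  p <- sapply(seq_along(a), function(j) 1 / (1 + exp(-D * a[j] * (theta - b[j]))))
  matrix(rbinom(length(p), 1, p), ncol = length(a))
}
A_sim <- sim_index(theta, a, b_A)
B_sim <- sim_index(theta, a, b_B)
```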
```{r message=FALSE, warning=TRUE, include=FALSE}
library(MplusAutomation)
test<-mplusObject(MONTECARLO = "NAMES=U1-U15;
GENERATE = U1-U15(1);
CATEGORICAL = U1-U15;
GENCLASSES = c(2);
CLASSES = c(2);
NOBS = 5000;
SEED = 3454367;
NREP = 1;
SAVE = EQ_IRT_1_*.dat;
REPSAVE = all;",
MODELPOPULATION ="%OVERALL%
[c#1*2];
f by [email protected]
f@1;
[u11$1@ 2.571]
[[email protected]];",
ANALYSIS =
"TYPE = MIXTURE;
ALGORITHM=INTEGRATION;
PROCESS=8;")
for(i in 1:1){
mplusModeler(test, modelout=paste("EQ_IRT_T1_DECREASE",i,".inp",sep=""), writeData = "never",
hashfilename = FALSE)
}
```
```{r message=FALSE, warning=TRUE, include=FALSE}
test<-mplusObject(MONTECARLO = "NAMES=U1-U15;
GENERATE = U1-U15(1);
CATEGORICAL = U1-U15;
GENCLASSES = c(2);
CLASSES = c(2);
NOBS = 5000;
SEED = 345444;
NREP = 1;
SAVE = EQ_IRT_2_*.dat;
REPSAVE = all;",
MODELPOPULATION ="%OVERALL%
[c#1*2];
f by [email protected]
f@1;
[u11$1@ 2.7]
[[email protected]];",
ANALYSIS =
"TYPE = MIXTURE;
ALGORITHM=INTEGRATION;
PROCESS=8;")
for(i in 1:1){
mplusModeler(test, modelout=paste("EQ_IRT_T2_DECREASE",i,".inp",sep=""), writeData = "never",
hashfilename = FALSE)
}
```
```{r message=FALSE, warning=FALSE, results="hide", include=FALSE}
runModels(filefilter = "EQ_IRT_T")
```
The data corresponding to index A are stored in the `EQ_IRT_1_1.dat` file and the data corresponding to index B in the `EQ_IRT_2_1.dat` file. To familiarise ourselves with the data we will have a look at the files:
```{r}
A<-read.table("EQ_IRT_1_1.dat")
B<-read.table("EQ_IRT_2_1.dat")
```
We now tabulate the deprivation rates for each of the 15 indicators so that we can appreciate the similarities and differences between the two samples. We see that there are just small variations in the first 11 items and larger variations for the remaining four. This is the expected behaviour, given that the latent levels of poverty are the same in samples A and B.
```{r}
dep_propA<-unlist(lapply(A, function(x) mean(x)))
dep_propA<-round(dep_propA[1:15]*100,0)
dep_propA
dep_propB<-unlist(lapply(B, function(x) mean(x)))
dep_propB<-round(dep_propB[1:15]*100,0)
dep_propB
```
The first step in IRT test equating consists in fitting an IRT model for each of the two indices. If we think about it carefully, this is just a more formal way of comparing the measurement properties of indices A and B, i.e. whether the indicators discriminate well between the poor and the not poor, and what latent level of poverty severity each indicator captures.
The models will be fitted in **Mplus** from **R** using the `MplusAutomation` package [@Hallquist2018]. We will write the call as a small loop because we would like readers to be able to run simulations to check the properties of equating; this is why we left an `i` index in the commands.
Here we fit the IRT model for both indices, creating one script per index (e.g. `EQ_IRT_1_1.inp` for index A):
```{r message=FALSE, warning=FALSE, results="hide"}
test<-mplusObject(VARIABLE = "NAMES=V1-V15;
USEVARIABLES=V1-V15;
CATEGORICAL = V1-V15;",
MODEL= "f by V1-V15*;
f@1;")
for(i in 1:2){
mplusModeler(test, modelout=paste("EQ_IRT_",i,"_1",".inp",sep=""), writeData = "never",
hashfilename = FALSE)
}
```
Now we use `runModels()` to run both models in **Mplus** as follows:
```{r message=FALSE, warning=FALSE, results="hide"}
for(i in 1:2){
runModels(paste("EQ_IRT_",i,"_1",".inp",sep=""))
}
```
Finally, we read the models using the `readModels()` function. We store the output in the objects `irt_1` and `irt_2` and then put them in a list so that we can arrange them in the correct format for the equating (i.e. the IRT parameters *a*, *b* and *c* of each item need to be in columns). Using `lapply()` we put them in the correct format and store them in the `irtS` list.
```{r}
irt_1<-readModels(filefilter ="EQ_IRT_1")
irt_2<-readModels(filefilter ="EQ_IRT_2")
irtS<-list(irt_1,irt_2)
irtS<-lapply(irtS, function(x) {
x<-x$parameters$irt.parameterization
x<-x[1:30,]
x<-data.frame(a=x$est[1:15],b=x$est[16:30],c=0)
x
}
)
```
The `irtS` list contains the parameters of indices A and B. We will rename its elements to facilitate the specification using the excellent `SNSequate` package [@Gonzalez2014]. We need to tell `SNSequate` which data frames contain the IRT parameters of each test and which items are the *anchors*, i.e. the common items.
```{r message=FALSE, warning=FALSE}
library(SNSequate)
parm.a<-irtS[[1]]
parm.b<-irtS[[2]]
comitems<-(1:11)
parm <- cbind(parm.a, parm.b)
```
The key function to conduct IRT equating is `irt.link()`. This function takes the parameters, the common items and the kind of IRT model we want to use, in this case a two-parameter model. We will use 1.7 for the constant $D$, as is common practice.
```{r message=FALSE, warning=FALSE}
eqconst<-irt.link(parm, comitems, model = "2PL", icc = "logistic", D = 1.7)
```
The object `eqconst` contains the equating constants $A$ and $B$, estimated using several methods (mean-mean, mean-sigma, Haebara and Stocking-Lord). For this example we will use the constants from the Stocking-Lord method (`StockLord` in the output). We simply extract them as follows:
```{r}
Sl_a<-eqconst$StockLord[1]
Sl_b<-eqconst$StockLord[2]
```
The next step is to apply the constants to the IRT coefficients. What we need to do is rescale the IRT parameters of test A (because we are equating test A to test B) using the two constants, as follows:
```{r message=FALSE, warning=FALSE}
irtS[[1]]$a<-irtS[[1]]$a/Sl_a
irtS[[1]]$b<-Sl_a*(irtS[[1]]$b)+Sl_b
irtS[[1]]
```
Now we have equivalent IRT parameters, so the latent scores can be fully compared. Section \@ref(Theoryequating) described two types of equating: observed-score and true-score. We will implement both below. For the true-score equating we simply pass the objects with the IRT parameters, state the method and use the defaults $A=1$ and $B=0$ for the scaling parameters; we also specify the common items, and we store the results in the `true_eqscale` object. For the observed-score equating we additionally need to specify the `theta_points`, i.e. the standardised values of the latent poverty level; we then apply the function, changing the method and adding the theta points, and store the output in the `obs_eqscale` object.
```{r message=FALSE, warning=FALSE}
theta_points=c(-5,-4,-3,-2,-1,0,1,2,3,4,5)
true_eqscale<-irt.eq(15, irtS[[1]], irtS[[2]], method="TS", A=1, B=0, common=comitems)
obs_eqscale<-irt.eq(15, irtS[[1]], irtS[[2]], theta_points, method="OS", common=comitems,
A=1, B=0)
```
We now do some manipulations to extract the equated scores of form A in terms of B (`true_eqscale$tau_y`) and the latent value of poverty for each equated score (`true_eqscale$theta_equivalent`). We create a simple table, `true_mean`, to compare the equated score A in terms of B. We observe that the values of A in terms of B tend to be higher. Why is that? Remember that A had more severe indicators, which means that being deprived under test B was less severe. The number of deprivations in tests A and B does not reflect the same severity levels: the same latent severity under test A requires a higher observed deprivation score.
Sometimes it is tricky to interpret these findings, as there are double negatives. One way to think about this is in terms of an extreme situation. Imagine that test A is far more severe than test B: for example, test A uses indicators such as having a dirt floor, lacking electricity, an unsafe source of water and walls made of cardboard, whereas test B uses indicators such as a floor without tiles, being unable to afford electricity, lacking clean water connected to the property and lacking walls made of cement or bricks. For the same living standards, a score of 4 on test A denotes a more extreme situation than a score of 4 on test B. Thus, the same severity of A, expressed in terms of B, should be much higher. In the current example we have something similar but less dramatic.
```{r message=FALSE, warning=FALSE}
true_mean<-data.frame(Score.AintermsofB=true_eqscale$tau_y, Latent=true_eqscale$theta_equivalent)
true_mean$ScoreB<-0:15
true_mean
```
Now we will do the same extraction process for the observed score, extracting the values of the A score in terms of B (`obs_eqscale$e_Y_x`). We find similar results: the observed score of A in terms of B is higher than the score of index B.
```{r message=FALSE, warning=FALSE}
obs_mean<-data.frame(Score.AintermsofB=obs_eqscale$e_Y_x, obs_mean=c(0:15))
obs_mean
```
To see these findings from a different perspective, we will plot the results (see Figure \@ref(fig:eqfig1)). The plot uses a 45-degree line to denote a situation where scores A and B are the same, i.e. a score of 3 measures the same underlying level of poverty. The pink line is the score A in terms of B. We see that the pink line is almost always below the 45-degree line: the rescaled score A is always higher than B. For the same severity, someone assessed using index A has a higher observed deprivation score.
```{r message=FALSE, warning=FALSE, results="hide", include=FALSE}
jpeg("EQ_plot1.jpg", units="cm", width=10, height=10, res=300)
ggplot(true_mean,aes(Score.AintermsofB,ScoreB)) + geom_line(size=2, color="pink") + xlab("Measure A (scaled in B-terms)") + ylab("Measure B") + theme_bw() + geom_abline(intercept = 0, slope = 1) +
scale_y_continuous( limits = c(-.1,15), expand = c(0,0), breaks = seq(1, 15, 1) ) +
scale_x_continuous( limits = c(-.1,15), expand = c(0,0), breaks = seq(1, 15, 1))
dev.off()
```
```{r eqfig1, echo=TRUE, message=F, fig.cap="Comparison of the rescaled index A -in terms of B- with the index B."}
ggplot(true_mean,aes(Score.AintermsofB,ScoreB)) + geom_line(size=2, color="pink") + xlab("Measure A (scaled in B-terms)") + ylab("Measure B") + theme_bw() + geom_abline(intercept = 0, slope = 1) +
scale_y_continuous( limits = c(-.1,15), expand = c(0,0), breaks = seq(1, 15, 1) ) +
scale_x_continuous( limits = c(-.1,15), expand = c(0,0), breaks = seq(1, 15, 1))
```
```{r message=FALSE, warning=FALSE, results="hide", include=FALSE}
library(reshape2)
mm<-melt(true_mean, id="Latent")
ggplot(mm,aes(x=Latent,y=value, group=variable, colour=variable)) + geom_line( size=2) +
xlab("Latent Poverty") + ylab("Scales") + theme_bw()
```
## Real-data example
For our example we will use data from the UK Family Resources Survey (FRS). This survey collects data on consensual material deprivation, drawing upon the Poverty and Social Exclusion (PSE) approach, and it is used to estimate the UK child deprivation index.
### Step 1: Finding the anchors. MI Analysis
First we need to find the anchors, i.e. the invariant common items, across years. For this we will use the alignment method in **Mplus** to find the items that fulfil approximate MI. *Warning: this analysis will take quite some time to run. On my computer, it took 5 minutes.* We have already conducted an MI analysis for the FRS data (see Chapter \@ref(Chapter-5)). We strongly recommend reading that chapter before this one, but in case you have skipped that analysis, we will run the model again.
To run the analysis we create the following `MIFRS_analysis.inp` file, stored in the `test` object. The `usevariables` command lists the 17 items that are included across all years, so that we can determine which items from this subset could be used as **anchors**. Because the dimensionality of this scale is unknown, but we know that it is highly reliable, we will assume that a large part of the variance is accounted for by the higher-order factor. We have 10 years, so we tell Mplus to use 10 known classes to conduct the MI analysis across time points.
```{r eval=FALSE}
library(MplusAutomation)
test<-mplusObject(
TITLE = "MI Analysis FRS data;",
VARIABLE = "
Names are
gross4 FRSYear dep_ADDDEC dep_ADDEPLES dep_ADDHOL dep_ADDINS dep_ADDMEL
dep_ADDMON dep_ADDSHOE dep_ADEPFUR dep_AF1 dep_AFDEP2 dep_HOUSHE1
dep_CDELPLY dep_CDEPBED dep_CDEPCEL dep_CDEPEQP dep_CDEPHOL dep_CDEPLES
dep_CDEPSUM dep_CDEPTEA dep_CDEPTRP dep_CPLAY dep_CDEPACT dep_CDEPVEG
dep_CDPCOAT dep_ADBTBL id;
Missing are all (-9999) ;
usevariables = dep_ADDDEC dep_ADDHOL dep_ADDINS
dep_ADDMON dep_ADEPFUR dep_AF1 dep_AFDEP2 dep_HOUSHE1
dep_CDELPLY dep_CDEPBED dep_CDEPCEL dep_CDEPEQP dep_CDEPHOL dep_CDEPLES
dep_CDEPTEA dep_CDEPTRP dep_CPLAY;
categorical = dep_ADDDEC dep_ADDHOL dep_ADDINS
dep_ADDMON dep_ADEPFUR dep_AF1 dep_AFDEP2 dep_HOUSHE1
dep_CDELPLY dep_CDEPBED dep_CDEPCEL dep_CDEPEQP dep_CDEPHOL dep_CDEPLES
dep_CDEPTEA dep_CDEPTRP dep_CPLAY;
classes = c(10);
knownclass = c(FRSYear = 1 2 3 4 5 6 7 8 9 10);
weight=gross4;
cluster=id;",
ANALYSIS = "type = mixture complex;
estimator = ml;
alignment = fixed;
ALGORITHM=INTEGRATION;
Process=8;",
MODEL = "
%overall%
f by dep_ADDDEC dep_ADDHOL dep_ADDINS
dep_ADDMON dep_ADEPFUR dep_AF1 dep_AFDEP2 dep_HOUSHE1
dep_CDELPLY dep_CDEPBED dep_CDEPCEL dep_CDEPEQP dep_CDEPHOL dep_CDEPLES
dep_CDEPTEA dep_CDEPTRP dep_CPLAY;",
OUTPUT = "tech1 tech8 align;",
PLOT = "type = plot2;")
res <- mplusModeler(test, modelout = "MIFRS_analysis.inp",
writeData = "never",
hashfilename = FALSE,
dataout="FRS_mplusprep.dat", run = 1L)
```
The output of the MI analysis is not properly read into R, so we will simply open the `MIFRS_analysis.out` file and paste the summary of the MI analysis below (a pragmatic way to pull this block out of the file from within R is sketched after the output). The rows show the intercepts/thresholds and the loadings of each item; the numbers correspond, in order, to the FRS year variable, and the numbers in brackets indicate the parameters that are non-invariant for a given year. For example, the intercept of the first item is non-invariant in the first year but invariant for the rest, while the second item's threshold is invariant across all years. We see that this scale fulfils approximate measurement invariance and that there is a subset of items that can be used as common items, in that they have partial or full scalar invariance. These items are *1, 3, 5, 6, 7, 9, 10, 11, 16 and 17*.
```
APPROXIMATE MEASUREMENT INVARIANCE (NONINVARIANCE) FOR GROUPS
Intercepts/Thresholds
DEP_ADDD$1 (1) 2 3 4 5 6 7 8 9 10
DEP_ADDH$1 1 2 3 4 5 6 7 8 9 10
DEP_ADDI$1 1 2 3 4 5 6 7 8 9 10
DEP_ADDM$1 (1) (2) (3) (4) 5 6 7 8 9 (10)
DEP_ADEP$1 1 2 3 4 5 6 7 8 9 10
DEP_AF1$1 1 2 3 4 5 6 7 8 9 10
DEP_AFDE$1 1 2 3 4 5 6 7 8 9 10
DEP_HOUS$1 (1) (2) 3 4 5 6 7 8 9 10
DEP_CDEL$1 (1) (2) (3) (4) (5) 6 7 8 9 10
DEP_CDEP$1 1 2 3 4 5 6 7 8 9 10
DEP_CDEP$1 1 2 3 4 5 6 7 8 9 10
DEP_CDEP$1 1 2 3 4 5 6 7 8 (9) 10
DEP_CDEP$1 (1) 2 3 4 5 6 7 8 9 10
DEP_CDEP$1 1 2 3 4 5 6 7 8 9 (10)
DEP_CDEP$1 1 2 3 4 5 6 7 8 9 10
DEP_CDEP$1 1 2 3 4 5 6 7 8 9 10
DEP_CPLA$1 1 2 3 4 5 6 7 8 9 10
Loadings for F
DEP_ADDD (1) 2 3 4 5 6 7 8 9 10
DEP_ADDH (1) (2) (3) (4) 5 6 7 8 9 10
DEP_ADDI 1 2 3 4 5 6 7 8 9 10
DEP_ADDM 1 2 3 4 5 (6) 7 8 9 (10)
DEP_ADEP 1 2 3 4 5 6 7 8 9 10
DEP_AF1 1 2 3 4 5 6 7 8 9 10
DEP_AFDE 1 2 3 4 5 6 7 8 9 10
DEP_HOUS 1 2 (3) (4) 5 6 7 8 9 10
DEP_CDEL 1 2 3 4 5 6 7 8 9 10
DEP_CDEP 1 2 3 4 5 6 7 8 9 10
DEP_CDEP 1 2 3 4 5 6 7 8 9 10
DEP_CDEP 1 2 3 4 5 6 7 8 (9) 10
DEP_CDEP 1 2 3 4 5 6 7 8 9 10
DEP_CDEP (1) 2 3 4 5 6 7 8 9 10
DEP_CDEP 1 2 3 4 5 6 7 (8) (9) 10
DEP_CDEP 1 2 3 4 5 6 7 8 9 10
DEP_CPLA 1 2 3 4 5 6 7 8 9 10
```
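For convenience, this block can also be pulled out of the output file from within R; a minimal sketch, assuming the header strings match the ones shown above:
```{r eval=FALSE}
# Print the approximate measurement invariance summary from the Mplus output
out <- readLines("MIFRS_analysis.out")
start <- grep("APPROXIMATE MEASUREMENT INVARIANCE", out)[1]
end <- grep("Loadings for F", out)[1] + 17 # the 17 item rows under the loadings header
cat(out[start:end], sep = "\n")
```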
### Step 2: Extract IRT parameters for each of the 10 indices
Now we need the IRT parameters of the 21 items of each index. Yes, we need the full scale, but remember that we have 10 **anchors**. We can use the `MplusAutomation` package to estimate the 10 models. First we create an iterative syntax template that runs the same two-parameter IRT model for each of the first 7 years (they have the same items). Then we do the same for years 8, 9 and 10. We save both template files. The first file looks as follows:
```
[[init]]
iterators = i;
i = 1:7;
filename = FRS_IRT_[[i]].inp;
outputDirectory = "C:../PM Book";
[[/init]]
DATA : FILE = FRS_mplusprep[[i]].dat;
Variable:
Names are
gross4 FRSYear dep_ADDDEC dep_ADDEPLES dep_ADDHOL dep_ADDINS dep_ADDMEL
dep_ADDMON dep_ADDSHOE dep_ADEPFUR dep_AF1 dep_AFDEP2 dep_HOUSHE1
dep_CDELPLY dep_CDEPBED dep_CDEPCEL dep_CDEPEQP dep_CDEPHOL dep_CDEPLES
dep_CDEPSUM dep_CDEPTEA dep_CDEPTRP dep_CPLAY dep_CDEPACT dep_CDEPVEG
dep_CDPCOAT dep_ADBTBL id;
Missing are all (-9999) ;
usevariables = dep_ADDDEC dep_ADDHOL dep_ADDINS
dep_ADDMON dep_ADEPFUR dep_AF1 dep_AFDEP2 dep_HOUSHE1
dep_CDELPLY dep_CDEPBED dep_CDEPCEL dep_CDEPEQP dep_CDEPHOL dep_CDEPLES
dep_CDEPTEA dep_CDEPTRP dep_CPLAY dep_CDEPSUM dep_ADDEPLES dep_ADDMEL dep_ADDSHOE;
categorical = dep_ADDDEC dep_ADDHOL dep_ADDINS
dep_ADDMON dep_ADEPFUR dep_AF1 dep_AFDEP2 dep_HOUSHE1
dep_CDELPLY dep_CDEPBED dep_CDEPCEL dep_CDEPEQP dep_CDEPHOL dep_CDEPLES
dep_CDEPTEA dep_CDEPTRP dep_CPLAY dep_CDEPSUM dep_ADDEPLES dep_ADDMEL dep_ADDSHOE;
weight=gross4;
cluster=id;
Analysis: type = complex;
estimator = ml;
ALGORITHM=INTEGRATION;
Process=8;
model:
f by dep_ADDDEC* dep_ADDHOL dep_ADDINS
dep_ADDMON dep_ADEPFUR dep_AF1 dep_AFDEP2 dep_HOUSHE1
dep_CDELPLY dep_CDEPBED dep_CDEPCEL dep_CDEPEQP dep_CDEPHOL dep_CDEPLES
dep_CDEPTEA dep_CDEPTRP dep_CPLAY dep_CDEPSUM dep_ADDEPLES dep_ADDMEL dep_ADDSHOE;
f@1;
output:
tech1 tech8;
plot:
type = plot2;
```
The second file, for models 8, 9 and 10, looks like:
```
[[init]]
iterators = i;
i = 8:10;
filename = FRS_IRT_[[i]].inp;
outputDirectory = "C:../PM Book";
[[/init]]
DATA : FILE = FRS_mplusprep[[i]].dat;
Variable:
Names are
gross4 FRSYear dep_ADDDEC dep_ADDEPLES dep_ADDHOL dep_ADDINS dep_ADDMEL
dep_ADDMON dep_ADDSHOE dep_ADEPFUR dep_AF1 dep_AFDEP2 dep_HOUSHE1
dep_CDELPLY dep_CDEPBED dep_CDEPCEL dep_CDEPEQP dep_CDEPHOL dep_CDEPLES
dep_CDEPSUM dep_CDEPTEA dep_CDEPTRP dep_CPLAY dep_CDEPACT dep_CDEPVEG
dep_CDPCOAT dep_ADBTBL id;
Missing are all (-9999) ;
usevariables = dep_ADDDEC dep_ADDHOL dep_ADDINS
dep_ADDMON dep_ADEPFUR dep_AF1 dep_AFDEP2 dep_HOUSHE1
dep_CDELPLY dep_CDEPBED dep_CDEPCEL dep_CDEPEQP dep_CDEPHOL dep_CDEPLES
dep_CDEPTEA dep_CDEPTRP dep_CPLAY dep_CDEPACT dep_CDEPVEG
dep_CDPCOAT dep_ADBTBL;
categorical = dep_ADDDEC dep_ADDHOL dep_ADDINS
dep_ADDMON dep_ADEPFUR dep_AF1 dep_AFDEP2 dep_HOUSHE1
dep_CDELPLY dep_CDEPBED dep_CDEPCEL dep_CDEPEQP dep_CDEPHOL dep_CDEPLES
dep_CDEPTEA dep_CDEPTRP dep_CPLAY dep_CDEPACT dep_CDEPVEG
dep_CDPCOAT dep_ADBTBL;
weight=gross4;
cluster=id;
subpopulation = FRSYear==8;
Analysis: type = complex;
estimator = ml;
ALGORITHM=INTEGRATION;
Process=8;
model:
f by dep_ADDDEC* dep_ADDHOL dep_ADDINS
dep_ADDMON dep_ADEPFUR dep_AF1 dep_AFDEP2 dep_HOUSHE1
dep_CDELPLY dep_CDEPBED dep_CDEPCEL dep_CDEPEQP dep_CDEPHOL dep_CDEPLES
dep_CDEPTEA dep_CDEPTRP dep_CPLAY dep_CDEPACT dep_CDEPVEG
dep_CDPCOAT dep_ADBTBL;
f@1;
output:
tech1 tech8;
plot:
type = plot2;
```
We create the `*.inp` files with the `createModels()` function from the `MplusAutomation` package:
```{r warning=FALSE, message=FALSE, results='hide'}
createModels("FRS_IRT_models1to7.txt")
createModels("FRS_IRT_models8to10.txt")
```
We then run the 10 two-parameter IRT models (be patient, as it will take a few minutes to run them all) and put all the parameters in the `MI_coefs` list.
```{r warning=FALSE, message=FALSE, eval=FALSE}
runModels(filefilter = "FRS_IRT_")
```
```{r warning=FALSE, message=FALSE, results = 'hide'}
MI_coefs<-readModels(filefilter ="frs_irt")
MI_coefs<-lapply(MI_coefs, function(x) {
x<-x$parameters$irt.parameterization
x<-x[1:42,]
x<-data.frame(a=x$est[1:21],b=x$est[22:42],c=0)
x
}
)
```
### Step 3: Estimating the constants for equating
We will create a list with the parameters and use *2013/2014* as the reference year. We will thus construct pairs of years in which the reference year remains fixed.
```{r warning=FALSE, message=FALSE,}
MI_coefs<-list(MI_coefs[[1]],MI_coefs[[3]],MI_coefs[[4]],MI_coefs[[5]],MI_coefs[[6]],MI_coefs[[7]],MI_coefs[[8]],MI_coefs[[9]],MI_coefs[[10]],MI_coefs[[2]])
parm_2<-cbind(MI_coefs[[1]], MI_coefs[[10]])
parm_3<-cbind(MI_coefs[[2]], MI_coefs[[10]])
parm_4<-cbind(MI_coefs[[3]], MI_coefs[[10]])
parm_5<-cbind(MI_coefs[[4]], MI_coefs[[10]])
parm_6<-cbind(MI_coefs[[5]], MI_coefs[[10]])
parm_7<-cbind(MI_coefs[[6]], MI_coefs[[10]])
parm_8<-cbind(MI_coefs[[7]], MI_coefs[[10]])
parm_9<-cbind(MI_coefs[[8]], MI_coefs[[10]])
parm_10<-cbind(MI_coefs[[9]], MI_coefs[[10]])
parm<-list(parm_2,parm_3,parm_4,parm_5,parm_6,parm_7,parm_8,parm_9,parm_10)
comitems<- c(1,3,5,6,7,9,10,11,16,17)
```
We now use the `irt.link()` function from the `SNSequate` package to find the equating constants $A$ and $B$ using 2013/2014 as the reference year, i.e. all the deprivation scores will be expressed in terms of 2013/2014.
```{r}
eqconst<-lapply(1:9, function(i) irt.link(parm[[i]], comitems, model = "2PL", icc = "logistic", D = 1.7))
```
Once we have found the constants we can extract them. For this example we will use the Haebara estimates; you can check for yourself that the differences are very minimal when using the other methods.
```{r}
H<-do.call("rbind", lapply(eqconst, "[[", 4))
H<-split(H, seq(nrow(H)))
```
### Step 4: Applying the constants and obtaining equated scores
We then need to apply the constants to each dataset of IRT coefficients:
```{r}
FRSIRT <- MI_coefs[2:10]
parm.x<-lapply(1:9, function(i) {
FRSIRT[[i]]$a<-FRSIRT[[i]]$a/H[[i]][1]
FRSIRT[[i]]$b<-H[[i]][1]*(FRSIRT[[i]]$b)+H[[i]][2]
FRSIRT[[i]]
} )
```
```{r}
# True-score and observed-score equating of each year against the 2013/2014 reference
theta_points=c(-5,-4,-3,-2,-1,0,1,2,3,4,5)
param_x<-lapply(parm.x[1:9], function(x) list(a=x$a,b=x$b,c=x$c))
param_y<-lapply(1:9, function(x) list(a=MI_coefs[[1]]$a,b=MI_coefs[[1]]$b,c=MI_coefs[[1]]$c))
true_eqscale<-lapply(1:9, function(i) irt.eq(21, param_x[[i]], param_y[[i]], method="TS", A=1, B=0, common=comitems))
obs_eqscale<-lapply(1:9, function(i) irt.eq(21, param_x[[i]], param_y[[i]], theta_points, method="OS", common=comitems,
A=1, B=0))
```
Observed-score equating: we observe that, despite the changes to the child deprivation measure, the scale is very stable across years; there are only minor differences across years when we round the numbers. We note, however, that the years 2007-2011 show substantive differences relative to 2013/2014, particularly among the severely deprived population. For the same latent severity of poverty, a deprivation score equal to 8 in 2009/2010 is equal to a score of 9 in 2013/2014.
```{r}
obs_mean<-lapply(obs_eqscale, function(x) data.frame(score.xiny=x$e_Y_x))
obs_mean<-as.data.frame(obs_mean)
names(obs_mean)<-c("2004/2005","2005/2006","2006/2007","2007/2008","2008/2009","2009/2010","2010/2011","2011/2012",
"2012/2013")
obs_mean$Reference<-0:21
round(obs_mean,0)
```
We now plot the results (see Figure \@ref(fig:frsobseq)).
```{r frsobseq, echo=TRUE, message=F, fig.cap="Comparison of the rescaled indices. Reference year: 2013/2014"}
library(reshape2)
mm<-melt(obs_mean, id="Reference")
ggplot(mm,aes(x=Reference,y=value, group=variable, colour=variable)) + geom_line(aes(linetype=variable), size=2) +
xlab("2013") + ylab("Rescaled") + theme_bw() + scale_y_continuous( limits = c(-1.5,8), breaks = seq(-2, 8, 1) ) +
geom_abline(slope=1, intercept=0) + theme(legend.position="bottom") +
scale_x_continuous( limits = c(-1.5,8), breaks = seq(-2, 8, 1))
```
```{r message=FALSE, warning=FALSE, results="hide", include=FALSE}
jpeg("frsobseq.jpg", units="cm", width=10, height=10, res=300)
ggplot(mm,aes(x=Reference,y=value, group=variable, colour=variable)) + geom_line(aes(linetype=variable), size=2) +
xlab("2004") + ylab("Rescaled") + theme_bw() + scale_y_continuous( limits = c(-1.5,8), breaks = seq(-2, 8, 1) ) +
geom_abline(slope=1, intercept=0) + theme(legend.position="bottom") +
scale_x_continuous( limits = c(-1.5,8), breaks = seq(-2, 8, 1))
dev.off()
```
### Finding equated poverty lines
Now that we have put the deprivation scores in terms of the reference year, we can estimate poverty using adjusted poverty lines. This will help us to assess how poverty changes when we use a fixed poverty line relative to an equated one. The adjusted poverty line transforms the reference value, let's say 3, into an adjusted value, i.e. what a 3 in a given year is in terms of 2013/2014. To assess this, we first attach the adjusted deprivation scores by year, creating year identifiers in the `mm` data frame.
```{r}
library(reshape2)
mm$FRSYear[mm$variable=="2004/2005"]<-1
mm$FRSYear[mm$variable=="2005/2006"]<-2
mm$FRSYear[mm$variable=="2006/2007"]<-3
mm$FRSYear[mm$variable=="2007/2008"]<-4
mm$FRSYear[mm$variable=="2008/2009"]<-5
mm$FRSYear[mm$variable=="2009/2010"]<-6
mm$FRSYear[mm$variable=="2010/2011"]<-7
mm$FRSYear[mm$variable=="2011/2012"]<-8
mm$FRSYear[mm$variable=="2012/2013"]<-9
mm$FRSYear[mm$variable=="2013/2014"]<-10
mm$ds_uw<-mm$Reference
```
Now we can merge the `mm` data with the `FRS_mplusprep.dta` file. We will create a deprivation score considering the 21 indicators available in each year (remember that in 2012 the indicators changed). We will call this deprivation score `ds_uw` (unweighted). Then we merge the `mm` data with the `FRS_mplusprep.dta` data and assign `ds_uw` to the 2013/2014 data (as that year is not present in the `mm` file).
```{r}
library(haven)
D<-read_dta("FRS_mplusprep.dta")
# The set of indicators changed in 2012, so the score uses different columns by year
D$ds_uw<-ifelse(D$FRSYear<=7,
                rowSums(D[,c(3:23)],na.rm=TRUE),
                rowSums(D[,c(3,5,6,8,10,11:19,21:27)],na.rm=TRUE))
D<-merge(D,mm,by=c("FRSYear","ds_uw"),all.x=T)
D$value[D$FRSYear==10]<-D$ds_uw[D$FRSYear==10]
```
The next step is to create a column identifying who is poor and who is not based on the unadjusted and adjusted poverty lines. We will use $3$ as the value of the poverty line in this example (see Chapter XX for a discussion of the poverty line). To get the adjusted poverty lines we need, for each year, the rows whose unadjusted score is equal to 3.
```{r message=FALSE, warning=FALSE}
library(plyr)
D2<-subset(D,D$ds_uw==3)
lines<-ddply(D2,.(FRSYear), summarise, mean=mean(value))
lines
```
Now all we have to do is merge the adjusted lines back into the data and identify poverty, using a threshold of 3 for the unadjusted line and the equated value of 3 for the adjusted one.
```{r}
D<-merge(D,lines,by="FRSYear")
D$mean<-round(D$mean)
D$mean[D$FRSYear==10]<-3
D$poor_uw<-ifelse(D$ds_uw>=3,1,0)
D$poor_w<-ifelse(D$ds_uw>=D$mean,1,0)
```
We will do a simple calculation of the prevalence rate for each year using both poverty lines, this time taking the survey weights into account. The function `svyby()` will give us both the point estimates and the standard errors. We will also produce the bounds of the 95\% confidence intervals for the plots below.
```{r}
library(survey)
des<-svydesign(id=~1, weights=~gross4, data=D)
pov_uadj<-svyby(~poor_uw, ~FRSYear, des, svymean)
pov_adj<-svyby(~poor_w, ~FRSYear, des, svymean)
pov_adj<-merge(pov_adj,pov_uadj,by="FRSYear")
pov_adj<-pov_adj[,c(2:ncol(pov_adj))]*100
pov_adj$ci_wu<-pov_adj$poor_w+1.96*(pov_adj$se.x)
pov_adj$ci_wl<-pov_adj$poor_w-1.96*(pov_adj$se.x)
pov_adj$ci_uwu<-pov_adj$poor_uw+1.96*(pov_adj$se.y)
pov_adj$ci_uwl<-pov_adj$poor_uw-1.96*(pov_adj$se.y)
pov_adj$year<-c("2004/2005","2005/2006","2006/2007","2007/2008","2008/2009","2009/2010","2010/2011","2011/2012",
"2012/2013","2013/2014")
```
After producing the relevant calculations we can plot the evolution of child poverty with 95\% confidence intervals using both lines with `ggplot2`. The adjusted poverty line indicates that there was a drop in child poverty prior to the economic crisis and that poverty has decreased after 2011/2012. The unadjusted measure shows very low variability, and the change to the set of indicators does not seem to affect the prevalence rate (although it does in terms of severity).
```{r eqchildpov, echo=TRUE, message=F, fig.cap="Evolution of child poverty in the UK, 2004-2014, with adjusted poverty lines. Reference year: 2013/2014"}
p1<-ggplot(pov_adj,aes(y=poor_w, x=year, group=1)) + geom_line(aes(color="green")) + geom_ribbon(aes(ymin=ci_wl, ymax=ci_wu), fill="green", alpha=.4) + ylab("% Child poverty") +
  scale_y_continuous( limits = c(20,40), breaks = seq(20, 40, 5) ) + theme_bw()
p2<-p1 + geom_line(aes(y=poor_uw, x=year, group=1, color="#F5A9A9")) + geom_ribbon(aes(ymin=ci_uwl, ymax=ci_uwu), fill="#F5A9A9", alpha=.4) + theme_bw() + scale_color_discrete(name = "Poverty line", labels = c("Unadjusted", "Adjusted"))
p2
```
```{r message=FALSE, warning=FALSE, results="hide", include=FALSE}
jpeg("eqchildpov.jpg", units="cm", width=10, height=10, res=300)
p1<-ggplot(pov_adj,aes(y=poor_w, x=year, group=1)) + geom_line(aes(color="green")) + geom_ribbon(aes(ymin=ci_wl, ymax=ci_wu), fill="green", alpha=.8) + ylab("% Child poverty") +
  scale_y_continuous( limits = c(20,40), breaks = seq(20, 40, 5) ) + theme_bw()
p2<-p1 + geom_line(aes(y=poor_uw, x=year, group=1, color="blue")) + geom_ribbon(aes(ymin=ci_uwl, ymax=ci_uwu), fill="#2E9AFE", alpha=.8) + theme_bw() + scale_color_discrete(name = "Poverty line", labels = c("Unadjusted", "Adjusted"))
p2
dev.off()
```