% pip.tex

%%%%%%%%%%%% PREAMBLE %%%%%%%%%%%%%%%%%%%%%

% Initiate document
% -----------------
\documentclass[11pt,a4paper]{article}

% Document codification and language
\usepackage[utf8]{inputenc} % Document encoding (automatic conversion of accented characters)
\usepackage[T1]{fontenc}
\usepackage[USenglish]{babel}

% Set up geometry of the document
\usepackage{geometry}
\geometry{a4paper,tmargin=2.54cm,bmargin=2.54cm,lmargin=2.54cm,rmargin=2.54cm}

% Load packages
% -------------
\usepackage[centertags]{amsmath}
\usepackage{amssymb}
\usepackage{amsthm}
\usepackage{amsfonts}
\usepackage{latexsym}
\usepackage{graphicx}

\usepackage{epstopdf} %converting to PDF
\usepackage{polynom}
\usepackage{verbatim}
\usepackage{rotating}

\usepackage{setspace}
\usepackage[section]{placeins}
\usepackage[justification=centering]{caption}
\usepackage{tabularx}
\usepackage{booktabs}
\usepackage{multirow}
\usepackage{enumitem}

\def\sym#1{\ifmmode^{#1}\else\(^{#1}\)\fi} %for \sym{*}

\usepackage{array}
\newcolumntype{C}{>{\centering\arraybackslash}m{2cm}}
\newcolumntype{D}{>{\centering\arraybackslash}m{1.81cm}}

\usepackage{color}
\setcounter{tocdepth}{4}
\usepackage{adjustbox}
\usepackage{pgfplots}
\usepackage{afterpage}
\usepackage{lscape}
%\usepackage{subfigure} is an obsolete package
\usepackage{subcaption} %is the new one
\usepackage[bottom]{footmisc}
\usepackage{longtable}
\usepackage{ragged2e}
\usepackage{pdflscape}
\usepackage{titlesec}

% HYPERLINKS
\usepackage[hyphens]{url}
\usepackage[backref]{hyperref}
\hypersetup{colorlinks=true,
            linkcolor=black,
            urlcolor=blue,
            citecolor=gray}

\renewcommand\backrefxxx[3]{%
    \hyperlink{page.#1}{\textcolor{red}{$\uparrow$#1}}%
    }

\usepackage{scalerel,stackengine}
\stackMath
\newcommand\reallywidehat[1]{%
    \savestack{\tmpbox}{\stretchto{%
        \scaleto{%
        \scalerel*[\widthof{\ensuremath{#1}}]{\kern-.6pt\bigwedge\kern-.6pt}%
        {\rule[-\textheight/2]{1ex}{\textheight}}%WIDTH-LIMITED BIG WEDGE
        }{\textheight}% 
    }{0.5ex}}%
    \stackon[1pt]{#1}{\tmpbox}%
}

\usepackage[round]{natbib}
\bibliographystyle{ecta}

\pgfplotsset{compat=1.14}
\begin{document}

% Title page
% ----------

\font\myfont=cmr12 at 19pt
\title{\myfont{Supporting Teacher Autonomy to Improve Education Outcomes: Experimental Evidence from Brazil}\thanks{This draft benefited from comments from Pablo Acosta, Guadalupe Bedoya, Paul Christian, Flavio Cunha, Steven Glover, Florence Kondylis, Arianna Legovini, John Loeser, Pedro Olinto, Daniel Rogger, Ravi Somani, and seminar participants at São Paulo School of Economics (EESP-FGV). We thank the World Bank i2i fund and the World Bank Brazil Country Office for generous  research  funding. Finally, we thank the staff at the Rio Grande do Norte State Secretariat of Education for being great research partners. Computational reproducibility was verified by DIME Analytics. Details of the reproducibility checklist can be found in the \href{https://github.com/worldbank/brazil-pip-education/blob/master/pip_app.pdf}{Online Appendix}. The reproducibility package is available on \href{https://github.com/worldbank/brazil-pip-education}{GitHub}. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.}}

\newcommand*\samethanks[1][\value{footnote}]{\footnotemark[#1]}

\author{%
    Caio Piza\thanks{Development Impact Evaluation (DIME), The World Bank, 1818 H Street NW, Washington, DC 20433, United States.}%
    \and Astrid Zwager\samethanks[2]%
    \and Matteo Ruzzante\samethanks[2]%
    \and Rafael Dantas\samethanks[2]%
    \and Andre Loureiro\thanks{Education Global Practice, The World Bank.}
}

\date{}

\maketitle

\begin{abstract}
	
    \setstretch{1.2}
    \noindent What is the impact of greater teacher autonomy on student learning? This paper provides experimental evidence from a program in Brazil. The program supported teachers, through a combination of technical assistance and a small grant, to autonomously develop and implement an innovative project aimed at engaging their students. The findings show that the program improved student learning by 0.15 standard deviations and grade passing by 13 percent in sixth grade, a critical year of transition from primary to lower-secondary education. The paper explores two mechanisms: teacher turnover and student socio-emotional skills. Teacher turnover is reduced by 20.7 percent, and the impacts on student outcomes are concentrated in the schools with the largest reductions. The findings also indicate positive impacts on conscientiousness and extroversion among the students. The results suggest that increasing the autonomy of public servants can improve service delivery, even in a low-capacity context.
    
\end{abstract}

\vspace{1.5em}

\begin{flushleft}
    \textbf{Keywords:}  autonomy, mentoring, school resources, teacher motivation, socio-emotional skills, education policy, lower-secondary education.  \\[1em]
    \textbf{JEL Codes:} H52, I21, M52, O15. \\
\end{flushleft}

\newpage
\sloppy

% Set up paragraph
\doublespacing

\setlength\parskip{1em}
\setlength\parindent{0pt}

\titlespacing{\section}{0pt}{0.5\parskip}{*-0.75}
\titlespacing{\subsection}{0pt}{0.5\parskip}{*-0.75}
\titlespacing{\subsubsection}{0pt}{0.5\parskip}{*-0.75}


%%%%%%%%%%%%%%%%% INTRODUCTION %%%%%%%%%%%%%%%%%%%%%%%%%

\section{Introduction} \label{intro}

%%% PARA 1: Quality of education matters for growth and equality

Over the last three decades many countries have succeeded in putting kids in school, but gains in learning have been limited \citep{WDR2018}. Improving the quality of education is a priority for many countries given its role in building human capital, affecting individual earning prospects and long-term growth (\citealp{hanushek2008role, hanushek2012better}; \citealp{chetty2014measuringII}). Despite increasing resource allocation to the sector, governments have struggled to substantially improve education outcomes (\citealp{mcewan2015improving}; \citealp{glewwe2016improving}). The recent World Development Report (WDR) points to a `learning crisis' faced by many countries, including Brazil, and the urgent need for solutions \citep{WDR2018}.

%%% PARA 2: Focus on teacher

Attempts to improve student outcomes often focus on increasing teacher effectiveness, given teachers' central role in the education production function (\citealp{chetty2014measuringI, chetty2014measuringII}; \citealp{araujo2016teacher}; \citealp{jackson2018test}). While policies tend to put more emphasis on monetary incentives (e.g., salary increases and performance-based payments) or on enhancing qualifications \citep{evans2016really}, the role of teacher motivation is often neglected \citep{WDR2018}. We present experimental evidence on an education policy in Brazil that seeks to motivate teachers by giving them the autonomy to design and implement a local project to tackle their specific issues, instead of ``prescribing solutions''.\footnote{The paper does not speak to the wide literature on school decentralization, which involves allowing local management of resources and/or curriculum \citep{hanushek2013does}.}

%%% PARA 3: Autonomy

The effect of granting civil servants more autonomy is an empirical question. On the one hand, increasing local autonomy may encourage agents to reduce effort due to the limited ability of the central government to observe and reward effort accordingly. For example, decentralization may backfire if resources are captured by local entities or used inefficiently (\citealp{burgess2012political}; \citealp{banerjee2020improving}). On the other hand, greater autonomy can improve service delivery by providing a non-monetary incentive for agents and adding meaning to the job \citep{cassar2018nonmonetary}\footnote{The association between autonomy and motivation is at the foundation of Self-Determination Theory in the social psychology literature (\citealp{deci1985intrinsic}; \citealp{ryan2017self}). Studies have focused on how monetary rewards might crowd out motivation, as they undermine autonomous decision-making (\citealp{deci1971effects}; \citealp{amabile1976effects}; \citealp{pritchard1977effects}), and how non-monetary incentives that give greater autonomy can enhance motivation \citep{zuckerman1978importance}.} and by leveraging their superior knowledge of local context (\citealp{duflo2018value}; \citealp{rogger2018hierarchy}). \cite{rasul2018managementand} and \cite{rasul2018managementof} find that more autonomy is positively correlated with the quality and completion of public projects, even in contexts of low government capacity, and \cite{bandiera2020allocation} suggest that autonomy can reduce the misalignment of incentives between officials and taxpayers, with potential welfare benefits for society. We investigate whether increasing the autonomy of teachers can achieve efficiency gains in the delivery of public education.

%%% PARA 4: Introduction of what we study in what context

We study the Pedagogical Innovation Project (\textit{Projeto de Inovação Pedagógica} -- PIP), which was implemented by the State Secretariat of Education (SEE) of Rio Grande do Norte (RN). RN consistently scores at the bottom of the Brazilian Education Development Index (\textit{Índice de Desenvolvimento da Educação Básica} -- IDEB).\footnote{IDEB is a national indicator for the quality of education and combines information on student test scores and passing rates. IDEB was established in 2007 and became one of the principal outcomes for Brazilian educational policy, setting targets for schools, municipalities and states.} Through seminars and the support of a dedicated mentor, teachers developed a diagnostic of their main pedagogical challenges and a context-specific project to address them. Mentors complemented local capacity while ensuring close ties with the central government, possibly reducing moral hazard concerns associated with strategic behavior of local staff. Approved proposals were awarded financial support to implement the projects, ranging from about US\$ 7,500 to US\$ 11,000,\footnote{Equivalent to 30,000 to 45,000 Brazilian reais, using the exchange rate on 12/31/2015.} or a median of US\$ 139 per student, i.e., 3.6\% of average annual expenditure per student in Brazil \citep{eag-2016}. The decentralized approach sought to ensure the relevance of the interventions as well as to motivate teachers by giving them autonomy over design and implementation.

Our experiment focuses on the 2016 edition, which targeted the final grade of primary education (5\textsuperscript{th} grade), the first grade of lower secondary education (6\textsuperscript{th} grade) and the first grade of upper secondary education (10\textsuperscript{th} grade), with the latter two generally being the most problematic in terms of repetition and dropout rates, according to the school census (INEP). Of 299 schools eligible to receive the project in 2016, 130 schools were randomly invited to participate and submit a proposal.

%%% Para 5: Present main results - learning and progression

We show that the project had substantial impacts on both learning and progression for 6\textsuperscript{th} grade students, a critical grade in school transition from primary to lower-secondary education \citep{Santos2017}. Our ITT estimates point to an impact of 0.18 SD in math and 0.16 SD in Portuguese and slightly lower impacts in human (0.10 SD) and natural sciences (0.12 SD). We estimate the average impact on learning to be equivalent to 0.5 extra year of schooling, or 0.36 year per US\$ 100 spent. Consistent with the results on learning gains, we find substantial improvements in Grade 6 passing rates, which are estimated to increase by 8.5 percentage points (pp), a 13\% improvement compared to the control mean. A back-of-the-envelope calculation of the combined effect of increased learning and higher probability of finishing high school suggests a net present value (NPV) of the expected years of schooling on future earnings ranging from US\$ 7,000 to US\$ 13,000, or 28 to 52 Brazilian minimum wages. Compared with the cost of the program per student (US\$ 139), the estimated NPV suggests that PIP was a high-return investment for the state.

%%% Para 6: Mechanisms

We empirically investigate two, potentially complementary, channels through which the program may have impacted learning and grade progression: (1) teacher turnover and (2) student socio-emotional skills. 

%%% Para 7: Teacher turnover

We hypothesize that turnover drops if teachers feel more motivated/committed to implement their own pedagogical projects during the academic year. A drop in teacher turnover is in turn expected to affect student outcomes based on the well-documented negative relationship between high teacher turnover and learning (\citealp{akhtari2018political}; \citealp{ronfeldt2013teacher}; \citealp{jackson2014teacher}). We estimate that the PIP reduced teacher turnover in Grade 6 by 20.7\% over the control mean. To test whether the reduction in teacher turnover affected final student outcomes, we estimate heterogeneous effects by teacher turnover at baseline. We find that the impacts on both teacher turnover and learning are concentrated in schools with higher teacher turnover at baseline. Impacts of the program in schools with high teacher turnover at baseline approach 0.28 SD on learning and 7 pp on dropout. To assess whether the results are driven by a mechanical reduction in teacher turnover alone, we leverage the fact that most Grade 6 teachers also teach Grade 7. We estimate a similar reduction in 7\textsuperscript{th} grade teacher turnover, yet do not find impacts on progression rates of 7\textsuperscript{th} grade students. These results provide suggestive evidence that the increased motivation may not necessarily spill over to other grades in the absence of the project.

%%% PARA 8: Socio-emotional skills

We also expect the program to impact students' socio-emotional skills either directly by improving student-teacher interactions and boosting students' motivation through the implementation of the innovative projects, or indirectly through impacts on cognitive skills (\citealp{cunha2007technology,cunha2008formulating}; \citealp{cunha2010estimating}). We measure the Big Five personality traits to test this mechanism. We find positive impacts of 0.17 SD on conscientiousness, the trait most commonly associated with the acquisition of cognitive skills (\citealp{poropat2009meta}; \citealp{ivcevic2014predicting}), and 0.20 SD on extroversion for 6\textsuperscript{th} graders. Grade 6 is a critical moment for students as they transition from primary to lower-secondary education. During this transition, students move from having a single teacher to multiple teachers resulting in weaker ties between students and teachers, which has been shown to affect learning and socio-emotional skills (\citealp{bedard2005middle}; \citealp{hanewald2013transition}; \citealp{Santos2017}). Improving teacher and student motivation might counterbalance the weakening of student-teacher interaction at this stage.

%%% PARA 9: Contributions to the literature

Our paper makes three main contributions. First, it provides experimental evidence that an increase in the autonomy of local civil service providers, complemented with technical assistance, can improve outcomes of interest even in a low-capacity environment. While monetary incentives, such as performance-based payments, have achieved positive results in some contexts (\citealp{lavy2009performance}; \citealp{muralidharan2011teacher}; \citealp{duflo2012incentives}; \citealp{mbiti2019inputs}), these schemes may not be feasible in many developing countries. Our findings suggest that results can be achieved in a more cost-effective way by exploiting the potential complementarity between non-monetary incentives and intrinsic motivation (\citealp{bowles2012economic}; \citealp{ashraf2014no}). However, complementing local capacity through technical assistance may be critical, as autonomy alone has had limited success in the Brazilian context (\citealp{almeida2016assessing}; \citealp{de2016impacto}).

Second, while the literature on school grants (\citealp{das2013school}; \citealp{blimpo2015parental}; \citealp{beasley2017willing}; \citealp{carneiro2020school}) shows that decentralizing resources and the authority over their use may be a more efficient way to target resources by leveraging better information of local decision-makers, our results show that efficiency gains can be obtained by leveraging mostly existing resources. This result speaks to the growing evidence pointing to the importance of school management in improving learning outcomes through better resource allocation (\citealp{abdulkadirouglu2011accountability}; \citealp{dobbie2013getting}; \citealp{rockoff2012information}; \citealp{taylor2012effect}; \citealp{fryer2014injecting}; \citealp{fryer2017management}).

Finally, there has been increasing attention to building students' socio-emotional skills early in school due to their role in complementing cognitive skills in predicting academic achievement and labor market outcomes (\citealp{heckman2001importance}; \citealp{heckman2006effects}; \citealp{almlund2011personality}; \citealp{lindqvist2011labor}). However, it is not common practice to measure socio-emotional skills, especially outside the context of interventions that aim to affect them directly (\citealp{heckman2000policies}; \citealp{heckman2012hard}; \citealp{sanchez2016taking}). Our results show that measuring socio-emotional skills can expand the understanding of how these skills can be affected and may avoid understating important welfare benefits of education programs.

%%% PARA 10: Organization of the paper

The remainder of the paper is organized as follows. Section \ref{sec:context} details the context and intervention. Section \ref{sec:design_data} describes the experimental design and data sources. Section \ref{sec:methodology_results} presents the empirical strategy and results, while Section \ref{sec:mechanism} explores potential mechanisms driving the main impacts. Section \ref{sec:policy} provides back-of-the-envelope estimates for the impact of the program on school quality indicators and individuals' expected earnings. Section \ref{sec:conclude} concludes with policy recommendations.


%%%%%%%%%%%%%%%%%%%% CONTEXT %%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Context and Intervention} \label{sec:context}

\subsection{Education in Brazil and Rio Grande do Norte} \label{sec:brazil}

Brazil has made significant progress toward guaranteeing universal access to primary education, reaching a 99\% enrollment rate for children between the ages of 6 and 14 in 2018.\footnote{Source: National Household Survey -- \textit{Pesquisa Nacional por Amostra de Domicílios (PNAD) Contínua} -- and School Census.} Despite this substantial progress, large challenges remain to keep kids in school and ensure quality of education. Grade repetition and dropout rates in primary and secondary schools are among the highest in the world. Large age-grade distortions are found across all grades, and an average student takes 15 years -- instead of 12 -- to graduate from high school. Among 19-year-olds, only 63.5\% have graduated from high school.\footnote{Source: \textit{PNAD Contínua} and School Census.} Despite recording the largest improvement in math scores in the PISA evaluation between 2003 and 2012, Brazil still ranks below all Latin American and Caribbean (LAC) countries except for Peru and the Dominican Republic \citep{data-00365}.

A major constraint to school quality and student achievement in Brazil is principal and teacher turnover \citep{akhtari2018political}. In the RN public school system, 30 percent of teachers leave their schools each year, potentially disrupting school operations and compromising personnel collaboration.\footnote{Source: teacher census (INEP). One reason for the high turnover relates to how placement of teachers is organized in Brazil. Teachers are initially placed at any school with a vacancy, with limited consideration of their location preferences. Every year, teachers are allowed to compete for new vacancies \citep{akhtari2018political}.} Using school-level data from INEP, we find that teacher permanence is positively correlated with passing rates and negatively correlated with age-grade distortion, retention and dropout rates, for both primary and secondary schools (Table \ref{tab:correlates_turnover}).\footnote{Teacher permanence is an index produced by INEP. It averages, at the school level, the number of years a teacher stays in a given school over a 5-period time frame, weighting for the number of teachers in a school. The index ranges from 0 to 5, where a higher number indicates more regularity of the teacher pool in a school.}

These national figures hide a high degree of regional variation (Figure \ref{fig:IDEB_byState}). In this paper we study an education project implemented by the state government of RN, one of Brazil's poorest states. In the 2015 national standardized exam,\footnote{\textit{Sistema de Avaliação da Educação Básica} -- SAEB.} RN state schools scored at the bottom of the learning distribution in both primary and lower secondary education.\footnote{2015 is the year prior to the roll-out of the interventions we study in this paper.} The difference in 5\textsuperscript{th} grade proficiency levels between the average student in RN and the best performing state is the equivalent of 2.5 years of education.\footnote{This uses the calculation proposed by \cite{alves2016desigualdades}.} The low level of learning is reflected in the state's progression indicators. In 2015, average school dropout at upper secondary education was 12.4\% compared to the national average of 8.8\%.\footnote{Source: INEP.} The combination of high dropout rates and low learning outcomes puts RN state schools near the bottom of the national quality of education indicator (Figure \ref{fig:IDEB_byState}).

%%%%%%%%%%%%%%%%%%%%%%% PIP %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{The Pedagogical Innovation Project (PIP)} \label{sec:pip}

The Pedagogical Innovation Project (\textit{Projeto de Inovação Pedagógica} -- PIP) was developed by the RN SEE and aimed at improving both student progression and learning outcomes in primary and secondary state schools, which represent 16\% of elementary schools, 41\% of middle schools, and 94\% of high schools in the public education system. The program targets Grades 4, 5, 6 and 10, the grades with the most critical dropout and retention rates (Figure \ref{fig:grade_comparison}). The intervention has three main components: i) a high degree of autonomy for teachers to design and implement a project based on a context-specific diagnostic of the main challenges, with the SEE having only an advisory role to ensure minimum quality standards; ii) continuous technical support to schools for the design and implementation of the project; iii) a grant to implement the project.

The decentralized approach of PIP sought to ensure the relevance of the interventions as well as to motivate teachers and students. The design of the project is based on three premises: i) school staff are better equipped than central-level bureaucrats to identify solutions to school-specific problems using local knowledge; ii) allowing school staff autonomy over the selection and development of interventions motivates teachers by giving them the opportunity to implement activities of their own authorship; iii) innovative projects can engage students and improve student-teacher interactions.

The PIP was launched in 2014 and, between the 2015 and 2018 school years, covered a total of 397 of the 639 state schools. The SEE supported teachers during project development and implementation. Here we detail the support in each of these phases.

\subsubsection*{Project Development} \label{sec:project}

To initiate the design phase, schools are invited to participate in a three-day workshop on innovative and project-oriented teaching practices. During break-out sessions, participants identify the main pedagogical challenges they face and discuss how the innovation concepts would fit their context. Each school is provided with an individualized report card comparing its test scores and passing rates with the averages for the state, region, and its city.

Following the workshop, each school is assigned a mentor (\textit{professor orientador}), who is part of the SEE central team.\footnote{Mentors are selected based on their experience with implementing pedagogical projects in schools and all are existing staff of the state secretariat.} Each mentor is assigned to 10 schools on average. The mentor first works with the school to prepare a diagnostic of their challenges, such as low academic performance, grade retention, indiscipline, lack of motivation, or school dropout. Based on the diagnostic, schools identify possible drivers and propose an innovative and actionable plan to improve the targeted education outcomes. The mentor then works with the school to translate the diagnostic and proposed project into a detailed implementation plan that is reviewed by the SEE of RN. 

\subsubsection*{Implementation Support and Monitoring} \label{sec:implement}

Schools with approved proposals are awarded a fixed amount of funding to execute their projects. Schools can only spend the funds on inputs directly related to their proposed project. The grant amount depends on the number of classes included in the project and ranges from R\$ 30,000 to R\$ 45,000, i.e., US\$ 7,576 to US\$ 11,364 (Figure \ref{fig:grant}).\footnote{Using the exchange rate on 12/31/2015.} The median transfer per enrolled student was R\$ 555.55, the equivalent of US\$ 139, which represents about 3.6\% of average annual expenditure per student in Brazil \citep{eag-2016}.
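
As a rough consistency check on the grant range, and not as an official rate, the reported amounts imply an exchange rate of about R\$ 3.96 per US dollar. This rate is inferred from the two reported pairs of figures only:
\begin{equation*}
    \frac{30{,}000}{7{,}576} \approx \frac{45{,}000}{11{,}364} \approx 3.96 .
\end{equation*}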

Through subsequent visits and remote follow-up, mentors closely support the implementation of the projects. Mentors help schools obtain the necessary paperwork to access the funding and prepare procurement of materials.

\subsubsection*{Characteristics of Sub-Projects} \label{sec:subprojects}

Schools were encouraged to explore settings beyond traditional lecture-style lessons to improve student-teacher interactions and to embed their project across disciplines, increasing coordination across subjects. Proposed projects were evaluated by the SEE. Each project had to demonstrate a methodology that was innovative for that school's context, though not necessarily a frontier methodology. In practice, all submitted proposals were approved. Most proposals fell into one of the following three categories:

\textit{Writing and reading}: These projects were designed to improve students' literacy and oral communication skills. They included activities such as studying Brazilian literature classics, publishing a school newspaper, broadcasting a school radio program, staging theater plays, or organizing book fairs and poetry contests.
 
\textit{Communication, media and culture}: The focus of this type of project was to introduce students to digital tools and give teachers the opportunity to use new technologies and social media. Examples include the development of videogames and robotics classes.
 
\textit{Culture and arts}: The goal of these projects was to explore different forms of cultural and artistic expressions, such as painting, graffiti, dance, theater, cinema, and music. 


%%%%%%%%%%%%%%%%%%%%%%% DESIGN AND DATA %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Experimental Design and Data} \label{sec:design_data}

The PIP was launched in 2014 with implementation taking place in the 2015 school year. Each year a subset of state schools joined the project. Our study focuses on the cohort of schools that initiated design in 2015 and implemented the project in the 2016 school year. This section further details the selection of participating schools and data sources.

\subsection{Experimental Design} \label{sec:experiment}

To ensure enough operational capacity, only a sub-sample of schools was selected to participate each year. To determine the pool of eligible schools for the 2016 edition, three filters were applied. First, only schools that would not change director between the 2015 and 2016 school years were included to ensure buy-in for the prepared projects. State legislation requires directors to change schools every two years, resulting in about half the schools changing director every year.\footnote{As a result, none of the schools from the first (2015) cohort was considered. This legislation has since slightly changed to allow for directors to stay on longer.} Second, the 2016 edition targeted the final grade of primary education (5\textsuperscript{th} grade), the first grade of lower secondary education (6\textsuperscript{th} grade) and the first grade of upper secondary education (10\textsuperscript{th} grade).\footnote{Other editions of the program included 4\textsuperscript{th} grade.} Only schools offering at least one of those three grades were considered. Finally, schools that participated in the federal project ProEMI (\textit{Ensino Médio Inovador}) were excluded.\footnote{\textit{Ensino Médio Inovador} (Innovative High School project -- ProEMI) was established in 2009 by the Ministry of Education as a policy aimed to support innovative curricular projects in upper secondary schools through technical and financial assistance.} Out of 639 state schools, 299 were eligible to receive the PIP project in 2016.

The selection of participating schools was done randomly, which forms the basis of our identification strategy. The RN SEE aimed to support a total of 130 schools in the 2016 school year. The randomization was stratified by school grade and region. From the 2015 PIP cohort we learned that schools typically participate in just one grade. The SEE preferred to focus on higher grades, which is typically where schools experience more challenges. Therefore, schools offering several of the target grades (5\textsuperscript{th}, 6\textsuperscript{th} and 10\textsuperscript{th}) were assigned to participate with the highest target grade they offered. The state is divided into four regions and, with the three grade levels, this resulted in a total of 12 strata. In each stratum, around 40\% of the schools were allocated to the treatment group. In each selected school, only the highest target grade, i.e., 5\textsuperscript{th}, 6\textsuperscript{th}, or 10\textsuperscript{th}, was selected to participate. Larger schools may have more than one class in a grade, in which case all classes, and thus all students, in the selected grade participate. Not all teachers of a grade necessarily participate. Selection of teachers is decided within schools and is likely not random. When analyzing student and teacher outcomes, we always consider all students and all teachers of the selected grade.

The randomization resulted in 130 eligible schools in the treatment group and 169 in the control group (Panel A in Table \ref{tab:sample}). All selected schools were invited to the workshops held in the final months of the 2015 school year. The randomization was performed using the 2015 school census. After the start of the 2016 school year, 19 schools had closed or no longer offered the grade that had been selected for the intervention.\footnote{Eight schools had closed, six were not offering regular classes anymore, four were selected for the 5\textsuperscript{th}-grade experimental group but were not offering 5\textsuperscript{th} grade anymore, and one was in the 6\textsuperscript{th} grade group but was not offering 6\textsuperscript{th} grade anymore.} This leaves us with a final sample of 280 schools effectively allocated to the experiment at the beginning of the 2016 school year (Panel B in Table \ref{tab:sample}), 126 in the treatment group and 154 in the control group. The geographical distribution and treatment assignment of these schools are shown in Figure \ref{fig:treat_map}. Across the selected grades in each school, a total of 19,899 students were included in the experiment, 9,432 in treated schools and 10,467 in control schools (Panel C in Table \ref{tab:sample}).
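
The sample counts reported above can be reconciled as follows. This is a simple accounting check that uses only the figures stated in the text and in the footnote on schools dropped after randomization:
\begin{align*}
    \text{Randomized schools:} \quad & 130 + 169 = 299, \\
    \text{Schools dropped:} \quad & 8 + 6 + 4 + 1 = 19, \\
    \text{Final sample of schools:} \quad & 299 - 19 = 280 = 126 + 154, \\
    \text{Students:} \quad & 9{,}432 + 10{,}467 = 19{,}899.
\end{align*}
By subtraction, 4 of the dropped schools had been assigned to the treatment group ($130 - 126$) and 15 to the control group ($169 - 154$).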

% Sample sizes for the experiment
\begin{table}[htbp]
    \caption{Sample}
    \label{tab:sample}
    \centering
    \input{DataWork/Output/Tables/tab1-sample.tex}
\end{table}
%

\subsection{Data} \label{sec:data}

To assess the impact of the PIP we leverage three main sources of data. We use administrative data, such as the Brazilian school census and data from the SEE, and collect data on cognitive and socio-emotional skills.

\subsubsection*{Administrative Data} 
We use administrative data from both the annual national school census and the state's education monitoring system to obtain school, teacher and student characteristics and progression.\footnote{The school census is carried out on an annual basis by the \textit{Instituto Nacional de Estudos e Pesquisas Educacionais Anísio Teixeira} (INEP) of the Brazilian Ministry of Education.} The census contains data on overall school characteristics, such as location, presence of a library, science lab and internet, and the number of teachers, students and classes.\footnote{We extract school location and distance from the capital of the state, Natal, by scraping the Google Maps API with school names.} The census also allows us to track individual teachers and students over time, even if they move to other schools within the state.\footnote{The Brazilian Education Census is implemented in two stages. At the beginning of the school year (i.e., May-July) initial student enrollment data are collected, together with the survey of school, teacher and student characteristics. In February-March of the following year, data are collected on passing/retention and on ``movement'', which includes dropout and transfers.} The state's monitoring system, the \textit{Sistema Integrado de Gestão da Educação} (SIGEduc) portal, provides data on passing, dropout, and retention rates at the grade level.\footnote{Progression rates are reported at the end of the school year (i.e., February-March) by principals, and then validated by INEP.} Where possible, the analysis of the results uses both sources. Finally, the SEE provided data on school directors and on the implementation of the PIP, such as the score of the proposal, resources allocated to schools and execution of the project. The rate of implementation of the proposed plan is assessed by the mentor at each visit.

\subsubsection*{Learning Outcomes}
To measure student learning, we use the standardized state exam in math, Portuguese, human and natural sciences, which was administered to 5\textsuperscript{th}, 6\textsuperscript{th}, 10\textsuperscript{th} and 12\textsuperscript{th} grades at the end of the 2016 school year. The RN standardized exam was introduced in 2016 and expanded to include all the PIP priority grades. For math and Portuguese, we obtain the scores rescaled to the national standardized test (SAEB), which allows us to put the impact on student learning in the Brazil-wide context.

\subsubsection*{Socio-Emotional Skills} 
To analyze the impact on socio-emotional skills, we measure the Big Five personality traits (neuroticism, extroversion, conscientiousness, agreeableness, openness).\footnote{The taxonomy of the five-factor model of personality we follow in this paper has been developed in the psychology literature following seminal work by \citet{fiske1949consistency}.} We use a self-reported test developed and adapted for younger students in Brazil by the \textit{Instituto Ayrton Senna} (IAS). This test and equivalent instruments are widely used in the literature to assess socio-emotional skills.\footnote{See \cite{kautz2014fostering} for a review of the recent advances on measuring socio-emotional skills.}\textsuperscript{,}\footnote{Research has shown that individuals with the same level of a trait may assess themselves at very different levels on a Likert scale \citep{primi2016anchor}. To address this issue, we administered a set of anchoring vignettes, which help reveal the respondent's latent scale and response style, allowing us to calibrate the individual responses following the method suggested in \cite{primi2016anchor}. The vignettes describe three hypothetical individuals that represent three clearly distinct points on a scale (low, medium and high). Students are asked to assess the personality trait of each of the characters along a 1-5 Likert scale. The student's self-evaluation is then calibrated to a 1-7 scale according to her responses to the vignettes.} The test was administered at the end of the 2016 school year to the grade that entered the randomization (the highest grade offered among 5\textsuperscript{th}, 6\textsuperscript{th} and 10\textsuperscript{th} grade, see Section \ref{sec:experiment}). In case a school had multiple classes in the same grade, one class was randomly chosen.
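
To illustrate how three vignettes can map a 1-5 self-report into a 1-7 scale, consider the standard nonparametric recoding for anchoring vignettes. This is only a sketch: it assumes that the student rates the three vignettes in the intended order, $z_1 < z_2 < z_3$, and the symbols $y$ (self-report), $z_j$ (rating of vignette $j$) and $C$ (calibrated score) are illustrative notation. The calibration actually applied follows \cite{primi2016anchor} and may treat ties and order violations differently.
\[
    C =
    \begin{cases}
        1 & \text{if } y < z_1 \\
        2 & \text{if } y = z_1 \\
        3 & \text{if } z_1 < y < z_2 \\
        4 & \text{if } y = z_2 \\
        5 & \text{if } z_2 < y < z_3 \\
        6 & \text{if } y = z_3 \\
        7 & \text{if } y > z_3
    \end{cases}
\]
With $J = 3$ vignettes, this recoding yields $2J + 1 = 7$ ordered categories.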


\subsection{Validity of the Experiment} \label{sec:balance}

\subsubsection*{Randomization}
To examine whether the randomization resulted in balanced samples across control and treatment groups, we compare observable characteristics prior to roll-out of the project. Table \ref{tab:baltab} shows several characteristics at the school, grade, teacher and student level, including some of the key outcomes of the intervention, such as repetition and dropout rates. For grade, teacher and student comparisons, we only consider the classes in the eligible grade for that school (see description in Section \ref{sec:experiment}). Columns 2 and 4 show the means in the treatment and control groups. In column 5, we report both standard p-values based on t-tests of differences in means and p-values computed using randomization inference.\footnote{See \cite{young2019channelling} on the importance of randomization statistical inference in experimental setups and \cite{hess2017randomization} for a guideline on its implementation.} Generally, we find no statistically significant differences when comparing the treatment and control groups. Joint significance tests of school and student characteristics confirm that these variables do not jointly predict treatment assignment (F-stat of 0.69 and 1.76, respectively).
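
For reference, a randomization inference p-value of this kind is typically the share of re-randomized assignments that yield a test statistic at least as extreme as the observed one. The expression below is a generic sketch rather than the exact implementation: $R$ is the number of re-randomizations ($R = 10{,}000$ in Table \ref{tab:baltab}), assumed here to respect the stratification, and $t_{obs}$ and $t_r$ are illustrative symbols for the observed and re-randomized statistics.
\[
    p_{RI} = \frac{1}{R} \sum_{r=1}^{R} \mathbf{1}\left\{ |t_r| \geq |t_{obs}| \right\}
\]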

Because randomization was stratified by grade level, we also report p-values for the comparison within each grade in columns 6-8 to test the validity of the sub-group analysis. We find a statistically significant, yet small, difference in the age of 6\textsuperscript{th} graders. The control group is on average 0.25 years older than the treatment group. In the analysis we check the robustness of the results to the inclusion of this unbalanced variable as a control.

% Balance table comparing treatment and control schools
\begin{table}[htbp]
    \caption{Balance Table}
    \label{tab:baltab}
    \centering
    \begin{adjustbox}{max width=\textwidth}
        \begin{tabular}{lcccccccc} \hline \hline
            \input{DataWork/Output/Tables/tab2-baltab.tex}
            \multicolumn{9}{@{}p{1.55\textwidth}}{\textit{Notes}: For school and grade level comparisons we use data from the 2015 Rio Grande do Norte school census (\textit{Instituto Nacional de Estudos e Pesquisas Educacionais Anísio Teixeira} -- INEP) and progression rates from the \textit{Sistema Integrado de Gestão da Educação} (SIGEduc) portal. At the teacher and student level, we compare socio-demographics at the beginning of the year of the intervention, i.e., 2016, from that year's Rio Grande do Norte school census. Teacher data cover only those teachers who taught in the classes involved in the project, and not those from other grades. Student data cover students enrolled in those grades at the beginning of the school year. Two of the 280 schools in the sample are missing from the census. Standard errors (SE) are robust in Panels A and B, and clustered at the school level in Panels C and D. Strata (i.e., region and grade) fixed effects are included in all the estimated regressions. We show both standard p-values and p-values computed using randomization inference (RI) with 10,000 repetitions for the whole sample and for each grade.}
        \end{tabular}
    \end{adjustbox}
\end{table}
%

\subsubsection*{Implementation}

All 130 initially selected schools were invited to participate in the workshop, which took place in late 2015. Of the 128 schools that attended, all prepared and submitted a proposal. All submitted proposals were approved, some after modifications. At the beginning of the 2016 school year, four of the 130 selected treatment schools had closed or did not offer the target grade anymore, resulting in a final sample of 126 schools, all with approved projects. Following approval, all schools received the first visit of the mentor at the beginning of the school year. Throughout the year, schools were meant to receive quarterly visits. Of the 126 schools, 109 received at least three visits during the school year, and 39 received all four visits. To receive the allocated funding, the schools had to provide proof that they did not have an outstanding balance with federal, state or municipal tax collection agencies.\footnote{Although public schools do not pay taxes, they do need to file that they are exempt.} The lack of this documentation delayed the transfer of funds for most schools. Transfers were supposed to take place at the beginning of the school year in February, but the first transfers were only made in July. By the end of the 2016 school year, 90 schools had received the funding.\footnote{Eight schools received the funding in the following year.} Despite the challenges with the transfer of resources, mentors worked with the schools to continue implementation of the activities proposed in their work plan. By the end of the school year, 79.37\% of schools had implemented at least 70\% of the planned activities. All analyses are based on the original assignment in the experiment (Panel B in Table \ref{tab:sample}) and should therefore be interpreted as intent-to-treat (ITT) effects.
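
Expressed as shares of the 126 treatment schools in the final sample, the implementation figures above read as follows. The count of 100 schools in the last ratio is not stated in the text; it is inferred from the reported 79.37\% and should be read as an assumption.
\begin{align*}
    \text{at least three mentor visits:} \quad & \frac{109}{126} \approx 86.5\% \\
    \text{all four mentor visits:} \quad & \frac{39}{126} \approx 31.0\% \\
    \text{funding received by the end of 2016:} \quad & \frac{90}{126} \approx 71.4\% \\
    \text{at least 70\% of planned activities implemented:} \quad & \frac{100}{126} \approx 79.37\%
\end{align*}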

\subsubsection*{Missing Data}
Not all schools and students participated in the socio-emotional and proficiency tests. Table \ref{tab:baltab_participation} compares participation rates between the control and treatment group. Overall, 84\% of schools participated in the socio-emotional test and 94\% participated in the state standardized tests. Among the participating schools, on average, 55\% of enrolled students took the socio-emotional test, and 69\% participated in the proficiency tests.\footnote{Lower participation in the socio-emotional test is explained by the fact that it was carried out later than the proficiency test, when some of the schools in our sample had already released their students for the summer break.} Treated schools are more likely to participate in the socio-emotional test (91\% versus 78\%), while participation is balanced for the proficiency tests. Conditional on the school participating, the percentage of test takers is balanced for both tests, across all grades, suggesting no differential within-school selection by treatment assignment. Note that we have imperfect overlap between students who took the socio-emotional test and those who took the proficiency test. Overall, 49\% of students in the selected class took both tests, which restricts our ability to interact these variables in our analysis of the potential mechanisms of the program.

To explore how the unbalanced participation of schools in the socio-emotional test may affect our results, we replicate the balance table restricting the sample to schools with at least one test-taker (Tables \ref{tab:baltab_socio_schoollevel} and \ref{tab:baltab_proficiencia_media_schoollevel}). We find similar balance results between treatment and control schools among this sub-sample of test-takers. To test whether school quality varies across test-takers and non-test-takers, we compare schools that participated in the socio-emotional test with those that did not, across treatment and control groups (Figure \ref{fig:predict_participation}), using the 2015 IDEB as a measure of school quality at baseline. As expected, treatment and control schools generally have similar IDEB scores ($p=0.98$). However, participating schools generally have better scores than non-participating schools ($p=0.02$). Yet this pattern appears to be no different between treatment and control groups ($p=0.62$). This suggests that our results are likely unbiased estimates of program impacts among tested schools, yet they may not extend to the non-tested schools.


%%%%%%%%%%%%%%%%%%%%%%% RESULTS %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Empirical Strategy and Results} \label{sec:methodology_results}
\subsection{Empirical Strategy} \label{sec:methodology}

We estimate the effect of randomly assigning schools to the project on our outcomes of interest with the following reduced-form specification,
\begin{equation} \label{eq:OLS}
    y_{is} = \alpha + \beta \cdot T_{s} + \Sigma_{strata} + \varepsilon_{is}
\end{equation}
where $y$ is the outcome of interest for student $i$ in school $s$, $T_s$ is the indicator variable of treatment assignment, $\Sigma$ is a vector of strata dummies, and $\varepsilon$ is the error term. Standard errors are clustered at the school level.\footnote{Some estimates are obtained at the school level. In these cases, we employ robust standard errors.} Since not all assigned schools received all components of the project, as discussed in Section \ref{sec:experiment}, the parameter $\beta$ measures the ITT effect. We provide estimates of project impact for all schools pooled as well as for each grade separately. To check the robustness of the results, we estimate the model adding controls, and we use interaction-weighted, regression-weighted and blocked difference-in-means estimators.\footnote{\cite{gibbons2018broken} show that, in the presence of heterogeneous treatment effects, fixed effects estimates are generally not consistent for the average treatment effect. The blocked difference-in-means approach uses strata sizes, instead of fixed effects, to weight the treatment effect estimates within each stratum.}
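
As a sketch of the difference between the two weighting schemes, let $k$ index the 12 strata, $N_k$ be the number of observations in stratum $k$, $N = \sum_k N_k$, and $p_k$ be the share of treated observations in stratum $k$. These symbols are illustrative and do not appear elsewhere in the paper. The blocked difference-in-means estimator weights each within-stratum contrast by the stratum size,
\[
    \hat{\tau}_{block} = \sum_{k=1}^{12} \frac{N_k}{N} \left( \bar{y}_{T,k} - \bar{y}_{C,k} \right),
\]
whereas the OLS coefficient on $T_s$ in a regression with strata dummies weights the same contrasts by the within-stratum variance of treatment,
\[
    \hat{\beta} = \sum_{k=1}^{12} \frac{N_k \, p_k (1 - p_k)}{\sum_{j=1}^{12} N_j \, p_j (1 - p_j)} \left( \bar{y}_{T,k} - \bar{y}_{C,k} \right).
\]
The two coincide when $p_k$ is the same in every stratum, which is approximately the case here given that around 40\% of schools were assigned to treatment in each stratum.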

To explore potential distributional effects of the project we estimate unconditional quantile treatment effects (UQTE) following \cite{firpo2009unconditional}. Unlike the average effect, quantile treatment effects assess whether the impact of the project differs at distinct points (quantiles) of the outcome distribution. The UQTE has a similar interpretation to the average effect and is estimated by computing the horizontal difference between the cumulative (marginal) distributions of treated and control outcomes at a given quantile. For example, the effect on the median is given by $UQTE_{0.5} = Q_{0.5}(Y_{T}) - Q_{0.5}(Y_{C})$, where $Q_{0.5}(Y_{T})$ is the median of the outcome variable (e.g., test score) in the treated group and $Q_{0.5}(Y_{C})$ is the median of the outcome variable in the control group.
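
More generally, for any quantile $\tau \in (0,1)$ the same horizontal difference can be written as below, where $F_{Y}$ denotes the cumulative distribution function of the outcome in the relevant group. The notation $F_Y$ and $\tau$ is introduced here only for illustration.
\[
    UQTE_{\tau} = Q_{\tau}(Y_{T}) - Q_{\tau}(Y_{C}), \qquad Q_{\tau}(Y) = \inf \left\{ y : F_{Y}(y) \geq \tau \right\}
\]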


\subsection{Results} \label{sec:results}
We begin the analysis with the key outcomes targeted by the project: student learning and progression indicators such as grade passing, repetition and dropout. In Section \ref{sec:mechanism} we look at intermediate outcomes, such as teacher turnover and socio-emotional skills, to shed light on the potential mechanisms leading to impacts on final outcomes.

\subsubsection*{Learning Outcomes} \label{sec:skills}

Table \ref{tab:test_studentlevel} shows ITT estimates on overall test scores as well as separated by subject and grade. We find a large positive impact on learning outcomes among schools assigned to treatment, but for 6\textsuperscript{th} graders only. The intervention improved overall test scores for 6\textsuperscript{th} graders by 0.15 SD, or 6 points compared to a mean in the control group of 163. Significant improvements are observed across all subjects but are more pronounced for math and Portuguese. For robustness, we re-estimate the model controlling for a vector of covariates\footnote{They include student's age, gender and race dummies (white, indigenous, black, or \textit{pardo}), whether they receive \textit{Bolsa Familia}, and whether they use school transportation.} and using alternative estimation strategies, such as interaction-weighted and regression-weighted estimators, blocked difference-in-means, as well as collapsing data at the school level. The results are very similar and are available in the Online Appendix.\footnote{The Online Appendix can be accessed through this \href{https://github.com/worldbank/brazil-pip-education/blob/master/pip_app.pdf}{link}.} The quantile regression estimates for 6\textsuperscript{th} graders are presented in Figure \ref{fig:qreg_media_grade6} (average test score), and suggest gains across the board, with a more pronounced difference at the higher end of the grade distribution.\footnote{Quantile results disaggregated by subject are available in the Online Appendix.}

% Impact on cognitive
\vfill
\begin{table}[ht!]
    \caption{Impact on Student Learning}
    \label{tab:test_studentlevel}
    \centering
    \begin{adjustbox}{max width=\textwidth}
        \begin{tabular}{lcccccc} \hline \hline
            \input{DataWork/Output/Tables/tab3-test_studentlevel.tex}
            \multicolumn{6}{@{}p{0.92\textwidth}}{\footnotesize \textit{Notes}: \sym{*}Significant at 10\%. \sym{**}Significant at 5\%. \sym{***}Significant at 1\%. Unit of observation: student. Outcome variables in the column headers. All regressions are OLS with strata (i.e., region and grade) fixed effects. Standard errors clustered at the school level in parentheses. The coefficients are expressed in terms of standard deviations from the control group, while the unconditional mean and standard deviation of the dependent variable refer to the raw values in the control group.}
        \end{tabular}
    \end{adjustbox}
\end{table}
%

\clearpage

In Figure \ref{fig:grade6_byGender}, we compare both average and quantile treatment effects by gender. On average, PIP positively affected learning outcomes of both female and male 6\textsuperscript{th} graders; however, distributional analysis suggests that the project shifted the entire distribution of test scores for boys to the right, but for girls resulted only in differences in the higher quantiles. We find suggestive evidence that the project helped boys catch up with the initially higher proficiency level of girls.

To contextualize the magnitude of the impact on 6\textsuperscript{th} graders, we convert the learning gains from the project into additional years of schooling. To do so, we use the state standardized test scores rescaled to the national standardized exams (SAEB). The exam is taken in Grades 5 and 9 and constructed to allow for comparison of levels on a unique proficiency scale across grades and years.\footnote{The exam uses item response theory (IRT) to express scores on a unique scale for all grades of the national education system. This is achieved by including test items from 5\textsuperscript{th} grade tests in 9\textsuperscript{th} grade tests. The same is done from one edition to the next, making SAEB scores comparable over time. The test takes place every two years.} This enables calculation of the knowledge in math and Portuguese accumulated by an average student between the tests taken in 5\textsuperscript{th} and 9\textsuperscript{th} grade. To calculate the average gains in knowledge over those 4 years of schooling we compare the test scores of a cohort of students from RN that took the 5\textsuperscript{th} grade exam in 2013 and the 9\textsuperscript{th} grade exam in 2017. We find that the average gain in test score for this cohort was 60 points, or 15 points per year on average. Based on the ITT estimates, we find that PIP improved 6\textsuperscript{th} graders' math and Portuguese scores by 6.81 points on the SAEB exam scale, the equivalent of a little under half a year of additional schooling.\footnote{The OLS results in terms of SDs, using SAEB-rescaled test scores as outcome variable, are available in the Online Appendix. In our data, a one SD improvement in learning in 5\textsuperscript{th} grade corresponds to 50 points, i.e., 3.3 years of schooling.
Comparing gains in literacy for a set of countries, \citet{evans2019equivalent} find that a one SD improvement in test scores ranges from 4.7 to 6.5 years of schooling.} In Section \ref{sec:policy} we reflect on the economic implications of these results.
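
The conversion into years of schooling uses only the figures reported above and can be reproduced as follows:
\begin{align*}
    \text{average yearly gain on the SAEB scale:} \quad & \frac{60 \text{ points}}{4 \text{ years}} = 15 \text{ points per year} \\
    \text{PIP impact in years of schooling:} \quad & \frac{6.81}{15} \approx 0.45 \text{ years} \\
    \text{one SD in years of schooling:} \quad & \frac{50}{15} \approx 3.3 \text{ years}
\end{align*}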

\subsubsection*{Student Progression} \label{sec:flow}

Table \ref{tab:promotion} shows the effect on passing, retention and dropout rates across grades. We report results using both data from SIGEduc, which are reported at the grade level, and from tracking individual students using the 2016 and 2017 waves of the school census. 

We find positive impacts on overall progression. These are driven by substantial improvements in 6\textsuperscript{th} grade, which is consistent with the results on learning gains. Passing rates in 6\textsuperscript{th} grade are estimated to increase by 8.46 pp, a 13\% improvement compared to the control mean of 63.56\%. We find similar results using the census data: a 7 pp increase among 6\textsuperscript{th} graders (12\%). We find no evidence of differential impact by gender (Table \ref{tab:promotion_het_gender}) or by baseline levels of passing rate (Table \ref{tab:promotion_het}). The latter estimates suggest that the provision of schools' relative performance during the design workshops did not drive the results.

The impacts on grade passing mechanically result from either a reduction in dropout or retention or a combination of both. The SIGEduc data suggest that the result was mainly achieved by reducing grade repetition by 6.85 pp (23\%). However, census data point to a reduction in dropout being the main driver. The discrepancy in the results can be explained by the difference in the timing at which a student's status is defined. The SIGEduc data only capture students dropping out during the school year, while the census also captures dropout of students over the summer break. This suggests that some of the students reported as retained in SIGEduc drop out by the beginning of the next school year.
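
The mechanical link can be made explicit. Assuming that passing, retention and dropout are exhaustive and mutually exclusive within a grade (an assumption made here for illustration, which abstracts from transfers), the three rates sum to 100\%, so the treatment effects must satisfy
\[
    \Delta \text{passing} = - \left( \Delta \text{retention} + \Delta \text{dropout} \right).
\]
Under that assumption, the SIGEduc estimates for 6\textsuperscript{th} grade quoted above would imply a reduction in within-year dropout of roughly $8.46 - 6.85 = 1.61$ pp, and the relative effect on passing is $8.46 / 63.56 \approx 13.3\%$. The implied dropout figure is a back-of-the-envelope value rather than an estimate reported in Table \ref{tab:promotion}.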

% Impact on progression
\vfill
\begin{table}[ht!]
    \caption{Impact on Student Progression Rates}
    \label{tab:promotion}
    \centering
    \begin{adjustbox}{max width=\textwidth}
        \begin{tabular}{lcccccccc} \hline \hline
            \input{DataWork/Output/Tables/tab4-promotion.tex}
            \multicolumn{9}{@{}p{1.1\textwidth}}{\footnotesize \textit{Notes}: \sym{*}Significant at 10\%. \sym{**}Significant at 5\%. \sym{***}Significant at 1\%. School-level data are from \textit{Sistema Integrado de Gestão da Educação} (SIGEduc) and student-level data are from Rio Grande do Norte census. Unit of observation: school and student. Outcome variables in the column headers. All regressions are OLS with strata (i.e., region and grade) fixed effects. Robust standard errors for school-level regressions and standard errors clustered at the school level for student-level regressions in parentheses. The coefficients are expressed in terms of percentage points and the mean and standard deviation of the dependent variable in the control group are unconditional.}
        \end{tabular} 
    \end{adjustbox}
\end{table}

\clearpage


The reduction in 6\textsuperscript{th} grade retention might have long-term implications for students' years of education and likelihood of completing school. To evaluate how much improving progression may affect students' school careers we track all RN students who were in 6\textsuperscript{th} grade in 2011 up to 2017 using school census data. We find that students who were promoted in 6\textsuperscript{th} grade in 2011 are 40 pp more likely to be in school in 2017 than students who were retained in 2011 (Figure \ref{fig:retention_grade6_dropout}). Similarly, after 6 years, they have completed 2.34 more years of schooling (Figure \ref{fig:retention_grade6_education}). We quantify the correlation between retention in 6\textsuperscript{th} grade and schooling outcomes by estimating a simple OLS regression of dropout and completed years of schooling on retention.\footnote{We estimate the following cross-section regression: $y_{isc} = \alpha + \beta \cdot retained_{isc} + \sigma_{s} + \gamma_{c} + \varepsilon_{isc}$, where $y_{isc}$ is the outcome variable, i.e., dropout dummy or years of completed schooling, of student $i$ in school $s$ and class $c$, $retained_{isc}$ is a dummy variable for students who repeated 6\textsuperscript{th} grade in 2011, and $\sigma_{s}$ and $\gamma_{c}$ are school and class fixed effects. Standard errors are clustered at the school level.} We find that failing 6\textsuperscript{th} grade is associated with a 21 pp higher likelihood of school dropout after 6 years, and a reduction of 1.7 years of completed schooling (Table \ref{tab:retention_grade6_regs}). Taken at face value, our estimates provide suggestive evidence that the reduction of 23\% (or 7 pp) in the repetition rate caused by the PIP might contribute to substantially reducing school dropout (by 4.83 pp) and increasing years of schooling (by 0.4 extra years) of the treated cohort of 6\textsuperscript{th} graders.
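
One way to reproduce the back-of-the-envelope figures in the last sentence is to scale the estimated associations by the 23\% relative reduction in repetition. The calculation below is a sketch of that arithmetic and inherits the caveat that the underlying associations are correlational:
\begin{align*}
    \text{dropout:} \quad & 0.23 \times 21 \text{ pp} = 4.83 \text{ pp} \\
    \text{completed schooling:} \quad & 0.23 \times 1.7 \text{ years} \approx 0.39 \text{ years} \approx 0.4 \text{ years}
\end{align*}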

\begin{figure}[ht!]
    \caption{6\textsuperscript{th} Grade Retention and Student Attainment}
    \captionsetup[subfigure]{position=top,justification=centering}
    \label{fig:retention_grade6}
    \centering
    
    \begin{subfigure}{\textwidth}
        \caption{Percentage of 2011 6\textsuperscript{th} Graders Enrolled in Subsequent Years}
        \label{fig:retention_grade6_dropout}
        \centering
        \includegraphics[width=0.8\textwidth]{DataWork/Output/Figures/fig1a-retention_grade6_dropout.png}
    \end{subfigure} 
     
    \begin{subfigure}{\textwidth}
        \caption{Years of Completed Schooling of 2011 6\textsuperscript{th} Graders}
        \label{fig:retention_grade6_education}
        \centering
        \includegraphics[width=0.8\textwidth]{DataWork/Output/Figures/fig1b-retention_grade6_education.png}
    \end{subfigure}  
     
    \begin{minipage}{0.825\textwidth}
        \small{\textit{Notes:} The points in Panel (a) show the percentage of 6\textsuperscript{th} graders in 2011 who were enrolled in any grade (6\textsuperscript{th} or higher) in the following years, up to 2016. Panel (b) shows the average years of completed schooling of students who were enrolled in 6\textsuperscript{th} grade in 2011 by each following year, up to 2016. The sample is the universe of students of public schools in Rio Grande do Norte (N = 73,010) and is split between those who were promoted in 2011 and those who were retained in 2011. Data from 2011-2017 school censuses.}
    \end{minipage}
\end{figure}

%%%%%%%%%%%%%%%%%%%%%%% MECHANISM %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Potential Mechanisms} \label{sec:mechanism}

The results show that PIP had substantial impact on student learning and progression in Grade 6. In this section we explore the mechanisms through which the intervention may have affected these outcomes. As discussed in Section \ref{sec:pip}, there are three main components to the project, namely: i) teacher autonomy, ii) technical support through workshops and mentors, and iii) financial support. We empirically investigate two, potentially complementary, mechanisms through which these can improve student outcomes: 

\begin{enumerate}[nosep]
    \item Reducing teacher turnover by increasing teacher motivation through the provision of autonomy and resources to develop their own project with technical support;
    \item Building student socio-emotional skills either directly through i) the implementation of innovative projects in the schools, which aim to enhance student-teacher interactions, increasing students' motivation and skills; or indirectly through ii) the impact on cognitive skills, or iii) as a result of the first mechanism. 
\end{enumerate}

Teacher autonomy over the design and use of the grant likely affects student outcomes through both mechanisms: 1) it may crowd in teacher intrinsic motivation and affect teaching quality as well as 2) lead to better locally tailored projects. The second mechanism assumes that innovative pedagogical projects, aimed at changing student-teacher interactions, could generate positive results, regardless of teacher autonomy.  

In this section we explore which of the two channels was likely affected. However, we cannot tell apart the relative importance of the three different project components or the two mechanisms, as the same package was offered to all treatment schools.

\subsection{Teacher Turnover} \label{sec:turnover}

PIP allowed substantial autonomy for teachers, which may have affected teachers' motivation and engagement with students. We do not directly observe teachers' motivation in our data. To document whether the PIP might have affected teachers' commitment and motivation, we look at teacher turnover as a proxy.\footnote{To define `teacher turnover', we track teachers across years in the school census. The outcome is a dummy of whether a teacher is in the same school in two consecutive years. The dummy is zero if a teacher is still teaching in the same school (in any grade) and one otherwise.} 

ITT estimates presented in Panel A of Table \ref{tab:turnover_teacherlevel} suggest the project increased the probability of a teacher staying in the same school the following year by 6.4 pp (i.e., a 20.7\% decrease in teacher turnover over the control mean of 30.9\%) in 6\textsuperscript{th} grade. The higher teacher turnover in control schools is driven by more teachers moving to other schools; they do not leave the education system at a higher rate than teachers in treatment schools. We interpret this result as suggestive evidence that part of the project's success in Grade 6 was achieved by increasing motivation among teachers.
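
To make the outcome and the relative effect explicit, let $s_{j,t}$ denote the school in which teacher $j$ teaches in year $t$ (illustrative notation, not used elsewhere in the paper). The turnover dummy and the relative reduction quoted above can then be written as
\[
    turnover_{j,t} = \mathbf{1}\left\{ s_{j,t+1} \neq s_{j,t} \right\}, \qquad \frac{6.4 \text{ pp}}{30.9\%} \approx 20.7\%.
\]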

To explore whether the effect on teacher turnover is driving the impact on final outcomes, we estimate heterogeneous effects by teacher turnover at baseline. Panel B in Table \ref{tab:turnover_teacherlevel} suggests that the reduction in teacher turnover is concentrated in schools with high teacher turnover rates at baseline. `High teacher turnover' is defined at the grade level and equal to one if that school has a turnover rate above the sample median of that grade before the intervention.\footnote{To define `high teacher turnover' schools at baseline we calculate the proportion of teachers in a grade who leave a school between the 2015 and 2016 school year. `High teacher turnover' is defined as a dummy, which takes the value one if the proportion of teachers leaving that school is above the median turnover distribution for schools treated in that grade.} The median turnover at baseline in 6\textsuperscript{th} grade is 33.33\%.\footnote{Most 5\textsuperscript{th} grade schools have only one teacher and their median turnover rate is zero. Therefore, we are not able to estimate heterogeneous effects for this grade.} We leverage this finding to document whether teacher turnover is a likely mechanism driving the learning results. We indeed find that the impacts on learning and dropout are also concentrated in schools with high teacher turnover at baseline. Impacts on learning for this group approach 0.28 SD (Table \ref{tab:het_turnover}).
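
A sketch of the interacted specification behind these heterogeneous effects, consistent with the table notes, is given below. Here $H_s$ is an illustrative symbol for the `high teacher turnover' dummy at baseline, and the remaining terms are as in equation \eqref{eq:OLS}; the exact specification estimated may include further terms.
\[
    y_{is} = \alpha + \beta_1 \cdot T_{s} + \beta_2 \cdot H_{s} + \beta_3 \cdot (T_{s} \times H_{s}) + \Sigma_{strata} + \varepsilon_{is}
\]
In this notation, $\hat{\beta}_1$ is the estimated effect in schools with low baseline turnover, and the quantity labelled $\sum \hat{\beta}$ in the tables corresponds to $\hat{\beta}_1 + \hat{\beta}_3$, the estimated effect in schools with high baseline turnover.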

\vfill

% Impact on teacher turnover
\begin{table}[ht!]
    \caption{Impact on Probability of Teacher Staying in the Same School}
    \label{tab:turnover_teacherlevel}
    \centering
    \begin{adjustbox}{max width=\textwidth}
        \begin{tabular}{lDDDD} \hline \hline
            \input{DataWork/Output/Tables/tab5-turnover_teacherlevel.tex}
            \multicolumn{5}{@{}p{1.12\textwidth}}{\footnotesize \textit{Notes}: \sym{*}Significant at 10\%. \sym{**}Significant at 5\%. \sym{***}Significant at 1\%. Data are from Rio Grande do Norte 2016 and 2017 teacher censuses. Unit of observation: teacher. Outcome variables in the column headers. $\sum \hat{\beta}$ is the sum of the treatment effect with the interaction variable coefficient. The p-value refers to the null hypothesis $\sum \hat{\beta} = 0$. All regressions are linear probability models with strata (i.e., region and grade) fixed effects. Standard errors clustered at the school level in parentheses. Note that the coefficient on the high-turnover dummy at baseline for 5\textsuperscript{th} grade is not identified because the median itself is equal to 0. This is due to the fact that, in most schools, 5\textsuperscript{th} grade has only one teacher, thus the school turnover rate variable is either equal to 0 or 1.}
        \end{tabular} 
    \end{adjustbox}
\end{table}
%

% Impact on test scores and learning by teacher turnover
\begin{table}[ht!]
    \caption{Impact on Student Learning and Progression by Teacher Turnover at Baseline}
    \label{tab:het_turnover}
    \centering
    \begin{adjustbox}{max width=\textwidth}
        \begin{tabular}{lcccc} \hline \hline
            \input{DataWork/Output/Tables/tab6-het_turnover.tex}
            \multicolumn{5}{@{}p{0.985\textwidth}}{\footnotesize \textit{Notes}: \sym{*}Significant at 10\%. \sym{**}Significant at 5\%. \sym{***}Significant at 1\%. Outcome variables in the column headers. `Average test score' is the average of standardized test scores in math, Portuguese, human and natural science. Student-level data on progression are from Rio Grande do Norte census. Teacher data are from Rio Grande do Norte 2016 and 2017 teacher censuses. Unit of observation: student. $\sum \hat{\beta}$ is the sum of the treatment effect and the interaction variable coefficient. The p-value refers to the null hypothesis $\sum \hat{\beta} = 0$. All regressions are OLS with strata (i.e., region and grade) fixed effects. The coefficients on learning are expressed in terms of standard deviations from the control group, while the coefficients on progression are expressed in terms of percentage points. Standard errors clustered at the school level in parentheses.}
        \end{tabular} 
    \end{adjustbox}
\end{table}
%

\FloatBarrier

To assess whether reducing teacher turnover is in itself sufficient to achieve these results, we exploit the fact that many 6\textsuperscript{th} grade teachers also teach in other grades, where no innovative projects are implemented. According to the school census data, 90.43\% of 6\textsuperscript{th} grade teachers also teach in 7\textsuperscript{th} grade, 81.76\% in 8\textsuperscript{th} grade, and 73.21\% in 9\textsuperscript{th} grade.\footnote{The percentage is balanced between treatment and control schools.} As a result, the reduction in 6\textsuperscript{th} grade turnover also mechanically affects turnover in the other grades in the same schools (Panel A of Table \ref{tab:spillover_other_grades}). We compare student-level outcomes for 6\textsuperscript{th} grade schools, in their remaining lower-secondary grades (Panel B of Table \ref{tab:spillover_other_grades}).\footnote{The results using grade level data from SIGEduc are very similar. See the Online Appendix.} We only have access to data on student progression in other grades; the standardized test was not implemented in 7\textsuperscript{th} grade. The lack of positive impacts on other grades suggests that reducing teacher turnover alone might not be sufficient to affect student outcomes: positive results in 6\textsuperscript{th} grade are likely driven by the combination of increased motivation of teachers and the other project components.\footnote{Results by teacher turnover at baseline also show no impacts on other grades. See the Online Appendix.} Moreover, we find no negative spillovers on other grades, which suggests that teachers did not increase effort in 6\textsuperscript{th} grade at the cost of other grades.

% Spillover to other grades
\clearpage
\null
\vfill
\begin{table}[ht!]
    \caption{Impact on Other Grades in 6\textsuperscript{th} Grade Treated Schools}
    \label{tab:spillover_other_grades}
    \centering
    \begin{adjustbox}{max width=\textwidth}
        \begin{tabular}{lCCCC} \hline \hline
            \input{DataWork/Output/Tables/tab7-spillover_other_grades.tex}
            \multicolumn{5}{@{}p{0.925\textwidth}}{\footnotesize \textit{Notes}: \sym{*}Significant at 10\%. \sym{**}Significant at 5\%. \sym{***}Significant at 1\%. Data are from Rio Grande do Norte 2016 and 2017 teacher and student censuses. Unit of observation: teacher in the first panel and student in the other panels. Sample: schools treated at 6\textsuperscript{th} grade. All outcome variables (in the panel headers) are dummy variables and regressions are linear probability models with strata (i.e., region) fixed effects. Standard errors clustered at the school level in parentheses.}
        \end{tabular} 
    \end{adjustbox}
\end{table}
\vfill
% 

\subsection{Socio-Emotional Skills}

Throughout the development of the projects, teachers were encouraged to design an intervention that would change student-teacher interactions, and engage students by exposing them to learning opportunities outside the classroom, moving away from the traditional lecture-based teaching. As a consequence, resulting projects may have directly affected student socio-emotional skills. In addition, increasing teacher motivation and commitment may provide an indirect channel to improve socio-emotional skills. 

Table \ref{tab:socio_studentlevel} shows the ITT estimates on each of the Big Five personality traits. The indicators are standardized (within grade) and the coefficients can be interpreted in terms of standard deviations. Pooling all grades, we find that the project had a positive and statistically significant effect on conscientiousness and extroversion.\footnote{The same robustness checks used for estimating the impact on learning outcomes can be seen in the Online Appendix.} However, in line with previous results, these are driven by the impacts on 6\textsuperscript{th} graders (0.17 SD and 0.21 SD respectively). Among the Big Five, the trait of `conscientiousness' is commonly associated with acquisition of cognitive skills (\citealp{poropat2009meta}; \citealp{ivcevic2014predicting}). It encompasses traits such as self-control, organization, responsibility and perseverance.

% Impact on socio-emotional skills
%\vspace{20pt}
\begin{table}[ht!]
    \caption{Impact on Socio-Emotional Skills}
    \label{tab:socio_studentlevel}
    \centering
    \begin{adjustbox}{max width=\textwidth}
        \begin{tabular}{lcccccc} \hline \hline
            \input{DataWork/Output/Tables/tab8-socio_studentlevel.tex}
            \multicolumn{6}{@{}p{1.175\textwidth}}{\footnotesize \textit{Notes}: \sym{*}Significant at 10\%. \sym{**}Significant at 5\%. \sym{***}Significant at 1\%. Unit of observation: student. Outcome variables in the column headers. All regressions are OLS with strata (i.e., region and grade) fixed effects. Standard errors clustered at the school level in parentheses. The coefficients are expressed in terms of standard deviations from the control group, while mean and standard deviation of the dependent variable refer to the raw values in the control group. `Neuroticism' is reverse-coded so that a positive coefficient implies a lower level of neuroticism score.}
        \end{tabular}
    \end{adjustbox}
\end{table}
%

We observe that student test scores and socio-emotional outcomes are positively correlated in the tested sample at endline, regardless of treatment status (Figure \ref{fig:scatter_test_socio}). Unfortunately, as mentioned in Section \ref{sec:data}, data on socio-emotional skills were only collected for a random subset of students in each school.\footnote{Participation rate in the socio-emotional test and other observed characteristics are balanced at the student level when we restrict the sample to the subset of students who took the socio-emotional test. See discussion in Section \ref{sec:balance}.} Therefore, we cannot further investigate the mediating role of socio-emotional skills on learning outcomes or vice versa.\footnote{When we restrict the sample to the subset of students who took the socio-emotional test, we are unable to detect effects of the project on learning outcomes.} However, in line with the literature, these correlations are consistent with complementarities between socio-emotional and cognitive skills.

\subsection{Understanding Heterogeneous Impacts by Grade}

We presented the results to the mentors in a focus group discussion to shed light on what may be driving differences in impacts across grades. First, we found that mentors had more experience with teaching and implementing projects in lower grades, which may have resulted in the technical assistance being better tailored to these grades. Second, mentors stated that the project filled a clear gap faced by 6\textsuperscript{th} graders who experience a significant transition between levels of education. The key difference between these levels is that students in primary education have a single teacher, which allows for a close student-teacher relationship. These ties are weaker in lower secondary education, as students have multiple teachers (at least 5). The potential negative impact of this transition is well documented in the United States and has been recently investigated in Brazil (\citealp{bedard2005middle}; \citealp{cook2006should}; \citealp{hanewald2013transition}; \citealp{Santos2017}). \cite{Santos2017} evaluate the impact of a pilot in municipal schools in Rio de Janeiro, Brazil, which expanded primary school to include  6\textsuperscript{th} grade. They find that having the 6\textsuperscript{th} grade in the primary school increases learning by 0.16 SD, and suggestive evidence that strengthening of student-teacher relationship mediated some of the effect on learning.  

We compare administrative data to assess whether project implementation varied across the three grades. All treatment schools received similar levels of support from the SEE team: all had an approved sub-project and were assigned a mentor who visited them regularly. Here we focus on the implementation of the planned activities by the schools throughout the year. We report three measures of implementation: i) obtaining the clearance certificate, which is a necessary requirement for schools to receive funding from any state level educational program\footnote{We indeed find that obtaining the clearance certificate is what most predicts the rate of implementation (Table \ref{tab:predict_implementation}). We find that being assigned to receive the PIP increases the likelihood of schools obtaining the clearance certificate by 41 pp. This impact does not differ by grade (Table \ref{tab:clearance_certificate}).}; ii) percentage of project funds received by the end of the school year; and iii) whether a school implemented at least 70\% of the planned activities included in the work plan. We observe substantial differences in rates of implementation across the three grades (Figure \ref{fig:implementation_byGrade}). Each of the indicators shows higher rates of implementation in 6\textsuperscript{th} grade.\footnote{We do not observe any significant correlation between school characteristics at baseline and implementation (Table \ref{tab:predict_implementation}). However, it is likely that implementation is endogenous to unobserved school quality and our outcomes of interest. We therefore refrain from comparing schools with high rates of implementation with schools with low rates of implementation, as this would provide biased results.}

Taken together, Grade 6 schools may have perceived the project as being particularly relevant to smooth shocks observed around the transition from primary to lower secondary education. This may have led to better implementation in this grade. 

\begin{figure}[htbp]
    \caption{Implementation by Grade}
    \label{fig:implementation_byGrade}
    \centering
    \includegraphics[width=13.4cm]{DataWork/Output/Figures/fig2-implementation_byGrade.png}
    \begin{minipage}{0.825\textwidth}
        \small{\textit{Notes:} `Implementation' is defined as the ratio of the number of activities that were implemented over the number of planned activities described in the work plan. Data are from State Secretariat of Education (SEE). Sample: schools treated.}
    \end{minipage}
\end{figure}
\FloatBarrier

%%%%%%%%%%%%%%%%%%%%%%% POLICY %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Policy Analysis} \label{sec:policy}

In this section we use the main results on learning and progression to produce back-of-the-envelope estimates for the impact of the program on school quality indicators and individuals' expected earnings.

\subsubsection*{Quality of Education} 

We use the ITT estimates to compute the counterfactual distribution of two national quality of education indicators. First, Figure \ref{fig:itt_ProvaBrasil} shows that if students retain their learning gains over time, as measured by SAEB scores, the impact of the PIP would suffice to close half of the knowledge gap between RN and the country's average by the end of Grade 9. Second, combining impacts on progression and learning suggests that the PIP would help RN state schools move upwards in the IDEB ranking by at least two positions (Figure \ref{fig:itt_IDEB}). The strategy is described in more detail in Appendix \ref{sec:ideb}.

\subsubsection*{Expected Returns to Education} 

We expect the intervention to impact labor market outcomes of the 6\textsuperscript{th} graders in the long term through two channels: first via learning gains among those that stayed in school (productivity channel), and second via higher probability of remaining in school conditional on passing Grade 6 (a combination of productivity effects with signaling or diploma effects). The first channel focuses on the improved quality of education, while the second reflects extra years of education among more knowledgeable students.

Treating the ITT effects of the PIP on learning as approximately equal to 0.5 extra years of schooling, a back-of-the-envelope calculation suggests a net present value (NPV) on future earnings of 29,148.97 Brazilian Reais (BRL), or US\$ 7,287.24. The second channel is the increase in student years of schooling through a reduction in repetition, which we estimate leads to about 0.4 extra years of schooling, with an NPV on future earnings of 23,319.18 BRL (or US\$ 5,829.79). The full effect on expected earnings would then range from US\$ 7,000 to US\$ 13,000, or 28 to 52 Brazilian minimum wages. The data and methodology used for the calculations are described in Appendix \ref{sec:NPV}.
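
As a check on the arithmetic, and assuming the conversion uses the exchange rate of 0.25 US dollars per Brazilian Real reported in the notes to Figure \ref{fig:grant}, the two NPV figures convert as follows:
\begin{align*}
    29{,}148.97 \times 0.25 &= 7{,}287.2425, \\
    23{,}319.18 \times 0.25 &= 5{,}829.795, \\
    7{,}287.2425 + 5{,}829.795 &= 13{,}117.0375.
\end{align*}
These values match the reported US dollar amounts up to rounding. Under this assumption, the stated range is consistent with reading the lower end as the first channel alone and the upper end as the sum of the two channels, each rounded to the nearest thousand.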

This calculation assumes all the expected impacts on future earnings are driven by direct or indirect gains in learning. However, beside mediating the accumulation of cognitive skills, there may be direct impacts of socio-emotional skills on labor market outcomes, making this a lower bound estimate. 

%%%%%%%%%%%%%%%%%%%%%%%%%%%% CONCLUSION %%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Conclusion} \label{sec:conclude}

In this paper, we study whether providing autonomy to public sector agents, such as teachers, can improve the quality of service delivery in a low state capacity environment. The possibility of stimulating decision-making responsibility of local public officials to make the best use of their contextual knowledge to design and implement more effective policies is a first order question in the public sector, especially when resources are scarce and the monitoring capacity of the central government limited.

We explore this question in the context of an education program, which was randomly rolled out in state schools of Rio Grande do Norte, one of the poorest Brazilian states. The Pedagogical Innovative Program (PIP) consisted of three key components: i) teacher autonomy to develop a work plan to tackle problems locally identified; ii) technical assistance from mentors assigned by the state secretariat to support teachers during the diagnostic and development of the work plan stages; and iii) funding earmarked for schools to buy the pedagogical material necessary to implement the activities described in the work plan. The project was designed to motivate teachers and students and improve students' outcomes, leveraging mostly existing staff and school resources. 

We find that the PIP had meaningful impacts on 6\textsuperscript{th} graders, a critical grade during the transition from primary to lower-secondary education. Our ITT estimates point to learning gains in math and Portuguese of 0.18 SD and 0.16 SD, respectively. In addition, we find that passing rates increased whereas dropout and retention decreased. To shed light on the mechanisms underpinning our main results, we tested whether the program affected teacher turnover and students' socio-emotional skills as the program envisaged the motivations of teachers and students as the main pathways for the program's success. The program reduced teacher turnover by 20.7\% and most of this reduction was observed in schools with higher teacher turnover at the baseline. Consistent with these results, we document learning gains almost twice as high in schools with high teacher turnover at the baseline. To estimate impacts on socio-emotional skills, we use the Big Five personality traits. Our results show positive impacts on conscientiousness and extroversion. Overall, these findings empirically support the program's intention to impact students' outcomes by motivating teachers and students. 

These results have direct implications for policy design in countries that might have neither fiscal space to design pay-for-performance schemes at scale nor effective monitoring mechanisms. Autonomy over limited funding appeared to be enough to provide a non-monetary incentive to increase teacher motivation. In combination with the technical assistance, the program mitigated agency problems observed in other types of interventions where the decentralization of decision-making to local officials backfired (e.g., \citealp{banerjee2020improving}) while complementing local capacity.

%Our results indicate that the decentralized approach to motivate local public servants, while leveraging their knowledge to tackle relevant demand side constraints, was able to secure high returns at a relative low cost.

The lack of results in other grades may be explained by lower rates of implementation or the approach being particularly appropriate in a context where motivation of agents and final recipients, in this case students, is particularly important to affect outcomes. More research is needed to understand in which settings this approach is more likely to succeed.


%%%%%%%%%%%%%%%%%%%%%%%%%%%% BIBLIOGRAPHY %%%%%%%%%%%%%%%%%%%%%%%%%%%

\bibliography{references}

%%%%%%%%%%%%%%%%%%%%%%%%%%%% APPENDIX TABLES AND FIGURES %%%%%%%%%%%%%%%%%%%%%%%%%%%
\clearpage
\appendix
	
\section{Supplementary Figures and Tables}
\setcounter{figure}{0}
\setcounter{table}{0}
\renewcommand{\thefigure}{A\arabic{figure}}
\renewcommand{\thetable}{A\arabic{table}}
\label{app:tables_figures}

% IDEB RN vs. other Brazilian states
\null
\vfill
\begin{figure}[ht!]
    \caption{IDEB in Rio Grande do Norte vs.\ Other Brazilian States, 2015}
    \label{fig:IDEB_byState}
    \centering
    \includegraphics[width=13cm]{DataWork/Output/Figures/figA1-IDEB_byState.png}
    \begin{minipage}{0.835\textwidth}
        \small{\textit{Notes:} We use data from \textit{Instituto Nacional de Estudos e Pesquisas Educacionais Anísio Teixeira} (INEP) for state public schools. The IDEB index is defined at each education stage, i.e., for primary, middle, and secondary schools. It is a national indicator for the quality of education, which combines information on student test scores and passing rates (see Appendix \ref{sec:ideb} for details on the construction of the index). The bars show the average IDEB across the three education stages by state in 2015.}
    \end{minipage}
\end{figure}
\vfill

% Dropout and retention rate in RN
\begin{figure}[htbp]
    \caption{Grade Repetition and School Dropout Rates by Grade in Rio Grande do Norte}
    \label{fig:grade_comparison}
    
    \centering
    \captionsetup[subfigure]{position=top,justification=centering}
    
    \vspace{12pt}
    
    \begin{subfigure}{\textwidth}
        \centering
        \caption{Grade Repetition Rate}
        \includegraphics[width=13cm]{DataWork/Output/Figures/figA2a-grade_comparison_retention.png}
        \label{fig:grade_comparison_retention}
    \end{subfigure}
    
    \vspace{12pt}
    
    \begin{subfigure}{\textwidth}
        \centering
        \caption{School Dropout Rate}
        \includegraphics[width=13cm]{DataWork/Output/Figures/figA2b-grade_comparison_dropout.png}
        \label{fig:grade_comparison_dropout}  
    \end{subfigure}
    
    \begin{minipage}{0.8\textwidth}
        \small{\textit{Notes:} The bars show average retention and dropout rates among public schools in Rio Grande do Norte in 2015. Data are from \textit{Instituto Nacional de Estudos e Pesquisas Educacionais Anísio Teixeira} (INEP).}
    \end{minipage}
\end{figure}
%

% Grant
\begin{figure}[htbp]
    \caption{Allocation of Resources by Type of Grant}
    \label{fig:grant}
    \centering
    
    \includegraphics[width=13cm]{DataWork/Output/Figures/figA3-grant.png}
    
    \begin{minipage}{0.8\textwidth}
        \small{\textit{Notes:} The bars show the percentage of schools by the type of grant they were assigned to receive through PIP (ranging from 30,000 to 45,000). The values are in Brazilian Reais, which were worth 0.25 US dollars at the beginning of 2016.}
    \end{minipage}
    
\end{figure}
%

% Map
\begin{figure}[htbp]
    \caption{Geographical Distribution of Schools by Treatment Status}
    \label{fig:treat_map}
    \centering
    
    \includegraphics[width=15cm]{DataWork/Output/Figures/figA4-treat_map.png}
    
    \begin{minipage}{0.92\textwidth}
        \small{\textit{Notes:} GPS locations were extracted by scraping the Google Maps API with school names. All but 6 schools in the experimental sample (3 in the control and 3 in the treatment group) were properly located using this method.}
    \end{minipage}
\end{figure}

% IDEB by Participation and Treatment
\begin{figure}[htbp]
    \caption{IDEB by Participation in Socio-Emotional Test and Treatment}
    \label{fig:predict_participation}
    \centering

    \includegraphics[width=14cm]{DataWork/Output/Figures/figA5_predict_participation.png}
    
    \noindent
    \justifying
    \small{\textit{Notes:} The bars show the unconditional means of the school IDEB by participation in the socio-emotional test and by treatment assignment, as described in Section \ref{sec:balance}. We regress IDEB on these 4 categories so that:
    \begin{equation*}
        IDEB_s = \beta_1 \cdot T_{m_s} + \beta_2 \cdot T_{p_s} + \beta_3 \cdot C_{m_s} + \beta_4 \cdot C_{p_s} + \varepsilon_s
    \end{equation*}
    Therefore, we run three different group comparisons (treated schools vs.\ control; participating schools vs.\ missing schools; treated vs.\ control among participating schools) by testing the null hypotheses that $\beta_1 + \beta_2 = \beta_3 + \beta_4$, $\beta_2 + \beta_4 = \beta_1 + \beta_3$, and $\beta_2 = \beta_4$, respectively, through standard $t$-tests. IDEB data refer to 2015 and are from \textit{Instituto Nacional de Estudos e Pesquisas Educacionais Anísio Teixeira} (INEP).}
\end{figure}

\begin{figure}[htbp]
    
    \caption{Quantile Treatment Effect on Average Test Score  in 6\textsuperscript{th} Grade}
    \label{fig:qreg_media_grade6}
    \centering
    \includegraphics[width=14cm]{DataWork/Output/Figures/figA6-qreg_media_grade6.png}
    
    \begin{minipage}{0.825\textwidth}
        \small{\textit{Notes:} Point estimates of quantile regressions with strata (i.e., region) fixed effects and standard errors clustered at the school level. Confidence intervals are 90\%. Sample: schools treated at 6\textsuperscript{th} grade.}
    \end{minipage}
\end{figure}

\begin{figure}[htbp]
    
    \centering
    \caption{Impact on Average Test Score by Gender in 6\textsuperscript{th} Grade}
    \captionsetup[subfigure]{position=top,justification=centering}
    \label{fig:grade6_byGender}
    
    \begin{subfigure}{\textwidth}
        \caption{Distribution}
        \label{fig:kdensity_grade6_byGender}
        \centering
        \includegraphics[width=14cm]{DataWork/Output/Figures/figA7a-kdensity_grade6_byGender.png}
    \end{subfigure}
    
    \begin{subfigure}{\textwidth}
        \caption{Quantile Treatment Effect}
        \label{fig:qreg_media_grade6_byGender}
        \centering
        \includegraphics[width=14cm]{DataWork/Output/Figures/figA7b-qreg_media_grade6_byGender.png}
    \end{subfigure}
    
    \begin{minipage}{0.85\textwidth}
        \small{\textit{Notes:} Average test score is the average of standardized test scores in math, Portuguese, human and natural science (range 0--400). Sample: schools treated at 6\textsuperscript{th} grade. Kernel densities are computed using the Epanechnikov kernel function. Treatment effects in (a) are estimated through regressions with strata (i.e., region and grade) fixed effects and standard errors clustered at the school level. ** and * indicate significance at the 5 and 10 percent critical levels, respectively. In (b), we plot point estimates of quantile regressions with 90\% confidence intervals. Quantile treatment effects are expressed in terms of standard deviations from the control group.}
    \end{minipage}
\end{figure}

\begin{figure}[htbp]

    \caption{Scatter Plot of Cognitive and Socio-Emotional Skills}
    \label{fig:scatter_test_socio}
    
    \begin{subfigure}{\textwidth}
        \caption{All Schools}
        \label{fig:scatter_test_socio_all}
        \raggedleft
        \includegraphics[width=15cm]{DataWork/Output/Figures/figA8a-scatter_test_socio.png}
    \end{subfigure}
    
    \begin{subfigure}{\textwidth}
        \caption{6\textsuperscript{th} Grade}
        \label{fig:scatter_test_socio_grade6}
        \raggedleft
        \includegraphics[width=15cm]{DataWork/Output/Figures/figA8b-scatter_test_socio_grade6.png}
    \end{subfigure}
    
    \centering
    \begin{minipage}{0.9\textwidth}
        \footnotesize{\textit{Notes:} Unit of observation: student. The linear fits are estimated for both treatment and control group through an OLS regression with standard errors clustered at the school level. *** indicates significance at the 1 percent critical level. The sample is restricted to students who took the socio-emotional test. `Average test score' is the average of standardized test scores in math, Portuguese, human and natural science. `Average socio-emotional score' is the average of standardized scores in agreeableness, conscientiousness, extroversion, neuroticism and openness. `Neuroticism' is reverse-coded so that a positive coefficient implies a lower level of neuroticism score. Both variables are expressed in terms of standard deviations from the control group.}
    \end{minipage}
        
\end{figure}

% Figures for Policy Analysis
\begin{figure}[htbp]

    \centering
    \caption{Learning Gains in 6\textsuperscript{th} Grade Rescaled to SAEB -- Projection over Time}
    \captionsetup[subfigure]{position=top,justification=centering}
    \label{fig:itt_ProvaBrasil}
    
    \begin{subfigure}{\textwidth}
        \centering
        \caption{Math}
        \label{fig:itt_ProvaBrasil_MT}
        \includegraphics[width=13cm, height=8.75cm]{DataWork/Output/Figures/figA9a-itt_ProvaBrasil_MT.png}
    \end{subfigure}
    \begin{subfigure}{\textwidth}
        \centering
        \caption{Portuguese}
        \label{fig:itt_ProvaBrasil_PT}
        \includegraphics[width=13cm]{DataWork/Output/Figures/figA9b-itt_ProvaBrasil_LT.png}
    \end{subfigure}
    
    \begin{minipage}{0.825\textwidth}
        \small{\textit{Notes:} We use data from \textit{Instituto Nacional de Estudos e Pesquisas Educacionais Anísio Teixeira} (INEP) for state public schools in Rio Grande do Norte and Brazil. In particular, we use the average for the cohort who was in 5\textsuperscript{th} grade in 2013 and 9\textsuperscript{th} grade in 2017. The points in 6\textsuperscript{th}, 7\textsuperscript{th}, and 8\textsuperscript{th} grades are linear interpolations. The PIP intent-to-treat effect on 6\textsuperscript{th} graders is estimated through OLS with strata (i.e., region and grade) fixed effects and standard errors clustered at the school level. ** and * indicate significance at the 5 and 10 percent critical levels, respectively.}
    \end{minipage}
    
\end{figure}

\vfill
\begin{figure}[htbp]

    \centering
    \caption{Learning Gains in 6\textsuperscript{th} Grade Rescaled to \textit{IDEB} -- Comparison with Other Brazilian States}
    
    \includegraphics[width=15cm]{DataWork/Output/Figures/figA10-itt_IDEB.png}
    \label{fig:itt_IDEB}
    
    \begin{minipage}{0.825\textwidth}
        \small{\textit{Notes:} We use data from \textit{Instituto Nacional de Estudos e Pesquisas Educacionais Anísio Teixeira} (INEP) for state public schools. The bars show average IDEB by state in 2015. The PIP intent-to-treat effect on 6\textsuperscript{th} graders is estimated through OLS with strata (i.e., region and grade) fixed effects and robust standard errors. ** indicates significance at the 5 percent critical level. See Appendix \ref{sec:ideb} for the methodology we follow to compute IDEB for our grades of interest.}
    \end{minipage}
    
\end{figure}
\vfill

\clearpage
\null
\vfill
\begin{table}[ht!]

	\caption{Effect of Teacher Permanence on Education Outcomes in Brazil}
	\label{tab:correlates_turnover}
	\centering
	\begin{adjustbox}{max width=\linewidth}
		\begin{tabular}{lcccc} \hline\hline
		    \input{DataWork/Output/Tables/tabA1-correlates_turnover.tex}
		    \multicolumn{5}{@{}p{0.775\textwidth}}{\footnotesize \textit{Notes}: \sym{*}Significant at 10\%. \sym{**}Significant at 5\%. \sym{***}Significant at 1\%. Unit of observation: school. Year: 2015. Outcome variables in the column headers. `Age-grade distortion' is the percentage of students in one grade who are older than the expected age for that grade. `Teacher permanence index' is the school weighted average of \textit{Indicador de Regularidade Docente}, which takes values between 0 and 5 and is defined as the frequency of a teacher in a school during the last 5 years. The index is standardized so that the coefficients can be interpreted as the effect of one-standard-deviation change in such index. The mean of the `teacher permanence index' in the sample is 3.04 and the standard deviation is 0.85. All regressions are OLS. Standard errors clustered at the state level in parentheses. The sample is the universe of schools in Brazil. Data are from \textit{Instituto Nacional de Estudos e Pesquisas Educacionais Anísio Teixeira} (INEP): \href{https://portal.inep.gov.br/indicadores-educacionais}{https://portal.inep.gov.br/indicadores-educacionais}.}
		\end{tabular}
	\end{adjustbox}
	
\end{table}
\vfill

% Balance in participation in tests
\begin{table}[htbp]

    \caption{Balance in Socio-Emotional and Proficiency Test Participation}
    \label{tab:baltab_participation}
    \centering
    \begin{adjustbox}{totalheight=\textheight} 
        \begin{tabular}{lcccccccc} \hline \hline
            \input{DataWork/Output/Tables/tabA2-baltab_participation.tex}
            \multicolumn{9}{@{}p{1.06\textwidth}}{\textit{Notes}: `\textit{Participating schools}' is a dummy for schools which had at least one test taker. `\textit{Percentage of test takers}' is defined as the percentage of students who took the test for each school in the sample, conditional on the school being a `participating school'. Robust standard errors (SE) in parentheses. Strata (i.e., region) fixed effects are included in all the estimated regressions. We show both standard p-values and p-values computed using randomization inference (RI) with 10,000 repetitions for the whole sample and each grade.} 
        \end{tabular}
    \end{adjustbox}
    
\end{table}

\begin{table}[htbp]

    \caption{Balance Table on Subsample of Schools with Socio-Emotional Test Takers}
    \label{tab:baltab_socio_schoollevel}
    \centering
    \begin{adjustbox}{max width=\textwidth}
        \begin{tabular}{lcccccccc} \hline \hline
            \input{DataWork/Output/Tables/tabA3-baltab_socio_schoollevel.tex}
            \multicolumn{9}{@{}p{1.525\textwidth}}{\textit{Notes}: For school and grade level comparisons we use data from the 2015 Rio Grande do Norte school census (\textit{Instituto Nacional de Estudos e Pesquisas Educacionais Anísio Teixeira} -- INEP) and progression rates from \textit{Sistema Integrado de Gestão da Educação} (SIGEduc) portal. At the teacher and student level, we compare socio-demographics at the beginning of the year of the intervention, i.e., 2016, from that year's Rio Grande do Norte school census. Teacher data regard only those teachers who taught in the classes involved in the project, and not from other grades. Student data regard students enrolled in those grades at the beginning of the school year. The sample is restricted to schools that had at least one socio-emotional test taker. Standard errors (SE) are robust in Panels A and B, and clustered at the school level in Panels C and D. Strata (i.e., region and grade) fixed effects are included in all the estimated regressions. We show both standard p-values and p-values computed using randomization inference (RI) with 10,000 repetitions for the whole sample and for each grade.}
        \end{tabular}
    \end{adjustbox}
    
\end{table}

\begin{table}[htbp]

    \caption{Balance Table on Subsample of Schools with Proficiency Test Takers}
    \label{tab:baltab_proficiencia_media_schoollevel}
    \centering
    \begin{adjustbox}{max width=\textwidth}
        \begin{tabular}{lcccccccc} \hline \hline
            \input{DataWork/Output/Tables/tabA4-baltab_proficiencia_media_schoollevel.tex}
            \multicolumn{9}{@{}p{1.525\textwidth}}{\textit{Notes}: For school- and grade-level comparisons, we use data from the 2015 Rio Grande do Norte school census (\textit{Instituto Nacional de Estudos e Pesquisas Educacionais Anísio Teixeira} -- INEP) and progression rates from the \textit{Sistema Integrado de Gestão da Educação} (SIGEduc) portal. At the teacher and student level, we compare socio-demographics at the beginning of the year of the intervention, i.e., 2016, from that year's Rio Grande do Norte school census. Teacher data cover only those teachers who taught in the classes involved in the project, not teachers from other grades. Student data cover students enrolled in those grades at the beginning of the school year. The sample is restricted to schools that had at least one proficiency-test taker. Standard errors (SE) are robust in Panels A and B, and clustered at the school level in Panels C and D. Strata (i.e., region and grade) fixed effects are included in all the estimated regressions. We show both standard p-values and p-values computed using randomization inference (RI) with 10,000 repetitions for the whole sample and for each grade.}
        \end{tabular}
    \end{adjustbox}
    
\end{table}
\FloatBarrier

\begin{table}[htbp]

    \caption{Impact on Student Progression Rates -- Heterogeneity by Gender}
    \label{tab:promotion_het_gender}
    \centering
    \begin{adjustbox}{max width=\textwidth}
        \begin{tabular}{lcccc} \hline \hline
            \input{DataWork/Output/Tables/tabA5-promotion_het_gender.tex}
            \multicolumn{5}{@{}p{0.8\textwidth}}{\footnotesize \textit{Notes}: \sym{*}Significant at 10\%. \sym{**}Significant at 5\%. \sym{***}Significant at 1\%. Data are from Rio Grande do Norte census. Unit of observation: student. Outcome variables in the panel headers. All regressions are OLS with strata (i.e., region and grade) fixed effects. Standard errors clustered at the school level in parentheses.}
        \end{tabular} 
    \end{adjustbox}
    
\end{table}

\clearpage
\null
\vfill
\begin{table}[ht!]
    \caption{Impact on Student Progression Rates -- Heterogeneity by Passing Rate at Baseline}
    \label{tab:promotion_het}
    \centering
    \begin{adjustbox}{max width=\textwidth}
        \begin{tabular}{lcccccccc} \hline \hline
            \input{DataWork/Output/Tables/tabA6-promotion_het.tex}
            \multicolumn{9}{@{}p{1.4\textwidth}}{\small \textit{Notes}: \sym{*}Significant at 10\%. \sym{**}Significant at 5\%. \sym{***}Significant at 1\%. School-level data are from \textit{Sistema Integrado de Gestão da Educação} (SIGEduc) and student-level data are from Rio Grande do Norte census. Unit of observation: school and student. Outcome variables in the panel headers. All regressions are OLS with strata (i.e., region and grade) fixed effects. Robust standard errors for school-level regressions and standard errors clustered at the school level for student-level regressions in parentheses. Note that the coefficient on the high-turnover dummy at baseline for 5\textsuperscript{th} grade is not identified because the median itself is equal to 0. This is due to the fact that, in most schools, 5\textsuperscript{th} grade has only one teacher, thus the school turnover rate variable is either equal to 0 or 1.}
        \end{tabular} 
    \end{adjustbox}
\end{table}
\vfill

% Impact of retention on dropout
\null
\vfill
\begin{table}[ht!]
    \caption{Impact of 6\textsuperscript{th} Grade Retention on Student Achievement}
    \label{tab:retention_grade6_regs}
    \centering
    \begin{adjustbox}{max width=\textwidth}
        \begin{tabular}{lcccccc} \hline \hline
            \input{DataWork/Output/Tables/tabA7-retention_grade6_regs.tex}
            \multicolumn{7}{@{}p{\textwidth}}{\footnotesize \textit{Notes}: \sym{*}Significant at 10\%. \sym{**}Significant at 5\%. \sym{***}Significant at 1\%. Outcome variables in the column headers. All regressions are OLS. Standard errors clustered at the school -- columns (1), (2), (4), (5) -- or class -- columns (3), (6) -- level in parentheses. The sample is the universe of 6\textsuperscript{th} grade students of public schools in Rio Grande do Norte. `Dropout' is a dummy variable equal to 1 if the student dropped out in any year between 2011 and 2016, and 0 otherwise. `Years of completed schooling' is measured in the last year in which the student appears in the census database. When the student drops out, we take his/her last grade as his/her level of completed schooling. Data are from the 2011--2017 censuses.}
        \end{tabular}
    \end{adjustbox}
\end{table}
\vfill

% Drivers of Implementation
\begin{table}[htbp]
    \caption{Drivers of Implementation}
    \label{tab:predict_implementation}
    \centering
    \begin{adjustbox}{max width=\textwidth}
        \begin{tabular}{lccc} \hline \hline
            \input{DataWork/Output/Tables/tabA8-predict_implementation.tex}
            \multicolumn{4}{@{}p{0.79\textwidth}}{\footnotesize \textit{Notes}: \sym{*}Significant at 10\%. \sym{**}Significant at 5\%. \sym{***}Significant at 1\%. Data are from 2015 Rio Grande do Norte school census, State Secretary of Education and Culture (SEEC),  and \textit{Sistema Integrado de Gestão da Educação} (SIGEduc). `Implementation' is defined as a school having the ratio of the number of activities that were implemented over the number of planned activities described in the work plan above 70\%. `School infrastructure index' is constructed through principal component analysis of the following dummy variables: whether school has internet, library, science lab, and is located in an urban area. Unit of observation: school. All regressions are OLS with strata (i.e., region and grade) fixed effects. Robust standard errors in parentheses.}
        \end{tabular} 
    \end{adjustbox}
\end{table}

\begin{table}[htbp]
    \caption{Impact on Probability of School Obtaining the Clearance Certificate}
    \label{tab:clearance_certificate}
    \centering
    \begin{adjustbox}{max width=\textwidth}
        \begin{tabular}{lcccc} \hline \hline
            \input{DataWork/Output/Tables/tabA9-clearance_certificate.tex}
            \multicolumn{5}{@{}p{0.825\textwidth}}{\footnotesize \textit{Notes}: \sym{*}Significant at 10\%. \sym{**}Significant at 5\%. \sym{***}Significant at 1\%. Data are from State Secretary of Education and Culture (SEEC). Unit of observation: school. `Treatment effect comparisons by grade' are based on the regression in column (1) with grade interaction terms. All regressions are OLS with strata (i.e., region and grade) fixed effects. Robust standard errors in parentheses.}
        \end{tabular} 
    \end{adjustbox}
\end{table}


%%%%%%%%%%%%%%%%%%%%%% APPENDIX ON BACK-OF-THE-ENVELOPE CALCULATIONS %%%%%%%%%

\clearpage
\section{Back-of-the-Envelope Estimations} 
\setcounter{table}{0}
\setcounter{figure}{0}
\setcounter{equation}{0}
\renewcommand{\thetable}{B\arabic{table}}
\renewcommand{\thefigure}{B\arabic{figure}}
\renewcommand{\theequation}{B\arabic{equation}}

\subsection{IDEB} \label{sec:ideb}

\subsubsection*{Methodology}
The Brazilian Education Development Index (\textit{Índice de Desenvolvimento da Educação Básica} -- IDEB) was created by INEP in 2007 as an indicator that aggregates the two main drivers of education quality: student proficiency as quantified by standardized exams and student attainment as measured by grade passing rates.\footnote{The informative and technical notes (in Portuguese) on how MEC compiles IDEB are available at \href{https://download.inep.gov.br/educacao_basica/portal_ideb/o_que_e_o_ideb/nota_informativa_ideb.pdf}{https://download.inep.gov.br/educacao\_basica/portal\_ideb/o\_que\_e\_o\_ideb/nota\_informativa\_ideb.pdf} and \href{https://download.inep.gov.br/educacao_basica/portal_ideb/o_que_e_o_ideb/Nota_Tecnica_n1_concepcaoIDEB.pdf}{https://download.inep.gov.br/educacao\_basica/portal\_ideb/o\_que\_e\_o\_ideb/Nota\_Tecnica\_n1\_concepcaoIDEB.pdf}. Our methodological discussion faithfully reflects the contents of these two documents.} Since then, IDEB has been regularly employed to monitor the evolution of the Brazilian education system and to compare the experiences of different states.

In order to have a comparable measure of learning gains, IDEB uses the national standardized exams in math and Portuguese, known as SAEB. This test is administered to all public and private schools every two years. In particular, students in the last year of primary, middle, and high schools, i.e., 5\textsuperscript{th}, 9\textsuperscript{th}, and 12\textsuperscript{th} grades, are evaluated. SAEB tests are based on IRT so as to define a unique scale for all grades and years of the national education system. This is done by including items from the previous grades and years in the test.\footnote{In addition to the test, students, teachers, and principals complete socio-economic and cultural questionnaires, which MEC uses to better understand the tested schools.}

To compute IDEB, SAEB scores are standardized on a scale between 0 and 10, following the equation
\begin{equation}
    N_{sj} = \frac{score_{sj} - min_j}{max_j - min_j} \cdot 10
\end{equation}
where $j$ is the subject of the test, i.e., either math or Portuguese, and $s$ is the school identifier. $min_j$ and $max_j$ are the lower and upper limits of subject $j$ in the 1997 SAEB (the first year in which the test was administered nationwide). Namely, these limits were computed by taking the values three standard deviations, $\sigma_{j}$, away from the average, $\mu_{j}$, of the 1997 scores in each subject:
\begin{equation}
    min_{j} = \mu_{j} - 3 \cdot \sigma_{j}\,;\quad max_{j} = \mu_{j} + 3 \cdot \sigma_{j}
\end{equation}
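
For intuition, consider a purely hypothetical subject with $\mu_{j} = 250$ and $\sigma_{j} = 50$ (these figures are illustrative assumptions, not the actual 1997 SAEB moments). The limits and the standardized score of a school with $score_{sj} = 235$ would then be
\begin{equation*}
    min_{j} = 250 - 3 \cdot 50 = 100\,;\quad max_{j} = 250 + 3 \cdot 50 = 400\,;\quad N_{sj} = \frac{235 - 100}{400 - 100} \cdot 10 = 4.5
\end{equation*}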

Finally, the arithmetic mean of math and Portuguese standardized scores is taken
\begin{equation} \label{eq:n}
    N_{s} = \frac{N_{s,j=math} + N_{s,j=Portuguese}}{2}
\end{equation}

With regard to student attainment, IDEB uses an indicator of achievement at the school level, $P_s$, which is obtained by taking the inverse of $T_s$, the average of the inverse passing rates across the grades of primary, middle, or high school. In mathematical notation,
\begin{equation} 
    T_s = \frac{\sum_{y=1}^{Y} \frac{1}{p_{sy}}}{Y}
\end{equation}
\begin{equation} \label{eq:p}
    P_s = \frac{1}{T_s}
\end{equation}
where $y$ is the grade of interest, $Y$ is the total number of grades with positive passing rates in school $s$, and $p_{sy}$ is the grade-level passing rate. In the absence of dropout, $T_s$ measures the average time (in years) that a student in school $s$ takes to complete one grade of a given stage of education.

Hence, IDEB results from multiplying the two indicators defined in Equations \ref{eq:n} and \ref{eq:p}
\begin{equation}
    IDEB_s = N_s \cdot P_s
\end{equation}
\begin{equation}
    0 \leq N_s \leq 10\,;\; 0 \leq P_s \leq 1\,;\; 0 \leq IDEB_s \leq 10
\end{equation}
and is equal to the standardized 0-10 score in SAEB adjusted for the average time (in years) it takes to conclude one grade in that stage of education.
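
As a minimal worked illustration with hypothetical numbers (not drawn from our data), take a school with $N_s = 4.5$ and a stage with $Y = 2$ grades whose passing rates are $p_{s1} = 0.80$ and $p_{s2} = 1.00$. Then
\begin{equation*}
    T_s = \frac{1}{2}\left(\frac{1}{0.80} + \frac{1}{1.00}\right) = 1.125\,;\quad P_s = \frac{1}{1.125} \approx 0.889\,;\quad IDEB_s = 4.5 \cdot 0.889 \approx 4.0
\end{equation*}
In this example, and abstracting from dropout, a student needs on average 1.125 years to complete one grade, and the proficiency score is scaled down accordingly.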

\subsubsection*{Estimation}
As mentioned in Section \ref{sec:data}, the state standardized tests on which we base our analysis were rescaled to the SAEB IRT range, which allows us to compute $N_s$, as defined in Equation \ref{eq:n}, for each school in our sample. As we described in the paper, the PIP was implemented in the last year of primary school, i.e., the 5\textsuperscript{th} grade, but not in the last years of middle and high school. This means that we are not able to compute the IDEB for those grades; instead, we focus on the grade of the intervention.

On the other hand, we use passing rates in the grade of the intervention to calculate $P_s$. Again, as we are looking only at one grade of a stage of education, $P_s$ will be equal to the passing rate (in percentage points) in that grade.
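
To see this, set $Y = 1$ in the definition of $T_s$, writing $y$ for the single grade of the intervention:
\begin{equation*}
    T_s = \frac{1}{p_{sy}} \quad \Longrightarrow \quad P_s = \frac{1}{T_s} = p_{sy}
\end{equation*}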

Combining these two variables, we calculate a grade-level measure of IDEB for schools in the treatment and control groups. We then use this index to estimate the ITT in terms of IDEB points, employing the model defined in Equation \ref{eq:OLS}. The results are shown in Table \ref{tab:IDEB_schoollevel}.

In line with the baseline results on standardized test scores and passing rates, the only significant effect is found in 6\textsuperscript{th} grade. The intervention had an ITT of 0.28 IDEB points.
We take this coefficient to assess how the PIP would move RN across the national distribution. In order to do so, we compare lower-secondary IDEB in 2015 for all Brazilian states.\footnote{As SAEB tests take place every two years, we are not able to have comparable data from 2016, which was the year in which PIP was actually implemented.} As one can see in Figure \ref{fig:itt_IDEB}, RN was the third-worst state in terms of quality of education, after Sergipe and Alagoas. The increase in IDEB caused by PIP, as estimated above, would shift RN from the bottom decile to the third decile.\footnote{The results are robust to the inclusion of school-level control variables.}

\vspace{3em}

\begin{table}[ht!]

    \caption{Impact on IDEB}
    \centering
    \label{tab:IDEB_schoollevel}
    \begin{adjustbox}{max width=\textwidth}
        \begin{tabular}{lccc} \hline \hline
            \input{DataWork/Output/Tables/tabB1-IDEB_schoollevel.tex}
            \multicolumn{4}{@{}p{0.625\textwidth}}{\footnotesize \textit{Notes}: \sym{*}Significant at 10\%. \sym{**}Significant at 5\%. \sym{***}Significant at 1\%. Unit of observation: school. Regressions are OLS with strata (i.e., region and grade) fixed effects. Robust standard errors in parentheses.}
        \end{tabular}
    \end{adjustbox}
\end{table}
%

\clearpage

\subsection{Net Present Value of Increased Learning} \label{sec:NPV}

Increased learning is associated with long-term labor market returns, assuming that the accumulation of human capital is sustained over time. In this subsection, we follow the method proposed by \citet{evans2019equivalent} to translate the impact of the education intervention into the net present value (NPV) of potential increased lifetime earnings. The NPV is defined as
\begin{equation}
    NPV = \sum_{k=20-\alpha}^{N} \frac{\Delta Y \cdot \beta \cdot w}{(1+i)^k}
\end{equation}

where $\Delta Y$ is the number of equivalent years of education caused by the intervention, $\beta$ is the return to one year of education, $w$ is the real wage, $i$ is the discount rate, $\alpha$ is the age at which the student was targeted by the intervention, and $N$ is his/her expected work life.

Hence, $\Delta Y \cdot \beta$ represents the predicted wage increase stemming from the learning improvement. Assuming constant wages over time, this translates into an additional income of $\Delta Y \cdot \beta \cdot w$ for an average worker.\footnote{This is a conservative approach: as we expect wages to grow over time, the actual NPV from the intervention may be higher than the one we estimate hereafter.} As students enter the labor market only at a later stage (when they are 20 years old), the first of these wage gains is discounted by $k=20-\alpha$ years. We then sum the discounted yearly gains across the whole work life of a student.

We use the 2016 Annual Social Information Report (\textit{Relação Anual de Informações Sociais} -- RAIS) from the Ministry of Labor and Employment to retrieve the average wage in RN (this refers to the formal sector) and, therefore, to estimate the return to education in RN through a conventional Mincerian equation \citep{mincer1974schooling}. Namely, the average wage in 2016 was 24,486.48 BRL, i.e., around 6,000 US\$. In line with recent estimates by \citet{psacharopoulos2018returns}, we find the return to one extra year of education in RN to be around 10\%. The age of 6\textsuperscript{th} graders, who received the intervention, was on average 12 years, and we assume the expected work life to be 40 years (which means an average worker retires when he/she is 60). Finally, the discount rate is taken at 3\%.
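
Because the annual gain $\Delta Y \cdot \beta \cdot w$ is constant under the assumptions above, the sum in the NPV definition is a finite geometric series. The following sketch takes the summation limits of the NPV equation literally and is meant only to illustrate the discounting, not to reproduce the exact computation behind the reported figures:
\begin{equation*}
    NPV = \Delta Y \cdot \beta \cdot w \sum_{k=20-\alpha}^{N} \frac{1}{(1+i)^k} = \Delta Y \cdot \beta \cdot w \cdot \frac{(1+i)^{-(19-\alpha)} - (1+i)^{-N}}{i}
\end{equation*}
With $\alpha = 12$, $i = 0.03$, and $N = 40$, the factor multiplying the annual gain would be $\left(1.03^{-7} - 1.03^{-40}\right)/0.03 \approx 16.88$.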

Using our ITT estimates, as computed in Section \ref{sec:results}, we find that PIP would increase annual wages by 5\%. This would mean a shift of the median worker to the 6\textsuperscript{th}, or 7\textsuperscript{th}, decile, respectively, in the wage distribution of RN (Figure \ref{fig:kdensity_wage}). Considering the whole work life, the intervention has a predicted NPV between 29,148.97 and 52,468.15 BRL, i.e., 7,287.24 to 13,117.03 US\$. This is roughly equivalent to one year of Brazil's average income per capita.

\vspace{3em}

\begin{figure}[ht!]

    \centering
    \caption{Learning Gains in 6\textsuperscript{th} Grade Rescaled to Annual Wage}
    \includegraphics[width=16cm]{DataWork/Output/Figures/figB1-kdensity_wage.png}
    \label{fig:kdensity_wage}
     
    \begin{minipage}{0.9\textwidth}
        \small{\textit{Notes:} Kernel densities are computed using Epanechnikov kernel function. The three horizontal lines represent the median wage of Rio Grande do Norte, which is considered as counterfactual, and of the median PIP student, assuming the effects on equivalent years of education estimated in Section \ref{sec:results} through an OLS model. The sample is the universe of formal workers in Rio Grande do Norte in 2016. Data are from \textit{Relação Anual de Informações Sociais} (RAIS). N = 801,956.}
    \end{minipage}
\end{figure}

\end{document}