This repository contains the code and data used in the article "Estudo de Modelos para a Previsão de Arrecadação do ICMS do Rio de Janeiro" by João Pedro Verçosa.

JPVercosa/icms-prediction

ICMS Prediction for Rio de Janeiro

This repository contains the code and data used in the article "Estudo de Modelos para a Previsão de Arrecadação do ICMS do Rio de Janeiro" by João Pedro Verçosa. The study explores various machine learning models to forecast the ICMS revenue in Rio de Janeiro, using data from 2004 to 2022.

Table of Contents

  1. Introduction
  2. Data
  3. Models
  4. Usage
  5. Results

Introduction

The purpose of this study is to analyze and predict the ICMS revenue in Rio de Janeiro using advanced machine learning techniques. The models used include Random Forest, XGBoost, and Long Short-Term Memory (LSTM) neural networks. The study aims to provide more accurate forecasting to support government planning and decision-making.

Data

The dataset consists of time series for various economic and social indicators, which were compared with the ICMS time series using Dynamic Time Warping (DTW; see the Dynamic Time Warping notebook). The series were collected from open data sources such as the Portal de Dados Abertos [1], the SGS-Sistema Gerenciador de Séries Temporais [2], and the Empresa de Pesquisa Energética [3]. The data used in this study is available in the data directory of this repository.
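As a rough illustration of how a DTW comparison between the ICMS series and a candidate indicator can work, here is a minimal, library-free sketch. The function names `dtw_distance` and `rank_by_dtw` and the toy series are assumptions made for this example; the repository's notebook may use a different implementation, local cost, or windowing constraint.

```python
from math import inf


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW using absolute difference as local cost."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def rank_by_dtw(target, candidates):
    """Return (distance, name) pairs sorted from most to least similar."""
    return sorted((dtw_distance(target, s), name) for name, s in candidates.items())


# Toy data only (not taken from the repository).
target = [1, 2, 3, 4, 3, 2]
candidates = {
    "shifted": [1, 1, 2, 3, 4, 3, 2],  # same shape, delayed by one step
    "flat": [2, 2, 2, 2, 2, 2],
}
ranking = rank_by_dtw(target, candidates)
```

Because DTW sums raw differences, series on different scales are usually rescaled before being compared; otherwise the indicator with the largest magnitude dominates the ranking.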

Models

The models explored in this study are:

  • Random Forest
  • XGBoost
  • LSTM Neural Networks

Each model is trained and evaluated using a set of parameters optimized through Grid Search. Details on the parameter values and optimization process are provided in the article and the accompanying Jupyter notebooks in this repository.
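The idea behind Grid Search can be sketched without any library: try every combination of candidate parameter values and keep the one with the lowest validation error. The snippet below is only an illustration. It uses a deliberately simple stand-in forecaster (a decay-weighted moving average) instead of Random Forest, XGBoost, or LSTM, and the names `forecast_next`, `validation_mae`, and `grid_search`, the grid values, and the synthetic series are all assumptions, not the settings used in the article.

```python
from itertools import product


def forecast_next(history, window, decay):
    """Stand-in model: decay-weighted mean of the last `window` observations."""
    recent = history[-window:]
    # The newest observation gets weight 1; older ones decay geometrically.
    weights = [decay ** k for k in range(len(recent) - 1, -1, -1)]
    return sum(w * x for w, x in zip(weights, recent)) / sum(weights)


def validation_mae(series, n_valid, **params):
    """Mean absolute error of one-step-ahead forecasts on the last n_valid points."""
    errors = []
    for t in range(len(series) - n_valid, len(series)):
        # Only observations before time t are visible to the model.
        errors.append(abs(series[t] - forecast_next(series[:t], **params)))
    return sum(errors) / len(errors)


def grid_search(series, n_valid, param_grid):
    """Evaluate every combination in param_grid; return (best_params, best_score)."""
    names = list(param_grid)
    best_params, best_score = None, float("inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = validation_mae(series, n_valid, **params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Synthetic upward trend, not real ICMS data.
series = [float(10 + t) for t in range(24)]
grid = {"window": [1, 3, 6], "decay": [0.5, 1.0]}
best_params, best_score = grid_search(series, n_valid=6, param_grid=grid)
```

The validation points are taken from the end of the series so that no future observation leaks into a forecast. In scikit-learn, `GridSearchCV` combined with `TimeSeriesSplit` plays a comparable role; whether the notebooks use that exact combination is not stated in this README.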

Usage

To run the code, you need Python and the required libraries. The dependencies can be installed from the provided requirements.txt file:

pip install -r requirements.txt

Running the Models

Each model has its own Jupyter notebook in the notebooks directory.

Open and run these notebooks to reproduce the results of the study. They cover every step, from data preprocessing through model training to evaluation.

Data Preprocessing

The data preprocessing steps are in the preprocessing.ipynb notebook, which normalizes the data and prepares it for model training. Run preprocessing.ipynb before running the model notebooks.
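The README does not say which normalization is applied, so the following is a sketch under the assumption of min-max scaling. The helper names `fit_min_max`, `scale`, and `unscale` are hypothetical. The point it illustrates is that the scaling parameters are fitted on the training portion only and then reused for later data and for mapping forecasts back to the original units.

```python
def fit_min_max(train):
    """Learn the scaling range from the training portion only."""
    lo, hi = min(train), max(train)
    if hi == lo:
        raise ValueError("a constant series cannot be min-max scaled")
    return lo, hi


def scale(values, lo, hi):
    """Map values so that the training range becomes [0, 1]."""
    return [(v - lo) / (hi - lo) for v in values]


def unscale(values, lo, hi):
    """Invert `scale`, e.g. to report forecasts in the original units."""
    return [v * (hi - lo) + lo for v in values]


# Toy monthly values (not real data); the last point is held out for testing.
series = [100.0, 150.0, 200.0, 250.0]
train = series[:3]
lo, hi = fit_min_max(train)
scaled = scale(series, lo, hi)
restored = unscale(scaled, lo, hi)
```

Held-out values can fall outside [0, 1] after scaling, as the last point does here. That is expected when the range is learned from the training data alone.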

Results

The study found that the Random Forest and XGBoost models performed better than the LSTM model in terms of predictive accuracy. The best performance was achieved using a multivariate approach with ICMS and total oil production series, yielding a Mean Absolute Percentage Error (MAPE) of 10.01% over a 12-month forecast horizon.
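For reference, MAPE averages the absolute forecast error relative to the actual value over the forecast horizon and expresses it as a percentage. A minimal sketch follows; the function name `mape` and the toy numbers are assumptions for illustration and are not results from the study.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Toy two-month example: each forecast is off by 10% of the actual value.
example = mape([100.0, 200.0], [110.0, 180.0])
```

For a 12-month horizon, `actual` and `predicted` would each hold the 12 monthly values being compared.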

For more detailed information, please refer to the full article (in Portuguese): Estudo de Modelos para a Previsão de Arrecadação do ICMS do Rio de Janeiro.

Footnotes

  1. Portal de Dados Abertos

  2. SGS-Sistema Gerenciador de Séries Temporais

  3. Empresa de Pesquisa Energética
