Time Series Forecasting of Energy Consumption in Turkey

Project Overview

This repository contains a time series forecasting project that predicts the hourly energy consumption of Turkey. Data from 2016 to 2018 is used to train a model that forecasts the year 2019. The dataset is available on Kaggle: https://www.kaggle.com/datasets/hgultekin/hourly-power-consumption-of-turkey-20162020. The project aims to demonstrate my skills in handling time series data.

Data

The dataset used is "Hourly Power Consumption of Turkey (2016-2020)" from Kaggle, which contains hourly energy consumption in megawatt-hours (MWh). The data from 2016 to 2018 is used to train the model, and the data from 2019 is held out to test and validate the predictions.
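The year-based split described above can be sketched as follows. This is a minimal illustration on synthetic data, not the repository's code: the column name `consumption_mwh` and the helper `split_by_year` are assumptions introduced for the example.

```python
import numpy as np
import pandas as pd


def split_by_year(df, train_years, test_year):
    """Split a DataFrame with a DatetimeIndex into train and test sets by calendar year."""
    years = df.index.year
    train = df[years.isin(train_years)]
    test = df[years == test_year]
    return train, test


# Synthetic hourly series standing in for the Kaggle data (hypothetical column name).
index = pd.date_range("2016-01-01 00:00", "2019-12-31 23:00", freq="h")
rng = np.random.default_rng(0)
df = pd.DataFrame({"consumption_mwh": rng.normal(30000, 3000, len(index))}, index=index)

train, test = split_by_year(df, train_years=[2016, 2017, 2018], test_year=2019)
print(len(train), len(test))
```

Splitting on calendar years rather than at random keeps every test observation later than every training observation, so the model is never evaluated on hours it could have seen neighbours of.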

Methodology

The project follows these steps:

1. Data Preprocessing (make_dataset.py): Importing, cleaning, and preparing the dataset for analysis.
   1.1 Advanced Data Preprocessing (make_dataset_v2.py): Adding rows for missing hours and interpolating their values.
2. Exploratory Data Analysis (visualize.py): Visualizing the energy consumption trends over time.
3. Feature Engineering (build_features.py): Creating additional features such as day of the week, month, and hour to improve the model's performance.
   3.1 Advanced Feature Engineering (build_features_v2.py): Creating further features such as day of the year and yearly lags 1 and 2 (accounting for leap years).
4. Model Training and Evaluation (train_model.py): Using XGBoost for time series forecasting and evaluating its performance on the 2019 data.
   - The XGBoost model was trained with early stopping to prevent overfitting.
   - Feature importance was analyzed to understand the impact of the different time-related features.
   - The model's predictions for 2019 were plotted against the actual values to show how closely they track the real data.
   - Mean Absolute Error (MAE), Mean Squared Error (MSE), and the R² score were used to quantify the model's accuracy.
   4.1 Model Training and Evaluation (train_model_v2.py): The same model trained on the data with the more advanced cleaning and feature engineering.
5. Anomaly Detection (anomaly_detection.py): Identifying unusual patterns in energy consumption.
   - Used the Isolation Forest algorithm to identify anomalies in the dataset.
   - Configured the model with a contamination factor of 0.01, i.e. an expected outlier proportion of 1%.
   - Labeled each data point as normal or anomalous based on the model's output.
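The preprocessing and feature engineering steps listed above (adding missing hours, interpolating, calendar features, yearly lags) might look roughly like the sketch below. It is not taken from the repository: the function names, the column name `consumption_mwh`, the choice of time-based linear interpolation, and the use of `pd.DateOffset(years=1)` for the leap-year handling are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def fill_missing_hours(series):
    """Reindex to a complete hourly range and interpolate the gaps linearly in time."""
    full_index = pd.date_range(series.index.min(), series.index.max(), freq="h")
    return series.reindex(full_index).interpolate(method="time")


def add_time_features(series):
    """Build calendar features and a one-year lag from a gap-free hourly series."""
    df = series.to_frame(name="consumption_mwh")
    df["hour"] = df.index.hour
    df["day_of_week"] = df.index.dayofweek  # Monday = 0
    df["month"] = df.index.month
    df["day_of_year"] = df.index.dayofyear
    # Look up the value at the same timestamp one calendar year earlier.
    # DateOffset(years=1) maps 29 Feb onto 28 Feb, one possible way to handle leap years.
    previous_year = df.index - pd.DateOffset(years=1)
    df["lag_1y"] = series.reindex(previous_year).to_numpy()
    return df


# Synthetic two-year hourly series with one hour deliberately dropped.
index = pd.date_range("2016-01-01 00:00", "2017-12-31 23:00", freq="h")
series = pd.Series(np.arange(len(index), dtype=float), index=index)
raw = series.drop(pd.Timestamp("2016-06-01 12:00"))

filled = fill_missing_hours(raw)
features = add_time_features(filled)
print(features.loc[pd.Timestamp("2017-03-01 00:00")])
```

Rows in the first year have no earlier value to look up, so `lag_1y` is NaN there. A two-year lag would follow the same pattern with `pd.DateOffset(years=2)`.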
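The anomaly detection step can be illustrated with scikit-learn's `IsolationForest`. The snippet below is a sketch on synthetic data with a few injected spikes, not the contents of anomaly_detection.py. Only the contamination value of 0.01 comes from the description above; the column name, the random seed, and the single-feature input are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

# Synthetic hourly consumption with three injected extreme values (illustrative data only).
rng = np.random.default_rng(42)
index = pd.date_range("2016-01-01", periods=2000, freq="h")
df = pd.DataFrame({"consumption_mwh": rng.normal(30000, 1000, len(index))}, index=index)
df.iloc[[100, 900, 1500], 0] = [60000.0, 5000.0, 65000.0]

# contamination=0.01 tells the model to flag roughly 1% of the points as outliers.
model = IsolationForest(contamination=0.01, random_state=42)

# fit_predict returns 1 for inliers and -1 for outliers.
labels = model.fit_predict(df[["consumption_mwh"]])
df["anomaly"] = np.where(labels == -1, "anomaly", "normal")

print(df["anomaly"].value_counts())
```

Because the contamination factor fixes the share of flagged points in advance, the model will label about 1% of the data as anomalous even when the series contains no genuine outliers, so flagged points are best treated as candidates for inspection.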

Key Results

The model trained on the _v2 data performed worse on the 2019 evaluation (MAE: 1567.426, MSE: 5475637.342, R²: 0.753), so only the results of the original pipeline are reported below.

Evaluation

MAE: 1403.981
MSE: 4865901.909
R²: 0.781
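For reference, the three metrics above are defined as in the following sketch, which uses plain NumPy and a toy example. The function name `regression_metrics` is hypothetical; scikit-learn's `mean_absolute_error`, `mean_squared_error`, and `r2_score` compute the same quantities.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE and R² for two equally long sequences of numbers."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_true - y_pred
    mae = np.mean(np.abs(errors))
    mse = np.mean(errors ** 2)
    # R² compares the model's squared error with that of always predicting the mean.
    ss_res = np.sum(errors ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "mse": mse, "r2": r2}


# Toy values, unrelated to the project's data.
metrics = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 11.0])
print(metrics)
```

MSE is expressed in squared units, so its square root is easier to compare with the MAE: the reported MSE of 4865901.909 corresponds to a root mean squared error of about 2206.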

Visualization

Conclusion

This project demonstrates how hourly energy consumption can be forecast with machine learning techniques. It illustrates a simple form of feature engineering and shows that a gradient boosting model built on time-related features reaches an R² of 0.781 on the 2019 data.

magellanic-clouds17/time_series_anomaly_detection_forecasting
