Here comes the Sun, literally, with Machine Learning

PieroPaialungaAI/SunML

Convolutional Neural Networks for binary classification of Solar Flares

Hello! This is my project about Convolutional Neural Networks for the binary classification of Solar Flares. If you are interested in the general project and want to know how technology can help astrophysics without any domain knowledge, please go ahead and read chapter 1.
If you are not interested in the theoretical background of this project, but want to know more about Python, AI and Machine Learning techniques, please skip the first part and move to chapter 2 to discover how to navigate the repository and reproduce the computational experiment.
If you are interested in both, you are just perfect. :) Get a sense of the idea behind the project in chapter 1, and dig into the computational effort and challenges in chapter 2. Welcome aboard, skipper.

1. About the project

The Sun is the most important star for our planet, as it makes life as we know it possible. One of the most important phenomena of the Sun's magnetic activity is the Solar Flare. Solar flares are sudden flashes, often seen near sunspots and often accompanied by enormous ejections of energy (coronal mass ejections). Coronal mass ejections can have serious effects on our lives, such as disrupting radio communication or damaging satellites (or harming people, if we are unlucky, during space missions). Detecting and analyzing solar flares is thus an important task and a necessary effort.

Sophisticated techniques have been developed to detect, predict and analyze Solar Flares. Nonetheless, the challenge of this GitHub repository is to classify whether or not there is a solar flare on the Sun right now without any domain knowledge, which can cost a lot of money and require years of study.

1.1 The Challenge

The challenge of this GitHub repository is to perform a binary classification directly on Solar Magnetogram images, without any additional information about the Sun's magnetic behavior.

1.2 The tool

The tool used in this process is one of the most powerful algorithms for image classification: the Convolutional Neural Network.

This algorithm analyzes each pixel of the image and builds a network of parameters that converges to a single number: the probability that the image belongs to one class. In this particular project, two convolutional neural networks have been applied. The first one takes the Sun images and detects whether or not there is an active region on the Sun. The second one takes the specific active-region portion of the Sun image and detects whether or not there is a Solar Flare in that region. The first takes a black-and-white image (a two-dimensional matrix) as input. The second takes an RGB image (a three-dimensional matrix/tensor) as input. Otherwise the two networks are very similar, and so is their accuracy:

  • 94.7% for the first one
  • 95.8% for the second one
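
To make the "network of parameters that converges to a single number" idea concrete, here is a toy forward pass in plain Python. It is only an illustrative sketch, not the Keras models from the notebooks: the 4x4 image, the 2x2 kernel, and the `weight` and `bias` values are all made up, and a real CNN learns many kernels during training instead of using fixed ones.

```python
import math


def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Set negative activations to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def global_average(feature_map):
    """Collapse the whole feature map into one number."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)


def sigmoid(x):
    """Squash any real number into the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_probability(image, kernel, weight, bias):
    """Convolution -> ReLU -> pooling -> sigmoid: one probability per image."""
    pooled = global_average(relu(conv2d_valid(image, kernel)))
    return sigmoid(weight * pooled + bias)


# Hypothetical grayscale "Sun" with a bright blob, and an empty one.
toy_image = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
blank_image = [[0.0] * 4 for _ in range(4)]
toy_kernel = [[1.0, 1.0], [1.0, 1.0]]  # made-up "bright patch" detector

p_active = predict_probability(toy_image, toy_kernel, weight=2.0, bias=-2.0)
p_blank = predict_probability(blank_image, toy_kernel, weight=2.0, bias=-2.0)
print(p_active, p_blank)
```

With these made-up parameters the image with the bright blob gets a probability above 0.5 and the empty image gets one below 0.5, which is all a binary classifier needs in order to pick a class.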

2. About the code

I'm not going to lie on this one. Data extraction was the most intense job. The development of the CNNs was pretty easy, as there are well-known structures that do an excellent job in terms of image classification. Although there is plenty of physical information about solar flares around, it was pretty hard to find raw images of them. An important part of this project is therefore devoted to Web Scraping.

2.1 Software

The language is exclusively Python. The Keras libraries were used to implement the CNNs, Pandas for data handling and visualization, and matplotlib for image data visualization. PIL was used to process images. Selenium was used to drive Chrome from Python.

2.2 "Codes"

The Codes repository contains horse images. :) Just kidding, it has three Jupyter notebooks. The first one is "activeornot.ipynb". It covers both the web scraping techniques used to get the active/not active Sun images from this site and the Machine Learning CNN part. In this notebook, the first CNN has been developed.

As this is, of course, a kind of Supervised Learning, the labels have been obtained by referring to a big page about the Sun, the wonderful SpaceWeatherLive. The second notebook is SFAR.ipynb and collects the scraping techniques used to obtain images and their labels from an online page, LMSAL. (P.S. please be REALLY careful to have uploaded the work.csv file, see the CSVs part, or it won't run.) The third notebook is the implementation of the second CNN on the images collected by SFAR.ipynb, and it is called SolarFlaresARclassification.ipynb.
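
If you just want a feel for the scraping step before opening the notebooks, here is a minimal sketch of the core idea: pull every image link out of a page. Careful, this is not what the notebooks do: they drive Chrome with Selenium, while this sketch swaps in Python's built-in `html.parser` so it runs without a browser. The HTML snippet, the `example.org` base URL and the file names are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageLinkParser(HTMLParser):
    """Collect the absolute URL of every <img> tag that has a src attribute."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:  # skip <img> tags without a source
            self.image_urls.append(urljoin(self.base_url, src))


# Hypothetical page content; a real page would be downloaded first.
SAMPLE_HTML = """
<html><body>
  <img src="/images/2018-01-01_magnetogram.jpg" alt="magnetogram">
  <img alt="no source">
  <img src="images/2018-01-02_magnetogram.jpg">
</body></html>
"""

parser = ImageLinkParser("https://example.org/archive/")
parser.feed(SAMPLE_HTML)
image_urls = parser.image_urls

for url in image_urls:
    print(url)
```

A static parser like this only sees what is in the raw HTML. Pages that build their content with JavaScript need a real browser, which is one reason to reach for Selenium.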

2.3 CSVs

SolarFlareDetailedData.csv is the CSV with all the information collected from the site used in the first notebook. You will also find SolarFlareDetailedData2018.csv and SolarFlareDetailedData2013.csv, which are essentially the same thing but for a different year. This will essentially give you an idea of all the solar flares that occurred in that year. The cross-check of the available data is stored in what is called AvailableData.csv, and again you have the variants that refer to different years. The last CSV holds the labels and the image links required for the second convolutional network, and it is called arsolarflaredata.csv.
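
Here is one plausible way to read the "cross-check" above: mark each available image with 1 if a flare was recorded on its date and 0 otherwise. This is a sketch under assumptions, not the notebook code. The column names (`date`, `flare_class`, `image_file`) and the sample rows are hypothetical, so check the real CSV headers before reusing it.

```python
import csv
import io

# Hypothetical stand-ins for the flare list and the available-image list.
FLARES_CSV = """date,flare_class
2018-01-02,C1.2
2018-01-04,M3.0
"""

AVAILABLE_CSV = """date,image_file
2018-01-01,img_001.jpg
2018-01-02,img_002.jpg
2018-01-03,img_003.jpg
"""


def label_images(flares_file, available_file):
    """Label each available image: 1 if a flare happened that day, else 0."""
    flare_dates = {row["date"] for row in csv.DictReader(flares_file)}
    labeled = []
    for row in csv.DictReader(available_file):
        labeled.append(
            {
                "image_file": row["image_file"],
                "date": row["date"],
                "label": 1 if row["date"] in flare_dates else 0,
            }
        )
    return labeled


labeled_images = label_images(io.StringIO(FLARES_CSV), io.StringIO(AVAILABLE_CSV))

for item in labeled_images:
    print(item["image_file"], item["label"])
```

Flares without a matching image (2018-01-04 in the sample) simply drop out, since there is nothing to feed to the network for that day. To use real files, pass `open(path, newline="")` objects instead of the `io.StringIO` ones.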

2.4 Images

The images used to train both the first and the second CNN are collected here. As there were a lot of them, I've uploaded just 5, but if you follow the scraping procedure in the notebooks you can collect as many images as you want. (I used 10^2 images for both classifications because of computational constraints.)

  • ActNoActexample contains the images for the Active/Not active classification
  • SolarFlaresARexample contains the images for the Solar Flare/Active region classification

About you

What do you guys think about it? How can this approach be expanded further? What are your thoughts? Please let me know or hit me up at: [email protected]

May the force be with you.
