Skip to content

RobRomijnders/segm

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Simple Semantic Segmentation

This repository implements the minimal code to do semantic segmentation.

In semantic segmentation, each pixel of an input image must be assigned to an output class. For example, in segmenting tumors in CT-scans, each pixel must be assigned to class "healthy tissue" and "pathological tissue". Other examples include street scene segmentation, where each pixel belings to "background", "pedestrian", "car, "building" and so on.

Most applications of semantic segmentation work within a larger pipeline. Therefore, semantic segmentation might seem a complicated algorithm. However, in this repository we use about 300-400 lines of code to show the core of the segmentation.

Model

Each pixel must be assigned a label. Therefore, the output size is of same order as the input size. Say the input is Height x Width x Channels, then the output is Height x Width x Classes. Usually, other problems in machine learning require only a small output. For example, classification requires only num_classes output and regression requires a single number. To deal with these large outputs, two solutions are

  • Divide the problem into Height*Width small problems. Train some algorithm to classify a patch into a single label and run this algorithm over the entire input.
  • Use a neural network to connect each pixel in the input with each pixel in the output. This repository implements this approach.

Data

We pick a small dataset in order to show the concept. In fact, we create our own dataset. Most public datasets involve images of thousands of pixels. They require much more computation power than the average person has on his/her laptop.

Our data combines the digits from MNIST with the background of CIFAR10. This resembles the usual foreground-background segmentation task. Below are some examples from the data.

The datagenerator randomly overlays a CIFAR image and an MNIST digit. The images and the offsets are chosen from uniform distributions.

Convolutional neural network

As we deal with images, we model with a convolutional neural network (CNN). A CNN maps with filters between layers of neurons. This is a natural choice for images. Most concepts in the natural consist of hierarchical patterns. Moreover, with CNN's we can scale to arbitrary height and widths. You can read more about that in this paper.

Code

The code consist of three scripts

  • model.py describes the model using Tensorflow library
  • load_data.py describes the generation and sampling of the data
  • main.py describes the main code: instantiating the model and data generator and trains the model.

Further reading

Note that you have to download the data yourself:

Images

Samples of the input data input data segmentation results

About

Simple Semantic Segmentation

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages