Kayal314/ImageColorizer

Deep Learning Model that Colors Grayscale Images

Paper: "Deep Koalarization: Image Colorization using CNNs and Inception-ResNet-v2" by Federico Baldassarre, Diego González Morín, and Lucas Rodés-Guirao, arXiv:1712.03400 [cs.CV]

Some Predicted Results

Test Images

[Images: Test Image 1, Test Image 2, Test Image 3]

Generated Images

[Images: Predicted Image 1, Predicted Image 2, Predicted Image 3]

Network Architecture

[Figure: Network Architecture]

Encoder Network Architecture

| Layer | Filters | Kernel Size | Strides | Padding | Activation |
|---|---|---|---|---|---|
| Conv2D_E1 | 64 | (3 × 3) | (2 × 2) | same | ReLU |
| Conv2D_E2 | 128 | (3 × 3) | (1 × 1) | same | ReLU |
| Conv2D_E3 | 128 | (3 × 3) | (2 × 2) | same | ReLU |
| Conv2D_E4 | 256 | (3 × 3) | (1 × 1) | same | ReLU |
| Conv2D_E5 | 256 | (3 × 3) | (2 × 2) | same | ReLU |
| Conv2D_E6 | 512 | (3 × 3) | (1 × 1) | same | ReLU |
| Conv2D_E7 | 512 | (3 × 3) | (1 × 1) | same | ReLU |
| Conv2D_E8 | 256 | (3 × 3) | (1 × 1) | same | ReLU |
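
The spatial sizes implied by the encoder table can be traced with a few lines of Python. This is a minimal sketch, not code from the repository. The 224 × 224 × 1 input size is an assumption: the table does not state it, but it is the size for which three stride-2 layers yield the 28 × 28 map used in the fusion step. `ENCODER_LAYERS` and `trace_encoder_shapes` are hypothetical names introduced for illustration.

```python
import math

# (layer name, filters, stride) copied from the encoder table.
ENCODER_LAYERS = [
    ("Conv2D_E1", 64, 2),
    ("Conv2D_E2", 128, 1),
    ("Conv2D_E3", 128, 2),
    ("Conv2D_E4", 256, 1),
    ("Conv2D_E5", 256, 2),
    ("Conv2D_E6", 512, 1),
    ("Conv2D_E7", 512, 1),
    ("Conv2D_E8", 256, 1),
]


def trace_encoder_shapes(height=224, width=224):
    """Return [(layer_name, (H, W, C)), ...] for the encoder.

    Assumption: the 224 x 224 default input size is not given in the table.
    """
    shapes = []
    for name, filters, stride in ENCODER_LAYERS:
        # With "same" padding, the output size is ceil(input / stride).
        height = math.ceil(height / stride)
        width = math.ceil(width / stride)
        shapes.append((name, (height, width, filters)))
    return shapes


encoder_shapes = trace_encoder_shapes()
for name, shape in encoder_shapes:
    print(f"{name}: {shape}")
```

Only the three layers with (2 × 2) strides change the spatial size; the remaining layers change only the channel depth.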

Fusion Network Architecture

| Layer | Filters | Kernel Size | Strides | Padding | Activation |
|---|---|---|---|---|---|
| Conv2D_F1 | 256 | (1 × 1) | (1 × 1) | same | ReLU |

Decoder Network Architecture

| Layer | Filters | Kernel Size | Strides | Padding | Activation |
|---|---|---|---|---|---|
| Conv2D_D1 | 128 | (3 × 3) | (1 × 1) | same | ReLU |
| UpSamp2D_D1 | - | - | - | - | - |
| Conv2D_D2 | 64 | (3 × 3) | (1 × 1) | same | ReLU |
| Conv2D_D3 | 64 | (3 × 3) | (1 × 1) | same | ReLU |
| UpSamp2D_D2 | - | - | - | - | - |
| Conv2D_D4 | 32 | (3 × 3) | (1 × 1) | same | ReLU |
| Conv2D_D5 | 2 | (3 × 3) | (1 × 1) | same | tanh |
| UpSamp2D_D3 | - | - | - | - | - |
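
The decoder's effect on tensor shape can be sketched the same way. This is an illustration under stated assumptions, not the repository's code. The upsampling factor of 2 is an assumption, since the table lists no parameters for the upsampling rows, and the 28 × 28 × 256 input is taken to be the output of Conv2D_F1. `DECODER_LAYERS` and `trace_decoder_shapes` are hypothetical names.

```python
# One entry per row of the decoder table, in order:
# ("conv", filters) for a convolution, ("up", None) for an upsampling row.
DECODER_LAYERS = [
    ("conv", 128),
    ("up", None),
    ("conv", 64),
    ("conv", 64),
    ("up", None),
    ("conv", 32),
    ("conv", 2),
    ("up", None),
]


def trace_decoder_shapes(height=28, width=28, channels=256, up_factor=2):
    """Return [(kind, (H, W, C)), ...] for the decoder.

    Assumptions: up_factor=2 and a 28 x 28 x 256 input (Conv2D_F1 output).
    """
    shapes = []
    for kind, filters in DECODER_LAYERS:
        if kind == "conv":
            # Stride 1 with "same" padding keeps H and W; only depth changes.
            channels = filters
        else:
            # Upsampling scales H and W and leaves the depth unchanged.
            height *= up_factor
            width *= up_factor
        shapes.append((kind, (height, width, channels)))
    return shapes


decoder_shapes = trace_decoder_shapes()
for row, (kind, shape) in enumerate(decoder_shapes, start=1):
    print(f"row {row} ({kind}): {shape}")
```

Under these assumptions the three upsampling steps undo the three stride-2 convolutions of the encoder, so the 2-channel output has the same spatial size as a 224 × 224 input.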

Fusion Layer Architecture

[Figure: Fusion Layer]

High-Level Feature Extraction through Inception-ResNet-v2

The Inception-ResNet-v2 model extracts high-level features from the input grayscale image. Its last layer before the softmax activation outputs a feature vector of size 1000, i.e. of dimension (1000 × 1 × 1). This vector is repeated 28 × 28 times, once per spatial position, to form a volume of size (28 × 28 × 1000). This volume is concatenated depth-wise with the output of the Conv2D_E8 layer, which has size (28 × 28 × 256). The resulting block of size (28 × 28 × 1256) is then passed through Conv2D_F1.
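
The repeat-and-concatenate step can be sketched in NumPy. This is a minimal illustration, not the repository's implementation: the activations and weights are random stand-ins, the concatenation order (encoder channels first) is an assumption, and all variable names are hypothetical. The 1 × 1 convolution of Conv2D_F1 is written as a matrix product applied at every pixel, which is equivalent for a (1 × 1) kernel with stride 1.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random stand-ins for real activations (illustration only).
encoder_out = rng.standard_normal((28, 28, 256)).astype(np.float32)  # Conv2D_E8 output
feature_vec = rng.standard_normal(1000).astype(np.float32)           # Inception-ResNet-v2 output

# Repeat the 1000-dimensional vector at each of the 28 x 28 positions.
tiled = np.broadcast_to(feature_vec, (28, 28, 1000))

# Depth-wise concatenation: 256 + 1000 = 1256 channels.
# Assumption: encoder channels come first; the source does not fix the order.
fused_in = np.concatenate([encoder_out, tiled], axis=-1)

# Conv2D_F1: a 1 x 1 convolution with 256 filters is the same dense layer
# applied independently at every pixel, followed by ReLU.
weights = (0.01 * rng.standard_normal((1256, 256))).astype(np.float32)
bias = np.zeros(256, dtype=np.float32)
fused_out = np.maximum(fused_in @ weights + bias, 0.0)

print(fused_in.shape, fused_out.shape)
print("Conv2D_F1 parameters:", weights.size + bias.size)
```

Because the same feature vector is attached to every spatial position, each pixel of the encoder output is combined with the same global description of the image before Conv2D_F1 reduces the depth back to 256.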