Image Classification using Transfer Learning: InceptionV3 and VGG Net

swapnasamirshukla/BT5153-Term-Project

Neural Networks for Fashion Image Classification and Visual Search

This project was submitted for the BT5153 Applied Machine Learning term paper. Dataset used: https://www.kaggle.com/paramaggarwal/fashion-product-images-small

This package includes two solutions:

  • Image Classification using Transfer Learning: VGG19 and Inception V3
  • Image Search using a custom CNN-based AutoEncoder and ResNet50

Image Classification

  • Three schemes are used to classify products for e-commerce: by gender + masterCategory, by subCategory, and by articleType
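
As an illustration of the three schemes, the sketch below derives one label per scheme from a single metadata row. It assumes each row exposes `gender`, `masterCategory`, `subCategory`, and `articleType` fields as named above. The function name, the underscore-joined format of the combined label, and the sample row are hypothetical choices, not taken from the project code.

```python
def scheme_labels(row):
    """Return the target label for each of the three classification schemes.

    `row` is a mapping with the metadata fields of one product.
    The "<gender>_<masterCategory>" format is an assumption for illustration.
    """
    return {
        "gender_masterCategory": f"{row['gender']}_{row['masterCategory']}",
        "subCategory": row["subCategory"],
        "articleType": row["articleType"],
    }


# Illustrative row; the values are made up for the example.
example_row = {
    "gender": "Men",
    "masterCategory": "Apparel",
    "subCategory": "Topwear",
    "articleType": "Shirts",
}
labels = scheme_labels(example_row)
```

Each scheme then becomes its own multi-class problem, with one classifier trained per label column.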

VGG19 Architecture

  • MODEL ARCHITECTURE: VGG-19 has 19 weight layers (16 convolutional and 3 fully connected), with a uniform convolution kernel size of 3×3

  • LAYERS TUNED: Only the last 5 layers are re-trained, together with 3 newly added dense layers that predict the correct number of classes for the current study

  • HYPER-PARAMETERS TUNED: Adam optimizer with a learning rate of 1e-5 and decay of 1e-6; the newly added layer has 512 hidden units with a dropout rate of 0.5

  • REGULARIZATION: Early stopping, learning-rate reduction on plateau, dropout, and data augmentation (during training ONLY) are applied; weight sharing is implicit in the convolutional layers
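
The VGG19 setup described above could be sketched in Keras roughly as follows. This is a minimal sketch under stated assumptions, not the project's actual code: the function name, the input size, the pooling layer before the head, the width of the second dense layer, and the one-hot loss are all assumptions. The decay of 1e-6 is left out because the way to pass it differs between Keras versions.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_vgg19_classifier(num_classes, weights="imagenet",
                           input_shape=(224, 224, 3)):
    """Fine-tune the last 5 VGG19 layers under a new 3-dense-layer head."""
    base = keras.applications.VGG19(
        include_top=False, weights=weights, input_shape=input_shape)

    # Freeze everything except the last 5 layers of the convolutional base.
    for layer in base.layers[:-5]:
        layer.trainable = False

    # New head: three dense layers (assumed layout), 512 units, dropout 0.5.
    x = layers.GlobalAveragePooling2D()(base.output)  # assumption
    x = layers.Dense(512, activation="relu")(x)
    x = layers.Dropout(0.5)(x)
    x = layers.Dense(512, activation="relu")(x)       # assumed width
    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)

    model = keras.Model(base.input, outputs)
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-5),
        loss="categorical_crossentropy",  # assumes one-hot encoded labels
        metrics=["accuracy"],
    )
    return model
```

Passing `weights=None` builds the same graph without downloading the ImageNet weights, which is handy for checking shapes.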

InceptionV3 Architecture

  • MODEL ARCHITECTURE: Inception V3 is 48 layers deep and stacks 11 inception modules, each consisting of pooling layers and convolutional filters

  • LAYERS TUNED: All layers are re-trained, together with 3 newly added dense layers that predict the correct number of classes for the current study

  • HYPER-PARAMETERS TUNED: RMSprop optimizer with a learning rate of 1e-4 and decay of 1e-6; the newly added layer has 128 hidden units with a dropout rate of 0.4

  • REGULARIZATION: Early stopping, learning-rate reduction on plateau, dropout, and data augmentation (during training ONLY) are applied; weight sharing is implicit in the convolutional layers
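
A comparable sketch for the Inception V3 variant, with every layer left trainable, plus two of the regularization callbacks named above. The function names, the input size, the head layout beyond the stated 128 units and 0.4 dropout, the monitored quantity, and the patience and factor values are assumptions for illustration. The decay of 1e-6 is again omitted.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_inception_classifier(num_classes, weights="imagenet",
                               input_shape=(299, 299, 3)):
    """Re-train all InceptionV3 layers under a new 3-dense-layer head."""
    base = keras.applications.InceptionV3(
        include_top=False, weights=weights, input_shape=input_shape)
    base.trainable = True  # all layers are re-trained

    x = layers.GlobalAveragePooling2D()(base.output)  # assumption
    x = layers.Dense(128, activation="relu")(x)
    x = layers.Dropout(0.4)(x)
    x = layers.Dense(128, activation="relu")(x)       # assumed width
    x = layers.Dropout(0.4)(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)

    model = keras.Model(base.input, outputs)
    model.compile(
        optimizer=keras.optimizers.RMSprop(learning_rate=1e-4),
        loss="categorical_crossentropy",  # assumes one-hot encoded labels
        metrics=["accuracy"],
    )
    return model


def training_callbacks():
    """Early stopping and LR reduction on plateau (values are assumptions)."""
    return [
        keras.callbacks.EarlyStopping(
            monitor="val_loss", patience=5, restore_best_weights=True),
        keras.callbacks.ReduceLROnPlateau(
            monitor="val_loss", factor=0.1, patience=2),
    ]
```

The callbacks would be passed to `model.fit(..., callbacks=training_callbacks())` together with a validation set, since both monitor `val_loss`.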

Evaluation Metrics
  Accuracy, Precision, and Recall
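
To make the three metrics concrete, here is a small dependency-free sketch that computes overall accuracy and per-class precision and recall from true and predicted labels. The function name and the sample labels are hypothetical. Whether the project reports per-class or averaged values is not stated, so the sketch returns per-class figures only.

```python
from collections import Counter


def evaluate_predictions(y_true, y_pred):
    """Return overall accuracy plus per-class precision and recall."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted this class, but it was wrong
            fn[truth] += 1  # missed an item of the true class

    per_class = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        per_class[label] = {
            "precision": tp[label] / predicted if predicted else 0.0,
            "recall": tp[label] / actual if actual else 0.0,
        }
    return {"accuracy": sum(tp.values()) / len(y_true), "per_class": per_class}


# Illustrative labels only.
y_true = ["Shirts", "Shirts", "Jeans", "Jeans", "Watches"]
y_pred = ["Shirts", "Jeans", "Jeans", "Jeans", "Watches"]
report = evaluate_predictions(y_true, y_pred)
```

In this example one shirt is misclassified as jeans, which lowers recall for `Shirts` and precision for `Jeans` while leaving `Watches` unaffected.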

Image Search

AutoEncoder using CNN Architecture

  • MODEL ARCHITECTURE: A CNN architecture is used to build the AutoEncoder for image search. It comprises an Encoder and a Decoder. The embedding features are generated by the Encoder.

  • LAYERS DESIGN: The Encoder consists of the first 7 layers; the Decoder consists of the last 7 layers

  • HYPER-PARAMETERS TUNED: Adam optimizer with a learning rate of 1e-5 and decay of 1e-6

  • TRAINING: Autoencoders are self-supervised models: the input and the target output are both the image itself
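
One way the 7 + 7 layer layout could look in Keras is sketched below. Only the layer counts, the optimizer, and the learning rate come from the description above. The function name, input size, filter counts, layer types, and the MSE reconstruction loss are assumptions, and the decay of 1e-6 is omitted.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_autoencoder(input_shape=(64, 64, 3)):
    """Return (autoencoder, encoder); height and width must divide by 8."""
    height, width, channels = input_shape

    # Encoder: 7 layers ending in a flat embedding vector.
    encoder = keras.Sequential([
        keras.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(2),
        layers.Conv2D(16, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(2),
        layers.Conv2D(8, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(2),
        layers.Flatten(),
    ], name="encoder")

    # Decoder: 7 layers mirroring the encoder back to image size.
    decoder = keras.Sequential([
        keras.Input(shape=encoder.output_shape[1:]),
        layers.Reshape((height // 8, width // 8, 8)),
        layers.UpSampling2D(2),
        layers.Conv2D(16, 3, activation="relu", padding="same"),
        layers.UpSampling2D(2),
        layers.Conv2D(32, 3, activation="relu", padding="same"),
        layers.UpSampling2D(2),
        layers.Conv2D(channels, 3, activation="sigmoid", padding="same"),
    ], name="decoder")

    inputs = keras.Input(shape=input_shape)
    autoencoder = keras.Model(inputs, decoder(encoder(inputs)))
    autoencoder.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-5),
        loss="mse",  # assumed reconstruction loss
    )
    return autoencoder, encoder
```

Training would pass the same array as input and target, for example `autoencoder.fit(images, images, ...)` with pixel values scaled to [0, 1], after which `encoder.predict(images)` yields the embedding features used for search.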

Authors

The creators of these workbooks and their LinkedIn profiles are listed below.

Name LinkedIn
Li Fengzi https://www.linkedin.com/in/fengzi-li-14b65946/
Shashi Kant https://www.linkedin.com/in/shashi-kant-bksc/
Shunichi Araki https://www.linkedin.com/in/shunichi-araki-76a68398/
Sumer Bangera https://www.linkedin.com/in/sumerbangera/
Swapna Samir Shukla https://www.linkedin.com/in/swapna-samir-shukla-079491a0/

Todos

  • Fine Tune Inception V3
  • Improve Image Search

License

NUS

All authors are graduate students of the MSBA program, Class of 2020, at NUS Business School. BT5153 Applied Machine Learning Term Paper, MSBA @ NUS, 2020. Copyright 2020 by the author(s).
