Introduction

Have you ever gone through your vacation photos and asked yourself: What is the name of this temple I visited in China? Who created this monument I saw in France? Landmark recognition can help! This technology predicts landmark labels directly from image pixels, helping people better understand and organize their photo collections.

Problem and Approach

The dataset we worked on is derived from the Google Landmark Recognition Challenge hosted on Kaggle. The challenge was to build models that assign the correct landmark label to each image.

The landmark recognition training data originally contained over 1.2 million images spanning around 15,000 (15K) landmark classes, and every image has to be assigned to one of them. Put simply, this requires a lot of computing power, along with a lot of time and patience. For that reason, we worked on an Nvidia DGX GPU system (a supercomputer).

Another problem we faced was that the dataset provides image URLs rather than image files, so we first wrote a Python script that downloads the images from those URLs and places them into folders for their respective classes.
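A minimal sketch of such a download step is shown here. It assumes a CSV with `id`, `url`, and `landmark_id` columns; the function names `fetch_url` and `download_images`, the `fetch` parameter, and the `<out_root>/<landmark_id>/<id>.jpg` layout are illustrative assumptions, not the repository's actual script.

```python
import csv
import os
import urllib.request


def fetch_url(url, timeout=10):
    """Return the raw bytes behind `url` (requires network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def download_images(csv_file, out_root, fetch=fetch_url):
    """Save each image to <out_root>/<landmark_id>/<id>.jpg.

    `csv_file` is an open text file with `id`, `url`, and `landmark_id`
    columns (an assumed layout). Rows whose download fails are skipped.
    Returns a (saved, failed) pair of counts.
    """
    saved, failed = 0, 0
    for row in csv.DictReader(csv_file):
        try:
            data = fetch(row["url"])
        except Exception:  # some URLs may no longer resolve
            failed += 1
            continue
        # One folder per class, created on first use.
        class_dir = os.path.join(out_root, row["landmark_id"])
        os.makedirs(class_dir, exist_ok=True)
        with open(os.path.join(class_dir, row["id"] + ".jpg"), "wb") as out:
            out.write(data)
        saved += 1
    return saved, failed
```

Passing the downloader in as `fetch` keeps the folder logic testable without network access; with a file on disk, a call might look like `download_images(open("train.csv", newline=""), "images")`, where both names are placeholders.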

We then split these folders into three parts, train (80%), test (10%), and validation (10%), using the Python script folder_splitting.py.
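The 80/10/10 split can be sketched as follows. This is not the contents of folder_splitting.py; the function name `split_80_10_10` and the fixed-seed shuffle are assumptions made for illustration.

```python
import random


def split_80_10_10(items, seed=0):
    """Shuffle `items` and return (train, test, validation) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is repeatable
    n_train = len(shuffled) * 8 // 10
    n_test = len(shuffled) // 10
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # whatever remains
    return train, test, validation
```

Applying the function to the file names of one class folder at a time keeps roughly the same ratio within every class.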

[Figure: top 30 classes in the training data]

After manually inspecting the data, we observed that many test images contain no landmark at all, and some contain more than one.

ResNet 50

Our first modeling approach was ResNet50, a convolutional neural network pre-trained on the ImageNet database.

We started with this model for two reasons. First, ResNet50 is a residual learner: its blocks learn residuals relative to their input rather than learning the full feature mapping directly. Second, this design addresses the degradation problem that many neural nets face, in which accuracy saturates and then drops as network depth increases.

In simple terms, each block in ResNet50 learns a residual function F(x) = H(x) - x, the difference between the desired output H(x) and the block's input x, and a skip connection adds x back to produce F(x) + x. ResNet50 is a 50-layer residual network.
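The residual idea can be sketched in a few lines of plain Python. This is a toy illustration on lists of numbers, not ResNet50 itself; `residual_block`, `layers`, and `zero_layers` are hypothetical names, and H(x) denotes the mapping the block is ultimately meant to compute.

```python
def residual_block(x, layers):
    """Return layers(x) + x, element-wise.

    `layers` stands in for the stacked convolutions of a block. It only has
    to learn the residual F(x) = H(x) - x; the skip connection adds x back.
    """
    fx = layers(x)
    return [f + xi for f, xi in zip(fx, x)]


def zero_layers(x):
    """Stand-in layers that have learned nothing: the residual is all zeros."""
    return [0.0 for _ in x]


# With a zero residual the block reduces to the identity mapping.
identity_out = residual_block([1.0, 2.0, 3.0], zero_layers)
```

Because a block can fall back to the identity this way, stacking more blocks does not force the network to relearn what earlier layers already represent.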

VGG-16

VGG16 is perhaps one of the most popular convolutional networks used for transfer learning with Keras. It is a 16-layer ConvNet developed by the Visual Geometry Group (VGG) at Oxford University and entered in the 2014 ILSVRC (ImageNet) competition.
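A minimal sketch of transfer learning with VGG16 in Keras might look like the following. It is not the repository's VGG16.py: the function name `build_vgg16_classifier`, the pooling-plus-softmax head, the input size, and the optimizer are all assumptions chosen for illustration.

```python
def build_vgg16_classifier(num_classes, input_shape=(224, 224, 3), weights="imagenet"):
    """Return a compiled Keras model: frozen VGG16 base plus a new softmax head."""
    # Imported lazily so the sketch can be loaded without TensorFlow installed.
    from tensorflow.keras import layers, models
    from tensorflow.keras.applications import VGG16

    # Convolutional base only; the original 1000-class ImageNet head is dropped.
    base = VGG16(weights=weights, include_top=False, input_shape=input_shape)
    base.trainable = False  # keep the pre-trained filters fixed

    # New classification head for the landmark classes (assumed design).
    x = layers.GlobalAveragePooling2D()(base.output)
    outputs = layers.Dense(num_classes, activation="softmax")(x)

    model = models.Model(inputs=base.input, outputs=outputs)
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

For a 30-class subset, `build_vgg16_classifier(num_classes=30)` would give a model whose only trainable weights are in the new head; the ImageNet weights are downloaded the first time the function is called.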

adityasurana/Google-Landmark-Recognition-Challenge
