Digit-Predictor

build_model.py

This program builds and trains a CNN model for digit recognition using the popular MNIST dataset, showcasing key concepts such as data preprocessing, model architecture, training strategies, and model evaluation.

Dataset: https://www.kaggle.com/competitions/digit-recognizer/overview

  • Data Loading: The MNIST dataset, containing grayscale images of handwritten digits (0 to 9), is loaded into the program.
  • Data Preprocessing: The pixel values of the images are normalized to the range [0,1] for better convergence during training. The images are reshaped to fit the input shape required by the CNN model.
  • Label Encoding: The categorical labels (digits) are converted into one-hot encoded vectors to prepare them for classification.
  • Model Definition: A sequential CNN model is defined using the Keras library. The model consists of several convolutional layers followed by batch normalization, max-pooling layers, dropout layers to prevent overfitting, and densely connected layers.
  • Model Compilation: The model is compiled with the Adam optimizer, categorical cross-entropy loss function, and accuracy metric to measure its performance.
  • Data Augmentation: ImageDataGenerator is used to perform data augmentation on the training images. This includes rotating, zooming, shifting, and flipping the images to increase the diversity of the training dataset and improve the model's generalization ability.
  • Learning Rate Reduction: A learning rate reduction strategy is implemented using the ReduceLROnPlateau callback, which dynamically adjusts the learning rate during training based on the validation accuracy.
  • Early Stopping: EarlyStopping callback is utilized to stop training if the validation loss does not decrease for a certain number of epochs, preventing overfitting and saving computation time.
  • Model Training: The model is trained using the augmented data generated by ImageDataGenerator. The training process involves iterating over multiple epochs and updating the model parameters to minimize the loss function.
  • Model Saving: Once training is complete, the trained model is saved in the HDF5 file format for future use without the need to retrain.
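Taken together, the steps above can be sketched roughly as follows. This is a minimal illustration, not the script itself: the layer sizes, augmentation ranges, callback thresholds, and the digit_model.h5 filename are assumptions, and the small data subset and single epoch are only there to keep the demo fast.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Load MNIST; reshape to (28, 28, 1) and normalize pixels to [0, 1]
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 28, 28, 1).astype("float32") / 255.0
x_test = x_test.reshape(-1, 28, 28, 1).astype("float32") / 255.0

# One-hot encode the digit labels for categorical cross-entropy
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)

# Sequential CNN: convolutions + batch norm + pooling + dropout + dense head
model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, 3, activation="relu"),
    layers.BatchNormalization(),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.BatchNormalization(),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dropout(0.4),
    layers.Dense(128, activation="relu"),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])

# Augment training images with small rotations, zooms, and shifts
datagen = ImageDataGenerator(rotation_range=10, zoom_range=0.1,
                             width_shift_range=0.1, height_shift_range=0.1)

callbacks = [
    # Reduce the learning rate when validation accuracy plateaus
    keras.callbacks.ReduceLROnPlateau(monitor="val_accuracy",
                                      factor=0.5, patience=2),
    # Stop early once validation loss stops improving
    keras.callbacks.EarlyStopping(monitor="val_loss", patience=3,
                                  restore_best_weights=True),
]

# Tiny subset and a single epoch so this demo finishes quickly;
# the real script would train on the full set for many epochs
model.fit(datagen.flow(x_train[:2000], y_train[:2000], batch_size=64),
          validation_data=(x_test[:500], y_test[:500]),
          epochs=1, callbacks=callbacks)

# Persist the trained model in HDF5 format for later use
model.save("digit_model.h5")
```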

predictor.py

This program performs digit recognition on real-world images using the pre-trained CNN model, showcasing key concepts such as model loading, image preprocessing, prediction, and optional visualization.

  • Model Loading: It loads a pre-trained CNN model for digit recognition. The model has been trained on a dataset of handwritten digits.
  • Image Preprocessing: The program defines a function preprocess_image to preprocess an input image. It converts the image to grayscale, checks the background vs font color to determine whether to invert the colors, and then extracts individual digit regions from the image using connected component analysis. Each digit region is resized to 28x28 pixels and normalized before being passed to the model for prediction.
  • Digit Prediction: Another function predict_digits is defined to predict the digits in the preprocessed image. It utilizes the pre-trained model to predict the digit in each region extracted from the image. The predicted digits are then concatenated to form the final predicted sequence.
  • Example Usage: An example usage of the predict_digits function is provided in the test file, where it takes the path of an input image containing handwritten digits, predicts the digits in the image, and prints the predicted sequence.
  • Visualization (Optional): The program includes commented-out code to display the original image with bounding boxes drawn around the detected digit regions. This visualization step can be enabled by uncommenting the relevant code.
