ayushkumarshah/README.md

Hi there 👋, I am Ayush Kumar Shah, an AI enthusiast.

  • 🔭 I’m a fifth-year Ph.D. candidate at Rochester Institute of Technology (RIT), conducting research at the Document and Pattern Recognition Lab (DPRL), under the mentorship of Dr. Richard Zanibbi.
  • 💡 My work centers on designing fast, efficient, and interpretable parsers for recognizing complex mathematical and chemical formulas. I explore graphical notations across multiple formats, including PDFs, typeset images, and handwritten strokes. Through graph attention-based techniques, I aim to enhance how contextual information is processed, while preserving a natural and interpretable graph representation.
  • 🎯 My goal is to deliver high accuracy in formula recognition through models that are not only faster but also easier to interpret than traditional encoder-decoder architectures.
  • 💻 Recently, I developed ChemScraper, a molecule diagram parser that extracts characters and graphics directly from PDF molecule images. By utilizing typesetting instructions and simple graph transformations, it generates both visual and chemical graphs without the need for OCR, GPUs, or vectorization. ChemScraper offers a practical way to create fine-grained, annotated datasets for training visual parsers, and also provides a visual parser that parses raster molecule images directly.
  • 🌐 Research interests: pattern recognition, recognition of graphical structures, computer vision, speaker understanding, large language models, multi-modal deep learning, natural language processing.
  • ✍️ I write blog posts about what I learn, mostly related to Python and AI.
  • 🌱 I’m currently learning fundamental concepts and advancements in recognition (parsing) of graphical information from documents.
  • 📃 You can view my CV here: My CV
  • Personal website: shahayush.com

Follow on Twitter | LinkedIn: ayush7

My latest Blog posts

My latest YouTube Videos

Ayush's github stats

Some additional pinned repositories

Pinned

  1. Guitar-Chords-recognition Public

    An application that predicts guitar chords by feeding mel spectrograms of guitar audio into a CNN.

    Python 119 42

  2. autocar Public

    A self-driving car that detects lanes, stop signs, and traffic lights and avoids collisions, built using Canny edge detection, the Hough transform, a Haar cascade classifier, and Arduino programming.

    Python 5 1

  3. AI-Plays-GTA5 Public

    A bike-riding agent in a virtual environment (GTA5), built using a CNN and used for simulating self-driving vehicles.

    Python 7

  4. Deep-Learning-Nanodegree-Udacity Public

    This repository contains all the projects I submitted while completing the Deep Learning Nanodegree offered by Udacity.

    Jupyter Notebook 2 1

  5. Nepali_Plagiarism_Detection Public

    An application that detects plagiarised Devanagari text files using a self-built, rule-based stemming algorithm and cosine similarity.

    Jupyter Notebook 6 3

  6. SLR-Parser Public

    An SLR parser that constructs the canonical collection of LR(0) items and the SLR parsing table, and parses a given input string.

    Python 5 2