smitesh22/README.md

Hi there 👋

I am Smitesh 👋. I have 2 years of industry experience in data engineering and data science, including collaboratively developing data warehousing solutions using SQL in Informatica PowerCenter and building data visualizations with Tableau. I am committed ✊ to building expertise in data analytics, statistical concepts, and industry-relevant engineering tools. I am highly skilled in data science and deep learning libraries such as NumPy, Matplotlib, Keras, PyTorch, and TensorFlow. Currently, I am working with Orcawise and developing a custom NLU system.

About me

  • ✒️ As long as data is involved in a problem statement, I will dive deep to solve it!
  • 🔭 I’m currently working with Orcawise (https://orcawise.com/) as a Data Science Intern and looking for full-time opportunities.
  • 🌱 I’m currently learning Azure services for data engineering and preparing for the Azure Data Engineer Associate exam.
  • 📫 How to reach me: Email / LinkedIn.

Overview of my pinned projects

1. Unsupervised-Machine-Learning-For-Solar-Site-Selection

Tech Stack: [Python, QGIS, PyTorch, NumPy, pandas, Seaborn, spaCy, LaTeX, GeoPandas, API services for data]

  • Used geospatial data and deep learning techniques to select optimal sites for solar energy projects.
  • Analysed Geographic Information System (GIS) data and developed a machine learning pipeline that preprocessed GIS data from multiple web databases, modelled the data, and staged it for input into a deep learning model.
  • Developed a multi-input autoencoder to learn representations from geospatial data and applied various clustering algorithms to those representations to group optimal solar locations.
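The clustering step described above can be illustrated with a small sketch. This is not the code from the repository: the README does not say which clustering algorithms were used, so plain k-means is assumed here, and the `latent` vectors are made-up stand-ins for the representations an autoencoder would produce. The function name `kmeans` and its parameters are hypothetical.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means over a list of equal-length float tuples (assumed algorithm)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct input points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# Hypothetical 2-D latent vectors standing in for encoder output.
latent = [(0.1, 0.2), (0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centroids = kmeans(latent, k=2)
```

With these well-separated sample vectors, the first three points end up sharing one label and the last three share the other. Real site data would be higher-dimensional, and the number of clusters would need to be chosen by inspection or a validation metric.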

2. daft.ie_dataengineering_and_analysis

Tech Stack: [Python, Apache Airflow, SQLAlchemy, Docker, Airflow Scheduler]

  • Orchestrated a data pipeline using Apache Airflow to ingest data from Daft.ie.
  • Implemented an ELT process deployed on the Airflow web server, where the ingested data was stored in an AWS S3 bucket and transformed before being loaded into a Snowflake database, serving as the data lake.
  • Created a star schema in the data transformation phase by normalising the input data, which was then used for data visualization and exploratory data analysis (EDA) in Tableau.
  • Automated the process by using the Airflow Scheduler to monitor the AWS bucket for changes, and containerized the project using Docker for easy deployment.
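The star schema step above can be sketched roughly as follows. This is an illustration only, not the project's code: the column names (`listing_id`, `county`, `area`, `price`), the table names, and the sample rows are all hypothetical, and the actual pipeline presumably runs against S3 and Snowflake rather than in-memory dictionaries.

```python
def build_star_schema(listings):
    """Split flat listing rows into one dimension table and one fact table."""
    location_ids = {}  # (county, area) -> surrogate key
    dim_location = []
    fact_listing = []
    for row in listings:
        key = (row["county"], row["area"])
        if key not in location_ids:
            # First time this location is seen: assign the next surrogate key.
            location_ids[key] = len(location_ids) + 1
            dim_location.append(
                {"location_id": location_ids[key], "county": key[0], "area": key[1]}
            )
        # The fact row keeps the measure and a foreign key to the dimension.
        fact_listing.append(
            {
                "listing_id": row["listing_id"],
                "location_id": location_ids[key],
                "price": row["price"],
            }
        )
    return dim_location, fact_listing


# Hypothetical flat rows, standing in for ingested listing data.
raw = [
    {"listing_id": 101, "county": "Galway", "area": "Salthill", "price": 1800},
    {"listing_id": 102, "county": "Dublin", "area": "Rathmines", "price": 2400},
    {"listing_id": 103, "county": "Galway", "area": "Salthill", "price": 1650},
]
dim_location, fact_listing = build_star_schema(raw)
```

Repeated location values are stored once in the dimension table, and each fact row refers to them by key, which is what lets a tool such as Tableau slice the price measure by location.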

3. Masters-Assignment-Data

Tech Stack: [Python, R, PyTorch, NumPy, pandas, Seaborn, scikit-learn, LaTeX]

  • Repository for my college work during my Master's at the University of Galway.

Pinned

  1. Unsupervised-Machine-Learning-For-Solar-Site-Selection Public

    Jupyter Notebook 1

  2. daft.ie_dataengineering_and_analysis Public

    Python 1

  3. Masters-Assignment-Data Public

    Assignment Data for Masters Modules

    Jupyter Notebook 1

  4. DjangoWebProject Public

    Python

  5. DesktopApplication Public

    C#

  6. Football-Data-Analysis Public

    Code for football data analysis required for my blog

    HTML