I am Smitesh 👋. I have 2 years of industry experience in data engineering and data science, including collaboratively developing data warehousing solutions using SQL in Informatica PowerCenter and building data visualizations in Tableau. I am committed ✊ to building expertise in data analytics, statistical concepts, and industry-relevant engineering tools. I am highly skilled in Python data science and deep learning libraries such as NumPy, Matplotlib, Keras, PyTorch, and TensorFlow. Currently, I am working with Orcawise, developing a custom NLU system.
- ✒️ As long as data is involved in a problem statement, I will dive deep to solve it!
- 🔭 I’m currently working with [Orcawise](https://orcawise.com/) as a Data Science Intern and looking for full-time opportunities.
- 🌱 I’m currently learning Azure services for data engineering and preparing for the Azure Data Engineer Associate exam.
- 📫 How to reach me: Email / LinkedIn.
Tech Stack: [Python, QGIS, PyTorch, NumPy, Pandas, Seaborn, spaCy, LaTeX, GeoPandas, API services for data]
- Utilized geospatial data to select optimal sites for solar energy projects, leveraging advanced deep learning techniques.
- Analysed Geographic Information Systems (GIS) data and developed a machine learning pipeline that preprocessed GIS data from multiple web databases, modeled the data, and staged it for input into a deep learning model.
- Developed a multi-input Auto-Encoder to learn representations from geospatial data and applied various clustering algorithms to cluster optimal solar locations.
- Orchestrated a data pipeline using Apache Airflow to ingest data from Daft.ie.
- Implemented an ELT process deployed on the Airflow web server, where the ingested data was stored in an AWS S3 bucket and transformed before being loaded into a Snowflake database, serving as the data lake.
- Created a star schema in the data transformation phase by normalising the input data, which was subsequently utilized for data visualization and exploratory data analysis (EDA) in Tableau.
- Automated the process by using the Airflow Scheduler to monitor the AWS S3 bucket for changes, and containerized the project using Docker for easy deployment.
- Repository for my college work during my Master's at the University of Galway.
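
As a rough illustration of the star-schema step mentioned in the Airflow project above, here is a minimal sketch in plain Python. It is not the project's actual code: the sample rows, the field names (`listing_id`, `county`, `property_type`, `price`), and the helper functions are all hypothetical, and the real transformation presumably ran against Snowflake rather than in-memory lists.

```python
# Hypothetical flat listing records; field names and values are illustrative only.
listings = [
    {"listing_id": 1, "county": "Galway", "property_type": "Apartment", "price": 250000},
    {"listing_id": 2, "county": "Dublin", "property_type": "House", "price": 480000},
    {"listing_id": 3, "county": "Galway", "property_type": "House", "price": 320000},
]


def build_dimension(rows, column):
    """Map each distinct value of `column` to a surrogate key (1, 2, ...)."""
    keys = {}
    for row in rows:
        # setdefault only assigns a new key the first time a value is seen.
        keys.setdefault(row[column], len(keys) + 1)
    return keys


def build_star_schema(rows):
    """Split flat rows into dimension lookups and a fact table of keys plus measures."""
    dim_county = build_dimension(rows, "county")
    dim_type = build_dimension(rows, "property_type")
    fact = [
        {
            "listing_id": row["listing_id"],
            "county_key": dim_county[row["county"]],
            "type_key": dim_type[row["property_type"]],
            "price": row["price"],  # the measure stays on the fact table
        }
        for row in rows
    ]
    return {"dim_county": dim_county, "dim_property_type": dim_type}, fact


dims, fact_listing = build_star_schema(listings)
```

The fact table keeps only surrogate keys and the numeric measure, so a tool such as Tableau can join it to the small dimension tables when slicing by county or property type.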