Hi, I'm Md. Siddiqur Rahman, a passionate data science enthusiast with a keen interest in uncovering insights from complex datasets. I enjoy working on end-to-end data science projects, from data collection and cleaning to modeling and visualization. My focus is on creating data-driven solutions that make a meaningful impact.
- Languages: Python, HTML, CSS
- Data Analysis: pandas, numpy, SQL, Excel
- Database: SQLite, MongoDB
- Version Control: Git, GitHub
- Data Visualization: matplotlib, seaborn, Plotly
- Machine Learning: scikit-learn, statsmodels, linearmodels
- IDE: VS Code, Jupyter Notebook, Kaggle
- Data Engineering: ETL
- More Tools: Word, PowerPoint
- Problem-Solving: Strong analytical and critical thinking skills.
- Collaboration: Experience working in multidisciplinary teams.
- Communication: Ability to explain complex concepts in simple terms.
- Project Management: Skilled in Agile methodologies and project coordination.
- Company: Mentorness
- Duration: April 2024 - May 2024
- Credential: Certificate
- Responsibilities:
- Authored and presented a technical article on a machine learning topic to internal and external stakeholders.
- Built and evaluated a machine learning model to predict mobile phone price classes, using data preprocessing, feature engineering, and model assessment techniques.
- Analyzed the ICC Men's T20 Cricket World Cup dataset, creating visualizations to identify and present key insights.
- Deployed a machine learning model to a production environment, ensuring functionality and reliability. Collaborated with stakeholders to validate the deployment process.
- Built a predictive model to classify mobile phones into predefined price ranges, using attributes such as battery power, camera features, memory, and connectivity options.
- Developed five distinct models: Logistic Regression, Support Vector Classifier, Decision Tree Classifier, Random Forest Classifier, and Gradient Boosting Classifier.
- Tuned hyperparameters with cross-validated grid search to optimize model performance.
- Deployed the best-performing model, which achieved 98% accuracy, as a Streamlit app for interactive visualization and user interaction.
- Created an intuitive interface that lets users enter specific mobile phone attributes and receive real-time predictions and insights.
- Tech Stack: Python, pandas, scikit-learn, matplotlib, Streamlit
- Key Learnings: Data preprocessing, feature engineering, model evaluation, data analysis, data visualization
- GitHub Repo: Link to repo
- Jupyter Notebook: Link to Kaggle
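
A minimal sketch of the cross-validated grid search described above. It is not the project's actual code: the data is synthetic (generated with `make_classification`), the four-class target is an assumption, only two of the five model families are shown, and the parameter grids are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the phone dataset; 4 price classes is an assumption.
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=6,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Candidate models with illustrative (not the project's) parameter grids.
candidates = {
    "logistic_regression": (
        Pipeline([
            ("scale", StandardScaler()),
            ("clf", LogisticRegression(max_iter=1000)),
        ]),
        {"clf__C": [0.1, 1.0, 10.0]},
    ),
    "random_forest": (
        Pipeline([("clf", RandomForestClassifier(random_state=0))]),
        {"clf__n_estimators": [50, 100], "clf__max_depth": [None, 5]},
    ),
}

results = {}
for name, (pipeline, grid) in candidates.items():
    # 5-fold cross-validation over every combination in the grid.
    search = GridSearchCV(pipeline, grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    results[name] = {
        "best_params": search.best_params_,
        "cv_accuracy": search.best_score_,
        "test_accuracy": search.score(X_test, y_test),
    }

# Pick the model family with the best cross-validated accuracy.
best_name = max(results, key=lambda n: results[n]["cv_accuracy"])
print(best_name, results[best_name])
```

Selecting on cross-validated accuracy and reporting the held-out test accuracy separately keeps the final number from being tuned on the same data it is measured on.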
- Analyzed a comprehensive dataset from a major cricket tournament, focusing on batting and bowling statistics.
- Identified and explained key batting metrics, including most runs scored, highest strike rates, and best batting averages, across different teams, innings, and individual players.
- Evaluated key bowling metrics, such as most wickets taken, lowest economy rates, and lowest average runs conceded, contextualizing the data by innings, teams, and individual bowlers.
- Analyzed additional features, including most boundaries hit, to give a fuller picture of the tournament's performance trends.
- Tech Stack: Python, pandas, Plotly, matplotlib, Dash, Gunicorn
- Key Learnings: Data preprocessing, data analysis, data visualization
- GitHub Repo: Link to repo
- Jupyter Notebook: Link to Kaggle
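
A small sketch of how the batting and bowling metrics above can be computed with pandas. The player names, column names, and numbers are hypothetical and do not come from the tournament dataset.

```python
import pandas as pd

# Hypothetical per-player batting totals (made-up values).
batting = pd.DataFrame({
    "player": ["A", "B", "C"],
    "runs": [120, 90, 60],
    "balls_faced": [80, 75, 40],
    "dismissals": [3, 2, 0],
})
# Strike rate: runs scored per 100 balls faced.
batting["strike_rate"] = batting["runs"] / batting["balls_faced"] * 100
# Batting average: runs per dismissal; left as NaN if never dismissed.
batting["average"] = batting["runs"] / batting["dismissals"].where(
    batting["dismissals"] > 0
)

# Hypothetical per-player bowling totals (made-up values).
bowling = pd.DataFrame({
    "player": ["X", "Y"],
    "balls_bowled": [48, 60],
    "runs_conceded": [56, 60],
    "wickets": [4, 3],
})
# Economy rate: runs conceded per over (6 balls).
bowling["economy"] = bowling["runs_conceded"] / (bowling["balls_bowled"] / 6)
# Bowling average: runs conceded per wicket; NaN if no wickets taken.
bowling["average"] = bowling["runs_conceded"] / bowling["wickets"].where(
    bowling["wickets"] > 0
)

top_scorer = batting.loc[batting["runs"].idxmax(), "player"]
most_economical = bowling.loc[bowling["economy"].idxmin(), "player"]
print(top_scorer, most_economical)
```

The same columns can be grouped by team or innings with `groupby` before computing the ratios, which is how the per-team and per-innings views would be derived.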
- Bachelor of Science in Economics
- University: Jahangirnagar University
- Graduation Year: 2021
- Relevant Coursework: Econometrics, Data Visualization, Statistics, Linear Algebra, Calculus
- Certification in Data Science
- Organization: WorldQuant University
- Completion Year: 2023
- Description: A comprehensive data science certification covering a range of data science and machine learning topics, including tools for working with databases such as MongoDB and SQLite.
- Credential: Credly Badge
- Certification in Python
- Organization: Harvard University
- Completion Year: 2023
- Description: A Python course with David Malan introduces foundational topics like functions, variables, conditionals, and loops. It includes practical training in reading, writing, testing, and debugging code, with a focus on exception handling and unit testing. The course also covers regular expressions for data manipulation, object-oriented programming to model real-world entities, and file operations for reading and writing files. Third-party libraries are explored to extend Python's capabilities, equipping students with the skills to address real-world coding challenges.
- Credential: Certificate
- Certification in Python (Crash Course)
- Organization: Google (Coursera)
- Completion Year: 2021
- Description: A beginner-level Python programming course covering foundational concepts and the benefits of Python in IT. It includes basic syntax and hands-on practice writing programs in different code editors.
- Credential: Certificate
- LinkedIn: Md. Siddiqur Rahman
- Kaggle: Md. Siddiqur Rahman
- Email: [email protected]
I'm open to collaboration and enjoy discussing all things data science. Don't hesitate to reach out!