Dagart dagartga

Hi there 👋

I am a dedicated Data Scientist based in Los Angeles with over 12 years of Life Science experience. I am passionate about solving business problems with Data Science techniques and communicating insights to stakeholders through visualizations. With a systematic and creative approach, I aim to add tangible value to teams, businesses, and end-users, and I am committed to continuous learning and self-improvement.

Technical Skills

Programming: Python, SQL

Tools & Technologies: Git, AWS, Databricks

Data Manipulation & Analysis: Pandas, NumPy, Statsmodels, SciPy

Visualization: Tableau, Seaborn, Matplotlib

Machine Learning & Deep Learning: PySpark, SparkSQL, TensorFlow, XGBoost

Statistical Analysis: A/B Testing, Regression (Linear, Logistic), Classification, Clustering, PCA, Forecasting, Anomaly Detection

Model Interpretation: SHAP, Interpretability Techniques

Model Deployment: Docker, Flask, Streamlit

Soft Skills

Research, Communication, Accountability, Initiative, Collaboration, Critical Thinking, Passion, Presentation, Project Delivery, Idea Generation

GitHub Projects

Data Science Portfolio Website

Homepage

My main portfolio website, which contains the following data science projects:

Quantifying Sales Uplift with Causal Impact Analysis

Analyzed the impact of a "Delivery Club" special on sales for a grocery retailer using a Python causal impact library, revealing a 41.1% uplift in sales.

Web App for Parkinson's Prediction using Boosted Models

Developed a predictive model for Parkinson's severity using boosted tree models with feature engineering, resulting in significant improvement in F1 score and recall.

Understanding Alcohol Product Relationships Using Association Rule Learning

Analyzed customer buying patterns to uncover product relationships in alcohol retail, revealing insights about customer preferences.

Compressing Feature Space For Classification Using PCA

Predicted customer behavior using historical music sales data, achieving high accuracy through feature space compression and Random Forest.

The "You Are What You Eat" Customer Segmentation

Segmented customers for a grocery chain to provide marketing insights based on dietary preferences.

UK Bank Customer Demographics Tableau Dashboard

Created a dashboard for targeted marketing campaigns based on bank customer demographics.

Streamlit Apps

Parkinsons Severity Prediction

Boosted tree models that predict the maximum severity of Parkinson's for clinical patients based on their protein and peptide mass spectrometry quantities.
Explainable results using feature importance for each patient, with a description of the top proteins.

Salifort Motors Employee Retention Prediction

Uses an XGBoost classifier to predict how likely an employee is to leave, based on HR data.
Provides SHAP values to explain the probability score assigned to each employee.

Deep Learning Project

Transfer Learning for X-Ray Image Classification

Used a pre-trained DenseNet-201 neural network and fine-tuned one hidden layer with 4096 nodes and 30% dropout for regularization.
Addressed the imbalanced target distribution by passing class weights to the neural network.
Tuned the number of hidden layers, nodes, learning rate, batch size, learning rate decay, and momentum to find optimal values.
Final Test Statistics:
- AUC: 0.990
- F1: 0.972
- Recall: 0.971
- Precision: 0.973
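
As a rough illustration of the class-weighting idea mentioned above, here is a minimal sketch. It assumes inverse-frequency ("balanced") weights, the same heuristic scikit-learn uses for `class_weight="balanced"`. The function name and toy labels are hypothetical and are not taken from the project itself, which may have computed its weights differently.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: weight} with weight = n_samples / (n_classes * class_count).

    Rare classes get larger weights, so their errors count more in the loss.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical toy labels (not project data): 90 negatives, 10 positives.
toy_labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(toy_labels)
print(weights)
```

If the model is trained with Keras, a dictionary of this shape can be passed to `Model.fit` through its `class_weight` argument.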
