smougel/README.md

Hi there 👋

  • 🔭 I’m the founder of https://www.seocopilot.fr : helping companies improve their SEO organically
  • 🌱 I’m currently learning about Data Science / AI / Deep Learning / Machine Learning (a never-ending process)
  • 👯 I’m looking to collaborate on anything that has a meaningful purpose, when I have time!
  • 📫 How to reach me: https://www.linkedin.com/in/smougel/
  • 😄 Pronouns: He/His

I have 20 years of experience as a Software Engineer (full-stack dev), and I am now adding a new card to my deck as a Data Scientist / AI Engineer. I love learning new things about Deep Learning / CNNs / Sequence Models (many thanks to Andrew Ng & Coursera).

  • 💻 Programming languages : Python, JavaScript, PHP
  • 🗂️ Databases : MariaDB / MySQL / Redis
  • Front : React / Redux / CSS / HTML 5
  • 🔧 Web frameworks : Symfony / Laravel / CodeIgniter
  • 🔩 DB frameworks : Active Record, Doctrine ORM
  • ⚙️ Backend : Writing of Workers & Daemons
  • ⌚ Load & Queue management : Beanstalkd
  • 📊 DataViz : Matplotlib, Plotly, Seaborn
  • 🧪 Data science : Pandas, NumPy, scikit-learn, TensorFlow, PyTorch, Keras

Projects

I will open source them as soon as possible.

Exploratory Data Analysis (Data for good): 🌳 🌲 🌱 🏢 Paris Trees 🌳 🌲 🌱

Goal : Help the city of Paris become a smart city.

Optimization of tree maintenance

Data source : opendata.paris.fr

My work : https://github.com/smougel/eda_paris_tree (Notebook & Presentation)

Exploratory Data Analysis (Open Food Facts) 🍕 🍇 🍓 🧀 🍔 🍫

Goal : Analyse and find healthy products / Inform people about nutritional metrics

My work : https://github.com/smougel/eda_open_food_facts (Notebook & Presentation)

Credit scoring ✒️ 💯

Build a model to detect people able to repay their loan... or not...

💡 Process :

  • exploratory data analysis
  • data cleaning
  • feature engineering
  • sampling / train & test split
  • model training : SVM, Neural Networks, Logistic Regression, Random Forest
  • variable importance evaluation with LIME

Metrics : Precision / Recall / F1-Score
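As a reminder of what these three metrics measure, here is a minimal pure-Python sketch. The function name, the label encoding (1 = loan not repaid) and the toy predictions are illustrative assumptions, not taken from the project notebook.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical labels: 1 = loan not repaid, 0 = loan repaid.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

On imbalanced credit data, these metrics are usually more informative than plain accuracy, because they focus on how the rarer class is handled.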

🪛 Hyperparameter tuning with grid search and cross-validation

Tech used : Python, Scikit-learn, Matplotlib, Seaborn

Project : https://github.com/smougel/credit_scoring/tree/master (Notebook)

Customer segmentation for e-commerce 👨 👧 🧓 👽 🙆

Unsupervised learning task

Dataset : Olist.com

Gain insights about user behavior and discover buyer characteristics

💡 Process :

  • exploratory data analysis
  • data wrangling
  • feature engineering
  • dimensionality reduction : Principal Component Analysis
  • clustering : k-means, DBSCAN
  • elbow method
  • High-dimensionality visualization : t-SNE, UMAP
  • Analysis of cluster stability

Metrics : ARI Score
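The Adjusted Rand Index (ARI) compares two labelings of the same samples, which is one common way to check cluster stability between runs. The sketch below is a minimal pure-Python version; the function name and the toy label lists are illustrative assumptions, not taken from the project.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two labelings of the same samples."""
    n = len(labels_a)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two samples: the partitions trivially agree

    # Pairs of samples grouped together in both, in A only, and in B only.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = sum_a * sum_b / total_pairs  # agreement expected by chance
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (sum_cells - expected) / (max_index - expected)


# Same partition with permuted cluster ids: ARI ignores the label names.
same_partition = adjusted_rand_index([0, 0, 1, 1, 2, 2], [1, 1, 0, 0, 2, 2])
# Labelings that disagree on every pair score below zero.
unrelated = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
print(same_partition, unrelated)
```

A score close to 1 between two clustering runs (for example on different time windows or random seeds) suggests the segments are stable; a score near 0 means agreement is no better than chance.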

AI for Restaurants 🍽️ 🍝

😡 Customer dissatisfaction discovery

🪄 Automatic photo classification : Menu card, Food picture 🍝, Outdoor picture

Dataset : https://www.yelp.com/dataset

💡 Process :

  • exploratory data analysis
  • data wrangling for NLP (🤬 stop words, lemmatization, stemming, tokenization)
  • data wrangling for photos (contrast normalization, resizing, noise filtering)
  • Topic discovery : LDA
  • Convolutional neural networks
  • Regularization (Dropout)
  • Exploration of filters learned by the CNN (thanks to François Chollet)
  • Use of Yelp API

Tech used : TensorFlow, NLP, Sequence model, LSTM, CNN, Keras, OpenCV

Bad Buzz Detection in comments 🗣️ 👍 / 👎

Dataset : https://www.kaggle.com/kazanova/sentiment140 (1.6M Tweets)

Goal : Sentiment analysis from tweets. Benchmark with Microsoft Azure Sentiment Analysis.

💡 Process :

  • exploratory data analysis
  • data wrangling (lemmatization / stemming / tokenization)
  • modeling (Basic to advanced : logistic regression, TF-IDF, LSTM)
  • benchmark with Azure Machine Learning Services

🪛 Hyperparameter tuning

Metrics : F-Beta Score
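The F-Beta score generalizes F1 by weighting recall `beta` times as much as precision. Here is a minimal sketch of the standard formula; the function name and the example values are assumptions chosen for illustration, and the `beta` actually used in the project is not stated here.

```python
def f_beta(precision, recall, beta=1.0):
    """F-beta score: beta > 1 favors recall, beta < 1 favors precision."""
    b2 = beta * beta
    denom = b2 * precision + recall
    return (1 + b2) * precision * recall / denom if denom else 0.0


# Hypothetical classifier: every bad buzz is caught, but half the alerts are false.
f2 = f_beta(precision=0.5, recall=1.0, beta=2.0)      # recall-oriented
f_half = f_beta(precision=0.5, recall=1.0, beta=0.5)  # precision-oriented
print(round(f2, 4), round(f_half, 4))
```

For bad buzz detection, a recall-oriented `beta` would penalize missed negative tweets more than false alarms, which is one plausible reason to prefer F-Beta over plain F1.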

Tech used : Word embeddings (Word2Vec & FastText), TensorFlow & Keras

Image segmentation for autonomous driving 🤖 ❤️ 🚗

Dataset : Cityscapes

💡 Process :

  • exploratory data analysis
  • data wrangling (Picture to binary mask)
  • data augmentation (Random cropping, flipping, mirroring)
  • modeling (Basic to advanced : Fully connected layers to U-Net architecture)
  • ☁️ training in the cloud (w/ Microsoft Azure : compute instance provisioning)
  • Model serving via Flask API hosted on Microsoft Azure

🪛 Hyperparameter tuning

Metrics : Jaccard index
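The Jaccard index (intersection over union) measures how well a predicted mask overlaps the ground truth. The sketch below works on small binary masks stored as nested lists; the function name and the toy masks are illustrative assumptions, and a real pipeline would typically compute this on tensors.

```python
def jaccard_index(mask_true, mask_pred):
    """Intersection over union of two binary masks of identical shape."""
    intersection = 0
    union = 0
    for row_true, row_pred in zip(mask_true, mask_pred):
        for t, p in zip(row_true, row_pred):
            if t and p:
                intersection += 1
            if t or p:
                union += 1
    # Two empty masks are treated as a perfect match.
    return intersection / union if union else 1.0


# Hypothetical 2x3 masks for a single class (1 = pixel belongs to the class).
mask_true = [[1, 1, 0],
             [0, 1, 0]]
mask_pred = [[1, 0, 0],
             [0, 1, 1]]
score = jaccard_index(mask_true, mask_pred)
print(score)
```

For multi-class segmentation, the same computation is usually repeated per class and averaged.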

Tech used : TensorFlow, Keras, CNN, U-Net, Flask, Azure Services

Content recommendation for news reading 🧚 🪄 📚

Dataset : News Portal User Interactions by Globo.com https://www.kaggle.com/gspmoreira/news-portal-user-interactions-by-globocom#clicks_sample.csv

💡 Process :

  • exploratory data analysis (w/ t-SNE visualization of news embeddings)
  • data wrangling
  • modeling : content filtering and collaborative filtering
  • ☁️ training in the cloud (w/ Microsoft Azure : compute instance provisioning)
  • Use of serverless Azure Function for model serving / Azure Storage
  • Integration with a Node.js mobile app

Metrics : Similarity measure (dot product, cosine)
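Both similarity measures can be sketched in a few lines of pure Python. The helper names, the 2-dimensional embeddings and the article ids below are hypothetical and only illustrate how articles could be ranked against a user profile vector.

```python
from math import sqrt


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine_similarity(u, v):
    """Dot product normalized by both vector lengths (0.0 for a zero vector)."""
    norm = sqrt(dot(u, u)) * sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0


def rank_articles(user_vec, article_vecs):
    """Return article ids sorted from most to least similar to the user."""
    return sorted(
        article_vecs,
        key=lambda article_id: cosine_similarity(user_vec, article_vecs[article_id]),
        reverse=True,
    )


# Hypothetical embeddings: a user profile and three candidate articles.
user_vec = [1.0, 0.0]
article_vecs = {"a": [2.0, 0.0], "b": [0.0, 3.0], "c": [1.0, 1.0]}
ranking = rank_articles(user_vec, article_vecs)
print(ranking)
```

Unlike the raw dot product, cosine similarity ignores vector length, so a long embedding does not win simply because of its magnitude.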

Tech used : TensorFlow, Sparse Tensor, Matrix factorization, Azure Services

Chatbot for vacation booking 🤖 🪄 🌴 ☀️

Dataset : Microsoft Frames dataset (Dialogs between two humans via a chat interface) https://www.microsoft.com/en-us/research/project/frames-dataset/

💡 Process :

  • exploratory data analysis (w/ t-SNE visualization of news embeddings)
  • data wrangling
  • LUIS Training
  • Integration w/ Microsoft Bot Framework

Metrics : Similarity measure (dot product, cosine)

Tech used : Microsoft LUIS, Microsoft Bot Framework, Azure Application Insights, Unit Testing

Pinned

  1. eda_open_food_facts (Public)

    Jupyter Notebook

  2. eda_paris_tree (Public)

    EDA for Paris trees

    Jupyter Notebook

  3. pytorch-Deep-Learning (Public)

    Forked from Atcold/NYU-DLSP20

    Deep Learning (with PyTorch)

    Jupyter Notebook

  4. content_recommendation (Public)

    Jupyter Notebook

  5. travelbot (Public)

    Travel chatbot developed with Azure Machine Learning / LUIS / Microsoft Chatbot Framework Builder

    Python

  6. batteurMDR/ggj2015 (Public)

    JavaScript