XavierSpycy/README.md

Photo by Chris Ried on Unsplash

graph TD;
    DomainKnowledge[Domain Knowledge]-->MachineLearning[Machine Learning];
    MachineLearning[Machine Learning]-->StatisticalLearning[Statistical Learning];
    MachineLearning[Machine Learning]-->DeepLearning[Deep Learning];
    DomainKnowledge[Domain Knowledge]-->BackendDevelopment[Backend Development];
    DeepLearning[Deep Learning]-->ImageClassification[Image Classification];
    ImageClassification[Image Classification]-->LabelNoise[Label Noise];
    DeepLearning[Deep Learning]-->NaturalLanguageProcessing[Natural Language Processing];
    NaturalLanguageProcessing[Natural Language Processing]-->TopicModeling[Topic Modeling];
    NaturalLanguageProcessing[Natural Language Processing]-->LLMApplication[LLM Application];
    DeepLearning[Deep Learning]-->Multimodality[Multimodality];
    Multimodality[Multimodality]-->ImageTextClassification[Image-Text Classification];
    Multimodality[Multimodality]-->VisualQuestionAnswering[Visual Question Answering];
| Properties | Skills |
| --- | --- |
| Domain Knowledge | Deep Learning, Natural Language Processing, Software Development, Multimodal Learning |
| Language | Python |
| Data Analysis | NumPy, Pandas, SciPy, Matplotlib |
| Databases | MySQL |
| Self-developed package | mlforce (Machine Learning Force) |
| Statistical Learning Tools | R, RStudio |
| Machine Learning Libraries | scikit-learn |
| Deep Learning Frameworks | PyTorch, TensorFlow, HuggingFace |
| Visualization techniques | Tableau, Power BI, D3, Tulip, yEd, Gephi |

Curriculum Vitae

English | Chinese version

Experiences / Projects

NumPy-Based Projects

Photo by Google DeepMind on Unsplash

🌟Self-Developed Library using NumPy

My NumPy-based projects are integrated into my own open-source Python library, MLForce, which is available on PyPI.

🌟Multilayer Perceptron from Scratch using NumPy

A robust implementation of multilayer perceptrons, entirely built upon the powerful NumPy library.

Advantages of our implementation:

  • Easy to construct

    layers = [
        Input(input_dim=2),
        Dense(units=4, activation='leaky_relu', init='kaiming_normal', init_params={'mode': 'out'}),
        Dense(units=3, activation='hardswish', init='xavier_normal'),
        Dense(units=2, activation='relu', init='kaiming_normal', init_params={'mode': 'in'}),
        Dense(units=1, activation='tanh', init='xavier_uniform')
    ]
    mlp = MultilayerPerceptron(layers)

  • Easy and stable to train

    mlp.compile(optimizer='Adam',
                metrics=['MeanSquareError'])
    mlp.fit(X, y, epochs=3, batch_size=8, use_progress_bar=True)

    Loss
  • Great results

    Decision boundary
  • Capability of dealing with complex datasets (10 classes, 128 features, 50,000 samples)

    Smooth optimization procedure in 600 epochs
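To make the layer list above more concrete, here is a minimal NumPy sketch of what a single dense layer's forward pass could look like. It is not MLForce's actual code: the helper names `kaiming_normal`, `leaky_relu`, and `dense_forward` are hypothetical, and reading `mode='out'` as "scale by fan-out" is an assumption.

```python
import numpy as np

def kaiming_normal(fan_in, fan_out, rng, mode="in"):
    """He-style normal init: std = sqrt(2 / fan). The meaning of `mode` is assumed."""
    fan = fan_in if mode == "in" else fan_out
    return rng.normal(0.0, np.sqrt(2.0 / fan), size=(fan_in, fan_out))

def leaky_relu(z, alpha=0.01):
    """Pass positive values through and scale negative values by alpha."""
    return np.where(z > 0, z, alpha * z)

def dense_forward(x, W, b, activation):
    """Affine transform followed by an element-wise activation."""
    return activation(x @ W + b)

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 2))                  # a batch of 8 samples with 2 features

W1 = kaiming_normal(2, 4, rng, mode="out")   # hidden layer with 4 units
H = dense_forward(X, W1, np.zeros(4), leaky_relu)

W2 = kaiming_normal(4, 1, rng)               # output layer with 1 unit
out = dense_forward(H, W2, np.zeros(1), np.tanh)
print(H.shape, out.shape)
```

A full implementation would also need backpropagation and an optimizer step; this sketch only covers the forward direction.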

🌟Non-negative Matrix Factorization using NumPy

This project implements nine different Non-negative Matrix Factorization (NMF) algorithms and compares how robust each one is to five different types of noise in real-world data applications.
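For readers new to NMF, the sketch below shows the classic multiplicative-update scheme for the Frobenius-norm objective. It is a generic textbook variant, not necessarily one of the nine algorithms in the repository, and the function name `nmf_multiplicative` is hypothetical.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=200, eps=1e-10, seed=0):
    """Approximate a non-negative V (n x m) as W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic non-negative data with an exact rank-3 structure.
rng = np.random.default_rng(1)
V = rng.random((20, 3)) @ rng.random((3, 15))

W, H = nmf_multiplicative(V, rank=3)
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_error:.4f}")
```

Robust variants typically change the objective (for example, an L1-type loss) so that corrupted entries influence the factors less.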

  • Well-reconstructed effects

    Image reconstruction
  • Sufficient experiments

    We conducted a series of experiments whose results can serve as a baseline when you develop your own algorithms. The results cover 2 datasets × 5 noise types × 2 noise levels × 5 (implicit) random seeds and are displayed in the repository.

  • Flexible development

    Our development framework empowers you to effortlessly create your own NMF algorithms with minimal Python scripting.

  • Mature pipeline

    Our framework offers well-established pipelines, accommodating both standard and customized NMF tests.

    For personalized NMF models, the nmf parameter accepts a BasicNMF object. You can seamlessly insert your own NMF model into our pipeline to evaluate its performance.

  • Multiprocessing experiments

    We use multiprocessing for extensive experiments, which significantly improves efficiency. This approach reduces the overall experiment duration to roughly 30% to 50% of the time it would take to run each experiment sequentially.

    For a comprehensive analysis of your algorithm, our platform enables conducting multiple experiments across various datasets:

    from algorithm.pipeline import Experiment

    # Choose a built-in algorithm by name, then run the experiments.
    exp = Experiment()
    exp.choose('L1NormRegularizedNMF')
    exp.execute()

  • Interactive algorithm interface

    Demo

    Note that the initial parameter in these experiments can also be a BasicNMF object, so you can plug in your custom NMF model directly for thorough evaluation and testing.
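The experiment grid described under "Sufficient experiments" (2 datasets × 5 noise types × 2 noise levels × 5 random seeds) can be enumerated with a few lines of standard-library Python. This is only an illustrative sketch: the dataset, noise, and level names below are hypothetical placeholders, and the repository's own pipeline may organize its runs differently.

```python
from itertools import product

# Placeholder names; the real datasets and noise types live in the repository.
datasets = ["dataset_a", "dataset_b"]
noise_types = [f"noise_{i}" for i in range(1, 6)]
noise_levels = ["low", "high"]
seeds = range(5)

# One configuration per combination of the four factors.
grid = [
    {"dataset": d, "noise": n, "level": lvl, "seed": s}
    for d, n, lvl, s in product(datasets, noise_types, noise_levels, seeds)
]

print(len(grid))  # 100
```

Because the configurations are independent of one another, a list like this is a natural unit of work to hand to a process pool.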

DON'T HESITATE TO DEVELOP YOUR OWN ALGORITHM!!!

PyTorch-Based Projects

Photo by Alex Knight on Unsplash

🌟EMNIST Handwritten Character Classification

This project aims to reproduce various convolutional neural networks and modify them to our specific requirements.

Performance of different CNNs on the training set

| Metric | AlexNet | VGGNet | SpinalNet | ResNet |
| --- | --- | --- | --- | --- |
| Accuracy | 87.95% | 89.80% | 87.15% | 89.28% |
| Precision | 87.62% | 90.01% | 86.18% | 89.24% |
| Recall | 87.95% | 89.80% | 87.15% | 89.28% |
| F1 score | 86.59% | 88.42% | 85.28% | 88.30% |

Performance of different CNNs on the test set

| Metric | AlexNet | VGGNet | SpinalNet | ResNet |
| --- | --- | --- | --- | --- |
| Accuracy | 86.96% | 87.24% | 85.92% | 86.88% |
| Precision | 85.55% | 86.43% | 85.92% | 86.88% |
| Recall | 86.96% | 87.24% | 85.92% | 86.88% |
| F1 score | 85.58% | 85.66% | 84.07% | 85.68% |
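In both tables above, Recall equals Accuracy for every model. That is what support-weighted averaging of per-class recall produces, so the sketch below assumes (without confirmation from the source) that the metrics were weighted-averaged. The toy labels are made up for illustration.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of samples whose prediction matches the label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def weighted_recall(y_true, y_pred):
    """Per-class recall, averaged with weights proportional to class support."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    n = len(y_true)
    return sum((support[c] / n) * (hits[c] / support[c]) for c in support)

# Toy labels for three classes (illustrative only).
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 2, 2]

acc = accuracy(y_true, y_pred)
rec = weighted_recall(y_true, y_pred)
print(acc, rec)
```

The class-support weights cancel the per-class denominators, so the weighted sum collapses to correct predictions divided by total samples.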

Effects of VGGNet

🌟CAT: A Visual-Text Multimodal Classifier

This project involves a multi-label, multi-class classification problem. We deployed four pre-trained image models and two pre-trained text models. To enhance performance, we developed 12 multimodal models using self-attention and cross-attention mechanisms. The project poster showcases some valuable techniques and intriguing discoveries.
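As a rough illustration of the cross-attention idea mentioned above, the NumPy sketch below lets text-token features attend over image-patch features using scaled dot-product attention. It leaves out the learned query, key, and value projections and is not the CAT architecture itself; the shapes and variable names are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys, values):
    """Scaled dot-product attention where queries come from another modality."""
    d_k = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)
    return weights @ values, weights

rng = np.random.default_rng(0)
text_tokens = rng.normal(size=(5, 16))     # 5 text tokens, 16-dim features
image_patches = rng.normal(size=(9, 16))   # 9 image patches, 16-dim features

# Each text token becomes a weighted mix of the image-patch features.
fused, weights = cross_attention(text_tokens, image_patches, image_patches)
print(fused.shape, weights.shape)
```

Self-attention is the special case in which queries, keys, and values all come from the same modality.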

CAT (Convolution, Attention and Transformer) architecture

Project Poster

🌟Robust Trainers for Noisy Labels

This project is an experimental repository for handling datasets with a high level of label noise (50% and above). It features experiments on the FashionMNIST and CIFAR datasets, using ResNet34 as the baseline classifier.

The repository explores various training strategies (Trainer objects), including ForwardLossCorrection, CoTeaching, JoCoR, and O2UNet. Specifically, for datasets with unknown transition matrices, DualT is employed as the Transition Matrix Estimator.
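The idea behind forward loss correction can be sketched in a few lines: the model's clean-class probabilities are pushed through the transition matrix before the loss is computed against the noisy labels. The sketch below uses the FashionMNIST0.5 transition matrix reported in this section and assumes the convention `T[i, j] = P(noisy = j | clean = i)`; the function name and the toy predictions are hypothetical, and the repository's ForwardLossCorrection trainer may differ in detail.

```python
import numpy as np

# Actual FashionMNIST0.5 transition matrix from this section.
# Assumed convention: T[i, j] = P(noisy label = j | clean label = i).
T = np.array([[0.5, 0.2, 0.3],
              [0.3, 0.5, 0.2],
              [0.2, 0.3, 0.5]])

def forward_corrected_nll(clean_probs, noisy_labels, T):
    """Negative log-likelihood of noisy labels under noise-adjusted predictions."""
    noisy_probs = clean_probs @ T   # predicted distribution over noisy labels
    picked = noisy_probs[np.arange(len(noisy_labels)), noisy_labels]
    return -np.log(picked).mean()

# Toy softmax outputs for two samples (illustrative only).
clean_probs = np.array([[0.90, 0.05, 0.05],
                        [0.10, 0.80, 0.10]])
noisy_labels = np.array([0, 2])

loss = forward_corrected_nll(clean_probs, noisy_labels, T)
print(f"forward-corrected loss: {loss:.4f}")
```

Training against the noise-adjusted distribution lets the underlying classifier keep predicting clean-class probabilities even though it only ever sees noisy labels.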

  • Meaningful Loss Trends

    Loss Trend 1

    Loss Trend 2
  • Persuasive Results
    FashionMNIST0.5

    | Actual Transition Matrix | Estimated Transition Matrix |
    | --- | --- |
    | 0.5 0.2 0.3 | 0.473 0.209 0.309 |
    | 0.3 0.5 0.2 | 0.306 0.485 0.232 |
    | 0.2 0.3 0.5 | 0.221 0.306 0.460 |

    FashionMNIST0.6

    | Actual Transition Matrix | Estimated Transition Matrix |
    | --- | --- |
    | 0.4 0.3 0.3 | 0.407 0.295 0.298 |
    | 0.3 0.4 0.3 | 0.297 0.394 0.308 |
    | 0.3 0.3 0.4 | 0.301 0.310 0.388 |
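One simple way to quantify how close the estimates are is the mean absolute error per matrix entry. The values below are copied from the matrices above; the helper name `mean_abs_error` is just for illustration.

```python
def mean_abs_error(actual, estimated):
    """Average absolute difference over all matrix entries."""
    diffs = [abs(a - e)
             for row_a, row_e in zip(actual, estimated)
             for a, e in zip(row_a, row_e)]
    return sum(diffs) / len(diffs)

actual_05 = [[0.5, 0.2, 0.3], [0.3, 0.5, 0.2], [0.2, 0.3, 0.5]]
estimated_05 = [[0.473, 0.209, 0.309], [0.306, 0.485, 0.232], [0.221, 0.306, 0.460]]

actual_06 = [[0.4, 0.3, 0.3], [0.3, 0.4, 0.3], [0.3, 0.3, 0.4]]
estimated_06 = [[0.407, 0.295, 0.298], [0.297, 0.394, 0.308], [0.301, 0.310, 0.388]]

mae_05 = mean_abs_error(actual_05, estimated_05)
mae_06 = mean_abs_error(actual_06, estimated_06)
print(f"FashionMNIST0.5: {mae_05:.4f}")  # FashionMNIST0.5: 0.0183
print(f"FashionMNIST0.6: {mae_06:.4f}")  # FashionMNIST0.6: 0.0060
```

On these numbers the estimated entries stay within 0.04 of the true values for FashionMNIST0.5 and within 0.012 for FashionMNIST0.6.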

🌟Transformers for Tabular Data

A PyTorch-based implementation that applies Transformer architectures to modeling tabular data.

🌟MultiCLIP: Multimodal-Multilabel-Multistage Classification using Language Image Pre-training

A framework for multimodal-multilabel-multistage classification utilizing advanced pretrained models like CLIP and BLIP.

Diagrams of implementation:

CLIP + Router

BLIP + Anything

TensorFlow-Based Project

🌟Awesome Tutorials for TensorFlow2

Certification

| Specialization | Provider | Completion Date | Credential |
| --- | --- | --- | --- |
| Generative Adversarial Networks (GANs) | DeepLearning.AI | Jun 2024 | Link |
| Natural Language Processing | DeepLearning.AI | Oct 2023 | Link |
| Deep Learning | DeepLearning.AI | Aug 2023 | Link |
| Mathematics for Machine Learning and Data Science | DeepLearning.AI | Aug 2023 | Link |
| Applied Data Science with Python | University of Michigan | Jul 2023 | Link |
| Machine Learning | DeepLearning.AI & Stanford University | Jul 2023 | Link |
| Mathematics for Machine Learning | Imperial College London | Jun 2023 | Link |
| Expressway to Data Science: Python Programming | University of Colorado Boulder | Dec 2022 | Link |
| Python 3 Programming | University of Michigan | Dec 2022 | Link |
| Introduction to Scripting in Python | Rice University | Nov 2022 | Link |
| Statistics with Python | University of Michigan | Nov 2022 | Link |
| Excel Skills for Data Analytics and Visualization | Macquarie University | Oct 2022 | Link |
| Python for Everybody | University of Michigan | Oct 2022 | Link |
| Excel Skills for Business | Macquarie University | Sep 2022 | Link |

Credentials

How to Reach Me

Email: [email protected]

LinkedIn: Jiarui XU


Thank you for visiting ❤️

Pinned

  1. huggingface/transformers (Public)

    🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

    Python · 128k stars · 25.4k forks

  2. Hannibal046/Awesome-LLM (Public)

    Awesome-LLM: a curated list of Large Language Model

    15.7k stars · 1.3k forks

  3. EMNIST-Classifier (Public)

    Handwritten character recognition on EMNIST ByClass using Convolutional Neural Networks with PyTorch.

    Python · 3 stars

  4. NumPyNMF (Public)

    NumPyNMF implements nine different Non-negative Matrix Factorization (NMF) algorithms using NumPy library and compares the robustness of each algorithm to five various types of noise in real-world …

    Python · 2 stars · 1 fork

  5. NumPyMultilayerPerceptron (Public)

    A Multilayer Perceptron from scratch using NumPy. Offers almost all basic functionalities. Suitable for classification and regression tasks.

    Python · 2 stars

  6. CAT-ImageTextIntegrator (Public)

    An innovative deep learning framework leveraging the CAT (Convolutions, Attention & Transformers) architecture to seamlessly integrate visual and textual modalities. This model exploits the prowess…

    Python · 2 stars