
# Portfolio

## 🙋‍♂️ About Me

Hello there! I'm Drishti. With a passion for technology and a knack for problem-solving, I've worked on a wide range of projects spanning Natural Language Processing, audio deep learning, and Reinforcement Learning.

I started my career at CPA Global (now Clarivate) in Noida, where I worked for four years, first as an IP Researcher and then as an IP Consultant. This deep dive into patents not only honed my analytical skills but also introduced me to the vast expanse of AI. As time passed, however, a restless quest for greater purpose, along with several unexpected and harsh twists in life, pushed me to reinvent and rebuild my life.

After moving into the Data Science domain, I have actively contributed to Hugging Face's open-source initiatives and shared my research insights through Medium and Analytics Vidhya.

## 🏆 Achievements

1. Hugging Face Whisper Fine-tuning Event, '23:
   - Secured 1st position with Whisper models fine-tuned for ASR across 11 low-resource languages from the Mozilla Common Voice 11 dataset: Azerbaijani, Breton, Hausa, Hindi, Kazakh, Lithuanian, Marathi, Nepali, Punjabi, Slovenian, and Serbian (a minimal fine-tuning sketch follows this list).
   - The models outperformed even the benchmarks reported in OpenAI's Whisper research paper.
2. Hugging Face Wav2Vec2 Fine-tuning Event, '22:
   - Attained 1st position with models fine-tuned for 7 distinct languages.
3. Secured 3rd place in Analytics Vidhya Blogathon'26 for 'Best Guide'.
4. Ranked 3rd in Analytics Vidhya Blogathon'26 for 'Best Article'.
5. Analytics Vidhya Blogathon Winner - 'Best Article'.
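
The snippet below is a rough, minimal sketch of the kind of Whisper fine-tuning setup used for such an event, loosely following the standard Hugging Face recipe. The language (Hindi), model size, and hyperparameters are illustrative placeholders, not the exact winning configuration.

```python
# Minimal Whisper fine-tuning sketch on Common Voice 11 (placeholders throughout).
from dataclasses import dataclass
from datasets import Audio, load_dataset
from transformers import (Seq2SeqTrainer, Seq2SeqTrainingArguments,
                          WhisperForConditionalGeneration, WhisperProcessor)

model_name = "openai/whisper-small"  # placeholder size
processor = WhisperProcessor.from_pretrained(model_name, language="Hindi", task="transcribe")
model = WhisperForConditionalGeneration.from_pretrained(model_name)

# One low-resource Common Voice 11 split (gated dataset: accept its terms on the Hub first),
# resampled to Whisper's expected 16 kHz.
cv = load_dataset("mozilla-foundation/common_voice_11_0", "hi", split="train")
cv = cv.cast_column("audio", Audio(sampling_rate=16_000))

def prepare(batch):
    audio = batch["audio"]
    batch["input_features"] = processor(
        audio["array"], sampling_rate=audio["sampling_rate"]).input_features[0]
    batch["labels"] = processor.tokenizer(batch["sentence"]).input_ids
    return batch

cv = cv.map(prepare, remove_columns=cv.column_names)

@dataclass
class SpeechCollator:
    processor: WhisperProcessor
    decoder_start_token_id: int

    def __call__(self, features):
        # Pad log-mel inputs and label token ids separately; mask padded labels with -100.
        batch = self.processor.feature_extractor.pad(
            [{"input_features": f["input_features"]} for f in features], return_tensors="pt")
        labels = self.processor.tokenizer.pad(
            [{"input_ids": f["labels"]} for f in features], return_tensors="pt")
        label_ids = labels["input_ids"].masked_fill(labels["attention_mask"].ne(1), -100)
        # The model re-adds the decoder start token when shifting labels, so drop it here.
        if (label_ids[:, 0] == self.decoder_start_token_id).all():
            label_ids = label_ids[:, 1:]
        batch["labels"] = label_ids
        return batch

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="whisper-small-hi-sketch",
                                  per_device_train_batch_size=16, learning_rate=1e-5,
                                  max_steps=4000, fp16=True),
    train_dataset=cv,
    data_collator=SpeechCollator(processor, model.config.decoder_start_token_id),
)
trainer.train()
```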

## 🥇 Certifications

1. Hugging Face NLP Course, Part 1 and Part 2
2. Hugging Face Deep Reinforcement Learning Course, 2023
3. Hugging Face Audio Course

## 💪 Volunteer Experience

1. Reviewed 3 research papers on advancing ML in low-resource settings for PML4LRS @ ICLR 2024, Feb '24.
2. Reviewed an NLP research paper for EMNLP, Aug '23.
3. Trained, tested, and deployed TensorFlow-based models for Keras at Hugging Face, March '22.

## 🤖 NLP Projects

| Project Name | Checkpoint & Code | Key Highlights | Blog | Demo (WIP) |
|---|---|---|---|---|
| Comparative Analysis of LoRA Parameters on Llama-2 with Flash Attention | Hugging Face; GitHub | Varying lora_dropout yields stable training loss but inconsistent inference times, while increasing lora_alpha improves training without sacrificing efficiency. | Blog | |
| Dissecting Llama-2-7b's Behavior with Varied Pretraining Temperature and Attention Mechanisms | Hugging Face; GitHub | i) Flash Attention nearly halves the training time compared to normal attention. ii) Minimal difference in training loss across different pretraining_tp values. | Blog | |
| Comparative Study: Training OPT-350M and GPT-2 Using Reward-Based Training | Hugging Face; GitHub | OPT-350M showed a rapid initial decline in loss, while GPT-2 showed a steadier descent but trained faster overall. | Blog | |
| Unraveling the Dual Impact: Batch Size and Mixed Precision on DistilBERT's Performance in Language Detection | Hugging Face; GitHub | Training and validation losses deteriorate slightly at very large batch sizes, suggesting coarser gradient approximations; fp16 speeds up computation across batch sizes while maintaining comparable accuracy. | Blog | |
| Analyzing the Impact of lora_alpha on Llama-2 Quantized with GPTQ | Hugging Face; GitHub | lora_alpha = 32 gave the best training (3.8675) and validation (4.2374) losses; larger values degraded performance and hinted at overfitting, while runtimes stayed consistent. | Blog | |
| Comprehensive Evaluation of Various Transformer Models in Detecting Normal, Hate, and Offensive Texts | GitHub | bert-base-uncased stands out as a top performer that balances efficiency and precision, even outperforming roberta-large. | Blog | |
| Unveiling the Impact of Weight Decay on MBart-large-50 for English-Spanish Translation | Hugging Face; GitHub | Weight decay has only a muted influence on MBart-50's English-Spanish translation performance. | Blog | |
| Fine-Tuning Llama-2-7b on the Databricks-Dolly-15k Dataset and Evaluating with BigBench-Hard | Hugging Face; GitHub | Handles general questions well, but responses sometimes diverged from the expected answers on BigBench-Hard questions. train_loss = 2.343 | Blog | |
| Comparative Analysis of Adapter vs. Full Fine-Tuning on RoBERTa | Hugging Face; GitHub | Unexpectedly, adapters outperformed the fully fine-tuned RoBERTa model; more experiments are needed for a firm conclusion. | Blog | |
| CodeBERT-based Password Strength Classifier | Hugging Face; GitHub | Data visualization charts; handled imbalanced data; casing affected password strength. | Blog | |
| BERT-base MCQA | Hugging Face; GitHub | Model overfitted and did not perform well. | | |
| Fine-tuning 4-bit Llama-2-7b with Flash Attention Using DPO | GitHub | Training halted prematurely. | Blog | |
| Sentence-t5-large Quora Text Similarity Checker | Hugging Face; GitHub | | | |
| Stable Diffusion Prompt Generator | Hugging Face; GitHub | | | |
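
Several of the projects above revolve around LoRA fine-tuning of Llama-2 with the Hugging Face PEFT library. The snippet below is a minimal, illustrative sketch of that kind of setup, not the exact training code from any one project; the model ID, dataset, and hyperparameter values (including lora_alpha and lora_dropout) are placeholders.

```python
# Minimal LoRA fine-tuning sketch with Hugging Face PEFT (illustrative placeholders throughout).
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_id = "meta-llama/Llama-2-7b-hf"  # gated model; any causal LM id works for the sketch
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id)

# LoRA hyperparameters of the kind compared above (lora_alpha, lora_dropout)
peft_config = LoraConfig(task_type="CAUSAL_LM", r=16, lora_alpha=32, lora_dropout=0.05,
                         target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()  # only the adapter weights are trainable

# Placeholder instruction dataset (used in one of the projects above)
dataset = load_dataset("databricks/databricks-dolly-15k", split="train")

def tokenize(batch):
    return tokenizer(batch["instruction"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama2-lora-sketch",
                           per_device_train_batch_size=4, num_train_epochs=1, logging_steps=50),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),  # labels = input_ids
)
trainer.train()
```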

## 🕹 Reinforcement Learning Projects

| Index | Environment | Best Checkpoint | mean_reward | Demo |
|---|---|---|---|---|
| 1 | PPO agent playing LunarLander-v2 | Checkpoint | 280.89 | Demo |
| 2 | Q-Learning agent playing Taxi-v3 | Checkpoint | 4.85 | Demo |
| 3 | DQN agent playing SpaceInvadersNoFrameskip-v4 | Checkpoint | 502.78 | Demo |
| 4 | Reinforce agent playing CartPole-v1 | Checkpoint | 500 | Demo |
| 5 | Reinforce agent playing Pixelcopter-PLE-v0 | Checkpoint | 25.01 | Demo |
| 6 | A2C agent playing PandaReachDense-v2 | Checkpoint | -1.66 | Demo |
| 7 | PPO model trained on the doom_health_gathering_supreme environment | Checkpoint | 7.12 | Demo |
| 8 | PPO agent playing SnowballTarget using the Unity ML-Agents library | Checkpoint | 0 | Demo |
| 9 | PPO agent playing Pyramids using the Unity ML-Agents library | Checkpoint | 0 | Demo |
| 10 | POCA agent playing SoccerTwos using the Unity ML-Agents library | Checkpoint | 0 | Demo |
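
The LunarLander-v2 entry above was trained with PPO; a minimal sketch of that kind of training and evaluation loop with Stable-Baselines3 is shown below. The hyperparameters and timestep budget are illustrative, not the exact ones behind the reported mean_reward.

```python
# Minimal sketch: train and evaluate a PPO agent on LunarLander-v2 with Stable-Baselines3.
import gymnasium as gym
from stable_baselines3 import PPO
from stable_baselines3.common.evaluation import evaluate_policy

env = gym.make("LunarLander-v2")  # on newer Gymnasium releases the id is "LunarLander-v3"
model = PPO("MlpPolicy", env, n_steps=1024, batch_size=64, gamma=0.999, verbose=1)
model.learn(total_timesteps=1_000_000)

# mean_reward is the metric reported in the table above
mean_reward, std_reward = evaluate_policy(model, env, n_eval_episodes=10, deterministic=True)
print(f"mean_reward = {mean_reward:.2f} +/- {std_reward:.2f}")
model.save("ppo-LunarLander-v2")
```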

## 🎶 Audio Projects

| Project Name | Checkpoint | Metrics | Demo |
|---|---|---|---|
| ASR Using Whisper | Hugging Face; GitHub | | |
| Text-to-Speech Using SpeechT5 | Hugging Face; GitHub | | |
| DistilHuBERT Fine-tuned on GTZAN for Audio Classification | Hugging Face; GitHub | | |
| Wav2Vec2 Fine-tuned on the MESD Dataset for Emotion Classification | Hugging Face | Accuracy = 91.54% | |
| Timestamp Prediction Using Whisper | GitHub | | |
| Wav2Vec2 Fine-tuned for Keyword Spotting | Hugging Face | | |
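
For the ASR and timestamp-prediction entries above, inference with a Whisper checkpoint typically looks like the short sketch below; the model id and audio file are placeholders, with a fine-tuned checkpoint swapped in as needed.

```python
# Minimal sketch of Whisper ASR inference with timestamps via the transformers pipeline.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-small",  # placeholder; replace with a fine-tuned checkpoint
    chunk_length_s=30,             # handle audio longer than Whisper's 30 s window
)

result = asr("sample.wav", return_timestamps=True)  # timestamps as in the project above
print(result["text"])
for chunk in result.get("chunks", []):
    print(chunk["timestamp"], chunk["text"])
```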

Thank you for taking the time to explore my journey! 👨‍💻
