As an experienced data professional grounded in analytics, economics, and business administration, I excel at turning complex data into actionable insights. My strength lies in designing data solutions that not only support but enhance strategic decision-making. I also thrive as a trailblazer, initiating data-driven processes that establish robust, scalable systems.
Music similarity algorithm for recommendation based on acoustic features
My contribution to this paper focused on feature-extraction algorithms in Python (TensorFlow and Essentia). We extended the methodology of an academic paper for measuring music similarity, combining objective features such as tempo and loudness with subjective features such as danceability, the latter estimated with TensorFlow. Features were normalized and adjusted for consistency, and similarity was calculated as the inverse of the 1d Minkowski distance.
View Academic Poster Presentation
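As a rough illustration of the similarity step (a minimal sketch, not the code from the paper), the snippet below min-max normalizes a small feature matrix and scores two tracks by the inverse of their Minkowski distance. The min-max scaling, the order `p` (left as a parameter and defaulted to 1), the `eps` guard, and the feature values are all assumptions made for the example.

```python
import numpy as np

def min_max_normalize(features):
    """Scale each feature column to [0, 1]; constant columns map to 0."""
    features = np.asarray(features, dtype=float)
    lo = features.min(axis=0)
    span = features.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid dividing by zero on constant columns
    return (features - lo) / span

def minkowski_similarity(a, b, p=1, eps=1e-9):
    """Inverse Minkowski distance of order p; eps guards identical vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    distance = np.sum(np.abs(a - b) ** p) ** (1.0 / p)
    return 1.0 / (distance + eps)

# Hypothetical tracks: [tempo (BPM), loudness (dB), danceability]
tracks = [
    [120.0, -8.0, 0.80],
    [124.0, -7.5, 0.75],
    [70.0, -20.0, 0.20],
]
normalized = min_max_normalize(tracks)
sim_close = minkowski_similarity(normalized[0], normalized[1])
sim_far = minkowski_similarity(normalized[0], normalized[2])
print(f"similar pair: {sim_close:.3f}, dissimilar pair: {sim_far:.3f}")
```

Normalizing first keeps a wide-ranging feature such as tempo from dominating the distance.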
Scraping and Processing 500,000 PDF files to create search functionality
To build a search function for Chile's "Diario Oficial" and make public data more accessible, I scraped half a million PDF files with Python and Beautiful Soup, used AWS API Gateway for IP rotation, and managed the data on Google Cloud Platform. I implemented multithreading and optimized database queries to handle the data volume within a scalable infrastructure.
View Project
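The multithreaded download step could look roughly like the sketch below, which fans a list of URLs out to a thread pool and collects successes and failures separately. This is an illustration under stated assumptions, not the project's code: `download_all`, `fake_fetch`, and the example URLs are hypothetical, and the fetch function is passed in so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def download_all(urls, fetch, max_workers=8):
    """Run fetch(url) concurrently; return (results, errors) keyed by URL."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:
                # Record the failure so one bad file does not stop the batch.
                errors[url] = exc
    return results, errors

def fake_fetch(url):
    """Hypothetical stand-in for an HTTP download of one PDF."""
    if url.endswith("missing.pdf"):
        raise IOError(f"could not fetch {url}")
    return b"%PDF-" + url.encode()

# Hypothetical URLs, used only to exercise the function.
urls = [
    "https://example.org/a.pdf",
    "https://example.org/b.pdf",
    "https://example.org/missing.pdf",
]
results, errors = download_all(urls, fake_fetch, max_workers=4)
print(f"{len(results)} downloaded, {len(errors)} failed")
```

Threads suit this workload because each download spends most of its time waiting on I/O. In a real run, `fetch` would wrap the actual HTTP request.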
Early Prediction of Student Success: A Comparison of Statistical Methods for Multiclass Classification
For this academic project, my main contributions were data preparation, class-imbalance treatment, and algorithm fine-tuning. Following the original authors' approach, we computed the F1 score and accuracy for each modeling technique: multinomial logistic regression with lasso, decision trees, XGBoost (both 1.0 and 2.0), bagging trees, random forests, and two-variable logistic regression. Random forests and two-variable logistic regression were run on one-hot encoded datasets that included the original dataset as well as datasets that were “boosted” using both SMOTENC and ADASYN. The F1 scores of our models were generally much higher than those of the original paper.
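To show why F1 is reported alongside accuracy on imbalanced multiclass data, here is a minimal sketch that computes both from scratch. The macro averaging scheme, the class labels, and the predictions are assumptions for illustration. They are not the project's data or necessarily the averaging used in the paper.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare classes count equally."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    scores = []
    for label in np.unique(np.concatenate([y_true, y_pred])):
        tp = np.sum((y_true == label) & (y_pred == label))
        fp = np.sum((y_true != label) & (y_pred == label))
        fn = np.sum((y_true == label) & (y_pred != label))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(scores))

# Hypothetical labels and predictions
y_true = ["dropout", "enrolled", "graduate", "graduate", "graduate", "dropout"]
y_pred = ["dropout", "graduate", "graduate", "graduate", "enrolled", "dropout"]
acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.3f}, macro F1={f1:.3f}")
```

In this toy case the minority class is never predicted correctly, which pulls macro F1 well below accuracy.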
An algebraic approach to Neural Networks: a Pure Python implementation
I implemented a mathematical approach to neural networks, focusing on computational efficiency and accuracy. The project uses only pure Python and NumPy and includes an overview of the underlying mathematical theory.
View Project
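For a flavor of what a matrix-based network looks like in NumPy, the sketch below trains a one-hidden-layer network on XOR with full-batch gradient descent. It is a generic illustration, not the project's implementation: the architecture, sigmoid activations, mean-squared-error loss, and hyperparameters are all assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, y, hidden=4, lr=0.5, epochs=2000, seed=0):
    """Train a 2-layer sigmoid network; return parameters and loss history."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(size=(hidden, 1))
    b2 = np.zeros(1)
    losses = []
    for _ in range(epochs):
        # Forward pass: two affine maps, each followed by a sigmoid.
        A1 = sigmoid(X @ W1 + b1)
        A2 = sigmoid(A1 @ W2 + b2)
        losses.append(float(np.mean((A2 - y) ** 2)))
        # Backward pass: chain rule written as matrix products.
        dZ2 = 2.0 * (A2 - y) / len(X) * A2 * (1.0 - A2)
        dZ1 = (dZ2 @ W2.T) * A1 * (1.0 - A1)
        # Gradient descent update.
        W2 -= lr * (A1.T @ dZ2)
        b2 -= lr * dZ2.sum(axis=0)
        W1 -= lr * (X.T @ dZ1)
        b1 -= lr * dZ1.sum(axis=0)
    return (W1, b1, W2, b2), losses

X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([[0.0], [1.0], [1.0], [0.0]])
params, losses = train(X, y)
print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

Expressing each layer as a matrix product lets the whole batch move through the network in one step, without looping over samples or neurons.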
I am keen to push the boundaries of data science and explore new challenges. If you are looking for a professional who transforms data into assets, let’s connect.