- I have worked as a data scientist since 2017.
- My work experience can be found on my LinkedIn profile (Jakub Bartczuk).
- I studied Theoretical Mathematics (BSc) and Data Science (MSc) at the University of Wrocław.
- I am mostly focused on deep learning, especially for NLP and Computer Vision. I enjoy problems that go beyond straightforward supervised learning. Extending standard search engine methods is one such problem.
- In my free time I like to tinker with open source.
- The languages I enjoy the most are Elixir, Lisps (Elixir is kinda a Lisp), Rust and Python.
- When I need a break from sitting at the computer, I train martial arts and like to read about mathematics, linguistics, Buddhism and psychology.
- At deepsense.ai I worked on TrelBERT, a BERT model for Polish Twitter.
- In findkit I put together wrappers that make information retrieval over vector data easier.
- NewsBERT is an information retrieval app for RSS feeds that uses the zero-shot learning feature of huggingface transformers.
- niph makes searching podcasts easier. The inspiration was Karpathy's transcription of the Lex Fridman Podcast. It is currently tested only with the Lex Fridman Podcast, but it will also work on transcriptions with a similar format.
With over 500 starred repositories, searching through them became cumbersome. I built a small project for retrieval over starred repositories, which looked promising, but it is hard to gauge how useful such a solution would be in practice.
In my thesis I use PapersWithCode data for information retrieval.
PapersWithCode contains links between papers and repositories that implement them. Most repositories are tagged with at least one task like "unsupervised segmentation" or "semantic parsing".
I proposed and built a system that, among other things, uses zero-shot learning and features extracted with Graph Neural Networks from the dependency (call) graph of Python files and functions.
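To give a rough idea of what a function call graph looks like as data, here is a minimal sketch that extracts one from Python source with the standard-library `ast` module. It is not the thesis implementation: the name `call_graph` and the toy `example` source are made up for illustration, and it only tracks direct calls by plain name from top-level functions (method calls such as `text.split()` are ignored).

```python
import ast


def call_graph(source: str) -> dict:
    """Map each top-level function name to the set of names it calls directly."""
    tree = ast.parse(source)
    graph = {}
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Functions that call nothing still get an (empty) entry.
            callees = graph.setdefault(node.name, set())
            for inner in ast.walk(node):
                # Only plain-name calls like f(x); attribute calls are skipped.
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    callees.add(inner.func.id)
    return graph


# Hypothetical toy module used only to demonstrate the function.
example = '''
def load(path):
    return open(path).read()

def tokenize(text):
    return text.split()

def pipeline(path):
    return tokenize(load(path))
'''

graph = call_graph(example)
for caller, callees in graph.items():
    print(caller, "->", sorted(callees))
```

The resulting edges (caller to callee) are the kind of structure a Graph Neural Network could take as input, with per-function features attached to the nodes.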
- search huggingface models
- search Lex Fridman Podcast - searches Andrej Karpathy's transcription site and can fetch a precise link to the timestamped entry
- symmetric deep dream - I added averaging over rotations to make deep dream more symmetric
- Demystifying UMAP - presentation for the Advanced Data Mining seminar at the University of Wrocław.
- manifold learning rotation equivariance on raw pixel values - it seems that Isomap and UMAP partly preserve the rotational structure (for a given image, the closest images are the closest rotated one and its 180-degree rotation). These algorithms are not designed to do this!
- neural nets like it's 2010 but in a modern framework... RBMs in JAX
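The "averaging over rotations" idea from the symmetric deep dream item can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the project's code: it uses only the four 90-degree rotations of a square image, and `rotation_averaged` and `toy_gradient` are hypothetical names standing in for whatever update the actual deep dream step computes.

```python
import numpy as np


def rotation_averaged(fn, image):
    """Average fn over the four 90-degree rotations of a square image.

    Each output is rotated back before averaging, so the averaged
    function commutes with 90-degree rotations even if fn does not.
    """
    outputs = []
    for k in range(4):
        rotated = np.rot90(image, k)
        outputs.append(np.rot90(fn(rotated), -k))
    return np.mean(outputs, axis=0)


# Hypothetical stand-in for a gradient step; deliberately not rotation-equivariant.
rng = np.random.default_rng(0)
weights = rng.normal(size=(8, 8))


def toy_gradient(image):
    return image * weights


image = rng.normal(size=(8, 8))
averaged = rotation_averaged(toy_gradient, image)

# Rotating the input now simply rotates the output.
print(np.allclose(rotation_averaged(toy_gradient, np.rot90(image)), np.rot90(averaged)))
```

Averaging over a finite rotation group like this is a generic way to symmetrize a function; finer angles would need interpolation and are left out of the sketch.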