forked from HugoBlox/theme-academic-cv

Source code for my personal website.


dsevero/dsevero

 
 


Research Engineer

Meta - Fundamental AI Research (FAIR) Labs

I am originally from Florianópolis, Brazil, but I've lived in New Jersey, Orlando, Toronto (now), and São Paulo, as well as other smaller cities in the south of Brazil. I spent 2022 as a Student Researcher at Google AI with Lucas Theis and Johannes Ballé.

Google Scholar · X (Twitter) · CV

Research Interests

I'm interested in information theory, machine learning, and AI.

Compression of non-sequential data

Lossless compression algorithms typically preserve the order in which data points are encoded. However, for some data types order is not meaningful: collections of files, rows in a database, nodes in a graph, and, notably, datasets in machine learning applications.

These objects can still be compressed with traditional algorithms by picking an order for the elements and communicating the resulting sequence. However, unless the order information is removed during encoding, this procedure is sub-optimal: the arbitrary order itself carries information, so more bits are used to represent the source than are truly necessary.
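To make the overhead concrete, here is a small Python sketch (the function name is mine, not from any of the papers below) that counts how many bits an arbitrary ordering wastes: log2 of the number of distinct sequences that represent the same multiset.

```python
from collections import Counter
from math import factorial, log2

def order_overhead_bits(multiset):
    """Bits wasted by committing to one arbitrary ordering of a
    multiset: log2 of the number of distinct sequences mapping to
    the same multiset, i.e. n! divided by the factorial of each
    element's multiplicity."""
    distinct_orderings = factorial(len(multiset))
    for multiplicity in Counter(multiset).values():
        distinct_orderings //= factorial(multiplicity)
    return log2(distinct_orderings)

print(order_overhead_bits(range(10)))  # 10 distinct items: log2(10!) ≈ 21.8 bits
print(order_overhead_bits("aab"))      # 3!/2! = 3 orderings: log2(3) ≈ 1.58 bits
```

For a machine-learning dataset with millions of distinct examples, log2(n!) grows as roughly n·log2(n) bits, which is why removing the order matters at scale.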

In previous work, we gave a formal definition of non-sequential objects as random sets of equivalent sequences, which we call Combinatorial Random Variables (CRVs), together with a general class of computationally efficient algorithms that achieve the optimal compression rate for CRVs: Random Permutation Codes (RPCs). Specialized RPCs are given for multisets (Random Order Coding), graphs (Random Edge Coding), and partitions/clusterings (under review), providing new algorithms for compressing databases, social networks, and web data in the JSON file format.
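The bits-back mechanism behind these codes can be illustrated on the smallest possible case: a multiset of two distinct bytes. The encoder *decodes* one bit from previously compressed data to choose the transmission order; the decoder, after recovering the sequence, re-derives that bit and pushes it back, so the bit is communicated for free. This toy sketch is my own illustration, not the actual Random Order Coding algorithm (which uses ANS and handles arbitrary multisets):

```python
def encode(multiset, bit_stack):
    """Encode a two-element multiset of distinct bytes. One bit is
    popped from bit_stack to pick the order, so that bit rides along
    for free inside the payload (the bits-back trick)."""
    a, b = sorted(multiset)
    assert a != b, "toy sketch: distinct elements only"
    flip = bit_stack.pop() if bit_stack else 0
    seq = (b, a) if flip else (a, b)
    return bytes(seq)  # any sequential code works; raw bytes for clarity

def decode(payload, bit_stack):
    """Recover the multiset and push the order bit back on the stack."""
    seq = tuple(payload)
    flip = 0 if seq == tuple(sorted(seq)) else 1
    bit_stack.append(flip)
    return sorted(seq)

stack = [1]                      # pretend this bit came from earlier data
payload = encode({3, 5}, stack)  # the bit now lives in the chosen order
assert decode(payload, stack) == [3, 5] and stack == [1]
```

The net cost is the cost of the sequence minus one bit; in general, bits-back recovers the full log2(n!) (adjusted for duplicates) that an arbitrary ordering would waste.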

Currently, I'm interested in the application of RPCs to reduce the memory footprint of vector databases.

Latest News

April 2024 - I've moved to Montréal to start as a Research Engineer at FAIR Labs!

March 2024 - LASI and Shuffle Coding were accepted to ICLR 2024.

August 2023 - I started a second internship at FAIR (Meta AI) in information theory and generative modelling with Matthew Muckley.

April 2023 - Random Edge Coding and Action Matching were accepted to ICML 2023.

Tutorials and Workshops

Recommended readings (by other authors)

Selected Publications and Preprints

For a complete list, please see my Google Scholar profile.

The Unreasonable Effectiveness of Linear Prediction as a Perceptual Metric
Daniel Severo, Lucas Theis, Johannes Ballé
International Conference on Learning Representations (ICLR), 2024

Random Edge Coding: One-Shot Bits-Back Coding of Large Labeled Graphs
Daniel Severo, James Townsend, Ashish Khisti, Alireza Makhzani
International Conference on Machine Learning (ICML), 2023

Action Matching: Learning Stochastic Dynamics from Samples
Kirill Neklyudov, Rob Brekelmans, Daniel Severo, Alireza Makhzani
International Conference on Machine Learning (ICML), 2023

Compressing Multisets with Large Alphabets using Bits-Back Coding
Daniel Severo, James Townsend, Ashish Khisti, Alireza Makhzani, Karen Ullrich
IEEE Journal on Selected Areas in Information Theory, 2023
Best Paper Award at NeurIPS Workshop on DGMs, 2021
