Showing 1–31 of 31 results for author: Moreau, T

  1. arXiv:2407.11676  [pdf, other]

    cs.LG cs.AI stat.ME stat.ML

    SKADA-Bench: Benchmarking Unsupervised Domain Adaptation Methods with Realistic Validation

    Authors: Yanis Lalou, Théo Gnassounou, Antoine Collas, Antoine de Mathelin, Oleksii Kachaiev, Ambroise Odonnat, Alexandre Gramfort, Thomas Moreau, Rémi Flamary

    Abstract: Unsupervised Domain Adaptation (DA) consists of adapting a model trained on a labeled source domain to perform well on an unlabeled target domain with some data distribution shift. While many methods have been proposed in the literature, fair and realistic evaluation remains an open question, particularly due to methodological difficulties in selecting hyperparameters in the unsupervised setting.…

    Submitted 16 July, 2024; originally announced July 2024.

  2. arXiv:2406.16938  [pdf, other]

    eess.SP cs.LG stat.ML

    Unmixing Noise from Hawkes Process to Model Learned Physiological Events

    Authors: Guillaume Staerman, Virginie Loison, Thomas Moreau

    Abstract: Physiological signal analysis often involves identifying events crucial to understanding biological dynamics. Traditional methods rely on handcrafted procedures or supervised learning, presenting challenges such as expert dependence, lack of robustness, and the need for extensive labeled data. Data-driven methods like Convolutional Dictionary Learning (CDL) offer an alternative but tend to produce…

    Submitted 17 June, 2024; originally announced June 2024.

  3. arXiv:2406.06849  [pdf, other]

    stat.ML cs.LG

    Flexible Parametric Inference for Space-Time Hawkes Processes

    Authors: Emilia Siviero, Guillaume Staerman, Stephan Clémençon, Thomas Moreau

    Abstract: Many modern spatio-temporal data sets, in sociology, epidemiology or seismology, for example, exhibit self-exciting characteristics, triggering and clustering behaviors both at the same time, that a suitable Hawkes space-time process can accurately capture. This paper aims to develop a fast and flexible parametric inference technique to recover the parameters of the kernel functions involved in th…

    Submitted 17 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

  4. arXiv:2308.16022  [pdf, other]

    stat.ML cs.LG

    PAVI: Plate-Amortized Variational Inference

    Authors: Louis Rouillard, Alexandre Le Bris, Thomas Moreau, Demian Wassermann

    Abstract: Given observed data and a probabilistic generative model, Bayesian inference searches for the distribution of the model's parameters that could have yielded the data. Inference is challenging for large population studies where millions of measurements are performed over a cohort of hundreds of subjects, resulting in a massive parameter space. This large cardinality renders off-the-shelf Variationa…

    Submitted 30 August, 2023; originally announced August 2023.

  5. arXiv:2305.15042  [pdf, other]

    cs.LG stat.ML

    Test like you Train in Implicit Deep Learning

    Authors: Zaccharie Ramzi, Pierre Ablin, Gabriel Peyré, Thomas Moreau

    Abstract: Implicit deep learning has recently gained popularity with applications ranging from meta-learning to Deep Equilibrium Networks (DEQs). In its general formulation, it relies on expressing some components of deep learning pipelines implicitly, typically via a root equation called the inner problem. In practice, the solution of the inner problem is approximated during training with an iterative proc…

    Submitted 24 May, 2023; originally announced May 2023.

  6. arXiv:2303.05798  [pdf, other]

    cs.LG eess.SP stat.ML

    Sliced-Wasserstein on Symmetric Positive Definite Matrices for M/EEG Signals

    Authors: Clément Bonet, Benoît Malézieux, Alain Rakotomamonjy, Lucas Drumetz, Thomas Moreau, Matthieu Kowalski, Nicolas Courty

    Abstract: When dealing with electro or magnetoencephalography records, many supervised prediction tasks are solved by working with covariance matrices to summarize the signals. Learning with these matrices requires using Riemannian geometry to account for their structure. In this paper, we propose a new method to deal with distributions of covariance matrices and demonstrate its computational efficiency on M…

    Submitted 24 May, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

    Comments: Published as a conference paper at ICML 2023

  7. arXiv:2302.08766  [pdf, other]

    stat.ML cs.LG math.OC

    A Lower Bound and a Near-Optimal Algorithm for Bilevel Empirical Risk Minimization

    Authors: Mathieu Dagréou, Thomas Moreau, Samuel Vaiter, Pierre Ablin

    Abstract: Bilevel optimization problems, which are problems where two optimization problems are nested, have more and more applications in machine learning. In many practical cases, the upper and the lower objectives correspond to empirical risk minimization problems and therefore have a sum structure. In this context, we propose a bilevel extension of the celebrated SARAH algorithm. We demonstrate that the…

    Submitted 20 February, 2024; v1 submitted 17 February, 2023; originally announced February 2023.

    Comments: Accepted at AISTATS 2024

  8. arXiv:2210.04635  [pdf, other]

    stat.ML cs.LG

    FaDIn: Fast Discretized Inference for Hawkes Processes with General Parametric Kernels

    Authors: Guillaume Staerman, Cédric Allain, Alexandre Gramfort, Thomas Moreau

    Abstract: Temporal point processes (TPP) are a natural tool for modeling event-based data. Among all TPP models, Hawkes processes have proven to be the most widely used, mainly due to their adequate modeling for various applications, particularly when considering exponential or non-parametric kernels. Although non-parametric kernels are an option, such models require large datasets. While exponential kernel…

    Submitted 2 August, 2023; v1 submitted 10 October, 2022; originally announced October 2022.

  9. arXiv:2206.13424  [pdf, other]

    cs.LG math.OC stat.ML

    Benchopt: Reproducible, efficient and collaborative optimization benchmarks

    Authors: Thomas Moreau, Mathurin Massias, Alexandre Gramfort, Pierre Ablin, Pierre-Antoine Bannier, Benjamin Charlier, Mathieu Dagréou, Tom Dupré la Tour, Ghislain Durif, Cassio F. Dantas, Quentin Klopfenstein, Johan Larsson, En Lai, Tanguy Lefort, Benoit Malézieux, Badr Moufad, Binh T. Nguyen, Alain Rakotomamonjy, Zaccharie Ramzi, Joseph Salmon, Samuel Vaiter

    Abstract: Numerical validation is at the core of machine learning research as it allows to assess the actual impact of new methods, and to confirm the agreement between theory and practice. Yet, the rapid development of the field poses several challenges: researchers are confronted with a profusion of methods to compare, limited transparency and consensus on best practices, as well as tedious re-implementat…

    Submitted 28 October, 2022; v1 submitted 27 June, 2022; originally announced June 2022.

    Comments: Accepted in proceedings of NeurIPS 22; Benchopt library documentation is available at https://benchopt.github.io/

  10. arXiv:2206.05111  [pdf, other]

    cs.AI cs.LG q-bio.NC stat.ME stat.ML

    PAVI: Plate-Amortized Variational Inference

    Authors: Louis Rouillard, Thomas Moreau, Demian Wassermann

    Abstract: Given some observed data and a probabilistic generative model, Bayesian inference aims at obtaining the distribution of a model's latent parameters that could have yielded the data. This task is challenging for large population studies where thousands of measurements are performed over a cohort of hundreds of subjects, resulting in a massive latent parameter space. This large cardinality renders o…

    Submitted 10 June, 2022; originally announced June 2022.

  11. arXiv:2201.13409  [pdf, other]

    stat.ML cs.LG math.OC

    A framework for bilevel optimization that enables stochastic and global variance reduction algorithms

    Authors: Mathieu Dagréou, Pierre Ablin, Samuel Vaiter, Thomas Moreau

    Abstract: Bilevel optimization, the problem of minimizing a value function which involves the arg-minimum of another function, appears in many areas of machine learning. In a large scale empirical risk minimization setting where the number of samples is huge, it is crucial to develop stochastic methods, which only use a few samples at a time to progress. However, computing the gradient of the value function…

    Submitted 10 November, 2022; v1 submitted 31 January, 2022; originally announced January 2022.

    Comments: Accepted at NeurIPS 2022

  12. arXiv:2112.06652  [pdf, other]

    eess.SP cs.LG math.ST stat.AP

    DriPP: Driven Point Processes to Model Stimuli Induced Patterns in M/EEG Signals

    Authors: Cédric Allain, Alexandre Gramfort, Thomas Moreau

    Abstract: The quantitative analysis of non-invasive electrophysiology signals from electroencephalography (EEG) and magnetoencephalography (MEG) boils down to the identification of temporal patterns such as evoked responses, transient bursts of neural oscillations but also blinks or heartbeats for data cleaning. Several works have shown that these patterns can be extracted efficiently in an unsupervised way…

    Submitted 11 July, 2022; v1 submitted 8 December, 2021; originally announced December 2021.

  13. arXiv:2106.06338  [pdf, other]

    cs.LG math.OC stat.ML

    Understanding approximate and unrolled dictionary learning for pattern recovery

    Authors: Benoît Malézieux, Thomas Moreau, Matthieu Kowalski

    Abstract: Dictionary learning consists of finding a sparse representation from noisy data and is a common way to encode data-driven prior knowledge on signals. Alternating minimization (AM) is standard for the underlying optimization, where gradient descent steps alternate with sparse coding procedures. The major drawback of this method is its prohibitive computational cost, making it impractical on large r…

    Submitted 8 February, 2022; v1 submitted 11 June, 2021; originally announced June 2021.

  14. arXiv:2106.00553  [pdf, other]

    cs.LG stat.ML

    SHINE: SHaring the INverse Estimate from the forward pass for bi-level optimization and implicit models

    Authors: Zaccharie Ramzi, Florian Mannel, Shaojie Bai, Jean-Luc Starck, Philippe Ciuciu, Thomas Moreau

    Abstract: In recent years, implicit deep learning has emerged as a method to increase the effective depth of deep neural networks. While their training is memory-efficient, they are still significantly slower to train than their explicit counterparts. In Deep Equilibrium Models (DEQs), the training is performed as a bi-level problem, and its computational complexity is partially driven by the iterative inve…

    Submitted 10 March, 2023; v1 submitted 1 June, 2021; originally announced June 2021.

    Comments: Accepted as a spotlight to ICLR 2022

  15. arXiv:2102.06477  [pdf, other]

    stat.ML cs.LG q-bio.QM

    HNPE: Leveraging Global Parameters for Neural Posterior Estimation

    Authors: Pedro L. C. Rodrigues, Thomas Moreau, Gilles Louppe, Alexandre Gramfort

    Abstract: Inferring the parameters of a stochastic model based on experimental observations is central to the scientific method. A particularly challenging setting is when the model is strongly indeterminate, i.e. when distinct sets of parameters yield identical observations. This arises in many practical situations, such as when inferring the distance and power of a radio source (is the source close and we…

    Submitted 9 November, 2021; v1 submitted 12 February, 2021; originally announced February 2021.

  16. arXiv:2010.09545  [pdf, other]

    math.OC stat.ML

    Learning to solve TV regularized problems with unrolled algorithms

    Authors: Hamza Cherkaoui, Jeremias Sulam, Thomas Moreau

    Abstract: Total Variation (TV) is a popular regularization strategy that promotes piece-wise constant signals by constraining the $\ell_1$-norm of the first order derivative of the estimated signal. The resulting optimization problem is usually solved using iterative algorithms such as proximal gradient descent, primal-dual algorithms or ADMM. However, such methods can require a very large number of iterati…

    Submitted 19 October, 2020; originally announced October 2020.

    Comments: Accepted to NeurIPS 2020

  17. arXiv:2007.01627  [pdf, other]

    cs.LG cs.AI stat.ML

    NeuMiss networks: differentiable programming for supervised learning with missing values

    Authors: Marine Le Morvan, Julie Josse, Thomas Moreau, Erwan Scornet, Gaël Varoquaux

    Abstract: The presence of missing values makes supervised learning much more challenging. Indeed, previous work has shown that even when the response is a linear function of the complete data, the optimal predictor is a complex function of the observed entries and the missingness indicator. As a result, the computational or sample complexities of consistent approaches depend on the number of missing pattern…

    Submitted 4 November, 2020; v1 submitted 3 July, 2020; originally announced July 2020.

    Journal ref: Advances in Neural Information Processing Systems 33, Dec 2020, Vancouver, Canada

  18. arXiv:2002.03722  [pdf, other]

    stat.ML cs.LG

    Super-efficiency of automatic differentiation for functions defined as a minimum

    Authors: Pierre Ablin, Gabriel Peyré, Thomas Moreau

    Abstract: In min-min optimization or max-min optimization, one has to compute the gradient of a function defined as a minimum. In most cases, the minimum has no closed-form, and an approximation is obtained via an iterative algorithm. There are two usual ways of estimating the gradient of the function: using either an analytic formula obtained by assuming exactness of the approximation, or automatic differe…

    Submitted 10 February, 2020; originally announced February 2020.

    Comments: 31 pages

  19. arXiv:1905.11071  [pdf, other]

    stat.ML cs.LG

    Learning step sizes for unfolded sparse coding

    Authors: Pierre Ablin, Thomas Moreau, Mathurin Massias, Alexandre Gramfort

    Abstract: Sparse coding is typically solved by iterative optimization techniques, such as the Iterative Shrinkage-Thresholding Algorithm (ISTA). Unfolding and learning weights of ISTA using neural networks is a practical way to accelerate estimation. In this paper, we study the selection of adapted step sizes for ISTA. We show that a simple step size strategy can improve the convergence rate of ISTA by leve…

    Submitted 27 May, 2019; originally announced May 2019.

    Comments: 22 pages

  20. arXiv:1904.08368  [pdf, other]

    cs.LG cs.PL stat.ML

    Relay: A High-Level Compiler for Deep Learning

    Authors: Jared Roesch, Steven Lyubomirsky, Marisa Kirisame, Logan Weber, Josh Pollock, Luis Vega, Ziheng Jiang, Tianqi Chen, Thierry Moreau, Zachary Tatlock

    Abstract: Frameworks for writing, compiling, and optimizing deep learning (DL) models have recently enabled progress in areas like computer vision and natural language processing. Extending these frameworks to accommodate the rapidly diversifying landscape of DL models and hardware platforms presents challenging tradeoffs between expressivity, composability, and portability. We present Relay, a new compiler…

    Submitted 24 August, 2019; v1 submitted 17 April, 2019; originally announced April 2019.

  21. arXiv:1901.09235  [pdf, other]

    cs.LG cs.DC stat.ML

    Distributed Convolutional Dictionary Learning (DiCoDiLe): Pattern Discovery in Large Images and Signals

    Authors: Thomas Moreau, Alexandre Gramfort

    Abstract: Convolutional dictionary learning (CDL) estimates shift invariant basis adapted to multidimensional data. CDL has proven useful for image denoising or inpainting, as well as for pattern discovery on multivariate signals. As estimated patterns can be positioned anywhere in signals or images, optimization techniques face the difficulty of working in extremely high dimensions with millions of pixels…

    Submitted 26 January, 2019; originally announced January 2019.

  22. arXiv:1810.11066  [pdf, other]

    cs.LG stat.ML

    Automating Generation of Low Precision Deep Learning Operators

    Authors: Meghan Cowan, Thierry Moreau, Tianqi Chen, Luis Ceze

    Abstract: State of the art deep learning models have made steady progress in the fields of computer vision and natural language processing, at the expense of growing model sizes and computational complexity. Deploying these models on low power and mobile devices poses a challenge due to their limited compute capabilities and strict energy budgets. One solution that has generated significant research interes…

    Submitted 25 October, 2018; originally announced October 2018.

    Comments: 10 pages, 11 figures

  23. arXiv:1807.04188  [pdf, other]

    cs.LG cs.DC stat.ML

    A Hardware-Software Blueprint for Flexible Deep Learning Specialization

    Authors: Thierry Moreau, Tianqi Chen, Luis Vega, Jared Roesch, Eddie Yan, Lianmin Zheng, Josh Fromm, Ziheng Jiang, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy

    Abstract: Specialized Deep Learning (DL) acceleration stacks, designed for a specific set of frameworks, model architectures, operators, and data types, offer the allure of high performance while sacrificing flexibility. Changes in algorithms, models, operators, or numerical systems threaten the viability of specialized hardware accelerators. We propose VTA, a programmable deep learning architecture templat…

    Submitted 22 April, 2019; v1 submitted 11 July, 2018; originally announced July 2018.

    Comments: 6 pages plus references, 8 figures

  24. arXiv:1805.09654  [pdf, other]

    eess.SP cs.LG stat.ML

    Multivariate Convolutional Sparse Coding for Electromagnetic Brain Signals

    Authors: Tom Dupré La Tour, Thomas Moreau, Mainak Jas, Alexandre Gramfort

    Abstract: Frequency-specific patterns of neural activity are traditionally interpreted as sustained rhythmic oscillations, and related to cognitive mechanisms such as attention, high level visual processing or motor control. While alpha waves (8-12 Hz) are known to closely resemble short sinusoids, and thus are revealed by Fourier analysis or wavelet transforms, there is an evolving debate that electromagne…

    Submitted 26 May, 2018; v1 submitted 24 May, 2018; originally announced May 2018.

  25. arXiv:1805.08166  [pdf, other]

    cs.LG stat.ML

    Learning to Optimize Tensor Programs

    Authors: Tianqi Chen, Lianmin Zheng, Eddie Yan, Ziheng Jiang, Thierry Moreau, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy

    Abstract: We introduce a learning-based framework to optimize tensor programs for deep learning workloads. Efficient implementations of tensor operators, such as matrix multiplication and high dimensional convolution, are key enablers of effective deep learning systems. However, existing systems rely on manually optimized libraries such as cuDNN where only a narrow range of server class GPUs are well-suppor…

    Submitted 8 January, 2019; v1 submitted 21 May, 2018; originally announced May 2018.

    Comments: NeurIPS 2018

  26. arXiv:1801.06378  [pdf, other]

    stat.ML cs.LG cs.SE

    Introducing ReQuEST: an Open Platform for Reproducible and Quality-Efficient Systems-ML Tournaments

    Authors: Thierry Moreau, Anton Lokhmotov, Grigori Fursin

    Abstract: Co-designing efficient machine learning based systems across the whole hardware/software stack to trade off speed, accuracy, energy and costs is becoming extremely complex and time consuming. Researchers often struggle to evaluate and compare different published works across rapidly evolving software frameworks, heterogeneous hardware platforms, compilers, libraries, algorithms, data sets, models,…

    Submitted 19 January, 2018; originally announced January 2018.

    Comments: ReQuEST tournament website: https://cKnowledge.org/request

  27. arXiv:1706.01338  [pdf, other]

    stat.ML

    Understanding the Learned Iterative Soft Thresholding Algorithm with matrix factorization

    Authors: Thomas Moreau, Joan Bruna

    Abstract: Sparse coding is a core building block in many data analysis and machine learning pipelines. Typically it is solved by relying on generic optimization techniques, such as the Iterative Soft Thresholding Algorithm and its accelerated version (ISTA, FISTA). These methods are optimal in the class of first-order methods for non-smooth, convex functions. However, they do not exploit the particular stru…

    Submitted 2 June, 2017; originally announced June 2017.

    Comments: Ongoing work - This document is not complete and might contain errors. arXiv admin note: text overlap with arXiv:1609.00285

  28. arXiv:1705.10087  [pdf, other]

    cs.LG stat.ML

    DICOD: Distributed Convolutional Sparse Coding

    Authors: Thomas Moreau, Laurent Oudre, Nicolas Vayatis

    Abstract: In this paper, we introduce DICOD, a convolutional sparse coding algorithm which builds shift invariant representations for long signals. This algorithm is designed to run in a distributed setting, with local message passing, making it communication efficient. It is based on coordinate descent and uses locally greedy updates which accelerate the resolution compared to greedy coordinate selection.…

    Submitted 13 May, 2018; v1 submitted 29 May, 2017; originally announced May 2017.

  29. arXiv:1611.04499  [pdf, other]

    stat.ML cs.LG

    Post Training in Deep Learning with Last Kernel

    Authors: Thomas Moreau, Julien Audiffren

    Abstract: One of the main challenges of deep learning methods is the choice of an appropriate training strategy. In particular, additional steps, such as unsupervised pre-training, have been shown to greatly improve the performances of deep structures. In this article, we propose an extra training step, called post-training, which only optimizes the last layer of the network. We show that this procedure can…

    Submitted 31 October, 2017; v1 submitted 14 November, 2016; originally announced November 2016.

    Comments: Submitted to ICLR 2018

  30. arXiv:1609.00285  [pdf, other]

    stat.ML

    Understanding Trainable Sparse Coding via Matrix Factorization

    Authors: Thomas Moreau, Joan Bruna

    Abstract: Sparse coding is a core building block in many data analysis and machine learning pipelines. Typically it is solved by relying on generic optimization techniques, that are optimal in the class of first-order methods for non-smooth, convex functions, such as the Iterative Soft Thresholding Algorithm and its accelerated version (ISTA, FISTA). However, these methods don't exploit the particular struc…

    Submitted 29 May, 2017; v1 submitted 1 September, 2016; originally announced September 2016.

    Comments: Published as a conference paper at ICLR 2017

  31. arXiv:1503.05798  [pdf, ps, other]

    stat.AP

    Simulating recurrent events that mimic actual data: a review of the literature with emphasis on event-dependence

    Authors: Juliette Pénichoux, Thierry Moreau, Aurélien Latouche

    Abstract: We conduct a review to assess how the simulation of repeated or recurrent events are planned. For such multivariate time-to-events, it is well established that the underlying mechanism is likely to be complex and to involve in particular both heterogeneity in the population and event-dependence. In this respect, we particularly focused on these two dimensions of events dynamic when mimicking actua…

    Submitted 19 March, 2015; originally announced March 2015.