
Showing 1–27 of 27 results for author: Jun, K

  1. arXiv:2407.13977  [pdf, other]

    stat.ML cs.LG

    A Unified Confidence Sequence for Generalized Linear Models, with Applications to Bandits

    Authors: Junghyun Lee, Se-Young Yun, Kwang-Sung Jun

    Abstract: We present a unified likelihood ratio-based confidence sequence (CS) for any (self-concordant) generalized linear model (GLM) that is guaranteed to be convex and numerically tight. We show that this is on par with or improves upon known CSs for various GLMs, including Gaussian, Bernoulli, and Poisson. In particular, for the first time, our CS for Bernoulli has a poly(S)-free radius where S is the nor…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: 31 pages, 1 figure, 2 tables

  2. arXiv:2406.10738  [pdf, other]

    cs.LG stat.ME

    Adaptive Experimentation When You Can't Experiment

    Authors: Yao Zhao, Kwang-Sung Jun, Tanner Fiez, Lalit Jain

    Abstract: This paper introduces the confounded pure exploration transductive linear bandit (CPET-LB) problem. As a motivating example, online services often cannot directly assign users to specific control or treatment experiences, either for business or practical reasons. In these settings, naively comparing treatment and control groups that may result from self-selection can lead to biased…

    Submitted 15 June, 2024; originally announced June 2024.

  3. arXiv:2402.11156  [pdf, other]

    stat.ML cs.LG

    Efficient Low-Rank Matrix Estimation, Experimental Design, and Arm-Set-Dependent Low-Rank Bandits

    Authors: Kyoungseok Jang, Chicheng Zhang, Kwang-Sung Jun

    Abstract: We study low-rank matrix trace regression and the related problem of low-rank matrix bandits. Assuming access to the distribution of the covariates, we propose a novel low-rank matrix estimation method called LowPopArt and provide its recovery guarantee that depends on a novel quantity denoted by B(Q) that characterizes the hardness of the problem, where Q is the covariance matrix of the measureme…

    Submitted 8 June, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

  4. arXiv:2402.09201  [pdf, ps, other]

    cs.LG stat.ML

    Better-than-KL PAC-Bayes Bounds

    Authors: Ilja Kuzborskij, Kwang-Sung Jun, Yulian Wu, Kyoungseok Jang, Francesco Orabona

    Abstract: Let $f(θ, X_1), \dots, f(θ, X_n)$ be a sequence of random elements, where $f$ is a fixed scalar function, $X_1, \dots, X_n$ are independent random variables (data), and $θ$ is a random parameter distributed according to some data-dependent posterior distribution $P_n$. In this paper, we consider the problem of proving concentration inequalities to estimate the mean of the sequence. An exampl…

    Submitted 4 April, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

  5. arXiv:2402.07341  [pdf, other]

    stat.ML cs.LG

    Noise-Adaptive Confidence Sets for Linear Bandits and Application to Bayesian Optimization

    Authors: Kwang-Sung Jun, Jungtaek Kim

    Abstract: Adapting to an a priori unknown noise level is a very important but challenging problem in sequential decision-making, as efficient exploration typically requires knowledge of the noise level, which is often loosely specified. We report significant progress in addressing this issue for linear bandits in two respects. First, we propose a novel confidence set that is 'semi-adaptive' to the unknown sub-G…

    Submitted 7 June, 2024; v1 submitted 11 February, 2024; originally announced February 2024.

    Comments: accepted to ICML'24; fixed typos

  6. arXiv:2310.18554  [pdf, other]

    stat.ML cs.LG

    Improved Regret Bounds of (Multinomial) Logistic Bandits via Regret-to-Confidence-Set Conversion

    Authors: Junghyun Lee, Se-Young Yun, Kwang-Sung Jun

    Abstract: The logistic bandit is a ubiquitous framework for modeling users' choices, e.g., click vs. no click in an advertisement recommender system. We observe that prior works overlook or neglect dependencies on $S \geq \lVert θ_\star \rVert_2$, where $θ_\star \in \mathbb{R}^d$ is the unknown parameter vector, which is particularly problematic when $S$ is large, e.g., $S \geq d$. In this work, we improve the…

    Submitted 12 March, 2024; v1 submitted 27 October, 2023; originally announced October 2023.

    Comments: 39 pages, 1 figure, 1 table; Accepted to the 27th International Conference on Artificial Intelligence and Statistics (AISTATS 2024) (ver2: fixed some errors and significantly expanded discussions on various parts, such as related work. ver3: fixed some minor typos)

  7. arXiv:2304.14989  [pdf, other]

    cs.LG stat.ML

    Kullback-Leibler Maillard Sampling for Multi-armed Bandits with Bounded Rewards

    Authors: Hao Qin, Kwang-Sung Jun, Chicheng Zhang

    Abstract: We study $K$-armed bandit problems where the reward distributions of the arms are all supported on the $[0,1]$ interval. It has been a challenge to design regret-efficient randomized exploration algorithms in this setting. Maillard sampling (Maillard, 2013), an attractive alternative to Thompson sampling, has recently been shown to achieve competitive regret guarantees in the sub-Gau…

    Submitted 11 April, 2024; v1 submitted 28 April, 2023; originally announced April 2023.

    Comments: Accepted by NeurIPS 2023

  8. arXiv:2302.05829  [pdf, other]

    cs.LG stat.ML

    Tighter PAC-Bayes Bounds Through Coin-Betting

    Authors: Kyoungseok Jang, Kwang-Sung Jun, Ilja Kuzborskij, Francesco Orabona

    Abstract: We consider the problem of estimating the mean of a sequence of random elements $f(X_1, θ), \ldots, f(X_n, θ)$ where $f$ is a fixed scalar function, $S=(X_1, \ldots, X_n)$ are independent random variables, and $θ$ is a possibly $S$-dependent parameter. An example of such a problem would be to estimate the generalization error of a neural network trained on $n$ examples where $f$ is a loss fu…

    Submitted 11 February, 2023; originally announced February 2023.

  9. arXiv:2210.15345  [pdf, other]

    stat.ML cs.LG

    PopArt: Efficient Sparse Regression and Experimental Design for Optimal Sparse Linear Bandits

    Authors: Kyoungseok Jang, Chicheng Zhang, Kwang-Sung Jun

    Abstract: In sparse linear bandits, a learning agent sequentially selects an action and receives reward feedback, and the reward function depends linearly on a few coordinates of the covariates of the actions. This has applications in many real-world sequential decision-making problems. In this paper, we propose a simple and computationally efficient sparse linear estimation method called PopArt that enjoys…

    Submitted 17 November, 2023; v1 submitted 25 October, 2022; originally announced October 2022.

    Comments: 10 pages, 1 figure, published in the 2022 Conference on Neural Information Processing Systems

  10. arXiv:2205.01257  [pdf, other]

    stat.ML cs.AI cs.LG

    Norm-Agnostic Linear Bandits

    Authors: Spencer Gales, Sunder Sethuraman, Kwang-Sung Jun

    Abstract: Linear bandits have a wide variety of applications, including recommendation systems, yet they make one strong assumption: the algorithms must know an upper bound $S$ on the norm of the unknown parameter $θ^*$ that governs the reward generation. Such an assumption forces the practitioner to guess $S$ involved in the confidence bound, leaving no choice but to wish that $\|θ^*\|\le S$ is true to guara…

    Submitted 2 May, 2022; originally announced May 2022.

    Comments: AISTATS'22; added acknowledgements

  11. arXiv:2202.02407  [pdf, other]

    stat.ML cs.LG

    An Experimental Design Approach for Regret Minimization in Logistic Bandits

    Authors: Blake Mason, Kwang-Sung Jun, Lalit Jain

    Abstract: In this work, we consider the problem of regret minimization for logistic bandits. The main challenge of logistic bandits is reducing the dependence on a potentially large problem-dependent constant $κ$ that can at worst scale exponentially with the norm of the unknown parameter $θ_{\ast}$. Abeille et al. (2021) have applied self-concordance of the logistic function to remove this worst-case depend…

    Submitted 4 February, 2022; originally announced February 2022.

  12. arXiv:2111.03290  [pdf, other]

    stat.ML cs.LG

    Maillard Sampling: Boltzmann Exploration Done Optimally

    Authors: Jie Bian, Kwang-Sung Jun

    Abstract: The PhD thesis of Maillard (2013) presents a rather obscure algorithm for the $K$-armed bandit problem. This lesser-known algorithm, which we call Maillard sampling (MS), computes the probability of choosing each arm in a closed form, which is not true for Thompson sampling, a widely adopted bandit algorithm in industry. This means that the bandit-logged data from running MS can be read…

    Submitted 4 March, 2022; v1 submitted 5 November, 2021; originally announced November 2021.

    Comments: accepted to AISTATS'22

  13. arXiv:2111.03289  [pdf, ps, other]

    stat.ML cs.LG math.ST

    Improved Regret Analysis for Variance-Adaptive Linear Bandits and Horizon-Free Linear Mixture MDPs

    Authors: Yeoneung Kim, Insoon Yang, Kwang-Sung Jun

    Abstract: In online learning problems, exploiting low variance plays an important role in obtaining tight performance guarantees yet is challenging because variances are often not known a priori. Recently, considerable progress has been made by Zhang et al. (2021) where they obtain a variance-adaptive regret bound for linear bandits without knowledge of the variances and a horizon-free regret bound for line…

    Submitted 4 February, 2023; v1 submitted 5 November, 2021; originally announced November 2021.

    Comments: accepted to NeurIPS'22

  14. arXiv:2110.14099  [pdf, other]

    stat.ML cs.IT cs.LG math.ST stat.ME

    Tight Concentrations and Confidence Sequences from the Regret of Universal Portfolio

    Authors: Francesco Orabona, Kwang-Sung Jun

    Abstract: A classic problem in statistics is the estimation of the expectation of random variables from samples. This gives rise to the tightly connected problems of deriving concentration inequalities and confidence sequences, that is, confidence intervals that hold uniformly over time. Previous work has shown how to easily convert the regret guarantee of an online betting algorithm into a time-uniform conc…

    Submitted 31 July, 2022; v1 submitted 26 October, 2021; originally announced October 2021.

  15. arXiv:2102.02472  [pdf, other]

    cs.LG cs.AI stat.ML

    Transfer Learning in Bandits with Latent Continuity

    Authors: Hyejin Park, Seiyun Shin, Kwang-Sung Jun, Jungseul Ok

    Abstract: Structured stochastic multi-armed bandits provide accelerated regret rates over the standard unstructured bandit problems. Most structured bandits, however, assume knowledge of the structural parameter, such as Lipschitz continuity, which is often not available. To cope with the latent structural parameter, we consider a transfer learning setting in which an agent must learn to transfer the str…

    Submitted 25 June, 2021; v1 submitted 4 February, 2021; originally announced February 2021.

  16. arXiv:2011.11222  [pdf, other]

    stat.ML cs.LG

    Improved Confidence Bounds for the Linear Logistic Model and Applications to Linear Bandits

    Authors: Kwang-Sung Jun, Lalit Jain, Blake Mason, Houssam Nassif

    Abstract: We propose improved fixed-design confidence bounds for the linear logistic model. Our bounds significantly improve upon the state-of-the-art bound by Li et al. (2017) via recent developments of the self-concordant analysis of the logistic loss (Faury et al., 2020). Specifically, our confidence bound avoids a direct dependence on $1/κ$, where $κ$ is the minimal variance over all arms' reward distri…

    Submitted 18 March, 2021; v1 submitted 23 November, 2020; originally announced November 2020.

    Journal ref: Proceedings of the International Conference on Machine Learning (ICML'21), pp. 5148-5157, 2021

  17. arXiv:2006.08754  [pdf, other]

    cs.LG stat.ML

    Crush Optimism with Pessimism: Structured Bandits Beyond Asymptotic Optimality

    Authors: Kwang-Sung Jun, Chicheng Zhang

    Abstract: We study stochastic structured bandits for minimizing regret. The fact that the popular optimistic algorithms do not achieve the asymptotic instance-dependent regret optimality (asymptotic optimality for short) has recently eluded researchers. On the other hand, it is known that one can achieve bounded regret (i.e., regret that does not grow indefinitely with $n$) in certain instances. Unfortunately, existin…

    Submitted 22 October, 2020; v1 submitted 15 June, 2020; originally announced June 2020.

    Comments: accepted to NeurIPS'20; added the lower bound result

  18. arXiv:1911.09564  [pdf, other]

    cs.LG cs.CR math.OC stat.ML

    Parameter-Free Locally Differentially Private Stochastic Subgradient Descent

    Authors: Kwang-Sung Jun, Francesco Orabona

    Abstract: We consider the problem of minimizing a convex risk with stochastic subgradients while guaranteeing $ε$-local differential privacy ($ε$-LDP). While it has been shown that stochastic optimization is possible with $ε$-LDP via the standard SGD (Song et al., 2013), its convergence rate largely depends on the learning rate, which must be tuned via repeated runs. Further, tuning is detrimental to privacy…

    Submitted 21 November, 2019; originally announced November 2019.

    Comments: to appear at Privacy in Machine Learning (PriML) workshop, NeurIPS'19

  19. arXiv:1905.10680  [pdf, other]

    cs.LG stat.ML

    Kernel Truncated Randomized Ridge Regression: Optimal Rates and Low Noise Acceleration

    Authors: Kwang-Sung Jun, Ashok Cutkosky, Francesco Orabona

    Abstract: In this paper, we consider nonparametric least squares regression in a Reproducing Kernel Hilbert Space (RKHS). We propose a new randomized algorithm that has optimal generalization error bounds with respect to the square loss, closing a long-standing gap between upper and lower bounds. Moreover, we show that our algorithm has faster finite-time and asymptotic rates on problems where the Bayes…

    Submitted 25 May, 2019; originally announced May 2019.

  20. arXiv:1902.01500  [pdf, other]

    cs.LG math.OC stat.ML

    Parameter-Free Online Convex Optimization with Sub-Exponential Noise

    Authors: Kwang-Sung Jun, Francesco Orabona

    Abstract: We consider the problem of unconstrained online convex optimization (OCO) with sub-exponential noise, a strictly more general problem than the standard OCO. In this setting, the learner receives a subgradient of the loss functions corrupted by sub-exponential noise and strives to achieve an optimal regret guarantee without knowledge of the competitor norm, i.e., in a parameter-free way. Recently, Cu…

    Submitted 20 September, 2019; v1 submitted 4 February, 2019; originally announced February 2019.

    Comments: v1: Accepted to COLT'19, v2: adjusted Theorem 3, w_t closed form solution, and typos

  21. arXiv:1901.02470  [pdf, other]

    cs.LG stat.ML

    Bilinear Bandits with Low-rank Structure

    Authors: Kwang-Sung Jun, Rebecca Willett, Stephen Wright, Robert Nowak

    Abstract: We introduce the bilinear bandit problem with low-rank structure in which an action takes the form of a pair of arms from two different entity types, and the reward is a bilinear function of the known feature vectors of the arms. The unknown in the problem is a $d_1$ by $d_2$ matrix $\mathbf{Θ}^*$ that defines the reward, and has low rank $r \ll \min\{d_1,d_2\}$. Determination of $\mathbf{Θ}^*$ with t…

    Submitted 9 June, 2019; v1 submitted 8 January, 2019; originally announced January 2019.

    Comments: Accepted to ICML'19

  22. arXiv:1810.12188  [pdf, other]

    cs.LG cs.AI cs.CR stat.ML

    Adversarial Attacks on Stochastic Bandits

    Authors: Kwang-Sung Jun, Lihong Li, Yuzhe Ma, Xiaojin Zhu

    Abstract: We study adversarial attacks that manipulate the reward signals to control the actions chosen by a stochastic multi-armed bandit algorithm. We propose the first attack against two popular bandit algorithms: $ε$-greedy and UCB, without knowledge of the mean rewards. The attacker is able to spend only logarithmic effort, multiplied by a problem-specific parameter that becomes smaller as the b…

    Submitted 29 October, 2018; originally announced October 2018.

    Comments: accepted to NIPS

  23. arXiv:1808.05760  [pdf, other]

    cs.LG cs.CR stat.ML

    Data Poisoning Attacks in Contextual Bandits

    Authors: Yuzhe Ma, Kwang-Sung Jun, Lihong Li, Xiaojin Zhu

    Abstract: We study offline data poisoning attacks in contextual bandits, a class of reinforcement learning problems with important applications in online recommendation and adaptive medical treatment, among others. We provide a general attack framework based on convex optimization and show that by slightly manipulating rewards in the data, an attacker can force the bandit algorithm to pull a target arm for…

    Submitted 23 August, 2018; v1 submitted 17 August, 2018; originally announced August 2018.

    Comments: GameSec 2018

  24. arXiv:1711.02545  [pdf, other]

    stat.ML cs.LG

    Online Learning for Changing Environments using Coin Betting

    Authors: Kwang-Sung Jun, Francesco Orabona, Stephen Wright, Rebecca Willett

    Abstract: A key challenge in online learning is that classical algorithms can be slow to adapt to changing environments. Recent studies have proposed "meta" algorithms that convert any online learning algorithm to one that is adaptive to changing environments, where the adaptivity is analyzed in a quantity called the strongly-adaptive regret. This paper describes a new meta algorithm that has a strongly-ada…

    Submitted 6 November, 2017; originally announced November 2017.

    Comments: submitted to a journal. arXiv admin note: substantial text overlap with arXiv:1610.04578

  25. arXiv:1706.00136  [pdf, other]

    stat.ML cs.LG

    Scalable Generalized Linear Bandits: Online Computation and Hashing

    Authors: Kwang-Sung Jun, Aniruddha Bhargava, Robert Nowak, Rebecca Willett

    Abstract: Generalized Linear Bandits (GLBs), a natural extension of the stochastic linear bandits, have been popular and successful in recent years. However, existing GLBs scale poorly with the number of rounds and the number of arms, limiting their utility in practice. This paper proposes new, scalable solutions to the GLB problem in two respects. First, unlike existing GLBs, whose per-time-step space and t…

    Submitted 21 October, 2017; v1 submitted 31 May, 2017; originally announced June 2017.

    Comments: accepted to NIPS'17 (typos fixed)

  26. arXiv:1610.04578  [pdf, other]

    stat.ML cs.LG

    Improved Strongly Adaptive Online Learning using Coin Betting

    Authors: Kwang-Sung Jun, Francesco Orabona, Rebecca Willett, Stephen Wright

    Abstract: This paper describes a new parameter-free online learning algorithm for changing environments. In comparing against algorithms with the same time complexity as ours, we obtain a strongly adaptive regret bound that is a factor of at least $\sqrt{\log(T)}$ better, where $T$ is the time horizon. Empirical results show that our algorithm outperforms state-of-the-art methods in learning with expert adv…

    Submitted 7 August, 2017; v1 submitted 14 October, 2016; originally announced October 2016.

    Comments: fixed a few typos

  27. arXiv:1609.00845  [pdf, other]

    stat.ML cs.LG

    Graph-Based Active Learning: A New Look at Expected Error Minimization

    Authors: Kwang-Sung Jun, Robert Nowak

    Abstract: In graph-based active learning, algorithms based on expected error minimization (EEM) have been popular and yield good empirical performance. The exact computation of EEM optimally balances exploration and exploitation. In practice, however, EEM-based algorithms employ various approximations due to the computational hardness of exact EEM. This can result in a lack of either exploration or exploita…

    Submitted 3 September, 2016; originally announced September 2016.

    Comments: Submitted to GlobalSIP 2016