Showing 1–13 of 13 results for author: Greenberg, I

Search v0.5.6 released 2020-02-24

arXiv:2408.11730 [pdf, other]

cs.IT

Effective Wordle Heuristics

Authors: Ronald I. Greenberg

Abstract: While previous researchers have performed an exhaustive search to determine an optimal Wordle strategy, that computation is very time consuming and produced a strategy using words that are unfamiliar to most people. With Wordle solutions being gradually eliminated (with a new puzzle each day and no reuse), an improved strategy could be generated each day, but the computation time makes a daily exhaustive search impractical. This paper shows that simple heuristics allow for fast generation of effective strategies and that little is lost by guessing only words that are possible solution words rather than more obscure words.

Submitted 21 August, 2024; originally announced August 2024.

Comments: 7 pages including references, 4 tables
arXiv:2310.00675 [pdf, other]

cs.LG eess.SP

Optimization or Architecture: How to Hack Kalman Filtering

Authors: Ido Greenberg, Netanel Yannay, Shie Mannor

Abstract: In non-linear filtering, it is traditional to compare non-linear architectures such as neural networks to the standard linear Kalman Filter (KF). We observe that this mixes the evaluation of two separate components: the non-linear architecture, and the parameters optimization method. In particular, the non-linear model is often optimized, whereas the reference KF model is not. We argue that both should be optimized similarly, and to that end present the Optimized KF (OKF). We demonstrate that the KF may become competitive to neural models - if optimized using OKF. This implies that experimental conclusions of certain previous studies were derived from a flawed process. The advantage of OKF over the standard KF is further studied theoretically and empirically, in a variety of problems. Conveniently, OKF can replace the KF in real-world systems by merely updating the parameters.

Submitted 1 October, 2023; originally announced October 2023.

Comments: NeurIPS 2023
arXiv:2306.14020 [pdf, other]

cs.LG

Individualized Dosing Dynamics via Neural Eigen Decomposition

Authors: Stav Belogolovsky, Ido Greenberg, Danny Eytan, Shie Mannor

Abstract: Dosing models often use differential equations to model biological dynamics. Neural differential equations in particular can learn to predict the derivative of a process, which permits predictions at irregular points of time. However, this temporal flexibility often comes with a high sensitivity to noise, whereas medical problems often present high noise and limited data. Moreover, medical dosing models must generalize reliably over individual patients and changing treatment policies. To address these challenges, we introduce the Neural Eigen Stochastic Differential Equation algorithm (NESDE). NESDE provides individualized modeling (using a hypernetwork over patient-level parameters); generalization to new treatment policies (using decoupled control); tunable expressiveness according to the noise level (using piecewise linearity); and fast, continuous, closed-form prediction (using spectral representation). We demonstrate the robustness of NESDE in both synthetic and real medical problems, and use the learned dynamics to publish simulated medical gym environments.

Submitted 24 June, 2023; originally announced June 2023.

Comments: arXiv admin note: text overlap with arXiv:2202.00117
arXiv:2301.11147 [pdf, other]

cs.LG

Train Hard, Fight Easy: Robust Meta Reinforcement Learning

Authors: Ido Greenberg, Shie Mannor, Gal Chechik, Eli Meirom

Abstract: A major challenge of reinforcement learning (RL) in real-world applications is the variation between environments, tasks or clients. Meta-RL (MRL) addresses this issue by learning a meta-policy that adapts to new tasks. Standard MRL methods optimize the average return over tasks, but often suffer from poor results in tasks of high risk or difficulty. This limits system reliability since test tasks are not known in advance. In this work, we define a robust MRL objective with a controlled robustness level. Optimization of analogous robust objectives in RL is known to lead to both *biased gradients* and *data inefficiency*. We prove that the gradient bias disappears in our proposed MRL framework. The data inefficiency is addressed via the novel Robust Meta RL algorithm (RoML). RoML is a meta-algorithm that generates a robust version of any given MRL algorithm, by identifying and over-sampling harder tasks throughout training. We demonstrate that RoML achieves robust returns on multiple navigation and continuous control benchmarks.

Submitted 1 October, 2023; v1 submitted 26 January, 2023; originally announced January 2023.

Comments: NeurIPS 2023
arXiv:2208.02294 [pdf, other]

cs.CL cs.LG

Dynamic Planning in Open-Ended Dialogue using Reinforcement Learning

Authors: Deborah Cohen, Moonkyung Ryu, Yinlam Chow, Orgad Keller, Ido Greenberg, Avinatan Hassidim, Michael Fink, Yossi Matias, Idan Szpektor, Craig Boutilier, Gal Elidan

Abstract: Despite recent advances in natural language understanding and generation, and decades of research on the development of conversational bots, building automated agents that can carry on rich open-ended conversations with humans "in the wild" remains a formidable challenge. In this work we develop a real-time, open-ended dialogue system that uses reinforcement learning (RL) to power a bot's conversational skill at scale. Our work pairs the succinct embedding of the conversation state generated using SOTA (supervised) language models with RL techniques that are particularly suited to a dynamic action space that changes as the conversation progresses. Trained using crowd-sourced data, our novel system is able to substantially exceed the (strong) baseline supervised model with respect to several metrics of interest in a live experiment with real users of the Google Assistant.

Submitted 25 July, 2022; originally announced August 2022.
arXiv:2205.05138 [pdf, other]

cs.LG

Efficient Risk-Averse Reinforcement Learning

Authors: Ido Greenberg, Yinlam Chow, Mohammad Ghavamzadeh, Shie Mannor

Abstract: In risk-averse reinforcement learning (RL), the goal is to optimize some risk measure of the returns. A risk measure often focuses on the worst returns out of the agent's experience. As a result, standard methods for risk-averse RL often ignore high-return strategies. We prove that under certain conditions this inevitably leads to a local-optimum barrier, and propose a soft risk mechanism to bypass it. We also devise a novel Cross Entropy module for risk sampling, which (1) preserves risk aversion despite the soft risk; (2) independently improves sample efficiency. By separating the risk aversion of the sampler and the optimizer, we can sample episodes with poor conditions, yet optimize with respect to successful strategies. We combine these two concepts in CeSoR - Cross-entropy Soft-Risk optimization algorithm - which can be applied on top of any risk-averse policy gradient (PG) method. We demonstrate improved risk aversion in maze navigation, autonomous driving, and resource allocation benchmarks, including in scenarios where standard risk-averse PG completely fails.

Submitted 12 October, 2022; v1 submitted 10 May, 2022; originally announced May 2022.

Comments: Accepted to NeurIPS 2022
arXiv:2202.00117 [pdf, other]

cs.LG eess.SY

Continuous Forecasting via Neural Eigen Decomposition

Authors: Stav Belogolovsky, Ido Greenberg, Danny Eytan, Shie Mannor

Abstract: Neural differential equations predict the derivative of a stochastic process. This allows irregular forecasting with arbitrary time-steps. However, the expressive temporal flexibility often comes with a high sensitivity to noise. In addition, current methods model measurements and control together, limiting generalization to different control policies. These properties severely limit applicability to medical treatment problems, which require reliable forecasting given high noise, limited data and changing treatment policies. We introduce the Neural Eigen-SDE algorithm (NESDE), which relies on piecewise linear dynamics modeling with spectral representation. NESDE provides control over the expressiveness level; decoupling of control from measurements; and closed-form continuous prediction in inference. NESDE is demonstrated to provide robust forecasting in both synthetic and real high-noise medical problems. Finally, we use the learned dynamics models to publish simulated medical gym environments.

Submitted 4 February, 2023; v1 submitted 31 January, 2022; originally announced February 2022.
arXiv:2104.02372 [pdf, other]

cs.LG eess.SY

The Fragility of Noise Estimation in Kalman Filter: Optimization Can Handle Model-Misspecification

Authors: Ido Greenberg, Shie Mannor, Netanel Yannay

Abstract: The Kalman Filter (KF) parameters are traditionally determined by noise estimation, since under the KF assumptions, the state prediction errors are minimized when the parameters correspond to the noise covariance. However, noise estimation remains the gold-standard regardless of the assumptions - even when it is not equivalent to errors minimization. We demonstrate that even seemingly simple problems may include multiple assumptions violations - which are sometimes hard to even notice. We show theoretically and empirically that even a minor violation may largely shift the optimal parameters. We propose a gradient-based method along with the Cholesky parameterization to explicitly optimize the state prediction errors. We show consistent improvement over noise estimation in tens of experiments in 3 different domains. Finally, we demonstrate that optimization makes the KF competitive with an LSTM model - even in non linear problems.

Submitted 30 June, 2022; v1 submitted 6 April, 2021; originally announced April 2021.
arXiv:2010.11660 [pdf, other]

cs.LG

Detecting Rewards Deterioration in Episodic Reinforcement Learning

Authors: Ido Greenberg, Shie Mannor

Abstract: In many RL applications, once training ends, it is vital to detect any deterioration in the agent performance as soon as possible. Furthermore, it often has to be done without modifying the policy and under minimal assumptions regarding the environment. In this paper, we address this problem by focusing directly on the rewards and testing for degradation. We consider an episodic framework, where the rewards within each episode are not independent, nor identically-distributed, nor Markov. We present this problem as a multivariate mean-shift detection problem with possibly partial observations. We define the mean-shift in a way corresponding to deterioration of a temporal signal (such as the rewards), and derive a test for this problem with optimal statistical power. Empirically, on deteriorated rewards in control problems (generated using various environment modifications), the test is demonstrated to be more powerful than standard tests - often by orders of magnitude. We also suggest a novel Bootstrap mechanism for False Alarm Rate control (BFAR), applicable to episodic (non-i.i.d) signal and allowing our test to run sequentially in an online manner. Our method does not rely on a learned model of the environment, is entirely external to the agent, and in fact can be applied to detect changes or drifts in any episodic signal.

Submitted 28 October, 2021; v1 submitted 22 October, 2020; originally announced October 2020.

Comments: ICML 2021
arXiv:cs/0301034 [pdf, ps, other]

cs.DS cs.DM

Computing the Number of Longest Common Subsequences

Authors: Ronald I. Greenberg

Abstract: This note provides very simple, efficient algorithms for computing the number of distinct longest common subsequences of two input strings and for computing the number of LCS embeddings.

Submitted 29 January, 2003; originally announced January 2003.

Comments: 3 pages, LaTeX

ACM Class: F.2.2; G.2.1
arXiv:cs/0301030 [pdf, ps, other]

cs.DM cs.DS

Bounds on the Number of Longest Common Subsequences

Authors: Ronald I. Greenberg

Abstract: This paper performs the analysis necessary to bound the running time of known, efficient algorithms for generating all longest common subsequences. That is, we bound the running time as a function of input size for algorithms with time essentially proportional to the output size. This paper considers both the case of computing all distinct LCSs and the case of computing all LCS embeddings. Also included is an analysis of how much better the efficient algorithms are than the standard method of generating LCS embeddings. A full analysis is carried out with running times measured as a function of the total number of input characters, and much of the analysis is also provided for cases in which the two input sequences are of the same specified length or of two independently specified lengths.

Submitted 6 August, 2003; v1 submitted 28 January, 2003; originally announced January 2003.

Comments: 13 pages. Corrected typos, corrected operation of hyperlinks, improved presentation

ACM Class: G.2.1
arXiv:cs/0211001 [pdf, ps, other]

cs.DS

Fast and Simple Computation of All Longest Common Subsequences

Authors: Ronald I. Greenberg

Abstract: This paper shows that a simple algorithm produces the {\em all-prefixes-LCSs-graph} in $O(mn)$ time for two input sequences of size $m$ and $n$. Given any prefix $p$ of the first input sequence and any prefix $q$ of the second input sequence, all longest common subsequences (LCSs) of $p$ and $q$ can be generated in time proportional to the output size, once the all-prefixes-LCSs-graph has been constructed. The problem can be solved in the context of generating all the distinct character strings that represent an LCS or in the context of generating all ways of embedding an LCS in the two input strings.

Submitted 8 March, 2011; v1 submitted 1 November, 2002; originally announced November 2002.

Comments: LaTeX 8 pages, 4 figures, corrected typos (especially in pseudocode in Figure 4)

ACM Class: F.2.2
arXiv:cs/0105034 [pdf, ps, other]

cs.DC

On the Area of Hypercube Layouts

Authors: Ronald I. Greenberg, Lee Guan

Abstract: This paper precisely analyzes the wire density and required area in standard layout styles for the hypercube. The most natural, regular layout of a hypercube of N^2 nodes in the plane, in a N x N grid arrangement, uses floor(2N/3)+1 horizontal wiring tracks for each row of nodes. (The number of tracks per row can be reduced by 1 with a less regular design.) This paper also gives a simple formula for the wire density at any cut position and a full characterization of all places where the wire density is maximized (which does not occur at the bisection).

Submitted 29 May, 2001; originally announced May 2001.

Comments: 8 pages, 4 figures, LaTeX

ACM Class: C.1.2

Journal ref: condensed and revised in Information Processing Letters, v. 84, n. 1, pp. 41--46, Sep. 2002
