Showing 1–50 of 115 results for author: Cohen, D

Searching in archive cs.
  1. arXiv:2407.07080  [pdf, other]

    cs.CL

    Adapting LLMs to Hebrew: Unveiling DictaLM 2.0 with Enhanced Vocabulary and Instruction Capabilities

    Authors: Shaltiel Shmidman, Avi Shmidman, Amir DN Cohen, Moshe Koppel

    Abstract: Training large language models (LLMs) in low-resource languages such as Hebrew poses unique challenges. In this paper, we introduce DictaLM2.0 and DictaLM2.0-Instruct, two LLMs derived from the Mistral model, trained on a substantial corpus of approximately 200 billion tokens in both Hebrew and English. Adapting a pre-trained model to a new language involves specialized techniques that differ sign…

    Submitted 9 July, 2024; originally announced July 2024.

  2. arXiv:2406.19485  [pdf, other]

    eess.IV cs.CV

    GAPNet: Granularity Attention Network with Anatomy-Prior-Constraint for Carotid Artery Segmentation

    Authors: Lin Zhang, Chenggang Lu, Xin-yang Shi, Caifeng Shan, Jiong Zhang, Da Chen, Laurent D. Cohen

    Abstract: Atherosclerosis is a chronic, progressive disease that primarily affects the arterial walls. It is one of the major causes of cardiovascular disease. Magnetic Resonance (MR) black-blood vessel wall imaging (BB-VWI) offers crucial insights into vascular disease diagnosis by clearly visualizing vascular structures. However, the complex anatomy of the neck poses challenges in distinguishing the carot…

    Submitted 27 June, 2024; originally announced June 2024.

  3. arXiv:2403.03458  [pdf, other]

    cs.CV cs.LG

    Slot Abstractors: Toward Scalable Abstract Visual Reasoning

    Authors: Shanka Subhra Mondal, Jonathan D. Cohen, Taylor W. Webb

    Abstract: Abstract visual reasoning is a characteristically human ability, allowing the identification of relational patterns that are abstracted away from object features, and the systematic generalization of those patterns to unseen problems. Recent work has demonstrated strong systematic generalization in visual reasoning tasks involving multi-object inputs, through the integration of slot-based methods…

    Submitted 2 June, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: 18 pages, 9 figures

  4. arXiv:2402.18426  [pdf, other]

    cs.AI cs.LG

    A Relational Inductive Bias for Dimensional Abstraction in Neural Networks

    Authors: Declan Campbell, Jonathan D. Cohen

    Abstract: The human cognitive system exhibits remarkable flexibility and generalization capabilities, partly due to its ability to form low-dimensional, compositional representations of the environment. In contrast, standard neural network architectures often struggle with abstract reasoning tasks, overfitting, and requiring extensive data for training. This paper investigates the impact of the relational b…

    Submitted 28 February, 2024; originally announced February 2024.

  5. arXiv:2402.12276  [pdf, other]

    cs.IR

    Explain then Rank: Scale Calibration of Neural Rankers Using Natural Language Explanations from LLMs

    Authors: Puxuan Yu, Daniel Cohen, Hemank Lamba, Joel Tetreault, Alex Jaimes

    Abstract: In search settings, calibrating the scores during the ranking process to quantities such as click-through rates or relevance levels enhances a system's usefulness and trustworthiness for downstream users. While previous research has improved this notion of calibration for low complexity learning-to-rank models, the larger data demands and parameter count specific to modern neural text rankers prod…

    Submitted 26 August, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

  6. arXiv:2402.11447  [pdf, other]

    cs.CL

    In-Context Example Ordering Guided by Label Distributions

    Authors: Zhichao Xu, Daniel Cohen, Bei Wang, Vivek Srikumar

    Abstract: By allowing models to predict without task-specific training, in-context learning (ICL) with pretrained LLMs has enormous potential in NLP. However, a number of problems persist in ICL. In particular, its performance is sensitive to the choice and order of in-context examples. Given the same set of in-context examples with different orderings, model performance may vary between near random to near…

    Submitted 17 February, 2024; originally announced February 2024.

    Comments: preprint

  7. arXiv:2402.04203  [pdf, other]

    cs.AI q-bio.NC

    Human-Like Geometric Abstraction in Large Pre-trained Neural Networks

    Authors: Declan Campbell, Sreejan Kumar, Tyler Giallanza, Thomas L. Griffiths, Jonathan D. Cohen

    Abstract: Humans possess a remarkable capacity to recognize and manipulate abstract structure, which is especially apparent in the domain of geometry. Recent research in cognitive science suggests neural networks do not share this capacity, concluding that human geometric abilities come from discrete symbolic structure in human mental representations. However, progress in artificial intelligence (AI) sugges…

    Submitted 6 February, 2024; originally announced February 2024.

  8. arXiv:2401.14493  [pdf, other]

    cs.CL cs.HC cs.LG

    K-QA: A Real-World Medical Q&A Benchmark

    Authors: Itay Manes, Naama Ronn, David Cohen, Ran Ilan Ber, Zehavi Horowitz-Kugler, Gabriel Stanovsky

    Abstract: Ensuring the accuracy of responses provided by large language models (LLMs) is crucial, particularly in clinical settings where incorrect information may directly impact patient health. To address this challenge, we construct K-QA, a dataset containing 1,212 patient questions originating from real-world conversations held on K Health (an AI-driven clinical platform). We employ a panel of in-house…

    Submitted 25 January, 2024; originally announced January 2024.

    Comments: The data and the evaluation script are available at https://github.com/Itaymanes/K-QA. Results and model comparisons can be viewed at https://huggingface.co/spaces/Itaykhealth/K-QA

  9. arXiv:2311.07188  [pdf, other]

    cs.CV

    Fitting tree model with CNN and geodesics to track vessels and application to Ultrasound Localization Microscopy data

    Authors: Théo Bertrand, Laurent D. Cohen

    Abstract: Segmentation of tubular structures in vascular imaging is a well studied task, although it is rare that we try to infuse knowledge of the tree-like structure of the regions to be detected. Our work focuses on detecting the important landmarks in the vascular network (via CNN performing both localization and classification of the points of interest) and representing vessels as the edges in some min…

    Submitted 13 November, 2023; originally announced November 2023.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  10. arXiv:2310.14282  [pdf, other]

    cs.CL cs.AI cs.IR

    NERetrieve: Dataset for Next Generation Named Entity Recognition and Retrieval

    Authors: Uri Katz, Matan Vetzler, Amir DN Cohen, Yoav Goldberg

    Abstract: Recognizing entities in texts is a central need in many information-seeking scenarios, and indeed, Named Entity Recognition (NER) is arguably one of the most successful examples of a widely adopted NLP task and corresponding NLP technology. Recent advances in large language models (LLMs) appear to provide effective solutions (also) for NER tasks that were traditionally handled with dedicated model…

    Submitted 22 October, 2023; originally announced October 2023.

    Comments: Findings of EMNLP 2023

  11. arXiv:2309.14568  [pdf, other]

    cs.CL

    Introducing DictaLM -- A Large Generative Language Model for Modern Hebrew

    Authors: Shaltiel Shmidman, Avi Shmidman, Amir David Nissan Cohen, Moshe Koppel

    Abstract: We present DictaLM, a large-scale language model tailored for Modern Hebrew. Boasting 7B parameters, this model is predominantly trained on Hebrew-centric data. As a commitment to promoting research and development in the Hebrew language, we release both the foundation model and the instruct-tuned model under a Creative Commons license. Concurrently, we introduce DictaLM-Rab, another foundation mo…

    Submitted 25 September, 2023; originally announced September 2023.

  12. arXiv:2309.11512  [pdf, other]

    stat.AP cs.LG

    Multidimensional well-being of US households at a fine spatial scale using fused household surveys: fusionACS

    Authors: Kevin Ummel, Miguel Poblete-Cazenave, Karthik Akkiraju, Nick Graetz, Hero Ashman, Cora Kingdon, Steven Herrera Tenorio, Aaryaman "Sunny" Singhal, Daniel Aldana Cohen, Narasimha D. Rao

    Abstract: Social science often relies on surveys of households and individuals. Dozens of such surveys are regularly administered by the U.S. government. However, they field independent, unconnected samples with specialized questions, limiting research questions to those that can be answered by a single survey. The fusionACS project seeks to integrate data from multiple U.S. household surveys by statistical…

    Submitted 15 September, 2023; originally announced September 2023.

    Comments: 35 pages, 6 figures

  13. Predictive Uncertainty-based Bias Mitigation in Ranking

    Authors: Maria Heuss, Daniel Cohen, Masoud Mansoury, Maarten de Rijke, Carsten Eickhoff

    Abstract: Societal biases that are contained in retrieved documents have received increased interest. Such biases, which are often prevalent in the training data and learned by the model, can cause societal harms, by misrepresenting certain groups, and by enforcing stereotypes. Mitigating such biases demands algorithms that balance the trade-off between maximized utility for the user with fairness objective…

    Submitted 18 September, 2023; originally announced September 2023.

    Journal ref: CIKM 2023: 32nd ACM International Conference on Information and Knowledge Management

  14. arXiv:2309.06629  [pdf, other]

    cs.AI cs.NE

    The Relational Bottleneck as an Inductive Bias for Efficient Abstraction

    Authors: Taylor W. Webb, Steven M. Frankland, Awni Altabaa, Simon Segert, Kamesh Krishnamurthy, Declan Campbell, Jacob Russin, Tyler Giallanza, Zack Dulberg, Randall O'Reilly, John Lafferty, Jonathan D. Cohen

    Abstract: A central challenge for cognitive science is to explain how abstract concepts are acquired from limited experience. This has often been framed in terms of a dichotomy between connectionist and symbolic cognitive models. Here, we highlight a recently emerging line of work that suggests a novel reconciliation of these approaches, by exploiting an inductive bias that we term the relational bottleneck…

    Submitted 1 May, 2024; v1 submitted 12 September, 2023; originally announced September 2023.

  15. arXiv:2309.04169  [pdf, other]

    cs.CV

    Grouping Boundary Proposals for Fast Interactive Image Segmentation

    Authors: Li Liu, Da Chen, Minglei Shu, Laurent D. Cohen

    Abstract: Geodesic models are known as an efficient tool for solving various image segmentation problems. Most of existing approaches only exploit local pointwise image features to track geodesic paths for delineating the objective boundaries. However, such a segmentation strategy cannot take into account the connectivity of the image edge features, increasing the risk of shortcut problem, especially in the…

    Submitted 8 September, 2023; originally announced September 2023.

  16. arXiv:2309.00597  [pdf, other]

    cs.CE cs.DC cs.ET q-bio.NC quant-ph

    The QUATRO Application Suite: Quantum Computing for Models of Human Cognition

    Authors: Raghavendra Pradyumna Pothukuchi, Leon Lufkin, Yu Jun Shen, Alejandro Simon, Rome Thorstenson, Bernardo Eilert Trevisan, Michael Tu, Mudi Yang, Ben Foxman, Viswanatha Srinivas Pothukuchi, Gunnar Epping, Thi Ha Kyaw, Bryant J Jongkees, Yongshan Ding, Jerome R Busemeyer, Jonathan D Cohen, Abhishek Bhattacharjee

    Abstract: Research progress in quantum computing has, thus far, focused on a narrow set of application domains. Expanding the suite of quantum application domains is vital for the discovery of new software toolchains and architectural abstractions. In this work, we unlock a new class of applications ripe for quantum computing research -- computational cognitive modeling. Cognitive models are critical to und…

    Submitted 8 December, 2023; v1 submitted 1 September, 2023; originally announced September 2023.

  17. arXiv:2308.15729  [pdf, other]

    cs.CG math.NA

    Computing Geodesic Paths Encoding a Curvature Prior

    Authors: Da Chen, Jean-Marie Mirebeau, Minglei Shu, Laurent D. Cohen

    Abstract: In this paper, we introduce an efficient method for computing curves minimizing a variant of the Euler-Mumford elastica energy, with fixed endpoints and tangents at these endpoints, where the bending energy is enhanced with a user defined and data-driven scalar-valued term referred to as the curvature prior. In order to guarantee that the globally optimal curve is extracted, the proposed method in…

    Submitted 29 August, 2023; originally announced August 2023.

  18. arXiv:2307.16104  [pdf, other]

    cs.LG cs.AI physics.soc-ph

    AI Increases Global Access to Reliable Flood Forecasts

    Authors: Grey Nearing, Deborah Cohen, Vusumuzi Dube, Martin Gauch, Oren Gilon, Shaun Harrigan, Avinatan Hassidim, Daniel Klotz, Frederik Kratzert, Asher Metzger, Sella Nevo, Florian Pappenberger, Christel Prudhomme, Guy Shalev, Shlomo Shenzis, Tadele Tekalign, Dana Weitzner, Yoss Matias

    Abstract: Floods are one of the most common natural disasters, with a disproportionate impact in developing countries that often lack dense streamflow gauge networks. Accurate and timely warnings are critical for mitigating flood risks, but hydrological simulation models typically must be calibrated to long data records in each watershed. Using AI, we achieve reliability in predicting extreme riverine event…

    Submitted 3 November, 2023; v1 submitted 29 July, 2023; originally announced July 2023.

  19. arXiv:2307.07575  [pdf, other]

    cs.LG cs.NE

    A Quantitative Approach to Predicting Representational Learning and Performance in Neural Networks

    Authors: Ryan Pyle, Sebastian Musslick, Jonathan D. Cohen, Ankit B. Patel

    Abstract: A key property of neural networks (both biological and artificial) is how they learn to represent and manipulate input information in order to solve a task. Different types of representations may be suited to different types of tasks, making identifying and understanding learned representations a critical part of understanding and designing useful networks. In this paper, we introduce a new pseudo…

    Submitted 14 July, 2023; originally announced July 2023.

    Comments: 30 pages, 16 figures

  20. Fast Marching Energy CNN

    Authors: Nicolas Makaroff, Théo Bertrand, Laurent D. Cohen

    Abstract: Leveraging geodesic distances and the geometrical information they convey is key for many data-oriented applications in imaging. Geodesic distance computation has been used for long for image segmentation using Image based metrics. We introduce a new method by generating isotropic Riemannian metrics adapted to a problem using CNN and give as illustrations an example of application. We then apply t…

    Submitted 28 June, 2023; originally announced June 2023.

  21. arXiv:2306.16098  [pdf, other]

    eess.IV cs.CV

    Chan-Vese Attention U-Net: An attention mechanism for robust segmentation

    Authors: Nicolas Makaroff, Laurent D. Cohen

    Abstract: When studying the results of a segmentation algorithm using convolutional neural networks, one wonders about the reliability and consistency of the results. This leads to questioning the possibility of using such an algorithm in applications where there is little room for doubt. We propose in this paper a new attention gate based on the use of Chan-Vese energy minimization to control more precisel…

    Submitted 28 June, 2023; originally announced June 2023.

  22. arXiv:2306.02500  [pdf, other]

    cs.CV

    Systematic Visual Reasoning through Object-Centric Relational Abstraction

    Authors: Taylor W. Webb, Shanka Subhra Mondal, Jonathan D. Cohen

    Abstract: Human visual reasoning is characterized by an ability to identify abstract patterns from only a small number of examples, and to systematically generalize those patterns to novel inputs. This capacity depends in large part on our ability to represent complex visual inputs in terms of both objects and relations. Recent work in computer vision has introduced models with the capacity to extract objec…

    Submitted 10 November, 2023; v1 submitted 4 June, 2023; originally announced June 2023.

  23. arXiv:2305.18417  [pdf, other]

    cs.LG q-bio.NC

    Determinantal Point Process Attention Over Grid Cell Code Supports Out of Distribution Generalization

    Authors: Shanka Subhra Mondal, Steven Frankland, Taylor Webb, Jonathan D. Cohen

    Abstract: Deep neural networks have made tremendous gains in emulating human-like intelligence, and have been used increasingly as ways of understanding how the brain may solve the complex computational problems on which this relies. However, these still fall short of, and therefore fail to provide insight into how the brain supports strong forms of generalization of which humans are capable. One such case…

    Submitted 23 January, 2024; v1 submitted 28 May, 2023; originally announced May 2023.

    Comments: 29 pages (including Appendix), 21 figures

  24. arXiv:2305.12517  [pdf, other]

    cs.CL cs.IR cs.LG

    Description-Based Text Similarity

    Authors: Shauli Ravfogel, Valentina Pyatkin, Amir DN Cohen, Avshalom Manevich, Yoav Goldberg

    Abstract: Identifying texts with a given semantics is central for many information seeking scenarios. Similarity search over vector embeddings appear to be central to this ability, yet the similarity reflected in current text embeddings is corpus-driven, and is inconsistent and sub-optimal for many use cases. What, then, is a good notion of similarity for effective retrieval of text? We identify the need…

    Submitted 24 July, 2024; v1 submitted 21 May, 2023; originally announced May 2023.

    Comments: Accepted in COLM 2024

  25. arXiv:2304.11721  [pdf, ps, other]

    cs.IR cs.CL

    A Lightweight Constrained Generation Alternative for Query-focused Summarization

    Authors: Zhichao Xu, Daniel Cohen

    Abstract: Query-focused summarization (QFS) aims to provide a summary of a document that satisfies information need of a given query and is useful in various IR applications, such as abstractive snippet generation. Current QFS approaches typically involve injecting additional information, e.g. query-answer relevance or fine-grained token-level interaction between a query and document, into a finetuned large…

    Submitted 23 April, 2023; originally announced April 2023.

    Comments: in proceedings of SIGIR 2023

  26. arXiv:2303.10608  [pdf, other]

    cs.LG cs.AI eess.SP

    A model is worth tens of thousands of examples

    Authors: Thomas Dagès, Laurent D. Cohen, Alfred M. Bruckstein

    Abstract: Traditional signal processing methods relying on mathematical data generation models have been cast aside in favour of deep neural networks, which require vast amounts of data. Since the theoretical sample complexity is nearly impossible to evaluate, these amounts of examples are usually estimated with crude rules of thumb. However, these rules only suggest when the networks should work, but do no…

    Submitted 19 March, 2023; originally announced March 2023.

  27. arXiv:2303.02260  [pdf, other]

    cs.CV cs.CL

    Learning to reason over visual objects

    Authors: Shanka Subhra Mondal, Taylor Webb, Jonathan D. Cohen

    Abstract: A core component of human intelligence is the ability to identify abstract patterns inherent in complex, high-dimensional perceptual data, as exemplified by visual reasoning tasks such as Raven's Progressive Matrices (RPM). Motivated by the goal of designing AI systems with this capacity, recent work has focused on evaluating whether neural networks can learn to solve RPM-like problems. Previous w…

    Submitted 26 October, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

    Comments: ICLR 2023

  28. arXiv:2303.01593  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    QAID: Question Answering Inspired Few-shot Intent Detection

    Authors: Asaf Yehudai, Matan Vetzler, Yosi Mass, Koren Lazar, Doron Cohen, Boaz Carmeli

    Abstract: Intent detection with semantically similar fine-grained intents is a challenging task. To address it, we reformulate intent detection as a question-answering retrieval task by treating utterances and intent names as questions and answers. To that end, we utilize a question-answering retrieval architecture and adopt a two stages training schema with batch contrastive loss. In the pre-training stage…

    Submitted 21 March, 2023; v1 submitted 2 March, 2023; originally announced March 2023.

    Comments: ICLR paper

  29. arXiv:2302.06321  [pdf, other]

    cs.CL cs.AI

    Parameter-efficient Modularised Bias Mitigation via AdapterFusion

    Authors: Deepak Kumar, Oleg Lesota, George Zerveas, Daniel Cohen, Carsten Eickhoff, Markus Schedl, Navid Rekabsaz

    Abstract: Large pre-trained language models contain societal biases and carry along these biases to downstream tasks. Current in-processing bias mitigation approaches (like adversarial training) impose debiasing by updating a model's parameters, effectively transferring the model to a new, irreversible debiased state. In this work, we propose a novel approach to develop stand-alone debiasing functionalities…

    Submitted 18 June, 2023; v1 submitted 13 February, 2023; originally announced February 2023.

    Comments: Post EACL 2023 version

  30. arXiv:2211.11609  [pdf]

    cs.CV eess.IV math.AP

    Deformable Voxel Grids for Shape Comparisons

    Authors: Raphaël Groscot, Laurent D. Cohen

    Abstract: We present Deformable Voxel Grids (DVGs) for 3D shapes comparison and processing. It consists of a voxel grid which is deformed to approximate the silhouette of a shape, via energy-minimization. By interpreting the DVG as a local coordinates system, it provides a better embedding space than a regular voxel grid, since it is adapted to the geometry of the shape. It also allows to deform the shape b…

    Submitted 21 November, 2022; originally announced November 2022.

    Journal ref: 14th International Conference on Digital Image Processing (ICDIP 2022), May 2022, Wuhan (Virtual), China

  31. arXiv:2208.02294  [pdf, other]

    cs.CL cs.LG

    Dynamic Planning in Open-Ended Dialogue using Reinforcement Learning

    Authors: Deborah Cohen, Moonkyung Ryu, Yinlam Chow, Orgad Keller, Ido Greenberg, Avinatan Hassidim, Michael Fink, Yossi Matias, Idan Szpektor, Craig Boutilier, Gal Elidan

    Abstract: Despite recent advances in natural language understanding and generation, and decades of research on the development of conversational bots, building automated agents that can carry on rich open-ended conversations with humans "in the wild" remains a formidable challenge. In this work we develop a real-time, open-ended dialogue system that uses reinforcement learning (RL) to power a bot's conversa…

    Submitted 25 July, 2022; originally announced August 2022.

  32. arXiv:2207.08574  [pdf, other]

    stat.ML cs.LG eess.SP

    ManiFeSt: Manifold-based Feature Selection for Small Data Sets

    Authors: David Cohen, Tal Shnitzer, Yuval Kluger, Ronen Talmon

    Abstract: In this paper, we present a new method for few-sample supervised feature selection (FS). Our method first learns the manifold of the feature space of each class using kernels capturing multi-feature associations. Then, based on Riemannian geometry, a composite kernel is computed, extracting the differences between the learned feature associations. Finally, a FS score based on spectral analysis is…

    Submitted 18 July, 2022; originally announced July 2022.

    Comments: 22 pages, 10 figures

  33. arXiv:2207.08157  [pdf, other]

    cs.LG cs.AI

    Automated Repair of Neural Networks

    Authors: Dor Cohen, Ofer Strichman

    Abstract: Over the last decade, Neural Networks (NNs) have been widely used in numerous applications including safety-critical ones such as autonomous systems. Despite their emerging adoption, it is well known that NNs are susceptible to Adversarial Attacks. Hence, it is highly important to provide guarantees that such systems work correctly. To remedy these issues we introduce a framework for repairing uns…

    Submitted 17 July, 2022; originally announced July 2022.

    Comments: Code and results are available at https://github.com/dorcoh/NNSynthesizer

  34. arXiv:2205.11558  [pdf, other]

    cs.AI

    Using Natural Language and Program Abstractions to Instill Human Inductive Biases in Machines

    Authors: Sreejan Kumar, Carlos G. Correa, Ishita Dasgupta, Raja Marjieh, Michael Y. Hu, Robert D. Hawkins, Nathaniel D. Daw, Jonathan D. Cohen, Karthik Narasimhan, Thomas L. Griffiths

    Abstract: Strong inductive biases give humans the ability to quickly learn to perform a variety of tasks. Although meta-learning is a method to endow neural networks with useful inductive biases, agents trained by meta-learning may sometimes acquire very different strategies from humans. We show that co-training these agents on predicting representations from natural language task descriptions and programs…

    Submitted 5 February, 2023; v1 submitted 23 May, 2022; originally announced May 2022.

    Comments: In Proceedings of the 36th Conference on Neural Information Processing Systems (NeurIPS 2022), winner of Outstanding Paper Award

  35. arXiv:2204.06608  [pdf, other]

    cs.LG

    Modularity benefits reinforcement learning agents with competing homeostatic drives

    Authors: Zack Dulberg, Rachit Dubey, Isabel M. Berwian, Jonathan D. Cohen

    Abstract: The problem of balancing conflicting needs is fundamental to intelligence. Standard reinforcement learning algorithms maximize a scalar reward, which requires combining different objective-specific rewards into a single number. Alternatively, different objectives could also be combined at the level of action value, such that specialist modules responsible for different objectives submit different…

    Submitted 13 April, 2022; originally announced April 2022.

    Comments: 4 pages, accepted paper at the Multi-disciplinary Conference on Reinforcement Learning and Decision Making (RLDM) 2022

  36. arXiv:2204.01437  [pdf]

    cs.AI cs.HC

    Disentangling Abstraction from Statistical Pattern Matching in Human and Machine Learning

    Authors: Sreejan Kumar, Ishita Dasgupta, Nathaniel D. Daw, Jonathan D. Cohen, Thomas L. Griffiths

    Abstract: The ability to acquire abstract knowledge is a hallmark of human intelligence and is believed by many to be one of the core differences between humans and neural network models. Agents can be endowed with an inductive bias towards abstraction through meta-learning, where they are trained on a distribution of tasks that share some abstract structure that can be learned and applied. However, because…

    Submitted 3 March, 2023; v1 submitted 4 April, 2022; originally announced April 2022.

  37. arXiv:2202.03045  [pdf, ps, other]

    cs.LG stat.ML

    Metric-valued regression

    Authors: Dan Tsir Cohen, Aryeh Kontorovich

    Abstract: We propose an efficient algorithm for learning mappings between two metric spaces, $\X$ and $\Y$. Our procedure is strongly Bayes-consistent whenever $\X$ and $\Y$ are topologically separable and $\Y$ is "bounded in expectation" (our term; the separability assumption can be somewhat weakened). At this level of generality, ours is the first such learnability result for unbounded loss in the agnosti…

    Submitted 7 February, 2022; originally announced February 2022.

  38. CODER: An efficient framework for improving retrieval through COntextual Document Embedding Reranking

    Authors: George Zerveas, Navid Rekabsaz, Daniel Cohen, Carsten Eickhoff

    Abstract: Contrastive learning has been the dominant approach to training dense retrieval models. In this work, we investigate the impact of ranking context - an often overlooked aspect of learning dense retrieval models. In particular, we examine the effect of its constituent parts: jointly scoring a large number of negatives per query, using retrieved (query-specific) instead of random negatives, and a fu…

    Submitted 3 November, 2022; v1 submitted 16 December, 2021; originally announced December 2021.

    Comments: EMNLP 2022

  39. arXiv:2112.07308  [pdf, other]

    cs.CL

    Conversational Search with Mixed-Initiative -- Asking Good Clarification Questions backed-up by Passage Retrieval

    Authors: Yosi Mass, Doron Cohen, Asaf Yehudai, David Konopnicki

    Abstract: We deal with the scenario of conversational search, where user queries are under-specified or ambiguous. This calls for a mixed-initiative setup. User-asks (queries) and system-answers, as well as system-asks (clarification questions) and user response, in order to clarify her information needs. We focus on the task of selecting the next clarification question, given the conversation context. Our…

    Submitted 23 May, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

  40. arXiv:2111.10434  [pdf, other]

    cs.LG

    Machine Learning for Mechanical Ventilation Control (Extended Abstract)

    Authors: Daniel Suo, Naman Agarwal, Wenhan Xia, Xinyi Chen, Udaya Ghai, Alexander Yu, Paula Gradu, Karan Singh, Cyril Zhang, Edgar Minasyan, Julienne LaChance, Tom Zajdel, Manuel Schottdorf, Daniel Cohen, Elad Hazan

    Abstract: Mechanical ventilation is one of the most widely used therapies in the ICU. However, despite broad application from anaesthesia to COVID-related life support, many injurious challenges remain. We frame these as a control problem: ventilators must let air in and out of the patient's lungs according to a prescribed trajectory of airway pressure. Industry-standard controllers, based on the PID method…

    Submitted 23 December, 2021; v1 submitted 19 November, 2021; originally announced November 2021.

    Comments: Machine Learning for Health (ML4H) at NeurIPS 2021 - Extended Abstract. arXiv admin note: substantial text overlap with arXiv:2102.06779

  41. arXiv:2111.00794  [pdf, other]

    cs.CV

    Geodesic Models with Convexity Shape Prior

    Authors: Da Chen, Jean-Marie Mirebeau, Minglei Shu, Xuecheng Tai, Laurent D. Cohen

    Abstract: The minimal geodesic models based on the Eikonal equations are capable of finding suitable solutions in various image segmentation scenarios. Existing geodesic-based segmentation approaches usually exploit image features in conjunction with geometric regularization terms, such as Euclidean curve length or curvature-penalized length, for computing geodesic curves. In this paper, we take into accoun…

    Submitted 25 November, 2022; v1 submitted 1 November, 2021; originally announced November 2021.

    Comments: This paper has been accepted by TPAMI

  42. arXiv:2110.15425  [pdf, other]

    cs.PL

    Distill: Domain-Specific Compilation for Cognitive Models

    Authors: Jan Vesely, Raghavendra Pradyumna Pothukuchi, Ketaki Joshi, Samyak Gupta, Jonathan D. Cohen, Abhishek Bhattacharjee

    Abstract: This paper discusses our proposal and implementation of Distill, a domain-specific compilation tool based on LLVM to accelerate cognitive models. Cognitive models explain the process of cognitive function and offer a path to human-like artificial intelligence. However, cognitive modeling is laborious, requiring composition of many types of computational tasks, and suffers from poor performance as…

    Submitted 14 January, 2022; v1 submitted 28 October, 2021; originally announced October 2021.

    Comments: 11 pages, 7 figures

  43. A Modern Perspective on Query Likelihood with Deep Generative Retrieval Models

    Authors: Oleg Lesota, Navid Rekabsaz, Daniel Cohen, Klaus Antonius Grasserbauer, Carsten Eickhoff, Markus Schedl

    Abstract: Existing neural ranking models follow the text matching paradigm, where document-to-query relevance is estimated through predicting the matching score. Drawing from the rich literature of classical generative retrieval models, we introduce and formalize the paradigm of deep generative retrieval models defined via the cumulative probabilities of generating query terms. This paradigm offers a ground…

    Submitted 25 June, 2021; originally announced June 2021.

    Comments: ICTIR'21

  44. arXiv:2106.07369  [pdf]

    cs.LG q-bio.NC

    A Self-Supervised Framework for Function Learning and Extrapolation

    Authors: Simon N. Segert, Jonathan D. Cohen

    Abstract: Understanding how agents learn to generalize -- and, in particular, to extrapolate -- in high-dimensional, naturalistic environments remains a challenge for both machine learning and the study of biological agents. One approach to this has been the use of function learning paradigms, which allow people's empirical patterns of generalization for smooth scalar functions to be described precisely. Ho…

    Submitted 14 June, 2021; originally announced June 2021.

    Comments: 15 pages, 4 figures

  45. arXiv:2105.07408  [pdf, other]

    cs.IT

    Dimension-Free Empirical Entropy Estimation

    Authors: Doron Cohen, Aryeh Kontorovich, Aaron Koolyk, Geoffrey Wolfer

    Abstract: We seek an entropy estimator for discrete distributions with fully empirical accuracy bounds. As stated, this goal is infeasible without some prior assumptions on the distribution. We discover that a certain information moment assumption renders the problem feasible. We argue that the moment assumption is natural and, in some sense, minimalistic -- weaker than finite support or tail decay co…

    Submitted 26 December, 2022; v1 submitted 16 May, 2021; originally announced May 2021.

  46. People construct simplified mental representations to plan

    Authors: Mark K. Ho, David Abel, Carlos G. Correa, Michael L. Littman, Jonathan D. Cohen, Thomas L. Griffiths

    Abstract: One of the most striking features of human cognition is the capacity to plan. Two aspects of human planning stand out: its efficiency and flexibility. Efficiency is especially impressive because plans must often be made in complex environments, and yet people successfully plan solutions to myriad everyday problems despite having limited cognitive resources. Standard accounts in psychology, economi…

    Submitted 26 November, 2022; v1 submitted 14 May, 2021; originally announced May 2021.

    Comments: 56 pages, 5 main figures, 10 extended data figures, supplementary information is included in ancillary files

    Journal ref: Nature, 606(7912), 129-136 (2022)

  47. Not All Relevance Scores are Equal: Efficient Uncertainty and Calibration Modeling for Deep Retrieval Models

    Authors: Daniel Cohen, Bhaskar Mitra, Oleg Lesota, Navid Rekabsaz, Carsten Eickhoff

    Abstract: In any ranking system, the retrieval model outputs a single score for a document based on its belief on how relevant it is to a given search query. While retrieval models have continued to improve with the introduction of increasingly complex architectures, few works have investigated a retrieval model's belief in the score beyond the scope of a single value. We argue that capturing the model's un…

    Submitted 10 May, 2021; originally announced May 2021.

    Comments: ACM SIGIR preprint

  48. arXiv:2102.06779  [pdf, other]

    cs.LG

    Machine Learning for Mechanical Ventilation Control

    Authors: Daniel Suo, Naman Agarwal, Wenhan Xia, Xinyi Chen, Udaya Ghai, Alexander Yu, Paula Gradu, Karan Singh, Cyril Zhang, Edgar Minasyan, Julienne LaChance, Tom Zajdel, Manuel Schottdorf, Daniel Cohen, Elad Hazan

    Abstract: We consider the problem of controlling an invasive mechanical ventilator for pressure-controlled ventilation: a controller must let air in and out of a sedated patient's lungs according to a trajectory of airway pressures specified by a clinician. Hand-tuned PID controllers and similar variants have comprised the industry standard for decades, yet can behave poorly by over- or under-shooting their…

    Submitted 18 January, 2022; v1 submitted 12 February, 2021; originally announced February 2021.

  49. arXiv:2102.06108  [pdf, other]

    cs.CV eess.IV

    SWAGAN: A Style-based Wavelet-driven Generative Model

    Authors: Rinon Gal, Dana Cohen, Amit Bermano, Daniel Cohen-Or

    Abstract: In recent years, considerable progress has been made in the visual quality of Generative Adversarial Networks (GANs). Even so, these networks still suffer from degradation in quality for high-frequency content, stemming from a spectrally biased architecture, and similarly unfavorable loss functions. To address this issue, we present a novel general-purpose Style and WAvelet based GAN (SWAGAN) that…

    Submitted 11 February, 2021; originally announced February 2021.

  50. arXiv:2101.03549  [pdf, other]

    eess.IV cs.CV cs.LG

    Learning Rotation Invariant Features for Cryogenic Electron Microscopy Image Reconstruction

    Authors: Koby Bibas, Gili Weiss-Dicker, Dana Cohen, Noa Cahan, Hayit Greenspan

    Abstract: Cryo-Electron Microscopy (Cryo-EM) is a Nobel prize-winning technology for determining the 3D structure of particles at near-atomic resolution. A fundamental step in the recovering of the 3D single-particle structure is to align its 2D projections; thus, the construction of a canonical representation with a fixed rotation angle is required. Most approaches use discrete clustering which fails to ca…

    Submitted 10 January, 2021; originally announced January 2021.

    Comments: Accepted IEEE-ISBI 2021