Skip to main content

Showing 1–50 of 1,035 results for author: Huang, C

Searching in archive cs. Search in all archives.
.
  1. arXiv:2408.08821  [pdf, other

    cs.IR cs.AI

    EasyRec: Simple yet Effective Language Models for Recommendation

    Authors: Xubin Ren, Chao Huang

    Abstract: Deep neural networks have become a powerful technique for learning representations from user-item interaction data in collaborative filtering (CF) for recommender systems. However, many existing methods heavily rely on unique user and item IDs, which limits their ability to perform well in practical zero-shot learning scenarios where sufficient training data may be unavailable. Inspired by the suc… ▽ More

    Submitted 16 August, 2024; originally announced August 2024.

  2. arXiv:2408.08592  [pdf, other

    cs.RO

    Case Study: Runtime Safety Verification of Neural Network Controlled System

    Authors: Frank Yang, Sinong Simon Zhan, Yixuan Wang, Chao Huang, Qi Zhu

    Abstract: Neural networks are increasingly used in safety-critical applications such as robotics and autonomous vehicles. However, the deployment of neural-network-controlled systems (NNCSs) raises significant safety concerns. Many recent advances overlook critical aspects of verifying control and ensuring safety in real-time scenarios. This paper presents a case study on using POLAR-Express, a state-of-the… ▽ More

    Submitted 16 August, 2024; originally announced August 2024.

    Comments: 15 pages, 5 figures, submitted to Runtime Verification 2024

  3. arXiv:2408.07875  [pdf, other

    cs.LG stat.ML

    Incremental Structure Discovery of Classification via Sequential Monte Carlo

    Authors: Changze Huang, Di Wang

    Abstract: Gaussian Processes (GPs) provide a powerful framework for making predictions and understanding uncertainty for classification with kernels and Bayesian non-parametric learning. Building such models typically requires strong prior knowledge to define preselect kernels, which could be ineffective for online applications of classification that sequentially process data because features of data may sh… ▽ More

    Submitted 14 August, 2024; originally announced August 2024.

  4. arXiv:2408.04998  [pdf, other

    cs.CL cs.AI

    ProFuser: Progressive Fusion of Large Language Models

    Authors: Tianyuan Shi, Fanqi Wan, Canbin Huang, Xiaojun Quan, Chenliang Li, Ming Yan, Ji Zhang

    Abstract: While fusing the capacities and advantages of various large language models (LLMs) offers a pathway to construct more powerful and versatile models, a fundamental challenge is to properly select advantageous model during the training. Existing fusion methods primarily focus on the training mode that uses cross entropy on ground truth in a teacher-forcing setup to measure a model's advantage, which… ▽ More

    Submitted 9 August, 2024; originally announced August 2024.

  5. arXiv:2408.02501  [pdf, ps, other

    cs.IT eess.SP

    Fair Resource Allocation For Hierarchical Federated Edge Learning in Space-Air-Ground Integrated Networks via Deep Reinforcement Learning with Hybrid Control

    Authors: Chong Huang, Gaojie Chen, Pei Xiao, Jonathon A. Chambers, Wei Huang

    Abstract: The space-air-ground integrated network (SAGIN) has become a crucial research direction in future wireless communications due to its ubiquitous coverage, rapid and flexible deployment, and multi-layer cooperation capabilities. However, integrating hierarchical federated learning (HFL) with edge computing and SAGINs remains a complex open issue to be resolved. This paper proposes a novel framework… ▽ More

    Submitted 5 August, 2024; originally announced August 2024.

    Comments: Accepted for publication in IEEE Journal on Selected Areas in Communications

  6. arXiv:2407.21693  [pdf, other

    cs.AI

    TransferTOD: A Generalizable Chinese Multi-Domain Task-Oriented Dialogue System with Transfer Capabilities

    Authors: Ming Zhang, Caishuang Huang, Yilong Wu, Shichun Liu, Huiyuan Zheng, Yurui Dong, Yujiong Shen, Shihan Dou, Jun Zhao, Junjie Ye, Qi Zhang, Tao Gui, Xuanjing Huang

    Abstract: Task-oriented dialogue (TOD) systems aim to efficiently handle task-oriented conversations, including information collection. How to utilize TOD accurately, efficiently and effectively for information collection has always been a critical and challenging task. Recent studies have demonstrated that Large Language Models (LLMs) excel in dialogue, instruction generation, and reasoning, and can signif… ▽ More

    Submitted 7 August, 2024; v1 submitted 31 July, 2024; originally announced July 2024.

  7. arXiv:2407.21301  [pdf, ps, other

    cs.IT eess.SP

    Integrated Sensing and Communication in IRS-assisted High-Mobility Systems: Design, Analysis and Optimization

    Authors: Xingyu Peng, Qin Tao, Xiaoling Hu, Richeng Jin, Chongwen Huang, Xiaoming Chen

    Abstract: In this paper, we investigate integrated sensing and communication (ISAC) in high-mobility systems with the aid of an intelligent reflecting surface (IRS). To exploit the benefits of Delay-Doppler (DD) spread caused by high mobility, orthogonal time frequency space (OTFS)-based frame structure and transmission framework are proposed. {In such a framework,} we first design a low-complexity ratio-ba… ▽ More

    Submitted 30 July, 2024; originally announced July 2024.

    Comments: 15 pages, 9 figures

  8. arXiv:2407.17663  [pdf, other

    cs.CR cs.LG

    Explaining the Model, Protecting Your Data: Revealing and Mitigating the Data Privacy Risks of Post-Hoc Model Explanations via Membership Inference

    Authors: Catherine Huang, Martin Pawelczyk, Himabindu Lakkaraju

    Abstract: Predictive machine learning models are becoming increasingly deployed in high-stakes contexts involving sensitive personal data; in these contexts, there is a trade-off between model explainability and data privacy. In this work, we push the boundaries of this trade-off: with a focus on foundation models for image classification fine-tuning, we reveal unforeseen privacy risks of post-hoc model exp… ▽ More

    Submitted 24 July, 2024; originally announced July 2024.

    Comments: ICML 2024 Workshop on the Next Generation of AI Safety

  9. arXiv:2407.16364  [pdf, other

    cs.CV

    Harmonizing Visual Text Comprehension and Generation

    Authors: Zhen Zhao, Jingqun Tang, Binghong Wu, Chunhui Lin, Shu Wei, Hao Liu, Xin Tan, Zhizhong Zhang, Can Huang, Yuan Xie

    Abstract: In this work, we present TextHarmony, a unified and versatile multimodal generative model proficient in comprehending and generating visual text. Simultaneously generating images and texts typically results in performance degradation due to the inherent inconsistency between vision and language modalities. To overcome this challenge, existing approaches resort to modality-specific data for supervi… ▽ More

    Submitted 23 July, 2024; originally announced July 2024.

  10. arXiv:2407.15782  [pdf, ps, other

    cs.IT eess.SP

    Reconfigurable Intelligent Surface Empowered Full Duplex Systems: Opportunities and Challenges

    Authors: Chong Huang, Yun Wen, Long Zhang, Gaojie Chen, Zhen Gao, Pei Xiao

    Abstract: Reconfigurable intelligent surfaces (RISs) have emerged as a promising technology in wireless communications. Simultaneously transmitting and reflecting RIS (STAR-RISs) in particular have garnered significant attention due to their dual capabilities of simultaneous transmission and reflection, underscoring their potential applications in critical scenarios within the forthcoming sixth-generation (… ▽ More

    Submitted 22 July, 2024; originally announced July 2024.

    Comments: Accepted for publication in IEEE Communications Standards Magazine

  11. arXiv:2407.13083  [pdf, other

    cs.SD cs.CV eess.AS

    Modeling and Driving Human Body Soundfields through Acoustic Primitives

    Authors: Chao Huang, Dejan Markovic, Chenliang Xu, Alexander Richard

    Abstract: While rendering and animation of photorealistic 3D human body models have matured and reached an impressive quality over the past years, modeling the spatial audio associated with such full body models has been largely ignored so far. In this work, we present a framework that allows for high-quality spatial audio generation, capable of rendering the full 3D soundfield generated by a human body, in… ▽ More

    Submitted 20 July, 2024; v1 submitted 17 July, 2024; originally announced July 2024.

    Comments: ECCV 2024. Project Page: https://wikichao.github.io/Acoustic-Primitives/

  12. arXiv:2407.09709  [pdf, other

    cs.LG cs.CL

    GOFA: A Generative One-For-All Model for Joint Graph Language Modeling

    Authors: Lecheng Kong, Jiarui Feng, Hao Liu, Chengsong Huang, Jiaxin Huang, Yixin Chen, Muhan Zhang

    Abstract: Foundation models, such as Large Language Models (LLMs) or Large Vision Models (LVMs), have emerged as one of the most powerful tools in the respective fields. However, unlike text and image data, graph data do not have a definitive structure, posing great challenges to developing a Graph Foundation Model (GFM). For example, current attempts at designing general graph models either transform graph… ▽ More

    Submitted 12 July, 2024; originally announced July 2024.

  13. arXiv:2407.09250  [pdf

    cs.NI cs.LG

    FedsLLM: Federated Split Learning for Large Language Models over Communication Networks

    Authors: Kai Zhao, Zhaohui Yang, Chongwen Huang, Xiaoming Chen, Zhaoyang Zhang

    Abstract: Addressing the challenges of deploying large language models in wireless communication networks, this paper combines low-rank adaptation technology (LoRA) with the splitfed learning framework to propose the federated split learning for large language models (FedsLLM) framework. The method introduced in this paper utilizes LoRA technology to reduce processing loads by dividing the network into clie… ▽ More

    Submitted 12 July, 2024; originally announced July 2024.

  14. arXiv:2407.07723  [pdf, other

    cs.IT cs.AI

    Understanding is Compression

    Authors: Ziguang Li, Chao Huang, Xuliang Wang, Haibo Hu, Cole Wyeth, Dongbo Bu, Quan Yu, Wen Gao, Xingwu Liu, Ming Li

    Abstract: We have previously shown all understanding or learning are compression, under reasonable assumptions. In principle, better understanding of data should improve data compression. Traditional compression methodologies focus on encoding frequencies or some other computable properties of data. Large language models approximate the uncomputable Solomonoff distribution, opening up a whole new avenue to… ▽ More

    Submitted 23 June, 2024; originally announced July 2024.

  15. arXiv:2407.06153  [pdf, other

    cs.SE cs.CL

    What's Wrong with Your Code Generated by Large Language Models? An Extensive Study

    Authors: Shihan Dou, Haoxiang Jia, Shenxi Wu, Huiyuan Zheng, Weikang Zhou, Muling Wu, Mingxu Chai, Jessica Fan, Caishuang Huang, Yunbo Tao, Yan Liu, Enyu Zhou, Ming Zhang, Yuhao Zhou, Yueming Wu, Rui Zheng, Ming Wen, Rongxiang Weng, Jingang Wang, Xunliang Cai, Tao Gui, Xipeng Qiu, Qi Zhang, Xuanjing Huang

    Abstract: The increasing development of large language models (LLMs) in code generation has drawn significant attention among researchers. To enhance LLM-based code generation ability, current efforts are predominantly directed towards collecting high-quality datasets and leveraging diverse training technologies. However, there is a notable lack of comprehensive studies examining the limitations and boundar… ▽ More

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 17 pages, 7 figures

  16. arXiv:2407.06094  [pdf, ps, other

    cs.RO

    ERR@HRI 2024 Challenge: Multimodal Detection of Errors and Failures in Human-Robot Interactions

    Authors: Micol Spitale, Maria Teresa Parreira, Maia Stiber, Minja Axelsson, Neval Kara, Garima Kankariya, Chien-Ming Huang, Malte Jung, Wendy Ju, Hatice Gunes

    Abstract: Despite the recent advancements in robotics and machine learning (ML), the deployment of autonomous robots in our everyday lives is still an open challenge. This is due to multiple reasons among which are their frequent mistakes, such as interrupting people or having delayed responses, as well as their limited ability to understand human speech, i.e., failure in tasks like transcribing speech to t… ▽ More

    Submitted 8 July, 2024; originally announced July 2024.

  17. arXiv:2407.05873  [pdf, other

    eess.SP cs.IT

    Receiver Selection and Transmit Beamforming for Multi-static Integrated Sensing and Communications

    Authors: Dan Wang, Yuanming Tian, Chuan Huang, Hao Chen, Xiaodong Xu, Ping Zhang

    Abstract: Next-generation wireless networks are expected to develop a novel paradigm of integrated sensing and communications (ISAC) to enable both the high-accuracy sensing and high-speed communications. However, conventional mono-static ISAC systems, which simultaneously transmit and receive at the same equipment, may suffer from severe self-interference, and thus significantly degrade the system performa… ▽ More

    Submitted 8 July, 2024; originally announced July 2024.

  18. arXiv:2407.05840  [pdf, other

    cs.ET physics.optics

    A 103-TOPS/mm$^2$ Integrated Photonic Computing Engine Enabling Next-Generation Reservoir Computing

    Authors: Dongliang Wang, Yikun Nie, Gaolei Hu, Hon Ki Tsang, Chaoran Huang

    Abstract: Reservoir computing (RC) is a leading machine learning algorithm for information processing due to its rich expressiveness. A new RC paradigm has recently emerged, showcasing superior performance and delivering more interpretable results with shorter training data sets and training times, representing the next generation of RC computing. This work presents the first realization of a high-speed nex… ▽ More

    Submitted 31 May, 2024; originally announced July 2024.

  19. arXiv:2407.05249  [pdf, ps, other

    cs.IT eess.SP

    RIS-assisted Coverage Enhancement in mmWave Integrated Sensing and Communication Networks

    Authors: Xu Gan, Chongwen Huang, Zhaohui Yang, Xiaoming Chen, Faouzi Bader, Zhaoyang Zhang, Chau Yuen, Yong Liang Guan, Merouane Debbah

    Abstract: Integrated sensing and communication (ISAC) has emerged as a promising technology to facilitate high-rate communications and super-resolution sensing, particularly operating in the millimeter wave (mmWave) band. However, the vulnerability of mmWave signals to blockages severely impairs ISAC capabilities and coverage. To tackle this, an efficient and low-cost solution is to deploy distributed recon… ▽ More

    Submitted 6 July, 2024; originally announced July 2024.

  20. arXiv:2407.03475  [pdf, other

    cs.LG

    How JEPA Avoids Noisy Features: The Implicit Bias of Deep Linear Self Distillation Networks

    Authors: Etai Littwin, Omid Saremi, Madhu Advani, Vimal Thilak, Preetum Nakkiran, Chen Huang, Joshua Susskind

    Abstract: Two competing paradigms exist for self-supervised learning of data representations. Joint Embedding Predictive Architecture (JEPA) is a class of architectures in which semantically similar inputs are encoded into representations that are predictive of each other. A recent successful approach that falls under the JEPA framework is self-distillation, where an online encoder is trained to predict the… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Technical report

  21. arXiv:2407.03169  [pdf, other

    cs.CL cs.SD eess.AS

    Investigating Decoder-only Large Language Models for Speech-to-text Translation

    Authors: Chao-Wei Huang, Hui Lu, Hongyu Gong, Hirofumi Inaguma, Ilia Kulikov, Ruslan Mavlyutov, Sravya Popuri

    Abstract: Large language models (LLMs), known for their exceptional reasoning capabilities, generalizability, and fluency across diverse domains, present a promising avenue for enhancing speech-related tasks. In this paper, we focus on integrating decoder-only LLMs to the task of speech-to-text translation (S2TT). We propose a decoder-only architecture that enables the LLM to directly consume the encoded sp… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Accepted to Interspeech 2024

  22. arXiv:2407.03007  [pdf, other

    cs.CL cs.AI

    What Affects the Stability of Tool Learning? An Empirical Study on the Robustness of Tool Learning Frameworks

    Authors: Chengrui Huang, Zhengliang Shi, Yuntao Wen, Xiuying Chen, Peng Han, Shen Gao, Shuo Shang

    Abstract: Tool learning methods have enhanced the ability of large language models (LLMs) to interact with real-world applications. Many existing works fine-tune LLMs or design prompts to enable LLMs to select appropriate tools and correctly invoke them to meet user requirements. However, it is observed in previous works that the performance of tool learning varies from tasks, datasets, training settings, a… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 19 pages, 9 figures

  23. arXiv:2407.02680  [pdf, other

    cs.SE

    KGym: A Platform and Dataset to Benchmark Large Language Models on Linux Kernel Crash Resolution

    Authors: Alex Mathai, Chenxi Huang, Petros Maniatis, Aleksandr Nogikh, Franjo Ivancic, Junfeng Yang, Baishakhi Ray

    Abstract: Large Language Models (LLMs) are consistently improving at increasingly realistic software engineering (SE) tasks. In real-world software stacks, significant SE effort is spent developing foundational system software like the Linux kernel. Unlike application-level software, a systems codebase like Linux is multilingual (low-level C/Assembly/Bash/Rust); gigantic (>20 million lines); critical (impac… ▽ More

    Submitted 8 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

  24. arXiv:2407.01976  [pdf, other

    cs.CL cs.AI cs.MM

    A Bounding Box is Worth One Token: Interleaving Layout and Text in a Large Language Model for Document Understanding

    Authors: Jinghui Lu, Haiyang Yu, Yanjie Wang, Yongjie Ye, Jingqun Tang, Ziwei Yang, Binghong Wu, Qi Liu, Hao Feng, Han Wang, Hao Liu, Can Huang

    Abstract: Recently, many studies have demonstrated that exclusively incorporating OCR-derived text and spatial layouts with large language models (LLMs) can be highly effective for document understanding tasks. However, existing methods that integrate spatial layouts with text have limitations, such as producing overly long text sequences or failing to fully leverage the autoregressive traits of LLMs. In th… ▽ More

    Submitted 24 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

  25. arXiv:2407.01926  [pdf

    physics.med-ph cs.CV

    Chemical Shift Encoding based Double Bonds Quantification in Triglycerides using Deep Image Prior

    Authors: Chaoxing Huang, Ziqiang Yu, Zijian Gao, Qiuyi Shen, Queenie Chan, Vincent Wai-Sun Wong, Winnie Chiu-Wing Chu, Weitian Chen

    Abstract: This study evaluated a deep learning-based method using Deep Image Prior (DIP) to quantify triglyceride double bonds from chemical-shift encoded multi-echo gradient echo images without network training. We employed a cost function based on signal constraints to iteratively update the neural network on a single dataset. The method was validated using phantom experiments and in vivo scans. Results s… ▽ More

    Submitted 25 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

  26. arXiv:2407.01081  [pdf, other

    cs.CV cs.CL

    CVLUE: A New Benchmark Dataset for Chinese Vision-Language Understanding Evaluation

    Authors: Yuxuan Wang, Yijun Liu, Fei Yu, Chen Huang, Kexin Li, Zhiguo Wan, Wanxiang Che

    Abstract: Despite the rapid development of Chinese vision-language models (VLMs), most existing Chinese vision-language (VL) datasets are constructed on Western-centric images from existing English VL datasets. The cultural bias in the images makes these datasets unsuitable for evaluating VLMs in Chinese culture. To remedy this issue, we present a new Chinese Vision- Language Understanding Evaluation (CVLUE… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

  27. arXiv:2406.18118  [pdf, other

    cs.CR cs.CL

    SafeAligner: Safety Alignment against Jailbreak Attacks via Response Disparity Guidance

    Authors: Caishuang Huang, Wanxu Zhao, Rui Zheng, Huijie Lv, Shihan Dou, Sixian Li, Xiao Wang, Enyu Zhou, Junjie Ye, Yuming Yang, Tao Gui, Qi Zhang, Xuanjing Huang

    Abstract: As the development of large language models (LLMs) rapidly advances, securing these models effectively without compromising their utility has become a pivotal area of research. However, current defense strategies against jailbreak attacks (i.e., efforts to bypass security protocols) often suffer from limited adaptability, restricted general capability, and high cost. To address these challenges, w… ▽ More

    Submitted 28 June, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

  28. arXiv:2406.14307  [pdf, other

    q-bio.GN cs.CL cs.CV

    QuST-LLM: Integrating Large Language Models for Comprehensive Spatial Transcriptomics Analysis

    Authors: Chao Hui Huang

    Abstract: In this paper, we introduce QuST-LLM, an innovative extension of QuPath that utilizes the capabilities of large language models (LLMs) to analyze and interpret spatial transcriptomics (ST) data. In addition to simplifying the intricate and high-dimensional nature of ST data by offering a comprehensive workflow that includes data loading, region selection, gene expression analysis, and functional a… ▽ More

    Submitted 1 July, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

    Comments: 12 pages, 7 figures

  29. arXiv:2406.14171  [pdf, other

    cs.AI cs.CL

    Ranking LLMs by compression

    Authors: Peijia Guo, Ziguang Li, Haibo Hu, Chao Huang, Ming Li, Rui Zhang

    Abstract: We conceptualize the process of understanding as information compression, and propose a method for ranking large language models (LLMs) based on lossless data compression. We demonstrate the equivalence of compression length under arithmetic coding with cumulative negative log probabilities when using a large language model as a prior, that is, the pre-training phase of the model is essentially th… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 7 pages, 4 tables

  30. arXiv:2406.14163  [pdf, other

    cs.DB stat.ME

    A Unified Statistical And Computational Framework For Ex-Post Harmonisation Of Aggregate Statistics

    Authors: Cynthia A. Huang

    Abstract: Ex-post harmonisation is one of many data preprocessing processes used to combine the increasingly vast and diverse sources of data available for research and analysis. Documenting provenance and ensuring the quality of multi-source datasets is vital for ensuring trustworthy scientific research and encouraging reuse of existing harmonisation efforts. However, capturing and communicating statistica… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

  31. arXiv:2406.12787  [pdf, other

    cs.CL cs.HC

    Generating Educational Materials with Different Levels of Readability using LLMs

    Authors: Chieh-Yang Huang, Jing Wei, Ting-Hao 'Kenneth' Huang

    Abstract: This study introduces the leveled-text generation task, aiming to rewrite educational materials to specific readability levels while preserving meaning. We assess the capability of GPT-3.5, LLaMA-2 70B, and Mixtral 8x7B, to generate content at various readability levels through zero-shot and few-shot prompting. Evaluating 100 processed educational materials reveals that few-shot prompting signific… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: In2Writing 2024

  32. arXiv:2406.11781  [pdf, other

    cs.IR

    DiffMM: Multi-Modal Diffusion Model for Recommendation

    Authors: Yangqin Jiang, Lianghao Xia, Wei Wei, Da Luo, Kangyi Lin, Chao Huang

    Abstract: The rise of online multi-modal sharing platforms like TikTok and YouTube has enabled personalized recommender systems to incorporate multiple modalities (such as visual, textual, and acoustic) into user representations. However, addressing the challenge of data sparsity in these systems remains a key issue. To address this limitation, recent research has introduced self-supervised learning techniq… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

  33. arXiv:2406.11208  [pdf

    cs.NI

    Privacy-preserving Pseudonym Schemes for Personalized 3D Avatars in Mobile Social Metaverses

    Authors: Cheng Su, Xiaofeng Luo, Zhenmou Liu, Jiawen Kang, Min Hao, Zehui Xiong, Zhaohui Yang, Chongwen Huang

    Abstract: The emergence of mobile social metaverses, a novel paradigm bridging physical and virtual realms, has led to the widespread adoption of avatars as digital representations for Social Metaverse Users (SMUs) within virtual spaces. Equipped with immersive devices, SMUs leverage Edge Servers (ESs) to deploy their avatars and engage with other SMUs in virtual spaces. To enhance immersion, SMUs incline t… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 6pages, 4 figures

  34. arXiv:2406.11192  [pdf, other

    cs.CL

    Beyond Boundaries: Learning a Universal Entity Taxonomy across Datasets and Languages for Open Named Entity Recognition

    Authors: Yuming Yang, Wantong Zhao, Caishuang Huang, Junjie Ye, Xiao Wang, Huiyuan Zheng, Yang Nan, Yuran Wang, Xueying Xu, Kaixin Huang, Yunke Zhang, Tao Gui, Qi Zhang, Xuanjing Huang

    Abstract: Open Named Entity Recognition (NER), which involves identifying arbitrary types of entities from arbitrary domains, remains challenging for Large Language Models (LLMs). Recent studies suggest that fine-tuning LLMs on extensive NER data can boost their performance. However, training directly on existing datasets faces issues due to inconsistent entity definitions and redundant data, limiting LLMs… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: 20 pages. Project page: https://github.com/UmeanNever/B2NER

  35. arXiv:2406.08810  [pdf, other

    cs.CV

    Few-Shot Anomaly Detection via Category-Agnostic Registration Learning

    Authors: Chaoqin Huang, Haoyan Guan, Aofan Jiang, Yanfeng Wang, Michael Spratling, Xinchao Wang, Ya Zhang

    Abstract: Most existing anomaly detection methods require a dedicated model for each category. Such a paradigm, despite its promising results, is computationally expensive and inefficient, thereby failing to meet the requirements for real-world applications. Inspired by how humans detect anomalies, by comparing a query image to known normal ones, this paper proposes a novel few-shot anomaly detection (FSAD)… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  36. arXiv:2406.08754  [pdf, other

    cs.CL cs.CR

    Exploiting Uncommon Text-Encoded Structures for Automated Jailbreaks in LLMs

    Authors: Bangxin Li, Hengrui Xing, Chao Huang, Jin Qian, Huangqing Xiao, Linfeng Feng, Cong Tian

    Abstract: Large Language Models (LLMs) are widely used in natural language processing but face the risk of jailbreak attacks that maliciously induce them to generate harmful content. Existing jailbreak attacks, including character-level and context-level attacks, mainly focus on the prompt of the plain text without specifically exploring the significant influence of its structure. In this paper, we focus on… ▽ More

    Submitted 19 July, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: 12 pages, 4 figures

  37. arXiv:2406.08404  [pdf, other

    cs.LG cs.AI

    Scaling Value Iteration Networks to 5000 Layers for Extreme Long-Term Planning

    Authors: Yuhui Wang, Qingyuan Wu, Weida Li, Dylan R. Ashley, Francesco Faccio, Chao Huang, Jürgen Schmidhuber

    Abstract: The Value Iteration Network (VIN) is an end-to-end differentiable architecture that performs value iteration on a latent MDP for planning in reinforcement learning (RL). However, VINs struggle to scale to long-term and large-scale planning tasks, such as navigating a $100\times 100$ maze -- a task which typically requires thousands of planning steps to solve. We observe that this deficiency is due… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    ACM Class: I.2.6

  38. arXiv:2406.08124  [pdf, other

    cs.CL cs.AI

    Legend: Leveraging Representation Engineering to Annotate Safety Margin for Preference Datasets

    Authors: Duanyu Feng, Bowen Qin, Chen Huang, Youcheng Huang, Zheng Zhang, Wenqiang Lei

    Abstract: The success of the reward model in distinguishing between responses with subtle safety differences depends critically on the high-quality preference dataset, which should capture the fine-grained nuances of harmful and harmless responses. This motivates the need to develop a dataset involving preference margins, which accurately quantify how harmless one response is compared to another. In this pa… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Our code is available at https://github.com/colfeng/Legend

  39. arXiv:2406.07498  [pdf, other

    cs.SD eess.AS

    RaD-Net 2: A causal two-stage repairing and denoising speech enhancement network with knowledge distillation and complex axial self-attention

    Authors: Mingshuai Liu, Zhuangqi Chen, Xiaopeng Yan, Yuanjun Lv, Xianjun Xia, Chuanzeng Huang, Yijian Xiao, Lei Xie

    Abstract: In real-time speech communication systems, speech signals are often degraded by multiple distortions. Recently, a two-stage Repair-and-Denoising network (RaD-Net) was proposed with superior speech quality improvement in the ICASSP 2024 Speech Signal Improvement (SSI) Challenge. However, failure to use future information and constraint receptive field of convolution layers limit the system's perfor… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  40. arXiv:2406.06516  [pdf, other

    stat.ME cs.LG stat.ML

    Distribution-Free Predictive Inference under Unknown Temporal Drift

    Authors: Elise Han, Chengpiao Huang, Kaizheng Wang

    Abstract: Distribution-free prediction sets play a pivotal role in uncertainty quantification for complex statistical models. Their validity hinges on reliable calibration data, which may not be readily available as real-world environments often undergo unknown changes over time. In this paper, we propose a strategy for choosing an adaptive window and use the data therein to construct prediction sets. The w… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 25 pages, 4 figures, 6 tables

  41. arXiv:2406.06110  [pdf, other

    cs.CL cs.AI

    Recurrent Context Compression: Efficiently Expanding the Context Window of LLM

    Authors: Chensen Huang, Guibo Zhu, Xuepeng Wang, Yifei Luo, Guojing Ge, Haoran Chen, Dong Yi, Jinqiao Wang

    Abstract: To extend the context length of Transformer-based large language models (LLMs) and improve comprehension capabilities, we often face limitations due to computational resources and bounded memory storage capacity. This work introduces a method called Recurrent Context Compression (RCC), designed to efficiently expand the context window length of LLMs within constrained storage space. We also invest… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

  42. arXiv:2406.04368  [pdf, other

    cs.CL cs.AI cs.CY

    SocialNLP Fake-EmoReact 2021 Challenge Overview: Predicting Fake Tweets from Their Replies and GIFs

    Authors: Chien-Kun Huang, Yi-Ting Chang, Lun-Wei Ku, Cheng-Te Li, Hong-Han Shuai

    Abstract: This paper provides an overview of the Fake-EmoReact 2021 Challenge, held at the 9th SocialNLP Workshop, in conjunction with NAACL 2021. The challenge requires predicting the authenticity of tweets using reply context and augmented GIF categories from EmotionGIF dataset. We offer the Fake-EmoReact dataset with more than 453k as the experimental materials, where every tweet is labeled with authenti… ▽ More

    Submitted 31 May, 2024; originally announced June 2024.

  43. arXiv:2406.02377  [pdf, other

    cs.IR cs.AI cs.CL

    XRec: Large Language Models for Explainable Recommendation

    Authors: Qiyao Ma, Xubin Ren, Chao Huang

    Abstract: Recommender systems help users navigate information overload by providing personalized recommendations aligned with their preferences. Collaborative Filtering (CF) is a widely adopted approach, but while advanced techniques like graph neural networks (GNNs) and self-supervised learning (SSL) have enhanced CF models for better user representations, they often lack the ability to provide explanation… ▽ More

    Submitted 4 June, 2024; originally announced June 2024.

  44. arXiv:2406.02161  [pdf, other

    cs.RO eess.SP

    An Observability-Constrained Magnetic-Field-Aided Inertial Navigation System

    Authors: Chuan Huang, Gustaf Hendeby, Isaac Skog

    Abstract: A method to construct an observability-constrained magnetic-field-aided inertial navigation system is proposed. The proposed method builds upon the previously proposed observability-constrained extended Kalman filter and extends it to work with a magnetic-field-based odometry-aided inertial navigation system. The proposed method is evaluated using simulation and real-world data, showing that (i) t… ▽ More

    Submitted 4 June, 2024; originally announced June 2024.

  45. arXiv:2406.01919  [pdf, other

    cs.CL

    OTTAWA: Optimal TransporT Adaptive Word Aligner for Hallucination and Omission Translation Errors Detection

    Authors: Chenyang Huang, Abbas Ghaddar, Ivan Kobyzev, Mehdi Rezagholizadeh, Osmar R. Zaiane, Boxing Chen

    Abstract: Recently, there has been considerable attention on detecting hallucinations and omissions in Machine Translation (MT) systems. The two dominant approaches to tackle this task involve analyzing the MT system's internal states or relying on the output of external tools, such as sentence similarity or MT quality estimators. In this work, we introduce OTTAWA, a novel Optimal Transport (OT)-based word… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: Accepted by ACL 2024 Findings

  46. arXiv:2406.01629  [pdf, other

    cs.IR cs.AI cs.SI

    RecDiff: Diffusion Model for Social Recommendation

    Authors: Zongwei Li, Lianghao Xia, Chao Huang

    Abstract: Social recommendation has emerged as a powerful approach to enhance personalized recommendations by leveraging the social connections among users, such as following and friend relations observed in online social platforms. The fundamental assumption of social recommendation is that socially-connected users exhibit homophily in their preference patterns. This means that users connected by social ti… ▽ More

    Submitted 1 June, 2024; originally announced June 2024.

  47. arXiv:2406.01613  [pdf, other

    q-bio.QM cs.CV eess.IV

    QuST: QuPath Extension for Integrative Whole Slide Image and Spatial Transcriptomics Analysis

    Authors: Chao-Hui Huang

    Abstract: Recently, various technologies have been introduced into digital pathology, including artificial intelligence (AI) driven methods, in both areas of pathological whole slide image (WSI) analysis and spatial transcriptomics (ST) analysis. AI-driven WSI analysis utilizes the power of deep learning (DL), expands the field of view for histopathological image analysis. On the other hand, ST bridges the… ▽ More

    Submitted 1 July, 2024; v1 submitted 30 May, 2024; originally announced June 2024.

    Comments: 9 pages, 4 figures

  48. arXiv:2406.01436  [pdf, other

    cs.CL

    Editing the Mind of Giants: An In-Depth Exploration of Pitfalls of Knowledge Editing in Large Language Models

    Authors: Cheng-Hsun Hsueh, Paul Kuo-Ming Huang, Tzu-Han Lin, Che-Wei Liao, Hung-Chieh Fang, Chao-Wei Huang, Yun-Nung Chen

    Abstract: Knowledge editing is a rising technique for efficiently updating factual knowledge in Large Language Models (LLMs) with minimal alteration of parameters. However, recent studies have identified concerning side effects, such as knowledge distortion and the deterioration of general abilities, that have emerged after editing. This survey presents a comprehensive study of these side effects, providing… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

  49. arXiv:2406.01331  [pdf, other

    cs.IT eess.SP

    Performance Trade-off of Integrated Sensing and Communications for Multi-User Backscatter Systems

    Authors: Yuanming Tian, Dan Wang, Chuan Huang, Wei Zhang

    Abstract: This paper studies the performance trade-off in a multi-user backscatter communication (BackCom) system for integrated sensing and communications (ISAC), where the multi-antenna ISAC transmitter sends excitation signals to power multiple single-antenna passive backscatter devices (BD), and the multi-antenna ISAC receiver performs joint sensing (localization) and communication tasks based on the ba… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

  50. arXiv:2406.01326  [pdf, other

    cs.CV

    TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy

    Authors: Weichao Zhao, Hao Feng, Qi Liu, Jingqun Tang, Shu Wei, Binghong Wu, Lei Liao, Yongjie Ye, Hao Liu, Houqiang Li, Can Huang

    Abstract: Tables contain factual and quantitative data accompanied by various structures and contents that pose challenges for machine comprehension. Previous methods generally design task-specific architectures and objectives for individual tasks, resulting in modal isolation and intricate workflows. In this paper, we present a novel large vision-language model, TabPedia, equipped with a concept synergy me… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 20 pages, 8 figures