Showing 1–50 of 98 results for author: Lan, Z

Searching in archive cs.
  1. arXiv:2408.15787  [pdf, other]

    cs.CL cs.IR

    Interactive Agents: Simulating Counselor-Client Psychological Counseling via Role-Playing LLM-to-LLM Interactions

    Authors: Huachuan Qiu, Zhenzhong Lan

    Abstract: Virtual counselors powered by large language models (LLMs) aim to create interactive support systems that effectively assist clients struggling with mental health challenges. To replicate counselor-client conversations, researchers have built an online mental health platform that allows professional counselors to provide clients with text-based counseling services for about an hour per session. No… ▽ More

    Submitted 28 August, 2024; originally announced August 2024.

  2. arXiv:2408.13459  [pdf, other]

    cs.CV

    Rethinking Video Deblurring with Wavelet-Aware Dynamic Transformer and Diffusion Model

    Authors: Chen Rao, Guangyuan Li, Zehua Lan, Jiakai Sun, Junsheng Luan, Wei Xing, Lei Zhao, Huaizhong Lin, Jianfeng Dong, Dalong Zhang

    Abstract: Current video deblurring methods have limitations in recovering high-frequency information since the regression losses are conservative with high-frequency details. Since Diffusion Models (DMs) have strong capabilities in generating high-frequency details, we consider introducing DMs into the video deblurring task. However, we found that directly applying DMs to the video deblurring task has the f… ▽ More

    Submitted 24 August, 2024; originally announced August 2024.

    Comments: accepted by ECCV2024

    ACM Class: I.4.4

  3. arXiv:2408.10496  [pdf, other]

    cs.CV

    GPT-based Textile Pilling Classification Using 3D Point Cloud Data

    Authors: Yu Lu, YuYu Chen, Gang Zhou, Zhenghua Lan

    Abstract: Textile pilling assessment is critical for textile quality control. We collect thousands of 3D point cloud images in the actual test environment of textiles and organize and label them as TextileNet8 dataset. To the best of our knowledge, it is the first publicly available eight-categories 3D point cloud dataset in the field of textile pilling assessment. Based on PointGPT, the GPT-like big model… ▽ More

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: 8 pages, 2 figures

  4. arXiv:2407.15083  [pdf, other]

    cs.LG

    Rocket Landing Control with Random Annealing Jump Start Reinforcement Learning

    Authors: Yuxuan Jiang, Yujie Yang, Zhiqian Lan, Guojian Zhan, Shengbo Eben Li, Qi Sun, Jian Ma, Tianwen Yu, Changwu Zhang

    Abstract: Rocket recycling is a crucial pursuit in aerospace technology, aimed at reducing costs and environmental impact in space exploration. The primary focus centers on rocket landing control, involving the guidance of a nonlinear underactuated rocket with limited fuel in real-time. This challenging task prompts the application of reinforcement learning (RL), yet goal-oriented nature of the problem pose… ▽ More

    Submitted 21 July, 2024; originally announced July 2024.

    Comments: IROS 2024 Oral

  5. arXiv:2407.04675  [pdf, other]

    eess.AS cs.SD

    Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition

    Authors: Ye Bai, Jingping Chen, Jitong Chen, Wei Chen, Zhuo Chen, Chuang Ding, Linhao Dong, Qianqian Dong, Yujiao Du, Kepan Gao, Lu Gao, Yi Guo, Minglun Han, Ting Han, Wenchao Hu, Xinying Hu, Yuxiang Hu, Deyu Hua, Lu Huang, Mingkun Huang, Youjia Huang, Jishuo Jin, Fanliu Kong, Zongwei Lan, Tianyu Li , et al. (30 additional authors not shown)

    Abstract: Modern automatic speech recognition (ASR) models are required to accurately transcribe diverse speech signals (from different domains, languages, accents, etc.) given the specific contextual information in various application scenarios. Classic end-to-end models fused with extra language models perform well, but mainly in data matching scenarios and are gradually approaching a bottleneck. In this wor… ▽ More

    Submitted 10 July, 2024; v1 submitted 5 July, 2024; originally announced July 2024.

  6. arXiv:2407.02894  [pdf, other]

    cs.CL cs.AI

    Translatotron-V(ison): An End-to-End Model for In-Image Machine Translation

    Authors: Zhibin Lan, Liqiang Niu, Fandong Meng, Jie Zhou, Min Zhang, Jinsong Su

    Abstract: In-image machine translation (IIMT) aims to translate an image containing texts in source language into an image containing translations in target language. In this regard, conventional cascaded methods suffer from issues such as error propagation, massive parameters, and difficulties in deployment and retaining visual characteristics of the input image. Thus, constructing end-to-end models has be… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Accepted to ACL 2024 Findings

  7. arXiv:2407.01894  [pdf, other]

    cs.CV cs.HC

    Adaptive Modality Balanced Online Knowledge Distillation for Brain-Eye-Computer based Dim Object Detection

    Authors: Zixing Li, Chao Yan, Zhen Lan, Xiaojia Xiang, Han Zhou, Jun Lai, Dengqing Tang

    Abstract: Advanced cognition can be extracted from the human brain using brain-computer interfaces. Integrating these interfaces with computer vision techniques, which possess efficient feature extraction capabilities, can achieve more robust and accurate detection of dim targets in aerial images. However, existing target detection methods primarily concentrate on homogeneous data, lacking efficient and ver… ▽ More

    Submitted 8 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

    Comments: 18 pages, 15 figures

  8. arXiv:2407.01638  [pdf, other]

    cs.SE cs.AI cs.DC cs.PL

    LASSI: An LLM-based Automated Self-Correcting Pipeline for Translating Parallel Scientific Codes

    Authors: Matthew T. Dearing, Yiheng Tao, Xingfu Wu, Zhiling Lan, Valerie Taylor

    Abstract: This paper addresses the problem of providing a novel approach to sourcing significant training data for LLMs focused on science and engineering. In particular, a crucial challenge is sourcing parallel scientific codes in the ranges of millions to billions of codes. To tackle this problem, we propose an automated pipeline framework, called LASSI, designed to translate between parallel programming… ▽ More

    Submitted 30 June, 2024; originally announced July 2024.

  9. arXiv:2406.17287  [pdf, other]

    cs.CL cs.AI

    Predicting the Big Five Personality Traits in Chinese Counselling Dialogues Using Large Language Models

    Authors: Yang Yan, Lizhi Ma, Anqi Li, Jingsong Ma, Zhenzhong Lan

    Abstract: Accurate assessment of personality traits is crucial for effective psycho-counseling, yet traditional methods like self-report questionnaires are time-consuming and biased. This study examines whether Large Language Models (LLMs) can predict the Big Five personality traits directly from counseling dialogues and introduces an innovative framework to perform the task. Our framework applies role-play an… ▽ More

    Submitted 25 June, 2024; originally announced June 2024.

  10. arXiv:2406.15097  [pdf, other]

    cs.NI

    Modeling and Analysis of Application Interference on Dragonfly+

    Authors: Yao Kang, Xin Wang, Neil McGlohon, Misbah Mubarak, Sudheer Chunduri, Zhiling Lan

    Abstract: The Dragonfly class of networks is considered a promising interconnect for next-generation supercomputers. While Dragonfly+ networks offer more path diversity than the original Dragonfly design, they are still prone to performance variability due to their hierarchical architecture and resource sharing design. Event-driven network simulators are indispensable tools for navigating complex system desi… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Accepted by SIGSIM PADS 2019

  11. arXiv:2406.15000  [pdf, other]

    cs.CL cs.AI

    Unveiling the Impact of Multi-Modal Interactions on User Engagement: A Comprehensive Evaluation in AI-driven Conversations

    Authors: Lichao Zhang, Jia Yu, Shuai Zhang, Long Li, Yangyang Zhong, Guanbao Liang, Yuming Yan, Qing Ma, Fangsheng Weng, Fayu Pan, Jing Li, Renjun Xu, Zhenzhong Lan

    Abstract: Large Language Models (LLMs) have significantly advanced user-bot interactions, enabling more complex and coherent dialogues. However, the prevalent text-only modality might not fully exploit the potential for effective user engagement. This paper explores the impact of multi-modal interactions, which incorporate images and audio alongside text, on user engagement in chatbot conversations. We cond… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

  12. arXiv:2405.18706  [pdf, other]

    cs.CV

    FocSAM: Delving Deeply into Focused Objects in Segmenting Anything

    Authors: You Huang, Zongyu Lan, Liujuan Cao, Xianming Lin, Shengchuan Zhang, Guannan Jiang, Rongrong Ji

    Abstract: The Segment Anything Model (SAM) marks a notable milestone in segmentation models, highlighted by its robust zero-shot capabilities and ability to handle diverse prompts. SAM follows a pipeline that separates interactive segmentation into image preprocessing through a large encoder and interactive inference via a lightweight decoder, ensuring efficient real-time performance. However, SAM faces sta… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: Accepted to CVPR 2024

  13. arXiv:2405.12669  [pdf, other]

    cs.CL

    A Survey on Multi-modal Machine Translation: Tasks, Methods and Challenges

    Authors: Huangjun Shen, Liangying Shao, Wenbo Li, Zhibin Lan, Zhanyu Liu, Jinsong Su

    Abstract: In recent years, multi-modal machine translation has attracted significant interest in both academia and industry due to its superior performance. It takes both textual and visual modalities as inputs, leveraging visual context to tackle the ambiguities in source texts. In this paper, we begin by offering an exhaustive overview of 99 prior works, comprehensively summarizing representative studies… ▽ More

    Submitted 22 May, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

  14. arXiv:2405.04909  [pdf, other]

    cs.CV cs.AI

    Traj-LLM: A New Exploration for Empowering Trajectory Prediction with Pre-trained Large Language Models

    Authors: Zhengxing Lan, Hongbo Li, Lingshan Liu, Bo Fan, Yisheng Lv, Yilong Ren, Zhiyong Cui

    Abstract: Predicting the future trajectories of dynamic traffic actors is a cornerstone task in autonomous driving. Though existing notable efforts have resulted in impressive performance improvements, a gap persists in scene cognition and understanding of complex traffic semantics. This paper proposes Traj-LLM, the first to investigate the potential of using Large Language Models (LLMs) without explici… ▽ More

    Submitted 8 May, 2024; originally announced May 2024.

  15. arXiv:2404.14070  [pdf]

    cs.HC cs.CY

    No General Code of Ethics for All: Ethical Considerations in Human-bot Psycho-counseling

    Authors: Lizhi Ma, Tong Zhao, Huachuan Qiu, Zhenzhong Lan

    Abstract: The pervasive use of AI applications is increasingly influencing our everyday decisions. However, the ethical challenges associated with AI transcend conventional ethics and single-discipline approaches. In this paper, we propose aspirational ethical principles specifically tailored for human-bot psycho-counseling during an era when AI-powered mental health services are continually emerging. We ex… ▽ More

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: 54 pages, 11 tables, APA style, the tables are presented following Reference

  16. arXiv:2404.13584  [pdf, other]

    cs.CV cs.LG

    Rethink Arbitrary Style Transfer with Transformer and Contrastive Learning

    Authors: Zhanjie Zhang, Jiakai Sun, Guangyuan Li, Lei Zhao, Quanwei Zhang, Zehua Lan, Haolin Yin, Wei Xing, Huaizhong Lin, Zhiwen Zuo

    Abstract: Arbitrary style transfer holds widespread attention in research and boasts numerous practical applications. The existing methods, which either employ cross-attention to incorporate deep style attributes into content attributes or use adaptive normalization to adjust content features, fail to generate high-quality stylized images. In this paper, we introduce an innovative technique to improve the q… ▽ More

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: Accepted by CVIU

  17. Union: An Automatic Workload Manager for Accelerating Network Simulation

    Authors: Xin Wang, Misbah Mubarak, Yao Kang, Robert B. Ross, Zhiling Lan

    Abstract: With the rapid growth of the machine learning applications, the workloads of future HPC systems are anticipated to be a mix of scientific simulation, big data analytics, and machine learning applications. Simulation is a great research vehicle to understand the performance implications of co-running scientific applications with big data and machine learning workloads on large-scale systems. In thi… ▽ More

    Submitted 3 April, 2024; v1 submitted 24 March, 2024; originally announced March 2024.

  18. Q-adaptive: A Multi-Agent Reinforcement Learning Based Routing on Dragonfly Network

    Authors: Yao Kang, Xin Wang, Zhiling Lan

    Abstract: High-radix interconnects such as Dragonfly and its variants rely on adaptive routing to balance network traffic for optimum performance. Ideally, adaptive routing attempts to forward packets between minimal and non-minimal paths with the least congestion. In practice, current adaptive routing algorithms estimate routing path congestion based on local information such as output queue occupancy. Usi… ▽ More

    Submitted 3 April, 2024; v1 submitted 24 March, 2024; originally announced March 2024.

  19. MRSch: Multi-Resource Scheduling for HPC

    Authors: Boyang Li, Yuping Fan, Matthew Dearing, Zhiling Lan, Paul Richy, William Allcocky, Michael Papka

    Abstract: Emerging workloads in high-performance computing (HPC) are embracing significant changes, such as having diverse resource requirements instead of being CPU-centric. This advancement forces cluster schedulers to consider multiple schedulable resources during decision-making. Existing scheduling studies rely on heuristic or optimization methods, which are limited by an inability to adapt to new scen… ▽ More

    Submitted 3 April, 2024; v1 submitted 24 March, 2024; originally announced March 2024.

  20. Interpretable Modeling of Deep Reinforcement Learning Driven Scheduling

    Authors: Boyang Li, Zhiling Lan, Michael E. Papka

    Abstract: In the field of high-performance computing (HPC), there has been recent exploration into the use of deep reinforcement learning for cluster scheduling (DRL scheduling), which has demonstrated promising outcomes. However, a significant challenge arises from the lack of interpretability in deep neural networks (DNN), rendering them as black-box models to system managers. This lack of model interpret… ▽ More

    Submitted 24 March, 2024; originally announced March 2024.

  21. Study of Workload Interference with Intelligent Routing on Dragonfly

    Authors: Yao Kang, Xin Wang, Zhiling Lan

    Abstract: Dragonfly interconnect is a crucial network technology for supercomputers. To support exascale systems, network resources are shared such that links and routers are not dedicated to any node pair. While link utilization is increased, workload performance is often offset by network contention. Recently, intelligent routing built on reinforcement learning demonstrates higher network throughput with… ▽ More

    Submitted 3 April, 2024; v1 submitted 24 March, 2024; originally announced March 2024.

  22. arXiv:2403.13250  [pdf, other]

    cs.CL

    Facilitating Pornographic Text Detection for Open-Domain Dialogue Systems via Knowledge Distillation of Large Language Models

    Authors: Huachuan Qiu, Shuai Zhang, Hongliang He, Anqi Li, Zhenzhong Lan

    Abstract: Pornographic content occurring in human-machine interaction dialogues can cause severe side effects for users in open-domain dialogue systems. However, research on detecting pornographic language within human-machine interaction dialogues is an important subject that is rarely studied. To advance in this direction, we introduce CensorChat, a dialogue monitoring dataset aimed at detecting whether t… ▽ More

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: Accepted to CSCWD 2024 (27th International Conference on Computer Supported Cooperative Work in Design). arXiv admin note: text overlap with arXiv:2309.09749

  23. arXiv:2402.11958  [pdf, other]

    cs.CL

    Automatic Evaluation for Mental Health Counseling using LLMs

    Authors: Anqi Li, Yu Lu, Nirui Song, Shuai Zhang, Lizhi Ma, Zhenzhong Lan

    Abstract: High-quality psychological counseling is crucial for mental health worldwide, and timely evaluation is vital for ensuring its effectiveness. However, obtaining professional evaluation for each counseling session is expensive and challenging. Existing methods that rely on self or third-party manual reports to assess the quality of counseling suffer from subjective biases and limitations of time-con… ▽ More

    Submitted 19 February, 2024; originally announced February 2024.

    Comments: 21 pages, 4 figures

  24. arXiv:2402.11522  [pdf, other]

    cs.CL

    Unveiling the Secrets of Engaging Conversations: Factors that Keep Users Hooked on Role-Playing Dialog Agents

    Authors: Shuai Zhang, Yu Lu, Junwen Liu, Jia Yu, Huachuan Qiu, Yuming Yan, Zhenzhong Lan

    Abstract: With the growing humanlike nature of dialog agents, people are now engaging in extended conversations that can stretch from brief moments to substantial periods of time. Understanding the factors that contribute to sustaining these interactions is crucial, yet existing studies primarily focus on short-term simulations that rarely explore such prolonged and real conversations. In this paper, w… ▽ More

    Submitted 12 March, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

  25. arXiv:2402.06772  [pdf, other]

    q-bio.QM cs.AI cs.CE cs.LG

    Retrosynthesis Prediction via Search in (Hyper) Graph

    Authors: Zixun Lan, Binjie Hong, Jiajun Zhu, Zuo Zeng, Zhenfu Liu, Limin Yu, Fei Ma

    Abstract: Predicting reactants from a specified core product stands as a fundamental challenge within organic synthesis, termed retrosynthesis prediction. Recently, semi-template-based methods and graph-edits-based methods have achieved good performance in terms of both interpretability and accuracy. However, due to their mechanisms these methods cannot predict complex reactions, e.g., reactions with multip… ▽ More

    Submitted 9 February, 2024; originally announced February 2024.

  26. arXiv:2401.13919  [pdf, other]

    cs.CL cs.AI

    WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models

    Authors: Hongliang He, Wenlin Yao, Kaixin Ma, Wenhao Yu, Yong Dai, Hongming Zhang, Zhenzhong Lan, Dong Yu

    Abstract: The rapid advancement of large language models (LLMs) has led to a new era marked by the development of autonomous applications in real-world scenarios, which drives innovation in creating advanced web agents. Existing web agents typically only handle one input modality and are evaluated only in simplified web simulators or static web snapshots, greatly limiting their applicability in real-world s… ▽ More

    Submitted 6 June, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

    Comments: Accepted to ACL 2024 (main). Code and data are released at https://github.com/MinorJerry/WebVoyager

  27. arXiv:2401.13178  [pdf, other]

    cs.CL cs.AI cs.LG

    AgentBoard: An Analytical Evaluation Board of Multi-turn LLM Agents

    Authors: Chang Ma, Junlei Zhang, Zhihao Zhu, Cheng Yang, Yujiu Yang, Yaohui Jin, Zhenzhong Lan, Lingpeng Kong, Junxian He

    Abstract: Evaluating large language models (LLMs) as general-purpose agents is essential for understanding their capabilities and facilitating their integration into practical applications. However, the evaluation process presents substantial challenges. A primary obstacle is the benchmarking of agent performance across diverse scenarios within a unified framework, especially in maintaining partially-observ… ▽ More

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: Preprint

  28. arXiv:2312.06135  [pdf, other]

    cs.CV

    ArtBank: Artistic Style Transfer with Pre-trained Diffusion Model and Implicit Style Prompt Bank

    Authors: Zhanjie Zhang, Quanwei Zhang, Guangyuan Li, Wei Xing, Lei Zhao, Jiakai Sun, Zehua Lan, Junsheng Luan, Yiling Huang, Huaizhong Lin

    Abstract: Artistic style transfer aims to repaint the content image with the learned artistic style. Existing artistic style transfer methods can be divided into two categories: small model-based approaches and pre-trained large-scale model-based approaches. Small model-based approaches can preserve the content structure, but fail to produce highly realistic stylized images and introduce artifacts and dish… ▽ More

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI2024

  29. arXiv:2312.04262  [pdf, other]

    cs.CL cs.HC

    PsyChat: A Client-Centric Dialogue System for Mental Health Support

    Authors: Huachuan Qiu, Anqi Li, Lizhi Ma, Zhenzhong Lan

    Abstract: Dialogue systems are increasingly integrated into mental health support to help clients facilitate exploration, gain insight, take action, and ultimately heal themselves. A practical and user-friendly dialogue system should be client-centric, focusing on the client's behaviors. However, existing dialogue systems publicly available for mental health support often concentrate solely on the counselor… ▽ More

    Submitted 19 March, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

    Comments: Accepted to CSCWD 2024 (27th International Conference on Computer Supported Cooperative Work in Design)

  30. arXiv:2311.12067  [pdf, other]

    cs.CV

    Quality and Quantity: Unveiling a Million High-Quality Images for Text-to-Image Synthesis in Fashion Design

    Authors: Jia Yu, Lichao Zhang, Zijie Chen, Fayu Pan, MiaoMiao Wen, Yuming Yan, Fangsheng Weng, Shuai Zhang, Lili Pan, Zhenzhong Lan

    Abstract: The fusion of AI and fashion design has emerged as a promising research area. However, the lack of extensive, interrelated data on clothing and try-on stages has hindered the full potential of AI in this domain. Addressing this, we present the Fashion-Diffusion dataset, a product of multiple years' rigorous effort. This dataset, the first of its kind, comprises over a million high-quality fashion… ▽ More

    Submitted 18 March, 2024; v1 submitted 19 November, 2023; originally announced November 2023.

  31. arXiv:2311.09861  [pdf, other]

    cs.CL cs.AI

    ConceptPsy:A Benchmark Suite with Conceptual Comprehensiveness in Psychology

    Authors: Junlei Zhang, Hongliang He, Nirui Song, Zhanchao Zhou, Shuyuan He, Shuai Zhang, Huachuan Qiu, Anqi Li, Yong Dai, Lizhi Ma, Zhenzhong Lan

    Abstract: The critical field of psychology necessitates a comprehensive benchmark to enhance the evaluation and development of domain-specific Large Language Models (LLMs). Existing MMLU-type benchmarks, such as C-EVAL and CMMLU, include psychology-related subjects, but their limited number of questions and lack of systematic concept sampling strategies mean they cannot cover the concepts required in psycho… ▽ More

    Submitted 16 June, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: Under Review

  32. arXiv:2310.19651  [pdf, other]

    cs.CL

    Dynamics of Instruction Tuning: Each Ability of Large Language Models Has Its Own Growth Pace

    Authors: Chiyu Song, Zhanchao Zhou, Jianhao Yan, Yuejiao Fei, Zhenzhong Lan, Yue Zhang

    Abstract: Instruction tuning is a burgeoning method to elicit the general intelligence of Large Language Models (LLMs). However, the creation of instruction data is still largely heuristic, leading to significant variation in quantity and quality across existing datasets. While some research advocates for expanding the number of instructions, others suggest that a small set of well-chosen examples is adequa… ▽ More

    Submitted 22 February, 2024; v1 submitted 30 October, 2023; originally announced October 2023.

  33. arXiv:2310.15204  [pdf]

    cs.LG

    Mid-Long Term Daily Electricity Consumption Forecasting Based on Piecewise Linear Regression and Dilated Causal CNN

    Authors: Zhou Lan, Ben Liu, Yi Feng, Danhuang Dong, Peng Zhang

    Abstract: Daily electricity consumption forecasting is a classical problem. Existing forecasting algorithms tend to have decreased accuracy on special dates like holidays. This study decomposes the daily electricity consumption series into three components: trend, seasonal, and residual, and constructs a two-stage prediction method using piecewise linear regression as a filter and Dilated Causal CNN as a pr… ▽ More

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: Key words: Daily electricity consumption forecasting; time series decomposition; piecewise linear regression; Dilated Causal CNN

  34. arXiv:2310.08129  [pdf, other]

    cs.CV

    Tailored Visions: Enhancing Text-to-Image Generation with Personalized Prompt Rewriting

    Authors: Zijie Chen, Lichao Zhang, Fangsheng Weng, Lili Pan, Zhenzhong Lan

    Abstract: Despite significant progress in the field, it is still challenging to create personalized visual representations that align closely with the desires and preferences of individual users. This process requires users to articulate their ideas in words that are both comprehensible to the models and accurately capture their vision, posing difficulties for many users. In this paper, we tackle this chall… ▽ More

    Submitted 6 April, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

    Comments: Accepted at CVPR 2024

  35. arXiv:2309.15289  [pdf, other]

    cs.CV cs.LG

    SEPT: Towards Efficient Scene Representation Learning for Motion Prediction

    Authors: Zhiqian Lan, Yuxuan Jiang, Yao Mu, Chen Chen, Shengbo Eben Li

    Abstract: Motion prediction is crucial for autonomous vehicles to operate safely in complex traffic environments. Extracting effective spatiotemporal relationships among traffic elements is key to accurate forecasting. Inspired by the successful practice of pretrained large language models, this paper presents SEPT, a modeling framework that leverages self-supervised learning to develop powerful spatiotempo… ▽ More

    Submitted 19 December, 2023; v1 submitted 26 September, 2023; originally announced September 2023.

  36. arXiv:2309.09749   

    cs.CL

    Facilitating NSFW Text Detection in Open-Domain Dialogue Systems via Knowledge Distillation

    Authors: Huachuan Qiu, Shuai Zhang, Hongliang He, Anqi Li, Zhenzhong Lan

    Abstract: NSFW (Not Safe for Work) content, in the context of a dialogue, can have severe side effects on users in open-domain dialogue systems. However, research on detecting NSFW language, especially sexually explicit content, within a dialogue context has significantly lagged behind. To address this issue, we introduce CensorChat, a dialogue monitoring dataset aimed at NSFW dialogue detection. Leveraging… ▽ More

    Submitted 20 March, 2024; v1 submitted 18 September, 2023; originally announced September 2023.

    Comments: As we have submitted a final version arXiv:2403.13250, we decide to withdraw it

  37. arXiv:2309.06221  [pdf, other]

    cs.CV

    Use neural networks to recognize students' handwritten letters and incorrect symbols

    Authors: JiaJun Zhu, Zichuan Yang, Binjie Hong, Jiacheng Song, Jiwei Wang, Tianhao Chen, Shuilan Yang, Zixun Lan, Fei Ma

    Abstract: Correcting students' multiple-choice answers is a repetitive and mechanical task that can be considered an image multi-classification task. Assuming possible options are 'abcd' and the correct option is one of the four, some students may write incorrect symbols or options that do not exist. In this paper, five classifications were set up - four for possible correct options and one for other incorr… ▽ More

    Submitted 12 September, 2023; originally announced September 2023.

  38. arXiv:2309.02054  [pdf, other]

    cs.CV

    An Adaptive Spatial-Temporal Local Feature Difference Method for Infrared Small-moving Target Detection

    Authors: Yongkang Zhao, Chuang Zhu, Yuan Li, Shuaishuai Wang, Zihan Lan, Yuanyuan Qiao

    Abstract: Detecting small moving targets accurately in infrared (IR) image sequences is a significant challenge. To address this problem, we propose a novel method called spatial-temporal local feature difference (STLFD) with adaptive background suppression (ABS). Our approach utilizes filters in the spatial and temporal domains and performs pixel-level ABS on the output to enhance the contrast between the… ▽ More

    Submitted 5 September, 2023; originally announced September 2023.

  39. arXiv:2307.16457  [pdf, other]

    cs.CL

    A Benchmark for Understanding Dialogue Safety in Mental Health Support

    Authors: Huachuan Qiu, Tong Zhao, Anqi Li, Shuai Zhang, Hongliang He, Zhenzhong Lan

    Abstract: Dialogue safety remains a pervasive challenge in open-domain human-machine interaction. Existing approaches propose distinctive dialogue safety taxonomies and datasets for detecting explicitly harmful responses. However, these taxonomies may not be suitable for analyzing response safety in mental health support. In real-world interactions, a model response deemed acceptable in casual conversations… ▽ More

    Submitted 31 July, 2023; originally announced July 2023.

    Comments: accepted to The 12th CCF International Conference on Natural Language Processing and Chinese Computing (NLPCC2023)

  40. arXiv:2307.15020  [pdf, other]

    cs.CL cs.AI

    SuperCLUE: A Comprehensive Chinese Large Language Model Benchmark

    Authors: Liang Xu, Anqi Li, Lei Zhu, Hang Xue, Changtai Zhu, Kangkang Zhao, Haonan He, Xuanwei Zhang, Qiyue Kang, Zhenzhong Lan

    Abstract: Large language models (LLMs) have shown the potential to be integrated into human daily lives. Therefore, user preference is the most critical criterion for assessing LLMs' performance in real-world scenarios. However, existing benchmarks mainly focus on measuring models' accuracy using multi-choice questions, which limits the understanding of their capabilities in real applications. We fill this… ▽ More

    Submitted 27 July, 2023; originally announced July 2023.

    Comments: 13 pages, 12 figures, 5 tables

  41. arXiv:2307.08487  [pdf, other]

    cs.CL

    Latent Jailbreak: A Benchmark for Evaluating Text Safety and Output Robustness of Large Language Models

    Authors: Huachuan Qiu, Shuai Zhang, Anqi Li, Hongliang He, Zhenzhong Lan

    Abstract: Considerable research efforts have been devoted to ensuring that large language models (LLMs) align with human values and generate safe text. However, an excessive focus on sensitivity to certain topics can compromise the model's robustness in following instructions, thereby impacting its overall performance in completing tasks. Previous benchmarks for jailbreaking LLMs have primarily focused on e… ▽ More

    Submitted 28 August, 2023; v1 submitted 17 July, 2023; originally announced July 2023.

    Comments: Code and data are available at https://github.com/qiuhuachuan/latent-jailbreak

  42. arXiv:2306.15334  [pdf, other]

    cs.CL

    Understanding Client Reactions in Online Mental Health Counseling

    Authors: Anqi Li, Lizhi Ma, Yaling Mei, Hongliang He, Shuai Zhang, Huachuan Qiu, Zhenzhong Lan

    Abstract: Communication success relies heavily on reading participants' reactions. Such feedback is especially important for mental health counselors, who must carefully consider the client's progress and adjust their approach accordingly. However, previous NLP research on counseling has mainly focused on studying counselors' intervention strategies rather than their clients' reactions to the intervention.…

    Submitted 27 June, 2023; originally announced June 2023.

    Comments: Accepted to ACL 2023, oral. For code and data, see https://github.com/dll-wu/Client-React

  43. arXiv:2305.17415  [pdf, other]

    cs.CL cs.AI

    Exploring Better Text Image Translation with Multimodal Codebook

    Authors: Zhibin Lan, Jiawei Yu, Xiang Li, Wen Zhang, Jian Luan, Bin Wang, Degen Huang, Jinsong Su

    Abstract: Text image translation (TIT) aims to translate the source texts embedded in the image to target translations, which has a wide range of applications and thus has important research value. However, current studies on TIT are confronted with two main bottlenecks: 1) this task lacks a publicly available TIT dataset, 2) dominant models are constructed in a cascaded manner, which tends to suffer from t…

    Submitted 2 June, 2023; v1 submitted 27 May, 2023; originally announced May 2023.

    Comments: Accepted by ACL 2023 Main Conference

  44. arXiv:2305.15676  [pdf, other]

    cs.CL

    Enhancing Grammatical Error Correction Systems with Explanations

    Authors: Yuejiao Fei, Leyang Cui, Sen Yang, Wai Lam, Zhenzhong Lan, Shuming Shi

    Abstract: Grammatical error correction systems improve written communication by detecting and correcting language mistakes. To help language learners better understand why the GEC system makes a certain correction, the causes of errors (evidence words) and the corresponding error types are two key factors. To enhance GEC systems with explanations, we introduce EXPECT, a large dataset annotated with evidence…

    Submitted 10 June, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: 9 pages, 7 figures, accepted to the main conference of ACL 2023

  45. arXiv:2305.15077  [pdf, other]

    cs.CL

    Contrastive Learning of Sentence Embeddings from Scratch

    Authors: Junlei Zhang, Zhenzhong Lan, Junxian He

    Abstract: Contrastive learning has been the dominant approach to train state-of-the-art sentence embeddings. Previous studies have typically learned sentence embeddings either through the use of human-annotated natural language inference (NLI) data or via large-scale unlabeled sentences in an unsupervised manner. However, even in the case of unlabeled data, their acquisition presents challenges in certain d…

    Submitted 24 October, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: EMNLP 2023

  46. arXiv:2305.07424  [pdf, other]

    cs.CL cs.LG

    Instance Smoothed Contrastive Learning for Unsupervised Sentence Embedding

    Authors: Hongliang He, Junlei Zhang, Zhenzhong Lan, Yue Zhang

    Abstract: Contrastive learning-based methods, such as unsup-SimCSE, have achieved state-of-the-art (SOTA) performances in learning unsupervised sentence embeddings. However, in previous studies, each embedding used for contrastive learning only derived from one sentence instance, and we call these embeddings instance-level embeddings. In other words, each embedding is regarded as a unique class of its own,…

    Submitted 18 May, 2023; v1 submitted 12 May, 2023; originally announced May 2023.

    Comments: Accepted to AAAI 2023. The code is available at https://github.com/dll-wu/IS-CSE

  47. arXiv:2305.00450  [pdf, other]

    cs.CL cs.CY

    SMILE: Single-turn to Multi-turn Inclusive Language Expansion via ChatGPT for Mental Health Support

    Authors: Huachuan Qiu, Hongliang He, Shuai Zhang, Anqi Li, Zhenzhong Lan

    Abstract: Developing specialized dialogue systems for mental health support requires multi-turn conversation data, which has recently garnered increasing attention. However, gathering and releasing large-scale and real-life multi-turn conversations to facilitate advancements in mental health presents challenges due to data privacy protection, as well as the time and cost involved. To address the challenges…

    Submitted 22 February, 2024; v1 submitted 30 April, 2023; originally announced May 2023.

    Comments: 22 pages

  48. arXiv:2302.03517  [pdf, other]

    cs.DC cs.AI

    Optimization of Topology-Aware Job Allocation on a High-Performance Computing Cluster by Neural Simulated Annealing

    Authors: Zekang Lan, Yan Xu, Yingkun Huang, Dian Huang, Shengzhong Feng

    Abstract: Jobs on high-performance computing (HPC) clusters can suffer significant performance degradation due to inter-job network interference. Topology-aware job allocation problem (TJAP) is such a problem that decides how to dedicate nodes to specific applications to mitigate inter-job network interference. In this paper, we study the window-based TJAP on a fat-tree network aiming at minimizing the cost…

    Submitted 5 February, 2023; originally announced February 2023.

  49. arXiv:2301.12071  [pdf, other]

    cs.LG q-bio.MN

    RCsearcher: Reaction Center Identification in Retrosynthesis via Deep Q-Learning

    Authors: Zixun Lan, Zuo Zeng, Binjie Hong, Zhenfu Liu, Fei Ma

    Abstract: The reaction center consists of atoms in the product whose local properties are not identical to the corresponding atoms in the reactants. Prior studies on reaction center identification are mainly on semi-templated retrosynthesis methods. Moreover, they are limited to single reaction center identification. However, many reaction centers are comprised of multiple bonds or atoms in reality. We refe…

    Submitted 27 January, 2023; originally announced January 2023.

  50. AEDNet: Adaptive Edge-Deleting Network For Subgraph Matching

    Authors: Zixun Lan, Ye Ma, Limin Yu, LingLong Yuan, Fei Ma

    Abstract: Subgraph matching is to find all subgraphs in a data graph that are isomorphic to an existing query graph. Subgraph matching is an NP-hard problem, yet has found its applications in many areas. Many learning-based methods have been proposed for graph matching, whereas few have been designed for subgraph matching. The subgraph matching problem is generally more challenging, mainly due to the differ…

    Submitted 8 November, 2022; originally announced November 2022.

    Journal ref: Pattern Recognition, 133, p.109033 (2023)