Showing 1–50 of 864 results for author: Bin Yu

  1. arXiv:2410.00812  [pdf, other]

    cs.CL q-bio.NC

    A generative framework to bridge data-driven models and scientific theories in language neuroscience

    Authors: Richard Antonello, Chandan Singh, Shailee Jain, Aliyah Hsu, Jianfeng Gao, Bin Yu, Alexander Huth

    Abstract: Representations from large language models are highly effective at predicting BOLD fMRI responses to language stimuli. However, these representations are largely opaque: it is unclear what features of the language stimulus drive the response in each brain area. We present generative explanation-mediated validation, a framework for generating concise explanations of language selectivity in the brai…

    Submitted 1 October, 2024; originally announced October 2024.

  2. arXiv:2409.19663  [pdf, other]

    cs.CL cs.AI

    Identifying Knowledge Editing Types in Large Language Models

    Authors: Xiaopeng Li, Shangwen Wang, Shezheng Song, Bin Ji, Huijun Liu, Shasha Li, Jun Ma, Jie Yu

    Abstract: Knowledge editing has emerged as an efficient technology for updating the knowledge of large language models (LLMs), attracting increasing attention in recent years. However, there is a lack of effective measures to prevent the malicious misuse of this technology, which could lead to harmful edits in LLMs. These malicious modifications could cause LLMs to generate toxic content, misleading users i…

    Submitted 1 October, 2024; v1 submitted 29 September, 2024; originally announced September 2024.

    Comments: Under review

  3. arXiv:2409.18839  [pdf, other]

    cs.CV

    MinerU: An Open-Source Solution for Precise Document Content Extraction

    Authors: Bin Wang, Chao Xu, Xiaomeng Zhao, Linke Ouyang, Fan Wu, Zhiyuan Zhao, Rui Xu, Kaiwen Liu, Yuan Qu, Fukai Shang, Bo Zhang, Liqun Wei, Zhihao Sui, Wei Li, Botian Shi, Yu Qiao, Dahua Lin, Conghui He

    Abstract: Document content analysis has been a crucial research area in computer vision. Despite significant advancements in methods such as OCR, layout detection, and formula recognition, existing open-source solutions struggle to consistently deliver high-quality content extraction due to the diversity in document types and content. To address these challenges, we present MinerU, an open-source solution f…

    Submitted 27 September, 2024; originally announced September 2024.

    Comments: MinerU Technical Report

  4. arXiv:2409.18399  [pdf]

    cs.AI

    Multimodal Trajectory Prediction for Autonomous Driving on Unstructured Roads using Deep Convolutional Network

    Authors: Lei Li, Zhifa Chen, Jian Wang, Bin Zhou, Guizhen Yu, Xiaoxuan Chen

    Abstract: Recently, the application of autonomous driving in open-pit mining has garnered increasing attention for achieving safe and efficient mineral transportation. Compared to urban structured roads, unstructured roads in mining sites have uneven boundaries and lack clearly defined lane markings. This leads to a lack of sufficient constraint information for predicting the trajectories of other human-dri…

    Submitted 26 September, 2024; originally announced September 2024.

    Comments: 11 pages, 6 figures

  5. arXiv:2409.16364  [pdf, other]

    astro-ph.GA

    The Cosmic Evolution of the Supermassive Black Hole Population: A Hybrid Observed Accretion and Simulated Mergers Approach

    Authors: Fan Zou, W. N. Brandt, Elena Gallo, Bin Luo, Qingling Ni, Yongquan Xue, Zhibo Yu

    Abstract: Supermassive black holes (SMBHs) can grow through both accretion and mergers. It is still unclear how SMBHs evolve under these two channels from high redshifts to the SMBH population we observe in the local universe. Observations can directly constrain the accretion channel but cannot effectively constrain mergers yet, while cosmological simulations provide galaxy merger information but can hardly…

    Submitted 24 September, 2024; originally announced September 2024.

    Comments: 29 pages, 22 figures, 5 tables, accepted for publication in ApJ

  6. arXiv:2409.15169  [pdf, other]

    cs.CR cs.HC

    CamLoPA: A Hidden Wireless Camera Localization Framework via Signal Propagation Path Analysis

    Authors: Xiang Zhang, Jie Zhang, Zehua Ma, Jinyang Huang, Meng Li, Huan Yan, Peng Zhao, Zijian Zhang, Qing Guo, Tianwei Zhang, Bin Liu, Nenghai Yu

    Abstract: Hidden wireless cameras pose significant privacy threats, necessitating effective detection and localization methods. However, existing solutions often require spacious activity areas, expensive specialized devices, or pre-collected training data, limiting their practical deployment. To address these limitations, we introduce CamLoPA, a training-free wireless camera detection and localization fram…

    Submitted 23 September, 2024; originally announced September 2024.

  7. arXiv:2409.13729  [pdf, other]

    cs.CL cs.AI

    MathGLM-Vision: Solving Mathematical Problems with Multi-Modal Large Language Model

    Authors: Zhen Yang, Jinhao Chen, Zhengxiao Du, Wenmeng Yu, Weihan Wang, Wenyi Hong, Zhihuan Jiang, Bin Xu, Yuxiao Dong, Jie Tang

    Abstract: Large language models (LLMs) have demonstrated significant capabilities in mathematical reasoning, particularly with text-based mathematical problems. However, current multi-modal large language models (MLLMs), especially those specialized in mathematics, tend to focus predominantly on solving geometric problems but ignore the diversity of visual information available in other areas of mathematics…

    Submitted 9 September, 2024; originally announced September 2024.

    Comments: 30 pages, 19 figures

  8. arXiv:2409.13166  [pdf, other]

    cs.RO cs.AI

    Morphology and Behavior Co-Optimization of Modular Satellites for Attitude Control

    Authors: Yuxing Wang, Jie Li, Cong Yu, Xinyang Li, Simeng Huang, Yongzhe Chang, Xueqian Wang, Bin Liang

    Abstract: The emergence of modular satellites marks a significant transformation in spacecraft engineering, introducing a new paradigm of flexibility, resilience, and scalability in space exploration endeavors. In addressing complex challenges such as attitude control, both the satellite's morphological architecture and the controller are crucial for optimizing performance. Despite substantial research on o…

    Submitted 19 September, 2024; originally announced September 2024.

    Comments: The paper was accepted as an oral presentation by the 75th International Astronautical Congress, Milan, Italy

  9. arXiv:2409.12139  [pdf, other]

    cs.SD cs.AI eess.AS

    Takin: A Cohort of Superior Quality Zero-shot Speech Generation Models

    Authors: Sijing Chen, Yuan Feng, Laipeng He, Tianwei He, Wendi He, Yanni Hu, Bin Lin, Yiting Lin, Yu Pan, Pengfei Tan, Chengwei Tian, Chen Wang, Zhicheng Wang, Ruoye Xie, Jixun Yao, Quanlei Yan, Yuguang Yang, Jianhao Ye, Jingjing Yin, Yanzhen Yu, Huimin Zhang, Xiang Zhang, Guangcheng Zhao, Hongbin Zhou, Pengpeng Zou

    Abstract: With the advent of the big data and large language model era, zero-shot personalized rapid customization has emerged as a significant trend. In this report, we introduce Takin AudioLLM, a series of techniques and models, mainly including Takin TTS, Takin VC, and Takin Morphing, specifically designed for audiobook production. These models are capable of zero-shot speech production, generating high-…

    Submitted 23 September, 2024; v1 submitted 18 September, 2024; originally announced September 2024.

    Comments: Technical Report; 18 pages; typos corrected, references added, demo url modified, author name modified

  10. arXiv:2409.10580  [pdf, other]

    cs.LG cs.AI stat.ML

    Veridical Data Science for Medical Foundation Models

    Authors: Ahmed Alaa, Bin Yu

    Abstract: The advent of foundation models (FMs) such as large language models (LLMs) has led to a cultural shift in data science, both in medicine and beyond. This shift involves moving away from specialized predictive models trained for specific, well-defined domain questions to generalist FMs pre-trained on vast amounts of unstructured data, which can then be adapted to various clinical tasks and question…

    Submitted 15 September, 2024; originally announced September 2024.

  11. arXiv:2409.05587  [pdf, other]

    cs.CV

    DSDFormer: An Innovative Transformer-Mamba Framework for Robust High-Precision Driver Distraction Identification

    Authors: Junzhou Chen, Zirui Zhang, Jing Yu, Heqiang Huang, Ronghui Zhang, Xuemiao Xu, Bin Sheng, Hong Yan

    Abstract: Driver distraction remains a leading cause of traffic accidents, posing a critical threat to road safety globally. As intelligent transportation systems evolve, accurate and real-time identification of driver distraction has become essential. However, existing methods struggle to capture both global contextual and fine-grained local features while contending with noisy labels in training datasets.…

    Submitted 12 September, 2024; v1 submitted 9 September, 2024; originally announced September 2024.

  12. arXiv:2409.02123  [pdf, other]

    cs.LG cs.AI physics.ao-ph

    PuYun: Medium-Range Global Weather Forecasting Using Large Kernel Attention Convolutional Networks

    Authors: Shengchen Zhu, Yiming Chen, Peiying Yu, Xiang Qu, Yuxiao Zhou, Yiming Ma, Zhizhan Zhao, Yukai Liu, Hao Mi, Bin Wang

    Abstract: Accurate weather forecasting is essential for understanding and mitigating weather-related impacts. In this paper, we present PuYun, an autoregressive cascade model that leverages large kernel attention convolutional networks. The model's design inherently supports extended weather prediction horizons while broadening the effective receptive field. The integration of large kernel attention mechani…

    Submitted 12 September, 2024; v1 submitted 1 September, 2024; originally announced September 2024.

  13. arXiv:2409.00920  [pdf, other]

    cs.LG cs.AI cs.CL

    ToolACE: Winning the Points of LLM Function Calling

    Authors: Weiwen Liu, Xu Huang, Xingshan Zeng, Xinlong Hao, Shuai Yu, Dexun Li, Shuai Wang, Weinan Gan, Zhengying Liu, Yuanqing Yu, Zezhong Wang, Yuxian Wang, Wu Ning, Yutai Hou, Bin Wang, Chuhan Wu, Xinzhi Wang, Yong Liu, Yasheng Wang, Duyu Tang, Dandan Tu, Lifeng Shang, Xin Jiang, Ruiming Tang, Defu Lian, et al. (2 additional authors not shown)

    Abstract: Function calling significantly extends the application boundary of large language models, where high-quality and diverse training data is critical for unlocking this capability. However, real function-calling data is quite challenging to collect and annotate, while synthetic data generated by existing pipelines tends to lack coverage and accuracy. In this paper, we present ToolACE, an automatic ag…

    Submitted 1 September, 2024; originally announced September 2024.

    Comments: 21 pages, 22 figures

  14. arXiv:2408.16500  [pdf, other]

    cs.CV

    CogVLM2: Visual Language Models for Image and Video Understanding

    Authors: Wenyi Hong, Weihan Wang, Ming Ding, Wenmeng Yu, Qingsong Lv, Yan Wang, Yean Cheng, Shiyu Huang, Junhui Ji, Zhao Xue, Lei Zhao, Zhuoyi Yang, Xiaotao Gu, Xiaohan Zhang, Guanyu Feng, Da Yin, Zihan Wang, Ji Qi, Xixuan Song, Peng Zhang, Debing Liu, Bin Xu, Juanzi Li, Yuxiao Dong, Jie Tang

    Abstract: Beginning with VisualGLM and CogVLM, we are continuously exploring VLMs in pursuit of enhanced vision-language fusion, efficient higher-resolution architecture, and broader modalities and applications. Here we propose the CogVLM2 family, a new generation of visual language models for image and video understanding including CogVLM2, CogVLM2-Video and GLM-4V. As an image understanding model, CogVLM2…

    Submitted 29 August, 2024; originally announced August 2024.

  15. arXiv:2408.16060  [pdf, other]

    astro-ph.HE astro-ph.GA

    The Remarkable X-ray Spectra and Variability of the Ultraluminous Weak-Line Quasar SDSS J1521+5202

    Authors: Shouyi Wang, W. Niel Brandt, Bin Luo, Zhibo Yu, Fan Zou, Qingling Ni, Fabio Vito

    Abstract: We present a focused X-ray and multiwavelength study of the ultraluminous weak-line quasar (WLQ) SDSS J1521+5202, one of the few X-ray weak WLQs that is amenable to basic X-ray spectral and variability investigations. J1521+5202 shows striking X-ray variability during 2006--2023, by up to a factor of $\approx 32$ in 0.5--2 keV flux, and our new 2023 Chandra observation caught it in its brightest X…

    Submitted 28 August, 2024; originally announced August 2024.

    Comments: 11 pages, 3 figures, accepted for publication in ApJ

  16. arXiv:2408.15461  [pdf, other]

    cs.CV cs.MM

    Hand1000: Generating Realistic Hands from Text with Only 1,000 Images

    Authors: Haozhuo Zhang, Bin Zhu, Yu Cao, Yanbin Hao

    Abstract: Text-to-image generation models have achieved remarkable advancements in recent years, aiming to produce realistic images from textual descriptions. However, these models often struggle with generating anatomically accurate representations of human hands. The resulting images frequently exhibit issues such as incorrect numbers of fingers, unnatural twisting or interlacing of fingers, or blurred an…

    Submitted 3 September, 2024; v1 submitted 27 August, 2024; originally announced August 2024.

    Comments: Project page https://haozhuo-zhang.github.io/Hand1000-project-page/

  17. arXiv:2408.13826  [pdf, ps, other]

    hep-ph

    The ground states of hidden-charm tetraquarks and their radial excitations

    Authors: Guo-Liang Yu, Zhen-Yu Li, Zhi-Gang Wang, Bin WU, Ze Zhou, Jie Lu

    Abstract: Inspired by the great progress in the observations of charmonium-like states in recent years, we perform a systematic analysis of the ground states and the first radially excited states of $qc\bar{q}\bar{c}$ ($q$=$u/d$ and $s$) tetraquark systems. Their mass spectra, root mean square (r.m.s.) radii and radial density distributions are predicted within the framework of the relativized quark model. B…

    Submitted 26 September, 2024; v1 submitted 25 August, 2024; originally announced August 2024.

  18. arXiv:2408.10592  [pdf, other]

    cs.AI cs.CG cs.LO

    Hologram Reasoning for Solving Algebra Problems with Geometry Diagrams

    Authors: Litian Huang, Xinguo Yu, Feng Xiong, Bin He, Shengbing Tang, Jiawen Fu

    Abstract: Solving Algebra Problems with Geometry Diagrams (APGDs) is still a challenging problem because diagram processing is not studied as intensively as language processing. To work against this challenge, this paper proposes a hologram reasoning scheme and develops a high-performance method for solving APGDs by using this scheme. To reach this goal, it first defines a hologram, being a kind of graph, a…

    Submitted 20 August, 2024; originally announced August 2024.

  19. arXiv:2408.09462  [pdf, other]

    cs.MM

    SpeechEE: A Novel Benchmark for Speech Event Extraction

    Authors: Bin Wang, Meishan Zhang, Hao Fei, Yu Zhao, Bobo Li, Shengqiong Wu, Wei Ji, Min Zhang

    Abstract: Event extraction (EE) is a critical direction in the field of information extraction, laying an important foundation for the construction of structured knowledge bases. EE from text has received ample research and attention for years, yet there can be numerous real-world applications that require direct information acquisition from speech signals, online meeting minutes, interview summaries, press…

    Submitted 23 August, 2024; v1 submitted 18 August, 2024; originally announced August 2024.

  20. arXiv:2408.07255  [pdf, other]

    hep-ph hep-ex nucl-ex nucl-th

    Dihadron azimuthal asymmetry and light-quark dipole moments at the Electron-Ion Collider

    Authors: Xin-Kai Wen, Bin Yan, Zhite Yu, C. -P. Yuan

    Abstract: We propose a novel method to probe light-quark dipole moments by examining the azimuthal asymmetries between a collinear pair of hadrons in semi-inclusive deep inelastic lepton scattering off an unpolarized proton target at the Electron-Ion Collider. These asymmetries provide a means to observe transversely polarized quarks, which arise exclusively from the interference between the dipole and the…

    Submitted 13 August, 2024; originally announced August 2024.

    Comments: 6 pages, 2 figures

    Report number: JLAB-THY-24-4136, MSUHEP-24-009

  21. arXiv:2408.06701  [pdf, other]

    cs.NI cs.LG

    DiffSG: A Generative Solver for Network Optimization with Diffusion Model

    Authors: Ruihuai Liang, Bo Yang, Zhiwen Yu, Bin Guo, Xuelin Cao, Mérouane Debbah, H. Vincent Poor, Chau Yuen

    Abstract: Diffusion generative models, famous for their performance in image generation, are popular in various cross-domain applications. However, their use in the communication community has been mostly limited to auxiliary tasks like data modeling and feature extraction. These models hold greater promise for fundamental problems in network optimization compared to traditional machine learning methods. Di…

    Submitted 13 August, 2024; originally announced August 2024.

    Comments: 8 pages, 5 figures

  22. arXiv:2408.03361  [pdf, other]

    eess.IV cs.CV

    GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI

    Authors: Pengcheng Chen, Jin Ye, Guoan Wang, Yanjun Li, Zhongying Deng, Wei Li, Tianbin Li, Haodong Duan, Ziyan Huang, Yanzhou Su, Benyou Wang, Shaoting Zhang, Bin Fu, Jianfei Cai, Bohan Zhuang, Eric J Seibel, Junjun He, Yu Qiao

    Abstract: Large Vision-Language Models (LVLMs) are capable of handling diverse data types such as imaging, text, and physiological signals, and can be applied in various fields. In the medical field, LVLMs have a high potential to offer substantial assistance for diagnosis and treatment. Before that, it is crucial to develop benchmarks to evaluate LVLMs' effectiveness in various medical applications. Curren…

    Submitted 30 September, 2024; v1 submitted 6 August, 2024; originally announced August 2024.

  23. arXiv:2408.00982  [pdf, other]

    physics.optics eess.SP

    Adaptive optical signal-to-noise ratio recovery for long-distance optical fiber transmission

    Authors: Mingwen Zhu, Shangsu Ding, Zhixue Li, Song Yu, Jianming Shang, Bin Luo

    Abstract: In long-distance fiber optic transmission, the optic fiber link and erbium-doped fiber amplifiers can introduce excessive noise, which reduces the optical signal-to-noise ratio (OSNR). The narrow-band optical filters can be used to eliminate noise and thereby improve OSNR. However, there is a relative frequency drift between the signal and the narrow-band filter, which leads to filtered signal ins…

    Submitted 1 August, 2024; originally announced August 2024.

  24. The most distant HI galaxies discovered by the 500 m dish FAST

    Authors: Hongwei Xi, Bo Peng, Lister Staveley-Smith, Bi-Qing For, Bin Liu, Ru-Rong Chen, Lei Yu, Dejian Ding, Wei-Jian Guo, Hu Zou, Suijian Xue, Jing Wang, Thomas G. Brink, WeiKang Zheng, Alexei V. Filippenko, Yi Yang, Jianyan Wei, Y. Sophia Dai, Zi-Jian Li, Zizhao He, Chengzi Jiang, Alexei Moiseev, Sergey Kotov

    Abstract: Neutral hydrogen (HI) is the primary component of the cool interstellar medium (ISM) and is the reservoir of fuel for star formation. Owing to the sensitivity of existing radio telescopes, our understanding of the evolution of the ISM in galaxies remains limited, as it is based on only a few hundred galaxies detected in HI beyond the local Universe. With the high sensitivity of the Five-hundred-me…

    Submitted 1 August, 2024; originally announced August 2024.

    Comments: 14 pages, 6 figures, 3 tables

    Journal ref: ApJL, 966(2024), L36

  25. arXiv:2407.20756  [pdf, other]

    cs.CV cs.CL

    SynthVLM: High-Efficiency and High-Quality Synthetic Data for Vision Language Models

    Authors: Zheng Liu, Hao Liang, Xijie Huang, Wentao Xiong, Qinhan Yu, Linzhuang Sun, Chong Chen, Conghui He, Bin Cui, Wentao Zhang

    Abstract: Recently, with the rise of web images, managing and understanding large-scale image datasets has become increasingly important. Vision Large Language Models (VLLMs) have recently emerged due to their robust vision-understanding capabilities. However, training these models requires vast amounts of data, posing challenges to efficiency, effectiveness, data quality, and privacy. In this paper, we int…

    Submitted 10 August, 2024; v1 submitted 30 July, 2024; originally announced July 2024.

  26. arXiv:2407.19548  [pdf, other]

    cs.CV

    Cycle3D: High-quality and Consistent Image-to-3D Generation via Generation-Reconstruction Cycle

    Authors: Zhenyu Tang, Junwu Zhang, Xinhua Cheng, Wangbo Yu, Chaoran Feng, Yatian Pang, Bin Lin, Li Yuan

    Abstract: Recent 3D large reconstruction models typically employ a two-stage process: first generating multi-view images with a multi-view diffusion model, and then using a feed-forward model to reconstruct the images into 3D content. However, multi-view diffusion models often produce low-quality and inconsistent images, adversely affecting the quality of the final 3D reconstruction. To address this issue,…

    Submitted 28 July, 2024; originally announced July 2024.

    Comments: Project page: https://pku-yuangroup.github.io/Cycle3D/

  27. arXiv:2407.18766  [pdf, other]

    cs.IT eess.SP

    Secrecy Performance Analysis of Integrated RF-UWOC IoT Networks Enabled by UAV and Underwater-RIS

    Authors: Abrar Bin Sarawar, A. S. M. Badrudduza, Md. Ibrahim, Imran Shafique Ansari, Heejung Yu

    Abstract: In the sixth-generation (6G) Internet of Things (IoT) networks, the use of UAV-mounted base stations and reconfigurable intelligent surfaces (RIS) has been considered to enhance coverage, flexibility, and security in non-terrestrial networks (NTNs). In addition to aerial networks enabled by NTN technologies, the integration of underwater networks with 6G IoT can be considered one of the most innov…

    Submitted 26 July, 2024; originally announced July 2024.

  28. arXiv:2407.17730  [pdf, other]

    cs.CL

    Are Large Language Models Possible to Conduct Cognitive Behavioral Therapy?

    Authors: Hao Shen, Zihan Li, Minqiang Yang, Minghui Ni, Yongfeng Tao, Zhengyang Yu, Weihao Zheng, Chen Xu, Bin Hu

    Abstract: In contemporary society, the issue of psychological health has become increasingly prominent, characterized by the diversification, complexity, and universality of mental disorders. Cognitive Behavioral Therapy (CBT), currently the most influential and clinically effective psychological treatment method with no side effects, has limited coverage and poor quality in most countries. In recent years,…

    Submitted 24 July, 2024; originally announced July 2024.

  29. arXiv:2407.17020  [pdf, other]

    cs.CV

    EAFormer: Scene Text Segmentation with Edge-Aware Transformers

    Authors: Haiyang Yu, Teng Fu, Bin Li, Xiangyang Xue

    Abstract: Scene text segmentation aims at cropping texts from scene images, which is usually used to help generative models edit or remove texts. The existing text segmentation methods tend to involve various text-related supervisions for better performance. However, most of them ignore the importance of text edges, which are significant for downstream applications. In this paper, we propose Edge-Aware Tran…

    Submitted 24 July, 2024; originally announced July 2024.

    Comments: ECCV 2024

  30. arXiv:2407.15348  [pdf, other]

    physics.optics physics.ins-det

    A Novel Hybrid Digital and Analog Laser Synchronization System

    Authors: Mingwen Zhu, Shangsu Ding, Tianwei Jiang, Jianming Shang, Song Yu, Bin Luo

    Abstract: Laser synchronization is a technique that locks the wavelength of a free-running laser to that of the reference laser, thereby enabling synchronous changes in the wavelengths of the two lasers. This technique is of crucial importance in both scientific and industrial applications. Conventional synchronization systems, whether digital or analog, have intrinsic limitations in terms of accuracy or ba…

    Submitted 21 July, 2024; originally announced July 2024.

    Comments: 7 pages, 9 figures

  31. arXiv:2407.14897  [pdf, other]

    astro-ph.HE gr-qc

    Polarization Patterns of Non-Circular Hotspots around Kerr Black Holes: A Preliminary Study

    Authors: Bin Chen, Yehui Hou, Yu Song, Zhenyu Zhang

    Abstract: The multi-wavelength polarized light signals from supermassive black holes have sparked many studies on polarized images of accretion disks and hotspots. However, the polarization patterns within the innermost stable circular orbit (ISCO) region remain to be explored. In this study, we focus on two specific types of orbits, namely the plunging geodesics inward from the ISCO and homoclinic geodesic…

    Submitted 18 August, 2024; v1 submitted 20 July, 2024; originally announced July 2024.

    Comments: 26 pages, 8 figures, corrected the results of radial magnetic fields

  32. arXiv:2407.13863  [pdf, other]

    cs.CV

    A Closer Look at GAN Priors: Exploiting Intermediate Features for Enhanced Model Inversion Attacks

    Authors: Yixiang Qiu, Hao Fang, Hongyao Yu, Bin Chen, MeiKang Qiu, Shu-Tao Xia

    Abstract: Model Inversion (MI) attacks aim to reconstruct privacy-sensitive training data from released models by utilizing output information, raising extensive concerns about the security of Deep Neural Networks (DNNs). Recent advances in generative adversarial networks (GANs) have contributed significantly to the improved performance of MI attacks due to their powerful ability to generate realistic image…

    Submitted 13 September, 2024; v1 submitted 18 July, 2024; originally announced July 2024.

    Comments: ECCV 2024

  33. arXiv:2407.12339  [pdf, other]

    cs.CV

    Exploring Deeper! Segment Anything Model with Depth Perception for Camouflaged Object Detection

    Authors: Zhenni Yu, Xiaoqin Zhang, Li Zhao, Yi Bin, Guobao Xiao

    Abstract: This paper introduces a new Segment Anything Model with Depth Perception (DSAM) for Camouflaged Object Detection (COD). DSAM exploits the zero-shot capability of SAM to realize precise segmentation in the RGB-D domain. It consists of the Prompt-Deeper Module and the Finer Module. The Prompt-Deeper Module utilizes knowledge distillation and the Bias Correction Module to achieve the interaction betw…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: ACM MM 2024

  34. arXiv:2407.07681  [pdf]

    physics.optics physics.bio-ph

    Localizing axial dense emitters based on single-helix point spread function and compressed sensing

    Authors: Hanzhe Wu, Danni Chen, YiHong Ji, Gan Xiang, Heng Li, Bin Yu, JunLe Qu

    Abstract: Among the approaches in three-dimensional (3D) single molecule localization microscopy, there are several point spread function (PSF) engineering approaches, in which depth information of molecules is encoded in 2D images. Usually, the molecules are excited sparsely in each raw image. The consequence is that the temporal resolution has to be sacrificed. In order to improve temporal resolution and e…

    Submitted 10 July, 2024; originally announced July 2024.

  35. arXiv:2407.07304  [pdf, other]

    cs.AI

    Inference Performance Optimization for Large Language Models on CPUs

    Authors: Pujiang He, Shan Zhou, Wenhuan Huang, Changqing Li, Duyi Wang, Bin Guo, Chen Meng, Sheng Gui, Weifei Yu, Yi Xie

    Abstract: Large language models (LLMs) have shown exceptional performance and vast potential across diverse tasks. However, the deployment of LLMs with high performance in low-resource environments has garnered significant attention in the industry. When GPU hardware resources are limited, we can explore alternative options on CPUs. To mitigate the financial burden and alleviate constraints imposed by hardw…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 5 pages, 6 figures, ICML 2024 on Foundation Models in the Wild

  36. CrowdTransfer: Enabling Crowd Knowledge Transfer in AIoT Community

    Authors: Yan Liu, Bin Guo, Nuo Li, Yasan Ding, Zhouyangzi Zhang, Zhiwen Yu

    Abstract: Artificial Intelligence of Things (AIoT) is an emerging frontier based on the deep fusion of Internet of Things (IoT) and Artificial Intelligence (AI) technologies. Although advanced deep learning techniques enhance the efficient data processing and intelligent analysis of complex IoT data, they still suffer from notable challenges when deployed to practical AIoT applications, such as constrained…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: This paper has been accepted for publication in IEEE Communications Surveys & Tutorials. Copyright will be transferred without notice, after which this version may no longer be accessible

  37. arXiv:2407.04201  [pdf, ps, other]

    math.OC

    A General Maximum Principle for Progressive Optimal Control of Fully Coupled Forward-Backward Stochastic Systems with Jumps

    Authors: Bin Wang, Yu Si, Jingtao Shi

    Abstract: This paper is concerned with a general maximum principle for the fully coupled forward-backward stochastic optimal control problem with jumps, where the control domain is not necessarily convex, within the progressively measurable framework. It is worth noting that not only the control variable enters into all the coefficients, but also the jump size "$e$". We first proposed that the solution…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: 32 pages

    MSC Class: 93E20; 49K45; 60H10; 60G55

  38. arXiv:2407.03374  [pdf]

    cs.AI cs.SE eess.SP eess.SY

    An Outline of Prognostics and Health Management Large Model: Concepts, Paradigms, and Challenges

    Authors: Laifa Tao, Shangyu Li, Haifei Liu, Qixuan Huang, Liang Ma, Guoao Ning, Yiling Chen, Yunlong Wu, Bin Li, Weiwei Zhang, Zhengduo Zhao, Wenchao Zhan, Wenyan Cao, Chao Wang, Hongmei Liu, Jian Ma, Mingliang Suo, Yujie Cheng, Yu Ding, Dengwei Song, Chen Lu

    Abstract: Prognosis and Health Management (PHM), critical for ensuring task completion by complex systems and preventing unexpected failures, is widely adopted in aerospace, manufacturing, maritime, rail, energy, etc. However, PHM's development is constrained by bottlenecks like generalization, interpretation and verification abilities. Presently, generative artificial intelligence (AI), represented by Larg…

    Submitted 1 July, 2024; originally announced July 2024.

  39. arXiv:2407.03320  [pdf, other]

    cs.CV cs.CL

    InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

    Authors: Pan Zhang, Xiaoyi Dong, Yuhang Zang, Yuhang Cao, Rui Qian, Lin Chen, Qipeng Guo, Haodong Duan, Bin Wang, Linke Ouyang, Songyang Zhang, Wenwei Zhang, Yining Li, Yang Gao, Peng Sun, Xinyue Zhang, Wei Li, Jingwen Li, Wenhai Wang, Hang Yan, Conghui He, Xingcheng Zhang, Kai Chen, Jifeng Dai, Yu Qiao, et al. (2 additional authors not shown)

    Abstract: We present InternLM-XComposer-2.5 (IXC-2.5), a versatile large-vision language model that supports long-contextual input and output. IXC-2.5 excels in various text-image comprehension and composition applications, achieving GPT-4V level capabilities with merely 7B LLM backend. Trained with 24K interleaved image-text contexts, it can seamlessly extend to 96K long contexts via RoPE extrapolation. Th…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Technical Report. https://github.com/InternLM/InternLM-XComposer

  40. arXiv:2407.01937  [pdf, other]

    cs.CL

    Efficient-Empathy: Towards Efficient and Effective Selection of Empathy Data

    Authors: Linzhuang Sun, Hao Liang, Jingxuan Wei, Linkun Sun, Bihui Yu, Bin Cui, Wentao Zhang

    Abstract: In recent years, with the rapid advancements in large language models (LLMs), achieving excellent empathetic response capability has become a crucial prerequisite. Consequently, managing and understanding large-scale video datasets has gained increasing importance. However, empathetic data are typically trained without any quality selection, leading to inefficient data usage and wasted computation…

    Submitted 9 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

  41. arXiv:2407.00886  [pdf, other]

    cs.AI cs.CL cs.LG

    Mechanistic Interpretation through Contextual Decomposition in Transformers

    Authors: Aliyah R. Hsu, Yeshwanth Cherapanamjeri, Anobel Y. Odisho, Peter R. Carroll, Bin Yu

    Abstract: Transformers exhibit impressive capabilities but are often regarded as black boxes due to challenges in understanding the complex nonlinear relationships between features. Interpreting machine learning models is of paramount importance to mitigate risks, and mechanistic interpretability is in particular of current interest as it opens up a window for guiding manual modifications and reverse-engine…

    Submitted 30 June, 2024; originally announced July 2024.

  42. arXiv:2407.00016  [pdf, other]

    cs.DC

    AdaBridge: Dynamic Data and Computation Reuse for Efficient Multi-task DNN Co-evolution in Edge Systems

    Authors: Lehao Wang, Zhiwen Yu, Sicong Liu, Chenshu Wu, Xiangrui Xu, Bin Guo

    Abstract: Running multi-task DNNs on mobiles is an emerging trend for various applications like autonomous driving and mobile NLP. Mobile DNNs are often compressed to fit the limited resources and thus suffer from degraded accuracy and generalizability due to data drift. DNN evolution, e.g., continuous learning and domain adaptation, has been demonstrated effective in overcoming these issues, mostly for sin…

    Submitted 2 May, 2024; originally announced July 2024.

    Comments: Accepted by NSDI'24 Poster

  43. arXiv:2406.20078  [pdf, other]

    cs.CV

    GM-DF: Generalized Multi-Scenario Deepfake Detection

    Authors: Yingxin Lai, Zitong Yu, Jing Yang, Bin Li, Xiangui Kang, Linlin Shen

    Abstract: Existing face forgery detection usually follows the paradigm of training models in a single domain, which leads to limited generalization capacity when unseen scenarios and unknown attacks occur. In this paper, we elaborately investigate the generalization capacity of deepfake detection models when jointly trained on multiple face forgery detection datasets. We first find a rapid degradation of de…

    Submitted 28 June, 2024; originally announced June 2024.

  44. arXiv:2406.19958  [pdf, other]

    stat.ML cs.LG math.ST

    The Computational Curse of Big Data for Bayesian Additive Regression Trees: A Hitting Time Analysis

    Authors: Yan Shuo Tan, Omer Ronen, Theo Saarinen, Bin Yu

    Abstract: Bayesian Additive Regression Trees (BART) is a popular Bayesian non-parametric regression model that is commonly used in causal inference and beyond. Its strong predictive performance is supported by theoretical guarantees that its posterior distribution concentrates around the true regression function at optimal rates under various data generative settings and for appropriate prior choices. In th…

    Submitted 28 June, 2024; originally announced June 2024.

    MSC Class: 62G08; 65C40

  45. arXiv:2406.19769  [pdf, other]

    eess.SP

    Decision Transformer for IRS-Assisted Systems with Diffusion-Driven Generative Channels

    Authors: Jie Zhang, Jun Li, Zhe Wang, Yu Han, Long Shi, Bin Cao

    Abstract: In this paper, we propose a novel diffusion-decision transformer (D2T) architecture to optimize the beamforming strategies for intelligent reflecting surface (IRS)-assisted multiple-input single-output (MISO) communication systems. The first challenge lies in the expensive computation cost to recover the real-time channel state information (CSI) from the received pilot signals, which usually requi…

    Submitted 28 June, 2024; originally announced June 2024.

  46. arXiv:2406.18269  [pdf, other]

    physics.chem-ph physics.comp-ph

    Refining Potential Energy Surface through Dynamical Properties via Differentiable Molecular Simulation

    Authors: Bin Han, Kuang Yu

    Abstract: Recently, machine learning potentials (MLPs) have greatly enhanced the reliability of molecular dynamics, but their accuracy is limited by the underlying $\textit{ab initio}$ methods. A viable approach to overcome this limitation is to refine the potential by learning from experimental data, which can now be done efficiently using modern automatic differentiation techniques. However, potential refinement i…

    Submitted 27 June, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

  47. arXiv:2406.17555  [pdf, ps, other]

    physics.plasm-ph

    A response to commenter Ke Lan's comment on our paper published in Nature Communications (2023)14:5782 by J. Yan et al

    Authors: Ji Yan, Jiwei Li, X. T. He, Lifeng Wang, Yaohua Chen, Feng Wang, Xiaoying Han, Kaiqiang Pan, Juxi Liang, Yulong Li, Zanyang Guan, Xiangming Liu, Xingsen Che, Zhongjing Chen, Xing Zhang, Yan Xu, Bin Li, Minging He, Hongbo Cai, Liang Hao, Zhanjun Liu, Chunyang Zheng, Zhensheng Dai, Zhengfeng Fan, Bin Qiao, et al. (4 additional authors not shown)

    Abstract: A response to commenter Ke Lan's comment on our paper published in Nature Communications (2023)14:5782 by J. Yan et al

    Submitted 25 June, 2024; originally announced June 2024.

  48. arXiv:2406.17278  [pdf, other]

    stat.ME econ.EM math.ST

    Estimation and Inference for CP Tensor Factor Models

    Authors: Bin Chen, Yuefeng Han, Qiyang Yu

    Abstract: High-dimensional tensor-valued data have recently gained attention from researchers in economics and finance. We consider the estimation and inference of high-dimensional tensor factor models, where each dimension of the tensor diverges. Our focus is on a factor model that admits CP-type tensor decomposition, which allows for non-orthogonal loading vectors. Based on the contemporary covariance mat…

    Submitted 25 June, 2024; originally announced June 2024.

  49. arXiv:2406.16786  [pdf, other]

    cs.CE

    Generalized and high-efficiency arbitrary-positioned buffer for smoothed particle hydrodynamics

    Authors: Shuoguo Zhang, Yu Fan, Yaru Ren, Bin Qian, Xiangyu Hu

    Abstract: This paper develops an arbitrary-positioned buffer for the smoothed particle hydrodynamics (SPH) method, whose generality and high efficiency are achieved through two techniques. First, with the local coordinate system established at each arbitrary-positioned in-/outlet, particle positions in the global coordinate system are transformed into those in it via coordinate transformation. Since one loc…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 34 pages and 17 figures

  50. arXiv:2406.12793  [pdf, other]

    cs.CL

    ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools

    Authors: Team GLM: Aohan Zeng, Bin Xu, Bowen Wang, Chenhui Zhang, Da Yin, Dan Zhang, Diego Rojas, Guanyu Feng, Hanlin Zhao, Hanyu Lai, Hao Yu, Hongning Wang, Jiadai Sun, Jiajie Zhang, Jiale Cheng, Jiayi Gui, Jie Tang, Jing Zhang, Jingyu Sun, Juanzi Li, Lei Zhao, Lindong Wu, Lucen Zhong, et al. (34 additional authors not shown)

    Abstract: We introduce ChatGLM, an evolving family of large language models that we have been developing over time. This report primarily focuses on the GLM-4 language series, which includes GLM-4, GLM-4-Air, and GLM-4-9B. They represent our most capable models that are trained with all the insights and lessons gained from the preceding three generations of ChatGLM. To date, the GLM-4 models are pre-trained…

    Submitted 29 July, 2024; v1 submitted 18 June, 2024; originally announced June 2024.