Showing 1–50 of 69 results for author: Niu, D

  1. arXiv:2408.08495  [pdf, other]

    cs.CV

    Achieving Complex Image Edits via Function Aggregation with Diffusion Models

    Authors: Mohammadreza Samadi, Fred X. Han, Mohammad Salameh, Hao Wu, Fengyu Sun, Chunhua Zhou, Di Niu

    Abstract: Diffusion models have demonstrated strong performance in generative tasks, making them ideal candidates for image editing. Recent studies highlight their ability to apply desired edits effectively by following textual instructions, yet two key challenges persist. First, these models struggle to apply multiple edits simultaneously, resulting in computational inefficiencies due to their reliance on…

    Submitted 15 August, 2024; originally announced August 2024.

  2. arXiv:2407.21721  [pdf, other]

    cs.MM cs.AI

    Open-Vocabulary Audio-Visual Semantic Segmentation

    Authors: Ruohao Guo, Liao Qu, Dantong Niu, Yanyu Qi, Wenzhen Yue, Ji Shi, Bowei Xing, Xianghua Ying

    Abstract: Audio-visual semantic segmentation (AVSS) aims to segment and classify sounding objects in videos with acoustic cues. However, most approaches operate on the close-set assumption and only identify pre-defined categories from training data, lacking the generalization ability to detect novel categories in practical applications. In this paper, we introduce a new task: open-vocabulary audio-visual se…

    Submitted 31 July, 2024; originally announced July 2024.

    Comments: Accepted by ACM MM 2024 (Oral)

  3. arXiv:2406.11815  [pdf, other]

    cs.RO cs.CV cs.LG

    LLARVA: Vision-Action Instruction Tuning Enhances Robot Learning

    Authors: Dantong Niu, Yuvan Sharma, Giscard Biamby, Jerome Quenum, Yutong Bai, Baifeng Shi, Trevor Darrell, Roei Herzig

    Abstract: In recent years, instruction-tuned Large Multimodal Models (LMMs) have been successful at several tasks, including image captioning and visual question answering; yet leveraging these models remains an open question for robotics. Prior LMMs for robotics applications have been extensively trained on language and action data, but their ability to generalize in different settings has often been less…

    Submitted 17 June, 2024; originally announced June 2024.

  4. arXiv:2406.11301  [pdf, other]

    cs.AI cs.CL cs.LG

    Enhancing and Assessing Instruction-Following with Fine-Grained Instruction Variants

    Authors: Jiuding Yang, Weidong Guo, Kaitong Yang, Xiangyang Li, Zhuwei Rao, Yu Xu, Di Niu

    Abstract: The effective alignment of Large Language Models (LLMs) with precise instructions is essential for their application in diverse real-world scenarios. Current methods focus on enhancing the diversity and complexity of training and evaluation samples, yet they fall short in accurately assessing LLMs' ability to follow similar instruction variants. We introduce an effective data augmentation techniqu…

    Submitted 31 July, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

  5. arXiv:2405.08013  [pdf, other]

    cs.LG cs.AI cs.SI

    CTRL: Continuous-Time Representation Learning on Temporal Heterogeneous Information Network

    Authors: Chenglin Li, Yuanzhen Xie, Chenyun Yu, Lei Cheng, Bo Hu, Zang Li, Di Niu

    Abstract: Inductive representation learning on temporal heterogeneous graphs is crucial for scalable deep learning on heterogeneous information networks (HINs) which are time-varying, such as citation networks. However, most existing approaches are not inductive and thus cannot handle new nodes or edges. Moreover, previous temporal graph embedding methods are often trained with the temporal link prediction…

    Submitted 10 May, 2024; originally announced May 2024.

  6. arXiv:2405.01762  [pdf, ps, other]

    cs.LG

    EiG-Search: Generating Edge-Induced Subgraphs for GNN Explanation in Linear Time

    Authors: Shengyao Lu, Bang Liu, Keith G. Mills, Jiao He, Di Niu

    Abstract: Understanding and explaining the predictions of Graph Neural Networks (GNNs), is crucial for enhancing their safety and trustworthiness. Subgraph-level explanations are gaining attention for their intuitive appeal. However, most existing subgraph-level explainers face efficiency challenges in explaining GNNs due to complex search processes. The key challenge is to find a balance between intuitiven…

    Submitted 16 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

    Comments: 19 pages

    Journal ref: ICML 2024

  7. arXiv:2404.19038  [pdf, other]

    cs.CV cs.AI

    Embedded Representation Learning Network for Animating Styled Video Portrait

    Authors: Tianyong Wang, Xiangyu Liang, Wangguandong Zheng, Dan Niu, Haifeng Xia, Siyu Xia

    Abstract: The talking head generation recently attracted considerable attention due to its widespread application prospects, especially for digital avatars and 3D animation design. Inspired by this practical demand, several works explored Neural Radiance Fields (NeRF) to synthesize the talking heads. However, these methods based on NeRF face two challenges: (1) Difficulty in generating style-controllable ta…

    Submitted 29 April, 2024; originally announced April 2024.

  8. arXiv:2403.13293  [pdf, other]

    cs.CV cs.AI cs.LG

    Building Optimal Neural Architectures using Interpretable Knowledge

    Authors: Keith G. Mills, Fred X. Han, Mohammad Salameh, Shengyao Lu, Chunhua Zhou, Jiao He, Fengyu Sun, Di Niu

    Abstract: Neural Architecture Search is a costly practice. The fact that a search space can span a vast number of design choices with each architecture evaluation taking nontrivial overhead makes it hard for an algorithm to sufficiently explore candidate networks. In this paper, we propose AutoBuild, a scheme which learns to align the latent embeddings of operations and architecture modules with the ground-…

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: CVPR'24; 18 Pages, 18 Figures, 3 Tables

  9. arXiv:2403.07557  [pdf, other]

    cs.CL cs.LG

    SIFiD: Reassess Summary Factual Inconsistency Detection with LLM

    Authors: Jiuding Yang, Hui Liu, Weidong Guo, Zhuwei Rao, Yu Xu, Di Niu

    Abstract: Ensuring factual consistency between the summary and the original document is paramount in summarization tasks. Consequently, considerable effort has been dedicated to detecting inconsistencies. With the advent of Large Language Models (LLMs), recent studies have begun to leverage their advanced language understanding capabilities for inconsistency detection. However, early attempts have shown tha…

    Submitted 12 March, 2024; originally announced March 2024.

  10. arXiv:2402.11140  [pdf, other]

    cs.CL cs.AI cs.LG

    Boosting of Thoughts: Trial-and-Error Problem Solving with Large Language Models

    Authors: Sijia Chen, Baochun Li, Di Niu

    Abstract: The reasoning performance of Large Language Models (LLMs) on a wide range of problems critically relies on chain-of-thought prompting, which involves providing a few chain of thought demonstrations as exemplars in prompts. Recent work, e.g., Tree of Thoughts, has pointed out the importance of exploration and self-evaluation in reasoning step selection for complex problem solving. In this paper, we…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: Accepted as a poster paper by ICLR2024. 27 pages, 5 figures, 18 tables. [Source Code](https://github.com/iQua/llmpebase/tree/main/examples/BoTReasoning)

  11. arXiv:2401.15235  [pdf, other]

    eess.IV cs.CV cs.LG

    CascadedGaze: Efficiency in Global Context Extraction for Image Restoration

    Authors: Amirhosein Ghasemabadi, Muhammad Kamran Janjua, Mohammad Salameh, Chunhua Zhou, Fengyu Sun, Di Niu

    Abstract: Image restoration tasks traditionally rely on convolutional neural networks. However, given the local nature of the convolutional operator, they struggle to capture global information. The promise of attention mechanisms in Transformers is to circumvent this problem, but it comes at the cost of intensive computational overhead. Many recent studies in image restoration have focused on solving the c…

    Submitted 7 May, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

    Comments: Published in Transactions on Machine Learning Research (TMLR), 2024. 20 pages

  12. arXiv:2401.14578  [pdf, ps, other]

    cs.LG

    GOAt: Explaining Graph Neural Networks via Graph Output Attribution

    Authors: Shengyao Lu, Keith G. Mills, Jiao He, Bang Liu, Di Niu

    Abstract: Understanding the decision-making process of Graph Neural Networks (GNNs) is crucial to their interpretability. Most existing methods for explaining GNNs typically rely on training auxiliary models, resulting in the explanations remain black-boxed. This paper introduces Graph Output Attribution (GOAt), a novel method to attribute graph outputs to input graph features, creating GNN explanations tha…

    Submitted 25 January, 2024; originally announced January 2024.

    Comments: ICLR 2024 Poster

  13. arXiv:2312.17243  [pdf, other]

    cs.CV

    Unsupervised Universal Image Segmentation

    Authors: Dantong Niu, Xudong Wang, Xinyang Han, Long Lian, Roei Herzig, Trevor Darrell

    Abstract: Several unsupervised image segmentation approaches have been proposed which eliminate the need for dense manually-annotated segmentation masks; current models separately handle either semantic segmentation (e.g., STEGO) or class-agnostic instance segmentation (e.g., CutLER), but not both (i.e., panoptic segmentation). We propose an Unsupervised Universal Segmentation model (U2Seg) adept at perform…

    Submitted 28 December, 2023; originally announced December 2023.

  14. arXiv:2312.15692  [pdf, other]

    cs.AI

    Instruction Fusion: Advancing Prompt Evolution through Hybridization

    Authors: Weidong Guo, Jiuding Yang, Kaitong Yang, Xiangyang Li, Zhuwei Rao, Yu Xu, Di Niu

    Abstract: The fine-tuning of Large Language Models (LLMs) specialized in code generation has seen notable advancements through the use of open-domain coding queries. Despite the successes, existing methodologies like Evol-Instruct encounter performance limitations, impeding further enhancements in code generation tasks. This paper examines the constraints of existing prompt evolution techniques and introduc…

    Submitted 17 June, 2024; v1 submitted 25 December, 2023; originally announced December 2023.

  15. arXiv:2311.17942  [pdf, other]

    cs.CV

    Object-based (yet Class-agnostic) Video Domain Adaptation

    Authors: Dantong Niu, Amir Bar, Roei Herzig, Trevor Darrell, Anna Rohrbach

    Abstract: Existing video-based action recognition systems typically require dense annotation and struggle in environments when there is significant distribution shift relative to the training data. Current methods for video domain adaptation typically fine-tune the model using fully annotated data on a subset of target domain data or align the representation of the two domains using bootstrapping or adversa…

    Submitted 28 November, 2023; originally announced November 2023.

  16. arXiv:2310.18709  [pdf, other]

    cs.CV cs.LG cs.MM cs.SD eess.AS

    Audio-Visual Instance Segmentation

    Authors: Ruohao Guo, Yaru Chen, Yanyu Qi, Wenzhen Yue, Dantong Niu, Xianghua Ying

    Abstract: In this paper, we propose a new multi-modal task, namely audio-visual instance segmentation (AVIS), in which the goal is to identify, segment, and track individual sounding object instances in audible videos, simultaneously. To our knowledge, it is the first time that instance segmentation has been extended into the audio-visual domain. To better facilitate this research, we construct the first au…

    Submitted 28 October, 2023; originally announced October 2023.

  17. arXiv:2309.08159  [pdf, other]

    cs.CV cs.IR cs.LG

    AdSEE: Investigating the Impact of Image Style Editing on Advertisement Attractiveness

    Authors: Liyao Jiang, Chenglin Li, Haolan Chen, Xiaodong Gao, Xinwang Zhong, Yang Qiu, Shani Ye, Di Niu

    Abstract: Online advertisements are important elements in e-commerce sites, social media platforms, and search engines. With the increasing popularity of mobile browsing, many online ads are displayed with visual information in the form of a cover image in addition to text descriptions to grab the attention of users. Various recent studies have focused on predicting the click rates of online advertisements…

    Submitted 15 September, 2023; originally announced September 2023.

    Comments: Accepted to KDD 2023 Applied Data Science Track

  18. arXiv:2309.07967  [pdf, other]

    cs.IR

    iHAS: Instance-wise Hierarchical Architecture Search for Deep Learning Recommendation Models

    Authors: Yakun Yu, Shi-ang Qi, Jiuding Yang, Liyao Jiang, Di Niu

    Abstract: Current recommender systems employ large-sized embedding tables with uniform dimensions for all features, leading to overfitting, high computational cost, and suboptimal generalizing performance. Many techniques aim to solve this issue by feature selection or embedding dimension search. However, these techniques typically select a fixed subset of features or embedding dimensions for all instances…

    Submitted 14 September, 2023; originally announced September 2023.

    Comments: Accepted as CIKM23 Long paper

  19. arXiv:2306.15796  [pdf, other]

    cs.AI

    ConKI: Contrastive Knowledge Injection for Multimodal Sentiment Analysis

    Authors: Yakun Yu, Mingjun Zhao, Shi-ang Qi, Feiran Sun, Baoxun Wang, Weidong Guo, Xiaoli Wang, Lei Yang, Di Niu

    Abstract: Multimodal Sentiment Analysis leverages multimodal signals to detect the sentiment of a speaker. Previous approaches concentrate on performing multimodal fusion and representation learning based on general knowledge obtained from pretrained models, which neglects the effect of domain-specific knowledge. In this paper, we propose Contrastive Knowledge Injection (ConKI) for multimodal sentiment anal…

    Submitted 27 June, 2023; originally announced June 2023.

    Comments: Accepted by ACL Findings 2023

  20. arXiv:2304.12561  [pdf, other]

    cs.CV cs.MM

    TCR: Short Video Title Generation and Cover Selection with Attention Refinement

    Authors: Yakun Yu, Jiuding Yang, Weidong Guo, Hui Liu, Yu Xu, Di Niu

    Abstract: With the widespread popularity of user-generated short videos, it becomes increasingly challenging for content creators to promote their content to potential viewers. Automatically generating appealing titles and covers for short videos can help grab viewers' attention. Existing studies on video captioning mostly focus on generating factual descriptions of actions, which do not conform to video ti…

    Submitted 25 April, 2023; originally announced April 2023.

    Comments: Accepted by PAKDD23

  21. CEIL: A General Classification-Enhanced Iterative Learning Framework for Text Clustering

    Authors: Mingjun Zhao, Mengzhen Wang, Yinglong Ma, Di Niu, Haijiang Wu

    Abstract: Text clustering, as one of the most fundamental challenges in unsupervised learning, aims at grouping semantically similar text segments without relying on human annotations. With the rapid development of deep learning, deep clustering has achieved significant advantages over traditional clustering methods. Despite the effectiveness, most existing deep text clustering methods rely heavily on repre…

    Submitted 20 April, 2023; originally announced April 2023.

    Comments: The Web Conference 2023

  22. arXiv:2304.10316  [pdf, other]

    cs.CV

    Search-Map-Search: A Frame Selection Paradigm for Action Recognition

    Authors: Mingjun Zhao, Yakun Yu, Xiaoli Wang, Lei Yang, Di Niu

    Abstract: Despite the success of deep learning in video understanding tasks, processing every frame in a video is computationally expensive and often unnecessary in real-time applications. Frame selection aims to extract the most informative and representative frames to help a model better understand video content. Existing frame selection methods either individually sample frames based on per-frame importa…

    Submitted 20 April, 2023; originally announced April 2023.

    Comments: CVPR 2023

  23. LA3: Efficient Label-Aware AutoAugment

    Authors: Mingjun Zhao, Shan Lu, Zixuan Wang, Xiaoli Wang, Di Niu

    Abstract: Automated augmentation is an emerging and effective technique to search for data augmentation policies to improve generalizability of deep neural network training. Most existing work focuses on constructing a unified policy applicable to all data samples in a given dataset, without considering sample or class variations. In this paper, we propose a novel two-stage data augmentation algorithm, name…

    Submitted 20 April, 2023; originally announced April 2023.

    Comments: ECCV 2022

  24. arXiv:2303.17870  [pdf, other]

    cs.CV

    GlyphDraw: Seamlessly Rendering Text with Intricate Spatial Structures in Text-to-Image Generation

    Authors: Jian Ma, Mingjun Zhao, Chen Chen, Ruichen Wang, Di Niu, Haonan Lu, Xiaodong Lin

    Abstract: Recent breakthroughs in the field of language-guided image generation have yielded impressive achievements, enabling the creation of high-quality and diverse images based on user instructions. Although the synthesis performance is fascinating, one significant limitation of current image generation models is their insufficient ability to generate text coherently within images, particularly for compl…

    Submitted 23 May, 2023; v1 submitted 31 March, 2023; originally announced March 2023.

    Comments: 24 pages, 5 figures

  25. arXiv:2303.02733  [pdf, other]

    cs.LG cs.AI cs.CV

    Reparameterization through Spatial Gradient Scaling

    Authors: Alexander Detkov, Mohammad Salameh, Muhammad Fetrat Qharabagh, Jialin Zhang, Wei Lui, Shangling Jui, Di Niu

    Abstract: Reparameterization aims to improve the generalization of deep neural networks by transforming convolutional layers into equivalent multi-branched structures during training. However, there exists a gap in understanding how reparameterization may change and benefit the learning process of neural networks. In this paper, we present a novel spatial gradient scaling method to redistribute learning foc…

    Submitted 6 March, 2023; v1 submitted 5 March, 2023; originally announced March 2023.

    Comments: Published at ICLR 2023. Code available at https://github.com/Ascend-Research/Reparameterization

  26. A General-Purpose Transferable Predictor for Neural Architecture Search

    Authors: Fred X. Han, Keith G. Mills, Fabian Chudak, Parsa Riahi, Mohammad Salameh, Jialin Zhang, Wei Lu, Shangling Jui, Di Niu

    Abstract: Understanding and modelling the performance of neural architectures is key to Neural Architecture Search (NAS). Performance predictors have seen widespread use in low-cost NAS and achieve high ranking correlations between predicted and ground truth performance in several NAS benchmarks. However, existing predictors are often designed based on network encodings specific to a predefined search space…

    Submitted 21 February, 2023; originally announced February 2023.

    Comments: Accepted to SDM2023; version includes supplementary material; 12 Pages, 3 Figures, 6 Tables

  27. AIO-P: Expanding Neural Performance Predictors Beyond Image Classification

    Authors: Keith G. Mills, Di Niu, Mohammad Salameh, Weichen Qiu, Fred X. Han, Puyuan Liu, Jialin Zhang, Wei Lu, Shangling Jui

    Abstract: Evaluating neural network performance is critical to deep neural network design but a costly procedure. Neural predictors provide an efficient solution by treating architectures as samples and learning to estimate their performance on a given task. However, existing predictors are task-dependent, predominantly estimating neural network performance on image classification benchmarks. They are also…

    Submitted 24 April, 2023; v1 submitted 30 November, 2022; originally announced November 2022.

    Comments: AAAI 2023 Oral Presentation; version includes supplementary material; 16 Pages, 4 Figures, 22 Tables

  28. GENNAPE: Towards Generalized Neural Architecture Performance Estimators

    Authors: Keith G. Mills, Fred X. Han, Jialin Zhang, Fabian Chudak, Ali Safari Mamaghani, Mohammad Salameh, Wei Lu, Shangling Jui, Di Niu

    Abstract: Predicting neural architecture performance is a challenging task and is crucial to neural architecture design and search. Existing approaches either rely on neural performance predictors which are limited to modeling architectures in a predefined design space involving specific sets of operators and connection rules, and cannot generalize to unseen architectures, or resort to zero-cost proxies whi…

    Submitted 24 April, 2023; v1 submitted 30 November, 2022; originally announced November 2022.

    Comments: AAAI 2023 Oral Presentation; includes supplementary materials with more details on introduced benchmarks; 14 Pages, 6 Figures, 10 Tables

  29. One for All, All for One: Learning and Transferring User Embeddings for Cross-Domain Recommendation

    Authors: Chenglin Li, Yuanzhen Xie, Chenyun Yu, Bo Hu, Zang Li, Guoqiang Shu, Xiaohu Qie, Di Niu

    Abstract: Cross-domain recommendation is an important method to improve recommender system performance, especially when observations in target domains are sparse. However, most existing techniques focus on single-target or dual-target cross-domain recommendation (CDR) and are hard to be generalized to CDR with multiple target domains. In addition, the negative transfer problem is prevalent in CDR, where the…

    Submitted 21 November, 2022; originally announced November 2022.

    Comments: 9 pages, accepted by WSDM 2023

  30. arXiv:2211.10854  [pdf, other]

    cs.CL cs.LG

    Mulco: Recognizing Chinese Nested Named Entities Through Multiple Scopes

    Authors: Jiuding Yang, Jinwen Luo, Weidong Guo, Jerry Chen, Di Niu, Yu Xu

    Abstract: Nested Named Entity Recognition (NNER) has been a long-term challenge to researchers as an important sub-area of Named Entity Recognition. NNER is where one entity may be part of a longer entity, and this may happen on multiple levels, as the term nested suggests. These nested structures make traditional sequence labeling methods unable to properly recognize all entities. While recent researches f…

    Submitted 19 November, 2022; originally announced November 2022.

  31. arXiv:2207.13848  [pdf, other]

    cs.DC cs.LG cs.PF math.NA

    Predicting the Output Structure of Sparse Matrix Multiplication with Sampled Compression Ratio

    Authors: Zhaoyang Du, Yijin Guan, Tianchan Guan, Dimin Niu, Nianxiong Tan, Xiaopeng Yu, Hongzhong Zheng, Jianyi Meng, Xiaolang Yan, Yuan Xie

    Abstract: Sparse general matrix multiplication (SpGEMM) is a fundamental building block in numerous scientific applications. One critical task of SpGEMM is to compute or predict the structure of the output matrix (i.e., the number of nonzero elements per output row) for efficient memory allocation and load balance, which impact the overall performance of SpGEMM. Existing work either precisely calculates the…

    Submitted 27 July, 2022; originally announced July 2022.

    Comments: This paper has been submitted to the IEEE International Conference on Parallel and Distributed Systems (ICPADS). 8 pages, 2 figures, 3 tables

    ACM Class: F.2.1; G.3; D.1.3; G.1.3

  32. arXiv:2206.07244  [pdf, other]

    cs.DC

    OpSparse: a Highly Optimized Framework for Sparse General Matrix Multiplication on GPUs

    Authors: Zhaoyang Du, Yijin Guan, Tianchan Guan, Dimin Niu, Linyong Huang, Hongzhong Zheng, Yuan Xie

    Abstract: Sparse general matrix multiplication (SpGEMM) is an important and expensive computation primitive in many real-world applications. Due to SpGEMM's inherent irregularity and the vast diversity of its input matrices, developing high-performance SpGEMM implementation on modern processors such as GPUs is challenging. The state-of-the-art SpGEMM libraries (i.e., $nsparse$ and $spECK$) adopt several alg…

    Submitted 14 June, 2022; originally announced June 2022.

    Comments: This paper has been submitted to the IEEE Access since May 7, 2022, and is currently under review by IEEE Access. 20 pages, 11 figures, 5 tables

    MSC Class: 68-02; 68W10; 65F50 ACM Class: D.1.3; G.1.3

  33. arXiv:2206.06611  [pdf, other]

    cs.DC cs.MS cs.PF

    Accelerating CPU-Based Sparse General Matrix Multiplication With Binary Row Merging

    Authors: Zhaoyang Du, Yijin Guan, Tianchan Guan, Dimin Niu, Hongzhong Zheng, Yuan Xie

    Abstract: Sparse general matrix multiplication (SpGEMM) is a fundamental building block for many real-world applications. Since SpGEMM is a well-known memory-bounded application with vast and irregular memory accesses, considering the memory access efficiency is of critical importance for SpGEMM's performance. Yet, the existing methods put less consideration into the memory subsystem and achieved suboptimal…

    Submitted 19 August, 2022; v1 submitted 14 June, 2022; originally announced June 2022.

    Comments: This work has been accepted by IEEE Access (DOI:10.1109/ACCESS.2022.3193937). There are 12 pages, 6 figures, 2 tables

    MSC Class: 68-02; 68W10; 65F50 ACM Class: D.1.3; G.1.3

  34. arXiv:2205.06454  [pdf, other]

    cs.AI

    R5: Rule Discovery with Reinforced and Recurrent Relational Reasoning

    Authors: Shengyao Lu, Bang Liu, Keith G. Mills, Shangling Jui, Di Niu

    Abstract: Systematicity, i.e., the ability to recombine known parts and rules to form new sequences while reasoning over relational data, is critical to machine intelligence. A model with strong systematicity is able to train on small-scale tasks and generalize to large-scale tasks. In this paper, we propose R5, a relational reasoning framework based on reinforcement learning that reasons over relational gr…

    Submitted 13 May, 2022; originally announced May 2022.

    Comments: ICLR 2022 Spotlight

  35. RecGURU: Adversarial Learning of Generalized User Representations for Cross-Domain Recommendation

    Authors: Chenglin Li, Mingjun Zhao, Huanming Zhang, Chenyun Yu, Lei Cheng, Guoqiang Shu, Beibei Kong, Di Niu

    Abstract: Cross-domain recommendation can help alleviate the data sparsity issue in traditional sequential recommender systems. In this paper, we propose the RecGURU algorithm framework to generate a Generalized User Representation (GUR) incorporating user information across domains in sequential recommendation, even when there is minimum or no common users in the two domains. We propose a self-attentive au…

    Submitted 19 November, 2021; originally announced November 2021.

    Comments: 11 pages, 2 figures, 4 tables, Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining

  36. arXiv:2110.10444  [pdf, other]

    cs.CV eess.IV

    Moiré Attack (MA): A New Potential Risk of Screen Photos

    Authors: Dantong Niu, Ruohao Guo, Yisen Wang

    Abstract: Images, captured by a camera, play a critical role in training Deep Neural Networks (DNNs). Usually, we assume the images acquired by cameras are consistent with the ones perceived by human eyes. However, due to the different physical mechanisms between human-vision and computer-vision systems, the final perceived images could be very different in some cases, for example shooting on digital monito…

    Submitted 20 October, 2021; originally announced October 2021.

    Comments: NeurIPS 2021

  37. arXiv:2110.06892  [pdf, other]

    cs.LG cs.AI

    TAG: Toward Accurate Social Media Content Tagging with a Concept Graph

    Authors: Jiuding Yang, Weidong Guo, Bang Liu, Yakun Yu, Chaoyue Wang, Jinwen Luo, Linglong Kong, Di Niu, Zhen Wen

    Abstract: Although conceptualization has been widely studied in semantics and knowledge representation, it is still challenging to find the most accurate concept phrases to characterize the main idea of a text snippet on the fast-growing social media. This is partly attributed to the fact that most knowledge bases contain general terms of the world, such as trees and cars, which do not have the defining pow…

    Submitted 15 June, 2022; v1 submitted 13 October, 2021; originally announced October 2021.

    Comments: Accepted by ACM SIGKDD 2022

  38. Profiling Neural Blocks and Design Spaces for Mobile Neural Architecture Search

    Authors: Keith G. Mills, Fred X. Han, Jialin Zhang, Seyed Saeed Changiz Rezaei, Fabian Chudak, Wei Lu, Shuo Lian, Shangling Jui, Di Niu

    Abstract: Neural architecture search automates neural network design and has achieved state-of-the-art results in many deep learning applications. While recent literature has focused on designing networks to maximize accuracy, little work has been conducted to understand the compatibility of architecture design spaces to varying hardware. In this paper, we analyze the neural blocks used to build Once-for-Al…

    Submitted 25 September, 2021; originally announced September 2021.

    Comments: Accepted as an Applied Research Paper at CIKM 2021; 10 pages, 8 Figures, 2 Tables

  39. L$^{2}$NAS: Learning to Optimize Neural Architectures via Continuous-Action Reinforcement Learning

    Authors: Keith G. Mills, Fred X. Han, Mohammad Salameh, Seyed Saeed Changiz Rezaei, Linglong Kong, Wei Lu, Shuo Lian, Shangling Jui, Di Niu

    Abstract: Neural architecture search (NAS) has achieved remarkable results in deep neural network design. Differentiable architecture search converts the search over discrete architectures into a hyperparameter optimization problem which can be solved by gradient descent. However, questions have been raised regarding the effectiveness and generalizability of gradient methods for solving non-convex architect…

    Submitted 25 September, 2021; originally announced September 2021.

    Comments: Accepted as a Full Research Paper at CIKM 2021; 10 pages, 3 Figures, 5 Tables

  40. arXiv:2108.09034  [pdf, other]

    cs.CV cs.CR cs.LG eess.IV

    AdvDrop: Adversarial Attack to DNNs by Dropping Information

    Authors: Ranjie Duan, Yuefeng Chen, Dantong Niu, Yun Yang, A. K. Qin, Yuan He

    Abstract: Human can easily recognize visual objects with lost information: even losing most details with only contour reserved, e.g. cartoon. However, in terms of visual perception of Deep Neural Networks (DNNs), the ability for recognizing abstract objects (visual objects with lost information) is still a challenge. In this work, we investigate this issue from an adversarial viewpoint: will the performance…

    Submitted 20 August, 2021; originally announced August 2021.

    Comments: Accepted to ICCV 2021

  41. arXiv:2108.06747  [pdf, other]

    cs.CV

    SOTR: Segmenting Objects with Transformers

    Authors: Ruohao Guo, Dantong Niu, Liao Qu, Zhenbo Li

    Abstract: Most recent transformer-based models show impressive performance on vision tasks, even better than Convolution Neural Networks (CNN). In this work, we present a novel, flexible, and effective transformer-based model for high-quality instance segmentation. The proposed method, Segmenting Objects with TRansformers (SOTR), simplifies the segmentation pipeline, building on an alternative CNN backbone…

    Submitted 17 August, 2021; v1 submitted 15 August, 2021; originally announced August 2021.

    Comments: ICCV 2021

  42. arXiv:2108.03568  [pdf, other]

    cs.CV

    LeafMask: Towards Greater Accuracy on Leaf Segmentation

    Authors: Ruohao Guo, Liao Qu, Dantong Niu, Zhenbo Li, Jun Yue

    Abstract: Leaf segmentation is the most direct and effective way for high-throughput plant phenotype data analysis and quantitative researches of complex traits. Currently, the primary goal of plant phenotyping is to raise the accuracy of the autonomous phenotypic measurement. In this work, we present the LeafMask neural network, a new end-to-end model to delineate each leaf region and count the number of l…

    Submitted 8 August, 2021; originally announced August 2021.

    Journal ref: ICCV 2021 workshop, CVPPA

  43. LICHEE: Improving Language Model Pre-training with Multi-grained Tokenization

    Authors: Weidong Guo, Mingjun Zhao, Lusheng Zhang, Di Niu, Jinwen Luo, Zhenhua Liu, Zhenyang Li, Jianbo Tang

    Abstract: Language model pre-training based on large corpora has achieved tremendous success in terms of constructing enriched contextual representations and has led to significant performance gains on a diverse range of Natural Language Understanding (NLU) tasks. Despite the success, most current pre-trained language models, such as BERT, are trained based on single-grained tokenization, usually with fine-…

    Submitted 3 August, 2021; v1 submitted 2 August, 2021; originally announced August 2021.

    Comments: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

  44. arXiv:2108.00257  [pdf, other]

    cs.LG cs.CE

    BoA-PTA, A Bayesian Optimization Accelerated Error-Free SPICE Solver

    Authors: Wei W. Xing, Xiang Jin, Yi Liu, Dan Niu, Weishen Zhao, Zhou Jin

    Abstract: One of the greatest challenges in IC design is the repeated executions of computationally expensive SPICE simulations, particularly when highly complex chip testing/verification is involved. Recently, pseudo transient analysis (PTA) has shown to be one of the most promising continuation SPICE solver. However, the PTA efficiency is highly influenced by the inserted pseudo-parameters. In this work,…

    Submitted 31 July, 2021; originally announced August 2021.

    ACM Class: I.6.5; I.6.6; J.2; J.6

  45. arXiv:2106.15283  [pdf, other]

    cs.CV cs.LG eess.SP

    Similarity Embedding Networks for Robust Human Activity Recognition

    Authors: Chenglin Li, Carrie Lu Tong, Di Niu, Bei Jiang, Xiao Zuo, Lei Cheng, Jian Xiong, Jianming Yang

    Abstract: Deep learning models for human activity recognition (HAR) based on sensor data have been heavily studied recently. However, the generalization ability of deep models on complex real-world HAR data is limited by the availability of high-quality labeled activity data, which are hard to obtain. In this paper, we design a similarity embedding neural network that maps input sensor signals onto real vec…

    Submitted 31 May, 2021; originally announced June 2021.

  46. Meta-HAR: Federated Representation Learning for Human Activity Recognition

    Authors: Chenglin Li, Di Niu, Bei Jiang, Xiao Zuo, Jianming Yang

    Abstract: Human activity recognition (HAR) based on mobile sensors plays an important role in ubiquitous computing. However, the rise of data regulatory constraints precludes collecting private and labeled signal data from personal devices at scale. Federated learning has emerged as a decentralized alternative solution to model training, which iteratively aggregates locally updated models into a shared glob…

    Submitted 31 May, 2021; originally announced June 2021.

  47. Verdi: Quality Estimation and Error Detection for Bilingual Corpora

    Authors: Mingjun Zhao, Haijiang Wu, Di Niu, Zixuan Wang, Xiaoli Wang

    Abstract: Translation Quality Estimation is critical to reducing post-editing efforts in machine translation and to cross-lingual corpus cleaning. As a research problem, quality estimation (QE) aims to directly estimate the quality of translation in a given pair of source and target sentences, and highlight the words that need corrections, without referencing to golden translations. In this paper, we propos…

    Submitted 3 September, 2021; v1 submitted 31 May, 2021; originally announced May 2021.

    Comments: Accepted by The Web Conference 2021

  48. arXiv:2105.09356  [pdf, other]

    cs.LG cs.CV

    Generative Adversarial Neural Architecture Search

    Authors: Seyed Saeed Changiz Rezaei, Fred X. Han, Di Niu, Mohammad Salameh, Keith Mills, Shuo Lian, Wei Lu, Shangling Jui

    Abstract: Despite the empirical success of neural architecture search (NAS) in deep learning applications, the optimality, reproducibility and cost of NAS schemes remain hard to assess. In this paper, we propose Generative Adversarial NAS (GA-NAS) with theoretically provable convergence guarantees, promoting stability and reproducibility in neural architecture search. Inspired by importance sampling, GA-NAS…

    Submitted 23 June, 2021; v1 submitted 19 May, 2021; originally announced May 2021.

    Comments: 17 pages, 9 figures, 13 Tables

  49. arXiv:2103.06653  [pdf, other]

    cs.AR

    MPU: Towards Bandwidth-abundant SIMT Processor via Near-bank Computing

    Authors: Xinfeng Xie, Peng Gu, Yufei Ding, Dimin Niu, Hongzhong Zheng, Yuan Xie

    Abstract: With the growing number of data-intensive workloads, GPU, which is the state-of-the-art single-instruction-multiple-thread (SIMT) processor, is hindered by the memory bandwidth wall. To alleviate this bottleneck, previously proposed 3D-stacking near-bank computing accelerators benefit from abundant bank-internal bandwidth by bringing computations closer to the DRAM banks. However, these accelerato…

    Submitted 11 March, 2021; originally announced March 2021.

  50. QBSUM: a Large-Scale Query-Based Document Summarization Dataset from Real-world Applications

    Authors: Mingjun Zhao, Shengli Yan, Bang Liu, Xinwang Zhong, Qian Hao, Haolan Chen, Di Niu, Bowei Long, Weidong Guo

    Abstract: Query-based document summarization aims to extract or generate a summary of a document which directly answers or is relevant to the search query. It is an important technique that can be beneficial to a variety of applications such as search engines, document-level machine reading comprehension, and chatbots. Currently, datasets designed for query-based summarization are short in numbers and exist…

    Submitted 28 October, 2020; v1 submitted 27 October, 2020; originally announced October 2020.

    Comments: Accepted by Computer Speech & Language