GPT-based Textile Pilling Classification Using 3D Point Cloud Data

Yu Lu1,2,∗    YuYu Chen1,2    Gang Zhou1    Zhenghua Lan1
1 Hangzhou Innovation Institute, Beihang University
2 State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China
{yulucas, yuyu_chen, zhenghua}@buaa.edu.cn, [email protected]
Abstract

Textile pilling assessment is critical for textile quality control. We collect thousands of 3D point cloud images in an actual textile test environment, then organize and label them as the TextileNet8 dataset. To the best of our knowledge, it is the first publicly available eight-category 3D point cloud dataset in the field of textile pilling assessment. Building on PointGPT, a GPT-like large model for point cloud analysis, we incorporate global features of the input point cloud extracted by a non-parametric network, and thus propose the PointGPT+NN model. Using TextileNet8 as a benchmark, experimental results show that the proposed PointGPT+NN model achieves an overall accuracy (OA) of 91.8% and a mean per-class accuracy (mAcc) of 92.2%. Test results on the publicly available ModelNet40 dataset also validate the competitive performance of the proposed PointGPT+NN model. The proposed TextileNet8 dataset will be publicly available.

1 Introduction

Textiles play an important role in our lives. Pilling, a common issue associated with textiles, typically occurs when textiles are pulled out or twisted into ball shapes due to friction or movement (???), which not only damages the appearance of textiles but also significantly impacts their tactile sensation and mechanical properties. The assessment of textile pilling involves employing specialized equipment to polish textiles followed by visual inspection or automatic methodologies to categorize them based on pill quantity and characteristics (??). The results of pilling assessment serve as critical indicators of textile quality and are of great concern to textile manufacturers. According to the industry consensus, textile pilling is typically categorized into multiple grades contingent upon the severity of the pilling phenomenon (?).

Traditional manual evaluation methods rely on human experience, which is subjective and time-consuming. Several studies (??????) have proposed algorithmic and automated equipment-based approaches for textile pilling assessment, most of which utilize 2D cameras to capture images of the textiles. These methods employ various image-processing algorithms to detect the quantity and characteristics of pilling and then grade the textiles. However, these methods often rely on proprietary and non-public datasets that are small in size, and analysis based on 2D images is susceptible to environmental factors, making it challenging to effectively validate the generality of the proposed methods.

Figure 1: Sample images from the collected point cloud data of polyester. Best viewed in color.

3D point cloud data has demonstrated its extensive applications across various fields, including medical engineering (???), architectural engineering (???), and autonomous driving (???). While point cloud data has been utilized in the textile industry, there is limited research on the evaluation of textile pilling. Compared to image data acquired with 2D cameras, point cloud data acquired with 3D cameras is less susceptible to environmental changes and directly reflects the object’s shape and structure, offering detailed geometric information. These advantages contribute to the widespread usage of 3D point cloud data in object recognition and measurement.

In this study, we exploit 3D point cloud data for textile pilling classification. We collected thousands of point cloud images of textiles and organized them into the TextileNet8 dataset comprising eight categories after data cleaning and annotation. Figure 1 shows an example of the collected raw point cloud images. The PointGPT model (?) utilizes a point cloud sequencer and a dual mask strategy to perform tasks such as semantic segmentation and classification on point cloud data, yielding promising results on publicly available datasets. We developed an improved version of the PointGPT model and applied it to classify textile pilling, obtaining satisfactory results on the TextileNet8 dataset.

In summary, our contributions are as follows. First, we construct an eight-category dataset for textile pilling called TextileNet8, which, to the best of our knowledge, is the first publicly available eight-category point cloud dataset in the field of textile pilling evaluation and can be used as a benchmark for textile pilling assessment. We evaluate the performance of various point cloud classification models on the proposed dataset and benchmark the textile pilling classification task. Second, we introduce an improved model that fuses the global information of the point cloud, named PointGPT+NN. The proposed model achieves an overall accuracy (OA) of 91.8% and a mean per-class accuracy (mAcc) of 92.2% on the proposed dataset.

Figure 2: Overall architecture of the proposed PointGPT+NN. The input point cloud image is divided into multiple point patches, which are then sorted and arranged in an ordered sequence. The absolute position encoding (APE) is also generated in this stage. The arranged point patches are then put into a PointNet network for token embedding. Meanwhile, the input point cloud image is fed into a Point-NN network to do feature embedding on a complete point cloud image. $c$ in the Point-NN network denotes the stage number of the multi-stage hierarchy. The features generated by the PointNet and the Point-NN are then fused by element-wise addition. The fused features and the APE are then put into the extractor, while a dual mask strategy is applied to generate the logit classification results. Best viewed in color.

2 Related Work

2.1 Textile Pilling Evaluation

The majority of automated textile pilling assessment methods utilize 2D images. Zhang et al. (?) proposed a method combining 2DDTCWT image reconstruction with multi-layer perceptrons (MLP) classification and conducted experiments on 203 self-made textile pilling images to evaluate its effectiveness. Furferi et al. (?) developed a machine vision-based program to extract fabric parameters, subsequently training an artificial neural network for automatic textile pilling classification using the extracted features. Xiao et al. (?) utilized a series of image analysis techniques, including the Fourier transform, multidimensional discrete wavelet transform, and an iterative thresholding method, followed by deep learning classification for textile pilling objective evaluation. Yap et al. (?) employed a support vector machine with a radial basis function kernel for wool knitwear pilling evaluation, utilizing 17 collected textile pilling features.

Analysis based on 2D images is susceptible to environmental factors like illumination variation. Some researchers leverage 3D information in textile pilling assessment to enhance the reliability of objective evaluation methods. Jian et al. (?) proposed a semi-calibrated near-light photometric stereo (PS) method, employing the PS algorithm for 3D depth retrieval from 2D images, followed by segmentation and classification using global iterative thresholding and k-nearest neighbors (KNN) methods. Liu et al. (?) utilized structure from motion (SFM) and patch-based multi-view stereo (PMVS) algorithms for depth acquisition via 3D reconstruction, subsequently employing edge detection, adaptive threshold analysis, and morphological analysis for pilling segmentation and classification. Ouyang et al. (?) obtained depth information via 3D reconstruction, utilizing depth and distance criteria for seed selection and region growth to characterize pilling appearance. Fan et al. (?) proposed a textile pilling rating assessment system integrating active contour and neural networks, which utilizes a multi-view stereo vision algorithm for fabric surface reconstruction. The above 3D information-based textile pilling assessment methods still use 2D images and rely on 3D reconstruction to obtain more information about textile pilling.

2.2 3D Point Cloud Classification

Graph-based methods for point cloud classification involve transforming point cloud data into a graph structure. The Dynamic Graph CNN (DGCNN) (?) model dynamically computes graphs at each network layer and introduces a novel neural network module called EdgeConv. This module captures local neighborhood information of the point cloud while learning global shape features. Li et al. (?) introduced a new method for training deep graph convolutional networks (GCNs), leveraging residuals, dense connectivity, and dilated convolutions from convolutional neural networks (CNNs) adapted to the GCN architecture. They constructed a 56-layer DeepGCN model using these techniques. Xu et al. (?) proposed a method for rapid and scalable point cloud learning named Grid-GCN. Grid-GCN employs a novel data structuring strategy known as coverage-aware grid query (CAGQ), which enhances spatial coverage and reduces time complexity by leveraging the efficiency of the grid space. Hu et al. (?) introduced a multi-scale graph convolutional network (M-GCN) that extracts local geometric features based on multi-scale feature fusion. This method effectively enriches the representation capability of point clouds by extracting local topological information across scales.

MLP-based point cloud classification methods generally process the features of each point independently. Qi et al. (?) introduced PointNet, a novel neural network that can directly process point cloud data. The model respects the permutation invariance of the point cloud and, for the first time, provides a unified architecture for point cloud classification, segmentation, and semantic parsing of scenes. Based on PointNet, Qi et al. (?) proposed PointNet++, which learns local features at increasing contextual scales by recursively applying PointNet to a nested partitioning of the point cloud. PointNet++ improves the ability to recognize fine-grained patterns and generalize to complex scenes. Qian et al. (?) reviewed previous works, optimized the training and model scaling strategies, and proposed PointNeXt, which uses an inverted residual bottleneck design and separable MLPs for efficient and effective model scaling. Zhang et al. (?) constructed a non-parametric network (Point-NN) based on farthest point sampling (FPS), KNN, and pooling operations with trigonometric functions. They also introduced a parametric network (Point-PN) by inserting linear layers into Point-NN.

Following the success of Transformer models in NLP and image processing, researchers have begun exploring Transformer-based approaches for point cloud processing. Among these approaches, PointGPT stands out due to its innovative dual mask strategy and the incorporation of distinctive extractor-generator transformer architectures. PointGPT has demonstrated promising results on publicly available 3D point cloud analysis datasets.

3 Methodology

3.1 Generative Pre-training Transformer (GPT) for Point Cloud

The generative pre-training transformer (GPT) (?) excels at learning representative features, primarily through auto-regressive prediction. PointGPT utilizes the GPT framework for analyzing point clouds, which involves two main stages: sorting and embedding, and auto-regression pre-training. In the sorting and embedding stage, $n$ center points are selected by FPS, followed by KNN to construct $n$ point patches around each center point. These center points are sorted using Morton code (?), and the corresponding point patches are arranged accordingly. The sorted point patches ($P^{s}$) are put into a PointNet network (?) to extract geometric information for feature embedding. The embedding of sorted point patches is as follows (?):

$$T=\mathrm{PointNet}(P^{s}). \tag{1}$$

where $T$ represents the $D$-dimensional embedding tokens.
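The sorting stage described above can be sketched in plain Python as follows. This is a minimal illustration rather than the PointGPT implementation: the function names, the fixed FPS starting index, the 10-bit quantization, and the toy cloud size are all assumptions made for the example.

```python
import random

def farthest_point_sampling(points, n_centers, start=0):
    """Greedy FPS: repeatedly pick the point farthest from all chosen centers."""
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    chosen = [start]  # assumption: deterministic start index
    min_d = [sq_dist(p, points[start]) for p in points]
    while len(chosen) < n_centers:
        nxt = max(range(len(points)), key=min_d.__getitem__)
        chosen.append(nxt)
        min_d = [min(d, sq_dist(p, points[nxt])) for d, p in zip(min_d, points)]
    return chosen

def knn_patch(points, center, k):
    """Indices of the k points nearest to `center` (the center itself comes first)."""
    order = sorted(range(len(points)),
                   key=lambda i: sum((u - v) ** 2 for u, v in zip(points[i], center)))
    return order[:k]

def morton_code(p, lo, hi, bits=10):
    """Interleave the bits of the quantized x, y, z coordinates (Z-order curve)."""
    scale = (1 << bits) - 1
    q = [int((c - l) / (h - l) * scale) if h > l else 0 for c, l, h in zip(p, lo, hi)]
    code = 0
    for b in range(bits):
        for axis, v in enumerate(q):
            code |= ((v >> b) & 1) << (3 * b + axis)
    return code

def sorted_patches(points, n_centers, k):
    """Select centers with FPS, order them by Morton code, then build KNN patches."""
    lo = [min(p[a] for p in points) for a in range(3)]
    hi = [max(p[a] for p in points) for a in range(3)]
    centers = farthest_point_sampling(points, n_centers)
    centers.sort(key=lambda i: morton_code(points[i], lo, hi))
    return centers, [knn_patch(points, points[i], k) for i in centers]

random.seed(0)
cloud = [(random.random(), random.random(), random.random()) for _ in range(256)]
centers, patches = sorted_patches(cloud, n_centers=16, k=8)
```

Sorting by Morton code places centers that are close in space near each other in the sequence, which is what makes an auto-regressive ordering of patches meaningful.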

In the auto-regression pre-training stage, PointGPT employs a dual mask strategy to enhance the overall understanding of the point cloud compared to other mask methods. Specifically, the other mask methods mask half of the input tokens, while the dual mask strategy randomly masks an additional portion of the input tokens on top of the vanilla mask. The self-attention process with the dual mask strategy is as follows (?):

$$\mathrm{SelfAttention}(T)=\mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{D}}-(1-M^{d})\cdot\infty\right)V. \tag{2}$$

where $Q$, $K$, $V$ are $T$ encoded with different weights for the $D$ channels, and $M^{d}$ is the dual mask, whose entries are 0 or 1. By Equation (2), an entry of 0 subtracts $\infty$ from the corresponding score, so that position receives zero attention weight, while an entry of 1 leaves the score unchanged.
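A small sketch of Equation (2) in plain Python may help make the masking concrete. It assumes that the vanilla mask is the causal mask of a transformer decoder, that the extra masking never removes a token's attention to itself, and that $Q$, $K$, $V$ all equal $T$ (identity projections); `dual_mask`, `masked_self_attention`, and the ratio 0.5 are illustrative choices, not values from the paper.

```python
import math
import random

def dual_mask(n_tokens, extra_ratio, rng):
    """Causal mask plus extra random masking. 1 = attend, 0 = masked."""
    mask = [[1 if j <= i else 0 for j in range(n_tokens)] for i in range(n_tokens)]
    for i in range(n_tokens):
        for j in range(i):  # only preceding tokens; the diagonal is always kept
            if rng.random() < extra_ratio:
                mask[i][j] = 0
    return mask

def masked_self_attention(Q, K, V, mask):
    """softmax(Q K^T / sqrt(D) - (1 - M) * inf) V, computed row by row."""
    D = len(Q[0])
    out = []
    for i, q in enumerate(Q):
        scores = []
        for j, k in enumerate(K):
            s = sum(a * b for a, b in zip(q, k)) / math.sqrt(D)
            scores.append(s if mask[i][j] else -math.inf)
        top = max(scores)  # finite, because the diagonal is never masked
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([sum(w * v[c] for w, v in zip(weights, V)) / total
                    for c in range(len(V[0]))])
    return out

rng = random.Random(0)
T = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(6)]
mask = dual_mask(len(T), extra_ratio=0.5, rng=rng)
out = masked_self_attention(T, T, T, mask)
```

Because the first token can only attend to itself, its output row equals its own value vector, which is a quick sanity check on any implementation of the mask.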

The extractor in PointGPT consists only of transformer decoder blocks that extract potential representations using a dual mask strategy. In addition, sinusoidal position encoding (PE) is applied to map the center points to absolute position encoding (APE). The generator in PointGPT also consists of transformer blocks responsible for generating point tokens for the subsequent prediction head. The directions relative to the next point patches are provided in the generator to maintain consistency in the order of the point patches.

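| Dataset | Public Access | Image Type | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 | Grade 9 | Total Num. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Luo et al. (?) | No | 2D images | 11 | 18 | 20 | 30 | 19 | / | / | / | / | 98 |
| Liu et al. (?) | No | 2D images | 16 | 27 | 20 | 22 | 20 | / | / | / | / | 105 |
| Zhang et al. (?) | No | 2D images | 32 | 63 | 46 | 49 | 13 | / | / | / | / | 203 |
| Furferi et al. (?) | No | 2D images | 15 | 15 | 15 | 15 | 15 | 15 | 15 | 15 | 15 | 135 |
| TextileNet8 (Ours) | Yes | 3D point clouds | 150 | 165 | 180 | 165 | 165 | 165 | 180 | 165 | / | 1335 |

Table 1: Statistics of TextileNet8 and comparison to other datasets.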

The final component of the auto-regressive pre-training stage is a prediction head, which is responsible for predicting subsequent point patches. The prediction head comprises a two-layer MLP with fully connected (FC) layers and rectified linear unit (ReLU) activation. The prediction head maps tokens generated by the generator to vectors, subsequently restructured to construct the predicted point patches.

As for the loss function in PointGPT, the generation loss $L^{g}$ uses the $l_{1}$-form and $l_{2}$-form of the Chamfer distance (CD) (?), denoted as $L^{g}_{1}$ and $L^{g}_{2}$, and is computed as $L^{g}=L^{g}_{1}+L^{g}_{2}$. During the fine-tuning stage, the objective $L^{f}=L^{d}+\gamma L^{g}$ is optimized, where $L^{d}$ represents the loss for the downstream task, which is CrossEntropyLoss (?) for the classification task, and $\gamma$ balances the contributions of the downstream-task loss and the generation loss.
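The loss terms above can be illustrated with a short sketch. It assumes one common reading of the $l_{n}$-form of the Chamfer distance, in which the point-to-point distance is $\|a-b\|_{n}^{n}$ and each direction is averaged over its own set; the exact norm and normalization used by PointGPT may differ, and the value of `gamma` in the usage line is purely illustrative.

```python
def _dist(a, b, n):
    """||a - b||_n^n: Manhattan distance for n=1, squared Euclidean for n=2."""
    return sum(abs(u - v) ** n for u, v in zip(a, b))

def chamfer(pred, gt, n):
    """Symmetric Chamfer distance between two point sets (assumed l_n form)."""
    pred_to_gt = sum(min(_dist(p, g, n) for g in gt) for p in pred) / len(pred)
    gt_to_pred = sum(min(_dist(g, p, n) for p in pred) for g in gt) / len(gt)
    return pred_to_gt + gt_to_pred

def generation_loss(pred, gt):
    """L^g = L^g_1 + L^g_2."""
    return chamfer(pred, gt, 1) + chamfer(pred, gt, 2)

def finetune_loss(downstream_loss, gen_loss, gamma):
    """L^f = L^d + gamma * L^g."""
    return downstream_loss + gamma * gen_loss

pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
gt = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
l_g = generation_loss(pred, gt)
l_f = finetune_loss(0.7, l_g, gamma=0.5)  # 0.7 stands in for a cross-entropy value
```

For these toy patches the $l_{1}$ term is 1.5 and the $l_{2}$ term is 2.5, so the generation loss is 4.0.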

3.2 Construction of PointGPT+NN

Based on a comprehensive understanding of PointGPT, we identify a potential limitation in its feature embedding process for textile pilling grading purposes. Specifically, while PointGPT utilizes the PointNet network for feature embedding of each point patch, it fails to conduct feature embedding on a complete point cloud image. To address this, we propose a method to fuse features of the entire input point cloud during feature embedding.

One promising solution is the recently developed Point-NN (?), a non-parametric network tailored for 3D point cloud analysis. We combine the global features of the input point cloud extracted by the Point-NN model with PointGPT to enhance the performance of PointGPT on the textile pilling assessment task. Figure 2 shows the overall architecture of the constructed PointGPT+NN model.

Point-NN comprises a non-parametric encoder (NPE) for 3D feature extraction and a point memory bank (PMB) for task-specific recognition. As shown in Figure 2, we use only the NPE, for 3D feature extraction. The NPE first uses trigonometric functions for positional encoding (PosE) and extends them to non-parametric 3D embedding. Sine and cosine functions are used to embed a raw point into a vector (?):

$$\begin{aligned}
f_{i}^{x}[2m] &= \mathrm{sine}\left(\alpha x_{i}/\beta^{\frac{6m}{C_{I}}}\right),\\
f_{i}^{x}[2m+1] &= \mathrm{cosine}\left(\alpha x_{i}/\beta^{\frac{6m}{C_{I}}}\right),
\end{aligned} \tag{3}$$

where $\alpha$ and $\beta$ control the magnitude and wavelengths, respectively. $C_{I}$ denotes the initial feature dimension, and $m\in\left[0,\frac{C_{I}}{6}\right]$ is the channel index. The NPE then adopts a multi-stage hierarchy structure, leveraging farthest point sampling (FPS), KNN, trigonometric function-based local geometry extraction, and pooling for incremental aggregation of local geometry.

This multi-stage hierarchy structure design of NPE aims to generate high-dimensional global features for the point cloud. It is worth noting that Point-NN employs simple trigonometric functions to capture local spatial geometric information and fine-grained semantics of various 3D structures without any learnable operators. When extracting global features of the input point cloud images using NPE, the number of stages in the multi-stage hierarchy structure can be varied to control the complexity of the features. In addition, adjusting the values of $\alpha$ and $\beta$ can better extract the positional relationships of the points.
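As an illustration of Equation (3), the following sketch embeds a single raw point. It assumes that each of the three coordinates receives $C_{I}/3$ channels and that the per-axis encodings are concatenated, so the channel index runs over $m=0,\dots,C_{I}/6-1$; the function name `pos_encode` and the small $C_{I}=12$ are chosen only for the example, while $\alpha=1000$ and $\beta=100$ are the best-performing values reported in Table 3.

```python
import math

def pos_encode(point, C_I=12, alpha=1000.0, beta=100.0):
    """Trigonometric embedding of one 3D point into a C_I-dimensional vector."""
    if C_I % 6 != 0:
        raise ValueError("C_I must be divisible by 6 (sine/cosine pairs per axis)")
    feature = []
    for coord in point:  # x, y, z are encoded independently
        axis = [0.0] * (C_I // 3)
        for m in range(C_I // 6):
            angle = alpha * coord / beta ** (6 * m / C_I)
            axis[2 * m] = math.sin(angle)
            axis[2 * m + 1] = math.cos(angle)
        feature.extend(axis)  # assumption: per-axis encodings are concatenated
    return feature

origin = pos_encode((0.0, 0.0, 0.0))
sample = pos_encode((0.1, 0.2, 0.3))
```

Larger $\alpha$ makes the encoding more sensitive to small coordinate differences, while $\beta$ spreads the channels over a range of wavelengths, which is consistent with the roles the text assigns to the two hyperparameters.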

As shown in Figure 2, the extractor in PointGPT is utilized to enhance the semantic level of the latent representations learned in the pre-training stage. The feature extracted by NPE of Point-NN is fused with PointGPT by element-wise addition as follows:

$$T=\text{Point-NN}\left(P^{r}\right)+\lambda\times\mathrm{PointNet}\left(P^{s}\right) \tag{4}$$

where $P^{r}$ is the raw point cloud and $\lambda$ balances the feature contributions of Point-NN and PointNet; $\lambda$ is set to 3 in our experiments.
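Equation (4) amounts to a weighted element-wise sum, sketched below for one sample. The sketch assumes that the Point-NN features have already been brought to the same tokens-by-channels shape as the PointNet tokens (Table 5 reports 32x512x384 including the batch dimension); how that shape is reached is not specified here, and `fuse_embeddings` is a hypothetical helper name.

```python
def fuse_embeddings(nn_features, patch_tokens, lam=3.0):
    """T = Point-NN(P^r) + lam * PointNet(P^s), element by element."""
    same_rows = len(nn_features) == len(patch_tokens)
    if not same_rows or any(len(a) != len(b) for a, b in zip(nn_features, patch_tokens)):
        raise ValueError("feature shapes must match for element-wise addition")
    return [[f + lam * t for f, t in zip(f_row, t_row)]
            for f_row, t_row in zip(nn_features, patch_tokens)]

# Two tokens with two channels each, instead of 512 tokens with 384 channels.
fused = fuse_embeddings([[1.0, 2.0], [0.0, 0.0]], [[10.0, 20.0], [1.0, 1.0]])
```

With $\lambda=3$ the learned PointNet tokens dominate the sum, and the non-parametric features act as an additive correction.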

4 Dataset

4.1 Data Acquisition and Annotation

The raw data were acquired from the actual test environment of polyester textile pilling, with each image containing approximately 200,000 points and covering an area of about $10\times 20$ mm in width and length. These images were captured using an SSZN8060 3D camera manufactured by Shenzhen DeepVision Intelligent Technology Co. Raw images were then annotated by an experienced industry professional using the CloudCompare software. The annotated images subsequently underwent a uniform downsampling process, reducing the number of sample points to 8192. While fewer points improve inference speed, previous studies (??) indicate a potential trade-off with accuracy.
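One possible reading of the uniform downsampling step is to keep evenly spaced points in the stored order until exactly 8192 remain. The sketch below follows that reading; it is an assumption, since the text does not say whether the selection is index-based, random, or voxel-based, and `uniform_downsample` is a hypothetical name.

```python
def uniform_downsample(points, n_target=8192):
    """Keep n_target evenly spaced points, preserving their original order."""
    n = len(points)
    if n_target > n:
        raise ValueError("cannot downsample to more points than are available")
    # Integer arithmetic keeps the indices strictly increasing when n >= n_target.
    return [points[(i * n) // n_target] for i in range(n_target)]

raw = list(range(200_000))  # stand-in for a raw scan of about 200,000 points
sub = uniform_downsample(raw)
```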

| Method | Input Points | OA(%) | mAcc(%) |
|---|---|---|---|
| DGCNN (?) | 8192 | 78.8 | 79.2 |
| DeepGCN (?) | 8192 | 83.3 | 84.1 |
| PointNet (?) | 8192 | 73.0 | 73.9 |
| PointNet++ (?) | 8192 | 78.2 | 79.1 |
| PointNeXt-s (?) | 8192 | 85.0 | 86.1 |
| PointMLP (?) | 8192 | 84.2 | 85.1 |
| Point-PN (?) | 8192 | 84.3 | 84.2 |
| PointGPT (?) | 8192 | 90.6 | 91.1 |
| PointGPT+NN (Ours) | 8192 | **91.8** | **92.2** |

Table 2: Comparison of the effectiveness of the proposed method and other methods on the TextileNet8 dataset. The best results are highlighted in bold. OA is overall accuracy, and mAcc is mean per-class accuracy.

| Magnitude $\alpha$ | 100 | 500 | 1000 | 2000 | 3000 |
|---|---|---|---|---|---|
| Best OA(%) | 91.0 @ $\beta$=300 | 91.0 @ $\beta$=100 | **91.8** @ $\beta$=100 | 91.0 @ $\beta$=400 | 91.0 @ $\beta$=200 |

Table 3: Results of search experiments for hyper-parameters $\alpha$ and $\beta$ of the trigonometric function in Point-NN. The second row of the table shows the best OA and the corresponding value of the hyper-parameter $\beta$. The best result is highlighted in bold.

| Stage Num. | OA(%) | mAcc(%) |
|---|---|---|
| 1 | 90.6 | 90.5 |
| 2 | **91.8** | **92.2** |
| 3 | 89.5 | 89.5 |
| 4 | 90.6 | 91.3 |
| 5 | 89.9 | 90.1 |

Table 4: Ablation study on the stage number of the multi-stage hierarchy of Point-NN. The best results are highlighted in bold.

4.2 Statistics of TextileNet8

The proposed textile classification dataset, TextileNet8, comprises 1335 3D point cloud images of polyester textiles. Following the relevant standards (??) for textile pilling classification, textile pilling grades range from 1 to 4.5 at intervals of 0.5. Accordingly, we categorize these images into eight grades (1-8), with the severity of textile pilling decreasing as the grade increases. Grade 1 has the fewest point cloud images at 150, Grades 3 and 7 have the most at 180 each, and the remaining grades have 165 each. Table 1 presents the statistics of the TextileNet8 dataset and compares it with other related datasets. Notably, TextileNet8 stands out for being collected from an actual test environment, filling a gap in publicly available 3D point cloud datasets specifically related to industrial production. To the best of our knowledge, TextileNet8 is the first publicly accessible dataset utilizing 3D point cloud data for eight-class textile pilling assessment. This dataset promises to advance research in textile pilling assessment based on 3D point cloud data.
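The relation between the standard half-step grades and the eight dataset classes can be written down explicitly. The mapping below (class = 2 x grade - 1) is an assumption that matches the description of eight ordered grades but is not stated in the text; the per-class counts are taken from Table 1.

```python
STANDARD_GRADES = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]

def grade_to_class(grade):
    """Map a standard grade (1 to 4.5, step 0.5) to a class label 1..8 (assumed mapping)."""
    if grade not in STANDARD_GRADES:
        raise ValueError(f"not a standard grade: {grade}")
    return int(2 * grade - 1)

# Number of point cloud images per class, as listed in Table 1.
SAMPLES_PER_CLASS = {1: 150, 2: 165, 3: 180, 4: 165, 5: 165, 6: 165, 7: 180, 8: 165}

classes = [grade_to_class(g) for g in STANDARD_GRADES]
total = sum(SAMPLES_PER_CLASS.values())
```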

5 Experiments

5.1 Experimental Settings

We conduct experiments using the TextileNet8 dataset as a benchmark to evaluate the performance of various models. The training and validation sets consist of 1068 and 267 samples, respectively, a ratio of 8:2, consistent with the popular ModelNet40 point cloud classification dataset (?). For a comprehensive analysis, we compare our proposed model with two graph-based methods, five MLP-based methods, and the vanilla PointGPT. All models are implemented in PyTorch and trained on two NVIDIA RTX A6000 GPUs. To accommodate GPU memory limitations, we select PointGPT-S, the variant with the fewest parameters, as the base model; it is pre-trained on the ShapeNet dataset (?). The batch size is 32 for training, a cosine learning rate (CosLR) scheduler is employed, and the AdamW optimizer uses an initial learning rate of 1e-4. Each model is trained for 600 epochs to ensure convergence.
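The split sizes and the learning-rate schedule can be checked with a few lines of standard-library Python. The sketch assumes a random shuffle before splitting, a cosine schedule that decays to zero without warm-up, and per-epoch updates; none of these details are stated in the text, and the helper names are illustrative.

```python
import math
import random

def split_indices(n_samples, train_parts=8, total_parts=10, seed=0):
    """Shuffle sample indices and split them train:val = 8:2."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # assumption: the split is random
    n_train = n_samples * train_parts // total_parts
    return indices[:n_train], indices[n_train:]

def cosine_lr(epoch, total_epochs=600, base_lr=1e-4, min_lr=0.0):
    """Closed form of cosine annealing from base_lr down to min_lr."""
    cosine = math.cos(math.pi * epoch / total_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + cosine)

train_idx, val_idx = split_indices(1335)
schedule = [cosine_lr(e) for e in (0, 300, 600)]
```

Under these assumptions the learning rate starts at 1e-4, reaches 5e-5 at epoch 300, and decays to zero at epoch 600.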

5.2 Result Analysis

Table 2 presents the performance of our proposed model and other models on the TextileNet8 dataset. As indicated in Table 2, our model achieves the best results in both OA and mAcc. Specifically, it outperforms the second-best model by 1.2 percentage points in OA and 1.1 percentage points in mAcc. Furthermore, the results indicate that graph-based point cloud processing methods exhibit mediocre performance, with an OA of at most 83.3%, while the more recent MLP-based methods (PointNeXt-s, PointMLP, and Point-PN) achieve an OA exceeding 84%, which suggests that graph-based methods may not be well suited to textile pilling assessment tasks. Leveraging the powerful feature extraction capability of transformers, PointGPT and its improved version, PointGPT+NN, achieve over 90% OA and mAcc, indicating their significant advantages over other methods for textile pilling grading.
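For reference, the two metrics reported in the tables can be computed as in the sketch below, which follows the definitions given in the caption of Table 2: OA is the fraction of correctly classified samples, and mAcc is the unweighted mean of the per-class accuracies. The labels in the usage lines are made up for illustration.

```python
def oa_and_macc(y_true, y_pred):
    """Return (overall accuracy, mean per-class accuracy)."""
    per_class = {}  # label -> [number correct, number of samples]
    for t, p in zip(y_true, y_pred):
        stats = per_class.setdefault(t, [0, 0])
        stats[0] += int(t == p)
        stats[1] += 1
    oa = sum(c for c, _ in per_class.values()) / len(y_true)
    macc = sum(c / n for c, n in per_class.values()) / len(per_class)
    return oa, macc

# Toy example: three samples of grade 1 (all correct), one of grade 2 (wrong).
oa, macc = oa_and_macc([1, 1, 1, 2], [1, 1, 1, 3])
```

Because mAcc weights every grade equally, it can differ from OA whenever the classes are unequal in size, as is the case for TextileNet8.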

| Method | Data dimension | OA(%) | mAcc(%) |
|---|---|---|---|
| Fusion on classification logits (?) | 32x8 | 90.6 | 91.1 |
| Fusion on feature embedding | 32x512x384 | **91.8** | **92.2** |

Table 5: Comparison of the effectiveness of the proposed feature fusion strategy and the strategy proposed in the Point-NN paper. The first number in the data dimension column is the batch size. The best result is highlighted in bold.

| Method | Input Points | OA(%) | mAcc(%) |
|---|---|---|---|
| DGCNN (?) | 1024 | 92.9 | 90.2 |
| DeepGCN (?) | 1024 | 93.6 | 90.9 |
| M-GCN (?) | 1024 | 93.1 | 90.1 |
| PointNet (?) | 1024 | 89.2 | 86.0 |
| PointNet++ (?) | 1024 | 92.7 | 90.1 |
| PointNeXt-s (?) | 1024 | 94.0 | 91.1 |
| PointMLP (?) | 1024 | **94.5** | **91.4** |
| Point-PN (?) | 1024 | 93.8 | 91.2 |
| PointGPT (?) | 1024 | 94.0 | 91.1 |
| PointGPT+NN (Ours) | 1024 | 94.2 | 91.3 |

Table 6: Comparison of the effectiveness of the proposed method and other methods on the ModelNet40 dataset. The best results are highlighted in bold.

5.3 Ablation Analysis

Table 3 presents the results of the ablation experiments conducted during the search for the trigonometric function hyperparameters $\alpha$ and $\beta$. We test $\alpha$ values of 100, 500, 1000, 2000, and 3000, and $\beta$ values of 50, 100, 200, 300, and 400. Each $\alpha$ value is paired with each $\beta$ value, resulting in 25 combinations. The second row of Table 3 presents the highest OA achieved for each $\alpha$ value alongside its corresponding $\beta$ value. Notably, the model achieves its optimal OA when $\alpha$ is 1000 and $\beta$ is 100. The table also shows that the best OA for every $\alpha$ value is at least 91.0%, suggesting that the proposed PointGPT+NN model is robust to these two hyperparameters.

Table 4 shows the results of tuning the number of stages of the multi-stage hierarchy of Point-NN. We vary the number of stages from 1 to 5. Fewer stages yield more basic features, and the original Point-NN model achieves its best results with 4 stages on the ModelNet40 dataset. As shown in Table 4, OA and mAcc reach their optimal values when the number of stages is 2, indicating that basic features are crucial for the model to perform well on the TextileNet8 dataset.

Table 5 presents the results obtained with two different fusion strategies. The authors of Point-NN suggest fusing Point-NN's logits with those of other models through linear interpolation to boost the other models' performance, so we also attempt this approach. However, the results indicate no performance improvement over the vanilla PointGPT. After examining the numerical outputs, we conclude that the failure is likely due to the relatively small logit values produced by the Point-NN model and its significantly lower accuracy compared to the PointGPT model. Note that this fusion approach also requires the PMB of Point-NN. Conversely, our strategy of fusing at the feature embedding stage ensures that PointGPT learns the overall features of the raw input point cloud, thereby enhancing model performance.
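The effect described above can be reproduced with a toy calculation. The sketch interprets linear interpolation as a convex combination of the two logit vectors, which is an assumption, and all logit values and the weight are invented for illustration; they are not measurements from the experiments.

```python
def interpolate_logits(model_logits, nn_logits, weight):
    """Convex combination: (1 - weight) * model logits + weight * Point-NN logits."""
    return [(1.0 - weight) * a + weight * b for a, b in zip(model_logits, nn_logits)]

def argmax(values):
    return max(range(len(values)), key=values.__getitem__)

# Hypothetical eight-class logits: large for the trained model, small for Point-NN.
pointgpt_logits = [2.1, 5.4, 0.3, -1.2, 0.8, 1.5, -0.4, 0.0]
point_nn_logits = [0.02, 0.01, 0.05, 0.00, 0.01, 0.03, 0.01, 0.02]

fused = interpolate_logits(pointgpt_logits, point_nn_logits, weight=0.3)
same_prediction = argmax(fused) == argmax(pointgpt_logits)
```

When one set of logits is much smaller in magnitude than the other, the combined ranking is dominated by the larger set, so the predicted class rarely changes.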

5.4 Performance on Other Datasets

To validate the effectiveness of the proposed model, we conduct comparative experiments on the widely used ModelNet40 dataset, which comprises 12,311 CAD point cloud images from 40 synthetic object categories. Table 6 shows the results. Although our proposed model does not achieve the best results, it achieves the second-best results in both OA and mAcc, which suggests that it is competitive on a larger dataset and applicable to other point cloud classification tasks. The results also show that the proposed model outperforms the vanilla PointGPT, indicating that the feature fusion strategy generalizes beyond TextileNet8.

6 Conclusion

In this paper, we present TextileNet8, the first publicly available eight-category 3D point cloud dataset for textile pilling assessment. In addition, we incorporate the point cloud features extracted by the adapted Point-NN model into the feature embedding of PointGPT, thus enhancing the performance of the vanilla PointGPT model on the TextileNet8 dataset. Comparative experimental results on TextileNet8 show that the proposed PointGPT+NN model achieves an OA of 91.8% and an mAcc of 92.2%. Experimental results on the ModelNet40 dataset further show that the proposed PointGPT+NN model achieves better results than the vanilla model. In future work, we plan to collect more 3D point cloud data for textile pilling evaluation and explore optimization methods to improve the performance of PointGPT+NN.

References

  • Bar-Yecheskel and Weinberg 1988 Bar-Yecheskel, H., and Weinberg, A. 1988. An improved testing instrument for the evaluation of the shedding or defuzzing of fibres in finished garments.
  • Bouvier, Gordon, and McDonald 2011 Bouvier, D. J.; Gordon, C.; and McDonald, M. 2011. An approach for occlusion detection in construction site point cloud data. In Computing in Civil Engineering (2011). 234–241.
  • Chang et al. 2015 Chang, A. X.; Funkhouser, T.; Guibas, L.; Hanrahan, P.; Huang, Q.; Li, Z.; Savarese, S.; Savva, M.; Song, S.; Su, H.; et al. 2015. Shapenet: An information-rich 3d model repository. arXiv preprint arXiv:1512.03012.
  • Chen et al. 2009 Chen, X.; Xu, Z.; Chen, T.; et al. 2009. Detecting pills in fabric images based on multi-scale matched filtering. Textile Research Journal 79(15):1389–1395.
  • Chen et al. 2020 Chen, S.; Liu, B.; Feng, C.; et al. 2020. 3d point cloud processing and learning for autonomous driving: Impacting map creation, localization, and perception. IEEE Signal Processing Magazine 38(1):68–86.
  • Chen et al. 2024 Chen, G.; Wang, M.; Yang, Y.; Yu, K.; Yuan, L.; and Yue, Y. 2024. Pointgpt: Auto-regressively generative pre-training from point clouds. Advances in Neural Information Processing Systems 36.
  • Cheng et al. 2020 Cheng, Q.; Sun, P.; Yang, C.; et al. 2020. A morphing-based 3d point cloud reconstruction framework for medical image processing. Computer Methods and Programs in Biomedicine 193:105495.
  • Fan et al. 2023 Fan, M.; Liu, L.; Deng, N.; et al. 2023. Digital 3d system for classifying fabric pilling based on improved active contours and neural network. The Visual Computer 39(10):5085–5095.
  • Fan, Su, and Guibas 2017 Fan, H.; Su, H.; and Guibas, L. J. 2017. A point set generation network for 3d object reconstruction from a single image. In Proceedings of the IEEE conference on computer vision and pattern recognition, 605–613.
  • Furferi et al. 2014 Furferi, R.; Carfagni, M.; Governi, L.; Volpe, Y.; and Bogani, P. 2014. Towards automated and objective assessment of fabric pilling. International Journal of Advanced Robotic Systems 11(10):171.
  • Göktepe 2002 Göktepe, Ö. 2002. Fabric pilling performance and sensitivity of several pilling testers. Textile Research Journal 72(7):625–630.
  • Hu et al. 2023 Hu, J.; Wang, X.; Liao, Z.; and Xiao, T. 2023. M-gcn: Multi-scale graph convolutional network for 3d point cloud classification. In 2023 IEEE International Conference on Multimedia and Expo (ICME), 924–929. IEEE.
  • Krawczyk and Sitnik 2023 Krawczyk, D., and Sitnik, R. 2023. Segmentation of 3d point cloud data representing full human body geometry: A review. Pattern Recognition 139:109444.
  • Li et al. 2019 Li, G.; Muller, M.; Thabet, A.; and Ghanem, B. 2019. Deepgcns: Can gcns go as deep as cnns? In Proceedings of the IEEE/CVF international conference on computer vision, 9267–9276.
  • Li et al. 2020 Li, Y.; Ma, L.; Zhong, Z.; et al. 2020. Deep learning for lidar point clouds in autonomous driving: A review. IEEE Transactions on Neural Networks and Learning Systems 32(8):3412–3432.
  • Liu et al. 2021 Liu, L.; Deng, N.; Xin, B.; et al. 2021. Objective evaluation of fabric pilling based on multi-view stereo vision. The Journal of the Textile Institute 112(12):1986–1997.
  • Luo, Xin, and Yuan 2022 Luo, J.; Xin, B.; and Yuan, X. 2022. Photometric stereo-based 3d reconstruction method for the objective evaluation of fabric pilling. Wuhan University Journal of Natural Sciences 27(6):550–556.
  • Ma et al. 2022 Ma, X.; Qin, C.; You, H.; Ran, H.; and Fu, Y. 2022. Rethinking network design and local geometry in point cloud: A simple residual mlp framework. arXiv preprint arXiv:2202.07123.
  • Mohammadi, Wang, and Del Bue 2021 Mohammadi, S. S.; Wang, Y.; and Del Bue, A. 2021. Pointview-gcn: 3d shape classification with multi-view point clouds. In 2021 IEEE International Conference on Image Processing (ICIP), 3103–3107. IEEE.
  • Moon et al. 2019 Moon, D.; Chung, S.; Kwon, S.; et al. 2019. Comparison and utilization of point cloud generated from photogrammetry and laser scanning: 3d world model for smart heavy equipment planning. Automation in Construction 98:322–331.
  • Ouyang, Wang, and Xu 2013 Ouyang, W.; Wang, R.; and Xu, B. 2013. Fabric pilling measurement using three-dimensional image. Journal of Electronic Imaging 22(4):043031.
  • Palmer, Zhang, and Wang 2009 Palmer, S.; Zhang, J.; and Wang, X. 2009. New methods for objective evaluation of fabric pilling by frequency domain image processing. Research Journal of Textile and Apparel 13(1):11–23.
  • Placitelli and Gallo 2011 Placitelli, A. P., and Gallo, L. 2011. 3d point cloud sensors for low-cost medical in-situ visualization. In 2011 IEEE International Conference on Bioinformatics and Biomedicine Workshops (BIBMW), 596–597. IEEE.
  • Qi et al. 2017a Qi, C. R.; Su, H.; Mo, K.; et al. 2017a. Pointnet: Deep learning on point sets for 3d classification and segmentation. In Proc. CVPR, 652–660.
  • Qi et al. 2017b Qi, C. R.; Yi, L.; Su, H.; et al. 2017b. Pointnet++: Deep hierarchical feature learning on point sets in a metric space. Advances in Neural Information Processing Systems 30.
  • Qian et al. 2022 Qian, G.; Li, Y.; Peng, H.; et al. 2022. Pointnext: Revisiting pointnet++ with improved training and scaling strategies. Advances in Neural Information Processing Systems 35:23192–23204.
  • Radford et al. 2018 Radford, A.; Narasimhan, K.; Salimans, T.; Sutskever, I.; et al. 2018. Improving language understanding by generative pre-training.
  • Srivastava and Sharma 2021 Srivastava, S., and Sharma, G. 2021. Exploiting local geometry for feature and graph construction for better 3d point cloud processing with graph neural networks. In 2021 IEEE International Conference on Robotics and Automation (ICRA), 12903–12909. IEEE.
  • Wang and Kim 2019 Wang, Q., and Kim, M.-K. 2019. Applications of 3d point cloud data in the construction industry: A fifteen-year review from 2004 to 2018. Advanced Engineering Informatics 39:306–319.
  • Wang et al. 2019 Wang, Y.; Sun, Y.; Liu, Z.; Sarma, S. E.; Bronstein, M. M.; and Solomon, J. M. 2019. Dynamic graph cnn for learning on point clouds. ACM Transactions on Graphics (TOG) 38(5):1–12.
  • Wei et al. 2020 Wei, L.; Wan, S.; Sun, Z.; Ding, X.; and Zhang, W. 2020. Weighted attribute prediction based on morton code for point cloud compression. In 2020 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), 1–6. IEEE.
  • Wu et al. 2015 Wu, Z.; Song, S.; Khosla, A.; Yu, F.; Zhang, L.; Tang, X.; and Xiao, J. 2015. 3d shapenets: A deep representation for volumetric shapes. In Proceedings of the IEEE conference on computer vision and pattern recognition, 1912–1920.
  • Xiao et al. 2021 Xiao, Q.; Wang, R.; Sun, H.; and Wang, L. 2021. Objective evaluation of fabric pilling based on image analysis and deep learning algorithm. International Journal of Clothing Science and Technology 33(4):495–512.
  • Xu et al. 2020 Xu, Q.; Sun, X.; Wu, C.-Y.; Wang, P.; and Neumann, U. 2020. Grid-gcn for fast and scalable point cloud learning. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 5661–5670.
  • Xu et al. 2023 Xu, Z.; Liang, Y.; Lu, H.; Kong, W.; and Wu, G. 2023. An approach for monitoring prefabricated building construction based on feature extraction and point cloud segmentation. Engineering, Construction and Architectural Management 30(10):5302–5332.
  • Yap et al. 2010 Yap, P.; Wang, X.; Wang, L.; et al. 2010. Prediction of wool knitwear pilling propensity using support vector machines. Textile Research Journal 80(1):77–83.
  • Zhang and Sabuncu 2018 Zhang, Z., and Sabuncu, M. 2018. Generalized cross entropy loss for training deep neural networks with noisy labels. Advances in Neural Information Processing Systems 31.
  • Zhang et al. 2023 Zhang, R.; Wang, L.; Wang, Y.; et al. 2023. Parameter is not all you need: Starting from non-parametric networks for 3d point cloud analysis. arXiv preprint arXiv:2303.08134.
  • Zhang, Wang, and Palmer 2007 Zhang, J.; Wang, X.; and Palmer, S. 2007. Objective grading of fabric pilling with wavelet texture analysis. Textile Research Journal 77(11):871–879.
  • Zhang, Wang, and Palmer 2010 Zhang, J.; Wang, X.; and Palmer, S. 2010. Performance of an objective fabric pilling evaluation method. Textile Research Journal 80(16):1648–1657.