CN116402954A - Spine three-dimensional structure reconstruction method based on deep learning - Google Patents
Spine three-dimensional structure reconstruction method based on deep learning
- Publication number
- CN116402954A (application number CN202310503459.2A)
- Authority
- CN
- China
- Prior art keywords
- dimensional
- spine
- deep learning
- image
- images
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06T17/00 — Three dimensional [3D] modelling, e.g. data description of 3D objects
- G06N3/0464 — Convolutional networks [CNN, ConvNet]
- G06N3/048 — Activation functions
- G06N3/08 — Learning methods
- G06T5/70 — Denoising; Smoothing
- G06T7/0012 — Biomedical image inspection
- G06T2207/10081 — Computed x-ray tomography [CT]
- G06T2207/20081 — Training; Learning
- G06T2207/20084 — Artificial neural networks [ANN]
- G06T2207/30012 — Spine; Backbone
Abstract
The invention relates to the technical field of image processing, and in particular to a spine three-dimensional structure reconstruction method based on deep learning. A tomography algorithm is used to construct a spine biplane radiograph (BR) image dataset for training a deep learning model; the proposed image enhancement algorithm enhances bone tissue features in the BR images while filtering out part of the noise and soft tissue information; the proposed feature extraction architecture extracts similar information from the BR images; a feature conversion module converts the two-dimensional image features into three-dimensional features matched to the spine structure; and the proposed three-dimensional reconstruction architecture gradually restores the network output to the corresponding size to obtain the spine structure. In addition, two new methods for evaluating the three-dimensional model, the distribution error (DisE) and the sampling accuracy (SAc), are proposed, which avoid manual labeling errors. The invention solves the problems that CT-based three-dimensional reconstruction is complex, demanding in conditions, costly, high in radiation dose and dependent on physician experience.
Description
Technical Field
The invention relates to the technical field of image processing, in particular to a spine three-dimensional structure reconstruction method based on deep learning.
Background
Scoliosis is one of the major diseases affecting adolescent growth and development. Its diagnosis and treatment depend largely on clinical three-dimensional reconstruction of the spine. Currently, the main clinical technique for three-dimensional imaging of the spine is computed tomography (CT). However, CT not only exposes the patient to a high radiation dose but also requires the patient to hold a fixed posture during the examination. Magnetic resonance imaging (MRI) generally requires the patient to lie flat and imposes strict physical requirements, such as the absence of metallic implants in the body. In addition, scoliosis patients often have difficulty meeting the postural requirements of CT and MRI.
Biplane X-ray imaging (biplane radiographs, BR) is characterized by low radiation dose, low cost, no postural restriction and fast imaging. BR is therefore the most cost-effective imaging modality for diagnosing scoliosis. Reconstructing three-dimensional shapes from two-dimensional BR using convolutional neural networks (CNN) has become an active research topic in recent years, yet reconstructing three-dimensional bone structures from two-dimensional X-ray images remains a significant challenge. Existing studies on three-dimensional bone reconstruction fall mainly into two categories: methods based on the statistical shape model (SSM) and the statistical appearance model (SAM), and methods based on deep learning algorithms.
The two types of statistical model, SSM and SAM, describe the average shape and the average density distribution of bone, respectively, together with the principal modes of shape and density variation. Cootes et al. first proposed the SSM in 1995, using real bones as constraints and applying controlled deformations to reconstruct new three-dimensional bone structures. Linear SSMs have previously been applied to three-dimensional reconstruction of the femur, pelvis and hip joint in medical images, but many unreasonable shapes still arise during deformation. In recent years, researchers have also proposed statistical models for assessing structural changes of the hip joint in patients with hip osteoarthritis. Reyneke et al. introduced a Gaussian process deformable model into principal-component-analysis-based SSM, at the cost that the covariance matrix is no longer restricted; this approach reduces abnormal three-dimensional bone deformations. Ascadi et al. used a Gaussian process deformable model to reconstruct the three-dimensional structure of a femur with an incomplete surface, and additionally used skin and bone feature points, respectively, to predict bone structure. Salhi et al. established an SSM of the scapula based on a Gaussian process deformable model for reconstructing the three-dimensional structure of scapulae with defective surfaces; in the latest studies, the mean root mean square error of the reconstructed scapula is below 2 mm. Because SAM can recover the density distribution within bone, it is also used for mechanical characterization and force analysis in bone mechanics.
In studies of high-quality reconstruction from low-radiation-dose images, deep learning methods tend to be more robust than statistical-model methods. Two kinds of deep learning method are applied to three-dimensional reconstruction. One reconstructs three-dimensional images from two-dimensional images using CNN architectures such as the U-shaped convolutional network (U-Net) for biomedical image segmentation and residual neural networks. The other generates 3D images using a generative adversarial network.
Kasten et al. devised a CNN with skip-connected encoding and decoding, replicating the two-dimensional pictures into three dimensions as input to obtain three-dimensional output. Chen et al. used a recurrent neural network to remove artifacts from CT images and reconstructed 3D images in conjunction with a residual encoder. Ding et al. pre-processed with an approximate forward-backward splitting algorithm and trained a deep learning model to reconstruct CT images. Xie et al. studied the artifacts of CT images and subtracted them from sparsely encoded images to improve reconstruction quality. Ma et al. denoised CT images with the structural similarity index as the loss function to reconstruct CT images. Feng et al. used Poisson-Gaussian mixed noise to simulate CT image noise for denoising; the method reconstructs the CT image through a U-Net framework and improves CT reconstruction quality. To improve reconstructed CT image quality, Kang et al. proposed a CNN algorithm that combines wavelet transform coefficients, obtaining the directional components of the artifact via the wavelet transform in order to remove it. Zheng et al. reconstructed CT images with a CNN by processing each CT slice through the inverse Radon transform. Ge et al. analyzed the back projection of CT images, converted the forward projection into the CT image domain, extracted features in the sinogram and CT image domains, and reconstructed CT images. Shen et al. designed a CNN-based conversion module to convert the extracted features from two-dimensional to three-dimensional codes for CT image reconstruction. Zhang et al. embedded a half-quadratic splitting algorithm in a CNN to improve CT reconstruction quality. Shiode et al. generated digitally reconstructed radiographs from CT images for CNN training to achieve reconstruction of the wrist joint.
However, reconstructing the three-dimensional structure of the spine directly from BR still presents the following difficulties. The angular difference between each pair of BRs reaches 90°, which leads to large differences in image characteristics. X-ray images contain noise and redundant soft tissue information introduced during imaging. Bone tissue overlaps on the BR, which reduces the accuracy of the three-dimensional reconstruction. Evaluation methods for three-dimensional spinal reconstruction typically rely on manually marked points, which leads to deviations between operators. Deep learning methods require abundant samples and sufficiently accurate ground truth, and clinical data are often difficult to obtain.
For the application of deep learning to three-dimensional reconstruction based on medical images, the following problems remain to be solved at the present stage:
(1) Deep learning has been widely applied to medical images, but relatively little research combines it with three-dimensional reconstruction.
(2) For medical images, the imaging parameters are difficult to obtain, making it difficult to reconstruct human tissue with a supervised deep learning algorithm.
(3) Deep learning methods require abundant samples and sufficiently accurate ground truth, and clinical data are often difficult to obtain.
(4) The angular difference between each pair of BRs reaches 90°, which leads to large differences in image characteristics.
(5) X-ray images contain noise and redundant soft tissue information introduced during imaging.
(6) Bone tissue overlaps on the BR, which reduces the accuracy of the three-dimensional reconstruction.
(7) Evaluation methods for three-dimensional spinal reconstruction typically rely on manually marked points, which leads to deviations between operators.
Disclosure of Invention
The invention provides a spine three-dimensional structure reconstruction method based on deep learning for reconstructing the three-dimensional structure of the spine from BR images, solving the problems that CT-based three-dimensional reconstruction is complex, demanding in conditions, costly, high in radiation dose and dependent on physician experience.
In order to achieve the technical purpose, the invention is realized by the following technical scheme:
a spine three-dimensional structure reconstruction method based on deep learning comprises the following steps:
s1: acquiring materials, collecting head CT data, and generating a training model after processing;
s2: optimizing materials, and analyzing and processing pixel distribution characteristics according to the image data obtained in the step S1;
s3: establishing a model, wherein the main structure is divided into feature extraction, feature conversion and three-dimensional reconstruction;
s4: the optimization model is used for optimizing the three-dimensional reconstruction configuration of the spine based on the related parameters;
s5: model evaluation: and (5) selecting an evaluation index and then performing automatic evaluation.
Further, step S1 specifically comprises: collecting head CT data, generating spinal BR data at different angles using a tomography algorithm, and generating the corresponding BR with the TIGRE algorithm for training the model; the datasets are divided into two categories, normal spine and scoliosis, and each category is divided into training, validation and test sets in an 8:1:1 ratio.
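The 8:1:1 division described above works out as follows; this is a minimal sketch, and the helper name `split_counts` is illustrative rather than part of the patent.

```python
def split_counts(n_total, ratio=(8, 1, 1)):
    """Split a dataset of n_total samples into train/val/test by the given ratio."""
    unit = n_total // sum(ratio)          # size of one ratio unit
    return tuple(unit * r for r in ratio)

# 1000 normal-spine samples -> 800 train, 100 validation, 100 test
print(split_counts(1000))   # (800, 100, 100)
# 12000 scoliosis samples -> 9600 train, 1200 validation, 1200 test
print(split_counts(12000))  # (9600, 1200, 1200)
```

These counts match the ablation-experiment figures given later in the description.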
Further, step S2 specifically comprises: adaptively adjusting each X-ray image according to its pixel distribution characteristics through an image enhancement algorithm, so that bone tissue features are highlighted and noise and soft tissue information are weakened.
Further, the image enhancement algorithm is specifically
Further, step S3 specifically comprises: realizing end-to-end reconstruction of the three-dimensional spine structure by means of the feature extraction, feature conversion and three-dimensional reconstruction architectures; the model is named SP-Net; features of the BR images are extracted through a dual-channel convolutional network with shared weights, information at different scales is extracted through residual connections, and this information is expanded into the three-dimensionally reconstructed skeletal structure.
Further, the ablation experiment in step S4 specifically comprises: each normal-dataset experiment was trained on 800 images, validated on 100 images and tested on 100 images; each scoliosis-dataset experiment was trained on 9600 images, validated on 1200 images and tested on 1200 images; the images used for training, validation and testing were mutually independent. The structure and performance of SP-Net are optimized through the ablation experiments.
Further, the evaluation indexes in step S5 specifically comprise the distribution error (DisE) and the sampling accuracy (SAc).
Further, the distribution error (DisE) algorithm specifically includes:
Further, the sampling accuracy (SAc) algorithm specifically includes:
the beneficial effects of the invention are as follows:
the invention provides a spine three-dimensional structure reconstruction method based on deep learning, which is used for reconstructing a spine three-dimensional structure from BR images and solves the problems of complex CT three-dimensional reconstruction process, strict conditions, high cost, strong radiation and dependence on doctor experience.
The invention uses a tomography algorithm to construct a spine BR image dataset for training a deep learning model; the proposed image enhancement algorithm enhances bone tissue features in the BR images and filters out part of the noise and soft tissue information; the proposed feature extraction architecture extracts similar information from the BR images; a feature conversion module converts the two-dimensional image features into three-dimensional features matched to the spine structure; and the proposed three-dimensional reconstruction architecture gradually restores the network output to the corresponding size to obtain the spine structure. In addition, two new methods for evaluating the three-dimensional model, the distribution error (DisE) and the sampling accuracy (SAc), are proposed, which avoid manual labeling errors.
With the invention, BR image datasets of all parts of the human body at different angles can be obtained, and three-dimensional spine reconstruction is realized; the results can be used for scoliosis correction and spine stress analysis.
The proposed algorithm generalizes well and can be applied to three-dimensional reconstruction of other parts of the human body.
The proposed algorithm can also be used to develop a system for automatically measuring spinal features and physiological parameters.
The proposed reconstruction method improves the efficiency of clinical diagnosis and treatment of scoliosis and reduces the rates of misdiagnosis and missed diagnosis.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a diagram showing a data set according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an SP-Net network according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a feature extraction architecture according to an embodiment of the invention;
FIG. 4 is a schematic diagram of a feature transformation architecture according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a three-dimensional reconstruction architecture according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of the ablation experiment results using HD, ASD and DisE as evaluation indexes in the invention;
FIG. 7 is a schematic diagram of the ablation experiment results using SO, VD and SAc as evaluation indexes;
FIG. 8 is a visual representation of network output according to the present invention;
fig. 9 is a representation of a three-dimensional reconstruction fit of the spine of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1
The method for reconstructing the three-dimensional structure of the spine based on deep learning according to this embodiment comprises the following steps:
S1: material acquisition: collect head CT data and, after processing, generate data for training the model;
S2: material optimization: analyze and process the pixel distribution characteristics of the image data obtained in step S1;
S3: model establishment: the main structure is divided into feature extraction, feature conversion and three-dimensional reconstruction;
S4: model optimization: optimize the configuration for three-dimensional spine reconstruction based on the relevant parameters;
S5: model evaluation: select evaluation indexes and then perform automatic evaluation.
Further, step S1 specifically comprises: collecting head CT data, generating spinal BR data at different angles using a tomography algorithm, and generating the corresponding BR with the TIGRE algorithm for training the model; the datasets are divided into two categories, normal spine and scoliosis, and each category is divided into training, validation and test sets in an 8:1:1 ratio.
Further, step S2 specifically comprises: adaptively adjusting each X-ray image according to its pixel distribution characteristics through an image enhancement algorithm, so that bone tissue features are highlighted and noise and soft tissue information are weakened.
Further, the image enhancement algorithm is specifically
Further, step S3 specifically comprises: realizing end-to-end reconstruction of the three-dimensional spine structure by means of the feature extraction, feature conversion and three-dimensional reconstruction architectures; the model is named SP-Net; features of the BR images are extracted through a dual-channel convolutional network with shared weights, information at different scales is extracted through residual connections, and this information is expanded into the three-dimensionally reconstructed skeletal structure.
Further, the ablation experiment in step S4 specifically comprises: each normal-dataset experiment was trained on 800 images, validated on 100 images and tested on 100 images; each scoliosis-dataset experiment was trained on 9600 images, validated on 1200 images and tested on 1200 images; the images used for training, validation and testing were mutually independent. The structure and performance of SP-Net are optimized through the ablation experiments.
Further, the evaluation indexes in step S5 specifically comprise the distribution error (DisE) and the sampling accuracy (SAc).
Further, the distribution error (DisE) algorithm specifically includes:
Further, the sampling accuracy (SAc) algorithm specifically includes:
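The formulas for DisE and SAc are not reproduced in this text. As one hypothetical reading, purely for illustration, DisE could compare the voxel-occupancy distribution of the prediction with that of the ground truth, and SAc could measure the fraction of occupied predicted voxels that agree with the ground truth; the function names `dis_e` and `s_ac` and their exact definitions below are assumptions, not the patent's algorithms.

```python
import numpy as np

def dis_e(pred, truth):
    """Hypothetical distribution error: absolute difference between the
    per-slice occupancy distributions of predicted and ground-truth volumes."""
    p = pred.sum(axis=(1, 2)) / max(pred.sum(), 1)    # occupancy profile along axis 0
    t = truth.sum(axis=(1, 2)) / max(truth.sum(), 1)
    return float(np.abs(p - t).sum())

def s_ac(pred, truth):
    """Hypothetical sampling accuracy: fraction of occupied predicted voxels
    that are also occupied in the ground truth."""
    occupied = pred > 0
    if not occupied.any():
        return 0.0
    return float((truth[occupied] > 0).mean())

v = np.zeros((8, 8, 8))
v[2:6, 2:6, 2:6] = 1
print(dis_e(v, v))  # 0.0 for identical volumes
print(s_ac(v, v))   # 1.0 for identical volumes
```

Any metric of this distribution-based kind is computed directly on the volumes, which is consistent with the patent's aim of avoiding manually marked points.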
example 2
Acquisition of study materials: spinal BR data at different angles are generated from the patients' head CT images using a tomography algorithm.
At present, publicly available datasets for three-dimensional bone reconstruction are very rare. The CT data of the invention come from the First People's Hospital of Yunnan Province and comprise one normal case and 41 scoliosis cases, whose use in the experiments was consented to by the doctors and patients. The corresponding BR are generated by the TIGRE algorithm for training the model. The dataset contains normal spines and scoliosis and is available for public download. The labels for three-dimensional reconstruction are obtained from the three-dimensional CT. To improve reconstruction quality, the invention designs an algorithm to improve the BR. The CT images, posteroanterior (PA) images, lateral (LAT) images, enhanced PA (En_PA) and enhanced LAT (En_LAT) images are shown in FIG. 1. These data are further divided into two categories, normal spine and scoliosis. Each normal spine dataset contains 1000 pairs of BR and 1000 three-dimensional voxel volumes, divided into training, validation and test sets in an 8:1:1 ratio. Each scoliosis dataset contains 12000 pairs of BR and 12000 three-dimensional voxel volumes, divided in the same 8:1:1 ratio.
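The BR generation described above relies on the TIGRE toolbox's cone-beam forward projector. As a rough, hedged illustration of the idea only (not TIGRE's API), a parallel-ray approximation of a PA/LAT pair, 90° apart, can be obtained by integrating a CT volume along two orthogonal axes:

```python
import numpy as np

def biplanar_projections(volume):
    """Parallel-ray approximation of a biplane radiograph (BR) pair:
    sum attenuation along two orthogonal axes of a CT volume (z, y, x).
    The actual pipeline uses the TIGRE cone-beam projector instead."""
    pa = volume.sum(axis=1)   # posteroanterior view: integrate front-to-back
    lat = volume.sum(axis=2)  # lateral view: integrate side-to-side
    return pa, lat

ct = np.random.rand(64, 64, 64)   # stand-in CT volume
pa, lat = biplanar_projections(ct)
print(pa.shape, lat.shape)        # (64, 64) (64, 64)
```

The two projection directions differ by 90°, which is exactly the angular difference between BR pairs noted in the Background.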
Example 3
Optimization of study materials: according to the pixel-value distribution characteristics of each X-ray image, the image is adaptively adjusted by the image enhancement algorithm to highlight bone tissue features and weaken noise and soft tissue information. The algorithm is given in Example 1. The comparison results are shown in Table 1.
TABLE 1 Comparison results of the image enhancement algorithm
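The patent's enhancement formula is not reproduced in the extracted text. The sketch below shows one common adaptive scheme that matches the description (per-image percentile windowing plus a gamma curve to suppress low-intensity soft tissue and emphasize bright bone); the function `enhance` and its parameters are illustrative assumptions, not the patented algorithm.

```python
import numpy as np

def enhance(img, low_pct=40.0, high_pct=99.5, gamma=2.0):
    """Illustrative adaptive enhancement (not the patent's formula):
    window each image by its own percentiles, then apply a gamma curve
    that suppresses faint soft tissue and keeps bright bone near 1."""
    lo, hi = np.percentile(img, [low_pct, high_pct])
    out = np.clip((img - lo) / (hi - lo + 1e-8), 0.0, 1.0)
    return out ** gamma          # gamma > 1 darkens mid/low intensities

x = np.linspace(0, 1, 101).reshape(1, -1)   # synthetic intensity ramp
y = enhance(x)
print(round(float(y[0, 50]), 3), float(y[0, 100]))
```

Because the window is computed per image, the adjustment adapts to each radiograph's own pixel distribution, as step S2 requires.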
Example 4
Model establishment: the original feature extraction, feature conversion and three-dimensional reconstruction architectures are integrated to realize end-to-end reconstruction of the three-dimensional spine structure.
The model developed by the invention is named SP-Net. As shown in FIG. 2, SP-Net extracts features of the BR images through a dual-channel convolutional network with shared weights. Residual connections within the two channels extract information at different scales, and this information is expanded into the three-dimensionally reconstructed bone structure. The main structure of SP-Net is thus divided into three parts: feature extraction (FIG. 3), feature conversion (FIG. 4) and three-dimensional reconstruction (FIG. 5); it should be noted that these three parts are connected end to end.
Feature extraction (as shown in figure 3)
Its structure is shown in FIG. 3. The outputs of the two weight-sharing CNN channels reflect semantic similarity. Each convolutional layer is followed by a batch normalization layer and a ReLU layer. The 1st and 2nd convolutional layers are residually connected; the 3rd and 4th convolutional layers are residually connected; the 5th and 6th convolutional layers are residually connected, and the features are passed backwards. Each channel contains six convolutional layers, with channel counts of 128, 256, 512, 256 and 128. The kernel sizes of the odd and even convolutional layers are 4×4 (stride 2) and 3×3 (stride 1), respectively.
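Given the stated kernel sizes and strides, the spatial sizes through the six layers can be traced with the standard convolution output formula. The input size of 128×128 and padding of 1 per layer are assumptions (the patent does not state them); only the kernel/stride pattern comes from the text.

```python
def conv_out(size, kernel, stride, pad=1):
    """Standard convolution output size: floor((size - kernel + 2*pad)/stride) + 1."""
    return (size - kernel + 2 * pad) // stride + 1

size, sizes = 128, []
for i in range(1, 7):              # six convolutional layers per channel
    if i % 2 == 1:                 # odd layers: 4x4 kernel, stride 2 (downsample)
        size = conv_out(size, 4, 2)
    else:                          # even layers: 3x3 kernel, stride 1 (refine)
        size = conv_out(size, 3, 1)
    sizes.append(size)
print(sizes)  # [64, 64, 32, 32, 16, 16]
```

Under these assumptions, each odd layer halves the resolution while each even layer preserves it, consistent with the residual pairing of layers (1,2), (3,4) and (5,6).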
Feature conversion (as shown in FIG. 4)
As shown in FIG. 4, the feature conversion architecture is connected to the feature extraction architecture and increases the dimensionality of the extracted features. The features extracted from the BR are encoded by neurons of two magnitudes (448 and 56) to enhance SP-Net's ability to distinguish different inputs. In the first branch (Trans1), the features of the BR are encoded by a convolutional layer containing 448 neurons, followed by a batch normalization layer and a ReLU layer. In the second branch (Trans2), the features of the BR are encoded by a convolutional layer containing 56 neurons, followed by a batch normalization layer and a ReLU layer. The data size is then expanded by deconvolution, again followed by batch normalization and ReLU layers. The activated data is expanded into three dimensions by the translation layer and, after merging by the merge layer, enters the three-dimensional reconstruction architecture that follows.
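A minimal sketch of the two-branch expand-and-merge idea, using plain NumPy in place of the convolution, deconvolution and batch-normalization layers. The vector lengths (448 and 56) come from the text above; the volume shapes and the channel-stacking used for the merge step are illustrative assumptions.

```python
import numpy as np

def feature_conversion(feat1, feat2):
    """Hedged sketch: expand two encoded feature vectors (448 and 56
    elements, as in Trans1/Trans2) into 3-D volumes and merge them.
    The 7x8x8 volume shape and channel-stacking merge are assumptions."""
    assert feat1.size == 448 and feat2.size == 56
    vol1 = feat1.reshape(7, 8, 8)                        # Trans1 -> 3-D volume
    vol2 = np.repeat(feat2.reshape(7, 8, 1), 8, axis=2)  # Trans2, broadcast to 7x8x8
    return np.stack([vol1, vol2], axis=0)                # merge: 2-channel 3-D feature
```

The point of the sketch is the dimensionality change: two differently sized 1-D encodings both end up as 3-D volumes of a common shape, so the merge layer can combine them before the three-dimensional reconstruction architecture takes over.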
Three-dimensional reconstruction (as shown in FIG. 5)
Through feature extraction and conversion, the two-dimensional data becomes three-dimensional data. The three-dimensional reconstruction architecture is used to achieve higher resolution; we therefore designed the network architecture shown in FIG. 5 to increase the resolution by upsampling. The three-dimensional reconstruction architecture consists of 8 similar convolution segments, each consisting of one convolutional layer, one batch normalization layer and one activation layer. The numbers of neurons in these 8 convolutional layers are 1024, 512, 256, 128 and 64. The first seven activation layers use the ReLU function, and the last activation layer uses Sigmoid as the activation function.
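The two roles described above — upsampling to raise resolution, and a final Sigmoid that maps voxel scores into [0, 1] — can be sketched as below. Nearest-neighbour upsampling stands in for the learned convolution segments and is an illustrative assumption, as are the shapes.

```python
import numpy as np

def upsample3d(vol):
    """Nearest-neighbour upsampling that doubles each spatial dimension of a
    3-D volume (an assumed stand-in for SP-Net's learned upsampling segments)."""
    return vol.repeat(2, axis=0).repeat(2, axis=1).repeat(2, axis=2)

def sigmoid(x):
    """Final activation: squashes voxel scores into [0, 1], so the output
    volume can be read as per-voxel bone occupancy."""
    return 1.0 / (1.0 + np.exp(-x))
```

Repeated upsampling segments of this kind take the low-resolution volume produced by feature conversion up to the target reconstruction resolution, after which the Sigmoid output layer yields the final bone-structure volume.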
Example 5
Optimization of the model: in designing the feature extraction and feature conversion architectures, various parameters and structures were tried in order to explore the configuration most favourable for three-dimensional reconstruction of the spine. Traditional computer vision methods were also introduced into the deep learning architecture for ablation experiments.
The end-to-end architecture and experiments were implemented with Keras 2.6.0, all run on a host equipped with an Intel i7-12700KF CPU, 64 GB of memory and an Nvidia GeForce 3090 GPU. The experiment for each normal dataset was trained on 800 pairs of images, validated on 100 pairs of images and tested on 100 pairs of test images. The experiment for each scoliosis dataset was trained on 9600 images, validated on 1200 images and tested on 1200 test images. The images used for training, validation and testing are independent of each other.
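The 8:1:1 train/validation/test split underlying the counts above (and stated explicitly in claim 2) can be sketched as follows; the shuffling, seed and function name are illustrative, not part of the patent.

```python
import random

def split_811(items, seed=0):
    """Hedged sketch: shuffle and split a dataset 8:1:1 into training,
    validation and test subsets, as described for the SP-Net experiments.
    Shuffling with a fixed seed is an illustrative assumption."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_train, n_val = (8 * n) // 10, n // 10
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])
```

Applied to a 1000-image normal dataset, this yields the 800/100/100 split given above, with the three subsets disjoint by construction.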
In order to continuously optimize the structure and performance of SP-Net, several groups of ablation experiments were designed. The results are shown in FIGS. 6-7.
Example 6
Evaluation of the model: two new automatic evaluation methods are provided, avoiding the errors caused by manually marking feature points for evaluation. These are compared with four classical three-dimensional model evaluation methods to comprehensively assess model performance. Finally, related studies were selected by the invention for comparative testing.
The two evaluation indexes proposed by the invention are the distribution error (DisE) and the sampling accuracy (SAc); their algorithms are given in Example 1. The network output is visually presented and stored in a variety of data formats, as shown in FIG. 8. The invention also compares the three-dimensionally reconstructed spine with the spine reconstructed from three-dimensional CT; the overlap is shown in FIG. 9. As can be seen from FIGS. 8-9, the deep-learning-based method provided by the invention for reconstructing the three-dimensional structure of the spine from biplanar X-ray images can convert two-dimensional image features into three-dimensional features that match the spine structure. The reconstructed three-dimensional structure differs little from the actual spine structure, noise and redundant soft-tissue interference are effectively reduced, and imaging accuracy is improved.
In the description of the present specification, the descriptions of the terms "one embodiment," "example," "specific example," and the like, mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The preferred embodiments of the invention disclosed above are intended only to assist in the explanation of the invention. The preferred embodiments are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and its practical application, to thereby enable others skilled in the art to best understand and utilize the invention. The invention is limited only by the claims and their full scope and equivalents.
Claims (9)
1. A spine three-dimensional structure reconstruction method based on deep learning, characterized by comprising the following steps:
s1: acquiring materials, collecting head CT data, and generating a training model after processing;
s2: optimizing materials, and analyzing and processing pixel distribution characteristics according to the image data obtained in the step S1;
s3: establishing a model, wherein the main structure is divided into feature extraction, feature conversion and three-dimensional reconstruction;
s4: the optimization model is used for optimizing the three-dimensional reconstruction configuration of the spine based on the related parameters;
s5: model evaluation: and (5) selecting an evaluation index and then performing automatic evaluation.
2. The method for reconstructing a three-dimensional structure of a spine based on deep learning according to claim 1, wherein the step S1 specifically comprises: collecting head CT data, generating spinal BR data with different angles by using a tomography algorithm, and generating corresponding BR for training a model by using a TIGRE algorithm; the data sets are divided into two types of normal spine and scoliosis, and each type of data set is divided into training, verifying and testing sets according to the proportion of 8:1:1.
3. The method for reconstructing a three-dimensional structure of a spine based on deep learning according to claim 1, wherein the step S2 specifically comprises: according to the pixel distribution characteristics of each X-ray image, the features are adaptively adjusted through an image enhancement algorithm, so that bone tissue features are highlighted and noise and soft tissue information are weakened.
4. The method for reconstructing a three-dimensional structure of a spine based on deep learning according to claim 1, wherein the step S3 specifically comprises: end-to-end reconstruction of the three-dimensional spine structure is realized by means of the feature extraction, feature conversion and three-dimensional reconstruction architectures; the model is named SP-Net, the features of the BR images are extracted through a dual-channel convolutional network with shared weights, information of different scales is extracted by means of residual connections, and this information is extended to the three-dimensional reconstruction of the bone structure.
5. The method for reconstructing a three-dimensional structure of a spine based on deep learning according to claim 1, wherein the ablation experiment in step S4 specifically comprises: the experiment of each normal dataset was trained on 800 images, validated on 100 images, tested on 100 test images, the experiment of each scoliosis dataset was trained on 9600 images, validated on 1200 images, tested on 1200 test images, the images used for training, validation and testing were independent of each other; the structure and performance of the SP-Net are optimized through an ablation experiment.
6. The method according to claim 1, wherein the evaluation indexes in the step S5 include the distribution error (DisE) and the sampling accuracy (SAc).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310503459.2A CN116402954A (en) | 2023-05-06 | 2023-05-06 | Spine three-dimensional structure reconstruction method based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116402954A true CN116402954A (en) | 2023-07-07 |
Family
ID=87019961
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310503459.2A Pending CN116402954A (en) | 2023-05-06 | 2023-05-06 | Spine three-dimensional structure reconstruction method based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116402954A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116912305A (en) * | 2023-09-13 | 2023-10-20 | 四川大学华西医院 | Brain CT image three-dimensional reconstruction method and device based on deep learning |
CN116912305B (en) * | 2023-09-13 | 2023-11-24 | 四川大学华西医院 | Brain CT image three-dimensional reconstruction method and device based on deep learning |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
You et al. | CT super-resolution GAN constrained by the identical, residual, and cycle learning ensemble (GAN-CIRCLE) | |
CN108629816B (en) | Method for reconstructing thin-layer magnetic resonance image based on deep learning | |
CN112435164B (en) | Simultaneous super-resolution and denoising method for generating low-dose CT lung image based on multiscale countermeasure network | |
CN109146988A (en) | Non-fully projection CT image rebuilding method based on VAEGAN | |
CN111815766B (en) | Processing method and system for reconstructing three-dimensional model of blood vessel based on 2D-DSA image | |
Gajera et al. | CT-scan denoising using a charbonnier loss generative adversarial network | |
CN115953494B (en) | Multi-task high-quality CT image reconstruction method based on low dose and super resolution | |
CN106530236A (en) | Medical image processing method and system | |
CN116645283A (en) | Low-dose CT image denoising method based on self-supervision perceptual loss multi-scale convolutional neural network | |
CN116402954A (en) | Spine three-dimensional structure reconstruction method based on deep learning | |
CN110335327A (en) | A kind of medical image method for reconstructing directly solving inverse problem | |
CN114565711A (en) | Heart image reconstruction method and system based on deep learning | |
CN114049334A (en) | Super-resolution MR imaging method taking CT image as input | |
Wang et al. | Multimodal parallel attention network for medical image segmentation | |
CN116778021B (en) | Medical image generation method, device, electronic equipment and storage medium | |
CN117475018A (en) | CT motion artifact removal method | |
CN117078658A (en) | Method for detecting vertebral center point in CT image based on deep learning | |
Li et al. | A multi-pronged evaluation for image normalization techniques | |
Anusree et al. | A Deep Learning Approach to Generating Flattened CBCT Volume Across Dental Arch From 2D Panoramic X-ray for 3D Oral Cavity Reconstruction | |
CN113796850A (en) | Parathyroid MIBI image analysis system, computer device, and storage medium | |
CN115147694A (en) | Three-image multi-mode medical image fusion method | |
CN114723879A (en) | Full-automatic reconstruction method of human brain cone beam based on multi-dimensional cross-modal image fusion technology | |
CN113902912A (en) | CBCT image processing method, neural network system creation method, and device | |
Colmeiro et al. | Whole body positron emission tomography attenuation correction map synthesizing using 3D deep generative adversarial networks | |
Pandey et al. | A Framework for Mathematical Methods in Medical Image Processing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||