CN114266461A

CN114266461A - MSWI process dioxin emission risk early warning method based on visual distribution GAN

Info

Publication number: CN114266461A
Application number: CN202111539001.XA
Authority: CN
Inventors: 汤健; 崔璨麟; 夏恒; 王丹丹; 乔俊飞
Original assignee: Beijing University of Technology
Current assignee: Beijing University of Technology
Priority date: 2021-12-15
Filing date: 2021-12-15
Publication date: 2022-04-01

Abstract

A visual distribution GAN-based method for early warning of dioxin emission risk in the MSWI process, belonging to the field of municipal solid waste incineration. At present, industrial practice relies on long-cycle, high-cost offline detection of the emission concentration of the toxic pollutant dioxin (DXN), so samples for building a risk early warning model are extremely scarce. To address this problem, a modeling method for MSWI process DXN emission risk early warning based on a visual distribution generative adversarial network (GAN) is proposed. First, the DXN risk level is introduced as conditional information on top of the original GAN, so that the generator can generate virtual samples of a specified risk level. Then, visual distribution information is used to evaluate and screen qualified virtual samples. Finally, a DXN emission risk early warning model is built on mixed samples consisting of virtual samples and real samples. The effectiveness of the proposed method is verified on industrial process DXN data.

Description

MSWI process dioxin emission risk early warning method based on visual distribution GAN

Technical Field

The invention belongs to the field of municipal solid waste incineration.

Background

The output of municipal solid waste (MSW) increases year by year as urban populations grow. Municipal solid waste incineration (MSWI) is a treatment method adopted in most countries today, with the advantages of rendering waste harmless, reducing its volume, and recovering resources. Dioxin (DXN) generated in the MSWI process is a highly toxic pollutant: it damages the endocrine system of those exposed, damages chromosomes and can thereby cause cells to become cancerous, and accumulates in organisms. It is a main cause of the "not in my backyard" (NIMBY) effect surrounding the construction of incineration plants. Controlling DXN emission is therefore an environmental issue that must be addressed. Early warning of the DXN emission risk level, followed by optimized control of the MSWI process, is of practical importance for reducing pollutant emission.

At present, the industry mainly detects DXN in the flue gas emitted from the stack at the end of the MSWI process. Neither offline direct detection nor online indirect detection can support real-time optimized control of the MSWI process aimed at DXN emission reduction. In addition, detecting the DXN emission concentration is difficult, slow, and expensive, so ground-truth samples for building data-driven models are extremely scarce. The DXN emission concentration detection problem studied in this application is therefore a typical small-sample problem, characterized by few and imbalanced samples. In general, a small number of modeling samples can hardly reflect the real characteristics of the industrial process, which makes it hard to build a robust and reliable regression model of pollutant emission concentration; building a risk classification model is comparatively easier. Moreover, domain experts on site are accustomed to describing the pollution emission risk in linguistic terms such as low, medium, and high emission concentration, and to adjusting the relevant control parameters based on judgments drawn from their own experience. However, sample imbalance, in which some classes have far fewer samples than others, is also a main cause of one-sidedness and bias in the resulting risk classification model.

In summary, this application proposes a method for building a DXN emission risk early warning model for the MSWI process based on an active learning mechanism GAN. First, the DXN risk level is introduced as conditional information on top of the original GAN; the conditional information and random noise are input to the generator to produce virtual samples of a preset DXN risk level; the virtual samples and real samples are input together to the discriminator; and the generator and the discriminator are updated according to the discrimination result. Second, the virtual samples are pre-screened using the maximum mean discrepancy (MMD), principal component analysis (PCA) is applied to the pre-screened virtual samples to obtain visual distribution information, and that information is used to judge whether the pre-screened virtual samples are qualified. Finally, a DXN emission risk early warning model is built on mixed samples consisting of virtual samples and real samples. The validity of the proposed method is verified on actual DXN data.

The grate-furnace incineration process of a domestic MSWI power plant is shown in FIG. 1.

As shown in FIG. 1, MSW is collected and weighed by dedicated vehicles, transported to the unloading hall, and dumped into a sealed waste pit. A grab crane feeds it into the incinerator hopper, and a feeder pushes it onto the grate. In the incinerator the MSW passes through four stages in sequence: drying, ignition, combustion, and burnout. The burnt-out residue falls into a water-cooled slag hopper, is carried by a slag conveyor to the ash pit, and is then collected and sent to a landfill. Flue gas from incineration heats the waste heat boiler to produce high-pressure steam, which drives a steam turbine generator to generate electricity. Activated carbon and slaked lime are added to the flue gas at the boiler outlet before it enters the reactor; the resulting fly ash goes to a fly ash storage tank, and the flue gas enters a bag filter that removes particulate matter, neutralization products, and activated carbon adsorbates. The treated stream is split into three parts: the tail fly ash enters the fly ash tank; part of the fly ash mixture is wetted in a mixer and returned to the reactor; and the tail flue gas is discharged to the atmosphere through the stack by an induced draft fan. The tail flue gas contains HCl, SO₂, CO, CO₂, NOₓ, DXN, and other substances.

DXN is present in the incineration ash, fly ash, and flue gas generated by the MSWI process for two reasons: incomplete combustion of the solid waste and de novo synthesis reactions. Therefore, the flue gas during incineration must reach 850 ℃ and be held there for 2 s to ensure effective decomposition of toxic organic matter. In the flue gas treatment stage, lime and activated carbon are injected into the reactor to adsorb DXN and some heavy metals; the flue gas is then filtered by the bag filter and discharged to the stack by the induced draft fan, which reduces the DXN concentration in the emitted flue gas. In addition, the DXN memory effect caused by ash deposits in this stage also raises the DXN emission concentration. The on-site distributed control system (DCS) collects and stores the DXN-related process variables of the above stages and the concentrations of conventional pollutants (CO, HCl, SO₂, NOₓ, HF, etc.). However, detecting DXN in the emitted flue gas remains difficult because of its high cost and long cycle.

From the above, the samples available for building a DXN emission risk early warning model are few, unevenly distributed, and high-dimensional. This application therefore proposes a DXN emission risk early warning modeling method for the MSWI process based on an adversarial virtual sample generation strategy with an active learning mechanism.

Disclosure of Invention

The strategy proposed in this application for building an MSWI process DXN emission risk early warning model based on an active learning mechanism GAN comprises three parts: GAN-based virtual sample generation, virtual sample evaluation and screening based on visual distribution information, and mixed-sample-based risk early warning model construction, as shown in FIG. 2.

In FIG. 2, the real sample inputs and the corresponding outputs are denoted $X_{real}$ and $Y_{real}$, respectively; the random noise is denoted $X_{noise}$; the virtual samples generated by the GAN generator are denoted $(X_{vir}, Y_{vir})$, where $X_{vir}$ is the set of virtual sample inputs and $Y_{vir}$ is the corresponding set of virtual sample outputs; the virtual samples that pass the MMD pre-screening are denoted $(X_{vir}^{MMD}, Y_{vir}^{MMD})$, where $X_{vir}^{MMD}$ is the set of pre-screened virtual sample inputs and $Y_{vir}^{MMD}$ is the corresponding set of pre-screened virtual sample outputs; the visual distribution information is denoted $D_{PCA}$; the qualified virtual samples obtained by judging the visual distribution information are denoted $(X_{vir}^{qua}, Y_{vir}^{qua})$, where $X_{vir}^{qua}$ is the set of qualified virtual sample inputs and $Y_{vir}^{qua}$ is the corresponding set of qualified virtual sample outputs; and the risk category output of the constructed risk early warning model is denoted $\hat{Y}$.

The functions of the different modules of the strategy are as follows:

1) GAN-based virtual sample generation module: the DXN emission risk level is introduced as conditional information on top of the original GAN and is input to the generator together with random noise to generate virtual samples of a specified class; the virtual samples and real samples are then input to the discriminator, and the generator and the discriminator are updated according to the discrimination result; through the adversarial game between generator and discriminator, the generated virtual samples come ever closer to the real samples.

2) Virtual sample screening and evaluation module based on visual distribution information: first, the MMD is used to calculate the similarity between the virtual samples and the real samples in order to pre-screen the virtual samples; then, the virtual samples are visualized with PCA to obtain the distribution information after dimensionality reduction; finally, the distribution information is judged. If it passes, the samples are marked as qualified virtual samples; if not, the virtual samples are regenerated.

3) Mixed-sample-based risk early warning model construction module: a DXN emission risk early warning model is built on the mixed samples using the random forest algorithm.

4.1 virtual sample Generation Module based on GAN

GAN is an unsupervised generative model built on a game-theoretic setting: through the adversarial game between a generator and a discriminator, it generates virtual samples close to the real samples. Because the class of the virtual samples generated by the original GAN cannot be controlled, this module introduces the DXN emission risk level as conditional information on top of the original GAN to control the class of the generated virtual samples.

The GAN-based virtual sample generation flow is shown in fig. 3.

The virtual sample generation process is as follows. First, $X_{noise}$ and $Y_{real}$ are input together to the generator to generate the virtual sample inputs $X_{vir}$. Then, $X_{real}$, $X_{vir}$, and $Y_{real}$ are input to the discriminator, and the generator and the discriminator are updated according to the discrimination result $Y_{real/vir}$. Next, $X_{noise}$ and the DXN emission risk level that is expected to be generated, $Y_{vir}$, are input to the trained generator to generate $X_{vir}$. Finally, $X_{vir}$ and $Y_{vir}$ are combined to obtain the virtual samples $(X_{vir}, Y_{vir})$.

In this application, the number of samples in each training batch is set to $N_b$, the learning rate to $\alpha_{lr}$, and the maximum number of training epochs to $N_e$. The generator is a three-layer neural network whose hidden layer uses the ReLU activation function and whose output layer uses a linear activation function:

$$H_G = \mathrm{ReLU}\left(\omega_{G1}[X_{noise}, Y_{real}] + b_{G1}\right), \qquad X_{vir} = \omega_{G2} H_G + b_{G2} \quad (1)$$

where $\omega_{G1}$ is the weight between the input layer and the hidden layer of the generator; $b_{G1}$ is the bias between the input layer and the hidden layer of the generator; the ReLU activation function is $\mathrm{ReLU}(x) = \max(0, x)$, with $x$ an arbitrary input value; $H_G$ is the hidden layer output of the generator; $\omega_{G2}$ is the weight between the hidden layer and the output layer of the generator; and $b_{G2}$ is the bias between the hidden layer and the output layer of the generator.
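As a rough illustration of the generator's forward pass, the following minimal sketch uses only the Python standard library. The one-hot encoding of the risk level, the noise and hidden layer sizes, and all function names are assumptions made for the example and are not taken from the application; only the ReLU hidden layer, the linear output layer, the 5 risk levels, and the 120 input features come from the text.

```python
import random

def relu(vector):
    """Element-wise ReLU(x) = max(0, x)."""
    return [max(0.0, x) for x in vector]

def affine(weights, bias, vector):
    """Return W v + b for a weight matrix stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]

def one_hot(level, n_levels):
    """Assumed encoding of the risk level used as conditional information."""
    return [1.0 if i == level else 0.0 for i in range(n_levels)]

def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (an illustrative choice)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def generator_forward(noise, risk_level, params, n_levels=5):
    """Three-layer generator: ReLU hidden layer, linear output layer."""
    (w_g1, b_g1), (w_g2, b_g2) = params
    z = noise + one_hot(risk_level, n_levels)  # noise concatenated with the condition
    hidden = relu(affine(w_g1, b_g1, z))
    return affine(w_g2, b_g2, hidden)

rng = random.Random(0)
noise_dim, hidden_dim, feature_dim = 8, 16, 120  # 120 input features as in the text
params = (init_layer(hidden_dim, noise_dim + 5, rng),
          init_layer(feature_dim, hidden_dim, rng))
noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
x_vir = generator_forward(noise, risk_level=2, params=params)  # one virtual input vector
```

The weights here are untrained, so `x_vir` only demonstrates the shape of the computation, not a realistic virtual sample.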

The discriminator is a three-layer neural network whose hidden layer uses the ReLU activation function and whose output layer uses the Sigmoid activation function:

$$H_D = \mathrm{ReLU}\left(\omega_{D1} S_D + b_{D1}\right), \qquad Y_{real/vir} = \mathrm{Sigmoid}\left(\omega_{D2} H_D + b_{D2}\right) \quad (2)$$

where $S_D$ is the combined sample composed of $(X_{vir}, Y_{real})$ and $(X_{real}, Y_{real})$; $\omega_{D1}$ is the weight between the input layer and the hidden layer of the discriminator; $b_{D1}$ is the bias between the input layer and the hidden layer of the discriminator; $H_D$ is the hidden layer output of the discriminator; $\omega_{D2}$ is the weight between the hidden layer and the output layer of the discriminator; $b_{D2}$ is the bias between the hidden layer and the output layer of the discriminator; and the Sigmoid activation function is $\mathrm{Sigmoid}(x) = 1/(1 + e^{-x})$, with $x$ an arbitrary input value.
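The discriminator's forward pass can be sketched in the same spirit. The layer sizes, the one-hot condition vector, and the helper names below are assumptions for illustration; the ReLU hidden layer and the Sigmoid output layer follow the description.

```python
import math
import random

def relu(vector):
    """Element-wise ReLU(x) = max(0, x)."""
    return [max(0.0, x) for x in vector]

def sigmoid(x):
    """Numerically stable Sigmoid(x) = 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def affine(weights, bias, vector):
    """Return W v + b for a weight matrix stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]

def discriminator_forward(features, condition, params):
    """Three-layer discriminator: ReLU hidden layer, Sigmoid output in (0, 1)."""
    (w_d1, b_d1), (w_d2, b_d2) = params
    s = features + condition  # sample input concatenated with its risk level
    hidden = relu(affine(w_d1, b_d1, s))
    return sigmoid(affine(w_d2, b_d2, hidden)[0])

rng = random.Random(1)
feature_dim, n_levels, hidden_dim = 120, 5, 16  # hidden size is an assumption
w_d1 = [[rng.gauss(0.0, 0.1) for _ in range(feature_dim + n_levels)]
        for _ in range(hidden_dim)]
w_d2 = [[rng.gauss(0.0, 0.1) for _ in range(hidden_dim)]]
params = ((w_d1, [0.0] * hidden_dim), (w_d2, [0.0]))
sample = [rng.gauss(0.0, 1.0) for _ in range(feature_dim)]
condition = [0.0, 0.0, 1.0, 0.0, 0.0]  # assumed one-hot risk level
y_d = discriminator_forward(sample, condition, params)  # score for "real"
```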

The objective function $O_{GAN}$ is given by formula (3):

$$\min_G \max_D O_{GAN} = E_{X_{real} \sim P_{data}(X_{real})}\left[\log Y_D^{real}\right] + E_{X_{noise} \sim P_{noise}(X_{noise})}\left[\log\left(1 - Y_D^{vir}\right)\right] \quad (3)$$

where $P_{data}(X_{real})$ is the distribution of $X_{real}$; $Y_D^{real}$ is the output of the discriminator for $(X_{real}, Y_{real})$; $P_{noise}(X_{noise})$ is the distribution of $X_{noise}$; and $Y_D^{vir}$ is the output of the discriminator for $(X_{vir}, Y_{real})$.

The discriminator estimates the probability that a sample comes from $P_{data}(X_{real})$ rather than from $P_{noise}(X_{noise})$. Based on the discriminator's result, the generator learns the real sample distribution $P_{data}(X_{real})$ so as to reduce $\log(1 - Y_D^{vir})$. The generator and the discriminator are thus trained together in a minimax adversarial game.
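To make the minimax objective concrete, the sketch below estimates the value of the objective from batches of discriminator outputs. The function name `gan_objective` and the small constant `eps` (added to avoid `log(0)`) are assumptions for the example.

```python
import math

def gan_objective(d_real, d_vir, eps=1e-12):
    """Monte Carlo estimate of the GAN objective.

    d_real: discriminator outputs on real samples, each in (0, 1)
    d_vir:  discriminator outputs on virtual samples, each in (0, 1)
    """
    real_term = sum(math.log(y + eps) for y in d_real) / len(d_real)
    vir_term = sum(math.log(1.0 - y + eps) for y in d_vir) / len(d_vir)
    return real_term + vir_term

# A discriminator that separates real from virtual well gives a high value ...
confident = gan_objective([0.9, 0.95], [0.1, 0.05])
# ... while a discriminator that cannot tell them apart outputs 0.5 everywhere.
fooled = gan_objective([0.5, 0.5], [0.5, 0.5])
```

The discriminator is updated to increase this value, while the generator is updated to decrease the second term; when the discriminator outputs 0.5 for every sample the value is 2·log(0.5).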

4.2 virtual sample screening and evaluation module based on visual distribution information

The virtual sample screening and evaluation process based on visual distribution information is as follows: first, the virtual samples are pre-screened according to the MMD value between the virtual samples and the real samples; then, a final judgment is made on the PCA visual distribution information of the virtual samples; finally, if the judgment fails, the virtual samples are regenerated and the above operations are repeated, and if it passes, qualified virtual samples are obtained. The flow is shown in FIG. 4.

4.2.1 MMD-based virtual sample prescreening module

First, several generators are taken and used to generate several groups of virtual samples.

Then, the quality of each group of virtual samples is calculated. This application uses the MMD to measure the overall mean difference between the virtual samples and the real samples, and thereby the difference between their distributions.

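Assume that $X_{vir}$ follows the distribution $P_{vir}$, where $N_{vir}$ is the number of virtual samples in a group, and that $X_{real}$ follows the distribution $P_{real}$, where $N_{real}$ is the number of real samples. The supremum of the difference between the expectations of the two domains' samples in the reproducing kernel Hilbert space (RKHS) is then obtained through a high-dimensional mapping function, that is:

$$MMD(X_{vir}, X_{real}) = \sup_{\|\phi\|_H \le 1} \left( E_p[\phi(X_{vir})] - E_q[\phi(X_{real})] \right) \quad (4)$$

where $H$ is the RKHS, $\phi(\cdot)$ maps the samples into the high-dimensional RKHS, $E_p[\phi(X_{vir})]$ and $E_q[\phi(X_{real})]$ are the expected values of the samples mapped into the RKHS, and $\sigma$ is the bandwidth of the Gaussian kernel.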

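According to formula (4), the MMD values between each of the $N$ groups of virtual samples $(X_{vir}^{n}, Y_{vir}^{n})$, $n = 1, \ldots, N$, and the real samples $(X_{real}, Y_{real})$ are calculated in order to pre-screen them, where $(X_{vir}^{1}, Y_{vir}^{1})$ is the first group of virtual samples, $(X_{vir}^{2}, Y_{vir}^{2})$ is the second group, and $(X_{vir}^{N}, Y_{vir}^{N})$ is the $N$-th group. The pre-screening function is:

$$(X_{vir}^{MMD}, Y_{vir}^{MMD}) = \min\left\{ MMD\left((X_{vir}^{1}, Y_{vir}^{1}), (X_{real}, Y_{real})\right), \ldots, MMD\left((X_{vir}^{N}, Y_{vir}^{N}), (X_{real}, Y_{real})\right) \right\} \quad (5)$$

where $\min(\cdot)$ denotes taking, among the $N$ groups of virtual samples, the group with the smallest MMD value relative to $(X_{real}, Y_{real})$ as the best-quality pre-screened virtual samples $(X_{vir}^{MMD}, Y_{vir}^{MMD})$.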

4.2.2 virtual sample visualization Module based on PCA

The DXN samples in this application are high-dimensional, so the distribution of the generated virtual samples is hard to perceive intuitively. The application therefore uses PCA to reduce the virtual samples to 1 dimension for visualization, which provides the overall distribution information $D_{PCA}$.

PCA projects the raw data into a new space through a set of orthogonal vectors, removing redundancy in the raw data while retaining the main information. The PCA-based virtual sample visualization proceeds as follows.

First, $X_{vir}^{MMD}$ is centered to obtain the centered sample matrix $U \in \mathbb{R}^{M \times N_{vir}}$, where $N_{vir}$ is the number of samples and $M$ is the sample dimension.

Next, the covariance matrix $C$ of $U$ is calculated:

$$C = UU^T \quad (6)$$

Then, the eigenvectors and eigenvalues of $C$ are obtained by eigendecomposition:

$$C = W \Lambda W^T \quad (7)$$

where $W$ is the matrix formed by the eigenvectors and $\Lambda$ is the diagonal matrix of eigenvalues arranged in descending order.

Then, $X_{vir}^{MMD}$ is reduced to 1 dimension as follows:

$$X_{PCA} = \mu_1 U \quad (8)$$

where $X_{PCA}$ is the virtual sample set reduced to 1 dimension and $\mu_1$ is the eigenvector corresponding to the largest eigenvalue.

Finally, the distribution function of $X_{PCA}$ is calculated to obtain the PCA visualization result.
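The PCA steps above can be sketched with the standard library alone. In this illustration the samples are stored as rows (the transpose of the centered matrix in formula (6)), the leading eigenvector is found by power iteration instead of a full eigendecomposition, and the function names and toy data are assumptions for the example.

```python
import math

def center(samples):
    """Subtract the per-feature mean from every sample (rows are samples)."""
    n, m = len(samples), len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(m)]
    return [[row[j] - means[j] for j in range(m)] for row in samples]

def covariance(centered):
    """Unnormalized M x M covariance matrix, matching C = U U^T in formula (6)."""
    m = len(centered[0])
    return [[sum(row[i] * row[j] for row in centered) for j in range(m)]
            for i in range(m)]

def leading_eigenvector(matrix, iterations=500):
    """Power iteration: converges to the eigenvector of the largest eigenvalue."""
    m = len(matrix)
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def project_to_1d(samples):
    """Project centered samples onto the first principal direction."""
    centered = center(samples)
    mu1 = leading_eigenvector(covariance(centered))
    x_pca = [sum(a * b for a, b in zip(mu1, row)) for row in centered]
    return x_pca, mu1

# Toy data lying roughly along the direction (1, 2).
samples = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8]]
x_pca, mu1 = project_to_1d(samples)
```

For real 120-dimensional data a library eigensolver would be the usual choice; power iteration is used here only to keep the sketch dependency-free.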

4.2.3 visual distribution information discrimination module

The PCA visualization result of the virtual samples provides the overall distribution information $D_{PCA}$:

$$D_{PCA} = (R_{real} \cap R_{vir}) / S_{real} \quad (9)$$

where $R_{real}$ is the region enclosed by the real sample distribution function and the x-axis; $R_{vir}$ is the region enclosed by the virtual sample distribution function and the x-axis; $S_{real}$ is the area of the region enclosed by the real sample distribution function and the x-axis; and $R_{real} \cap R_{vir}$ denotes the area of the overlap between $R_{real}$ and $R_{vir}$.

The visual distribution information discriminant function proposed in this application is:

$$\phi_{visual}(D_{PCA}) = \begin{cases} 1, & D_{PCA} \ge \theta_S \\ 0, & \text{otherwise} \end{cases} \quad (10)$$

where $\theta_S$ is an empirically set threshold.

If $\phi_{visual}(D_{PCA})$ equals 1, the virtual samples are qualified virtual samples; otherwise, they are unqualified virtual samples.
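One possible way to compute the overlap ratio is sketched below. The application does not state how the distribution functions are estimated, so the histogram density estimate on shared bins, the bin count, the treatment of a value exactly equal to the threshold, and the function names are all assumptions.

```python
def histogram_density(values, lo, hi, bins):
    """Histogram estimate of a density on [lo, hi] with equal-width bins."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # put v == hi in the last bin
        counts[idx] += 1
    return [c / (len(values) * width) for c in counts], width

def overlap_ratio(real_1d, vir_1d, bins=10):
    """Overlap area of the two estimated distributions divided by the real area."""
    lo = min(min(real_1d), min(vir_1d))
    hi = max(max(real_1d), max(vir_1d))
    if hi == lo:  # all values identical: the distributions coincide
        return 1.0
    p_real, width = histogram_density(real_1d, lo, hi, bins)
    p_vir, _ = histogram_density(vir_1d, lo, hi, bins)
    overlap = sum(min(a, b) for a, b in zip(p_real, p_vir)) * width
    s_real = sum(p_real) * width
    return overlap / s_real

def is_qualified(d_pca, theta_s=0.8):
    """Discriminant: 1 if the overlap ratio reaches the threshold, else 0."""
    return 1 if d_pca >= theta_s else 0

real_1d = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 1.0]
far_1d = [10.0, 10.2, 10.4, 10.6, 10.8, 11.0]
same = overlap_ratio(real_1d, real_1d)     # identical distributions
disjoint = overlap_ratio(real_1d, far_1d)  # no shared bins
```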

4.3 Risk early warning model building module based on mixed samples

The qualified virtual samples $(X_{vir}^{qua}, Y_{vir}^{qua})$ obtained from the judgment are combined with the real samples $(X_{real}, Y_{real})$ to obtain the mixed samples $S_{mix}$.

Random forest (RF) is used as the classifier of the risk early warning model, with the following steps.

First, the Bootstrap algorithm and the random subspace method (RSM) are used to randomly sample the samples and the features of $S_{mix}$, yielding $N$ sub-sample sets.

Then, $N$ decision trees are built from the $N$ sub-sample sets, each decision tree producing one classification result.

Finally, the $N$ classification results are put to a vote, and the class with the most votes is selected as the final classification result $\hat{Y}$.
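The sampling and voting structure of these steps can be sketched as below. To keep the example short and dependency-free, a nearest-centroid rule stands in for the decision tree, so this is an ensemble in the spirit of a random forest and not a random forest itself; the function names, ensemble size, and toy data are assumptions.

```python
import random
from collections import Counter

def bootstrap_subspace(samples, labels, n_features, rng):
    """Draw rows with replacement (Bootstrap) and a random feature subset (RSM)."""
    rows = [rng.randrange(len(samples)) for _ in samples]
    cols = sorted(rng.sample(range(len(samples[0])), n_features))
    sub_x = [[samples[r][c] for c in cols] for r in rows]
    sub_y = [labels[r] for r in rows]
    return sub_x, sub_y, cols

def fit_centroids(sub_x, sub_y):
    """Stand-in base learner: per-class mean vector (NOT a decision tree)."""
    grouped = {}
    for x, y in zip(sub_x, sub_y):
        grouped.setdefault(y, []).append(x)
    return {y: [sum(col) / len(xs) for col in zip(*xs)]
            for y, xs in grouped.items()}

def predict_centroid(centroids, x):
    """Return the class whose centroid is nearest to x."""
    return min(centroids,
               key=lambda y: sum((a - b) ** 2 for a, b in zip(centroids[y], x)))

def fit_ensemble(samples, labels, n_learners=25, n_features=1, seed=0):
    """Train one base learner per bootstrap/subspace sub-sample set."""
    rng = random.Random(seed)
    ensemble = []
    for _ in range(n_learners):
        sub_x, sub_y, cols = bootstrap_subspace(samples, labels, n_features, rng)
        ensemble.append((fit_centroids(sub_x, sub_y), cols))
    return ensemble

def predict_vote(ensemble, x):
    """Majority vote over the base learners' classification results."""
    votes = Counter(predict_centroid(centroids, [x[i] for i in cols])
                    for centroids, cols in ensemble)
    return votes.most_common(1)[0][0]

s_mix_x = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
s_mix_y = ["low", "low", "low", "high", "high", "high"]
forest = fit_ensemble(s_mix_x, s_mix_y)
```

Calling `predict_vote(forest, [0.1, 0.1])` returns the class with the most votes among the 25 base learners.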

Drawings

FIG. 1 Grate-furnace-based MSWI process flow diagram

FIG. 2 DXN emission risk early warning model construction strategy based on an active learning mechanism GAN

FIG. 3 GAN-based virtual sample generation flow diagram

FIG. 4 Virtual sample evaluation and screening process based on an active learning mechanism

FIG. 5 Relationship between generated virtual sample quality and epoch for the DXN data

FIG. 6 Accuracy of the risk early warning model over 50 runs on the DXN data set

FIG. 7 Risk early warning experimental results on the DXN test data

Detailed Description

The DXN data used in this application come from a grate-furnace MSWI power plant in Beijing and cover 67 valid DXN emission concentration detection samples recorded from 2012 to 2018. The raw input features are processed to reduce their dimension from 314 to 120. The output DXN emission concentration is divided into 5 risk levels according to the criteria in Table 1; the sample counts listed for the high, medium, and low risk levels are 27, 12, 11, and 6. Two thirds of the samples are randomly selected as the training set to build the model, and the remaining third is used to test model performance.
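A random two-thirds split of the 67 samples could look like the following sketch. The rounding rule for the training set size, the fixed seed, and the function name are assumptions; the application states only the 2/3 and 1/3 proportions.

```python
import random

def split_two_thirds(n_samples, seed=0):
    """Shuffle sample indices and use about two thirds of them for training."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * 2 / 3)  # rounding rule is an assumption
    return indices[:n_train], indices[n_train:]

train_idx, test_idx = split_two_thirds(67)
```

With this rounding rule the 67 samples are divided into 45 training and 22 test samples.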

TABLE 1 DXN emissions Risk ratings Standard

For the DXN data set, the hidden layer of the generator uses the ReLU activation function and its output layer uses a linear activation function; the hidden layer of the discriminator uses the ReLU activation function and its output layer uses the Sigmoid activation function. The specific parameter settings are shown in Table 2.

Table 2 DXN dataset virtual sample generation experiment parameter settings

FIG. 5 shows the relationship between virtual sample quality and epoch generated based on DXN data.

As can be seen from FIG. 5, the quality of the generated virtual samples becomes stable once the epoch reaches 1000. Therefore, between training epochs 1100 and 2000, one generator is selected every 100 epochs, giving 10 generators in total. Each generator generates 10 groups of virtual samples, and each group contains 60 virtual samples for each of the 5 risk levels, 300 virtual samples in total. From the 10 groups of each of the 10 generators, the group with the lowest MMD value relative to the real samples is selected as the pre-screened virtual sample set. The experimental results are shown in Table 3.
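The bookkeeping of this selection scheme can be checked with a few lines of arithmetic. The variable names are illustrative; the numbers are those stated in the text.

```python
# Generators are taken every 100 epochs from epoch 1100 to epoch 2000 inclusive.
checkpoint_epochs = list(range(1100, 2001, 100))
groups_per_generator = 10
samples_per_level = 60
risk_levels = 5

n_generators = len(checkpoint_epochs)                # 10 generators
n_groups = n_generators * groups_per_generator       # 100 candidate groups
samples_per_group = samples_per_level * risk_levels  # 300 virtual samples per group
```

The MMD pre-screening therefore chooses one group out of 100 candidates, each holding 300 virtual samples.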

Table 3 DXN dataset virtual sample primary screening experimental results based on MMD

As can be seen from Table 3, the 4th group of virtual samples generated by the generator taken at the 2000th training epoch has the lowest MMD value relative to the real samples, so this group of virtual samples is selected.

To ensure the visualization effect, 27 low-risk, 12 medium-low-risk, 11 medium-high-risk, and 6 high-risk samples are randomly drawn from the 300 virtual samples, and 67 virtual samples are visualized. The threshold is empirically set to 0.8. The distribution information obtained from the visualization result is 0.81, which exceeds the threshold, so this group of virtual samples is qualified.

A risk early warning model is built on the mixed samples composed of the qualified virtual samples and the real samples; the relevant parameters are shown in Table 4.

TABLE 4 relevant parameters for construction of DXN data set mixed sample risk early warning model

The accuracy of 50 experiments is shown in fig. 6.

As can be seen from fig. 6, the risk early warning model trained by the mixed sample has better performance than the model trained by the real sample.

In addition, a total of 5 sets of comparative experiments were performed, and the relevant parameters are shown in table 5.

TABLE 5 DXN data set construction of relevant parameters based on Risk early warning model of mixed samples

In Table 5, the risk levels are listed in the order high risk, medium-low risk, and low risk. The virtual samples are randomly drawn from the screened virtual samples. In the unbalanced virtual samples and the unbalanced mixed samples, the proportion of each risk level matches that of the real samples; in the balanced virtual samples and the balanced mixed samples, every risk level has the same number of samples.

Given the randomness of the RF algorithm, each of the 5 experiments is repeated 50 times. FIG. 7 shows the accuracy of the risk early warning models built in experiments A, B, C, D, and E, and Table 6 compares the statistical results.

TABLE 6 Comparison of risk early warning statistics on the DXN test data

The results show that: 1) the average accuracy is 48.9091% with the real samples, 48.0909% with the unbalanced virtual samples, and 47.4989% with the balanced virtual samples, which indicates that the virtual samples generated by the method are very close to the real samples; 2) the average accuracy with the mixed samples is 70.8444% and 78.8085%, stated as improvements of 44% and 59% over modeling without virtual samples, which indicates that adding virtual samples helps improve model performance; 3) the average accuracy with the balanced mixed samples is 11% higher than with the unbalanced mixed samples, so balanced data yield a better model than unbalanced data; 4) the standard deviation of the accuracy with the mixed samples is lower than with the real samples, which indicates that adding virtual samples helps improve the stability of the model.

This application proposes a visual distribution GAN-based method for early warning of DXN emission risk in the MSWI process. Its innovations are as follows: 1) a DXN emission concentration risk early warning strategy based on GAN and visual distribution is proposed for the first time; 2) the GAN-based virtual sample generation (VSG) method can generate virtual samples of a specified class through conditional information, which effectively enlarges the sample set and fills the information gaps between real samples; 3) the virtual sample evaluation and screening method based on visual distribution information pre-screens the virtual samples with the MMD and then judges the distribution information provided by the visualization result of the pre-screened virtual samples; the samples that pass are the qualified virtual samples, whose quality is closer to that of the real samples. The effectiveness of the proposed strategy and method is verified on industrial DXN data. Future research directions include how to handle high-dimensional and discrete process data, and how to make the generator and the discriminator more stable during the adversarial game so as to obtain virtual samples of better quality.

Claims

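1. A method for early warning of dioxin emission risk in the MSWI process based on a visual distribution GAN, characterized in that:

the real sample inputs and the corresponding outputs are denoted $X_{real}$ and $Y_{real}$, respectively; the random noise is denoted $X_{noise}$; the virtual samples generated by the GAN generator are denoted $(X_{vir}, Y_{vir})$, where $X_{vir}$ is the set of virtual sample inputs and $Y_{vir}$ is the corresponding set of virtual sample outputs; the virtual samples that pass the MMD pre-screening are denoted $(X_{vir}^{MMD}, Y_{vir}^{MMD})$, where $X_{vir}^{MMD}$ is the set of pre-screened virtual sample inputs and $Y_{vir}^{MMD}$ is the corresponding set of pre-screened virtual sample outputs; the visual distribution information is denoted $D_{PCA}$; the qualified virtual samples obtained by judging the visual distribution information are denoted $(X_{vir}^{qua}, Y_{vir}^{qua})$, where $X_{vir}^{qua}$ is the set of qualified virtual sample inputs and $Y_{vir}^{qua}$ is the corresponding set of qualified virtual sample outputs; the risk category output of the constructed risk early warning model is denoted $\hat{Y}$;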

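1) GAN-based virtual sample generation module

First, $X_{noise}$ and $Y_{real}$ are input together to the generator to generate the virtual sample inputs $X_{vir}$; then, $X_{real}$, $X_{vir}$, and $Y_{real}$ are input to the discriminator, and the generator and the discriminator are updated according to the discrimination result $Y_{real/vir}$; next, $X_{noise}$ and the DXN emission risk level that is expected to be generated, $Y_{vir}$, are input to the trained generator to generate $X_{vir}$; finally, $X_{vir}$ and $Y_{vir}$ are combined to obtain the virtual samples $(X_{vir}, Y_{vir})$;

the number of samples in each training batch is set to $N_b$, the learning rate to $\alpha_{lr}$, and the maximum number of training epochs to $N_e$; the generator is a three-layer neural network whose hidden layer uses the ReLU activation function and whose output layer uses a linear activation function:

$$H_G = \mathrm{ReLU}\left(\omega_{G1}[X_{noise}, Y_{real}] + b_{G1}\right), \qquad X_{vir} = \omega_{G2} H_G + b_{G2} \quad (1)$$

where $\omega_{G1}$ is the weight between the input layer and the hidden layer of the generator; $b_{G1}$ is the bias between the input layer and the hidden layer of the generator; the ReLU activation function is $\mathrm{ReLU}(x) = \max(0, x)$, with $x$ an arbitrary input value; $H_G$ is the hidden layer output of the generator; $\omega_{G2}$ is the weight between the hidden layer and the output layer of the generator; $b_{G2}$ is the bias between the hidden layer and the output layer of the generator;

the discriminator is a three-layer neural network whose hidden layer uses the ReLU activation function and whose output layer uses the Sigmoid activation function:

$$H_D = \mathrm{ReLU}\left(\omega_{D1} S_D + b_{D1}\right), \qquad Y_{real/vir} = \mathrm{Sigmoid}\left(\omega_{D2} H_D + b_{D2}\right) \quad (2)$$

where $S_D$ is the combined sample composed of $(X_{vir}, Y_{real})$ and $(X_{real}, Y_{real})$; $\omega_{D1}$ is the weight between the input layer and the hidden layer of the discriminator; $b_{D1}$ is the bias between the input layer and the hidden layer of the discriminator; $H_D$ is the hidden layer output of the discriminator; $\omega_{D2}$ is the weight between the hidden layer and the output layer of the discriminator; $b_{D2}$ is the bias between the hidden layer and the output layer of the discriminator; the Sigmoid activation function is $\mathrm{Sigmoid}(x) = 1/(1 + e^{-x})$, with $x$ an arbitrary input value;

the objective function $O_{GAN}$ is given by formula (3):

$$\min_G \max_D O_{GAN} = E_{X_{real} \sim P_{data}(X_{real})}\left[\log Y_D^{real}\right] + E_{X_{noise} \sim P_{noise}(X_{noise})}\left[\log\left(1 - Y_D^{vir}\right)\right] \quad (3)$$

where $P_{data}(X_{real})$ is the distribution of $X_{real}$; $Y_D^{real}$ is the output of the discriminator for $(X_{real}, Y_{real})$; $P_{noise}(X_{noise})$ is the distribution of $X_{noise}$; $Y_D^{vir}$ is the output of the discriminator for $(X_{vir}, Y_{real})$;

the discriminator estimates the probability that a sample comes from $P_{data}(X_{real})$ rather than from $P_{noise}(X_{noise})$; based on the discriminator's result, the generator learns the real sample distribution $P_{data}(X_{real})$ so as to reduce $\log(1 - Y_D^{vir})$; the generator and the discriminator are trained together in a minimax adversarial game;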

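2) Virtual sample screening and evaluation module based on visual distribution information

MMD-based virtual sample pre-screening module

First, several generators are taken and used to generate several groups of virtual samples;

then, the quality of each group of virtual samples is calculated; the MMD is used to measure the overall mean difference between the virtual samples and the real samples, and thereby the difference between their distributions;

assume that $X_{vir}$ follows the distribution $P_{vir}$, where $N_{vir}$ is the number of virtual samples in a group, and that $X_{real}$ follows the distribution $P_{real}$, where $N_{real}$ is the number of real samples; the supremum of the difference between the expectations of the two domains' samples in the reproducing kernel Hilbert space (RKHS) is obtained through a high-dimensional mapping function, namely:

$$MMD(X_{vir}, X_{real}) = \sup_{\|\phi\|_H \le 1} \left( E_p[\phi(X_{vir})] - E_q[\phi(X_{real})] \right) \quad (4)$$

where $H$ is the RKHS, $\phi(\cdot)$ maps the samples into the high-dimensional RKHS, $E_p[\phi(X_{vir})]$ and $E_q[\phi(X_{real})]$ are the expected values of the samples mapped into the RKHS, and $\sigma$ is the bandwidth of the Gaussian kernel;

according to formula (4), the MMD values between each of the $N$ groups of virtual samples $(X_{vir}^{n}, Y_{vir}^{n})$, $n = 1, \ldots, N$, and the real samples $(X_{real}, Y_{real})$ are calculated in order to pre-screen them, where $(X_{vir}^{1}, Y_{vir}^{1})$ is the first group of virtual samples, $(X_{vir}^{2}, Y_{vir}^{2})$ is the second group, and $(X_{vir}^{N}, Y_{vir}^{N})$ is the $N$-th group; the pre-screening function is:

$$(X_{vir}^{MMD}, Y_{vir}^{MMD}) = \min\left\{ MMD\left((X_{vir}^{1}, Y_{vir}^{1}), (X_{real}, Y_{real})\right), \ldots, MMD\left((X_{vir}^{N}, Y_{vir}^{N}), (X_{real}, Y_{real})\right) \right\} \quad (5)$$

where $\min(\cdot)$ denotes taking, among the $N$ groups of virtual samples, the group with the smallest MMD value relative to $(X_{real}, Y_{real})$ as the best-quality pre-screened virtual samples $(X_{vir}^{MMD}, Y_{vir}^{MMD})$;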

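PCA-based virtual sample visualization module

PCA is used to reduce the virtual DXN samples to 1 dimension for visualization, which provides the overall distribution information $D_{PCA}$;

PCA projects the raw data into a new space through a set of orthogonal vectors, removing redundancy in the raw data while retaining the main information; the PCA-based virtual sample visualization proceeds as follows;

first, $X_{vir}^{MMD}$ is centered to obtain the centered sample matrix $U \in \mathbb{R}^{M \times N_{vir}}$, where $N_{vir}$ is the number of samples and $M$ is the sample dimension;

next, the covariance matrix $C$ of $U$ is calculated:

$$C = UU^T \quad (6)$$

then, the eigenvectors and eigenvalues of $C$ are obtained by eigendecomposition:

$$C = W \Lambda W^T \quad (7)$$

where $W$ is the matrix formed by the eigenvectors and $\Lambda$ is the diagonal matrix of eigenvalues arranged in descending order;

then, $X_{vir}^{MMD}$ is reduced to 1 dimension as follows:

$$X_{PCA} = \mu_1 U \quad (8)$$

where $X_{PCA}$ is the virtual sample set reduced to 1 dimension and $\mu_1$ is the eigenvector corresponding to the largest eigenvalue;

finally, the distribution function of $X_{PCA}$ is calculated to obtain the PCA visualization result;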

Visual distribution information discrimination module

The PCA visualization result of the virtual samples provides the overall distribution information $D_{PCA}$:

$$D_{PCA} = (R_{real} \cap R_{vir}) / S_{real} \quad (9)$$

where $R_{real}$ is the region enclosed by the real sample distribution function and the x-axis; $R_{vir}$ is the region enclosed by the virtual sample distribution function and the x-axis; $S_{real}$ is the area of the region enclosed by the real sample distribution function and the x-axis; $R_{real} \cap R_{vir}$ denotes the area of the overlap between $R_{real}$ and $R_{vir}$;

the visual distribution information discriminant function is as follows:

$$\phi_{visual}(D_{PCA}) = \begin{cases} 1, & D_{PCA} \ge \theta_S \\ 0, & \text{otherwise} \end{cases} \quad (10)$$

where $\theta_S$ is the set threshold, taken as 0.8;

if $\phi_{visual}(D_{PCA})$ equals 1, the virtual samples are qualified virtual samples; otherwise, they are unqualified virtual samples;

3) Mixed-sample-based risk early warning model construction module

The qualified virtual samples $(X_{vir}^{qua}, Y_{vir}^{qua})$ obtained from the judgment are combined with the real samples $(X_{real}, Y_{real})$ to obtain the mixed samples $S_{mix}$;

a random forest is used as the classifier of the risk early warning model.