CN114266461A - MSWI process dioxin emission risk early warning method based on visual distribution GAN - Google Patents
- Publication number
- CN114266461A (application CN202111539001.XA)
- Authority
- CN
- China
- Prior art keywords
- sample
- real
- virtual
- samples
- discriminator
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Image Analysis (AREA)
Abstract
A method for early warning of dioxin emission risk in the MSWI process based on a visual distribution GAN, belonging to the field of municipal solid waste incineration. At present, the emission concentration of the highly toxic pollutant dioxin (DXN) is measured in industry by offline methods with long cycles and high cost, so samples for constructing a risk early warning model are extremely scarce. To address this problem, a modeling method for MSWI process DXN emission risk early warning based on a visual distribution generative adversarial network (GAN) is proposed. First, the DXN risk level is introduced as conditional information on the basis of the original GAN, so that the generator can generate virtual samples of a specified risk level. Then, visual distribution information is used to evaluate and screen qualified virtual samples. Finally, a DXN emission risk early warning model is constructed from a mixed sample consisting of the virtual samples and the real samples. The effectiveness of the proposed method is verified with industrial process DXN data.
Description
Technical Field
The invention belongs to the field of municipal solid waste incineration.
Background
The production of municipal solid waste (MSW) increases year by year as the urban population grows. Municipal solid waste incineration (MSWI) is a treatment method adopted in most countries today, with the advantages of harmless treatment, volume reduction, and resource recovery. Dioxin (DXN) generated in the MSWI process is a highly toxic pollutant: it damages the endocrine system of exposed persons and damages chromosomes, which can cause cells to become cancerous, and it also accumulates in organisms. It is a main cause of the "not in my back yard" (NIMBY) effect surrounding incineration plant construction. Controlling DXN emissions is therefore an environmental problem that must be addressed. Early warning of the DXN emission risk level, followed by optimized control of the MSWI process, is of practical significance for reducing pollutant emissions.
At present, industry mainly measures DXN in the flue gas discharged from the stack at the end of the MSWI process. Neither offline direct measurement nor online indirect measurement can support real-time optimal control of the MSWI process aimed at DXN emission reduction. In addition, measuring the DXN emission concentration is difficult, slow, and expensive, so labeled samples for constructing data-driven models are extremely scarce. The DXN emission concentration problem studied in this application is therefore a typical small-sample problem, characterized by few samples and class imbalance. In general, a small number of modeling samples cannot accurately reflect the real characteristics of the industrial process, and it is difficult to build a robust and reliable regression model for predicting pollutant emission concentration; by comparison, a risk classification model is easier to build. Moreover, domain experts on industrial sites are accustomed to describing pollutant emission risk in linguistic terms such as low, medium, and high emission concentration, and to adjusting the relevant control parameters based on judgments drawn from their own experience. However, sample imbalance, i.e., the number of samples in one class being far smaller than in the others, is also a main reason why the resulting risk discrimination model is one-sided and biased.
In summary, this application proposes a method for constructing a DXN emission risk early warning model for the MSWI process based on a GAN with an active learning mechanism. First, the DXN risk level is introduced as conditional information on the basis of the original GAN; the conditional information and random noise are input into the generator to generate virtual samples with a preset DXN risk level; the virtual samples and the real samples are input together into the discriminator; and the generator and the discriminator are updated according to the discrimination result. Second, the virtual samples are preliminarily screened using the maximum mean discrepancy (MMD); principal component analysis (PCA) is then applied to the prescreened virtual samples to obtain visual distribution information, which is used to judge whether the prescreened virtual samples are qualified. Finally, a DXN emission risk early warning model is constructed from a mixed sample consisting of the virtual samples and the real samples. The validity of the proposed method is verified with actual DXN data.
The grate furnace incineration process flow of a certain domestic MSWI power plant is shown in figure 1.
As shown in figure 1, MSW is collected and weighed by dedicated vehicles, transported to the unloading hall, and dumped into a sealed solid waste pit; a grab bucket feeds it into the incinerator hopper, and a feeder pushes it onto the grate. In the incinerator the MSW passes through four stages in sequence: drying, ignition, combustion, and burnout. The burnt-out residue falls into a water-cooled slag hopper, is conveyed to the ash pit by a slag conveyor, and is then collected and sent to a landfill. Flue gas generated during incineration heats a waste heat boiler to produce high-pressure steam, which drives a steam turbine generator to generate electricity. After activated carbon and slaked lime are added, the flue gas from the boiler outlet enters the reactor, and the fly ash produced there enters the fly ash storage tank. The flue gas then enters a bag filter, which removes particulate matter, neutralization reaction products, and activated carbon adsorbates. The treated stream splits into three parts: tail fly ash enters the fly ash tank; part of the fly ash mixture is mixed with water in a mixer and returned to the reactor; and the tail flue gas is discharged to the atmosphere through the stack by an induced draft fan. The tail flue gas contains HCl, SO2, CO, CO2, NOx, and DXN, among others.
DXN is present in the incineration ash, fly ash, and flue gas generated in the MSWI process for two reasons: incomplete combustion of the solid waste and de novo synthesis reactions. Therefore, the flue gas during incineration must reach 850 °C and remain at that temperature for 2 s to ensure effective decomposition of toxic organic matter. In the flue gas treatment stage, lime and activated carbon are injected into the reactor to adsorb DXN and some heavy metals; the flue gas is then filtered by the bag filter and discharged to the stack by the induced draft fan, which reduces the DXN concentration in the emitted flue gas. In addition, the DXN memory effect caused by ash deposits in this stage can increase the DXN emission concentration. The on-site distributed control system (DCS) collects and stores the DXN-related process variables and the concentrations of routine pollutants (CO, HCl, SO2, NOx, HF, etc.) at the above stages. However, measuring DXN in the emitted flue gas is difficult because of its high cost and long cycle.
From the above, the samples for constructing the DXN emission risk early warning model are few in number, unevenly distributed, and high-dimensional. Therefore, this application proposes a modeling method for DXN emission risk early warning in the MSWI process based on an adversarial virtual sample generation strategy with an active learning mechanism.
Disclosure of Invention
The MSWI process DXN emission risk early warning model construction strategy based on the active learning mechanism GAN provided by the application comprises the following steps: generating a virtual sample based on GAN, evaluating and screening the virtual sample based on visual distribution information, and constructing a risk early warning model based on a mixed sample, as shown in FIG. 2.
In FIG. 2, the real sample input and the corresponding output are denoted $X_{real}$ and $Y_{real}$, respectively; random noise is denoted $X_{noise}$; the virtual samples generated by the GAN generator are denoted $(X_{vir}, Y_{vir})$, where $X_{vir}$ is the set of virtual sample inputs and $Y_{vir}$ is the corresponding set of virtual sample outputs; the virtual samples that pass MMD prescreening are denoted $(X_{vir}^{MMD}, Y_{vir}^{MMD})$, where $X_{vir}^{MMD}$ is the set of prescreened virtual sample inputs and $Y_{vir}^{MMD}$ is the corresponding set of prescreened virtual sample outputs; the visual distribution information is denoted $D_{PCA}$; the qualified virtual samples obtained by judging the visual distribution information are denoted $(X_{vir}^{qua}, Y_{vir}^{qua})$, where $X_{vir}^{qua}$ is the set of qualified virtual sample inputs and $Y_{vir}^{qua}$ is the corresponding set of qualified virtual sample outputs; the risk category output of the constructed risk early warning model is denoted $\hat{Y}$.
The functions of the different modules of the strategy are as follows:
1) GAN-based virtual sample generation module: the DXN emission risk level is introduced as conditional information on the basis of the original GAN and is input into the generator together with random noise to generate virtual samples of a specified class; the virtual samples and the real samples are then input into the discriminator, and the generator and the discriminator are updated according to the discrimination result; finally, through the adversarial game between the generator and the discriminator, the generated virtual samples become closer to the real samples.
2) Virtual sample screening and evaluation module based on visual distribution information: first, the MMD is used to compute the similarity between the virtual samples and the real samples for preliminary screening; then, PCA-based visualization of the virtual samples yields the distribution information after dimensionality reduction; finally, the distribution information is judged: if it passes, the samples are marked as qualified virtual samples; if not, the virtual samples are regenerated.
3) A risk early warning model construction module based on a mixed sample: and constructing a DXN emission risk early warning model by adopting a random forest algorithm based on the mixed sample.
4.1 virtual sample Generation Module based on GAN
The GAN is an unsupervised generative model based on a game-theoretic setting: through the adversarial game between a generator and a discriminator, it generates virtual samples close to the real samples. Because the class of the virtual samples generated by the original GAN cannot be controlled, this module introduces the DXN emission risk level as conditional information on the basis of the original GAN to control the class of the generated virtual samples.
The GAN-based virtual sample generation flow is shown in fig. 3.
The virtual sample generation process is as follows: first, $X_{noise}$ and $Y_{real}$ are input together into the generator to generate the virtual sample inputs $X_{vir}$; then, $X_{real}$, $X_{vir}$, and $Y_{real}$ are input into the discriminator, and the generator and the discriminator are updated according to the discrimination result $Y_{real/vir}$; next, $X_{noise}$ and the desired DXN emission risk level $Y_{vir}$ are input into the trained generator to generate $X_{vir}$; finally, $X_{vir}$ and $Y_{vir}$ are combined to obtain the virtual samples $(X_{vir}, Y_{vir})$.
In this application, the number of samples in each training batch is set to $N_b$, the learning rate to $\alpha_{lr}$, and the maximum number of training epochs to $N_e$. The generator is a three-layer neural network whose hidden layer uses the Relu activation function and whose output layer uses a linear activation function, as follows:
$$H_G = \mathrm{Relu}\left(\omega_{G1}\,[X_{noise}, Y_{real}] + b_{G1}\right), \qquad X_{vir} = \omega_{G2} H_G + b_{G2} \quad (1)$$
where $\omega_{G1}$ is the weight between the input layer and the hidden layer of the generator; $b_{G1}$ is the bias between the input layer and the hidden layer of the generator; the Relu activation function is $\mathrm{Relu}(x) = \max(0, x)$, with $x$ an arbitrary input value; $H_G$ is the hidden layer output of the generator; $\omega_{G2}$ is the weight between the hidden layer and the output layer of the generator; and $b_{G2}$ is the bias between the hidden layer and the output layer of the generator.
The discriminator is a three-layer neural network whose hidden layer uses the Relu activation function and whose output layer uses the Sigmoid activation function, as follows:
$$H_D = \mathrm{Relu}\left(\omega_{D1} S_D + b_{D1}\right), \qquad Y_{real/vir} = \mathrm{Sigmoid}\left(\omega_{D2} H_D + b_{D2}\right) \quad (2)$$
where $S_D$ is the composite sample consisting of $(X_{vir}, Y_{real})$ and $(X_{real}, Y_{real})$; $\omega_{D1}$ is the weight between the input layer and the hidden layer of the discriminator; $b_{D1}$ is the bias between the input layer and the hidden layer of the discriminator; $H_D$ is the hidden layer output of the discriminator; $\omega_{D2}$ is the weight between the hidden layer and the output layer of the discriminator; $b_{D2}$ is the bias between the hidden layer and the output layer of the discriminator; and the Sigmoid activation function is $\mathrm{Sigmoid}(x) = 1/(1 + e^{-x})$, with $x$ an arbitrary input value.
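The two networks described above can be sketched as plain forward passes. This is a minimal illustration rather than the application's implementation: the layer sizes, the one-hot encoding of the risk level, the random initialization, and all function names are assumptions; only the activation choices (Relu hidden layers, linear generator output, Sigmoid discriminator output), the 120 input features, and the 5 risk levels come from the text.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_layer(rng, n_in, n_out):
    """Small random weights and zero bias for one fully connected layer."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)

def generator_forward(x_noise, y_cond, params):
    """Hidden layer: Relu; output layer: linear."""
    w1, b1, w2, b2 = params
    g_in = np.concatenate([x_noise, y_cond], axis=1)  # noise + risk-level condition
    hidden = relu(g_in @ w1 + b1)
    return hidden @ w2 + b2

def discriminator_forward(x, y_cond, params):
    """Hidden layer: Relu; output layer: Sigmoid (probability that x is real)."""
    w1, b1, w2, b2 = params
    d_in = np.concatenate([x, y_cond], axis=1)        # sample + risk-level condition
    hidden = relu(d_in @ w1 + b1)
    return sigmoid(hidden @ w2 + b2)

rng = np.random.default_rng(0)
# Illustrative sizes; only n_features=120 and n_classes=5 follow the text.
n_noise, n_classes, n_hidden, n_features, batch = 16, 5, 32, 120, 8

g_params = (*init_layer(rng, n_noise + n_classes, n_hidden),
            *init_layer(rng, n_hidden, n_features))
d_params = (*init_layer(rng, n_features + n_classes, n_hidden),
            *init_layer(rng, n_hidden, 1))

labels = rng.integers(0, n_classes, size=batch)
y_cond = np.eye(n_classes)[labels]                    # one-hot risk level (assumed encoding)
x_noise = rng.normal(size=(batch, n_noise))

x_vir = generator_forward(x_noise, y_cond, g_params)
y_d_vir = discriminator_forward(x_vir, y_cond, d_params)
print(x_vir.shape, y_d_vir.shape)
```

Feeding the condition to both networks is what lets the trained generator be asked for a specific risk level later.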
The objective function $O_{GAN}$ is shown in formula (3):
$$O_{GAN} = \min_G \max_D \; E_{X_{real} \sim P_{data}(X_{real})}\left[\log Y_D^{real}\right] + E_{X_{noise} \sim P_{noise}(X_{noise})}\left[\log\left(1 - Y_D^{vir}\right)\right] \quad (3)$$
where $P_{data}(X_{real})$ is the distribution of $X_{real}$; $Y_D^{real}$ is the discriminator output for $(X_{real}, Y_{real})$; $P_{noise}(X_{noise})$ is the distribution of $X_{noise}$; and $Y_D^{vir}$ is the discriminator output for $(X_{vir}, Y_{real})$.
The discriminator computes the probability that a sample comes from $P_{noise}(X_{noise})$ or from $P_{data}(X_{real})$. Based on the discriminator's result, the generator learns the real sample distribution $P_{data}(X_{real})$ so as to reduce $\log(1 - Y_D^{vir})$. The generator and the discriminator are trained together in this min-max adversarial game.
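As a small numerical illustration of this objective, the sketch below evaluates a Monte Carlo estimate of the two expectation terms from batches of discriminator outputs. The function name, the `eps` guard against `log(0)`, and the example output values are assumptions made for illustration.

```python
import numpy as np

def gan_objective(y_d_real, y_d_vir, eps=1e-12):
    """Batch estimate of E[log Y_D^real] + E[log(1 - Y_D^vir)]."""
    return np.mean(np.log(y_d_real + eps)) + np.mean(np.log(1.0 - y_d_vir + eps))

# A discriminator that separates real from virtual samples well scores high ...
confident = gan_objective(np.array([0.9, 0.95]), np.array([0.1, 0.05]))
# ... while one that outputs 0.5 everywhere (fooled by the generator) scores low.
fooled = gan_objective(np.array([0.5, 0.5]), np.array([0.5, 0.5]))
print(confident > fooled)
```

The discriminator update pushes this value up, and the generator update pushes the second term down, which is the min-max game described above.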
4.2 virtual sample screening and evaluation module based on visual distribution information
The virtual sample screening and evaluation process based on visual distribution information is as follows: first, the virtual samples are preliminarily screened according to the MMD between the virtual samples and the real samples; then, a final judgment is made on the PCA visualization distribution information of the virtual samples; finally, if the judgment fails, the virtual samples are regenerated and the above operations are repeated, and if it passes, qualified virtual samples are obtained. The flow is shown in FIG. 4.
4.2.1 MMD-based virtual sample prescreening module
First, several generators are taken and several groups of virtual samples are generated.
Then, the quality of each group of virtual samples is calculated. In this application, the MMD is used to measure the difference between the overall means of the virtual samples and the real samples, and thereby the difference between their distributions.
Assume that $X_{vir}$ follows the distribution $P_{vir}$, where $N_{vir}$ is the number of samples in one group of virtual samples, and that $X_{real}$ follows the distribution $P_{real}$, where $N_{real}$ is the number of real samples. The supremum of the difference between the expectations of the two sample domains in the reproducing kernel Hilbert space (RKHS), obtained through a high-dimensional mapping function, is:
$$\mathrm{MMD}(X_{vir}, X_{real}) = \sup_{\|\phi\|_H \le 1} \left\| E_p[\phi(X_{vir})] - E_q[\phi(X_{real})] \right\|_H \quad (4)$$
where $H$ is the RKHS, $\phi(\cdot)$ maps samples into the high-dimensional RKHS, $E_p[\phi(X_{vir})]$ and $E_q[\phi(X_{real})]$ are the expectations of the samples mapped into the RKHS, and $\sigma$ is the bandwidth of the Gaussian kernel.
The MMD between each of the $N$ groups of virtual samples $(X_{vir}^{(1)}, Y_{vir}^{(1)}), (X_{vir}^{(2)}, Y_{vir}^{(2)}), \dots, (X_{vir}^{(N)}, Y_{vir}^{(N)})$ and the real samples $(X_{real}, Y_{real})$ is calculated according to formula (4) for preliminary screening, where $(X_{vir}^{(1)}, Y_{vir}^{(1)})$ is the first group of virtual samples, $(X_{vir}^{(2)}, Y_{vir}^{(2)})$ is the second group, and $(X_{vir}^{(N)}, Y_{vir}^{(N)})$ is the $N$th group. The prescreening function is:
$$(X_{vir}^{MMD}, Y_{vir}^{MMD}) = \arg\min_{n \in \{1, \dots, N\}} \mathrm{MMD}\left((X_{vir}^{(n)}, Y_{vir}^{(n)}), (X_{real}, Y_{real})\right) \quad (5)$$
where the minimization selects, among the $N$ groups of virtual samples, the group with the smallest MMD to $(X_{real}, Y_{real})$ as the best-quality prescreened virtual samples $(X_{vir}^{MMD}, Y_{vir}^{MMD})$.
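The prescreening step can be sketched with the standard kernel form of the MMD. This is a hedged example: it uses the biased estimate of the squared MMD with a Gaussian kernel, treats each group as a feature matrix (inputs and labels could be concatenated column-wise first), and all function names, the bandwidth value, and the synthetic data are assumptions.

```python
import numpy as np

def gaussian_kernel(a, b, sigma):
    """Pairwise Gaussian kernel matrix between rows of a and rows of b."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def mmd2(x_vir, x_real, sigma=1.0):
    """Biased estimate of the squared MMD between two sample sets."""
    k_vv = gaussian_kernel(x_vir, x_vir, sigma).mean()
    k_rr = gaussian_kernel(x_real, x_real, sigma).mean()
    k_vr = gaussian_kernel(x_vir, x_real, sigma).mean()
    return k_vv + k_rr - 2.0 * k_vr

def prescreen(groups, x_real, sigma=1.0):
    """Return the index of the group closest to the real samples, plus all scores."""
    scores = [mmd2(g, x_real, sigma) for g in groups]
    return int(np.argmin(scores)), scores

rng = np.random.default_rng(0)
x_real = rng.normal(0.0, 1.0, size=(50, 3))
# Three synthetic "virtual" groups; only the middle one matches the real mean.
groups = [rng.normal(shift, 1.0, size=(60, 3)) for shift in (2.0, 0.0, 1.0)]
best, scores = prescreen(groups, x_real, sigma=2.0)
print(best, [round(s, 3) for s in scores])
```

In this toy setup the group drawn from the same distribution as the real samples receives the smallest score and is the one selected.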
4.2.2 virtual sample visualization Module based on PCA
The DXN samples in this application are high-dimensional, so the distribution of the generated virtual samples is difficult to perceive intuitively. Therefore, this application uses PCA to reduce the virtual samples to 1 dimension for visualization, which provides the overall distribution information $D_{PCA}$.
PCA projects the raw data into a new space through a set of orthogonal vectors, eliminating redundancy in the raw data while retaining the main information. PCA-based virtual sample visualization is implemented as follows.
First, $X_{vir}^{MMD}$ is centered to obtain the centered sample matrix $U \in \mathbb{R}^{M_{vir} \times N_{vir}}$, where $N_{vir}$ is the number of samples and $M_{vir}$ is the sample dimension;
next, the covariance matrix $C$ of $U$ is calculated:
$$C = UU^{T} \quad (6)$$
then, the eigenvectors and eigenvalues of $C$ are calculated by eigendecomposition:
$$C = W \Lambda W^{T} \quad (7)$$
where $W$ is the matrix of eigenvectors and $\Lambda$ is the diagonal matrix of eigenvalues arranged in descending order; the samples are then projected:
$$X_{PCA} = \mu_1^{T} U \quad (8)$$
where $X_{PCA}$ is the virtual sample reduced to 1 dimension and $\mu_1$ is the eigenvector corresponding to the largest eigenvalue.
Finally, the distribution function of $X_{PCA}$ is calculated to obtain the PCA visualization result.
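The projection steps above map directly onto a few lines of NumPy. The sketch below is illustrative: the function name and the synthetic data are assumptions, and the matrix is arranged as features by samples so that $UU^{T}$ is a feature-by-feature matrix.

```python
import numpy as np

def pca_1d(x):
    """Project samples (rows of x) onto the first principal component."""
    u = (x - x.mean(axis=0)).T             # centered, shape (M, N): features x samples
    c = u @ u.T                            # (M, M) scatter matrix, as in formula (6)
    eigvals, eigvecs = np.linalg.eigh(c)   # eigh returns eigenvalues in ascending order
    mu1 = eigvecs[:, -1]                   # eigenvector of the largest eigenvalue
    return mu1 @ u                         # shape (N,), as in formula (8)

rng = np.random.default_rng(0)
# Synthetic data whose first feature has by far the largest spread.
x = rng.normal(size=(200, 5)) * np.array([5.0, 1.0, 1.0, 1.0, 1.0])
x_pca = pca_1d(x)
print(x_pca.shape)
```

Because `numpy.linalg.eigh` sorts eigenvalues in ascending order, the last column is taken rather than the first; the descending order in formula (7) is only a convention.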
4.2.3 visual distribution information discrimination module
The overall distribution information $D_{PCA}$ provided by the PCA visualization result of the virtual samples is:
$$D_{PCA} = (R_{real} \cap R_{vir}) / S_{real} \quad (9)$$
where $R_{real}$ is the region enclosed by the real sample distribution function and the x-axis; $R_{vir}$ is the region enclosed by the virtual sample distribution function and the x-axis; $S_{real}$ is the area of $R_{real}$; and $R_{real} \cap R_{vir}$ denotes the area of the overlap between $R_{real}$ and $R_{vir}$.
The visual distribution information discriminant function proposed in this application is:
$$\phi_{visual}(D_{PCA}) = \begin{cases} 1, & D_{PCA} > \theta_S \\ 0, & \text{otherwise} \end{cases} \quad (10)$$
where $\theta_S$ is an empirically set threshold.
If $\phi_{visual}(D_{PCA}) = 1$, the virtual samples are qualified; otherwise, they are unqualified.
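One way to compute the overlap ratio is to approximate both distribution functions with histograms over shared bins. This is an assumption: the application does not say how the distribution function is estimated. The function names, the bin count, and the synthetic data are hypothetical; the default threshold of 0.8 is the value reported in the experiments.

```python
import numpy as np

def overlap_ratio(real_1d, vir_1d, bins=30):
    """D_PCA = area(R_real ∩ R_vir) / S_real, using histograms on shared bins."""
    lo = min(real_1d.min(), vir_1d.min())
    hi = max(real_1d.max(), vir_1d.max())
    edges = np.linspace(lo, hi, bins + 1)
    width = edges[1] - edges[0]
    p_real, _ = np.histogram(real_1d, bins=edges, density=True)
    p_vir, _ = np.histogram(vir_1d, bins=edges, density=True)
    overlap_area = np.sum(np.minimum(p_real, p_vir)) * width
    s_real = np.sum(p_real) * width        # equals 1 for a normalized density
    return overlap_area / s_real

def phi_visual(d_pca, theta_s=0.8):
    """Discriminant of formula (10): 1 if qualified, 0 otherwise."""
    return 1 if d_pca > theta_s else 0

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=2000)
vir_good = rng.normal(0.0, 1.0, size=2000)   # same distribution as the real samples
vir_bad = rng.normal(3.0, 1.0, size=2000)    # clearly shifted distribution
d_good, d_bad = overlap_ratio(real, vir_good), overlap_ratio(real, vir_bad)
print(phi_visual(d_good), phi_visual(d_bad))
```

A kernel density estimate could replace the histograms; the choice affects how smooth the curves are but not the pass/fail logic.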
4.3 Risk early warning model building module based on mixed samples
The qualified virtual samples $(X_{vir}^{qua}, Y_{vir}^{qua})$ obtained from the judgment are combined with the real samples $(X_{real}, Y_{real})$ to obtain the mixed sample $S_{mix}$.
Random forest (RF) is used as the classifier of the risk early warning model, with the following steps.
First, the Bootstrap algorithm and the RSM algorithm are used to randomly sample the samples and the features of $S_{mix}$, yielding $N$ sub-sample sets;
then, $N$ decision trees are constructed from the $N$ sub-sample sets, and each decision tree produces a classification result.
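The sampling and tree-building steps can be sketched as follows. Several simplifications are assumptions rather than the application's method: each tree is a depth-1 decision tree (a stump), the per-tree results are combined by majority vote (the usual random forest rule, not stated in the text above), the toy data are binary instead of five risk levels, and all names and parameter values are hypothetical.

```python
import numpy as np

def fit_stump(x, y):
    """Depth-1 decision tree: best (feature, threshold) by misclassification count."""
    best = None
    for j in range(x.shape[1]):
        for t in np.unique(x[:, j]):
            left, right = y[x[:, j] <= t], y[x[:, j] > t]
            if len(left) == 0 or len(right) == 0:
                continue
            l_lab, r_lab = np.bincount(left).argmax(), np.bincount(right).argmax()
            err = np.sum(left != l_lab) + np.sum(right != r_lab)
            if best is None or err < best[0]:
                best = (err, j, t, l_lab, r_lab)
    if best is None:                       # no valid split: always predict the majority
        maj = np.bincount(y).argmax()
        return (0, np.inf, maj, maj)
    return best[1:]

def predict_stump(stump, x):
    j, t, l_lab, r_lab = stump
    return np.where(x[:, j] <= t, l_lab, r_lab)

def fit_forest(x, y, n_trees=25, n_sub_features=2, seed=0):
    rng = np.random.default_rng(seed)
    forest = []
    for _ in range(n_trees):
        rows = rng.integers(0, len(x), size=len(x))        # Bootstrap: rows with replacement
        cols = rng.choice(x.shape[1], size=n_sub_features, replace=False)  # RSM: feature subset
        forest.append((cols, fit_stump(x[rows][:, cols], y[rows])))
    return forest

def predict_forest(forest, x):
    votes = np.stack([predict_stump(s, x[:, cols]) for cols, s in forest])  # (trees, samples)
    return np.array([np.bincount(v).argmax() for v in votes.T])            # majority vote

rng = np.random.default_rng(1)
y = np.repeat([0, 1], 40)
x = rng.normal(size=(80, 4)) + np.where(y[:, None] == 1, 2.0, -2.0)
acc = np.mean(predict_forest(fit_forest(x, y), x) == y)
print(acc > 0.9)
```

In practice a library implementation such as scikit-learn's `RandomForestClassifier` would replace the hand-written stumps; the sketch only makes the Bootstrap and RSM sampling explicit.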
Drawings
FIG. 1 Flow diagram of the grate-furnace-based MSWI process
FIG. 2 DXN emission risk early warning model construction strategy based on a GAN with an active learning mechanism
FIG. 3 GAN-based virtual sample generation flow diagram
FIG. 4 Virtual sample evaluation and screening process based on an active learning mechanism
FIG. 5 Relationship between generated virtual sample quality and epoch on the DXN data
FIG. 6 Accuracy of the risk early warning model over 50 runs on the DXN dataset
FIG. 7 Risk early warning experimental results on the DXN data
Detailed Description
The DXN data used in this application come from a grate-furnace MSWI power plant in Beijing and cover 67 valid DXN emission concentration measurement samples recorded from 2012 to 2018. After preprocessing, the raw input features are reduced from 314 dimensions to 120 dimensions. The output DXN emission concentration is divided into 5 risk levels according to the criteria shown in Table 1, where high risk, medium risk, and low risk correspond to sample numbers of 27, 12, 11, and 6. Two thirds of the samples were randomly selected as the training set to build the model, and the remaining third was used to test model performance.
TABLE 1 DXN emissions Risk ratings Standard
For DXN datasets: the hidden layer of the generator adopts a Relu activation function, and the output layer adopts a linear activation function; the discriminator hidden layer adopts Relu activating function, the output layer adopts Sigmoid activating function, and the specific parameter setting is shown in Table 2.
Table 2 DXN dataset virtual sample generation experiment parameter settings
FIG. 5 shows the relationship between virtual sample quality and epoch generated based on DXN data.
As can be seen from FIG. 5, the quality of the generated virtual samples becomes stable once the epoch reaches 1000. Therefore, from training epoch 1100 to 2000, one generator is selected every 100 epochs, giving 10 generators in total; each generator generates 10 groups of virtual samples, and each group contains 60 samples for each of the 5 risk levels, 300 virtual samples in total. From the 10 groups of virtual samples of each of the 10 generators, the group with the lowest MMD to the real samples is selected as the prescreened virtual samples. The experimental results are shown in Table 3.
Table 3 DXN dataset virtual sample primary screening experimental results based on MMD
As can be seen from Table 3, the 4th group of virtual samples generated by the generator from training epoch 2000 has the lowest MMD to the real samples, so this group is selected.
To ensure a clear visualization, 27 low-risk, 12 medium-low-risk, 11 medium-high-risk, and 6 high-risk samples are randomly selected from the 300 virtual samples, and 67 virtual samples are visualized. The threshold is empirically set to 0.8; the distribution information obtained from the visualization is 0.81, which is greater than the threshold. This group of virtual samples is therefore qualified.
A risk early warning model is constructed from the mixed sample consisting of the qualified virtual samples and the real samples; the relevant parameters are shown in Table 4.
TABLE 4 relevant parameters for construction of DXN data set mixed sample risk early warning model
The accuracy of 50 experiments is shown in fig. 6.
As can be seen from fig. 6, the risk early warning model trained by the mixed sample has better performance than the model trained by the real sample.
In addition, a total of 5 sets of comparative experiments were performed, and the relevant parameters are shown in table 5.
TABLE 5 DXN data set construction of relevant parameters based on Risk early warning model of mixed samples
In Table 5, the risk levels are listed in the order high risk, medium-low risk, and low risk. The virtual samples are randomly drawn from the screened virtual samples. In the unbalanced virtual samples and unbalanced mixed samples, the proportion of samples at each risk level is the same as in the real samples; in the balanced virtual samples and balanced mixed samples, the number of samples at each risk level is the same.
Considering the randomness of the RF algorithm, each of the 5 experiments was repeated 50 times. FIG. 7 shows the accuracy of the risk early warning models constructed in experiments A, B, C, D, and E, and Table 6 compares the statistical results.
TABLE 6DXN data test Risk early warning statistics comparison
From the above: 1) the average accuracy with real samples is 48.9091%, with unbalanced virtual samples 48.0909%, and with balanced virtual samples 47.4989%, which shows that the virtual samples generated by this method are very close to the real samples; 2) the average accuracies with the mixed samples are 70.8444% and 78.8085%, improvements of 44% and 59% over the case without virtual samples, which shows that adding virtual samples helps improve model performance; 3) the average accuracy with the balanced mixed samples is 11% higher than with the unbalanced mixed samples, which shows that balanced data give a better modeling effect than unbalanced data; 4) the standard deviation of the accuracy with mixed samples is lower than with real samples, which indicates that adding virtual samples helps improve model stability.
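The relative improvements can be recomputed from the reported averages. The sketch assumes that the baseline is the real-sample average accuracy of 48.9091% and that "improvement" means relative gain; neither is stated explicitly, and the variable and function names are hypothetical. Under these assumptions the first and third figures agree with the stated 44% and 11%, while the second comes out near 61%, so the stated 59% may rest on a different baseline or rounding.

```python
# Average accuracies (%) as reported in the text.
real, mixed_unbalanced, mixed_balanced = 48.9091, 70.8444, 78.8085

def relative_gain(new, base):
    """Relative improvement of `new` over `base`, in percent."""
    return (new - base) / base * 100.0

gain_unbalanced = relative_gain(mixed_unbalanced, real)          # mixed (unbalanced) vs. real
gain_balanced = relative_gain(mixed_balanced, real)              # mixed (balanced) vs. real
gain_balancing = relative_gain(mixed_balanced, mixed_unbalanced) # balanced vs. unbalanced mix
print(f"{gain_unbalanced:.1f} {gain_balanced:.1f} {gain_balancing:.1f}")
```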
This application provides a visual-distribution-GAN-based method for early warning of DXN emission risk in the MSWI process. Its innovations are as follows: 1) it is the first to propose a DXN emission concentration risk early warning strategy based on a GAN and visual distribution; 2) the GAN-based virtual sample generation (VSG) method can generate virtual samples of a specified class through conditional information, effectively expanding the number of samples and filling information gaps in the real samples; 3) the virtual sample evaluation and screening method based on visual distribution information prescreens the virtual samples with the MMD and then judges the distribution information provided by the visualization of the prescreened virtual samples; the samples that pass are qualified virtual samples whose quality is closer to that of the real samples. The effectiveness of the proposed strategy and method is verified on industrial DXN data. Future research directions include how to handle high-dimensional and discrete process data, and how to make the generator and the discriminator more stable during the adversarial game so as to obtain virtual samples of better quality.
Claims (1)
1. MSWI process dioxin emission risk early warning method based on visual distribution GAN, its characterized in that:
the real sample inputs and the corresponding outputs are denoted $X_{real}$ and $Y_{real}$, respectively; random noise is denoted $X_{noise}$; the virtual samples generated by the GAN generator consist of a set of virtual sample inputs and the corresponding set of virtual sample outputs; the virtual samples that pass the MMD prescreening consist of a prescreened virtual sample input set and the corresponding prescreened virtual sample output set; the visual distribution information is denoted $D_{PCA}$; the qualified virtual samples obtained by judging the visual distribution information consist of a set of qualified virtual sample inputs and the corresponding set of qualified virtual sample outputs; and the constructed risk early warning model outputs a risk category;
1) Virtual sample generation module based on GAN
The virtual sample generation process is as follows: first, $X_{noise}$ and $Y_{real}$ are jointly input to the generator to produce the virtual sample inputs; next, $X_{real}$, the virtual sample inputs, and $Y_{real}$ are input to the discriminator, and the generator and the discriminator are updated according to the discrimination result $Y_{real/vir}$; then, $X_{noise}$ and the desired DXN emission risk levels are input to the trained generator to produce the corresponding virtual sample inputs; finally, these inputs and their risk levels are combined to obtain the virtual samples.
The number of training samples in each batch is set to $N_b$, the learning rate to $\alpha_{lr}$, and the maximum number of training epochs to $N_e$; the generator is a three-layer neural network whose hidden layer uses the ReLU activation function and whose output layer uses a linear activation function, as follows:
wherein $\omega_{G1}$ is the weight between the input layer and the hidden layer of the generator; $b_{G1}$ is the bias between the input layer and the hidden layer of the generator; the ReLU activation function is $\mathrm{Relu}(x) = \max(0, x)$, where $x$ is an arbitrary input value; the result of this activation is the hidden-layer output of the generator; $\omega_{G2}$ is the weight between the hidden layer and the output layer of the generator; and $b_{G2}$ is the bias between the hidden layer and the output layer of the generator;
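The generator described above can be sketched as a plain forward pass. This is a minimal illustration under stated assumptions, not the patented implementation: the layer sizes, the random initialisation, the one-hot encoding of the risk level, and the choice to concatenate the noise vector with the label are all hypothetical, since the text only says the two are input together.

```python
import random

def relu(v):
    """Element-wise Relu(x) = max(0, x)."""
    return [max(0.0, x) for x in v]

def affine(weights, bias, v):
    """Compute weights @ v + bias for a row-major weight matrix."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]

def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (the initialisation is an assumption)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def generator_forward(x_noise, y_label, params):
    """Three-layer generator: input -> ReLU hidden layer -> linear output layer."""
    w_g1, b_g1, w_g2, b_g2 = params
    # Noise and condition label are concatenated (assumed way of feeding both in).
    hidden = relu(affine(w_g1, b_g1, x_noise + y_label))
    return affine(w_g2, b_g2, hidden)  # linear activation: no squashing

rng = random.Random(0)
noise_dim, label_dim, hidden_dim, feature_dim = 4, 3, 8, 5  # hypothetical sizes
w_g1, b_g1 = init_layer(hidden_dim, noise_dim + label_dim, rng)
w_g2, b_g2 = init_layer(feature_dim, hidden_dim, rng)

x_noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
y_label = [0.0, 1.0, 0.0]  # one-hot risk level (assumed encoding)
x_vir = generator_forward(x_noise, y_label, (w_g1, b_g1, w_g2, b_g2))
```

The returned `x_vir` plays the role of one virtual sample input; pairing it with `y_label` would give one virtual sample.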
the discriminator is a three-layer neural network whose hidden layer uses the ReLU activation function and whose output layer uses the Sigmoid activation function, as follows:
wherein the discriminator input is a mixed sample composed of the virtual samples and $(X_{real}, Y_{real})$; $\omega_{D1}$ is the weight between the input layer and the hidden layer of the discriminator; $b_{D1}$ is the bias between the input layer and the hidden layer of the discriminator; the result of the ReLU activation is the hidden-layer output of the discriminator; $\omega_{D2}$ is the weight between the hidden layer and the output layer of the discriminator; $b_{D2}$ is the bias between the hidden layer and the output layer of the discriminator; and the Sigmoid activation function is $\mathrm{Sigmoid}(x) = 1/(1 + e^{-x})$, where $x$ is any input value;
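A corresponding sketch of the discriminator forward pass follows. It assumes a single output unit read as the probability that a sample is real, and that the sample and its label are concatenated at the input; the hand-set parameter values are illustrative only and are not taken from the source.

```python
import math

def relu(v):
    """Element-wise Relu(x) = max(0, x)."""
    return [max(0.0, x) for x in v]

def sigmoid(x):
    """Sigmoid(x) = 1 / (1 + exp(-x)), arranged to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def affine(weights, bias, v):
    """Compute weights @ v + bias for a row-major weight matrix."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]

def discriminator_forward(x_sample, y_label, params):
    """Three-layer discriminator: input -> ReLU hidden layer -> sigmoid output."""
    w_d1, b_d1, w_d2, b_d2 = params
    hidden = relu(affine(w_d1, b_d1, x_sample + y_label))
    logit = affine(w_d2, b_d2, hidden)[0]
    return sigmoid(logit)  # interpreted here as P(sample is real), an assumption

# One hidden unit and hand-set weights so the result is easy to verify by hand.
params = ([[1.0, -1.0, 0.5]], [0.0], [[2.0]], [-1.0])
p_real = discriminator_forward([1.0, 0.5], [1.0], params)
```

With these values the hidden unit evaluates to 1.0 and the output logit to 1.0, so `p_real` equals `sigmoid(1.0)`.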
the objective function $O_{GAN}$ is given by formula (3):
wherein $P_{data}(X_{real})$ denotes the distribution of $X_{real}$; the first discriminator term is the discriminator output for $(X_{real}, Y_{real})$; $P_{noise}(X_{noise})$ denotes the distribution of $X_{noise}$; and the second discriminator term is the discriminator output for the generated virtual samples;
the discriminator estimates whether a sample comes from $P_{noise}(X_{noise})$ or from $P_{data}(X_{real})$; guided by the discriminator's result, the generator learns the real sample distribution $P_{data}(X_{real})$ so as to reduce the objective, while the discriminator seeks to increase it; the generator and the discriminator are thus trained together in a minimax adversarial game;
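The minimax game can be illustrated numerically. The sketch below assumes the standard conditional GAN value function, the mean of log D(real) plus the mean of log(1 - D(virtual)), which matches the term-by-term definitions given for formula (3) but is not confirmed by the source as its exact form. The function name and the clipping constant `eps` are hypothetical.

```python
import math

def gan_value(d_real, d_fake, eps=1e-12):
    """Empirical value: mean log D(real) + mean log(1 - D(virtual)).

    d_real: discriminator outputs on real samples, each in (0, 1)
    d_fake: discriminator outputs on generated samples, each in (0, 1)
    """
    real_term = sum(math.log(max(p, eps)) for p in d_real) / len(d_real)
    fake_term = sum(math.log(max(1.0 - p, eps)) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# A discriminator that cannot tell the two sources apart outputs 0.5 everywhere.
confused = gan_value([0.5, 0.5], [0.5, 0.5])

# A discriminator that separates them well pushes the value towards its maximum of 0.
sharp = gan_value([0.99, 0.98], [0.01, 0.02])
```

The discriminator update moves the value from `confused` towards `sharp`; the generator update moves it back by making virtual samples harder to distinguish.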
2) Virtual sample screening and evaluation module based on visual distribution information
Virtual sample prescreening module based on MMD
first, multiple generators are used to generate multiple groups of virtual samples;
then, the quality of each group of virtual samples is computed: the MMD measures the difference between the overall means of the virtual samples and the real samples, and hence the difference between their distributions;
assume that the virtual samples of one group follow the virtual sample distribution, with the group containing a given number of virtual samples, and that the real samples follow the distribution $P_{real}$, where $N_{real}$ is the number of real samples; the supremum of the difference between the expectations of the two sample domains in the reproducing kernel Hilbert space (RKHS) is obtained through a high-dimensional mapping function, namely:
where $H$ is the RKHS; $\phi(\cdot)$ maps samples into the high-dimensional RKHS; the two expectation terms, including $E_q[\phi(X_{real})]$, are the expected values of the samples mapped into the RKHS; and $\sigma$ is the bandwidth of the Gaussian kernel;
the MMD between each of the $N$ groups of virtual samples and the real samples $(X_{real}, Y_{real})$ is calculated according to formula (4) in order to prescreen the groups, which are indexed from the first to the $N$-th; the prescreening function is as follows:
wherein $\min(\cdot)$ selects, among the $N$ groups of virtual samples, the group with the smallest MMD value with respect to $(X_{real}, Y_{real})$; this group is taken as the best-quality prescreened virtual sample;
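The prescreening step can be sketched as follows. The code assumes the usual biased empirical estimate of the squared MMD with a Gaussian kernel, and treats each sample as one vector of inputs concatenated with its output; the function names, the toy data, and the default bandwidth are hypothetical.

```python
import math

def gaussian_kernel(a, b, sigma):
    """k(a, b) = exp(-||a - b||^2 / (2 * sigma^2))."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def mean_kernel(xs, ys, sigma):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(gaussian_kernel(x, y, sigma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))

def mmd_squared(virtual, real, sigma=1.0):
    """Biased empirical estimate of the squared MMD between two sample sets."""
    return (mean_kernel(virtual, virtual, sigma)
            - 2.0 * mean_kernel(virtual, real, sigma)
            + mean_kernel(real, real, sigma))

def prescreen(groups, real, sigma=1.0):
    """Return (index, score) of the virtual group closest to the real samples."""
    scores = [mmd_squared(group, real, sigma) for group in groups]
    best = min(range(len(groups)), key=scores.__getitem__)
    return best, scores[best]

real = [[0.0, 0.0], [1.0, 1.0], [0.5, 0.2]]
groups = [
    [[5.0, 5.0], [6.0, 5.5], [5.5, 6.0]],  # far from the real samples
    [[0.1, 0.0], [0.9, 1.1], [0.4, 0.3]],  # close to the real samples
]
best_index, best_score = prescreen(groups, real)
```

Here the second group is selected because its kernel mean embedding lies closest to that of the real samples.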
Virtual sample visualization module based on PCA
PCA is used to reduce the virtual DXN samples to one dimension for visualization, which provides overall distribution information;
PCA projects the original data into a new space through a set of orthogonal vectors, eliminating redundancy in the original data while retaining the main information; the PCA-based virtual sample visualization is implemented as follows:
first, the samples are centered to obtain the centered sample matrix $U$, whose size is determined by the number of samples and the sample dimension;
next, the covariance matrix $C$ of $U$ is calculated:
$C = UU^{T}$ (6)
then, the eigenvectors and eigenvalues of $C$ are computed by eigendecomposition:
$C = W \Lambda W^{T}$ (7)
wherein $W$ is the matrix formed by the eigenvectors and $\Lambda$ is the diagonal matrix of eigenvalues arranged in descending order; the samples are then projected onto the leading eigenvector:
$X_{PCA} = \mu_1 U$ (8)
where $X_{PCA}$ is the virtual sample reduced to one dimension and $\mu_1$ is the eigenvector corresponding to the largest eigenvalue;
finally, the distribution function of $X_{PCA}$ is calculated to obtain the PCA visualization result;
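The reduction to one dimension can be sketched without external libraries. Note the substitutions: instead of a full eigendecomposition, the code uses power iteration, which yields only the eigenvector of the largest eigenvalue (all that a one-dimensional projection needs), and it stores samples as rows, so the scatter matrix is built as the transpose-convention counterpart of formula (6). The function names and toy data are hypothetical.

```python
import math

def center(rows):
    """Subtract the per-feature mean from every sample (rows are samples)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def scatter_matrix(centered):
    """d x d scatter matrix; a constant scaling would not change its eigenvectors."""
    d = len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) for j in range(d)] for i in range(d)]

def leading_eigenvector(matrix, iterations=500):
    """Power iteration; the start vector must not be orthogonal to the answer."""
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # all-zero matrix: keep the current direction
            break
        v = [x / norm for x in w]
    return v

def pca_to_1d(rows):
    """Project centered samples onto the leading eigenvector."""
    centered = center(rows)
    mu_1 = leading_eigenvector(scatter_matrix(centered))
    return [sum(m * x for m, x in zip(mu_1, r)) for r in centered]

samples = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]
x_pca = pca_to_1d(samples)
```

The values in `x_pca` are the one-dimensional coordinates whose distribution function is then estimated for the visualization.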
Visual distribution information discrimination module
The overall distribution information $D_{PCA}$ provided by the PCA visualization of the virtual samples is:
$D_{PCA} = (R_{real} \cap R_{vir}) / S_{real}$ (9)
wherein $R_{real}$ is the region enclosed by the real sample distribution function and the x-axis; $R_{vir}$ is the region enclosed by the virtual sample distribution function and the x-axis; $S_{real}$ is the area of the region enclosed by the real sample distribution function and the x-axis; and $R_{real} \cap R_{vir}$ denotes the area of the overlap between $R_{real}$ and $R_{vir}$;
the visual distribution information discriminant function is as follows:
wherein $\theta_S$ is the preset threshold, taken as 0.8;
if $\phi_{visual}(D_{PCA}) = 1$, the virtual sample is a qualified virtual sample; otherwise, it is an unqualified virtual sample;
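The overlap ratio of formula (9) and the threshold test can be sketched as below. Several choices are assumptions rather than details from the source: the distribution functions are estimated with equal-width histograms on a shared range, the number of bins is arbitrary, and the discriminant is taken to return 1 when $D_{PCA}$ is greater than or equal to the threshold.

```python
def histogram_density(values, lo, hi, bins):
    """Piecewise-constant density estimate on [lo, hi] with equal-width bins."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        index = min(int((v - lo) / width), bins - 1)  # v == hi goes in the last bin
        counts[index] += 1
    return [c / (len(values) * width) for c in counts], width

def overlap_ratio(real_1d, virtual_1d, bins=10):
    """D_PCA: overlapping area of the two densities divided by the real area."""
    lo = min(min(real_1d), min(virtual_1d))
    hi = max(max(real_1d), max(virtual_1d))
    if hi == lo:  # every value identical: the two curves coincide
        return 1.0
    f_real, width = histogram_density(real_1d, lo, hi, bins)
    f_vir, _ = histogram_density(virtual_1d, lo, hi, bins)
    overlap = sum(min(a, b) for a, b in zip(f_real, f_vir)) * width
    s_real = sum(f_real) * width
    return overlap / s_real

def phi_visual(d_pca, theta_s=0.8):
    """Return 1 for a qualified virtual sample set, 0 otherwise (>= is assumed)."""
    return 1 if d_pca >= theta_s else 0

real_1d = [0.0, 0.5, 1.0]
same = overlap_ratio(real_1d, real_1d)              # identical distributions
apart = overlap_ratio(real_1d, [9.0, 9.5, 10.0])    # no shared support
```

Identical projections give a ratio of 1 and pass the test, while projections with no shared support give a ratio of 0 and are rejected.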
3) Risk early warning model building module based on mixed samples
the qualified virtual samples obtained by the judgment are combined with the real samples $(X_{real}, Y_{real})$ to obtain the mixed sample set $S_{mix}$;
a random forest is used as the classifier of the risk early warning model, with the following steps:
first, the Bootstrap algorithm and the RSM (random subspace method) algorithm are used to randomly sample the samples and the features of $S_{mix}$, yielding $N$ sub-sample sets;
then, $N$ decision trees are constructed from the $N$ sub-sample sets, each decision tree producing a classification result;
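The sampling step can be sketched as follows: each tree receives a bootstrap sample of the rows of the mixed set and a random subset of its features. The majority vote at the end is the usual random forest aggregation and is an assumption here, since the claim text above stops at the per-tree classification results. All names, the toy data, and the risk labels are hypothetical.

```python
import random
from collections import Counter

def bootstrap_rsm_subsets(samples, labels, n_trees, n_features, seed=0):
    """Draw one bootstrap sample and one random feature subset per tree."""
    rng = random.Random(seed)
    n_samples, total_features = len(samples), len(samples[0])
    subsets = []
    for _ in range(n_trees):
        # Bootstrap: sample row indices with replacement.
        rows = [rng.randrange(n_samples) for _ in range(n_samples)]
        # Random subspace: sample feature indices without replacement.
        cols = sorted(rng.sample(range(total_features), n_features))
        x_sub = [[samples[r][c] for c in cols] for r in rows]
        y_sub = [labels[r] for r in rows]
        subsets.append((x_sub, y_sub, cols))
    return subsets

def majority_vote(tree_predictions):
    """Combine the per-tree class predictions for one sample by majority vote."""
    return Counter(tree_predictions).most_common(1)[0][0]

s_mix_x = [[0.1, 1.0, 5.0], [0.2, 1.1, 4.0], [0.9, 2.0, 1.0], [1.0, 2.2, 0.5]]
s_mix_y = ["low", "low", "high", "high"]
subsets = bootstrap_rsm_subsets(s_mix_x, s_mix_y, n_trees=5, n_features=2)
final_label = majority_vote(["high", "low", "high"])
```

Each tuple in `subsets` would be used to fit one decision tree, and the stored column indices `cols` tell that tree which features to read at prediction time.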
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111539001.XA CN114266461A (en) | 2021-12-15 | 2021-12-15 | MSWI process dioxin emission risk early warning method based on visual distribution GAN |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114266461A true CN114266461A (en) | 2022-04-01 |
Family
ID=80827448
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111539001.XA Pending CN114266461A (en) | 2021-12-15 | 2021-12-15 | MSWI process dioxin emission risk early warning method based on visual distribution GAN |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114266461A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023222138A1 (en) * | 2022-05-16 | 2023-11-23 | 北京工业大学 | Dioxin emission risk early warning model construction method based on fnn adversarial generation |
CN115329659A (en) * | 2022-07-18 | 2022-11-11 | 浙江大学 | Method and system for real-time early warning and intelligent control of dioxin emission in waste incinerator |
CN115329659B (en) * | 2022-07-18 | 2023-07-14 | 浙江大学 | Method and system for real-time early warning and intelligent control of dioxin emission of waste incinerator |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||