CN110518580A

CN110518580A - Active power distribution network operation optimization method considering micro-grid active optimization

Info

Publication number: CN110518580A
Application number: CN201910754026.8A
Authority: CN
Inventors: 龚锦霞
Original assignee: Shanghai University of Electric Power
Current assignee: Shanghai University of Electric Power
Priority date: 2019-08-15
Filing date: 2019-08-15
Publication date: 2019-11-29
Anticipated expiration: 2039-08-15
Also published as: CN110518580B

Abstract

The invention relates to an active power distribution network operation optimization method that considers the active optimization of microgrids, comprising the following steps: S1, establishing a microgrid independent optimization model; S2, establishing an active power distribution network operation optimization model that incorporates the microgrid independent optimization model; S3, solving the operation optimization model of the active power distribution network with the Deep Deterministic Policy Gradient (DDPG) method to obtain an optimal operation solution of the active power distribution network; and S4, outputting the optimal operation solution to the power flow control center and the microgrid control centers of the active power distribution network, respectively, so as to optimize the operating state of the active power distribution network. Compared with the prior art, the invention establishes an active power distribution network optimization model that accounts for the active optimization of multiple microgrids and solves it with the DDPG method, which addresses the multi-control-center optimization problem and allows the active power distribution network to be optimized in real time.

Description

Active power distribution network operation optimization method considering micro-grid active optimization

Technical Field

The invention relates to the technical field of operation optimization of a power distribution network, in particular to an active power distribution network operation optimization method considering active optimization of a microgrid.

Background

In an Active Distribution Network (ADN), Distributed Generation (DG) is generally connected to the distribution network in the form of multiple Microgrids (MGs), and the DG is controlled and managed hierarchically to implement power flow management and voltage control of the distribution network. Connecting multiple microgrids to the ADN can therefore alleviate, to a certain extent, the problems caused by a high proportion of randomly fluctuating DG, such as node voltage limit violations, increased network loss and power fluctuation on the distribution network tie line.

At present, the distributed power sources in a distribution network are owned by the distribution network itself, and their output power is in essence scheduled uniformly by the distribution network, usually through a centralized optimization model. Such a model considers neither the optimization targets of the DG owners nor their active generation and consumption behavior as active loads. Existing research on the optimal scheduling of active power distribution networks does not build on the internal optimization of the microgrids, even though the active optimization of generation and consumption by DG owners deserves attention as the smart grid develops.

Because the active power distribution network contains multiple independently optimized microgrids, its operation optimization is in fact a multi-control-center optimization problem. On one hand, the optimization model that accounts for microgrid activity is unknown, so analytic methods based on deterministic modeling cannot be used to solve it. On the other hand, from a data processing point of view, the microgrids, their loads and their sources all vary and cannot be obtained accurately; that is, the operation of a microgrid is equivalent to a black box. In recent years, Deep Reinforcement Learning (DRL) has attracted attention from many researchers for its data processing capability, representation capability and generalization performance, and related techniques have been widely studied and applied in both academia and industry. Learning-based methods have also been tried in power grids: for example, a hierarchical multi-agent AGC regulation system based on Q-learning, a deep learning method based on a layer-by-layer coding network for judging the fault state of a wind turbine main bearing, and a generator tripping control strategy based on deep reinforcement learning. These methods can obtain the best of a limited number of schemes under different grid operating environments, but their learning process and optimization strategies are discrete. Existing research has not yet applied continuous-space deep learning methods to the operation optimization of the active power distribution network, so the active power distribution network cannot be optimized effectively in real time.

Disclosure of Invention

The invention aims to overcome the defects of the prior art by providing an active power distribution network operation optimization method that considers microgrid active optimization. A self-learning solution model that accounts for the independent optimization of the microgrids is established. From the perspective of data processing, the Deep Deterministic Policy Gradient (DDPG) algorithm, which combines a deterministic policy with deep learning, is adopted. Relying on the accuracy, high performance and convergence of DDPG as a nonlinear function approximator, the method analyzes grid operation data and learns the optimal behavior policy in a continuous action environment, thereby meeting the real-time optimization requirement of the active power distribution network.

The purpose of the invention can be realized by the following technical scheme: an active power distribution network operation optimization method considering microgrid active optimization comprises the following steps:

S1, establishing a microgrid independent optimization model;

S2, establishing an active power distribution network operation optimization model that incorporates the microgrid independent optimization model;

S3, solving the operation optimization model of the active power distribution network with the DDPG method to obtain an optimal operation solution of the active power distribution network;

S4, outputting the optimal operation solution of the active power distribution network to the power flow control center and the microgrid control centers of the active power distribution network, respectively, so as to optimize the operating state of the active power distribution network.

Preferably, the independent optimization model of the microgrid in step S1 is:

$\min\ f_{n,1}(X_n, X_{g,n}), \ldots, f_{n,m}(X_n, X_{g,n})$

where $f_{n,m}(\cdot)$ is the $m$-th optimization objective of microgrid $n$; $m$ is the number of optimization objectives of microgrid $n$; $G_n(\cdot)$ is the equality constraint of microgrid $n$; $H_n(\cdot)$ is the inequality constraint of microgrid $n$; $X_n$ is the independent optimization variable of the control center of microgrid $n$, whose value does not depend on the optimization variables of the other microgrid control centers; $X_{n,\min}$ and $X_{n,\max}$ are the minimum and maximum of $X_n$, so that $X_{n,\min} \le X_n \le X_{n,\max}$; and $X_{g,n}$ is the state variable of the control center of microgrid $n$.

Preferably, the operation optimization model of the active power distribution network in step S2 is:

where $F_t$ is the optimization target of the active power distribution network; $t$ is the time; $G_d(\cdot)$ and $H_d(\cdot)$ are the equality and inequality constraints of the active power distribution network; $X_d$ is the optimization variable of the active power distribution network; $X_{d,\min}$ and $X_{d,\max}$ are the minimum and maximum of $X_d$; $W$ is the total number of nodes of the active power distribution network; $M$ is the number of microgrids contained in the active power distribution network; $\omega_1$, $\omega_2$, $\omega_3$, $\omega_4$ are proportionality coefficients; $u_{t,j}$ is the actual voltage of node $j$ in the active power distribution network, measured against the rated voltage of node $j$; $P_{t,loss}$ is the transmission loss of the entire active power distribution network; $P_{t,tie}$ is the actual power exchanged between the microgrid and the upper-level grid, measured against their rated exchange power; and $P_{t,k}$ is the active power output by microgrid $k$, measured against the active power output obtained by the independent optimization of microgrid $k$.

Preferably, the step S3 specifically includes the following steps:

S31, collecting historical characteristic values from the operation data of the active power distribution network, wherein the characteristic values comprise state information $s$, action information $a$ and return values $r$;

S32, forming sample units from the historical characteristic values and storing them in a data pool;

S33, based on experience replay, resampling Y groups of sample units from the data pool and storing them in an experience pool;

S34, inputting the resampled sample units into a deep neural network, and obtaining the optimal solution for the operation of the active power distribution network by training the deep neural network.

Preferably, the state information in step S31 includes the predicted output of the independent sources and loads in the active power distribution network, the optimized electricity purchase and sale between the microgrids and the active power distribution network, the upper and lower output limits of the microgrids, the generation cost parameters of the microgrids, the load demand of the active power distribution network, the power constraints of the lines, the tie-line power and its constraints, and the node voltage values and their constraints;

the action information is the output power of each microgrid in the active power distribution network and the tie-line power;

the formula for calculating the return value is:

where $x_{t,i}$ is the quantity bounded by the $i$-th inequality constraint in the active power distribution network operation optimization model, $c_{t,\max,i}$ and $c_{t,\min,i}$ are the upper and lower limits of $x_{t,i}$, respectively, and $I$ is the number of inequality constraints in the active power distribution network operation optimization model.
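The return value formula itself is not reproduced in this text. As a rough sketch only, one common way to fold bound constraints into a return value is to negate the objective and subtract a penalty for every violated bound. The function name `return_value`, the linear penalty and the default weight `penalty` are assumptions made for illustration and are not taken from the patent.

```python
from typing import Sequence


def return_value(
    objective: float,
    x: Sequence[float],
    c_min: Sequence[float],
    c_max: Sequence[float],
    penalty: float = 10.0,
) -> float:
    """Illustrative return value: negative objective minus bound-violation penalties.

    objective -- value of F_t for the evaluated action (lower is better)
    x         -- constrained quantities x_{t,i}, i = 1..I
    c_min     -- lower limits c_{t,min,i}
    c_max     -- upper limits c_{t,max,i}
    penalty   -- assumed weight per unit of violation (hypothetical)
    """
    if not (len(x) == len(c_min) == len(c_max)):
        raise ValueError("x, c_min and c_max must have the same length")
    violation = 0.0
    for xi, lo, hi in zip(x, c_min, c_max):
        # Distance outside [lo, hi]; zero when the constraint holds.
        violation += max(lo - xi, 0.0) + max(xi - hi, 0.0)
    return -objective - penalty * violation
```

With this shape, an action that keeps every $x_{t,i}$ inside its limits is rewarded purely by the objective, while any violation lowers the return in proportion to its size.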

Preferably, each sample unit consists of the state information $s_{t+1}$ at the next moment, the state information $s_t$ at the current moment, the action information $a_t$ at the current moment, and the return value $r_t$ at the current moment.
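The sample unit and the data pool of steps S32 and S33 can be sketched as below. This is a minimal illustration under stated assumptions: the names `SampleUnit` and `DataPool` are hypothetical, the pool is a fixed-capacity queue, and resampling is uniform at random, which is a simplification and not the resampling scheme used in the embodiment.

```python
import random
from collections import deque
from typing import Deque, List, NamedTuple, Optional, Sequence


class SampleUnit(NamedTuple):
    """One sample unit (s_t, a_t, r_t, s_{t+1})."""
    state: Sequence[float]
    action: Sequence[float]
    reward: float
    next_state: Sequence[float]


class DataPool:
    """Fixed-capacity pool; the oldest units are dropped when it is full."""

    def __init__(self, capacity: int) -> None:
        self._units: Deque[SampleUnit] = deque(maxlen=capacity)

    def store(self, unit: SampleUnit) -> None:
        self._units.append(unit)

    def __len__(self) -> int:
        return len(self._units)

    def resample(self, y: int, seed: Optional[int] = None) -> List[SampleUnit]:
        """Draw up to y distinct units uniformly at random (assumed scheme)."""
        rng = random.Random(seed)
        return rng.sample(list(self._units), min(y, len(self._units)))
```

The list returned by `resample` plays the role of the experience pool that is fed to the deep neural network in step S34.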

Preferably, the step S34 specifically includes the following steps:

S341, inputting the resampled sample units into the target network to obtain the action information estimate $a'_{t+1}$ of the next moment and the target Q value $Q'_{t+1}$;

S342, inputting the action information estimate $a'_{t+1}$ of the next moment and the target Q value $Q'_{t+1}$ into the main network, and updating the parameters of the main network to obtain the expected return value $r_t'$;

S343, inputting the expected return value $r_t'$ into the target network and updating the parameters of the target network to obtain the action information $a_{t+1}$ of the next moment, which is the optimal solution for the operation of the active power distribution network.

Preferably, the main network comprises a main action network and a main evaluation network, and the target network comprises a target action network and a target evaluation network. Each of the four networks has its own parameters: the parameters of the main action network and the main evaluation network are the main network parameters, and the parameters of the target action network and the target evaluation network are the target network parameters.

Preferably, the specific process of updating the main network parameter and the target network parameter is as follows:

The objective function of the action network is set as follows:

where $J_t(\theta)$ is the discounted cumulative expectation of the objective function, $\theta$ is the parameter of the neural network, $\pi(\theta)$ is the deep learning network determined by $\theta$, $U_t(s, a)$ is the long-term return at time $t$ when action $a$ is taken in state $s$, and $\gamma$ is the discount factor;

The Q value of the evaluation network is as follows:

$0 < \lambda \le 1$

where the target Q value is evaluated at state $s_{t+1}$, $r_t'$ is the expected return value when the state changes from $s_t$ to $s_{t+1}$ after action $a$ is taken, and $\lambda$ is a proportionality coefficient;

The target Q value $Q_{Target}$ and the original Q value correspond to the result (label) value and the output value of a supervised-learning neural network, so the loss function for training the evaluation network is constructed as follows:

Because the gradient of the objective function with respect to the action network parameters is equivalent to the gradient of the Q-value function with respect to those parameters, the action network is updated with the Q value as follows:

To avoid the policy divergence that changes in the network parameters may cause, the target network parameters are updated with the following formula:

$0 < \eta < 1$

where $\eta$ is a divergence factor.
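The update formulas are not reproduced in this text, so the sketch below falls back on the forms commonly used with DDPG: a discounted long-term return, a bootstrapped target Q value, a mean squared error loss for the evaluation network, and a soft update of the target parameters. How $\lambda$ and $\eta$ enter the expressions, and all function names, are assumptions made for illustration. The action network gradient step is omitted because it needs an automatic differentiation library.

```python
from typing import List, Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Long-term return U_t = sum over k of gamma**k * r_{t+k}."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def target_q(expected_reward: float, next_target_q: float, lam: float) -> float:
    """Assumed form Q_Target = r_t' + lambda * Q'(s_{t+1}), with 0 < lambda <= 1."""
    if not 0.0 < lam <= 1.0:
        raise ValueError("lambda must satisfy 0 < lambda <= 1")
    return expected_reward + lam * next_target_q


def critic_loss(targets: Sequence[float], q_values: Sequence[float]) -> float:
    """Mean squared error between target Q values and evaluation network outputs."""
    if len(targets) != len(q_values) or not targets:
        raise ValueError("targets and q_values must be non-empty and equally long")
    return sum((y - q) ** 2 for y, q in zip(targets, q_values)) / len(targets)


def soft_update(
    target_params: Sequence[float], main_params: Sequence[float], eta: float
) -> List[float]:
    """Assumed form theta' <- eta * theta + (1 - eta) * theta', with 0 < eta < 1."""
    if not 0.0 < eta < 1.0:
        raise ValueError("eta must satisfy 0 < eta < 1")
    return [eta * m + (1.0 - eta) * t for t, m in zip(target_params, main_params)]
```

A small $\eta$ makes the target network track the main network slowly, which is the usual way to keep the bootstrapped target from chasing its own updates.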

Compared with the prior art, the invention has the following advantages:

According to the method, an active power distribution network optimization model that considers the active optimization of the microgrids is established and makes full use of the active optimization characteristics of the microgrids. The operation of the active power distribution network can thereby minimize the total node voltage deviation and the line loss, while the microgrid power adjustment is reduced and the tie-line power balance is maintained to limit the impact on the active power distribution network. This achieves overall optimization of the active power distribution network and reflects fairness among the multiple microgrid control centers in the optimized operation.

The method is data-driven. It uses the DDPG method, which has self-learning capability, to extract and estimate features, and it can process the operating information of the active power distribution network in real time while the optimization variables change continuously. It directly provides a high-return optimization scheme and can therefore optimize the active power distribution network in real time. It also avoids a problem of model-based methods, in which imperfect modeling and inaccurate model parameters make the actual optimization effect far worse than the theoretical calculation, and so improves the effectiveness of the optimization.

Drawings

FIG. 1 is a schematic flow diagram of the method of the present invention;

FIG. 2 is a diagram of the example node system of the embodiment;

FIG. 3 shows the microgrid load prediction curves of the embodiment;

FIG. 4 shows the total load curve of the active power distribution network of the embodiment;

FIG. 5 is a schematic flow chart of the solution by the DDPG method according to the present invention;

FIG. 6 compares the costs of the active power distribution network for the four optimization results of the embodiment;

FIG. 7 compares the tie-line power of the active power distribution network for the four optimization results of the embodiment;

FIG. 8 compares the voltages at node 1 for the four optimization results of the embodiment;

FIG. 9 compares the voltages at node 5 for the four optimization results of the embodiment;

FIG. 10 compares the network loss of the active power distribution network for the four optimization results of the embodiment;

FIG. 11 compares the equivalent loads of the active power distribution network for the four optimization results of the embodiment.

Detailed Description

The invention is described in detail below with reference to the figures and specific embodiments.

As shown in fig. 1, an active power distribution network operation optimization method considering microgrid active optimization includes the following steps:

S1, establishing a microgrid independent optimization model;

In this embodiment, an IEEE 14-node system is adopted. The topology of the active power distribution network system is shown in FIG. 2, and the node, line and load parameters of the IEEE 14-node system all use the standard data. Sample data are produced from simulation data to build a sample database. The simulation example model shown in FIG. 2 is built in MATLAB Simulink, four different microgrids are connected to nodes 2, 3, 6 and 8, respectively, and the capacity configuration of the microgrids is shown in Table 1:

TABLE 1

The scheduling time scale is 15 min, so 96 optimization tasks are performed in one day. The configuration of the wind power, photovoltaic, energy storage and other equipment in the four microgrids and the maximum and minimum adjustable capacity of each microgrid are shown in Table 1, the load prediction curves of the four microgrids are shown in FIG. 3, and the total load of the active power distribution network is shown in FIG. 4.

The method of the present invention is applied in the following steps:

(I) Constructing the microgrid optimization model, which comprises an objective function and constraint conditions, as follows:

1. Objective function

The generation units (wind, photovoltaic, diesel, storage and others) and the loads in the microgrid are scheduled and controlled by the microgrid energy management system (EMS), and the optimization aim is to minimize $f_{t,k}$:

In formula (1), the first term is the net cost of the electricity the microgrid purchases from the active power distribution network; the second term is the generation cost of each micro source in the microgrid, with the day-ahead output $P_{t,k,out}$ obtained from the predicted values; and the third, fourth and fifth terms are readjustment costs, namely the real-time adjustment cost of interruptible load, the wind curtailment cost and the photovoltaic curtailment cost. Here $t$ is the time, $q$ is the market price, and $P_{t,k,out}$ is the total power exchanged between microgrid $k$ and the active power distribution network: $P_{t,k,out} > 0$ when the microgrid purchases power from the active power distribution network, and $P_{t,k,out} < 0$ when the microgrid sells power to it. $I$ is the set of DG types in the microgrid;

$i$ ($i = 1, 2, 3, 4, 5$) denotes the $i$-th DG type: 1 represents a micro gas turbine (MT), 2 represents a fuel cell (FC), 3 represents a storage battery (SB), 4 represents a wind turbine (WT), and 5 represents photovoltaic (PV);

$\lambda_W$ and $\lambda_{PV}$ are the wind and photovoltaic curtailment costs, respectively; $P_{t,w,n}$ is the maximum power the wind turbine can output at time $t$; $P_{t,pv,n}$ is the maximum power the photovoltaic unit can output at time $t$; $P_{t,w}$ is the actual output power of the wind turbine at time $t$; and $P_{t,pv}$ is the actual output power of the photovoltaic unit at time $t$.
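Formula (1) is not reproduced in this text, but the curtailment terms can be illustrated from the symbols defined above. The sketch below assumes that each curtailment cost is the unit cost multiplied by the gap between available and actual output; the function name and this linear form are assumptions, not the patent's exact expression.

```python
def curtailment_cost(
    p_wind_max: float,
    p_wind: float,
    p_pv_max: float,
    p_pv: float,
    lambda_w: float,
    lambda_pv: float,
) -> float:
    """Assumed form: lambda_W * (P_{t,w,n} - P_{t,w}) + lambda_PV * (P_{t,pv,n} - P_{t,pv})."""
    if p_wind > p_wind_max or p_pv > p_pv_max:
        raise ValueError("actual output cannot exceed the available maximum")
    wind_term = lambda_w * (p_wind_max - p_wind)  # unused wind power
    pv_term = lambda_pv * (p_pv_max - p_pv)       # unused photovoltaic power
    return wind_term + pv_term
```

For example, with 100 available and 80 used on the wind side, fully used photovoltaic output and a wind unit cost of 0.5, the call `curtailment_cost(100.0, 80.0, 50.0, 50.0, 0.5, 0.25)` charges only the 20 units of unused wind power.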

2. Constraint conditions

(1) DG unit

The output constraints of each micro power supply are as follows:

wherein,

where $P_{t,i}$ is the output of the $i$-th DG at time $t$; $P_{i,\min}$ and $P_{i,\max}$ are the lower and upper limits of the active power output of the $i$-th DG, respectively; $S_{i,\min}$ and $S_{i,\max}$ are the minimum and maximum apparent power of the $i$-th DG, respectively; and the ramp rate of the $i$-th DG is bounded by a lower and an upper limit.
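The output and ramp-rate limits above can be checked as in the following sketch. It assumes that the ramp limits are given as non-negative magnitudes per scheduling interval and leaves out the apparent power limit; the function and parameter names are hypothetical.

```python
def dg_output_feasible(
    p_now: float,
    p_prev: float,
    p_min: float,
    p_max: float,
    ramp_down: float,
    ramp_up: float,
) -> bool:
    """Check P_{i,min} <= P_{t,i} <= P_{i,max} and the ramp limits between two intervals.

    ramp_down and ramp_up are assumed to be non-negative magnitudes.
    """
    within_limits = p_min <= p_now <= p_max
    step = p_now - p_prev  # change of output since the previous interval
    within_ramp = -ramp_down <= step <= ramp_up
    return within_limits and within_ramp
```

A candidate output that passes this check for every DG and every interval satisfies the active power part of the DG unit constraints.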

(2) Energy storage unit

Energy balance constraint and charge-discharge power constraint of the energy storage unit:

where $P_{t,B}$ is the output of the energy storage; $E_t$ is the stored energy; $E_{\min}$ and $E_{\max}$ are the minimum reserve capacity and the maximum capacity of the energy storage, respectively; $P_{B,\min}$ and $P_{B,\max}$ are the maximum charge and discharge power of the energy storage at time $t$, where $P_{B,\min}$ may take a negative value; $\alpha$ is the charge-discharge coefficient of the energy storage, with $\alpha = 1/\alpha_{di}$ when $P_{t,B} \ge 0$ and $\alpha = \alpha_{ci}$ when $P_{t,B} < 0$; $\alpha_{ci}$ and $\alpha_{di}$ are the charging and discharging efficiencies, respectively; $P_{t,L}$ is the load of the microgrid at time $t$; and $P_{t,IL}$ is the output of the DG units of the microgrid at time $t$.
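The storage constraints are not reproduced in this text. A minimal sketch of the stored energy update implied by the definitions above is given here, assuming the common form $E_{t+1} = E_t - \alpha P_{t,B} \Delta t$ with positive $P_{t,B}$ meaning discharge. The function names and the interval length `dt_hours` are assumptions.

```python
def next_stored_energy(
    e_now: float,
    p_batt: float,
    dt_hours: float,
    eff_charge: float,
    eff_discharge: float,
) -> float:
    """Assumed update E_{t+1} = E_t - alpha * P_{t,B} * dt.

    alpha = 1 / eff_discharge when discharging (p_batt >= 0),
    alpha = eff_charge when charging (p_batt < 0).
    """
    if not (0.0 < eff_charge <= 1.0 and 0.0 < eff_discharge <= 1.0):
        raise ValueError("efficiencies must lie in (0, 1]")
    alpha = 1.0 / eff_discharge if p_batt >= 0.0 else eff_charge
    return e_now - alpha * p_batt * dt_hours


def storage_feasible(
    e_next: float,
    p_batt: float,
    e_min: float,
    e_max: float,
    p_min: float,
    p_max: float,
) -> bool:
    """Check E_min <= E <= E_max and P_{B,min} <= P_{t,B} <= P_{B,max}."""
    return e_min <= e_next <= e_max and p_min <= p_batt <= p_max
```

Dividing by the discharging efficiency means more energy leaves the storage than reaches the grid, while multiplying by the charging efficiency means less energy is stored than is drawn from the grid.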

(3) Microgrid

The minimum and maximum values of the microgrid output power are respectively as follows:

where $P_{t,k,\min}$ and $P_{t,k,\max}$ are the minimum and maximum active power output by microgrid $k$, which is also limited by the maximum transmittable capacity of the tie line of microgrid $k$.

(II) Constructing the optimization model of the active power distribution network

The optimization target of the active power distribution network at time $t$ is to minimize $F_t$:

where $W$ is the number of nodes of the active power distribution network; $M$ is the number of microgrids in the active power distribution network; $\omega_1$, $\omega_2$, $\omega_3$, $\omega_4$ are proportionality coefficients; $u_{t,j}$ is the voltage of node $j$, measured against the nominal voltage of node $j$; $P_{t,loss}$ is the transmission loss of the entire active power distribution network; $P_{t,tie}$ is the power exchanged between the microgrid and the upper-level grid, measured against the given exchange power; and $P_{t,k}$ is the active power output by microgrid $k$, measured against the active power output obtained by the independent optimization of microgrid $k$.
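Equation (6) is not reproduced in this text. Judging from the quantities defined above, the target weighs four terms: node voltage deviation, network loss, tie-line power deviation and microgrid power adjustment. The sketch below assumes absolute deviations summed over nodes and microgrids; the original formula may use a different norm or normalization, and all names are hypothetical.

```python
from typing import Sequence, Tuple


def adn_objective(
    u: Sequence[float],
    u_rated: Sequence[float],
    p_loss: float,
    p_tie: float,
    p_tie_rated: float,
    p_mg: Sequence[float],
    p_mg_indep: Sequence[float],
    weights: Tuple[float, float, float, float],
) -> float:
    """Weighted sum of the four terms of F_t (absolute deviations are an assumption)."""
    if len(u) != len(u_rated) or len(p_mg) != len(p_mg_indep):
        raise ValueError("paired sequences must have equal length")
    w1, w2, w3, w4 = weights
    voltage_dev = sum(abs(a - b) for a, b in zip(u, u_rated))      # over W nodes
    tie_dev = abs(p_tie - p_tie_rated)                             # tie-line deviation
    mg_adjust = sum(abs(a - b) for a, b in zip(p_mg, p_mg_indep))  # over M microgrids
    return w1 * voltage_dev + w2 * p_loss + w3 * tie_dev + w4 * mg_adjust
```

The fourth term is what couples the two layers: it penalizes moving a microgrid away from the output it chose in its own independent optimization.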

Power balance constraint:

where $P_{t,l,i}$ and $Q_{t,l,i}$ are the active and reactive load power of the $i$-th DG at time $t$, and $P_{t,k}$ and $Q_{t,k}$ are the active and reactive power output by microgrid $k$.

Node voltage constraint:

$u_{j,\min} < u_{t,j} < u_{j,\max}$ (8)

where $u_{j,\min}$ and $u_{j,\max}$ are the lower and upper voltage limits of node $j$, respectively.

Line transmission power constraint:

$l_{z,\min} < l_{t,z} < l_{z,\max}$ (9)

where $l_{t,z}$ is the transmission power of line $z$, and $l_{z,\min}$ and $l_{z,\max}$ are the lower and upper limits of the transmission capacity of line $z$, respectively.

Capacity constraint of the microgrid:

where $P_{t,k,\min}$ and $P_{t,k,\max}$ are the minimum and maximum output active power of the $k$-th microgrid, respectively, and $S_{k,\max}$ is the maximum output apparent capacity of the $k$-th microgrid.

The specific solving steps of the DDPG method are as follows:

(I) Sample preprocessing: each micro source is assigned a value within its specified range. With the grid load taking random values, the wind turbines and the photovoltaic units are set to operate in maximum power point tracking mode; the wind power, photovoltaic, micro gas turbine and other units take random values of different functions; and the load rate is varied randomly between 0.9 and 1.1. Values are taken from real-time simulation, and the output power of each microgrid is optimized according to formulas (1) to (5) to obtain the independently optimized output data of the microgrids.
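The random load variation described above can be sketched as follows. The sketch assumes that one load rate is drawn uniformly from 0.9 to 1.1 per scenario and applied to every point of a base load profile; the function name, the uniform distribution and the single shared rate are assumptions made for illustration.

```python
import random
from typing import List, Optional, Sequence


def random_load_scenarios(
    base_load: Sequence[float],
    n_scenarios: int,
    low: float = 0.9,
    high: float = 1.1,
    seed: Optional[int] = None,
) -> List[List[float]]:
    """Scale a base load profile by a random load rate drawn from [low, high]."""
    rng = random.Random(seed)
    scenarios: List[List[float]] = []
    for _ in range(n_scenarios):
        rate = rng.uniform(low, high)  # one load rate per scenario (assumed)
        scenarios.append([rate * p for p in base_load])
    return scenarios
```

Each scenario would then be passed through the microgrid optimization of formulas (1) to (5) to produce one independently optimized output sample.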

(II) background data processing

The sample data includes:

State information $s_t$: the predicted output of the independent sources and loads (wind, photovoltaic, storage, load) in the active power distribution network, the optimized electricity purchase (sale) of the microgrids from (to) the active power distribution network, the upper and lower output limits of the microgrids, the generation cost parameters, the load demand of the active power distribution network, the power constraints of the lines, the tie-line power and its constraints, and the node voltage values and their constraints;

Action information $a_t$: the output power of each microgrid and the tie-line power;

The node voltage deviation, the power deviation on the tie line of the active power distribution network and the network loss serve as the key indexes for evaluating the operating state of the active power distribution network before and after an action. Once the action information and the state information are obtained, the return value $r_t$ can be calculated. The constraint conditions of the active power distribution network are expressed uniformly as $c_{t,\min,i} \le x_{t,i} \le c_{t,\max,i}$ (covering both inequality and equality constraints), where $c_{t,\min,i}$ and $c_{t,\max,i}$ are the lower and upper limits of the constraint, respectively. Taking the influence of the constraints into account on the basis of equation (6), the return value is calculated as follows:

where $I$ is the number of inequality constraints. The long-term return is then calculated according to equation (12):

The microgrid operating data samples are normalized when they are stored. When training samples are extracted, this embodiment resamples with the modified Metropolis-Hastings (MMH) algorithm, which suits high-dimensional problems with small failure probabilities: samples are drawn from the sample library according to the probability with which actions occur, forming the experience replay samples. A deep convolutional neural network preprocesses the data so that information with high value density is used as the input of reinforcement learning. Several data sets are trained at the same time, including data of the active power distribution network, each micro source and each load under different operating modes, which improves the generalization capability of the trained model and the training efficiency of the deep convolutional neural network model. After training, all historical optimization tasks can provide training samples for the deep belief network, so the historical samples can be used directly as the initial database for the online optimization of new tasks.

(III) Optimization procedure of DDPG

As shown in FIG. 5, the optimization flow of the DDPG solution is as follows. Background data processing collects the state information $s_{t+1}$ at the next optimization moment together with $a_t$, $r_t$ and $s_t$ at the current moment, forms a sample unit $(s_t, a_t, r_t, s_{t+1})$ and stores it in the data pool; Y samples are then resampled from the data pool and stored in the experience pool for training (experience replay).

In the optimization process, the action estimate and the corresponding target Q value are first calculated with the current target network parameters, which have not yet been updated; the loss function $L$ of the evaluation network training is then calculated to update the parameters of the main evaluation network;

the parameters of the main policy network are then updated, followed by the parameters of the policy network and the evaluation network of the target network, so that the parameters of the main network and the target network are updated by training the deep neural network;

finally, the current action information is obtained from the updated target action network and output to the power flow control center and each microgrid control center in the active power distribution network.

The state information $s_{t+1}$ of the active power distribution network at the next optimization moment is then collected for the next round of optimization. The DDPG optimization process uses a deep convolutional neural network to process grid operating data and has strong autonomous optimization capability.

(IV) Analysis of the optimization results

To verify the advantages of the method in improving the optimized operation of the active power distribution network, a solution that targets minimum total cost is compared with the optimization method of the invention. In addition, to demonstrate the effectiveness of solving the multi-microgrid active power distribution network model with DDPG, particle swarm optimization (PSO) is used as a comparison algorithm. This gives the following four modes:

Mode 1 (S1): the optimization model of the present invention, solved with the DDPG method;

Mode 2 (S2): the optimization model of the present invention, solved with particle swarm optimization;

Mode 3 (S3): minimization of the total cost, solved with the DDPG method;

Mode 4 (S4): minimization of the total cost, solved with particle swarm optimization.

The simulation results of modes S1 to S4 are shown in FIGS. 6 to 11. The results of the DDPG algorithm and the particle swarm algorithm are similar, which shows that the DDPG solution obtains the optimized result effectively; DDPG also remains stable over repeated simulations.

Based on the simulation results, the optimization effect is analyzed below in terms of cost, energy utilization, voltage quality, tie-line power fluctuation of the active power distribution network, and related aspects.

(1) Cost analysis. FIG. 6 shows that the operating costs of S1 and S2, which use the comprehensive optimization objective, are always slightly higher than those of S3 and S4, which minimize cost. Overall, S1 incurs 3.6% more operating cost than S3, and S2 incurs 3.7% more operating cost than S4. Given how much comprehensive optimal scheduling improves the operating environment of the microgrids and of the users in the active power distribution network, this additional cost is acceptable.

(2) Energy utilization. The active power distribution network layer of the optimization model allocates energy storage capacity and schedulable load comprehensively, so that the charging and discharging of the energy storage complements photovoltaic and wind power in time and space. This improves the energy utilization rate and also the peak regulation capability of the active power distribution network.

(3) Tie-line power fluctuation. FIG. 7 shows that the fluctuation of the power transmitted over the tie line between the active power distribution network and the main network is always smaller for S1 and S2 than for S3 and S4. The voltage quality is therefore greatly improved when comprehensive optimization is considered.

(4) Voltage quality analysis. FIGS. 8 and 9 show that the voltage quality is greatly improved under S1 and S2, which consider comprehensive optimization: their node voltage fluctuation is always smaller than that of S3 and S4.

(5) And analyzing the network loss of the active power distribution network. Accordingly, as can be seen from fig. 10, the network loss of S1 and S2 is usually slightly lower than that of S3 and S4.

(6) Peak and voltage regulation capability. As can be seen from fig. 11, the optimized active distribution grid equivalent load is incidentally much smaller than the non-optimized system load fluctuation due to the comprehensive utilization and distribution of the energy stored between the individual microgrids. Wherein, the load fluctuation of S1 and S2 is less than that of S3 and S4, which shows that the optimization effect of the optimization method of the invention is better than that of the model with the aim of minimizing the total cost.

As can be seen from figs. 6 to 11, compared with the objective of minimizing the total cost, the scheduling decisions generated by the active power distribution network optimization model that considers active optimization of the microgrids improve the overall operation level of the active power distribution network, including its peak and voltage regulation capability, voltage quality level and energy utilization rate. On this basis, the influence of multiple microgrids and high-penetration DG access on the power flow of the upper-level power grid is reduced, which demonstrates the advantage of optimized operation of the active power distribution network.

Claims

1. An active power distribution network operation optimization method considering microgrid active optimization is characterized by comprising the following steps:

S1, establishing a micro-grid independent optimization model;

2. The active power distribution network operation optimization method considering microgrid active optimization according to claim 1, wherein the microgrid independent optimization model in the step S1 is:

min f_n,1(X_n,X_g,n),...,f_n,m(X_n,X_g,n)

wherein f_n,m(·) represents the mth optimization objective of microgrid n, m represents the number of optimization objectives of microgrid n, G_n(·) represents an equality constraint of microgrid n, H_n(·) represents an inequality constraint of microgrid n, X_n represents the independent optimization variables of the control center of microgrid n, whose values are independent of the optimization variables of the other microgrid control centers, X_n,min and X_n,max are respectively the minimum and maximum values of X_n, and X_g,n is the state variable of the control center of microgrid n.

3. The active power distribution network operation optimization method considering microgrid active optimization according to claim 2, wherein the active power distribution network operation optimization model in the step S2 is:

min F_t

wherein F_t represents the optimization objective of the active power distribution network, t is the time, G_d(·) represents an equality constraint of the active power distribution network, H_d(·) represents an inequality constraint of the active power distribution network, X_d represents the optimization variables of the active power distribution network, X_d,min and X_d,max are respectively the minimum and maximum values of X_d, W is the total number of nodes of the active power distribution network, M is the number of microgrids contained in the active power distribution network, ω₁, ω₂, ω₃ and ω₄ are all proportionality coefficients, u_t,j is the actual voltage of node j in the active power distribution network, P_t,loss is the transmission loss of the entire active power distribution network, P_t,tie is the actual exchange power between the microgrid and the upper-level grid, and P_t,k is the active power output by microgrid k; the model further uses the rated voltage of node j in the active power distribution network, the rated exchange power between the microgrid and the upper-level grid, and the active power output by microgrid k under independent optimization.
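As an illustration only, the way the four proportionality coefficients could combine the quantities defined in this claim can be sketched as a weighted sum. The formula for F_t is not reproduced in the text here, so the squared-deviation form of each term, the function name `adn_objective` and its parameter names are all assumptions, not the claimed formula.

```python
def adn_objective(u, u_rated, p_loss, p_tie, p_tie_rated, p_mg, p_mg_indep,
                  weights=(1.0, 1.0, 1.0, 1.0)):
    """Hypothetical weighted-sum objective F_t for a single time step t.

    u, u_rated        : actual and rated voltages of the W nodes
    p_loss            : transmission loss of the whole network
    p_tie, p_tie_rated: actual and rated exchange power on the tie line
    p_mg, p_mg_indep  : actual and independently optimized output of the M microgrids
    weights           : the four proportionality coefficients (assumed values)
    """
    w1, w2, w3, w4 = weights
    # Assumed form: squared deviation of each node voltage from its rated value.
    voltage_dev = sum((uj - ur) ** 2 for uj, ur in zip(u, u_rated))
    # Assumed form: squared deviation of tie-line power from its rated value.
    tie_dev = (p_tie - p_tie_rated) ** 2
    # Assumed form: squared deviation of each microgrid from its own optimum.
    mg_dev = sum((pk - pk0) ** 2 for pk, pk0 in zip(p_mg, p_mg_indep))
    return w1 * voltage_dev + w2 * p_loss + w3 * tie_dev + w4 * mg_dev
```

Under these assumptions, an operating point with no voltage, tie-line or microgrid deviation is scored by the network loss term alone, and the coefficients set how strongly each deviation is traded off against the others.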

4. The active power distribution network operation optimization method considering microgrid active optimization according to claim 3, wherein the step S3 specifically comprises the following steps:

5. The method of claim 4, wherein the state information in step S31 includes the predicted output of independent sources and loads in the active power distribution network, the optimized power purchase and sale output from the microgrids to the active power distribution network, the upper and lower output limits of the microgrids, the power generation cost parameters of the microgrids, the load demand of the active power distribution network, the line power constraints, the tie-line power and its constraints, and the node voltage values and their constraints;

the action information is the output power and the tie line power of each microgrid in the active power distribution network.

6. The method as claimed in claim 5, wherein the calculation formula of the return value in step S31 is as follows:

wherein x_t,i represents the ith inequality constraint in the active power distribution network operation optimization model, c_t,max,i and c_t,min,i respectively represent the upper and lower limits of inequality constraint x_t,i, and I represents the number of inequality constraints in the active power distribution network operation optimization model.
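One common way to build a return value from an objective and I bounded quantities is to subtract a penalty for every limit violation. The sketch below illustrates that pattern only; the formula itself is not reproduced in the text here, so the linear penalty, the `penalty` weight and both function names are assumptions.

```python
def constraint_violation(x, c_min, c_max):
    """Amount by which x lies outside [c_min, c_max]; 0.0 when inside."""
    if x > c_max:
        return x - c_max
    if x < c_min:
        return c_min - x
    return 0.0


def return_value(objective, xs, c_mins, c_maxs, penalty=10.0):
    """Hypothetical return r_t: negative objective minus penalized violations.

    xs, c_mins, c_maxs hold x_t,i and its lower and upper limits for the
    I inequality constraints; `penalty` is an assumed weighting factor.
    """
    total_violation = sum(
        constraint_violation(x, lo, hi) for x, lo, hi in zip(xs, c_mins, c_maxs)
    )
    return -objective - penalty * total_violation
```

With this shape, a feasible action is rewarded purely by how small the objective is, while any out-of-limit quantity lowers the return in proportion to the size of the violation.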

7. The active power distribution network operation optimization method considering microgrid active optimization according to claim 6, wherein the data of each sample unit consist of the state information s_t at the current moment, the action information a_t at the current moment, the return value r_t at the current moment, and the state information s_t+1 at the next moment.
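A sample unit of this kind is usually kept in an experience replay buffer from which mini-batches are drawn at random during training. The following is a minimal sketch of that standard DDPG practice, not a structure taken from the claims; the class name `ReplayBuffer`, the field names and the capacity handling are assumptions.

```python
import random
from collections import deque, namedtuple

# One sample unit: (s_t, a_t, r_t, s_t+1), as listed in the claim.
SampleUnit = namedtuple("SampleUnit", ["s_t", "a_t", "r_t", "s_next"])


class ReplayBuffer:
    """Hypothetical fixed-capacity store of sample units."""

    def __init__(self, capacity):
        # A deque with maxlen silently discards the oldest unit when full.
        self._units = deque(maxlen=capacity)

    def store(self, s_t, a_t, r_t, s_next):
        self._units.append(SampleUnit(s_t, a_t, r_t, s_next))

    def sample(self, batch_size):
        # Draw without replacement; never ask for more units than are stored.
        k = min(batch_size, len(self._units))
        return random.sample(list(self._units), k)

    def __len__(self):
        return len(self._units)
```

Sampling at random rather than in time order reduces the correlation between consecutive training samples, which is the usual motivation for storing transitions this way.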

8. The active power distribution network operation optimization method considering microgrid active optimization according to claim 7, wherein the step S34 specifically comprises the following steps:

S342, inputting the estimated action information a'_t+1 at the next moment and the target Q value Q'_t+1 into the main network, and updating the main network parameters to obtain the expected return value r'_t;

S343, inputting the expected return value r'_t into the target network, and updating the target network parameters to obtain the action information a_t+1 at the next moment, which is the optimal operation solution of the active power distribution network.

9. The method of claim 8, wherein the main network comprises a main action network and a main evaluation network, and the target network comprises a target action network and a target evaluation network, wherein the main network parameters are the parameters of the main action network and the main evaluation network, and the target network parameters are the parameters of the target action network and the target evaluation network.

10. The active power distribution network operation optimization method considering microgrid active optimization according to claim 9, wherein the specific process of updating the main network parameters and the target network parameters is as follows:

the objective function of the action network is set as follows:

J_t(θ)＝E(U_t(s,a)|π(θ))

0＜γ＜1

wherein J_t(θ) is the discounted cumulative expectation of the objective function, θ is a parameter of the neural network, π(θ) is the deep learning network determined by θ, U_t(s, a) is the long-term return at time t when action a is taken in state s, and γ is the discount factor;

the Q value of the evaluation network is as follows:

0＜λ≤1

wherein Q'_t+1 is the target Q value of state s_t+1, r'_t is the expected return when the state changes from s_t to s_t+1 after action a is taken, and λ is a proportionality coefficient;

the target Q value and the original Q value correspond respectively to the result value and the output value in a supervised learning neural network, so the loss function for training the evaluation network is constructed as follows:

0＜η＜1

wherein η is a divergence factor.
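For orientation, the quantities named in this claim can be related to the textbook DDPG update, sketched below in plain Python. This is the standard form, not the claimed formulas: the claim's own expressions involving λ and η are not reproduced in the text here, and the function names, the mean-squared-error loss and the soft-update rate `tau` are assumptions.

```python
def discounted_return(rewards, gamma):
    """Long-term return U_t = r_t + gamma * r_t+1 + gamma^2 * r_t+2 + ..."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def td_target(r, q_next_target, gamma):
    """Textbook target Q value: reward plus discounted target-network estimate."""
    return r + gamma * q_next_target


def critic_loss(targets, q_values):
    """Mean squared error between target Q values and main-network Q values."""
    return sum((y - q) ** 2 for y, q in zip(targets, q_values)) / len(targets)


def soft_update(target_params, main_params, tau):
    """Move each target-network parameter a fraction tau toward the main network."""
    return [tau * m + (1.0 - tau) * t for t, m in zip(target_params, main_params)]
```

In this standard scheme the evaluation (critic) network is trained by minimizing `critic_loss`, the action (actor) network is adjusted to increase the critic's Q value, and the target networks track the main networks slowly through `soft_update` to stabilize training.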