CN114610070A

CN114610070A - Unmanned aerial vehicle-cooperated wind power plant intelligent inspection method

Info

Publication number: CN114610070A
Application number: CN202210274635.5A
Authority: CN
Inventors: 张强; 闫兆鸿; 王鹏飞; 车超; 叶绯叶
Original assignee: Dalian University of Technology
Current assignee: Dalian University of Technology
Priority date: 2022-03-21
Filing date: 2022-03-21
Publication date: 2022-06-10
Anticipated expiration: 2042-03-21
Also published as: CN114610070B

Abstract

The invention provides an unmanned aerial vehicle-cooperated intelligent inspection method for wind farms, and belongs to the technical field of data processing. An unmanned aerial vehicle carrying a mission payload inspects the wind turbines, and its flight path is planned by a combined deep reinforcement learning and simulated annealing model so that the inspection route of the smart wind farm consumes the least energy. The method takes the physical and environmental characteristics of the wind farm into account, is highly adaptable, and can be applied to wind farms in different geographic locations and terrains. It exploits the convenience of real-time charging within the wind farm and also accounts for the farm's climate by incorporating wind speed and wind direction into the flight path planning. The method therefore suits wind farms in different terrains and monsoon regions, responds promptly to sudden meteorological changes, and dynamically adjusts the inspection route.

Description

Unmanned aerial vehicle-cooperated wind power plant intelligent inspection method

Technical Field

The invention belongs to the technical field of data processing and relates to an unmanned aerial vehicle-cooperated intelligent inspection method for wind farms. An unmanned aerial vehicle carrying a mission payload inspects the wind turbines, and its flight path is planned by a combined deep reinforcement learning and simulated annealing model so that the inspection route of the smart wind farm consumes the least energy.

Background

With the rapid development of the wind power industry in China, the number of wind turbines has increased sharply. However, wind farms are often located in complex natural environments, such as remote mountainous areas or offshore sites, and are exposed to long periods of inclement weather. Wind turbines stand hundreds of meters apart and their nacelles are tens of meters high, so manual inspection is costly, inefficient, error-prone, and dangerous. Reducing the operation and maintenance cost of large wind turbine fleets, and making wind power systems more intelligent and information-driven so as to raise the generation revenue of wind farms, has become an urgent problem for the wind power industry.

With the introduction of the smart wind farm concept, wind power systems are being comprehensively optimized and upgraded in component manufacturing, data management, operation and maintenance, and other areas. A smart wind farm relies on measurement and control, communication, sensing, and big data processing technologies, together with various intelligent algorithms, to make turbine control and management, equipment state perception, and inspection and maintenance intelligent. Sensors inside each wind turbine collect the working-state data of its equipment, and an edge server inside the turbine integrates and processes the sensor data to determine whether the turbine is operating normally and to identify possible causes when a fault occurs.

However, wind power systems have not yet achieved true intelligence. The operating data of turbine equipment generally still has to be transferred and maintained manually. Meanwhile, cracks in components such as blades are difficult to confirm with sensors, so inspection usually requires personnel working from a hanging basket or using a high-power telescope. As a result, inspection of wind power systems remains time-consuming, labor-intensive, and inaccurate, and falls short of the operating requirements of a smart wind farm.

Owing to their flexibility and mobility, unmanned aerial vehicles play an increasingly important role in smart wind farms. Using an unmanned aerial vehicle carrying a mission payload to inspect wind turbines can greatly improve the efficiency and accuracy of inspection and reduce the danger of manual inspection, thereby improving the economic benefit of power production. This approach therefore has considerable research significance and practical value.

Disclosure of Invention

The invention addresses the problem of planning the inspection route of an unmanned aerial vehicle from meteorological data of the wind farm environment using a deep reinforcement learning algorithm, so that the planned route consumes the least energy while traversing all wind turbines. During inspection, the wind turbines are photographed and the images are analyzed by image recognition for fault diagnosis, and the related data of the wind turbines is uploaded at the same time. This saves the time and labor cost of manual inspection and improves the safety of the wind turbines. The invention realizes an unmanned aerial vehicle-cooperated intelligent inspection method for wind farms that combines deep reinforcement learning and related techniques, and provides a theoretical basis and practical experience for low-power unmanned aerial vehicle inspection of smart wind farms.

The technical scheme of the invention is as follows:

An unmanned aerial vehicle-cooperated intelligent inspection method for wind farms comprises an unmanned aerial vehicle route planning system based on deep reinforcement learning and simulated annealing, and an unmanned aerial vehicle-based system for wind farm fault detection and wind power data uploading. The method comprises the following specific steps:

Step one: acquire meteorological forecast data of the wind farm for the next 4 hours, and preprocess the data.

Step two: for the wind turbines X = {x₁, x₂, ..., x_n} of the same wind farm, use a deep reinforcement learning algorithm to plan the lowest-energy flight path between any two wind turbines x_i, x_j (i ≠ j) within the maximum cruising radius of the unmanned aerial vehicle, and calculate the corresponding energy consumption E_ij.

Step three: determine the starting position of the unmanned aerial vehicle according to the forecast dominant wind direction, and plan its flight path with a simulated annealing algorithm based on the current meteorological data and the learning experience from step two.

Step four: inspect each wind turbine of the wind farm along the flight path planned in step three, for troubleshooting and data uploading.

The data preprocessing in step one comprises the following specific steps:

Step 1.1: check the meteorological data; if any data is missing, fill the gaps by smoothing.

Step 1.2: for any time t, orthogonally decompose the wind speed according to the wind direction data θ_t into wind speed components along three mutually perpendicular directions in three-dimensional space.

Step 1.3: normalize the wind direction and wind speed data.
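Steps 1.1 to 1.3 can be illustrated with a minimal Python sketch. It rests on several assumptions that the source does not state: missing samples are marked as None and filled by linear interpolation (standing in for the unspecified smoothing), the wind direction θ_t is a horizontal bearing in degrees measured clockwise from north, the vertical wind component is supplied separately and defaults to zero, and normalization is min-max scaling. All function names are hypothetical.

```python
import math


def fill_missing(series):
    """Fill None gaps by linear interpolation between the nearest known samples.

    Assumption: linear interpolation stands in for the unspecified smoothing.
    """
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no known samples")
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:            # gap at the start: copy the first known value
            filled[i] = series[right]
        elif right is None:         # gap at the end: copy the last known value
            filled[i] = series[left]
        else:                       # interior gap: interpolate linearly
            w = (i - left) / (right - left)
            filled[i] = series[left] + w * (series[right] - series[left])
    return filled


def decompose_wind(speed, theta_deg, vertical=0.0):
    """Split a wind speed into (east, north, up) components.

    Assumption: theta_deg is a bearing measured clockwise from north.
    """
    theta = math.radians(theta_deg)
    return (speed * math.sin(theta), speed * math.cos(theta), vertical)


def min_max_normalize(values):
    """Scale values linearly into [0, 1]; a constant series maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

Wind direction is not interpolated in this sketch, because angles wrap around at 360 degrees and would need circular handling.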

In the second step, the deep reinforcement learning algorithm is constructed by the following steps:

Step 2.1: first, establish a Markov decision process model for planning the flight path of the unmanned aerial vehicle between two wind turbines, and determine its quintuple <S, A, P, R, γ>, where S is the state quantity of the environment in which the unmanned aerial vehicle is currently located, A is the action quantity executed by the unmanned aerial vehicle, P is the transition probability between states, R is the reward obtained when the unmanned aerial vehicle executes action A in state S, and γ is the reinforcement learning discount (attenuation) rate. The state quantity S must fully describe the current state of the unmanned aerial vehicle. In the invention, a three-dimensional coordinate system is established at the current position of the unmanned aerial vehicle, and the state quantity S comprises the current position coordinates Pos_U(x, y, z) of the unmanned aerial vehicle and the wind velocity vector at that position at the current time.

The motion model of the unmanned aerial vehicle is simplified: the action quantity A = <a> that the unmanned aerial vehicle can execute denotes moving with a given velocity vector in a fixed direction for one time slice τ. So that the unmanned aerial vehicle reaches the target wind turbine with the least energy consumption, a reward is designed in terms of the following quantities: |d_s′| is the straight-line distance to the target point after the action is executed, |d_s| is the straight-line distance to the target point before the action is executed, E_ss′ is the energy consumed by executing this action, and E_max is the maximum energy consumption of the unmanned aerial vehicle. The unmanned aerial vehicle obtains a larger reward when it approaches the target with less energy consumption and a very large reward when it reaches the target, so that the target attracts the unmanned aerial vehicle.

The energy E_ss′ required for each action is calculated as E_ss′ = P_u·τ, where P_u is the power of the unmanned aerial vehicle and comprises the horizontal flight power, the vertical flight power, and the drag power. In these terms, W = mg is the weight of the unmanned aerial vehicle, ρ is the air density, and C_D0 is the drag coefficient related to rotor geometry; the remaining quantities are the total rotor area of the unmanned aerial vehicle, the velocities of the unmanned aerial vehicle relative to the wind in the horizontal and vertical directions, the horizontal flight speed of the unmanned aerial vehicle, and the hovering power of the unmanned aerial vehicle.

Step 2.2: initialize an experience replay pool D for storing the data generated by the unmanned aerial vehicle during trial and error. Randomly initialize the Actor real network μ and the Critic real network Q, whose parameters are θ^μ and θ^Q respectively. Initialize the Actor target network μ′ and the Critic target network Q′, whose parameters are θ^μ′ and θ^Q′ respectively, and set θ^μ′ = θ^μ, θ^Q′ = θ^Q.

Step 2.3: record the initial state quantity s₁ and generate a random noise N that follows a Gaussian distribution.

Step 2.4: input the state quantity x_t at the current time into the Actor real network with parameters θ^μ and add the random noise N_t of the current time. The Actor real network outputs the action quantity a_t = μ(x_t, θ^μ) + N_t. Execute the action, calculate the reward r_t obtained by the action through the reward function, and update the state quantity to obtain x_{t+1}.

Step 2.5: create the quadruple <x_t, a_t, r_t, x_{t+1}> and store it in the experience replay pool D.

Step 2.6: randomly sample a group of data <x_j, a_j, r_j, x_{j+1}> from the experience replay pool D. Input x_j and a_j into the Critic real network to obtain Q = Q(x_j, a_j, θ^Q). Input x_{j+1} into the Actor target network to calculate the action quantity a_{j+1} = μ′(x_{j+1}, θ^μ′), and input x_{j+1} and a_{j+1} together into the Critic target network to obtain Q′(x_{j+1}, a_{j+1}, θ^Q′). The target value is then r_j + γQ′(x_{j+1}, a_{j+1}, θ^Q′). Using this target value as the label, train the Critic real network so that the computed Q value approaches the target value as closely as possible, and update θ^Q by gradient descent.

Step 2.7: update the Actor real network so that the action quantity it outputs attains the maximum Q value computed by the Critic real network. Update θ^μ by gradient descent along the policy gradient.

Step 2.8: update the target network parameters: θ^μ′ ← αθ^μ′ + (1-α)θ^μ, θ^Q′ ← αθ^Q′ + (1-α)θ^Q.

Step 2.9: repeat steps 2.3 to 2.8 until the loss values of the Actor target network and the Critic target network converge and the network parameters no longer change. After the networks converge, for any two wind turbines x_i, x_j (i ≠ j), the deep reinforcement learning model gives the flight path with the minimum energy consumption of the unmanned aerial vehicle and the corresponding energy consumption E_ij.
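The data handling in steps 2.3 to 2.5 can be sketched in Python as follows. This is an illustration under assumptions rather than the patented implementation: the pool capacity, the zero mean and the standard deviation sigma of the Gaussian noise, and the representation of the Actor as a plain callable returning a list of floats are all hypothetical, as are the names ReplayPool and noisy_action.

```python
import random
from collections import deque


class ReplayPool:
    """Fixed-capacity experience pool D holding <x_t, a_t, r_t, x_{t+1}> tuples.

    Assumption: the oldest experience is discarded once capacity is reached.
    """

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state):
        # Step 2.5: store one quadruple.
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size, rng=random):
        # Step 2.6 draws random experiences; never ask for more than are stored.
        k = min(batch_size, len(self.buffer))
        return rng.sample(list(self.buffer), k)

    def __len__(self):
        return len(self.buffer)


def noisy_action(actor, state, sigma, rng=random):
    """Step 2.4: a_t = mu(x_t) + N_t, with N_t drawn from a zero-mean Gaussian.

    Assumption: independent noise of standard deviation sigma per component.
    """
    return [a + rng.gauss(0.0, sigma) for a in actor(state)]
```

A caller would pass the trained Actor real network as actor; with sigma set to 0 the function returns the deterministic action, which is convenient once training has converged.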

In the third step, the simulated annealing algorithm comprises the following specific steps:

Step 3.1: determine the starting position of the unmanned aerial vehicle according to the wind rose diagram of the current day. If the wind farm currently has a dominant wind direction, set the starting position of the unmanned aerial vehicle to the corner wind turbine opposite to the dominant wind direction; if no dominant wind direction is evident, set the starting position to the wind turbine at the central position.

Step 3.2: train the two neural networks on the same day. Based on the experience replay pool from step two and the wind direction and wind speed forecast for the next day, determine the flight path and flight energy consumption between any two wind turbines x_i, x_j within the maximum cruising radius of the unmanned aerial vehicle.

Step 3.3: starting from the starting position, successively select the lowest-energy flight path within the cruising radius of the unmanned aerial vehicle until all wind turbines have been traversed, and take the result as the initial path c. At the same time, initialize the starting temperature T, the ending temperature T₀, and the annealing rate α.

Step 3.4: generate another path c′ in the neighborhood of c by random thermal perturbation. Unlike in the conventional simulated annealing algorithm, this perturbation can only occur between wind turbines located in the same cruising area.

Step 3.5: calculate the difference ΔE in the energy consumption of the unmanned aerial vehicle between the two paths c and c′. If ΔE ≤ 0, update the path: c = c′, T ← αT. Otherwise, generate a random number rand between 0 and 1; if the acceptance condition is met (in the standard simulated annealing algorithm, exp(-ΔE/T) > rand), update the path: c = c′, T ← αT.

Step 3.6: check whether T > T₀. If so, return to step 3.4; otherwise, a near-optimal solution of the lowest-energy inspection route has been obtained, and the unmanned aerial vehicle inspects the wind turbines and collects their data in sequence along the inspection route.

Step 3.7: determine from real-time meteorological data whether the inspection route needs to be updated. When the real-time wind direction detected by the wind sensors in the wind farm differs from the forecast wind direction, or the real-time wind speed and the forecast wind speed differ by more than two levels, collect the set of wind turbines not yet traversed, X′ = {x₁, x₂, ..., x_n′}, and execute step 3.2 again to re-plan the lowest-energy flight path of the unmanned aerial vehicle over the remaining wind turbines. The algorithm ends when the unmanned aerial vehicle has traversed all wind turbines.
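The re-planning check of step 3.7 can be sketched in Python as below. The source does not say which wind level scale it uses or how far two directions may differ before they count as different, so this sketch assumes the Beaufort scale and a hypothetical angular tolerance of 45 degrees. All function names and the tolerance parameter are illustrative.

```python
import bisect

# Lower speed bounds in m/s of Beaufort levels 1 to 12.
# Assumption: the source's "wind speed level" is the Beaufort level.
BEAUFORT_LOWER_BOUNDS = [0.3, 1.6, 3.4, 5.5, 8.0, 10.8,
                         13.9, 17.2, 20.8, 24.5, 28.5, 32.7]


def wind_level(speed_ms):
    """Return the Beaufort level (0 to 12) for a wind speed in m/s."""
    return bisect.bisect_right(BEAUFORT_LOWER_BOUNDS, speed_ms)


def angular_difference(a_deg, b_deg):
    """Smallest absolute angle between two bearings, in degrees (0 to 180)."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def needs_replanning(real_dir, real_speed, pred_dir, pred_speed,
                     direction_tolerance_deg=45.0):
    """True when the direction differs or the levels differ by more than two.

    Assumption: directions count as different beyond direction_tolerance_deg.
    """
    direction_changed = (
        angular_difference(real_dir, pred_dir) > direction_tolerance_deg
    )
    level_gap = abs(wind_level(real_speed) - wind_level(pred_speed))
    return direction_changed or level_gap > 2


def remaining_turbines(all_turbines, visited):
    """Turbines not yet traversed (the set X' in step 3.7), in original order."""
    seen = set(visited)
    return [t for t in all_turbines if t not in seen]
```

When needs_replanning returns True, the list from remaining_turbines would be handed back to step 3.2 for a new plan.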

The invention has the following beneficial effects. Compared with other unmanned aerial vehicle flight path planning methods, the method takes the physical and environmental characteristics of the wind farm into account, plans a low-energy inspection route for the smart wind farm, is highly adaptable, and can be applied to wind farms in different geographic locations and terrains. By developing an energy-saving inspection method for the smart wind farm, the invention exploits the convenience of real-time charging within the wind farm and also accounts for the farm's climate by incorporating wind speed and wind direction into the flight path planning. The method therefore suits wind farms in different terrains and monsoon regions, responds promptly to sudden meteorological changes, and dynamically adjusts the inspection route.

Drawings

FIG. 1 is a diagram of a wind farm intelligent patrol scenario of the present invention.

FIG. 2 is a timing diagram of the intelligent inspection of a wind farm of the present invention.

FIG. 3 is a flow chart of the data preprocessing of the present invention.

Fig. 4 is a schematic diagram of the unmanned aerial vehicle track planning algorithm based on reinforcement learning.

Fig. 5 is a detailed design diagram of the unmanned aerial vehicle track planning algorithm based on reinforcement learning.

Fig. 6 is a flow chart of the unmanned aerial vehicle flight path planning algorithm based on simulated annealing.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail below with reference to the accompanying drawings.

FIG. 2 is a timing diagram of the intelligent wind farm inspection of the invention. The invention provides an unmanned aerial vehicle-cooperated intelligent inspection method for wind farms, which specifically comprises the following steps:

Step one: acquire meteorological data of the wind farm for the next 4 hours, and preprocess the data.

Step two: for the wind turbines X = {x₁, x₂, ..., x_n} of the same wind farm, use a deep reinforcement learning algorithm to plan the lowest-energy flight path between any two wind turbines x_i, x_j (i ≠ j) within the maximum cruising radius of the unmanned aerial vehicle, and calculate the corresponding energy consumption E_ij.

Step four: inspect each wind turbine of the wind farm along the flight path planned in step three, for troubleshooting and data uploading.

Fig. 3 is a flow chart of data preprocessing in the first step of the present invention, which specifically includes the following steps:

Step 1.3: normalize the wind direction and wind speed data.

Before the deep reinforcement learning and simulated annealing algorithm of the invention is explained in detail, the problem is described as follows:

First, a simulation environment is built from the physical parameters and geographical environment of the wind farm, the relevant parameters of the unmanned aerial vehicle, and the meteorological records of the last three years. The data is then fed to the neural networks for training and prediction, the ability of the networks to recognize different wind speeds and wind directions is analyzed, and the network structure and parameters are adjusted iteratively according to the experimental results to improve the feasibility of the method. The invention adopts a deep reinforcement learning algorithm based on the Actor-Critic framework as its main structure and tunes the parameters to improve the adaptability of the algorithm. Each day, before the unmanned aerial vehicle inspection starts, the neural networks are trained on that day's weather forecast data, and a near-optimal energy-saving inspection route is calculated from the training result with a simulated annealing algorithm. During the inspection, if the wind direction and wind speed detected in real time differ from the weather forecast data, the path over the remaining wind turbines is re-planned according to the new weather forecast data to ensure that the final inspection route achieves low energy consumption.

Fig. 4 is a schematic diagram of the unmanned aerial vehicle path planning algorithm based on reinforcement learning.

Fig. 5 is a detailed design diagram of the unmanned aerial vehicle path planning algorithm based on reinforcement learning, and the flow specifically includes the following steps:

The motion model of the unmanned aerial vehicle is simplified: the action quantity A = <a> that the unmanned aerial vehicle can execute represents the velocity vector of the unmanned aerial vehicle.

The power terms include the vertical flight power and the drag power, where W = mg is the weight of the unmanned aerial vehicle and ρ is the air density; the remaining quantities include the horizontal flight speed of the unmanned aerial vehicle and the hovering power of the unmanned aerial vehicle.

Step 2.2: initialize an experience replay pool D for storing the data generated by the unmanned aerial vehicle during trial and error. Randomly initialize the Actor real network μ and the Critic real network Q, whose parameters are θ^μ and θ^Q respectively. Initialize the Actor target network μ′ and the Critic target network Q′, whose parameters are θ^μ′ and θ^Q′ respectively, and set θ^μ′ = θ^μ, θ^Q′ = θ^Q.

Step 2.4: input the state quantity x_t at the current time into the Actor real network with parameters θ^μ and add the random noise N_t of the current time. The Actor real network outputs the action quantity a_t = μ(x_t, θ^μ) + N_t. Execute the action, calculate the reward r_t obtained by the action through the reward function, and update the state quantity to obtain x_{t+1}.

Step 2.6: randomly sample a group of data <x_j, a_j, r_j, x_{j+1}> from the experience replay pool D. Input x_j and a_j into the Critic real network to obtain Q = Q(x_j, a_j, θ^Q). Input x_{j+1} into the Actor target network to calculate the action quantity a_{j+1} = μ′(x_{j+1}, θ^μ′), and input x_{j+1} and a_{j+1} together into the Critic target network to obtain Q′(x_{j+1}, a_{j+1}, θ^Q′). The target value is then r_j + γQ′(x_{j+1}, a_{j+1}, θ^Q′). Using this target value as the label, train the Critic real network so that the computed Q value approaches the target value as closely as possible, and update θ^Q by gradient descent.

Step 2.8: update the target network parameters: θ^μ′ ← αθ^μ′ + (1-α)θ^μ, θ^Q′ ← αθ^Q′ + (1-α)θ^Q.
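The arithmetic of steps 2.6 and 2.8 can be made concrete with a short Python sketch. It assumes, for illustration only, that network parameters are flat lists of floats and that the Critic is trained with a mean squared error loss; the source says only that the computed Q value should approach the target value. The function names are hypothetical.

```python
def td_target(reward, gamma, target_q_next):
    """Step 2.6 label: r_j + gamma * Q'(x_{j+1}, a_{j+1})."""
    return reward + gamma * target_q_next


def critic_loss(q_values, targets):
    """Mean squared error between Critic outputs and their target values.

    Assumption: the source does not name the loss; MSE is a common choice.
    """
    return sum((q - y) ** 2 for q, y in zip(q_values, targets)) / len(q_values)


def soft_update(target_params, real_params, alpha):
    """Step 2.8: theta' <- alpha * theta' + (1 - alpha) * theta, element-wise."""
    return [alpha * tp + (1.0 - alpha) * rp
            for tp, rp in zip(target_params, real_params)]
```

In this formulation α weights the old target parameters, so a value of α close to 1 makes the target networks follow the real networks slowly, which tends to stabilize the labels used to train the Critic.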

Fig. 6 is a flow chart of the unmanned aerial vehicle track planning algorithm based on simulated annealing, which specifically comprises the following steps:

Step 3.1: determine the starting position of the unmanned aerial vehicle according to the wind rose diagram of the current day. If the wind farm currently has a dominant wind direction, set the starting position of the unmanned aerial vehicle to the corner wind turbine opposite to the dominant wind direction; if no dominant wind direction is evident, set the starting position to the wind turbine at the central position.

Step 3.2: train the neural networks on the same day. Based on the experience replay pool from step two and the wind direction and wind speed forecast for the next day, determine the flight path and flight energy consumption between any two wind turbines x_i, x_j within the maximum cruising radius of the unmanned aerial vehicle.

Step 3.3: starting from the starting position, successively select the lowest-energy flight path within the cruising radius of the unmanned aerial vehicle until all wind turbines have been traversed, and take the result as the initial path c. At the same time, initialize the starting temperature T, the ending temperature T₀, and the annealing rate α.

Then the path is updated: c = c′, T ← αT.

Step 3.6: check whether T > T₀. If so, return to step 3.4; otherwise, a near-optimal solution of the lowest-energy inspection route has been obtained, and the unmanned aerial vehicle inspects the wind turbines and collects their data in sequence along the inspection route.
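The simulated annealing flow of Fig. 6 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation. The matrix energy[i][j] stands in for the energy consumption E_ij that the deep reinforcement learning model would supply, and it may be asymmetric because of wind. The list area, which assigns each wind turbine to a cruising area, is hypothetical. The acceptance test exp(-ΔE/T) > rand is the standard simulated annealing criterion, assumed here because the source's formula is not reproduced. The iteration cap max_iters and the tracking of the best path seen are additions for safety and convenience.

```python
import math
import random


def path_energy(path, energy):
    """Total energy of visiting the turbines in the given order."""
    return sum(energy[a][b] for a, b in zip(path, path[1:]))


def greedy_path(start, energy):
    """Initial path c: repeatedly fly the cheapest leg to an unvisited turbine."""
    path, remaining = [start], set(range(len(energy))) - {start}
    while remaining:
        nxt = min(remaining, key=lambda j: energy[path[-1]][j])
        path.append(nxt)
        remaining.remove(nxt)
    return path


def perturb(path, area, rng):
    """Swap two non-start turbines that share a cruising area, if any exist."""
    pairs = [(i, j)
             for i in range(1, len(path))
             for j in range(i + 1, len(path))
             if area[path[i]] == area[path[j]]]
    if not pairs:
        return list(path)
    i, j = rng.choice(pairs)
    new_path = list(path)
    new_path[i], new_path[j] = new_path[j], new_path[i]
    return new_path


def anneal(start, energy, area, t_start=100.0, t_end=1e-3, alpha=0.95,
           seed=0, max_iters=100000):
    """Return the lowest-energy path found; cooling happens on acceptance."""
    rng = random.Random(seed)
    current = greedy_path(start, energy)
    best, temp = current, t_start
    for _ in range(max_iters):          # guard against very long runs
        if temp <= t_end:
            break
        candidate = perturb(current, area, rng)
        delta = path_energy(candidate, energy) - path_energy(current, energy)
        if delta <= 0 or math.exp(-delta / temp) > rng.random():
            current = candidate
            temp *= alpha               # T <- alpha * T when the path changes
            if path_energy(current, energy) < path_energy(best, energy):
                best = current
    return best
```

For a four-turbine example with all turbines in one cruising area, anneal(0, energy, [0, 0, 0, 0]) returns a permutation that starts at turbine 0 and costs no more than the greedy initial path.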

Claims

1. An unmanned aerial vehicle-cooperated wind power plant intelligent inspection method is characterized by comprising the following steps:

step one: acquiring meteorological forecast data of the wind farm for the next 4 hours, and preprocessing the data;

step two: for a plurality of wind turbines X = {x₁, x₂, ..., x_n} in the same wind farm, using a deep reinforcement learning algorithm to plan the lowest-energy flight path between any two wind turbines x_i, x_j within the maximum cruising radius of the unmanned aerial vehicle and to calculate the corresponding energy consumption E_ij, wherein i ≠ j;

step three: determining the starting position of the unmanned aerial vehicle according to the forecast dominant wind direction, and planning the flight path of the unmanned aerial vehicle with a simulated annealing algorithm according to the current meteorological data and the learning experience from step two;

2. The unmanned aerial vehicle-coordinated wind farm intelligent inspection method according to claim 1, wherein in the first step, the specific steps of data preprocessing are as follows:

step 1.1: checking meteorological data, and if a missing part exists, smoothing the missing data;

3. The intelligent inspection method for the wind farm with cooperation of the unmanned aerial vehicle according to claim 1 or 2, wherein in the second step, the deep reinforcement learning algorithm is constructed by the following steps:

step 2.1: firstly, establishing a Markov decision process model related to the planning of tracks of two wind turbine generators by an unmanned aerial vehicle, and determining a quintuple < S, A, P, R, gamma > in the process, wherein S represents the state quantity of the environment where the unmanned aerial vehicle is located currently, A is the action quantity executed by the unmanned aerial vehicle, P is the transition probability among different states, R is the reward quantity obtained when the unmanned aerial vehicle executes the action A in the state S, and gamma is the reinforcement learning attenuation rate;

establishing a three-dimensional coordinate system at the current position of the unmanned aerial vehicle, wherein the state quantity S comprises the current position coordinates Pos_U(x, y, z) of the unmanned aerial vehicle and the wind velocity vector at the current position of the unmanned aerial vehicle;

the action consists of moving a distance in a fixed direction within a time slice τ; in order that the unmanned aerial vehicle reaches the target wind turbine with the least energy consumption, a reward is designed in terms of the following quantities:

|d_s′| is the straight-line distance to the target point after the action is executed; |d_s| is the straight-line distance to the target point before the action is executed; E_max is the maximum energy consumption of the unmanned aerial vehicle; E_ss′ is the energy consumed by executing the action, calculated as E_ss′ = P_u·τ, wherein P_u is the power of the unmanned aerial vehicle and comprises the horizontal flight power, the vertical flight power, and the drag power; W = mg is the weight of the unmanned aerial vehicle; ρ is the air density; the remaining quantities include the horizontal flight speed of the unmanned aerial vehicle and the hovering power of the unmanned aerial vehicle;

step 2.2: initializing an experience replay pool D for storing the data generated by the unmanned aerial vehicle during trial and error; randomly initializing an Actor real network μ and a Critic real network Q, whose parameters are θ^μ and θ^Q respectively; initializing an Actor target network μ′ and a Critic target network Q′, whose parameters are θ^μ′ and θ^Q′ respectively, and setting θ^μ′ = θ^μ, θ^Q′ = θ^Q;

step 2.3: recording the initial state quantity s₁ and generating a random noise N that follows a Gaussian distribution;

step 2.4: inputting the state quantity x_t at the current time into the Actor real network with parameters θ^μ and adding the random noise N_t of the current time; the Actor real network outputting the action quantity a_t = μ(x_t, θ^μ) + N_t; executing the action, calculating the reward r_t obtained by the action through the reward function, and updating the state quantity to obtain x_{t+1};

step 2.5: creating the quadruple <x_t, a_t, r_t, x_{t+1}> and storing it in the experience replay pool D;

step 2.6: randomly sampling a group of data <x_j, a_j, r_j, x_{j+1}> from the experience replay pool D; inputting x_j and a_j into the Critic real network to obtain Q = Q(x_j, a_j, θ^Q); inputting x_{j+1} into the Actor target network to calculate the action quantity a_{j+1} = μ′(x_{j+1}, θ^μ′), and inputting x_{j+1} and a_{j+1} together into the Critic target network to obtain Q′(x_{j+1}, a_{j+1}, θ^Q′), the target value then being r_j + γQ′(x_{j+1}, a_{j+1}, θ^Q′); using this target value as the label, training the Critic real network so that the computed Q value approaches the target value as closely as possible, and updating θ^Q by gradient descent;

step 2.7: updating the Actor real network so that the action quantity it outputs attains the maximum Q value computed by the Critic real network; updating θ^μ by gradient descent along the policy gradient;

step 2.8: updating the target network parameters: θ^μ′ ← αθ^μ′ + (1-α)θ^μ, θ^Q′ ← αθ^Q′ + (1-α)θ^Q;

step 2.9: repeating steps 2.3 to 2.8 until the loss values of the Actor target network and the Critic target network converge and the network parameters no longer change; after the networks converge, for any two wind turbines x_i, x_j, the deep reinforcement learning model gives the flight path with the minimum energy consumption of the unmanned aerial vehicle and the corresponding energy consumption E_ij.

4. The unmanned aerial vehicle-coordinated wind farm intelligent inspection method according to claim 1 or 2, wherein in the third step, the specific steps of the simulated annealing algorithm are as follows:

Step 3.1: determining the starting position of the unmanned aerial vehicle according to the wind rose diagram of the current day; if the current wind power plant has a dominant wind direction, setting the starting position of the unmanned aerial vehicle to the corner wind turbine opposite to the dominant wind direction; if the current dominant wind direction is not obvious, setting the starting position of the unmanned aerial vehicle to the wind turbine at the central position;

Step 3.2: training the two neural networks on the current day; according to the playback experience pool in step two and the wind direction and wind speed given by the next-day weather forecast, determining the flight path and flight energy consumption between any two wind turbine generators $x_i$, $x_j$ within the maximum cruising radius of the unmanned aerial vehicle;

Step 3.3: starting from the starting position, sequentially selecting the lowest-energy-consumption flight path within the cruising radius of the unmanned aerial vehicle until all wind turbines are traversed, and taking the result as the initialization path $c$; at the same time, initializing the start temperature $T$, the end temperature $T_0$, and the annealing speed $\alpha$;

Step 3.4: generating another path $c'$ in the neighborhood of $c$ by random thermal perturbation;

Step 3.5: calculating the difference $\Delta E$ in the energy consumption of the unmanned aerial vehicle between the two paths $c$ and $c'$; if $\Delta E \le 0$, updating the path by setting $c = c'$ and $T \leftarrow \alpha T$; otherwise, generating a random number $rand$ between 0 and 1, and if $\exp(-\Delta E / T) > rand$, updating the path by setting $c = c'$ and $T \leftarrow \alpha T$;

Step 3.6: judging whether $T > T_0$; if so, continuing to execute step 3.4; otherwise, obtaining a near-optimal solution of the inspection route with the lowest energy consumption, and the unmanned aerial vehicle sequentially overhauling the wind turbines and acquiring data according to the inspection route;

Step 3.7: judging whether the inspection route needs to be updated based on real-time meteorological data; when the real-time wind direction detected by the wind sensor in the wind power plant is not in the same direction as the wind direction of the weather forecast, or the real-time wind speed differs from the forecast wind speed by more than two levels, counting the wind turbine generators not yet traversed, $X' = \{x_1, x_2, \ldots, x_{n'}\}$, executing step 3.2 again, and replanning the lowest-energy-consumption flight path of the unmanned aerial vehicle over the remaining wind turbine generators; the algorithm ends when the unmanned aerial vehicle has traversed all the wind turbine generators.
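The path construction and annealing loop of steps 3.3 to 3.6 can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the claimed implementation. The energy matrix `E` is made-up data standing in for the $E_{ij}$ values from the learned model, with `math.inf` marking pairs outside the cruising radius. The swap move in `neighbor` is one possible reading of "random thermal perturbation". The acceptance test is the standard Metropolis form $\exp(-\Delta E / T) > rand$, and `max_iters` is a safety cap added here that the text does not mention.

```python
import math
import random

def path_energy(path, energy):
    """Total energy along consecutive legs of the path."""
    return sum(energy[a][b] for a, b in zip(path, path[1:]))

def greedy_init_path(energy, start):
    """Step 3.3: repeatedly fly the cheapest leg to an unvisited turbine."""
    path, visited = [start], {start}
    while len(path) < len(energy):
        cur = path[-1]
        nxt = min((j for j in range(len(energy)) if j not in visited),
                  key=lambda j: energy[cur][j])
        if math.isinf(energy[cur][nxt]):
            raise ValueError("no unvisited turbine within the cruising radius")
        path.append(nxt)
        visited.add(nxt)
    return path

def neighbor(path, rng):
    """Step 3.4 (assumed move): swap two positions, keeping the start fixed."""
    i, j = rng.sample(range(1, len(path)), 2)
    new = list(path)
    new[i], new[j] = new[j], new[i]
    return new

def anneal(path, energy, T, T0, alpha, rng, max_iters=100_000):
    """Steps 3.4 to 3.6; max_iters is a safety cap that is not in the source."""
    c = list(path)
    for _ in range(max_iters):
        if T <= T0:
            break
        c_new = neighbor(c, rng)
        dE = path_energy(c_new, energy) - path_energy(c, energy)
        # Accept improvements outright, worse paths with probability exp(-dE / T).
        if dE <= 0 or math.exp(-dE / T) > rng.random():
            c, T = c_new, alpha * T  # cooling happens only on an accepted update
    return c

INF = math.inf
E = [[0, 2, 9, INF],
     [2, 0, 4, 3],
     [9, 4, 0, 1],
     [INF, 3, 1, 0]]
c0 = greedy_init_path(E, start=0)
route = anneal(c0, E, T=10.0, T0=1.0, alpha=0.9, rng=random.Random(0))
```

Since the temperature drops only when a path is accepted, a run that sits in a local minimum at low temperature can take many iterations to finish, which is why the sketch caps the loop.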

5. The unmanned aerial vehicle-coordinated wind farm intelligent inspection method according to claim 3, wherein in the third step, the specific steps of the simulated annealing algorithm are as follows:

Step 3.3: starting from the starting position, sequentially selecting the lowest-energy-consumption flight path within the cruising radius of the unmanned aerial vehicle until all wind turbines are traversed, and taking the result as the initialization path $c$; at the same time, initializing the start temperature $T$, the end temperature $T_0$, and the annealing speed $\alpha$;

Updating the path by setting $c = c'$ and $T \leftarrow \alpha T$;

Step 3.6: judging whether $T > T_0$; if so, continuing to execute step 3.4; otherwise, obtaining a near-optimal solution of the inspection route with the lowest energy consumption, and the unmanned aerial vehicle sequentially overhauling the wind turbines and acquiring data according to the inspection route;

Step 3.7: judging whether the inspection route needs to be updated based on real-time meteorological data; when the real-time wind direction detected by the wind sensor in the wind power plant is not in the same direction as the wind direction of the weather forecast, or the real-time wind speed differs from the forecast wind speed by more than two levels, counting the wind turbine generators not yet traversed, $X' = \{x_1, x_2, \ldots, x_{n'}\}$, executing step 3.2 again, and replanning the lowest-energy-consumption flight path of the unmanned aerial vehicle over the remaining wind turbine generators; the algorithm ends when the unmanned aerial vehicle has traversed all the wind turbine generators.
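The replanning trigger of step 3.7 reduces to a small predicate plus a filter over the turbines already visited. The sketch below is hedged: it assumes wind direction is reported as a compass-sector label and wind speed as an integer wind-force level, neither of which the text specifies, and the function and variable names are hypothetical.

```python
def needs_replan(real_dir, forecast_dir, real_level, forecast_level):
    """True when the direction differs or the speed differs by more than two levels."""
    return real_dir != forecast_dir or abs(real_level - forecast_level) > 2

def remaining_turbines(all_turbines, visited):
    """X': turbines not yet traversed, kept in their original order."""
    seen = set(visited)
    return [t for t in all_turbines if t not in seen]

turbines = ["x1", "x2", "x3", "x4", "x5"]
visited = ["x1", "x2"]
todo = []
if needs_replan("NW", "NW", real_level=6, forecast_level=3):
    # The remaining set would be fed back into step 3.2 for a fresh plan.
    todo = remaining_turbines(turbines, visited)
```

A difference of exactly two levels does not trigger replanning under this reading, since the text requires a difference of more than two levels.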