CN112272277B - Voice adding method and device in nuclear power test and computer equipment

Voice adding method and device in nuclear power test and computer equipment

Info

Publication number
CN112272277B
CN112272277B (application number CN202011145835.8A)
Authority
CN
China
Prior art keywords
test
voice
video
target
video stream
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011145835.8A
Other languages
Chinese (zh)
Other versions
CN112272277A (en)
Inventor
刘爱东
林伟
徐鸿威
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China General Nuclear Power Corp
CGN Power Co Ltd
Daya Bay Nuclear Power Operations and Management Co Ltd
Lingdong Nuclear Power Co Ltd
Guangdong Nuclear Power Joint Venture Co Ltd
Lingao Nuclear Power Co Ltd
Original Assignee
China General Nuclear Power Corp
CGN Power Co Ltd
Daya Bay Nuclear Power Operations and Management Co Ltd
Lingdong Nuclear Power Co Ltd
Guangdong Nuclear Power Joint Venture Co Ltd
Lingao Nuclear Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China General Nuclear Power Corp, CGN Power Co Ltd, Daya Bay Nuclear Power Operations and Management Co Ltd, Lingdong Nuclear Power Co Ltd, Guangdong Nuclear Power Joint Venture Co Ltd, Lingao Nuclear Power Co Ltd filed Critical China General Nuclear Power Corp
Priority to CN202011145835.8A priority Critical patent/CN112272277B/en
Publication of CN112272277A publication Critical patent/CN112272277A/en
Application granted granted Critical
Publication of CN112272277B publication Critical patent/CN112272277B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
    • H04N5/265Mixing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/7867Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title and artist information, manually generated time, location and usage information, user ratings
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/04Segmentation; Word boundary detection
    • GPHYSICS
    • G21NUCLEAR PHYSICS; NUCLEAR ENGINEERING
    • G21CNUCLEAR REACTORS
    • G21C17/00Monitoring; Testing ; Maintaining
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02EREDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E30/00Energy generation of nuclear origin
    • Y02E30/30Nuclear fission reactors

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Plasma & Fusion (AREA)
  • High Energy & Nuclear Physics (AREA)
  • Monitoring And Testing Of Nuclear Reactors (AREA)

Abstract

The application relates to a voice adding method and device in a nuclear power test, and to computer equipment. The method comprises the following steps: when test video streams are received, acquiring a target test identification code associated with the current nuclear power test; screening, from the test video streams based on the target test identification code, candidate test video streams recorded during the current nuclear power test; displaying the candidate test video streams and, when a voice interaction instruction is received, screening from them the target test video stream to which the instruction points; calling a voice acquisition device to acquire a live voice stream based on the voice interaction instruction; performing silence detection on the live voice stream and dividing it according to the detection result to obtain corresponding live voice fragments; and adding the live voice fragments to the target test video stream at the corresponding positions. By adopting this method, the voice information of the manager can be added to the test video.

Description

Voice adding method and device in nuclear power test and computer equipment
Technical Field
The application relates to the technical field of nuclear power informatization construction, in particular to a voice adding method, a device and computer equipment in a nuclear power test.
Background
A nuclear power plant generates electricity from the heat produced by nuclear fuel in a nuclear reactor. To ensure the basic safety of the plant, testers regularly perform nuclear power tests on its equipment. When nuclear power equipment needs to be tested, a tester can carry a portable video recorder and use it to record the test site, so that a manager can remotely supervise the tester based on the recorded test video.
So as not to disturb others during the nuclear power test, when a manager remotely guides a tester through the test operations, the tester receives the manager's voice through headphones. As a result, the test video recorded by the video recorder may contain only the tester's voice and not the manager's, so the manager's guidance cannot be heard when the recorded test video is replayed later. A method capable of adding the manager's voice information to the test video is therefore urgently needed.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a voice adding method, apparatus, computer device, and storage medium capable of adding a manager's voice information to a test video in a nuclear power test.
A method of voice addition in a nuclear power test, the method comprising:
when a test video stream is received, acquiring a target test identification code associated with a current nuclear power test;
screening candidate test video streams for video recording in the current nuclear power test process from the test video streams based on the target test identification codes;
displaying the candidate test video streams, and when a voice interaction instruction is received, screening the target test video stream pointed to by the voice interaction instruction from the candidate test video streams;
based on the voice interaction instruction, calling a voice acquisition device to acquire an on-site voice stream;
performing silence detection on the live voice stream, and dividing the live voice stream according to a detection result of the silence detection to obtain a corresponding live voice fragment;
and when, within the duration of the live voice fragment starting from its acquisition time, the target test video stream contains no original sound fragment whose sound amplitude exceeds a first amplitude, adding the live voice fragment to the target test video stream at the corresponding position.
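The steps above — silence detection, division of the live voice stream into fragments, and the first-amplitude check before adding — can be sketched roughly as follows. The amplitude-sample representation, thresholds, and function names are illustrative assumptions, not part of the claimed method.

```python
def split_on_silence(samples, silence_thresh=0.02, min_silence_len=5):
    """Divide a live voice stream (a list of amplitude samples) into voiced
    fragments, using runs of low-amplitude samples as fragment boundaries."""
    segments, current = [], []
    silent_run = 0
    for s in samples:
        if abs(s) < silence_thresh:
            silent_run += 1
            # A long enough silent run closes the current fragment.
            if silent_run >= min_silence_len and current:
                segments.append(current)
                current = []
        else:
            silent_run = 0
            current.append(s)
    if current:
        segments.append(current)
    return segments

def can_overlay(original_window, first_amplitude=0.5):
    """The live voice fragment is added only when the overlapping window of the
    target test video stream has no original sound exceeding the first amplitude."""
    return all(abs(s) <= first_amplitude for s in original_window)
```

In practice the samples would come from decoded audio frames and the fragment boundaries would carry timestamps so the fragment can be placed at the corresponding position in the video stream.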
In one embodiment, the method further comprises:
when, within the duration of the live voice fragment starting from its acquisition time, the target test video stream does contain an original sound fragment whose sound amplitude exceeds the first amplitude, extracting a first video frame at the current time from the target test video stream and second video frames from the remaining candidate test video streams other than the target test video stream;
performing image matching on the first video frame and the second video frame;
and when the second video frame has the target second video frame successfully matched with the first video frame, adding the live voice fragment into a candidate test video stream containing the target second video frame.
In one embodiment, the test video stream is acquired by a video recorder; the step of screening candidate test video streams for video recording in the current nuclear power test process from the test video streams based on the target test identification codes comprises the following steps:
receiving a test identification code and a local identification code sent by the video recorder;
based on the test identification code and the local identification code, establishing a mapping relation between the video recorder and the test identification code;
according to the mapping relation and the target test identification code, determining a candidate video recorder for video recording in the test process of the current nuclear power test;
and taking the test video stream output by the candidate video recorder as a candidate test video stream.
In one embodiment, the test video stream is acquired by a video recorder; the test video stream is embedded with a test identification code; the test identification code in the test video stream is obtained by scanning the image codes by a video recorder;
the step of screening candidate test video streams for video recording in the current nuclear power test process from the test video streams based on the target test identification codes comprises the following steps:
and setting the test video stream embedded with the target test identification code as a candidate test video stream.
In one embodiment, displaying the candidate test video streams includes:
acquiring test information of a current nuclear power test, and intercepting test video fragments within a preset time length from the candidate test video stream; the test information comprises test procedures included in the current nuclear power test, and key equipment parts corresponding to the test procedures respectively;
traversing at least one video frame in the test video clip and identifying a current operating equipment part in each traversed video frame;
when the current operation equipment part in the traversed video frame belongs to the key equipment part, determining the traversed video frame as a target video frame;
screening out target key equipment parts corresponding to the current operation equipment parts in the target video frame from the key equipment parts corresponding to the test procedures;
and determining a current test procedure based on the target key equipment part, and displaying the procedure identification of the current test procedure and the candidate test video stream correspondingly.
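A minimal sketch of the procedure-identification step described above, assuming test information that maps each procedure identification to a set of key equipment parts (all identifiers here are hypothetical):

```python
# Hypothetical test information: procedure identification -> key equipment parts.
TEST_INFO = {
    "P01": {"valve-A", "pump-1"},
    "P02": {"breaker-3"},
}

def identify_procedure(detected_parts, test_info=TEST_INFO):
    """Scan the operating equipment parts identified in the traversed video
    frames; the first part that belongs to a procedure's key equipment parts
    marks that procedure as the current test procedure."""
    for part in detected_parts:
        for proc_id, key_parts in test_info.items():
            if part in key_parts:
                # Return the current procedure and the target key equipment part.
                return proc_id, part
    return None, None
```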
In one embodiment, displaying the procedure identification of the current test procedure correspondingly with the candidate test video stream includes:
when multiple paths of candidate test video streams exist, extracting target video frames in test fragments corresponding to the candidate test video streams;
determining the completeness of the currently operated equipment part in each extracted target video frame and the completeness of the hand in contact with that part;
scoring the target video frame in each test video segment according to the completeness of the equipment part and of the hand;
determining a primary test video stream and secondary test video streams among the multiple candidate test video streams according to the image scores;
displaying the primary test video stream and the procedure identification of the current test procedure at a preset primary location in the screen, and displaying the secondary test video streams at preset secondary locations in the screen.
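The scoring and primary/secondary selection above might be sketched as follows; the weighting of part completeness against hand completeness is an illustrative assumption, as the claims do not fix a scoring formula.

```python
def score_frame(part_completeness, hand_completeness, w_part=0.6, w_hand=0.4):
    """Weighted image score from the completeness of the operated equipment
    part and of the hand touching it (weights are assumed, not claimed)."""
    return w_part * part_completeness + w_hand * hand_completeness

def split_primary_secondary(stream_scores):
    """Rank candidate streams by image score: the best becomes the primary
    stream shown at the preset primary screen location, the rest secondary."""
    ranked = sorted(stream_scores, key=lambda kv: kv[1], reverse=True)
    primary = ranked[0][0]
    secondary = [stream_id for stream_id, _ in ranked[1:]]
    return primary, secondary
```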
In one embodiment, the adding the live speech segment to the target test video stream includes:
performing sound spectrum analysis on the on-site voice fragment to obtain a corresponding spectrogram; the spectrogram comprises sound amplitude values of the on-site voice fragments;
when the sound amplitude value in the on-site voice segment exceeds a second amplitude value, recognizing the on-site voice segment based on a pre-trained voice recognition model to obtain a voice subtitle;
and correspondingly adding the voice subtitle and the onsite voice fragment into the target test video stream.
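A sketch of the amplitude-gated subtitle step above, with a stand-in callable in place of the pre-trained speech recognition model; peak amplitude is used here as a crude proxy for the claimed sound spectrum analysis:

```python
def peak_amplitude(samples):
    """Crude stand-in for spectrum analysis: the peak absolute amplitude
    of a live voice fragment."""
    return max(abs(s) for s in samples)

def maybe_subtitle(samples, recognize, second_amplitude=0.3):
    """Run speech recognition (the `recognize` callable stands in for the
    pre-trained model) only when the fragment exceeds the second amplitude;
    otherwise no voice subtitle is produced."""
    if peak_amplitude(samples) > second_amplitude:
        return recognize(samples)
    return None
```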
A voice adding device in a nuclear power test, the device comprising:
the display module is used for acquiring a target test identification code associated with the current nuclear power test when a test video stream is received; screening, from the test video streams based on the target test identification code, candidate test video streams recorded during the current nuclear power test; and displaying the candidate test video streams and, when a voice interaction instruction is received, screening from them the target test video stream to which the voice interaction instruction points;
the voice stream acquisition module is used for calling the voice acquisition device to acquire the on-site voice stream based on the voice interaction instruction;
the voice adding module is used for performing silence detection on the on-site voice stream and dividing it according to the detection result to obtain corresponding on-site voice fragments; and when, within the duration of a live voice fragment starting from its acquisition time, the target test video stream contains no original sound fragment whose sound amplitude exceeds the first amplitude, adding the live voice fragment to the target test video stream at the corresponding position.
A computer device comprising a memory storing a computer program and a processor which when executing the computer program performs the steps of:
screening candidate test video streams for video recording in the current nuclear power test process from the test video streams based on the target test identification codes;
displaying the candidate test video streams, and when a voice interaction instruction is received, screening the target test video stream pointed to by the voice interaction instruction from the candidate test video streams;
based on the voice interaction instruction, calling a voice acquisition device to acquire an on-site voice stream;
performing silence detection on the live voice stream, and dividing the live voice stream according to a detection result of the silence detection to obtain a corresponding live voice fragment;
and when, within the duration of the live voice fragment starting from its acquisition time, the target test video stream contains no original sound fragment whose sound amplitude exceeds a first amplitude, adding the live voice fragment to the target test video stream at the corresponding position.
A computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of:
screening candidate test video streams for video recording in the current nuclear power test process from the test video streams based on the target test identification codes;
displaying the candidate test video streams, and when a voice interaction instruction is received, screening the target test video stream pointed to by the voice interaction instruction from the candidate test video streams;
based on the voice interaction instruction, calling a voice acquisition device to acquire an on-site voice stream;
performing silence detection on the live voice stream, and dividing the live voice stream according to a detection result of the silence detection to obtain a corresponding live voice fragment;
and when, within the duration of the live voice fragment starting from its acquisition time, the target test video stream contains no original sound fragment whose sound amplitude exceeds a first amplitude, adding the live voice fragment to the target test video stream at the corresponding position.
According to the voice adding method, apparatus, computer device, and storage medium in a nuclear power test described above, by acquiring the target test identification code associated with the current nuclear power test, the candidate test video streams recorded for that test can be screened out from the received test video streams based on the target test identification code. Once the candidate test video streams are determined, they can be displayed correspondingly, so that the target test video stream requiring voice interaction can be chosen from the displayed streams and a corresponding voice interaction instruction generated for it. With the voice interaction instruction generated, the live voice stream can be acquired and automatically added to the target test video stream, so that when the test video is replayed later, the manager's voice information is replayed along with it.
In addition, by judging whether the target test video stream contains an original sound fragment whose sound amplitude exceeds the first amplitude, the situation in which the original sound masks the live voice fragment so that it cannot be heard on playback can be avoided.
Drawings
FIG. 1 is an application environment diagram of a method of voice addition in a nuclear power experiment in one embodiment;
FIG. 2 is a flow chart of a method of voice addition in a nuclear power test in one embodiment;
FIG. 3 is a schematic illustration of candidate test video streams in one embodiment;
FIG. 4 is a flow diagram of the step of displaying candidate test video streams in one embodiment;
FIG. 5 is a block diagram of a voice adding device in a nuclear power test in one embodiment;
FIG. 6 is a block diagram of a voice adding device in a nuclear power test in another embodiment;
fig. 7 is an internal structural diagram of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
The voice adding method in the nuclear power test provided by the application can be applied to the application environment shown in fig. 1, in which the video recorder 102 communicates with the terminal 104 via a network. The video recorder 102 records video of the test site of a nuclear power test and sends the recorded test video stream to the terminal 104, which displays it correspondingly. When receiving a voice interaction instruction triggered by the user, the terminal 104 determines the target test video stream to which the instruction points and adds the acquired live voice stream to that stream. The terminal 104 may be, but is not limited to, a personal computer, notebook computer, smart phone, tablet computer, or portable wearable device.
In one embodiment, as shown in fig. 2, a method for adding voice in a nuclear power test is provided. The method is described here, by way of illustration, as applied to the terminal in fig. 1, and includes the following steps:
step S202, when a test video stream is received, a target test identification code associated with a current nuclear power test is acquired.
The test identification code is an identification code obtained by encoding test information of the current nuclear power test before the current nuclear power test is executed. The test information is information related to the nuclear power test, and the test information can comprise test procedures included in the current nuclear power test, key equipment parts corresponding to the test procedures, test areas of the current nuclear power test and the like. A complete nuclear power test may include multiple test procedures, and a test procedure may include one or more executable steps. The manager may divide each executable step in the nuclear power test in advance, thereby obtaining a plurality of test procedures.
Specifically, when a nuclear power test needs to be performed, a tester can set up video recorders in the test area, record the test process with them to obtain test video streams, and send the streams to the terminal of the manager in real time. The tester is the field operator performing the nuclear power test in the test area, and the manager is the person remotely guiding the tester's operations. Because testers vary in skill and working experience and may lack deep knowledge of the behavior and history of the nuclear power equipment, while managers with long working experience and such deep knowledge are limited in number, the test process is recorded by the video recorders set up in the test area and the test video is sent to the manager's terminal, so that the manager can supervise the test area of the current nuclear power test through the video and remotely guide the testers performing it. It is easy to understand that one manager can monitor the test areas of several nuclear power tests at the same time, and thus guide several tests simultaneously.
Because several nuclear power tests may be carried out in a nuclear power plant at the same time, and several video recorders may be set up in the test area of each test, when a manager needs to remotely monitor the current nuclear power test, the test identification code corresponding to it can be input into the terminal, so that the terminal can screen the candidate test video streams recorded during the current test from the multiple received test video streams according to this target test identification code. The current nuclear power test is the nuclear power test that the current manager needs to monitor remotely.
In one embodiment, prior to acquiring the target test identification code associated with the current nuclear power test, the method further comprises a step of generating that code: determining the component parts of the test information of the current nuclear power test and the encoding mode corresponding to each component part; encoding each component part according to its encoding mode to obtain a test sub-identification code; and determining the total number of test sub-identification codes and generating the corresponding test identification code based on the total number and each test sub-identification code.
In the process of encoding the test information of the nuclear power test to generate the corresponding test identification code, the terminal can determine the component parts of the test information, encode each component part with a different encoding mode to obtain the test sub-identification code corresponding to each part, and combine the sub-identification codes into the test identification code. For example, when the test information comprises the test name of the nuclear power test and the work order identifier of the test work order, the terminal encodes the test name according to encoding rule A and prefixes the encoded name with the parsing serial number of rule A to obtain the first test sub-identification code; it encodes the work order identifier according to encoding rule B and prefixes it with the parsing serial number of rule B to obtain the second test sub-identification code; it then combines the two sub-identification codes into the test identification code and adds the total number of sub-identification codes to it.
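The sub-identification-code scheme in the example above could be sketched as follows; the concrete field layout (serial-number prefixes, hex encoding, `|` separators) is an assumption for illustration only:

```python
def encode_component(value, rule_serial):
    """Hypothetical per-component encoding: hex-encode the text and prefix
    the parsing serial number of the encoding rule, so a decoder knows
    which rule to apply when parsing the sub-identification code."""
    return f"{rule_serial}:{value.encode('utf-8').hex()}"

def build_test_identification_code(test_name, work_order_id):
    """Combine the test sub-identification codes and prepend their total
    number, mirroring the described scheme."""
    subs = [encode_component(test_name, "A"), encode_component(work_order_id, "B")]
    return f"{len(subs)}|" + "|".join(subs)
```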
Step S204, screening candidate test video streams for video recording in the current nuclear power test process from the test video streams based on the target test identification codes.
Specifically, because a test identification code is associated with a nuclear power test, the candidate test video streams recording the current nuclear power test can be screened out from the multiple test video streams based on the target test identification code.
In one embodiment, the test video stream is acquired by a video recorder, and screening, based on the target test identification code, candidate test video streams recorded during the current nuclear power test from the test video streams comprises: receiving the test identification code and local identification code sent by the video recorder; establishing a mapping relation between the video recorder and the test identification code based on the test identification code and the local identification code; determining, according to the mapping relation and the target test identification code, candidate video recorders recording the test process of the current nuclear power test; and taking the test video streams output by the candidate video recorders as candidate test video streams.
Specifically, before a nuclear power test is performed, the terminal can generate the corresponding test identification code from the test information and encode it into a corresponding image code. For example, the terminal generates a numeric test identification code from the test name and test area of the nuclear power test and converts it into a two-dimensional code in image form. When a tester needs to perform a nuclear power test, the tester can go to a designated place to collect a video recorder and scan the image code corresponding to the test identification code with it, thereby binding the recorder to the test identification code. The recorder then obtains its preset local identification code and uploads the bound test identification code together with the local identification code to the terminal, and the terminal establishes a mapping relation between the recorder and the test identification code from the received codes. The local identification code is the recorder's own identification code and uniquely identifies one video recorder.
When the terminal receives the target test identification code associated with the current nuclear power test, it determines the candidate video recorders recording the test process of the current nuclear power test according to the mapping relation between video recorders and test identification codes, and sets the video streams sent by the candidate video recorders as candidate test video streams.
Because the video recorder only needs to scan the corresponding image code to quickly acquire the test identification code, the nuclear power test, the video recorder, and the test video stream it outputs can be bound on the basis of the test identification code alone, which greatly improves binding efficiency. In addition, by establishing the mapping relation between the video recorder and the test identification code, the candidate test video streams can be determined rapidly from the mapping relation, improving the efficiency of determining them.
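The terminal-side mapping between video recorders and test identification codes, and the selection of candidate recorders from it, can be sketched as (class and method names are hypothetical):

```python
class RecorderRegistry:
    """Minimal sketch of the terminal-side mapping from recorder local
    identification codes to bound test identification codes."""

    def __init__(self):
        self.recorder_to_test = {}

    def register(self, local_id, test_code):
        # Called when a recorder uploads its bound test identification code
        # together with its own local identification code.
        self.recorder_to_test[local_id] = test_code

    def candidate_recorders(self, target_test_code):
        # Recorders bound to the target code are filming the current test;
        # their output streams become the candidate test video streams.
        return [r for r, c in self.recorder_to_test.items() if c == target_test_code]
```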
In one embodiment, the test video stream is acquired by a video recorder, a test identification code is embedded in the test video stream, and the embedded code is obtained by the video recorder scanning an image code; screening, based on the target test identification code, candidate test video streams recorded during the current nuclear power test from the test video streams comprises: setting the test video streams embedded with the target test identification code as candidate test video streams.
Specifically, when the video recorder obtains the corresponding test identification code by scanning the image code, it embeds the scanned test identification code into the test video stream it acquires and transmits the stream, with the code embedded, to the terminal; the terminal compares the embedded test identification code with the target test identification code and takes the test video stream embedded with the target code as a candidate test video stream. The way the recorder embeds the scanned test identification code in the stream can be set freely as required: for example, the code can be embedded in the video frames as a watermark, or embedded in the name of the test video stream, with the stream and its name transmitted to the terminal together.
By embedding the test identification code in the test video stream, the terminal can quickly determine the candidate test video streams simply by checking the test identification code in each received stream, which improves the efficiency of determining them.
Step S206, displaying the candidate test video streams, and when a voice interaction instruction is received, screening the target test video stream pointed to by the voice interaction instruction from the candidate test video streams.
Specifically, to facilitate the manager's supervision of the testers' operation steps based on the candidate test video streams, the terminal may present the candidate test video streams on the local screen once they are determined. Referring to fig. 3, when there are multiple candidate test video streams, the terminal may divide the local screen into a plurality of presentation areas as shown in fig. 3 and present each candidate test video stream in its corresponding presentation area. FIG. 3 is a schematic illustration of candidate test video streams in one embodiment.
Because of the complexity of a nuclear power plant, it may include multiple equipment parts, and multiple testers may be required to operate those equipment parts simultaneously when performing the same nuclear power test procedure, so the test processes of the multiple testers are recorded by multiple video recorders; that is, each tester may have a corresponding video recorder. For example, when nuclear power plant a includes equipment part a1 and equipment part a2, and the nuclear power test step designed for plant a requires operating a1 and a2 simultaneously, tester b1 is required to carry video recorder c1 to the area where a1 is located so that c1 records b1's operation steps, and tester b2 is required to carry video recorder c2 to the area where a2 is located so that c2 records b2's operation steps. The manager can then remotely monitor testers b1 and b2 through the candidate test video streams sent by video recorders c1 and c2.
When the manager needs to interact with a tester by voice, the target test video stream recorded for that tester can be selected from the multiple candidate test video streams displayed on the screen and triggered, so that the terminal generates a voice interaction instruction based on the manager's triggering operation. For example, continuing the above example and referring to fig. 3, when the manager determines that voice interaction with tester b1 is required, the manager may click the test video stream sent by video recorder c1 on the screen; the terminal then displays a call control according to the clicking operation, and upon determining that the manager has clicked the call control, generates the voice interaction instruction according to the video identifier of the test video stream sent by video recorder c1.
Further, the terminal comprises a voice adding module, when the voice interaction instruction is generated, the terminal can send the voice interaction instruction to the voice adding module, so that the voice adding module can determine a target test video stream pointed by the voice interaction instruction based on the video identification contained in the voice interaction instruction.
In one embodiment, before receiving the voice interaction instruction, the method further comprises: when the triggering operation is detected, determining the position coordinates pointed by the triggering operation; determining a test video stream pointed by the triggering operation through the position coordinates; and taking the test video stream pointed by the triggering operation as a target test video stream, and generating a corresponding voice interaction instruction according to the video identification of the target test video stream.
When the terminal detects the triggering operation of the manager on the screen, the terminal determines the position coordinates in the screen pointed by the triggering operation, determines a display area through the position coordinates, and takes the candidate test video stream displayed in the display area as a target test video stream. The terminal obtains the video identification of the target test video stream, and generates a corresponding voice interaction instruction according to the video identification.
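The coordinate-to-stream mapping above can be sketched as a simple grid hit test. This is an illustrative assumption (an evenly divided grid of presentation areas, as in fig. 3); the actual layout of presentation areas may differ.

```python
# Illustrative sketch: map the position coordinates of a triggering
# operation to the presentation area, and thus to the candidate test
# video stream shown there. Grid layout and stream ids are assumptions.

def area_for_point(x, y, screen_w, screen_h, cols, rows):
    """Return the index of the presentation area containing (x, y)."""
    col = min(int(x / (screen_w / cols)), cols - 1)
    row = min(int(y / (screen_h / rows)), rows - 1)
    return row * cols + col

def target_stream(x, y, screen_w, screen_h, cols, rows, streams):
    """Map a click position to the candidate stream in that area; the
    caller then generates the voice interaction instruction from the
    video identifier of the returned stream."""
    return streams[area_for_point(x, y, screen_w, screen_h, cols, rows)]
```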
Step S208, calling a voice acquisition device to acquire the on-site voice stream based on the voice interaction instruction.
Step S210, performing silence detection on the live voice stream, and dividing the live voice stream according to the detection result of the silence detection to obtain a corresponding live voice segment.
Specifically, the terminal invokes the local voice acquisition equipment according to the voice interaction instruction to acquire the live voice stream, namely the manager's voice stream. Because silence segments of a certain duration occur between the sentences spoken by the manager, the terminal can detect the silence segments in the live voice stream and segment the stream at those silence segments to obtain at least one live voice segment, so that a single live voice segment records one sentence spoken by the manager.
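A minimal energy-based sketch of this silence-detection segmentation: frames whose peak amplitude stays below a threshold for a minimum run of frames are treated as a silence segment, and the stream is cut there. The threshold, frame length, and run length are illustrative assumptions, not values from the embodiment.

```python
# Hypothetical silence-based segmentation of a live voice stream.
def split_on_silence(samples, frame_len, threshold, min_silent_frames):
    """Split a sample sequence into voiced segments at silent runs."""
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples), frame_len)]
    is_silent = [max(abs(s) for s in f) < threshold for f in frames]
    segments, current, silent_run = [], [], 0
    for frame, silent in zip(frames, is_silent):
        if silent:
            silent_run += 1
            # a long enough silent run closes the current voiced segment
            if silent_run >= min_silent_frames and current:
                segments.append(current)
                current = []
        else:
            silent_run = 0
            current.extend(frame)
    if current:
        segments.append(current)
    return segments
```

Each returned segment then plays the role of one live voice segment, ideally one sentence spoken by the manager.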
Step S212, when no original sound fragment whose sound amplitude exceeds the first amplitude is included in the target test video stream within the duration of the live voice fragment starting from the acquisition time of the live voice fragment, adding the live voice fragment correspondingly to the target test video stream.
The original sound fragment is the sound information originally carried in the test video stream. The video recorder records not only images of the test site but also its sound, so the target test video stream may carry original sound information.
Specifically, the terminal determines the acquisition time of the live voice segment and judges whether an original sound segment whose sound amplitude exceeds the first amplitude is present in the target test video stream within the duration of the live voice segment starting from that acquisition time. For example, because abnormal noise may occur at the nuclear power test site while the manager is speaking, a live voice segment added to the target test video stream might be covered by that noise. Therefore, when the terminal acquires the current live voice segment at 10:00, it needs to determine whether the target test video stream contains an original sound segment with a sound amplitude exceeding the first amplitude within the duration of the live voice segment starting from 10:00. If the target test video stream contains no such original sound segment, the terminal adds the live voice segment to the target test video stream with the acquisition time point as the starting point.
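The amplitude check and overlay described above can be sketched as follows. This is a hedged sketch over per-sample amplitude arrays; the representation of the original sound track and the overlay policy (replacing samples rather than mixing) are illustrative assumptions.

```python
# Hypothetical sketch of the first-amplitude check before adding the
# live voice segment to the target test video stream.

def can_overlay(original_amplitudes, start, duration, first_amplitude):
    """True when the window [start, start+duration) of the original
    sound stays at or below the first amplitude threshold."""
    window = original_amplitudes[start:start + duration]
    return all(a <= first_amplitude for a in window)

def add_voice(video_audio, voice, start, first_amplitude):
    """Overlay the live voice segment onto a copy of the original track
    starting at the acquisition time, if the window is quiet enough."""
    if not can_overlay(video_audio, start, len(voice), first_amplitude):
        return None  # covered by on-site noise; caller falls back to subtitles
    mixed = list(video_audio)
    for i, v in enumerate(voice):
        mixed[start + i] = v
    return mixed
```

Returning `None` corresponds to the embodiment's fallback of recognizing the live voice segment into a subtitle instead.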
It is easy to understand that, because the time delay between the live voice stream and the target test video stream is short, the live voice stream received at the current moment can be added to the target test video stream received at the current moment.
In one embodiment, if the target test video stream has an original sound segment with a sound amplitude exceeding the first amplitude, the terminal recognizes the live voice segment based on a pre-trained voice recognition model to obtain a voice subtitle, and adds the voice subtitle to the target test video stream with the acquisition time as the starting point.
In one embodiment, the terminal determines a target video recorder that transmits a target test video stream and transmits a live voice stream to the target video recorder, whereby a tester receives the live voice stream through headphones connected to the target video recorder and adjusts the test operation according to the live voice stream.
In one embodiment, the target video recorder can also collect the tester's voice stream through the earphone worn by the tester and send the voice stream collected through the earphone to the terminal, so that the terminal can play the tester's voice stream correspondingly, thereby realizing remote communication.
In the voice adding method in the nuclear power test, by acquiring the target test identification code associated with the current nuclear power test, the candidate test video streams recorded for the current nuclear power test can be screened out from the received test video streams based on the target test identification code; by determining the candidate test video streams, they can be correspondingly displayed, so that the target test video stream requiring voice interaction can be determined from the displayed candidate test video streams and a corresponding voice interaction instruction generated for it; and by generating the voice interaction instruction, the live voice stream can be acquired based on that instruction and automatically added to the target test video stream, so that when the test video is reviewed later, the manager's voice information can be reviewed at the same time.
In addition, by judging whether the target test video stream contains an original sound fragment whose sound amplitude exceeds the first amplitude, the situation in which the original sound fragment covers the live voice fragment so that the live voice fragment cannot be reviewed can be avoided.
In one embodiment, the voice adding method in the nuclear power test further comprises: when an original sound fragment whose sound amplitude exceeds the first amplitude is present in the target test video stream within the duration of the live voice fragment starting from the acquisition time of the live voice fragment, extracting a first video frame in the target test video stream at the current time and second video frames in the remaining candidate test video streams other than the target test video stream; performing image matching on the first video frame and the second video frames; and when a target second video frame successfully matched with the first video frame exists among the second video frames, adding the live voice fragment to the candidate test video stream containing the target second video frame.
Specifically, to facilitate the manager's all-round monitoring of the nuclear power test, a plurality of video recorders may be erected for the same tester in the test area, so that the plurality of video recorders can record the test operations of that tester from different angles. When the target test video stream has an original sound fragment with a sound amplitude exceeding the first amplitude, the terminal extracts a first video frame in the target test video stream at the current moment and second video frames in the remaining candidate test video streams other than the target test video stream.
Wherein performing image matching on the first video frame and the second video frame includes: identifying first identity information of the test person in the first video frame and second identity information of the test person in the second video frame based on the pre-trained face recognition model; identifying a first current operating equipment part in a first video frame and a second current operating equipment part in a second video frame based on the pre-trained equipment part identification model; and performing image matching on the first video frame and the second video frame according to the first identity information, the second identity information, the first current operation equipment part and the second current operation equipment part.
The terminal identifies first identity information of the tester in the first video frame and second identity information of the testers in the second video frames based on a pre-trained face recognition model, and identifies the first current operation equipment part in the first video frame and the second current operation equipment parts in the second video frames based on the pre-trained equipment part identification model. The current operation equipment part refers to the equipment part currently operated by the tester. The terminal judges the consistency of the second identity information with the first identity information and screens out target second identity information consistent with the first identity information from the plurality of second identity information; it likewise judges the consistency of the second current operation equipment parts with the first current operation equipment part and screens out target second current operation equipment parts consistent with the first current operation equipment part. The terminal takes a second video frame containing both target second identity information and a target second current operation equipment part as a target second video frame matched with the first video frame. At this point, the candidate test video stream containing the target second video frame and the target test video stream are both video streams recorded for the test operation of the same tester, so the terminal can add the live voice fragment to the candidate test video stream containing the target second video frame.
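The two-factor matching above can be sketched compactly once the recognition models have produced their labels. The sketch below stubs out the face recognition and equipment part identification models with precomputed `(identity, part)` labels per frame; these labels and the function name are illustrative assumptions.

```python
# Hypothetical sketch: a second video frame matches the first when both
# the recognized tester identity and the recognized currently operated
# equipment part agree with the first frame's.

def match_frames(first, seconds):
    """Return indices of second frames matching the first frame;
    each frame is represented as a tuple (identity, equipment_part)."""
    first_id, first_part = first
    return [i for i, (sid, spart) in enumerate(seconds)
            if sid == first_id and spart == first_part]
```

Any returned index designates a target second video frame, and the candidate test video stream containing it receives the live voice fragment.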
It is easy to understand that when the first video frame contains only the tester or only the current operation equipment part, the image matching between the first video frame and the second video frames can be performed according to the first and second identity information alone, or according to the first and second current operation equipment parts alone.
In this embodiment, when the live voice segment might be covered by abnormal noise in the target test video stream, adding the live voice segment to the corresponding candidate test video stream allows subsequent personnel to still review the nuclear power test according to the video information and voice information in that candidate test video stream.
In one embodiment, as shown in fig. 4, displaying the candidate test video streams includes:
step S402, test information of a current nuclear power test is obtained, and test video clips within a preset time period are intercepted from candidate test video streams; the test information comprises test procedures contained in the current nuclear power test and key equipment parts corresponding to the test procedures.
Step S404, traversing at least one video frame in the intercepted test video segments, and identifying the current operation equipment part in each traversed video frame.
Step S406, when the current operation device part in the traversed video frame belongs to the key device part, determining the traversed video frame as the target video frame.
Step S408, screening out target key equipment parts corresponding to the current operation equipment parts in the target video frame from the key equipment parts corresponding to the test procedures.
And step S410, determining a current test procedure based on the target key equipment part, and displaying the procedure identification of the current test procedure corresponding to the candidate test video stream.
Specifically, the nuclear power test is performed item by item according to the test procedures. In actual execution, because various emergency conditions can arise, the actual execution period of each test procedure is not necessarily the same as the execution period marked in the test information. The current test procedure being executed therefore cannot be determined simply from the execution periods recorded in the test information for each test procedure, and must instead be confirmed by means of the test video.
When a candidate test video stream is received, the terminal intercepts a test video segment within a preset time period counted back from the current moment from the candidate test video stream. The test video segment consists of a plurality of video frames; the terminal traverses these video frames and identifies the current operation equipment part in each traversed video frame. The current operation equipment part refers to the equipment part currently operated by the tester. The terminal can detect the current operation equipment part in a video frame based on a preset equipment part detection algorithm, thereby determining the equipment part being operated. The equipment part detection algorithm can be customized as required; for example, the current operation equipment part can be identified based on an image recognition algorithm in MATLAB, or based on an image recognition algorithm in OpenCV.
Further, the terminal judges whether the current operation equipment part in the currently traversed video frame belongs to a key equipment part in the test information; if so, the currently traversed video frame is taken as a target video frame, and if not, it is not taken as a target video frame. When all target video frames in the test video segment have been determined, the terminal matches the current operation equipment parts in the target video frames against the key equipment parts corresponding to the respective test procedures, and screens out, according to the matching results, the target key equipment parts corresponding to the current operation equipment parts in the target video frames. The terminal determines the correspondence between test procedures and key equipment parts from the key equipment parts of each test procedure, determines the test procedure corresponding to the target key equipment parts according to that correspondence, and takes it as the current test procedure. For example, when the key equipment parts corresponding to test procedure 1 are A and B, the key equipment parts corresponding to test procedure 2 are C and D, the current operation equipment part in target video frame 1 is A, and the current operation equipment part in target video frame 2 is B, then the target key equipment parts corresponding to the target video frames are A and B, so the terminal takes test procedure 1 as the current test procedure.
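The procedure lookup in the example above (parts A and B observed, so procedure 1 is current) can be sketched as a set-containment check. The mapping structure and function name are illustrative assumptions; the sketch returns the first procedure whose key equipment parts cover all observed currently operated parts.

```python
# Hypothetical sketch: determine the current test procedure from the
# currently operated equipment parts observed in the target video frames.

def current_procedure(procedure_parts, observed_parts):
    """procedure_parts maps procedure name -> list of key equipment
    parts; return the first procedure covering every observed part."""
    observed = set(observed_parts)
    for proc, parts in procedure_parts.items():
        if observed <= set(parts):
            return proc
    return None  # no procedure matches; identifier cannot be shown
```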
Further, the terminal determines a process identifier of the current test process, and displays the process identifier of the current test process and the candidate test video stream correspondingly.
In one embodiment, when the procedure identifier of the current test procedure is determined, it can be added to the corresponding candidate test video stream, so that when the candidate test video needs to be reviewed, the corresponding video segment can be quickly located based on the procedure identifier, greatly improving the efficiency of test review.
In this embodiment, by displaying the procedure identifier of the current test procedure in correspondence with the candidate test video stream, the manager can determine whether a tester's operation is wrong based on the displayed information, and can thus remotely guide the tester in time through voice interaction when an operation error occurs.
In one embodiment, displaying the process identifier of the current test process in correspondence with the candidate test video stream includes: when multiple paths of candidate test video streams exist, extracting target video frames in test fragments corresponding to the candidate test video streams; determining the integrity of a current operation equipment part in the extracted target video frame and the integrity of a hand contacted with the current operation equipment part; image scoring is carried out on the target video frames in each test video segment according to the integrity of the current operation equipment part and the integrity of the hands; determining a primary test video stream and a secondary test video stream in the multiple candidate test video streams according to the image scores; the primary test video stream and the process identification of the current test process are presented at a preset primary location in the screen, and the secondary test video stream is presented at a preset secondary location in the screen.
Specifically, because the recording view angles of the video recorders differ, some video recorders capture the complete test site while others capture only part of it; for example, due to an obstruction, some video recorders may capture a complete image of the nuclear power equipment while others capture only its upper half. To help the manager oversee the whole test site, the test video stream recorded by a video recorder that captures the complete nuclear power equipment image can be displayed in the center of the screen, and the test video streams recorded by video recorders that capture only partial images are displayed around it.
When multiple candidate test video streams exist, the terminal determines test fragments in each candidate test video stream and target video frames in each test fragment based on the method. The terminal traverses each target video frame in each test segment, and detects the integrity of the current operation equipment part and the integrity of the hand contacted with the current operation equipment part in the target video frame in the current traverse sequence based on a preset integrity detection algorithm. The integrity detection algorithm can be customized according to requirements. The integrity of the current operation device part is used to represent the blocked condition of the current operation device part in the target video frame, for example, when the current operation device part is blocked by 20%, the integrity of the current operation device part is 80%.
Further, the terminal performs image scoring on the target video frame according to the integrity of the current equipment part and the integrity of the hand in the target video frame, and performs summation processing on the image scores of the target video frames belonging to the same test video segment to obtain the segment scores of the test video segment. For example, the terminal performs weighted summation on the integrity of the current equipment part and the integrity of the hand to obtain a corresponding image score. And summing the image scores of the target video frames belonging to the same test video segment to obtain the segment score of the test video segment. As will be readily appreciated, the more complete the current operating device part in the target video frame and the more complete the hand in contact with the current operating device part, the higher the image score of the target video frame.
The terminal sets the candidate test video stream with the segment score exceeding the preset score threshold as a main test video stream, sets the rest candidate test video streams as secondary test video streams, and displays the main test video stream and the process identification of the current test process at the main position in the screen; the secondary candidate trial video stream is shown at a secondary location in the screen. For example, the primary test video stream is shown in the middle of the screen and the secondary test video stream is shown around the screen.
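The scoring and primary/secondary classification above can be sketched as follows. This is a hedged sketch: the weighted-sum weights, the score threshold, and the input representation (per-frame `(part_integrity, hand_integrity)` pairs in [0, 1]) are illustrative assumptions, not values specified by the embodiment.

```python
# Hypothetical sketch of the image scoring and stream classification.

def image_score(part_integrity, hand_integrity, w_part=0.6, w_hand=0.4):
    """Weighted sum of equipment part and hand integrity (each in [0, 1])."""
    return w_part * part_integrity + w_hand * hand_integrity

def classify_streams(stream_frames, threshold):
    """Sum frame scores per test segment; streams whose segment score
    exceeds the threshold become primary, the rest secondary.
    stream_frames maps stream id -> list of (part, hand) integrities."""
    primary, secondary = [], []
    for sid, frames in stream_frames.items():
        total = sum(image_score(p, h) for p, h in frames)
        (primary if total > threshold else secondary).append(sid)
    return primary, secondary
```

The primary streams would then be shown at the preset primary screen position together with the current test procedure identifier, and the secondary streams around it.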
In this embodiment, because the higher the integrity of the current operation equipment part and of the hand in contact with it, the easier it is for the manager to judge whether the current tester's operation is correct, displaying the high-integrity primary test video stream and the low-integrity secondary test video streams distinctly can improve the manager's judging efficiency and thus the efficiency of the nuclear power test.
In one embodiment, adding the onsite speech segment correspondence to the target trial video stream includes: carrying out sound spectrum analysis on the on-site voice fragment to obtain a corresponding spectrogram; the spectrogram comprises sound amplitude values of the on-site voice fragments; when the sound amplitude value in the on-site voice segment exceeds the second amplitude value, recognizing the on-site voice segment based on a pre-trained voice recognition model to obtain a voice subtitle; and correspondingly adding the voice subtitle and the onsite voice fragment into the target test video stream.
Specifically, when the live voice segment is obtained, the terminal frames the live voice segment based on a preset sampling frequency to obtain a plurality of voice frames; for example, with 400 sampling points per frame, one frame lasts 25 ms. The terminal extracts the sound signal from each voice frame, performs sound spectrum analysis on it to obtain the sound amplitude at each sampling time, and combines the sampling times with the corresponding sound amplitudes to obtain the spectrum points of each voice frame. The sound amplitude refers to the sound pressure value; to strengthen the characteristics of the sound signal, the terminal characterizes each voice frame by its sound pressure values.
In special cases the manager's voice may rise involuntarily; for example, when a serious problem occurs in a tester's operation, the manager may raise his or her voice to stop the operation in time. Therefore, to facilitate later review of the voice information recorded under such special conditions, the terminal can screen out, according to the spectrogram, the live voice segments whose sound amplitude exceeds the second amplitude from the plurality of live voice segments obtained by segmentation, input them into a pre-trained voice recognition model, and recognize them through that model to obtain voice subtitles. The voice recognition model is a model formed by an artificial neural network; a developer can train it with training samples and labels to obtain the pre-trained voice recognition model.
When the voice subtitle is obtained, the terminal adds the voice subtitle and the live voice segment whose sound amplitude exceeds the second amplitude to the target test video stream, so that a manager or tester can later identify the test video segment recorded for the special situation based on the voice subtitle and review the test video segment containing the voice subtitle.
In one embodiment, the spectrum analysis of the on-site voice stream may be performed by using a preset spectrum analysis algorithm, and specifically may be an FFT (fast fourier transform) spectrum analysis algorithm.
In the above embodiment, by generating the spectrogram, the live voice segments whose sound amplitude is outside the normal range can be determined based on the sound amplitudes in the spectrogram; by converting those live voice segments into subtitle information and adding it to the target test video stream, the subtitle information can later serve as a review guide, and the corresponding test video segments can be rapidly located according to it. In addition, because the subtitle information is added to the target test video stream, even if the live voice stream is lost due to improper storage of the test video, the test can still be reviewed based on the subtitle information.
It should be understood that, although the steps in the flowcharts of fig. 2 and fig. 4 are shown in the order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the order of execution of these steps is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in fig. 2 and fig. 4 may include sub-steps or stages that are not necessarily performed at the same time but may be performed at different times; the order in which these sub-steps or stages are performed is also not necessarily sequential, and they may be performed in turn or alternately with at least some of the other steps or stages.
In one embodiment, as shown in fig. 5, a voice adding device 500 in a nuclear power test is provided, including: a presentation module 502, a voice stream acquisition module 504, and a voice addition module 506, wherein:
the display module 502 is configured to obtain a target test identification code associated with a current nuclear power test when a test video stream is received; screen candidate test video streams recorded for the current nuclear power test process from the test video streams based on the target test identification code; and display the candidate test video streams, and when a voice interaction instruction is received, screen the target test video stream pointed to by the voice interaction instruction from the candidate test video streams.
The voice stream collection module 504 is configured to invoke the voice collection device to collect the live voice stream based on the voice interaction instruction.
The voice adding module 506 is configured to perform silence detection on the live voice stream, and segment the live voice stream according to a detection result of the silence detection to obtain a corresponding live voice segment; and when the original sound fragments with the sound amplitude exceeding the first amplitude are not included in the target test video stream within the duration of the live voice fragments from the acquisition time of the live voice fragments, correspondingly adding the live voice fragments into the target test video stream.
In one embodiment, as shown in fig. 6, the voice adding module 506 is further configured to extract, when the original sound clip having the sound amplitude exceeding the first amplitude is in the target test video stream during the duration of the live voice clip from the time of capturing the live voice clip, a first video frame in the target test video stream at the current time and a second video frame in the rest candidate test video streams except the target test video stream; performing image matching on the first video frame and the second video frame; and when the second video frames have the target second video frames which are successfully matched with the first video frames, adding the live voice fragments into candidate test video streams containing the target second video frames.
In one embodiment, the test video stream is acquired by a video recorder; the display module 502 further includes a test video stream determining module 5021, configured to receive a test identifier and a local identifier sent by the video recorder; based on the test identification code and the local identification code, establishing a mapping relation between the video recorder and the test identification code; according to the mapping relation and the target test identification code, determining a candidate video recorder for video recording in the test process of the current nuclear power test; and taking the test video stream output by the candidate video recorder as a candidate test video stream.
In one embodiment, the test video stream is captured by a video recorder and is embedded with a test identification code; the test video stream determining module 5021 is further configured to take the test video streams embedded with the target test identification code as candidate test video streams.
In one embodiment, the display module 502 further includes a test procedure determining module 5022, configured to: obtain test information for the current nuclear power test and intercept a test video segment of preset duration from the candidate test video stream, the test information comprising the test procedures included in the current nuclear power test and the key equipment parts corresponding to each test procedure; traverse at least one video frame in the test video segment and identify the currently operated equipment part in each traversed video frame; when the currently operated equipment part in a traversed video frame belongs to the key equipment parts, take that video frame as a target video frame; screen out, from the key equipment parts corresponding to the test procedures, the target key equipment part corresponding to the currently operated equipment part in the target video frame; and determine the current test procedure based on the target key equipment part, displaying the procedure identifier of the current test procedure together with the candidate test video stream.
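Once the currently operated equipment part has been identified, determining the current test procedure amounts to an inverted lookup from key equipment part to procedure. The data layout and names below are illustrative assumptions.

```python
def current_procedure(detected_parts, procedure_key_parts):
    """Return the id of the first procedure whose key parts include a
    detected currently-operated equipment part, else None.

    procedure_key_parts: {procedure_id: {key equipment part names}}
    detected_parts: part names identified across the traversed video frames
    """
    # Invert the table once: key equipment part -> procedure id
    part_to_proc = {part: proc
                    for proc, parts in procedure_key_parts.items()
                    for part in parts}
    for part in detected_parts:
        if part in part_to_proc:   # the frame showing this part is a target frame
            return part_to_proc[part]
    return None
```

The part names ("valve-3" and the like in the test) are placeholders; in the described system they would come from the per-frame equipment-part recognizer.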
In one embodiment, the display module 502 is further configured to: when there are multiple candidate test video streams, extract the target video frame from the test video segment corresponding to each candidate test video stream; determine the integrity of the currently operated equipment part in each extracted target video frame and the integrity of the hand in contact with that part; score the target video frame of each test video segment according to those two integrity values; determine a primary test video stream and secondary test video streams among the multiple candidate test video streams according to the image scores; and display the primary test video stream and the procedure identifier of the current test procedure at a preset primary position on the screen, and the secondary test video streams at preset secondary positions.
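A minimal sketch of the scoring-and-selection step, assuming the two integrity values are normalized to 0..1 and combined with equal weights (the patent does not specify the weighting):

```python
def rank_streams(frames):
    """frames: {stream_id: (part_integrity, hand_integrity)}, values in 0..1.

    Scores each stream's target frame and returns (primary_id, secondary_ids),
    with the secondary streams ordered by descending score.
    """
    scores = {sid: 0.5 * part + 0.5 * hand
              for sid, (part, hand) in frames.items()}
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered[0], ordered[1:]
```

The best-scoring stream goes to the preset primary screen position and the rest to the secondary positions.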
In one embodiment, the voice adding module 506 further includes a sound amplitude determining module 5061, configured to: perform spectrum analysis on the on-site voice segment to obtain a corresponding spectrogram, which contains the sound amplitude of the segment; when the sound amplitude of the on-site voice segment exceeds a second amplitude, recognize the segment with a pre-trained speech recognition model to obtain a voice subtitle; and add the voice subtitle together with the on-site voice segment to the target test video stream.
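The gate on the second amplitude can be sketched as below; `recognize` stands in for the pre-trained speech recognition model, whose interface the patent does not specify, and the peak-amplitude criterion is an illustrative simplification of the spectrogram analysis.

```python
def subtitle_if_loud(segment, second_amplitude, recognize):
    """Return a subtitle for the voice segment only when its peak absolute
    amplitude exceeds the second amplitude threshold; otherwise return None.

    `recognize` is any callable mapping a sample sequence to text, standing
    in for the pre-trained speech recognition model.
    """
    peak = max((abs(x) for x in segment), default=0.0)
    if peak > second_amplitude:
        return recognize(segment)   # loud enough: produce a voice subtitle
    return None                     # too quiet: add audio without a subtitle
```

In the described system, a non-None return would be overlaid as a subtitle at the segment's position in the target test video stream.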
For specific limitations of the voice adding device in the nuclear power test, reference may be made to the limitations of the voice adding method above; they are not repeated here. Each module of the voice adding device may be implemented wholly or partly in software, hardware, or a combination of the two. The modules may be embedded in, or independent of, a processor of the computer device in hardware form, or stored in software form in a memory of the computer device, so that the processor can invoke them to perform the corresponding operations.
In one embodiment, a computer device is provided, which may be a terminal whose internal structure may be as shown in fig. 7. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. The processor provides computing and control capabilities. The memory includes a non-volatile storage medium, which stores an operating system and a computer program, and an internal memory, which provides an environment for running the operating system and the computer program. The communication interface performs wired or wireless communication with external terminals; the wireless mode may be realized through Wi-Fi, an operator network, NFC (near field communication), or other technologies. The computer program, when executed by the processor, implements the voice adding method in the nuclear power test. The display screen may be a liquid crystal display or an electronic ink display; the input device may be a touch layer covering the display screen, keys, a trackball, or a touchpad on the housing of the computer device, or an external keyboard, touchpad, or mouse.
It will be appreciated by those skilled in the art that the structure shown in fig. 7 is merely a block diagram of some of the structures associated with the present application and is not limiting of the computer device to which the present application may be applied, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
In an embodiment, a computer device is also provided, comprising a memory and a processor, the memory storing a computer program; the processor implements the steps of the method embodiments described above when executing the computer program.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, carries out the steps of the method embodiments described above.
Those skilled in the art will appreciate that all or part of the above methods may be implemented by a computer program stored on a non-volatile computer-readable storage medium; when executed, the program may perform the steps of the method embodiments above. Any reference to memory, storage, a database, or another medium in the embodiments provided herein may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, or optical memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations are described; however, any combination of these technical features that involves no contradiction should be considered within the scope of this specification.
The above embodiments merely represent several implementations of the present application; their description is relatively specific and detailed, but is not to be construed as limiting the scope of the invention. It should be noted that those skilled in the art may make various modifications and improvements without departing from the concept of the present application, and these fall within its scope of protection. Accordingly, the scope of protection of the present application is determined by the appended claims.

Claims (10)

1. A voice adding method in a nuclear power test, characterized by comprising the following steps:
when a test video stream is received, acquiring a target test identification code associated with a current nuclear power test;
screening candidate test video streams for video recording in the current nuclear power test process from the test video streams based on the target test identification codes;
displaying the candidate test video streams, and when a voice interaction instruction is received, screening out the target test video stream pointed to by the voice interaction instruction from the candidate test video streams;
calling a voice acquisition device to capture an on-site voice stream based on the voice interaction instruction; the on-site voice stream refers to the voice stream of the manager;
performing silence detection on the on-site voice stream, and segmenting the on-site voice stream according to the detection result of the silence detection to obtain corresponding on-site voice fragments;
when no original sound fragment whose sound amplitude exceeds a first amplitude is contained in the target test video stream within the duration of an on-site voice fragment, measured from the capture time of that fragment, adding the on-site voice fragment to the target test video stream at the corresponding position;
when an original sound fragment whose sound amplitude exceeds the first amplitude is contained in the target test video stream within the duration of the on-site voice fragment, measured from the capture time of that fragment, extracting a first video frame in the target test video stream at the current time and second video frames in the remaining candidate test video streams other than the target test video stream;
performing image matching on the first video frame and the second video frames;
and when a target second video frame successfully matched with the first video frame exists among the second video frames, adding the on-site voice fragment to a candidate test video stream containing the target second video frame.
2. The method of claim 1, wherein the test identification code is an identification code obtained by encoding test information of the current nuclear power test prior to performing the current nuclear power test.
3. The method of claim 1, wherein the test video stream is acquired by a video recorder; the step of screening candidate test video streams for video recording in the current nuclear power test process from the test video streams based on the target test identification codes comprises the following steps:
receiving a test identification code and a local identification code sent by the video recorder;
based on the test identification code and the local identification code, establishing a mapping relation between the video recorder and the test identification code;
according to the mapping relation and the target test identification code, determining a candidate video recorder for video recording in the test process of the current nuclear power test;
and taking the test video stream output by the candidate video recorder as a candidate test video stream.
4. The method of claim 1, wherein the test video stream is acquired by a video recorder; the test video stream is embedded with a test identification code; and the test identification code in the test video stream is obtained by the video recorder scanning an image code;
the step of screening candidate test video streams for video recording in the current nuclear power test process from the test video streams based on the target test identification codes comprises the following steps:
and setting the test video stream embedded with the target test identification code as a candidate test video stream.
5. The method of claim 1, wherein the presenting the candidate trial video stream comprises:
acquiring test information of a current nuclear power test, and intercepting test video fragments within a preset time length from the candidate test video stream; the test information comprises test procedures included in the current nuclear power test, and key equipment parts corresponding to the test procedures respectively;
traversing at least one video frame in the test video clip and identifying a current operating equipment part in each traversed video frame;
when the current operation equipment part in the traversed video frame belongs to the key equipment part, determining the traversed video frame as a target video frame;
screening out target key equipment parts corresponding to the current operation equipment parts in the target video frame from the key equipment parts corresponding to the test procedures;
and determining a current test procedure based on the target key equipment part, and displaying the procedure identification of the current test procedure and the candidate test video stream correspondingly.
6. The method of claim 5, wherein displaying the process identifier of the current test process in correspondence with the candidate test video stream comprises:
when there are multiple candidate test video streams, extracting the target video frame in the test video segment corresponding to each candidate test video stream;
determining the integrity of the current operation equipment part in the extracted target video frame and the integrity of the hand contacted with the current operation equipment part;
image scoring is carried out on the target video frames in each test video segment according to the integrity of the current operation equipment part and the integrity of the hands;
determining a primary test video stream and a secondary test video stream among the multiple candidate test video streams according to the image scores;
displaying the primary test video stream and the process identification of the current test process at a preset primary location in the screen, and displaying the secondary test video stream at a preset secondary location in the screen.
7. The method of claim 1, wherein the adding the live speech segment to the target test video stream comprises:
performing sound spectrum analysis on the on-site voice fragment to obtain a corresponding spectrogram; the spectrogram comprises sound amplitude values of the on-site voice fragments;
when the sound amplitude value in the on-site voice segment exceeds a second amplitude value, recognizing the on-site voice segment based on a pre-trained voice recognition model to obtain a voice subtitle;
and correspondingly adding the voice subtitle and the on-site voice fragment into the target test video stream.
8. A voice adding device in a nuclear power test, the device comprising:
the display module is used for acquiring a target test identification code associated with the current nuclear power test when receiving the test video stream; screening candidate test video streams for video recording in the current nuclear power test process from the test video streams based on the target test identification codes; displaying the candidate test video stream, and when a voice interaction instruction is received, screening a target test video stream pointed by the voice interaction instruction from the candidate test video stream;
the voice stream acquisition module is used for calling the voice acquisition equipment to acquire the on-site voice stream based on the voice interaction instruction; the on-site voice stream refers to the voice stream of the manager;
the voice adding module is used for performing silence detection on the on-site voice stream, and segmenting the on-site voice stream according to the detection result of the silence detection to obtain corresponding on-site voice fragments; when no original sound fragment whose sound amplitude exceeds the first amplitude is contained in the target test video stream within the duration of an on-site voice fragment, measured from the capture time of that fragment, adding the on-site voice fragment to the target test video stream at the corresponding position; when an original sound fragment whose sound amplitude exceeds the first amplitude is contained in the target test video stream within that duration, extracting a first video frame in the target test video stream at the current time and second video frames in the remaining candidate test video streams other than the target test video stream; performing image matching on the first video frame and the second video frames; and when a target second video frame successfully matched with the first video frame exists among the second video frames, adding the on-site voice fragment to a candidate test video stream containing the target second video frame.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 7 when the computer program is executed.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 7.
CN202011145835.8A 2020-10-23 2020-10-23 Voice adding method and device in nuclear power test and computer equipment Active CN112272277B (en)


Publications (2)

Publication Number Publication Date
CN112272277A CN112272277A (en) 2021-01-26
CN112272277B true CN112272277B (en) 2023-07-18





Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant