CN104123115B - Audio information processing method and electronic device

Info

Publication number: CN104123115B
Application number: CN201410364822.8A
Authority: CN
Inventors: 高扬
Original assignee: Lenovo Beijing Ltd
Current assignee: Lenovo Beijing Ltd
Priority date: 2014-07-28
Filing date: 2014-07-28
Publication date: 2017-05-24
Anticipated expiration: 2034-07-28
Also published as: CN104123115A

Abstract

The invention discloses an audio information processing method for solving the technical problem in the prior art that the recording effect of an electronic device is poor. The method includes: while a voice file is being output, parsing out M segments of audio information in the voice file that have a first voiceprint feature; comparing the M segments of audio information with N audio samples, determining, among the N audio samples, a first audio sample whose voiceprint feature is the same as the first voiceprint feature, and determining first user identification information corresponding to the M segments of audio information according to the correspondence between audio samples and user identification information; and outputting the voice file, wherein, when audio information having the first voiceprint feature is played, the electronic device is controlled to display the first user identification information with a first display effect. The invention further discloses an electronic device for implementing the method.

Description

Audio information processing method and electronic device

Technical Field

The present invention relates to the field of computer technology, and in particular to an audio information processing method and an electronic device.

Background

With the rapid development of science and technology and increasingly fierce market competition, the performance and appearance of electronic devices have improved greatly. Notebook computers, for instance, are favored by more and more people for being small, light, easy to carry and entertaining, and have become an indispensable part of study and daily life. Users can also do more and more with electronic devices; for example, a user can communicate or make recordings with a mobile phone or tablet computer that has a voice function.

At present, most electronic devices have a recording function that can meet the recording needs of many scenarios, such as recording a meeting or a class. Because recording scenes are often complex, after a user has made a recording with an electronic device it is not easy, during playback, to tell which speaker a given piece of speech belongs to. This is especially so for speakers whose voices are similar, or for speakers the listener is not familiar with. For example, a user records a meeting with an electronic device and plays the recording back later. If several people were discussing at the same time, the playback may be noisy and the listener cannot quickly tell which participant is speaking. The listener then has to concentrate on identifying the speaker while listening, and may have to replay the recording repeatedly to associate the recorded content with the right speaker, which places a heavy load on the electronic device and makes for a poor user experience.

In summary, the prior art has the technical problem that the recording effect of electronic devices is poor.

Summary of the Invention

Embodiments of the present invention provide an audio information processing method and an electronic device, for solving the technical problem that the recording effect of electronic devices is poor.

An audio information processing method, applied to an electronic device in which N audio samples are stored, each of the N audio samples corresponding to one piece of user identification information, the user identification information containing information that can be used to characterize the audio object corresponding to audio information, N being a positive integer, the method including:

While a voice file is being output, parsing out M segments of audio information in the voice file that have a first voiceprint feature, M being a positive integer;

Comparing the M segments of audio information with the N audio samples, and determining whether a voiceprint feature identical to the first voiceprint feature exists among the N voiceprint features corresponding to the N audio samples;

If so, determining, among the N audio samples, the first audio sample corresponding to the voiceprint feature identical to the first voiceprint feature, and determining, according to the correspondence between audio samples and user identification information, the first user identification information corresponding to the M segments of audio information;

Outputting the voice file; wherein, when audio information having the first voiceprint feature is played, the electronic device is controlled to display the first user identification information with a first display effect.

Optionally, the method further includes:

When it is detected that an audio information segment contained in the voice file has both a second voiceprint feature and a third voiceprint feature, separating from the audio information segment, according to the second voiceprint feature and the third voiceprint feature, second audio information having the second voiceprint feature and third audio information having the third voiceprint feature;

Comparing the second audio information and the third audio information respectively with the N audio samples to determine a second audio sample corresponding to the second voiceprint feature and a third audio sample corresponding to the third voiceprint feature; and determining, according to the correspondence between audio samples and user identification information, second user identification information corresponding to the second voiceprint feature and third user identification information corresponding to the third voiceprint feature;

Controlling the electronic device to display the second user identification information and the third user identification information simultaneously while the audio information segment is played.

Optionally, controlling the electronic device to display the second user identification information and the third user identification information simultaneously while the audio information segment is played further includes:

Detecting a second audio intensity corresponding to the audio information having the second voiceprint feature, and a third audio intensity corresponding to the audio information having the third voiceprint feature;

Comparing the second audio intensity with the third audio intensity, determining the audio information with the greater audio intensity as main audio information, and determining the audio information with the smaller audio intensity as secondary audio information;

According to the correspondence between audio intensity and display effect, controlling the electronic device to display the user identification information corresponding to the main audio information with the first display effect, and to display the user identification information corresponding to the secondary audio information with a second display effect.
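The intensity comparison described above can be sketched as follows. This is only an illustrative sketch: the text does not say how audio intensity is measured, so the root-mean-square level used here, the function names `rms` and `assign_display_effects`, the tie-breaking rule and the sample values are all assumptions.

```python
import math

def rms(samples):
    """Root-mean-square level of a sequence of sample values (assumed intensity measure)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def assign_display_effects(second, third):
    """Each argument is a (user_info, samples) pair for one separated speaker.

    The louder speaker is treated as the main audio information and gets the
    first display effect; the other gets the second display effect.
    A tie goes to the second speaker (arbitrary choice).
    """
    (second_info, second_audio), (third_info, third_audio) = second, third
    if rms(second_audio) >= rms(third_audio):
        main, secondary = second_info, third_info
    else:
        main, secondary = third_info, second_info
    return {"first_display_effect": main, "second_display_effect": secondary}

# Toy data: Speaker C is clearly louder than Speaker B.
effects = assign_display_effects(
    ("Speaker B", [0.1, -0.1, 0.1, -0.1]),
    ("Speaker C", [0.5, -0.4, 0.6, -0.5]),
)
```

What the two display effects look like on screen is left open here, as it is in the text above.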

Optionally, comparing the M segments of audio information with the N audio samples and determining whether a voiceprint feature identical to the first voiceprint feature exists among the N voiceprint features corresponding to the N audio samples further includes:

If no voiceprint feature identical to the first voiceprint feature exists among the N voiceprint features corresponding to the N audio samples, judging whether the M segments of audio information are key audio information; wherein the key audio information is audio information related to a contact stored in the electronic device;

If the M segments of audio information are key audio information, creating user identification information corresponding to the M segments of audio information according to the contact; or

If the M segments of audio information are not key audio information, setting first specific identification information as the user identification information corresponding to the M segments of audio information; wherein the first specific identification information is any one or a combination of specific image information, specific text information and specific voice information in the electronic device.

Optionally, while or after the user identification information corresponding to the M segments of audio information is created according to the contact, in the case where the M segments of audio information are key audio information, the method further includes:

Obtaining a first audio clip from the M segments of audio information;

Storing the first audio clip as an (N+1)-th audio sample; wherein the (N+1)-th audio sample and the M segments of audio information correspond to the same user identification information.

An electronic device in which N audio samples are stored, each of the N audio samples corresponding to one piece of user identification information, the user identification information containing information that can be used to characterize the audio object corresponding to audio information, N being a positive integer, the electronic device including:

A parsing module, configured to parse out, while a voice file is being output, M segments of audio information in the voice file that have a first voiceprint feature, M being a positive integer;

A comparing module, configured to compare the M segments of audio information with the N audio samples and determine whether a voiceprint feature identical to the first voiceprint feature exists among the N voiceprint features corresponding to the N audio samples;

A first determining module, configured to, if so, determine, among the N audio samples, the first audio sample corresponding to the voiceprint feature identical to the first voiceprint feature, and determine, according to the correspondence between audio samples and user identification information, the first user identification information corresponding to the M segments of audio information;

An output module, configured to output the voice file; wherein, when audio information having the first voiceprint feature is played, the electronic device is controlled to display the first user identification information with a first display effect.

Optionally, the electronic device further includes:

A separation module, configured to, when it is detected that an audio information segment contained in the voice file has both a second voiceprint feature and a third voiceprint feature, separate from the audio information segment, according to the second voiceprint feature and the third voiceprint feature, second audio information having the second voiceprint feature and third audio information having the third voiceprint feature;

A second determining module, configured to compare the second audio information and the third audio information respectively with the N audio samples to determine a second audio sample corresponding to the second voiceprint feature and a third audio sample corresponding to the third voiceprint feature, and to determine, according to the correspondence between audio samples and user identification information, second user identification information corresponding to the second voiceprint feature and third user identification information corresponding to the third voiceprint feature;

A control module, configured to control the electronic device to display the second user identification information and the third user identification information simultaneously while the audio information segment is played.

Optionally, the electronic device further includes:

A detection module, configured to detect a second audio intensity corresponding to the audio information having the second voiceprint feature, and a third audio intensity corresponding to the audio information having the third voiceprint feature;

A comparison module, configured to compare the second audio intensity with the third audio intensity, determine the audio information with the greater audio intensity as main audio information, and determine the audio information with the smaller audio intensity as secondary audio information;

A first processing module, configured to control the electronic device, according to the correspondence between audio intensity and display effect, to display the user identification information corresponding to the main audio information with the first display effect, and to display the user identification information corresponding to the secondary audio information with a second display effect.

Optionally, the electronic device further includes:

A judging module, configured to judge, if no voiceprint feature identical to the first voiceprint feature exists among the N voiceprint features corresponding to the N audio samples, whether the M segments of audio information are key audio information; wherein the key audio information is audio information related to a contact stored in the electronic device;

A second processing module, configured to create, if the M segments of audio information are key audio information, user identification information corresponding to the M segments of audio information according to the contact; or, if the M segments of audio information are not key audio information, to set first specific identification information as the user identification information corresponding to the M segments of audio information; wherein the first specific identification information is any one or a combination of specific image information, specific text information and specific voice information in the electronic device.

Optionally, the electronic device further includes:

An acquisition module, configured to obtain a first audio clip from the M segments of audio information;

A storage module, configured to store the first audio clip as an (N+1)-th audio sample; wherein the (N+1)-th audio sample and the M segments of audio information correspond to the same user identification information.

In the embodiments of the present invention, each of the N audio samples stored in the electronic device has corresponding user identification information, and each piece of user identification information contains information that can be used to characterize the audio object corresponding to audio information. When the voice file is output, the M segments of audio information having the first voiceprint feature can therefore be obtained by parsing, and the M segments can be compared with the N audio samples by voiceprint feature to determine the first audio sample whose voiceprint feature is identical to the first voiceprint feature. The first user identification information corresponding to the first audio sample is then obtained, so that when audio information having the first voiceprint feature is played, that is, when playback reaches any of the M segments of audio information, the first user identification information can be displayed. Thus, even if the recording being played contains multiple speakers, each speaker has a different voiceprint feature; once the segments of audio information sharing a voiceprint feature have been identified in the recording and the corresponding user identification information has been determined by comparison, that user identification information can be displayed while the audio information is played. The user can quickly learn which audio object the currently playing part of the voice file belongs to without spending extra time telling speakers apart, which enhances the recording effect of the electronic device and improves the user experience.

Brief Description of the Drawings

Fig. 1 is the main flowchart of the audio information processing method in an embodiment of the present invention;

Fig. 2 is a schematic diagram of displaying the first user identification information in an embodiment of the present invention;

Fig. 3 is a schematic diagram of displaying the second user identification information and the third user identification information in an embodiment of the present invention;

Fig. 4 is the main module diagram of the electronic device in an embodiment of the present invention.

Detailed Description

An embodiment of the present invention discloses an audio information processing method, applied to an electronic device in which N audio samples are stored, each of the N audio samples corresponding to one piece of user identification information, the user identification information containing information that can be used to characterize the audio object corresponding to audio information, N being a positive integer. The method includes: while a voice file is being output, parsing out M segments of audio information in the voice file that have a first voiceprint feature, M being a positive integer; comparing the M segments of audio information with the N audio samples, and determining whether a voiceprint feature identical to the first voiceprint feature exists among the N voiceprint features corresponding to the N audio samples; if so, determining, among the N audio samples, the first audio sample corresponding to the voiceprint feature identical to the first voiceprint feature, and determining, according to the correspondence between audio samples and user identification information, the first user identification information corresponding to the M segments of audio information; and outputting the voice file, wherein, when audio information having the first voiceprint feature is played, the electronic device is controlled to display the first user identification information with a first display effect.

Referring to Fig. 1, an embodiment of the present invention discloses an audio information processing method, applied to an electronic device having a display unit. N audio samples are stored in the electronic device, each of the N audio samples corresponds to one piece of user identification information, the user identification information contains information that can be used to characterize the audio object corresponding to audio information, and N is a positive integer. The method may include the following steps:

Step 11: While a voice file is being output, parse out M segments of audio information in the voice file that have a first voiceprint feature, M being a positive integer.

In this embodiment, the voice file may be a recording file made on a particular occasion, for example a recording of a meeting or a recording of a class. Typically, the voice file may be a recording file stored locally, for example a file recorded by the device itself or by another device and stored locally; alternatively, the voice file may be a recording file obtained from another electronic device or from the cloud.

Optionally, in this embodiment, the first voiceprint feature may be a voiceprint feature of the voice file determined by voiceprint recognition while the voice file is being output.

Generally, a voiceprint is the sound wave spectrum, carrying speech information, that is displayed by an electro-acoustic instrument, and the voiceprint patterns of any two people differ. Voiceprint recognition can therefore determine the voiceprint feature corresponding to each piece of audio information in the voice file, and so identify the audio information that shares the same voiceprint feature. When the voice file records the speech of multiple speakers, the voice file may correspond to multiple voiceprint features.

Optionally, the M segments of audio information having the first voiceprint feature in the voice file can be determined by voiceprint recognition, so the M segments can be regarded as the speech of one and the same speaker, and the M segments may be located at different positions in the voice file. For example, when the speaker is in the same scene as several other speakers and speaks a number of times, the corresponding M segments are recorded into the voice file in the order in which they were spoken. When the voice file is played, the recorded speech of all the speakers is played in recording order, and the M segments may then be interspersed at multiple positions in the voice file.
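To make the notion of M interspersed segments concrete, the following minimal sketch groups the segments of a voice file by voiceprint. It assumes that a voiceprint recognizer has already labelled each segment; the `Segment` type, the `voiceprint_id` labels and the timestamps are hypothetical and are not taken from the embodiment.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Segment:
    start: float        # seconds from the beginning of the voice file
    end: float
    voiceprint_id: str  # hypothetical label assigned by a voiceprint recognizer

def group_by_voiceprint(segments):
    """Collect the segments that share a voiceprint, in playback order."""
    groups = defaultdict(list)
    for seg in sorted(segments, key=lambda s: s.start):
        groups[seg.voiceprint_id].append(seg)
    return dict(groups)

# One speaker ("vp1") speaks three times, interleaved with two other speakers.
timeline = [
    Segment(0.0, 4.0, "vp1"),
    Segment(4.0, 9.5, "vp2"),
    Segment(9.5, 12.0, "vp1"),
    Segment(12.0, 20.0, "vp3"),
    Segment(20.0, 23.0, "vp1"),
]
groups = group_by_voiceprint(timeline)
m_segments = groups["vp1"]  # the M segments with the first voiceprint feature; here M = 3
```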

Step 12: Compare the M segments of audio information with the N audio samples, and determine whether a voiceprint feature identical to the first voiceprint feature exists among the N voiceprint features corresponding to the N audio samples.

In this embodiment, because each person's voiceprint feature is different, once the M segments of audio information have been determined, the comparison of the M segments with the N audio samples can be made by voiceprint recognition technology. If a voiceprint feature identical to the first voiceprint feature is detected, a voiceprint feature identical to the first voiceprint feature exists among the N voiceprint features, that is, there is an audio sample that matches the M segments of audio information. Otherwise, there is no audio sample corresponding to the M segments, and the audio object corresponding to the M segments cannot be determined from the currently stored N audio samples.
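The comparison in Step 12 can be sketched as below. The embodiment does not specify how voiceprint features are represented or compared, so this sketch assumes that each feature is a numeric vector and treats two features as "identical" when their cosine similarity reaches a threshold; the function names, the threshold of 0.8 and the vectors are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def find_matching_sample(feature, sample_features, threshold=0.8):
    """Return the index of the best-matching audio sample, or None if none matches."""
    best_index, best_score = None, threshold
    for index, sample in enumerate(sample_features):
        score = cosine_similarity(feature, sample)
        if score >= best_score:
            best_index, best_score = index, score
    return best_index

# N = 2 stored samples, represented by toy three-dimensional features.
sample_features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
matched = find_matching_sample([0.9, 0.1, 0.0], sample_features)
unmatched = find_matching_sample([0.0, 0.0, 1.0], sample_features)
```

A `None` result corresponds to the case in which the audio object cannot be determined from the currently stored N audio samples.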

In this embodiment, the N audio samples may be set in advance from one or more recording files. For example, audio information corresponding to a given contact may be extracted from a prerecorded or stored recording file and used as an audio sample; alternatively, a corresponding audio clip may be recorded for a contact and used as the audio sample corresponding to that contact. Each of the N audio samples may be audio information taken from a voice segment; for example, multiple pieces of audio information may be obtained from the voice segments of a voice file.

Optionally, in this embodiment, each of the N audio samples corresponds to one piece of user identification information, and the user identification information may contain information that can be used to characterize the audio object corresponding to audio information. For example, the user identification information may contain information such as the contact's avatar, name and nature of work.

Step 13: If so, determine, among the N audio samples, the first audio sample corresponding to the voiceprint feature identical to the first voiceprint feature, and determine, according to the correspondence between audio samples and user identification information, the first user identification information corresponding to the M segments of audio information.

In this embodiment, because the voiceprint features of different audio information differ, once the first audio sample having the same voiceprint feature as the first voiceprint feature has been determined by voiceprint recognition technology, the first user identification information corresponding to the first audio sample can be further determined, and thereby the audio object corresponding to the M segments of audio information.

Optionally, in this embodiment, the correspondence between audio samples and user identification information may be set in advance by the user. For example, when setting the N audio samples, the user may set information related to each audio sample as the user identification information corresponding to that sample, for instance by defining one of, or a combination of, the avatar, name and other information of the audio object corresponding to the sample as the corresponding user identification information.

For example, a first audio sample of speaker A is stored in the user's mobile phone, speaker A's voice has voiceprint feature 1, and the first user identification information corresponding to the first audio sample contains speaker A's avatar information and name information. When the user plays a recording file on the mobile phone and the recording file includes speaker A's voice, then, if the voiceprint features identified in the recording file during playback include one identical to voiceprint feature 1, the audio information in the recording file having voiceprint feature 1 can be regarded as speaker A's audio information, and that audio information can therefore be associated with the first user identification information.
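The speaker A example can be sketched as a lookup from playback position to user identification information. The segment times, the `sample_a` key, the file name and the helper `label_at` are hypothetical; a real device would drive its display unit rather than return a dictionary.

```python
def label_at(position, labeled_segments):
    """Return the user identification information to show at a playback position.

    labeled_segments is a list of (start, end, user_info) tuples in seconds.
    None means that no identified speaker is talking at that position.
    """
    for start, end, user_info in labeled_segments:
        if start <= position < end:
            return user_info
    return None

# Correspondence between audio samples and user identification information.
user_info_by_sample = {"sample_a": {"name": "Speaker A", "avatar": "speaker_a.png"}}

# The M segments whose voiceprint matched the first audio sample ("sample_a").
m_segments = [(0.0, 4.0), (9.5, 12.0), (20.0, 23.0)]
first_user_info = user_info_by_sample["sample_a"]
labeled = [(start, end, first_user_info) for start, end in m_segments]

shown = label_at(10.0, labeled)   # inside the second of the M segments
hidden = label_at(5.0, labeled)   # another speaker is talking here
```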

In practice, comparing the M segments of audio information with the N audio samples and determining whether a voiceprint feature identical to the first voiceprint feature exists among the N voiceprint features corresponding to the N audio samples may further include: if no voiceprint feature identical to the first voiceprint feature exists among the N voiceprint features corresponding to the N audio samples, judging whether the M segments of audio information are key audio information, the key audio information being audio information related to a contact stored in the electronic device; if the M segments of audio information are key audio information, creating user identification information corresponding to the M segments according to the contact; or, if the M segments of audio information are not key audio information, setting first specific identification information as the user identification information corresponding to the M segments, the first specific identification information being any one or a combination of specific image information, specific text information and specific voice information in the electronic device.

Whether the M segments of audio information are key audio information can be judged in the following two ways.

First: judgment by the user. The judgment can be made on the basis of the contacts stored in the electronic device; if no corresponding audio segment was stored when a contact was stored, the judgment can be made by the user. For example, while the voice file is playing, if the audio information being played is determined to be audio information that was not matched successfully, the user can decide, from his or her own familiarity with the contacts' voices, whether the audio information is a contact's voice. If so, the audio information can be determined to be key audio information; otherwise, no further setting need be made for that segment of audio information. Leaving the judgment to the user gives the user considerable freedom of choice and improves the user experience, and also makes the recording effect of the electronic device more flexible.

Second: judgment by the electronic device. If audio information corresponding to a contact was also stored when the contact was stored, whether the M segments of audio information are key audio information can be judged by the electronic device through voiceprint recognition and matching. For example, if the user stored a segment of the contact's voice while or after creating the contact's information, then, when the first voiceprint feature fails to match the N voiceprint features, the first voiceprint feature can be matched against the voiceprint feature of the contact's voice segment to determine whether the first voiceprint feature is related to the contact, and hence whether the M segments of audio information are key audio information.

In the embodiment of the present invention, if judged result shows the M section audios information for the crucial audio-frequency information, can be with User identity information corresponding with the M section audios information is set up according to the contact object.Generally, user is right in storage contact As when, the information such as related object name, head portrait, work unit can be included, however, it is determined that the M section audios information is corresponding described When contact object is contact object 1, then head image information and name information can be set to corresponding with the M section audios information setting The content that is included of user identity information.

Additionally, setting up corresponding with the M section audios information in the contact object that corresponding head portrait is not provided with by some During user identity information, can be configured by obtaining the image related to the contact object from local or high in the clouds, will pass through The user identity information can be distinguished quickly.For example, using the figure related to the contact object for determining stored in mobile phone As setting during the head image information in the user identity information, then the head portrait part that can be included the image carries out sectional drawing, so that The head image information of the contact object is set to, discrimination degree is improved.

Alternatively, if the judgment result indicates that the M segments of audio information are not the key audio information, first specific identification information can be set as the user identification information corresponding to the M segments of audio information. The first specific identification information is any one, or a combination, of specific image information, specific text information, and specific voice information in the electronic device.

The specific image may be an image that is the electronic device's default, or that the user has designated in advance, for audio information whose voiceprint matching is unsuccessful. Corresponding text, such as "unidentified" or "unknown", can be set for the image. Alternatively, the specific image may itself be a marked or easily recognized image that needs no accompanying text, for example an image of an unknown person's avatar, so that the user knows at a glance that the audio information now playing is unrelated to any contact.
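The two branches above (identification built from a contact object, or a specific fallback identification) might be sketched as follows. All names here (`Contact`, `UserIdentification`, `build_identification`, `UNKNOWN_IMAGE`) are hypothetical, and falling back to the unknown-person image when no image can be found for a contact is an added assumption rather than something the text specifies.

```python
from dataclasses import dataclass
from typing import Callable, Optional

UNKNOWN_LABEL = "unidentified"        # specific text information (example from the text)
UNKNOWN_IMAGE = "unknown_avatar.png"  # hypothetical path of the specific image


@dataclass
class Contact:
    name: str
    avatar: Optional[str] = None    # path of the stored avatar, if any
    employer: Optional[str] = None


@dataclass
class UserIdentification:
    label: str
    image: str


def build_identification(
    contact: Optional[Contact],
    image_lookup: Optional[Callable[[str], Optional[str]]] = None,
) -> UserIdentification:
    """Build user identification information for the M segments.

    contact is None when the audio is not key audio; image_lookup is a
    hypothetical hook that searches local storage or the cloud for an
    image related to the contact.
    """
    if contact is None:
        return UserIdentification(UNKNOWN_LABEL, UNKNOWN_IMAGE)
    image = contact.avatar
    if image is None and image_lookup is not None:
        image = image_lookup(contact.name)
    if image is None:
        image = UNKNOWN_IMAGE  # assumption: reuse the specific image
    return UserIdentification(contact.name, image)
```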

Optionally, in an embodiment of the present invention, if the M segments of audio information are the key audio information, then while or after the user identification information corresponding to the M segments of audio information is established according to the contact object, the method may further include: obtaining a first audio fragment from the M segments of audio information; and storing the first audio fragment as an (N+1)th audio sample, where the (N+1)th audio sample and the M segments of audio information correspond to the same user identification information. That is, once the M segments of audio information are determined to be the key audio information, any audio fragment can be extracted from them as the first audio fragment and stored as the (N+1)th audio sample. The number of audio samples thus keeps growing, so that more voiceprint features are available for comparison during voiceprint matching, user identification information can be identified for as many of the different voiceprint features in the voice file as possible, the corresponding audio objects can be learned, and the electronic device analyzes recording files more accurately.
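Growing the sample set in this way can be illustrated with a short sketch. The function name `store_first_fragment`, the list-of-pairs storage layout, and the `pick` parameter are assumptions made for the example; the text only requires that some fragment of the M segments be stored as the (N+1)th sample under the same user identification information.

```python
def store_first_fragment(samples, segments, user_identification, pick=0):
    """Store one fragment of the M segments as the (N+1)th audio sample.

    samples:  list of (audio_fragment, user_identification) pairs, length N
    segments: the M segments of audio information that were judged key audio
    pick:     index of the segment to take the fragment from (any one may be used)
    Returns the new number of stored samples, N + 1.
    """
    if not segments:
        raise ValueError("no audio segments to take a fragment from")
    first_fragment = segments[pick]
    # The new sample shares the user identification information of the M segments.
    samples.append((first_fragment, user_identification))
    return len(samples)
```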

Step 14: Output the voice file, wherein, when audio information having the first voiceprint feature is played, the electronic device is controlled to display the first user identification information with a first display effect.

In an embodiment of the present invention, once the audio information having the same voiceprint feature in the voice file has been determined, the user identification information corresponding to that audio information can be determined. Then, while the voice file is played, if voiceprint recognition determines that the currently playing audio information has a corresponding audio sample among the N audio samples, the same user identification information can be displayed for all audio information having that voiceprint feature, for example the avatar information and name information of the audio object corresponding to the audio information.
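One way to picture the playback behaviour is a lookup from playback position to the identification to display. The sketch below assumes the segments have already been matched to user identification information and are stored as sorted, non-overlapping `(start, end, identification)` tuples; that layout and the name `identification_at` are hypothetical.

```python
from bisect import bisect_right


def identification_at(position_s, segments):
    """Return the user identification to display at a playback position.

    segments: list of (start_s, end_s, identification) tuples, sorted by
    start time and non-overlapping. Returns None when no identified
    segment covers the position.
    """
    starts = [start for start, _end, _ident in segments]
    index = bisect_right(starts, position_s) - 1
    if index >= 0:
        start, end, identification = segments[index]
        if start <= position_s < end:
            return identification
    return None
```

Every segment with the same voiceprint feature carries the same identification, so the same avatar and name reappear each time that speaker's audio is played.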

Referring to Fig. 2, numeral 20 denotes the electronic device, here a mobile phone by way of example, and numeral 21 denotes the display unit of the electronic device. The voice file is being played in the display unit, and the currently playing audio is any one of the M segments of audio information. Numeral 22 denotes the user identification information, here user avatar information by way of example. The user identification information labeled 1 is the first user identification information, and the remaining user identification information corresponds to the other voiceprint features contained in the voice file.

In an embodiment of the present invention, the audio information processing method may further include: when it is detected that an audio information section contained in the voice file has a second voiceprint feature and a third voiceprint feature at the same time, separating from the audio information section, according to the characteristic parameters of the second voiceprint feature and the third voiceprint feature, second audio information having the second voiceprint feature and third audio information having the third voiceprint feature; comparing the second audio information and the third audio information respectively with the N audio samples to determine a second audio sample corresponding to the second voiceprint feature and a third audio sample corresponding to the third voiceprint feature; determining, according to the correspondence between audio samples and user identification information, second user identification information corresponding to the second voiceprint feature and third user identification information corresponding to the third voiceprint feature; and controlling the electronic device to display the second user identification information and the third user identification information simultaneously while the audio information section is played.

The audio information section may be a speech section of the voice file that contains multiple pieces of audio information at the same time. For example, within a unit of time, the voice file being played may contain the speech of several speakers at once, and multiple voiceprint features can then be determined from each speaker's audio information. The second voiceprint feature and the third voiceprint feature are the voiceprint features corresponding to the audio information of different speakers.

After it is determined that an audio information section contained in the voice file has the second voiceprint feature and the third voiceprint feature at the same time, the audio information section can be separated according to the characteristic parameters of the two voiceprint features, yielding the second audio information having the second voiceprint feature and the third audio information having the third voiceprint feature. The characteristic parameter may be the frequency values of the formants in the voiceprint spectrum. In general, the formant frequency values and their trends are the most stable characteristic parameters in the voiceprint spectrum and are highly specific. Characteristic parameters such as duration, intensity, and waveform are less stable but can also serve as a reference.
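Using formant frequencies as the characteristic parameter can be illustrated with a simple comparison. The sketch does not perform source separation; it only shows how formant vectors that have already been extracted might be matched against stored samples. The function names, the 5% relative tolerance, and the formant values used with it are illustrative assumptions.

```python
def formants_match(candidate, reference, tolerance=0.05):
    """True if every formant frequency is within a relative tolerance.

    candidate, reference: lists of formant frequencies in Hz (F1, F2, ...).
    """
    if len(candidate) != len(reference):
        return False
    return all(abs(c - r) <= tolerance * r for c, r in zip(candidate, reference))


def match_sample(formants, sample_formants, tolerance=0.05):
    """Return the identifier of the first stored sample whose formants match, else None.

    sample_formants maps a sample identifier to that sample's formant list
    (hypothetical storage layout).
    """
    for sample_id, reference in sample_formants.items():
        if formants_match(formants, reference, tolerance):
            return sample_id
    return None
```

In a section with two simultaneous speakers, each separated stream would be matched this way in turn, giving the second and third audio samples.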

Optionally, in an embodiment of the present invention, after the second user identification information corresponding to the second voiceprint feature and the third user identification information corresponding to the third voiceprint feature are determined, both can be displayed simultaneously while the audio information section is played, so that the listener sees the avatars of all the people who are currently speaking at the same time. For example, if the voice file contains audio information section 1 in which speaker A and speaker B speak simultaneously, then when playback reaches that section, avatar a corresponding to speaker A and avatar b corresponding to speaker B are displayed at the same time, indicating that the currently playing section contains the voices of the audio objects corresponding to these two avatars.

Referring to Fig. 3, numeral 30 denotes the electronic device, here a mobile phone by way of example, and numeral 31 denotes the display unit of the electronic device. The audio information section is being played in the display unit, and it contains both the second audio information corresponding to the second voiceprint feature and the third audio information corresponding to the third voiceprint feature. Numerals 1 and 2 denote the second user identification information and the third user identification information respectively. Both are shown enlarged relative to the other user identification information, indicating that the audio information corresponding to them is currently being played.

Optionally, in an embodiment of the present invention, controlling the electronic device to display the second user identification information and the third user identification information simultaneously while the audio information section is played may further include: detecting a second audio intensity corresponding to the audio information having the second voiceprint feature and a third audio intensity corresponding to the audio information having the third voiceprint feature; comparing the second audio intensity with the third audio intensity, determining the audio information with the greater audio intensity as the main audio information, and determining the audio information with the smaller audio intensity as the secondary audio information; and controlling the electronic device, according to the correspondence between audio intensity and display effect, to display the user identification information corresponding to the main audio information with the first display effect and the user identification information corresponding to the secondary audio information with a second display effect.

That is, because the second user identification information and the third user identification information are displayed simultaneously while the audio information section is played, the display effect of each piece of user identification information can be determined from the audio intensity of its audio information, which makes it easier to tell which audio information belongs to which user identification information.

For example, the display effect for audio information with a high audio intensity may be that the user identification information bounces at a high frequency, while the display effect for audio information with a low audio intensity may be that the user identification information bounces at a low frequency. By observing the bounce frequency, the listener can associate each piece of user identification information with how loudly its speaker is talking. When an audio information section with several simultaneous speakers is played, the listener can therefore use the loudness of each voice and the bounce frequency of each piece of user identification information to tell which user identification information belongs to which voice, avoiding the difficulty of telling the voices apart when the played recording file contains several of them at once.
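The intensity comparison and the mapping to bounce frequencies can be sketched as below. Root-mean-square amplitude is used here as one possible measure of audio intensity; that choice, the concrete bounce frequencies, the tie-breaking rule, and all names are assumptions for illustration only.

```python
import math

HIGH_BOUNCE_HZ = 4.0  # hypothetical first display effect (main audio)
LOW_BOUNCE_HZ = 1.0   # hypothetical second display effect (secondary audio)


def rms_intensity(samples):
    """Root-mean-square amplitude of a sequence of PCM sample values."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def assign_bounce_frequencies(second_audio, third_audio):
    """Map the louder stream to the high bounce frequency, the other to the low one.

    Returns a dict keyed by "second" and "third". On a tie the second
    audio information is treated as the main audio (an arbitrary choice).
    """
    second_intensity = rms_intensity(second_audio)
    third_intensity = rms_intensity(third_audio)
    if second_intensity >= third_intensity:
        return {"second": HIGH_BOUNCE_HZ, "third": LOW_BOUNCE_HZ}
    return {"second": LOW_BOUNCE_HZ, "third": HIGH_BOUNCE_HZ}
```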

Referring to Fig. 4, based on the same inventive concept, an embodiment of the present invention further provides an electronic device. N audio samples are stored in the electronic device, each of the N audio samples corresponds to one piece of user identification information, the user identification information includes information that can be used to characterize the audio object corresponding to audio information, and N is a positive integer. The electronic device may include a parsing module 401, a comparing module 402, a first determining module 403, and an output module 404.

The parsing module 401 may be configured to parse, in the course of outputting a voice file, M segments of audio information having a first voiceprint feature in the voice file, where M is a positive integer.

The comparing module 402 may be configured to compare the M segments of audio information with the N audio samples and determine whether the N voiceprint features corresponding to the N audio samples include a voiceprint feature identical to the first voiceprint feature.

The first determining module 403 may be configured to, if such a voiceprint feature exists, determine the first audio sample among the N audio samples that corresponds to the voiceprint feature identical to the first voiceprint feature, and determine, according to the correspondence between audio samples and user identification information, the first user identification information corresponding to the M segments of audio information.

The output module 404 may be configured to output the voice file, wherein, when audio information having the first voiceprint feature is played, the electronic device is controlled to display the first user identification information with the first display effect.
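The division of work among modules 401 to 404 can be pictured with a small class. This is a structural sketch only: voiceprint features are reduced to hashable keys so that exact equality stands in for voiceprint matching, a voice file is modelled as a list of `(segment, voiceprint_key)` pairs, and the class and method names are hypothetical.

```python
class ElectronicDevice:
    """Toy model of the four modules; not the patented implementation."""

    def __init__(self, samples):
        # samples: mapping from voiceprint key to user identification information
        self.samples = dict(samples)

    def parse(self, voice_file):
        """Parsing module 401: group the segments of a voice file by voiceprint."""
        groups = {}
        for segment, voiceprint in voice_file:
            groups.setdefault(voiceprint, []).append(segment)
        return groups

    def compare(self, voiceprint):
        """Comparing module 402: is there a stored sample with this voiceprint?"""
        return voiceprint in self.samples

    def determine(self, voiceprint):
        """First determining module 403: look up the user identification, if any."""
        return self.samples[voiceprint] if self.compare(voiceprint) else None

    def output(self, voice_file):
        """Output module 404: pair each played segment with what to display."""
        return [(segment, self.determine(voiceprint)) for segment, voiceprint in voice_file]
```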

Optionally, in an embodiment of the present invention, the electronic device further includes:

Specifically, the computer program instructions corresponding to the audio information processing method in the embodiments of the present application may be stored on a storage medium such as an optical disc, a hard disk, or a USB flash drive. When the computer program instructions in the storage medium that correspond to the audio information processing method are read or executed by an electronic device, the following steps are included:

Optionally, further computer instructions are stored in the storage medium, and these computer instructions are used to execute the following step: when it is detected that an audio information section contained in the voice file has a second voiceprint feature and a third voiceprint feature at the same time, separating from the audio information section, according to the second voiceprint feature and the third voiceprint feature, second audio information having the second voiceprint feature and third audio information having the third voiceprint feature;

Optionally, when the computer instructions stored in the storage medium that correspond to the step of controlling the electronic device to display the second user identification information and the third user identification information simultaneously while the audio information section is played are executed, the following steps are further included:

Comparing the second sound intensity with the third sound intensity, determining the audio information with the greater sound intensity as the main audio information, and determining the audio information with the smaller sound intensity as the secondary audio information;

Controlling the electronic device, according to the correspondence between sound intensity and display effect, to display the user identification information corresponding to the main audio information with the first display effect, and to display the user identification information corresponding to the secondary audio information with the second display effect.

Optionally, when the computer instructions stored in the storage medium that correspond to the step of comparing the M segments of audio information with the N audio samples and determining whether the N voiceprint features corresponding to the N audio samples include a voiceprint feature identical to the first voiceprint feature are executed, the following steps are further included:

Optionally, further computer instructions are stored in the storage medium, and these computer instructions are executed while or after the computer instructions corresponding to the following step are executed: if the M segments of audio information are the key audio information, establishing user identification information corresponding to the M segments of audio information according to the contact object. When executed, these computer instructions include the following steps:

Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these changes and modifications fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to cover them.

Claims

1. An audio information processing method, applied to an electronic device, wherein N audio samples are stored in the electronic device, each of the N audio samples corresponds to one piece of user identification information, the user identification information includes information that can be used to characterize the audio object corresponding to audio information, and N is a positive integer, the method comprising:

In the course of outputting a voice file, parsing M segments of audio information having a first voiceprint feature in the voice file, where M is a positive integer;

Comparing the M segments of audio information with the N audio samples, and determining whether the N voiceprint features corresponding to the N audio samples include a voiceprint feature identical to the first voiceprint feature;

If so, determining the first audio sample among the N audio samples that corresponds to the voiceprint feature identical to the first voiceprint feature, and determining, according to the correspondence between audio samples and user identification information, first user identification information corresponding to the M segments of audio information;

Outputting the voice file, wherein, when audio information having the first voiceprint feature is played, the electronic device is controlled to display the first user identification information with a first display effect;

Wherein comparing the M segments of audio information with the N audio samples, and determining whether the N voiceprint features corresponding to the N audio samples include a voiceprint feature identical to the first voiceprint feature, further comprises:

If the N voiceprint features corresponding to the N audio samples do not include a voiceprint feature identical to the first voiceprint feature, judging whether the M segments of audio information are key audio information, wherein the key audio information is audio information related to a contact object stored in the electronic device;

If the M segments of audio information are the key audio information, establishing user identification information corresponding to the M segments of audio information according to the contact object; or

If the M segments of audio information are not the key audio information, setting first specific identification information as the user identification information corresponding to the M segments of audio information, wherein the first specific identification information is any one, or a combination, of specific image information, specific text information, and specific voice information in the electronic device.

2. The method according to claim 1, wherein the method further comprises:

When it is detected that an audio information section contained in the voice file has a second voiceprint feature and a third voiceprint feature at the same time, separating from the audio information section, according to the second voiceprint feature and the third voiceprint feature, second audio information having the second voiceprint feature and third audio information having the third voiceprint feature;

Comparing the second audio information and the third audio information respectively with the N audio samples to determine a second audio sample corresponding to the second voiceprint feature and a third audio sample corresponding to the third voiceprint feature; and determining, according to the correspondence between audio samples and user identification information, second user identification information corresponding to the second voiceprint feature and third user identification information corresponding to the third voiceprint feature;

Controlling the electronic device to display the second user identification information and the third user identification information simultaneously while the audio information section is played.

3. The method according to claim 2, wherein controlling the electronic device to display the second user identification information and the third user identification information simultaneously while the audio information section is played further comprises:

Detecting a second audio intensity corresponding to the audio information having the second voiceprint feature, and a third audio intensity corresponding to the audio information having the third voiceprint feature;

Comparing the second audio intensity with the third audio intensity, determining the audio information with the greater of the two audio intensities as main audio information, and determining the audio information with the smaller of the two audio intensities as secondary audio information;

Controlling the electronic device, according to the correspondence between audio intensity and display effect, to display the user identification information corresponding to the main audio information with the first display effect, and to display the user identification information corresponding to the secondary audio information with a second display effect.

4. The method according to claim 1, wherein, if the M segments of audio information are the key audio information, then while or after the user identification information corresponding to the M segments of audio information is established according to the contact object, the method further comprises:

Storing the first audio fragment as an (N+1)th audio sample, wherein the (N+1)th audio sample and the M segments of audio information correspond to the same user identification information.

5. An electronic device, wherein N audio samples are stored in the electronic device, each of the N audio samples corresponds to one piece of user identification information, the user identification information includes information that can be used to characterize the audio object corresponding to audio information, and N is a positive integer, the electronic device comprising:

A parsing module, configured to parse, in the course of outputting a voice file, M segments of audio information having a first voiceprint feature in the voice file, where M is a positive integer;

A comparing module, configured to compare the M segments of audio information with the N audio samples and determine whether the N voiceprint features corresponding to the N audio samples include a voiceprint feature identical to the first voiceprint feature;

A first determining module, configured to, if so, determine the first audio sample among the N audio samples that corresponds to the voiceprint feature identical to the first voiceprint feature, and determine, according to the correspondence between audio samples and user identification information, first user identification information corresponding to the M segments of audio information;

An output module, configured to output the voice file, wherein, when audio information having the first voiceprint feature is played, the electronic device is controlled to display the first user identification information with a first display effect;

The electronic device further comprises:

A judging module, configured to judge, if the N voiceprint features corresponding to the N audio samples do not include a voiceprint feature identical to the first voiceprint feature, whether the M segments of audio information are key audio information, wherein the key audio information is audio information related to a contact object stored in the electronic device;

A second processing module, configured to establish, if the M segments of audio information are the key audio information, user identification information corresponding to the M segments of audio information according to the contact object; or, if the M segments of audio information are not the key audio information, to set first specific identification information as the user identification information corresponding to the M segments of audio information, wherein the first specific identification information is any one, or a combination, of specific image information, specific text information, and specific voice information in the electronic device.

6. The electronic device according to claim 5, wherein the electronic device further comprises:

A separation module, configured to separate, when it is detected that an audio information section contained in the voice file has a second voiceprint feature and a third voiceprint feature at the same time, second audio information having the second voiceprint feature and third audio information having the third voiceprint feature from the audio information section according to the second voiceprint feature and the third voiceprint feature;

A second determining module, configured to compare the second audio information and the third audio information respectively with the N audio samples to determine a second audio sample corresponding to the second voiceprint feature and a third audio sample corresponding to the third voiceprint feature, and to determine, according to the correspondence between audio samples and user identification information, second user identification information corresponding to the second voiceprint feature and third user identification information corresponding to the third voiceprint feature;

A control module, configured to control the electronic device to display the second user identification information and the third user identification information simultaneously while the audio information section is played.

7. The electronic device according to claim 6, wherein the electronic device further comprises:

A detection module, configured to detect a second audio intensity corresponding to the audio information having the second voiceprint feature, and a third audio intensity corresponding to the audio information having the third voiceprint feature;

A comparison module, configured to compare the second audio intensity with the third audio intensity, determine the audio information with the greater of the two audio intensities as main audio information, and determine the audio information with the smaller of the two audio intensities as secondary audio information;

A first processing module, configured to control the electronic device, according to the correspondence between audio intensity and display effect, to display the user identification information corresponding to the main audio information with the first display effect, and to display the user identification information corresponding to the secondary audio information with a second display effect.

8. The electronic device according to claim 5, wherein the electronic device further comprises:

A storage module, configured to store the first audio fragment as an (N+1)th audio sample, wherein the (N+1)th audio sample and the M segments of audio information correspond to the same user identification information.