CN107799125A

CN107799125A - Speech recognition method, mobile terminal, and computer-readable storage medium

Info

Publication number: CN107799125A
Application number: CN201711097133.5A
Authority: CN
Inventors: 刘康飞
Original assignee: Vivo Mobile Communication Co Ltd
Current assignee: Vivo Mobile Communication Co Ltd
Priority date: 2017-11-09
Filing date: 2017-11-09
Publication date: 2018-03-13

Abstract

The invention discloses a speech recognition method, a mobile terminal, and a computer-readable storage medium. The method includes: during voice communication, detecting whether the noise intensity in the current environment meets a predetermined condition; if the noise intensity meets the predetermined condition, capturing a lip feature image and recognizing lip-reading information from the lip feature image; and converting the lip-reading information into voice information and/or text information and sending it to the receiving end of the voice communication. The invention can ensure that the receiving end of the voice communication receives accurate voice information and/or text information in a noisy environment, thereby improving communication quality.

Description

Speech recognition method, mobile terminal, and computer-readable storage medium

Technical field

The present invention relates to the field of communication technology, and in particular to a speech recognition method, a mobile terminal, and a computer-readable storage medium.

Background art

With the rapid development of mobile terminals, users demand more and more functions from them. A mobile terminal can carry out voice communication, such as voice calls, video calls, and sending voice messages, to exchange information between people or between a person and a machine. The mobile terminal sends the voice information that the microphone captures while the user speaks to the receiving end of the voice communication, thereby realizing voice communication.

However, when a mobile terminal carries out voice communication in a noisy environment, noise interference degrades the communication quality.

Summary of the invention

The present invention provides a speech recognition method, a mobile terminal, and a computer-readable storage medium, to solve the prior-art problem that communication quality declines when a mobile terminal is in a noisy environment.

In a first aspect, an embodiment of the present invention provides a speech recognition method applied to a mobile terminal, the method including:

during voice communication, detecting whether the noise intensity in the current environment meets a predetermined condition;

if the noise intensity meets the predetermined condition, capturing a lip feature image and recognizing lip-reading information from the lip feature image;

converting the lip-reading information into voice information and/or text information, and sending it to the receiving end of the voice communication.

In a second aspect, an embodiment of the present invention further provides a mobile terminal, the mobile terminal including:

a detection module, configured to detect, during voice communication, whether the noise intensity in the current environment meets a predetermined condition;

an identification module, configured to capture a lip feature image if the noise intensity meets the predetermined condition, and to recognize lip-reading information from the lip feature image;

a sending module, configured to convert the lip-reading information into voice information and/or text information and send it to the receiving end of the voice communication.

In a third aspect, an embodiment of the present invention further provides a mobile terminal including a processor, a memory, and a computer program stored in the memory and executable on the processor, where the computer program, when executed by the processor, implements the steps of the speech recognition method described above.

In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the speech recognition method described above.

The embodiments of the present invention can ensure that the receiving end of the voice communication receives accurate voice information and/or text information in a noisy environment, thereby improving communication quality.

Brief description of the drawings

To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is the first flow chart of the speech recognition method according to an embodiment of the present invention;

Fig. 2 is the second flow chart of the speech recognition method according to an embodiment of the present invention;

Fig. 3 is the third flow chart of the speech recognition method according to an embodiment of the present invention;

Fig. 4 is the first block diagram of the mobile terminal according to an embodiment of the present invention;

Fig. 5 is the second block diagram of the mobile terminal according to an embodiment of the present invention;

Fig. 6 is a schematic diagram of the hardware structure of the mobile terminal according to an embodiment of the present invention.

Detailed description of the embodiments

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.

Referring to Fig. 1, an embodiment of the present invention provides a speech recognition method applied to a mobile terminal. The method includes:

Step 11: during voice communication, detect whether the noise intensity in the current environment meets a predetermined condition.

In this embodiment, when the user carries out voice communication using the mobile terminal, whether the noise intensity in the current environment meets the predetermined condition can be detected in either of two ways. In the first, the detected loudness value of the noise in the current environment is taken as the noise intensity; if that loudness value is greater than a first preset threshold, the noise intensity is determined to meet the predetermined condition. In the second, the loudness difference between a first loudness value, corresponding to the noise in the current environment, and a second loudness value, corresponding to the voice the user inputs during the voice communication, is detected; if the loudness difference is less than a second preset threshold, the noise intensity in the current environment is determined to meet the predetermined condition. Loudness values can be measured in decibels.
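The two detection variants of Step 11 can be sketched as follows. This is only an illustrative sketch: the function names, the RMS-based decibel estimate, the reference level `ref`, and the sign convention of the loudness difference (speech minus noise) are assumptions, since the text states only that loudness can be measured in decibels and compared against preset thresholds.

```python
import math

def loudness_db(samples, ref=1.0):
    """RMS level of `samples` in decibels relative to `ref`.

    `ref` is a hypothetical full-scale reference; the source does not
    say how the loudness value is measured.
    """
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        return float("-inf")
    return 20.0 * math.log10(rms / ref)

def exceeds_first_threshold(noise_db, first_threshold_db):
    """Variant 1: the noise loudness alone is greater than the first preset threshold."""
    return noise_db > first_threshold_db

def difference_below_second_threshold(noise_db, speech_db, second_threshold_db):
    """Variant 2: the speech is not sufficiently louder than the noise.

    The difference is taken as speech minus noise (an assumption).
    """
    return (speech_db - noise_db) < second_threshold_db
```

For instance, with an assumed second threshold of 10 dB, speech at -20 dB over noise at -25 dB (a 5 dB margin) would satisfy the condition, while the same speech over noise at -50 dB would not.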

The modes of voice communication include voice calls, video calls, sending voice SMS messages, and sending voice messages through an instant messaging application. Note that voice communication may also take other forms; the present invention does not limit this.

Step 12: if the noise intensity meets the predetermined condition, capture a lip feature image and recognize lip-reading information from the lip feature image.

In this embodiment, if the noise intensity meets the predetermined condition, the noise in the current environment is strong enough to affect the communication quality of the voice communication. The mobile terminal is then switched to lip-reading recognition mode: it captures lip feature images of the user who is carrying out voice communication with the mobile terminal and, from those images, recognizes the lip-reading information produced while the user inputs voice information. This ensures that the sending end of the voice communication can capture the user's lip-reading information as the input of the voice communication, so that environmental noise does not affect the voice communication and its communication quality is maintained.

Step 13: convert the lip-reading information into voice information and/or text information, and send it to the receiving end of the voice communication.

In this embodiment, the lip-reading information recognized from the user's lip feature images is sent to the receiving end of the voice communication as voice and/or text. For example, if the user is making a voice call with the mobile terminal, the recognized lip-reading information is converted into voice information and sent to the receiving end, ensuring that the receiving end receives clear voice information and thereby improving communication quality. If the user is sending a voice message with an instant messaging application on the mobile terminal, the recognized lip-reading information is converted into voice information, or into text information, and sent to the receiving end. This avoids the drop in input efficiency that occurs when environmental noise forces the user to switch from voice input to text input while sending a voice message in an instant messaging application. It thereby improves sending efficiency, ensures that the receiving end receives clear voice information or the corresponding text information, and in turn improves communication quality.
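The choice of output form in Step 13 could be sketched as below. The `Mode` enum, the payload dictionary, and the placeholder `synthesize` callback are all hypothetical names introduced for illustration; the source does not specify how lip-reading information is turned into audio.

```python
from enum import Enum

class Mode(Enum):
    VOICE_CALL = "voice_call"
    INSTANT_MESSAGE = "instant_message"

def _placeholder_synthesize(text):
    # Stand-in for a real text-to-speech step (assumption, not from the source).
    return b"PCM:" + text.encode("utf-8")

def build_payload(lip_text, mode, as_text=False, synthesize=_placeholder_synthesize):
    """Wrap recognized lip-reading content for the receiving end.

    A voice call always gets voice; an instant message may be sent
    as voice or, if `as_text` is set, as text.
    """
    if mode is Mode.INSTANT_MESSAGE and as_text:
        return {"kind": "text", "body": lip_text}
    return {"kind": "voice", "body": synthesize(lip_text)}
```

Passing the synthesizer as a parameter keeps the sketch independent of any particular speech engine.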

In the above solution, when the user carries out voice communication using the mobile terminal, the noise intensity in the current environment is detected. When the noise intensity is found to meet the predetermined condition, a lip feature image is captured, lip-reading information is recognized from it, and the lip-reading information is converted into voice information and/or text information and sent to the receiving end of the voice communication, so that environmental noise does not affect the voice communication. When the mobile terminal carries out voice communication in a noisy environment, this solution ensures that the receiving end can receive clear voice information and/or the text information corresponding to that voice information, which improves the quality of the voice communication and in turn the user experience.

In addition, when the noise intensity is found to meet the predetermined condition, this solution switches directly to lip-reading recognition mode and recognizes the user's lip movements while the user inputs voice information, avoiding the loss in communication efficiency that manually switching the recognition mode would cause.
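Steps 11 to 13 taken together amount to a small control flow, sketched here with injected callbacks. Every callback name is hypothetical, and the fallback to ordinary microphone audio when the condition is not met is an assumption: the text describes only the branch in which the condition is met.

```python
def handle_voice_input(noise_meets_condition, capture_lips, recognize_lips,
                       convert, send, send_microphone_audio):
    """Run one pass of the Fig. 1 flow and report which path was taken."""
    if noise_meets_condition:
        image = capture_lips()            # Step 12: capture lip feature image
        lip_info = recognize_lips(image)  # Step 12: recognize lip-reading information
        send(convert(lip_info))           # Step 13: convert and send
        return "lip_reading"
    # Assumed default path: transmit the microphone audio unchanged.
    send_microphone_audio()
    return "microphone"
```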

Referring to Fig. 2, an embodiment of the present invention further provides a speech recognition method applied to a mobile terminal. The method includes:

Step 21: during voice communication, obtain the audio information captured by the microphone of the mobile terminal.

In this embodiment, if the mobile terminal is in a noisy environment, the audio information captured by the microphone includes the voice information produced while the user inputs voice, as well as noise information, that is, everything in the mobile terminal's current environment other than that voice information.

Step 22: parse the audio information and extract the noise information in it.

In this embodiment, the audio information is parsed, for example by distinguishing the voice information from the noise information according to how the sound wave of the audio information varies, and the noise information in the audio information is extracted.
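The text does not specify how the sound wave's variation is used to separate voice from noise. One very simple stand-in, shown purely as an assumption, is to split the audio into fixed-length frames and treat low-energy frames as noise; `frame_len` and `ratio` are hypothetical tuning parameters, and a real implementation would likely use a proper voice activity detector.

```python
def split_frames(samples, frame_len):
    """Cut `samples` into consecutive full frames of `frame_len` samples."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)

def extract_noise_frames(samples, frame_len=4, ratio=0.25):
    """Return frames whose energy is at most `ratio` times the loudest frame's.

    Assumption: speech frames are markedly more energetic than
    background-noise frames within the same recording.
    """
    frames = split_frames(samples, frame_len)
    if not frames:
        return []
    energies = [frame_energy(f) for f in frames]
    cutoff = ratio * max(energies)
    return [f for f, e in zip(frames, energies) if e <= cutoff]
```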

Step 23: detect whether the noise intensity corresponding to the noise information meets a predetermined condition.

Specifically, the step of detecting whether the noise intensity corresponding to the noise information meets the predetermined condition includes: detecting whether the noise intensity corresponding to the noise information is greater than a preset threshold; and, if it is, determining that the noise intensity meets the predetermined condition. For example, whether the noise intensity meets the predetermined condition can be judged from the maximum amplitude in the waveform of the noise information: if the amplitude is greater than a preset amplitude threshold, the noise intensity is determined to meet the predetermined condition. To keep environmental noise from affecting communication quality when voice communication takes place in a noisy environment, the mobile terminal is then switched to lip-reading recognition mode and recognizes the lip-reading information produced while the user inputs voice information. This ensures that the receiving end of the voice communication can receive clear voice information, thereby improving the quality of the voice communication.
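The amplitude-based check described in Step 23 reduces to a peak comparison. A minimal sketch, with hypothetical function names and an arbitrary sample representation (floats), might look like this:

```python
def peak_amplitude(noise_samples):
    """Largest absolute sample value in the noise waveform (0.0 if empty)."""
    return max((abs(s) for s in noise_samples), default=0.0)

def noise_exceeds_amplitude(noise_samples, amplitude_threshold):
    """True when the peak noise amplitude is greater than the preset threshold."""
    return peak_amplitude(noise_samples) > amplitude_threshold
```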

Step 24: if the noise intensity meets the predetermined condition, capture a lip feature image and recognize lip-reading information from the lip feature image.

In this embodiment, if the noise intensity meets the predetermined condition, the noise in the current environment is strong enough to affect the communication quality of the voice communication. The mobile terminal is then switched to lip-reading recognition mode: it captures lip feature images of the user who is carrying out voice communication with the mobile terminal and, from those images, recognizes the lip-reading information produced while the user inputs voice information, so that environmental noise does not affect the voice communication and its communication quality is maintained.

Step 25: convert the lip-reading information into voice information and/or text information, and send it to the receiving end of the voice communication.

In the above solution, when the user carries out voice communication using the mobile terminal, the audio information captured by the microphone of the mobile terminal is obtained and the noise information in it is extracted. The noise intensity of the noise information is detected and, when it is found to meet the predetermined condition, a lip feature image is captured, lip-reading information is recognized from it, and the lip-reading information is converted into voice information and/or text information and sent to the receiving end of the voice communication, so that environmental noise does not affect the voice communication. When the mobile terminal carries out voice communication in a noisy environment, this solution ensures that the receiving end can receive clear voice information and/or the text information corresponding to that voice information, which improves the quality of the voice communication and in turn the user experience.

Referring to Fig. 3, an embodiment of the present invention further provides a speech recognition method applied to a mobile terminal. The method includes:

Step 31: during voice communication, detect whether the noise intensity in the current environment meets a predetermined condition.

Step 32: if the noise intensity meets the predetermined condition, prompt the user to choose whether to carry out the voice communication in lip-reading recognition mode.

In this embodiment, if the noise intensity meets the predetermined condition, the noise in the current environment is strong enough to affect the communication quality of the voice communication, so the user is prompted to choose whether to carry out the voice communication in lip-reading recognition mode. Specifically, the prompt can be given by displaying a prompt box containing the prompt information, or by announcing the prompt information by voice.

Step 33: if a trigger instruction is obtained by which the user confirms using lip-reading recognition mode for the voice communication, capture a lip feature image and recognize lip-reading information from the lip feature image.

In this embodiment, in response to the trigger instruction by which the user confirms using lip-reading recognition mode for the voice communication, the mobile terminal is switched to lip-reading recognition mode and captures lip feature images of the user, so as to recognize from them the lip-reading information the user inputs during the voice communication. Asking for confirmation first avoids disrupting the voice communication when the user's environment is not suitable for lip-reading recognition mode, and thereby safeguards the communication quality and the communication success rate of the voice communication.

Step 34: convert the lip-reading information into voice information and/or text information, and send it to the receiving end of the voice communication.

In the above solution, when the user carries out voice communication using the mobile terminal, the noise intensity in the current environment is detected and, when it is found to meet the predetermined condition, the user is prompted to choose whether to carry out the voice communication in lip-reading recognition mode. If a trigger instruction is obtained by which the user confirms using lip-reading recognition mode, a lip feature image is captured, the lip-reading information the user inputs during the voice communication is recognized from it, and the lip-reading information is converted into voice information and/or text information and sent to the receiving end of the voice communication, so that environmental noise does not affect the voice communication. When the mobile terminal carries out voice communication in a noisy environment, this solution ensures that the receiving end can receive clear voice information and/or the text information corresponding to that voice information, which improves the quality of the voice communication and in turn the user experience.

In addition, this solution switches the mobile terminal to lip-reading recognition mode only in response to the user's trigger instruction confirming that mode. This avoids disrupting the voice communication when the user's environment is not suitable for lip-reading recognition mode, and thereby safeguards the communication quality and the communication success rate of the voice communication.
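The confirmation gate of Fig. 3 can be illustrated with a short sketch. The `ask_user` callback stands in for the prompt box or voice announcement and is hypothetical, as are the returned mode names and the assumption that declining keeps the ordinary microphone path.

```python
def choose_input_mode(noise_meets_condition, ask_user):
    """Return "lip_reading" only if noise is high and the user confirms.

    `ask_user` is a hypothetical callback that shows the prompt
    (box or voice announcement) and returns True on confirmation.
    """
    if not noise_meets_condition:
        return "microphone"  # no prompt is shown in a quiet environment
    prompt = "Noise detected. Use lip-reading recognition mode?"
    return "lip_reading" if ask_user(prompt) else "microphone"
```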

Further, the step of capturing a lip feature image and recognizing lip-reading information from the lip feature image includes: starting the camera of the mobile terminal; obtaining the lip feature image, captured by the camera, of the user who is carrying out the voice communication; and recognizing, from the lip feature image, the lip-reading information produced while the user carries out the voice communication.

Specifically, the camera of the mobile terminal is started, the image information captured by the camera is obtained, and it is detected whether the image information contains a lip feature image of the user. If it does, the condition for the mobile terminal to perform lip-reading recognition is met, and the lip-reading information produced while the user carries out the voice communication is recognized from the captured lip feature image, so that environmental noise does not affect the voice communication.

If the image information contains no lip feature image of the user, or only a partial one, the mobile terminal cannot recognize the user's lip-reading information from what it has captured. In that case it prompts the user to adjust the capture angle of the camera so that a complete lip feature image can be captured and the lip-reading information produced while the user carries out the voice communication can be recognized, so that environmental noise does not affect the voice communication. The user can be prompted to adjust the capture angle by a prompt box, by a voice announcement, or in some other way; the present invention does not limit this.
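The three cases above (no lips, partial lips, complete lips in the frame) suggest a small retry loop. The sketch below assumes a hypothetical `detect_coverage` callback that classifies the current camera frame and a `prompt` callback that shows or announces the adjustment hint; the retry limit `max_attempts` is also an assumption, since the text sets no limit.

```python
from enum import Enum

class LipCoverage(Enum):
    NONE = 0      # no lip feature image in the frame
    PARTIAL = 1   # only part of the lips is visible
    FULL = 2      # complete lip feature image

def next_action(coverage):
    """Recognition proceeds only with a complete lip feature image."""
    if coverage is LipCoverage.FULL:
        return "recognize_lip_reading"
    return "prompt_adjust_camera"

def capture_until_full(detect_coverage, prompt, max_attempts=3):
    """Check frames until the lips are fully visible or attempts run out."""
    for _ in range(max_attempts):
        if next_action(detect_coverage()) == "recognize_lip_reading":
            return True
        prompt("Adjust the camera so that your lips are fully visible.")
    return False
```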

Note that the step described above of capturing a lip feature image and recognizing lip-reading information from the lip feature image can be applied to each of the foregoing embodiments.

Referring to Fig. 4 and Fig. 5, an embodiment of the present invention further provides a mobile terminal. The mobile terminal 400 includes:

Detection module 410, configured to detect, during voice communication, whether the noise intensity in the current environment meets a predetermined condition.

Identification module 420, configured to capture a lip feature image if the noise intensity meets the predetermined condition, and to recognize lip-reading information from the lip feature image.

Sending module 430, configured to convert the lip-reading information into voice information and/or text information and send it to the receiving end of the voice communication.

The detection module 410 includes:

Acquisition submodule 411, configured to obtain, during voice communication, the audio information captured by the microphone of the mobile terminal.

Extraction submodule 412, configured to parse the audio information and extract the noise information in it.

Detection submodule 413, configured to detect whether the noise intensity corresponding to the noise information meets the predetermined condition.

The detection submodule 413 includes:

Detection unit 4131, configured to detect whether the noise intensity corresponding to the noise information is greater than a preset threshold.

Processing unit 4132, configured to determine, if the noise intensity corresponding to the noise information is greater than the preset threshold, that the noise intensity corresponding to the noise information meets the predetermined condition.
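The nesting of detection module 410 (submodules 411 to 413, units 4131 and 4132) can be mirrored in code as a simple composition. This is a structural sketch only: the callable-based design, the field names, and the use of peak amplitude as the noise-intensity measure are assumptions, not something the text prescribes.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass
class DetectionModule:
    """Loose analogue of detection module 410."""
    acquire: Callable[[], Sequence[float]]                        # acquisition submodule 411
    extract_noise: Callable[[Sequence[float]], Sequence[float]]   # extraction submodule 412
    threshold: float                                              # preset threshold of unit 4131

    def noise_intensity(self, noise: Sequence[float]) -> float:
        # Hypothetical intensity measure: peak absolute amplitude.
        return max((abs(s) for s in noise), default=0.0)

    def meets_condition(self) -> bool:
        audio = self.acquire()
        noise = self.extract_noise(audio)
        # Detection unit 4131: compare the intensity with the preset threshold.
        exceeds = self.noise_intensity(noise) > self.threshold
        # Processing unit 4132: exceeding the threshold means the condition is met.
        return exceeds
```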

The identification module 420 includes:

Prompting submodule 421, configured to prompt the user, if the noise intensity meets the predetermined condition, to choose whether to carry out the voice communication in lip-reading recognition mode;

Identification submodule 422, configured to capture a lip feature image, if a trigger instruction is obtained by which the user confirms using lip-reading recognition mode for the voice communication, and to recognize lip-reading information from the lip feature image.

The mobile terminal 400 includes:

Start unit 4221, configured to start the camera of the mobile terminal;

Acquiring unit 4222, configured to obtain the lip feature image, captured by the camera, of the user who is carrying out the voice communication;

Recognition unit 4223, configured to recognize, from the lip feature image, the lip-reading information produced while the user carries out the voice communication.

The mobile terminal provided by the embodiment of the present invention can implement each process implemented by the mobile terminal in the method embodiments of Fig. 1 to Fig. 3; to avoid repetition, the details are not described here again.

With the mobile terminal 400 in the above solution, when the user carries out voice communication using the mobile terminal, the noise intensity in the current environment is detected. When the noise intensity is found to meet the predetermined condition, a lip feature image is captured, lip-reading information is recognized from it, and the lip-reading information is converted into voice information and/or text information and sent to the receiving end of the voice communication, so that environmental noise does not affect the voice communication. When the mobile terminal carries out voice communication in a noisy environment, this solution ensures that the receiving end can receive clear voice information and/or the text information corresponding to that voice information, which improves the quality of the voice communication and in turn the user experience.

Fig. 6 is a schematic diagram of the hardware structure of a mobile terminal that implements the embodiments of the present invention.

The mobile terminal 600 includes, but is not limited to, a radio frequency unit 601, a network module 602, an audio output unit 603, an input unit 604, a sensor 605, a display unit 606, a user input unit 607, an interface unit 608, a memory 609, a processor 610, a power supply 611, and other components. Those skilled in the art will understand that the mobile terminal structure shown in Fig. 6 does not limit the mobile terminal: a mobile terminal may include more or fewer components than shown, combine certain components, or arrange the components differently. In the embodiments of the present invention, mobile terminals include, but are not limited to, mobile phones, tablet computers, notebook computers, palmtop computers, vehicle-mounted terminals, wearable devices, and pedometers.

The processor 610 is configured to: during voice communication, detect whether the noise intensity in the current environment meets a predetermined condition; if the noise intensity meets the predetermined condition, capture a lip feature image and recognize lip-reading information from the lip feature image; and convert the lip-reading information into voice information and/or text information and send it to the receiving end of the voice communication.

With the mobile terminal 600 in the above solution, when the user carries out voice communication using the mobile terminal, the noise intensity in the current environment is detected. When the noise intensity is found to meet the predetermined condition, a lip feature image is captured, lip-reading information is recognized from it, and the lip-reading information is converted into voice information and/or text information and sent to the receiving end of the voice communication, so that environmental noise does not affect the voice communication. When the mobile terminal carries out voice communication in a noisy environment, this solution ensures that the receiving end can receive clear voice information and/or the text information corresponding to that voice information, which improves the quality of the voice communication and in turn the user experience.

It should be understood that in the embodiment of the present invention, radio frequency unit 601 can be used for receiving and sending messages or communication process in, signal Reception and transmission, specifically, by from base station downlink data receive after, handled to processor 610；In addition, will be up Data are sent to base station.Generally, radio frequency unit 601 includes but is not limited to antenna, at least one amplifier, transceiver, coupling Device, low-noise amplifier, duplexer etc..In addition, radio frequency unit 601 can also by wireless communication system and network and other set Standby communication.

Mobile terminal has provided the user wireless broadband internet by mixed-media network modules mixed-media 602 and accessed, and such as helps user to receive Send e-mails, browse webpage and access streaming video etc..

Audio output unit 603 can be receiving by radio frequency unit 601 or mixed-media network modules mixed-media 602 or in memory 609 It is sound that the voice data of storage, which is converted into audio signal and exported,.Moreover, audio output unit 603 can also be provided and moved The audio output for the specific function correlation that dynamic terminal 600 performs is (for example, call signal receives sound, message sink sound etc. Deng).Audio output unit 603 includes loudspeaker, buzzer and receiver etc..

Input block 604 is used to receive audio or video signal.Input block 604 can include graphics processor (Graphics Processing Unit, GPU) 6041 and microphone 6042, graphics processor 6041 is in video acquisition mode Or the static images or the view data of video obtained in image capture mode by image capture apparatus (such as camera) are carried out Reason.Picture frame after processing may be displayed on display unit 606.Picture frame after the processing of graphics processor 6041 can be deposited Storage is transmitted in memory 609 (or other storage mediums) or via radio frequency unit 601 or mixed-media network modules mixed-media 602.Mike Wind 6042 can receive sound, and can be voice data by such acoustic processing.Voice data after processing can be The form output of mobile communication base station can be sent to via radio frequency unit 601 by being converted in the case of telephone calling model.

Mobile terminal 600 also includes at least one sensor 605, such as optical sensor, motion sensor and other biographies Sensor.Specifically, optical sensor includes ambient light sensor and proximity transducer, wherein, ambient light sensor can be according to environment The light and shade of light adjusts the brightness of display panel 6061, and proximity transducer can close when mobile terminal 600 is moved in one's ear Display panel 6061 and/or backlight.As one kind of motion sensor, accelerometer sensor can detect in all directions (general For three axles) size of acceleration, size and the direction of gravity are can detect that when static, available for identification mobile terminal posture (ratio Such as horizontal/vertical screen switching, dependent game, magnetometer pose calibrating), Vibration identification correlation function (such as pedometer, tap)；Pass Sensor 605 can also include fingerprint sensor, pressure sensor, iris sensor, molecule sensor, gyroscope, barometer, wet Meter, thermometer, infrared ray sensor etc. are spent, will not be repeated here.

Display unit 606 is used for the information for showing the information inputted by user or being supplied to user.Display unit 606 can wrap Display panel 6061 is included, liquid crystal display (Liquid Crystal Display, LCD), Organic Light Emitting Diode can be used Forms such as (Organic Light-Emitting Diode, OLED) configures display panel 6061.

The user input unit 607 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the mobile terminal. Specifically, the user input unit 607 includes a touch panel 6071 and other input devices 6072. The touch panel 6071, also called a touch screen, collects touch operations by the user on or near it (for example, operations performed by the user on or near the touch panel 6071 with a finger, a stylus, or any other suitable object or accessory). The touch panel 6071 may include two parts: a touch detection device and a touch controller. The touch detection device detects the position of the user's touch, detects the signal produced by the touch operation, and passes the signal to the touch controller. The touch controller receives the touch information from the touch detection device, converts it into contact coordinates, sends them to the processor 610, and receives and executes commands sent by the processor 610. In addition, the touch panel 6071 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. Besides the touch panel 6071, the user input unit 607 may also include other input devices 6072. Specifically, the other input devices 6072 may include, but are not limited to, a physical keyboard, function keys (such as volume control keys and a power key), a trackball, a mouse, and a joystick, which are not described in detail here.

Further, the touch panel 6071 may be overlaid on the display panel 6061. When the touch panel 6071 detects a touch operation on or near it, it passes the operation to the processor 610 to determine the type of touch event, and the processor 610 then provides a corresponding visual output on the display panel 6061 according to the type of touch event. Although in Fig. 6 the touch panel 6071 and the display panel 6061 are shown as two independent components that implement the input and output functions of the mobile terminal, in some embodiments the touch panel 6071 and the display panel 6061 may be integrated to implement the input and output functions of the mobile terminal; this is not specifically limited here.

The interface unit 608 is the interface through which an external device is connected to the mobile terminal 600. For example, the external device may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and the like. The interface unit 608 may be used to receive input (for example, data information, power, and the like) from an external device and transfer the received input to one or more elements within the mobile terminal 600, or may be used to transfer data between the mobile terminal 600 and an external device.

The memory 609 may be used to store software programs and various data. The memory 609 may mainly include a program storage area and a data storage area. The program storage area may store an operating system, application programs required by at least one function (such as a sound playback function and an image playback function), and the like; the data storage area may store data created through use of the mobile phone (such as audio data and a phone book), and the like. In addition, the memory 609 may include high-speed random access memory and may also include non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.

The processor 610 is the control center of the mobile terminal. It uses various interfaces and lines to connect the parts of the whole mobile terminal, and performs the various functions of the mobile terminal and processes data by running or executing the software programs and/or modules stored in the memory 609 and calling the data stored in the memory 609, thereby monitoring the mobile terminal as a whole. The processor 610 may include one or more processing units. Preferably, the processor 610 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, the user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor need not be integrated into the processor 610.

The mobile terminal 600 may also include a power supply 611 (such as a battery) that supplies power to the various components. Preferably, the power supply 611 may be logically connected to the processor 610 through a power management system, so that functions such as charging management, discharging management, and power consumption management are implemented through the power management system.

In addition, the mobile terminal 600 includes some functional modules that are not shown, which are not described in detail here.

Preferably, an embodiment of the present invention further provides a mobile terminal including a processor 610, a memory 609, and a computer program stored in the memory 609 and executable on the processor 610. When executed by the processor 610, the computer program implements each process of the above audio recognition method embodiment and achieves the same technical effect; to avoid repetition, details are not repeated here.

An embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements each process of the above audio recognition method embodiment and achieves the same technical effect; to avoid repetition, details are not repeated here. The computer-readable storage medium may be, for example, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

It should be noted that, herein, the terms "comprise", "include", and any other variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes the element.

From the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the methods described in the embodiments of the present invention.

The embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to the specific embodiments described above, which are merely illustrative and not restrictive. Under the teaching of the present invention, and without departing from the purpose of the present invention and the scope protected by the claims, a person of ordinary skill in the art may devise many other forms, all of which fall within the protection of the present invention.

Claims

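1. An audio recognition method, applied to a mobile terminal, characterized in that the method comprises:

during voice communication, detecting whether the noise intensity in the current environment meets a predetermined condition;

if the noise intensity meets the predetermined condition, capturing a lip characteristic image, and identifying lip reading information according to the lip characteristic image;

converting the lip reading information into voice information and/or text information, and sending it to the receiving terminal of the voice communication.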

2. The audio recognition method according to claim 1, characterized in that the step of detecting, during voice communication, whether the noise intensity in the current environment meets a predetermined condition comprises:

during voice communication, obtaining the audio information collected by the microphone of the mobile terminal;

parsing the audio information to extract the noise information in the audio information;

detecting whether the noise intensity corresponding to the noise information meets the predetermined condition.

3. The audio recognition method according to claim 2, characterized in that the step of detecting whether the noise intensity corresponding to the noise information meets the predetermined condition comprises:

detecting whether the noise intensity corresponding to the noise information is greater than a predetermined threshold;

if the noise intensity corresponding to the noise information is greater than the predetermined threshold, determining that the noise intensity corresponding to the noise information meets the predetermined condition.
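For illustration only, the detection steps recited in claims 2 and 3 might be sketched as follows. The sketch is not part of the claims and makes several assumptions that the text does not specify: audio arrives as PCM samples normalized to [-1.0, 1.0], the noise information is approximated by the quietest frame of the captured audio, the noise intensity is expressed in dB relative to full scale, and the names `frame_rms`, `noise_intensity_db`, and `meets_predetermined_condition` as well as the default threshold of -30 dB are hypothetical.

```python
import math
from typing import Sequence


def frame_rms(samples: Sequence[float]) -> float:
    """Root-mean-square amplitude of one frame of normalized PCM samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def noise_intensity_db(audio: Sequence[float], frame_len: int = 160) -> float:
    """Estimate the noise intensity of the captured audio in dBFS.

    Assumption: the quietest full frame contains background noise only,
    so its RMS level stands in for the extracted noise information.
    """
    frames = [audio[i:i + frame_len]
              for i in range(0, len(audio) - frame_len + 1, frame_len)]
    if not frames:
        raise ValueError("audio is shorter than one frame")
    floor = min(frame_rms(f) for f in frames)
    # Clamp to avoid log10(0) for digital silence.
    return 20.0 * math.log10(max(floor, 1e-10))


def meets_predetermined_condition(audio: Sequence[float],
                                  threshold_db: float = -30.0,
                                  frame_len: int = 160) -> bool:
    """True when the estimated noise intensity exceeds the threshold."""
    return noise_intensity_db(audio, frame_len) > threshold_db


if __name__ == "__main__":
    noisy = [0.5, -0.5] * 160      # constant loud background
    quiet = [0.001, -0.001] * 160  # near-silent background
    print(meets_predetermined_condition(noisy))
    print(meets_predetermined_condition(quiet))
```

The threshold comparison is the only part tied directly to claim 3; the noise-floor estimate is one simple stand-in for the parsing step and could be replaced by any other noise extraction technique.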

4. The audio recognition method according to claim 1, characterized in that the step of, if the noise intensity meets the predetermined condition, capturing a lip characteristic image and identifying lip reading information according to the lip characteristic image comprises:

if the noise intensity meets the predetermined condition, prompting the user as to whether to carry out the voice communication in a lip reading recognition mode;

if a trigger instruction is obtained by which the user confirms carrying out the voice communication in the lip reading recognition mode, capturing a lip characteristic image, and identifying lip reading information according to the lip characteristic image.
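The control flow recited in claim 4 can be pictured with the following minimal sketch, offered for illustration only. The function `lip_reading_step` and its callback parameters `prompt_user`, `capture_lip_images`, and `recognize` are hypothetical placeholders; the text does not specify how prompting, image capture, or lip reading recognition are implemented.

```python
from typing import Callable, List, Optional


def lip_reading_step(
    noise_meets_condition: bool,
    prompt_user: Callable[[], bool],
    capture_lip_images: Callable[[], List[bytes]],
    recognize: Callable[[List[bytes]], str],
) -> Optional[str]:
    """Return lip reading information, or None if lip reading mode is not used.

    All callbacks are hypothetical stand-ins supplied by the caller.
    """
    if not noise_meets_condition:
        return None  # normal voice communication continues
    if not prompt_user():
        return None  # user declined the lip reading recognition mode
    images = capture_lip_images()  # only captured after the user confirms
    return recognize(images)


if __name__ == "__main__":
    result = lip_reading_step(
        noise_meets_condition=True,
        prompt_user=lambda: True,
        capture_lip_images=lambda: [b"frame-1", b"frame-2"],
        recognize=lambda frames: f"recognized {len(frames)} frames",
    )
    print(result)
```

Passing the steps in as callbacks keeps the sketch independent of any particular camera or recognition interface, and makes it explicit that the camera is not used unless the user confirms.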

5. A mobile terminal, characterized in that the mobile terminal comprises:

a detection module, configured to detect, during voice communication, whether the noise intensity in the current environment meets a predetermined condition;

an identification module, configured to, if the noise intensity meets the predetermined condition, capture a lip characteristic image and identify lip reading information according to the lip characteristic image;

a sending module, configured to convert the lip reading information into voice information and/or text information and send it to the receiving terminal of the voice communication.

6. The mobile terminal according to claim 5, characterized in that the detection module comprises:

an acquisition submodule, configured to obtain, during voice communication, the audio information collected by the microphone of the mobile terminal;

an extraction submodule, configured to parse the audio information and extract the noise information in the audio information;

a detection submodule, configured to detect whether the noise intensity corresponding to the noise information meets the predetermined condition.

7. The mobile terminal according to claim 6, characterized in that the detection submodule comprises:

a detection unit, configured to detect whether the noise intensity corresponding to the noise information is greater than a predetermined threshold;

a processing unit, configured to determine, if the noise intensity corresponding to the noise information is greater than the predetermined threshold, that the noise intensity corresponding to the noise information meets the predetermined condition.

8. The mobile terminal according to claim 5, characterized in that the identification module comprises:

a prompting submodule, configured to prompt the user, if the noise intensity meets the predetermined condition, as to whether to carry out the voice communication in a lip reading recognition mode;

an identification submodule, configured to, if a trigger instruction is obtained by which the user confirms carrying out the voice communication in the lip reading recognition mode, capture a lip characteristic image and identify lip reading information according to the lip characteristic image.

9. A mobile terminal, characterized by comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the audio recognition method according to any one of claims 1 to 4.

10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the steps of the audio recognition method according to any one of claims 1 to 4.