JPH10143195A

JPH10143195A - Post filter

Info

Publication number: JPH10143195A
Application number: JP30322396A
Authority: JP
Inventors: Hideyuki Takahashi (高橋 秀享)
Original assignee: Olympus Optical Co Ltd
Current assignee: Olympus Corp
Priority date: 1996-11-14
Filing date: 1996-11-14
Publication date: 1998-05-29

Abstract

PROBLEM TO BE SOLVED: To provide a post filter that yields natural-sounding reproduced sound even in consonant sections, which have neither pitch structure nor formant peaks, and in non-speech sections. SOLUTION: When the input signal is speech, a selection switch 2 selects execution of a pitch control filter 3, and a formant control filter 4 and a spectral tilt correction filter 5 are also executed. When the input signal is non-speech, the switch 2 inhibits execution of the filter 3, and a coefficient control section 7 controls the filter 4 so that the formants are flattened and controls the filter 5 so that the spectral tilt is preserved.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

TECHNICAL FIELD

The present invention relates to a post filter, and more specifically to a post filter used when decoding speech data obtained by high-efficiency encoding of a speech signal.

[0002]

DESCRIPTION OF THE RELATED ART

A method widely used to compress speech signals efficiently encodes them with linear prediction parameters, which represent the spectral envelope of the speech signal, and excitation parameters, which correspond to the linear prediction residual signal. Because speech coding based on linear prediction yields relatively high-quality synthesized speech at a low transmission rate, many variants have been actively researched and developed, helped by recent advances in hardware technology.

[0003] Among such speech coding methods, one well known for its good sound quality is CELP (Code Excited Linear Predictive Coding), described for example in the paper by Kleijn et al. entitled "Improved speech quality and efficient vector quantization in SELP" (ICASSP '88, S4.4, pp. 155-158, 1988).

[0004] In recent years, variable bit rate coding, which encodes non-speech and unvoiced sections more efficiently, has also been proposed; it is described in detail in, for example, U.S. Pat. No. 5,414,796. Variable bit rate coding does more than reduce the total bit rate. Because information about the state of the speech is transmitted or recorded as part of the coded data, the decoder can directly determine the speech state of the frame being processed, which widens the range of possible applications at decoding time.

[0005] In a speech decoding apparatus corresponding to the speech encoding apparatus described above, a post filter is known to be used to improve subjective sound quality. The post filter performs noise shaping by exploiting auditory masking. Noise shaping here means reshaping the spectrum of the quantization noise, which is inherently almost flat, so that it approaches the spectrum of the speech being processed and masking suppresses the perception of the noise. Such a post filter is generally placed immediately after the speech decoding apparatus.

[0006] FIG. 10 shows a configuration in which a post filter is provided immediately after a CELP decoding apparatus.

[0007] In FIG. 10, an adaptive codebook 111 is connected to a first input terminal of an adder 115 via a multiplier 112, and a stochastic codebook 113 is connected to a second input terminal of the adder 115 via a multiplier 114. The output terminal of the adder 115 is connected to a post filter 101 via a synthesis filter 116 and to the adaptive codebook 111 via a delay circuit 117. A demultiplexer 118 is connected to each of the adaptive codebook 111, the stochastic codebook 113, the synthesis filter 116, and the post filter 101.

[0008] In this configuration, the demultiplexer 118 separates the received signal into the linear prediction parameters a, the delay L and gain b of the adaptive codebook, and the index i and gain g of the stochastic codebook. It outputs the linear prediction parameters a to the synthesis filter 116 and the post filter 101, the delay L to the adaptive codebook 111 and the post filter 101, the gain b to the multiplier 112, the index i to the stochastic codebook 113, and the gain g to the multiplier 114.

[0009] A code vector of the adaptive codebook 111 is selected on the basis of the adaptive codebook delay L output from the demultiplexer 118. The adaptive codebook 111 has the same contents as the adaptive codebook in the encoding apparatus, and receives the past excitation signal via the delay circuit 117. The multiplier 112 scales the input adaptive code vector by the gain b and sends it to the adder 115.

[0010] Meanwhile, a code vector of the stochastic codebook 113 is selected on the basis of the stochastic codebook index i output from the demultiplexer 118. The stochastic codebook 113 has the same contents as the stochastic codebook in the encoding apparatus. The multiplier 114 scales the input code vector by the gain g and sends it to the adder 115.

[0011] The adder 115 adds the scaled stochastic code vector and the scaled adaptive code vector to produce the excitation signal, and sends it to the synthesis filter 116 and the delay circuit 117.
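As an illustration of paragraphs [0009] to [0011], the sketch below forms one subframe of the excitation as b times the adaptive code vector plus g times the stochastic code vector. It is a minimal sketch, not the decoder of FIG. 10: the function name, the list-based signal representation, and the periodic repetition of the adaptive vector when the delay L is shorter than the subframe are all assumptions.

```python
def build_excitation(past_excitation, delay, gain_b, stochastic_vector, gain_g):
    """Return one subframe: gain_b * adaptive vector + gain_g * stochastic vector.

    past_excitation : earlier excitation samples, oldest first (hypothetical buffer)
    delay           : adaptive codebook delay L, in samples
    """
    if not 1 <= delay <= len(past_excitation):
        raise ValueError("delay must lie within the stored excitation history")
    start = len(past_excitation) - delay
    excitation = []
    for k, c in enumerate(stochastic_vector):
        # Adaptive code vector: the excitation from `delay` samples ago,
        # repeated periodically if the subframe is longer than the delay.
        adaptive = past_excitation[start + (k % delay)]
        excitation.append(gain_b * adaptive + gain_g * c)
    return excitation


history = [1.0, 2.0, 3.0, 4.0]
frame = build_excitation(history, 2, 0.5, [1.0, 0.0, 0.0], 2.0)
history = history + frame  # role of the delay circuit: feed the excitation back
```

Appending the new subframe to the history plays the part of the delay circuit 117, so the next subframe can select its adaptive code vector from it.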

[0012] The synthesis filter 116 performs synthesis on the excitation signal, using the received linear prediction parameters a as its coefficients, and outputs a decoded signal.
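The synthesis step in [0012] can be sketched as an all-pole filter driven by the excitation. The text does not give the sign convention of the linear prediction parameters, so the sketch assumes A(z) = 1 - sum of a_i z^-i, that is s[n] = e[n] + sum of a_i s[n-i]; the function name and the state handling are likewise illustrative.

```python
def synthesis_filter(excitation, a, memory=None):
    """All-pole filter 1/A(z), assuming A(z) = 1 - sum_i a[i-1] * z^-i."""
    order = len(a)
    # memory[0] holds the most recent output sample s[n-1].
    memory = list(memory) if memory is not None else [0.0] * order
    decoded = []
    for e in excitation:
        s = e + sum(a_i * m for a_i, m in zip(a, memory))
        decoded.append(s)
        memory = ([s] + memory)[:order]  # shift the newest sample into the state
    return decoded, memory
```

Returning the filter memory lets the caller carry the state from one frame to the next.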

[0013] The post filter 101 performs spectral shaping on the input decoded signal using the linear prediction parameters a and the adaptive codebook delay L.

[0014] The post filter 101 has a well-known configuration, for example the one described in the paper by Juin-Hwey Chen and Allen Gersho entitled "Adaptive Postfiltering for Quality Enhancement of Coded Speech" (IEEE Transactions on Speech and Audio Processing, Vol. 3, No. 1, pp. 59-71, January 1995).

[0015] In that example, as shown in FIG. 11, the post filter 101 comprises a pitch emphasis filter 121, a formant emphasis filter 122, a spectral tilt correction filter 123, and a gain control section 124, connected in cascade as shown in the figure.

[0016] The formant emphasis filter 122 emphasizes the first, second, and third formants X, Y, and Z of the spectral envelope shown in FIG. 5, which is derived from a speech waveform such as that shown in FIG. 4. The formant peaks X, Y, and Z characterize a vowel. The dotted line in FIG. 5 indicates the spectral tilt.

[0017] Here, the transfer function Hp(z) of the pitch emphasis filter 121, the transfer function Hs(z) of the formant emphasis filter 122, and the transfer function Ht(z) of the spectral tilt correction filter 123 are given by the following equations, respectively.

[0018]

(Equation 1)

Here, p is the pitch period (in samples), and γ and λ are weighting coefficients.
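The equation image for Equation 1 is not reproduced in this text. A form consistent with the symbols defined here and with the pitch postfilter of the Chen and Gersho paper cited in [0014] would be the following; it is a reconstruction under that assumption (possibly omitting a gain factor), not the patent's verbatim equation.

```latex
H_p(z) = \frac{1 + \gamma z^{-p}}{1 - \lambda z^{-p}}
```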

[0019]

(Equation 2)

Here, a_i are the spectral parameters, n is the linear prediction order, and α and β are weighting coefficients, normally set in the range 0 < β < α < 1.
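The equation image for Equation 2 is not reproduced in this text. Given the symbols defined here and the formant postfilter structure of the Chen and Gersho paper cited in [0014], a plausible form is shown below. The sign in front of the sums depends on how the linear prediction parameters a_i are defined, which the text does not state, so this is an assumed reconstruction.

```latex
H_s(z) = \frac{1 - \sum_{i=1}^{n} a_i \beta^{i} z^{-i}}{1 - \sum_{i=1}^{n} a_i \alpha^{i} z^{-i}},
\qquad 0 < \beta < \alpha < 1
```

With α = β the numerator and denominator cancel and the filter passes the signal unchanged; the further α exceeds β, the more strongly the formant peaks are emphasized.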

[0020]

(Equation 3)

Here, μ is a correction coefficient, set in the range 0 < μ < 1.
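The equation image for Equation 3 is not reproduced in this text. A first-order form that matches the single coefficient μ defined here, and that is customary for tilt compensation in the postfilter literature, is the following; treat it as an assumption rather than the patent's verbatim equation.

```latex
H_t(z) = 1 - \mu z^{-1}, \qquad 0 < \mu < 1
```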

[0021] The pitch emphasis filter with the transfer function of Equation 1 emphasizes the pitch component of the reproduced speech and thereby increases its clarity. The formant emphasis filter with the transfer function of Equation 2 emphasizes the formant structure of the reproduced speech; the energy of the quantization noise spectrum is likewise concentrated in the formant regions, bringing it closer to the spectral envelope of the reproduced speech and making the quantization noise harder to perceive. The spectral tilt correction filter with the transfer function of Equation 3 removes the spectral tilt that the formant emphasis filter could not correct.

[0022]

PROBLEM TO BE SOLVED BY THE INVENTION

However, the post filter described above is intended solely to improve subjective sound quality for voiced sounds, which have pitch structure and formant peaks. When post filtering is applied to consonants and non-speech sections, which have neither pitch structure nor formant peaks, and especially to non-speech sections contaminated by background noise, the pitch emphasis filter and the formant emphasis filter emphasize pitch and formants that were never present.

[0023] Furthermore, background noise such as that from air conditioners and automobiles generally has a sloping spectrum, with large energy in the low frequency band and gradually decreasing energy toward the high frequency band. For such background noise, the spectral tilt correction filter ends up emphasizing the high band instead. For these reasons, the reproduced sound becomes very unnatural.
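The high-band emphasis can be made concrete with a short worked example. It assumes the first-order tilt filter H_t(z) = 1 - μ z^{-1} (the equation itself is not reproduced in this text) and an illustrative value μ = 0.5.

```latex
\left| H_t(e^{j\omega}) \right| = \sqrt{1 + \mu^{2} - 2\mu\cos\omega}
\quad\Rightarrow\quad
\left| H_t(e^{j0}) \right| = 1 - \mu = 0.5,
\qquad
\left| H_t(e^{j\pi}) \right| = 1 + \mu = 1.5

20\log_{10}\frac{1.5}{0.5} \approx 9.5\ \text{dB}
```

Under these assumptions the top of the band is raised by about 9.5 dB relative to the bottom, which brightens noise whose energy was concentrated at low frequencies.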

[0024] The present invention has been made in view of these problems, and its object is to provide a post filter that yields natural-sounding reproduced sound even when the input signal is non-speech.

[0025]

MEANS FOR SOLVING THE PROBLEM

To achieve the above object, a first post filter of the present invention comprises: pitch structure control means for controlling the pitch structure of an input signal on the basis of a pitch parameter; formant structure control means for controlling the formant structure on the basis of a spectral parameter and speech/non-speech information that distinguishes whether the input signal is speech or non-speech; spectral tilt correction means for correcting the spectral tilt of the output signal of the formant structure control means; gain control means for controlling the gain of the output signal of the formant structure control means; and selection means for performing control, on the basis of the speech/non-speech information, so that the pitch structure control means is executed when the input signal is speech and is not executed when the input signal is non-speech.

[0026] To achieve the above object, a second post filter of the present invention is the first post filter further comprising first coefficient control means for adaptively changing a formant control coefficient of the formant structure control means, on the basis of the speech/non-speech information, so that the formants are emphasized when the input signal is speech and flattened when the input signal is non-speech.

[0027] To achieve the above object, a third post filter of the present invention is the first or second post filter further comprising second coefficient control means for adaptively changing a spectral tilt correction coefficient of the spectral tilt correction means, on the basis of the speech/non-speech information, so that the spectral tilt is corrected when the input signal is speech and preserved when the input signal is non-speech.

[0028] In the first post filter, the pitch structure control means controls the pitch structure of the input signal on the basis of the pitch parameter, and the formant structure control means controls the formant structure on the basis of the spectral parameter and the speech/non-speech information that distinguishes whether the input signal is speech or non-speech. The spectral tilt correction means corrects the spectral tilt of the output signal of the formant structure control means, and the gain control means controls the gain of the output signal of the formant structure control means. The selection means performs control, on the basis of the speech/non-speech information, so that the pitch structure control means is executed when the input signal is speech and is not executed when the input signal is non-speech.

[0029] In the second post filter, in addition to the operation of the first post filter, the first coefficient control means adaptively changes the formant control coefficient of the formant structure control means, on the basis of the speech/non-speech information, so that the formants are emphasized when the input signal is speech and flattened when the input signal is non-speech.

[0030] In the third post filter, in addition to the operation of the first or second post filter, the second coefficient control means adaptively changes the spectral tilt correction coefficient of the spectral tilt correction means, on the basis of the speech/non-speech information, so that the spectral tilt is corrected when the input signal is speech and preserved when the input signal is non-speech.

[0031]

EMBODIMENTS OF THE INVENTION

Embodiments of the present invention will be described below with reference to the drawings.

[0032] FIG. 1 is a block diagram showing the configuration of a post filter according to an embodiment to which the present invention is applied. The post filter of this embodiment is described as being placed immediately after a variable bit rate encoding/decoding apparatus. That apparatus classifies the speech state of the input signal in the frame being processed as one of two types, speech or non-speech, and encodes/decodes at a predetermined bit rate for each type.

[0033] FIG. 2 is a block diagram showing the configuration of a variable bit rate CELP decoding apparatus to which the post filter of the embodiment shown in FIG. 1 is applied; the post filter is provided immediately after the variable bit rate CELP decoding apparatus.

[0034] First, the configuration of the variable bit rate CELP decoding apparatus is briefly described. As shown in the figure, an adaptive codebook 11 is connected to a first input terminal of an adder 15 via a multiplier 12, and a stochastic codebook 13 is connected to a second input terminal of the adder 15 via a multiplier 14. The output terminal of the adder 15 is connected to the post filter 1 shown in FIG. 1 via a synthesis filter 16 and to the adaptive codebook 11 via a delay circuit 17. The configuration of the post filter 1 is described in detail later.

[0035] The variable bit rate CELP decoding apparatus also includes a demultiplexer 18, which is connected to each of the adaptive codebook 11, the stochastic codebook 13, the synthesis filter 16, and the post filter 1.

[0036] In addition, a speech state determination unit 19 is connected to the demultiplexer 18.

[0037] The demultiplexer 18 separates the coded data signal received via the speech state determination unit 19 into the linear prediction parameters a, which are spectral parameters; the delay L and gain b of the adaptive codebook, which are pitch parameters; and the index i and gain g of the stochastic codebook. It outputs the linear prediction parameters a to the synthesis filter 16 and the post filter 1, the adaptive codebook delay L to the adaptive codebook 11 and the post filter 1, the gain b to the multiplier 12, the stochastic codebook index i to the stochastic codebook 13, and the gain g to the multiplier 14.

[0038] The apparatus selects a code vector of the adaptive codebook 11 on the basis of the adaptive codebook delay L output from the demultiplexer 18. The adaptive codebook 11 has the same contents as the adaptive codebook in the encoding apparatus.

[0039] The past excitation signal is input to the adaptive codebook 11 via the delay circuit 17. The multiplier 12 receives the gain b from the demultiplexer 18, scales the adaptive code vector input from the adaptive codebook 11 by that gain, and sends the result to the adder 15.

[0040] A code vector of the stochastic codebook 13 is selected on the basis of the stochastic codebook index i output from the demultiplexer 18. The stochastic codebook 13 has the same contents as the stochastic codebook in the encoding apparatus. The multiplier 14 receives the gain g from the demultiplexer 18, scales the stochastic code vector input from the stochastic codebook 13 by that gain, and sends the result to the adder 15.

[0041] The adder 15 adds the scaled stochastic code vector and the scaled adaptive code vector to produce the excitation signal, and sends it to the synthesis filter 16 and the delay circuit 17.

[0042] The synthesis filter 16 performs synthesis on the excitation signal, using the received linear prediction parameters a as its coefficients, and outputs a decoded signal.

[0043] The post filter 1 performs spectral shaping on the input decoded signal using the linear prediction parameters a and the adaptive codebook delay L.

[0044] Also in FIG. 2, the speech state determination unit 19 reads information on the speech state, that is, speech/non-speech information f, from the coded data, reads a predetermined amount of coded data according to f, and outputs f to the post filter 1. The demultiplexer 18 then separates the data into the individual parameters using the bit allocation determined by f, after which decoding is performed.

[0045] Returning to FIG. 1, the post filter of a first embodiment to which the present invention is applied will now be described.

[0046] The post filter 1 of this embodiment comprises a selection switch 2 serving as the selection means, a pitch control filter 3 serving as the pitch structure control means, a formant control filter 4 serving as the formant structure control means, a spectral tilt correction filter 5 serving as the spectral tilt correction means, a gain control unit 6 serving as the gain control means, and a coefficient control unit 7 serving as the first and second coefficient control means.

[0047] In the post filter of this embodiment, the transfer function Hp1(z) of the pitch control filter 3, the transfer function Hs1(z) of the formant control filter 4, and the transfer function Ht1(z) of the spectral tilt correction filter 5 are given by the following equations, respectively.

[0048]

(Equation 4)

Here, p is the pitch period (in samples), and γ and λ are weighting coefficients.
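Equation 4 is not reproduced in this text. Assuming it has the form Hp1(z) = (1 + γ z^-p) / (1 - λ z^-p), as in the Chen and Gersho structure cited in [0014], the pitch control filter reduces to the difference equation y[n] = x[n] + γ x[n-p] + λ y[n-p]. The sketch below implements that assumed form; the function name and the zero initial state are illustrative.

```python
def pitch_filter(x, p, gamma, lam):
    """Apply y[n] = x[n] + gamma * x[n-p] + lam * y[n-p] with zero initial state."""
    if p < 1:
        raise ValueError("pitch period p must be at least one sample")
    y = []
    for n, sample in enumerate(x):
        x_past = x[n - p] if n >= p else 0.0  # input one pitch period earlier
        y_past = y[n - p] if n >= p else 0.0  # output one pitch period earlier
        y.append(sample + gamma * x_past + lam * y_past)
    return y
```

Fed a single impulse, this filter produces echoes spaced p samples apart, which is how it reinforces a periodic (pitched) component.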

[0049]

(Equation 5)

Here, a_i are the linear prediction parameters, which are the spectral parameters, n is the linear prediction order, and α and β are weighting coefficients, which in this embodiment are controlled within the range 0 < β < α < 1.
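Equation 5 is not reproduced in this text. The sketch below assumes the pole-zero form Hs1(z) = (1 - Σ a_i β^i z^-i) / (1 - Σ a_i α^i z^-i), with the sign convention A(z) = 1 - Σ a_i z^-i; both the form and the sign convention are assumptions, as are the function name and the zero initial state.

```python
def formant_filter(x, a, alpha, beta):
    """Pole-zero filter y[n] = x[n] + sum_i a_i * (alpha^i * y[n-i] - beta^i * x[n-i])."""
    # Bandwidth-expanded coefficients: a_i * beta^i (zeros) and a_i * alpha^i (poles).
    num = [a_i * beta ** (i + 1) for i, a_i in enumerate(a)]
    den = [a_i * alpha ** (i + 1) for i, a_i in enumerate(a)]
    y = []
    for n in range(len(x)):
        acc = x[n]
        for i in range(len(a)):
            k = n - (i + 1)
            if k >= 0:
                acc += den[i] * y[k] - num[i] * x[k]
        y.append(acc)
    return y
```

When α equals β the pole and zero terms cancel and the input passes through unchanged, so the gap between α and β sets how strongly the formants are shaped.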

[0050]

(Equation 6)

Here, μ is a correction coefficient, which in this embodiment is controlled within the range 0 < μ < 1.
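Equation 6 is not reproduced in this text. Assuming the customary first-order form Ht1(z) = 1 - μ z^-1, the spectral tilt correction filter is the difference equation y[n] = x[n] - μ x[n-1], sketched below. The function name and the `prev` argument for carrying state across frames are illustrative.

```python
def tilt_filter(x, mu, prev=0.0):
    """Apply y[n] = x[n] - mu * x[n-1]; `prev` is the last sample of the previous frame."""
    y = []
    for sample in x:
        y.append(sample - mu * prev)
        prev = sample
    return y
```

A constant input is attenuated toward 1 - μ while an alternating input is boosted toward 1 + μ, so a larger μ tilts the spectrum more strongly toward high frequencies and a μ near zero leaves the tilt almost untouched.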

[0051] The pitch control filter 3 thus has the transfer function given by Equation 4. Whether it is executed is selected by the selection switch 2, which operates on the speech/non-speech information f indicating whether the received sound is speech or non-speech. When f indicates that the received sound is "speech", the selection switch 2 selects execution of the pitch control filter 3; when the pitch control filter 3 is executed, the pitch component of the reproduced speech is emphasized and its clarity increases.

[0052] The formant control filter 4 has the transfer function given by Equation 5. Its coefficients are controlled by the coefficient control unit 7, which operates on the speech/non-speech information f. When the coefficient control unit 7 determines from f that the received sound is "speech", it sets the coefficients α and β of the formant control filter 4 so as to emphasize the formants. As a result, the formant structure of the reproduced speech is emphasized, and the spectrum of the quantization noise is shaped so that the quantization noise is reduced.

[0053] Conversely, when the coefficient control unit 7 determines from f that the received sound is "non-speech", it sets the coefficients α and β of the formant control filter 4 so as to flatten the formants.

[0054] The spectral tilt correction filter 5 has the transfer function given by Equation 6. Its coefficient is controlled by the coefficient control unit 7, which operates on the speech/non-speech information f. When the coefficient control unit 7 determines from f that the received sound is "speech", it sets the coefficient μ of the spectral tilt correction filter 5 so as to correct the spectral tilt. As a result, the spectral tilt that the formant control filter 4 could not correct is corrected.

[0055] Conversely, when the coefficient control unit 7 determines from f that the received sound is "non-speech", it sets the coefficient μ of the spectral tilt correction filter 5 so that the spectral tilt is preserved.

[0056] The operation of the post filter of this embodiment, configured as described above, is now described in detail with reference to the flowchart shown in FIG. 3.

[0057] When the decoded signal from the synthesis filter 16 and the speech/non-speech information f from the speech state determination unit 19 are input to the post filter 1, the value of f is examined to determine whether the decoded signal from the synthesis filter 16 is speech or non-speech (step S1). In this embodiment, f = 1 indicates speech and f = 0 indicates non-speech.

[0058] If the determination in step S1 is yes, the switch 2 is closed to the terminal b side and the pitch control filter 3 is executed (step S2). The coefficient control unit 7 then sets the weighting coefficients α and β of the formant control filter 4 to 0.8 and 0.5, respectively (step S3), and sets the coefficient μ of the spectral tilt correction filter 5 to 0.5 (step S4).

[0059] Thereafter, the formant control filter 4 (step S7), the spectral tilt correction filter 5 (step S8), and the gain control unit 6 (step S9) are executed in turn, and the result is output as the post-filtered signal (step S10).

[0060] If the determination in step S1 is no, that is, if the signal is non-voice, the switch 2 is closed to the terminal c side so that the pitch control filter 3 is not executed. Next, the coefficient control unit 7 sets the weighting coefficients α and β of the formant control filter 4 to 0.5 and 0.1, respectively (step S5), and sets the correction coefficient μ of the spectrum tilt correction filter 5 to 0.1 (step S6). Thereafter, as in the voice case, the formant control filter 4 (step S7), the spectrum tilt correction filter 5 (step S8), and the gain control unit 6 (step S9) are each executed, and the result is output as a post-filtered signal (step S10).
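
The branch taken in steps S1 to S6 can be summarized in a minimal sketch. The coefficient values and the meaning of f are taken from the description; the names `PostFilterSettings` and `select_settings` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PostFilterSettings:
    run_pitch_filter: bool  # switch 2: terminal b (True) or terminal c (False)
    alpha: float            # weighting coefficient of formant control filter 4
    beta: float             # weighting coefficient of formant control filter 4
    mu: float               # coefficient of spectrum tilt correction filter 5


def select_settings(f: int) -> PostFilterSettings:
    """Steps S1 to S6: choose the switch position and coefficients from f."""
    if f == 1:  # voice: steps S2, S3, S4
        return PostFilterSettings(True, alpha=0.8, beta=0.5, mu=0.5)
    if f == 0:  # non-voice: pitch filter skipped, steps S5, S6
        return PostFilterSettings(False, alpha=0.5, beta=0.1, mu=0.1)
    raise ValueError("f must be 0 (non-voice) or 1 (voice)")
```

Either branch then continues with the same steps S7 to S10, so only the settings differ between voice and non-voice frames.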

[0061] In general, as shown in FIG. 6, "voice" has, depending on the frequency band, bands in which the difference between the voice level and the quantization noise level is relatively large and the sound is easy to hear, and bands in which that difference is small and the sound is hard to hear. When such voice is input, the post filter of this embodiment, through the operation described above, brings the spectral characteristic of the quantization noise closer to that of the voice, as shown in FIG. 7.

[0062] Furthermore, when "voice" is input, the spectrum tilt correction filter 5 corrects the tilt of the spectral characteristic so that the tilt of the quantization noise spectrum does not deviate from the tilt of the voice spectrum.

[0063] On the other hand, when "non-voice" is input, for example the background noise of an automobile or an air conditioner as shown in FIG. 8, passing it through a conventional spectrum tilt correction filter tended to emphasize the spectral characteristic in the high frequency band, as indicated by "A" in FIG. 9. Taking this into account, the post filter of this embodiment, as described above, controls the filter when "non-voice" is input so that the tilt of the spectrum is maintained and the high frequency band is not emphasized. This improves the characteristic, as indicated by "B" in FIG. 9.
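
The effect of lowering μ for non-voice input can be checked numerically with a small sketch. It assumes a first-order tilt correction filter of the form 1 − μz⁻¹, which is a common choice but is an assumption here; the function names are hypothetical.

```python
import math


def tilt_gain_db(mu: float, omega: float) -> float:
    """Gain in dB of the assumed filter 1 - mu*z^-1 at angular frequency omega."""
    real = 1.0 - mu * math.cos(omega)
    imag = mu * math.sin(omega)
    return 20.0 * math.log10(math.hypot(real, imag))


def high_band_emphasis_db(mu: float) -> float:
    """Gain at the Nyquist frequency relative to the gain at DC, in dB."""
    return tilt_gain_db(mu, math.pi) - tilt_gain_db(mu, 0.0)


voice_emphasis = high_band_emphasis_db(0.5)      # mu set in step S4
non_voice_emphasis = high_band_emphasis_db(0.1)  # mu set in step S6
```

Under this assumption, μ = 0.5 raises the top of the band by roughly 9.5 dB relative to DC, whereas μ = 0.1 raises it by only about 1.7 dB, which is consistent with the tilt of a noise-like spectrum being largely maintained.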

[0064] Thus, through the processing described above, when "voice" is input, the post filter of this embodiment performs pitch enhancement with the pitch control filter 3, enhances the formant structure with the formant control filter 4, and corrects the tilt of the spectrum with the spectrum tilt correction filter 5, as in the conventional art. When "non-voice" is input, on the other hand, it does not execute the pitch control filter 3, flattens the formant structure with the formant control filter 4, and controls the spectrum tilt correction filter 5 so that the gentle tilt of the spectrum is maintained.

[0065] As described in detail above, the post filter of the above embodiment makes it possible to obtain reproduced sound without a sense of incongruity even in consonant parts and non-voice sections, which have neither a pitch structure nor formant peaks.

【００６６】[0066]

[Effects of the Invention] As described above, the invention according to claim 1 provides a post filter that yields reproduced sound without a sense of incongruity even when the input signal is non-voice.

[0067] The invention according to claim 2 provides a post filter that yields reproduced sound without a sense of incongruity even in consonant parts and non-voice sections, which have neither a pitch structure nor formant peaks.

[0068] Furthermore, the invention according to claim 3 provides a post filter that yields reproduced sound with even less sense of incongruity in consonant parts and non-voice sections, which have neither a pitch structure nor formant peaks.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of a post filter according to an embodiment of the present invention.

FIG. 2 is a block diagram showing the configuration of a variable bit rate CELP decoding apparatus to which the post filter of the embodiment shown in FIG. 1 is applied.

FIG. 3 is a flowchart showing the specific operation of the post filter of the embodiment.

FIG. 4 is a diagram showing an example of a voice waveform.

FIG. 5 is a diagram showing an example of a spectral envelope.

FIG. 6 is a diagram showing an example of the characteristic of "voice" input to the post filter of the embodiment.

FIG. 7 is a diagram showing the spectral characteristic of the quantization noise after processing when the "voice" shown in FIG. 6 is input to the post filter of the embodiment.

FIG. 8 is a diagram showing an example of the characteristic of "non-voice" input to the post filter of the embodiment.

FIG. 9 is a diagram showing the spectral characteristics of the reproduced signal when non-voice is input, for the post filter of the embodiment and for a conventional post filter.

FIG. 10 is a block diagram showing a configuration example of a conventional post filter.

FIG. 11 is a block diagram showing the configuration of a variable bit rate CELP decoding apparatus to which the post filter shown in FIG. 10 is applied.

[Explanation of symbols]

1: post filter
2: selection switch
3: pitch control filter
4: formant control filter
5: spectrum tilt correction filter
6: gain control unit
7: coefficient control unit
11: adaptive codebook
13: probability codebook
16: synthesis filter
17: delay circuit
18: demultiplexer
19: voice state determination unit

Claims

[Claims]

1. A post filter comprising: pitch structure control means for controlling the pitch structure of an input signal based on a pitch parameter; formant structure control means for controlling the formant structure based on a spectrum parameter and on voice/non-voice information that distinguishes whether the input signal is voice or non-voice; spectrum tilt correction means for correcting the tilt of the spectrum of the output signal of the formant structure control means; gain control means for controlling the gain of the output signal of the formant structure control means; and selection means for selecting, based on the voice/non-voice information, to execute the pitch structure control means when the input signal is voice and not to execute the pitch structure control means when the input signal is non-voice.

2. The post filter according to claim 1, further comprising first coefficient control means for adaptively changing the coefficients of the formant structure control means, based on the voice/non-voice information, so that formants are enhanced when the input signal is voice and flattened when the input signal is non-voice.

3. The post filter according to claim 1, further comprising second coefficient control means for adaptively changing the spectrum tilt correction coefficient of the spectrum tilt correction means, based on the voice/non-voice information, so that the tilt of the spectrum is corrected when the input signal is voice and maintained when the input signal is non-voice.