JPH04591A

JPH04591A - Method for feature extraction

Info

Publication number: JPH04591A
Application number: JP2100832A
Authority: JP
Inventors: Koichi Higuchi; 浩一樋口; Yoshiyuki Yamashita; 山下　義征
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1990-04-17
Filing date: 1990-04-17
Publication date: 1992-01-06

Abstract

PURPOSE: To stabilize the extracted features even for an inclined character pattern, and to speed up character recognition without requiring dictionaries for each character inclination, by extracting the inclination of an auxiliary subpattern from the inclination of the line elements it contains, scanning the character pattern according to that inclination, and extracting the features of the pattern. CONSTITUTION: The first subpattern extracting part 6 scans the character pattern stored in a pattern register 4 in the vertical and horizontal directions. A subpattern expressing the line element component in each scanning direction is extracted from the relation between the run of continuous black bits on the scanning line and the line width of the pattern. A character inclination extracting part 7 finds the inclination of the subpattern from the inclination of the line elements included in it. The second subpattern extracting part 10 scans the character pattern in the respective directions according to the inclination thus obtained and extracts subpatterns, from which the features are then extracted. As a result, the extracted features are stable even for an inclined character pattern, the identification part 12 no longer needs dictionaries prepared for each deformation of character inclination, collation time is shortened, and processing speed is increased.

Description

Detailed Description of the Invention

(Industrial Application Field)
The present invention relates to a feature extraction method that scans a character pattern, obtained by photoelectrically converting characters on a medium, in multiple directions and extracts features from that character pattern, and in particular to a fast and stable feature extraction method.

(Prior Art)
Conventional feature extraction methods of this type include those described in Japanese Unexamined Patent Publication No. Sho 57-23185 (Document 1) and Japanese Examined Patent Publication No. Sho 38-55551 (Document 2).

In the technique of Document 1, subpatterns representing the line element components (stroke components) in each direction are extracted from a character pattern obtained by photoelectrically converting characters on a medium. Each subpattern is divided into multiple regions, and for each region a feature representing the amount of line elements of the subpattern is extracted to generate a feature matrix. The feature matrix is then collated with a dictionary prepared in advance to recognize the character.

In the technique of Document 2, the character pattern is scanned and the number of intersections between each scanning line and the strokes is extracted as the feature quantity.
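
The crossing-count feature of Document 2 can be illustrated with a minimal sketch. It assumes a binary scan line in which 1 is a stroke pixel and 0 is background; the function name `crossing_count` is hypothetical and is not taken from either cited document.

```python
def crossing_count(scan_line):
    """Count how many separate strokes one scan line crosses.

    A crossing is counted at every white-to-black transition.
    """
    count = 0
    previous = 0  # treat the area before the line as background
    for bit in scan_line:
        if bit == 1 and previous == 0:
            count += 1
        previous = bit
    return count


# A horizontal scan line through the middle of a "0" crosses two strokes.
row = [0, 1, 1, 0, 0, 0, 1, 1, 0]
print(crossing_count(row))  # prints 2
```

Because the count depends on where a fixed scan line meets the strokes, an inclined stroke can change it, which is the instability discussed in the next section.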

(Problems to be Solved by the Invention)
However, the feature extraction methods described above have the following problems.

(i) FIG. 2 shows an example of italic printed characters. FIGS. 3(a) to 3(d) schematically illustrate the numeral "0" from that figure: FIG. 3(a) is an example of a non-italic character pattern "0"; FIG. 3(b) is an example of the italic character "0"; FIG. 3(c) is the vertical subpattern extracted from the character pattern of FIG. 3(a) by the method of Document 1; and FIG. 3(d) is the vertical subpattern extracted from the character pattern of FIG. 3(b) by the same method.

When the vertical subpattern of the character pattern of FIG. 3(a) is extracted with the feature extraction method of Document 1, the vertical components are extracted faithfully, as shown in FIG. 3(c).

However, when the vertical subpattern is extracted from the italic character of FIG. 3(b), the line element components are inclined, as shown in FIG. 3(d). Moreover, although the line element components indicated by the broken lines in FIG. 3(d) should be extracted, both ends of each line element are lost because the inclined line element is scanned vertically. As a result, different features are extracted from the same character shape "0"; that is, the extracted features are unstable. Handwritten characters are also commonly deformed so that they rise to the right, which inclines the horizontal line elements and makes the extracted features unstable in the same way.

Consequently, when characters are recognized by collating the extracted features with a dictionary, a large number of dictionaries corresponding to the deformations of character inclination must be prepared. This increases the dictionary capacity, lowers the processing speed because collation takes longer, and increases the hardware scale (device scale).

(ii) The technique of Document 2 uses the number of intersections between a scanning line and the line elements as the feature quantity. If the line elements are inclined as in FIG. 3(b) while the scanning direction is fixed, the extracted number of intersections (the feature quantity) varies greatly, and the same problem as in (i) arises.

The present invention provides a feature extraction method that solves these problems of the prior art, namely the instability of the extracted features, the reduced processing speed, and the increased device size.

(Means for Solving the Problems)
To solve the above problems, the first invention takes the following measures in a feature extraction method that scans a character pattern, obtained by photoelectrically converting characters on a medium, in a plurality of directions and extracts features from the character pattern.

That is, the character pattern is scanned in the vertical direction, the horizontal direction, or both. An auxiliary subpattern representing the line element component in the scanning direction is extracted from the relationship between the number of consecutive black pixels in that direction and the line width of the character pattern. The inclination of the auxiliary subpattern is extracted from the inclination of the line elements it contains. The character pattern is then scanned in a plurality of directions based on the inclination of the auxiliary subpattern, and features are extracted from the character pattern.

The second invention takes the following measures in a feature extraction method that scans a character pattern, obtained by photoelectrically converting characters on a medium, in a plurality of directions, extracts a subpattern representing the line element component in each scanning direction from the relationship between the number of consecutive black pixels in that direction and the line width, divides the subpattern into a plurality of regions, and extracts, for each region, a feature representing the amount of line elements of the subpattern.

That is, the character pattern is scanned in the vertical direction, the horizontal direction, or both. An auxiliary subpattern representing the line element component in the scanning direction is extracted from the relationship between the number of consecutive black pixels in that direction and the line width of the character pattern. The inclination of the auxiliary subpattern is extracted from the inclination of the line elements it contains, and the subpattern is then extracted based on that inclination.

(Operation)
With the feature extraction method of the first and second inventions configured as described above, when a character pattern is input it is scanned in the vertical direction, the horizontal direction, or both, an auxiliary subpattern representing the line element component in the scanning direction is extracted, and the inclination of that auxiliary subpattern is extracted.

In the first invention, the character pattern is then scanned according to the extracted inclination, and feature quantities such as the number of intersections between the scanning lines and the strokes are extracted.

In the second invention, a subpattern is extracted according to the extracted inclination, the subpattern is divided into a plurality of regions, and the features of the subpattern are extracted for each region.

As a result, the extracted features are stable even for inclined character patterns, whether handwritten or printed, and the problems described above are solved.

(Embodiment)
FIG. 1 is a functional block diagram of a character recognition device that uses the feature extraction method of an embodiment of the present invention.

This character recognition device has a photoelectric conversion unit 1 that converts the optical signal IN of a character image on a medium such as a form into a quantized electric signal (digital signal). A line buffer 2 is connected to its output. The line buffer 2 has a size of, for example, 2048 bits wide by 128 bits high and stores the digital signal of one line of the character image. A pattern register 4 is connected to the output of the line buffer 2 via a character cutting unit 3. The character cutting unit 3 has a memory and stores the digital signal for one character (referred to as a "character pattern") from the output of the line buffer 2 into the pattern register 4.

The pattern register 4 has a storage capacity of, for example, 64 x 64 bits. A line width measurement unit 5 and a first subpattern extraction unit 6 are connected to its output, and a character inclination extraction unit 7 is connected to the output of the first subpattern extraction unit 6. A division point determination unit 9 is connected to the output of the pattern register 4 via a character frame detection unit 8, and a second subpattern extraction unit 10 is also connected to that output. A feature matrix extraction unit 11 is connected to the outputs of the division point determination unit 9 and the second subpattern extraction unit 10, and an identification unit 12 is connected to the output of the feature matrix extraction unit 11.

The line width measurement unit 5 measures the line width of the output of the pattern register 4 and supplies the result to the first and second subpattern extraction units 6 and 10. The first subpattern extraction unit 6 scans the pattern register 4 in the vertical and horizontal directions to extract a first vertical subpattern (auxiliary vertical subpattern) and a first horizontal subpattern (auxiliary horizontal subpattern). It consists of a vertical subpattern extraction unit 6a and a horizontal subpattern extraction unit 6b, each of which has a memory for storing a pattern.

The character inclination extraction unit 7 extracts the inclinations of the first vertical subpattern and the first horizontal subpattern extracted by the first subpattern extraction unit 6, and outputs the results to the second subpattern extraction unit 10 and the feature matrix extraction unit 11.

The character frame detection unit 8 scans the character pattern in the pattern register 4 to detect the circumscribed frame, that is, the character frame, and supplies the result to the division point determination unit 9. The division point determination unit 9 determines the division point coordinates that divide the inside of the circumscribed frame into a plurality of partial regions, and outputs the result to the feature matrix extraction unit 11.

The second subpattern extraction unit 10 scans the pattern register 4 in a plurality of directions, based on the inclination extracted by the character inclination extraction unit 7, to extract a second vertical subpattern and a second horizontal subpattern as well as a right-diagonal subpattern and a left-diagonal subpattern. It consists of a vertical subpattern extraction unit 10a, a horizontal subpattern extraction unit 10b, a right-diagonal subpattern extraction unit 10c, and a left-diagonal subpattern extraction unit 10d, each of which has a memory for storing a pattern.

The feature matrix extraction unit 11 extracts feature quantities from the vertical, horizontal, right-diagonal, and left-diagonal subpatterns output by the second subpattern extraction unit 10, creates a feature matrix, and supplies it to the identification unit 12.

The identification unit 12 has a dictionary memory that stores the feature matrices of standard characters (standard character masks) G(k) together with the character names of the standard characters that have those feature matrices. By collating the feature matrix F(k) extracted by the feature matrix extraction unit 11 with the feature matrices G(k) in the dictionary memory, it recognizes the character figure in the circumscribed frame region from which F(k) was obtained and outputs the character name OUT.
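
The collation step can be sketched as a nearest-match search. The source does not state which similarity measure the identification unit 12 uses, so the city-block distance below is an assumption, as are the names `city_block_distance` and `identify` and the dictionary layout (character name mapped to a flat feature list).

```python
def city_block_distance(f, g):
    """Sum of absolute differences between two equal-length feature lists.

    Assumption: the patent text does not specify the collation measure.
    """
    return sum(abs(a - b) for a, b in zip(f, g))


def identify(feature, dictionary):
    """Return the character name whose standard features are closest.

    dictionary maps a character name to its standard feature list G(k).
    """
    return min(dictionary,
               key=lambda name: city_block_distance(feature, dictionary[name]))


# Hypothetical two-entry dictionary with four features per character.
standard = {"0": [4, 0, 4, 0], "1": [0, 8, 0, 0]}
print(identify([3, 1, 4, 0], standard))  # prints 0
```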

Next, the feature extraction method using the character recognition device configured as above, and the method of recognizing characters from the extraction results, are described in terms of processes (I) to (X) of the functional blocks.

(I) Character pattern generation processing
When the optical signal IN of a character image written on a form is input to the photoelectric conversion unit 1, the unit converts it into a binary digital signal in which character line portions become "1" (called "black bits") and background portions become "0" (called "white bits"). The digital signal for one line of the character image is stored in the line buffer 2.

The character cutting unit 3 cuts out the digital signal for one character (the character pattern) from the digital signal of the character image stored in the line buffer 2 and stores it in the pattern register 4. In this embodiment the form format is specified in advance, and the addresses indicating the character positions in the line buffer 2 are stored in the memory of the character cutting unit 3. The character cutting operation is therefore performed by reading out the contents of the line buffer 2 at those addresses.

(II) Line width measurement processing
The line width measurement unit 5 receives the digital signal from the pattern register 4, counts the number Q of positions at which all points of, for example, a 2 x 2 window are black bits and the total number A of black bits, and calculates the line width WL according to the well-known equation (1).

$$W_L = \frac{A}{A - Q} \tag{1}$$

(III) First subpattern extraction processing
The first subpattern extraction unit 6 performs the following processing.

The vertical subpattern extraction unit 6a scans the entire pattern register 4 with the vertical direction as the main scanning direction and detects runs of consecutive black bits (black runs) on each vertical scanning line. From the detected black runs it extracts those whose length ℓ satisfies equation (2).

$$\ell \ge N \cdot W_L \tag{2}$$

where ℓ is the length of the black run in the main scanning direction and N is an arbitrary constant for each subpattern (for example, 2).

The vertical subpattern extraction unit 6a regards the black runs that satisfy equation (2) as black runs constituting the subpattern and stores them in its internal vertical subpattern memory. Black runs that do not satisfy equation (2) are treated as white bits.
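
Processes (II) and (III) can be sketched together as follows. This is a minimal illustration, assuming the pattern is a list of rows of 0/1 integers; the function names `line_width` and `vertical_subpattern` are hypothetical, and the guard for an empty pattern is an addition not described in the source.

```python
def line_width(pattern):
    """Estimate the stroke width WL = A / (A - Q), equation (1)."""
    height = len(pattern)
    width = len(pattern[0]) if height else 0
    a = sum(sum(row) for row in pattern)  # A: total number of black bits
    q = 0  # Q: number of 2x2 windows that are entirely black
    for y in range(height - 1):
        for x in range(width - 1):
            if (pattern[y][x] and pattern[y][x + 1]
                    and pattern[y + 1][x] and pattern[y + 1][x + 1]):
                q += 1
    if a == q:  # only possible for an empty pattern (assumed guard)
        return 0.0
    return a / (a - q)


def vertical_subpattern(pattern, wl, n=2):
    """Keep only vertical black runs of length >= n * wl, equation (2)."""
    height = len(pattern)
    width = len(pattern[0]) if height else 0
    sub = [[0] * width for _ in range(height)]
    for x in range(width):
        y = 0
        while y < height:
            if pattern[y][x]:
                start = y
                while y < height and pattern[y][x]:
                    y += 1
                if (y - start) >= n * wl:  # long enough to be a stroke
                    for yy in range(start, y):
                        sub[yy][x] = 1
            else:
                y += 1
    return sub


# A vertical bar two pixels wide survives; its pixels form long vertical runs.
bar = [[0, 1, 1, 0] for _ in range(8)]
wl = line_width(bar)
kept = vertical_subpattern(bar, wl)
```

Scanning the same bar horizontally would find only runs of length 2, which fall below N x WL and are discarded, so each subpattern retains only strokes running in its own direction.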

The horizontal subpattern extraction unit 6b operates in the same way as the vertical subpattern extraction unit 6a. It scans the pattern register 4 with the horizontal direction as the main scanning direction, extracts the black runs on the horizontal scanning lines that satisfy equation (2), regards them as black runs constituting the subpattern, and stores them in its internal horizontal subpattern memory.

(IV) Character inclination extraction processing
For each of the first vertical subpattern (auxiliary vertical subpattern) and the first horizontal subpattern (auxiliary horizontal subpattern) obtained from the first subpattern extraction unit 6, the character inclination extraction unit 7 extracts the character line element components (called "strokes") of the subpattern. It then calculates the slope of each stroke from the coordinates of its two ends and averages the slopes, yielding the average slope θv of the vertical strokes from the first vertical subpattern and the average slope θh of the horizontal strokes from the first horizontal subpattern.

A specific method for extracting the average slopes θv and θh is described next.

First, the whole of the first vertical subpattern is scanned horizontally, and the points of change from white bit to black bit and from black bit to white bit are detected.

The coordinates of both ends of each stroke are then extracted from the relationship between the change point coordinates on the previous scanning line and those on the current scanning line.

Let the coordinates of the two ends of each extracted stroke be (VXSi, VYSi) and (VXEi, VYEi). The average slope θv of the vertical strokes is calculated with equation (3), where i = 1, ..., Pv, Pv is the number of strokes extracted from the first vertical subpattern, and VYSi < VYEi.

[Equation (3), which defines θv in terms of the stroke end coordinates and the stroke lengths VLGi, is not legible in this text.]

Here, VLGi is obtained from equation (4):

$$VLG_i = \max\left(|VXE_i - VXS_i|,\ |VYE_i - VYS_i|\right) + \tfrac{1}{2}\min\left(|VXE_i - VXS_i|,\ |VYE_i - VYS_i|\right) \tag{4}$$

Equation (4) approximates the distance between two points as the larger of their horizontal and vertical coordinate differences plus one half of the smaller.
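
The distance approximation described for equation (4) can be written as a short sketch. The function name `approx_length` is hypothetical; the formula follows the verbal description above (larger coordinate difference plus half of the smaller one).

```python
def approx_length(xs, ys, xe, ye):
    """Approximate the distance between (xs, ys) and (xe, ye).

    Uses max(|dx|, |dy|) + min(|dx|, |dy|) / 2, which avoids a square root.
    """
    dx = abs(xe - xs)
    dy = abs(ye - ys)
    return max(dx, dy) + min(dx, dy) / 2


# For a 3-4-5 triangle the approximation gives 5.5 against a true length of 5.
print(approx_length(0, 0, 3, 4))  # prints 5.5
```

The approximation is exact for purely horizontal or vertical strokes and overestimates slightly for inclined ones, which is presumably acceptable for hardware that avoids multiplication and square roots.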

The average slope θh of the horizontal strokes is extracted from the first horizontal subpattern as follows.

The first horizontal subpattern is scanned vertically, and the coordinates of both ends of each horizontal stroke are extracted. Letting these end coordinates be (HXSj, HYSj) and (HXEj, HYEj), the average slope θh of the horizontal strokes is calculated with equation (5), where j = 1, ..., Ph, Ph is the number of strokes extracted from the first horizontal subpattern, and HXSj < HXEj.

[Equation (5), which defines θh in terms of the stroke end coordinates and the stroke lengths HLGj, is not legible in this text.]

Here, HLGj is obtained from equation (6):

$$HLG_j = \max\left(|HXE_j - HXS_j|,\ |HYE_j - HYS_j|\right) + \tfrac{1}{2}\min\left(|HXE_j - HXS_j|,\ |HYE_j - HYS_j|\right) \tag{6}$$

When the number of strokes is 0, the slope is set to 0; that is, θv = 0 when Pv = 0, and θh = 0 when Ph = 0.

(V) Character frame detection processing
The character frame detection unit 8 scans the pattern in the pattern register 4 and detects its left end coordinate Xℓ, right end coordinate Xr, top end coordinate Yt, and bottom end coordinate Yb. The circumscribed frame, that is, the character frame, is the rectangle connecting the four points (Xℓ, Yt), (Xℓ, Yb), (Xr, Yt), and (Xr, Yb).
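
Character frame detection amounts to finding the bounding box of the black bits. The sketch below assumes the pattern is a list of rows of 0/1 integers with the row index increasing downward; the function name `circumscribed_frame` and the `None` result for an empty pattern are assumptions for illustration.

```python
def circumscribed_frame(pattern):
    """Return (x_left, x_right, y_top, y_bottom) of the black bits.

    Returns None when the pattern contains no black bit (assumed behavior).
    """
    xs = [x for row in pattern for x, bit in enumerate(row) if bit]
    ys = [y for y, row in enumerate(pattern) if any(row)]
    if not xs:
        return None
    return min(xs), max(xs), min(ys), max(ys)


pattern = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
]
print(circumscribed_frame(pattern))  # prints (1, 3, 1, 2)
```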

After the character frame is detected, the character frame sizes needed to normalize the feature quantities are calculated. The size WPh of the character frame in the direction parallel to the X axis of the pattern register 4 (horizontal direction) is calculated as WPh = Xr − Xℓ + 1, and the size WPv in the perpendicular direction (vertical direction) as WPv = Yt − Yb + 1. The sizes WPr and WPℓ of the character frame in the right-diagonal and left-diagonal 45° directions are also calculated; they are equal and are derived from the sum WPh + WPv (the remainder of this expression is not legible in this text). These results are supplied to the division point determination unit 9.

(VI) Division point determination processing
For each circumscribed frame, the division point determination unit 9 determines the division point coordinates on the X axis and the Y axis that divide the inside of the circumscribed character frame into NX x NY partial regions, where NX is the number of divisions in the X-axis direction and NY is the number of divisions in the Y-axis direction.

The numbers of divisions NX and NY are preferably set to suitable values according to the complexity of the characters. For example, when the recognition targets are characters with few strokes, given in the source as kanji and katakana, the region inside the circumscribed frame is divided into a small number of partial regions, about (2 x 2) to (3 x 3).

When kanji are the recognition targets, the region inside the circumscribed frame is often divided into about (4 x 4) to (8 x 8) partial regions. In this embodiment, however, the region inside the circumscribed frame is divided equally, for example into (4 x 4) regions, regardless of the complexity of the recognition target.
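
Equal division of the character frame can be sketched as below. The exact form of the patent's division equations is not legible in this text, so the integer formula used here (start + size x n / parts, rounded down) is only an assumption consistent with "equal division"; the function name `division_points` is likewise hypothetical.

```python
def division_points(start, size, parts):
    """Return the interior division coordinates for an equal split.

    Assumed formula: start + (size * n) // parts for n = 1 .. parts - 1.
    """
    return [start + (size * n) // parts for n in range(1, parts)]


# A 64-pixel-wide frame starting at x = 0, divided into NX = 4 columns.
print(division_points(0, 64, 4))  # prints [16, 32, 48]
```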

The division coordinates DX(n) on the X axis and DY(m) on the Y axis are determined by equations (7) and (8).

[Equations (7) and (8), which define DX(n) and DY(m), are not legible in this text; the surviving fragment shows a division by NX in equation (7).]

where n = 1, 2, ..., NX − 1 and m = 1, 2, ..., NY − 1. In this embodiment, for example, NX = NY = 4.

(VII) Subpattern extraction processing
FIGS. 4(a) and 4(b) show the subpattern extraction method: FIG. 4(a) is an example of a character pattern, and FIG. 4(b) is an example of a vertical subpattern. Arrow P in FIG. 4(a) indicates the scanning path used when extracting the vertical subpattern.

Based on the average slopes θv and θh obtained by the character inclination extraction unit 7, the second subpattern extraction unit 10 uses its vertical subpattern extraction unit 10a, horizontal subpattern extraction unit 10b, right-diagonal subpattern extraction unit 10c, and left-diagonal subpattern extraction unit 10d to scan the pattern register 4 in four main scanning directions. Two of these follow the extracted slopes: a direction approximately perpendicular to the X axis set on the pattern register 4 (vertical direction) and a direction approximately parallel to it (horizontal direction). The other two are the direction 45° counterclockwise from the X axis (right-diagonal 45° direction) and the direction 45° clockwise from it (left-diagonal 45° direction). The unit thereby extracts the second vertical subpattern, the second horizontal subpattern, the right-diagonal subpattern, and the left-diagonal subpattern corresponding to the respective main scanning directions.

First, the operation of the vertical subpattern extraction unit 10a is described.

The vertical subpattern extraction unit 10a scans the character pattern of FIG. 4(a) according to the average slope θv obtained by the character inclination extraction unit 7 and detects runs of consecutive black bits (black runs) on each scanning line. From the detected black runs it extracts those whose length ℓ satisfies equation (9).

$$\ell \ge N \cdot W_L \tag{9}$$

where ℓ is the length of the black run in the main scanning direction and N is an arbitrary constant for each subpattern (for example, 2).

The scanning path P is as follows. Vertical scanning starts from the top edge. The coordinates (xj, yj) of the scanning path P from the scanning start address (Xa, YT) are given by equation (10).

$$x_1 = X_a,\quad y_1 = Y_T,\qquad x_j = X_a + \theta_v \times (y_j - Y_T),\quad y_j = y_{j-1} + 1 \tag{10}$$

where θv is a real number, the result of θv × (yj − YT) is truncated to an integer, and all coordinates are integers.
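
The scanning path of equation (10) can be sketched as follows, reading the equation as x_j = Xa + θv × (y_j − YT) with y increasing by one per step. The function name `vertical_scan_path` and the bottom limit `yb` are hypothetical. The source only says the fractional part is discarded, so the use of `math.trunc` (truncation toward zero for negative slopes) is an assumption.

```python
import math


def vertical_scan_path(xa, yt, yb, theta_v):
    """Return the (x, y) points of one slanted vertical scan line.

    The path starts at (xa, yt) on the top edge and moves down one row
    at a time, shifting x by the truncated value of theta_v * (y - yt).
    """
    path = []
    for y in range(yt, yb + 1):
        offset = math.trunc(theta_v * (y - yt))  # fractional part discarded
        path.append((xa + offset, y))
    return path


# With theta_v = 0.5 the path moves one pixel sideways every two rows.
print(vertical_scan_path(10, 0, 4, 0.5))
# prints [(10, 0), (10, 1), (11, 2), (11, 3), (12, 4)]
```

With θv = 0 the path reduces to an ordinary vertical scan line, which matches the rule that the slope is set to 0 when no strokes are found.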

The vertical sub-pattern extraction unit 10a regards the black runs that satisfy equation (9) as black runs constituting the second vertical sub-pattern and stores them in a vertical sub-pattern memory (not shown). Black runs that do not satisfy equation (9) are regarded as white bits.
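As an illustration only, the run-length test of equation (9) along the slanted path of equation (10) might be sketched in Python as follows. The function names, the list-of-rows bitmap representation, and the way scan start columns are extended past the image edges so that slanted lines reach every pixel are assumptions made for this sketch, not details given in the text.

```python
def scan_path(xa, yt, yb, theta_v):
    """Points (x_i, y_i) of one slanted vertical scan line, following Eq. (10)."""
    # int() discards the fractional part, as the text specifies.
    return [(xa + int(theta_v * (y - yt)), y) for y in range(yt, yb + 1)]


def extract_vertical_subpattern(pattern, theta_v, line_width, n=2):
    """Keep only black runs of length >= n * line_width (Eq. (9)).

    pattern is a list of rows of 0/1 values (an assumed representation).
    Runs that fail the test are left as white (0) in the output.
    """
    height, width = len(pattern), len(pattern[0])
    out = [[0] * width for _ in range(height)]
    min_len = n * line_width
    # Assumption: start columns are extended so slanted lines cover the image.
    max_shift = abs(int(theta_v * (height - 1)))

    def flush(run):
        # Store the run only if it is long enough to count as a line element.
        if len(run) >= min_len:
            for x, y in run:
                out[y][x] = 1

    for xa in range(-max_shift, width + max_shift):
        run = []
        for x, y in scan_path(xa, 0, height - 1, theta_v):
            if 0 <= x < width and pattern[y][x] == 1:
                run.append((x, y))
            else:
                flush(run)
                run = []
        flush(run)
    return out
```

With θv = 0 this reduces to ordinary column-wise scanning; a non-zero θv lets a slanted stroke be followed as a single long run instead of many short ones.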

The horizontal sub-pattern extraction unit 10b likewise extracts the second horizontal sub-pattern, using the average slope θh obtained by the character slope extraction unit 7 and operating in the same way as the vertical sub-pattern extraction unit 10a. Similarly, the right diagonal and left diagonal sub-pattern extraction units 10c and 10d scan the original pattern with the right diagonal and left diagonal directions as their main scanning directions, extract from the black runs on the scanning lines in each main scanning direction those that satisfy equation (9), and store the extracted black runs, as black runs constituting the respective sub-patterns, in right diagonal and left diagonal sub-pattern memories (not shown).

In this embodiment, the slope of the characters is not taken into account when the right diagonal and left diagonal sub-patterns are extracted.

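(VIII) Feature matrix extraction processing

When the character frame detection unit 8 has detected the coordinates Xℓ, Xr, Yt, and Yb that define the circumscribed frame of the character pattern, and the division point determination unit 9 has detected the target division point coordinates for the character pattern, the feature matrix extraction unit 11 extracts feature quantities from each of the vertical, horizontal, right diagonal, and left diagonal sub-patterns and creates a feature matrix.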

That is, the feature matrix extraction unit 11 divides the region inside one circumscribed frame into NX × NY partial regions using the target division point coordinates and the coordinates Xℓ, Xr, Yt, and Yb, and extracts, for each partial region, a feature quantity representing the amount of character line of the sub-pattern in that region. It then extracts the feature matrix consisting of the NX × NY × 4 feature quantities obtained from the sub-patterns within one circumscribed frame as the feature matrix of that circumscribed-frame region.

First, feature extraction from the horizontal sub-pattern (HSP) is explained.

Based on the target division point coordinates and the coordinates Xℓ, Xr, Yt, and Yb, the feature matrix extraction unit 11 divides the region inside the circumscribed frame into NX × NY partial regions (the target division point coordinates and the coordinates Xℓ, Xr, Yt, and Yb together serve as the division point coordinates) and counts, for each partial region, the number of black bits BH(i, j) of the horizontal sub-pattern HSP within that region.

The divided regions are determined from the division point coordinates as follows. First, the division point coordinate DX(n) on the X axis is placed on the straight line that passes through the midpoint between the upper and lower edges of the circumscribed frame of the character pattern and is parallel to the X axis. Starting from this coordinate, the left side of the coordinate series obtained by the following equations (11) to (13) is taken as the division boundary S, as shown in FIG. 4(b).

$$x_0 = D_X(n), \qquad y_0 = (Y_T + Y_B)/2 \qquad (11)$$

From (x_0, y_0), the coordinate series in the downward direction is

$$x_i = D_X(n) + \mathrm{INT}(\theta_v \times (y_i - y_0) + 0.5), \qquad y_i = y_{i-1} + 1 \qquad (12)$$

where i = 1, 2, 3, ..., ((YT + YB)/2 − YT). From (x_0, y_0), the coordinate series in the upward direction is

$$x_j = D_X(n) - \mathrm{INT}(\theta_v \times (y_0 - y_j) + 0.5), \qquad y_j = y_{j+1} - 1 \qquad (13)$$

where j = −1, −2, −3, ..., −(YB − (YT + YB)/2). Here INT( ) denotes that the operation inside the parentheses is carried out with real values and the fractional part of the result is then discarded to give an integer value.

The division boundaries S in the horizontal direction are set in the same way, using DY(m) and θh.
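The boundary construction of equations (11) to (13) can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions: the function name `division_boundary` is hypothetical, y is taken to increase downward from the top edge YT to the bottom edge YB, the midpoint y0 is rounded down to an integer row, and the signs follow the reading in which the boundary shifts by θv per row in the same sense as the scanning path of equation (10).

```python
def division_boundary(dx_n, y_top, y_bottom, theta_v):
    """Return {y: x} giving the slanted division boundary S for one DX(n).

    Assumes y grows downward (y_top < y_bottom) and integer coordinates.
    """
    y0 = (y_top + y_bottom) // 2          # Eq. (11), midpoint row (assumed floor)
    boundary = {y0: dx_n}
    # Downward series, Eq. (12): int() plays the role of INT (truncation).
    for y in range(y0 + 1, y_bottom + 1):
        boundary[y] = dx_n + int(theta_v * (y - y0) + 0.5)
    # Upward series, Eq. (13).
    for y in range(y0 - 1, y_top - 1, -1):
        boundary[y] = dx_n - int(theta_v * (y0 - y) + 0.5)
    return boundary
```

For example, `division_boundary(10, 0, 8, 0.5)` starts at x = 10 on the middle row and drifts to x = 12 at the bottom row and x = 8 at the top row, so the boundary leans with the character instead of cutting straight through slanted strokes.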

For each of the partial regions obtained by this division, the number of black bits BH(i, j) of the horizontal sub-pattern HSP is counted. BH(i, j) is the number of black bits in the partial region in the i-th row and j-th column of one circumscribed-frame region. Next, the feature quantity FH(i, j) for the partial region in the i-th row and j-th column is calculated according to equation (14).

$$F_H(i, j) = \frac{B_H(i, j)}{W_L \cdot WP_h} \qquad (14)$$

where i = 1, 2, ..., NX; j = 1, 2, ..., NY; WL is the line width; and WPh is the character width (= Xr − Xℓ + 1). Further, in the same way as for HSP, the numbers of black bits BV(i, j), BR(i, j), and BL(i, j) of VSP, RSP, and LSP in the partial region in the i-th row and j-th column are counted, and the feature quantities FV(i, j), FR(i, j), and FL(i, j) of VSP, RSP, and LSP for that partial region are calculated according to the following equations (15) to (17).

$$F_V(i, j) = \frac{B_V(i, j)}{W_L \cdot WP_v} \qquad (15)$$

$$F_R(i, j) = \frac{B_R(i, j)}{W_L \cdot WP_r} \qquad (16)$$

$$F_L(i, j) = \frac{B_L(i, j)}{W_L \cdot WP_\ell} \qquad (17)$$

where WPv is the character height (= Yb − Yt + 1) and WPr = WPℓ = (WPv + WPh)/2. In this way, the feature quantities of VSP, HSP, RSP, and LSP are extracted for every partial region of the circumscribed-frame region, and a feature matrix F(k) (k = 1, 2, ..., NX × NY × 4) consisting of these NX × NY × 4 feature quantities is obtained. The feature matrix extraction unit 11 extracts the feature matrix F(k) for each circumscribed-frame region and sends the result to the identification unit 12.
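The normalization described here, black-bit count divided by line width times character size, might look as follows in Python. This is a simplified sketch, not the embodiment itself: the partial regions are taken as plain rectangles (the θ = 0 case, without slanted boundaries), the cut lists are half-open cell edges, and the ordering of F(k) as VSP, HSP, RSP, LSP is an assumption, since the text does not state the order. The function names are hypothetical.

```python
def count_black(sub, x_cuts, y_cuts):
    """B(i, j): black-bit count of sub-pattern `sub` in each rectangular cell.

    x_cuts and y_cuts are half-open cell edges, e.g. [0, 4, 8] for two cells.
    """
    counts = []
    for i in range(len(y_cuts) - 1):
        row = []
        for j in range(len(x_cuts) - 1):
            row.append(sum(sub[y][x]
                           for y in range(y_cuts[i], y_cuts[i + 1])
                           for x in range(x_cuts[j], x_cuts[j + 1])))
        counts.append(row)
    return counts


def feature_matrix(vsp, hsp, rsp, lsp, x_cuts, y_cuts, line_width):
    """Flattened F(k) with NX * NY * 4 entries, each B / (WL * WP)."""
    wph = x_cuts[-1] - x_cuts[0]      # character width (Xr - Xl + 1)
    wpv = y_cuts[-1] - y_cuts[0]      # character height (Yb - Yt + 1)
    wpd = (wpv + wph) / 2             # size used for both diagonal sub-patterns
    features = []
    for sub, wp in ((vsp, wpv), (hsp, wph), (rsp, wpd), (lsp, wpd)):
        for row in count_black(sub, x_cuts, y_cuts):
            features.extend(b / (line_width * wp) for b in row)
    return features
```

Dividing by the line width converts a pixel count into an approximate stroke length, and dividing by the character size makes that length independent of how large the character was written.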

(IX) Identification processing

The identification unit 12 collates the extracted feature matrix F(k) with the feature matrices G(k) held in an internal dictionary memory, and thereby recognizes the character figure in the circumscribed-frame region from which F(k) was obtained. In this recognition, the distance D between the feature matrices F(k) and G(k) is calculated according to equation (18), and the character name OUT of the standard character whose feature matrix G(k) gives the minimum distance D (for example, a character code defined in the JIS standard) is output as the recognition result.
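The minimum-distance collation can be illustrated with a short Python sketch. Equation (18) is not reproduced in this text, so the squared Euclidean distance used below is purely an assumption standing in for it; the function name `recognize` and the dictionary layout (character code mapped to a reference feature list G) are likewise hypothetical.

```python
def recognize(f, dictionary):
    """Return the character code whose reference features are closest to f.

    dictionary maps a character code to its reference feature list G(k).
    The distance is an assumed squared Euclidean distance, not Eq. (18).
    """
    def distance(g):
        return sum((a - b) ** 2 for a, b in zip(f, g))

    # min() over the keys, ranked by distance to the stored reference.
    return min(dictionary, key=lambda code: distance(dictionary[code]))
```

Because the features are extracted along the measured slope, a single reference entry per character can serve both upright and slanted input in this scheme, which is what keeps the dictionary small.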

As described above, this embodiment has the following advantages.

In this embodiment, the first sub-pattern extraction unit 6 scans the character pattern in the pattern register 4 in both the vertical and horizontal directions, and extracts sub-patterns representing the line element components in each scanning direction from the relationship between the runs of consecutive black bits on the scanning lines and the line width of the character pattern. The character slope extraction unit 7 then obtains the slopes θv and θh of the sub-patterns from the slopes of the line elements they contain.

The second sub-pattern extraction unit 10 then scans the character pattern in each direction based on the slopes θv and θh to extract sub-patterns, after which the features are extracted. As a result, the extracted features are stable even for slanted character patterns. There is therefore no need to prepare, in the identification unit 12, dictionaries covering variations in character slope; the reduced dictionary size shortens the collation time and thereby raises the processing speed, and character recognition with good accuracy becomes possible on a small hardware scale.

The present invention is not limited to the above embodiment, and various modifications are possible. Examples of such modifications include the following.

(i) The embodiment above describes extracting stroke slopes in both the horizontal and vertical directions. For handwritten characters, however, most characters generally slope upward to the right, so extracting only the slope of the horizontal strokes is sufficient.

When italic printed characters are the target, it suffices to extract the slope of the vertical strokes. In this way, the slope extraction direction can be chosen according to the material to be read, which simplifies the configuration.

(ii) The feature extraction method of the above embodiment can also be applied to methods that extract features by scanning the character pattern, such as the one described in Document 2 cited above. In the method of Document 2, for example, the feature quantity is the number of intersections between scanning lines and strokes, so when the strokes are slanted and the scanning direction is fixed, the extracted number of intersections varies greatly. If the slope is extracted by the method of the above embodiment and the features are extracted in accordance with that slope, substantially the same effect as in the above embodiment is obtained.
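To make variation (ii) concrete, the following Python sketch counts stroke crossings along a scan line that follows the extracted slope. It is only an illustration: the procedure of Document 2 is not given in this text, so counting a crossing at each white-to-black transition, the function name `crossing_count`, and the treatment of out-of-image points as white are all assumptions.

```python
def crossing_count(pattern, xa, theta_v):
    """Count white-to-black transitions along one slanted vertical scan line.

    pattern is a list of rows of 0/1 values; the line starts at column xa on
    the top row and shifts by theta_v per row, with truncation as in Eq. (10).
    """
    height, width = len(pattern), len(pattern[0])
    crossings = 0
    previous = 0
    for y in range(height):
        x = xa + int(theta_v * y)
        # Points outside the image are treated as white (an assumption).
        current = pattern[y][x] if 0 <= x < width else 0
        if current == 1 and previous == 0:
            crossings += 1
        previous = current
    return crossings
```

Setting `theta_v` to 0 gives the fixed-direction scan whose counts fluctuate on slanted strokes; passing the extracted slope instead keeps the scan line aligned with the strokes.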

(iii) Instead of being built from individual circuits, the functional blocks of FIG. 1 may be implemented under program control using a computer or the like.

(Effects of the Invention)

As described in detail above, according to the first invention, the character pattern is scanned in both or either of the vertical and horizontal directions, and an auxiliary sub-pattern representing the line element component in the scanning direction is extracted from the relationship between the runs of consecutive black bits on the scanning lines and the line width of the character pattern. The slope of the sub-pattern is then obtained from the slopes of the line elements contained in it. Next, the character pattern is scanned in each direction in accordance with that slope, and features such as the number of intersections between scanning lines and strokes are extracted from the character pattern. Consequently, stable features can be extracted even from characters whose horizontal line elements slope upward to the right, as in handwriting, or from italic characters whose vertical line elements lean to the right, as in print. There is therefore no need to prepare dictionaries covering variations in character slope, and character recognition that is fast, accurate, and small in hardware scale becomes possible.

In the second invention, sub-patterns are extracted in accordance with the extracted slope and the features of those sub-patterns are then extracted, so substantially the same effects as in the first invention are obtained.

[Brief Description of the Drawings]

FIG. 1 is a functional block diagram of a character recognition device using the feature extraction method according to an embodiment of the present invention, FIG. 2 is a diagram showing an example of italic characters, FIGS. 3(a) to 3(d) are diagrams for explaining a conventional feature extraction method, and FIGS. 4(a) and 4(b) are diagrams for explaining the sub-pattern extraction method.

1: photoelectric conversion unit, 2: line buffer, 3: character segmentation unit, 4: pattern register, 5: line width measurement unit, 6: first sub-pattern extraction unit, 7: character slope extraction unit, 8: character frame detection unit, 9: division point determination unit, 10: second sub-pattern extraction unit, 11: feature matrix extraction unit, 12: identification unit.

Claims

1. A feature extraction method in which a character pattern obtained by photoelectrically converting characters on a medium is scanned in a plurality of directions to extract features from the character pattern, characterized in that the character pattern is scanned in both or either of the vertical and horizontal directions; an auxiliary sub-pattern representing a line element component in the scanning direction is extracted based on the relationship between the number of consecutive black pixels in the scanning direction and the line width of the character pattern; the slope of the auxiliary sub-pattern is extracted from the slope of the line elements included in the auxiliary sub-pattern; and the character pattern is then scanned in a plurality of directions based on the slope of the auxiliary sub-pattern to extract features from the character pattern.

2. A feature extraction method in which a character pattern obtained by photoelectrically converting characters on a medium is scanned in a plurality of directions, a sub-pattern representing a line element component in the scanning direction is extracted based on the relationship between the number of consecutive black pixels in the scanning direction and the line width, the sub-pattern is divided into a plurality of regions, and a feature representing the amount of line elements of the sub-pattern is extracted for each divided region, characterized in that the character pattern is scanned in both or either of the vertical and horizontal directions; an auxiliary sub-pattern representing a line element component in the scanning direction is extracted based on the relationship between the number of consecutive black pixels in the scanning direction and the line width of the character pattern; the slope of the auxiliary sub-pattern is extracted from the slope of the line elements included in the auxiliary sub-pattern; and the sub-pattern is then extracted based on the slope of the auxiliary sub-pattern.