JP2590698B2 - Character string data retrieval device

Info

Publication number: JP2590698B2
Application number: JP5224858A
Authority: JP
Inventors: 法雄齋藤
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1993-09-10
Filing date: 1993-09-10
Publication date: 1997-03-12
Anticipated expiration: 2012-03-12
Also published as: JPH0785065A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of industrial application] The present invention relates to a character string data search method, and more particularly to a character string search method that retrieves a character string at high speed from a data structure to be searched.

[0002]

[Prior art] In the conventional character string data search method, the character string that serves as the data classification key is fixed to a single choice when character string data is registered in the data structure to be searched. Even when data is added later, the additional search target data is registered with that same data classification key left unchanged.

[0003] A reference for such a conventional character string data search method is Japanese Patent Laid-Open No. Sho 62-241026, "Character String Search Method".

[0004]

[Problems to be solved by the invention] Because the conventional character string data search method described above fixes the data classification key to a single choice when the data structure to be searched is created, it has a drawback: when search target data is registered in the data structure according to the data classification key, the number of items registered under each key value may become heavily skewed.

[0005] FIG. 6 shows an example of a hash table representing a data structure in which search target data has been registered by the conventional character string data search method. For example, suppose the data structure to be searched uses the open hash method with five hash elements, and the data classification key is fixed to the first character of the character string. When {A0, B0, C0, D0, D1, D2, D3, D4, D5, D6} is input as the search target data, the numbers of items registered in the resulting data structure are heavily skewed, as shown in FIG. 6. With such a data structure, some positions hold a large amount of data, so character strings cannot be searched at high speed even though a hash method is used.
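The skew described above can be reproduced with a minimal sketch. It assumes ASCII character codes and takes the key to be the first character's code modulo 5; the specification does not name an encoding, so the bucket positions may differ from those drawn in FIG. 6. The names `BUCKETS` and `first_char_key` are illustrative only.

```python
from collections import Counter

BUCKETS = 5  # number of hash elements used in the example


def first_char_key(s: str) -> int:
    """Key fixed to the first character: its code modulo 5 (ASCII assumed)."""
    return ord(s[0]) % BUCKETS


data = ["A0", "B0", "C0", "D0", "D1", "D2", "D3", "D4", "D5", "D6"]

# Count how many strings land in each hash element.
counts = Counter(first_char_key(s) for s in data)
sizes = [counts.get(b, 0) for b in range(BUCKETS)]
print(sizes)  # [1, 1, 1, 7, 0]
```

Under these assumptions, seven of the ten strings share one hash element, so a lookup for any "D" string degenerates into a scan of a long chain.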

[0006] An object of the present invention is to provide a character string data search method that reduces the skew in the data structure to be searched so that a character string can be found in a short time.

[0007]

[Means for solving the problems] The character string data search device of the present invention comprises: an initial setting unit that performs initial registration by defining the rule for determining data classification keys and the number of search target data items to be registered between reviews of the data classification key; a search target data input unit that receives the character string data to be searched and sends out the number of search target data items defined by the initial setting unit; a data registration unit that registers the search target data received from the search target data input unit in a data structure by classifying it with each of the data classification keys; a data analysis unit that, for each data classification key, calculates the difference between the largest and smallest numbers of search target data items held in the data structure registered by the data registration unit, and stores that value as a data structure analysis condition; a data classification key changing unit that reviews the data classification key used by the data registration unit each time the number of search target data items defined by the initial setting unit has been registered, changes it to the data classification key with the smaller data structure analysis condition, and returns control to the data registration unit so that the search target data input through the search target data input unit is re-registered with the changed data classification key; a search character string input unit that receives the search character string to be found; a character string search unit that searches the data structure registered by the data registration unit for the search character string received by the search character string input unit, using the classification given by the data classification key with the smaller data structure analysis condition stored by the data analysis unit; and an output unit that outputs the result of the search performed by the character string search unit.

[0008] The character string data search method of the second invention includes:
(A) an initial setting unit that performs initial setup by defining the rule for determining data classification keys and the number of search target data items to be registered between reviews of the data classification key;
(B) a search target data input unit that receives the character string data to be searched and sends out the number of search target data items defined by the initial setting unit;
(C) a data registration unit that registers the search target data received from the search target data input unit in a data structure by classifying it with each of the data classification keys;
(D) a data analysis unit that, for each data classification key, calculates the difference between the largest and smallest numbers of search target data items held in the data structure registered by the data registration unit, and stores that value as a data structure analysis condition;
(E) a data classification key changing unit that reviews the data classification key used by the data registration unit each time the number of search target data items defined by the initial setting unit has been registered, changes it to the data classification key with the smaller data structure analysis condition, and returns control to the data registration unit so that the search target data input through the search target data input unit is re-registered with the changed data classification key;
(F) a search character string input unit that receives the search character string to be found;
(G) a character string search unit that searches the data structure registered by the data registration unit for the search character string received by the search character string input unit, using the classification given by the data classification key with the smaller data structure analysis condition stored by the data analysis unit; and
(H) an output unit that outputs the result of the search performed by the character string search unit.

[0009]

[Embodiments] Next, an embodiment of the present invention will be described with reference to the drawings. FIG. 1 is a block diagram showing one embodiment of the character string data search method of the present invention. As shown in FIG. 1, the character string data search method of this embodiment comprises a search target data creation unit 109 and a search result creation unit 110.

[0010] In the search target data creation unit 109, the initial setting unit 101 defines the rule for determining data classification keys and the number of search target data items to be registered between changes of the data classification key. The search target data input unit 102 receives the search target data and sends it to the data registration unit 103 in batches of the number defined by the initial setting unit 101.

[0011] The data registration unit 103 takes part of the character string of each search target data item sent from the search target data input unit 102 as the data classification key, registers the search target data in the data structure according to that key, and then activates the data analysis unit 104.

[0012] The data analysis unit 104 calculates, as the data structure analysis condition, the difference between the largest and smallest numbers of search target data items held in the elements of the data structure, stores the result, and then activates the data classification key changing unit 105.

[0013] The data classification key changing unit 105 reviews the data classification key used by the data registration unit 103 each time the number of search target data items defined by the initial setting unit 101 has been registered, changes it to the data classification key with the smaller data structure analysis condition, and returns control to the data registration unit 103 via the data analysis unit 104, so that the search target data input through the search target data input unit 102 is re-registered in the data structure with the changed key.

[0014] Once registration in the data structure and analysis have been completed for every data classification key, the data registration unit 103 rebuilds the data structure to be searched using the data classification key with the smallest data structure analysis condition, and proceeds to registration of the next search target data.

[0015] In this way, the search target data input unit 102 repeatedly sends the number of search target data items defined by the initial setting unit 101 to the data registration unit 103, and this processing is repeated until no search target data remains to be registered.

[0016] In the search result creation unit 110, the search character string input unit 106 receives the search character string data and activates the character string search unit 107. The character string search unit 107 searches the search target data created by the search target data creation unit 109 for the search character string received by the search character string input unit 106, and activates the output unit 108. This search uses the data classification key that the data registration unit 103 finally used when registering the search target data. The output unit 108 outputs the result of the search performed by the character string search unit 107.

[0017] FIG. 2 is a flowchart showing an example of the operation of the data registration unit 103, the data analysis unit 104, and the data classification key changing unit 105 of this embodiment. FIGS. 3 to 5 each show an example of a hash table representing a data structure in which search target data has been registered in this embodiment.

[0018] Next, the operation of the embodiment shown in FIG. 1 will be described with reference to FIG. 2. The description takes as an example the case where the data structure to be searched uses the open hash method with five hash elements, and the data {A0, B0, C0, D0, D1, D2, D3, D4, D5, D6} is registered as the search target data.

[0019] First, the initial setting unit 101 specifies that one character is taken from the head of the character string data and the remainder of its character code divided by 5 is used as the data classification key; that the data classification key is reviewed each time five character string data items are added; and that the key is reviewed once more when the data to be registered runs out. The search target data input unit 102 reads the registration data {A0, B0, C0, D0, D1, D2, D3, D4, D5, D6} and first sends {A0, B0, C0, D0, D1} to the data registration unit 103.

[0020] As shown in FIG. 2, in step 201 the data registration unit 103 sets the data classification key to the remainder of the character code of the first character of the string divided by 5. In step 202 it uses this key to classify the registration data into the hash elements, registers the data as shown in FIG. 3, and activates the data analysis unit 104.

[0021] In step 203, the data analysis unit 104 stores, as the data structure analysis condition, the value obtained by subtracting the smallest hash element count from the largest, and activates the data classification key changing unit 105. In this case the data structure analysis condition is 2: the largest count, 2, minus the smallest, 0.
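Steps 201 to 203 for the first batch can be sketched as follows. This is an illustration only: it assumes ASCII character codes, models each hash element as a Python list (a chain), and the helper names `build_table` and `analysis_condition` do not appear in the specification.

```python
def build_table(items, pos, buckets=5):
    """Steps 201-202: classify each string by the code of the character
    at index `pos`, modulo the number of hash elements."""
    table = [[] for _ in range(buckets)]
    for s in items:
        table[ord(s[pos]) % buckets].append(s)
    return table


def analysis_condition(table):
    """Step 203: largest element count minus smallest element count."""
    sizes = [len(chain) for chain in table]
    return max(sizes) - min(sizes)


first_batch = ["A0", "B0", "C0", "D0", "D1"]
table = build_table(first_batch, pos=0)  # key = first character
print(table)                      # [['A0'], ['B0'], ['C0'], ['D0', 'D1'], []]
print(analysis_condition(table))  # 2
```

With ASCII codes the result agrees with the value 2 stated in the text: the fullest element holds two strings and one element is empty.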

[0022] In step 204, the data classification key changing unit 105 finds that the second character has not yet been examined and proceeds to step 205. In step 205 it changes the data classification key to the remainder of the character code of the second character of the string divided by 5, and returns to step 202.

[0023] In step 202, the data registration unit 103 classifies the currently registered search target data with the data classification key obtained in step 205, re-registers it as shown in FIG. 4, and activates the data analysis unit 104. In step 203, the data analysis unit 104 computes the data structure analysis condition, which is 4 (the largest count, 4, minus the smallest, 0), stores this result, and activates the data classification key changing unit 105.

[0024] In step 204, since in this example the only data classification key candidates are the remainder of the first character divided by 5 and the remainder of the second character divided by 5, the data classification key changing unit 105 finds that every data classification key has been examined and proceeds to step 206.

[0025] In step 206, the data analysis unit 104 compares the data structure analysis conditions obtained in step 203. In this example the condition is smaller when the remainder of the first character divided by 5 is used, so that remainder is adopted as the data classification key and the data registration unit 103 is activated. In step 207, the data registration unit 103 builds the data structure, completing registration of the first five items {A0, B0, C0, D0, D1}.
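The comparison made in steps 204 to 206 can be sketched as a loop over the key candidates. The sketch assumes ASCII character codes and identifies each candidate by a zero-based character index (0 for the first character, 1 for the second); `spread` and `choose_key` are hypothetical helper names.

```python
def spread(items, pos, buckets=5):
    """Data structure analysis condition for the key taken at index `pos`:
    largest element count minus smallest element count."""
    sizes = [0] * buckets
    for s in items:
        sizes[ord(s[pos]) % buckets] += 1
    return max(sizes) - min(sizes)


def choose_key(items, candidates=(0, 1)):
    """Evaluate every candidate key and keep the one with the smallest
    analysis condition (steps 204 to 206)."""
    conditions = {pos: spread(items, pos) for pos in candidates}
    best = min(conditions, key=conditions.get)
    return best, conditions


best, conditions = choose_key(["A0", "B0", "C0", "D0", "D1"])
print(conditions)  # {0: 2, 1: 4}
print(best)        # 0, i.e. the first character
```

For the first five strings the first character gives a condition of 2 and the second gives 4, so the first character is kept, as in the text.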

[0026] Next, the search target data input unit 102 sends {D2, D3, D4, D5, D6} to the data registration unit 103, which registers them additionally in the same way as {A0, B0, C0, D0, D1}; this completes registration of the search target data. After this additional registration, using the second character as the data classification key gives less skew, so the data classification key is changed to the remainder of the character code of the second character divided by 5, and the data is classified as shown in FIG. 5.
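The whole registration procedure, with a key review after every batch of five, might look like the following sketch. It assumes ASCII character codes, zero-based character indices as key candidates, and re-evaluation of every candidate on all data registered so far; the function names and the `history` list are illustrative and not part of the specification.

```python
def spread(items, pos, buckets=5):
    """Largest minus smallest element count for the key at index `pos`."""
    sizes = [0] * buckets
    for s in items:
        sizes[ord(s[pos]) % buckets] += 1
    return max(sizes) - min(sizes)


def register_all(data, batch_size=5, candidates=(0, 1), buckets=5):
    """Register data batch by batch, reviewing the key after each batch."""
    registered, history = [], []
    key_pos = candidates[0]
    for start in range(0, len(data), batch_size):
        registered.extend(data[start:start + batch_size])
        # Review: keep the candidate with the smallest analysis condition.
        key_pos = min(candidates, key=lambda p: spread(registered, p, buckets))
        history.append(key_pos)
    # Rebuild the table once with the key that was finally chosen.
    table = [[] for _ in range(buckets)]
    for s in registered:
        table[ord(s[key_pos]) % buckets].append(s)
    return key_pos, table, history


data = ["A0", "B0", "C0", "D0", "D1", "D2", "D3", "D4", "D5", "D6"]
key_pos, table, history = register_all(data)
print(history)  # [0, 1]
print(table)    # [['D2'], ['D3'], ['D4'], ['A0', 'B0', 'C0', 'D0', 'D5'], ['D1', 'D6']]
```

Under these assumptions the key switches from the first character to the second after the second batch, because the conditions become 7 and 4 respectively once all ten strings are registered.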

[0027] To search the registered search target data for a character string, the search character string input unit 106 receives the search character string data and sends it to the character string search unit 107. The character string search unit 107 searches for the search character string using the search target data created by the search target data creation unit 109 and the search character string received from the search character string input unit 106, and the output unit 108 outputs the result.

[0028] For example, to search for the character string D6, the remainder of the character code of its second character divided by 5 is 4, so position 4 of the hash table in FIG. 5 is examined, and D6 is found as the second entry there. By reviewing the data classification key used for registration each time five search target data items are registered in this way, character strings can be searched at high speed.
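The lookup of D6 can be sketched as follows. The table literal is the second-character classification of the ten example strings under an ASCII assumption, standing in for FIG. 5; the function name `search` and its return convention (element index and 1-based position in the chain) are illustrative.

```python
def search(table, key_pos, query):
    """Hash the query with the final key, then scan only that element's chain.
    Returns (element index, 1-based position) or None if not found."""
    bucket = ord(query[key_pos]) % len(table)
    for position, item in enumerate(table[bucket], start=1):
        if item == query:
            return bucket, position
    return None


# Ten example strings classified by the second character (ASCII assumed).
table = [["D2"], ["D3"], ["D4"], ["A0", "B0", "C0", "D0", "D5"], ["D1", "D6"]]

print(search(table, key_pos=1, query="D6"))  # (4, 2)
print(search(table, key_pos=1, query="E0"))  # None
```

The character "6" has code 54 in ASCII and 54 mod 5 is 4, so only the two strings in element 4 are compared before D6 is found.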

[0029] The following expression gives the processing time for registering search target data by the open hash method.

[0030]

(n*s + c)*k*y + (x*s + c)*k + b

In this expression, k is the number of data classification key candidates, d is the number of search target data items, s is the time to register one search target data item, c is the time to obtain the data structure analysis condition for the data structure to be searched, and b is the time to select the best of those conditions. The data classification key is reviewed each time n search target data items are added; y is the quotient of d divided by n, and x is the remainder of d divided by n.
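As a worked illustration, the expression can be evaluated directly. The numeric values below are hypothetical unit costs chosen only to show the arithmetic, and the total item count is passed as `d`; the function name `registration_time` is not from the specification.

```python
def registration_time(d, n, k, s, c, b):
    """Evaluate (n*s + c)*k*y + (x*s + c)*k + b, where y and x are the
    quotient and remainder of d divided by n."""
    y, x = divmod(d, n)
    return (n * s + c) * k * y + (x * s + c) * k + b


# Example from the text: 10 items, review every 5 items, 2 key candidates.
# s = c = b = 1 are made-up unit costs, not measured values.
print(registration_time(d=10, n=5, k=2, s=1, c=1, b=1))  # 27
```

Here y = 2 and x = 0, so the three terms contribute 24, 2, and 1.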

[0031]

[Effects of the invention] As described above, the character string data search method of the present invention reviews the data classification key and re-registers the search target data each time a fixed number of character string data items has been registered. Even when it is not known in advance what kind of character strings the search target data contains, the data can therefore be registered with the most suitable data classification key, which reduces skew in the data structure to be searched and allows character strings to be found in a short time.

[Brief description of the drawings]

FIG. 1 is a block diagram showing an embodiment of the character string data search method of the present invention.

FIG. 2 is a flowchart showing an example of the operation of the data registration unit 103, the data analysis unit 104, and the data classification key changing unit 105 of the embodiment.

FIG. 3 shows an example of a hash table representing a data structure in which five search target data items are registered using the first character as the data classification key in the embodiment.

FIG. 4 shows an example of a hash table representing a data structure in which five search target data items are registered using the second character as the data classification key in the embodiment.

FIG. 5 shows an example of a hash table representing a data structure in which a total of ten search target data items are registered using the second character as the data classification key in the embodiment.

FIG. 6 shows an example of a hash table representing a data structure in which ten search target data items are registered using the first character as the data classification key in the conventional character string data search method.

[Explanation of symbols]

101: initial setting unit
102: search target data input unit
103: data registration unit
104: data analysis unit
105: data classification key changing unit
106: search character string input unit
107: character string search unit
108: output unit
109: search target data creation unit
110: search result creation unit

Claims

(57) [Claims]

1. A character string data search device comprising: an initial setting unit that performs initial registration by defining a rule for determining data classification keys and the number of search target data items to be registered between reviews of the data classification key; a search target data input unit that receives character string data to be searched and sends out the number of search target data items defined by the initial setting unit; a data registration unit that registers the search target data received from the search target data input unit in a data structure by classifying it with each of the data classification keys; a data analysis unit that, for each data classification key, calculates the difference between the largest and smallest numbers of search target data items held in the data structure registered by the data registration unit, and stores that value as a data structure analysis condition; a data classification key changing unit that reviews the data classification key used by the data registration unit each time the number of search target data items defined by the initial setting unit has been registered, changes it to the data classification key with the smaller data structure analysis condition, and returns control to the data registration unit so that the search target data input through the search target data input unit is re-registered with the changed data classification key; a search character string input unit that receives a search character string to be found; a character string search unit that searches the data structure registered by the data registration unit for the search character string received by the search character string input unit, using the classification given by the data classification key with the smaller data structure analysis condition stored by the data analysis unit; and an output unit that outputs the result of the search performed by the character string search unit.