JP2020190861A

JP2020190861A - Image search apparatus and image search method

Info

Publication number: JP2020190861A
Application number: JP2019095022A
Authority: JP
Inventors: Masahito Tamura (田村 雅人); Atsushi Hiroike (廣池 敦); Tsutomu Imada (今田 努); Kan Nakamae (中前 冠)
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2019-05-21
Filing date: 2019-05-21
Publication date: 2020-11-26

Abstract

To provide an image search apparatus and an image search method for displaying, at the top of search results, images which can be easily compared while realizing highly accurate similar image search.

SOLUTION: An image search apparatus 1000 comprises: an input unit 100 that accepts input of images including a search target and annotation data of the images; a feature extraction unit 102 that extracts a feature quantity of the image; a comparison easiness calculation unit 104 that calculates comparison easiness between the images; a learning unit 103 that learns a method for extracting the feature quantity of the image and a method for calculating the comparison easiness using the images and the annotation data; a search unit 101 that determines the order of search results to be displayed on a display screen based on the feature quantity of the image learned by the learning unit 103 and the comparison easiness between the images; and a display unit 105 that displays the search results on the display screen in the order determined by the search unit 101.

SELECTED DRAWING: Figure 1

Description

The present invention relates to an image search device and an image search method.

Patent Document 1 describes background art in this technical field. In that publication, a rotation angle is calculated in the course of computing the local feature quantities required for the search. When images are displayed as search results, the need for rotation normalization is judged from the estimated rotation angle; if the angle is at or above a predetermined value, the sample image is rotation-normalized relative to the query image before being displayed. The publication discloses that this reduces the burden of mental rotation when checking search results while obtaining the results with a reduced amount of computation.

Patent Document 1: Japanese Unexamined Patent Publication No. 2016-47031

One application of similar image search is surveillance systems. For example, when an observer spots a suspicious person, similar image search can help the observer track that person. Likewise, when an observer finds abandoned luggage, similar image search can be used to identify its owner. In such applications it is desirable for the same individual, such as the same person or the same piece of luggage, to appear at the top of the search results, so a highly accurate search is required.

Conventionally, as a method for highly accurate similar image search, it has been proposed to extract feature quantities from images using a convolutional neural network (CNN) and to compute the similarity between those feature quantities. In this method, the CNN is trained to extract similar feature quantities from varied images of the same individual that differ in lighting conditions and in the position and size of the search target within the image, which makes a highly accurate search possible. On the other hand, because the CNN is trained to extract similar feature quantities from any such image, images that are easy to compare with the query image do not necessarily appear at the top of the search results. To judge at a glance whether the query image and a comparison image in the search results show the same individual or different individuals, it is preferable to display easy-to-compare images at the top. In addition, when, for example, individuals are tracked within a camera and several images are already known to show the same individual, it is preferable to select from among them an image that is easy to compare with the query image and display it in the search results.

Patent Document 1 describes a method that calculates the rotation angle of a comparison image relative to the query image and rotates the comparison image when displaying search results, thereby making comparison easier. The described method uses local feature quantities to calculate both the similarity and the rotation angle, and states that aligning the orientation of the same individual across images makes comparison easier. However, this method cannot be applied to similar image search that relies on prior training, such as search using a CNN, so it cannot improve the ease of comparison while also achieving a highly accurate search. Moreover, when the object looks different between images, for example because it was photographed from a different direction or is partly occluded, the angle is not easy to calculate and improving the ease of comparison is difficult.

An object of one aspect of the present invention is to provide an image search device that can display easy-to-compare images at the top of the search results while achieving highly accurate similar image search.

One aspect of the present invention is configured as an image search device comprising: an input unit that accepts input of images including a search target and annotation data for the images; a feature extraction unit that extracts a feature quantity of each image; a comparison ease calculation unit that calculates the comparison ease between images; a learning unit that uses the images and the annotation data to learn a method for extracting the feature quantity of the images and a method for calculating the comparison ease; a search unit that determines the order of search results to be displayed on a display screen based on the feature quantities of the images learned by the learning unit and the comparison ease between the images; and a display unit that displays the search results on the display screen in the order determined by the search unit.

According to one aspect of the present invention, easy-to-compare images can be displayed at the top of the search results while achieving highly accurate similar image search.

FIG. 1: Basic configuration diagram showing one embodiment.
FIG. 2: Image search flowchart showing one embodiment.
FIG. 3: Flowchart for learning the comparison ease calculation, showing one embodiment.
FIG. 4: Annotation for learning the comparison ease calculation, showing one embodiment.
FIG. 5: Annotation method for learning the comparison ease calculation, showing one embodiment.
FIG. 6: Search result display screen showing one embodiment.
FIG. 7: Annotation method for learning the comparison ease calculation, showing one embodiment.
FIG. 8: Configuration diagram showing one embodiment.
FIG. 9: Tracking flowchart showing one embodiment.
FIG. 10: Image search flowchart showing one embodiment.

Embodiments of the present invention are described below with reference to the drawings. The following description and drawings are examples for explaining the invention, and are abridged and simplified where appropriate for clarity. The invention can also be implemented in various other forms. Unless otherwise specified, each component may be singular or plural.

To make the invention easier to understand, the position, size, shape, range, and so on of each component shown in the drawings may not represent the actual position, size, shape, range, and so on. The invention is therefore not necessarily limited to the positions, sizes, shapes, ranges, and so on disclosed in the drawings.

In the following description, various kinds of information may be described using expressions such as "table" and "list", but the information may be represented by other data structures. To indicate that the information does not depend on a data structure, an "XX table", "XX list", and the like may be referred to as "XX information". Expressions such as "identification information", "identifier", "name", "ID", and "number" are used when describing identification information, and these are interchangeable.

When there are multiple components with the same or similar functions, they may be described using the same reference numeral with different subscripts. When these components do not need to be distinguished, the subscripts may be omitted.

In the following description, processing may be described as being performed by executing a program. Because a program, when executed by a processor (for example, a CPU or GPU), performs its defined processing while using storage resources (for example, memory) and/or interface devices (for example, communication ports) as appropriate, the processor may be regarded as the agent of the processing. Similarly, the agent of processing performed by executing a program may be a controller, device, system, computer, or node that has a processor. The agent need only be an arithmetic unit, and may include a dedicated circuit (for example, an FPGA or ASIC) that performs specific processing.

A program may be installed on a device such as a computer from a program source. The program source may be, for example, a program distribution server or a computer-readable storage medium. When the program source is a program distribution server, the server includes a processor and a storage resource that stores the program to be distributed, and the server's processor may distribute that program to other computers. In the following description, two or more programs may be implemented as one program, and one program may be implemented as two or more programs.

In this embodiment, a search for images of luggage such as suitcases is described as an example of the image search device. The search target need not be luggage; it may be any target that can be imaged (for example, an object or other imaging subject).

FIG. 1 shows a configuration example of the image search device used in this embodiment. As shown in the figure, the image search device 1000 of this embodiment comprises an input unit 100, a search unit 101, a feature extraction unit 102, a learning unit 103, a comparison ease calculation unit 104, and a display unit 105. Each block is described below. In terms of hardware, the image search device 1000 is a general computer such as a PC or server equipped with, for example, input devices such as a keyboard and mouse, a display device such as an LCD (liquid crystal display), or an input/output device such as a touch panel, together with storage devices such as an HDD (hard disk drive) or SSD (solid state drive) and storage media such as memory.

The input unit 100 accepts the data required for each process when a search or learning is executed. For a search, the input unit 100 accepts a query image and a group of comparison images and outputs them to the search unit 101. For learning, it accepts the images and annotation data required for learning and outputs them to the learning unit 103.

The search unit 101 performs data management and computation for the search. In the search process, the search unit 101 first receives the query image and the group of comparison images from the input unit 100 and outputs them to the feature extraction unit 102 and the comparison ease calculation unit 104. The search unit 101 then acquires the feature quantities that the feature extraction unit 102 extracted from the images and the comparison ease that the comparison ease calculation unit 104 calculated from the images, and determines the display order of the comparison images from them. Finally, the search unit 101 outputs the query image, the comparison images, and the display order to the display unit 105. The comparison ease is information indicating how easy it is to compare images that include the search target, that is, the degree to which comparison is easy. In the following, the comparison ease is described as indicating the ease of comparing the images themselves, but it may instead indicate the ease of comparing the search targets contained in the images.

The feature extraction unit 102 extracts from images the feature quantities required for similarity calculation. During a search, the feature extraction unit 102 extracts feature quantities from the images acquired from the search unit 101 and outputs them to the search unit 101. During learning, it extracts feature quantities from the images acquired from the learning unit 103 and outputs them to the learning unit 103. The feature extraction unit 102 also holds the parameters needed to extract feature quantities, and exchanges these parameters with the learning unit 103 during learning.

The learning unit 103 updates, through learning, the parameters that the feature extraction unit 102 uses to extract feature quantities and the parameters that the comparison ease calculation unit 104 uses to calculate the comparison ease. During learning, the learning unit 103 acquires the required images and annotation data from the input unit 100. To update the feature extraction parameters, the learning unit 103 outputs images to the feature extraction unit 102, acquires the feature quantities extracted from them, and calculates parameter update values from these feature quantities and the annotation data. The calculated update values are output to the feature extraction unit 102. To update the comparison ease parameters, the learning unit 103 outputs images to the comparison ease calculation unit 104, acquires the comparison ease calculated from them, and calculates parameter update values from this comparison ease and the annotation data. The calculated update values are output to the comparison ease calculation unit 104.

The display unit 105 displays the search results. During a search, the display unit 105 acquires the query image, the comparison images, and the display order from the search unit 101 and arranges the results according to that order. The display unit 105 can also acquire the similarity and comparison ease of the images from the search unit 101 and display them.

Each unit of the image search device 1000 is realized by executing a program. For example, the CPU of the image search device 1000 reads a program from ROM and executes it, thereby realizing the functions of the units described above. The program may be provided to the image search device 1000 by being read from a storage medium such as a USB (Universal Serial Bus) memory or by being downloaded from another computer over a network.

FIG. 2 shows an example of the image search flow used in this embodiment. The search flow is described below.

In step 200, the search unit starts the search process in response to a search start operation by the search executor. The search executor need not be a person and may be, for example, a program that executes the search.

In step 201, the input unit 100 accepts, through the search executor's operation of the input device, the input of a query image for the search and a group of comparison images to be searched.

In step 202, the input unit 100 outputs the query image and the comparison images entered by the search executor in step 201 to the search unit 101.

In step 203, the search unit 101 outputs the images acquired from the input unit 100 in step 202 to the feature extraction unit 102.

In step 204, the feature extraction unit 102 extracts feature quantities from the images acquired from the search unit 101 in step 203. A CNN, for example, is used for the extraction. The CNN takes an image as input and outputs a numerical vector that serves as the feature quantity. The vector is computed by convolution operations using the parameters held by the CNN. These parameters can be set to arbitrary values, and are trained in advance so that the CNN outputs similar vectors for images of the same individual.

In step 205, the feature extraction unit 102 outputs the feature quantities extracted in step 204 to the search unit 101.

In step 206, the search unit 101 outputs the query image and the comparison images acquired from the input unit 100 in step 202 to the comparison ease calculation unit 104.

In step 207, the comparison ease calculation unit 104 calculates the comparison ease for the images acquired from the search unit 101 in step 206. A CNN, for example, is used for this calculation. The CNN takes two images as input and outputs a comparison ease value for that pair. The value lies, for example, in the range 0.0 to 1.0, and a larger value indicates that comparison is easier. The comparison ease is computed by convolution operations using the parameters held by the CNN. These parameters can be set to arbitrary values, and are trained in advance so that the CNN outputs a large value for pairs of images that are easy to compare.

In step 208, the comparison ease calculation unit 104 outputs the comparison ease calculated in step 207 to the search unit 101.

In step 209, the search unit 101 determines the display order of the search results using the feature quantities acquired from the feature extraction unit 102 in step 205 and the comparison ease acquired from the comparison ease calculation unit 104 in step 208. To calculate the similarity between the query image and a comparison image from the feature quantities, the Euclidean distance between the numerical vectors is calculated, for example. Normalizing each vector in advance to a magnitude of 1.0 limits the Euclidean distance to the range 0.0 to 2.0. The following formula then limits the similarity to the range 0.0 to 1.0, with larger values indicating higher similarity.
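The formula itself is not reproduced in this text. As a minimal sketch, one mapping that satisfies the stated constraints (unit-length vectors, Euclidean distance in 0.0 to 2.0, similarity in 0.0 to 1.0 with larger meaning more similar) is s = 1 - ||f1 - f2|| / 2. This mapping is an assumption and may differ from the formula in the publication; the function names `l2_normalize` and `similarity` are hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to magnitude 1.0 (assumes v is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def similarity(f1, f2):
    """Assumed mapping: s = 1 - ||f1 - f2|| / 2 after unit normalization."""
    a, b = l2_normalize(f1), l2_normalize(f2)
    # For unit vectors the Euclidean distance lies in [0.0, 2.0].
    d = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 - d / 2.0
```

With this mapping, vectors pointing in the same direction give s = 1.0 and vectors pointing in opposite directions give s = 0.0.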

Here, f₁ and f₂ are the numerical vectors that are the feature quantities of the two images being compared, and s is the similarity. The display order is determined, for example, by calculating the following score.

Here, p is the score, s is the similarity calculated as described above, c is the comparison ease calculated in step 207, and α is a weight, set in the range 0.0 to 1.0, that expresses the relative importance of similarity and comparison ease in determining the display order. When α is small, results are displayed in descending order of similarity; when α is large, they are displayed in descending order of comparison ease. If α can be changed dynamically on the search result display screen, the relative importance of similarity and comparison ease can be adjusted to suit the search executor, which allows flexible display.
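The score formula is not reproduced in this text. A minimal sketch, assuming a linear blend p = (1 - α)s + αc, which matches the described behavior (small α favors similarity, large α favors comparison ease) but is not confirmed as the formula in the publication. The names `display_score` and `rank_results` and the sample values are hypothetical.

```python
def display_score(s, c, alpha):
    """Assumed blend: p = (1 - alpha) * s + alpha * c, with alpha in [0.0, 1.0]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in the range 0.0 to 1.0")
    return (1.0 - alpha) * s + alpha * c


def rank_results(candidates, alpha):
    """Return image ids sorted by descending score.

    candidates: list of (image_id, similarity, comparison_ease) tuples.
    """
    ranked = sorted(
        candidates,
        key=lambda t: display_score(t[1], t[2], alpha),
        reverse=True,
    )
    return [image_id for image_id, _, _ in ranked]


# Illustrative values only.
candidates = [("a", 0.9, 0.2), ("b", 0.8, 0.9), ("c", 0.3, 0.95)]
order_by_similarity = rank_results(candidates, alpha=0.0)
order_by_ease = rank_results(candidates, alpha=1.0)
order_blended = rank_results(candidates, alpha=0.5)
```

Recomputing the ranking with a new α is cheap because s and c do not change, which is what makes a dynamically adjustable α on the result screen practical.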

In step 210, the search unit 101 outputs the query image, the comparison images, and the display-order scores calculated in step 209 to the display unit 105.

In step 211, the display unit 105 determines the display order using the query image, the comparison images, and the display-order scores acquired from the search unit 101 in step 210, and displays the search results. For example, the display unit 105 displays the query image and, beside it, the comparison images arranged from top to bottom in descending order of score.

In step 212, the input unit 100 accepts from the search executor a decision on whether to continue or end the search.

In step 213, if the input unit 100 has accepted a decision to end the search (step 212; Yes), the search ends. If it has not (step 212; No), the process returns to step 201 and the subsequent steps are repeated.

Using the image search method of this embodiment, images of the same individual as the query image are found among the comparison images and displayed at the top of the search results, achieving a highly accurate search, while images that are easy to compare with the query image are selected for display, which shortens the time needed to check the search results. The method of Patent Document 1 can make comparison easier, but because it cannot learn how to extract feature quantities, its search accuracy is low. A typical method that learns feature extraction trains so that similarity is high for all images of the same individual, and therefore does not take into account how easy a comparison image is to compare with the query image. In contrast, the present method learns how to calculate the comparison ease as well as how to extract feature quantities, and reflects both in the display order, so the query image and the comparison images in the search results are easier to compare.

This flow assumes that the query image and each comparison image are single, independent images, but they need not be single images and may each be multiple images of the same individual. In that case there are multiple numerical vectors representing the feature quantity of one individual. These vectors may, for example, be averaged into a single vector, or the similarity may be calculated for every combination of the query image's vectors and the comparison image's vectors and the minimum or average of those similarities taken. Such methods make the similarity calculation more robust to image variation than a one-to-one calculation.
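The two aggregation options described above can be sketched as follows. This is a minimal illustration, not the implementation in the publication: the helper `_sim` uses an assumed similarity mapping (1 minus half the Euclidean distance between unit-normalized vectors), and the names `mean_feature` and `pairwise_similarity` are hypothetical.

```python
import math
from itertools import product


def _unit(v):
    """Scale a vector to magnitude 1.0 (assumes v is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def _sim(f1, f2):
    """Assumed similarity: 1 - ||f1 - f2|| / 2 on unit-normalized vectors."""
    a, b = _unit(f1), _unit(f2)
    return 1.0 - math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b))) / 2.0


def mean_feature(features):
    """Option 1: average several feature vectors of one individual into one."""
    n = len(features)
    return [sum(column) / n for column in zip(*features)]


def pairwise_similarity(query_feats, cand_feats, reduce="mean"):
    """Option 2: similarity over all vector pairs, reduced by min or mean."""
    sims = [_sim(q, c) for q, c in product(query_feats, cand_feats)]
    if reduce == "min":
        return min(sims)
    if reduce == "mean":
        return sum(sims) / len(sims)
    raise ValueError("reduce must be 'min' or 'mean'")
```

Note that an averaged vector is generally no longer of unit length, so it is normalized again inside `_sim` before the distance is taken.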

In this flow, the search executor inputs images at search time and the feature quantities and comparison ease are computed each time. Alternatively, the feature quantities and comparison ease may be computed in advance, and the images, extracted feature quantities, and calculated comparison ease saved to a storage medium. Using the saved data at search time reduces the processing time of the search.
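A minimal sketch of the precomputation idea, limited to feature quantities and using an in-memory dictionary in place of a storage medium. The class name `FeatureCache` and the `extract_fn` callback are hypothetical stand-ins for the feature extraction unit.

```python
class FeatureCache:
    """Stores extracted features so each image is processed only once."""

    def __init__(self, extract_fn):
        self._extract = extract_fn  # hypothetical extractor: image -> vector
        self._features = {}         # image_id -> cached feature vector

    def get(self, image_id, image):
        """Return the cached feature, extracting it on first access."""
        if image_id not in self._features:
            self._features[image_id] = self._extract(image)
        return self._features[image_id]

    def precompute(self, images):
        """Fill the cache ahead of search time; images maps id -> image."""
        for image_id, image in images.items():
            self.get(image_id, image)
```

Calling `precompute` over the comparison images before any query arrives leaves only the query-side extraction and the score calculation to be done at search time.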

This flow uses a general CNN to extract feature quantities, but other known techniques may be used. For example, extracting feature quantities with a CNN that is robust to rotation keeps the feature quantities unchanged even when the luggage image is rotated, which can improve search accuracy.

FIG. 3 shows an example of a flowchart for learning the comparison ease calculation used in this embodiment. The learning flow is described below.

In step 300, the learning unit starts the learning process in response to a learning start operation by the learning executor. The learning executor need not be a person and may be, for example, a program that executes the learning.

In step 301, the input unit 100 accepts, through the learning executor's operation of the input device, the input of images and annotation data for learning.

In step 302, the input unit 100 outputs the images and annotation data entered by the learning executor in step 301 to the learning unit 103.

In step 303, the learning unit 103 outputs the images acquired from the input unit 100 in step 302 to the comparison ease calculation unit 104.

In step 304, the comparison ease calculation unit 104 calculates the comparison ease for the images acquired from the learning unit 103 in step 303. The calculation method is the same as in step 207 of FIG. 2.

In step 305, the comparison ease calculation unit 104 outputs to the learning unit 103 the comparison ease calculated in step 304 and the parameters used to calculate it.

In step 306, the learning unit 103 updates the comparison-ease calculation parameters acquired from the comparison ease calculation unit 104 in step 305, using the comparison ease acquired in step 305 and the annotation data acquired from the input unit 100 in step 302. The parameters are updated by, for example, stochastic gradient descent. The objective function is, for example, the mean squared error between the comparison ease estimated by the comparison ease calculation unit 104 and the ground-truth comparison ease contained in the annotation data. The mean squared error E is calculated as follows.
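The formula itself is not reproduced in this text. Assuming the standard definition of mean squared error over N annotated image pairs (N and the pair index i are notation introduced here for illustration), it would take a form such as:

```latex
E = \frac{1}{N} \sum_{i=1}^{N} \left( c_{g,i} - c_{p,i} \right)^{2}
```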

Here, c_g is the ground-truth comparison ease and c_p is the estimated comparison ease. The parameters are updated so as to reduce the mean squared error E.
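As a rough illustration of the parameter update in step 306, the sketch below runs stochastic gradient descent on the squared error between c_g and c_p. The estimator (a sigmoid of a weighted sum over hypothetical pair features), the learning rate, and the toy data are all assumptions; the calculation actually used in step 207 is not described in this passage.

```python
import math
import random


def predict_ease(weights, pair_features):
    """Hypothetical estimator: sigmoid of a weighted sum, so c_p lies in (0, 1)."""
    z = sum(w * x for w, x in zip(weights, pair_features))
    return 1.0 / (1.0 + math.exp(-z))


def sgd_step(weights, pair_features, c_g, lr=0.5):
    """One stochastic gradient descent update on (c_g - c_p)^2 for one sample."""
    c_p = predict_ease(weights, pair_features)
    # d/dw (c_g - c_p)^2 = -2 (c_g - c_p) * c_p (1 - c_p) * x
    scale = -2.0 * (c_g - c_p) * c_p * (1.0 - c_p)
    return [w - lr * scale * x for w, x in zip(weights, pair_features)]


def mean_squared_error(weights, samples):
    return sum((c_g - predict_ease(weights, x)) ** 2 for x, c_g in samples) / len(samples)


# Toy annotated pairs: (hypothetical pair features, ground-truth comparison ease c_g)
samples = [([1.0, 0.0], 0.9), ([0.0, 1.0], 0.2), ([1.0, 1.0], 0.6)]
weights = [0.0, 0.0]
rng = random.Random(0)
error_before = mean_squared_error(weights, samples)
for _ in range(2000):
    features, c_g = rng.choice(samples)  # one randomly chosen sample per update
    weights = sgd_step(weights, features, c_g)
error_after = mean_squared_error(weights, samples)
```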

In step 307, the learning unit 103 outputs the comparison-ease calculation parameters updated in step 306 to the comparison ease calculation unit 104.

In step 308, the comparison ease calculation unit 104 sets the parameters acquired from the learning unit 103 in step 307 as the parameters to use for calculating the comparison ease.

In step 309, the input unit 100 accepts from the learning executor a decision on whether to continue or end learning.

In step 310, if the input unit 100 has accepted a decision to end learning (step 309; Yes), learning ends. If the input unit 100 has not accepted a decision to end learning (step 309; No), the process returns to step 301 and the subsequent processing is repeated.

Learning the comparison-ease calculation as in this embodiment allows various factors that affect ease of comparison to be taken into account. The method of Patent Document 1 can account for image rotation but not for other factors. In contrast, this method can also account for, for example, occlusion of luggage. If the calculation is trained to output a low comparison ease when a piece of luggage is occluded by other luggage and the parts that matter for comparison are not visible, such images can be excluded from the top of the search results.

The learning method for the comparison-ease calculation in this flow can also be used, by changing the objective function, to update the parameters used for feature extraction. One example objective for learning feature extraction is softmax cross-entropy, trained to identify individuals. The training data need not be a single dataset; multiple datasets can be used. When multiple datasets are used, accuracy is better when the model learns to identify each individual among the individuals within its own dataset while sharing the features across datasets, rather than learning to identify each individual among all individuals in all datasets.
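One way to read the multi-dataset scheme is a shared feature vector feeding a separate classification head per dataset, with the softmax taken only over the individuals of that dataset. The sketch below shows just the loss computation under that assumption; `heads`, the dataset names, and the linear classifiers are hypothetical, and the shared feature extractor is represented simply by its output vector.

```python
import math


def softmax_cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for a single sample."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def multi_dataset_loss(feature, dataset_name, label, heads):
    """Classify only among the individuals of the sample's own dataset.

    `heads` maps a dataset name to one weight vector per individual in that
    dataset; `feature` comes from the feature extractor shared by all datasets.
    """
    logits = [sum(w * f for w, f in zip(row, feature)) for row in heads[dataset_name]]
    return softmax_cross_entropy(logits, label)


# Two hypothetical datasets with 3 and 2 individuals, untrained (all-zero) heads.
heads = {
    "dataset_a": [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]],
    "dataset_b": [[0.0, 0.0], [0.0, 0.0]],
}
loss_a = multi_dataset_loss([1.0, 2.0], "dataset_a", 0, heads)
loss_b = multi_dataset_loss([1.0, 2.0], "dataset_b", 1, heads)
```

With untrained heads the loss equals the log of the number of individuals in the sample's own dataset, which shows that the normalization never mixes individuals from different datasets.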

FIG. 4 shows an example of annotation data for learning the comparison-ease calculation in this embodiment. The annotation data 400 is a set of records, each consisting of a pair of image IDs of images containing the search target and the comparison ease of those two images. An image ID is an identifier assigned to each image so that it does not duplicate that of any other image, and it is used to uniquely identify individual images. The comparison ease between the two images is the target value that the comparison ease calculation unit 104 estimates; it is a number from 0 to 1.0, with larger values indicating that comparison is easier. The order of image ID 1 and image ID 2 does not matter: records with the same combination of image ID 1 and image ID 2 are treated as the same annotation data.
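The order-independent pairing described above could be represented as follows. This is a minimal sketch; the class name `AnnotationStore` and its methods are illustrative and not taken from the source.

```python
class AnnotationStore:
    """Comparison-ease annotations keyed by an unordered pair of image IDs."""

    def __init__(self):
        self._ease = {}

    def add(self, image_id_1, image_id_2, ease):
        if not 0.0 <= ease <= 1.0:
            raise ValueError("comparison ease must be between 0 and 1.0")
        # frozenset ignores order, so (1, 2) and (2, 1) share one record.
        self._ease[frozenset((image_id_1, image_id_2))] = ease

    def get(self, image_id_1, image_id_2):
        return self._ease[frozenset((image_id_1, image_id_2))]

    def __len__(self):
        return len(self._ease)


store = AnnotationStore()
store.add(1, 2, 0.8)
store.add(2, 1, 0.8)  # same combination, so no new record is created
store.add(1, 3, 0.3)
```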

FIG. 5 shows an example of an annotation method for learning the comparison-ease calculation in this embodiment. The display unit 105 displays, as the annotation screen 500 on the display device, different images 501 to 504 of the same individual and the comparison ease 505. The displayed values are the comparison-ease values entered by the annotation executor for each pair of images; for example, the value 505 indicates that the annotation executor operated the input device to enter a comparison ease of 0.8 for image 502 and image 504.

As in this embodiment, creating the annotation data from comparison-ease values entered while a person compares images, which is the same situation as a person reviewing search results, allows the human sense of how easy images are to compare to be reflected in the search results.

On this screen, the images to be annotated are placed at the four corners of the screen. Alternatively, a table display may be used in which each row shows two images and their comparison ease, and the annotation records are listed one below another. Such a display can handle a large number of images to be annotated.

FIG. 6 shows an example of the search result display screen in this embodiment. Here, the search unit 101 accepts a user operation via the input device and changes the order in which the search results are displayed according to the relative importance of the feature similarity and the comparison ease, and the display unit 105 displays the result of the change on the display screen.

For example, the display unit 105 displays, as the search result display screen 600 on the display device, the query image 601 used for the search, the search results 602 from the comparison image group, the similarity 603 between the query image and each comparison image, the comparison ease 604 between the query image and each comparison image, and the α value 605 that sets the relative importance of similarity and comparison ease. On this screen the α value 605 is set to 0.5, so the score is calculated with similarity and comparison ease weighted equally. The search results 602 are arranged in descending order of score.

As in this embodiment, the input unit 100 accepts an operation from the search executor to set the α value 605, so that the α value 605 can be set freely on the display screen. This makes it possible to change the relative importance of similarity and comparison ease and to review the search results flexibly.
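The effect of the α value on the ordering could be sketched as below. The exact score formula is not given in this passage; the sketch assumes a linear blend, score = α · similarity + (1 − α) · comparison ease, which is consistent with α = 0.5 weighting both equally. The function name and the sample values are illustrative.

```python
def rank_results(results, alpha):
    """Sort results by an assumed blended score, highest first.

    `results` is a list of (image_id, similarity, comparison_ease) tuples.
    Assumption: score = alpha * similarity + (1 - alpha) * comparison_ease.
    """
    def score(item):
        _, similarity, ease = item
        return alpha * similarity + (1.0 - alpha) * ease

    return [image_id for image_id, _, _ in sorted(results, key=score, reverse=True)]


results = [("A", 0.9, 0.2), ("B", 0.8, 0.9), ("C", 0.5, 0.95)]
balanced = rank_results(results, alpha=0.5)         # similarity and ease weighted equally
similarity_only = rank_results(results, alpha=1.0)  # comparison ease ignored
ease_only = rank_results(results, alpha=0.0)        # similarity ignored
```

Under this assumption, image A, which is the most similar but hard to compare, drops from first to last place once comparison ease is weighted equally.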

On this screen the α value is entered as a number, but it may instead be set with, for example, a slide bar. A slide bar makes entering the α value 605 simpler.

This embodiment describes, as an example of an image search device, a search for images of luggage such as suitcases. The search target need not be luggage; as in Example 1, it may be any object that can be imaged. In this embodiment, the comparison ease is calculated from the rotation angle of the image, and the rotation angle of the image is used as the annotation data for learning the comparison-ease calculation. In all other respects this embodiment is the same as Example 1.

FIG. 7 shows an example of an annotation method for learning the comparison-ease calculation in this embodiment.

The display unit 105 displays, as the annotation screen 700 on the display device, different images 701 and 702 of the same individual and the rotation angle 703. The input unit 100 accepts from the annotation executor the rotation angle of the luggage between the reference image 701 and the comparison image 702. The comparison ease calculation unit 104 calculates the comparison ease from the input rotation angle as follows.

Here, c denotes the comparison ease and θ denotes the rotation angle.

Restricting the comparison ease to the rotation angle of the image, as in this embodiment, makes annotation easier for the annotation executor.

This embodiment describes, as an example of an image search device, a search for images of luggage such as suitcases. The search target need not be luggage; as in Example 1, it may be any object that can be imaged. In this embodiment, objects have been tracked in advance, and when there are multiple images of each individual, an image that is easy to compare with the query image is selected from among them. In all other respects this embodiment is the same as Example 1.

FIG. 8 shows an example configuration used in this embodiment. As shown in the figure, the image search device 8000 of this embodiment is configured by adding a tracking unit 800 and a tracking data storage unit 801 to the configuration example of FIG. 1. Each added block is described below.

The tracking unit 800 detects luggage in each of a series of temporally consecutive images and tracks that luggage over time. To perform tracking, the tracking unit 800 first acquires the temporally consecutive images from the input unit 100. Next, it performs tracking processing on the acquired images. Finally, it outputs the result of the tracking processing to the tracking data storage unit 801.

When the tracking unit 800 finishes the tracking processing, the tracking data storage unit 801 acquires the tracking result and saves it to a storage device or storage medium.

FIG. 9 shows an example of the tracking flowchart used in this embodiment. The tracking flow is described below.

In step 900, the tracking unit starts the tracking process in response to a tracking start operation by the tracking executor. The tracking executor need not be a person and may be, for example, a program that executes tracking.

In step 901, the input unit 100 accepts the temporally consecutive images for tracking, which the tracking executor inputs by operating the input device.

In step 902, the input unit 100 outputs the temporally consecutive images input by the tracking executor in step 901 to the tracking unit 800.

In step 903, the tracking unit 800 performs tracking processing on the temporally consecutive images acquired from the input unit 100 in step 902. For tracking, luggage is first detected in each image using any of various known methods. Each detection result consists of the type of luggage and its position and size within the image. Next, for the detection results of temporally adjacent images, the tracking unit 800 links pieces of luggage that were detected as the same type and are spatially close to each other, and assigns the same tracking ID to the images containing the linked luggage. The distance is calculated using, for example, the Euclidean distance between the detected positions. The detection results generated in this way consist of the type of luggage, the ID of the image in which the luggage appears, the tracking ID of the images in which the same luggage was detected, and the position and size of the luggage in each image.
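The linking step could be sketched as follows, assuming detections are already available per frame. The greedy nearest-neighbour matching and the `max_distance` gate are assumptions added for illustration; the source only states that same-type detections that are spatially close in adjacent frames are linked using, for example, Euclidean distance.

```python
import math
from itertools import count


def link_detections(frames, max_distance=50.0):
    """Assign a 'track_id' to each detection by linking adjacent frames.

    `frames` is a list of frames; each frame is a list of dicts with keys
    'kind', 'x', 'y'. `max_distance` is a hypothetical gating threshold.
    """
    next_id = count(1)
    previous = []
    for detections in frames:
        unmatched = list(previous)
        for det in detections:
            same_kind = [p for p in unmatched if p["kind"] == det["kind"]]
            best = min(
                same_kind,
                key=lambda p: math.dist((p["x"], p["y"]), (det["x"], det["y"])),
                default=None,
            )
            if best is not None and math.dist(
                (best["x"], best["y"]), (det["x"], det["y"])
            ) <= max_distance:
                det["track_id"] = best["track_id"]  # continue the existing track
                unmatched.remove(best)
            else:
                det["track_id"] = next(next_id)     # start a new track
        previous = detections
    return frames


frames = link_detections([
    [{"kind": "suitcase", "x": 10, "y": 10}, {"kind": "bag", "x": 100, "y": 100}],
    [{"kind": "suitcase", "x": 12, "y": 11}, {"kind": "bag", "x": 98, "y": 103},
     {"kind": "suitcase", "x": 300, "y": 300}],
])
```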

In step 904, the tracking unit 800 outputs the detection results generated in step 903 to the tracking data storage unit 801 as the tracking result.

In step 905, the tracking data storage unit 801 saves the data representing the tracking result acquired from the tracking unit 800 in step 904 to a storage device or storage medium.

In step 906, the input unit 100 accepts from the tracking executor a decision on whether to continue or end tracking.

In step 907, if the input unit 100 has accepted a decision to end tracking (step 906; Yes), tracking ends. If the input unit 100 has not accepted a decision to end tracking (step 906; No), the process returns to step 901 and the subsequent processing is repeated.

Tracking as in this embodiment allows multiple images of the same individual to be grouped into a set before the search is executed, which reduces the number of images to be compared and can improve search accuracy. In addition, because multiple images can be used for both the query image and the comparison image when calculating similarity, the similarity calculation is robust to changes in the images, which can also improve search accuracy.

FIG. 10 shows an example of the image search flowchart used in this embodiment. Only the steps that differ from the search flow of FIG. 2 are described below.

In step 1000, the input unit 100 accepts the query image for the search, which the search executor inputs by operating the input device.

In step 1001, the tracking data storage unit 801 outputs the group of images to be compared in the search, together with the tracking data, to the search unit 101.

In step 1002, the search unit 101 outputs the query image acquired from the input unit 100 in step 1000 and the comparison image group acquired from the tracking data storage unit 801 in step 1001 to the feature extraction unit 102.

In step 1003, the search unit 101 outputs the query image acquired from the input unit 100 in step 1000 and the comparison image group acquired from the tracking data storage unit 801 in step 1001 to the comparison ease calculation unit 104.

In step 1004, the search unit 101 determines the display order of the search results using the features acquired from the feature extraction unit 102 in step 205. As the score that determines the display order, the search unit 101 uses, for example, the similarity calculated from the features by the similarity calculation method of step 209.

In step 1005, the search unit 101 uses the comparison ease to select the image to display from among the multiple images of each individual to be compared. For example, the search unit 101 selects, from the multiple images, the one with the highest comparison ease with respect to the query image.
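The selection in step 1005 amounts to taking, per tracked individual, the image with the highest comparison ease relative to the query. A minimal sketch, with hypothetical data structures (a mapping from tracking ID to image IDs, and a mapping from image ID to its comparison ease against the query):

```python
def select_display_images(tracks, ease_to_query):
    """For each tracked individual, pick the image easiest to compare with the query.

    `tracks` maps a tracking ID to the image IDs of that individual;
    `ease_to_query` maps an image ID to its comparison ease against the query.
    """
    return {
        track_id: max(image_ids, key=lambda image_id: ease_to_query[image_id])
        for track_id, image_ids in tracks.items()
    }


tracks = {"track-1": ["img-1", "img-2", "img-3"], "track-2": ["img-4", "img-5"]}
ease_to_query = {"img-1": 0.3, "img-2": 0.9, "img-3": 0.5, "img-4": 0.7, "img-5": 0.1}
selected = select_display_images(tracks, ease_to_query)
```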

With the image search method of this embodiment, an image that is easy to compare with the query image can be selected from among the multiple images of the same individual to be compared, which reduces the time needed to review the search results.

As described above, each of the above examples can provide an image search device that displays easily compared images at the top of the search results while achieving highly accurate similar-image search. They can also provide a method that improves ease of comparison by learning to calculate the comparison ease of images along with the method of extracting similarity features, and then sorting the search results at search time using both the feature similarity and the comparison ease.

The present invention is not limited to the examples described above and includes various modifications. For example, the above examples are described in detail to explain the present invention clearly, and the invention is not necessarily limited to configurations having all of the described elements. Part of the configuration of one example can be replaced with the configuration of another example, and the configuration of another example can be added to the configuration of one example. For part of the configuration of each example, other configurations can be added, deleted, or substituted.

1000, 8000 Image search device
100 Input unit
101 Search unit
102 Feature extraction unit
103 Learning unit
104 Comparison ease calculation unit
105 Display unit
800 Tracking unit
801 Tracking data storage unit

Claims

An input unit that accepts input of an image including a search target and annotation data of the image,
A feature extraction unit that extracts the feature amount of the image,
A comparison ease calculation unit that calculates the comparison ease between the images,
A learning unit that learns a method for extracting the feature amount of the image and a method for calculating the comparison ease using the image and the annotation data,
A search unit that determines the order of search results to be displayed on the display screen based on the feature amount of the image learned by the learning unit and the ease of comparison between the images.
A display unit that displays the search results on the display screen in the order determined by the search unit, and
An image search device characterized by comprising.

The image search device according to claim 1.
The input unit receives input of data including one image, another image, and ease of comparison between the one image and the other image as the annotation data.
The learning unit performs the learning using the data input from the input unit.
An image search device characterized by this.

The image search device according to claim 1.
The search unit receives an operation from the user by the input device, and changes the order of displaying the search results according to the similarity of the feature amount of the image and the importance of the comparison ease.
The display unit displays the result of the change by the search unit on the display screen.
An image search device characterized by this.

The image search device according to claim 1.
The comparison ease calculation unit calculates the comparison ease using the rotation angle of the image.
An image search device characterized by this.

The image search device according to claim 1.
A tracking unit that tracks the search target using the images that are continuous in chronological order,
A tracking data storage unit that stores the results of tracking by the tracking unit is provided.
The search unit searches the tracking results accumulated by the tracking data storage unit for images including the search target and, based on the retrieved images and the comparison ease, selects an image to be displayed on the display screen.
An image search device characterized by this.

The input unit accepts the input of the image including the search target and the annotation data of the image,
The feature extraction unit extracts the feature amount of the image and
The comparison ease calculation unit calculates the comparison ease between the images, and then
The learning unit learns the method of extracting the feature amount of the image and the method of calculating the comparison ease using the image and the annotation data.
The search unit determines the order of the search results to be displayed on the display screen based on the feature amount of the image learned by the learning unit and the ease of comparison between the images.
The display unit displays the search results on the display screen in the order determined by the search unit.
An image search method characterized by that.

The image search method according to claim 6.
The input unit receives input of data including one image, another image, and ease of comparison between the one image and the other image as the annotation data.
The learning unit performs the learning using the data input from the input unit.
An image search method characterized by that.

The image search method according to claim 6.
The search unit receives an operation from the user by the input device, and changes the order of displaying the search results according to the similarity of the feature amount of the image and the importance of the comparison ease.
The display unit displays the result of the change by the search unit on the display screen.
An image search method characterized by that.

The image search method according to claim 6.
The comparison ease calculation unit calculates the comparison ease using the rotation angle of the image.
An image search method characterized by that.

The image search method according to claim 6.
The tracking unit tracks the search target using the images that are continuous in chronological order.
The tracking data storage unit accumulates the results of tracking by the tracking unit,
The search unit searches the tracking results accumulated by the tracking data storage unit for images including the search target and, based on the retrieved images and the comparison ease, selects an image to be displayed on the display screen.
An image search method characterized by that.