JP4875781B1 - Distributed data storage system

Info

Publication number: JP4875781B1
Application number: JP2011151338A
Authority: JP
Inventors: 敏文新谷; 壮一最首
Original assignee: Nomura Research Institute Ltd
Current assignee: Nomura Research Institute Ltd
Priority date: 2011-07-08
Filing date: 2011-07-08
Publication date: 2012-02-15
Anticipated expiration: 2031-07-08
Also published as: WO2013008353A1; JP2013020314A

Abstract

Problem: To provide a distributed data storage system that, when a plurality of partial data generated from important data by a secret sharing technique is stored across a plurality of data centers, improves data availability without depending on the library or the like that implements the secret sharing.
Solution: Each server 100 has a data storage unit 110 that stores partial data received from a client terminal 300. The client terminal 300 has: a division processing unit 310 that uses a secret sharing technique to generate, from the important data, n pieces of partial data (k ≦ n) from which the important data cannot be restored unless at least k pieces are collected; a distributed management unit 320 that stores the n pieces of partial data and n copies on 2n servers 100 and collects, from m servers 100, m (k ≦ m ≦ n) different pieces of partial data or copies for restoring the important data; and a restoration processing unit 330 that restores the important data from the m pieces of partial data or copies by the secret sharing technique.
Selected drawing: FIG. 1

Description

The present invention relates to electronic data storage technology, and in particular to technology that is effective when applied to a distributed data storage system that generates a plurality of non-important data from important data by a secret sharing technique and stores them in a distributed manner at a plurality of sites.

Companies and other organizations that operate information systems need to take measures to protect important data, such as highly confidential data, in order to prevent information security incidents such as information leakage. Various means of achieving this have been proposed.

One conceivable means of protecting important data is for a company or the like to store it in a data center protected by multiple layers of security measures. However, building and operating a private, externally accessible data center on one's own imposes a heavy burden in terms of technology and cost, and is not easy to achieve.

Alternatively, a company could use a data center that a third party operates and offers externally as a service. However, storing a company's important data in a data center operated and managed by a third party carries a high security risk. Storing important data in virtual data centers or virtual servers in cloud computing environments, whose use has been expanding in recent years, is riskier still, and this is one reason why building information systems that handle important data on cloud computing environments has been slow to spread.

Meanwhile, important data is also stored using measures that conceal the data or prevent tampering. Typically, important data is encrypted with an encryption key before storage. In this case, however, the encrypted data contains all of the information of the important data. Consequently, if a third party obtains the encrypted data and, for whatever reason, also obtains the encryption key, the important data is easily restored. Even if the encryption key is not obtained, the key has a finite length, so in theory the important data could be restored from the encrypted data by a finite number of trials.

In contrast, so-called secret sharing is also used as a technique for strongly concealing important data. In secret sharing, important data is divided and distributed into pieces of non-important data that are meaningless on their own (the important data cannot be restored or inferred from them), so that even if a third party obtains some of the non-important data, restoring the important data is impossible even in theory.

Various secret sharing techniques have been proposed. For example, Japanese Patent No. 4039810 (Patent Document 1) discloses a technique for securing electronic information as follows: an electronic information file is divided into a plurality of information elements; the divided information elements are selected and combined in a changed order to generate two or more information blocks such that all information elements are included only when all of the information blocks are integrated; divided extraction data is generated that records division information on how the file was divided into information elements and formation information on how the information blocks were generated; and at least one of the information blocks and the divided extraction data is stored at a certification authority while the others are separated and stored individually.

Meanwhile, there are secret sharing techniques in which the important data can be restored by collecting at least a predetermined number of the non-important data or information blocks corresponding to the important data (hereinafter sometimes referred to as "partial data"), rather than all of them. One example that has long been used is the (k, n) threshold secret sharing scheme using polynomial interpolation described in Non-Patent Document 1. With this scheme, the important data can be restored by collecting at least k (k ≦ n) of the n pieces of partial data. Various threshold secret sharing schemes that further improve on it have also been proposed.
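The (k, n) threshold scheme mentioned above can be illustrated with a minimal Python sketch of polynomial-interpolation secret sharing over a prime field. This is only an illustration under stated assumptions: the prime `PRIME`, the function names `split_secret` and `restore_secret`, and the choice to treat the secret as a single integer are all hypothetical and are not taken from the patent or from any particular library.

```python
import secrets

# Assumption: a Mersenne prime that is large enough for small integer secrets.
PRIME = 2**127 - 1


def split_secret(secret: int, k: int, n: int, prime: int = PRIME) -> list[tuple[int, int]]:
    """Split `secret` into n shares; any k of them are enough to restore it."""
    if not (1 < k <= n < prime):
        raise ValueError("require 1 < k <= n < prime")
    if not (0 <= secret < prime):
        raise ValueError("secret must lie in [0, prime)")
    # Random polynomial of degree k - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(prime) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo the prime
            y = (y * x + c) % prime
        shares.append((x, y))
    return shares


def restore_secret(shares: list[tuple[int, int]], prime: int = PRIME) -> int:
    """Recover the constant term by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % prime
                den = (den * (xi - xj)) % prime
        # pow(den, -1, prime) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, prime)) % prime
    return secret
```

For example, `restore_secret(split_secret(42, 3, 4)[:3])` returns the original value, because any three of the four shares determine the degree-2 polynomial.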

In this connection, JP 2009-139990 A (Patent Document 2), for example, discloses an information processing apparatus comprising: a division unit that divides data stored in a storage device into a predetermined number of pieces of partial data, no fewer than a reference number, by a secret sharing scheme in which the reference number of pieces is required for restoration; a transmission unit that transmits the partial data to other information processing apparatuses and deletes it from the storage device; an acquisition unit that, when the data is to be restored to the storage device, acquires partial data from the other information processing apparatuses and stores it in the storage device; and a restoration unit that restores the data on condition that the reference number of pieces of partial data have been stored in the storage device.

Patent Document 1: Japanese Patent No. 4039810
Patent Document 2: JP 2009-139990 A

Non-Patent Document 1: A. Shamir, "How to Share a Secret", Communications of the ACM, vol. 22, no. 11, pp. 612-613, 1979.

In recent years, as portable information processing terminals such as notebook PCs (personal computers) have come into wide use, the risk of information leakage through theft or loss of the terminals themselves has increased. For example, a company that handles personal information may need to notify or report to the supervisory authorities when an employee loses such a terminal. Until now, however, it has rarely been possible to identify the actual extent of the leak, and the company has had no choice but to report that all data on the terminal was leaked, or may have been leaked, at the time of the loss.

One conceivable countermeasure is to store the data on the terminal, including important data, on external servers or the like, thereby reducing the risk of information leakage when a terminal is lost. Rather than storing the important data as-is on an external server, the secret sharing technique described above can be used to divide and distribute the important data into non-important partial data, which is then stored in a distributed manner on external servers. This reduces the risk of information leakage even when the data is stored in, for example, virtual data centers or virtual servers in a cloud computing environment.

That is, the partial data stored in a distributed manner at each data center is meaningless by itself, and the content of the important data cannot be restored or inferred from that partial data alone. Therefore, the content of the important data does not leak from acquired partial data, whether the partial data is acquired by a third party who intrudes into the data center or server, or maliciously by an insider such as an administrator of the data center.

Furthermore, when important data is divided and distributed into a plurality of partial data by secret sharing, the original important data can still be restored after some of the partial data is lost, provided that at least a predetermined number of pieces can be collected. Data availability is thus also improved. For example, when n pieces of partial data are generated from important data by (k, n) threshold secret sharing, the important data can be restored if k or more pieces can be collected. In other words, the loss of up to (n - k) pieces of partial data can be tolerated.

However, in real-world information system implementations that require secure handling of important data, secret sharing is still not widely applied, in contrast to encryption, which is in broad general use. As a result, secret sharing libraries whose functions and performance are adequate for practical use, in terms of track record, processing speed, and so on, are currently limited.

In many of these libraries, the secret sharing functions are fixed or restricted, so the availability obtained through secret sharing cannot be extended or changed, and data availability ends up depending on the functions and specifications of the secret sharing library being used.

Accordingly, an object of the present invention is to provide a distributed data storage system that, when a plurality of partial data generated from important data by a secret sharing technique is stored across a plurality of data centers, can improve data availability without depending on the functions or specifications of the library or the like that implements the secret sharing.

The above and other objects and novel features of the present invention will become apparent from the description in this specification and the accompanying drawings.

A brief outline of representative aspects of the invention disclosed in this application is as follows.

A distributed data storage system according to a representative embodiment of the present invention is a system in which a client terminal generates, from important data, partial data consisting of a plurality of non-important data by a secret sharing technique, and stores the respective partial data in a distributed manner on a plurality of servers connected via a network. The system has the following features.

Each of the servers has a data storage unit that stores the partial data received from the client terminal.

The client terminal has: a division processing unit that uses the secret sharing technique to generate, from the important data that a user has instructed to be stored, n pieces (k ≦ n) of the partial data from which the important data cannot be restored unless at least k pieces are collected; a distributed management unit that stores the n pieces of partial data generated by the division processing unit, and n copies of the respective pieces, in the data storage units of 2n of the servers, and that collects, from m of the servers, m (k ≦ m ≦ n) different pieces of the partial data or copies thereof for restoring the important data; and a restoration processing unit that, for the important data that the user has instructed to be used, restores the important data by the secret sharing technique from the m different pieces of the partial data or copies thereof obtained from the distributed management unit.
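The collection step attributed to the distributed management unit can be sketched roughly as follows. This is a hypothetical illustration, not the patented implementation: the function name `collect_shares`, the representation of each server as a zero-argument callable returning `(share_id, payload)`, and the use of `ConnectionError` to signal an unavailable server are all assumptions made for the example.

```python
def collect_shares(servers, m):
    """Gather m different shares, visiting servers in order.

    `servers` maps a server name to a hypothetical fetch callable that
    returns (share_id, payload) or raises ConnectionError when the
    server cannot be reached. Originals and copies carry the same
    share_id, so a copy never counts as an additional share.
    """
    collected = {}
    for name, fetch in servers.items():
        if len(collected) >= m:
            break  # enough different shares already
        try:
            share_id, payload = fetch()
        except ConnectionError:
            continue  # skip servers that are down
        collected.setdefault(share_id, payload)
    if len(collected) < m:
        raise RuntimeError(f"only {len(collected)} different shares available, need {m}")
    return collected
```

The sketch keeps at most one payload per share identifier, which mirrors the requirement that the m collected pieces be different from one another.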

The effects obtained by representative aspects of the invention disclosed in this application are briefly described as follows.

According to a representative embodiment of the present invention, when a plurality of partial data generated from important data by a secret sharing technique is stored across a plurality of data centers, data availability can be improved without depending on the functions or specifications of the library or the like that implements the secret sharing.

A diagram showing an overview of a configuration example of a distributed data storage system according to an embodiment of the present invention.
A diagram explaining the concept of data storage in an embodiment of the present invention.
A diagram explaining an example of data availability in an embodiment of the present invention.
A diagram explaining another example of data availability in an embodiment of the present invention.
A diagram showing an example of selecting the servers that store partial data in an embodiment of the present invention.
A diagram showing an overview of an example of the flow of authentication processing in an embodiment of the present invention.

Embodiments of the present invention are described in detail below with reference to the drawings. In all of the drawings used to describe the embodiments, identical parts are in principle given the same reference numerals, and repeated description of them is omitted.

<Overview>
In the distributed data storage system according to an embodiment of the present invention, when a user stores the user's important data from a client terminal, a plurality of partial data is generated from the important data by a secret sharing technique and sent to servers in a plurality of data centers for distributed storage. A copy of each piece of partial data is stored in a separate data center, so that every piece is duplicated. As described later, this greatly improves data availability without depending on the functions or specifications of the secret sharing library used.

A single sign-on mechanism is also implemented for the data centers, and authentication is performed using unique information (a key) that differs for each data center, so that each data center can be accessed independently and safely and security between data centers is ensured. As a result, even an insider such as an administrator of a data center can obtain only one piece of partial data, from which the important data cannot be restored or inferred, so the user can store important data safely.

FIG. 2 is a diagram explaining the concept of data storage in the distributed data storage system of this embodiment. The distributed data storage system 1 has a configuration in which a client terminal 300 and a plurality of data centers 10 (eight in the example of FIG. 2, 10a to 10h) are connected via a network (not shown). The client terminal 300 is an information processing terminal that the user normally uses for work or the like and on which processing such as entering and viewing the important data 500 is performed. Examples include a PC and mobile terminals such as tablet terminals, smartphones, and mobile phones.

A data center 10 is a site where server equipment is housed, operated, and managed. It may be, for example, a dedicated data center facility capable of housing a large number of servers and performing advanced operations management, a portable data center of the so-called container or modular type, or a non-dedicated facility such as a machine room in an office building. It may also be a virtual data center in a cloud computing environment. Each data center 10 in this embodiment has one or more servers (not shown), each having a data storage unit 110 (110a to 110h in the example of FIG. 2) consisting of storage equipment or the like for holding and storing data.

These data centers 10 should preferably be unrelated to one another both geographically and organizationally. That is, the configuration should preferably not allow administrators or the like to physically or electronically access one another's data centers 10, as could happen if, for example, the data centers were located on the same or adjacent premises or were operated by the same or related operators.

In this environment, when the important data 500 on the client terminal 300 is to be stored securely in response to an instruction from the user, the client terminal 300 first generates a plurality of partial data 510 (four in the example of FIG. 2, 510a to 510d) from the important data 500 using a secret sharing technique. As described above, each piece of partial data 510 is non-important data that is meaningless on its own. The important data 500 is then deleted to prevent leakage.

The secret sharing technique (secret sharing algorithm) to be used is not particularly limited. For example, a so-called (k, n) threshold scheme (1 < k ≦ n) can be used, in which the important data 500 can be restored by collecting k or more of the n pieces of partial data 510 but, in principle, cannot be restored from fewer than k pieces. The values of k and n are likewise not particularly limited and can be chosen as appropriate according to requirements such as security strength, availability, and processing speed. The value of n (the number of pieces of partial data 510 generated) must, however, be at most half the number of data centers 10. In other words, 2n or more data centers 10 (eight in the example of FIG. 2) must be available for the n pieces of partial data 510 (four in the example of FIG. 2).

Each of the four generated pieces of partial data 510 is copied so that it is duplicated, and, as illustrated, the resulting pieces are allocated to the eight data centers 10 without overlap, transmitted, and stored in a distributed manner in the data storage units 110. That is, the pieces of partial data 510 generated from the same important data 500 (hereinafter, unless otherwise noted, "partial data 510" includes the copies) are each stored separately in different data centers 10, so that no two or more pieces of partial data 510 are ever stored in the same data center 10. Each piece of partial data 510 on the client terminal 300 is then deleted to prevent leakage.
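The allocation rule described above (n pieces plus n copies, at most one piece per data center) can be sketched as follows. This is a minimal illustration under assumptions: the function name `assign_shares` and the simple ordering (originals to the first n data centers, copies to the next n) are hypothetical, and the actual procedure for selecting servers is not described in this passage.

```python
def assign_shares(share_ids, data_centers):
    """Map each data center to at most one share (an original or a copy).

    Assumed ordering: originals go to the first n data centers and the
    copies go to the next n, so no data center holds two shares.
    """
    n = len(share_ids)
    if len(data_centers) < 2 * n:
        raise ValueError("need at least 2n data centers for n shares")
    placement = {}
    for i, share_id in enumerate(share_ids):
        placement[data_centers[i]] = share_id      # original
        placement[data_centers[n + i]] = share_id  # copy
    return placement
```

With four shares and eight data centers, every data center receives exactly one piece and every share identifier appears exactly twice.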

By generating the partial data 510 from the important data 500 with a secret sharing technique and storing it in a distributed manner across the data centers 10 as described above, leakage of the important data 500 through theft or loss of the client terminal 300 can be prevented. In addition, because each data center 10 holds only one piece of partial data 510 for a given item of important data 500, a third party who intrudes into a data center 10, or an insider such as an administrator of a data center 10, can obtain only one piece of partial data 510. The important data 500 cannot be restored or inferred from that piece alone, so the content of the important data 500 does not leak.

Moreover, when the data centers 10 are unrelated to one another both geographically and organizationally, it is difficult, particularly in combination with the authentication processing described later, for an insider such as an administrator of one data center 10 to access another data center 10. Therefore, even if an insider such as an administrator of a data center 10 acts maliciously, the insider is prevented from illegitimately acquiring partial data 510 from other data centers 10, collecting k or more pieces, and restoring the important data 500.

Furthermore, by duplicating each piece of partial data 510 and storing the pieces in different data centers 10, data availability relative to the resources required can be greatly improved compared with a simple backup of ordinary data, without depending on the functions or specifications of the secret sharing library used.

For example, when four pieces of partial data 510 are generated from the important data 500 by (3, 4) threshold secret sharing and stored in four data centers 10, the original important data 500 can be restored if three of the pieces can be collected. That is, the loss of partial data 510 due to a failure or the like at one data center 10 can be tolerated. By contrast, when the four pieces of partial data 510 and their copies are stored across eight different data centers 10 as shown in the example of FIG. 2, overall data availability improves by much more than the increase in required resources.

FIGS. 3 and 4 are diagrams explaining examples of data availability in the distributed data storage system of this embodiment. The example of FIG. 3 shows a state in which the partial data 510 is duplicated and stored across eight data centers 10, as in the example of FIG. 2, and the partial data 510 can no longer be obtained from five data centers, 10a, 10c, 10d, 10f, and 10h, because of failures or the like. Even in this case, three different pieces of partial data, 510a, 510b, and 510c, can be collected from the three data centers that are still operating normally, 10b, 10e, and 10g, which shows that the important data 500 can sometimes be restored.

On the other hand, as shown in the example of FIG. 4, the important data 500 may become unrestorable even when fewer than five data centers 10 fail, for instance when the failures are concentrated on the data centers 10 that store a particular piece of partial data 510 and its copy. The example of FIG. 4 shows a state in which the partial data 510 can no longer be obtained from four data centers, 10c, 10d, 10g, and 10h, because of failures or the like. In this case, only two different pieces of partial data 510 can be collected, 510a and 510b, and the important data 500 cannot be restored.
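The two scenarios of FIGS. 3 and 4 can be checked with a small Python sketch. The mapping `LAYOUT` is an assumption: the text does not state which data center holds which piece, and the layout below (originals in 10a to 10d, copies in 10e to 10h) is merely one assignment that is consistent with both scenarios. The function name `can_restore` is likewise hypothetical.

```python
# Assumed layout, consistent with the FIG. 3 and FIG. 4 scenarios.
LAYOUT = {
    "10a": "510a", "10b": "510b", "10c": "510c", "10d": "510d",
    "10e": "510a", "10f": "510b", "10g": "510c", "10h": "510d",
}


def can_restore(layout, failed, k):
    """True if the surviving data centers hold at least k different shares."""
    surviving = {share for dc, share in layout.items() if dc not in failed}
    return len(surviving) >= k


# FIG. 3 scenario: five failures, yet 510a, 510b and 510c remain reachable.
fig3 = can_restore(LAYOUT, {"10a", "10c", "10d", "10f", "10h"}, k=3)

# FIG. 4 scenario: four failures, but only 510a and 510b remain reachable.
fig4 = can_restore(LAYOUT, {"10c", "10d", "10g", "10h"}, k=3)
```

Under this assumed layout, `fig3` evaluates to `True` and `fig4` to `False`, matching the outcomes described for the two figures.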

Now suppose that, in the state shown in FIG. 4, any one of the failed data centers 10 were operating normally. The partial data 510 obtainable from that data center 10 would be either partial data 510c or 510d, which together with the obtainable partial data 510a and 510b gives three different pieces of partial data 510 in total, so the important data 500 can be restored. It follows that when three data centers 10 fail, whichever three they are, at least three different pieces of partial data 510 can always be collected and the important data 500 can be restored.

As described above, when four pieces of partial data 510 are generated from the important data 500 by (3, 4) threshold secret sharing and stored in four data centers 10, a failure or the like at one of the data centers 10 can be tolerated. When the four pieces of partial data 510 and their copies are instead duplicated and stored across eight different data centers 10, as in the example of FIG. 2, twice the amount of storage resources is required. The number of data center 10 failures that can be tolerated, however, is not merely doubled to two. As described above, it is at least three and at most five, so overall data availability improves by more than a factor of two.

Generalizing the above: when n pieces of partial data 510 are generated from the important data 500 by (k, n) threshold secret sharing and stored in n different data centers 10, the system can tolerate failures of (n−k) of those data centers 10. When the n pieces of partial data 510 and their copies are stored in 2n different data centers 10, the number of tolerable data center 10 failures is not merely doubled to 2(n−k); it is at least (2(n−k)+1) and at most (2n−k). Since 2(n−k) < (2(n−k)+1) < (2n−k) when 1 < k ≦ n, overall data availability improves by more than a factor of two.
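The bounds above can be checked by brute force for small parameters. The following is a minimal sketch, not part of the described system: it assumes servers 0 to n−1 hold the n pieces of partial data and servers n to 2n−1 hold their copies, and the function names are hypothetical.

```python
from itertools import combinations

def recoverable(k, n, failed):
    """True if at least k distinct shares survive when the servers in `failed` are down.

    Assumed layout: server s holds share (s % n), so servers 0..n-1 hold the
    shares and servers n..2n-1 hold the copies.
    """
    surviving = {s % n for s in range(2 * n) if s not in failed}
    return len(surviving) >= k

def tolerance_bounds(k, n):
    """Return (guaranteed, best_case) numbers of tolerable server failures."""
    guaranteed, best_case = 0, 0
    every_pattern_ok = True
    for f in range(1, 2 * n + 1):
        results = [recoverable(k, n, set(c)) for c in combinations(range(2 * n), f)]
        if every_pattern_ok and all(results):
            guaranteed = f          # every choice of f failed servers is survivable
        else:
            every_pattern_ok = False
        if any(results):
            best_case = f           # at least one choice of f failed servers is survivable
    return guaranteed, best_case

bounds = tolerance_bounds(3, 4)     # the (3,4) example from the text
```

For the (3,4) example this enumeration gives three guaranteed and five best-case failures, matching (2(n−k)+1) and (2n−k).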

<System configuration>
The system configuration of the distributed data storage system 1 of the present embodiment is described below. FIG. 1 is a diagram outlining a configuration example of the distributed data storage system 1 according to one embodiment of the present invention. In the distributed data storage system 1, a plurality of servers 100, a master server 200, and a client terminal 300 are connected to a network 400 such as the Internet. As described above, each server 100 is assumed to be operated and managed in a data center 10 that has no geographical or organizational relationship to the others. The number of servers 100 (data centers 10) is such that at least 2n are available for the n pieces of partial data 510 generated from the important data 500 by (k, n) threshold secret sharing.

The server 100 is a computer system made up of server equipment. Acting as a file server, storage server, or the like, it accepts access from the client terminal 300 and others after user authentication and provides a storage service for data (partial data 510). The server 100 has, for example, a data storage unit 110 consisting of storage devices such as magnetic disks, and an authentication processing unit 120 implemented as a software program. The data storage unit 110 reads and writes the specified data according to instructions from the OS (Operating System) or the like.

The authentication processing unit 120 performs authentication for access to the server 100. As information used during authentication, it holds user information 130 consisting of account information for each user. The user information 130 is implemented as, for example, a database or file table and holds, for each registered user ID, account information such as a user seed 131, which is unique information that differs for each user, and a hashed password 132, which is the password hashed by a predetermined procedure. The authentication processing unit 120 also holds a server seed 140, which is unique information that differs for each server.

In the present embodiment, as described later, the authentication processing unit 120 authenticates with the client terminal 300 by a challenge/response method. That is, in response to an authentication request from a user, it sends the server seed 140, the user seed 131, and a random number or the like serving as the challenge. It then receives from the client terminal 300, as the response, the password hashed using these values (a hash value), and authenticates by comparing the received hash value with the result of hashing the hashed password 132 with the random number. The authentication processing unit 120 therefore implements a random number generation function and a hash algorithm. Various known techniques and algorithms can be used for these implementations. When, for example, the communication path between the server 100 and the client terminal 300 is otherwise secured, a method other than challenge/response may be adopted.

The master server 200 is a computer system made up of server equipment, a PC, or the like, and generates and supplies the user seeds 131 and server seeds 140 held by each server 100. It is not a so-called authentication server that authenticates on behalf of the servers 100, and so has no user authentication function. The master server 200 has, for example, a seed generation unit 210 implemented as a software program. The seed generation unit 210 generates a seed in response to an instruction from an administrator or the like or a request from a server 100, and supplies it to the target server 100 over the network 400 as a user seed 131 or a server seed 140.

The seed generation method and seed format are not particularly limited; for example, a unique character string or binary data of a predetermined length can be generated and used as a seed. The master server 200 may be installed in a data center 10 independent of the servers 100, or in the same data center 10 as one of the servers 100 in a configuration that is accessible from outside.

The client terminal 300 has a function of generating multiple pieces of partial data 510 from the important data 500 using a secret sharing technique, generating a copy of each piece of partial data 510, distributing the pieces and copies to the servers 100 (data centers 10) so that no two go to the same server, and storing them in distributed fashion in the data storage units 110. The client terminal 300 has, for example, a division processing unit 310, a distribution management unit 320, a restoration processing unit 330, an authentication request unit 340, and an interface unit 350, each implemented as a software program, as well as a distribution status 321 table and a setting information 301 table, each implemented as a database, file table, or the like.

The division processing unit 310 takes important data 500 that the user has designated for secure storage through the interface unit 350 described later and, following the settings in the setting information 301, generates by secret sharing the multiple pieces of partial data 510 to be stored across the servers 100. As described above, the secret sharing method is not particularly limited, and any known (k, n) threshold secret sharing method can be used. The setting information 301 can hold in advance, for example, information identifying the secret sharing algorithm to use and parameters such as k and n.
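As an illustration of what a (k, n) threshold scheme does, the following is a minimal sketch using Shamir's secret sharing over a prime field. The choice of Shamir's scheme, the field size, and the function names are assumptions made for this example; the text above does not prescribe any particular algorithm.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; the field size is an assumption of this sketch

def split(secret, k, n):
    """Split an integer secret (< PRIME) into n shares; any k of them restore it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be in [0, PRIME)")
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):      # Horner evaluation at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares

def restore(shares):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

shares = split(123456789, k=3, n=4)     # (3,4) threshold, as in the running example
```

Any three of the four shares passed to `restore` return the original integer, while fewer than three do not determine it. A real deployment would also need to encode arbitrary file contents into field elements, which this sketch omits.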

When the important data 500 is stored in distributed fashion, the distribution management unit 320 generates a copy of each piece of partial data 510 produced by the division processing unit 310 through secret sharing, sends the pieces and copies to the servers 100 according to predetermined conditions based on the setting information 301, and stores them there. It also records in the distribution status 321 which server 100 holds each piece of partial data 510, and manages that information.

Various methods are conceivable for deciding which server 100 stores each piece of partial data 510. FIG. 5 shows an example of selecting the servers 100 that store the partial data 510. In the example of FIG. 5, for the four pieces of partial data 510 ("A", "B", "C", "D") generated from each item of important data 500 ("important data α", "important data β", "important data γ", ...) by (3,4) threshold secret sharing, and for their copies ("a", "b", "c", "d"), eight servers 100 are selected as storage destinations from ten servers 100 ("server #1" to "server #10") and assigned.

For example, the servers 100 ("server #1" to "server #10") may be ordered in advance, either randomly or by a priority based on specifications or the like. Servers 100 that are not operating at that time because of a failure or the like (in the example of FIG. 5, "server #6" when "important data γ" is stored) are excluded, and 2n servers 100 are then selected in list order. The 2n servers 100 may be selected from the head of the list (for example, "server #1") every time, or, as in the example of FIG. 5, the starting point of the selection may be shifted for each item of important data 500 so that the selected servers 100 rotate.

Rotating the selected servers 100 makes the distribution of partial data 510 differ from one item of important data 500 to another. As a result, when partial data 510 becomes unobtainable from several servers 100 because of failures or the like (in the example of FIG. 5, the four shaded servers "server #1", "server #2", "server #5", and "server #6"), the range of important data 500 that can no longer be restored is limited to a part (only "important data α" in the example of FIG. 5), and a situation in which all important data 500 becomes unrecoverable is avoided.
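The effect of rotation can be sketched as follows. This example assumes that the starting point advances by one server per item of important data, that shares are assigned first and copies second in list order, and that "server #6" is skipped when the third item is stored. FIG. 5 is not reproduced here, so the exact layout and all names are assumptions.

```python
def select_servers(servers, down, start, count):
    """Pick `count` servers in list order from index `start`, skipping servers that are down."""
    picked = []
    for i in range(len(servers)):
        candidate = servers[(start + i) % len(servers)]
        if candidate not in down:
            picked.append(candidate)
        if len(picked) == count:
            return picked
    raise RuntimeError("not enough servers available")

def place(servers, down, start, n):
    """Map server -> share index: the n shares first, then their n copies."""
    chosen = select_servers(servers, down, start, 2 * n)
    return {server: i % n for i, server in enumerate(chosen)}

def restorable(placement, failed, k):
    """True if at least k distinct shares remain on servers that have not failed."""
    alive = {share for server, share in placement.items() if server not in failed}
    return len(alive) >= k

servers = list(range(1, 11))                    # "server #1" to "server #10"
placements = {
    "alpha": place(servers, set(), start=0, n=4),
    "beta": place(servers, set(), start=1, n=4),
    "gamma": place(servers, {6}, start=2, n=4),  # server #6 was down when gamma was stored
}
failed = {1, 2, 5, 6}
status = {name: restorable(p, failed, k=3) for name, p in placements.items()}
```

Under these assumptions only the first item loses two shares together with both of their copies, so `status` marks only `"alpha"` as unrecoverable.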

Various methods are also conceivable for assigning the partial data 510 to the 2n selected servers 100. For example, as in the example of FIG. 5, the n pieces of partial data 510 and then the n copies may be assigned in order to the list of 2n servers 100, or each piece of partial data 510 may be assigned at random. To keep a piece of partial data 510 and its copy from being stored on servers 100 (data centers 10) that are geographically close to each other, the list of 2n servers 100 may, for example, be divided in advance into two groups based on geographical distance, with the partial data 510 assigned to servers 100 in one group and the copies to servers 100 in the other.
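The two-group assignment can be sketched as below. How servers are divided into geographic groups is not specified in the text, so the grouping is taken as given; the server names and the function name are hypothetical.

```python
def assign_by_region(group_a, group_b, n):
    """Place share i on a server in group_a and its copy on a server in group_b.

    group_a and group_b are lists of servers already separated by geographic
    distance (how that separation is computed is outside this sketch).
    """
    if len(group_a) < n or len(group_b) < n:
        raise ValueError("each group needs at least n servers")
    assignment = {}
    for i in range(n):
        assignment[group_a[i]] = ("share", i)
        assignment[group_b[i]] = ("copy", i)
    return assignment

east = ["east-1", "east-2", "east-3", "east-4"]   # hypothetical server names
west = ["west-1", "west-2", "west-3", "west-4"]
assignment = assign_by_region(east, west, n=4)
```

With this layout a regional outage that takes down one whole group still leaves one full set of n distinct pieces in the other group.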

The setting information 301 can hold in advance, for example, access information for each server 100 that serves as a storage destination (IP address, host name, and so on) and the criteria and conditions for selecting 2n servers 100 from the available servers 100 and assigning the partial data 510 and their copies to them (for example, server 100 priorities, an ordered list, or the rotation method).

When the restoration processing unit 330 described later restores the important data 500, the distribution management unit 320, at the request of the restoration processing unit 330, collects from the servers 100 the m distinct pieces of partial data 510 (or their copies) needed to restore the important data 500 and passes them to the restoration processing unit 330. It does so according to the contents of the distribution status 321 and predetermined conditions based on the setting information 301.

The number m of pieces of partial data 510 to collect must be at least the threshold k required to restore the important data 500, and all n pieces of partial data 510 may be collected (k ≦ m ≦ n). The setting information 301 can hold in advance, for example, the value of m; the criteria and conditions for selecting the m target servers 100 when m < n; and the method for choosing a substitute server 100 when partial data 510 cannot be obtained from a target server 100 because of a failure or the like (for example, whether to fetch the copy of the partial data 510 that could not be obtained or to fetch a different piece of partial data 510).
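One possible fallback policy can be sketched as follows: try the server holding a piece, then the server holding its copy, and move on to a different piece only if both fail. The policy, the `fetch` callback, and all names are assumptions for illustration; the text leaves the policy to the setting information 301.

```python
def collect_shares(locations, fetch, m):
    """Return {share_index: data} containing m distinct shares.

    locations: dict mapping share_index -> servers to try in order
               (the holder of the share, then the holder of its copy).
    fetch:     callable(server, share_index) -> bytes, raising OSError when
               the server is unreachable (a hypothetical transport function).
    """
    collected = {}
    for share_index, holders in locations.items():
        if len(collected) == m:
            break
        for server in holders:
            try:
                collected[share_index] = fetch(server, share_index)
                break               # got this share; skip its duplicate
            except OSError:
                continue            # fall back to the next holder of the same share
    if len(collected) < m:
        raise RuntimeError(f"only {len(collected)} distinct shares available, need {m}")
    return collected

# Usage with a stand-in transport: three of eight servers are down.
down = {"s1", "s2", "s6"}

def fake_fetch(server, share_index):
    if server in down:
        raise OSError(f"{server} unreachable")
    return f"{share_index}@{server}".encode()

locations = {0: ["s1", "s5"], 1: ["s2", "s6"], 2: ["s3", "s7"], 3: ["s4", "s8"]}
result = collect_shares(locations, fake_fetch, m=3)
```

In this run share 0 comes from its copy, share 1 is unavailable from both holders, and shares 2 and 3 fill the remaining slots.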

If, because of a server 100 failure or the like, neither one of the n pieces of partial data 510 nor its copy could be stored on any server 100 during distributed storage, or k or more distinct pieces could not be collected during collection, an error may be returned to the user. In addition, when partial data 510 is exchanged with a server 100, the client terminal 300 and the server 100 may each apply predetermined encryption to the partial data 510 before sending it, further reducing the risk of information leakage.

For important data 500 that the user has asked to view, edit, or otherwise use through the interface unit 350, the restoration processing unit 330 requests and obtains from the distribution management unit 320 at least the number of distinct pieces of partial data 510 required for restoration, and restores the important data 500 from the obtained partial data 510 by the secret sharing method.

The authentication request unit 340 requests authentication from each server 100 when the distribution management unit 320 stores partial data 510 on the servers 100 and when it collects partial data 510 from them. For example, it accepts a user ID and password from the user through a login screen and, as described later, authenticates individually with the authentication processing unit 120 of each server 100, sequentially or in parallel, by a challenge/response method or the like, thereby providing a single sign-on function.

Here, as described later, the authentication request unit 340 hashes the password specified by the user by a predetermined procedure, based on the server seed 140, the user seed 131, and the random number that the authentication processing unit 120 of the server 100 sends in reply to the authentication request, and sends the result to the authentication processing unit 120 of the server 100 to carry out authentication. The authentication request unit 340 therefore implements the same hash algorithm as the authentication processing unit 120 of the server 100.

The interface unit 350 provides the user interface of the client terminal 300, such as screen display, and input/output functions such as data transmission and reception. The user can use the functions of the distributed data storage system 1 through, for example, the file management screen of a general-purpose OS.

For example, the user moves important data to a specific folder or the like on the file management screen with a simple operation such as drag and drop. With this as the trigger, the division processing unit 310 and the distribution management unit 320 automatically generate n pieces of partial data 510 from the important data 500 by (k, n) threshold secret sharing and store each piece of partial data 510 across the servers 100 without the user being aware of it. As described above, the important data 500 is deleted from the client terminal 300 at this point, but so that the user does not notice, a dummy file or the like corresponding to the important data 500 is created and left on the file management screen.

Likewise, the user can view, edit, or otherwise work with the important data 500 by operating on its dummy file or the like, which is managed in a specific folder on the file management screen. That is, with an operation on the dummy file or the like as the trigger, the distribution management unit 320 and the restoration processing unit 330 automatically collect m distinct pieces of partial data 510 (k ≦ m ≦ n) from the servers 100 for the corresponding important data 500, restore the important data 500, and make it available to the user.

<Authentication process>
The authentication process in the distributed data storage system 1 of the present embodiment is described below. As described above, the distributed data storage system 1 of the present embodiment has a single sign-on mechanism so that the user is spared the trouble of authenticating to each server 100 individually when partial data 510 is stored across multiple servers 100 and when it is collected from them.

One way to realize a single sign-on environment is, for example, for the servers 100 to communicate with one another using the SAML (Security Assertion Markup Language) protocol or the like and automatically inherit the result of authentication performed by a specific server such as an authentication server, so that the user does not need to authenticate again at each server 100.

However, a single sign-on environment of this kind presupposes a trust relationship that allows the servers 100 to hand over and accept authentication information, as in an in-house system on an intranet. When, as in the present embodiment, the data centers 10 are geographically and organizationally unrelated, such a trust relationship may not be established for security or other reasons.

Moreover, if the single sign-on method described above is used in such an environment, a malicious insider at one server 100 could, for example, use the authentication information obtained from the authentication server or the like for user authentication to gain unauthorized access to another server 100 and obtain the partial data 510 stored there. In an environment like that of the distributed data storage system 1 of the present embodiment, security for preventing unauthorized acquisition of partial data 510 between servers 100 (data centers 10) must therefore also be considered.

The present embodiment does not use an authentication method in which the result of authentication at a representative authentication server or the like is passed between the servers 100 by the SAML protocol or the like. Instead, authentication is performed individually using unique information (a key) that differs from one data center 10 to another. This provides a mechanism for accessing each data center 10 independently and safely, and secures the data centers 10 against one another.

As the initial state for authentication, each server 100 is assumed to hold in advance, as its server seed 140, a seed generated by the seed generation unit 210 of the master server 200. Each user is also assumed to have completed initial registration of account information including a user ID, a password, and so on. At that time, a seed generated by the seed generation unit 210 of the master server 200 for each user ID is stored in the account information as the user seed 131. The password is stored as the hashed password 132, obtained by hashing it with a predetermined hash algorithm using the user seed 131 and the server seed 140 as seed values.

Not storing the password itself prevents the password from leaking. In addition, hashing with a user seed 131 that is unique to each user as the seed value ensures that the hash values differ between users even when, for example, several users happen to choose the same password.

FIG. 6 is a diagram outlining an example of the flow of the authentication process in the present embodiment. First, the user requests authentication (login) through the authentication request unit 340 of the client terminal 300, specifying, for example, a user ID and password on a login screen or the like. The authentication request unit 340 sends an authentication request containing the specified user ID to the server 100 (S01).

On receiving the user ID, the authentication processing unit 120 of the server 100 generates a random number to serve as the challenge in the challenge/response method, obtains the seeds, and sends these to the client terminal 300 (S02). The seeds obtained here, in addition to the random number, are the server seed 140 and the user seed 131 held in the user information 130 for that user ID.

On receiving the server seed 140, the user seed 131, and the random number, the authentication request unit 340 of the client terminal 300 hashes the password specified in step S01 with a predetermined hash algorithm (S03). It then hashes the hash value obtained in step S03 using the user seed 131 as the seed value (S04), and hashes the hash value obtained in step S04 using the server seed 140 as the seed value (S05). Finally, it hashes the hash value obtained in step S05 using the random number as the seed value, which makes the value one-time, and sends the resulting hash value to the server 100 (S06).
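Steps S03 to S06 can be sketched as a chain of seeded hashes. The text does not name the hash algorithm or say how a seed value is combined with the input, so SHA-256 over the concatenation of seed and value is an assumption of this sketch, as are the function names.

```python
import hashlib

def seeded_hash(seed: bytes, value: bytes) -> bytes:
    """Hash `value` with `seed` mixed in (assumed construction: SHA-256 of seed || value)."""
    return hashlib.sha256(seed + value).digest()

def hashed_password(password: str, user_seed: bytes, server_seed: bytes) -> bytes:
    """Steps S03 to S05; registration must use the same chain to derive hashed password 132."""
    h = hashlib.sha256(password.encode("utf-8")).digest()   # S03: hash the password
    h = seeded_hash(user_seed, h)                           # S04: mix in the user seed
    return seeded_hash(server_seed, h)                      # S05: mix in the server seed

def response(password: str, user_seed: bytes, server_seed: bytes, nonce: bytes) -> bytes:
    """Step S06: bind the result to the server's random challenge so it is one-time."""
    return seeded_hash(nonce, hashed_password(password, user_seed, server_seed))

first = response("example-password", b"user-seed", b"server-seed", b"nonce-1")
second = response("example-password", b"user-seed", b"server-seed", b"nonce-2")
```

Because the final step depends on the challenge, `first` and `second` differ even though the password and seeds are the same, so a captured response cannot simply be replayed against a fresh challenge.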

The series of hashing steps S03 to S05 above is one example, and another procedure that yields an equivalent result can of course be used, but it must be the same procedure as the hashing applied to the password to obtain the hashed password 132 at prior user registration. In addition, when, for example, the client terminal 300 receives from the server 100 in step S02 an instruction to update the password because it has expired, the password (and the hashed password 132) may be updated as needed before step S03 is executed.

On receiving the hash value, the authentication processing unit 120 of the server 100 obtains the hashed password 132 for the target user ID from the user information 130 (S07) and hashes it using the random number generated in step S02 as the seed value (S08). It then authenticates by comparing the resulting hash value with the hash value received from the client terminal 300 in step S07, and sends the authentication result to the client terminal 300 (S09). That is, authentication succeeds if the two values match and fails if they do not. At this point, other conditions may be added to the decision, for example by obtaining information about the location of the sender, such as the IP address, from the request message sent by the client terminal 300 and checking whether it falls within a predetermined range.
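The server side of steps S02 and S07 to S09 can be sketched as follows. As before, SHA-256 over seed and value stands in for the unspecified seeded hash, and the 16-byte challenge length and the function names are assumptions. The constant-time comparison is an addition of this sketch, not something the text requires.

```python
import hashlib
import hmac
import secrets

def seeded_hash(seed: bytes, value: bytes) -> bytes:
    """Assumed seeded-hash construction: SHA-256 of seed || value."""
    return hashlib.sha256(seed + value).digest()

def make_challenge() -> bytes:
    """S02: a fresh random challenge for each authentication request."""
    return secrets.token_bytes(16)

def verify(stored_hashed_password: bytes, nonce: bytes, received: bytes) -> bool:
    """S08 and S09: hash the stored value with the challenge and compare."""
    expected = seeded_hash(nonce, stored_hashed_password)
    return hmac.compare_digest(expected, received)

# Stand-in values: what the server stored at registration and one challenge.
stored = hashlib.sha256(b"stand-in for hashed password 132").digest()
nonce = make_challenge()
good_response = seeded_hash(nonce, stored)
bad_response = seeded_hash(nonce, hashlib.sha256(b"wrong").digest())
```

The server never needs the plaintext password for this check; it only needs the stored hashed password 132 and the challenge it issued.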

The authentication request unit 340 of the client terminal 300 receives the authentication result (S10) and then, as needed, automatically repeats the above series of steps for the other servers 100 in turn to authenticate with each server 100. Because authentication at each server 100 is independent, the series of steps can also be run concurrently against all the required servers 100. The required servers 100 may be set in advance in, for example, the setting information 301 of the client terminal 300, or they may be the servers 100 that the distribution management unit 320 selected when storing or collecting the partial data 510.
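Running the per-server authentication concurrently can be sketched with a thread pool. The `authenticate` callback stands for the whole S01 to S10 exchange with one server and is hypothetical, as are the server names.

```python
from concurrent.futures import ThreadPoolExecutor

def sign_on_all(servers, authenticate, user_id, password):
    """Authenticate against every server concurrently; return {server: bool}.

    authenticate(server, user_id, password) -> bool is assumed to perform the
    full challenge/response exchange with a single server.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(servers))) as pool:
        futures = {s: pool.submit(authenticate, s, user_id, password) for s in servers}
        return {s: f.result() for s, f in futures.items()}

def fake_authenticate(server, user_id, password):
    # Stand-in for the real exchange: pretend "server-3" rejects the login.
    return server != "server-3"

results = sign_on_all(["server-1", "server-2", "server-3"], fake_authenticate, "alice", "pw")
```

The user enters credentials once, and the client reuses them for each independent exchange, which is what gives the single sign-on behavior without any trust relationship between servers.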

With the above processing, the user can authenticate with every required server 100 by specifying the user ID and password only once.

With the method described above, even if, for example, an administrator of a server 100 or data center 10 obtains account information such as the target user's user seed 131 and hashed password 132 from that server's own user information 130, the information cannot be used to impersonate the user when authenticating to another server 100 (data center 10), and security between the servers 100 is maintained.

This is because the user's hashed password 132 on a given server 100 is hashed with that server's own server seeds 140, whereas the same user's hashed password 132 on another server 100 is hashed with the server seeds 140 of that other server 100, so the two values differ. Therefore, hashing both with the same random number as the seed value does not yield the same hash value, and authentication fails in step S09 of FIG. 6. Moreover, even if the server seeds 140 of the other server 100 are obtained by some means, a hash value equal to the hashed password 132 on that other server 100 cannot be generated without knowing the target user's password.
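The argument above can be checked with a small experiment. The sketch below is illustrative only: SHA-256 over length-prefixed fields stands in for the unspecified hashing procedure, the seed values are made up, and the user seeds are assumed to be identical on both servers, which is the most favourable case for the attacker.

```python
import hashlib
import secrets


def h(*parts: bytes) -> bytes:
    """Assumed hashing procedure: SHA-256 over length-prefixed parts."""
    digest = hashlib.sha256()
    for part in parts:
        digest.update(len(part).to_bytes(4, "big"))
        digest.update(part)
    return digest.digest()


def impersonation_attempt(password: bytes, user_seed: bytes,
                          seed_a: bytes, seed_b: bytes, nonce: bytes):
    """Return (forged_ok, legitimate_ok) for a login against server B."""
    stored_a = h(seed_a, user_seed, password)  # hashed password held by server A
    stored_b = h(seed_b, user_seed, password)  # hashed password held by server B
    expected_b = h(nonce, stored_b)            # value server B computes for comparison

    # An administrator of A knows stored_a but not the password.
    forged = h(nonce, stored_a)
    # The real user knows the password and receives B's seeds in the exchange.
    legitimate = h(nonce, h(seed_b, user_seed, password))
    return forged == expected_b, legitimate == expected_b


if __name__ == "__main__":
    forged_ok, legitimate_ok = impersonation_attempt(
        b"s3cret", b"user-seed", b"server-A-seed", b"server-B-seed",
        secrets.token_bytes(16),
    )
    print(forged_ok, legitimate_ok)
```

The forged response fails because it is derived from a value bound to server A's seeds, while the legitimate response is rebuilt from the password together with server B's seeds.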

As described above, according to the data distributed storage system 1 of an embodiment of the present invention, the partial data 510 is generated from the important data 500 by the secret sharing technique and is distributed and stored across the data centers 10, which prevents leakage of the important data 500 through theft or loss of the client terminal 300. In addition, each data center 10 holds only one piece of partial data 510 per item of important data 500. Therefore, even if a third party breaks into a data center 10 and obtains partial data 510 illegitimately, or an insider such as an administrator of the data center 10 obtains partial data 510, only one piece of partial data 510 can be obtained. Because the important data 500 cannot be restored or inferred from that piece alone, the contents of the important data 500 are not leaked.
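The threshold property relied on here, namely that k or more pieces restore the data while fewer do not, can be illustrated with Shamir's secret sharing. The embodiment does not name a particular scheme and is intended to work with any secret sharing library, so Shamir's scheme, the prime field size, and the function names `split` and `restore` are assumptions chosen for this sketch (Python 3.8 or later is needed for the modular inverse via `pow`).

```python
import secrets
from typing import List, Tuple

PRIME = 2**127 - 1  # a Mersenne prime; the field size is an illustrative choice

Share = Tuple[int, int]


def split(secret: int, k: int, n: int) -> List[Share]:
    """Generate n shares such that any k of them restore the secret."""
    if not 1 <= k <= n < PRIME or not 0 <= secret < PRIME:
        raise ValueError("require 1 <= k <= n and 0 <= secret < PRIME")
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def restore(shares: List[Share]) -> int:
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


if __name__ == "__main__":
    pieces = split(123456789, k=3, n=5)
    print(restore(pieces[:3]) == 123456789)  # any 3 of the 5 shares suffice
```

With fewer than k shares the interpolation still returns a number, but in this scheme every candidate secret remains equally likely, which is why a single piece held by one data center discloses nothing about the important data.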

Furthermore, when the pieces of partial data 510 are transmitted to the data centers 10 for distributed storage, a copy of each piece of partial data 510 is stored in a separate data center 10 so that each piece is duplicated. This makes it possible to greatly improve data availability relative to the increase in required resources, without depending on the functions or specifications of the secret sharing library in use.
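The availability gain from duplication can be estimated with a simple model. The sketch below assumes, purely for illustration, that every server is reachable independently with the same probability `p_server` and that each piece and its copy sit on different servers; neither the model nor the function names come from the embodiment.

```python
from math import comb


def restore_probability(n: int, k: int, p_piece: float) -> float:
    """Probability that at least k of n pieces are obtainable (binomial tail)."""
    return sum(
        comb(n, i) * p_piece**i * (1 - p_piece) ** (n - i)
        for i in range(k, n + 1)
    )


def availability(n: int, k: int, p_server: float, duplicated: bool) -> float:
    """Availability of the important data with or without one copy per piece."""
    # With a copy on a second server, a piece is lost only if both servers are down.
    p_piece = 1 - (1 - p_server) ** 2 if duplicated else p_server
    return restore_probability(n, k, p_piece)


if __name__ == "__main__":
    n, k, p = 4, 3, 0.9
    print(f"without copies: {availability(n, k, p, duplicated=False):.6f}")
    print(f"with copies:    {availability(n, k, p, duplicated=True):.6f}")
```

Under these assumptions, with n = 4, k = 3 and 90% server availability, the chance of being able to restore rises from about 0.948 to about 0.9994, while the stored volume only doubles.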

Furthermore, because authentication uses unique information that differs for each server 100 (the server seeds 140), it is difficult even for an insider such as an administrator of a data center 10 to access another data center 10 with malicious intent. This ensures security against unauthorized acquisition of the partial data 510 across the data centers 10.

Furthermore, a single authentication operation by the user at the client terminal 300 achieves single sign-on to the servers 100, and authentication at the servers 100 can be performed concurrently. Accordingly, in the distributed storage of the partial data 510 generated from the important data 500, and in the collection of at least the number of pieces of partial data 510 needed to restore the important data 500, the time required for authentication is reduced and degradation of response time is suppressed.

Although the invention made by the present inventors has been described specifically on the basis of an embodiment, the present invention is not limited to that embodiment, and it goes without saying that various modifications can be made without departing from the gist of the invention.

INDUSTRIAL APPLICABILITY
The present invention is applicable to a data distributed storage system that generates a plurality of pieces of non-important data from important data by a secret sharing technique and stores them in a distributed manner at a plurality of locations.

DESCRIPTION OF SYMBOLS
1 ... Data distributed storage system,
10, 10a to h ... Data center,
100 ... Server, 110, 110a to h ... Data storage unit, 120 ... Authentication processing unit, 130 ... User information, 131 ... User seeds, 132 ... Hashed password, 140 ... Server seeds,
200 ... Master server, 210 ... Seeds generation unit,
300 ... Client terminal, 301 ... Setting information, 310 ... Division processing unit, 320 ... Distribution management unit, 321 ... Distribution status, 330 ... Restoration processing unit, 340 ... Authentication request unit, 350 ... Interface unit,
400 ... Network,
500 ... Important data, 510a to d ... Partial data.

Claims

A data distributed storage system in which a client terminal generates, from important data, a plurality of pieces of partial data that are non-important data by a secret sharing technique, and stores the pieces of partial data in a distributed manner on a plurality of servers connected via a network, wherein
each of the servers has a data storage unit that stores the partial data received from the client terminal, and
the client terminal has: a division processing unit that, for important data that a user has instructed to be stored, generates by the secret sharing technique n pieces (k ≦ n) of partial data from which the important data cannot be restored unless k or more pieces are collected;
a distribution management unit that stores the n pieces of partial data generated by the division processing unit and n copies of the respective pieces of partial data in the data storage units of 2n servers, the 2n servers being selected from 2n or more of the servers by rotating the selection in sequence for each item of important data to be stored, and that collects m (k ≦ m ≦ n) different pieces of the partial data or copies of the partial data for restoring the important data from m of the servers; and
a restoration processing unit that, for important data that the user has instructed to be used, restores the important data by the secret sharing technique from the m different pieces of partial data or copies of the partial data acquired from the distribution management unit.

The data distributed storage system according to claim 1, wherein
each of the servers further has an authentication processing unit that performs authentication processing for access to that server,
the client terminal further has an authentication request unit that, when the partial data or the copies of the partial data are stored in the servers and when the partial data or the copies of the partial data are collected from the servers, receives designation of a user ID and a password from the user and sends authentication requests to the servers sequentially or in parallel,
the authentication processing unit of the server holds server seeds, which are unique information that differs for each server, and user information that holds, for each registered user ID, account information including user seeds, which are unique information that differs for each user, and a hashed password obtained by hashing the user's password by a predetermined procedure using the server seeds and the user seeds, and, in response to the authentication request received from the client terminal, transmits the server seeds, the user seeds of the target user, and a generated random number to the client terminal,
the authentication request unit of the client terminal hashes the password designated by the user by the predetermined procedure using the server seeds and the user seeds received from the server, further hashes the result using the random number, and sends the resulting hash value to the server, and
the authentication processing unit of the server performs authentication by comparing the hash value received from the client terminal with a value obtained by hashing the hashed password of the target user using the random number, and transmits the authentication result to the client terminal.

The data distributed storage system according to claim 2,
further comprising a master server that is connected to the network and that, in response to a request from each server, generates and provides a seed value serving as the server seeds of that server.

The data distributed storage system according to any one of claims 1 to 3, wherein
the distribution management unit of the client terminal, when storing the n pieces of partial data and the n copies of the partial data in the 2n servers, records in a distribution status recording unit information on which of the servers each piece of partial data and each copy of the partial data has been stored in.

The data distributed storage system according to any one of claims 1 to 4, wherein
the client terminal has setting information in which at least one of the following is set in advance: the values of k, m, and n for the secret sharing technique; access information for each server; conditions for selecting the 2n servers in which the distribution management unit stores the n pieces of partial data and the n copies of the partial data; conditions for selecting the m servers from which the distribution management unit collects the m different pieces of partial data or copies of the partial data; and a method for determining an alternative server when the distribution management unit cannot acquire the partial data or the copy of the partial data from a server.

The data distributed storage system according to any one of claims 1 to 5, wherein
the distribution management unit of the client terminal returns an error to the user when, in storing the n pieces of partial data and the n copies of the partial data in the 2n servers, both a piece of partial data and the copy of that partial data could not be stored in the servers, or when, in collecting the different pieces of partial data or copies of the partial data from the servers, k or more could not be collected.

The data distributed storage system according to any one of claims 1 to 6, wherein
the client terminal and each server, when transmitting or receiving the partial data or the copy of the partial data, encrypt the partial data or the copy of the partial data to be transmitted by predetermined means.