JP2003140672A

JP2003140672A - Phoneme business system

Info

Publication number: JP2003140672A
Application number: JP2001340687A
Authority: JP
Inventors: Kazunori Hayashi; 和典林; Masaru Mase; 優間瀬
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2001-11-06
Filing date: 2001-11-06
Publication date: 2003-05-16

Abstract

PROBLEM TO BE SOLVED: To enable the copyright owner of a phoneme to obtain a copyright fee corresponding to the use of the phoneme, and to enable users of a service that uses the phoneme to receive the service easily. SOLUTION: A phoneme service system comprises a phoneme input means for inputting phonemes; a copyright owner registering means for registering the copyright owner of each phoneme; a voice synthesizing means that analyzes the data to be synthesized (for example, the text data of a document) using a database of phonemes generated by the phoneme input means, extracts and connects the optimum phonemes for each piece of data, and calculates the usage amount of the phonemes; and a selling means for selling the phoneme-based service to users in response to their service requests. The system further comprises a copyright fee calculating means for calculating the copyright fee for each phoneme copyright owner according to the usage amount calculated by the voice synthesizing means.

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a phoneme business system that uses phonemes, which are the minimum constituent elements of speech.

[0002]

[Description of the Related Art] In recent years, personal computers have come to be equipped with a function for converting text data, such as e-mail and word-processor documents, into speech. However, a function that merely converts text data into speech has problems such as poor cost performance. In addition, the available output voices are generic ones, such as a male or a female voice, and are not necessarily the voice the user wants, so users found it hard to enjoy listening to them.

[0003] Japanese Unexamined Patent Publication No. 7-140999 discloses a speech synthesis apparatus and a speech synthesis method capable of generating synthesized speech close to human speech. Specifically, accent command values and/or phoneme duration information are prepared in advance in a dictionary together with information such as kana readings and accent types; a parameter sequence of phoneme-segment data is generated using the phoneme durations, and a speech waveform is synthesized from it, producing synthesized speech that is much closer to human speech. Japanese Unexamined Patent Publication No. 11-143483 discloses a system concerning the generation of synthesized speech when using a personal computer, word processor, game machine, or the like, and in particular a means by which the user can freely choose from a variety of synthesized voices.

[0004] As a means of solving these problems, reading systems have been devised that use a phoneme database created by sampling the natural voice of a speaker.

[0005]

[Problems to Be Solved by the Invention] Phonemes collected from a real person carry an individuality unique to the speaker, and a right similar to copyright should be recognized for them. Conventional reading systems, however, have no means of recognizing that right (hereinafter referred to as copyright). The copyright ought to be recognized, and a consideration commensurate with use (a copyright fee) ought to be paid to the copyright owner of the phonemes, but conventional reading systems pay no copyright fee even when the phonemes are used.

[0006] When phonemes collected from a real person are used, the right holder is disadvantaged because no copyright fee is paid to the copyright owner. In addition, because there was no selling means, it was difficult for users to receive the service easily. These shortcomings could therefore become obstacles to developing businesses that use phonemes. Nor was there a selling means that would let users receive the service from the system easily.

[0007]

[Means for Solving the Problems] In view of the above problems, when phonemes are used, there is a need for a means of paying the copyright owner of the phonemes a copyright fee according to their use, and for a selling means through which users can easily receive the service. To achieve this, the present invention defines the phoneme as the minimum constituent element of speech and provides a phoneme business system comprising: phonemes having individuality; a phoneme capturing means for capturing the phonemes; a copyright holder registration means for registering the copyright owner of the phonemes; a speech synthesis means that analyzes the data to be synthesized (for example, text data such as sentences) using the phoneme database generated by the phoneme capturing means, extracts and connects the optimum phonemes for each piece of data, and calculates the usage amount of the phonemes; a delivery means for providing the user with the synthesized speech data produced by the speech synthesis means; a copyright fee calculation means for calculating a copyright fee for each phoneme copyright owner according to the usage amount calculated by the speech synthesis means; a money payment means for paying the copyright fee to the phoneme copyright owner based on the copyright fee calculation information; a selling means for selling phoneme-based services to users; a synthesis target data recording means for recording the data to be synthesized; a phoneme database recording means for recording the phoneme database created by the phoneme capturing means; and a user interface means that accepts service requests from users and serves as the interface between the user and the system.

[0008]

[Embodiments of the Invention] The invention according to claim 1 defines the phoneme as the minimum constituent element of speech and is a phoneme business system comprising: phonemes having individuality; a phoneme capturing means for capturing the phonemes; a copyright holder registration means for registering the copyright owner of the phonemes; a speech synthesis means that analyzes the data to be synthesized (for example, text data such as sentences) using the phoneme database generated by the phoneme capturing means, extracts and connects the optimum phonemes for each piece of data, and calculates the usage amount of the phonemes; a delivery means for providing the user with the synthesized speech data produced by the speech synthesis means; a copyright fee calculation means for calculating a copyright fee for each phoneme copyright owner according to the usage amount calculated by the speech synthesis means; a money payment means for paying the copyright fee to the phoneme copyright owner based on the copyright fee calculation information; a selling means for selling phoneme-based services to users; a synthesis target data recording means for recording the data to be synthesized; a phoneme database recording means for recording the phoneme database created by the phoneme capturing means; and a user interface means that accepts service requests from users and serves as the interface between the user and the system. With this system, the copyright owner of the phonemes obtains a copyright fee according to the use of the phonemes, and users of phoneme-based services can receive the services easily.

[0009] (Embodiment) An embodiment of the phoneme business system of the present invention is described below with reference to FIGS. 1 to 4.

[0010] FIG. 2 is a schematic explanatory diagram of the phoneme business system of the present invention. An outline of the system follows. (201) is the phoneme business system of the present invention, which sells phoneme-based services to users and pays the copyright owner of the phonemes a copyright fee according to their use. (202) is a phoneme provider, who provides phonemes to the phoneme business system. (203) is a general user, who receives phoneme-based services from the phoneme business system. (204) is a content provider, such as a company, an administrative body such as a city hall, an educational institution such as a school, or a religious organization, that receives phoneme-based services from the phoneme business system and in turn provides services such as audio information to general users.

[0011] When a phoneme provider provides phonemes to the system, the system registers the copyright owner of the provided phonemes (205). Next, a content provider or a general user requests from the system, via a network, telephone, fax, mail, word of mouth, or a combination of these, a speech synthesis service that renders speech synthesis target data in a desired voice character (206). Speech synthesis target data is data containing the text to be synthesized into speech; its format is not limited. Its content may be, for example, news, administrative guidance, textbooks, text such as novels recorded in the system in advance, text written by the user, a personal history, a drama, or a regional dialect. The phoneme business system performs speech synthesis using the phoneme database of the voice character requested by the user, delivers the synthesized speech data to the user either via a network or recorded on a recording medium such as an optical disk, magnetic disk, or semiconductor memory and sent by mail or by hand, and collects the fee for the service (207).

[0012] A general user can listen to synthesized speech in the desired voice character by loading the delivered synthesized speech data into a terminal device equipped with a synthesized speech data input means and an audio output means, and playing it back. The synthesized speech data input means is, for example, a network interface such as a modem, or a data input means for a storage medium such as an optical disk, magnetic disk, or semiconductor memory. The audio output means is a speaker, headphones, earphones, or the like. A content provider records the delivered synthesized speech data on such a recording medium in preparation for service requests from general users. A general user requests news, administrative guidance, or the like in a character voice from the content provider via a network, telephone, fax, mail, word of mouth, or a combination of these (208), and the content provider delivers the requested service to the general user via a network, or recorded on a recording medium such as an optical disk, magnetic disk, or semiconductor memory and sent by mail or by hand (209). The general user then loads the delivered synthesized speech data by the means described above and can listen to the synthesized speech. Finally, the system pays a copyright fee to the copyright owner of the phonemes used, according to the use of those phonemes in the service (210). This concludes the outline of the system.

[0013] The system is now described in detail. FIG. 1 is a block diagram of the phoneme business system of the present invention. (101) is the natural voice uttered by a phoneme registrant, and (102) is the phoneme capturing means, which extracts phonemes from the uttered voice and builds a database from them. (103) is the copyright holder registration means, which registers the copyright owner of the phonemes captured by the phoneme capturing means. (104) is the speech synthesis means, which uses the phoneme database generated by the phoneme capturing means, analyzes the data to be synthesized, combines the optimum phonemes to produce the speech, and also calculates the usage amount of the phonemes.

[0014] (105) is the copyright fee calculation means, which calculates a copyright fee for each phoneme copyright owner according to the usage amount calculated by the speech synthesis means. (106) is the money payment means, which pays the copyright fee to the phoneme copyright owner based on the calculation information from the copyright fee calculation means. (107) is the delivery means for providing the user with the synthesized speech data produced by the speech synthesis means; it provides the data via a network such as the Internet, or recorded on a recording medium such as an optical disk, magnetic disk, or semiconductor memory and sent by mail or by hand. (108) is the selling means, which sells phoneme-based services to users.

[0015] (109) is the synthesis target data recording means for recording the data to be synthesized, for example the text data of novels or other writing, and (110) is the phoneme database recording means, which records the phoneme database created by the phoneme capturing means. The synthesis target data recording means and the phoneme database recording means are optical disks, magnetic disks, semiconductor memories, or the like, but are not limited to these as long as data can be recorded. (111) is the user interface means, which accepts service requests from users and serves as the interface between the user and the system. The user interface means may be a web system used on the Internet, or it may operate by telephone, fax, mail, or directly in person.

[0016] The operation is described next. The operation of the system falls broadly into two parts: one runs from capturing the natural voice to accumulating the phonemes; the other runs from use of the accumulated phonemes through sale, calculation of the amount owed to the copyright owner, and payment of the copyright fee.

[0017] The phoneme accumulation operation is described first. FIG. 3 is a flowchart of the operation up to phoneme accumulation in the phoneme business system of the present invention. When a phoneme registrant speaks, the phoneme capturing means, which is equipped with a microphone or the like, builds a database from the uttered voice in an arbitrary format and records it in the phoneme database recording means (s301). Next, the copyright holder registration means registers the copyright owner of the phonemes captured by the phoneme capturing means (s302). Steps (s301) and (s302) may be performed in either order. This completes the operation up to phoneme accumulation.
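Steps (s301) and (s302) can be pictured with a minimal Python sketch. The class name PhonemeDatabase, its fields, and the example values are hypothetical and are not taken from the patent, which leaves the database format open ("an arbitrary format"). The sketch only shows that storing phonemes and registering their owner are independent operations that may run in either order.

```python
from dataclasses import dataclass, field


@dataclass
class PhonemeDatabase:
    """Hypothetical in-memory stand-in for the phoneme database recording means (110)."""
    units: dict = field(default_factory=dict)   # (character, label) -> waveform samples
    owners: dict = field(default_factory=dict)  # character -> copyright owner

    def store_phoneme(self, character, label, samples):
        # s301: record a captured phoneme under its voice character
        self.units[(character, label)] = list(samples)

    def register_owner(self, character, owner):
        # s302: register the copyright owner of that character's phonemes
        self.owners[character] = owner

    def owner_of(self, character):
        # Returns None when no owner has been registered yet
        return self.owners.get(character)


db = PhonemeDatabase()
db.register_owner("narrator_a", "Owner A")              # s302 may precede s301
db.store_phoneme("narrator_a", "ka", [0.0, 0.2, -0.1])  # made-up sample values
```

Keying the owner by voice character is one possible design; a system in which individual phonemes of one character have different owners would key the owner by (character, label) instead.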

[0018] FIG. 4 is a flowchart of the operation from acceptance of a phoneme-based service request through sale and payment of the copyright fee in the phoneme business system of the present invention. When a user, that is, a content provider or a general user, specifies a desired voice character and the data to be synthesized and requests the speech synthesis service from the system via a network such as the Internet, telephone, fax, mail, word of mouth, or a combination of these, the user interface means of the system accepts the service request (s401). The data to be synthesized that the user specifies is either data recorded in advance in the synthesis target data recording means inside the system, or data that the content provider or general user submits to the system for speech synthesis. Data submitted by content providers and general users for speech synthesis is recorded in the synthesis target data recording means inside the system.

[0019] Next, the selling means interprets the content of the user's service request accepted by the user interface means and calculates the fee for the requested service. It then presents the fee to the user through the user interface means and, once the user has agreed, collects the fee (s402). The fee may be collected on several bases, for example: the number of voice characters provided to the user; the quality (market value) of the voice character; the amount of phoneme data for each character; the number of items or amount of data to be synthesized using the phonemes; or the number of items or amount of synthesized data. Combinations of these factors are of course also possible. The fee need not be collected at this stage; user information such as the user name and credit card number may be recorded and charged, with the fee collected at a later date.
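As an illustration of how the charging factors in (s402) might be combined, the following Python sketch adds up three of them. The function name service_fee, its parameters, and every rate are hypothetical; the patent names the factors but gives no formula or prices.

```python
def service_fee(num_characters, character_rate, input_chars, per_char_rate,
                output_seconds=0, per_second_rate=0):
    """Combine several charging factors into one quote (all rates are assumptions)."""
    fee = num_characters * character_rate     # per voice character; rate may reflect market value
    fee += input_chars * per_char_rate        # amount of data to be synthesized
    fee += output_seconds * per_second_rate   # amount of synthesized audio
    return fee


# One voice character at 500 per character, 1200 input characters at 1 each
quote = service_fee(num_characters=1, character_rate=500,
                    input_chars=1200, per_char_rate=1)
```

A simple sum is only one way to combine the factors; a real tariff could equally use tiers or a per-character multiplier.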

[0020] Next, the speech synthesis means reads the data to be synthesized specified by the user from the synthesis target data recording means, converts it where necessary into data that can be synthesized, and analyzes it sequentially. For each piece of data it reads the most suitable phoneme data from the phoneme database of the voice character specified by the user and connects them to create synthesized speech data. Where necessary, the synthesized speech data is converted into the data format best suited to the delivery means and to the terminal device used by the user (s403). The delivery means then delivers the synthesized speech data created by the speech synthesis means to the user (s404).
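The extract-and-connect step of (s403) can be sketched in Python as follows. The patent does not say how the "most suitable" phoneme is chosen, so this sketch substitutes a plain greedy longest-match over unit labels as a stand-in. The function name synthesize, the dictionary layout, and the integer "samples" are all assumptions made for illustration.

```python
def synthesize(text, unit_db):
    """Greedy longest-match unit selection; returns (samples, labels of units used)."""
    max_len = max(len(label) for label in unit_db)
    samples, used, i = [], [], 0
    while i < len(text):
        # Try the longest candidate first, then progressively shorter ones
        for size in range(min(max_len, len(text) - i), 0, -1):
            label = text[i:i + size]
            if label in unit_db:
                samples.extend(unit_db[label])  # connect the unit's waveform
                used.append(label)
                i += size
                break
        else:
            raise KeyError(f"no unit covers {text[i]!r}")
    return samples, used


# Toy database for one voice character: unit label -> waveform samples
unit_db = {"a": [1], "ki": [2, 3], "aki": [4, 5, 6]}
samples, used = synthesize("akia", unit_db)
```

Returning the list of labels alongside the samples keeps a record of which units were consumed, which is the information a later usage calculation would need.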

[0021] A general user can listen to the text read aloud in the desired voice character by loading the synthesized speech data delivered from the system into a terminal device equipped with a synthesized speech data input means and an audio output means, and playing it back. A content provider records the delivered synthesized speech data on the recording medium described above in preparation for service requests from general users. Next, the speech synthesis means calculates the usage amount of the phonemes used in the speech synthesis (s405). Although the usage amount of phonemes is used here, the usage amount of the data to be synthesized or of the synthesized speech may be used instead. The usage amount may of course also be measured as an amount of data or as synthesis time.
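The usage calculation of (s405) can be sketched in Python as a per-owner tally. The function name tally_usage, the unit-to-owner mapping, and the choice of "number of units" as the measure are assumptions; as the paragraph notes, data amount or synthesis time could serve as the measure instead.

```python
from collections import Counter


def tally_usage(used_units, unit_owner):
    """Count how many units belonging to each copyright owner one synthesis job used."""
    usage = Counter()
    for label in used_units:
        usage[unit_owner[label]] += 1
    return usage


# Hypothetical mapping from unit label to registered copyright owner
unit_owner = {"a": "Owner A", "ki": "Owner A", "n": "Owner B"}
usage = tally_usage(["a", "ki", "n", "a"], unit_owner)
```

To measure by data amount rather than by count, the increment would be the length of each unit's waveform instead of 1.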

[0022] Next, the copyright fee calculation means calculates the copyright fee according to the usage amount, based on the usage calculation result from the speech synthesis means (s406). Based on this calculation information, the money payment means then pays the copyright fee to the copyright owner of the phonemes (s407).
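Steps (s406) and (s407) can be sketched in Python as follows. The flat per-unit rate, the function names copyright_fees and settle, and the payment callback are hypothetical; the patent states only that the fee depends on the usage amount and is paid to each copyright owner.

```python
def copyright_fees(usage_by_owner, rate_per_unit):
    """s406: fee owed to each owner, proportional to usage (the rate is an assumption)."""
    return {owner: count * rate_per_unit for owner, count in usage_by_owner.items()}


def settle(fees, pay):
    """s407: hand each non-zero fee to a payment callback, in a stable owner order."""
    for owner, amount in sorted(fees.items()):
        if amount > 0:
            pay(owner, amount)


payments = []  # stands in for the money payment means (106)
fees = copyright_fees({"Owner A": 3, "Owner B": 1, "Owner C": 0}, rate_per_unit=2)
settle(fees, lambda owner, amount: payments.append((owner, amount)))
```

Passing the payment action in as a callback keeps the fee arithmetic separate from whatever transfer mechanism an actual system would use.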

[0023]

[Effects of the Invention] With the system of the present invention, the copyright owner of the phonemes obtains a copyright fee according to the use of the phonemes, and users of phoneme-based services can receive the services easily. Businesses that use phonemes may therefore develop significantly.

[Brief description of drawings]

FIG. 1 is a block diagram of the phoneme business system of the present invention.

FIG. 2 is a schematic explanatory diagram of the phoneme business system of the present invention.

FIG. 3 is a flowchart of the operation up to phoneme accumulation in the phoneme business system of the present invention.

FIG. 4 is a flowchart of the operation from acceptance of a phoneme-based service request through sale and payment of the copyright fee in the phoneme business system of the present invention.

[Explanation of symbols]

(101) Natural voice uttered by a phoneme registrant
(102) Phoneme capturing means
(103) Copyright holder registration means
(104) Speech synthesis means
(105) Copyright fee calculation means
(106) Money payment means
(107) Delivery means
(108) Selling means
(109) Synthesis target data recording means
(110) Phoneme database recording means
(111) User interface means

Continuation of front page: (51) Int. Cl.7: G10L 13/00, 13/06. FI: G10L 3/00 E, 5/04 Z.

Claims


1. A phoneme business system in which the phoneme is defined as the minimum constituent element of speech, the system comprising: phonemes having individuality; a phoneme capturing means for capturing the phonemes; a copyright holder registration means for registering the copyright owner of the phonemes; a speech synthesis means that analyzes the data to be synthesized (for example, text data such as sentences) using the phoneme database generated by the phoneme capturing means, extracts and connects the optimum phonemes for each piece of data, and calculates the usage amount of the phonemes; a delivery means for providing the user with the synthesized speech data produced by the speech synthesis means; a copyright fee calculation means for calculating a copyright fee for each phoneme copyright owner according to the usage amount calculated by the speech synthesis means; a money payment means for paying the copyright fee to the phoneme copyright owner based on the copyright fee calculation information; a selling means for selling phoneme-based services to users; a synthesis target data recording means for recording the data to be synthesized; a phoneme database recording means for recording the phoneme database created by the phoneme capturing means; and a user interface means that accepts service requests from users and serves as the interface between the user and the system.

2. The phoneme business system according to claim 1, wherein the phoneme is a sound, such as "a", "i", "ka", or "ki", composed of a combination of vowels and consonants.

3. The phoneme business system wherein the phoneme is a single sound that is the minimum unit of continuous speech (for example, "aki" is composed of the single sounds "a", "k", and "i").

4. The phoneme business system according to claim 1, wherein the phonemes are words.

5. The phoneme business system according to claim 1, wherein the phonemes are clauses and sentences.

6. The phoneme business system according to claim 1, wherein the phonemes are onomatopoeic words (words imitating sounds or voices) and mimetic words.

7. The phoneme business system according to claim 1, wherein the phoneme is digitally synthesized speech.

8. The reading system according to claim 1, wherein the synthesized voice data input means of the terminal device is a memory card, a storage device such as an optical disk and a magnetic disk, or a network interface such as a modem.