JPH0378083A - Double precision arithmetic system and arithmetic unit for sum of products

Info

Publication number: JPH0378083A
Application number: JP21466089A
Authority: JP
Inventors: Akira Kikuchi; 明菊地
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1989-08-21
Filing date: 1989-08-21
Publication date: 1991-04-03

Abstract

PURPOSE: To decrease the number of arithmetic cycles by adding digit-aligned partial products to each other in a single arithmetic cycle while shift processing is applied to a partial product. CONSTITUTION: Data read out onto data buses 4 and 5 can be supplied in parallel to a multiplier 6 and an arithmetic logic unit 7. The result of the multiplier 6 is held in a buffer register 8 for one instruction cycle period and then applied to the arithmetic logic unit 7. The arithmetic logic unit 7 performs addition, subtraction, and similar operations on data selectively applied from the register 8 or from a data bus 3 through a selector 15; the result is first held in accumulator 9A or 9B and then returned to the bus 3. When an addition in the arithmetic logic unit 7 generates a carry, that state is held in carry flag 16A or 16B and transmitted at a prescribed timing as the carry signal for a prescribed input bit of the arithmetic logic unit 7. The number of arithmetic cycles can thus be reduced.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Field of industrial application]

The present invention relates to arithmetic techniques in microprocessors, in particular to a double-precision arithmetic method and a multiply-accumulate (sum-of-products) unit, and to techniques that are effective when applied to, for example, a digital signal processor.

[Prior art]

Digital signal processing is used for filtering, equalization, modulation, Fourier transforms, extraction of signal characteristic parameters, prediction, image enhancement, and the like. A digital signal processor, a dedicated processor built for such processing, has a distinctive architecture designed to deliver high processing capability in this application field: the instruction control system is separated from the arithmetic execution system so that the instruction fetch, instruction decode, and execution cycles can be processed in parallel as a pipeline, and a multiplier and an adder are provided separately so that the frequently used multiply-accumulate operation can be executed in parallel. In such a processor, arithmetic speed and arithmetic precision trade off against each other: floating-point arithmetic is desirable for higher precision but sacrifices speed, whereas fixed-point arithmetic achieves high speed but tends to sacrifice precision. One approach is therefore to secure real-time processing with fixed-point arithmetic and to perform double-precision arithmetic only where high precision is required. For example, when the single-precision data word length is 16 bits and 32-bit data is multiplied by 32-bit data in double precision to obtain higher precision, three kinds of operation are needed: multiplications that produce the partial products of single-precision 16-bit data, shift operations that align the digits (binary point positions) of the partial products, and additions of the digit-aligned partial products. Conventionally this shift is performed as an arithmetic shift in the arithmetic logic unit, by executing an arithmetic shift instruction that is separate from the add instruction.

An example of a document describing a digital signal processor is the "Hitachi Digital Signal Processor HD61810B User's Manual" (published by Hitachi, Ltd., January 1983).

[Problem to be solved by the invention]

However, when a double-precision multiplication performed to obtain precision beyond the single-precision word length requires single-precision multiplications for the partial products, digit (binary point) alignment of the partial products, and addition of the digit-aligned partial products, the number of arithmetic cycles increases markedly compared with single-precision multiplication, and the demand for high-speed operation cannot be met. For example, multiplying 32-bit data AB (A is the upper 16 bits, B the lower 16 bits) by 32-bit data CD (C is the upper 16 bits, D the lower 16 bits) in double precision requires four multiplication cycles to obtain the 32-bit partial products B×D, A×D, B×C, and A×C; one cycle to add the partial products A×D and B×C, whose binary point positions coincide; one shift cycle between that sum and the partial product B×D, followed by one cycle to add the two; and one shift cycle between the sum of A×D and B×C and the partial product A×C, followed by one cycle to add the two. This amounts to nine cycles in total, nine times the number of arithmetic cycles of a single-precision multiplication.
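The nine-cycle count described above can be reproduced with a short sketch. It assumes unsigned operands, counts one cycle per multiply, shift, or add, and assumes that the final shift-and-add pair folds in the partial product A×C. The function name `conventional_double_multiply` is hypothetical, and a plain Python integer stands in for the wide result, so carry handling between 32-bit words is not modelled here.

```python
MASK16 = 0xFFFF


def conventional_double_multiply(ab: int, cd: int) -> tuple[int, int]:
    """Multiply two unsigned 32-bit values one operation per cycle.

    Returns (product, cycles). Each multiply, shift, and add is charged
    one cycle, mirroring the conventional sequence in the text.
    """
    a, b = (ab >> 16) & MASK16, ab & MASK16  # upper / lower halves of AB
    c, d = (cd >> 16) & MASK16, cd & MASK16  # upper / lower halves of CD
    cycles = 0

    # Four multiplication cycles, one per partial product.
    bd = b * d
    ad = a * d
    bc = b * c
    ac = a * c
    cycles += 4

    # One addition cycle: A*D and B*C have the same weight (2**16).
    acc = ad + bc
    cycles += 1

    # Separate shift cycle, then an addition cycle, to fold in B*D.
    acc <<= 16
    cycles += 1
    acc += bd
    cycles += 1

    # Separate shift cycle, then an addition cycle, to fold in A*C.
    ac <<= 32
    cycles += 1
    acc += ac
    cycles += 1

    return acc, cycles
```

Under these assumptions the two explicit shift steps account for two of the nine cycles, which is the overhead the invention sets out to remove.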

An object of the present invention is to provide a double-precision arithmetic method that reduces the number of arithmetic cycles in fixed-point arithmetic. Another object is to provide a multiply-accumulate unit that can perform double-precision arithmetic in a small number of arithmetic cycles.

The above and other objects and novel features of the present invention will become apparent from the description in this specification and the accompanying drawings.

[Means to solve the problem]

A brief overview of representative aspects of the invention disclosed in this application is as follows.

That is, a double-precision arithmetic method for multiplying data whose word length is a multiple of the unit word length includes multiplication cycles that obtain partial products of unit-word-length operands and addition cycles that add partial products to each other. Each addition cycle includes, within that single arithmetic cycle, the digit-alignment shift required before the partial products are added, and the shift is performed by instructing a selector provided on one input side of the adder to shift right by the unit word length, shift left by the unit word length, or simply pass the data through.

In this double-precision arithmetic, a carry generated when partial products are added must be transmitted to a predetermined digit and reflected in a later addition. To simplify this carry transmission, the instruction to transmit the carry is preferably included in the relevant add instruction.
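Why a carry has to be saved and transmitted can be seen in a minimal sketch, assuming unsigned operands and a 32-bit adder. The helper `add32` is a hypothetical name used only for illustration; it returns the 32-bit sum together with the carry-out that a carry flag would hold.

```python
MASK32 = 0xFFFFFFFF


def add32(x: int, y: int, carry_in: int = 0) -> tuple[int, int]:
    """Add two 32-bit words; return (32-bit sum, carry-out flag)."""
    total = (x & MASK32) + (y & MASK32) + carry_in
    return total & MASK32, total >> 32


# Two maximal 16x16 partial products overflow a 32-bit adder,
# so the 33rd bit must be kept in a flag for a later addition.
partial = 0xFFFF * 0xFFFF          # 0xFFFE0001
low_word, carry_flag = add32(partial, partial)

# A later addition takes the saved flag as its carry input.
next_word, _ = add32(0, 0, carry_in=carry_flag)
```

If the flag were dropped, the weight carried by that 33rd bit would be lost from the final product.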

To configure a multiply-accumulate unit that performs the above double-precision arithmetic, including a multiplier and an adder capable of multiplying, by multiply-accumulate operations, data whose word length is a multiple of the unit word length, a selector is placed on the transmission path that leads the multiplier result to one input of the adder, or on the transmission path that leads the adder result back to the other input of the adder. Before the adder performs the addition, this selector carries out the shift by selecting a right shift by the unit word length, a left shift by the unit word length, or a simple pass-through.

To allow the addition cycle and the multiplication cycle to execute in parallel, a data buffer is preferably provided that holds the multiplier result for one instruction cycle period and supplies it to the adder.

[Operation]

With the above means, the addition cycle adds digit-aligned partial products in a single arithmetic cycle while shifting a partial product. This eliminates the separate shift cycle for digit alignment and thereby reduces the number of arithmetic cycles in fixed-point double-precision arithmetic.

Likewise, in the multiply-accumulate unit capable of multiplying, by multiply-accumulate operations, data whose word length is a multiple of the unit word length, the selector placed on the data transmission path to the adder aligns the digits of a partial product while the data is in transit to the adder. This eliminates the separate shift cycle for digit alignment and yields a multiply-accumulate unit that performs fixed-point double-precision multiplication in a small number of arithmetic cycles.

[Embodiment]

FIG. 1 shows a digital signal processor according to one embodiment of the present invention. Although not particularly limited, the digital signal processor 1 shown in the figure is formed on a single semiconductor substrate, such as silicon, by known semiconductor integrated circuit manufacturing techniques.

To enable high-speed, real-time processing of multiply-accumulate and similar operations, the digital signal processor 1 of FIG. 1 keeps the instruction control system and the arithmetic system essentially separate, so that instruction fetch, instruction decode, and instruction execution (data transfer, arithmetic, and so on) can be processed in parallel as a pipeline.
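The overlap of fetch, decode, and execute can be pictured with a small scheduling sketch. It assumes three stages of one cycle each with no stalls, which the text does not state explicitly; `pipeline_schedule` is a hypothetical helper, not part of the described processor.

```python
def pipeline_schedule(n_instructions: int) -> list:
    """Return one (fetch, decode, execute) tuple per cycle.

    Each entry is the index of the instruction occupying that stage,
    or None when the stage is idle. Assumes three one-cycle stages.
    """
    if n_instructions <= 0:
        return []

    def stage(index: int):
        # An instruction index is valid only inside the program.
        return index if 0 <= index < n_instructions else None

    schedule = []
    for cycle in range(n_instructions + 2):
        # Instruction i is fetched in cycle i, decoded in i+1, executed in i+2.
        schedule.append((stage(cycle), stage(cycle - 1), stage(cycle - 2)))
    return schedule
```

With four instructions the sketch takes six cycles rather than twelve, because from the third cycle onward all three stages are busy at once.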

The arithmetic system has a data memory 2 comprising a multiport RAM (random access memory) 2A, which stores variable data and the like for digital signal processing such as filtering, and a coefficient ROM (read-only memory) 2B, which stores the coefficient data used in digital signal processing. Multiple data buses 3 to 5 are connected to the respective ports of the data memory 2 so that data can be transferred in parallel, and a multiplier 6 and an arithmetic logic unit 7 are provided separately so that the frequently used multiply-accumulate operation can be executed in parallel. For example, the write port of the multiport RAM 2A is coupled to the data bus 3. The two read ports of the multiport RAM 2A and the two read ports of the coefficient ROM 2B are connected to the data buses 4 and 5, respectively, and data read out onto the data buses 4 and 5 can be supplied in parallel to the multiplier 6 and the arithmetic logic unit 7. The result of the multiplier 6 is held in a buffer register 8 for one instruction cycle period and then supplied to the arithmetic logic unit 7. The arithmetic logic unit 7 performs addition, subtraction, and similar operations on data supplied selectively from the buffer register 8, from the data memory 2, or, through a selector 15, from the data bus 3. The result of the arithmetic logic unit 7 is first held in accumulator 9A or 9B and then returned to the data bus 3.

When an addition in the arithmetic logic unit 7 generates a carry, that state is held in carry flag 16A or 16B and transmitted at a predetermined timing as the carry signal for a predetermined input bit of the arithmetic logic unit 7. The selector 15 and the carry flags 16A and 16B are described in detail later.

An address pointer 10, a status register 11, a control register 12, an input/output circuit 14, and other circuits are also connected to the data bus 3.

The address pointer 10 holds the memory address used to address the multiport RAM 2A and the coefficient ROM 2B; the address information is supplied through the data bus 3 or by information contained in the address field of an instruction. The status register 11 holds flags reflecting the internal state of the digital signal processor 1, such as the data input/output state of the input/output circuit 14 and the interrupt mask state. The control register 12 holds the conditions that control the operation of the digital signal processor 1. The input/output circuit 14 exchanges parallel data DATA and serial data Sin and Sout with external modules, such as a host processor or a serial interface circuit (not shown), and includes input and output registers in parallel or serial form.

Although not particularly limited, the instruction control system comprises: a program counter 20 that holds the address of the instruction to be executed next; a stack register 21 that saves the value of the program counter 20 when an interrupt occurs and a subroutine jump instruction is executed; an instruction ROM 22 that holds the operating program of the digital signal processor 1 as a plurality of microinstruction sequences and is addressed by the output of the program counter 20; an instruction register 23 that fetches the instruction output from the instruction ROM 22; an instruction decoder and controller 24 that decodes the instruction output from the instruction register 23 and generates various internal control signals; and an address control circuit 25.

The address control circuit 25 is supplied with external control information 26, which can be regarded as an external interrupt signal; internal control information 27, which indicates branch conditions; and address information 28 for unconditional branches, such as a branch contained in the instruction held by the instruction register 23. It gives the program counter 20 the branch destination address information determined by these inputs and, when no branch at all is indicated, instructs the program counter 20 to increment.

Although not particularly limited, the digital signal processor 1 of this embodiment performs both floating-point and fixed-point arithmetic. For fixed-point arithmetic in particular, it performs double-precision operations on data with up to 32 bits of precision in addition to single-precision operations on 16-bit unit-word-length data. Multiplication by this double-precision arithmetic (hereinafter simply called double-precision multiplication) is described below.

FIG. 2 shows the flow of data in the arithmetic system during double-precision multiplication. The number of bits of the data transferred is noted on the buses connecting the functional blocks.

FIG. 3 shows a conceptual diagram of the double-precision multiplication of 32-bit data AB, consisting of upper 16 bits A and lower 16 bits B, by 32-bit data CD, consisting of upper 16 bits C and lower 16 bits D. In this double-precision multiplication, four partial products B×D, A×D, B×C, and A×C are obtained by multiplying unit-word-length operands; these partial products are digit-aligned and added to give the 64-bit multiplication result.
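The digit alignment of the four partial products can be written as a worked identity. This is a sketch that assumes A, B, C, and D are treated as unsigned 16-bit integers; the weights 2^16 and 2^32 are the shifts that the alignment step has to supply.

```latex
\begin{aligned}
AB &= A \cdot 2^{16} + B, \qquad CD = C \cdot 2^{16} + D, \\
AB \times CD &= (A \times C)\, 2^{32} + (A \times D + B \times C)\, 2^{16} + (B \times D).
\end{aligned}
```

The two middle partial products share the weight 2^16, which is why they can be added to each other without any shift.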

The selector 15 is provided specifically for digit alignment between the partial products. In this embodiment, before the arithmetic logic unit 7 performs an addition, the selector selects, for its 32-bit input data, a 16-bit right shift by the unit word length (2^-16), a 16-bit left shift (2^+16), or a simple pass-through (2^0).

The simple pass-through in particular is a mode provided for other operations, such as single-precision and floating-point arithmetic.

FIG. 4 shows an example of the selector 15. The selector 15 includes three gate circuits 15A to 15C, each handling 32 bits as one unit, and receives 32-bit data D31 to D0 through the internal bus 3. The upper 16 bits of gate circuit 15A are given the logic value 0, and its lower 16 bits are given the upper 16 bits D31 to D16 of the data D31 to D0. Gate circuit 15B is given the 32-bit data D31 to D0 unchanged. The upper 16 bits of gate circuit 15C are given the lower 16 bits D15 to D0 of the data D31 to D0, and its lower 16 bits are given the logic value 0. When gate circuit 15A is selected, the output of the selector 15 is the 32-bit input data D31 to D0 shifted right by 16 bits; when gate circuit 15C is selected, the output is the 32-bit input data D31 to D0 shifted left by 16 bits. When gate circuit 15B is selected, the 32-bit data D31 to D0 is output as is, without any shift.
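The three gate circuits amount to three fixed wirings of a 32-bit word, which can be sketched as follows. The function name `selector15` and the string labels used to pick a gate are illustrative assumptions; the zero fill on both shifts follows the description of gate circuits 15A and 15C.

```python
MASK32 = 0xFFFFFFFF


def selector15(data: int, gate: str) -> int:
    """Model gate circuits 15A, 15B, and 15C on a 32-bit input word."""
    data &= MASK32
    if gate == "15A":
        # Upper half forced to 0, lower half takes D31..D16: right shift by 16.
        return data >> 16
    if gate == "15B":
        # All 32 bits passed through unchanged.
        return data
    if gate == "15C":
        # Upper half takes D15..D0, lower half forced to 0: left shift by 16.
        return (data << 16) & MASK32
    raise ValueError(f"unknown gate {gate!r}")
```

Because each gate is only a choice of wiring, the shift costs no arithmetic cycle of its own; selecting a gate is enough.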

To improve the throughput of arithmetic instructions, the digital signal processor 1 of this embodiment has a horizontal microinstruction set in which multiple operations are executed within the same instruction cycle; the microinstructions are stored in the instruction ROM 22. The operations that can be executed in parallel within one instruction cycle include an operation of the multiplier 6, an operation of the arithmetic logic unit 7, and reads and writes of the data memory 2. Multiplication can be performed in every instruction cycle and is carried out when an instruction selects the input data of the multiplier 6. The multiplication result is stored in the buffer register 8, and in the next instruction cycle the arithmetic logic unit 7 performs an addition or subtraction using that result. Multiplication and addition are thus executed in parallel as a pipeline, and a multiply-accumulate operation appears to complete in one instruction cycle.
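The effect of the one-cycle buffer between multiplier and adder can be sketched with a simple cycle-by-cycle loop. This is an illustrative model only: `pipelined_mac` is a hypothetical name, the variable `buffer` stands in for buffer register 8, and operand fetch is assumed to cost nothing extra.

```python
def pipelined_mac(xs, ys):
    """Sum of products with a one-cycle buffer between multiplier and adder.

    Returns (accumulator, cycles).
    """
    pairs = list(zip(xs, ys))
    buffer = None   # product latched by the multiplier in the previous cycle
    acc = 0         # running sum held in an accumulator
    cycles = 0
    for i in range(len(pairs) + 1):
        # The adder consumes the product latched one cycle earlier.
        if buffer is not None:
            acc += buffer
        # In the same cycle the multiplier forms the next product, if any.
        buffer = pairs[i][0] * pairs[i][1] if i < len(pairs) else None
        cycles += 1
    return acc, cycles
```

For N products the loop runs N + 1 cycles, so once the pipeline is full each additional multiply-accumulate costs one cycle.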

FIG. 5 shows an example of the instruction format used for double-precision multiplication. Although not particularly limited, this format includes an operation code field 30 that specifies the operation of the arithmetic logic unit 7, an input field 31 for the arithmetic logic unit 7, an input/output selection field 32 for the accumulators 9A and 9B, an operation field 33 for the selector 15, a carry flag transmission field 34, an address field 35 for the data memory 2, and so on.
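How one instruction can bundle an addition with its alignment shift and carry handling can be pictured as a record with one member per field. Everything below is a hypothetical sketch: the text gives neither field widths nor encodings, so the member names, the string values, and the example instruction are assumptions chosen for readability.

```python
from dataclasses import dataclass
from enum import Enum


class Shift(Enum):
    """Selector 15 modes (field 33)."""
    RIGHT16 = "right"
    PASS = "pass"
    LEFT16 = "left"


@dataclass(frozen=True)
class AluInstruction:
    opcode: str            # field 30: operation of the arithmetic logic unit
    alu_input: str         # field 31: source of the ALU input
    acc_source: str        # field 32: accumulator read back through the selector
    acc_dest: str          # field 32: accumulator that receives the result
    selector: Shift        # field 33: shift right, pass, or shift left
    carry_in: tuple        # field 34: carry flags fed into this addition
    mem_address: int       # field 35: data memory address


# Hypothetical encoding of an add that shifts accumulator 9A left by 16 bits
# on its way to the ALU and stores the sum in accumulator 9B.
example = AluInstruction(
    opcode="ADD",
    alu_input="BUFFER",
    acc_source="9A",
    acc_dest="9B",
    selector=Shift.LEFT16,
    carry_in=(),
    mem_address=0,
)
```

The point of the sketch is only that the shift mode and carry selection travel inside the add instruction, so no separate shift instruction is issued.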

The double-precision multiplication procedure is now described using the example of FIG. 3. FIG. 6 shows an example of the procedure, in which the double-precision multiplication of the 32-bit data AB and CD is accomplished in five instruction cycles CYC1 to CYC5.

■ Instruction cycle CYC1

The upper 16 bits A of the 32-bit data AB and the lower 16 bits D of the 32-bit data CD are supplied to the multiplier 6, which computes the partial product A×D; the result is held in the buffer register 8.

■ Instruction cycle CYC2

The partial product A×D held in the buffer register 8 in instruction cycle CYC1 passes unchanged through the arithmetic logic unit 7, which performs no operation, and is sent to the accumulator 9A. In parallel, the lower 16 bits B of the 32-bit data AB and the upper 16 bits C of the 32-bit data CD are supplied to the multiplier 6, which computes the partial product B×C; the result is held in the buffer register 8.

■ Instruction cycle CYC3

The arithmetic logic unit 7 adds the partial product B×C held in the buffer register 8 in instruction cycle CYC2 to the partial product A×D stored in the accumulator 9A in instruction cycle CYC2. Because the binary point positions of the two partial products A×D and B×C coincide, the selector 15, through which the data from the accumulator 9A passes on its way to the arithmetic logic unit 7, is set to pass its input through. The two partial products are therefore added as they are, and the sum is stored in the accumulator 9A.

If this addition generates a carry, the carry sets the carry flag 16B. In parallel with the addition, the lower 16 bits B of the 32-bit data AB and the lower 16 bits D of the 32-bit data CD are supplied to the multiplier 6, which computes the partial product B×D; the result is held in the buffer register 8.

■ Instruction cycle CYC4

The arithmetic logic unit 7 adds the partial product B×D held in the buffer register 8 in instruction cycle CYC3 to the data stored in the accumulator 9A in instruction cycle CYC3 (the sum of the partial products A×D and B×C). Because the instruction selects a 16-bit left shift in the selector 15, the binary point position of the data supplied from the accumulator 9A through the selector 15 is matched to that of the partial product B×D. The data in the accumulator 9A, whose binary point position is offset from that of the data in the buffer register 8, is thus aligned automatically by the selector 15 on its way to the arithmetic logic unit 7 during the addition cycle and then added to the data in the buffer register 8. The sum is stored in the other accumulator 9B. If this addition generates a carry, the carry sets the carry flag 16A. In parallel with the addition, the upper 16 bits A of the 32-bit data AB and the upper 16 bits C of the 32-bit data CD are supplied to the multiplier 6, which computes the partial product A×C; the result is held in the buffer register 8.

■ Instruction cycle CYC5

The arithmetic logic unit 7 adds the partial product A×C held in the buffer register 8 in instruction cycle CYC4 to the data stored in the accumulator 9A in instruction cycle CYC3 (the sum of the partial products A×D and B×C). Because the instruction selects a 16-bit right shift in the selector 15, the binary point position of the data supplied from the accumulator 9A through the selector 15 is automatically matched to that of the partial product A×C. The data in the accumulator 9A, whose binary point position is offset from that of the data in the buffer register 8, is thus aligned automatically by the selector 15 on its way to the arithmetic logic unit 7 during the addition cycle and then added to the data in the buffer register 8.

At this time the contents of the carry flags 16A and 16B are transmitted to the arithmetic logic unit 7 and used as carries into the corresponding bits. In this description, the carry flag 16A serves as the carry signal into the least significant bit of the arithmetic logic unit 7, and the carry flag 16B serves as the carry signal into the 16th bit from the least significant end. The sum is stored in the accumulator 9A.
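The five instruction cycles can be traced end to end in a short simulation. This is a sketch under stated assumptions: operands are unsigned, the carry from flag 16B is injected with weight 2^16 (bit 16 counting from zero), and the names `add32` and `double_precision_multiply` are hypothetical. Multiplier and adder activity that the text describes as parallel within one cycle is written here as consecutive statements.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def add32(x, y, carry_lsb=0, carry_bit16=0):
    """32-bit add with optional carries injected at bit 0 and bit 16.

    Returns (32-bit sum, carry-out).
    """
    total = x + y + carry_lsb + (carry_bit16 << 16)
    return total & MASK32, total >> 32


def double_precision_multiply(ab, cd):
    """Follow cycles CYC1 to CYC5; return (acc_9a, acc_9b) = (upper, lower)."""
    a, b = (ab >> 16) & MASK16, ab & MASK16
    c, d = (cd >> 16) & MASK16, cd & MASK16

    # CYC1: the multiplier forms A*D into the buffer register.
    buf = a * d

    # CYC2: the buffer passes through the ALU into 9A; the multiplier forms B*C.
    acc_9a = buf
    buf = b * c

    # CYC3: selector passes 9A through; sum -> 9A, carry -> flag 16B;
    # the multiplier forms B*D.
    acc_9a, flag_16b = add32(acc_9a, buf)
    buf = b * d

    # CYC4: selector shifts 9A left by 16; sum -> 9B, carry -> flag 16A;
    # the multiplier forms A*C.
    acc_9b, flag_16a = add32((acc_9a << 16) & MASK32, buf)
    buf = a * c

    # CYC5: selector shifts 9A right by 16; both saved carries are added in;
    # sum -> 9A (upper 32 bits of the product).
    acc_9a, _ = add32(acc_9a >> 16, buf, carry_lsb=flag_16a, carry_bit16=flag_16b)
    return acc_9a, acc_9b
```

A usage check such as `hi, lo = double_precision_multiply(x, y)` followed by comparing `(hi << 32) | lo` with `x * y` confirms, for unsigned inputs, that the two shifts and the two saved carries are sufficient to assemble the 64-bit product.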

Through instruction cycles CYC1 to CYC5, the result of the double-precision multiplication of the 32-bit data AB and CD is obtained in the two accumulators 9A and 9B: the accumulator 9B holds the lower 32 bits and the accumulator 9A holds the upper 32 bits. The arithmetic cycles required for this double-precision multiplication are only four multiplication cycles to obtain the partial products and three addition cycles that include the digit alignment. If the digit alignment were instead performed conventionally, for example by arithmetic shifts in the arithmetic logic unit 7, at least two additional shift cycles would be needed, and the number of instruction cycles would increase accordingly.

The above embodiment provides the following effects.

(1) When two 32-bit data items are multiplied using multiplication cycles that obtain partial products of 16-bit unit words and addition cycles that sum those partial products, selector 15, placed on the data path that returns the values of accumulators 9A and 9B to the arithmetic logic unit 7, is instructed to perform a 16-bit right shift, a 16-bit left shift, or a simple pass-through. The digit alignment shift required before partial products are added is thereby included within a single addition cycle, so a partial product is shifted and the aligned partial products are added in one operation cycle. This eliminates independent shift operation cycles for digit alignment, such as arithmetic shift cycles using the arithmetic logic unit 7. As a result, the number of operation cycles for fixed-point double-precision arithmetic is reduced, and a product-sum operation device that performs fixed-point double-precision multiplication in few operation cycles is obtained.

(2) As a consequence, overall processing performance can be improved in a microprocessor or microcomputer, such as a digital signal processor, whose architecture gives priority to speed through single-precision fixed-point arithmetic and performs double-precision arithmetic only when high precision is required.

(3) In the double-precision arithmetic, a carry generated when partial products are added must be transmitted to a predetermined digit and reflected in a later addition. Having the corresponding addition instruction direct this carry transmission makes the operation easier and more flexible.

(4) Providing data buffer 8, which holds the operation result of multiplier 6 for one instruction cycle and supplies it to the arithmetic logic unit 7, allows addition cycles and multiplication cycles to be executed in parallel in a pipelined manner, which further speeds up double-precision multiplication.
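The overlap of multiplication and addition cycles can be illustrated with a small schedule generator. This is a sketch under stated assumptions, not the timing of FIG. 6: it assumes each product spends exactly one cycle in the buffer, that the first product is simply loaded into an accumulator, and that every later product is added in the cycle after it is produced. The function name `pipelined_schedule` is hypothetical.

```python
def pipelined_schedule(n_products):
    """Return (cycle, multiplier activity, ALU activity) rows for a two-stage overlap."""
    rows = []
    for cycle in range(1, n_products + 2):
        # The multiplier produces one partial product per cycle, then goes idle.
        mul = f"multiply #{cycle}" if cycle <= n_products else "idle"
        # The ALU consumes the product latched in the buffer during the previous cycle.
        if cycle == 1:
            alu = "idle"
        elif cycle == 2:
            alu = "load product #1"
        else:
            alu = f"add product #{cycle - 1}"
        rows.append((cycle, mul, alu))
    return rows

for row in pipelined_schedule(4):
    print(row)
```

Under these assumptions, four multiplications and three additions fit in five cycles, which is consistent with the CYC1 to CYC5 span mentioned in the description.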

The invention made by the present inventor has been described above in concrete terms on the basis of an embodiment, but the present invention is not limited to that embodiment and can be modified in various ways without departing from its gist.

For example, in the above embodiment selector 15 is arranged to shift the values of accumulators 9A and 9B, but it may instead be arranged to shift the value of buffer register 8, as shown in FIG. 7. With this arrangement of selector 15, the shift directions and the partial products to be shifted differ from those in the operation procedure of FIG. 6, but the number of operation cycles can be kept the same as in the above embodiment.

Furthermore, the selector is not limited to the configuration of the above embodiment. For example, as shown in FIG. 8, in addition to buffer register 8, which holds the 32-bit data output from multiplier 6 as is, the selector may be configured with buffer register 40, which stores the upper 16 bits of that 32-bit data in its lower half with logic 0 forcibly inserted into its upper 16 bits, and buffer register 41, which stores the lower 16 bits of the 32-bit data output from multiplier 6 in its upper half with logic 0 forcibly inserted into its lower 16 bits.
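The three-register variant can be modeled as latching the same multiplier output in three pre-aligned forms. The following is a minimal sketch; the function name `latch_product` and the dictionary keys are hypothetical labels for buffer registers 8, 40, and 41, and the multiplier output is treated as an unsigned 32-bit value.

```python
MASK32 = 0xFFFFFFFF

def latch_product(product):
    """Return the contents the three buffer registers would hold for one product."""
    p = product & MASK32
    return {
        "reg8": p,                     # 32-bit product held as is
        "reg40": p >> 16,              # upper 16 bits in the lower half, zeros above
        "reg41": (p << 16) & MASK32,   # lower 16 bits in the upper half, zeros below
    }

print({name: hex(value) for name, value in latch_product(0x12345678).items()})
```

Choosing which register feeds the adder then plays the same role as the pass-through, right-shift, and left-shift settings of the selector.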

Also, double-precision arithmetic is not limited to the multiplication of 32-bit data by 32-bit data; the technique can be used for operations between data whose length is a multiple of a unit word length of any suitable number of bits.
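The generalization to other unit word lengths and operand widths can be sketched as ordinary schoolbook multiplication over unit words, where each partial product is aligned by a whole number of unit words. This illustrates the arithmetic only, not the circuit of the embodiment; the helper names `to_words`, `from_words`, and `multiword_multiply` are hypothetical, and operands are assumed unsigned.

```python
def to_words(value, n_words, unit_bits=16):
    """Split an unsigned integer into n_words unit words, least significant first."""
    mask = (1 << unit_bits) - 1
    return [(value >> (unit_bits * k)) & mask for k in range(n_words)]

def from_words(words, unit_bits=16):
    """Reassemble an unsigned integer from unit words, least significant first."""
    return sum(w << (unit_bits * k) for k, w in enumerate(words))

def multiword_multiply(x_words, y_words, unit_bits=16):
    """Multiply two multiword operands using unit-word partial products."""
    mask = (1 << unit_bits) - 1
    result = [0] * (len(x_words) + len(y_words))
    for i, x in enumerate(x_words):
        carry = 0
        for j, y in enumerate(y_words):
            # The partial product x * y has weight 2**(unit_bits * (i + j)),
            # so it is accumulated at word position i + j.
            total = result[i + j] + x * y + carry
            result[i + j] = total & mask
            carry = total >> unit_bits
        # The carry out of this row lands one word above the row's last position.
        result[i + len(y_words)] += carry
    return result

x, y = 0x123456789ABC, 0xFEDCBA987654
product = from_words(multiword_multiply(to_words(x, 3), to_words(y, 3)))
print(hex(product))
```

With `unit_bits=16` and two words per operand this reduces to the 32-bit by 32-bit case of the embodiment.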

Nor is double-precision arithmetic necessarily limited to the method of performing multiplication and addition in a pipelined manner as in the above embodiment.

The above description has mainly dealt with the case where the invention made by the present inventor is applied to a digital signal processor, the field of application that formed its background. The present invention is not limited to this, however, and can be widely applied to microcomputers, microprocessors, and other devices that include a multiplier.

The present invention can be applied to any device that at least performs double-precision arithmetic or, further, product-sum operations.

[Effect of the invention]

The effects obtained by representative aspects of the invention disclosed in this application are briefly described below.

That is, when double-precision arithmetic is performed using multiplication cycles that obtain partial products of unit words and addition cycles that sum the partial products, the digit alignment shift required before the partial products are added is included in the addition cycle, and the shift is performed by instructing a selector provided on one input side of the adder to shift right by the unit word length, shift left by the unit word length, or simply pass the data through. In the addition cycle, a partial product is thus shifted and the aligned partial products are added within one operation cycle, so no independent shift operation cycle is needed for digit alignment. This reduces the number of operation cycles in fixed-point double-precision arithmetic.

In the double-precision arithmetic, when a carry generated by the addition of partial products is transmitted to a predetermined digit and reflected in a later addition result, including the instruction for that carry transmission in the corresponding addition instruction simplifies the carry transmission operation.

In a product-sum operation device that includes a multiplier and an adder, a selector is placed either on the path that leads the operation result of the multiplier to one input of the adder or on the path that leads the operation result of the adder to the other input of the adder, and the selector can select a right shift by the unit word length, a left shift by the unit word length, or a simple pass-through before the adder performs addition. During fixed-point double-precision arithmetic, the selector thus aligns the digits of a partial product while the data is being transferred to the adder, which eliminates independent shift operation cycles for digit alignment. The result is a product-sum operation device that performs fixed-point double-precision arithmetic in few operation cycles.

In addition, providing a data buffer that holds the operation result of the multiplier for one instruction cycle and supplies it to the adder allows addition cycles and multiplication cycles to be executed in parallel, which further speeds up double-precision arithmetic.

[Brief explanation of drawings]

FIG. 1 is a block diagram of a digital signal processor that is an embodiment of the present invention. FIG. 2 is a block diagram following the flow of data during double-precision multiplication in the arithmetic system of the digital signal processor of FIG. 1. FIG. 3 is a conceptual diagram of the double-precision multiplication procedure. FIG. 4 is a block diagram of an example of the selector. FIG. 5 is a diagram of an example format of an instruction used for double-precision multiplication. FIG. 6 is a flowchart of an example of the double-precision multiplication procedure. FIG. 7 is a block diagram showing another example of the arrangement of the selector in the product-sum operation system. FIG. 8 is a block diagram showing another configuration example of the selector in the product-sum operation system.

Description of symbols: 1: digital signal processor; 3: data bus; 6: multiplier; 7: arithmetic logic unit; 8: buffer register; 9A, 9B: accumulators; 15: selector; 15A to 15C: gate circuits; 16A, 16B: carry flags; 22: instruction ROM.

Claims

1. A double-precision arithmetic method for multiplying data whose word length is a multiple of a unit word length, comprising multiplication cycles for obtaining partial products of unit words and addition cycles for adding the partial products, wherein each addition cycle includes, within one operation cycle, the digit alignment shift required before the partial products are added, and the shift is performed by instructing a selector provided on one input side of an adder to shift right by the unit word length, shift left by the unit word length, or simply pass the data through.

2. The double-precision arithmetic method according to claim 1, wherein the addition cycle includes a process of transmitting, in accordance with an instruction, a carry generated and held by a preceding addition of partial products to a predetermined digit.

3. A product-sum operation device comprising a multiplier and an adder capable of multiplying, by product-sum operations, data whose word length is a multiple of a unit word length, wherein a selector is placed on the path that leads the operation result of the multiplier to one input of the adder or on the path that leads the operation result of the adder to the other input of the adder, and the selector performs shift processing by selecting a right shift by the unit word length, a left shift by the unit word length, or a simple pass-through before the adder performs addition.

4. The product-sum operation device according to claim 3, further comprising a data buffer that holds the operation result of the multiplier for one instruction cycle and supplies it to the adder.