JP4828006B2 - Image processing device

Info

Publication number: JP4828006B2
Application number: JP31813498A
Authority: JP
Inventors: 俊明志野
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1998-11-09
Filing date: 1998-11-09
Publication date: 2011-11-30
Anticipated expiration: 2018-11-09
Also published as: JP2000148975A

Description

【０００１】
【発明の属する技術分野】
本発明は、画像処理装置に関する。
【０００２】
【従来の技術】
種々のＣＡＤ(Computer Aided Design) システムや、アミューズメント装置などにおいて、コンピュータグラフィックスがしばしば用いられている。特に、近年の画像処理技術の進展に伴い、３次元コンピュータグラフィックスを用いたシステムが急速に普及している。
このような３次元コンピュータグラフィックスでは、各画素（ピクセル）に対応する色を決定するときに、各画素の色の値を計算し、この計算した色の値を、当該画素に対応するディスプレイバッファ（フレームバッファ）のアドレスに書き込むレンダリング(Rendering) 処理を行う。
【０００３】
レンダリング処理の手法の一つに、ポリゴン（Polygon)レンダリングがある。この手法では、立体モデルを三角形の単位図形（ポリゴン）の組み合わせとして表現しておき、このポリゴンを単位として描画を行うことで、表示画面の色を決定する。
【０００４】
ポリゴンレンダリングでは、物理座標系における三角形の各頂点についての、座標（ｘ，ｙ，ｚ）と、色データ（Ｒ，Ｇ，Ｂ）と、張り合わせのイメージパターンを示すテクスチャデータの同次座標（ｓ，ｔ）および同次項ｑの値とを入力とし、これらの値を三角形の内部で補間する処理が行われる。
ここで、同次項ｑは、簡単にいうと、拡大縮小率のようなもので、実際のテクスチャバッファのＵＶ座標系における座標、すなわち、テクスチャ座標データ（ｕ，ｖ）は、同次座標（ｓ，ｔ）を同次項ｑで除算した「ｓ／ｑ」および「ｔ／ｑ」に、それぞれテクスチャサイズＵＳＩＺＥおよびＶＳＩＺＥを乗じたものとなる。
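The relationship between the homogeneous coordinates and the texture coordinates described above can be written as a minimal sketch. The function name `texture_coords` and the sample values are illustrative assumptions and do not appear in the specification; only the formula u = (s/q) × USIZE, v = (t/q) × VSIZE is taken from the text.

```python
def texture_coords(s, t, q, usize, vsize):
    """Map homogeneous texture coordinates (s, t) and term q to (u, v).

    Follows the relation stated in the text:
    u = (s / q) * USIZE, v = (t / q) * VSIZE.
    """
    if q == 0:
        raise ValueError("homogeneous term q must be non-zero")
    u = (s / q) * usize
    v = (t / q) * vsize
    return u, v


# Illustrative values only (assumed, not from the specification).
u, v = texture_coords(s=0.5, t=0.25, q=2.0, usize=256, vsize=128)
print(u, v)  # 64.0 16.0
```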
【０００５】
図２８は、３次元コンピュータグラフィックスシステムの基本的な概念を示すシステム構成図である。
【０００６】
この３次元コンピュータグラフィックスシステムにおいては、グラフィックス描画等のデータは、メインプロセッサ１のメインメモリ２、あるいは外部からのグラフィックスデータを受けるＩ／Ｏインタフェース回路３からメインバス４を介してレンダリングプロセッサ５ａ、フレームバッファメモリ５ｂを有するレンダリング回路５に与えられる。
【０００７】
レンダリングプロセッサ５ａには、表示するためのデータを保持することを目的とするフレームバッファ５ｂと、描画する図形要素（たとえば三角形）の表面に張り付けるテクスチャデータを保持しているテクスチャメモリ６が結合されている。
そして、レンダリングプロセッサ５ａによって、図形要素毎に表面にテクスチャを張り付けた図形要素を、フレームバッファ５ｂに描画するという処理が行われる。
【０００８】
フレームバッファ５ｂとテクスチャメモリ６は、一般的にＤＲＡＭ(Dynamic Random Access Memory)により構成される。
そして、図２８のシステムにおいては、フレームバッファ５ｂとテクスチャメモリ６は、物理的に別々のメモリシステムとして構成されている。
【０００９】
【発明が解決しようとする課題】
ところで、上述したような３次元コンピュータグラフィックシステムなどでは、ＤＲＡＭの読み出し動作および書き込み動作を効率よく行うために、例えば、グラフィック処理の特色を考慮してＤＲＡＭへのアクセス時のアドレス指定に制限を設けることが考えられる。
しかしながら、このような制限を設けると、図２８に示すシステムにおいて、レンダリングプロセッサ５ａおよびメインプロセッサ１と、フレームバッファメモリ５ｂおよびテクスチャメモリ６との間でデータ送受信を行う際に、ＤＲＡＭに対してのアクセス時のアドレス指定の制限によって生じる種々の課題を解決する必要がある。
【００１０】
本発明は、かかる事情に鑑みてなされたものであり、メモリへのアクセス時のアドレス指定に制限を設けた場合でも、メモリへのアクセスを正確に行うことができる画像処理装置を提供することを目的とする。
【００１１】
【課題を解決するための手段】
本発明によれば、
単位図形の頂点について、３次元座標（ｘ，ｙ，ｚ）、Ｒ（赤），Ｇ（緑），Ｂ（青）データ、テクスチャ同次座標（ｓ，ｔ）および同次項ｑを含むポリゴンレンダリングデータを用いてレンダリング処理を行って、複数の画素からなる画像データを生成する画像処理回路と、
所定の大きさのブロックを単位として前記レンダリング処理によって生成された、前記複数の画素からなる画像データの書き込みが行われる複数の記憶回路であって、前記書き込みは代表点を基準とする第１の矩形領域を単位として行われ、当該第１の矩形領域の直交する第１および第２方向について第１の数の倍数のアドレスしか指定できないという制約がある、複数の記憶回路と、
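a data rearrangement circuit that rearranges the image data in units of pixel data so that, when the address of the representative point of the block containing the image data consisting of the plurality of pixels generated by the rendering process and input from the image processing circuit is specified in accordance with the constraint, the image data input from the image processing circuit is written, in accordance with the constraint, into the corresponding first rectangular area of the corresponding storage circuit among the plurality of storage circuits; and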
書き込み回路と
を具備し、
前記並べ替え回路は、前記画像処理回路から入力した前記レンダリング処理によって生成された前記複数の画素からなる画像データを前記複数の記憶回路に転送して記憶する第１の転送モードにおいて、
前記画像処理回路から入力した前記レンダリング処理された第１のビット幅の第１の画像データを第２のビット幅の第２の画像データ内に配置し、
前記第２の画像データを含む前記ブロックの前記代表点のアドレスを前記制限に基づいて指定したときに、前記第１の画像データが前記複数の記憶回路の該当する記憶回路の該当する前記第１の矩形記憶領域に書き込まれるように、前記第２の画像データを、制御信号に基づいてシフト処理、選択処理、保持処理を反復して、並べ替え、
前記書き込み回路は、前記並べ替えられた前記第２の画像データを含む前記ブロックを、前記複数の記憶回路の該当する記憶回路に分散して書き込む、
画像処理装置が提供される。
【００１２】
また本発明によれば、
単位図形の頂点について、３次元座標（ｘ，ｙ，ｚ）、Ｒ（赤），Ｇ（緑），Ｂ（青）データ、テクスチャ同次座標（ｓ，ｔ）および同次項ｑを含むポリゴンレンダリングデータが、それぞれ分散して記憶されている、所定の大きさのブロックを単位として記憶領域に記憶されたレンダリング処理によって生成された、複数の画素データからなる画像データの書き込みおよび読み出しが行われる複数の記憶回路であって、前記書き込みは代表点を基準とする第１の矩形領域を単位として行われ、当該第１の矩形領域の直交する第１および第２方向について第１の数の倍数のアドレスしか指定できないという第１の制約があり、前記読み出しは代表点を基準とする第２の矩形領域を単位として行われ、当該第２の矩形領域の直交する第１方向については任意のアドレスを指定できるが第２方向については第２の数の倍数でしかアドレスを指定できないという第２の制約がある、複数の記憶回路と、
第１のビット幅の第１の画像データを入力し、当該第１の画像データについてレンダリング処理を行う画像処理回路と、
前記第２の制限に基づいて代表点を指定して前記ブロック単位で前記複数の記憶回路から前記画像データを読み出す読み出し回路と、
前記読み出しによって得られた第２のビット幅の第２の画像データを、画素データ単位で並べ替えて生成した第１ビット幅の前記第１の画像データを前記画像処理回路に出力するデータ並べ替え回路と
を具備し、
前記並べ替え回路は、前記記憶回路から前記画像処理回路に画像データを転送する第２の転送モードにおいて、
前記画像処理回路から入力したアドレスに応じたアドレスを生成して前記読み出し回路に出力して前記複数の記憶回路から該当する画像データを読み出し、
前記画像処理回路で規定する前記第１のビット幅の画像データに変換して前記画像処理回路に出力する、
画像処理装置が提供される。
【００１３】
本発明によれば、
単位図形の頂点について、３次元座標（ｘ，ｙ，ｚ）、Ｒ（赤），Ｇ（緑），Ｂ（青）データ、テクスチャ同次座標（ｓ，ｔ）および同次項ｑを含むポリゴンレンダリングデータがそれぞれ分散して、所定の大きさのブロックを単位としてレンダリング処理によって生成された画像データの書き込みが行われ、および、前記それぞれ分散した書き込まれた前記レンダリング処理によって生成された画像データの読出しが行われる、複数の記憶回路であって、前記書き込みは代表点を基準とする第１の矩形領域を単位として行われ、当該第１の矩形領域の直交する第１および第２方向について第１の数の倍数のアドレスしか指定できないという第１の制約があり、前記読み出しは代表点を基準とする第２の矩形領域を単位として行われ、当該第２の矩形領域の直交する第１方向については任意のアドレスを指定できるが第２方向については第２の数の倍数でしかアドレスを指定できないという第２の制約がある、複数の記憶回路と、
前記第２の制限に基づいて代表点を指定して前記ブロックを単位として前記複数の記憶回路から前記画像データを読み出す読み出し回路と、
前記画像処理回路から前記記憶回路に画像データを転送する第１の転送モードと、前記記憶回路から前記画像処理回路に画像データを転送する第２の転送モードとのいずれかに応じた処理を行う、並べ替え回路であって、前記読み出された画像データを含む前記ブロックの前記代表点のアドレスを前記第１の制限に基づいて指定したときに、前記読み出された画像データが該当する記憶回路内の該当する第１の矩形領域に書き込まれるように、前記読み出された画像データを画素データ単位で並べ替える、データ並べ替え回路と、
前記第１の制限に基づいて代表点を指定して、前記並べ替えられた画像データを含む前記ブロックを前記記憶回路に書き込む書き込み回路と
を具備し、
前記並べ替え回路は、前記画像処理回路から前記複数の記憶回路に前記画像データを転送して記憶する第１の転送モードにおいて、
前記画像処理回路から入力した前記レンダリング処理された第１のビット幅の第１の画像データを第２のビット幅の第２の画像データ内に配置し、
前記第２の画像データを含む前記ブロックの前記代表点のアドレスを前記第１の制限に基づいて指定したときに、前記第１の画像データが前記複数の記憶回路の該当する記憶回路の該当する矩形領域に書き込まれるように、前記第２の画像データを、制御信号に基づいてシフト処理、選択処理、保持処理を反復して、並べ替え、
前記書き込み回路は、前記並べ替えられた前記第２の画像データを含む前記ブロックを、前記複数の記憶回路の該当する記憶回路に分散して書き込み、
前記並べ替え回路は、前記記憶回路から前記画像処理回路に画像データを転送する第２の転送モードにおいて、
前記画像処理回路から入力したアドレスに応じたアドレスを生成して前記読み出し回路に出力して前記複数の記憶回路から該当する画像データを読み出し、
前記画像処理回路で規定する前記第１のビット幅の画像データ内に配置して前記画像処理回路に出力する、
画像処理装置が提供される。
【００１４】
本発明によれば、
単位図形の頂点について、３次元座標（ｘ，ｙ，ｚ）、Ｒ（赤），Ｇ（緑），Ｂ（青）データ、テクスチャ同次座標（ｓ，ｔ）および同次項ｑを含むポリゴンレンダリングデータを用いてレンダリング処理を行う画像処理装置であって、
複数の記憶回路であって、複数の画素ビットからなる表示データと少なくとも一つの図形要素が必要とするテクスチャデータとを当該複数の記憶回路に分散して記憶し、所定の大きさのブロックを単位として上記レンダリング処理によって生成された画像データの書き込みが行われる、複数の記憶回路であって、前記書き込みは代表点を基準とする第１の矩形領域を単位として行われ、当該第１の矩形領域の直交する第１および第２方向について第１の数の倍数のアドレスしか指定できないという第１の制約があり、前記読み出しは代表点を基準とする第２の矩形領域を単位として行われ、当該第２の矩形領域の直交する第１方向については任意のアドレスを指定できるが第２方向については第２の数の倍数でしかアドレスを指定できないという第２の制約がある、複数の記憶回路と、
前記単位図形の頂点のポリゴンレンダリングデータを補間して、前記単位図形内に位置する画素の補間データを生成する補間データ生成回路と、
前記補間データに含まれるテクスチャ同次座標（ｓ，ｔ）を同次項ｑで除算して「ｓ／ｑ」および「ｔ／ｑ」を生成し、当該「ｓ／ｑ」および「ｔ／ｑ」に応じたテクスチャアドレスを用いて、前記複数の記憶回路からテクスチャデータを読み出し、図形要素の表面への前記読み出したテクスチャデータの張り付け処理を行って複数の画素ビットからなる表示データを生成するテクスチャ処理回路と、
前記テクスチャー処理回路から前記記憶回路に画像データを転送する第１の転送モードに応じたデータ並べ替え処理を行う、データ並べ替え回路であって、前記テクスチャ処理回路から入力した複数の画素ビットからなる表示データを含む前記ブロックの前記代表点のアドレスを前記第１の制限に基づいて指定したときに、前記表示データが前記複数の記憶回路内の該当する記憶回路に分散して書き込まれるように、画像データ単位で、前記表示データを並べ替えるデータ並べ替え回路と、
前記並べ替えられた前記表示データを含む前記ブロックを、前記複数の記憶回路に分散して書き込む書き込み回路と
を具備し、
前記並べ替え回路は、前記画像処理回路から前記複数の記憶回路に前記画像データを転送して記憶する第１の転送モードにおいて、
前記画像処理回路から入力した前記レンダリング処理された第１のビット幅の第１の画像データを第２のビット幅の第２の画像データ内に配置し、
前記第２の画像データを含む前記ブロックの前記代表点のアドレスを前記第１の制限に基づいて指定したときに、前記第１の画像データが前記複数の記憶回路の所定の記憶領域に書き込まれるように、前記第２の画像データを、制御信号に基づいてシフト処理、選択処理、保持処理を反復して、並べ替え、
前記書き込み回路は、前記並べ替えられた前記第２の画像データを含む前記ブロックを、前記複数の記憶回路の該当する記憶回路に分散して書き込む、
画像処理装置が提供される。
【００１５】
本発明によれば、
単位図形の頂点について、３次元座標（ｘ，ｙ，ｚ）、Ｒ（赤），Ｇ（緑），Ｂ（青）データ、テクスチャ同次座標（ｓ，ｔ）および同次項ｑを含むポリゴンレンダリングデータを用いてレンダリング処理を行う画像処理装置であって、
複数の記憶回路であって、表示データと少なくとも一つの図形要素が必要とするテクスチャデータとを当該複数の記憶回路に分散して記憶し、所定の大きさのブロックを単位として上記レンダリング処理によって生成された画像データの書き込みが行われる複数の記憶回路であって、前記書き込みは代表点を基準とする第１の矩形領域を単位として行われ、当該第１の矩形領域の直交する第１および第２方向について第１の数の倍数のアドレスしか指定できないという第１の制約があり、前記読み出しは代表点を基準とする第２の矩形領域を単位として行われ、当該第２の矩形領域の直交する第１方向については任意のアドレスを指定できるが第２方向については第２の数の倍数でしかアドレスを指定できないという第２の制約がある、複数の記憶回路と、
前記単位図形の頂点のポリゴンレンダリングデータを補間して、前記単位図形内に位置する画素の補間データを生成する補間データ生成回路と、
前記補間データに含まれるテクスチャ同次座標（ｓ，ｔ）を同次項ｑで除算して「ｓ／ｑ」および「ｔ／ｑ」を生成し、当該「ｓ／ｑ」および「ｔ／ｑ」に応じたテクスチャアドレスを用いて前記複数の記憶回路から読み出されたテクスチャデータを、図形要素の表面への張り付け処理を行って表示データを生成するテクスチャ処理回路と、
前記第２の制限に基づいて代表点を指定して前記ブロック単位で前記複数の記憶回路から前記テクスチャデータを読み出す読み出し回路と、
インタフェース回路と、
前記記憶回路から読みだした表示データを前記テクスチャー処理回路に転送する第２の転送モードに応じたデータの並べ替えを行うデータ並べ替え回路であって、前記読み出されたテクスチャデータを、前記インタフェース回路の出力フォーマットに応じて並べ替えて、前記インタフェース回路に出力するデータ並べ替え回路と
を具備し、
前記並べ替え回路は、前記第２の転送モードにおいて、前記インタフェースを経由して前記テクスチャー回路から画像処理回路から入力したアドレスに応じたアドレスを生成して前記読み出し回路に出力して前記複数の記憶回路から該当する画像データを読み出し、前記画像処理回路で規定する前記第１のビット幅の画像データ内に配置して前記画像処理回路に出力する、
画像処理装置が提供される。
【００１６】
【発明の実施の形態】
以下、本実施形態においては、パーソナルコンピュータなどに適用される、任意の３次元物体モデルに対する所望の３次元画像をＣＲＴ(Cathode Ray Tube)などのディスプレイ上に高速に表示する３次元コンピュータグラフィックスシステムについて説明する。
【００１７】
図１は、本発明に係る画像処理装置としての３次元コンピュータグラフィックスシステム１０のシステム構成図である。
【００１８】
３次元コンピュータグラフィックスシステム１０は、立体モデルを単位図形である三角形（ポリゴン）の組み合わせとして表現し、このポリゴンを描画することで表示画面の各画素の色を決定し、ディスプレイに表示するポリゴンレンダリング処理を行うシステムである。
また、３次元コンピュータグラフィックスシステム１０では、平面上の位置を表現する（ｘ，ｙ）座標の他に、奥行きを表すｚ座標を用いて３次元物体を表し、この（ｘ，ｙ，ｚ）の３つの座標で３次元空間の任意の一点を特定する。
【００１９】
図１に示すように、３次元コンピュータグラフィックスシステム１０は、メインプロセッサ１１、メインメモリ１２、Ｉ／Ｏインタフェース回路１３、およびレンダリング回路１４が、メインバス１５を介して接続されている。
以下、各構成要素の機能について説明する。
【００２０】
メインプロセッサ１１は、例えば、アプリケーションの進行状況などに応じて、メインメモリ１２から必要なグラフィックデータを読み出し、このグラフィックデータに対してクリッピング(Clipping)処理、ライティング(Lighting)処理などのジオメトリ(Geometry)処理などを行い、ポリゴンレンダリングデータを生成する。メインプロセッサ１１は、ポリゴンレンダリングデータＳ１１を、メインバス１５を介してレンダリング回路１４に出力する。
【００２１】
Ｉ／Ｏインタフェース回路１３は、必要に応じて、外部から動きの制御情報またはポリゴンレンダリングデータなどを入力し、これをメインバス１５を介してレンダリング回路１４に出力する。
【００２２】
ここで、ポリゴンレンダリングデータは、ポリゴンの各３頂点の（ｘ，ｙ，ｚ，Ｒ，Ｇ，Ｂ，ｓ，ｔ，ｑ）のデータを含んでいる。
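Here, the (x, y, z) data indicate the three-dimensional coordinates of a polygon vertex, and the (R, G, B) data indicate the red, green, and blue luminance values at those three-dimensional coordinates, respectively.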
（ｓ，ｔ，ｑ）データのうち、（ｓ，ｔ）は、対応するテクスチャの同次座標を示しており、ｑは同次項を示している。ここで、「ｓ／ｑ」および「ｔ／ｑ」に、それぞれテクスチャサイズＵＳＩＺＥおよびＶＳＩＺＥを乗じてテクスチャ座標データ（ｕ，ｖ）が得られる。テクスチャバッファ１４７ａに記憶されたテクスチャデータへのアクセスは、テクスチャ座標データ（ｕ，ｖ）を用いて行われる。
すなわち、ポリゴンレンダリングデータは、三角形の各頂点の物理座標値と、それぞれの頂点の色とテクスチャデータである。
【００２３】
以下、レンダリング回路１４について詳細に説明する。
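As shown in FIG. 1, the rendering circuit 14 has a host interface circuit 149, a DDA (Digital Differential Analyzer) setup circuit 141, a triangle DDA circuit 142, a texture engine circuit 143, a memory interface (I/F) circuit 144, a CRT control circuit 145, a RAMDAC circuit 146, a DRAM 147, and an SRAM (Static RAM) 148.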
本実施形態におけるレンダリング回路１４は、一つの半導体チップ内にロジック回路と少なくとも表示データとテクスチャデータとを記憶するＤＲＡＭ１４７とが混載されている。
【００２４】
ＤＲＡＭ１４７
ＤＲＡＭ１４７は、テクスチャバッファ１４７ａ、ディスプレイバッファ１４７ｂ、ｚバッファ１４７ｃおよびテクスチャＣＬＵＴ(Color Look Up Table) バッファ１４７ｄとして機能する。
また、ＤＲＡＭ１４７は、後述するように、同一機能を有する複数（本実施形態では４個）のモジュールに分割されている。
【００２５】
また、ＤＲＡＭ１４７には、より多くのテクスチャデータを格納するために、インデックスカラーにおけるインデックスと、そのためのカラールックアップテーブル値が、テクスチャＣＬＵＴバッファ１４７ｄに格納されている。
インデックスおよびカラールックアップテーブル値は、テクスチャ処理に使われる。すなわち、通常はＲ，Ｇ，Ｂそれぞれ８ビットの合計２４ビットでテクスチャ要素を表現するが、それではデータ量が膨らむため、あらかじめ選んでおいた例えば２５６色等の中から一つの色を選んで、そのデータをテクスチャ処理に使う。このことで２５６色であればそれぞれのテクスチャ要素は８ビットで表現できることになる。インデックスから実際のカラーへの変換テーブルは必要になるが、テクスチャの解像度が高くなるほど、よりコンパクトなテクスチャデータとすることが可能となる。
これにより、テクスチャデータの圧縮が可能となり、内蔵ＤＲＡＭの効率良い利用が可能となる。
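The saving from index color can be illustrated with a short sketch. The 8-bit index, the 256-entry table, and the 24-bit (R, G, B) entry follow the figures given in the text; the function names and the 256 × 256 texture size are assumptions made only for this illustration.

```python
def expand_indexed_texture(indices, clut):
    """Convert per-texel color indices to (R, G, B) tuples via the lookup table."""
    return [clut[i] for i in indices]


def storage_bits(texels, bits_per_index=8, clut_entries=256, bits_per_color=24):
    """Return (full-color bits, indexed bits including the lookup table)."""
    full = texels * bits_per_color
    indexed = texels * bits_per_index + clut_entries * bits_per_color
    return full, indexed


# A 256-entry table with two illustrative entries filled in.
clut = [(0, 0, 0)] * 256
clut[1] = (255, 0, 0)
clut[2] = (0, 0, 255)
print(expand_indexed_texture([1, 2, 1], clut))

# For an assumed 256 x 256 texture the table overhead is small
# compared with the per-texel saving.
print(storage_bits(256 * 256))  # (1572864, 530432)
```

The larger the texture, the smaller the fixed cost of the table becomes relative to the per-texel saving, which is the point made in the paragraph above.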
【００２６】
さらにＤＲＡＭ１４７には、描画と同時並行的に隠れ面処理を行うため、描画しようとしている物体の奥行き情報が格納されている。
なお、表示データと奥行きデータおよびテクスチャデータの格納方法としては、メモリブロックの先頭から連続して表示データが格納され、次に奥行きデータが格納され、残りの空いた領域に、テクスチャの種類毎に連続したアドレス空間でテクスチャデータが格納される。これにより、テクスチャデータを効率よく格納できることになる。
【００２７】
図２は、ＤＲＡＭ１４７、ＳＲＡＭ１４８、並びに、ＤＲＡＭ１４７およびＳＲＡＭ１４８へアクセスするメモリＩ／Ｆ回路１４４の具体的な構成例を示すブロック図である。
【００２８】
図２に示すように、図１に示すＤＲＡＭ１４７およびＳＲＡＭ１４８は、前述したように４個のメモリモジュール２００，２１０，２２０，２３０に分割されている。
【００２９】
メモリモジュール２００は、メモリ２０１，２０２を有する。
メモリ２０１は、ＤＲＡＭ１４７の一部を構成するバンク２０１Ａ，２０１Ｂと、ＳＲＡＭ１４８の一部を構成するバンク２０１Ｃ，２０１Ｄとを有する。
また、メモリ２０２は、ＤＲＡＭ１４７の一部を構成するバンク２０２Ａ，２０２Ｂと、ＳＲＡＭ１４８の一部を構成するバンク２０２Ｃ，２０２Ｄとを有する。
なお、ＳＲＡＭ１４８を構成するバンク２０１Ｃ，２０１Ｄ，２０２Ｃ，２０２Ｄに対しては同時アクセスが可能である。
【００３０】
メモリモジュール２１０は、メモリ２１１，２１２を有する。
メモリ２１１は、ＤＲＡＭ１４７の一部を構成するバンク２１１Ａ，２１１Ｂと、ＳＲＡＭ１４８の一部を構成するバンク２１１Ｃ，２１１Ｄとを有する。
また、メモリ２１２は、ＤＲＡＭ１４７の一部を構成するバンク２１２Ａ，２１２Ｂと、ＳＲＡＭ１４８の一部を構成するバンク２１２Ｃ，２１２Ｄとを有する。
なお、ＳＲＡＭ１４８を構成するバンク２１１Ｃ，２１１Ｄ，２１２Ｃ，２１２Ｄに対しては同時アクセスが可能である。
【００３１】
メモリモジュール２２０は、メモリ２２１，２２２を有する。
メモリ２２１は、ＤＲＡＭ１４７の一部を構成するバンク２２１Ａ，２２１Ｂと、ＳＲＡＭ１４８の一部を構成するバンク２２１Ｃ，２２１Ｄとを有する。
また、メモリ２２２は、ＤＲＡＭ１４７の一部を構成するバンク２２２Ａ，２２２Ｂと、ＳＲＡＭ１４８の一部を構成するバンク２２２Ｃ，２２２Ｄとを有する。
なお、ＳＲＡＭ１４８を構成するバンク２２１Ｃ，２２１Ｄ，２２２Ｃ，２２２Ｄに対しては同時アクセスが可能である。
【００３２】
メモリモジュール２３０は、メモリ２３１，２３２を有する。
メモリ２３１は、ＤＲＡＭ１４７の一部を構成するバンク２３１Ａ，２３１Ｂと、ＳＲＡＭ１４８の一部を構成するバンク２３１Ｃ，２３１Ｄとを有する。
また、メモリ２３２は、ＤＲＡＭ１４７の一部を構成するバンク２３２Ａ，２３２Ｂと、ＳＲＡＭ１４８の一部を構成するバンク２３２Ｃ，２３２Ｄとを有する。
なお、ＳＲＡＭ１４８を構成するバンク２３１Ｃ，２３１Ｄ，２３２Ｃ，２３２Ｄに対しては同時アクセスが可能である。
【００３３】
ここで、メモリモジュール２００，２１０，２２０，２３０の各々は、図１に示すテクスチャバッファ１４７ａ、ディスプレイバッファ１４７ｂ、Ｚバッファ１４７ｃおよびテクスチャＣＬＵＴバッファ１４７ｄの全ての機能を持つ。
すなわち、メモリモジュール２００，２１０，２２０，２３０の各々は、対応する画素のテクスチャデータ、描画データ（（Ｒ，Ｇ，Ｂ）データ）、ｚデータおよびテクスチャカラールックアップテーブルデータの全てを記憶する。
ただし、メモリモジュール２００，２１０，２２０，２３０は、相互で異なる画素についてのデータを記憶する。
ここで、同時に処理される１６画素についてのテクスチャデータ、描画データ、ｚデータおよびテクスチャカラールックアップテーブルデータが、相互に異なるバンク２０１Ａ，２０１Ｂ，２０２Ａ，２０２Ｂ，２１１Ａ，２１１Ｂ，２１２Ａ，２１２Ｂ，２２１Ａ，２２１Ｂ，２２２Ａ，２２２Ｂ，２３１Ａ，２３１Ｂ，２３２Ａ，２３２Ｂに記憶される。
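As a result, the memory I/F circuit 144 can write data for 16 pixels, for example 8 pixels (X direction) × 2 pixels (Y direction), to the DRAM 147 simultaneously.
For reading, data for 8 pixels, for example 8 pixels (X direction) × 1 pixel (Y direction), can be read simultaneously.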
なお、メモリＩ／Ｆ回路１４４は、後述するように、いわゆる所定のインターリーブ方式のアドレッシングに基づいてＤＲＡＭ１４７へのアクセス（書き込み）を行う。
【００３４】
図３は、ＤＲＡＭ１４７のバッファ（例えばテクスチャバッファ）としての構成例を示す概略図である。
図３に示すように、２×８画素（ピクセル）の領域でメモリアクセスされたデータは、ページ（ロウ）やブロック（カラム）と呼ばれる領域に格納される。
各ロウＲＯＷ０〜ＲＯＷn+1 は、図３（ａ）に示すように、それぞれ４個のカラム（ブロック）Ｍ０Ａ，Ｍ０Ｂ，Ｍ１Ａ，Ｍ１Ｂに区分けされている。
そして、書き込み時には、Ｘ方向およびＹ方向について偶数のバウンダリ、読み出し時には、Ｘ方向について８の倍数のバウンダリ、Ｙ方向については任意バウンダリの領域でアクセスが行われる。
【００３５】
なお、バンク２０１Ｃ，２０１Ｄ，２０２Ｃ，２０２Ｄ，２１１Ｃ，２１１Ｄ，２１２Ｃ，２１２Ｄ，２２１Ｃ，２２１Ｄ，２２２Ｃ，２２２Ｄ，２３１Ｃ，２３１Ｄ，２３２Ｃ，２３２Ｄには、それぞれバンク２０１Ａ，２０１Ｂ，２０２Ａ，２０２Ｂ，２１１Ａ，２１１Ｂ，２１２Ａ，２１２Ｂ，２２１Ａ，２２１Ｂ，２２２Ａ，２２２Ｂ，２３１Ａ，２３１Ｂ，２３２Ａ，２３２Ｂに記憶されているテクスチャデータが記憶される。
【００３６】
次に、インターリーブ方式のアドレッシングに基づくテクスチャバッファ１４７ａにおけるテクスチャデータの記憶パターンについて、図４〜図６に関連付けてさらに詳細に説明する。
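FIG. 4 is a diagram for explaining the color data (pixel data) contained in the texture data that are accessed simultaneously, FIG. 5 is a diagram for explaining the unit blocks that make up the texture data, and FIG. 6 is a diagram for explaining the address space of the texture buffer.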
【００３７】
本実施形態の場合、図４に示すように、テクスチャデータに含まれる、２×８のマトリクス状に配置された画素の色データを示すカラーデータｐｉｘ₀〜ｐｉｘ₁₅が、同時にアクセスされる。
【００３８】
カラーデータｐｉｘ₀〜ｐｉｘ₁₅は、テクスチャバッファ１４７ａを構成するＳＲＡＭ１４８の異なるバンクに記憶される必要がある。
本実施形態では、カラーデータｐｉｘ₀，ｐｉｘ₁，ｐｉｘ₈，ｐｉｘ₉が、それぞれ図２に示すメモリ２０１のバンク２０１Ｃ，２０１Ｄおよびメモリ２０２のバンク２０２Ｃ，２０２Ｄに記憶される。また、カラーデータｐｉｘ₂，ｐｉｘ₃，ｐｉｘ₁₀，ｐｉｘ₁₁が、それぞれ図２に示すメモリ２１１のバンク２１１Ｃ，２１１Ｄおよびメモリ２１２のバンク２１２Ｃ，２１２Ｄに記憶される。また、カラーデータｐｉｘ₄，ｐｉｘ₅，ｐｉｘ₁₂，ｐｉｘ₁₃が、それぞれ図２に示すメモリ２２１のバンク２２１Ｃ，２２１Ｄおよびメモリ２２２のバンク２２２Ｃ，２２２Ｄに記憶される。さらに、カラーデータｐｉｘ₆，ｐｉｘ₇，ｐｉｘ₁₄，ｐｉｘ₁₅が、それぞれ図２に示すメモリ２３１のバンク２３１Ｃ，２３１Ｄおよびメモリ２３２のバンク２３２Ｃ，２３２Ｄに記憶される。
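The assignment of the 16 simultaneously accessed color data to SRAM banks listed above follows a regular pattern, which the sketch below reproduces. The function name `sram_bank_for_pixel` and the arithmetic used to derive the bank label are assumptions chosen to match the listed assignment; the specification only gives the resulting table.

```python
MODULES = (200, 210, 220, 230)  # memory module reference numerals from FIG. 2


def sram_bank_for_pixel(i):
    """Return the SRAM bank label (e.g. '201C') that holds color data pix_i."""
    if not 0 <= i <= 15:
        raise ValueError("pixel index must be in 0..15")
    module = MODULES[(i % 8) // 2]   # pairs of columns share one module
    memory = module + 1 + i // 8     # first row -> x01, second row -> x02
    bank = "CD"[i % 2]               # even column -> bank C, odd column -> bank D
    return f"{memory}{bank}"


for i in range(16):
    print(f"pix{i:<2d} -> bank {sram_bank_for_pixel(i)}")
```

Because every index maps to a different bank, all 16 color data of one unit block can be accessed in the same cycle.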
【００３９】
本実施形態では、同時に処理される矩形領域内に位置する画素のカラーデータｐｉｘ₀〜ｐｉｘ₁₅の組を単位ブロックＲ_iと呼び、例えば、１枚のイメージを示すテクスチャデータは、図５に示すように、Ｂ×Ａのマトリクス状に配置された単位ブロックＲ₀〜Ｒ_BA-1からなる。
単位ブロックＲ₀〜Ｒ_BA-1は、図６に示すように、１次元のアドレス空間で連続したアドレスを持つように、テクスチャバッファ１４７ａを構成するＤＲＡＭ１４７に記憶されている。また、各単位ブロックＲ₀〜Ｒ_BA-1内のカラーデータｐｉｘ₀〜ｐｉｘ₁₅は、１次元のアドレス空間内で連続したアドレスを持つように、ＳＲＡＭ１４８の相互に異なるバンクに記憶される。
すなわち、テクスチャバッファ１４７ａには、同時にアクセスが行われるカラーデータからなる単位ブロックが、一次元のアドレス空間で連続したアドレスを持つように記憶される。
【００４０】
ホストインタフェース回路１４９
ホストインタフェース回路１４９は、レンダリング回路１４の外部でメインバス１５に接続され、レンダリング回路１４の内部でＤＤＡセットアップ回路１４１およびメモリＩ／Ｆ回路１４４に接続されている。
As described later, the host interface circuit 149 serves as the interface when the memory I/F circuit 144 communicates with the main processor 11 and the I/O interface circuit 13.
【００４１】
ＤＤＡセットアップ回路１４１
ＤＤＡセットアップ回路１４１は、後段のトライアングルＤＤＡ回路１４２において物理座標系上の三角形の各頂点の値を線形補間して、三角形の内部の各画素の色と深さ情報を求めるに先立ち、ポリゴンレンダリングデータＳ１１が示す（ｚ，Ｒ，Ｇ，Ｂ，ｓ，ｔ，ｑ）データについて、三角形の辺と水平方向の差分などを求めるセットアップ演算を行う。
このセットアップ演算は、具体的には、開始点の値と終点の値と、開始点と終点との距離を用いて、単位長さ移動した場合における、求めようとしている値の変分を算出する。
ＤＤＡセットアップ回路１４１は、算出した変分データＳ１４１をトライアングルＤＤＡ回路１４２に出力する。
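The setup operation and the subsequent linear interpolation can be sketched as follows. This is a one-dimensional illustration under simplifying assumptions: the names `dda_setup` and `dda_interpolate` are hypothetical, and the actual circuit handles the (z, R, G, B, s, t, q) values along triangle edges and the horizontal direction.

```python
def dda_setup(start, end, distance):
    """Variation of a value per unit length between a start and an end point."""
    if distance == 0:
        raise ValueError("start and end points must be distinct")
    return (end - start) / distance


def dda_interpolate(start, step, n):
    """Values obtained by stepping n unit lengths from the start point."""
    return [start + step * i for i in range(n + 1)]


# Example: a value that goes from 10.0 to 20.0 over a distance of 4 pixels.
step = dda_setup(10.0, 20.0, 4)
print(step)                            # 2.5
print(dda_interpolate(10.0, step, 4))  # [10.0, 12.5, 15.0, 17.5, 20.0]
```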
【００４２】
トライアングルＤＤＡ回路１４２
トライアングルＤＤＡ回路１４２は、ＤＤＡセットアップ回路１４１から入力した変分データＳ１４１を用いて、三角形内部の各画素における線形補間された（ｚ，Ｒ，Ｇ，Ｂ，ｓ，ｔ，ｑ）データを算出する。
The triangle DDA circuit 142 outputs the (x, y) data of each pixel and the (z, R, G, B, s, t, q) data at those (x, y) coordinates to the texture engine circuit 143 as DDA data (interpolated data) S142.
例えば、トライアングルＤＤＡ回路１４２は、並行して処理を行う矩形内に位置する８（＝２×４）画素分のＤＤＡデータＳ１４２をテクスチャエンジン回路１４３に出力する。
【００４３】
テクスチャエンジン回路１４３
テクスチャエンジン回路１４３は、「ｓ／ｑ」および「ｔ／ｑ」の算出処理、テクスチャ座標データ（ｕ，ｖ）の算出処理、テクスチャバッファ１４７ａからの（Ｒ，Ｇ，Ｂ）データの読み出し処理等をパイプライン方式で行う。
なお、テクスチャエンジン回路１４３は、例えば所定の矩形内に位置する８画素についての処理を同時に並行して行う。
【００４４】
テクスチャエンジン回路１４３は、ＤＤＡデータＳ１４２が示す（ｓ，ｔ，ｑ）データについて、ｓデータをｑデータで除算する演算と、ｔデータをｑデータで除算する演算とを行う。
テクスチャエンジン回路１４３には、例えば図示しない除算回路が８個設けられており、８画素についての除算「ｓ／ｑ」および「ｔ／ｑ」が同時に行われる。
【００４５】
また、テクスチャエンジン回路１４３は、除算結果である「ｓ／ｑ」および「ｔ／ｑ」に、それぞれテクスチャサイズＵＳＩＺＥおよびＶＳＩＺＥを乗じて、テクスチャ座標データ（ｕ，ｖ）を生成する。
また、テクスチャエンジン回路１４３は、メモリＩ／Ｆ回路１４４を介して、ＳＲＡＭ１４８あるいはＤＲＡＭ１４７に、生成したテクスチャ座標データ（ｕ，ｖ）を含む読み出し要求を出力し、メモリＩ／Ｆ回路１４４を介して、ＳＲＡＭ１４８あるいはテクスチャバッファ１４７ａに記憶されているテクスチャデータを読み出すことで、（ｓ，ｔ）データに対応したテクスチャアドレスに記憶された（Ｒ，Ｇ，Ｂ）データＳ１４８を得る。
Here, as described above, the SRAM 148 stores the texture data held in the texture buffer 147a.
テクスチャエンジン回路１４３は、読み出した（Ｒ，Ｇ，Ｂ）データＳ１４８の（Ｒ，Ｇ，Ｂ）データと、前段のトライアングルＤＤＡ回路１４２からのＤＤＡデータＳ１４２に含まれる（Ｒ，Ｇ，Ｂ）データとを、それぞれ掛け合わせるなどして、カラーデータＳ１４３を生成する。
テクスチャエンジン回路１４３は、このカラーデータＳ１４３をメモリＩ／Ｆ回路１４４に出力する。
【００４６】
なお、テクスチャバッファ１４７ａには、ＭＩＰＭＡＰ（複数解像度テクスチャ）などの複数の縮小率に対応したテクスチャデータが記憶されている。ここで、何れの縮小率のテクスチャデータを用いるかは、所定のアルゴリズムを用いて、前記三角形単位で決定される。
【００４７】
テクスチャエンジン回路１４３は、フルカラー方式の場合には、テクスチャバッファ１４７ａから読み出した（Ｒ，Ｇ，Ｂ）データを直接用いる。
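In the index color mode, on the other hand, the texture engine circuit 143 reads a color lookup table (CLUT) prepared in advance from the texture CLUT buffer 147d, transfers it to and stores it in its built-in SRAM, and uses this color lookup table to obtain the (R, G, B) data corresponding to the color index read from the texture buffer 147a.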
【００４８】
メモリＩ／Ｆ回路１４４
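The memory I/F circuit 144 compares the z data corresponding to the color data S143 input from the texture engine circuit 143 with the z data stored in the z buffer 147c, and determines whether the image drawn by the input color data S143 lies in front of (closer to the viewpoint than) the image previously written to the display buffer 147b. If it lies in front, the circuit updates the z data stored in the z buffer 147c with the z data corresponding to the color data S143.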
また、メモリＩ／Ｆ回路１４４は、（Ｒ，Ｇ，Ｂ）データをディスプレイバッファ１４７ｂに書き込む。
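The depth comparison described above can be sketched per pixel as follows. Two points are assumptions of this sketch rather than statements of the specification: a smaller z value is taken to mean "closer to the viewpoint", and the (R, G, B) data are written only when the comparison succeeds. The function name `z_test_and_write` is hypothetical.

```python
def z_test_and_write(x, y, z, rgb, zbuf, display):
    """Write rgb at (x, y) only if z is nearer than the stored depth.

    Assumes that a smaller z is closer to the viewpoint.
    Returns True when the buffers were updated.
    """
    if z < zbuf[y][x]:
        zbuf[y][x] = z          # update the z buffer
        display[y][x] = rgb     # update the display buffer
        return True
    return False


# A 2 x 2 buffer pair, initialised to "infinitely far" and black.
zbuf = [[float("inf")] * 2 for _ in range(2)]
display = [[(0, 0, 0)] * 2 for _ in range(2)]
print(z_test_and_write(1, 0, 0.5, (255, 0, 0), zbuf, display))  # True
print(z_test_and_write(1, 0, 0.8, (0, 255, 0), zbuf, display))  # False
```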
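Furthermore, when the memory I/F circuit 144 receives from the texture engine circuit 143 a read request for the SRAM 148 containing the generated texture coordinate data (u, v), it reads the (R, G, B) data S148 stored in the SRAM 148.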
また、メモリＩ／Ｆ回路１４４は、ＣＲＴコントロール回路１４５から表示データを読み出す要求を受けた場合には、この要求に応じて、ディスプレイバッファ１４７ｂから一定の固まり、例えば８画素あるいは１６画素単位で表示データを読み出す。
【００４９】
また、メモリＩ／Ｆ回路１４４は、ＤＲＡＭ１４７から読み出したデータをホストインタフェース回路１４９に出力する。
【００５０】
メモリＩ／Ｆ回路１４４は、ＤＲＡＭ１４７およびＳＲＡＭ１４８へのアクセス（書き込みまたは読み出し）を行うが、書き込み経路と読み出し経路とが別経路として構成されている。
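That is, for a write, the write address ADRW and the color data cwd are processed by the write-path circuits and written to the DRAM 147; for a read, the request is processed by the read-path circuits and the data are read from the DRAM 147 or the SRAM 148.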
そして、メモリＩ／Ｆ回路１４４は、所定のインターリーブ方式のアドレッシングに基づいてＤＲＡＭ１４７へのアクセスを、書き込みを１６画素単位、読み出しを８画素単位で行う。
【００５１】
A specific configuration example of the memory I/F circuit 144 is described below with reference to FIG. 2.
【００５２】
メモリＩ／Ｆ回路１４４は、図２に示すように、ディストリビュータ３００、アドレスコンバータ３１０，３２０，３３０，３４０、メモリコントローラ３５０，３６０，３７０，３８０、読み出しコントローラ３９０およびデータ並べ替え部４００を有する。
【００５３】
〔ディストリビュータ３００〕
ディストリビュータ３００は、書き込み時に、データ並べ替え部４００から８画素分のカラーデータｃｗｄおよび書き込みアドレスＡＤＲＷ₂を入力し、これらから１６画素分のカラーデータを生成した後に、各々４画素分のデータからなる４つの画像データＳ３０１，Ｓ３０２，Ｓ３０３，Ｓ３０４に分割し、その画像データおよび書き込みアドレスをそれぞれアドレスコンバータ３１０，３２０，３３０，３４０に出力する。
ここで、１画素分の（Ｒ，Ｇ，Ｂ）データは各々３２ビット、ｚデータは３２ビットからなる。
【００５４】
〔アドレスコンバータ３１０，３２０，３３０，３４０〕
アドレスコンバータ３１０，３２０，３３０，３４０は、書き込み時に、ディストリビュータ３００から入力した（Ｒ，Ｇ，Ｂ）データおよびｚデータに対応したアドレスを、それぞれメモリモジュール２００，２１０，２２０，２３０内のアドレスに変換し、それぞれ変換したアドレスＳ３１０，Ｓ３２０，Ｓ３３０，Ｓ３４０と分割された画像データをメモリコントローラ３５０，３６０，３７０，３８０に出力する。
【００５５】
図７は、このディストリビュータ３００の画像データ処理（ピクセル処理）を模式的に示す図である。
この図は、前述した図３〜図６に対応するものであり、ディストリビュータ３００は、ＤＲＡＭ１４７に対して、例えば２×８画素の１６画素についてのデータが同時にアクセス可能になるように画像データ処理を行う。
そして、読み出し動作では、Ｘ方向について８の倍数のバウンダリ、かつＹ方向について任意のバウンダリでの領域でアクセスを行い、書き込み動作では、Ｘ方向およびＹ方向について偶数のバウンダリの領域でアクセスを行うようなアドレッシングとなるように画像データの処理を行う。
これによりＤＲＡＭ１４７は、アクセスの先頭がメモリセル番号ＭＣＮ「１」，「２」，「３」にはならず、必ずメモリセル番号ＭＣＮ「０」となり、ページ違反の発生等が防止される。
The distributor 300 also processes the image data so that, across the DRAM modules 200 to 230, color data for portions that are adjacent in the display area are placed in different DRAM modules.
これにより、三角形のような平面を描画する場合には面で同時に処理できることになるため、それぞれのＤＲＡＭモジュールの動作確率は非常に高くなっている。
【００５６】
〔メモリコントローラ３５０，３６０，３７０，３８０〕
メモリコントローラ３５０，３６０，３７０，３８０は、それぞれ書き込み系配線群４０１Ｗ，４０２Ｗ，４１１Ｗ，４１２Ｗ，４２１Ｗ，４２２Ｗ，４３１Ｗ，４３２Ｗ、並びに読み出し系配線群４０１Ｒ，４０２Ｒ，４１１Ｒ，４１２Ｒ，４２１Ｒ，４２２Ｒ，４３１Ｒ，４３２Ｒを介してメモリモジュール２００，２１０，２２０，２３０に接続されており、書き込み時および読み出し時にメモリモジュール２００，２１０，２２０，２３０に対してのアクセスを制御する。
【００５７】
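Specifically, at write time the memory controllers 350, 360, 370, and 380 simultaneously write the (R, G, B) data and z data for four pixels each, which are output from the distributor 300 and input via the address converters 310, 320, 330, and 340, to the memory modules 200, 210, 220, and 230 through the write wiring groups 401W, 402W, 411W, 412W, 421W, 422W, 431W, and 432W.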
このとき、例えば、メモリモジュール２００では、前述したように、バンク２０１Ａ，２０１Ｂ，２０２Ａ，２０２Ｂの各々に、１画素分の（Ｒ，Ｇ，Ｂ）データおよびｚデータが記憶される。メモリモジュール２１０，２２０，２３０についても同じである。
【００５８】
また、各メモリコントローラ３５０，３６０，３７０，３８０は、自身のステートマシンがいわゆるアイドル（ＩＤＬＥ）状態にあるときに、アイドル信号Ｓ３５０，Ｓ３６０，Ｓ３７０，Ｓ３８０を読み出しコントローラ３９０にアクティブで出力し、このアイドル信号Ｓ３５０，Ｓ３６０，Ｓ３７０，Ｓ３８０に応答した読み出しコントローラ３９０による読み出しアドレスおよび読み出し要求信号Ｓ３９１を受けて、読み出し系配線群４０１Ｒ，４０２Ｒ，４１１Ｒ，４１２Ｒ，４２１Ｒ，４２２Ｒ，４３１Ｒ，４３２Ｒを介してデータの読み出しを行い、読み出し系配線群３５１，３６１，３７１，３８１、並びに配線群４４０を介して読み出しコントローラ３９０に出力する。
【００５９】
なお、本実施形態では、書き込み系配線群４０１Ｗ，４０２Ｗ，４１１Ｗ，４１２Ｗ，４２１Ｗ，４２２Ｗ，４３１Ｗ，４３２Ｗ、並びに読み出し系配線群４０１Ｒ，４０２Ｒ，４１１Ｒ，４１２Ｒ，４２１Ｒ，４２２Ｒ，４３１Ｒ，４３２Ｒの配線本数は１２８本（１２８ビット）、読み出し系配線群３５１，３６１，３７１，３８１の配線本数は２５６本（２５６ビット）、ならびに読み出し系配線群４４０の配線本数は１０２４本（１０２４ビット）である。
【００６０】
〔読み出しコントローラ３９０〕
読み出しコントローラ３９０は、アドレスコンバータ３９１およびデータ演算処理部３９２により構成されている。
アドレスコンバータ３９１は、読み出しアドレスＡＤＲＲ₂を受けた場合、メモリコントローラ３５０，３６０，３７０，３８０からのアイドル信号Ｓ３５０，Ｓ３６０，Ｓ３７０，Ｓ３８０をすべてアクティブで受けると、このアイドル信号Ｓ３５０，Ｓ３６０，Ｓ３７０，Ｓ３８０に応答して、８画素単位で読み出しを行うように、読み出しアドレスおよび読み出し要求信号Ｓ３９１を各メモリコントローラ３５０，３６０，３７０，３８０に出力する。
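The data arithmetic processing unit 392 receives, via the wiring group 440, the texture data, (R, G, B) data, z data, and texture color lookup table data read in units of 8 or 16 pixels by the memory controllers 350, 360, 370, and 380 in response to the read address and read request signal S391, performs predetermined arithmetic processing, and, after the data rearrangement unit 400 has rearranged the data, outputs the result to the requester, for example the host interface circuit 149.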
また、ＤＲＡＭ１４７の記憶領域内において一の矩形領域からデータを読み出して他の矩形領域に書き込む場合にも、読み出しコントローラ３９０、データ並べ替え部４００およびディストリビュータ３００を介してデータ転送が行われる。
【００６１】
読み出しコントローラ３９０は、上述したように、メモリコントローラ３５０，３６０，３７０，３８０のすべてがアイドル状態にあるときに、読み出しアドレスおよび読み出し要求信号Ｓ３９１をメモリコントローラ３５０，３６０，３７０，３８０に出力して読み出しデータを受けることから、読み出すデータの同期をとることができる。
したがって、読み出しコントローラ３９０は、データを一時的に保持するＦＩＦＯ(First In First Out)回路等の保持回路を設ける必要がなく、回路規模の縮小化が図られている。
【００６２】
〔データ並べ替え部４００〕
The data rearrangement unit 400 has three transfer modes, "Host to Local", "Local to Host", and "Local to Local", and rearranges data according to each transfer mode.
データ並べ替え部４００は、例えば、テクスチャエンジン回路１４３から入力した転送モード指示信号および画素データ長指示信号に基づいて、転送モードおよび１画素のデータ長を判断する。
なお、１画素のデータ長としては、例えば、３２ビット、１６ビット、８ビットおよび４ビットがある。
ここで、１画素のデータ長３２ビットおよび１６ビットは、テクスチャデータなどに用いられ、１画素のデータ長８ビットおよび４ビットは、インデックスデータなどに用いられる。
【００６３】
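In the "Host to Local" transfer mode, the data rearrangement unit 400 generates, from the 64-bit-wide transfer data hwd (color data S143) and the write address ADRW₁ input from the texture engine circuit 143, 256-bit-wide color data cwd and an address ADRW₂ containing the representative point coordinates (btrx, btry) and the valid flag btrvld, and outputs these to the distributor 300.
【００６４】
In the "Local to Host" transfer mode, the data rearrangement unit 400 generates an address ADRR₂ corresponding to the address ADRR₁ input from the host interface circuit 149, packs the 256 bits of color data mc_0dtr to mc_7dtr for eight pixels, input from the read controller 390 as a result of the read operation performed on the basis of that address ADRR₂, into 64-bit-wide transfer data hrd, and outputs this to the host interface circuit 149.
【００６５】
Furthermore, in the "Local to Local" transfer mode, when data are read from one rectangular storage area in the DRAM 147 and transferred (written) to another rectangular storage area, the data rearrangement unit 400 receives the 256 bits of color data mc_0dtr to mc_7dtr for eight pixels from the read controller 390, shifts them to match the origin coordinates of the destination rectangular storage area to generate color data cwd, and outputs that color data cwd, together with an address ADRW₂ containing the representative point coordinates (btrx, btry) and the valid flag btrvld used in the destination write operation, to the distributor 300.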
【００６６】
図８は、図２に示すデータ並べ替え部４００の構成図である。
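As shown in FIG. 8, the data rearrangement unit 400 has FIFO (First In First Out) circuits 410₁ to 410₃, a latch circuit 410₄, a data rearrangement circuit 420, an address generation unit 430, and a controller 440.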
【００６７】
ＦＩＦＯ回路４１０₁は、例えば、６４ビット幅で深さ３段のＦＩＦＯ回路であり、コントローラ４４０からの制御信号Ｓ４４０₁に基づいて、テクスチャエンジン回路１４３から入力した６４ビット幅の転送データｈｗｄをＦＩＦＯ方式でデータ並べ替え回路４２０に出力する。
ＦＩＦＯ回路４１０₂は、例えば、６４ビット幅で深さ３段のＦＩＦＯ回路であり、コントローラ４４０からの制御信号Ｓ４４０₂に基づいて、データ並べ替え回路４２０から入力した６４ビット幅の転送データｈｒｄをＦＩＦＯ方式でホストインタフェース回路１４９に出力する。
【００６８】
ＦＩＦＯ回路４１０₃は、例えば、６４ビット幅で深さ１６段のＦＩＦＯ回路であり、コントローラ４４０からの制御信号Ｓ４４０₃に基づいて、読み出しコントローラ３９０から入力した各々３２ビットの８画素分のカラーデータｍｃ＿０ｄｔｒ〜ｍｃ＿７ｄｔｒをＦＩＦＯ方式でデータ並べ替え回路４２０に出力する。
ＦＩＦＯ回路４１０₃へのカラーデータｍｃ＿０ｄｔｒ〜ｍｃ＿７ｄｔｒの入力は、後述するように、コントローラ４４０によって制御される。
ラッチ回路４１０₄は、例えば、２５６ビット幅であり、コントローラ４４０からの制御信号Ｓ４４０₄に基づいて、データ並べ替え回路４２０から入力した２５６ビット幅のカラーデータｃｗｄ＿ａを図２に示すディストリビュータ３００に出力する。
【００６９】
図９は、図８に示すデータ並べ替え回路４２０の構成図である。
図９に示すように、データ並べ替え回路４２０は、シフト回路４５０₁，４５０₂、セレクタ４５１₀〜４５１₇、ラッチ回路４５２₀〜４５２₇およびラッチ回路４５３₀〜４５３₇を有する。
【００７０】
シフト回路４５０₁は、「ＨｏｓｔｔｏＬｏｃａｌ」転送を行う場合に、コントローラ４４０から入力した制御信号Ｓ４４０₆に基づいて、６４ビット幅の転送データｈｗｄ、すなわち２画素分のカラーデータをＬＳＢ側に配置した２５６ビット（８画素）分のカラーデータをテクスチャエンジン回路１４３から入力し、当該カラーデータを３２ビット単位でシフト処理して生成した２５６ビットのカラーデータｍｃ＿０ａ〜ｍｃ＿７ａを出力する。
【００７１】
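On the basis of the control signal S440₆, the selectors 451₀ to 451₇ each select either the 32-bit color data mc_0a to mc_7a input via input terminal a or the 32-bit color data mc_0b to mc_7b input via input terminal b, and output the selected data from output terminal c to the latch circuits 452₀ to 452₇.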
【００７２】
ラッチ回路４５２₀〜４５２₇は、制御信号Ｓ４４０₆に基づいて、それぞれセレクタ４５１₀〜４５１₇の出力端子ｃから出力した３２ビットのカラーデータを、図８に示すＦＩＦＯ回路４１０₂あるいはラッチ回路４１０₄に出力する。
ラッチ回路４５３₀〜４５３₇は、制御信号Ｓ４４０₆に基づいて、それぞれシフト回路４５０₁から入力したカラーデータｍｃ＿０ａ〜ｍｃ＿７ａを、シフト回路４５０₂と、図８に示すＦＩＦＯ回路４１０₂あるいはラッチ回路４１０₄に出力する。
ここで、ラッチ回路４５２₀〜４５２₇の各々の３２ビットの出力線は、それぞれラッチ回路４５３₀〜４５３₇の各々の３２ビットの出力線と同じであり、それぞれラッチ回路４５２₀〜４５２₇とラッチ回路４５３₀〜４５３₇とのうち一方から３２ビットのカラーデータが出力されるように制御される。
【００７３】
シフト回路４５０₂は、ラッチ回路４５３₀〜４５３₇から入力した２５６ビットのカラーデータを３２ビット単位でシフト処理して生成した各々３２ビットのカラーデータｍｃ＿０ｂ〜ｍｃ＿７ｂを、それぞれセレクタ４５１₀〜４５１₇の入力端子ｂに出力する。
【００７４】
以下、図８に示すデータ並べ替え部４００におけるデータ並べ替え処理とアドレス生成部４３０におけるアドレス生成処理とを、各転送モードに分けて説明する。
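As described above, writing to the DRAM 147 is performed in units of a rectangular storage area that holds color data for 16 pixels, 8 pixels (X direction) × 2 pixels (Y direction), referenced to a representative point. Reading from the DRAM 147, on the other hand, is performed in units of a rectangular storage area that holds color data for 8 pixels, 8 pixels (X direction) × 1 pixel (Y direction), referenced to a representative point.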
【００７５】
また、代表点のアドレスは、読み出し動作では、Ｙ方向については任意のアドレスを指定できるが、Ｘ方向について８の倍数のアドレスしか指定できない。一方、代表点のアドレスは、書き込み動作では、Ｘ方向およびＹ方向の双方とも、２の倍数のアドレスしか指定できない。
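The two addressing constraints can be expressed as a small sketch that rounds an arbitrary pixel coordinate down to a permitted representative point. The function names are hypothetical; the constraints themselves (multiples of 2 in both X and Y for writes, multiples of 8 in X and any Y for reads) are the ones stated above.

```python
def write_representative(x, y):
    """Nearest permitted write representative point at or before (x, y).

    Writes may only specify multiples of 2 in both X and Y.
    """
    return (x - x % 2, y - y % 2)


def read_representative(x, y):
    """Nearest permitted read representative point at or before (x, y).

    Reads may specify any Y but only multiples of 8 in X.
    """
    return (x - x % 8, y)


# The origin (5, 5) used in the examples that follow is not directly addressable.
print(write_representative(5, 5))  # (4, 4)
print(read_representative(5, 5))   # (0, 5)
```

A pixel that does not sit on a permitted boundary therefore has to be shifted within the accessed block, which is the task of the data rearrangement unit 400.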
【００７６】
以下に示す並べ替え処理では、ＤＲＡＭ１４７の記憶領域における、図１０に示すような、幅「３」、高さ「３」で起点座標が（５，５）の矩形記憶領域５００に対してのカラーデータの読み出し動作および書き込み動作を伴う場合を例示して説明する。
【００７７】
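＜"Host to Local" transfer mode＞
The following describes the operation of transferring and writing the nine pixels of color data pix₀ to pix₈, each 32 bits (a pixel data length of 32 bits), contained in the 64-bit-wide transfer data hwd₀ to hwd₄ shown in FIG. 11 and input from the texture engine circuit 143, to the rectangular storage area 500 shown in FIG. 10.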
【００７８】
この場合には、テクスチャエンジン回路１４３から、転送データｈｗｄが、矩形記憶領域５００の図１０中左上端点からスキャンライン方向にＸ座標が増加する順番で与えられ、１スキャンライン終了後、次のスキャンラインのデータがＸ座標が増加する順番で与えられる。このとき、スキャンラインが異なるデータも６４ビット内にパックされて与えられる。
【００７９】
先ず、テクスチャエンジン回路１４３からの図１１に示す各々６４ビット幅の転送データｈｗｄ₀〜ｈｗｄ₄が、図８に示すＦＩＦＯ回路４１０₁を介して、データ並べ替え回路４２０に順に出力される。
そして、転送データｈｗｄ₀のカラーデータｐｉｘ₀，ｐｉｘ₁をＬＳＢから順に配置した２５６ビットのカラーデータが、図９に示すシフト回路４５０₁において、図１２（Ａ）に示すように、３２ビット単位でＭＳＢに向けて「１」だけシフトされ、当該シフト後のカラーデータｃｗｄ＿ａ₀が、図９に示すラッチ回路４５３₀〜４５３₇から図１３（Ａ）に示すカラーデータｃｗｄ₀として出力され、カラーデータｃｗｄ₀が図８に示すラッチ回路４１０₄を介して図２に示すディストリビュータ３００に出力される。
【００８０】
次に、転送データｈｗｄ₁のカラーデータｐｉｘ₂，ｐｉｘ₃をＬＳＢから順に配置した２５６ビットのカラーデータが、図９に示すシフト回路４５０₁において、図１２（Ｂ）に示すように、３２ビット単位でＭＳＢに向けて「１」だけシフトされ、当該シフト後のカラーデータｃｗｄ＿ａ₁が、図９に示すラッチ回路４５３₀〜４５３₇から図１３（Ｂ）に示すカラーデータｃｗｄ₁として出力され、カラーデータｃｗｄ₁が図８に示すラッチ回路４１０₄を介して図２に示すディストリビュータ３００に出力される。
また、シフト後のカラーデータｃｗｄ＿ａ₁は、セレクタ４５１₀〜４５１₇およびラッチ回路４５２₀〜４５２₇を介して、シフト回路４５０₂に出力される。
【００８１】
次に、転送データｈｗｄ₂のカラーデータｐｉｘ₄，ｐｉｘ₅をＬＳＢから順に配置した２５６ビットのカラーデータが、図９に示すシフト回路４５０₁において、図１２（Ｃ）に示すように、３２ビット単位でＭＳＢに向けて「２」だけシフトされた後に、図９に示すセレクタ４５１₀〜４５１₇を介してラッチ回路４５３₀〜４５３₇に記憶される。
このとき、カラーデータｐｉｘ₄，ｐｉｘ₅が、ラッチ回路４５３₂，４５３₃に記憶される。
また、前述したシフト回路４５０₂に出力された、図１２（Ｂ）に示す２５６ビットのカラーデータｃｗｄ＿ａ₁が、ＬＳＢに向けて「１」だけシフトされた後に、セレクタ４５１₀〜４５１₇を介してラッチ回路４５２₀〜４５２₇に記憶される。このとき、カラーデータｐｉｘ₃が、ラッチ回路４５２₁に記憶される。
そして、ラッチ回路４５３₀，４５２₁，４５３₂〜４５３₇からの出力によって、図１３（Ｃ）に示すように、カラーデータｐｉｘ₃，ｐｉｘ₄，ｐｉｘ₅を含むカラーデータｃｗｄ₂が、図８に示すラッチ回路４１０₄を介して図２に示すディストリビュータ３００に出力される。
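The shift, select, and hold steps that produce the color data cwd₂ can be traced with a sketch that models the 256-bit word as eight 32-bit slots (slot 0 at the LSB). The helper names `shift_slots` and `select` are hypothetical and only mimic the roles of the shift circuits 450₁ and 450₂ and the selectors; the slot positions are those given in the text.

```python
SLOTS = 8  # eight 32-bit pixel slots in one 256-bit word, slot 0 = LSB side


def shift_slots(word, n):
    """Shift pixels by n slots (positive: toward the MSB, negative: toward the LSB)."""
    out = [None] * SLOTS
    for i, p in enumerate(word):
        j = i + n
        if p is not None and 0 <= j < SLOTS:
            out[j] = p
    return out


def select(a, b, use_b):
    """Per-slot selection: take slot i from b if i is in use_b, otherwise from a."""
    return [b[i] if i in use_b else a[i] for i in range(SLOTS)]


# hwd1 (pix2, pix3) placed from the LSB and shifted by 1 toward the MSB.
cwd_a1 = shift_slots(["pix2", "pix3"] + [None] * 6, 1)
# The held word is shifted by 1 toward the LSB, which brings pix3 to slot 1.
held = shift_slots(cwd_a1, -1)
# hwd2 (pix4, pix5) placed from the LSB and shifted by 2 toward the MSB.
new = shift_slots(["pix4", "pix5"] + [None] * 6, 2)
# Slot 1 comes from the held word, the remaining slots from the new word.
cwd2 = select(new, held, use_b={1})
print(cwd2)
```

The resulting word carries pix₃, pix₄, and pix₅ in slots 1 to 3, so that a single write covers one full scan line of the 3-pixel-wide area.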
【００８２】
次に、転送データｈｗｄ₃のカラーデータｐｉｘ₆，ｐｉｘ₇をＬＳＢから順に配置した２５６ビットのカラーデータが、図９に示すシフト回路４５０₁において、図１２（Ｄ）に示すように、３２ビット単位でＭＳＢに向けて「１」だけシフトされ、当該シフト後のカラーデータｃｗｄ＿ａ₃が、図９に示すラッチ回路４５３₀〜４５３₇から図１３（Ｄ）に示すカラーデータｃｗｄ₃として出力され、カラーデータｃｗｄ₃が図８に示すラッチ回路４１０₄を介して図２に示すディストリビュータ３００に出力される。
【００８３】
次に、転送データｈｗｄ₄のカラーデータｐｉｘ₈をＬＳＢから順に配置した２５６ビットのカラーデータが、図９に示すシフト回路４５０₁において、図１２（Ｅ）に示すように、３２ビット単位でＭＳＢに向けて「１」だけシフトされ、当該シフト後のカラーデータｃｗｄ＿ａ₄が、図９に示すラッチ回路４５３₀〜４５３₇から図１３（Ｅ）に示すカラーデータｃｗｄ₄として出力され、カラーデータｃｗｄ₄が図８に示すラッチ回路４１０₄を介して図２に示すディストリビュータ３００に出力される。
【００８４】
また、データ並べ替え部４００では、上述したカラーデータｃｗｄ₀〜ｃｗｄ₄に対応する代表点座標（ｂｔｒｘ，ｂｔｒｙ）およびバリッドフラグｂｔｒｖｌｄをアドレス生成部４３０において生成する。
代表点座標（ｂｔｒｘ，ｂｔｒｙ）およびバリッドフラグｂｔｒｖｌｄは、アドレスＡＤＲＷ₂内に含められ、ディストリビュータ３００に出力される。
ここで、カラーデータｃｗｄ₀〜ｃｗｄ₄に対応する、代表点座標を（ｂｔｒｘ₀，ｂｔｒｙ₀）〜（ｂｔｒｘ₄，ｂｔｒｙ₄）とし、バリッドフラグをｂｔｒｖｌｄ₀〜ｂｔｒｖｌｄ₄とする。
なお、アドレス生成部４３０における、アドレスＡＤＲＷ₂の生成は、図８に示すコントローラ４４０からの制御信号Ｓ４４０₅に基づいて行われる。
【００８５】
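The valid flag btrvld consists of 16 bits and indicates, for each pixel's storage area, whether the stored content is to be rewritten when writing to the DRAM 147 in units of the 16-pixel rectangular storage area (256 × 2 bits) of 8 pixels (X direction) × 2 pixels (Y direction) described above. Each bit of the valid flag btrvld corresponds to the storage area of one pixel within the 16-pixel storage area that is the unit of writing. Each bit of btrvld indicates logical "0" when the corresponding storage area is masked (not rewritten) and logical "1" when it is not masked (rewritten).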
【００８６】
ここで、１回の書き込みで使用されるカラーデータは８画素分の２５６ビットのデータであるため、当該カラーデータを、書き込みを行う記憶領域のうちＹ方向のアドレスが偶数の記憶領域に書き込む場合には、１６ビットのバリッドフラグｂｔｒｖｌｄのうち上位８ビットを論理値「０」にし、書き込みを行う記憶領域のうちＹ方向のアドレスが奇数の記憶領域に書き込む場合には、１６ビットのバリッドフラグｂｔｒｖｌｄのうち下位８ビットを論理値「０」にする。
【００８７】
図１３（Ａ）〜（Ｅ）に示すように、カラーデータｃｗｄ₀〜ｃｗｄ₄のそれぞれについて、バリッドフラグｂｔｒｖｌｄ₀〜ｂｔｒｖｌｄ₄が、アドレス生成部４３０において生成される。
ここで、例えば、バリッドフラグｂｔｒｖｌｄ₀は、下位８ビットが論理値「０」になっているため、カラーデータｃｗｄ₀は、書き込みを行う記憶領域のうちＹ方向のアドレスが奇数の記憶領域に書き込まれる。
【００８８】
また、アドレス生成部４３０は、図１２（Ａ）〜（Ｅ）に示すカラーデータｃｗｄ₀〜ｃｗｄ₄のそれぞれを、前述した８画素（Ｘ方向）×２画素（Ｙ方向）の１６画素分の矩形記憶領域を単位としてＤＲＡＭ１４７に書き込む際の図１３（Ａ）〜（Ｅ）に示す代表点座標（ｂｔｒｘ₀，ｂｔｒｙ₀）〜（ｂｔｒｘ₄，ｂｔｒｙ₄）をそれぞれ生成する。
【００８９】
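The color data cwd₀ to cwd₄ shown in FIGS. 13(A) to 13(E), generated as described above by the data rearrangement circuit 420 (FIG. 8) of the data rearrangement unit 400 (FIG. 2), are written to the DRAM 147 shown in FIG. 2 as follows. The writes pass through the processing of the distributor 300, the address converter 310, and the memory controllers 350, 360, 370, and 380 shown in FIG. 2, which act on the representative point coordinates (btrx₀, btry₀) to (btrx₄, btry₄) and the valid flags btrvld₀ to btrvld₄ shown in FIGS. 13(A) to 13(E); these are contained in the address ADRW₂ generated by the address generation unit 430 shown in FIG. 8.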
【００９０】
先ず、図１４（Ａ）に示すように、代表点座標（ｂｔｒｘ₀，ｂｔｒｙ₀）＝（４，４）によって規定される１６画素分の矩形記憶領域５００内のＹ座標が「５」（奇数）の記憶領域に、当該記憶領域内のＸ座標が最も小さいアドレスに図１３（Ａ）に示すカラーデータｃｗｄ₀のＬＳＢが書き込まれるように、カラーデータｃｗｄ₀が書き込まれる。このとき、前述したように、バリッドフラグｂｔｒｖｌｄ₀の下位８ビットが全て論理値「０」の場合にはＹ座標が奇数の記憶領域にカラーデータｃｗｄ₀が書き込まれ、バリッドフラグｂｔｒｖｌｄ₀の上位８ビットが全て論理値「０」の場合にはＹ座標が偶数の記憶領域にカラーデータｃｗｄ₀が書き込まれる。
また、カラーデータｃｗｄ₀の書き込みは、バリッドフラグｂｔｒｖｌｄ₀のうち、論理値「１」のビットに対応する各々３２ビットの記憶領域に対してのみ行われる。この場合には、バリッドフラグｂｔｒｖｌｄ₀は、「０ｘ０６００」、すなわち「０００００１１０００００００００」であり、論理値「１」のビットは、（Ｘ，Ｙ）座標が（５，５）および（６，５）の記憶領域に対応しているため、図１４（Ａ）に示すように、当該記憶領域に対してのみカラーデータｐｉｘ₀，ｐｉｘ₁が書き込まれる。
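The valid flag of the example above can be reproduced with a short sketch. The bit layout is an assumption inferred from the example: bit k of the lower byte corresponds to X = btrx + k on the even row, and bit k of the upper byte to the same X on the odd row. The function name `valid_flag` is hypothetical.

```python
def valid_flag(btrx, btry, y, xs):
    """Build the 16-bit valid flag for pixels at X coordinates xs on row y.

    (btrx, btry) is the representative point of the 8 x 2 write block.
    Lower 8 bits: even row (y == btry); upper 8 bits: odd row (y == btry + 1).
    """
    row = y - btry
    if row not in (0, 1):
        raise ValueError("row lies outside the 8 x 2 block")
    flag = 0
    for x in xs:
        col = x - btrx
        if not 0 <= col < 8:
            raise ValueError("column lies outside the 8 x 2 block")
        flag |= 1 << (row * 8 + col)
    return flag


# pix0 and pix1 at (5, 5) and (6, 5), representative point (4, 4).
print(hex(valid_flag(4, 4, 5, [5, 6])))  # 0x600
```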
【００９１】
次に、図１４（Ｂ）に示すように、代表点座標（ｂｔｒｘ₁，ｂｔｒｙ₁）＝（６，４）によって規定される１６画素分の矩形記憶領域５００内のＹ座標が「５」（奇数）の記憶領域に、図１３（Ｂ）に示すカラーデータｃｗｄ₁のカラーデータｐｉｘ₂が書き込まれる。書き込み方法は、前述した図１４（Ａ）に示す場合と同じである。
【００９２】
次に、同様に、図１４（Ｃ）に示すように、代表点座標（ｂｔｒｘ₂，ｂｔｒｙ₂）＝（４，６）によって規定される１６画素分の矩形記憶領域５００内のＹ座標が「６」（偶数）の記憶領域に、図１３（Ｃ）に示すカラーデータｃｗｄ₂のカラーデータｐｉｘ₃，ｐｉｘ₄，ｐｉｘ₅が書き込まれる。
【００９３】
次に、同様に、図１５（Ｄ）に示すように、代表点座標（ｂｔｒｘ₃，ｂｔｒｙ₃）＝（４，６）によって規定される１６画素分の矩形記憶領域５００内のＹ座標が「７」（奇数）の記憶領域に、図１３（Ｄ）に示すカラーデータｃｗｄ₃のカラーデータｐｉｘ₆，ｐｉｘ₇が書き込まれる。
【００９４】
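Next, in the same way, as shown in FIG. 15(E), the color data pix₈ of the color data cwd₄ shown in FIG. 13(E) is written to the storage area whose Y coordinate is "7" (odd) within the 16-pixel rectangular storage area 500 defined by the representative point coordinates (btrx₄, btry₄) = (6, 6).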
【００９５】
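As a result, the nine pixels of 32-bit color data pix₀ to pix₈ contained in the 64-bit-wide transfer data hwd₀ to hwd₄ shown in FIG. 11 and input from the texture engine circuit 143 are written to the rectangular storage area 500 shown in FIG. 10.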
【００９６】
なお、上述した図１１〜図１５に示す例では、１画素のデータ長が３２ビットのカラーデータｐｉｘ₀〜ｐｉｘ₈を、テクスチャエンジン回路１４３から入力して図１０に示す矩形記憶領域５００に書き込む場合を例示したが、１画素のデータ長は任意であり、例えば１６ビットであってもよい。
以下、１画素のデータ長が１６ビットのカラーデータｐｉｘ₀〜ｐｉｘ₈を、テクスチャエンジン回路１４３から入力して図１０に示す矩形記憶領域５００に書き込む場合の処理について説明する。
この場合には、図１６に示すように、テクスチャエンジン回路１４３から入力される６４ビット幅の転送データには、各々４画素分のカラーデータが含まれる。
この場合には、図１６に示す転送データｈｗｄ₀に含まれるカラーデータｐｉｘ₀〜ｐｉｘ₃を含む２５６ビットのカラーデータが、図９に示すデータ並べ替え回路４２０のシフト回路４５０₁においてＭＳＢ方向に１ビットだけシフトされ、図１７（Ａ）に示す２５６ビットのカラーデータｃｗｄ＿ａ₀が生成され、当該カラーデータｃｗｄ＿ａ₀に応じてラッチ回路４５３₀〜４５３₇から、図１８（Ａ）に示すカラーデータｃｗｄ₀がディストリビュータ３００に出力される。
次に、図１６に示す転送データｈｗｄ₁に含まれるカラーデータｐｉｘ₄〜ｐｉｘ₇を含む２５６ビットのカラーデータが、図９に示すデータ並べ替え回路４２０のシフト回路４５０₁においてＭＳＢ方向に２ビットだけシフトされ、さらに、転送データｈｗｄ₀に含まれるカラーデータｐｉｘ₃を追加することで、図１７（Ｂ）に示す２５６ビットのカラーデータｃｗｄ＿ａ₁が生成され、当該カラーデータｃｗｄ＿ａ₁に応じた図１８（Ｂ）に示すカラーデータｃｗｄ₁がディストリビュータ３００に出力される。
【００９７】
次に、図１６に示す転送データｈｗｄ₂に含まれるカラーデータｐｉｘ₈を含む２５６ビットのカラーデータが、図９に示すデータ並べ替え回路４２０のシフト回路４５０₁においてＭＳＢ方向に３ビットだけシフトされ、さらに、転送データｈｗｄ₁に含まれるカラーデータｐｉｘ₆、ｐｉｘ₇を追加することで、図１７（Ｃ）に示す２５６ビットのカラーデータｃｗｄ＿ａ₂が生成され、当該カラーデータｃｗｄ＿ａ₂に応じた図１８（Ｃ）に示すカラーデータｃｗｄ₂がディストリビュータ３００に出力される。
【００９８】
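In addition, the representative point coordinates (btrx, btry) and the valid flags btrvld shown in FIGS. 18(A) to 18(C), which correspond to the color data cwd₀ to cwd₂, are generated by the address generation unit 430.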
【００９９】
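As a result, the nine pixels of 16-bit color data pix₀ to pix₈ contained in the 64-bit-wide transfer data hwd₀ to hwd₂ shown in FIG. 16 and input from the texture engine circuit 143 are written to the rectangular storage area 500 shown in FIG. 10.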
【０１００】
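＜"Local to Host" transfer mode＞
The following describes the operation of packing the eight pixels of color data mc_0dtr to mc_7dtr read from the DRAM 147 into 64-bit-wide transfer data and outputting them to the host interface circuit 149.
As described above, reading from the DRAM 147 is performed in units of a rectangular storage area that holds color data for 8 pixels, 8 pixels (X direction) × 1 pixel (Y direction), referenced to a representative point. In a read operation, the representative point address can be any address in the Y direction, but only a multiple of 8 in the X direction.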
【０１０１】
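In this case, the address generation unit 430 of the data rearrangement unit 400 shown in FIG. 8 calculates, on the basis of the address ADRR₁ input from the host interface circuit 149, the representative point coordinates (sbx, sby) for the rectangular storage area in the DRAM 147 to be read, in order of increasing X coordinate along the scan-line direction starting from the upper-left corner in the X-Y coordinate system shown in FIG. 19. It then outputs the address ADRR₂ containing these representative point coordinates (sbx, sby) to the distributor 300.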
【０１０２】
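The data rearrangement circuit 420 also receives the eight pixels of color data mc_0dtr to mc_7dtr read from the DRAM 147 in accordance with the address ADRR₂, from the read controller 390 shown in FIG. 2 via the FIFO circuit 410₃ of the data rearrangement unit 400 shown in FIG. 8, packs them into 64-bit-wide transfer data hrd, and outputs the result to the host interface circuit 149 via the FIFO circuit 410₂.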
このとき、カラーデータｍｃ＿０ｄｔｒが、当該読み出しを行う矩形記憶領域の代表点座標に記憶されていたカラーデータであり、カラーデータｍｃ＿７ｄｔｒが、当該矩形記憶領域内のＸ座標が最大の座標に記憶されていたカラーデータである。
【０１０３】
以下、一例として、図１９に示す幅１３、高さ３、起点座標が（５，５）のＤＲＡＭ１４７内の矩形記憶領域６００から読み出したカラーデータｐｉｘ₀〜ｐｉｘ₃₈を、４画素分のカラーデータを含む６４ビット幅の転送データにパッキングしてホストインタフェース回路１４９に出力する場合の動作を説明する。
【０１０４】
先ず、アドレス生成部４３０に、アドレスＡＤＲＲ₁が、ホストインタフェース回路１４９から入力される。
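Then, to read the rectangular storage area 600 shown in FIG. 19, the address generation unit 430 sequentially outputs the addresses ADRR₂ containing the representative point coordinates (sbx₀, sby₀) to (sbx₈, sby₈) shown in FIG. 19 and FIGS. 20(A) to 20(I) to the distributor 300 shown in FIG. 2. After processing by the address converter 310 and the memory controllers 350, 360, 370, and 380, the color data mc_0dtr to mc_7dtr shown in FIGS. 20(A) to 20(I) are sequentially output from the read controller 390 shown in FIG. 2 to the data rearrangement circuit 420 via the FIFO circuit 410₃ of the data rearrangement unit 400 shown in FIG. 8.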
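The sequence of nine read representative points for this example can be derived from the read constraint (X a multiple of 8, any Y). The sketch below is an illustration under that assumption; the coordinate values it prints are computed from the constraint and the area geometry, not copied from FIG. 19, and the function name is hypothetical.

```python
def read_representatives(x0, y0, width, height):
    """Representative points covering a rectangle, scan line by scan line.

    Each point addresses an 8 (X) x 1 (Y) block whose X is a multiple of 8.
    Points are produced left to right, then top to bottom.
    """
    first = x0 - x0 % 8
    x_last = x0 + width - 1
    last = x_last - x_last % 8
    return [(sbx, y)
            for y in range(y0, y0 + height)
            for sbx in range(first, last + 1, 8)]


points = read_representatives(x0=5, y0=5, width=13, height=3)
print(len(points))  # 9
print(points)
```

Reversing the list gives the right-to-left, bottom-to-top order that a transfer in a specified direction would use.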
【０１０５】
図９に示すデータ並べ替え回路４２０では、コントローラ４４０からの制御信号Ｓ４４０₆に基づいて以下に示す処理が行われる。
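The shift circuit 450₁ shown in FIG. 9 has a 256-bit (= 32 × 8) output storage area connected to the input terminals a of the selectors 451₀ to 451₇ and a 224-bit (= 32 × 7) internal storage area located at the upper bits of that output storage area, and performs shift operations in units of 32 bits.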
【０１０６】
先ず、図２０（Ａ）に示す２５６ビットのカラーデータｍｃ＿０ｄｔｒ〜ｍｃ＿７ｄｔｒがシフト回路４５０₁に入力され、３２ビット単位で、ＬＳＢに向けて「５」だけシフトされた後に、カラーデータｐｉｘ₀〜ｐｉｘ₂が、ラッチ回路４５３₀〜４５３₂に記憶される。
【０１０７】
次に、図２０（Ｂ）に示す２５６ビットのカラーデータｍｃ＿０ｄｔｒ〜ｍｃ＿７ｄｔｒが、シフト回路４５０₁に入力され、３２ビット単位で、ＭＳＢに向けて「３」だけシフトされ、出力用記憶領域に記憶されているカラーデータｐｉｘ₃〜ｐｉｘ₇が、セレクタ４５１₃〜４５１₇の入力端子ａおよび出力端子ｃを介して、ラッチ回路４５２₃〜４５２₇に書き込まれる。
これにより、ラッチ回路４５３₀〜４５３₇に、図２１（Ｂ）の出力用記憶領域に対応する部分に示されるカラーデータｃｗｄ＿ｂ₁が記憶される。
【０１０８】
そして、ラッチ回路４５３₀〜４５３₃に記憶されているカラーデータｐｉｘ₀〜ｐｉｘ₃が、図２２に示す６４ビットの転送データｈｒｄ₀として、図８に示すＦＩＦＯ回路４１０₂を介して、ホストインタフェース回路１４９に出力される。
次に、ラッチ回路４５３₄〜４５３₇に記憶されているカラーデータｐｉｘ₄〜ｐｉｘ₇が、図２２に示す６４ビットの転送データｈｒｄ₁として、図８に示すＦＩＦＯ回路４１０₂を介して、ホストインタフェース回路１４９に出力される。
【０１０９】
また、シフト回路４５０₁の内部記憶領域に記憶されているカラーデータｐｉｘ₈〜ｐｉｘ₁₀が、３２ビット単位で、ＬＳＢに向けて「８」だけシフトされた後に、ラッチ回路４５３₀〜４５３₂に書き込まれる。
次に、図２０（Ｃ）に示すカラーデータｍｃ＿０ｄｔｒ〜ｍｃ＿７ｄｔｒがシフト回路４５０₁に入力され、３２ビット単位で、ＭＳＢに向けて「３」だけシフトされた後に出力され、カラーデータｐｉｘ₁₁，ｐｉｘ₁₂が、ラッチ回路４５３₃，４５３₄に書き込まれる。
次に、図２０（Ｄ）に示すカラーデータｍｃ＿０ｄｔｒ〜ｍｃ＿７ｄｔｒが、シフト回路４５０₁に入力され、シフトされずに出力され、カラーデータｐｉｘ₁₃〜ｐｉｘ₁₅が、ラッチ回路４５３₅〜４５３₇に書き込まれる。
これにより、ラッチ回路４５３₀〜４５３₇に、図２１（Ｄ）の出力用記憶領域に対応する部分に示されるカラーデータｃｗｄ＿ｂ₃が記憶される。
【０１１０】
そして、ラッチ回路４５３₀〜４５３₃に記憶されているカラーデータｐｉｘ₈〜ｐｉｘ₁₁が、図２２に示す６４ビットの転送データｈｒｄ₂として、図８に示すＦＩＦＯ回路４１０₂を介して、ホストインタフェース回路１４９に出力される。
次に、ラッチ回路４５３₄〜４５３₇に記憶されているカラーデータｐｉｘ₁₂〜ｐｉｘ₁₅が、図２２に示す６４ビットの転送データｈｒｄ₃として、図８に示すＦＩＦＯ回路４１０₂を介して、ホストインタフェース回路１４９に出力される。
【０１１１】
また、次に、図２０（Ｅ）に示す２５６ビットのカラーデータｍｃ＿０ｄｔｒ〜ｍｃ＿７ｄｔｒが、シフト回路４５０₁に入力され、シフトされずに出力され、カラーデータｐｉｘ₁₆〜ｐｉｘ₂₃が、ラッチ回路４５３₀〜４５３₇に書き込まれる。
これにより、ラッチ回路４５３₀〜４５３₇に、図２１（Ｅ）の出力用記憶領域に対応する部分に示されるカラーデータｃｗｄ＿ｂ₄が記憶される。
【０１１２】
そして、ラッチ回路４５３₀〜４５３₃に記憶されているカラーデータｐｉｘ₁₆〜ｐｉｘ₁₉が、図２２に示す６４ビットの転送データｈｒｄ₄として、図８に示すＦＩＦＯ回路４１０₂を介して、ホストインタフェース回路１４９に出力される。
次に、ラッチ回路４５３₄〜４５３₇に記憶されているカラーデータｐｉｘ₂₀〜ｐｉｘ₂₃が、図２２に示す６４ビットの転送データｈｒｄ₅として、図８に示すＦＩＦＯ回路４１０₂を介して、ホストインタフェース回路１４９に出力される。
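[0113]
Next, the 256-bit color data mc_0dtr to mc_7dtr shown in FIG. 20(F) is input to the shift circuit 450₁ and, without being shifted, the color data pix₂₄ and pix₂₅ is stored in the latch circuits 453₀ and 453₁.
[0114]
Next, the 256-bit color data mc_0dtr to mc_7dtr shown in FIG. 20(G) is input to the shift circuit 450₁ and shifted by 3 toward the LSB in units of 32 bits, and the color data pix₂₆ to pix₂₈ is written to the latch circuits 453₂ to 453₄.
[0115]
Next, the 256-bit color data mc_0dtr to mc_7dtr shown in FIG. 20(H) is input to the shift circuit 450₁ and shifted by 5 toward the MSB in units of 32 bits. Of the color data pix₂₉ to pix₃₆, the color data pix₂₉ to pix₃₁ held in the output storage area is written to the latch circuits 453₅ to 453₇.
As a result, the color data cwd_b₇ shown in the portion of FIG. 21(H) corresponding to the output storage area is stored in the latch circuits 453₀ to 453₇.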
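[0116]
Then, the color data pix₂₄ to pix₂₇ stored in the latch circuits 453₀ to 453₃ is output to the host interface circuit 149, via the FIFO circuit 410₂ shown in FIG. 8, as the 64-bit transfer data hrd₆ shown in FIG. 22.
Next, the color data pix₂₈ to pix₃₁ stored in the latch circuits 453₄ to 453₇ is output to the host interface circuit 149, via the FIFO circuit 410₂ shown in FIG. 8, as the 64-bit transfer data hrd₇ shown in FIG. 22.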
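[0117]
Next, the color data pix₃₂ to pix₃₆ held in the internal storage area of the shift circuit 450₁ is shifted by 8 toward the LSB in units of 32 bits and then written to the latch circuits 453₀ to 453₄.
Next, the color data mc_0dtr to mc_7dtr shown in FIG. 20(I) is input to the shift circuit 450₁, shifted by 5 toward the MSB in units of 32 bits, and output, and the color data pix₃₇ and pix₃₈ is written to the latch circuits 453₅ and 453₆.
As a result, the color data cwd_b₈ shown in FIG. 21(I) is stored in the latch circuits 453₀ to 453₆.
[0118]
Then, the color data pix₃₂ to pix₃₅ stored in the latch circuits 453₀ to 453₃ is output to the host interface circuit 149, via the FIFO circuit 410₂ shown in FIG. 8, as the 64-bit transfer data hrd₈ shown in FIG. 22.
Next, the color data pix₃₆ to pix₃₈ stored in the latch circuits 453₄ to 453₆ is output to the host interface circuit 149, via the FIFO circuit 410₂ shown in FIG. 8, as the 64-bit transfer data hrd₉ shown in FIG. 22.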
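[0119]
As described above, the color data pix₀ to pix₃₈ read from the rectangular storage area 600 in the DRAM 147 shown in FIG. 19, which has a width of 13, a height of 3, and an origin at (5, 5), is packed as shown in FIG. 22 into the 64-bit-wide transfer data hrd₀ to hrd₉, each containing the color data of four pixels, and output to the host interface circuit 149.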
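The shift amounts in the walk-through above follow a regular pattern, which the following minimal Python sketch reproduces. It assumes, as the walk-through suggests but does not state as a rule, that each read returns one row of 8 pixels starting at an X address that is a multiple of 8, and that pixels are packed into latches 0 to 7 in raster order. The function names `plan_reads` and `pack_for_host` are hypothetical and do not appear in the source.

```python
def plan_reads(x0, y0, width, height, lanes=8):
    """List (row, block_start_x, shift) for each aligned read covering the rectangle.

    shift > 0 means a shift toward the MSB, shift < 0 toward the LSB,
    counted in pixel lanes (32-bit units in the walk-through).
    """
    plan = []
    pix = 0  # raster-order index of the next pixel to be packed
    for y in range(y0, y0 + height):
        first_block = (x0 // lanes) * lanes  # assumed X alignment of reads
        for bx in range(first_block, x0 + width, lanes):
            lo = max(bx, x0)                  # first valid X in this read
            hi = min(bx + lanes, x0 + width)  # one past the last valid X
            src_lane = lo - bx                # lane holding the first valid pixel
            dst_lane = pix % lanes            # latch that pixel has to reach
            plan.append((y, bx, dst_lane - src_lane))
            pix += hi - lo
    return plan


def pack_for_host(n_pixels, per_word=4):
    """Group pixel indices into transfer words of up to per_word pixels."""
    return [list(range(i, min(i + per_word, n_pixels)))
            for i in range(0, n_pixels, per_word)]


# Rectangle from the text: origin (5, 5), width 13, height 3.
shifts = [s for _, _, s in plan_reads(5, 5, 13, 3)]
print(shifts)                  # [-5, 3, 3, 0, 0, 0, -3, 5, 5]
print(len(pack_for_host(39)))  # 10 transfer words, hrd0 to hrd9
```

The nine shift values correspond to the reads of FIGS. 20(A) to 20(I): 5 toward the LSB, 3 toward the MSB twice, three reads with no shift, 3 toward the LSB, and 5 toward the MSB twice. The sketch models only the lane arithmetic, not the internal storage area or the latch timing.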
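[0120]
＜"LocaltoLocal" transfer mode＞
In this mode, the color data mc_0dtr to mc_7dtr of eight pixels, read from the transfer-source storage area in the DRAM 147 with the representative point coordinates (sbx, sby) as a reference, is written to the write-destination (transfer-destination) storage area with the representative point coordinates (btrx, btry) as a reference.
Unlike the "HosttoLocal" transfer mode described above, the representative point coordinates of the transfer-source rectangular storage area are not calculated in one fixed direction but in a designated direction.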
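[0121]
The following describes the operation when the representative point coordinates (sbx, sby) are generated from right to left and from bottom to top, and the color data pix₀ to pix₃₈ read from the rectangular storage area 600 of the DRAM 147 shown in FIG. 19 is written to a rectangular storage area with a width of 13, a height of 3, and an origin at (10, 7), with the representative point coordinates (btrx, btry) also generated from right to left and from bottom to top.
[0122]
First, to read the rectangular storage area 600 shown in FIG. 19, the address generation unit 430 shown in FIG. 8 sequentially outputs to the distributor 300 shown in FIG. 2 the address ADRR₂ containing the representative point coordinates (sbx₈, sby₈) to (sbx₀, sby₀) shown in FIGS. 23(A) to 23(I). After processing by the address converter 310 and the memory controllers 350, 360, 370, and 380, the color data mc_0dtr to mc_7dtr shown in FIGS. 23(A) to 23(I) is sequentially output from the read controller 390 shown in FIG. 2 to the data rearrangement circuit 420 via the FIFO circuit 410₃ of the data rearrangement unit 400 shown in FIG. 8.
That is, the representative point coordinates are output to the distributor 300 in the reverse order compared with the case shown in FIG. 20.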
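[0123]
Next, in the data rearrangement circuit 420 shown in FIG. 9, the following processing is performed based on the control signal S440₆ from the controller 440.
First, the color data mc_0dtr to mc_7dtr shown in FIG. 23(A) is shifted by 1 toward the MSB in units of 32 bits in the shift circuit 450₁ shown in FIG. 9. The shifted color data is output from the latch circuits 453₀ to 453₇ shown in FIG. 9 as the color data cwd₀ shown in FIG. 24(A), and cwd₀ is output to the distributor 300 shown in FIG. 2 via the latch circuit 410₄ shown in FIG. 8.
[0124]
Next, the color data mc_0dtr to mc_7dtr shown in FIG. 23(B) passes through the shift circuit 450₁ shown in FIG. 9 without being shifted and is written to the latch circuits 453₀ to 453₇. It is then output to the shift circuit 450₂, where it is shifted by 1 toward the LSB in units of 32 bits, and the shifted color data is written to the latch circuits 452₀ to 452₇ via the selectors 451₀ to 451₇.
Then, the color data cwd₁ shown in FIG. 24(B) is output from the latch circuits 452₀ to 452₇ to the distributor 300 via the latch circuit 410₄ shown in FIG. 8.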
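[0125]
Next, the color data shown in FIG. 23(B) stored in the latch circuits 453₀ to 453₇ is output once more to the shift circuit 450₂, where it is shifted by 3 toward the MSB in units of 32 bits, and the shifted color data is written to the latch circuits 452₀ to 452₇ via the selectors 451₀ to 451₇. As a result, the color data pix₂₉ is stored in the latch circuit 452₃.
In addition, the color data mc_0dtr to mc_7dtr shown in FIG. 23(C) is shifted by 5 toward the LSB in the shift circuit 450₁ shown in FIG. 9 and written to the latch circuits 453₀ to 453₇. As a result, the color data pix₂₆ to pix₂₈ is stored in the latch circuits 453₀ to 453₂.
Then, the color data cwd₂ shown in FIG. 24(C) is output from the latch circuits 452₃ and 453₀ to 453₂ to the distributor 300 via the latch circuit 410₄ shown in FIG. 8.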
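[0126]
Next, the color data mc_0dtr to mc_7dtr shown in FIG. 23(D) is shifted by 1 toward the MSB in units of 32 bits in the shift circuit 450₁ shown in FIG. 9. The shifted color data is output from the latch circuits 453₀ to 453₇ shown in FIG. 9 as the color data cwd₃ shown in FIG. 24(D), and cwd₃ is output to the distributor 300 shown in FIG. 2 via the latch circuit 410₄ shown in FIG. 8.
[0127]
Next, the color data mc_0dtr to mc_7dtr shown in FIG. 23(E) passes through the shift circuit 450₁ shown in FIG. 9 without being shifted and is written to the latch circuits 453₀ to 453₇. It is then output to the shift circuit 450₂, where it is shifted by 1 toward the LSB in units of 32 bits, and the shifted color data is written to the latch circuits 452₀ to 452₇ via the selectors 451₀ to 451₇.
Then, the color data cwd₄ shown in FIG. 24(E) is output from the latch circuits 452₀ to 452₇ to the distributor 300 via the latch circuit 410₄ shown in FIG. 8.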
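[0128]
Next, the color data shown in FIG. 23(E) stored in the latch circuits 453₀ to 453₇ is output once more to the shift circuit 450₂, where it is shifted by 3 toward the MSB in units of 32 bits, and the shifted color data is written to the latch circuits 452₀ to 452₇ via the selectors 451₀ to 451₇. As a result, the color data pix₁₆ is stored in the latch circuit 452₃.
In addition, the color data mc_0dtr to mc_7dtr shown in FIG. 23(F) is shifted by 5 toward the LSB in the shift circuit 450₁ shown in FIG. 9 and written to the latch circuits 453₀ to 453₇. As a result, the color data pix₁₃ to pix₁₅ is stored in the latch circuits 453₀ to 453₂.
Then, the color data cwd₅ shown in FIG. 24(F) is output from the latch circuits 452₃ and 453₀ to 453₂ to the distributor 300 via the latch circuit 410₄ shown in FIG. 8.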
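[0129]
Next, the color data mc_0dtr to mc_7dtr shown in FIG. 23(G) is shifted by 1 toward the MSB in units of 32 bits in the shift circuit 450₁ shown in FIG. 9. The shifted color data is output from the latch circuits 453₀ to 453₇ shown in FIG. 9 as the color data cwd₆ shown in FIG. 24(G), and cwd₆ is output to the distributor 300 shown in FIG. 2 via the latch circuit 410₄ shown in FIG. 8.
[0130]
Next, the color data mc_0dtr to mc_7dtr shown in FIG. 23(H) passes through the shift circuit 450₁ shown in FIG. 9 without being shifted and is written to the latch circuits 453₀ to 453₇. It is then output to the shift circuit 450₂, where it is shifted by 1 toward the LSB in units of 32 bits, and the shifted color data is written to the latch circuits 452₀ to 452₇ via the selectors 451₀ to 451₇.
Then, the color data cwd₇ shown in FIG. 24(H) is output from the latch circuits 452₀ to 452₇ to the distributor 300 via the latch circuit 410₄ shown in FIG. 8.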
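[0131]
Next, the color data shown in FIG. 23(H) stored in the latch circuits 453₀ to 453₇ is output once more to the shift circuit 450₂, where it is shifted by 3 toward the MSB in units of 32 bits, and the shifted color data is written to the latch circuits 452₀ to 452₇ via the selectors 451₀ to 451₇. As a result, the color data pix₃ is stored in the latch circuit 452₃.
In addition, the color data mc_0dtr to mc_7dtr shown in FIG. 23(I) is shifted by 5 toward the LSB in the shift circuit 450₁ shown in FIG. 9 and written to the latch circuits 453₀ to 453₇. As a result, the color data pix₀ to pix₂ is stored in the latch circuits 453₀ to 453₂.
Then, the color data cwd₈ shown in FIG. 24(I) is output from the latch circuits 452₃ and 453₀ to 453₂ to the distributor 300 via the latch circuit 410₄ shown in FIG. 8.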
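[0132]
In the data rearrangement unit 400, as shown in FIGS. 24(A) to 24(I), the address generation unit 430 also generates the representative point coordinates (btrx₀, btry₀) to (btrx₈, btry₈) and the valid flags btrvld₀ to btrvld₈ corresponding to the color data cwd₀ to cwd₈ described above. These coordinates and valid flags are included in the address ADRW₂ and output to the distributor 300.
[0133]
Then, with the representative point coordinates (btrx₀, btry₀) to (btrx₈, btry₈) and the valid flags btrvld₀ to btrvld₈ as a reference, the color data pix₀ to pix₃₈ contained in the color data cwd₀ to cwd₈ shown in FIGS. 24(A) to 24(I) is written to the rectangular storage area 700 in the DRAM 147 in units of 16 pixels, that is, 8 pixels (X direction) × 2 pixels (Y direction).
As a result, the color data pix₀ to pix₃₈ read from the rectangular storage area 600 of the DRAM 147 shown in FIG. 19 is transferred to the rectangular storage area 700 shown in FIG. 25.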
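The source does not say why the copy direction is selectable. One plausible reason, offered here only as an assumption, is that the source rectangle (origin (5, 5)) and the destination rectangle (origin (10, 7)) of this example overlap, so a copy running from the far corner backward never overwrites source pixels it has yet to read. The Python sketch below illustrates that idea at single-pixel granularity; it does not model the 8-pixel reads or 8 × 2 writes of the hardware, and `copy_rect_reverse` is a hypothetical name.

```python
def copy_rect_reverse(mem, src, dst, width, height):
    """Copy a width x height rectangle inside one 2-D memory, last pixel first.

    mem is a list of rows; src and dst are (x, y) origins. Iterating from the
    bottom row upward and from right to left is safe when dst lies below and
    to the right of src, because every source pixel is read before any write
    can reach it.
    """
    sx, sy = src
    dx, dy = dst
    for row in reversed(range(height)):
        for col in reversed(range(width)):
            mem[dy + row][dx + col] = mem[sy + row][sx + col]


# A 32 x 16 memory whose cells hold unique values, so corruption is detectable.
mem = [[y * 32 + x for x in range(32)] for y in range(16)]
expected = [row[5:18] for row in mem[5:8]]        # snapshot of the source
copy_rect_reverse(mem, (5, 5), (10, 7), 13, 3)
print([row[10:23] for row in mem[7:10]] == expected)  # True
```

A forward copy (top row first) on the same data would overwrite part of source row 7 before it is read, which is the hazard the reverse order avoids in this illustration.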
This concludes the description of the data rearrangement circuit 420 and the address generation unit 430 shown in FIG. 8.
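[0134]
The controller 440 generates the control signals S440₁ to S440₆ based on the transfer mode instruction signal S800 and the pixel data length instruction signal S801 input from the texture engine circuit 143.
The controller 440 outputs the control signals S440₁ to S440₄, which control the FIFO circuits 410₁ to 410₃ and the latch circuit 410₄, respectively.
The controller 440 also outputs the control signal S440₅, which controls the address generation unit 430.
The controller 440 also outputs the control signal S440₆, which controls the data rearrangement circuit 420.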
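[0135]
The controller 440 also controls the input to the FIFO circuit 410₃, that is, the reading of the color data mc_0dtr to mc_7dtr from the DRAM 147, as follows.
The controller 440 monitors the storage area of the FIFO circuit 410₃. When it detects that 8 of the 16 storage stages, each 64 bits wide, have become empty (holding no valid data), it outputs to the address generation unit 430 the control signal S440₅, which instructs it to perform a number of consecutive reads that depends on the pixel data length instruction signal S801.
Here, one read fetches from the DRAM 147 the 256-bit color data mc_0dtr to mc_7dtr of eight pixels.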
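[0136]
Specifically, the controller 440 outputs to the address generation unit 430 the control signal S440₅ indicating the number of read instructions shown in FIG. 26, according to the pixel data length indicated by the pixel data length instruction signal S801.
That is, when the data length of one pixel is 32 bits, 16 bits, 8 bits, or 4 bits, the controller 440 outputs to the address generation unit 430 the control signal S440₅ indicating 2, 4, 8, or 16 read instructions, respectively.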
[0137]
By outputting the control signal S440₅ indicating the number of read instructions shown in FIG. 26, the color data mc_0dtr to mc_7dtr read from the DRAM 147 can be stored in the eight empty storage stages of the FIFO circuit 410₃. The eight storage stages of the FIFO circuit 410₃ then hold 8 × 64 bits of color data.
[0138]
That is, when the pixel data length is 32 bits, the color data mc_0dtr to mc_7dtr fetched by one read contains 32 × 8 bits of color data, as shown in FIG. 27(A), so two consecutive reads fetch 32 × 8 × 2 (= 8 × 64) bits.
When the pixel data length is 16 bits, the color data fetched by one read contains 16 × 8 bits of color data, as shown in FIG. 27(B), so four consecutive reads fetch 16 × 8 × 4 (= 8 × 64) bits.
When the pixel data length is 8 bits, the color data fetched by one read contains 8 × 8 bits of color data, as shown in FIG. 27(C), so eight consecutive reads fetch 8 × 8 × 8 (= 8 × 64) bits.
When the pixel data length is 4 bits, the color data fetched by one read contains 4 × 8 bits of color data, as shown in FIG. 27(D), so sixteen consecutive reads fetch 4 × 8 × 16 (= 8 × 64) bits.
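The four cases above reduce to one division: the number of reads is the capacity of the eight empty stages (8 × 64 bits) divided by the bits delivered per read (8 pixels × pixel length). The short Python sketch below restates that arithmetic; the constant and function names are chosen here for illustration and are not taken from the source.

```python
FIFO_STAGE_BITS = 64   # width of one FIFO storage stage
REFILL_STAGES = 8      # stages refilled at once (half of the 16 stages)
PIXELS_PER_READ = 8    # pixels returned by one DRAM read


def reads_per_refill(bits_per_pixel):
    """Number of consecutive reads that exactly fill the empty FIFO stages."""
    bits_per_read = PIXELS_PER_READ * bits_per_pixel
    refill_bits = REFILL_STAGES * FIFO_STAGE_BITS
    if refill_bits % bits_per_read:
        raise ValueError("pixel length does not divide the refill size evenly")
    return refill_bits // bits_per_read


for bpp in (32, 16, 8, 4):
    print(bpp, reads_per_refill(bpp))  # 32 -> 2, 16 -> 4, 8 -> 8, 4 -> 16
```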
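[0139]
As described above, the controller 440 monitors the storage area of the FIFO circuit 410₃ and, when 8 of the 16 storage stages, each 64 bits wide, have become empty, performs a number of consecutive reads from the DRAM 147 that depends on the pixel data length instruction signal S801. This simplifies the control of the FIFO circuit 410₃ by the controller 440 and reduces the load on the controller 440. As a result, other processing performed by the controller 440 is not kept waiting by the control of the FIFO circuit 410₃.
By contrast, a typical conventional FIFO circuit without such control by the controller 440 controls its input by, for example, reading from memory continuously while any storage stage is empty (holding no data) and stopping when all storage stages are full. With this kind of control, however, a situation arises when reading stops in which only part of the eight pixels of color data fetched by one read has been written to the FIFO circuit. The same eight pixels of color data must then be read again, the pixels whose color data was not written to the FIFO circuit the previous time must be identified, and only the identified color data must be written to the FIFO circuit, which complicates the control.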
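The refill policy described above can be sketched as a small software model. This is an illustration only, assuming a FIFO of 16 stages that is refilled in whole bursts of 8 stages; the class `HalfEmptyFifo` and its methods are hypothetical names, and the model ignores timing.

```python
from collections import deque


class HalfEmptyFifo:
    """FIFO that accepts refills only as complete bursts of `refill` stages."""

    def __init__(self, stages=16, refill=8):
        self.stages = stages
        self.refill = refill
        self.queue = deque()

    def needs_refill(self):
        # True once at least `refill` stages are empty, so that a whole
        # burst is guaranteed to fit and no read is ever split.
        return self.stages - len(self.queue) >= self.refill

    def refill_burst(self, words):
        words = list(words)
        if len(words) != self.refill:
            raise ValueError("a burst must fill exactly the refill size")
        if not self.needs_refill():
            raise RuntimeError("not enough empty stages for a whole burst")
        self.queue.extend(words)

    def pop(self):
        return self.queue.popleft()


fifo = HalfEmptyFifo()
fifo.refill_burst(range(8))      # first burst: 8 of 16 stages filled
fifo.refill_burst(range(8, 16))  # second burst: FIFO is now full
print(fifo.needs_refill())       # False
```

Because a burst is issued only when it fits completely, the model never reaches the partial-write state that the conventional control has to recover from.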
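[0140]
CRT controller circuit 145
The CRT controller circuit 145 generates addresses for display on a CRT (not shown) in synchronization with the supplied horizontal and vertical synchronization signals, and outputs to the memory I/F circuit 144 a request to read display data from the display buffer 147b. In response to this request, the memory I/F circuit 144 reads display data from the display buffer 147b in blocks of a fixed size. The CRT controller circuit 145 incorporates a FIFO circuit that stores the display data read from the display buffer 147b, and outputs RGB index values to the RAMDAC circuit 146 at fixed time intervals.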
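[0141]
RAMDAC circuit 146
The RAMDAC circuit 146 stores the R, G, B data corresponding to each index value. It transfers the digital R, G, B data corresponding to the RGB index values input from the CRT controller circuit 145 to a D/A converter (Digital/Analog Converter, not shown) to generate analog R, G, B data, and outputs the generated R, G, B data to the CRT.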
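[0142]
The overall operation of the three-dimensional computer graphics system 10 described above is explained below.
In the three-dimensional computer graphics system 10, data such as graphics drawing data is supplied to the rendering circuit 14 via the main bus 15 from the main memory 12 of the main processor 11 or from the I/O interface circuit 13, which receives graphics data from outside.
As necessary, the graphics drawing data undergoes geometry processing such as coordinate transformation, clipping, and lighting in the main processor 11 or elsewhere.
The graphics data that has undergone geometry processing becomes polygon rendering data S11, consisting of the vertex coordinates x, y, z of each of the three vertices of a triangle, the luminance values R, G, B, and the texture coordinates s, t, q corresponding to the pixel to be drawn.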
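[0143]
This polygon rendering data S11 is input to the DDA setup circuit 141 of the rendering circuit 14.
Based on the polygon rendering data S11, the DDA setup circuit 141 generates variation data S141 indicating, for example, the differences along the sides of the triangle and in the horizontal direction. Specifically, using the value at the start point, the value at the end point, and the distance between them, it calculates the variation, that is, the change in the value being sought per unit length of movement, and outputs it to the triangle DDA circuit 142 as the variation data S141.
[0144]
The triangle DDA circuit 142 uses the variation data S141 to calculate linearly interpolated (z, R, G, B, s, t, q) data for each pixel inside the triangle.
The calculated (z, R, G, B, s, t, q) data and the (x, y) data of each vertex of the triangle are then output from the triangle DDA circuit 142 to the texture engine circuit 143 as DDA data S142.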
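The two steps above, computing a per-unit variation and then stepping it across the pixels, can be sketched in Python as follows. This is a minimal one-dimensional illustration of the idea, not the circuit's actual arithmetic; `dda_setup` and `dda_span` are hypothetical names, and floating-point values stand in for whatever number format the hardware uses.

```python
def dda_setup(start, end, distance):
    """Per-unit-length change of each interpolated value (the 'variation')."""
    return tuple((e - s) / distance for s, e in zip(start, end))


def dda_span(start, delta, steps):
    """Step from start by delta, returning the values at steps + 1 positions."""
    values = []
    current = list(start)
    for _ in range(steps + 1):
        values.append(tuple(current))
        current = [c + d for c, d in zip(current, delta)]
    return values


# Interpolate two attributes across a span of 4 unit steps.
delta = dda_setup((0.0, 10.0), (4.0, 2.0), 4)
print(delta)                                  # (1.0, -2.0)
print(dda_span((0.0, 10.0), delta, 4)[-1])    # (4.0, 2.0)
```

In the system described here the same stepping would apply to each of z, R, G, B, s, t, and q; the tuple can simply be made longer.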
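[0145]
For the (s, t, q) data indicated by the DDA data S142, the texture engine circuit 143 divides the s data by the q data and the t data by the q data. The division results "s/q" and "t/q" are then multiplied by the texture sizes USIZE and VSIZE, respectively, to generate the texture coordinate data (u, v).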
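The calculation above is u = (s/q) × USIZE and v = (t/q) × VSIZE. A minimal Python sketch follows; the function name `texture_coords` and the explicit check for q = 0 are additions made here for illustration and are not described in the source.

```python
def texture_coords(s, t, q, usize, vsize):
    """Convert homogeneous texture coordinates (s, t, q) to (u, v)."""
    if q == 0:
        # The source does not say how q = 0 is handled; this sketch rejects it.
        raise ZeroDivisionError("homogeneous term q must be non-zero")
    return (s / q) * usize, (t / q) * vsize


print(texture_coords(1.0, 0.5, 2.0, 256, 128))  # (128.0, 32.0)
```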
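[0146]
Next, the texture engine circuit 143 outputs to the memory I/F circuit 144 a read request S143 containing the address ADRR₁, and the (R, G, B) data S148 stored in the DRAM 147 (SRAM 148) is read via the memory I/F circuit 144.
[0147]
Next, in the texture engine circuit 143, the (R, G, B) data of the read (R, G, B) data S148 is multiplied by the (R, G, B) data contained in the DDA data S142 from the preceding triangle DDA circuit 142, and the result is generated as color data S143.
This color data S143 is output from the texture engine circuit 143 to the memory I/F circuit 144.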
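[0148]
In the full-color case, the (R, G, B) data from the texture buffer 147a can be used directly. In the index-color case, the data of a color index table prepared in advance is transferred from the texture CLUT (Color Look Up Table) buffer 147d to a temporary storage buffer composed of SRAM or the like, and the actual R, G, B colors are obtained from the color index using the CLUT in this temporary storage buffer.
When the CLUT is implemented in SRAM, it is used in such a way that the color index is applied to the SRAM as an address and the actual R, G, B colors appear at its output.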
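In software terms, the index-color path above is a table lookup in which the color index plays the role of the SRAM address. The Python sketch below shows this with a made-up three-entry table; the function name `apply_clut` and the table contents are illustrative assumptions.

```python
def apply_clut(indices, clut):
    """Map each color index to its (R, G, B) entry, like addressing an SRAM."""
    return [clut[i] for i in indices]


# Hypothetical table: index 0 = black, 1 = red, 2 = green.
clut = [(0, 0, 0), (255, 0, 0), (0, 255, 0)]
print(apply_clut([2, 0, 1], clut))  # [(0, 255, 0), (0, 0, 0), (255, 0, 0)]
```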
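[0149]
Then, in the memory I/F circuit 144, the z data corresponding to the color data S143 input from the texture engine circuit 143 is compared with the z data stored in the z buffer 147c, and it is determined whether the image drawn by the input color data S143 lies in front of (on the viewpoint side of) the image previously written to the display buffer 147b.
If it is determined to lie in front, the z data stored in the z buffer 147c is updated with the z data corresponding to the image data S143.
[0150]
Next, in the memory I/F circuit 144, the (R, G, B) data is written to the display buffer 147b.
The data to be written (including updates) is supplied to the memory controllers 350, 360, 370, and 380 via the write-system circuits shown in FIG. 2, namely the data rearrangement unit 400, the distributor 300, and the address decoders 310, 320, 330, and 340, and is written in parallel to the designated memories by the memory controllers 350, 360, 370, and 380 via the write-system wiring groups 401W, 402W, 411W, 412W, 421W, 422W, 431W, and 432W, respectively.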
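The depth comparison and conditional write described above can be sketched as follows. The sketch assumes that a smaller z value means closer to the viewpoint, which the source does not specify, and it updates the z buffer and the display buffer in one step for brevity. `draw_pixel` is a hypothetical name.

```python
def draw_pixel(z_buffer, display_buffer, x, y, z, rgb):
    """Write rgb at (x, y) only if z is nearer than the stored depth.

    Assumes smaller z is nearer to the viewpoint. Returns True if written.
    """
    if z < z_buffer[y][x]:
        z_buffer[y][x] = z
        display_buffer[y][x] = rgb
        return True
    return False


# 2 x 2 buffers, initially "infinitely far" and black.
zb = [[float("inf")] * 2 for _ in range(2)]
fb = [[(0, 0, 0)] * 2 for _ in range(2)]
print(draw_pixel(zb, fb, 1, 0, 5.0, (255, 0, 0)))  # True: nothing drawn yet
print(draw_pixel(zb, fb, 1, 0, 9.0, (0, 255, 0)))  # False: hidden behind z = 5.0
```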
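[0151]
In the memory I/F circuit 144, the memory block that stores the texture corresponding to the texture address of the pixel about to be drawn is calculated from that texture address, a read request is issued only to that memory block, and the texture data is read.
In this case, memory blocks that do not hold the relevant texture data are not accessed for texture reading, so more access time can be made available for drawing.
[0152]
At this time, the memory I/F circuit 144 writes the image data S143 from the texture engine circuit 143 to the DRAM 147 by converting the 64-bit-wide transfer data hwd into the 256-bit-wide color data cwd, for example as described above with reference to FIGS. 11 to 18.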
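[0153]
Similarly, in drawing, color data is read from the relevant address of the memory block that stores the color data corresponding to the pixel address about to be drawn in order to perform a modify write, and after modification it is written back to the same address.
[0154]
When hidden-surface processing is performed, depth data is likewise read from the relevant address of the memory block that stores the depth data corresponding to the pixel address about to be drawn in order to perform a modify write, and, if necessary, it is written back to the same address after modification.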
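[0155]
In such data exchange with the DRAM 147 through the memory I/F circuit 144, drawing performance can be improved by carrying out the preceding processing steps in parallel.
In particular, multiple pixels are calculated simultaneously either by providing the triangle DDA circuit 142 and the texture engine 143 in parallel-execution form within the same circuit (spatial parallelism) or by inserting fine-grained pipeline stages (temporal parallelism) so that the operating frequency is raised in part.
[0156]
In addition, under the control of the memory I/F circuit 144, the color data is arranged so that adjacent portions of the display area fall in different DRAM modules.
As a result, when a plane such as a triangle is drawn, it is processed simultaneously across its surface, so the probability that each DRAM module is operating is very high.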
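[0157]
When an image is displayed on the CRT (not shown), the CRT controller circuit 145 generates display addresses in synchronization with the supplied horizontal and vertical synchronization frequencies and issues a display data transfer request to the memory I/F circuit 144.
In accordance with the request, the memory I/F circuit 144 transfers display data to the CRT controller circuit 145 in blocks of a fixed size.
The CRT controller circuit 145 stores the display data in a display FIFO (First In First Out, not shown) or the like and transfers RGB index values to the RAMDAC 146 at fixed intervals.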
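[0158]
When the main processor 11 issues, via the host interface circuit 149 shown in FIG. 1, a request to the memory I/F circuit 144 to read data stored in the DRAM 147 or the SRAM 148, the read address ADRR₂ is input to the address converter 391 of the read controller 390.
At this time, the address converter 391 checks whether the idle signals S350, S360, S370, and S380 from the memory controllers 350, 360, 370, and 380 have all been input in the active state. When all of the idle signals S350, S360, S370, and S380 are active, the read address and the read request signal S391 are output to the memory controllers 350, 360, 370, and 380 in response, so that reading is performed in units of 8 pixels or 16 pixels.
[0159]
On receiving the read address and the read request signal S391, the memory controllers 350, 360, 370, and 380 read texture data, (R, G, B) data, z data, and texture color look-up table data in parallel, in units of 8 pixels or 16 pixels, via the read-system wiring groups 401R, 402R, 411R, 412R, 421R, 422R, 431R, and 432R. The data is then input to the data operation unit 392 via the read-system wiring groups 351, 361, 371, and 381 and the wiring group 440.
After the data operation unit 392 performs the prescribed operations, the data rearrangement unit 400 converts the 256-bit-wide color data mc_0dtr to mc_7dtr into the 64-bit-wide transfer data hrd, which is output via the host interface circuit 149 to the requester, for example the main processor 11.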
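[0160]
In the RAMDAC 146, RGB values for the RGB indices are stored in internal RAM, and the RGB value for each index value is transferred to the D/A converter (not shown).
The RGB signal converted to an analog signal by the D/A converter is then transferred to the CRT.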
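[0161]
As described above, according to this embodiment, even when the data formats differ between the DRAM 147 and the host interface circuit 149 and texture engine circuit 143 shown in FIG. 1, the data rearrangement unit 400 shown in FIG. 2 rearranges the data, so that data can be transferred accurately between the DRAM 147 and the texture engine circuit 143 and host interface circuit 149.
Also according to this embodiment, even when the address patterns with which the DRAM 147 can be accessed are restricted, the data rearrangement unit 400 rearranges the data with that restriction taken into account, so that color data can be written to and read from the DRAM 147 accurately.
[0162]
Also according to this embodiment, in the data rearrangement unit 400 shown in FIG. 8, the controller 440 monitors the storage area of the FIFO circuit 410₃ and, when 8 of the 16 storage stages, each 64 bits wide, have become empty, performs a number of consecutive reads from the DRAM 147 that depends on the pixel data length instruction signal S801. This simplifies the control of the FIFO circuit 410₃ by the controller 440 and reduces the load on the controller 440. As a result, other processing performed by the controller 440 is not kept waiting by the control processing of the FIFO circuit 410₃.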
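[0163]
Also according to this embodiment, in the memory I/F circuit 144, which accesses (writes to or reads from) the DRAM 147 and the SRAM 148, the write path and the read path are configured as separate paths. For writing, the write address ADRW and the color data cwd are processed by the write-system circuits, namely the data rearrangement unit 400, the distributor 300, the address converters 310, 320, 330, and 340, and the memory controllers 350, 360, 370, and 380, and written to the DRAM 147. For reading, data is read from the DRAM 147 or the SRAM 148 through the read-system circuits, namely the data rearrangement unit 400, the read controller 390, and the memory controllers 350, 360, 370, and 380. This has the advantage that, for example, when a read interrupt arrives, switching can take place at an early timing, improving read performance.
[0164]
In addition, the memory I/F circuit 144 accesses the DRAM 147 based on addressing of a prescribed interleave scheme, for example in units of 16 pixels or 8 pixels, and restricts the areas that can be accessed, so an image processing apparatus that does not cause page violations can be realized.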
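[0165]
Furthermore, according to this embodiment, the DRAM 147 built into the semiconductor chip stores both the display data and the texture data required by at least one graphic element. Texture data can therefore be stored in portions other than the display area, the built-in DRAM can be used effectively, and an image processing apparatus that combines high-speed processing with low power consumption can be realized.
A single memory system can thus be realized, and all processing can be carried out within the chip. As a result, this also represents a major paradigm shift in architecture.
In addition, because the memory can be used effectively, processing is possible with the internal DRAM alone, and the large bandwidth between the memory and the drawing system that comes from the memory being internal can be fully exploited. Special processing can also be incorporated into the DRAM.
[0166]
Furthermore, display elements at adjacent addresses in the display address space are arranged so that they fall in different DRAM blocks, which allows the bit lines to be used even more effectively. When accesses concentrate on relatively compact display areas, as in graphics drawing, the probability that the modules can process simultaneously increases, and drawing performance can be improved.
[0167]
In addition, because the DRAM is built into the chip, its high-speed interface is completed entirely within the chip. There is therefore no need to drive I/O buffers with large added capacitance or inter-chip wiring capacitance, and power consumption is lower than when the DRAM is not built in.
A scheme in which everything can be done within a single chip using a variety of technologies is therefore an indispensable technical element for familiar digital devices such as future portable information terminals.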
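[0168]
The present invention is not limited to the embodiment described above.
For example, the restriction on access patterns for the DRAM 147 is not limited to the one described above, and the present invention is applicable even when other access pattern restrictions exist.
The configuration of the data rearrangement unit 400 shown in FIG. 8 and the configuration of the data rearrangement circuit 420 shown in FIG. 9 are likewise not limited to those described.
[0169]
The number of stages of the FIFO circuits 410₁ to 410₃ shown in FIG. 8 is arbitrary.
The timing at which the controller 440, which monitors the storage state of the FIFO circuit 410₃, issues a read request may also be other than when half of the stages have become empty.
[0170]
Although the three-dimensional computer graphics system 10 shown in FIG. 1 has been described as using the SRAM 148, a configuration without the SRAM 148 is also possible.
[0171]
Furthermore, although the three-dimensional computer graphics system 10 shown in FIG. 1 has been described as performing the geometry processing that generates the polygon rendering data in the main processor 11, the geometry processing may instead be performed in the rendering circuit 14.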
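[0172]
[Effects of the invention]
As described above, according to the image processing apparatus of the present invention, data can be exchanged accurately between the image processing circuit and the storage circuit even when address designation for access to the storage circuit is restricted.
Also according to the image processing apparatus of the present invention, data can be transferred accurately within the storage area of the storage circuit even when address designation for access to the storage circuit is restricted.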
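[Brief description of the drawings]
[FIG. 1] A block diagram showing the configuration of a three-dimensional computer graphics system according to the present invention.
[FIG. 2] A block diagram showing a specific configuration example of the DRAM, the SRAM, and the memory I/F circuit that accesses the DRAM and SRAM in the rendering circuit according to the present invention.
[FIG. 3] A schematic diagram showing a configuration example of the DRAM buffer according to the present invention.
[FIG. 4] A diagram for explaining the color data contained in texture data that is accessed simultaneously.
[FIG. 5] A diagram for explaining the unit blocks that make up texture data.
[FIG. 6] A diagram for explaining the address space of the texture buffer.
[FIG. 7] A diagram for explaining the image data processing of the distributor in the memory I/F circuit according to the present invention.
[FIG. 8] A configuration diagram of the data rearrangement unit 400 shown in FIG. 2.
[FIG. 9] A configuration diagram of the data rearrangement circuit 420 shown in FIG. 8.
[FIG. 10] A diagram showing the storage areas in the DRAM shown in FIG. 1.
[FIG. 11] A diagram for explaining the operation of the "HosttoLocal" transfer mode in the data rearrangement unit shown in FIG. 2 when the data length of one pixel is 32 bits.
[FIG. 12] A diagram for explaining the operation of the "HosttoLocal" transfer mode in the data rearrangement unit shown in FIG. 2 when the data length of one pixel is 32 bits.
[FIG. 13] A diagram for explaining the operation of the "HosttoLocal" transfer mode in the data rearrangement unit shown in FIG. 2 when the data length of one pixel is 32 bits.
[FIG. 14] A diagram for explaining the operation of writing the color data shown in FIG. 13 to the DRAM shown in FIG. 1.
[FIG. 15] A diagram for explaining the operation of writing the color data shown in FIG. 13 to the DRAM shown in FIG. 1.
[FIG. 16] A diagram for explaining the operation of the "HosttoLocal" transfer mode in the data rearrangement unit shown in FIG. 2 when the data length of one pixel is 16 bits.
[FIG. 17] A diagram for explaining the operation of the "HosttoLocal" transfer mode in the data rearrangement unit shown in FIG. 2 when the data length of one pixel is 16 bits.
[FIG. 18] A diagram for explaining the operation of the "HosttoLocal" transfer mode in the data rearrangement unit shown in FIG. 2 when the data length of one pixel is 16 bits.
[FIG. 19] A diagram for explaining the operation of the "LocaltoHost" transfer mode in the data rearrangement unit shown in FIG. 2.
[FIG. 20] A diagram for explaining the operation of the "LocaltoHost" transfer mode in the data rearrangement unit shown in FIG. 2.
[FIG. 21] A diagram for explaining the operation of the "LocaltoHost" transfer mode in the data rearrangement unit shown in FIG. 2.
[FIG. 22] A diagram for explaining the operation of the "LocaltoHost" transfer mode in the data rearrangement unit shown in FIG. 2.
[FIG. 23] A diagram for explaining the operation of the "LocaltoLocal" transfer mode in the data rearrangement unit shown in FIG. 2.
[FIG. 24] A diagram for explaining the operation of the "LocaltoLocal" transfer mode in the data rearrangement unit shown in FIG. 2.
[FIG. 25] A diagram for explaining the operation of the "LocaltoLocal" transfer mode in the data rearrangement unit shown in FIG. 2.
[FIG. 26] A diagram for explaining the relationship between the data length of one pixel and the number of read instructions indicated in the read request of the control signal output from the controller shown in FIG. 8 to the address generation unit.
[FIG. 27] A diagram for explaining the color data read from the DRAM by one read operation when the data length of one pixel is 32 bits, 16 bits, 8 bits, and 4 bits.
[FIG. 28] A system configuration diagram showing the basic concept of a three-dimensional computer graphics system.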
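[Explanation of symbols]
10: three-dimensional computer graphics system, 11: main processor, 12: main memory, 13: I/O interface circuit, 14: rendering circuit, 141: DDA setup circuit, 142: triangle DDA circuit, 143: texture engine circuit, 144: memory I/F circuit, 145: CRT controller circuit, 146: RAMDAC circuit, 147: DRAM, 147a: texture buffer, 147b: display buffer, 147c: z buffer, 147d: texture CLUT buffer, 148: SRAM, 200, 210, 220, 230: memory modules, 300: distributor, 310, 320, 330, 340: address decoders, 350, 360, 370, 380: memory controllers, 390: read controller, 391: address decoder, 392: data operation processing unit, 400: data rearrangement unit, 410₁ to 410₃: FIFO circuits, 410₄: latch circuit, 420: data rearrangement circuit, 430: address generation unit, 440: controller
[0001]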
[Technical field to which the invention belongs]
The present invention relates to an image processing apparatus.
[0002]
[Prior art]
Computer graphics are often used in various CAD (Computer Aided Design) systems and amusement machines. In particular, with the recent development of image processing technology, systems using three-dimensional computer graphics are rapidly spreading.
In such three-dimensional computer graphics, the color for each pixel is determined by a rendering process that calculates the color value of each pixel and writes the calculated value to the address of the display buffer (frame buffer) corresponding to that pixel.
[0003]
One of the rendering processing methods is polygon rendering. In this method, a three-dimensional model is expressed as a combination of triangular unit graphics (polygons), and the color of the display screen is determined by drawing with the polygon as a unit.
[0004]
In polygon rendering, the coordinates (x, y, z), the color data (R, G, B), the homogeneous coordinates (s, t) of the texture data indicating the image pattern to be pasted, and the value of the homogeneous term q are input for each vertex of a triangle in the physical coordinate system, and these values are interpolated inside the triangle.
Here, the homogeneous term q is, simply put, something like an enlargement/reduction ratio. The coordinates in the UV coordinate system of the actual texture buffer, that is, the texture coordinate data (u, v), are obtained by multiplying "s/q" and "t/q", the results of dividing the homogeneous coordinates (s, t) by the homogeneous term q, by the texture sizes USIZE and VSIZE, respectively.
[0005]
FIG. 28 is a system configuration diagram showing a basic concept of a three-dimensional computer graphics system.
[0006]
In this three-dimensional computer graphics system, data such as graphics drawing data is supplied from the main memory 2 of the main processor 1, or from the I/O interface circuit 3 that receives graphics data from outside, via the main bus 4 to the rendering circuit 5, which has a rendering processor 5a and a frame buffer memory 5b.
[0007]
The rendering processor 5a is coupled to the frame buffer 5b, which holds data for display, and to the texture memory 6, which holds texture data to be pasted onto the surface of a graphic element (for example, a triangle) to be drawn.
The rendering processor 5a then draws each graphic element, with the texture pasted on its surface, into the frame buffer 5b.
[0008]
The frame buffer 5b and the texture memory 6 are generally constituted by a DRAM (Dynamic Random Access Memory).
In the system of FIG. 28, the frame buffer 5b and the texture memory 6 are configured as physically separate memory systems.
[0009]
[Problems to be solved by the invention]
In a three-dimensional computer graphics system such as the one described above, in order to perform DRAM read and write operations efficiently, it is conceivable, for example, to restrict address designation when accessing the DRAM in consideration of the characteristics of graphics processing.
However, when such a restriction is imposed, the system shown in FIG. 28 must solve the various problems that the restriction on address designation causes when data is exchanged between the rendering processor 5a and main processor 1 on the one hand and the frame buffer memory 5b and texture memory 6 on the other.
[0010]
The present invention has been made in view of these circumstances, and its object is to provide an image processing apparatus that can access a memory accurately even when address designation for memory access is restricted.
[0011]
[Means for Solving the Problems]
  According to the present invention, there is provided an image processing apparatus comprising:
  an image processing circuit that performs rendering processing using polygon rendering data including three-dimensional coordinates (x, y, z), R (red), G (green), and B (blue) data, texture homogeneous coordinates (s, t), and a homogeneous term q for the vertices of a unit graphic, and generates image data composed of a plurality of pixels;
  a plurality of storage circuits to which the image data composed of the plurality of pixels generated by the rendering processing is written in units of blocks of a predetermined size, the writing being performed in units of a first rectangular area referenced to a representative point, under a restriction that only addresses that are multiples of a first number can be designated in the mutually orthogonal first and second directions of the first rectangular area;
  a data rearrangement circuit that rearranges the image data input from the image processing circuit in units of pixel data so that, when the address of the representative point of the block containing the image data composed of the plurality of pixels generated by the rendering processing is designated based on the restriction, the image data is written to the corresponding first rectangular area of the corresponding storage circuit among the plurality of storage circuits; and
  a writing circuit,
  wherein, in a first transfer mode in which the image data composed of the plurality of pixels generated by the rendering processing and input from the image processing circuit is transferred to and stored in the plurality of storage circuits, the rearrangement circuit
    arranges the rendered first image data having a first bit width, input from the image processing circuit, within second image data having a second bit width, and
    rearranges the second image data by repeating shift processing, selection processing, and hold processing based on a control signal so that, when the address of the representative point of the block containing the second image data is designated based on the restriction, the first image data is written to the corresponding first rectangular storage area of the corresponding storage circuit among the plurality of storage circuits, and
  the writing circuit distributes and writes the blocks containing the rearranged second image data to the corresponding storage circuits among the plurality of storage circuits.
[0012]
  Also according to the present invention, there is provided an image processing apparatus comprising:
  a plurality of storage circuits in which polygon rendering data including three-dimensional coordinates (x, y, z), R (red), G (green), and B (blue) data, texture homogeneous coordinates (s, t), and a homogeneous term q for the vertices of a unit graphic is stored in a distributed manner, and to and from which image data composed of a plurality of pixel data generated by rendering processing is written and read in units of blocks of a predetermined size, wherein the writing is performed in units of a first rectangular area referenced to a representative point, under a first restriction that only addresses that are multiples of a first number can be designated in the mutually orthogonal first and second directions of the first rectangular area, and the reading is performed in units of a second rectangular area referenced to a representative point, under a second restriction that an arbitrary address can be designated in the first of the mutually orthogonal directions of the second rectangular area but only addresses that are multiples of a second number can be designated in the second direction;
  an image processing circuit that receives first image data having a first bit width and performs rendering processing on the first image data;
  a reading circuit that designates a representative point based on the second restriction and reads the image data from the plurality of storage circuits in units of the blocks; and
  a data rearrangement circuit that rearranges, in units of pixel data, second image data having a second bit width obtained by the reading to generate the first image data having the first bit width, and outputs the first image data to the image processing circuit,
  wherein, in a second transfer mode in which image data is transferred from the storage circuits to the image processing circuit, the rearrangement circuit
    generates an address corresponding to the address input from the image processing circuit and outputs it to the reading circuit so that the corresponding image data is read from the plurality of storage circuits, and
    converts the read image data into image data of the first bit width defined by the image processing circuit and outputs it to the image processing circuit.
[0013]
  According to the present invention, there is provided an image processing apparatus comprising:
  a plurality of storage circuits in which polygon rendering data including three-dimensional coordinates (x, y, z), R (red), G (green), and B (blue) data, texture homogeneous coordinates (s, t), and a homogeneous term q for the vertices of a unit graphic is stored in a distributed manner, to which image data generated by rendering processing is written in units of blocks of a predetermined size, and from which the image data written in a distributed manner is read, wherein the writing is performed in units of a first rectangular area referenced to a representative point, under a first restriction that only addresses that are multiples of a first number can be designated in the mutually orthogonal first and second directions of the first rectangular area, and the reading is performed in units of a second rectangular area referenced to a representative point, under a second restriction that an arbitrary address can be designated in the first of the mutually orthogonal directions of the second rectangular area but only addresses that are multiples of a second number can be designated in the second direction;
  a reading circuit that designates a representative point based on the second restriction and reads the image data from the plurality of storage circuits in units of the blocks;
  a data rearrangement circuit that performs processing according to either a first transfer mode in which image data is transferred from the image processing circuit to the storage circuits or a second transfer mode in which image data is transferred from the storage circuits to the image processing circuit, and that rearranges the read image data in units of pixel data so that, when the address of the representative point of the block containing the read image data is designated based on the first restriction, the image data is written to the corresponding first rectangular area of the corresponding storage circuit; and
  a writing circuit that designates a representative point based on the first restriction and writes the block containing the rearranged image data to the storage circuits,
  wherein, in the first transfer mode in which image data is transferred from the image processing circuit to the plurality of storage circuits and stored, the rearrangement circuit
    arranges the rendered first image data having a first bit width, input from the image processing circuit, within second image data having a second bit width, and
    rearranges the second image data by repeating shift processing, selection processing, and hold processing based on a control signal so that, when the address of the representative point of the block containing the second image data is designated based on the first restriction, the first image data is written to the corresponding rectangular area of the corresponding storage circuit among the plurality of storage circuits,
  the writing circuit distributes and writes the blocks containing the rearranged second image data to the corresponding storage circuits among the plurality of storage circuits, and
  in the second transfer mode in which image data is transferred from the storage circuits to the image processing circuit, the rearrangement circuit
    generates an address corresponding to the address input from the image processing circuit and outputs it to the reading circuit so that the corresponding image data is read from the plurality of storage circuits, and
    arranges the read image data within image data of the first bit width defined by the image processing circuit and outputs it to the image processing circuit.
[0014]
  According to the present invention,
  Polygon rendering including three-dimensional coordinates (x, y, z), R (red), G (green), B (blue) data, texture homogeneous coordinates (s, t) and homogeneous term q for the vertices of the unit graphic An image processing apparatus that performs rendering processing using data,
  A plurality of storage circuits, wherein display data consisting of a plurality of pixel bits and texture data required by at least one graphic element are distributed and stored in the plurality of storage circuits, and a block of a predetermined size is unit The image data generated by the rendering process is written asA plurality of storage circuits, wherein the writing is performed in units of a first rectangular area with a representative point as a reference, and a multiple of a first number in the first and second directions orthogonal to the first rectangular area There is a first restriction that only the address of the second rectangular area can be specified, and the reading is performed in units of the second rectangular area with the representative point as a reference, and an arbitrary address is set in the first direction perpendicular to the second rectangular area. There is a second restriction that the address can be specified only in multiples of the second number in the second direction.A plurality of memory circuits;
  An interpolation data generation circuit that interpolates polygon rendering data at the vertices of the unit graphic and generates interpolation data of pixels located in the unit graphic;
  The texture homogeneous coordinates (s, t) included in the interpolation data are divided by the homogeneous term q to generate “s / q” and “t / q”, and the “s / q” and “t / q”. Texture processing that reads texture data from the plurality of storage circuits using a texture address according to the image, and pastes the read texture data on the surface of the graphic element to generate display data composed of a plurality of pixel bits Circuit,
  A data rearrangement circuit for performing data rearrangement processing according to a first transfer mode for transferring image data from the texture processing circuit to the storage circuit, and comprising a plurality of pixel bits input from the texture processing circuit The address of the representative point of the block containing display data isFirst limitThe display data is stored in the plurality of storage circuits when specified based onApplicable memory circuitA data rearrangement circuit for rearranging the display data in units of image data so as to be written in a distributed manner;
  A writing circuit that writes the blocks including the rearranged display data in a distributed manner to the plurality of storage circuits;
  Comprising
  In the first transfer mode, in which the image data from the image processing circuit is transferred to and stored in the plurality of storage circuits, the rearrangement circuit
    places the rendered first image data, which has a first bit width and is input from the image processing circuit, inside second image data having a second bit width, and
    rearranges the second image data by repeating shift processing, selection processing, and hold processing based on a control signal, so that the first image data is written into a predetermined storage area of the plurality of storage circuits when the address of the representative point of the block including the second image data is designated based on the first restriction;
  The writing circuit distributes and writes the blocks including the rearranged second image data to corresponding storage circuits of the plurality of storage circuits;
  An image processing apparatus is provided.
[0015]
  According to the present invention,
  An image processing apparatus that performs rendering processing using polygon rendering data including three-dimensional coordinates (x, y, z), R (red), G (green), and B (blue) data, texture homogeneous coordinates (s, t), and a homogeneous term q for the vertices of a unit graphic,
  A plurality of storage circuits in which display data and texture data required by at least one graphic element are stored in a distributed manner, and to which image data generated by the rendering process is written in units of blocks of a predetermined size, wherein the writing is performed in units of a first rectangular area with a representative point as a reference, under a first restriction that, in the mutually orthogonal first and second directions of the first rectangular area, only addresses that are multiples of a first number can be specified, and the reading is performed in units of a second rectangular area with a representative point as a reference, under a second restriction that, of the mutually orthogonal directions of the second rectangular area, an arbitrary address can be specified in the first direction but only addresses that are multiples of a second number can be specified in the second direction;
  An interpolation data generation circuit that interpolates polygon rendering data at the vertices of the unit graphic and generates interpolation data of pixels located in the unit graphic;
  A texture processing circuit that divides the texture homogeneous coordinates (s, t) included in the interpolation data by the homogeneous term q to generate “s / q” and “t / q”, and generates display data by pasting texture data, read from the plurality of storage circuits using a texture address according to the “s / q” and “t / q”, onto the surface of a graphic element;
  A readout circuit that designates the representative point of the block based on the second restriction and reads the texture data from the plurality of storage circuits in units of the block;
  An interface circuit;
  A data rearrangement circuit that rearranges data according to a second transfer mode for transferring display data read from the storage circuits to the texture processing circuit, and that rearranges the read texture data according to the output format of the interface circuit and outputs it to the interface circuit;
  Comprising
  In the second transfer mode, the rearrangement circuit generates an address corresponding to the address input from the image processing circuit by way of the texture circuit and the interface, outputs it to the readout circuit, reads the corresponding image data from the plurality of storage circuits, places it inside image data of the first bit width defined by the image processing circuit, and outputs it to the image processing circuit;
  An image processing apparatus is provided.
[0016]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, the present embodiment describes a three-dimensional computer graphics system that is applied to a personal computer or the like and displays a desired three-dimensional image of an arbitrary three-dimensional object model on a display such as a CRT (Cathode Ray Tube) at high speed.
[0017]
FIG. 1 is a system configuration diagram of a three-dimensional computer graphics system 10 as an image processing apparatus according to the present invention.
[0018]
The three-dimensional computer graphics system 10 is a system that performs polygon rendering processing: it represents a three-dimensional model as a combination of triangles (polygons) that are unit figures, and determines the color of each pixel on the display screen by drawing these polygons and displays the result on the display.
The three-dimensional computer graphics system 10 represents a three-dimensional object using a z coordinate representing depth in addition to the (x, y) coordinates representing a position on a plane, and specifies an arbitrary point in three-dimensional space by these three coordinates (x, y, z).
[0019]
As shown in FIG. 1, a three-dimensional computer graphics system 10 includes a main processor 11, a main memory 12, an I / O interface circuit 13, and a rendering circuit 14 connected via a main bus 15.
Hereinafter, the function of each component will be described.
[0020]
For example, the main processor 11 reads out necessary graphic data from the main memory 12 in accordance with the progress of the application and the like, and performs geometry processing such as clipping and lighting on the graphic data to generate polygon rendering data. The main processor 11 outputs the polygon rendering data S11 to the rendering circuit 14 via the main bus 15.
[0021]
The I / O interface circuit 13 receives movement control information, polygon rendering data, or the like from the outside as necessary, and outputs this to the rendering circuit 14 via the main bus 15.
[0022]
Here, the polygon rendering data includes data of (x, y, z, R, G, B, s, t, q) of each of the three vertices of the polygon.
Here, the (x, y, z) data indicates the three-dimensional coordinates of a vertex of the polygon, and the (R, G, B) data indicates the red, green, and blue luminance values at those three-dimensional coordinates, respectively.
Of the (s, t, q) data, (s, t) indicates the homogeneous coordinates of the corresponding texture, and q indicates the homogeneous term. Here, “s / q” and “t / q” are multiplied by the texture sizes USIZE and VSIZE, respectively, to obtain texture coordinate data (u, v). Access to the texture data stored in the texture buffer 147a is performed using the texture coordinate data (u, v).
That is, the polygon rendering data consists of the physical coordinate values of each vertex of the triangle, the color of each vertex, and texture data.
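The texture coordinate calculation described above can be sketched as follows. This is a minimal illustration only: the function name `texture_coords`, the use of floating-point arithmetic, and the sample values are assumptions and are not taken from the disclosed circuit.

```python
def texture_coords(s, t, q, usize, vsize):
    """Return texture coordinates (u, v) = (s/q * USIZE, t/q * VSIZE).

    Hypothetical helper for illustration; the hardware described in the
    text performs the same division and multiplication in dedicated circuits.
    """
    if q == 0:
        raise ValueError("homogeneous term q must be non-zero")
    return (s / q) * usize, (t / q) * vsize


# Example: s = 0.5, t = 0.25, q = 2.0 on a texture of USIZE = 256, VSIZE = 128.
u, v = texture_coords(0.5, 0.25, 2.0, 256, 128)
print(u, v)
```

Dividing by q before scaling is what makes the texture lookup perspective-correct, since s, t, and q are each interpolated linearly across the triangle.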
[0023]
Hereinafter, the rendering circuit 14 will be described in detail.
As shown in FIG. 1, the rendering circuit 14 includes a host interface circuit 149, a DDA (Digital Differential Analyzer) setup circuit 141, a triangle DDA circuit 142, a texture engine circuit 143, a memory interface (I / F) circuit 144, a CRT control circuit 145, a RAMDAC circuit 146, a DRAM 147, and an SRAM (Static RAM) 148.
In the rendering circuit 14 according to this embodiment, a logic circuit and a DRAM 147 that stores at least display data and texture data are mixedly mounted in one semiconductor chip.
[0024]
DRAM 147
The DRAM 147 functions as a texture buffer 147a, a display buffer 147b, a z buffer 147c, and a texture CLUT (Color Look Up Table) buffer 147d.
The DRAM 147 is divided into a plurality (four in this embodiment) of modules having the same function, as will be described later.
[0025]
In addition, in order to store more texture data, the DRAM 147 stores an index in the index color and a color lookup table value for the index color in the texture CLUT buffer 147d.
The index and color lookup table values are used for texture processing. That is, a texture element is usually expressed by a total of 24 bits, 8 bits for each of R, G, and B. Because this inflates the amount of data, one color is instead selected from, for example, 256 colors chosen in advance, and that data is used for texture processing. With 256 colors, each texture element can be expressed by 8 bits. Although a conversion table from index to actual color is necessary, the higher the texture resolution, the more compact the texture data becomes.
Thereby, the texture data can be compressed, and the built-in DRAM can be used efficiently.
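As a rough illustration of the index color scheme, the sketch below expands 8-bit indices through a 256-entry lookup table and compares storage sizes. The helper names, the gray-scale table, and the 64 × 64 texture size are assumptions chosen for the example.

```python
def expand_indexed_texture(indices, clut):
    """Convert a list of color indices to (R, G, B) tuples via the lookup table."""
    return [clut[i] for i in indices]


def texture_bytes(num_texels, bits_per_texel):
    """Storage needed for the texel array alone, in bytes."""
    return num_texels * bits_per_texel // 8


# Hypothetical 256-entry table in which entry i is the gray level (i, i, i).
clut = [(i, i, i) for i in range(256)]
texels = expand_indexed_texture([0, 128, 255], clut)

# Storage for a 64 x 64 texture: 24-bit full color versus 8-bit indices
# plus the 256 x 3-byte conversion table.
full_color_size = texture_bytes(64 * 64, 24)
indexed_size = texture_bytes(64 * 64, 8) + 256 * 3
print(texels, full_color_size, indexed_size)
```

The table is a fixed overhead, so the saving grows with the number of texels, which matches the remark that higher-resolution textures benefit more.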
[0026]
The DRAM 147 stores depth information of an object to be drawn in order to perform hidden surface processing in parallel with drawing.
Note that the display data, depth data, and texture data are laid out as follows: display data is stored contiguously from the beginning of the memory block, depth data is stored next, and texture data is stored in the remaining space in a contiguous address space for each texture type. Thereby, texture data can be stored efficiently.
[0027]
FIG. 2 is a block diagram illustrating a specific configuration example of the DRAM 147, the SRAM 148, and the memory I / F circuit 144 that accesses the DRAM 147 and the SRAM 148.
[0028]
As shown in FIG. 2, the DRAM 147 and the SRAM 148 shown in FIG. 1 are divided into four memory modules 200, 210, 220, and 230 as described above.
[0029]
The memory module 200 includes memories 201 and 202.
The memory 201 includes banks 201A and 201B that constitute a part of the DRAM 147, and banks 201C and 201D that constitute a part of the SRAM 148.
The memory 202 includes banks 202A and 202B that constitute a part of the DRAM 147, and banks 202C and 202D that constitute a part of the SRAM 148.
It should be noted that the banks 201C, 201D, 202C, 202D constituting the SRAM 148 can be accessed simultaneously.
[0030]
The memory module 210 includes memories 211 and 212.
The memory 211 includes banks 211A and 211B that constitute a part of the DRAM 147, and banks 211C and 211D that constitute a part of the SRAM 148.
The memory 212 includes banks 212A and 212B that constitute part of the DRAM 147, and banks 212C and 212D that constitute part of the SRAM 148.
Note that the banks 211C, 211D, 212C and 212D constituting the SRAM 148 can be accessed simultaneously.
[0031]
The memory module 220 includes memories 221 and 222.
The memory 221 includes banks 221A and 221B that constitute a part of the DRAM 147, and banks 221C and 221D that constitute a part of the SRAM 148.
The memory 222 includes banks 222A and 222B that constitute a part of the DRAM 147, and banks 222C and 222D that constitute a part of the SRAM 148.
Note that the banks 221C, 221D, 222C, and 222D constituting the SRAM 148 can be accessed simultaneously.
[0032]
The memory module 230 includes memories 231 and 232.
The memory 231 includes banks 231A and 231B that constitute a part of the DRAM 147, and banks 231C and 231D that constitute a part of the SRAM 148.
The memory 232 includes banks 232A and 232B that constitute part of the DRAM 147, and banks 232C and 232D that constitute part of the SRAM 148.
Note that the banks 231C, 231D, 232C, and 232D constituting the SRAM 148 can be accessed simultaneously.
[0033]
Here, each of the memory modules 200, 210, 220, and 230 has all the functions of the texture buffer 147a, the display buffer 147b, the Z buffer 147c, and the texture CLUT buffer 147d shown in FIG.
That is, each of the memory modules 200, 210, 220, and 230 stores all of texture data, drawing data ((R, G, B) data), z data, and texture color lookup table data of the corresponding pixel.
However, the memory modules 200, 210, 220, and 230 store data about different pixels.
Here, the texture data, drawing data, z data, and texture color lookup table data for the 16 pixels processed simultaneously are stored in mutually different banks 201A, 201B, 202A, 202B, 211A, 211B, 212A, 212B, 221A, 221B, 222A, 222B, 231A, 231B, 232A, and 232B.
As a result, the memory I / F circuit 144 can simultaneously write to the DRAM 147, for example, data for 16 pixels of 8 pixels (X direction) × 2 pixels (Y direction).
Likewise, it can simultaneously read, for example, data for 8 pixels of 8 pixels (X direction) × 1 pixel (Y direction).
Note that the memory I / F circuit 144 accesses (writes) the DRAM 147 based on so-called predetermined interleaving addressing, as will be described later.
[0034]
FIG. 3 is a schematic diagram illustrating a configuration example of the DRAM 147 as a buffer (for example, a texture buffer).
As shown in FIG. 3, data accessed in a memory area of 2 × 8 pixels is stored in an area called a page (row) or a block (column).
Each row ROW0 to ROWn + 1 is divided into four columns (blocks) M0A, M0B, M1A, and M1B as shown in FIG.
At the time of writing, access is made on even-numbered boundaries in the X direction and the Y direction; at the time of reading, access is made on boundaries that are multiples of 8 in the X direction and on arbitrary boundaries in the Y direction.
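The boundary rules for write and read access can be expressed as simple alignment checks. The sketch below assumes that the representative point is the (x, y) pixel coordinate at which an access starts; the function names are hypothetical.

```python
def is_valid_write_origin(x, y):
    """Writes must start on an even boundary in both X and Y."""
    return x % 2 == 0 and y % 2 == 0


def is_valid_read_origin(x, y):
    """Reads must start on a multiple of 8 in X; any Y is allowed."""
    return x % 8 == 0


def snap_read_origin(x, y):
    """Round X down to the nearest permitted read boundary (assumed policy)."""
    return (x // 8) * 8, y


print(is_valid_write_origin(4, 2), is_valid_read_origin(4, 5), snap_read_origin(13, 5))
```

A requester whose desired start point violates these rules has to fetch an aligned block and then shift or mask the data, which is the role the data rearrangement unit plays later in the description.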
[0035]
Banks 201C, 201D, 202C, 202D, 211C, 211D, 212C, 212D, 221C, 221D, 222C, 222D, 231C, 231D, 232C, and 232D store the data stored in banks 201A, 201B, 202A, 202B, 211A, 211B, 212A, 212B, 221A, 221B, 222A, 222B, 231A, 231B, 232A, and 232B, respectively.
[0036]
Next, the texture data storage pattern in the texture buffer 147a based on the interleaved addressing will be described in more detail with reference to FIGS.
FIG. 4 is a diagram for explaining the color data (pixel data) that are included in the texture data and accessed simultaneously, FIG. 5 is a diagram for explaining the unit blocks constituting the texture data, and FIG. 6 is a diagram for explaining the address space of the texture buffer.
[0037]
In the case of the present embodiment, as shown in FIG. 4, color data pix₀ to pix₁₅, which indicate the color data of pixels arranged in a 2 × 8 matrix included in the texture data, are accessed simultaneously.
[0038]
Color data pix₀ to pix₁₅ need to be stored in different banks of the SRAM 148 that make up the texture buffer 147a.
In this embodiment, the color data pix₀, pix₁, pix₈, and pix₉ are stored in the banks 201C and 201D of the memory 201 and the banks 202C and 202D of the memory 202 shown in FIG. Also, color data pix₂, pix₃, pix₁₀, and pix₁₁ are stored in the banks 211C and 211D of the memory 211 and the banks 212C and 212D of the memory 212 shown in FIG. Also, color data pix₄, pix₅, pix₁₂, and pix₁₃ are stored in the banks 221C and 221D of the memory 221 and the banks 222C and 222D of the memory 222 shown in FIG. Furthermore, color data pix₆, pix₇, pix₁₄, and pix₁₅ are stored in the banks 231C and 231D of the memory 231 and the banks 232C and 232D of the memory 232 shown in FIG.
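The assignment of the sixteen color data to the four memory modules follows a regular pattern, which the sketch below reproduces. The formula is an observation drawn from the listed assignment, and the interpretation that pix₀ to pix₇ form one row and pix₈ to pix₁₅ the other is an assumption; the function name is hypothetical.

```python
MODULES = (200, 210, 220, 230)


def module_for_pixel(i):
    """Return the memory module number that holds color data pix_i (0..15)."""
    if not 0 <= i <= 15:
        raise ValueError("pixel index must be in 0..15")
    column = i % 8                # assumed: position within a row of 8 pixels
    return MODULES[column // 2]   # each module takes two adjacent columns


for m in MODULES:
    print(m, [i for i in range(16) if module_for_pixel(i) == m])
```

Each module receives exactly four of the sixteen pixels, one per bank, so a whole 2 × 8 block can be accessed without two pixels contending for the same bank.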
[0039]
In the present embodiment, the color data pix₀ to pix₁₅ of pixels that are located in a rectangular area and processed simultaneously are called a unit block R_i. For example, texture data indicating one image consists of unit blocks R₀ to R_{BA-1} arranged in a B × A matrix, as shown in FIG.
The unit blocks R₀ to R_{BA-1} are stored in the DRAM 147 constituting the texture buffer 147a so as to have continuous addresses in a one-dimensional address space, as shown in FIG. The color data pix₀ to pix₁₅ in each of the unit blocks R₀ to R_{BA-1} are stored in different banks of the SRAM 148 so as to have consecutive addresses in a one-dimensional address space.
That is, the texture buffer 147a stores unit blocks composed of color data that are accessed simultaneously so as to have continuous addresses in a one-dimensional address space.
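The one-dimensional arrangement of unit blocks can be illustrated with a simple address calculation. This sketch assumes that addresses are counted in pixels and that the sixteen color data of a unit block occupy consecutive addresses in index order; neither detail is specified in the text, and the function names are hypothetical.

```python
PIXELS_PER_BLOCK = 16  # one unit block holds pix_0 .. pix_15


def texel_address(block_index, pixel_index):
    """Linear address of pix_<pixel_index> inside unit block R_<block_index>."""
    if not 0 <= pixel_index < PIXELS_PER_BLOCK:
        raise ValueError("pixel index must be in 0..15")
    return block_index * PIXELS_PER_BLOCK + pixel_index


def block_and_pixel(address):
    """Inverse mapping: recover (block index, pixel index) from an address."""
    return divmod(address, PIXELS_PER_BLOCK)


print(texel_address(2, 5), block_and_pixel(37))
```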
[0040]
Host interface circuit 149
The host interface circuit 149 is connected to the main bus 15 outside the rendering circuit 14, and is connected to the DDA setup circuit 141 and the memory I / F circuit 144 inside the rendering circuit 14.
As will be described later, the host interface circuit 149 serves as an interface for performing communication between the memory I / F circuit 144 and the main processor 11 and the I / O interface circuit 13.
[0041]
DDA setup circuit 141
Before the triangle DDA circuit 142 in the subsequent stage linearly interpolates the values at the vertices of the triangle on the physical coordinate system to obtain the color and depth information of each pixel inside the triangle, the DDA setup circuit 141 performs a setup calculation on the (z, R, G, B, s, t, q) data indicated by the polygon rendering data S11 to obtain the differences along the sides of the triangle and in the horizontal direction.
Specifically, this setup calculation uses the start point value, the end point value, and the distance between the start point and the end point to calculate the variation of the value in question per unit length of movement.
The DDA setup circuit 141 outputs the calculated variation data S141 to the triangle DDA circuit 142.
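The setup calculation amounts to computing a per-unit-length slope for every interpolated component. The following is a minimal sketch under the assumption that each component is handled independently with floating-point arithmetic; the function name and sample values are illustrative.

```python
def dda_setup(start, end, distance):
    """Return the variation per unit length for each component.

    `start` and `end` are tuples of (z, R, G, B, s, t, q) values at the
    two ends of an edge or span; `distance` is the length between them.
    """
    if distance == 0:
        raise ValueError("distance between start and end must be non-zero")
    return tuple((e - s) / distance for s, e in zip(start, end))


# Example: a span 4 units long.
start = (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0)
end = (8.0, 64.0, 32.0, 16.0, 4.0, 2.0, 2.0)
variation = dda_setup(start, end, 4)
print(variation)
```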
[0042]
Triangle DDA circuit 142
The triangle DDA circuit 142 uses the variation data S141 input from the DDA setup circuit 141 to calculate linearly interpolated (z, R, G, B, s, t, q) data for each pixel inside the triangle.
The triangle DDA circuit 142 outputs the (x, y) data of each pixel and the (z, R, G, B, s, t, q) data at those (x, y) coordinates to the texture engine circuit 143 as DDA data (interpolation data) S142.
For example, the triangle DDA circuit 142 outputs, to the texture engine circuit 143, DDA data S142 for 8 (= 2 × 4) pixels located in a rectangle to be processed in parallel.
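Incremental interpolation of this kind can be sketched by repeatedly adding the per-pixel variation to a running value. The example below walks a single horizontal span; the function name, the span-based traversal, and the two-component sample data are assumptions, and the actual circuit processes eight pixels in parallel.

```python
def interpolate_span(x_start, count, start_values, step):
    """Return [(x, values), ...] for `count` consecutive pixels.

    `start_values` holds the interpolated components at x_start and
    `step` holds their variation per pixel in the X direction.
    """
    values = list(start_values)
    out = []
    for i in range(count):
        out.append((x_start + i, tuple(values)))
        # Digital differential analysis: one addition per component per pixel.
        values = [v + d for v, d in zip(values, step)]
    return out


span = interpolate_span(10, 3, (0.0, 1.0), (0.5, 0.25))
print(span)
```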
[0043]
Texture engine circuit 143
The texture engine circuit 143 performs the calculation of “s / q” and “t / q”, the calculation of texture coordinate data (u, v), the reading of (R, G, B) data from the texture buffer 147a, and so on in a pipelined manner.
Note that the texture engine circuit 143 simultaneously performs, for example, processing on eight pixels located within a predetermined rectangle.
[0044]
The texture engine circuit 143 performs an operation for dividing the s data by the q data and an operation for dividing the t data by the q data for the (s, t, q) data indicated by the DDA data S142.
The texture engine circuit 143 is provided with, for example, eight division circuits (not shown), and divisions “s / q” and “t / q” for eight pixels are performed simultaneously.
[0045]
Also, the texture engine circuit 143 multiplies the division results “s / q” and “t / q” by the texture sizes USIZE and VSIZE to generate texture coordinate data (u, v).
Further, the texture engine circuit 143 outputs a read request including the generated texture coordinate data (u, v) to the SRAM 148 or the DRAM 147 via the memory I / F circuit 144, and reads the texture data stored in the SRAM 148 or the texture buffer 147a via the memory I / F circuit 144, thereby obtaining the (R, G, B) data S148 stored at the texture address corresponding to the (s, t) data.
Here, as described above, the texture data stored in the texture buffer 147a is stored in the SRAM 148.
The texture engine circuit 143 combines the (R, G, B) data of the read (R, G, B) data S148 with the (R, G, B) data included in the DDA data S142 from the triangle DDA circuit 142 in the preceding stage to generate color data S143.
The texture engine circuit 143 outputs the color data S143 to the memory I / F circuit 144.
[0046]
The texture buffer 147a stores texture data corresponding to a plurality of reduction ratios such as MIPMAP (multiple resolution texture). Here, which reduction rate of texture data is used is determined in units of triangles using a predetermined algorithm.
[0047]
The texture engine circuit 143 directly uses the (R, G, B) data read from the texture buffer 147a in the case of the full color system.
On the other hand, in the case of the index color method, the texture engine circuit 143 reads a color lookup table (CLUT) created in advance from the texture CLUT buffer 147d, transfers it to and stores it in the built-in SRAM, and uses this color lookup table to obtain the (R, G, B) data corresponding to the color index read from the texture buffer 147a.
[0048]
Memory I / F circuit 144
The memory I / F circuit 144 compares the z data corresponding to the color data S143 input from the texture engine circuit 143 with the z data stored in the z buffer 147c, and determines whether the image drawn by the input color data S143 is positioned on the near side (viewpoint side) of the image previously written to the display buffer 147b. If it is on the near side, the memory I / F circuit 144 updates the z data stored in the z buffer 147c with the z data corresponding to the color data S143.
Further, the memory I / F circuit 144 writes (R, G, B) data to the display buffer 147b.
Further, when the memory I / F circuit 144 receives from the texture engine circuit 143 a read request to the SRAM 148 that includes the generated texture coordinate data (u, v), it reads the (R, G, B) data S148 stored in the SRAM 148.
Further, when the memory I / F circuit 144 receives a request to read display data from the CRT control circuit 145, it reads the display data from the display buffer 147b in fixed amounts, for example in units of 8 pixels or 16 pixels, in response to the request.
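The hidden surface processing performed at write time can be sketched as a depth-tested write. The sketch assumes that a smaller z value means nearer to the viewpoint and models the buffers as nested Python lists; both choices, and the function name, are illustrative assumptions.

```python
def write_pixel(display_buffer, z_buffer, x, y, rgb, z):
    """Write `rgb` at (x, y) only if it is nearer than what is stored.

    Returns True when the display buffer and z buffer were updated.
    """
    if z < z_buffer[y][x]:          # assumption: smaller z is nearer
        z_buffer[y][x] = z          # update the stored depth
        display_buffer[y][x] = rgb  # write the (R, G, B) data
        return True
    return False


# Usage: a 1 x 2 pixel buffer initialised to "infinitely far" and black.
z_buf = [[float("inf"), float("inf")]]
disp = [[(0, 0, 0), (0, 0, 0)]]
print(write_pixel(disp, z_buf, 1, 0, (10, 20, 30), 3.0), disp)
```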
[0049]
Further, the memory I / F circuit 144 outputs the data read from the DRAM 147 to the host interface circuit 149.
[0050]
The memory I / F circuit 144 accesses (writes or reads) the DRAM 147 and the SRAM 148, but the write path and the read path are configured as separate paths.
That is, in the case of writing, the write address ADRW and the color data cwd are processed by the write system circuit and written to the DRAM 147, and in the case of reading, processing is performed by the read system circuit and data is read from the DRAM 147 or the SRAM 148.
The memory I / F circuit 144 accesses the DRAM 147 based on predetermined interleaved addressing, writing in units of 16 pixels and reading in units of 8 pixels.
[0051]
A specific configuration example of the memory I / F circuit 144 will be described below with reference to FIG.
[0052]
As shown in FIG. 2, the memory I / F circuit 144 includes a distributor 300, address converters 310, 320, 330, and 340, memory controllers 350, 360, 370, and 380, a read controller 390, and a data rearrangement unit 400.
[0053]
[Distributor 300]
At the time of writing, the distributor 300 receives the color data cwd for 8 pixels and the write address ADRW₂ from the data rearrangement unit 400, generates color data for 16 pixels from these, divides the result into four pieces of image data S301, S302, S303, and S304, each consisting of data for four pixels, and outputs the image data and the write address to the address converters 310, 320, 330, and 340, respectively.
Here, (R, G, B) data for one pixel is 32 bits each, and z data is 32 bits.
[0054]
[Address converters 310, 320, 330, 340]
At the time of writing, the address converters 310, 320, 330, and 340 convert the addresses corresponding to the (R, G, B) data and z data input from the distributor 300 into addresses within the memory modules 200, 210, 220, and 230, respectively, and output the converted addresses S310, S320, S330, and S340 together with the divided image data to the memory controllers 350, 360, 370, and 380, respectively.
[0055]
FIG. 7 is a diagram schematically showing image data processing (pixel processing) of the distributor 300.
This figure corresponds to FIG. 3 to FIG. 6 described above, and the distributor 300 performs image data processing so that data for 16 pixels of, for example, 2 × 8 pixels can be accessed simultaneously in the DRAM 147.
The image data is processed so that addressing is such that, in the read operation, access is made in an area on a boundary that is a multiple of 8 in the X direction and on an arbitrary boundary in the Y direction, and, in the write operation, access is made in an area on even boundaries in the X direction and the Y direction.
As a result, access to the DRAM 147 never starts at memory cell number MCN “1”, “2”, or “3” but always at memory cell number MCN “0”, which prevents page violations.
The distributor 300 also processes the color data for each of the DRAM modules 200 to 230 so that adjacent portions in the display area are placed in different DRAM modules.
As a result, when a plane such as a triangle is drawn, processing can be performed simultaneously on the plane, so that the operation probability of each DRAM module is very high.
[0056]
[Memory controllers 350, 360, 370, 380]
The memory controllers 350, 360, 370, and 380 are connected to the memory modules 200, 210, 220, and 230 via the write system wiring groups 401W, 402W, 411W, 412W, 421W, 422W, 431W, and 432W and the read system wiring groups 401R, 402R, 411R, 412R, 421R, 422R, 431R, and 432R, respectively, and control access to the memory modules 200, 210, 220, and 230 during writing and reading.
[0057]
Specifically, at the time of writing, the memory controllers 350, 360, 370, and 380 simultaneously write the (R, G, B) data and z data for four pixels, which are output from the distributor 300 and input from the address converters 310, 320, 330, and 340, into the memory modules 200, 210, 220, and 230 via the write system wiring groups 401W, 402W, 411W, 412W, 421W, 422W, 431W, and 432W.
At this time, for example, in the memory module 200, as described above, (R, G, B) data and z data for one pixel are stored in each of the banks 201A, 201B, 202A, 202B. The same applies to the memory modules 210, 220, and 230.
[0058]
Further, each of the memory controllers 350, 360, 370, and 380 outputs an active idle signal S350, S360, S370, or S380 to the read controller 390 when its own state machine is in the so-called idle (IDLE) state. On receiving the read address and read request signal S391 that the read controller 390 issues in response to the idle signals S350, S360, S370, and S380, each memory controller reads data via the read system wiring groups 401R, 402R, 411R, 412R, 421R, 422R, 431R, and 432R and outputs it to the read controller 390 via the read system wiring groups 351, 361, 371, and 381 and the wiring group 440.
[0059]
In the present embodiment, the write system wiring groups 401W, 402W, 411W, 412W, 421W, 422W, 431W, and 432W and the read system wiring groups 401R, 402R, 411R, 412R, 421R, 422R, 431R, and 432R each have 128 wires (128 bits), the read system wiring groups 351, 361, 371, and 381 each have 256 wires (256 bits), and the read system wiring group 440 has 1024 wires (1024 bits).
[0060]
[Read controller 390]
The read controller 390 includes an address converter 391 and a data operation processing unit 392.
When the address converter 391 has received a read address ADRR₂ and has received all of the idle signals S350, S360, S370, and S380 from the memory controllers 350, 360, 370, and 380 in the active state, it responds to the idle signals by outputting the read address and read request signal S391 to each of the memory controllers 350, 360, 370, and 380 so that data is read in units of 8 pixels.
The data operation processing unit 392 receives, via the wiring group 440, the texture data, (R, G, B) data, z data, and texture color lookup table data read in units of 8 pixels or 16 pixels by each of the memory controllers 350, 360, 370, and 380 in response to the read address and read request signal S391, performs predetermined arithmetic processing on them, and, after data rearrangement by the data rearrangement unit 400, outputs them to the request destination, for example the host interface circuit 149.
Even when data is read from one rectangular area in the storage area of the DRAM 147 and written to another rectangular area, data transfer is performed via the read controller 390, the data rearrangement unit 400, and the distributor 300.
[0061]
As described above, the read controller 390 outputs the read address and read request signal S391 to the memory controllers 350, 360, 370, and 380 only when all of the memory controllers 350, 360, 370, and 380 are in the idle state, and then receives the read data, so the read data can be synchronized.
Therefore, the read controller 390 does not need to have a holding circuit such as a FIFO (First In First Out) circuit that temporarily holds data, and the circuit scale is reduced.
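The gating behavior of the read controller can be sketched as follows: a read request is issued to every memory controller only when all idle signals are active. The function name and the tuple used to represent a request are assumptions for illustration.

```python
def issue_read(idle_signals, read_address):
    """Return one read request per memory controller, or None if any is busy.

    `idle_signals` is a list of booleans, one per memory controller
    (True means the controller's state machine is idle).
    """
    if all(idle_signals):
        # Every controller starts the same read at the same time, so the
        # returned data arrive together and need no buffering to align them.
        return [("read", read_address) for _ in idle_signals]
    return None


print(issue_read([True, True, True, True], 0x40))
print(issue_read([True, False, True, True], 0x40))
```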
[0062]
[Data rearrangement unit 400]
The data rearrangement unit 400 has three transfer modes of “Host to Local”, “Local to Host”, and “Local to Local”, and rearranges data according to each transfer mode.
For example, the data rearrangement unit 400 determines the transfer mode and the data length of one pixel based on the transfer mode instruction signal and the pixel data length instruction signal input from the texture engine circuit 143.
The data length of one pixel includes, for example, 32 bits, 16 bits, 8 bits, and 4 bits.
Here, the data length of 32 bits and 16 bits per pixel is used for texture data, and the data length of 8 bits and 4 bits per pixel is used for index data.
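The effect of the pixel data length on a transfer can be illustrated with a small calculation. The sketch assumes the 64-bit transfer data width used between the texture engine or host side and the data rearrangement unit in this embodiment; the function name is hypothetical.

```python
TRANSFER_WIDTH_BITS = 64  # width of transfer data hwd / hrd


def pixels_per_transfer(bits_per_pixel):
    """Number of pixels that fit in one 64-bit transfer word."""
    if bits_per_pixel not in (32, 16, 8, 4):
        raise ValueError("supported pixel lengths are 32, 16, 8, and 4 bits")
    return TRANSFER_WIDTH_BITS // bits_per_pixel


for bits in (32, 16, 8, 4):
    print(bits, pixels_per_transfer(bits))
```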
[0063]
In the “Host to Local” transfer mode, the data rearrangement unit 400 receives the 64-bit-wide transfer data hwd (color data S143) and the write address ADRW₁ from the texture engine circuit 143, and outputs to the distributor 300 the 256-bit-wide color data cwd and an address ADRW₂ that includes the representative point coordinates (btrx, btry) and a valid flag btrvld.
[0064]
In addition, in the “Local to Host” transfer mode, the data rearrangement unit 400 generates an address ADRR₂ according to the address ADRR₁ input from the host interface circuit 149, packs the 256-bit color data mc_0dtr to mc_7dtr for 8 pixels, input from the read controller 390 as a result of the read operation performed based on the address ADRR₂, to generate 64-bit-wide transfer data hrd, and outputs it to the host interface circuit 149.
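The packing step in the “Local to Host” mode can be sketched for the 32-bit pixel case. The ordering used here (two pixels per 64-bit word, lower-numbered pixel in the least significant bits) is an assumption made for the example, as is the function name.

```python
MASK32 = 0xFFFFFFFF


def pack_for_host(pixels):
    """Pack eight 32-bit pixels (256 bits) into four 64-bit transfer words."""
    if len(pixels) != 8:
        raise ValueError("expected exactly eight 32-bit pixels")
    words = []
    for i in range(0, 8, 2):
        low = pixels[i] & MASK32        # assumed: even pixel in bits 0..31
        high = pixels[i + 1] & MASK32   # assumed: odd pixel in bits 32..63
        words.append((high << 32) | low)
    return words


hrd_words = pack_for_host([1, 2, 3, 4, 5, 6, 7, 8])
print([hex(w) for w in hrd_words])
```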
[0065]
Furthermore, in the “Local to Local” transfer mode, in which data is read from one rectangular storage area in the storage area of the DRAM 147 and transferred (written) to another rectangular storage area, the data rearrangement unit 400 receives the 256-bit color data mc_0dtr to mc_7dtr from the read controller 390, shifts it according to the start point coordinates of the transfer-destination rectangular storage area to generate color data cwd, and outputs to the distributor 300 the color data cwd and an address ADRW₂ that includes the representative point coordinates (btrx, btry) of the transfer destination and the valid flag btrvld used in the write operation.
[0066]
FIG. 8 is a configuration diagram of the data rearrangement unit 400 shown in FIG.
As shown in FIG. 8, the data rearrangement unit 400 includes FIFO (First In First Out) circuits 410₁ to 410₃, a latch circuit 410₄, a data rearrangement circuit 420, an address generation unit 430, and a controller 440.
[0067]
The FIFO circuit 410₁ is, for example, a FIFO circuit with a 64-bit width and a depth of three stages; based on a control signal S440₁ from the controller 440, it outputs the 64-bit-wide transfer data hwd input from the texture engine circuit 143 to the data rearrangement circuit 420 on a first-in, first-out basis.
The FIFO circuit 410₂ is, for example, a FIFO circuit with a 64-bit width and a depth of three stages; based on a control signal S440₂ from the controller 440, it outputs the 64-bit-wide transfer data hrd input from the data rearrangement circuit 420 to the host interface circuit 149 on a first-in, first-out basis.
[0068]
The FIFO circuit 410₃ is, for example, a 64-bit FIFO circuit with a depth of 16 stages; based on a control signal S440₃ from the controller 440, it outputs the color data mc_0dtr to mc_7dtr for 8 pixels, each 32 bits, input from the read controller 390 to the data rearrangement circuit 420 on a first-in, first-out basis.
The color data mc_0dtr to mc_7dtr of the FIFO circuit 410₃ are input to the controller 440, as will be described later.
The latch circuit 410₄ is, for example, 256 bits wide; based on a control signal S440₄ from the controller 440, it outputs the 256-bit-wide color data cwd_a input from the data rearrangement circuit 420 to the distributor 300 shown in FIG.
[0069]
FIG. 9 is a configuration diagram of the data rearrangement circuit 420 shown in FIG.
As shown in FIG. 9, the data rearrangement circuit 420 includes shift circuits 450₁ and 450₂, selectors 451₀ to 451₇, latch circuits 452₀ to 452₇, and latch circuits 453₀ to 453₇.
[0070]
When a “Host to Local” transfer is performed, the shift circuit 450₁ receives, based on the control signal S440₆ from the controller 440, 256 bits (8 pixels) of color data in which the 64-bit-wide transfer data hwd from the texture engine circuit 143, that is, 2 pixels of color data, is placed on the LSB side. It shifts this color data in units of 32 bits and outputs the resulting 256-bit color data mc_0a to mc_7a.
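The 32-bit-unit shift described above can be illustrated with a minimal sketch. It models the 256-bit word as a Python integer with eight 32-bit lanes, lane 0 being the LSB side. The function names, the pixel values, and the choice to drop bits shifted past lane 7 are assumptions made for illustration only; they are not taken from the circuit itself.

```python
LANE_BITS = 32          # one pixel lane, matching the 32-bit shift unit in the text
LANES = 8               # 8 lanes x 32 bits = 256 bits
WORD_MASK = (1 << (LANE_BITS * LANES)) - 1
LANE_MASK = (1 << LANE_BITS) - 1


def shift_toward_msb(word, lanes):
    """Shift a 256-bit word toward the MSB by whole lanes (overflow is dropped)."""
    return (word << (LANE_BITS * lanes)) & WORD_MASK


def lane(word, index):
    """Return the 32-bit value held in lane `index` (lane 0 is the LSB side)."""
    return (word >> (LANE_BITS * index)) & LANE_MASK


# Hypothetical pixel values standing in for the two pixels of one transfer word.
pix_a, pix_b = 0x11111111, 0x22222222
word = pix_a | (pix_b << LANE_BITS)     # two pixels placed on the LSB side
shifted = shift_toward_msb(word, 1)     # move both pixels up by one 32-bit lane
```

After the shift, lane 0 is empty and the two pixels occupy lanes 1 and 2, which is the kind of alignment needed when the destination X coordinate is one pixel past the representative point.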
[0071]
Based on the control signal S440₆, each of the selectors 451₀ to 451₇ selects either the 32-bit color data mc_0a to mc_7a input via its input terminal a or the 32-bit color data mc_0b to mc_7b input via its input terminal b, and outputs the selected data from its output terminal c to the corresponding latch circuit 452₀ to 452₇.
[0072]
Based on the control signal S440₆, the latch circuits 452₀ to 452₇ latch the 32-bit color data output from the output terminals c of the selectors 451₀ to 451₇, respectively, and output it to the FIFO circuit 410₂ or the latch circuit 410₄.
Based on the control signal S440₆, the latch circuits 453₀ to 453₇ latch the color data mc_0a to mc_7a input from the shift circuit 450₁, respectively, and output it to the shift circuit 450₂ and to the FIFO circuit 410₂ or the latch circuit 410₄.
Here, the 32-bit output lines of the latch circuits 452₀ to 452₇ are shared with the 32-bit output lines of the latch circuits 453₀ to 453₇, respectively, and control is performed so that, on each line, 32-bit color data is output from only one of the corresponding pair of latch circuits.
[0073]
The shift circuit 450₂ shifts the 256-bit color data input from the latch circuits 453₀ to 453₇ in units of 32 bits and outputs the resulting 32-bit color data mc_0b to mc_7b to the input terminals b of the selectors 451₀ to 451₇, respectively.
[0074]
Hereinafter, the data rearrangement process in the data rearrangement unit 400 shown in FIG. 8 and the address generation process in the address generation unit 430 will be described separately for each transfer mode.
As described above, writing to the DRAM 147 is performed in units of a rectangular storage area that stores 16 pixels of color data, 8 pixels (X direction) × 2 pixels (Y direction), with the representative point as the reference. Reading from the DRAM 147, on the other hand, is performed in units of a rectangular storage area that stores 8 pixels of color data, 8 pixels (X direction) × 1 pixel (Y direction), with the representative point as the reference.
[0075]
In a read operation, the address of the representative point may be any address in the Y direction, but must be a multiple of 8 in the X direction. In a write operation, the address of the representative point must be a multiple of 2 in both the X direction and the Y direction.
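As a minimal sketch of these addressing restrictions, the following helpers check whether a point is a legal representative point and round an arbitrary pixel coordinate down to an aligned point. The function names are hypothetical, and rounding down is only one way to pick an aligned point that precedes a given pixel; it is an illustrative assumption, not the method of the address generation unit 430.

```python
def is_valid_read_point(x, y):
    """Read: X must be a multiple of 8; Y is unrestricted."""
    return x % 8 == 0


def is_valid_write_point(x, y):
    """Write: X and Y must both be multiples of 2."""
    return x % 2 == 0 and y % 2 == 0


def read_representative_point(x, y):
    """Round X down to a multiple of 8 (assumed choice of aligned point)."""
    return (x & ~7, y)


def write_representative_point(x, y):
    """Round X and Y down to multiples of 2 (assumed choice of aligned point)."""
    return (x & ~1, y & ~1)
```

For the pixel at (5, 5), for example, these helpers give the write point (4, 4) and the read point (0, 5).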
[0076]
In the rearrangement processing described below, read and write operations of color data for the rectangular storage area 500 shown in FIG. 10, which has a width of “3”, a height of “3”, and origin coordinates (5, 5), are used as an example.
[0077]
<"Host to Local" transfer mode>
The following describes the case where the 32-bit color data pix₀ to pix₈ (the data length of one pixel is 32 bits) contained in the 64-bit-wide transfer data hwd₀ to hwd₄ shown in FIG. 11, input from the texture engine circuit 143, is transferred and written to the rectangular storage area 500.
[0078]
In this case, the texture engine circuit 143 supplies the transfer data hwd starting from the upper-left end point of the rectangular storage area 500 in FIG. 10, in order of increasing X coordinate along the scan line; the data of each subsequent scan line is likewise supplied in order of increasing X coordinate. Data belonging to different scan lines may be packed into the same 64 bits.
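The packing of scan-line-ordered pixels into 64-bit transfer words can be sketched as follows. The function name and the pixel values are hypothetical. The sketch assumes pixels are placed LSB first within each word, as described for hwd₀ later in this section, and it ignores scan-line boundaries, so pixels from different scan lines can share a word.

```python
def pack_transfer_words(pixels, bits_per_pixel):
    """Pack pixel values into 64-bit words, LSB first, ignoring scan-line boundaries."""
    per_word = 64 // bits_per_pixel
    mask = (1 << bits_per_pixel) - 1
    words = []
    for start in range(0, len(pixels), per_word):
        word = 0
        for i, value in enumerate(pixels[start:start + per_word]):
            word |= (value & mask) << (i * bits_per_pixel)
        words.append(word)
    return words


# Nine hypothetical pixel values for a 3 x 3 rectangle, in scan-line order.
pixels = list(range(1, 10))
words32 = pack_transfer_words(pixels, 32)   # two pixels per word
words16 = pack_transfer_words(pixels, 16)   # four pixels per word
```

With 32-bit pixels the nine values occupy five words, and with 16-bit pixels they occupy three, which matches the counts of transfer words hwd used in the two examples of this section.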
[0079]
First, the 64-bit-wide transfer data hwd₀ to hwd₄ shown in FIG. 11 from the texture engine circuit 143 is sequentially output to the data rearrangement circuit 420 via the FIFO circuit 410₁.
Then, the 256-bit color data in which the color data pix₀ and pix₁ of the transfer data hwd₀ are arranged in order from the LSB is shifted by “1” toward the MSB in units of 32 bits in the shift circuit 450₁, as shown in FIG. 12A. The shifted color data cwd_a₀ is output from the latch circuits 453₀ to 453₇ as the color data cwd₀, which is output to the distributor 300 via the latch circuit 410₄.
[0080]
Next, the 256-bit color data in which the color data pix₂ and pix₃ of the transfer data hwd₁ are arranged in order from the LSB is shifted by “1” toward the MSB in units of 32 bits in the shift circuit 450₁, as shown in FIG. 12B. The shifted color data cwd_a₁ is output from the latch circuits 453₀ to 453₇ as the color data cwd₁, which is output to the distributor 300 via the latch circuit 410₄.
In addition, the shifted color data cwd_a₁ is output to the shift circuit 450₂ via the selectors 451₀ to 451₇ and the latch circuits 452₀ to 452₇.
[0081]
Next, the 256-bit color data in which the color data pix₄ and pix₅ of the transfer data hwd₂ are arranged in order from the LSB is shifted by “2” toward the MSB in units of 32 bits in the shift circuit 450₁, as shown in FIG. 12C, and is then stored in the latch circuits 453₀ to 453₇ via the selectors 451₀ to 451₇.
At this time, the color data pix₄ and pix₅ are stored in the latch circuits 453₂ and 453₃.
Further, in the shift circuit 450₂ described above, the 256-bit color data cwd_a₁ is shifted by “1” toward the LSB and is then stored in the latch circuits 452₀ to 452₇ via the selectors 451₀ to 451₇. At this time, the color data pix₃ is stored in the latch circuit 452₁.
Then, the color data cwd₂ containing the color data pix₃, pix₄, and pix₅, as shown in FIG. 13C, is output from the latch circuits 453₀, 452₁, and 453₂ to 453₇ to the distributor 300 via the latch circuit 410₄.
[0082]
Next, the 256-bit color data in which the color data pix₆ and pix₇ of the transfer data hwd₃ are arranged in order from the LSB is shifted by “1” toward the MSB in units of 32 bits in the shift circuit 450₁, as shown in FIG. 12D. The shifted color data cwd_a₃ is output from the latch circuits 453₀ to 453₇ as the color data cwd₃, which is output to the distributor 300 via the latch circuit 410₄.
[0083]
Next, the 256-bit color data in which the color data pix₈ of the transfer data hwd₄ is arranged from the LSB is shifted by “1” toward the MSB in units of 32 bits in the shift circuit 450₁, as shown in FIG. 12E. The shifted color data cwd_a₄ is output from the latch circuits 453₀ to 453₇ as the color data cwd₄, which is output to the distributor 300 via the latch circuit 410₄.
[0084]
In the data rearrangement unit 400, the address generation unit 430 generates the representative point coordinates (btrx, btry) and the valid flag btrvld corresponding to each of the color data cwd₀ to cwd₄ described above.
The representative point coordinates (btrx, btry) and the valid flag btrvld are stored in the address ADRW₂ and output to the distributor 300.
Here, the representative point coordinates corresponding to the color data cwd₀ to cwd₄ are denoted (btrx₀, btry₀) to (btrx₄, btry₄), and the valid flags are denoted btrvld₀ to btrvld₄.
The address ADRW₂ is generated in the address generation unit 430 based on the control signal S440₅ from the controller 440.
[0085]
The valid flag btrvld consists of 16 bits. As described above, when writing to the DRAM 147 in units of a 16-pixel rectangular storage area (256 × 2 bits) of 8 pixels (X direction) × 2 pixels (Y direction), it indicates, for the storage area of each pixel, whether the stored contents are to be rewritten. Each bit of the valid flag btrvld corresponds to the storage area of one pixel within the 16-pixel storage area that is the unit of writing. Each bit indicates the logical value “0” when the corresponding storage area is masked (not rewritten) and the logical value “1” when it is not masked (rewritten).
[0086]
Here, the color data used in one write is 256-bit data for 8 pixels. Therefore, when the color data is written to the storage area whose Y-direction address is even among the storage areas to be written, the upper 8 bits of the 16-bit valid flag btrvld are set to the logical value “0”; when it is written to the storage area whose Y-direction address is odd, the lower 8 bits of the 16-bit valid flag btrvld are set to the logical value “0”.
[0087]
As shown in FIGS. 13A to 13E, the valid flags btrvld₀ to btrvld₄ are generated by the address generation unit 430 for the color data cwd₀ to cwd₄, respectively.
Here, for example, since the lower 8 bits of the valid flag btrvld₀ are the logical value “0”, the color data cwd₀ is written to the storage area whose Y-direction address is odd among the storage areas to be written.
[0088]
Further, the address generation unit 430 generates the representative point coordinates (btrx₀, btry₀) to (btrx₄, btry₄) so that the color data cwd₀ to cwd₄ is written to the DRAM 147 in units of the above-described 16-pixel rectangular storage area of 8 pixels (X direction) × 2 pixels (Y direction).
[0089]
As described above, the color data cwd₀ to cwd₄ shown in FIGS. 13A to 13E, generated by the data rearrangement circuit 420 of the data rearrangement unit 400 shown in FIG. 8, is processed by the distributor 300, the address converter 310, and the memory controllers 350, 360, 370, and 380 shown in FIG. 2, using the representative point coordinates (btrx₀, btry₀) to (btrx₄, btry₄) and the valid flags btrvld₀ to btrvld₄ contained in the address ADRW₂ generated in the address generation unit 430, and is written to the DRAM 147.
[0090]
First, as shown in FIG. 14A, with the representative point coordinates (btrx₀, btry₀) = (4, 4) as the reference, the color data cwd₀ is written to the storage area whose Y coordinate is “5” (odd) within the 16-pixel storage area, such that the LSB of the color data cwd₀ corresponds to the address with the smallest X coordinate in that storage area. As described above, when the lower 8 bits of the valid flag btrvld₀ are all the logical value “0”, the color data cwd₀ is written to the storage area whose Y coordinate is odd, and when the upper 8 bits are all the logical value “0”, it is written to the storage area whose Y coordinate is even.
Also, the color data cwd₀ is written only to the 32-bit storage areas corresponding to the bits of the valid flag btrvld₀ that have the logical value “1”. In this case, the valid flag btrvld₀ is “0x0600”, that is, “0000011000000000”, and its logical “1” bits correspond to the storage areas at (X, Y) coordinates (5, 5) and (6, 5). Accordingly, as shown in FIG. 14A, the color data pix₀ and pix₁ are written only to those storage areas.
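The relationship between written pixels and the valid flag can be sketched as below. The bit layout used here (bit index = 8 × row offset + column offset from the representative point) is an assumption inferred from the single worked value 0x0600 and from the statement that the lower 8 bits are zero for an odd-Y write; the function name is hypothetical.

```python
def valid_flag(btrx, btry, pixels):
    """Build a 16-bit mask for an 8 x 2 write unit anchored at (btrx, btry).

    Assumed layout: bits 0-7 cover the even-Y row, bits 8-15 the odd-Y row,
    and within each row the bit index grows with the X coordinate.
    """
    flag = 0
    for x, y in pixels:
        dx, dy = x - btrx, y - btry
        if not (0 <= dx < 8 and 0 <= dy < 2):
            raise ValueError(f"pixel ({x}, {y}) lies outside the write unit")
        flag |= 1 << (dy * 8 + dx)
    return flag


# The two pixels written in the first step, relative to representative point (4, 4).
flag0 = valid_flag(4, 4, [(5, 5), (6, 5)])
flag0_bits = format(flag0, "016b")
```

Under this assumed layout the sketch reproduces the value 0x0600 given for btrvld₀, with only bits 9 and 10 set.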
[0091]
Next, as shown in FIG. 14B, with the representative point coordinates (btrx₁, btry₁) = (6, 4) as the reference, the color data pix₂ of the color data cwd₁ shown in FIG. 13B is written to the storage area whose Y coordinate is “5” (odd) within the 16-pixel storage area. The writing method is the same as in the preceding case.
[0092]
Next, similarly, as shown in FIG. 14C, with the representative point coordinates (btrx₂, btry₂) = (4, 6) as the reference, the color data pix₃, pix₄, and pix₅ of the color data cwd₂ shown in FIG. 13C are written to the storage area whose Y coordinate is “6” (even) within the 16-pixel storage area.
[0093]
Next, similarly, as shown in FIG. 15D, with the representative point coordinates (btrx₃, btry₃) = (4, 6) as the reference, the color data pix₆ and pix₇ of the color data cwd₃ shown in FIG. 13D are written to the storage area whose Y coordinate is “7” (odd) within the 16-pixel storage area.
[0094]
Next, similarly, as shown in FIG. 15 (F), with the representative point coordinates (btrx₄, btry₄) = (6, 6) as the reference, the color data pix₈ of the color data cwd₄ shown in FIG. 13E is written to the storage area whose Y coordinate is “7” (odd) within the 16-pixel storage area.
[0095]
As a result, the 32-bit color data pix₀ to pix₈ for nine pixels, contained in the 64-bit-wide transfer data hwd₀ to hwd₄ shown in FIG. 11 and input from the texture engine circuit 143, is written to the rectangular storage area 500.
[0096]
In the example shown in FIGS. 11 to 15 described above, the color data pix₀ to pix₈ with a one-pixel data length of 32 bits is input from the texture engine circuit 143 and written to the rectangular storage area 500 shown in FIG. 10. However, the data length of one pixel is arbitrary and may be, for example, 16 bits.
The following describes the case where the color data pix₀ to pix₈ with a one-pixel data length of 16 bits is input from the texture engine circuit 143 and written to the rectangular storage area 500.
In this case, as shown in FIG. 16, the 64-bit width transfer data input from the texture engine circuit 143 includes color data for 4 pixels each.
In this case, 256-bit color data containing the color data pix₀ to pix₃ of the transfer data hwd₀ is shifted by 1 bit in the MSB direction in the shift circuit 450₁ of the data rearrangement circuit 420 to generate the 256-bit color data cwd_a₀. In response to the color data cwd_a₀, the color data cwd₀ is output from the latch circuits 453₀ to 453₇ to the distributor 300.
Next, 256-bit color data containing the color data pix₄ to pix₇ of the transfer data hwd₁ is shifted by 2 bits in the MSB direction in the shift circuit 450₁ of the data rearrangement circuit 420, and the color data pix₃ of the transfer data hwd₀ is added to generate the 256-bit color data cwd_a₁. In response to the color data cwd_a₁, the color data cwd₁ is output to the distributor 300.
[0097]
Next, 256-bit color data containing the color data pix₈ of the transfer data hwd₂ is shifted by 3 bits in the MSB direction in the shift circuit 450₁ of the data rearrangement circuit 420, and the color data pix₆ and pix₇ of the transfer data hwd₁ are added to generate the 256-bit color data cwd_a₂. In response to the color data cwd_a₂, the color data cwd₂ is output to the distributor 300.
[0098]
Also, the representative point coordinates (btrx, btry) and the valid flag btrvld shown in FIGS. 18A to 18C, corresponding to the color data cwd₀ to cwd₃, are generated in the address generation unit 430.
[0099]
As a result, the 16-bit color data pix₀ to pix₈ for nine pixels, contained in the 64-bit-wide transfer data hwd₀ to hwd₂ shown in FIG. 16 and input from the texture engine circuit 143, is written to the rectangular storage area 500.
[0100]
<"Local to Host" transfer mode>
Hereinafter, an operation when the color data mc_0dtr to mc_7dtr for 8 pixels read from the DRAM 147 is packed into 64-bit transfer data and output to the host interface circuit 149 will be described.
As described above, reading from the DRAM 147 is performed in units of a rectangular storage area that stores 8 pixels of color data, 8 pixels (X direction) × 1 pixel (Y direction), with the representative point as the reference. In a read operation, the address of the representative point may be any address in the Y direction, but must be a multiple of 8 in the X direction.
[0101]
In this case, based on the address ADRR₁ input from the host interface circuit 149, the address generation unit 430 of the data rearrangement unit 400 illustrated in FIG. 8 calculates, for the rectangular storage area to be read in the DRAM 147, the representative point coordinates (sbx, sby) in order of increasing X coordinate along the scan line, starting from the upper-left end in the X, Y coordinate system shown in FIG. 19. It then outputs the address ADRR₂ containing the representative point coordinates (sbx, sby) to the distributor 300.
[0102]
In addition, the data rearrangement circuit 420 receives, via the FIFO circuit 410₃ of the data rearrangement unit 400, the color data mc_0dtr to mc_7dtr for 8 pixels that is read from the DRAM 147 in response to the address ADRR₂ and supplied from the read controller 390 shown in FIG. 2. It packs this data into 64-bit-wide transfer data hrd and outputs it to the host interface circuit 149 via the FIFO circuit 410₂.
At this time, the color data mc_0dtr is the color data stored at the representative point coordinates of the rectangular storage area being read, and the color data mc_7dtr is the color data stored at the coordinates with the largest X coordinate in that rectangular storage area.
[0103]
As an example, the following describes the case where the color data pix₀ to pix₃₈ read from the rectangular storage area 600 in the DRAM 147 shown in FIG. 19, which has a width of 13, a height of 3, and starting point coordinates (5, 5), is packed into 64-bit-wide transfer data each containing color data for four pixels and output to the host interface circuit 149.
[0104]
First, the address ADRR₁ is input to the address generation unit 430 from the host interface circuit 149.
Then, to read the rectangular storage area 600 shown in FIG. 19, the address generation unit 430 sequentially outputs the addresses ADRR₂ containing the representative point coordinates (sbx₀, sby₀) to (sbx₈, sby₈) shown in FIGS. 19 and 20A to 20I to the distributor 300 shown in FIG. 2. After processing by the address converter 310 and the memory controllers 350, 360, 370, and 380, the color data mc_0dtr to mc_7dtr is sequentially output from the read controller 390 shown in FIG. 2 to the data rearrangement circuit 420 via the FIFO circuit 410₃ of the data rearrangement unit 400.
[0105]
In the data rearrangement circuit 420 shown in FIG. 9, the following processing is performed based on the control signal S440₆ from the controller 440.
Note that the shift circuit 450₁ has a 256 (= 32 × 8)-bit output storage area connected to the input terminals a of the selectors 451₀ to 451₇, and a 224 (= 32 × 7)-bit internal storage area located in the upper bits above the output storage area, and performs shift operations in units of 32 bits.
[0106]
First, the 256-bit color data mc_0dtr to mc_7dtr is shifted by “5” toward the LSB in units of 32 bits in the shift circuit 450₁, and the color data pix₀ to pix₂ is then stored in the latch circuits 453₀ to 453₂.
[0107]
Next, the 256-bit color data mc_0dtr to mc_7dtr is input to the shift circuit 450₁ and shifted by “3” toward the MSB in units of 32 bits, and the color data pix₃ to pix₇ stored in the output storage area is written to the latch circuits 452₃ to 452₇ via the input terminals a and output terminals c of the selectors 451₃ to 451₇.
As a result, the color data cwd_b₁, corresponding to the contents of the output storage area, is stored in the latch circuits 453₀ to 453₇.
[0108]
Then, the color data pix₀ to pix₃ stored in the latch circuits 453₀ to 453₃ is output to the host interface circuit 149 via the FIFO circuit 410₂ as the 64-bit transfer data hrd₀.
Next, the color data pix₄ to pix₇ stored in the latch circuits 453₄ to 453₇ is output to the host interface circuit 149 via the FIFO circuit 410₂ as the 64-bit transfer data hrd₁.
[0109]
Next, the color data pix₈ to pix₁₀ stored in the internal storage area of the shift circuit 450₁ is shifted by “8” toward the LSB in units of 32 bits and then written to the latch circuits 453₀ to 453₂.
Next, the color data mc_0dtr to mc_7dtr is shifted by “3” toward the MSB in units of 32 bits in the shift circuit 450₁ and output, and the color data pix₁₁ and pix₁₂ is written to the latch circuits 453₃ and 453₄.
Next, the color data mc_0dtr to mc_7dtr is output from the shift circuit 450₁ without being shifted, and the color data pix₁₃ to pix₁₅ is written to the latch circuits 453₅ to 453₇.
As a result, the color data cwd_b₃, corresponding to the contents of the output storage area, is stored in the latch circuits 453₀ to 453₇.
[0110]
Then, the color data pix₈ to pix₁₁ stored in the latch circuits 453₀ to 453₃ is output to the host interface circuit 149 via the FIFO circuit 410₂ as the 64-bit transfer data hrd₂.
Next, the color data pix₁₂ to pix₁₅ stored in the latch circuits 453₄ to 453₇ is output to the host interface circuit 149 via the FIFO circuit 410₂ as the 64-bit transfer data hrd₃.
[0111]
Next, the 256-bit color data mc_0dtr to mc_7dtr is output from the shift circuit 450₁ without being shifted, and the color data pix₁₆ to pix₂₃ is written to the latch circuits 453₀ to 453₇.
As a result, the color data cwd_b₄, corresponding to the contents of the output storage area, is stored in the latch circuits 453₀ to 453₇.
[0112]
Then, the color data pix₁₆ to pix₁₉ stored in the latch circuits 453₀ to 453₃ is output to the host interface circuit 149 via the FIFO circuit 410₂ as the 64-bit transfer data hrd₄.
Next, the color data pix₂₀ to pix₂₃ stored in the latch circuits 453₄ to 453₇ is output to the host interface circuit 149 via the FIFO circuit 410₂ as the 64-bit transfer data hrd₅.
[0113]
Next, the 256-bit color data mc_0dtr to mc_7dtr is passed through the shift circuit 450₁ without being shifted, and the color data pix₂₄ and pix₂₅ is stored in the latch circuits 453₀ and 453₁.
[0114]
Next, the 256-bit color data mc_0dtr to mc_7dtr is shifted by “3” toward the LSB in units of 32 bits in the shift circuit 450₁, and the color data pix₂₆ to pix₂₈ is written to the latch circuits 453₂ to 453₄.
[0115]
Next, the 256-bit color data mc_0dtr to mc_7dtr is shifted by “5” toward the MSB in units of 32 bits in the shift circuit 450₁. Of the color data pix₂₉ to pix₃₆, the color data pix₂₉ to pix₃₁ stored in the output storage area is written to the latch circuits 453₅ to 453₇.
As a result, the color data cwd_b₇, corresponding to the contents of the output storage area, is stored in the latch circuits 453₀ to 453₇.
[0116]
Then, the color data pix₂₄ to pix₂₇ stored in the latch circuits 453₀ to 453₃ is output to the host interface circuit 149 via the FIFO circuit 410₂ as the 64-bit transfer data hrd₆.
Next, the color data pix₂₈ to pix₃₁ stored in the latch circuits 453₄ to 453₇ is output to the host interface circuit 149 via the FIFO circuit 410₂ as the 64-bit transfer data hrd₇.
[0117]
Next, the color data pix₃₂ to pix₃₆ stored in the internal storage area of the shift circuit 450₁ is shifted by “8” toward the LSB in units of 32 bits and then written to the latch circuits 453₀ to 453₄.
Next, the color data mc_0dtr to mc_7dtr is shifted by “5” toward the MSB in units of 32 bits in the shift circuit 450₁ and output, and the color data pix₃₇ and pix₃₈ is written to the latch circuits 453₅ and 453₆.
As a result, the color data cwd_b₈ is stored in the latch circuits 453₀ to 453₆.
[0118]
Then, the color data pix₃₂ to pix₃₅ stored in the latch circuits 453₀ to 453₃ is output to the host interface circuit 149 via the FIFO circuit 410₂ as the 64-bit transfer data hrd₈.
Next, the color data pix₃₆ to pix₃₈ stored in the latch circuits 453₄ to 453₆ is output to the host interface circuit 149 via the FIFO circuit 410₂ as the 64-bit transfer data hrd₉.
[0119]
As described above, the color data pix₀ to pix₃₈ read from the rectangular storage area 600 in the DRAM 147 shown in FIG. 19, which has a width of 13, a height of 3, and starting point coordinates (5, 5), is packed into the 64-bit-wide transfer data hrd₀ to hrd₉, each containing color data for four pixels, as shown in FIG. 22, and output to the host interface circuit 149.
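The overall effect of this read-and-pack sequence can be checked with a short sketch. It walks the 8-pixel read units that cover the rectangle, keeps only the pixels that fall inside it, and groups them four to a transfer word. The function names are hypothetical, and the sketch models only which pixels end up in which word; it does not model the shift circuits, selectors, or latch circuits.

```python
def gather_rectangle_pixels(x0, y0, width, height):
    """Return the (x, y) pixels of the rectangle in the order the 8 x 1 reads deliver them."""
    first_unit = x0 & ~7                      # read units start at multiples of 8 in X
    last_unit = (x0 + width - 1) & ~7
    pixels = []
    for sby in range(y0, y0 + height):
        for sbx in range(first_unit, last_unit + 1, 8):
            # Keep only the lanes of this read unit that lie inside the rectangle.
            pixels.extend((x, sby) for x in range(sbx, sbx + 8)
                          if x0 <= x < x0 + width)
    return pixels


def group_into_transfer_words(items, per_word=4):
    """Group items into consecutive transfer words of `per_word` pixels each."""
    return [items[i:i + per_word] for i in range(0, len(items), per_word)]


pixels = gather_rectangle_pixels(5, 5, 13, 3)
words = group_into_transfer_words(pixels)
```

For the rectangle of width 13 and height 3 at (5, 5), this yields 39 pixels in ten words, the last of which holds only three pixels, in line with hrd₀ to hrd₉ above.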
[0120]
<"Local to Local" transfer mode>
In this case, the 8-pixel color data mc_0dtr to mc_7dtr read from the transfer-source storage area in the DRAM 147 with the representative point coordinates (sbx, sby) as the reference is written to the write-destination (transfer-destination) storage area with the representative point coordinates (btrx, btry) as the reference.
At this time, the representative point coordinates of the transfer-source rectangular storage area are calculated in a designated direction, rather than in a single fixed direction as in the “Host to Local” transfer mode described above.
[0121]
The following describes the case where the representative point coordinates (sbx, sby) are generated in order from right to left and from bottom to top, and the color data pix₀ to pix₃₈ read from the rectangular storage area 600 is written to a rectangular storage area with a width of 13, a height of 3, and origin coordinates (10, 7), with the representative point coordinates (btrx, btry) also generated in order from right to left and from bottom to top.
[0122]
First, to read the rectangular storage area 600 shown in FIG. 19, the address generation unit 430 shown in FIG. 8 sequentially outputs the addresses ADRR₂ containing the representative point coordinates (sbx₈, sby₈) to (sbx₀, sby₀) to the distributor 300 shown in FIG. 2. After processing by the address converter 310 and the memory controllers 350, 360, 370, and 380, the color data mc_0dtr to mc_7dtr is sequentially output from the read controller 390 shown in FIG. 2 to the data rearrangement circuit 420 via the FIFO circuit 410₃ of the data rearrangement unit 400.
That is, the representative point coordinates are output to the distributor 300 in the reverse order compared with the case described above.
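A minimal sketch of generating read representative points in a designated direction is shown below. The function name and its two flags are hypothetical, and the sketch assumes that “bottom” corresponds to the larger Y coordinate and that read units are aligned to multiples of 8 in X, as stated earlier for read operations.

```python
def read_points(x0, y0, width, height, right_to_left=False, bottom_to_top=False):
    """List the 8-pixel-aligned read representative points in the requested order."""
    xs = list(range(x0 & ~7, ((x0 + width - 1) & ~7) + 1, 8))
    ys = list(range(y0, y0 + height))
    if right_to_left:
        xs.reverse()
    if bottom_to_top:
        ys.reverse()        # assumption: larger Y is the bottom of the rectangle
    return [(x, y) for y in ys for x in xs]


forward = read_points(5, 5, 13, 3)
backward = read_points(5, 5, 13, 3, right_to_left=True, bottom_to_top=True)
```

With both directions reversed, the nine points come out in exactly the opposite order to the forward scan, which corresponds to issuing (sbx₈, sby₈) first and (sbx₀, sby₀) last.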
[0123]
Next, in the data rearrangement circuit 420 shown in FIG. 9, the following processing is performed based on the control signal S440₆ from the controller 440.
First, the color data mc_0dtr to mc_7dtr shown in FIG. 23A is shifted by “1” toward the MSB in units of 32 bits in the shift circuit 450₁ shown in FIG. 9. The shifted color data is output from the latch circuits 453₀ to 453₇ as the color data cwd₀, which is output to the distributor 300 via the latch circuit 410₄.
[0124]
Next, the color data mc_0dtr to mc_7dtr shown in FIG. 23B is written to the latch circuits 453₀ to 453₇ without being shifted in the shift circuit 450₁ and is output to the shift circuit 450₂. In the shift circuit 450₂, the data is shifted by “1” toward the LSB in units of 32 bits, and the shifted color data is written to the latch circuits 452₀ to 452₇ via the selectors 451₀ to 451₇.
Then, the color data cwd₁ is output from the latch circuits 452₀ to 452₇ to the distributor 300 via the latch circuit 410₄.
[0125]
Next, the color data shown in FIG. 23B stored in the latch circuits 453₀ to 453₇ is output to the shift circuit 450₂, where it is shifted by “3” toward the MSB in units of 32 bits, and the shifted color data is written to the latch circuits 452₀ to 452₇ via the selectors 451₀ to 451₇. As a result, the color data pix₂₉ is stored in the latch circuit 452₃.
Further, the color data mc_0dtr to mc_7dtr shown in FIG. 23C is shifted by “5” toward the LSB in the shift circuit 450₁ and written to the latch circuits 453₀ to 453₇. As a result, the color data pix₂₆ to pix₂₈ is stored in the latch circuits 453₀ to 453₂.
Then, the color data cwd₂ is output from the latch circuits 452₃ and 453₀ to 453₂ to the distributor 300 via the latch circuit 410₄.
[0126]
Next, the color data mc_0dtr to mc_7dtr shown in FIG. 23D is shifted by “1” toward the MSB in units of 32 bits in the shift circuit 450₁ shown in FIG. 9. The shifted color data is output from the latch circuits 453₀ to 453₇ as the color data cwd₃, which is output to the distributor 300 via the latch circuit 410₄.
[0127]
Next, the color data mc_0dtr to mc_7dtr shown in FIG. 23E is written to the latch circuits 453₀ to 453₇ without being shifted in the shift circuit 450₁ and is output to the shift circuit 450₂. In the shift circuit 450₂, the data is shifted by “1” toward the LSB in units of 32 bits, and the shifted color data is written to the latch circuits 452₀ to 452₇ via the selectors 451₀ to 451₇.
Then, the color data cwd₄ is output from the latch circuits 452₀ to 452₇ to the distributor 300 via the latch circuit 410₄.
[0128]
Next, the color data shown in FIG. 23E stored in the latch circuits 453₀ to 453₇ is output to the shift circuit 450₂, where it is shifted by “3” toward the MSB in units of 32 bits, and the shifted color data is written to the latch circuits 452₀ to 452₇ via the selectors 451₀ to 451₇. As a result, the color data pix₁₆ is stored in the latch circuit 452₃.
Further, the color data mc_0dtr to mc_7dtr shown in FIG. 23F is shifted by “5” toward the LSB in the shift circuit 450₁ and written to the latch circuits 453₀ to 453₇. As a result, the color data pix₁₃ to pix₁₅ is stored in the latch circuits 453₀ to 453₂.
Then, the color data cwd₅ is output from the latch circuits 452₃ and 453₀ to 453₂ to the distributor 300 via the latch circuit 410₄.
[0129]
Next, the color data mc_0dtr to mc_7dtr shown in FIG. 23G is shifted by “1” toward the MSB in units of 32 bits in the shift circuit 450₁ shown in FIG. 9. The shifted color data is output from the latch circuits 453₀ to 453₇ as the color data cwd₆, which is output to the distributor 300 via the latch circuit 410₄.
[0130]
Next, the color data mc_0dtr to mc_7dtr shown in FIG. 23 (H) is written to the latch circuits 453₀ to 453₇ without being shifted in the shift circuit 450₁ and is output to the shift circuit 450₂. In the shift circuit 450₂, the data is shifted by “1” toward the LSB in units of 32 bits, and the shifted color data is written to the latch circuits 452₀ to 452₇ via the selectors 451₀ to 451₇.
Then, the color data cwd₇ is output from the latch circuits 452₀ to 452₇ to the distributor 300 via the latch circuit 410₄.
[0131]
Next, the color data shown in FIG. 23 (H) stored in the latch circuits 453₀ to 453₇ is output to the shift circuit 450₂, where it is shifted by “3” toward the MSB in units of 32 bits, and the shifted color data is written to the latch circuits 452₀ to 452₇ via the selectors 451₀ to 451₇. As a result, the color data pix₃ is stored in the latch circuit 452₃.
Further, the color data mc_0dtr to mc_7dtr shown in FIG. 23 (I) is shifted by “5” toward the LSB in the shift circuit 450₁ and written to the latch circuits 453₀ to 453₇. As a result, the color data pix₀ to pix₂ is stored in the latch circuits 453₀ to 453₂.
Then, the color data cwd₈ is output from the latch circuits 452₃ and 453₀ to 453₂ to the distributor 300 via the latch circuit 410₄.
[0132]
In the data rearrangement unit 400, as shown in FIGS. 24A to 24I, the address generation unit 430 generates the representative point coordinates (btrx₀, btry₀) to (btrx₈, btry₈) and the valid flags btrvld₀ to btrvld₈ for the color data cwd₀ to cwd₈ described above. The representative point coordinates (btrx₀, btry₀) to (btrx₈, btry₈) and the valid flags btrvld₀ to btrvld₈ are stored in the address ADRW₂ and output to the distributor 300.
[0133]
Then, based on the representative point coordinates (btrx₀, btry₀) to (btrx₈, btry₈) and the valid flags btrvld₀ to btrvld₈, the color data pix₀ to pix₃₈ included in the color data cwd₀ to cwd₈ shown in FIGS. 24A to 24I is written into the rectangular storage area 700 in the DRAM 147 in units of 16 pixels, namely 8 pixels (X direction) × 2 pixels (Y direction).
As a result, the color data pix₀ to pix₃₈ read from the rectangular storage area 600 is transferred to the rectangular storage area 700.
The above is the description of the data rearrangement circuit 420 and the address generation unit 430.
[0134]
The controller 440 generates the control signals S440₁ to S440₆ based on the transfer mode instruction signal S800 and the pixel data length instruction signal S801 input from the texture engine circuit 143.
The controller 440 outputs the control signals S440₁ to S440₄, which control the FIFO circuits 410₁ to 410₃ and the latch circuit 410₄.
The controller 440 also outputs the control signal S440₅, which controls the address generation unit 430.
In addition, the controller 440 outputs the control signal S440₆, which controls the data rearrangement circuit 420.
[0135]
The controller 440 also controls the input to the FIFO circuit 410₃, that is, the reading of the color data mc_0dtr to mc_7dtr from the DRAM 147, as follows.
When the controller 440 detects that half of the 16 stages of storage circuits in the FIFO circuit 410₃, each 64 bits wide, have become empty (hold no valid data), it outputs to the address generation unit 430 the control signal S440₅, which instructs continuous reading for the number of times corresponding to the pixel data length indicated by the pixel data length instruction signal S801.
Here, 256-bit color data mc_0dtr to mc_7dtr for 8 pixels is read from the DRAM 147 by one reading.
[0136]
Specifically, the controller 440 outputs to the address generation unit 430 the control signal S440₅ indicating the number of read instructions shown in FIG. 26, according to the pixel data length indicated by the pixel data length instruction signal S801.
That is, when the data length of one pixel is 32 bits, 16 bits, 8 bits, or 4 bits, the controller 440 outputs to the address generation unit 430 the control signal S440₅ indicating 2, 4, 8, or 16 read instructions, respectively.
[0137]
When the control signal S440₅ indicating this number of read instructions is output, the color data mc_0dtr to mc_7dtr read from the DRAM 147 can be stored in the 8 empty stages of storage circuits in the FIFO circuit 410₃. At this time, the 8 stages of storage circuits in the FIFO circuit 410₃ store 8 × 64 bits of color data.
[0138]
That is, when the pixel data length is 32 bits, the color data mc_0dtr to mc_7dtr read in one read operation contains 32 × 8 bits of color data. Reading twice in succession therefore yields 32 × 8 × 2 (= 8 × 64) bits.
When the pixel data length is 16 bits, the color data mc_0dtr to mc_7dtr read in one read operation contains 16 × 8 bits of color data. Reading four times in succession therefore yields 16 × 8 × 4 (= 8 × 64) bits.
When the pixel data length is 8 bits, the color data mc_0dtr to mc_7dtr read in one read operation contains 8 × 8 bits of color data. Reading eight times in succession therefore yields 8 × 8 × 8 (= 8 × 64) bits.
When the pixel data length is 4 bits, the color data mc_0dtr to mc_7dtr read in one read operation contains 4 × 8 bits of color data. Reading 16 times in succession therefore yields 4 × 8 × 16 (= 8 × 64) bits.
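The arithmetic above can be checked with a short sketch. This is illustrative Python, not part of the described circuit; the function name reads_per_refill and the constant names are assumptions introduced here, while the numeric values (64-bit stages, 8 stages per refill, 8 pixels per read) are taken from the description.

```python
# Illustrative sketch only: names are hypothetical, values come from the text.
FIFO_STAGE_BITS = 64   # width of one FIFO storage stage
REFILL_STAGES = 8      # half of the 16 stages, refilled at once
PIXELS_PER_READ = 8    # one DRAM read returns color data for 8 pixels


def reads_per_refill(pixel_bits: int) -> int:
    """Return how many consecutive DRAM reads fill 8 x 64 bits of FIFO."""
    if pixel_bits not in (4, 8, 16, 32):
        raise ValueError("pixel data length must be 4, 8, 16, or 32 bits")
    refill_bits = REFILL_STAGES * FIFO_STAGE_BITS      # 8 x 64 = 512 bits
    bits_per_read = pixel_bits * PIXELS_PER_READ       # valid bits per read
    return refill_bits // bits_per_read


if __name__ == "__main__":
    for bits in (32, 16, 8, 4):
        print(bits, "bits/pixel:", reads_per_refill(bits), "reads")
```

For every supported pixel data length the product of bits per read and number of reads is 512 bits, which is why a burst always fills the empty half of the FIFO exactly.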
[0139]
As described above, when half of the 16 stages of storage circuits in the FIFO circuit 410₃, each 64 bits wide, have become empty, the controller 440 causes reads from the DRAM 147 to be performed continuously for the number of times corresponding to the pixel data length instruction signal S801. This reduces the load that controlling the FIFO circuit 410₃ places on the controller 440. Other processing performed by the controller 440 can therefore be kept from waiting on the control of the FIFO circuit 410₃.
That is, in a general conventional FIFO circuit that is not controlled as described above, input control is performed so that, for example, reading from the memory continues while any storage circuit is empty and stops when all storage circuits are full. With such control, however, when reading stops, only part of the color data for 8 pixels obtained by one read may have been written to the FIFO circuit. The same 8 pixels of color data must then be read again, the color data of the pixels not yet written to the FIFO circuit must be identified among them, and only that color data must be written to the FIFO circuit, which complicates the control.
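The half-empty refill policy can be modeled in software as follows. This is a minimal behavioral sketch under stated assumptions, not the circuit of the embodiment: the class name HalfEmptyRefillFifo, the read_dram callback, and the representation of a stage as a tuple of pixel values are all hypothetical. It only illustrates that a burst sized by the pixel data length always fills exactly 8 stages, so no read is ever written only in part.

```python
from collections import deque


class HalfEmptyRefillFifo:
    """Behavioral model (hypothetical) of a 16-stage, 64-bit-wide FIFO."""

    STAGES = 16
    STAGE_BITS = 64
    PIXELS_PER_READ = 8

    def __init__(self, pixel_bits):
        if pixel_bits not in (4, 8, 16, 32):
            raise ValueError("unsupported pixel data length")
        self.pixel_bits = pixel_bits
        self.stages = deque()
        half_bits = (self.STAGES // 2) * self.STAGE_BITS
        # Number of consecutive reads that fill half of the FIFO.
        self.burst_reads = half_bits // (pixel_bits * self.PIXELS_PER_READ)

    def needs_refill(self):
        """True when at least half of the stages are empty."""
        return self.STAGES - len(self.stages) >= self.STAGES // 2

    def refill(self, read_dram):
        """Issue one burst if half the stages are empty; return reads issued."""
        if not self.needs_refill():
            return 0
        pixels = []
        for _ in range(self.burst_reads):
            pixels.extend(read_dram())  # each call returns 8 pixel values
        per_stage = self.STAGE_BITS // self.pixel_bits
        for i in range(0, len(pixels), per_stage):
            self.stages.append(tuple(pixels[i:i + per_stage]))
        return self.burst_reads

    def pop(self):
        """Remove and return the oldest stage."""
        return self.stages.popleft()
```

Because the refill decision is made once per burst rather than once per read, the controlling logic in this model never has to track which pixels of a read were left unwritten.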
[0140]
CRT control circuit 145
The CRT control circuit 145 generates a display address for display on a CRT (not shown) in synchronization with the applied horizontal and vertical synchronization signals, and outputs a request to read display data from the display buffer 147b to the memory I / F circuit 144. In response to this request, the memory I / F circuit 144 reads the display data from the display buffer 147b in fixed-size chunks. The CRT control circuit 145 includes a FIFO circuit that stores the display data read from the display buffer 147b, and outputs RGB index values to the RAMDAC circuit 146 at regular time intervals.
[0141]
RAMDAC circuit 146
The RAMDAC circuit 146 stores R, G, B data corresponding to each index value, and transfers the digital R, G, B data corresponding to the RGB index value input from the CRT control circuit 145 to a D / A converter (Digital / Analog Converter), not shown, to generate analog R, G, B data. The RAMDAC circuit 146 outputs the generated R, G, B data to the CRT.
[0142]
Hereinafter, the overall operation of the above-described three-dimensional computer graphic system 10 will be described.
In the three-dimensional computer graphic system 10, data such as graphics drawing data is supplied to the rendering circuit 14 via the main bus 15 from the main memory 12 of the main processor 11 or from the I / O interface circuit 13, which receives graphics data from the outside.
If necessary, the data such as graphics drawing data is subjected to geometry processing such as coordinate conversion, clip processing, and lighting processing in the main processor 11 or the like.
The graphics data that has undergone the geometry processing becomes the polygon rendering data S11, which consists of the vertex coordinates x, y, z of each of the three vertices of a triangle, the luminance values R, G, B, and the texture coordinates s, t, q corresponding to the pixels to be drawn.
[0143]
The polygon rendering data S11 is input to the DDA setup circuit 141 of the rendering circuit 14.
In the DDA setup circuit 141, variation data S141 indicating the difference between the sides of the triangle and the horizontal direction is generated based on the polygon rendering data S11. Specifically, using the value at the start point, the value at the end point, and the distance between them, the variation, that is, the change in the value when moving by a unit length, is calculated and output to the triangle DDA circuit 142 as the variation data S141.
[0144]
In the triangle DDA circuit 142, linearly interpolated (z, R, G, B, s, t, q) data for each pixel inside the triangle is calculated using the variation data S141.
Then, the calculated (z, R, G, B, s, t, q) data and the (x, y) data of each vertex of the triangle are output from the triangle DDA circuit 142 to the texture engine circuit 143 as the DDA data S142.
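The setup and interpolation steps can be illustrated for a single value along a single span. The following is a simplified Python sketch, not the DDA circuits themselves; the function names variation and interpolate_span, the use of floating point, and the restriction to one dimension are assumptions made for illustration.

```python
def variation(start: float, end: float, distance: float) -> float:
    """Change in the value per unit length, from start, end, and distance."""
    if distance == 0:
        raise ValueError("distance must be non-zero")
    return (end - start) / distance


def interpolate_span(start: float, end: float, n_pixels: int) -> list:
    """Linearly interpolate a value over n_pixels by accumulating the variation."""
    if n_pixels < 1:
        raise ValueError("n_pixels must be at least 1")
    if n_pixels == 1:
        return [start]
    step = variation(start, end, n_pixels - 1)
    values = []
    value = start
    for _ in range(n_pixels):
        values.append(value)
        value += step   # DDA: add the per-pixel variation at each step
    return values
```

In the described system the same accumulation would be applied to each of z, R, G, B, s, t, and q for every pixel inside the triangle.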
[0145]
In the texture engine circuit 143, for the (s, t, q) data indicated by the DDA data S142, an operation for dividing the s data by the q data and an operation for dividing the t data by the q data are performed. The division results “s / q” and “t / q” are multiplied by the texture sizes USIZE and VSIZE, respectively, to generate texture coordinate data (u, v).
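The division and scaling described above can be written as a short sketch. This is illustrative Python under stated assumptions: the function name texture_coords and the use of floating-point arithmetic are hypothetical, and the actual circuit's number format is not specified here.

```python
def texture_coords(s: float, t: float, q: float, usize: int, vsize: int):
    """Return (u, v) = (s / q * USIZE, t / q * VSIZE)."""
    if q == 0:
        raise ZeroDivisionError("homogeneous term q must be non-zero")
    u = (s / q) * usize
    v = (t / q) * vsize
    return u, v


if __name__ == "__main__":
    # Hypothetical inputs chosen only to show the calculation.
    print(texture_coords(0.5, 0.25, 2.0, 256, 128))
```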
[0146]
Next, the address ADRR₁ is output from the texture engine circuit 143 to the memory I / F circuit 144, and the (R, G, B) data S148 stored in the DRAM 147 (SRAM 148) is read via the memory I / F circuit 144.
[0147]
Next, in the texture engine circuit 143, the (R, G, B) data of the read (R, G, B) data S148 is multiplied by the (R, G, B) data included in the DDA data S142 from the preceding triangle DDA circuit 142, and the result is generated as the color data S143.
The color data S143 is output from the texture engine circuit 143 to the memory I / F circuit 144.
[0148]
In the case of full color, the (R, G, B) data from the texture buffer 147a can be used directly. In the case of index color, however, the data of a color index table created in advance is transferred from the texture CLUT (Color Look Up Table) buffer 147d to a temporary storage buffer composed of SRAM or the like, and the actual R, G, B colors are obtained from the color index using the CLUT in the temporary storage buffer.
When the CLUT is configured in SRAM, inputting the color index as the SRAM address yields the actual R, G, B color as the output.
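The index color lookup amounts to using the color index as an address into a table. The sketch below is a hypothetical Python model: the function name resolve_texel, the representation of the CLUT as a list of (R, G, B) tuples, and the sample table contents are assumptions for illustration only.

```python
def resolve_texel(texel, clut=None):
    """Return an (R, G, B) tuple for a texel.

    With no CLUT the texel is treated as full color and returned as is.
    With a CLUT the texel is a color index used as the table address.
    """
    if clut is None:
        return texel
    return clut[texel]


if __name__ == "__main__":
    # Hypothetical 4-entry table standing in for the SRAM contents.
    clut = [(0, 0, 0), (255, 0, 0), (0, 255, 0), (0, 0, 255)]
    print(resolve_texel(2, clut))            # index color path
    print(resolve_texel((10, 20, 30)))       # full color path
```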
[0149]
Then, the memory I / F circuit 144 compares the z data corresponding to the color data S143 input from the texture engine circuit 143 with the z data stored in the z buffer 147c, and determines whether the image drawn by the input color data S143 is positioned nearer (closer to the viewpoint) than the image previously written to the display buffer 147b.
As a result of the determination, if it is located on the near side, the z data stored in the z buffer 147c is updated with the z data corresponding to the image data S143.
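The depth comparison and update can be sketched as follows. This is an illustrative Python model, not the memory I / F circuit: the function name z_test_and_update, the nested-list z buffer, and in particular the convention that a smaller z value is nearer to the viewpoint are assumptions, since the text does not state the z convention.

```python
def z_test_and_update(z_buffer, x, y, z_new):
    """Update the stored depth if the new fragment is nearer.

    Assumes smaller z means nearer to the viewpoint (not stated in the text).
    Returns True when the caller should also write the color data.
    """
    if z_new < z_buffer[y][x]:
        z_buffer[y][x] = z_new
        return True
    return False


if __name__ == "__main__":
    zbuf = [[float("inf")] * 4 for _ in range(2)]
    print(z_test_and_update(zbuf, 1, 0, 5.0))  # nearer than the initial depth
    print(z_test_and_update(zbuf, 1, 0, 7.0))  # farther, buffer unchanged
```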
[0150]
Next, in the memory I / F circuit 144, (R, G, B) data is written to the display buffer 147b.
These data to be written (including updates) are supplied to the memory controllers 350, 360, 370, and 380 via the data rearrangement unit 400, the distributor 300, and the address decoders 310, 320, 330, and 340 shown in FIG. 2, and are written in parallel to the predetermined memory by the memory controllers 350, 360, 370, and 380 via the write system wiring groups 401W, 402W, 411W, 412W, 421W, 422W, 431W, and 432W, respectively.
[0151]
In the memory I / F circuit 144, the memory block storing the texture corresponding to the texture address of the pixel to be drawn is calculated from the texture address, and the texture data is read out by issuing a read request only to that memory block.
In this case, in the memory block that does not hold the corresponding texture data, access for texture reading is not performed, so that it is possible to provide more access time for drawing.
[0152]
At this time, in the process of writing the image data S143 from the texture engine circuit 143 to the DRAM 147, the memory I / F circuit 144 converts the data into the color data cwd having a 256-bit width, for example as described above.
[0153]
Similarly, in drawing, the color data is read from the corresponding address of the memory block storing the color data for the pixel address to be drawn in order to perform a modify write (Modify Write), and after modification it is written back to the same address.
[0154]
When hidden surface processing is performed, the depth data is likewise read from the corresponding address of the memory block storing the depth data for the pixel address to be drawn in order to perform a modify write, and, if necessary, after modification it is written back to the same address.
[0155]
In such data exchange with the DRAM 147 by the memory I / F circuit 144, drawing performance can be improved by performing the processes up to that point in parallel.
In particular, a plurality of pixels are calculated simultaneously either by providing the triangle DDA circuit 142 and the texture engine circuit 143 in a parallel execution form within the same circuit (spatial parallelism) or by finely inserting pipeline stages (temporal parallelism) to partially raise the operating frequency.
[0156]
The color data is arranged so that adjacent portions in the display area are different DRAM modules under the control of the memory I / F circuit 144.
As a result, when drawing a plane such as a triangle, the plane is processed simultaneously. For this reason, the operation probability of each DRAM module is very high.
[0157]
When an image is displayed on a CRT (not shown), the CRT control circuit 145 generates a display address in synchronization with the given horizontal and vertical synchronization frequencies and issues a display data transfer request to the memory I / F circuit 144.
In response to the request, the memory I / F circuit 144 transfers the display data to the CRT control circuit 145 in fixed-size chunks.
The CRT control circuit 145 stores the display data in a display FIFO (First In First Out) or the like (not shown) and transfers the RGB index values to the RAMDAC 146 at regular intervals.
[0158]
When the main processor 11 issues a request to the memory I / F circuit 144, via the host interface circuit 149, to read data stored in the DRAM 147 or the SRAM 148, the read address ADRR₂ is input to the address converter 391.
At this time, the address converter 391 checks whether the idle signals S350, S360, S370, and S380 from the memory controllers 350, 360, 370, and 380 are all input in the active state. When the idle signals S350, S360, S370, and S380 are all active, the address converter 391 responds by outputting the read address and the read request signal S391 to each of the memory controllers 350, 360, 370, and 380.
[0159]
Upon receipt of the read address and the read request signal S391, the memory controllers 350, 360, 370, and 380 read out the texture data, (R, G, B) data, z data, and texture color lookup table data in parallel, in units of 8 pixels or 16 pixels, through the read system wiring groups 401R, 402R, 411R, 412R, 421R, 422R, 431R, and 432R. The data is then input to the data calculation unit 392 through the read system wiring groups 351, 361, 371, and 381 and the wiring group 440.
Then, after the data calculation unit 392 performs a predetermined calculation process, the data rearrangement unit 400 converts the 256-bit-wide color data mc_0dtr to mc_7dtr into the 64-bit-wide transfer data hrd, which is output to the request source, for example the main processor 11, via the host interface circuit 149.
[0160]
In the RAMDAC 146, RGB values for RGB indexes are stored in the RAM, and the RGB values for the index values are transferred to a D / A converter (not shown).
Then, the RGB signal converted into an analog signal by the D / A converter is transferred to the CRT.
[0161]
As described above, according to the present embodiment, even if the data formats used by the host interface circuit 149 and the texture engine circuit 143 shown in FIG. 1 differ from that of the DRAM 147, the data rearrangement unit 400 rearranges the data, so that data transfer between the texture engine circuit 143 or the host interface circuit 149 and the DRAM 147 can be performed accurately.
In addition, according to the present embodiment, even when the address patterns for accessing the DRAM 147 are restricted, the data rearrangement unit 400 rearranges the data in consideration of the restriction, so that color data can be written and read accurately.
[0162]
Further, according to the present embodiment, in the data rearrangement unit 400, when half of the 16 stages of storage circuits in the FIFO circuit 410₃, each 64 bits wide, have become empty, reads from the DRAM 147 are performed continuously for the number of times corresponding to the pixel data length instruction signal S801. This reduces the load that controlling the FIFO circuit 410₃ places on the controller 440. Other processing performed by the controller 440 can therefore be kept from waiting on the control of the FIFO circuit 410₃.
[0163]
Further, according to the present embodiment, in the memory I / F circuit 144 that accesses (writes to or reads from) the DRAM 147 and the SRAM 148, the write path and the read path are configured as separate paths. In writing, the write address ADRW and the color data cwd are processed by the write system circuits, namely the data rearrangement unit 400, the distributor 300, the address converters 310, 320, 330, and 340, and the memory controllers 350, 360, 370, and 380, and are written to the DRAM 147. In reading, the data is processed by the read system circuits, namely the data rearrangement unit 400, the read controller 390, and the memory controllers 350, 360, 370, and 380, and is read from the DRAM 147 or the SRAM 148. Consequently, even when a read interrupts writing, the paths can be switched at an earlier timing, which has the advantage of improving read performance.
[0164]
Further, because the memory I / F circuit 144 accesses the DRAM 147 based on predetermined interleaved addressing, for example in units of 16 pixels or in units of 8 pixels, and restricts the area to be accessed, an image processing apparatus in which no violation occurs can be realized.
[0165]
Furthermore, according to the present embodiment, the DRAM 147 built into the semiconductor chip is configured to store the display data and the texture data required by at least one graphic element. The built-in DRAM can therefore be used effectively, and an image processing apparatus that achieves both high-speed processing and low power consumption can be realized.
A single memory system can also be realized, so that all processing can be completed within the chip. This represents a major paradigm shift in architecture.
In addition, because the memory can be used effectively, processing can be performed with the internal DRAM alone, and because the memory is internal, the large bandwidth between the memory and the drawing system can be fully exploited. Special processing can also be incorporated into the DRAM.
[0166]
Furthermore, because display elements at adjacent addresses in the display address space are arranged in different DRAM blocks, bit lines can be used more effectively. When there are many accesses to a relatively fixed display area, as in graphics drawing, the probability that the modules can process simultaneously increases, and drawing performance can be improved.
[0167]
In addition, because the DRAM is built into the chip, the high-speed interface is completed entirely inside the chip. It is therefore no longer necessary to drive I / O buffers with a large added capacitance or inter-chip wiring capacitance, and power consumption is lower than when the DRAM is not built in.
A mechanism that accomplishes everything within a single chip using these technologies will therefore be an indispensable technical element for familiar digital devices such as future portable information terminals.
[0168]
The present invention is not limited to the embodiment described above.
For example, the access pattern restrictions on the DRAM 147 are not limited to those described above, and the present invention can be applied even when there are other access pattern restrictions.
Further, the configuration of the data rearrangement unit 400 shown in FIG. 8 and the configuration of the data rearrangement circuit 420 shown in FIG. 9 are not limited to these.
[0169]
Further, the number of stages of the FIFO circuits 410₁ to 410₃ is arbitrary.
The timing at which the controller 440 issues a read request after the FIFO circuit 410₃ begins storing data may also be other than the point at which half of the storage circuits have become empty.
[0170]
Further, in the above-described three-dimensional computer graphics system 10 shown in FIG. 1, the configuration using the SRAM 148 is illustrated, but a configuration without the SRAM 148 may be used.
[0171]
Further, in the three-dimensional computer graphics system 10 shown in FIG. 1, the case where the geometry processing for generating polygon rendering data is performed by the main processor 11 is exemplified, but the configuration may be such that it is performed by the rendering circuit 14.
[0172]
[Effects of the invention]
As described above, according to the image processing apparatus of the present invention, data transmission / reception between the image processing circuit and the storage circuit can be performed accurately even when there is a restriction on address designation when accessing the storage circuit.
Further, according to the image processing apparatus of the present invention, even when there is a restriction on address designation when accessing the storage circuit, data transfer within the storage area of the storage circuit can be performed accurately.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of a three-dimensional computer graphics system according to the present invention.
FIG. 2 is a block diagram showing a specific configuration example of a DRAM, SRAM, and a memory I / F circuit for accessing the DRAM and SRAM in the rendering circuit according to the present invention.
FIG. 3 is a schematic diagram showing a configuration example of a DRAM buffer according to the present invention.
FIG. 4 is a diagram for explaining color data included in texture data and accessed simultaneously.
FIG. 5 is a diagram for explaining unit blocks constituting texture data.
FIG. 6 is a diagram for explaining an address space of a texture buffer.
FIG. 7 is a diagram for explaining image data processing of a distributor in the memory I / F circuit according to the present invention.
FIG. 8 is a configuration diagram of the data rearrangement unit 400 shown in FIG. 2;
FIG. 9 is a configuration diagram of the data rearrangement circuit 420 shown in FIG. 8.
FIG. 10 is a diagram showing a storage area in the DRAM shown in FIG. 1.
FIG. 11 is a diagram for explaining an operation in a “Host to Local” transfer mode when the data length of one pixel in the data rearrangement unit shown in FIG. 2 is 32 bits.
FIG. 12 is a diagram for explaining an operation in a “Host to Local” transfer mode when the data length of one pixel in the data rearrangement unit shown in FIG. 2 is 32 bits.
FIG. 13 is a diagram for explaining an operation in a “Host to Local” transfer mode when the data length of one pixel in the data rearrangement unit shown in FIG. 2 is 32 bits.
FIG. 14 is a diagram for explaining an operation of writing the color data shown in FIG. 13 to the DRAM shown in FIG. 1.
FIG. 15 is a diagram for explaining an operation of writing the color data shown in FIG. 13 to the DRAM shown in FIG. 1.
FIG. 16 is a diagram for explaining an operation in a “Host to Local” transfer mode when the data length of one pixel in the data rearrangement unit shown in FIG. 2 is 16 bits.
FIG. 17 is a diagram for explaining an operation in a “Host to Local” transfer mode when the data length of one pixel in the data rearrangement unit shown in FIG. 2 is 16 bits.
FIG. 18 is a diagram for explaining an operation in a “Host to Local” transfer mode when the data length of one pixel in the data rearrangement unit shown in FIG. 2 is 16 bits.
FIG. 19 is a diagram for explaining an operation in a “Local to Host” transfer mode in the data rearrangement unit shown in FIG. 2;
FIG. 20 is a diagram for explaining the operation in the “Local to Host” transfer mode in the data rearrangement unit shown in FIG. 2;
FIG. 21 is a diagram for explaining the operation in the “Local to Host” transfer mode in the data rearrangement unit shown in FIG. 2;
FIG. 22 is a diagram for explaining an operation in a “Local to Host” transfer mode in the data rearrangement unit shown in FIG. 2;
FIG. 23 is a diagram for explaining an operation in a “Local to Local” transfer mode in the data rearrangement unit shown in FIG. 2;
FIG. 24 is a diagram for explaining the operation in the “Local to Local” transfer mode in the data rearrangement unit shown in FIG. 2;
FIG. 25 is a diagram for explaining the operation in the “Local to Local” transfer mode in the data rearrangement unit shown in FIG. 2;
FIG. 26 is a diagram for explaining the relationship between the number of read instructions indicated in the read request for the control signal output from the controller shown in FIG. 8 to the address generation unit and the data length of one pixel.
FIG. 27 is a diagram for explaining color data read from a DRAM by one read operation when the data length of one pixel is 32 bits, 16 bits, 8 bits, and 4 bits;
FIG. 28 is a system configuration diagram showing a basic concept of a three-dimensional computer graphics system.
[Explanation of symbols]
DESCRIPTION OF SYMBOLS 10 ... Three-dimensional computer graphics system, 11 ... Main processor, 12 ... Main memory, 13 ... I / O interface circuit, 14 ... Rendering circuit, 141 ... DDA setup circuit, 142 ... Triangle DDA circuit, 143 ... Texture engine circuit, 144 ... Memory I / F circuit, 145 ... CRT controller circuit, 146 ... RAMDAC circuit, 147 ... DRAM, 147a ... Texture buffer, 147b ... Display buffer, 147c ... z buffer, 147d ... Texture CLUT buffer, 148 ... SRAM, 200, 210, 220, 230 ... Memory module, 300 ... Distributor, 310, 320, 330, 340 ... Address decoder, 350, 360, 370, 380 ... Memory controller, 390 ... Read controller, 391 ... Address decoder, 392 ... Data processing unit, 400 ... Data rearrangement unit, 410₁ to 410₃ ... FIFO circuit, 410₄ ... Latch circuit, 420 ... Data rearrangement circuit, 430 ... Address generation unit, 440 ... Controller

Claims

An image processing circuit that performs rendering processing using polygon rendering data including three-dimensional coordinates (x, y, z), R (red), G (green), B (blue) data, texture homogeneous coordinates (s, t), and a homogeneous term q for the vertices of a unit graphic, and generates image data including a plurality of pixels;
A plurality of storage circuits to which the image data composed of the plurality of pixels generated by the rendering process is written in units of blocks of a predetermined size, wherein the writing is performed in units of a first rectangular area with a representative point as a reference, and there is a restriction that, for each of first and second mutually orthogonal directions of the first rectangular area, only addresses that are multiples of a first number can be specified;
A data rearrangement circuit that rearranges the image data input from the image processing circuit in units of pixel data, based on the restriction, so that, when the address of the representative point of the block including the image data composed of the plurality of pixels generated by the rendering process and input from the image processing circuit is specified based on the restriction, the image data is written in the first rectangular area of the corresponding storage circuit among the plurality of storage circuits;
And a writing circuit,
Wherein, in a first transfer mode in which the image data composed of the plurality of pixels generated by the rendering process and input from the image processing circuit is transferred to and stored in the plurality of storage circuits, the data rearrangement circuit:
Places the rendered first image data of a first bit width input from the image processing circuit in second image data of a second bit width;
And rearranges the second image data by repeating shift processing, selection processing, and holding processing based on a control signal so that, when the address of the representative point of the block including the second image data is designated based on the restriction, the first image data is written in the rectangular storage area of the corresponding storage circuit among the plurality of storage circuits,
And the writing circuit distributes and writes the blocks including the rearranged second image data to the corresponding storage circuits among the plurality of storage circuits;
Image processing device.

In each of the storage circuits, the address of the representative point is designated at a predetermined address interval for each of two orthogonal directions in the two-dimensional address space.
The image processing apparatus according to claim 1.

The block is rectangular;
The second bit width of the second image data is the same as the length of one side of the block;
When the address in the direction of the one side of the block in the storage area of the storage circuit is the X-direction address and the address in the direction orthogonal to the X direction is the Y-direction address, the data rearrangement circuit arranges, among the plurality of pieces of pixel data included in the same first image data, pixel data to be written at different Y-direction addresses in the predetermined storage area in mutually different second image data.
The image processing apparatus according to claim 1.

The data rearrangement circuit generates validity instruction data for specifying valid pixel data in the block, and includes the validity instruction data in the generated address.
The writing circuit distributes and writes the block including the image data to the storage circuits based on the pixel data in the block that is indicated as valid by the validity instruction data included in the address.
The image processing apparatus according to claim 1 .

The image processing apparatus according to claim 1, wherein the storage circuits and the data rearrangement circuit are mixedly mounted in one semiconductor chip.

A plurality of storage circuits in which polygon rendering data including three-dimensional coordinates (x, y, z), R (red), G (green), B (blue) data, texture homogeneous coordinates (s, t), and a homogeneous term q for the vertices of a unit graphic is distributed and stored, and in which image data composed of a plurality of pixel data generated by the rendering process is written to and read from a storage area in units of blocks of a predetermined size, wherein the writing is performed in units of a first rectangular area with a representative point as a reference and has a first restriction that, for each of first and second mutually orthogonal directions of the first rectangular area, only addresses that are multiples of a first number can be specified, and the reading is performed in units of a second rectangular area with a representative point as a reference and has a second restriction that, of first and second mutually orthogonal directions of the second rectangular area, any address can be specified for the first direction but only addresses that are multiples of a second number can be specified for the second direction;
An image processing circuit for inputting first image data having a first bit width and performing a rendering process on the first image data;
A readout circuit for designating a representative point based on the second restriction and reading out the image data from the plurality of storage circuits in the block unit;
And a data rearrangement circuit that outputs to the image processing circuit the first image data of the first bit width generated by rearranging, in units of pixel data, the second image data of the second bit width obtained by the reading,
Wherein, in a second transfer mode in which image data is transferred from the storage circuits to the image processing circuit, the data rearrangement circuit:
Generates an address corresponding to the address input from the image processing circuit and outputs it to the readout circuit so that the corresponding image data is read from the plurality of storage circuits,
And converts the read image data into image data of the first bit width defined by the image processing circuit and outputs it to the image processing circuit;
Image processing device.
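
The two addressing restrictions recited in the claim above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the values of FIRST_NUMBER and SECOND_NUMBER, the choice of x as the first direction and y as the second, and all function names are hypothetical and are not taken from the claims.

```python
# Hypothetical values: the claim only speaks of "a first number" and "a second number".
FIRST_NUMBER = 8   # assumed write-address granularity in both directions
SECOND_NUMBER = 2  # assumed read-address granularity in the second direction


def is_valid_write_address(x, y):
    """First restriction: both coordinates must be multiples of FIRST_NUMBER."""
    return x % FIRST_NUMBER == 0 and y % FIRST_NUMBER == 0


def is_valid_read_address(x, y):
    """Second restriction: x is unrestricted, y must be a multiple of SECOND_NUMBER."""
    return y % SECOND_NUMBER == 0


def snap_write_address(x, y):
    """Round a pixel position down to the nearest legal write representative point."""
    return (x - x % FIRST_NUMBER, y - y % FIRST_NUMBER)


def snap_read_address(x, y):
    """Round a pixel position down to the nearest legal read representative point."""
    return (x, y - y % SECOND_NUMBER)
```

Under these assumed values, a pixel at (13, 21) belongs to the write block whose representative point is (8, 16), while a read covering it can start at (13, 20).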

In each of the storage circuits, the address of the representative point is designated at a predetermined address interval for each of two orthogonal directions in the two-dimensional address space.
The image processing apparatus according to claim 6.

The image processing apparatus according to claim 6, wherein the storage circuit, the readout circuit, and the data rearrangement circuit are mounted together in one semiconductor chip.

A plurality of storage circuits in which polygon rendering data, including three-dimensional coordinates (x, y, z), R (red), G (green), B (blue) data, texture homogeneous coordinates (s, t), and a homogeneous term q for the vertices of a unit graphic, is distributed, to which image data generated by the rendering process is written in a distributed manner in units of blocks of a predetermined size, and from which the image data so written is read, wherein the writing is performed in units of a first rectangular area with a representative point as a reference, under a first restriction that only addresses that are multiples of a first number can be specified in each of the mutually orthogonal first and second directions of the first rectangular area, and the reading is performed in units of a second rectangular area with a representative point as a reference, under a second restriction that any address can be specified in the first of the mutually orthogonal directions of the second rectangular area but only addresses that are multiples of a second number can be specified in the second direction,
A readout circuit for designating a representative point based on the second restriction and reading out the image data from the plurality of storage circuits in units of the block;
A data rearrangement circuit that performs processing according to either a first transfer mode in which image data is transferred from the image processing circuit to the storage circuits, or a second transfer mode in which image data is transferred from the storage circuits to the image processing circuit, and that rearranges the read image data in units of pixel data so that, when the address of the representative point of the block including the read image data is designated based on the first restriction, the image data is written in the corresponding first rectangular area in the corresponding storage circuit;
A writing circuit that designates a representative point based on the first restriction and writes the block including the rearranged image data to the storage circuit;
In the first transfer mode, in which the image data is transferred from the image processing circuit to the plurality of storage circuits and stored, the rearrangement circuit
Places the rendered first image data of the first bit width, input from the image processing circuit, in the second image data of the second bit width, and
Rearranges the second image data by repeating shift processing, selection processing, and holding processing based on a control signal so that, when the address of the representative point of the block including the second image data is designated based on the first restriction, the first image data is written in the corresponding rectangular area of the corresponding storage circuit among the plurality of storage circuits,
The writing circuit distributes and writes the blocks including the rearranged second image data to corresponding storage circuits of the plurality of storage circuits,
In the second transfer mode, in which image data is transferred from the storage circuits to the image processing circuit, the rearrangement circuit
Generates an address corresponding to the address input from the image processing circuit and outputs it to the readout circuit so that the corresponding image data is read out from the plurality of storage circuits, and
Arranges the read image data into image data of the first bit width defined by the image processing circuit and outputs it to the image processing circuit;
Image processing device.

The data rearrangement circuit generates validity instruction data for specifying valid pixel data in the block, and includes the validity instruction data in the generated address.
The writing circuit distributes and writes the block including the image data to the storage circuits for the pixel data indicated as valid within the block, based on the validity instruction data included in the address.
The image processing apparatus according to claim 9.
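
The interplay between the aligned write address and the validity instruction data can be sketched as follows. This is a deliberately simplified, one-dimensional software model and not the claimed circuit: BLOCK_WIDTH, the function name, and returning the validity mask alongside the address (rather than embedding it in the address) are all assumptions made for illustration.

```python
BLOCK_WIDTH = 8  # assumed number of pixels per block along one direction


def rearrange_for_write(x_start, pixels):
    """Split a pixel run starting at x_start into aligned blocks.

    Returns a list of (representative_x, block, valid_mask) tuples. Each block
    has BLOCK_WIDTH slots, and bit i of valid_mask is set when slot i holds
    real pixel data that should be written.
    """
    blocks = []
    offset = x_start % BLOCK_WIDTH   # shift amount inside the first block
    rep_x = x_start - offset         # aligned representative point
    remaining = list(pixels)
    while remaining:
        block = [0] * BLOCK_WIDTH
        mask = 0
        count = min(BLOCK_WIDTH - offset, len(remaining))
        for i in range(count):
            block[offset + i] = remaining[i]  # shift the pixel into its slot
            mask |= 1 << (offset + i)         # mark the slot as valid
        blocks.append((rep_x, block, mask))
        remaining = remaining[count:]
        rep_x += BLOCK_WIDTH                  # next aligned block
        offset = 0
    return blocks
```

For example, `rearrange_for_write(6, [1, 2, 3, 4, 5])` yields two blocks: one at representative point 0 with only slots 6 and 7 valid (mask `0b11000000`), and one at representative point 8 with slots 0 to 2 valid (mask `0b00000111`). A writer honoring the mask leaves the other pixels of each block untouched.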

In each of the storage circuits, the address of the representative point is designated at a predetermined address interval for each of two orthogonal directions in the two-dimensional address space.
The image processing apparatus according to claim 9 or 10.

In each of the storage circuits, the size of the block and the designation of the address of the representative point differ between the reading and the writing.
The image processing apparatus according to claim 9.

The image processing apparatus according to claim 9, wherein the storage circuit, the read circuit, the data rearrangement circuit, and the write circuit are mounted together in one semiconductor chip.

An image processing apparatus that performs rendering processing using polygon rendering data including three-dimensional coordinates (x, y, z), R (red), G (green), B (blue) data, texture homogeneous coordinates (s, t), and a homogeneous term q for the vertices of a unit graphic,
A plurality of storage circuits in which display data consisting of a plurality of pixel bits and texture data required by at least one graphic element are distributed and stored, and to which the image data generated by the rendering process is written in units of blocks of a predetermined size, wherein the writing is performed in units of a first rectangular area with a representative point as a reference, under a first restriction that only addresses that are multiples of a first number can be specified in each of the mutually orthogonal first and second directions of the first rectangular area, and the reading is performed in units of a second rectangular area with a representative point as a reference, under a second restriction that any address can be specified in the first of the mutually orthogonal directions of the second rectangular area but only addresses that are multiples of a second number can be specified in the second direction,
An interpolation data generation circuit that interpolates polygon rendering data at the vertices of the unit graphic and generates interpolation data of pixels located in the unit graphic;
A texture processing circuit that divides the texture homogeneous coordinates (s, t) included in the interpolation data by the homogeneous term q to generate “s / q” and “t / q”, reads texture data from the plurality of storage circuits using a texture address corresponding to the “s / q” and “t / q”, and pastes the read texture data on the surface of the graphic element to generate display data composed of a plurality of pixel bits,
A data rearrangement circuit that performs data rearrangement processing according to a first transfer mode for transferring image data from the texture processing circuit to the storage circuits, and that rearranges the display data in units of image data so that, when the address of the representative point of the block including the display data composed of a plurality of pixel bits input from the texture processing circuit is designated based on the first restriction, the display data is written in a distributed manner to the corresponding storage circuits among the plurality of storage circuits;
A writing circuit that distributes and writes the blocks including the rearranged display data to the plurality of storage circuits, and
In the first transfer mode, in which the image data is transferred from the image processing circuit to the plurality of storage circuits and stored, the rearrangement circuit
Places the rendered first image data of the first bit width, input from the image processing circuit, in the second image data of the second bit width, and
Rearranges the second image data by repeating shift processing, selection processing, and holding processing based on a control signal so that, when the address of the representative point of the block including the second image data is designated based on the first restriction, the first image data is written to a predetermined storage area of the plurality of storage circuits,
The writing circuit distributes and writes the blocks including the rearranged second image data to corresponding storage circuits of the plurality of storage circuits;
Image processing device.
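
The texture address generation recited for the texture processing circuit can be sketched in Python. The description states that the texture coordinates (u, v) are obtained by multiplying “s / q” and “t / q” by the texture sizes USIZE and VSIZE. Everything else here is an assumption for illustration: the function name, the truncation to integer texel coordinates, and the clamping of out-of-range addresses.

```python
def texture_address(s, t, q, usize, vsize):
    """Return integer texel coordinates (u, v) from homogeneous (s, t) and q."""
    if q == 0:
        raise ValueError("homogeneous term q must be non-zero")
    # u = (s / q) * USIZE and v = (t / q) * VSIZE, truncated to whole texels
    # (the truncation is an assumption, not stated in the claims).
    u = int((s / q) * usize)
    v = int((t / q) * vsize)
    # Clamp to the texture bounds. This is also an assumption: the claims do
    # not say how out-of-range addresses are handled.
    u = min(max(u, 0), usize - 1)
    v = min(max(v, 0), vsize - 1)
    return u, v
```

With s = 1.0, t = 0.5, q = 2.0 and a 256 x 128 texture, the sketch gives (u, v) = (128, 32).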

An image processing apparatus that performs rendering processing using polygon rendering data including three-dimensional coordinates (x, y, z), R (red), G (green), B (blue) data, texture homogeneous coordinates (s, t), and a homogeneous term q for the vertices of a unit graphic,
A plurality of storage circuits in which display data and texture data required by at least one graphic element are distributed and stored, and to which the image data generated by the rendering process is written in units of blocks of a predetermined size, wherein the writing is performed in units of a first rectangular area with a representative point as a reference, under a first restriction that only addresses that are multiples of a first number can be specified in each of the mutually orthogonal first and second directions of the first rectangular area, and the reading is performed in units of a second rectangular area with a representative point as a reference, under a second restriction that any address can be specified in the first of the mutually orthogonal directions of the second rectangular area but only addresses that are multiples of a second number can be specified in the second direction,
An interpolation data generation circuit that interpolates polygon rendering data at the vertices of the unit graphic and generates interpolation data of pixels located in the unit graphic;
A texture processing circuit that divides the texture homogeneous coordinates (s, t) included in the interpolation data by the homogeneous term q to generate “s / q” and “t / q”, and generates display data by pasting texture data, read from the plurality of storage circuits using a texture address corresponding to the “s / q” and “t / q”, on the surface of a graphic element;
A readout circuit for designating a representative point based on the second restriction and reading the texture data from the plurality of storage circuits in units of blocks;
An interface circuit;
A data rearrangement circuit that rearranges data according to a second transfer mode for transferring display data read from the storage circuits to the texture processing circuit, and that rearranges the read texture data according to the output format of the interface circuit and outputs it to the interface circuit,
In the second transfer mode, the rearrangement circuit generates an address corresponding to the address input from the image processing circuit, outputs it to the readout circuit so that the corresponding image data is read out from the plurality of storage circuits to the texture processing circuit via the interface circuit, arranges the read image data into image data of the first bit width specified by the image processing circuit, and outputs it to the image processing circuit,
Image processing device.
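
The interpolation data generation circuit recited in the claims above interpolates the vertex data of a unit graphic to obtain data for the pixels inside it. The sketch below shows one way to do this in software using barycentric weights. This is a stand-in chosen for clarity: the claims do not state which interpolation method the circuit uses, and the function names and the layout of the attribute tuples are hypothetical.

```python
def barycentric_weights(p, a, b, c):
    """Weights of 2-D point p with respect to the triangle (a, b, c)."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if det == 0:
        raise ValueError("degenerate triangle")
    w_a = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    w_b = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return w_a, w_b, 1.0 - w_a - w_b


def interpolate(p, verts, attrs):
    """Interpolate per-vertex attribute tuples (e.g. z, R, G, B, s, t, q) at pixel p.

    verts holds the three vertex positions and attrs the matching attribute
    tuples, one per vertex and all of the same length.
    """
    weights = barycentric_weights(p, *verts)
    return tuple(
        sum(w * attr[k] for w, attr in zip(weights, attrs))
        for k in range(len(attrs[0]))
    )
```

Interpolating s, t, and q this way and dividing afterwards, as the texture processing circuit does, is what keeps the texture mapping perspective-correct.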