taiqianguo/matrix_multiplier

matrix_multiplier

A matrix multiplier with two MAC units in parallel, and a sparse matrix-vector multiplier based on the CSR format.

The matrices A and B are instantiated as 8×8 matrices of 8-bit signed values; the result C is a matrix of 18-bit signed values. Here's the result of the one-channel MAC matrix_multiplier, based on a three-stage pipeline: memory access, MAC, write back. Address fetching is based on a counter; the code is in the file matrix_mux1.v. The performance is 514 clock cycles in total: 64×8 = 512 MAC calculations, plus 2 cycles of initial pipeline-stage delay.

image
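The 514-cycle figure can be checked with a small behavioral model. This is only a sketch in Python, not the Verilog in matrix_mux1.v: the loop order, the function name `matmul_single_mac`, and the test data are assumptions made for illustration. It issues one multiply-accumulate per cycle and adds the 2-cycle pipeline fill stated above.

```python
N = 8               # matrix dimension, from the README
PIPELINE_FILL = 2   # initial pipeline-stage delay in cycles, from the README


def matmul_single_mac(a, b):
    """Multiply two N x N matrices with one MAC per cycle.

    Returns (c, cycles). The i/j/k loop order is an assumption;
    the README only says addresses come from a counter.
    """
    c = [[0] * N for _ in range(N)]
    mac_cycles = 0
    for i in range(N):
        for j in range(N):
            acc = 0
            for k in range(N):
                acc += a[i][k] * b[k][j]  # one multiply-accumulate
                mac_cycles += 1           # ... costs one cycle
            c[i][j] = acc                 # write back the finished element
    return c, mac_cycles + PIPELINE_FILL


# Hypothetical 8-bit signed test data (values in -128..127).
a = [[(i * N + j) % 256 - 128 for j in range(N)] for i in range(N)]
b = [[((i + 3) * (j + 5)) % 256 - 128 for j in range(N)] for i in range(N)]

c, cycles = matmul_single_mac(a, b)
print(cycles)  # 514
```

Each of the 64 output elements needs 8 multiply-accumulates, which gives the 512 MAC cycles; the remaining 2 cycles are the pipeline fill.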

Here's the result of the two-channel MAC matrix_multiplier, also based on a three-stage pipeline: memory access, MAC, write back. Address fetching is again based on a counter; the code is in the file matrix_mux.v. The performance is 260 clock cycles in total: 32×8 = 256 MAC calculations, plus 2 cycles of initial pipeline-stage delay and 2 cycles of write-back buffer delay (because the write-back is assumed to be single-port).

image
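The 260-cycle figure follows the same arithmetic with two MAC lanes. The sketch below is a Python behavioral model, not the code in matrix_mux.v. It assumes the two lanes compute two adjacent elements of the same output row in lockstep; the README does not say how the work is split, so that pairing and the name `matmul_dual_mac` are hypothetical.

```python
N = 8                 # matrix dimension, from the README
NUM_MACS = 2          # parallel MAC channels, from the README
PIPELINE_FILL = 2     # initial pipeline-stage delay in cycles, from the README
WRITE_BACK_DELAY = 2  # write-back buffer delay in cycles, from the README


def matmul_dual_mac(a, b):
    """Multiply two N x N matrices with NUM_MACS lanes per cycle.

    Returns (c, cycles). Assumption: lane 0 and lane 1 work on
    columns j and j+1 of the same output row.
    """
    c = [[0] * N for _ in range(N)]
    mac_cycles = 0
    for i in range(N):
        for j in range(0, N, NUM_MACS):
            acc = [0] * NUM_MACS
            for k in range(N):
                for lane in range(NUM_MACS):
                    acc[lane] += a[i][k] * b[k][j + lane]
                mac_cycles += 1  # both lanes advance in the same cycle
            for lane in range(NUM_MACS):
                c[i][j + lane] = acc[lane]
    return c, mac_cycles + PIPELINE_FILL + WRITE_BACK_DELAY


# Hypothetical test data: multiplying by the identity must return a unchanged.
a = [[(i * N + j) % 256 - 128 for j in range(N)] for i in range(N)]
identity = [[1 if i == j else 0 for j in range(N)] for i in range(N)]

c, cycles = matmul_dual_mac(a, identity)
print(cycles)  # 260
print(c == a)  # True
```

The 64 outputs form 32 pairs, each pair takes 8 cycles, and the two fixed delays bring 256 up to 260.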

Here's the sparse matrix-vector multiplier, a widely used operation. The function is S*v=d, and the compression format is CSR (compressed sparse row). The module in SpMV.v (sparse matrix-vector multiplier) is a one-channel MAC with a four-stage pipeline. The first stage fetches row_index, the S vector, and column_index. The second stage fetches v based on column_index, which indicates the positions in this row that are not empty and should be multiplied with the corresponding element of the vector. The third stage is the MAC calculation, followed by the write-back stage.
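As a reference for what S*v=d computes on CSR data, here is a minimal Python sketch. It assumes `row_index` is the usual CSR row-pointer array (one entry per row plus a final end marker) and that `s_values` holds the non-zero entries of S; the README does not spell out the exact layout used in SpMV.v, so these names and the example matrix are illustrative.

```python
def spmv_csr(row_index, column_index, s_values, v):
    """Compute d = S * v for S stored in CSR form.

    row_index[r] .. row_index[r + 1] - 1 are the positions in
    s_values / column_index that belong to row r (assumed layout).
    """
    d = []
    for row in range(len(row_index) - 1):
        acc = 0
        for p in range(row_index[row], row_index[row + 1]):
            # column_index[p] selects the element of v to multiply with
            acc += s_values[p] * v[column_index[p]]
        d.append(acc)
    return d


# Hypothetical example:
# S = [[5, 0, 0, 1],
#      [0, 2, 0, 0],
#      [0, 0, 0, 0],
#      [3, 0, 4, 0]]
row_index = [0, 2, 3, 3, 5]
column_index = [0, 3, 1, 0, 2]
s_values = [5, 1, 2, 3, 4]
v = [1, 2, 3, 4]

d = spmv_csr(row_index, column_index, s_values, v)
print(d)  # [9, 4, 0, 15]
```

Only the five stored non-zeros are multiplied; the empty third row produces 0 without any MAC operation.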

The main principle is to obtain the target address through row_index. When the self-incrementing address equals the target address, the values of the next row of the S matrix are fetched. In the simulation of this four-stage pipeline, each row has four calculations, so when clock_counter=8, macc_clear is set and the data is written to the result. The result data is datac4 and the address is addrd.

image
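The address-compare idea can be sketched in Python as follows. This is a behavioral model under assumptions, not the Verilog in SpMV.v: it treats `row_index` as a CSR row-pointer array, ignores pipeline timing, and the function name `spmv_address_walk` is hypothetical. A single self-incrementing address walks the non-zero values; when it reaches the target address taken from `row_index`, the accumulator is written back and cleared, which plays the role of macc_clear.

```python
def spmv_address_walk(row_index, column_index, s_values, v):
    """Compute d = S * v using one running address and a target address."""
    num_rows = len(row_index) - 1
    d = [0] * num_rows
    row = 0
    acc = 0
    for addr in range(len(s_values) + 1):
        # Address reached the end of the current row: write back and clear.
        # The while loop also steps over rows with no non-zero entries.
        while row < num_rows and addr == row_index[row + 1]:
            d[row] = acc
            acc = 0
            row += 1
        if addr < len(s_values):
            acc += s_values[addr] * v[column_index[addr]]
    return d


# Hypothetical data with four non-zeros per row, as in the simulation above.
row_index = [0, 4, 8]
column_index = [0, 1, 3, 5, 1, 2, 3, 4]
s_values = [1, 2, 3, 4, 5, 6, 7, 8]
v = [1, 1, 2, 2, 3, 3]

d = spmv_address_walk(row_index, column_index, s_values, v)
print(d)  # [21, 55]
```

Comparing one counter against the next row boundary avoids a nested per-row loop, which is closer to how a hardware address counter would be driven.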
