TWI729939B - Method and processor for decompression of model parameters using functions based upon cumulative count distributions - Google Patents
- Publication number: TWI729939B (application TW109132752A)
- Authority: Taiwan (TW)
- Prior art keywords
- function
- coefficient
- decompression
- compression
- circuit
Landscapes
- Compression Of Band Width Or Redundancy In Fax (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Description
The present invention relates to a method and processor for decompression, and in particular to a method and processor for decompression of model parameters using functions based upon cumulative count distributions.
The present invention generally relates to the decompression of model parameters, and specifically to the decompression of model parameters used for a neural network.
Neural networks and other types of models can be used to process various types of data. For example, a neural network model may be trained to recognize whether certain types of objects are present in received input images. Training and machine learning can be used to determine a set of coefficients, such as the weights between neurons of a neural network model, to be used by the model for processing input data.
A predictive model (e.g., a neural network model) may be used in conjunction with a set of coefficients for the model. The coefficient set may be stored in a memory and accessed to perform arithmetic operations on input data (e.g., an image to be analyzed by the model).
To reduce memory usage, the coefficient set is compressed before it is stored. The stored compressed coefficients must be decompressed before they can be used to operate on the input data. In some embodiments, the determined coefficient values are compressed based on a function. The function is generated based on a cumulative count distribution of the decompressed coefficient values. For example, the count values of a set of model coefficients may approximate a bimodal distribution, a Gaussian distribution, a Poisson distribution, or another type of distribution that can be defined by a function. The function parameters of the determined function may be stored in a memory and used by a decompression circuit to apply the function to the compressed model coefficients for decompression. Compared with other decompression methods (e.g., a lookup table), storing the function parameters may consume less memory and may also reduce the number of memory lookups required during decompression.
In some embodiments, a method for decompressing model coefficient values is provided. The method includes receiving compressed coefficient data associated with a model. In some embodiments, the values of the coefficient data are determined through a model training process, and the coefficient data is compressed using a compression function based on a cumulative distribution of the values of the coefficient data. The method further includes retrieving a set of function parameters associated with the compression function, the set of function parameters specifying at least a function type. The method further includes configuring a decompression circuit based on the retrieved function parameters. The method further includes decompressing the compressed coefficient data at the decompression circuit based on the function parameters to produce decompressed coefficient values. The method further includes applying the model to received input data by performing arithmetic operations on the received input data based on the decompressed coefficient values.
A predictive model (e.g., a neural network model) may utilize a set of coefficients when processing received input data. For example, for a neural network model, the coefficient set may correspond to the weights between different neurons of the neural network. The coefficient set may be stored in a memory and accessed to perform arithmetic operations on input data (e.g., an image to be analyzed by the model).
To reduce memory usage, the coefficient set is compressed before it is stored. The stored compressed coefficients must be decompressed before they can be used to operate on the input data. A lookup table may be used to map compressed coefficient values to decompressed coefficient values. However, a lookup table may require a large amount of memory for storage, especially when the range of coefficients is large. In addition, in some embodiments, different types of compression may be performed on different subsets of the coefficients, in which case multiple lookup tables would need to be stored.
In some embodiments, the determined coefficient values are compressed based on a function. The function is generated based on a cumulative count distribution of the decompressed coefficient values. For example, the count values of a set of model coefficients may approximate a bimodal distribution, a Gaussian distribution, a Poisson distribution, or another type of distribution that can be defined by a function. The function parameters of the determined function may be stored in a memory and used by a decompression circuit to apply the function to the compressed model coefficients for decompression. Storing function parameters may consume less memory than storing a lookup table for decompression. In addition, the storage space required for the function parameters of a determined function is independent of the range of the coefficient values and of the number of different possible coefficient values.
Using a decompression function also reduces the number of memory lookups required during decompression. For example, the function parameters of a given function need to be looked up only once, when the decompression circuit begins decompression, and are then used to decompress a large number of coefficients compressed using that function. Decompression using a lookup table, on the other hand, typically requires a memory lookup for each coefficient to be decompressed.
FIG. 1 illustrates a schematic diagram of a system for storing and decompressing model coefficients for use in a model, according to some embodiments. A tensor streaming processor (TSP) 100 or other type of processor is configured to receive and process input data values 102 (e.g., from an input image) based on a stored model to produce output data values (e.g., a classification of the input image, an identification of certain types of objects or characteristics in the input data, and/or the like). The TSP 100 may be an integrated circuit (IC). In some embodiments, the input data values 102 may be input values stored in the memory 108 that represent the results of arithmetic operations performed elsewhere within the TSP 100.
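The TSP 100 operates on the input data values 102 using one or more arithmetic circuit units and one or more model coefficients. The arithmetic circuit units include logic circuits that perform arithmetic operations on the input values 102 and the model coefficients and generate output data values representing a result of the arithmetic operations. For example, an arithmetic circuit unit may perform a matrix multiplication operation on the input values using the model coefficients and generate output data values representing the matrix product. Execution of a predictive model 118 (such as a neural network) can often be implemented as several successive stages of matrix multiplication. In other embodiments, the arithmetic operations of the arithmetic circuit units may include a convolution operation, a dot product operation, a fast Fourier transform (FFT) operation, and/or other arithmetic operations. The arithmetic circuit units 106 may perform operations using single instruction, multiple data (SIMD) processing.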
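The TSP 100 includes a memory 108 that stores compressed model coefficients 112 used by the arithmetic units to operate on the input data values 102. The compressed model coefficients 112 may be generated from the predictive model 118 by the compiler 120. The predictive model 118 may correspond to any type of model that utilizes a set of coefficients. In some embodiments, the coefficient set is determined through a machine learning or training process. For example, in some embodiments, the predictive model 118 is a convolutional neural network (CNN) or another type of neural network model.
Once the predictive model 118 has been constructed or sufficiently trained, the model 118 may be compiled by a compiler 120 for use by the TSP 100 in processing the input data values 102. The compiler 120 analyzes the coefficient values of the predictive model 118 and selects one or more compression schemes for compressing the coefficient values of the model. The compressed coefficient values are then stored in the memory 108 as the compressed model coefficients 112.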
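To be used by the arithmetic circuit units to operate on the input data values 102, the compressed model coefficients 112 associated with the model need to be decompressed. A decompression circuit is configured to receive the compressed model coefficients 112 from the memory 108 and to output decompressed model coefficients that can be operated on by the arithmetic units.
In some embodiments, the compiler 120 selects a compression scheme for the coefficients of the predictive model 118 based on a function derived from a distribution of the coefficient values associated with the model. For example, in many cases the distribution of the coefficient values of the model may be a bimodal distribution, a Gaussian distribution, or a Poisson distribution. The compiler 120 determines a function type that best fits the distribution of the model coefficients and stores the parameters of the determined function in the memory 108 as function parameters 114. The function parameters 114 may indicate a function type associated with the distribution, as well as the values of the coefficients of the function and/or other parameters related to the function. In some embodiments, the types of function parameters stored depend on the function type.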
The decompression circuit supports several possible functions for decompressing the compressed model coefficients 112. The decompression circuit decompresses the compressed model coefficients 112 by applying the specific function defined by the function parameters 114 to the compressed model coefficients 112 to determine the decompressed model coefficients.
Performing decompression using functions may reduce the amount of memory required to store the data used for decompression (e.g., compared with a lookup table). In addition, the amount of memory required to store the function parameters may be independent of the range of the coefficient values or the number of different possible coefficient values. Using a decompression function also reduces the number of memory lookups required during decompression. For example, the function parameters 114 may represent a small, constant amount of memory that is looked up once at the start of decompression and can then be used to decompress a long string of data comprising many coefficients. Decompression using a lookup table, on the other hand, typically requires a memory lookup for each coefficient to be decompressed.
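In some embodiments, the memory 108 may store the compressed model coefficients of the predictive model 118 as a plurality of different coefficient sets (e.g., a first set of compressed model coefficients 112A and a second set of compressed model coefficients 112B). Each set of compressed model coefficients 112 may have been compressed based on a different function (e.g., a first function associated with first function parameters 114A and a second function associated with second function parameters 114B) using arithmetic or Huffman coding. In some embodiments, a different decompression circuit (e.g., decompression circuits 110A and 110B) may be used to decompress the different sets of compressed model coefficients compressed using different functions, to produce different sets of decompressed model coefficients (e.g., decompressed model coefficients 116A and 116B). The output decompressed model coefficients 116A and 116B may be operated on by multiple arithmetic units (e.g., arithmetic units 106A and 106B) to produce multiple sets of output data values (e.g., output data 104A and 104B).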
In some embodiments, multiple functions may be used to decompress the compressed model coefficients. For example, when compressing the model coefficients, the compiler may divide the coefficients into one or more subsets and determine a function and parameters corresponding to the distribution of coefficient values in each subset.
FIG. 2 illustrates a block diagram of a set of compressed model coefficients that may be decompressed using different functions. In some embodiments, the compressed model coefficients may be transmitted from the memory to a decompression circuit over a plurality of bit channels (e.g., bit channels 0 through n). The decompression circuit may decompress the incoming coefficient data on each bit channel using a different function (e.g., functions f_0 through f_n). For example, the decompression circuit may receive multiple sets of function parameters corresponding to the functions f_0 through f_n, each function being used to decompress the coefficient data received over a corresponding bit channel.
Although FIG. 2 illustrates the decompression circuit applying a different function to each bit channel, it should be understood that in other embodiments a common function may be applied to multiple bit channels. In addition, the functions used by the decompression circuit to decompress the compressed model coefficients may be configured to change over time. For example, at a time t = 0, the decompression circuit may use the functions f_0 through f_n for decompression. At a later time t = t_1, however, the decompression circuit may receive different sets of function parameters that change the function used to decompress one or more of the bit channels. In some embodiments, when compressing the model coefficients to be stored in the memory of the TSP, the compiler determines which functions are used to compress and decompress the model coefficients for which bit channels and at which times.
In some embodiments, under certain compression schemes, certain coefficient values may occupy a larger number of bits when compressed than when uncompressed. The compiler may therefore determine that these coefficient values need not be compressed. During decompression, the decompression circuit may be configured to apply an identity function to these coefficient values. Alternatively, the decompression circuit may be bypassed.
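FIG. 3A illustrates an exemplary chart showing a distribution of model coefficients, according to some embodiments. The chart 300 has an x-axis corresponding to coefficient values and a y-axis corresponding to count values. Although the x-axis of the chart 300 shows only integer coefficient values, it should be understood that the coefficient values of a model may be represented using integers, floating-point numbers, fixed-point numbers, and/or the like.
The chart 300 contains a first curve 302 showing the distribution of coefficient values for a particular model. After a set of coefficients for the model has been generated (e.g., through a training process), the number of coefficients in the set having each value is counted. In many cases, the number of coefficients having each value will approximate a common distribution, such as a bimodal distribution, a Gaussian distribution, a Poisson distribution, and/or the like. For example, as illustrated by the first curve 302, the coefficient values of the particular model have a generally bimodal distribution, with the largest numbers of coefficients having the value -2 or 2.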
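The chart 300 also illustrates a second curve 304 indicating a cumulative distribution of the coefficient values of the model. The cumulative distribution curve 304 indicates, for each coefficient value represented on the x-axis of the chart 300, the total number of coefficients that are less than or equal to that value. As such, the cumulative distribution of a set of coefficients is monotonically increasing, which allows a unique coefficient value to be derived from a given count value using a function of the distribution.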
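As a minimal sketch of the counting step described above, the following Python snippet builds a count distribution and its cumulative form from a small list of coefficient values. The function name `cumulative_counts` and the sample coefficients are illustrative assumptions and do not come from the specification.

```python
from collections import Counter
from itertools import accumulate

def cumulative_counts(coeffs):
    """Return the sorted distinct values and, for each, the number of
    coefficients less than or equal to that value."""
    counts = Counter(coeffs)                      # count per coefficient value
    values = sorted(counts)                       # x-axis of the chart
    cumulative = list(accumulate(counts[v] for v in values))
    return values, cumulative

# Hypothetical, roughly bimodal coefficient set with peaks at -2 and 2.
coeffs = [-2, -2, -2, -1, 0, 1, 2, 2, 2, 2, -3, 3]
values, cumulative = cumulative_counts(coeffs)
print(values)       # [-3, -2, -1, 0, 1, 2, 3]
print(cumulative)   # [1, 4, 5, 6, 7, 11, 12]
```

Because every count is non-negative, the cumulative list never decreases, which is the property that lets a count value be mapped back to a single coefficient value.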
The coefficient values of the model are compressed by the compiler based on a function fitted to the cumulative count distribution of the coefficients. In some embodiments, the compiler may first select a function type based on the cumulative count distribution, and then determine function parameters for the selected function type that achieve a best fit between the function type and the cumulative count distribution. For example, the third curve 306 illustrated in the chart 300 corresponds to a polynomial function that may be selected by the compiler to approximate the cumulative count distribution 304. As illustrated in FIG. 3A, the polynomial function corresponding to the third curve 306 may be an eighth-order polynomial function. In some embodiments, the function may be based on an integral of a function approximating the count distribution of the coefficient values (curve 302).
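The idea of reducing a coefficient set to a function type plus a few parameters can be sketched as follows. This is not the compiler's fitting procedure: as a simplifying assumption, it matches a Gaussian by sample mean and standard deviation using Python's `statistics.NormalDist`, rather than searching for a best-fit function or fitting an eighth-order polynomial. The dictionary layout and the function names are hypothetical.

```python
from statistics import NormalDist

def fit_gaussian_parameters(coeffs):
    """Summarize a coefficient set as a function type plus two numbers.
    Moment matching is used here as a stand-in for a best-fit search."""
    dist = NormalDist.from_samples(coeffs)
    return {"type": "gaussian", "mean": dist.mean, "stdev": dist.stdev}

def cumulative_fraction(params, value):
    """Fraction of coefficients expected to be <= value under the fitted
    function (the cumulative distribution scaled to the range 0..1)."""
    return NormalDist(params["mean"], params["stdev"]).cdf(value)

# Hypothetical, roughly bell-shaped coefficient set centered on 0.
coeffs = [-3, -2, -2, -1, -1, -1, 0, 0, 0, 0, 1, 1, 1, 2, 2, 3]
params = fit_gaussian_parameters(coeffs)
print(params["type"], len(params))       # gaussian 3
print(cumulative_fraction(params, 0))    # about 0.5
```

Whatever the range of coefficient values, the stored description stays at one type tag and two numbers, which illustrates why the storage cost of function parameters does not grow with the number of possible coefficient values.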
In some embodiments, the compiler compresses the coefficient values based on the determined function using arithmetic coding. For example, as illustrated in FIG. 3B, the count values of the function are fitted to a range between 0 and 1, where 0 is represented by the binary sequence 0000… and 1 is represented by the binary sequence 1111…. As a result, more frequently occurring coefficients (e.g., coefficients with higher count values) are represented by short bit sequences, and less frequently occurring coefficients are represented by long bit sequences.
In some embodiments, each coefficient value may correspond to a value interval based on the spacing between adjacent coefficient values. The interval for each coefficient value may be determined based on a rounding scheme, a ceiling function, a floor function, and/or the like. For example, in an embodiment in which the coefficient values are integers and a floor function is used to determine the value intervals, the coefficient value 1 may correspond to the interval [1, 2), the coefficient value 2 may correspond to the interval [2, 3), and so on.
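Each interval may correspond to a range of binary sequence values (as determined using the function), and each coefficient value is coded using a bit sequence representative of the range of binary sequence values corresponding to its interval. Because coefficient values with high counts generally correspond to a larger range of binary sequence values, they can be compressed using a smaller number of bits.
For example, suppose that the coefficient values are integers, that the coefficient value 0 corresponds to the interval [-0.5, 0.5), and that the coefficient value 2 corresponds to the interval [1.5, 2.5). As illustrated in FIG. 3B, based on the function 306, bit sequences within the range 308 map to the coefficient value 0, and bit sequences within the range 310 map to the coefficient value 2. Because the range 310 spans a larger range of bit sequences, the bit sequences of that range can generally be represented using a smaller number of common bits than the bit sequences of the range 308. Consequently, the coefficient value 2 (which has a higher count than the coefficient value 0, as illustrated in FIG. 3A) is represented using fewer bits when compressed than the coefficient value 0. For example, the range 308 spans the sequence 1000…, while the range 310 may span the binary sequences 1011… through 1110…. The coefficient value 0 may therefore be represented using the bit sequence 1000 (4 bits), while the coefficient value 2 may be represented using the bit sequence 110 (3 bits). It should be understood that, in some embodiments, the bit sequence representing a compressed coefficient value need not represent all bit sequences within the range corresponding to the interval associated with that value, as long as it does not represent bit sequences within the ranges corresponding to the intervals of other coefficient values.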
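The relationship between range width and code length can be sketched with a small helper that finds the shortest bit sequence whose every continuation stays inside a given range. The helper name `shortest_code` and the two numeric ranges are assumptions chosen to be consistent with the 1000 and 110 examples above; the specification does not give exact range boundaries.

```python
import math
from fractions import Fraction

def shortest_code(lo, hi, max_bits=32):
    """Shortest bit string b such that every value 0.b000... to 0.b111...
    lies inside the half-open range [lo, hi)."""
    for n in range(1, max_bits + 1):
        step = Fraction(1, 2 ** n)        # width covered by an n-bit prefix
        k = math.ceil(lo / step)          # first n-bit prefix starting at or after lo
        if (k + 1) * step <= hi:          # the whole prefix range must fit below hi
            return format(k, "0{}b".format(n))
    raise ValueError("range too narrow for max_bits")

# Hypothetical ranges on the 0..1 scale for coefficient values 0 and 2.
narrow = (Fraction(1, 2), Fraction(9, 16))     # assumed stand-in for range 308
wide = (Fraction(11, 16), Fraction(15, 16))    # assumed stand-in for range 310
print(shortest_code(*narrow))   # 1000
print(shortest_code(*wide))     # 110
```

The wider range admits a 3-bit prefix while the narrower one needs 4 bits, which mirrors the statement that more frequent coefficient values receive shorter bit sequences.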
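FIG. 4A illustrates a block diagram of a decompression circuit, according to some embodiments. The decompression circuit 400 may correspond to the decompression circuit illustrated in FIG. 1, and is configured to receive compressed coefficient values 402 and output decompressed coefficient values 404. In some embodiments, the decompression circuit uses arithmetic coding techniques and a function associated with the coefficient values 402 to decompress the bits received from the compressed coefficient values 402.
The decompression circuit receives a sequence of one or more bits of the compressed coefficient values 402 at a sequence expander circuit 406, which generates a high bit sequence 408 and a low bit sequence 410 for the received bit sequence. As used herein, the high bit sequence 408 corresponds to the received bit sequence with a plurality of binary "1" values appended, and the low bit sequence 410 corresponds to the received bit sequence with a plurality of binary "0" values appended. For example, for the received bit sequence "10", the high bit sequence is "10111…" and the low bit sequence is "10000…".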
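A software analogue of the sequence expander might look like the sketch below. The function name `expand` and the fixed `width` parameter are assumptions; the specification only says that a plurality of "1" or "0" values is appended.

```python
def expand(bits, width=16):
    """Pad a received bit string to a fixed width with 1s (high sequence)
    and with 0s (low sequence)."""
    high = bits.ljust(width, "1")
    low = bits.ljust(width, "0")
    return high, low

high, low = expand("10", width=5)
print(high, low)   # 10111 10000
```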
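The decompression function circuit 414 determines a function to be used for decompression based on one or more received function parameters 412. For example, FIG. 4B illustrates an exemplary decompression function circuit containing function calculation circuits corresponding to different function types, according to some embodiments. The decompression function circuit 414 includes several function calculation circuits, each of which implements a different type of function for calculating an output value from an input value. For example, as illustrated in FIG. 4B, the function calculation circuits may include a first function calculation circuit 450a corresponding to a polynomial function, a second function calculation circuit 450b corresponding to a Gaussian distribution function, and a third function calculation circuit 450c corresponding to a Poisson distribution function.
The function parameters 412 may include a first function type parameter, which indicates a function type (e.g., a polynomial function, a Gaussian distribution function, and/or the like) and can be used by the decompression function circuit 414 to determine which function calculation circuit to use, together with zero or more additional function coefficient parameters (e.g., the coefficients of a polynomial function). As illustrated in FIG. 4B, each different type of function may be associated with a different number of coefficients and/or different types of coefficients. For example, the function calculation circuit 450b may be configured to calculate a function for decompressing coefficient values that fit a Gaussian-type distribution (e.g., an inverse of an integral of a Gaussian distribution), while the function calculation circuit 450c may be configured to calculate a function for decompressing a Poisson-type distribution. In some embodiments, the decompression function circuit 414 may receive different sets of function parameters 412 depending on the set or subset of compressed coefficients to be decompressed. The decompression function circuit 414 applies the function to the high bit sequence 408 to determine a high coefficient value 416, and applies the function to the low bit sequence 410 to determine a low coefficient value 418.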
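The selection of a function calculation circuit from a function type parameter can be sketched in software as a simple dispatch. The parameter dictionary layout, the key names, and the `identity` type are assumptions made for illustration; the Gaussian branch uses the inverse cumulative distribution from Python's `statistics.NormalDist` as one possible reading of "an inverse of an integral of a Gaussian distribution".

```python
from statistics import NormalDist

def make_function(params):
    """Return a callable selected by the function type parameter and
    configured by the remaining (hypothetical) parameters."""
    kind = params["type"]
    if kind == "polynomial":
        coeffs = params["coeffs"]            # ascending order: c0 + c1*x + c2*x^2 ...
        def poly(x):
            result = 0.0
            for c in reversed(coeffs):       # Horner's rule
                result = result * x + c
            return result
        return poly
    if kind == "gaussian":
        # inv_cdf is defined only for inputs strictly between 0 and 1.
        return NormalDist(params["mean"], params["stdev"]).inv_cdf
    if kind == "identity":
        return lambda x: x
    raise ValueError("unsupported function type: " + kind)

poly = make_function({"type": "polynomial", "coeffs": [1, 2, 3]})
gauss = make_function({"type": "gaussian", "mean": 0.0, "stdev": 1.0})
print(poly(2))       # 17.0
print(gauss(0.5))    # 0.0
```

Each function type carries a different number and kind of parameters, which matches the description of FIG. 4B, and the parameters are read once when the callable is built rather than once per coefficient.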
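In some embodiments, when processing a received bit sequence (e.g., a high or low bit sequence), the decompression function circuit 414 uses the function to determine a corresponding value, and identifies the coefficient value corresponding to the bit sequence based on the interval in which the corresponding value lies. For example, if the corresponding value determined by the function lies between two different coefficient values, the decompression function circuit 414 may select a coefficient value based on an interval selection scheme (e.g., rounding, a ceiling function, a floor function, and/or the like).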
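The interval selection schemes mentioned above can be sketched for integer coefficient values as follows. The scheme names and the `select_coefficient` helper are assumptions. Rounding is written as `floor(v + 0.5)` so that the interval for 0 is [-0.5, 0.5), as in the earlier example; Python's built-in `round` rounds halves to even and would not give that half-open interval.

```python
import math

# Hypothetical scheme table for integer coefficient values.
SCHEMES = {
    "round": lambda v: math.floor(v + 0.5),   # value n covers [n - 0.5, n + 0.5)
    "floor": math.floor,                      # value n covers [n, n + 1)
    "ceil": math.ceil,                        # value n covers (n - 1, n]
}

def select_coefficient(value, scheme="round"):
    """Map the value produced by the function to a coefficient value."""
    return SCHEMES[scheme](value)

print(select_coefficient(1.7, "floor"))    # 1
print(select_coefficient(-0.5, "round"))   # 0
print(select_coefficient(0.5, "round"))    # 1
```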
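The comparator and control circuit 420 receives the high coefficient value 416 and the low coefficient value 418 determined by the decompression function circuit 414, and determines whether the high and low coefficient values are the same. If they are the same, the coefficient value corresponding to the received bit sequence is output as a decompressed output coefficient 404. The decompression circuit 400 may then begin receiving a new bit sequence from the compressed coefficient values 402.
If, on the other hand, the high coefficient value 416 and the low coefficient value 418 are not the same, a decompressed output coefficient cannot be determined from the currently received bit sequence. The decompression circuit receives an additional bit from the compressed coefficient values 402 and updates the high bit sequence 408 and the low bit sequence 410. In some embodiments, because either the high bit sequence 408 or the low bit sequence 410 remains the same when an additional bit is received, only a single extended bit sequence needs to be recalculated for each subsequently received bit (e.g., the low bit sequence 410 if the received bit is "1", or the high bit sequence 408 if the received bit is "0"). Similarly, the decompression function circuit 414 only needs to determine a coefficient value for the recalculated extended bit sequence, rather than recalculating both the high and low coefficient values. The updated coefficient values are then compared by the comparator 420 to determine whether a decompressed coefficient value can be output or whether additional bits are needed.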
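Table 1 shows a simplified example of compressed bit sequences mapped to decompressed coefficient values. For example, the decompression function circuit 414 may apply a function (as defined by the received function parameters 412) to a received bit sequence (e.g., 0011…), where the resulting value falls within the interval of a coefficient value (e.g., -2). The decompression function circuit 414 therefore returns the coefficient value "-2" in response to the received bit sequence "0011".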
As an illustrative example, suppose that the decompression circuit receives the bit sequence "0100111000000110". The decompression circuit 400 receives the first bit of the stream ("0"), so the sequence expander circuit 406 determines a high extended bit sequence "0111…" and a low extended bit sequence "0000…". The decompression function circuit 414 receives the high and low extended bit sequences and determines high and low coefficient values corresponding to "-1" and "-3", respectively. Because the high and low coefficient values do not match, the comparator and control circuit 420 cannot determine a single output coefficient value to output. The decompression circuit 400 therefore receives a subsequent bit from the bit stream.
When the next bit of the bit stream is received, the current bit sequence at the decompression circuit 400 is "01". Because the high extended bit sequence is still "0111…", the sequence expander circuit 406 only needs to recalculate a low extended bit sequence for the current bit sequence ("0100…"). The decompression function circuit 414 likewise calculates an updated low coefficient value ("-2") for the low extended bit sequence. Because the high and low coefficient values still do not match, the decompression circuit 400 receives another bit from the bit stream without outputting a decompressed coefficient value.
After the next bit of the bit stream is received, the current bit sequence is "010". The sequence expander circuit 406 determines an updated high extended bit sequence "0101…", which the decompression function circuit 414 determines to correspond to the coefficient value "-2". Because the high and low coefficient values now match, the decompression circuit 400 outputs "-2" as a decompressed coefficient value 404. The decompression circuit may continue to receive bits of the compressed bit sequence "0100111000000110" and output the corresponding coefficient values (e.g., "-1" for the bit sequence "011", "0" for the bit sequence "1000", "-3" for the bit sequence "000", and "2" for the bit sequence "110").
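The walkthrough above can be reproduced with a short software sketch of the bit-by-bit loop. The boundary table in `BOUNDS` is a hypothetical stand-in for the configured function: the values were chosen only so that the sketch agrees with the worked example, and a real decompression function circuit would evaluate a parametric function instead of a table. For simplicity the sketch recomputes both extended sequences on every bit, whereas the text notes that only one of them changes.

```python
from bisect import bisect_right

WIDTH = 16  # assumed fixed width of the extended bit sequences

# Hypothetical lower bounds (on the 0..1 scale) of the range owned by each value.
BOUNDS = [0.0, 0.125, 0.375, 0.5, 0.5625, 0.6875, 0.9375]
VALUES = [-3, -2, -1, 0, 1, 2, 3]

def to_coefficient(expanded):
    """Stand-in for the configured function: map an extended bit
    sequence to the coefficient value whose range contains it."""
    x = int(expanded, 2) / 2 ** len(expanded)
    return VALUES[bisect_right(BOUNDS, x) - 1]

def decode(stream, func=to_coefficient, width=WIDTH):
    """Consume bits one at a time and emit a coefficient whenever the
    high and low extended sequences agree."""
    out, current = [], ""
    for bit in stream:
        current += bit
        high = func(current.ljust(width, "1"))
        low = func(current.ljust(width, "0"))
        if high == low:          # enough bits to identify the coefficient
            out.append(low)
            current = ""         # start a new bit sequence
    return out

print(decode("0100111000000110"))   # [-2, -1, 0, -3, 2]
```

After the first bit "0" the sketch sees -1 (high) and -3 (low), after "01" it sees -1 and -2, and after "010" both are -2, matching the three steps described in the text.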
Although the above examples primarily discuss the use of arithmetic coding and decompression functions to compress and decompress model coefficient values, it should be understood that different types of coding may be used in other embodiments. For example, in some embodiments, Huffman coding may be used in conjunction with a function to compress and decompress model coefficient values.
In some embodiments, the coefficient set of a model may be divided into a plurality of subsets, where the coefficient counts of each subset may conform to a different distribution. Each coefficient subset may therefore be compressed and decompressed based on a different function (e.g., as illustrated in FIG. 2). For example, different functions may be applied to the compressed coefficient values stored in the memory of the TSP based on the bit channel and position of the compressed coefficients.
Interleaved input
In some embodiments, a plurality of decompression circuits may be used to decompress a bit stream containing compressed coefficient data in parallel. For example, during a first clock cycle, each decompression circuit may process a first bit of a different compressed coefficient. When a particular decompression circuit finishes decompressing a particular coefficient, it may move on to a subsequent compressed coefficient that is not currently being processed.
For example, a bit stream of compressed coefficient data may include x bits corresponding to a first coefficient and y bits corresponding to a second coefficient. During a first clock cycle, a first decompression circuit may process the first bit of the first coefficient, while a second decompression circuit may process the first bit of the second coefficient. If x < y, then at clock cycle x+1 the first decompression circuit has finished processing the first coefficient and may begin processing a first bit of a third coefficient, while the second decompression circuit may process a first bit of a fourth coefficient at clock cycle y+1.
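For example, FIG. 5 illustrates a diagram of a plurality of decompression circuits used to decompress compressed coefficient data in parallel, according to some embodiments. The compressed model coefficients 112 may produce a bit stream represented as "aabbbbcccdd…", which includes 2 bits encoding a first coefficient "a", 4 bits encoding a second coefficient "b", 3 bits encoding a third coefficient "c", and 2 bits encoding a fourth coefficient "d". A divider circuit 502 divides the bit stream between a first decompression circuit 110A and a second decompression circuit 110B. The divider 502 determines the position in the bit stream at which the encoding of each coefficient begins, and divides the bits of the bit stream between the decompression circuits 110A and 110B such that each decompression circuit decompresses the bits of a different coefficient. For example, at a first clock cycle, the divider circuit 502 is configured to transmit a first bit of the coefficient "a" to the decompression circuit 110A and a first bit of the coefficient "b" to the decompression circuit 110B. Each of the decompression circuits 110A and 110B processes the received bits using a function based on the stored function parameters 114. During a third clock cycle, the decompression circuit 110A has finished processing the bits of the coefficient "a" and receives a first bit of the next unprocessed coefficient (e.g., the coefficient "c"), while the decompression circuit 110B receives and processes the third bit of the coefficient "b".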
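The assignment of coefficients to decompression circuits in this example can be sketched as a small scheduling routine. It assumes that each circuit consumes one bit per clock cycle and that the bit length of every compressed coefficient is known in advance; the function name `schedule` and the tuple layout are illustrative assumptions.

```python
import heapq

def schedule(bit_lengths, n_circuits=2):
    """Assign each coefficient, in stream order, to the circuit that
    becomes free first. Returns (coefficient index, circuit index,
    start cycle) tuples, with cycles numbered from 1."""
    free = [(1, i) for i in range(n_circuits)]   # (next free cycle, circuit)
    heapq.heapify(free)
    plan = []
    for coeff, nbits in enumerate(bit_lengths):
        start, circuit = heapq.heappop(free)
        plan.append((coeff, circuit, start))
        heapq.heappush(free, (start + nbits, circuit))   # busy for nbits cycles
    return plan

# Bit lengths of coefficients a, b, c, d in the stream "aabbbbcccdd".
for coeff, circuit, start in schedule([2, 4, 3, 2]):
    print("abcd"[coeff], "110A" if circuit == 0 else "110B", start)
# a 110A 1
# b 110B 1
# c 110A 3
# d 110B 5
```

The third line of output corresponds to the statement that the decompression circuit 110A receives the first bit of the coefficient "c" during the third clock cycle.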
The decompression circuits 110A and 110B output decompressed model coefficients 116A and 116B, respectively. In some embodiments, an interleaver circuit (not shown) that interleaves the decompressed coefficients 116A and 116B may be used to form a stream of decompressed coefficients.
Because the compiler performs the initial compression of the model coefficients and therefore knows the bit length corresponding to each compressed coefficient value, the compiler may store in the memory instructions specifying which portions of a bit stream are operated on by which decompression circuits, such that each decompression circuit is able to receive a first bit of a subsequent compressed coefficient after it has decompressed a previous coefficient.
Process flow
FIG. 6 is a flowchart of a process for generating a set of compressed model coefficients, according to some embodiments. A predictive model is constructed 602 and/or trained using a machine learning process, which produces a set of coefficients for the model. In some embodiments, the model may be a neural network model.
For each of one or more subsets of the coefficient set, a compiler selects 604 a function based on the distribution of the coefficient values within the subset. For example, the compiler generates a cumulative count distribution of the coefficient values of the subset and identifies a function type that best fits the generated distribution. The function type may be based on a polynomial function, a Gaussian distribution function, a Poisson distribution function, and/or the like. The compiler determines 606 parameters for the selected function type, to determine a function that best fits the distribution (e.g., the cumulative count distribution) of the coefficient values of the subset. The compiler compresses 608 the subset of coefficients based on the determined function type and function parameters.
The compressed coefficient subsets and the determined function parameters are stored 610 in a memory. The compressed coefficients may be used (after decompression) by one or more arithmetic units to perform operations on input data (e.g., image data) in accordance with the predictive model.
FIG. 7 is a flowchart of a process for decompressing compressed model coefficients, according to some embodiments. The decompression circuit receives 702 data corresponding to compressed coefficients. In some embodiments, the input data is received as a bit stream in which each compressed coefficient is represented by a variable-length bit sequence.
The decompression circuit receives 704 one or more function parameters corresponding to a function to be used to decompress the received compressed coefficient data. The function parameters may indicate a function type as well as one or more coefficients of the function (e.g., where the function type is polynomial, the function parameters may indicate the coefficients of the polynomial function). The decompression circuit configures 706 the function to be used by a decompression function circuit based on the received function parameters. For example, in some embodiments, the decompression circuit includes a plurality of decompression function circuits, each corresponding to a different function type. In response to receiving the function parameters, the decompression circuit selects the particular decompression function circuit corresponding to the function type indicated by the received parameters, and configures the selected decompression function circuit based on the one or more additional function parameters (e.g., corresponding to function coefficient values).
The decompression circuit uses the decompression function circuit, based on the configured function, to decompress 708 the input data corresponding to the compressed coefficients and output decompressed coefficients. The decompressed coefficients may be provided to a TSP.
The TSP applies 710 the model to received input data by performing arithmetic operations on the input data using the decompressed coefficients received from the decompression circuit. The arithmetic operations may include matrix multiplication, dot product operations, FFT, and/or the like.
FIG. 8 is a flowchart of a process for performing decompression on compressed model coefficients using arithmetic decoding. The decompression circuit may receive the compressed coefficients as a bit stream. Because each coefficient value may be represented using a variable-length bit sequence, the decompression circuit may evaluate each bit of the bit stream and determine whether a decompressed coefficient value can be obtained from the bits received so far.
The decompression circuit receives 802 a bit of compressed coefficient data. The decompression circuit generates 804 a high extended bit sequence and a low extended bit sequence from the currently received bit sequence of the compressed coefficient data by appending a sequence of high bits or low bits to the received sequence. The received bit sequence may correspond to a sequence of bits received by the decompression circuit that does not correspond to a decompressed coefficient value already output by the decompression circuit.
The decompression circuit applies 806 a determined function to the high extended bit sequence and the low extended bit sequence to determine decompressed coefficient values. The determined function may correspond to a plurality of received function parameters corresponding to the compressed coefficient values. In some embodiments, applying the function to the high or low extended bit sequence produces a value that lies between two different possible coefficient values and is associated with a particular coefficient value based on an interval scheme.
The decompression circuit determines 808 whether the decompressed coefficient values for the high and low bit sequences are the same. If so, the current bit sequence is sufficient to determine a decompressed coefficient value, and the decompression circuit outputs 810 the decompressed coefficient value corresponding to the currently received bit sequence. The decompression circuit may then receive additional bits of the compressed coefficient data as part of a new bit sequence to determine subsequent decompressed coefficient values.
If, on the other hand, the decompressed coefficient values for the high and low bit sequences are different, the current bit sequence is not sufficient to produce a decompressed coefficient value, and the decompression circuit receives 812 additional bits of the compressed coefficient data until the decompressed coefficient values corresponding to the high and low extended bit sequences match.
Additional configuration information
The foregoing description of the embodiments of the invention has been presented for the purpose of illustration; it is not intended to be exhaustive or to limit the invention to the precise forms disclosed. Persons skilled in the relevant art can appreciate that many modifications and variations are possible in light of the above disclosure.
Some portions of this description describe the embodiments of the invention in terms of algorithms and symbolic representations of operations on information. These algorithmic descriptions and representations are commonly used by those skilled in the data processing arts to convey the substance of their work effectively to others skilled in the art. These operations, while described functionally, computationally, or logically, are understood to be implemented by computer programs or equivalent electrical circuits, microcode, or the like. Furthermore, it has also proven convenient at times to refer to these arrangements of operations as modules, without loss of generality. The described operations and their associated modules may be embodied in software, firmware, hardware, or any combination thereof.
Any of the steps, operations, or processes described herein may be performed or implemented with one or more hardware or software modules, alone or in combination with other devices. In one embodiment, a software module is implemented with a computer program product comprising a computer-readable medium containing computer program code, which can be executed by a computer processor to perform any or all of the steps, operations, or processes described.
本發明之實施例亦可關於一種用於執行本文中之操作之設備。此設備可專門構造用於所需目的,及/或其可包括藉由儲存於電腦中之一電腦程式選擇性地啟動或重新組態之一通用計算裝置。此一電腦程式可儲存於耦合至一電腦系統匯流排之一非暫時性有形電腦可讀儲存媒體或適於儲存電子指令之任何類型之媒體中。此外,本說明書中指涉之任何計算系統可包含一單一處理器或可為採用多個處理器設計以增加計算能力之架構。The embodiments of the present invention may also relate to a device for performing the operations herein. This device may be specially constructed for the required purpose, and/or it may include a general-purpose computing device that is selectively activated or reconfigured by a computer program stored in the computer. This computer program can be stored in a non-transitory tangible computer-readable storage medium coupled to a computer system bus or any type of medium suitable for storing electronic instructions. In addition, any computing system referred to in this specification may include a single processor or may be an architecture designed with multiple processors to increase computing power.
本發明之實施例亦可關於一種藉由本文中描述之一計算程序產生之產品。此一產品可包括來源於一計算程序之資訊,其中資訊儲存於一非暫時性有形電腦可讀儲存媒體中且可包含一電腦程式產品或本文中描述之其他資料組合之任何實施例。The embodiments of the present invention may also relate to a product produced by one of the calculation procedures described herein. Such a product may include information derived from a computing process, where the information is stored in a non-transitory tangible computer-readable storage medium and may include any embodiment of a computer program product or other data combination described herein.
最後,已主要出於可讀性及指導目的選擇本說明書中使用之語言,且其並非經選擇以描繪或限制發明標的物。因此,本發明之範疇意欲不受此詳細描述限制,而是受限於探討基於此之一應用之任何申請專利範圍。因此,實施例之揭示內容意欲繪示而非限制在以下發明申請專利範圍中闡述之本發明之範疇。Finally, the language used in this specification has been chosen mainly for readability and guidance purposes, and it has not been chosen to describe or limit the subject matter of the invention. Therefore, the scope of the present invention is not intended to be limited by this detailed description, but is limited to the discussion of the scope of any patent application based on this application. Therefore, the disclosure of the embodiments is intended to illustrate rather than limit the scope of the present invention described in the scope of the following invention applications.
100: Tensor stream processor (TSP)
102: Input data values
104: Output data values / output values
106: Arithmetic circuit unit
106A: Arithmetic unit
106B: Arithmetic unit
108: Memory
110: Decompression circuit
110A: Decompression circuit / first decompression circuit
110B: Decompression circuit / second decompression circuit
112: Compressed model coefficients
112A: Compressed model coefficients
112B: Compressed model coefficients
114: Function parameters
114A: First function parameters
114B: Second function parameters
116: Decompressed model coefficients
116A: Decompressed model coefficients / decompressed model parameters
116B: Decompressed model coefficients / decompressed model parameters
118: Predictive model
120: Compiler
300: Graph
302: First curve
304: Second curve / cumulative distribution curve / cumulative count distribution
306: Third curve / function
308: Range
310: Range
400: Decompression circuit
402: Compressed coefficient values
404: Decompressed coefficient values / decompressed output coefficients
406: Sequence expander circuit
408: High bit sequence
410: Low bit sequence
412: Function parameters
414: Decompression function circuit
416: High coefficient value
418: Low coefficient value
420: Comparator and control circuit
450: Function calculation circuit
450a: First function calculation circuit
450b: Second function calculation circuit
450c: Function calculation circuit
502: Divider circuit / divider
602: Process
604: Process
606: Process
608: Process
610: Process
702: Process
704: Process
706: Process
708: Process
710: Process
802: Process
804: Process
806: Process
808: Process
810: Process
812: Process
FIG. 1 is a schematic diagram of a system for storing and decompressing model coefficients used in a model, according to some embodiments.
FIG. 2 is a block diagram of a set of compressed model coefficients that can be decompressed using different functions, according to some embodiments.
FIGS. 3A and 3B are example graphs showing a distribution of model coefficients, according to some embodiments.
FIG. 4A is a block diagram of a decompression circuit, according to some embodiments.
FIG. 4B shows an example decompression function circuit containing function calculation circuits that correspond to different function types, according to some embodiments.
FIG. 5 is a diagram of a plurality of decompression circuits for decompressing compressed coefficient data in parallel, according to some embodiments.
FIG. 6 is a flowchart of a process for generating a set of compressed model coefficients, according to some embodiments.
FIG. 7 is a flowchart of a process for decompressing compressed model coefficients, according to some embodiments.
FIG. 8 is a flowchart of a process for decompressing compressed model coefficients using arithmetic decoding.
The figures depict embodiments of the invention for purposes of illustration only. One skilled in the art will readily recognize from the following description that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles, or the resulting advantages, of the invention described herein.
702: Receive
704: Receive
706: Configure
708: Decompress
710: Apply
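The five flowchart labels above (receive, receive, configure, decompress, apply) can be illustrated with a minimal Python sketch. It is not the patented circuit: it assumes the "function parameters" are simply a table of per-value counts, whereas the decompression function circuits described here evaluate a parameterized function based on the cumulative count distribution. It also shows only the interval lookup at the heart of arithmetic decoding, with no bitstream handling or range renormalization. The names `configure_decoder`, `decompress`, `apply_model`, and all numbers are hypothetical.

```python
from bisect import bisect_right


def configure_decoder(counts):
    """706: build cumulative counts from per-value counts (assumed parameters).

    cum[i] is the number of coefficients whose value index is below i.
    """
    cum = [0]
    for count in counts:
        cum.append(cum[-1] + count)
    return cum


def decompress(cum, values, targets):
    """708: map each target position to the value whose count interval holds it."""
    total = cum[-1]
    out = []
    for t in targets:
        if not 0 <= t < total:
            raise ValueError(f"target {t} outside [0, {total})")
        # Largest index i with cum[i] <= t, so that cum[i] <= t < cum[i + 1].
        out.append(values[bisect_right(cum, t) - 1])
    return out


def apply_model(coefficients, inputs):
    """710: use the decompressed coefficients, here as a plain dot product."""
    return sum(c * x for c, x in zip(coefficients, inputs))


# 702 / 704: received data (made-up numbers, for illustration only).
values = [-1.0, 0.0, 1.0]   # possible coefficient values
counts = [2, 5, 1]          # how often each value occurs
targets = [0, 6, 7, 3]      # stand-in for the compressed coefficient data

cum = configure_decoder(counts)             # [0, 2, 7, 8]
coeffs = decompress(cum, values, targets)   # [-1.0, 0.0, 1.0, 0.0]
result = apply_model(coeffs, [1.0, 2.0, 3.0, 4.0])  # 2.0
```

Frequent values occupy wide intervals of the cumulative count range, which is what lets an arithmetic coder spend fewer bits on them. Replacing the table with a closed-form function of a few parameters would remove the need to store `cum` at all, at the cost of an approximate fit.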
Claims (20)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW109132752A TWI729939B (en) | 2019-03-22 | 2019-03-22 | Method and processor for decompression of model parameters using functions based upon cumulative count distributions |
Publications (2)
Publication Number | Publication Date |
---|---|
TW202103064A (en) | 2021-01-16 |
TWI729939B (en) | 2021-06-01 |
Family
ID=75234797
Country Status (1)
Country | Link |
---|---|
TW (1) | TWI729939B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6975770B2 (en) * | 2000-04-04 | 2005-12-13 | Sony Corporation | Image compression and decompression with predictor selection based on embedding data |
TWI279092B (en) * | 2004-03-26 | 2007-04-11 | Ind Tech Res Inst | Compressor/decompressor selecting apparatus and method of the same |
US8824569B2 (en) * | 2011-12-07 | 2014-09-02 | International Business Machines Corporation | High bandwidth decompression of variable length encoded data streams |
US9654531B2 (en) * | 2013-08-01 | 2017-05-16 | Spotify Ab | System and method for transitioning between receiving different compressed media streams |
TWI591976B (en) * | 2012-05-04 | 2017-07-11 | 李爾登公司 | System and methods for coping with doppler effects in distributed-input distributed-output wireless systems |
2019-03-22: TW application TW109132752A, patent TWI729939B (en), active
Also Published As
Publication number | Publication date |
---|---|
TW202103064A (en) | 2021-01-16 |