CN105893358A - A real-time compression method for files - Google Patents
- Publication number
- CN105893358A
- Authority
- CN
- China
- Prior art keywords
- file
- node
- character
- weights
- tree
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
The invention relates to real-time compression of files during network transmission. In the prior art, a file transmitted over a network is compressed only after it has been completely received and stored locally: the file is scanned once to collect the probability distribution of its characters and choose an encoding strategy, scanned a second time to generate a compression code for each character and write it to a compressed file, and the original file is finally deleted. Network transmission plus two scans take considerable time, and the original file and the compressed file coexist in local storage during compression. The invention provides a method for rapidly compressing a file as it is transmitted over a network: the compressed file is complete as soon as transmission finishes, and the compression process occupies no additional storage space.
Description
I. Technical Field
The present invention applies to the compression and decompression of files in network data transmission. It mainly solves the problem of compressing file data while it is being transmitted over the network.
II. Background Art
The traditional way to compress a file transmitted over a network is to wait until the transfer completes and the file is saved to local storage, then scan the file to collect statistics on the probability distribution of its characters and formulate a suitable encoding strategy, then scan the file a second time, encode each character, and write the result to a compressed file, and finally delete the original file. Network transmission and the two scans take a great deal of time, and because the original file and the compressed file exist simultaneously during compression, a large file can occupy a large amount of storage space. The main purpose of the present invention is to provide a method that quickly compresses a file transmitted over a network without occupying additional storage space during compression.
III. Summary of the Invention
A sending end sends a file to a receiving end, and the receiving end compresses the file as it receives it. By the time the file has been completely sent, the compressed file has been generated as well; no temporary file is produced and no additional storage space is used during compression.
1. The sending end and the receiving end communicate over two TCP connections: one carries instructions and the other carries data.
2. The sending end sends a file transmission request over the instruction connection. The request message format is as follows:
Type, fid, file-size, filename-length, filename
Type: 8-bit unsigned number, the message type; 1 for a send request.
Fid: 64-bit unsigned number, the unique identifier of the file being transmitted. After the instruction TCP connection is established, fid is initialized to a random number between 0 and 2^64-1. It is incremented by one for each file transmitted and wraps around to 0 after reaching 2^64-1.
File-size: 64-bit unsigned number, the file size.
Filename-length: 16-bit unsigned number, the length of the file name.
Filename: character string with no fixed length, the file name.
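The request layout above can be sketched with Python's `struct` module. This is a minimal illustration, not the patent's implementation: the byte order (big-endian), the UTF-8 encoding of the file name, and the function names `pack_send_request`, `unpack_send_request`, and `next_fid` are all assumptions, since the text specifies only field widths and the type value 1.

```python
import struct

MSG_SEND_REQUEST = 1   # type value stated in the text
HEADER_FMT = ">BQQH"   # assumed big-endian: u8 type, u64 fid, u64 file-size, u16 filename-length


def pack_send_request(fid: int, file_size: int, filename: str) -> bytes:
    """Build a send-request message (hypothetical helper)."""
    name = filename.encode("utf-8")  # assumption: names are sent as UTF-8
    return struct.pack(HEADER_FMT, MSG_SEND_REQUEST, fid, file_size, len(name)) + name


def unpack_send_request(data: bytes):
    """Parse a send-request message back into (fid, file_size, filename)."""
    msg_type, fid, file_size, name_len = struct.unpack_from(HEADER_FMT, data, 0)
    if msg_type != MSG_SEND_REQUEST:
        raise ValueError("not a send-request message")
    start = struct.calcsize(HEADER_FMT)
    return fid, file_size, data[start:start + name_len].decode("utf-8")


def next_fid(fid: int) -> int:
    """Increment fid, wrapping to 0 after 2^64-1 as the text describes."""
    return (fid + 1) % 2**64
```

With these assumptions the fixed part of the message is 19 bytes, followed by the file name bytes.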
3. The receiving end replies to the request. The normal reply (ready to receive) has the following message format:
Type, fid
Type: 8-bit unsigned number, the message type; 2 for ready to receive.
Fid: 64-bit unsigned number, the unique identifier of the file to be received.
The error reply has the following message format:
Type, fid, code
Type: 8-bit unsigned number, the message type; 3 for an error message.
Fid: 64-bit unsigned number, the unique identifier of the file to be received.
Code: 8-bit unsigned number, the error reason code, as listed in the following table:
code | Meaning |
1 | Insufficient storage space |
2 | File is being transmitted |
3 | No write permission |
4 | Other reasons |
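The two reply formats can be sketched in the same way. Again this is only an illustration under assumptions: big-endian byte order, the helper names, and the English reason strings (paraphrases of the table) are not taken from the patent.

```python
import struct

MSG_READY = 2   # ready-to-receive reply, per the text
MSG_ERROR = 3   # error reply, per the text

# Paraphrased reason strings for the error codes in the table (wording is an assumption).
REQUEST_ERRORS = {
    1: "insufficient storage space",
    2: "file is being transmitted",
    3: "no write permission",
    4: "other reasons",
}


def pack_ready(fid: int) -> bytes:
    return struct.pack(">BQ", MSG_READY, fid)


def pack_error(fid: int, code: int) -> bytes:
    return struct.pack(">BQB", MSG_ERROR, fid, code)


def parse_reply(data: bytes):
    """Return ("ready", fid, None) or ("error", fid, reason)."""
    msg_type, fid = struct.unpack_from(">BQ", data, 0)
    if msg_type == MSG_READY:
        return "ready", fid, None
    if msg_type == MSG_ERROR:
        (code,) = struct.unpack_from(">B", data, 9)  # code follows the 1+8 byte prefix
        return "error", fid, REQUEST_ERRORS.get(code, "unknown")
    raise ValueError("unexpected message type")
```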
4. The receiving end creates the compressed file in local storage. Its name is the original file name with ".gmf" appended, and the file begins with the following content:
File-size, filename-length, filename
File-size: 64-bit unsigned number, the file size.
Filename-length: 16-bit unsigned number, the length of the file name.
Filename: character string with no fixed length, the file name.
5. The receiving end initializes the code tree. The tree contains a single empty leaf node whose symbol is TERM, whose weight is always 0, and whose number is 1024.
6. The sending end starts sending data over the data connection, and the receiving end receives it. Each time the receiving end reads a character, it checks whether the character is already present in the code tree:
1) If the character is not present, it is encoded as the path from the root node of the tree to the TERM node, writing 0 for each step to a left child and 1 for each step to a right child, followed by the character itself. The generated code is written to the compressed file. A subtree is then created to replace the original TERM node. The parent node of this subtree has an empty symbol, weight 1 (the weight 0 of TERM plus the weight 1 of the newly added symbol node), and the number of the original TERM node. Its right child is the character just read, with that character as its symbol, weight 1, and a number equal to the parent's number minus 1. Its left child is a new empty TERM leaf whose number is the parent's number minus 2. Because new nodes have been added, the weights of the nodes on the path to the root must be adjusted, working upward in ascending order of node number. Before a node's weight is modified, the current node is swapped with the highest-numbered node in the block of nodes that have the same weight (symbols and weights are exchanged, numbers are not). The weight is then increased, and the parent of the node at its new position becomes the new current node, until the root node has been processed.
2) If the character is present, it is encoded as the path from the root node of the tree to the character's node, writing 0 for each step to a left child and 1 for each step to a right child. The generated code is written to the compressed file. The weights of the nodes are then adjusted, working upward in ascending order of node number. Before a node's weight is modified, the current node is swapped with the highest-numbered node in the block of nodes that have the same weight. The weight is then increased, and the parent of the node at its new position becomes the new current node, until the root node has been processed.
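The encoding and tree-update rules in steps 5 and 6 can be sketched as follows. This is one possible reading, not the patent's code: the class and method names are hypothetical, the swap is implemented by exchanging the two nodes' positions in the tree (equivalent to exchanging symbols and weights while the numbers stay in place), a node's parent is never chosen as its swap partner, and the output mixes literal characters with `0`/`1` path bits in one string purely for readability.

```python
class Node:
    def __init__(self, number, weight=0, symbol=None, parent=None):
        self.number, self.weight = number, weight
        self.symbol, self.parent = symbol, parent
        self.left = self.right = None


class AdaptiveTree:
    def __init__(self, root_number=1024):
        self.root = self.term = Node(root_number)  # the tree starts as a lone TERM leaf
        self.leaves = {}                            # symbol -> leaf node
        self.nodes = [self.root]                    # all nodes, scanned for block lookups

    def path(self, node):
        """Bits from the root to node: 0 for a left child, 1 for a right child."""
        bits = []
        while node.parent is not None:
            bits.append("1" if node.parent.right is node else "0")
            node = node.parent
        return "".join(reversed(bits))

    def _swap(self, a, b):
        """Exchange the tree positions of a and b; numbers stay with the positions."""
        pa, pb = a.parent, b.parent
        a_right, b_right = pa.right is a, pb.right is b
        if a_right:
            pa.right = b
        else:
            pa.left = b
        if b_right:
            pb.right = a
        else:
            pb.left = a
        a.parent, b.parent = pb, pa
        a.number, b.number = b.number, a.number

    def _update(self, node):
        """Walk to the root, swapping with the block leader before each increment."""
        while node is not None:
            leader = max((n for n in self.nodes
                          if n.weight == node.weight and n is not node.parent),
                         key=lambda n: n.number)
            if leader is not node:
                self._swap(node, leader)
            node.weight += 1
            node = node.parent

    def encode_symbol(self, sym):
        leaf = self.leaves.get(sym)
        if leaf is not None:                  # known symbol: emit its path, then update
            code = self.path(leaf)
            self._update(leaf)
            return code
        code = self.path(self.term) + sym     # new symbol: path to TERM, then the literal
        old = self.term                       # old TERM becomes the subtree's parent
        old.weight = 1
        old.right = Node(old.number - 1, 1, sym, old)
        old.left = self.term = Node(old.number - 2, 0, None, old)
        self.leaves[sym] = old.right
        self.nodes += [old.right, old.left]
        self._update(old.parent)
        return code


def encode(text):
    tree = AdaptiveTree()
    return "".join(tree.encode_symbol(ch) for ch in text)
```

The linear scan for the block leader keeps the sketch short; a production encoder would normally keep the nodes in an array indexed by number so the leader can be found without scanning every node.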
7. After the sending end has sent all the data, it sends a file-sent message over the instruction TCP connection. The message format is as follows:
Type, fid
Type: 8-bit unsigned number, the message type; 4 for a file-sent message.
Fid: 64-bit unsigned number, the unique identifier of the file that has been sent.
8. After receiving the file-sent message, the receiving end normally returns a successful-reception message. The message format is as follows:
Type, fid
Type: 8-bit unsigned number, the message type; 5 for successful reception.
Fid: 64-bit unsigned number, the unique identifier of the file confirmed as received.
The error reply has the following message format:
Type, fid, code
Type: 8-bit unsigned number, the message type; 6 for an error message.
Fid: 64-bit unsigned number, the unique identifier of the file whose reception failed.
Code: 8-bit unsigned number, the error reason code, as listed in the following table:
code | Meaning |
1 | Insufficient storage space |
2 | File is being transmitted |
3 | Received file data is incomplete |
4 | Other reasons |
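One way the receiving end might choose between the two replies is sketched below. The text does not say how completeness is checked, so comparing the received byte count with the announced file-size is an assumption, as are the reading of code 3 as "received data incomplete", the big-endian layout, and the function name.

```python
import struct

MSG_RECEIVED_OK = 5      # successful-reception reply, per the text
MSG_RECEIVE_ERROR = 6    # error reply, per the text
ERR_INCOMPLETE = 3       # assumed meaning of code 3: received data is incomplete


def completion_reply(fid: int, expected_size: int, received_size: int) -> bytes:
    """Build the reply to a file-sent message (hypothetical helper)."""
    if received_size == expected_size:
        return struct.pack(">BQ", MSG_RECEIVED_OK, fid)
    return struct.pack(">BQB", MSG_RECEIVE_ERROR, fid, ERR_INCOMPLETE)
```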
The following focuses on how the compression code is generated and how the file is decompressed. For example, suppose the original file is named a.txt and the byte content of the binary file is abccab. The compression steps are as follows:
1. The receiving end initializes the code tree. The tree contains a single empty leaf node whose symbol is TERM, whose weight is 0, and whose number is 1024. a in Fig. 1 shows the code tree after this step.
2. The first byte, a, is read. Because character a is not in the tree, the code for a is the path from the root to the TERM node followed by the character a itself. Since the tree contains only the TERM node, the path is empty and the code for a is simply a, so a is written to the compressed output. A subtree is then created to replace the original TERM node. The parent node of the subtree has an empty symbol and keeps the original TERM number, 1024. Its right child holds character a, with weight 1 and number 1023 (the parent's number minus 1). Its left child is a new empty TERM leaf with number 1022 (the parent's number minus 2). Although the weight of the root node increases, the root has the highest number, so no swap is performed; its weight is finally set to 1 (the weight 0 of TERM plus the weight 1 of node a). b in Fig. 1 shows the code tree after this step.
3. The second byte, b, is read. Character b is not in the tree, so its code is the path from the root to the TERM node followed by b itself, and 0b is written. The old TERM node is replaced by a subtree containing a new TERM node and character b, and the weight of the root node is increased by 1. c in Fig. 1 shows the code tree after this step.
4. The third byte, c, is read. Character c is not in the tree, so 00c is written as its code. The old TERM node is replaced by a subtree containing a new TERM node and character c; d in Fig. 1 shows the code tree after this step. The weight of node 1022 must now be increased by 1. The block of nodes with the same weight as node 1022 contains 1023, 1021, and 1019, of which 1023 is the highest, so 1022 is swapped with 1023 and the weight is then increased by 1. The parent of node 1023 becomes the current node; because it is the root, its weight is simply increased by 1. e in Fig. 1 shows the code tree after this step.
5. The fourth byte, c, is read. Character c is in the tree, and 101 is written as its code. The weight of node 1019 must be increased by 1. The nodes with the same weight as 1019 are 1021 and 1022, and the highest number is 1022, so 1022 and 1019 are swapped and the weight of node 1022 is then increased by 1. The parent of node 1022 becomes the current node; because it is the root, its weight is simply increased by 1. f in Fig. 1 shows the code tree after this step.
6. The fifth byte, a, is read. Character a is in the tree, and 101 is written as its code. The weight of node 1019 must be increased by 1. Node 1021 has the same weight as 1019, and 1021 is greater than 1019, so 1021 and 1019 are swapped and the weight of node 1021 is then increased by 1. g in Fig. 1 shows the code tree after this step. The weight of node 1023, the parent of 1021, must then be adjusted. The block with the same weight as node 1023 (weight 2) contains 1022, but 1022 is less than 1023, so no swap is performed and the weight of 1023 is increased by 1 to 3. The weight of node 1024 is adjusted next; because it is the root, its weight is simply increased by 1 to 5. h in Fig. 1 shows the code tree after this step.
7. The sixth byte, b, is read. Character b is in the tree, and 101 is written as its code. The weight of node 1019 must be increased by 1. There is no node in the block of weight 1 to swap with node 1019, so its weight is simply increased by 1 to 2. The weight of node 1020 must then be increased by 1. No other node has the same weight as node 1020 (weight 1), so no swap is needed and its weight is increased by 1 to 2. The weight of node 1023 must then be increased by 1. No other node has the same weight as node 1023, so no swap is needed and its weight is increased by 1 to 4. Finally the weight of node 1024 must be increased by 1; because it is the root, its weight is simply increased by 1 to 6. i in Fig. 1 shows the code tree after this step.
8. The final compressed code of abccab is a0b00c101101101, and the compressed file is named a.txt.gmf. Storing the original content requires 48 bits; after compression only 36 bits are needed.
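The 48-bit and 36-bit figures can be checked with a short calculation. The sketch assumes each literal character costs 8 bits and each path digit costs 1 bit; the `(path, literal)` tuple representation and the function name are illustrative only.

```python
def encoded_bits(tokens):
    """Total size in bits: 1 bit per path digit plus 8 bits per literal character."""
    return sum(len(path) + 8 * len(literal) for path, literal in tokens)


# The six codes from the worked example, split into (path bits, literal character).
tokens = [("", "a"), ("0", "b"), ("00", "c"), ("101", ""), ("101", ""), ("101", "")]

original_bits = 8 * len("abccab")       # 6 bytes stored uncompressed
compressed_bits = encoded_bits(tokens)  # 8 + 9 + 10 + 3 + 3 + 3
print(original_bits, compressed_bits)   # prints: 48 36
```

The three first occurrences cost more than a raw byte each (8, 9, and 10 bits), so the saving comes entirely from the three repeated characters at 3 bits each.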
The compressed file is then decompressed; the original binary content expected is abccab. The decompression steps are as follows:
1. The original file size (6 bytes) and the original file name (a.txt) are read from the compressed file, and an empty file is created with the original file name.
2. The code tree is initialized. The tree contains a single empty leaf node whose symbol is TERM, whose weight is 0, and whose number is 1024. a in Fig. 1 shows the code tree after this step.
3. The first byte of the compressed code is read. It is character a, which is written to the original file. Character a is then added to the tree in the same way as during compression. b in Fig. 1 shows the code tree after this step.
4. The bit 0 is read from the compressed code, which leads to the TERM node. One byte is then read from the compressed code; it is character b, which is written to the original file and added to the tree. c in Fig. 1 shows the code tree after this step.
5. The bit 0 is read from the compressed code, which leads to node 1022; another 0 is read, which leads to TERM. One byte is then read from the compressed code; it is character c, which is written to the original file and added to the tree. e in Fig. 1 shows the code tree after this step.
6. The bits 101 are read from the compressed code, which leads to node 1019. The character of this node is c, so c is written to the original file and the node weights are increased using the same adjustment as during compression. f in Fig. 1 shows the code tree after this step.
7. The bits 101 are read from the compressed code, which leads to node 1019. The character of this node is a, so a is written to the original file and the node weights are increased using the same adjustment as during compression. h in Fig. 1 shows the code tree after this step.
8. The bits 101 are read from the compressed code, which leads to node 1019. The character of this node is b, so b is written to the original file and the node weights are increased using the same adjustment as during compression. i in Fig. 1 shows the code tree after this step.
9. Decompression is complete. The recovered binary content is abccab, as expected.
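The decompression steps can be sketched as a decoder that rebuilds the same tree while reading. As before, this is an illustration under assumptions: the names are hypothetical, the input is a string of `0`/`1` characters in which every literal is written as 8 bits, swaps exchange tree positions, and a node's parent is never its swap partner. The original size from the file header tells the decoder when to stop.

```python
class Node:
    def __init__(self, number, weight=0, symbol=None, parent=None):
        self.number, self.weight = number, weight
        self.symbol, self.parent = symbol, parent
        self.left = self.right = None


class Decoder:
    def __init__(self, root_number=1024):
        self.root = self.term = Node(root_number)  # lone TERM leaf, as in compression
        self.nodes = [self.root]

    def _swap(self, a, b):
        """Exchange the tree positions of a and b; numbers stay with the positions."""
        pa, pb = a.parent, b.parent
        a_right, b_right = pa.right is a, pb.right is b
        if a_right:
            pa.right = b
        else:
            pa.left = b
        if b_right:
            pb.right = a
        else:
            pb.left = a
        a.parent, b.parent = pb, pa
        a.number, b.number = b.number, a.number

    def _update(self, node):
        """Same weight adjustment as the compressor: swap with block leader, then add 1."""
        while node is not None:
            leader = max((n for n in self.nodes
                          if n.weight == node.weight and n is not node.parent),
                         key=lambda n: n.number)
            if leader is not node:
                self._swap(node, leader)
            node.weight += 1
            node = node.parent

    def _add(self, sym):
        """Replace TERM with a subtree holding the new symbol and a new TERM."""
        old = self.term
        old.weight = 1
        old.right = Node(old.number - 1, 1, sym, old)
        old.left = self.term = Node(old.number - 2, 0, None, old)
        self.nodes += [old.right, old.left]
        self._update(old.parent)

    def decode(self, bits, size):
        """Decode `size` bytes from a string of '0'/'1' characters."""
        out, pos = bytearray(), 0
        while len(out) < size:
            node = self.root
            while node.left is not None:        # follow bits down to a leaf
                node = node.right if bits[pos] == "1" else node.left
                pos += 1
            if node is self.term:               # TERM reached: next 8 bits are a raw byte
                sym = int(bits[pos:pos + 8], 2)
                pos += 8
                self._add(sym)
            else:                               # known symbol: output it and update weights
                sym = node.symbol
                self._update(node)
            out.append(sym)
        return bytes(out)
```

For the worked example, the input would be the 8 bits of a, then 0 and the 8 bits of b, then 00 and the 8 bits of c, then 101 three times, which is 36 bits in total.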
Fig. 1 shows the code tree transformations that occur during file compression and decompression.
In Fig. 1, a is the initial code tree containing only one empty node;
In Fig. 1, b is the code tree after the node for character a is added;
In Fig. 1, c is the code tree after the node for character b is added;
In Fig. 1, d is the code tree after the node for character c is added;
In Fig. 1, e is the code tree after the node for character c is added and the weights are adjusted;
In Fig. 1, f is the code tree after character c is processed a second time;
In Fig. 1, g is the code tree after character a is processed a second time;
In Fig. 1, h is the code tree after character a is processed a second time and the weights are adjusted;
In Fig. 1, i is the code tree after character b is processed a second time.
IV. Detailed Description of the Embodiments
1. Configure a TCP/IP network in which the two ends can communicate with each other.
2. Start the sending end at one end of the network and the receiving end at the other end.
3. The sending end sends a file, and the receiving end produces the compressed file.
Claims (2)
1. The present invention uses a variable tree to encode file data during file transmission; protection is therefore claimed for encoding a file in network transmission by means of a variable tree.
2. The present invention uses a dual-channel TCP connection for control when compressing during file transmission, one channel carrying control instructions and the other carrying file data; protection is therefore claimed for the design method that uses a dual-channel TCP connection to compress a file while it is being transmitted.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410464824.4A CN105893358A (en) | 2014-09-12 | 2014-09-12 | A real-time compression method for files |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105893358A (en) | 2016-08-24 |
Family
ID=56999973
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410464824.4A Pending CN105893358A (en) | 2014-09-12 | 2014-09-12 | A real-time compression method for files |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105893358A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110401723A (en) * | 2019-08-16 | 2019-11-01 | 北京浪潮数据技术有限公司 | Method, system, equipment and the storage medium of OVA file upload services device |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101521661A (en) * | 2008-02-29 | 2009-09-02 | 北京盖特佳信息安全技术股份有限公司 | Dual-channel information exchange method based on load balancing technique |
CN102546105A (en) * | 2011-12-28 | 2012-07-04 | 深圳市新为软件有限公司 | Method and device for network resource transmission |
CN102546108A (en) * | 2011-12-28 | 2012-07-04 | 深圳市新为软件有限公司 | Method and device for transmitting network resources by tree structure |
CN103181168A (en) * | 2010-08-17 | 2013-06-26 | 三星电子株式会社 | Video encoding method and apparatus using transformation unit of variable tree structure, and video decoding method and apparatus |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | C06 | Publication | |
 | PB01 | Publication | |
 | DD01 | Delivery of document by public notice | Addressee: JIANGSU GM-WINLEAD INTELLIGENT TECHNOLOGY CO., LTD.; Document name: Notification of Publication of the Application for Invention |
 | SE01 | Entry into force of request for substantive examination | |
 | WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20160824 |