Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Starting Address record line is always the first line, but in original file it was last line #43

Open
LukeGary462 opened this issue Jul 8, 2020 · 11 comments

Comments

@LukeGary462
Copy link

I am testing this module for some production/testing firmware automation and I am super pleased!
I am noticing that when I encode a hex file to a dict with this module, then fetch and re-save to a hex file (the data is stored in a database as a JSON record) one line gets swapped in the file thus causing my md5 checksums to not match. While I am fairly certain the structure of an intel hex file prevents this from being an issue, it would be nice to under stand how this happens.

Encode to database:

file_name = file_path.split('/')[-1]
with open(file_path, 'rb') as file:
    md5 = hashlib.md5(
        file.read()
    ).hexdigest()
intel_hex = IntelHex()
intel_hex.loadhex(file_path)
if image_type.lower() in ['mfg', 'manufacturing']:
    self.manufacturing_fw = intel_hex.todict()
    self.manufacturing_fw_name = file_name
    self.manufacturing_md5 = md5

Decode From database:

if image_type.lower() in ['mfg', 'manufacturing']:
    output_file_name = f'{out_path}/{self.part_number}-{self.manufacturing_fw_name}'
    firmware_data = self.manufacturing_fw

data = {}
for key, value in firmware_data.items():
    if key != 'start_addr':
        key = int(key)
    data[key] = value

intel_hex = IntelHex()
intel_hex.fromdict(data)
intel_hex.write_hex_file(output_file_name)

I need to add these lines upon export to re-arrange the lines to get the exported file to match the original exactly.
Not efficient or the minimum solution, but it works with everything I have tested so far.

file = open(output_file_name, 'r')
lines = file.readlines()
file.close()
output = open(output_file_name, 'w')
total_lines = len(lines)
last_line = lines[-1]
lines.append(lines[-1])
lines[0], lines[total_lines-1] = lines[total_lines-1], lines[0]
lines.append(last_line)
for line in lines[1:-1]:
    output.write(line)
output.close()

I appreciate any help with this!

@fernandez85
Copy link
Contributor

fernandez85 commented Jul 8, 2020

Can You provide an example of a file before encode and after decode?

If i understood correctly (from the "fix" code) to have your image written correctly You must put first line at one before last position. I'm guessing that this is a Starting Address record. IntelHex standard doesn't say strictly about position. It can be placed at the beginning, it can be placed at the end (before End of File record). It could be placed between data segments as well. But this is just my theory - honestly I never seen such Hex file. Nevertheless this IntelHex implementation always place Starting Address record at the beginning of the file.

By the way, Your "fix" code can be simplified:

file = open(output_file_name, 'r')
lines = file.readlines()
file.close()
output = open(output_file_name, 'w')
lines.insert(-1, lines[0])
for line in lines[1:]:
    output.write(line)
output.close()

Of course if I clearly read out your intentions. But same is what You get when You run your code.

@fernandez85
Copy link
Contributor

I looked at the IntelHex standard again, and I realized that Starting Address record types are numbered higher then Data and Extended Adress records. So maybe this implementation isn't that good after all.
But first let us check the root-cause of your problem.

@bialix
Copy link
Member

bialix commented Jul 9, 2020

IntelHex startard allows separate records to be freely swapped until they all within one 64K page IIRC. Additional starting records can appear anywhere in the file.
Anyway, in your approach I would not rely on MD5 hashing of binary file. With IntelHex files this is not stable approach.
You can use hexdiff.py utility to actually compare content of 2 files, if you wish.
Or maybe you can provide some minimal example of your files with different MD5 before and after processing. It's possible that starting address record is to blame here. I don't really need your binary files, you can inspect with text editor first and last lines and show first 8 bytes here.
BTW the very last line must be EOF record always. So puting it to the start - sounds very wrong. Without real data it's hard to tell what's wrong actually.

@LukeGary462
Copy link
Author

Thanks for looking in to this. Our images are over that size boundary (about 100K), unfortunately I cannot share the hex files here, but I have used hex files with swapped lines in the past when merging in bootloaders in the image. The reasoning behind using the MD5 is because my firmware guy wants that to be used (not my choice). When I diff the files I can see exactly the line that gets swapped. I will get together a small snippet of the beginning and the end of the files to show you what is happening.

@LukeGary462
Copy link
Author

Source File on the Right. Reconstructed on the Left

Beginning of file
Meld_2020-07-09_13-09-33

End of file
Meld_2020-07-09_13-10-19

@fernandez85
Copy link
Contributor

Yes, that is the Start Linear Address record all right.
As I already wrote. Current implementation always places Starting Address records at beginning. So no matter where it will be placed when importing a file (e.g. using 'loadhex') You will have always Starting Address record at the beginning of the file when exporting to a file (e.g. using 'write_hex_file').

@bialix
Copy link
Member

bialix commented Jul 10, 2020

"the data is stored in a database as a JSON record" - why don't you save actual hex file content to database then?

@bialix
Copy link
Member

bialix commented Jul 10, 2020

I will change the title of this issue. It can be fixed with another option of write_hex_file, oh my. That's already too complex.

@bialix bialix changed the title Hex File Lines get swapped Starting Address record line is always first line, but in original file it was last line Jul 10, 2020
@bialix bialix changed the title Starting Address record line is always first line, but in original file it was last line Starting Address record line is always the first line, but in original file it was last line Jul 10, 2020
@LukeGary462
Copy link
Author

"the data is stored in a database as a JSON record" - why don't you save actual hex file content to database then?

I'm starting to think I should just do that... This module is very nice though. There is talk of changing bootloaders, so being able to merge the files with this module without manually creating new releases seems like nice future proofing

@LukeGary462
Copy link
Author

Yes, that is the Start Linear Address record all right.
As I already wrote. Current implementation always places Starting Address records at beginning. So no matter where it will be placed when importing a file (e.g. using 'loadhex') You will have always Starting Address record at the beginning of the file when exporting to a file (e.g. using 'write_hex_file').

Im not sure why this is happening, this was compiled and linked in IAR EWARM

@fernandez85
Copy link
Contributor

fernandez85 commented Jul 10, 2020

Why You have in Your Hex file Start Record on the end? Probably by a software design. @bialix already wrote that there isn't only one place to put that record. IntelHex documentation doesn't describe this precisely.

It can be fixed with another option of write_hex_file, oh my. That's already too complex.

Yeah. This function is really evolving very dynamically ;)
I thinking about the solution. I see 2 ways:

  1. We only care about importing and when loading we only check were it's placed, on beginning or at the end. If somewhere else then I suggest to just force 'on beginning' option in this case. And when writing we set it as-is (resolved while loading)
  2. Option 1. + new parameter, that would replace 'write_start_addr' (in 'write_hex_file' function) with for example 'start_addr_mode', which would have 3 options:
    • 'sof' - start of file, old behavior when 'write_start_addr'=True
    • 'eof' - end of file, naturally before EOF record
    • 'none' - old behavior when 'write_start_addr'=False

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants