rpm2archive -f pax cannot handle utf8 filenames #2972
Comments
In this case archive_write_header() returns ARCHIVE_WARN, which is treated as error in rpm2archive. OTOH I don't think libarchive should mess with the file names, maybe it makes sense to set the hdrcharset to
Oh, and the error handling in rpm2archive is completely broken...
Btw, it cannot handle UTF-8 filenames either, as it checks the current locale, which is not initialized and is thus 7-bit ASCII...
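To illustrate the point about the uninitialized locale: a C program starts in the "C" locale and only picks up LANG/LC_CTYPE from the environment once it calls setlocale(LC_ALL, ""). The following is a minimal sketch using only standard C and POSIX calls; the helper names are hypothetical and not taken from rpm or libarchive.

```c
#include <langinfo.h>
#include <locale.h>
#include <string.h>

/* Returns 1 while the process is still in the startup "C"/"POSIX" locale,
 * i.e. nobody has called setlocale(LC_ALL, "") or similar yet. */
int ctype_locale_is_default(void)
{
    /* A NULL locale argument only queries; it does not change anything. */
    const char *name = setlocale(LC_CTYPE, NULL);

    return name != NULL &&
           (strcmp(name, "C") == 0 || strcmp(name, "POSIX") == 0);
}

/* Name of the character set of the current LC_CTYPE locale. */
const char *ctype_codeset(void)
{
    return nl_langinfo(CODESET);
}
```

Calling ctype_locale_is_default() at the top of main() should report 1 regardless of what LC_CTYPE is set to in the environment, which is why a tool that never calls setlocale() ends up with a 7-bit character set.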
I find it very surprising that bsdtar's output depends on the current locale, but that seems to be the case:
It's not much work to not use libarchive for writing. The only two formats that can be used for archive writing are cpio and pax (all the others have too many limitations). Writing cpio is easy and writing a pax tar file is also not hard (reading a tar file is where it gets really messy because of all the different implementations). Is that something you would be interested in?
rpm2archive could use some love for sure, but I'd rather not teach it about format internals; that's just the kind of thing I'd rather outsource to somebody else, like libarchive. If it doesn't do what we want it to do, then let's at least look at fixing it instead of doing it in rpm; it'd benefit way more people too. As for the encoding, I think here's a case for RPMTAG_ENCODING: if that's present and says utf-8 (upstream rpm will never put anything else there) we can safely assume utf-8. Anything else is a legacy case, and if the easiest solution is to just say "BINARY" encoding then that's fine with me. Or does "GNU tar complains" mean it actually fails entirely rather than just warn?
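The RPMTAG_ENCODING idea above could be sketched as a small decision helper. This is only an illustration: pax_hdrcharset_for() is a hypothetical name, fetching the tag from the package header is left out, and the assumption is that the returned string would be handed to libarchive's pax writer through its hdrcharset option.

```c
#include <string.h>

/* Hypothetical helper: choose a pax "hdrcharset" value from the package's
 * RPMTAG_ENCODING string. Pass NULL when the tag is absent. */
const char *pax_hdrcharset_for(const char *rpm_encoding)
{
    /* Tag present and "utf-8": file names can be assumed to be UTF-8. */
    if (rpm_encoding != NULL && strcmp(rpm_encoding, "utf-8") == 0)
        return "UTF-8";

    /* Anything else is the legacy case: pass the bytes through untouched. */
    return "BINARY";
}
```

With this split, only legacy packages would produce archives carrying the BINARY marker, so only those would trigger the GNU tar warning mentioned in the thread.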
It just warns about the unknown attribute.
And this is about file names, I think "upstream rpm" treats those pretty much as binary as they are created by the build process and not part of the spec file.
Right. So the warning would only be seen by folks who run rpm2archive to convert legacy rpms to tar - ie a rare corner case really. A harmless warning from gnu tar in that case is quite acceptable to me at least.
Why "legacy"? Does the current code reject non-utf8 file names?
It does, by default. For many years now. Looking closer: we turned it into an error five years ago; before that it was a warning for a similar period of time. It's still macro-overridable, of course, for v4 packages.
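For illustration, a check that rejects file names which are not well-formed UTF-8 might look like the sketch below. This is a generic validator written for this example, not rpm's actual implementation; the function name is hypothetical.

```c
/* Returns 1 if the NUL-terminated string is well-formed UTF-8, else 0.
 * Rejects overlong forms, surrogates and code points above U+10FFFF. */
int is_valid_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    while (*p) {
        unsigned int cp;
        int extra;

        if (*p < 0x80) {                    /* plain ASCII byte */
            p++;
            continue;
        } else if ((*p & 0xE0) == 0xC0) {   /* lead byte of 2-byte sequence */
            cp = *p & 0x1F; extra = 1;
        } else if ((*p & 0xF0) == 0xE0) {   /* lead byte of 3-byte sequence */
            cp = *p & 0x0F; extra = 2;
        } else if ((*p & 0xF8) == 0xF0) {   /* lead byte of 4-byte sequence */
            cp = *p & 0x07; extra = 3;
        } else {
            return 0;   /* stray continuation byte or invalid lead byte */
        }
        p++;

        for (int i = 0; i < extra; i++, p++) {
            if ((*p & 0xC0) != 0x80)        /* also catches a premature NUL */
                return 0;
            cp = (cp << 6) | (*p & 0x3F);
        }

        /* Reject encodings longer than necessary for the code point. */
        if ((extra == 1 && cp < 0x80) ||
            (extra == 2 && cp < 0x800) ||
            (extra == 3 && cp < 0x10000))
            return 0;
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;
    }
    return 1;
}
```

A name such as "./fooöo" encoded as UTF-8 passes this check, while the same name encoded in a single-byte charset like ISO-8859-15 does not.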
That makes things a bit easier, so we just need to teach libarchive that it should accept utf8. I'll adapt the title of this issue ;-)
I'll open a pull request for this.
Our headers are always using utf8 and the pax standard also requires utf8 strings. So do this nasty little locale switching to make libarchive not depend on the active locale. Fixes issue rpm-software-management#2972
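The kind of locale switching described in the commit message can be sketched with the POSIX.1-2008 per-thread locale API. This is not the code from the pull request: the function names and the list of candidate locale names are assumptions, and the sketch targets a glibc-style system.

```c
#define _POSIX_C_SOURCE 200809L
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Run fn(arg) with the calling thread's LC_CTYPE switched to a UTF-8
 * locale, then restore the previous locale. If no UTF-8 locale can be
 * created, fn is called in the current locale (a design choice of this
 * sketch). Other categories use the POSIX locale during the call. */
int call_with_utf8_ctype(int (*fn)(void *), void *arg)
{
    /* Candidate names are an assumption; availability varies by system. */
    static const char *const candidates[] = {
        "C.UTF-8", "en_US.UTF-8", "en_US.utf8"
    };
    const size_t n = sizeof(candidates) / sizeof(candidates[0]);
    locale_t utf8 = (locale_t)0;
    locale_t old = (locale_t)0;
    int rc;

    for (size_t i = 0; i < n && utf8 == (locale_t)0; i++)
        utf8 = newlocale(LC_CTYPE_MASK, candidates[i], (locale_t)0);

    if (utf8 != (locale_t)0)
        old = uselocale(utf8);      /* affects only the calling thread */

    rc = fn(arg);                   /* e.g. a wrapper that writes one header */

    if (utf8 != (locale_t)0) {
        uselocale(old);             /* restore before freeing */
        freelocale(utf8);
    }
    return rc;
}

/* Example callback: copies the active codeset name into a 64-byte buffer. */
int record_codeset(void *arg)
{
    char *buf = arg;

    strncpy(buf, nl_langinfo(CODESET), 63);
    buf[63] = '\0';
    return 0;
}
```

Because uselocale() is per-thread and the old handle is restored afterwards, the rest of the program keeps whatever locale it had, which matches the intent of making the archive writing independent of the active locale.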
It fails because it wants to convert the filenames to utf8:
$ echo $LC_CTYPE
de_DE@euro
$ rpm2archive /usr/src/packages/RPMS/x86_64/empty-3.0.0-1.x86_64.rpm > /dev/null
Error writing archive: Can't translate pathname './fooöo' to UTF-8 (84)