[ARM] Crash failed to parse the core header when makedumpfile is compiled with -D_TIME_BITS=64 #177

Open
zyxiaooo opened this issue Apr 1, 2024 · 5 comments


zyxiaooo commented Apr 1, 2024

Hi,

I have ARM platforms running kernel 5.15. We recently switched to 64-bit time and found that crash fails to open the core with the following error:

crash: diskdump / compressed kdump: cannot malloc block_size buffer

All data after the timestamp field is shifted by 12 bytes in the core header:

struct disk_dump_header {
        char                    signature[SIG_LEN];     /* = "DISKDUMP" */
        int                     header_version; /* Dump header version */
        struct new_utsname      utsname;        /* copy of system_utsname */
        struct timeval          timestamp;      /* Time stamp */
        uint8_t dummy[12];   <<<<<<<<<<<<<<<<<<<<<<<<<< adding this temporarily works around the issue

I also tried to compile crash with the following command to match the makedumpfile build:

 make target=ARM CFLAGS="-D_TIME_BITS=64"

But got another error:

WARNING: compressed kdump: invalid nr_cpus value: 0
Segmentation fault

Any idea how to correctly handle the core dump with ARM + -D_TIME_BITS=64?

Member

liutgnu commented Apr 2, 2024

Hi,

I have ARM platforms running kernel 5.15. We recently switched to 64-bit time and found that crash fails to open the core with the following error:

crash: diskdump / compressed kdump: cannot malloc block_size buffer

The error message just reports that realloc() failed in diskdump.c:read_dump_header(). Could you check why realloc() fails? Is it due to a memory shortage or an incorrect block_size value? Printing strerror() may also help.
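
A minimal sketch of the kind of check suggested here. It is not crash's actual code: the helper name alloc_block_buffer and its message format are hypothetical. It separates "the header gave a nonsensical block_size" from "the allocator really ran out of memory", and uses strerror() for the second case. A zero-size request may legitimately return NULL on some C libraries, so a block_size of 0 can surface as an allocation failure.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: allocate a buffer of block_size bytes and, on
 * failure, write a human-readable reason into msg.
 * Returns the buffer, or NULL with msg filled in.
 */
static void *alloc_block_buffer(int block_size, char *msg, size_t msg_len)
{
    void *buf;

    /* A zero or negative size points at a bad header, not at low memory. */
    if (block_size <= 0) {
        snprintf(msg, msg_len, "invalid block_size %d in dump header",
                 block_size);
        return NULL;
    }

    errno = 0;
    buf = malloc((size_t)block_size);
    if (buf == NULL)
        snprintf(msg, msg_len, "cannot malloc %d bytes: %s",
                 block_size, strerror(errno));
    return buf;
}
```

Calling alloc_block_buffer(0, msg, sizeof(msg)) fills msg with the "invalid block_size" text instead of a generic malloc error, which would have pointed at the header straight away.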

All data after the timestamp field is shifted by 12 bytes in the core header:

struct disk_dump_header {
        char                    signature[SIG_LEN];     /* = "DISKDUMP" */
        int                     header_version; /* Dump header version */
        struct new_utsname      utsname;        /* copy of system_utsname */
        struct timeval          timestamp;      /* Time stamp */
        uint8_t dummy[12];   <<<<<<<<<<<<<<<<<<<<<<<<<< adding this temporarily works around the issue

Yeah, that makes sense, because a 64-bit timestamp takes more space.
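
The 12-byte figure can be reproduced with a small layout model. This is a sketch under stated assumptions, not the real header definition: SIG_LEN is taken as 8, struct new_utsname as six 65-byte strings, the field after timestamp as a 4-byte status word, and 64-bit integers as 8-byte aligned (as on ARM EABI, forced here with _Alignas so the model also holds on other hosts). All struct and function names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define SIG_LEN 8    /* assumed: length of "DISKDUMP" */
#define UTS_LEN 65   /* assumed: __NEW_UTS_LEN + 1 */

/* Stand-in for struct new_utsname: six fixed-size strings, 390 bytes. */
struct utsname_model { char field[6][UTS_LEN]; };

/* struct timeval as seen with 32-bit and with 64-bit time. */
struct timeval32 { int32_t tv_sec; int32_t tv_usec; };
struct timeval64 { _Alignas(8) int64_t tv_sec; int64_t tv_usec; };

struct header32 {
    char                 signature[SIG_LEN];
    int32_t              header_version;
    struct utsname_model utsname;
    struct timeval32     timestamp;
    uint32_t             status;      /* assumed next field */
};

struct header64 {
    char                 signature[SIG_LEN];
    int32_t              header_version;
    struct utsname_model utsname;
    struct timeval64     timestamp;
    uint32_t             status;      /* assumed next field */
};

/* Offset of the first field after timestamp in each layout. */
static size_t next_field_offset_32(void) { return offsetof(struct header32, status); }
static size_t next_field_offset_64(void) { return offsetof(struct header64, status); }

/* How far everything after timestamp moves when time becomes 64-bit. */
static size_t header_shift(void)
{
    return next_field_offset_64() - next_field_offset_32();
}
```

In this model the fields before timestamp end at byte 402. The 32-bit timeval starts at 404 and ends at 412; the 64-bit one needs 8-byte alignment, so it starts at 408 and ends at 424. The shift is therefore 8 bytes of extra size plus 4 bytes of extra padding, which matches the dummy[12] workaround.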

I also tried to compile crash with the following command to match the makedumpfile build:

 make target=ARM CFLAGS="-D_TIME_BITS=64"

On my computer (Fedora 38):

$ cat /usr/include/bits/types/struct_timeval.h
struct timeval
{
#ifdef __USE_TIME_BITS64
  __time64_t tv_sec;            /* Seconds.  */
  __suseconds64_t tv_usec;      /* Microseconds.  */
#else
  __time_t tv_sec;              /* Seconds.  */
  __suseconds_t tv_usec;        /* Microseconds.  */
#endif
};

I guess (not tried) it should be CFLAGS="-D__USE_TIME_BITS64" in order to enable the 64-bit timestamp.
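
Rather than guessing which macro takes effect, one way to check is to compile a tiny probe with exactly the flags used for each tool and compare the results. The sketch below uses only sizeof on the standard struct timeval; the function names are made up. As far as I know, glibc treats _TIME_BITS (together with _FILE_OFFSET_BITS=64) as the user-facing switch and sets __USE_TIME_BITS64 internally from it, so that is an assumption worth verifying against the headers of the toolchain in use.

```c
#include <stddef.h>
#include <sys/time.h>

/* Size in bytes of struct timeval as seen by this build. */
static size_t timeval_size(void)
{
    return sizeof(struct timeval);
}

/* Width in bits of tv_sec as seen by this build. */
static unsigned tv_sec_bits(void)
{
    struct timeval tv;

    /* sizeof does not evaluate its operand, so tv is never read. */
    return (unsigned)(sizeof(tv.tv_sec) * 8);
}
```

Printing both values from a build with makedumpfile's flags and from a build with crash's flags shows whether the two tools agree on the timestamp size. If they differ, the header layouts differ as well.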

But got another error:

WARNING: compressed kdump: invalid nr_cpus value: 0
Segmentation fault

A segfault can have many causes. A gdb backtrace (bt) would help with further debugging.

Any idea how to correctly handle the core dump with ARM + -D_TIME_BITS=64 ?

Author

zyxiaooo commented Apr 2, 2024

Thanks for the reply.

crash: diskdump / compressed kdump: cannot malloc block_size buffer

This happens because, due to the header mismatch, the block_size it reads is 0.

I guess (not tried) it should be CFLAGS="-D__USE_TIME_BITS64" in order to enable the 64-bit timestamp.

I also tried this but got the same error.

WARNING: compressed kdump: invalid nr_cpus value: 0
Segmentation fault

I think this is still a header mismatch, because nr_cpus is not 0 in the test core. I haven't had a chance to dig further, though.

Member

liutgnu commented Apr 3, 2024

Yeah, block_size == 0 is abnormal. The value comes from the disk_dump_header, which is written by makedumpfile. It would be best to take the vmcore, dump the disk_dump_header as hex, and verify whether the problem comes from makedumpfile or from the kernel itself.
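
A small sketch of such a hex dump, for cases where no hexdump tool is available on the target. The function names and the 512-byte cap are arbitrary choices for illustration; the code only prints raw bytes and makes no assumption about the header layout.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format up to 16 bytes of data, starting at offset, as
 * "OOOOOOOO  xx xx ..." into out. Returns the number of bytes formatted.
 */
static size_t format_hex_line(const unsigned char *data, size_t len,
                              size_t offset, char *out, size_t out_len)
{
    size_t n = 0;
    int pos;

    if (out_len == 0)
        return 0;
    out[0] = '\0';
    if (offset >= len)
        return 0;

    pos = snprintf(out, out_len, "%08zx ", offset);
    /* Each byte needs 3 characters plus room for the terminating NUL. */
    while (n < 16 && offset + n < len && pos > 0 &&
           (size_t)pos + 4 <= out_len) {
        pos += snprintf(out + pos, out_len - (size_t)pos, " %02x",
                        data[offset + n]);
        n++;
    }
    return n;
}

/* Print the first nbytes (capped at 512) of the file at path as hex. */
static int dump_header_hex(const char *path, size_t nbytes, FILE *to)
{
    unsigned char buf[512];
    char line[80];
    size_t got, off;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;
    if (nbytes > sizeof(buf))
        nbytes = sizeof(buf);
    got = fread(buf, 1, nbytes, fp);
    fclose(fp);

    for (off = 0; off < got; off += 16) {
        format_hex_line(buf, got, off, line, sizeof(line));
        fprintf(to, "%s\n", line);
    }
    return 0;
}
```

Dumping the first 512 bytes of a core written by each makedumpfile build and comparing the two outputs side by side should show where the extra bytes sit relative to the timestamp.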

Author

zyxiaooo commented Apr 3, 2024

With kernel 5.15 and a makedumpfile compiled with -D_TIME_BITS=64, I hexdumped the generated core header, and I can see that there are 12 extra bytes around the timestamp field.

With exactly the same kernel, and a makedumpfile compiled WITHOUT -D_TIME_BITS=64, everything works fine.

So I guess there is some issue in makedumpfile when it is built with that flag.
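
One possible way for a reader to cope with both header variants is to probe where a plausible block_size sits. The following is only a heuristic sketch, not something crash or makedumpfile is known to do. It assumes that block_size is a 4-byte integer located 4 bytes after the end of timestamp (offset 416 with 32-bit time, 428 with the 12-byte shift reported here), that the dump uses the host's byte order, and that a valid block_size is a power of two between 4 KiB and 64 KiB. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed offsets of block_size on 32-bit ARM; verify against your tree. */
#define BLOCK_SIZE_OFF_TIME32 416
#define BLOCK_SIZE_OFF_TIME64 428

enum header_layout { LAYOUT_UNKNOWN = 0, LAYOUT_TIME32, LAYOUT_TIME64 };

/* Assumption: block_size is a page size, a power of two in [4K, 64K]. */
static int plausible_block_size(int32_t v)
{
    return v >= 4096 && v <= 65536 && (v & (v - 1)) == 0;
}

/* Read a host-endian 32-bit value without alignment requirements. */
static int32_t read_i32(const unsigned char *buf, size_t off)
{
    int32_t v;

    memcpy(&v, buf + off, sizeof(v));
    return v;
}

/* Guess which timeval width the writer of this raw header used. */
static enum header_layout guess_layout(const unsigned char *hdr, size_t len)
{
    if (len >= BLOCK_SIZE_OFF_TIME32 + sizeof(int32_t) &&
        plausible_block_size(read_i32(hdr, BLOCK_SIZE_OFF_TIME32)))
        return LAYOUT_TIME32;
    if (len >= BLOCK_SIZE_OFF_TIME64 + sizeof(int32_t) &&
        plausible_block_size(read_i32(hdr, BLOCK_SIZE_OFF_TIME64)))
        return LAYOUT_TIME64;
    return LAYOUT_UNKNOWN;
}
```

The 32-bit offset is checked first because, in a header written with 64-bit time, that position would fall inside the timestamp, where a page-sized power of two is unlikely. A more robust fix would record the layout explicitly in the header, but that is outside what this thread establishes.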

I am not sure whether it is related, but note that we first generate the core in flat mode (makedumpfile -F -c) and then convert it back to non-flat mode (makedumpfile -R). I mention it in case the issue only occurs in this scenario.

Member

liutgnu commented Apr 3, 2024

I'm not sure either. Sorry, I cannot provide any further useful info.
