Hard fault in networking stack. #136
Comments
Hi, try increasing CONFIG_IOB_NBUFFERS and CONFIG_IOB_NCHAINS and let us know if it helps.
"R15 is inside _net_timedwait and R14 is inside nxsem_wait." In this case, you are looking at the VICTIM of the corruption, not the CULPRIT that caused the corruption. This classic failure occurs like this:
The usual fix, which works 80% of the time, is to increase the stack size of Task B, the CULPRIT, or sometimes Task A, the VICTIM. Task A's stack must be increased when Task A is both the VICTIM and the CULPRIT. This happens when Task A's stack is too small, so that when it is suspended, the critical stack area holding its current state lies outside its stack limits and in some other task's memory space. The failure scenario is the same as above except that Task B is not the CULPRIT. It innocently clobbers Task A's stack.
I guess I shouldn't look at bugs before having my coffee. PC and LR clearly make no sense in that order. I'll colour the stacks, check for depth and close the bug if it was that. Thanks.
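For anyone unfamiliar with stack colouring, here is a minimal host-side sketch of the idea, not NuttX's actual implementation. The stack is filled with a known pattern before the task runs, and the high-water mark is found later by counting how many words still hold the pattern. The names `stack_color` and `stack_words_used` are hypothetical, the `0xdeadbeef` fill value is an assumption for illustration, and the stack is assumed to grow downward from the highest address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed fill pattern for illustration; any unlikely value works. */
#define STACK_COLOR 0xdeadbeefu

/* Fill the whole stack region with the colour before the task starts. */
static void stack_color(uint32_t *base, size_t nwords)
{
  assert(base != NULL);
  for (size_t i = 0; i < nwords; i++)
    {
      base[i] = STACK_COLOR;
    }
}

/* Return the number of words that have been used, assuming the stack
 * grows downward from base[nwords - 1] toward base[0].  Words at the low
 * end that still hold the colour were never written by the task.
 */
static size_t stack_words_used(const uint32_t *base, size_t nwords)
{
  size_t untouched = 0;

  assert(base != NULL);
  while (untouched < nwords && base[untouched] == STACK_COLOR)
    {
      untouched++;
    }

  return nwords - untouched;
}
```

If the reported usage equals the full stack size, the colour has been overwritten all the way to the lowest word, which is a strong hint that the task has overflowed into its neighbour's memory.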
I assumed that you just reversed them. They do make perfect sense in the opposite order. But if the stack was corrupted when the task restarted, nothing in the register dump or stack dump may be correct.
I've had no hard faults in the last 20 hours or so, so that's good. After increasing the number of IOB buffers (which was the first thing I tried before posting the ticket) I still get a lot of "Failed to create new I/O buffer chain" errors while loading a page, but most of the time the content does appear to load correctly. Also, NuttX generates a lot of nerr output when sockets are closed from under it (e.g. by closing a web browser or hitting refresh before a page loads). These messages are strictly accurate but seem like a heavy response to normal behaviour, and with so much debug output they could mask other things that are happening. I'll close the bug later today. Thanks for the help.
Running the uIP-based web server on an STM32F4 with an Ethernet connection, I get errors and eventually a hard fault, provoked by loading multiple pages (opening multiple connections on port 80) in quick succession, e.g. by loading a web page that includes a couple of CSS files, a couple of JS files and a couple of images.
The errors are:
tcp_datahandler: ERROR: Failed to create new I/O buffer chain
This is from iob_tryalloc failing to allocate a buffer. It is repeated multiple times per page load. More rarely I see:
tcp_datahandler: ERROR: Failed to add data to the I/O buffer chain: -12
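To illustrate why both messages point at buffer exhaustion, here is a minimal sketch of a fixed-size buffer pool with a non-blocking allocator. It is not the NuttX IOB code: `pool_init`, `pool_tryalloc`, `chain_grow`, `chain_release` and `POOL_NBUFFERS` are hypothetical names, with `POOL_NBUFFERS` standing in for the role CONFIG_IOB_NBUFFERS plays. The -12 in the second message matches -ENOMEM, which is what the sketch returns when the pool is empty.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical pool size, standing in for a setting like
 * CONFIG_IOB_NBUFFERS.
 */
#define POOL_NBUFFERS 4

struct buf
{
  struct buf *next;          /* Link in the free list or in a chain */
  unsigned char data[64];    /* Payload area */
};

static struct buf g_pool[POOL_NBUFFERS];
static struct buf *g_free;

/* Put every buffer of the static pool on the free list. */
void pool_init(void)
{
  g_free = NULL;
  for (size_t i = 0; i < POOL_NBUFFERS; i++)
    {
      g_pool[i].next = g_free;
      g_free = &g_pool[i];
    }
}

/* Non-blocking allocation: returns NULL at once if the pool is empty. */
struct buf *pool_tryalloc(void)
{
  struct buf *b = g_free;

  if (b != NULL)
    {
      g_free = b->next;
      b->next = NULL;
    }

  return b;
}

/* Add one buffer to a chain, or report -ENOMEM if none is available. */
int chain_grow(struct buf **head)
{
  struct buf *b = pool_tryalloc();

  if (b == NULL)
    {
      return -ENOMEM;
    }

  b->next = *head;
  *head = b;
  return 0;
}

/* Return every buffer of a chain to the free list. */
void chain_release(struct buf **head)
{
  while (*head != NULL)
    {
      struct buf *b = *head;

      *head = b->next;
      b->next = g_free;
      g_free = b;
    }
}
```

With several connections each holding a chain, a small pool empties quickly and every further allocation fails until some chain is released, which would fit errors that repeat on each page load.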
Initially this resulted in most of the connections failing and httpd processes hanging, but after setting CONFIG_NET_TCPBACKLOG_CONNS to 8 the page almost always fully loads (connections succeed) despite the errors.
However, within a minute or two of repeatedly reloading pages, I get a hard fault.
R15 is inside _net_timedwait and R14 is inside nxsem_wait.
I've attached the map.
System.map.txt
Also my config:
defconfig.txt