Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve general TCP stability #3027

Closed
wants to merge 2 commits into from
Closed

Conversation

d-a-v
Copy link
Collaborator

@d-a-v d-a-v commented Mar 6, 2017

This commit on one hand allows to optionally use a per-tcp-connection
circular buffer to move received data out from esp internal link layer
incoming area, thus preventing most of the "LmacRxBlk:1" cases. The
circular buffer has to have a minimum size of lwip's TCP_WND (because TCP
window size ensures no more data can be received until the user consume a
byte).

On the other hand, lowering TCP_MSS lets esp handle smaller buffers (user
data are actually small on this device, so there is little effect). This of
course does not prevent sending large buffer at the cost of more tcp
packets. This constant is currently 1460 which is standard but needlessly
way too high for our small esp device.

To test the benefits of these changes, TCP_MSS is set to 256, a tcp service
sending back all received data is setup in a sketch, and clients on a PC are
started to send then check received data. Two clients are continuously
running and two others starts new sessions every some hundreds of bytes.
The shaken ESP has been running hours, no memory leak was detected, and an
average of 20-25 KBytes were still available from heap. The ESP
"LmacRxBlk:1" hangs at the very first second when not using the circular
buffers, even with a 256 bytes MSS.

A new global variable: "lwip_bufferize" >0 let the ESP use circular buffers
for every new tcp connection. A shell script is provided to automatically
change and recompile LWIP with a new MSS: "tools/sdk/lwip/src/LWIP-CHANGE-MSS".

This commit on one hand allows to optionally use a per-tcp-connection
circular buffer to move received data out from esp internal link layer
incoming area, thus preventing most of the "LmacRxBlk:1" cases.  The
circular buffer has to have a minimum size of lwip's TCP_WND (because TCP
window size ensures no more data can be received until the user consume a
byte).

On the other hand, lowering TCP_MSS lets esp handle smaller buffers (user
data are actually small on this device, so there is little effect).  This of
course does not prevent sending large buffer at the cost of more tcp
packets.  This constant is currently 1460 which is standard but needlessly
way too high for our small esp device.

To test the benefits of these changes, TCP_MSS is set to 256, a tcp service
sending back all received data is setup in a sketch, and clients on a PC are
started to send then check received data.  Two clients are continuously
running and two others starts new sessions every some hundreds of bytes.
The shaken ESP has been running hours, no memory leak was detected, and an
average of 20-25 KBytes were still available from heap.  The ESP
"LmacRxBlk:1" hangs at the very first second when not using the circular
buffers, even with a 256 bytes MSS.

A new global variable: "lwip_bufferize" >0 let the ESP use circular buffers
for every new tcp connection.  A shell script is provided to automatically
change and recompile LWIP with a new MSS: "tools/sdk/lwip/src/LWIP-CHANGE-MSS".
@codecov-io
Copy link

codecov-io commented Mar 6, 2017

Codecov Report

Merging #3027 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #3027   +/-   ##
=======================================
  Coverage   27.82%   27.82%           
=======================================
  Files          20       20           
  Lines        3626     3626           
  Branches      656      656           
=======================================
  Hits         1009     1009           
  Misses       2441     2441           
  Partials      176      176

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a01638f...e44a4c6. Read the comment docs.

@me-no-dev
Copy link
Collaborator

I can not possibly understand how sending 6 packets instead of 1 is better? Have you tested this low MSS through routers and external networks. What test have you actually done? Any throughput? Uploading/downloading files? OTA?
And BTW the ESP8266 has more than enough RAM to handle regular 1460 packets. The Async libs handle 5 upload/downloads at the same time.

@d-a-v
Copy link
Collaborator Author

d-a-v commented Mar 6, 2017

MSS is not the main deal here. I reduced it so to reduce the running memory footprint.

The tests I've done are explained above: the esp receives and sends back the same data (which is upload and download at the same time).
Not reducing MSS (1460) but using the circular receive buffer works, it hangs otherwise. This is an old behaviour I already had 2 years ago and solved by the same manner: getting data out from lwip's pbufs as soon as possible.

With 256 MSS, I get ~500Kbps (two ways so ~1Mbps overall). I just checked through internet (broadband@work to dsl@home with a route going all around the country: I currently get 100Kbps).
I was not focused on bandwidth but rather on a non-hanging scheme.

I had not the chance to play with examples using async libs. I will try asap.

@d-a-v
Copy link
Collaborator Author

d-a-v commented Mar 6, 2017

Hello,
OTA is not working if MSS is less than 1460.
To make this work, in tools/espota.py, using python's udp socket.send() in a loop rather than a single socket.sendall() (which offers no possible error checking) would do the trick and would not change performances in case of a legacy 1460 MSS.
Using sendAll() with file chunks of at most MSS works, but this is not consistant (I had to try anyway :)

edit: My bad. (TCP_)MSS has nothing to do with UDP. Still, the fact is that TCP_MSS value still influences the OTA udp services in the manner described above. No clue so far. This is a side discussion which does not invalidate this PR which is all about TCP even with a 1460 MSS.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants