Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply asio stackless coroutine. #794

Merged
merged 2 commits into from
Mar 12, 2021
Merged

Apply asio stackless coroutine. #794

merged 2 commits into from
Mar 12, 2021

Conversation

redboltz
Copy link
Owner

@redboltz redboltz commented Feb 23, 2021

Applied boost asio's stackless coroutine to process_mqttpacket functions.
I expected faster compile time but no significant difference observed.

maset 0a0c6a9

        Command being timed: "make no_tls_both"
        User time (seconds): 37.66
        System time (seconds): 1.81
        Percent of CPU this job got: 99%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:39.54
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 2980752
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 1013237
        Voluntary context switches: 62
        Involuntary context switches: 676
        Swaps: 0
        File system inputs: 0
        File system outputs: 141152
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

This PR

        Command being timed: "make no_tls_both"
        User time (seconds): 37.52
        System time (seconds): 1.71
        Percent of CPU this job got: 99%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:39.30
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 3005936
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 1030592
        Voluntary context switches: 62
        Involuntary context switches: 670
        Swaps: 0
        File system inputs: 0
        File system outputs: 155432
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

Time and memory consumption are not changed.

Object size after stripped becomes smaller.

-rwxr-xr-x 1 kondo   892272  2月 23 20:28 coro*
-rwxr-xr-x 1 kondo   970096  2月 23 20:28 master*

I guess that it is caused by reducing template instantiation by each phase.
But it doesn't make effects to compile times.

I don't decide yet the PR should be merged.
The parsing code looks more straight forward. It is good.

I also checked run time performance by continuous publish. There are no big difference. It is expected result.

@codecov
Copy link

codecov bot commented Feb 23, 2021

Codecov Report

Merging #794 (801a25e) into master (25ec9f8) will decrease coverage by 0.03%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #794      +/-   ##
==========================================
- Coverage   84.07%   84.04%   -0.04%     
==========================================
  Files          63       63              
  Lines        8599     8423     -176     
==========================================
- Hits         7230     7079     -151     
+ Misses       1369     1344      -25     

@jonesmz
Copy link
Contributor

jonesmz commented Feb 23, 2021

A few thoughts.

  1. if you move the parsing functions out of the endpoint class entirely, you will probably get smaller code because of fewer nested templates.

  2. the boot asio macro based co routines are more complex to write, but may have smaller code size, as they are only a swtch-case and integer.

Good luck!

@redboltz redboltz force-pushed the apply_coroutine branch 3 times, most recently from 815ba33 to 2ce8e09 Compare March 11, 2021 11:50
@redboltz redboltz force-pushed the apply_coroutine branch 2 times, most recently from 7abc20e to 525b5d7 Compare March 12, 2021 05:20
@redboltz
Copy link
Owner Author

Thank you for the comment.

A few thoughts.

  1. if you move the parsing functions out of the endpoint class entirely, you will probably get smaller code because of fewer nested templates.

Yes but there are only four variation of the template instantiation. Those are mqtt, mqtts, ws, and wss.
I tested them locally but no significant compile times difference are observed.

  1. the boot asio macro based co routines are more complex to write, but may have smaller code size, as they are only a swtch-case and integer.

The keyword yield introduced by yield.hpp is the same as the macro BOOST_ASIO_CORO_YIELD.

https://www.boost.org/doc/libs/1_75_0/doc/html/boost_asio/reference/coroutine.html#boost_asio.reference.coroutine.alternate_macro_names
https://github.com/boostorg/asio/blob/boost-1.75.0/include/boost/asio/yield.hpp
https://github.com/boostorg/asio/blob/boost-1.75.0/include/boost/asio/unyield.hpp

So, the PR has already used macro based (simple switch case) stackless coroutine.

By the way, I ran the CI of the PR several times, then the total time of each task became shorter.
I guess that total ( I mean something like integration) memory consumption might reduced and it helps parallel build performance, even if the peak memory consumption is similar.

I will merge the PR soon.

@redboltz
Copy link
Owner Author

We've tested the performance of coroutine version using our proprietary broker.
No significant runtime performance difference from non coroutine version is observed.
So I merge the PR.

@redboltz redboltz merged commit 9393386 into master Mar 12, 2021
@redboltz redboltz deleted the apply_coroutine branch March 12, 2021 11:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants