
Decrease CPU usage #61 (Closed)
hakavlad opened this issue Jul 4, 2018 · 37 comments

hakavlad (Contributor) commented Jul 4, 2018

There is no sense in checking memory 10 times per second when more than a gigabyte is still available. You can adapt the interval between memory checks to reduce their frequency and decrease CPU usage. Currently the sleep period is a fixed 0.1 s. You can compute the time until the next check as follows:

t1 = MemAvailable / 4000000
t2 = SwapFree / 1000000
sleep_until_next_mem_check = t1 + t2
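For illustration, a minimal Python sketch of such a loop (the /proc/meminfo parsing and the loop structure are illustrative assumptions, not code from either project; only the two divisors come from the formula above):

import time

def read_meminfo():
    """Return (MemAvailable, SwapFree) from /proc/meminfo, in kB."""
    fields = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, value = line.split(":", 1)
            fields[key] = int(value.split()[0])  # values look like "123456 kB"
    return fields["MemAvailable"], fields["SwapFree"]

while True:
    mem_available, swap_free = read_meminfo()
    # ... compare against the kill/terminate thresholds and act here ...
    t1 = mem_available / 4000000  # seconds of headroom granted by free RAM
    t2 = swap_free / 1000000      # seconds of headroom granted by free swap
    time.sleep(t1 + t2)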

I implemented a similar algorithm in nohang, an OOM preventer written in Python. If nohang checked memory 10 times per second, its CPU usage would be too high. With this algorithm for calculating the period between memory checks, CPU usage is significant only when available memory is low. When plenty of memory is available, nohang can use even less CPU than earlyoom. Demo: https://youtu.be/8vjeolxw7Uo

I think it's possible to improve earlyoom in the same way. In most cases it will reduce the CPU usage by an order of magnitude.

rfjakob (Owner) commented Jul 4, 2018

I agree that could be a good idea. However, the check uses very little CPU already, so I'm not sure if it's worth it. How much CPU usage do you see from earlyoom?

rfjakob (Owner) commented Jul 4, 2018

The situation may be different on embedded systems though. These have much weaker CPUs.

hakavlad (Contributor Author) commented Jul 4, 2018

IMHO such an optimization will not make things worse in any case.

earlyoom uses 0-0.7% CPU on my Pentium B960.
If I run earlyoom in a VM with 1 thread, it consistently uses 0.5% CPU.

rfjakob (Owner) commented Jul 4, 2018

There is a downside: longer reaction time

hakavlad (Contributor Author) commented Jul 4, 2018

The reaction time will be sufficient to prevent OOM because the sleep period decreases as the available memory decreases. This is proven in practice: the algorithm is used by nohang and works great.

Below is nohang's output when the sleep time is calculated as follows:
sleep_between_mem_checks = mem_avail / 4000000 + swap_free / 2000000

I executed tail /dev/zero without swap and got the following nohang output:

MemAvail: 4919 M, 83.7 %
sleep 1.26
MemAvail: 4919 M, 83.7 %
sleep 1.26
MemAvail: 3552 M, 60.5 %
sleep 0.91
MemAvail: 2027 M, 34.5 %
sleep 0.52
MemAvail: 1150 M, 19.6 %
sleep 0.29
MemAvail:  649 M, 11.0 %
sleep 0.17
MemAvail:  370 M,  6.3 %

2018-07-04 Wed 22:51:17
  MemAvailable (370 MiB, 6.3 %) < mem_min_sigterm (470 MiB, 8.0 %)
  SwapFree (0 MiB, 0.0 %) < swap_min_sigterm (0 MiB, - %)
  Preventing OOM: trying to send the SIGTERM signal to tail,
  Pid: 16601, Badness: 779, VmRSS: 4574 MiB, VmSwap: 0 MiB
  Success
  sleep 0.5
MemAvail: 4914 M, 83.6 %
sleep 1.26
MemAvail: 4898 M, 83.4 %
sleep 1.25
MemAvail: 4897 M, 83.3 %
sleep 1.25

Below is nohang's output when I execute stress -m 2 --vm-bytes 4G:

MemAvail: 4912 M, 83.6 %
sleep 1.26
MemAvail: 4914 M, 83.6 %
sleep 1.26
MemAvail: 2491 M, 42.4 %
sleep 0.64
MemAvail:    0 M,  0.0 %

2018-07-04 Wed 22:53:46
  MemAvailable (0 MiB, 0.0 %) < mem_min_sigkill (235 MiB, 4.0 %)
  SwapFree (0 MiB, 0.0 %) < swap_min_sigkill (0 MiB, - %)
  Preventing OOM: trying to send the SIGKILL signal to stress,
  Pid: 16675, Badness: 446, VmRSS: 2618 MiB, VmSwap: 0 MiB
  Success
  sleep 3.0
MemAvail: 4984 M, 84.8 %
sleep 1.28
MemAvail: 4978 M, 84.7 %
sleep 1.27

And stress -m 2 --vm-bytes 4G output

stress: info: [16674] dispatching hogs: 0 cpu, 0 io, 2 vm, 0 hdd
stress: FAIL: [16674] (415) <-- worker 16675 got signal 9
stress: WARN: [16674] (417) now reaping child worker processes
stress: FAIL: [16674] (451) failed run completed in 1s

OOM was prevented without any problems.

stress -m 4 --vm-bytes 4G output without swap

MemAvail: 4804 M, 81.8 %
sleep 1.23
MemAvail: 4803 M, 81.8 %
sleep 1.23
MemAvail: 1686 M, 28.7 %
sleep 0.43
MemAvail:   23 M,  0.4 %

2018-07-04 Wed 22:39:08
  MemAvailable (23 MiB, 0.4 %) < mem_min_sigkill (235 MiB, 4.0 %)
  SwapFree (0 MiB, 0.0 %) < swap_min_sigkill (0 MiB, - %)
  Preventing OOM: trying to send the SIGKILL signal to stress,
  Pid: 16536, Badness: 217, VmRSS: 1275 MiB, VmSwap: 0 MiB
  Success
  sleep 3.0
MemAvail:  116 M,  2.0 %

2018-07-04 Wed 22:39:09
  MemAvailable (116 MiB, 2.0 %) < mem_min_sigkill (235 MiB, 4.0 %)
  SwapFree (0 MiB, 0.0 %) < swap_min_sigkill (0 MiB, - %)
  Preventing OOM: trying to send the SIGKILL signal to stress,
  Pid: 16538, Badness: 290, VmRSS: 1705 MiB, VmSwap: 0 MiB
  Success
  sleep 3.0
MemAvail: 4960 M, 84.4 %
sleep 1.27
MemAvail: 4951 M, 84.3 %
sleep 1.27

and stress -m 4 --vm-bytes 4G output

stress: info: [16534] dispatching hogs: 0 cpu, 0 io, 4 vm, 0 hdd
stress: FAIL: [16534] (415) <-- worker 16536 got signal 9
stress: WARN: [16534] (417) now reaping child worker processes
stress: FAIL: [16534] (415) <-- worker 16538 got signal 9
stress: WARN: [16534] (417) now reaping child worker processes
stress: FAIL: [16534] (451) failed run completed in 3s

hakavlad (Contributor Author) commented Jul 4, 2018

Nohang output when executing tail /dev/zero with swap:

MemAvail: 4873 M, 82.9 % | SwapFree: 5875 M, 100.0 %
sleep 4.26
MemAvail: 4873 M, 82.9 % | SwapFree: 5875 M, 100.0 %
sleep 4.26
MemAvail: 2063 M, 35.1 % | SwapFree: 5875 M, 100.0 %
sleep 3.54
MemAvail:    0 M,  0.0 % | SwapFree: 4524 M,  77.0 %
sleep 2.32
MemAvail:    0 M,  0.0 % | SwapFree: 2857 M,  48.6 %
sleep 1.46
MemAvail:    0 M,  0.0 % | SwapFree: 1780 M,  30.3 %
sleep 0.91
MemAvail:    0 M,  0.0 % | SwapFree: 1112 M,  18.9 %
sleep 0.57
MemAvail:    0 M,  0.0 % | SwapFree:  628 M,  10.7 %
sleep 0.32
MemAvail:    0 M,  0.0 % | SwapFree:  416 M,   7.1 %

2018-07-04 Wed 23:05:51
  MemAvailable (0 MiB, 0.0 %) < mem_min_sigterm (470 MiB, 8.0 %)
  SwapFree (416 MiB, 7.1 %) < swap_min_sigterm (470 MiB, 8.0 %)
  Preventing OOM: trying to send the SIGTERM signal to tail,
  Pid: 16775, Badness: 865, VmRSS: 5038 MiB, VmSwap: 5117 MiB
  Success
  sleep 0.5
MemAvail:  168 M,  2.9 % | SwapFree: 3271 M,  55.7 %
sleep 1.72
MemAvail: 5183 M, 88.2 % | SwapFree: 5476 M,  93.2 %
sleep 4.13

Nohang output when executing stress -m 4 --vm-bytes 10G with swap:

MemAvail: 5074 M, 86.4 % | SwapFree: 5514 M,  93.9 %
sleep 4.12
MemAvail: 5070 M, 86.3 % | SwapFree: 5515 M,  93.9 %
sleep 4.12
MemAvail:    0 M,  0.0 % | SwapFree: 3793 M,  64.6 %
sleep 1.94
MemAvail:    0 M,  0.0 % | SwapFree: 1998 M,  34.0 %
sleep 1.02
MemAvail:    0 M,  0.0 % | SwapFree: 1037 M,  17.6 %
sleep 0.53
MemAvail:    0 M,  0.0 % | SwapFree:  436 M,   7.4 %

2018-07-04 Wed 23:08:05
  MemAvailable (0 MiB, 0.0 %) < mem_min_sigterm (470 MiB, 8.0 %)
  SwapFree (436 MiB, 7.4 %) < swap_min_sigterm (470 MiB, 8.0 %)
  Preventing OOM: trying to send the SIGTERM signal to stress,
  Pid: 16843, Badness: 227, VmRSS: 1279 MiB, VmSwap: 1393 MiB
  Success
  sleep 0.5
MemAvail:    0 M,  0.0 % | SwapFree:  911 M,  15.5 %
sleep 0.47
MemAvail:  805 M, 13.7 % | SwapFree: 1573 M,  26.8 %
sleep 1.01
MemAvail: 4544 M, 77.3 % | SwapFree: 5473 M,  93.2 %
sleep 3.97
MemAvail: 5151 M, 87.7 % | SwapFree: 5478 M,  93.2 %
sleep 4.12

and stress -m 4 --vm-bytes 10G output

stress: info: [16840] dispatching hogs: 0 cpu, 0 io, 4 vm, 0 hdd
stress: FAIL: [16840] (415) <-- worker 16843 got signal 15
stress: WARN: [16840] (417) now reaping child worker processes
stress: FAIL: [16840] (451) failed run completed in 9s

OOM was completely prevented.

This is a very important improvement.

As you can see, a 0.1 s sleep period is far shorter than necessary for me. I would like you to adopt this algorithm and add CLI options for changing the mem/swap rates.

rfjakob (Owner) commented Jul 4, 2018

tail /dev/zero without swap?

hakavlad (Contributor Author) commented Jul 4, 2018

Both with swap and without swap.

rfjakob (Owner) commented Jul 4, 2018

Ah, in the other post. Sorry, missed that!

hakavlad (Contributor Author) commented Jul 4, 2018

while true; do tail /dev/zero; done without swap, with nohang: https://youtu.be/DefJBaKD7C8

rfjakob (Owner) commented Jul 6, 2018

The behavior in your tests looks very good.

However, I would like the algorithm to be easier for the user to predict. How about just dropping the poll rate from 10 Hz to 1 Hz whenever

(available RAM + free swap) > 1 GB?
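A sketch of that two-state variant, assuming the kB units that /proc/meminfo reports (the 1 GB threshold and the 1 Hz / 10 Hz rates come from the proposal above; the function itself is only an illustration):

GIB_IN_KB = 1024 * 1024  # 1 GiB expressed in kB, close enough to the 1 GB in the proposal

def poll_interval(mem_available_kb, swap_free_kb):
    # Two-state controller: poll at 1 Hz while there is plenty of headroom,
    # switch back to 10 Hz once RAM + swap drop below the threshold.
    if mem_available_kb + swap_free_kb > GIB_IN_KB:
        return 1.0  # seconds between checks
    return 0.1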

rfjakob (Owner) commented Jul 6, 2018

In control theory terms, your algorithm is a continuous controller, and with those you usually need extra checks to keep the value from going too low or too high. So we have four values to adjust:

  • swap multiplier
  • ram multiplier
  • lower sleep limit
  • upper sleep limit

On the other hand, a two-state controller needs:

  • lower sleep value
  • upper sleep value
  • switching threshold

But I have to admit you could say the switching threshold is actually two values, because it counts RAM and swap with a multiplier of one. Hmm.
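For concreteness, a sketch of the four-value continuous variant with clamping (the multipliers and limits here are placeholders for illustration, not values either project ships with):

RATE_MEM = 4000000   # assumed worst-case RAM fill rate, kB/s
RATE_SWAP = 1000000  # assumed worst-case swap fill rate, kB/s
SLEEP_MIN = 0.1      # lower sleep limit, seconds
SLEEP_MAX = 5.0      # upper sleep limit, seconds

def continuous_interval(mem_available_kb, swap_free_kb):
    t = mem_available_kb / RATE_MEM + swap_free_kb / RATE_SWAP
    return min(SLEEP_MAX, max(SLEEP_MIN, t))  # clamp to [SLEEP_MIN, SLEEP_MAX]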

Did you use an upper and lower limit in nohang?

hakavlad (Contributor Author) commented Jul 6, 2018

rate_mem = 6000000
rate_swap = 3000000

mem_min_sigterm = 10 %
mem_min_sigkill = 6 %

swap_min_sigterm = 10 %
swap_min_sigkill = 6 %

No more.
See https://github.com/hakavlad/nohang/blob/master/nohang.conf
(translation coming soon)

hakavlad (Contributor Author) commented Jul 6, 2018

I would like the algorithm to be easier to predict for the user.

MA/1000000 means the OOM preventer has time to prevent OOM as long as MemAvailable decreases no faster than 1,000,000 kB/s. It is easy to predict.
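As an illustrative calculation: with MemAvailable = 2,000,000 kB and a rate of 1,000,000 kB/s, the next check comes 2 s later, and a process would have to allocate those 2,000,000 kB faster than 1,000,000 kB/s to exhaust memory before the checker wakes up again.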

hakavlad (Contributor Author) commented Jul 9, 2018

You could also add a monitoring-intensity option so that memory is checked less often on systems with stable memory usage (embedded systems, servers).

hakavlad (Contributor Author) commented Jul 9, 2018

(available ram + free swap) > 1gb

Maybe better than a fixed 0.1 s, but worse than MemAv / rate.

rfjakob (Owner) commented Jul 9, 2018

I added a small test tool, membomb, to find out how fast RAM and Swap can be depleted.

This is on a Pentium G630 from 2011, with 8 GB DDR3 RAM, and 1 GB of swap on an SSD:

# Make sure swap is clean
$ sudo swapoff -a && sudo swapon /swapfile

$ ./membomb 
  39 MiB (2112 MiB/s)
  78 MiB (2344 MiB/s)
 117 MiB (2453 MiB/s)
[...]
4257 MiB (2377 MiB/s)
4296 MiB (2536 MiB/s)
4335 MiB (2549 MiB/s)
4375 MiB (2548 MiB/s)
4414 MiB (2551 MiB/s)
4453 MiB (2246 MiB/s)
4492 MiB (  63 MiB/s) ### <--- RAM is full, swapping starts
4531 MiB (  35 MiB/s)
4570 MiB (  51 MiB/s)
4609 MiB (  63 MiB/s)
4648 MiB (  84 MiB/s)
4687 MiB ( 236 MiB/s)
4726 MiB ( 425 MiB/s)
4765 MiB ( 178 MiB/s)
4804 MiB ( 312 MiB/s)
4843 MiB ( 125 MiB/s)
4882 MiB ( 251 MiB/s)
4921 MiB ( 370 MiB/s)
4960 MiB ( 723 MiB/s)
5000 MiB ( 267 MiB/s)
5039 MiB ( 225 MiB/s)
5078 MiB ( 187 MiB/s)
5117 MiB ( 393 MiB/s)
5156 MiB ( 383 MiB/s)
5195 MiB ( 879 MiB/s)
5234 MiB ( 390 MiB/s)
5273 MiB (1548 MiB/s)
5312 MiB ( 718 MiB/s)
5351 MiB ( 384 MiB/s)
5390 MiB ( 689 MiB/s)
5429 MiB ( 274 MiB/s)
5468 MiB ( 146 MiB/s)
5507 MiB ( 133 MiB/s)
5546 MiB (  91 MiB/s)
5585 MiB (  58 MiB/s)
Killed                 ### <--- killed by earlyoom

hakavlad (Contributor Author) commented Jul 9, 2018

With a constant 0.1 s poll interval?

rfjakob (Owner) commented Jul 9, 2018

Yes

rfjakob (Owner) commented Jul 9, 2018

I noticed a problem with swap enabled: sometimes membomb is killed, and 0.1 s later a Chrome tab is killed, because memory is still low.

It seems the kernel needs more than 0.1 s to clean up the process.

hakavlad (Contributor Author) commented Jul 9, 2018

later a chrome tab is killed, because the memory is still low.

That's why I use a minimum delay after all SIGKILLs in nohang.

    V. PREVENTION OF KILLING INNOCENT VICTIMS

    Valid values are integers from the range [0; 1000].

oom_score_min = 20

    Valid values are non-negative floating-point numbers.

min_delay_after_sigterm = 0.5
min_delay_after_sigkill = 3
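A sketch of how such minimum delays fit into the checking loop (the constants mirror the config values above; everything else is an illustrative assumption, not nohang's actual code):

import signal
import time

MIN_DELAY_AFTER_SIGTERM = 0.5  # seconds, from the config above
MIN_DELAY_AFTER_SIGKILL = 3.0

def pause_after_signal(sig):
    # After sending any signal, pause before the next memory check so the
    # kernel has time to reclaim the victim's memory; otherwise the next
    # check still sees low memory and an innocent process gets killed too.
    if sig == signal.SIGKILL:
        time.sleep(MIN_DELAY_AFTER_SIGKILL)
    else:
        time.sleep(MIN_DELAY_AFTER_SIGTERM)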

hakavlad (Contributor Author) commented Jul 9, 2018

The delay after sending a signal should follow any SIGKILL, not only a failed one.

hakavlad (Contributor Author) commented Jul 9, 2018

Mem 5.7 GiB, Zram swap 5.7 GiB
t = MemAv / 6000000 + SwFree / 3000000
SIGKILL 10%

4296 MiB (2220 MiB/s)
4335 MiB (2240 MiB/s)
4375 MiB (2277 MiB/s)
4414 MiB (2252 MiB/s)
4453 MiB (2227 MiB/s)
4492 MiB (2249 MiB/s)
4531 MiB (2233 MiB/s)
4570 MiB (2242 MiB/s)
4609 MiB (2247 MiB/s)
4648 MiB (2121 MiB/s)
4687 MiB (2248 MiB/s)
4726 MiB (2243 MiB/s)
4765 MiB (2221 MiB/s)
4804 MiB (2190 MiB/s)
4843 MiB (2246 MiB/s)
4882 MiB (2245 MiB/s)
4921 MiB (2240 MiB/s)
4960 MiB (2090 MiB/s)
5000 MiB (1649 MiB/s)
5039 MiB ( 409 MiB/s)
5078 MiB ( 446 MiB/s)
5117 MiB ( 503 MiB/s)
5156 MiB ( 711 MiB/s)
5195 MiB ( 762 MiB/s)
5234 MiB ( 784 MiB/s)
5273 MiB ( 778 MiB/s)
5312 MiB ( 700 MiB/s)
5351 MiB ( 760 MiB/s)
5390 MiB ( 663 MiB/s)
5429 MiB ( 686 MiB/s)
5468 MiB ( 737 MiB/s)
5507 MiB ( 775 MiB/s)
5546 MiB ( 795 MiB/s)
5585 MiB ( 768 MiB/s)
5625 MiB ( 778 MiB/s)
5664 MiB ( 747 MiB/s)
5703 MiB ( 763 MiB/s)
5742 MiB ( 747 MiB/s)
5781 MiB ( 729 MiB/s)
5820 MiB ( 765 MiB/s)
5859 MiB ( 789 MiB/s)
5898 MiB ( 728 MiB/s)
5937 MiB ( 779 MiB/s)
5976 MiB ( 796 MiB/s)
6015 MiB ( 783 MiB/s)
6054 MiB ( 770 MiB/s)
6093 MiB ( 791 MiB/s)
6132 MiB ( 792 MiB/s)
6171 MiB ( 768 MiB/s)
6210 MiB ( 762 MiB/s)
6250 MiB ( 785 MiB/s)
6289 MiB ( 786 MiB/s)
6328 MiB ( 773 MiB/s)
6367 MiB ( 748 MiB/s)
6406 MiB ( 700 MiB/s)
6445 MiB ( 773 MiB/s)
6484 MiB ( 751 MiB/s)
6523 MiB ( 701 MiB/s)
6562 MiB ( 744 MiB/s)
6601 MiB ( 738 MiB/s)
6640 MiB ( 741 MiB/s)
6679 MiB ( 700 MiB/s)
6718 MiB ( 728 MiB/s)
6757 MiB ( 755 MiB/s)
6796 MiB ( 748 MiB/s)
6835 MiB ( 730 MiB/s)
6875 MiB ( 595 MiB/s)
6914 MiB ( 726 MiB/s)
6953 MiB ( 725 MiB/s)
6992 MiB ( 706 MiB/s)
7031 MiB ( 773 MiB/s)
7070 MiB ( 793 MiB/s)
7109 MiB ( 712 MiB/s)
7148 MiB ( 759 MiB/s)
7187 MiB ( 778 MiB/s)
7226 MiB ( 766 MiB/s)
7265 MiB ( 775 MiB/s)
7304 MiB ( 771 MiB/s)
7343 MiB ( 776 MiB/s)
7382 MiB ( 787 MiB/s)
7421 MiB ( 786 MiB/s)
7460 MiB ( 772 MiB/s)
7500 MiB ( 799 MiB/s)
7539 MiB ( 744 MiB/s)
7578 MiB ( 770 MiB/s)
7617 MiB ( 719 MiB/s)
7656 MiB ( 773 MiB/s)
7695 MiB ( 785 MiB/s)
7734 MiB ( 775 MiB/s)
7773 MiB ( 774 MiB/s)
7812 MiB ( 788 MiB/s)
7851 MiB ( 774 MiB/s)
7890 MiB ( 694 MiB/s)
7929 MiB ( 781 MiB/s)
7968 MiB ( 779 MiB/s)
8007 MiB ( 783 MiB/s)
8046 MiB ( 692 MiB/s)
8085 MiB ( 791 MiB/s)
8125 MiB ( 804 MiB/s)
8164 MiB ( 748 MiB/s)
8203 MiB ( 811 MiB/s)
8242 MiB ( 781 MiB/s)
8281 MiB ( 770 MiB/s)
8320 MiB ( 781 MiB/s)
8359 MiB ( 749 MiB/s)
8398 MiB ( 777 MiB/s)
8437 MiB ( 743 MiB/s)
8476 MiB ( 725 MiB/s)
8515 MiB ( 795 MiB/s)
8554 MiB ( 734 MiB/s)
8593 MiB ( 693 MiB/s)
8632 MiB ( 724 MiB/s)
8671 MiB ( 779 MiB/s)
8710 MiB ( 780 MiB/s)
8750 MiB ( 800 MiB/s)
8789 MiB ( 768 MiB/s)
8828 MiB ( 724 MiB/s)
8867 MiB ( 774 MiB/s)
8906 MiB ( 746 MiB/s)
8945 MiB ( 742 MiB/s)
8984 MiB ( 795 MiB/s)
9023 MiB ( 784 MiB/s)
9062 MiB ( 761 MiB/s)
9101 MiB ( 702 MiB/s)
9140 MiB ( 770 MiB/s)
9179 MiB ( 733 MiB/s)
9218 MiB ( 750 MiB/s)
9257 MiB ( 737 MiB/s)
9296 MiB ( 790 MiB/s)
9335 MiB ( 790 MiB/s)
9375 MiB ( 794 MiB/s)
9414 MiB ( 756 MiB/s)
9453 MiB ( 798 MiB/s)
9492 MiB ( 806 MiB/s)
9531 MiB ( 734 MiB/s)
Killed                          ### <--- killed by nohang
2018-07-10 Tue 06:58:43
  MemAvailable (0 MiB, 0.0 %) < mem_min_sigkill (588 MiB, 10.0 %)
  SwapFree (534 MiB, 9.1 %) < swap_min_sigkill (588 MiB, 10.0 %)
  Preventing OOM: trying to send the SIGKILL signal to membomb,
  Pid: 22476, Badness: 815, VmRSS: 4988 MiB, VmSwap: 4580 MiB
  Success

hakavlad (Contributor Author) commented Jul 9, 2018

Mem 5.7 GiB, Zram swap 5.7 GiB
t = MemAv / 6000000 + SwFree / 3000000
10% SIGTERM, 5% SIGKILL

8242 MiB ( 801 MiB/s)
8281 MiB ( 792 MiB/s)
8320 MiB ( 817 MiB/s)
8359 MiB ( 762 MiB/s)
8398 MiB ( 743 MiB/s)
8437 MiB ( 758 MiB/s)
8476 MiB ( 804 MiB/s)
8515 MiB ( 830 MiB/s)
8554 MiB ( 840 MiB/s)
8593 MiB ( 793 MiB/s)
8632 MiB ( 833 MiB/s)
8671 MiB ( 823 MiB/s)
8710 MiB ( 733 MiB/s)
Terminated                      ###### Terminated by nohang
2018-07-10 Tue 07:11:29
  MemAvailable (0 MiB, 0.0 %) < mem_min_sigterm (588 MiB, 10.0 %)
  SwapFree (469 MiB, 8.0 %) < swap_min_sigterm (588 MiB, 10.0 %)
  Preventing OOM: trying to send the SIGTERM signal to membomb,
  Pid: 22808, Badness: 743, VmRSS: 4693 MiB, VmSwap: 4023 MiB
  Success

rfjakob (Owner) commented Jul 10, 2018

Delay after send signal should be after any SIGKILL, not only after fail.

I wonder if we can do better than just wait (for how long?). Wait until we see the memory usage drop?
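One way to "wait until we see the memory usage drop", sketched as an illustration (it reuses the read_meminfo() helper from the sketch earlier in this thread; the 10 % margin and the timeout are arbitrary choices, not anything either project implements):

import time

def wait_for_memory_to_recover(baseline_kb, timeout=2.0, poll=0.05):
    # Poll until MemAvailable rises noticeably above the level measured just
    # before the kill, or give up after `timeout` seconds.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        mem_available_kb, _ = read_meminfo()
        if mem_available_kb > baseline_kb * 1.1:
            return True
        time.sleep(poll)
    return False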

hakavlad (Contributor Author) commented:

So far I have not been able to think of anything better. Simple delays do not seem like a bad idea.

rfjakob (Owner) commented Jul 10, 2018

I have added an extra 200ms sleep after a kill, but only if swap is enabled (commit). I have never seen that happen without swap.

hakavlad (Contributor Author) commented:

****
  MemAvailable (464 MiB, 7.9 %) < mem_min_sigterm (529 MiB, 9.0 %)
  SwapFree (0 MiB, 0.0 %) < swap_min_sigterm (0 MiB, - %)
  Preventing OOM: trying to send the SIGTERM signal to tail,
  Pid: 16174, Badness: 762, VmRSS: 4470 MiB, VmSwap: 0 MiB
  Success
****
  MemAvailable (448 MiB, 7.6 %) < mem_min_sigterm (529 MiB, 9.0 %)
  SwapFree (0 MiB, 0.0 %) < swap_min_sigterm (0 MiB, - %)
  oom_score 9 < oom_score_min 15

It happens without swap too, and when the delay is 0. The minimum delay should be > 0 even if swap is disabled.

rfjakob (Owner) commented Jul 11, 2018

Do you also see this with SIGKILL?

hakavlad (Contributor Author) commented Jul 11, 2018

Yes, I saw this with SIGKILL.

hakavlad (Contributor Author) commented Jul 11, 2018

while true; do ./membomb; done

2% SIGTERM, 1% SIGKILL

t = MA/6000000

MemAvail:  274 M,  4.7 %
MemAvail:  174 M,  3.0 %
MemAvail:  109 M,  1.9 %
* MemAvailable (109 MiB, 1.9 %) < mem_min_sigterm (118 MiB, 2.0 %)
  SwapFree (0 MiB, 0.0 %) < swap_min_sigterm (0 MiB, - %)
  Preventing OOM: trying to send the SIGTERM signal to membomb,
  Pid: 16021, Badness: 806, VmRSS: 4729 MiB, VmSwap: 0 MiB
  Success; reaction time: 20 ms
MemAvail: 4305 M, 73.3 %
MemAvail: 2765 M, 47.1 %
MemAvail: 1751 M, 29.8 %
MemAvail: 1103 M, 18.8 %
MemAvail:  693 M, 11.8 %
MemAvail:  435 M,  7.4 %
MemAvail:  273 M,  4.6 %
MemAvail:  173 M,  2.9 %
MemAvail:  108 M,  1.8 %
* MemAvailable (108 MiB, 1.8 %) < mem_min_sigterm (118 MiB, 2.0 %)
  SwapFree (0 MiB, 0.0 %) < swap_min_sigterm (0 MiB, - %)
  Preventing OOM: trying to send the SIGTERM signal to membomb,
  Pid: 16022, Badness: 806, VmRSS: 4730 MiB, VmSwap: 0 MiB
  Success; reaction time: 20 ms
MemAvail: 4297 M, 73.1 %
MemAvail: 2762 M, 47.0 %
MemAvail: 1756 M, 29.9 %
MemAvail: 1107 M, 18.8 %
MemAvail:  704 M, 12.0 %
MemAvail:  446 M,  7.6 %
MemAvail:  279 M,  4.7 %
MemAvail:  177 M,  3.0 %
MemAvail:  111 M,  1.9 %
* MemAvailable (111 MiB, 1.9 %) < mem_min_sigterm (118 MiB, 2.0 %)
  SwapFree (0 MiB, 0.0 %) < swap_min_sigterm (0 MiB, - %)
  Preventing OOM: trying to send the SIGTERM signal to membomb,
  Pid: 16023, Badness: 805, VmRSS: 4727 MiB, VmSwap: 0 MiB
  Success; reaction time: 20 ms
MemAvail: 4315 M, 73.5 %
MemAvail: 2788 M, 47.5 %
MemAvail: 1775 M, 30.2 %
MemAvail: 1120 M, 19.1 %

Are you going to accept the new sleep-time algorithm?

PS. Nice new output in 1.1!

hakavlad (Contributor Author) commented Jul 11, 2018

while true; do stress -m 4 --vm-bytes 3G; done

MemAvail:  848 M, 14.4 %
MemAvail:  216 M,  3.7 %
MemAvail:   94 M,  1.6 %
* MemAvailable (94 MiB, 1.6 %) < mem_min_sigterm (118 MiB, 2.0 %)
  SwapFree (0 MiB, 0.0 %) < swap_min_sigterm (0 MiB, - %)
  Preventing OOM: trying to send the SIGTERM signal to stress,
  Pid: 16244, Badness: 195, VmRSS: 1147 MiB, VmSwap: 0 MiB
  Success; reaction time: 21 ms
MemAvail: 3144 M, 53.5 %
MemAvail:  814 M, 13.9 %
MemAvail:  208 M,  3.5 %
MemAvail:   88 M,  1.5 %
* MemAvailable (88 MiB, 1.5 %) < mem_min_sigterm (118 MiB, 2.0 %)
  SwapFree (0 MiB, 0.0 %) < swap_min_sigterm (0 MiB, - %)
  Preventing OOM: trying to send the SIGTERM signal to stress,
  Pid: 16248, Badness: 200, VmRSS: 1177 MiB, VmSwap: 0 MiB
  Success; reaction time: 21 ms
MemAvail: 3089 M, 52.6 %
MemAvail:  806 M, 13.7 %
MemAvail:  215 M,  3.7 %
MemAvail:   76 M,  1.3 %
* MemAvailable (76 MiB, 1.3 %) < mem_min_sigterm (118 MiB, 2.0 %)
  SwapFree (0 MiB, 0.0 %) < swap_min_sigterm (0 MiB, - %)
  Preventing OOM: trying to send the SIGTERM signal to stress,
  Pid: 16254, Badness: 201, VmRSS: 1183 MiB, VmSwap: 0 MiB
  Success; reaction time: 20 ms
MemAvail: 3237 M, 55.1 %
MemAvail:  855 M, 14.6 %
MemAvail:  219 M,  3.7 %
MemAvail:   73 M,  1.2 %
* MemAvailable (73 MiB, 1.2 %) < mem_min_sigterm (118 MiB, 2.0 %)
  SwapFree (0 MiB, 0.0 %) < swap_min_sigterm (0 MiB, - %)
  Preventing OOM: trying to send the SIGTERM signal to stress,
  Pid: 16259, Badness: 183, VmRSS: 1074 MiB, VmSwap: 0 MiB
  Success; reaction time: 20 ms
MemAvail: 3375 M, 57.4 %
MemAvail: 2608 M, 44.4 %
MemAvail: 4153 M, 70.7 %
MemAvail: 4153 M, 70.7 %

Membomb is not the quickest!

rfjakob (Owner) commented Jul 11, 2018

Yes, the new sleep algorithm will go into earlyoom 1.2.

hakavlad (Contributor Author) commented:

I added a small test tool, membomb, to find out how fast RAM and Swap can be depleted.

Stress is better (faster): https://people.seas.harvard.edu/~apw/stress/

rfjakob (Owner) commented Jul 11, 2018

Interesting. Maybe because it is 4x parallel (-m 4)

hakavlad (Contributor Author) commented:

Of course.

rfjakob added a commit that referenced this issue Jul 16, 2018
The idea is simple: if memory and swap can only fill up so fast, we know how long we can sleep without risking missing a low-memory event.

#61
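In Python-flavoured pseudocode, that idea reads roughly like this (a sketch for illustration only; the fill-rate constants are made up, and the actual change is the C code in the commit referenced below):

MAX_RAM_FILL_RATE = 6000000   # assumed worst-case RAM fill rate, kB/s (illustrative)
MAX_SWAP_FILL_RATE = 800000   # swap usually fills more slowly than RAM (illustrative)
MIN_SLEEP = 0.1               # never sleep less than this, i.e. cap polling at 10 Hz

def safe_sleep_time(mem_available_kb, swap_free_kb, mem_min_kb, swap_min_kb):
    # Headroom above the action thresholds, divided by how fast it can possibly
    # be consumed, gives the longest sleep that cannot miss a low-memory event.
    headroom_mem = max(0, mem_available_kb - mem_min_kb)
    headroom_swap = max(0, swap_free_kb - swap_min_kb)
    t = headroom_mem / MAX_RAM_FILL_RATE + headroom_swap / MAX_SWAP_FILL_RATE
    return max(MIN_SLEEP, t)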
rfjakob (Owner) commented Jul 16, 2018

Implemented via b8b3c32
