diff --git a/README.md b/README.md index f731c0a2b7f8..e95532ba6022 100644 --- a/README.md +++ b/README.md @@ -123,6 +123,7 @@ pair of .c and .py files, and some are directories of files. - tools/[inject](tools/inject.py): Targeted error injection with call chain and predicates [Examples](tools/inject_example.txt). - tools/[killsnoop](tools/killsnoop.py): Trace signals issued by the kill() syscall. [Examples](tools/killsnoop_example.txt). - tools/[klockstat](tools/klockstat.py): Traces kernel mutex lock events and display locks statistics. [Examples](tools/klockstat_example.txt). +- tools/[kvmexit](tools/kvmexit.py): Display the exit_reason and its statistics of each vm exit. [Examples](tools/kvmexit_example.txt). - tools/[llcstat](tools/llcstat.py): Summarize CPU cache references and misses by process. [Examples](tools/llcstat_example.txt). - tools/[mdflush](tools/mdflush.py): Trace md flush events. [Examples](tools/mdflush_example.txt). - tools/[memleak](tools/memleak.py): Display outstanding memory allocations to find memory leaks. [Examples](tools/memleak_example.txt). diff --git a/man/man8/kvmexit.8 b/man/man8/kvmexit.8 new file mode 100644 index 000000000000..c0cb4c9845f7 --- /dev/null +++ b/man/man8/kvmexit.8 @@ -0,0 +1,115 @@ +.TH kvmexit 8 "2021-07-08" "USER COMMANDS" +.SH NAME +kvmexit \- Display the exit_reason and its statistics of each vm exit. +.SH SYNOPSIS +.B kvmexit [\-h] [\-p PID [\-v VCPU | \-a] ] [\-t TID | \-T 'TID1,TID2'] [duration] +.SH DESCRIPTION +Considering virtual machines' frequent exits can cause performance problems, +this tool aims to locate the frequent exited reasons and then find solutions +to reduce or even avoid the exit, by displaying the detail exit reasons and +the counts of each vm exit for all vms running on one physical machine. + +This tool uses a PERCPU_ARRAY: pcpuArrayA and a percpu_hash: hashA to +collaboratively store each kvm exit reason and its count. 
The reason is that +when a vcpu exits and re-enters, it tends to keep running on +the same physical cpu as in the last cycle, which we call a 'cache hit'. Thus +a PERCPU_ARRAY records the 'cache hit' case to speed +things up, and a percpu_hash handles the other cases. + +As RAW_TRACEPOINT_PROBE(kvm_exit) consumes fewer cpu cycles, this tool +first tries to use raw tracepoints in modules and, if that fails, +falls back to the regular tracepoint. + +Limitation: because hardware-assisted virtualization differs between +architectures, this tool currently supports only VMX on Intel. + +Since this uses BPF, only the root user can use this tool. +.SH REQUIREMENTS +CONFIG_BPF and bcc. + +This also requires Linux 4.7+ (BPF_PROG_TYPE_TRACEPOINT support). +.SH OPTIONS +.TP +\-h +Print usage message. +.TP +\-p PID +Display process with this PID only, collapsing all tids, with exit reasons sorted +in descending order. +.TP +\-v VCPU +Display this VCPU only for this PID. +.TP +\-a +Display all TIDS for this PID. +.TP +\-t TID +Display thread with this TID only, with exit reasons sorted in descending order. +.TP +\-T 'TID1,TID2' +Display the threads in a set of TIDs such as {395490, 395491}. +.TP +duration +Number of seconds to sleep before displaying the results. +.SH EXAMPLES +.TP +Display kvm exit reasons and statistics for all threads... Hit Ctrl-C to end: +# +.B kvmexit +.TP +Display kvm exit reasons and statistics for all threads after sleeping 6 secs: +# +.B kvmexit 6 +.TP +Display kvm exit reasons and statistics for PID 1273795 after sleeping 5 secs: +# +.B kvmexit -p 1273795 5 +.TP +Display kvm exit reasons and statistics for PID 1273795 and its all threads after sleeping 5 secs: +# +.B kvmexit -p 1273795 5 -a +.TP +Display kvm exit reasons and statistics for PID 1273795 VCPU 0... 
Hit Ctrl-C to end: +# +.B kvmexit -p 1273795 -v 0 +.TP +Display kvm exit reasons and statistics for PID 1273795 VCPU 0 after sleeping 4 secs: +# +.B kvmexit -p 1273795 -v 0 4 +.TP +Display kvm exit reasons and statistics for TID 1273819 after sleeping 10 secs: +# +.B kvmexit -t 1273819 10 +.TP +Display kvm exit reasons and statistics for TIDS ['1273820', '1273819']... Hit Ctrl-C to end: +# +.B kvmexit -T '1273820,1273819' +.SH OVERHEAD +This traces the "kvm_exit" tracepoint, records each exit reason, and +counts its occurrences. Compared with adding more vm-exit reason debug entries, +this tool is easier and more flexible: the bcc python logic provides +kernel aggregation and custom output, and the bpf in-kernel percpu_array and +percpu_cache further improve performance. + +The impact of using this tool on the host should be negligible. While this +tool is very efficient, it does affect the guest virtual machine itself. The +average test results on a guest vm are as follows: + | cpu cycles + no TP | 1127 + regular TP | 1277 (13% slower) + RAW TP | 1187 (5% slower) + +Host: echo 1 > /proc/sys/net/core/bpf_jit_enable +.SH SOURCE +This is from bcc. +.IP +https://github.com/iovisor/bcc +.PP +Also look in the bcc distribution for a companion _examples.txt file containing +example usage, output, and commentary for this tool. +.SH OS +Linux +.SH STABILITY +Unstable - in development. +.SH AUTHOR +Fei Li diff --git a/tools/kvmexit.py b/tools/kvmexit.py new file mode 100755 index 000000000000..a959efbbbbf1 --- /dev/null +++ b/tools/kvmexit.py @@ -0,0 +1,389 @@ +#!/usr/bin/env python +# +# kvmexit.py +# +# Display the exit_reason and its statistics of each vm exit +# for all vcpus of all virtual machines. For example: +# $./kvmexit.py +# PID TID KVM_EXIT_REASON COUNT +# 1273551 1273568 EXIT_REASON_MSR_WRITE 6 +# 1274253 1274261 EXIT_REASON_EXTERNAL_INTERRUPT 1 +# 1274253 1274261 EXIT_REASON_HLT 12 +# ... 
+# +# Besides, we also allow users to specify one pid, tid(s), or one +# pid and its vcpu. See kvmexit_example.txt for more examples. +# +# @PID: each virtual machine's pid in user space. +# @TID: the user-space thread of each vcpu of that virtual machine. +# @KVM_EXIT_REASON: the reason why the vm exits. +# @COUNT: the count of each @KVM_EXIT_REASON. +# +# REQUIRES: Linux 4.7+ (BPF_PROG_TYPE_TRACEPOINT support) +# +# Copyright (c) 2021 ByteDance Inc. All rights reserved. +# +# Author(s): +# Fei Li + + +from __future__ import print_function +from time import sleep, strftime +from bcc import BPF +import argparse +import multiprocessing +import os +import signal +import subprocess + +# +# Process Arguments +# +def valid_args_list(args): + args_list = args.split(",") + for arg in args_list: + try: + int(arg) + except ValueError: + raise argparse.ArgumentTypeError("must be a valid integer") + return args_list + +# arguments +examples = """examples: + ./kvmexit # Display kvm_exit_reason and its statistics in real-time until Ctrl-C + ./kvmexit 5 # Display in real-time after sleeping 5s + ./kvmexit -p 3195281 # Collapse all tids for pid 3195281 with exit reasons sorted in descending order + ./kvmexit -p 3195281 20 # Collapse all tids for pid 3195281 with exit reasons sorted in descending order, and display after sleeping 20s + ./kvmexit -p 3195281 -v 0 # Display only vcpu0 for pid 3195281, descending sort by default + ./kvmexit -p 3195281 -a # Display all tids for pid 3195281 + ./kvmexit -t 395490 # Display only for tid 395490 with exit reasons sorted in descending order + ./kvmexit -t 395490 20 # Display only for tid 395490 with exit reasons sorted in descending order after sleeping 20s + ./kvmexit -T '395490,395491' # Display for a union like {395490, 395491} +""" +parser = argparse.ArgumentParser( + description="Display kvm_exit_reason and its statistics at a timed interval", + formatter_class=argparse.RawDescriptionHelpFormatter, + epilog=examples) 
+parser.add_argument("duration", nargs="?", default=99999999, type=int, help="show delta for next several seconds") +parser.add_argument("-p", "--pid", type=int, help="trace this PID only") +exgroup = parser.add_mutually_exclusive_group() +exgroup.add_argument("-t", "--tid", type=int, help="trace this TID only") +exgroup.add_argument("-T", "--tids", type=valid_args_list, help="trace a comma separated series of tids with no space in between") +exgroup.add_argument("-v", "--vcpu", type=int, help="trace this vcpu only") +exgroup.add_argument("-a", "--alltids", action="store_true", help="trace all tids for this pid") +args = parser.parse_args() +duration = int(args.duration) + +# +# Setup BPF +# + +# load BPF program +bpf_text = """ +#include + +#define REASON_NUM 69 +#define TGID_NUM 1024 + +struct exit_count { + u64 exit_ct[REASON_NUM]; +}; +BPF_PERCPU_ARRAY(init_value, struct exit_count, 1); +BPF_TABLE("percpu_hash", u64, struct exit_count, pcpu_kvm_stat, TGID_NUM); + +struct cache_info { + u64 cache_pid_tgid; + struct exit_count cache_exit_ct; +}; +BPF_PERCPU_ARRAY(pcpu_cache, struct cache_info, 1); + +FUNC_ENTRY { + int cache_miss = 0; + int zero = 0; + u32 er = GET_ER; + if (er >= REASON_NUM) { + return 0; + } + + u64 cur_pid_tgid = bpf_get_current_pid_tgid(); + u32 tgid = cur_pid_tgid >> 32; + u32 pid = cur_pid_tgid; + + if (THREAD_FILTER) + return 0; + + struct exit_count *tmp_info = NULL, *initial = NULL; + struct cache_info *cache_p; + cache_p = pcpu_cache.lookup(&zero); + if (cache_p == NULL) { + return 0; + } + + if (cache_p->cache_pid_tgid == cur_pid_tgid) { + //a. If the cur_pid_tgid hit this physical cpu consecutively, save it to pcpu_cache + tmp_info = &cache_p->cache_exit_ct; + } else { + //b. If another pid_tgid matches this pcpu for the last hit, OR it is the first time to hit this physical cpu. + cache_miss = 1; + + // b.a Try to load the last cache struct if exists. 
+ tmp_info = pcpu_kvm_stat.lookup(&cur_pid_tgid); + + // b.b If it is the first time for the cur_pid_tgid to hit this pcpu, employ a + // per_cpu array to initialize pcpu_kvm_stat's exit_count with each exit reason's count is zero + if (tmp_info == NULL) { + initial = init_value.lookup(&zero); + if (initial == NULL) { + return 0; + } + + pcpu_kvm_stat.update(&cur_pid_tgid, initial); + tmp_info = pcpu_kvm_stat.lookup(&cur_pid_tgid); + // To pass the verifier + if (tmp_info == NULL) { + return 0; + } + } + } + + if (er < REASON_NUM) { + tmp_info->exit_ct[er]++; + if (cache_miss == 1) { + if (cache_p->cache_pid_tgid != 0) { + // b.*.a Let's save the last hit cache_info into kvm_stat. + pcpu_kvm_stat.update(&cache_p->cache_pid_tgid, &cache_p->cache_exit_ct); + } + // b.* As the cur_pid_tgid meets current pcpu_cache_array for the first time, save it. + cache_p->cache_pid_tgid = cur_pid_tgid; + bpf_probe_read(&cache_p->cache_exit_ct, sizeof(*tmp_info), tmp_info); + } + return 0; + } + + return 0; +} +""" + +# format output +exit_reasons = ( + "EXCEPTION_NMI", + "EXTERNAL_INTERRUPT", + "TRIPLE_FAULT", + "INIT_SIGNAL", + "N/A", + "N/A", + "N/A", + "INTERRUPT_WINDOW", + "NMI_WINDOW", + "TASK_SWITCH", + "CPUID", + "N/A", + "HLT", + "INVD", + "INVLPG", + "RDPMC", + "RDTSC", + "N/A", + "VMCALL", + "VMCLEAR", + "VMLAUNCH", + "VMPTRLD", + "VMPTRST", + "VMREAD", + "VMRESUME", + "VMWRITE", + "VMOFF", + "VMON", + "CR_ACCESS", + "DR_ACCESS", + "IO_INSTRUCTION", + "MSR_READ", + "MSR_WRITE", + "INVALID_STATE", + "MSR_LOAD_FAIL", + "N/A", + "MWAIT_INSTRUCTION", + "MONITOR_TRAP_FLAG", + "N/A", + "MONITOR_INSTRUCTION", + "PAUSE_INSTRUCTION", + "MCE_DURING_VMENTRY", + "N/A", + "TPR_BELOW_THRESHOLD", + "APIC_ACCESS", + "EOI_INDUCED", + "GDTR_IDTR", + "LDTR_TR", + "EPT_VIOLATION", + "EPT_MISCONFIG", + "INVEPT", + "RDTSCP", + "PREEMPTION_TIMER", + "INVVPID", + "WBINVD", + "XSETBV", + "APIC_WRITE", + "RDRAND", + "INVPCID", + "VMFUNC", + "ENCLS", + "RDSEED", + "PML_FULL", + "XSAVES", + 
"XRSTORS", + "N/A", + "N/A", + "UMWAIT", + "TPAUSE" +) + +# +# Do some checks +# +try: + # Currently, only the Intel architecture is supported + cmd = "cat /proc/cpuinfo | grep vendor_id | head -n 1" + arch_info = subprocess.check_output(cmd, shell=True).strip() + if b"Intel" in arch_info: + pass + else: + raise Exception("Currently only the Intel architecture is supported; please extend the tool if you need more.") + + # Check if kvm module is loaded + if os.access("/dev/kvm", os.R_OK | os.W_OK): + pass + else: + raise Exception("Please load the kvm module to use the kvmexit tool.") +except Exception as e: + raise Exception("Failed to do precondition check, due to: %s." % e) + +try: + if BPF.support_raw_tracepoint_in_module(): + # Let's first try raw_tracepoint_in_module + func_entry = "RAW_TRACEPOINT_PROBE(kvm_exit)" + get_er = "ctx->args[0]" + else: + # If raw_tp_in_module is not supported, fall back to regular tp + func_entry = "TRACEPOINT_PROBE(kvm, kvm_exit)" + get_er = "args->exit_reason" +except Exception as e: + raise Exception("Failed to catch kvm exit reasons due to: %s" % e) + + +def find_tid(tgt_dir, tgt_vcpu): + for tid in os.listdir(tgt_dir): + path = tgt_dir + "/" + tid + "/comm" + fp = open(path, "r") + comm = fp.read() + if (comm.find(tgt_vcpu) != -1): + return tid + return -1 + +# set process/thread filter +thread_context = "" +header_format = "" +need_collapse = not args.alltids +if args.tid is not None: + thread_context = "TID %s" % args.tid + thread_filter = 'pid != %s' % args.tid +elif args.tids is not None: + thread_context = "TIDS %s" % args.tids + thread_filter = "pid != " + " && pid != ".join(args.tids) + header_format = "TIDS " +elif args.pid is not None: + thread_context = "PID %s" % args.pid + thread_filter = 'tgid != %s' % args.pid + if args.vcpu is not None: + thread_context = "PID %s VCPU %s" % (args.pid, args.vcpu) + # translate vcpu to tid + tgt_dir = '/proc/' + str(args.pid) + '/task' + tgt_vcpu = "CPU " + str(args.vcpu) + args.tid = find_tid(tgt_dir, 
tgt_vcpu) + if args.tid == -1: + raise Exception("There's no v%s for PID %d." % (tgt_vcpu, args.pid)) + thread_filter = 'pid != %s' % args.tid + elif args.alltids: + thread_context = "PID %s and its all threads" % args.pid + header_format = "TID " +else: + thread_context = "all threads" + thread_filter = '0' + header_format = "PID TID " +bpf_text = bpf_text.replace('THREAD_FILTER', thread_filter) + +# For kernel >= 5.0, use RAW_TRACEPOINT_MODULE for performance consideration +bpf_text = bpf_text.replace('FUNC_ENTRY', func_entry) +bpf_text = bpf_text.replace('GET_ER', get_er) +b = BPF(text=bpf_text) + + +# header +print("Display kvm exit reasons and statistics for %s" % thread_context, end="") +if duration < 99999999: + print(" after sleeping %d secs." % duration) +else: + print("... Hit Ctrl-C to end.") +print("%s%-35s %s" % (header_format, "KVM_EXIT_REASON", "COUNT")) + +# signal handler +def signal_ignore(signal, frame): + print() +try: + sleep(duration) +except KeyboardInterrupt: + signal.signal(signal.SIGINT, signal_ignore) + + +# Currently, sort multiple tids in descending order is not supported. 
+if (args.pid or args.tid): + ct_reason = [] + if args.pid: + tgid_exit = [0 for i in range(len(exit_reasons))] + +# output +pcpu_kvm_stat = b["pcpu_kvm_stat"] +pcpu_cache = b["pcpu_cache"] +for k, v in pcpu_kvm_stat.items(): + tgid = k.value >> 32 + pid = k.value & 0xffffffff + for i in range(0, len(exit_reasons)): + sum1 = 0 + for inner_cpu in range(0, multiprocessing.cpu_count()): + cachePIDTGID = pcpu_cache[0][inner_cpu].cache_pid_tgid + # Take priority to check if it is in cache + if cachePIDTGID == k.value: + sum1 += pcpu_cache[0][inner_cpu].cache_exit_ct.exit_ct[i] + # If not in cache, find from kvm_stat + else: + sum1 += v[inner_cpu].exit_ct[i] + if sum1 == 0: + continue + + if (args.pid and args.pid == tgid and need_collapse): + tgid_exit[i] += sum1 + elif (args.tid and args.tid == pid): + ct_reason.append((sum1, i)) + elif not need_collapse or args.tids: + print("%-8u %-35s %-8u" % (pid, exit_reasons[i], sum1)) + else: + print("%-8u %-8u %-35s %-8u" % (tgid, pid, exit_reasons[i], sum1)) + + # Display only for the target tid in descending sort + if (args.tid and args.tid == pid): + ct_reason.sort(reverse=True) + for i in range(0, len(ct_reason)): + if ct_reason[i][0] == 0: + continue + print("%-35s %-8u" % (exit_reasons[ct_reason[i][1]], ct_reason[i][0])) + break + + +# Aggregate all tids' counts for this args.pid in descending sort +if args.pid and need_collapse: + for i in range(0, len(exit_reasons)): + ct_reason.append((tgid_exit[i], i)) + ct_reason.sort(reverse=True) + for i in range(0, len(ct_reason)): + if ct_reason[i][0] == 0: + continue + print("%-35s %-8u" % (exit_reasons[ct_reason[i][1]], ct_reason[i][0])) diff --git a/tools/kvmexit_example.txt b/tools/kvmexit_example.txt new file mode 100644 index 000000000000..6b5b8719f119 --- /dev/null +++ b/tools/kvmexit_example.txt @@ -0,0 +1,250 @@ +Demonstrations of kvm exit reasons, the Linux eBPF/bcc version. 
+ + +Because frequent virtual machine exits can cause performance problems, +this tool helps locate the most frequent exit reasons, so that the exits can +be reduced or even avoided, by displaying the detailed exit reasons and +the count of each vm exit for all vms running on one physical machine. + + +Features of this tool +===================== + +- Although there is a patch: [KVM: x86: add full vm-exit reason debug entries] + (https://patchwork.kernel.org/project/kvm/patch/1555939499-30854-1-git-send-email-pizhenwei@bytedance.com/) + that tries to add more vm-exit reason debug entries, the comments on it note that + the code allocates lots of memory that may never be consumed, misses some + arch-specific kvm causes, and cannot do kernel aggregation. bcc, as + a user space tool, can implement all of these functions more easily and flexibly. +- The bcc python logic provides kernel aggregation and custom output, + such as collapsing all tids for one pid (i.e. one vm's qemu process id) with exit + reasons sorted in descending order. For more information, see the + "USAGE message" section below. +- The bpf in-kernel percpu_array and percpu_cache further improve performance. + For more information, see the "Help to understand" section below. + + +Limitations +=========== + +Because hardware-assisted virtualization differs between +architectures, this tool currently supports only VMX on Intel. +AMD support is planned. + + +Example output: +=============== + +# ./kvmexit.py +Display kvm exit reasons and statistics for all threads... Hit Ctrl-C to end. +PID TID KVM_EXIT_REASON COUNT +^C1273551 1273568 EXIT_REASON_HLT 12 +1273551 1273568 EXIT_REASON_MSR_WRITE 6 +1274253 1274261 EXIT_REASON_EXTERNAL_INTERRUPT 1 +1274253 1274261 EXIT_REASON_HLT 12 +1274253 1274261 EXIT_REASON_MSR_WRITE 4 + +# ./kvmexit.py 6 +Display kvm exit reasons and statistics for all threads after sleeping 6 secs. 
+PID TID KVM_EXIT_REASON COUNT +1273903 1273922 EXIT_REASON_EXTERNAL_INTERRUPT 175 +1273903 1273922 EXIT_REASON_CPUID 10 +1273903 1273922 EXIT_REASON_HLT 6043 +1273903 1273922 EXIT_REASON_IO_INSTRUCTION 24 +1273903 1273922 EXIT_REASON_MSR_WRITE 15025 +1273903 1273922 EXIT_REASON_PAUSE_INSTRUCTION 11 +1273903 1273922 EXIT_REASON_EOI_INDUCED 12 +1273903 1273922 EXIT_REASON_EPT_VIOLATION 6 +1273903 1273922 EXIT_REASON_EPT_MISCONFIG 380 +1273903 1273922 EXIT_REASON_PREEMPTION_TIMER 194 +1273551 1273568 EXIT_REASON_EXTERNAL_INTERRUPT 18 +1273551 1273568 EXIT_REASON_HLT 989 +1273551 1273568 EXIT_REASON_IO_INSTRUCTION 10 +1273551 1273568 EXIT_REASON_MSR_WRITE 2205 +1273551 1273568 EXIT_REASON_PAUSE_INSTRUCTION 1 +1273551 1273568 EXIT_REASON_EOI_INDUCED 5 +1273551 1273568 EXIT_REASON_EPT_MISCONFIG 61 +1273551 1273568 EXIT_REASON_PREEMPTION_TIMER 14 + +# ./kvmexit.py -p 1273795 5 +Display kvm exit reasons and statistics for PID 1273795 after sleeping 5 secs. +KVM_EXIT_REASON COUNT +MSR_WRITE 13467 +HLT 5060 +PREEMPTION_TIMER 345 +EPT_MISCONFIG 264 +EXTERNAL_INTERRUPT 169 +EPT_VIOLATION 18 +PAUSE_INSTRUCTION 6 +IO_INSTRUCTION 4 +EOI_INDUCED 2 + +# ./kvmexit.py -p 1273795 5 -a +Display kvm exit reasons and statistics for PID 1273795 and its all threads after sleeping 5 secs. +TID KVM_EXIT_REASON COUNT +1273819 EXTERNAL_INTERRUPT 64 +1273819 HLT 2802 +1273819 IO_INSTRUCTION 4 +1273819 MSR_WRITE 7196 +1273819 PAUSE_INSTRUCTION 2 +1273819 EOI_INDUCED 2 +1273819 EPT_VIOLATION 6 +1273819 EPT_MISCONFIG 162 +1273819 PREEMPTION_TIMER 194 +1273820 EXTERNAL_INTERRUPT 78 +1273820 HLT 2054 +1273820 MSR_WRITE 5199 +1273820 EPT_VIOLATION 2 +1273820 EPT_MISCONFIG 77 +1273820 PREEMPTION_TIMER 102 + +# ./kvmexit.py -p 1273795 -v 0 +Display kvm exit reasons and statistics for PID 1273795 VCPU 0... Hit Ctrl-C to end. 
+KVM_EXIT_REASON COUNT +^CMSR_WRITE 2076 +HLT 795 +PREEMPTION_TIMER 86 +EXTERNAL_INTERRUPT 20 +EPT_MISCONFIG 10 +PAUSE_INSTRUCTION 2 +IO_INSTRUCTION 2 +EPT_VIOLATION 1 +EOI_INDUCED 1 + +# ./kvmexit.py -p 1273795 -v 0 4 +Display kvm exit reasons and statistics for PID 1273795 VCPU 0 after sleeping 4 secs. +KVM_EXIT_REASON COUNT +MSR_WRITE 4726 +HLT 1827 +PREEMPTION_TIMER 78 +EPT_MISCONFIG 67 +EXTERNAL_INTERRUPT 28 +IO_INSTRUCTION 4 +EOI_INDUCED 2 +PAUSE_INSTRUCTION 2 + +# ./kvmexit.py -p 1273795 -v 4 4 +Traceback (most recent call last): + File "tools/kvmexit.py", line 306, in + raise Exception("There's no v%s for PID %d." % (tgt_vcpu, args.pid)) + Exception: There's no vCPU 4 for PID 1273795. + +# ./kvmexit.py -t 1273819 10 +Display kvm exit reasons and statistics for TID 1273819 after sleeping 10 secs. +KVM_EXIT_REASON COUNT +MSR_WRITE 13318 +HLT 5274 +EPT_MISCONFIG 263 +PREEMPTION_TIMER 171 +EXTERNAL_INTERRUPT 109 +IO_INSTRUCTION 8 +PAUSE_INSTRUCTION 5 +EOI_INDUCED 4 +EPT_VIOLATION 2 + +# ./kvmexit.py -T '1273820,1273819' +Display kvm exit reasons and statistics for TIDS ['1273820', '1273819']... Hit Ctrl-C to end. +TIDS KVM_EXIT_REASON COUNT +^C1273819 EXTERNAL_INTERRUPT 300 +1273819 HLT 13718 +1273819 IO_INSTRUCTION 26 +1273819 MSR_WRITE 37457 +1273819 PAUSE_INSTRUCTION 13 +1273819 EOI_INDUCED 13 +1273819 EPT_VIOLATION 53 +1273819 EPT_MISCONFIG 654 +1273819 PREEMPTION_TIMER 958 +1273820 EXTERNAL_INTERRUPT 212 +1273820 HLT 9002 +1273820 MSR_WRITE 25495 +1273820 PAUSE_INSTRUCTION 2 +1273820 EPT_VIOLATION 64 +1273820 EPT_MISCONFIG 396 +1273820 PREEMPTION_TIMER 268 + + +Help to understand +================== + +We use a PERCPU_ARRAY: pcpuArrayA and a percpu_hash: hashA to collaboratively +store each kvm exit reason and its count. The reason is there exists a rule when +one vcpu exits and re-enters, it tends to continue to run on the same physical +cpu (pcpu as follows) as the last cycle, which is also called 'cache hit'. 
Thus +we use a PERCPU_ARRAY to record the 'cache hit' case to speed +things up, and use a percpu_hash for the other cases. + +BTW, we originally used a common hash to do this, with a u64(exit_reason) +key and a struct exit_info {tgid_pid, exit_reason} value. But due to +the big lock in bpf_hash, each update is quite expensive. + +Now imagine that pid_tgidA (vcpu A) exits and is going to run on +pcpuArrayA; the BPF code flow is as follows: + + pid_tgidA keeps running on the same pcpu + // \\ + // \\ + // Y N \\ + // \\ + a. cache_hit b. cache_miss +(cacheA's pid_tgid matches pid_tgidA) || + | || + | || + "increase percpu exit_ct and return" || + [*Note*] || + pid_tgidA ever been exited on pcpuArrayA? + // \\ + // \\ + // \\ + // Y N \\ + // \\ + b.a load_last_hashA b.b initialize_hashA_with_zero + \ / + \ / + \ / + "increase percpu exit_ct" + || + || + is another pid_tgid been running on pcpuArrayA? + // \\ + // Y N \\ + // \\ + b.*.a save_theLastHit_hashB do_nothing + \\ // + \\ // + \\ // + b.* save_to_pcpuArrayA + + +[*Note*] we do not update the hash table in "a." above, because the vcpu may hit the same +pcpu again on its next exit; instead we update it only once this pcpu is +hit by a different tgidpid (vcpu), which happens in "b.*.a" and "b.*". 
+ + +USAGE message: +============== + +# ./kvmexit.py -h +usage: kvmexit.py [-h] [-p PID [-v VCPU | -a] ] [-t TID | -T 'TID1,TID2'] [duration] + +Display kvm_exit_reason and its statistics at a timed interval + +optional arguments: + -h, --help show this help message and exit + -p PID, --pid PID display process with this PID only, collapse all tids with exit reasons sorted in descending order + -v VCPU, --vcpu VCPU display this VCPU only for this PID + -a, --alltids display all TIDS for this PID + -t TID, --tid TID display thread with this TID only with exit reasons sorted in descending order + -T 'TID1,TID2', --tids 'TID1,TID2' + display threads for a union like {395490, 395491} + duration duration of display, after sleeping several seconds + +examples: + ./kvmexit # Display kvm_exit_reason and its statistics in real-time until Ctrl-C + ./kvmexit 5 # Display in real-time after sleeping 5s + ./kvmexit -p 3195281 # Collapse all tids for pid 3195281 with exit reasons sorted in descending order + ./kvmexit -p 3195281 20 # Collapse all tids for pid 3195281 with exit reasons sorted in descending order, and display after sleeping 20s + ./kvmexit -p 3195281 -v 0 # Display only vcpu0 for pid 3195281, descending sort by default + ./kvmexit -p 3195281 -a # Display all tids for pid 3195281 + ./kvmexit -t 395490 # Display only for tid 395490 with exit reasons sorted in descending order + ./kvmexit -t 395490 20 # Display only for tid 395490 with exit reasons sorted in descending order after sleeping 20s + ./kvmexit -T '395490,395491' # Display for a union like {395490, 395491}
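
As a reading aid for the cache-hit/cache-miss flow described under "Help to understand", the sketch below models the same bookkeeping in plain Python, without bcc. It is an illustration under stated assumptions, not the tool's implementation: `PcpuState`, `record_exit`, and `total_counts` are hypothetical names, one `PcpuState` object stands in for a single physical CPU's slot of `pcpu_cache` and `pcpu_kvm_stat`, and the `pid_tgid` values are made up.

```python
REASON_NUM = 69  # same bound as REASON_NUM in the BPF program


class PcpuState:
    """Hypothetical stand-in for one physical CPU's slot of both maps."""

    def __init__(self):
        self.cache_key = 0                # models pcpu_cache.cache_pid_tgid
        self.cache_ct = [0] * REASON_NUM  # models pcpu_cache.cache_exit_ct
        self.stat = {}                    # models pcpu_kvm_stat: key -> counts


def record_exit(cpu, pid_tgid, er):
    """Follow branches a., b.a, b.b, b.*.a and b.* for a single vm exit."""
    if er >= REASON_NUM:
        return
    if cpu.cache_key == pid_tgid:
        # a. cache hit: only the cached copy is incremented
        cpu.cache_ct[er] += 1
        return
    # b. cache miss: load (b.a) or zero-initialize (b.b) the hash entry
    counts = cpu.stat.setdefault(pid_tgid, [0] * REASON_NUM)
    counts[er] += 1
    if cpu.cache_key != 0:
        # b.*.a flush the previous owner's cached counts into the hash
        cpu.stat[cpu.cache_key] = list(cpu.cache_ct)
    # b.* the current thread becomes the owner of this CPU's cache
    cpu.cache_key = pid_tgid
    cpu.cache_ct = list(counts)


def total_counts(cpus, pid_tgid):
    """User-space merge: prefer the cache on CPUs where this key owns it."""
    totals = [0] * REASON_NUM
    for cpu in cpus:
        if cpu.cache_key == pid_tgid:
            src = cpu.cache_ct
        else:
            src = cpu.stat.get(pid_tgid, [0] * REASON_NUM)
        for i, n in enumerate(src):
            totals[i] += n
    return totals


if __name__ == "__main__":
    cpus = [PcpuState(), PcpuState()]
    A, B = 0x100000002, 0x300000004     # made-up pid_tgid values
    HLT, MSR_WRITE = 12, 32             # indexes into exit_reasons
    record_exit(cpus[0], A, HLT)        # miss: b.b, then b.*
    record_exit(cpus[0], A, HLT)        # hit: a.
    record_exit(cpus[0], B, MSR_WRITE)  # miss: flushes A through b.*.a
    record_exit(cpus[1], A, MSR_WRITE)  # A moves to another physical CPU
    totals = total_counts(cpus, A)
    print(totals[HLT], totals[MSR_WRITE])  # prints: 2 1
```

The point of the design is visible in the second call: a repeated exit of the same thread on the same physical CPU touches only the cached copy, and the hash entry is refreshed later, when a different thread takes over that CPU's cache slot.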