Merge pull request iovisor#2866 from sumanthkorikkar/draft_bpf_probe_read_user_approach1

bcc: Use bpf_probe_read_user in tools and provide backward compatibility
yonghong-song committed Apr 28, 2020
2 parents d2e8ea4 + aa3a4a6 commit f4302f3
Showing 30 changed files with 148 additions and 57 deletions.
38 changes: 33 additions & 5 deletions docs/reference_guide.md
@@ -28,6 +28,8 @@ This guide is incomplete. If something feels missing, check the bcc and kernel s
- [7. bpf_get_current_task()](#7-bpf_get_current_task)
- [8. bpf_log2l()](#8-bpf_log2l)
- [9. bpf_get_prandom_u32()](#9-bpf_get_prandom_u32)
- [10. bpf_probe_read_user()](#10-bpf_probe_read_user)
- [11. bpf_probe_read_user_str()](#11-bpf_probe_read_user_str)
- [Debugging](#debugging)
- [1. bpf_override_return()](#1-bpf_override_return)
- [Output](#output)
@@ -196,7 +198,7 @@ For example:
```C
int count(struct pt_regs *ctx) {
char buf[64];
bpf_probe_read(&buf, sizeof(buf), (void *)PT_REGS_PARM1(ctx));
bpf_probe_read_user(&buf, sizeof(buf), (void *)PT_REGS_PARM1(ctx));
bpf_trace_printk("%s %d", buf, PT_REGS_PARM2(ctx));
return(0);
}
@@ -242,7 +244,7 @@ int do_trace(struct pt_regs *ctx) {
uint64_t addr;
char path[128];
bpf_usdt_readarg(6, ctx, &addr);
bpf_probe_read(&path, sizeof(path), (void *)addr);
bpf_probe_read_user(&path, sizeof(path), (void *)addr);
bpf_trace_printk("path:%s\\n", path);
return 0;
};
@@ -372,7 +374,7 @@ Syntax: ```int bpf_probe_read(void *dst, int size, const void *src)```

Return: 0 on success

This copies a memory location to the BPF stack, so that BPF can later operate on it. For safety, all memory reads must pass through bpf_probe_read(). This happens automatically in some cases, such as dereferencing kernel variables, as bcc will rewrite the BPF program to include the necessary bpf_probe_reads().
This copies size bytes from kernel address space to the BPF stack, so that BPF can later operate on it. For safety, all kernel memory reads must pass through bpf_probe_read(). This happens automatically in some cases, such as dereferencing kernel variables, as bcc will rewrite the BPF program to include the necessary bpf_probe_read().

Examples in situ:
[search /examples](https://github.com/iovisor/bcc/search?q=bpf_probe_read+path%3Aexamples&type=Code),
@@ -386,7 +388,7 @@ Return:
- \> 0 length of the string including the trailing NULL on success
- \< 0 error

This copies a `NULL` terminated string from memory location to BPF stack, so that BPF can later operate on it. In case the string length is smaller than size, the target is not padded with further `NULL` bytes. In case the string length is larger than size, just `size - 1` bytes are copied and the last byte is set to `NULL`.
This copies a `NULL` terminated string from kernel address space to the BPF stack, so that BPF can later operate on it. In case the string length is smaller than size, the target is not padded with further `NULL` bytes. In case the string length is larger than size, just `size - 1` bytes are copied and the last byte is set to `NULL`.

Examples in situ:
[search /examples](https://github.com/iovisor/bcc/search?q=bpf_probe_read_str+path%3Aexamples&type=Code),
@@ -490,6 +492,32 @@ Example in situ:
[search /examples](https://github.com/iovisor/bcc/search?q=bpf_get_prandom_u32+path%3Aexamples&type=Code),
[search /tools](https://github.com/iovisor/bcc/search?q=bpf_get_prandom_u32+path%3Atools&type=Code)

### 10. bpf_probe_read_user()

Syntax: ```int bpf_probe_read_user(void *dst, int size, const void *src)```

Return: 0 on success

This attempts to safely read size bytes from user address space to the BPF stack, so that BPF can later operate on it. For safety, all user address space memory reads must pass through bpf_probe_read_user().

Examples in situ:
[search /examples](https://github.com/iovisor/bcc/search?q=bpf_probe_read_user+path%3Aexamples&type=Code),
[search /tools](https://github.com/iovisor/bcc/search?q=bpf_probe_read_user+path%3Atools&type=Code)

### 11. bpf_probe_read_user_str()

Syntax: ```int bpf_probe_read_user_str(void *dst, int size, const void *src)```

Return:
- \> 0 length of the string including the trailing NULL on success
- \< 0 error

This copies a `NULL` terminated string from user address space to the BPF stack, so that BPF can later operate on it. In case the string length is smaller than size, the target is not padded with further `NULL` bytes. In case the string length is larger than size, just `size - 1` bytes are copied and the last byte is set to `NULL`.

Examples in situ:
[search /examples](https://github.com/iovisor/bcc/search?q=bpf_probe_read_user_str+path%3Aexamples&type=Code),
[search /tools](https://github.com/iovisor/bcc/search?q=bpf_probe_read_user_str+path%3Atools&type=Code)

## Debugging

### 1. bpf_override_return()
@@ -1721,7 +1749,7 @@ See the "Understanding eBPF verifier messages" section in the kernel source unde

## 1. Invalid mem access

This can be due to trying to read memory directly, instead of operating on memory on the BPF stack. All memory reads must be passed via bpf_probe_read() to copy memory into the BPF stack, which can be automatic by the bcc rewriter in some cases of simple dereferencing. bpf_probe_read() does all the required checks.
This can be due to trying to read memory directly, instead of operating on memory on the BPF stack. All kernel memory reads must be passed via bpf_probe_read() to copy kernel memory into the BPF stack, which can be automatic by the bcc rewriter in some cases of simple dereferencing. bpf_probe_read() does all the required checks.

Example:

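The `size` handling described for `bpf_probe_read_user_str()` in the reference guide changes above can be modelled in plain Python. This is a userspace sketch only, not BPF code: `model_read_user_str` is a hypothetical name, the `-1` error value is a stand-in, and returning `size` for a truncated string is an assumption about the kernel helper that this commit does not state.

```python
def model_read_user_str(dst: bytearray, size: int, src: bytes) -> int:
    """Userspace model of the copy rule (hypothetical helper, not a bcc API).

    Copies a NUL-terminated string from src into dst, writing at most
    `size` bytes including the terminator. Bytes in dst past the
    terminator are left untouched (no padding).
    """
    if size <= 0 or size > len(dst):
        return -1  # stand-in for a negative error code

    # Locate the terminator in the source; treat a missing one as end of input.
    end = src.find(b"\x00")
    text = src if end < 0 else src[:end]

    # Copy at most size - 1 bytes, then terminate.
    n = min(len(text), size - 1)
    dst[:n] = text[:n]
    dst[n] = 0

    # Assumed return value: bytes written including the trailing NUL.
    return n + 1
```

With an 8-byte buffer prefilled with `0xff`, reading `b"hi\x00junk"` writes `hi` plus a terminator and leaves the remaining five bytes unchanged, which matches the "not padded" wording in the guide.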
6 changes: 3 additions & 3 deletions docs/tutorial_bcc_python_developer.md
@@ -556,7 +556,7 @@ int count(struct pt_regs *ctx) {
struct key_t key = {};
u64 zero = 0, *val;
bpf_probe_read(&key.c, sizeof(key.c), (void *)PT_REGS_PARM1(ctx));
bpf_probe_read_user(&key.c, sizeof(key.c), (void *)PT_REGS_PARM1(ctx));
// could also use `counts.increment(key)`
val = counts.lookup_or_try_init(&key, &zero);
if (val) {
@@ -620,7 +620,7 @@ int do_trace(struct pt_regs *ctx) {
uint64_t addr;
char path[128]={0};
bpf_usdt_readarg(6, ctx, &addr);
bpf_probe_read(&path, sizeof(path), (void *)addr);
bpf_probe_read_user(&path, sizeof(path), (void *)addr);
bpf_trace_printk("path:%s\\n", path);
return 0;
};
@@ -640,7 +640,7 @@ b = BPF(text=bpf_text, usdt_contexts=[u])
Things to learn:

1. ```bpf_usdt_readarg(6, ctx, &addr)```: Read the address of argument 6 from the USDT probe into ```addr```.
1. ```bpf_probe_read(&path, sizeof(path), (void *)addr)```: Now the string ```addr``` points to into our ```path``` variable.
1. ```bpf_probe_read_user(&path, sizeof(path), (void *)addr)```: Now the string ```addr``` points to into our ```path``` variable.
1. ```u = USDT(pid=int(pid))```: Initialize USDT tracing for the given PID.
1. ```u.enable_probe(probe="http__server__request", fn_name="do_trace")```: Attach our ```do_trace()``` BPF C function to the Node.js ```http__server__request``` USDT probe.
1. ```b = BPF(text=bpf_text, usdt_contexts=[u])```: Need to pass in our USDT object, ```u```, to BPF object creation.
2 changes: 1 addition & 1 deletion examples/cpp/RecordMySQLQuery.cc
@@ -34,7 +34,7 @@ int probe_mysql_query(struct pt_regs *ctx, void* thd, char* query, size_t len) {
key.ts = bpf_ktime_get_ns();
key.pid = bpf_get_current_pid_tgid();
bpf_probe_read_str(&key.query, sizeof(key.query), query);
bpf_probe_read_user_str(&key.query, sizeof(key.query), query);
int one = 1;
queries.update(&key, &one);
3 changes: 2 additions & 1 deletion examples/lua/bashreadline.c
@@ -15,7 +15,8 @@ int printret(struct pt_regs *ctx)
return 0;
pid = bpf_get_current_pid_tgid();
data.pid = pid;
bpf_probe_read(&data.str, sizeof(data.str), (void *)PT_REGS_RC(ctx));
bpf_probe_read_user(&data.str, sizeof(data.str),
(void *)PT_REGS_RC(ctx));
events.perf_submit(ctx, &data, sizeof(data));
return 0;
};
2 changes: 1 addition & 1 deletion examples/lua/strlen_count.lua
@@ -26,7 +26,7 @@ int printarg(struct pt_regs *ctx) {
if (pid != PID)
return 0;
char str[128] = {};
bpf_probe_read(&str, sizeof(str), (void *)PT_REGS_PARM1(ctx));
bpf_probe_read_user(&str, sizeof(str), (void *)PT_REGS_PARM1(ctx));
bpf_trace_printk("strlen(\"%s\")\n", &str);
return 0;
};
2 changes: 1 addition & 1 deletion examples/lua/usdt_ruby.lua
@@ -22,7 +22,7 @@ int trace_method(struct pt_regs *ctx) {
bpf_usdt_readarg(2, ctx, &addr);
char fn_name[128] = {};
bpf_probe_read(&fn_name, sizeof(fn_name), (void *)addr);
bpf_probe_read_user(&fn_name, sizeof(fn_name), (void *)addr);
bpf_trace_printk("%s(...)\n", fn_name);
return 0;
2 changes: 1 addition & 1 deletion examples/tracing/mysqld_query.py
@@ -34,7 +34,7 @@
* see: https://dev.mysql.com/doc/refman/5.7/en/dba-dtrace-ref-query.html
*/
bpf_usdt_readarg(1, ctx, &addr);
bpf_probe_read(&query, sizeof(query), (void *)addr);
bpf_probe_read_user(&query, sizeof(query), (void *)addr);
bpf_trace_printk("%s\\n", query);
return 0;
};
2 changes: 1 addition & 1 deletion examples/tracing/nodejs_http_server.py
@@ -26,7 +26,7 @@
uint64_t addr;
char path[128]={0};
bpf_usdt_readarg(6, ctx, &addr);
bpf_probe_read(&path, sizeof(path), (void *)addr);
bpf_probe_read_user(&path, sizeof(path), (void *)addr);
bpf_trace_printk("path:%s\\n", path);
return 0;
};
2 changes: 1 addition & 1 deletion examples/tracing/strlen_count.py
@@ -31,7 +31,7 @@
struct key_t key = {};
u64 zero = 0, *val;
bpf_probe_read(&key.c, sizeof(key.c), (void *)PT_REGS_PARM1(ctx));
bpf_probe_read_user(&key.c, sizeof(key.c), (void *)PT_REGS_PARM1(ctx));
// could also use `counts.increment(key)`
val = counts.lookup_or_try_init(&key, &zero);
if (val) {
2 changes: 1 addition & 1 deletion examples/tracing/strlen_snoop.py
@@ -34,7 +34,7 @@
return 0;
char str[80] = {};
bpf_probe_read(&str, sizeof(str), (void *)PT_REGS_PARM1(ctx));
bpf_probe_read_user(&str, sizeof(str), (void *)PT_REGS_PARM1(ctx));
bpf_trace_printk("%s\\n", &str);
return 0;
9 changes: 5 additions & 4 deletions src/cc/export/helpers.h
@@ -602,6 +602,7 @@ static long long (*bpf_tcp_gen_syncookie)(struct bpf_sock *sk, void *ip,
static int (*bpf_skb_output)(void *ctx, void *map, __u64 flags, void *data,
__u64 size) =
(void *)BPF_FUNC_skb_output;

static int (*bpf_probe_read_user)(void *dst, __u32 size,
const void *unsafe_ptr) =
(void *)BPF_FUNC_probe_read_user;
@@ -887,8 +888,8 @@ int bpf_usdt_readarg_p(int argc, struct pt_regs *ctx, void *buf, u64 len) asm("l
#if defined(__TARGET_ARCH_x86)
#define bpf_target_x86
#define bpf_target_defined
#elif defined(__TARGET_ARCH_s930x)
#define bpf_target_s930x
#elif defined(__TARGET_ARCH_s390x)
#define bpf_target_s390x
#define bpf_target_defined
#elif defined(__TARGET_ARCH_arm64)
#define bpf_target_arm64
@@ -905,7 +906,7 @@ int bpf_usdt_readarg_p(int argc, struct pt_regs *ctx, void *buf, u64 len) asm("l
#if defined(__x86_64__)
#define bpf_target_x86
#elif defined(__s390x__)
#define bpf_target_s930x
#define bpf_target_s390x
#elif defined(__aarch64__)
#define bpf_target_arm64
#elif defined(__powerpc__)
@@ -923,7 +924,7 @@ int bpf_usdt_readarg_p(int argc, struct pt_regs *ctx, void *buf, u64 len) asm("l
#define PT_REGS_RC(ctx) ((ctx)->gpr[3])
#define PT_REGS_IP(ctx) ((ctx)->nip)
#define PT_REGS_SP(ctx) ((ctx)->gpr[1])
#elif defined(bpf_target_s930x)
#elif defined(bpf_target_s390x)
#define PT_REGS_PARM1(x) ((x)->gprs[2])
#define PT_REGS_PARM2(x) ((x)->gprs[3])
#define PT_REGS_PARM3(x) ((x)->gprs[4])
51 changes: 47 additions & 4 deletions src/cc/frontends/clang/b_frontend_action.cc
@@ -37,6 +37,7 @@
#include "bcc_libbpf_inc.h"

#include "libbpf.h"
#include "bcc_syms.h"

namespace ebpf {

@@ -82,6 +83,30 @@ const char **get_call_conv(void) {
return ret;
}

static std::string check_bpf_probe_read_user(llvm::StringRef probe) {
if (probe.str() == "bpf_probe_read_user" ||
probe.str() == "bpf_probe_read_user_str") {
#if LINUX_VERSION_CODE >= KERNEL_VERSION(5, 5, 0)
return probe.str();
#else
// Check for probe_user symbols in backported kernels before fallback
void *resolver = bcc_symcache_new(-1, nullptr);
uint64_t addr = 0;
bool found = bcc_symcache_resolve_name(resolver, nullptr,
"bpf_probe_read_user", &addr) >= 0 ? true: false;
if (found)
return probe.str();

if (probe.str() == "bpf_probe_read_user") {
return "bpf_probe_read";
} else {
return "bpf_probe_read_str";
}
#endif
}
return "";
}

using std::map;
using std::move;
using std::set;
@@ -701,7 +726,7 @@ void BTypeVisitor::rewriteFuncParam(FunctionDecl *D) {
// it in case of "syscall__" for other architectures.
if (strncmp(D->getName().str().c_str(), "syscall__", 9) == 0 ||
strncmp(D->getName().str().c_str(), "kprobe____x64_sys_", 18) == 0) {
preamble += "#ifdef CONFIG_ARCH_HAS_SYSCALL_WRAPPER\n";
preamble += "#if defined(CONFIG_ARCH_HAS_SYSCALL_WRAPPER) && !defined(__s390x__)\n";
genParamIndirectAssign(D, preamble, calling_conv_regs);
preamble += "\n#else\n";
genParamDirectAssign(D, preamble, calling_conv_regs);
@@ -947,6 +972,22 @@ bool BTypeVisitor::VisitCallExpr(CallExpr *Call) {
} else if (Call->getCalleeDecl()) {
NamedDecl *Decl = dyn_cast<NamedDecl>(Call->getCalleeDecl());
if (!Decl) return true;

string text;

std::string probe = check_bpf_probe_read_user(Decl->getName());
if (probe != "") {
vector<string> probe_args;

for (auto arg : Call->arguments())
probe_args.push_back(
rewriter_.getRewrittenText(expansionRange(arg->getSourceRange())));

text = probe + "(" + probe_args[0] + ", " + probe_args[1] + ", " +
probe_args[2] + ")";
rewriter_.ReplaceText(expansionRange(Call->getSourceRange()), text);
}

if (AsmLabelAttr *A = Decl->getAttr<AsmLabelAttr>()) {
// Functions with the tag asm("llvm.bpf.extra") are implemented in the
// rewriter rather than as a macro since they may also include nested
@@ -959,10 +1000,10 @@ bool BTypeVisitor::VisitCallExpr(CallExpr *Call) {
}

vector<string> args;

for (auto arg : Call->arguments())
args.push_back(rewriter_.getRewrittenText(expansionRange(arg->getSourceRange())));

string text;
if (Decl->getName() == "incr_cksum_l3") {
text = "bpf_l3_csum_replace_(" + fn_args_[0]->getName().str() + ", (u64)";
text += args[0] + ", " + args[1] + ", " + args[2] + ", sizeof(" + args[2] + "))";
@@ -994,8 +1035,10 @@ bool BTypeVisitor::VisitCallExpr(CallExpr *Call) {
text = "({ u64 __addr = 0x0; ";
text += "_bpf_readarg_" + current_fn_ + "_" + args[0] + "(" +
args[1] + ", &__addr, sizeof(__addr));";
text += "bpf_probe_read(" + args[2] + ", " + args[3] +
", (void *)__addr);";

text += check_bpf_probe_read_user(StringRef("bpf_probe_read_user"));

text += "(" + args[2] + ", " + args[3] + ", (void *)__addr);";
text += "})";
rewriter_.ReplaceText(expansionRange(Call->getSourceRange()), text);
} else if (Decl->getName() == "bpf_usdt_readarg") {
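The fallback in `check_bpf_probe_read_user()` above can be summarised as a small decision function. The Python below is a sketch of that logic for illustration, not code from bcc: `pick_probe_helper`, `kernel_version`, and `has_user_symbol` are hypothetical names. The version comparison stands in for the compile-time `LINUX_VERSION_CODE` check, and the boolean stands in for the kernel symbol lookup done through `bcc_symcache_resolve_name()`.

```python
# Map each user-space helper to the older helper used as a fallback.
FALLBACKS = {
    "bpf_probe_read_user": "bpf_probe_read",
    "bpf_probe_read_user_str": "bpf_probe_read_str",
}


def pick_probe_helper(name, kernel_version, has_user_symbol):
    """Return the helper name the rewriter would emit, or "" if `name`
    is not one of the user-space read helpers (hypothetical model).

    kernel_version: (major, minor, patch) tuple, e.g. (5, 5, 0).
    has_user_symbol: True if the kernel exposes bpf_probe_read_user,
                     as on a kernel with the helper backported.
    """
    if name not in FALLBACKS:
        return ""  # not a call this check rewrites
    if kernel_version >= (5, 5, 0) or has_user_symbol:
        return name  # helper is available, keep the call unchanged
    return FALLBACKS[name]  # older kernel, fall back to the generic helper
```

Under this model a tool written against `bpf_probe_read_user_str()` keeps that call on a 5.5 or newer kernel (or one with the symbol backported) and is rewritten to `bpf_probe_read_str()` otherwise, which is how the commit keeps the updated tools working on older kernels.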
2 changes: 1 addition & 1 deletion tools/bashreadline.py
@@ -52,7 +52,7 @@
return 0;
pid = bpf_get_current_pid_tgid();
data.pid = pid;
bpf_probe_read(&data.str, sizeof(data.str), (void *)PT_REGS_RC(ctx));
bpf_probe_read_user(&data.str, sizeof(data.str), (void *)PT_REGS_RC(ctx));
bpf_get_current_comm(&comm, sizeof(comm));
if (comm[0] == 'b' && comm[1] == 'a' && comm[2] == 's' && comm[3] == 'h' && comm[4] == 0 ) {
3 changes: 2 additions & 1 deletion tools/biosnoop.lua
@@ -84,7 +84,8 @@ int trace_req_completion(struct pt_regs *ctx, struct request *req)
valp = infobyreq.lookup(&req);
if (valp == 0) {
data.len = req->__data_len;
strcpy(data.name,"?");
data.name[0] = '?';
data.name[1] = 0;
} else {
data.pid = valp->pid;
data.len = req->__data_len;
3 changes: 2 additions & 1 deletion tools/biosnoop.py
@@ -108,7 +108,8 @@
valp = infobyreq.lookup(&req);
if (valp == 0) {
data.len = req->__data_len;
strcpy(data.name, "?");
data.name[0] = '?';
data.name[1] = 0;
} else {
if (##QUEUE##) {
data.qdelta = *tsp - valp->ts;
12 changes: 9 additions & 3 deletions tools/dbslower.py
@@ -127,12 +127,12 @@
tmp.timestamp = bpf_ktime_get_ns();
#if defined(MYSQL56)
bpf_probe_read(&tmp.query, sizeof(tmp.query), (void*) PT_REGS_PARM3(ctx));
bpf_probe_read_user(&tmp.query, sizeof(tmp.query), (void*) PT_REGS_PARM3(ctx));
#elif defined(MYSQL57)
void* st = (void*) PT_REGS_PARM2(ctx);
char* query;
bpf_probe_read(&query, sizeof(query), st);
bpf_probe_read(&tmp.query, sizeof(tmp.query), query);
bpf_probe_read_user(&query, sizeof(query), st);
bpf_probe_read_user(&tmp.query, sizeof(tmp.query), query);
#else //USDT
bpf_usdt_readarg(1, ctx, &tmp.query);
#endif
@@ -157,7 +157,13 @@
data.pid = pid >> 32; // only process id
data.timestamp = tempp->timestamp;
data.duration = delta;
#if defined(MYSQL56) || defined(MYSQL57)
// We already copied string to the bpf stack. Hence use bpf_probe_read()
bpf_probe_read(&data.query, sizeof(data.query), tempp->query);
#else
// USDT - we didnt copy string to the bpf stack before.
bpf_probe_read_user(&data.query, sizeof(data.query), tempp->query);
#endif
events.perf_submit(ctx, &data, sizeof(data));
#ifdef THRESHOLD
}
4 changes: 2 additions & 2 deletions tools/execsnoop.py
@@ -120,15 +120,15 @@ def parse_uid(user):
static int __submit_arg(struct pt_regs *ctx, void *ptr, struct data_t *data)
{
bpf_probe_read(data->argv, sizeof(data->argv), ptr);
bpf_probe_read_user(data->argv, sizeof(data->argv), ptr);
events.perf_submit(ctx, data, sizeof(struct data_t));
return 1;
}
static int submit_arg(struct pt_regs *ctx, void *ptr, struct data_t *data)
{
const char *argp = NULL;
bpf_probe_read(&argp, sizeof(argp), ptr);
bpf_probe_read_user(&argp, sizeof(argp), ptr);
if (argp) {
return __submit_arg(ctx, (void *)(argp), data);
}