Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: calipso_req_setattr() calls into _copy_from_user() #40

Closed
pcmoore opened this issue Jun 20, 2018 · 12 comments
Closed

BUG: calipso_req_setattr() calls into _copy_from_user() #40

pcmoore opened this issue Jun 20, 2018 · 12 comments
Assignees

Comments

@pcmoore
Copy link
Member

pcmoore commented Jun 20, 2018

While running tests with the selinux-testsuite, a kernel WARNING was uncovered with the following backtrace:

[ 3050.288154] WARNING: CPU: 1 PID: 3144 at lib/usercopy.c:11 _copy_from_user+0x85/0x90
[ 3050.294409] Modules linked in: nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_security ip6_tables xt_CONNSECMARK xt_SECMARK nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c iptable_security ah6 xfrm6_mode_transport ah4 xfrm4_mode_transport ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp scsi_transport_srp ib_umad rpcrdma rdma_ucm ib_iser ib_ipoib rdma_cm iw_cm libiscsi scsi_transport_iscsi ib_cm mlx5_ib ib_uverbs ib_core sunrpc crct10dif_pclmul crc32_pclmul ghash_clmulni_intel joydev virtio_balloon i2c_piix4 crc32c_intel mlx5_core drm_kms_helper mlxfw ttm serio_raw devlink drm virtio_net virtio_console virtio_blk net_failover failover qemu_fw_cfg ata_generic pata_acpi
[ 3050.321805] CPU: 1 PID: 3144 Comm: client Not tainted 4.18.0-0.rc1.git1.1.1.secnext.fc29.x86_64 #1
[ 3050.325220] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 3050.327076] RIP: 0010:_copy_from_user+0x85/0x90
[ 3050.328569] Code: de 89 ea e8 4d 19 52 00 89 c0 eb e5 48 29 c5 49 01 ec 48 89 c5 48 89 ea 4c 89 e7 31 f6 e8 c3 3e 52 00 48 89 e8 5b 5d 41 5c c3 <0f> 0b eb a3 0f 1f 80 00 00 00 00 41 54 49 89 f4 be 19 00 00 00 55 
[ 3050.334450] RSP: 0018:ffff89603b4038b8 EFLAGS: 00010206
[ 3050.336047] RAX: 0000000080000305 RBX: ffff89602288b000 RCX: ffff895f00000000
[ 3050.338248] RDX: 0000000000000010 RSI: 000000000000000a RDI: ffffffff9334ead5
[ 3050.340357] RBP: 0000000000000010 R08: 0000000000000010 R09: 0000000000000060
[ 3050.342421] R10: 0000000000000040 R11: 0000000000000040 R12: ffff895f42e37160
[ 3050.344493] R13: ffff895f42e37130 R14: ffff89603b403928 R15: 0000000000000000
[ 3050.346557] FS:  00007fab6bde0f80(0000) GS:ffff89603b400000(0000) knlGS:0000000000000000
[ 3050.348943] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3050.350540] CR2: 00007fab6b6f7ae0 CR3: 0000000057750000 CR4: 00000000001406e0
[ 3050.352583] Call Trace:
[ 3050.353288]  <IRQ>
[ 3050.353886]  ipv6_renew_option+0xb2/0xf0
[ 3050.354973]  ipv6_renew_options+0x26a/0x340
[ 3050.356141]  ipv6_renew_options_kern+0x2c/0x40
[ 3050.357404]  calipso_req_setattr+0x72/0xe0
[ 3050.358553]  netlbl_req_setattr+0x126/0x1b0
[ 3050.359710]  selinux_netlbl_inet_conn_request+0x80/0x100
[ 3050.361105]  selinux_inet_conn_request+0x6d/0xb0
[ 3050.362337]  security_inet_conn_request+0x32/0x50
[ 3050.363588]  tcp_conn_request+0x35f/0xe00
[ 3050.364655]  ? __lock_acquire+0x250/0x16c0
[ 3050.365764]  ? selinux_socket_sock_rcv_skb+0x1ae/0x210
[ 3050.367149]  ? tcp_rcv_state_process+0x289/0x106b
[ 3050.368400]  tcp_rcv_state_process+0x289/0x106b
[ 3050.369594]  ? tcp_v6_do_rcv+0x1a7/0x3c0
[ 3050.370590]  tcp_v6_do_rcv+0x1a7/0x3c0
[ 3050.371543]  tcp_v6_rcv+0xc82/0xcf0
[ 3050.372445]  ip6_input_finish+0x10d/0x690
[ 3050.373462]  ip6_input+0x45/0x1e0
[ 3050.374334]  ? ip6_rcv_finish+0x1d0/0x1d0
[ 3050.375352]  ipv6_rcv+0x32b/0x880
[ 3050.376206]  ? ip6_make_skb+0x1e0/0x1e0
[ 3050.377191]  __netif_receive_skb_core+0x6f2/0xdf0
[ 3050.378401]  ? process_backlog+0x85/0x250
[ 3050.379409]  ? process_backlog+0x85/0x250
[ 3050.380388]  ? process_backlog+0xec/0x250
[ 3050.381371]  process_backlog+0xec/0x250
[ 3050.382390]  net_rx_action+0x153/0x480
[ 3050.383304]  __do_softirq+0xd9/0x4f7
[ 3050.384186]  do_softirq_own_stack+0x2a/0x40
[ 3050.385203]  </IRQ>
[ 3050.385736]  ? ip6_finish_output2+0x267/0x990
[ 3050.386792]  do_softirq.part.12+0x68/0x70
[ 3050.387783]  __local_bh_enable_ip+0xce/0xe0
[ 3050.388804]  ip6_finish_output2+0x290/0x990
[ 3050.389789]  ? __lock_is_held+0x5a/0xa0
[ 3050.390685]  ? ip6_output+0x7a/0x2b0
[ 3050.391506]  ip6_output+0x7a/0x2b0
[ 3050.392299]  ? ip6_fragment+0xb30/0xb30
[ 3050.393184]  ip6_xmit+0x2ec/0x860
[ 3050.393956]  ? ip6_append_data+0x150/0x150
[ 3050.394906]  ? inet6_csk_xmit+0x67/0x230
[ 3050.395792]  ? __lock_is_held+0x5a/0xa0
[ 3050.396691]  inet6_csk_xmit+0x10b/0x230
[ 3050.397586]  tcp_transmit_skb+0x4fd/0xb30
[ 3050.398512]  tcp_connect+0xcad/0x1080
[ 3050.399363]  tcp_v6_connect+0x65d/0x950
[ 3050.400228]  ? __inet_stream_connect+0xd1/0x370
[ 3050.401230]  ? tcp_v6_pre_connect+0x70/0x70
[ 3050.402159]  __inet_stream_connect+0xd1/0x370
[ 3050.403112]  ? mark_held_locks+0x57/0x80
[ 3050.403989]  ? __local_bh_enable_ip+0x80/0xe0
[ 3050.404948]  inet_stream_connect+0x36/0x50
[ 3050.405851]  __sys_connect+0xd3/0x100
[ 3050.406665]  ? trace_hardirqs_on_caller+0xed/0x180
[ 3050.407723]  ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 3050.408750]  __x64_sys_connect+0x16/0x20
[ 3050.409625]  do_syscall_64+0x60/0x1f0
[ 3050.410400]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 3050.411465] RIP: 0033:0x7fab6b6da304
[ 3050.412292] Code: 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8d 05 01 44 2c 00 8b 00 85 c0 75 13 b8 2a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 41 54 41 89 d4 55 48 89 f5 53 
[ 3050.416293] RSP: 002b:00007ffe529be1a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
[ 3050.417875] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fab6b6da304
[ 3050.419351] RDX: 000000000000001c RSI: 0000000000af72b0 RDI: 0000000000000003
[ 3050.420776] RBP: 00007ffe529be340 R08: 0000000000000010 R09: 0000000000000100
[ 3050.422201] R10: fffffffffffff3ad R11: 0000000000000246 R12: 0000000000400a10
[ 3050.423643] R13: 00007ffe529be420 R14: 0000000000000000 R15: 0000000000000000

... the issue would appear that calipso_req_setattr() ends up calling a function which assumes the IPv6 option data is coming from userspace, and ends up calling _copy_from_user() to safely copy the data. Unfortunately in this particular case the IPv6 option is not coming from userspace which triggers the warning we see above.

@pcmoore
Copy link
Member Author

pcmoore commented Jun 20, 2018

I still need to look at this a bit more, but I suspect the solution is going to require a second version of ipv6_renew_options() which can operate safely on kernel allocated option data. It is unclear if it is easier to provide a limited implementation specific to CALIPSO (we do something somewhat similar for CIPSO) or if we should provide a generic implementation that could live in net/ipv6/exthdrs.c.

@pcmoore
Copy link
Member Author

pcmoore commented Jun 20, 2018

@hdmdavies is the original author of this code, but he may not be actively monitoring this account. I'm including him on the off chance he wants to be involved in the fix.

@pcmoore
Copy link
Member Author

pcmoore commented Jun 20, 2018

Alternatively, another option might be to move the copy_from_user() call out of ipv6_renew_option() and up into do_ipv6_setsockopt() as that appears to be the only other user of the ipv6_renew_option() function.

@pcmoore
Copy link
Member Author

pcmoore commented Jun 20, 2018

Alternatively, another option might be to move the copy_from_user() call out of ipv6_renew_option() and up into do_ipv6_setsockopt() as that appears to be the only other user of the ipv6_renew_option() function.

It appears there is some precedence for calling copy_from_user() in do_ipv6_setsockopt().

@pcmoore pcmoore self-assigned this Jun 20, 2018
@pcmoore
Copy link
Member Author

pcmoore commented Jun 21, 2018

I think I'm going to give up on moving the copy_from_user() call as that is always going to result in an extra copy in the common (non-CALIPSO) case. While it's far from the critical path, it's stuff like that which the netdev folks love to reject.

I'm currently investigating how ugly it would be to add a CALIPSO specific version of ipv6_renew_options().

@pcmoore
Copy link
Member Author

pcmoore commented Jun 21, 2018

It looks like that is going to be a poor option too; more investigation is needed.

@pcmoore
Copy link
Member Author

pcmoore commented Jun 21, 2018

Completely untested, but here is my first attempt at a fix: pcmoore/misc-linux_kernel@489a9f7.

I'm currently building a test kernel RPM via COPR in case anyone wants to try it out.

@pcmoore
Copy link
Member Author

pcmoore commented Jun 22, 2018

Initial testing is proving positive. The system is able to demonstrate basic functionality and both the selinux-testsuite and audit-testsuite pass with no kernel warnings/panics.

@pcmoore
Copy link
Member Author

pcmoore commented Jun 22, 2018

@pcmoore
Copy link
Member Author

pcmoore commented Jul 2, 2018

Follow on revision posted to netdev upstream:

@pcmoore
Copy link
Member Author

pcmoore commented Jul 2, 2018

... and a v2 of the follow on because the 0-day test robot found a mistake (a rather foolish mistake I might add):

@pcmoore
Copy link
Member Author

pcmoore commented Aug 31, 2018

Resolved with commit a9ba23d that was included in v4.18.

@pcmoore pcmoore closed this as completed Aug 31, 2018
pcmoore pushed a commit that referenced this issue Oct 22, 2018
In 4.19-rc1, Eugeniy reported weird boot and IO errors on ARC HSDK

| INFO: task syslogd:77 blocked for more than 10 seconds.
|       Not tainted 4.19.0-rc1-00007-gf213acea4e88 #40
| "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
| message.
| syslogd         D    0    77     76 0x00000000
|
| Stack Trace:
|  __switch_to+0x0/0xac
|  __schedule+0x1b2/0x730
|  io_schedule+0x5c/0xc0
|  __lock_page+0x98/0xdc
|  find_lock_entry+0x38/0x100
|  shmem_getpage_gfp.isra.3+0x82/0xbfc
|  shmem_fault+0x46/0x138
|  handle_mm_fault+0x5bc/0x924
|  do_page_fault+0x100/0x2b8
|  ret_from_exception+0x0/0x8

He bisected to 84c6591 ("locking/atomics,
asm-generic/bitops/lock.h: Rewrite using atomic_fetch_*()")

This commit however only unmasked the real issue introduced by commit
4aef66c ("locking/atomic, arch/arc: Fix build") which missed the
retry-if-scond-failed branch in atomic_fetch_##op() macros.

The bisected commit started using atomic_fetch_##op() macros for building
the rest of atomics.

Fixes: 4aef66c ("locking/atomic, arch/arc: Fix build")
Reported-by: Eugeniy Paltsev <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>
[vgupta: wrote changelog]
pcmoore pushed a commit that referenced this issue Apr 4, 2022
…frame()

The following KASAN warning is detected by QEMU.

==================================================================
BUG: KASAN: stack-out-of-bounds in unwind_frame+0x508/0x870
Read of size 4 at addr c36bba90 by task cat/163

CPU: 1 PID: 163 Comm: cat Not tainted 5.10.0-rc1 #40
Hardware name: ARM-Versatile Express
[<c0113fac>] (unwind_backtrace) from [<c010e71c>] (show_stack+0x10/0x14)
[<c010e71c>] (show_stack) from [<c0b805b4>] (dump_stack+0x98/0xb0)
[<c0b805b4>] (dump_stack) from [<c0b7d658>] (print_address_description.constprop.0+0x58/0x4bc)
[<c0b7d658>] (print_address_description.constprop.0) from [<c031435c>] (kasan_report+0x154/0x170)
[<c031435c>] (kasan_report) from [<c0113c44>] (unwind_frame+0x508/0x870)
[<c0113c44>] (unwind_frame) from [<c010e298>] (__save_stack_trace+0x110/0x134)
[<c010e298>] (__save_stack_trace) from [<c01ce0d8>] (stack_trace_save+0x8c/0xb4)
[<c01ce0d8>] (stack_trace_save) from [<c0313520>] (kasan_set_track+0x38/0x60)
[<c0313520>] (kasan_set_track) from [<c0314cb8>] (kasan_set_free_info+0x20/0x2c)
[<c0314cb8>] (kasan_set_free_info) from [<c0313474>] (__kasan_slab_free+0xec/0x120)
[<c0313474>] (__kasan_slab_free) from [<c0311e20>] (kmem_cache_free+0x7c/0x334)
[<c0311e20>] (kmem_cache_free) from [<c01c35dc>] (rcu_core+0x390/0xccc)
[<c01c35dc>] (rcu_core) from [<c01013a8>] (__do_softirq+0x180/0x518)
[<c01013a8>] (__do_softirq) from [<c0135214>] (irq_exit+0x9c/0xe0)
[<c0135214>] (irq_exit) from [<c01a40e4>] (__handle_domain_irq+0xb0/0x110)
[<c01a40e4>] (__handle_domain_irq) from [<c0691248>] (gic_handle_irq+0xa0/0xb8)
[<c0691248>] (gic_handle_irq) from [<c0100b0c>] (__irq_svc+0x6c/0x94)
Exception stack(0xc36bb928 to 0xc36bb970)
b920:                   c36bb9c0 00000000 c0126919 c0101228 c36bb9c0 b76d7730
b940: c36b8000 c36bb9a0 c3335b00 c01ce0d8 00000003 c36bba3c c36bb940 c36bb978
b960: c010e298 c011373c 60000013 ffffffff
[<c0100b0c>] (__irq_svc) from [<c011373c>] (unwind_frame+0x0/0x870)
[<c011373c>] (unwind_frame) from [<00000000>] (0x0)

The buggy address belongs to the page:
page:(ptrval) refcount:0 mapcount:0 mapping:00000000 index:0x0 pfn:0x636bb
flags: 0x0()
raw: 00000000 00000000 ef867764 00000000 00000000 00000000 ffffffff 00000000
page dumped because: kasan: bad access detected

addr c36bba90 is located in stack of task cat/163 at offset 48 in frame:
 stack_trace_save+0x0/0xb4

this frame has 1 object:
 [32, 48) 'trace'

Memory state around the buggy address:
 c36bb980: f1 f1 f1 f1 00 04 f2 f2 00 00 f3 f3 00 00 00 00
 c36bba00: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
>c36bba80: 00 00 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
                 ^
 c36bbb00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 c36bbb80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================

There is a same issue on x86 and has been resolved by the commit f7d27c3
("x86/mm, kasan: Silence KASAN warnings in get_wchan()").
The solution could be applied to arm architecture too.

Signed-off-by: Lin Yujun <[email protected]>
Reported-by: He Ying <[email protected]>
Signed-off-by: Russell King (Oracle) <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant