.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md
.TH "ibacm" 8 "2014-06-16" "ibacm" "ibacm" ibacm
.SH NAME
ibacm \- address and route resolution services for InfiniBand.
.SH SYNOPSIS
.sp
.nf
\fIibacm\fR [\-D] [\-P] [\-A addr_file] [\-O option_file] [\-\-systemd]
.fi
.SH "DESCRIPTION"
The IB ACM implements and provides a framework for name,
address, and route (path) resolution services over InfiniBand.
It is intended to address connection setup scalability issues running
MPI applications on large clusters. The IB ACM provides information
needed to establish a connection, but does not implement the CM protocol.
.P
A primary user of the ibacm service is the librdmacm library. This
enables applications to make use of the ibacm service without code
changes or needing to be aware that the service is in use.
librdmacm versions 1.0.12 - 1.0.15 can invoke IB ACM services when built using
the --with-ib_acm option. Version 1.0.16 and newer of librdmacm will automatically
use the IB ACM if it is installed. The IB ACM services tie in under the
rdma_resolve_addr, rdma_resolve_route, and rdma_getaddrinfo routines.
For maximum benefit, the rdma_getaddrinfo routine should be used;
however, existing applications should still see significant connection
scaling benefits using the calls
available in librdmacm 1.0.11 and previous releases.
.P
The IB ACM is focused on being scalable, efficient, and extensible. It implements
a plugin architecture that allows a vendor to supply its proprietary provider in
addition to the default provider. The current default provider implementation,
ibacmp, limits network traffic, SA interactions, and reliance on centralized
services. Ibacmp supports multiple resolution protocols in order to handle
different fabric topologies.
.P
The IB ACM package consists of three components: the ibacm core service,
the default provider ibacmp shared library, and a test/configuration utility,
ib_acme. All three are userspace components and are available for Linux.
Additional details are given below.
.SH "OPTIONS"
.TP
\-D
run in daemon mode (default)
.TP
\-P
run as standard process
.TP
\-A addr_file
address configuration file
.TP
\-O option_file
option configuration file
.TP
\-\-systemd
Enable systemd integration. This includes optional socket activation of the daemon's
listening socket.
.SH "QUICK START GUIDE"
1. Prerequisites: libibverbs and libibumad must be installed.
The IB stack should be running with IPoIB configured.
These steps assume that the user has administrative privileges.
.P
2. Install the IB ACM package. This installs ibacm, ibacmp, ib_acme, and init.d scripts.
.P
3. Run 'ibacm' as administrator to start the ibacm daemon.
.P
4. Optionally, run 'ib_acme -d <dest_ip> -v' to verify that
the ibacm service is running.
.P
5. Install librdmacm, using the build option --with-ib_acm if needed.
This build option is not needed with librdmacm 1.0.17 or newer.
The librdmacm will automatically use the ibacm service.
On failures, the librdmacm will fall back to normal resolution.
.P
6. Optionally, run ib_acme -P to gather performance statistics from the local
ibacm daemon and confirm that the service is working correctly. Similarly,
ib_acme -e can be used to enumerate all endpoints created by the local ibacm
service.
.SH "NOTES"
ib_acme:
.P
The ib_acme program serves a dual role. It acts as a utility to test
ibacm operation and to help verify that the ibacm service and selected
protocol are usable for a given cluster configuration. Additionally,
it automatically generates ibacm configuration files to assist with
or eliminate manual setup.
.P
ibacm configuration files:
.P
The ibacm service relies on two configuration files.
.P
The ibacm_addr.cfg file contains name and address mappings for each IB
<device, port, pkey> endpoint. Although the names in the ibacm_addr.cfg
file can be anything, ib_acme maps the host name to the IB endpoints. IP
addresses, on the other hand, are assigned dynamically. If the address file
cannot be found, the ibacm service will attempt to create one using default
values.
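.P
As an illustration of the mapping described above, the following Python
sketch parses address entries into endpoint records. It assumes a
whitespace-separated line layout of address, device, port, and pkey, with
# starting a comment; that layout, the sample entries, and the names
Endpoint and parse_addr_lines are assumptions made for this sketch and
are not part of ibacm.
.P
.nf
```python
from typing import NamedTuple


class Endpoint(NamedTuple):
    """One name/address mapped to an IB <device, port, pkey> endpoint."""
    address: str
    device: str
    port: int
    pkey: str


def parse_addr_lines(lines):
    """Parse entries assumed to look like: address device port pkey."""
    endpoints = []
    for raw in lines:
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) != 4:
            raise ValueError("expected 4 fields: " + raw.strip())
        address, device, port, pkey = fields
        endpoints.append(Endpoint(address, device, int(port), pkey))
    return endpoints


# Hypothetical entries, for illustration only.
sample = [
    "# address  device      port pkey",
    "node31     ibv_device0 1    default",
    "node31-1   ibv_device0 1    0x00FF",
]
endpoints = parse_addr_lines(sample)
```
.fi
.P
Each resulting record pairs one name or address with a single
<device, port, pkey> endpoint, so a host name listed on several lines
maps to several endpoints.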
.P
The ibacm_opts.cfg file provides a set of configurable options for the
ibacm core service and default provider, such as timeout, number of retries,
logging level, etc. ib_acme generates the ibacm_opts.cfg file using static
information. If an option file cannot be found, ibacm will use default values.
.P
ibacm:
.P
The ibacm service is responsible for resolving names and addresses to
InfiniBand path information and caching such data. It
should execute with administrative privileges.
.P
The ibacm implements a client interface over TCP sockets, which is
abstracted by the librdmacm library. One or more providers can be loaded
by the core service, depending on the configuration. In the default provider
ibacmp, one or more back-end protocols are used to satisfy user requests.
Although ibacmp supports standard SA path record queries on the back-end, it
also supports a resolution protocol based on multicast traffic.
The latter is not usable on all fabric topologies, specifically
ones that may not have reversible paths or fabrics using torus routing.
Users should use the ib_acme utility to verify that the multicast protocol
is usable before running other applications.
.P
Conceptually, the default provider ibacmp implements an ARP-like protocol and either
uses IB multicast records to construct path record data or queries the
SA directly, depending on the selected route protocol. By default, the
ibacmp provider uses and caches SA path record queries.
.P
Specifically, all IB endpoints join a number of multicast groups.
Multicast groups differ based on rate, MTU, SL, etc., and are prioritized.
All participating endpoints must be able to communicate on the lowest
priority multicast group. The ibacmp assigns one or more names/addresses
to each IB endpoint using the ibacm_addr.cfg file. Clients provide source
and destination names or addresses as input to the service, and receive
path record data as output.
.P
The service maps a client's source name/address to a local IB endpoint.
If the destination name/address is not cached locally in the default provider,
it sends a multicast request out on the lowest priority multicast group on the
local endpoint. The request carries a list of multicast groups that the sender can use.
The recipient of the request selects the highest priority multicast group
that it can use as well and returns that information directly to the sender.
The request data is cached by all endpoints that receive the multicast
request message. The source endpoint also caches the response and uses
the multicast group that was selected to construct or obtain path record
data, which is returned to the client.
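.P
The group selection step described above can be sketched in Python as
follows. This is a simplified model, not the ibacmp implementation: groups
are represented as (priority, name) tuples, a larger number is assumed to
mean a higher priority, and the function name select_group is hypothetical.
.P
.nf
```python
def select_group(sender_groups, recipient_groups):
    """Return the highest-priority group usable by both sides, or None.

    Each group is a (priority, name) tuple. Treating a larger number as
    a higher priority is an assumption made for this sketch.
    """
    common = set(sender_groups) & set(recipient_groups)
    # Tuples compare by priority first, so max() picks the best shared group.
    return max(common, default=None)


# The request carries the groups the sender can use.
sender = [(0, "base"), (1, "fast")]
# This recipient has joined only the lowest-priority group.
recipient = [(0, "base")]
chosen = select_group(sender, recipient)
```
.fi
.P
Because every participating endpoint must be reachable on the lowest
priority group, the intersection is expected to be non-empty in a
correctly configured fabric; the None result only covers the
misconfigured case.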
.P
The current implementation of the provider ibacmp has several additional restrictions:
.P
- The ibacmp is limited in its handling of dynamic changes.
ibacm must be stopped and restarted if a cluster is reconfigured.
.P
- Support for IPv6 has not been verified.
.P
- The number of multicast groups that an endpoint can support is limited to 2.
.P
The ibacmp contains several internal caches. These include caches for GID
and LID destination addresses. These caches can be optionally
preloaded. ibacm supports the OpenSM dump_pr plugin "full" PathRecord
format, which is used to preload these caches.
The file format is specified in the ibacm_opts.cfg file via the
route_preload setting, which should be set to full_opensm_v1 for this
file format. The default is none, which does not preload these caches.
See dump_pr.notes.txt in dump_pr for more information on the
full_opensm_v1 file format and how to configure OpenSM to
generate this file.
.P
Additionally, the name, IPv4, and IPv6 caches can be preloaded by using
the addr_preload option. The default is none, which does not preload these
caches. To preload these caches, set this option to acm_hosts and
configure the addr_data_file appropriately.
.SH "SEE ALSO"
ibacm(7), ib_acme(1), rdma_cm(7)