summaryrefslogtreecommitdiff
path: root/share/man/man9/fail.9
blob: d7447b1cd35d974416d3695a3793ba388884249a (plain) (blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
.\"
.\" Copyright (c) 2009-2019 Dell EMC Isilon http://www.isilon.com/
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice(s), this list of conditions and the following disclaimer as
.\"    the first lines of this file unmodified other than the possible
.\"    addition of one or more copyright notices.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice(s), this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER(S) ``AS IS'' AND ANY
.\" EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
.\" WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
.\" DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT HOLDER(S) BE LIABLE FOR ANY
.\" DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
.\" (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
.\" SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
.\" CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
.\" DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd June 6, 2019
.Dt FAIL 9
.Os
.Sh NAME
.Nm DEBUG_FP ,
.Nm KFAIL_POINT_CODE ,
.Nm KFAIL_POINT_CODE_FLAGS ,
.Nm KFAIL_POINT_CODE_COND ,
.Nm KFAIL_POINT_ERROR ,
.Nm KFAIL_POINT_EVAL ,
.Nm KFAIL_POINT_DECLARE ,
.Nm KFAIL_POINT_DEFINE ,
.Nm KFAIL_POINT_GOTO ,
.Nm KFAIL_POINT_RETURN ,
.Nm KFAIL_POINT_RETURN_VOID ,
.Nm KFAIL_POINT_SLEEP_CALLBACKS ,
.Nm fail_point
.Nd fail points
.Sh SYNOPSIS
.In sys/fail.h
.Fn KFAIL_POINT_CODE "parent" "name" "code"
.Fn KFAIL_POINT_CODE_FLAGS "parent" "name" "flags" "code"
.Fn KFAIL_POINT_CODE_COND "parent" "name" "cond" "flags" "code"
.Fn KFAIL_POINT_ERROR "parent" "name" "error_var"
.Fn KFAIL_POINT_EVAL "name" "code"
.Fn KFAIL_POINT_DECLARE "name"
.Fn KFAIL_POINT_DEFINE "parent" "name" "flags"
.Fn KFAIL_POINT_GOTO "parent" "name" "error_var" "label"
.Fn KFAIL_POINT_RETURN "parent" "name"
.Fn KFAIL_POINT_RETURN_VOID "parent" "name"
.Fn KFAIL_POINT_SLEEP_CALLBACKS "parent" "name" "pre_func" "pre_arg" "post_func" "post_arg" "code"
.Sh DESCRIPTION
Fail points are used to add code points where errors may be injected
in a user controlled fashion.
Fail points provide a convenient wrapper around user-provided error
injection code, providing a
.Xr sysctl 9
MIB, and a parser for that MIB that describes how the error
injection code should fire.
.Pp
The base fail point macro is
.Fn KFAIL_POINT_CODE
where
.Fa parent
is a sysctl tree (frequently
.Sy DEBUG_FP
for kernel fail points, but various subsystems may wish to provide
their own fail point trees), and
.Fa name
is the name of the MIB in that tree, and
.Fa code
is the error injection code.
The
.Fa code
argument does not require braces, but it is considered good style to
use braces for any multi-line code arguments.
Inside the
.Fa code
argument, the evaluation of
.Sy RETURN_VALUE
is derived from the
.Fn return
value set in the sysctl MIB.
.Pp
Additionally,
.Fn KFAIL_POINT_CODE_FLAGS
provides a
.Fa flags
argument which controls the fail point's behaviour.
This can be used to e.g., mark the fail point's context as non-sleepable,
which causes the
.Sy sleep
action to be coerced to a busy wait.
The supported flags are:
.Bl -ohang -offset indent
.It FAIL_POINT_USE_TIMEOUT_PATH
Rather than sleeping on a
.Fn sleep
call, just fire the post-sleep function after a timeout fires.
.It FAIL_POINT_NONSLEEPABLE
Mark the fail point as being in a non-sleepable context, which coerces
.Fn sleep
calls to
.Fn delay
calls.
.El
.Pp
Likewise,
.Fn KFAIL_POINT_CODE_COND
supplies a
.Fa cond
argument, which allows you to set the condition under which the fail point's
code may fire.
This is equivalent to:
.Bd -literal
	if (cond)
		KFAIL_POINT_CODE_FLAGS(...);

.Ed
See
.Sx SYSCTL VARIABLES
below.
.Pp
The remaining
.Fn KFAIL_POINT_*
macros are wrappers around common error injection paths:
.Bl -inset
.It Fn KFAIL_POINT_RETURN parent name
is the equivalent of
.Sy KFAIL_POINT_CODE(..., return RETURN_VALUE)
.It Fn KFAIL_POINT_RETURN_VOID parent name
is the equivalent of
.Sy KFAIL_POINT_CODE(..., return)
.It Fn KFAIL_POINT_ERROR parent name error_var
is the equivalent of
.Sy KFAIL_POINT_CODE(..., error_var = RETURN_VALUE)
.It Fn KFAIL_POINT_GOTO parent name error_var label
is the equivalent of
.Sy KFAIL_POINT_CODE(..., { error_var = RETURN_VALUE; goto label;})
.El
.Pp
You can also introduce fail points by separating the declaration,
definition, and evaluation portions.
.Bl -inset
.It Fn KFAIL_POINT_DECLARE name
is used to declare the
.Sy fail_point
struct.
.It Fn KFAIL_POINT_DEFINE parent name flags
defines and initializes the
.Sy fail_point
and sets up its
.Xr sysctl 9 .
.It Fn KFAIL_POINT_EVAL name code
is used at the point that the fail point is executed.
.El
.Sh SYSCTL VARIABLES
The
.Fn KFAIL_POINT_*
macros add sysctl MIBs where specified.
Many base kernel MIBs can be found in the
.Sy debug.fail_point
tree (referenced in code by
.Sy DEBUG_FP ) .
.Pp
The sysctl variable may be set in a number of ways:
.Bd -literal
  [<pct>%][<cnt>*]<type>[(args...)][-><more terms>]
.Ed
.Pp
The <type> argument specifies which action to take; it can be one of:
.Bl -tag -width ".Dv return"
.It Sy off
Take no action (does not trigger fail point code)
.It Sy return
Trigger fail point code with specified argument
.It Sy sleep
Sleep the specified number of milliseconds
.It Sy panic
Panic
.It Sy break
Break into the debugger, or trap if there is no debugger support
.It Sy print
Print that the fail point executed
.It Sy pause
Threads sleep at the fail point until the fail point is set to
.Sy off
.It Sy yield
Thread yields the cpu when the fail point is evaluated
.It Sy delay
Similar to sleep, but busy waits the cpu.
(Useful in non-sleepable contexts.)
.El
.Pp
The <pct>% and <cnt>* modifiers prior to <type> control when
<type> is executed.
The <pct>% form (e.g. "1.2%") can be used to specify a
probability that <type> will execute.
This is a decimal in the range (0, 100] which can specify up to
1/10,000% precision.
The <cnt>* form (e.g. "5*") can be used to specify the number of
times <type> should be executed before this <term> is disabled.
Only the last probability and the last count are used if multiple
are specified, i.e. "1.2%2%" is the same as "2%".
When both a probability and a count are specified, the probability
is evaluated before the count, i.e. "2%5*" means "2% of the time,
but only 5 times total".
.Pp
The operator -> can be used to express cascading terms.
If you specify <term1>-><term2>, it means that if <term1> does not
.Ql execute ,
<term2> is evaluated.
For the purpose of this operator, the
.Fn return
and
.Fn print
operators are the only types that cascade.
A
.Fn return
term only cascades if the code executes, and a
.Fn print
term only cascades when passed a non-zero argument.
A pid can optionally be specified.
The fail point term is only executed when invoked by a process with a
matching p_pid.
.Sh EXAMPLES
.Bl -tag -width Sy
.It Sy sysctl debug.fail_point.foobar="2.1%return(5)"
21/1000ths of the time, execute
.Fa code
with RETURN_VALUE set to 5.
.It Sy sysctl debug.fail_point.foobar="2%return(5)->5%return(22)"
2/100ths of the time, execute
.Fa code
with RETURN_VALUE set to 5.
If that does not happen, 5% of the time execute
.Fa code
with RETURN_VALUE set to 22.
.It Sy sysctl debug.fail_point.foobar="5*return(5)->0.1%return(22)"
For 5 times, return 5.
After that, 1/1000th of the time, return 22.
.It Sy sysctl debug.fail_point.foobar="0.1%5*return(5)"
Return 5 for 1 in 1000 executions, but only 5 times total.
.It Sy sysctl debug.fail_point.foobar="1%*sleep(50)"
1/100th of the time, sleep 50ms.
.It Sy sysctl debug.fail_point.foobar="1*return(5)[pid 1234]"
Return 5 once, when pid 1234 executes the fail point.
.El
.Sh AUTHORS
.An -nosplit
This manual page was written by
.Pp
.An Matthew Bryan Aq Mt matthew.bryan@isilon.com
and
.Pp
.An Zach Loafman Aq Mt zml@FreeBSD.org .
.Sh CAVEATS
It is easy to shoot yourself in the foot by setting fail points too
aggressively or setting too many in combination.
For example, forcing
.Fn malloc
to fail consistently is potentially harmful to uptime.
.Pp
The
.Fn sleep
sysctl setting may not be appropriate in all situations.
Currently,
.Fn fail_point_eval
does not verify whether the context is appropriate for calling
.Fn msleep .
You can force it to evaluate a
.Sy sleep
action as a
.Sy delay
action by specifying the
.Sy FAIL_POINT_NONSLEEPABLE
flag at the point the fail point is declared.