Fault injection capabilities infrastructure
===========================================

See also drivers/md/md-faulty.c and "every_nth" module option for scsi_debug.


Available fault injection capabilities
--------------------------------------

o failslab

  injects slab allocation failures. (kmalloc(), kmem_cache_alloc(), ...)

o fail_page_alloc

  injects page allocation failures. (alloc_pages(), get_free_pages(), ...)

o fail_futex

  injects futex deadlock and uaddr fault errors.

o fail_make_request

  injects disk IO errors on devices permitted by setting
  /sys/block/<device>/make-it-fail or
  /sys/block/<device>/<partition>/make-it-fail. (generic_make_request())

o fail_mmc_request

  injects MMC data errors on devices permitted by setting
  debugfs entries under /sys/kernel/debug/mmc0/fail_mmc_request

o fail_function

  injects error return on specific functions, which are marked by
  ALLOW_ERROR_INJECTION() macro, by setting debugfs entries
  under /sys/kernel/debug/fail_function. No boot option supported.

o NVMe fault injection

  injects NVMe status code and retry flag on devices permitted by setting
  debugfs entries under /sys/kernel/debug/nvme*/fault_inject. The default
  status code is NVME_SC_INVALID_OPCODE with no retry. The status code and
  retry flag can be set via debugfs.


Configure fault-injection capabilities behavior
-----------------------------------------------

o debugfs entries

fault-inject-debugfs kernel module provides some debugfs entries for runtime
configuration of fault-injection capabilities.

- /sys/kernel/debug/fail*/probability:

	likelihood of failure injection, in percent.
	Format: <percent>

	Note that one-failure-per-hundred is a very high error rate
	for some testcases. Consider setting probability=100 and configuring
	/sys/kernel/debug/fail*/interval for such testcases.

- /sys/kernel/debug/fail*/interval:

	specifies the interval between failures, for calls to
	should_fail() that pass all the other tests.

	Note that if you enable this, by setting interval>1, you will
	probably want to set probability=100.

- /sys/kernel/debug/fail*/times:

	specifies how many times failures may happen at most.
	A value of -1 means "no limit".

- /sys/kernel/debug/fail*/space:

	specifies an initial resource "budget", decremented by "size"
	on each call to should_fail(,size). Failure injection is
	suppressed until "space" reaches zero.

- /sys/kernel/debug/fail*/verbose:

	Format: { 0 | 1 | 2 }
	specifies the verbosity of the messages when failure is
	injected. '0' means no messages; '1' will print only a single
	log line per failure; '2' will print a call trace too -- useful
	to debug the problems revealed by fault injection.

- /sys/kernel/debug/fail*/task-filter:

	Format: { 'Y' | 'N' }
	A value of 'N' disables filtering by process (default).
	A value of 'Y' limits failures to only processes indicated by
	/proc/<pid>/make-it-fail==1.

- /sys/kernel/debug/fail*/require-start:
- /sys/kernel/debug/fail*/require-end:
- /sys/kernel/debug/fail*/reject-start:
- /sys/kernel/debug/fail*/reject-end:

	specifies the range of virtual addresses tested during
	stacktrace walking. Failure is injected only if some caller
	in the walked stacktrace lies within the required range, and
	none lies within the rejected range.
	Default required range is [0,ULONG_MAX) (whole of virtual address space).
	Default rejected range is [0,0).

- /sys/kernel/debug/fail*/stacktrace-depth:

	specifies the maximum stacktrace depth walked during search
	for a caller within [require-start,require-end) OR
	[reject-start,reject-end).

- /sys/kernel/debug/fail_page_alloc/ignore-gfp-highmem:

	Format: { 'Y' | 'N' }
	default is 'N'. Setting it to 'Y' suppresses failure injection
	into highmem/user allocations.

- /sys/kernel/debug/failslab/ignore-gfp-wait:
- /sys/kernel/debug/fail_page_alloc/ignore-gfp-wait:

	Format: { 'Y' | 'N' }
	default is 'N'. Setting it to 'Y' injects failures
	only into non-sleep allocations (GFP_ATOMIC allocations).

- /sys/kernel/debug/fail_page_alloc/min-order:

	specifies the minimum page allocation order into which
	failures are injected.

- /sys/kernel/debug/fail_futex/ignore-private:

	Format: { 'Y' | 'N' }
	default is 'N'. Setting it to 'Y' disables failure injection
	when dealing with private (address space) futexes.

- /sys/kernel/debug/fail_function/inject:

	Format: { 'function-name' | '!function-name' | '' }
	specifies the target function of error injection by name.
	If the function name is prefixed with '!', the given function is
	removed from the injection list. If nothing is specified (''),
	the injection list is cleared.

- /sys/kernel/debug/fail_function/injectable:

	(read only) shows error injectable functions and what type of
	error values can be specified. The error type will be one of
	the following:
	- NULL:	retval must be 0.
	- ERRNO: retval must be -1 to -MAX_ERRNO (-4095).
	- ERR_NULL: retval must be 0 or -1 to -MAX_ERRNO (-4095).

- /sys/kernel/debug/fail_function/<function-name>/retval:

	specifies the "error" return value to inject into the given
	function. This entry is created when the user specifies a new
	injection entry.
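
The debugfs entries above are plain files, so a whole configuration can be
applied with a handful of writes. The following is only a sketch: the function
name set_fault_attrs and the FAULT_DEBUGFS variable are made up for this
example and are not part of the kernel tree, and debugfs is assumed to be
mounted at /sys/kernel/debug.

```shell
#!/bin/bash
# Hypothetical helper (not part of the kernel tree): write name=value
# pairs into the debugfs directory of one fault injection capability.
# FAULT_DEBUGFS is an assumption of this sketch; it defaults to the
# usual debugfs mount point.
FAULT_DEBUGFS=${FAULT_DEBUGFS:-/sys/kernel/debug}

set_fault_attrs()
{
	local cap=$1
	shift
	local dir=$FAULT_DEBUGFS/$cap
	local kv

	if [ ! -d "$dir" ]
	then
		echo "$dir: no such directory (is debugfs mounted?)" >&2
		return 1
	fi

	for kv in "$@"
	do
		# "interval=100" writes 100 to $dir/interval
		printf '%s\n' "${kv#*=}" > "$dir/${kv%%=*}" || return 1
	done
}
```

For example, "set_fault_attrs failslab probability=100 interval=100 times=-1"
would make every 100th eligible slab allocation fail, with no limit on the
number of failures.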

o Boot option

In order to inject faults while debugfs is not available (early boot time),
use the boot option:

	failslab=
	fail_page_alloc=
	fail_make_request=
	fail_futex=
	mmc_core.fail_request=<interval>,<probability>,<space>,<times>

o proc entries

- /proc/<pid>/fail-nth:
- /proc/self/task/<tid>/fail-nth:

	Writing an integer N to this file makes the N-th call in the task fail.
	Reading from this file returns an integer value. A value of '0' indicates
	that the fault set up with a previous write to this file was injected.
	A positive integer N indicates that the fault wasn't yet injected.
	Note that this file enables all types of faults (slab, futex, etc).
	This setting takes precedence over all other generic debugfs settings
	like probability, interval, times, etc. But per-capability settings
	(e.g. fail_futex/ignore-private) take precedence over it.

	This feature is intended for systematic testing of faults in a single
	system call. See an example below.

How to add new fault injection capability
-----------------------------------------

o #include <linux/fault-inject.h>

o define the fault attributes

  DECLARE_FAULT_ATTR(name);

  Please see the definition of struct fault_attr in fault-inject.h
  for details.

o provide a way to configure fault attributes

- boot option

  If you need to enable the fault injection capability from boot time, you can
  provide a boot option to configure it. There is a helper function for it:

	setup_fault_attr(attr, str);

- debugfs entries

  failslab, fail_page_alloc, and fail_make_request use this way.
  Helper functions:

	fault_create_debugfs_attr(name, parent, attr);

- module parameters

  If the scope of the fault injection capability is limited to a
  single kernel module, it is better to provide module parameters to
  configure the fault attributes.

o add a hook to insert failures

  Upon should_fail() returning true, client code should inject a failure.

	should_fail(attr, size);

Application Examples
--------------------

o Inject slab allocation failures into module init/exit code

#!/bin/bash

FAILTYPE=failslab
echo Y > /sys/kernel/debug/$FAILTYPE/task-filter
echo 10 > /sys/kernel/debug/$FAILTYPE/probability
echo 100 > /sys/kernel/debug/$FAILTYPE/interval
echo -1 > /sys/kernel/debug/$FAILTYPE/times
echo 0 > /sys/kernel/debug/$FAILTYPE/space
echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait

# run a command with /proc/self/make-it-fail set, so that task-filter
# limits the injected failures to that command
faulty_system()
{
	bash -c "echo 1 > /proc/self/make-it-fail && exec $*"
}

if [ $# -eq 0 ]
then
	echo "Usage: $0 modulename [ modulename ... ]"
	exit 1
fi

for m in "$@"
do
	echo inserting $m...
	faulty_system modprobe $m

	echo removing $m...
	faulty_system modprobe -r $m
done

------------------------------------------------------------------------------

o Inject page allocation failures only for a specific module

#!/bin/bash

FAILTYPE=fail_page_alloc
module=$1

if [ -z "$module" ]
then
	echo "Usage: $0 <modulename>"
	exit 1
fi

modprobe $module

if [ ! -d /sys/module/$module/sections ]
then
	echo Module $module is not loaded
	exit 1
fi

# only inject failures when a caller lies within the module's text
cat /sys/module/$module/sections/.text > /sys/kernel/debug/$FAILTYPE/require-start
cat /sys/module/$module/sections/.data > /sys/kernel/debug/$FAILTYPE/require-end

echo N > /sys/kernel/debug/$FAILTYPE/task-filter
echo 10 > /sys/kernel/debug/$FAILTYPE/probability
echo 100 > /sys/kernel/debug/$FAILTYPE/interval
echo -1 > /sys/kernel/debug/$FAILTYPE/times
echo 0 > /sys/kernel/debug/$FAILTYPE/space
echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-highmem
echo 10 > /sys/kernel/debug/$FAILTYPE/stacktrace-depth

# stop injecting when the script is interrupted or exits
trap "echo 0 > /sys/kernel/debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT

echo "Injecting errors into the module $module... (interrupt to stop)"
sleep 1000000

------------------------------------------------------------------------------

o Inject open_ctree error during btrfs mount

#!/bin/bash

rm -f testfile.img
dd if=/dev/zero of=testfile.img bs=1M seek=1000 count=1
DEVICE=$(losetup --show -f testfile.img)
mkfs.btrfs -f $DEVICE
mkdir -p tmpmnt

FAILTYPE=fail_function
FAILFUNC=open_ctree
echo $FAILFUNC > /sys/kernel/debug/$FAILTYPE/inject
echo -12 > /sys/kernel/debug/$FAILTYPE/$FAILFUNC/retval
echo N > /sys/kernel/debug/$FAILTYPE/task-filter
echo 100 > /sys/kernel/debug/$FAILTYPE/probability
echo 0 > /sys/kernel/debug/$FAILTYPE/interval
echo -1 > /sys/kernel/debug/$FAILTYPE/times
echo 0 > /sys/kernel/debug/$FAILTYPE/space
echo 1 > /sys/kernel/debug/$FAILTYPE/verbose

# the mount is expected to fail because open_ctree returns the injected error
mount -t btrfs $DEVICE tmpmnt
if [ $? -ne 0 ]
then
	echo "SUCCESS!"
else
	echo "FAILED!"
	umount tmpmnt
fi

echo > /sys/kernel/debug/$FAILTYPE/inject

rmdir tmpmnt
losetup -d $DEVICE
rm testfile.img


Tool to run command with failslab or fail_page_alloc
----------------------------------------------------
In order to make it easier to accomplish the tasks mentioned above, we can use
tools/testing/fault-injection/failcmd.sh. Please run the command
"./tools/testing/fault-injection/failcmd.sh --help" for more information and
see the following examples.

Examples:

Run the command "make -C tools/testing/selftests/ run_tests" while injecting
slab allocation failures.

	# ./tools/testing/fault-injection/failcmd.sh \
		-- make -C tools/testing/selftests/ run_tests

Same as above, except that at most 100 failures are injected instead of the
default of at most one.

	# ./tools/testing/fault-injection/failcmd.sh --times=100 \
		-- make -C tools/testing/selftests/ run_tests

Same as above, except that page allocation failures are injected instead of
slab allocation failures.

	# env FAILCMD_TYPE=fail_page_alloc \
		./tools/testing/fault-injection/failcmd.sh --times=100 \
		-- make -C tools/testing/selftests/ run_tests

Systematic faults using fail-nth
--------------------------------

The following code systematically faults the 1-st, 2-nd and so on
capabilities in the socketpair() system call.

#include <sys/types.h>
#include <sys/stat.h>
#include <sys/socket.h>
#include <sys/syscall.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>

int main()
{
	int i, err, res, fail_nth, fds[2];
	ssize_t n;
	char buf[128];

	system("echo N > /sys/kernel/debug/failslab/ignore-gfp-wait");
	sprintf(buf, "/proc/self/task/%ld/fail-nth", syscall(SYS_gettid));
	fail_nth = open(buf, O_RDWR);
	if (fail_nth < 0) {
		perror(buf);
		return 1;
	}
	for (i = 1;; i++) {
		/* make the i-th call in this task fail */
		sprintf(buf, "%d", i);
		write(fail_nth, buf, strlen(buf));
		res = socketpair(AF_LOCAL, SOCK_STREAM, 0, fds);
		err = errno;
		/* reads back 0 if the fault was injected */
		n = pread(fail_nth, buf, sizeof(buf) - 1, 0);
		buf[n > 0 ? n : 0] = '\0';
		if (res == 0) {
			close(fds[0]);
			close(fds[1]);
		}
		printf("%d-th fault %c: res=%d/%d\n", i, atoi(buf) ? 'N' : 'Y',
			res, err);
		/* a non-zero value means there was no i-th call to fail */
		if (atoi(buf))
			break;
	}
	return 0;
}

An example output:

1-th fault Y: res=-1/23
2-th fault Y: res=-1/23
3-th fault Y: res=-1/12
4-th fault Y: res=-1/12
5-th fault Y: res=-1/23
6-th fault Y: res=-1/23
7-th fault Y: res=-1/23
8-th fault Y: res=-1/12
9-th fault Y: res=-1/12
10-th fault Y: res=-1/12
11-th fault Y: res=-1/12
12-th fault Y: res=-1/12
13-th fault Y: res=-1/12
14-th fault Y: res=-1/12
15-th fault Y: res=-1/12
16-th fault N: res=0/12
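
In the output above, the number after the slash is the errno value seen by the
caller; on Linux, 12 is ENOMEM and 23 is ENFILE. When the list gets long it can
help to tally the injected faults per errno value. The following is only a
sketch: summarize_faults is a name made up for this example, and it assumes
the exact output format printed by the program above.

```shell
#!/bin/bash
# Hypothetical helper (not part of the kernel tree): count injected
# faults per errno value. Expects lines such as
# "3-th fault Y: res=-1/12" on stdin or in the files given as arguments.
# Lines marked 'N' (no fault was injected) are skipped.
summarize_faults()
{
	awk '$3 == "Y:" {
		split($4, r, "/")	# r[1] is "res=-1", r[2] is the errno
		count[r[2]]++
	}
	END {
		for (e in count)
			printf "errno %s: %d\n", e, count[e]
	}' "$@" | sort
}
```

Fed the example output above, this would report 10 faults with errno 12 and
5 faults with errno 23.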