qemu/docs/vfio-ap.txt
   1Adjunct Processor (AP) Device
   2=============================
   3
   4Contents:
   5=========
* Introduction
* AP Architectural Overview
* Start Interpretive Execution (SIE) Instruction
* AP Matrix Configuration on Linux Host
* Starting a Linux Guest Configured with an AP Matrix
* Hot plug a vfio-ap device into a running guest
* Hot unplug a vfio-ap device from a running guest
* Example: Configure AP Matrices for Three Linux Guests
  12
  13Introduction:
  14============
The IBM Adjunct Processor (AP) Cryptographic Facility consists of
three AP instructions and from 1 to 256 PCIe cryptographic adapter cards.
These AP devices provide cryptographic functions to all CPUs assigned to a
Linux system running in an IBM Z system LPAR.
  19
  20On s390x, AP adapter cards are exposed via the AP bus. This document
  21describes how those cards may be made available to KVM guests using the
  22VFIO mediated device framework.
  23
  24AP Architectural Overview:
  25=========================
In order to understand the terminology used in the rest of this document, let's
start with some definitions:
  28
  29* AP adapter
  30
  An AP adapter is an IBM Z adapter card that can perform cryptographic
  functions. There can be from 0 to 256 adapters assigned to an LPAR depending
  on the machine model. Adapters assigned to the LPAR in which a Linux host is
  running will be available to the Linux host. Each adapter is identified by a
  number from 0 to 255; however, the maximum adapter number allowed is
  determined by machine model. When installed, an AP adapter is accessed by
  AP instructions executed by any CPU.
  38
  39* AP domain
  40
  41  An adapter is partitioned into domains. Each domain can be thought of as
  42  a set of hardware registers for processing AP instructions. An adapter can
  43  hold up to 256 domains; however, the maximum domain number allowed is
  44  determined by machine model. Each domain is identified by a number from 0 to
  45  255. Domains can be further classified into two types:
  46
  47    * Usage domains are domains that can be accessed directly to process AP
  48      commands
  49
  50    * Control domains are domains that are accessed indirectly by AP
  51      commands sent to a usage domain to control or change the domain; for
  52      example, to set a secure private key for the domain.
  53
  54* AP Queue
  55
  An AP queue is the means by which an AP command-request message is sent to an
  AP usage domain inside a specific AP adapter. An AP queue is identified by a
  tuple consisting of an AP adapter ID (APID) and an AP queue index (APQI). The
  APQI corresponds to a given usage domain number within the adapter. This tuple
  forms an AP Queue Number (APQN) uniquely identifying an AP queue. AP
  instructions include a field containing the APQN to identify the AP queue to
  which the AP command-request message is to be sent for processing.
  63
  64* AP Instructions:
  65
  66  There are three AP instructions:
  67
  68  * NQAP: to enqueue an AP command-request message to a queue
  69  * DQAP: to dequeue an AP command-reply message from a queue
  70  * PQAP: to administer the queues
  71
  72  AP instructions identify the domain that is targeted to process the AP
  73  command; this must be one of the usage domains. An AP command may modify a
  74  domain that is not one of the usage domains, but the modified domain
  75  must be one of the control domains.
  76
  77Start Interpretive Execution (SIE) Instruction
  78==============================================
  79A KVM guest is started by executing the Start Interpretive Execution (SIE)
  80instruction. The SIE state description is a control block that contains the
  81state information for a KVM guest and is supplied as input to the SIE
  82instruction. The SIE state description contains a satellite control block called
  83the Crypto Control Block (CRYCB). The CRYCB contains three fields to identify
  84the adapters, usage domains and control domains assigned to the KVM guest:
  85
  86* The AP Mask (APM) field is a bit mask that identifies the AP adapters assigned
  87  to the KVM guest. Each bit in the mask, from left to right, corresponds to
  88  an APID from 0-255. If a bit is set, the corresponding adapter is valid for
  89  use by the KVM guest.
  90
* The AP Queue Mask (AQM) field is a bit mask identifying the AP usage domains
  assigned to the KVM guest. Each bit in the mask, from left to right,
  corresponds to an AP queue index (APQI) from 0-255. If a bit is set, the
  corresponding queue is valid for use by the KVM guest.

* The AP Domain Mask (ADM) field is a bit mask that identifies the AP control
  domains assigned to the KVM guest. The ADM bit mask controls which domains can
  be changed by an AP command-request message sent to a usage domain from the
  guest. Each bit in the mask, from left to right, corresponds to a domain from
  0-255. If a bit is set, the corresponding domain can be modified by an AP
  command-request message sent to a usage domain.
 102
 103If you recall from the description of an AP Queue, AP instructions include
 104an APQN to identify the AP adapter and AP queue to which an AP command-request
 105message is to be sent (NQAP and PQAP instructions), or from which a
 106command-reply message is to be received (DQAP instruction). The validity of an
 107APQN is defined by the matrix calculated from the APM and AQM; it is the
 108cross product of all assigned adapter numbers (APM) with all assigned queue
 109indexes (AQM). For example, if adapters 1 and 2 and usage domains 5 and 6 are
 110assigned to a guest, the APQNs (1,5), (1,6), (2,5) and (2,6) will be valid for
 111the guest.
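
The cross product can be illustrated with a small shell sketch. The helper
name apqn_matrix is hypothetical and exists only for this illustration; it
takes the assigned adapter numbers and usage domain numbers as space-separated
lists and prints the APQNs that would be valid for the guest:

```shell
# Print every APQN in the cross product of the given adapters (APM)
# and usage domains (AQM), one "(apid,apqi)" pair per line.
# apqn_matrix is a hypothetical helper, not part of any existing tool.
apqn_matrix() {
    adapters=$1
    domains=$2
    for apid in $adapters; do
        for apqi in $domains; do
            printf '(%d,%d)\n' "$apid" "$apqi"
        done
    done
}

# Adapters 1 and 2 with usage domains 5 and 6, as in the text above.
apqn_matrix "1 2" "5 6"
```

For the values used in the text, this prints (1,5), (1,6), (2,5) and (2,6).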
 112
The APQNs can provide secure key functionality - i.e., a private key is stored
on the adapter card for each of its domains - so each APQN must be assigned to
at most one guest or to the Linux host, never to both.
 116
 117   Example 1: Valid configuration:
 118   ------------------------------
   Guest1: adapters 1,2  domains 5,6
   Guest2: adapters 1,2  domain  7
 121
 122   This is valid because both guests have a unique set of APQNs: Guest1 has
 123   APQNs (1,5), (1,6), (2,5) and (2,6); Guest2 has APQNs (1,7) and (2,7).
 124
 125   Example 2: Valid configuration:
 126   ------------------------------
 127   Guest1: adapters 1,2 domains 5,6
 128   Guest2: adapters 3,4 domains 5,6
 129
 130   This is also valid because both guests have a unique set of APQNs:
 131      Guest1 has APQNs (1,5), (1,6), (2,5), (2,6);
 132      Guest2 has APQNs (3,5), (3,6), (4,5), (4,6)
 133
 134   Example 3: Invalid configuration:
 135   --------------------------------
 136   Guest1: adapters 1,2  domains 5,6
 137   Guest2: adapter  1    domains 6,7
 138
 139   This is an invalid configuration because both guests have access to
 140   APQN (1,6).
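
The three examples can be checked mechanically. The following sketch uses a
hypothetical helper, shared_apqns, that lists the APQNs two guest
configurations have in common; an empty result corresponds to a valid
configuration. It only models the rule described above and is not how the
driver performs the check:

```shell
# List the APQNs that two guests would share.
#   $1, $2  adapters and domains of the first guest (space-separated)
#   $3, $4  adapters and domains of the second guest (space-separated)
# shared_apqns is a hypothetical helper name.
shared_apqns() {
    for a1 in $1; do
        for d1 in $2; do
            for a2 in $3; do
                for d2 in $4; do
                    if [ "$a1" -eq "$a2" ] && [ "$d1" -eq "$d2" ]; then
                        printf '(%d,%d)\n' "$a1" "$d1"
                    fi
                done
            done
        done
    done
}

shared_apqns "1 2" "5 6" "1 2" "7"     # Example 1: prints nothing
shared_apqns "1 2" "5 6" "3 4" "5 6"   # Example 2: prints nothing
shared_apqns "1 2" "5 6" "1" "6 7"     # Example 3: prints (1,6)
```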
 141
 142AP Matrix Configuration on Linux Host:
 143=====================================
A Linux system is a guest of the LPAR in which it is running and has access to
the AP resources configured for the LPAR. The LPAR's AP matrix is
configured via its Activation Profile, which can be edited on the HMC. When the
Linux system is started, the AP bus will detect the AP devices assigned to the
LPAR and create the following in sysfs:
 149
 150/sys/bus/ap
 151... [devices]
 152...... xx.yyyy
 153...... ...
 154...... cardxx
 155...... ...
 156
Where:
    cardxx     is AP adapter number xx (in hex)
    xx.yyyy    is an APQN with xx specifying the APID and yyyy specifying the
               APQI
 161
For example, if AP adapters 5 and 6 and domains 4, 71 (0x47), 171 (0xab) and
255 (0xff) are configured for the LPAR, the sysfs representation on the Linux
host system would look like this:
 165
 166/sys/bus/ap
 167... [devices]
 168...... 05.0004
 169...... 05.0047
 170...... 05.00ab
 171...... 05.00ff
 172...... 06.0004
 173...... 06.0047
 174...... 06.00ab
 175...... 06.00ff
 176...... card05
 177...... card06
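
The naming scheme shown above (two hex digits for the APID, four for the APQI)
can be reproduced with printf. This is a minimal sketch; ap_device_names is a
hypothetical helper and the adapter and domain numbers are the ones from the
example:

```shell
# Print the sysfs device names for a set of adapters and domains:
# "xx.yyyy" for each queue, then "cardxx" for each adapter, in
# lowercase hex. ap_device_names is a hypothetical helper name.
ap_device_names() {
    for apid in $1; do
        for apqi in $2; do
            printf '%02x.%04x\n' "$apid" "$apqi"
        done
    done
    for apid in $1; do
        printf 'card%02x\n' "$apid"
    done
}

ap_device_names "5 6" "4 71 171 255"
```

The output lists 05.0004 through 06.00ff followed by card05 and card06, in the
same order as the listing above.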
 178
A set of default device drivers is also created to control each type of AP
device that can be assigned to the LPAR on which a Linux host is running:
 181
 182/sys/bus/ap
 183... [drivers]
 184...... [cex2acard]        for Crypto Express 2/3 accelerator cards
 185...... [cex2aqueue]       for AP queues served by Crypto Express 2/3
 186                          accelerator cards
 187...... [cex4card]         for Crypto Express 4/5/6 accelerator and coprocessor
 188                          cards
 189...... [cex4queue]        for AP queues served by Crypto Express 4/5/6
 190                          accelerator and coprocessor cards
 191...... [pcixcccard]       for Crypto Express 2/3 coprocessor cards
 192...... [pcixccqueue]      for AP queues served by Crypto Express 2/3
 193                          coprocessor cards
 194
 195Binding AP devices to device drivers
 196------------------------------------
 197There are two sysfs files that specify bitmasks marking a subset of the APQN
 198range as 'usable by the default AP queue device drivers' or 'not usable by the
 199default device drivers' and thus available for use by the alternate device
 200driver(s). The sysfs locations of the masks are:
 201
 202   /sys/bus/ap/apmask
 203   /sys/bus/ap/aqmask
 204
 205   The 'apmask' is a 256-bit mask that identifies a set of AP adapter IDs
 206   (APID). Each bit in the mask, from left to right (i.e., from most significant
 207   to least significant bit in big endian order), corresponds to an APID from
 208   0-255. If a bit is set, the APID is marked as usable only by the default AP
 209   queue device drivers; otherwise, the APID is usable by the vfio_ap
 210   device driver.
 211
 212   The 'aqmask' is a 256-bit mask that identifies a set of AP queue indexes
 213   (APQI). Each bit in the mask, from left to right (i.e., from most significant
 214   to least significant bit in big endian order), corresponds to an APQI from
 215   0-255. If a bit is set, the APQI is marked as usable only by the default AP
 216   queue device drivers; otherwise, the APQI is usable by the vfio_ap device
 217   driver.
 218
 219   Take, for example, the following mask:
 220
 221      0x7dffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
 222
   It indicates:

      IDs 1, 2, 3, 4, 5, and 7-255 belong to the default drivers' pool, and
      IDs 0 and 6 belong to the vfio_ap device driver's pool.
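
To see how the example mask maps to ID numbers, the hex string can be walked
digit by digit, counting bits from the left. The sketch below is an
illustration only; mask_clear_bits is a hypothetical helper that prints the
IDs whose bit is cleared, i.e. the IDs left to the vfio_ap device driver:

```shell
# List the bit numbers (APIDs or APQIs) that are cleared in a mask.
# Bit 0 is the leftmost (most significant) bit.
# mask_clear_bits is a hypothetical helper; it expects a full-length
# mask such as the value shown in the text.
mask_clear_bits() {
    hex=${1#0x}
    bit=0
    while [ -n "$hex" ]; do
        rest=${hex#?}            # everything after the first hex digit
        digit=${hex%"$rest"}     # the first hex digit itself
        hex=$rest
        for weight in 8 4 2 1; do
            if [ $(( 0x$digit & weight )) -eq 0 ]; then
                printf '%d\n' "$bit"
            fi
            bit=$((bit + 1))
        done
    done
}

mask_clear_bits 0x7dffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
```

For the mask above this prints 0 and 6, matching the description.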
 227
   The APQN of each AP queue device assigned to the Linux host is checked by the
   AP bus against the set of APQNs derived from the cross product of APIDs
   and APQIs marked as usable only by the default AP queue device drivers. If a
   match is detected, only the default AP queue device drivers will be probed;
   otherwise, the vfio_ap device driver will be probed.
 233
 234   By default, the two masks are set to reserve all APQNs for use by the default
 235   AP queue device drivers. There are two ways the default masks can be changed:
 236
 237   1. The sysfs mask files can be edited by echoing a string into the
 238      respective sysfs mask file in one of two formats:
 239
 240      * An absolute hex string starting with 0x - like "0x12345678" - sets
 241        the mask. If the given string is shorter than the mask, it is padded
 242        with 0s on the right; for example, specifying a mask value of 0x41 is
 243        the same as specifying:
 244
 245           0x4100000000000000000000000000000000000000000000000000000000000000
 246
 247        Keep in mind that the mask reads from left to right (i.e., most
 248        significant to least significant bit in big endian order), so the mask
 249        above identifies device numbers 1 and 7 (01000001).
 250
 251        If the string is longer than the mask, the operation is terminated with
 252        an error (EINVAL).
 253
      * Individual bits in the mask can be switched on and off by specifying
        each bit number to be switched in a comma separated list. Each bit
        number string must be prefixed with a plus ('+') or minus ('-') to
        indicate that the corresponding bit is to be switched on ('+') or off
        ('-'). Some valid values are:
 259
 260           "+0"    switches bit 0 on
 261           "-13"   switches bit 13 off
 262           "+0x41" switches bit 65 on
 263           "-0xff" switches bit 255 off
 264
 265           The following example:
 266              +0,-6,+0x47,-0xf0
 267
 268              Switches bits 0 and 71 (0x47) on
 269              Switches bits 6 and 240 (0xf0) off
 270
 271        Note that the bits not specified in the list remain as they were before
 272        the operation.
 273
 274   2. The masks can also be changed at boot time via parameters on the kernel
 275      command line like this:
 276
 277         ap.apmask=0xffff ap.aqmask=0x40
 278
 279         This would create the following masks:
 280
 281            apmask:
 282            0xffff000000000000000000000000000000000000000000000000000000000000
 283
 284            aqmask:
 285            0x4000000000000000000000000000000000000000000000000000000000000000
 286
 287         Resulting in these two pools:
 288
 289            default drivers pool:    adapter 0-15, domain 1
 290            alternate drivers pool:  adapter 16-255, domains 0, 2-255
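
The two input formats can be tried out without touching sysfs. The sketch
below uses two hypothetical helpers: pad_mask mimics the right-padding of an
absolute hex string as described above, and mask_ops_summary spells out what
a relative '+'/'-' list would switch. Neither is part of the AP bus; they only
model the behaviour described in this section:

```shell
# Right-pad a hex mask with zeros to the full 64 hex digits (256 bits);
# fail if the string is longer than the mask.
# pad_mask is a hypothetical helper name.
pad_mask() {
    hex=${1#0x}
    if [ "${#hex}" -gt 64 ]; then
        echo "mask longer than 256 bits" >&2
        return 1
    fi
    while [ "${#hex}" -lt 64 ]; do
        hex="${hex}0"
    done
    printf '0x%s\n' "$hex"
}

# Describe each entry of a comma separated '+'/'-' list.
# mask_ops_summary is a hypothetical helper name.
mask_ops_summary() {
    old_ifs=$IFS
    IFS=,
    set -- $1                  # split the list on commas
    IFS=$old_ifs
    for op in "$@"; do
        case $op in
            +*) printf 'bit %d on\n' "$(( ${op#+} ))" ;;
            -*) printf 'bit %d off\n' "$(( ${op#-} ))" ;;
            *)  echo "missing +/- prefix: $op" >&2; return 1 ;;
        esac
    done
}

pad_mask 0x41                          # the padded form of 0x41
pad_mask 0xffff                        # the apmask from the boot example
mask_ops_summary "+0,-6,+0x47,-0xf0"   # bits 0 and 71 on, 6 and 240 off
```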
 291
Configuring an AP matrix for a Linux guest
------------------------------------------
The sysfs interfaces for configuring an AP matrix for a guest are built on the
VFIO mediated device framework. To configure an AP matrix for a guest, a
mediated matrix device must first be created for the /sys/devices/vfio_ap/matrix
device. When the vfio_ap device driver is loaded, it registers with the VFIO
mediated device framework. When the driver registers, the sysfs interfaces for
creating mediated matrix devices are created:
 300
 301/sys/devices
 302... [vfio_ap]
 303......[matrix]
 304......... [mdev_supported_types]
 305............ [vfio_ap-passthrough]
 306............... create
 307............... [devices]
 308
 309A mediated AP matrix device is created by writing a UUID to the attribute file
 310named 'create', for example:
 311
 312   uuidgen > create
 313
 314   or
 315
 316   echo $uuid > create
 317
 318When a mediated AP matrix device is created, a sysfs directory named after
 319the UUID is created in the 'devices' subdirectory:
 320
 321/sys/devices
 322... [vfio_ap]
 323......[matrix]
 324......... [mdev_supported_types]
 325............ [vfio_ap-passthrough]
 326............... create
 327............... [devices]
 328.................. [$uuid]
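
The creation step can be wrapped in a small function. This is a minimal
sketch: create_matrix_mdev is a hypothetical helper, and the MATRIX variable
is an assumption added so the function can be tried against a scratch
directory instead of the real sysfs tree:

```shell
# MATRIX is an assumed override of the matrix device directory;
# create_matrix_mdev is a hypothetical helper name.
MATRIX=${MATRIX:-/sys/devices/vfio_ap/matrix}

# Write the UUID to 'create' and print the directory in which the new
# mediated matrix device is expected to appear.
create_matrix_mdev() {
    type_dir=$MATRIX/mdev_supported_types/vfio_ap-passthrough
    echo "$1" > "$type_dir/create" || return 1
    printf '%s\n' "$type_dir/devices/$1"
}

# Typical use on the host (needs root and a loaded vfio_ap module):
#   create_matrix_mdev "$(uuidgen)"
```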
 329
 330There will also be three sets of attribute files created in the mediated
 331matrix device's sysfs directory to configure an AP matrix for the
 332KVM guest:
 333
 334/sys/devices
 335... [vfio_ap]
 336......[matrix]
 337......... [mdev_supported_types]
 338............ [vfio_ap-passthrough]
 339............... create
 340............... [devices]
 341.................. [$uuid]
 342..................... assign_adapter
 343..................... assign_control_domain
 344..................... assign_domain
 345..................... matrix
 346..................... unassign_adapter
 347..................... unassign_control_domain
 348..................... unassign_domain
 349
 350assign_adapter
 351   To assign an AP adapter to the mediated matrix device, its APID is written
 352   to the 'assign_adapter' file. This may be done multiple times to assign more
 353   than one adapter. The APID may be specified using conventional semantics
 354   as a decimal, hexadecimal, or octal number. For example, to assign adapters
 355   4, 5 and 16 to a mediated matrix device in decimal, hexadecimal and octal
 356   respectively:
 357
 358       echo 4 > assign_adapter
 359       echo 0x5 > assign_adapter
 360       echo 020 > assign_adapter
 361
 362   In order to successfully assign an adapter:
 363
 364   * The adapter number specified must represent a value from 0 up to the
 365     maximum adapter number allowed by the machine model. If an adapter number
 366     higher than the maximum is specified, the operation will terminate with
 367     an error (ENODEV).
 368
 369   * All APQNs that can be derived from the adapter ID being assigned and the
 370     IDs of the previously assigned domains must be bound to the vfio_ap device
 371     driver. If no domains have yet been assigned, then there must be at least
 372     one APQN with the specified APID bound to the vfio_ap driver. If no such
 373     APQNs are bound to the driver, the operation will terminate with an
 374     error (EADDRNOTAVAIL).
 375
 376     No APQN that can be derived from the adapter ID and the IDs of the
 377     previously assigned domains can be assigned to another mediated matrix
 378     device. If an APQN is assigned to another mediated matrix device, the
 379     operation will terminate with an error (EADDRINUSE).
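
Before writing to 'assign_adapter', it can be useful to predict whether the
binding condition above is met. The sketch below models only that rule;
missing_queues is a hypothetical helper, and the list of bound queues is
passed in by the caller rather than read from sysfs:

```shell
# Print the queues that would block assigning an adapter: every pairing
# of the new adapter with an already-assigned domain must appear in the
# list of queues bound to vfio_ap.
#   $1  adapter number being assigned
#   $2  space-separated domains already assigned to the mediated device
#   $3  space-separated queue names bound to vfio_ap (xx.yyyy)
# missing_queues is a hypothetical helper name.
missing_queues() {
    for apqi in $2; do
        name=$(printf '%02x.%04x' "$1" "$apqi")
        found=no
        for bound in $3; do
            [ "$bound" = "$name" ] && found=yes
        done
        [ "$found" = yes ] || printf '%s\n' "$name"
    done
}

# Adapter 5 with domains 4 and 71 already assigned: 05.0047 is not bound.
missing_queues 5 "4 71" "05.0004 06.0004 06.0047"
```

An empty result means the condition is satisfied; any queue printed would be
expected to cause the assignment to fail.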
 380
 381unassign_adapter
 382   To unassign an AP adapter, its APID is written to the 'unassign_adapter'
 383   file. This may also be done multiple times to unassign more than one adapter.
 384
 385assign_domain
 386   To assign a usage domain, the domain number is written into the
 387   'assign_domain' file. This may be done multiple times to assign more than one
 388   usage domain. The domain number is specified using conventional semantics as
 389   a decimal, hexadecimal, or octal number. For example, to assign usage domains
 390   4, 8, and 71 to a mediated matrix device in decimal, hexadecimal and octal
 391   respectively:
 392
 393      echo 4 > assign_domain
 394      echo 0x8 > assign_domain
 395      echo 0107 > assign_domain
 396
 397   In order to successfully assign a domain:
 398
 399   * The domain number specified must represent a value from 0 up to the
 400     maximum domain number allowed by the machine model. If a domain number
 401     higher than the maximum is specified, the operation will terminate with
 402     an error (ENODEV).
 403
   * All APQNs that can be derived from the domain ID being assigned and the IDs
     of the previously assigned adapters must be bound to the vfio_ap device
     driver. If no adapters have yet been assigned, then there must be at least
     one APQN with the specified APQI bound to the vfio_ap driver. If no such
     APQNs are bound to the driver, the operation will terminate with an
     error (EADDRNOTAVAIL).
 410
 411     No APQN that can be derived from the domain ID being assigned and the IDs
 412     of the previously assigned adapters can be assigned to another mediated
 413     matrix device. If an APQN is assigned to another mediated matrix device,
 414     the operation will terminate with an error (EADDRINUSE).
 415
 416unassign_domain
 417   To unassign a usage domain, the domain number is written into the
 418   'unassign_domain' file. This may be done multiple times to unassign more than
 419   one usage domain.
 420
 421assign_control_domain
   To assign a control domain, the domain number is written into the
   'assign_control_domain' file. This may be done multiple times to
   assign more than one control domain. The domain number may be specified using
   conventional semantics as a decimal, hexadecimal, or octal number. For
   example, to assign control domains 4, 8, and 71 to a mediated matrix device
   in decimal, hexadecimal and octal respectively:

      echo 4 > assign_control_domain
      echo 0x8 > assign_control_domain
      echo 0107 > assign_control_domain
 432
 433   In order to successfully assign a control domain, the domain number
 434   specified must represent a value from 0 up to the maximum domain number
 435   allowed by the machine model. If a control domain number higher than the
 436   maximum is specified, the operation will terminate with an error (ENODEV).
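
The decimal, hexadecimal and octal notations used in the assign examples
follow the usual C-style prefixes (0x for hexadecimal, a leading 0 for octal).
Shell arithmetic uses the same prefixes, so a value can be double-checked
before it is written. This is only an illustration; to_decimal is a
hypothetical helper and says nothing about how the driver parses its input:

```shell
# Convert a number written in decimal, hex (0x...) or octal (0...)
# notation to decimal using shell arithmetic.
# to_decimal is a hypothetical helper name.
to_decimal() {
    printf '%d\n' "$(($1))"
}

to_decimal 4      # decimal 4
to_decimal 0x5    # hexadecimal, prints 5
to_decimal 020    # octal, prints 16
to_decimal 0107   # octal, prints 71
```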
 437
 438unassign_control_domain
   To unassign a control domain, the domain number is written into the
   'unassign_control_domain' file. This may be done multiple times to unassign
   more than one control domain.
 442
Note: No changes to the AP matrix will be allowed while a guest using
the mediated matrix device is running. Attempts to assign an adapter,
domain or control domain will be rejected and an error (EBUSY) returned.
 446
 447Starting a Linux Guest Configured with an AP Matrix:
 448===================================================
 449To provide a mediated matrix device for use by a guest, the following option
 450must be specified on the QEMU command line:
 451
   -device vfio-ap,sysfsdev=$path-to-mdev
 453
 454The sysfsdev parameter specifies the path to the mediated matrix device.
 455There are a number of ways to specify this path:
 456
 457/sys/devices/vfio_ap/matrix/$uuid
 458/sys/bus/mdev/devices/$uuid
 459/sys/bus/mdev/drivers/vfio_mdev/$uuid
 460/sys/devices/vfio_ap/matrix/mdev_supported_types/vfio_ap-passthrough/devices/$uuid
 461
When the Linux guest is started, the guest will open the mediated
matrix device's file descriptor to get information about the mediated matrix
device. The vfio_ap device driver will update the APM, AQM, and ADM fields in
the guest's CRYCB with the adapter, usage domain and control domains assigned
via the mediated matrix device's sysfs attribute files. Programs running on the
Linux guest will then:
 468
1. Have direct access to the APQNs derived from the cross product of the AP
   adapter numbers (APID) and queue indexes (APQI) specified in the APM and AQM
   fields of the guest's CRYCB respectively. These APQNs identify the AP queues
   that are valid for use by the guest; that is, AP commands can be sent by the
   guest to any of these queues for processing.
 474
 4752. Have authorization to process AP commands to change a control domain
 476   identified in the ADM field of the guest's CRYCB. The AP command must be sent
 477   to a valid APQN (see 1 above).
 478
CPU model features
------------------
 480
 481Three CPU model features are available for controlling guest access to AP
 482facilities:
 483
 4841. AP facilities feature
 485
 486   The AP facilities feature indicates that AP facilities are installed on the
 487   guest. This feature will be exposed for use only if the AP facilities
 488   are installed on the host system. The feature is s390-specific and is
 489   represented as a parameter of the -cpu option on the QEMU command line:
 490
 491      qemu-system-s390x -cpu $model,ap=on|off
 492
 493      Where:
 494
 495         $model is the CPU model defined for the guest (defaults to the model of
 496                the host system if not specified).
 497
 498         ap=on|off indicates whether AP facilities are installed (on) or not
 499                   (off). The default for CPU models zEC12 or newer
 500                   is ap=on. AP facilities must be installed on the guest if a
 501                   vfio-ap device (-device vfio-ap,sysfsdev=$path) is configured
 502                   for the guest, or the guest will fail to start.
 503
 5042. Query Configuration Information (QCI) facility
 505
 506   The QCI facility is used by the AP bus running on the guest to query the
 507   configuration of the AP facilities. This facility will be available
 508   only if the QCI facility is installed on the host system. The feature is
 509   s390-specific and is represented as a parameter of the -cpu option on the
 510   QEMU command line:
 511
 512      qemu-system-s390x -cpu $model,apqci=on|off
 513
 514      Where:
 515
 516         $model is the CPU model defined for the guest
 517
 518         apqci=on|off indicates whether the QCI facility is installed (on) or
 519                      not (off). The default for CPU models zEC12 or newer
 520                      is apqci=on; for older models, QCI will not be installed.
 521
 522                      If QCI is installed (apqci=on) but AP facilities are not
 523                      (ap=off), an error message will be logged, but the guest
 524                      will be allowed to start. It makes no sense to have QCI
 525                      installed if the AP facilities are not; this is considered
 526                      an invalid configuration.
 527
 528                      If the QCI facility is not installed, APQNs with an APQI
 529                      greater than 15 will not be detected by the AP bus
 530                      running on the guest.
 531
3. Adjunct Processor Facilities Test (APFT) facility
 533
 534   The APFT facility is used by the AP bus running on the guest to test the
 535   AP facilities available for a given AP queue. This facility will be available
 536   only if the APFT facility is installed on the host system. The feature is
 537   s390-specific and is represented as a parameter of the -cpu option on the
 538   QEMU command line:
 539
 540      qemu-system-s390x -cpu $model,apft=on|off
 541
 542      Where:
 543
 544         $model is the CPU model defined for the guest (defaults to the model of
 545                the host system if not specified).
 546
         apft=on|off indicates whether the APFT facility is installed (on) or
                     not (off). The default for CPU models zEC12 and
                     newer is apft=on; for older models, APFT will not be
                     installed.
 551
 552                     If APFT is installed (apft=on) but AP facilities are not
 553                     (ap=off), an error message will be logged, but the guest
 554                     will be allowed to start. It makes no sense to have APFT
 555                     installed if the AP facilities are not; this is considered
 556                     an invalid configuration.
 557
 558                     It also makes no sense to turn APFT off because the AP bus
 559                     running on the guest will not detect CEX4 and newer devices
 560                     without it. Since only CEX4 and newer devices are supported
 561                     for guest usage, no AP devices can be made accessible to a
 562                     guest started without APFT installed.
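
Putting the device option and the three CPU model features together, the
AP-related part of a QEMU command line could be assembled as sketched below.
The helper vfio_ap_args is hypothetical, the CPU model "host" and the UUID are
placeholder assumptions, and the function only prints the arguments rather
than starting QEMU:

```shell
# Print the AP-related QEMU arguments for one guest.
#   $1  CPU model for the guest (placeholder, e.g. "host")
#   $2  UUID of the mediated matrix device
# vfio_ap_args is a hypothetical helper name.
vfio_ap_args() {
    model=$1
    uuid=$2
    printf '%s\n' "-cpu $model,ap=on,apqci=on,apft=on -device vfio-ap,sysfsdev=/sys/bus/mdev/devices/$uuid"
}

# The UUID below is a made-up placeholder.
vfio_ap_args host 11111111-2222-3333-4444-555555555555
```

The printed string would be appended to the rest of the qemu-system-s390x
command line for the guest.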
 563
 564Hot plug a vfio-ap device into a running guest:
 565==============================================
 566Only one vfio-ap device can be attached to the virtual machine's ap-bus, so a
 567vfio-ap device can be hot plugged if and only if no vfio-ap device is attached
 568to the bus already, whether via the QEMU command line or a prior hot plug
 569action.
 570
 571To hot plug a vfio-ap device, use the QEMU device_add command:
 572
 573    (qemu) device_add vfio-ap,sysfsdev="$path-to-mdev"
 574
 575    Where the '$path-to-mdev' value specifies the absolute path to a mediated
 576    device to which AP resources to be used by the guest have been assigned.
 577
 578Note that on Linux guests, the AP devices will be created in the
 579/sys/bus/ap/devices directory when the AP bus subsequently performs its periodic
 580scan, so there may be a short delay before the AP devices are accessible on the
 581guest.
 582
 583The command will fail if:
 584
 585* A vfio-ap device has already been attached to the virtual machine's ap-bus.
 586
 587* The CPU model features for controlling guest access to AP facilities are not
 588  enabled (see 'CPU model features' subsection in the previous section).
 589
 590Hot unplug a vfio-ap device from a running guest:
 591================================================
 592A vfio-ap device can be unplugged from a running KVM guest if a vfio-ap device
 593has been attached to the virtual machine's ap-bus via the QEMU command line
 594or a prior hot plug action.
 595
To hot unplug a vfio-ap device, use the QEMU device_del command:

    (qemu) device_del $id

    Where $id is the device ID that was assigned to the vfio-ap device when it
    was attached to the virtual machine's ap-bus (for example, via an id=
    property on the -device option or the device_add command).

On a Linux guest, the AP devices will be removed from the /sys/bus/ap/devices
directory on the guest when the AP bus subsequently performs its periodic scan,
so there may be a short delay before the AP devices are no longer accessible by
the guest.

The command will fail if the $id specified on the device_del command
does not match the ID of the vfio-ap device attached to the virtual machine's
ap-bus.
 611
Example: Configure AP Matrices for Three Linux Guests:
=====================================================
 614Let's now provide an example to illustrate how KVM guests may be given
 615access to AP facilities. For this example, we will show how to configure
 616three guests such that executing the lszcrypt command on the guests would
 617look like this:
 618
 619Guest1
 620------
CARD.DOMAIN TYPE  MODE
------------------------------
05          CEX5C CCA-Coproc
05.0004     CEX5C CCA-Coproc
05.00ab     CEX5C CCA-Coproc
06          CEX5A Accelerator
06.0004     CEX5A Accelerator
06.00ab     CEX5A Accelerator
 629
 630Guest2
 631------
CARD.DOMAIN TYPE  MODE
------------------------------
05          CEX5A Accelerator
05.0047     CEX5A Accelerator
05.00ff     CEX5A Accelerator
 637
 638Guest3
 639------
 640CARD.DOMAIN TYPE  MODE
 641------------------------------
 64206          CEX5A Accelerator
 64306.0047     CEX5A Accelerator
 64406.00ff     CEX5A Accelerator
 645
 646These are the steps:
 647
1. Install the vfio_ap module on the Linux host. The dependency chain for the
   vfio_ap module is:
 650   * iommu
 651   * s390
 652   * zcrypt
 653   * vfio
 654   * vfio_mdev
 655   * vfio_mdev_device
 656   * KVM
 657
 658   To build the vfio_ap module, the kernel build must be configured with the
 659   following Kconfig elements selected:
 660   * IOMMU_SUPPORT
 661   * S390
 662   * ZCRYPT
 663   * S390_AP_IOMMU
 664   * VFIO
 665   * VFIO_MDEV
 666   * VFIO_MDEV_DEVICE
 667   * KVM
 668
   If using make menuconfig, select the following to build the vfio_ap module:
 670   -> Device Drivers
 671      -> IOMMU Hardware Support
 672         select S390 AP IOMMU Support
 673      -> VFIO Non-Privileged userspace driver framework
 674         -> Mediated device driver framework
 675            -> VFIO driver for Mediated devices
 676   -> I/O subsystem
 677      -> VFIO support for AP devices
 678
2. Secure the AP queues to be used by the three guests so that the host cannot
   access them. To secure the AP queues 05.0004, 05.0047, 05.00ab, 05.00ff,
   06.0004, 06.0047, 06.00ab, and 06.00ff for use by the vfio_ap device driver,
   the corresponding APQNs must be removed from the default queue drivers pool
   as follows:
 684
 685      echo -5,-6 > /sys/bus/ap/apmask
 686
 687      echo -4,-0x47,-0xab,-0xff > /sys/bus/ap/aqmask
 688
 689   This will result in AP queues 05.0004, 05.0047, 05.00ab, 05.00ff, 06.0004,
 690   06.0047, 06.00ab, and 06.00ff getting bound to the vfio_ap device driver. The
 691   sysfs directory for the vfio_ap device driver will now contain symbolic links
 692   to the AP queue devices bound to it:
 693
 694   /sys/bus/ap
 695   ... [drivers]
 696   ...... [vfio_ap]
 697   ......... [05.0004]
 698   ......... [05.0047]
 699   ......... [05.00ab]
 700   ......... [05.00ff]
 701   ......... [06.0004]
 702   ......... [06.0047]
 703   ......... [06.00ab]
 704   ......... [06.00ff]
 705
   Keep in mind that only type 10 and newer adapters (i.e., CEX4 and later)
   can be bound to the vfio_ap device driver. This restriction keeps the
   implementation simple: it avoids supporting older devices that will go out
   of service in the relatively near future and for which few systems remain
   available for testing.

   The administrator, therefore, must take care to secure only AP queues that
   can be bound to the vfio_ap device driver. The device type for a given AP
   queue device can be read from the parent card's sysfs directory. For example,
   to see the hardware type of the queue 05.0004:

      cat /sys/bus/ap/devices/card05/hwtype

   The hwtype must be 10 or higher (CEX4 or newer) in order to be bound to the
   vfio_ap device driver.
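
   The hwtype check can be scripted before any queue is secured. This is a
   minimal sketch: the function name is hypothetical, and it assumes the
   hwtype attribute contains a single decimal integer.

```shell
# Succeed (exit status 0) if the hardware type stored in the given hwtype
# attribute is 10 or higher, i.e. the card can be bound to vfio_ap.
# $1: path to a hwtype attribute, e.g. /sys/bus/ap/devices/card05/hwtype
hwtype_supported() {
    hwtype=$(cat "$1") || return 2
    [ "$hwtype" -ge 10 ]
}
```

   For example:

      hwtype_supported /sys/bus/ap/devices/card05/hwtype || echo "too old"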

3. Create the mediated devices needed to configure the AP matrices for the
   three guests and to provide an interface to the vfio_ap driver for
   use by the guests:

   /sys/devices/vfio_ap/matrix/
   --- [mdev_supported_types]
   ------ [vfio_ap-passthrough] (passthrough mediated matrix device type)
   --------- create
   --------- [devices]

   To create the mediated devices for the three guests:

      uuidgen > create
      uuidgen > create
      uuidgen > create

   or

      echo $uuid1 > create
      echo $uuid2 > create
      echo $uuid3 > create
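
   When the UUIDs are needed later, for instance on the QEMU command line, it
   helps to generate them first and then write them to the create attribute.
   The following sketch does that for any number of devices. The function name
   is hypothetical; the path to the create attribute is passed in rather than
   assumed.

```shell
# Create one mediated matrix device per UUID and echo each UUID back so the
# caller can record it.
# $1: path to the 'create' attribute of the vfio_ap-passthrough type
# remaining arguments: the UUIDs to create
create_matrix_mdevs() {
    create=$1
    shift
    for uuid in "$@"; do
        # Stop at the first failed write so no later device is created.
        echo "$uuid" > "$create" || return 1
        echo "$uuid"
    done
}
```

   For example, from the vfio_ap-passthrough directory shown above:

      create_matrix_mdevs create "$(uuidgen)" "$(uuidgen)" "$(uuidgen)"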

   This will create three mediated devices in the [devices] subdirectory, each
   named after the UUID used to create it. We'll call them $uuid1, $uuid2 and
   $uuid3; this is the sysfs directory structure after creation:

   /sys/devices/vfio_ap/matrix/
   --- [mdev_supported_types]
   ------ [vfio_ap-passthrough]
   --------- [devices]
   ------------ [$uuid1]
   --------------- assign_adapter
   --------------- assign_control_domain
   --------------- assign_domain
   --------------- matrix
   --------------- unassign_adapter
   --------------- unassign_control_domain
   --------------- unassign_domain

   ------------ [$uuid2]
   --------------- assign_adapter
   --------------- assign_control_domain
   --------------- assign_domain
   --------------- matrix
   --------------- unassign_adapter
   --------------- unassign_control_domain
   --------------- unassign_domain

   ------------ [$uuid3]
   --------------- assign_adapter
   --------------- assign_control_domain
   --------------- assign_domain
   --------------- matrix
   --------------- unassign_adapter
   --------------- unassign_control_domain
   --------------- unassign_domain

4. The administrator now needs to configure the matrices for the mediated
   devices $uuid1 (for Guest1), $uuid2 (for Guest2) and $uuid3 (for Guest3).

   This is how the matrix is configured for Guest1:

      echo 5 > assign_adapter
      echo 6 > assign_adapter
      echo 4 > assign_domain
      echo 0xab > assign_domain

      Control domains can similarly be assigned using the assign_control_domain
      sysfs file.

      If a mistake is made configuring an adapter, domain or control domain,
      you can use the unassign_xxx interfaces to unassign the adapter, domain or
      control domain.

      To display the matrix configuration for Guest1:

         cat matrix

         The output will display the APQNs in the format xx.yyyy, where xx is
         the adapter number and yyyy is the domain number. The output for Guest1
         will look like this:

         05.0004
         05.00ab
         06.0004
         06.00ab

   This is how the matrix is configured for Guest2:

      echo 5 > assign_adapter
      echo 0x47 > assign_domain
      echo 0xff > assign_domain

   This is how the matrix is configured for Guest3:

      echo 6 > assign_adapter
      echo 0x47 > assign_domain
      echo 0xff > assign_domain
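
   The three configurations differ only in their adapter and domain lists, so
   the writes can be wrapped in one helper. This is a sketch under stated
   assumptions: the function name is hypothetical, and the first argument is
   the sysfs directory of the mediated device, i.e. the [$uuidN] directory
   that holds the assign_adapter and assign_domain attributes.

```shell
# Assign a list of adapters and a list of domains to a mediated matrix device.
# $1: sysfs directory of the mediated device
# $2: space-separated adapter numbers
# $3: space-separated domain numbers
configure_matrix() {
    for adapter in $2; do
        echo "$adapter" > "$1/assign_adapter" || return 1
    done
    for domain in $3; do
        echo "$domain" > "$1/assign_domain" || return 1
    done
}
```

   With $mdev1, $mdev2 and $mdev3 set to the three device directories, the
   configuration above becomes:

      configure_matrix "$mdev1" "5 6" "4 0xab"
      configure_matrix "$mdev2" "5" "0x47 0xff"
      configure_matrix "$mdev3" "6" "0x47 0xff"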

5. Start Guest1:

   /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on \
      -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid1 ...

6. Start Guest2:

   /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on \
      -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid2 ...

7. Start Guest3:

   /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on \
      -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid3 ...
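
   The three command lines differ only in the UUID, so the AP-related options
   can be produced by a small helper. This is a sketch; the function name is
   hypothetical and the options are exactly those shown above.

```shell
# Print the AP-related QEMU options for one mediated matrix device.
# $1: UUID of the mediated matrix device
vfio_ap_qemu_args() {
    printf '%s %s\n' \
        "-cpu host,ap=on,apqci=on,apft=on" \
        "-device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$1"
}
```

   The output is meant to be expanded unquoted so that the shell splits it
   into separate arguments:

      /usr/bin/qemu-system-s390x ... $(vfio_ap_qemu_args "$uuid1") ...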

When a guest is shut down, its mediated matrix device may be removed.

Using our example again, to remove the mediated matrix device $uuid1:

   /sys/devices/vfio_ap/matrix/
   --- [mdev_supported_types]
   ------ [vfio_ap-passthrough]
   --------- [devices]
   ------------ [$uuid1]
   --------------- remove

      echo 1 > remove

   This will remove all of the mdev matrix device's sysfs structures including
   the mdev device itself. To recreate and reconfigure the mdev matrix device,
   all of the steps starting with step 3 will have to be performed again. Note
   that the remove will fail if a guest using the mdev is still running.
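
   Because the write to remove can fail, a script should check its result
   rather than assume the device is gone. A minimal sketch follows; the
   function name and the messages are hypothetical.

```shell
# Remove a mediated matrix device and report whether the write succeeded.
# $1: sysfs directory of the mediated device (the [$uuid] directory)
remove_matrix_mdev() {
    if echo 1 > "$1/remove"; then
        echo "removed $1"
    else
        echo "could not remove $1 (is a guest still using it?)" >&2
        return 1
    fi
}
```

   A non-zero return value leaves the decision to the caller, for example to
   shut the guest down and retry.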

   It is not necessary to remove an mdev matrix device, but one may want to
   remove it if no guest will use it during the remaining lifetime of the linux
   host. If the mdev matrix device is removed, one may want to also reconfigure
   the pool of adapters and queues reserved for use by the default drivers.

Limitations
===========
* The KVM/kernel interfaces do not provide a way to prevent the APQN of a queue
  that is still assigned to a mediated device in use by a guest from being
  restored to the default drivers pool. It is incumbent upon the administrator
  to ensure that no mediated device to which the APQN is assigned is in use by
  a guest, lest the host be given access to the private data of the AP queue
  device, such as a private key configured specifically for the guest.

* Dynamically assigning AP resources to or unassigning AP resources from a
  mediated matrix device (see the 'Configuring an AP matrix for a linux guest'
  section above) while a running guest is using it is currently not supported.

* Live guest migration is not supported for guests using AP devices. If a guest
  is using AP devices, the vfio-ap device configured for the guest must be
  unplugged before migrating the guest (see the 'Hot unplug a vfio-ap device
  from a running guest' section above).
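
Since the first limitation leaves the check to the administrator, a script can
at least list the mediated devices that still have a given APQN assigned before
that APQN is restored to the default drivers pool. This is a sketch: the
function name is hypothetical, and it relies only on the matrix attribute
format described earlier (one xx.yyyy entry per line). It reports assignment
only; whether a guest is currently using a listed device must still be checked
separately.

```shell
# Print the names of the mediated devices whose matrix contains an APQN.
# $1: APQN in xx.yyyy form, e.g. 05.0004
# $2: the [devices] directory that holds the mediated device directories
mdevs_with_apqn() {
    for matrix in "$2"/*/matrix; do
        # Skip the unexpanded glob when the directory holds no devices.
        [ -f "$matrix" ] || continue
        # -x matches whole lines only, -F treats the dot literally.
        if grep -qxF "$1" "$matrix"; then
            basename "$(dirname "$matrix")"
        fi
    done
}
```

Empty output means no mediated device under that directory has the APQN
assigned.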