linux/Documentation/admin-guide/cgroup-v1/hugetlb.rst
==================
HugeTLB Controller
==================

The HugeTLB controller can be created by first mounting the cgroup filesystem::

  # mount -t cgroup -o hugetlb none /sys/fs/cgroup

With the above step, the initial or the parent HugeTLB group becomes
visible at /sys/fs/cgroup. At bootup, this group includes all the tasks in
the system. /sys/fs/cgroup/tasks lists the tasks in this cgroup.

New groups can be created under the parent group /sys/fs/cgroup::

  # cd /sys/fs/cgroup
  # mkdir g1
  # echo $$ > g1/tasks

The above steps create a new group g1 and move the current shell
process (bash) into it.

Brief summary of control files::

 hugetlb.<hugepagesize>.rsvd.limit_in_bytes            # set/show limit of "hugepagesize" hugetlb reservations
 hugetlb.<hugepagesize>.rsvd.max_usage_in_bytes        # show max "hugepagesize" hugetlb reservations and no-reserve faults
 hugetlb.<hugepagesize>.rsvd.usage_in_bytes            # show current reservations and no-reserve faults for "hugepagesize" hugetlb
 hugetlb.<hugepagesize>.rsvd.failcnt                   # show the number of allocation failures due to HugeTLB reservation limit
 hugetlb.<hugepagesize>.limit_in_bytes                 # set/show limit of "hugepagesize" hugetlb faults
 hugetlb.<hugepagesize>.max_usage_in_bytes             # show max "hugepagesize" hugetlb usage recorded
 hugetlb.<hugepagesize>.usage_in_bytes                 # show current usage for "hugepagesize" hugetlb
 hugetlb.<hugepagesize>.failcnt                        # show the number of allocation failures due to HugeTLB usage limit
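
As a minimal sketch of how these files are used, the helper below writes a
fault limit and reads it back. The function name ``hugetlb_set_limit`` and its
arguments are illustrative only and are not part of the kernel interface; the
caller is assumed to pass the group directory and the hugepage size label as
it appears in the file names::

  # Hypothetical helper: set and read back the fault limit of one group.
  #   $1 = cgroup directory, e.g. /sys/fs/cgroup/g1
  #   $2 = hugepage size label as used in the file name, e.g. 32MB
  #   $3 = limit in bytes
  hugetlb_set_limit() {
      f="$1/hugetlb.$2.limit_in_bytes"
      # Refuse to continue if the control file is missing or not writable.
      [ -w "$f" ] || { echo "cannot write control file: $f" >&2; return 1; }
      echo "$3" > "$f" || return 1
      # Reading the file back shows the value that was actually stored.
      cat "$f"
  }

For instance, ``hugetlb_set_limit /sys/fs/cgroup/g1 32MB 67108864`` would
request a limit equal to two 32M pages for g1.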

For a system supporting three hugepage sizes (64k, 32M and 1G), the control
files include::

  hugetlb.1GB.limit_in_bytes
  hugetlb.1GB.max_usage_in_bytes
  hugetlb.1GB.usage_in_bytes
  hugetlb.1GB.failcnt
  hugetlb.1GB.rsvd.limit_in_bytes
  hugetlb.1GB.rsvd.max_usage_in_bytes
  hugetlb.1GB.rsvd.usage_in_bytes
  hugetlb.1GB.rsvd.failcnt
  hugetlb.64KB.limit_in_bytes
  hugetlb.64KB.max_usage_in_bytes
  hugetlb.64KB.usage_in_bytes
  hugetlb.64KB.failcnt
  hugetlb.64KB.rsvd.limit_in_bytes
  hugetlb.64KB.rsvd.max_usage_in_bytes
  hugetlb.64KB.rsvd.usage_in_bytes
  hugetlb.64KB.rsvd.failcnt
  hugetlb.32MB.limit_in_bytes
  hugetlb.32MB.max_usage_in_bytes
  hugetlb.32MB.usage_in_bytes
  hugetlb.32MB.failcnt
  hugetlb.32MB.rsvd.limit_in_bytes
  hugetlb.32MB.rsvd.max_usage_in_bytes
  hugetlb.32MB.rsvd.usage_in_bytes
  hugetlb.32MB.rsvd.failcnt
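
Since the size label is embedded in each file name, the sizes available on a
given system can be recovered from a directory listing. The following is a
sketch under that assumption; ``hugetlb_sizes`` is a hypothetical helper name,
not an existing tool::

  # Hypothetical helper: print the hugepage size labels exposed in a group.
  #   $1 = cgroup directory, e.g. /sys/fs/cgroup
  hugetlb_sizes() {
      for f in "$1"/hugetlb.*.usage_in_bytes; do
          [ -e "$f" ] || continue          # the glob matched nothing
          name=${f##*/}                    # strip the directory part
          name=${name#hugetlb.}            # strip the "hugetlb." prefix
          name=${name%.usage_in_bytes}     # strip the file name suffix
          case "$name" in
              *.rsvd) ;;                   # skip the rsvd variants
              *) echo "$name" ;;
          esac
      done
  }

On the three-size system described above, this would print the labels 1GB,
32MB and 64KB, one per line.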


1. Page fault accounting

hugetlb.<hugepagesize>.limit_in_bytes
hugetlb.<hugepagesize>.max_usage_in_bytes
hugetlb.<hugepagesize>.usage_in_bytes
hugetlb.<hugepagesize>.failcnt

The HugeTLB controller allows users to limit the HugeTLB usage (page fault) per
control group and enforces the limit during page fault. Since HugeTLB
doesn't support page reclaim, enforcing the limit at page fault time implies
that the application will get a SIGBUS signal if it tries to fault in HugeTLB
pages beyond its limit. Therefore the application needs to know exactly how many
HugeTLB pages it uses beforehand, and the sysadmin needs to make sure that
there are enough available on the machine for all the users to avoid processes
getting SIGBUS.
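
Because the limit files take bytes while an application usually counts pages,
a conversion step is needed when choosing a limit. The sketch below shows the
arithmetic; ``hugetlb_pages_to_bytes`` is a hypothetical helper, and it assumes
size labels of the form used in the file names above (a number followed by KB,
MB or GB)::

  # Hypothetical helper: convert a page count into a byte limit.
  #   $1 = hugepage size label, e.g. 32MB
  #   $2 = number of pages the application expects to fault in
  hugetlb_pages_to_bytes() {
      case "$1" in
          *KB) unit=1024 ;;
          *MB) unit=$((1024 * 1024)) ;;
          *GB) unit=$((1024 * 1024 * 1024)) ;;
          *) echo "unknown size label: $1" >&2; return 1 ;;
      esac
      num=${1%??}                          # drop the two-letter unit
      echo $((num * unit * $2))
  }

For example, two 32M pages correspond to 2 * 32 * 1024 * 1024 = 67108864
bytes.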


2. Reservation accounting

hugetlb.<hugepagesize>.rsvd.limit_in_bytes
hugetlb.<hugepagesize>.rsvd.max_usage_in_bytes
hugetlb.<hugepagesize>.rsvd.usage_in_bytes
hugetlb.<hugepagesize>.rsvd.failcnt

The HugeTLB controller allows users to limit the HugeTLB reservations per
control group and enforces the controller limit at reservation time and at the
fault of HugeTLB memory for which no reservation exists. Since reservation
limits are enforced at reservation time (on mmap or shmget), reservation limits
never cause the application to get a SIGBUS signal if the memory was reserved
beforehand. For MAP_NORESERVE allocations, the reservation limit behaves the
same as the fault limit, enforcing memory usage at fault time and causing the
application to receive a SIGBUS if it crosses its limit.

Reservation limits are superior to the page fault limits described above, since
reservation limits are enforced at reservation time (on mmap or shmget) and
never cause the application to get a SIGBUS signal if the memory was reserved
beforehand. This allows for easier fallback to alternatives such as
non-HugeTLB memory. In the case of page fault accounting, it's very hard to
avoid processes getting SIGBUS, since the sysadmin needs to know precisely the
HugeTLB usage of all the tasks in the system and make sure there are enough
pages to satisfy all requests. Avoiding tasks getting SIGBUS on overcommitted
systems is practically impossible with page fault accounting.
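
To illustrate how the reservation files relate to each other, the sketch below
computes the remaining reservation headroom of a group as limit minus current
usage. ``hugetlb_rsvd_headroom`` is a hypothetical helper name. The result is
advisory only, since usage can change between the read and a later
reservation; the mmap or shmget call itself remains the authoritative check::

  # Hypothetical helper: print remaining reservation headroom in bytes.
  #   $1 = cgroup directory, e.g. /sys/fs/cgroup/g1
  #   $2 = hugepage size label, e.g. 32MB
  hugetlb_rsvd_headroom() {
      limit=$(cat "$1/hugetlb.$2.rsvd.limit_in_bytes") || return 1
      usage=$(cat "$1/hugetlb.$2.rsvd.usage_in_bytes") || return 1
      echo $((limit - usage))
  }

A caller could compare the printed value with the size of the mapping it is
about to request and fall back to non-HugeTLB memory when it does not fit.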


3. Caveats with shared memory

For shared HugeTLB memory, both HugeTLB reservation and page faults are charged
to the first task that causes the memory to be reserved or faulted, and all
subsequent uses of this reserved or faulted memory are done without charging.

Shared HugeTLB memory is only uncharged when it is unreserved or deallocated.
This is usually when the HugeTLB file is deleted, and not when the task that
caused the reservation or fault has exited.


4. Caveats with HugeTLB cgroup offline

When a HugeTLB cgroup goes offline with some reservations or faults still
charged to it, the behavior is as follows:

- The fault charges are charged to the parent HugeTLB cgroup (reparented),
- the reservation charges remain on the offline HugeTLB cgroup.

This means that if a HugeTLB cgroup gets offlined while there are still HugeTLB
reservations charged to it, that cgroup persists as a zombie until all HugeTLB
reservations are uncharged. HugeTLB reservations behave in this manner to match
the memory controller, whose cgroups also persist as zombies until all charged
memory is uncharged. Also, the tracking of HugeTLB reservations is a bit more
complex compared to the tracking of HugeTLB faults, so it is significantly
harder to reparent reservations at offline time.
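
One way to avoid leaving such zombies behind is to look at the reservation
usage files before removing a group. The sketch below assumes the check is run
while the group directory still exists; ``hugetlb_has_rsvd_charges`` is a
hypothetical helper name::

  # Hypothetical helper: succeed if any reservation is still charged.
  #   $1 = cgroup directory about to be removed, e.g. /sys/fs/cgroup/g1
  hugetlb_has_rsvd_charges() {
      for f in "$1"/hugetlb.*.rsvd.usage_in_bytes; do
          [ -e "$f" ] || continue          # the glob matched nothing
          [ "$(cat "$f")" -gt 0 ] && return 0
      done
      return 1
  }

A script could then warn before calling rmdir when the helper succeeds,
because the group would otherwise persist until those reservations are
uncharged.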