Lines Matching refs:cpuset

45 the resources within a task's current cpuset.  They form a nested
56 policy, are both filtered through that task's cpuset, filtering out any
57 CPUs or Memory Nodes not in that cpuset. The scheduler will not
64 cpusets and which CPUs and Memory Nodes are assigned to each cpuset,
65 specify and query to which cpuset a task is assigned, and list the
66 task pids assigned to a cpuset.
100 The kernel cpuset patch provides the minimum essential kernel
121 - Each task in the system is attached to a cpuset, via a pointer
124 allowed in that task's cpuset.
126 those Memory Nodes allowed in that task's cpuset.
127 - The root cpuset contains all the system's CPUs and Memory
129 - For any cpuset, one can define child cpusets containing a subset
131 - The hierarchy of cpusets can be mounted at /dev/cpuset, for
133 - A cpuset may be marked exclusive, which ensures that no other
134 cpuset (except direct ancestors and descendants) may contain
136 - You can list all the tasks (by pid) attached to any cpuset.
141 - in init/main.c, to initialize the root cpuset at system boot.
142 - in fork and exit, to attach and detach a task from its cpuset.
144 allowed in that task's cpuset.
146 the CPUs allowed by their cpuset, if possible.
148 Memory Nodes by what's allowed in that task's cpuset.
150 - in vmscan.c, to restrict page reclaim to the current cpuset.
155 modifying cpusets is via this cpuset file system.
167 Each cpuset is represented by a directory in the cgroup file system
169 files describing that cpuset:
171 - cpuset.cpus: list of CPUs in that cpuset
172 - cpuset.mems: list of Memory Nodes in that cpuset
173 - cpuset.memory_migrate flag: if set, move pages to cpuset's nodes
174 - cpuset.cpu_exclusive flag: is cpu placement exclusive?
175 - cpuset.mem_exclusive flag: is memory placement exclusive?
176 - cpuset.mem_hardwall flag: is memory allocation hardwalled
177 - cpuset.memory_pressure: measure of how much paging pressure in cpuset
178 - cpuset.memory_spread_page flag: if set, spread page cache evenly on allowed nodes
179 - cpuset.memory_spread_slab flag: if set, spread slab cache evenly on allowed nodes
180 - cpuset.sched_load_balance flag: if set, load balance within CPUs on that cpuset
181 - cpuset.sched_relax_domain_level: the searching range when migrating tasks
183 In addition, only the root cpuset has the following file:
184 - cpuset.memory_pressure_enabled flag: compute memory_pressure?
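
A minimal sketch of inspecting these files, assuming the cpuset hierarchy
is mounted at /sys/fs/cgroup/cpuset; the child cpuset name "example" is
illustrative only:

  cd /sys/fs/cgroup/cpuset
  cat cpuset.cpus cpuset.mems              # placement of the root cpuset
  cat cpuset.memory_pressure_enabled       # present only in the root cpuset
  mkdir example
  ls example/cpuset.*                      # the per-cpuset files listed above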
187 command. The properties of a cpuset, such as its flags, allowed
195 children of that task, to a cpuset allows organizing the work load
197 to using the CPUs and Memory Nodes of a particular cpuset. A task
198 may be re-attached to any other cpuset, if allowed by the permissions
199 on the necessary cpuset file system directories.
205 The following rules apply to each cpuset:
214 exclusive cpuset. Also, the use of a Linux virtual file system (vfs)
215 to represent the cpuset hierarchy provides for a familiar permission
218 The cpus and mems files in the root (top_cpuset) cpuset are
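
Because the hierarchy is an ordinary vfs tree, standard ownership and mode
bits can delegate management of a sub-cpuset; a hedged sketch, where the
directory name "subset" and the user "alice" are illustrative only:

  mkdir /sys/fs/cgroup/cpuset/subset
  chown -R alice /sys/fs/cgroup/cpuset/subset
  # alice may now write subset/cpuset.cpus, subset/cpuset.mems and
  # subset/tasks, without being able to alter the root cpuset's placement.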
228 If a cpuset is cpu or mem exclusive, no other cpuset, other than
232 A cpuset that is cpuset.mem_exclusive *or* cpuset.mem_hardwall is "hardwalled",
238 isolating each job's user allocation in its own cpuset. To do this,
239 construct a large mem_exclusive cpuset to hold all the jobs, and
243 mem_exclusive cpuset.
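
A hedged sketch of that layout, assuming the standard /sys/fs/cgroup/cpuset
mount; the names "jobs" and "job1", and the CPU/node numbers, are
illustrative only:

  cd /sys/fs/cgroup/cpuset
  mkdir jobs && cd jobs
  /bin/echo 0-7 > cpuset.cpus
  /bin/echo 0-3 > cpuset.mems
  /bin/echo 1 > cpuset.mem_exclusive   # hardwall user allocations to this subtree
  mkdir job1 && cd job1
  /bin/echo 0-1 > cpuset.cpus          # a child must use a subset of its parent
  /bin/echo 0 > cpuset.mems
  /bin/echo $$ > tasks                 # attach the current shell, then exec the job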
248 The memory_pressure of a cpuset provides a simple per-cpuset metric
249 of the rate that the tasks in a cpuset are attempting to free up in-use
250 memory on the nodes of the cpuset to satisfy additional memory
265 to monitor a cpuset for signs of memory pressure. It's up to the
270 /dev/cpuset/memory_pressure_enabled, the hook in the rebalance
275 Why a per-cpuset, running average:
277 Because this meter is per-cpuset, rather than per-task or mm,
287 Because this meter is per-cpuset rather than per-task or mm,
289 pressure in a cpuset, with a single read, rather than having to
291 set of tasks in the cpuset.
293 A per-cpuset simple digital filter (requires a spinlock and 3 words
294 of data per-cpuset) is kept, and updated by any task attached to that
295 cpuset, if it enters the synchronous (direct) page reclaim code.
297 A per-cpuset file provides an integer number representing the recent
299 the tasks in the cpuset, in units of reclaims attempted per second,
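
A hedged sketch of how a batch manager might poll this metric; the paths
assume the standard mount point and the cpuset name "Charlie" is
illustrative:

  # computation must first be enabled via the root-only control file
  /bin/echo 1 > /sys/fs/cgroup/cpuset/cpuset.memory_pressure_enabled
  # then read the per-cpuset running average of direct reclaim attempts
  cat /sys/fs/cgroup/cpuset/Charlie/cpuset.memory_pressure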
305 There are two boolean flag files per cpuset that control where the
307 kernel data structures. They are called 'cpuset.memory_spread_page' and
308 'cpuset.memory_spread_slab'.
310 If the per-cpuset boolean flag file 'cpuset.memory_spread_page' is set, then
315 If the per-cpuset boolean flag file 'cpuset.memory_spread_slab' is set,
326 except perhaps as modified by the task's NUMA mempolicy or cpuset
340 Both 'cpuset.memory_spread_page' and 'cpuset.memory_spread_slab' are boolean flag
342 for that cpuset. If a "1" is written to that file, then that turns
347 Setting the flag 'cpuset.memory_spread_page' turns on a per-process flag
348 PFA_SPREAD_PAGE for each task that is in that cpuset or subsequently
349 joins that cpuset. The page allocation calls for the page cache
354 Similarly, setting 'cpuset.memory_spread_slab' turns on the flag
368 the several nodes in the job's cpuset in order to fit. Without this
370 data set, the memory allocation across the nodes in the job's cpuset
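
A hedged sketch of enabling both spread flags for a cpuset; the path and
cpuset name are illustrative:

  cd /sys/fs/cgroup/cpuset/Charlie
  /bin/echo 1 > cpuset.memory_spread_page   # spread page cache over allowed nodes
  /bin/echo 1 > cpuset.memory_spread_slab   # spread slab caches likewise
  # tasks already in, or later attached to, this cpuset pick up the
  # corresponding per-task spread flags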
409 When the per-cpuset flag "cpuset.sched_load_balance" is enabled (the default
410 setting), it requests that all the CPUs in that cpuset's allowed 'cpuset.cpus'
413 from any CPU in that cpuset to any other.
415 When the per-cpuset flag "cpuset.sched_load_balance" is disabled, then the
416 scheduler will avoid load balancing across the CPUs in that cpuset,
417 --except-- in so far as is necessary because some overlapping cpuset
420 So, for example, if the top cpuset has the flag "cpuset.sched_load_balance"
422 CPUs, and the setting of the "cpuset.sched_load_balance" flag in any other
425 Therefore in the above two situations, the top cpuset flag
426 "cpuset.sched_load_balance" should be disabled, and only some of the smaller,
430 the top cpuset that might use non-trivial amounts of CPU, as such tasks
437 Of course, tasks pinned to a particular CPU can be left in a cpuset
438 that disables "cpuset.sched_load_balance" as those tasks aren't going anywhere
448 overlapping cpusets enables the flag 'cpuset.sched_load_balance', then we
450 a task to a CPU outside its cpuset, but the scheduler load balancing
454 between which cpusets have the flag "cpuset.sched_load_balance" enabled,
455 and the sched domain configuration. If a cpuset enables the flag, it
458 cpuset enables the flag.
460 If two cpusets have partially overlapping 'cpuset.cpus' allowed, and only
464 paragraphs above. In the general case, as in the top cpuset case,
470 CPUs in "cpuset.isolcpus" were excluded from load balancing by the
472 of the value of "cpuset.sched_load_balance" in any cpuset.
477 The per-cpuset flag 'cpuset.sched_load_balance' defaults to enabled (contrary
478 to most cpuset flags.) When enabled for a cpuset, the kernel will
479 ensure that it can load balance across all the CPUs in that cpuset
480 (makes sure that all the CPUs in the cpus_allowed of that cpuset are
483 If two overlapping cpusets both have 'cpuset.sched_load_balance' enabled,
486 If, as is the default, the top cpuset has 'cpuset.sched_load_balance' enabled,
488 the whole system, regardless of any other cpuset settings.
493 of CPUs allowed to a cpuset having 'cpuset.sched_load_balance' enabled.
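
A hedged sketch of the arrangement recommended above, with load balancing
disabled in the top cpuset and enabled only in smaller child cpusets; the
names "partA"/"partB" and the CPU/node numbers are illustrative:

  cd /sys/fs/cgroup/cpuset
  /bin/echo 0 > cpuset.sched_load_balance        # no system-wide sched domain
  mkdir partA partB
  /bin/echo 0-3 > partA/cpuset.cpus ; /bin/echo 0 > partA/cpuset.mems
  /bin/echo 4-7 > partB/cpuset.cpus ; /bin/echo 1 > partB/cpuset.mems
  /bin/echo 1 > partA/cpuset.sched_load_balance  # balance CPUs 0-3 among themselves
  /bin/echo 1 > partB/cpuset.sched_load_balance  # and CPUs 4-7 among themselves

Since 'cpuset.sched_load_balance' defaults to enabled, the last two writes
only make the intent explicit.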
495 The internal kernel cpuset to scheduler interface passes from the
496 cpuset code to the scheduler code a partition of the load balanced
501 The cpuset code builds a new such partition and passes it to the
504 - the 'cpuset.sched_load_balance' flag of a cpuset with non-empty CPUs changes,
505 - or CPUs come or go from a cpuset with this flag enabled,
506 - or 'cpuset.sched_relax_domain_level' value of a cpuset with non-empty CPUs
508 - or a cpuset with non-empty CPUs and with this flag enabled is removed,
517 the cpuset code to update these sched domains, it compares the new
551 The 'cpuset.sched_relax_domain_level' file allows you to request changing
554 otherwise the initial value -1, which indicates the cpuset has no request.
567 This file is per-cpuset and affects the sched domain to which the cpuset
568 belongs. Therefore if the flag 'cpuset.sched_load_balance' of a cpuset
569 is disabled, then 'cpuset.sched_relax_domain_level' has no effect, since
570 there is no sched domain belonging to the cpuset.
585 the searching cost small enough, e.g. by keeping cpusets compact.
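
A hedged sketch of adjusting the request; the cpuset name and the value
written are illustrative, and -1 restores the default "no request" state:

  cd /sys/fs/cgroup/cpuset/Charlie
  cat cpuset.sched_relax_domain_level             # -1 until a request is made
  /bin/echo 1 > cpuset.sched_relax_domain_level   # request a small search range
  /bin/echo -1 > cpuset.sched_relax_domain_level  # withdraw the request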
596 task directly, the impact on a task of changing its cpuset CPU
597 or Memory Node placement, or of changing to which cpuset a task
600 If a cpuset has its Memory Nodes modified, then for each task attached
601 to that cpuset, the next time that the kernel attempts to allocate
603 in the task's cpuset, and update its per-task memory placement to
606 its new cpuset, then the task will continue to use whatever subset
607 of MPOL_BIND nodes are still allowed in the new cpuset. If the task
609 in the new cpuset, then the task will be essentially treated as if it
610 was MPOL_BIND bound to the new cpuset (even though its NUMA placement,
612 from one cpuset to another, then the kernel will adjust the task's
616 If a cpuset has its 'cpuset.cpus' modified, then each task in that cpuset
618 if a task's pid is written to another cpuset's 'tasks' file, then its
620 bound to some subset of its cpuset using the sched_setaffinity() call,
621 the task will be allowed to run on any CPU allowed in its new cpuset,
624 In summary, the memory placement of a task whose cpuset is changed is
631 cpuset's memory placement policy 'cpuset.mems' subsequently changes.
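
A hedged sketch of observing the immediate CPU placement update described
above when a task's pid is written to another cpuset's 'tasks' file; the
pid 1234 and the cpuset name are illustrative:

  grep Cpus_allowed_list /proc/1234/status       # placement in the old cpuset
  /bin/echo 1234 > /sys/fs/cgroup/cpuset/Charlie/tasks
  grep Cpus_allowed_list /proc/1234/status       # now reflects Charlie's cpuset.cpus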
632 If the cpuset flag file 'cpuset.memory_migrate' is set true, then when
633 tasks are attached to that cpuset, any pages that task had
634 allocated to it on nodes in its previous cpuset are migrated
635 to the task's new cpuset. The relative placement of the page within
636 the cpuset is preserved during these migration operations if possible.
637 For example if the page was on the second valid node of the prior cpuset
638 then the page will be placed on the second valid node of the new cpuset.
640 Also if 'cpuset.memory_migrate' is set true, then if that cpuset's
641 'cpuset.mems' file is modified, pages allocated to tasks in that
642 cpuset, that were on nodes in the previous setting of 'cpuset.mems',
644 Pages that were not in the task's prior cpuset, or in the cpuset's
645 prior 'cpuset.mems' setting, will not be moved.
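
A hedged sketch of using 'cpuset.memory_migrate'; the pid and node numbers
are illustrative:

  cd /sys/fs/cgroup/cpuset/Charlie
  /bin/echo 1 > cpuset.memory_migrate
  /bin/echo 1234 > tasks     # pages the task had on its old nodes follow it here
  /bin/echo 2 > cpuset.mems  # pages of attached tasks on the old mems move to node 2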
648 to remove all the CPUs that are currently assigned to a cpuset,
649 then all the tasks in that cpuset will be moved to the nearest ancestor
651 cpuset is bound with another cgroup subsystem which has some restrictions
653 in the original cpuset, and the kernel will automatically update
657 violate cpuset placement, over starving a task that has had all
664 the current task's cpuset, then we relax the cpuset, and look for
665 memory anywhere we can find it. It's better to violate the cpuset
668 To start a new job that is to be contained within a cpuset, the steps are:
670 1) mkdir /sys/fs/cgroup/cpuset
671 2) mount -t cgroup -ocpuset cpuset /sys/fs/cgroup/cpuset
672 3) Create the new cpuset by doing mkdir's and write's (or echo's) in
673 the /sys/fs/cgroup/cpuset virtual file system.
675 5) Attach that task to the new cpuset by writing its pid to the
676 /sys/fs/cgroup/cpuset tasks file for that cpuset.
679 For example, the following sequence of commands will setup a cpuset
681 and then start a subshell 'sh' in that cpuset:
683 mount -t cgroup -ocpuset cpuset /sys/fs/cgroup/cpuset
684 cd /sys/fs/cgroup/cpuset
687 /bin/echo 2-3 > cpuset.cpus
688 /bin/echo 1 > cpuset.mems
691 # The subshell 'sh' is now running in cpuset Charlie
693 cat /proc/self/cpuset
696 - via the cpuset file system directly, using the various cd, mkdir, echo,
702 (http://code.google.com/p/cpuset/)
715 Creating, modifying, using the cpusets can be done through the cpuset
719 # mount -t cgroup -o cpuset cpuset /sys/fs/cgroup/cpuset
721 Then under /sys/fs/cgroup/cpuset you can find a tree that corresponds to the
722 tree of the cpusets in the system. For instance, /sys/fs/cgroup/cpuset
723 is the cpuset that holds the whole system.
725 If you want to create a new cpuset under /sys/fs/cgroup/cpuset:
726 # cd /sys/fs/cgroup/cpuset
729 Now you want to do something with this cpuset.
734 cgroup.clone_children cpuset.memory_pressure
735 cgroup.event_control cpuset.memory_spread_page
736 cgroup.procs cpuset.memory_spread_slab
737 cpuset.cpu_exclusive cpuset.mems
738 cpuset.cpus cpuset.sched_load_balance
739 cpuset.mem_exclusive cpuset.sched_relax_domain_level
740 cpuset.mem_hardwall notify_on_release
741 cpuset.memory_migrate tasks
743 Reading them will give you information about the state of this cpuset:
746 the cpuset.
749 # /bin/echo 1 > cpuset.cpu_exclusive
752 # /bin/echo 0-7 > cpuset.cpus
755 # /bin/echo 0-7 > cpuset.mems
757 Now attach your shell to this cpuset:
760 You can also create cpusets inside your cpuset by using mkdir in this
764 To remove a cpuset, just use rmdir:
766 This will fail if the cpuset is in use (has cpusets inside, or has
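
A hedged sketch of emptying a cpuset before removal; the name "my_cpuset"
is illustrative and the commands are run from the parent directory, moving
each remaining task up to the parent cpuset first:

  # while read pid; do /bin/echo $pid > tasks; done < my_cpuset/tasks
  # rmdir my_cpuset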
769 Note that for legacy reasons, the "cpuset" filesystem exists as a
774 mount -t cpuset X /sys/fs/cgroup/cpuset
778 mount -t cgroup -ocpuset,noprefix X /sys/fs/cgroup/cpuset
779 echo "/sbin/cpuset_release_agent" > /sys/fs/cgroup/cpuset/release_agent
785 in cpuset directories:
787 # /bin/echo 1-4 > cpuset.cpus -> set cpus list to cpus 1,2,3,4
788 # /bin/echo 1,2,3,4 > cpuset.cpus -> set cpus list to cpus 1,2,3,4
790 To add a CPU to a cpuset, write the new list of CPUs including the
791 CPU to be added. To add 6 to the above cpuset:
793 # /bin/echo 1-4,6 > cpuset.cpus -> set cpus list to cpus 1,2,3,4,6
795 Similarly to remove a CPU from a cpuset, write the new list of CPUs
800 # /bin/echo "" > cpuset.cpus -> clear cpus list
807 # /bin/echo 1 > cpuset.cpu_exclusive -> set flag 'cpuset.cpu_exclusive'
808 # /bin/echo 0 > cpuset.cpu_exclusive -> unset flag 'cpuset.cpu_exclusive'
829 errors. If you use it in the cpuset file system, you won't be
839 Web: http://www.bullopensource.org/cpuset