Monday, December 7, 2015

Let's perform some magic with cgroups!

cgroups (control groups) is a Linux kernel mechanism for monitoring and managing the computer's resources, such as:
  • CPU runtime
  • Memory usage
  • Read/write speed for block devices
  • Network bandwidth
As we know, Linux is a great system for sharing resources among running applications/processes, but let's say this time I don't want to distribute my resources equally among processes; I want to guarantee more resources to a specific process.
This can be done via control groups, aka cgroups.
Let's say we have one process which is far more important than the others: I would declare a profile consisting of resource limits and then assign this profile to the process.
A similar example can be given when speaking of containers/VMs, where we would like to prioritize resources between the containers.
This way we limit the impact of a VM which hogs the CPUs.

I suggest you read the kernel's documentation on cgroups, which
explains them very well; here is the link:
https://www.kernel.org/doc/Documentation/cgroups/cgroups.txt

In case the cgroups tools are not installed on your system, you can install them right away via (I'm using Ubuntu 14.04 for demonstration purposes):

sudo apt-get install cgroup-bin

After a reboot, cgroups are mounted at /sys/fs/cgroup.
List the contents of that folder and you should see a subdirectory per subsystem (on Ubuntu 14.04 these include cpuset, cpu, cpuacct, memory, blkio, devices and freezer).

These subdirectories represent the cgroup subsystems (controllers) which can be managed by you.

In this post I'll be giving three demonstrations, each one demonstrating management of a different kind of resource. So let the fun begin!

Example #1 - CPU cores usage

Let's say I would like to run a specific process on a specific core; I can easily do this on the fly. Create a control group under the cpuset folder (I'll call it "myDemo"), then echo the number of the CPU you want to assign to the process. Note that on the v1 hierarchy you must also set cpuset.mems (the allowed memory nodes) before any task can be attached:

mkdir /sys/fs/cgroup/cpuset/myDemo
echo 0 > /sys/fs/cgroup/cpuset/myDemo/cpuset.cpus
echo 0 > /sys/fs/cgroup/cpuset/myDemo/cpuset.mems

Run your process, and then bind its PID to the specific core via:

echo <PID> > /sys/fs/cgroup/cpuset/myDemo/tasks

Below you can see a screenshot of the graph I took while monitoring the 4 cores; there are 4 intervals, which I'll explain:

Interval #1 (50-60 sec):
  • Demonstrates the 4 cores running under normal load.
Interval #2 (30-50 sec):
  • I invoked my complexCalculation process; you can easily notice a ramp on CPU3.
Interval #3 (10-30 sec):
  • I applied the cgroup rule, so now the process runs only on CPU1; we can see the decline on CPU3 and a ramp on CPU1.
Interval #4 (0-10 sec):
  • I stopped the complexCalculation process, so as we would expect there is a graceful degradation on CPU1.
For live updates regarding the cores' usage, I used the top command:

top -p <PID>

Press 1 to toggle to the Separate-Cpu-States screen.

So what actually happened under the hood?
The hard affinity is stored as a bitmask in the task's task_struct as cpus_allowed (see sched.h). The bitmask contains one bit per possible processor on the system (in my case I have 4 CPUs). By default, all bits are set and, therefore, a process is potentially runnable on any processor.
After I echoed the PID into the tasks file, the function sched_setaffinity() was invoked;
we can easily see this via ftrace or by setting a breakpoint.

Example #2 - Limiting memory usage

We can easily write a C program which grabs a chunk of memory of about 5MB on each loop iteration.
So after about 15 iterations we have consumed 75MB of RAM. I'll be calling this small app "processWastingMemory".
To avoid this scenario (wasting memory) we can add a new rule for memory consumption; the rule resides in the memory controller:

1) memory.limit_in_bytes (physical memory)
2) memory.memsw.limit_in_bytes (swap usage)

Let's create a control group named "myDemo" and set both limits to 20 MB (20 MB = 20971520 bytes):

cgcreate -g memory:myDemo
echo 20971520 > /sys/fs/cgroup/memory/myDemo/memory.limit_in_bytes
echo 20971520 > /sys/fs/cgroup/memory/myDemo/memory.memsw.limit_in_bytes

(The memsw file only exists if swap accounting is enabled on your kernel.)

Now let's run the process/task in the given control group:

cgexec -g memory:myDemo ./processWastingMemory

So here I'm defining the control group in which the task will run; the controller is "memory". After executing the command, we can easily notice that the program gets killed immediately after the process reaches the 20MB memory limit.

We can check dmesg, which shows the following message:

"Memory cgroup out of memory: kill process"

Example #3 - Read/write speed for block device

Will be given next week with interesting graphs... so stay tuned!

Meanwhile enjoy exploring new intriguing stuff in the Linux world! :)
