Information
User Beancounters (UBC) are the set of limits and guarantees that form the core resource management component of the Virtuozzo kernel, used by both Parallels Virtuozzo Containers and OpenVZ. This article describes the purpose of each UBC limit and guarantee.
Information about User Beancounters is available on the node and inside the container in /proc/user_beancounters. For more information about the file structure, refer to this article:
1354 What are those User Beancounters?
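As a rough illustration of how the counters can be read programmatically, the sketch below parses text in the layout commonly shown for /proc/user_beancounters (a "Version:" line, a column header, then one row per resource with held, maxheld, barrier, limit and failcnt). The sample text, the function names and the exact column layout are assumptions made for this example, not something this article specifies.

```python
from typing import Dict

# Column names assumed for this sketch; they follow the usual header line.
FIELDS = ("held", "maxheld", "barrier", "limit", "failcnt")


def parse_user_beancounters(text: str) -> Dict[str, Dict[str, Dict[str, int]]]:
    """Return {container_id: {resource: {field: value}}} from beancounter text."""
    result: Dict[str, Dict[str, Dict[str, int]]] = {}
    current = None
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the "Version:" line and the column header.
        if not parts or parts[0] in ("Version:", "uid"):
            continue
        # A row that opens a new container starts with "<id>:".
        if parts[0].endswith(":"):
            current = parts[0][:-1]
            result[current] = {}
            parts = parts[1:]
        # Ignore anything that is not "<resource> <5 numbers>".
        if current is None or len(parts) != 1 + len(FIELDS):
            continue
        name, values = parts[0], parts[1:]
        result[current][name] = dict(zip(FIELDS, map(int, values)))
    return result


def resources_with_failures(counters: Dict[str, Dict[str, int]]) -> list:
    """List the resources of one container whose failcnt is non-zero."""
    return sorted(name for name, row in counters.items() if row["failcnt"] > 0)


# Hypothetical sample data, used only to exercise the parser.
SAMPLE = """\
Version: 2.5
       uid  resource           held    maxheld    barrier      limit    failcnt
      101:  kmemsize        2254566    2613248   11055923   11377049          0
            numproc              23         41        240        240          0
            tcpsndbuf        139520     700000     319488     524288          7
"""
```

On a real node the text would come from reading /proc/user_beancounters (usually as root) instead of SAMPLE; resources_with_failures(parse_user_beancounters(text)["101"]) then lists the parameters that have hit their limits for container 101.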
Primary UBC parameters
numproc - maximum number of processes and kernel-level threads allowed for a container.
When configuring the resource control system, it is important to estimate both the maximum and the average number of processes, because other resource control parameters may depend on both values.
The barrier of numproc does not provide additional control and should be set equal to the limit.
The total number of processes in the system is also restricted. More than about 16000 processes starts to cause poor responsiveness of the system, which worsens as the number grows, and a total exceeding 32000 is very likely to hang the system. In practice the achievable number is usually lower: each process consumes some memory, so the available memory and the low memory restrict the number of processes further. With typical processes, it is normal to be able to run only up to 8000 processes in a system.
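The system-wide thresholds above can be turned into a quick sanity check on the sum of per-container numproc limits. This is only a sketch: the function name and the wording of the risk levels are invented for the example, and the thresholds are the approximate figures quoted above, not hard kernel limits.

```python
from typing import Dict, Tuple


def total_process_risk(numproc_limits: Dict[str, int]) -> Tuple[int, str]:
    """Sum the numproc limits of all containers and classify the worst case."""
    total = sum(numproc_limits.values())
    if total > 32000:
        level = "hang very likely"
    elif total > 16000:
        level = "poor responsiveness"
    elif total > 8000:
        level = "above what typical systems sustain"
    else:
        level = "within typical range"
    return total, level
```

For example, total_process_risk({"101": 9000, "102": 9000}) reports a worst case of 18000 processes, which falls into the poor-responsiveness range.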
numtcpsock - maximum number of TCP sockets.
Since each container has its own set of IP addresses, there are no direct limits on the total number of TCP sockets in the system. The number of sockets needs to be controlled because each socket needs a certain amount of memory for receive and transmit buffers, and memory is a limited resource.
numothersock - maximum number of non-TCP sockets (local sockets, UDP, and other types of sockets).
UDP sockets are used for Domain Name Service (DNS) queries, but the number of such sockets opened simultaneously is low. UDP and other sockets may also be used in some very special applications (SNMP agents and others).
The barrier of this parameter should be set equal to the limit. The number of local sockets in a system is not limited. The number of UDP sockets in a system, similarly to TCP sockets, is not limited in Virtuozzo systems.
As with the numtcpsock parameter, the number of non-TCP sockets needs to be controlled because each socket needs a certain amount of memory for its buffers, and memory is a limited resource.
vmguarpages - memory allocation guarantee.
The amount of memory that a container's applications are guaranteed to be able to allocate is specified as the barrier of the vmguarpages parameter. The current amount of allocated memory space is accounted into the privvmpages parameter; vmguarpages does not have its own accounting. The barrier and the limit of the privvmpages parameter impose an upper limit on memory allocations. The meaning of the limit for the vmguarpages parameter is unspecified in the current version, and it should be set to the maximal allowed value, LONG_MAX.
If the current amount of allocated memory space does not exceed the guaranteed amount (the barrier of vmguarpages), memory allocations of the container's applications always succeed. If the current amount of allocated memory space exceeds the guarantee but is below the barrier of privvmpages, allocations may or may not succeed, depending on the total amount of available memory in the system.
Normal-priority allocations fail once the barrier of privvmpages is reached, and all memory allocations made by the applications fail once the limit of privvmpages is reached.
The memory allocation guarantee is a primary tool for controlling the memory available to containers, because it allows administrators to provide Service Level Agreements: agreements guaranteeing a certain quality of service, a certain amount of resources, and general availability of the service. The unit of measurement of vmguarpages values is memory pages (4 KB on x86 and x86_64 processors). The total memory allocation guarantees given to containers are limited by the physical resources of the computer, that is, the size of RAM and the swap space.
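The interaction between the vmguarpages guarantee and the privvmpages barrier and limit can be summarised as a small decision function. This is a simplified model written for illustration, not the kernel's implementation: the function and argument names are hypothetical, all values are in pages, and the handling of the exact boundary values is an assumption.

```python
PAGE_SIZE = 4096  # bytes per page on x86 and x86_64, as stated above


def pages_to_mib(pages: int) -> float:
    """Convert a page count into mebibytes."""
    return pages * PAGE_SIZE / (1024 * 1024)


def allocation_outcome(allocated: int, request: int,
                       vmguar_barrier: int,
                       privvm_barrier: int, privvm_limit: int,
                       high_priority: bool = False) -> str:
    """Classify an allocation of `request` pages on top of `allocated` pages."""
    total = allocated + request
    # privvmpages is the upper bound: nothing succeeds above the limit ...
    if total > privvm_limit:
        return "fails"
    # ... and only high-priority allocations may use the barrier-limit gap.
    if total > privvm_barrier and not high_priority:
        return "fails"
    # Within the vmguarpages barrier the allocation is guaranteed.
    if total <= vmguar_barrier:
        return "guaranteed"
    # Otherwise success depends on the memory available in the system.
    return "depends on free memory"
```

For instance, with a guarantee of 2000 pages, a privvmpages barrier of 4000 and a limit of 4400, a request that brings the total to 3100 pages is not guaranteed but may still succeed, while one that brings it to 4100 pages fails unless it is a high-priority allocation.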
Secondary UBC parameters
kmemsize - size of unswappable memory in bytes, allocated by the operating system kernel.
The kmemsize parameter is related to the number of processes. Each process consumes a certain amount of kernel memory: 24 KB at minimum, 30-60 KB typically. Very large processes may consume more than that. It is important to have a certain safety gap between the barrier and the limit of the kmemsize parameter, e.g. 10%. If the barrier and the limit of kmemsize are equal, the kernel may need to kill the container's applications to keep kmemsize usage under the limit.
kmemsize limits cannot be set arbitrarily high. The total amount of memory accounted into the kmemsize parameter plus the socket buffer space is limited by the hardware resources of the system.
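A starting point for kmemsize can be derived from the expected number of processes and the per-process figures above. The sketch below is only an estimate under stated assumptions: the default of 45 KB per process is simply the midpoint of the typical 30-60 KB range, the 10% gap is the example value given above, and the function name is hypothetical.

```python
from typing import Tuple


def kmemsize_settings(max_processes: int,
                      kb_per_process: int = 45,
                      gap_percent: int = 10) -> Tuple[int, int]:
    """Return (barrier, limit) in bytes for the kmemsize parameter."""
    # Barrier: kernel memory expected for the given number of processes.
    barrier = max_processes * kb_per_process * 1024
    # Limit: barrier plus a safety gap, so the two values are never equal.
    limit = barrier + barrier * gap_percent // 100
    return barrier, limit
```

With 100 processes this yields a barrier of 4608000 bytes and a limit of 5068800 bytes. Containers that run very large processes need a higher kb_per_process value.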
tcpsndbuf - the total size of buffers used to send data over TCP network connections.
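The tcpsndbuf parameter depends on the number of TCP sockets and should allow for some minimal amount of socket buffer memory for each socket. If this restriction is not satisfied, some network connections may silently stall, being unable to transmit data.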
Setting high values for the tcpsndbuf parameter may, but does not necessarily, increase the performance of network communications. Note that, unlike with most other parameters, hitting the tcpsndbuf limit and failed socket buffer allocations do not have a strong negative effect on applications; they just reduce the performance of network communications.
If you use rtorrent in a container, a low tcpsndbuf value may cause rtorrent to use an unusual amount of CPU. In this case, set a higher value and watch the failcnt counter in /proc/user_beancounters.
tcpsndbuf limits cannot be set arbitrarily high. The total amount of tcpsndbuf consumable by all containers in the system plus the kmemsize and other socket buffers is limited by the hardware resources of the system.
tcprcvbuf - the total size of buffers used to temporarily store the data coming from TCP network connections.
The tcprcvbuf parameter depends on the number of TCP sockets and should allow for some minimal amount of socket buffer memory for each socket.
If this restriction is not satisfied, some network connections may stall, being unable to receive data, and will be terminated after a couple of minutes.
As with tcpsndbuf, setting high values for the tcprcvbuf parameter may, but does not necessarily, increase the performance of network communications. Hitting the tcprcvbuf limit and failed socket buffer allocations do not have a strong negative effect on applications; they just reduce the performance of network communications. However, staying above the barrier of tcprcvbuf for a long time is more harmful than for tcpsndbuf: long periods of exceeding the barrier may cause some connections to be terminated.
tcprcvbuf limits cannot be set arbitrarily high. The total amount of tcprcvbuf consumable by all containers in the system plus the kmemsize and other socket buffers is limited by the hardware resources of the system.
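This article does not spell out the per-socket restriction on the TCP buffer parameters. The sketch below assumes one plausible form: the gap between the limit and the barrier must provide a minimum amount of buffer memory for every TCP socket the container may open. Both that form and the 2.5 KB per-socket constant are assumptions for this example and should be checked against the UBC documentation for your kernel.

```python
# Assumed minimum buffer memory per socket (2.5 KB); not stated in this article.
MIN_BYTES_PER_SOCKET = 2560


def buffer_gap_ok(barrier: int, limit: int, num_sockets: int,
                  min_per_socket: int = MIN_BYTES_PER_SOCKET) -> bool:
    """Check that limit - barrier leaves min_per_socket bytes for each socket."""
    return limit - barrier >= num_sockets * min_per_socket
```

The same check applies to tcpsndbuf and tcprcvbuf, with num_sockets taken from the numtcpsock limit. For example, a barrier of 319488 and a limit of 524288 leave a gap of 204800 bytes, which covers exactly 80 sockets under this assumption.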
othersockbuf - the total size of buffers used by local (UNIX-domain) connections between processes inside the system (such as connections to a local database server) and send buffers of UDP and other datagram protocols.
An increased othersockbuf limit is necessary for high performance of communications through local (UNIX-domain) sockets. However, as with tcpsndbuf, hitting the othersockbuf limit affects communication performance only and does not affect functionality.
othersockbuf limits cannot be set arbitrarily high. The total amount of othersockbuf consumable by all containers in the system plus the kmemsize and other socket buffers is limited by the hardware resources of the system.
dgramrcvbuf - the total size of buffers used to temporarily store the incoming packets of UDP and other datagram protocols.
Hitting the dgramrcvbuf limit means that some datagrams are dropped, which may or may not be important for application functionality. UDP does not guarantee delivery, so even if the buffers permit, datagrams may still be dropped at any later stage of processing, and applications should be prepared for it.
Unlike other socket buffer parameters, for dgramrcvbuf the barrier should be set to the limit.
dgramrcvbuf limits cannot be set arbitrarily high. The total amount of dgramrcvbuf consumable by all containers in the system plus the kmemsize and other socket buffers is limited by the hardware resources of the system.
oomguarpages - the guaranteed amount of memory in case the memory is "over-booked" (out-of-memory kill guarantee).
If the current usage of memory and swap space (the value of oomguarpages) plus the amount of used kernel memory (kmemsize) and socket buffers is below the barrier, processes in this container are guaranteed not to be killed in out-of-memory situations. If the system is in an out-of-memory situation and several containers exceed their oomguarpages barrier, applications in the container with the biggest excess are killed first. The failcnt counter of the oomguarpages parameter increases when a process in this container is killed because of an out-of-memory situation.
If the administrator needs to make sure that some application will not be forcibly killed regardless of the application's behavior, setting the privvmpages limit to a value not greater than the oomguarpages guarantee significantly reduces the likelihood of the application being killed, and setting it to half of the oomguarpages guarantee prevents it completely. Such configurations are not popular because they significantly reduce the utilization of the hardware.
The meaning of the limit for the oomguarpages parameter is unspecified in the current version.
The total out-of-memory guarantees given to the containers should not exceed the physical capacity of the computer. If guarantees are given for more than the system has, in out-of-memory situations applications in containers with guaranteed level of service and system daemons may be killed.
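The out-of-memory rule described above (usage plus kernel memory and socket buffers compared with the oomguarpages barrier, biggest excess killed first) can be modelled as follows. This is an illustrative model, not the kernel's OOM killer: the function names and the tuple layout are hypothetical, kernel memory and socket buffers are converted from bytes to 4 KB pages, and treating usage exactly at the barrier as unprotected is an assumption.

```python
from typing import Dict, List, Tuple

PAGE_SIZE = 4096  # bytes per page, as on x86 and x86_64


def oom_excess_pages(held_pages: int, kmemsize_bytes: int,
                     sockbuf_bytes: int, barrier_pages: int) -> int:
    """Pages by which a container exceeds its oomguarpages barrier."""
    # Round kernel memory plus socket buffers up to whole pages.
    kernel_pages = -(-(kmemsize_bytes + sockbuf_bytes) // PAGE_SIZE)
    return held_pages + kernel_pages - barrier_pages


def kill_order(containers: Dict[str, Tuple[int, int, int, int]]) -> List[str]:
    """Return unprotected container IDs, the one with the biggest excess first.

    Each value is (held_pages, kmemsize_bytes, sockbuf_bytes, barrier_pages).
    """
    excess = {ctid: oom_excess_pages(*vals) for ctid, vals in containers.items()}
    # Containers strictly below the barrier are covered by the guarantee.
    exposed = [ctid for ctid, value in excess.items() if value >= 0]
    return sorted(exposed, key=lambda ctid: excess[ctid], reverse=True)
```

A container that never appears in the result of kill_order is one whose usage stays below its barrier, which is the condition the guarantee relies on.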
privvmpages - memory allocation limit in pages (which are typically 4096 bytes in size).
The barrier and the limit of the privvmpages parameter control the upper boundary of the total size of allocated memory. Note that this upper boundary does not guarantee that the container will be able to allocate that much memory, nor does it guarantee that other containers will be able to allocate their fair share of memory. The primary mechanism to control memory allocation is the vmguarpages guarantee. The privvmpages parameter accounts allocated (but possibly not yet used) memory. The accounted value is an estimate of how much memory will really be consumed when the container's applications start to use the allocated memory. Consumed memory is accounted into the oomguarpages parameter.
Since the memory accounted into privvmpages may not be actually used, the sum of current privvmpages values for all containers may exceed the RAM and swap size of the computer.
There should be a safety gap between the barrier and the limit for privvmpages parameter to reduce the number of memory allocation failures that the application is unable to handle. This gap will be used for "high-priority" memory allocations, such as process stack expansion. Normal priority allocations will fail when the barrier of privvmpages is reached.
Total privvmpages should correlate with the physical resources of the computer. Also, it is important not to allow any container to allocate a significant portion of all system RAM to avoid serious service level degradation for other containers.
Auxiliary UBC parameters
lockedpages - process pages not allowed to be swapped out (pages locked by mlock(2)).
Note that typical server applications like Web, FTP, mail servers do not use memory locking features.
The configuration of this parameter does not affect security and stability of the whole system or isolation between containers. Its configuration affects functionality and resource shortage reaction of applications in the given container only.
shmpages - the total size of shared memory (IPC, shared anonymous mappings, and tmpfs objects).
The barrier should be set equal to the limit. The configuration of this parameter does not affect security and stability of the whole system or isolation between containers. Its configuration affects functionality and resource shortage reaction of applications in the given container only.
physpages - the total number of RAM pages used by processes in a container. In kernels older than the 2.6.32-based ones, this parameter was used for accounting only.
For vSwap-enabled kernels, the barrier should be set to 0, and the limit caps the total size of RAM used by a container. For older kernels, physpages is an accounting-only parameter: the barrier should be set to 0 and the limit to unlimited.
numfile - the number of open files.
Note: adjusting the barrier currently changes how the kernel "pre-charges" the numfile resource. If you change it, you will most likely not notice any difference in container behaviour. This ability was added purely for research purposes.
numflock - the number of file locks.
numpty - the number of pseudo-terminals.
numsiginfo - the number of siginfo structures.
The barrier should be set equal to the limit. Very high settings of the limit of this parameter may reduce responsiveness of the system. It is unlikely that any container will need the limit greater than the Linux default — 1024.
dcachesize - the total size of dentry and inode structures locked in memory.
The configuration of this parameter should have a gap between the barrier and the limit (about 10%). The configuration of this parameter does not affect security and stability of the whole system or isolation between containers. Its configuration affects functionality and resource shortage reaction of applications in the given container only.
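The barrier and limit recommendations given so far in this article can be collected into a simple configuration check. The sketch below encodes only the rules stated above (barrier equal to the limit, a gap between barrier and limit, or a zero barrier for physpages); the function name, the input format and the warning texts are hypothetical, and parameters without a stated rule are not checked.

```python
from typing import Dict, List, Tuple

# Parameters whose barrier should be set equal to the limit.
EQUAL = {"numproc", "numothersock", "dgramrcvbuf", "shmpages", "numsiginfo"}
# Parameters that need a safety gap between the barrier and the limit.
GAP = {"kmemsize", "privvmpages", "dcachesize"}


def check_barrier_limit(settings: Dict[str, Tuple[int, int]]) -> List[str]:
    """Return warnings for {parameter: (barrier, limit)} settings."""
    warnings = []
    for name, (barrier, limit) in sorted(settings.items()):
        if name in EQUAL and barrier != limit:
            warnings.append(f"{name}: barrier should equal limit")
        elif name in GAP and barrier >= limit:
            warnings.append(f"{name}: leave a gap between barrier and limit")
        elif name == "physpages" and barrier != 0:
            warnings.append("physpages: barrier should be 0")
    return warnings
```

An empty list means that none of the rules encoded here is violated; it does not validate the values themselves against the hardware resources of the system.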
numiptent - the number of NETFILTER (IP packet filtering) entries.
Large numiptent values considerably slow down the processing of network packets. It is not recommended to allow containers to create more than 200–300 entries.
swappages - the amount of swap space to show in a container. Available starting from 2.6.32-based kernels.
If the limit is set, its value is reported as the total amount of swap space in the container. If the limit is set to LONG_MAX (which is the in-kernel default for this parameter), all swap space values (total, used, free) are reported as 0.
The value of barrier for this beancounter is ignored.
The value of held shows how much swap space is currently being used for this container.
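The reporting rules for swappages can be expressed as a short function. This is a sketch of the behaviour described above rather than kernel code: the function name is hypothetical, values are in pages, the LONG_MAX constant shown is the 64-bit value, and computing free space as total minus used is an assumption.

```python
from typing import Tuple

# LONG_MAX on 64-bit kernels; 32-bit kernels use 2**31 - 1 instead.
LONG_MAX = 2**63 - 1


def reported_swap(limit_pages: int, held_pages: int,
                  long_max: int = LONG_MAX) -> Tuple[int, int, int]:
    """Return (total, used, free) swap pages as seen inside the container."""
    # With the default limit of LONG_MAX, all swap values are reported as 0.
    if limit_pages >= long_max:
        return (0, 0, 0)
    # Otherwise the limit is the total and held is the amount in use.
    used = min(held_pages, limit_pages)
    return (limit_pages, used, limit_pages - used)
```

For example, a limit of 262144 pages (1 GB with 4 KB pages) and 1000 held pages is reported as 262144 total, 1000 used and 261144 free.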
Additional information
For more information, review the following sources:
Memory Management in PVC for Linux
UBC Management Guide
OpenVZ Wiki
Parallels Virtuozzo Containers 4.7 for Linux User's Guide