stephenw121/senpai

Senpai

Senpai is an automated memory sizing tool for container applications.

Background

Determining the exact amount of memory required by an application (the workingset size) is a difficult, error-prone task.

Libraries and code pages used during startup are loaded into memory, only to never be touched again. On top of that, the Linux filesystem cache doesn't evict cold data until that memory is needed for new data. As a result, allocated memory is not a good proxy for required memory. This makes it difficult to provision memory correctly and maintain adequate safety margins: too little, and applications experience thrashing or out-of-memory kills during load peaks; too much, and costly hardware resources are wasted.

Senpai is a userspace tool that determines the actual memory requirement of containerized applications.

Using Linux PSI (pressure stall information) metrics and cgroup2 memory limits, senpai applies just enough memory pressure on a container to page out the cold and unused memory pages that aren't necessary for nominal workload performance. It adapts dynamically to load peaks and troughs, and so provides a workingset profile of an application over time.
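To make the pressure signal concrete, here is a minimal sketch of reading memory pressure for a cgroup. It assumes the standard cgroup2 `memory.pressure` file format (one `some` line and one `full` line, each with `avg10`, `avg60`, `avg300` and `total` fields). The helper names `parse_psi` and `read_memory_pressure` are hypothetical, the sample values are made up for illustration, and none of this is taken from senpai's own source.

```python
def parse_psi(text):
    """Parse PSI text into a dict like {"some": {"avg10": 0.18, ...}, "full": {...}}."""
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        values = {}
        for field in fields:
            key, _, raw = field.partition("=")
            # "total" is a cumulative stall time in microseconds; the avg* fields are percentages.
            values[key] = int(raw) if key == "total" else float(raw)
        result[kind] = values
    return result


def read_memory_pressure(cgroup_path):
    """Read PSI for a cgroup2 directory (hypothetical path, e.g. /sys/fs/cgroup/kernelbuild)."""
    with open(f"{cgroup_path}/memory.pressure") as f:
        return parse_psi(f.read())


# Made-up sample in the kernel's PSI format, so the sketch runs without a cgroup.
sample = (
    "some avg10=0.18 avg60=0.05 avg300=0.01 total=117719536\n"
    "full avg10=0.00 avg60=0.00 avg300=0.00 total=98012345\n"
)
psi = parse_psi(sample)
print(psi["some"]["avg10"], psi["some"]["total"])
```

On a real system, `read_memory_pressure` would be pointed at the container's cgroup directory instead of the inline sample.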

This information helps system operators eliminate waste, shore up for contingencies, optimize task placement in compute grids, and plan long-term capacity/hardware requirements.

Examples

An example kernel compile job has a peak memory consumption of 800M:

$ time make -j4 -s
real    3m58.050s
user    13m33.735s
sys     1m30.130s

$ sort -n memory.current-nolimit.log | tail -n 1
803934208
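The `sort -n ... | tail -n 1` pipeline picks the largest sample from the log. How the log was produced is not described here; the sketch below assumes one `memory.current` sample in bytes per line and shows the same peak lookup in Python. Only the 803934208 value comes from the output above; the other samples are made up, and `peak_bytes` and `to_mib` are hypothetical helper names.

```python
def peak_bytes(lines):
    """Return the largest integer sample, ignoring blank lines."""
    samples = [int(line) for line in lines if line.strip()]
    return max(samples)


def to_mib(n_bytes):
    """Convert bytes to mebibytes (2**20 bytes)."""
    return n_bytes / (1024 * 1024)


# Illustrative log contents; with a real log, pass the open file object instead.
log_lines = ["512000000\n", "803934208\n", "\n", "640000000\n"]
peak = peak_bytes(log_lines)
print(f"peak: {peak} bytes ({to_mib(peak):.1f} MiB)")
```

Note that 803934208 bytes is roughly 804 MB in decimal units but about 767 MiB in binary units, so the "800M" figure is an approximation either way.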

However, when a memory limit of 600M is applied, the job finishes in about the same amount of time, with 25% less available memory:

# echo 600M > memory.high

$ time make -j4 -s
real    4m0.654s
user    13m28.493s
sys     1m31.509s

$ sort -n memory.current-600M.log | tail -n 1
629116928
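The `echo 600M > memory.high` step can also be done programmatically. The sketch below assumes the kernel interprets the `K`, `M` and `G` suffixes as powers of 1024, which is consistent with the peak of 629116928 bytes staying just under 600 MiB. `parse_size` and `set_memory_high` are hypothetical helpers, and the demo writes to a temporary directory rather than a real cgroup, since writing to a real one requires appropriate privileges.

```python
import os
import tempfile

# Binary multipliers, assumed to match how the kernel parses cgroup size suffixes.
_UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(text):
    """Convert a string such as '600M' or '1024' to a byte count."""
    text = text.strip()
    suffix = text[-1].upper()
    if suffix in _UNITS:
        return int(text[:-1]) * _UNITS[suffix]
    return int(text)


def set_memory_high(cgroup_dir, size):
    """Write a byte limit to memory.high inside cgroup_dir."""
    with open(os.path.join(cgroup_dir, "memory.high"), "w") as f:
        f.write(str(parse_size(size)))


# Demo against a temporary directory standing in for a cgroup.
with tempfile.TemporaryDirectory() as fake_cgroup:
    set_memory_high(fake_cgroup, "600M")
    with open(os.path.join(fake_cgroup, "memory.high")) as f:
        written = f.read()
print(written)
```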

Clearly, the full 800M isn't required. But 600M still leaves an unknown amount of slack; even a 400M limit doesn't materially affect runtime:

# echo 400M > memory.high

$ time make -j4 -s
real    4m3.186s
user    13m20.452s
sys     1m31.085s

$ sort -n memory.current-400M.log | tail -n 1
419368960

At 300M, on the other hand, the workload struggles to make forward progress and finish within a reasonable amount of time:

# echo 300M > memory.high

$ time make -j4 -s
^C
real    9m9.974s
user    10m59.315s
sys     1m16.576s
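To put the four runs side by side, the sketch below converts the `real` times reported above into seconds and computes the slowdown relative to the unlimited run. The timings are copied from the outputs above; `to_seconds` is a hypothetical helper, and the 300M figure is a lower bound because that run was interrupted.

```python
import re


def to_seconds(stamp):
    """Convert a `time` value such as '3m58.050s' to seconds."""
    m = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if m is None:
        raise ValueError(f"unrecognized time format: {stamp!r}")
    return int(m.group(1)) * 60 + float(m.group(2))


# Wall-clock ("real") times from the runs above.
real_times = {
    "no limit": "3m58.050s",
    "600M": "4m0.654s",
    "400M": "4m3.186s",
    "300M (interrupted)": "9m9.974s",
}

baseline = to_seconds(real_times["no limit"])
slowdown = {}
for label, stamp in real_times.items():
    seconds = to_seconds(stamp)
    # Fractional slowdown versus the unlimited run (0.0 means identical).
    slowdown[label] = seconds / baseline - 1.0
    print(f"{label:>20}: {seconds:8.3f}s  {slowdown[label]:+.1%}")
```

The 600M and 400M runs stay within a few percent of the baseline, while the 300M run had already taken more than twice as long when it was stopped.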

Finding the exact cutoff where job performance begins to plummet is a tedious trial-and-error process. It also only works when the job does a fixed amount of work every time it runs, as in this example. That isn't true for many datacenter services, which run indefinitely and process highly variable user input.

Senpai determines the memory requirement of an application while the application is running:

# senpai .
2019-08-19 14:26:05 Configuration:
2019-08-19 14:26:05   cgpath = /sys/fs/cgroup/kernelbuild
2019-08-19 14:26:05   min_size = 104857600
2019-08-19 14:26:05   max_size = 107374182400
2019-08-19 14:26:05   interval = 5
2019-08-19 14:26:05   pressure = 1000
2019-08-19 14:26:05   max_probe = 0.01
2019-08-19 14:26:05   max_backoff = 0.1
2019-08-19 14:26:05   log_probe = 1000
2019-08-19 14:26:05   log_backoff = 10
2019-08-19 14:26:05 Resetting limit to memory.current.
2019-08-19 14:26:06 limit=100.00M pressure=0.000000 time_to_probe= 6 total=117669927 delta=0 integral=0
2019-08-19 14:26:07 limit=100.00M pressure=0.000000 time_to_probe= 5 total=117669927 delta=0 integral=0
2019-08-19 14:26:08 limit=100.00M pressure=0.000000 time_to_probe= 4 total=117669927 delta=0 integral=0

$ time make -j4 -s

2019-08-19 14:26:09 limit=100.00M pressure=0.000000 time_to_probe= 3 total=117678359 delta=8432 integral=8432
2019-08-19 14:26:09   backoff: 0.09259305978684715
2019-08-19 14:26:10 limit=109.26M pressure=0.180000 time_to_probe= 5 total=117719536 delta=41177 integral=41177
2019-08-19 14:26:10   backoff: 0.1
2019-08-19 14:26:11 limit=120.18M pressure=0.180000 time_to_probe= 5 total=117768197 delta=48661 integral=48661

...

2019-08-19 14:26:43 limit=340.48M pressure=0.160000 time_to_probe= 5 total=118045638 delta=202 integral=202
2019-08-19 14:26:44 limit=340.48M pressure=0.130000 time_to_probe= 4 total=118045638 delta=0 integral=202
2019-08-19 14:26:45 limit=340.48M pressure=0.130000 time_to_probe= 3 total=118045638 delta=0 integral=202
2019-08-19 14:26:46 limit=340.48M pressure=0.110000 time_to_probe= 2 total=118045638 delta=0 integral=202
2019-08-19 14:26:47 limit=340.48M pressure=0.110000 time_to_probe= 1 total=118045690 delta=52 integral=254
2019-08-19 14:26:48 limit=340.48M pressure=0.090000 time_to_probe= 0 total=118045690 delta=0 integral=254
2019-08-19 14:26:48   probe: -0.001983887611266873
2019-08-19 14:26:49 limit=339.80M pressure=0.090000 time_to_probe= 5 total=118045690 delta=0 integral=0

...

real    4m9.420s
user    13m21.723s
sys     1m33.037s

$ sort -n memory.current-senpai.log | tail -n 1
347762688
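The status lines in the senpai output follow a regular `key=value` layout, so they are easy to post-process. The sketch below parses two lines copied from the log above and checks an observation about them: the printed limits look consistent with the limit being scaled by `1 + adjustment` after each `backoff:` or `probe:` line. That relationship is inferred from the printed numbers only, not from senpai's source, and `parse_status` and `adjust` are hypothetical helpers.

```python
import re

# Matches the status lines printed in the log above.
STATUS = re.compile(
    r"limit=(?P<limit>[\d.]+)M\s+pressure=(?P<pressure>[\d.]+)\s+"
    r"time_to_probe=\s*(?P<time_to_probe>\d+)\s+total=(?P<total>\d+)\s+"
    r"delta=(?P<delta>\d+)\s+integral=(?P<integral>\d+)"
)


def parse_status(line):
    """Return the fields of a status line as a dict, or None for other lines."""
    m = STATUS.search(line)
    if m is None:
        return None
    d = m.groupdict()
    return {
        "limit": float(d["limit"]),  # in the log's "M" units
        "pressure": float(d["pressure"]),
        "time_to_probe": int(d["time_to_probe"]),
        "total": int(d["total"]),
        "delta": int(d["delta"]),
        "integral": int(d["integral"]),
    }


def adjust(limit, factor):
    """Assumed update rule: scale the limit by (1 + factor)."""
    return limit * (1.0 + factor)


before = parse_status(
    "2019-08-19 14:26:48 limit=340.48M pressure=0.090000 time_to_probe= 0 "
    "total=118045690 delta=0 integral=254"
)
after = parse_status(
    "2019-08-19 14:26:49 limit=339.80M pressure=0.090000 time_to_probe= 5 "
    "total=118045690 delta=0 integral=0"
)
predicted = adjust(before["limit"], -0.001983887611266873)  # the logged probe value
print(f"predicted={predicted:.2f}M logged={after['limit']:.2f}M")
```

The same check works for the first backoff in the log: scaling 100.00M by `1 + 0.09259305978684715` gives about 109.26M, which matches the next status line. Small differences elsewhere are expected because the log prints rounded values.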

ABOUT SECTION (Recommended Addition):

The About section is vague and does not tell readers exactly what this tool can do. The tool could see wider adoption and scrutiny if its capabilities were described in more detail. Since the target audience is mostly consumers, a simplified description of the tool's benefits would help.

INSTALLATION SECTION (Recommended Addition):

Because the intended audience is consumers, the README should include a section dedicated to installation instructions. Assuming that consumers are not concerned with how the tool's internal logic works, clear step-by-step instructions for applying the tool would be a valuable addition to the README.

EXAMPLE SECTION (Recommended Addition):

The Examples section does showcase the tool's usage and demonstrate its functionality; however, the results shown when the tool is used could be described in more detail. This would help improve the product's readability.

FEATURES SECTION (Recommended Addition):

A features section would promote readability and quickly let consumers judge whether this product is suited to them. Listing the features would also strengthen the product by spelling out its capabilities.

CITATIONS SECTION (Recommended Addition):

Including a citations section can improve community engagement and transparency. It can also acknowledge the work and ideas that inspired this product.

CONTACT SECTION (Recommended Addition):

Including the contributor's contact information encourages others to contribute and to help advance the product.

CONTRIBUTE SECTION (Recommended Addition):

Including this section would make contributors aware of the product's open issues.

Requirements

  • Linux v4.20 or later with CONFIG_PSI=y
  • python3
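
A quick way to check these requirements on a given machine is sketched below. It assumes that a kernel built with CONFIG_PSI=y, and with PSI not disabled at boot, exposes `/proc/pressure/memory`. `kernel_at_least` and `psi_available` are hypothetical helpers and are not part of senpai; the version check is only meaningful on Linux.

```python
import os
import platform
import re


def kernel_at_least(release, minimum=(4, 20)):
    """Compare the major.minor part of a kernel release string with a minimum."""
    m = re.match(r"(\d+)\.(\d+)", release)
    if m is None:
        return False
    return (int(m.group(1)), int(m.group(2))) >= minimum


def psi_available(path="/proc/pressure/memory"):
    """PSI is assumed to be usable when the kernel exposes this file."""
    return os.path.exists(path)


print("kernel ok:", kernel_at_least(platform.release()))
print("psi ok:", psi_available())
```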

License

senpai is GPL v2.0 licensed, as found in the LICENSE file.
