Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrate jemalloc #8152

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Integrate jemalloc #8152

wants to merge 2 commits into from

Conversation

Al2Klimov
Copy link
Member

@Al2Klimov Al2Klimov commented Aug 5, 2020

... to speed up malloc(3) and thereby also the config loading:

fixes #8110

Also seems to fix a memory leak for some people:

fixes #8737

Reason: https://www.softwareverify.com/blog/memory-fragmentation-your-worst-nightmare/

@Al2Klimov Al2Klimov added the enhancement New feature or request label Aug 5, 2020
@Al2Klimov Al2Klimov added this to the 2.13.0 milestone Aug 5, 2020
@Al2Klimov Al2Klimov self-assigned this Aug 5, 2020
@Al2Klimov Al2Klimov force-pushed the feature/jemalloc-8110 branch from 5f9f4cd to bbf7589 Compare August 6, 2020 09:08
@Al2Klimov Al2Klimov removed their assignment Aug 6, 2020
@Al2Klimov Al2Klimov marked this pull request as ready for review August 6, 2020 09:09
@Al2Klimov Al2Klimov requested a review from N-o-X September 15, 2020 15:14
@Al2Klimov
Copy link
Member Author

Numbers: Icinga/docker-icinga2#22 (comment)

@Al2Klimov
Copy link
Member Author

@lippserd FYI: Independent of this PR a regex index doesn't make it much faster.

diff --git a/lib/base/scriptutils.cpp b/lib/base/scriptutils.cpp
index 838f20edd..001dc86ed 100644
--- a/lib/base/scriptutils.cpp
+++ b/lib/base/scriptutils.cpp
@@ -16,8 +16,12 @@
 #include "base/namespace.hpp"
 #include "config/configitem.hpp"
 #include <boost/regex.hpp>
+#include <boost/thread/locks.hpp>
+#include <boost/thread/shared_mutex.hpp>
 #include <algorithm>
 #include <set>
+#include <string>
+#include <unordered_map>
 #ifdef _WIN32
 #include <msi.h>
 #endif /* _WIN32 */
@@ -93,6 +97,11 @@ bool ScriptUtils::CastBool(const Value& value)
        return value.ToBool();
 }

+static struct {
+       std::unordered_map<std::string, boost::regex> Index;
+       boost::shared_mutex Mutex;
+} l_Regexes;
+
 bool ScriptUtils::Regex(const std::vector<Value>& args)
 {
        if (args.size() < 2)
@@ -111,7 +120,25 @@ bool ScriptUtils::Regex(const std::vector<Value>& args)
        else
                mode = MatchAll;

-       boost::regex expr(pattern.GetData());
+       const boost::regex* expr = nullptr;
+
+       {
+               auto key (pattern.GetData());
+               boost::upgrade_lock<boost::shared_mutex> shared (l_Regexes.Mutex);
+               auto pos (l_Regexes.Index.find(key));
+
+               if (pos == l_Regexes.Index.end()) {
+                       boost::upgrade_to_unique_lock<boost::shared_mutex> unique (shared);
+
+                       pos = l_Regexes.Index.find(key);
+
+                       if (pos == l_Regexes.Index.end()) {
+                               pos = l_Regexes.Index.emplace(key, key).first;
+                       }
+               }
+
+               expr = &pos->second;
+       }

        Array::Ptr texts;

@@ -128,7 +155,7 @@ bool ScriptUtils::Regex(const std::vector<Value>& args)
                        bool res = false;
                        try {
                                boost::smatch what;
-                               res = boost::regex_search(text.GetData(), what, expr);
+                               res = boost::regex_search(text.GetData(), what, *expr);
                        } catch (boost::exception&) {
                                res = false; /* exception means something went terribly wrong */
                        }
@@ -144,7 +171,7 @@ bool ScriptUtils::Regex(const std::vector<Value>& args)
        } else {
                String text = argTexts;
                boost::smatch what;
-               return boost::regex_search(text.GetData(), what, expr);
+               return boost::regex_search(text.GetData(), what, *expr);
        }
 }

@Al2Klimov Al2Klimov force-pushed the feature/jemalloc-8110 branch from bbf7589 to ddda5aa Compare December 14, 2020 15:29
@Al2Klimov Al2Klimov requested review from julianbrost and removed request for N-o-X March 22, 2021 14:58
Copy link
Contributor

@julianbrost julianbrost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just leave this to the administrator? They'd have to install jemalloc themselves anyways and if they to this because they want it for icinga2, they can also set an environment variable.

I already find this script strange without this PR, I'd rather see if we can get rid of this mess than extend it.

icinga-app/icinga2.cmake Outdated Show resolved Hide resolved
@Al2Klimov
Copy link
Member Author

They'd have to install jemalloc themselves anyways

No, see #8152 (comment).

@Al2Klimov Al2Klimov force-pushed the feature/jemalloc-8110 branch from ddda5aa to 874a184 Compare March 26, 2021 15:33
@Al2Klimov Al2Klimov requested a review from julianbrost March 26, 2021 15:34
icinga-app/icinga2.cmake Outdated Show resolved Hide resolved
icinga-app/icinga2.cmake Outdated Show resolved Hide resolved
@Al2Klimov Al2Klimov force-pushed the feature/jemalloc-8110 branch from 874a184 to b58148e Compare April 12, 2021 08:51
@Al2Klimov Al2Klimov requested review from julianbrost and removed request for julianbrost April 12, 2021 08:52
@Al2Klimov Al2Klimov self-assigned this Apr 12, 2021
@Al2Klimov Al2Klimov marked this pull request as draft April 12, 2021 10:55
@Al2Klimov Al2Klimov force-pushed the feature/jemalloc-8110 branch from b58148e to 2c4c0c0 Compare April 12, 2021 11:30
@Al2Klimov Al2Klimov marked this pull request as ready for review April 12, 2021 11:30
@Al2Klimov Al2Klimov requested a review from julianbrost April 12, 2021 11:31
@Al2Klimov Al2Klimov added this to the 2.14.0 milestone Jun 2, 2021
@Al2Klimov
Copy link
Member Author

@cla-bot check

@Al2Klimov
Copy link
Member Author

@N-o-X Aren't you testing a real world config atm? Please could you test this PR before/after w/ the config?

@Al2Klimov
Copy link
Member Author

Apropos large configs: do you all agree that we just don’t need this on Raspbian for obvious reasons?

@lippserd
Copy link
Member

Apropos large configs: do you all agree that we just don’t need this on Raspbian for obvious reasons?

Yes.

@N-o-X
Copy link
Contributor

N-o-X commented Nov 26, 2021

@N-o-X Aren't you testing a real world config atm? Please could you test this PR before/after w/ the config?

The results are kinda bad.

Setup

VM:

  • 16GB RAM
  • 8 Cores

Config:

[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 4 NotificationCommands.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 72985 Notifications.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 24 Dependencies.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 1 IcingaApplication.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 24400 HostGroups.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 28007 Hosts.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 2 EventCommands.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 7036 Downtimes.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 116 Comments.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 1 FileLogger.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 1 IcingaDB.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 1 ApiListener.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 215 Zones.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 207 Endpoints.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 11 ApiUsers.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 507 CheckCommands.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 713 Users.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 26 TimePeriods.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 88 ServiceGroups.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 2599 ScheduledDowntimes.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 383318 Services.

Tests

Command: time icinga2 daemon -C

Results 2.13.2

Run 1

real 12m32.680s
user 62m25.203s
sys 17m40.012s

Run 2

real 12m50.119s
user 63m10.677s
sys 17m24.793s

Results 2.13.2 + This PR

Run 1

real 14m40.973s
user 68m20.610s
sys 24m0.484s

Run 2

real 14m35.091s
user 68m40.074s
sys 22m32.007s

@Al2Klimov
Copy link
Member Author

@julianbrost julianbrost removed their request for review November 30, 2021 15:47
@Al2Klimov
Copy link
Member Author

Good to know: #8737 (comment)

@Al2Klimov Al2Klimov requested a review from julianbrost January 3, 2022 09:54
@Al2Klimov
Copy link
Member Author

One more reason: #8737 (comment)

@julianbrost
Copy link
Contributor

I did some playing with jemalloc 5.2.1 (using LD_PRELOAD) and a somewhat larger config modeled after some real world config (daemon -C takes around 20s on my laptop). With this, I got around a 12% performance benefit at 4% more memory usage (about 1GB peak, so about 40MB extra). Take the exact numbers with a grain of salt but they looked consistent enough across 10 runs that I think it's fair to say that in this scenario you get a nice performance benefit for a small penalty in memory use.

Full Config
const numHosts = 10000;
const numHostTemplates = 100;
const numHostGroups = 100;
const maxGroupsPerHost = 4;
const maxTemplatesPerHost = 2;
const hostTemplateApplyRules = 100;
const hostVarEqApplyRules = 10;
const hostNameMatchApplyRules = 90;
const hostNotMatchingApplyRules = 150;
const numExtraNotificationRules = 10;

object CheckCommand "dummy" {
	command = ["true"]
}

object NotificationCommand "dummy" {
	command = ["true"]
}

for (i in range(numHostGroups)) {
	object HostGroup String(i) {}
}

for (i in range(numHostTemplates)) {
	var t = "template_"+String(i)
	template Host t use (t) {
		vars[t] = 42
	}
}

for (i in range(numHosts)) {
	object Host String(i) use (i) {
		check_command = "dummy"
		for (j in range(1 + i % maxGroupsPerHost)) {
			groups += [String((7*i + 11*j) % numHostGroups)]
		}
		for (j in range(1 + i % maxTemplatesPerHost)) {
			import "template_" + String((11*i + 13*j) % numHostTemplates)
		}
		vars.v0 = "foo"
		vars.v1 = "foo"
		vars.v2 = "foo"
		vars.v3 = "foo"
		vars.v4 = "foo"
		vars.v5 = "foo"
		vars.v6 = "foo"
		vars.v7 = "foo"
		vars.v8 = "foo"
		vars.v9 = "foo"
	}

	object Endpoint String(i) {}

	object Zone String(i) use (i) {
		endpoints = [String(i)]
	}
}

for (i in range(hostTemplateApplyRules)) {
	var t = "template_" + String(i%numHostTemplates)
	apply Service t use (t) {
		check_command = "dummy"
		assign where t in host.templates
	}
}

for (i in range(hostVarEqApplyRules)) {
	var t = "var_eq_" + String(i)
	apply Service t use (i, t) {
		check_command = "dummy"
		assign where host.vars["template_" + String(i % numHostTemplates)] == 42
	}
}

for (i in range(hostNameMatchApplyRules)) {
	var t = "name_match+" + String(i)
	var p = String(i%10) + "*"
	apply Service t use (t, p) {
		check_command = "dummy"
		assign where match(p, host.name)
	}
}

for (i in range(hostNotMatchingApplyRules)) {
	var t = "no_match_" + String(i)
	var p = String(i%10) + "*"
	apply Service t use (t, p) {
		check_command = "dummy"
		assign where match("*", host.name) && host.vars.var1 == "value-never-set"
	}
}

object User "user" {}

apply Notification "all" to Host {
	command = "dummy"
	users = ["user"]
	assign where true
}

apply Notification "all" to Service {
	command = "dummy"
	users = ["user"]
	assign where true
}

for (i in range(numExtraNotificationRules)) {
	apply Notification "extra-" + String(i) to Host {
		command = "dummy"
		users = ["user"]
		assign where host.name.len() == 1
	}

	apply Notification "extra-" + String(i) to Service {
		command = "dummy"
		users = ["user"]
		assign where host.name.len() == 1
	}
}

@julianbrost
Copy link
Contributor

Something else to consider: looks like the glibc allocator is better at detecting bad stuff:

$ g++ -std=c++11 -o double-free double-free.cpp
$ g++ -std=c++11 -o heap-overflow heap-overflow.cpp 
$ ./double-free 
free(): double free detected in tcache 2
[1]    1194835 IOT instruction (core dumped)  ./double-free
$ ./heap-overflow
munmap_chunk(): invalid pointer
[1]    1194847 IOT instruction (core dumped)  ./heap-overflow
$ jemalloc.sh ./double-free
$ echo $?
0
$ jemalloc.sh ./heap-overflow
$ echo $?                    
0
double-free.cpp
int main() {
	auto x = new int[1];
	delete[] x;
	delete[] x;
}
heap-overflow.cpp
#include <vector>

int main() {
	std::vector<int> x(1), y(1);
	for (int i = 0; i < 8192; i++) {
		x[i] = 42;
		y[i] = 42;
	}
}

@lippserd lippserd removed their request for review October 4, 2022 13:45
@lippserd
Copy link
Member

lippserd commented Oct 4, 2022

I opt for just closing this one. We should invest the time it takes to test and verify if and how this affects performance into actually improving the code.

@steaksauce-
Copy link

I'm on 2.13.3:

icinga2 - The Icinga 2 network monitoring daemon (version: r2.13.3-1)

Copyright (c) 2012-2022 Icinga GmbH (https://icinga.com/)
License GPLv2+: GNU GPL version 2 or later <https://gnu.org/licenses/gpl2.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

System information:
  Platform: CentOS Linux
  Platform version: 7 (Core)
  Kernel: Linux
  Kernel version: 3.10.0-1127.8.2.el7.x86_64
  Architecture: x86_64

Build information:
  Compiler: GNU 4.8.5
  Build host: runner-hh8q3bz2-project-322-concurrent-0
  OpenSSL version: OpenSSL 1.0.2k-fips  26 Jan 2017

Application information:

General paths:
  Config directory: /etc/icinga2
  Data directory: /var/lib/icinga2
  Log directory: /var/log/icinga2
  Cache directory: /var/cache/icinga2
  Spool directory: /var/spool/icinga2
  Run directory: /run/icinga2

Old paths (deprecated):
  Installation root: /usr
  Sysconf directory: /etc
  Run directory (base): /run
  Local state directory: /var

Internal paths:
  Package data directory: /usr/share/icinga2
  State path: /var/lib/icinga2/icinga2.state
  Modified attributes path: /var/lib/icinga2/modified-attributes.conf
  Objects path: /var/cache/icinga2/icinga2.debug
  Vars path: /var/cache/icinga2/icinga2.vars
  PID path: /run/icinga2/icinga2.pid

Fairly sizeable deployment and suffering from memory leaks which led me here.

Not sure if this is helpful, but I added LD_PRELOAD=/usr/lib64/libjemalloc.so.1 to /etc/sysconfig/icinga2

pre config change test for icinga2 daemon -C

real	1m16.896s
user	2m34.925s
sys	0m16.074s

post config change test for icinga2 daemon -C

real	1m5.757s
user	2m35.747s
sys	0m18.760s

Should have some solid memory graphs in the morning

@julianbrost julianbrost removed this from the 2.14.0 milestone Jan 12, 2023
@Al2Klimov
Copy link
Member Author

Added

fixes #8737

to OP.

@Al2Klimov Al2Klimov force-pushed the feature/jemalloc-8110 branch 2 times, most recently from d7c282b to 35215d8 Compare May 2, 2023 15:02
@Icinga Icinga deleted a comment from Al2Klimov Jan 19, 2024
@Icinga Icinga deleted a comment from Al2Klimov Jan 19, 2024
@Al2Klimov Al2Klimov force-pushed the feature/jemalloc-8110 branch from 35215d8 to c86176a Compare January 31, 2025 16:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla/signed enhancement New feature or request
Projects
None yet
5 participants