Out of memory: how to troubleshoot?

Posted: Sun Jun 14, 2020 9:05 pm
by Milo
Recently I've been getting out-of-memory errors in my log shortly after loading my saved game, even if I stay docked and just let the clock run. I'm using Windows, with the developer build of 1.88 and with the debug console attached (N.B., the same errors appear in the Latest.log even if the debug console is not attached). In the console, the warnings look like this (unsurprisingly, this gets reported for all sorts of scripts, not just these particular ones):

Error: out of memory
Active script: Oolite Bounty Hunter AI 1.88
oolite-priorityai.js, line 279:
for (var i=activeHandlers.length-1; i >= 0 ; i--)
Error: out of memory
Active script: Oolite Police AI 1.88
oolite-priorityai.js, line 279:
for (var i=activeHandlers.length-1; i >= 0 ; i--)
.
.
.
etc.

How would you go about tracking down the root cause? I disabled a few OXPs and the OOM errors stopped happening. However, I'm not sure if my baseline memory usage is excessive. It's hovering just under 3 GB when I just let the game run (still docked, not doing anything). Here is a screenshot of the Resource Monitor showing the baseline: https://pasteboard.co/Jd5ZMJZ.png

I have extra logging enabled (I copied logcontrol.plist to my Oolite\Addons folder and flipped even more switches on). However, I don't see anything in the log that seems to implicate any particular OXP as a resource hog. Here is a "clean" Latest.log corresponding to the above Resource Monitor screenshot (with the game still idling in the background): https://pastebin.com/zNcgwRdv

Next, I re-enabled just one of the OXPs that I had disabled to get to the baseline stable level: phkb's new experimental PopulationControl.oxz. Here is the Latest.log with that OXP added: https://pastebin.com/SLQX9Kbg

I didn't manage to catch the Resource Monitor with a screenshot, but while my saved game was loading, I noticed that this time the memory usage briefly went above 3 GB before dropping back down. And, in the log, a single OOM error appeared around that time. After letting it idle and not seeing any further errors, I decided to try launching from the station to see if it would consume more memory. I did not see it tick over 3 GB (but I might have missed it); however, a single additional OOM error did appear. As you can see in the log, there was one OOM before I launched, and one after. I auto-docked a few moments after that and then let it idle again, and the idling memory usage actually stabilized at a lower level than in the baseline run earlier (around 2.7 GB this run vs. around 2.9 GB before). One thing I see from comparing the logs is that the groupCount values total slightly lower (1 less?) for this second run than they did on the stable baseline run. However, with phkb's tweaked populator there was a larger groupCount for police and for hunters and a smaller groupCount for pirates.

For a third run, I restarted and loaded the same saved game, with no change in OXPs (still using phkb's tweaked populator). This time, I got a flood of OOM errors for several seconds until I paused the game. Again I missed the Resource Monitor screenshot, but it did spike to just above 3GB and stayed at that level while logging the errors. And, if I unpause it, the OOM errors continue. Here is the Latest.log for this run: https://pastebin.com/DLt9k883

In this unstable third run, the groupCount numbers for police and hunters were notably higher than in the stable run. Maybe a contributing factor.

(As a side note, I'm a bit confused by OOM errors happening at 3 GB, because I thought Oolite could use up to 4GB of memory?)

Where to go from here? I can reproduce this fairly easily, but it's not every time. I guess I'm close to the limit and random factors each time I load are deciding if I go over it. But I'm not sure why I'm so close to the limit. Any advice would be greatly appreciated. (I've made a separate thread instead of putting this in phkb's thread because I doubt that his OXP is directly responsible, I suspect it's just helping tip me over the edge.)

Thanks,
Milo

Re: Out of memory: how to troubleshoot?

Posted: Mon Jun 15, 2020 5:51 am
by another_commander
You are running the 64-bit version of the game. There should be no limit to memory usage whatsoever, unless you exceed your physical RAM of course, which does not seem to be the case. Perplexing, indeed.

What happens if you install all OXPs except phkb's populator? Does it run out of memory then?

You can also try switching on these keys in logcontrol.plist for additional populator info:

universe.populate = yes;
universe.maxEntitiesDump = yes;

Re: Out of memory: how to troubleshoot?

Posted: Mon Jun 15, 2020 1:05 pm
by Milo
New observation: memory usage exceeding 3GB does not consistently correlate with the OOM errors, because several times now I have seen it go over 3GB for a while in the Resource Monitor without any such errors. This makes me wonder how exactly the OOM error gets raised, and if it is a pass-through error code directly from JavaScript or not.

Perhaps I'm mis-remembering it happening without phkb's OXP before. So far I haven't managed to get the OOM errors with that OXP removed, even with several others added (including the Povray planets retextures that I had removed earlier when I thought I needed to reduce overall memory usage). I will keep trying to reproduce it without phkb's OXP.

I've also switched to a nightly build (for reverse cycling MFDs and the new graphics), and I can still reproduce OOM errors when I enable phkb's OXP. I toggled the logging switches you suggested. With phkb's OXP enabled, here is a new log with OOM errors: https://pastebin.com/Jv71UGFx (again, this is quite different from the previous logs, since it is with a nightly build and several other OXPs have also been enabled).

Re: Out of memory: how to troubleshoot?

Posted: Tue Jun 16, 2020 4:49 am
by phkb
Can you PM me a copy of the save file you're using for this? I'd like to check some of the data elements inside it.

Re: Out of memory: how to troubleshoot?

Posted: Fri Jun 19, 2020 12:49 pm
by Milo
I'm getting some "out of memory" exceptions now even without phkb's new OXP. I looked into this a bit. The SpiderMonkey JSAPI implementation has a maximum bytes limit which is user-specified when the runtime is instantiated. In OOJavaScriptEngine.m, line 281 sets _runtime = JS_NewRuntime(32L * 1024L * 1024L); so the maximum amount of memory that is allowed to be used by the script engine (including memory pending garbage collection) is roughly 32 MB. Needless to say, that's not a lot of memory on modern hardware.

Since I seem to be right on the edge of it with ~400 world scripts, I think doubling to 64L would be plenty. I don't seem to have a memory leak, from watching the overall memory usage of the process.
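
For reference, the arithmetic behind those two numbers can be written out as a short Python sketch. The helper name `mib_to_bytes` is made up for illustration and is not part of Oolite or SpiderMonkey; only the expression `32L * 1024L * 1024L` comes from the source file.

```python
def mib_to_bytes(mib: int) -> int:
    """Convert a size in MiB to bytes (1 MiB = 1024 * 1024 bytes)."""
    return mib * 1024 * 1024

# The value currently passed to JS_NewRuntime: 32L * 1024L * 1024L
current_limit = mib_to_bytes(32)
# The doubled limit suggested above (hypothetical until actually tested)
doubled_limit = mib_to_bytes(64)

print(current_limit)  # 33554432
print(doubled_limit)  # 67108864
```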

I don't know how to identify which of my OXPs are consuming the majority of the script memory. It looks like to see the relevant counter (rt->gcBytes [bytes allocated]), SpiderMonkey needs to be built with #define JS_GCMETER and will then conditionally compile the js_DumpGCStats function and print memory stats to stdout each GC cycle. (There is also a #define JS_ARENAMETER, but the output from that looks less useful.)

I'm going to try to figure out how to rebuild SpiderMonkey with JS_GCMETER. . . Building Oolite on Windows was easy. Thanks, another_commander! However, it does not build the JS DLL. The comments in the Makefiles seem to indicate that this isn't supported on Windows, and "make -f libjs.make debug=no" fails because it can't find perl in the path. After some failed attempts to add missing requirements (I managed perl, got stuck on C++ compiler) to the MinGW environment provided, I'm admitting defeat on this. Any help would be appreciated.

Re: Out of memory: how to troubleshoot?

Posted: Fri Jun 19, 2020 3:43 pm
by another_commander
Building the Spidermonkey DLL for Oolite on Windows has easily been one of the hardest tasks I've had to do since I got involved in the project. Back then when I compiled it with the set of features we needed for the game, MinGW was not a Spidermonkey target and I had to convert the MSVC project files to something that make could parse and build, while trying to keep 1-on-1 parity with all the supported elements of the MSVC project files. This is what you will have to do if you intend to build the JS dll yourself and I can assure you that it is an extremely frustrating process.

The development environment provided for Windows is a subset of my complete development environment which includes all the game dlls build projects. The environment you have allows you to build the game executable only, as you have already found out. The environment I have allows me to additionally build installers for distribution and also any of the game's dlls, in case there is need to make adjustments to them, including the Spidermonkey one. I can try to dig my makefiles for Spidermonkey if you think that might help, but since I have not had a look at those places for years and years, I cannot guarantee that I will be able to isolate them.

My suggestion is not to try to build Spidermonkey, unless for educational purposes. Instead, try to dedicate more time to the game itself: report bugs and try hacking the exe code, since you can now build it from its sources and test as much as you can. One thing you can do is set the JS runtime size to 64MB in the JS_NewRuntime invocation and see how that works with your memory exhaustion issue.

If I can get the right makefiles for Spidermonkey from my dev environment, I'll upload them somewhere but I won't be able to provide any assistance with them; I don't remember what I did back then and why.

Re: Out of memory: how to troubleshoot?

Posted: Fri Jun 19, 2020 3:57 pm
by Milo
I figured it was something like this. I wonder if I can reach into the JS environment through the runtime object to retrieve the gcBytes counter from Oolite itself? If I can get visibility of the amount of memory allocated by the runtime, I could toggle OXPs on and off to figure out which ones are responsible for the majority of the usage. Meanwhile, I've increased the runtime size to 64 MB and will play with this for a while to see if the OOM error happens again. I will also put phkb's OXP back into the mix...

So far no OOM despite bringing phkb's OXP back in, so the increase to 64 MB seems to have done the trick.

Re: Out of memory: how to troubleshoot?

Posted: Sat Jun 20, 2020 7:39 pm
by another_commander
another_commander wrote: Fri Jun 19, 2020 3:43 pm
If I can get the right makefiles for Spidermonkey from my dev environment, I'll upload them somewhere but I won't be able to provide any assistance with them; I don't remember what I did back then and why.
Here are the files needed to compile Spidermonkey on Windows using Oolite's dev environment. I hope I have not forgotten anything. Still, I am sure that it will not build right out of the box if you throw them in the Spidermonkey source folder. I seem to recall having to enter a gnu assembler command manually in the console for at least one file at some point.

It will also fail to link at the end. But don't worry, the file called linkit64_opt contains the final link command and you can just run this as a batch file as a final step. If all goes well, you should end up with the Spidermonkey dll in the source's root folder.

That's all I can do for now. Good luck in your build attempt, if you are still up for it.

Re: Out of memory: how to troubleshoot?

Posted: Thu Jul 02, 2020 5:43 am
by Milo
I discovered that the debug console (I'm using your new GUI) has a drop-down menu option Debug -> console commands -> write JS memory stats.

JavaScript heap: 45.46 MiB (limit 64.00 MiB, 39 collections to date)

I haven't taken the time yet to narrow down which OXPs are the top contributors, but at least I have the data.
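
One possible way to use that data for narrowing things down is to record the stats line with an OXP disabled and again with it enabled, then compare the two readings. Below is a minimal Python sketch, assuming the line always has the exact shape quoted above. The function names `parse_js_stats` and `heap_delta` are hypothetical helpers, not part of Oolite or the debug console, and since the heap size moves around with garbage collection, a single pair of readings is only a rough indication.

```python
import re

# Pattern for the line printed by "write JS memory stats", as quoted above.
STATS_RE = re.compile(
    r"JavaScript heap: (?P<used>[\d.]+) MiB "
    r"\(limit (?P<limit>[\d.]+) MiB, (?P<gcs>\d+) collections to date\)"
)

def parse_js_stats(line: str) -> dict:
    """Return used/limit in MiB, GC count, and fraction of the limit in use."""
    m = STATS_RE.search(line)
    if m is None:
        raise ValueError("not a JS memory stats line: %r" % line)
    used = float(m.group("used"))
    limit = float(m.group("limit"))
    return {
        "used_mib": used,
        "limit_mib": limit,
        "collections": int(m.group("gcs")),
        "fraction_used": used / limit,
    }

def heap_delta(baseline_line: str, with_oxp_line: str) -> float:
    """Heap growth in MiB between two readings (e.g. one OXP off vs. on)."""
    before = parse_js_stats(baseline_line)["used_mib"]
    after = parse_js_stats(with_oxp_line)["used_mib"]
    return after - before

sample = "JavaScript heap: 45.46 MiB (limit 64.00 MiB, 39 collections to date)"
stats = parse_js_stats(sample)
print(stats["used_mib"], stats["limit_mib"], stats["collections"])
print(round(stats["fraction_used"], 2))
```

With the reading above, the heap is at roughly 71% of the 64 MiB limit and already above the original 32 MiB limit.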

Re: Out of memory: how to troubleshoot?

Posted: Thu Jul 02, 2020 8:32 am
by another_commander
I have sent commit c7a7f0f to the repository. It should take care of the out of memory issue. As stated in the commit message:
commit c7a7f0f wrote:
Use the key 'JSRunTime_size_mb' in .GNUstepDefaults. The value of this key is the requested size of the JS runtime in megabytes. If the key is missing, then the default value of 32MB is used.

Re: Out of memory: how to troubleshoot?

Posted: Thu Jul 02, 2020 8:40 am
by Milo
While I recognize a conservative argument for keeping the default where it was, I doubt any users would have a problem with a higher default. After all, it isn't pre-allocating, just using what it needs. So I would suggest raising it to 64, with my own experience as evidence that with only OXPs available through the in-game manager it is possible to exceed 32.

Re: Out of memory: how to troubleshoot?

Posted: Thu Jul 02, 2020 8:51 am
by another_commander
I am very reluctant to alter the defaults of the Javascript engine, because it may have side effects that are not immediately obvious. It is also uncertain what effect it may have on performance on other system configurations. Most of the JS stuff we have today was written by Jens a long time ago and is effectively part of the heart of the game, so if changes to any defaults are to be applied, it would be necessary to have very good justifications. If this gets tested over a long period of time and proves indeed safe to change, we can consider it. But right now I would prefer it as an option, since most users should be fine with the defaults anyway - having over 310 OXPs installed is not what I would call a typical game setup.

Re: Out of memory: how to troubleshoot?

Posted: Thu Jul 02, 2020 10:45 am
by another_commander
Quick heads-up: In order to maintain consistency with both the existing code references and current standards regarding MB vs MiB meaning, commit 91ad883 has been made. It just makes sure that all references to runtime size are in MiB.

As a result, the .GNUstepDefaults key to use is changed to jsruntime-size-mib.
Example .GNUstepDefaults containing it:

{
    NSGlobalDomain = {
    };
    oolite = {
	"Agtst1_it-humbletrash" = "-25944";
	"AtMaraus-humbletrash" = 1896;
	"Jameson-humbletrash" = "-29624";
	jsruntime-size-mib = 64;
	autosave = NO;
	"debug-settings-override" = {
	};
	detailLevel = 3;
	display_height = 1080;
	display_refresh = 0;
	display_width = 1920;
	"flight-arrow-key-precision-factor" = 0.1;
	"fov-value" = 72.2;
	fullscreen = NO;
	"gamma-value" = 1.6;
	"mouse-control-in-windowed-mode" = YES;
	"music mode" = on;
	p3dnsf = 0.25;
	"save-directory" = "C:\\Oolite/oolite.app/oolite-saves";
	volume_control = 0.3;
	window_height = 715;
	window_width = 1194;
	"wireframe-graphics" = NO;
    };
}
Edit: Changed the key again for consistency with almost all other .GNUstepDefaults keys.
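
The fallback described in the commit message (use the key's value if it is present, otherwise 32) can be sketched in a few lines. This is an illustrative Python sketch only, not Oolite's actual Objective-C code. The function name `js_runtime_bytes` and the plain dict standing in for the parsed .GNUstepDefaults are assumptions, and how the game treats invalid or out-of-range values is not covered in this thread.

```python
DEFAULT_RUNTIME_MIB = 32  # default stated in the commit message

def js_runtime_bytes(defaults: dict, key: str = "jsruntime-size-mib") -> int:
    """Requested JS runtime size in bytes; falls back to 32 MiB if the key is missing."""
    mib = int(defaults.get(key, DEFAULT_RUNTIME_MIB))
    return mib * 1024 * 1024

# Hypothetical stand-ins for the "oolite" section of .GNUstepDefaults
without_key = {"autosave": "NO"}
with_key = {"autosave": "NO", "jsruntime-size-mib": 64}

print(js_runtime_bytes(without_key))  # 33554432
print(js_runtime_bytes(with_key))     # 67108864
```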