Trunk/nightlies have a profiling tool to help track down bottlenecks both in JavaScript and the bits of Oolite that can be triggered from JavaScript. Starting a few minutes ago, I’m starting to roll out the advanced flavour of this tool, which provides detailed information about Oolite’s internal time usage (currently not in Windows).
The profiler is invoked from the JavaScript console as
console.profile(function() { /* put your code here */ }); or
:time /* put your code here */
In practical terms, you might want to try
:time worldScripts["my-oxp-world-script"].someEventHandler(fakeParameters)
As of tonight’s nightly, the full profiler functionality works on Vector3D, Quaternion, System and Console methods (again, not under Windows) as well as some internal functions called by other methods.
Here’s an example of the profiler in use. I’ve cleaned it up a bit manually, since it doesn’t deal gracefully with those looooong Objective-C method names.
Code: Select all
> :time system.addShips('trader', 5, player.ship.position, 10000)
Total time: 37.459 ms
JavaScript: 0.03433 ms, native: 37.4247 ms
Counted towards limit: 0.0538281 ms, excluded: 37.4052 ms
Profiler overhead: 0.290558 ms
NAME COUNT TOTAL SELF TOTAL% SELF% SELFMAX
-[ShipEntity setUpFromDictionary:] 7 30.06 28.88 80.3 77.1 25.96
-[Universe generateSystemData:useCache:] 42 3.66 3.66 9.8 9.8 0.75
-[Universe addShipAt:withRole:
withinRadius:] 5 37.30 2.90 99.6 7.7 0.64
-[ShipEntity setUpShipFromDictionary:] 7 31.40 1.19 83.8 3.2 0.31
-[ShipEntity setUpSubEntities:] 7 1.18 0.25 3.2 0.7 0.08
-[Universe newShipWithRole:] 5 30.75 0.12 82.1 0.3 0.03
-[ShipEntity setUpOneSubentity:] 2 0.93 0.06 2.5 0.2 0.04
VectorToJSValue 1 0.05 0.05 0.1 0.1 0.05
-[Universe
randomShipKeyForRoleRespectingConditions:] 20 0.05 0.05 0.1 0.1 0.01
JSDefineNSProperty 14 0.04 0.04 0.1 0.1 0.01
-[Universe newShipWithName:usePlayerProxy:] 7 31.48 0.04 84.0 0.1 0.01
JSGetNSProperty 21 0.03 0.03 0.1 0.1 0.00
-[ShipEntity initWithKey:definition:] 7 31.42 0.02 83.9 0.1 0.01
+[NSString stringWithJavaScriptString:] 22 0.02 0.02 0.1 0.1 0.00
AddShipsOrGroup 1 37.37 0.02 99.8 0.1 0.02
-[Universe shipClassForShipDictionary:] 7 0.02 0.02 0.1 0.1 0.01
-[Universe addShipsAt:withRole:
quantity:withinRadius:asGroup:] 1 37.32 0.02 99.6 0.1 0.02
+[NSString stringOrNilWithJavaScriptValue:
inContext:] 22 0.04 0.02 0.1 0.0 0.00
ExtractString 35 0.01 0.01 0.0 0.0 0.00
JSArrayFromNSArray 1 0.01 0.01 0.0 0.0 0.01
-[NSString javaScriptValueInContext:] 7 0.01 0.01 0.0 0.0 0.00
JSNewNSArrayValue 1 0.02 0.00 0.1 0.0 0.00
-[Universe currentSystemData] 2 0.57 0.00 1.5 0.0 0.00
SystemAddShips 1 37.37 0.00 99.8 0.0 0.00
VectorFromArgumentListNoErrorInternal 1 0.00 0.00 0.0 0.0 0.00
JSVectorWithVector 1 0.00 0.00 0.0 0.0 0.00
JSObjectGetVector 1 0.00 0.00 0.0 0.0 0.00
Well, that’s a bit scary. Fortunately, you don’t really care about the big table of gobbledegook most of the time. The most important bit for scripters is the lines at the top:
Code: Select all
Total time: 37.459 ms
JavaScript: 0.03433 ms, native: 37.4247 ms
Counted towards limit: 0.0538281 ms, excluded: 37.4052 ms
Profiler overhead: 0.290558 ms
“Total time” is what it says: it took 37.4 milliseconds to run the command.
On the second line, “native” time is time accounted for by profiling-ready Oolite methods. “JavaScript” time is everything else. The goal is for all native time to be fully accounted for, so “JavaScript” time is stuff you’re responsible for, and “native” time is stuff we’re responsible for. (Exception: if you provide a function for Oolite to call, as with system.filteredEntities(), that function’s runtime will currently be counted as “native”.)
The third line divides the total time into “counted towards limit” and “excluded”. When the first number passes a certain threshold, your script will be unceremoniously halted. (At the moment, the threshold is 250 ms for most methods, and 2000 ms for loading time, setUp() and commands issued from the console. In the final release, the limits will be higher.) The point of this is to ensure that scripts stuck in an infinite loop or ridiculously long calculation can’t freeze the game.
For particularly slow native methods, like
System.addShips(), the time limiter is paused. This paused time is listed as “excluded”. We could exclude all “native” time, but the overhead isn’t worth it for fast methods, so it’s done selectively. Part of the point of the profiler is to recognise native methods which should be excluded.
The last number, “Profiler overhead”, accounts for most (but not all) of the time overhead caused by the profiler itself. This time is excluded from the time limit and is part of the “excluded” statistic. It’s not possible to accurately account for the entire cost of the profiler, and it does add a small cost even when it’s not in use, so in the stable normal-user build it will be completely disabled.
The important bit
If the time limiter complains about one of your functions, and it doesn’t look like it should be slow, try the profiler. If the “native” time massively exceeds the “excluded” time, you’ve found something we need to exclude. Please report this, preferably including the entire profiler output as well as your code.
Important bit ends
So, back to the table of gobbledegook. This lists every profiling-enabled native function called by the script being profiled, directly or indirectly. The columns are:
- NAME: guess.
- COUNT: the number of times the function has been invoked.
- TOTAL: the total time spent in invocations of this function, including subcalls, summed over all the invocations.
- SELF: the time spent in this function, excluding subcalls, summed over all invocations. The list is sorted with the highest SELF values at the top.
- TOTAL%: the TOTAL value expressed as a percentage of the overall “Total time”. These will often sum to more than 100 %, since subcalls are counted more than once.
- SELF%: the SELF value expressed as a percentage of the overall “Total time”. These will always sum to less than 100 %, at least before rounding, since they don’t include JavaScript time.
- SELFMAX: the longest amount of time spent in a single invocation.
The distinction between TOTAL and SELF is best illustrated with some code:
Code: Select all
function outer()
{
/* Do something that takes 10 ms here */
function inner()
{
/* Do something that takes 20 ms here */
}
// Call inner() twice:
inner();
inner();
}
NAME COUNT TOTAL SELF TOTAL% SELF% SELFMAX
inner 1 40 40 80.0 80.0 20
outer 1 50 10 100.0 20.0 10
Looking at the sample code above, we can see:
- 99.8 % of the time is spent within SystemAddShips, which (probably not very surprisingly) implements system.addShips(). However, it’s near the bottom of the list with 0.0 % SELF time, because most of its work is done in subroutines.
- 83.8 % of the total runtime is spent in seven calls -[ShipEntity setUpShipFromDictionary:] and its subroutines, including the confusingly named -[ShipEntity setUpFromDictionary:], which do most of the work of creating a new ship. (There are seven calls because two normal subenitites were set up, as seen by the two invocations of -[ShipEntity setUpOneSubentity:].)
- 10 % of time is spent in 42 calls to -[Universe generateSystemData:useCache:]. This was a surprise to me; it turns out that this is called multiple times while setting up the characters for the ships’ crew, in order to get information about their home systems. This could probably be optimized away.
- One of the calls to -[ShipEntity setUpFromDictionary:] accounts for 90 % of its total time (25.96/28.88). That’s because it was the only call to load a new model from disk. I’ve added profiling support for mesh loading now.