Crash in AI (possible race condition) - 1.62 final

Moderators: winston, another_commander, Getafix

winston
Pirate
Posts: 731
Joined: Mon Sep 27, 2004 10:21 pm
Location: Port St. Mary, Isle of Man

Post by winston »

I've encountered a crash in the AI. It seems to be quite rare: both times it happened after the game had been running for over an hour. The latest time, I left the game running and after about 2.5 hours it crashed with a segmentation fault. This happened on Linux, but nothing here suggests that it's specific to the Linux version. My hunch is some kind of race condition.

Stack trace:

Code: Select all

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -150646080 (LWP 5618)]
0x008d633a in objc_msg_lookup () from /usr/lib/libobjc.so.1
(gdb) bt
#0  0x008d633a in objc_msg_lookup () from /usr/lib/libobjc.so.1
#1  0x00e201a6 in -[GSArray indexOfObject:] ()
   from /usr/GNUstep/System/Library/Libraries/libgnustep-base.so.1.11
#2  0x00e54ff3 in -[NSArray containsObject:] ()
   from /usr/GNUstep/System/Library/Libraries/libgnustep-base.so.1.11
#3  0x080f1e16 in -[ShipEntity fireLaserShotInDirection:] (self=0xe4bff010, 
    _cmd=0x8187440, direction=0) at ShipEntity.m:4998
#4  0x080f015e in -[ShipEntity fireMainWeapon:] (self=0xe4bff010, 
    _cmd=0x81871f0, range=11737.24371360863) at ShipEntity.m:4606
#5  0x080e434d in -[ShipEntity update:] (self=0xe4bff010, _cmd=0x818c6e0, 
    delta_t=0.0099929981661261991) at ShipEntity.m:1973
#6  0x081222bf in -[Universe update:] (self=0x86b6ff8, _cmd=0x8174460, 
    delta_t=0.0099929981661261991) at Universe.m:5061
#7  0x0805d967 in -[GameController doStuff:] (self=0x854b8f8, _cmd=0x8177a48, 
    sender=0x0) at GameController.m:444
Going to frame 3 (the last Oolite code before the crash):

Code: Select all

4994                            if (victim->isShip)
4995                            {
4996                                    ShipEntity* ship_hit = ((ShipEntity*)victim); 
4997                                    ShipEntity* subent = ship_hit->subentity_taking_damage;
4998                                    if ((subent) && [ship_hit->sub_entities containsObject:subent])
4999                                    {
5000                                            if (ship_hit->isFrangible)
5001                                            {
Crash is at line 4998.
If we look at ship_hit->sub_entities, this looks fine:

Code: Select all

(gdb) po ship_hit->sub_entities
("<ParticleEntity 300 PARTICLE_EXHAUST ttl: -272.439s>", "<ParticleEntity 300 PARTICLE_EXHAUST ttl: -272.439s>", "<ParticleEntity 300 PARTICLE_EXHAUST ttl: -272.439s>", "<ParticleEntity 600 PARTICLE_FLASHER ttl: -270.439s>", "<ParticleEntity 600 PARTICLE_FLASHER ttl: -270.439s>", "<ParticleEntity 600 PARTICLE_FLASHER ttl: -270.439s>", "<ParticleEntity 600 PARTICLE_FLASHER ttl: -270.439s>", "<ParticleEntity 600 PARTICLE_FLASHER ttl: -270.439s>", "<ParticleEntity 600 PARTICLE_FLASHER ttl: -270.439s>", "<ParticleEntity 600 PARTICLE_FLASHER ttl: -270.439s>", "<ParticleEntity 600 PARTICLE_FLASHER ttl: -270.439s>", "<ParticleEntity 600 PARTICLE_FLASHER ttl: -270.439s>", "<ParticleEntity 600 PARTICLE_FLASHER ttl: -270.439s>", "<ParticleEntity 600 PARTICLE_FLASHER ttl: -270.439s>", "<ParticleEntity 600 PARTICLE_FLASHER ttl: -270.439s>")
However, if we look at the right-hand side of the condition, subent is not nil (so the first part of the 'if' statement evaluates to true), but the pointer is pointing at something it shouldn't. ship_hit itself is valid, as you might expect:

Code: Select all

(gdb) po ship_hit
<ShipEntity GalCop Viper Interceptor 968 (wingman) >

(gdb) p subent
$2 = (class ShipEntity *) 0xf0c34010

(gdb) po subent

Program received signal SIGSEGV, Segmentation fault.
0x008d633a in objc_msg_lookup () from /usr/lib/libobjc.so.1
The program being debugged was signaled while in a function called from GDB.
GDB remains in the frame where the signal was received.
To change this behavior use "set unwindonsignal on"
Evaluation of the expression containing the function (_NSPrintForDebugger) will be abandoned.
JensAyton
Grand Admiral Emeritus
Posts: 6657
Joined: Sat Apr 02, 2005 2:43 pm
Location: Sweden

Post by JensAyton »

ship_hit->subentity_taking_damage may be a dangling pointer. Is it cleared when subentities are destroyed? (I’d look, but I want to get a new release of Dry Dock out in the next hour.)
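
For illustration, clearing the cached pointer whenever a subentity is destroyed might look like the sketch below. It is plain C rather than Oolite's Objective-C, and every name in it (Ship, Sub, ship_add_sub, ship_remove_sub, sub_taking_damage) is hypothetical, not taken from the Oolite source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_SUBS 8

/* Hypothetical stand-ins for a ship and its subentities. */
typedef struct Sub { int id; } Sub;

typedef struct Ship {
    Sub   *subs[MAX_SUBS];      /* owned subentities */
    size_t sub_count;
    Sub   *sub_taking_damage;   /* non-owning cache; must not outlive its target */
} Ship;

static void ship_init(Ship *ship)
{
    ship->sub_count = 0;
    ship->sub_taking_damage = NULL;
}

/* Allocates a subentity and attaches it; returns NULL if full or out of memory. */
static Sub *ship_add_sub(Ship *ship, int id)
{
    if (ship->sub_count == MAX_SUBS)
        return NULL;
    Sub *sub = malloc(sizeof *sub);
    if (sub == NULL)
        return NULL;
    sub->id = id;
    ship->subs[ship->sub_count++] = sub;
    return sub;
}

/* Destroys the subentity at index, clearing the cache first if it points there. */
static void ship_remove_sub(Ship *ship, size_t index)
{
    assert(index < ship->sub_count);
    Sub *gone = ship->subs[index];
    if (ship->sub_taking_damage == gone)
        ship->sub_taking_damage = NULL;   /* clear before the memory is freed */
    free(gone);
    /* Close the gap left in the array. */
    for (size_t i = index + 1; i < ship->sub_count; i++)
        ship->subs[i - 1] = ship->subs[i];
    ship->sub_count--;
}
```

The point of the design is that the only code path that frees a subentity is also the one that resets the cache, so no later reader can see a pointer to freed memory.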

Post by winston »

Well, unless I'm mistaken, a Viper doesn't contain any sub entities that are frangible (well, not until the rest of the Viper is destroyed) so I'm not sure what one of the sub entities is doing being deleted (without the rest of the Viper being deleted).

Post by JensAyton »

Good point. So why is subentity_taking_damage not nil?

Post by winston »

I think subentities that are non-frangible can still take damage but they pass it onto the entity they are stuck on to (I seem to remember Giles mentioning that a while back). I've not really looked at the code in depth, though.
aegidian
Master and Commander
Posts: 1161
Joined: Thu May 20, 2004 10:46 pm
Location: London UK

Post by aegidian »

Yep, a 'hanging pointer' problem, as Ahruman said.

I've committed a fix (r1304) and can probably rebuild before releasing tomorrow!

Thanks guys.

diff

Code: Select all

Index: Universe.m
===================================================================
--- Universe.m  (revision 1303)
+++ Universe.m  (revision 1304)
@@ -4317,6 +4317,8 @@
                result = [hit_entity universal_id];
                if ((hit_subentity)&&[hit_entity->sub_entities containsObject:hit_subentity])
                        hit_entity->subentity_taking_damage = hit_subentity;
+               else
+                       hit_entity->subentity_taking_damage = nil;
                if (range_ptr != nil)
                        range_ptr[0] = (GLfloat)nearest;
        }
Index: ShipEntity.m
===================================================================
--- ShipEntity.m        (revision 1303)
+++ ShipEntity.m        (revision 1304)
@@ -177,6 +177,7 @@
        isShip = YES;
        //
        isFrangible = YES;
+       subentity_taking_damage = nil;
        //
        dockingInstructions = nil;
        //
@@ -514,6 +515,7 @@
        isShip = YES;
        //
        isFrangible = YES;
+       subentity_taking_damage = nil;
        //
        if (dockingInstructions)
                [dockingInstructions autorelease];
@@ -577,6 +579,7 @@
        isShip = YES;
        //
        isFrangible = YES;
+       subentity_taking_damage = nil;
        //
        isNearPlanetSurface = NO;
        //
@@ -933,6 +936,7 @@
        //
        if ([shipdict objectForKey:@"frangible"])       // if an object has frangible == YES then it can have its subentities shot away!
                isFrangible = [(NSNumber *)[shipdict objectForKey:@"frangible"] boolValue];
+       subentity_taking_damage = nil;
        //
        if ([shipdict objectForKey:@"laser_color"])
        {
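
The Universe.m part of the change follows a general pattern: the cached pointer is written on every path of the hit test, so a failed lookup cannot leave the previous value behind. A rough sketch in plain C, with hypothetical names (Part, Target, contains_part, record_hit) that are not Oolite's:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a subentity and the entity that carries it. */
typedef struct Part { int id; } Part;

typedef struct Target {
    Part  **parts;               /* parts currently attached */
    size_t  part_count;
    Part   *part_taking_damage;  /* result of the most recent hit test */
} Target;

/* Identity check: compares addresses only and never dereferences `part`. */
static bool contains_part(const Target *t, const Part *part)
{
    for (size_t i = 0; i < t->part_count; i++)
        if (t->parts[i] == part)
            return true;
    return false;
}

/* Records the outcome of one hit test; both branches assign the cache. */
static void record_hit(Target *t, Part *hit_part)
{
    assert(t != NULL);
    if (hit_part != NULL && contains_part(t, hit_part))
        t->part_taking_damage = hit_part;
    else
        t->part_taking_damage = NULL;   /* the branch the fix adds */
}
```

Without the else branch, a hit that misses every part would leave part_taking_damage holding whatever an earlier hit stored there.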
"The planet Rear is scourged by well-intentioned OXZs."

Oolite models and gear? click here!

Post by aegidian »

Y'know it's just struck me that this also could have been the cause of the invulnerable ship problem. If the subentity_taking_damage pointer had been left pointing at a valid entity and ship_hit->sub_entities hasn't been cleared (it should have been) then hits may be registering on the wrong victim entirely.
"The planet Rear is scourged by well-intentioned OXZs."

Oolite models and gear? click here!

Post by winston »

aegidian wrote:
then hits may be registering on the wrong victim entirely.
Ooh, nasty :-)

Sort of an SEP (Somebody Else's Problem) field? I always wondered what unfortunate people suddenly found themselves asphyxiating when the SEP field of Slartibartfast's ship made it Somebody Else's Problem about standing on an asteroid in space in a total vacuum whilst watching the Krikkit Robots unlock their planet...

Post by aegidian »

Now at r1305: reinitialising ships when recycling them, to avoid this problem.
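
Reinitialising on recycle can be sketched roughly as below: every field, including cached pointers, is reset before a pooled object is handed out again. This is plain C with hypothetical names (PooledShip, ShipPool, ship_reinit, pool_acquire, pool_release); how r1305 actually does it is not shown in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 4

/* Hypothetical pooled object with the two fields the fix touches. */
typedef struct PooledShip {
    bool  in_use;
    bool  is_frangible;
    void *subentity_taking_damage;   /* non-owning cache */
} PooledShip;

typedef struct ShipPool {
    PooledShip slots[POOL_SIZE];
} ShipPool;

/* Puts a ship back into its freshly created state. */
static void ship_reinit(PooledShip *ship)
{
    ship->is_frangible = true;
    ship->subentity_taking_damage = NULL;
}

/* Returns a free, reinitialised slot, or NULL if the pool is exhausted. */
static PooledShip *pool_acquire(ShipPool *pool)
{
    assert(pool != NULL);
    for (size_t i = 0; i < POOL_SIZE; i++) {
        PooledShip *ship = &pool->slots[i];
        if (!ship->in_use) {
            ship_reinit(ship);       /* wipe whatever the last user left */
            ship->in_use = true;
            return ship;
        }
    }
    return NULL;
}

/* Marks a slot as free; stale fields are wiped on the next acquire. */
static void pool_release(PooledShip *ship)
{
    ship->in_use = false;
}
```

Resetting in pool_acquire rather than pool_release means a recycled ship starts clean even if some code path forgot to tidy up when the previous ship was retired.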
"The planet Rear is scourged by well-intentioned OXZs."

Oolite models and gear? click here!
Star Gazer
---- E L I T E ----
Posts: 633
Joined: Sat Aug 14, 2004 4:55 pm
Location: North Norfolk, UK, (Average Agricultural, Feudal States,Tech Level 8)

Post by Star Gazer »

Nicely sorted, guys, you're developing into a very efficient team... :wink:
Very funny, Scotty, now beam down my clothes...