A slight non-running issue with 1.75.4
Moderators: winston, another_commander
Hi,
I've finally decided to have a bash at compiling 1.75.4 (trunk) on my FreeBSD workstation in preparation for the MNSR. After a few additional tweaks to the makefile, as detailed here - https://bb.oolite.space/viewtopic.ph ... 75#p146976 - it compiles without issue. However, once I start the application I'm presented with the splash screen, which disappears and then nothing... the application just exits quietly without error.
I've run the application through gdb and get the following -
(gdb) run
Starting program: /home/spooky/oolite-build/trunk/oolite.app/oolite.dbg
[New LWP 101048]
[New Thread 806e041c0 (LWP 101048)]
[New Thread 806fb9380 (LWP 100462)]
[New Thread 806e0ae40 (LWP 100938)]
[New Thread 806fbfac0 (LWP 100944)]
[New Thread 806fbf900 (LWP 101013)]
[New Thread 806fbee80 (LWP 101014)]
[New Thread 806fbeb00 (LWP 101016)]
[New Thread 806fbe080 (LWP 101020)]
[New Thread 806fbd0c0 (LWP 101041)]
[New Thread 806fbe940 (LWP 101052)]
[New Thread 806fbcf00 (LWP 101053)]
[New Thread 806fbecc0 (LWP 101055)]
[New Thread 806fbcd40 (LWP 101056)]
[New Thread 806fbcb80 (LWP 101058)]
[Thread 806fb9380 (LWP 100462) exited]
Program exited with code 01.
I've run it through multiple times and on different machines and it always seems to terminate with the 2nd spawned thread exiting. I still have the 1.75.3 code and that builds and executes without issue.
Without any obvious error and no core file to examine, I set a breakpoint at main and single-stepped through the code until I saw the thread in question being spawned.
(gdb) s
OOLogOutputHandlerInit () at src/Core/OOLogOutputHandler.m:127
127 sInited = YES;
(gdb) s
129 if (sLogger != nil)
(gdb) s
131 sWriteToStderr = [[NSUserDefaults standardUserDefaults] boolForKey:@"logging-echo-to-stderr"];
(gdb) s
150 NSRecursiveLock *lock = GSLogLock();
(gdb) s
0x0000000802735760 in pthread_rwlock_trywrlock () from /lib/libthr.so.3
(gdb) info threads
3 Thread 806fb9380 (LWP 100883) 0x000000080273b3cc in __error ()
from /lib/libthr.so.3
* 2 Thread 806e041c0 (LWP 100754) 0x0000000802735760 in pthread_rwlock_trywrlock () from /lib/libthr.so.3
(gdb) thread 3
[Switching to thread 3 (Thread 806fb9380 (LWP 100883))]#0 0x000000080273b3cc in __error () from /lib/libthr.so.3
(gdb) s
Single stepping until exit from function __error,
which has no line number information.
0x0000000802735762 in pthread_rwlock_trywrlock () from /lib/libthr.so.3
(gdb) s
Single stepping until exit from function pthread_rwlock_trywrlock,
which has no line number information.
0x0000000802734f80 in raise () from /lib/libthr.so.3
(gdb) s
Single stepping until exit from function raise,
which has no line number information.
0x0000000802739a80 in pthread_setcancelstate () from /lib/libthr.so.3
(gdb) s
Single stepping until exit from function pthread_setcancelstate,
which has no line number information.
0x0000000802734f9c in raise () from /lib/libthr.so.3
(gdb) s
Single stepping until exit from function raise,
which has no line number information.
0x0000000802735aaa in pthread_rwlock_trywrlock () from /lib/libthr.so.3
(gdb) s
Single stepping until exit from function pthread_rwlock_trywrlock,
which has no line number information.
[New Thread 806e0ae40 (LWP 100408)]
[New Thread 806e04e00 (LWP 101013)]
[New Thread 806e04c40 (LWP 101014)]
[New Thread 806fbfc80 (LWP 101016)]
[New Thread 806fbf900 (LWP 101020)]
[New Thread 806fbee80 (LWP 101041)]
[New Thread 806fbdec0 (LWP 101044)]
[New Thread 806fbf740 (LWP 101052)]
[New Thread 806fbdd00 (LWP 101053)]
[New Thread 806fbfac0 (LWP 101055)]
[New Thread 806fbdb40 (LWP 101056)]
[New Thread 806fbd980 (LWP 101058)]
[Thread 806fb9380 (LWP 100883) exited]
Program exited with code 01.
To my simplistic brain it looks like it's something to do with NSRecursiveLock *lock = GSLogLock(); I've compared the OOLogOutput.m and .h from 1.75.3 and 1.75.4 and they're identical. I then decided to find all the files in Core that reference NSRecursiveLock... and they're all identical too.
Does anybody have any other suggestions?
Spooky
www.int13h.com
Evil Genius
The most merciful thing in all the world is the inability of the human mind to correlate all of its contents.
- JazHaz
- ---- E L I T E ----
- Posts: 2991
- Joined: Tue Sep 22, 2009 11:07 am
- Location: Enfield, Middlesex
Re: A slight non-running issue with 1.75.4
Have you tried launching with -nosplash as a parameter?
Re: A slight non-running issue with 1.75.4
JazHaz wrote: Have you tried launching with -nosplash as a parameter?
Thanks for your reply, I have now. It opens a blank context where the splash screen usually goes and then does exactly the same as before.
Spooky
www.int13h.com
Evil Genius
The most merciful thing in all the world is the inability of the human mind to correlate all of its contents.
- JazHaz
- ---- E L I T E ----
- Posts: 2991
- Joined: Tue Sep 22, 2009 11:07 am
- Location: Enfield, Middlesex
Re: A slight non-running issue with 1.75.4
Have you got OpenGL on that machine?
Re: A slight non-running issue with 1.75.4
JazHaz wrote: Have you got OpenGL on that machine?
Once again I appreciate your assistance, but as I said in my initial post it runs 1.75.3 just fine. The machine has 2 Quadro cards in it and is running the latest (285.09.05) FreeBSD 64-bit ports tree drivers.
Code: Select all
GLX Information for pyro:0.0:
direct rendering: Yes
GLX extensions:
GLX_EXT_visual_info, GLX_EXT_visual_rating, GLX_SGIX_fbconfig,
GLX_SGIX_pbuffer, GLX_SGI_video_sync, GLX_SGI_swap_control,
GLX_EXT_swap_control, GLX_EXT_texture_from_pixmap, GLX_ARB_create_context,
GLX_ARB_create_context_profile, GLX_EXT_create_context_es2_profile,
GLX_ARB_create_context_robustness, GLX_ARB_multisample,
GLX_NV_float_buffer, GLX_ARB_fbconfig_float, GLX_EXT_framebuffer_sRGB,
GLX_ARB_get_proc_address
server glx vendor string: NVIDIA Corporation
server glx version string: 1.4
server glx extensions:
GLX_EXT_visual_info, GLX_EXT_visual_rating, GLX_SGIX_fbconfig,
GLX_SGIX_pbuffer, GLX_SGI_video_sync, GLX_SGI_swap_control,
GLX_EXT_swap_control, GLX_EXT_texture_from_pixmap, GLX_ARB_create_context,
GLX_ARB_create_context_profile, GLX_EXT_create_context_es2_profile,
GLX_ARB_create_context_robustness, GLX_ARB_multisample,
GLX_NV_float_buffer, GLX_ARB_fbconfig_float, GLX_EXT_framebuffer_sRGB
client glx vendor string: NVIDIA Corporation
client glx version string: 1.4
Spooky
www.int13h.com
Evil Genius
The most merciful thing in all the world is the inability of the human mind to correlate all of its contents.
Re: A slight non-running issue with 1.75.4
It looks like I've managed to finally track down the problem.
WARNING your program is becoming multi-threaded, but you are using an ObjectiveC runtime library which does not have a thread-safe implementation of the +initialize method. This means that any classes not already used may be incorrectly initialised, potentially causing strange behaviors and crashes.
To put this into context, the runtime bug has been knoown for several years and only rarely causes problems ... the easy workaround being to ensure that any classes used by a new thread have already been used in the main thread before the new thread starts.
If you are worried, please build/run GNUstep with a runtime which supports the +initialize method. The GNUstep stable runtime (libobjc) and experimental runtime (libobjc2), available from the GNUstep website and subversion repository, should both work.
To disable this warning (eg. for an application which does not suffer any problems caused by this runtime bug), please set the GSSilenceInitializeWarning user default to YES.
Long and short, Oolite's multithreading changed after 1.75.3 and the FreeBSD ports build of GNUStep is at best outdated or at worst broken. I'll try and contact the maintainer; however, this means an update to the official ports build of Oolite is extremely unlikely.
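For reference, the workaround that the GNUstep warning describes (making sure classes have been used on the main thread before a new thread starts) might look roughly like the sketch below. It is only an illustration under stated assumptions: ExampleWorker and WarmUpClasses are hypothetical names, the list of classes is a guess, and none of this is taken from Oolite's source or known to cure the exit described in this thread.

```objc
#import <Foundation/Foundation.h>

// Hypothetical worker class, for illustration only (not Oolite code).
@interface ExampleWorker : NSObject
- (void)run:(id)unused;
@end

@implementation ExampleWorker
- (void)run:(id)unused
{
    // Each secondary thread needs its own autorelease pool.
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
    NSRecursiveLock *lock = [[NSRecursiveLock alloc] init];
    [lock lock];
    NSLog(@"worker running");
    [lock unlock];
    [lock release];
    [pool release];
}
@end

// Hypothetical helper: call it on the main thread before starting threads.
static void WarmUpClasses(void)
{
    // Sending any message to a class makes the runtime run +initialize
    // first, so it happens here rather than racing on a new thread.
    // The classes listed are assumptions, chosen to match the worker above.
    (void)[NSRecursiveLock class];
    (void)[ExampleWorker class];
}

int main(void)
{
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];

    WarmUpClasses();

    ExampleWorker *worker = [[ExampleWorker alloc] init];
    [NSThread detachNewThreadSelector:@selector(run:)
                             toTarget:worker
                           withObject:nil];

    // Crude wait so the worker gets a chance to run before the process exits.
    [NSThread sleepUntilDate:[NSDate dateWithTimeIntervalSinceNow:0.5]];

    [worker release];
    [pool release];
    return 0;
}
```

The sketch uses manual retain/release and NSAutoreleasePool because the GCC 4.2.1 toolchain discussed here predates ARC.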
Spooky
www.int13h.com
Evil Genius
The most merciful thing in all the world is the inability of the human mind to correlate all of its contents.
- JensAyton
- Grand Admiral Emeritus
- Posts: 6657
- Joined: Sat Apr 02, 2005 2:43 pm
- Location: Sweden
Re: A slight non-running issue with 1.75.4
FreeBSD currently uses GCC 4.2.1 from 2007, because of a policy decision not to use GPLv3 software. The Objective-C runtime library is provided with the compiler, which is why you have an old runtime and this warning message.
My understanding is that FreeBSD will soon be switching to Clang as the default compiler. With this change, the new GNUstep runtime (libobjc2) will presumably be the default for GNUstep under FreeBSD, and the +initialize problem will go away. I don’t know what the time frame for this is. It is possible to build with Clang and libobjc2 now, but I don’t know the details and don’t know if a port can be built that way.
All that said, it’s not immediately obvious that the crash you described above is actually due to the +initialize issue, which would most likely cause deadlocks or double initializations.
E-mail: [email protected]
Re: A slight non-running issue with 1.75.4
That same error message is harmlessly (AFAICT) displayed on various Linux boxes as well. That's not to say it's harmless in your case though!
The glass is twice as big as it needs to be.
Re: A slight non-running issue with 1.75.4
Ahruman wrote: FreeBSD currently uses GCC 4.2.1 from 2007, because of a policy decision not to use GPLv3 software. The Objective-C runtime library is provided with the compiler, which is why you have an old runtime and this warning message.
Indeed, however I had to "adjust the port" to build the development release of the GNUstep port just to get those warnings to fire correctly. Hence my statement "[the] FreeBSD ports build of GNUStep is at best outdated or at worst broken". If it was being caused by just the bundled libobjc being obsolete, my changing of the GNUstep version would make no odds.
Ahruman wrote: My understanding is that FreeBSD will soon be switching to Clang as the default compiler. With this change, the new GNUstep runtime (libobjc2) will presumably be the default for GNUstep under FreeBSD, and the +initialize problem will go away. I don’t know what the time frame for this is. It is possible to build with Clang and libobjc2 now, but I don’t know the details and don’t know if a port can be built that way.
I have a 9.0-PRERELEASE machine which is self-hosting on Clang and I assume has libobjc2 (I will check later); currently, however, that's failing on the bundled libjs_static.a linking due to a hidden symbol not being defined. I'll try and get that sorted before starting a conversation about Clang and GNUstep.
The key points to my posts are that 1.75.3 compiles and runs fine with GCC 4.2.1 and the standard Ports version of gnustep-base 1.19.3. Something was changed in 1.75.4 which adversely affects multi-threading on the older version of GNUstep. 1.75.3, 1.75.4 and 1.76 all compile and run fine with GCC 4.2.1 when you have the development port of gnustep-base 1.22.0 installed.
If you could give me some pointers on what may have changed to cause this incompatibility that would be grand, however I'd rather FreeBSD's GNUstep port was more current and verbose with warnings.
Thanks,
Spooky
www.int13h.com
Evil Genius
The most merciful thing in all the world is the inability of the human mind to correlate all of its contents.
- JensAyton
- Grand Admiral Emeritus
- Posts: 6657
- Joined: Sat Apr 02, 2005 2:43 pm
- Location: Sweden
Re: A slight non-running issue with 1.75.4
Spooky wrote: If you could give me some pointers on what may have changed to cause this incompatibility that would be grand,
I really have no idea. :-/
E-mail: [email protected]
- JensAyton
- Grand Admiral Emeritus
- Posts: 6657
- Joined: Sat Apr 02, 2005 2:43 pm
- Location: Sweden
Re: A slight non-running issue with 1.75.4
Possible workaround: in OOLogOutputHandler.m, comment out this part (from line 149):
Code: Select all
NSRecursiveLock *lock = GSLogLock();
[lock lock];
_NSLog_printf_handler = OONSLogPrintfHandler;
[lock unlock];
This should stop the specific crash, and that code isn’t very important (it intercepts NSLog messages from within GNUstep and adds them to Oolite’s log). However, I wouldn’t be surprised if the same underlying problem causes a crash somewhere else.
E-mail: [email protected]
Re: A slight non-running issue with 1.75.4
Thanks for the reply,
I reverted back to gnustep-base 1.19.3, made your suggested changes and recompiled. I'm afraid it behaves exactly as before.
I'll try and get a clang build sorted today, in the long term that will be more desirable.
Spooky
www.int13h.com
Evil Genius
The most merciful thing in all the world is the inability of the human mind to correlate all of its contents.