Oolite Linux potentially unresponsive (stuck, crashed)
Posted: Mon Jul 03, 2023 9:22 pm
by hiran
On Linux, rather than running Oolite directly, users will run the wrapper script. Its job is to configure the library path and active directory before running the real Oolite binary. This wrapper script is also registered for use by application launchers (that would be the equivalent of a Desktop icon or start menu icon in Windows).
But this wrapper script seems to do more than just prepare environment variables:
- it copies files that need to exist for a stable run. Should that not happen in the installer?
- it detects Oolite terminates with a nonzero exit code. Would not be too bad, but all of a sudden that script goes interactive. See
https://github.com/OoliteProject/oolite ... te.src#L73
So imagine Oolite is started via the launcher. At that time we want to see the splash screen and shortly after the game screen. No terminal window is expected, and indeed no terminal window is showing up. Now if at that stage Oolite crashes, the process terminates and the GUI will indicate that the process has stopped. No it does not. The process of the wrapper script does not terminate. Instead it displays a message on stdout and waits for user input. At which point is the user expected to really notice?
This is a discrepancy we should solve differently. Any ideas?
Re: Oolite Linux potentially unresponsive (stuck, crashed)
Posted: Thu Jul 06, 2023 7:48 pm
by hiran
Hmmm. No one concerned about that? Well, if I am the only one thinking in that direction then maybe I have to come up with a solution....
So my idea is this:
If Oolite is considered a UI based application, there should not be error messages on stdout or stderr.
But what if something is going seriously wrong? So wrong that nothing can be displayed in a window. So wrong that even logging into the logfile just does not work?
There are mechanisms that all processes in a computer system have available: stdout, stderr and a return code.
So yes, place an error message onto these output streams. And raise the return code so that it is not zero (which usually indicates success).
The Oolite application is behaving that way, and it is good.
But this wrapper script, boy, it is changing more than it should. So I intend to clean up a bit there. Either it makes the error visible graphically, or it should just not change the behaviour, meaning the error is on the output stream and the exit code does not get tampered with. That way whoever called the wrapper script can decide what to do.
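As a minimal sketch of the "do not change the behaviour" option: the helper below runs a command with a library path set only for that command and hands the exit code back untouched. The function name run_preserving_rc and the example directory are hypothetical and not taken from the actual wrapper.

```shell
#!/bin/bash
# Hypothetical helper: run a command with a library path that applies to
# that command only, and return its exit code unchanged to the caller.
run_preserving_rc() {
    local libdir=$1
    shift
    # The assignment prefix affects only the environment of this command.
    LD_LIBRARY_PATH="$libdir" "$@"
    local rc=$?
    if [ "$rc" -ne 0 ]; then
        # Report on stderr, but do not wait for input and do not alter rc.
        echo "command exited with code $rc" >&2
    fi
    return "$rc"
}
```

A wrapper built this way would end with something like `run_preserving_rc "$OOLITE_ROOT/oolite-deps/lib" ./oolite.app/oolite "$@"; exit $?`, so a launcher sees the same exit code Oolite produced.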
I am open for feedback but willing to take on that change. So feel free to comment!
Re: Oolite Linux potentially unresponsive (stuck, crashed)
Posted: Fri Jul 07, 2023 2:08 am
by Commander_X
From what I can tell, the "cat" command is used as a sort of "pause" command, as in a command prompt batch file.
Instead of advising to press "Ctrl-C to continue" and using "cat", we can advise to press "any key to continue" and use (e.g.) the "read" command.
This will wait for a character input (for a timed-out period) and continue the execution of the script. The timeout can either be passed to the read command as a "-t 20" parameter (for a 20-second timeout), or set via the TMOUT variable somewhere in the script. Thus the script would time out even if the expected input never arrives.
Another way to pause the script execution for a period is to use the "sleep" command.
I wouldn't remove that message and the option it provides, as inexperienced users can be directed to launch the game with a "Run in terminal" option enabled (or even launch from the terminal), and capture the error message more easily.
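A rough sketch of such a pause, assuming bash (the -t, -s and -n options of read are bash features, not POSIX sh). The function name pause_with_timeout and the 20-second default are only illustrative.

```shell
#!/bin/bash
# Hypothetical helper: wait for a single key press, but give up after the
# given number of seconds (default 20) so the script never blocks forever.
pause_with_timeout() {
    local seconds=${1:-20}
    local junk
    echo "(Press any key to continue or wait for ${seconds} seconds)"
    # -t: timeout in seconds, -r: no backslash handling,
    # -s: do not echo the key, -n 1: return after one character
    read -t "$seconds" -r -s -n 1 junk
    # read returns non-zero on timeout or end of input; that is not an error here.
    return 0
}
```

When there is no terminal attached (for example stdin is /dev/null), read returns immediately, so the pause does not hang in that case either.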
Re: Oolite Linux potentially unresponsive (stuck, crashed)
Posted: Fri Jul 07, 2023 5:33 am
by hiran
Commander_X wrote: ↑Fri Jul 07, 2023 2:08 am
From what I can tell, the "cat" command is used as a sort of "pause" command, as in a command prompt batch file.
Instead of advising to press "Ctrl-C to continue" and using "cat", we can advise to press "any key to continue" and use (e.g.) the "read" command.
This will wait for a character input (for a timed-out period) and continue the execution of the script. The timeout can either be passed to the read command as a "-t 20" parameter (for a 20-second timeout), or set via the TMOUT variable somewhere in the script. Thus the script would time out even if the expected input never arrives.
Another way to pause the script execution for a period is to use the "sleep" command.
I wouldn't remove that message and the option it provides, as inexperienced users can be directed to launch the game with a "Run in terminal" option enabled (or even launch from the terminal), and capture the error message more easily.
If you run from a terminal the messages are visible anyway. Thus I'd only understand halting the script if that kept a window open, which is not the case in my setup. We are talking Linux here. Would it not make sense to use notify-send to show a graphical message? Oolite would hardly be started without a desktop environment, I guess.
Yet the idea of a timeout seems to be a compromise to serve both worlds. But how long should that be? I do use a different wrapper (OoliteStarter) and do not want to wait 20 seconds (or any other unnecessary delay) without any feedback about what happened. How long would you wait before clicking again?
Would it help if we either configure the timeout via command line or split the script into two?
One part would care about showing an error, the other one would just ensure Oolite is invoked with all the necessary tweaks? I would be calling the latter only while users can do what they always did...
oolite-wrapper ( second-wrapper ( oolite ) )
Re: Oolite Linux potentially unresponsive (stuck, crashed)
Posted: Fri Jul 07, 2023 3:07 pm
by Commander_X
hiran wrote: ↑Fri Jul 07, 2023 5:33 am
[...]
Would it not make sense to use notify-send to show a graphical message?
[...]
Never too late to learn something (first time I hear about this command). I was going to argue that notify-send might be a desktop environment specific command, and I'm not yet 100% sure it's not, but it covers xfce and gnome desktops, at least. KDE-only or other non-gtk+ desktops might need some research.
hiran wrote: ↑Fri Jul 07, 2023 5:33 am
[...]
Would it help if we either configure the timeout via command line or split the script into two?
One part would care about showing an error, the other one would just ensure Oolite is invoked with all the necessary tweaks? I would be calling the latter only while users can do what they always did...
oolite-wrapper ( second-wrapper ( oolite ) )
The timeout for the read command I suggested would be set through the command line, e.g. either
or
Splitting the script won't solve too much. What would be left hanging after Oolite crashes would still be a shell session waiting for that input to come.
A timeout would be useful because that shell will end on its own, without the need to identify it and do a kill. Mentioning the timeout in the message to the users would also be a useful option.
Re: Oolite Linux potentially unresponsive (stuck, crashed)
Posted: Fri Jul 07, 2023 3:45 pm
by hiran
Ok, let me make a concise example that we then can discuss.
oolite-wrapper.sh
Code:
#!/bin/bash
# default timeout in seconds, so that read below always gets a value
TIMEOUT=20
# take the first parameter as timeout value - should be done more clever
if [ "$1" = "-t" ]
then
    TIMEOUT=${2:-20}
    shift 2
fi
# find my own directory
SOURCE=${BASH_SOURCE[0]}
while [ -L "$SOURCE" ]; do # resolve $SOURCE until the file is no longer a symlink
    DIR=$( cd -P "$( dirname "$SOURCE" )" >/dev/null 2>&1 && pwd )
    SOURCE=$(readlink "$SOURCE")
    [[ $SOURCE != /* ]] && SOURCE=$DIR/$SOURCE # if $SOURCE was a relative symlink, we need to resolve it relative to the path where the symlink file was located
done
DIR=$( cd -P "$( dirname "$SOURCE" )" >/dev/null 2>&1 && pwd )
# invoke the second wrapper and evaluate result
# ("$@" passes all remaining parameters on unchanged, even if they contain spaces)
"${DIR}/second-wrapper.sh" "$@"
RC=$?
if [ "${RC}" != "0" ]
then
    echo
    echo "Erk. It looks like Oolite${TRUNK} died with an error. When making an error"
    echo "report, please copy + paste the log above into the report."
    echo
    echo "(Press any key to continue or wait for ${TIMEOUT} seconds)"
    read -t "${TIMEOUT}" -r -s -N 1 junk
fi
# important: preserve the exit code for the caller
exit ${RC}
second-wrapper.sh
Code:
#!/bin/bash
# find my own directory
SOURCE=${BASH_SOURCE[0]}
while [ -L "$SOURCE" ]; do # resolve $SOURCE until the file is no longer a symlink
    DIR=$( cd -P "$( dirname "$SOURCE" )" >/dev/null 2>&1 && pwd )
    SOURCE=$(readlink "$SOURCE")
    [[ $SOURCE != /* ]] && SOURCE=$DIR/$SOURCE # if $SOURCE was a relative symlink, we need to resolve it relative to the path where the symlink file was located
done
DIR=$( cd -P "$( dirname "$SOURCE" )" >/dev/null 2>&1 && pwd )
# perform necessary cleanup
# set environment variables, especially LD_LIBRARY_PATH
# run Oolite
# ("$@" passes all parameters on unchanged, even if they contain spaces)
"${DIR}/oolite" "$@"
RC=$?
if [ "${RC}" != "0" ]
then
    # according to https://specifications.freedesktop.org/notification-spec/latest/index.html
    # the urgency level should be sufficient for the notification to wait for acknowledgement
    notify-send --app-name=Oolite --urgency=critical "Oolite Singularity Occurred" "Oolite crashed dismally. If you want to troubleshoot, open a terminal and run ${DIR}/oolite-wrapper.sh"
fi
# important: preserve the exit code for the caller
exit ${RC}
So the usage, as I indicated above would look like this:
oolite-wrapper ( second-wrapper ( oolite ) )
With that, the behaviour for an ordinary user would be exactly the same as before; I just added a desktop notification that waits to be acknowledged.
For me as an advanced user, I could run the second wrapper directly and still enjoy the path fixtures.
Re: Oolite Linux potentially unresponsive (stuck, crashed)
Posted: Fri Jul 07, 2023 6:20 pm
by Commander_X
You can always do with a command line parameter for a single script in the .desktop file, that only sends the notification and bypasses the input request when it's used, and does both when it's not.
Re: Oolite Linux potentially unresponsive (stuck, crashed)
Posted: Fri Jul 07, 2023 6:57 pm
by hiran
Commander_X wrote: ↑Fri Jul 07, 2023 6:20 pm
You can always do with a command line parameter for a single script in the .desktop file, that only sends the notification and bypasses the input request when it's used, and does both when it's not.
Is that idea instead of or on top of my draft?
Re: Oolite Linux potentially unresponsive (stuck, crashed)
Posted: Fri Jul 07, 2023 10:07 pm
by Commander_X
hiran wrote: ↑Fri Jul 07, 2023 6:57 pm
[...]
Is that idea instead of or on top of my draft?
Instead. The current single script will have to suppress the input command (cat or replace it with read) part when the "desktop" parameter is passed, and allow it only when run without. In the same script you'll have to also implement the notification (to be run always)
Re: Oolite Linux potentially unresponsive (stuck, crashed)
Posted: Fri Jul 07, 2023 10:09 pm
by hiran
Commander_X wrote: ↑Fri Jul 07, 2023 10:07 pm
hiran wrote: ↑Fri Jul 07, 2023 6:57 pm
[...]
Is that idea instead of or on top of my draft?
Instead. The current single script will have to suppress the input command (cat or replace it with read) part when the "desktop" parameter is passed, and allow it only when run without. In the same script you'll have to also implement the notification (to be run always)
I tested running a single script, and in there I could not even get notify-send to work.
Could you make this a complete example?
Re: Oolite Linux potentially unresponsive (stuck, crashed)
Posted: Fri Jul 07, 2023 10:21 pm
by Commander_X
Replacing this sequence (starting at line 66 from what I can tell from your github link above):
Code:
if [ $? != 0 ]
then
echo
echo "Erk. It looks like Oolite${TRUNK} died with an error. When making an error"
echo "report, please copy + paste the log above into the report."
echo
echo "(Press Ctrl-C to continue)"
cat
fi
with this
Code:
if [ $? != 0 ]
then
notify-send --app-name=Oolite --urgency=critical "Oolite Singularity Occurred" "Oolite crashed dismally. If you want to troubleshoot, open a terminal and run ${DIR}/oolite"
if [ "x$1" == "x" ]
then
echo
echo "Erk. It looks like Oolite${TRUNK} died with an error. When making an error"
echo "report, please copy + paste the log above into the report."
echo
echo "(Press Ctrl-C to continue)"
cat
fi
fi
should do the trick.
Re: Oolite Linux potentially unresponsive (stuck, crashed)
Posted: Fri Jul 07, 2023 10:25 pm
by hiran
Commander_X wrote: ↑Fri Jul 07, 2023 10:21 pm
Replacing [...] should do the trick.
Did you try it or is it as theoretical as my above example? For me it did not work, as shared libraries were not found.
Code:
demo@OoliteDemo:~/GNUstep/Applications/Oolite-trunk$ ./oolite-trunk
./oolite.app/oolite: error while loading shared libraries: libgnustep-base.so.1.28: cannot open shared object file: No such file or directory
notify-send: /home/demo/GNUstep/Applications/Oolite-trunk/oolite-deps/lib/libz.so.1: version `ZLIB_1.2.9' not found (required by /lib/x86_64-linux-gnu/libpng16.so.16)
/home/demo/GNUstep/Applications/Oolite-trunk/oolite.app/oolite-wrapper: 74: [: x: unexpected operator
demo@OoliteDemo:~/GNUstep/Applications/Oolite-trunk$
Code:
demo@OoliteDemo:~/GNUstep/Applications/Oolite-trunk$ bash -x oolite.app/oolite-wrapper
+ TRUNK=-trunk
+++ dirname oolite.app/oolite-wrapper
++ cd oolite.app
++ cd ..
++ pwd -P
+ OOLITE_ROOT=/home/demo/GNUstep/Applications/Oolite-trunk
+ '[' '!' -f /home/demo/.Oolite/.oolite-trunk-run ']'
+ '[' '!' -d /home/demo/GNUstep/Library/DTDs ']'
++ uname -m
++ sed -e s/amd64/x86_64/
+ HOST_ARCH=x86_64
+ '[' x86_64 = x86_64 ']'
+ export GNUSTEP_HOST=x86_64-pc-linux-gnu
+ GNUSTEP_HOST=x86_64-pc-linux-gnu
+ export GNUSTEP_HOST_CPU=x86_64
+ GNUSTEP_HOST_CPU=x86_64
+ export GNUSTEP_FLATTENED=yes
+ GNUSTEP_FLATTENED=yes
+ export GNUSTEP_HOST_OS=linux-gnu
+ GNUSTEP_HOST_OS=linux-gnu
+ export GNUSTEP_HOST_VENDOR=pc
+ GNUSTEP_HOST_VENDOR=pc
+ export LD_LIBRARY_PATH=/home/demo/GNUstep/Applications/Oolite-trunk/oolite-deps/lib
+ LD_LIBRARY_PATH=/home/demo/GNUstep/Applications/Oolite-trunk/oolite-deps/lib
+ export ESPEAK_DATA_PATH=/home/demo/GNUstep/Applications/Oolite-trunk/oolite.app/Resources/
+ ESPEAK_DATA_PATH=/home/demo/GNUstep/Applications/Oolite-trunk/oolite.app/Resources/
+ cd /home/demo/GNUstep/Applications/Oolite-trunk/
+ ./oolite.app/oolite
./oolite.app/oolite: error while loading shared libraries: libgnustep-base.so.1.28: cannot open shared object file: No such file or directory
+ '[' 127 '!=' 0 ']'
+ notify-send --app-name=Oolite --urgency=critical 'Oolite Singularity Occurred' 'Oolite crashed dismally. If you want to troubleshoot, open a terminal and run /oolite'
notify-send: /home/demo/GNUstep/Applications/Oolite-trunk/oolite-deps/lib/libz.so.1: version `ZLIB_1.2.9' not found (required by /lib/x86_64-linux-gnu/libpng16.so.16)
+ '[' x == x ']'
+ echo
+ echo 'Erk. It looks like Oolite-trunk died with an error. When making an error'
Erk. It looks like Oolite-trunk died with an error. When making an error
+ echo 'report, please copy + paste the log above into the report.'
report, please copy + paste the log above into the report.
+ echo
+ echo '(Press Ctrl-C to continue)'
(Press Ctrl-C to continue)
+ cat
...it is still running, as it is blocked.
Code:
demo@OoliteDemo:~/GNUstep/Applications/Oolite-trunk$ bash -x oolite.app/oolite-wrapper x
+ TRUNK=-trunk
+++ dirname oolite.app/oolite-wrapper
++ cd oolite.app
++ cd ..
++ pwd -P
+ OOLITE_ROOT=/home/demo/GNUstep/Applications/Oolite-trunk
+ '[' '!' -f /home/demo/.Oolite/.oolite-trunk-run ']'
+ '[' '!' -d /home/demo/GNUstep/Library/DTDs ']'
++ uname -m
++ sed -e s/amd64/x86_64/
+ HOST_ARCH=x86_64
+ '[' x86_64 = x86_64 ']'
+ export GNUSTEP_HOST=x86_64-pc-linux-gnu
+ GNUSTEP_HOST=x86_64-pc-linux-gnu
+ export GNUSTEP_HOST_CPU=x86_64
+ GNUSTEP_HOST_CPU=x86_64
+ export GNUSTEP_FLATTENED=yes
+ GNUSTEP_FLATTENED=yes
+ export GNUSTEP_HOST_OS=linux-gnu
+ GNUSTEP_HOST_OS=linux-gnu
+ export GNUSTEP_HOST_VENDOR=pc
+ GNUSTEP_HOST_VENDOR=pc
+ export LD_LIBRARY_PATH=/home/demo/GNUstep/Applications/Oolite-trunk/oolite-deps/lib
+ LD_LIBRARY_PATH=/home/demo/GNUstep/Applications/Oolite-trunk/oolite-deps/lib
+ export ESPEAK_DATA_PATH=/home/demo/GNUstep/Applications/Oolite-trunk/oolite.app/Resources/
+ ESPEAK_DATA_PATH=/home/demo/GNUstep/Applications/Oolite-trunk/oolite.app/Resources/
+ cd /home/demo/GNUstep/Applications/Oolite-trunk/
+ ./oolite.app/oolite x
./oolite.app/oolite: error while loading shared libraries: libgnustep-base.so.1.28: cannot open shared object file: No such file or directory
+ '[' 127 '!=' 0 ']'
+ notify-send --app-name=Oolite --urgency=critical 'Oolite Singularity Occurred' 'Oolite crashed dismally. If you want to troubleshoot, open a terminal and run /oolite'
notify-send: /home/demo/GNUstep/Applications/Oolite-trunk/oolite-deps/lib/libz.so.1: version `ZLIB_1.2.9' not found (required by /lib/x86_64-linux-gnu/libpng16.so.16)
+ '[' xx == x ']'
+ exit 0
demo@OoliteDemo:~/GNUstep/Applications/Oolite-trunk$
But you can see notify-send did not run successfully: it failed on a library version mismatch.
We need something better than that.
Re: Oolite Linux potentially unresponsive (stuck, crashed)
Posted: Fri Jul 07, 2023 11:31 pm
by Commander_X
This
Code:
notify-send: /home/demo/GNUstep/Applications/Oolite-trunk/oolite-deps/lib/libz.so.1: version `ZLIB_1.2.9' not found (required by /lib/x86_64-linux-gnu/libpng16.so.16)
would certainly require running notify-send this way
Code:
LD_LIBRARY_PATH=/usr/lib64 notify-send --app-name=Oolite --urgency=critical "Oolite Singularity Occurred" "Oolite crashed dismally. If you want to troubleshoot, open a terminal and run ${DIR}/oolite"
But you're also getting
Code:
./oolite.app/oolite: error while loading shared libraries: libgnustep-base.so.1.28: cannot open shared object file: No such file or directory
which means your oolite binary was compiled against the local version of libgnustep-base, which differs from the libgnustep-base.so.1.20 distributed as an x86_64 dependency in the "deps" folder for Linux.
It's very likely you'll need to check if a local version of the library exists before placing the distributed one in "/home/demo/GNUstep/Applications/Oolite-trunk/oolite-deps/lib". If it exists, then remove the one in your runtime deps folder.
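One way that check could look, as a sketch only: the helper names system_has_lib and prune_bundled_lib are made up for illustration, and which system directories to search is an assumption that depends on the distribution.

```shell
#!/bin/bash
# Hypothetical helper: succeed if a file whose name starts with NAME exists
# in any of the given directories.
system_has_lib() {
    local name=$1
    shift
    local dir f
    for dir in "$@"; do
        for f in "$dir/$name"*; do
            [ -e "$f" ] && return 0
        done
    done
    return 1
}

# Hypothetical helper: remove the bundled copies of a library from the deps
# folder, but only if the system directories already provide that library.
prune_bundled_lib() {
    local deps_dir=$1 name=$2
    shift 2
    if system_has_lib "$name" "$@"; then
        rm -f "$deps_dir/$name"*
    fi
}
```

It might be called as `prune_bundled_lib "$OOLITE_ROOT/oolite-deps/lib" libgnustep-base.so /usr/lib/x86_64-linux-gnu /usr/lib64`, where the two system directories are guesses and would need adjusting per distribution.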
Re: Oolite Linux potentially unresponsive (stuck, crashed)
Posted: Sat Jul 08, 2023 4:38 am
by hiran
I agree, that is the right track.
We are looking at two different problems. One at compile/installation time, another at runtime.
In this thread I'd like to focus on the runtime problem. It is the other one that causes Oolite to fail - but that's what we need to display the message.
Also note that LD_LIBRARY_PATH would not be a problem when running notify-send had we not tampered with it for Oolite earlier in the same script. Either we just unset what we modified before, or we separate the functions over two scripts and use the isolation provided by the OS. Going for isolation also keeps the scripts maintainable, as each of them serves a very distinct purpose.
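The "unset what we modified" variant could be sketched like this. The helper names run_with_system_libs and notify_crash are hypothetical; the notify-send options are the ones already used earlier in this thread.

```shell
#!/bin/bash
# Hypothetical helper: run an external command with LD_LIBRARY_PATH removed
# from its environment. The subshell keeps the change away from the caller.
run_with_system_libs() {
    (
        unset LD_LIBRARY_PATH
        exec "$@"
    )
}

# Hypothetical helper: show a crash notification, if notify-send is installed,
# using the system libraries instead of the ones bundled with Oolite.
notify_crash() {
    if command -v notify-send >/dev/null 2>&1; then
        run_with_system_libs notify-send --app-name=Oolite --urgency=critical \
            "Oolite Singularity Occurred" "$1"
    fi
}
```

Because the unset happens inside a subshell, the wrapper's own LD_LIBRARY_PATH stays as it was, so this gives a small part of the isolation that two separate scripts would provide.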
Re: Oolite Linux potentially unresponsive (stuck, crashed)
Posted: Sun Jul 09, 2023 7:45 pm
by hiran
To move forward, I modified the two scripts in question and tested them on my local installation. That worked to my satisfaction.
It actually involves two changes, distributed over two repositories. On top of that, one script is not directly in the repo; it is generated at installation time by the installer.
Please review the PRs and let me know if they are good to be merged.
https://github.com/OoliteProject/oolite/pull/432
https://github.com/OoliteProject/oolite ... ies/pull/2
Also, since I changed oolite-linux-dependencies, I am not sure what additional change should go into the Oolite repo to make it pull the latest submodule.