Oolite Linux potentially unresponsive (stuck, crashed)
Moderators: winston, another_commander, Getafix
- hiran
- Theorethicist
- Posts: 2415
- Joined: Fri Mar 26, 2021 1:39 pm
- Location: a parallel world I created for myself. Some call it a singularity...
Oolite Linux potentially unresponsive (stuck, crashed)
On Linux, rather than running Oolite directly, users will run the wrapper script. Its job is to configure the library path and active directory before running the real Oolite binary. This wrapper script is also registered for use by application launchers (that would be the equivalent of a Desktop icon or start menu icon in Windows).
But this wrapper script seems to perform more than just prepare environment variables:
- it copies files that need to exist for a stable run. Should that not happen at the installer?
- it detects Oolite terminates with a nonzero exit code. Would not be too bad, but all of a sudden that script goes interactive. See https://github.com/OoliteProject/oolite ... te.src#L73
So imagine Oolite is started via the launcher. At that time we want to see the splash screen and shortly after the game screen. No terminal window is expected, and indeed no terminal window is showing up. Now if Oolite crashes at that stage, one would expect the process to terminate and the GUI to indicate that the process has stopped. It does not, because the process of the wrapper script does not terminate. Instead it displays a message on stdout and waits for user input. At which point is the user expected to really notice?
This is a discrepancy we should solve differently. Any ideas?
Sunshine - Moonlight - Good Times - Oolite
- hiran
- Theorethicist
- Posts: 2415
- Joined: Fri Mar 26, 2021 1:39 pm
- Location: a parallel world I created for myself. Some call it a singularity...
Re: Oolite Linux potentially unresponsive (stuck, crashed)
Hmmm. No one concerned about that? Well, if I am the only one thinking in that direction then maybe I have to come up with a solution....
So my idea is this:
If Oolite is considered a UI based application, there should not be error messages on stdout or stderr.
But what if something is going seriously wrong? So wrong that nothing can be displayed in a window. So wrong that even logging into the logfile just does not work?
There are mechanisms that all processes in a computer system have available: stdout, stderr and a return code.
So yes, place an error message onto these output streams. And raise the return code so that it is not zero (which usually indicates success).
The Oolite application is behaving that way, and it is good.
But this wrapper script, boy, it is changing more than it should. So I intend to clean up a bit there. Either it makes the error visible graphically, or it should just not change the behaviour, meaning the error is on the output stream and the exit code does not get tampered with. Just so that whoever called the wrapper script can decide what to do.
I am open to feedback but willing to take on that change. So feel free to comment!
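To illustrate the second option (leave the output streams alone and pass the exit code through), here is a minimal sketch. The names `run_transparent` and `fake_crash` are made up for illustration and are not taken from the actual wrapper script.

```shell
#!/bin/sh
# Hypothetical helper: run a command and hand its exit code back untouched.
run_transparent() {
    "$@"                # stdout and stderr of the command pass straight through
    status=$?           # capture the exit code before anything overwrites it
    return "$status"
}

# Hypothetical stand-in for a crashing program:
# it complains on stderr and exits with code 3.
fake_crash() {
    echo "something went seriously wrong" >&2
    return 3
}
```

A real wrapper would presumably end with `exit "$status"`, so that whoever called it sees the same code the game returned.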
- Commander_X
- ---- E L I T E ----
- Posts: 681
- Joined: Sat Aug 09, 2014 4:16 pm
Re: Oolite Linux potentially unresponsive (stuck, crashed)
From what I can tell, the "cat" command is used as a sort of "pause", like the one in a command prompt batch file.
Instead of advising to press "Ctrl-C to continue" and using "cat", we can advise to press "any key to continue" and use (e.g.)
Code: Select all
read -r -s -N 1 junk
This will wait for a character input (for a timed-out period) and continue the execution of the script. The timeout can either be passed to the read command as a "-t 20" parameter (for a 20 second timeout), or be set through the TMOUT variable somewhere in the script. Thus the script would time out even if the expected input doesn't arrive.
Another way to pause the script execution for a period is to use the "sleep" command.
I wouldn't remove that message and the option it provides, as inexperienced users can be directed to launch the game with a "Run in terminal" option enabled (or even launch from the terminal), and capture the error message more easily.
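As a rough sketch of how the read-with-timeout idea could be packaged (assuming bash, since `-N` and `-t` are bash options of `read`; the function name `pause_with_timeout` is invented for this example):

```shell
#!/bin/bash
# Hypothetical helper: wait for one key press, but give up after a timeout.
# $1 = timeout in seconds (defaults to 20, the example value used above).
pause_with_timeout() {
    local timeout=${1:-20}
    local junk
    echo "(Press any key to continue or wait for ${timeout} seconds)"
    # -t: timeout, -r: keep backslashes, -s: do not echo, -N 1: exactly one character
    read -t "$timeout" -r -s -N 1 junk || true
    return 0    # a timeout or a closed stdin is not treated as an error here
}
```

Called as `pause_with_timeout 20` in place of `cat`, the script would continue on its own once the timeout expires.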
- hiran
- Theorethicist
- Posts: 2415
- Joined: Fri Mar 26, 2021 1:39 pm
- Location: a parallel world I created for myself. Some call it a singularity...
Re: Oolite Linux potentially unresponsive (stuck, crashed)
Commander_X wrote: ↑Fri Jul 07, 2023 2:08 am
From what I can tell, the "cat" command is used as a sort of "pause" command prompt batch file.
Instead of advising to press "Ctrl-C to continue", and use "cat", we can advise to press "any key to continue" and use (e.g.)
Code: Select all
read -r -s -N 1 junk
This will wait for a (timeout-ed period) character input and continue the execution of the script. The timeout can be either passed to the read command as a " -t 20 " parameter (for 20 seconds timeout), or by setting the TMOUT variable somewhere in the script. Thus the script would timeout even if the expected input won't arrive.
Another way to pause the script execution for a period, is to use the "sleep" command.
I wouldn't remove that message and the option it provides, as inexperienced users can be directed to launch the game with a "Run in terminal" option enabled (or even launch from the terminal), and capture easier the error message.
If you run from terminal the messages are visible anyway. Thus I'd only understand to halt the script if it were to keep a window open - which is not the case in my setup. We are talking Linux here. Would it not make sense to use notify-send to show a graphical message? Oolite would hardly be started without a desktop environment I guess.
Yet the idea of a timeout seems to be a compromise that serves both worlds. But how long should that be? I do use a different wrapper (OoliteStarter) and do not want to wait 20 seconds (or any other unnecessary delay) without any feedback about what happened. How long would you wait before clicking again?
Would it help if we either configure the timeout via command line or split the script into two?
One part would care about showing an error, the other one would just ensure Oolite is invoked with all the necessary tweaks? I would be calling the latter only while users can do what they always did...
oolite-wrapper ( second-wrapper ( oolite ) )
- Commander_X
- ---- E L I T E ----
- Posts: 681
- Joined: Sat Aug 09, 2014 4:16 pm
Re: Oolite Linux potentially unresponsive (stuck, crashed)
Never too late to learn something (first time I hear about this command). I was going to argue that notify-send might be a desktop environment specific command, and I'm not yet 100% sure it's not, but it covers xfce and gnome desktops, at least. KDE only, or other non gtk+ desktops might need some research.
hiran wrote: ↑Fri Jul 07, 2023 5:33 am
[...]
Would it help if we either configure the timeout via command line or split the script into two?
One part would care about showing an error, the other one would just ensure Oolite is invoked with all the necessary tweaks? I would be calling the latter only while users can do what they always did...
oolite-wrapper ( second-wrapper ( oolite ) )
The timeout for the read command I suggested would be set through the command line, e.g. either
Code: Select all
read -t 20 -r -s -N 1 junk
or
Code: Select all
TMOUT=20 read -r -s -N 1 junk
A timeout would be useful because that shell will end on its own, without the need to identify it, and do a kill. Mentioning the timeout in the message to the users would also be a useful option.
- hiran
- Theorethicist
- Posts: 2415
- Joined: Fri Mar 26, 2021 1:39 pm
- Location: a parallel world I created for myself. Some call it a singularity...
Re: Oolite Linux potentially unresponsive (stuck, crashed)
Ok, let me make a concise example that we then can discuss.
So the usage, as I indicated above would look like this:
oolite-wrapper.sh
Code: Select all
#!/bin/bash
# take the first parameter as timeout value - should be done more cleverly
TIMEOUT=20 # default, so the read below always gets a valid value
if [ "$1" = "-t" ]
then
    TIMEOUT=${2:-20}
    shift 2 || shift # a bare "-t" without a value still gets removed
fi
# find my own directory
SOURCE=${BASH_SOURCE[0]}
while [ -L "$SOURCE" ]; do # resolve $SOURCE until the file is no longer a symlink
    DIR=$( cd -P "$( dirname "$SOURCE" )" >/dev/null 2>&1 && pwd )
    SOURCE=$(readlink "$SOURCE")
    [[ $SOURCE != /* ]] && SOURCE=$DIR/$SOURCE # if $SOURCE was a relative symlink, we need to resolve it relative to the path where the symlink file was located
done
DIR=$( cd -P "$( dirname "$SOURCE" )" >/dev/null 2>&1 && pwd )
# invoke the second wrapper and evaluate result
"${DIR}/second-wrapper.sh" "$@" # "$@" keeps arguments containing spaces intact
RC=$?
if [ "${RC}" != "0" ]
then
    echo
    echo "Erk. It looks like Oolite${TRUNK} died with an error. When making an error"
    echo "report, please copy + paste the log above into the report."
    echo
    echo "(Press any key to continue or wait for ${TIMEOUT} seconds)"
    read -t "${TIMEOUT}" -r -s -N 1 junk
fi
# important: preserve the exit code for the caller
exit ${RC}
second-wrapper.sh
Code: Select all
#!/bin/bash
# find my own directory
SOURCE=${BASH_SOURCE[0]}
while [ -L "$SOURCE" ]; do # resolve $SOURCE until the file is no longer a symlink
    DIR=$( cd -P "$( dirname "$SOURCE" )" >/dev/null 2>&1 && pwd )
    SOURCE=$(readlink "$SOURCE")
    [[ $SOURCE != /* ]] && SOURCE=$DIR/$SOURCE # if $SOURCE was a relative symlink, we need to resolve it relative to the path where the symlink file was located
done
DIR=$( cd -P "$( dirname "$SOURCE" )" >/dev/null 2>&1 && pwd )
# perform necessary cleanup
# set environment variables, especially LD_LIBRARY_PATH
# run Oolite
"${DIR}/oolite" "$@" # "$@" keeps arguments containing spaces intact
RC=$?
if [ "${RC}" != "0" ]
then
    # according to https://specifications.freedesktop.org/notification-spec/latest/index.html
    # the urgency level should be sufficient for the notification to wait for acknowledgement
    notify-send --app-name=Oolite --urgency=critical "Oolite Singularity Occurred" "Oolite crashed dismally. If you want to troubleshoot, open a terminal and run ${DIR}/oolite-wrapper.sh"
fi
# important: preserve the exit code for the caller
exit ${RC}
oolite-wrapper ( second-wrapper ( oolite ) )
With that, the behaviour for an ordinary user would be exactly the same as before, I just added a desktop notification that waits to be acknowledged.
For me as an advanced user, I could run the second wrapper directly and still enjoy the path fixes.
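The comment "should be done more clever" in the first script could be addressed along these lines. This is only a sketch: the function name `parse_wrapper_args` and the variable `SHIFT_COUNT` are invented here, and the rule that the timeout must be a plain number of seconds is my assumption.

```shell
#!/bin/sh
# Hypothetical argument handling for the outer wrapper.
# Sets TIMEOUT (seconds) and SHIFT_COUNT (number of arguments consumed).
parse_wrapper_args() {
    TIMEOUT=20          # default when no -t option is given
    SHIFT_COUNT=0
    if [ "${1:-}" = "-t" ]; then
        case "${2:-}" in
            ''|*[!0-9]*)
                # missing or non-numeric value: refuse instead of guessing
                echo "usage: -t SECONDS (got '${2:-}')" >&2
                return 2
                ;;
        esac
        TIMEOUT=$2
        SHIFT_COUNT=2
    fi
    return 0
}
```

In the wrapper this would be used as `parse_wrapper_args "$@" || exit 2` followed by `shift "$SHIFT_COUNT"`, so that all remaining arguments still reach Oolite unchanged.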
- Commander_X
- ---- E L I T E ----
- Posts: 681
- Joined: Sat Aug 09, 2014 4:16 pm
Re: Oolite Linux potentially unresponsive (stuck, crashed)
You can always do with a command line parameter for a single script in the .desktop file, that only sends the notification and bypasses the input request when it's used, and does both when it's not.
- hiran
- Theorethicist
- Posts: 2415
- Joined: Fri Mar 26, 2021 1:39 pm
- Location: a parallel world I created for myself. Some call it a singularity...
Re: Oolite Linux potentially unresponsive (stuck, crashed)
Commander_X wrote: ↑Fri Jul 07, 2023 6:20 pm
You can always do with a command line parameter for a single script in the .desktop file, that only sends the notification and bypasses the input request when it's used, and does both when it's not.
Is that idea instead of or on top of my draft?
- Commander_X
- ---- E L I T E ----
- Posts: 681
- Joined: Sat Aug 09, 2014 4:16 pm
Re: Oolite Linux potentially unresponsive (stuck, crashed)
Instead. The current single script will have to suppress the input command (cat or replace it with read) part when the "desktop" parameter is passed, and allow it only when run without. In the same script you'll have to also implement the notification (to be run always)
- hiran
- Theorethicist
- Posts: 2415
- Joined: Fri Mar 26, 2021 1:39 pm
- Location: a parallel world I created for myself. Some call it a singularity...
Re: Oolite Linux potentially unresponsive (stuck, crashed)
Commander_X wrote: ↑Fri Jul 07, 2023 10:07 pm
Instead. The current single script will have to suppress the input command (cat or replace it with read) part when the "desktop" parameter is passed, and allow it only when run without. In the same script you'll have to also implement the notification (to be run always)
I tested running a single script, and in there I could not even get notify-send to work. Could you make this a complete example?
- Commander_X
- ---- E L I T E ----
- Posts: 681
- Joined: Sat Aug 09, 2014 4:16 pm
Re: Oolite Linux potentially unresponsive (stuck, crashed)
Replacing this sequence (starting at line 66 from what I can tell from your github link above):
Code: Select all
if [ $? != 0 ]
then
    echo
    echo "Erk. It looks like Oolite${TRUNK} died with an error. When making an error"
    echo "report, please copy + paste the log above into the report."
    echo
    echo "(Press Ctrl-C to continue)"
    cat
fi
with this
Code: Select all
if [ $? != 0 ]
then
    notify-send --app-name=Oolite --urgency=critical "Oolite Singularity Occurred" "Oolite crashed dismally. If you want to troubleshoot, open a terminal and run ${DIR}/oolite"
    if [ "x$1" == "x" ]
    then
        echo
        echo "Erk. It looks like Oolite${TRUNK} died with an error. When making an error"
        echo "report, please copy + paste the log above into the report."
        echo
        echo "(Press Ctrl-C to continue)"
        cat
    fi
fi
should do the trick.
- hiran
- Theorethicist
- Posts: 2415
- Joined: Fri Mar 26, 2021 1:39 pm
- Location: a parallel world I created for myself. Some call it a singularity...
Re: Oolite Linux potentially unresponsive (stuck, crashed)
Did you try it or is it as theoretic as my above example? For me it did not work, as shared libraries were not found.
Code: Select all
demo@OoliteDemo:~/GNUstep/Applications/Oolite-trunk$ ./oolite-trunk
./oolite.app/oolite: error while loading shared libraries: libgnustep-base.so.1.28: cannot open shared object file: No such file or directory
notify-send: /home/demo/GNUstep/Applications/Oolite-trunk/oolite-deps/lib/libz.so.1: version `ZLIB_1.2.9' not found (required by /lib/x86_64-linux-gnu/libpng16.so.16)
/home/demo/GNUstep/Applications/Oolite-trunk/oolite.app/oolite-wrapper: 74: [: x: unexpected operator
demo@OoliteDemo:~/GNUstep/Applications/Oolite-trunk$
Code: Select all
demo@OoliteDemo:~/GNUstep/Applications/Oolite-trunk$ bash -x oolite.app/oolite-wrapper
+ TRUNK=-trunk
+++ dirname oolite.app/oolite-wrapper
++ cd oolite.app
++ cd ..
++ pwd -P
+ OOLITE_ROOT=/home/demo/GNUstep/Applications/Oolite-trunk
+ '[' '!' -f /home/demo/.Oolite/.oolite-trunk-run ']'
+ '[' '!' -d /home/demo/GNUstep/Library/DTDs ']'
++ uname -m
++ sed -e s/amd64/x86_64/
+ HOST_ARCH=x86_64
+ '[' x86_64 = x86_64 ']'
+ export GNUSTEP_HOST=x86_64-pc-linux-gnu
+ GNUSTEP_HOST=x86_64-pc-linux-gnu
+ export GNUSTEP_HOST_CPU=x86_64
+ GNUSTEP_HOST_CPU=x86_64
+ export GNUSTEP_FLATTENED=yes
+ GNUSTEP_FLATTENED=yes
+ export GNUSTEP_HOST_OS=linux-gnu
+ GNUSTEP_HOST_OS=linux-gnu
+ export GNUSTEP_HOST_VENDOR=pc
+ GNUSTEP_HOST_VENDOR=pc
+ export LD_LIBRARY_PATH=/home/demo/GNUstep/Applications/Oolite-trunk/oolite-deps/lib
+ LD_LIBRARY_PATH=/home/demo/GNUstep/Applications/Oolite-trunk/oolite-deps/lib
+ export ESPEAK_DATA_PATH=/home/demo/GNUstep/Applications/Oolite-trunk/oolite.app/Resources/
+ ESPEAK_DATA_PATH=/home/demo/GNUstep/Applications/Oolite-trunk/oolite.app/Resources/
+ cd /home/demo/GNUstep/Applications/Oolite-trunk/
+ ./oolite.app/oolite
./oolite.app/oolite: error while loading shared libraries: libgnustep-base.so.1.28: cannot open shared object file: No such file or directory
+ '[' 127 '!=' 0 ']'
+ notify-send --app-name=Oolite --urgency=critical 'Oolite Singularity Occurred' 'Oolite crashed dismally. If you want to troubleshoot, open a terminal and run /oolite'
notify-send: /home/demo/GNUstep/Applications/Oolite-trunk/oolite-deps/lib/libz.so.1: version `ZLIB_1.2.9' not found (required by /lib/x86_64-linux-gnu/libpng16.so.16)
+ '[' x == x ']'
+ echo
+ echo 'Erk. It looks like Oolite-trunk died with an error. When making an error'
Erk. It looks like Oolite-trunk died with an error. When making an error
+ echo 'report, please copy + paste the log above into the report.'
report, please copy + paste the log above into the report.
+ echo
+ echo '(Press Ctrl-C to continue)'
(Press Ctrl-C to continue)
+ cat
Code: Select all
demo@OoliteDemo:~/GNUstep/Applications/Oolite-trunk$ bash -x oolite.app/oolite-wrapper x
+ TRUNK=-trunk
+++ dirname oolite.app/oolite-wrapper
++ cd oolite.app
++ cd ..
++ pwd -P
+ OOLITE_ROOT=/home/demo/GNUstep/Applications/Oolite-trunk
+ '[' '!' -f /home/demo/.Oolite/.oolite-trunk-run ']'
+ '[' '!' -d /home/demo/GNUstep/Library/DTDs ']'
++ uname -m
++ sed -e s/amd64/x86_64/
+ HOST_ARCH=x86_64
+ '[' x86_64 = x86_64 ']'
+ export GNUSTEP_HOST=x86_64-pc-linux-gnu
+ GNUSTEP_HOST=x86_64-pc-linux-gnu
+ export GNUSTEP_HOST_CPU=x86_64
+ GNUSTEP_HOST_CPU=x86_64
+ export GNUSTEP_FLATTENED=yes
+ GNUSTEP_FLATTENED=yes
+ export GNUSTEP_HOST_OS=linux-gnu
+ GNUSTEP_HOST_OS=linux-gnu
+ export GNUSTEP_HOST_VENDOR=pc
+ GNUSTEP_HOST_VENDOR=pc
+ export LD_LIBRARY_PATH=/home/demo/GNUstep/Applications/Oolite-trunk/oolite-deps/lib
+ LD_LIBRARY_PATH=/home/demo/GNUstep/Applications/Oolite-trunk/oolite-deps/lib
+ export ESPEAK_DATA_PATH=/home/demo/GNUstep/Applications/Oolite-trunk/oolite.app/Resources/
+ ESPEAK_DATA_PATH=/home/demo/GNUstep/Applications/Oolite-trunk/oolite.app/Resources/
+ cd /home/demo/GNUstep/Applications/Oolite-trunk/
+ ./oolite.app/oolite x
./oolite.app/oolite: error while loading shared libraries: libgnustep-base.so.1.28: cannot open shared object file: No such file or directory
+ '[' 127 '!=' 0 ']'
+ notify-send --app-name=Oolite --urgency=critical 'Oolite Singularity Occurred' 'Oolite crashed dismally. If you want to troubleshoot, open a terminal and run /oolite'
notify-send: /home/demo/GNUstep/Applications/Oolite-trunk/oolite-deps/lib/libz.so.1: version `ZLIB_1.2.9' not found (required by /lib/x86_64-linux-gnu/libpng16.so.16)
+ '[' xx == x ']'
+ exit 0
demo@OoliteDemo:~/GNUstep/Applications/Oolite-trunk$
We need something better than that.
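One detail from the first log: the message "[: x: unexpected operator" is what a strictly POSIX sh (dash, for example) typically reports when "==" is used inside [ ]. A minimal sketch of the same check in a form that plain sh accepts, assuming the wrapper keeps running under /bin/sh; the function name `should_pause` is made up for this example:

```shell
#!/bin/sh
# Hypothetical rewrite of the check: decide whether to pause after a failure.
# $1 = exit code of the game, $2 = first wrapper argument (may be empty).
should_pause() {
    # -ne and -z are understood by every POSIX shell, unlike "==" inside [ ]
    [ "$1" -ne 0 ] && [ -z "${2:-}" ]
}
```

Calling it as `should_pause "$rc" "$1"`, with `rc=$?` stored directly behind the Oolite call, would also make a final `exit "$rc"` possible instead of the "exit 0" visible at the end of the last trace.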
- Commander_X
- ---- E L I T E ----
- Posts: 681
- Joined: Sat Aug 09, 2014 4:16 pm
Re: Oolite Linux potentially unresponsive (stuck, crashed)
This
Code: Select all
notify-send: /home/demo/GNUstep/Applications/Oolite-trunk/oolite-deps/lib/libz.so.1: version `ZLIB_1.2.9' not found (required by /lib/x86_64-linux-gnu/libpng16.so.16)
would certainly require running notify-send this way
Code: Select all
LD_LIBRARY_PATH=/usr/lib64 notify-send --app-name=Oolite --urgency=critical "Oolite Singularity Occurred" "Oolite crashed dismally. If you want to troubleshoot, open a terminal and run ${DIR}/oolite"
But you're also getting
Code: Select all
./oolite.app/oolite: error while loading shared libraries: libgnustep-base.so.1.28: cannot open shared object file: No such file or directory
which means your oolite binary was compiled against the local version of libgnustep-base, which is different from the libgnustep-base.so.1.20 distributed as an x86_64 dependency in the "deps" folder for Linux.
It's very likely you'll need to check if a local version of the library exists before placing the distributed one in "/home/demo/GNUstep/Applications/Oolite-trunk/oolite-deps/lib". If it exists, then remove the one in your runtime deps folder.
- hiran
- Theorethicist
- Posts: 2415
- Joined: Fri Mar 26, 2021 1:39 pm
- Location: a parallel world I created for myself. Some call it a singularity...
Re: Oolite Linux potentially unresponsive (stuck, crashed)
I agree, that is the right track.
We are looking at two different problems. One at compile/installation time, another at runtime.
In this thread I'd like to focus on the runtime problem. The other one is what causes Oolite to fail here, but such a failure is exactly the case where we need to display the message.
Also note that the LD_LIBRARY_PATH would not be a problem when running notify-send had we not tampered with it for Oolite earlier in the same script. Either we just unset what we modified before, or we separate the functions over two scripts and use the isolation provided by the OS. Going for isolation also keeps the scripts maintainable as each of them serves a very distinct purpose.
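The first option (unsetting what was modified before) could look roughly like this. It is a sketch only, and the helper name `run_without_libpath` is invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: run a command without the LD_LIBRARY_PATH that was
# exported for Oolite. The subshell keeps the change away from the caller.
run_without_libpath() {
    (
        unset LD_LIBRARY_PATH
        "$@"
    )
}
```

The notification from the draft would then be sent as `run_without_libpath notify-send --app-name=Oolite --urgency=critical "title" "message"`, so notify-send loads the system libraries while the rest of the script still sees the Oolite library path.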
- hiran
- Theorethicist
- Posts: 2415
- Joined: Fri Mar 26, 2021 1:39 pm
- Location: a parallel world I created for myself. Some call it a singularity...
Re: Oolite Linux potentially unresponsive (stuck, crashed)
To move forward, I modified the two scripts in question and tested them on my local installation. That worked to my satisfaction.
It actually involves two changes, distributed over two repositories. Plus the fact that one script is not directly in the repo, it is generated at installation time by the installer.
Please review the PRs and let me know if they are good to be merged.
https://github.com/OoliteProject/oolite/pull/432
https://github.com/OoliteProject/oolite ... ies/pull/2
Also, since I changed oolite-linux-dependencies, I am not sure what additional change should go into the Oolite repo to make it pull the latest submodule.