Oolite Linux potentially unresponsive (stuck, crashed)

For discussion of ports to POSIX based systems, especially using GNUStep.

Moderators: another_commander, winston, Getafix

hiran
Theorethicist
Posts: 2056
Joined: Fri Mar 26, 2021 1:39 pm
Location: a parallel world I created for myself. Some call it a singularity...

Oolite Linux potentially unresponsive (stuck, crashed)

Post by hiran »

On Linux, rather than running Oolite directly, users run a wrapper script. Its job is to configure the library path and working directory before running the real Oolite binary. This wrapper script is also what application launchers invoke (the equivalent of a desktop icon or Start menu entry in Windows).

But this wrapper script seems to do more than just prepare environment variables:
- It copies files that need to exist for a stable run. Should that not happen in the installer?
- It detects when Oolite terminates with a nonzero exit code. That would not be too bad, but at that point the script suddenly goes interactive. See https://github.com/OoliteProject/oolite ... te.src#L73

So imagine Oolite is started via the launcher. We expect to see the splash screen and, shortly after, the game screen. No terminal window is expected, and indeed none shows up. If Oolite crashes at that stage, one would expect the process to terminate and the GUI to indicate that it has stopped. It does not. The wrapper script's process does not terminate. Instead it prints a message on stdout and waits for user input. At which point is the user expected to notice?

This is a discrepancy we should solve differently. Any ideas?
Sunshine - Moonlight - Good Times - Oolite
hiran
Theorethicist
Posts: 2056
Joined: Fri Mar 26, 2021 1:39 pm
Location: a parallel world I created for myself. Some call it a singularity...

Re: Oolite Linux potentially unresponsive (stuck, crashed)

Post by hiran »

Hmmm. No one concerned about that? Well, if I am the only one thinking in that direction, then maybe I have to come up with a solution...

So my idea is this:
If Oolite is considered a UI based application, there should not be error messages on stdout or stderr.
But what if something goes seriously wrong? So wrong that nothing can be displayed in a window. So wrong that even writing to the logfile does not work?

There are mechanisms that all processes in a computer system have available: stdout, stderr and a return code.
So yes, place an error message onto these output streams. And raise the return code so that it is not zero (which usually indicates success).
The Oolite application is behaving that way, and it is good.

But this wrapper script, boy, it changes more than it should. So I intend to clean up a bit there. Either it makes the error visible graphically, or it should not change the behaviour at all, meaning the error stays on the output stream and the exit code is not tampered with. That way, whoever called the wrapper script can decide what to do.

I am open to feedback and willing to take on that change. So feel free to comment!
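
The principle described above (leave stdout and stderr alone, and hand the exit code back to the caller unchanged) can be sketched as a small shell function. This is only an illustration under assumptions, not the actual Oolite wrapper: the function name run_and_report is made up, and stand-in commands are used instead of the real Oolite binary.

```shell
#!/bin/bash
# Sketch only: run_and_report is a hypothetical helper, not part of Oolite.

# Run a command, leave its stdout/stderr untouched, add a short note on
# stderr if it failed, and return the command's own exit code.
run_and_report() {
    "$@"
    local rc=$?   # save immediately; any later command overwrites $?
    if [ "$rc" -ne 0 ]; then
        echo "wrapper: '$1' exited with code $rc" >&2
    fi
    return "$rc"
}

# Demonstration with stand-in commands.
run_and_report true
echo "exit code after success: $?"

run_and_report sh -c 'exit 3'
echo "exit code after failure: $?"
```

The point of saving the exit code in a variable right away is that the wrapper can do extra work (such as printing a message) without losing the value the caller needs.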
Commander_X
---- E L I T E ----
---- E L I T E ----
Posts: 664
Joined: Sat Aug 09, 2014 4:16 pm

Re: Oolite Linux potentially unresponsive (stuck, crashed)

Post by Commander_X »

From what I can tell, the "cat" command is used as a stand-in for the "pause" command of a command prompt batch file.

Instead of advising to press "Ctrl-C to continue", and use "cat", we can advise to press "any key to continue" and use (e.g.)

Code: Select all

read -r -s -N 1 junk
This will wait for a single character of input (optionally with a timeout) and then continue executing the script. The timeout can either be passed to the read command as a " -t 20 " parameter (for a 20-second timeout) or be set through the TMOUT variable somewhere in the script. That way the script times out even if the expected input never arrives.

Another way to pause script execution for a period is to use the "sleep" command.

I wouldn't remove that message and the option it provides, as inexperienced users can be directed to launch the game with a "Run in terminal" option enabled (or even launch it from the terminal) and capture the error message more easily.
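
The timed read suggested above can be wrapped in a small function so that the caller can also tell whether a key was actually pressed. This is a minimal sketch for bash; the function name pause_with_timeout is hypothetical and not part of the existing wrapper.

```shell
#!/bin/bash
# Sketch only: pause_with_timeout is a hypothetical helper name.

# Wait for a single key press, but give up after the given number of
# seconds. Returns 0 if a character was read and non-zero otherwise
# (bash reports a timeout with a status greater than 128, and end of
# input with status 1).
pause_with_timeout() {
    local seconds=${1:-20}
    local junk
    echo "(Press any key to continue or wait for ${seconds} seconds)"
    read -t "$seconds" -r -s -N 1 junk
}

# Demonstration: input arrives from a pipe, so this returns immediately.
printf 'x' | pause_with_timeout 5
echo "read status: $?"
```

When the script is started from a launcher without a terminal, the read either hits end of input or runs into the timeout, so the shell ends on its own in both cases.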
hiran
Theorethicist
Posts: 2056
Joined: Fri Mar 26, 2021 1:39 pm
Location: a parallel world I created for myself. Some call it a singularity...

Re: Oolite Linux potentially unresponsive (stuck, crashed)

Post by hiran »

Commander_X wrote: Fri Jul 07, 2023 2:08 am
From what I can tell, the "cat" command is used as a stand-in for the "pause" command of a command prompt batch file.

Instead of advising to press "Ctrl-C to continue", and use "cat", we can advise to press "any key to continue" and use (e.g.)

Code: Select all

read -r -s -N 1 junk
This will wait for a single character of input (optionally with a timeout) and then continue executing the script. The timeout can either be passed to the read command as a " -t 20 " parameter (for a 20-second timeout) or be set through the TMOUT variable somewhere in the script. That way the script times out even if the expected input never arrives.

Another way to pause script execution for a period is to use the "sleep" command.

I wouldn't remove that message and the option it provides, as inexperienced users can be directed to launch the game with a "Run in terminal" option enabled (or even launch it from the terminal) and capture the error message more easily.
If you run from a terminal, the messages are visible anyway. So halting the script only makes sense to me if it keeps a window open, which is not the case in my setup. We are talking Linux here. Would it not make sense to use notify-send to show a graphical message? Oolite would hardly be started without a desktop environment, I guess.

Still, a timeout seems to be a compromise that serves both worlds. But how long should it be? I use a different wrapper (OoliteStarter) and do not want to wait 20 seconds (or any other unnecessary delay) without any feedback about what happened. How long would you wait before clicking again?

Would it help if we either made the timeout configurable via the command line or split the script into two?
One part would take care of showing an error; the other would just ensure Oolite is invoked with all the necessary tweaks. I would call only the latter, while users can do what they always did...

oolite-wrapper ( second-wrapper ( oolite ) )
Commander_X
---- E L I T E ----
---- E L I T E ----
Posts: 664
Joined: Sat Aug 09, 2014 4:16 pm

Re: Oolite Linux potentially unresponsive (stuck, crashed)

Post by Commander_X »

hiran wrote: Fri Jul 07, 2023 5:33 am
[...]
Would it not make sense to use notify-send to show a graphical message?
[...]
Never too late to learn something (this is the first time I've heard of this command). I was going to argue that notify-send might be a desktop-environment-specific command, and I'm not yet 100% sure it's not, but it covers at least the Xfce and GNOME desktops. KDE-only or other non-GTK+ desktops might need some research.
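
Since it is not certain that notify-send exists on every desktop, one way to hedge is to check for it at runtime and fall back to stderr. The following is a minimal sketch under that assumption; the function name report_crash and the message texts are made up for illustration.

```shell
#!/bin/bash
# Sketch only: report_crash is a hypothetical helper name.

# Show a desktop notification if notify-send is installed and succeeds;
# otherwise fall back to printing the message on stderr.
report_crash() {
    local summary=$1
    local body=$2
    if command -v notify-send >/dev/null 2>&1; then
        notify-send --app-name=Oolite --urgency=critical "$summary" "$body" && return 0
    fi
    echo "${summary}: ${body}" >&2
}

report_crash "Oolite crashed" "Run the wrapper from a terminal to see the log."
```

With this shape the message is never lost: desktops with a notification service get a popup, and everything else still gets the text on stderr.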
hiran wrote: Fri Jul 07, 2023 5:33 am
[...]
Would it help if we either made the timeout configurable via the command line or split the script into two?
One part would take care of showing an error; the other would just ensure Oolite is invoked with all the necessary tweaks. I would call only the latter, while users can do what they always did...

oolite-wrapper ( second-wrapper ( oolite ) )
The timeout for the read command I suggested would be set through the command line, e.g. either

Code: Select all

read -t 20 -r -s -N 1 junk
or

Code: Select all

TMOUT=20 read -r -s -N 1 junk
Splitting the script won't solve much. After an Oolite crash, there would still be a shell session left hanging, waiting for that input to come.
A timeout would be useful because that shell will end on its own, without the need to identify and kill it. Mentioning the timeout in the message to the users would also be useful.
hiran
Theorethicist
Posts: 2056
Joined: Fri Mar 26, 2021 1:39 pm
Location: a parallel world I created for myself. Some call it a singularity...

Re: Oolite Linux potentially unresponsive (stuck, crashed)

Post by hiran »

Ok, let me make a concise example that we then can discuss.

oolite-wrapper.sh

Code: Select all

#!/bin/bash

# Timeout in seconds for the "press any key" prompt; override with "-t <seconds>".
# Option parsing is deliberately minimal and should be done more cleverly.
TIMEOUT=20
if [ "$1" = "-t" ]
then
    TIMEOUT=${2:-20}
    shift 2
fi

# find my own directory
SOURCE=${BASH_SOURCE[0]}
while [ -L "$SOURCE" ]; do # resolve $SOURCE until the file is no longer a symlink
  DIR=$( cd -P "$( dirname "$SOURCE" )" >/dev/null 2>&1 && pwd )
  SOURCE=$(readlink "$SOURCE")
  [[ $SOURCE != /* ]] && SOURCE=$DIR/$SOURCE # if $SOURCE was a relative symlink, we need to resolve it relative to the path where the symlink file was located
done
DIR=$( cd -P "$( dirname "$SOURCE" )" >/dev/null 2>&1 && pwd )

# invoke the second wrapper, passing the remaining arguments through unchanged
"${DIR}/second-wrapper.sh" "$@"
RC=$?

if [ "${RC}" != "0" ]
then
   echo
   echo "Erk. It looks like Oolite${TRUNK} died with an error. When making an error"
   echo "report, please copy + paste the log above into the report."
   echo
   echo "(Press any key to continue or wait for ${TIMEOUT} seconds)"
   read -t "${TIMEOUT}" -r -s -N 1 junk
fi

# important: preserve the exit code for the caller
exit ${RC}
second-wrapper.sh

Code: Select all

#!/bin/bash

# find my own directory
SOURCE=${BASH_SOURCE[0]}
while [ -L "$SOURCE" ]; do # resolve $SOURCE until the file is no longer a symlink
  DIR=$( cd -P "$( dirname "$SOURCE" )" >/dev/null 2>&1 && pwd )
  SOURCE=$(readlink "$SOURCE")
  [[ $SOURCE != /* ]] && SOURCE=$DIR/$SOURCE # if $SOURCE was a relative symlink, we need to resolve it relative to the path where the symlink file was located
done
DIR=$( cd -P "$( dirname "$SOURCE" )" >/dev/null 2>&1 && pwd )

# perform necessary cleanup
# set environment variables, especially LD_LIBRARY_PATH

# run Oolite
"${DIR}/oolite" "$@"
RC=$?

if [ "${RC}" != "0" ]
then
    # according to https://specifications.freedesktop.org/notification-spec/latest/index.html
    # the urgency level should be sufficient for the notification to wait for acknowledgement
    notify-send --app-name=Oolite --urgency=critical "Oolite Singularity Occurred" "Oolite crashed dismally. If you want to troubleshoot, open a terminal and run ${DIR}/oolite-wrapper.sh"
fi

# important: preserve the exit code for the caller
exit ${RC}
So the usage, as I indicated above, would look like this:
oolite-wrapper ( second-wrapper ( oolite ) )

With that, the behaviour for an ordinary user would be exactly the same as before; I just added a desktop notification that waits to be acknowledged.
As an advanced user, I could run the second wrapper directly and still enjoy the path fixes.
Commander_X
---- E L I T E ----
---- E L I T E ----
Posts: 664
Joined: Sat Aug 09, 2014 4:16 pm

Re: Oolite Linux potentially unresponsive (stuck, crashed)

Post by Commander_X »

You can always make do with a single script that takes a command line parameter in the .desktop file: when the parameter is used, the script only sends the notification and bypasses the input request; when it's not, it does both.
hiran
Theorethicist
Posts: 2056
Joined: Fri Mar 26, 2021 1:39 pm
Location: a parallel world I created for myself. Some call it a singularity...

Re: Oolite Linux potentially unresponsive (stuck, crashed)

Post by hiran »

Commander_X wrote: Fri Jul 07, 2023 6:20 pm
You can always make do with a single script that takes a command line parameter in the .desktop file: when the parameter is used, the script only sends the notification and bypasses the input request; when it's not, it does both.
Is that idea instead of or on top of my draft?
Commander_X
---- E L I T E ----
---- E L I T E ----
Posts: 664
Joined: Sat Aug 09, 2014 4:16 pm

Re: Oolite Linux potentially unresponsive (stuck, crashed)

Post by Commander_X »

hiran wrote: Fri Jul 07, 2023 6:57 pm
[...]
Is that idea instead of or on top of my draft?
Instead. The current single script would have to skip the input command (cat, or read if it gets replaced) when the "desktop" parameter is passed, and run it only when the parameter is absent. The same script would also have to send the notification, which should always run.
hiran
Theorethicist
Posts: 2056
Joined: Fri Mar 26, 2021 1:39 pm
Location: a parallel world I created for myself. Some call it a singularity...

Re: Oolite Linux potentially unresponsive (stuck, crashed)

Post by hiran »

Commander_X wrote: Fri Jul 07, 2023 10:07 pm
hiran wrote: Fri Jul 07, 2023 6:57 pm
[...]
Is that idea instead of or on top of my draft?
Instead. The current single script would have to skip the input command (cat, or read if it gets replaced) when the "desktop" parameter is passed, and run it only when the parameter is absent. The same script would also have to send the notification, which should always run.
I tested running a single script, and in there I could not even get notify-send to work.
Could you make this a complete example?
Commander_X
---- E L I T E ----
---- E L I T E ----
Posts: 664
Joined: Sat Aug 09, 2014 4:16 pm

Re: Oolite Linux potentially unresponsive (stuck, crashed)

Post by Commander_X »

Replacing this sequence (starting at line 66 from what I can tell from your github link above):

Code: Select all

if [ $? != 0 ]
then
   echo
   echo "Erk. It looks like Oolite${TRUNK} died with an error. When making an error"
   echo "report, please copy + paste the log above into the report."
   echo
   echo "(Press Ctrl-C to continue)"
   cat
fi
with this

Code: Select all

if [ $? != 0 ]
then
  notify-send --app-name=Oolite --urgency=critical "Oolite Singularity Occurred" "Oolite crashed dismally. If you want to troubleshoot, open a terminal and run ${DIR}/oolite"
  if [ "x$1" == "x" ]
  then
   echo
   echo "Erk. It looks like Oolite${TRUNK} died with an error. When making an error"
   echo "report, please copy + paste the log above into the report."
   echo
   echo "(Press Ctrl-C to continue)"
   cat
  fi
fi
should do the trick.
hiran
Theorethicist
Posts: 2056
Joined: Fri Mar 26, 2021 1:39 pm
Location: a parallel world I created for myself. Some call it a singularity...

Re: Oolite Linux potentially unresponsive (stuck, crashed)

Post by hiran »

Commander_X wrote: Fri Jul 07, 2023 10:21 pm
Replacing [...] should do the trick.
Did you try it, or is it as theoretical as my example above? For me it did not work, as shared libraries were not found.

Code: Select all

demo@OoliteDemo:~/GNUstep/Applications/Oolite-trunk$ ./oolite-trunk
./oolite.app/oolite: error while loading shared libraries: libgnustep-base.so.1.28: cannot open shared object file: No such file or directory
notify-send: /home/demo/GNUstep/Applications/Oolite-trunk/oolite-deps/lib/libz.so.1: version `ZLIB_1.2.9' not found (required by /lib/x86_64-linux-gnu/libpng16.so.16)
/home/demo/GNUstep/Applications/Oolite-trunk/oolite.app/oolite-wrapper: 74: [: x: unexpected operator
demo@OoliteDemo:~/GNUstep/Applications/Oolite-trunk$

Code: Select all

demo@OoliteDemo:~/GNUstep/Applications/Oolite-trunk$ bash -x oolite.app/oolite-wrapper 
+ TRUNK=-trunk
+++ dirname oolite.app/oolite-wrapper
++ cd oolite.app
++ cd ..
++ pwd -P
+ OOLITE_ROOT=/home/demo/GNUstep/Applications/Oolite-trunk
+ '[' '!' -f /home/demo/.Oolite/.oolite-trunk-run ']'
+ '[' '!' -d /home/demo/GNUstep/Library/DTDs ']'
++ uname -m
++ sed -e s/amd64/x86_64/
+ HOST_ARCH=x86_64
+ '[' x86_64 = x86_64 ']'
+ export GNUSTEP_HOST=x86_64-pc-linux-gnu
+ GNUSTEP_HOST=x86_64-pc-linux-gnu
+ export GNUSTEP_HOST_CPU=x86_64
+ GNUSTEP_HOST_CPU=x86_64
+ export GNUSTEP_FLATTENED=yes
+ GNUSTEP_FLATTENED=yes
+ export GNUSTEP_HOST_OS=linux-gnu
+ GNUSTEP_HOST_OS=linux-gnu
+ export GNUSTEP_HOST_VENDOR=pc
+ GNUSTEP_HOST_VENDOR=pc
+ export LD_LIBRARY_PATH=/home/demo/GNUstep/Applications/Oolite-trunk/oolite-deps/lib
+ LD_LIBRARY_PATH=/home/demo/GNUstep/Applications/Oolite-trunk/oolite-deps/lib
+ export ESPEAK_DATA_PATH=/home/demo/GNUstep/Applications/Oolite-trunk/oolite.app/Resources/
+ ESPEAK_DATA_PATH=/home/demo/GNUstep/Applications/Oolite-trunk/oolite.app/Resources/
+ cd /home/demo/GNUstep/Applications/Oolite-trunk/
+ ./oolite.app/oolite
./oolite.app/oolite: error while loading shared libraries: libgnustep-base.so.1.28: cannot open shared object file: No such file or directory
+ '[' 127 '!=' 0 ']'
+ notify-send --app-name=Oolite --urgency=critical 'Oolite Singularity Occurred' 'Oolite crashed dismally. If you want to troubleshoot, open a terminal and run /oolite'
notify-send: /home/demo/GNUstep/Applications/Oolite-trunk/oolite-deps/lib/libz.so.1: version `ZLIB_1.2.9' not found (required by /lib/x86_64-linux-gnu/libpng16.so.16)
+ '[' x == x ']'
+ echo

+ echo 'Erk. It looks like Oolite-trunk died with an error. When making an error'
Erk. It looks like Oolite-trunk died with an error. When making an error
+ echo 'report, please copy + paste the log above into the report.'
report, please copy + paste the log above into the report.
+ echo

+ echo '(Press Ctrl-C to continue)'
(Press Ctrl-C to continue)
+ cat
...it is still running, as it is blocked.

Code: Select all

demo@OoliteDemo:~/GNUstep/Applications/Oolite-trunk$ bash -x oolite.app/oolite-wrapper x
+ TRUNK=-trunk
+++ dirname oolite.app/oolite-wrapper
++ cd oolite.app
++ cd ..
++ pwd -P
+ OOLITE_ROOT=/home/demo/GNUstep/Applications/Oolite-trunk
+ '[' '!' -f /home/demo/.Oolite/.oolite-trunk-run ']'
+ '[' '!' -d /home/demo/GNUstep/Library/DTDs ']'
++ uname -m
++ sed -e s/amd64/x86_64/
+ HOST_ARCH=x86_64
+ '[' x86_64 = x86_64 ']'
+ export GNUSTEP_HOST=x86_64-pc-linux-gnu
+ GNUSTEP_HOST=x86_64-pc-linux-gnu
+ export GNUSTEP_HOST_CPU=x86_64
+ GNUSTEP_HOST_CPU=x86_64
+ export GNUSTEP_FLATTENED=yes
+ GNUSTEP_FLATTENED=yes
+ export GNUSTEP_HOST_OS=linux-gnu
+ GNUSTEP_HOST_OS=linux-gnu
+ export GNUSTEP_HOST_VENDOR=pc
+ GNUSTEP_HOST_VENDOR=pc
+ export LD_LIBRARY_PATH=/home/demo/GNUstep/Applications/Oolite-trunk/oolite-deps/lib
+ LD_LIBRARY_PATH=/home/demo/GNUstep/Applications/Oolite-trunk/oolite-deps/lib
+ export ESPEAK_DATA_PATH=/home/demo/GNUstep/Applications/Oolite-trunk/oolite.app/Resources/
+ ESPEAK_DATA_PATH=/home/demo/GNUstep/Applications/Oolite-trunk/oolite.app/Resources/
+ cd /home/demo/GNUstep/Applications/Oolite-trunk/
+ ./oolite.app/oolite x
./oolite.app/oolite: error while loading shared libraries: libgnustep-base.so.1.28: cannot open shared object file: No such file or directory
+ '[' 127 '!=' 0 ']'
+ notify-send --app-name=Oolite --urgency=critical 'Oolite Singularity Occurred' 'Oolite crashed dismally. If you want to troubleshoot, open a terminal and run /oolite'
notify-send: /home/demo/GNUstep/Applications/Oolite-trunk/oolite-deps/lib/libz.so.1: version `ZLIB_1.2.9' not found (required by /lib/x86_64-linux-gnu/libpng16.so.16)
+ '[' xx == x ']'
+ exit 0
demo@OoliteDemo:~/GNUstep/Applications/Oolite-trunk$ 
But you can see that notify-send failed to run.

We need something better than that.
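
A side note on the "[: x: unexpected operator" line in the first run: that is the message a strictly POSIX shell such as dash prints when "[" is given "==", because only a single "=" is specified for string comparison. This assumes the wrapper is executed by /bin/sh and that /bin/sh is dash on the demo machine, which the message format suggests but the thread does not confirm. A portable form of the check could look like the sketch below; the function name has_no_argument is made up for illustration.

```shell
#!/bin/sh
# Sketch only: has_no_argument is a hypothetical helper name.

# Portable emptiness test: "-z" (or a single "=") works in every POSIX
# shell, whereas "==" inside [ ] is an extension that dash rejects.
has_no_argument() {
    [ -z "$1" ]
}

if has_no_argument ""; then
    echo "no argument given: would prompt on the terminal"
fi
if ! has_no_argument "desktop"; then
    echo "argument given: would skip the prompt"
fi
```

The traces also show the notification text ending in "run /oolite", so ${DIR} appears to be empty in that script, while the trace shows OOLITE_ROOT being set instead.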
Commander_X
---- E L I T E ----
---- E L I T E ----
Posts: 664
Joined: Sat Aug 09, 2014 4:16 pm

Re: Oolite Linux potentially unresponsive (stuck, crashed)

Post by Commander_X »

This

Code: Select all

notify-send: /home/demo/GNUstep/Applications/Oolite-trunk/oolite-deps/lib/libz.so.1: version `ZLIB_1.2.9' not found (required by /lib/x86_64-linux-gnu/libpng16.so.16)
would certainly require running notify-send this way

Code: Select all

LD_LIBRARY_PATH=/usr/lib64 notify-send --app-name=Oolite --urgency=critical "Oolite Singularity Occurred" "Oolite crashed dismally. If you want to troubleshoot, open a terminal and run ${DIR}/oolite"
But you're also getting

Code: Select all

./oolite.app/oolite: error while loading shared libraries: libgnustep-base.so.1.28: cannot open shared object file: No such file or directory
which means your oolite binary was compiled against the local version of libgnustep-base, which differs from the libgnustep-base.so.1.20 distributed as an x86_64 dependency in the "deps" folder for Linux.
It's very likely you'll need to check whether a local version of the library exists before placing the distributed one in "/home/demo/GNUstep/Applications/Oolite-trunk/oolite-deps/lib". If it exists, then remove the one in your runtime deps folder.
hiran
Theorethicist
Posts: 2056
Joined: Fri Mar 26, 2021 1:39 pm
Location: a parallel world I created for myself. Some call it a singularity...

Re: Oolite Linux potentially unresponsive (stuck, crashed)

Post by hiran »

I agree, that is the right track.
We are looking at two different problems. One at compile/installation time, another at runtime.

In this thread I'd like to focus on the runtime problem. The other one is what causes Oolite to fail here, but that failure is exactly the case in which we need to display the message.

Also note that LD_LIBRARY_PATH would not be a problem when running notify-send had we not tampered with it for Oolite earlier in the same script. Either we unset what we modified before, or we separate the functions over two scripts and use the isolation provided by the OS. Going for isolation also keeps the scripts maintainable, as each of them serves a very distinct purpose.
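
Both options mentioned above can be sketched in a few lines of shell. This is an illustration under assumptions, not the change that went into the pull requests: the directory in DEPS_DIR and the helper names run_with_deps and run_without_deps are hypothetical, and "env -u" is assumed to be available (it is in GNU coreutils).

```shell
#!/bin/bash
# Sketch only: DEPS_DIR and both helper names are hypothetical.

DEPS_DIR="/opt/example/oolite-deps/lib"

# Option 1: scope the variable to a single command. Only that command
# sees the modified LD_LIBRARY_PATH; the script's environment is untouched.
run_with_deps() {
    LD_LIBRARY_PATH="$DEPS_DIR" "$@"
}

# Option 2: if the variable has already been exported, remove it again
# for a single command (for example the notifier).
run_without_deps() {
    env -u LD_LIBRARY_PATH "$@"
}

# Demonstration with stand-in commands instead of oolite and notify-send.
run_with_deps sh -c 'echo "game sees: $LD_LIBRARY_PATH"'
(
    export LD_LIBRARY_PATH="$DEPS_DIR"   # what an exporting wrapper would do
    run_without_deps sh -c 'echo "notifier sees: ${LD_LIBRARY_PATH:-<unset>}"'
)
```

Splitting the work over two scripts achieves the same isolation through process boundaries: an environment change made in the inner script never reaches the outer one.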
hiran
Theorethicist
Posts: 2056
Joined: Fri Mar 26, 2021 1:39 pm
Location: a parallel world I created for myself. Some call it a singularity...

Re: Oolite Linux potentially unresponsive (stuck, crashed)

Post by hiran »

To move forward, I modified the two scripts in question and tested them on my local installation. That worked to my satisfaction.
It actually involves two changes, distributed over two repositories. On top of that, one script is not directly in the repo; it is generated at installation time by the installer.

Please review the PRs and let me know if they are good to be merged.
https://github.com/OoliteProject/oolite/pull/432
https://github.com/OoliteProject/oolite ... ies/pull/2

Also, since I changed oolite-linux-dependencies, I am not sure what additional change should go into the Oolite repo to make it pull the latest submodule.