
Enable Oolite development

General discussion for players of Oolite.

Moderators: winston, another_commander

User avatar
hiran
Theorethicist
Posts: 2166
Joined: Fri Mar 26, 2021 1:39 pm
Location: a parallel world I created for myself. Some call it a singularity...

Enable Oolite development

Post by hiran »

It seems to me that Oolite development has stalled. Here is a code change that took 474 days from commit to a pre-release.

It is absolutely understandable that development stalls when there are almost no developers, no automatic build environment, and more supported platforms than any single person can realistically test. As stated here, some changes were just lucky guesses. And over time libraries have to be replaced, separately for every platform, and thoroughly tested. How should an individual master all this? But without ongoing development, without a minimum of maintenance, the code will at some point no longer compile (for the macOS platform this point has already come).

Any countermeasure needs to lower the hurdle of changing code without immediately knowing the results. There are several steps involved. One of them was to introduce automatic builds: today, whenever we merge or push something into the master branch, Oolite is automatically built for both Linux and Windows. But there is more...

a) Why do we wait with an automatic build until we merge to master? Isn't it worth seeing the build from a feature branch before deciding whether to merge?
b) Doing more builds also requires good names for them. Tagging the builds may become essential, and we need to agree on a method for that.
c) More builds will create more artefacts. Can GitHub host them all, or will we run into a quota? How can we keep the number of builds at a reasonable level?
d) Automatic builds just provide binaries; it would still take users to download them, test them and report back whether everything is good. Everything? Keep in mind the size of the code base, the number of features and thus the complexity of testing. So we need to think about automatic testing. Unit tests would be a suitable approach: with them, every developer would see the impact of a change at the earliest possible moment.
e) How would we check whether unit tests have been implemented, or whether they need updating to cover features added in the future? For this there are tools that measure test coverage and report other issues with the code.

It seems to me that before we go for a) we need a solution for b) and c).
Similarly, it may be easier to resolve e) first before we attack d).

What are your opinions on it?
Last edited by hiran on Mon Apr 24, 2023 8:39 pm, edited 3 times in total.
Sunshine - Moonlight - Good Times - Oolite
User avatar
hiran
Theorethicist
Posts: 2166
Joined: Fri Mar 26, 2021 1:39 pm
Location: a parallel world I created for myself. Some call it a singularity...

Re: Enable Oolite development

Post by hiran »

Not all at once, please.
It looks like I am going into a monologue. But then I have spent some thought on the matter that I can share.
hiran wrote: Sun Apr 23, 2023 8:46 pm
a) Why do we wait with an automatic build until we merge to master? Isn't it worth seeing the build from a feature branch before deciding whether to merge?
b) Doing more builds also requires good names for them. Tagging the builds may become essential, and we need to agree on a method for that.
c) More builds will create more artefacts. Can GitHub host them all, or will we run into a quota? How can we keep the number of builds at a reasonable level?
d) Automatic builds just provide binaries; it would still take users to download them, test them and report back whether everything is good. Everything? Keep in mind the size of the code base, the number of features and thus the complexity of testing. So we need to think about automatic testing. Unit tests would be a suitable approach: with them, every developer would see the impact of a change at the earliest possible moment.
e) How would we check whether unit tests have been implemented, or whether they need updating to cover features added in the future? For this there are tools that measure test coverage and report other issues with the code.

It seems to me that before we go for a) we need a solution for b) and c).
Similarly, it may be easier to resolve e) first before we attack d).
First things first, to get them out of the way:
b) Good names for our builds
I looked into Semantic Versioning, which is an industry standard. There is really no way around it, and in fact Oolite has followed these patterns quite well anyway. The difference is that semantic versioning can also be applied to prereleases and feature branches, so it answers all the open questions about how to name a build.

On a different repository I tried a GitHub Action that automatically calculates the version number, and I got really usable results just by sticking to the default settings. Therefore I consider this a feasible way to finish off b).
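For illustration, a minimal workflow along these lines might look as follows. This is a sketch only: the specific action, its settings and the example version strings are assumptions, not the exact setup I tested.

```yaml
# Hypothetical workflow sketch: derive a semantic version from git tags.
name: version
on: [push]
jobs:
  version:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0   # full history is needed to count commits since the last tag
      - id: semver
        uses: paulhatch/semantic-version@v5
        # With default settings the version is derived from tags plus commit
        # distance, e.g. 1.91.2 on master or 1.91.2-featureA.3 on a branch.
      - run: echo "Building Oolite ${{ steps.semver.outputs.version }}"
```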

c) Limiting artefacts
GitHub Actions exist that delete artefacts using filters we define. That means we can configure limits such as:
- for every branch, keep up to three prerelease builds (oldest removed first)
- in total, keep up to 50 prerelease builds (oldest removed first)
- while we could set up similar limits for releases, we can also choose to manage those manually, as they currently document a long history of Oolite development.
The limits that GitHub itself applies can be seen in the organization settings.
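As a sketch of what such a limit could look like: GitHub's own actions/upload-artifact step supports a retention-days input for age-based expiry, while count-based limits ("keep at most N per branch") need a separate cleanup action. The artifact name and path below are assumptions for illustration.

```yaml
# Age-based expiry sketch: old prerelease artefacts disappear automatically.
- uses: actions/upload-artifact@v4
  with:
    name: oolite-${{ github.ref_name }}-${{ github.sha }}
    path: build/Oolite.zip       # hypothetical path
    retention-days: 30
```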

I have tested this action in another repository and it works as expected. Therefore I consider this a feasible way to finish off c).

e) Checking unit tests and code quality
There are tools out there that can help with automatic analysis; some even offer their analysis free of charge for open source projects.
I tested SonarQube and Codacy. While SonarQube is great for Java and many other languages, it seems a bit quirky to configure for Objective-C; this was easier with Codacy. Both tools report that there is no code coverage. Does that mean there really are no unit tests today, or did both simply fail to detect them?

Nevertheless, for such a free hosting offer to be configured correctly, one would need admin privileges on the code repository, if not on the whole organization. Who can help?
Sunshine - Moonlight - Good Times - Oolite
User avatar
cim
Quite Grand Sub-Admiral
Posts: 4072
Joined: Fri Nov 11, 2011 6:19 pm

Re: Enable Oolite development

Post by cim »

Semantic versioning for pre-release: watch out for things like "maximum_version = 1.XX.99" in OXP manifests, if you're going to be incrementing that number automatically every build. So long as the "odd = pre-release" / "even = release" pattern for XX continues, it shouldn't be an issue in practice as the release builds would still never get that high and sticking a pre-release build as the maximum version would be unusual. That isn't strictly semantic versioning, of course, but the interoperability reasons for semantic versioning don't apply so much to Oolite anyway, and changing a mature version numbering scheme is a different question to establishing one from the start.

Unit tests: Indeed - unless someone added them since I left, anyway - there aren't any. Nice as automated testing would have been, it was generally not the unit-testable code which was causing the bugs anyway - Oolite's code is generally a mix of "well, you could unit test that to increase your coverage metric, but you can tell by eye that it's correct anyway and it'd be really obvious in-game if you did somehow manage to break it" and "you are not going to get useful unit tests on that method without refactoring four entire classes, because the code was never written with unit testing in mind" - a combination which does not encourage trying to get started on adding them.
User avatar
hiran
Theorethicist
Posts: 2166
Joined: Fri Mar 26, 2021 1:39 pm
Location: a parallel world I created for myself. Some call it a singularity...

Re: Enable Oolite development

Post by hiran »

cim wrote: Tue Apr 25, 2023 7:56 am
Semantic versioning for pre-release: watch out for things like "maximum_version = 1.XX.99" in OXP manifests, if you're going to be incrementing that number automatically every build. So long as the "odd = pre-release" / "even = release" pattern for XX continues, it shouldn't be an issue in practice as the release builds would still never get that high and sticking a pre-release build as the maximum version would be unusual. That isn't strictly semantic versioning, of course, but the interoperability reasons for semantic versioning don't apply so much to Oolite anyway, and changing a mature version numbering scheme is a different question to establishing one from the start.
Thank you for bringing this up, but oops...
Do I understand correctly that Oolite version numbers contain at most a 1 as major, then two digits as minor, and at most 99 at the end? And odd minor numbers are prereleases while even minors are releases? That would be quite alarming:

We have already reached 1.91. The next release would be 1.92, and after three further releases (1.94, 1.96 and 1.98) the project would reach its end.
Sounds a bit like the year 2000 problem. But the only arithmetic we need on version numbers is comparison: is one bigger than the other? Is one compatible with the other? These functions should exist somewhere in the Oolite code, and I do not believe they have to be repeated in each and every OXP. That would give us a chance to change the code to work properly on semantic versioning. I would at least give it a try.
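To make "comparison is all we need" concrete, here is an illustrative Python sketch (not Oolite's actual parser): comparing the components as integers keeps working past two digits, while naive string comparison does not.

```python
# Sketch only: component-wise numeric comparison of A.B.C version strings.
def version_key(version: str) -> tuple:
    """Split 'A.B.C' into a tuple of ints, so tuple comparison is numeric."""
    return tuple(int(part) for part in version.split("."))

# Numeric comparison: 1.100.0 correctly sorts after 1.91.0 ...
assert version_key("1.100.0") > version_key("1.91.0")
# ... whereas plain string comparison gets it backwards ('1' < '9').
assert "1.100.0" < "1.91.0"
```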

Yet version numbers from a feature branch (let's call it featureA) would then look like 1.91.1-featureA.1. Once merged into master we would get 1.91.2. As long as the third component (semver calls it the patch) is not limited to two digits, there should not be any problem.

But it needs to be tested, I agree here.
cim wrote: Tue Apr 25, 2023 7:56 am
Unit tests: Indeed - unless someone added them since I left, anyway - there aren't any. Nice as automated testing would have been, it was generally not the unit-testable code which was causing the bugs anyway - Oolite's code is generally a mix of "well, you could unit test that to increase your coverage metric, but you can tell by eye that it's correct anyway and it'd be really obvious in-game if you did somehow manage to break it" and "you are not going to get useful unit tests on that method without refactoring four entire classes, because the code was never written with unit testing in mind" - a combination which does not encourage trying to get started on adding them.
I started using SonarQube on some of my projects. In the beginning I was annoyed by all the findings: not just the test coverage, but partly things I would have disagreed with. But the more I looked into the issues, the more I understood why they are actually code smells, and meanwhile I am trying to remove them completely. And I agree that this can lead to refactoring a couple of classes, leading to code that is easier to understand and maintain. However, before refactoring code you want to have unit tests, just to make sure you do not break too much during the refactoring.

Writing unit tests definitely looks like unnecessary work, but Oolite has reached a quality that no one wants to lose. As the saying goes, no one touches a running system. But that also means no further development.
With unit tests that ensure we do not lose functionality, developers can start changing code again - especially developers who are new to the code and cannot assess the full impact of their actions. And I am very confident that we badly need to attract new developers, so let's make it easy for them. But I fully understand that this is my opinion, which would need to be shared by others.

So I'd like to suggest not arguing too much about automatic testing: code analysis can find more than just that, and I know we can get it free of charge. Should we not simply configure it, look at the results, and discuss then?
Sunshine - Moonlight - Good Times - Oolite
User avatar
cim
Quite Grand Sub-Admiral
Posts: 4072
Joined: Fri Nov 11, 2011 6:19 pm

Re: Enable Oolite development

Post by cim »

hiran wrote: Tue Apr 25, 2023 7:25 pm
Do I understand correctly that Oolite version numbers contain at most a 1 as major, then two digits as minor and at most a 99 in the end?
They're semver-style arbitrary-base numbers, so 500.45785.284783 would be valid in theory and understood by the OXP manifest parser. Going to 1.100.0 for a future stable release would certainly be fine. (Though strict semver would insist on 2.0.0 anyway; almost all major release versions have broken backwards compatibility for *something*, even if generally nothing big ... which is what I mean by "changing a mature and understood version numbering process")

The problem is that the Oolite website, when you register an OXP there, suggests (or at least suggested?) "1.XXX.99" as its format for maximum versions, if you wanted to say "runs on any 1.XXX but not 1.(XXX+1)". If you're going to have >99 point versions routinely, that breaks.

Of course, that's only important for the version number contained in the Oolite binary itself, which is a constant definition in the source code. So the Github tag for the build could be 1.91.400 and the binary would still claim to be 1.91.0 and that would be fine for OXZ compatibility. If you had the build rewrite that constant to always match the tag version that would be trickier.

Or you could hack the manifest.plist parser in Oolite itself to treat "1.XXXX.99" maximum versions as meaning "<1.(XXXX+1).0", for a workaround. Or you could adjust the website manifest generator first to put a much greater number of 9s on the 3rd component.
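As an illustration of that first workaround, here is a sketch in Python (hypothetical helper names, not the real manifest.plist parser, which is Objective-C): a "A.B.99" maximum is reinterpreted as the exclusive upper bound A.(B+1).0.

```python
# Sketch of the "treat A.B.99 as < A.(B+1).0" workaround; names are invented.
def max_version_bound(maximum_version: str) -> tuple:
    """Turn a manifest maximum_version into an exclusive upper bound."""
    major, minor, patch = (int(p) for p in maximum_version.split("."))
    if patch == 99:                   # legacy "any point release" convention
        return (major, minor + 1, 0)
    return (major, minor, patch + 1)  # otherwise: just above the stated max

def compatible(oolite_version: str, maximum_version: str) -> bool:
    current = tuple(int(p) for p in oolite_version.split("."))
    return current < max_version_bound(maximum_version)

assert compatible("1.91.400", "1.91.99")      # >99 point builds still match
assert not compatible("1.92.0", "1.91.99")    # next minor is excluded
```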

All solvable problems, but saying "from now on, semver" is potentially extra work that doesn't really add value over Oolite's existing scheme:
A.B.C
A = extremely major version; changes to this are expected to break all (or at least a lot of) existing OXPs
B = major version; some OXP incompatibilities, but mostly they should just work as before. LinuxKernel style of even for release, odd for pre-release.
C = minor version; bugfixes only on release versions, largely irrelevant for pre-release in practice
hiran wrote: Tue Apr 25, 2023 7:25 pm
Yet version numbers from a feature branch (let's call that branch featureA) would then look like 1.91.1-featureA.1
From memory the OXP manifest parser would interpret that as 1.91.not-an-integer-so-call-it-zero.we-didn't-expect-a-fourth-number-anyway

But again, the binary would probably still advertise itself as 1.91.0 in that case regardless. If not, getting the manifest parser to ignore the dash and anything after it should fix things up.
hiran wrote: Tue Apr 25, 2023 7:25 pm
And I am very confident that we badly need to attract new developers. So let's make it easy for them.
I'm not disputing the value of a comprehensive test suite - all the stuff I write in my day job gets one, and it saves me a lot of effort in the long run - I'm saying that there's no point in configuring frameworks to run unit tests automatically until there are actually some useful unit tests written, and having been through the process of trying to add tests to older code not designed to be easily tested a few times, you might find it hard to get people to volunteer to add them to Oolite.

For example, if you look at ShipEntity.m, the frustumRadius function is probably fairly easy to write unit tests for:
- mock up ship entities with various profile radii and exhaust sub-entities;
- verify that the function returns the profile radius plus the largest exhaust radius in each case.
You're not adding much value in that case because the function self-evidently does that by looking at it, and is unlikely to need changing in future anyway, but you could certainly write a unit test for it. And if you did that tedious retrospective work for a few thousand methods, the "not much added value" per-test would eventually add up to actual bugs averted here and there. But the "tedious" bit means that it doesn't help solve the lack of developers problem [1].
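For readers unfamiliar with the pattern, the test cim sketches would look roughly like this. This is a Python analogue for illustration only: the real frustumRadius is Objective-C code in ShipEntity.m, and the function below is a simplified stand-in, not its actual implementation.

```python
# Illustration only: a stand-in for frustumRadius and the tests cim describes.
def frustum_radius(profile_radius, exhaust_radii):
    """Stand-in: profile radius plus the largest exhaust radius, if any."""
    return profile_radius + (max(exhaust_radii) if exhaust_radii else 0.0)

def test_with_exhausts():
    # various exhaust sub-entities: the largest one (2.5) should win
    assert frustum_radius(10.0, [1.0, 2.5, 2.0]) == 12.5

def test_without_exhausts():
    # no exhausts: the profile radius alone
    assert frustum_radius(10.0, []) == 10.0

test_with_exhausts()
test_without_exhausts()
```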

Conversely, the "setUpMixedEscorts" method would be far harder to unit test (it pokes a whole bunch of global state which would need at least a lot of mocked objects) but is something which might plausibly generate bugs. The problem is that because of the number of mocked objects and calls you'd need to set up, you'd potentially end up not testing the output but the internal implementation, so perfectly good changes to the method would also fail the unit test. With a bit of careful refactoring - which you'd have to test manually, because that refactoring is a pre-requisite to build the test suite! - you could probably get it to the stage where it would be testable more sensibly, though.

And then there's things like "behaviour_attack_broadside", which don't have a return value, are entirely side-effects on object state, are very easy to break if you change them, are very easy to break if you change some not obviously related methods too, and are very hard to even rigorously define correct behaviour for in a way that doesn't classify *improvements* as test failures just as much as breakages are. Good luck :mrgreen:
hiran wrote: Tue Apr 25, 2023 7:25 pm
Code analysis can find more than just that. And I know we can get it free of charge. Should we not simply configure it, look at the results and discuss then?
Sure, absolutely, I'm sure static analysis tools are way better than they were the last time I ran one over Oolite - but you can do that on an offline copy of the source without needing it integrating as a build action on every commit, right? For an app as old and hairy as Oolite, it's likely going to generate thousands of potential things to check and improve, so having it generate roughly the same thousand every day doesn't really add anything. Just run the report, stick it up somewhere, and let people pick them off. If there's any enthusiasm for that sort of tidying up, run another report next month.

If it's generating thousands of things, noticing that the change you introduced has added an extra one which is actually serious is pretty tricky. If the baseline report is clear - either fixed or manually marked as "this one is fine" in some exceptions file - *then* they become useful for integrating into the builds because new ones stand out. But you have to get to that point first.

(Sure, you can probably integrate it for free; still not worth it if it's not going to generate anything useful on a day-to-day basis)

[1] If you're volunteering to personally write enough unit tests that you can safely do the refactoring needed to remove the static analysis warnings, then ignore this bit. But again, cart-before-horse on sticking it up on automated building: you need to do that work first to get any value out of the automated build step.
User avatar
hiran
Theorethicist
Posts: 2166
Joined: Fri Mar 26, 2021 1:39 pm
Location: a parallel world I created for myself. Some call it a singularity...

Re: Enable Oolite development

Post by hiran »

cim wrote: Wed Apr 26, 2023 9:45 am
All solvable problems, but saying "from now on, semver" is potentially extra work that doesn't really add value over Oolite's existing scheme:
A.B.C
A = extremely major version; changes to this are expected to break all (or at least a lot of) existing OXPs
B = major version; some OXP incompatibilities, but mostly they should just work as before. LinuxKernel style of even for release, odd for pre-release.
C = minor version; bugfixes only on release versions, largely irrelevant for pre-release in practice
We can easily stay with A.B.C numbers. In that case C would somehow encode the build number on branches.

What I actually want is automatic builds not just for releases. The reason: no single individual is going to build on multiple operating systems and upload the results for others to see, in a timely manner, for years on end. Those times are gone. So what I am really striving for is a); I thought the other two were prerequisites.
cim wrote: Wed Apr 26, 2023 9:45 am
hiran wrote: Tue Apr 25, 2023 7:25 pm
Yet version numbers from a feature branch (let's call that branch featureA) would then look like 1.91.1-featureA.1
From memory the OXP manifest parser would interpret that as 1.91.not-an-integer-so-call-it-zero.we-didn't-expect-a-fourth-number-anyway

But again, the binary would probably still advertise itself as 1.91.0 in that case regardless. If not, getting the manifest parser to ignore the dash and anything after it should fix things up.
That does not sound like a showstopper then. :-)
cim wrote: Wed Apr 26, 2023 9:45 am
hiran wrote: Tue Apr 25, 2023 7:25 pm
And I am very confident that we badly need to attract new developers. So let's make it easy for them.
I'm not disputing the value of a comprehensive test suite - all the stuff I write in my day job gets one, and it saves me a lot of effort in the long run - I'm saying that there's no point in configuring frameworks to run unit tests automatically until there are actually some useful unit tests written, and having been through the process of trying to add tests to older code not designed to be easily tested a few times, you might find it hard to get people to volunteer to add them to Oolite.
This is a chicken-and-egg problem. It does not make sense to configure a test framework if there are no tests. But what is the point of writing tests if there is no test framework?
cim wrote: Wed Apr 26, 2023 9:45 am
hiran wrote: Tue Apr 25, 2023 7:25 pm
Code analysis can find more than just that. And I know we can get it free of charge. Should we not simply configure it, look at the results and discuss then?
Sure, absolutely, I'm sure static analysis tools are way better than they were the last time I ran one over Oolite - but you can do that on an offline copy of the source without needing it integrating as a build action on every commit, right?
I did that to see what two such tools could deliver. The results are only visible to me. What's next?
cim wrote: Wed Apr 26, 2023 9:45 am
For an app as old and hairy as Oolite, it's likely going to generate thousands of potential things to check and improve, so having it generate roughly the same thousand every day doesn't really add anything. Just run the report, stick it up somewhere, and let people pick them off. If there's any enthusiasm for that sort of tidying up, run another report next month.
Actually the sensible frequency for the reports depends on the change frequency: there is no point in running them more often than we receive commits, and running them less often is debatable, but even once a month would be an improvement. Running them manually is not a feasible option. Who would do it? How often? How reliably? For how long?
Yet on the other hand the integration is just a few clicks away - given the right privileges. I could have configured it by now in less time than I have spent writing this message.
cim wrote: Wed Apr 26, 2023 9:45 am
If it's generating thousands of things, noticing that the change you introduced has added an extra one which is actually serious is pretty tricky. If the baseline report is clear - either fixed or manually marked as "this one is fine" in some exceptions file - *then* they become useful for integrating into the builds because new ones stand out. But you have to get to that point first.
The tools offer that functionality.
cim wrote: Wed Apr 26, 2023 9:45 am
[1] If you're volunteering to personally write enough unit tests that you can safely do the refactoring needed to remove the static analysis warnings, then ignore this bit. But again, cart-before-horse on sticking it up on automated building: you need to do that work first to get any value out of the automated build step.
I did not say I am volunteering; Objective-C is far beyond my capabilities. But if I were to touch the code, I'd first want some safety net before messing up the core.

What do others think about this?
Sunshine - Moonlight - Good Times - Oolite
User avatar
timer
---- E L I T E ----
Posts: 336
Joined: Sat Mar 17, 2012 8:26 pm
Location: Laenin spiv club

Re: Enable Oolite development

Post by timer »

Very interesting, but I'm overloaded with coding for work and can hardly find time to develop the site :(

PS best strategy: "hack-hack and deploy to prod!" ;)
PPS I am very glad that cim is in touch with us!
Cobra MK III owner since 1994
User avatar
cim
Quite Grand Sub-Admiral
Posts: 4072
Joined: Fri Nov 11, 2011 6:19 pm

Re: Enable Oolite development

Post by cim »

hiran wrote: Wed Apr 26, 2023 7:42 pm
This is a chicken-and-egg problem. It does not make sense to configure a test framework if there are no tests. But what is the point of writing tests if there is no test framework?
Test framework in the sense of "there's somewhere you can write tests and run them locally while building the change" is obviously a pre-requisite to get any tests written. I guess whoever writes the first test gets to pick which test framework they want to use.

Once some tests are written, then running them automatically on build would be the next step, but even something where you can type "make test" and get some useful output locally is a big help. That would usually be the first step anyway, since it's easier to debug why the test framework isn't working locally before trying to debug it remotely a commit-push-build cycle at a time.
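A "make test" entry point could be as small as the fragment below. To be clear, this is a hypothetical sketch: Oolite's build today has no such target, and the runner script name is invented for illustration.

```make
# Hypothetical GNUmakefile fragment: a local test entry point a developer
# can run before pushing. The script it calls does not exist in Oolite yet.
.PHONY: test
test:
	./tools/run-unit-tests.sh
```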
hiran wrote: Wed Apr 26, 2023 7:42 pm
I did that to see what two such tools could deliver. The results are only visible to me. What's next?
Publish them so that others can see them too, I suppose. Sure, it's not as neat as having it integrated into the build process, but at least they'd be available somewhere.
hiran wrote: Wed Apr 26, 2023 7:42 pm
cim wrote: Wed Apr 26, 2023 9:45 am
[1] If you're volunteering to personally write enough unit tests that you can safely do the refactoring needed to remove the static analysis warnings, then ignore this bit. But again, cart-before-horse on sticking it up on automated building: you need to do that work first to get any value out of the automated build step.
I did not say I am volunteering; Objective-C is far beyond my capabilities. But if I were to touch the code, I'd first want some safety net before messing up the core.
If no-one is volunteering to write unit tests, then whether they might hypothetically benefit a future developer and whether the build process would run them automatically is irrelevant: there aren't any and there aren't going to be any in future. So the problem is not how to do the relatively short piece of technical work to integrate an irrelevant test framework that will constantly report "0/0 tests passed!" but how to get someone to do the much longer piece of work to build enough unit tests that the test framework gives sufficient coverage to have any confidence whatsoever that "all tests passed" means anything in terms of that safety net.
User avatar
hiran
Theorethicist
Posts: 2166
Joined: Fri Mar 26, 2021 1:39 pm
Location: a parallel world I created for myself. Some call it a singularity...

Re: Enable Oolite development

Post by hiran »

It looks like there are no further opinions on this matter. Then I will let it rest.
Sunshine - Moonlight - Good Times - Oolite