Semantic Versioning for Annif releases beyond 1.0 #616

Closed
osma opened this issue Aug 31, 2022 · 6 comments

osma commented Aug 31, 2022

Semantic Versioning is a set of good practices for version numbers that reflect the expected amount (or lack of) backward compatibility in a software release. All Annif releases so far have been in the 0.x series, where there are no strict rules and anything is allowed, which is good for initial fast development. However, Annif is nowadays used in production in at least four large institutions (NatLibFi/FintoAI, Yle, ZBW, DNB) and according to the SemVer FAQ, "If your software is being used in production, it should probably already be 1.0.0.". So we should aim at releasing version 1.0.0 somewhere in the not-so-distant future (during the year 2023). This will bring us into a new era where we have to be more careful about backwards compatibility and selecting version numbers for new releases in a way that follows the SemVer principles.

The guiding principle of semantic versioning is to document the public API of the software project and then to try to avoid introducing backwards incompatible changes to that API; if such changes must be made, this requires a new major release. This raises the question: what is the public API of Annif? It isn't primarily a software library (although it could be used that way) but a tool that is commonly used via the CLI and/or the REST API, with configuration files and data stored on disk. We need to define the policy for what may and may not change, particularly for minor releases (e.g. 1.0.0 -> 1.1.0) which require the most judgement.

Here is a breakdown of the possible aspects of backwards compatibility and some suggestions for what could be the policy for what is and isn't allowed in a minor version release. Overall, I think it makes sense to consider the perspective of the Annif user (and/or the person who is managing the installation); a minor version should be a smooth upgrade that in most cases shouldn't require any changes apart from the upgrade itself (although some changes could be beneficial, for example enabling new features). There is obviously a tradeoff here: we want upgrades to be smooth, but a very strict policy on backwards compatibility makes it more likely that major versions need to be released often, which is probably not what we want.

CLI commands

Possible policies for CLI commands:

  • Old CLI commands (including options) must keep working, including commands that rely on default values.
  • Old CLI commands (including options) must keep working, but default values may change.

If we choose the latter (less strict) policy, it means that scripts that run Annif commands are best written so that they avoid relying on default values, so that they don't break when the defaults change.
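A minimal sketch of that scripting style, in Python: every option the script cares about is passed explicitly, so a changed default in a later release would not alter its behaviour. The project ID `my-project` and the chosen values are hypothetical. The sketch assumes that `annif suggest` accepts `--limit` and `--threshold` and reads the text from standard input; check `annif suggest --help` for the installed version.

```python
import subprocess


def build_suggest_command(project_id, limit, threshold):
    """Build an `annif suggest` command line with every tunable spelled out."""
    return [
        "annif", "suggest",
        "--limit", str(limit),          # assumed option name
        "--threshold", str(threshold),  # assumed option name
        project_id,
    ]


def run_suggest(project_id, text, limit, threshold):
    """Pipe `text` to annif on stdin (assumed behaviour) and return raw stdout."""
    cmd = build_suggest_command(project_id, limit, threshold)
    result = subprocess.run(
        cmd, input=text, capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    # "my-project" is a hypothetical project ID.
    print(build_suggest_command("my-project", limit=10, threshold=0.0))
```

Keeping the command construction in one function also gives a single place to adjust if an option is later deprecated.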

It is also possible (and allowed) to deprecate some CLI commands in a minor release. However, removing a CLI command completely would require a major release.

The output of CLI commands may also be considered part of the API, in particular for commands like annif eval whose output is meant to be easily processed using tools like grep and sed. I think that the sensible approach would be to state that the output of /some/ commands (including eval and hyperopt) must stay the same except that it's allowed to introduce new output lines that follow the same syntax (e.g. new evaluation metrics).
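To illustrate what "new output lines that follow the same syntax" would mean for a consumer, here is a hedged Python sketch of a tolerant parser. It assumes the output consists of `name: value` lines with a numeric value; the metric names in the example are only illustrative, and the real output format should be checked against the installed version.

```python
def parse_metrics(output):
    """Parse 'name: value' lines into a dict, skipping anything else."""
    metrics = {}
    for line in output.splitlines():
        # Split on the last colon so names containing colons still work.
        name, sep, value = line.rpartition(":")
        if not sep or not name.strip():
            continue  # not a 'name: value' line
        try:
            metrics[name.strip()] = float(value)
        except ValueError:
            continue  # value is not numeric; ignore the line
    return metrics


if __name__ == "__main__":
    sample = "Precision (doc avg): \t0.5\nSome future metric: \t0.25\n"
    wanted = parse_metrics(sample)
    # The script only reads the metrics it knows about; extra lines are harmless.
    print(wanted["Precision (doc avg)"])
```

A consumer written this way keeps working when a release adds metrics, but would still break if an existing metric were renamed, which is why renames would fall outside what a minor release may do under this policy.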

REST API method calls

The REST API already includes a version number prefix (currently /v1/) so it is expected that any backwards incompatible changes would require incrementing that version prefix. If support for the old API version is removed, this would naturally also trigger a new major release of Annif. What is considered a breaking change for the REST API is another discussion, but that is out of scope for Annif version numbers.

Web UI

The Web UI relies on the REST API for all operations. I don't think it makes sense to restrict changes to the web UI, since it's not something that is normally used programmatically but by human users. So any kind of changes to the Web UI may be made in minor releases of Annif.

Configuration files

Configuration files (e.g. projects.cfg, projects.toml) are managed by the user and thus it's not desirable that they need to be changed because of a minor version upgrade. Possible policies:

  • Old configuration files must keep working and default values must stay the same.
  • Old configuration files must keep working, but default values may change.

The policy could also be different for core vs. optional backends. If for example upstream Omikuji changes its default values, would this trigger a new major release of Annif?

Vocabulary data

The vocabulary is loaded with the annif loadvoc command and ideally would not need to be reloaded due to a version upgrade. So the suggested policy is that for minor Annif releases, previously loaded vocabularies should keep working. If there are changes to the format on disk, there should be an easy (and if possible, automatic) migration mechanism available so that a full reload is not necessary.

Project/model data

This is probably the most difficult part, since Annif relies a lot on external libraries for the backends, and some of those tend to change frequently in backwards-incompatible ways (e.g. Omikuji), or at least complain about data files saved using previous versions (e.g. scikit-learn vectorizers used by many backends). Some possible policies for minor releases:

  • Old projects must keep working without showing any deprecation warnings.
  • Old projects must keep working, but it's OK to show warnings.
  • Old projects for the core backends (installed by default, e.g. tfidf, svc, mllm, stwfsa) must keep working, but optional backends (e.g. fasttext, omikuji, nn_ensemble) may require retraining.
  • Old projects may require retraining as long as this is clearly communicated to the user in the release notes and proper error messages are displayed when attempting to use incompatible old models.

The first two are probably too strict and would either trigger frequent major releases or hold back important upgrades of external libraries.
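The "proper error messages" part of the last option could be sketched as follows. This is not how Annif stores models; the `version.json` stamp file, the `format_version` key and the exception name are all hypothetical, and the sketch only shows the general idea of refusing to load data written in an unsupported format.

```python
import json
import os


class IncompatibleModelError(Exception):
    """Raised when stored model data was written in an unsupported format."""


def write_stamp(model_dir, version):
    """Record the (hypothetical) data format version next to the model files."""
    with open(os.path.join(model_dir, "version.json"), "w") as f:
        json.dump({"format_version": version}, f)


def check_stamp(model_dir, supported_versions):
    """Return the stored format version, or raise with a clear message."""
    path = os.path.join(model_dir, "version.json")
    try:
        with open(path) as f:
            found = json.load(f)["format_version"]
    except FileNotFoundError:
        found = None  # model predates the stamp file
    if found not in supported_versions:
        raise IncompatibleModelError(
            f"Model data in {model_dir} has format {found!r}, but this "
            f"release reads {sorted(supported_versions)}. "
            "Please retrain the project."
        )
    return found
```

Failing early with an explicit message like this avoids the worse outcome of a model that loads but silently produces poor suggestions.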

Python API

As said above, Annif isn't primarily a software library. Thus, any changes to the Python API (e.g. method signatures) are considered internal and are allowed in minor versions.

Python environment

Annif has tried to maintain support for three consecutive Python versions and this window is occasionally shifted forward. Nowadays new Python minor releases (e.g. 3.9, 3.10, 3.11) are made on a 12-month schedule, with the release made in October. Would dropping support for an old version of Python require a major release of Annif? Possible policies for minor releases:

  • Annif must keep supporting all previously supported Python versions.
  • Annif must keep supporting all previously supported Python versions, as long as they are supported by the upstream Python project; it is OK to drop support for EOL versions.
  • Annif must support three consecutive Python versions; it is OK to drop support for an old Python version as long as any three versions are still working.
  • Annif may drop support for old Python versions.

The first (strictest) option would in the long term imply a new major version of Annif at least once per year, because old Python versions cannot be supported indefinitely. I think the second or third options make the most sense. The second option is possibly too conservative (e.g. support for Python 3.7 could be dropped in June 2023 and for Python 3.8 in October 2024 - but support for 3.7 was dropped in Annif 0.58 released in August 2022!), while the third option reflects the status quo.
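The difference between the second and third options can be made concrete with a small Python sketch. The end-of-life months are the ones quoted above; the function names are hypothetical, the day of the month in the example is arbitrary, and the set of versions remaining after dropping 3.7 is an illustrative assumption.

```python
from datetime import date

# End-of-life months as quoted in the discussion above: (year, month).
EOL = {"3.7": (2023, 6), "3.8": (2024, 10)}


def may_drop_when_eol(version, release_date):
    """Option 2: a Python version may be dropped only once it has reached EOL."""
    return (release_date.year, release_date.month) >= EOL[version]


def has_three_consecutive(versions):
    """Option 3: at least three consecutive 3.x minor versions must remain."""
    minors = {int(v.split(".")[1]) for v in versions}
    return any(m + 1 in minors and m + 2 in minors for m in minors)


if __name__ == "__main__":
    august_2022 = date(2022, 8, 1)  # day of month chosen arbitrarily
    print(may_drop_when_eol("3.7", august_2022))           # option 2 forbids it
    print(has_three_consecutive(["3.8", "3.9", "3.10"]))   # option 3 allows it
```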

Next steps

I invite all Annif users, particularly those involved in production installations, to comment on the above suggestions. After the discussion, we need to turn this into a specification (e.g. a wiki page) that documents the policies on how semantic versioning is applied for Annif releases beyond 1.0.0.

@osma osma added this to the Long term milestone Aug 31, 2022
@osma osma pinned this issue Aug 31, 2022

mo-fu commented Sep 16, 2022

For ZBW the main issue in the past has been the compatibility of serialized models between versions. Unfortunately this is the hardest part, as it can break when dependencies are updated or when the Python version changes. Therefore, also a change of the Python version in the Docker container should lead to a major version increment if the models become incompatible.
In the past we only switched to a new Annif version when we also trained and deployed a new model. In the future ZBW could and probably should establish a kind of integration test. The procedure would spin up a container with the old model files and the new Annif version and test the API against it. Unless there were security reasons, ZBW would probably not upgrade to a minor version if doing so required training new models.
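Such an integration test could look roughly like the Python sketch below, run against a container that has already been started with the old model files. The base URL and project ID are hypothetical, and the endpoint path and response fields (`results`, `uri`, `score`) are assumptions about the v1 REST API that should be checked against its documentation.

```python
import json
import urllib.parse
import urllib.request


def suggest_url(base_url, project_id, api_version="v1"):
    """Build the (assumed) suggest endpoint URL for a project."""
    quoted = urllib.parse.quote(project_id)
    return f"{base_url.rstrip('/')}/{api_version}/projects/{quoted}/suggest"


def validate_suggestions(payload):
    """Return True if the payload is a non-empty list of scored results."""
    results = payload.get("results")
    if not isinstance(results, list) or not results:
        return False
    return all(
        isinstance(r, dict)
        and isinstance(r.get("uri"), str)
        and isinstance(r.get("score"), (int, float))
        and 0.0 <= r["score"] <= 1.0
        for r in results
    )


def smoke_test(base_url, project_id, text):
    """POST a text to the running service and sanity-check the answer."""
    data = urllib.parse.urlencode({"text": text}).encode("utf-8")
    url = suggest_url(base_url, project_id)
    with urllib.request.urlopen(url, data=data, timeout=60) as resp:
        return validate_suggestions(json.load(resp))
```

A stricter variant would compare the returned suggestions against those recorded with the previous Annif version, which would also catch models that still load but behave differently.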

The following will be my thoughts on the other areas mentioned in the issue:

CLI

As the CLI is the main interface taught in the tutorial, it is probably used by most people. Changing the interface and also default values could become a problem for commands that take a long time (train/eval/index). During manual interaction, users could probably adapt but could get frustrated if a command takes a long time to finish and the output is different than expected.
So probably best not to change it between minor versions.

REST API

I concur that there is no issue in adding methods between minor versions.
As long as the old API version is kept, there should be no need to increase the major version.

Web UI

I also agree that it could probably change between minor versions, as users could adapt quickly.

Configuration Files

If default values change you may get a completely different model, possibly breaking the predictive power of your model.

Vocabulary

If you have to break compatibility, you should make sure that old vocabularies cannot be loaded.
I.e., no silent failures that would lead to strange predictions.

Python API

We actually use Annif as a Python library for parameter optimization, but we are aware that this is not intended and the risk is calculated.


osma commented Sep 19, 2022

Thank you for your thoughts @mo-fu ! This is very valuable and we seem to be in agreement on the big issues.

> Therefore, also a change of the Python version in the Docker container should lead to a major version increment if the models become incompatible.

Yes, this is true and should be considered when choosing the version number for a new release.

> In the future ZBW could and probably should establish a kind of integration test. The procedure would spin up a container with the old model files and the new Annif version and test the API against it.

You are of course welcome to do this, but I think whether a new version is expected to work with old model files is something that is generally known around release time and this should be (and has been - at least we've tried!) communicated in the Release Notes. It should not be up to users such as ZBW to discover that models have broken, but verifying it never hurts.


osma commented Sep 19, 2022

There is also one more aspect of compatibility that I forgot above:

Analyzers

Analyzers are used to pre-process text and do things like stemming and lemmatization. But they also change over time, and a word may no longer be lemmatized the same way - see this comment for an actual example. New versions of lemmatizers like spaCy and Simplemma are released quite frequently and we probably want to keep up with them without too much delay.

But there is a problem with old models trained using the old analyzers - they assume certain words are lemmatized in one way, and when that assumption breaks, the model may perform worse than it did before. This could be a particularly big issue for the MLLM lexical model, which relies a lot on words being lemmatized in a consistent way.
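One way to see how much an analyzer upgrade matters for a given vocabulary is to compare the lemmas produced by the old and new versions over a word list. The Python sketch below uses plain callables as stand-ins for the two analyzer versions; the function name and the toy lookup tables are hypothetical and do not reflect real spaCy or Simplemma output.

```python
def lemma_drift(words, old_lemmatize, new_lemmatize):
    """Return {word: (old_lemma, new_lemma)} for words whose lemma changed."""
    changed = {}
    for word in words:
        old, new = old_lemmatize(word), new_lemmatize(word)
        if old != new:
            changed[word] = (old, new)
    return changed


if __name__ == "__main__":
    # Toy stand-ins for two analyzer versions; unknown words map to themselves.
    old_table = {"libraries": "library", "indices": "indice"}
    new_table = {"libraries": "library", "indices": "index"}
    drift = lemma_drift(
        ["libraries", "indices", "annif"],
        lambda w: old_table.get(w, w),
        lambda w: new_table.get(w, w),
    )
    print(drift)
```

Running such a comparison over the vocabulary labels before an upgrade would give an early hint about whether lexical models are likely to need retraining.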


san-uh commented Sep 19, 2022

General
The DNB uses Annif in various use cases. Under the hood, three different vocabularies (GND descriptors, DDC subject categories, DDC short numbers) are in production use for indexing and classification based on full texts and tables of contents in German or English. The workflows are fully automated and run daily. Currently, between 4000 and 7000 publications are processed daily, with a strong upward trend. The machine-generated metadata are written directly into the catalogue system so that they become available immediately for research and for data buyers. Currently, six different Annif models run in Docker containers as a service via the provided REST API.
From DNB's point of view, the REST API is the part of Annif that must be handled most carefully with respect to a strict backward compatibility policy. For the other components a less strict policy might be possible.
It is useful to find a balance between the rules of semantic versioning and the efficient development of Annif components that depend on the functionality, interfaces and default values of rapidly evolving third-party software.

CLI commands
The 1st option is our preferred one, but from DNB's point of view the 2nd option is sufficient, provided that changes to the default values are easily recognisable in the release notes.
We encourage the use of deprecation warnings in minor releases to prepare the user for changes in the near future. In addition, you might want to consider highlighting other lifecycle stages in the client commands, such as "experimental".
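A deprecation warning of this kind can be sketched in Python with the standard `warnings` module. This is a generic pattern, not Annif's implementation; the decorator and the command names `old_command` and `new_command` are hypothetical.

```python
import functools
import warnings


def deprecated(replacement):
    """Mark a function as deprecated and point users to its replacement."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated; use {replacement} instead",
                DeprecationWarning,
                stacklevel=2,  # report the caller's location, not the wrapper's
            )
            return func(*args, **kwargs)
        return wrapper
    return decorator


@deprecated(replacement="new_command")
def old_command():
    """Hypothetical command that still works but will be removed later."""
    return "done"
```

Python hides `DeprecationWarning` by default unless it is triggered by code in `__main__`, so a command-line tool may prefer `FutureWarning` or an explicit message on standard error to make sure end users actually see the notice.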

REST API
We agree. As described in the introduction, from DNB's point of view the REST API is the most important interface of Annif we use in our workflows. The major version number must change on any incompatible change of its behaviour.

Web UI
We agree, it doesn’t make sense to restrict changes to the web UI, because it is not used in DNB’s production workflows (only for some tests). In general, it is helpful to add new REST API functionality to the web UI.

Configuration files
The 1st option is preferred, but as with CLI commands, the 2nd option is sufficient, provided that changes are easily recognisable in the release notes.

Vocabulary data
The fact that previously loaded vocabularies should continue to work with minor Annif versions is not a hard condition for versioning. It should also be acceptable for the user to reload the vocabulary when a version changes. In our experience, reloading the vocabulary is fast enough. It is a mandatory part of our command chain for training models.

Project/model data
The first two options are indeed too strict: option 3 or option 4 are both sufficient. Generally, the main documentation in the Annif Wiki should also be kept up to date in addition to the version notes and appropriate error messages.
From the user's point of view it is not necessary to distinguish between optional and core backends.

Python API
We agree.

Python environment
In general, we agree that of the options mentioned, 3) and 4) would be best; 1) and 2) are far too conservative for the context of Annif software. Since Annif is installed in a pyenv virtual environment, it is more important to encapsulate Annif and its dependencies than to be compatible with a range of older Python versions.
Nevertheless, it should also be stated somewhere (in the version notes?!) which Python versions the developers have tested it with. Installation as a Conda package would also be desirable. This would also simplify the operation of several Annif versions.

Analyzers
Models need to be re-trained and re-tested when a new version of a used analyzer is released. This should not be a fundamental problem for most users of Annif (experimental or production).

Additional: Annif container images
We are currently using the Docker containers provided by Annif. That spares us and other users from having to build and test reusable and system-independent solutions for production use. So a container (with standard functionality) needs to be provided for each minor version. It is absolutely clear that it isn’t possible to provide images for every wacky feature.

Greetings, Christoph & Sandro for the DNB Team


osma commented Oct 10, 2022

Thank you all for your input so far. As a first step towards improving awareness of backward compatibility issues between Annif versions, we have included a new section "Backward compatibility" in the release notes of Annif 0.59 and intend to keep this section in future release notes as well.


juhoinkinen commented Aug 17, 2023

I created a page in the wiki describing the requirements and policies for the backward compatibility of the Annif public API, based on the discussion in this issue.

I am closing the issue, but it can still be commented on.

EDIT: I fixed the link URL since the wiki page got renamed --@osma

@juhoinkinen juhoinkinen modified the milestones: Long term, 1.0 Aug 17, 2023
@osma osma unpinned this issue Sep 13, 2023