Sorting of "newspaper year"-processes in METS-file of newspaper-process #5821

Open
andre-hohmann opened this issue Oct 24, 2023 · 8 comments

@andre-hohmann
Collaborator

andre-hohmann commented Oct 24, 2023

Describe the bug
When new processes for newspaper issues are created, processes for the year levels are created as well. The year processes are ordered by Kitodo ID, although this does not always reflect the intended order of the processes.

This happens, for example, when processes for older issues are created after those for newer ones, because the older issues needed to be repaired and could only be digitized later.
This can be solved by rearranging the order manually in the metadata editor, but this is cumbersome.

To Reproduce
Steps to reproduce the behavior:

  1. Create processes of newspaper issues for 1874, 1873, 1872, 1871, ...
  2. Open the processes of the newspaper title in the metadata editor and in the METS file
  3. Check the sorting of the year processes

Expected behavior
The processes of the years should be ordered by the values in the METS-attribute ORDERLABEL.
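
As an illustration of the expected ordering only, here is a minimal sketch in Python (not Kitodo code). It assumes a heavily simplified logical structMap in which each year is a `mets:div` with an `ORDERLABEL`; the sample XML and the helper name `sort_children_by_orderlabel` are made up for this example, and the real METS file will look different.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"

# Hypothetical, heavily simplified structMap; years are listed in creation order.
SAMPLE = """
<mets:structMap xmlns:mets="http://www.loc.gov/METS/" TYPE="LOGICAL">
  <mets:div TYPE="newspaper">
    <mets:div TYPE="year" ORDERLABEL="1874"/>
    <mets:div TYPE="year" ORDERLABEL="1873"/>
    <mets:div TYPE="year" ORDERLABEL="1872"/>
    <mets:div TYPE="year" ORDERLABEL="1871"/>
  </mets:div>
</mets:structMap>
""".strip()


def sort_children_by_orderlabel(parent):
    """Reorder the direct mets:div children of `parent` by ORDERLABEL, in place."""
    children = parent.findall(f"{{{METS_NS}}}div")
    # Plain string comparison: sufficient for four-digit years, not for
    # arbitrary labels. sorted() is stable, so equal labels keep their order.
    ordered = sorted(children, key=lambda div: div.get("ORDERLABEL", ""))
    for child in children:
        parent.remove(child)
    parent.extend(ordered)


root = ET.fromstring(SAMPLE)
newspaper = root.find(f"{{{METS_NS}}}div")
sort_children_by_orderlabel(newspaper)
labels = [div.get("ORDERLABEL") for div in newspaper]
print(labels)
```

The string comparison is a deliberate simplification; labels that are not uniform four-digit years would need a different sort key.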

Screenshots
In the following example, the processes up to 1896 were migrated. The others were created after the migration.

[Screenshot: sortingYear02]

Release
3.6.0-SNAPSHOT from 03/01/2023

Desktop (please complete the following information):

  • OS: Windows
  • Browser: Chrome

Additional context
This issue is related to:

@BartChris
Collaborator

BartChris commented Nov 10, 2023

@andre-hohmann Are you talking only about newspaper issues? I see this as a general problem (as it is also relevant for periodical volumes and multi-volume works), which was also discussed in one group at the Kitodo User Group Meeting this week.
The correct sorting of subprocesses of a parent process in the presentation system can be achieved in two ways: a) by manually sorting the processes in Kitodo, or b) by using the "ORDERLABEL" attribute to sort the processes independently of the ordering in the exported METS, which is defined by Kitodo production.

I know that some libraries rely on the ORDERLABEL to achieve correct ordering in their respective presentation system, because reordering the volumes in Kitodo is a lot of work when they were not imported in volume order. On the other hand, the reordering feature in Kitodo was introduced exactly for the purpose of bringing subprocesses into the right order.

I am not sure if it is a good idea to automatically sort imported processes based on the ORDERLABEL while importing since the question would be when exactly this should be the case. (Only for newspapers? Always? Does that make ORDERLABEL a required field?)

What do you think of a button in the metadata editor that would sort the processes by ORDERLABEL? That would allow us to keep the current behavior: new processes are always appended at the end by default (this is definitely the case for mass import), but can be reordered based on the ORDERLABEL. One could then check whether all processes involved have an ORDERLABEL and only enable sorting in that case. That would also allow syncing the ordering in Kitodo with an ORDERLABEL attribute that is derived from the catalogue. We can maybe also discuss this in the @kitodo/kitodo-community-board. cc @matthias-ronge
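
The "only enable sorting if every process has an ORDERLABEL" check could look roughly like the following Python sketch. `ChildProcess`, `can_sort_by_orderlabel` and `sort_by_orderlabel` are hypothetical names for illustration and do not correspond to existing Kitodo classes or methods.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ChildProcess:
    # Hypothetical stand-in for a linked child process; not a Kitodo class.
    process_id: int
    order_label: Optional[str]


def can_sort_by_orderlabel(children: List[ChildProcess]) -> bool:
    """The sort button would only be enabled if every child has a non-empty label."""
    return all(child.order_label for child in children)


def sort_by_orderlabel(children: List[ChildProcess]) -> List[ChildProcess]:
    """Return a new list ordered by ORDERLABEL; refuse if any label is missing."""
    if not can_sort_by_orderlabel(children):
        raise ValueError("at least one child process has no ORDERLABEL")
    # Stable sort: children with identical labels keep their current order.
    return sorted(children, key=lambda child: child.order_label)


complete = [ChildProcess(10, "1874"), ChildProcess(11, "1873"), ChildProcess(12, "1872")]
incomplete = complete + [ChildProcess(13, None)]

print(can_sort_by_orderlabel(complete))
print([c.process_id for c in sort_by_orderlabel(complete)])
print(can_sort_by_orderlabel(incomplete))
```

Returning a new list instead of sorting in place leaves the default append-at-end order untouched until the user explicitly asks for the sort.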

@andre-hohmann
Collaborator Author

@BartChris :
In this issue, I am talking only about the "year-processes" (which are shown in the metadata editor of newspaper processes), because for processes on this level neither the metadata nor the sorting can be adjusted during the import. In addition, only a few metadata fields are created that could be used to sort the processes.

For other document types there are opportunities to adjust the sorting of the subordinate processes. We can think about another general approach, but we should take into account the following aspects:

The processes of newspaper issues in the "year-processes" seem to be sorted correctly, due to the elements "month" and "day".

You are right that this is relevant for periodicals and multi-volume works, too. However, the order of their subordinate processes (volumes) can be influenced during the manual import (Mehrbändige Werke, Zeitschriften). As we do not use the mass import, I have not considered this use case.

The information, which is shown in the list can be configured in the ruleset by use="displaySummary":

For the migration, it was decided to apply the key CurrentNoSorting to sort the volumes, which worked well in our migration:

If some processes are not in the correct order, we correct the order manually. Therefore, the order of the processes in the metadata editor has been unlocked:

For us, this works well, because subordinate processes are rarely sorted incorrectly, at least so far.
The ORDERLABEL might not work well in our case, because for volumes with titles, ORDERLABEL contains the title of the volume.
In the end, we can adjust the order of "year-processes" manually, too.

As there might be different approaches to capturing sorting information (metadata key, document type, ...), this seems to be a topic for the @kitodo/kitodo-community-board, at least to ensure that we have the same understanding of it and to gather use cases.

@andre-hohmann
Collaborator Author

andre-hohmann commented Feb 2, 2024

Further information:

It seems that processes of newspaper issues that are created after existing ones are sorted in a similar way:
not by month and day like the existing ones, but by internal ID or creation date. They always appear at the end of the structure tree.

In this case, too, it may not have consequences for the depiction in the calendar, but the METS file looks strange.

It seems as if the described sorting is applied only if the month of the newly created issue-process does not exist in the year-process. If the month already exists, the issue-process seems to be sorted correctly according to the month and day values.
Edit 2024-02-08: All processes are appended at the end - see comments below.

@apiller

apiller commented Feb 5, 2024

https://dfg-viewer.de/show?tx_dlf%5Bid%5D=https%3A%2F%2Fopendata2.uni-halle.de%2Foai%2Fdd%3Fverb%3DGetRecord%26metadataPrefix%3Dmets%26identifier%3Doai%3Aopendata2.uni-halle.de%3A1516514412012%2F182098&cHash=e377843f8a3b175a777c84d3431e3436
is a nice example of the consequences of this bug. It's OK to move the years to the right position, but it's definitely too much effort to do it within every single month.

@matthias-ronge
Collaborator

When creating newspaper processes with the calendar, an annual process is created for each year; the annual processes are appended to the overall newspaper process. So they always get added at the end. As a result, their order is inconsistent. At the time of software design, the only plan was to create a newspaper as a whole all at once, not incrementally. Remember that the calendar editor was part of the Create Process Screen in version 2.

Goal: New year processes should be inserted in the complete edition in the correct place, not at the bottom.
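
A minimal sketch of "insert in the correct place" using a binary search, written in Python for illustration and assuming the existing year list is already in ascending order. The function name `insert_year` is hypothetical, and returning the existing position for a year that is already present is an assumption of this sketch, not confirmed behavior.

```python
import bisect
from typing import List


def insert_year(years: List[int], new_year: int) -> int:
    """Insert `new_year` into the ascending list `years` and return its index.

    If the year is already present, nothing is inserted and the index of the
    existing entry is returned (assumption: an existing year is reused).
    """
    position = bisect.bisect_left(years, new_year)
    if position < len(years) and years[position] == new_year:
        return position
    years.insert(position, new_year)
    return position


years = [1871, 1873, 1874]
first = insert_year(years, 1872)   # lands between 1871 and 1873
second = insert_year(years, 1872)  # already present, list unchanged
print(first, second, years)
```

If the existing children are not sorted to begin with (as in the cases reported here), a binary search gives no useful position, so a one-time full sort would be needed first.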


What happens if you create additional processes for an existing year? Then—I assume—the same year is produced a second time, right?

@apiller

apiller commented Feb 7, 2024

No, not anymore. Right now the additional issues are children of the "old" year. But unfortunately they are placed at the end of the list and not within their sequence. As I said, it's OK to sort about 30 years, but sorting issues within a list of 600 others is a punishment.

@andre-hohmann
Collaborator Author

Here are some corrections and additions:

correction

It seems as if the described sorting is applied only if the month of the newly created issue-process does not exist in the year-process. If the month already exists, the issue-process seems to be sorted correctly according to the month and day values.

This is not correct. At SLUB Dresden, the test scenario was not a realistic one. @apiller's observation is correct.

additions

is a nice example of the consequences of this bug

See newspaper issues 1928-01-08, 1928-01-15, 1928-01-22, 1928-01-29, which are appended at the end of the month. At first glance, everything looks correct. Further examples can be found in January 1924 (Jahrgang 37).
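
Because such misplaced issues are easy to overlook, a small check like the following Python sketch could list them. It assumes the issue dates are available in the order they appear in the METS file; the function name `find_out_of_sequence` and the sample dates are hypothetical and not taken from the linked record.

```python
from datetime import date
from typing import List


def find_out_of_sequence(issue_dates: List[date]) -> List[date]:
    """Return every date that appears after a later date in the list."""
    misplaced = []
    latest_seen = None
    for current in issue_dates:
        if latest_seen is not None and current < latest_seen:
            # An earlier issue shows up after a later one: appended out of order.
            misplaced.append(current)
        else:
            latest_seen = current
    return misplaced


# Made-up month: three issues in order, then two older ones appended at the end.
sample = [
    date(1928, 1, 1),
    date(1928, 1, 5),
    date(1928, 1, 31),
    date(1928, 1, 8),
    date(1928, 1, 15),
]
misplaced = find_out_of_sequence(sample)
print([d.isoformat() for d in misplaced])
```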

sort issues within a list of 600 others is a punishment

Due to the long loading times (more than 15 minutes) when opening year processes with more than 200 (up to 600) child processes in the metadata editor, this is in some cases simply not technically possible.
There has been a related issue regarding the response time, but I do not know whether it was worse before that issue was solved or whether another problem exists.

@andre-hohmann
Collaborator Author

Another example from SLUB Dresden is the following one of "Schönburger Tageblatt und Waldenburger":

andre-hohmann changed the title from 'Sorting of processes of "newspaper year"-processes' to 'Sorting of "newspaper year"-processes in METS-file of newspaper-process' on Aug 1, 2024