Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add extended subresolution support #3119

Merged
merged 27 commits into from
Apr 26, 2018
Merged

Add extended subresolution support #3119

merged 27 commits into from
Apr 26, 2018

Conversation

rleigh-codelibre
Copy link
Contributor

@rleigh-codelibre rleigh-codelibre commented Apr 3, 2018

Intended for improved subresolution support in readers, for adding SUBIFDS support to TIFF and OME-TIFF readers.

  • Add MetadataList and CoreMetadataList classes and unit tests
  • Add SubResolutionFormatReader class as an alternative to FormatReader, using CoreMetadataList
  • Deprecate core-index-related methods in IFormatReader, which are exposing internal implementation details of FormatReader
  • Readers for sub-resolution formats and all TIFF formats use SubResolutionFormatReader
  • Note the subclassing of CoreMetadata in SVSReader to store reader-specific subresolution-specific metadata directly in the CoreMetadataList container

CoreMetadataList replaces the ArrayList<CoreMetadata> in FormatReader, in a new SubResolutionFormatReader class (to avoid any API breaks in FormatReader). The difference is that it's an array of arrays, representing series and resolution, rather than a flat list. This makes lookups O(1) rather than O(n) due to avoiding the need to do a full traversal or series of hops over the list; it's now always two array lookups. It implements the bits of ArrayList needed to be as compatible as possible with the previous implementation, so it's a drop-in replacement in many (but not all) aspects. SubResolutionFormatReader is otherwise a complete duplication of the existing FormatReader.

MetadataList is a generalisation of CoreMetadataList to be used for storing arbitrary metadata by individual readers such as IFD mappings, or whatever, for series and subresolutions. Currently used as the underlying implementation of CoreMetadataList, but will be used by the TIFF and other readers in the future. This could be used to simplify some of the existing readers providing subresolutions, as well as to provide new capabilities, such as SUBIFDS support in the TIFF readers. If you don't subclass CoreMetadata, this allows separate containers for different types of reader-specific metadata.

Testing:

  • Ensure all builds are green and all full repository tests are green (reader regressions are possible)
  • Pay special attention to these formats (check that the series and resolutions are in the same order with both flat and nonflat viewing modes, and that the pixel data is OK):
    • AFI (SVS-derived)
    • Imaris HDF
    • Leica-SCN
    • OMERO Pyramid TIFF
    • SVS

(the full repo test should verify all this, but it wouldn't hurt to check over a random sampling).

@rleigh-codelibre
Copy link
Contributor Author

@sbesson Looks like there's a handful of failures in https://web-proxy.openmicroscopy.org/east-ci/job/BIOFORMATS-test-folder/5669/console . I'm not familiar with this new test setup; it looks like several TIFF-based formats are being tested here, but I'm not sure what the commonality is. Is this multiple formats?

* @param <T> The type of metadata to store.
*/
public class MetadataList<T> {
protected ArrayList<ArrayList<T>> data = new ArrayList<>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it okay to declare data as being simply List<List<T>>?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Annoyingly, it doesn't work. Nested generics cause javac to error out. I tried and failed, and also worked through several ways to do it with Riad, without success.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll fetch your branch and take a look.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

diff --git a/components/formats-api/src/loci/formats/CoreMetadataList.java b/components/formats-api/src/loci/formats/CoreMetadataList.java
index b0b84ab076..6b8d627f24 100644
--- a/components/formats-api/src/loci/formats/CoreMetadataList.java
+++ b/components/formats-api/src/loci/formats/CoreMetadataList.java
@@ -105,7 +105,7 @@ public class CoreMetadataList extends MetadataList<CoreMetadata> {
   }
 
   public void reorder() {
-    for (ArrayList<CoreMetadata> s : data) {
+    for (List<CoreMetadata> s : data) {
       Collections.sort(s, comparator);
     }
   }
diff --git a/components/formats-api/src/loci/formats/MetadataList.java b/components/formats-api/src/loci/formats/MetadataList.java
index 5fa7215b50..b1a09f9ac3 100644
--- a/components/formats-api/src/loci/formats/MetadataList.java
+++ b/components/formats-api/src/loci/formats/MetadataList.java
@@ -52,7 +52,7 @@ import java.util.ArrayList;
  * @param <T> The type of metadata to store.
  */
 public class MetadataList<T> {
-  protected ArrayList<ArrayList<T>> data = new ArrayList<>();
+  protected List<List<T>> data = new ArrayList<>();
 
   // Construct an empty list.
   public MetadataList() {

Does this not suffice? It builds cleanly for me.

/**
* Initializes a reader from the input file name.
*
* Calls {@link #initFile(String id)} to initializes the input file, reads
Copy link
Member

@mtbc mtbc Apr 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

initialize, etc. - some odd tense agreement issue in this sentence

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, these are inherited directly from the FormatReader javadocs. I'll fix both up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My issue was with the sentence starting where this comment is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sorry, fixed this now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 ta

String name = "Series " + series;
try {
String realName = ((IMetadata) store).getImageName(series);
if (realName != null && realName.trim().length() != 0) {
Copy link
Member

@mtbc mtbc Apr 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right. However, we don't currently have any apache-commons dependencies in formats-api, so I'm not sure if it's worth adding for the sake of one line, which was already in this form in FormatReader.

@sbesson
Copy link
Member

sbesson commented Apr 4, 2018

@rleigh-codelibre each BIOFORMATs-test-folder job runs against one top-level folder under the curated data repository. Recent top-level folders match a given file format but this rule does not apply across the board. The error above seemed to be transient and was fixed by a re-run against the zeiss-tiff subdirectory (see log)

Probably, the first issue to resolve in order to test this against the data repository is to fix the Docker image build logic. Consuming the east integration branch leads to error when retrieving SNAPSHOT dependencies - see also ome/bio-formats-build#11.

@rleigh-codelibre
Copy link
Contributor Author

@rleigh-codelibre
Copy link
Contributor Author

rleigh-codelibre commented Apr 4, 2018

Closing temporarily so we can see if tomorrow's build still has failures; may be some configuration issues with the new docker image and/or the repo config being used which are independent of this PR and should be identified and fixed separately.

@rleigh-codelibre
Copy link
Contributor Author

protected int resolution = 0;

/**
*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?


/** Adds an entry to the global metadata table. */
protected void addGlobalMeta(String key, boolean value) {
addGlobalMeta(key, new Boolean(value));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The static valueOf methods are the ones to use here and below.

protected void addMetaList(String key, Object value,
Hashtable<String, Object> meta)
{
Vector list = (Vector) meta.get(key);
Copy link
Member

@mtbc mtbc Apr 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to get, the remove gives it to us. (Also below.)


/** Adds an entry to the metadata table for the current series. */
protected void addSeriesMeta(String key, boolean value) {
addSeriesMeta(key, new Boolean(value));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

valueOf

}
catch (IllegalArgumentException e) {
throw new FormatException("Image plane too large. Only 2GB of data can " +
"be extracted at one time. You can workaround the problem by opening " +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

work around


if (hasFlattenedResolutions()) {
return 1;
}
Copy link
Member

@mtbc mtbc Apr 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

close up linebreak to } else { (also below)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I'd like this, it's actually the other way around in terms of code style!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, fair enough, another OMERO/BF style difference. Certainly keeps me on top of my options dialogs in IDEs!

/* @see IFormatHandler#setCoreIndex(int) */
@Override
public void setCoreIndex(int no) {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove blank line

setupService();
Hashtable<String, Object> allMetadata =
new Hashtable<String, Object>();
allMetadata.putAll(metadata);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a copy constructor you can use instead.

import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.*;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually try to avoid wildcard imports.


if (c == null || i == null) {
throw new FormatException("Error setting core metadata for image number " + s);
throw new FormatException("Error setting core metadata for series " + series + " resolution" + resolution);
Copy link
Member

@mtbc mtbc Apr 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space or = before resolution

@@ -42,6 +42,7 @@
import java.util.regex.Pattern;
import java.util.regex.Matcher;

import loci.formats.*;
Copy link
Member

@mtbc mtbc Apr 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try to avoid wildcard imports.

ms.pixelType = ifd.getPixelType();
ms.metadataComplete = true;
ms.interleaved =
ms.sizeX > MAX_SIZE && ms.sizeY > MAX_SIZE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting it's a && but outwith this PR.

import loci.formats.FormatTools;
import loci.formats.ImageTools;
import loci.formats.MetadataTools;
import loci.formats.*;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wildcard again

@sbesson
Copy link
Member

sbesson commented Apr 17, 2018

I have not reviewed the code yet but assuming the repository tests have been passing with these changes and there are no blockers found by the reviewers, my vote would be in favor of merging this into the east integration branch and capturing all outstanding comments and questions (cleanups, reader, API...) as a 5.9.0 card so that they can be addressed in follow-up PRs. A final review should happen when the east integration branch is proposed for the integration into develop.

@mtbc
Copy link
Member

mtbc commented Apr 17, 2018

No blockers from me.

Copy link
Member

@joshmoore joshmoore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few specific comments followed by:

  • Depending on the List<T> discussion, we should come back to the name of the new format reader class.
  • Also depending on that discussion, I could imagine maintaining a get(0) which calls out to get(0, 0) to reduce some of the changes here.

Leaving the PR open for a few more days for anyone to review the API, but I agree with @sbesson: I don't foresee us getting all the comments made on this PR done with a single merge. Happy to have east used as the integration branch.

* @param <T> The type of metadata to store.
*/
public class MetadataList<T> {
protected List<List<T>> data = new ArrayList<>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous work on core metadata moved from a T[] to a List<T>. This moves to a List<List<T>>. Either I think that definition should be at the heart of the specific FormatReader class, i.e. when one is implementing a reader one chooses the CoreMetadataListFormatReader or the CoreMetadataListListFormatReader or we should think about future-proofing a bit. e.g. what would it mean to add a third element here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wouldn't mean anything to add a third element right now. The split is series/resolution, and this will be sufficiently future-proof for subresolutions in z.

If we want to get more complex, I think we would have to move to an entirely new interface. This is all an artefact of being backward compatible with the old FormatReader API when adding subresolutions; ideally we would remove most of the coreIndex-related complexity entirely, but that's not currently an option.

// -- Constants --

/** Default thumbnail width and height. */
protected static final int THUMBNAIL_DIMENSION = 128;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of these static fields make me wonder if we need to introduce some common element (via inheritance or composition) so the values aren't cut-n-pasted. At the same time, this new class may be a chance to move them away from protected in favor of private.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anything common should probably go into FormatTools.

/* @see IFormatReader#getOptimalTileHeight() */
@Override
public int getOptimalTileHeight() {
FormatTools.assertId(currentId, true, 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the constants above, the functionality in this class should be refactored elsewhere.

@@ -52,6 +49,16 @@
import ome.units.quantity.Length;
import ome.units.UNITS;

class SVSCoreMetadata extends CoreMetadata {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is access needed to these fields or can this be private static?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be package-private already?

@@ -64,16 +71,12 @@
LoggerFactory.getLogger(SVSReader.class);

/** TIFF image description prefix for Aperio SVS files. */
public static final String APERIO_IMAGE_DESCRIPTION_PREFIX = "Aperio Image";
private static final String APERIO_IMAGE_DESCRIPTION_PREFIX = "Aperio Image";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that I don't approve of more privates, but what's the reasoning behind this change? Do we know there are no consumers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a static string for a reader implementation with no reason for anyone to be using it. There might be a "consumer", but it's most unlikely. This should never have been public. I can revert it if required.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how we could know there are no consumers? But could at least deprecate now if we'd like to get rid of this.

@@ -878,7 +878,7 @@ public boolean isInterleaved(int subC) {
}
catch (IllegalArgumentException e) {
throw new FormatException("Image plane too large. Only 2GB of data can " +
"be extracted at one time. You can workaround the problem by opening " +
"be extracted at one time. You can worka round the problem by opening " +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Argh. Fixed now, thanks.

rleigh-codelibre and others added 22 commits April 25, 2018 14:51
…ss setting the series"

This reverts commit b877f07.

The logic in FileStitcher looks a bit inconsistent, so leave well alone for now.
@rleigh-codelibre
Copy link
Contributor Author

Rebased onto current east following the merge of 5.8.2.

@rleigh-codelibre
Copy link
Contributor Author

@sbesson sbesson merged commit 2daf7c6 into ome:east Apr 26, 2018
@rleigh-codelibre rleigh-codelibre deleted the series-metadata branch April 26, 2018 19:43
@joshmoore
Copy link
Member

With the rebasing, is there any capture of which suggestions were made here that have not yet been addressed?

@rleigh-codelibre
Copy link
Contributor Author

I don't think the rebase lost any suggestions (I was pleasantly surprised, must be a recent change). Every suggestion has been fixed or commented on.

The main outstanding item would be your suggestion to factor out common code from the two implementations. I've already done so for a lot of the convenience methods (into MetadataTools). The remaining things can be factored out as needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants