Fix minimum_id algorithm #65

jhlee-mitre · 2022-11-11T02:41:44Z

Summary

New algorithm to check whether one FHIR resource is a "minimum" of another FHIR resource.

New behavior

Given two FHIR resource instances A and B, A is considered a minimum of B if and only if:
B has all the elements of A, with exactly the same element names and values (codes). Ranges and pre/post-coordination are not considered.
The hierarchy and level of each element from A must be the same in B. Otherwise it is considered a different element.
Order doesn't matter.
Duplication doesn't matter. If B has two duplicate sets (which isn't a realistic case in FHIR, though), the algorithm stops when it finds the first set.

Code changes

Added minimum_id.rb under lib. It will be included in assertion.rb after testing.

Testing guidance

ruby lib/testscript_engine/minimum_id.rb

karlnaden · 2022-11-11T05:48:27Z

I added example files for test cases from the ticket I raised: #62. These appear to pass, which is a good sign.

Granted that I'm reading the code too late at night to develop a deep understanding, I don't fully understand the implementation approach you've taken here. I'd be interested in a narrative description of the steps you intend the algorithm to take, because I think that would help me understand it better.

Initial observations:

You seem to do a depth-first traversal of the specification instance (min_obj) and, each time you find an individual value, a depth-first search of the actual instance being checked (tar_obj). This feels inefficient compared with, say, recursing through both instances together. Additionally, as implemented, it doesn't take into account values in fields at higher levels that give meaning to lower levels. I added two examples that try to illustrate the problems:
patient_two_names_min.json indicates an "official" name (Joseph) and a "usual" name (Joey), but patient_two_names_jumbled.json lists Joey as the "official" name and vice versa. This should fail because even if you find "Joey" at the right level, it isn't correct if it is found as part of the official name entry. (Note: based on the output, the logic may not even be comparing list entries that are primitive values.)
In a capability statement example, mCODE_CapabilityStatement_exampleServer_shouldFail.json, I moved the "code:in" SearchParameter into the Observation resource entry instead of the Condition resource entry. This should fail for the same reason as above, but code:in is still found and validated even though it is not in the right place.
You take what I would call a stateful or imperative approach to the implementation, where you have a flag variable that gets set. I think we will be better served here by a stateless / functional approach where each function returns the answer to a minimumId check on a specific part of the tree. Then, when evaluating e.g. a hash / JSON object, you check the minimumId property for all of the keys in min_obj, and if any returns false, the whole function returns false. I think that will help make the code easier to understand and easier to test.
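
A minimal sketch of the stateless shape described above, not the code in this PR. The method name minimum_of? and the inline patient data are hypothetical, written only to mirror the two-names example; arrays are matched without regard to order.

```ruby
# Hypothetical sketch: minimum_of? is an illustrative name, not the PR's code.
# Returns true when every element of min_obj is present in tar_obj at the
# same level of the hierarchy.
def minimum_of?(min_obj, tar_obj)
  case min_obj
  when Hash
    return false unless tar_obj.is_a?(Hash)
    # Every key in the minimum must exist in the target and match recursively.
    min_obj.all? do |key, min_val|
      tar_obj.key?(key) && minimum_of?(min_val, tar_obj[key])
    end
  when Array
    return false unless tar_obj.is_a?(Array)
    # Order is ignored: each minimum entry needs some matching target entry.
    min_obj.all? do |min_item|
      tar_obj.any? { |tar_item| minimum_of?(min_item, tar_item) }
    end
  else
    # Primitive values must be equal.
    min_obj == tar_obj
  end
end

# Illustrative data only (assumed shapes, not the contents of the example files).
min_names = { "name" => [{ "use" => "official", "given" => ["Joseph"] },
                         { "use" => "usual", "given" => ["Joey"] }] }
reordered = { "name" => [{ "use" => "usual", "given" => ["Joey"], "family" => "Smith" },
                         { "use" => "official", "given" => ["Joseph"] }] }
jumbled = { "name" => [{ "use" => "official", "given" => ["Joey"] },
                       { "use" => "usual", "given" => ["Joseph"] }] }

puts minimum_of?(min_names, reordered) # true
puts minimum_of?(min_names, jumbled)   # false
```

Because each list entry is compared as a whole, "Joey" only counts when it sits inside the entry whose "use" also matches, which is the behavior the jumbled example is meant to enforce.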

jhlee-mitre · 2022-11-14T13:02:56Z

Great comments! It will need a few iterations to become a quality algorithm. Let me go back and try a second round of polishing the algorithm/code.

jhlee-mitre · 2022-11-14T14:40:21Z

Short description of the new commit:

What the algorithm does:

Given two FHIR resources, each one is flattened into a Hash keyed by path. For example:

{
  "resourceType": "Patient",
  "identifier": [
    {
      "use": "usual"
    }
  ]
}

may be converted to a Hash:

{ 
"resourceType" => "Patient",
"identifier.0.use" => "usual"
}

For now this algorithm is "order-unaware". It is easy to make it order-aware, and later we can make that configurable.

exam_minimum() compares two flattened Hashes to evaluate whether one is a minimum of the other.
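
As a rough illustration of the flattening step, here is a sketch under assumptions: flatten_with_path is a hypothetical helper name, and the subset test at the end uses Ruby's built-in Hash#>= rather than the exam_minimum() in this commit.

```ruby
# Hypothetical sketch: flatten_with_path is an illustrative helper name.
# Walks nested Hashes/Arrays and records each primitive value under a
# dot-separated path such as "identifier.0.use".
def flatten_with_path(obj, prefix = nil, out = {})
  case obj
  when Hash
    obj.each do |key, val|
      flatten_with_path(val, [prefix, key].compact.join("."), out)
    end
  when Array
    obj.each_with_index do |val, index|
      flatten_with_path(val, [prefix, index].compact.join("."), out)
    end
  else
    out[prefix] = obj
  end
  out
end

patient = { "resourceType" => "Patient", "identifier" => [{ "use" => "usual" }] }
flat = flatten_with_path(patient)
puts flat == { "resourceType" => "Patient", "identifier.0.use" => "usual" } # true

# Assumed comparison: Hash#>= is true when the right-hand Hash is a subset.
target = flatten_with_path(patient.merge("gender" => "male"))
puts target >= flat # true
```

A plain path-by-path subset test like this is sensitive to array indices, so on its own it is not the order-unaware comparison described above.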

jhlee-mitre · 2022-11-21T18:54:30Z

Integrated the recursive minimum_id check into the engine. The previous one (deep_merge) was replaced.

jhlee-mitre · 2022-11-21T18:54:58Z

It doesn't contain a TestScript to test it yet.

karlnaden · 2022-11-22T04:07:51Z

Added some TestScripts for minimumId that use just fixtures, including:

a reflexive TestScript that should succeed (a minimumId a)
a symmetric TestScript that should fail (a minimumId b, but not b minimumId a)

These didn't quite work as expected, and the logic wasn't giving any hints as to why the unexpected failures occurred, so I started to add some basic error details and ended up fixing a few things while doing that:

Needed to convert FHIR object representations to hashes before passing into the check_minimum_id function
Both check_minimum_id_hash and check_minimum_id_array were returning the result for the last key/index checked, rather than keeping track of whether any failures had occurred
Both check_minimum_id_hash and check_minimum_id_array were returning false when the check succeeded; they should return true, to be consistent with the top-level check_minimum_id function
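
To make the "last key checked" problem concrete, here is a small hypothetical illustration on flat hashes. Neither method is the PR's check_minimum_id_hash; the names last_result_check and tracking_check are invented for this sketch.

```ruby
# Hypothetical illustration; these are not the PR's check_minimum_id_hash.

# Buggy shape: the return value is whatever the last key produced,
# so an earlier mismatch is silently overwritten.
def last_result_check(min_hash, tar_hash)
  result = true
  min_hash.each { |key, val| result = (tar_hash[key] == val) }
  result
end

# Fixed shape: remember every failing key and pass only if there are none.
def tracking_check(min_hash, tar_hash)
  failures = min_hash.reject { |key, val| tar_hash[key] == val }.keys
  [failures.empty?, failures]
end

min_hash = { "gender" => "male", "active" => true }
tar_hash = { "gender" => "female", "active" => true }

puts last_result_check(min_hash, tar_hash) # true, the gender mismatch is lost
passed, failures = tracking_check(min_hash, tar_hash)
puts passed           # false
puts failures.inspect # ["gender"]
```

Returning the list of failing keys alongside the boolean is one way to supply the failure hints mentioned above.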

Finally, note that rather than try to use the template Jack created for the compare results, I created a minimumId-specific return.

Please take a look and let me know what you think.

A last thing I'd like to do somewhere as a part of this commit is provide brief documentation of the logic behind our implementation of the minimumId assertion. It is possible that this should live in a FHIR implementation guide eventually, but I think that would be something we tackle in the new year, so I'd probably just put it in the README for simplicity.

jhlee-mitre · 2022-11-22T14:31:38Z

I tested on my end and it looks good. Just added one more test for fun.

Minor thing: if an error occurs, it's always when a resource in the minimum fixture is missing in the actual fixture. So it's about difference, but one direction. Would this be clearer wording for the error message?

Actual resource differed from content in fixture '#{assert.minimumId}'
-->
Actual resource from content in fixture '#{assert.minimumId}' wasn't found in the response '#{assert.sourceId}'

For the templates for error message handling, we may consider refactoring it to accommodate more flexibility. I agree with handling them individually meanwhile.
It sounds good to add the documentation into README. If it's overwhelming, we may use Wiki for more details (e.g. algorithm) and keep link from README.

karlnaden · 2022-11-22T15:14:37Z

Minor thing: if an error occurs, it's always when a resource in the minimum fixture is missing in the actual fixture. So it's about difference, but one direction.

Tried some new wording, see what you think. For each location in the minimumId fixture, there are two failure cases:

Corresponding data found at that location in the actual resource, but it is different
No data found at that location in the actual resource

I collapse those into "differences". While I agree that could potentially be misleading, I think realistically, the feedback we're giving for minimumId right now likely still needs to be improved for realistic use. However, I don't think it is worth spending further time on it now.
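
For reference, the two failure cases could be told apart along these lines. This is only a sketch on flat hashes: difference_messages is a hypothetical name, and the message wording is invented rather than taken from the engine.

```ruby
# Hypothetical sketch: difference_messages is an illustrative name.
# Reports one message per location in the minimum that fails, separating
# "no data found" from "data found but different".
def difference_messages(min_hash, tar_hash)
  min_hash.each_with_object([]) do |(key, min_val), messages|
    if !tar_hash.key?(key)
      messages << "#{key}: no data found in the actual resource"
    elsif tar_hash[key] != min_val
      messages << "#{key}: expected #{min_val.inspect}, found #{tar_hash[key].inspect}"
    end
  end
end

minimum = { "gender" => "male", "birthDate" => "1970-01-01", "active" => true }
actual = { "gender" => "female", "active" => true }

puts difference_messages(minimum, actual)
```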

For the templates for error message handling, we may consider refactoring it to accommodate more flexibility. I agree with handling them individually meanwhile.

Agreed - could be done better, but not a priority now

It sounds good to add the documentation into README. If it's overwhelming, we may use Wiki for more details (e.g. algorithm) and keep link from README.

Agreed. Recommend we avoid setting up a new thing for documentation at this point

jhlee-mitre

I found that "TestScript Engine" and "About the project" were overlapping, so I merged them. Thanks for adding another polish. It looks great! If you think the PR is good, approve and I will merge.

WIP test new algorithm to examine minimum_id

debaba2

jhlee-mitre requested a review from karlnaden November 11, 2022 02:43

jhlee-mitre and others added 5 commits November 10, 2022 23:36

WIP test new algorithm to examine minimum_id

d3cdc4d

Merge branch 'fix_minimum_id' of github.com:fhir-crucible/testscript-…

16c7823

…engine into fix_minimum_id

additional minimumId test cases

ea8c124

add expected failure case

5e74366

another failure case example

510054e

Change algorithm based on path aware flattening

96a1c17

jhlee-mitre and others added 3 commits November 16, 2022 14:51

Add rspec

7784c6f

Moved minimum_id_spec file

ba674ac

Delete minimum_id_spec.rb

2431550

Remove duplicate

jhlee-mitre changed the title ~~WIP Fix algorithm to examine minimum_id~~ (WIP) Fix algorithm to examine minimum_id Nov 16, 2022

jhlee-mitre self-assigned this Nov 16, 2022

jhlee-mitre changed the title ~~(WIP) Fix algorithm to examine minimum_id~~ (WIP) Fix minimum_id algorithm Nov 16, 2022

jhlee-mitre and others added 4 commits November 18, 2022 14:20

Rebuild based on Karl's recursive algorithm

5e44c2e

Gemfile

edb1cfd

Merge branch 'main' into fix_minimum_id

fe846b2

Replace deep_merge by recursive minimum_check

75613a5

jhlee-mitre changed the title ~~(WIP) Fix minimum_id algorithm~~ Fix minimum_id algorithm Nov 21, 2022

jhlee-mitre marked this pull request as ready for review November 21, 2022 18:53

jhlee-mitre and others added 4 commits November 21, 2022 14:17

Unit test

9488fdb

Quick fix

c257f05

Quick fix minimum_id()

f0fb45b

fix logic + tests, add failure hints + TestScripts

94d225c

Add one more unit test

351a4d1

error message update, indicate expected failure

9b16f82

jhlee-mitre and others added 2 commits November 26, 2022 11:26

Update README.md

5fd8fd2

Polished wording; Added minimum_id logic explanation.

proposed readme updates

ac9628f

jhlee-mitre commented Nov 28, 2022

documentation tweak

3758b15

karlnaden approved these changes Nov 28, 2022

jhlee-mitre merged commit 1db222e into main Nov 28, 2022

jhlee-mitre deleted the fix_minimum_id branch November 28, 2022 13:26

jhlee-mitre mentioned this pull request Nov 28, 2022

minimumId implementation doesn't allow subsets within list entries #62

Closed
