Describe isPartOf relations in DCAT-US dataset as Collection in catalog-next #4969

Open
9 of 10 tasks
FuhuXia opened this issue Nov 6, 2024 · 2 comments
Assignees
Labels
H2.0/Harvest-General General Harvesting 2.0 Issues

Comments

@FuhuXia
Member

FuhuXia commented Nov 6, 2024

User Story

In order to show the relations between DCAT-US datasets that are linked to each other through the isPartOf field, the data.gov team wants to group those datasets into Collections in catalog-next.

Acceptance Criteria

[ACs should be clearly demoable/verifiable whenever possible. Try specifying them using BDD.]

Parent dataset behavior:

  • GIVEN a dataset's identifier is the value of the isPartOf field of other datasets
    AND they are from the same harvest source
    WHEN the dataset is listed in a search result
    THEN a collection icon shows up to indicate it is a Collection

  • GIVEN a dataset's identifier is the value of the isPartOf field of other datasets
    AND they are from the same harvest source
    AND all child datasets are in the deleted state
    WHEN the dataset is listed in a search result
    THEN there is NO collection icon on this dataset

  • GIVEN a dataset's identifier is the value of the isPartOf field of other datasets
    AND they are from the same harvest source
    WHEN the dataset detail page is visited
    THEN there is a block to indicate this is a Collection
    AND there is a link to the Collection page.

  • GIVEN a Collection page is visited
    THEN all the collection's child datasets are listed
    AND you can search within this Collection.
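
The parent criteria above can be sketched over in-memory records. This is a minimal illustration, not the code from the linked PR: the dictionary keys (`identifier`, `harvest_source_id`, `isPartOf`, `state`) and the function name `find_collection_parents` are assumptions chosen to mirror the field names used in this issue.

```python
def find_collection_parents(datasets):
    """Return (harvest_source_id, identifier) pairs that should show a collection icon.

    A dataset qualifies when at least one non-deleted dataset from the same
    harvest source names its identifier in isPartOf.
    """
    # Count non-deleted children per (harvest source, parent identifier).
    active_children = {}
    for ds in datasets:
        parent_id = ds.get("isPartOf")
        if parent_id and ds.get("state") != "deleted":
            key = (ds["harvest_source_id"], parent_id)
            active_children[key] = active_children.get(key, 0) + 1

    # A dataset is a Collection if its own key has at least one active child.
    return {
        (ds["harvest_source_id"], ds["identifier"])
        for ds in datasets
        if active_children.get((ds["harvest_source_id"], ds["identifier"]), 0) > 0
    }
```

Keying on the harvest source as well as the identifier keeps a dataset in one source from being marked as a Collection because of a child in another source.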

Child dataset behavior:

  • GIVEN a dataset with a value for isPartOf
    THEN the dataset is hidden from normal search results
    AND is not counted in the total dataset count

  • GIVEN a dataset with a value for isPartOf
    THEN the dataset is listed when searching with include_collection:true
    AND it is counted in the total dataset count

  • GIVEN a dataset with a value for isPartOf
    THEN the dataset is listed on its Collection page

  • GIVEN a dataset with a value for isPartOf
    THEN the dataset detail page should indicate it is part of a Collection.

  • GIVEN a dataset with a value for isPartOf
    AND the parent dataset is active
    THEN the dataset detail page should have a link to the parent dataset.

  • GIVEN a dataset with a value for isPartOf
    AND the parent dataset is not active (deleted or does not exist)
    THEN the dataset detail page should have a link to its Collection page
Background

The UI should be similar to Collections in the existing catalog.data.gov. The difference is that there is no additional CKAN extras field, collection_metadata=true, to indicate a parent (collection) dataset. Instead, the collection relations are built by querying the Solr index with harvest_source_id and isPartOf values.
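
A minimal sketch of what such a Solr lookup could look like, expressed as request parameters. It assumes the index exposes fields literally named `harvest_source_id` and `isPartOf`; the actual field names in the catalog's Solr schema may differ, and the helper names here are hypothetical. Only the standard Solr parameters `q`, `fq`, and `rows` are used.

```python
def solr_quote(value):
    """Wrap a value in double quotes, escaping backslashes and quotes for Solr."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def children_query(harvest_source_id, parent_identifier, rows=20):
    """Build Solr params that select the children of one parent dataset.

    Field names are assumptions taken from this issue, not from the schema.
    """
    return {
        "q": "*:*",
        "fq": [
            f"harvest_source_id:{solr_quote(harvest_source_id)}",
            f"isPartOf:{solr_quote(parent_identifier)}",
        ],
        "rows": rows,
    }


def has_children_query(harvest_source_id, identifier):
    """Same filter with rows=0: only the match count is needed to tell
    whether a dataset is a parent."""
    return children_query(harvest_source_id, identifier, rows=0)
```

With `rows` set to 0, Solr returns no documents and the caller reads only the number of matches, which is enough to decide whether to treat a dataset as a Collection.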

Security Considerations (required)

[Any security concerns that might be implicated in the change. "None" is OK, just be explicit here!]

Sketch

[Notes or a checklist reflecting our understanding of the selected approach]

@FuhuXia
Member Author

FuhuXia commented Nov 6, 2024

The Collection page has its own story.

@hkdctol hkdctol added the H2.0/Harvest-General General Harvesting 2.0 Issues label Nov 7, 2024
@hkdctol hkdctol moved this to 📥 Queue in data.gov team board Nov 7, 2024
@FuhuXia FuhuXia assigned FuhuXia and unassigned FuhuXia Nov 14, 2024
@FuhuXia FuhuXia moved this from 📥 Queue to 🏗 In Progress [8] in data.gov team board Nov 19, 2024
@FuhuXia FuhuXia self-assigned this Nov 19, 2024
@FuhuXia
Member Author

FuhuXia commented Nov 21, 2024

Most of the work is done in PR GSA/ckanext-geodatagov#282, which keeps the collection UI and user experience the same as in the current catalog. On the backend it is rewritten to use Solr queries to determine parent-child relationships, since CKAN is not aware of whether a dataset is part of a collection.

Some side effects of CKAN losing control of the collection logic are:

  • Child datasets can exist without a parent dataset.
  • A parent dataset can lose all its child datasets and become a regular dataset.
  • A dataset can be a parent and, at the same time, a child of another dataset, which means nested collections (grandchildren).
  • Nothing prevents two datasets from becoming parent and child of each other. We can add a UI element to indicate the troubled relationship once the datasets are visited.
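
The side effects listed above can be detected from the isPartOf values alone. The sketch below is illustrative only: the function name, the labels, and the record keys are hypothetical, and it assumes all records come from a single harvest source.

```python
def classify_relations(datasets):
    """Map each identifier to a set of labels describing its collection role.

    Labels: "child", "parent", "orphan" (parent missing), "nested" (both a
    parent and a child), "mutual" (two datasets point at each other).
    """
    by_id = {ds["identifier"]: ds for ds in datasets}
    # Every identifier that some dataset names as its parent.
    referenced = {ds["isPartOf"] for ds in datasets if ds.get("isPartOf")}

    labels = {}
    for ident, ds in by_id.items():
        tags = set()
        parent_id = ds.get("isPartOf")
        if parent_id:
            tags.add("child")
            if parent_id not in by_id:
                tags.add("orphan")
            elif by_id[parent_id].get("isPartOf") == ident:
                tags.add("mutual")
        if ident in referenced:
            tags.add("parent")
        if {"child", "parent"} <= tags:
            tags.add("nested")
        labels[ident] = tags
    return labels
```

A detail page could use the "mutual" label to decide when to show the warning element mentioned in the last bullet.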

TODO
Add a collection icon to parent datasets on the dataset listing page.

Projects
Status: 📟 Sprint Backlog [7]
Development

No branches or pull requests

2 participants