Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP GobiertoData add dcat catalog endpoint #3846

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

stbnrivas
Copy link
Contributor

@stbnrivas stbnrivas commented Apr 27, 2021

Closes #2671

✌️ What does this PR do?

add endpoint /api/v1/data/datasets/catalog.xml which show an rdf/xml (catalog in dcat format)

🔍 How should this be manually tested?

  1. configure new site options:
bin/rails c
site = Site.find_by(domain: "*********")
GobiertoSeeds::GobiertoData::Recipe.run(site) if site.configuration.gobierto_data_enabled?

check configuration was loaded in selected site: backoffice > customize site

Screenshot from 2021-05-07 07-13-02

  1. create datasets with specific features:
  • a dataset with license
  • a dataset without license
  • a dataset with source url
  • a dataset without source url
  • a dataset with category
  • a dataset without category
  1. visit endpoint api/v1/data/datasets/catalog.xml
  2. copy xml payload
  3. paste payload into https://www.w3.org/RDF/Validator/
  • check valid passed (rdf validation)
  1. paste payload into
    FOR DCAT
    FOR DCAT-AP
  • check valid passed (dcat validation)
  • you can ignore messages like this:
The property: title can't have multiple objects
The property: description can't have multiple objects

@stbnrivas stbnrivas force-pushed the 2671-add-gobierto-data-rdf-dcat-endpoint branch from 07f1923 to 6d824d1 Compare April 27, 2021 16:23
@stbnrivas stbnrivas force-pushed the 2671-add-gobierto-data-rdf-dcat-endpoint branch from 6d824d1 to 4efaad6 Compare April 28, 2021 14:42
@stbnrivas
Copy link
Contributor Author

stbnrivas commented Apr 28, 2021

UPDATE: I sorted by mandatory | optionals

mandatory / question (to close issue IMHO):
| 1. I added cache for response of catalog... I'm trying to figure out how to invalidate every a dataset will be created or delete
| 6. endpoint should don't show if module GobiertoData if module gobierto data is disabled (now show catalog with dataset list empty)
| 7. add some link in frontend? into sitemap.xml? how can we federate our datasets?
| 8. pass dcat validator

to solve in another issue (to can schedule)
| 4. every dataset has to be a license url related with #3032. should be done first

optionals
| 3. Should I add license for catalog level, there are 3 options: {nil, constant for all sites, one for site}
| 5. add keywords for dataset to show into catalog related with #3032
| 2. Should I add rdf static types for some attributes like ^^string ... not sure the validation pass

@stbnrivas stbnrivas marked this pull request as ready for review April 28, 2021 15:19
@furilo
Copy link
Member

furilo commented Apr 28, 2021

IMPROVEMENTS?

Please differentiate clearly what are improvements or what are things that should be decided. All improvements can go to further issues.

  1. Should I add rdf static types for some attributes like ^^string ... not sure the validation pass

?

  1. Should I add license for catalog level, there are 3 options: {nil, constant for all sites, one for site}

Is this is not mandatory, don't specify. If is mandatory, take the license of the first dataset.

  1. add keywords for dataset to show into catalog related with Data / New fields for datasets: Source and License #3032

Mandatory?

  1. endpoint should don't show if module GobiertoData if module gobierto data is disabled

Proceed, its simple right?

  1. add some link in frontend? into sitemap.xml?

In the <head> of /datos, as an RDF link - we use the meta-tags gem to define meta tags from specific views.

<link rel="alternate" type="application/rdf+xml" href="url.to.dcat.rdf"/>

@ferblape
Copy link
Member

endpoint should don't show if module GobiertoData if module gobierto data is disabled

This is already implemented in the routes file, you don't need to do anything

Copy link
Member

@ferblape ferblape left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but there are a few details yet to be fixed.

About tests, you are just testing the presenter, that's not enough and you need to add controller or integration test to verify the api call. See how are tested the other methods of the data api

app/presenters/gobierto_data/dataset_presenter.rb Outdated Show resolved Hide resolved
app/presenters/gobierto_data/dataset_presenter.rb Outdated Show resolved Hide resolved
app/presenters/gobierto_data/dataset_presenter.rb Outdated Show resolved Hide resolved
modified: GobiertoData::Dataset.where(site_id: site.id).maximum(:created_at) || site.created_at,
languages: site["configuration_data"]["available_locales"],
homepage: url_helpers.gobierto_data_root_url(host: site.domain),
license_url: "https://opendatacommons.org/licenses/odbl/",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alvaro said no license, right? Is it mandatory?

Copy link
Contributor Author

@stbnrivas stbnrivas Apr 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about if DCAT has mandatory xml elements, from now I think that:

  • DCAT don't define xml element required, recommended, optional.
  • DCAT-AP define some xml element as required, reccomended or optionals (but not 100% sure)

So as alvaro said I'll remove license at catalog level 👍

app/presenters/gobierto_data/dataset_presenter.rb Outdated Show resolved Hide resolved
app/presenters/gobierto_data/dataset_presenter.rb Outdated Show resolved Hide resolved
config/routes.rb Outdated Show resolved Hide resolved
app/presenters/gobierto_data/dataset_presenter.rb Outdated Show resolved Hide resolved
config/locales/gobierto_data/presenters/ca.yml Outdated Show resolved Hide resolved
config/routes.rb Outdated Show resolved Hide resolved
@stbnrivas stbnrivas force-pushed the 2671-add-gobierto-data-rdf-dcat-endpoint branch from 3a2f1e8 to 23aa029 Compare April 30, 2021 10:45

def test_structure_catalog_building_do_not_show_draft_datasets
catalog = @subject.build_catalog
datasets_published = GobiertoData::Dataset.by_site(@site.id).visibles.size
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

current_site.datasets?

app/models/gobierto_data/dataset.rb Outdated Show resolved Hide resolved
app/views/gobierto_data/api/v1/datasets/catalog.xml.erb Outdated Show resolved Hide resolved
test/fixtures/gobierto_common/custom_fields.yml Outdated Show resolved Hide resolved
@stbnrivas stbnrivas force-pushed the 2671-add-gobierto-data-rdf-dcat-endpoint branch 2 times, most recently from 4b8e4ce to a452f6d Compare April 30, 2021 14:00
@stbnrivas stbnrivas force-pushed the 2671-add-gobierto-data-rdf-dcat-endpoint branch from a452f6d to 55d3919 Compare April 30, 2021 14:32
@stbnrivas stbnrivas changed the title WIP GobiertoData add dcat catalog endpoint GobiertoData add dcat catalog endpoint Apr 30, 2021
@ferblape
Copy link
Member

@stbnrivas could you fix the conflicts?

@stbnrivas stbnrivas changed the title GobiertoData add dcat catalog endpoint WIP GobiertoData add dcat catalog endpoint Jul 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Research how to integrate DCAT in Gobierto Data
3 participants