Rework schema (list of headers) and document #29
Comments
Popolo doesn't define a CSV representation yet - there is RDF and JSON so far. On the RDF path, I'm not sure if Linked CSV is ready. On the JSON path, it should be straightforward to re-use JSON fields as CSV headers. Is there a documented version of the CSV schema? The datapackage.json doesn't describe the difference between
@jpmckinney the list of fields is set out above, along with some initial suggested changes. @markbrough your thoughts here re IATI would be very useful ...
I'll review more closely in a bit, but to clarify one point, you can use fields outside of those within Popolo while still being conformant: http://popoloproject.com/specs/#conformance So, if you want to keep
@jpmckinney any thoughts here? I'm aiming to do a rev (and possibly finalize) this asap. I guess the big question here is CSV vs JSON (I mean for JSON we'd just take the full popolo version I think). If CSV, how do we map, and how do we handle things like fields with multiple possible values? Options are:
Sorry for the delay, I'll look at this within the next day.
The "abbr" column in the CSV would be the "other_names" array in the JSON. Maybe rename "abbr" to "other_name"? Otherwise I think all the other header names conform. CSV has the big advantage of more people being able to understand, create and use it. Is it anticipated that many fields will be multi-value? Has that come up already? How much detailed info are these lists expected to contain? If the project is expected to maintain a fairly narrow scope with only essential/primary data, then CSV should be enough. If it's expected to expand to provide detailed info for at least some jurisdictions, then JSON is necessary. A hybrid approach may allow people to submit CSVs (for those jurisdictions that don't (yet) have detailed info), and a script would be run to convert those CSVs to JSON. Thoughts? Re: multi-value columns in CSV:
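The hybrid approach suggested above (accept plain CSV, then run a script to produce JSON) could be sketched roughly as follows. This is only an illustration: the `;` separator for multi-value cells and the sample row are assumptions, not something the thread has settled, and the shape of `other_names` (a list of objects with a `name` key) follows the Popolo JSON serialization as I understand it.

```python
import csv
import io
import json

# Assumption: multi-value cells use ";" as separator (not decided in this thread).
MULTI_VALUE_SEPARATOR = ";"


def row_to_organization(row):
    """Convert one CSV row (a dict) into a Popolo-style organization dict."""
    # Copy every non-empty single-value column as-is.
    org = {
        key: value
        for key, value in row.items()
        if key and value and key != "other_names"
    }
    # Split the multi-value column and drop empty entries.
    raw_names = (row.get("other_names") or "").split(MULTI_VALUE_SEPARATOR)
    names = [name.strip() for name in raw_names if name.strip()]
    if names:
        org["other_names"] = [{"name": name} for name in names]
    return org


def csv_to_organizations(csv_text):
    """Parse CSV text with a header row into a list of organization dicts."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row_to_organization(row) for row in reader]


# Hypothetical sample data, for illustration only.
sample = "id,name,other_names\nxx/example,Example Ministry,EM; ExMin\n"
print(json.dumps(csv_to_organizations(sample), indent=2))
```

Keeping the multi-value handling in one function means contributors can keep editing flat CSV files, and only the conversion script needs to change if a different delimiter or column layout is chosen later.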
OK, so I think we'll go for plain CSV and see how we do. I've made another tweak to include other_names.
I don't know if a new For
@jpmckinney all good suggestions (as usual!) - let's run with both of them. I've updated the change proposal above to reflect these.
Added founding_date, dissolution_date and image to the list of fields to add. @stefanw could you clarify what
FIXED.
Awesome! Where can I find docs for the schema? Is it
@stefanw wouldn't it make sense to split phone numbers into
@jpmckinney this distinction comes from the German public body dataset out of FragDenStaat.de. The fields were modeled after the original federal data source which was not structured enough to make an easy distinction between voice/fax. Surely this can be inferred from prefixes ("Tel.", "Fax:" etc.). The contact data was never needed, we were only after emails. This should in no way dictate the structure of an ideal dataset.
@jpmckinney for docs of schema see https://github.com/okfn/publicbodies#data which links to http://data.okfn.org/community/okfn/publicbodies (that is nicer than looking at the datapackage.json)
@stefanw so could I drop the contact field in the de dataset in favour of address and email (already in the dataset)?
Depends on what you want the publicbodies dataset to contain, I don't mind either. I could also parse out voice/fax if it helps, should be an easy regex.
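For reference, the regex-based split mentioned above might look something like this minimal sketch. It assumes each number in the free-text contact field is introduced by a "Tel" or "Fax" label (optionally followed by "." or ":"), as in the prefixes quoted earlier; the function name and the sample string are hypothetical, and real data would likely need more label variants.

```python
import re

# Assumption: numbers are introduced by "Tel" or "Fax", optionally followed
# by "." and/or ":"; other label spellings are not handled.
LABELLED_NUMBER = re.compile(
    r"\b(tel|fax)[.:]*\s*([+\d][\d\s()/-]*\d)",
    re.IGNORECASE,
)


def split_contact(contact):
    """Return voice and fax numbers found in a free-text contact string."""
    result = {"voice": [], "fax": []}
    for label, number in LABELLED_NUMBER.findall(contact):
        kind = "fax" if label.lower() == "fax" else "voice"
        # Collapse runs of whitespace inside the number.
        result[kind].append(" ".join(number.split()))
    return result


# Hypothetical sample value, for illustration only.
print(split_contact("Tel.: +49 30 1234-0, Fax: +49 30 1234-999"))
```

Returning lists for both kinds leaves room for several numbers per record, and unlabelled text is simply ignored rather than guessed at.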
+1 for specific voice / fax etc. fields, with the possibility to have several per line.
@augusto-herrmann I did not suggest anything, I merely answered the question and explained the existing fields. Popolo supports many types of contact info (postal address, email, phone, fax etc.) under "contact_details".
I'm very happy for a new set of fields to go in: @augusto-herrmann could you distill a core set of changes with descriptions of the fields and we'll review. Also very much welcome input from @jpmckinney here so we keep aligned with popolo on this.
I'll be happy to review any proposed changes to the schema, just @-mention me in any new issues.
@rgrp, the link http://data.okfn.org/community/okfn/publicbodies (also referenced in the README) has since become broken. Has the schema documentation been moved somewhere else? If so, it would be nice to have a redirect.
@augusto-herrmann that's a bug in data.okfn.org which is getting fixed now.
@augusto-herrmann ok - the issue was that the data package is actually named public-bodies whilst the repo is named publicbodies, so the redirect was not working correctly. Now fixed.
@rossjones suggested: "Would it make sense for publicbodies.org to follow the popolo spec at http://popoloproject.com/data.html" (that link is now broken)
Correct link is: http://popoloproject.com/specs/organization.html
Seems a great idea!
Current fields
Current fields and suggested changes (e.g. to be in line with popolo as much as possible). Note the list of changes is in progress and incomplete.
Add:
Consider switching from CSV to JSON
Pros / Cons