Wikidata:Requests for permissions/Bot/So9qBot 11

So9qBot 11 (talk · contribs · new items · new lexemes · SUL · Block log · User rights log · User rights · xtools)
Operator: So9q (talk · contribs · logs)

Task/s: Import of 564 hiking paths from Naturkartan.

Code: https://github.com/dpriskorn/NaturkartanScraper

Function details:

  • I extracted all hiking paths and cleaned the data (removed bicycle routes).
  • Then I exported all trails of at least 5 km, which left 601 trails.
  • Then I removed those with duplicate names in OpenRefine, which left 544 trails (CSV).
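
The filtering steps above can be sketched roughly as follows. This is only an illustration, not the code from the linked repository: the column names (`name`, `length_km`, `municipality`, `type`), the `filter_trails` helper, and the sample rows are all hypothetical, and the de-duplication step in the request was done in OpenRefine rather than in Python. The sketch also assumes that "duplicate names" means every trail sharing a name is dropped, which the request does not state explicitly.

```python
import csv
import io
from collections import Counter

MIN_LENGTH_KM = 5.0  # cutoff used in the request; still open for discussion

# Hypothetical sample data; the real export has different rows and possibly different columns.
SAMPLE_CSV = """name,length_km,municipality,type
Example Trail A,12.5,Municipality X,hiking
Example Trail B,3.0,Municipality Y,hiking
Example Loop,7.0,Municipality X,hiking
Example Loop,9.0,Municipality Z,hiking
Example Cycle Route,20.0,Municipality Y,cycling
"""


def filter_trails(rows, min_length_km=MIN_LENGTH_KM):
    """Keep hiking trails at or above the cutoff, then drop all trails whose name is not unique."""
    # Steps 1 and 2: only hiking routes, and only those at least min_length_km long.
    long_enough = [
        row for row in rows
        if row["type"] == "hiking" and float(row["length_km"]) >= min_length_km
    ]
    # Step 3: count normalised names and keep only trails whose name occurs exactly once.
    name_counts = Counter(row["name"].strip().casefold() for row in long_enough)
    return [
        row for row in long_enough
        if name_counts[row["name"].strip().casefold()] == 1
    ]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
kept = filter_trails(rows)
print([row["name"] for row in kept])
```

Passing a different `min_length_km` (for example 2 or 10) shows how many candidate items each proposed cutoff would produce.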

All the metadata about the trails is maintained via this service by public organizations. Unfortunately, the public organizations have not been willing to publish the data as open data themselves, which would of course be preferred.

Some of the data can be found in a rather messy and unusable form from the länsstyrelser via Naturvårdsverket.

This import would help the Swedish OSM community because I match all Swedish hiking trails in Wikidata against OSM and link them using hiking trail matcher. This makes it very easy for anyone to see how many official trails are still missing in OSM.

I can upload a few example items if anyone wants to review the schema.

To be discussed:

  • Are hiking paths around 5 km in length notable? Would we rather make the cutoff 10 km? Or perhaps 2 km?

--So9q (talk) 15:02, 3 December 2024 (UTC)

Discussion

Ainali Belteshassar so9q Vätte Popperipopp Tulipasylvestris Esquilo Daniel Mietchen — with focus on topics related to research (Q42240) VisbyStar Haxpett QubeCube Marcus.linneberg Vitplister Spisen Sollentuna Myohmy671 Autom S4b1nuz E.656 JoranL

Notified participants of WikiProject Sweden. So9q (talk) 08:30, 4 December 2024 (UTC)

The data source doesn't seem openly licensed, or am I missing something? Ainali (talk) 16:36, 4 December 2024 (UTC)
That can be debated. The only four pieces of information I extracted are:
name, length, municipality, and type of route. None of that is protected by copyright in the US (it's metadata and not copyrightable). The same is most probably the case in Sweden, but then there is the question of database protection. I would argue that since this is information from public bodies that has already been released to the public through the service, no judge would admit a suit for copyright violation.
Scraping this information could actually benefit the service, because it could increase its traffic and thus its importance.
I'm going to contact them and ask for written approval. So9q (talk) 08:42, 7 December 2024 (UTC)