A curated list by the Helmholtz Metadata Collaboration (HMC) of awesome resources around the FAIR principles for (scientific) data, i.e. that data should be findable, accessible, interoperable and reusable. The list is organized around the use cases of data producers, data users, data curators and data providers. 'FAIR' is not the same as 'open', but the two overlap.
- Resources about the FAIR principles
- FAIR assessment
- Organizations and Communities
- Metadata formats and standards
- Ontology services
- Finding datasets and software
- Software and software publications
- Provenance tracking
- Metadata management
- Your own repository setup
- Awesome metadata sources
- Related lists
Barend Mons article in Nature 578, 491 (2020) - Proposition to invest 5% of research funds in ensuring data are reusable.
Cost of not having FAIR research data - A 2018 European Commission cost-benefit analysis for FAIR research data (written by PwC EU Services).
The FAIR Guiding Principles for scientific data management and stewardship - This Comment in Sci Data is the first formal publication of the FAIR Principles (2016).
GO FAIR Zotero Library - Nice collection of publications around the FAIR principles.
DataPLANT ARC Tool Talk - NFDI4Plants' interpretation of the FAIR Digital Object (FDO), based on a GitHub repository and RO Crate.
DONA Suggested Reading - The history of the Digital Object Architecture (DOA), going back to the 1980s.
FAIR Digital Objects Forum - General platform for discussions on the advancement and development of FAIR Digital Objects.
FAIR Digital Object Framework - A WIP specification for an FDO infrastructure based on linked data / RDF.
FAIR DO publications - Relevant publications (concept papers and specs) by RDA working groups on FDOs.
RO Crate - Pragmatic approach combining existing technologies and ontologies into a metadata standard for annotating scientific datasets.
PID services registry - A searchable registry for PID services.
FAIR Evaluation Services - A FAIR assessment tool from FAIRsharing, code.
F-UJI - An (online) tool that provides a FAIR score for a given PID, based on metrics created by FAIRsFAIR, code.
EuDat - Collaborative European data infrastructure.
FAIRsharing - A curated resource on data and metadata standards, inter-related to databases and data policies.
Research Data Alliance - International organization and communication platform for establishing standards and recommendations concerning research data publication.
The Turing Way - Handbook and community for reproducible, ethical and collaborative data science.
DataCite - Metadata schema developed by an international community and increasingly adopted by repositories.
Data Catalog (DCAT) - RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web.
Dublin Core Metadata Initiative Terms - The Dublin Core Metadata Element Set is a set of fifteen "core" elements for describing resources.
JSON LD Playground - Convert JSON-LD data between various representations.
JSON Schema - Standard for describing structural constraints that are used to validate JSON objects.
Provenance Primer (PROV) - This primer document provides an accessible introduction to the PROV data model for provenance interchange on the Web.
Resource Description Framework (RDF) - RDF is a standard model for data interchange on the Web.
Schema.org - Well-established and industry-accepted vocabulary providing semantics for common entities like Person, Organization, Dataset, etc.
SKOS - The Simple Knowledge Organization System (SKOS) is a common data model for sharing and linking knowledge organization systems via the Semantic Web.
Ontobee - A linked ontology data server to support ontology term dereferencing, linkage, query and integration. See also this publication.
Ontology Lookup Service - OLS is a repository for biomedical ontologies that aims to provide a single point of access to the latest ontology versions.
Also see
- awesome-ontology - A curated list of ontology things.
- awesome-semantic-tools - List of projects related to Ontology engineering and Semantic Web technologies.
DataCite Commons - Search through the metadata indexed by DataCite.
EuDat B2find - Search through metadata of datasets accumulated by EuDat.
Microsoft Academic - Search through a PID graph created by Microsoft (shut down at the end of 2021).
OpenAIRE explorer - Search through the metadata indexed by OpenAIRE.
ScholeXplorer - A data-literature interlinking service (formerly Scholix) that indexes links between data and journal publications. It also provides interfaces and APIs to query the graph.
Research Software Repository - Aggregates research software from various sources with information about the problem it solves and its scientific domain.
CITATION.CFF - Plain text files with human- and machine-readable citation information for software (and datasets). Supported by GitHub, Zenodo, Zotero.
Citable code with Zenodo & GitHub - Make GitHub repositories citable with Zenodo DOI.
CodeMeta - Minimal metadata schema for scientific software and code, in JSON and XML. It defines a concept vocabulary that can be used to standardize the exchange of software metadata across repositories and organizations.
fossology - FOSSology is an open source license compliance software system and toolkit. You can run license, copyright and export control scans from the command line.
HERMES - A CI-based workflow to create and publish software publications to well-known repositories.
SOMEF - Automatically extract software publication metadata from README files and other docs using ML and other techniques, reducing the amount of boilerplate work for the developer.
- awesome-research-software-registries - Awesome list of places where one can register or upload research software.
- awesome-rse - An awesome collection of resources around research software engineering.
AiiDA - Automated Interactive Infrastructure and Database for Computational Science (AiiDA) to automatically track provenance of simulation workflows and all associated data, code.
DataLad - A free and open-source distributed data management system for everyone. It is based on git-annex and offers manual to automatic provenance tracking, code.
MLflow - Tool to track the provenance of machine learning applications, code.
CWL - Domain-agnostic and community-driven open standard for description and execution of research workflows that supports provenance tracking (CWLProv) in a standard-compliant way using the existing RO Crate, PROV and BagIt standards.
PROV-O Primer - An introduction to the data model of the Provenance Ontology (PROV-O).
There is overlap with these more general lists of workflow tools, but not every pipeline or workflow manager includes good provenance tracking.
- awesome-pipeline - A curated list of awesome pipeline toolkits inspired by Awesome Sysadmin.
- Awesome workflow engines - Curated list of awesome open source workflow engines.
- Computational Data Analysis Workflow Systems - A list of existing workflow systems.
Dataverse - Open source research data repository software, code.
EuDat B2share - A repository by EuDat. The software is open source and based on Invenio, so one can set up one's own instances of it, code.
Invenio - Open source customizable software to set up large-scale digital repositories, library systems and data repositories, code.
InvenioRDM - The turn-key research data management repository based on the Invenio framework and Zenodo.
Microsoft Academic Graph - All the data and links from Microsoft Academic (shut down at the end of 2021).
OpenAIRE graph - All metadata contained in the OpenAIRE graph.
Scholix - A schema for scholarly links. Implemented and deployed by several scholarly link providers.
CrossRef - Organization building connections between related entities, building a queryable graph.
Awesome lists that relate to several of the topics above.
awesome-rse - An awesome list by HIFIS collecting information about research software engineering, touching on FAIRness and sustainability.
awesome-rse-policies - An awesome list by HIFIS collecting information about research software engineering policies, touching on FAIRness and sustainability.
Awesome-open-climate-science - An open science related list specific to the domain of Atmospheric, Ocean, and Climate science.
Awesome-open-science-software - A list of open science resources and software.
Awesome Curated Tools - A curated list of digital tools we use, ranging from accounting and data science to scientific research and liquid democracy.
Contributions are welcome! 😎
If you want to contribute, please read the contribution guidelines.