The MLC Dataset Catalog contains descriptions of 89 MLC datasets. Each dataset is annotated with a description of the problem it is associated with and the transformations applied to it. All dataset descriptions are enriched with semantic annotations (metadata) based on classes and terms from ontologies and controlled vocabularies. The semantic annotations fall into two groups: (1) annotations of the dataset provenance information and (2) annotations that capture the relevant machine learning characteristics of the datasets.
The MLC Dataset Catalog can be accessed at: https://semantichub.ijs.si/MLCdatasets/
All code associated with the web application is given in the MLC-catalog-web-application folder.
The pipeline used for generating the semantic annotations is available in the semantic-annotation-pipeline folder.
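
To make the two annotation groups described above more concrete, here is a minimal Python sketch of how a single dataset description might be modelled. It is only an illustration: every class name, field name, and value below is hypothetical and does not reflect the schema actually used by the catalog or by the annotation pipeline.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProvenanceAnnotation:
    """Group (1): where the dataset comes from (hypothetical fields)."""
    source: str
    transformations: List[str] = field(default_factory=list)


@dataclass
class MLCharacteristics:
    """Group (2): learning-relevant properties (hypothetical fields)."""
    n_examples: int
    n_features: int
    n_labels: int


@dataclass
class DatasetDescription:
    """One catalog entry combining both annotation groups (assumed layout)."""
    name: str
    problem: str
    provenance: ProvenanceAnnotation
    characteristics: MLCharacteristics


# Placeholder values only; this is not a real catalog entry.
example = DatasetDescription(
    name="toy-dataset",
    problem="placeholder problem description",
    provenance=ProvenanceAnnotation(source="placeholder source"),
    characteristics=MLCharacteristics(n_examples=10, n_features=3, n_labels=2),
)
print(example.name, example.characteristics.n_labels)
```

Keeping provenance and machine learning characteristics in separate structures mirrors the two-group split in the description and would let each group be validated or extended independently.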