---
title: 'Python class defining a machine learning dataset ensuring key-based correspondence and maintaining integrity'
tags:
  - neuroscience
  - machine-learning
  - object-oriented-programming
  - dataset
authors:
  - name: Pradeep Reddy Raamana
    orcid: 0000-0003-4662-0558
    affiliation: 1
  - name: Stephen C. Strother
    orcid: 0000-0002-3198-217X
    affiliation: "1, 2"
affiliations:
  - name: Rotman Research Institute, Baycrest Health Sciences, Toronto, ON, Canada
    index: 1
  - name: Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
    index: 2
date: 24 August 2017
bibliography: paper.bib
---

Summary

A common problem in machine learning is keeping track of the features extracted and ensuring the integrity of the dataset. This becomes difficult as the number of projects grows or when personnel change frequently, which breaks the chain of hyper-local information about the dataset. This package provides a Python data structure that encapsulates a machine learning dataset along with its key information. It is well suited to neuroimaging applications (or any other domain) where each sample needs to be uniquely identified with a subject ID (or something similar). Key-level correspondence across data, labels (e.g., 1 or 2), class names (e.g., 'healthy', 'disease') and related attributes helps maintain data integrity, and also offers an easy way to trace features back to the sources from which they were originally derived.

This data structure also eases the machine learning workflow by offering several well-integrated methods and useful attributes geared specifically towards neuroscience research.
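The idea of key-level correspondence can be illustrated with a minimal sketch. This is not the package's actual API: the class name `KeyedDataset`, its methods, and the subject IDs are all hypothetical and chosen only for illustration. The sketch assumes each sample is a flat list of numeric features and stores features, labels and class names under the same sample ID, so that they cannot drift out of alignment.

```python
class KeyedDataset:
    """Hypothetical illustration (not the package's API): features, labels
    and class names are all stored under the same unique sample ID."""

    def __init__(self):
        self._data = {}     # sample ID -> list of feature values
        self._labels = {}   # sample ID -> numeric label
        self._classes = {}  # sample ID -> class name

    def add_sample(self, sample_id, features, label, class_name):
        # Refuse duplicate IDs so each sample stays uniquely identified.
        if sample_id in self._data:
            raise ValueError(f"sample ID {sample_id!r} already exists")
        features = [float(value) for value in features]
        # Every sample must have the same number of features.
        if self._data:
            expected = len(next(iter(self._data.values())))
            if len(features) != expected:
                raise ValueError(
                    f"expected {expected} features, got {len(features)}"
                )
        self._data[sample_id] = features
        self._labels[sample_id] = label
        self._classes[sample_id] = class_name

    @property
    def sample_ids(self):
        # Dictionaries preserve insertion order, so this order is stable.
        return list(self._data)

    def get_class(self, class_name):
        """Return a new dataset holding only the samples of one class."""
        subset = KeyedDataset()
        for sample_id, features in self._data.items():
            if self._classes[sample_id] == class_name:
                subset.add_sample(
                    sample_id, features, self._labels[sample_id], class_name
                )
        return subset

    def to_arrays(self):
        """Return features, labels and IDs in one shared sample order."""
        ids = self.sample_ids
        features = [self._data[sample_id] for sample_id in ids]
        labels = [self._labels[sample_id] for sample_id in ids]
        return features, labels, ids


# Usage with made-up subject IDs and feature values.
dataset = KeyedDataset()
dataset.add_sample("sub-001", [0.1, 0.2], 1, "healthy")
dataset.add_sample("sub-002", [0.3, 0.4], 2, "disease")

features, labels, ids = dataset.to_arrays()
disease_only = dataset.get_class("disease")
```

Because every row returned by `to_arrays` is paired with its ID, any feature vector can be traced back to the subject it came from, and a subset such as `disease_only` keeps the same IDs as the parent dataset.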

References

Raamana, P.R., 2017, neuropredict: easy, standardized and comprehensive predictive analysis for neuroimaging features, GitHub. URL: https://github.com/raamana/neuropredict
