tecosaur/DataToolkit.jl

DataToolkit is a batteries-included family of packages for robustly managing data. The particular package(s) you want to use will depend on the project.

For now, this set of packages is around the beta stage of development. No major changes to the core functionality or structure are anticipated, but small expansions in the data-CLI functionality and in the set of transformers and plugins provided by DataToolkitCommon are expected prior to the 1.0 release, and larger changes may occur if there is good reason for them.


[[iris]]
uuid = "3f3d7714-22aa-4555-a950-78f43b74b81c"
description = "Fisher's famous Iris flower measurements"

    [[iris.storage]]
    driver = "web"
    checksum = "k12:cfb9a6a302f58e5a9b0c815bb7e8efb4"
    url = "https://raw.githubusercontent.com/scikit-learn/scikit-learn/1.0/sklearn/datasets/data/iris.csv"

    [[iris.loader]]
    driver = "csv"
    args.header = ["sepal_length", "sepal_width", "petal_length", "petal_width", "species_class"]
    args.skipto = 2

Similar Packages

DataDeps.jl: Downloading files on-demand. Essentially implements the web storage driver along with some of the machinery.
DataSets.jl: An alternate take on declarative data representation. Focused on filling a gap with JuliaHub’s cloud compute offering; less versatile overall.
RemoteFiles.jl: Automatically re-downloading files on a schedule. Equivalent to the web storage driver when using the lifetime parameter of the store plugin.

Relevant Links