The bigvis package provides tools for exploratory data analysis of large datasets (10-100 million observations). The aim is for most operations to take less than 5 seconds on commodity hardware, even for 100 million data points.
Since bigvis is not currently available on CRAN, the easiest way to try it out is to:
```R
# install.packages("devtools")
devtools::install_github("hadley/bigvis")
```
The bigvis package is structured around the following workflow:

- `bin()` and `condense()` to get a compact summary of the data
- if the estimates are rough, you might want to `smooth()`; see `best_h()` and `rmse_cvs()` to figure out a good starting bandwidth
- if you're working with counts, you might want to `standardise()`
- visualise the results with `autoplot()` (you'll need to load ggplot2 to use this), as in the sketch below
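To make that workflow concrete, here is a minimal end-to-end sketch on simulated data. The bin width (0.1) and bandwidth (0.25) are illustrative guesses rather than recommended values, and the exact argument names may differ slightly in the installed version, so check the package documentation.

```R
library(bigvis)
library(ggplot2)  # needed for autoplot()

# Simulate a reasonably large numeric vector
x <- rnorm(1e6)

# 1. Bin and condense to a compact summary (counts per bin);
#    0.1 is an illustrative bin width
xsum <- condense(bin(x, 0.1))

# 2. Smooth the rough estimates; h = 0.25 is a guessed bandwidth,
#    and best_h() / rmse_cvs() can suggest a better starting value
xsmu <- smooth(xsum, h = 0.25)

# 3. Visualise the smoothed summary
autoplot(xsmu)
```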
Bigvis also provides a number of standard statistics efficiently implemented on weighted/binned data: `weighted.median()`, `weighted.IQR()`, `weighted.var()`, `weighted.sd()`, `weighted.ecdf()` and `weighted.quantile()`.
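As a quick illustration of how these might be called, the sketch below assumes a `weighted.mean()`-style `(x, w)` argument order; the exact signatures may differ, so consult the help pages.

```R
library(bigvis)

# Values and their weights, e.g. bin centres and bin counts
# taken from a condensed summary
x <- c(1, 2, 3, 4, 5)
w <- c(10, 20, 40, 20, 10)

# Weighted centre and spread, assuming an (x, w) argument order
weighted.median(x, w)
weighted.IQR(x, w)
weighted.sd(x, w)
```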
This package wouldn't be possible without:

- the fantastic Rcpp package, which makes it amazingly easy to integrate R and C++
- JJ Allaire and Carlos Scheidegger, who have indefatigably answered my many C++ questions
- the generous support of Revolution Analytics, who supported the early development
- Yue Hu, who implemented a proof of concept that showed it might be possible to work with this much data in R