< Back to Alex Krizhevsky's home page
The CIFAR-10 and CIFAR-100 are labeled subsets of the 80 million tiny images dataset. They were collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton.airplane | ||||||||||
automobile | ||||||||||
bird | ||||||||||
cat | ||||||||||
deer | ||||||||||
dog | ||||||||||
frog | ||||||||||
horse | ||||||||||
ship | ||||||||||
truck |
Version | Size | md5sum |
CIFAR-10 python version | 163 MB | c58f30108f718f92721af3b95e74349a |
CIFAR-10 Matlab version | 175 MB | 70270af85842c9e89bb428ec9976c926 |
CIFAR-10 binary version (suitable for C programs) | 162 MB | c32a1d4ab5d03f1284b67883e8d87530 |
def unpickle(file): import cPickle with open(file, 'rb') as fo: dict = cPickle.load(fo) return dictAnd a python3 version:
def unpickle(file): import pickle with open(file, 'rb') as fo: dict = pickle.load(fo, encoding='bytes') return dictLoaded in this way, each of the batch files contains a dictionary with the following elements:
<1 x label><3072 x pixel> ... <1 x label><3072 x pixel>In other words, the first byte is the label of the first image, which is a number in the range 0-9. The next 3072 bytes are the values of the pixels of the image. The first 1024 bytes are the red channel values, the next 1024 the green, and the final 1024 the blue. The values are stored in row-major order, so the first 32 bytes are the red channel values of the first row of the image.
Superclass | Classes |
aquatic mammals | beaver, dolphin, otter, seal, whale |
fish | aquarium fish, flatfish, ray, shark, trout |
flowers | orchids, poppies, roses, sunflowers, tulips |
food containers | bottles, bowls, cans, cups, plates |
fruit and vegetables | apples, mushrooms, oranges, pears, sweet peppers |
household electrical devices | clock, computer keyboard, lamp, telephone, television |
household furniture | bed, chair, couch, table, wardrobe |
insects | bee, beetle, butterfly, caterpillar, cockroach |
large carnivores | bear, leopard, lion, tiger, wolf |
large man-made outdoor things | bridge, castle, house, road, skyscraper |
large natural outdoor scenes | cloud, forest, mountain, plain, sea |
large omnivores and herbivores | camel, cattle, chimpanzee, elephant, kangaroo |
medium-sized mammals | fox, porcupine, possum, raccoon, skunk |
non-insect invertebrates | crab, lobster, snail, spider, worm |
people | baby, boy, girl, man, woman |
reptiles | crocodile, dinosaur, lizard, snake, turtle |
small mammals | hamster, mouse, rabbit, shrew, squirrel |
trees | maple, oak, palm, pine, willow |
vehicles 1 | bicycle, bus, motorcycle, pickup truck, train |
vehicles 2 | lawn-mower, rocket, streetcar, tank, tractor |
Version | Size | md5sum |
CIFAR-100 python version | 161 MB | eb9058c3a382ffc7106e4002c42a8d85 |
CIFAR-100 Matlab version | 175 MB | 6a4bfa1dcd5c9453dda6bb54194911f4 |
CIFAR-100 binary version (suitable for C programs) | 161 MB | 03b5dce01913d631647c71ecec9e9cb8 |
<1 x coarse label><1 x fine label><3072 x pixel> ... <1 x coarse label><1 x fine label><3072 x pixel>
The file has 60000 rows, each row contains a single index into the tiny db, where the first image in the tiny db is indexed "1". "0" stands for an image that is not from the tiny db. The first 50000 lines correspond to the training set, and the last 10000 lines correspond to the test set.