chore: Change wording
pierluigiferrari committed Mar 31, 2018
1 parent 2e5e7a8 commit a5e0d48
Showing 1 changed file with 6 additions and 6 deletions.
12 changes: 6 additions & 6 deletions data_generator_tutorial.ipynb
@@ -8,13 +8,13 @@
"\n",
"This is a brief tutorial on how to use this data generator. We're using a small example dataset that comes with this repository.\n",
"\n",
-"The generator can handle all sorts of annotation formats, but for this tutorial our example dataset will be a tiny subset of the Pascal VOC 2007 dataset.\n",
+"The generator can handle all sorts of annotation formats, but for this tutorial our example dataset will be a tiny subset (8 images) of the Pascal VOC 2007 dataset.\n",
"\n",
-"It is generally not necessary that ground truth annotations are available for the dataset, the generator can also load and generate batches of only images without any annotations, but for purpose of this tutorial we'll assume that we do have annotations.\n",
+"It is generally not necessary that ground truth annotations are available for the dataset, the generator can also load and generate batches of only images without any annotations, but for the purpose of this tutorial we'll assume that we do have annotations.\n",
"\n",
-"In the case of a dataset without annotations, for example a test dataset for a competition for which no ground truth is publicly available, everything explained here works the same, the only difference being that there are no annotations.\n",
+"In the case of a dataset without annotations, for example a test dataset for a competition for which no ground truth is publicly available, everything explained here works just the same, the only difference being that there are no annotations.\n",
"\n",
-"Even though this tutorial explains a lot of important aspects about this data generator, it goes without saying that you should also read the documentation of all the relevant classes and functions so that you understand what all the parameters are."
+"Even though this tutorial explains a lot of important aspects about this data generator, it goes without saying that you should also read the documentation of all the relevant classes and functions so that you understand what all the parameters do."
]
},
{
@@ -102,7 +102,7 @@
"This is where you actually tell the generator what your dataset is and where you parse its annotations (if there are any). Parsing the annotations just means that the generator will read the annotations from XML, JSON, CSV or whatever files your annotations are in and store them in a long list that it keeps in memory.\n",
"\n",
"The data generator provides three such parser methods:\n",
-"1. `parse_csv()`: This one is for datasets where you have a folder that contains all your images and (optionally) a CSV file that contains the annotations for all the images. This parser is fairly versatile with regard to the layout of said CSV file: You can tell it what columns of the CSV file contain what information. For more details on this parser, please refer to its documentation.\n",
+"1. `parse_csv()`: This one is for datasets where you have a folder that contains all your images and a CSV file that contains the annotations for all the images. This parser is fairly versatile with regard to the layout of the CSV file: You can tell it what columns of the CSV file contain what information. For more details on this parser, please refer to its documentation.\n",
"2. `parse_json()`: This one is for datasets in the MS COCO format. The images and annotations of your dataset don't necessarily need to be all in one directory, a point that will be illustrated below. For more details on this parser, please refer to its documentation.\n",
"3. `parse_xml()`: This one is for datasets in the Pascal VOC format, and this is the parser we'll be using for this tutorial to parse our example dataset. As with the JSON parser, the images and annotations of your dataset don't necessarily need to be all in one directory.\n",
"\n",
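To make the idea of "parsing" concrete, here is a minimal, self-contained sketch of the kind of work a CSV parser does: read the annotation rows, interpret the columns, and keep the result as a flat list in memory. The column layout used below (`image_name`, `xmin`, `xmax`, `ymin`, `ymax`, `class_id`) is only an assumption for illustration; the real `parse_csv()` lets you configure which columns mean what, so refer to its documentation for the actual interface.

```python
import csv
import io

# A tiny in-memory stand-in for an annotations CSV file. The column names
# and their order are hypothetical, chosen only for this illustration.
csv_data = io.StringIO(
    "image_name,xmin,xmax,ymin,ymax,class_id\n"
    "000001.jpg,12,260,30,180,1\n"
    "000001.jpg,40,100,50,120,2\n"
)

# Read every row and convert the box coordinates and class ID to integers.
# The result is one long list of ground truth tuples kept in memory.
labels = []
for row in csv.DictReader(csv_data):
    labels.append((row["image_name"],
                   int(row["xmin"]), int(row["xmax"]),
                   int(row["ymin"]), int(row["ymax"]),
                   int(row["class_id"])))

print(labels[0])  # ('000001.jpg', 12, 260, 30, 180, 1)
```

Because the parser only needs the column-to-meaning mapping, the same approach works for any CSV layout as long as you tell it where each piece of information lives.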
@@ -112,7 +112,7 @@
"\n",
"The XML parser needs:\n",
"* `image_dirs`: a list of directories that contain the images of your dataset,\n",
-"* `image_set_filenames`: a list of image set files that define which images are to be included in the dataset, and\n",
+"* `image_set_filenames`: a list of paths of image set files that define which images are to be included in the dataset, and\n",
"* `annotations_dirs` (optional): a list of directories that contain the annotations XML files of your dataset.\n",
"\n",
"If you are not familiar with this dataset format, please refer to the [official Pascal VOC documentation](https://host.robots.ox.ac.uk/pascal/VOC/). As for the second of these arguments above, an image set is simply a list of image IDs in the form of a text file that determines which images belong to a dataset. Note that the three arguments above are not paths, but lists of paths. You can pass more than one image directory, more than one annotations directory, and more than one image set to the parser. The parser will then compose the joint dataset from all the sources you pass here. An important thing to note here is that all three of these lists need to have the same length, i.e. each image directory corresponds to its respective annotations directory and image set file. In the example below these lists will have only one element each.\n",
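As a rough, self-contained sketch of what `parse_xml()` has to do per image, the snippet below reads an image set (a plain list of image IDs) and extracts the labeled boxes from a shortened Pascal VOC annotation using the standard library. File contents are inlined as strings here instead of being read from the directories; the real parser of course works on the `image_dirs`/`image_set_filenames`/`annotations_dirs` paths you pass it.

```python
import xml.etree.ElementTree as ET

# An image set file is just a text file with one image ID per line:
image_set = "000005\n000007\n"
image_ids = image_set.split()

# A shortened Pascal VOC XML annotation for one of those image IDs:
annotation = """
<annotation>
  <filename>000005.jpg</filename>
  <object>
    <name>chair</name>
    <bndbox><xmin>263</xmin><ymin>211</ymin><xmax>324</xmax><ymax>339</ymax></bndbox>
  </object>
</annotation>
"""

# Walk the <object> elements and collect (class, xmin, ymin, xmax, ymax).
root = ET.fromstring(annotation)
boxes = []
for obj in root.iter("object"):
    bndbox = obj.find("bndbox")
    boxes.append((obj.find("name").text,
                  int(bndbox.find("xmin").text),
                  int(bndbox.find("ymin").text),
                  int(bndbox.find("xmax").text),
                  int(bndbox.find("ymax").text)))

print(image_ids)  # ['000005', '000007']
print(boxes)      # [('chair', 263, 211, 324, 339)]
```

In the real generator this loop runs once per image ID in each image set, which is why the three argument lists must line up: each image set is resolved against its corresponding image directory and annotations directory.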
