Lifecycle: experimental

NPS EML Creation

Creating Ecological Metadata Language (EML) metadata for NPS data packages is a two-step process.

The first step is to generate an EML-formatted .xml file. There are a number of tools for generating this initial file. This repo contains an R script, instructions, and an example of how to use EMLassemblyline to generate an initial EML metadata file that takes into account NPS data package specifications and the requirements for uploading to DataStore.

No matter the method of generating the initial EML file, the second step is to add NPS-specific information to the EML metadata (for instance, the data package DOI, links to the DRR, information about CUI, producing units and content unit connections). Currently, the only tool for adding NPS-specific metadata is the R/EMLeditor package. Editing EML by hand is not advised.

This is an early version of the NPS EML creation script. Please request enhancements and bug fixes through Issues.

Comprehensive Guide

For a comprehensive guide to generating EML via EMLassemblyline for NPS data packages, please consult the accompanying NPS EML Creation GitHub website.

Quickstart

Prior to generating EML you will need the following:

  1. Data: A set of fully QA/QC’d data files in .csv format using UTF-8 encoding.

  2. Internet access: for downloading software and packages. A strong internet connection is necessary, particularly if you have taxonomic information as EMLassemblyline will use scientific names to reach out to ITIS and/or GBIF to populate taxonomic coverage fields from Kingdom down to species (and beyond).

  3. Software: R (and probably RStudio) installed on your computer. These are both available in Software Center. See the R Advisory Group’s website for more information. You will also need to install the R package EMLassemblyline from GitHub as well as some other packages from CRAN:

# Install supporting packages from CRAN
install.packages(c("devtools", "lubridate", "tidyverse", "stringr"))
# Install NPSdataverse from GitHub
devtools::install_github("nationalparkservice/NPSdataverse")

# Attach the packages; library() takes one package per call
library(NPSdataverse)
library(lubridate)
library(tidyverse)
library(stringr)

  4. Access to MS Excel (or any other spreadsheet program) and Notepad (or any other text editor). These will facilitate editing tab-delimited files.
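
The data requirement above calls for .csv files in UTF-8 encoding. As a minimal sketch, assuming your data files sit together in one folder, the base R function validUTF8() can flag files that contain bytes that are not valid UTF-8. The helper name find_non_utf8_csv() and the demo folder are hypothetical and are not part of the NPS EML creation script.

```r
# Hypothetical helper (not part of the NPS script): return the names of
# .csv files in a folder that contain bytes that are not valid UTF-8.
find_non_utf8_csv <- function(data_dir) {
  csv_files <- list.files(data_dir, pattern = "\\.csv$",
                          full.names = TRUE, ignore.case = TRUE)
  is_valid <- vapply(csv_files, function(f) {
    lines <- readLines(f, warn = FALSE)
    all(validUTF8(lines))
  }, logical(1))
  basename(csv_files[!is_valid])
}

# Demo in a temporary folder: one clean file, one with a stray Latin-1 byte.
demo_dir <- file.path(tempdir(), "utf8_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines("site,name\n1,ok", file.path(demo_dir, "good.csv"))
writeBin(c(charToRaw("site,name\n1,"), as.raw(0xE9), charToRaw("\n")),
         file.path(demo_dir, "bad.csv"))

bad_files <- find_non_utf8_csv(demo_dir)
print(bad_files)  # expected: "bad.csv"
```

A file flagged this way usually needs to be re-saved as UTF-8 from the program that produced it. The check does not detect text that is valid UTF-8 but was decoded with the wrong encoding at an earlier stage.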

Download the Script

A stand-alone version of the NPS EML Creation script is available for download. You don’t have to download the entire repository to make EML.

Generate EML

  1. Edit the EML_Creation_Script.R file as necessary and run each line or set of code (except the make_eml() function).

  2. Edit the auto-generated .txt files using a text editor or spreadsheet application as necessary. For details, look at the NPS template editing guideline.

  3. Run the make_eml() function (this could take a little while - particularly if you have a lot of taxonomic data).

  4. Be sure to read and address any Issues or Warnings reported after running the make_eml() function.
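
The steps above can be sketched with EMLassemblyline calls directly. This is an illustration under stated assumptions, not the contents of EML_Creation_Script.R: the wrapper run_eml_sketch(), the license value, the title, dates, IDs, and file names are all placeholders, and the NPS script may use different templates and arguments.

```r
# Sketch only: every value below is a placeholder, not an NPS requirement.
run_eml_sketch <- function(working_dir, data_file, run_make_eml = FALSE) {
  if (!requireNamespace("EMLassemblyline", quietly = TRUE)) {
    stop("EMLassemblyline is not installed")
  }

  # Step 1: write template .txt files into working_dir
  EMLassemblyline::template_core_metadata(path = working_dir,
                                          license = "CC0")
  EMLassemblyline::template_table_attributes(path = working_dir,
                                             data.path = working_dir,
                                             data.table = data_file)

  # Step 2 happens outside R: edit the generated .txt templates by hand.
  if (!run_make_eml) {
    message("Templates written. Edit them, then rerun with run_make_eml = TRUE.")
    return(invisible(NULL))
  }

  # Step 3: assemble the EML file from the edited templates
  EMLassemblyline::make_eml(
    path = working_dir,
    data.path = working_dir,
    eml.path = working_dir,
    dataset.title = "Placeholder data package title",
    temporal.coverage = c("2020-01-01", "2020-12-31"),
    maintenance.description = "complete",
    data.table = data_file,
    data.table.description = "Placeholder table description",
    user.id = "placeholder_id",
    user.domain = "placeholder_domain",
    package.id = "placeholder_package_id"
  )
}
```

Calling run_eml_sketch("path/to/folder", "my_data.csv") would write the templates only. Keeping make_eml() behind the run_make_eml switch mirrors step 1's instruction to hold that call back until the templates have been edited.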

Next steps

Creating this EML file is not the final step in NPS EML creation. To fully utilize DataStore’s new capabilities and to make sure your data are easily discoverable and reusable, you still need to edit the EML file to provide NPS-specific information (e.g. publisher, unit connections, and DOIs).

Currently, the only tool available to add NPS-specific information to EML is R/EMLeditor. Editing your metadata by hand is not recommended.

Additional documentation

  1. The guide to using the NPS EML creation script to create EML with EMLassemblyline, on GitHub.
  2. The original EDI guidelines for creating EML.

Acknowledgements

EMLassemblyline and much of the excellent original documentation were developed by the Environmental Data Initiative. We have modified and annotated that documentation to make it more relevant to NPS.