Prerequisites

Follow these instructions first, to install the prerequisites to run nuxeo_spreadsheet.

Using `nuxeo_spreadsheet`

nuxeo_spreadsheet constitutes a set of prototype Python scripts ("csv2dict"), which can be used to import metadata in a tab-delimited spreadsheet into Nuxeo. Note that comma separated value-based spreadsheets (CSV) are not supported ("csv2dict" is a misnomer).

1. Upload files to a Project Folder in Nuxeo

Once you have the prerequisites in place, upload your content files into a Project Folder in Nuxeo. Import the files through the Nuxeo UI, or use the bulk import options.

2. Get directory paths to the files

Next, generate a list of directory paths for the files in that Project Folder. You'll need the path to the Project Folder; it's reflected in the URL in your browser view of Nuxeo. Make sure you are in your python environment (e.g., venv) and run this command.

nxls /asset-library/UCX/Project_folder --show-only-path

Optionally, add the additional > command followed a .txt filename, to output the list of directory paths to a .txt file in your home directory (e.g., at cd C:\Users\yourname\):

nxls /asset-library/UCX/Project_folder --show-only-path > paths.txt

If you're using miniconda within Windows, here's an overview of the process:

Open the Command Prompt from the Start menu
Activate your python environment. In this example, we're activating a python environment named "venv": activate venv
Run the command: nxls /asset-library/UCX/Project_folder --show-only-path

3. Create metadata in tab-delimited spreadsheet

Use Nuxeo Spreadsheet Template. The first tab comprises the template; the second tab provides an example for reference purposes.

Note with the following considerations:

We'd suggest saving a copy of the template in Google Sheets, and working directly in the Google Sheets format to build out the metadata. We do not recommend using Excel (.xlsx), based on our initial tests (Excel can add additional quotes around text, and also introduce errors with special characters).
The column headings in the tab-delimited spreadsheet need to exactly match the headings expected by the Python scripts constituting nuxeo_spreadsheet. You can double-check the headings by reviewing the columns.txt file in GitHub.
In cases where metadata elements are repeatable in Nuxeo, you can append a numeric indicator after the column heading. In the Nuxeo Tab-Delimited Spreadsheet Template, you can see examples of this for Creator. When using this function, note that you must include columns for all complex data fields (e.g., if repeating Creator information, the following fields must be in place: Creator # Name, Creator # Name Type, Creator # Role, Creator # Source, and Creator # Authority ID).
Each row in the spreadsheet can contain metadata for either a simple object, a parent-level record for a complex object, or a component for a complex object. The main thing to ensure is that the row corresponds to the correct File Path in Nuxeo.
File Path (color-coded in red) is required for each row; additionally, either Title, Type, Copyright Status, and/or Copyright Statement is required, if and when the objects will be published in Calisphere. For additional information on the metadata requirements, see the Nuxeo user guide
The File Path cell should contain the exact file directory path to the content file in Nuxeo, to be associated with the metadata record (e.g., "/asset-library/UCOP/nuxeo_tab_import_demo/ucm_li_1998_009_i.jpg").
If using Google Sheets directly to create your metadata records, note that some of the fields have validation rules. These fields are keyed to controlled vocabularies established in Nuxeo.
Once you've completed the process of creating metadata records using the template, save a copy as a tab-delimited file.

If using Google Sheets, download as tab separated value:

4. Import metadata into Nuxeo

Import Metadata into Nuxeo using Tab-Delimited Spreadsheet (see below to import using Google Sheet)

Load with meta_from_csv.py. This process will convert the metadata from the spreadsheet into Python dict outputs, and call pynux to import the Python dict outputs directly into Nuxeo.

usage: meta_from_csv.py [-h] --datafile DATAFILE [-d] [--loglevel LOGLEVEL]
                        [--rcfile RCFILE]

optional arguments:
  -h, --help           show this help message and exit
  --datafile DATAFILE  tab-delimited spreadsheet input file -- required
  -d, --dry-run        dry run
  --blankout           blank out all fields not set in sheet
  --sheet SHEET        importants a specific named sheet from google spreadsheet
  
common options for pynux commands:
  --loglevel LOGLEVEL  CRITICAL ERROR WARNING INFO DEBUG NOTSET, default is
                       ERROR
  --rcfile RCFILE      path to ConfigParser compatible ini file

Note for Windows: you may need to run python meta_from_csv.py ... or edit the shebang. If you're using miniconda within Windows, here's an overview of the process:

Open the Command Prompt from the Start menu
Activate your python environment. In this example, we're activating a python environment named "venv": activate venv
Go to nuxeo_spreadsheet\csv2dict in your home directory, e.g.: cd C:\Users\yourname\nuxeo_spreadsheet\csv2dict
Run the command. In this example, the DATAFILE is the location of a tab-delimited file (named "tab-delimited-metadata.txt") that's on our Desktop. python meta_from_csv.py --datafile C:\Users\yourname\Desktop\tab-delimited-metadata.txt

Import Metadata into Nuxeo using Google Sheet

Load with meta_from_csv.py. This process will convert the metadata from the spreadsheet into Python dict outputs, and call pynux to import the Python dict outputs directly into Nuxeo.

usage: meta_from_csv.py [-h] --datafile DATAFILE [-d] [--loglevel LOGLEVEL]
                        [--rcfile RCFILE]

optional arguments:
  -h, --help           show this help message and exit
  --datafile DATAFILE  tab-delimited spreadsheet input file -- required
  -d, --dry-run        dry run
  --blankout           blank out all fields not set in sheet
  --sheet SHEET        importants a specific named sheet from google spreadsheet
  
common options for pynux commands:
  --loglevel LOGLEVEL  CRITICAL ERROR WARNING INFO DEBUG NOTSET, default is
                       ERROR
  --rcfile RCFILE      path to ConfigParser compatible ini file

Note for Windows: you may need to run python meta_from_csv.py ... or edit the shebang. If you're using miniconda within Windows, here's an overview of the process:

Open the Command Prompt from the Start menu
Activate your python environment. In this example, we're activating a python environment named "venv": activate venv
Go to nuxeo_spreadsheet\csv2dict in your home directory, e.g.: cd C:\Users\yourname\nuxeo_spreadsheet\csv2dict
Run the command. In this example, the DATAFILE is the url of the Google Sheet. This command assumes that the first sheet (tab) in the spreadsheet is to be ingested. If there is another sheet (i.e. Sheet 2) which is the ingest sheet, skip this step and see note below. python meta_from_csv.py --datafile https://docs.google.com/spreadsheets/d/12345

IMPORTANT If the sheet you want to ingest is not the first sheet in the spreadsheet you will need to run the command differently. python meta_from_csv.py --datafile https://docs.google.com/spreadsheets/d/12345 --sheet 'Sheet Name'

If there are not spaces in the sheet name (i.e. SheetName), there does not need to be quotes surrounding the name. However if the sheet has spaces in the name (i.e. Sheet Name), quotes need to enclose the name when running the script.

`mets_example`

Sample code for loading METS into Nuxeo

Name		Name	Last commit message	Last commit date
Latest commit History 114 Commits
csv2dict		csv2dict
export_nuxeo		export_nuxeo
mets_example		mets_example
.gitignore		.gitignore
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Prerequisites

Using `nuxeo_spreadsheet`

1. Upload files to a Project Folder in Nuxeo

2. Get directory paths to the files

3. Create metadata in tab-delimited spreadsheet

4. Import metadata into Nuxeo

Import Metadata into Nuxeo using Tab-Delimited Spreadsheet (see below to import using Google Sheet)

Import Metadata into Nuxeo using Google Sheet

`mets_example`

About

Releases

Packages

Languages

dnoneill/nuxeo_spreadsheet

Folders and files

Latest commit

History

Repository files navigation

Prerequisites

Using nuxeo_spreadsheet

1. Upload files to a Project Folder in Nuxeo

2. Get directory paths to the files

3. Create metadata in tab-delimited spreadsheet

4. Import metadata into Nuxeo

Import Metadata into Nuxeo using Tab-Delimited Spreadsheet (see below to import using Google Sheet)

Import Metadata into Nuxeo using Google Sheet

mets_example

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Using `nuxeo_spreadsheet`

`mets_example`

Packages