CSV Importer

Bulkrax can import from a CSV file that follows the guidelines below.

Required fields

The CSV MUST have a column that uniquely identifies each record.
The header of this column MUST be either source_identifier or the field name configured in config/initializers/bulkrax.rb as source_identifier_field_mapping.
The source_identifier field MUST contain a unique identifier for the item.
The CSV MUST have a title column.
Every work MUST have both a source_identifier and a title.

Note: The source identifier is added to the imported Work in the system_identifier_field (see below for an explanation of the system identifier field). The default system_identifier_field is source.
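
The required-field rules above can be checked before an import is started. The following is a minimal Python sketch of such a pre-flight check; it is a standalone illustration, not part of Bulkrax. The function name check_required_fields is hypothetical, and the sketch assumes a comma-delimited CSV, the default source_identifier header, and no multi-line cells.

```python
import csv
import io


def check_required_fields(csv_text, id_column="source_identifier"):
    """Return a list of problems found in the CSV text (empty if none)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    headers = reader.fieldnames or []
    problems = []

    # Both the identifier column and the title column must be present.
    for required in (id_column, "title"):
        if required not in headers:
            problems.append(f"missing column: {required}")
    if problems:
        return problems

    seen = set()
    # The header is row 1, so data rows are numbered from 2.
    for row_number, row in enumerate(reader, start=2):
        identifier = (row[id_column] or "").strip()
        title = (row["title"] or "").strip()
        if not identifier:
            problems.append(f"row {row_number}: empty {id_column}")
        elif identifier in seen:
            problems.append(f"row {row_number}: duplicate {id_column} {identifier!r}")
        seen.add(identifier)
        if not title:
            problems.append(f"row {row_number}: empty title")
    return problems
```

If the identifier column has been renamed through source_identifier_field_mapping, the configured name can be passed as id_column.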

Supported fields

A column will be imported if its name matches an existing metadata property in Hyrax, e.g. title or creator.

In addition, the following columns will be imported:

collection
file
remote_files
model

Collections

A column headed collection will be used to define which collection imported works should be added to.

Multiple collections can be supplied if they are separated with a semi-colon (;) or pipe (|).

If the value provided matches a value found in the system_identifier_field of an existing collection, then works will be added to that collection. If not, a new collection will be created and both title and system_identifier_field will be set to the value supplied in the collection column.

For example:

source_identifier	title	collection
imported_work_1	Work One	Collection One
imported_work_2	Work Two	Collection One; Collection Two

In the first row (after the header), the Work being imported will be added to Collection One, and in the second, to both Collection One and Collection Two.

If either of those collections already exists, the existing collection is used. If not, a new one is created.
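
The splitting and find-or-create behaviour described above can be sketched in Python as follows. This is an illustration only, not Bulkrax's implementation: the function names are hypothetical, the existing set stands in for a lookup on the system_identifier_field, and trimming whitespace around each value is an assumption drawn from the example row.

```python
import re


def split_multi_value(cell):
    """Split a cell on semi-colons or pipes and trim surrounding whitespace."""
    parts = re.split(r"[;|]", cell or "")
    return [part.strip() for part in parts if part.strip()]


def resolve_collections(cell, existing):
    """Map each collection value in the cell to 'existing' or 'new'.

    `existing` is a set of identifiers assumed to be already present;
    a value not found in it would lead to a new collection being created.
    """
    return {
        name: ("existing" if name in existing else "new")
        for name in split_multi_value(cell)
    }
```

For the second row of the example, resolve_collections("Collection One; Collection Two", {"Collection One"}) reports Collection One as existing and Collection Two as new.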

Model

The model column is used to determine the work type. It is not required. In its absence, either the field mapping or default_work_type will be used. Read more about these in the Configuration guide.

Files

Files will be imported from a column called file or remote_files, if either is present.

The remote_files column contains URLs of files, which will be downloaded and imported. Multiple URLs can be supplied if separated by a semi-colon (;) or pipe (|); the URLs themselves MUST NOT contain semi-colons or pipes.

The file column contains filenames, which must be unique. Multiple filenames can be supplied if separated by a semi-colon (;) or pipe (|); the filenames themselves MUST NOT contain semi-colons or pipes.

Files Location

If imported from a pre-existing server location, files MUST be placed in a directory called files relative to the location of the CSV file.

If uploading using Browse Everything, the location of the files will be handled by the system.

For example:

source_identifier	title	creator	publisher	file
first_work	First work title	Smith, John	Faber and Faber	document.pdf
second_work	Second work title	Jones, David	Macmillan	firstdocument.docx; seconddocument.pdf
third_work	Third work title	Other, A.N.	Penguin

If the CSV to be imported is located at:

/tmp/imports/1/csv-to-be-imported.csv

then the files would be at:

/tmp/imports/1/files/document.pdf
/tmp/imports/1/files/firstdocument.docx
/tmp/imports/1/files/seconddocument.pdf

The third_work does not have any associated files.
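
The path rule above (a files directory next to the CSV) can be expressed as a short Python sketch, which may help when checking a server-side import for missing files. The function names expected_file_paths and missing_files are hypothetical, and the sketch is not part of Bulkrax.

```python
import re
from pathlib import Path


def expected_file_paths(csv_path, file_cell):
    """Return the paths where files named in a `file` cell are expected."""
    # Files live in a directory called "files" alongside the CSV.
    files_dir = Path(csv_path).parent / "files"
    names = [n.strip() for n in re.split(r"[;|]", file_cell or "") if n.strip()]
    return [files_dir / name for name in names]


def missing_files(csv_path, file_cell):
    """Return the expected paths that do not exist on disk."""
    return [p for p in expected_file_paths(csv_path, file_cell) if not p.is_file()]
```

For second_work in the example, expected_file_paths("/tmp/imports/1/csv-to-be-imported.csv", "firstdocument.docx; seconddocument.pdf") yields the two paths under /tmp/imports/1/files/ listed above; an empty cell, as for third_work, yields an empty list.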

Importing Metadata and Files from a Zip file

A Zip file containing a single CSV and a folder named files/ can be imported by the CSV Importer. The structure of the Zip is very important and is as follows:

metadata.csv
files/
  |
  file_1.png
  file_2.jpg

See the Files Location section above for how to reference the files within the CSV.

On macOS, in Finder, select the CSV and the files/ folder (Cmd + click to select multiple items), right-click, and select Compress. This will create the Zip file that will be imported.

NOTE: The names of the files themselves don't matter, as long as they match what's in the file column in the CSV. Likewise, the name of the CSV does not matter. However, the name of the folder containing the files does matter and MUST be written exactly as "files" (lowercase and plural). The structure of the Zip is also important: for example, if you compress a parent directory that contains the CSV and the files/ folder, it will not import properly.
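
On systems without Finder, or to avoid accidentally zipping the parent directory, the same layout can be produced with a script. The following Python sketch uses the standard zipfile module; the function name build_import_zip and its parameters are hypothetical, and it only packages files directly inside the given folder.

```python
import zipfile
from pathlib import Path


def build_import_zip(csv_path, files_dir, zip_path):
    """Write a Zip with the CSV at the top level and files under files/."""
    csv_path, files_dir = Path(csv_path), Path(files_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # The CSV sits at the root of the archive, not inside a parent folder.
        archive.write(csv_path, arcname=csv_path.name)
        for path in sorted(files_dir.iterdir()):
            if path.is_file():
                # The folder name inside the archive must be exactly "files".
                archive.write(path, arcname=f"files/{path.name}")
    return zip_path
```

Listing the archive afterwards (for example with zipfile.ZipFile(zip_path).namelist()) should show the CSV name and entries beginning with files/, with no enclosing directory.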

Configuration and Customization

Please see the Configuration guide for information on how to configure and customize the import, for example by excluding columns from import or splitting data on specific delimiters.
