JsMarc is a Vanilla-JS utility designed by Clément Corbin to handle bibligraphic MARC (MAchine Readable Cataloging) records, commonly used by libraries.
JsMarc can be used in several ways: a web interface, a command-line tool and ES modules are provided.
This application relies on Workerify to execute JsMarc off the main event loop, by creating a handful of web workers on the fly. These limit the risk of freezing the tab & improve performance on large records batches.
Online JsMarc is available at https://corbin-c.github.io/jsmarc/app/
Main features are:
- batch record parsing: displays all the records in a batch in a table, with the raw record followed by selected fields & subfields
- batch record filtering: extracts the matching records from a batch given a list of values and a field. Output is a MARC file to download.
- data extraction: parses records for a given set of fields and displays a summary. Outputs a HTML table or a JSON.
- MARC help: displays meaning of fields/subfields codes on hover, provides help selecting fields
/!\ You'll need NodeJS locally installed to run this script
Just clone this repo (git clone https://github.com/corbin-c/jsmarc.git
). You
might want to symlink the marc-node
script to your local bin path.
Execute ./marc-node
to run the CLI tool. Without parameters, it returns the following
help:
$ ./marc-node
Error: Mandatory argument: command
Usage: marc-node COMMAND FILE [OPTIONS]
If FILE is -, read stdin.
Commands:
display
filter
extract
help
Options: Syntax: --KEY=VALUE
encoding
record-separator
field-separator
subfield-separator
format
fields
values
curl "https://web-z3950.herokuapp.com/?server=lx2.loc.gov:210/LCDB&isbn=0066620724,0596001312&format=usmarc" | ./marc-node display - --format=marc21
Grabs two records from the Library of Congress Z3950 server (using my Web-Z3950 NodeJS binding), displays them and explicits the fields.
./marc-node display /path/to/records.mrc --fields=856\$u
Opens the /path/to/records.mrc
batch of records and only shows the 856$u
field. (NB: a backslash is needed to escape the dollar sign)
curl "https://web-z3950.herokuapp.com/?server=lx2.loc.gov:210/LCDB&isbn=0066620724,0596001312&format=usmarc" | ./marc-node extract - --fields=100\$a,020\$a
Extracts the 100$a
and 020$a
fields from two records from the LoC and generates
the following JSON output:
[{
"leader": "01208cam a22003014a 4500",
"fields": [{
"code": "020",
"indicator": " ",
"subfields": [{
"code": "a",
"value": "0066620724 (hc)"
}]
}, {
"code": "100",
"indicator": "1 ",
"subfields": [{
"code": "a",
"value": "Torvalds, Linus,"
}]
}]
}, {
"leader": "01397cam a22003014a 4500",
"fields": [{
"code": "020",
"indicator": " ",
"subfields": [{
"code": "a",
"value": "0596001312"
}]
}, {
"code": "020",
"indicator": " ",
"subfields": [{
"code": "a",
"value": "0596001088 (pbk.)"
}]
}, {
"code": "100",
"indicator": "1 ",
"subfields": [{
"code": "a",
"value": "Raymond, Eric S."
}]
}]
}]
./marc-node filter ./path/to/batch/records.mrc --fields=020\$a --values=0596001312,"0066620724 (hc)"
Filters a batch of records to only keep records where the 020$a
field matches
one of the provided comma-separated values (0596001312,"0066620724 (hc)"). Note
the quotes used when providing values containing spaces.
The tools presented above all relies on two ES modules, the parser and the helper.
Import them with:
import * as MarcParser from "https://corbin-c.github.io/jsmarc/src/parser.js";
import * as MarcHelper from "https://corbin-c.github.io/jsmarc/src/helper.js";
This module exports a few useful tools for working on MARC records.
First, it comes with a built-in field code reader, which allows one to pass
the MarcParser fields codes using the "traditional" notation, which consists of the field
code and the subfield code separated by a dollar sign ($
). This notation can be
used when selecting fields to parse or to filter. For example, the ISBN in Marc21,
is located at subfield a
of field 020
, which could be written as 020$a
.
Multiple fields may be selected, comma-separated.
The parseRecord()
function can be called with one record and a facultative set
of parameters:
MarcParser.parseRecord(record,{
toParse, //array of selected fields to work on, as field_code$subfield_code, eg ["856$u"]
fieldSeparator,
subfieldSeparator //in case the record is not ISO2709 compliant, one may want to adjust the separators
});
It outputs a MarcParser
object, with the following properties:
MarcParser {
fieldSeparator: '', //string from parameters
subfieldSeparator: '', //string from parameters
rawRecord: '', //string from record argument
leader: '', //24 first chars of record
directory: [ //describes the position, code & length of each one of the fields
{ code: '001', length: '0012', position: '00000' },
...
],
fields: [ //describes all fields
{ code: '123', value: 'string' }, //fields without subfields only have a value
{ code: '456', indicator: ' ', subfields: [Array] }, //otherwise they contain an array of subfields
... //subfields are structured as { code: "", value: "" } too
],
'@fields': { code: '', indicator: ' ', subfields: [], value: '' }, //field template
'@directory': { code: [ 0, 3 ], length: [ 3, 7 ], position: [ 7, 12 ] }, //directory template
parseCode: '', //string from parameters
header: '' //string: leader + directory
}
The filterRecord()
function is meant to be used with the .filter()
array method.
Given a parsed record (a MarcParser object
), a set of values and a field
(as field_code$subfield_code
, e.g. 856$u
), it returns true
or false
whether the
record matches the provided parameters.
/!\ This module can't work without the MarcParser, unless you build your own
record
object, with the required structure.
The MarcHelper module is able to read from MARC definitions files (see the definitions
folder for further details) and enrich records with labels explaining the fields,
subfields and indicators meanings.
It provides the explainRecord
function, which takes a parsed Marc record and
a format as arguments. The function returns a promises which resolves when the MARC
definitions files have been loaded:
await MarcHelper.explainRecord(parsedRecord,format);
This will return a record object, enriched with labels for each field, subfield and
indicator (if that label was found).
The format
parameter has to match one of the keys of the formats.json
object.
It is also possible to reverse search for field codes with the searchField
function. Given a string and a MARC format, it returns a promise, resolving in
an array of code/label pairs:
await MarcHelper.searchField("auteur","unimarc");
//expected ouput (truncated):
[
{ code: '200$c', value: "Titre propre d'un auteur différent" },
{ code: '488$a', value: 'Auteur[7x0 du document en relation]' },
{ code: '604', value: "Point d'accès sujet - Auteur/Titre" },
{ code: '604$a', value: 'Auteur' },
{ code: '701$4', value: "Auteur d'oeuvre adaptée ou continuée" },
{ code: '702$4', value: "Auteur d'oeuvre adaptée ou continuée" },
...
]