MarcXmlParser XMLReader parse error when converting from MARCXML back to MARC21 #1
Comments
Ugh! The XML in my text looks like crap on this page. If whoever is available to help with this will email me ([email protected]), I'll email you the XML causing the blow-up. (I've been burning weeks on this problem and it continues to stymie me.)
Actually, if you can see the saved text by selecting to edit this issue, it seems to have saved the XML text. But if it will help, I'm happy to email it to whoever attempts to analyze it. Joe Justice
Can you submit this as an issue under marc4j, with the binary and XML versions?
Yes. I guess I was in the wrong place. Sorry. ☺ -joe
Also, if you have a stack trace, that would be good, but sample code is good too. Simon On Thu, Feb 28, 2013 at 2:08 PM, gypsyjoe [email protected] wrote:
I will send you the process I'm working through, because several steps are involved in getting me to this point. I can include the original binary MARC21 from which this MARCXML is coming but, as I cannot convert the MARCXML I sent you, I cannot send you any binary MARC of that step. I'll do my best to describe what's going on in my item. I am able to convert some records that are not included here; I'll include those files, too, and describe them. I should have it ready soon. Thanks. -joe
How do I attach the files to the issue? It's issue #26. Here's the zip of the files I wanted to attach, but I can't figure out how to do it on the site. -joe
Looking at the MARCXML record above, the field you add:
is missing the MARC indicators (the ind1 and ind2 attributes). If you change the added datafield to be:
it should parse correctly and produce a valid MARC-8 encoded binary MARC record after conversion.
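As a rough illustration of that diagnosis, the sketch below scans a MARCXML string for datafield elements that lack ind1 or ind2 before the record is handed to marc4j. It uses only the JDK's DOM parser. The class name, method name, and sample 088 records (including the OSTI-ID value) are hypothetical and are not the record from this issue.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: not part of marc4j.
class IndicatorCheck {
    // Namespace of the MARCXML "slim" schema.
    static final String MARC_NS = "http://www.loc.gov/MARC21/slim";

    /** Returns the tags of all datafield elements that lack ind1 or ind2. */
    static List<String> tagsMissingIndicators(String marcXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(marcXml.getBytes(StandardCharsets.UTF_8)));
            NodeList fields = doc.getElementsByTagNameNS(MARC_NS, "datafield");
            List<String> missing = new ArrayList<>();
            for (int i = 0; i < fields.getLength(); i++) {
                Element field = (Element) fields.item(i);
                if (!field.hasAttribute("ind1") || !field.hasAttribute("ind2")) {
                    missing.add(field.getAttribute("tag"));
                }
            }
            return missing;
        } catch (Exception e) {
            // Malformed XML or parser configuration problems end up here.
            throw new IllegalArgumentException("Could not parse MARCXML", e);
        }
    }

    public static void main(String[] args) {
        String open = "<record xmlns=\"" + MARC_NS + "\">";
        // Made-up records: the first omits the indicators, the second has blank ones.
        String bad = open + "<datafield tag=\"088\">"
                + "<subfield code=\"a\">OSTI-ID=0000000</subfield></datafield></record>";
        String good = open + "<datafield tag=\"088\" ind1=\" \" ind2=\" \">"
                + "<subfield code=\"a\">OSTI-ID=0000000</subfield></datafield></record>";
        System.out.println(tagsMissingIndicators(bad));   // [088]
        System.out.println(tagsMissingIndicators(good));  // []
    }
}
```

Running a check like this over each record before conversion would point at the offending field directly, rather than leaving it to surface as a parser failure.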
Still ought to be handled more gracefully than an NPE. I was about to split the Reader and Writer tests on a per-class basis, so Simon On Sat, Mar 2, 2013 at 3:42 PM, haschart [email protected] wrote:
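One way to read "more gracefully than an NPE" is a SAX handler that validates the indicator attributes itself and reports which field is at fault. The sketch below is only an illustration using the JDK's SAX API. `StrictDatafieldHandler` and `validate` are hypothetical names, and this is not marc4j's actual handler code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: rejects datafields whose indicators are absent or malformed.
class StrictDatafieldHandler extends DefaultHandler {
    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        if (!"datafield".equals(localName)) {
            return;
        }
        String tag = atts.getValue("tag");
        for (String name : new String[] {"ind1", "ind2"}) {
            String value = atts.getValue(name);
            // A MARC indicator is a single character; a missing attribute yields null.
            if (value == null || value.length() != 1) {
                throw new SAXException("datafield " + tag + ": attribute '" + name
                        + "' must be exactly one character");
            }
        }
    }

    /** Returns null when every datafield is acceptable, otherwise a description of the problem. */
    static String validate(String marcXml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // needed so localName is populated
            factory.newSAXParser().parse(
                    new ByteArrayInputStream(marcXml.getBytes(StandardCharsets.UTF_8)),
                    new StrictDatafieldHandler());
            return null;
        } catch (SAXException e) {
            return e.getMessage();
        } catch (Exception e) {
            // I/O or parser configuration failure, unrelated to record content.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up record without indicators.
        String bad = "<record xmlns=\"http://www.loc.gov/MARC21/slim\">"
                + "<datafield tag=\"088\"><subfield code=\"a\">OSTI-ID=0000000</subfield>"
                + "</datafield></record>";
        System.out.println(validate(bad));
    }
}
```

The point of the design is that the caller gets a message naming the tag and the attribute, instead of a stack trace from deep inside the parser.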
Cool! Let me know if I may be of help or if you have any questions. I'm sure I could forward the DOM code showing how I'm doing things there. Honestly, I've been banging at this since before Code4Lib and it has been through all sorts of rewrites and attempts to comb out the problem. My latest thought is to pull the marc4j project code into my servlet code so I can step through the marc4j processes and examine them more completely. But I wasn't able to finish this setup on Friday. Good luck. I'm burning a candle for us. :-) -joe On Mar 2, 2013, at 3:49 PM, "Simon Spero" <[email protected]> wrote:
I am attempting to use marc4j to convert a MARCXML file back to MARC21 binary, which I had previously converted from MARC21 to MARCXML using marc4j. I made one update to some of the records in the MARCXML to add a single tag element for MARC tag 088 with a value of "OSTI-ID=#######" where the #'s are individual numeric digits. After making this update and then attempting to convert back to MARC21, I get a snag in the SAXParser that throws a NullPointerException. It breaks on a particular record.
I've attempted to work around this by loading the records into a DOM, taking each record node, serializing the node to a string, converting that string to a ByteArrayInputStream, and passing the single record to the MarcXmlReader object as an InputStream. But I get the following error for this record's metadata.
Exception getting thrown:
MarcXmlParser run() MarcException: Unable to parse input
XML Record causing the blow up:
MarcXmlParser run() MarcException: Unable to parse input
I would greatly appreciate it if someone could help me figure out why this record XML is flipping out the MarcXmlParser.parse function. It seems to be blowing up when the SAXParserFactory XMLReader attempts to parse the record. I'm even passing the node string through a normalizer like this to make sure it's valid ASCII text.
szxmlnode = Normalizer.normalize(szxmlnode, Normalizer.Form.NFD).replaceAll("[^\\p{ASCII}]", "");
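For reference, the per-record workaround described in this post (DOM node to string, normalize, wrap in an InputStream) might look roughly like the sketch below. It uses only JDK classes. `RecordStreams` and its method names are hypothetical, and the resulting stream is what would then be passed to marc4j's MarcXmlReader.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical helper class illustrating the workaround; not part of marc4j.
class RecordStreams {
    /** Decomposes accented characters (NFD), then drops everything outside ASCII. */
    static String toAscii(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFD)
                .replaceAll("[^\\p{ASCII}]", "");
    }

    /** Serializes a DOM node (for example one record element) to an XML string. */
    static String serialize(Node node) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(node), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not serialize node", e);
        }
    }

    /** Builds the single-record stream that would be handed to the MARCXML reader. */
    static InputStream toStream(Node recordNode) {
        String xml = toAscii(serialize(recordNode));
        return new ByteArrayInputStream(xml.getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toAscii("café"));  // cafe

        // Made-up record element, built here only to exercise serialize().
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        doc.appendChild(doc.createElement("record"));
        System.out.println(serialize(doc.getDocumentElement()));
    }
}
```

Two caveats apply to this approach: characters with no ASCII base letter are silently dropped rather than transliterated, and stripping non-ASCII text has no effect on structural problems such as missing attributes.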
Joe Justice
Sandia National Laboratories
Albuquerque, New Mexico