Indexing: Failure to index one dataset, results in exception when clicked. #69

Closed
kcondon opened this issue Apr 13, 2020 · 6 comments

@kcondon
Contributor

kcondon commented Apr 13, 2020

With v4.20, "index all" worked for every dataset but one. That dataset has an existing record, but clicking on it throws an exception.

Indexing that dataset directly also throws an error.

Leonid reported seeing these errors in the log on page load:

[2020-04-13T14:54:04.331-0400] [glassfish 4.1] [WARNING] [] [javax.enterprise.resource.webcontainer.jsf.lifecycle] [tid: _ThreadID=105 _ThreadName=jk-connector(3)] [timeMillis: 1586804044331] [levelValue: 900] [[
  #{DatasetPage.init}: java.lang.NullPointerException
javax.faces.FacesException: #{DatasetPage.init}: java.lang.NullPointerException
	at com.sun.faces.application.ActionListenerImpl.processAction(ActionListenerImpl.java:118)

It looks like it's something to do with the author of the dataset (L.A.):

Caused by: java.lang.NullPointerException
	at edu.harvard.iq.dataverse.DatasetAuthor.isEmpty(DatasetAuthor.java:87)
	at edu.harvard.iq.dataverse.DataCitation.lambda$getAuthorsAndProducersFrom$1(DataCitation.java:738)

The dataset: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/2U2N87

@djbrooke djbrooke transferred this issue from IQSS/dataverse Apr 13, 2020
@landreev
Collaborator

To me this looks more like a curation issue than a dev one, if it's just a matter of going into the db and fixing whatever is wrong with the author for the dataset.

But then it may also be something in our code that allows a user to create a dataset with a "bad" author field, or allows it to be edited in a way that saves it but then breaks the dataset...

The dataset passes our validation API btw - so it's not an outright constraint violation.

@landreev
Collaborator

We have changed something about author affiliations in 4.20 - correct?
This is line 87 of DatasetAuthor.java:

        return ( (affiliation==null || affiliation.getValue().trim().equals(""))

That is what throws the null pointer. So it looks like it is now possible to have an affiliation field that is not null but has a null value. Hmm.
Also, there don't seem to be any other datasets in the db with this issue - at least this one is the only one that failed to reindex...
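
To illustrate the failure mode described above, here is a minimal sketch. It is not the actual Dataverse code: `AffiliationCheckSketch`, its nested `FieldValue` class, and both method names are hypothetical stand-ins. It only shows why a non-null affiliation field holding a null value trips the quoted condition, and what a null-safe version of that condition could look like.

```java
// Hypothetical stand-in for the check quoted from DatasetAuthor.java line 87.
// None of these names come from Dataverse itself.
final class AffiliationCheckSketch {

    // Models a field object that exists but may carry a null value.
    static final class FieldValue {
        private final String value;

        FieldValue(String value) {
            this.value = value;
        }

        String getValue() {
            return value;
        }
    }

    // Same shape as the quoted condition: throws NullPointerException
    // when the field is present but getValue() returns null.
    static boolean isBlankUnsafe(FieldValue affiliation) {
        return affiliation == null || affiliation.getValue().trim().equals("");
    }

    // Null-safe variant: a missing field, a null value, and a
    // whitespace-only value are all treated as empty.
    static boolean isBlankSafe(FieldValue affiliation) {
        return affiliation == null
                || affiliation.getValue() == null
                || affiliation.getValue().trim().isEmpty();
    }

    public static void main(String[] args) {
        FieldValue broken = new FieldValue(null);
        System.out.println("safe check: " + isBlankSafe(broken));
        try {
            isBlankUnsafe(broken);
        } catch (NullPointerException e) {
            System.out.println("unsafe check threw NullPointerException");
        }
    }
}
```

Whether the real fix belongs in this check or in whatever code path saved the null value is a separate question; the sketch covers only the read side.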

@jggautier
Collaborator

We have changed something about author affiliations in 4.20 - correct?

The PR at IQSS/dataverse#6619 removes parentheses from author affiliation values in the Search API results.

@djbrooke
Contributor

  • Some investigation to see how this happened, and fix in prod
  • Determine code fix and put up a PR

@sekmiller

If we want to delete the offending record, here is the diagnostic query (hat tip Leonid):

-- Find null-valued entries of field type 9 that belong to dataset versions
-- (template fields are excluded); t.name in the result shows the field type.
SELECT v.id, f.datasetfieldtype_id, t.name, fp.datasetversion_id
FROM datasetfieldvalue v,
     datasetfield f,
     datasetfield fp,
     datasetfieldcompoundvalue fpv,
     datasetfieldtype t
WHERE v.datasetfield_id = f.id
  AND f.datasetfieldtype_id = t.id
  AND v.value IS NULL
  AND t.id = 9
  AND f.parentdatasetfieldcompoundvalue_id = fpv.id
  AND fpv.parentdatasetfield_id = fp.id
  AND fp.template_id IS NULL;

The "v.id" in the result can be used in the delete query.

/4776468 - this is the id from the production clone that I had locally

delete from datasetfieldvalue where id = xxxxxx;

@sekmiller sekmiller assigned djbrooke and unassigned sekmiller May 4, 2020
@landreev
Copy link
Collaborator

landreev commented May 4, 2020

Just to document it: this was fixed without running direct db queries. Instead, a draft version was created (edit -> save, with no need to make any actual changes) and then published, as a superuser, without incrementing the version. This updates the latest version, removing the empty/null values in the process, and then deletes the draft.
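
As a rough illustration of the "removing the empty/null values" step, here is a minimal sketch of that kind of cleanup on a plain list of strings. It makes no claim about how Dataverse implements it internally: the class `EmptyValueCleanupSketch`, the method `dropEmptyValues`, and the sample values are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration only; not taken from the Dataverse code base.
final class EmptyValueCleanupSketch {

    // Keeps only values that are non-null and contain non-whitespace text.
    static List<String> dropEmptyValues(List<String> values) {
        return values.stream()
                .filter(v -> v != null && !v.trim().isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up sample data: one real value, one null, one blank.
        List<String> stored = Arrays.asList("Example University", null, "  ");
        System.out.println(dropEmptyValues(stored));
    }
}
```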

@djbrooke djbrooke closed this as completed May 4, 2020