Metadata: Astronomy - Changes to Ingest and Display of Resolution Elements #614

Closed
eaquigley opened this issue Jul 9, 2014 · 15 comments

@eaquigley
Contributor

@posixeleni :
Eleni,
I finally got to working on this...
I'm trying to make sense out of what it is that we agreed to do with these fields... and I'm still confused about some of it:

Author Name: Eleni Castro (@posixeleni)
Original Redmine Issue: 4051, https://redmine.hmdc.harvard.edu/issues/4051

After meeting with Gus and Leonid about FITS file ingest, it was decided that we would just display the min and max values, and set "allowMultiples" to FALSE at the Dataset aggregate level (astrophysics.tsv) for the following elements:

  • Spatial Resolution
  • Spectral Resolution
  • Time Resolution

*Note: Continue to store all the individual File-level values in the backend but in the UI Dataset level only show min and max as a string in a single field. (no need to create compound).
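
For illustration only, the display rule in the note above could be sketched roughly like this in Python. The function name `aggregate_resolution` and the space-separated "min max" format are assumptions (the format is inferred from the "1.234 5.678" example mentioned in this ticket); this is not the actual Dataverse ingest code.

```python
from typing import Iterable, Optional


def aggregate_resolution(file_values: Iterable[float]) -> Optional[str]:
    """Collapse per-file resolution values into one "min max" display string.

    Hypothetical helper: the name and the output format are assumptions.
    """
    values = list(file_values)
    if not values:
        # No file reported this field, so there is nothing to display.
        return None
    return f"{min(values)} {max(values)}"


# Values as they might be extracted from three FITS files in one dataset.
print(aggregate_resolution([5.678, 1.234, 3.0]))
```

With a single file the sketch repeats the value for both min and max; whether that is the desired display was not discussed in this ticket.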

In all honesty, it seems kinda strange that we are doing this for, say, time.resolution... to me it would make much more sense to allow multiples on this field, and just store every (unique) value found.
But ok, I'm going to assume that this is what Gus wanted. But then the metadata block needs to change further: for all 3 of the resolution.* fields above, the "fieldtype" is still set to "float". Since Gus wants these to be strings made of "<min> <max>", the type needs to be set to "text".
Otherwise I'm getting validation errors - because "1.234 5.678" isn't a legit float number; even though it's made of 2 valid floats.
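
A minimal Python sketch of why a float-typed field rejects the aggregate string. `is_valid_float` is a hypothetical stand-in for the field validation, not the actual Dataverse validator.

```python
def is_valid_float(value: str) -> bool:
    """Return True if the whole string parses as a single float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


aggregate = "1.234 5.678"
# Validate the combined string, then each half on its own.
print(is_valid_float(aggregate))
print([is_valid_float(part) for part in aggregate.split()])
```

The combined string fails even though each half parses on its own, which is why the field type would have to be "text" if min and max are stored together in one field.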

@eaquigley
Contributor Author


Original Redmine Comment
Author Name: Eleni Castro (@posixeleni)
Original Date: 2014-06-05T19:12:48Z


Leonid - apologies for the oversight (still learning what the protocol for redmine tickets is - had assigned to Gustavo) please review this FITS ticket that was created based on our talk with Gus last week and let me know if anything is unclear or if clarification is needed. I am already putting the changes into the booleans in the astrophysics.tsv file and will include it in my batch of metadata changes with a note to this ticket for QA.

@eaquigley eaquigley added this to the Dataverse 4.0: In Review milestone Jul 9, 2014
@scolapasta scolapasta modified the milestones: Beta 5 - Dataverse 4.0, In Review - Dataverse 4.0 Jul 15, 2014
@landreev
Contributor

My assumption was that the FITS ingest was already implemented per the design that we agreed on in that meeting with Gus.
However, I just retested several files that were properly ingesting before, and quite a few failed. It appears that this is indeed due to some changes in the metadata block; some field where multiples used to be allowed is no longer configured that way; some other field that used to be defined as text is now defined as float so a validation error occurs if a value found in a file does not parse as a valid float... etc.
Again, I thought all such changes had already been synchronized. I don't have time to work on it this week so the ticket has to be rescheduled.

@landreev
Contributor

@posixeleni :
Just noticed your comment above:

*Note: Continue to store all the individual File-level values in the backend but in the UI Dataset
level only show min and max as a string in a single field. (no need to create compound).

Did this comment come from Gus? I thought I was able to communicate to him that we DO NOT store any metadata per individual files. So we can't continue doing so. The only metadata we have stored is the aggregate values associated with the dataset; these are built from the values extracted from individual files on ingest, but there is no mechanism in 4.0 for attaching any fields and values from metadata blocks to files.

I did propose it at some point, that we should somehow make some fields assignable to files... but nothing was decided or done about it, so at this point it's safe to assume it's not happening in 4.0.

@posixeleni
Contributor

@landreev Yes, this note came from the last meeting we had together with Gus. If we can't currently store these metadata values at the file level, then I can create a separate ticket for us to review post-4.0. Let me know if I can help clarify anything else.

@landreev
Contributor

@posixeleni
I had a detailed discussion with him about it... Just trying to remember if it was before or after that meeting. And my understanding was that he no longer had any expectations to have any metadata directly associated with files. But yes, we'll revisit this after 4.0. It does appear that most of the fields in their metadata block are file-level properties - so it would make more sense if we could keep them attached to files.

@pdurbin
Member

pdurbin commented Sep 11, 2014

He's @augustfly on GitHub in case he wants to leave any comments here or in #916.

@posixeleni
Contributor

@landreev let me know that in order for this change in the metadata to fully work I will need to change the fieldType from Float to Text since we will be storing these as strings in the db rather than floats. The changes in fieldType will happen for the following elements:

  • Spatial Resolution
  • Spectral Resolution
  • Time Resolution

Once this is in the db then this ticket can be closed.

@pdurbin @scolapasta would we need a new schema.xml for these changes and/or db drop?

@pdurbin
Member

pdurbin commented Oct 20, 2014

@pdurbin @scolapasta would we need a new schema.xml for these changes and/or db drop?

As a rule of thumb, whenever the TSV files for metadata blocks are touched, a new schema.xml is required. However until #370 is worked on, pretty much every field is already stored in Solr as "text_en" so I don't expect to need to publish a new Solr schema but @posixeleni please let me know when your commit is in so I can check.

I'll let @scolapasta comment on if a database drop is required or not since I'm not sure.

@posixeleni
Contributor

Spoke with @scolapasta and we should not need a db drop for these changes although schema.xml yes. Will assign to @pdurbin when I check in my changes.

posixeleni added a commit that referenced this issue Oct 20, 2014
For fieldType changes see #614.
These will require us to manually change in the DB and then have Phil
do a schema.xml change.

For new GSD Block addition see: #268
@posixeleni
Contributor

@pdurbin these changes are now committed which will require a manual change in the DB with the help of either @scolapasta or @kcondon and then I can assign to you for a schema.xml update:

@pdurbin
Member

pdurbin commented Oct 20, 2014

until #370 is worked on, pretty much every field is already stored in Solr as "text_en" so I don't expect to need to publish a new Solr schema

As I suspected, the changes in 3abff41 to astrophysics.tsv did not result in any change to the Solr schema.xml. No need to assign this ticket to me. I did just update the Solr schema.xml, but that was for the GSD block, the new customGSD.tsv file: #268 (reference)

landreev added a commit that referenced this issue Oct 20, 2014
…vel file metadata

aggregation, for some of the FITS metadata fields.
Per ticket #614.
@landreev landreev assigned kcondon and unassigned landreev Oct 31, 2014
@landreev
Contributor

All of Gus's files should be ingesting now.

@posixeleni
Contributor

@kcondon to test this successfully, please note that I need to work with either you or @scolapasta to put a small change into the DB so that these are interpreted as strings (text) in build. Please let me know if you can help me put this change in!

@posixeleni
Contributor

Correction: the DB in build has already been dropped and the DB in demo will be as well so this is ready for @kcondon to test.

@kcondon
Contributor

kcondon commented Nov 3, 2014

OK, I'm now able to ingest all Gus' FITS files.

@kcondon kcondon closed this as completed Nov 3, 2014