Links to TAIR locus pages are wrong in AmiGO #659

Closed
kltm opened this issue May 23, 2018 · 20 comments

Comments

@kltm
Member

kltm commented May 23, 2018

From @tberardini on May 23, 2018 15:10

Example:

http://amigo.geneontology.org/amigo/gene_product/TAIR:locus:2007651

The link URL to TAIR generated by AmiGO for locus:2007651 is

http://arabidopsis.org/servlets/TairObject?type=communication&id=locus:2007651

But should be:

https://www.arabidopsis.org/servlets/TairObject?accession=locus:2007651

Please fix. Thank you.

Copied from original issue: geneontology/amigo#503

@kltm
Member Author

kltm commented May 24, 2018

This likely traces back to #443

@kltm
Member Author

kltm commented May 24, 2018

@cmungall This seems similar to what we went through with WB a couple of months ago. This is essentially a collision in the "linker" code: it uses the main namespace as the lookup key, and TAIR has two different possible targets for the same namespace.
See https://github.com/geneontology/go-site/blob/master/metadata/db-xrefs.yaml#L2892
As, I believe, is mentioned in the analogous WB tickets from April, a full regexp lookup for all links would be quite expensive, so we went with suggesting either splitting the namespace and/or having the receiving resource deal with the local ID part. Experimenting with TAIR, the latter does not seem to be in place.
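
To make the collision concrete, here is a minimal sketch in Python. It is not the actual AmiGO linker code; the function names and the trimmed-down `TAIR_ENTITY_TYPES` list are assumptions for illustration, with the patterns and URL templates copied from the TAIR entry in db-xrefs.yaml. It contrasts a namespace-only lookup, where a single URL template ends up serving every TAIR ID, with a per-ID regexp lookup.

```python
import re

# Hypothetical, trimmed-down copy of two TAIR entity types from db-xrefs.yaml.
TAIR_ENTITY_TYPES = [
    {
        "id_syntax": r"locus:[0-9]{7}",
        "url_syntax": "http://arabidopsis.org/servlets/TairObject?accession=[example_id]",
    },
    {
        "id_syntax": r"Communication:[0-9]{7,12}",
        "url_syntax": "http://arabidopsis.org/servlets/TairObject?type=communication&id=[example_id]",
    },
]


def link_by_namespace(local_id, entity_types):
    """Namespace-only lookup: one template per namespace.

    Assumption: the last entity type listed wins, which is one way a
    collision like the one in this ticket could arise.
    """
    template = entity_types[-1]["url_syntax"]
    return template.replace("[example_id]", local_id)


def link_by_regexp(local_id, entity_types):
    """Per-ID lookup: pick the template whose id_syntax matches the local ID."""
    for entity_type in entity_types:
        if re.fullmatch(entity_type["id_syntax"], local_id):
            return entity_type["url_syntax"].replace("[example_id]", local_id)
    return None  # no entity type claims this ID
```

With `locus:2007651`, the namespace-only lookup in this sketch yields the communication-style URL from the original report, while the regexp lookup yields the `accession=` form. The cost is one regexp match per entity type per link, which is the expense mentioned above.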

@kltm
Member Author

kltm commented May 24, 2018

Also note: #158

@tberardini
Contributor

Anything I can do to help resolve this?

@cmungall
Member

sorry can't give full attention now, can we just have the gene link be the default for now? This is 99.9% of the links clicked on I would wager

@tberardini
Contributor

can we just have the gene link be the default for now?

Yes please.

@kltm
Member Author

kltm commented May 25, 2018

Will do.

@kltm
Member Author

kltm commented May 25, 2018

This "patch"/"hack" is now in place for TAIR.

@tberardini
Contributor

Thank you!

@kltm
Member Author

kltm commented May 25, 2018

NP.

I just want to spell this out so when we pick up on this again we have a quick start.
So, for example, here is all of the TAIR-ish information in our metadata as of this moment:

- database: AGI_LocusCode
  name: Arabidopsis Genome Initiative
  description: Comprises TAIR, TIGR and MIPS
  generic_urls:
    - http://www.arabidopsis.org
  entity_types:
    - type_name: gene
      type_id: SO:0000704
      id_syntax: A[Tt][MmCc0-5][Gg][0-9]{5}(\.[0-9]{1})?
      url_syntax: http://arabidopsis.org/servlets/TairObject?type=locus&name=[example_id]
      example_id: AGI_LocusCode:At2g17950
      example_url: http://arabidopsis.org/servlets/TairObject?type=locus&name=At2g17950
- database: AraCyc
  name: AraCyc metabolic pathway database for Arabidopsis thaliana
  generic_urls:
    - http://www.arabidopsis.org/biocyc/index.jsp
  entity_types:
    - type_name: entity
      type_id: BET:0000000
      url_syntax: http://www.arabidopsis.org:1555/ARA/NEW-IMAGE?type=NIL&object=[example_id]
      example_id: AraCyc:PWYQT-62
      example_url: http://www.arabidopsis.org:1555/ARA/NEW-IMAGE?type=NIL&object=PWYQT-62
- database: TAIR
  name: The Arabidopsis Information Resource
  rdf_uri_prefix: http://identifiers.org/tair.locus/
  generic_urls:
    - http://www.arabidopsis.org/
  entity_types:
    - type_name: gene
      type_id: SO:0000704
      id_syntax: gene:[0-9]{7,12}
      url_syntax: http://arabidopsis.org/servlets/TairObject?accession=[example_id]
      example_id: TAIR:gene:2062713
      example_url: http://arabidopsis.org/servlets/TairObject?accession=gene:2062713
    - type_name: communication
      type_id: BET:0000000
      id_syntax: Communication:[0-9]{7,12}
      url_syntax: http://arabidopsis.org/servlets/TairObject?type=communication&id=[example_id]
      example_id: TAIR:Communication:1345790
      example_url: http://arabidopsis.org/servlets/TairObject?type=communication&id=1345790
    - type_name: primary transcript
      type_id: SO:0000185
      id_syntax: locus:[0-9]{7}
      url_syntax: http://arabidopsis.org/servlets/TairObject?accession=[example_id]
      example_id: TAIR:locus:2146653
      example_url: http://arabidopsis.org/servlets/TairObject?accession=locus:2146653

That's four URL schemes under three namespaces. Ideally, we'd want one of two things, which practically boils down to one thing done two different ways: one namespace per URL scheme.

  1. Split the TAIR communications into their own namespace (e.g. TAIR_COMM), like PAINT_REF, etc.
  2. Set up a TAIR endpoint to resolve local IDs such that all entity types here have the same URL scheme (e.g. http://arabidopsis.org/servlets/TairObject?accession=ANYTHING:LOCAL_WHATEVS).

Really, for what we're doing, a better metadata format would have the url_syntax stuff at a higher level than under entity_types, so that there is no confusing repetition of something we expect to be the same in practice.
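
As a rough illustration of the "one namespace per URL scheme" check, the sketch below counts distinct url_syntax values per namespace. It is only a sketch under assumptions: `METADATA` is a hand-trimmed stand-in for the parsed metadata listed above, and the function names are hypothetical rather than part of any existing GO tooling.

```python
from collections import defaultdict

# Hypothetical, trimmed-down stand-in for the parsed metadata listed above.
METADATA = [
    {"database": "AGI_LocusCode", "entity_types": [
        {"url_syntax": "http://arabidopsis.org/servlets/TairObject?type=locus&name=[example_id]"},
    ]},
    {"database": "AraCyc", "entity_types": [
        {"url_syntax": "http://www.arabidopsis.org:1555/ARA/NEW-IMAGE?type=NIL&object=[example_id]"},
    ]},
    {"database": "TAIR", "entity_types": [
        {"url_syntax": "http://arabidopsis.org/servlets/TairObject?accession=[example_id]"},
        {"url_syntax": "http://arabidopsis.org/servlets/TairObject?type=communication&id=[example_id]"},
        {"url_syntax": "http://arabidopsis.org/servlets/TairObject?accession=[example_id]"},
    ]},
]


def url_schemes_by_namespace(metadata):
    """Collect the distinct url_syntax templates used under each namespace."""
    schemes = defaultdict(set)
    for entry in metadata:
        for entity_type in entry.get("entity_types", []):
            url_syntax = entity_type.get("url_syntax")
            if url_syntax:
                schemes[entry["database"]].add(url_syntax)
    return dict(schemes)


def ambiguous_namespaces(metadata):
    """Namespaces that a namespace-only linker cannot resolve to a single URL."""
    schemes = url_schemes_by_namespace(metadata)
    return sorted(ns for ns, urls in schemes.items() if len(urls) > 1)
```

On this trimmed data only TAIR is flagged, since its gene and locus entries share one template and the communication entry uses another. Either option above (splitting out a TAIR_COMM-style namespace or a single resolving endpoint) would leave the flagged list empty.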

@tberardini
Contributor

Came here from a TAIR helpdesk email. After reviewing the SO mapping for the entity_types above, I think we need to edit a bit. In TAIR, locus = SO:gene and gene = SO:primary_transcript.

- type_name: gene (SHOULD BE primary_transcript)
  type_id: SO:0000704 (SHOULD BE SO:0000185)
  id_syntax: gene:[0-9]{7,12}
  url_syntax: http://arabidopsis.org/servlets/TairObject?accession=[example_id]
  example_id: TAIR:gene:2062713
  example_url: http://arabidopsis.org/servlets/TairObject?accession=gene:2062713
- type_name: primary transcript (SHOULD BE gene)
  type_id: SO:0000185 (SHOULD BE SO:0000704)
  id_syntax: locus:[0-9]{7}
  url_syntax: http://arabidopsis.org/servlets/TairObject?accession=[example_id]
  example_id: TAIR:locus:2146653
  example_url: http://arabidopsis.org/servlets/TairObject?accession=locus:2146653

@dustine32
Contributor

@tberardini Thanks for the response! Yep, this was the first approach that @thomaspd and I were thinking of, but we're a little hesitant to claim all TAIR genes follow this. Just letting you know making this change would likely invalidate any TAIR entities in an annotation's "with/from" field if the identifier followed the "gene:1234567" syntax. Would this be OK?

@tberardini
Contributor

Just letting you know making this change would likely invalidate any TAIR entities in an annotation's "with/from" field if the identifier followed the "gene:1234567" syntax. Would this be OK?

Is that because the with/from field requires the entity to be in a subset of classes that don't include primary_transcript?

@dustine32
Contributor

@tberardini Correct! So just throw primary_transcript in this subset as well?

@tberardini
Contributor

@dustine32 I would say, yes.
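
The effect discussed in this exchange can be sketched as follows. This is an illustration only, not the real with/from validation code: the allowed-type sets are assumptions, and the function names are hypothetical. The ID patterns and SO identifiers follow the swapped mapping proposed above (locus = gene, gene = primary_transcript).

```python
import re

# TAIR entity types after the proposed swap: locus = gene, gene = primary_transcript.
TAIR_TYPES = [
    (re.compile(r"gene:[0-9]{7,12}"), "SO:0000185"),  # primary_transcript
    (re.compile(r"locus:[0-9]{7}"), "SO:0000704"),    # gene
]


def tair_type(local_id):
    """Return the SO type ID for a TAIR local ID, or None if nothing matches."""
    for pattern, type_id in TAIR_TYPES:
        if pattern.fullmatch(local_id):
            return type_id
    return None


def valid_with_from(local_id, allowed_types):
    """Check whether the ID's type is in the allowed subset for with/from."""
    return tair_type(local_id) in allowed_types


# Assumption: a subset that only admits genes, versus one that also admits
# primary transcripts as suggested above.
GENES_ONLY = {"SO:0000704"}
GENES_AND_TRANSCRIPTS = {"SO:0000704", "SO:0000185"}
```

Under these assumptions, `valid_with_from("gene:2062713", GENES_ONLY)` is false after the swap, and becomes true once SO:0000185 is added to the subset, while `locus:` IDs pass in both cases.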

@pgaudet
Contributor

pgaudet commented Jul 3, 2019

@tberardini @dustine32 Is this fixed?

@tberardini
Contributor

I've given my input, hope that was sufficient.

@kltm
Member Author

kltm commented Jul 3, 2019

@pgaudet As far as I know, what is outlined in #659 (comment) has not been completed (at least in terms of what would count as closure). It does not seem to be pressing at this point?

@pgaudet
Contributor

pgaudet commented Jul 3, 2019

Action point: Need identifier best practices when interacting with GO.

@cmungall
Member

cmungall commented Mar 2, 2023

I am closing this because the original issue is resolved

however, I have many questions here and will open a new issue

cmungall closed this as completed Mar 2, 2023