How would you improve GitHub? #580
Comments
Honestly, I have never found GitHub particularly attractive as a platform to host a collaborative research project such as the OSM. I understand all of the reasons for which it was chosen and it is functional, but as a participant in the OSM I have found trying to follow the discussion currents very frustrating. Much of the problem may simply be my lack of experience in navigating GitHub, but the below listed items are a summary of some of my issues. Many of these might be irrelevant with more time and training, but time is something I do not have to devote to a task that should be intuitive.
Thank you for letting me vent some of my frustrations. I believe in the premise and promise of the OSM and I am more than familiar with the issues of trying to fund / support basic research. I cannot say enough about how impressed I am with @mattodd for his organizational prowess and drive in keeping the project going. All of my comments were aimed at GitHub and not the OSM. |
I also find github difficult to use. I share many of the same specific complaints as @MedChemProf. I've been using the OSM Github for about two years now, and I still get confused about how to find information. I feel that Github is fine for managing code-based projects and for people used to git, but it feels awkward for OSM. On the other hand, I am not familiar with better alternatives. Whatever platform we adopt should be easy to learn and use for students, especially those from a biology or chemistry background. |
@FinnWoelm I've had to use many different document management systems when working with different groups, including GitHub, Drupal, Joomla, Dotmatics, Intralinks, Box, Pandadoc, Huddle, CDD, Science Cloud. None are ideal, all have their quirks and all have a learning curve. If you want the system to be open-source then there are limited options. If you want to do Open Source science then shouldn't the software also be open-source? Searching and indexing is an issue: you need to be able to search message threads and all document types (pdf, text, presentation, spreadsheet, chemical structure etc.). It would be better if the indexing recognised molecule identifiers. Simply hoping that Google will do the indexing is not enough. It also requires some discipline by the users to tag documents/messages appropriately, but this is very difficult to enforce. One of the quirks of GitHub is that closed issues are not visible by default; this makes sense when tracking code bugs but does not help newcomers. Ideally all the metadata for every assay would be accessible. In an ideal world all current data would be in a sub-structure searchable database with a web interface, with business rules defined that control multiple results from the same assay (run on different batches, run in different labs etc.) and with the ability to export to desktop for analysis. |
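As an illustration of what a "business rule" for repeated assay results might look like, here is a minimal sketch. Everything in it is an assumption made for the example: the record fields (`compound`, `assay`, `batch`, `lab`, `ic50_um`), the identifiers, and the rule itself (report the geometric mean and the number of measurements per compound and assay). The project may well want a different rule.

```python
from collections import defaultdict
from statistics import geometric_mean


def summarise(results):
    """Collapse repeated measurements into one summary per (compound, assay).

    Assumed rule, for illustration only: take the geometric mean of all
    IC50 values, regardless of which batch or lab produced them.
    """
    grouped = defaultdict(list)
    for record in results:
        grouped[(record["compound"], record["assay"])].append(record["ic50_um"])
    return {
        key: {"n": len(values), "geomean_ic50_um": geometric_mean(values)}
        for key, values in grouped.items()
    }


# Hypothetical records: the same compound measured on two batches in two labs.
example_results = [
    {"compound": "OSM-000123", "assay": "potency", "batch": 1, "lab": "LabA", "ic50_um": 0.5},
    {"compound": "OSM-000123", "assay": "potency", "batch": 2, "lab": "LabB", "ic50_um": 2.0},
    {"compound": "OSM-000456", "assay": "potency", "batch": 1, "lab": "LabA", "ic50_um": 12.0},
]

summary = summarise(example_results)
```

Keeping the raw records and computing the summary on demand means the rule can be changed later without losing any individual measurement.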
This is super useful feedback, thanks for taking the time. A few quick observations around why I think this platform currently beats many potential others:
Obviously there are things that need to be better, and many of those are in comments above. The filestore works well for anyone running the desktop client, but for everyone else I imagine it's sub-optimal. Something more like Dropbox would, I suspect, be useful. That Issues are linear discussions, without threads (branching out), can sometimes make things difficult to follow. In some cases the difficulty of following things might be an inevitable result of things being busy and there being nobody available to collate the discussion into action items and close the relevant issue. No platform will solve that - it's a different problem. Your point @MedChemProf about the most up to date datasheets is I think something more about how we maintain lab books and other components, rather than Github? From our point of view a platform that understands chemistry would obviously be amazing. I mention this in talks every now and again, that we need a way of referring to a molecule like an issue number - e.g. "IDENTIFIER-X" is linked to a row in the Master List, for example, so that the page understands which molecule you're referring to in the way that it can currently understand if you're referring to another issue. A big thing for me is to make the "suggest change-review change-accept change" cycle easier to understand. This function is ultimately fundamental to how we work as scientists. We look at things, we suggest changes and people accept or reject those changes. This feature is at the core of Github. Yet we're not using it properly. Over at Google Docs etc we all use such a thing. We read and suggest/comment and those suggestions are accepted or not. We need this functionality to be easier across Github issues, files and wikis. |
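To make the "refer to a molecule like an issue number" idea concrete, here is a rough sketch of how a page could turn molecule identifiers in a comment into links to a Master List row. The identifier format (`OSM-` plus six digits), the lookup table and the URL are all hypothetical placeholders, not anything GitHub or the OSM site currently provides.

```python
import re

# Hypothetical identifier shape: "OSM-" followed by exactly six digits.
MOLECULE_ID = re.compile(r"\bOSM-\d{6}\b")

# Hypothetical Master List lookup: identifier -> row number in the sheet.
MASTER_LIST = {"OSM-000123": 124, "OSM-000456": 457}

# Placeholder URL, not a real endpoint.
BASE_URL = "https://example.org/master-list"


def link_molecules(text, master_list=MASTER_LIST, base_url=BASE_URL):
    """Replace known molecule identifiers with Markdown links to their row."""

    def to_link(match):
        identifier = match.group(0)
        row = master_list.get(identifier)
        if row is None:
            # Unknown identifiers are left untouched rather than guessed at.
            return identifier
        return f"[{identifier}]({base_url}#row={row})"

    return MOLECULE_ID.sub(to_link, text)


linked = link_molecules("Compare OSM-000123 with OSM-999999, see #580")
```

The same lookup could be run in reverse (find every comment that mentions a given identifier), which is what would let the Master List show all discussion about a molecule.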
@FinnWoelm I'm just going to jump in here quickly to make a suggestion for a simple change to github. If it were possible to upload and share sdf files (.sdf) on Github, this would, I think, go a long way to making it a more attractive tool for open science work. Unfortunately it is currently not a recognised file format and so is not permitted. |
@MedChemProf, @holeung, @drc007, @mattodd, @bendndi: thank you all for sharing your thoughts! Apologies for the delayed reply. I quickly realized that this would become a challenging conversation for me, given my lack of a bio/chem background. 😅 This weekend, I finally got a chance to set aside some time and really dive in. Thanks for your patience.
🔍 Search
Can you share a little more about that? Is the issue that GitHub does not search for similar/related words? For example: If I search for Do you maybe have a specific example of the last time this happened?
Yup, rich text indexing is definitely a challenge with GitHub. The chemical structure files you are referring to, are those the
👩🎓 Students
How do you usually solve this issue with your students? Do you walk them through the core features of GitHub? Do you have a video/text/interactive tutorial you send them? How do they respond to that: Do you lose their interest in the process or are they eager to contribute and participate once they "get it"?
☁️ Open Web Database
Probably outing myself as a complete newbie here 🙈, but isn't OSM's Molecule Database sort of like that?
🔬 Molecule References
What's the benefit of having "the page [understand] which molecule you're referring to"? So that you can easily find all conversations related to a molecule while browsing the master sheet? It's like doing a search for
⬆️ Sharing SDF files
Are you referring to attaching SDF files to a discussion (such as this)? Or is it about uploading it to a GitHub repository? Because it's definitely possible to upload it to a GitHub repository! In fact, I found this OSM repo that was using SDF to track compounds about 6 years ago. And I see that it's hosted remotely now as of Series 4. Looks like it was decided that Google sheets would replace SDF as the primary data source?
📂 File Organization/Structure
What would make the most current set of data easier to find? What would make it clear whether it's current? Prominently displaying a link to it somewhere on the website or in the GitHub readme?
This does indeed go somewhat away from the Q about GitHub, but I'm very curious. As far as I can tell, these are the main collaboration/communication tools in use:
Is this accurate? What led to the adoption of Google Sheets (over, for example, a CSV file in GitHub)? Doesn't this a) make searching more difficult because data is in multiple platforms, b) make contributing more difficult because the Google sheet defaults to view-only permission for non-contributors, and c) make tracking changes more difficult because there's no "git log"?
🚥 Workflow
Can you tell me more about that? As far as I can tell, OSM is not using GitHub's pull request workflow (series 3 had no PRs and series 4 had four PRs). Instead, users are added as collaborators to the repository and the Google sheet to receive direct edit access, yes? Is that because PRs are too complex or because PRs are not useful?
💬 Threaded Discussions
As I'm writing this, I could not agree more 😀 |
@FinnWoelm Chemical structure files can take many different forms. cdx is a proprietary binary data format that can contain chemical structure information, but also display, text, tables, arrows, boxes etc. cdxml is the XML version. There are many, many other chemistry file formats; OpenBabel (http://openbabel.org/wiki/Main_Page) has details of over 100 file formats. Most contain chemical structure and metadata in a fully defined format. However, whilst most are text based, viewing the contents in an understandable format usually requires a special reader. |
@FinnWoelm Google sheets versus sdf as repository. The OSM's Molecule Database is another example of pulling data from the Google sheet. It does not allow users to enter data into the database directly. Whilst it allows substructure searching you can't currently combine structure and alphanumeric searches. I'm also not sure how easy it would be to implement a relational database since it currently imports from effectively a flat file. |
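For what it's worth, here is a small sketch of how a flat export could be split into relational tables, using only Python's built-in `csv` and `sqlite3` modules. The column names, identifiers and values are invented for the example and are not taken from the actual Google sheet. It covers only the alphanumeric side of a query; substructure matching would still need a cheminformatics toolkit on top.

```python
import csv
import io
import sqlite3

# Hypothetical flat export: one row per measurement, compound details repeated.
FLAT = """compound_id,smiles,assay,lab,ic50_um
OSM-000123,CCO,potency,LabA,0.5
OSM-000123,CCO,potency,LabB,0.7
OSM-000456,c1ccccc1,potency,LabA,12.0
"""


def load(flat_text):
    """Split the flat rows into a compounds table and a results table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE compounds (
            compound_id TEXT PRIMARY KEY,
            smiles      TEXT NOT NULL
        );
        CREATE TABLE results (
            compound_id TEXT NOT NULL REFERENCES compounds(compound_id),
            assay       TEXT NOT NULL,
            lab         TEXT NOT NULL,
            ic50_um     REAL NOT NULL
        );
        """
    )
    for row in csv.DictReader(io.StringIO(flat_text)):
        # Each compound is stored once, however many results it has.
        conn.execute(
            "INSERT OR IGNORE INTO compounds VALUES (?, ?)",
            (row["compound_id"], row["smiles"]),
        )
        conn.execute(
            "INSERT INTO results VALUES (?, ?, ?, ?)",
            (row["compound_id"], row["assay"], row["lab"], float(row["ic50_um"])),
        )
    conn.commit()
    return conn


def potent_compounds(conn, assay, max_ic50_um):
    """Return compounds with at least one result at or below the cutoff."""
    rows = conn.execute(
        """
        SELECT DISTINCT c.compound_id
        FROM compounds AS c
        JOIN results AS r ON r.compound_id = c.compound_id
        WHERE r.assay = ? AND r.ic50_um <= ?
        ORDER BY c.compound_id
        """,
        (assay, max_ic50_um),
    ).fetchall()
    return [row[0] for row in rows]


conn = load(FLAT)
hits = potent_compounds(conn, "potency", 1.0)
```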
Having seen demos of CDD and Stardrop in Copenhagen this week, I suggest the OSM project team should get quotes for both. No idea how finances are/will be run from UCL and/or USYD for 2019, so maybe an explainer would be useful. Update: n.b. I checked with them about exporting data, which they said should be no problem (i.e. closed > open). If there is any interest (anyone can hit the free trial), in my capacity as SAB member I can discuss more with them about "opening up" options, since there are certainly other OSDD teams who would like to take advantage of the data organisation but also want to work in the open |
I guess one question might be should an Open Source project be funding closed source software? |
@FinnWoelm Another nice-to-have is an easy way to display proteins/structures etc. There are a number of JavaScript viewers, but they all require some experience with JavaScript coding. Examples would be 3Dmol.js http://3dmol.csb.pitt.edu/index.html or NGLviewer http://nglviewer.org |
There are alternatives
… On 31 Mar 2019, at 10:11, cdsouthan ***@***.***> wrote:
I guess one question might be should an Open Source project be funding closed source software?
but Microsoft and ChemDraw?
|
Try EzMol - http://www.sbg.bio.ic.ac.uk/ezmol/ a web-based, wizard-driven, molecular graphics display. It is designed for the occasional user who does not wish to invest the time to use programs such as PyMol and Chimera. Generated images can be downloaded for publication. It's a bit like EasyJet: basic but gets you from A to B. (See Reynolds, et al (2018). JMB) |
EzMol is built using 3Dmol.js. It is a nice implementation, but I don't think you can embed the resulting structure file within the OSM website, just a static image. You can download a .ezm file but then need to go to the EzMol website to view it. It would be better if it could export the JavaScript code used to generate the display. |
Something like this, another site I'm working |
Very nice summary of things we're using, Finn! Lots of discussion here, and perhaps the reason it's stalled is that there's too much to take in. @FinnWoelm is there something you'd like to tackle, specifically, and we could bud off to a new Issue to tackle something? Depends on your bandwidth. Three quick things:
(FYI have linked this discussion here) |
Use the InChIKey as the molecular identifier |
Strictly speaking, InChIKeys are not unique. They are also not particularly human friendly: ATALOFNDEOCMKK-OITMNORJSA-N is not particularly memorable, and it is easy to make errors whilst typing. Another useful feature of using something like OSM-000123 is that a table sort puts the molecules into chronological order. |
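A quick sketch of both points, assuming identifiers are handed out sequentially and zero-padded to a fixed width (the `make_id` helper and the width of six are illustrative choices, not an agreed scheme). The regular expression only checks the general shape of a standard InChIKey (14 letters, 10 letters, 1 letter); it cannot tell whether a key is correct or catch a mistyped letter.

```python
import re

# Shape check only: 14 uppercase letters, 10 uppercase letters, 1 uppercase letter.
INCHIKEY_SHAPE = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def looks_like_inchikey(value):
    """Return True if the string has the layout of a standard InChIKey."""
    return INCHIKEY_SHAPE.fullmatch(value) is not None


def make_id(number, width=6):
    """Hypothetical sequential identifier, zero-padded so text sort = numeric sort."""
    return f"OSM-{number:0{width}d}"


# Zero-padded identifiers sort in the order they were assigned ...
padded = sorted(make_id(n) for n in (123, 7, 1500))

# ... whereas unpadded ones do not under a plain text sort.
unpadded = sorted(["OSM-7", "OSM-123", "OSM-1500"])
```

With padding, `padded` comes out as `OSM-000007`, `OSM-000123`, `OSM-001500`; without it, `OSM-7` sorts after `OSM-1500`, which is the problem a fixed width avoids.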
I seem to remember the suggestion of OSM000123 (sans hyphen -puleeze...) came up ages ago. The trouble with adding anything at this stage (human-friendly or not) is it just makes the retro-mapping-spaghetti problem worse with each new identifier or synonym |
Which is why you have to decide at the start of a project |
Amen |
I think many of the issues could be solved with Discourse, for example threaded discussions. Also, I think some universities already use it to deal with students. If you are interested, I could help install a Discourse instance. Then we can start to figure out which parts of it need to be modified so it suits your needs.
I think the only true way to solve this problem is to have a database that is coupled with your labnotes and discussion board. I explained in this issue how we might achieve this: |
Hi OSM,
I've been having conversations with @mattodd about alternatives to Github for collaboration and communication. He asked me to expand the conversation to everyone, hence this issue.
My colleague and I spent the past 18 months building a web platform with GitHub's functionality but made for documents (in Drive/Dropbox or on your pc) and focused on simple UI/UX.
Unfortunately, we were blinded by enthusiasm when we started developing our platform and so we never asked the community whether what we are doing is actually helpful and useful.
The intention is and always has been to build a platform that would make open initiatives — such as OSM — more accessible to the general public. I would love to hear your thoughts and see if we can turn our work into something that's valuable to OSM.
I'm not saying that we're anywhere near ready to "compete" with GitHub on a feature level, but we would be very happy to build whatever serves you best!
A couple questions to get the conversation started:
Cheers,
Finn