Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make has-note configurable #601

Closed
bdarcus opened this issue May 22, 2022 · 7 comments · Fixed by #603 or #753
Closed

Make has-note configurable #601

bdarcus opened this issue May 22, 2022 · 7 comments · Fixed by #603 or #753
Labels
enhancement New feature or request
Milestone

Comments

@bdarcus
Copy link
Contributor

bdarcus commented May 22, 2022

... Seems one obvious needed change here would be something like a configurable citar-has-notes-function variable. One could then swap one that queried org-roam instead to create the candidate data.

Originally posted by @bdarcus in #518 (comment)


@roshanshariff - in the context of discussing #518, it occurred to me (and I think you may have mentioned this earlier) we'd want to do something like the below:

(defcustom citar-has-note-function (citar-has-note)
  "The function to return notes associated with a reference.

Function must take two arguments: KEY and ENTRY."
  :group 'citar
    :type '(choice
            (function-item :tag "Default, one-note-per-file" (citar-has-note))
            (function-item :tag "Org-Roam v2" citar-org-roam-has-note)
            (function :tag "Other")))

(defun citar-org-roam-has-note (key _entry)
  "Query org-roam for presence of bibliographic notes for KEY."
  (let ((ref-node (org-roam-node-from-ref (concat "@" key)))) 
    (list (org-roam-node-file ref-node))))

Does that seem right to you? Any suggestions?

BTW, with org-roam v2, you can have notes on a reference in different places; example, where first "toly2017" is a within-file note, and the second is a file:

image

... and here's what the first one looks like internally:

#s(org-roam-node :file "/home/bruce/org/roam/20200307105756-covid_19.org" :file-title "COVID-19" :file-hash nil :file-atime
                  (25226 5062 342907 390000)
                  :file-mtime
                  (25226 5062 337907 382000)
                  :id "aff1a98d-8b4c-4771-95ea-163090b248f4" :level 1 :point 1262 :todo nil :priority nil :scheduled nil :deadline nil :title "China Outbreak and Response" :properties
                  (("CATEGORY" . "20200307105756-covid_19")
                   ("ROAM_REFS" . "@toly2017")
                   ("ID" . "aff1a98d-8b4c-4771-95ea-163090b248f4")
                   ("BLOCKED" . "")
                   ("FILE" . "/home/bruce/org/roam/20200307105756-covid_19.org")
                   ("PRIORITY" . "B")
                   ("ITEM" . "China Outbreak and Response"))
                  :olp nil :tags nil :aliases nil :refs
                  ("toly2017"))
@bdarcus bdarcus added the enhancement New feature or request label May 22, 2022
@roshanshariff
Copy link
Collaborator

roshanshariff commented May 22, 2022

I think it's better to put the indirection one level deeper. So keep the citar-has-note function, but have a configurable list of backends that it consults instead of hard-coding its behaviour. Two backends we could already have: scanning the notes directory for an appropriately named file or looking in the bibtex "files" field for files with an appropriate extension. To this we could add another that queries the Org-Roam database for a note with the citation key as a ref.

As we've discussed before, these functions might need to do some expensive operation once before they're called for every reference in the bibliography. In the current implementation, the notes directory needs to be scanned. For Org-Roam notes, you need an SQLite query. So each backend shouldn't just be a predicate function as you've outlined above, but a function that when called does the expensive work once and returns a predicate function that can be called repeatedly.

I hope that makes sense?

EDIT: For clarity, I was thinking of something like this (untested):

(defvar citar-note-backends '(citar-file--note citar--note-field))

(defun citar-has-note ()
  ;; Call each function in `citar-note-backends` to get a list of predicates
  (let ((preds (mapcar #'funcall citar-note-backends)))
    ;; Return a predicate that checks if `citekey` and `entry` have a note
    (lambda (citekey entry)
      ;; Call each predicate with `citekey` and `entry`; return the first non-nil result
      (seq-some (lambda (pred) (funcall pred citekey entry)) preds))))

The elements of citar-note-backends would have a similar API to citar-has-note.

@bdarcus
Copy link
Contributor Author

bdarcus commented May 22, 2022

Thanks for the example; I was struggling to get the details :-)

In that, the citar-file--note would be for the current behavior, and the citar--note-field would be a possible org-roam one?

And practically speaking, would it not be the case that one would like only ever define one backend?

E.g. the default would be the first one, and a hypothetical citar-org-roam package would reset that to the second?

BTW, in org-roam, one can of course also do lower level queries like this, which I assume is (much?) faster than calling once per reference?

ELISP> (org-roam-db-query [:select * :from refs])
(("fd1e8dfa-8bfd-43fb-884a-b350a7a0850a" "agnew2003" "cite")
 ("c34e84b5-5218-42cf-a088-c159bcecdbc5" "couper2014" "cite")
 ("af4eb0c0-e786-4078-83af-f70a2dab8923" "berlet2000" "cite")
 ("8d57cb9f-f482-4058-93cd-7b1d616c23a5" "un_2017-protest" "cite")
 ("f3bf6c1a-3274-4192-8399-cc5065edb011" "wallace-wells2019" "cite")
 ("59ab1d4e-bf1f-4dce-a8ed-2df53fc0efd8" "Woodman_2017-bills" "cite")
 ("6ff760ca-f386-4a49-815b-bb49582560a9" "kohn2008" "cite")
 ("6661bba3-4b23-4934-aec4-389676bd5b31" "ahrens2017" "cite")
 ("d780d736-7d8f-4827-81ca-7ce7b51e05c6" "staeheli_conflicting_2002" "cite")
 ("f8788f64-31d0-449e-b72b-c9e4e1f0a9e0" "bunch_2020-trump-fox" "cite")
 ("b2bae16e-120b-4446-a671-ee54de7549de" "olorunnipa2020" "cite")
 ("a86f73cc-c782-45d4-bd49-009d1c7e051c" "solnit2010" "cite")
 ("09077e7d-3e01-4c3f-8c07-1ac539618f8c" "ward2006" "cite")
 ("b0a7badd-fda1-4780-8875-f26b8fd982b1" "nichanian2022" "cite")
 ("db9d22c9-4ec1-4729-a228-ad2bd86e8074" "klippenstein2020a" "cite")
 ("93f2c87d-716e-499c-b157-36f7d15e7334" "toly2017" "cite")
 ("48ccafdf-d016-4fe8-82e4-72cb4420b081" "staff2019" "cite"))

This had me wondering about possibly a citar cache for these data?

Aside: the org uuid field is of course how org-roam generalizes node identity.

@roshanshariff
Copy link
Collaborator

roshanshariff commented May 22, 2022

Yeah, exactly, you'd want to call that function once, store the results (perhaps in a hash table) then have a lambda that just queries the hash table 2000 times or whatever (when formatting the candidates, say).

I like this pattern more than caching things whenever a cache is not absolutely necessary.

There are only two hard things in Computer Science: cache invalidation and naming things.
-- Phil Karlton

In this case, it's totally okay to scan the directory or perform the SQLite query every time we need to (say) format the candidates; we just don't want to do it for every single candidate. I think adding a cache would just add more complexity, global state, invalidation logic, etc.

You're right that the current behaviour only uses a single back-end (the notes directory) but you could envision looking at the bibtex entries too, just as we do with attached files. It gives us a little extra flexibility with almost no extra complexity or overhead.

EDIT: I also think it's more user-friendly to allow multiple sources. Let's say somebody was using a directory-of-notes system, but they want to slowly transition to Org-Roam and use it for new notes. They might appreciate being able to gradually move over rather than having to do it all at once.

@bdarcus
Copy link
Contributor Author

bdarcus commented May 22, 2022

Yeah, exactly, you'd want to call that function once, store the results (perhaps in a hash table) then have a lambda that just queries the hash table 2000 times or whatever (when formatting the candidates, say).

So you mean no new global cache, but rather a local let variable on citar--format-candidate?

Example below would return a simple list of all keys with associated notes (easy enough to convert to a hash if that's valuable; I know it can be for performance):

(mapcar #'car (org-roam-db-query [:select ref :from refs]))

Assuming yes, then how to generalize that? Should citar-has-note or something similar return a list?

An issue is that citar-has-note now returns a list of file paths, while org-roam v2 is really oriented around keys and uuids, and uses those to look up filepaths and such.

So what's the best simple data model for that hash or list? Maybe a hash with key as citekey, and a list of one-or-more filepaths as value?

Sorry for all the questions; I'm a DIY programmer, and don't have a ton of experience with hashes :-)

I also think it's more user-friendly to allow multiple sources. Let's say somebody was using a directory-of-notes system, but they want to slowly transition to Org-Roam and use it for new notes. They might appreciate being able to gradually move over rather than having to do it all at once.

Makes sense. And as you say, there's no reason to limit flexibility if there's no real cost.

I was just trying to understand the technical details and implications.

@roshanshariff
Copy link
Collaborator

So you mean no new global cache, but rather a local let variable on citar--format-candidate?

Indeed, and citar--format-candidates already has this local let binding. It binds hasnotep to the result of calling citar-has-note. With the above code, hasnotep would internally have a reference to preds, which in turn contain whatever data are needed to figure out if a candidate has notes associated with it.

I'll have to think about it some more, but perhaps it's simplest for the open-note function to just be independent of citar-has-note? This actually seems to be the status quo (with citar-open-notes and citar-open-note-function). So to get the Org-Roam functionality, you'd need to override two places.

Of course, the open-note functions could reuse the code that's behind citar-has-note so the functionality wouldn't be duplicated. You could just have a list of note-opener functions (analogous to citar-open-note-function) that given a key and entry, opens an existing note or returns nil if there wasn't one. Then, if none of these functions returns true, you call a "new note" function. Something like this:

(defcustom citar-open-note-functions '(citar-file--open-note citar--open-note-field citar-org-roam-open-note))

(defcustom citar-create-note-function #'citar-file--create-note)

(defun citar-open-note (citekey entry)
  (or (seq-some (lambda (opener) (funcall opener citekey entry)) citar-open-note-functions)
      (funcall citar-create-note-function citekey entry)))

There's no reason to make this complicated for efficiency like citar-has-note, because you'll never be opening thousands of notes at the same time (and if you do, the efficiency of looking up each one is the least of your concerns). At the same time, there's no reason to complicate citar-has-note by making it anything but a predicate.

With this approach, you could use Org-Roam's own mechanisms for opening nodes or creating new nodes. In particular, Org-Roam nodes can be sub-headings inside files rather than files themselves, so it's best we not get in the business of opening the note ourselves based on just the filename.

@bdarcus
Copy link
Contributor Author

bdarcus commented May 22, 2022

I need to think through your example etc., but on this key point:

With this approach, you could use Org-Roam's own mechanisms for opening nodes or creating new nodes. In particular, Org-Roam nodes can be sub-headings inside files rather than files themselves, so it's best we not get in the business of opening the note ourselves based on just the filename.

Exactly; this is the whole point of the idea. As I started to think "what would it take to support multiple-notes-per-file?", I thought a) that shouldn't be our problem, and b) that's already solved elsewhere.

@bdarcus
Copy link
Contributor Author

bdarcus commented May 22, 2022

You could just have a list of note-opener functions (analogous to citar-open-note-function) that given a key and entry, opens an existing note or returns nil if there wasn't one.

On this piece, this goes back to my earlier question about the list of functions:

If this were default, would there be the potential for conflicts and confusion?

I, for example, have my citar directory as a subdirectory of my org-roam directory, which allows it to work with org-roam.

But if I wanted to simply always defer to org-roam for note opening and creation, then by definition I would only want that function; right? I guess that could be addressed by putting the org-roam function first in the list?

I am going to likely experiment with a very small citar-org-roam package, in part to test this, so just need to think through how that would work.

bdarcus added a commit that referenced this issue May 23, 2022
@bdarcus bdarcus changed the title Make has-note and has-file configurable Make has-note configurable May 23, 2022
@bdarcus bdarcus added this to the v1.0 milestone May 23, 2022
bdarcus added a commit that referenced this issue May 26, 2022
Refactor the notes API to enable more general functionality (see #518),
and tighter integration with note-taking packages like org-roam (see #602).

- Add citar-keys-with-notes-functions defcustom.
- Rename citar-open-notes-function to citar-open-notes-functions; make a list.
- Refactor citar-open-notes to make use of the above
- Add some convenience functions for working with these data.

Close #601
bdarcus added a commit that referenced this issue May 26, 2022
Refactor the notes API to enable more general functionality (see #518),
and tighter integration with note-taking packages like org-roam (see #602).

- Add citar-keys-with-notes-functions defcustom.
- Rename citar-open-notes-function to citar-open-notes-functions; make a list.
- Refactor citar-open-notes to make use of the above
- Add some convenience functions for working with these data.

Close #601
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
2 participants