Skip to content

Commit

Permalink
many changes to try to make things faster - they might break stuff!
Browse files Browse the repository at this point in the history
  • Loading branch information
Francis St-Amour committed Feb 13, 2024
1 parent 24ff6ef commit 7e8a558
Showing 1 changed file with 61 additions and 45 deletions.
106 changes: 61 additions & 45 deletions obsidian.el
Original file line number Diff line number Diff line change
Expand Up @@ -82,7 +82,7 @@
If it is true, create in inbox, otherwise next to the current buffer."
:type 'boolean)

(defcustom obsidian-daily-notes-directory obsidian-inbox-directory
(defcustom obsidian-daily-notes-directory obsidian-inbox-directory

This comment has been minimized.

Copy link
@fstamour

fstamour Feb 13, 2024

Owner

just whitespace

"Subdir to create daily notes with `obsidian-daily-note'. Default: the inbox directory"
:type 'directory)

Expand Down Expand Up @@ -153,9 +153,11 @@ When run interactively asks user to specify the path."
;; Copied from org-roam's org-roam-descendant-of-p
(defun obsidian-descendant-of-p (a b)
"Return t if A is descendant of B."
(unless (equal (file-truename a) (file-truename b))
(string-prefix-p (replace-regexp-in-string "^\\([A-Za-z]\\):" #'downcase (expand-file-name b) t t)
(replace-regexp-in-string "^\\([A-Za-z]\\):" #'downcase (expand-file-name a) t t))))
(string-search ".." (file-relative-name a b))

This comment has been minimized.

Copy link
@fstamour

fstamour Feb 13, 2024

Owner

This seems to be much faster, but I don't know if there's some weird corner cases that I don't know about.

;; (unless (equal (file-truename a) (file-truename b)) (string-prefix-p (replace-regexp-in-string "^\\([A-Za-z]\\):" #'downcase (expand-file-name b) t t) (replace-regexp-in-string "^\\([A-Za-z]\\):" #'downcase (expand-file-name a) t t)))
)



(defun obsidian-not-trash-p (file)
"Return t if FILE is not in .trash of Obsidian."
Expand Down Expand Up @@ -185,15 +187,19 @@ FILE is an Org-roam file if:
- Is not a dot file or, if `obsidian-include-hidden-files' is t, then:
- It is not in .trash
- It is not an Emacs temp file"
(-when-let* ((path (or file (-> (buffer-base-buffer) buffer-file-name)))
(relative-path (file-relative-name path obsidian-directory))
(ext (file-name-extension relative-path))
(md-p (string= ext "md"))
(obsidian-dir-p (obsidian-descendant-of-p path obsidian-directory))
(-when-let* ((path (or file (buffer-file-name (buffer-base-buffer))))

This comment has been minimized.

Copy link
@fstamour

fstamour Feb 13, 2024

Owner

-when-let* was much faster than when-let

(_ (s-ends-with-p ".md" path))

This comment has been minimized.

Copy link
@fstamour

fstamour Feb 13, 2024

Owner

Filtering out paths that doesn't end with ".md" as soon as possible was pretty good, file-relative-pathname was taking a good percentage of run time of this function.

;; (not-node-modules-p (not (string-match-p (rx (or "node_modules" ".git")) path)))

This comment has been minimized.

Copy link
@fstamour

fstamour Feb 13, 2024

Owner

The project I was testing with, there was a node_modules directory as well as a .git directory at the root of the project, filtering out those paths ASAP gave a huge gain.

It's currently commented out because in my last tests I was listing files with projectile which already excludes files that are ignored by git.

➡️ Maybe filtering out dot-files sooner could help?

;; (not-ignored-p (not (magit-file-ignored-p path)))

This comment has been minimized.

Copy link
@fstamour

fstamour Feb 13, 2024

Owner

Using magit-file-ignored-p sometimes made things faster sometimes not...

I should try putting this check later in the list.

;; (relative-path (file-relative-name path obsidian-directory))
;; (ext (file-name-extension relative-path))
;; (md-p (string= ext "md"))
;; (obsidian-dir-p (obsidian-descendant-of-p path obsidian-directory))

This comment has been minimized.

Copy link
@fstamour

fstamour Feb 13, 2024

Owner

obsidian-descendant-of-p was slow, even with the changes that I had made to make it faster.

I think this is redundant in general, because we're normally listing file under obsidian-directory in the first place.

(not-dot-file (or obsidian-include-hidden-files (not (obsidian-dot-file-p path))))
(not-trash-p (obsidian-not-trash-p path))
(not-dot-obsidian (obsidian-not-dot-obsidian-p path))
(not-temp-p (not (s-contains-p "~" relative-path))))
;; (not-temp-p (not (s-contains-p "~" relative-path)))

This comment has been minimized.

Copy link
@fstamour

fstamour Feb 13, 2024

Owner

This is currently commented out because in my last tests I was listing files with projectile which already excludes files that are ignored by git.

)
t))

(defun obsidian--file-relative-name (f)
Expand All @@ -219,9 +225,20 @@ FILE is an Org-roam file if:

(defun obsidian-reset-cache ()
"Clear and reset obsidian cache."
;; (setq obsidian-files-cache (->> (directory-files-recursively obsidian-directory "\.*$") (-filter #'obsidian-file-p)))
(fsta/comment
(setq obsidian-files-cache (directory-files-recursively

This comment has been minimized.

Copy link
@fstamour

fstamour Feb 13, 2024

Owner

This was MUCH faster and it's the easiest change (I think 😅).

Instead of building a list of all the file with directory-files-recursively and then creating yet another list with -filter, we just pass the predicate #'obsidian-file-p directly to directory-files-recursively, so the files that don't match the predicate are never put in a list in the first place.

This comment has been minimized.

Copy link
@jayemar

jayemar Jul 11, 2024

From the docs and playing around with directory-files-recusively, it seems the predicate operates against subdirectories, not the files inside of the subdirectories. This often means the files within subdirectories will be ignored.

This causes some of the tests to fail, or example obsidian-list-all-files will return a file count less than the expected.

;; dir
obsidian-directory
;; regexp
"\.*\\.md$"
;; include-directories
nil
;; predicate
#'obsidian-file-p)))
(setq obsidian-files-cache
(->> (directory-files-recursively obsidian-directory "\.*$")
(-filter #'obsidian-file-p)))
(cl-delete-if-not #'obsidian-file-p
(projectile-dir-files-alien obsidian-directory)))

This comment has been minimized.

Copy link
@fstamour

fstamour Feb 13, 2024

Owner

You just can't beat projectile-dir-files-alien in terms of speed!

But it's not always applicable.

Also I used destructive filtering (cl-delete-if-not), but I didn't check if it was safe to do so, nor if it was faster than plain cl-remove-if-not.

(setq obsidian-cache-timestamp (float-time)))

(defun obsidian-list-all-files ()
Expand All @@ -233,6 +250,8 @@ Obsidian notes files:
(obsidian-reset-cache))
obsidian-files-cache)



(defun obsidian-clear-cache ()
"Clears the obsidian.el cache.
Expand All @@ -243,8 +262,8 @@ If you need to run this manually, please report this as an issue on Github."

(defun obsidian-list-all-directories ()
"Lists all Obsidian sub folders."
(->> (directory-files-recursively obsidian-directory "" t)
(-filter #'obsidian-user-directory-p)))
;; (->> (directory-files-recursively obsidian-directory "" t) (-filter #'obsidian-user-directory-p))
(directory-files-recursively obsidian-directory "" t #'obsidian-user-directory-p))

This comment has been minimized.

Copy link
@fstamour

fstamour Feb 13, 2024

Owner

Same "easy" trick here: pass the predicate directly to directory-files-recursively to avoid building a big list only to filter it right away.

This comment has been minimized.

Copy link
@jayemar

jayemar Jul 12, 2024

I don't this function and predicate are working the way we need. The predicate determines which directories will be descended into, but once they're found they'll still be returned even if the predicate returns false. So the /.obsidian and /.trash directories will still be returned, for example, but they ust won't be descended into to search for additional directories.

This comment has been minimized.

Copy link
@jayemar

jayemar Jul 12, 2024

I just added a test in PR licht1stein#96 to help check for this.


(defun obsidian-read-file-or-buffer (&optional file)
"Return string contents of a file or current buffer.
Expand Down Expand Up @@ -288,36 +307,33 @@ Return nil if the front matter does not exist, or incorrectly delineated by

(defun obsidian--file-front-matter (file)
"Check if FILE has front matter and returned parsed to hash-table if it does."
(let* ((starts-with-dashes-p (with-temp-buffer
(insert-file-contents file nil 0 3)
(string= (buffer-string) "---"))))
(if starts-with-dashes-p
(let* ((front-matter-s (with-temp-buffer
(insert-file-contents file)
(obsidian-get-yaml-front-matter))))
(if front-matter-s
(yaml-parse-string front-matter-s))))))
(with-temp-buffer

This comment has been minimized.

Copy link
@fstamour

fstamour Feb 13, 2024

Owner

This is the most counter-intuitive sh..

with-temp-buffer is so slow (comparatively to the rest) that reading the whole file all the time was faster that using with-temp-buffer twice.

Which makes me think there could be a very big performance gain doing both "find all tags" and "find all aliases" in one go, to reduce the number of with-temp-buffer again (it should also help with the OS' file cache).

This comment has been minimized.

Copy link
@jayemar

jayemar Jul 12, 2024

This is great, after making this change the tests are running in almost half the time!

(insert-file-contents file)
(when (string= "---" (buffer-substring-no-properties 1 (min (point-max) 3)))
(when-let ((front-matter-s (obsidian-get-yaml-front-matter)))
(yaml-parse-string front-matter-s)))))

(defun obsidian--update-from-front-matter (file)
"Takes FILE, parse front matter then update anything that needs to be updated.
At the moment updates only `obsidian--aliases-map' with found aliases."
(let* ((dict (obsidian--file-front-matter file)))
(if dict
(let* ((aliases (gethash 'aliases dict))
(alias (gethash 'alias dict))
(all-aliases (-filter #'identity (append aliases (list alias)))))
;; Update aliases
(-map (lambda (al) (if al (progn
(obsidian--add-alias (format "%s" al) file)))) all-aliases)))))
(when-let* ((dict (obsidian--file-front-matter file)))
(let* ((aliases (gethash 'aliases dict))
(alias (gethash 'alias dict))
(all-aliases (cl-copy-list (append aliases (list alias)))))

This comment has been minimized.

Copy link
@fstamour

fstamour Feb 13, 2024

Owner

I forgot to remove all-aliases, but it should not be necessary anymore.

Obviously, I didn't measure the impact of these changes yet.

I think that's what I left off before I decided to commit whatever I had at the moment,


Instead of creating a list that contains both alias and aliases and map over it. We add alias if it's not null and we map over aliases.

Should help reduce GC pressure.

;; Update aliases
(when alias
(obsidian--add-alias (format "%s" al) file))
(map nil (lambda (al) (obsidian--add-alias (format "%s" al) file)) aliases))))

(defun obsidian--update-all-from-front-matter ()
"Take all files in obsidian vault, parse front matter and update."
(dolist (f (obsidian-list-all-files))
(condition-case err
(obsidian--update-from-front-matter f)
(error (message "Error updating YAML front matter in file %s. Error: %s"
f (error-message-string err)))))
(fsta/measure-time

This comment has been minimized.

Copy link
@fstamour

fstamour Feb 13, 2024

Owner

Nothing to see here, I just added the "measure-time" for my tests.

(dolist (f (obsidian-list-all-files))
(condition-case err
(obsidian--update-from-front-matter (file-name-concat obsidian-directory f))

This comment has been minimized.

Copy link
@fstamour

fstamour Feb 13, 2024

Owner

Projectile returns paths relative to the root of the project.

I think it's a good idea to store the paths as relative paths... I think it would save some space enough to impact the speed, but don't take my word for it.

(error (message "Error updating YAML front matter in file %s. Error: %s"
f (error-message-string err))))))
(message "Obsidian aliases updated."))

(defun obsidian-tag-p (s)
Expand All @@ -329,21 +345,21 @@ At the moment updates only `obsidian--aliases-map' with found aliases."
"Return all tags in file or current buffer.
If FILE is not specified, use the current buffer"
(-> (obsidian-read-file-or-buffer file)
obsidian-find-tags
-distinct))
(obsidian-find-tags (obsidian-read-file-or-buffer file)) )

This comment has been minimized.

Copy link
@fstamour

fstamour Feb 13, 2024

Owner

Not deduplicating tags at this level anymore, to avoid the call to flatten in obsidian-list-all-tags.


(defun obsidian-list-all-tags ()
"Find all tags in all obsidian files."
(->> (obsidian-list-all-files)
(mapcar #'obsidian-find-tags-in-file)
-flatten
-distinct))
(let ((tags (make-hash-table :test 'equal)))

This comment has been minimized.

Copy link
@fstamour

fstamour Feb 13, 2024

Owner

Here again we speed things up by avoiding building intermediary lists.

No -flatten, no -distinct. For each file, we map over the list of tags and we add it to a hash-table. At the end we just "dump" the hash-table.

This comment has been minimized.

Copy link
@jayemar

jayemar Jul 12, 2024

I didn't actually notice any performance gains with this one, and it actually may have run a bit slower for me. Maybe the simplicity of the list outweighs the complexity of dealing with a hash table even with the additional calls to -flatten and -distinct?

(mapcar (lambda (file)
(mapcar (lambda (tag)
(puthash tag tag tags))
(obsidian-find-tags (obsidian-read-file-or-buffer file))))
(obsidian-list-all-files))
(hash-table-keys tags)))

(defun obsidian-update-tags-list ()
"Scans entire Obsidian vault and update all tags for completion."
(->> (obsidian-list-all-tags)
(setq obsidian--tags-list))
(setq obsidian--tags-list (obsidian-list-all-tags))

This comment has been minimized.

Copy link
@fstamour

fstamour Feb 13, 2024

Owner

This is just a style change, I find the use of the threading macro very confusing in this case (when it's only threaded into 1 form.

This comment has been minimized.

Copy link
@jayemar

jayemar Jul 11, 2024

I agree, I definitely find your style to be more clear.

(message "Obsidian tags updated"))

(define-minor-mode obsidian-mode
Expand Down

0 comments on commit 7e8a558

Please sign in to comment.