C-J-Cundy / elfeed-score

Gnus-style scoring for elfeed

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

elfeed-score

https://melpa.org/packages/elfeed-score-badge.svg https://stable.melpa.org/packages/elfeed-score-badge.svg

Introduction

elfeed-score brings Gnus-style scoring to Elfeed.

Elfeed is an extensible web feed reader for Emacs. By default, it will display entries in reverse chronological order. This package defines a bit of metadata for each of your feed entries: a “score”. A score is an integer (negative or positive), and higher scores denote entries of greater interest to you. This package also (optionally) installs a new sort function, so that Elfeed will display entries with higher scores before entries with lower scores (entries with the same scores will still be sorted in reverse chronological order). It also provides an entry display function for the search buffer that displays each entry’s score, should you choose to install it.

While you can manually assign a score to an entry, you will likely find it more convenient to create rules for scoring that will be automatically applied to each new entry every time you update Elfeed. You can currently score against title, feed & content by defining strings that will be matched against those attributes by substring, regexp or whole-word match. You can also score against the presence or absence of tags. Rules (other than the last sort) can be scoped by Elfeed entry tags, so that a rule will only be applied if an entry has certain tags (or does not have certain tags). Each rule defines an integral value, and the rules are applied in order of definition. The new entry’s score begins at zero, and is adjusted by the value defined by each matching scoring rule.

For instance, here’s a subset of my scoring file at the moment:

;;; Elfeed score file                                     -*- lisp -*-
(("title"
  ("OPEN THREAD" -1000 S 1576681345.4086394)
  ("raymond c\\(hen\\)?" 250 r 1576808786.1558545) (t .(@dev)))
 ("content"
  ("type erasure" 500 s 1576808786.043517))
 ("title-or-content"
  ("california" 150 100 s 1576808786.4068587)
  ("china" 150 100 w 1576808786.1848788))
 ("feed"
  ("Essays in Idleness" 250 S t 1576808786.1956885)
  ("Irreal" 250 S t 1576808786.1765869)
  ("Julia Evans" 100 s t 1576808786.4092398)
  ("National Weather Service" 400 S t 1576808786.1117532)
  ("emacs-news – sacha chua" 350 S t 1576808785.3807983))
 ("tag"
  ((t . reddit-question)
   750 nil))
 (mark -2500))

Like Gnus scoring, this may look like Lisp code, but it is not directly eval’d. It will be read by the Lisp reader, so it must at least be a valid Lisp s-expression. You can find details below.

Prerequisites

This package was developed against Elfeed 3.3.0, which itself requires Emacs 24.3 and cURL.

Installing

The easiest way to install elfeed-score is MELPA; assuming you’ve got MELPA in your package-archives, just say:

(use-package elfeed-score
  :ensure t
  :config
  (progn
    (elfeed-score-enable)
    (define-key elfeed-search-mode-map "=" elfeed-score-map)))

You can, if you wish, install from source as well:

cd /tmp
curl -L --output=elfeed-score-0.4.4.tar.gz https://github.com/sp1ff/elfeed-score/archive/0.4.4.tar.gz
tar xvf elfeed-score-0.4.4.tar.gz && cd elfeed-score-0.4.4
./configure
make
make install

Running the Unit Tests

The unit tests require some macros defined by the Elfeed test suite, which is not distributed with the MELPA package. Therefore, you’ll need to clone the Elfeed git repo & develop against that:

cd /tmp
git clone https://github.com/skeeto/elfeed.git
curl -L --output=elfeed-score-0.4.4.tar.gz https://github.com/sp1ff/elfeed-score/archive/0.4.4.tar.gz
tar xvf elfeed-score-0.4.4.tar.gz && cd elfeed-score-0.4.4
export EMACSLOADPATH=/tmp/elfeed-score-0.4.4:/tmp/elfeed/tests:$EMACSLOADPATH
./configure
make
make check

Unless you already use EMACSLOADPATH, this likely won’t work as written– you’ll need to work out exactly how to tell Emacs to pickup Elfeed from your git repo instead of wherever you’ve got it installed. If you’re running the unit tests, I assume you can get this working.

Getting Started

Score File Format

The score file (~/.emacs.d/elfeed.score by default) is an Emacs Lisp form that evaluates to an association list. Comments are as per usual. The current format recognizes seven keys:

  • “title”: the value associated with this is a list of rules matching text against the entry title
  • “content”: the value associated with this is a list of rules matching text against the entry content
  • “title-or-content”: a list of rules matching against both entry title & content
  • “feed”: the value associated with this is a list of rules matching text against the entry feed
  • “authors”: the value associated with this is a list of rules matching text against the name of the entry’s author. If the entry has multiple authors the rules match against the string formed by concatenating the authors’ names together with “, ” as a delimiter.
  • “tag”: a list of rules matching against entry tags
  • “adjust-tags”: a list of rules to be applied after an entry is scored; they can add or remove tags based on the score being above or below given thresholds
  • mark: an integer which, if greater than an entry’s final score, will result in the entry being marked as read (in other words, if your rules have lowered an entry’s scores below this level, don’t even display it in the Elfeed search buffer)

Title & content rules are defined by a list of length five:

  1. the match text
  2. the match value: this is an integer specifying the amount by which the entry’s score should be adjusted, should the text match
  3. the match type: this may be one of s, S, r, R, w or W for substring match, case-sensitive substring match, regexp match or case-sensitive regexp match, and case-insensitive or case-sensitive whole word match, respectively. Whole word matching just feeds the match text to word-search-regexp before doing a regexp search.
  4. the last time this rule matched an entry, in seconds since Unix epoch. This element is optional, need not be supplied by the score file author, and will be automatically kept up-to-date by the package.
  5. tag scoping rules (on which more below); this is optional, and won’t be written out when the scoring rules are serialized unless it is non-nil.

So, when first setting up your score file, saying:

;;; Elfeed score file                                     -*- lisp -*-
(("title"
  ("OPEN THREAD" -1000 S))
 ("content"
  ("california" 100 s nil (t . (@daily @politics)))))

means that you want all entries whose title contains the text “OPEN THREAD” to have its score decreased by 1000, and whose content contains the text “california” to have its score increased by 100, but only if the entry has at lease one of the tags @daily or @politics. The former match will be case-sensitive, the latter case-insensitive.

I’ve found myself defining duplicate rules for both title & content, albeit with different values (presuming that a match against title would be more significant than a match against content). To eliminate this, I added a “title-or-content” rule type that mimics the formats above, but permits for different values to be added to the score depending on whether the match is found against the title or the content. For instance

;;; Elfeed score file                                     -*- lisp -*-
(("title-or-content"
  ("california" 150 100 s nil (t . (@daily @politics)))))

defines a rule that will perform a substring match for “california” against both the entry title and content. A match found in the title adds 150 to the score, and a match found in the content adds 100. The rule will only be applied, however, to entries whose tags contain @daily, @politics or both.

Scoring against the entry’s feed is done similarly, but may be done against either the feed title or the feed URL. This is indicated by adding a new element at index 3 which may be one of t or u (for title or URL, respectively).

Scoring against the entry’s tags is similar, but is done with a three tuple:

  1. the tags whose presense or absence will trigger the rule. This is specified as a cons cell (switch . tags) where switch is either t or nil and tags is either a tag or a list of tags. If switch is t, the rule will apply to any entry tagged with one or more of the tags listed in tags. Conversely, if switch is nil, the rule will match entries who have none of the tags in tags.
  2. the value by which the entry’s score shall be adjusted if this rule matches
  3. the last time this rule matched an entry, in seconds since Unix epoch. This element is optional, need not be supplied by the score file author, and will be automatically kept up-to-date by the package.

So, for example, the following rules:

;;; Elfeed score file                                    -*- lisp -*-
(("title"
  ...)
 ("tag"
  ((t . (a b)) 100)
  ((nil . (x y z) -100)))
 ...

will add 100 to the score of any entry tagged with either =’a= or =’b=, and subtract 100 from from entries that are not tagged with at least one of =’x=, =’y=, or =’z=.

If you’ve decided that an entry’s score is low enough, you may not even want to see it. In that case, add a rule like:

(mark N)

when the entry’s final score is below N, the package will remove the unread tag from the entry, marking it as “read”.

When I began this project, the only example of scoring for Elfeed I could find was this article (“Scoring Elfeed Articles”). The author (John Kitchin) computes a score and adds one or two tags to entries whose score is sufficiently high. It’s always bothered me that elfeed-score couldn’t do that, so in build 0.4.3, I added one more type of rule: “adjust-tags”. These are applied after the scoring process, and will add or remove tags based on whether the entry’s score is above or below a given threshold.

Adjust-tags rules are given by a three-tuple:

  1. the threshold at which the rule shall apply; this is defined by a cons cell (switch . threshold). switch may be t or nil and threshold is the threshold against which each entry’s score shall be compared. If switch is t, the rule applies if the score is greater than or equal to threshold; if switch is nil the rule applies if score is less than or equal to threshold.
  2. the tags to be added or removed; also defined by a cons cell (switch . tags). If switch is t & the rule applies, tags (either a single tag or a list of tags) will be added to the entry; if switch is nil, they will be removed
  3. the last time this rule matched an entry, in seconds since Unix epoch. This element is optional, need not be supplied by the score file author, and will be automatically kept up-to-date by the package.

For example, the following rules:

;;; Elfeed score file                                    -*- lisp -*-
(("title"
  ...)
 ("adjust-tags"
  ((t . 1000) (t . a))
  ((nil . -1000) (nil . b)))
 ...

will add the tag =’a= to all entries whose score is 1000 or more, and remove tag =’b= from all entries whose score is -1000 or less.

Tag Scoping Rules

Limiting rules by entry tags involves the use of a cons cell of the form:

(BOOLEAN . (TAG...))

The car is a boolean, and the cdr is a list of tags. If the former is t, the rule will only be applied if the entry has at least one of the tags listed. If the boolean value is nil, the rule will only apply if the entry has none of the tags listed.

This only applies to title, content, title-or-content & feed-based rules.

Using elfeed-score

Once your score file is setup, load elfeed-score.

(require 'elfeed-score)

Just loading the library will not modify Elfeed; you need to explicitly enable the package for that:

(elfeed-score-enable)

This will install the new sort function & new entry hook, as well as read your score file. NB. elfeed-score-enable is autoloaded, so if you’ve installed this package in the usual way, you should be able to just invoke the function & have the package loaded & enabled automatically.

Some elfeed users have already customized elfeed-search-sort-function and may not wish to have efleed-score install a new one. elfeed-score-enable takes a prefix argument: if present, it will install the new entry hook & commence scoring, but will not install the new sort function. Such users may refer to elfeed-score-sort if they would like to incorporate scoring into their sort functions.

The package defines a keymap, but does not bind it to any key. I like to set it to the === key:

(define-key elfeed-search-mode-map "=" elfeed-score-map)

At this point, any new entries will be scored automatically, but the entries already in your database have not yet been scored. Scoring is idempotent (scoring an entry more than once will always result in it having the same score assigned). This means you can load up an Elfeed search, and then, in the Elfeed search buffer (*elfeed-search*), score all results with “= v” (elfeed-score-score-search). When the command completes, the view will be re-sorted by score. Your score file will also have been updated on disk (to record the last time that each rule matched).

You can optionally arrange to have the scores displayed in the search buffer:

(setq elfeed-search-print-entry-function #'elfeed-score-print-entry)

This is not turned on by elfeed-score-enable; you will need to set this manually. However, elfeed-score-unload will remove it, if it’s there.

Finally, you can configure score logging by setting the variable elfeed-score-log-level. By default it will be =’warn= (which will produce very little output). Other possible settings are =’debug=, =’info=, and =’error=. To trouble-shoot a balky rule, type (setq elfeed-score-log-level 'debug), re-score your current view, and switch to buffer *elfeed-score*.

Status and Roadmap

I’m using elfeed-score day in & day out for my RSS reading, but this is a preliminary release (the version number, 0.4, was chosen to suggest this).

Things I may want to do in the future:

  • add some kind of feature to age out rules that haven’t matched in a long time
  • capture which entries are actually opened & which ones are manually marked read without bothering to read them; see if I can “learn” from that information (something like Gnus Adaptive Scoring)

I wrote a post on how elfeed-score works, along with the process of submitting code to MELPA, here.

Bugs, comments, issues, feature requests &c welcome at sp1ff@pobox.com.

About

Gnus-style scoring for elfeed

License:GNU General Public License v3.0


Languages

Language:Emacs Lisp 98.3%Language:Shell 1.1%Language:Makefile 0.4%Language:M4 0.3%