A micro search library leveraging Boolean Operators, supporting wildcard annotation within search terms.
Revision: Jan 6, 2022.
Npm package: @spaceavocado/librarian.
npm install @spaceavocado/librarian
yarn add @spaceavocado/librarian
The library is being build as CommonJS module and ESM.
The library could be tested in the online sandbox https://librarian-sandbox.web.app/.
An evaluable search expression could be created directly from the evaluable core expression functions or parsed from the raw expression input.
type Evaluable = {
// Convert the expression to a literal expression string
toString: (format?: (...arg: Serializable[]) => string) => string
// Test if there are any matches within the given context
test: (context: string) => boolean
// Execute the expression within the given context and get fond matches, if any.
execute: (context: string, onEvaluation?: Evaluation) => EvaluationResult
}
EvaluationResult: Match[]
= Matches found within the context.EvaluationResult: true
= Evaluated as matching within the context, without any direct match records (NOT operand).EvaluationResult: false
= Not matches found within the context.
import { term, and, or, not, nor, xor, parse } from '@spaceavocado/librarian'
// term(), and(), or(), nor(), xor(), not(), parse() return an "Evaluable"
import { parse } from '@spaceavocado/librarian'
// Parse raw expression into a search function
const search = parse('"cent??" AND ("new york" OR "berlin")').execute
// Perform a search
const result1 = search("Christie visited the New York's city center last week.")
// The result1 contains a collection of librarian.Match results
;[
{ match: 'center', term: 'cent??', index: 37, length: 6 },
{ match: 'New York', term: 'new york', index: 21, length: 8 },
]
// Perform a search
const result2 = search(
"Christie visited the New York's city outskirt last week."
)
// The result2 is false, as there are not found matches
false
import { term, and, or, not } from '@spaceavocado/librarian'
// Create the search function from the search expression handlers
const search = and(term('cent??'), or(term('new york'), term('berlin'))).execute
// Perform a search
const result1 = search("Christie visited the New York's city center last week.")
// The result1 contains a collection of librarian.Match results
;[
{ match: 'center', term: 'cent??', index: 37, length: 6 },
{ match: 'New York', term: 'new york', index: 21, length: 8 },
]
// Perform a search
const result2 = search(
"Christie visited the New York's city outskirt last week."
)
// The result2 is false, as there are not found matches
false
import { parse, term, and, or } from '@spaceavocado/librarian'
const expression1 = and(
term('cent??'),
or(term('new york'), term('berlin'))
).toString()
const expression2 = parse('"cent??" AND ("new york" OR "berlin")').toString()
// The literal form of the expression1
;('("cent?? AND ("new york" OR "berlin")")')
// The literal form of the expression2
;('("cent?? AND ("new york" OR "berlin")")')
Note: toString
method could be provided with a custom formatting function.
Requires all terms to be found within the search context.
Search Context:
The NASA Juno mission, which began orbiting Jupiter in July 2016, just recently made its 38th close flyby of the gas giant. The mission was extended earlier this year, adding on a flyby of Jupiter's moon Ganymede in June.
Search expression 1: "nasa" AND "mission" AND "ganymede"
- Result: Match, all terms are present in the search context.
Search expression 2: "nasa" AND "august"
- Result: No Match, "nasa" is present within the search context, but "august" is not.
import { term, and } from '@spaceavocado/librarian'
const search = and(term('new york'), term('berlin')).execute
const search = parse('"new york" AND "berlin"').execute
const result = search("Christie visited the New York's city center last week.")
Requires at least one term to be found within the search context.
Search Context:
The NASA Juno mission, which began orbiting Jupiter in July 2016, just recently made its 38th close flyby of the gas giant. The mission was extended earlier this year, adding on a flyby of Jupiter's moon Ganymede in June.
Search expression 1: "nasa" OR "mission" OR "ganymede"
- Result: Match, all terms are present in the search context.
Search expression 2: "nasa" AND "august"
- Result: Match, "nasa" is present within the search context, "august" is not, but at least one term is present.
Search expression 3: "spacex" AND "august"
- Result: No Match, "spacex" nor "august" is present within the search context.
import { term, or } from '@spaceavocado/librarian'
// Should the OR expression perform exhaustive matching?
// - Exhaustive matching is useful to highlight all matches
// within the search context.
// - Not-exhaustive matching, i.e. short-circuiting is better
// from optimization perspective. (default option for OR).
const exhaustive = false
const parseOptions = {
exhaustiveOr: false,
}
const search = or(exhaustive)(term('new york'), term('berlin')).execute
const search = parse('"new york" OR "berlin"', parseOptions).execute
const result = search("Christie visited the New York's city center last week.")
Requires no terms to be found within the search context.
Search Context:
The NASA Juno mission, which began orbiting Jupiter in July 2016, just recently made its 38th close flyby of the gas giant. The mission was extended earlier this year, adding on a flyby of Jupiter's moon Ganymede in June.
Search expression 1: "nasa" NOR "mission" NOR "ganymede"
- Result: NO Match, "nasa" is present within the search context.
Search expression 2: "spacex" NOR "year"
- Result: NO Match, "year" is present within the search context.
Search expression 3: "spacex" NOR "august"
- Result: Match, "spacex" nor "august" is present within the search context.
import { term, nor } from '@spaceavocado/librarian'
const search = nor(term('new york'), term('berlin')).execute
const search = parse('"new york" NOR "berlin"').execute
const result = search("Christie visited the New York's city center last week.")
Requires exactly one term to be found within the search context.
Search Context:
The NASA Juno mission, which began orbiting Jupiter in July 2016, just recently made its 38th close flyby of the gas giant. The mission was extended earlier this year, adding on a flyby of Jupiter's moon Ganymede in June.
Search expression 1: "nasa" XOR "mission" XOR "august"
- Result: NO Match, "nasa" and "mission" is present within the search context.
Search expression 2: "spacex" XOR "august"
- Result: NO Match, "spacex" nor "august" is present within the search context.
Search expression 3: "spacex" XOR "august" XOR "mission"
- Result: Match, only "mission" is present within the search context.
import { term, xor } from '@spaceavocado/librarian'
const search = xor(term('new york'), term('berlin')).execute
const search = parse('"new york" XOR "berlin"').execute
const result = search("Christie visited the New York's city center last week.")
Flips the outcome of AND, OR operators and/or result of the search term.
Search Context:
The NASA Juno mission, which began orbiting Jupiter in July 2016, just recently made its 38th close flyby of the gas giant. The mission was extended earlier this year, adding on a flyby of Jupiter's moon Ganymede in June.
Search expression 1: NOT "spacex"
- Result: Match, "spacex" is NOT present in the search context.
Search expression 2: NOT "nasa"
- Result: No Match, "nasa" is present in the search context.
Search expression 3: "nasa" AND NOT "spacex"
- Result: Match, "nasa" is present, and "spacex" is NOT present in the search context.
Search expression 3: "nasa" AND NOT ("spacex" OR "galactic")
- Result: Match, "nasa" is present, and "spacex" or "galactic" are NOT present in the search context.
import { term, not } from '@spaceavocado/librarian'
const search = not(term('new york')).execute
const search = parse('NOT "new york"').execute
const search = parse('"new york" AND NOT ("center" AND NOT "week")').execute
const result = search("Christie visited the New York's city center last week.")
AND
, OR
, NOR
, XOR
operators cannot be mixed within the same expression scope, they could be combined only within different scope of the search expression. See Search Expression Scoping/Nesting for more details.
Examples:
- VALID:
"animal" AND ("cat" OR "dog" OR "bird")
- INVALID:
"animal" AND "cat" OR "dog"
- VALID:
"animal" OR ("cat" AND "dog")
- INVALID:
"animal" OR "cat" AND "dog"
An asterisk (*) may be used to specify any number of characters. It is typically used at the end of a root word, when it is referred to as "truncation." This is great when you want to search for variable endings of a root word.
Note: (*) matches any word character (equivalent to [a-zA-Z0-9_]
)
Examples:
cent*
matches: center, centre*fix
matches: prefix, suffixb*r
matches: beer, bear
import { term } from '@spaceavocado/librarian'
const search = term('cent*').execute
const search = parse('"cent*"').execute
const result = search("Christie visited the New York's city center last week.")
A question mark (?) may be used to represent a single character, anywhere in the word. It is most useful when there are variable spellings for a word, and you want to search for all variants at once.For example, searching for colo?r would return both color and colour.
Note: (?) matches any word character (equivalent to [a-zA-Z0-9_]
)
Examples:
cent??
matches: center, centre but NOTcents???fix
matches: prefix, suffix but NOTaffixb??r
matches: beer, bear but NOTbor
import { term } from '@spaceavocado/librarian'
const search = term('cent??').execute
const search = parse('"cent??"').execute
const result = search("Christie visited the New York's city center last week.")
The parentheses, (
)
, are used to enclose the nested search expressions.
Example: "red" AND ("cat" OR "dog")
The root expression could be optionally enclosed within parentheses. E.g.: "cat" OR "dog"
= ("cat" OR "dog")
The search expressions could be nested without any limitation. E.g.: "animal" AND ("tiger" OR ("snail" AND "african"))
The search terms are not case sensitive. i.e. "cat"
finds cat
, Cat
and/or CAT
.
Probe captures provides information about the whole expression evaluation tree.
import { parse, term, and, or } from '@spaceavocado/librarian'
const expression = and(
term('cent??'),
or(term('new york'), term('berlin'))
)
const search = probe(expression).execute
// Perform a search with probe data
const [result1, probeData1] = search("Christie visited the New York's city center last week.")
// The result1 contains a collection of librarian.Match results
;[
{ match: 'center', term: 'cent??', index: 37, length: 6 },
{ match: 'New York', term: 'new york', index: 21, length: 8 },
]
// The probeData1 contains librarian.ProbeResult
;{
id: Symbol(and),
toString: [Function: toString],
execute: [Function: execute],
test: [Function: test],
result: [
{ term: 'cent??', match: 'center', index: 37, length: 6 },
{ term: 'new york', match: 'New York', index: 21, length: 8 }
],
descendants: [
{
id: Symbol(term),
toString: [Function: toString],
execute: [Function: execute],
test: [Function: test],
result: [
{ term: 'cent??', match: 'center', index: 37, length: 6 }
],
descendants: []
},
{
id: Symbol(or),
toString: [Function: toString],
execute: [Function: execute],
test: [Function: test],
result: [
{ term: 'new york', match: 'New York', index: 21, length: 8 }
],
descendants: [
{
id: Symbol(term),
toString: [Function: toString],
execute: [Function: execute],
test: [Function: test],
result: [
{ term: 'new york', match: 'New York', index: 21, length: 8 }
],
descendants: []
},
{
id: Symbol(term),
toString: [Function: toString],
execute: [Function: execute],
test: [Function: test],
descendants: []
}
],
}
]
}
result: Match[]
= Matches found within the context.result: true
= Matching within the context.result: false
= Not matches found within the context.result: undefined
= The evaluable has not been executed, i.e this expression branch has not been needed to be executed.
See contributing.md.
Librarian is released under the MIT license. See license.txt.