Match strings to patterns and extract variables even if the input text does not match the pattern exactly:
Pattern: "my name is ¿ and I am ¿ years old"
Input: "my name is John and I am 30 years old"
Score: 1.0
Variables: ["John", "30"]
Tokens: ["my name is ", " and I am ", " years old"]
Input: "My names John and I'm 30 years old."
Score: 0.8285714285714286
Variables: ["John", "30"]
Tokens: ["My names ", " and I'm ", " years old."]
In ambiguous cases, where the variables can be split in more than one way, all valid extraction results are returned:
Pattern: "What ¿ ¿s"
Input: "What the hell are lobsters"
Score: 1.0
Extraction 1:
Variables: ["the", "hell are lobster"]
Tokens: ["What ", " ", "s"]
Extraction 2:
Variables: ["the hell are", "lobster"]
Tokens: ["What ", " ", "s"]
Extraction 3:
Variables: ["the hell", "are lobster"]
Tokens: ["What ", " ", "s"]
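To make the ambiguity concrete, here is a minimal sketch of how every valid split could be enumerated when the literal tokens match exactly. It is an illustration only, not the library's implementation: the class `AmbiguousExtractionSketch` and the method `extractAll` are hypothetical names, the sketch does no fuzzy matching, and it assumes every variable holds at least one character.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not part of the library.
class AmbiguousExtractionSketch {

    // tokens are the literal pieces of the pattern between wildcards,
    // so tokens.size() == number of wildcards + 1.
    static List<List<String>> extractAll(List<String> tokens, String text) {
        List<List<String>> results = new ArrayList<>();
        if (text.startsWith(tokens.get(0))) {
            search(tokens, 1, text, tokens.get(0).length(), new ArrayList<>(), results);
        }
        return results;
    }

    private static void search(List<String> tokens, int next, String text, int pos,
                               List<String> vars, List<List<String>> results) {
        if (next == tokens.size()) {
            // Only accept the assignment if the whole text was consumed.
            if (pos == text.length()) {
                results.add(new ArrayList<>(vars));
            }
            return;
        }
        String token = tokens.get(next);
        // Start at pos + 1 so the variable before this token is non-empty (assumption).
        for (int at = pos + 1; at + token.length() <= text.length(); at++) {
            if (text.startsWith(token, at)) {
                vars.add(text.substring(pos, at));
                search(tokens, next + 1, text, at + token.length(), vars, results);
                vars.remove(vars.size() - 1); // backtrack and try the next occurrence
            }
        }
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("What ", " ", "s");
        for (List<String> vars : extractAll(tokens, "What the hell are lobsters")) {
            System.out.println(vars);
        }
    }
}
```

For the "What ¿ ¿s" example, this sketch finds the same three variable lists, though not necessarily in the order listed above.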
There are two methods available:
- stringCompare() - Determines how closely an input string matches a pattern and returns a score between 0 and 1, where 1 is an exact match
- stringEditDistance() - Determines how closely an input string matches a pattern and returns the number of edits that must be made to the input string for it to match the pattern
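The two return values appear to be related. In the second example above, the tokens differ from the pattern by 6 single-character edits and the input is 35 characters long, and 1 - 6/35 is approximately 0.8286, which agrees with the reported score. The sketch below reproduces that arithmetic with a standard Levenshtein distance. The formula is inferred from that one example and is not documented behavior, and `ScoreSketch`, `levenshtein`, `totalEdits`, and `score` are hypothetical names.

```java
// Hypothetical illustration, not part of the library.
class ScoreSketch {

    // Classic Levenshtein distance: the minimum number of single-character
    // insertions, deletions, and substitutions that turn a into b.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Sums the edits between each input token and the pattern token at the same index.
    // Assumes both arrays have the same length.
    static int totalEdits(String[] inputTokens, String[] patternTokens) {
        int edits = 0;
        for (int i = 0; i < inputTokens.length; i++) {
            edits += levenshtein(inputTokens[i], patternTokens[i]);
        }
        return edits;
    }

    // Assumed normalization: edits divided by the length of the input text.
    static double score(int edits, int inputLength) {
        return 1.0 - (double) edits / inputLength;
    }

    public static void main(String[] args) {
        String[] patternTokens = {"my name is ", " and I am ", " years old"};
        String[] inputTokens = {"My names ", " and I'm ", " years old."};
        String input = "My names John and I'm 30 years old.";

        int edits = totalEdits(inputTokens, patternTokens);
        System.out.println("edits = " + edits);
        System.out.println("score = " + score(edits, input.length()));
    }
}
```

Note that the capital "M" counts as one of the six edits, which is what you would expect with ignoreCase left at its default of false.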
Both methods can take the same parameters:
PARAMETER | TYPE | DESCRIPTION |
---|---|---|
pattern | String | The pattern to compare against (the wildcard symbol is ¿ ) |
text | String | The input string to compare to the pattern |
vars | List<List<String>> | If included, this list will be populated with the extracted variables found during the comparison |
tokens | List<List<String>> | If included, this list will be populated with the extracted tokens found during the comparison |
ignoreCase | boolean | When true, the comparison is performed without case sensitivity (false by default) |
ignorePunctuation | boolean | When true, punctuation is ignored during the comparison (false by default) |
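One plausible reading of the two boolean flags is that both strings are normalized before they are compared. The sketch below shows that idea only. How the library actually applies the flags is not stated here, and `NormalizeSketch` and `normalize` are hypothetical names.

```java
import java.util.Locale;

// Hypothetical illustration, not part of the library.
class NormalizeSketch {

    // Strips ASCII punctuation and/or lowercases the text, depending on the flags.
    // The wildcard symbol ¿ is not ASCII punctuation, so it is left in place.
    static String normalize(String s, boolean ignoreCase, boolean ignorePunctuation) {
        String out = s;
        if (ignorePunctuation) {
            out = out.replaceAll("\\p{Punct}", "");
        }
        if (ignoreCase) {
            out = out.toLowerCase(Locale.ROOT);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(normalize("My names John and I'm 30 years old.", true, true));
        System.out.println(normalize("my name is ¿ and I am ¿ years old", true, true));
    }
}
```

Under this reading, setting both flags would remove the capital "M", the apostrophe, and the trailing period as sources of mismatch in the second example above.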
Both methods have the same overloads using the parameters defined above:
pattern | text | vars | tokens | ignoreCase | ignorePunctuation |
---|---|---|---|---|---|
X | X | | | | |
X | X | X | | | |
X | X | X | X | | |
X | X | | | X | X |
X | X | X | | X | X |
X | X | X | X | X | X |