spencermountain / compromise

modest natural-language processing

Home Page: http://compromise.cool

Using .freeze() in nlp.plugin()?

thegoatherder opened this issue · comments

commented

The documentation for .freeze() shows adding a word to the lexicon and freezing it using addWords().

For projects that use the nlp.plugin() option to initialise the lexicon at program start, an object is passed to the words property with a name-value pair syntax for each tag.

Is there a way to specify these tags to be frozen from within the plugin object?

commented

Can it also be specified in a match-object?

{
  match: "#Diagnostic+ (preassessment|assessment|reassessment|pre-assessment|re-assessment)",
  reason: "",
  tag: "Diagnostic",
}
commented

Thanks, Spence! Once these are implemented we can fully integrate it into our solution and get it tested thoroughly. We will let you know if we find any anomalies! Looking forward to getting this feature rolling…!

hey Adam, both features are implemented on dev, and i will test, document, and release it, hopefully tomorrow.
cheers

commented

Hey Spence that’s great! I forgot to mention the reason we need the match-object is for buildNet() and sweep() - will these support frozen tags too?

yep - {match:'foo', tag:'Bar', freeze:true, } will lock in #Bar, and it will be unchangeable. Will that do the trick for your purpose?

commented

Yep absolutely perfect, thank you!

both features have been released in 14.2.0:

frozen lexicon via plugin:

nlp.plugin({
  // normal lexicon
  words:{
    foo:'Bar'
  },
  // frozen lexicon
  frozen: {
    'juicy fruit': 'Singular',
    'front steps': 'Plural',
  },
})
let doc = nlp(`i ate juicy fruit on the front steps`)
doc.debug()

and freeze inside sweep:

let matches = [
  { match: 'juicy fruit', tag: 'Singular', freeze: true },
  { match: 'front steps', tag: 'Plural', freeze: true },
]
let doc = nlp(`i ate juicy fruit on the front steps`)
let net = nlp.buildNet(matches)
doc.sweep(net)
doc.debug()

note that in both cases, the words don't stay frozen after this process. You can do doc.sweep(net).freeze() to re-freeze them for further analysis.
cheers
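
For projects that keep one big lexicon and one list of match-objects, a rough sketch of how the two inputs above could be prepared might look like this. Both helper names (buildLexiconPlugin, freezeMatches) are hypothetical and not part of compromise; the only things assumed from the library are the words / frozen plugin properties and the freeze: true match-object flag shown in the examples above.

```javascript
// Hypothetical helper (not part of compromise): split one lexicon into the
// `words` and `frozen` properties used in the plugin example above.
function buildLexiconPlugin(lexicon, frozenKeys) {
  const frozenSet = new Set(frozenKeys)
  const plugin = { words: {}, frozen: {} }
  for (const [word, tag] of Object.entries(lexicon)) {
    // entries named in frozenKeys go to `frozen`, everything else to `words`
    const target = frozenSet.has(word) ? plugin.frozen : plugin.words
    target[word] = tag
  }
  return plugin
}

// Hypothetical helper: return copies of the match-objects with freeze:true set,
// leaving the original array untouched.
function freezeMatches(matches) {
  return matches.map((m) => ({ ...m, freeze: true }))
}

const plugin = buildLexiconPlugin(
  { foo: 'Bar', 'juicy fruit': 'Singular', 'front steps': 'Plural' },
  ['juicy fruit', 'front steps']
)

const matches = freezeMatches([
  { match: 'juicy fruit', tag: 'Singular' },
  { match: 'front steps', tag: 'Plural' },
])

// With compromise loaded, these would be handed over as in the examples above:
//   nlp.plugin(plugin)
//   let net = nlp.buildNet(matches)
```

This only builds plain objects, so it can be checked without loading the library; whether the tags end up frozen still depends on passing the results to nlp.plugin() and nlp.buildNet() as shown earlier.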

commented

Superb thanks Spence.
@Fdawgs is going to integrate this on our end in the coming days and we will report back on any anomalies found during testing. Thanks again

both features have been released in 14.2.0:

Did you mean 14.11.2 @spencermountain?

Oops, yes