Tagging text is slow
rockt opened this issue
public List<Mention> tag(String text) throws UIMAException {
    // A fresh JCas is created on every call; this is the expensive step.
    JCas jcas = JCasFactory.createJCas(typeSystem);
    jcas.setDocumentText(text);

    // Annotate the whole text as a single document with an empty PMID.
    PubmedDocument pd = new PubmedDocument(jcas);
    pd.setBegin(0);
    pd.setEnd(text.length());
    pd.setPmid("");
    pd.addToIndexes(jcas);

    return tag(jcas);
}
This is slow because a new JCas is initialized every time we want to tag a string. Instead, keep one pre-initialized JCas and reset it each time this method is called.
It is not quite that easy if we want to allow threading for this method (which seems sensible to me). Several threads cannot work on the same JCas, so we must either make access to it thread-safe or synchronize the method, which would prevent any multithreading.
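One possible middle ground is to keep one reusable instance per thread, so nothing is shared and nothing needs to be synchronized. The sketch below shows the pattern with the JDK only. `PerThreadReusable` and `FakeCas` are hypothetical names invented for illustration, and `FakeCas` is just a stand-in for a real JCas so the example has no UIMA dependency.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

/**
 * Hypothetical helper (not part of UIMA): lazily creates one expensive
 * object per thread and resets it before every use.
 */
final class PerThreadReusable<T> {
    private final ThreadLocal<T> holder;
    private final Consumer<T> resetter;

    PerThreadReusable(Supplier<T> factory, Consumer<T> resetter) {
        // The factory runs at most once per thread, on that thread's first use.
        this.holder = ThreadLocal.withInitial(factory);
        this.resetter = resetter;
    }

    <R> R use(Function<T, R> work) {
        T instance = holder.get();
        // Clear whatever the previous call on this thread left behind.
        resetter.accept(instance);
        return work.apply(instance);
    }
}

/** Stand-in for a JCas (assumption for this sketch); counts how many were built. */
final class FakeCas {
    static final AtomicInteger CREATED = new AtomicInteger();
    private String text;

    FakeCas() {
        CREATED.incrementAndGet();
    }

    void reset() {
        text = null;
    }

    void setDocumentText(String text) {
        this.text = text;
    }

    String getDocumentText() {
        return text;
    }
}
```

In the real method, the factory would presumably wrap `JCasFactory.createJCas(typeSystem)` (converting the checked `UIMAException` into an unchecked one) and the resetter would call `reset()` on the JCas. The trade-off is memory: every thread that ever calls `tag` keeps its own JCas alive, so with many threads a small bounded pool may be the better fit.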