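How it works

1. Read the page from a URL.
2. Parse the HTML and collect the nodes that contain text nodes, filtering out scripts.
3. Join all the text into a single article string.
4. Find the position of every highlight text in the article.
5. Highlight:
   - find the DOM nodes that cover each highlight position;
   - convert those DOM nodes back to XPath.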
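
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it uses only the Python standard library (`html.parser`, `urllib.request`), and every name in it (`TextNodeCollector`, `build_article`, `find_all`, `locate`, `highlight`, `fetch_html`) is hypothetical. It assumes reasonably well-formed HTML, and the XPaths it produces follow the parsed tag stream, so they may differ from a browser's DOM where the browser inserts implied elements such as `tbody`.

```python
from html.parser import HTMLParser
from urllib.request import urlopen

SKIPPED = {"script", "style", "noscript"}  # text inside these is never collected
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}  # elements with no end tag


class TextNodeCollector(HTMLParser):
    """Collect (xpath, text) pairs for the non-blank text nodes of a page."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.path = []        # XPath steps of the open elements, e.g. ["html[1]", "body[1]"]
        self.counters = [{}]  # per open element: child name -> how many seen so far
        self.skipping = 0     # > 0 while inside a SKIPPED element
        self.buffer = []      # chunks of the text node currently being read
        self.nodes = []       # collected (xpath, text) pairs

    def _step(self, name):
        seen = self.counters[-1]
        seen[name] = seen.get(name, 0) + 1
        return f"{name}[{seen[name]}]"

    def _flush(self):
        text, self.buffer = "".join(self.buffer), []
        if not text:
            return
        step = self._step("text()")  # blank text nodes still count for indexing
        if text.strip() and not self.skipping:
            self.nodes.append(("/" + "/".join(self.path + [step]), text))

    def handle_starttag(self, tag, attrs):
        self._flush()
        step = self._step(tag)
        if tag in VOID:
            return
        self.path.append(step)
        self.counters.append({})
        if tag in SKIPPED:
            self.skipping += 1

    def handle_endtag(self, tag):
        self._flush()
        if tag in VOID or not self.path or not self.path[-1].startswith(tag + "["):
            return  # this sketch ignores stray or mismatched end tags
        self.path.pop()
        self.counters.pop()
        if tag in SKIPPED:
            self.skipping -= 1

    def handle_comment(self, data):
        self._flush()  # a comment separates two text nodes

    def handle_data(self, data):
        self.buffer.append(data)

    def close(self):
        super().close()
        self._flush()


def fetch_html(url):
    """Read a page from a URL and decode it (falls back to UTF-8)."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def build_article(nodes):
    """Join the text nodes into one string; record each node's (start, end, xpath)."""
    parts, spans, pos = [], [], 0
    for xpath, text in nodes:
        spans.append((pos, pos + len(text), xpath))
        parts.append(text)
        pos += len(text)
    return "".join(parts), spans


def find_all(article, phrase):
    """Return the start offset of every occurrence of phrase in article."""
    positions, start = [], article.find(phrase)
    while start != -1:
        positions.append(start)
        start = article.find(phrase, start + 1)
    return positions


def locate(spans, start, end):
    """Map the article range [start, end) to (xpath, from, to) pieces, one per text node."""
    return [(xpath, max(start, s) - s, min(end, e) - s)
            for s, e, xpath in spans if s < end and start < e]


def highlight(html, phrase):
    """Run the whole pipeline on an HTML string; one list of pieces per match."""
    collector = TextNodeCollector()
    collector.feed(html)
    collector.close()
    article, spans = build_article(collector.nodes)
    return [locate(spans, p, p + len(phrase)) for p in find_all(article, phrase)]


if __name__ == "__main__":
    sample = "<html><body><p>The <b>cat</b> sat.</p></body></html>"
    for pieces in highlight(sample, "cat sat"):
        print(pieces)
```

Because the article string is a plain concatenation of the text nodes, a highlight that crosses element boundaries (such as "cat sat" in the sample) comes back as several pieces, each an XPath to one text node plus character offsets inside it. For a live page, `highlight(fetch_html(url), phrase)` would chain the first step in as well.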