JimTsesm / Wikipedia-Internal-Missing-Links-Detection

Wikipedia-Internal-Missing-Links-Detection

Execution Instructions:

After compiling the program, run it with the following command:

    java -cp target/esa-1.0-SNAPSHOT.jar be.vanoosten.esa.Main --method [1/2] --createIndex [0/1] [path to pages-articles-multistream.xml.bz2] --candidateSetPath [path to candidate set csv] --rankingPath [output path] --indexPath [path to the index]

The parameters are as follows:

  • --method: 1 to use titles and 2 to use abstracts
  • --createIndex: 1 to create the index, followed by the path to the XML dump file; 0 if the index has already been created
  • --indexPath: the path to the index
  • --candidateSetPath: the path to the input CSV file containing the candidate set
  • --rankingPath: the path where the output ranking is written
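
The way these flags fit together can be illustrated with a small parser. This is only a sketch under stated assumptions, not the code in be.vanoosten.esa.Main: the class name EsaArgsSketch and the map key "dumpPath" are hypothetical, and the sketch assumes that every flag takes exactly one value, with the dump path appearing as a bare token right after "--createIndex 1".

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the command-line layout described above.
// It is NOT the project's actual argument handling.
public class EsaArgsSketch {

    // Reads "--flag value" pairs into a map. When --createIndex is 1, the next
    // bare token is stored under the (assumed) key "dumpPath".
    static Map<String, String> parse(String[] args) {
        Map<String, String> opts = new LinkedHashMap<>();
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            if (!arg.startsWith("--")) {
                throw new IllegalArgumentException("Unexpected argument: " + arg);
            }
            if (i + 1 >= args.length) {
                throw new IllegalArgumentException("Missing value for " + arg);
            }
            String name = arg.substring(2);
            String value = args[++i];
            opts.put(name, value);

            // --createIndex 1 must be followed by the path to the XML dump.
            if (name.equals("createIndex") && value.equals("1")) {
                if (i + 1 >= args.length || args[i + 1].startsWith("--")) {
                    throw new IllegalArgumentException("--createIndex 1 requires the dump path");
                }
                opts.put("dumpPath", args[++i]);
            }
        }

        // --method selects titles (1) or abstracts (2).
        String method = opts.get("method");
        if (!"1".equals(method) && !"2".equals(method)) {
            throw new IllegalArgumentException("--method must be 1 or 2");
        }
        return opts;
    }

    public static void main(String[] args) {
        // Print the parsed options, one per line.
        parse(args).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

With this layout, an invocation that reuses an existing index passes "--createIndex 0" and omits the dump path, while a first run passes "--createIndex 1" followed by the path to the dump file.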

About

License:GNU Affero General Public License v3.0

