castorini / anserini

Anserini is a Lucene toolkit for reproducible information retrieval research

Home Page: http://anserini.io/

Error Building Anserini on Windows

paulowoicho opened this issue

I have been having trouble building Anserini on a Windows computer with this command: mvn clean package appassembler:assemble. It keeps failing with the following error:

[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  08:36 min
[INFO] Finished at: 2021-01-21T00:00:48Z
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.12.4:test (default-test) on project anserini: There are test failures.
[ERROR]
[ERROR] Please refer to C:\Users\Owoicho\Documents\PhD Prep\PhD Stuff\anserini\target\surefire-reports for the individual test results.
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException

I have Java 11 and Maven 3.6.3 installed:

>mvn -version
Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)
Maven home: C:\Program Files\apache-maven-3.6.3\bin\..
Java version: 11.0.10, vendor: Oracle Corporation, runtime: C:\Program Files\Java\jdk-11.0.10
Default locale: en_US, platform encoding: Cp1252
OS name: "windows 10", version: "10.0", arch: "amd64", family: "windows"
>java -version
java version "11.0.10" 2021-01-19 LTS
Java(TM) SE Runtime Environment 18.9 (build 11.0.10+8-LTS-162)
Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.10+8-LTS-162, mixed mode)

What could I be doing wrong? Please help!

Hi @paulowoicho - thanks for your interest! Sorry, but no one on our team uses Windows, so it's difficult for us to help...

Your error message suggests that there are test failures... what's failing?

Do you have access to a Mac or Linux machine?

Thanks @lintool for getting back to me! Here are the tests that are failing:

Results :

Failed tests:   testStreamIteration(io.anserini.collection.BibtexCollectionTest): expected:<author_name1  and[](..)
  testManualSegmentInitialization(io.anserini.collection.BibtexCollectionTest): expected:<author_name1  and[](..)
  testIterateCollection(io.anserini.collection.BibtexCollectionTest): expected:<author_name1  and[](..)
  testStreamIteration(io.anserini.collection.EpidemicQACollectionTest): expected:<66689> but was:<68210>
  testIterateCollection(io.anserini.collection.EpidemicQACollectionTest): expected:<66689> but was:<68210>
  testManualSegmentInitialization(io.anserini.collection.EpidemicQACollectionTest): expected:<4291> but was:<4412>
  testStreamIteration(io.anserini.collection.JsonCollectionDocumentArrayTest): expected:<{[(..)
  testIterateCollection(io.anserini.collection.JsonCollectionDocumentArrayTest): expected:<{[(..)
  testManualSegmentInitialization(io.anserini.collection.JsonCollectionDocumentArrayTest): expected:<{[(..)
  testIterateCollection(io.anserini.collection.JsonCollectionDocumentObjectTest): expected:<{[(..)
  testManualSegmentInitialization(io.anserini.collection.JsonCollectionDocumentObjectTest): expected:<{[(..)
  testStreamIteration(io.anserini.collection.JsonCollectionDocumentObjectTest): expected:<{[(..)
  testStreamIteration(io.anserini.collection.JsonCollectionLineObjectTest): expected:<{[(..)
  testIterateCollection(io.anserini.collection.JsonCollectionLineObjectTest): expected:<{[(..)
  testManualSegmentInitialization(io.anserini.collection.JsonCollectionLineObjectTest): expected:<{[(..)
  testManualSegmentInitialization(io.anserini.collection.TwentyNewsgroupsCollectionTest)
  testStreamIteration(io.anserini.collection.TwentyNewsgroupsCollectionTest)
  testIterateCollection(io.anserini.collection.TwentyNewsgroupsCollectionTest)
  testGetQrelsResource(io.anserini.eval.RelevanceJudgmentsTest): expected:<301 0 FBIS3-10082 1[(..)
  testMain(io.anserini.index.IndexReaderUtilsTest): expected:<Index statistics[(..)
  testMain(io.anserini.search.SimpleSearcherTest): expected:<... 1 0.570200 Anserini[](..)
  testNonEnglishTopics(io.anserini.search.topicreader.TopicReaderTest): expected:<[?????????????????]> but was:<[???å?ƒé??èµ°å??éª????å??å¼ è?ºè°?æ??ä»?ä??å??ç?»ï¼?]>
  testNonEnglishTopics_TopicIdsAsStrings(io.anserini.search.topicreader.TopicReaderTest): expected:<[?????????????????]> but was:<[???å?ƒé??èµ°å??éª????å??å¼ è?ºè°?æ??ä»?ä??å??ç?»ï¼?]>
  test(io.anserini.util.ExtractAverageDocumentLengthTest): expected:<... Exact avg doclength[(..)
  test(io.anserini.util.ExtractDocumentLengthsTest): expected:<...(sum of doclengths):[(..)

Tests run: 296, Failures: 25, Errors: 0, Skipped: 0

I do not have access to a Mac or Linux machine, but I could install VirtualBox and run Linux on my computer.

Hey @paulowoicho, I get the same issue on Windows, which is due to an encoding error. One workaround is to build with the tests skipped by adding the following flag: -Dmaven.test.skip=true

So the full command would be: mvn clean package appassembler:assemble -Dmaven.test.skip=true

Otherwise, I recommend installing the Windows Subsystem for Linux if you can, since it makes development a lot easier.
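
To illustrate the kind of encoding error involved, here is a minimal Python sketch. It is not taken from the Anserini code, and the sample string is made up. It assumes the failures come from UTF-8 test data being decoded with Cp1252, which the mvn -version output above reports as the platform encoding.

```python
# Hypothetical sample text; any non-ASCII character shows the effect.
text = "café"

# The bytes as they would be stored in a UTF-8 encoded file.
utf8_bytes = text.encode("utf-8")

# Decode those bytes with Cp1252 instead of UTF-8.
# errors="replace" substitutes a placeholder for byte values that
# Cp1252 leaves undefined, instead of raising an exception.
misread = utf8_bytes.decode("cp1252", errors="replace")

print(misread)                  # cafÃ©
print(len(text), len(misread))  # 4 5
```

Each multi-byte UTF-8 character turns into several Cp1252 characters, so the misread text is both garbled and longer. That would be consistent with the mismatched strings and the larger-than-expected counts in the failing tests, though this sketch does not prove that is the cause.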

Wow! It works! Thank you for the command! You saved my life.

This command should be put on the main page, because I face the same problem every time. I hope you will add it, please: for Windows, skip the tests.

Sure, see #1848

Thanks, sir. But, no offence intended, it is still not showing on the main page (the main README file).

It's in a pull request - it'll get merged once I get someone to sign off on it.

Great, and big thanks for your support and cooperation 👍

I'm not too familiar with the codebase, but I noticed a "raw" open in some places, e.g. here

        return open(file_name, flags)

Such statements could indeed cause encoding problems on Windows: when no encoding='utf8' argument is passed to open(), it uses the default system encoding, which is usually UTF-8 on Linux systems and locale-dependent on Windows systems (for example Cp1252, as in the mvn -version output above).
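
As a rough sketch of what passing the encoding explicitly could look like: the helper name open_utf8 below is hypothetical and not part of the codebase, and the temporary file exists only to make the example self-contained.

```python
import os
import tempfile


def open_utf8(file_name, flags="r"):
    """Hypothetical wrapper around open() that never relies on the
    locale's default encoding for text files."""
    if "b" in flags:
        # Binary mode takes no encoding argument.
        return open(file_name, flags)
    return open(file_name, flags, encoding="utf-8")


def roundtrip(text):
    """Write text to a temporary file and read it back via open_utf8."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open_utf8(path, "w") as f:
            f.write(text)
        with open_utf8(path, "r") as f:
            return f.read()
    finally:
        os.remove(path)


# Should print True regardless of the platform's default encoding.
print(roundtrip("café 中文") == "café 中文")
```

With a plain open(file_name, flags) the same round trip depends on the locale: it may fail or produce different text on a Cp1252 system, which is the situation described above.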

I think it could be changed rather quickly with a method described here: https://dev.to/methane/python-use-utf-8-mode-on-windows-212i

I'm not familiar with how Maven runs tests, but if it runs them using Python, then maybe passing -Xutf8 will magically fix everything?

Alternatively, running set _JAVA_OPTIONS=-Dfile.encoding=UTF-8 (and then restoring the old value, whatever it was) could help...
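
One way to avoid having to restore the old value is to set the variable only for the child process. The following is a sketch under that assumption; the function names are made up for illustration, and whether the option actually makes the failing tests pass is untested here.

```python
import os
import subprocess


def env_with_utf8_java(base_env=None):
    """Return a copy of the environment in which _JAVA_OPTIONS also
    contains -Dfile.encoding=UTF-8. The input mapping is not modified."""
    env = dict(os.environ if base_env is None else base_env)
    flag = "-Dfile.encoding=UTF-8"
    existing = env.get("_JAVA_OPTIONS", "")
    if flag not in existing.split():
        # Keep any options that were already set and append ours.
        env["_JAVA_OPTIONS"] = (existing + " " + flag).strip()
    return env


def run_with_utf8_java(command):
    """Run a command (for example the Maven build) with the adjusted
    environment; the parent shell's variables stay untouched."""
    return subprocess.run(command, env=env_with_utf8_java(), check=True)


print(env_with_utf8_java({"_JAVA_OPTIONS": "-Xmx2g"})["_JAVA_OPTIONS"])
# -Xmx2g -Dfile.encoding=UTF-8
```

A call might then look like run_with_utf8_java(["mvn.cmd", "clean", "package", "appassembler:assemble"]), assuming the Maven launcher on Windows is mvn.cmd and is on the PATH.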

I skipped the tests and it was fine with that
mvn clean package appassembler:assemble -Dmaven.test.skip=true