Ruben Laguna's blog

Mar 10, 2010 - 1 minute read - build fail java locale lucene mac maven oome tests tika

Building Apache Tika 0.6 fails if the locale is not en_US

I tried to build Apache Tika 0.6 yesterday, but the build did not complete because two tests failed:

  testExcelParserFormatting(org.apache.tika.parser.microsoft.ExcelParserTest)
  testExcelFormats(org.apache.tika.parser.microsoft.ooxml.OOXMLParserTest)

The failures had to do with the locale: mine was “es_ES”, where the number format differs (“1.599,99” rather than “1,599.99”).
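To make the difference concrete, here is a minimal sketch of how the same number comes out under the two locales with the standard java.text.NumberFormat. The class name LocaleFormatDemo and its format helper are made up for illustration; this is not code from Tika or from its tests.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical demo class, not part of Tika.
public class LocaleFormatDemo {

    // Formats a value with exactly two decimals, using the grouping and
    // decimal separators of the given locale.
    static String format(Locale locale, double value) {
        NumberFormat nf = NumberFormat.getNumberInstance(locale);
        nf.setMinimumFractionDigits(2);
        nf.setMaximumFractionDigits(2);
        return nf.format(value);
    }

    public static void main(String[] args) {
        double value = 1599.99;
        // The two lines differ in their separators, which is enough to break
        // a test that compares against a hard-coded en_US string.
        System.out.println("en_US: " + format(Locale.US, value));
        System.out.println("es_ES: " + format(Locale.forLanguageTag("es-ES"), value));
    }
}
```

A test that asserts on the en_US form will only pass when the formatter ends up using an English locale, which is presumably what happened here.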

$ mvn -version
Apache Maven 2.2.0 (r788681; 2009-06-26 15:04:01+0200)
Java version: 1.6.0_17
Java home: /System/Library/Frameworks/JavaVM.framework/Versions/1.6.0/Home
Default locale: es_ES, platform encoding: MacRoman
OS name: "mac os x" version: "10.6.2" arch: "x86_64" Family: "mac"

I changed the locale temporarily to be able to build:

export LC_ALL=en_US.UTF-8

so those tests no longer failed. But then I ran into an OutOfMemoryError while running some of the tests, so I set MAVEN_OPTS:

export MAVEN_OPTS="-Xmx2048m"
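If you want to check what heap ceiling a JVM actually ends up with, the standard Runtime.maxMemory() call reports it. The following is only a sketch: the class name HeapCheck is made up, and it reports the limit of whichever JVM runs it, which may or may not be the JVM that runs the tests.

```java
// Hypothetical helper, not part of Tika or Maven.
public class HeapCheck {

    // Returns the maximum heap this JVM will attempt to use, in megabytes.
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Run with e.g. "java -Xmx2048m HeapCheck" to see the effect of the flag.
        System.out.println("Max heap: " + maxHeapMb() + " MB");
    }
}
```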

And then mvn install succeeded! I got the jars:

find . -name "*jar"
./tika-app/target/tika-app-0.6.jar
./tika-bundle/target/tika-bundle-0.6.jar
./tika-core/target/tika-core-0.6.jar
./tika-parsers/target/tika-parsers-0.6.jar