Cédric Champeau's blog: JLangDetect 0.2 released

30 September 2009

JLangDetect 0.2 released!

This is a small update that adds the following features:

Ability to detect the language of a document using a subset of the languages used for training
Logging is now handled by log4j

Restricting detection to a subset of the training languages matters when you already know something about the document. For example, if a document must be written in either French or English but the detector has been trained on more languages, passing that subset guarantees the detector returns one of those two languages and never an unrelated one.
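To make the idea concrete, here is a minimal sketch of subset-restricted detection. It is not the JLangDetect API: the class `SubsetDetectionSketch`, its `train` and `detect` methods, and the toy trigram-overlap scoring are all assumptions made for illustration. The only point it shows is that candidates outside the allowed subset are never scored, so they can never be returned.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy detector, NOT the JLangDetect API.
public class SubsetDetectionSketch {

    // Language code -> distinct character trigrams seen during training.
    private final Map<String, Set<String>> profiles = new HashMap<>();

    // Collects the distinct character trigrams of a lower-cased text.
    static Set<String> trigrams(String text) {
        Set<String> grams = new HashSet<>();
        String s = text.toLowerCase();
        for (int i = 0; i + 3 <= s.length(); i++) {
            grams.add(s.substring(i, i + 3));
        }
        return grams;
    }

    // Adds the trigrams of a sample text to the profile of a language.
    void train(String language, String sample) {
        profiles.computeIfAbsent(language, k -> new HashSet<>())
                .addAll(trigrams(sample));
    }

    // Scores only the languages in 'allowed'; every other trained
    // language is ignored, so the result is always one of the subset.
    String detect(String text, Set<String> allowed) {
        Set<String> grams = trigrams(text);
        String best = null;
        int bestScore = -1;
        // Sorted iteration keeps tie-breaking deterministic.
        for (String language : new TreeSet<>(allowed)) {
            Set<String> profile = profiles.get(language);
            if (profile == null) {
                continue; // language was never trained
            }
            int score = 0;
            for (String gram : grams) {
                if (profile.contains(gram)) {
                    score++;
                }
            }
            if (score > bestScore) {
                bestScore = score;
                best = language;
            }
        }
        if (best == null) {
            throw new IllegalArgumentException(
                    "No trained language in subset: " + allowed);
        }
        return best;
    }

    // Unrestricted detection: every trained language is a candidate.
    String detect(String text) {
        return detect(text, profiles.keySet());
    }

    public static void main(String[] args) {
        SubsetDetectionSketch detector = new SubsetDetectionSketch();
        detector.train("en", "the quick brown fox jumps over the lazy dog and the cat");
        detector.train("fr", "le renard brun saute par dessus le chien paresseux et le chat");
        detector.train("de", "der schnelle braune fuchs springt ueber den faulen hund und die katze");

        String text = "the dog and the cat";
        System.out.println(detector.detect(text));
        // Restricted to French or German: English can no longer be returned.
        System.out.println(detector.detect(text, new HashSet<>(Arrays.asList("fr", "de"))));
    }
}
```

The restricted call returns a poor match here, since the text is English, but that is exactly the contract: the caller asserts the document is in one of the listed languages, and the detector picks the best candidate among them.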

Downloads

JLangDetect is licensed under Apache 2.0.

Binary: jlangdetect-0.2.jar
Source: jlangdetect-0.2-sources.jar
Javadoc: jlangdetect-0.2-javadoc.jar
Europarl pre-compiled corpus: ngrams-europarl.zip

Version Control

The project is hosted on Google Code.