Language recognition in Java [closed]

Take a look at the language identifier in Apache Tika. This assumes you want to detect which natural language a piece of text is written in, as opposed to building a parser for a programming language.


Textcat (http://textcat.sourceforge.net/) doesn't support Russian, but it does handle the following languages:

  • albanian
  • danish
  • dutch
  • english
  • finnish
  • french
  • german
  • hungarian
  • italian
  • norwegian
  • polish
  • slovakian
  • slovenian
  • spanish
  • swedish
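
Textcat is based on character n-gram profiles (the Cavnar and Trenkle approach): each language gets a ranked list of its most frequent n-grams, and a text is assigned to the language whose ranking it is closest to. If you want to see roughly how that works, or add a language that isn't in the list, here is a minimal sketch of the idea. It is not Textcat's code and does not use its trained profiles; the class name, `MAX_N`, `PROFILE_SIZE`, and the training texts you feed it are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Illustrative sketch of n-gram rank-order language guessing.
// Not Textcat's implementation; names and constants are hypothetical.
class NGramLanguageGuesser {
    private static final int MAX_N = 3;          // use 1- to 3-character n-grams
    private static final int PROFILE_SIZE = 300; // keep only the most frequent n-grams

    // language name -> (n-gram -> rank), in the order the languages were trained
    private final Map<String, Map<String, Integer>> profiles = new LinkedHashMap<>();

    // Build a ranked profile: n-gram -> rank, where rank 0 is the most frequent.
    static Map<String, Integer> profile(String text) {
        // Lower-case and collapse every run of non-letters into a single '_' word boundary.
        String normalized = "_" + text.toLowerCase(Locale.ROOT).replaceAll("[^\\p{L}]+", "_") + "_";
        Map<String, Integer> counts = new HashMap<>();
        for (int n = 1; n <= MAX_N; n++) {
            for (int i = 0; i + n <= normalized.length(); i++) {
                counts.merge(normalized.substring(i, i + n), 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        // Most frequent first; ties broken alphabetically so the ranking is deterministic.
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        Map<String, Integer> ranks = new HashMap<>();
        for (int i = 0; i < entries.size() && i < PROFILE_SIZE; i++) {
            ranks.put(entries.get(i).getKey(), i);
        }
        return ranks;
    }

    // Register a language from a sample text (real profiles need far more text).
    void train(String language, String sampleText) {
        profiles.put(language, profile(sampleText));
    }

    // "Out-of-place" distance: sum of rank differences, with a fixed
    // penalty for n-grams that the language profile does not contain.
    static int distance(Map<String, Integer> document, Map<String, Integer> language) {
        int total = 0;
        for (Map.Entry<String, Integer> e : document.entrySet()) {
            Integer rank = language.get(e.getKey());
            total += (rank == null) ? PROFILE_SIZE : Math.abs(rank - e.getValue());
        }
        return total;
    }

    // Return the trained language with the smallest distance, or null if none are trained.
    String guess(String text) {
        Map<String, Integer> document = profile(text);
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (Map.Entry<String, Map<String, Integer>> e : profiles.entrySet()) {
            int d = distance(document, e.getValue());
            if (d < bestDistance) {
                bestDistance = d;
                best = e.getKey();
            }
        }
        return best;
    }
}
```

You would call `train("russian", someLargeRussianText)` once per language and then `guess(text)` on the input. With tiny training samples the guesses are unreliable, so treat this as a way to understand the technique rather than a replacement for a library with properly trained profiles.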

There is also the Language Detection API, a web service that accepts text via HTTP POST and returns JSON with the detected languages and their scores. Because it is plain HTTP, it can be used from Java or any other programming language.
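
Calling such a service from Java could look roughly like the sketch below, which uses the `java.net.http` client from Java 11+. The endpoint URL and the `q` form parameter are placeholders I made up for illustration, not the service's documented interface, and authentication is left out; check the API's own documentation for the real URL, parameters, and credentials.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Sketch of posting text to a language detection web service.
// ENDPOINT and the "q" parameter are hypothetical placeholders.
class LanguageDetectionClient {
    static final String ENDPOINT = "https://example.com/detect";

    // URL-encode the text as a form body, e.g. "q=hello+world".
    static String formBody(String text) {
        return "q=" + URLEncoder.encode(text, StandardCharsets.UTF_8);
    }

    // Build the POST request without sending it.
    static HttpRequest buildRequest(String text) {
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formBody(text), StandardCharsets.UTF_8))
                .build();
    }

    // Send the request and return the raw JSON response body.
    static String detect(String text) throws IOException, InterruptedException {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(buildRequest(text), HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }
}
```

`detect(...)` returns the JSON as a string; you would parse it with whatever JSON library you already use (Jackson, Gson, etc.) to pull out the languages and scores.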