Entry point for Language Identification.
This class can be used from any thread.
Constant Summary
float | DEFAULT_IDENTIFY_LANGUAGE_CONFIDENCE_THRESHOLD | The default confidence threshold for the identifyLanguage(String) call. |
float | DEFAULT_IDENTIFY_POSSIBLE_LANGUAGES_CONFIDENCE_THRESHOLD | The default confidence threshold for the identifyPossibleLanguages(String) call. |
String | UNDETERMINED_LANGUAGE_CODE | The BCP-47 code for "undetermined language". |
Public Method Summary
void | close() | Releases resources when the client is finished using the instance. |
Task<String> | identifyLanguage(String text) | Identifies the language in a supplied String and returns the most likely language. |
Task<List<IdentifiedLanguage>> | identifyPossibleLanguages(String text) | Identifies the language in a supplied String and returns a list of possible languages, cutting off any languages whose confidence score falls below the threshold in getConfidenceThreshold(). |
Constants
public static final float DEFAULT_IDENTIFY_LANGUAGE_CONFIDENCE_THRESHOLD
The default confidence threshold for the identifyLanguage(String) call.
public static final float DEFAULT_IDENTIFY_POSSIBLE_LANGUAGES_CONFIDENCE_THRESHOLD
The default confidence threshold for the identifyPossibleLanguages(String) call.
public static final String UNDETERMINED_LANGUAGE_CODE
The BCP-47 code for "undetermined language".
Public Methods
public void close ()
Releases resources when the client is finished using the instance.
The instance can still be used after a call to this method, but it might take slightly longer to produce a result because it has to load the model again.
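The close-then-reuse behavior described above can be pictured with a small, self-contained sketch. This is not the library's implementation: ReloadableClientSketch, its modelLoader supplier, and the loadCount() counter are hypothetical names used only to illustrate a resource that is released on close() and lazily loaded again on the next use.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for a client that holds a lazily loaded model.
// It only illustrates "usable after close(), at the cost of a reload".
final class ReloadableClientSketch implements AutoCloseable {
    private final Supplier<String> modelLoader; // hypothetical loader, assumed expensive
    private String model;                       // null while the model is not loaded
    private int loadCount;                      // how many times the loader has run

    ReloadableClientSketch(Supplier<String> modelLoader) {
        this.modelLoader = modelLoader;
    }

    // Loads the model on first use, and again on the first use after close().
    // Synchronized so the sketch can be called from any thread.
    synchronized String use() {
        if (model == null) {
            model = modelLoader.get();
            loadCount++;
        }
        return model;
    }

    // Releases the model; the instance itself stays usable.
    @Override
    public synchronized void close() {
        model = null;
    }

    synchronized int loadCount() {
        return loadCount;
    }
}
```

In this sketch, two calls to use() load the model once; calling close() and then use() again triggers a second load, which is the extra latency the note above refers to.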
public Task<String> identifyLanguage (String text)
Identifies the language in a supplied String and returns the most likely language.
Parameters
text | The text for which to identify the language. Inputs longer than 200 characters are truncated to 200 characters, because longer input does not improve detection accuracy. |
Returns
- A Task that returns a String with the BCP-47 language code of the most likely language, or UNDETERMINED_LANGUAGE_CODE if the confidence was below the threshold of 0.5.
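The documented input truncation and threshold fallback can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the library's code: MostLikelyLanguageSketch and its methods are hypothetical, the scores map stands in for whatever the model produces, "und" is the BCP-47 subtag for an undetermined language, and counting "characters" as Java char units is an assumption.

```java
import java.util.Map;

// Hypothetical helper illustrating the documented semantics of identifyLanguage.
final class MostLikelyLanguageSketch {
    static final String UNDETERMINED = "und"; // BCP-47 "undetermined language"
    static final float THRESHOLD = 0.5f;      // threshold stated in the Returns text
    static final int MAX_CHARS = 200;         // truncation limit stated for the text parameter

    // Assumption: "characters" are counted as Java char (UTF-16) units.
    static String truncate(String text) {
        return text.length() > MAX_CHARS ? text.substring(0, MAX_CHARS) : text;
    }

    // Returns the highest-scoring language code, or UNDETERMINED when the
    // best score is below the threshold (or when there are no scores).
    static String mostLikely(Map<String, Float> scores) {
        String best = UNDETERMINED;
        float bestScore = Float.NEGATIVE_INFINITY;
        for (Map.Entry<String, Float> entry : scores.entrySet()) {
            if (entry.getValue() > bestScore) {
                bestScore = entry.getValue();
                best = entry.getKey();
            }
        }
        return bestScore >= THRESHOLD ? best : UNDETERMINED;
    }
}
```

With scores of 0.9 for "en" and 0.05 for "nl" the sketch returns "en"; if no score reaches 0.5 it returns "und", so callers should compare the result against UNDETERMINED_LANGUAGE_CODE before treating it as a real language.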
public Task<List<IdentifiedLanguage>> identifyPossibleLanguages (String text)
Identifies the language in a supplied String and returns a list of possible languages, cutting off any languages whose confidence score falls below the threshold in getConfidenceThreshold().
Note that this API assumes the text is in a single language; the returned list contains all estimations for what that language could be, along with a confidence score for each possible language. The API does not detect multiple languages in a single text.
Parameters
text | The text for which to identify the language. Inputs longer than 200 characters are truncated to 200 characters, because longer input does not improve detection accuracy. |
Returns
- A Task that returns a List of IdentifiedLanguages. The returned list will never be empty; if all languages have lower confidence scores than the threshold, the list will contain a single item with the UNDETERMINED_LANGUAGE_CODE.
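The cut-off and the never-empty guarantee can be illustrated with a short, self-contained sketch. It is not the library's implementation: PossibleLanguagesSketch and its Candidate class are hypothetical stand-ins for IdentifiedLanguage, the threshold is passed in rather than read from getConfidenceThreshold(), and the confidence attached to the fallback entry is a placeholder assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the documented semantics of identifyPossibleLanguages.
final class PossibleLanguagesSketch {
    static final String UNDETERMINED = "und"; // BCP-47 "undetermined language"

    // Hypothetical stand-in for IdentifiedLanguage: a language tag plus a confidence score.
    static final class Candidate {
        final String languageTag;
        final float confidence;

        Candidate(String languageTag, float confidence) {
            this.languageTag = languageTag;
            this.confidence = confidence;
        }
    }

    // Keeps candidates at or above the threshold, in their original order.
    // If none qualify, returns a single UNDETERMINED entry so the list is never empty.
    static List<Candidate> filter(List<Candidate> all, float threshold) {
        List<Candidate> kept = new ArrayList<>();
        for (Candidate candidate : all) {
            if (candidate.confidence >= threshold) {
                kept.add(candidate);
            }
        }
        if (kept.isEmpty()) {
            // Placeholder confidence: the source does not say which value is reported here.
            kept.add(new Candidate(UNDETERMINED, 0f));
        }
        return kept;
    }
}
```

Because the result always has at least one element, a caller can read the first item without an emptiness check, but should still test its language tag against UNDETERMINED_LANGUAGE_CODE.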