Detect Dominant Language in Text v1.0.0 Help
Identifies the dominant language of the input text, a task also known as language identification.
How can I use this Step?
The Step lets you identify the dominant language of the input text. You can use the Step to approach different text analytics problems, such as content-based recommendation and text classification.
How does the Step work?
The Step reports the dominant language as an RFC 5646 language tag, using the ISO 639-1 two-letter code where one exists and the ISO 639-2 three-letter code otherwise.
The Step returns the language code and a confidence score for each language detected in the text. The score indicates how confident the Step is that it identified the language correctly. To learn more, see the Output example.
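The code format described above can be illustrated with a short Python sketch. It is not part of the Step; the function name `describe_language_code` is hypothetical, and the sketch assumes that a two-letter primary subtag is an ISO 639-1 code, a three-letter one is an ISO 639-2 code, and anything after a hyphen (as in `zh-TW`) is an additional subtag.

```python
def describe_language_code(code: str) -> dict:
    """Split a language code such as 'en', 'ceb', or 'zh-TW' into its parts.

    Hypothetical helper for illustration only.
    """
    # Everything before the first hyphen is the primary language subtag.
    primary, _, subtag = code.partition("-")

    # Assumption: the length of the primary subtag tells the standard apart.
    if len(primary) == 2:
        standard = "ISO 639-1"
    elif len(primary) == 3:
        standard = "ISO 639-2"
    else:
        raise ValueError(f"unexpected primary subtag: {primary!r}")

    return {
        "primary": primary.lower(),
        "standard": standard,
        "subtag": subtag or None,  # None when the code has no extra subtag
    }
```

For example, `describe_language_code("zh-TW")` yields the primary subtag `zh` with the subtag `TW`, while `describe_language_code("ceb")` is classified as an ISO 639-2 code.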
Input settings
- To set up this section, provide the text to analyze in the Input text field. You can enter the text manually or use Merge fields.
Input text
The input text must be a UTF-8 string. The string must contain at least 20 characters. The maximum string size is 100 KB.
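If you prepare the text outside the Flow, you may want to check these limits before passing it to the Step. The following Python sketch is one way to do that. The function name `check_input_text` is hypothetical, and the sketch assumes that 100 KB means 100 × 1024 bytes of UTF-8 data, which the limits above do not spell out.

```python
MIN_CHARS = 20
MAX_BYTES = 100 * 1024  # assumption: 1 KB is counted as 1024 bytes


def check_input_text(text: str) -> list:
    """Return a list of problems; an empty list means the text passes.

    Hypothetical helper for illustration only.
    """
    problems = []

    # The minimum is expressed in characters, so count characters here.
    if len(text) < MIN_CHARS:
        problems.append(f"text has {len(text)} characters, minimum is {MIN_CHARS}")

    # The maximum is a size limit, so measure the UTF-8 encoded bytes.
    try:
        encoded = text.encode("utf-8")
    except UnicodeEncodeError:
        # Raised, for example, by lone surrogate code points.
        problems.append("text cannot be encoded as UTF-8")
        return problems

    if len(encoded) > MAX_BYTES:
        problems.append(f"text is {len(encoded)} bytes, maximum is {MAX_BYTES}")

    return problems
```

Returning a list of problems rather than raising on the first one lets the caller report every violated limit at once.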
You can find all supported languages and their respective codes in the following table:
Code | Language | Code | Language | Code | Language |
---|---|---|---|---|---|
af | Afrikaans | hy | Armenian | pt | Portuguese |
am | Amharic | ilo | Iloko | ps | Pushto |
ar | Arabic | id | Indonesian | qu | Quechua |
as | Assamese | is | Icelandic | ro | Romanian |
az | Azerbaijani | it | Italian | ru | Russian |
ba | Bashkir | jv | Javanese | sa | Sanskrit |
be | Belarusian | ja | Japanese | si | Sinhala |
bn | Bengali | kn | Kannada | sk | Slovak |
bs | Bosnian | ka | Georgian | sl | Slovenian |
bg | Bulgarian | kk | Kazakh | sd | Sindhi |
ca | Catalan | km | Central Khmer | so | Somali |
ceb | Cebuano | ky | Kirghiz | es | Spanish |
cs | Czech | ko | Korean | sq | Albanian |
cv | Chuvash | ku | Kurdish | sr | Serbian |
cy | Welsh | lo | Lao | su | Sundanese |
da | Danish | la | Latin | sw | Swahili |
de | German | lv | Latvian | sv | Swedish |
el | Greek | lt | Lithuanian | ta | Tamil |
en | English | lb | Luxembourgish | tt | Tatar |
eo | Esperanto | ml | Malayalam | te | Telugu |
et | Estonian | mt | Maltese | tg | Tajik |
eu | Basque | mr | Marathi | tl | Tagalog |
fa | Persian | mk | Macedonian | th | Thai |
fi | Finnish | mg | Malagasy | tk | Turkmen |
fr | French | mn | Mongolian | tr | Turkish |
gd | Scottish Gaelic | ms | Malay | ug | Uighur |
ga | Irish | my | Burmese | uk | Ukrainian |
gl | Galician | ne | Nepali | ur | Urdu |
gu | Gujarati | new | Newari | uz | Uzbek |
ht | Haitian | nl | Dutch | vi | Vietnamese |
he | Hebrew | no | Norwegian | yi | Yiddish |
ha | Hausa | or | Oriya | yo | Yoruba |
hi | Hindi | om | Oromo | zh | Chinese (Simplified) |
hr | Croatian | pa | Punjabi | zh-TW | Chinese (Traditional) |
hu | Hungarian | pl | Polish | | |
Merge field settings
The Step returns the result as a JSON object and stores it in the Merge field variable. Thus you can access the output JSON object from any point of your Flow.
Output example
The Step's output contains information about all detected languages, including their code, name, and confidence score:
{
  "top": {
    "score": 0.986181914806366,
    "langCode": "en",
    "langName": "English"
  },
  "all": [
    {
      "score": 0.986181914806366,
      "langCode": "en",
      "langName": "English"
    }
  ],
  "count": 1
}
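A downstream consumer of this JSON might accept the top language only when the score is high enough. The Python sketch below shows one possible approach. The function name `pick_language` and the `0.8` threshold are assumptions chosen for illustration, not values defined by the Step, and the sketch treats a missing `top` object as "no language detected".

```python
import json


def pick_language(raw_output: str, min_score: float = 0.8):
    """Return the top language code, or None if the score is below min_score.

    Hypothetical helper; min_score is an arbitrary illustrative threshold.
    """
    result = json.loads(raw_output)
    top = result.get("top")

    # Assumption: treat an absent or empty "top" object as no detection.
    if not top or top["score"] < min_score:
        return None
    return top["langCode"]


# Sample taken from the output example above.
sample = """
{
  "top": {"score": 0.986181914806366, "langCode": "en", "langName": "English"},
  "all": [
    {"score": 0.986181914806366, "langCode": "en", "langName": "English"}
  ],
  "count": 1
}
"""

print(pick_language(sample))        # prints: en
print(pick_language(sample, 0.99))  # prints: None
```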
Error Handling
By default, the Step handles errors using a separate exit. If any error occurs during the Step execution, the Flow proceeds down the error exit.
Note: If you disable the Handle error toggle, the Step does not handle errors. In that case, if any error occurs during the Step execution, the Flow fails once the Flow's timeout is exceeded. To keep handling errors in the Flow without the Flow being suspended, place the Flow Error Handling Step before the main Flow logic.
Reporting
The Step reports once after its execution. You can change the Step's log level and add new tags in this section.
Log level
By default, the Step inherits its log level from the Flow's log level. You can change the Step's log level by selecting an appropriate option from the Log level list.
Tags
Tags help organize and filter session information when generating reports. You can specify the tag category, label, and value when adding a new tag.
Service dependencies
- flow builder - v2.28.3
- event-manager - v2.3.0
- deployer - v2.6.0
- comprehend provider - v0.9.0
Release notes
v1.0.0
- Initial release