GuessLanguage Class Reference
from PyKDE5.sonnet import *
Namespace: Sonnet
Detailed Description
GuessLanguage determines the language of a given text.
GuessLanguage can distinguish between roughly 75 languages for a given string. It is based on a Perl script called Languid, originally written by Maciej Ceglowski <maciej@ceglowski.com>. His script used a two-part heuristic to determine the language. First, the text is checked for the scripts it contains. Then, for each set of languages using those scripts, an n-gram frequency model of each language is compared to a model of the text. The most similar language model is assumed to be the language. If no language is found, an empty string is returned.
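The two-part heuristic described above can be sketched in plain Python. This is only an illustration of the general idea, not the Sonnet implementation: the function names, the use of character trigrams, the rank-based distance, the penalty value, and the tiny training sentences are all assumptions made for the example.

```python
from collections import Counter
import unicodedata


def trigram_ranks(text, top=300):
    """Map the most frequent character trigrams of text to their rank (0 = most frequent)."""
    cleaned = " " + " ".join(text.lower().split()) + " "
    counts = Counter(cleaned[i:i + 3] for i in range(len(cleaned) - 2))
    # Sort by descending count, then alphabetically, so ranks are deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {gram: rank for rank, (gram, _) in enumerate(ordered[:top])}


def rank_distance(sample, model, penalty=300):
    """Sum of rank differences; trigrams missing from the model cost a fixed penalty."""
    return sum(
        abs(rank - model[gram]) if gram in model else penalty
        for gram, rank in sample.items()
    )


def dominant_script(text):
    """Crude script detection: first word of each letter's Unicode name (e.g. LATIN, GREEK)."""
    scripts = Counter(
        unicodedata.name(ch, "UNKNOWN").split()[0] for ch in text if ch.isalpha()
    )
    return scripts.most_common(1)[0][0] if scripts else ""


def guess(text, models):
    """Step 1: pick candidates by script. Step 2: choose the closest trigram model.

    models maps a script name to {language code: trigram model}.
    Returns an empty string when no candidate language exists.
    """
    candidates = models.get(dominant_script(text), {})
    if not candidates:
        return ""
    sample = trigram_ranks(text)
    return min(candidates, key=lambda lang: rank_distance(sample, candidates[lang]))


# Hypothetical toy models built from one sentence each; real models need far more text.
MODELS = {
    "LATIN": {
        "en": trigram_ranks(
            "the quick brown fox jumps over the lazy dog and the cat sat on the mat with the hat"
        ),
        "de": trigram_ranks(
            "der schnelle braune fuchs springt über den faulen hund und die katze sitzt auf der matte"
        ),
    }
}

print(guess("the dog and the cat", MODELS))
print(guess("der hund und die katze", MODELS))
```

Filtering by script first keeps the comparison cheap, because only models for languages written in the detected script are scored.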
- Since:
- 4.3
Methods
| | __init__ (self) |
| QString | identify (self, QString text, QStringList suggestions=QStringList()) |
| | setLimits (self, int maxItems, float minConfidence) |
Method Documentation
__init__ (self)
Constructor. Creates a new GuessLanguage instance. If text is specified, it sets the text to be checked.
- Parameters:
-
text the text that is to be checked
QString identify (self, QString text, QStringList suggestions=QStringList())
Returns the two-letter ISO 639-1 code for the language of the currently set text. A three-letter code is returned only when no two-letter code exists. If text isn't empty, it is set as the text to be checked.
- Parameters:
-
text the text to be identified
- Returns:
list of the presumed languages of the text, sorted by decreasing confidence. An empty list means the language could not be determined with the confidence required by setLimits().
setLimits (self, int maxItems, float minConfidence)
Sets limits on the number of languages returned by identify(). The confidence for each language is computed as the difference between it and the next language on the list, normalized to the 0-1 range. A reasonable value for a fairly sure result is 0.1. The default is to return the best guess regardless of confidence, exactly as after a call to setLimits(1, 0).
- Parameters:
-
maxItems The list returned by identify() will never have more than maxItems items.
minConfidence The list will have only as many items as are needed for their summed confidence to equal or exceed minConfidence.
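How the two limits might interact can be sketched as follows. This is a hypothetical illustration of the description above, not the library's code: the function name apply_limits, the (language, distance) input format, and the choice to normalize by the spread between the best and worst distance are all assumptions.

```python
def apply_limits(scored, max_items=1, min_confidence=0.0):
    """Trim a ranked candidate list according to max_items and min_confidence.

    scored is a list of (language, distance) pairs, best (smallest distance) first.
    Returns the shortest prefix whose summed confidence reaches min_confidence,
    or an empty list if that is not possible within max_items entries.
    """
    if not scored:
        return []
    # Assumed normalizer: spread between the worst and the best distance.
    span = scored[-1][1] - scored[0][1]
    picked, total = [], 0.0
    for i, (lang, dist) in enumerate(scored[:max_items]):
        # Confidence = gap to the next candidate, scaled to the 0-1 range.
        next_dist = scored[i + 1][1] if i + 1 < len(scored) else dist
        total += (next_dist - dist) / span if span else 0.0
        picked.append(lang)
        if total >= min_confidence:
            return picked
    return []


# Hypothetical distances for four candidate languages.
ranked = [("en", 100), ("de", 400), ("fr", 450), ("nl", 500)]

print(apply_limits(ranked, 1, 0.0))  # best guess only, like setLimits(1, 0)
print(apply_limits(ranked, 3, 0.8))  # keeps adding candidates until 0.8 is reached
print(apply_limits(ranked, 1, 0.8))  # one item cannot reach 0.8
```

In this sketch a large gap between the first and second candidate yields a high confidence for the first, so a single item often satisfies a modest minConfidence such as 0.1.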
KDE 5.0 PyKDE API Reference