Provides highly accurate text recognition. In the beta version the accuracy rate is 90 to 95%.
Automatic skew correction detects and corrects skewed page images.
Spell checking during text recognition considerably improves the quality of output.
Image pre-processing automatically detects the orientation of a page.
Sindhi OCR lets you convert scans of documents containing printed text into machine-readable text.
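As an illustration of the skew detection mentioned in the feature list, here is a minimal sketch of one common approach, the projection-profile method. It is not taken from Sindhi OCR itself; the function name `estimate_skew`, its parameters, and the input format (a list of foreground pixel coordinates) are assumptions made for this example. The idea is to try a range of candidate angles and keep the one at which the text pixels collapse into the fewest, densest rows.

```python
import math

def estimate_skew(points, max_angle=10.0, step=0.5):
    """Estimate page skew in degrees from (x, y) foreground pixel coordinates.

    Hypothetical helper for illustration only, not the Sindhi OCR API.
    A text line skewed by `a` degrees is modelled as y = y0 + x * tan(a).
    """
    best_angle, best_score = 0.0, -1.0
    steps = int(round(2 * max_angle / step))
    for i in range(steps + 1):
        angle = -max_angle + i * step
        t = math.radians(angle)
        rows = {}
        for x, y in points:
            # Row index of the pixel after rotating the page back by `angle`.
            r = round(y * math.cos(t) - x * math.sin(t))
            rows[r] = rows.get(r, 0) + 1
        # Sum of squared row counts is largest when pixels pile into few rows.
        score = sum(n * n for n in rows.values())
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle
```

Once the angle is known, the page image can be rotated by the opposite amount before recognition. A real pre-processing stage would work on the binarized scan and would typically use an image library for the rotation; this sketch only shows the angle search.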
Sindhi OCR is built on long short-term memory (LSTM), a recurrent neural network (RNN) architecture, which is not only open source but is also better able to handle a language corpus of millions of words. It covers everything from the most frequently used words to whole corpora, including the dictionary, the positioning of glyphs, and the airabs (diacritical marks) of languages based on the Arabic alphabet.
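For readers unfamiliar with LSTM, the following is a minimal sketch of a single time step of a one-unit LSTM cell, written with scalars so that the gate arithmetic is visible. It illustrates the general architecture only and is not the network used in Sindhi OCR; the function `lstm_step` and the gate names in the weight dictionaries are assumptions made for this example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One time step of a single-unit LSTM cell (scalar input and state).

    W maps a gate name to (weight on x, weight on h_prev);
    b maps a gate name to its bias. Names are illustrative only.
    """
    def pre(gate):
        wx, wh = W[gate]
        return wx * x + wh * h_prev + b[gate]

    f = sigmoid(pre("forget"))   # how much of the old cell state to keep
    i = sigmoid(pre("input"))    # how much of the new candidate to write
    o = sigmoid(pre("output"))   # how much of the cell state to expose
    g = math.tanh(pre("cand"))   # candidate value for the cell state
    c = f * c_prev + i * g       # updated cell state (the long-term memory)
    h = o * math.tanh(c)         # updated hidden state (the step's output)
    return h, c
```

In an OCR setting the input at each step would be a feature vector taken from a narrow slice of the text-line image rather than a scalar, and the cell state is what lets the network carry context across a whole word or ligature.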
Two more fonts will be added in the next release of the OCR.
Beta release (version 1.0): recognizes the Sindhi alphabet, glyphs, and ligatures. English letters, bullets, and punctuation marks are also trained.
Fonts supported:
1. MB Lateefi
2. Awami
3. Adabi