Indic Ngram Library

What is an N-gram?

An n-gram model is a type of probabilistic model for predicting the next item in a sequence. n-grams are used in various areas of statistical natural language processing and genetic sequence analysis. An n-gram is a subsequence of n items from a given sequence. The items in question can be phonemes, syllables, letters, words or base pairs according to the application. An n-gram of size 1 is referred to as a "unigram"; size 2 is a "bigram" (or, less commonly, a "digram"); size 3 is a "trigram"; and size 4 or more is simply called an "n-gram".
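The definition above can be illustrated with a minimal Python sketch. This is not the library's implementation; the helper name `ngrams` is hypothetical and the code simply slides a window of size n over a sequence of items.

```python
def ngrams(items, n=2):
    """Return every contiguous run of n items as a tuple (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    items = list(items)
    # A sequence of length k has k - n + 1 windows of size n (none if k < n).
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]


# Word-level bigrams: the items are whitespace-separated words.
word_bigrams = ngrams("the quick brown fox".split(), n=2)

# Letter-level trigrams: the items are the characters of one word.
letter_trigrams = ngrams("ngram", n=3)
```

Here `word_bigrams` holds three pairs of adjacent words and `letter_trigrams` holds three runs of three letters. Note that splitting an Indic word character by character separates vowel signs from their consonants, which is presumably why a syllable-level variant is also offered.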

If you want to use this library in your own program, refer to the JSON-RPC based API documentation.

Read more about N-gram

Supported Languages

English, Hindi, Malayalam, Kannada, Bengali

Enter the text below to get its n-grams. For a word n-gram, enter a sentence. The language of each word is detected automatically, so you can enter text in any language, including mixed-language text.


Python ngram API

This service provides the Indic n-gram library through the following methods:
  • Method: modules.Ngram.wordNgram
    • arg1 : the sentence
    • n : n of n-gram (Optional)
    • Return : The ngram for the sentence
  • Method: modules.Ngram.letterNgram
    • arg1 : the word
    • n : n of n-gram (Optional)
    • Return : The ngram for the word
  • Method: modules.Ngram.syllableNgram
    • arg1 : the word
    • n : n of n-gram (Optional)
    • Return : The ngram for the word, with its letters split at the syllable level
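As a rough sketch of how a client might call the methods listed above, the following Python builds a JSON-RPC request body. Only the method names come from this page. The JSON-RPC 2.0 envelope, the positional parameter order (text first, then n), and the helper name `build_request` are assumptions; the service may expect a different version or parameter layout, so check the API documentation before relying on it. No network call is made.

```python
import json


def build_request(method, *params, request_id=1):
    """Build a JSON-RPC request as a dict (hypothetical helper).

    Assumes a JSON-RPC 2.0 envelope with positional params; the real
    service may differ.
    """
    return {
        "jsonrpc": "2.0",
        "method": method,
        "params": list(params),
        "id": request_id,
    }


# Ask for word bigrams of a sentence; n is optional per the method list.
payload = build_request("modules.Ngram.wordNgram", "the quick brown fox", 2)

# ensure_ascii=False keeps Indic text readable in the serialized body.
body = json.dumps(payload, ensure_ascii=False)
```

The resulting `body` string would be sent in an HTTP POST to the service's JSON-RPC endpoint, whose URL is not given on this page.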