Recognition-of-written-languages
Language modeling is a field where unsupervised learning is widely applied, with text analysis and written language detection among the best-known applications. In this project, we want to be able to identify the language of a given text (French or English). To achieve this, we’re going to use a text data table in which each text has already been labeled by its language. Initially, the aim is to build several models characterizing the different languages, based on the frequency of appearance of symbols (letters) in each language, and then to compare the different models.
Each exercise in the project will involve :
¤ Choose a model
¤ Estimating its parameters
¤ Program it and comment on the results.