Hidden Markov Model for Text Classification

Md. Fantacher Islam

Khulna, Khulna Division

Here, a hidden Markov model is used to classify documents using natural language processing. A spam-ham data set was used as the primary data set.

Project status: Under Development

Artificial Intelligence

Groups
Student Developers for AI

Overview / Usage

This model can be applied to any kind of document classification task, such as sentiment analysis.

Methodology / Approach

One hidden Markov model is created and trained for each category. Once the models are trained, a new document d is classified by first formatting it into an ordered word list Ld, in the same way as during training. Then, because words are treated as observations in T-HMM, we calculate the probability (likelihood) of the word sequence Ld being produced by each of the two HMMs. That is, P(Ld|λR) and P(Ld|λN) are computed, where λR is the model for relevant documents and λN is the model for non-relevant documents. The output class for document d is the class represented by the HMM with the higher likelihood.
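
The classification step described above can be sketched roughly as follows. This is a minimal illustration and not the code from the repository: it assumes each trained model is available as a tuple of start, transition, and emission probabilities, it scores a word list with the standard forward algorithm in log space, and it picks the label with the higher likelihood. The function names (log_likelihood, classify), the toy two-state parameters, the example words, and the floor value for unseen words are all hypothetical.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_likelihood(words, start, trans, emit, floor=1e-6):
    """Forward algorithm in log space: returns log P(words | model).

    start[s]    : probability of starting in state s
    trans[p][s] : probability of moving from state p to state s
    emit[s]     : dict mapping word -> emission probability in state s
    floor       : assumed probability for words unseen in training
    """
    if not words:
        raise ValueError("the word list must not be empty")
    n = len(start)

    def log_emit(state, word):
        return math.log(emit[state].get(word, floor))

    # Initialisation with the first word.
    alpha = [math.log(start[s]) + log_emit(s, words[0]) for s in range(n)]
    # Recursion over the remaining words.
    for word in words[1:]:
        alpha = [
            log_sum_exp([alpha[p] + math.log(trans[p][s]) for p in range(n)])
            + log_emit(s, word)
            for s in range(n)
        ]
    # Termination: sum over all final states.
    return log_sum_exp(alpha)


def classify(words, models):
    """Return the label whose HMM gives the word list the highest likelihood."""
    return max(models, key=lambda label: log_likelihood(words, *models[label]))


# Toy parameters (hypothetical, not learned from any data set).
models = {
    "relevant": (
        [0.6, 0.4],
        [[0.7, 0.3], [0.4, 0.6]],
        [{"free": 0.5, "win": 0.4, "meeting": 0.1},
         {"free": 0.3, "win": 0.3, "meeting": 0.4}],
    ),
    "non_relevant": (
        [0.5, 0.5],
        [[0.6, 0.4], [0.5, 0.5]],
        [{"free": 0.1, "win": 0.1, "meeting": 0.8},
         {"free": 0.2, "win": 0.1, "meeting": 0.7}],
    ),
}

print(classify(["free", "win", "free"], models))
print(classify(["meeting", "meeting"], models))
```

Working in log space avoids numerical underflow on long documents, where the product of many small probabilities would otherwise round to zero. The sketch assumes all start and transition probabilities are strictly positive.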

Technologies Used

Python 3.6.5
IDE: Spyder
Library: NLP

Repository

https://github.com/FantacherJOY/Hidden-Markov-Model-for-NLP
