Web pages classification: An effective approach based on text mining techniques

Research

Title	Web pages classification: An effective approach based on text mining techniques
Type	Presentation
Keywords	classification, data mining, machine learning, natural language processing, text mining
Year	2017
Researchers	Seyed Moein Babapour, Meysam Roostaee

Abstract

Some web pages on the Internet contain important content that remains useful for a long time or even indefinitely, whereas others are valuable only for a short period. Distinguishing these two types of web pages automatically from their content is difficult, yet it is an important task for improving the performance of search engines and web page recommender engines. In this project, web pages were classified into two categories using machine learning algorithms. Natural language processing and text mining techniques were first used to pre-process the text; appropriate information was then extracted from the texts, and finally the web pages were classified with machine learning algorithms. Compared with other approaches, this project focuses mostly on the text pre-processing stage, and new strategies were presented to fill this gap. The results indicate that the proposed approach performed better than the other approaches.
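The three stages described in the abstract (pre-processing, information extraction, classification) can be illustrated with a minimal sketch. The abstract does not name the specific pre-processing strategies, features, or algorithms used, so everything here is an assumption made for illustration: the stop-word list, the bag-of-words features, the multinomial Naive Bayes classifier, the class labels "long-lived" and "short-lived", and the toy training texts are all hypothetical and are not the authors' method or data.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stop-word list; a real system would use a much fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for", "on", "how"}


def preprocess(text):
    """Pre-processing stage: lowercase, keep alphabetic tokens, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


class NaiveBayes:
    """Multinomial Naive Bayes over bag-of-words counts, with add-one smoothing.

    This is a stand-in classifier chosen for brevity, not the one from the project.
    """

    def fit(self, texts, labels):
        # Feature extraction stage: per-class word counts.
        self.word_counts = defaultdict(Counter)
        self.doc_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(preprocess(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        # Classification stage: pick the class with the highest log-posterior.
        tokens = preprocess(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for token in tokens:
                score += math.log((counts[token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up training data for the two hypothetical categories.
train_texts = [
    "how to tie a bowline knot step by step guide",
    "guide to baking sourdough bread recipe basics",
    "election results announced tonight breaking news",
    "stock market falls today after breaking announcement",
]
train_labels = ["long-lived", "long-lived", "short-lived", "short-lived"]

model = NaiveBayes().fit(train_texts, train_labels)
prediction = model.predict("step by step recipe guide")
print(prediction)
```

Because the project's emphasis is on pre-processing, the `preprocess` function is the part such a sketch would change most in practice; the classifier can be swapped for any other learner without affecting the earlier stages.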
