2024 : 11 : 24
Meysam Roostaee

Academic rank: Assistant Professor
ORCID:
Education: PhD.
ScopusId:
HIndex:
Faculty: Faculty of Technology and Engineering
Address: University of Mazandaran
Phone: 01135305141

Research

Title
An effective approach to candidate retrieval for cross-language plagiarism detection: A fusion of conceptual and keyword-based schemes
Type
JournalPaper
Keywords
Plagiarism detection, Cross-language plagiarism, Candidate retrieval, Conceptual model, Keyword-based model
Year
2020
Journal
Information Processing and Management
DOI
Researchers
Meysam Roostaee, Mohammad Hadi Sadreddini, Seyed Mostafa Fakhrahmad

Abstract

Due to the rapid growth of documents and manuscripts in various languages all over the world, plagiarism detection has become a challenging task, especially for cross-lingual cases. For this reason, today's plagiarism detection systems use a candidate retrieval process as the first step, in order to reduce the set of documents for comparison to a reasonable number. The performance of the second step of plagiarism detection, which is devoted to a detailed analysis of the candidates, is tightly dependent on the candidate retrieval phase. Given its importance, the present study focuses on the candidate retrieval task and aims to accurately extract a minimal set of highly probable source documents. The paper proposes a fusion of concept-based and keyword-based retrieval models for this purpose. A dynamic interpolation factor is used in the proposed scheme to combine the results of the conceptual and bag-of-words models. The effectiveness of the proposed model for cross-language candidate retrieval is compared with state-of-the-art models over German-English and Spanish-English language partitions. The results show that the proposed candidate retrieval model outperforms the state-of-the-art models and can be considered a suitable choice for embedding in cross-language plagiarism detection systems.
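
The fusion idea in the abstract can be illustrated with a minimal sketch of linear score interpolation between two rankers. This is not the paper's implementation: the abstract does not say how the dynamic interpolation factor is computed, so the sketch simply takes it as a per-query parameter `lam`. The function names `min_max_normalize` and `fuse_scores`, the min-max normalization step, and the document ids are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so the two models are comparable (an assumption)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All documents scored equally; treat them as equally relevant.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse_scores(
    conceptual: Dict[str, float],
    keyword: Dict[str, float],
    lam: float,
) -> List[Tuple[str, float]]:
    """Combine two score maps as lam * conceptual + (1 - lam) * keyword.

    `lam` stands in for the dynamic interpolation factor; how the paper
    derives it per query is not given in the abstract, so it is passed in.
    Documents missing from one model contribute 0.0 for that model.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    c = min_max_normalize(conceptual)
    k = min_max_normalize(keyword)
    fused = {
        doc: lam * c.get(doc, 0.0) + (1.0 - lam) * k.get(doc, 0.0)
        for doc in set(c) | set(k)
    }
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical scores for four candidate source documents.
conceptual_scores = {"d1": 1.0, "d2": 0.5, "d3": 0.0}
keyword_scores = {"d1": 2.0, "d2": 8.0, "d4": 5.0}

ranking = fuse_scores(conceptual_scores, keyword_scores, lam=0.5)
candidates = [doc for doc, _ in ranking[:2]]  # keep the top-2 candidates
```

With `lam=0.5` both models carry equal weight; setting `lam` closer to 1.0 favors the conceptual model and closer to 0.0 favors the keyword model, which is the degree of freedom a per-query factor would adjust.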