Automated File Labeling for Heterogeneous Files Organization Using Machine Learning

Sagheer Abbas; Syed Ali Raza; Muhammad Adnan Khan; Atta-Ur-Rahman; Kiran Sultan; Amir Mosavi

doi:10.32604/cmc.2023.032864

Research output: Contribution to journal › Research article › Contributed › peer-review

Contributors

Sagheer Abbas, National College of Business Administration and Economics (Author)
Syed Ali Raza, National College of Business Administration and Economics, Government College University Lahore (Author)
Muhammad Adnan Khan, Riphah International University (Author)
Muhammad Adnan Khan, Gachon University (Author)
Atta-Ur-Rahman, Imam Abdulrahman Bin Faisal University (Author)
Kiran Sultan, King Abdulaziz University (Author)
Amir Mosavi, Óbuda University, Slovak University of Technology, TUD Dresden University of Technology (Author)

Faculty of Civil Engineering

Abstract

File labeling techniques have a long history in analyzing the anthological trends in computational linguistics. The problem of uninformative file names is most acute for files downloaded from the Internet. Currently, most users either rename files manually or leave them with meaningless names, which increases the time needed to find required files and leads to redundant and duplicated user files. No significant work has yet been done on automated file labeling for the organization of heterogeneous user files. A few attempts have been made using topic modeling. However, one major drawback of current topic modeling approaches is that, to achieve better results, they rely on specific language types and domain similarity of the data. In this research, machine learning approaches are employed to analyze and extract information from a heterogeneous corpus. A different file labeling technique is also used to obtain a meaningful and cohesive topic for each file. The results show that the proposed methodology can generate relevant and context-sensitive names for heterogeneous data files and provide additional insight into automated file labeling in operating systems.

Details

Original language	English
Pages (from-to)	3263-3278
Number of pages	16
Journal	Computers, Materials and Continua
Volume	74
Issue number	2
Publication status	Published - 2023
Peer-reviewed	Yes

Keywords

ASJC Scopus subject areas

Biomaterials
Modeling and Simulation
Mechanics of Materials
Computer Science Applications
Electrical and Electronic Engineering

Keywords

Automated file labeling, file organization, machine learning, topic modeling

Research Portal of the TU Dresden
