A Dual Autoencoder-like NMF with higher-order graph regularization for topic modeling

Research output: Contribution to journal › Research article › Contributed › peer-review

Contributors

Abstract

Recently, algebraic approaches to topic modeling, such as Non-negative Matrix Factorization (NMF), have received considerable attention due to their interpretability, simplicity, and flexibility. Nonetheless, most current methods perform clustering in only one direction, focusing on either data points (representing documents) or features (representing words), and thus fail to fully exploit the inherent duality that simultaneous co-clustering can provide. To overcome this limitation, we introduce a Dual Autoencoder-like NMF method with Higher-Order Graph Regularization (DAHOG-NMF) for topic modeling. This approach leverages the inherent co-clustering property of NMF to jointly learn latent representations for both documents and words. Building on this foundation and inspired by autoencoder architectures, which learn compact representations by reconstructing input data, DAHOG-NMF encodes and decodes both documents and words, thereby uncovering underlying structures from both domains. Additionally, DAHOG-NMF utilizes a dual graph regularization with second-order nearest-neighbor relationships to simultaneously model the geometric structures of documents and words. By uncovering connections among data samples and features that might be overlooked by first-order neighborhoods, this method builds a higher-order graph regularizer, thereby enhancing the preservation of the local and global structures in both manifolds. Comprehensive experiments on multiple benchmark datasets highlight the effectiveness and clear advantage of DAHOG-NMF compared to various classical and state-of-the-art topic modeling methods. The source code is available at https://github.com/MaryamMajidi93/DAHOG-NMF.
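As a rough illustration of the second-order nearest-neighbor idea mentioned in the abstract, the sketch below builds a k-nearest-neighbor graph, adds neighbor-of-neighbor edges with a smaller weight, and evaluates the usual graph smoothness penalty on a latent representation. It is not taken from the authors' repository. The function names, the weight `alpha` for second-order edges, and the choice of squared Euclidean distance are all assumptions made for this example; the paper's actual construction may differ. The same routine could be applied to document vectors and to word vectors to obtain the two graphs of a dual regularizer.

```python
def knn_graph(points, k):
    """Symmetric k-nearest-neighbor graph as a dict: node -> set of neighbors."""
    n = len(points)
    nbrs = {i: set() for i in range(n)}
    for i in range(n):
        # Squared Euclidean distance from point i to every other point.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(points[i], points[j])), j)
            for j in range(n)
            if j != i
        )
        for _, j in dists[:k]:
            nbrs[i].add(j)
            nbrs[j].add(i)  # symmetrize: an edge in either direction counts
    return nbrs


def second_order_weights(nbrs, alpha=0.5):
    """Weight matrix with first-order edges (1.0) and two-hop edges (alpha).

    alpha is a hypothetical down-weighting factor for neighbor-of-neighbor
    links; it is an assumption of this sketch, not a value from the paper.
    """
    n = len(nbrs)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in nbrs[i]:
            W[i][j] = 1.0
            for h in nbrs[j]:
                # h is reachable in two hops and is not already a direct neighbor.
                if h != i and h not in nbrs[i]:
                    W[i][h] = alpha
    return W


def smoothness_penalty(W, H):
    """Graph regularizer 0.5 * sum_ij W_ij * ||h_i - h_j||^2 for symmetric W.

    This equals tr(H^T L H), where L = D - W is the graph Laplacian.
    """
    n = len(W)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if W[i][j] > 0.0:
                diff = sum((a - b) ** 2 for a, b in zip(H[i], H[j]))
                total += W[i][j] * diff
    return 0.5 * total


# Toy usage: four points on a line, one nearest neighbor each.
pts = [[0.0], [1.0], [2.5], [4.5]]
W = second_order_weights(knn_graph(pts, k=1), alpha=0.5)
```

In this toy example the first-order graph is the chain 0-1-2-3, and the second-order step adds the weaker links 0-2 and 1-3, which a first-order neighborhood alone would miss.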

Details

Original language: English
Article number: 114907
Journal: Knowledge-Based Systems
Volume: 333
Publication status: Published - 30 Jan 2026
Peer-reviewed: Yes

Keywords

  • Autoencoder-like NMF, Encoder-Decoder structure, Higher-order graph learning, Non-negative matrix factorization, Topic modeling