Estimating News Coverage Patterns using Latent Dirichlet Allocation (LDA)
The rapid growth of unstructured textual data poses an open challenge for knowledge discovery, which aims to extract desired information from large collections of data. This study presents a system that derives news coverage patterns using a probabilistic model, Latent Dirichlet Allocation (LDA). A pattern is a set of words in the collected data that are likely to appear together in a certain context. The coverage of a pattern is computed as a function of the number of news articles that contain it. A proof-of-concept prototype has been developed to estimate news coverage patterns for one newspaper, The Dawn. The coverage patterns are analyzed from different aspects using a multidimensional data model. The extracted patterns are also illustrated with visual graphs to give an in-depth understanding of the topics covered in the news. The results further assist in identifying the schema related to the newspaper's and journalists' articles.
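As a rough illustration of the coverage step only, the sketch below counts the share of articles that contain every word of a pattern. It does not run LDA itself: the word sets in `patterns` are hypothetical stand-ins for the top words of LDA topics, the articles are invented toy text, and the choice of "fraction of articles containing all pattern words" is an assumed coverage function, not necessarily the one used in the study.

```python
import re
from typing import Dict, FrozenSet, List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def coverage(patterns: Dict[str, FrozenSet[str]],
             articles: List[str]) -> Dict[str, float]:
    """Return, for each pattern, the fraction of articles containing all its words.

    Assumption: a pattern "appears" in an article when every word of the
    pattern occurs somewhere in that article.
    """
    token_sets = [tokenize(article) for article in articles]
    total = len(token_sets)
    result = {}
    for name, words in patterns.items():
        # Count articles whose token set is a superset of the pattern words.
        hits = sum(1 for tokens in token_sets if words <= tokens)
        result[name] = hits / total if total else 0.0
    return result


# Toy articles (invented for illustration, not taken from The Dawn).
articles = [
    "The government announced a new budget for the energy sector.",
    "Cricket team wins the series after a tense final match.",
    "Energy prices rise as the government revises the budget.",
    "Floods damage crops across the province.",
]

# Hypothetical patterns, standing in for top words of LDA topics.
patterns = {
    "economy": frozenset({"government", "budget"}),
    "sports": frozenset({"cricket", "match"}),
}

result = coverage(patterns, articles)
print(result)
```

In a fuller pipeline, the pattern word sets would come from a fitted topic model, and the counts could be grouped by dimensions such as date or author to support the multidimensional analysis described above.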
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Sukkur IBA Journal of Emerging Technologies (SJET) holds the rights to all published papers. Authors retain the copyright. Authors are required to sign the consent to publish & copyright agreement to confirm that the article/paper/work is published solely in SJET, has never been published before, and will not be published in any other journal after its publication in SJET. However, authors and readers may freely read, download, copy, distribute, print, search, or link to the full texts of articles and use them for any other lawful & non-commercial purpose.