Extracting Temporal Entity from Urdu Language Text


Article Information

Authors: Daler Ali, Malik Muhammad Saad Missen, Muhammad Ali Memon, Muhammad Ali Nizamani, Asadullah Shaikh

Journal: University of Sindh Journal of Information and Communication Technology

HEC Recognition History
Category From To
Y 2024-10-01 2025-12-31
Y 2023-07-01 2024-09-30
Y 2022-07-01 2023-06-30
Y 2021-07-01 2022-06-30
Y 2020-07-01 2021-06-30

Publisher: University of Sindh, Jamshoro

Country: Pakistan

Year: 2020

Volume: 4

Issue: 3

Language: English

Keywords: Entity Extraction, Urdu Language Text, Dates

Abstract

The detection of temporal entities in natural language text is an interesting information extraction problem. Temporal entities help to estimate authorship dates, enhance information retrieval, detect and track topics in news articles, and improve the experience of electronic news readers. For English, research has addressed the detection and normalization of temporal entities as well as annotation guidelines for them. Research for Urdu lags far behind, and much work remains, especially given the large quantity of Urdu data generated daily on online social networks. In this paper, we propose a rule-based approach to temporal entity extraction for Urdu. Compared with existing Urdu temporal entity extraction approaches, our approach is more accurate and covers all types of Urdu temporal entities. For our experiments we use a publicly available Urdu corpus containing 206 date tags, which we extend by adding 200 Urdu Fully Qualified Date (UFQD) tags. We also introduce a new date type for Urdu, called Urdu Partially Fully Qualified Date. Our system achieves an average precision, recall, and F1-measure of 0.97, 0.98, and 0.98, respectively, for UFQD and Urdu Partially Fully Qualified Date. The challenges posed by other date types in Urdu text, namely deictic and anaphoric dates, are also discussed in detail.
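To illustrate what a rule-based extractor of this kind might look like, here is a minimal Python sketch. It is not the authors' rule set: the month lexicon, the single regular-expression rule, and the names `URDU_MONTHS`, `DATE_RULE`, and `extract_dates` are all hypothetical. The sketch also assumes that a fully qualified date is written as day, month name, year, and that a partially qualified date omits the day. The abstract does not define either type.

```python
import re

# Gregorian month names as commonly written in Urdu.
# Assumed lexicon for illustration; not taken from the paper.
URDU_MONTHS = {
    "جنوری": 1, "فروری": 2, "مارچ": 3, "اپریل": 4,
    "مئی": 5, "جون": 6, "جولائی": 7, "اگست": 8,
    "ستمبر": 9, "اکتوبر": 10, "نومبر": 11, "دسمبر": 12,
}

# Longest names first so the alternation never stops on a shorter name.
_MONTH_ALT = "|".join(sorted(URDU_MONTHS, key=len, reverse=True))

# One hypothetical rule: optional day, month name, four-digit year.
# \d matches both ASCII digits and Urdu digits in Python str patterns.
DATE_RULE = re.compile(
    rf"(?:(?<!\d)(\d{{1,2}})\s+)?({_MONTH_ALT})\s+(\d{{4}})(?!\d)"
)


def extract_dates(text):
    """Return (day, month, year) tuples; day is None when it is absent."""
    found = []
    for match in DATE_RULE.finditer(text):
        day, month, year = match.groups()
        found.append((int(day) if day else None, URDU_MONTHS[month], int(year)))
    return found
```

A real system would need many more rules, for example for numeric date formats, Islamic calendar months, and the deictic and anaphoric expressions the abstract mentions as open challenges.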

