Please use this identifier to cite or link to this item: http://ptsldigitalv2.ukm.my:8080/jspui/handle/123456789/476441
Title: Pseudo-relevance feedback for English translated Qur'anic text retrieval
Authors: Yasir Hadi Farhan (P74154)
Supervisor: Shahrul Azman Mohd Noah, Prof. Dr.
Keywords: Information Retrieval
English translation
Qur'anic text
Pseudo-relevance feedback
Dissertations, Academic -- Malaysia
Issue Date: 27-Dec-2016
Description: One of the biggest issues affecting the performance of Information Retrieval (IR) systems is the difficulty users face in defining exactly what their information needs are, since that information may be a gap in their knowledge. The issue is more problematic for classical and literary documents such as the al-Quran. One approach to overcoming it is pseudo-relevance feedback, which assumes that a small number of top-ranked documents in the initial retrieval results are relevant. It selects related terms from these documents to improve the query representation through query expansion. Among the issues in the Quranic text are the ambiguity and complexity of the text. Because of these issues, users need to reformulate and refine their queries to match their information needs. Pseudo-relevance feedback can help relieve these issues. The classic Rocchio algorithm has been widely used to support query reformulation in pseudo-relevance feedback. In this research, a modified Rocchio algorithm was proposed that considers term selection and query importance. It combines term frequency and inverse document frequency (TF-IDF) weights with the Rocchio algorithm's weights in order to generate a new query. It also uses the frequency of terms to choose suitable expansion words. The proposed algorithm was evaluated against the probabilistic IR model implemented in the Lucene toolkit and against the WordNet query expansion approach. The experiments considered only two iterations of relevance feedback. The evaluation used a Quranic dataset previously used by other researchers. Twelve queries were considered during the evaluation. The results of the experiments showed that the proposed method exhibits significant improvement in recall and precision. The average precision through pseudo-relevance feedback was 8.3% for the first iteration and 11.3% for the second iteration, whereas the average precision of Lucene was 3.3% and the average precision of WordNet query expansion was 2.7%. These results show that the proposed method improves retrieval performance.
Note: "Certification of Master's/Doctoral Thesis" is not available.
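The pseudo-relevance feedback loop described in the abstract can be illustrated with a minimal sketch. It is not the thesis's modified algorithm, whose exact formula is not given in this record: it is the classic Rocchio update without a non-relevant term, applied to TF-IDF vectors, and it picks expansion terms by their weight in the centroid of the top-ranked documents. The function names, the toy documents, and the parameters (alpha, beta, top_docs, n_terms) are all assumptions made for illustration.

```python
import math
from collections import Counter

def tokenize(text):
    # Whitespace tokenizer; a real system would also stem and drop stop words.
    return text.lower().split()

def build_idf(docs):
    # Smoothed inverse document frequency for every term in the collection.
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {t: math.log(1 + len(docs) / n) for t, n in df.items()}

def tfidf(text, idf):
    # Sparse TF-IDF vector as a dict; terms unseen in the collection are dropped.
    tf = Counter(tokenize(text))
    return {t: c * idf[t] for t, c in tf.items() if t in idf}

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank(query_vec, doc_vecs):
    # Document indices ordered by descending cosine similarity to the query.
    scored = [(cosine(query_vec, d), i) for i, d in enumerate(doc_vecs)]
    return [i for _, i in sorted(scored, key=lambda p: (-p[0], p[1]))]

def expand_query(query_vec, doc_vecs, top_docs=2, n_terms=2, alpha=1.0, beta=0.75):
    # Pseudo-relevance assumption: treat the top-ranked documents as relevant.
    top = rank(query_vec, doc_vecs)[:top_docs]
    centroid = Counter()
    for i in top:
        for t, w in doc_vecs[i].items():
            centroid[t] += w / len(top)
    # Pick the highest-weighted centroid terms not already in the query.
    candidates = sorted((t for t in centroid if t not in query_vec),
                        key=lambda t: (-centroid[t], t))[:n_terms]
    # Rocchio update: alpha * original query + beta * relevant centroid.
    new_query = {t: alpha * w for t, w in query_vec.items()}
    for t in list(query_vec) + candidates:
        new_query[t] = new_query.get(t, 0.0) + beta * centroid.get(t, 0.0)
    return new_query

# Toy collection (hypothetical, not the dataset used in the thesis).
docs = [
    "mercy forgiveness mercy patience",
    "mercy charity prayer",
    "trade market price",
    "market trade goods",
]
idf = build_idf(docs)
doc_vecs = [tfidf(d, idf) for d in docs]
query_vec = tfidf("mercy", idf)
expanded = expand_query(query_vec, doc_vecs)
second_ranking = rank(expanded, doc_vecs)  # retrieval with the expanded query
```

A second feedback iteration, as in the experiments, would amount to calling expand_query again with the expanded query as input.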
Pages: 103
Publisher: UKM, Bangi
Appears in Collections: Faculty of Information Science and Technology / Fakulti Teknologi dan Sains Maklumat

Files in This Item:
File: ukmvital_85959+SOURCE1+SOURCE1.0.PDF (Restricted Access)
Size: 341.06 kB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.