PAWS Meeting 2007-09-28
Paper presentation by Danielle
Khan, L., McLeod, D., and Hovy, E. 2004. Retrieval effectiveness of an ontology-based model for information selection. The VLDB Journal 13, 1 (Jan. 2004), 71-85. DOI= http://dx.doi.org/10.1007/s00778-003-0105-1
Technology in the field of digital media generates huge amounts of nontextual information, audio, video, and images, along with more familiar textual information. The potential for exchange and retrieval of information is vast and daunting. The key problem in achieving efficient and user-friendly retrieval is the development of a search mechanism to guarantee delivery of minimal irrelevant information (high precision) while insuring relevant information is not overlooked (high recall). The traditional solution employs keyword-based search. The only documents retrieved are those containing user-specified keywords. But many documents convey desired semantic information without containing these keywords. This limitation is frequently addressed through query expansion mechanisms based on the statistical co-occurrence of terms. Recall is increased, but at the expense of deteriorating precision. Focusing on audio data, we have constructed a demonstration prototype. We have experimentally and analytically shown that our model, compared to keyword search, achieves a significantly higher degree of precision and recall. The techniques employed can be applied to the problem of information selection in all media types.
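Note on the metrics: the abstract contrasts precision (how little irrelevant material is retrieved) with recall (how little relevant material is missed). The sketch below is only an illustration of that trade-off, not the paper's method or data. The document IDs and the function name precision_recall are made up for the example; the "expanded" result set stands in for a query widened by co-occurring terms.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved document IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical toy collection: four documents are actually relevant.
relevant = {"d1", "d2", "d3", "d4"}

# Plain keyword search: only documents containing the literal keywords.
keyword_hits = {"d1", "d2"}

# Expanded query: finds one more relevant document, plus three irrelevant ones.
expanded_hits = {"d1", "d2", "d3", "d5", "d6", "d7"}

kw = precision_recall(keyword_hits, relevant)
ex = precision_recall(expanded_hits, relevant)

print("keyword search: precision=%.2f recall=%.2f" % kw)
print("expanded query: precision=%.2f recall=%.2f" % ex)
```

In this toy case the expanded query raises recall but lowers precision, which is the behaviour the abstract attributes to statistical query expansion.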
Discussion
Decided to keep reading logs in private www.wordpress.com blogs that are visible only to group members.