Publication view

Web-supported Matching and Classification of Business Opportunities (2004)

Abstract
More and more business opportunities are published on the Web; however, it is difficult to collect and process them automatically. This paper describes a tool and techniques to help users discover relevant business opportunities, in particular calls for tenders. The tool includes spidering, information extraction, classification, and a search interface. Our focus in this paper is on classification, which aims to organize calls for tenders into classes so as to facilitate users' browsing. We describe a new approach to the classification of business opportunities on the Web using a language modeling (LM) approach. This use of LM is strongly inspired by its recent success in information retrieval (IR) experiments. However, few attempts have been made so far to use LM for text classification. Our goal is to investigate whether LM can improve text classification. Our experiments are conducted on two corpora: Reuters, which contains newswire articles, and FedBizOpps (FBO), which contains calls for tenders (CFTs) published on the Web. The experimental results show that LM-based classification can significantly improve classification performance on both test corpora, compared with the traditional Naïve Bayes (NB) classifier. In particular, it seems to have a stronger impact on FBO than on Reuters. This result shows that LM can greatly improve classification on the Web.
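
To make the idea of LM-based classification concrete, the following is a minimal sketch, not the authors' implementation. It builds one unigram language model per class and assigns a document to the class whose model gives it the highest log-likelihood. The abstract does not say which smoothing method the paper uses, so the Jelinek-Mercer interpolation with the collection model, the weight `lam`, the function names `train` and `classify`, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def train(docs_by_class):
    """Build per-class token counts, collection-wide counts, and class priors.

    docs_by_class maps a class label to a list of tokenized documents.
    """
    class_counts = {
        label: Counter(tok for doc in docs for tok in doc)
        for label, docs in docs_by_class.items()
    }
    collection = Counter()
    for counts in class_counts.values():
        collection.update(counts)
    n_docs = sum(len(docs) for docs in docs_by_class.values())
    priors = {label: len(docs) / n_docs for label, docs in docs_by_class.items()}
    return class_counts, collection, priors


def classify(tokens, model, lam=0.5):
    """Return the class whose smoothed unigram model best explains the tokens.

    Assumed smoothing (Jelinek-Mercer):
    P(t | c) = (1 - lam) * P_ml(t | c) + lam * P(t | collection)
    """
    class_counts, collection, priors = model
    total = sum(collection.values())
    best_label, best_score = None, -math.inf
    for label, counts in class_counts.items():
        n_class = sum(counts.values())
        score = math.log(priors[label])
        for tok in tokens:
            p_coll = collection[tok] / total
            if p_coll == 0.0:
                continue  # token never seen in training: ignored for every class
            p = (1 - lam) * counts[tok] / n_class + lam * p_coll
            score += math.log(p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data standing in for calls for tenders.
toy = {
    "construction": [["road", "bridge", "repair"], ["bridge", "paving"]],
    "software": [["software", "license", "support"], ["database", "software"]],
}
model = train(toy)
print(classify(["bridge", "repair", "support"], model))
print(classify(["software", "database"], model))
```

In this sketch the smoothing step is the only place where the classifier differs from a standard multinomial Naïve Bayes classifier with additive smoothing, so alternative smoothing methods can be compared by changing the single line that computes `p`.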

Publication details
Download http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.84.5652
Source http://rali.iro.umontreal.ca/Publications/files/WI2004.pdf
Contributor CiteSeerX
Archive CiteSeerX - Scientific Literature Digital Library and Search Engine (United States)
Type text
Language English
Relations 10.1.1.46.1529, 10.1.1.54.6410, 10.1.1.11.9519, 10.1.1.80.8909, 10.1.1.109.2516, 10.1.1.29.1052, 10.1.1.62.8251, 10.1.1.29.3543, 10.1.1.24.4239, 10.1.1.68.7050, 10.1.1.86.1135