MBOI: Discovery of Business Opportunities on the Internet Extended Abstract (2008)
Abstract
We propose a tool for the discovery of business opportunities on the Web, more specifically to help a user find relevant calls for tenders (CFTs), i.e. invitations to contractors to submit a tender for their products or services. Simple keyword-based information retrieval does not capture the relationships in the data, which are needed to answer the complex needs of users. We therefore augment keywords with information extracted through natural language processing and business intelligence tools. Unlike in most systems, this information is used at all stages, in both the back-end and the interface. The benefits are twofold: first, we obtain higher precision in search and classification; second, the user gains access to a deeper level of information. Two challenges are how to discover new CFTs and related documents on the Web, and how to extract information from these documents, given that the Web offers no guarantee on their structure or stability. A major hurdle to the discovery of new documents is the poor degree of “linkedness” between businesses, together with the open topic area, which makes topic-focused Web crawling (Aggarwal et al., 2001) inapplicable. For information extraction, wrappers (Soderland, 1999), i.e. tools that recognise textual and/or structural patterns, have limited success because of the diversity and volatility of Web documents. Since we cannot assume a structure for documents, we exploit information usually contained in CFTs: contracting authority, opening/closing date, location, legal notices, conditions of submission, classification, etc. These can appear marked up with tags or as free text. A first type of information to extract is the so-called named entities (Maynard et al., 2001), i.e.