Publication view

Wrapping Web Pages into XML Documents: A Practical Experience and Comparison of Two Tools (2008)

Abstract
The notion of wrapping a web server into XML documents is driven by the need for structured data that can be used by a variety of applications. The web contains vast amounts of information that is useless to most applications because it mainly targets a human audience. A solution to this would be to automate the browsing process and then convert the extracted information into a more suitable format, such as XML. This is called wrapping. We have used two different tools to wrap several tourist sites into XML. The tools we used are Norfolk, a system developed since 1997 by the CSIRO TED group, and W4F, initially developed at the University of Pennsylvania and now a commercial product. This report describes our practical experience with the tools and compares them. The comparison highlights useful features for a wrapper system to support real applications.

Publication details
Download http://citeseerx.ist.psu.edu/viewdoc/summary?doi=?doi=10.1.1.103.4206
Source http://www-rocq.inria.fr/~vercoust/PAPERS/Wrappers-Ausweb.pdf
Contributor CiteSeerX
Archive CiteSeerX - Scientific Literature Digital Library and Search Engine (United States)
Type text
Language English
Links 10.1.1.42.3821, 10.1.1.43.9419, 10.1.1.1.4674, 10.1.1.132.8947