Anthony Tomasic

Holistic Application Analysis for Update-Independence (2009)

Charles Garrod, Todd Mowry, Amit Manjhi, Anthony Tomasic, Bruce Maggs

Current database performance optimizations stop at the border between the database application and the database system, focusing either on improving the performance of just the database system or the...

RADAR: A Personal Assistant that Learns to Reduce Email Overload (2009)

Michael Freed, Jaime Carbonell, Geoff Gordon, Jordan Hayes, Brad Myers, Daniel Siewiorek, ...

Email client software is widely used for personal task management, a purpose for which it was not designed and is poorly suited. Past attempts to remedy the problem have focused on adding task...

Scalable Query Result Caching for Web Applications (2009)

Anastasia Ailamaki, Charles Garrod, Christopher Olston, Bruce Maggs, Amit Manjhi, Anthony Tomasic, ...

The backend database system is often the performance bottleneck when running web applications. A common approach to scale the database component is query result caching, but it faces the challenge of...

Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence (2008) RADAR: A Personal Assistant that Learns to Reduce Email Overload (2009)

Michael Freed, Jaime Carbonell, Geoff Gordon, Jordan Hayes, Brad Myers, Daniel Siewiorek, ...

Email client software is widely used for personal task management, a purpose for which it was not designed and is poorly suited. Past attempts to remedy the problem have focused on adding task...

Holistic Query Transformations for Dynamic Web Applications (2009)

Amit Manjhi, Charles Garrod, Bruce M. Maggs, Todd C. Mowry, Anthony Tomasic

A promising approach to scaling Web applications is to distribute the server infrastructure on which they run. This approach, unfortunately, can introduce latency between the application and database...

Akamai Technologies (2009)

Anastasia Ailamaki, Charles Garrod, Christopher Olston, Bruce Maggs, Amit Manjhi, Google Inc, ...

The backend database system is often the performance bottleneck when running web applications. A common approach to scale the database component is query result caching, but it faces the challenge of...

Learning Information Intent via Observation (2008)

Anthony Tomasic

Workers in organizations frequently request help from assistants by sending request messages that express information intent: an intention to update data in an information system. Human assistants...

Abstract (2008)

Anthony Tomasic

We discuss the problem of unavailable data sources in the context of two mediator based applications. We discuss the limitations of existing systems with respect to this problem and describe a novel...

Equal Time for Data on the Internet with (2008)

George A. Mihaila, Louiqa Raschid, Anthony Tomasic

Abstract. Many collections of scientific data in particular disciplines are available today around the world. Much of this data conforms to some agreed upon standard for data exchange, i.e., a...

On the Evaluation of Symmetric Publish/Subscribe (2008)

Anthony Tomasic

Traditional publish/subscribe systems offer a range of expressive subscription languages for constraints, but restrict the publish operation to be a single published object that contains only...

A Framework for Classifying Scientific Metadata (2008)

Helena Galhardas, Eric Simon, Anthony Tomasic

The scientific community, public organizations and administrations have generated a large amount of data concerning the environment. There is a need to allow sharing and exchange of this type of...

User constructed data integration via mixed-initiative design (2008)

Anthony Tomasic, John Zimmerman, Ian Hargraves, Roderick Mcmullen

Administrators frequently perform data integration “by hand” on the desktop as part of the execution of administrative tasks. This position paper discusses the application of mixed-initiative...

Synthetic Workload Performance Analysis of Incremental Updates (2008)

Kurt Shoens, Anthony Tomasic, Hector Garcia-Molina

Declining disk and CPU costs have kindled a renewed interest in efficient document indexing techniques. In this paper, the problem of incremental updates of inverted lists is addressed using a...

Parachute Queries in the Presence of Unavailable Data Sources (2008)

Anthony Tomasic

Mediator systems are used today in a wide variety of unreliable environments. When processing a query, a mediator may try to access a data source which is unavailable. In this situation, existing...

An Introduction to the e-XML Data Integration Suite (2008)

Georges Gardarin, Antoine Mensch, Anthony Tomasic

Abstract. This paper describes the e-XML component suite, a modular product

XML/DBC: A Standard API for Access to XML Repositories and Mediators. Invited Panel, 2nd Workshop on Data Integration over the Web (DIWEB’02) (2008)

Anthony Tomasic

The XML marketplace recently has witnessed a rapid growth in the number of XML repositories and mediators [W98] based on XQuery and XPath. In addition to a specification for the query language, a...

Abstract (2008)

Anthony Tomasic, Hector Garcia-Molina

The proliferation of the world's "information highways" has renewed interest in efficient document indexing techniques. In this article, we provide an overview of the issues in parallel...

General Terms (2008)

Anthony Tomasic

In [4] we describe the Virtual Information Officer (VIO), a system designed to determine user intent from natural language messages and assist with task completion. VIO assists users with requests...

The Distributed Information Search Component (Disco) and the World Wide Web (2008)

Anthony Tomasic, Remy Amouroux, Hubert Naacke

Disco is a prototype heterogeneous distributed database that accesses underlying data sources. The Disco prototype currently focuses on three central research problems in the context of these systems....

distributed information retrieval (2008)

Luis Gravano, Héctor García-Molina, Anthony Tomasic

The dramatic growth of the Internet has created a new problem for users: the location of relevant sources of documents. This article presents a framework for (and experimentally analyzes a solution...

WWW 2007 / Track: Browsers and User Interfaces Session: Smarter Browsing Learning Information Intent via Observation (2008)

Anthony Tomasic

Workers in organizations frequently request help from assistants by sending request messages that express information intent: an intention to update data in an information system. Human assistants...

DRAFT-- NOT FOR DISTRIBUTION-- SEE TKDE 1998 FOR FINAL VERSION (2007)

Anthony Tomasic, Louiqa Raschid, Patrick Valduriez

Abstract. Accessing many data sources aggravates problems for users of heterogeneous distributed databases. Database administrators must deal with fragile mediators, that is, mediators with schemas...

Data Engineering (2007)

Laurent Amsaleg, Michael J. Franklin, Anthony Tomasic, Tolga Urhan, ...

In a wide-area environment, the time required to obtain data from remote sources can vary unpredictably due to network congestion, link failure or other problems. Traditional techniques for query...

Unavailable Data Sources in Mediator Based Applications (2007)

Philippe Bonnet, Anthony Tomasic

We discuss the problem of unavailable data sources in the context of two mediator based applications. We discuss the limitations of existing systems with respect to this problem and describe a novel...

Locating and Accessing Data Repositories with WebSemantics (2007)

George A. Mihaila, Louiqa Raschid, Anthony Tomasic

Abstract. Many collections of scientific data in particular disciplines are available today on the World Wide Web. Most of these data sources are compliant with some standard for interoperable...

Retrieval (2007)

Luis Gravano, Anthony Tomasic

The dramatic growth of the Internet has created a new problem for users: location of the relevant sources of documents. This article presents a framework for (and experimentally analyzes a solution...

VIO: A mixed-initiative approach to learning and automating procedural update tasks (2007)

John Zimmerman, Anthony Tomasic, Isaac Simmons, Ian Hargraves, Ken Mohnkern, Jason Cornwell, ...

Today many workers spend too much of their time translating their co-workers’ requests into structures that information systems can understand. This paper presents the novel interaction design and...

Learning to Detect Phishing Emails (2006)

Fette, Ian, Sadeh, Norman, Tomasic, Anthony

There are an increasing number of emails purporting to be from a trusted entity that attempt to deceive users into providing account or identity information, commonly known as phishing emails....

Workflow by Example: Automating Database Interactions via Induction (2006)

Tomasic, Anthony, McGuire, R. M., Myers, Brad

Workflow and data integration engineering are complex and expensive activities. Adding a new workflow to a system often requires lengthy and repeated rounds of software engineering. The time and cost...

Learning to extract gene-protein names from weakly-labeled text in preparation (2006)

Richard C. Wang, Anthony Tomasic, Robert E. Frederking, Isaac Simmons, William W. Cohen, ...

Training a named entity recognizer (NER) has always been a difficult task due to the effort required to generate a significant amount of annotated training data. In this paper, we reduce or eliminate...

Linking Messages and Form Requests (2006)

Anthony Tomasic

Large organizations with sophisticated infrastructures have large form-based systems that manage the interaction between the user community and the infrastructure. In many cases, when a user needs to...

Simultaneous scalability and security for data-intensive Web applications (2006)

Amit Manjhi, Anastassia Ailamaki, Bruce M. Maggs, Todd C. Mowry, Christopher Olston, Anthony Tomasic

For Web applications in which the database component is the bottleneck, scalability can be provided by a third-party Database Scalability Service Provider (DSSP) that caches application data and...

Invalidation clues for database scalability services (2006)

Amit Manjhi, Phillip B. Gibbons, Anastassia Ailamaki, Charles Garrod, Bruce M. Maggs, Todd C. Mowry, ...

For their scalability needs, data-intensive Web applications can use a Database Scalability Service (DBSS), which caches applications’ query results and answers queries on their behalf. One way...

Simultaneous Scalability and Security (2006)

Amit Manjhi, Anastassia Ailamaki, Bruce M. Maggs, Todd C. Mowry, Christopher Olston, ...

For Web applications in which the database component is the bottleneck, scalability can be provided by a third-party Database Scalability Service Provider (DSSP) that caches application data and...

Simultaneous Scalability and Security for Data-Intensive Web Applications (2006)

Amit Manjhi, Anastassia Ailamaki, Bruce M. Maggs, Todd C. Mowry, Christopher Olston, Anthony Tomasic

For Web applications in which the database component is the bottleneck, scalability can be provided by a third-party Database Scalability Service Provider (DSSP) that caches application data and...

Workflow By Example: Automating Database Interactions via Induction (2006)

Anthony Tomasic, R. Martin McGuire, Brad Myers

Workflow and data integration engineering are complex and expensive activities. Adding a new workflow to a system often requires lengthy and repeated rounds of software engineering. The...

the previous technical report. (2006)

Amit Manjhi, Phillip B. Gibbons, Anastassia Ailamaki, Charles Garrod, Bruce M. Maggs, Todd C. Mowry, ...

For their scalability needs, data-intensive Web applications can use a Database Scalability Service (DBSS), which caches applications’ query results and answers queries on their behalf. One way...

Invalidation clues for database scalability services (2006)

Amit Manjhi, Phillip B. Gibbons, Anastassia Ailamaki, Charles Garrod, Bruce M. Maggs, Todd C. Mowry, ...

For their scalability needs, data-intensive Web applications can use a Database Scalability Service (DBSS), which caches applications’ query results and answers queries on their behalf. To address...

Learning to Detect Phishing Emails (2006)

Ian Fette, Norman Sadeh, Anthony Tomasic

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation

Learning to extract gene-protein names from weakly-labeled text in preparation (2006)

Richard C. Wang, Anthony Tomasic, Robert E. Frederking, William W. Cohen

Training a named entity recognizer (NER) has always been a difficult task due to the effort required to generate a significant amount of annotated training data. In this paper, we reduce or eliminate...

Learning to navigate web forms (2004)

Anthony Tomasic, William Cohen, Susan Fussell, John Zimmerman, Marina Kobayashi, Einat Minkov, ...

Given a particular update request to a WWW system, users are faced with the navigation problem of finding the correct form to accomplish the update request. In a large system, such as SAP with about...

GlOSS: Text-source discovery over the internet (1999)

Luis Gravano, Héctor García-Molina, Anthony Tomasic

The dramatic growth of the Internet has created a new problem for users: the location of relevant sources of documents. This article presents a framework for (and experimentally analyzes a solution...

Parachute Queries in the Presence of Unavailable Data Sources (1998)

Bonnet, Philippe, Tomasic, Anthony

Mediator systems are used today in a wide variety of unreliable environments. When processing a query, a mediator may try to access a data source which is unavailable. In this situation, existing...

Scaling access to heterogeneous data sources with DISCO (1998)

Anthony Tomasic, Louiqa Raschid, Patrick Valduriez

Abstract. Accessing many data sources aggravates problems for users of heterogeneous distributed databases. Database administrators must deal with fragile mediators, that is, mediators with schemas...

Dynamic query operator scheduling for wide-area remote access (1998)

Laurent Amsaleg, Michael J. Franklin, Anthony Tomasic

Distributed databases operating over wide-area networks, such as the Internet, must deal with the unpredictable nature of the performance of communication. The response times of accessing remote...

A framework for classifying scientific metadata (1998)

Helena Galhardas, Eric Simon, Anthony Tomasic

The scientific community, public organizations and administrations have generated a large amount of data concerning the environment. There is a need to allow sharing and exchange of this type of...

Leveraging mediator cost models with heterogeneous data sources (1998)

Hubert Naacke, Georges Gardarin, Anthony Tomasic

Distributed systems require declarative access to diverse information sources. One approach to solving this heterogeneous distributed database problem is based on mediator architectures. In these...

Equal Time for Data on the Internet with WebSemantics (1998)

George Mihaila, Louiqa Raschid, Anthony Tomasic

Many collections of scientific data in particular disciplines are available today around the world. Much of this data conforms to some agreed upon standard for data exchange, i.e., a standard...

Leveraging Mediator Cost Models with Heterogeneous Data Sources (1998)

Hubert Naacke, Georges Gardarin, Anthony Tomasic

Distributed systems require declarative access to diverse information sources. One approach to solving this heterogeneous distributed database problem is based on mediator architectures. In these...

A Framework for Classifying Scientific Metadata (1998)

Helena Galhardas, Eric Simon, Anthony Tomasic

The scientific community, public organizations and administrations have generated a large amount of data concerning the environment. There is a need to allow sharing and exchange of this type of...

Leveraging Mediator Cost Models with Heterogeneous Data Sources (1998)

Hubert Naacke, Georges Gardarin, Anthony Tomasic, ...

Distributed systems require declarative access to diverse data sources of information. One approach to solving this heterogeneous distributed database problem is based on mediator architectures. In...

Parachute Queries in the Presence of Unavailable Data Sources (1998)

Philippe Bonnet, Anthony Tomasic

Mediator systems are used today in a wide variety of unreliable environments. When processing a query, a mediator may try to access a data source which is unavailable. In this situation, existing...

Partial answers for unavailable data sources (1998)

Anthony Tomasic

Abstract. Many heterogeneous database system products and prototypes exist today; they will soon be deployed in a wide variety of environments. Most existing systems suffer from an Achilles’ heel:...

Partial answers for unavailable data sources (1998)

Anthony Tomasic

Abstract. Many heterogeneous database system products and prototypes exist today; they will soon be deployed in a wide variety of environments. Most existing systems suffer from an Achilles' heel:...

Dynamic Query Operator Scheduling for Wide-Area Remote Access (1997)

Amsaleg, Laurent, Franklin, Michael J., Tomasic, Anthony

Distributed databases operating over wide-area networks, such as the Internet, must deal with the unpredictable nature of the performance of communication. The response times of accessing remote...

Partial Answers for Unavailable Data Sources (1997)

Bonnet, Philippe, Tomasic, Anthony

Many heterogeneous database system products and prototypes exist today; they will soon be deployed in a wide variety of environments. All existing systems suffer from an Achilles' heel: if some...

Dealing with Discrepancies in Wrapper Functionality (1997)

Kapitskaia, Olga, Tomasic, Anthony, Valduriez, Patrick

Much of the world's information is stored electronically in data sources. The data sources can be full-fledged databases, simple files, HTML pages or specialized data sources that possess diverse...

Leveraging Mediator Cost Models with Heterogeneous Data Sources (1997)

Naacke, Hubert, Gardarin, Georges, Tomasic, Anthony

Distributed systems require declarative access to diverse data sources of information. One approach to solving this heterogeneous distributed database problem is based on mediator architectures. In...

Dynamic Query Operator Scheduling for Wide-Area Remote Access (1997)

Amsaleg, Laurent, Franklin, M., Tomasic, Anthony

Distributed databases operating over wide-area networks, such as the Internet, must deal with the unpredictable nature of the performance of communication. The response times of accessing remote...

Data structures for efficient broker implementation (1997)

Anthony Tomasic, Luis Gravano, Calvin Lue, Peter Schwarz, Laura Haas

With the profusion of text databases on the Internet, it is becoming increasingly hard to find the most useful databases for a given query. To attack this problem, several existing and proposed...

Partial Answers for Unavailable Data Sources (1997)

Philippe Bonnet, Anthony Tomasic

Many heterogeneous database system products and prototypes exist today; they will soon be deployed in a wide variety of environments. All existing systems suffer from an Achilles' heel: if...

Dealing with Discrepancies in Wrapper Functionality (1997)

Olga Kapitskaia, Anthony Tomasic, Patrick Valduriez, ...

Much of the world's information is stored electronically in data sources. The data sources can be full-fledged databases, simple files, HTML pages or specialized data sources that possess...

Data Structures for Efficient Broker Implementation (1997)

Anthony Tomasic, Luis Gravano, Calvin Lue, Peter Schwarz, Laura Haas

With the profusion of text databases on the Internet, it is becoming increasingly hard to find the most useful databases for a given query. To attack this problem, several existing and proposed...

Improving Responsiveness for Wide-Area Data Access (1997)

Laurent Amsaleg, Michael J. Franklin, Anthony Tomasic, Tolga Urhan

In a wide-area environment, the time required to obtain data from remote sources can vary unpredictably due to network congestion, link failure or other problems. Traditional techniques for query...

The Distributed Information Search Component (DisCo) and the World Wide Web (1997)

Anthony Tomasic, Rémy Amouroux, Philippe Bonnet, Olga Kapitskaia, Hubert Naacke, Louiqa Raschid

The Distributed Information Search COmponent (Disco) is a prototype heterogeneous distributed database that accesses underlying data sources. The Disco prototype currently focuses on three central...

Data Structures for Efficient Broker Implementation (1997)

Anthony Tomasic, Luis Gravano, Calvin Lue, Peter Schwarz, Laura Haas

In this article, we show that the GlOSS summaries can be employed as the representation for summary information in a large-scale system. In particular, we offer evidence that GlOSS can effectively...

Improving responsiveness for wide-area data access (1997)

Laurent Amsaleg, Michael J. Franklin, Anthony Tomasic, Tolga Urhan

In a wide-area environment, the time required to obtain data from remote sources can vary unpredictably due to network congestion, link failure or other problems. Traditional techniques for query...

Scaling Heterogeneous Databases and the Design of Disco (1996)

Anthony Tomasic

Access to large numbers of data sources introduces new problems for users of heterogeneous distributed databases. End users and application programmers must deal with unavailable data sources....

Scrambling query plans to cope with unexpected delays (1996)

Laurent Amsaleg, Michael J. Franklin, Anthony Tomasic, Tolga Urhan

Accessing numerous widely-distributed data sources poses significant new challenges for query optimization and execution. Congestion or failure in the network introduce highly-variable response times...

Scaling Heterogeneous Databases and the Design of DISCO (1996)

Anthony Tomasic, Louiqa Raschid, Patrick Valduriez

Access to large numbers of data sources introduces new problems for users of heterogeneous distributed databases. End users and application programmers must deal with unavailable data sources....

Scaling Heterogeneous Databases and the Design of DISCO (1996)

Anthony Tomasic, Louiqa Raschid, Patrick Valduriez

Access to large numbers of data sources introduces new problems for users of heterogeneous distributed databases. End users and application programmers must deal with unavailable data sources....

Scaling Heterogeneous Databases and the Design of Disco (1996)

Anthony Tomasic, Louiqa Raschid, Patrick Valduriez

Access to large numbers of data sources introduces new problems for users of heterogeneous distributed databases. End users and application programmers must deal with unavailable data sources....

Data Structures for Efficient Broker Implementation (1996)

Anthony Tomasic, Luis Gravano, Calvin Lue, Peter Schwarz, Laura Haas

With the profusion of text databases on the Internet, it is becoming increasingly hard to find the most useful databases for a given query. To attack this problem, several existing and proposed...

Query Scrambling for Bursty Data Arrival (1996)

Laurent Amsaleg, Michael J. Franklin, Anthony Tomasic

Distributed databases operating over wide-area networks, such as the Internet, must deal with the unpredictable nature of the performance of communication. The response times of accessing remote...

Scrambling Query Plans to Cope With Unexpected Delays (1996)

Laurent Amsaleg, Michael J. Franklin, Anthony Tomasic, Tolga Urhan

Accessing data from numerous widely-distributed sources poses significant new challenges for query optimization and execution. Congestion and failures in the network can introduce highly-variable...

Scaling Heterogeneous Databases and the Design of Disco (1996)

Anthony Tomasic

Access to large numbers of data sources introduces new problems for users of heterogeneous distributed databases. End users and application programmers must deal with unavailable data sources....

Scrambling query plans to cope with unexpected delays (1996)

Laurent Amsaleg, Michael J. Franklin, Anthony Tomasic, Tolga Urhan

Accessing data from numerous widely-distributed sources poses significant new challenges for query optimization and execution. Congestion and failures in the network can introduce highly-variable...

Scaling Heterogeneous Databases and the Design of Disco (1995)

Anthony Tomasic, Louiqa Raschid, Patrick Valduriez

Access to large numbers of data sources introduces new problems for users of heterogeneous distributed databases. End users and application programmers must deal with unavailable data sources....

The Effectiveness of GlOSS for the Text Database Discovery Problem (1994)

Luis Gravano, Hector Garcia-Molina, Anthony Tomasic

The popularity of on-line document databases has led to a new problem: finding which text databases (out of many candidate choices) are the most relevant to a user. Identifying the relevant databases...

Synthetic Workload Performance Analysis of Incremental Updates (1994)

Kurt Shoens, Anthony Tomasic, Hector Garcia-Molina

Declining disk and CPU costs have kindled a renewed interest in efficient document indexing techniques. In this paper, the problem of incremental updates of inverted lists is addressed using a...

Precision and Recall of GlOSS Estimators for Database Discovery (1994)

Luis Gravano, Hector Garcia-Molina, Anthony Tomasic

The availability of large numbers of network information sources has led to a new problem: finding which text databases (out of perhaps thousands of choices) are the most relevant to a query. We call...

Determining Correct View Update Translations via Query Containment (1994)

Anthony Tomasic

Given an intensional database (IDB) and an extension database (EDB), the view update problem translates updates on the IDB into updates on the EDB. One approach to the view update problem uses a...

The Effectiveness of GlOSS for the Text Database Discovery Problem (1994)

Luis Gravano, Hector Garcia-Molina, Anthony Tomasic

The popularity of on-line document databases has led to a new problem: finding which text databases (out of many candidate choices) are the most relevant to a user. Identifying the relevant databases...

Synthetic Workload Performance Analysis of Incremental Updates (1994)

Kurt Shoens, Anthony Tomasic, Hector Garcia-Molina

Declining disk and CPU costs have kindled a renewed interest in efficient document indexing techniques. In this paper, the problem of incremental updates of inverted lists is addressed using a...

Query Processing and Inverted Indices in Shared-Nothing Document Information Retrieval Systems (1993)

Anthony Tomasic, Hector Garcia-Molina

The performance of distributed text document retrieval systems is strongly influenced by the organization of the inverted index. This paper compares the performance impact on query processing of...

Performance of inverted indices in distributed text document retrieval systems (1993)

Anthony Tomasic, Hector Garcia-Molina

The performance of distributed text document retrieval systems is strongly influenced by the organization of the inverted index. This paper compares the performance impact on query processing of...

Performance of inverted indices in shared-nothing distributed text document information retrieval systems (1993)

Anthony Tomasic, Hector Garcia-Molina

The performance of distributed text document retrieval systems is strongly influenced by the organization of the inverted index. This paper compares the performance impact on query processing of various...

Incremental Updates of Inverted Lists for Text Document Retrieval (1993)

Anthony Tomasic, Hector Garcia-Molina, Kurt Shoens

With the proliferation of the world's "information highways" a renewed interest in efficient document indexing techniques has come about. In this paper, the problem of incremental...

Caching and Database Scaling in Distributed Shared-Nothing Information Retrieval Systems (1993)

Anthony Tomasic, Hector Garcia-Molina

A common class of existing information retrieval system provides access to abstracts. For example Stanford University, through its FOLIO system, provides access to the INSPEC database of abstracts of...

Correct View Update Translations via Containment (1993)

Anthony Tomasic

One approach to the view update problem for deductive databases proves properties of translations - that is, a language specifies the meaning of an update to the intensional database (IDB) in terms...

Performance of Inverted Indices in Shared-Nothing Distributed Text Document Information Retrieval Systems (1993)

Anthony Tomasic, Hector Garcia-Molina

The performance of distributed text document retrieval systems is strongly influenced by the organization of the inverted index. This paper compares the performance impact on query processing of...

Incremental Updates of Inverted Lists for Text Document Retrieval (1993)

Anthony Tomasic, Hector Garcia-Molina, Kurt Shoens

With the proliferation of the world's "information highways" a renewed interest in efficient document indexing techniques has come about. In this paper, the problem of incremental...

The Efficacy of GlOSS for the Text Database Discovery Problem (1993)

Luis Gravano, Hector Garcia-Molina, Anthony Tomasic

The popularity of information retrieval has led users to a new problem: finding which text databases (out of thousands of candidate choices) are the most relevant to a user. Answering a given query...

Performance of Inverted Indices in Distributed Text Document Retrieval Systems (1993)

Anthony Tomasic, Hector Garcia-Molina

The performance of distributed text document retrieval systems is strongly influenced by the organization of the inverted index. This paper compares the performance impact on query processing of...

Caching and Database Scaling in Distributed Shared-Nothing Information Retrieval Systems (1993)

Anthony Tomasic, Hector Garcia-Molina

A common class of existing information retrieval system provides access to abstracts. For example Stanford University, through its FOLIO system, provides access to the INSPEC database of abstracts of...

Caching and database scaling in distributed shared-nothing information retrieval systems (1992)

Anthony Tomasic, Hector Garcia-Molina

A common class of existing information retrieval system provides access to abstracts. For example Stanford University, through its FOLIO system, provides access to the INSPEC database of abstracts of...

Query Processing and Inverted Indices in Shared-Nothing Text Document Information Retrieval Systems (1992)

Anthony Tomasic, Hector Garcia-Molina

The performance of distributed text document retrieval systems is strongly influenced by the organization of the inverted text. This article compares the performance impact on query...