Michael S. Waterman

Publication list details

Period

1978 - 2009

Count

109

Co-authors

Contents (2009)

Michael S. Waterman

2. Sequence alignments............................................. 1986

On the Length of the Longest Exact Position Match in a Random Sequence (2009)

Gesine Reinert, Michael S. Waterman

Abstract—A mixed Poisson approximation and a Poisson approximation for the length of the longest exact match of a random sequence across another sequence are provided, where the match is required...

Haplotype Block Partition with Limited Resources and Applications. Am. J. Hum. Genet. 73:63–73, 2003 (2009)

Kui Zhang, Fengzhu Sun, Michael S. Waterman, Ting Chen

Recent studies have shown that the human genome has a haplotype block structure such that it can be decomposed into large blocks with high linkage disequilibrium (LD) and relatively limited haplotype...

A Quantile Method for Sizing Optical Maps (2009)

Haifeng Li, Anton Valouev, David C. Schwartz, Michael S. Waterman, Lei M. Li

Optical mapping is an integrated system for the analysis of single DNA molecules. It constructs restriction maps (noted as “optical map”) from individual DNA molecules presented on surfaces after...

AND (2009)

Michael S. Waterman, Thomas H. Byers

Just after he introduced dynamic programming, Richard Bellman with R. Kalaba in 1960 gave a method for finding Kth best policies. Their method has been modified since then, but it is still not...

Accuracy Assessment of Diploid Consensus Sequences (2009)

Jong Hyun Kim, Michael S. Waterman, Lei M. Li

Abstract—If the origins of fragments are known in genome sequencing projects, it is straightforward to reconstruct diploid consensus sequences. In reality, however, this is not true. Although there...

Systematic discovery of functional modules and context-specific functional annotation of human genome (2008)

Yu Huang, Haifeng Li, Haiyan Hu, Xifeng Yan, Michael S. Waterman, Haiyan Huang, ...

Bioinformatics, Vol. 23, ISMB/ECCB 2007, pages i222–i229, doi:10.1093/bioinformatics/btm222

On the length of the longest exact position match in a random sequence (2008)

G. Reinert, Michael S. Waterman

BIOINFORMATICS ORIGINAL PAPER Sequence analysis (2008)

Anton Valouev, Yu Zhang, David C. Schwartz, Michael S. Waterman, Keith A Cr

Motivation: Genomic mutations and variations provide insightful information about the functionality of sequence elements and their association with human diseases. Traditionally, variations are...

Probability Distributions for DNA Sequence Comparisons. Lectures on Mathematics in the Life Sciences, Volume 17, 1986 (2008)

Michael S. Waterman

ABSTRACT. Recently DNA sequence comparisons have focused on finding long matching segments between two sequences, rather than matching the entire sequences. Generalizations of the celebrated...

Dynamic Programming Algorithms for Picture Comparison. Advances in Applied Mathematics 6, 129-134 (1985) (2008)

Michael S. Waterman

Two-dimensional arrays can be compared by a generalization of dynamic programming algorithms for string comparison. Earlier algorithms have computational complexity O(N^6) for comparison of two N x N...

Designer algorithms for cryptogene searches (2008)

Michael S. Waterman, Arndt Von Haeseler

Abstract RNA editing in the mitochondria of kinetoplastid protozoa describes the insertion and (or) deletion of precise numbers of uridines at precise locations in the transcribed RNA. Such genes are...

A Phase Transition for the Score in Matching Random Sequences Allowing Deletions (2008)

Richard Arratia, Michael S. Waterman

We consider a sequence matching problem involving the optimal alignment score for contiguous subsequences, rewarding matches and penalizing for deletions and mismatches. This score is used by...

, AND (2008)

Michael S. Waterman, Temple F. Smith

important mathematical problem in molecular biology. Dynamic programming methods are currently the most useful computer technique but are frequently very expensive in running time. In this paper new...

The Continuing Case of the Florida Dentist (2008)

S. Res, Temple F. Smith, Michael S. Waterman

the free atmosphere, away from the influences of the earth's surface. From all of these measurements will come an understanding of the processes that...

AND (2008)

Michael S. Waterman

For each point of the integer lattice Z^d, let X and Y be independent identically distributed random variables with P(X = Y) = p ∈ (0, 1). Let S(n) be the volume of the largest d-dimensional cube in...

AND (2008)

Marcela D. Perlwitz, Christian Burks, Michael S. Waterman

The genetic code is examined in a new and systematic fashion: we consider the code as mapping of one finite set (the 64 codons) to another (the 20 amino acids). Given a class of mappings simpler than...

A new computational method for detection of chimeric 16S rRNA artifacts generated by PCR amplification from mixed bacterial populations (2007)

George A. Komatsoulis, Michael S. Waterman

A new computational method (chimeric alignment) has been developed to detect chimeric 16S rRNA artifacts generated during PCR amplification from mixed bacterial populations. In contrast to other...

AND (2007)

Temple F. Smith, Michael S. Waterman

Recent sequencing of viral genomes supports the existence of multiframe codon reading. This study considers the restrictions imposed on proteins coded for in overlapping regions. Calculation of...

AND (2007)

Richard Arratia, Michael S. Waterman

Motivated by the comparison of DNA sequences, a generalization is given of the result of Erdős and Rényi on the length R_n of the longest run of heads in the first n tosses of a coin. Consider two...

Darwin Molecular Corp. 2 (2007)

Fengzhu Sun, David Galas, Michael S. Waterman

National Institutes of Health (FS, GS, MSW), and the Guggenheim Foundation (MSW). Modeling in vitro selection-amplification. We construct a mathematical model for in vitro molecular selection with...

A graph-based approach to systematically reconstruct human transcriptional regulatory modules (2007)

Yan, Xifeng, Mehan, Michael R., Huang, Yu, Waterman, Michael S., Yu, Philip S., Zhou, Xianghong Jasmine

Motivation: A major challenge in studying gene regulation is to systematically reconstruct transcription regulatory modules, which are defined as sets of genes that are regulated by a common set of...

Systematic discovery of functional modules and context-specific functional annotation of human genome (2007)

Huang, Yu, Li, Haifeng, Hu, Haiyan, Yan, Xifeng, Waterman, Michael S., Huang, Haiyan, ...

Motivation: The rapid accumulation of microarray datasets provides unique opportunities to perform systematic functional characterization of the human genome. We designed a graph-based approach to...

Diploid genome reconstruction of Ciona intestinalis and comparative analysis with Ciona savignyi (2007)

Kim, Jong Hyun, Waterman, Michael S., Li, Lei M.

One of the main goals in genome sequencing projects is to determine a haploid consensus sequence even when clone libraries are constructed from homologous chromosomes. However, it has been noticed...

Gene Aging Nexus: a web database and data mining platform for microarray data on aging (2007)

Pan, Fei, Chiu, Chi-Hsien, Pulapura, Sudip, Mehan, Michael R., Nunez-Iglesias, Juan, Zhang, Kangyu, ...

The recent development of microarray technology provided unprecedented opportunities to understand the genetic basis of aging. So far, many microarray studies have addressed aging-related expression...

Integrative missing value estimation for microarray data (2006)

Hu, Jianjun, Li, Haifeng, Waterman, Michael S, Zhou, Xianghong

Background: Missing value estimation is an important preprocessing step in microarray analysis. Although several methods have been developed to solve this problem, their performance is...

Refinement of Optical Map Assemblies (original paper) (2006)

Anton Valouev, Yu Zhang, David C. Schwartz, Michael S. Waterman, Keith A Cr

Motivation. Genomic mutations and variations provide insightful information about the functionality of sequence elements and their association with human diseases. Traditionally, variations are...

Refinement of optical map assemblies (2006)

Valouev, Anton, Zhang, Yu, Schwartz, David C., Waterman, Michael S.

Motivation: Genomic mutations and variations provide insightful information about the functionality of sequence elements and their association with human diseases. Traditionally, variations are...

Gene Aging Nexus: a web database and data mining platform for microarray data on aging (2006)

Pan, Fei, Chiu, Chi-Hsien, Pulapura, Sudip, Mehan, Michael R., Nunez-Iglesias, Juan, Zhang, Kangyu, ...

The recent development of microarray technology provided unprecedented opportunities to understand the genetic basis of aging. So far, many microarray studies have addressed aging-related expression...

Alignment of Optical Maps (2005)

Anton Valouev, Lei Li, Yu-chi Liu, David C. Schwartz, Yi Yang, Yu Zhang, ...

We introduce a new scoring method for calculation of alignments of optical maps. Missing cuts, false cuts, and sizing errors present in optical maps are addressed by our alignment score through...

HapBlock: haplotype block partitioning and tag SNP selection software using a set of dynamic programming algorithms (2005)

Zhang, Kui, Qin, Zhaohui, Chen, Ting, Liu, Jun S., Waterman, Michael S., Sun, Fengzhu

Summary: Recent studies have revealed that linkage disequilibrium (LD) patterns vary across the human genome with some regions of high LD interspersed with regions of low LD. Such LD patterns make it...

Whole-genome shotgun assembly and comparison of human genome assemblies (2004)

Istrail, Sorin, Sutton, Granger G., Florea, Liliana, Halpern, Aaron L., Mobarry, Clark M., Lippert, Ross, ...

We report a whole-genome shotgun assembly (called WGSA) of the human genome generated at Celera in 2001. The Celera-generated shotgun data set consisted of 27 million sequencing reads organized in...

The Block Partitioning Results Using Different � and � a (2004)

Kui Zhang, Zhaohui S. Qin, Jun S. Liu, Ting Chen, Michael S. Waterman, Fengzhu Sun, ...

Haplotype reconstruction from SNP alignment (2004)

Lei M. Li, Jonghyun Kim, Michael S. Waterman

In this paper, we describe a method for statistical reconstruction of haplotypes from a set of aligned SNP fragments. We consider the case of a pair of homologous human chromosomes, one from the...

HapBlock: haplotype block partitioning and tag SNP selection software using a set of dynamic programming algorithms (2004)

Zhang, Kui, Qin, Zhaohui, Chen, Ting, Liu, Jun S., Waterman, Michael S., Sun, Fengzhu

Summary: Recent studies have revealed that linkage disequilibrium (LD) patterns vary across the human genome with some regions of high LD interspersed by regions of low LD. Such LD patterns make it...

Haplotype Block Partitioning and Tag SNP Selection Using Genotype Data and Their Applications to Association Studies (2004)

Zhang, Kui, Qin, Zhaohui S., Liu, Jun S., Chen, Ting, Waterman, Michael S., Sun, Fengzhu

Recent studies have revealed that linkage disequilibrium (LD) patterns vary across the human genome with some regions of high LD interspersed by regions of low LD. A small fraction of SNPs (tag SNPs)...

Dynamic Programming Algorithms for Haplotype Block Partitioning: Applications to Human Chromosome 21 Haplotype Data (2003)

Kui Zhang, Fengzhu Sun, Michael S. Waterman, Ting Chen

Recent studies have shown that the human genome has a haplotype block structure such that it can be divided into discrete blocks of limited haplotype diversity. Patil et al. [6] and Zhang et al. [12]...

An eulerian path approach to global multiple alignment for DNA sequences (2003)

Yu Zhang, Michael S. Waterman

With the rapid increase in the dataset of genome sequences, the multiple sequence alignment problem is increasingly important and frequently involves the alignment of a large number of sequences....

Eulerian Assembly and Multiple Alignment (running head) (2003)

Yu Zhang, Michael S. Waterman

We describe an Eulerian path approach to the DNA fragment assembly that was originated by Idury and Waterman 1995, and then advanced by Pevzner et al. 2001b. This combinatorial approach bypasses the...

Estimating the Repeat Structure and Length of DNA Sequences Using ℓ-Tuples (2003)

Li, Xiaoman, Waterman, Michael S.

In shotgun sequencing projects, the genome or BAC length is not always known. We approach estimating genome length by first estimating the repeat structure of the genome or BAC, sometimes of interest...

Predicting Progress in Shotgun Sequencing with Paired Ends (2002)

Yeh, Ru-Fang, Speed, Terence P, Waterman, Michael S, Li, Xiaoman

Paired-end shotgun sequencing has become widely used for large-scale sequencing projects in recent years, including whole genome shotgun sequencing and map-based BAC clone sequencing. Under this...

Local matching of random restriction maps (2001)

Tang, Mengxiang, Waterman, Michael S.

Optical mapping is a new technique to generate restriction maps of DNA easily and quickly. DNA restriction maps can be aligned by comparing corresponding restriction fragment lengths. To relate,...

Probabilistic and Statistical Properties of Words: An Overview (2000)

Gesine Reinert, Sophie Schbath, Michael S. Waterman

In the following, an overview is given on statistical and probabilistic properties of words, as occurring in the analysis of biological sequences. Counts of occurrence, counts of clumps, and renewal...

Probabilistic and Statistical Properties of Words: An Overview (2000)

Gesine Reinert, Sophie Schbath, Michael S. Waterman

this paper, we will focus on the homogeneous models Mm and give existing results for Mm-3. Because these probabilistic models have to be fitted to the observed biological sequence, we will pay...

Statistics in molecular biology: An example from detection of chimeric 16S rRNA artifacts (1997)

George A. Komatsoulis, Michael S. Waterman

Statistical methods have had wide application in molecular biology. Genetic mapping, physical mapping, DNA sequence determination, evolutionary history reconstructions, sequence alignments, and...

Chimeric Alignment By Dynamic Programming: Algorithm and Biological Uses (1997)

George A. Komatsoulis, Michael S. Waterman

A new nearest-neighbor method for detecting chimeric 16S rRNA artifacts generated during PCR amplification from mixed populations has been developed. The method uses dynamic programming to generate...

Chimeric alignment by dynamic programming: Algorithm and biological uses (1997)

George A. Komatsoulis, Michael S. Waterman

A new nearest-neighbor method for detecting chimeric 16S rRNA artifacts generated during PCR amplification from mixed populations has been developed. The method uses dynamic programming to generate...

Simple Maximum Likelihood Methods for the Optical Mapping Problem (1997)

Vlado Dancik, Michael S. Waterman

Recently a new method for obtaining restriction maps was developed by David Schwartz at NYU. Using this method restriction maps are created from fluorescent images of individual molecules obtained...

A Phase Transition for the Minimum Free Energy of Secondary Structures of a Random RNA (1996)

Momiao Xiong, Michael S. Waterman

The free energy of a single-stranded RNA can be calculated by adding the free energies of the components: basepairs, bulges, and loops. Basepairs receive negative free energy while the unpaired bases...

Poisson process approximation for sequence repeats, and sequencing by hybridization (1996)

Richard Arratia, Daniela Martin, Gesine Reinert, Michael S. Waterman

Sequencing by hybridization is a tool to determine a DNA sequence from the unordered list of all l-tuples contained in this sequence; typical numbers for l are l = 8, 10, 12. For theoretical purposes...

A Phase Transition for the Minimum Free Energy of Secondary Structures of a Random RNA. Article No. Ah4960502 (1996)

Momiao Xiong, Michael S. Waterman

The free energy of a single-stranded RNA can be calculated by adding the free energies of the components: basepairs, bulges, and loops. Basepairs receive negative free energy while the unpaired...

Whole genome amplification of single cells: mathematical analysis of PEP and tagged PCR (1995)

Fengzhu Sun, Norman Arnheim, Michael S. Waterman

We construct a mathematical model for two whole genome amplification strategies, primer extension preamplification (PEP) and tagged polymerase chain reaction (Tagged PCR). An explicit formula for the...

Genomic mapping by end-characterized random clones: A mathematical analysis (1995)

Ethan Port, Fengzhu Sun, Daniela Martin, Michael S. Waterman

Physical maps can be constructed by “fingerprinting” a large number of random clones and inferring overlap between clones when the fingerprints are sufficiently similar. Lander and Waterman...

Whole genome amplification of single cells: mathematical analysis of PEP and tagged PCR (1995)

Fengzhu Sun, Norman Arnheim, Michael S. Waterman

We construct a mathematical model for two whole genome amplification strategies, primer extension preamplification (PEP) and tagged polymerase chain...

A new algorithm for DNA sequence assembly (1995)

Ramana M. Idury, Michael S. Waterman

Since the advent of rapid DNA sequencing methods in 1976, scientists have had the problem of inferring DNA sequences from sequenced fragments. Shotgun sequencing is a well-established biological...

Whole genome amplification of single cells: mathematical analysis of PEP and tagged PCR (1995)

Sun, Fengzhu, Arnheim, Norman, Waterman, Michael S.

We construct a mathematical model for two whole genome amplification strategies, primer extension preamplification (PEP) and tagged polymerase chain reaction (tagged PCR). An explicit formula for the...

A method for fast database search for all k-nucleotide repeats (1994)

Gary Benson, Michael S Waterman

A significant portion of DNA consists of repeating patterns of various sizes, from very small (one, two and three nucleotides) to very large (over 300 nucleotides). Although the functions of these...

Sequence comparison significance and Poisson approximation (1994)

Michael S. Waterman, Martin Vingron

Abstract. The Chen-Stein method of Poisson approximation has been used to establish theorems about comparison of two DNA or protein sequences. The most useful result for sequence alignment applies...

A method for fast database search for all k-nucleotide repeats (1994)

Benson, Gary, Waterman, Michael S.

A significant portion of DNA consists of repeating patterns of various sizes, from very small (one, two and three nucleotides) to very large (over 300 nucleotides). Although the functions of these...

Dynamic programming algorithms for restriction map comparison (1992)

Huang, Xiaoqiu, Waterman, Michael S.

For most sequence comparison problems there is a corresponding map comparison algorithm. While map data may appear to be incompatible with dynamic programming, we show in this paper that the rigor...

The distribution of restriction enzyme sites in Escherichia coli (1990)

Churchill, Gary A., Daniels, Donna L., Waterman, Michael S.

A statistical analysis of physical map data for eight restriction enzymes covering nearly the entire genome of E. coli is presented. The methods of analysis are based on a top-down modeling approach...

The match game: new stratigraphic correlation algorithms (1987)

Michael S. Waterman, Robert Raymond

New algorithms for automatic correlation of geologic strata are introduced. The algorithms are extensions of the Smith and Waterman (1980) dynamic programming technique and include several features...

Multiple sequence alignment by consensus (1986)

Waterman, Michael S.

An algorithm for multiple sequence alignment is given that matches words of length and degree of mismatch chosen by the user. The alignment maximizes an alignment scoring function. The method is...

Rigorous Pattern-recognition Methods for DNA Sequences: Analysis of Promoter Sequences from Escherichia coli (1985)

David J. Galas, Mark Eggert, Michael S. Waterman

The basic nature of the sequence features that define a promoter sequence for Escherichia coli RNA polymerase have been established by a variety of biochemical and genetic methods. We have developed...

Renewal theory for several patterns (1985)

Stephen Breen, Michael S. Waterman

Discrete renewal theory is generalized to study the occurrence of a collection of patterns in random sequences, where a renewal is defined to be the occurrence of one of the patterns in the...

Stanislaw M. Ulam’s Contributions to Theoretical Biology (1985)

William A. Beyer, Peter H. Sellers, Michael S. Waterman

Abstract. S. M. Ulam’s contributions to biology are surveyed. The survey covers cellular automata theory, population biology, Fermi-Pasta-Ulam results, pattern recognition, and sequence similarity....

The statistical distribution of nucleic acid similarities (1985)

Smith, Temple F., Waterman, Michael S., Burks, Christian

All pairs of a large set of known vertebrate DNA sequences were searched by computer for most similar segments. Analysis of this data shows that the computed similarity scores are distributed...

Efficient sequence alignment algorithms (1984)

Michael S. Waterman

Sequence alignments are becoming more important with the increase of nucleic acid data. Fitch and Smith have recently given an example where multiple insertion/deletions (rather than a series of...

Some applications of information theory to cellular automata. Phys. D (1984)

Michael S. Waterman

In this paper general deterministic one-dimensional cellular automata are identified with mappings of the unit interval into itself. This allows the machinery of dynamical systems analysis to be...

Determining all optimal and near-optimal solutions when solving shortest path problems by dynamic programming (1984)

Thomas H. Byers, Michael S. Waterman

This paper presents a new algorithm for finding all solutions with objective function values in the neighborhood of the optimum for certain dynamic programming models, including shortest path...

An extreme value theory for long head runs (1984)

Louis Gordon, Mark F. Schilling, Michael S. Waterman

Summary. For an infinite sequence of independent coin tosses with P(Heads) = p ∈ (0, 1), the longest run of consecutive heads in the first n tosses is a natural object of study. We show that the...
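
The quantity studied in this abstract can be explored numerically. The following is a minimal Python sketch, not the authors' method: the function names and the `seed` parameter are illustrative assumptions, and the only outside fact used is the classical Erdős–Rényi result that the longest head run in n tosses grows like the logarithm of n to base 1/p.

```python
import math
import random


def longest_head_run(tosses):
    """Return the length of the longest run of consecutive heads.

    `tosses` is any iterable of booleans, where True means heads.
    """
    best = current = 0
    for is_head in tosses:
        # Extend the current run on heads, reset it on tails.
        current = current + 1 if is_head else 0
        best = max(best, current)
    return best


def erdos_renyi_scale(n, p):
    """First-order growth rate log_{1/p}(n) of the longest head run."""
    return math.log(n) / math.log(1.0 / p)


def simulate(n, p, seed=0):
    """Toss a coin with P(heads) = p exactly n times; return the longest run."""
    rng = random.Random(seed)  # hypothetical seed, only for reproducibility
    return longest_head_run(rng.random() < p for _ in range(n))


if __name__ == "__main__":
    n, p = 100_000, 0.5
    print("simulated longest run:", simulate(n, p))
    print("log_{1/p}(n) scale:   ", round(erdos_renyi_scale(n, p), 2))
```

Comparing the simulated value with `erdos_renyi_scale` over several seeds gives a rough feel for how tightly the longest run concentrates around the logarithmic scale.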

Algorithms for Restriction Map Comparisons (1984)

Michael S. Waterman, Temple F. Smith, Harold L. Katcher

An algorithm is presented which compares two restriction maps, yielding a measure of distance between the maps and relating the maps by an alignment. This new algorithm finds the minimum...

Frequencies of restriction sites (1983)

Waterman, Michael S.

Restriction sites or other sequence patterns are usually assumed to occur according to a Poisson distribution with mean equal to the reciprocal of the probability of the given site or pattern. For...

Hierarchical analysis of influenza A hemagglutinin gene sequences (1982)

Lipman, David J., Smith, Temple F., Beckman, Richard J., Waterman, Michael S.

Five recently sequenced hemagglutinin genes from Influenza A virus strains are studied for similarities in a hierarchical fashion. The sequences are compared for similarity, first on the level of...

Secondary structure of single-stranded nucleic acids (1978)

Michael S. Waterman (dedicated to John R. Kinney)

The primary structure of a single-stranded nucleic acid, such as a tRNA, is the sequence of nucleotides or bases making up the molecule. Secondary structure of such a molecule is a class of graphs in...

Modeling and Optimizing a Gas-Water Reservoir: Enhanced Recovery with Waterflooding (1978)

Mark E. Johnson, Ellis A. Monash, Michael S. Waterman

Accepted practice dictates that waterflooding of gas reservoirs should commence, if ever, only when the reservoir pressure has declined to the minimum production pressure. Analytical proof of this...

Some ergodic properties of multi-dimensional F-expansions (1970)

Michael S. Waterman

This paper is concerned with probabilistic aspects of the expansion of points in n-dimensional Euclidean space. The expansions we consider need not converge although previous work has required...

An Eulerian path approach to DNA fragment assembly

Pevzner, Pavel A., Tang, Haixu, Waterman, Michael S.

For the last 20 years, fragment assembly in DNA sequencing followed the “overlap–layout–consensus” paradigm that is used in all currently available assembly tools. Although this approach...

A dynamic programming algorithm for haplotype block partitioning

Zhang, Kui, Deng, Minghua, Chen, Ting, Waterman, Michael S., Sun, Fengzhu

We develop a dynamic programming algorithm for haplotype block partitioning to minimize the number of representative single nucleotide polymorphisms (SNPs) required to account for most of the common...

Distributional regimes for the number of k-word matches between two random sequences

Lippert, Ross A., Huang, Haiyan, Waterman, Michael S.

When comparing two sequences, a natural approach is to count the number of k-letter words the two sequences have in common. No positional information is used in the count, but it has the virtue that...
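
The position-free word count described in this abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's code: it takes the count to be the number of matching pairs of overlapping k-letter words (a statistic often called D2), and the function names `kword_counts` and `d2` are hypothetical.

```python
from collections import Counter


def kword_counts(seq, k):
    """Count every overlapping k-letter word in `seq`."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a, seq_b, k):
    """Number of k-word matches between two sequences, ignoring position.

    Every occurrence of a word in `seq_a` is paired with every occurrence
    of the same word in `seq_b`, so the total is a sum of products of
    per-word counts.
    """
    counts_a = kword_counts(seq_a, k)
    counts_b = kword_counts(seq_b, k)
    # Counter returns 0 for words absent from seq_b.
    return sum(n * counts_b[word] for word, n in counts_a.items())


if __name__ == "__main__":
    print(d2("ACGTACGT", "TACGTT", 3))
```

Because only word counts are compared, the cost is linear in the two sequence lengths for fixed k, which is the practical appeal of such alignment-free comparisons.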

Whole-genome shotgun assembly and comparison of human genome assemblies

Istrail, Sorin, Sutton, Granger G., Florea, Liliana, Halpern, Aaron L., Mobarry, Clark M., Lippert, Ross, ...

We report a whole-genome shotgun assembly (called WGSA) of the human genome generated at Celera in 2001. The Celera-generated shotgun data set consisted of 27 million sequencing reads organized in...

Sequence alignments in the neighborhood of the optimum with general application to dynamic programming

Waterman, Michael S.

When applying dynamic programming techniques to obtain optimal sequence alignments, a set of weights must be assigned to mismatches, insertion/deletions, etc. These weights are not predetermined,...

Estimating the Repeat Structure and Length of DNA Sequences Using ℓ-Tuples

Li, Xiaoman, Waterman, Michael S.

In shotgun sequencing projects, the genome or BAC length is not always known. We approach estimating genome length by first estimating the repeat structure of the genome or BAC, sometimes of interest...

Haplotype Block Partitioning and Tag SNP Selection Using Genotype Data and Their Applications to Association Studies

Zhang, Kui, Qin, Zhaohui S., Liu, Jun S., Chen, Ting, Waterman, Michael S., Sun, Fengzhu

Recent studies have revealed that linkage disequilibrium (LD) patterns vary across the human genome with some regions of high LD interspersed by regions of low LD. A small fraction of SNPs (tag SNPs)...

An Eulerian path approach to local multiple alignment for DNA sequences

Zhang, Yu, Waterman, Michael S.

Expensive computation in handling a large number of sequences limits the application of local multiple sequence alignment. We present an Eulerian path approach to local multiple alignment for DNA...

Haplotype Block Partition with Limited Resources and Applications to Human Chromosome 21 Haplotype Data

Zhang, Kui, Sun, Fengzhu, Waterman, Michael S., Chen, Ting

Recent studies have shown that the human genome has a haplotype block structure such that it can be decomposed into large blocks with high linkage disequilibrium (LD) and relatively limited haplotype...

Gene Aging Nexus: a web database and data mining platform for microarray data on aging

Pan, Fei, Chiu, Chi-Hsien, Pulapura, Sudip, Mehan, Michael R., Nunez-Iglesias, Juan, Zhang, Kangyu, ...

The recent development of microarray technology provided unprecedented opportunities to understand the genetic basis of aging. So far, many microarray studies have addressed aging-related expression...

An algorithm for assembly of ordered restriction maps from single DNA molecules

Valouev, Anton, Schwartz, David C., Zhou, Shiguo, Waterman, Michael S.

The restriction mapping of a massive number of individual DNA molecules by optical mapping enables assembly of physical maps spanning mammalian and plant genomes; however, not through computational...

Diploid genome reconstruction of Ciona intestinalis and comparative analysis with Ciona savignyi

Kim, Jong Hyun, Waterman, Michael S., Li, Lei M.

One of the main goals in genome sequencing projects is to determine a haploid consensus sequence even when clone libraries are constructed from homologous chromosomes. However, it has been noticed...