Prototype Classification: Insights from Machine Learning (2009)
Graf, A.B.A., Bousquet, O., Rätsch, G., Schölkopf, B.
We shed light on the discrimination between patterns belonging to two different classes by casting this decoding problem into a generalized prototype framework. The discrimination process is then...
An Empirical Analysis of Domain Adaptation Algorithms for Genomic Sequence Analysis (2009)
Schweikert, G., Widmer, C., Schölkopf, B., Rätsch, G.
We study the problem of domain transfer for a supervised classification task in mRNA splicing. We consider a number of recent domain transfer methods from machine learning, including some that are...
PALMA: Perfect Alignments using Large Margin Algorithms (2008)
Abstract: Despite many years of research on how to properly align sequences in the presence of sequencing errors, alternative splicing and micro-exons, the correct alignment of mRNA sequences to...
A. Zien, G. Rätsch, S. Mika, B. Schölkopf, T. Lengauer
Motivation: In order to extract protein sequences from nucleotide sequences, it is an important step to recognize points at which regions start that code for proteins. These points are called...
A. Zien, G. Rätsch, S. Mika, B. Schölkopf, C. Lemmen, A. Smola
Abstract: In order to extract protein sequences from nucleotide sequences, it is an important step to recognize points from which regions encoding proteins start, the so-called translation initiation...
Sonnenburg, S., Zien, A., Philips, P., Rätsch, G.
Motivation: At the heart of many important bioinformatics problems, such as gene finding and function prediction, is the classification of biological sequences. Frequently the most accurate...
Support vector machines and kernels for computational biology (2008)
Ben-Hur, A., Ong, C.S., Sonnenburg, S., Schölkopf, B., Rätsch, G.
-
Laubinger, S., Zeller, G., Henz, S.R., Sachsenberg, T., Widmer, C.K., Naouar, N., ...
Gene expression maps for model organisms, including Arabidopsis thaliana, have typically been created using gene-centric expression arrays. Here, we describe a comprehensive expression atlas,...
Improving the Caenorhabditis elegans genome annotation using machine learning (2007)
Rätsch, G., Sonnenburg, S., Srinivasan, J., Witte, H., ...
For modern biology, precise genome annotations are of prime importance, as they allow the accurate definition of genic regions. We employ state-of-the-art machine learning methods to assay and...
PALMA: mRNA to genome alignments using large margin algorithms (2007)
Schulze, U., Hepp, B., Ong, C.S., Rätsch, G.
Motivation: Despite many years of research on how to properly align sequences in the presence of sequencing errors, alternative splicing and micro-exons, the correct alignment of mRNA sequences to...
The need for open source software in machine learning (2007)
Sonnenburg, S., Braun, M.L., Ong, C.S., Bengio, S., Bottou, L., Holmes, G., ...
Open source tools have recently reached a level of maturity which makes them suitable for building large-scale real-world systems. At the same time, the field of machine learning has developed a...
Common Sequence Polymorphisms Shaping Genetic Diversity in Arabidopsis thaliana (2007)
Clark, R.M., Schweikert, G., Toomajian, C., Ossowski, S., Zeller, G., Shinn, P., ...
The genomes of individuals from the same species vary in sequence as a result of different evolutionary processes. To examine the patterns of, and the forces shaping, sequence variation in...
Accurate Splice site Prediction Using Support Vector Machines (2007)
Sonnenburg, S., Schweikert, G., Philips, P., Behr, J., Rätsch, G.
Background: For splice site recognition, one has to solve two classification problems: discriminating true from decoy splice sites for both acceptor and donor sites. Gene finding systems typically...
Upstream of TSS: promoter containing transcription factor (2007)
Petra Philips, G. Rätsch, C. S. Ong, ...
POL II binds to a rather vague region of ≈ [−20, +20] bp
ARTS: Accurate Recognition of Transcription Starts in Human (2006)
Sonnenburg, S., Zien, A., Rätsch, G.
Motivation: One of the most important features of genomic DNA are the protein-coding genes. While it is of great value to identify those genes and the encoded proteins, it is also crucial to...
Graph Based Semi-Supervised Learning with Sharper Edges (2006)
Shin, H., Hill, N.J., Rätsch, G.
In many graph-based semi-supervised learning algorithms, edge weights are assumed to be fixed and determined by the data points' (often symmetric) relationships in input space, without...
Large Scale Multiple Kernel Learning (2006)
Sonnenburg, S., Rätsch, G., Schäfer, C., Schölkopf, B.
While classical kernel-based learning algorithms are based on a single kernel, in practice it is often desirable to use multiple kernels. Lanckriet et al. (2004) considered conic combinations of...
Learning interpretable SVMs for biological sequence classification (2006)
Rätsch, G., Sonnenburg, S., Schäfer, C.
Background: Support Vector Machines (SVMs) - using a variety of string kernels - have been successfully applied to biological sequence classification problems. While SVMs achieve high classification...
ARTS: Accurate Recognition of Transcription Starts in Human (2006)
Sonnenburg, S., Zien, A., Rätsch, G.
We develop new methods for finding transcription start sites (TSS) of RNA Polymerase II binding genes in genomic DNA sequences. Employing Support Vector Machines with advanced sequence kernels, we...
Matrix Exponential Updates for On-line Learning and Bregman Projection (2005)
Tsuda, K., Rätsch, G., Warmuth, M.K., Saul, L.K., Weiss, Y., Bottou, L.
Classifying 'drug-likeness' with kernel-based learning methods (2005)
Rätsch, G., Sonnenburg, S., Mika, S., Grimm, M., Heinrich, N.
In this article we report about a successful application of modern machine learning technology, namely Support Vector Machines, to the problem of assessing the 'drug-likeness' of a chemical from a...
RASE: Recognition of alternatively spliced exons in C.elegans (2005)
Rätsch, G., Sonnenburg, S., Schölkopf, B.
Motivation: Eukaryotic pre-mRNAs are spliced to form mature mRNA. Pre-mRNA alternative splicing greatly increases the complexity of gene expression. Estimates show that more than half of the human...
Learning interpretable SVMs for biological sequence classification (2005)
Sonnenburg, S., Rätsch, G., Schäfer, C.
We propose novel algorithms for solving the so-called Support Vector Multiple Kernel Learning problem and show how they can be used to understand the resulting support vector decision function. While...
Efficient margin maximizing with boosting (2005)
AdaBoost produces a linear combination of base hypotheses and predicts with the sign of this linear combination. The linear combination may be viewed as a hyperplane in feature space where the base...
G. Rätsch, S. Sonnenburg, B. Schölkopf
Vol. 21 Suppl. 1 2005, pages i369–i377 doi:10.1093/bioinformatics/bti1053
Large scale genomic sequence svm classifiers (2005)
S. Sonnenburg, G. Rätsch, B. Schölkopf
Abstract. In genomic sequence analysis tasks like splice site recognition or promoter identification, large amounts of training sequences are available, and indeed needed to achieve sufficiently high...
Learning interpretable SVMs for biological sequence classification (2005)
S. Sonnenburg, G. Rätsch, C. Schäfer
Abstract. We propose novel algorithms for solving the so-called Support Vector Multiple Kernel Learning problem and show how they can be used to understand the resulting support vector decision...
Introduction to Statistical Learning Theory (2004)
Bousquet, O., Boucheron, S., Lugosi, G. (eds.: Bousquet, O., von Luxburg, U., Rätsch, G.)
Concentration Inequalities (2004)
Boucheron, S., Lugosi, G., Bousquet, O. (eds.: Bousquet, O., von Luxburg, U., Rätsch, G.)
Gaussian Processes in Machine Learning (2004)
Rasmussen, C.E. (eds.: Bousquet, O., von Luxburg, U., Rätsch, G.)
We give a basic introduction to Gaussian Process regression models. We focus on understanding the role of the stochastic process and how it is used to define a distribution over functions. We present...
Mika, S., Rätsch, G., Weston, J., Schölkopf, B., Smola, A.J.
We incorporate prior knowledge to construct nonlinear algorithms for invariant feature extraction and discrimination. Employing a unified framework in terms of a nonlinearized variant of the Rayleigh...
Sparse regression ensembles in infinite and finite hypothesis spaces (2002)
Rätsch, G., Demiriz, A., Bennett, K.P.
We examine methods for constructing regression ensembles based on a linear program (LP). The ensemble regression function consists of linear combinations of base hypotheses generated by some...
A new discriminative kernel from probabilistic models (2002)
Tsuda, K., Kawanabe, M., Rätsch, G., Sonnenburg, S.
Recently, Jaakkola and Haussler (1999) proposed a method for constructing kernel functions from probabilistic models. Their so-called Fisher kernel has been combined with discriminative classifiers...
New methods for splice site recognition (2002)
Sonnenburg, S., Rätsch, G., Jagota, A.K.
Splice sites are locations in DNA which separate protein-coding regions (exons) from noncoding regions (introns). Accurate splice site detectors thus form important components of computational gene...
New methods for splice site recognition (2002)
S. Sonnenburg, G. Rätsch, A. Jagota
Abstract. Splice sites are locations in DNA which separate protein-coding regions (exons) from noncoding regions (introns). Accurate splice site detectors thus form important components of...
On the convergence of leveraging (2001)
Rätsch, G., Mika, S., Warmuth, M.
We give a unified convergence analysis of ensemble learning methods including e.g. AdaBoost, Logistic Regression and the Least-Square-Boost algorithm for regression. These methods have in common...
Soft Margins for AdaBoost (2001)
Recently ensemble methods like ADABOOST have been applied successfully in many problems, while seemingly defying the problems of overfitting. ADABOOST rarely overfits in the low noise regime,...
An Introduction to Kernel-Based Learning Algorithms (2001)
Mika, S., Rätsch, G., Tsuda, K., Schölkopf, B.
This paper provides an introduction to support vector machines (SVMs), kernel Fisher discriminant analysis, and kernel principal component analysis (PCA), as examples for successful kernel-based...
Learning to predict the leave-one-out error of kernel based classifiers (2001)
Tsuda, K., Rätsch, G., Mika, S.
We propose an algorithm to predict the leave-one-out (LOO) error for kernel based classifiers. To achieve this goal with computational efficiency, we cast the LOO error approximation task into a...
SVM and boosting. One class (2000)
Rätsch, G., Schölkopf, B., Mika, S.
We show via an equivalence of mathematical programs that a Support Vector (SV) algorithm can be translated into an equivalent boosting-like algorithm and vice versa. We exemplify this translation...
Time series prediction using SV regression and neural networks (2000)
Smola, A.J., Rätsch, G., Schölkopf, B., Kohlmorgen, J., Vapnik, V.
Invariant feature extraction and classification in kernel spaces (2000)
Mika, S., Rätsch, G., Weston, J., Schölkopf, B., Smola, A.J.
Engineering support vector machine kernels that recognize translation initiation sites (2000)
Zien, A., Rätsch, G., Mika, S., Schölkopf, B., Lengauer, T.
Motivation: In order to extract protein sequences from nucleotide sequences, it is an important step to recognize points at which regions start that code for proteins. These points are called...
Engineering Support Vector Machine Kernels That Recognize Translation Initiation Sites (2000)
A. Zien, G. Rätsch, S. Mika, B. Schölkopf, T. Lengauer
Motivation: In order to extract protein sequences from nucleotide sequences, it is an important step to recognize points at which regions start that code for proteins. These points are called...
Using Support Vector Machines for Time Series Prediction (2000)
A. J. Smola, G. Rätsch, B. Schölkopf, J. Kohlmorgen, V. Vapnik
This paper is an extended version of [12].
Analysis of Nonstationary Time Series by Mixtures of Self-Organizing Predictors (2000)
J. Kohlmorgen, S. Lemm, G. Rätsch
We present a method for the analysis of time series from drifting or switching dynamics. In extension to existing approaches that identify switches or drifts between stationary dynamical modes, the...
Using support vector machines for time series prediction (1999)
Smola, A.J., Rätsch, G., Schölkopf, B., Kohlmorgen, J., Vapnik, V.
Zien, A., Rätsch, G., Mika, S., Schölkopf, B., Lemmen, C., Smola, A.J., ...
Kernel PCA and de-noising in feature spaces (1999)
Mika, S., Schölkopf, B., Smola, A.J., Scholz, M., Rätsch, G.
Input space vs. feature space in kernel-based methods (1999)
Schölkopf, B., Mika, S., Burges, C.J., Knirsch, P., Rätsch, G., ...
An asymptotic analysis of AdaBoost in the binary classification case (1998)
Recent work has shown that combining multiple versions of weak classifiers such as decision trees or neural networks results in reduced test set error. To study this in greater detail, we analyze the...
Predicting Time Series with Support Vector Machines (1997)
Smola, A.J., Rätsch, G., Schölkopf, B., Kohlmorgen, J., Vapnik, V.
Using Support Vector Machines for Time Series Prediction (1997)
Smola, A.J., Rätsch, G., Schölkopf, B., Kohlmorgen, J., Vapnik, V.
Predicting Time Series with Support Vector Machines (1997)
A. J. Smola, G. Rätsch, B. Schölkopf, J. Kohlmorgen, V. Vapnik
Support Vector Machines are used for time series prediction and compared to radial basis function networks. We make use of two different cost functions for Support Vectors: training with (i) an...