Ralf Steinberger

Publication list details

Period

1994 - 2009

Count

55

Co-authors

Linking News Content Across Languages (2009)

Steinberger, Ralf

Proceedings of the 17th Nordic Conference of Computational Linguistics NODALIDA 2009. Editors: Kristiina Jokinen and Eckhard Bick. NEALT Proceedings Series, Vol. 4 (2009), 4-5. © 2009 The editors...

The Selection of Electronic Text Documents Supported by Only Positive Examples (2008)

Bruno Pouliquen, Camelia Ignat, Ralf Steinberger

The European Commission has a freely accessible news monitoring system called the Europe Media Monitor NewsBrief

Final Report for the IPSC Exploratory Research Project Cross-lingual Indexing (4/2001 – 3/2003) (2008)

Ralf Steinberger, Bruno Pouliquen

Cross-lingual information access: providing content descriptors in one language for texts written in another, by assigning Eurovoc thesaurus descriptors automatically.

Extending an Information Extraction Tool Set to Central and Eastern European Languages (2008)

Camelia Ignat, Bruno Pouliquen, António Ribeiro, Ralf Steinberger

Keywords: date recognition; place name recognition; visualisation. In a highly multilingual and multicultural environment such as in the European Commission with soon over twenty official languages, there is an...

Massive multi-lingual corpus compilation: Acquis Communautaire (2008)

Tomaž Erjavec, Camelia Ignat, Bruno Pouliquen, Ralf Steinberger

The paper discusses the compilation of massively multilingual corpora, the EU ACQUIS corpus, and the corpus annotation tool “totale”. The ACQUIS text collection has recently become available on...

The Selection of Electronic Text Documents Supported by Only Positive Examples (2008)

Jan Žižka, Jiří Hroza, Bruno Pouliquen, Camelia Ignat, Ralf Steinberger

The European Commission has a freely accessible news monitoring system called the Europe Media Monitor NewsBrief

Combining Information about Epidemic Threats from Multiple Sources (2008)

Roman Yangarber, Clive Best, Peter Von Etter, Flavio Fuart, David Horby, Ralf Steinberger

This paper describes an on-going effort to combine Information Retrieval (IR) and Information Extraction (IE) technologies, to leverage the benefits provided by both approaches to add value for the...

Massive multi-lingual corpus compilation: Acquis Communautaire and totale (2008)

Tomaž Erjavec, Camelia Ignat, Bruno Pouliquen, Ralf Steinberger

The paper discusses the compilation of massively multilingual corpora, the EU ACQUIS corpus, and the corpus annotation tool “totale”. The ACQUIS text collection has recently become available on...

Algorithms, Design, Experimentation (2008)

Bruno Pouliquen, Ralf Steinberger, Camelia Ignat, Tom De Groeve

In this paper, we describe a system that recognises place names in natural language text and produces geographic maps and animations showing the geographical coverage of texts about a certain subject...

Text Categorization using bibliographic records: beyond Document Content (2008)

Arturo Montejo-Ráez, Ralf Steinberger

This paper studies the use of different sources of information for performing a text classification task. The growing number of digital libraries imposes a review of the available data from those...

Spanish Text (2007)

Ralf Steinberger, Bruno Pouliquen, António Ribeiro, Camelia Ignat

A tool set to retrieve and analyse multilingual texts and to give users cross-lingual information access

Providing Cross-lingual Information Access in Multilingual Text Collections (2007)

Ralf Steinberger, Bruno Pouliquen

on radioactive waste. Agenda: a brief introduction to Computational Linguistics & Language Technology; motivation for our work (customers); goal of the AIM sector’s Language Technology...

Linguistic Applications, including the “Multilingualism” Programme (2007)

Johan Hagman, Domenico Perrotta, Ralf Steinberger, Aristide Varfis

Abstract. This position paper reports on ongoing work where three clustering and visualisation techniques for large document collections – developed at the Joint Research Centre (JRC) – are...

The JRC-Acquis: A multilingual aligned parallel corpus with 20+ languages (2006)

Steinberger, Ralf, Pouliquen, Bruno, Widiger, Anna, Ignat, Camelia, Erjavec, Tomaz, Tufis, Dan, ...

We present a new, unique and freely available parallel corpus containing European Union (EU) documents of mostly legal nature. It is available in all 20 official EU languages, with additional documents...

Automatic annotation of multilingual text collections with a conceptual thesaurus (2006)

Pouliquen, Bruno, Steinberger, Ralf, Ignat, Camelia

Automatic annotation of documents with controlled vocabulary terms (descriptors) from a conceptual thesaurus is not only useful for document indexing and retrieval. The mapping of texts onto the same...

Automatic Identification of Document Translations in Large Multilingual Document Collections (2006)

Pouliquen, Bruno, Steinberger, Ralf, Ignat, Camelia

Texts and their translations are a rich linguistic resource that can be used to train and test statistics-based Machine Translation systems and many other applications. In this paper, we present a...

Cross-lingual keyword assignment (2006)

Steinberger, Ralf

This paper presents a language-independent approach to controlled vocabulary keyword assignment using the EUROVOC thesaurus. Due to the multilingual nature of EUROVOC, the keywords for a document...

Extending an Information Extraction tool set to Central and Eastern European languages (2006)

Ignat, Camelia, Pouliquen, Bruno, Ribeiro, Antonio, Steinberger, Ralf

In a highly multilingual and multicultural environment such as in the European Commission with soon over twenty official languages, there is an urgent need for text analysis tools that use minimal...

Exploiting multilingual nomenclatures and language-independent text features as an interlingua for cross-lingual text analysis applications (2006)

Steinberger, Ralf, Pouliquen, Bruno, Ignat, Camelia

We are proposing a simple, but efficient basic approach for a number of multilingual and cross-lingual language technology applications that are not limited to the usual two or three languages, but...

Geocoding multilingual texts: Recognition, disambiguation and visualisation (2006)

Pouliquen, Bruno, Kimler, Marco, Steinberger, Ralf, Ignat, Camelia, Oellinger, Tamara, Blackler, Ken, ...

We are presenting a method to recognise geographical references in free text. Our tool must work on various languages with a minimum of language-dependent resources, except a gazetteer. The main...

Building and displaying name relations using automatic unsupervised analysis of newspaper articles (2006)

Pouliquen, Bruno, Steinberger, Ralf, Ignat, Camelia, Oellinger, Tamara

We present a tool that, from automatically recognised names, tries to infer inter-person relations in order to present associated people on maps. Based on an in-house Named Entity Recognition tool,...

A tool set for the quick and efficient exploration of large document collections (2006)

Ignat, Camelia, Pouliquen, Bruno, Steinberger, Ralf, Erjavec, Tomaz

We are presenting a set of multilingual text analysis tools that can help analysts in any field to explore large document collections quickly in order to determine whether the documents contain...

Multilingual person name recognition and transliteration (2006)

Pouliquen, Bruno, Steinberger, Ralf, Ignat, Camelia, Temnikova, Irina, Widiger, Anna, Zaghouani, Wajdi, ...

We present an exploratory tool that extracts person names from multilingual news collections, matches name variants referring to the same person, and infers relationships between people based on the...

Navigating multilingual news collections using automatically extracted information (2006)

Steinberger, Ralf, Pouliquen, Bruno, Ignat, Camelia

We are presenting a text analysis tool set that allows analysts in various fields to sieve through large collections of multilingual news items quickly and to find information that is of relevance to...

The JRC-Acquis: A multilingual aligned parallel corpus with 20+ languages (2006)

Ralf Steinberger, Bruno Pouliquen, Anna Widiger, Camelia Ignat, Tomaž Erjavec, Dan Tufiş, ...

We present a new, unique and freely available parallel corpus containing European Union (EU) documents of mostly legal nature. It is available in all 20 official EU languages, with additional...

Building and displaying name relations using automatic unsupervised analysis of newspaper articles (2006)

Bruno Pouliquen, Ralf Steinberger, Camelia Ignat, Tamara Oellinger

We present a tool that, from automatically recognised names, tries to infer inter-person relations in order to present associated people on maps. Based on an in-house Named Entity Recognition tool,...

Geocoding multilingual texts: Recognition, disambiguation and visualisation (2006)

Bruno Pouliquen, Marco Kimler, Ralf Steinberger, Camelia Ignat, Tamara Oellinger, Flavio Fuart, ...

We are presenting a method to recognise geographical references in free text. Our tool must work on various languages with a minimum of language-dependent resources, except a gazetteer. The main...

The JRC-Acquis: A multilingual aligned parallel corpus with 20+ languages (2006)

Ralf Steinberger, Bruno Pouliquen, Anna Widiger, Camelia Ignat, Tomaž Erjavec, Dan Tufiş

We are presenting a new and unique parallel corpus available in all 20 official European Union (EU) languages, with additional documents available for some EU candidate countries. The average size is...

Text categorization using bibliographic records: beyond document content (2005)

Montejo Ráez, Arturo, Ureña López, Luis Alfonso, Steinberger, Ralf

This article studies the use of different sources of information for text classification tasks. Given the growing number of digital libraries, there is a need to review the...

Text categorization using bibliographic records: beyond document content (2005)

Montejo Ráez, Arturo, Ureña López, Luis Alfonso, Steinberger, Ralf

This paper studies the use of different sources of information for performing a text classification task. The growing number of digital libraries imposes a review of the available data from those...

Navigating Multilingual News Collections Using Automatically Extracted Information (2005)

Ralf Steinberger, Bruno Pouliquen, Camelia Ignat

We are presenting a text analysis tool set that allows analysts in various fields to sieve through large collections of multilingual news items quickly and to find information that is of relevance to...

A tool set for the quick and efficient exploration of large document collections (2005)

Camelia Ignat, Ralf Steinberger, Bruno Pouliquen, Tomaž Erjavec

We are presenting a set of multilingual text analysis tools that can help analysts in any field to explore large document collections quickly in order to determine whether the documents contain...

International Conference on Computational Linguistics, CoLing'2004, Geneva, Switzerland, August (2004)

Bruno Pouliquen, Ralf Steinberger, Camelia Ignat, Emilia Käsper, Irina Temnikova

We are presenting a working system for automated news analysis that ingests an average total of 7600 news articles per day in five languages. For each language, the system detects the major news...

Automatic Annotation of Multilingual Text Collections with a Conceptual Thesaurus (2003)

Bruno Pouliquen, Ralf Steinberger, Camelia Ignat

Automatic annotation of documents with controlled vocabulary terms (descriptors) from a conceptual thesaurus is not only useful for document indexing and retrieval. The mapping of texts onto the same...

Automatic identification of document translations in large multilingual document collections (2003)

Bruno Pouliquen, Ralf Steinberger, Camelia Ignat

Texts and their translations are a rich linguistic resource that can be used to train and test statistics-based Machine Translation systems and many other applications. In this paper, we present a...

Cross-lingual Document Similarity Calculation Using the Multilingual Thesaurus EUROVOC (2002)

Ralf Steinberger, Bruno Pouliquen, Johan Hagman

Abstract. We are presenting an approach to calculating the semantic similarity of documents written in the same or in different languages. The similarity calculation is achieved by representing the...

Cross-lingual keyword assignment (2001)

Steinberger, Ralf

This paper presents a language-independent approach to controlled vocabulary keyword assignment using the EUROVOC thesaurus. Due to the multilingual nature of EUROVOC, the keywords for a document...

Cross-lingual keyword assignment (2001)

Steinberger, Ralf

This paper presents a language-independent approach to controlled vocabulary keyword assignment using the EUROVOC thesaurus. Due to the multilingual nature of EUROVOC, the keywords for a document...

Using thesauri for automatic indexing and for the visualisation of multilingual document collections (2000)

Ralf Steinberger, Johan Hagman, Stefan Scheer

Abstract. This article presents an approach for cross-language document comparison and for the visualisation of multilingual document collections. Document comparison usually relies on the...

Approaches to Document Classification and Visualisation (1999)

Johan Hagman, Ralf Steinberger, Domenico Perrotta, Aristide Varfis

In this short paper we present two clustering and visualisation techniques for document collections which have been developed at the Joint Research Centre to support specific users within the...

Automatic selection and ranking of translation candidates (1997)

Ralf Steinberger

Abstract. We propose a method for selecting and ranking translation candidates using as input disambiguated source language expressions with thesaurus-compatible senses. This procedure provides the...

Lexikoneintraege fuer deutsche Adverbien (Dictionary Entries for German Adverbs) (1994)

Steinberger, Ralf

Modifiers in general, and adverbs in particular, are neglected categories in linguistics, and consequently, their treatment in Natural Language Processing poses problems. In this article, we present...

Treating 'Free Word Order' in Machine Translation (1994)

Steinberger, Ralf

In 'free word order' languages, every sentence is embedded in its specific context. Among others, the order of constituents is determined by the categories 'theme', 'rheme' and 'contrastive focus'....

Treating Free Word Order in Machine Translation. Coling (1994)

Ralf Steinberger

In free word order languages, every sentence is embedded in its specific context. The order of constituents is determined by the categories theme, rheme and contrastive focus. This paper shows how to...

Lexikoneinträge für deutsche Adverbien. Tagungsband, KONVENS '94. Verarbeitung natürlicher Sprache (1994)

Ralf Steinberger

Abstract. Modifiers in general, and adverbs in particular, are neglected categories in linguistics, and consequently, their treatment in Natural Language Processing poses problems. In this article, we...

"Automatic Recognition of Theme, Focus and Contrastive Stress", in: Bosch/van der Sandt (1994)

Ralf Steinberger, Paul Bennett

Theme, focus and contrastive stress are categories which are necessary for the solution of several problems in NLP, including word order determination, scope recognition and anaphora resolution....