Supported Languages

The following table lists the languages supported by the BBF whose pre-processing we have additionally tested for correctness:

| Language Name | ISO 639-1 Code | Pre-processor Source Code | Pre-processing Type |
| --- | --- | --- | --- |
| English | en | Snowball stemmer | Stemmer |
| Spanish | es | Snowball stemmer | Stemmer |
| French | fr | Snowball stemmer | Stemmer |
| German | de | Snowball stemmer | Stemmer |
| Portuguese | pt | Snowball stemmer | Stemmer |
| Catalan | ca | Snowball stemmer | Stemmer |
| Luxembourgish | lb | spellux | Lemmatizer |
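The table can also be read as a lookup from ISO 639-1 code to pre-processor. The following is a minimal sketch of such a lookup; the names `Preprocessor`, `SUPPORTED_LANGUAGES`, and `preprocessor_for` are hypothetical and chosen for illustration only, and do not reflect how the BBF stores this information internally.

```python
from typing import NamedTuple


class Preprocessor(NamedTuple):
    source: str  # tool that provides the pre-processing
    kind: str    # "stemmer" or "lemmatizer"


# Hypothetical lookup mirroring the table above (not the BBF's actual data structure).
SUPPORTED_LANGUAGES = {
    "en": Preprocessor("Snowball stemmer", "stemmer"),
    "es": Preprocessor("Snowball stemmer", "stemmer"),
    "fr": Preprocessor("Snowball stemmer", "stemmer"),
    "de": Preprocessor("Snowball stemmer", "stemmer"),
    "pt": Preprocessor("Snowball stemmer", "stemmer"),
    "ca": Preprocessor("Snowball stemmer", "stemmer"),
    "lb": Preprocessor("spellux", "lemmatizer"),
}


def preprocessor_for(iso_code: str) -> Preprocessor:
    """Return the tested pre-processor for an ISO 639-1 code (case-insensitive)."""
    try:
        return SUPPORTED_LANGUAGES[iso_code.lower()]
    except KeyError:
        raise ValueError(
            f"No tested pre-processor for language code {iso_code!r}"
        ) from None
```

Raising an error for unknown codes, rather than silently falling back to a default, makes it explicit that only the listed languages have been tested.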

Snowball stemmer

The Snowball stemmer supports more languages than those listed in the table above. For the complete list, see https://snowballstem.org/algorithms/.
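As an illustration of what Snowball stemming looks like in practice, here is a minimal sketch that assumes the `snowballstemmer` package from PyPI. Whether the BBF uses this package or another Snowball binding is not stated here, so treat the choice of library, the `SNOWBALL_ALGORITHMS` mapping, and the `stem_tokens` helper as assumptions made for this example.

```python
# Assumption: the PyPI package `snowballstemmer` is used for illustration;
# the BBF may wrap Snowball differently.
try:
    import snowballstemmer
except ImportError:  # package not installed
    snowballstemmer = None

# ISO 639-1 code -> Snowball algorithm name, for the stemmer rows of the table above.
SNOWBALL_ALGORITHMS = {
    "en": "english",
    "es": "spanish",
    "fr": "french",
    "de": "german",
    "pt": "portuguese",
    "ca": "catalan",
}


def stem_tokens(tokens, iso_code):
    """Stem tokens with the Snowball algorithm mapped to iso_code.

    Tokens are returned unchanged when the language has no mapped
    Snowball algorithm or when `snowballstemmer` is not installed.
    """
    algorithm = SNOWBALL_ALGORITHMS.get(iso_code)
    if algorithm is None or snowballstemmer is None:
        return list(tokens)
    stemmer = snowballstemmer.stemmer(algorithm)
    return stemmer.stemWords(list(tokens))
```

For example, `stem_tokens(["running", "houses"], "en")` returns one stem per input token, while a call with `"lb"` leaves the tokens untouched because Luxembourgish is handled by a lemmatizer instead.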

Luxembourgish Pre-processing

To support Luxembourgish, we wanted to enable language-specific text processing. For that purpose, we opted for the best available tool: the lemmatizer created by Christoph Purschke as part of “spellux - Automatic text normalization for Luxembourgish”.