Gender bias in machine translation

Gender versus Czech

In Czech we don’t say ‘I am happy’, we say ‘I as a man am happy’ or ‘I as a woman am happy’.

Yes, this sentence really does have two translations in Czech: jsem šťastný if a man is saying it and jsem šťastná if a woman is saying it. Czech is one of those very gender-aware languages where many things depend on whether the subject is male or female. Here, Czech grammar requires that the word for happy agrees in gender with the subject, so even a simple sentence such as I am happy comes out differently depending on whether the I refers to a man or a woman.

This affects not only adjectives (like happy) but also some nouns. Many Czech nouns which refer to people come in two versions, male and female: driver is řidič (male) or řidička (female), customer is zákazník (male) or zákaznice (female) and so on. Here Czech isn’t so exceptional because many European languages have gender-specific nouns. Even English has them, even if only a handful, such as actor/actress, widow/widower.

Last but not least, gender-specificity affects verbs too, or at least some forms of verb. Basically, every Czech verb in the past tense or in the conditional mood has to agree with the subject in gender, in the same way that adjectives do. What’s the Czech for I have won? It’s vyhrála jsem if a woman is saying it, vyhrál jsem if a man is saying it. This may seem exotic, from an English-speaking person’s view, but all Slavic languages are like this: not just Czech but also Russian, Polish and so on. The result is that, like it or not, you can barely say anything in these languages without accidentally revealing your gender.

How do machine translators handle gender-specific translations?

The short answer is: not very well. Machine translation will typically give you just one version, either the male one or the female one, depending on which occurred more often in the data the system was trained on. Google Translate translates I am a bus driver as jsem řidič autobusu (male) but I am a cleaner as jsem uklízečka (female). Why? Because it “thinks” (or perhaps “knows”) that bus drivers are men more often than women, and cleaners are women more often than men. That’s probably what the algorithm has seen in its training data.
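To make the "whichever occurred more often" idea concrete, here is a deliberately simplified sketch in Python. Real machine translators are neural networks, not lookup tables, so this is only an analogy. The counts are invented for illustration, the function name is hypothetical, and the masculine form uklízeč is added here to complete the pair.

```python
from collections import Counter

# Invented counts, for illustration only: how often each Czech form
# might have appeared as the translation of the English phrase.
TOY_CORPUS_COUNTS = {
    "bus driver": Counter({"řidič autobusu": 90, "řidička autobusu": 10}),
    "cleaner": Counter({"uklízeč": 20, "uklízečka": 80}),
}


def most_frequent_translation(english_phrase: str) -> str:
    """Return the Czech form seen most often, ignoring who is speaking."""
    return TOY_CORPUS_COUNTS[english_phrase].most_common(1)[0][0]


print(most_frequent_translation("bus driver"))
print(most_frequent_translation("cleaner"))
```

Notice that the function has no way of being told who the speaker is. The majority form wins every time, which is exactly the behaviour described above.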

So far, so explainable. But imagine you’re a female bus driver who just wants to translate this sentence in order to say something about yourself. Then it’s no use to you if the machine thinks that the I in this sentence must be male. In your case it’s female but the machine doesn’t care. If you don’t speak Czech and if you’re not aware of the existence of gender-specific translations, you won’t even suspect that you may have been given the wrong translation. Also, ask yourself whether it’s desirable that machine translators should always assume gender in this way. It isn’t difficult to see how this perpetuates gender stereotypes.

So, machine translation obviously has a problem with gender-specific translations. To be fair, the problem is known and does receive the attention of machine translation developers occasionally. For one, Google has partially solved the problem and offers both male and female translations, but only in some situations and only in a handful of languages (Czech not included).

How does Fairslator handle gender-specific translations?

Fairslator is a plug-in for any machine translation system (currently it works with DeepL, Google Translate and Microsoft Translator). Fairslator examines the translation which the system has produced, and looks for cases where there might be an ambiguity. In cases where an English sentence is gender-neutral and the Czech translations are gender-specific, Fairslator reinflects the translation to generate both versions, and offers them to you.
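The detect-and-reinflect idea can be sketched in a few lines of Python. This is not Fairslator's actual implementation, which is not described here. It is a toy that assumes a tiny hand-made lexicon built from the Czech examples in this article, and all function and variable names are hypothetical.

```python
# Toy lexicon of masculine -> feminine Czech forms, taken from this article.
MASC_TO_FEM = {
    "šťastný": "šťastná",
    "řidič": "řidička",
    "zákazník": "zákaznice",
    "vyhrál": "vyhrála",
}
FEM_TO_MASC = {fem: masc for masc, fem in MASC_TO_FEM.items()}

# English words that reveal the speaker's gender (deliberately tiny list).
ENGLISH_GENDER_CUES = {"man", "woman", "he", "she", "actress", "widow", "widower"}


def is_gender_neutral(english: str) -> bool:
    """True if no word in the English sentence gives the gender away."""
    words = {w.strip(".,!?").lower() for w in english.split()}
    return not (words & ENGLISH_GENDER_CUES)


def reinflect(czech: str, mapping: dict[str, str]) -> str:
    """Swap every word found in the mapping for its other-gender form."""
    return " ".join(mapping.get(word, word) for word in czech.split())


def gender_variants(english: str, czech: str) -> dict[str, str]:
    """Offer both versions when the English is neutral but the Czech is not."""
    masculine = reinflect(czech, FEM_TO_MASC)
    feminine = reinflect(czech, MASC_TO_FEM)
    if masculine == feminine or not is_gender_neutral(english):
        return {"only": czech}
    return {"masculine": masculine, "feminine": feminine}


print(gender_variants("I am a bus driver", "jsem řidič autobusu"))
```

Whichever version the machine translator happened to produce, the sketch returns both, so the choice goes back to the person who knows the answer. A real system would need proper morphological analysis rather than word-for-word lookup.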

Fairslator never assumes that a bus driver must be a man or that a cleaner must be a woman. In fact, Fairslator tries to never assume anything. Whenever your intended meaning is unclear, be it a question of gender or something else, Fairslator does its best to detect that, and asks follow-up questions to clarify what exactly you mean.

Now, isn’t that something that makes you happy? In Czech that’s jsem z toho šťastná if you’re a woman and jsem z toho šťastný if you’re a man.


Fairslator timeline

September 2022 — Fairslator was presented and demoed at the Text, Speech and Dialogue (TSD) conference in Brno.
August 2022 — Translations in London are talking about Fairslator in their blog post Overcoming gender bias in MT. They think the technology behind Fairslator could be useful in the translation industry for faster post-editing of machine-translated texts.
August 2022 — A fourth language pair released: English → French.
July 2022 — Germany's Goethe-Institut interviewed us for the website of their project Artificially Correct. Read the interview in German: Wenn die Maschine den Menschen fragt or in English: When the machine asks the human, or see this short video on Twitter.
May 2022 — A website for the translation industry asked us for a guest post and of course we didn't say no. Read What You Need to Know About Bias in Machine Translation.
April 2022 — A third language pair added: English → Irish.
February 2022 — Fairslator launched with two language pairs: English → German, English → Czech. Cries of excitement from everywhere!