
10 things you should know about gender bias in machine translation

Machine translation is getting better all the time, but the problem of gender bias remains. Read these ten questions and answers to understand what it is, why it happens, and what can be done about it.

1. How does gender bias happen in translation?

Gender bias happens when you need to translate something from a language where it’s gender-neutral into a language where it’s going to be gender-specific. For example, any sentence with the English word doctor in it, such as I am a doctor or this is my doctor, will be affected because there are two words for doctor in most European languages, one for ‘male doctor’ and one for ‘female doctor’. A human translator will usually figure out from context which translation is needed. A machine translator, on the other hand, normally has no context, and assumes a gender arbitrarily: often it’s the male gender. This is why Google Translate, DeepL and others tend to translate doctor as ‘male doctor’.

2. Why male and not female?

The sentence with doctor is an example of something called the male default: the machine assumes the word doctor refers to a male doctor, and ignores the possibility that it may be referring to a woman. Many words that describe people by occupation, such as doctor and director and teacher and bus driver, are biased by the male default in machine translation because these jobs have traditionally been held by men more often than by women. A smaller number of words are biased in the opposite direction, the female default: words like nurse, cleaner, cook. In each case, the computer simply assumes that, because it has been like that in the past, it is like that on this occasion too.

3. Why do computers make these assumptions?

Computer translators usually ‘learn’ from large databases of texts that have previously been translated by humans. If they see that bus driver has been translated as ‘male bus driver’ more often than ‘female bus driver’, they will pick that up as a habit and then always translate bus driver that way (unless some clues in the context force a different interpretation). The same goes for nurse being translated as ‘female nurse’ more often than ‘male nurse’ and so on.

4. So... computers just replicate pre-existing human biases, right?

Yes, more or less. Machine translators end up being biased to interpret words as male or female because they have learned from biased data which ultimately comes from us, humans. But that’s not the whole story. Most machine learning algorithms are designed to overgeneralize, to over-favour typicality. So, even if nurse (without contextual clues as to gender) is translated as ‘female nurse’ only 75% of the time in the training data, the machine will end up translating it as ‘female nurse’ 100% of the time. In other words, machine translators don’t just replicate pre-existing biases, they amplify them.
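The amplification effect can be sketched in a few lines of code: a translator that always picks the single most frequent option turns a 75/25 split in its training data into a 100/0 split in its output. A minimal illustration (the counts are invented for the example, not real training data):

```python
# Toy illustration of bias amplification in machine translation.
# Pretend the training data translates "nurse" as female 75% of
# the time and male 25% of the time.
from collections import Counter

training_translations = ["female"] * 75 + ["male"] * 25
counts = Counter(training_translations)

def translate_nurse() -> str:
    # A typical system picks the single most likely option every time;
    # most_common(1) returns [(label, count)], so take the label.
    return counts.most_common(1)[0][0]

# A 75% majority in the data becomes a 100% majority in the output:
outputs = [translate_nurse() for _ in range(100)]
print(outputs.count("female"), outputs.count("male"))  # 100 0
```

The 75% figure is arbitrary; any majority, however slim, gets rounded up to certainty by this pick-the-most-likely strategy.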

5. But is gender bias really such a big deal?

That depends on what you mean by ‘big’. Most people, when they type a sentence like I am a bus driver into a machine translator and ask for a translation into another language, have themselves in mind: they want to translate a statement about themselves. If the user is a woman and the translation is worded as if a man is saying it, then that’s not OK. The translation is wrong, where by ‘wrong’ we mean ‘different from what the user intended’.

6. Surely such errors are easy to correct manually?

Of course, if the user speaks the target language, then she should be able to spot and correct the error. But if she doesn’t, then she may not even know that she has been given the wrong translation, and end up talking about herself as a man: embarrassing! Also, consider the wider social impact: machine-translated texts are everywhere these days, and if we’re constantly reading about male doctors and female nurses and almost never the other way around, then this might create the subconscious impression in people’s minds that certain professions are more suitable for one gender than the other. A society which believes in gender equality should not take that sitting down.

7. OK. But hey, that’s machine translation for you: it makes mistakes sometimes.

Yes, it is of course true that all machine translators come with a certain margin of error, and that you should never unconditionally trust the translations they give you. But that shouldn’t stop us from trying to make that margin of error as narrow as possible. As it turns out, the problem of gender bias is fixable: we can make the machine know which gender we’re talking about, and then translate accordingly.

8. You mean we can teach machines to be better at figuring out gender from context?

No, not exactly. Machines are actually pretty good at doing that already. If you have a sentence where there is some context to help with gender disambiguation, such as she is a doctor or he is a nurse, then most machine translators will pick up on that and translate the words correctly (doctor as ‘female doctor’, nurse as ‘male nurse’). The problem is that in some sentences there simply is no context to help with that, such as I am a doctor or this is a nurse. There are no clues in those sentences to tip you off one way or another whether the doctors and nurses in question are male or female.

9. So... we can’t fix gender bias just by building better AI?

Correct, we can’t. No artificial intelligence, however smart, can ever guess what gender people are if there are no clues in the input that’s available to the machine. A human translator, when they realize they don’t know the gender of someone and need it for translation, will try to find out, for example by asking someone.

10. But computers can’t ask questions, can they?

Well, actually, they can! That’s exactly the idea behind Fairslator. Fairslator isn’t the kind of machine translator that tries to guess the unguessable. Instead, Fairslator is a tool which detects the presence of a gender ambiguity and asks you which gender you want. When translating I am a doctor from English into a language where there are two gender-specific words for doctor, Fairslator will ask you which one you mean, and then alter the translation accordingly. The thing is, to really fix gender bias, we must liberate ourselves from the misconception that there is always an immediately obtainable translation for everything. Sometimes, we must ask follow-up questions to clarify what the intended meaning is, before we can produce a translation into another language. It’s time machine translators learned how to do that, and Fairslator is leading the way.
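Fairslator’s internals aren’t public, but the detect-then-ask workflow described here can be sketched roughly as follows. The word list and the German translations are invented for illustration; they are not Fairslator’s actual data or API:

```python
# Hypothetical sketch of an ask-the-user disambiguation step.
# The entries below are illustrative examples, not real Fairslator data.
AMBIGUOUS = {
    "doctor": {"male": "Arzt", "female": "Ärztin"},
    "nurse": {"male": "Krankenpfleger", "female": "Krankenschwester"},
}

def detect_ambiguities(sentence: str) -> list:
    """Return words whose gender cannot be inferred from the sentence alone."""
    words = [w.strip(".,!?") for w in sentence.lower().split()]
    return [w for w in words if w in AMBIGUOUS]

def translate_word(word: str, answer: str) -> str:
    """Pick the gender-specific translation based on the user's answer."""
    return AMBIGUOUS[word][answer]

# Instead of guessing, the tool asks a follow-up question:
for word in detect_ambiguities("I am a doctor."):
    # answer = input(f"Is the {word} male or female? ")  # ask the user
    answer = "female"  # e.g. the user picks 'female'
    print(translate_word(word, answer))  # Ärztin
```

The point of the sketch is the control flow: the system detects that a choice exists, defers it to the human, and only then commits to a translation.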

Contact the author

What next?

Read more about bias and ambiguity in machine translation.
The Fairslator whitepaper: We need to talk about bias in machine translation. Download
Sign up for my very low-traffic mailing list. I'll keep you updated on what's new with Fairslator and what's happening with bias in machine translation generally.
Your address is safe here. I will only use it to send you infrequent updates about Fairslator. I will not give or sell it to anyone. You can ask to be taken off the list at any time.

Fairslator blog

| Status update
What's new with Fairslator #2
Fairslator now speaks French, and other news.
| Gender-fair language
Can gender-inclusive writing be automated?
Just scatter gender stars everywhere and you're done? Hardly. Writing in a gender-fair way takes creativity above all.
| Oh là là
Three reasons why you shouldn’t use machine translation for French
But if you must, at least run it through Fairslator.
| From English to Irish
‘Tusa’, ‘sibhse’ and machine translation from English
Let's abandon the misconception that there is only one translation for everything.
| Status update
What's new with Fairslator #1
A new language pair, some new publications, plus what's in the pipeline.
| Machine translation
Finally, an Irish translation app that knows the difference between ‘tú’ and ‘sibh’
It asks you how you want to translate ‘you’.
| Forms of address
Why machine translation has a problem with ‘you’
This innocent-looking English pronoun is surprisingly difficult to translate into other languages.
| Machine translation in Czech
Finally, a translation app that knows the difference between Czech ‘ty’ and ‘vy’!
Wouldn’t it be nice if machine translation asked how you want to translate ‘you’?
| Gender bias in machine translation
Gender versus Czech
In Czech we don’t say ‘I am happy’, we say ‘I as a man am happy’ or ‘I as a woman am happy’.
| Machine translation
Imagine you're a machine that translates
Why do translation apps never ask what we mean?
| German machine translation
Finally, a translation app that knows the difference between German ‘du’ and ‘Sie’!
Wouldn’t it be nice if machine translation asked how you want to translate ‘you’?
| Machine translation
Imagine you're DeepL
Why doesn't the translator actually ask what I mean?

Fairslator timeline

September 2022 — Fairslator was presented and demoed at the Text, Speech and Dialogue (TSD) conference in Brno.
August 2022 — Translations in London are talking about Fairslator in their blog post Overcoming gender bias in MT. They think the technology behind Fairslator could be useful in the translation industry for faster post-editing of machine-translated texts.
August 2022 — A fourth language pair released: English → French.
July 2022 — Germany's Goethe-Institut interviewed us for the website of their project Artificially Correct. Read the interview in German: Wenn die Maschine den Menschen fragt, or in English: When the machine asks the human, or see this short video on Twitter.
May 2022 — Slator.com, a website for the translation industry, asked us for a guest post and of course we didn't say no. Read What You Need to Know About Bias in Machine Translation »
April 2022 — A third language pair added: English → Irish.
February 2022 — Fairslator launched with two language pairs: English → German, English → Czech. Cries of excitement from everywhere!