In Czech we don’t say ‘I am happy’, we say ‘I as a man am happy’ or ‘I as a woman am happy’.
Yes, this sentence really does have two translations in Czech: jsem šťastný if a man is saying it and jsem šťastná if a woman is saying it. Czech is one of those very gender-aware languages where many things depend on whether the subject is male or female. Even a simple sentence such as I am happy has two translations depending on whether the I refers to a man or a woman. It’s because Czech grammar requires that the word for happy agrees in gender with the subject.
This affects not only adjectives (like happy) but also some nouns. Many Czech nouns which refer to people come in two versions, male and female: driver is řidič (male) or řidička (female), customer is zákazník (male) or zákaznice (female) and so on. Here Czech isn’t so exceptional because many European languages have gender-specific nouns. Even English has them, even if only a handful, such as actor/actress, widow/widower.
Last but not least, gender-specificity affects verbs too, or at least some forms of verbs. Basically, every Czech verb in the past tense or in the conditional mood has to agree with the subject in gender, in the same way that adjectives do. What’s the Czech for I have won? It’s vyhrála jsem if a woman is saying it, vyhrál jsem if a man is saying it. This may seem exotic from an English speaker’s point of view, but all Slavic languages are like this: not just Czech but also Russian, Polish and so on. The result is that, like it or not, you can barely say anything in these languages without accidentally revealing your gender.
So how does machine translation cope with all this? The short answer is: not very well. Machine translation will typically give you just one version, either the male one or the female one, depending on which occurred more often in its training data (the texts from which the AI has machine-learned all it knows). Google Translate translates I am a bus driver as jsem řidič autobusu (male) but I am a cleaner as jsem uklízečka (female). Why? Because it “thinks” (or perhaps “knows”) that bus drivers are men more often than women, and cleaners are women more often than men. That’s probably what the algorithm has seen in its training data.
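The mechanism can be illustrated with a deliberately naive sketch. Nothing here is how Google Translate actually works; the counts and the function name `pick_translation` are invented. The point is only that a purely frequency-driven chooser will always return the majority variant and silently discard the other one:

```python
# Toy training counts: how often each (English, Czech) pair was "seen".
# The numbers are invented for illustration.
TRAINING_COUNTS = {
    ("bus driver", "řidič autobusu"): 900,    # masculine form
    ("bus driver", "řidička autobusu"): 100,  # feminine form
    ("cleaner", "uklízeč"): 150,              # masculine form
    ("cleaner", "uklízečka"): 850,            # feminine form
}

def pick_translation(source: str) -> str:
    """Return whichever Czech variant occurred more often in training."""
    candidates = {cz: n for (en, cz), n in TRAINING_COUNTS.items() if en == source}
    return max(candidates, key=candidates.get)

print(pick_translation("bus driver"))  # the majority (masculine) form wins
print(pick_translation("cleaner"))     # the majority (feminine) form wins
```

However the real system arrives at its choice, the visible behaviour is the same: one gendered variant is output, the other never surfaces.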
So far, so explainable. But imagine you’re a female bus driver who just wants to translate this sentence in order to say something about yourself. Then it’s no use to you if the machine thinks that the I in this sentence must be male. In your case it’s female but the machine doesn’t care. If you don’t speak Czech and if you’re not aware of the existence of gender-specific translations, you won’t even suspect that you may have been given the wrong translation. Also, ask yourself whether it’s desirable that machine translators should always assume gender in this way. It isn’t difficult to see how this perpetuates gender stereotypes.
So, machine translation obviously has a problem with gender-specific translations. To be fair, the problem is known and occasionally receives attention from machine translation developers. Google, for one, has partially addressed it and offers both male and female translations, but only in some situations and only for a handful of languages (Czech not included).
Fairslator is a plug-in for any machine translation system (currently it works with DeepL, Google Translate and Microsoft Translator). Fairslator examines the translation which the system has produced and looks for cases where there might be an ambiguity. When an English sentence is gender-neutral but its Czech translation is gender-specific, Fairslator reinflects the translation to generate both versions, and offers them to you.
Fairslator never assumes that a bus driver must be a man or that a cleaner must be a woman. In fact, Fairslator tries to never assume anything. Whenever your intended meaning is unclear, be it a question of gender or something else, Fairslator does its best to detect that, and asks follow-up questions to clarify what exactly you mean.
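The reinflection idea can be sketched in a few lines of Python. This is a toy illustration, not Fairslator's actual code: the mapping `MASC_TO_FEM` and the function `reinflect` are invented here, and real reinflection has to handle morphology far beyond a word-for-word lookup. The sketch only shows the principle: if a translation contains a gender-marked word, produce both variants instead of picking one.

```python
# Hypothetical word-level mapping from Czech masculine forms to
# their feminine counterparts (illustrative entries only).
MASC_TO_FEM = {
    "šťastný": "šťastná",   # happy (adjective)
    "vyhrál": "vyhrála",    # won (past-tense verb)
    "řidič": "řidička",     # driver (noun)
}

def reinflect(czech_sentence: str) -> list[str]:
    """Return both gender variants of a sentence if it contains a
    gender-marked word, otherwise return it unchanged as the only variant."""
    words = czech_sentence.split()
    fem_words = [MASC_TO_FEM.get(w, w) for w in words]
    if fem_words != words:
        # Gender-marked material found: offer the original (masculine)
        # output alongside the reinflected feminine version.
        return [czech_sentence, " ".join(fem_words)]
    return [czech_sentence]

print(reinflect("vyhrál jsem"))  # two variants: masculine and feminine
```

When `reinflect` returns two variants, a user interface can then ask the follow-up question, who is the I here, a man or a woman, rather than guessing.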
Now, isn’t that something that makes you happy? In Czech that’s jsem z toho šťastná if you’re a woman and jsem z toho šťastný if you’re a man.
October 2024 —
We were talking about bias in machine translation
at a Translating Europe Workshop organised by the European Commission in Prague
as part of Jeronýmovy dny,
a series of public lectures and seminars on translation and interpreting.
Video here »
December 2023 —
Fairslator presented a workshop on bias in machine translation
at the European Commission's
Directorate-General for Translation,
attended by translation-related staff from all EU institutions.
November 2023 —
Fairslator went to Translating and the Computer,
an annual conference on translation technology in Luxembourg,
to present its brand new API.
Proceedings from this conference are here; our paper starts on page 98.
November 2023 —
We were talking about gender bias, gender rewriting and Fairslator
at the EAFT Summit
in Barcelona, where we also launched an exciting spin-off
project:
Genderbase,
a multilingual database of gender-sensitive terminology.
February 2023 —
We spoke to machinetranslation.com
about bias in machine translation, about Fairslator, and about our vision for “human-assisted machine translation”.
Read the interview here:
Creating an Inclusive AI Future: The Importance of Non-Binary Representation »
October 2022 —
We presented Fairslator at the
Translating and the Computer
(TC44) conference, Europe's main annual event for computer-aided translation, in Luxembourg.
Proceedings from this conference are here;
the paper that describes Fairslator starts on page 90.
Read our impressions from TC44 in this thread on
Twitter
and
Mastodon.
September 2022 —
In her article
Error sources in machine translation: How the algorithm reproduces unwanted gender roles
(German: Fehlerquellen der maschinellen Übersetzung: Wie der Algorithmus ungewollte Rollenbilder reproduziert),
Jasmin Nesbigall of oneword GmbH talks about bias in machine translation
and recommends Fairslator as a step towards more gender fairness.
September 2022 —
Fairslator was presented at the
Text, Speech and Dialogue
(TSD) conference in Brno.
August 2022 —
Translations in London are talking about Fairslator in their blog post
Overcoming gender bias in MT.
They think the technology behind Fairslator could be useful in the translation industry
for faster post-editing of machine-translated texts.
July 2022 —
We presented a paper titled A Taxonomy of Bias-Causing Ambiguities in Machine Translation
at a Workshop on Gender Bias in Natural Language Processing
during the 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics
in Seattle.
May 2022 —
Slator.com, a website for the translation industry, asked us for a guest post and of course we didn't say no.
Read What You Need to Know About Bias in Machine Translation »