E-MIMIC: Empowering Multilingual Inclusive Communication

Preserving diversity and inclusion is becoming a compelling need in both industry and academia. The ability to use appropriate forms of writing, speaking, and gestures is not widespread, even in formal communications such as public calls, public announcements, official reports, and legal documents. The improper use of linguistic expressions can foment unacceptable forms of exclusion and stereotyping, as well as verbal violence against minorities, including women. Furthermore, existing machine translation tools are not designed to generate inclusive content. This paper investigates a joint effort by the linguistics and deep learning-based Natural Language Understanding research communities to fight non-inclusive, prejudiced language forms. It presents a methodology for tackling the improper use of language in formal communication, with particular attention to Romance languages (Italian in particular). State-of-the-art deep language modeling architectures are used to automatically identify non-inclusive text snippets, suggest alternative forms, and produce inclusive rephrasings. A preliminary evaluation on a benchmark dataset shows promising results, namely 85% accuracy in classifying communications as inclusive or non-inclusive.

(2021). E-MIMIC: Empowering Multilingual Inclusive Communication. Retrieved from http://hdl.handle.net/10446/206154

Attanasio, Giuseppe; Greco, Salvatore; La Quatra, Moreno; Cagliero, Luca; Tonti, Michela; Cerquitelli, Tania; Raus, Rachele

2021-01-01

Collection: 1.4.01 Contributi in atti di convegno - Conference presentations

Use this identifier to cite or link to this document: https://hdl.handle.net/10446/206154
