The availability of unstructured big data, such as those produced by social media, has increased methodological interest in text analysis and in the related pre-processing phases. Several recent works have studied the impact of different pre-processing treatments on text classification. This aspect has rarely been studied, however, when the goal of the research is to define a topic-oriented dictionary that can be used to select messages on a certain topic from a large set of unlabelled texts. This selection is a crucial phase: carefully filtering messages is key to starting and properly developing any type of textual analysis. In this paper, we aim to set up a dictionary about the environment. Starting from a verified list of Twitter Official Social Accounts, we evaluate whether and how different pre-processing treatments (and their combinations) affect the final dictionary.
Toninelli, D., & Cameletti, M. (2021). Impact of Tweets Pre-processing Techniques on a Dictionary for Environment. Retrieved from http://hdl.handle.net/10446/206062
Impact of Tweets Pre-processing Techniques on a Dictionary for Environment
Toninelli, Daniele; Cameletti, Michela
2021-01-01