Spreafico, C., & Spreafico, M. (2021). Using text mining to retrieve information about circular economy [journal article]. In COMPUTERS IN INDUSTRY. Retrieved from http://hdl.handle.net/10446/186850
Using text mining to retrieve information about circular economy
Spreafico, Christian; Spreafico, Matteo
2021-01-01
Abstract
This paper proposes a text mining method to automatically retrieve knowledge from patents on how to recycle and reuse waste. The main novelties are a set of specific dependency patterns and a partially revised TRIZ (Russian acronym for Theory of Inventive Problem Solving) ontology used to classify the retrieved information. The proposed dependency patterns were manually extracted from a sample pool of patents about waste recycling and reuse. The information is classified into six classes: (1) what transformations can be carried out on the waste, (2) what technologies can be used to carry out these transformations, (3) what products can be obtained by transforming the waste, (4) what functions can be carried out by the waste, (5) with which technologies, and (6) on which entities. An automatic implementation of the proposed method, with a manual check of the retrieved results, was tested through a case study about wood chip recycling and reuse. Compared to the dependency patterns from the literature, the proposed ones retrieved 28 % more pertinent information. This result stems mainly from the proposed patterns' better ability to discriminate the relevant sentences from which to extract information (i.e. +40 % compared to the other patterns). The automatic classification of the information was also performed correctly: in almost every class, precision and recall were higher than 60 %, and on average they were equal to 90 %.
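The abstract reports precision and recall for each class. As a minimal sketch of how such per-class figures could be computed from manually checked results, the Python below compares gold labels with predicted labels. The class names, the function `per_class_precision_recall`, and the toy labels are all hypothetical and are not taken from the paper; the sketch does not reproduce the authors' implementation or their numbers.

```python
from collections import Counter

# Hypothetical short names for the six classes listed in the abstract.
CLASSES = [
    "transformation",             # (1) transformations carried out on the waste
    "transformation_technology",  # (2) technologies enabling those transformations
    "product",                    # (3) products obtained from the waste
    "function",                   # (4) functions carried out by the waste
    "function_technology",        # (5) technologies enabling those functions
    "target_entity",              # (6) entities on which the functions act
]


def per_class_precision_recall(gold, predicted, classes=CLASSES):
    """Return {class: (precision, recall)} for paired gold/predicted labels.

    A value is None when the class has no predicted items (precision)
    or no gold items (recall), so the ratio is undefined.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1   # correctly classified item
        else:
            fp[p] += 1   # item wrongly assigned to class p
            fn[g] += 1   # item of class g that was missed
    scores = {}
    for c in classes:
        n_pred = tp[c] + fp[c]
        n_gold = tp[c] + fn[c]
        precision = tp[c] / n_pred if n_pred else None
        recall = tp[c] / n_gold if n_gold else None
        scores[c] = (precision, recall)
    return scores


# Toy labels invented for illustration only.
gold = ["product", "product", "function", "transformation"]
predicted = ["product", "function", "function", "transformation"]
scores = per_class_precision_recall(gold, predicted)
```

Returning `None` for classes without gold or predicted items keeps undefined ratios distinct from a true score of zero, which matters when some classes are rare in a small case study.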