Calzarossa, M. C., Della Vedova, M. L., Massari, L., Nebbione, G., & Tessera, D. (2019). A methodological approach for time series analysis and forecasting of web dynamics. Retrieved from http://hdl.handle.net/10446/202728

A methodological approach for time series analysis and forecasting of web dynamics

Della Vedova, Marco L.
2019-01-01

Abstract

The web is a complex information ecosystem whose large variety of content changes over time under the combined effects of management policies, user interactions and external events. These highly dynamic scenarios challenge technologies that deal with the discovery, management and retrieval of web content. In this paper, we address the problem of modeling and predicting web dynamics in the framework of time series analysis and forecasting. We present a general methodological approach that allows the identification of the patterns describing the behavior of a time series, the formulation of suitable models and the use of these models to predict future behavior. Moreover, to improve the forecasts, we propose a method for detecting and modeling the spiky patterns that might be present in a time series. To test our methodological approach, we analyze the temporal patterns of page uploads on the Reuters news agency website over one year. We find that the upload process is characterized by diurnal behavior and by a much larger number of uploads on weekdays than on weekend days. Moreover, we identify several sudden spikes and a daily periodicity. The overall model of the upload process, obtained as a superposition of the models of its individual components, accurately fits the data, including most of the spikes.
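The idea of modeling a series as a superposition of components can be illustrated with a minimal sketch. This is not the authors' method or data: the series is synthetic, and the function names, the median hour-of-week profile, the MAD-based spike rule and the threshold `k` are all assumptions chosen only to keep the example short.

```python
import math
import random
import statistics

HOURS_PER_WEEK = 24 * 7


def make_series(weeks=8, seed=0):
    """Synthetic hourly upload rates (hypothetical, not the Reuters data):
    a diurnal cycle, a weekday/weekend gap and three injected spikes."""
    rng = random.Random(seed)
    series = []
    for t in range(weeks * HOURS_PER_WEEK):
        hour, day = t % 24, (t // 24) % 7
        diurnal = 50 + 40 * math.sin(2 * math.pi * (hour - 6) / 24)
        level = diurnal * (1.0 if day < 5 else 0.3)  # days 5 and 6 act as the weekend
        series.append(level + rng.gauss(0, 3))
    spikes = {100: 120.0, 500: 150.0, 900: 90.0}  # assumed positions and heights
    for t, height in spikes.items():
        series[t] += height
    return series, sorted(spikes)


def weekly_profile(series):
    """Periodic component: median of each hour-of-week slot.
    The median keeps isolated spikes from distorting the profile."""
    slots = [[] for _ in range(HOURS_PER_WEEK)]
    for t, value in enumerate(series):
        slots[t % HOURS_PER_WEEK].append(value)
    return [statistics.median(slot) for slot in slots]


def detect_spikes(series, profile, k=6.0):
    """Spiky component: residuals that exceed the residual median by more
    than k robust standard deviations (1.4826 * MAD)."""
    residuals = [v - profile[t % HOURS_PER_WEEK] for t, v in enumerate(series)]
    center = statistics.median(residuals)
    mad = statistics.median(abs(r - center) for r in residuals)
    threshold = center + k * 1.4826 * mad
    found = [t for t, r in enumerate(residuals) if r > threshold]
    return found, residuals


def superpose(profile, spike_idx, residuals, length):
    """Overall model: periodic profile plus the modeled spikes."""
    spike_set = set(spike_idx)
    return [
        profile[t % HOURS_PER_WEEK] + (residuals[t] if t in spike_set else 0.0)
        for t in range(length)
    ]


if __name__ == "__main__":
    series, injected = make_series()
    profile = weekly_profile(series)
    found, residuals = detect_spikes(series, profile)
    model = superpose(profile, found, residuals, len(series))
    print("injected spikes:", injected)
    print("detected spikes:", found)
```

Here a single hour-of-week profile captures both the daily cycle and the weekday/weekend difference, and the spikes are modeled separately and added back. A real analysis would fit explicit models to each component and validate the forecasts on held-out data.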
2019
Calzarossa, Maria Carla; Della Vedova, Marco Luigi; Massari, Luisa; Nebbione, Giuseppe; Tessera, Daniele
Files attached to this record:

2019_trans-comp-collective-intelligence.pdf
Open Access since 23/06/2021
Description: Under Springer Nature terms of use for archived accepted manuscripts (AMs)
Version: postprint - refereed version/accepted without refereeing
License: Aisberg default license
File size: 483.34 kB
Format: Adobe PDF

2019_DellaVedova_Transactions.pdf
Archive managers only
Version: publisher's version
License: Aisberg default license
File size: 385.12 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/10446/202728