Stage Preprocessed Data

Overview

Sustainability Dimension: Ecological
ML Development Phase: Data Collection and Preparation
ML Development Stakeholders: ML Development, Software Development

Description

Staging preprocessed data can improve environmental efficiency: the design pattern (DP) “Stage Preprocessed Data” stores the results of preprocessing so that they do not have to be recalculated (Vassiliadis, 2009). Schneider et al. (2019) suggest keeping intermediary stages of the processed data, for example in feature stores, to facilitate rapid modeling, because each operation then needs to be executed only once. For changing datasets, research further focuses on intelligently determining the point in time at which the data should be re-extracted and re-transformed (Vassiliadis, 2009).
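
As a rough illustration of the idea, the following minimal sketch stages a preprocessing result on disk and reuses it while the raw input is unchanged. It is not taken from the cited sources: the function names `fingerprint` and `staged_preprocess`, the JSON file layout, and the use of a content hash to decide when to recompute are all assumptions made for this example. A real feature store would typically offer far more (versioning, sharing, access control).

```python
import hashlib
import json
import tempfile
from pathlib import Path


def fingerprint(rows):
    """Content hash of the raw rows; a changed dataset yields a new key (assumed strategy)."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def staged_preprocess(rows, transform, stage_dir):
    """Return (result, reused). The staged copy is reused when the input is unchanged."""
    stage_dir = Path(stage_dir)
    stage_dir.mkdir(parents=True, exist_ok=True)
    staged_file = stage_dir / f"{fingerprint(rows)}.json"
    if staged_file.exists():
        # Staged result found: skip the transformation entirely.
        return json.loads(staged_file.read_text(encoding="utf-8")), True
    result = transform(rows)
    staged_file.write_text(json.dumps(result), encoding="utf-8")
    return result, False


# Hypothetical preprocessing step that counts how often it actually runs.
calls = {"n": 0}


def scale(rows):
    calls["n"] += 1
    top = max(r["x"] for r in rows)
    return [{"x": r["x"] / top} for r in rows]


raw = [{"x": 2}, {"x": 4}]
with tempfile.TemporaryDirectory() as tmp:
    first, hit1 = staged_preprocess(raw, scale, tmp)   # computed and staged
    second, hit2 = staged_preprocess(raw, scale, tmp)  # served from the stage
    third, hit3 = staged_preprocess(raw + [{"x": 8}], scale, tmp)  # data changed
```

Here the second call does not run `scale` again, while the third call does because the raw data changed. Hashing the full input is only one simple way to decide when to re-transform. For large or frequently changing datasets, a cheaper change signal such as timestamps or incremental updates may be preferable.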

Sources

  • Vassiliadis, P. (2009). A Survey of Extract–Transform–Load Technology. International Journal of Data Warehousing and Mining, 5(3), 1–27. https://doi.org/10.4018/jdwm.2009070101
  • Schneider, J., Basalla, M., & Seidel, S. (2019). Principles of Green Data Mining. Proceedings of the 52nd Hawaii International Conference on System Sciences. https://doi.org/10.24251/HICSS.2019.250