Stage Preprocessed Data
Overview
Sustainability Dimension | Ecological |
ML Development Phase | Data Collection and Preparation |
ML Development Stakeholders | ML Development, Software Development |
Description
The design pattern (DP) “Stage Preprocessed Data” improves environmental efficiency by persisting intermediate results instead of recomputing them. Staging preprocessed data reduces the need for recalculation (Vassiliadis, 2009). Schneider et al. (2019) suggest keeping intermediary stages of the processed data, for example in feature stores, to facilitate rapid modeling, because each operation only needs to be executed once. For changing datasets, research further focuses on intelligently determining the point in time at which the data should be re-extracted and re-transformed (Vassiliadis, 2009).
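The idea can be illustrated with a minimal Python sketch. It is not taken from the cited sources: the names `fingerprint`, `preprocess`, and `load_staged`, the JSON file layout, and the choice of a content hash as the signal for re-transformation are all assumptions made for illustration. The sketch stores the preprocessed result under a hash of the raw data, reuses it while the data is unchanged, and recomputes only when the data changes.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def fingerprint(records):
    """Content hash of the raw records; it changes whenever the data changes."""
    payload = json.dumps(records, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def preprocess(records):
    """Hypothetical stand-in for an expensive transformation step."""
    preprocess.calls += 1  # count executions to make the saving visible
    return [{"x": r["x"], "x_scaled": r["x"] / 10} for r in records]


preprocess.calls = 0


def load_staged(records, stage_dir):
    """Return preprocessed data, reusing a staged copy when one exists."""
    stage_dir = Path(stage_dir)
    stage_dir.mkdir(parents=True, exist_ok=True)
    staged_file = stage_dir / f"{fingerprint(records)}.json"
    if staged_file.exists():
        # Raw data is unchanged: read the staged result, skip recomputation.
        return json.loads(staged_file.read_text(encoding="utf-8"))
    # Raw data is new or has changed: transform once and stage the result.
    result = preprocess(records)
    staged_file.write_text(json.dumps(result), encoding="utf-8")
    return result


stage_dir = tempfile.mkdtemp()  # assumed staging location for this demo
raw = [{"x": 1}, {"x": 2}]

first = load_staged(raw, stage_dir)    # computed and staged
second = load_staged(raw, stage_dir)   # served from the staged file
changed = load_staged(raw + [{"x": 3}], stage_dir)  # data changed, recomputed

print(preprocess.calls)
```

In this sketch the transformation runs once per distinct dataset, however often the data is requested. A production setup would more likely use a feature store or a columnar file format, and might decide when to re-transform based on timestamps or change logs rather than a full content hash.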
Sources
- Vassiliadis, P. (2009). A Survey of Extract–Transform–Load Technology. International Journal of Data Warehousing and Mining, 5(3), 1–27. https://doi.org/10.4018/jdwm.2009070101
- Schneider, J., Basalla, M., & Seidel, S. (2019). Principles of Green Data Mining. Proceedings of the 52nd Hawaii International Conference on System Sciences. https://doi.org/10.24251/HICSS.2019.250