Prediction of residential and non-residential building usage in Germany based on a novel nationwide reference data set
Research output: Contribution to journal › Research article › Contributed › peer-review
Abstract
Building usage is an important variable in modelling the energetic, material and social properties of a building stock. Gathering this data on a large geographical scale and at the necessary temporal and spatial resolution, that is, at the level of individual buildings, is a challenging task. Machine learning algorithms such as Random Forest have proven useful in predicting building-related features, but they often rely on training sets of limited geographic scope, for example, cities. This study presents a workflow for predicting the semantic attribute of usage at the level of individual buildings. Based on screening data from the earlier ENOB:dataNWG project, a novel ground-truth building data set distributed across Germany, a Random Forest algorithm is used to assess how the German building stock can be classified according to its residential or non-residential use. Different sampling strategies were applied in order to find a robust evaluation metric for the classifier. Furthermore, the relevance of the feature set is highlighted, and it is examined whether regional differences in classification quality exist. The results show that classifying building footprints as residential or non-residential has good prospects, with an AUC of up to 0.9.
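The abstract reports its result as an AUC, and the keywords mention spatial cross-validation. As a rough illustration only, and not the study's own code, the sketch below computes an AUC from classifier scores and assigns whole regions to folds so that no region appears in both training and test data. The function names, the toy labels, the scores and the region identifiers are all hypothetical; in practice the scores would come from a trained classifier such as a Random Forest.

```python
from collections import defaultdict


def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def spatial_folds(region_ids, n_folds):
    """Assign whole regions to folds so that no region is split across folds."""
    regions = sorted(set(region_ids))
    fold_of_region = {r: i % n_folds for i, r in enumerate(regions)}
    folds = defaultdict(list)
    for idx, r in enumerate(region_ids):
        folds[fold_of_region[r]].append(idx)
    return [folds[k] for k in range(n_folds)]


# Hypothetical toy data: 1 = residential, 0 = non-residential.
labels = [1, 1, 0, 0, 1, 0]
# Hypothetical predicted probabilities of "residential" (stand-in for model output).
scores = [0.9, 0.6, 0.7, 0.2, 0.8, 0.1]
# Hypothetical region identifier for each building.
regions = ["north", "north", "south", "south", "east", "east"]

auc = roc_auc(labels, scores)
folds = spatial_folds(regions, n_folds=3)
print(f"AUC: {auc:.3f}")
print("Test indices per fold:", folds)
```

Grouping by region is one simple way to keep spatially close buildings out of both sides of a split; the sampling strategies actually compared in the study may differ.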
Details
Original language | English |
---|---|
Pages (from-to) | 216-233 |
Number of pages | 18 |
Journal | Environment and Planning. B, Urban Analytics and City Science |
Volume | 51 |
Issue number | 1 |
Publication status | Published - Jan 2024 |
Peer-reviewed | Yes |
Keywords
- building stock, building usage, classification, feature importance, machine learning, Random Forest classifier, spatial cross-validation