Classification-of-Images-for-Building-Damage-Assessment

This project applies deep learning to a civil engineering problem: recognizing structural damage from images. Damage surveys currently require teams of domain experts to inspect buildings visually and judge their safety, which is slow and subjective. Our objective is to automate this process with computer vision. We use a supervised learning approach over 8 recognition tasks: scene level, damage state, spalling condition, material type, collapse mode, component type, damage level, and damage type. The system is built from 8 separate CNN models, one per task, and each image is passed through all 8 to determine its class in every task. The main challenges are getting the best performance out of all 8 tasks, handling the errors specific to each model, and managing the load on the PC when the models are run consecutively.

This project aims to replace the current manual process with one that is simpler and more consistent. Each recognition task reached an accuracy of over 80%, so we believe this type of model can begin taking over structural damage analysis on a larger scale. It is especially relevant after natural disasters, when a quick and accurate damage analysis is needed while expert resources are spread thin; an automated model gives responders capacity to draw on that is not limited by the number of available inspectors.

Introduction

Ensuring the proper performance of all elements in a structure is a priority for designers and users. In most cases, continuous monitoring that detects damage at an early stage can prevent the accidents and catastrophes that result from inadequate inspection or a flawed evaluation process. Structural health monitoring (SHM) combines continuous monitoring through sensors permanently attached to the structure with algorithms for damage identification.

Collapses of civil infrastructure draw public attention more and more often. They are generally caused by structural deterioration or by working conditions that differ from those assumed in the design. The main challenge of structural health monitoring is to increase the safety of ageing structures by detecting, locating and quantifying the presence and development of damage, possibly in real time.

However, visual inspections, whose frequency is usually determined by the importance and age of the structure, are still the workhorse in this field, even though they can rarely provide a quantitative estimate of structural damage. It is therefore evident why recent advances in sensing technologies and signal processing, coupled with the increased availability of computing power, are creating high expectations for robust and continuous SHM systems.

Structural health monitoring (SHM) and rapid damage assessment after natural hazards and disasters have become an important focus in civil engineering. Structural response records and images play an increasing role as data media in the current era of rapidly growing data. Meanwhile, artificial intelligence (AI) and machine learning (ML) technologies are developing rapidly, especially deep learning (DL) applied to computer vision, which has made great progress in recent years.

The objective of implementing ML and DL is to have computers perform labor-intensive, repetitive tasks and learn from past experience. Structural damage recognition from images is now one of the important topics in vision-based SHM and structural reconnaissance, a field that still relies heavily on human visual inspection and experience. Several recent non-DL studies have begun to address the relatively tedious manual effort involved.

Following this trend, it is timely to apply state-of-the-art DL technologies to civil engineering and evaluate their potential benefits. Computer vision is a field of artificial intelligence that trains computers to interpret and understand the visual world. Using digital images from cameras and videos together with deep learning models, machines can identify and classify objects and then react to what they see. In deep learning, a convolutional neural network (CNN) is a class of artificial neural network most commonly applied to visual imagery. CNNs have been at the heart of recent advances in deep learning. They often outperform traditional computer vision and machine learning approaches, in part because of their deep architectures and because they learn features directly from the data instead of relying on hand-engineered low-level features. As a result, CNNs need relatively little preprocessing compared with other image classification algorithms.

These advances in image classification create an opportunity for structural health monitoring. So far only a few studies or applications of CNNs exist in post-disaster reconnaissance or SHM, and civil engineering has not fully benefited from data-driven computer science and computer vision. Our aim is to combine the two fields in a program that civil engineers can benefit from.

The field of civil engineering needs more studies geared towards machine learning to get the full benefit from that field and this project is a step in that direction. In general, our objectives for this project are as follows:

Gather labeled and unlabeled data for structural health monitoring and post-disaster reconnaissance, for use with machine learning algorithms
Set up 8 recognition tasks, each covering thousands of images to be classified into that task's classes
Build an effective, consistent, and time-efficient program that classifies structural damage on multiple levels and is ready for regular use in structural health monitoring

Methodology

This project was started in the Spyder IDE and then continued on Google Colab. We first installed and imported the necessary libraries, of which there were many. We then loaded the data. The images for this project are stored as NumPy arrays, so we used NumPy's loading command to read them. Our first attempts failed because the arrays were too large to fit in memory. To solve this we loaded them with mmap_mode='r', which memory-maps the files and reads data from disk only when it is accessed. We also split the data into training and validation sets.
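A minimal sketch of the memory-mapped loading and the train/validation split described above. The file names, the 20% validation fraction, and the helper name open_split are assumptions made for illustration; the project's actual paths and split ratio are not stated here.

```python
import os
import tempfile

import numpy as np


def open_split(x_path, y_path, val_fraction=0.2, seed=0):
    """Memory-map both arrays and return index sets for train/validation."""
    # mmap_mode="r" maps the file read-only; rows are read from disk on access
    images = np.load(x_path, mmap_mode="r")
    labels = np.load(y_path, mmap_mode="r")
    order = np.random.default_rng(seed).permutation(images.shape[0])
    n_val = int(round(images.shape[0] * val_fraction))
    # Sorted indices keep disk reads roughly sequential
    val_idx = np.sort(order[:n_val])
    train_idx = np.sort(order[n_val:])
    return images, labels, train_idx, val_idx


# Tiny stand-in arrays so the sketch runs without the real dataset
demo_dir = tempfile.mkdtemp()
x_file = os.path.join(demo_dir, "X_demo.npy")  # hypothetical file name
y_file = os.path.join(demo_dir, "y_demo.npy")  # hypothetical file name
np.save(x_file, np.zeros((10, 8, 8, 3), dtype=np.uint8))
np.save(y_file, np.arange(10) % 3)

images, labels, train_idx, val_idx = open_split(x_file, y_file)
# Only the rows selected here are copied into RAM
first_batch = np.asarray(images[train_idx[:4]])
print(type(images).__name__, first_batch.shape, len(train_idx), len(val_idx))
```

Splitting by index rather than by copying the arrays keeps the full dataset on disk; batches are materialized only when they are sliced.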

Next we constructed the convolutional layers for each recognition task. We used ReLU activations for most layers and softmax for the output layer. Early on we tried a sigmoid output on the binary recognition tasks but found that it produced worse results than softmax.

We used a kernel size of 3 and a pool size of 2 for most recognition tasks, and added a dropout layer with a rate of 0.5. Because of accuracy issues, the last 2 recognition tasks were configured differently: a kernel size of 5 and a lower dropout rate of 0.1, plus a tanh activation for the final hidden layer in the last recognition task.
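The layer settings above could be expressed in Keras roughly as follows. This is a sketch, not the project's exact architecture: the number of convolution blocks, the filter counts (32 and 64), the width of the dense layer, and the 64x64x3 input size are all assumptions. Only the kernel size, pool size, dropout rate, and activations come from the description.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_task_cnn(num_classes, input_shape, kernel_size=3,
                   dropout_rate=0.5, hidden_activation="relu"):
    """Small CNN for one recognition task; depth and widths are illustrative."""
    return keras.Sequential([
        keras.Input(shape=input_shape),
        layers.Conv2D(32, kernel_size, activation="relu"),  # filter count assumed
        layers.MaxPooling2D(pool_size=2),
        layers.Conv2D(64, kernel_size, activation="relu"),  # filter count assumed
        layers.MaxPooling2D(pool_size=2),
        layers.Flatten(),
        layers.Dense(64, activation=hidden_activation),  # final hidden layer
        layers.Dropout(dropout_rate),
        layers.Dense(num_classes, activation="softmax"),
    ])


# Settings reported for most tasks, shown for the three-class scene level task
scene_model = build_task_cnn(num_classes=3, input_shape=(64, 64, 3))

# Settings reported for the last task: kernel 5, dropout 0.1, tanh hidden layer
damage_type_model = build_task_cnn(num_classes=4, input_shape=(64, 64, 3),
                                   kernel_size=5, dropout_rate=0.1,
                                   hidden_activation="tanh")
```

Keeping one builder function with per-task arguments makes the differences between the eight models explicit.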

Because accuracy was an issue overall, we used a low learning rate of 0.0001 with decay and a high epoch count of 150 per recognition task. We then compiled the models with a categorical cross-entropy loss, which matches the softmax output layer and the label encoding, and with accuracy as the metric. We also added image augmentation through the image data generator: rescaling, a rotation range of 30, a width shift range of 0.1, a height shift range of 0.2, a zoom range of 0.3, a shear range of 0.2, horizontal and vertical flips, and a fill mode of nearest.
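The training settings above can be collected in one place, as in the sketch below. Several details are assumptions: the rescale factor of 1/255, the Adam optimizer, and the helper names. The decay schedule is not specified in the text, so it is left out. Recent TensorFlow documentation marks ImageDataGenerator as deprecated, so treat that call as matching the original workflow, not as a recommendation.

```python
# Augmentation settings as listed in the text; the rescale factor is an assumption
AUGMENTATION = {
    "rescale": 1.0 / 255,
    "rotation_range": 30,
    "width_shift_range": 0.1,
    "height_shift_range": 0.2,
    "zoom_range": 0.3,
    "shear_range": 0.2,
    "horizontal_flip": True,
    "vertical_flip": True,
    "fill_mode": "nearest",
}
LEARNING_RATE = 1e-4
EPOCHS = 150


def make_augmenter():
    """Build the Keras image data generator from the settings above."""
    # Imported here so the settings can be inspected without TensorFlow
    from tensorflow.keras.preprocessing.image import ImageDataGenerator
    return ImageDataGenerator(**AUGMENTATION)


def compile_task_model(model):
    """Compile one task model; the choice of Adam is an assumption."""
    from tensorflow import keras
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=LEARNING_RATE),
        loss="categorical_crossentropy",  # expects one-hot labels
        metrics=["accuracy"],
    )
    return model
```

A compiled model would then be trained with something like model.fit(make_augmenter().flow(x_train, y_train), epochs=EPOCHS, validation_data=(x_val, y_val)), where x_train, y_train, x_val, and y_val are placeholders for the arrays of one task.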

We then fit each model independently, each in a cell of its own, and plotted training and validation accuracy as well as training and validation loss.
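The curves could be drawn with a small helper like the one below. It assumes the dictionary layout that Keras exposes as History.history when accuracy is the metric (keys accuracy, val_accuracy, loss, val_loss). The numbers in demo_history are placeholders so the sketch runs; they are not results from this project.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import matplotlib.pyplot as plt


def plot_history(history, title=""):
    """Plot accuracy and loss curves for one recognition task."""
    fig, (ax_acc, ax_loss) = plt.subplots(1, 2, figsize=(10, 4))
    ax_acc.plot(history["accuracy"], label="train")
    ax_acc.plot(history["val_accuracy"], label="validation")
    ax_acc.set_xlabel("epoch")
    ax_acc.set_ylabel("accuracy")
    ax_acc.legend()
    ax_loss.plot(history["loss"], label="train")
    ax_loss.plot(history["val_loss"], label="validation")
    ax_loss.set_xlabel("epoch")
    ax_loss.set_ylabel("loss")
    ax_loss.legend()
    fig.suptitle(title)
    return fig


# Placeholder values only, to exercise the function
demo_history = {
    "accuracy": [0.1, 0.2, 0.3],
    "val_accuracy": [0.1, 0.15, 0.25],
    "loss": [3.0, 2.0, 1.0],
    "val_loss": [3.0, 2.5, 1.5],
}
fig = plot_history(demo_history, title="placeholder curves")
```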

After that we ran each model on the test set for its recognition task and calculated its test accuracy.
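Test accuracy for a softmax classifier is the fraction of test images whose highest-probability class matches the label. A minimal sketch, assuming one-hot labels and using a hypothetical helper name and toy arrays in place of real predictions:

```python
import numpy as np


def top1_accuracy(probabilities, one_hot_labels):
    """Fraction of rows where the most probable class equals the true class."""
    predicted = np.argmax(probabilities, axis=1)
    actual = np.argmax(one_hot_labels, axis=1)
    return float(np.mean(predicted == actual))


# Toy softmax outputs for four images in a three-class task
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],  # predicted class 0, true class 1
])
truth = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
])
demo_accuracy = top1_accuracy(probs, truth)
print(demo_accuracy)  # 0.75: three of the four toy predictions are correct
```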

Finally, we set up the last cell so that a single click uploads an image, feeds it to the eight models, predicts its class for each recognition task, and then displays the image with its corresponding classifications.
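The prediction part of that final cell might look like the sketch below. The upload and display steps are omitted. The stub predictors stand in for the trained models' predict methods, and the function names are hypothetical. Results are reported as class indices because the class names are not listed in this document.

```python
import numpy as np


def classify_image(image, predictors):
    """Run one image through every task model; return (class index, confidence)."""
    batch = np.expand_dims(image, axis=0)  # models expect a batch dimension
    results = {}
    for task, predict in predictors.items():
        probs = np.asarray(predict(batch))[0]
        results[task] = (int(np.argmax(probs)), float(np.max(probs)))
    return results


def stub_predictor(num_classes):
    """Stand-in for a trained model's predict method: always favors class 0."""
    def predict(batch):
        probs = np.full((len(batch), num_classes), 0.1 / (num_classes - 1))
        probs[:, 0] = 0.9
        return probs
    return predict


# In practice each value would be the predict method of that task's model
predictors = {
    "scene level": stub_predictor(3),
    "damage state": stub_predictor(2),
    "spalling condition": stub_predictor(2),
    "material type": stub_predictor(2),
    "collapse mode": stub_predictor(3),
    "component type": stub_predictor(4),
    "damage level": stub_predictor(4),
    "damage type": stub_predictor(4),
}

report = classify_image(np.zeros((64, 64, 3)), predictors)
for task, (class_index, confidence) in report.items():
    print(f"{task}: class {class_index} ({confidence:.2f})")
```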

The biggest challenge in this project was that each of the 8 recognition tasks needed its own fine-tuning and a large amount of processing power. Running all 8 models was extremely difficult even on a reasonably capable PC, so the project was moved to Google Colab to take advantage of its faster, remote processing.

However, even with Google Colab’s high processing speed it was a challenge to run all 8 models without any issue.

The Dataset

Both AI and machine learning (ML) technologies have developed rapidly in recent decades, especially in the application of deep learning (DL) to computer vision (CV). The objective of ML and DL implementation is to have computers perform labor-intensive repetitive tasks while simultaneously “learning” from those tasks. Both ML and DL fall within the scope of empirical study, where data is the most essential component. In vision-based Structural Health Monitoring (SHM), using images as data media is currently an active research direction. Structural images obtained from reconnaissance efforts or daily life are playing an increasing role, as the success of ML and DL is contingent on the volume of data available. The expectation is that computers will eventually be able to recognize structural damage autonomously, both in daily life under service conditions and after an extreme event such as a large earthquake or extreme wind. Until now, vision-based SHM applications have not fully benefited from data-driven CV technologies, even as interest in this topic keeps increasing. Their application to structural engineering has been held back mainly by two factors: (1) the lack of general automated detection principles or frameworks based on domain knowledge; and (2) the lack of benchmark datasets with large amounts of well-labeled data [5].

To address these drawbacks, there was a recent effort to build a large-scale open-source structural image database: the PEER (Pacific Earthquake Engineering Research Center) Hub ImageNet (PHI-Net, also written Φ-Net). As of November 2019, the Φ-Net dataset contains 36,413 images with multiple attributes for the following baseline recognition tasks: scene level classification, structural component type identification, crack existence check and damage level detection. Φ-Net uses a hierarchy-tree framework for automated structural detection tasks, founded on past experience from reconnaissance efforts after earthquakes and other hazards. Through a tree-branch mechanism, each structural image can be clustered into several subcategories representing detection tasks. This acts as a filtering operation that decreases the complexity of the problem and improves the performance of the automated algorithms. To the best of the authors’ knowledge, until now there was no open-source structural image dataset with multi-attribute labels and this volume of images in the vision-based structural health monitoring area. It is believed that this image dataset and its corresponding detection tasks and framework will provide the necessary benchmark for future studies of deep learning in vision-based structural health monitoring.

Analogous to the classification and localization tasks in the ImageNet challenge, the goal of the Φ-Net framework was to construct similar recognition tasks designed for structural damage recognition and evaluation. Based on past experience from reconnaissance efforts (Sezen et al. [2003]; Li and Mosalam [2013]; Mosalam et al. [2014]; and Koch et al. [2015]), several issues affect the safety of structures post-event: the type of damaged component, the severity of damage in the component, and the type of damage. Because images collected from reconnaissance efforts vary broadly in distance from the object, camera angle, and emphasized target, it is useful to cluster the images into different levels. Images taken from a very close distance or containing only part of a component belong to the pixel level; images whose major targets are single or multiple components belong to the object level; and images containing most of the structure belong to the structural level. The corresponding evaluation criteria also differ between levels: images at the pixel level are more related to material type and damage status, while images at the structural level are more related to structural type and failure status [5]. The new processing framework, with the hierarchy-tree structure shown in Figure 1, classifies images as follows: (1) a raw image is clustered into one of the scene levels; (2) according to its level, the corresponding recognition tasks are applied layer by layer following the hierarchy; and (3) each node is seen as one recognition task or classifier, and the output of each node is seen as a characteristic or feature of the image that helps with further analysis and decision making if required [5].

The current Φ-Net defines the following eight benchmark classification tasks:

three-class classification for scene level;
binary classification for damage state;
binary classification for spalling condition (material loss);
binary classification for material type;
three-class classification for collapse mode;
four-class classification for component type;
four-class classification for damage level; and
four-class classification for damage type.
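The eight tasks and their class counts can be kept in a small registry, which is also enough to build the one-hot label vectors that a softmax output expects. This is an illustrative sketch; the names TASKS and one_hot are not taken from the project code.

```python
# Number of classes per benchmark task, as listed above
TASKS = {
    "scene level": 3,
    "damage state": 2,
    "spalling condition": 2,
    "material type": 2,
    "collapse mode": 3,
    "component type": 4,
    "damage level": 4,
    "damage type": 4,
}


def one_hot(task, class_index):
    """Encode a class index as a one-hot vector sized for the given task."""
    num_classes = TASKS[task]
    if not 0 <= class_index < num_classes:
        raise ValueError(f"{task} has classes 0..{num_classes - 1}")
    return [1.0 if i == class_index else 0.0 for i in range(num_classes)]


print(one_hot("collapse mode", 2))
```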

The proposed framework with a hierarchy-tree structure is depicted in Figure 1, where grey boxes represent the detection tasks for the corresponding attributes, white boxes within the dashed lines are the possible labels of the attribute chosen in each task, and ellipses in boxes represent other choices or conditions. In the detection procedure, one starts from the root node, and recognition tasks are conducted layer by layer and node by node (grey box) until a leaf node is reached. The output label of each node describes a structural attribute. Each structural image may have multiple attributes, e.g., one image can be categorized as being at the pixel level, concrete, and in a damaged state. In the terminology of CV, this is considered a multi-attribute (multi-label) classification problem. Given that this is a pilot study, these attributes are treated independently at this stage. Further extensions, such as a multi-label version of Φ-Net, are left to future studies.
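The layer-by-layer procedure can be sketched as a breadth-first walk over a table that maps a task and its output label to the tasks that follow. The routing table below is a hypothetical fragment loosely based on the level descriptions in this section; it is not the tree from Figure 1. The classify callable and its fixed answers are stand-ins for trained models.

```python
from collections import deque

# Hypothetical routing: (task, label) -> tasks to run next. NOT the Figure 1 tree.
CHILDREN = {
    ("scene level", "pixel"): ["damage state", "material type"],
    ("scene level", "object"): ["component type"],
    ("scene level", "structural"): ["collapse mode"],
}


def walk_hierarchy(image, classify, children=CHILDREN, root="scene level"):
    """Apply tasks layer by layer, starting at the root, until no tasks remain."""
    attributes = {}
    pending = deque([root])
    while pending:
        task = pending.popleft()
        label = classify(task, image)
        attributes[task] = label  # each node's output is one image attribute
        pending.extend(children.get((task, label), []))
    return attributes


# Stand-in classifier that returns fixed labels instead of running a model
FIXED_LABELS = {
    "scene level": "pixel",
    "damage state": "damaged",
    "material type": "concrete",
}


def fake_classify(task, image):
    return FIXED_LABELS[task]


attributes = walk_hierarchy(image=None, classify=fake_classify)
print(attributes)
```

In this project the eight tasks are run independently on every image, so a walk like this would only matter if the tree were used to skip tasks that do not apply to a given scene level.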

The dataset is available at: https://apps.peer.berkeley.edu/phi-net/

YehyaAbdellatif / -Classification-of-Images-for-Building-Damage-Assessment
