Civil & Environmental Engineering

Prediction of concrete compressive strength using deep neural networks based on hyperparameter optimization

Article: 2297491 | Received 06 Sep 2023, Accepted 14 Dec 2023, Published online: 02 Feb 2024

Abstract

This paper describes deep neural network (DNN) models based on hyperparameter optimization for the prediction of the compressive strength of concrete. The novelty of this research lies in the implementation of optimized hyperparameters to train the DNN models with the aim of enhancing their predictive accuracy. Utilizing the Keras Tuner library, the most effective hyperparameters for the DNN models were identified. These models were then trained and evaluated using an experimental dataset of 1030 instances, encompassing nine quantitative variables that include various concrete mix ingredients and age. The target variable for all models is the compressive strength of concrete. The assessment of model performance, based on metrics like mean squared error (MSE) and R2 values, was conducted on previously unseen test data. The optimal configuration for the hidden layers in the model was identified as five layers, containing 12, 16, 16, 40, and 26 neurons, respectively. The optimal learning rate for the model was 0.001. With this set of optimal hyperparameters, the best DNN model achieved an MSE of 28.76 and an R2 value of 0.89 on the testing data. The results demonstrate a significant improvement in the DNN models' performance when trained with the optimized hyperparameters. In comparison to the regression model, the performance of the DNN models was significantly better. Adopting the predictive models developed in this research offers the potential for substantial cost and time savings by circumventing the need for labour-intensive and time-consuming laboratory tests.

1. Introduction

Cement concrete is an extensively used building material, composed of four main components: cement, sand, aggregates, and water. Concrete has several advantages that make it a popular choice in the construction industry. A major advantage of concrete is its high compressive strength and durability, which makes it suitable for the construction of buildings, bridges, roads, and dams. Concrete is a highly versatile material that offers immense flexibility in terms of shape, size, and design. It can be moulded into various forms to meet specific project requirements. Concrete can be pumped into high-rise buildings, which is a highly desirable property in modern construction (Wang et al., Citation2023). This versatility allows for creative architectural designs and innovative construction solutions. Concrete is inherently fire-resistant, making it a safe choice for constructing fire-resistant structures. The major ingredients required to prepare cement concrete, such as cement, sand, and aggregates, are readily available and relatively affordable, making it an economically viable option for construction projects of all scales. These advantages make concrete a highly desirable material for various construction projects, ensuring strength, durability, versatility, and cost-effectiveness.

To prepare concrete of desired compressive strength and workability, the optimal proportions of ingredients are determined using the mix design process. Mix design, also known as concrete mix proportioning, involves two essential steps: (1) selecting the appropriate ingredients, and (2) determining their relative quantities to achieve concrete with optimal workability, strength, and durability while ensuring cost-effectiveness (Wang & Wu, Citation2023). The proportions of these ingredients are influenced by their specific characteristics. When using predetermined concrete-making materials, the variables to consider include the ratio of cement to aggregate, the water-cement ratio, the ratio of sand to coarse aggregate in the aggregate, and the amount of admixtures. Ultimately, the primary goal of mix design is to choose suitable ingredients from available materials and identify the most cost-efficient combination that will result in concrete meeting desired performance requirements.

Determination of the compressive strength of concrete prepared using different proportions of ingredients is a pre-requisite for the design of structures, as it directly impacts the safety and performance of structures. The traditional approach to the determination of the compressive strength of cement concrete is based on laboratory testing of concrete cubes or cylinders at a predefined age (Larrard & Sedran, Citation2002). However, the testing process is expensive and time-consuming. Moreover, various factors can influence the measured compressive strength of concrete. These factors include the preparation of test specimens, the conditions under which the specimens are cured, the rate at which they are loaded during testing, and the calibration of the testing apparatus. Inconsistencies in any of these aspects can introduce variability in the results, affecting the reliability of the measured compressive strength. Maintaining consistent and standardized testing procedures is crucial to minimize this variability. A combination of careful experimental design, proper sample preparation, and standardized testing procedures is required for a reliable estimate of the compressive strength of concrete.

Compressive strength of concrete depends upon several factors, including its age and the proportions of the different ingredients used to prepare it. There is a complex non-linear relationship between the proportions of the different ingredients and the compressive strength of concrete. This relationship is even more complex for high-performance concrete (HPC). Apart from the three fundamental components found in traditional concrete (Portland cement, fine and coarse aggregates, and water), HPC requires the inclusion of supplementary cementitious materials like fly ash and blast furnace slag, as well as chemical admixtures, such as superplasticizers (Chou & Pham, Citation2013; Stoński, Citation2010). Traditional experimental testing of concrete to evaluate the effects of the proportions of different ingredients in the concrete mix, particularly for HPC, is costly and time-consuming. Further, the existing empirical equations provided in codes and standards, which are used to estimate compressive strength, were developed using concrete test data that did not incorporate supplementary cementitious materials.

Several machine learning models have been applied to predict the compressive strength of different types of concrete in the past. For example, Lin and Wu (Citation2021) used a back propagation (BP) network with one hidden layer, with the number of neurons in the hidden layer determined by judgement. Most studies have utilized hyperparameters based on manual tuning, which is inefficient and time-consuming. Therefore, the major aim of the present research was to develop machine learning models based on deep neural networks (DNNs) that utilize optimized hyperparameters to predict the compressive strength of concrete using a set of ingredients and the age of concrete as inputs. Identifying an optimal hyperparameter set is critical for improving the ability of the model to learn and generalize from the data during the training stage. The search space of possible hyperparameter combinations is often vast, and manually fine-tuning them can be time-consuming and inefficient. Hyperparameter optimization (Kong et al., Citation2023), used in the present research, automates the search process to find the best configuration. The novel feature of the DNN models developed herein is the use of hyperparameter optimization for the determination of optimal hyperparameter sets with a view to improving the performance of the DNN models.

2. Related literature

Numerous applications of ML techniques to civil engineering problems have been reported in the literature (e.g. Asim et al., Citation2021; Kaya, Citation2010; Kumar et al., Citation2023; Pandey et al., Citation2018). Applications of ML techniques to the prediction of compressive strength of concrete, in particular, have been investigated in different forms by several researchers. For example, Stoński (Citation2010) investigated the application of a Bayesian approach combined with Monte-Carlo simulation for regression modelling based on ANN for the prediction of compressive strength of high performance concrete (HPC). Wei et al. (Citation2012) established a relationship between the standard compressive strength of concrete at 28 days and the resistivity of cement pastes at 24 h. An ensemble approach for predicting the compressive strength of HPC was proposed by Chou and Pham (Citation2013). The results of the analysis indicated that the ensemble technique, which combines two or more models, achieved the highest level of prediction performance. Yuan et al. (Citation2014) proposed two hybrid models: the genetic-based algorithm and the adaptive network-based fuzzy inference system (ANFIS) for the prediction of the compressive strength of standard concrete. The results indicate that both hybrid models exhibit good performance in terms of prediction accuracy and practical applicability in real production scenarios. Erdal (Citation2013) describes the application of three different ensemble approaches for the prediction of compressive strength of concrete. Aiyer et al. (Citation2014) describe the application of Least Square Support Vector Machine (LSSVM) and Relevance Vector Machine (RVM) techniques for the determination of compressive strength of self-compacting concrete. The authors concluded that RVM is a robust technique for the determination of compressive strength of self-compacting concrete. Pham et al. 
(Citation2015) developed a hybrid model that combines the firefly algorithm with least squares support vector regression for predicting the compressive strength of HPC. The experimental findings indicated that the hybrid approach shows great promise as an alternative method for predicting the strength of HPC.

Omran et al. (Citation2016) investigated the effectiveness of nine different data mining models in forecasting the compressive strength of a new concrete variant, which incorporates three alternative materials: fly ash, Haydite lightweight aggregate, and Portland limestone cement. Khashman and Akpinar (Citation2017) describe the application of a classification model based on logistic regression to classify the compressive strength of various concrete mixes into low, moderate, or high strength categories. Deng et al. (Citation2018) proposed a machine learning model based on Convolutional Neural Network (CNN) for the prediction of compressive strength of recycled concrete. The authors concluded that the CNN model performed better than the traditional ANN model. Feng et al. (Citation2020) described the application of an adaptive boosting approach for the prediction of compressive strength of concrete using a set of ingredients and the curing time as inputs. The proposed method was also evaluated against artificial neural network (ANN) and support vector machine (SVM). Kaloop, Kumar, et al. (Citation2020) investigated the application of the multivariate adaptive regression spline model as a feature extraction method to identify and extract the most suitable inputs for designing HPC. Subsequently, the extracted features were fed into a gradient tree boosting machine learning technique to predict the compressive strength of concrete. Kaloop, Samui, et al. (Citation2020) describe the application of four soft-computing approaches for the estimation of slump flow and compressive strength of concrete. Liu et al. (Citation2021) describe the application of machine learning models for the prediction of carbonation depth for recycled aggregate concrete. Asteris et al. (Citation2021) utilized a collection of 1030 instances of experimental data available in the machine learning repository of the University of California, Irvine to develop a hybrid ensemble model based on ANN.

Tran et al. (Citation2022) describe the application of hybrid models for the prediction of the compressive strength of concrete made with recycled concrete aggregates. The authors concluded that the gradient boosting model combined with particle swarm optimization yielded the highest prediction accuracy. Ly et al. (Citation2021) describe the development of a DNN model to predict the compressive strength of rubber concrete. The DNN model proposed by the authors demonstrated better prediction accuracy compared to previously published results obtained from various machine learning algorithms. Nguyen et al. (Citation2021) implemented four predictive algorithms: support vector regression (SVR), multilayer perceptron (MLP), gradient boosting regressor (GBR), and extreme gradient boosting (XGBoost) for predicting the compressive and tensile strengths of high-performance concrete (HPC). The results of the study revealed that GBR and XGBoost models exhibit superior performance compared to SVR and MLP. Amiri and Hatami (Citation2022) developed an ANN model that utilizes the characteristics of reference samples to predict the mechanical and durability properties of waste-included concrete. Shishegaran et al. (Citation2021) proposed a novel hybrid model based on High Correlated Variables Creator Machine (HCVCM) for predicting the compressive strength of concrete by incorporating ultrasonic pulse velocity and rebound number data. The authors found that combining HCVCM with Adaptive Neuro-Fuzzy Inference System improved the prediction capabilities of the model significantly. Khursheed et al. (Citation2021) describe the application of several machine learning techniques for predicting compressive strength of fly ash concrete. Li and Song (Citation2022) employed four ensemble learning models, namely AdaBoost, GBDT, XGBoost, and random forest, to predict compressive strength of HPC. 
The authors concluded that the GBDT model exhibited better performance than the other machine learning models used in their research. Bello et al. (Citation2022) describe the application of DNNs to predict the required water for a normal concrete mix. Chi et al. (Citation2023) included electrical resistivity as an additional input parameter to predict the compressive strength of concrete. When the resistivity was considered as an input variable, the accuracy of the prediction model showed a significant improvement.

Numerous research studies have emphasized the effectiveness of ANNs in predicting variables of interest. However, configuring an ANN with an appropriate network structure remains a daunting task. The configuration process involves crucial decisions, such as determining the size of input layers, setting the learning rate, selecting the number of hidden layers and nodes in each layer, along with other related parameters. Typically, a trial-and-error method based upon user intuition and judgement is employed to discover the optimal parameter set. Often, utilizing such parameters obtained through trial and error might not be the most appropriate solution for the specific problem being addressed. Consequently, the full potential of ANNs might not be realized if suboptimal parameters are implemented. Therefore, the major focus of the present research is to develop DNN models based on hyperparameter optimization for predicting the compressive strength of concrete.

3. Machine learning techniques

With the advancements in artificial intelligence (AI), there has been a growing trend of utilizing machine learning (ML) techniques to solve non-linear regression problems. ML, being a subset of AI, serves a range of objectives, including classification, regression, clustering, and more. In comparison to conventional regression methods, ML employs algorithms capable of learning directly from input data, resulting in highly accurate predictions. ANNs are an important component of ML and are designed to recognize patterns, learn from the input data, and make predictions or decisions. A typical ANN consists of interconnected nodes called neurons or units that are organized into layers. The first layer is an input layer and the last layer is an output layer. The one or more layers between the input and output layers are called hidden layers. Each unit in the input layer is provided with the input data, which is then processed using a mathematical function (activation function), and an output is produced by the model. The connections between the units are assigned weights, which determine the strength or importance of each connection. During training, ANNs adjust the weights of the connections to produce the desired output. By iteratively adjusting the weights based on the training data, ANNs can learn to recognize complex patterns and make accurate predictions or classifications. A network is classified as a DNN when it contains more than one hidden layer. DNNs have been utilized to address a wide variety of problems across different fields. For example, DNNs have been applied to image recognition (e.g. Zheng et al., Citation2023), speech recognition (e.g. Hema & Marquez, Citation2023), regression modelling (e.g. Asim et al., Citation2021), and a variety of classification problems (e.g. Dhillon et al., Citation2023).
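The layer-by-layer computation described above can be illustrated with a minimal NumPy sketch. The weights and biases below are made-up illustrative values, not taken from any trained model: each hidden neuron forms a weighted sum of its inputs plus a bias and applies an activation, while the output layer is linear, as is usual for regression.

```python
import numpy as np

def relu(z):
    # ReLU activation: max(0, z) applied element-wise
    return np.maximum(0.0, z)

def forward(x, weights, biases):
    """Propagate an input vector through dense layers: ReLU on hidden
    layers, linear output layer (typical for regression)."""
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = relu(W @ a + b)          # weighted sum plus bias, then activation
    return weights[-1] @ a + biases[-1]

# Toy network: 3 inputs -> 2 hidden neurons -> 1 output (illustrative weights)
W1 = np.array([[0.2, -0.5, 0.1],
               [0.4,  0.3, -0.2]])
b1 = np.array([0.1, -0.1])
W2 = np.array([[0.7, 0.5]])
b2 = np.array([0.05])

y = forward(np.array([1.0, 2.0, 3.0]), [W1, W2], [b1, b2])
```

Training would iteratively adjust `W1`, `b1`, `W2`, and `b2` to minimize the prediction error; only the forward pass is shown here.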

4. Dataset

The dataset utilized in this research is publicly available on www.kaggle.com and UCI (Citation2017). It comprises 1030 observations and consists of nine quantitative variables. The dataset is complete, as no missing values have been reported by the data providers. Table 1 provides descriptive statistics of all the input and output variables. To create the DNN and regression models, 70% of the experimental data was utilized for training the models, while the remaining 30% was used for testing purposes. For the efficient training of the DNN models, the input data variables have been normalized in the range of 0–1. The input variables (features) considered for the training of the DNN models include the amounts of cement (CM), slag (SL), fly ash (FL), water (WT), superplasticizer (SP), coarse aggregates (CA), and fine aggregates (FA). Additionally, the strength of concrete depends on its age (AG), and therefore AG is also included as an input variable.
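The normalization and 70/30 split described above can be sketched as follows. The array below is randomly generated as a stand-in for the real 1030-row dataset (the actual data would be loaded from the Kaggle/UCI file); the min-max scaling and shuffled split are the operations of interest.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical stand-in for the 1030-row dataset: 8 features and a target
X = rng.uniform(0.0, 100.0, size=(1030, 8))
y = rng.uniform(2.33, 82.56, size=1030)

# Min-max normalization of the inputs into the range [0, 1]
X_min, X_max = X.min(axis=0), X.max(axis=0)
X_norm = (X - X_min) / (X_max - X_min)

# Shuffled 70/30 train/test split (721 training rows, 309 test rows)
idx = rng.permutation(len(X_norm))
n_train = round(0.7 * len(X_norm))   # 721
train_idx, test_idx = idx[:n_train], idx[n_train:]
X_train, X_test = X_norm[train_idx], X_norm[test_idx]
y_train, y_test = y[train_idx], y[test_idx]
```

Note that the scaling parameters (`X_min`, `X_max`) should in practice be computed from the training split only and reused for the test split, to avoid information leakage.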

Table 1. Descriptive statistics.

Table 1 shows the ranges of different components, such as SL, FL, and SP, in the dataset. The range of SL values indicates that the dataset comprises experiments conducted both with and without SL. Similarly, the range of FL values suggests that the dataset includes experiments conducted with and without FL. Additionally, the range of SP values in Table 1 shows that the dataset includes experiments with and without the addition of superplasticizers. The maximum (82.56) and minimum (2.33) values of compressive strength shown in Table 1 indicate that the dataset includes experiments conducted on low-grade to very high-grade concrete mixes.

The coefficients of correlation between the different variables are shown in Figure 1. Interpreting the correlation coefficients is crucial for understanding the relationships between concrete compressive strength (CS) and the various ingredients (CM, SL, FL, WT, SP, CA, FA) and the age (AG) of concrete. The correlation coefficient between CS and cement content (CM) is 0.5, the highest among all correlations. This suggests that higher cement content is generally associated with increased compressive strength in concrete. A negative correlation of −0.3 was observed between CS and WT, implying that a higher water content (i.e. a higher water-to-cement ratio) leads to lower CS. There is a positive correlation of 0.4 between SP and CS, suggesting that a higher amount of superplasticizer is associated with increased compressive strength. There is a weak negative correlation of CS with SL, FL, CA, and FA. The correlation coefficient of 0.3 between CS and AG indicates a low positive correlation, implying that as the age of concrete increases, its compressive strength also tends to increase.
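Pairwise Pearson correlation coefficients like those in Figure 1 can be computed directly with NumPy (or with `DataFrame.corr()` in pandas). The two short columns below are illustrative values standing in for the cement (CM) and strength (CS) columns of the dataset, not rows quoted from the paper:

```python
import numpy as np

# Illustrative stand-ins for the cement content (CM) and compressive
# strength (CS) columns of the dataset
cm = np.array([540.0, 332.5, 198.6, 266.0, 380.0, 475.0])
cs = np.array([79.99, 40.27, 28.02, 45.85, 43.70, 74.19])

# Pearson correlation coefficient between the two columns
r = np.corrcoef(cm, cs)[0, 1]
```

For the full matrix of Figure 1, `np.corrcoef` would be applied to all nine columns at once (features stacked as rows, or a transposed data matrix).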

Figure 1. Coefficients of cross correlation between different variables.

5. Methodology

The methodology adopted in this research involved pre-processing the available data, using the Keras Tuner library (O’Malley et al., Citation2019) to find the best hyperparameters of the DNN models, constructing the DNN models with these hyperparameters, evaluating model performance on unseen test data using metrics based on MSE and R2 values, and selecting the best models based on their performance. In this research, two sets of DNNs were developed: one using a user-defined set of hyperparameters, and the other employing optimal hyperparameters determined using the KerasTuner library in Python. To facilitate the comparison of results from the DNN models, a multiple linear regression (MLR) model was also constructed. A flow chart of the methodology used in this research is shown in Figure 2.

Figure 2. Methodology.

5.1. Deep neural networks

Neural networks are widely recognized as highly effective learning algorithms in the field of machine learning. A typical neural network architecture consists of an input layer, one or more hidden layers, and an output layer. When a network has multiple hidden layers, it is referred to as a DNN. The rise in popularity of DNNs has largely been driven by the abundant availability of large computing resources in recent times. Researchers and practitioners have applied DNNs to a wide range of complex problems across various domains, and numerous applications can be found in the literature. For instance, Asim et al. (Citation2021), Bello et al. (Citation2022), and Pandey et al. (Citation2018) are some of the studies that have utilized DNNs to solve complex problems successfully. The adaptability and effectiveness of DNNs have contributed to their widespread use and have opened up new possibilities for solving challenging real-world problems.

The efficient performance of a neural network heavily relies on its architecture, which is defined by the interconnection pattern among neurons and consists of the input layer, hidden layers, and an output layer. Determining the input and output layers is relatively straightforward: the input layer should have as many neurons as there are independent variables in the problem, while the output layer should match the number of dependent variables. In classification tasks, the output layer typically contains more than one neuron to accommodate multiple classes. Two critical factors that significantly impact the neural network's performance are the number of hidden layers and the number of neurons in each hidden layer. Figure 3 shows the general structure of a DNN, and Figure 4 shows the functioning of a typical neuron.

Figure 3. A typical DNN.

Figure 4. Working of a neuron in a DNN.

The most commonly used activation functions for regression problems at present are the rectified linear unit (ReLU) and tanh functions. The ReLU function is defined as:

(1) f(z) = z for z > 0, and f(z) = 0 for z ≤ 0; equivalently, f(z) = max(0, z)

The output of the ReLU function is the same as the input when the input is positive, but it becomes zero when the input is negative or zero. At each node of a layer, each input is multiplied by its interconnection weight, the weighted inputs are summed, and the bias is added. The sum is then passed to the ReLU function, which produces the output according to the rule in Equation (1).
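The weighted-sum-plus-bias computation at a single node can be sketched numerically. The input, weight, and bias values below are purely illustrative:

```python
import numpy as np

def relu(z):
    # ReLU: passes positive inputs through unchanged, clips negatives to zero
    return np.maximum(0.0, z)

# One neuron: weighted sum of the inputs plus a bias, then the activation
x = np.array([0.5, 0.2, 0.8])     # normalized inputs (illustrative)
w = np.array([0.4, -0.6, 0.3])    # interconnection weights (illustrative)
b = 0.1                           # bias
z = w @ x + b                     # pre-activation value
out = relu(z)
```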

The hyperbolic tangent activation function, also known as tanh, is a popular non-linear activation function extensively employed in neural networks. It maps the input values to the range of [−1, 1], which makes it a useful choice for models where the output needs to be bounded.

The tanh activation function can be mathematically represented as:

(2) f(z) = (e^z − e^(−z)) / (e^z + e^(−z))

A significant advantage of the tanh function is that its output is centered at 0, unlike the output of the ReLU function, which is never negative. This zero-centered property keeps the activations balanced around zero, which can improve the training process in deeper networks.
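Equation (2) can be evaluated directly to confirm the properties described above, namely that tanh is zero at the origin and bounded in (−1, 1). The input values are arbitrary sample points:

```python
import numpy as np

def tanh(z):
    # Equation (2): (e^z - e^-z) / (e^z + e^-z); output bounded in (-1, 1)
    return (np.exp(z) - np.exp(-z)) / (np.exp(z) + np.exp(-z))

# Sample points across negative, zero, and positive inputs
z = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])
out = tanh(z)
```

In practice one would use `np.tanh` (or the framework's built-in activation) rather than this explicit formula, which can overflow for large |z|.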

5.2. Hyperparameter optimization

Hyperparameters in a DNN are the parameters that are set before the training process begins. Unlike the model's internal parameters (weights and biases), hyperparameters are not learned from the data during training; instead, they are predefined by the user based on judgement and intuition. The choice of hyperparameters significantly affects the performance of the neural network and its ability to learn and generalize from the data. Therefore, optimization of hyperparameters is important for enhancing the performance of DNNs. Common hyperparameters in a DNN include the number of hidden layers, the number of nodes in each hidden layer, the learning rate, the activation function, the batch size, the number of epochs, and the type of optimizer. Techniques such as grid search, random search, and Bayesian optimization are often used to determine the best set of hyperparameters for a given problem. In the present study, hyperparameter optimization was performed using the Keras Tuner library in Python, with the Adam optimizer (Kingma & Ba, Citation2014). The user defines a model-building function that assigns hyperparameter values, such as the number of layers, the number of nodes in each layer, and the learning rate, in a randomized manner to construct a DNN. The Keras Tuner then trains the candidate networks and returns the hyperparameter values that minimize the MSE.
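Independently of any particular library, the random-search idea above can be illustrated with a self-contained sketch. The search space mirrors the kinds of ranges tuned in this study, and `toy_objective` is a made-up stand-in for the validation MSE that a real tuner would obtain by training a DNN for each trial:

```python
import random

# Hypothetical search space: depth, width, and learning rate
SEARCH_SPACE = {
    "num_layers": list(range(1, 6)),        # 1 to 5 hidden layers
    "units": list(range(8, 49, 4)),         # neurons per hidden layer
    "learning_rate": [1e-2, 1e-3, 1e-4],
}

def toy_objective(config):
    """Stand-in for validation MSE; a real tuner would train a DNN here."""
    lr_penalty = {1e-2: 5.0, 1e-3: 0.0, 1e-4: 3.0}[config["learning_rate"]]
    depth_penalty = abs(config["num_layers"] - 5) * 2.0
    return 28.0 + lr_penalty + depth_penalty

def random_search(n_trials, seed=0):
    rng = random.Random(seed)
    best_config, best_score = None, float("inf")
    for _ in range(n_trials):
        n = rng.choice(SEARCH_SPACE["num_layers"])
        config = {
            "num_layers": n,
            "units": [rng.choice(SEARCH_SPACE["units"]) for _ in range(n)],
            "learning_rate": rng.choice(SEARCH_SPACE["learning_rate"]),
        }
        score = toy_objective(config)
        if score < best_score:          # keep the best trial seen so far
            best_config, best_score = config, score
    return best_config, best_score

best, score = random_search(n_trials=60)
```

The point of the sketch is the loop structure: sample a configuration, evaluate it, and keep the best; grid search and Bayesian optimization differ only in how the next configuration is chosen.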

5.3. Keras Tuner Library

KerasTuner, an open-source hyperparameter optimization framework for Keras and TensorFlow in Python, was used to find the optimal hyperparameter sets for the DNN models. It helps researchers and practitioners efficiently search for the optimal hyperparameters of their machine learning models. KerasTuner provides a high-level API that allows the user to define the hyperparameter search space and choose from different hyperparameter tuning algorithms, such as random search, grid search, and Bayesian optimization. The library supports tuning hyperparameters for various machine learning models, including deep learning models such as neural networks, convolutional neural networks (CNNs), and recurrent neural networks (RNNs).

5.4. Regression modelling

MLR can be a valuable tool for predicting the compressive strength of concrete, which is influenced by multiple factors, such as cement content, water-to-cement ratio, aggregate properties, and curing age. Multiple linear regression allows us to include all these predictors simultaneously in the model, capturing their combined effects on compressive strength. Therefore, an MLR model based on the set of independent variables, namely CM, SL, FL, WT, FA, CA, SP, and AG was developed to predict the compressive strength of concrete. Results from the MLR model were compared with the DNN models developed herein.
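An MLR model of this kind reduces to ordinary least squares. The sketch below fits coefficients for eight predictors plus an intercept using NumPy; the data are synthetic, generated from a known linear rule so the recovered coefficients can be checked (the real model would be fitted to the concrete dataset):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 8 predictors standing in for CM, SL, FL, WT, SP, CA,
# FA, and AG, with a target generated from a known linear rule plus noise
X = rng.normal(size=(200, 8))
true_coef = np.array([0.5, -0.1, 0.05, -0.3, 0.4, -0.05, -0.05, 0.3])
y = X @ true_coef + 2.0 + rng.normal(scale=0.01, size=200)

# Ordinary least squares: append an intercept column and solve
A = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
y_pred = A @ coef
```

Equivalently, `sklearn.linear_model.LinearRegression` could be used; the closed-form least-squares solve is shown here to keep the sketch dependency-light.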

5.5. Model runs

A total of five different models have been investigated in this research. Four of these models were different configurations of DNNs, while the fifth was based on MLR. Two activation functions, namely ReLU and tanh, have been explored. To allow for a more controlled comparison of the different model configurations, the Adam optimizer described by Kingma and Ba (Citation2014) was used for all model runs. Adam is a substantial improvement over classical stochastic gradient descent, and its efficiency in determining the optimal parameter set for training DNNs makes it a preferred choice for this research. With Adam as the optimizer, the Keras Tuner library in Python was used to determine the optimal number of hidden layers, the number of neurons in each hidden layer, and the learning rate. A batch size of 16 was used for all model runs, and the maximum number of epochs was set to 500 together with an early stopping criterion. The optimal values of the number of hidden layers, the number of neurons in each hidden layer, and the learning rate are shown in Table 2.
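The early stopping criterion used in the model runs can be expressed as a simple rule: stop once the validation loss has failed to improve for a chosen number of consecutive epochs (the patience). The function below is a plain-Python sketch of that rule; in Keras the same behaviour is typically obtained with the `tf.keras.callbacks.EarlyStopping` callback.

```python
def early_stopping(val_losses, patience=10, min_delta=0.0):
    """Return the epoch at which training should stop, or None.

    Stops when the validation loss has not improved by at least
    min_delta for `patience` consecutive epochs.
    """
    best, best_epoch = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best, best_epoch = loss, epoch   # new best: reset the counter
        elif epoch - best_epoch >= patience:
            return epoch                     # patience exhausted
    return None                              # never triggered
```

With `restore_best_weights=True`, the Keras callback additionally rolls the model back to the weights from the best epoch rather than the stopping epoch.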

Table 2. Statistical performance of DNN and regression models.

6. Results and discussion

The dataset utilized in this study consists of 1030 experimental values. Among these, 721 observations were used for training the models, while the remaining 309 observations were employed to assess their performance. A total of four different configurations of DNN models were explored in this research. DNN1 and DNN2 used the ReLU and tanh activation functions, respectively, and were trained without hyperparameter optimization. The other two models, DNN3 and DNN4, used the ReLU and tanh activation functions, respectively, and were run with the set of optimal hyperparameters identified using the KerasTuner library in Python. Preliminary model runs indicated that a batch size of 16 is appropriate for training, and this batch size has therefore been used for all model runs. A learning rate of 0.001 was identified as the optimal choice and subsequently applied in all model runs. To avoid overfitting during training, the maximum number of epochs was set to 500, and an early stopping function was used to stop the training process when the model started to overfit the data.

6.1. Performance evaluation of DNN models

Two performance metrics were used to evaluate the performance of the DNN models; model evaluation encompasses both goodness of fit and absolute error measures. The performance metrics considered herein are the mean squared error (MSE) and the coefficient of determination (R2):

(3) MSE = (1/n) Σ_{i=1}^{n} (x_i − y_i)²

(4) R2 = [ (n Σ x_i y_i − Σ x_i Σ y_i) / ( √(n Σ x_i² − (Σ x_i)²) · √(n Σ y_i² − (Σ y_i)²) ) ]²

where x_i is the actual and y_i is the predicted compressive strength, and n is the number of data points.
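Both metrics are straightforward to transcribe into NumPy. Note that Equation (4) is the squared Pearson correlation between actual and predicted values; the implementation below follows that form, and the sample vectors are illustrative:

```python
import numpy as np

def mse(x, y):
    # Equation (3): mean squared error between actual (x) and predicted (y)
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return np.mean((x - y) ** 2)

def r_squared(x, y):
    # Equation (4): squared Pearson correlation coefficient
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n = len(x)
    num = n * np.sum(x * y) - np.sum(x) * np.sum(y)
    den = (np.sqrt(n * np.sum(x ** 2) - np.sum(x) ** 2)
           * np.sqrt(n * np.sum(y ** 2) - np.sum(y) ** 2))
    return (num / den) ** 2

# Illustrative actual and predicted compressive strengths (MPa)
actual = [20.0, 35.0, 50.0, 65.0]
predicted = [22.0, 33.0, 52.0, 63.0]
```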

R2 measures the degree of correlation between the measured and predicted values. In general, a higher R2 value indicates a better fit of the model to the data, as it suggests that a larger proportion of the variation in the dependent variable is explained by the independent variables. The MSE represents the mean of the squared errors between the predicted and measured values. Thus, lower MSE and higher R2 values indicate improved model performance.

Figure 5 shows the progression of the loss curve for DNN2, which consists of three hidden layers, each comprising eight neurons, and utilizes the tanh activation function. The loss curve represents the model error throughout the training and validation phases, measured using the MSE. In Figure 5, a rapid decrease in MSE from 1600 to 200 within approximately 100 epochs can be observed. Subsequently, the MSE shows a gradual decline, reaching around 30 after 250 epochs, with no further noticeable improvement. As part of the model setup, an early stopping criterion was implemented, leading to the termination of the training process after 353 epochs. A zig-zag loss curve typically suggests overfitting of the model. However, in the present case, both the training and validation loss curves exhibit smooth patterns, indicating that there is no overfitting of the model. Figure 6 shows the scatter plot of the actual and predicted values of compressive strength for the test data set. The plot indicates a good agreement between the two sets of values. However, the performance of the model in the prediction of lower compressive strengths is relatively better compared to the prediction of higher compressive strengths.

Figure 5. Progression of loss curve with epochs—DNN2.

Figure 6. Scatterplot of actual and predicted values of compressive strength—DNN2.

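The early stopping criterion used in these models monitors the validation loss and halts training once it stops improving. A dependency-free sketch of that logic follows; the `patience` and `min_delta` values are illustrative assumptions, since the paper does not state them (in Keras this role is played by the `EarlyStopping` callback with `monitor='val_loss'`):

```python
def early_stop_epoch(val_losses, patience=20, min_delta=0.0):
    """Return the 1-based epoch at which training stops: the first epoch
    after which the validation loss has failed to improve by more than
    min_delta for `patience` consecutive epochs. If the criterion never
    triggers, training runs to the last epoch."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:   # improvement: reset the counter
            best = loss
            wait = 0
        else:                         # no improvement this epoch
            wait += 1
            if wait >= patience:
                return epoch
    return len(val_losses)
```

Applied to the loss histories in Figures 5 and 7, a criterion of this form is what ended training after 353 epochs for DNN2 and 271 epochs for DNN4.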

Figure 7 shows the progression of the loss curve for the DNN4 model, which was created using the optimal set of hyperparameters identified with the KerasTuner library. The optimal number of hidden layers in DNN4 was five, with 12, 16, 16, 40, and 26 neurons, respectively. Initially, the MSE decreased sharply from 1550 to 50 in about 100 epochs. Subsequently, it declined gradually, reaching approximately 29 after 150 epochs, with no noticeable improvement beyond that point. To prevent overfitting, an early stopping criterion was set up in the model, and the training process stopped after 271 epochs.

Figure 7. Progression of loss curve with epochs—DNN4.


The scatter plot of actual and predicted compressive strengths for DNN4 is shown in Figure 8. The plot indicates good agreement between the two sets of values, with relatively better performance at lower compressive strengths. The scatter plot of predictions from DNN4 is similar to that of DNN2; however, the MSE on the testing data was 28.76 for DNN4 compared with 35.13 for DNN2.

Figure 8. Scatterplot of actual and predicted values of compressive strength—DNN4.


The MSE and R2 values obtained from the different DNN models are summarized in the results table. It is worth noting that DNN1 and DNN2 both had three hidden layers with eight neurons in each layer; DNN1 used the ReLU activation function, whereas DNN2 employed tanh. The DNN3 and DNN4 models used the optimal set of hyperparameters identified through KerasTuner instead of user-defined values. DNN1 had the highest testing MSE, 42.57, indicating that its predictions deviate most, on average, from the actual values; it also had the lowest R2 value, 0.84, indicating that it explains a slightly smaller proportion of the variance in compressive strength than the other models. For DNN2, the testing MSE and R2 values were 35.13 and 0.87, respectively, so the DNN2 model, using the tanh activation function, clearly outperformed DNN1, which employed ReLU. The lowest testing MSE, 28.76, was achieved by DNN4, making it the most accurate of the models, and the testing R2 values for DNN3 and DNN4 were relatively high (0.87 and 0.89), indicating that a large proportion of the variance in compressive strength is explained by the predictors in these models. Overall, the DNN4 model, combining the tanh activation function with the optimal set of hyperparameters, outperformed the remaining three DNN models.
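The hyperparameter search carried out with KerasTuner can be illustrated by a dependency-free random-search sketch over a similar space (number of hidden layers, neurons per layer, learning rate). The ranges below and the stubbed `evaluate` function are illustrative assumptions, not the paper's exact settings; in practice `evaluate` would train a candidate DNN and return its validation MSE:

```python
import random

# Illustrative search space of the kind explored with KerasTuner.
SPACE = {
    "n_layers": range(2, 6),           # 2-5 hidden layers
    "neurons": range(8, 48, 2),        # 8-46 neurons per layer
    "learning_rate": [1e-2, 1e-3, 1e-4],
}

def sample_config(rng):
    """Draw one candidate architecture from the search space."""
    n_layers = rng.choice(list(SPACE["n_layers"]))
    return {
        "layers": [rng.choice(list(SPACE["neurons"])) for _ in range(n_layers)],
        "learning_rate": rng.choice(SPACE["learning_rate"]),
    }

def random_search(evaluate, n_trials=20, seed=0):
    """Return the sampled configuration with the lowest score reported
    by evaluate(config); evaluate would normally train the DNN and
    return its validation MSE."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("inf")
    for _ in range(n_trials):
        cfg = sample_config(rng)
        score = evaluate(cfg)
        if score < best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```

KerasTuner's `RandomSearch` tuner automates exactly this loop, with each trial building and training a Keras model from the sampled hyperparameters; a search of this kind is what produced the 12-16-16-40-26 architecture and the 0.001 learning rate used in DNN4.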

6.2. Performance evaluation of regression model

Using the dataset comprising 1030 observations, MLR analysis was performed. To assess the model's generalization capability, the dataset was divided into two subsets: 70% of the data was used to estimate the coefficients of the MLR equation, while the remaining 30% was used to evaluate the model's predictive performance on new, unseen data. The regression analysis conducted in this research yielded the following equation for predicting compressive strength:

$$\mathrm{CS}=0.12544516\,\mathrm{CM}+0.11677346\,\mathrm{SL}+0.0900276\,\mathrm{FL}-0.9095886\,\mathrm{WT}+0.3949065\,\mathrm{SP}+0.02805813\,\mathrm{CA}+0.0363444\,\mathrm{FA}+0.1139484\,\mathrm{AG} \tag{5}$$
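Equation (5) can be applied directly once the eight predictors are known. In the sketch below, the variable abbreviations (CM = cement, SL = blast-furnace slag, FL = fly ash, WT = water, SP = superplasticizer, CA = coarse aggregate, FA = fine aggregate, AG = age) are inferred from the dataset's ingredient list, and the WT term is taken as negative, consistent with water reducing compressive strength:

```python
# Coefficients of the regression equation, Eq. (5). The mapping of the
# abbreviations to mix ingredients is an inference from the dataset's
# nine variables, and the sign of the WT coefficient is reconstructed.
COEFFS = {
    "CM": 0.12544516, "SL": 0.11677346, "FL": 0.0900276, "WT": -0.9095886,
    "SP": 0.3949065,  "CA": 0.02805813, "FA": 0.0363444, "AG": 0.1139484,
}

def predict_cs(mix):
    """Predicted compressive strength for a mix given as a dict of the
    eight predictors; any predictor not supplied defaults to zero."""
    return sum(coef * mix.get(name, 0.0) for name, coef in COEFFS.items())
```

Because the model is linear with no intercept, doubling every predictor doubles the predicted strength, which is one reason the MLR model cannot capture the nonlinear interactions that the DNN models learn.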

Figure 9 presents a scatter plot of the observed and predicted values obtained from the regression equation. To quantitatively assess the regression model's performance, MSE and R2 were computed for both the training and the test datasets. When the test data were used to predict CS with the MLR equation developed in this research, the R2 value was 0.59, and the MSE, calculated by comparing the actual values with those predicted by the regression equation, was 109.7. The scatter plot in Figure 9 displays a noticeable dispersion of data points around the 45° line, indicating substantial deviation in the predicted values. For the training data, the MSE and R2 values were 71.35 and 0.72, respectively. These results highlight the regression model's poor performance in predicting compressive strength on the test data, as indicated by the R2 value of 0.59 and the considerable scatter in the plot. As expected, the training data exhibited a lower MSE and a higher R2 value than the testing data.

Figure 9. Scatterplot of actual and predicted values of CS from the multiple linear regression equation using test data.


Previous research has explored various methodologies for predicting the compressive strength of concrete with varying levels of accuracy. For instance, Yeh (1998b) employed an ANN for predicting the compressive strength of high-performance concrete and concluded that the ANN-based approach was more accurate than traditional regression analysis methods, with R2 values ranging from 0.917 to 0.933 during training and 0.814 to 0.895 during testing. Chou et al. (2014) used an ANN model that achieved an MSE of 63.025, significantly higher than that obtained in the present research; however, they did not report R2 values. Erdal et al. (2013) achieved an R2 of 0.909 but did not report the MSE of their model. Feng et al. (2020) implemented an ANN model for predicting compressive strength, obtaining an R2 of 0.903 and an MSE of 26.42. In comparison, our DNN4 model, which utilized optimized hyperparameters, achieved an R2 of 0.89 and an MSE of 28.76. This shows that our model outperformed that of Chou et al. (2014) and was comparable in R2 performance to that of Erdal et al. (2013); a direct MSE comparison with Erdal et al. (2013) was not possible because the authors did not report MSE values.

The results demonstrate that replacing ReLU with the tanh activation function in the DNN models (DNN2 and DNN4) contributed significantly to the improved results. Furthermore, the enhanced performance of DNN3 and DNN4 can be attributed to their use of optimized hyperparameters. Compared with the DNN models, the regression model performed poorly, yielding an MSE of 109.7 and an R2 value of 0.59. The performance of the DNN3 and DNN4 models, which employed optimized hyperparameters, is considerably better than that of the DNN1 and DNN2 models, which were created without such optimization. Among all the models considered, DNN4, using the tanh activation function and optimal hyperparameter values, exhibited the best performance.

7. Conclusions

The present research focussed on developing DNN models for predicting the compressive strength of concrete. A distinguishing feature of this research was the emphasis on hyperparameter optimization to improve the overall performance of the DNN models. This approach is novel in the sense that it incorporates the use of an optimal set of hyperparameters during the training phase of the DNN models, which is not a common practice in traditional model development. To ensure a robust evaluation of these optimized DNN models, the study employed two widely used performance metrics: MSE and R2 values. These metrics provided a comprehensive assessment of the models' accuracy and predictive capabilities. For comparative analysis, an MLR model for the prediction of concrete compressive strength was also developed.

The key findings of the research may be summarized as follows:

  • The best DNN model achieved an MSE of 28.76 and an R2 value of 0.89. The low MSE and high R2 values indicate reasonably good performance of the model, even in the presence of a highly complex relationship between variables affecting concrete's compressive strength.

  • Based on the value of performance indicators, it can be concluded that all DNN models significantly outperformed the MLR model.

  • The use of the tanh activation function instead of the ReLU function resulted in improved DNN model performance.

  • The success of the present DNN model can largely be attributed to the use of optimal hyperparameter sets, which surpass the conventional user-defined hyperparameters typically employed in DNN modelling.

  • The predictive models proposed herein offer significant potential for cost and time savings by reducing reliance on labour-intensive and time-consuming laboratory tests.

  • The models described in this research can support the reliable design of safe, high-performance structures that meet the required standards.

  • A distinct practical advantage of the DNN models developed herein is that they can be easily implemented using Python, and applied for the prediction of compressive strength of different types of concrete mixes.

  • With further advances in machine learning technologies, the prediction of the compressive strength of concrete is expected to improve further.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Notes on contributors

Mohammed Naved

Mohammed Naved is a student of the Bachelor of Technology program in Civil Engineering at Jamia Millia Islamia Central University in New Delhi, India. He completed a research internship at Heriot-Watt University, United Kingdom, on three-dimensional printing of concrete structures.

Mohammed Asim

Mohammed Asim is a graduate student researcher in the Department of Computer Science at the University of California, Davis. Asim obtained his bachelor's degree in Computer Engineering from Jamia Millia Islamia Central University, New Delhi in 2021.

Tanvir Ahmad

Tanvir Ahmad is a Professor in the Department of Computer Engineering and an Additional Director at the FTK-Centre of Information Technology of Jamia Millia Islamia, New Delhi. He served as Head of the Department of Computer Engineering for six years, from 2014 to 2021. He obtained his B.Tech degree from Bangalore University and his M.Tech degree from I.P. University, New Delhi, with distinction. He then received his Ph.D. degree from Jamia Millia Islamia in the area of text mining. He has supervised more than 20 Ph.D. students, 15 of whom have been awarded the degree.

He has published more than 100 papers in reputed international journals, book chapters, and international conferences; more than 90 of his papers are indexed in the Scopus database and over 30 in the Science Citation Index. He also holds one international and one Indian patent in the field of data mining.

References

  • Aiyer, B. G., Kim, D., Karingattikkal, N., Samui, P., & Rao, P. R. (2014). Prediction of compressive strength of self-compacting concrete using least square support vector machine and relevance vector machine. Korean Society of Civil Engineers Journal of Civil Engineering, 18, 1753–1758. https://doi.org/10.1007/s12205-014-0524-0
  • Amiri, M., & Hatami, F. (2022). Prediction of mechanical and durability characteristics of concrete including slag and recycled aggregate concrete with artificial neural networks (ANNs). Construction and Building Materials, 325, 126839. https://doi.org/10.1016/j.conbuildmat.2022.126839
  • Asim, M., Rashid, A., & Ahmad, T. (2021). Scour modeling using deep neural networks based on hyperparameter optimization. ICT Express, 8, 357–362. https://doi.org/10.1016/j.icte.2021.09.012
  • Asteris, P. G., Skentou, A. D., Bardhan, A., Samui, P., & Pilakoutas, K. (2021). Predicting concrete compressive strength using hybrid ensembling of surrogate machine learning models. Cement and Concrete Research, 145, 106449. https://doi.org/10.1016/j.cemconres.2021.106449
  • Bello, S. A., Oyedele, L., Olaitan, O. K., Olonade, K. A., Olajumoke, A. M., Ajayi, A., Akanbi, L., Akinade, O., Sanni, M. L., & Bello, A. L. (2022) A deep learning approach to concrete water-cement ratio prediction. Results in Materials, 15(April), 100300. https://doi.org/10.1016/j.rinma.2022.100300
  • Chi, L., Wang, M., Liu, K., Lu, S., Kan, L., Xia, X., & Huang, C. (2023). Machine learning prediction of compressive strength of concrete with resistivity modification. Materials Today Communications, 36(June), 106470. https://doi.org/10.1016/j.mtcomm.2023.106470
  • Chou, J. S., Tsai, C. F., Pham, A. D., & Lu, Y. H. (2014). Machine learning in concrete strength simulations: Multi-nation data analytics. Construction and Building Materials, 73, 771–780. https://doi.org/10.1016/j.conbuildmat.2014.09.054
  • Chou, J.-S., & Pham, A.-D. (2013). Enhanced artificial intelligence for ensemble approach to predicting high performance concrete compressive strength. Construction and Building Materials, 49, 554–563. https://doi.org/10.1016/j.conbuildmat.2013.08.078
  • Deng, F., He, Y., Zhou, S., Yu, Y., Cheng, H., Wu, X. (2018). Compressive strength prediction of recycled concrete based on deep learning. Construction and Building Materials, 175, 562–569. https://doi.org/10.1016/j.conbuildmat.2018.04.169
  • Dhillon, M. S., Sharif, M., Madsen, H., & Jakobsen, F. (2023). Seasonal precipitation forecasting for water management in the Kosi Basin, India using large-scale climate predictors. Journal of Water and Climate Change, 14(6), 1868–1880. https://doi.org/10.2166/wcc.2023.479
  • Erdal, H. I. (2013). Two-level and hybrid ensembles of decision trees for high performance concrete compressive strength prediction. Engineering Applications Artificial Intelligence, 26(7), 1689–1697. https://doi.org/10.1016/j.engappai.2013.03.014
  • Erdal, H. I., Karakurt, O., & Namli, E. (2013). High performance concrete compressive strength forecasting using ensemble models based on discrete wavelet transform. Engineering Applications Artificial Intelligence, 26(4), 1246–1254. https://doi.org/10.1016/j.engappai.2012.10.014
  • Feng, D. C., Liu, Z. T., Wang, X. D., Chen, Y., Chang, J. Q., Wei, D. F., & Jiang, Z. M. (2020). Machine learning-based compressive strength prediction for concrete: An adaptive boosting approach. Construction and Building Materials, 230, 1–11. https://doi.org/10.1016/j.conbuildmat.2019.117000
  • Hema, C., & Marquez, F. P. G. (2023). Emotional speech recognition using CNN and deep learning techniques. Applied Acoustics, 211, 1–11. https://doi.org/10.1016/j.apacoust.2023.109492
  • Kaloop, M. R., Kumar, D., Samui, P., Hu, J. W., & Kim, D. (2020). Compressive strength prediction of high-performance concrete using gradient tree boosting machine. Construction and Building Materials, 264, 1–11. https://doi.org/10.1016/j.conbuildmat.2020.120198
  • Kaloop, M. R., Samui, P., Shafeek, M., & Hu, J. W. (2020). Estimating slump flow and compressive strength of self-compacting concrete using emotional neural networks. Appl. Sci., 10(23), 8543. https://doi.org/10.3390/app10238543
  • Kaya, A. (2010). Artificial neural network study of observed pattern of scour depth around bridge piers. Computers and Geotechnics 37(3), 413–418. https://doi.org/10.1016/j.compgeo.2009.10.003
  • Khashman, A. P., & Akpinar, P. (2017). Non-destructive prediction of concrete compressive strength using neural networks. Procedia Computer Science, 108(2017), 2358–2362. https://doi.org/10.1016/j.procs.2017.05.039
  • Khursheed, S., Jagan, P., Samui, P., & Kumar, S. (2021). Compressive strength prediction of fly ash concrete by using machine learning techniques. Innovative Infrastructure Solutions 6(3), 149. https://doi.org/10.1007/s41062-021-00506-z
  • Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980(2014).
  • Kong, Q., He, C., Liao, L., Xu, J., & Yuan, C. (2023). Hyperparameter optimization for interfacial bond strength prediction between fiber-reinforced polymer and concrete. Structures, 51, 573–601. https://doi.org/10.1016/j.istruc.2023.03.082
  • Kumar, S., Goyal, M. K., Deshpande, V., & Agarwal, M. (2023). Estimation of time dependent scour depth around circular bridge piers: Application of ensemble machine learning methods. Ocean Engineering, 270, 1–15. https://doi.org/10.1016/j.oceaneng.2022.113611
  • Larrard, F., & Sedran, T. (2002). Mixture-proportioning of high-performance concrete. Cement and Concrete Research, 32(11), 1699–1704. https://doi.org/10.1016/S0008-8846(02)00861-X
  • Li, Q., & Song, Z. (2022). High-performance concrete strength prediction based on ensemble learning. Construction and Building Materials, 324, 1–18. https://doi.org/10.1016/j.conbuildmat.2022.126694
  • Lin, C. J., & Wu, N. J. (2021) An ANN model for predicting the compressive strength of concrete. Applied Sciences, 11(9), 1–13. https://doi.org/10.3390/app11093798
  • Liu, K., Alam, M. S., Zhu, J., Zheng, J., & Chi, L. (2021). Prediction of carbonation depth for recycled aggregate concrete using ANN hybridized with swarm intelligence algorithms. Construction and Building Materials, 301, 1–15. https://doi.org/10.1016/j.conbuildmat.2021.124382
  • Ly, H. B., Nguyen, T. A., Thi Mai, H. V., & Tran, V. Q. (2021). Development of deep neural network model to predict the compressive strength of rubber concrete. Construction and Building Materials, 301, 124081. https://doi.org/10.1016/j.conbuildmat.2021.124081
  • Nguyen, H., Vu, T., Vo, T. P., & Thai, H. T. (2021). Efficient machine learning models for prediction of concrete strengths. Construction and Building Materials, 266, 1–17. https://doi.org/10.1016/j.conbuildmat.2020.120950
  • O’Malley, T., Bursztein, E., Long, J., Chollet, F., Jin, H., & Invernizzi, L. (2019). Keras Tuner. Retrieved from https://github.com/keras-team/keras-tuner
  • Omran, B. A., Chen, Q., & Jin, R. (2016). Comparison of data mining techniques for predicting compressive strength of environmentally friendly concrete. Journal of Computing in Civil Engineering, 30(6), 04016029.
  • Pandey, M., Zakwan, M., Sharma, P. K., & Ahmad, Z. (2018). Multiple linear regression and genetic algorithm approaches to predict temporal scour depth near circular pier in non-cohesive sediment. ISH Journal of Hydraulic Engineering, 26, 96–103. https://doi.org/10.1080/09715010.2018.1457455
  • Pham, A. D., Hoang, N. D., & Nguyen, Q. T. (2015). Predicting compressive strength of high-performance concrete using metaheuristic-optimized least squares support vector regression. Journal of Computing in Civil Engineering, 30(3), 06015002.
  • Shishegaran, A., Varaee, H., Rabczuk, T., & Shishegaran, G. (2021). High correlated variables creator machine: Prediction of the compressive strength of concrete. Computers & Structures, 247(106479), 1–12. https://doi.org/10.1016/j.compstruc.2021.106479
  • Słoński, M. (2010). A comparison of model selection methods for compressive strength prediction of high-performance concrete using neural networks. Computers and Structures, 88(21), 1248–1253. https://doi.org/10.1016/j.compstruc.2010.07.003
  • Tran, V. Q., Dang, V. Q., & Ho, L. S. (2022). Evaluating compressive strength of concrete made with recycled concrete aggregates using machine learning approach. Construction and Building Materials, 323, 1–20. https://doi.org/10.1016/j.conbuildmat.2022.126578
  • UCI (2017). Machine Learning Repository. Retrieved from Online dataset resources https://archive.ics.uci.edu/ml/datasets.html
  • Wang, Q., Pan, C., Liang, Y., Gan, W., & Ho, J. (2023). Pumping lightweight aggregate concrete into high-rise buildings. Journal of Building Engineering, 80, 108069, https://doi.org/10.1016/j.jobe.2023.108069
  • Wang, Z., & Wu, B. (2023). A mix design method for self-compacting recycled aggregate concrete targeting slump-flow and compressive strength. Construction and Building Materials, 404, 133309. https://doi.org/10.1016/j.conbuildmat.2023.133309
  • Wei, X., Xiao, L., & Li, Z. (2012). Prediction of standard compressive strength of cement by the electrical resistivity measurement. Construction and Building Materials, 31, 341–346. https://doi.org/10.1016/j.conbuildmat.2011.12.111
  • Yeh, I.-C. (1998b). Modeling of strength of high-performance concrete using artificial neural networks. Cement and Concrete Research, 28(12), 1797–1808.
  • Yuan, Z., Wang, L., & Ji, X. (2014). Prediction of concrete compressive strength: Research on hybrid models genetic based algorithms and ANFIS. Advances in Engineering Software, 67, 156–163. https://doi.org/10.1016/j.advengsoft.2013.09.004
  • Zheng, H., Li, X., Sun, T., Huang, Z., & Xie, C. (2023). Multiaxial fatigue life prediction of metals considering loading paths by image recognition and machine learning. Engineering Failure Analysis, 143(Part B), 106851. https://doi.org/10.1016/j.engfailanal.2022.106851