Effect of SVM Kernel Functions on Bearing Capacity Assessment of Deep Foundations

Document Type : Regular Article


1 Centre of Tropical Geoengineering (GEOTROPIK), Institute of Smart Infrastructure and Innovative Engineering (ISIIC), Faculty of Civil Engineering, Universiti Teknologi Malaysia, Johor Bahru 81310, Malaysia

2 Department of Civil Engineering, Faculty of Engineering, University of Malaya, Kuala Lumpur 50603, Malaysia

3 Civil Engineering Department, College of Engineering, University of Sulaimani, Kurdistan Region, Iraq

4 Department of Civil Engineering, Faculty of Engineering, Lorestan University, Khorramabad 6815144316, Iran

5 Department of Civil Engineering, Sekolah Tinggi Teknologi Pekanbaru, Indonesia


Pile foundations are widely used in construction projects, where their capacities (pile bearing capacity, PBC) must be determined at different stages of construction. A highly reliable and accurate prediction model offers several advantages, such as reducing construction cost, shortening the construction timeline, and improving construction safety. Hence, the aim of this study is to develop statistical and artificial intelligence (AI) models for predicting the bearing capacities of 141 piles. In the preliminary stage of the study, the features (inputs) used to predict PBC were selected through simple regression analysis. Then, different kernels of the support vector machine (SVM) technique, i.e., the dot, the radial basis function (RBF), the polynomial, the neural, and the ANOVA kernels, were applied to predict the PBC. A multiple regression model was also employed to predict PBC values. The models were evaluated using several performance indices, and their results were compared using a simple ranking system. The results showed that the SVM-RBF model achieved the highest coefficient of determination (R2), with values of 0.967 and 0.993 for the training and testing stages, respectively. The other SVM kernels also provided a high degree of accuracy in estimating PBC; however, the SVM-RBF model is recommended as a powerful, highly reliable, and simple solution for PBC prediction.


1. Introduction

Pile foundations support the superstructure by transferring its overall load to the underlying soil [ 1 - 3 ]. Accordingly, the pile bearing capacity (PBC), defined as the total load a pile can carry in supporting the superstructure, is of substantial significance in the design of pile foundations, because an accurate value helps to avoid casualties and loss of property resulting from pile failure [ 4 - 6 ]. In the past few decades, field tests such as the Static Load Test (SLT) and High-Strain Dynamic Testing (HSDT) have been the preferred means of determining pile properties, including the bearing capacity, in relevant projects. However, it is impractical to carry out field tests on every pile because they are time consuming and costly [ 3 , 7 - 9 ]. Since estimation of PBC is one of the active topics in geotechnical engineering [ 10 ], many researchers have proposed different methods for estimating it. Nevertheless, model accuracy and consistency are always of prime importance in such cases.

Many parameters influence the PBC in practice, and they can be divided into three categories, i.e., pile geometry, soil condition, and field test setting [ 11 - 15 ]. These categories and their sub-factors are presented in Figure 1. Among all effective factors, the pile geometry (including the embedment length of the pile), the soil type, and the apparatus used in the field test are considered the most influential in measuring or predicting the PBC. The available models for PBC estimation can be grouped into three general categories: i) empirical/theoretical, ii) statistical, and iii) artificial intelligence (AI)/machine learning (ML). Models in the first group are developed from the theory of earlier researchers together with laboratory test data. Empirical/theoretical techniques such as the Terzaghi formula (Terzaghi 1943) and the Vesic formula [ 17 ] may not yield a reliable predictive tool, especially when new data become available [ 18 ]. The likely reasons are that these methods involve lengthy calculations and require many assumptions. Statistical models have also been used to predict the PBC [ 19 , 20 ]. Typically, a statistical model is a mathematical equation derived from the relationship between the predictors (inputs, usually more than one variable) and the outcome (output) variable. Although statistical models are simple and efficient, their performance is low, especially when extreme values are present in the data [ 20 ]. They are also not robust enough to capture complex and nonlinear relationships [ 9 ].

Fig. 1. The most important categories of factors in PBC prediction.

Computers, programming, and computational models are now indispensable, and technology advances continuously in response to the needs of society. In essence, AI/ML techniques combine mathematics, algorithms, and a degree of creativity [ 14 , 21 , 22 ]. They have been applied efficiently to problems in various areas of engineering [ 23 - 29 ]. Several AI/ML studies on PBC estimation have been published. For instance, Pal and Deswal [ 30 ] and Momeni et al. [ 5 ] proposed solutions to the PBC problem based on the Gaussian process regression (GPR) model and reported its successful application in predicting PBC values. In another study, Momeni et al. [ 31 ] proposed a PBC prediction model based on a hybrid genetic algorithm (GA)-artificial neural network (ANN). They used a total of 50 data samples and reported an excellent level of performance for the proposed model. Harandizadeh et al. [ 32 ] applied improved neuro-fuzzy approaches to PBC prediction, and their model achieved a very low system error. Chen et al. [ 33 ] developed several AI/ML models, including neuro-genetic, neuro-imperialism, genetic programming (GP), and ANN, to estimate PBC values. After evaluating these techniques, they found that the GP model scored the highest coefficient of determination (R2) among all proposed models. It therefore appears that AI/ML models can provide a new solution and, at the same time, the highest level of accuracy among the three groups described for estimating PBC values.

A review of the different kinds of AI/ML models used for PBC prediction shows that only a limited number of support vector machine (SVM) studies are available for predicting pile capacity [ 20 , 34 ]. SVM is an ML method that has produced very encouraging results in geotechnical applications such as liquefaction assessment [ 35 ], tunneling and underground space technology [ 36 ], dams, embankments and retaining walls [ 37 , 38 ], soft soil issues [ 39 , 40 ], rock strength issues [ 41 ], and blasting environmental issues [ 42 ]. In addition to being a powerful modelling technique, SVM can give the user guidance on variables that are lacking in the training database. Furthermore, SVM usually comes with different kernel functions, such as linear, polynomial, sigmoid, and radial basis function (RBF), which help to handle the complexity of nonlinear data.

This study aims to evaluate the feasibility of the SVM model with different kernel functions for predicting PBC values. To this end, various SVM kernels, i.e., dot (linear), RBF, polynomial, neural, and ANOVA, are applied to the problem at hand. These kernels are then evaluated based on their performance in predicting PBC values, and the best SVM kernel is selected and introduced.

2. Materials and methods

2.1. SVM background

The support vector machine (SVM) uses kernel functions to handle non-linear data sets by mapping the data from the original input space into a higher-dimensional feature space, where a linear separation becomes possible. A separating hyperplane is then placed at the centre of the maximum margin between the support vectors [ 43 ]. The support vectors, defined as the training points closest to the hyperplane, may number more than two. Figure 2 depicts the geometric view of the input space divided by a hyperplane into two parts (i.e., +1 and -1). The hyperplane may take the form of a line or a surface, depending on the dimension of the space [ 44 ]. The margin between the hyperplane and the support vectors is maximized by minimizing the norm of the weight vector w. The margin also depends strongly on the parameter C, a hyperparameter that controls the penalty for misclassified training examples. To identify whether the function is positive or negative, the equation below can be used:


$$f(x) = w^{T}\theta(x) + b$$

where x denotes the model inputs, the sign of f(x) determines the model output y, w is the weight vector, θ(x) is the non-linear feature mapping of the input x, and b is the bias of the model. Hence, points with f(x) ≥ 1 are considered positive examples and points with f(x) ≤ −1 are considered negative examples.

Fig. 2. Schematic of the SVM.
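
As a hedged illustration of the margin idea described above, the standard soft-margin formulation can be written out. The labels $y_i \in \{+1, -1\}$ and the slack variables $\xi_i$ are notation introduced here for illustration only, and the software used in the study may solve a different (regression) variant of this problem:

$$
\min_{w,\,b,\,\xi}\ \tfrac{1}{2}\lVert w \rVert^{2} + C\sum_{i=1}^{n}\xi_i
\quad \text{subject to} \quad
y_i\left(w^{T}\theta(x_i) + b\right) \ge 1 - \xi_i,\qquad \xi_i \ge 0 .
$$

The two bounding hyperplanes $w^{T}\theta(x) + b = \pm 1$ lie a distance $2/\lVert w \rVert$ apart, so minimizing $\lVert w \rVert$ maximizes the margin, while a larger C penalizes margin violations more heavily.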

A kernel is a mathematical function that serves as a bridge from a non-linear problem to a linear one. Figure 3 displays how kernel functions transform the data. The performance of SVM is greatly influenced by the choice of kernel function.

Fig. 3. Typical structure of kernel functions.

Data sets can generally be classified into two cases: separable and non-separable. In the separable case, the hyperplane can be drawn easily and directly, so the linear kernel is commonly used. The linear kernel is the simplest kernel function, given by the inner (dot) product of the input vectors. In engineering problems, non-separable data are common, and other kernel functions can be used to produce the hyperplane with maximum margin. For instance, RBF is the most popular kernel when the relation between attributes is non-linear. RBF also provides more trustworthy results because of its strong interpolation capability, but it becomes weak and unsuitable when extrapolation over a wide range is required. The neural kernel, also known as the sigmoid kernel, behaves similarly to RBF for certain parameter values and is commonly used for non-separable data [ 45 ]. An SVM with the neural kernel is equivalent to a kind of multi-layer perceptron without a hidden layer. The RBF and neural kernel functions are both influenced by the hyperparameter gamma, whose main role is to control the curvature of the decision boundary. The polynomial kernel is another commonly used function; it represents the feature space over polynomials of the original variables. Lastly, the ANOVA kernel is an extended version of the RBF kernel that combines the RBF and Laplacian formulations. Table 1 shows the kernel formulas applied in this study. In this table, 'd' is the polynomial degree, 'γ' is the gamma value for the RBF, neural, and polynomial kernels, and xi and xj are the input vectors.

Table 1

Formulas for different kernel functions used in this study.

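Kernel Function    Equation
Linear (dot)       $G(x_i, x_j) = x_i^{T} x_j$
RBF                $G(x_i, x_j) = \exp(-\gamma \lVert x_i - x_j \rVert^{2})$
Polynomial         $G(x_i, x_j) = (-\gamma x_i^{T} x_j + 1)^{d}$
Neural             $G(x_i, x_j) = \tanh(-\gamma x_i^{T} x_j + 1)^{d}$
ANOVA              $G(x_i, x_j) = \exp(-\gamma (x_i - x_j))$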
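
The kernels discussed above can be sketched in plain Python. This is a minimal illustration using common textbook forms, not the RapidMiner implementation used in the study. The function names, the default parameter values, and the squared-difference form of the ANOVA kernel are assumptions made here, and the sign and exponent conventions may differ from those listed in Table 1.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    # Dot (linear) kernel: plain inner product.
    return dot(x, y)


def rbf_kernel(x, y, gamma=1.0):
    # RBF kernel: decays with the squared Euclidean distance.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def polynomial_kernel(x, y, gamma=1.0, degree=2):
    # Polynomial kernel of the given degree (assumed positive gamma).
    return (gamma * dot(x, y) + 1.0) ** degree


def neural_kernel(x, y, a=1.0, b=0.0):
    # Neural (sigmoid) kernel; a and b are hypothetical parameter names.
    return math.tanh(a * dot(x, y) + b)


def anova_kernel(x, y, gamma=1.0, degree=2):
    # ANOVA kernel (assumed form): sum of per-component RBF terms,
    # raised to the given degree.
    total = sum(math.exp(-gamma * (p - q) ** 2) for p, q in zip(x, y))
    return total ** degree
```

For identical inputs the RBF kernel returns 1, which is its maximum, and the value falls towards 0 as the two points move apart; a larger gamma makes that decay faster.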

2.2. Case study and collected data

The HSDT tests, commonly known as pile driving analyzer (PDA) tests, were conducted in the Pekanbaru area, Indonesia (Figure 4). Pekanbaru is an important city in Indonesia and the capital of Riau province on Sumatra Island. Its population was approximately 1 million in 2014, having grown by about 3.5% per year since 1998. Rapid economic growth requires infrastructure and high-rise buildings to support human activities. As the number of construction projects increases, the number of PDA tests must also increase to check the capacity of the piles used as foundations of superstructures. Therefore, in order to propose SVM models with various kernels for PBC prediction, a total of 141 PDA tests were carried out in Pekanbaru, Indonesia. The tests were performed on precast concrete piles. Figure 5 shows an example of a PDA test using the pile driving analyzer equipment together with the Case Pile Wave Analysis Program (CAPWAP) software to analyze the PBC. Various parameters, including pile set, S, pile diameter, D, pile length, L, drop height, H, and ram weight, W, were measured for these 141 tests. The corresponding PBC values were recorded as the target output of this study. As discussed in the introduction, the collected variables are all important for estimating PBC values. Therefore, the authors initially considered D, L, H, S, and W as candidate model predictors (inputs) for forecasting PBC values. To give a better view of the data, Table 2 lists 30 of the 141 samples, comprising the input and output parameters. The ranges used in the modelling were 226-600 mm for D, 3-48 m for L, 12-90 kN for W, 0.2-3 m for H, and 291-3680 kN for PBC.

Fig. 4. Location of Pekanbaru, Indonesia.

Fig. 5. PDA test conducted in Pekanbaru, Indonesia.

Inputs: pile diameter, D; pile length, L; ram weight, W; drop height, H. Output: pile bearing capacity, PBC.
Sample No.  D (mm)  L (m)  W (kN)  H (m)  PBC (kN)
1 282 8 12 1 555
2 282 8 12 1 623
3 282 3 12 1 536
4 282 3 12 1 850
5 282 3 12 1 648
6 282 8 12 1 291
7 282 11 13 1.5 1,572
8 282 11 13 1.5 1,450
9 282 13 13 1.5 854
10 282 14 13 1.5 818
11 282 14 13 1.5 980
12 282 13 13 1.5 1,063
13 395 28 35 1 1,341
14 480 29 45 1 1,409
15 480 29 45 1 2,200
16 480 29 45 1 1,650
17 226 10 13 1 1,058
18 226 7 13 1 942
19 226 11 13 1 774
20 226 8 13 1 749
21 226 8 13 1 780
22 226 8 13 1 588
23 226 8 13 1 707
24 451 12 90 0.4 3,530
25 306 17 90 0.3 2,790
26 306 15 90 0.3 2,900
27 451 23 90 0.4 3,430
28 451 14 90 0.4 3,460
29 226 17 25 0.4 780
30 226 17 25 0.4 770
Table 2. A part of the input and output variables used in the modelling.

2.3. Step-by-step overview of research

The first step of this work was to set the research goal, which is to introduce an applicable AI/ML technique for forecasting the PBC. The SVM predictive model is expected to deliver a high level of performance, with a target error below 10%. The research then continued with a review of related published studies. After reviewing many papers on predictive models for PBC, it was found that studies using the SVM model with different kernels to forecast the PBC are lacking. The various kernel functions of SVM are able to simplify the complex and non-linear relations between the input and output variables. After the study problem was identified, a set of quantitative data was obtained from the PDA tests. After the data were compiled, they were filtered using the 'outlier labeling rule' proposed by Hoaglin and Iglewicz [ 46 ] to check for missing data and outliers before analysis. The next step was input selection, which was done using simple regression analysis. SVM models with different kernels were then developed to forecast the PBC. The SVM was modelled using RapidMiner, a user-friendly modelling software for researchers. Finally, the PBC predicted by each model was assessed using important performance indices and a simple ranking system, and the best SVM kernel was selected and introduced as the most powerful one for PBC prediction. Figure 6 illustrates the research methodology of this study.

Fig. 6. Methodology procedure flowchart.
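
The data-filtering step described above can be sketched as follows. This is a minimal illustration of an outlier labeling rule based on quartile fences; the multiplier g = 2.2 is the value usually associated with the Hoaglin and Iglewicz rule, but the source does not state the value or the quartile method used, so both are assumptions here, as are the function names.

```python
import statistics


def outlier_fences(values, g=2.2):
    """Return the (lower, upper) fences Q1 - g*(Q3 - Q1) and Q3 + g*(Q3 - Q1)."""
    # "inclusive" quartiles are an assumed choice; other methods shift the fences.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    return q1 - g * spread, q3 + g * spread


def label_outliers(values, g=2.2):
    """Return the values that fall outside the fences."""
    low, high = outlier_fences(values, g)
    return [v for v in values if v < low or v > high]
```

In practice the rule would be applied to each measured variable in turn, and any flagged sample would be inspected before deciding whether to remove it.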

2.4. Performance index

To identify the most precise model, several performance indices must be considered during modelling and evaluation. After reviewing previous investigations, the authors decided to apply R2, the a20-index, root mean square error (RMSE), variance accounted for (VAF%), and mean absolute error (MAE) to the ML/AI results. The perfect values for these indices are 1, 1, 0, 100%, and 0, respectively. The formulas of these indices are presented in Table 3. In this table, n is the total number of data samples, $O$ is the measured value, $O'$ is the predicted value of $O$, $\bar{O}$ is the mean value of $O$, and m20 is the number of samples whose ratio of experimental value to predicted value lies between 0.80 and 1.20. These indices can be calculated for both the training and testing phases.

Table 3

Equations of performance indices used in this investigation.

Index       Equation
RMSE        $\sqrt{\frac{1}{n}\sum_{i=1}^{n}(O' - O)^{2}}$
R2          $1 - \frac{\sum_{i}(O - O')^{2}}{\sum_{i}(O - \bar{O})^{2}}$
VAF (%)     $\left[1 - \frac{\mathrm{var}(O - O')}{\mathrm{var}(O)}\right] \times 100$
MAE         $\frac{1}{n}\sum_{i=1}^{n}\lvert O - O' \rvert$
a20-index   $\frac{m20}{n}$
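
The five indices can be computed with a few lines of Python. The sketch below follows the definitions given above; the function name and the dictionary keys are illustrative choices, the population variance is assumed for VAF, and predicted values are assumed to be non-zero so that the a20 ratio is defined.

```python
import math
import statistics


def performance_indices(measured, predicted):
    """Compute RMSE, R2, VAF (%), MAE and the a20-index for paired values."""
    n = len(measured)
    residuals = [o - p for o, p in zip(measured, predicted)]
    mean_o = sum(measured) / n
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((o - mean_o) ** 2 for o in measured)
    # m20: samples whose measured/predicted ratio lies in [0.80, 1.20].
    m20 = sum(1 for o, p in zip(measured, predicted) if 0.80 <= o / p <= 1.20)
    return {
        "RMSE": math.sqrt(ss_res / n),
        "R2": 1.0 - ss_res / ss_tot,
        "VAF": (1.0 - statistics.pvariance(residuals)
                / statistics.pvariance(measured)) * 100.0,
        "MAE": sum(abs(r) for r in residuals) / n,
        "a20": m20 / n,
    }
```

A perfect prediction returns RMSE = 0, R2 = 1, VAF = 100, MAE = 0, and a20 = 1, matching the ideal values listed above.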

3. PBC modelling

3.1. Input selection

One of the shortcomings of ML/AI models is their limited practical application in different areas of engineering. Engineers should therefore try to make such models as simple as possible for other researchers and designers to use. One option is to limit the number of inputs required by the system, since the level of complexity can be decreased by reducing the number of input parameters [ 47 ]. In addition, if fewer inputs need to be collected, data collection becomes easier and faster than when all inputs are required. Based on this reasoning, input (feature) selection was conducted through simple regression analysis. Different trend line functions, including linear, exponential, power, and logarithmic, were fitted between each predictor and the PBC and evaluated using R2. The results of these analyses are presented in Table 4. As shown in this table, R2 varies widely across the predictors. The parameters D and W clearly have the strongest relationship with PBC, whereas L and S show the lowest influence on the system output because they received the lowest R2 values. At this stage, in order to remove only one of these two parameters, the previous investigations were reviewed again. Based on this review, and considering that the pile geometry category has a stronger effect on PBC than the field test setting category, the authors decided to remove S from the predictors. Therefore, the variables D, L, W, and H were set as model inputs for predicting PBC values. The following sub-section describes the SVM modelling process and its steps.

Parameter  Trend Line Function  Relationship  R2
D  Linear  $PBC = 6.6942D - 418.42$  0.354
D  Exponential  $PBC = 395.42e^{0.004D}$  0.317
D  Logarithmic  $PBC = 2538.5\ln(D) - 12862$  0.380
D  Power  $PBC = 0.1715D^{1.5728}$  0.357
L  Linear  $PBC = -5.5717L + 1959.9$  0.004
L  Exponential  $PBC = 1549.2e^{-0.0001L}$  5E-06
L  Logarithmic  $PBC = 121.42\ln(L) + 1472.9$  0.005
L  Power  $PBC = 1015.2L^{0.1409}$  0.016
S  Linear  $PBC = -24.006S + 1962.3$  0.004
S  Exponential  $PBC = 1600.3e^{-0.007S}$  0.001
S  Logarithmic  $PBC = 205.02\ln(S) + 1519.2$  0.017
S  Power  $PBC = 1138.4S^{0.1985}$  0.038
W  Linear  $PBC = 29.751W + 199.35$  0.698
W  Exponential  $PBC = 560.27e^{0.0185W}$  0.658
W  Logarithmic  $PBC = 1105.7\ln(W) - 2394.3$  0.579
W  Power  $PBC = 104.45W^{0.7044}$  0.576
H  Linear  $PBC = -529.81H + 2160.6$  0.049
H  Exponential  $PBC = 1788.1e^{-0.238H}$  0.024
H  Logarithmic  $PBC = -345.5\ln(H) + 1610.6$  0.037
H  Power  $PBC = 1431.6H^{-0.117}$  0.011
Table 4. Summary of simple regression analysis for selecting input parameters.
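
The four trend line types can be fitted with ordinary least squares after a log transform of x, y, or both. The sketch below is one hedged way to do this in plain Python; it reports R2 in the transformed space, which is a common spreadsheet convention and may not match the exact procedure behind Table 4. The function names are illustrative, and all x and y values are assumed to be positive.

```python
import math


def _linear_fit(xs, ys):
    """Least-squares line ys = slope * xs + intercept, with its R2."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx, (sxy * sxy) / (sxx * syy)


def trend_lines(xs, ys):
    """Fit linear, exponential, logarithmic and power trend lines.

    Each entry is (slope, intercept, R2) in the transformed space:
      linear:       y     = slope * x     + intercept
      exponential:  ln(y) = slope * x     + intercept  ->  y = e^intercept * e^(slope * x)
      logarithmic:  y     = slope * ln(x) + intercept
      power:        ln(y) = slope * ln(x) + intercept  ->  y = e^intercept * x^slope
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    return {
        "linear": _linear_fit(xs, ys),
        "exponential": _linear_fit(xs, log_y),
        "logarithmic": _linear_fit(log_x, ys),
        "power": _linear_fit(log_x, log_y),
    }
```

Calling `trend_lines` once per candidate predictor (with PBC as `ys`) and comparing the R2 values gives the kind of screening summarized in Table 4.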

3.2. SVM modelling

The ultimate aim of this study is to introduce a new solution for PBC prediction based on SVM and its different kernels. As decided in the previous section, four input parameters (H, W, L, and D) were used out of the five collected variables (S, H, W, L, and D). As mentioned before, RapidMiner was selected as an easy and fast tool for building the SVM models with various kernels. In AI/ML work, an important stage prior to modelling is dividing the data into development and assessment sets. For model development, 80% of the 141 data samples (113 samples) were randomly selected, while the remaining 20% (28 samples) were allocated to model assessment. This division follows previous studies [ 48 - 50 ]. Next, an SVM process was created in the software, as shown in Figure 7. The process starts by importing the database for model development. Once the database is imported, filtering the examples is recommended but not mandatory; this step removes invalid entries, such as non-numerical data and symbols that the system does not recognize. The input parameters, the outcome variable, and the predicted variable are then specified in the set role step, and all of them are connected to the SVM operator. This operator allows the kernel function to be chosen; here, five kernel functions, i.e., dot/linear, RBF, polynomial, neural, and ANOVA, were selected for predicting the PBC values. Furthermore, the complexity constant C, the optimizer parameters such as the convergence epsilon, and the kernel parameters such as the kernel degree need to be set for each kernel function. In this study, these parameters were determined by trial and error with the aim of obtaining the highest prediction performance for each kernel. Table 5 presents the final values of the effective SVM parameters for each kernel. These models and their performance in predicting PBC values are discussed later.
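
The random 80/20 division described above can be sketched in plain Python. This only illustrates the splitting step, not the RapidMiner process used in the study; the function name and the fixed seed are assumptions introduced here so that the split is reproducible.

```python
import random


def split_indices(n_samples, train_fraction=0.8, seed=0):
    """Randomly split sample indices into training and testing sets."""
    indices = list(range(n_samples))
    # A seeded generator keeps the shuffle reproducible (seed is hypothetical).
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


# For the 141 PDA tests this gives 113 training and 28 testing samples.
train_idx, test_idx = split_indices(141)
```

Keeping the same split for every kernel, and for the regression model used for comparison, ensures that all models are assessed on identical samples.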

Fig. 7. Flowchart of setting up SVM in Rapidminer software for PBC prediction.

Kernel Type Dot RBF Polynomial Neural ANOVA
Complexity Constant, C 1.0E-5 5.0E-6 5.0E-5 5.0E-4 5.0E-4
Convergence Epsilon 0.10 0.20 0.01 0.01 0.01
L Positive 1.30 1.00 1.50 1.50 1.50
L Negative 1.30 1.00 1.50 1.50 1.50
Kernel Degree - - 2.00 - 3.00
Kernel Gamma - 2.00 - - 4.00
Kernel Parameter A - - - 0.01 -
Kernel Parameter B - - - 0.01 -
Table 5. Final values of the effective SVM parameters for each kernel.

4. Results and discussion

The previous sections showed that using only four variables as inputs is of more interest and applicability in practice. Therefore, the modelling was done using these four input parameters (H, W, D, and L) to develop the best model for forecasting the PBC. The results indicate that eliminating the parameter S causes only an insignificant deviation across the whole database. Different SVM kernels were applied as predictive models for PBC values. Since SVM is a statistical-based technique, the authors also applied a linear multiple regression (LMR) model to the same training and testing portions to obtain a fair and logical comparison. The results of the models and their abilities in predicting PBC values are presented in Tables 6 and 7, where the rating approach proposed by Zorlu et al. [ 51 ] was applied. In this rating system, the models that perform better on a given performance index receive the higher ratings. As shown in Tables 6 and 7, SVM with the RBF kernel scored a total rating of 58 (out of 60), while the totals for LMR, Dot, Polynomial, Neural, and ANOVA were 40, 29, 22, 10, and 51, respectively. As a result, RBF took the highest position among all six models in this research for the prediction of PBC values.

Group Model R2 RMSE VAF (%) MAE a20-index Rating (R2) Rating (RMSE) Rating (VAF) Rating (MAE) Rating (a20-index)
Train LMR 0.8274 0.1198 79.14 0.0865 0.5929 4 4 3 4 5
Dot 0.8241 0.1240 80.97 0.0901 0.5044 3 3 4 3 2
RBF 0.9669 0.0530 96.63 0.0287 0.5309 6 6 6 6 4
Polynomial 0.7555 0.1441 60.79 0.1081 0.5133 2 2 2 2 3
Neural 0.6434 0.2902 1.68 0.2533 0.1681 1 1 1 1 1
ANOVA 0.8456 0.1135 82.16 0.0815 0.6106 5 5 5 5 6
Test LMR 0.8283 0.1136 82.50 0.0886 0.4643 4 4 4 4 4
Dot 0.8125 0.1515 65.71 0.1194 0.2857 3 3 3 3 2
RBF 0.9934 0.0235 99.27 0.0116 0.9286 6 6 6 6 6
Polynomial 0.6974 0.1549 36.68 0.1244 0.3214 2 2 2 2 3
Neural 0.5840 0.2615 2.57 0.2170 0.2143 1 1 1 1 1
ANOVA 0.8654 0.1040 85.04 0.0634 0.6071 5 5 5 5 5
Table 6. Results of the different SVM kernels together with the LMR technique.

Model Train Rating Test Rating Total Rating Position
LMR 20 20 40 3
Dot 15 14 29 4
RBF 28 30 58 1
Polynomial 11 11 22 5
Neural 5 5 10 6
ANOVA 26 25 51 2
Table 7. Ratings and positions of the developed models.
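
The simple ranking system can be reproduced from the index values reported in Table 6. The sketch below is an illustrative reading of the rating approach: for each index, the six models are ordered from worst to best and given ratings 1 to 6, and the ratings are summed. The function and variable names are assumptions, and ties are not handled because none occur in these data.

```python
# Column order: R2, RMSE, VAF (%), MAE, a20-index (values copied from Table 6).
HIGHER_IS_BETTER = (True, False, True, False, True)

TRAIN = {
    "LMR": (0.8274, 0.1198, 79.14, 0.0865, 0.5929),
    "Dot": (0.8241, 0.1240, 80.97, 0.0901, 0.5044),
    "RBF": (0.9669, 0.0530, 96.63, 0.0287, 0.5309),
    "Polynomial": (0.7555, 0.1441, 60.79, 0.1081, 0.5133),
    "Neural": (0.6434, 0.2902, 1.68, 0.2533, 0.1681),
    "ANOVA": (0.8456, 0.1135, 82.16, 0.0815, 0.6106),
}
TEST = {
    "LMR": (0.8283, 0.1136, 82.50, 0.0886, 0.4643),
    "Dot": (0.8125, 0.1515, 65.71, 0.1194, 0.2857),
    "RBF": (0.9934, 0.0235, 99.27, 0.0116, 0.9286),
    "Polynomial": (0.6974, 0.1549, 36.68, 0.1244, 0.3214),
    "Neural": (0.5840, 0.2615, 2.57, 0.2170, 0.2143),
    "ANOVA": (0.8654, 0.1040, 85.04, 0.0634, 0.6071),
}


def rate(results):
    """Sum per-index ratings (1 = worst, 6 = best) for each model."""
    totals = {model: 0 for model in results}
    for col, higher in enumerate(HIGHER_IS_BETTER):
        # Sort worst-first: ascending if higher is better, descending otherwise.
        ordered = sorted(results, key=lambda m: results[m][col], reverse=not higher)
        for rating, model in enumerate(ordered, start=1):
            totals[model] += rating
    return totals


train_rating, test_rating = rate(TRAIN), rate(TEST)
total_rating = {m: train_rating[m] + test_rating[m] for m in TRAIN}
```

With these inputs, `total_rating` reproduces the totals listed in Table 7, with SVM-RBF at 58 out of a possible 60.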

SVM with the ANOVA kernel, with a rating of 51, is the second-best option for predicting PBC. LMR obtained a total rating of 40, followed by SVM with the dot and polynomial kernels, with ratings of 29 and 22, respectively. Lastly, SVM with the neural kernel is the least accurate of the six developed models, scoring only 10 points. Overall, two conclusions can be drawn from the analysed results. Firstly, AI/ML models such as SVM with the RBF kernel and SVM with the ANOVA kernel predict PBC more accurately than the statistical LMR model. Secondly, the SVM model with the RBF kernel and four input parameters gives the most accurate results of all. Therefore, graphs of the PBC predicted by the simplified SVM-RBF model against the actual PBC were produced for the training and testing sets in RapidMiner, as shown in Figures 8 and 9. In addition, the difference between the actual and predicted PBC values for the testing set (28 data samples) is plotted for the best developed model, the simplified SVM with RBF kernel (Figure 10). These figures, together with the results of all models, confirm that the RBF kernel of SVM is the best model applied in this study, with the highest accuracy and the lowest system error. This model can be used by other designers or engineers for the same PBC problem in the future.

Fig. 8. Graph of predicted PBC versus actual PBC for the training set of SVM-RBF.

Fig. 9. Graph of predicted PBC versus actual PBC for the testing set of SVM-RBF.

Fig. 10. Actual and predicted PBC values for 28 data samples of testing using SVM-RBF model.

Referring back to the literature, the SVM-RBF model developed in this study has a higher R2 value than several published predictive tools, including ANN, general regression neural networks, and a hybrid of the group method of data handling and fuzzy polynomials. Over the years, several SVM models have been proposed for forecasting the PBC. For instance, Pal and Deswal [ 20 ] investigated SVM as a potential model for predicting static pile capacity using a database of 81 samples. They found an R2 value of 0.967 for the RBF kernel function. Samui and Kim [ 52 ] used an SVM model to forecast the PBC from 28 pile data sets and achieved a training performance of R2 = 0.951. In addition, Kordjazi et al. [ 53 ] developed an SVM model to predict PBC using 108 data samples. Their SVM model, which used a radial basis kernel, achieved a highest correlation coefficient of 0.945. The present study has two advantages over these studies. First, it achieved a higher level of accuracy, which is always of interest and importance in simulation studies. Second, it used a larger number of data samples, which allows a model with a higher level of generalization to be proposed. Researchers should be aware of this point and try to develop models that cover a larger range of data.

5. Limitations of study

One limitation of this study is the limited size of the database available in the industry. In developing AI/ML techniques, the available database plays an important role, and an incomplete or insufficient database is the main obstacle to developing a high-performance and accurate predictive model. The reason behind this limitation is that preparing the database is time consuming and costly. The most popular test for obtaining the inputs is the HSDT, commonly known as the PDA test. The working procedure of this test is long and requires a large number of workers. In addition, heavy machinery such as excavators and mobile cranes is essential for the test, which also requires a large amount of expensive equipment and technology.

The second limitation of this study is that the proposed predictive model is applicable only to areas with similar soil properties. The PBC varies with the soil properties, and every part of the world has different kinds of soil with different properties in terms of cohesion, friction angle, and so on. Although the predictive model is site specific, the algorithm and research methodology established in this study allow the prediction to be repeated elsewhere in an easy and quick manner.

6. Conclusion

With the aim of obtaining an ML/AI solution that is easy and simple to use, a series of experimental works was carried out at several construction sites to determine pile capacity together with some important factors affecting it. In this study, the relationships between these parameters and pile capacity were modelled using statistical and SVM models. First, out of the five parameters (i.e., H, W, D, S, and L), S was removed as the least effective parameter on pile capacity, and the rest were used for the modelling. Then, LMR and SVM models with five different kernels (i.e., dot, RBF, neural, polynomial, and ANOVA) were developed to predict PBC values. To identify the best of the developed models, a rating system was used. The system rated the models on each performance index, and the model with the highest total rating was taken as the best. As a result, cumulative ratings of 40, 29, 58, 22, 10, and 51 were obtained for the LMR, SVM-dot, SVM-RBF, SVM-polynomial, SVM-neural, and SVM-ANOVA models, respectively. This shows that SVM with the RBF kernel is the most successful model, with R2 values of 0.9669 and 0.9934 for the training and testing sets, respectively. The R2 value of 0.9934 indicates that this SVM model is among the best AI models developed for PBC prediction. The results also show that AI models outperform the statistical model, since SVM-RBF and SVM-ANOVA obtained higher ratings than the statistical LMR model. The proposed SVM-RBF model is introduced as a powerful, easy-to-use, and simple model that can be used in the construction industry to predict PBC values with a high degree of accuracy.


The authors would like to thank the technicians for their contributions to the PDA testing. Furthermore, the authors are grateful to the University of Malaya for supporting this study and making it feasible.

Conflict of Interest

The authors declare that they have no conflict of interest.


  1. Meyerhof GG. Uplift resistance of inclined anchors and piles. Proc. 8th ICSMFE, vol. 2, 1973, p. 167-72.
  2. Armaghani DJ, Sohaei H, Namazi E, Marto A. Investigation of Uplift Capacity of Deep Foundation in Various Geometry Conditions. Open Constr Build Technol J. 2020; 13:344-52. http://doi.org/10.2174/1874836801913010344.
  3. Shahin MA. Intelligent computing for modeling axial capacity of pile foundations. Can Geotech J. 2010; 47:230-43.
  4. Nazir R, Momeni E, Marsono K, Maizir H. An Artificial Neural Network Approach for Prediction of Bearing Capacity of Spread Foundations in Sand. J Teknol. 2015; 72. http://doi.org/10.11113/jt.v72.4004.
  5. Momeni E, Dowlatshahi MB, Omidinasab F, Maizir H, Armaghani DJ. Gaussian Process Regression Technique to Estimate the Pile Bearing Capacity. Arab J Sci Eng. 2020; 45:8255-67. http://doi.org/10.1007/s13369-020-04683-4.
  6. Vakili A, Zomorodian SMA, Totonchi A. Laboratory and three-dimensional numerical modeling of laterally loaded pile groups in sandy soils. Iran J Sci Technol Trans Civ Eng. 2020. http://doi.org/10.1007/s40996-020-00502-w.
  7. Harandizadeh H, Toufigh V. Application of Developed New Artificial Intelligence Approaches in Civil Engineering for Ultimate Pile Bearing Capacity Prediction in Soil Based on Experimental Datasets. Iran J Sci Technol Trans Civ Eng. 2020; 44:545-559.
  8. Harandizadeh H, Armaghani DJ, Khari M. A new development of ANFIS-GMDH optimized by PSO to predict pile bearing capacity based on experimental datasets. Eng Comput. 2021; 37:685-700. http://doi.org/10.1007/s00366-019-00849-.
  9. Armaghani DJ, Harandizadeh H, Momeni E, Maizir H, Zhou J. An optimized system of GMDH-ANFIS predictive model by ICA for estimating pile bearing capacity. Artif Intell Rev. 2021. http://doi.org/10.1007/s10462-021-10065-5.
  10. Masouleh SF, Fakharian K. Application of a continuum numerical model for pile driving analysis and comparison with a real case. Comput Geotech. 2008; 35:406-18.
  11. Momeni E, Jahed Armaghani D, Hajihassani M, Mohd Amin MF. Prediction of uniaxial compressive strength of rock samples using hybrid particle swarm optimization-based artificial neural networks. Measurement. 2015; 60:50-63. http://doi.org/10.1016/j.measurement.2014.09.075.
  12. Momeni E, Nazir R, Armaghani DJ, Maizir H. Application of artificial neural network for predicting shaft and tip resistances of concrete piles. Earth Sci Res J. 2015; 19:85-93.
  13. Moayedi H, Armaghani DJ. Optimizing an ANN model with ICA for estimating bearing capacity of driven pile in cohesionless soil. Eng Comput. 2018; 34:347-56.
  14. Shahin MA. State-of-the-art review of some artificial intelligence applications in pile foundations. Geosci Front. 2016; 7:33-44. http://doi.org/10.1016/j.gsf.2014.10.002. Publisher Full Text
  15. Teh CI, Wong KS, Goh ATC, Jaritngam S. Prediction of pile capacity using neural networks. J Comput Civ Eng. 1997; 11:129-38.
  16. Terzaghi K. 1943, Theoretical Soil Mechanics, John Wiley & Sons, New York n.d.
  17. Vesic AS. Design of pile foundations. National cooperative highway research program synthesis of practice no. 42. Transp Res Board, Washington, DC 1977;3248.
  18. Pham TA, Ly H-B, Tran VQ, Giap L Van, Vu H-LT, Duong H-AT. Prediction of pile axial bearing capacity using artificial neural network and random forest. Appl Sci. 2020; 10:1871.
  19. Józefiak K, Zbiciak A, Maślakowski M, Piotrowski T. Numerical modelling and bearing capacity analysis of pile foundation. Procedia Eng. 2015; 111:356-63.
  20. Pal M, Deswal S. Modeling Pile Capacity Using Support Vector Machines and Generalized Regression Neural Network. J Geotech Geoenvironmental Eng. 2008; 134:1021-4. http://doi.org/10.1061/(ASCE)1090-0241(2008)134:7(1021). Publisher Full Text
  21. Shahin MA, Jaksa MB, Maier HR. Artificial neural network applications in geotechnical engineering. Aust Geomech. 2001; 36:49-62.
  22. Momeni E, Yarivand A, Dowlatshahi MB, Armaghani DJ. An Efficient Optimal Neural Network Based on Gravitational Search Algorithm in Predicting the Deformation of Geogrid-Reinforced Soil Structures. Transp Geotech. 2021; 26:100446.
  23. Armaghani DJ, Harandizadeh H, Momeni E. Load carrying capacity assessment of thin-walled foundations: an ANFIS-PNN model optimized by genetic algorithm. Eng Comput. 2021. http://doi.org/10.1007/s00366-021-01380-0. Publisher Full Text
  24. Parsajoo M, Armaghani DJ, Mohammed AS, Khari M, Jahandari S. Tensile strength prediction of rock material using non-destructive tests: A comparative intelligent study. Transp Geotech. 2021; 31:100652. http://doi.org/10.1016/J.TRGEO.2021.100652. Publisher Full Text
  25. Momeni E, He B, Abdi Y, Jahed Armaghani D. Novel Hybrid XGBoost Model to Forecast Soil Shear Strength Based on Some Soil Index Tests. Comput Model Eng Sci. 2023; 136:2527-50. http://doi.org/10.32604/cmes.2023.026531. Publisher Full Text
  26. Shalchi Tousi M, Ghazavi M, Laali S. Optimizing Reinforced Concrete Cantilever Retaining Walls Using Gases Brownian Motion Algorithm (GBMOA). J Soft Comput Civ Eng. 2021; 5:1-18. http://doi.org/10.22115/scce.2021.248638.1256. Publisher Full Text
  27. Fakharian P, Rezazadeh Eidgahee D, Akbari M, Jahangir H, Ali Taeb A. Compressive strength prediction of hollow concrete masonry blocks using artificial intelligence algorithms. Structures. 2023; 47:1790-802. http://doi.org/10.1016/j.istruc.2022.12.007. Publisher Full Text
  28. Ghanizadeh AR, Ghanizadeh A, Asteris PG, Fakharian P, Armaghani DJ. Developing bearing capacity model for geogrid-reinforced stone columns improved soft clay utilizing MARS-EBS hybrid method. Transp Geotech. 2023; 38:100906. http://doi.org/10.1016/j.trgeo.2022.100906. Publisher Full Text
  29. Armaghani DJ, Asteris PG, Fatemi SA, Hasanipanah M, Tarinejad R, Rashid ASA, et al. On the Use of Neuro-Swarm System to Forecast the Pile Settlement. Appl Sci. 2020; 10:1904.
  30. Pal M, Deswal S. Modelling pile capacity using Gaussian process regression. Comput Geotech. 2010; 37:942-7. http://doi.org/10.1016/j.compgeo.2010.07.012. Publisher Full Text
  31. Momeni E, Nazir R, Jahed Armaghani D, Maizir H. Prediction of pile bearing capacity using a hybrid genetic algorithm-based ANN. Measurement. 2014; 57:122-31. http://doi.org/10.1016/j.measurement.2014.08.007. Publisher Full Text
  32. Harandizadeh H, Toufigh MM, Toufigh V. Application of improved ANFIS approaches to estimate bearing capacity of piles. Soft Comput. 2019; 23:9537-9549. http://doi.org/10.1007/s00500-018-3517-. Publisher Full Text
  33. Chen W, Sarir P, Bui X-N, Nguyen H, Tahir MM, Armaghani DJ. Neuro-genetic, neuro-imperialism and genetic programing models in predicting ultimate bearing capacity of pile. Eng Comput. 2020; 36:1101-1115. http://doi.org/10.1007/s00366-019-0075. Publisher Full Text
  34. Kordjazi A, Nejad FP, Jaksa MB. Prediction of ultimate axial load-carrying capacity of piles using a support vector machine based on CPT data. Comput Geotech. 2014; 55:91-102.
  35. Lee C-Y, Chern S-G. Application of a support vector machine for liquefaction assessment. J Mar Sci Technol. 2013; 21:318-24.
  36. Mahdevari S, Shahriar K, Yagiz S, Shirazi MA. A support vector regression model for predicting tunnel boring machine penetration rates. Int J Rock Mech Min Sci. 2014; 72:214-29.
  37. Fisher WD, Camp TK, Krzhizhanovskaya V V. Crack detection in earth dam and levee passive seismic data using support vector machines. Procedia Comput Sci. 2016; 80:577-86.
  38. Kim J-Y, Park U. A study on the selection model of retaining wall methods using support vector machines. Korean J Constr Eng Manag. 2006; 7:118-26.
  39. Besalatpour A, Hajabbasi MA, Ayoubi S, Gharipour A, Jazi AY. Prediction of soil physical properties by optimized support vector machines. Int Agrophysics. 2012; 26
  40. Ly H-B, Pham BT. Prediction of shear strength of soil using direct shear test and support vector machine model. Open Constr Build Technol J. 2020; 14:268-77.
  41. Jahed Armaghani D, Asteris PG, Askarian B, Hasanipanah M, Tarinejad R, Huynh V Van. Examining Hybrid and Single SVM Models with Different Kernels to Predict Rock Brittleness. Sustainability. 2020; 12:2229.
  42. Khandelwal M, Kankar P. Prediction of blast-induced air overpressure using support vector machine. Arab J Geosci. 2011; 4:427-33. http://doi.org/10.1007/s12517-009-0092-7. Publisher Full Text
  43. Marjanović M, Kovačević M, Bajat B, Voženílek V. Landslide susceptibility assessment using SVM machine learning algorithm. Eng Geol. 2011; 123:225-34.
  44. Tien Bui D, Pradhan B, Lofman O, Revhaug I. Landslide susceptibility assessment in vietnam using support vector machines, decision tree, and Naive Bayes Models. Math Probl Eng. 2012. http://doi.org/10.1155/2012/9. Publisher Full Text
  45. Lin Y-C, Tseng H-W, Fuh C-S. Pornography detection using support vector machine. 16th IPPR Conf. Comput. Vision, Graph. Image Process. (CVGIP 2003), vol. 19, 2003, p. 123-30.
  46. Hoaglin DC, Iglewicz B. Fine-tuning some resistant rules for outlier labeling. J Am Stat Assoc. 1987; 82:1147-9.
  47. Armaghani DJ, Mohamad ET, Momeni E, Narayanasamy MS. An adaptive neuro-fuzzy inference system for predicting unconfined compressive strength and Young’s modulus: a study on Main Range granite. Bull Eng Geol Environ. 2015; 74:1301-19.
  48. Mohamad ET, Armaghani DJ, Momeni E, Yazdavar AH, Ebrahimi M. Rock strength estimation: a PSO-based BP approach. Neural Comput Appl. 2016;1-12. http://doi.org/10.1007/s00521-016-2728-3. Publisher Full Text
  49. Rezaei H, Nazir R, Momeni E. Bearing capacity of thin-walled shallow foundations: an experimental and artificial intelligence-based study. J Zhejiang Univ A. 2016; 17:273-85. http://doi.org/10.1631/jzus.A1500033. Publisher Full Text
  50. Bunawan AR, Momeni E, Armaghani DJ, Rashid ASA. Experimental and intelligent techniques to estimate bearing capacity of cohesive soft soils reinforced with soil-cement columns. Measurement. 2018; 124:529-38.
  51. Zorlu K, Gokceoglu C, Ocakoglu F, Nefeslioglu HA, Acikalin S. Prediction of uniaxial compressive strength of sandstones using petrography-based models. Eng Geol. 2008; 96:141-58. http://doi.org/10.1016/j.enggeo.2007.10.009. Publisher Full Text
  52. Samui P, Kim D. Least square support vector machine and multivariate adaptive regression spline for modeling lateral load capacity of piles. Neural Comput Appl. 2013; 23:1123-7.
  53. Kordjazi A, Pooya Nejad F, Jaksa MB. Prediction of load-carrying capacity of piles using a support vector machine and improved data collection. Proc. 12th Aust. New Zeal. Conf. Geomech. Chang. Face Earth - Geomech. Hum. Influ. 2015 / Ramsay, G. (ed./s), pp, The New Zealand Geotechnical Society and the Australian Geomechanics Society; 2015, p. 1-8.