Evaluation framework for smartphone-based road roughness index estimation systems

ABSTRACT

Roughness is an important indicator of road deterioration and has a significant impact on road serviceability. Conventional instruments for roughness measurement, such as laser profilers, are expensive and require a complex set-up, which limits the surveying frequency and coverage. As an alternative, embedded sensors in smartphones mounted in vehicles have been leveraged to measure roughness indirectly, and multiple smartphone-based roughness index estimation (sRIE) systems have become available recently. However, a framework for evaluating the performance of sRIE systems in a systematic and repeatable manner is lacking. This research proposes an evaluation framework to assess the performance of sRIE systems in practical settings. The framework consists of statistical measures that evaluate the consistency and accuracy of sRIE systems under various mountings, vehicle types, and survey speeds. Three popular sRIE systems were assessed using the framework to verify their validity and practicality. By standardising the performance metrics, this framework allows for performance benchmarking between sRIE systems and conventional instruments.

1. Introduction

Road roughness is defined as ‘the deviations of a pavement surface from a true planar surface with characteristic dimensions that affect vehicle dynamics, ride quality, dynamic pavement loads, and pavement drainage’ (Wambold et al. 1981). Recent studies suggest that road roughness affects freight logistics efficiency (Steyn et al. 2014), energy dissipation (Louhghalam et al. 2019), vehicle fuel consumption (Botshekan et al. 2020), operating cost (Zaabar and Chatti 2014), and the associated environment (Louhghalam et al. 2017). Generally, pavement roughness assessment is needed to validate the quality of newly constructed pavement and to monitor the condition of existing pavement. Roughness induces distress in the pavement surface and diminishes ride comfort. Moreover, distressed pavement damages traversing vehicles and can develop into more severe pavement distress, such as potholes and rutting. The pavement surface condition can be described by several indices, such as the pavement condition index (PCI), pavement serviceability rating (PSR), and international roughness index (IRI). Among them, the IRI (Sayers 1995), a single roughness scale that enables the exchange of pavement roughness information internationally, is the most widely adopted index. For a given road profile, the IRI quantitatively represents the response of a quarter-car model. It is computed by accumulating the elevation deviation of the body (sprung mass) and the wheel axis (unsprung mass) over the travelled length and is a mathematical function of the longitudinal profile of the road. The IRI has been adopted by most countries to indicate pavement surface conditions and to infer overall vehicle operating cost, ride quality, and dynamic wheel loads.

The IRI can be measured by various approaches, which are categorised into four classes: Class 1: Precision profilers, Class 2: Inertial profilometric methods, Class 3: IRI estimates from correlation equations using response-based methods, and Class 4: Subjective ratings and uncalibrated measures (Sayers et al. 1986). It should be noted that the IRI computation relies on the pavement profile measurement, which can be represented by waveforms that vary in wavelength and amplitude (Mann et al. 1997). While the conventional Class 1 (e.g. walking profiler) and Class 2 (e.g. inertial profiler) instruments aim at obtaining these two variables, the Class 3 (e.g. roughometer and smartphone) approaches estimate the index from the correlation between the index and the vehicle's dynamic responses. With the goal of developing effective tools for pavement network monitoring and maintenance planning, the smartphone-based roughness index estimation (sRIE) system, due to its cost-effectiveness, accessibility, and pervasiveness, is gaining increasing research attention and is being adopted in practice.

RIE instruments need to undergo validation tests to check their consistency and accuracy levels before being used in the field. Regarding the state-of-the-art guidelines on validating conventional roughness assessment tools, specifications for a profiler are regulated in ISO 13473-3 and BS EN 13036-5:2019. Moreover, BS EN 13036-7:2003 provides a guideline for the static profiler and rod-and-level method, which are considered the most accurate pavement profile estimating instruments and are used to calibrate other equipment (Sayers 1995). Furthermore, Austroads AM/T001-4 defines procedures for performing validation checks for profilers (Austroads 2016a). Despite the various guidelines for validating Class 1 and 2 instruments, there is still no comprehensive framework to validate smartphone-based systems.

sRIE systems have been tested in the existing literature. An sRIE system was tested on both sealed and unsealed roads using a single vehicle type at different constant speeds, and the results were compared with those obtained from laser profilers to identify the limitations of sRIE systems (Wix 2016). Similarly, two sRIE systems were evaluated on a 1.2 km road segment at constant speed with a single vehicle type, and their results were compared with those obtained from a laser profiler (Shah et al. 2017). Meanwhile, sRIE measurements obtained from four different vehicle models and mounting locations were plotted against the profiler measurements, and a statistical index was calculated to indicate the correlation between the two (Botshekan et al. 2021). The IRI estimated by an sRIE system was compared to the IRI measured by a bump integrator, and the results were correlated using a linear regression relationship (Sandamal and Pasindu 2020). Moreover, to account for the impact of smartphone mounting configurations, Hossain et al. (2019) evaluated the performance of two sRIE systems on sealed roads by adopting various mounting types at different constant speeds. The most common statistics used to quantitatively describe the accuracy of an sRIE system are the coefficients of the linear regression relationship between measurements obtained from the reference instrument and from smartphones (Douangphachanh and Oneyama 2014a, Wessels and Steyn 2020), while some studies plot the measurements of the smartphone and the reference instrument spatially to obtain a direct view of the system's performance (Jeong et al. 2020, Xue et al. 2020).

Most of the evaluations presented in the literature were conducted in different practical settings, such as with different vehicle types, different mounting configurations, and different travelling speeds. Furthermore, existing sRIE studies tend to have their own definition or calculation of repeatability and accuracy, which makes the comparison between different methods challenging. There is a lack of standardised procedures that allow for a systematic and objective comparison between different sRIE systems. In light of this gap, this research proposes a framework that enables objective performance evaluation of sRIE systems for pavement roughness assessment. First, we propose a set of summative metrics comprising the critical statistical measures that describe the accuracy and consistency of the roughness index measurements of sRIE systems. Then, the framework is applied to evaluate three commercial sRIE systems to validate their efficacy.

The rest of the paper is organised as follows. Section 2 introduces the estimation methodology of the state-of-the-art sRIE systems and discusses the dominating practical factors. Section 3 elaborates on the framework’s statistical measures. Section 4 describes the experimental set-up of a field test in which three sRIE systems were evaluated. Section 5 presents the experimental results. Section 6 summarises the experimental findings, highlights the significance of the proposed framework, and sheds light on considerations for future research. Section 7 concludes the paper.

2. State-of-the-art sRIE systems

2.1. Smartphone-based IRI estimation methods

Current sRIE systems estimate the IRI through (1) a statistical correlation approach, (2) a vehicle model-based approach, or (3) a machine learning model (Yu et al. 2022).

Statistical methods establish relationships between the acceleration and the reference IRI. For instance, a relationship was built between the RMS of the acceleration from all three axes and the IRI measured using an inertial profiler (Forslöf and Jones 2015, Thiandee et al. 2019). Meanwhile, a second-degree polynomial model was built using vertical acceleration and vehicle speed to estimate the IRI (Wessels and Steyn 2020). Nonetheless, conventional regression-based models require calibration for each specific vehicle or mounting configuration.

Since a vehicle’s mass and suspension characteristics affect how its body reacts to the road, it is essential to consider these characteristics when estimating the IRI. In the vehicle model-based approach, the IRI can be estimated from the vehicle response, measured by smartphones, using a transfer function that defines the relationship between the power spectral density (PSD) of the vehicle body or axle acceleration and the road profile (Janani et al. 2020). Apart from the PSD relationship, studies estimate the road profile from the vehicle’s acceleration by solving a vehicle model-based state-space matrix (Islam 2015). The suspension parameters of the testing vehicle can also be estimated from a bump test, after which the vehicle with known characteristics is used to estimate the profile (Zhang et al. 2021). Once the road profile is known, the IRI can be calculated using the standard algorithm (Sayers et al. 1986, Sayers 1995).

Machine learning algorithms have been utilised to compute the IRI in recent studies. A convolutional neural network (CNN) was used to estimate the IRI from multiple vehicle responses measured by smartphones (Jeong et al. 2020). Moreover, statistical features (i.e. mean, range, and variance) were extracted from the smartphone acceleration signal, and a prediction model was then trained from these features and the ground truth IRI (Laubis et al. 2016).

Besides the three categories, a stochastic model was derived to relate the road roughness PSD and the vertical acceleration of the Quarter Car (QC) vehicle model (Botshekan et al. 2021). Vehicle dynamics and random vibration theory were incorporated to estimate the IRI, and correction functions were applied to account for signals collected at variable speeds. Such an approach requires no prior knowledge of the suspension parameters or a calibration test, obtained robust results (estimation accuracy of 8%) for different vehicle speeds and mounting locations, and was validated as ideal for crowdsourcing applications (Botshekan et al. 2020).

2.2. Dominating practical factors

This study identified three critical factors: surveying speed, vehicle type, and mounting configuration. The following sections demonstrate how variation in these factors affects the sRIE measurements in the existing literature.

Speed. Response-based RIE systems are affected by the travelling speed, as the vehicle body’s vertical acceleration depends on the vehicle speed (Schlotjes et al. 2014, Galagoda and Lanka 2019). Specifically, the coefficients of the IRI-acceleration regression model vary with the survey speed (Douangphachanh and Oneyama 2014b). The acceleration increased by 93% when the vehicle speed changed from 30 to 80 km/h (Wang and Ghataora 2020). Similarly, smartphone-based systems were tested at various speeds in field experiments to determine the effect of speed (Cameron 2014). When comparing the smartphone-measured IRI (sIRI) to the reference IRI (rIRI) acquired by a laser profiler, studies reported poor correlations between them because the sRIE systems were highly speed-dependent (Gamage et al. 2016, Wix 2016).

Vehicle type. sRIE systems are affected by differences in vehicle suspension types. In past studies, the vertical acceleration collected from different vehicles was damped to different degrees due to varying vehicle suspensions (Islam et al. 2014). The sIRI values collected from three vehicles were not statistically similar (Cameron 2014). Moreover, the coefficients of the linear relationship between the acceleration and the IRI changed when the data was collected from a different vehicle type (Douangphachanh and Oneyama 2014b).

Mounting. Smartphone mounting (e.g. windshield, dashboard, air vent) affects the measurement of the vehicle body’s acceleration. Previous studies investigated the impact of different mounting configurations on the measured sIRI (Bridgelall et al. 2019). Three mounting types were tested, and it was found that the windshield mount provided the closest result to a profiler while the air vent mount presented the largest error of 85.8% (Hanson et al. 2014). A comparison between windshield and dashboard mounts (Ordaz and Doyle 2021) revealed that the measurements from a rigid mount on the windshield were more consistent. Similarly, Hossain et al. (2019) tested four mounting types and found that their measurements did not converge well in most road sections.

The review found that it remains unclear whether commercial sRIE systems provide robust measurements when these factors vary in practical settings. Therefore, a framework comprising experimental design guidelines and statistical performance measures was designed to evaluate sRIE systems, with a special focus on validating their consistency and accuracy under various practical settings.

3. Evaluation framework

The aim of establishing an evaluation framework is to quantify and rank the performance of an object in an unbiased and informative manner (Dollár et al. 2012). An effective evaluation framework assesses two properties, namely accuracy and repeatability. Accuracy represents how close the results of the tested system are to the desired values, while repeatability concerns the system’s ability to produce the same results across multiple tests under the same circumstances. Repeatability and accuracy tests are commonly adopted to measure the performance of engineering systems. For example, they were used in the field of occupational ergonomics to evaluate the performance of an IMU designed for measuring workers’ upper arm elevation and angular displacement (Schall et al. 2016). In the field of robotics, accuracy and repeatability tests were applied to assess the object positioning and motion mapping precision of an industrial robot under various loading and temperature settings (Płaczek and Piszczek 2018).

Designed for the field of pavement condition assessment, the proposed framework brings two direct benefits. First, it fixes the statistical measures and standardises the procedures used to calculate them. Second, it guides experimental design for field tests. A schematic illustration of the framework is shown in Figure 1. The practical setting includes surveying speed, vehicle type, and mounting configuration. The benchmark measurements are obtained from a reference instrument, such as an inertial profiler (a practical implementation of the data collection is demonstrated in Section 4). The segment length is determined by the user and could be a typical value of 10, 20, 50, 100, or 200 m, as long as it is kept consistent between the sIRI and rIRI measurements. Each statistical measure is explained in detail in the following sections.

Figure 1. Schematic illustration of the evaluation framework.

3.1. Repeatability test

Repeatability tests evaluate to what extent sRIE systems can produce the same results under a particular testing setting. In other words, they reflect the agreement between successive measurements. The repeatability test plays a key role in determining the system’s reliability and is therefore of interest to both developers and users. With reference to existing literature that defines repeatability checks for inertial profilers (Austroads 2016b), five repeat runs were deemed necessary in this framework to validate the measurement repeatability.

The repeatability performance under each practical setting is reported using two statistical measures, namely the coefficient of variation ($CoV$) of each measuring segment and the $R^{2}$ of the sIRI_individual vs sIRI_mean linear regression model. In addition to these two statistical measures, the evaluation framework also identifies, in a sensitivity test, the factors that have a dominating impact on the repeatability performance. For this purpose, a multiple linear regression model is established with the $CoV$ as the dependent variable and the rIRI and survey speed as the independent variables. Each statistic is defined mathematically in the following sub-sections.

3.1.1. Coefficient of variation

The $CoV$ of a segment measures the relative dispersion of five measurements around their mean (the ratio of standard deviation to the mean). This percentage directly shows the variation of measurements in each segment. $CoV_{mean}$ is computed as the average of the $CoV$ of all segments in a testing route and indicates the system’s performance on the entire route.

For each segment:

$CoV = \frac{\sigma_{n}}{\bar{X}_{n}}$ (1)

where $\sigma_{n} = \sqrt{\frac{\sum_{i=1}^{N} (X_{ni} - \bar{X}_{n})^{2}}{N - 1}}$ is the sample standard deviation of the measurements at the nth segment; $\bar{X}_{n} = \frac{\sum_{i=1}^{N} X_{ni}}{N}$ is the arithmetic mean of the measurements at the nth segment; $N$ is the total number of repetitive runs; and $X_{ni}$ is the measurement on segment n from the ith run.

The $CoV_{mean}$ of a testing route is:

$CoV_{mean} = \frac{\sum_{n=1}^{n_{s}} CoV_{n}}{n_{s}}$ (2)

where $CoV_{n}$ is the coefficient of variation of the measurements at the nth segment and $n_{s}$ is the total number of segments.
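As an illustration of Equations (1) and (2), the following minimal Python sketch computes the segment $CoV$ and the route-level $CoV_{mean}$. The function names and the example sIRI values are hypothetical and are not taken from the experiment; only the standard library is assumed.

```python
from statistics import mean, stdev


def segment_cov(runs):
    """CoV of one segment: sample standard deviation (N - 1) over the mean, Eq. (1)."""
    return stdev(runs) / mean(runs)


def route_cov_mean(segments):
    """Average of the segment CoV values over a testing route, Eq. (2)."""
    return mean(segment_cov(runs) for runs in segments)


# Hypothetical sIRI values (mm/m): one inner list per segment, N = 5 repeat runs each.
example_route = [
    [2.0, 2.2, 1.8, 2.1, 1.9],
    [4.0, 4.4, 3.6, 4.2, 3.8],
]
cov_per_segment = [segment_cov(runs) for runs in example_route]
cov_mean = route_cov_mean(example_route)
print([round(c, 4) for c in cov_per_segment], round(cov_mean, 4))
```

In this made-up example the second segment is simply the first scaled by two, so both segments share the same $CoV$; this reflects the fact that the measure is relative to the segment mean.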

3.1.2. Correlation with the mean measurements

Apart from the $CoV_{mean}$, the scatter plot of individual data points versus the mean provides a visual indication of the sRIE system’s level of consistency. In this plot, each data point on the x-axis corresponds to five data points on the y-axis. Using least squares regression, a linear regression model can be built between the individual sIRI values (dependent variable) and the mean of the sIRI values (independent variable):

$sIRI_{individual} = k_{(c,m)} \times sIRI_{mean} + b_{(c,m)}$ (3)

where $sIRI_{mean}$ is the mean IRI of five repetitive runs; $sIRI_{individual}$ is the IRI of one run; and $k_{(c,m)}, b_{(c,m)}$ are the regression coefficients for each vehicle ($c$) and mounting ($m$) setting. Because each mean is computed from the very runs it is regressed against, the slope and intercept of this model are always equal to 1 and 0 and carry no information. The $R^{2}$ of the regression model should be reported instead, as it summarises the residuals of the individual measurements about the regression line and thus indicates how closely the individual measurements agree.
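The $R^{2}$ of this regression can be sketched as below. The sketch relies on the property stated above that the fitted line has slope 1 and intercept 0, so the residual of each run is simply its deviation from the segment mean. The function name and data are hypothetical.

```python
from statistics import mean


def repeatability_r2(segments):
    """R^2 of individual sIRI values regressed on their segment means.

    With slope 1 and intercept 0, the predicted value of every run is the
    mean of its own segment, so residuals are deviations from that mean.
    """
    individual = [x for runs in segments for x in runs]
    grand_mean = mean(individual)
    ss_res = sum((x - mean(runs)) ** 2 for runs in segments for x in runs)
    ss_tot = sum((x - grand_mean) ** 2 for x in individual)
    return 1.0 - ss_res / ss_tot


# Hypothetical sIRI values (mm/m): two segments, five repeat runs each.
example_route = [
    [2.0, 2.2, 1.8, 2.1, 1.9],
    [4.0, 4.4, 3.6, 4.2, 3.8],
]
r2 = repeatability_r2(example_route)
print(round(r2, 4))
```

Note that this $R^{2}$ depends on the spread of roughness between segments as well as on the scatter within them, so it is best compared between systems surveyed on the same route.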

3.1.3. Repeatability sensitivity

As evidenced by the discussion in Section 2, smartphone measurements vary significantly among different practical settings; hence, the impact of changes in the practical setting on the repeatability measures is of interest to the evaluation process. Besides practical settings, it is assumed that the roughness of the pavement itself also governs the repeatability performance. Specifically, the repeatability performance on rough pavement is expected to be worse than that on smooth pavement because of the more intense and random vibration that a rough surface induces. It should be noted that the actual roughness of the pavement needs to be measured using a reference instrument.

In addition to the rIRI, other dominant practical factors, including survey speed, vehicle type and mounting configuration, should also be considered. Survey speed and the rIRI can be regarded as quantitative variables, while vehicle and mounting are categorical variables. The evaluation method should identify the most dominating factor by visualising the segment $CoV$ with respect to the variation of these factors. Therefore, for each vehicle and mounting setting, a multiple linear regression model should be established as below:

$CoV = \beta_{0} + \beta_{1} \times speed + \beta_{2} \times rIRI$ (4)

where $CoV$ is the coefficient of variation of a segment (refer to Section 3.1.1); $speed$ is the average speed across five repetitive runs at a segment; $rIRI$ is the benchmark IRI value at a segment; and $\beta_{0}, \beta_{1}, \beta_{2}$ are the regression coefficients.
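A minimal sketch of fitting Equation (4) by ordinary least squares is given below. It solves the two-predictor normal equations directly on mean-centred data; the function name is hypothetical, and the example values are synthetic (generated from known coefficients) rather than measured.

```python
from statistics import mean


def fit_cov_model(cov, speed, riri):
    """Least-squares fit of CoV = b0 + b1 * speed + b2 * rIRI, Eq. (4)."""
    m_y, m_s, m_r = mean(cov), mean(speed), mean(riri)
    ds = [s - m_s for s in speed]
    dr = [r - m_r for r in riri]
    dy = [y - m_y for y in cov]
    s_ss = sum(a * a for a in ds)
    s_rr = sum(a * a for a in dr)
    s_sr = sum(a * b for a, b in zip(ds, dr))
    s_sy = sum(a * b for a, b in zip(ds, dy))
    s_ry = sum(a * b for a, b in zip(dr, dy))
    det = s_ss * s_rr - s_sr ** 2
    if det == 0:
        raise ValueError("speed and rIRI are constant or perfectly collinear")
    # Cramer's rule on the centred 2x2 normal equations.
    b1 = (s_sy * s_rr - s_ry * s_sr) / det
    b2 = (s_ry * s_ss - s_sy * s_sr) / det
    b0 = m_y - b1 * m_s - b2 * m_r
    return b0, b1, b2


# Synthetic per-segment data built from CoV = 0.02 + 0.001 * speed + 0.01 * rIRI.
speed = [40, 60, 80, 40, 60, 80]            # km/h
riri = [1.5, 2.0, 3.0, 4.0, 2.5, 1.0]       # mm/m
cov = [0.02 + 0.001 * s + 0.01 * r for s, r in zip(speed, riri)]
b0, b1, b2 = fit_cov_model(cov, speed, riri)
print(round(b0, 4), round(b1, 4), round(b2, 4))
```

Because speed (km/h) and rIRI (mm/m) are on different scales, the raw coefficients are not directly comparable; standardising the predictors before fitting is one possible way to judge which factor dominates.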

3.2. Accuracy test

Besides repeatability, the measurement accuracy of an sRIE system is also crucial. An accuracy test quantitatively assesses how close the measurements of the sRIE system are to the benchmark IRI obtained from the reference instrument.

The accuracy performance under each practical setting is reported using two measures, namely the average measurement error ($\epsilon_{mean}$) and the $R^{2}$ of the sIRI vs rIRI linear regression model. Besides these two measures, the evaluation framework also identifies the factors affecting the measurement error, which is the difference between the reference and smartphone measurements. For this purpose, a linear regression model is established between $\epsilon$ (dependent variable) and survey speed (independent variable). The computation of each test is elaborated in the following sections.

3.2.1. Average of measurement error

The average measurement error describes the difference between the sIRI and rIRI over an entire survey run. It is a percentage that directly reflects the accuracy of an sRIE system and is defined as:

$\epsilon_{mean} = \left| \frac{1}{n_{s}} \sum_{n=1}^{n_{s}} \frac{sIRI_{n_{mean}} - rIRI_{n}}{rIRI_{n}} \right|$ (5)

where $sIRI_{n_{mean}}$ is the average of the five sIRI values on the nth segment; $rIRI_{n}$ is the reference IRI on the nth segment; and $n_{s}$ is the total number of segments in a testing route.
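Equation (5) can be sketched in a few lines of Python. The function name and the example values are hypothetical. The sketch follows the equation as written, taking the absolute value after averaging the signed relative errors.

```python
def mean_relative_error(siri_mean, riri):
    """Absolute value of the route-average signed relative error, Eq. (5)."""
    errors = [(s - r) / r for s, r in zip(siri_mean, riri)]
    return abs(sum(errors) / len(errors))


# Hypothetical values (mm/m): per-segment mean sIRI and the matching rIRI.
siri_mean = [2.2, 3.0, 5.5]
riri = [2.0, 4.0, 5.0]
eps_mean = mean_relative_error(siri_mean, riri)
print(round(eps_mean, 4))
```

With this definition, over- and under-estimates on different segments partially cancel (here +10%, -25% and +10% average to about 1.7%), so $\epsilon_{mean}$ describes the overall bias on a route rather than the typical per-segment error.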

3.2.2. Correlation with the reference IRI

The scatter plot of sIRI data points versus the rIRI provides a visualisation of how sIRI measurements distribute with respect to the benchmark values. Similar to Section 3.1.2, a linear regression model could be established, and the regression coefficients play an important role in describing the accuracy level herein. The closer the slope is to one and the intercept is to zero, the better the performance of the smartphone system is in producing measurements of close magnitude to the reference instrument.

Regression coefficients

Using least squares regression, a linear regression model along with the coefficient of determination can be determined between the rIRI and the sIRI:

$sIRI = k_{(c,m)} \times rIRI + b_{(c,m)}$ (6)

where $rIRI$ is the reference IRI of a segment; $sIRI$ is the smartphone measurement on a segment; and $k_{(c,m)}, b_{(c,m)}$ are the regression coefficients for each vehicle ($c$) and mounting ($m$) setting.

Coefficient of determination

The coefficient of determination, or $R^{2}$ , is a measure that provides information about the goodness of fit of a model. In the context of regression, it is a statistical measure of how well the regression line approximates the actual data.
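The regression of Equation (6) and its $R^{2}$ can be sketched as follows, using the standard closed-form least-squares expressions. The function name and the data are hypothetical.

```python
from statistics import mean


def linear_fit(x, y):
    """Ordinary least-squares slope k, intercept b and R^2 for y = k * x + b."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (c - my) for a, c in zip(x, y))
    k = sxy / sxx
    b = my - k * mx
    ss_res = sum((yi - (k * xi + b)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return k, b, 1.0 - ss_res / ss_tot


# Hypothetical per-segment values (mm/m).
riri = [1.0, 2.0, 3.0, 4.0]
siri = [1.2, 1.9, 3.2, 3.9]
k, b, r2 = linear_fit(riri, siri)
print(round(k, 3), round(b, 3), round(r2, 3))
```

A slope near one and an intercept near zero indicate measurements of similar magnitude to the reference, while $R^{2}$ indicates how tightly the points follow the fitted line; the two should be read together, since a high $R^{2}$ can coexist with a biased slope.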

3.2.3. Accuracy sensitivity

The accuracy sensitivity test aims to understand the impact of speed variation on measurement accuracy under different mounting and vehicle settings. The measurement accuracy is expressed as the relative difference between the smartphone and reference instrument measurement values. A linear regression model is established between the independent variable (surveying speed) and the dependent variable ($\epsilon$). If the sRIE system is speed-independent, there should be no statistically significant relationship between the two variables, i.e. the slope should not differ significantly from zero. Mathematically, the relationship can be represented as follows:

$\epsilon = k_{(c,m)} v + b_{(c,m)}$ (7)

where $\epsilon$ is the relative measurement error $(sIRI - rIRI) / rIRI$ on a segment; $v$ is the average speed on a segment; and $k_{(c,m)}, b_{(c,m)}$ are the regression coefficients for each vehicle and mounting setting.
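One way to examine the significance of the slope in Equation (7) is the usual t statistic of a simple linear regression, sketched below. The choice of this test is an assumption of the sketch rather than something prescribed by the framework, and the function name and data are hypothetical.

```python
from math import sqrt
from statistics import mean


def slope_t_statistic(v, eps):
    """Fit eps = k * v + b and return (k, b, t, dof) for the slope k.

    t = k / SE(k) with SE(k) = sqrt(SS_res / (n - 2) / S_vv); dof = n - 2.
    """
    n = len(v)
    mv, me = mean(v), mean(eps)
    s_vv = sum((a - mv) ** 2 for a in v)
    s_ve = sum((a - mv) * (e - me) for a, e in zip(v, eps))
    k = s_ve / s_vv
    b = me - k * mv
    ss_res = sum((e - (k * a + b)) ** 2 for a, e in zip(v, eps))
    se_k = sqrt(ss_res / (n - 2) / s_vv)
    t = k / se_k if se_k > 0 else float("inf")  # perfect fit: no residual scatter
    return k, b, t, n - 2


# Hypothetical per-segment data: average speed (km/h) and relative error.
v = [40, 60, 80, 40, 60, 80]
eps = [0.05, 0.10, 0.16, 0.04, 0.11, 0.14]
k, b, t, dof = slope_t_statistic(v, eps)
print(round(k, 6), round(b, 4), round(t, 2), dof)
```

The returned t value would be compared against a Student t distribution with `dof` degrees of freedom to obtain a p-value; a statistics package can do this, and it is omitted here to keep the sketch dependency-free.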

4. Experiment design

This section introduces a field experiment that demonstrates the feasibility of evaluating sRIE systems using the proposed framework. The experiment was conducted as per the data collection procedure illustrated in Figure 1. Specifically, one reference instrument and three sRIE systems were involved; two survey routes were selected; three constant survey speeds were used on each route; and two mounting locations and two vehicle types were tested. However, it should be noted that the framework can accommodate a different number of practical settings from those selected in this study. In terms of segment length, although the profiler and two of the Apps can report the IRI at 10 m segmentation, 100 m segmentation, which is a common evaluation length in practice, was selected in this study.

4.1. sRIE systems

This study investigates three commercial smartphone applications. App1 (Roadroid) (Forslöf and Jones 2015) estimates the IRI based on the RMS (Root Mean Square) of the acceleration signal. The RMS-IRI correlation relationship was established in the development stage and was tailored for different vehicle body types. App2 (Totalpave) (Cameron 2014, Hanson et al. 2014) first applies the FFT (Fast Fourier Transform) to transform the acceleration signal into the frequency domain and converts the acceleration into displacement data, which is then used as the profile data for the IRI calculation algorithm. App3 (iDRIMS) (Zhao and Nagayama 2017, Zhao et al. 2019, Xue et al. 2020) requires the testing vehicle’s dimension information. Acceleration measurements are used to estimate the profile. The vehicle mass and suspension parameters of the Half-Car (HC) model are optimised such that the differences between the profiles estimated from the front and rear tyre locations are minimal. The IRI is then calculated from the estimated road profile.

Sensor and smartphone model. Since most existing sRIE Apps were developed for the Android operating system, a common Android smartphone model, the Samsung S9, was used in all tests to eliminate the uncertainties caused by smartphone model variation. The onboard inertial measurement unit (IMU) is the LSM6DSL inertial module.

Vehicle type. The two vehicle models employed in the experiments are a Ford Ranger (denoted as $U$ for ute) and a Volkswagen Golf (denoted as $H$ for hatchback), as shown in Figure 2.

Figure 2. Two vehicle models used in the experiment: (a) a ute and (b) a hatchback.

Mounting locations and mount type. The tested sRIE systems recommend mounting on the windshield. In addition, the dashboard was included since it is another common mounting location for drivers. The mounting locations were selected in line with the guideline on in-cabin smartphone mounting from Transport NSW (NSW Government 2017). Two of the sRIE systems suggest using a rigid, short-arm mount, while one suggests attaching the smartphone to the vehicle body through an armless mount. However, in this experiment all three sRIE systems were mounted on an ‘iottie one touch Gen 5’ mount, which is fixed to the vehicle body using a firm suction connection. The in-cabin smartphone set-up is shown in Figure 3.

Figure 3. In-cabin set-up (App interface covered).

Reference instrument. To acquire accurate pavement profile data as the ground truth in this experiment, the Australian Road Research Board (ARRB) inertial profiler system, as shown in Figure 4, was employed to collect the reference IRI data (rIRI). This instrument was in service for surveying the Victoria road network and had been calibrated according to the criteria specified in Austroads standards (Austroads 2016a). Hence, it was adopted in this experiment to provide the reference IRI. The inertial profiler contains laser sensors and accelerometers. The distance to the road surface is measured by the laser sensor, while the movement of the vehicle body is obtained from processing the acceleration data. Although profilers usually have high repeatability (measurement consistency) (Wix 2016), three repetitive runs were conducted to obtain robust ground truth data.

Figure 4. Survey vehicle equipped with a profiler.

Survey routes. Two routes (Miles Rd and Convent School Rd) in the Melbourne metropolitan area were selected. Similar to Yang et al. (2020) and Ahmed et al. (2021), this experiment selected both sealed and unsealed pavement. As shown in Table 1, Miles Rd is surfaced with sealed asphalt while Convent Rd is unsealed gravel pavement. The testing routes were selected such that no traffic lights exist, so that a constant driving speed can be maintained throughout the survey.

Table 1. Testing route details and survey speeds.

Time of survey and weather conditions. It should be noted that the surveys were conducted on two consecutive days. The temperature and relative humidity during the data collection were 16.9°C and 80% (at the start) and 24.4°C and 56% (at the end) on Day 1, and 16.0°C and 95% (at the start) and 15.7°C and 81% (at the end) on Day 2. The time gap between the smartphone system runs and the reference profiler runs was kept minimal, which minimises the impact of ongoing pavement deterioration on the experiment results. Moreover, the data collection runs were completed under the same weather conditions on both routes. This is particularly critical for gravel pavement, the roughness of which changes after heavy rain.

Synchronisation of sIRI and rIRI. The synchronisation of sIRI and rIRI was achieved by ensuring that all runs start and end at the same marked positions and that the same number of segments was obtained.

Miscellaneous. Cruise control was adopted to maintain a constant driving speed during the survey. The windscreen air conditioning was switched on to keep the smartphones from overheating. In addition, the number of drivers and passengers stayed consistent in all runs, as the weight of the vehicle body may affect the measurement consistency.

5. Evaluation framework validation results

This section presents the evaluation results in the form of the tested statistical measures. The results are presented as per the evaluation matrix included in .

Smartphone-measured IRI values (from one of the Apps) and the benchmark IRI values are shown in Figure 5. The plots present the measurements under the practical setting ‘Ute | Windshield at 60 km/h’. As the plots indicate, the IRI of Miles Rd ranges from 1 to 3 mm/m in most segments, although it reaches nearly 6 mm/m in two segments. In contrast, the IRI of Convent Rd is between 3 and 5 mm/m, which is significantly higher than that of Miles Rd. Meanwhile, it can be observed that the sIRI values are more consistent and better aligned with the rIRI on Miles Rd, as evidenced by fewer sIRI spikes that deviate from the rIRI.

Figure 5. Plot of IRI measurements from one of the Apps and the profiler.

5.1. Repeatability test

5.1.1. $C o V_{m e a n}$

The $C o V_{m e a n}$ (Section 3.1.1) is the average of the coefficients of variation of all segments contained in a testing route. It indicates the relative dispersion of the five repeated measurements around their mean. The $C o V_{m e a n}$ results under all practical settings are presented in Tables 2 and 3.

Table 2. $C o V_{m e a n}$ on Miles Rd (%).


Table 3. $C o V_{m e a n}$ on Convent Rd (%).


The $C o V_{m e a n}$ values of the smartphones and the profiler differ markedly. The profiler measurements maintain a $C o V_{m e a n}$ of less than 5% across the testing routes (2.29% on Miles Rd and 4.46% on Convent Rd), which satisfies the repeatability criteria outlined in Austroads (2016b). The value on Convent Rd is higher than that on Miles Rd. This difference could be attributed to the uneven gravel surface, which introduces more vibration to the accelerometer and laser sensor on the profiler; these sensors are likely to perform less consistently when exposed to intense vibration.

For the sRIE systems, the $C o V_{m e a n}$ sits in the range of 7.07%–29.30%. Among the three systems, App1 and App3 produced a $C o V_{m e a n}$ of less than 10% under most practical settings. Furthermore, the $C o V_{m e a n}$ of the Convent Rd measurements is higher overall than that of Miles Rd, suggesting that the sRIE systems’ repeatability drops significantly when testing a rougher pavement surface.
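The computation behind $C o V_{m e a n}$ can be sketched as follows. This is a minimal illustration rather than the code used in the study: the function name `cov_mean`, the use of the sample standard deviation, and the sIRI values are all assumptions made for the example.

```python
from statistics import mean, stdev


def cov_mean(runs_by_segment):
    """Average, over segments, of the coefficient of variation (%) of repeated runs."""
    covs = []
    for runs in runs_by_segment:
        # CoV of one segment: sample standard deviation relative to the mean.
        # (Assumption: the sample, not population, standard deviation is used.)
        covs.append(100.0 * stdev(runs) / mean(runs))
    return mean(covs)


# Hypothetical sIRI values (mm/m): one row per 100 m segment, five runs each.
example = [
    [2.0, 2.2, 1.8, 2.1, 1.9],
    [3.0, 3.3, 2.7, 3.0, 3.0],
]

print(f"CoV_mean = {cov_mean(example):.2f}%")
```

Because each segment is normalised by its own mean before averaging, a rough segment and a smooth segment contribute on the same relative scale.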

5.1.2. Correlation with the mean measurements

The $R^{2}$ of the linear regression model established between the individual sIRI values (dependent variable) and the mean of the sIRI values (independent variable) indicates the level of measurement consistency (Section 3.1.2). The results of the sIRI_individual – sIRI_mean regression are shown in Figures 6–8.

Figure 6. sIRI_individual vs sIRI_mean measurement plot and regression line (App1).

Figure 7. sIRI_individual vs sIRI_mean measurement plot and regression line (App2).

Figure 8. sIRI_individual vs sIRI_mean measurement plot and regression line (App3).

Each plot contains the measurements of sIRI_individual vs sIRI_mean under different survey speeds, distinguished by data points of different colours. The axes are bounded by a limit IRI value of 8 mm/m, since the highest 100 m segment IRI measured by the ground-truth instrument on the two testing routes is 7.3 mm/m. The closer the data points are to the dotted 45° line, the higher the $R^{2}$ value is, so the $R^{2}$ value provides a quantitative indication of repeatability performance. It should be noted that the slope and intercept of the linear regression line are equal to 1 and 0, respectively, because each individual value is regressed against the mean of its own segment. The Python library ‘SciPy’ was adopted in this research for regression analysis. The regression coefficients $k$ and $β$ are estimated through least-squares estimation, which minimises the sum of squared residuals in the sample (Hastie et al. 2021).
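The unit slope and zero intercept of this regression can be checked numerically. The sketch below is illustrative only: it assumes that sIRI_mean is the mean of the same runs that are plotted, uses hypothetical readings, and implements the least-squares formulas directly with the standard library instead of reproducing the authors' SciPy code. The helper `least_squares` and the variable names `k` and `b` (slope and intercept) are assumptions of the example.

```python
from statistics import mean


def least_squares(x, y):
    """Return slope, intercept and R^2 of the least-squares line y = k*x + b."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    k = sxy / sxx
    b = my - k * mx
    r2 = (sxy * sxy) / (sxx * syy)
    return k, b, r2


# Hypothetical sIRI values (mm/m): one row per segment, five runs each.
runs_by_segment = [
    [1.2, 1.4, 1.1, 1.3, 1.5],
    [2.8, 2.5, 3.1, 2.9, 2.7],
    [4.9, 5.4, 4.6, 5.1, 5.0],
]

# Pair every individual reading (y) with the mean of its own segment (x).
x = [mean(runs) for runs in runs_by_segment for _ in runs]
y = [value for runs in runs_by_segment for value in runs]

k, b, r2 = least_squares(x, y)
print(f"slope={k:.3f}, intercept={b:.3f}, R^2={r2:.3f}")
```

With the line fixed in this way, only the $R^{2}$ carries information: it is the share of the total variance that lies between segments, so more run-to-run scatter within a segment lowers it.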

Overall, App3 achieved a better repeatability performance, as evidenced by significantly higher $R^{2}$ values on both Miles Rd and Convent Rd. In addition, the distribution of the data points shows the results of runs under different survey speeds, and the impact of different practical settings can be observed from the plots of each mounting and vehicle type combination.

App 1. It appears that the variation in survey speed does not drastically affect the repeatability performance since the $R^{2}$ values are close among the three survey speeds. Furthermore, the plots do not identify the better performing vehicle or mounting type since their $R^{2}$ values are close.

App 2. The overall repeatability performance of App 2 is inferior to that of the other two Apps, as shown by the lower $R^{2}$ values. Moreover, the $R^{2}$ values drop significantly on Convent Rd. It should be noted that the measurements of ‘Dashboard 40 km/h’ are not visible in the plot because the sIRI measurements exceeded 8 mm/m, which suggests that the App severely overestimated the IRI (more details are included in Section 5.2).

App 3. The $R^{2}$ values are high and consistent on Miles Rd in all practical settings. However, on Convent Rd, the $R^{2}$ values dropped slightly, and the measurements of 80 km/h have the lowest $R^{2}$ values. That being said, App 3 shows the best sIRI_individual – sIRI_mean correlation performance.

Looking at the plots of all three Apps, a common trend is that the correlation results on Convent Rd are worse than those on Miles Rd, as evidenced by the lower $R^{2}$ values on Convent Rd. This suggests that the road surface type makes a difference in the repeatability performance.

5.1.3. Repeatability sensitivity

The repeatability sensitivity was assessed using a multiple linear regression model in which survey speed and rIRI were the independent variables and the $C o V$ was the dependent variable (Section 3.1.3). The results are presented in Figure 9, which shows the multiple linear regression planes for the different practical settings of the three Apps. The regression coefficients and $R^{2}$ are included in Tables 4 and 5. It should be noted that the range of the independent variables (rIRI and speed) is different between the two testing routes due to the different speed limits. The performance difference between the three Apps is analysed below.

Table 4. Regression coefficients and $R^{2}$ of MLR in repeatability sensitivity analysis (Miles Rd).


Table 5. Regression coefficients and $R^{2}$ of MLR in repeatability sensitivity analysis (Convent Rd).


App 1. On asphalt, a common trend that could be noticed is that the $C o V$ increases as the rIRI increases. A similar trend is observed on gravel pavement, except for the ‘Ute | Windshield’ setting, which shows a different trend from others.

App 2. On asphalt pavement, the $C o V$ does not vary drastically as the survey speed or the rIRI varies, except for the setting ‘Hatchback | Dashboard’, where the $C o V$ increases with the rIRI. Under the ‘Hatchback’ settings, a lower survey speed seems to worsen the repeatability performance. Meanwhile, the regression plane of the setting ‘Ute | Dashboard’ shows a different trend from the others.

App 3. First, the $C o V$ stays around 5%–10% and is not dependent on the change of rIRI. However, the $C o V$ of the ‘Ute’ settings is more sensitive to changes in survey speed, as evidenced by a greater slope. On gravel pavement, all four settings exhibit a common trend, in which the $C o V$ increases as rIRI or speed increases. However, the rIRI is the more dominant factor, with a greater regression slope value (Table 5).

Figure 9. Multiple Linear Regression (CoV against rIRI and speed).
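A regression plane of the kind used in the repeatability sensitivity analysis can be fitted as sketched below. This is a hedged illustration, not the study's implementation: the function `fit_plane`, the coefficient names, and the synthetic data (generated from a known plane so that the fit can be checked) are all assumptions. The normal equations are solved with the standard library only.

```python
def fit_plane(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via the normal equations."""
    rows = [[1.0, a, b] for a, b in zip(x1, x2)]
    # Augmented 3x4 system (X^T X | X^T y).
    m = [
        [sum(row[i] * row[j] for row in rows) for j in range(3)]
        + [sum(row[i] * yi for row, yi in zip(rows, y))]
        for i in range(3)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda idx: abs(m[idx][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for other in range(3):
            if other != col:
                factor = m[other][col] / m[col][col]
                m[other] = [a - factor * c for a, c in zip(m[other], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


# Synthetic per-segment data (hypothetical): rIRI in mm/m, speed in km/h.
riri = [1.0, 2.0, 3.0, 1.5, 2.5, 3.5]
speed = [60.0, 60.0, 80.0, 80.0, 100.0, 100.0]
# CoV (%) generated from a known plane so the recovered coefficients can be checked.
cov = [2.0 + 1.5 * r + 0.05 * s for r, s in zip(riri, speed)]

b0, b_riri, b_speed = fit_plane(riri, speed, cov)
print(f"intercept={b0:.3f}, rIRI slope={b_riri:.3f}, speed slope={b_speed:.3f}")
```

Since rIRI and speed are on different scales (mm/m and km/h), the two slopes are not directly comparable in magnitude unless the variables are standardised or the slopes are read against the range of each variable.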

5.2. Accuracy test

5.2.1. Average of measurement error

The $ϵ_{m e a n}$ is calculated as the percentage difference between the sIRI and rIRI measurements (Section 3.2.1), and it reflects the overall accuracy performance of the sRIE systems. The values are grouped into three categories, as shown in Tables 6 and 7. Overall, the sRIE systems’ performance on asphalt pavement is superior to that on gravel pavement.
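As a rough illustration of this measure, the sketch below averages the per-segment percentage difference between sIRI and rIRI. The exact definition is given in Section 3.2.1; here it is assumed that the difference is signed (positive for overestimation) and normalised by the rIRI. The function name `epsilon_mean` and the values are hypothetical.

```python
from statistics import mean


def epsilon_mean(s_iri, r_iri):
    """Mean signed percentage difference between sIRI and rIRI over all segments."""
    # Assumption: error is (sIRI - rIRI) / rIRI, so positive means overestimation.
    return mean([100.0 * (s - r) / r for s, r in zip(s_iri, r_iri)])


# Hypothetical per-segment values in mm/m.
s_iri = [2.2, 3.3, 0.9]
r_iri = [2.0, 3.0, 1.0]

print(f"epsilon_mean = {epsilon_mean(s_iri, r_iri):.2f}%")
```

With a signed definition, over- and underestimates on different segments partly cancel, so a small value does not by itself imply a close segment-by-segment match; an absolute-value variant would avoid this.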

Table 6. $ϵ_{m e a n}$ on Miles Rd.


Table 7. $ϵ_{m e a n}$ on Convent Rd.


5.2.2. Correlation with the $r I R I$

Linear regression models were established between the sIRI and rIRI measurements (Section 3.2.2) and are shown in Figures 10–12. Each figure contains the plots of sIRI vs rIRI under different practical settings. The survey speeds are distinguished by different colours in the legend. Within each plot, the dotted 45° line represents a perfect correlation between the two systems and serves as a reference for the data points and the fitted line. The closer the slope is to one and the intercept is to zero, the better the smartphone system is at producing measurements of similar magnitude to those of the reference instrument.

Figure 10. sIRI vs rIRI measurement plot and regression line (App1).

Figure 11. sIRI vs rIRI measurement plot and regression line (App2).

Figure 12. sIRI vs rIRI measurement plot and regression line (App3).

App 1. On Miles Rd, as the speed increases, the sIRI measurements tend to be greater, as evidenced by the data points of higher speed located in the top region of the plot. Moreover, the Ute seems more vulnerable to speed increments, since its regression slopes differ more drastically than those of the Hatchback. On Convent Rd, the data points and the $R^{2}$ values show that the linear relationship is weaker than on the other route. The lowest speed (40 km/h) also resulted in the lowest $R^{2}$ value.

App 2. On Miles Rd, as the survey speed increases, the sIRI measurements tend to be lower, as evidenced by the data points of higher speed located in the bottom region of the plot. This is opposite to App1’s results. On Convent Rd, similar to the results of App1, the measurements are scattered, with a lower correlation to the rIRI.

App 3. As shown in Figure 12, unlike the previous two Apps, App3 underestimates the IRI, since most of the data points are located under the 45° reference line. However, overall, the $R^{2}$ values are higher than those of the other two Apps, suggesting a better accuracy performance.
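The distinction between a strong correlation and close agreement with the reference can be illustrated with a short sketch. The data are hypothetical and deliberately idealised (every sIRI value is 80% of the rIRI), and the function `compare_to_reference` is an assumption of the example, not part of the study.

```python
from statistics import mean


def compare_to_reference(s_iri, r_iri):
    """Fit sIRI = k * rIRI + b and report how many points lie below the 45-degree line."""
    mx, my = mean(r_iri), mean(s_iri)
    sxx = sum((r - mx) ** 2 for r in r_iri)
    syy = sum((s - my) ** 2 for s in s_iri)
    sxy = sum((r - mx) * (s - my) for r, s in zip(r_iri, s_iri))
    slope = sxy / sxx
    intercept = my - slope * mx
    r2 = (sxy * sxy) / (sxx * syy)
    # Share of segments where the smartphone value is below the reference value.
    share_below = sum(s < r for s, r in zip(s_iri, r_iri)) / len(r_iri)
    return slope, intercept, r2, share_below


# Hypothetical, idealised data in mm/m: sIRI is always 80% of rIRI.
r_iri = [1.0, 2.0, 3.0, 4.0, 5.0]
s_iri = [0.8, 1.6, 2.4, 3.2, 4.0]

slope, intercept, r2, share_below = compare_to_reference(s_iri, r_iri)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r2:.2f}, below={share_below:.0%}")
```

In this idealised case the $R^{2}$ is at its maximum even though every point sits below the reference line, which is why the slope and intercept need to be read together with the $R^{2}$.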

5.2.3. Accuracy sensitivity

Linear regression models were established between the independent variable (survey speed) and the dependent variable ( $ϵ$ ) to understand the Apps’ sensitivity to speed under different mounting and vehicle settings (Section 3.2.3). It should be noted that the regression results are only applicable to the survey speeds used in this experiment, which are 40–80 km/h on gravel pavement and 60–100 km/h on asphalt pavement, both with a step interval of 20 km/h. The regression lines are plotted in Figure 13 and the regression statistics are presented in Tables 8 and 9. The R² values and p-values differ considerably amongst the three Apps. For instance, the low R² of App3 suggests that the survey speed explains little of the variance in the measurement error. Meanwhile, its data points are rather scattered and its p-values are high, suggesting no statistically significant relationship between speed and $ϵ$ . The more accurate the sRIE system, the closer the data points are to 0 on the y-axis, and if the system is speed-independent, the slope $k$ should not be significant.

Figure 13. Accuracy sensitivity test results.

Table 8. Regression coefficients, R² and p-value of $ϵ$ vs speed regression (Miles Rd).


Table 9. Regression coefficients, R² and p-value of $ϵ$ vs speed regression (Convent Rd).


App 1. Overall, as the regression line indicates, the $ϵ$ increases as the survey speed increases. Moreover, the performance of different practical settings varies. For instance, on asphalt pavement, it could be noticed that the regression slope of Ute is greater than that of Hatchback, suggesting that Ute’s relative measurement error is more vulnerable to speed change than the Hatchback. On gravel pavement, the least error is produced under the Hatchback-windshield setting. In contrast, the results from other settings tend to overestimate the IRI rather significantly.

App 2. Unlike App 1, App 2 shows an opposite trend with a decreasing $ϵ$ as the survey speed increases. Moreover, it could be noticed that both the slope and the intercept of the dashboard mount are greater than that of the windshield mount (on both asphalt and gravel pavement); hence it could be inferred that the relative measurement error is not only greater but also more sensitive to different surveying speeds under dashboard mount.

App 3. Unlike the other two Apps, it could be noticed that the magnitude of the slope is trivial. Meanwhile, the high p-values suggest no statistical significance between the $ϵ$ and the speed, and low $R^{2}$ values indicate no correlation between the two. The results have shown that the slope and the intercept of the four scenarios are close, indicating that the alteration between tested mounts and vehicle types does not impact this App’s performance on both asphalt and gravel pavement.

5.3. sRIE performance overview

This section gives an overview of the performance of the sRIE systems. The medians of $C o V_{m e a n}$ (3.1.1), $ϵ_{m e a n}$ (3.2.1), and the $R^{2}$ of the two correlation relationships (3.1.2 and 3.2.2) were calculated using the results from all speed, vehicle type and mounting configuration settings. The median gives the central value of the results for each of these four statistical measures and indicates the sRIE systems’ general performance. The results are presented in Tables 10 and 11.
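The median-based summary and ranking can be sketched as follows. The numbers are hypothetical placeholders, not the values reported in this study, and the helper `rank_by_median` is an assumption of the example.

```python
from statistics import median

# Hypothetical CoV_mean results (%) per App, one value per practical setting.
cov_results = {
    "App1": [8.1, 9.4, 12.0, 7.9],
    "App2": [15.2, 21.7, 11.3, 18.4],
    "App3": [7.1, 7.6, 8.3, 7.4],
}


def rank_by_median(results, lower_is_better=True):
    """Return the median per App and the Apps ordered from best to worst."""
    medians = {app: median(values) for app, values in results.items()}
    ordered = sorted(medians, key=medians.get, reverse=not lower_is_better)
    return medians, ordered


medians, ordered = rank_by_median(cov_results)
print(medians, ordered)
```

For measures where a higher value is better, such as the $R^{2}$ of a correlation, the same helper would be called with `lower_is_better=False`.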

Table 10. Median of testing results and performance ranking (sealed road).


Table 11. Median of testing results and performance ranking (unsealed road).


In terms of repeatability on the sealed road, App3 achieves the lowest median ‘ $C o V_{m e a n}$ ’ and the highest median ‘ $R^{2}$ of correlation with the mean’, suggesting that it is the most consistent App. In terms of accuracy, App2 achieves the lowest $ϵ_{m e a n}$ but also the lowest correlation with the rIRI. App3 achieves the highest $R^{2}$ , but with a high $ϵ_{m e a n}$ . The accuracy sensitivity analysis in Figure 13 shows that App3 is less vulnerable to practical setting variations, and its ‘ $R^{2}$ of correlation with the rIRI’ is consistently high under different practical settings. However, Figure 12 suggests that App3 tends to underestimate the IRI most of the time, which explains the higher $ϵ_{m e a n}$ . Nonetheless, given the outstanding repeatability performance as well as the higher ‘ $R^{2}$ of correlation with the rIRI’, App3 is seen as the most robust sRIE system on the sealed road. On the unsealed road, App3 achieves a better performance on all statistical measures except for $C o V_{m e a n}$ , and is therefore selected as the recommended sRIE system.

6. Discussion

With the validation of the proposed evaluation framework demonstrated in Section 5, this section discusses the findings and insights obtained from the validation experiments. The discussion contains three sections. First, the performance of the tested sRIE systems is summarised. Then, the benefits and limitations of the evaluation framework are discussed. Lastly, areas of future research that promote the application and incorporation of sRIE systems are elaborated.

6.1. Performance of tested sRIE systems

In practical terms, the results presented suggest that the sRIE systems are significantly less accurate than the profilers. Furthermore, the robustness and accuracy of sRIE systems are rather vulnerable to practical setting alteration, as demonstrated by the above analyses. Therefore, the following paragraphs aim to discuss the factors that may cause measuring errors and highlight the impact of the practical settings by summarising the key findings from the above analyses.

Positioning accuracy. The positioning methods used in data processing may contribute to the error in IRI measurement. The sRIE systems output the starting and ending positions of each segment. The position information could be obtained from the GPS signal or calculated by integrating the driving speed. The GPS output from smartphones usually has an error of up to 4.9 m (Moorefield 2020). When the latter approach is used, the accumulated error could be significant, and the resulting misalignment grows with the length of the survey route. As a result, the positioning error may lead to incorrect segmentation of the acceleration signal, which in turn affects the IRI of adjacent segments.
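The way an error in integrated speed accumulates into a segmentation offset can be illustrated with a minimal sketch. The 100 m segment length follows the segment length used in this study; the 1 Hz sampling, the constant 20 m/s speed, the 2% speed bias and the function `segment_indices` are assumptions chosen for illustration, not properties of any tested App.

```python
def segment_indices(speeds_mps, dt_s, segment_length_m=100.0):
    """Assign each sample to a segment by integrating speed over time."""
    distance = 0.0
    indices = [0]
    for prev, curr in zip(speeds_mps, speeds_mps[1:]):
        # Trapezoidal rule for the distance covered between two samples.
        distance += 0.5 * (prev + curr) * dt_s
        indices.append(int(distance // segment_length_m))
    return indices, distance


# Hypothetical run: 255 s at a true 20 m/s, sampled at 1 Hz (256 samples).
true_speed = [20.0] * 256
# Same run with a hypothetical 2% overestimate of speed.
biased_speed = [20.4] * 256

true_idx, true_dist = segment_indices(true_speed, dt_s=1.0)
biased_idx, biased_dist = segment_indices(biased_speed, dt_s=1.0)

print(f"drift after {true_dist:.0f} m: {biased_dist - true_dist:.0f} m")
print(f"final segment index: true={true_idx[-1]}, biased={biased_idx[-1]}")
```

In this example the drift exceeds one full segment length after about 5 km, so by the end of the route each reported segment would draw its acceleration data from the neighbouring stretch of road.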

Pavement roughness. The sensitivity test results suggested that the internal repeatability was related to the pavement roughness. For App2 and App3, the rougher the pavement surface was (a high IRI value), the greater the $C o V_{m e a n}$ value was.

In terms of accuracy test results, the sRIE systems achieved an $R^{2}$ value as high as 0.73 on Miles Rd, indicating a good correlation with the rIRI. However, they tended to overestimate the roughness index on gravel pavement by 60%–100%. On gravel pavement, all tested Apps correlated with the reference instrument with an R-squared value of less than 0.35. It can be concluded that the performance of smartphone systems on gravel pavement is not as robust as that on asphalt pavement.

Speed. The impact of survey speed on measurement repeatability varied among Apps. While a higher survey speed resulted in a higher $C o V_{m e a n}$ value for App1 and App2, as demonstrated in Section 5.1, App3 managed to produce consistent measurements regardless of the survey speed.

In the accuracy test, a higher survey speed generally resulted in higher measurements for App1 but lower measurements for App2. However, App3’s measurements were independent of the survey speed.

Vehicle type. It was found that for App1 and App2 the $C o V_{m e a n}$ of surveys conducted using the Ute is generally higher than that of surveys conducted using the Hatchback. This could result from the rigid suspension of the Ute, which reacts more strongly to pavement roughness than that of the Hatchback. Nonetheless, App3 managed to provide consistent repeatability results in both vehicles.

The impact of vehicle type on the measurement results varies among Apps. The result showed that for App 1, the measurements undertaken on the Ute are more vulnerable to speed change than on the Hatchback, and thus, a larger measurement error could be expected on the Ute. However, in App2 and App3, vehicle type alteration does not drastically affect the measurements.

We observed that the Apps take different approaches to accounting for the vehicle type. Information about the testing vehicle was not required in App2. In contrast, preset options for common domestic vehicle body types were provided in App1. In App3, the dimensions of the testing vehicle must be measured and entered.

Mounting. It was noticed that the $C o V_{m e a n}$ of the dashboard mounting is significantly greater than that of the windshield mounting for App2. However, the results of App3 were rather consistent in both mounting locations.

For App1, the results were more sensitive to speed variation under windshield mounting. Moreover, the windshield mounting resulted in a more underestimated IRI than the dashboard. In contrast, the measurement error was not only greater but also more sensitive to surveying speed variation on the dashboard for App2. This difference may be attributed to the Apps’ different filtering techniques to remove the vibration noises. Nonetheless, App3 can provide constant measurements independent of the mounting location alteration.

Apart from the mounting location, one of the sRIE systems requires attaching the smartphone directly to the vehicle body using an armless mount, since the dynamic characteristics of the mount may influence the measurements. Hence, it is acknowledged that using a short-arm mount may have adversely affected the performance of this particular sRIE system.

In summary, the results of the analyses conducted with the three Apps suggested that:

The smartphone systems do not provide measurements on gravel pavement as consistent and accurate as on asphalt pavement. Under excessive vibration incurred by rough pavement, stiff suspension, or both, the Apps’ consistency and accuracy performance decline.
The repeatability performance of two of the three tested Apps depends on survey speed, vehicle type, and mounting variations, and they tend to overestimate the roughness index in general.
One system provides consistent measurements regardless of vehicle and mounting location variations.

6.2. The benefits and limitations of the evaluation framework

Benefits. The proposed framework has provided a comprehensive approach to evaluating sRIE systems and contributed to the body of knowledge in pavement roughness assessment. The benefits of the framework are three-fold. First, it regulates the statistics and procedures for validating sRIE systems. In practice, it realises the cross-comparison of different sRIE systems, which allows authorities to test and select the appropriate sRIE systems for roughness assessment. Moreover, the framework could be applied to evaluate the App’s performance at the development stage and validate the performance of an sRIE system before its adoption in the industry.

Secondly, it advances the development of a robust sRIE system. The framework pivots on validating sRIE systems under different practical settings. In particular, the consideration of practical factors (e.g. mounting, vehicle type) in the framework was intended to improve the evaluation validity by considering factors that may compromise the measurement consistency and accuracy. Our experimental results have shown that two of the three Apps are still rather sensitive to practical setting alterations; hence, it is suggested the Apps should improve their consistency level before being made commercial.

Thirdly, it promotes the integration of sRIE systems into the Pavement Management System (PMS), which is a systematic integration of road data collection, storage, analysis and modelling methods used in pavement performance evaluation, data management, and resource optimisation (Austroads 2019). A key component of a PMS is the evaluation of pavement performance (Austroads 2019), for which the roughness index is an important indicator. Conventional roughness data collection instruments include the walking profiler, the laser profiler, and the integrated network survey vehicle (Austroads 2019), and there are evaluation frameworks to validate these measuring methods (Austroads 2016a). These validation frameworks also contain benchmark criteria that the systems must meet to be adopted for industry use. Once such validation procedures are established for sRIE systems, more sRIE systems will be validated using standardised statistical measures. As a result, road agencies are more likely to integrate the validated systems into their PMS.

Lastly, the proposed evaluation framework exhibits great transferability. In essence, this framework assesses a system’s ability to produce consistent and accurate index measurements under alternating environmental settings. While this study applied it to evaluate sRIE systems, the framework could also be applied to other estimation systems that estimate quantitative roughness index other than the IRI.

Limitations. The findings in this study were drawn from the testing of three sRIE Apps selected from those available on the market. The intention was not to characterise these systems as better or worse than their peers, but rather to evaluate them using the proposed framework and, more generally, to critically identify the limitations of that framework. Several limitations have been identified.

First, it requires benchmark measurements. The validation of an sRIE system relies on knowing the reference IRI of the tested road segments. While the repeatability tests 3.1.1 and 3.1.2 can be completed independently, test 3.1.3 and the accuracy tests must involve a reference measuring instrument.

Secondly, sufficient data must be obtained for this framework to function. The framework suggests testing the system on various roads that contain a wide spectrum of pavement conditions. Five repetitive runs ensure that sufficient data is collected for performance evaluation.

Thirdly, the framework is only applicable to constant driving speeds. Knowing the speed at which the system produces more consistent and accurate measurements is critical; therefore, driving speed is treated as a categorical variable in tests 3.1.2, 3.1.3, 3.2.1, and 3.2.2. The framework is incompatible with measurements taken at time-variant speeds; however, a step interval smaller than the 20 km/h used in this research could be adopted to understand how an sRIE system performs at continuously varying speeds.

Lastly, the evaluation framework does not contain acceptance criteria. Unlike the profiler evaluation frameworks (Austroads 2016a, 2016b, 2016c) that outline the statistical measures as well as specify the corresponding threshold value that a tested instrument must meet to be successfully validated, the proposed framework does not inform whether the tested object meets the requirements of being a qualified sRIE system.

6.3. Future research

The primary contribution of this research is to the body of knowledge on evaluation methodology for smartphone-based roughness assessment systems. In addition, our research identifies opportunities in the following four areas to improve the practical adoption of sRIE systems.

System integration. Road authorities are encouraged to include the proposed framework as a part of the technical guidance for the validation of roughness assessment sRIE systems. However, more effort on the administrative and legislative levels is needed.

Acceptance criteria. While the evaluation statistics have been proposed, the exact threshold that determines whether an sRIE system meets the industry use requirements is yet to be agreed on. Guidelines for validating profilometric-based methods specify the threshold (acceptance criteria) that each evaluating statistic should meet for the tested instrument to be certified (Austroads 2016a, 2016b, 2016c). While this study took the initiative to review three existing sRIE systems, the obtained results are too limited to represent the performance of the state-of-the-art sRIE field. However, it is encouraged that the proposed framework be adopted to validate more sRIE systems in the future. After the accuracy and consistency levels of a sufficient number of sRIE systems have been verified, the acceptance criteria can be established.

Accuracy expectation. Smartphone systems are inherently less accurate and consistent than conventional instruments. If the sRIE system is used as an independent assessment tool that locates segments needing maintenance, the acceptance criteria should be close to those of response-based methods, such as the roughometer. However, given their accessibility and pervasiveness, smartphone systems should supplement conventional approaches by providing instantaneous monitoring and large-scale surveying of the road network. In this case, the acceptance criteria of sRIE systems should not be the same as those of conventional instruments, given the trade-off between accuracy and accessibility.

Crowdsourcing. The smartphone enables the population to participate in network-level pavement condition evaluation, known as crowdsourcing surveying (Yu et al. 2022). While crowdsourcing-based road surveying is yet to be adopted in the industry, the crowdsourcing feature has the potential to facilitate the adoption of sRIE systems, as it distributes pavement surveying tasks across the population. Meanwhile, an integrative framework that evaluates the sRIE system’s performance in a crowdsourcing context remains to be explored.

7. Conclusion

This research aimed to develop and validate a framework for evaluating sRIE systems in pavement roughness assessment. It is the first attempt to integrate practical factors that affect sRIE systems and synthesise them in the evaluation methodology. The study first identified dominating practical factors that affect the accuracy of the sRIE system from existing literature. Based on the factors, the evaluation framework was established, and the computation of the testing statistical measures was explained. Then, the paper demonstrated the field experiment of evaluating three existing sRIE systems. The results have proved the validity of the proposed framework and informed the performance of the tested systems.

The proposed framework could be applied to validate the sRIE systems’ repeatability and accuracy against conventional measurement instruments. The statistical measures have shown that the sRIE systems’ performance becomes less robust when tested on gravel pavement. Two of the three sRIE systems depend on survey speed, vehicle type, and mounting variations. However, one system can provide consistent measurements regardless of practical factor variations.

On top of the practical benefits of regulating the statistical measures and procedures of validating sRIE systems, the framework also advances the development of a robust sRIE system and justifies the application of sRIE systems in roughness assessment. This work contributes to the body of knowledge by (1) introducing robust performance indicators for sRIE systems considering the systems’ consistency and accuracy levels; (2) establishing a universal protocol that enables the performance comparison between the sRIE and conventional systems; (3) promoting the incorporation of the sRIE system into technical guidelines for pavement roughness assessment; and (4) providing a pragmatic guide demonstrating the adoption of the framework in practice.

Acknowledgement

This research work is part of a research project (Project No. 2.7) sponsored by the SPARC Hub (https://sparchub.org.au) at the Department of Civil Engineering, Monash University funded by the Australian Research Council (ARC) Industrial Transformation Research Hub (ITRH) Scheme (Grant number: IH180100010). The financial and in-kind support from ARRB and Monash University is gratefully acknowledged. Also, the financial support from ARC is highly acknowledged.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was supported by Australian Research Council (ARC) Industrial Transformation Research Hub (ITRH) Scheme [grant number IH180100010].

References

Ahmed, H.U., et al., 2021. Effects of smartphone sensor variability in road roughness evaluation. International Journal of Pavement Engineering, 0 (0), 1–6. doi:10.1080/10298436.2021.1946059.
Austroads, 2016a. AG:AM/T003 validation of an inertial profilometer for measuring pavement roughness (loop device method). Available from: https://austroads.com.au/publications/asset-management/agam-t003-16 [Accessed 11 October 2022].
Austroads, 2016b. AG:AM/T004 pavement roughness repeatability and bias checks for an inertial profilometer, (May), 1–9. Available from: https://austroads.com.au/publications/asset-management/agam-t004-16 [Accessed 11 October 2022].
Austroads, 2016c. AG:AM/T001 pavement roughness measurement with an inertial profilometer. Available from: https://austroads.com.au/publications/asset-management/agam-t001-16 [Accessed 11 October 2022].
Austroads, 2019. Guide to pavement technology Part 4K: selection and design of sprayed seals. Available from: https://austroads.com.au/__data/assets/pdf_file/0024/107448/AGPT04K-18_Guide_to_Pavement_Technology_Part-4K_Selection-_Design_Sprayed_Seals.pdf [Accessed 11 October 2022].
Botshekan, M., et al., 2020. Roughness-induced vehicle energy dissipation from crowdsourced smartphone measurements through random vibration theory. Data-Centric Engineering, 1 (2), 1–25. doi:10.1017/dce.2020.17.
Botshekan, M., et al., 2021. Smartphone-enabled road condition monitoring: from accelerations to road roughness and excess energy dissipation. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 477 (2246), 20200701. doi:10.1098/rspa.2020.0701.
Bridgelall, R., Hough, J., and Tolliver, D., 2019. Characterising pavement roughness at non-uniform speeds using connected vehicles. International Journal of Pavement Engineering, 20 (8), 958–964. doi:10.1080/10298436.2017.1366768.
Cameron, C.A., 2014. Innovative means of collecting international roughness index using smartphone technology. University of New Brunswick. Available from: https://unbscholar.lib.unb.ca/islandora/object/unbscholar%3A9404 [Accessed 23 January 2023].
Dollár, P., et al., 2012. Pedestrian detection: an evaluation of the state of the art. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34 (4), 743–761. doi:10.1109/TPAMI.2011.155.
Douangphachanh, V. and Oneyama, H., 2014a. Using smartphones to estimate road pavement condition. International Symposium for Next Generation Infrastructure. doi:10.14453/isngi2013.proc.16.
Douangphachanh, V. and Oneyama, H., 2014b. A study on the use of smartphones under realistic settings to estimate road roughness condition. EURASIP Journal on Wireless Communications and Networking, 2014 (1), 1551–1564. doi:10.1186/1687-1499-2014-114.
Forslöf, L. and Jones, H., 2015. Roadroid: continuous road condition monitoring with smart phones. Journal of Civil Engineering and Architecture, 9 (4), 485–496. doi:10.17265/1934-7359/2015.04.012.
Galagoda, D.Y. and Lanka, S., 2019. Smartphone applications for pavement roughness computation of Sri Lankan roadways. Journal of Eastern Asia Society for Transportation Studies, 13, 2581–2597. doi:10.11175/easts.13.2581.
Gamage, D., Pasindu, H.R., and Bandara, S., 2016. Pavement roughness evaluation method for low volume roads. 8th International Conference on Maintenance and Rehabilitation of Pavements, 976–985. doi:10.3850/978-981-11-0449-7-199-cd.
Hanson, T., Cameron, C., and Hildebrand, E., 2014. Evaluation of low-cost consumer-level mobile phone technology for measuring international roughness index (IRI) values. Canadian Journal of Civil Engineering, 41 (9), 819–827. doi:10.1139/cjce-2014-0183.
Hastie, T., et al., 2021. An introduction to statistical learning. 2nd ed. Springer Texts. Available from: https://www.statlearning.com/ [Accessed 11 October 2022].
Hossain, M., et al., 2019. Evaluation of android-based cell phone applications to measure international roughness index of rural roads. International Conference on Transportation and Development, 309–318. doi:10.1061/9780784482575.034.
Islam, S., 2015. Development of a smartphone application to measure pavement. University of Illinois at Urbana-Champaign. Available from: https://www.ideals.illinois.edu/handle/2142/89119 [Accessed 11 October 2022].
Islam, S., et al., 2014. Use of cellphone application to measure pavement roughness. T&DI Congress 2014: Planes, Trains, and Automobiles, 553–563. doi:10.1061/9780784413586.053.
Janani, L., Sunitha, V., and Mathew, S., 2020. Influence of surface distresses on smartphone-based pavement roughness evaluation. International Journal of Pavement Engineering, 0 (0), 1–14. doi:10.1080/10298436.2020.1714045.
Jeong, J.H., Jo, H., and Ditzler, G., 2020. Convolutional neural networks for pavement roughness assessment using calibration-free vehicle dynamics. Computer-Aided Civil and Infrastructure Engineering, 35 (11), 1209–1229. doi:10.1111/mice.12546.
Laubis, K., Simko, V., and Schuller, A., 2016. Road condition measurement and assessment: a crowd based sensing approach. International Conference on Information Systems, ICIS 2016, 1–10. Available from: https://www.researchgate.net/publication/319261042_Road_Condition_Measurement_and_Assessment_A_Crowd_Based_Sensing_Approach [Accessed 11 October 2022].
Louhghalam, A., Akbarian, M., and Ulm, F.J., 2017. Carbon management of infrastructure performance: integrated big data analytics and pavement-vehicle-interactions. Journal of Cleaner Production, 142, 956–964. doi:10.1016/j.jclepro.2016.06.198.
Louhghalam, A., et al., 2019. Closed-form solution of road roughness-induced vehicle energy dissipation. Journal of Applied Mechanics, 86 (1), 1–8. doi:10.1115/1.4041500.
Mann, A.V., McManus, K.J., and Holden, J.C., 1997. Power spectral density analysis of road profiles for road defect assessment. Road and Transport Research, 6 (3), 36–46. Available from: https://trid.trb.org/view/476053.
Moorefield, F., 2020. Global positioning system standard positioning service performance standard. GPS Navstar. Available from: https://www.gps.gov/systems/gps/performance/accuracy/ [Accessed 11 October 2022].
NSW Government, 2017. Windscreen mounted phones and GPS Fact Sheet. Available from: https://roadsafety.transport.nsw.gov.au/downloads/windscreen-mounted-phones-gps.pdf [Accessed 11 October 2022].
Ordaz, M. and Doyle, J., 2021. Quantifying extreme event-induced pavement roughness via smartphone apps. Geo-Extreme GSP, 330, 222–231. doi:10.1061/9780784483701.022.
Płaczek, M. and Piszczek, Ł., 2018. Testing of an industrial robot’s accuracy and repeatability in off and online environment. Eksploatacja i Niezawodnosc, 20 (3), 455–464. doi:10.17531/ein.2018.3.15.
Sandamal, R.M.K. and Pasindu, H.R., 2020. Applicability of smartphone-based roughness data for rural road pavement condition evaluation. International Journal of Pavement Engineering, 0 (0), 1–10. doi:10.1080/10298436.2020.1765243.
Sayers, M.W., 1995. On the calculation of international roughness index from longitudinal road profile. Transportation Research Record. Available from: http://onlinepubs.trb.org/Onlinepubs/trr/1995/1501/1501-001.pdf [Accessed 23 January 2023].
Sayers, M.W., Gillespie, T.D., and Paterson, W.D.O., 1986. Guidelines for conducting and calibrating road roughness measurements. World Bank Technical Paper. ISBN: 0-8213-0590-5.
Schall, M.C., et al., 2016. Accuracy and repeatability of an inertial measurement unit system for field-based occupational studies. Ergonomics, 59 (4), 591–602. doi:10.1080/00140139.2015.1079335.
Schlotjes, M.R., Visser, A., and Bennett, C., 2014. Evaluation of a smartphone roughness meter. 33rd Southern African Transport Conference, 141–153. Available from: http://hdl.handle.net/2263/45571 [Accessed 11 October 2022].
Shah, A., Zhong, J., and Ly, J., 2017. Adopting smartphone technology to supplement road asset performance monitoring. In: AAPA international flexible pavement conference. Available from: https://trid.trb.org/view/1485235 [Accessed 11 October 2022].
Steyn, W.J., et al., 2014. Freight-truck-pavement interaction, logistics, and economics: final executive summary report compilation of executive summaries, (February). Available from: https://escholarship.org/uc/item/70(4t0fp [Accessed 23 January 2023].
Thiandee, P., et al., 2019. An experiment on measurement of pavement roughness via android-based smartphones. International Transaction Journal of Engineering, Management, Applied Science and Technologies, 10 (9), 1–9.
Wambold, J.C., et al., 1981. State of the art of measurement and analysis of road roughness. Transportation Research Record, 21–29. Available from: https://onlinepubs.trb.org/Onlinepubs/trr/1981/836/836-004.pdf [Accessed 11 October 2022].
Wang, G. and Ghataora, G., 2020. Study of the factors affecting road roughness measurement using smartphones. Journal of Infrastructure Systems, 26 (3), 04020020. doi:10.1061/(asce)is.1943-555x.0000558.
Wessels, I. and Steyn, W.J.M., 2020. Continuous, response-based road roughness measurements utilising data harvested from telematics device sensors. International Journal of Pavement Engineering, 21 (4), 437–446. doi:10.1080/10298436.2018.1483505.
Wix, R., 2016. Measuring road roughness with a smartphone – horses for courses? 27th ARRB Conference-Linking People, Places and Opportunities, 1, 13. Available from: https://trid.trb.org/view/1446692 [Accessed 11 October 2022].
Xue, K., Nagayama, T., and Zhao, B., 2020. Road profile estimation and half-car model identification through the automated processing of smartphone data. Mechanical Systems and Signal Processing, 142. doi:10.1016/j.ymssp.2020.106722.
Yang, X., et al., 2020. Calibration of smartphone sensors to evaluate the ride quality of paved and unpaved roads. International Journal of Pavement Engineering, 0 (0), 1–11. doi:10.1080/10298436.2020.1809659.
Yu, Q., Fang, Y., and Wix, R., 2022. Pavement roughness index estimation and anomaly detection using smartphones. Automation in Construction, 141 (March), 104409. doi:10.1016/j.autcon.2022.104409.
Zaabar, I. and Chatti, K., 2014. Estimating vehicle operating costs caused by pavement surface conditions. Transportation Research Record: Journal of the Transportation Research Board, 2455, 63–76. doi:10.3141/2455-08.
Zhang, Z., et al., 2021. Pavement roughness evaluation method based on the theoretical relationship between acceleration measured by smartphone and IRI. International Journal of Pavement Engineering, 0 (0), 1–17. doi:10.1080/10298436.2021.1881783.
Zhao, B. and Nagayama, T., 2017. IRI estimation by the frequency domain analysis of vehicle dynamic responses. Procedia Engineering, 188, 9–16. doi:10.1016/j.proeng.2017.04.45.
Zhao, B., Nagayama, T., and Xue, K., 2019. Road profile estimation, and its numerical and experimental validation, by smartphone measurement of the dynamic responses of an ordinary vehicle. Journal of Sound and Vibration, 457, 92–117. doi:10.1016/j.jsv.2019.05.015.


