Abstract
This paper aims, on the one hand, to identify significant crash risk factors at unsignalized three-leg intersections connecting rural two-lane two-way roads and minor roads with a STOP control on the approaches (3ST) and, on the other, to adjust the Highway Safety Manual (HSM) procedure by fine-tuning its Safety Performance Function (SPF) on the basis of the local context. Over an 8-year study period, a total of 240 crashes were observed at 35 3ST intersections, with no geometric-infrastructure adjustments or changes in the Annual Average Daily Traffic (AADT) and surrounding context noted at the intersections investigated. To obtain reliable results, the study period was divided into two groups: (a) 5 years used to calibrate a new SPF, and (b) the remaining 3 years, not included in the first dataset, used to validate the results. A negative binomial regression model was adopted to calibrate the new SPF. It was found that the AADT on the major and minor roads, the intersection skew angle, the co-occurrence of left- and right-turn lanes on the major roads, and lighting significantly affect the crash scenario.
1. Introduction
Unsignalized intersections represent potential hazards not present at signalized intersections because of the priority of movement on the main road.
Ambros et al. [1] showed, for example, how crash prediction models (CPMs) can serve practitioners as valuable tools for linking crashes with risk factors in order to identify and suggest improvements through on-site adjustments.
The HSM (American Association of State Highway and Transportation Officials [2]) contains several SPFs that can be used to predict yearly crash frequencies both for homogeneous road segments and intersection areas. However, many scholars have suggested revising the CPMs on a country-by-country basis and advise against directly transferring the CPMs/SPFs available in the HSM to sites located far from the US [3].
La Torre et al. [4] also confirmed that the models are generally transferable to the Italian road network, especially in relation to fatal and injury crashes, but they noted that improvements could be made in light of the variable calibration factors within the datasets or local crash modification factors (CMFs).
The research presented here aims to revise the HSM SPF for 3ST intersections so that it reflects the local conditions of 35 intersections, in order to define carefully the unsafe road conditions that need improvement where drivers maneuver through several conflict points. As detailed in later sections, this goal was achieved in various stages (Figure 1).

2. Literature Review
Greater use must be made of state-of-the-art analytical techniques to ensure continuing improvement in the field of road safety.
National Cooperative Highway Research Program (NCHRP) Report 500 [5] recommended safety improvements at unsignalized intersections based on considerations regarding geometric design modifications, changes to traffic control devices, targeted enforcement efforts, and public awareness.
In particular, a number of authors have investigated how traffic safety can affect efficiency at intersections in terms of driver performance, level of service, and crash types, suggesting potential courses of action to reduce risk factors and monitor expected yearly crash frequency using prediction models.
Kaisi and Alam [6], for example, observed how drivers behave at priority unsignalized intersections, only partially respecting control measures due to the lack of any controls in place; fewer delays are returned for certain levels of traffic volumes. Unfortunately, self-organization at this type of intersection is one of the main causes of very aggressive and potentially hazardous driver behaviour. In particular, [7] investigated the correlation between geometric/environmental factors and driver performance in terms of crash data. Multilevel modeling techniques were used to predict the phenomenon within a hierarchical structure: driver characteristics are nested within crashes, crash characteristics are nested within site characteristics, site characteristics are nested within regional characteristics, and so forth.
Pan et al. [8] developed Safety Level of Service (SLOS) models at unsignalized intersections. The current LOS service measure (delay) suggested in the Highway Capacity Manual (HCM) and SLOS evaluation index (risk index) presented by authors were combined to form an improved service measure (delay and risk index).
Some authors have investigated how crash types affect the severity of outcomes. Ye et al., for example, [9] developed a simultaneous equation crash frequency model by collision type to apply to rural intersections. At the same time, [10] estimated rear-end crash probabilities at signalized intersections, observing that three types of driving tendency can be observed using K-means clustering analysis. [11] compared factors that might affect crash severity in hit-and-run (HR) and nonhit-and-run (NHR) crashes, observing how HR crashes were 34% more likely to result in injuries than NHR crashes on average.
In order to suggest improvements to intersections, [12] examined the effects of wet pavement surface conditions on the likelihood of nonsevere crashes occurring. The results showed that the main factors influencing nonsevere crashes on wet pavement surfaces are segment length, traffic volume, and posted speed limits.
Bhagavathula et al. [13], for example, investigated the effect of lighting level on the night-to-day (ND) crash ratios at 99 rural intersections in Virginia; in particular, a 1 lux (lx) increase at all rural intersections corresponded to a 7% decrease in the ND crash ratio.
Ghanbarikarekani et al. [14] proposed a model to improve presignals by reducing the number of vehicles stopping behind them. By applying the method, vehicles would be able to adjust their speed based on traffic conditions.
Barua et al. [15] observed that the fatality risk tends to increase at intersections on undivided rural highways in Alberta, Canada, when crashes occur at offset intersections or at cross or T-intersections associated with horizontal curves. The likelihood of fatality tends to increase if the intersection is on a sag curve or at a constant grade.
Haleem and Abdel-Aty [16] investigated multiple approaches to the analysis of crash injury severity at three- and four-legged unsignalized intersections in the state of Florida from 2003 until 2006. It was found that the traffic volume on the major approach, the number of lanes on the minor approaches, the upstream and downstream distance to the nearest signalized intersection, left and right shoulder width, the number of left turn movements on the minor approach, and the number of right and left turn lanes on the major approach, affect crash severity.
Monga and Bishnoi [17] submitted design criteria to improve various basic parameters useful for designing intersection areas by working on by-passes and some in the city of Sirsa to minimize conflicts/crashes at intersections.
Zhou et al. [18] investigated the safety effects of exclusive left-turn lane installations at unsignalized intersections, observing some explanatory variables that affected severity, such as area type, number of lanes, number of approach legs, and crash type. Negative binomial modeling and generalized estimating equations were used to predict the expected number of crashes per year. It was observed that the left-turn lane created a safer condition.
Gomes et al. [19] performed crash prediction models for intersections in Lisbon (Portugal), by investigating 44 three-legged intersections and 50 four-legged intersections. Focusing on only three-legged intersections, four variables were found to have a positive effect on safety: (a) the ratio between the traffic flow entering minor roads and the total traffic flow entering the intersection, (b) the lane balance for all the approaches, able to create linear movement and lead to fewer conflicts, and (c) the presence of a median on either one or two legs of a major road was associated with fewer crashes.
Anowar et al. [20] applied a partially constrained generalized ordered logit model to a sample of crash data from 1998 to 2006 to determine the factors contributing to the severity of intersection crashes in Bangladesh. Crash severity was found to increase in cases where the intersections are located in rural areas and the crash occurs on dry pavement or during adverse weather conditions.
Some authors [21] began to estimate the safety effects of fixed lighting at a variety of intersection types.
Elvik [22] analyzed the international transferability of several safety performance functions for horizontal curves developed in Australia, Canada, Denmark, Germany, Great Britain, New Zealand, Norway, Portugal, Sweden, and the United States. The results confirmed that SPFs produced in different countries had important similarities, but they can diverge in terms of crash rate on the sharpest curves.
3. Data Collection and Preliminary Correlations
The study focuses on 35 at-grade three-leg intersections connecting rural two-lane two-way roads with minor roads that have a STOP control on the approaches; all are situated on a flat area. A total number of 240 crashes were observed over an 8-year period, during which 150 injury crashes and 90 property damage only crashes occurred. During this time interval, no geometric-infrastructure adjustments or changes to the Annual Average Daily Traffic or the surrounding context were observed at the 3ST intersections in question.
Table 1 shows the main geometric-infrastructure features and crash features at the 3ST intersections in question.
According to the methodological process summarized in Figure 1, a total of 151 crashes were observed in a 5-year set, selected by simple random sampling without replacement and adopted to calibrate the new SPF, following the HSM procedure in Section 3 in order to reflect local data. During this time interval, a total of 328 injury crashes and 201 property damage only crashes were counted.
To validate the new SPF, a second time interval was selected comprising the remaining 3 of the total 8 years of the study, which were not included in the calibration phase. During this second time interval, a total of 89 crashes occurred, including 52 injury crashes and 39 property damage only crashes.
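The calibration/validation split described above can be sketched as follows. This is a minimal illustration, assuming the 8 study years are simply labeled 1 to 8; the function name `split_years` and the seed are hypothetical and do not reproduce the draw actually made in the study.

```python
import random

def split_years(years, n_calibration, seed=None):
    """Draw n_calibration years by simple random sampling without replacement.

    The years not drawn form the validation set, so the two sets never overlap.
    """
    rng = random.Random(seed)
    calibration = sorted(rng.sample(years, n_calibration))
    validation = sorted(y for y in years if y not in calibration)
    return calibration, validation

# Hypothetical labels for the 8 study years; 5 go to calibration, 3 to validation.
calibration_years, validation_years = split_years(list(range(1, 9)), 5, seed=0)
print(calibration_years, validation_years)
```

Sampling without replacement guarantees that no year is used for both calibrating and validating the SPF.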
As shown in Figure 1, a preliminary investigation of the correlations between infrastructure independent and dependent variables (yearly crash frequency) was carried out adopting statistical processing such as the classification tree approach [23], implementing the exhaustive CHAID (Chi-Squared Automatic Interaction Detector) algorithm.
The outcome of the CHAID method [24] was a tree diagram of 11 nodes with 7 terminal nodes and 4 nonterminal nodes. It was observed how the safety effect can differ significantly depending on whether intersections belong to circular curves (higher average yearly crash frequency) or tangent segments (lower average yearly crash frequency).
In addition, focusing on skew angle variation, it was observed that when the skew angle values are lower than 10°, the total effects produced by the simultaneous presence of remaining infrastructural features are really positive in terms of the reduction of yearly crash frequency at 3ST intersections, but when this skew angle variation is not associated with any additional infrastructure element, the safety level decreases. From this initial analysis, it appears that when the angle of deviation is less than 10° and there is a left-turn lane with a right-turn lane on the main road, and the intersection is illuminated, the average annual crash frequency is lower than in any other case arising from an analysis.
4. A Synthesis of HSM Procedures for Rural 3ST Intersections
The HSM provides a predictive method, in the form of an SPF, to estimate the yearly crash frequency of a site in specific geometric and geographic conditions over a given period for a specific volume of annual average daily traffic (AADT). The HSM suggests adjusting the SPF for sites not reflecting the base conditions by means of crash modification factors (CMFi).
The base conditions of the 3ST intersections set out in the HSM to calibrate NSPF,3ST are as follows: (a) a skew angle of 0°, (b) no left-turn lanes on approaches without stop control, (c) no right-turn lanes on approaches without stop control, and (d) no lighting. The intersection skew angle is defined as the absolute value of deviation from a 90° intersection angle. The HSM “base” prediction function was calibrated by first assuming that crash frequencies follow a Negative Binomial distribution and then applying multiple regression techniques (see Equation (1)):
$$N_{SPF,3ST} = \exp\left[-9.86 + 0.79\,\ln(AADT_{maj}) + 0.49\,\ln(AADT_{min})\right] \quad (1)$$
where AADTmaj is the annual average daily traffic volume on the major road and AADTmin is that on the minor road.
When the intersections investigated do not meet the base conditions, NSPF,3ST changes into Npredicted (see Equation (2)):
$$N_{predicted} = N_{SPF,3ST} \times C_i \times \left(CMF_1 \times CMF_2 \times CMF_3 \times CMF_4\right) \quad (2)$$
where each CMFi commonly reduces the expected average crash frequency of sites under nonbase conditions compared with those matching base conditions, and the Ci value refers to the total effects deriving from the local geographic context [25].
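The HSM predictive method summarized above can be sketched in a few lines. The base coefficients (−9.86, 0.79, 0.49) are those published in the HSM for rural 3ST intersections and are assumed here rather than taken from this paper's tables; the function names are hypothetical, and the CMF values in the last line are purely illustrative numbers, not results of this study.

```python
import math

# Coefficients of the published HSM base SPF for rural 3ST intersections
# (an assumption brought in from the manual, not from this paper's tables).
HSM_3ST_COEFFS = (-9.86, 0.79, 0.49)

def n_spf_3st(aadt_major, aadt_minor, coeffs=HSM_3ST_COEFFS):
    """Predicted yearly crash frequency under base conditions."""
    a, b, c = coeffs
    return math.exp(a + b * math.log(aadt_major) + c * math.log(aadt_minor))

def n_predicted(aadt_major, aadt_minor, cmfs=(), calibration_factor=1.0,
                coeffs=HSM_3ST_COEFFS):
    """Base prediction adjusted by the product of the CMFs and by Ci."""
    base = n_spf_3st(aadt_major, aadt_minor, coeffs)
    return base * calibration_factor * math.prod(cmfs)

# AADT values of 5000 vpd (major) and 1700 vpd (minor), as used for Figure 3.
base = n_spf_3st(5000, 1700)
adjusted = n_predicted(5000, 1700, cmfs=(0.9, 0.8), calibration_factor=1.0)
print(base, adjusted)
```

Passing the coefficients as a parameter makes it possible to reuse the same functions with locally calibrated values once they are available.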
5. Validating HSM Procedure Using Local Data
Before moving on to the validation of the HSM SPF applied to a number of 3ST intersections located in Italy to predict yearly crash frequency, a preliminary filtering was carried out on the dataset of intersections that “meet HSM base conditions” and the dataset of intersections that “do not meet HSM base conditions”.
The 3σ method [26] was adopted to remove anomalous crash frequencies per year. The 3σ method makes it possible to check the homogeneity of distribution around the mean: the maximum deviation of frequency distribution is 3σ. Crash frequencies falling outside mean ± 3 std.dev. (µ ± 3σ) were removed from the dataset both under base conditions and nonbase conditions before moving to the calibration step. The results of the 3σ method show that no value was rejected under base or nonbase geometric conditions.
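A minimal sketch of the 3σ screening step follows, assuming the sample standard deviation is used (the text does not say whether the sample or population form was adopted). The function name and the input data are hypothetical.

```python
import statistics

def three_sigma_filter(values):
    """Keep only the values lying within mean ± 3 standard deviations.

    Uses the sample standard deviation (an assumption); needs >= 2 values.
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        return list(values)  # all values identical: nothing to reject
    return [v for v in values if abs(v - mean) <= 3 * sd]

# Hypothetical yearly crash frequencies with one clearly anomalous value.
frequencies = [1] * 20 + [100]
kept = three_sigma_filter(frequencies)
print(len(frequencies) - len(kept), "value(s) rejected")
```

With small samples the rule rarely rejects anything, because a single extreme value also inflates the standard deviation it is compared against.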
A residuals analysis was used to test the reliability of the HSM SPF applied at the 3ST intersections studied in Italy. As mentioned in Section 2, and in line with the flow chart in Figure 1, 5 years were selected out of the entire 8-year study period by simple random sampling without replacement to recalibrate the new NSPF,3ST and new Npredicted, as shown in Section 5, while the remaining 3 years, not included in the first dataset, were used to validate the new functions in Section 6. The first stage in the validation procedure was therefore to measure the total predicted yearly crash frequency using the HSM equations without adjusting the coefficients to the local data. Residuals were calculated by subtracting the predicted responses from the observed ones. In particular, several evaluation parameters [27] were considered:
(i) MAD (Mean Absolute Deviation), which is the sum of the absolute values of the difference between predicted and observed crash frequency per year divided by the number of intersections reflecting the specific geometric conditions.
(ii) MSD (Mean Square Deviation).
(iii) Cumulated Squared Residuals, used to check the absence of vertical jumps (outliers) and plotted in increasing order of the mean total AADT entering an intersection. A vertical jump reflects a lack of flexibility in the functional form of the model. The Cumulated Squared Residuals plot should not have long increasing or decreasing runs because they correspond to areas of substantial over- and underestimation [28, 29].
(iv) The Performance Diagram, a means of visually identifying how close the sample of predicted values is to the observed values. The cloud of points must be homogeneously distributed around the bisector of the first quadrant. The x-axis shows the observed yearly crash frequencies, while the y-axis shows the corresponding predicted value.
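The residual measures listed above can be sketched as follows. MSD is taken here as the mean of the squared residuals, which is an assumption since the text gives only its name; the function names and the numbers are hypothetical.

```python
def residuals(observed, predicted):
    """Observed minus predicted yearly crash frequency, site by site."""
    return [o - p for o, p in zip(observed, predicted)]

def mad(observed, predicted):
    """Mean Absolute Deviation over the intersections."""
    r = residuals(observed, predicted)
    return sum(abs(x) for x in r) / len(r)

def msd(observed, predicted):
    """Mean Square Deviation, assumed to be the mean squared residual."""
    r = residuals(observed, predicted)
    return sum(x * x for x in r) / len(r)

def cumulated_squared_residuals(observed, predicted, total_aadt):
    """Running sum of squared residuals, sites sorted by increasing total AADT."""
    r = residuals(observed, predicted)
    order = sorted(range(len(total_aadt)), key=lambda i: total_aadt[i])
    running, curve = 0.0, []
    for i in order:
        running += r[i] ** 2
        curve.append(running)
    return curve

# Hypothetical values for three intersections.
obs, pred, aadt = [1, 2, 3], [2, 2, 5], [300, 100, 200]
print(mad(obs, pred), msd(obs, pred), cumulated_squared_residuals(obs, pred, aadt))
```

A large step between two consecutive points of the cumulated curve marks a site whose crash frequency the model reproduces poorly.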
It emerges, as Table 2 shows, that the mean MAD and MSD values for the two datasets investigated (the subset of 3ST intersections that meets HSM base conditions, and the subset that does not) are higher than 2 and 9 respectively, which means that HSM NSPF,3ST and HSM Npredicted, applied to local data as presented in the manual without readjusting, do not adequately reflect the crash frequency observed at the 3ST intersections and fail to return reliable responses.
The diagrams in Table 2 also show jumps in the plot of cumulated squared residuals for the whole dataset of intersections that meet and do not meet HSM base conditions and, as previously, they confirm the need to readjust the HSM SPF to local data in order to reduce residuals and jumps. This is also confirmed by the performance diagram, where the cloud of points is completely displaced away from the bisector, emphasizing an overestimation of the observed values when the HSM functions are applied as delivered in the manual without recalibrating to reflect local data.
6. Calibrating New NSPF,3ST and New Npredicted Based on Local Data
6.1. Defining New Local Base Geometric Conditions
Following on from the main outcomes of the CHAID output, new base conditions have been suggested that reflect the simplest possible road configuration where more dangerous situations can arise and yearly crash frequency is not low: skew angle of less than 10° + absence of left/right-turn lanes + absence of lighting. A total of 65% of the 3ST intersections involved in the calibration phase (see Figure 1) met local base conditions. Thus, the new NSPF,3ST was calibrated under the NB distribution to predict crash frequency per year over the 5 years of the study period at the investigated intersections, as shown in Equation (∗) in Table 3. The over-dispersion parameter for this SPF is 0.59.
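To make the role of the over-dispersion parameter concrete, the sketch below evaluates the negative binomial log-likelihood that a calibration of this kind maximizes. It assumes the common parameterization with variance μ + kμ², where k is the over-dispersion parameter; the function names and the crash counts are hypothetical, and the optimization step itself is not shown.

```python
import math

def nb_log_pmf(y, mu, k):
    """Log-probability of observing y crashes when the mean is mu (> 0).

    Assumes variance = mu + k * mu**2, with k the over-dispersion parameter.
    """
    inv_k = 1.0 / k
    return (math.lgamma(y + inv_k) - math.lgamma(inv_k) - math.lgamma(y + 1)
            - inv_k * math.log1p(k * mu)
            + y * (math.log(k * mu) - math.log1p(k * mu)))

def nb_log_likelihood(counts, means, k):
    """Sum of the log-probabilities over all site-years."""
    return sum(nb_log_pmf(y, mu, k) for y, mu in zip(counts, means))

# Hypothetical yearly counts and SPF predictions, with k = 0.59 as reported above.
counts = [0, 1, 3, 2]
means = [0.8, 1.1, 1.9, 1.5]
print(nb_log_likelihood(counts, means, 0.59))
```

In a full calibration the means would be written as the log-linear SPF in the AADT terms, and the coefficients and k would be chosen to maximize this quantity.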
The results demonstrated that a total of 35% of study intersections (see Table 1) did not meet local base conditions, so new CMFs were calculated as described in the next section.
6.2. Local CMFs and Ci
The adjustment of the HSM CMFs and Ci to local data began with the CMF related to the “skew angle” variable. For circular curves, the CMF increases exponentially as the skew angle increases, while for tangent elements it decreases exponentially as the skew angle increases. Next, the CMF2+3, which reflects the effects of the co-occurrence of left-turn and right-turn lanes on the major roads at intersections that do not meet local base conditions, was calculated. As stated above, the intersections investigated always have left-turn lanes combined with right-turn lanes on the major road: neither of the two lanes exists in the absence of the other at the intersections investigated. The CMF2+3 is shown in Table 4; the remaining CMF values [25] are also shown in the same table.
In conclusion, all intersections whose skew angle, left-right/turn lanes, and/or lighting conditions fell outside the local base conditions were modified in accordance with the CMF values as shown in Table 4. Lastly, a calibration factor Ci was produced to adjust NSPF 3ST to the local sites. In accordance with the HSM procedure, this was equal to the ratio between the average of the observed number of crashes at all the 3ST intersections that do not meet local base geometric conditions studied in relation to the total of the predicted number of crashes as per Equation (2). The new Ci for predicting crash frequency per year at 3ST intersections reflecting specific geometric conditions is shown in Table 4. The final complete local Npredicted predicting crash frequency per year at 3ST intersections under nonbase local conditions is shown in Equation (3):
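$$N_{predicted} = N_{SPF,3ST} \times CMF_1 \times CMF_{2+3} \times CMF_4 \times C_i \quad (3)$$
where NSPF,3ST is the new local SPF of Equation (∗) in Table 3, whose AADTmaj and AADTmin reflect the range values in Table 1; CMF1, CMF2+3, CMF4, and Ci are shown in Table 4.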
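The calibration factor can be sketched as below. The sketch assumes the usual HSM form, in which Ci is the total number of observed crashes divided by the total number of predicted crashes over the same sites and period; the function name and the counts are hypothetical.

```python
def calibration_factor(observed_counts, predicted_counts):
    """Ratio of total observed to total predicted crashes (assumed HSM form)."""
    total_predicted = sum(predicted_counts)
    if total_predicted <= 0:
        raise ValueError("total predicted crashes must be positive")
    return sum(observed_counts) / total_predicted

# Hypothetical counts at three intersections that do not meet base conditions.
c_i = calibration_factor([2, 1, 3], [4, 2, 6])
print(c_i)
```

A value below 1 indicates that the uncalibrated model predicts more crashes than were observed at the local sites.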
It emerges that significant changes occur when (a) the specific effects of skew angle are investigated in cases where the intersections belong to circular curves or tangent segments (less serious than the site belonging to circular curves during the study period), (b) there are both left- and right-turn lanes on the major roads, whose benefits exist according to local data but are characterized by only one CMF value in the Equation (see Equation (3)), returning a lower outcome than two multiplicative CMFs as suggested by the HSM (see Equation (2)).
7. Results and Discussions
As seen above, the whole study period covered 8 years, divided into two datasets by simple random sampling without replacement. One set contained 5 years of crash data and the other 3 years of crash data, both relating to the same sample of 3ST intersections, which did not undergo geometric, infrastructure, or traffic changes during the period in question. The first time interval, with its crash-geometric-traffic features, was adopted to calibrate the new NSPF,3ST and new Npredicted, while the remaining 3 of the total 8 years, not included in the calibration phase, were adopted to validate the new NSPF,3ST and new Npredicted.
Table 5 shows the numerical residuals: when the mean global values of MAD and MSD for the two datasets (a subset of 3ST intersections that meets base conditions, and a subset that does not) are calculated on the basis of NSPF,3ST and Npredicted adjusted to take into account local data (see second column in Table 5), they are significantly lower than those observed when the HSM procedure is applied to the dataset without recalibrating the coefficients according to local data.
The calibrated SPF (Equation (3)) reflects the risk exposure of the intersections under investigation more accurately, as it is directly related to the current geometric and traffic configurations found at the sites. Compared to the HSM model, which overestimates the average yearly crash rate at these sites, it is better suited to predicting the total average yearly crash rate. If, therefore, the HSM model were used without corrective factors, the predicted crash rate would be higher than that actually observed. Such a purely numerical approach would have negative economic consequences, in the form of costly safety interventions at the site involving the geometrical modification of numerous variables, compared with the application of a more realistic model such as the one presented here.
Figure 2 is a graph showing the results of a comparison between the residuals calculated estimating (a) a predicted yearly crash frequency obtained using the HSM default functions (gray points in the graphs), and (b) a yearly crash frequency predicted using the HSM adjusted to local data (black points in the diagrams).

Analyzing the results in greater detail, Figure 2 highlights some significant points as follows:
(i) Figure 2(a) underlines a marked overestimation of the observed values when adopting the default HSM functions. In fact, while the residuals calculated by estimating the predicted values using the new NSPF,3ST and new Npredicted are evenly distributed around zero and never exceed 0.3, as may be seen more clearly from a zoom of this cloud of points in Figure 2(b), the residuals calculated by estimating the predicted values using the default HSM SPF follow a positive trend, increasing from zero to almost 9 as shown in Figure 2(a).
(ii) Figure 2(c) shows the high values of the cumulated squared residuals (plotted in increasing order of total mean AADT for vehicles entering the intersection) obtained by applying the default HSM functions, with huge jumps in the diagram. The cumulated squared residuals increase from 0 to almost 110 when the default HSM functions are applied to local data. Conversely, as may be seen more clearly from the zoom in Figure 2(d), the cumulated squared residuals obtained by applying the new NSPF,3ST and new Npredicted show no jumps and range from 0 to 10. Figure 2(d) thus confirms the greater reliability of the new NSPF,3ST and new Npredicted in predicting yearly crash frequencies compared with the first estimate.
(iii) An additional graphical analysis was carried out to support these conclusions. Figure 2(e) shows the performance diagram, where the x-axis represents the observed values for yearly crash frequency and the y-axis shows the predicted values obtained by applying the default HSM functions (in gray) and the HSM functions adjusted to local data (in black). The diagram confirms that the observed values were overestimated as a result of applying the HSM functions without recalibration as set out in the Manual: Figure 2(e) shows a cloud of gray points all lying away from the bisector, while the black points are evenly positioned around zero, as may be seen better from the zoom in Figure 2(f).
Finally, a Kruskal–Wallis (KW) test [30] was carried out based on Equation (4) to verify the following questions:
(i) Are the predicted crash frequencies returned by the new NSPF,3ST and new Npredicted (see Equation (∗)) equations at 3ST intersections that meet or do not meet local base conditions respectively, statistically equal to the observed yearly crash frequency values (first hypothesis H0)?
(ii) Are the predicted crash frequencies returned by the HSM NSPF,3ST (see Equation (1)) and HSM Npredicted (see Equation (2)) equations at 3ST intersections that meet or do not meet HSM base conditions respectively, statistically equal to the observed yearly crash frequency values (second hypothesis H0)?
$$H = \frac{12}{N(N+1)} \sum_{i} \frac{R_i^2}{n_i} - 3(N+1) \quad (4)$$
where ni is the number of observations in the ith dataset; Ri is the sum of the ranks of the observations in the ith dataset; N is the overall sample size.
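The Kruskal-Wallis statistic can be computed directly from the pooled ranks, as in the sketch below. It uses the standard form H = 12/(N(N+1)) Σ Ri²/ni − 3(N+1) with average ranks for tied values and no tie correction, which is an assumption about how ties were handled; the function names and the data are hypothetical.

```python
def average_ranks(values):
    """Rank the values from 1 to len(values); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions pos..end
        for j in range(pos, end + 1):
            ranks[order[j]] = shared
        pos = end + 1
    return ranks

def kruskal_wallis_h(groups):
    """H statistic over the groups (no correction for ties)."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    weighted, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

# Hypothetical observed and predicted yearly crash frequencies.
h_value = kruskal_wallis_h([[1, 2, 3], [4, 5, 6]])
print(h_value)
```

H is then compared with a chi-squared critical value with (number of groups − 1) degrees of freedom to decide whether H0 can be rejected.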
The KW test results shown in Table 6 adopted during the validation procedure confirm:
The procedure shown in this study can be used to evaluate measures of driver exposure to crash risk through the application of a revised HSM SPF in light of a study of the local context; the coefficient values of the yearly frequency prediction model are one of the main tools for investigating how much each predicting variable of the model can seriously affect the result, especially when this variable does not reflect a proper configuration as suggested by the Standard. To improve safety conditions at the intersections, it may be possible to work on variables used in the SPF by ensuring a proper control of the predicted results that will obviously affect the choice of maintenance operations and budget planning.
In conclusion, the factors introduced into the model are variables to be worked on to improve and monitor road safety through maintenance action at intersections. Revising the HSM SPF in light of our observations would be of support to procedures for monitoring the effectiveness of the actions when one or all of the variables change, with consequences on the reduction/increase in the degree of user safety.
The SPF for 3ST at-grade intersections was implemented in a diagram (Figure 3) showing the effects on yearly expected crash frequency reduction: changing the value of the explanatory variables is one of the main findings able to help practitioners. The diagram presents the yearly expected crash frequency on the y-axis, while the x-axis shows one of the geometric variables of the predictive model (Equation 2), in particular the skew angle. The diagram refers to a constant AADT value on the major and minor roads: 5000 vpd and 1700 vpd respectively.

The number of possible profiles is equal to the number of available variables used in the model on which it can actually work to improve road safety conditions.
Figure 3 shows different profiles obtained by changing the curvature of the element the intersection belongs to. It may be observed how a greater skew angle combined with a reduction in curvature (the tangent element) favors a reduction in Npredicted.
The right corner of Figure 3 shows a hazard map referring to a scenario with intersections belonging to tangent segments in co-occurrence with left-turn and right-turn lanes on major roads. In this particular case, the results show that increasing the skew angle decreases Npredicted.
8. Conclusions and Recommendations
The results achieved in this study demonstrate that when the skew angle at intersections is greater than 10°, the expected outcomes can vary by changing the geometric site to which the intersection belongs: the skew angle of intersections belonging to circular curves causes a more dangerous scenario than skew angles at intersections belonging to tangent segments.
It was observed how the crash frequency is higher when the skew angle of the intersection increases, but it is more dangerous when these configurations happen with intersections belonging to circular curves rather than on tangent segments. As explained before, the joint presence of left and right-turn lanes on major roads at investigated intersections has made it possible to prove that the yearly crash frequencies predicted from adjusted functions based on local data reflect observed values better than those predicted by adopting the HSM functions: a local CMF that simultaneously involved the benefits of left and right-turn lanes was lower than the solution given in the HSM, which estimates the effects of left and right-turn lanes by means of two independent positive contributions through the adoption of two CMFs.
The research shed light on some differences between the model suggested by the HSM and the one used in this study for 3ST intersections on rural roads; in fact, the following main differences were observed:
(i) The negative exponential function of the NSPF,3ST for 3ST intersections that meet base geometric conditions is always lower in the HSM procedure than the one suggested by the local function proposed here; this issue sheds light on the overestimation of yearly crash frequency by adopting HSM rather than the observed values.
(ii) The effect of the skew angle on safety when the value was greater than 10° is strongly correlated to the geometric layout of 3ST intersections (intersections belonging to tangents or to circular curves).
(iii) The co-occurrence of left- and right-turn lanes makes it possible to calculate new local CMFs that reflect this configuration and have a lower value than those obtained through two multiplicative HSM CMFs referring to the presence of a left-turn lane together with a right-turn lane on a major road as independent configurations.
(iv) The Ci factor is always greater in the HSM procedure than the one suggested by the Npredicted presented here.
The SPF for 3ST at-grade intersections was implemented in a diagram showing the effects on yearly expected crash frequency reduction obtained by changing the value of the explanatory geometric variables. On the basis of the research findings, it can be concluded that the 3ST at-grade intersections belonging to tangent segments (unlike those belonging to circular curves), equipped with both left-turn and right-turn lanes and with a higher skew angle value, always have a lower predicted yearly crash frequency.
Data Availability
The crash data used to support the findings of this study are available from the corresponding author upon request.
Additional Points
Statement. The research did not receive specific funding but was carried out within the terms of employment of the authors at Federico II University of Naples and Nanjing University of Science and Technology.
Conflicts of Interest
The authors declare that no conflict of interest exists regarding the publication of this paper.
Acknowledgments
We would like to express our deep gratitude to Prof. Gianluca Dell’Acqua (Federico II University of Naples) for his support in providing crash data used in the paper presented here.