Abstract
Dock-less bicycle-sharing programs have been widely accepted as an efficient mode to benefit health and reduce congestions. And modeling and prediction has always been a core proposition in the field of transportation. Most of the existing demand prediction models for shared bikes take regions as research objects; therefore, a POI-based method can be a beneficial complement to existing research, including zone-level, OD-level, and station-level techniques. Point of interest (POI) is the location description of spatial entities, which can reflect the cycling route characteristics for both commuting and noncommuting trips to a certain extent, and is also the main generating point and attraction point of shared-bike travel flow. In this study, we make an effort to model a POI-level cycling demand with a Bayesian hierarchical method. The proposed model combines the integrated nested Laplace approximation (INLA) and random partial differential equation (SPDE) to cope with the huge computation in the modeling process. In particular, we have adopted the dock-less bicycle-sharing rental records of Mobike as a case study to validate our method; the study area was one of the fastest growing urban districts in Shanghai in August 2016. The operation results show that the method can help better understand, measure, and characterize spatiotemporal patterns of bike-share ridership at the POI level and quantify the impact of the spatiotemporal effect on bicycle-sharing use.
1. Introduction
Bike-sharing is a service that allows the public to rent bicycles on a time-sharing basis, relying on the Internet to make full use of the demand for self-transportation brought about by rapid urban development [1]. After years of hot and calm changes, bike-sharing has entered a mature period of development, but it still has not gone out of the mode of occupying a large number of public resources. Shared bikes often swarm in a large area within a short period of time, occupying a large amount of traffic land, which has increased the great difficulty for urban management. The phenomenon of “zombie bikes” filling up the streets has become one of the hotspots for public complaints for the chaos they create [2, 3]. The reason behind this is that the bike-sharing operating companies over-release bikes and fail to match the corresponding operation maintenance and reasonable allocation. In this paper, the hierarchical Bayesian spatiotemporal model fitted by the two new technologies is used to predict the demand for shared bikes in the main points of interest in the city, optimize the deployment scheme of shared bikes, increase the turnover rate of each bike, avoid the serious imbalance between supply and demand of shared bikes in the delivery points, and provide relevant theoretical support for the deployment scheme of bicycle operators.
Most research on bike-sharing demand can be divided into two categories. One is to predict the demand of shared bikes based on regions. Some scholars divided regions into grids and proposed a hybrid deep learning neural network combining long short-term memory (LSTM) and convolutional long short-term memory (ConvLSTM) to predict the short-term traffic of shared bikes in each region (grid). Through analyzing existing methods, Yu et al. found that the reasonable way to divide the regions of shared bikes should be based on the marketization of bike-sharing mode [4]. Another perspective is to forecast the demand for shared bikes based on the site. Dell’Olio et al. established the location model of rental points with the help of geographic information system and determined the user attributes of the public bicycle system and the optimal location of rental points [5]. Chai et al. applied the deep learning method to the prediction of bicycle traffic based on the station level and proposed a new multigraph convolutional neural network to capture the spatial relationship between different stations [6]. Kaltenbrunner et al. made an attempt to predict the amount of bicycle borrowing and returning from the perspective of the station. They combined the data of bicycle running with the spatial characteristics, such as related terrain and time nodes [7]. Chen et al. adopted the clustering method to obtain high internal connectivity of the site through a graph-based clustering method and then predicted the overdemand probability of each cluster [8]. Therefore, a POI-based study on bike-sharing travel can combine the use characteristics of bike-sharing with the travel needs and help understand the circulation characteristics and service capacity of urban hot areas, filling the blank of existing studies.
Another thing worth notice is the method of prediction. In general, many research efforts have been made to predict bike-share demand based on various machine learning models, such as clustering analysis [9], k-nearest neighbors [10], support vector machine (SVM) [11], and artificial neural network (ANN) [12]. However, these models are lack of well consideration for some factors, such as requiring a large amount of observation data, only taking regions as objects for research, and failing to take into account the change in the space-time dimension of the number of shared-bike trips. For the neural network method, it is hard to determine the optimal network topology and training parameters. Similar machine learning models generally only carry out research in time or space dimension but cannot deal with the changes in space and time dimension well at the same time. The Bayesian hierarchical model has a good performance when simultaneously dealing with the space-time dimension problems and has a wide range of applications in various fields. The traditional Markov Chain Monte Carlo (MCMC)-based hierarchical Bayesian model now is faced with the test of high correlation of independent variables and huge amount of computation. This paper uses a new statistical method, namely the integrated nested Laplace approximation (INLA) via R-INLA packages [13–15]. For a complex Bayesian multiparameter model, INLA can quickly calculate the posterior distribution of parameters in the Bayesian model [16]. In addition, we express GF with MAT-ERN covariance function as a discrete Gaussian Markov random field (GMRF) by means of stochastic partial differential equation (SPDE) method [17, 18]. The latter has good computing characteristics. At present, the combination of INLA algorithm and SPDE method has been widely used in various fields, such as the evolution of Ebola virus and climate impact [19], but rarely applied in the field of traffic characteristic analyzing. The framework of this paper is as follows: the second part shows the methods used, the third part carries out the model architecture and calculation, and the fourth part discusses the research results.
2. Model Methods
2.1. The Spatiotemporal Model
Temporal and spatial effects represent changes that cannot be reflected by spatial or temporal effects alone. The usage characteristics of shared bikes determine that they have both time-varying characteristics and spatial distribution rules. Therefore, this section constructs a Bayesian hierarchical spatiotemporal model to describe the spatiotemporal distribution of shared bikes:where represents the travel prediction value of the shared bicycles at site and time point t, and i denotes the station number.
The model characterized by was built from covariate information measurement error , which is defined as a Gaussian white-noise process, and a state process denotes the vector of covariates for site and time point And is the coefficient vector.
was described as the spatiotemporal Gaussian field as shown below with first-order autoregressive dynamics, and it changes in time, where is a coefficient, and values between −1 and 1. .where is a temporally independent process, which can be described by a spatial-temporal covariance function: is a spatial correlation function, which can be estimated through the spatial distance of two sites and . For each and t, we have . The function can be further explained as follows:where controls the smoothness of the function, while denotes a second kind modified Bessel function. In addition, can be seen as an adjustable spatial distance scaling parameter. Then, the empirical function was used to represent the distance where a spatial correlation is close to 0.1.
2.2. GMRFs and SPDE Approach
GMRF is a probabilistic process that describes the structure of graphs and is applied to image segmentation (such as regular grids, lattice structures, or geographic regions), simplified calculations of design prior probabilities, and matrix calculations of Gaussian distributions. Purely Markov property, also known as the conditional independence property, refers to a transaction current state only related to the previous state, which is expressed as
Name the averages of MRF as ; the accuracy and positive definite matrix is Q, if point i and j is adjacent, it comes , also known as
Assuming that the MRF satisfies the Gaussian distribution, we can obtain a difference equation, called GMRF, which is expressed by the spatial pixel gray level.
The SPDE method is to express the continuous index GF as the discrete index GMRF (Gaussian Markov random field) by the matrix covariance function. The key is to replace the spatial and temporal covariance function and dense covariance matrix of GF with neighborhood structure and sparse precision matrix, respectively. SPDE uses finite element representation, subdividing region D into disjoint triangles by increasing vertices.
2.3. INLA Approximation for Bayesian Inference
Bayesian theorem provides a method to correct the subjective judgment of probability distribution by applying the observed phenomena in probability statistics. It is also a powerful tool in statistics for establishing a relationship between the quantity to be observed and the unknown quantity. The main advantages of the Bayesian approach lie on the uncertainty estimation and prediction.
The MCMC method has been widely used in Bayesian inference, but its limitation of large amount of computation has become an important problem. Thanks to the advances in data collection, we can collect large data sets with data from different sources, also with high spatial and temporal resolution. Given the spatial-temporal structure and the model complexity of large data sets, Bayesian inference of these data through MCMC may require days of computation time.
To overcome the above-mentioned problem, the integrated nested Laplace approximation (INLA) has been shown to operate accurately and fast. It was originally a standalone program and then be embedded into R as a package. From then on, it is widespread among statisticians and applied research personnel, with the spatial-temporal distribution model probably being one of its main applications.
The hyperparameters of the latent model can be expressed as parameter vector , which usually represents the error term of each stage of the model. Hidden effect includes the variables that we want to focus on. According to Bayes’ theorem, the joint posterior distribution of and is expressed as follows:where y is the observed value vector, is the likelihood function of y, and is the joint posterior distribution of and .
The likelihood function is expressed as
According to the conditional probability formula, the joint posterior distribution of and can be written as
Since obeys the Gaussian Markov random field and the accuracy matrix is , then the conditional distribution of with a given can be written as
Based on the theorem above, the joint posterior distribution of and then can be written as
According to Bayesian theory, the edge posterior distribution of X can be obtained as follows:
The above integrals can be difficult to compute when meeting high dimension. Hence, Laplace approximation and numerical integration can be used to approximate the edge posterior distribution of A and B. We end up with the following equation:
Here is the overall framework of the proposed model: We first build a hidden Gaussian model, which is essentially a Bayesian hierarchical model. The model we built can be divided into three layers: the first layer is the likelihood function of variable y; the second layer is the distribution of hidden variable x, which is required to be normal distribution; and the third layer is the prior distribution of the super parameter . The INLA algorithm is used to estimate the edge posterior distributions of and to simplify the Laplace approximation. Temporal and spatial data were substituted into this Bayesian hierarchical model and allocated to the grid segmented by GMRF and SPDE methods to obtain the posterior results of each regression coefficient.
3. Results Analysis and Discussion
3.1. Study Area Description
Jing’an District belongs to the Shanghai city and is located in the central area of Shanghai. In October 2015, the former Zhabei District and Jing’an District were merged and established as the new Jing’an District. The district covers an area of 373,700 square kilometers, with a permanent population of 1.1 million. It is a small area, yet with a high population density and a well-developed slow traffic system. Mobike, the research object of this paper, took the lead in launching in Shanghai. By August 2016, the number of bikes in Shanghai had reached 300,000, which means that Shanghai is not only the leading city with super-large-scale shared bikes in the world but also basically realizes the coverage of shared bikes in the whole region in our research scope. Therefore, it is very appropriate to carry out research on the travel volume of shared bikes in this area. Figure 1(a) illustrates the location of Jing’an District in the whole city. Figure 1(b) is the expected line graph of OD data of shared bikes within one random day in August 2016, drawn in the unit of Shanghai street administrative district. Observed by the changes in the thickness and color of the lines in the figure, we can see that a large number of bicycle trips are densely distributed in and around Jing’an District, which clarifies that most of the requirements for shared bikes are concentrated in Jing’an District. That provides a strong support for us to choose Jing’an District as the study area.

(a)

(b)
Jing’an District enjoys advantageous geographical location and convenient transportation, which has formed a transportation network of railways, viaducts, subways, and buses. It has five long-distance passenger stations, including Shanghai Railway Station, Shanghai-Pacific Long-distance Passenger Transport Station, and Shanghai Long-distance Bus Terminal. Rail transit has Line 1, 2, 3, 4, 7, 8, 12, 13 and other subway lines through the district: Yan’an Road, Inner Ring Road, Middle Ring Road, North-South, and other viaducts through Urban Terminal, Jing’an Temple Transportation Hub, Central Gonghe New Road Transportation Hub, etc., provide great convenience for air travel and transit. In addition, Jing’an District has perfect public facilities, such as medical service institutions, sports venues, all kinds of schools, cultural facilities, scenic spots, etc., which provides convenient conditions for the selection of interest points in the follow-up research.
3.2. Data Preprocessing
The research data in this paper are mainly from 2016 SODA Shanghai Open Data Innovation and Application Competition and Baidu Map API network open data. Among them, the full sample of bike-sharing user usage data covers the usage of about 306,000 Mobikes and 17,000 users in Shanghai from August 1, 2016, to August 30, 2016. The data attributes include two parts: (1) ID information, including order ID, user ID, and bike ID; and (2) track information, including track points in the process of cycling, starting time and ending time of cycling, starting point and finishing point of cycling, and so on. As stated in Section 3.1, the cycling data of Mobike in Jing’an District from August 1, 2016, to August 30, 2016, were chosen for the following analysis. In order to facilitate further research, we cleaned the track data of shared bikes. Orders that lasted for less than 2 minutes or more than 120 minutes were cleared as abnormal data because they often did not correspond to the bike-sharing behavior we wanted to analyze. Data with their cycling tracks beyond the zone of Jing’an were also deleted from the database.
The cycling data were statistically analyzed and visualized, as shown in Figure 2. Figure 2(a) shows the distribution of shared-bike travel time. Most of the trips are concentrated within 20 minutes, indicating that short trips account for a large proportion of trips among the district. Shared-bike trips lasting seven to eight minutes are the most popular. There is also a demand for trips of 20–40 minutes, while the use time of shared bikes hardly exceeds 60 minutes. It indicates that users will turn to other public transportation tools, such as subways and taxis, when traveling a long distance. Figure 2(b) shows the frequency of users using bicycles in a month. This distribution obeys the offset normal distribution, and its peak is about 5 times. 0–10 times is their common choice, and very few users use the bikes more than 15 times a month, indicating that users have not developed the habit of using daily commutes. Figure 2(c) shows the average hourly travel time of shared bikes in weekdays, revealing the real-time urban travel behavior and dynamic vitality distribution. In the picture, there are two obvious peak periods of bike use, one at 8 AM and the other at 6 PM, which just accord with the distribution of morning and evening peak hours in the city. Figure 2(d) shows the average hourly travel time of shared bikes on weekends, which is significantly different from that on weekdays. There has been a steady rise in shared-bikes use since 6 AM, peaking at 6 PM. These observations show that, to a certain extent, the use of shared bikes is consistent with the travel habits of Shanghai residents.

(a)

(b)

(c)

(d)
In addition, we obtained six categories of POI data from Baidu Maps through web crawling technology, including medical services, transport services, education services, public facilities services, cultural and sports services, and accommodation services. Interested readers can find similar data crawling and preprocessing method in the literature [20–22]. But the raw POI data have data redundancy, which needs to be reclassified. These problems include both identical data names and identical data space attributes, i.e., the same longitude and latitude field values. After that, a small number of points at the research area boundaries or with low public awareness were eliminated, and only about 300 POIs with significant spatial representative effects were retained. The distribution of these interest points is shown in Figure 3. What can be observed is that these POIs are relatively evenly distributed within the Jing’an zone.

3.3. Fixed Covariate Introduction
Some contextual factors, such as season, weekdays or weekends, weather, holidays, and abnormal events, should be considered in the covariates selection. Choosing historical days that are similar to the context factors of target days as the training set of the prediction algorithm helps make the prediction results more accurate [23, 24]. Song et al. tried to mine the relationship between contextual factors and traffic in a given time interval [25]. Ma et al. proposed a new daily traffic flow prediction method based on deep learning [26]. Ma et al. used NSGA-II to sort the test set grouping [27]. In this paper, the idea of contextual factors is used for reference, and the covariable is set as the influencing factor of the forecast of bicycle demand.
The geographical location of POIs is the most basic feature and the key factor to be considered. It has a significant advantage in estimating user behavior. So the longitude and latitude coordinates of the POIs were chosen as the covariables and named as UTMX and UTMY, respectively.
Similarly, the different types of POI determine the different services provided to travelers, and different services correspond to different travel needs of travelers. There is a mapping between the services provided by the POI and the needs of the travelers. Considering the basic travel needs of bike-sharing users, the services were divided into six categories, namely public facility services, medical services, education services, accommodation services, cultural and sports services, and transport services. The specific classification can be seen in Table 1.
In order to quantify the impact of a specific class of POI on the demand for shared-bicycle travel, we collected data on the daily visits to each POI in the government’s public data set. Then, we weighted the “attractiveness” of each POI based on the number of daily visitors. This quantified appeal reflects the ability of a particular type of POI to serve the travelers. In order to better reflect the mapping relationship between travel purpose and related travel intention categories, we named this covariable as POI service ability index, abbreviated as SAI. In order to be able to explore the specific impact size, we used an average weighted number of visits for each category of POI as a covariate.
Consider the characteristics of bicycle as a means of transportation, it is also reasonable to believe that the weather has a great influence on travelers’ decisions. So daily precipitation data during the study period were collected from the public official weather website and set as a covariable, named DPD.
Finally, the demand for shared bicycle use at a specific POI will also be affected by the number of visits to other POIs around it. This is because POIs of the same category within the average cycling radius may be competing to provide their specific services. And when the service capacity of one POI reaches saturation, travelers switch to other similar POIs instead. That is, the services provided by the same kind of POI are spatially correlated. For this reason, we assume that the demand for shared bicycles at other POIs within a radius of 500 meters from a specific POI will affect its demand. 500 meters is also the average maximum walking radius for pedestrians. This will have a cumulative effect in space. In order to quantify its impact on the demand of the specific POI we are concerned about, we use the ratio of average daily visitor volumes of other POIs in the nearby area to its services capacity as the influence coefficient and refine it into a covariate named MJD.
Since the distribution of covariate data is non-negative non-normal distribution, we use the min-max normalization processing method to shrink the covariable index to .
In the end, a 305 × 31 × 6 covariate array is built for model building. 6 is the number of covariates.
3.4. Model Building
Demanding for shared cycling in the point of interest exists in different time and space dimensions. In our study, it is assumed that the cycling demand is not only related in consecutive days but also independently distributed in terms of spatial effect and the combination of fixed covariables. Under the premise, the proposed space-time model was built with the SPDE method, all involving the stochastic effects of the calculation model and the marginal posterior distribution of parameters. The first step is to create a mesh to build the GRF, thus to realize the SPDE method. The constrained refined Delaunay triangulation is created with the coordinate points obtained as the vertex of the triangle. It is true that we can use more POIs as vertices to get a finer mesh, but with that comes more computation. Under the existing segmentation conditions, the maximum distance in each triangle is not more than 500 meters, which basically meets the user’s choice characteristics. The tracking points are assigned on this mesh according to latitude and longitude coordinates, as shown in Figure 4. The POIs were randomly divided into test set and verification set in an 8-to-2 ratio within each category. We labeled the test points red and verification points blue. Since the shared-bike traffic flow fluctuates in time and space, we used the interpolation method to insert the value for each mesh cell; its weight is inversely proportional to the distance. Finally, the model was run by integrating the covariable combination with the mesh value of shared-bike trip volume, and the predicted value of the daily shared-bike attraction for each POI was obtained.

The prediction results of each parameter are shown in Table 2. Table 2 summarizes the posterior statistics of each fixed covariable, including mean, standard deviation, and three types of quantiles. The absolute value of the mean value indicates the intensity of the influence of these covariables on bicycle travel demand. According to the observed data, precipitation has the most significant influence among all the covariables, indicating that riders’ travel decisions are greatly affected by weather. On the contrary, the category and intensity of interest points have no obvious attraction or repulsion to shared-bike trip volumes. A reliable explanation for this is that the study area has a large demand for the use of shared bikes, along with an intersecting travel demands through different types of POIs. Each category of POIs accounts for a large number of traffic trips, and there is a certain correlation between them. This allows the predicted travel volume of shared bikes to be more evenly distributed among these points, without reflecting too much fluctuation.
In addition, the negative mean value indicates that the covariable is inversely proportional to the trip number. For example, the greater the precipitation, the smaller the number of predicted shared bike trips, which also confirms our cognition. The 95% credible interval of the density of POIs is , showing a fluctuating changes in the impact on sharing-bike trip demands.
Table 3 collects the posterior estimates of the hyperparameters , , , and . scales the variance of Gaussian distribution . And, weighs up the variance of the measurement error term. A huge discrepancy in those two values means that the fluctuation of the demand distribution of shared bikes is more affiliated with the observational error. The posterior mean of , a coefficient vector of fixed covariates, was 3.696, and the 95% credible interval was . The fact that this parameter is clearly greater than zero infers a good clustering on shared-bikes demanding. While the posterior distribution values of another hyperparameter indicates no particularly significant correlation on temporal dimension.
3.5. Model Verification
To test the validity of the model, a comparison between the predicted and observed values of the shared-bikes demand is plotted and is shown in Figure 5. These points are almost uniformly and symmetrically distributed on either side of the solid line. It can be concluded that the accuracy of the model is within the acceptable range. Predictions are more accurate when travel rates are low.

We further focus on the space-time effect of this random field. Figure 6 shows the posterior mean and posterior standard deviation of special random effects on a certain day, August 27th, 2016. It shows the GRF with negative binomial likelihood. There are no particularly large values in the whole region, indicating that the spatial correlation is not significant within this region. In the lower part of the area with more concentrated POI points, the random field has greater uncertainty, indicating that the overdense POI has greater variability and will affect the prediction results. The standard deviation calculation of space random field is driven by information content, so less uncertainty can be inferred from the results. The higher posterior standard deviation in the boundary region may be due to the disturbance of other unobserved factors.

(a)

(b)
4. Conclusion
As a popular travel mode, sharing bicycles provide an efficient mobility option to travelers, especially for “first-last mile” trips. Accurate demand prediction could help decision-makers maintain a suitable balance of bicycles throughout the urban area, optimizing the docking stations as well as cycle lanes at spatial-temporal dimension. In order to achieve this goal, a Bayesian hierarchical spatiotemporal model was established based on the multiple source data such as POI, daily weather, and dock-less bike-sharing data. The use of the INLA method and SPDE approach in our model greatly reduced the computation time compared with the traditional MCMC method.
Major findings from the article can be concluded as followed: (1) A cycling demand prediction model was established from the perspective of time and space. (2) Compared with the MCMC method, the INLA algorithm improves the calculation speed while ensuring the accuracy. (3) Most of the previous studies were based on regions or stations, while our model was built on POI, which is more suitable to dock-less bike-sharing rental prediction. (4) Precipitation was identified as significant influencing factors, while the intensity of POIs had no obvious impact on cycling demands. (5) Measurement error is better than spatial correlation performance to explain the variation of travel distribution.
Future research can expand in the following several aspects: (1) The SPDE method should model the geographical data or use a more refined mesh instead of the direct triangle subdivision. (2) The model proposed in this paper has not been widely applied in the field of transportation, and the universality of the model can be tested in the practice process with larger research scope, larger data volume, and more sampling points of interest in the future [28, 29]. (3) More covariates or more time components can help improve the spatiotemporal model [30, 31]. (4) The category of POI can be further subdivided, and the geographic spatial attribute of POI, such as area and service capacity, can also be taken into consideration. (5) Due to the rapid development of the industry, users’ habits and travel patterns of shared bikes are also changing. Bike-sharing data are highly time-sensitive, and obtaining updated data for research is helpful to get more accurate conclusions.
Data Availability
The research data in this paper are mainly from 2016 SODA Shanghai Open Data Innovation and Application Competition and Baidu Map API network open data (https://shanghai.sodachallenges.com/data.html).
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This research was supported by the National Key Research and Development Program of China (2018YFB1600600).