Spatial Panel Regression of Chinese Housing Market
By: gstar • October 18, 2016 • Term Paper • 4,896 Words (20 Pages) • 1,208 Views
Predicting the Chinese Housing Market: a Spatial Panel Regression of the Chinese Housing Market
Plan:
- Introduction
- Chinese Housing Market
- Spatial panel models
- Dataset, interpolation and extrapolation rules, Wuludao correction, unbalanced panel
- Obtaining distances, formula used, 3 different Weighting matrices, geography of China, map
- Estimation of coefficients
- Economic interpretation of results, statistical significance
- Developments in China’s Housing Market
http://www.stats.gov.cn/tjsj/ndsj/2014/indexeh.htm (yuan/m2)
In the analyzed period of 2003 - 2012, the Chinese real estate market (not just the housing market) experienced large price increases. The graph below shows average prices in yuan per square meter for different types of real estate, using data retrieved from the National Bureau of Statistics of China. All types of real estate increased in price over time, and some (office buildings, and villas and high-grade apartments) increased almost threefold. The data does not fully control for quality improvements of the buildings. Nevertheless, it is fair to assume that the quality of buildings did not increase as much as the prices did.
[pic 1]
Such increases in prices often lead to real estate or housing bubbles. However, high prices alone are not sufficient to determine whether there is a bubble in the market. Fang, Gu, Xiong and Zhou (2015) analyzed the Chinese housing market and showed that, on average, housing prices in China had a real growth rate of 13.1% in the largest cities, 10.5% in the second-tier cities and 7.9% in the smallest cities of their sample in 2003-2013, a growth rate even larger than the one that resulted in a housing bubble in the US. However, they argue that there is no housing bubble in China. Firstly, the borrowing procedure in China is much stricter, and thus safer for banks, than in the US; e.g. Chinese banks require down payments of more than 30% on mortgages. Secondly, disposable income and per capita GDP were growing almost as fast as housing prices, which mitigated the financial burden that rising prices placed on borrowers. Finally, the housing market in China is protected from failure by the government, as it is one of the main investment vehicles for Chinese investors, who are restricted from moving their capital to foreign capital markets.
Feng and Wu (2015) also confirm, using an asset-pricing approach, that house price appreciation in the Chinese housing market is not in itself an indication of a bubble. The authors argue that the rent-to-price ratio is the suitable indicator for determining whether a bubble exists. If this ratio is lower than the user cost of owning a house, assuming realistic expected capital gains on the real estate owned, a bubble is indicated. For a sample of 60 large and medium-size cities they find that the rent-to-price ratio at the national level at the end of 2013 (3.21%) is in the equilibrium interval (2.85% to 3.39%), suggesting that the expected rate of nominal house price appreciation is in the rational interval and indicating no sign of a bubble in the Chinese housing market. (http://www.ntu.edu.sg/home/guiying.wu/FengWu_201504.pdf)
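The interval check described above can be written out in a few lines. This is only an illustrative sketch in Python: the function name is hypothetical, and the numbers are the ones quoted from Feng and Wu (2015), not recomputed from their data.

```python
def in_equilibrium_interval(ratio, lower, upper):
    """Return True when the rent-to-price ratio lies inside [lower, upper]."""
    return lower <= ratio <= upper


# Figures quoted in the text (in percent); nothing here is estimated.
national_ratio = 3.21
lower_bound, upper_bound = 2.85, 3.39

# A ratio below the lower bound would point to a bubble; inside the interval it does not.
no_bubble_signal = in_equilibrium_interval(national_ratio, lower_bound, upper_bound)
print(no_bubble_signal)
```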
More recent studies of the Chinese housing market show that the market has started correcting itself. Kruger, Mo and Sawatzky (2016) argue that the housing market finally started adjusting in 2014 (the volume of sales fell 9 per cent from 2013). As the key reasons for the downturn they name an increased stock of unsold inventories (materials for construction), slowing economic growth, alternative investment opportunities and the expected implementation of a property tax. Notably, the paper mainly discusses the relation between housing and metal prices in China. The significance of the housing market for the economy is confirmed by their finding that developments in the housing market accounted for a quarter of the metal price increase in 2002-2010 and for a third of the decrease since 2014. (http://www.bankofcanada.ca/wp-content/uploads/2016/03/sdp2016-7.pdf)
- Dataset
The dataset used in this paper is taken from "Demystifying the Chinese Housing Boom" by Hanming Fang, Quanlin Gu, Wei Xiong and Li-An Zhou. The dependent variable of the panel spatial fixed-effects lag model estimated later in this paper is the Price Index (PI), which was computed in the paper mentioned above. Please refer to that paper for the methodology of the Price Index computation. Generally, the Price Index shows the difference between housing prices in the same city, for presumably the same houses, across time.
The original data set provides housing data for 124 cities that are split into 3 Tiers according to their economic impact and size. For Tiers 1 and 2 there are 140 monthly and 11 annual observations, from 2003 until August 2014. Tier 3 cities in most cases have 123 monthly and 10 annual observations, from 2003 until March 2013. To make the panel balanced, and because many values are missing in the 2013-2014 period, the sample is cut to the 2003-2012 period, with 120 monthly and 10 annual observations.
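The trimming step can be illustrated with a minimal Python sketch. The row layout (dictionaries with city, year and month keys), the helper names and the two placeholder cities are assumptions made for illustration; they are not the structure of the original data set.

```python
from collections import defaultdict


def monthly_rows(city, last_year, last_month, first_year=2003):
    """Build one placeholder row per month from January of first_year to the given end."""
    rows = []
    for year in range(first_year, last_year + 1):
        for month in range(1, 13):
            if (year, month) > (last_year, last_month):
                break
            rows.append({"city": city, "year": year, "month": month})
    return rows


def trim_to_window(rows, first_year=2003, last_year=2012):
    """Keep only the rows whose year falls inside the sample window."""
    return [r for r in rows if first_year <= r["year"] <= last_year]


def is_balanced(rows, periods_expected):
    """A panel is balanced when every city has the same expected number of periods."""
    periods_by_city = defaultdict(set)
    for r in rows:
        periods_by_city[r["city"]].add((r["year"], r["month"]))
    return all(len(p) == periods_expected for p in periods_by_city.values())


# Hypothetical cities: one ending in August 2014 (140 months), one in March 2013 (123 months).
raw = monthly_rows("Tier1City", 2014, 8) + monthly_rows("Tier3City", 2013, 3)
trimmed = trim_to_window(raw)
print(is_balanced(raw, 120), is_balanced(trimmed, 120))
```

Cutting at the end of 2012 leaves 10 years x 12 months = 120 periods for every city, which is what makes the trimmed panel balanced in the time dimension.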
Other variables provided in the dataset are: price index starting in 2009 (PI09), NBS 70-city index (PI70), average price per square meter (avgPrice), index of average price (avgPI), share of land sales in city construction budget (shareLandSales), GRP in urban districts (urbGRP) and its index (urbGRPInd), per capita GDP (pcGDP) and its index (pcGDPInd), per capita disposable income (urbInc) and its index (urbIncInd), average urban salary (urbSalary) and its index (urbSalaryIndex), fixed base CPI (cpiAll) and fixed base urban CPI (cpiUrb). Unless specified otherwise, all indices start with a value of 1 in January 2003. Combinations of the above-mentioned variables serve as the independent variables of the panel spatial fixed-effects lag model.
Several variables were removed because of their large share of missing values (up to 70% of observations): average house price, average house price index, NBS 70-city index, urban income index and per capita GDP index. As the indices are computed from absolute values that are still contained in the data, no information is lost by removing the indices from the data set. Also, 5 out of 124 cities were removed because they had no observations in either PI or the independent variables: Lanzhou, Guiyang, Changji, Taiyuan and Xilingol.
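A screening step of this kind could look roughly as follows in Python. The paper does not state a numeric cutoff, so the 0.5 threshold, the function names and the toy data are all hypothetical and serve only to show the logic.

```python
def missing_share(values):
    """Fraction of entries in a series that are missing (None)."""
    return sum(v is None for v in values) / len(values)


def drop_sparse_variables(columns, max_missing_share):
    """Keep only the variables whose missing share does not exceed the cutoff."""
    return {name: values for name, values in columns.items()
            if missing_share(values) <= max_missing_share}


def cities_without_data(panel):
    """Cities in which at least one variable has no observed value at all."""
    return sorted(city for city, columns in panel.items()
                  if any(all(v is None for v in values) for values in columns.values()))


# Toy data, invented for illustration only.
columns = {"PI": [1.0, 1.1, None, 1.3], "avgPrice": [None, None, None, 4200.0]}
kept = drop_sparse_variables(columns, max_missing_share=0.5)  # hypothetical cutoff

panel = {
    "CityA": {"PI": [1.0, 1.1], "urbSalary": [None, 2500.0]},
    "CityB": {"PI": [None, None], "urbSalary": [2400.0, 2450.0]},
}
to_remove = cities_without_data(panel)
print(sorted(kept), to_remove)
```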
Also, the city name “Wuludao” was changed to “Huludao”. A Google search finds no city named “Wuludao” in China. However, there is a city named “Huludao”, which is in Liaoning province (the province given for “Wuludao” in the data set) and had an almost identical population in the early 2000s (900’000 vs 920’000). Presumably, the authors of the above-mentioned paper made a typo, which did not have any influence on their results. However, in the spatial panel fixed-effects lag model the faulty name would have affected the computation of the weight matrix, as the coordinates given by Google for “Wuludao” differ from those for “Huludao” (more on weight matrices in the next chapter).
After the changes mentioned above, a large share of values was still missing, e.g. the urban salary and urban salary index variables still had 1589 observations missing in the panel (11.1%). Interpolation and extrapolation were therefore performed, applying the following rules:
Interpolation. If observation i is missing, it is replaced with the arithmetic average of observations i-1 and i+1. If more than one consecutive value is missing, the approx function in R is used, which performs the same linear interpolation across multiple consecutive missing values.
Extrapolation. If just one value is missing at the beginning or the end of the period, linear extrapolation is used for that value. If more than one consecutive value is missing at the beginning (end) of the period, the missing values are substituted with the first (last) available value. This rule was applied in the belief that naïve extrapolation using the first (last) available value is more accurate than linear extrapolation.
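The two rules can be sketched in plain Python as follows. This is not the code used for the paper (which relied on R's approx function for the interpolation part); fill_series is a hypothetical helper, and the fallback for series with fewer than two observed values is an assumption added here, since the rules do not cover that case.

```python
def fill_series(values):
    """Fill None gaps in a list of numbers using the rules stated above."""
    filled = list(values)
    observed = [i for i, v in enumerate(filled) if v is not None]
    if not observed:
        return filled  # nothing to work with; leave the series untouched
    if len(observed) == 1:
        # Assumption (not in the paper): with a single observed value, carry it everywhere.
        return [filled[observed[0]]] * len(filled)

    first, last, n = observed[0], observed[-1], len(filled)

    # Interior gaps: linear interpolation between the nearest observed neighbours.
    # For a single missing value this reduces to the average of i-1 and i+1.
    for left, right in zip(observed, observed[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)

    # Leading gap: one missing value is extrapolated linearly, longer gaps carry the first value.
    if first == 1:
        filled[0] = 2 * filled[1] - filled[2]
    elif first > 1:
        for i in range(first):
            filled[i] = filled[first]

    # Trailing gap: the same rules, mirrored.
    if last == n - 2:
        filled[n - 1] = 2 * filled[n - 2] - filled[n - 3]
    elif last < n - 2:
        for i in range(last + 1, n):
            filled[i] = filled[last]

    return filled


example = fill_series([None, 2.0, 3.0, None, 5.0, None, None])
print(example)
```

In the example, the single leading gap is extrapolated linearly, the interior gap is the average of its neighbours, and the two trailing gaps repeat the last available value.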
1,5.1 Theoretical background
Spatial models are expanded versions of traditional regression models that account for the spatial correlation observed in the data set. In other words, spatial models are able to capture the linkages between observations in particular areas or cities and the distances between them. The following section is based on the works of Elhorst (2010) and Millo and Piras (2012).
1,5.1.1. Panel Models
Panel data is a data set containing cross-sectional observations across time. Generally, panel data is preferred to cross-sectional or time series data because of its two-dimensionality. Spatial panel data models are used to capture spatial interactions between places in a panel data setting.
Elhorst (2010) names one way of classifying panel data models: into fixed effects, random effects, fixed coefficients and random coefficients models. As only the (spatial) fixed effects model is used in this paper to explain Chinese housing prices, only the differences between the fixed effects and random effects models are briefly discussed.
...