Macroeconomics and FinTech: Uncovering Latent Macroeconomic Effects on Peer-to-Peer Lending

10/31/2017 ∙ by Jessica Foo, et al. ∙ The University of Chicago

Peer-to-peer (P2P) lending is a fast-growing financial technology (FinTech) trend that is displacing traditional retail banking. Studies on P2P lending have focused on predicting individual interest rates or default probabilities. However, the relationship between aggregated P2P interest rates and the general economy will be of interest to investors and borrowers as the P2P credit market matures. We show that the variation in P2P interest rates across grade types is determined by three macroeconomic latent factors formed by Canonical Correlation Analysis (CCA): macro default, investor uncertainty, and the fundamental value of the market. However, the variation in P2P interest rates across term types cannot be explained by the general economy.




1 Introduction

Online marketplace lending, also known as peer-to-peer (P2P) lending, directly connects borrowers with lenders on an online platform, bypassing traditional financial intermediaries such as banks. Borrowers apply for loans, specifying loan details which are then listed on the platform. Prospective lenders can then access the platform and view listings, choosing to fund the loan fully or partially. The interest rate that corresponds to the individual loan is determined by the platform, which collects personal data on the individual to determine credit worthiness, before assigning a loan grade to the borrower. Loans with a higher credit risk will thus entail a higher interest rate, due to the higher possibility of late repayment or default. While traditional banks rely largely on FICO credit scores to determine loan amounts and rates, P2P platforms use complex algorithms to analyze personal data ranging from social media activity to platform use to determine credit worthiness. Furthermore, these algorithms adjust interest rates based on the demand and supply of P2P loans, functioning like a price mechanism.

P2P lending is part of the rapidly growing financial technology (FinTech) industry that is transforming and disrupting traditional financial services. Besides eliminating inefficiencies and overhead costs of transactions between borrowers and lenders, P2P lending allows borrowers with low FICO scores to obtain loans that were previously inaccessible to them. P2P lending is also advantageous for lenders, offering the possibility of higher yields for investors who wish to diversify their portfolio. Given these advantages, Transparency Market Research anticipates the global P2P market to expand at a compounded annual growth rate (CAGR) of 48.2% between 2016 and 2024. In the U.S., P2P lending has grown an average of 84% per quarter from 2007 to 2014 (Peer-to-Peer Lending is Poised to Grow. Federal Reserve Bank of Cleveland. August 14, 2014). However, while banks, credit card companies and other traditional lending companies generate more than $870 billion annually from fees and interest, P2P’s 2014 lending revenue constitutes only about 2% of that amount (A Trillion Dollar Market By the People, For the People. Foundation Capital. May 6, 2014), with Prosper and Lending Club capturing 98% of the market (Banking Without Banks. The Economist. February 28, 2014). Nevertheless, P2P lending is projected to service 10% to 15% of consumer debt by 2025 (Peer Pressure: How Peer-to-Peer Lending Platforms are Transforming the Consumer Lending Industry. PwC Consumer Finance Publications. February, 2015).

Given the potential of P2P lending to disrupt retail banking, this paper aims to analyze the relationship between P2P lending and the general economy. Economic theory suggests that credit markets are correlated with macroeconomic activity as they are used to channel an economy’s savings into other more productive uses. Analyzing the interest rates of different loans can thus reveal investors’ expectations of the economy, which are correlated with economic variables like the unemployment rate and inflation. The corporate credit spread, defined as the additional yield investors receive for investing in a corporate bond instead of a government bond of similar term (loan period), has been found to be correlated with the economy. In this paper, we attempt to analyze the P2P credit spread, which we find analogous to the corporate credit spread, since both involve differing maturity terms and loan grade types, and adopt the methodologies used by Ahn et al. (2012) in analyzing corporate credit spreads.

Since macroeconomic variables are proxies of the economy, we construct latent factors from canonical correlation analysis (CCA) of P2P credit spreads and observed explanatory macroeconomic variables, and investigate the extent to which the factors capture the systematic variation of P2P credit spreads. We base our initial selection of macroeconomic variables on economic theory and intuition, and rely on them to interpret the factors.

We show that the common variation in P2P credit spreads is mainly explained by three factors. The first factor is a macro non-default factor, which is strongly correlated with indicators of economic expansion such as increased inflation, reduced unemployment and reduced household debt. Furthermore, an increase in the first factor is associated with an increase in P2P credit spreads of all term and grade types in a CCA factor regression. We identify and interpret the second factor as a latent market uncertainty factor, which is positively correlated with the risk-free slope and equity volatility. An increase in the second factor is associated with a decrease in P2P credit spreads across grade types, but is not significantly associated with P2P credit spreads across term types. Our third factor can be interpreted as the fundamental value of the market, although canonical cross-loadings for the third factor are weak.

We also find that unlike corporate bonds or government bonds, P2P loans have little term structure that can be explained by macroeconomic conditions. One limitation is that P2P loans are currently only offered on two terms — 36 months and 60 months, making it difficult to observe interest rate variation across term types and the associated risk with longer term loans. However, across grade types, we find that macroeconomic factors do adequately explain P2P credit spread variation.

Our paper proceeds as follows: Section 2 describes existing literature related to P2P lending and credit markets. Section 3 outlines our methodology. Section 4 describes our data set and selection of macroeconomic proxy variables. Section 5 shows our results using multiple regressions, canonical correlation factors and our economic interpretations. We conclude in Section 6, and the Appendix contains Tables and Figures.

2 Literature Review

Academic research on P2P lending has focused primarily on two aspects — the determinants of funding success, and of default. The predictors studied are detailed borrower-specific variables that include personal information, credit history, financial ability and social information. To predict funding success or default, studies have focused on classification models, which either classify loans into loan sub-grades ranging from A to H (with A being the most credit worthy) or into binary predictions of default.

Klafft (2008) found that funding success on Prosper is significantly influenced by verified bank account information and credit rating, while interest rates are primarily influenced by credit rating and debt-to-income ratios. Similarly, using a multivariate logistic regression model, Herzenstein et al. (2008) found that funding success on Prosper depends more on borrowers’ financial strength, such as credit grade and debt-to-income ratio, and their efforts to obtain the loan (providing detailed information or joining platform communities) than on demographic features such as race. Zhang et al. (2016) used a decision tree to conclude that loan information, social media information and credit information are most crucial in determining default risk on China’s leading P2P platform, PPDai. In addition to loan-specific information and borrower-specific information, Dietrich and Wernli (2016) included macroeconomic variables in an ordinary least squares regression and found that higher unemployment rates and three-year government bond yield rates were correlated with higher P2P lending rates in Switzerland. In essence, most research related to P2P lending has used individual data to predict individual loan interest rates.

On the other hand, traditional credit markets have been frequently analyzed using aggregate loan data, and have also relied primarily on factor models. Litterman and Scheinkman (1991) first used principal component analysis on the yields of U.S. government bonds, and found that most of the variation in returns can be explained by the first three eigenvectors, also known as yield curve factors in empirical finance, which they called level, steepness and curvature. Yield changes caused by the level factor are constant across maturities, resulting in a parallel shift in the yield curve, while the steepness factor raises the yields of treasuries with longer maturities, compared to those with shorter maturities. While no meaningful interpretation was provided for the third factor, they found that changes in the third factor are correlated with changes in interest rate volatility.

Ang and Piazzesi (2003) then analyzed the latent factors and macroeconomic variables jointly in a vector autoregression, and found that forecasting performance was better for models that included additional macro factors than for models with only latent variables. Exploring the relation between latent factors and macro variables, Ahn et al. (2012) used canonical correlations between corporate credit spreads and macroeconomic variables to form factors that explain variation in the term structure of corporate credit spreads. They found three common factors that accounted for 40% of the total variation, with the first related to the contemporaneous state of the economy, the second to expectations of future economic conditions, and the third to error correction in short-term spreads.

3 Methodology

3.1 Basic Model

We explore the theoretical model in Ahn et al. (2012) that takes into account two possible scenarios: first, that there are missing non-macroeconomic factors not captured by macroeconomic indicators, and second, that the macroeconomic indicators may not be perfect proxies for the true macroeconomic factors, resulting in “systematic measurement errors” that can be found in the residuals. To demonstrate this, we first assume that each $N$-dimensional credit spread variable $y_t$ follows a linear $K$-factor model given by

$$y_t = \alpha + B_1 f_{1t} + B_2 f_{2t} + \varepsilon_t \qquad (1)$$

where $\alpha$ is an $N \times 1$ vector of intercepts, $\varepsilon_t$ is an $N \times 1$ vector of zero-mean idiosyncratic components, $B_1$ and $B_2$ are $N \times K_1$ and $N \times K_2$ factor loading matrices, and $f_{1t}$ and $f_{2t}$ are $K_1 \times 1$ and $K_2 \times 1$ vectors of common factors. We assume that systematic variation in the dependent variables can all be captured by the $K = K_1 + K_2$ common factors. The factors $f_{1t}$ are predicted by a theoretical model of interest, which in our case relates the P2P credit spreads with the wider economy. The factors $f_{2t}$, however, are not predicted by the macro-economy. Both factors $f_{1t}$ and $f_{2t}$ are unobserved, but we observe proxy macroeconomic variables that are correlated with the factors $f_{1t}$, and will refer to these proxy variables as $x_t$. We assume further that the factors in $f_{2t}$ are uncorrelated with the proxy variables $x_t$. Our formal assumptions are thus:

  1. $E[\varepsilon_t \mid f_{1t}, f_{2t}, x_t] = 0$. Hence all common factors and proxy variables are uncorrelated with the idiosyncratic error terms in $\varepsilon_t$.

  2. $\mathrm{rank}(B_1, B_2) = K$; Hence the model is fully specified with $K$ factors.

  3. The linear projections of $f_{1t}$ and $f_{2t}$ on $x_t$ (with intercepts absorbed into $\alpha$) are

$$f_{1t} = \Gamma x_t + v_t, \qquad f_{2t} = 0 \cdot x_t + f_{2t} \qquad (2)$$

    where $v_t$ is the vector of projection errors. Hence only the factors $f_{1t}$ are correlated with the macroeconomic proxies $x_t$.

  4. $\mathrm{Var}(\varepsilon_t)$ is a diagonal matrix. Therefore, conditional on the factors, there is no cross-sectional correlation in $\varepsilon_t$.

Therefore, substituting the linear projections in equation (2) into equation (1) with some algebraic manipulation, we get

$$y_t = \alpha + B_1 \Gamma x_t + u_t, \qquad u_t = B_1 v_t + B_2 f_{2t} + \varepsilon_t \qquad (3)$$

From (3), it is evident that the residuals, $u_t$, can contain two sources of systematic variation. The first is due to missing factors $f_{2t}$ that are not correlated with the proxies $x_t$. The second is due to incomplete specification of $x_t$, resulting in imperfect correlation between $f_{1t}$ and $x_t$, which implies that $v_t$, the projection errors, are not 0. Therefore, if we regress $y_t$ on $x_t$ with ordinary least squares, the residuals will contain factor structure that can be uncovered with principal components, in both scenarios. However, if we adopt CCA factors as in Ahn et al. (2012), we can determine whether there are missing factors unaccounted for by macroeconomic variables.

3.2 CCA Factor Model

Canonical correlation analysis (CCA) allows us to analyze linear relationships between two sets of random variables, by finding the optimal canonical variates — the linear combinations of the variables within each set, which maximize the correlation between them.

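Given $p$ variables in one set, and $q$ variables in another set, let $Y = (Y_1, \dots, Y_p)'$ be the first set and let $X = (X_1, \dots, X_q)'$ be the second set. Then, consider the linear combinations $U = a'Y$ and $V = b'X$ for all $a$ and $b$, where $a \in \mathbb{R}^p$ and $b \in \mathbb{R}^q$. Define

$$(a_1, b_1) = \arg\max_{a,\, b} \ \mathrm{Corr}(a'Y, b'X)$$

where $\mathrm{Var}(a'Y) = \mathrm{Var}(b'X) = 1$. Then the first pair of canonical variates is $(U_1, V_1) = (a_1'Y, b_1'X)$. In general, the $k$th pair of canonical variates has the additional constraint that the $k$th canonical variates have to be uncorrelated with the preceding canonical variates. Therefore, the $k$th pair of canonical variates is defined as:

$$(a_k, b_k) = \arg\max_{a,\, b} \ \mathrm{Corr}(a'Y, b'X) \quad \text{subject to} \quad \mathrm{Var}(a'Y) = \mathrm{Var}(b'X) = 1, \quad \mathrm{Corr}(a'Y, a_j'Y) = \mathrm{Corr}(b'X, b_j'X) = 0 \ \text{ for all } j < k$$

Assuming that $Y$ represents the set of P2P yields, and $X$ the set of macroeconomic variables, we then use the first $K$ canonical variates as CCA factors, with $K$ being the number of factors chosen.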
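The definition above can be made concrete with a small numerical sketch. The code below is an illustration only, not the estimation code used in this paper: it assumes simulated data, and the function name `cca` and all sizes are hypothetical. It whitens each set with a Cholesky factor of its sample covariance and takes a singular value decomposition of the whitened cross-covariance, which is one standard way to obtain canonical correlations and coefficient vectors.

```python
import numpy as np

def cca(Y, X):
    """Canonical correlations and coefficient matrices for two data sets.

    Y is (n x p), X is (n x q); rows are observations. Returns (rho, A, B),
    where column k of A and of B holds the coefficients a_k and b_k.
    """
    n = Y.shape[0]
    Yc = Y - Y.mean(axis=0)
    Xc = X - X.mean(axis=0)
    Syy = Yc.T @ Yc / (n - 1)
    Sxx = Xc.T @ Xc / (n - 1)
    Syx = Yc.T @ Xc / (n - 1)
    Ly = np.linalg.cholesky(Syy)          # Syy = Ly Ly'
    Lx = np.linalg.cholesky(Sxx)          # Sxx = Lx Lx'
    # Whitened cross-covariance: Ly^{-1} Syx Lx^{-T}
    M = np.linalg.solve(Ly, Syx) @ np.linalg.inv(Lx).T
    U, rho, Vt = np.linalg.svd(M, full_matrices=False)
    A = np.linalg.solve(Ly.T, U)          # a_k = Ly^{-T} u_k, so Var(a_k'Y) = 1
    B = np.linalg.solve(Lx.T, Vt.T)       # b_k = Lx^{-T} v_k, so Var(b_k'X) = 1
    return rho, A, B

# Simulated example: two sets of variables driven by one shared factor.
rng = np.random.default_rng(1)
T = 400
common = rng.normal(size=(T, 1))
Y = common @ rng.normal(size=(1, 4)) + rng.normal(size=(T, 4))
X = common @ rng.normal(size=(1, 3)) + rng.normal(size=(T, 3))

rho, A, B = cca(Y, X)
U1 = (Y - Y.mean(axis=0)) @ A[:, 0]       # first canonical variate of Y
V1 = (X - X.mean(axis=0)) @ B[:, 0]       # first canonical variate of X
print("canonical correlations:", np.round(rho, 3))
```

The singular values come out in decreasing order, so the first pair of variates is the most strongly correlated one, matching the sequential definition.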

Ahn et al. (2012) have shown that using $\hat{f}_t = \hat{\Psi}' x_t$ as estimated factors, where the columns of $\hat{\Psi}$ are the estimated canonical coefficient vectors of the proxy variables $x_t$, has two main advantages. Firstly, the factor loadings based on the CCA factors are the same as the factor loadings of the true macroeconomic factors $f_{1t}$ up to a linear transformation. This is because the columns of $\Psi$ form a basis spanning the columns of the projection coefficient matrix $\Gamma'$ in (3), as shown by Gourieroux et al. (1993). That is, there exists a nonsingular matrix $H$ such that $\Psi = \Gamma' H$. Therefore, if we let $f_t^* = \Psi' x_t$, and let $B^*$ be the parameters of the linear projection of $y_t$ on $f_t^*$ and $u_t^*$ be the resulting projection error such that

$$y_t = \alpha + B^* f_t^* + u_t^*$$

then it implies that the columns of $B^*$ are spanned by the columns of $B_1$, that is, $B^* = B_1 (H')^{-1}$. Therefore, we can perform linear regression on the CCA factors, which span the same space as the true factors:

$$y_t = \hat{\alpha} + \hat{B}^* \hat{f}_t + \hat{u}_t$$

Secondly, the projection errors from the regressions with the CCA factors will not contain the systematic measurement errors $v_t$, and are linear functions of the idiosyncratic errors $\varepsilon_t$ and the missing factors $f_{2t}$ only. Even so, the projection errors may still be mutually correlated, causing the first principal component of the regression residuals to be significant, so a significant component does not by itself imply that missing factors exist. Ahn et al. (2012) argue that if the first principal component of the residuals only has moderate explanatory power for all response variables or strong explanatory power for a few response variables, then there is no missing factor $f_{2t}$. However, if the first principal component of the residuals has strong explanatory power for all response variables, then there exists a missing factor not captured by the proxies.

After constructing the CCA factors, we then adopt methodologies from Hair et al. (1998), using standard statistical tests such as Wilks’ Lambda and the Redundancy Index to estimate the number of factors, $K$.
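As a rough sketch of how the Wilks’ Lambda sequence is formed from a list of canonical correlations, the code below computes $\Lambda_k = \prod_{i \ge k} (1 - \rho_i^2)$ together with Bartlett’s chi-square approximation. The choice of Bartlett’s approximation is an assumption made for illustration (statistical packages often report an F approximation instead), and the function names and input values are hypothetical rather than taken from our tables.

```python
import math

def wilks_lambda(rhos):
    """Lambda_k = prod_{i >= k} (1 - rho_i^2), for k = 1, ..., len(rhos)."""
    lambdas = []
    running = 1.0
    for r in reversed(rhos):
        running *= 1.0 - r * r
        lambdas.append(running)
    return lambdas[::-1]

def bartlett_chi2(rhos, n, p, q):
    """Bartlett's approximation for each sequential test.

    Returns a list of (statistic, degrees of freedom) pairs; entry k tests
    the null that the k-th and all later canonical correlations are zero.
    n is the sample size, p and q the number of variables in each set.
    """
    results = []
    for k, lam in enumerate(wilks_lambda(rhos), start=1):
        stat = -(n - 1 - (p + q + 1) / 2) * math.log(lam)
        df = (p - k + 1) * (q - k + 1)
        results.append((stat, df))
    return results

# Hypothetical canonical correlations for sets with p = 3 and q = 4 variables.
example_rhos = [0.9, 0.5, 0.1]
lams = wilks_lambda(example_rhos)
tests = bartlett_chi2(example_rhos, n=100, p=3, q=4)
for k, (lam, (stat, df)) in enumerate(zip(lams, tests), start=1):
    print(f"k={k}: Lambda={lam:.4f}, chi2={stat:.2f}, df={df}")
```

Each statistic would then be compared with a chi-square critical value at the stated degrees of freedom; the number of factors is the largest $k$ for which the null is still rejected.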

4 Data

4.1 P2P Credit Spreads

Our initial data set consists of 887,440 observations of individual loan data from Lending Club, starting from 1 June, 2007 to 1 December, 2015. Each observation includes features such as the interest rate at which the loan was made, the loan grade and the loan term. We then aggregate individual observations by averaging the interest rates across two cross sections — loan term and grade, and across monthly periods. There are two loan terms available for P2P loans — 36 months and 60 months, and 6 grades assigned to each loan — from A to F.

For 36-month loans, we have 103 monthly aggregated interest rate observations for each grade type, from 1 June, 2007 to 1 December, 2015. However, since 60-month loans were only offered later on, we have 67 aggregated interest rate observations for each grade type, from 1 May, 2010 to 1 December, 2015. Table 1 in the Appendix presents summary statistics for the aggregated interest rates. We then construct the levels of monthly credit spreads by subtracting the yield to maturity of a government bond of the same maturity term (36 months and 60 months respectively) from the P2P interest rate, at each point in time.
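The aggregation and spread construction described above can be sketched as follows. The records, yields and the function name `monthly_spreads` are all hypothetical and chosen only to make the arithmetic easy to follow; they are not drawn from the Lending Club data.

```python
from collections import defaultdict

# Hypothetical loan records: (issue month, term in months, grade, interest rate in %)
loans = [
    ("2015-01", 36, "A", 7.0),
    ("2015-01", 36, "A", 8.0),
    ("2015-01", 60, "B", 11.5),
    ("2015-02", 36, "A", 7.4),
]

# Hypothetical government bond yields (%) keyed by (month, maturity in months)
treasury = {
    ("2015-01", 36): 0.9,
    ("2015-01", 60): 1.4,
    ("2015-02", 36): 1.0,
}

def monthly_spreads(loans, treasury):
    """Average rates per (month, term, grade), then subtract the yield on a
    government bond with the same maturity in the same month."""
    buckets = defaultdict(list)
    for month, term, grade, rate in loans:
        buckets[(month, term, grade)].append(rate)
    return {
        key: sum(rates) / len(rates) - treasury[(key[0], key[1])]
        for key, rates in buckets.items()
    }

spreads = monthly_spreads(loans, treasury)
for key in sorted(spreads):
    print(key, round(spreads[key], 2))
```

For instance, the two grade-A 36-month loans issued in the first month average 7.5%, so their spread over the 0.9% yield is 6.6 percentage points.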

Since many empirical studies have found non-stationarity in the levels of credit spreads, we test for stationarity using the Augmented Dickey-Fuller (ADF) test and present our findings in Table 2. We find that we cannot reject the null hypothesis of a unit root present in an autoregressive model for any of the credit spread levels. Hence, we treat the levels of credit spreads as non-stationary. Therefore, we use first differences in the levels of credit spreads, which represent the change in the levels of the credit spreads from one period to the next. We present summary statistics of the differenced series in Table 3, and ADF test results for the differenced series in Table 4. Since none of the $p$-values are above 0.05, we reject the unit-root null and treat all the differenced series as stationary.
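To illustrate the logic of the unit-root test, the sketch below computes a simplified Dickey-Fuller statistic (constant included, no lagged differences, so it is not the full augmented test reported in our tables) on a simulated random walk and on its first difference. The series is simulated and the function name `df_tstat` is hypothetical. Under the constant-only specification the asymptotic 5% critical value is approximately -2.86, and the unit-root null is rejected when the statistic falls below it.

```python
import numpy as np

def df_tstat(series):
    """t-statistic on gamma in  diff(y)_t = c + gamma * y_{t-1} + e_t."""
    y = np.asarray(series, dtype=float)
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])
    coef, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ coef
    sigma2 = resid @ resid / (len(dy) - X.shape[1])   # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)             # OLS coefficient covariance
    return coef[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(2)
levels = np.cumsum(rng.normal(size=300))   # simulated random walk standing in for spread levels

stat_levels = df_tstat(levels)
stat_diffs = df_tstat(np.diff(levels))
print(f"levels:      {stat_levels:.2f}")
print(f"differences: {stat_diffs:.2f}")
```

The differenced series is white noise by construction, so its statistic is strongly negative, which mirrors the move from levels to first differences in this section.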

Since we will be using the differenced series, it is necessary to check for cointegration, as cointegrated credit spreads would require an error correction factor model, as in Tu et al. (2015). A set of time series variables that are each integrated of order 1 (stationary after first differencing) is cointegrated if some linear combination of them is integrated of order 0. Intuitively, economic time series tend to be cointegrated as they are dominated by smooth, long-run equilibrium trends. In our case, there is a possibility of a long-run relation in the levels of P2P credit spreads of different grade quality, since such a relation has been found in corporate credit spreads.

In order to test for this, we use the Johansen test, which allows for more than one cointegrating relationship among series integrated of order 1. In particular, we use the Johansen trace test, which tests the null hypothesis that the number of cointegrating vectors (linear combinations) is at most $r$ versus the alternative that it is greater than $r$. We present our findings in Table 5, and conclude that none of the test statistics is large enough to reject the null hypothesis, so we find no statistically significant cointegrating relation. Therefore, we do not include any error-correction mechanisms in our regressions.

4.2 Macroeconomic Variables

We use a set of macroeconomic variables that we posit may be important in explaining systematic variation in P2P credit spreads, and present their summary statistics in Table 6.

General economy.

Since little research exists on the correlation between aggregated P2P loan rates and the general economy, we rely mostly on economic intuition to choose economic variables that reflect the health of the economy and may affect P2P loan rates. Given that Dietrich and Wernli (2016) found that higher unemployment rates were correlated with higher individual P2P lending rates, we include the U.S. seasonally adjusted unemployment rate (UNRATE). However, since the unemployment rate is a quarterly economic indicator, we linearly interpolate between the quarterly observations to obtain monthly observations. Although Okun (1963) observed that the unemployment rate is empirically strongly correlated with gross domestic product (GDP), we choose to include GDP since the relationship between the unemployment rate and GDP may have been weakened, especially given the evidence of stagflation. We also choose to include the consumer price index (CPI) and household debt service payments as a percent of disposable personal income (Debt) in our analysis. A higher unemployment rate may increase the likelihood of lower grade loan default as jobs and income are lost; a higher inflation rate reduces real household income and may increase the likelihood of default; a higher proportion of debt repayments for standard loans like housing loans would mean less income for other non-standard loan repayments, like P2P loans, resulting in a higher likelihood of P2P loan default.
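The linear interpolation from quarterly to monthly frequency mentioned above amounts to the following. The values are hypothetical and serve only to show the mechanics; `numpy.interp` performs piecewise-linear interpolation between the known points.

```python
import numpy as np

# Month index at which each quarterly observation falls (hypothetical example)
quarter_months = np.array([0, 3, 6])
# Hypothetical quarterly readings of an indicator, in percent
quarter_values = np.array([5.0, 5.6, 5.3])

# Monthly grid covering the same span
months = np.arange(0, 7)
monthly = np.interp(months, quarter_months, quarter_values)

print(np.round(monthly, 2))
```

The quarterly observations are reproduced exactly at months 0, 3 and 6, and the months in between fall on the straight line joining neighbouring quarters.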

Government bond market.

P2P loans can be seen as credit investments similar to corporate bonds, except with a higher rate of default, since corporations are more likely to repay loans due to regulations. Given their similarity, P2P credit spreads are likely to be correlated with the risk-free government bond interest rates, as Duffee (1998) found in corporate credit spreads and Dietrich and Wernli (2016) found in individual P2P loans. The risk-free government bond interest rates form the Treasury yield curve, a cross-section of Treasury yields across different maturities at a single point in time, which has been found to contain three factors: level, slope and curvature. To account for the level factor, we use the first-differenced 10-year Treasury rate (Δ10Y-rf). We also include the slope factor, as Litterman and Scheinkman (1991) have documented, by taking the difference between the 10-year Treasury rate and the 1-year Treasury bill rate (rf-slope). The curvature factor has little meaningful interpretation at present; therefore, we do not include it in our analysis. Our Treasury yields data and general economic variables data are obtained from the Federal Reserve Bank of St. Louis’ FRED database.

Equity market.

P2P loans may be affected by the equity market as the rates of return on P2P loans are comparable to those on equities, resulting in substitutability between P2P loans and equities as investments, especially in periods of low interest rates. We use observations from the S&P 500 Index (SPX) compiled by Bloomberg to represent the general performance of the equity market. We also include the first-differenced VIX index (ΔVIX), which tracks changes in the market’s expectations of 30-day equity volatility. A large change in the VIX index implies investor uncertainty, which can reduce investor demand for P2P loans, resulting in higher P2P interest rates. We are also interested in analyzing whether P2P credit spreads may be related to the Fama and French (1993) three factors in stock returns. In particular, the equity premium from the small-minus-big (SMB) factor or the high-minus-low (HML) factor may be correlated with P2P credit spreads. If the equity premiums on the HML and SMB factors are low, investors may look for other sources of return, turning to P2P loans and reducing P2P interest rates. We obtain both factor returns from French’s publicly available data library website.

5 Results

5.1 Multiple Regressions

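We first analyze our data using multiple regressions to establish a benchmark case. The specification for our full regression model, with the first-differenced P2P credit spread $\Delta CS_t$ as the dependent variable, is given by

$$\Delta CS_t = \beta_0 + \beta_1 \text{UNRATE}_t + \beta_2 \text{GDP}_t + \beta_3 \text{CPI}_t + \beta_4 \text{Debt}_t + \beta_5 \Delta\text{10Y-rf}_t + \beta_6 \text{rf-slope}_t + \beta_7 \text{SPX}_t + \beta_8 \Delta\text{VIX}_t + \beta_9 \text{SMB}_t + \beta_{10} \text{HML}_t + \varepsilon_t \qquad (6)$$
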
We perform our regressions over two panels — one with the loan grade cross-section, and the other with the loan term cross-section. For example, for the grade-A loan regressions, we combine data of both 36-month grade-A loan observations and 60-month grade-A loan observations; for the 36-month loan regressions, we combine data of 36-month loan observations across all the grades. We present our results in Table 8 and Table 9 of the Appendix.

Similar to corporate credit spreads, the results show that changes in the risk-free rate are statistically and economically significant in explaining changes in P2P credit spreads across all grade types and terms. The magnitudes of the coefficients are also similar for all the regressions, and are negative, comparable to the coefficients found in corporate credit spreads in Ahn et al. (2012). This implies that an increase in the 10-year Treasury yield is associated with a decrease in P2P credit spreads across all grade and term types.

Focusing first on Panel A, we find that the unemployment rate is a significant and negative predictor for higher grade credit spreads. Although the equity index is also a statistically significant predictor for higher grade credit spreads, the coefficient is close to zero, and thus not economically meaningful. We also find that HML and SMB are statistically significant for higher grade credit spreads, with the HML factor coefficient being positive and the SMB coefficient being negative.

For Panel B, unemployment rate remains a significant and negative predictor for both term types. Likewise, the SPX predictor, while significant, is not economically meaningful. The HML and SMB factor predictors are statistically significant for both term types, and also in the same sign direction as our findings in Panel A. Interestingly, the change in VIX is a significant and positive predictor for shorter term loans.

Since the predictors in the fully specified model are likely to be correlated with one another, as they proxy for common underlying factors, we may face problems of multicollinearity resulting in large standard errors and thus conservative hypothesis testing. Therefore, we use AIC stepwise forward-backward regression for variable selection, shown in Tables 10 and 11.

Our conclusions from the AIC stepwise regressions are not materially different from what we found with the full specification regressions. However, we note that for lower grade credit spreads, CPI is chosen as a significant predictor instead of the unemployment rate. This is not surprising given the high correlation between CPI and unemployment rate, as documented in Table 7.

Next, we examine the estimated residuals using principal components to ascertain the existence of a strong common component in the residuals. We follow Ahn et al. (2012) and extract the first principal component from the residuals, before adding the extracted series into the OLS regression. We present our results in Tables 12 and 13 in the Appendix. We see that, similar to corporate credit spreads, the adjusted $R^2$ increases sharply after adding in the residual principal component. The average increase in adjusted $R^2$ for grade type regressions is 0.440 while the average increase for term type regressions is 0.536.
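The residual principal component step can be sketched on simulated data as follows. Everything here is an illustrative assumption: the panel is generated from one observed regressor and one hidden common factor, the sizes and loadings are arbitrary, and `adjusted_r2` is a hypothetical helper rather than code used for our tables. The point is only to show why adding the first residual principal component raises the adjusted $R^2$ when the residuals share a common component.

```python
import numpy as np

def adjusted_r2(y, X):
    """Adjusted R^2 of an OLS fit of y on X; X must contain a constant column."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k)

rng = np.random.default_rng(3)
T, N = 200, 6                                  # periods and number of series (hypothetical)
x = rng.normal(size=T)                         # observed macro regressor
hidden = rng.normal(size=T)                    # common factor missing from the regression
Y = (np.outer(x, np.full(N, 0.5))
     + np.outer(hidden, np.linspace(0.8, 1.2, N))
     + 0.3 * rng.normal(size=(T, N)))

# Step 1: benchmark OLS of each series on a constant and x
X0 = np.column_stack([np.ones(T), x])
beta, *_ = np.linalg.lstsq(X0, Y, rcond=None)
resid = Y - X0 @ beta

# Step 2: first principal component of the residuals (largest eigenvalue is last)
_, eigvecs = np.linalg.eigh(np.cov(resid, rowvar=False))
pc1 = resid @ eigvecs[:, -1]

# Step 3: re-estimate with the extracted component as an extra regressor
X1 = np.column_stack([X0, pc1])
before = np.mean([adjusted_r2(Y[:, j], X0) for j in range(N)])
after = np.mean([adjusted_r2(Y[:, j], X1) for j in range(N)])
print(f"mean adjusted R^2 before: {before:.3f}, after: {after:.3f}")
```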

5.2 Canonical Correlation Analysis

Following the CCA technique we laid out in Section 3.2, we then let the first set of variables be the set of all P2P credit spreads across grade types and loan terms. Therefore, we have twelve variables in the first set. Then, we let the second set of variables be the ten macroeconomic proxy variables found in (6). We first report the canonical correlation coefficients and the eigenvalues of the canonical roots in Table 14, following the same analysis procedure in Hair et al. (1998). A quick analysis of the eigenvalues would suggest that the optimal number of factors would be five, as the eigenvalues from the fifth onward seem to level off.

To test the extent of correlation between the two sets of variables, we use Wilks’ lambda in Table 15, which tests sequential hypotheses that the $k$th canonical correlation and all that follow it are zero. The results show that at the 5% significance level, we can reject the null hypothesis for $k \le 6$.

This would suggest that having six factors would be optimal in capturing systematic variation in both sets of variables. However, while the squared canonical correlations (roots) give an estimate of the shared variance between the canonical variates, i.e., the variance shared by linear combinations of the two sets of variables, they do not capture the proportion of variance of the respective sets of variables. The redundancy index provides a measure of the extent to which a set of variables (taken as a set) explains variation in the other set of variables (taken one at a time), by computing the proportion of variances of the individual variables in each set which are accounted for by all the variables in the other set through the canonical variates. We report the redundancies for P2P credit spread variables in Table 16.

The redundancy analysis suggests that using the first three canonical variates is sufficient, since all the other canonical variates have very low redundancy indices. We note that the number of canonical variates found by the redundancy index is the same as the number of canonical variates found by Ahn et al. (2012) by GMM estimation for corporate spreads.
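A minimal sketch of the redundancy computation for a single canonical function is given below, using the usual definition: the average squared correlation between each variable in a set and that set's own canonical variate, multiplied by the squared canonical correlation. The function name `redundancy` and the toy inputs are hypothetical; in practice the variate and the canonical correlation would come from the estimated CCA.

```python
import numpy as np

def redundancy(Y, U, rho):
    """Redundancy of the variables in Y for one canonical function.

    Y   : (n x p) data matrix of one set
    U   : length-n canonical variate formed from that same set
    rho : canonical correlation of this function
    """
    loadings = np.array(
        [np.corrcoef(Y[:, j], U)[0, 1] for j in range(Y.shape[1])]
    )
    shared_variance = np.mean(loadings ** 2)   # average squared canonical loading
    return shared_variance * rho ** 2

# Toy example: the variate coincides with the first variable and is
# uncorrelated with the second, so the average squared loading is 0.5.
Y_toy = np.array([[1.0, 1.0],
                  [2.0, -1.0],
                  [3.0, -1.0],
                  [4.0, 1.0]])
U_toy = Y_toy[:, 0]
index = redundancy(Y_toy, U_toy, rho=0.6)
print(round(index, 4))
```

A canonical function can therefore have a high canonical correlation and still a low redundancy index if its variate is only weakly related to most of the individual variables, which is why the two criteria can point to different numbers of factors.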

Before analyzing and interpreting each factor, we compute canonical cross-loadings, which are the pairwise correlations between the three canonical variates and the set of macroeconomic variables, in Table 17. We will then use the canonical cross-loadings along with the CCA factor regression estimates to interpret the factors.

5.3 CCA Factor Regressions

After extracting three factors by estimating the first three canonical correlations, we estimate our CCA factor model with the following specification, where $CV_1$ is the vector of first canonical variate scores, $CV_2$ is the vector of second canonical variate scores and so on.

$$\Delta CS_t = \beta_0 + \beta_1 CV_{1,t} + \beta_2 CV_{2,t} + \beta_3 CV_{3,t} + \varepsilon_t$$

The estimation results in Tables 18 and 19 show a large increase in explanatory power for grade type regressions, compared to the regressions in Section 5.1. The adjusted $R^2$ values are all greater than 0.7, whereas the grade type regressions in Section 5.1 all had lower adjusted $R^2$ values. Furthermore, the adjusted $R^2$ increases as the loan grade becomes lower, that is, as the loan becomes less credit-worthy. We observe that the coefficient for the first factor increases for loans that are less credit-worthy, while the coefficient for the second factor is not monotonic: it becomes more negative before becoming less negative and ultimately switches sign for the last category, F. The coefficients for the third factor do not follow a trend, but are lowest for category A and highest for category E.

On the other hand, when loans are sorted by term type, we find that the adjusted $R^2$ values drop sharply, implying that the factors explain little of the systematic variation of interest rates in each term category. Furthermore, the second factor is not statistically significant in the regressions for either term type. The coefficients for both the first and third factors are statistically significant and positive, and increase for the longer term category of 60 months.

The first canonical variate is strongly negatively correlated with the risk-free Treasury slope (rf-slope), household debt (Debt), and unemployment rate (UNRATE), but strongly positively correlated with inflation (CPI). This suggests that the first factor may represent a macro non-default factor. In periods of economic expansion, the unemployment rate decreases while inflation increases, as predicted by the Phillips curve (1958). Due to increased household income as a result of lower unemployment rates, household debt service payments as a percentage of disposable income also decrease, assuming constant household debt levels. However, the macro non-default factor is negatively correlated with the slope of the risk-free rate, which represents investors’ expectations about future risk-free rates. In particular, a decreased slope implies that investors do not expect short-term risk-free rates to rise. Prior to quantitative easing in 2008, this typically implied that investors were bearish about the economy in the short run. However, given the general low interest rate environment from 2008 to 2015, the traditionally established link between the Treasury yield slope and macroeconomic expectations can be questioned, and thus will not affect our interpretation of the first factor as a macro non-default factor. Instead, this adds some subtlety to our interpretation of the non-default factor: although the economy is improving, economic perceptions are still bearish.

Since the first factor has a positive coefficient in the CCA factor regressions, this implies that a decrease in the likelihood of macro default is correlated with a decrease in P2P credit spreads across all grade types and term types. This is because if the probability of default across the economy decreases, the probability of default for P2P loans will also decrease, reducing the P2P interest rates. Furthermore, the magnitude of the effect increases for lower grade P2P loans, that is, loans that are less credit-worthy, as well as for longer term P2P loans. This may be explained by the fact that recessionary economic conditions are likely to affect lower grade P2P debtors more, since lower grade P2P loans are correlated with worse credit history or borrower-specific attributes like lower income levels. The effect is larger for longer term loans as well, since longer term P2P loans have a higher risk of default as long term economic horizons are more difficult to predict. Therefore, the effect of macro default on P2P loans is magnified, further increasing the interest rates of longer term loans.

The most apparent difference between the first canonical variate and the second canonical variate is that the risk-free Treasury slope changes from a negative correlation to a positive correlation, and GDP changes from a slight negative correlation to a strong positive correlation. In addition, equity volatility is slightly positively correlated with the second canonical variate, albeit at a small magnitude. This factor can be interpreted as latent market uncertainty that may be correlated with short-term market rallies. Given the low interest rate environment mentioned earlier, market uncertainty may be correlated with higher slopes, i.e., a steeper Treasury yield curve, as investors demand a higher premium for investing in long-term government bonds. At the same time, within the time frame of our dataset, US GDP experienced two sudden swings (a sudden drop followed by a sudden increase), although the unemployment rate steadily decreased. Therefore, given that GDP can function like a signal rather than an indicator in the economy, sudden swings in GDP may be correlated with greater uncertainty. This hypothesis is supported by the fact that the change in equity volatility is positively correlated with the second canonical variate.

Our CCA factor regressions reveal that the second factor has a negative coefficient, implying that an increase in latent traditional market uncertainty is associated with a decrease in credit spreads for all grade types except grade F. The negative coefficient is strongest for grade B and weakens consistently from B to E. We first explain the negative coefficient, which may arise as uncertainty about traditional investment products encourages more investors to seek higher returns through other means, such as P2P platforms, increasing the supply of money in the P2P market. This causes P2P interest rates to fall and the P2P credit spread to decrease. Next, we explain why market uncertainty has a reduced effect on the credit spread of lower grade loan categories. By the same argument, investors who wish to diversify their portfolio and switch from more traditional products to P2P lending will seek P2P investments that are more substitutable with traditional products in terms of the default risk they entail. Since P2P loans have a higher default risk than corporate bonds, we can expect P2P loans of grade A or B to be substitutable with corporate bonds of grade B or C. Therefore, demand for lower grade P2P loans does not increase proportionally. On the other hand, our CCA factor regressions by loan term type suggest that latent macroeconomic uncertainty has no significant effect on interest rates grouped by term type. This is plausible, as market uncertainty will affect P2P interest rates regardless of term type, given the range of duration risk appetites of investors who substitute traditional investments with P2P investments, resulting in little difference in credit spreads across term types.

The interpretation of the third canonical variate is less apparent, as it is not strongly correlated with any of our macroeconomic proxies. Instead, we observe that the third canonical variate has a stronger positive correlation with the HML equity factor, and a stronger negative correlation with the SMB equity factor, compared to the first two P2P factors. Therefore, if an interpretation is required, we are inclined to interpret the third canonical variate as the fundamental value of the equity market. This is expected to be positively correlated with the HML factor, which is the spread in returns between value and growth stocks. The fundamental value of the equity market can also be negatively correlated with the SMB factor, since the SMB factor is composed of volatile small-cap stocks with returns that are not associated with their fundamental value. Furthermore, Bergbrant and Kelly (2016) find that the HML and SMB factors are not sensitive to macroeconomic risks, which can explain why the third canonical variate has relatively low correlation with the other macroeconomic proxies, as compared to the first or second canonical variate. The CCA factor regressions show that an increase in the third factor is associated with an increase in P2P credit spreads across all grade types. A similar relationship is found by Vassalou (2000), who concluded that an increase in the returns of HML is associated with an increase in the difference between the return on long-term government and corporate bonds. However, our interpretation of the third canonical variate is tenuous, as its correlations with the macroeconomic proxies are weak.

We note that while the level of the yield curve (10Y-rf) is statistically significant in our fully specified multiple regressions, it has very low correlation with the factors extracted from the canonical correlations. Given this disparity, we are inclined to think that the OLS regressions face issues of multicollinearity, as the level of the yield curve is heavily correlated with the other macroeconomic proxies, such as the inflation rate. Therefore, CCA factors allow us to extract more meaningful and accurate insights about systematic variations in P2P spreads. However, while our interpretations are supported by economic theory and intuition, we concede that they are tentative.
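One common way to gauge the multicollinearity concern raised above is the variance inflation factor (VIF), which regresses each regressor on the others and reports 1 / (1 - R^2). The paper does not report VIFs, so the following is only an illustrative sketch on synthetic data; the function name `variance_inflation_factors` and the three made-up regressors are assumptions for the example.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    column j on all other columns plus an intercept."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = np.empty(k)
    for j in range(k):
        y = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        r2 = 1.0 - resid.var() / y.var()
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs

# Synthetic regressors (NOT the paper's data): the first two are nearly
# collinear, the third is independent of both.
rng = np.random.default_rng(1)
a = rng.normal(size=200)
b = a + 0.1 * rng.normal(size=200)
c = rng.normal(size=200)
vifs = variance_inflation_factors(np.column_stack([a, b, c]))
print(np.round(vifs, 1))
```

Large VIFs for the first two columns signal that their individual OLS coefficients and t-statistics are unstable, which is the kind of issue that can make a single proxy look significant in a full regression even when it carries little independent information.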

As before, we extract the first principal component from the residuals and include the extracted series in our CCA factor regressions; see Tables 20 and 21. We find that the average increase in adjusted $R^2$ values for the grade type regressions is 0.103, which is significantly smaller than the increase for the initial OLS regressions. On the other hand, the average increase in adjusted $R^2$ values for the term type regressions is 0.544, which is similar to the average increase in the OLS regressions. Therefore, it appears that the CCA factors do account for most of the common variation in P2P spreads sorted by grade type, and that the unexplained errors are idiosyncratic, implying no missing factors in (1). On the other hand, the CCA factors fail to account for common variation in P2P spreads sorted by term type, implying that there are missing factors. Therefore, we are inclined to believe that macroeconomic conditions affect common variation in the interest rates of different P2P grades, but not of different P2P terms.
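The comparison above can be retraced from the reported tables. The short check below assumes that the last entry of each coefficient row in Tables 8, 9, 12, 13 and 18 to 21 is the adjusted $R^2$, and averages the gain from adding the first principal component of the residuals. Because the table values are rounded, the results should be close to, but not exactly equal to, the 0.103 and 0.544 quoted in the text.

```python
from statistics import mean

grades = "ABCDEF"
terms = ("36-m", "60-m")

# Adjusted R^2 values as read from the tables (rounded as printed).
ols_grade = dict(zip(grades, [0.339, 0.400, 0.479, 0.361, 0.245, 0.223]))  # Table 8
ols_pc_grade = dict(zip(grades, [0.70, 0.81, 0.85, 0.87, 0.80, 0.65]))     # Table 12
cca_grade = dict(zip(grades, [0.76, 0.86, 0.82, 0.84, 0.86, 0.93]))        # Table 18
cca_pc_grade = dict(zip(grades, [0.81, 0.95, 0.97, 0.97, 0.99, 0.99]))     # Table 20

ols_term = dict(zip(terms, [0.358, 0.337]))     # Table 9
ols_pc_term = dict(zip(terms, [0.88, 0.88]))    # Table 13
cca_term = dict(zip(terms, [0.04, 0.04]))       # Table 19
cca_pc_term = dict(zip(terms, [0.58, 0.58]))    # Table 21

def average_gain(before, after):
    """Mean increase in adjusted R^2 after adding the residual PC."""
    return mean(after[k] - before[k] for k in before)

gain_cca_grade = average_gain(cca_grade, cca_pc_grade)
gain_ols_grade = average_gain(ols_grade, ols_pc_grade)
gain_cca_term = average_gain(cca_term, cca_pc_term)
gain_ols_term = average_gain(ols_term, ols_pc_term)

print(f"grade: CCA {gain_cca_grade:.3f} vs OLS {gain_ols_grade:.3f}")
print(f"term:  CCA {gain_cca_term:.3f} vs OLS {gain_ols_term:.3f}")
```

On these rounded inputs the grade-type gain under CCA is roughly 0.10 against roughly 0.44 under OLS, while the term-type gains are about 0.54 and 0.53, which matches the pattern described in the paragraph above.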

6 Conclusion

We use canonical correlations to extract latent macroeconomic factors, in order to analyze the effect that the wider economy has on aggregated P2P interest rates. The technique also offers the advantage of allowing us to accurately verify whether there are missing non-macroeconomic factors that affect common variation in P2P credit spreads, by eliminating systematic measurement errors in macroeconomic variables that can affect our results. We show that the common variation in the grade structure of credit spreads is explained by three factors — macro non-default, market uncertainty and (possibly) fundamental market value.


The work in this article is generously supported by DARPA D15AP00109 and NSF IIS 1546413. The authors are grateful to the Stevanovich Center at the University of Chicago for permission to use its facilities.


min max mean sd
36-A 2.82 7.96 6.37 1.03
36-B 4.47 12.04 9.72 1.73
36-C 5.85 15.31 12.30 2.26
36-D 7.61 18.34 14.72 2.80
36-E 9.03 21.58 16.92 3.32
36-F 10.56 23.70 19.43 3.82
60-A 4.89 8.29 6.87 0.86
60-B 8.23 11.88 9.99 1.18
60-C 10.51 15.52 13.03 1.34
60-D 12.97 18.51 15.99 1.55
60-E 14.67 20.99 18.46 1.90
60-F 16.30 23.18 21.19 2.17
Table 1: Descriptive statistics: levels of monthly credit spreads across 2 maturity terms and 6 grades
Test Statistic p-value
36-A 0.69
36-B 0.60
36-C 0.87
36-D 0.99
36-E 0.98
36-F 0.98
60-A 0.74
60-B 0.70
60-C 0.91
60-D 0.93
60-E 0.92
60-F 0.98
Table 2: ADF test statistics and p-values for levels of monthly credit spreads
Min Max Mean S.D.
36-A 1.17 0.02 0.30
36-B 1.14 0.03 0.35
36-C 1.04 0.05 0.33
36-D 1.41 0.07 0.36
36-E 1.98 0.09 0.44
36-F 2.71 0.11 0.49
60-A 0.99 0.01 0.30
60-B 0.89 0.34
60-C 0.83 0.00 0.31
60-D 1.47 0.02 0.35
60-E 1.66 0.03 0.36
60-F 2.03 0.08 0.38
Table 3: Descriptive statistics: first differences of monthly credit spreads across 2 maturity terms and 6 grades
Test Statistic p-value
36-A 0.01
36-B 0.01
36-C 0.01
36-D 0.01
36-E 0.01
36-F 0.01
60-A 0.01
60-B 0.01
60-C 0.01
60-D 0.01
60-E 0.01
60-F 0.01
Table 4: ADF test statistics and p-values for first differences of monthly credit spreads
Test Statistic Critical Value (5%)
2.79 8.18
9.06 17.95
16.49 31.52
27.74 48.28
48.82 70.60
74.57 90.39
4.30 8.18
11.08 17.95
21.01 31.52
34.40 48.28
51.69 70.60
77.46 90.39
Table 5: Johansen test statistics and critical values for levels of monthly credit spreads
Min Max Mean S.D.
UNRATE 4.60 10.00 7.37 1.73
GDP 5.00 1.03 2.42
CPI 207.23 238.15 224.96 9.63
Debt 9.89 13.21 11.05 1.15
10Y-rf 0.64 0.27
rf-slope 3.43 2.11 0.76
SPX 735.09 2107.39 1449.21 365.96
VIX 20.50 5.57
SMB 6.11 0.13 2.27
HML 7.85 2.73
Table 6: Descriptive statistics: macroeconomic variables of monthly frequency
CPI 1.00 0.23 0.91 0.04 0.09 0.09
UNRATE 1.00 0.84 0.32 0.07
Debt 0.84 1.00 0.51 0.11
GDP 0.23 1.00 0.40 0.19 0.01 0.06
SPX 0.91 0.40 1.00 0.07 0.03
10Y-rf 0.04 0.07 1.00 0.12 0.31 0.44
rf-slope 0.32 0.51 0.19 0.12 1.00 0.04
SMB 0.07 0.11 0.01 0.31 0.04 1.00
HML 0.09 0.44 1.00 0.04
VIX 0.09 0.06 0.03 0.04 1.00
Table 7: Descriptive statistics: correlations between macroeconomic variables


(Intercept) UNRATE GDP CPI Debt 10Yrf Slope SPX VIX HML SMB Adj. $R^2$
A 6.741 0.006 0.073 0.339
1.546 0.409 1.359 1.681
B 6.795 0.096 0.002 0.013 0.400
1.415 1.628 0.431 1.488
C 1.434 0.003 0.009 0.479
0.345 0.902 1.193
D 3.785 0.002 0.005 0.361
0.746 0.543 0.514
E 2.515 0.009 0.003 0.245
0.395 0.470 0.522
F 4.884 0.065 0.001 0.001 0.223
0.683 0.743 0.169 0.076
Table 8: Panel A OLS Regressions: Full specification regressions according to Equation (6), across loan grade types
(Intercept) UNRATE GDP CPI Debt 10Yrf Slope SPX VIX HML SMB Adj. $R^2$
36-m 4.503 0.028 0.358
1.759 0.844 1.795 2.733
60-m 4.004 0.030 0.003 0.337
1.429 0.852 1.192 2.446
Table 9: Panel B OLS Regressions: Full specification regressions according to Equation (6), across loan term types
(Intercept) UNRATE CPI 10Yrf SPX HML SMB Adj. $R^2$
A 0.84 0.01 0.35
3.31 1.56
B 1.14 0.01 0.41
4.09 1.42
C 1.04 0.49
D 2.25 0.38
E 0.48 0.27
F 2.20 0.25
Table 10: Panel A OLS Regressions: AIC stepwise regressions, across loan grade types
(Intercept) UNRATE 10Yrf SPX VIX HML SMB Adj. $R^2$
36-m 0.97 0.00 0.01 0.36
6.91 1.84 2.55
60-m 1.01 0.01 0.34
6.51 2.33
Table 11: Panel B OLS Regressions: AIC stepwise regressions, across loan term types
(Intercept) UNRATE GDP CPI Debt 10Yrf Slope SPX VIX HML SMB PC1 Adj. $R^2$
A 6.94 0.00 0.07 0.01 0.70
2.31 0.48 1.76 2.50
B 7.08 0.09 0.00 0.01 0.81
2.58 2.61 0.83 2.63
C 2.31 0.00 0.01 0.85
1.05 1.91 2.31
D 3.53 0.00 0.01 0.87
1.56 1.11 1.14
E 1.02 0.01 0.00 0.01 0.00 0.80
0.31 0.92 0.21 0.16 0.77
F 4.75 0.06 0.00 0.00 0.65
0.99 1.07 0.25 0.12
Table 12: Panel A PC OLS Regressions: Full specification regressions according to Equation (6) including first PC of residuals, across loan grade types
(Intercept) UNRATE GDP CPI Debt 10Yrf Slope SPX VIX HML SMB PC1 Adj. $R^2$
36-m 4.34 0.03 0.00 0.01 0.21 0.88
3.86 1.87 3.91 5.97 49.97
60-m 4.17 0.03 0.00 0.01 0.22 0.88
3.53 2.09 3.09 6.05 49.99
Table 13: Panel B PC OLS Regressions: Full specification regressions according to Equation (6) including first PC of residuals, across loan term types
Variate CanCor CanCor$^2$ Eigenvalue Percentage Cum. Perc
1 0.99433 0.98868 87.39400 64.84569 64.85
2 0.98681 0.97380 37.17146 27.58094 92.43
3 0.93412 0.87258 6.84843 5.08148 97.51
4 0.74667 0.55752 1.25998 0.93490 98.44
5 0.67833 0.46012 0.85228 0.63239 99.08
6 0.63521 0.40349 0.67643 0.50191 99.58
7 0.51282 0.26298 0.35682 0.26476 99.84
8 0.36017 0.12972 0.14905 0.11060 99.95
9 0.23485 0.05515 0.05837 0.04331 100.00
10 0.07318 0.00535 0.00538 0.00399 100.00
Table 14: Canonical Correlation Coefficients and Eigenvalues
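The columns of Table 14 appear to be linked by standard CCA identities: the second numeric column is the square of the canonical correlation, the eigenvalue is r^2 / (1 - r^2), and the percentage is each eigenvalue's share of the eigenvalue sum. The following check, which assumes those identities and uses the squared correlations as printed, reproduces the eigenvalue and percentage columns up to rounding.

```python
# Squared canonical correlations as printed in Table 14.
r_squared = [0.98868, 0.97380, 0.87258, 0.55752, 0.46012,
             0.40349, 0.26298, 0.12972, 0.05515, 0.00535]

# Eigenvalue of each canonical root: r^2 / (1 - r^2).
eigenvalues = [r2 / (1.0 - r2) for r2 in r_squared]

# Share of each eigenvalue in the total, and the running cumulative share.
total = sum(eigenvalues)
percentages = [100.0 * ev / total for ev in eigenvalues]
cumulative = []
running = 0.0
for pct in percentages:
    running += pct
    cumulative.append(running)

for i, (ev, pct, cum) in enumerate(zip(eigenvalues, percentages, cumulative), 1):
    print(f"{i:2d}  {ev:9.5f}  {pct:8.4f}  {cum:6.2f}")
```

Small discrepancies against the printed table (for instance in the first eigenvalue) are expected, since dividing by 1 - r^2 amplifies the rounding in r^2 when the correlation is close to one.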
CanCor LR test stat approx F numDF denDF Pr(> F)
1 0.99 0.00 11.36 120.00 332.93 0.0000
2 0.99 0.00 6.77 99.00 307.63 0.0000
3 0.93 0.01 3.65 80.00 281.29 0.0000
4 0.75 0.09 2.20 63.00 253.92 0.0000
5 0.68 0.19 1.86 48.00 225.48 0.0015
6 0.64 0.36 1.54 35.00 195.93 0.0355
7 0.51 0.60 1.07 24.00 165.17 0.3778
8 0.36 0.82 0.67 15.00 132.91 0.8105
9 0.23 0.94 0.39 8.00 98.00 0.9256
10 0.07 0.99 0.09 3.00 50.00 0.9654
Table 15: Wilks' Lambda Tests for Significance of Canonical Variates
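The "LR test stat" column in Table 15 is consistent with Wilks' Lambda computed sequentially: for the test that variates k and beyond are jointly zero, Lambda is the product of (1 - r_i^2) over i >= k. The sketch below assumes that definition, takes the squared canonical correlations from Table 14, and also recomputes the numerator degrees of freedom as (p - k + 1)(q - k + 1) with p = 12 spread series and q = 10 macroeconomic proxies. The F approximation and denominator degrees of freedom are not reproduced here.

```python
import math

# Squared canonical correlations as printed in Table 14.
r_squared = [0.98868, 0.97380, 0.87258, 0.55752, 0.46012,
             0.40349, 0.26298, 0.12972, 0.05515, 0.00535]

# Sequential Wilks' Lambda: product of (1 - r_i^2) for roots k, k+1, ...
wilks = [math.prod(1.0 - r2 for r2 in r_squared[k:])
         for k in range(len(r_squared))]

# Numerator degrees of freedom for each sequential test
# (k is zero-based here, so the formula reads (p - k)(q - k)).
p, q = 12, 10
num_df = [(p - k) * (q - k) for k in range(len(r_squared))]

for k, (lam, df) in enumerate(zip(wilks, num_df), 1):
    print(f"{k:2d}  Lambda={lam:.2f}  numDF={df}")
```

Under these assumptions the rounded Lambda values and the numerator degrees of freedom line up with the corresponding columns of Table 15.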
Variate 1 2 3 4 5 6 7 8 9 10
Redundancy 0.37 0.16 0.30 0.01 0.01 0.00 0.00 0.00 0.00 0.00
Table 16: Redundancy Indices for P2P Credit Spread Canonical Variates
Proxy Can 1 Can 2 Can 3
CPI 0.72 0.65 0.15
GDP 0.56
SPX 0.50 0.84
10Y-rf 0.05
rf-slope 0.35 0.22
HML 0.09 0.19
VIX 0.11 0.01
Table 17: Pairwise Correlations of CCA Factors and Macroeconomic Proxies
(Intercept) Factor1 Factor2 Factor3 Adj. $R^2$
A 6.71 0.76
207.99 11.39 14.24
B 10.16 0.86
272.01 10.54 19.14
C 13.23 0.82
280.53 11.93 15.90
D 16.15 0.84
304.90 18.13 15.27
E 18.65 0.86
308.11 20.67 18.33
F 21.47 0.93
441.27 35.23 14.23 15.65
Table 18: Panel A CCA Factor Regressions: specification according to Equation (7), across loan grade types
(Intercept) Factor1 Factor2 Factor3 Adj. $R^2$
36-m 14.55 0.77 0.77 0.04
54.51 2.86 2.86
60-m 14.60 0.82 0.80 0.04
55.57 3.08 3.02
Table 19: Panel B CCA Factor Regressions: specification according to Equation (7), across loan term types
(Intercept) Factor1 Factor2 Factor3 PC1 Adj. $R^2$
A 6.71 0.37 0.46 0.81
233.05 12.76 15.96
B 10.16 0.40 0.72 0.95
462.54 17.93 32.54
C 13.23 0.57 0.76 0.97
684.63 29.11 38.80
D 16.14 0.97 0.81 0.97
759.25 45.15 38.03
E 18.65 1.26 1.12 0.99
1112.31 74.61 66.18
F 21.47 1.73 0.70 0.77 0.99
1489.26 118.89 48.01 52.81
Table 20: Panel A CCA-PC Factor Regressions: specification according to Equation (7) including first PC of residuals, across loan grade types
(Intercept) Factor1 Factor2 Factor3 PC1 Adj. $R^2$
36-m 14.55 0.77 0.77 3.65 0.58
82.75 4.35 4.35 22.11
60-m 14.60 0.82 0.80 3.59 0.58
84.37 4.68 4.58 22.11
Table 21: Panel B CCA-PC Factor Regressions: specification according to Equation (7) including first PC of residuals, across loan term types


  • Ahn, S.C., Dieckmann, S. and Perez, M.F. (2012). Exploring common factors in the term structure of credit spreads: the use of canonical correlations. Available at:
  • Ang, A. and Piazzesi, M. (2003). A no-arbitrage vector autoregression of term structure dynamics with macroeconomic and latent variables. Journal of Monetary Economics, 50(4), 745–787.
  • Bergbrant, M.C. and Kelly, P.J. (2016). Macroeconomic expectations and the size, value, and momentum factors. Financial Management, 45(4), 809–844.
  • Dietrich, A. and Wernli, R. (2016). What drives the interest rates in the P2P consumer lending market? Empirical evidence from Switzerland. Available at:
  • Duffee, G.R. (1998). The relation between Treasury yields and corporate bond yield spreads. The Journal of Finance, 53(6), 2225–2241.
  • Fama, E.F. and French, K.R. (1993). Common risk factors in the returns on stocks and bonds. Journal of Financial Economics, 33(1), 3–56.
  • Gourieroux, C., Monfort, A. and Renault, E. (1993). Indirect inference. Journal of Applied Econometrics, 8(S1).
  • Hair, J.F., Black, W.C., Babin, B.J., Anderson, R.E., Tatham, R.L. et al. (1998). Multivariate data analysis (Vol. 5, No. 3). Prentice Hall, Upper Saddle River, NJ.
  • Herzenstein, M., Andrews, R.L., Dholakia, U.M. and Lyandres, E. (2008). The democratization of personal consumer loans? Determinants of success in online peer-to-peer lending communities. Boston University School of Management Research Paper, 14(6).
  • Klafft, M. (2008). Peer to peer lending: auctioning microcredits over the internet. Available at:
  • Litterman, R.B. and Scheinkman, J. (1991). Common factors affecting bond returns. The Journal of Fixed Income, 1(1), 54–61.
  • Okun, A.M. (1963). Potential GNP: its measurement and significance. Yale University, Cowles Foundation for Research in Economics.
  • Tu, Y., Yao, Q. and Zhang, R. (2015). Error-correction factor models. Available at:
  • Vassalou, M. (2000). The Fama–French factors as proxies for fundamental economic risks (No. 181). Center on Japanese Economy and Business, Columbia Business School.
  • Zhang, Y., Jia, H., Diao, Y., Hai, M. and Li, H. (2016). Research on credit scoring by fusing social media information in online peer-to-peer lending. Procedia Computer Science, 91, 168–174.