Data Science Lab

In the age of the information explosion, data mining and machine learning techniques have become heavily involved in financial market modelling. Many scholars have demonstrated the significance of dependence modelling across multiple financial markets, especially during the catastrophic global financial crisis (GFC) of 2008. Sharp fluctuations across different markets demonstrate that the dependence is high-dimensional, contains various hierarchical and horizontal relationships, and often presents complicated dependence structures and characteristics such as asymmetry and tail dependence. Thus, a strong understanding of cross-market dependence is critical in cross-market applications such as portfolio management and risk management.

Unfortunately, modelling the dependence across multiple financial markets is highly challenging for the following reasons: (i) the cross-market dependence structure is often embedded with strong and complicated coupling relationships of high dimensionality, as with any complex behavioural and social system; (ii) financial variables such as daily returns have been demonstrated to follow non-Gaussian distributions, which means that dependence models should cover a wide range of dependencies in order to capture asymmetric dependence; and (iii) various tail dependencies, such as lower and upper tail dependence, are ubiquitous in financial markets.

Typical approaches such as Markov models, probabilistic graphical models, and neural network models can address the high-dimensional problem by building a conditional dependence structure between random variables. However, these models typically impose unrealistic assumptions (e.g., Gaussian distributions or mixtures of Gaussians), which leads to failure in capturing the complex dependence structures of the real world. Separately, copulas have been demonstrated to be effective in representing dependence between variables in the statistics and finance communities. By splitting the joint distribution into the dependence between variables and the individual marginal distributions, copulas provide a flexible mechanism for specifying the cross-market dependence and the marginal distributions independently of each other. Nevertheless, building effective dependence structures to address the aforementioned complexities is still a significant challenge for existing copula approaches. Over the last decade, various mixed models (e.g., tree-structured copula models and copula Bayesian networks) have been developed by combining the advantages of traditional copula models and probabilistic graphical models; however, assumptions and restrictions on the dependence structure have still not been avoided.
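The separation of dependence and marginals described above can be illustrated with a minimal sketch. It assumes a bivariate Clayton copula (chosen here only because it has lower tail dependence and a closed-form sampler), a hypothetical parameter theta = 2, and arbitrary illustrative marginals; none of these choices come from the thesis itself. The sketch samples the dependence first, attaches the marginals afterwards, and checks that Kendall's tau, a rank-based dependence measure, is left unchanged by the choice of marginals.

```python
import math
import random
from statistics import NormalDist


def sample_clayton(n, theta, rng):
    """Draw n pairs (u, v) from a Clayton copula by conditional inversion."""
    pairs = []
    while len(pairs) < n:
        u, w = rng.random(), rng.random()
        if u == 0.0 or w == 0.0:
            continue  # keep both draws strictly inside (0, 1)
        # Invert the conditional distribution C(v | u) = w for v.
        v = ((w ** (-theta / (1.0 + theta)) - 1.0) * u ** (-theta) + 1.0) ** (-1.0 / theta)
        v = min(max(v, 1e-12), 1.0 - 1e-12)  # guard against rounding to 0 or 1
        pairs.append((u, v))
    return pairs


def kendall_tau(pairs):
    """Plain O(n^2) Kendall's tau, adequate for small samples."""
    n = len(pairs)
    score = 0
    for i in range(n):
        xi, yi = pairs[i]
        for j in range(i + 1, n):
            xj, yj = pairs[j]
            prod = (xi - xj) * (yi - yj)
            score += (prod > 0) - (prod < 0)
    return score / (n * (n - 1) / 2)


rng = random.Random(0)
theta = 2.0  # illustrative value; for Clayton, tau = theta / (theta + 2)

# Step 1: sample the dependence structure on the unit square.
uv = sample_clayton(1000, theta, rng)

# Step 2: attach marginals chosen independently of the copula
# (a normal and an exponential marginal, both purely illustrative).
normal = NormalDist(0.0, 0.01)
xy = [(normal.inv_cdf(u), -math.log(1.0 - v) / 50.0) for u, v in uv]

tau_uv = kendall_tau(uv)
tau_xy = kendall_tau(xy)
print(f"tau on copula scale: {tau_uv:.3f}, tau after marginals: {tau_xy:.3f}")
```

Because the marginal transforms are strictly increasing, the two estimates should agree up to floating-point ties, and both should lie near the theoretical value theta / (theta + 2) for a Clayton copula.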

Based on the aforementioned research limitations and challenges, this thesis proposes weighted partial D vine copula, weighted partial regular vine copula and weighted regular vine variational long short-term memory (LSTM) models for high-dimensional cross-market modelling.

In Chapter 4, a novel bottom-to-top approach with no prior dependence assumptions, called the weighted partial D vine copula, is presented to capture the nonlinear and asymmetric dependence structures of cross-market data. By relaxing the Gaussian assumptions, the new model is able to capture more sophisticated dependence structures between variables. The model is applied to stock and exchange market data as a case study; extensive experimental results demonstrate that this model and its intrinsic design significantly outperform typical models and industry baselines.
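For orientation, the standard (unweighted) D vine construction on three variables factorises a joint density into marginal densities and bivariate copula densities, as sketched below. This is the textbook pair-copula decomposition, stated under the usual simplifying assumption that the conditional copula does not vary with the conditioning value; it is not the weighted partial variant developed in Chapter 4, whose details are not given here.

```latex
f(x_1, x_2, x_3)
  = \prod_{k=1}^{3} f_k(x_k)
    \; c_{12}\bigl(F_1(x_1), F_2(x_2)\bigr)
    \; c_{23}\bigl(F_2(x_2), F_3(x_3)\bigr)
    \; c_{13|2}\bigl(F_{1|2}(x_1 \mid x_2),\, F_{3|2}(x_3 \mid x_2)\bigr)
```

Here $f_k$ and $F_k$ denote the marginal densities and distribution functions, $c_{12}$ and $c_{23}$ are the pair copulas of the first tree, and $c_{13|2}$ is the pair copula of the second tree, evaluated at the conditional distribution functions.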

Chapter 5 extends the approach presented in Chapter 4 to a more general structure, namely the weighted partial regular vine. The new model can capture nonlinear and asymmetric dependence structures in a more flexible way by utilising the advantages of the regular vine structure and unrestricted bivariate copula families. The application of the model to asset allocation, by optimising the utility of a portfolio, is then presented as a case study. Compared with general approaches such as minimum variance, the utility function optimised with the new weighted partial regular vine avoids the Gaussian assumption, which is always implied in those approaches due to computational issues.

Chapter 6 discusses the benefits of using flexible bivariate copulas with different tail dependencies in the new regular vine copula model. Experiments apply the new model to the exchange market to analyse the dynamic movement of the tail dependencies, and the results demonstrate better performance.

Chapter 7 introduces a novel vine copula-based variational autoencoder (VAE) that generates randomness for an LSTM to model cross-market data. Current VAE models usually apply a mean-field assumption to simplify the calculation process; however, such an assumption may lead to posterior collapse by removing the dependencies between variables. Our new model provides a two-step parameter estimation process that is incorporated into a variational LSTM to capture the complex dependencies among latent variables. By maintaining these dependency relationships over the latent variables, the model achieves a dramatic improvement in cross-market data modelling and avoids posterior collapse, as the empirical results demonstrate.
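The contrast between a mean-field posterior and a copula-based posterior can be written compactly. The notation below ($q$ for the variational posterior, $Q_i$ for its marginal distribution functions, $c$ for a copula density, $d$ latent dimensions) is assumed for illustration and is not taken from the thesis.

```latex
\text{mean-field:} \quad
q(z \mid x) = \prod_{i=1}^{d} q_i(z_i \mid x)
\qquad\qquad
\text{copula-based:} \quad
q(z \mid x) = c\bigl(Q_1(z_1 \mid x), \dots, Q_d(z_d \mid x)\bigr)
              \prod_{i=1}^{d} q_i(z_i \mid x)
```

Setting $c \equiv 1$ (the independence copula) recovers the mean-field form, so the copula density is exactly the factor that carries the dependence among latent variables.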

All of the aforementioned approaches and frameworks for high-dimensional cross-market dependence modelling are applied to business applications, such as value at risk. These models not only provide insightful knowledge for investors to control and reduce the aggregation risk of the portfolio, but also show promising potential for further exploration and development.
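As a rough illustration of the value-at-risk application mentioned above, the sketch below computes an empirical VaR as a quantile of the loss distribution. The function name `historical_var`, the quantile convention, and the toy return series are all assumptions made for illustration; in the thesis's setting the portfolio returns would presumably be simulated from a fitted dependence model rather than taken from a fixed list.

```python
import math


def historical_var(returns, alpha=0.95):
    """Empirical value at risk: the alpha-quantile of the losses (-returns)."""
    if not returns:
        raise ValueError("returns must be non-empty")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    losses = sorted(-r for r in returns)
    # Smallest loss level that at least a fraction alpha of losses do not exceed.
    index = min(len(losses) - 1, max(0, math.ceil(alpha * len(losses)) - 1))
    return losses[index]


# Toy portfolio returns from -0.50 to 0.49 in steps of 0.01 (illustrative only).
sample_returns = [i / 100 for i in range(-50, 50)]
var_95 = historical_var(sample_returns, 0.95)
print(f"95% VaR of the toy series: {var_95:.2f}")
```

A positive result is read as a loss: at the 95% level, losses are expected to exceed this value in roughly 5% of periods under the empirical distribution.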