Review of deep learning models for crypto price prediction: implementation and evaluation (2024)

Jingyang Wu, Xinyi Zhang, Fangyixuan Huang, Haochen Zhou, Rohitash Chandra

Abstract

Investors and researchers have shown much interest in accurate cryptocurrency price forecasting models. Deep learning models are prominent machine learning techniques that have transformed various fields and have shown potential for finance and economics. Although various deep learning models have been explored for cryptocurrency price forecasting, it is not clear which models are suitable, given the high market volatility. In this study, we review the literature on deep learning for cryptocurrency price forecasting and evaluate novel deep learning models for cryptocurrency price prediction. Our deep learning models include variants of long short-term memory (LSTM) recurrent neural networks, variants of convolutional neural networks (CNNs), and the Transformer model. We evaluate univariate and multivariate approaches for multi-step-ahead prediction of cryptocurrency close prices. Our results show that the univariate LSTM model variants perform best for cryptocurrency predictions. We also carry out volatility analysis on the four cryptocurrencies, which reveals significant fluctuations in their prices throughout the COVID-19 pandemic. Additionally, we investigate the prediction accuracy of two scenarios defined by different training sets for the models. First, we use the pre-COVID-19 datasets to model cryptocurrency close-price forecasting during the early period of COVID-19. Second, we use data from the COVID-19 period to predict prices for 2023 to 2024.

Keywords: cryptocurrency, deep learning, time series prediction

Affiliation: UNSW Sydney, Sydney, Australia

1 Introduction

The traditional financial ecosystem is implemented through a complex set of policies and structural mechanisms that financial institutions use to create currency within an economy [1]. At the core of this ecosystem are the central bank, treasury, and commercial banking entities, which operate under three primary monetary frameworks: commodity-based [2], commodity-backed [3], and fiat currency systems [4]. Triggered by flaws in these institutions, such as inflationary propensities and transactional inefficiencies [5], the digitisation of currency has become a revolution [6]. Cryptocurrencies aim to rectify imperfections of the existing system [7] by addressing inflation, improving financial stability and transactional efficiency, and reducing operational costs. A cryptocurrency is a peer-to-peer digital exchange system where cryptographic techniques are employed to create and distribute units of currency among participants [8, 9]. The cryptocurrency market has seen rapid and unpredictable changes over its relatively brief existence [10]. The security of the cryptocurrency market is underpinned by blockchain technology [11]. As of 2024, there are over 5,000 cryptocurrencies and 5.8 million active users in the cryptocurrency industry [12]. Because it combines cryptography with a monetary unit, Bitcoin (BTC) became one of the most popular cryptocurrencies and received attention in fields such as computer science, economics and cryptography [13]. Satoshi Nakamoto pseudonymously introduced Bitcoin and released it as open-source software in January 2009 [8]. The cryptocurrency ecosystem, encompassing Bitcoin and altcoins together with tokens such as Civic and BitDegree, marks a significant stride towards a decentralised financial system. The ecosystem refers to the broader infrastructure and community that encompasses various cryptocurrencies and blockchain projects, whereas a "token", such as BitDegree or Civic, serves specific functions within these ecosystems, often facilitating access to services or representing certain assets. Unlike Bitcoin, which is primarily a digital currency intended for transactions and value storage, BitDegree focuses on education, offering tokens as incentives within its educational platform. Nevertheless, due to its decentralised nature and absence of governmental support, the cryptocurrency market is susceptible to significant fluctuations in value and the formation of pricing bubbles [14].

The inherent volatility of cryptocurrencies, featuring transaction volume fluctuations and price variability, complicates the predictive analysis of cryptocurrency prices [15]. However, this volatility [16] also makes the market attractive for speculation, as it is the source of potential gain. Prominent cryptocurrencies such as Bitcoin (BTC), Ethereum (ETH), and Litecoin (LTC) differ in valuation, transaction speed, usage, and volatility [17]. Identifying the precise catalysts for price trends in the cryptocurrency domain remains elusive due to the sector’s pronounced volatility. Nevertheless, the market value of cryptocurrencies is projected to increase in the future, with an expected compound annual growth rate of 11.1% [18]. Meanwhile, the financial audit sector is evolving to integrate cryptocurrencies as a valid transaction medium. Investors have previously encountered challenges due to price bubbles resulting in extreme fluctuations [19]. To surmount these obstacles, market participants need a dependable model that can help them identify trends and generate accurate predictions. Predicting cryptocurrency prices with precision is difficult due to their sensitivity to multiple factors, including government policies, technology advancements, public perception, and world events [20]. Muarry et al. [21] highlight the inherent difficulties in predicting the pricing of cryptocurrencies because of their high volatility, decentralised nature, and other distinctive features such as transaction speed and variations in their ecosystems.

Several researchers have affirmed the correlation between cryptocurrencies and other domains such as economics, finance, the internet, and even politics. Wang et al. [22] presented an analysis using machine learning models and revealed a strong correlation between cryptocurrencies and their intrinsic features (e.g., lagged volatility, previous trading information). Kyriazis [23] studied spillover effects in cryptocurrency markets, emphasising Bitcoin’s role, using statistical models such as vector autoregression (VAR) [24] and generalized autoregressive conditional heteroskedasticity (GARCH) [25] to explain inter-market dynamics. Huynh et al. [26] revealed that gold can be used as a reliable tool to reduce the risk associated with unpredictable changes in the cryptocurrency market when utilized as a separate form of currency. However, investors remain both enthusiastic and cautious due to the highly volatile cryptocurrency market. Machine learning and deep learning models are promising for cryptocurrency forecasting due to their prediction capabilities and their ability to model multimodal data [27], spatiotemporal data [28], and time series [29].

Machine learning and deep learning models have shown great potential in temporal forecasting problems for various domains, such as climate extremes [30], energy [31], and financial time series [32]. Deep learning models can assist in forecasting future cryptocurrency prices, although there are challenges due to the nonlinear and volatile nature of the time series. Many researchers are keen to use long short-term memory (LSTM) networks and their variants to predict cryptocurrencies [33, 34, 35]. Deep learning methods such as LSTM recurrent neural networks [36, 37], convolutional neural networks (CNNs) [38], and Transformer models [39] are also promising for predicting cryptocurrencies. Chandra et al. [40] conducted a comparative analysis of various deep learning models for multi-step-ahead time series prediction. A myriad of factors, both internal and external, such as the trading volume, market beta, and volatility, play a critical role in determining cryptocurrency value. Therefore, we need to utilise highly correlated cryptocurrencies as inputs for deep learning models and assess both univariate and multivariate deep learning models.

In this paper, we provide a detailed review of the literature on cryptocurrency price forecasting using deep learning models and then evaluate novel deep learning models for cryptocurrency price forecasting. Specifically, we utilise variants of long short-term memory (LSTM) recurrent neural networks, variants of convolutional neural networks (CNNs), and the Transformer model. We evaluate univariate and multivariate approaches for multi-step-ahead prediction of cryptocurrency close prices. Our results show that the univariate LSTM model variants perform best for cryptocurrency predictions. We also carry out volatility analysis on the four cryptocurrencies and investigate the prediction accuracy of two scenarios defined by different training sets for the models. First, we use the pre-COVID-19 datasets to model cryptocurrency close-price forecasting during the early period of COVID-19. Second, we use data from the COVID-19 period to predict prices for 2023 to 2024. We investigate the effect of univariate and multivariate models, where the multivariate model features the gold price; the close, open, and high prices of the cryptocurrency being predicted; and the price of a highly correlated cryptocurrency.
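The difference between the univariate and multivariate multi-step-ahead settings described above can be illustrated with a minimal sketch of how a price series might be cut into input windows and multi-step targets. The function name `make_windows`, the window sizes, and the toy values are illustrative assumptions and are not taken from the paper's implementation.

```python
from typing import List, Sequence, Tuple


def make_windows(
    series: Sequence[Sequence[float]],
    target_col: int,
    n_in: int,
    n_out: int,
) -> Tuple[List[List[List[float]]], List[List[float]]]:
    """Cut a (time x features) series into input windows and multi-step targets.

    Each input holds n_in consecutive rows (all features); each target holds
    the next n_out values of the column being predicted.
    """
    inputs, targets = [], []
    last_start = len(series) - n_in - n_out
    for start in range(last_start + 1):
        window = [list(row) for row in series[start:start + n_in]]
        future = [row[target_col]
                  for row in series[start + n_in:start + n_in + n_out]]
        inputs.append(window)
        targets.append(future)
    return inputs, targets


# Univariate case: one feature (close price) per time step (toy values).
close = [[10.0], [11.0], [12.0], [13.0], [14.0], [15.0], [16.0]]
X_uni, y_uni = make_windows(close, target_col=0, n_in=3, n_out=2)

# Multivariate case: hypothetical [close, open, high, gold] rows;
# the target is still the close price (column 0).
rows = [[c[0], c[0] - 0.5, c[0] + 1.0, 1800.0] for c in close]
X_multi, y_multi = make_windows(rows, target_col=0, n_in=3, n_out=2)
```

In both cases the targets are identical; only the number of features per time step in the input windows changes.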

The rest of the paper is organised as follows. Section 2 provides a comprehensive overview and analysis of previous research and literature relating to the topic. Section 3 provides the framework that compares selected deep learning models. Section 4 presents the experiments and results. Section 5 presents a discussion, and Section 6 concludes the paper.

2 Review

Forecasting financial time series is highly favoured by researchers in both academic and financial sectors due to its wide applications and significant influence. Machine learning and deep learning have paved the way for numerous models, leading to a large body of published research. Among these areas of interest, cryptocurrency price prediction stands out. This section offers an in-depth overview of how machine learning and deep learning are applied to financial time series forecasting, especially for predicting cryptocurrency prices, without using complex terminology.

2.1 Financial time series prediction

Financial time series forecasting had an emphasis on predicting asset prices [41]. Although there are diverse methodologies, the key focus has been on predicting the future movements of the underlying asset with deep learning models [42]. This field covers a variety of subjects including forecasting of stock prices, index prediction, forex price prediction, as well as predictions for commodities (such as oil and gold), bond prices, volatility, and cryptocurrency prices [43]. Despite the wide range of topics, the underlying principles applied in these forecasts remain uniformly applicable across all categories.

Research within financial time series forecasting is broadly segregated into two categories based on precise price forecasting and trend (directional movement) forecasting [44]. Although exact price prediction aligns with regression tasks, the primary goal in numerous financial forecasting projects is not the accurate prediction of prices, but rather the correct identification of price movement direction. This shifts the emphasis towards trend prediction, or determining the directional change in prices, marking it as a more critical area of investigation compared to pinpoint price forecasting. Hence, trend prediction is approached as a classification issue. Some analyses focus on binary outcomes, addressing only upward/downward movements [45], while others incorporate a third class (neutral option), thus constituting a 3-class problem [46].
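To make the classification framing concrete, the following minimal sketch turns a price series into trend labels. The `neutral_band` threshold is a hypothetical parameter introduced only for illustration: with a band of zero the labelling is effectively binary (exact ties aside), while a positive band gives the 3-class variant.

```python
from typing import List


def trend_labels(prices: List[float], neutral_band: float = 0.0) -> List[str]:
    """Label each step by the relative change from the previous price.

    Changes within +/- neutral_band are labelled "neutral" (assumed rule).
    """
    labels = []
    for prev, curr in zip(prices, prices[1:]):
        change = (curr - prev) / prev
        if change > neutral_band:
            labels.append("up")
        elif change < -neutral_band:
            labels.append("down")
        else:
            labels.append("neutral")
    return labels


prices = [100.0, 102.0, 101.5, 101.6, 95.0]
binary = trend_labels(prices)                         # up/down only here
three_class = trend_labels(prices, neutral_band=0.01)  # 1% neutral band
```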

In recent years, researchers have utilised machine learning and deep learning for the analysis of financial time series data. Nabipour et al. [45] conducted a comparative analysis of deep learning models (simple recurrent neural networks (RNNs) [47] and LSTM networks [36]) with machine learning models for stock market trend prediction, demonstrating the superior accuracy of deep learning. Mehtab et al. [48] enhanced prediction of the NIFTY-50 Indian stock index (https://www.nseindia.com/) using LSTM models with grid search and walk-forward validation, and achieved notable accuracy. NIFTY-50 represents the weighted average of 50 of the top companies listed on the National Stock Exchange (NSE) of India. Rezaei et al. [49] combined deep learning with frequency decomposition methods, including empirical mode decomposition (EMD) [50] and complete ensemble empirical mode decomposition (CEEMD) [51], to predict stock prices and demonstrated the effectiveness of CEEMD-CNN-LSTM and EMD-CNN-LSTM. Jing et al. [52] developed a hybrid model that merges deep learning with investor sentiment analysis, utilising a CNN for sentiment classification and an LSTM for stock price prediction, demonstrating enhanced predictive accuracy for stock prices. Mehtab and Sen [53] used a blend of machine learning and deep learning models with walk-forward validation and grid search for precise short-term forecasting, rather than long-term trends, of NIFTY-50, offering valuable insights for short-term traders. Li and Pan [54] enhanced stock price prediction accuracy by employing an ensemble deep learning model that leveraged stock prices and news data, using LSTM and gated recurrent unit (GRU) networks. Kanwal et al. [55] introduced a hybrid deep learning model combining bidirectional LSTM and one-dimensional CNN for stock price prediction, achieving higher accuracy and efficiency on five distinct stock datasets. Swathi et al. [56] presented a novel model for stock price prediction, leveraging Twitter sentiment analysis with a reported accuracy of 94.73%, showcasing its effectiveness over traditional and other deep learning methods. Ben Ameur et al. [57] utilized deep learning models (LSTM, GRU, RNN, and CNNs) to forecast commodity prices for the Bloomberg Index, demonstrating the superior performance of LSTM models. Baser et al. [58] evaluated gold price prediction using tree-based models, including Decision Trees, AdaBoost, Random Forest, Gradient Boosting, and XGBoost. They demonstrated the superior accuracy of XGBoost through technical indicator analysis. Deepa et al. [59] used statistical and machine learning models for prediction of cotton prices in India and reported that boosted decision tree regression provided the highest accuracy. Zhao and Yang [60] proposed an integrated deep learning framework for stock price movement prediction, which combined sentiment analysis with deep learning models and improved prediction accuracy by incorporating both market data and investor sentiment.
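Walk-forward validation, mentioned above for the NIFTY-50 studies, evaluates a model on successive future blocks while only ever training on earlier data. The sketch below shows one common variant with an expanding training window; the function name and the block sizes are assumptions for illustration, and the cited studies may use a different scheme (for example, a fixed-length rolling window).

```python
from typing import Iterator, List, Tuple


def walk_forward_splits(
    n_samples: int, initial_train: int, test_size: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) with an expanding training window.

    Every test block lies strictly after its training block, so no future
    observation leaks into training.
    """
    train_end = initial_train
    while train_end + test_size <= n_samples:
        train_idx = list(range(0, train_end))
        test_idx = list(range(train_end, train_end + test_size))
        yield train_idx, test_idx
        train_end += test_size  # the tested block joins the next training set


splits = list(walk_forward_splits(n_samples=10, initial_train=4, test_size=2))
for train_idx, test_idx in splits:
    print(f"train 0..{train_idx[-1]}  ->  test {test_idx}")
```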

Table 1 provides a list of sample studies that use statistical, machine learning and deep learning methods for financial time series prediction. We report the models together with error measures such as the mean absolute error (MAE), mean absolute percentage error (MAPE) and root mean squared error (RMSE). We also mention the time periods of data used in these studies.

| Methods | Data | Target predictor | Time range (month/day/year) | Metric |
|---|---|---|---|---|
| LSTM, grid-search [48] | NIFTY 50 index | Price prediction | 12/29/2014-07/31/2020 | RMSE |
| MLP, RNN, LSTM [45] | Stock market trends | Trend prediction | 11/01/2009-11/01/2019 | F1-Score, Accuracy, ROC-AUC |
| LSTM, CNN, empirical mode decomposition, CEEMD [49] | Stock prices | Price prediction | 01/01/2010-09/01/2019 | RMSE, MAE, MAPE |
| CNN, LSTM [52] | Stock prices | Price prediction | 01/01/2017-07/01/2019 | MAPE |
| Linear Regression, Bagging, XGBoost, Random Forests, MLP, SVM, LSTM [53] | NSE stock prices | Short-term price prediction | - | Comparative analysis |
| LSTM with Sentiment Analysis [54] | Stock prices | Price prediction | 12/31/2017-06/01/2018 | MSE, Precision, Recall, F1-Score |
| BD-LSTM, 1D-CNN [55] | Stock prices | Price prediction | 01/01/2000-12/31/2020 | Accuracy, Efficiency |
| Teaching Learning Based Optimization LSTM [56] | Stock prices | Price prediction | - | Accuracy, Precision, Recall, F1-Score |
| LSTM, Gated Recurrent Units, RNN, CNN [57] | Bloomberg Commodity Index | Price prediction | 01/01/2002-12/31/2020 | Accuracy |
| Decision Tree, AdaBoost, Random Forests, Gradient Boosting, XGBoost [58] | Gold prices | Price prediction | 11/18/2011-01/01/2019 | MAE, MSE, RMSE, R2 Score |
| Logistic Regression, Bayesian Linear Regression, Boosted Decision Tree Regression, Random Forest Regression, Poisson Regression [59] | Agriculture material prices | Price prediction | - | MAE, RMSE, RAE, R square |
| LSTM, Ensemble CNN, Denoising Autoencoder, Sentiment Analysis [60] | Stock prices and sentiment | Price movement | 01/01/2002-12/31/2020 | RMSE |
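The three error measures named above (MAE, MAPE and RMSE) recur throughout the reviewed studies. As a reminder of how they differ, a minimal sketch using their standard definitions follows; the toy values are made up for illustration and do not come from any of the cited papers.

```python
import math
from typing import Sequence


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error: average size of the errors, in price units."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error: penalises large errors more heavily than MAE."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


actual = [100.0, 200.0, 400.0]      # toy close prices
predicted = [110.0, 190.0, 380.0]   # toy forecasts
print(mae(actual, predicted), mape(actual, predicted), rmse(actual, predicted))
```

MAPE is scale-free, which makes it convenient when comparing assets with very different price levels, whereas MAE and RMSE are expressed in the price units of the asset.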

2.2 Cryptocurrency prediction

Some researchers have employed machine learning models such as simple neural networks (SNN), also known as backpropagation networks or artificial neural networks [61], support vector machines (SVM) [62], genetic algorithm-based SNN [63], and neuroevolution of augmenting topologies (NEAT) [64], which evolves both the architecture and the parameters of the neural network.

Next, we review some of the machine learning models that are pivotal in predicting cryptocurrency prices. Greaves and Au [65] demonstrated the superiority of neural networks over linear regression, logistic regression, and support vector machines (SVM) [66] for Bitcoin price prediction. Sovbetov [67] examined the effect of market factors, including the S&P 500 Index, on various cryptocurrencies using an autoregressive distributed lag (ARDL) model. Guo et al. [68] improved short-term Bitcoin volatility forecasting with temporal mixture models, outperforming traditional methods. Akcora et al. [69] investigated the predictive Granger causality of chainlets and identified certain types of chainlets that exhibit the highest predictive influence on Bitcoin price and investment risk. Roy et al. [70] used ARIMA, autoregressive, and moving average models to forecast short-run volatility in Bitcoin’s weighted costs. Derbentsev et al. [71] compared binary autoregressive tree (BART), ARIMA, and autoregressive fractional integrated moving average (ARFIMA) models for forecasting Bitcoin, Ethereum, and Ripple prices, where BART had the best accuracy. Kumar et al. [72] and Latif et al. [73] examined the effectiveness of LSTM and ARIMA models in the short-term prediction of BTC prices, demonstrating that while ARIMA models can capture the general trend, LSTM models excel in predicting both the direction and magnitude of price movements, highlighting the potential of deep learning in financial market predictions. Maleki et al. [74] used machine learning models including linear regression, gradient boosting regressor (GBR), support vector regressor (SVR), random forest regressor (RFR) and ARIMA in predicting Bitcoin prices, suggesting new investment strategies in the cryptocurrency market.

Table 2 provides a list of studies focused on using traditional statistical and machine learning methods to predict cryptocurrency trends. The table reports metrics such as the mean absolute error (MAE), mean absolute percentage error (MAPE) and root mean squared error (RMSE). We also mention the time periods of data used in these papers.

| Methods | Cryptocurrency (type) | Target predictor | Time range (month/day/year) | Metric |
|---|---|---|---|---|
| Linear regression, Logistic regression, Neural Networks, SVM [65] | BTC | Future price | prior-07/04/2013 | MSE, Accuracy |
| Autoregressive Distributed Lag [67] | BTC, ETH, Dash, LTC, Monero | Short-Long term price | 01/01/2010-01/12/018 | ADF test price |
| Temporal mixture models [68] | BTC | Short-term volatility | 09/01/2015-04/01/2017 | RMSE, MAE |
| k-Chainlets [69] | BTC | Close Price | 01/01/2009-01/01/2018 | RMSE, wallet gain |
| ARIMA, Autoregressive, Moving average [70] | BTC | Market price | 07/31/2013-08/01/2017 | RMSE |
| BART, ARIMA, ARFIMA [71] | BTC, Ripple, ETH | Short-term price | 01/01/2017-03/01/2019 | RMSE |
| ARIMA, LSTM [72] | ETH | Close Price | 01/01/2016-12/31/2021 | MSE |
| ARIMA, LSTM [73] | BTC | Short-term price | 12/21/2020-12/21/2021 | MAPE, MAE, RMSE |
| Logistic Regression, Gradient boosting regressor, SVR, Random forest regressor, ARIMA [74] | BTC, ETH, ZEC, LTC | Close price | 04/01/2018-03/31/2019 | MSE, MAPE, MAE, AIC, BIC |

2.3 Deep learning models for cryptocurrency prediction

In recent years, deep learning models have been prominent in the prediction of cryptocurrencies. Jiang and Liang [38] combined CNNs with reinforcement learning [75] for portfolio management, utilising historical cryptocurrency pricing data to allocate assets optimally within specified portfolio constraints. Wu et al. [35] improved Bitcoin prediction accuracy by using autoregressive characteristics in an LSTM network, outperforming the standard LSTM. Lee et al. [76] introduced a novel approach employing inverse reinforcement learning coupled with agent-based modeling for Bitcoin price prediction. Ly et al. [77] employed LSTM networks to predict Bitcoin trends, demonstrating the models’ capability to forecast price changes and classify market movements with varying degrees of accuracy. Saad et al. [13] found LSTM to be the most accurate in forecasting Bitcoin prices compared to various machine learning models. Lucarelli and Borrotti [78] investigated automated cryptocurrency trading using deep reinforcement learning, employing double deep Q-learning networks trained with Sharpe ratio rewards, which outperformed traditional models in Bitcoin trading. Lahmiri and Bekiros [79] compared LSTM networks with generalized regression neural networks (GRNN) to forecast cryptocurrency prices, revealing the chaotic dynamics and fractality in digital currencies’ time series. Patel et al. [80] introduced a hybrid LSTM with gated recurrent unit model for Litecoin and Monero and achieved higher accuracy than a simple LSTM model. Livieris et al. [33] combined deep learning and ensemble learning to forecast trends and prices of Bitcoin, Ethereum, and Ripple; the LSTM, bidirectional LSTM, and CNN models demonstrated the capability to deliver precise and dependable predictions. Marne et al. [81] used RNN and LSTM models to predict Bitcoin prices, which showed better results than machine learning models. Nasekin and Chen [82] analysed cryptocurrency investor sentiment using a CNN for sentiment classification and index construction from StockTwits messages. Sridhar and Sanagavarapu [39] employed a Transformer model for Dogecoin price prediction, demonstrating the model’s capability to capture both short-term and long-term dependencies effectively. Betancourt and Chen [83] proposed the utilization of deep reinforcement learning (DRL) [84] for the dynamic management of cryptocurrency asset portfolios, accommodating portfolios comprising an evolving number of cryptocurrency assets. Shahbazi and Byun [85] applied reinforcement learning for forecasting Litecoin and Monero market values. D’Amato et al. [86] employed a Jordan RNN to enhance the prediction of cryptocurrency volatility, demonstrating superior accuracy over traditional machine learning models for Bitcoin, Ripple, and Ethereum. Schnaubelt [87] applied reinforcement learning to develop cryptocurrency trading strategies. Parekh et al. [88] combined LSTM and sentiment analysis to predict cryptocurrency prices, integrating market sentiment from social media for enhanced forecasting accuracy. Kim et al. [89] applied a self-attention-based multiple LSTM model and improved the prediction accuracy for Bitcoin. Goutte et al. [90] used LSTM networks with technical analysis to enhance cryptocurrency trading strategies, particularly focusing on Bitcoin.

Table 3 provides an overview of sample research that focuses on applying deep learning techniques to forecast the trend of cryptocurrencies. The table reports metrics such as the mean absolute error (MAE), mean absolute percentage error (MAPE) and root mean squared error (RMSE). We also mention the time periods of data used in these papers.

| Deep learning techniques | Cryptocurrency (type) | Target predictor | Time range (month/day/year) | Metric |
|---|---|---|---|---|
| CNN, Reinforcement Learning [38] | BTC, ETH, XRP | Close Price | 01/01/2018-02/28/2019 | RMSE, Accuracy, AUC, F1 |
| LSTM with autoregressive characteristics [35] | BTC | Short-Long term price | 01/01/2018-07/28/2018 | MSE, RMSE, MAPE |
| Inverse Reinforcement Learning, Agent-based Model [76] | BTC | Close Price | 09/01/2016-07/31/2017 | MAE, MSE, RMSE, MAPE |
| RNN, LSTM [77] | BTC | Trend Prediction | - | - |
| Reinforcement Learning, LSTM, Conjugate Gradient [13] | BTC, ETC | Close price | 10/01/2015-05/01/2018 | RMSE, MAE |
| Deep Reinforcement Learning [78] | BTC | Trading | - | Profit-based Metrics |
| Chaotic Neural Networks [79] | BTC, DASH, XRP | Price Forecasting | 07/16/2010-10/01/2018 | - |
| LSTM with Gated Recurrent Units [80] | LTC, Monero | Short-Long term price | 01/30/2015-02/23/2020 | Accuracy |
| LSTM, BD-LSTM, CNN [33] | BTC, ETH, XRP | Short-Long term price | 01/01/2018-08/31/2019 | MSE, RMSE, MAPE |
| RNN, LSTM [81] | BTC | Close Price | 01/01/2014-01/31/2019 | RMSE |
| CNN [82] | Various | Sentiment Analysis | 03/01/2013-05/31/2018 | - |
| Transformer [39] | DOGE | Close Price | 07/05/2019-04/28/2021 | Accuracy, R-squared |
| Deep Reinforcement Learning [83] | Various | Portfolio Management | 08/17/2017-11/01/2019 | Total Return, Sharpe Ratio |
| Reinforcement Learning [85] | LTC, Monero | Price Prediction | 2016-2020 | MAE, MSE, RMSE, MAPE |
| Jordan RNN [86] | BTC, XRP, ETH | Volatility | 04/28/2013-12/15/2019 | MSE, MAPE |
| Deep Reinforcement Learning [87] | Various | Limit Order Placement | 01/01/2018-06/30/2019 | Total Return, Sharpe Ratio |
| LSTM, sentiment analysis [88] | Dash, BTC-Cash | Price Prediction | - | MSE, MAE, MAPE |

2.4 Cryptocurrency volatility and prediction

Several researchers concentrate on analyzing and predicting the volatility of cryptocurrencies. Volatility in the cryptocurrency market is a significant factor that influences numerous decisions in business and finance [91]. Recently, volatility spillovers have been identified between the cryptocurrency market and other financial markets [86]. Katsiampa [16] employed an Asymmetric Diagonal BEKK model to examine the volatility dynamics in the cryptocurrency market, revealing significant interdependencies and responsiveness to major news in the volatility levels of major cryptocurrencies such as Bitcoin, Ether, Ripple, Litecoin, and Stellar Lumen. Woebbeking [92] developed the CVX index using a model-free approach derived from cryptocurrency option prices, unveiling that cryptocurrency volatility often diverges from traditional financial markets and is distinctly reactive to major market events. Yen and Cheng [91] utilized stochastic volatility models to analyze the impact of the Economic Policy Uncertainty (EPU) index on cryptocurrency volatility, finding that China’s EPU uniquely predicts the volatility of Bitcoin and Litecoin, suggesting these cryptocurrencies might serve as hedging tools against EPU risks. Cross, Hou, and Trinh [93] utilized a time-varying parameter model to explore the returns and volatility of cryptocurrencies during the 2017–18 bubble, highlighting a significant risk premium effect in Litecoin and Ripple and identifying adverse news effects as key drivers of the 2018 crash across Bitcoin, Ethereum, Litecoin, and Ripple. Ftiti, Louhichi, and Ben Ameur [94] utilized heterogeneous autoregressive (HAR) models with high-frequency data to explore cryptocurrency volatility during the COVID-19 pandemic. Their findings underscore the predictive superiority of models incorporating both positive and negative semi-variances, especially during the crisis, suggesting these models can effectively capture the asymmetric dynamics of market volatility. Yin, Nie, and Han [95] applied the Generalized Autoregressive Conditional Heteroskedasticity - Mixed Data Sampling (GARCH-MIDAS) model to explore the influence of oil market shocks on the volatility of Bitcoin, Ethereum, and Ripple. Their analysis revealed that oil market shocks, of both supply and demand types, significantly affect the long-term volatility of these cryptocurrencies, thereby suggesting potential hedging capabilities against oil-induced economic uncertainties.
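The positive and negative semi-variances mentioned above split realised variance into the parts contributed by upward and downward price moves. The sketch below uses the standard definitions (sums of squared positive and squared negative returns); the function names and toy returns are assumptions for illustration and do not reproduce the HAR specification of the cited study.

```python
import math
from typing import List, Tuple


def log_returns(prices: List[float]) -> List[float]:
    """Log returns r_t = ln(P_t / P_{t-1}) from a series of prices."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


def realised_measures(returns: List[float]) -> Tuple[float, float, float]:
    """Return (realised variance, positive semi-variance, negative semi-variance).

    Realised variance is the sum of squared returns; the semi-variances keep
    only the squared returns with the corresponding sign.
    """
    rs_pos = sum(r * r for r in returns if r > 0)
    rs_neg = sum(r * r for r in returns if r < 0)
    return rs_pos + rs_neg, rs_pos, rs_neg


intraday_returns = [0.01, -0.02, 0.03]  # toy high-frequency returns
rv, rs_pos, rs_neg = realised_measures(intraday_returns)
```

A model that treats the two semi-variances as separate predictors can respond differently to downside and upside moves, which is the asymmetry referred to above.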

There is also some research on the prediction of cryptocurrency volatility. Catania, Grassi, and Ravazzolo [96] employed a score-driven Generalized Hyperbolic Skew Student’s t (GHSKT) model to analyze and predict the volatility of Bitcoin, Ethereum, Litecoin, and Ripple. They demonstrated that accounting for long memory and asymmetric reactions to past shocks enhances the model’s predictive accuracy significantly across various forecast horizons. Catania and Grassi [97] further employed the GHSKT model demonstrating that the model’s ability to incorporate higher-order moments and leverage effects significantly enhances the accuracy of volatility forecasts across various cryptocurrencies. Ma et al. [98] employed the Markov Regime-Switching Mixed Data Sampling (MRS-MIDAS) model to forecast cryptocurrency volatility, particularly focusing on Bitcoin. They enhanced the standard MIDAS approach by incorporating jump-driven time-varying transition probabilities, which allowed the model to capture dynamic changes in volatility states influenced by market jumps.

Table 4 provides an overview of sample research that focuses on cryptocurrency volatility and prediction. The table reports metrics such as the mean absolute error (MAE), mean absolute percentage error (MAPE) and root mean squared error (RMSE). We also mention the time periods of data used in these papers.

| Methods | Data | Target predictor | Time range (month/day/year) | Metric |
|---|---|---|---|---|
| Asymmetric Diagonal BEKK model [16] | BTC, ETH, XRP, LTC, XLM | Volatility dynamics | 08/07/2015-02/10/2018 | Past squared errors, past conditional volatility |
| Model-free volatility, CVX index [92] | BTC | Volatility dynamics | 02/06/2020-07/06/2021 | Not specified |
| Stochastic volatility models [91] | BTC, LTC | EPU impact on cryptocurrency volatility | 02/2014-06/2019 | Not specified |
| Time-varying parameter stochastic volatility model [93] | BTC, ETH, LTC, XRP | Returns and volatility dynamics | 01/2017-01/2019 | Forecast accuracy, MSFE, ALPL |
| Heterogeneous autoregressive models [94] | BTC, ETH, ETC, XRP | Volatility forecasting | 04/01/2018-06/30/2020 | MSE, MAE, MAPE |
| GARCH-MIDAS model [95] | BTC, ETH, XRP | Impact of oil market shocks on volatility | 04/28/2013-12/31/2018 | MAE, MAPE, RMSE |
| Score-driven Generalized Hyperbolic Skew Student’s t model [96] | BTC, ETH, LTC, XRP | Volatility forecasting | 04/29/2013-12/01/2017 | Quasi-Like loss function |
| Score-driven Generalized Hyperbolic Skew Student’s t model [97] | BTC, ETH, LTC, XRP | Volatility forecasting | - | MSE, MAE, MAPE |
| Markov Regime-Switching Mixed Data Sampling model [98] | BTC | Volatility forecasting | 03/01/2013-09/29/2018 | Quasi-Like loss function, MSE, MAE |
| GARCH-MIDAS model [99] | Gold, Silver | Cryptocurrency uncertainty impact on precious metal volatility | 01/02/2014-05/13/2022 | Diebold-Mariano test, R-square, Model Confidence Set test, Direction-of-Change rate test |

3 Methodology: Implementation and Evaluation

3.1 Conventional models

3.1.1 ARIMA

The ARIMA model, often known as the Box-Jenkins model [100], is a commonly used statistical/econometric model for forecasting time series data. The ARIMA model consists of three components: autoregressive (AR), integrated (I), and moving average (MA). The integrated component represents the amount of differencing required to transform the series into a stationary representation. The autoregressive component describes the relationship between the present value of a time series and its previous values, capturing their correlation. The moving average component relates the current observation to previous forecast error terms, which helps the model capture stochastic variations in the time series. The three components correspond to the three parameters $p$, $d$, and $q$ of the model: $p$ is the number of lagged observations in the autoregressive part, $d$ is the order of differencing, which forms the integrated part, and $q$ is the number of lagged forecast errors in the moving average part.
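For reference, the roles of $p$, $d$, and $q$ can be seen in the standard textbook form of the ARIMA($p$, $d$, $q$) model. The notation below (lag operator $L$, coefficients $\phi_i$ and $\theta_j$, constant $c$, and error term $\varepsilon_t$) is assumed here for illustration and does not appear in the original text.

```latex
% General ARIMA(p, d, q) model, with L y_t = y_{t-1}
\left(1 - \sum_{i=1}^{p} \phi_i L^i\right) (1 - L)^d \, y_t
  = c + \left(1 + \sum_{j=1}^{q} \theta_j L^j\right) \varepsilon_t

% Special case ARIMA(1, 1, 0): an AR(1) model on the first differences
y_t - y_{t-1} = c + \phi_1 \left(y_{t-1} - y_{t-2}\right) + \varepsilon_t
```

In the special case, one round of differencing ($d = 1$) removes a trend in the price level, and the single autoregressive term ($p = 1$) models the dependence between consecutive price changes.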

3.1.2 Multilayer perceptron

A simple neural network, also known as the multilayer perceptron (MLP), is a machine learning model that features an input layer, an output layer and at least one hidden layer. Figure 1 illustrates the architecture of the MLP. The MLP needs a training algorithm to update the weights and biases so that the output (prediction) of the network resembles the actual observations (training data). Each unit in the hidden and output layers computes a weighted sum of its inputs and applies an activation function, given by

\begin{equation}
\begin{split}
h_{W,b}(x) &= f(Wx+b)\\
&= f\left(\sum_{i=1}^{n} W_{i}x_{i}+b\right)
\end{split}
\tag{1}
\end{equation}

where $x$ is the input, $f(\cdot)$ is the activation function, $b$ is the bias, $n$ is the number of input units and $W_{i}$ is the weight associated with input $x_{i}$.
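Equation (1) can be sketched for one hidden layer and one output unit as follows. The layer sizes, the sigmoid activation and the random weights are assumptions made for illustration; they are not the configuration used in this study.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def dense_layer(x, W, b, f=sigmoid):
    """Equation (1): h = f(Wx + b) for a single layer."""
    return f(W @ x + b)

rng = np.random.default_rng(0)
x = np.array([0.2, 0.5, 0.1])            # n = 3 input units (illustrative values)
W_hidden = rng.normal(size=(4, 3))        # 4 hidden units
b_hidden = np.zeros(4)
W_out = rng.normal(size=(1, 4))           # 1 output unit
b_out = np.zeros(1)

hidden = dense_layer(x, W_hidden, b_hidden)
# A linear (identity) output activation is a common choice for regression.
prediction = dense_layer(hidden, W_out, b_out, f=lambda a: a)
```

Training would then adjust `W_hidden`, `b_hidden`, `W_out` and `b_out` so that `prediction` approaches the observed target.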

[Figure 1: architecture of the MLP]

3.2 Deep learning models

3.2.1 Variants of LSTM networks

RNNs are well known for modelling temporal sequences. They are distinguished by their context layers, which retain information from prior inputs to influence future outputs. There are several simple RNN architectures, such as the Elman RNN [47] (also known as the simple RNN), which was one of the earliest attempts at effectively modelling temporal sequences. Figure 2 shows the architecture of the Elman RNN. Trainable weights connect each pair of adjacent layers. A context (state or memory) layer stores the output of the state neurons computed at previous time steps, which makes the network appropriate for capturing time-varying patterns in data.

[Figure 2: architecture of the Elman RNN]

However, simple RNNs are difficult to train because of the vanishing gradient problem [101], which arises when handling long-term dependencies in sequence data. The LSTM network is considered an enhanced version of the RNN [36]. It overcomes the vanishing gradient constraint by using memory cells in the hidden layer that improve its ability to retain long-term dependencies. We present the architecture of the LSTM network in Figure 3, showing how information is passed through the LSTM memory cells in the hidden layer. The LSTM cell is designed as a unit that can retain input information over long periods, thereby addressing the problem of learning long-term dependencies in sequence data. The LSTM cell calculates a hidden state output $h_{t}$ by

\begin{equation}
\begin{split}
f_{t} &= \sigma(W_{f}[h_{t-1},x_{t}]+b_{f})\\
i_{t} &= \sigma(W_{i}[h_{t-1},x_{t}]+b_{i})\\
o_{t} &= \sigma(W_{o}[h_{t-1},x_{t}]+b_{o})\\
z &= \tanh(W_{z}[h_{t-1},x_{t}]+b_{z})\\
C_{t} &= f_{t}*C_{t-1}+i_{t}*z\\
h_{t} &= o_{t}*\tanh(C_{t})
\end{split}
\tag{2}
\end{equation}

where $f_{t}$, $i_{t}$ and $o_{t}$ refer to the forget gate, input gate and output gate, respectively. $W$ denotes the weight matrices, which are adjusted during learning along with the biases $b$. $x_{t}$ is the input vector at time $t$ (its length is the number of input features), and $h_{t}$ is the hidden state (its length is the number of hidden units). $[h_{t-1},x_{t}]$ denotes the concatenation of the previous hidden state and the current input, and $*$ denotes element-wise multiplication. $z$ is the intermediate (candidate) cell state, and $C_{t}$ is the current cell memory.
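A single time step of the LSTM cell in Equation (2) can be traced with a short NumPy sketch. The hidden size, the random initial weights and the four input values are assumptions for illustration only; `lstm_step` and the `params` dictionary are hypothetical names.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step following Equation (2)."""
    concat = np.concatenate([h_prev, x_t])                  # [h_{t-1}, x_t]
    f_t = sigmoid(params["W_f"] @ concat + params["b_f"])   # forget gate
    i_t = sigmoid(params["W_i"] @ concat + params["b_i"])   # input gate
    o_t = sigmoid(params["W_o"] @ concat + params["b_o"])   # output gate
    z = np.tanh(params["W_z"] @ concat + params["b_z"])     # candidate cell state
    c_t = f_t * c_prev + i_t * z                            # element-wise products
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

n_features, n_hidden = 1, 4
rng = np.random.default_rng(0)
params = {}
for gate in "fioz":
    params[f"W_{gate}"] = rng.normal(scale=0.1, size=(n_hidden, n_hidden + n_features))
    params[f"b_{gate}"] = np.zeros(n_hidden)

# Run the cell over a short, made-up sequence of scaled prices.
h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for price in [0.10, 0.12, 0.11, 0.15]:
    h, c = lstm_step(np.array([price]), h, c, params)
```

The same `params` are reused at every time step, which is what allows the cell state to carry information across the sequence.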

[Figure 3: architecture of the LSTM network]

The bi-directional LSTM (BD-LSTM) is an extension of the LSTM that processes information in both directions with two independent hidden layers [102]. The basic idea is that each input sequence passes through the network twice, once in the forward direction and once in the reverse direction. This bidirectional architecture provides the output layer with complete past and future context information for each point in the input sequence. The structure of the BD-LSTM is shown in Figure 4. In contrast to the LSTM, the BD-LSTM is more effective for problems that require context from both temporal directions. This is particularly evident in certain applications within natural language processing [103] and speech recognition [104].

[Figure 4: structure of the BD-LSTM]

The encoder-decoder LSTM (ED-LSTM) outputs a required sequence based on an input sequence, where the lengths of the two sequences can differ [105]. The ED-LSTM makes specific architectural changes to the original LSTM to better handle the class of problems known as sequence-to-sequence problems. The ED-LSTM is well suited to machine translation, where a sentence in one language is mapped to a sentence in another [106]. We present the ED-LSTM architecture in Figure 5.

[Figure 5: architecture of the ED-LSTM]

3.2.2 Convolutional neural networks

CNNs are among the most prominent deep learning models and were initially designed for computer vision and image processing tasks [107, 108, 109, 110]. Their applications span diverse areas, notably detection [111] and segmentation tasks [112], where they have shown superior accuracy when compared to traditional machine learning models. A CNN typically comprises several layers, including convolutional, pooling, and fully connected layers.

Subsequently, the fully connected layer, akin to conventional neural networks, ensures a dense interconnection between the nodes of consecutive layers. CNNs identify hierarchical patterns (features) in the data through iterative convolution and pooling, culminating in a fully connected layer that consolidates these features for the final task output. This structural design has been pivotal for their proficiency in handling tasks related to image processing.

Given that our dataset consists of univariate time series of prices, it is crucial to modify the conventional two-dimensional CNN to suit our problem. Consequently, we have integrated a specialised function into our model that processes a set of price inputs along with specifications such as the filter count, filter width, and stride length, while the kernel height remains irrelevant. This function initialises the filter values using a Gaussian distribution and sets the biases to zero. It generates several matrices as outputs, one for each filter. These matrices contribute to feature extraction within the CNN model and, after the activation function is applied, serve as inputs to the subsequent pooling layer. We note that in the case of multivariate time series data, the conventional 2D-CNN can be utilised.
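A one-dimensional convolution of this kind can be sketched as follows. This is an illustrative reconstruction from the description above, not the function used in the study: the name `conv1d_features`, the unit standard deviation of the Gaussian initialisation and the toy input are assumptions.

```python
import numpy as np

def conv1d_features(series, n_filters, width, stride=1, seed=0):
    """Slide n_filters filters of the given width along a 1D series.

    Filters are drawn from a Gaussian distribution and biases start at zero.
    Returns one feature map (row) per filter, plus the filters themselves.
    """
    series = np.asarray(series, dtype=float)
    rng = np.random.default_rng(seed)
    filters = rng.normal(loc=0.0, scale=1.0, size=(n_filters, width))
    biases = np.zeros(n_filters)
    n_out = (len(series) - width) // stride + 1
    feature_maps = np.empty((n_filters, n_out))
    for k in range(n_filters):
        for j in range(n_out):
            window = series[j * stride:j * stride + width]
            feature_maps[k, j] = filters[k] @ window + biases[k]
    return feature_maps, filters

series = np.arange(10.0)    # stand-in for a scaled close-price series
feature_maps, filters = conv1d_features(series, n_filters=2, width=3, stride=2)
```

With 10 inputs, a filter width of 3 and a stride of 2, each filter produces (10 - 3) // 2 + 1 = 4 outputs, so `feature_maps` has shape (2, 4).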

The activation function is essential to model performance. Activation functions such as the hyperbolic tangent (Tanh), rectified linear unit (ReLU), Sigmoid, and leaky ReLU are typically employed in CNNs. We opt for ReLU and Leaky ReLU, which are prominent in the literature and help avoid vanishing gradients, as given below.

\begin{align}
\text{ReLU}(z) &= \begin{cases}0 & \text{if } z\leq 0,\\ z & \text{if } z>0,\end{cases} \tag{3}\\
\text{Leaky-ReLU}(z_{i}) &= \begin{cases}\alpha_{i}z_{i} & \text{if } z_{i}\leq 0,\\ z_{i} & \text{if } z_{i}>0,\end{cases} \notag
\end{align}

where $z$ and $z_{i}$ are convolution outcomes, and $\alpha_{i}$ is a user-defined hyperparameter for convolutional layer $i$, typically starting at 0.01. We select ReLU for the initial convolutional layer to address the vanishing gradient issue, followed by Leaky-ReLU in subsequent layers, as shown in Figure 6. We train the CNN model by minimising the error defined by the loss function using the Adam optimiser [113] with a user-defined learning rate $\lambda=0.0001$.
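Both activation functions in Equation (3) are one-liners in NumPy. The sketch below uses the default $\alpha = 0.01$ mentioned in the text; the input values are arbitrary.

```python
import numpy as np

def relu(z):
    """ReLU: 0 for z <= 0, z otherwise."""
    return np.where(z > 0, z, 0.0)

def leaky_relu(z, alpha=0.01):
    """Leaky ReLU: alpha * z for z <= 0, z otherwise."""
    return np.where(z > 0, z, alpha * z)

z = np.array([-2.0, 0.0, 3.0])
relu_out = relu(z)          # negative input is clipped to zero
leaky_out = leaky_relu(z)   # negative input is scaled by alpha instead
```

The small negative slope of Leaky ReLU keeps a non-zero gradient for negative inputs, which is the property that motivates its use in the later layers.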

[Figure 6: CNN architecture with ReLU in the initial convolutional layer and Leaky-ReLU in subsequent layers]

3.2.3 Convolutional LSTM networks

The convolutional LSTM (Conv-LSTM) network [114] was initially introduced for weather forecasting problems. This network extends the original fully connected LSTM by replacing the matrix multiplications of the LSTM cell with convolutions. We use $*$ to denote the convolution operation and $\circ$ to denote the Hadamard product. The key equations of the Conv-LSTM cell are:

\begin{equation}
\begin{split}
f_{t} &= \sigma(W_{xf}*x_{t}+W_{hf}*h_{t-1}+W_{cf}\circ c_{t-1}+b_{f})\\
i_{t} &= \sigma(W_{xi}*x_{t}+W_{hi}*h_{t-1}+W_{ci}\circ c_{t-1}+b_{i})\\
o_{t} &= \sigma(W_{xo}*x_{t}+W_{ho}*h_{t-1}+W_{co}\circ c_{t}+b_{o})\\
c_{t} &= f_{t}\circ c_{t-1}+i_{t}\circ\tanh(W_{xc}*x_{t}+W_{hc}*h_{t-1}+b_{c})\\
h_{t} &= o_{t}\circ\tanh(c_{t})
\end{split}
\tag{4}
\end{equation}

where $f_{t}$, $i_{t}$, $o_{t}$ and $h_{t}$ refer to the forget gate, input gate, output gate and hidden state, respectively. $W$ denotes the weight matrices, which are adjusted during learning along with the biases $b$. The previous cell state $c_{t-1}$ can be partially "forgotten" in the process through the forget gate, and $c_{t}$ is the current cell memory. These equations are similar to Equation 2. The Conv-LSTM model can capture both the spatial and temporal relationships in the data at the same time, resulting in more precise predictions. In our implementation, we utilise 1D convolutions in the Conv-LSTM for univariate time series and 2D convolutions for multivariate time series forecasting.

3.2.4 Transformer Networks

[Figure 7: framework of the Transformer model, showing the encoder, the decoder and the attention mechanisms]

The Transformer model is an extension of the encoder-decoder LSTM architecture, which has been widely used in machine translation problems [115]. The encoder condenses the essential data of the input sequence into a vector of fixed length, which is subsequently transformed into an output by the decoder [105]. The design of the decoder offers a method for managing lengthy sequential data [116].

Analogously, we input the sequential data to a vector representation layer. Given the input sequence $X=\{x_{i}:i=1,\ldots,N\}\in\mathbb{R}^{N}$, the $m$-dimensional embedding layer yields a matrix $B\in\mathbb{R}^{N\times m}$ through a dense network.

We need to incorporate temporal encoding with the vectorised input to encapsulate the temporal structure of the time series. Hence, employing sine and cosine functions at distinct frequencies to represent temporal information, we define:

\begin{align*}
\text{TE}_{(i,2k)} &= \sin\left(i/10000^{2k/m}\right),\\
\text{TE}_{(i,2k+1)} &= \cos\left(i/10000^{2k/m}\right),
\end{align*}

where $1\leq 2k\leq m$. The temporal encoding is hence $\text{TE}\in\mathbb{R}^{N\times m}$. The vector representations alongside the temporal encodings are then concatenated and provided to the encoder layers.
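The temporal encoding can be computed directly from the two formulas. The sketch below is an illustration under two assumptions that the text does not fix: positions are indexed from 0, and the embedding `B` is a placeholder matrix of zeros standing in for the output of the dense embedding layer.

```python
import numpy as np

def temporal_encoding(N, m):
    """Sinusoidal temporal encoding TE of shape (N, m)."""
    positions = np.arange(N)[:, None]            # i = 0, ..., N-1 (assumed 0-based)
    even_cols = np.arange(0, m, 2)[None, :]      # column indices 2k
    angles = positions / np.power(10000.0, even_cols / m)
    te = np.zeros((N, m))
    te[:, 0::2] = np.sin(angles)                 # TE(i, 2k)
    te[:, 1::2] = np.cos(angles[:, :m // 2])     # TE(i, 2k+1)
    return te

N, m = 5, 4
te = temporal_encoding(N, m)
B = np.zeros((N, m))                             # placeholder embedding
encoder_input = np.concatenate([B, te], axis=1)  # concatenation described in the text
```

Each column pair uses a different frequency, so every position receives a distinct pattern that the attention layers can use to recover temporal order.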

A concise overview of the complete framework of our Transformer model is given in Figure 7. The encoder depicted in Figure 7 consists of $M$ identically structured layers. Each layer is equipped with two sub-layers: a multi-head self-attention mechanism and a fully connected feed-forward network. Both sub-layers incorporate residual connections and normalization to enhance their functionality. The decoder, also shown in Figure 7, mirrors the encoder's structure with a notable distinction: it features an additional multi-head self-attention layer. Unlike the original decoder described in [117], this version omits the masked attention mechanism because it processes only observed historical data, which does not include future information.

The emergence of attention mechanisms marks a pivotal innovation in deep learning: they focus computation on the most relevant parts of the input, in loose analogy to attention in human cognition. Vaswani et al. [117] revolutionized this approach by introducing the Transformer architecture, predicated on the exclusive use of self-attention mechanisms. The self-attention mechanism is defined as:

\begin{equation}
\text{Attention}(P,R,S)=\text{softmax}\left(\frac{PR^{\top}}{\sqrt{m}}\right)S,
\tag{5}
\end{equation}

where $P,R,S\in\mathbb{R}^{N\times m}$ correspond to the query, key, and value matrices derived from three separate linear transformations of the same input. The architecture of the self-attention mechanism is illustrated in Figure 7.
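Equation (5) amounts to a few matrix operations. The following single-head sketch uses random projection matrices and a random input purely for illustration; `W_p`, `W_r` and `W_s` are hypothetical names for the three linear transformations.

```python
import numpy as np

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)   # subtract the max for numerical stability
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(B, W_p, W_r, W_s):
    """Single-head self-attention following Equation (5)."""
    P, R, S = B @ W_p, B @ W_r, B @ W_s       # query, key and value matrices
    m = P.shape[-1]
    weights = softmax(P @ R.T / np.sqrt(m))   # (N, N), each row sums to 1
    return weights @ S, weights

rng = np.random.default_rng(0)
N, m = 5, 4
B = rng.normal(size=(N, m))                   # stand-in for the encoded input
W_p, W_r, W_s = (rng.normal(size=(m, m)) for _ in range(3))
output, weights = self_attention(B, W_p, W_r, W_s)
```

Row $i$ of `weights` shows how strongly time step $i$ attends to every other time step; multi-head attention repeats this computation with several independent sets of projections.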

The self-attention mechanism has transformed the strategy of focusing on vital local content within the data. Vaswani et al. [117] expanded this idea by proposing multi-head attention, whereby several self-attention processes, or "heads", are executed in parallel, each operating on differently projected versions of the queries, keys, and values. The combined outcomes of these heads are then linearly transformed to obtain the final output, as shown in Figure 7.

3.2.5 Model training with Adam Optimiser

We utilise the modified Adam (adaptive moment estimation) optimiser [113], which is an extension of stochastic gradient descent [118, 119] and further extends adaptive gradient methods (AdaGrad [120], AdaDelta [121], and RMSProp [122]). Adam is an adaptive gradient-based optimisation algorithm that computes individual adaptive learning rates for different parameters from the history of the first and second moments (mean and variance) of the gradients. Let $g_{k}\circ g_{k}$ signify the element-wise square of $g_{k}$.

Input:

  1. Step size, $\alpha$

  2. Exponential decay rates for the moment estimates, $\beta_{a},\beta_{b}\in[0,1)$

  3. Stochastic objective function with parameters, $f(\xi)$

  4. Initial parameter vector, $\xi_{0}$

Initialise:

\begin{align*}
m_{0} &\leftarrow 0\quad\text{(Initialize 1st moment vector)}\\
v_{0} &\leftarrow 0\quad\text{(Initialize 2nd moment vector)}\\
k &\leftarrow 0\quad\text{(Initialize timestep)}
\end{align*}

Algorithm:

\begin{align*}
&\text{while }\xi_{k}\text{ not converged do}\\
&\quad k\leftarrow k+1\\
&\quad g_{k}\leftarrow\nabla_{\xi}f(\xi_{k-1})\\
&\quad m_{k}\leftarrow\beta_{a}\cdot m_{k-1}+(1-\beta_{a})\cdot g_{k}\\
&\quad v_{k}\leftarrow\beta_{b}\cdot v_{k-1}+(1-\beta_{b})\cdot g_{k}\circ g_{k}\\
&\quad \hat{m}_{k}\leftarrow\frac{m_{k}}{(1-\beta_{a}^{k})}\\
&\quad \hat{v}_{k}\leftarrow\frac{v_{k}}{(1-\beta_{b}^{k})}\\
&\quad \xi_{k}\leftarrow\xi_{k-1}-\alpha\cdot\frac{\hat{m}_{k}}{(\sqrt{\hat{v}_{k}}+\epsilon^{\prime})}\\
&\text{end while}
\end{align*}

Output: Resulting parameters $\xi_{k}$

Adam optimisation aims to minimise the expected value of a differentiable stochastic objective function $f(\xi)$, such as the loss of a neural network whose parameters $\xi$ are the weights and biases. The algorithm updates the exponential moving averages of the gradient $(m_{k})$ and its square $(v_{k})$, with the hyperparameters $\beta_{a},\beta_{b}$ controlling their decay rates. The algorithm can be made more efficient by folding the two bias corrections into the step size, using $\alpha_{k}=\alpha\sqrt{1-\beta_{b}^{k}}/(1-\beta_{a}^{k})$ for the parameter update. We note that vector operations are performed element-wise.

The key to Adam's update mechanism is the adaptive step size, influenced by the signal-to-noise ratio $\hat{m}_{k}/\sqrt{\hat{v}_{k}}$, which dictates the magnitude of parameter updates. This feature allows for effective scaling of steps in parameter space, contributing to the robustness and versatility of the algorithm in various optimisation contexts.
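The Adam update loop can be written compactly in NumPy. In this sketch a fixed number of steps replaces the convergence test, and the quadratic objective, the step size of 0.05 and the default decay rates (0.9 and 0.999, as suggested in [113]) are assumptions chosen for illustration; `adam_minimise` is a hypothetical name.

```python
import numpy as np

def adam_minimise(grad_fn, xi0, alpha=0.001, beta_a=0.9, beta_b=0.999,
                  eps=1e-8, n_steps=1000):
    """Run n_steps Adam updates and return the final parameters."""
    xi = np.asarray(xi0, dtype=float).copy()
    m = np.zeros_like(xi)                       # 1st moment estimate
    v = np.zeros_like(xi)                       # 2nd moment estimate
    for k in range(1, n_steps + 1):
        g = grad_fn(xi)
        m = beta_a * m + (1 - beta_a) * g
        v = beta_b * v + (1 - beta_b) * g * g   # element-wise square of g
        m_hat = m / (1 - beta_a ** k)           # bias-corrected moments
        v_hat = v / (1 - beta_b ** k)
        xi = xi - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return xi

# Toy objective f(xi) = ||xi - target||^2 with gradient 2 * (xi - target).
target = np.array([3.0, -1.0])
grad = lambda xi: 2 * (xi - target)

# After bias correction the first update has magnitude close to alpha
# in every coordinate, regardless of the gradient scale.
first_step = adam_minimise(grad, xi0=[0.0, 0.0], alpha=0.05, n_steps=1)
xi_final = adam_minimise(grad, xi0=[0.0, 0.0], alpha=0.05, n_steps=2000)
```

The first step illustrates the scaling property discussed above: the update depends on the ratio $\hat{m}_{k}/\sqrt{\hat{v}_{k}}$ rather than on the raw gradient magnitude.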

3.3 Data

We choose four cryptocurrencies to evaluate the performance of the respective statistical and deep learning models: Bitcoin, Ethereum, Dogecoin and Litecoin. We focus on multi-step-ahead close-price forecasting, where a step is one day. Bitcoin is the first and most prominent cryptocurrency, launched in 2009 by Satoshi Nakamoto [8]. Ethereum was designed in 2013 by Vitalik Buterin and Gavin Wood [123]. Ethereum is not just a cryptocurrency but also a platform for building decentralized applications using smart contracts. After Bitcoin, Ethereum is the cryptocurrency with the second-largest market capitalisation. Dogecoin, another open-source cryptocurrency based on the popular "doge" internet meme, grew in popularity and price in 2021 after billionaire Elon Musk publicly backed it. Litecoin, created by Charlie Lee in 2011, is based on Bitcoin's protocol but differs in the hashing algorithm used: Litecoin uses scrypt, proposed by Colin Percival [124].

Because no single data source was complete, we combine data from two sources, Yahoo Finance (https://finance.yahoo.com/lookup) and Kaggle [125], with fundamental details summarised in Table 5. The datasets feature the historical price information of the four cryptocurrencies, with the begin and end dates and the number of data points shown in Table 6. We forecast the closing price of each cryptocurrency using univariate and multivariate deep learning models. According to Wang et al. [22], in the multivariate model, it is feasible to incorporate features of the cryptocurrency price such as the open, high, low and volume to enhance forecasting accuracy. Hence, we used these features in our multivariate models, as shown in Table 6. We also add the gold price as an additional feature to the multivariate model, following Huynh et al. [26], who found a strong correlation between cryptocurrencies and the gold market. We obtained the gold price data of the London bullion market (LBMA) from 31 December 2012 to 28 February 2022 from Factset (https://www.factset.com/).

Cryptocurrency | Period (Day/Month/Year) | Size | Mean: Close price (USD) | Variance: Close price
Bitcoin | 29/04/2013-01/04/2024 | 3991 | 13692 | $2.856\times 10^{8}$
Ethereum | 08/08/2015-01/04/2024 | 3160 | 983.53 | $1.274\times 10^{6}$
Dogecoin | 16/12/2013-01/04/2024 | 3760 | 0.0406 | $5.947\times 10^{-3}$
Litecoin | 29/04/2013-01/04/2024 | 3991 | 60.208 | $3.884\times 10^{3}$

Variable | Variable Description | Data type
SNo | The order of the data | Number
Name | Name of cryptocurrency | Letter
Symbol | Abbreviation of cryptocurrency | Letter
Date | Date of observation | Date
High | Highest price on given day | Number
Low | Lowest price on given day | Number
Open | Opening price on given day | Number
Close | Closing price on given day | Number
Volume | Volume of transactions on given day | Number

3.4 Data processing

We need to reconstruct the original time series data for multi-step-ahead prediction using deep learning models. Takens' embedding theorem states that the reconstruction can reproduce significant characteristics of the original time series [126]. Given an observed time series $x(t)$, we can generate the embedded phase space $Y(t)=[x(t),x(t-T),\ldots,x(t-(D-1)T)]$, where $T$ is the time delay, $D$ is the embedding dimension with $t=0,1,2,\ldots,N-D-1$, and $N$ is the original length of the time series. Takens' theorem demonstrates that if the original attractor has dimension $d$, then an embedding dimension of $D=2d+1$ is sufficient.

In the univariate approach, we divide a single time series into several segments. The input features of each segment are the values at a fixed number of consecutive time points, and the output label(s) are the value(s) at the time point(s) that immediately follow. Therefore, we can have single-step or multi-step-ahead prediction.

In the multivariate approach, the input is a window that holds multiple time series over consecutive time points, as shown for the case of Bitcoin price prediction in Figure 8. The input consists of multiple time series, such as the high price and close price of Bitcoin and the price of gold, and the output is a multi-step prediction of the Bitcoin close price.
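The univariate reconstruction can be sketched as a sliding window over the series. The window sizes below are arbitrary, a time delay of $T=1$ is assumed, and `make_windows` is a hypothetical helper rather than the code used in this study.

```python
import numpy as np

def make_windows(series, n_in, n_out):
    """Split a series into (input window, multi-step target) pairs.

    Each input holds n_in consecutive values and each target holds the
    n_out values that immediately follow (time delay T = 1).
    """
    series = np.asarray(series, dtype=float)
    X, Y = [], []
    for start in range(len(series) - n_in - n_out + 1):
        X.append(series[start:start + n_in])
        Y.append(series[start + n_in:start + n_in + n_out])
    return np.array(X), np.array(Y)

prices = np.arange(10.0)                  # stand-in for a close-price series
X, Y = make_windows(prices, n_in=4, n_out=2)
```

With 10 observations, 4 inputs and 2 outputs, this yields 10 - 4 - 2 + 1 = 5 samples. For the multivariate case, the same slicing can be applied to a two-dimensional array of shape (time, features), with the target restricted to the close-price column.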

3.5 Framework

We present our framework in Figure 9, which highlights the major components of the entire process. In Step 1, we extract and pre-process the data for analysis. Since the Gold price data only has values on trading days (Monday to Friday), we use interpolation to fill in prices on non-trading days (Saturdays, Sundays and public holidays). We use the linear interpolation method (https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.interpolate.html#pandas.DataFrame.interpolate), which fills each missing value by interpolating linearly between the nearest known values on either side. This is commonly used to handle missing values in time series data [127].
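To illustrate what linear interpolation does to a weekend gap, the following sketch fills missing values in plain Python. It is an illustration only: the paper relies on the pandas `DataFrame.interpolate` method, whereas `interpolate_linear` and the example prices below are hypothetical.

```python
def interpolate_linear(values):
    """Fill None gaps by interpolating linearly between known neighbours.

    Leading or trailing gaps have only one known neighbour and stay None.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        gap = right - left
        for i in range(left + 1, right):
            weight = (i - left) / gap
            filled[i] = filled[left] + weight * (filled[right] - filled[left])
    return filled


# Hypothetical Gold prices: Friday, Saturday (missing), Sunday (missing), Monday.
gold = [1800.0, None, None, 1830.0]
gold_filled = interpolate_linear(gold)
```

For a single missing day this reduces to the average of the two neighbouring prices; for a two-day weekend gap the filled values step evenly from the Friday price to the Monday price.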

The data pre-processing in Step 2 features data scaling to ensure model stability, which limits the price to the range defined by the min-max scaler (https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html). Furthermore, we separate the data into two datasets to compare the influence of COVID-19, and determine the train-test split based on a given timeline (not shuffled). Our goal is to predict the closing price of the respective cryptocurrencies; therefore, we have univariate data with the close price, and multivariate data with the close price, Gold price, high price, low price and opening price, as shown in Table 1. We split each dataset in a 70:30 ratio. Dataset 1 uses the data for the selected cryptocurrency from its earliest available record to June 2021, where the last 30% of the data includes the period following the beginning of COVID-19 (March 2020 to June 2021), during which high volatility in crypto prices was evident. Cryptocurrency is a financial asset with highly volatile prices [15]; hence, we explore which model is more effective in predicting the rising trend following the outbreak of COVID-19. Dataset 2 features the COVID-19 period in both the train and test datasets (March 2020 to April 2024) to ensure that high-volatility crypto price data is part of the training dataset. Table 7 presents the dates for the train and test datasets for both experiments.

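| Crypto | Experiment | Split | Period (Day/Month/Year) |
| --- | --- | --- | --- |
| Bitcoin | Dataset 1 | Train | 29/04/2013-27/12/2018 |
| Bitcoin | Dataset 1 | Test | 28/12/2018-01/06/2021 |
| Bitcoin | Dataset 2 | Train | 01/03/2020-09/01/2023 |
| Bitcoin | Dataset 2 | Test | 10/01/2023-01/04/2024 |
| Ethereum | Dataset 1 | Train | 08/08/2015-02/09/2019 |
| Ethereum | Dataset 1 | Test | 03/09/2019-01/06/2021 |
| Ethereum | Dataset 2 | Train | 01/03/2020-09/01/2023 |
| Ethereum | Dataset 2 | Test | 10/01/2023-01/04/2024 |
| Dogecoin | Dataset 1 | Train | 16/12/2013-06/03/2019 |
| Dogecoin | Dataset 1 | Test | 07/03/2019-01/06/2021 |
| Dogecoin | Dataset 2 | Train | 01/03/2020-09/01/2023 |
| Dogecoin | Dataset 2 | Test | 10/01/2023-01/04/2024 |
| Litecoin | Dataset 1 | Train | 29/04/2013-27/12/2018 |
| Litecoin | Dataset 1 | Test | 28/12/2018-01/06/2021 |
| Litecoin | Dataset 2 | Train | 01/03/2020-09/01/2023 |
| Litecoin | Dataset 2 | Test | 10/01/2023-01/04/2024 |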
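The scaling and chronological 70:30 split of Step 2 can be sketched as below. This is a simplified stand-in for the scikit-learn `MinMaxScaler` cited in the text; the helper names and toy prices are hypothetical, and fitting the scaling range on the training portion only is an assumption, since the paper does not state which portion the scaler is fitted on.

```python
def chronological_split(series, train_percent=70):
    """Split without shuffling: the earliest observations form the training set."""
    cut = len(series) * train_percent // 100
    return series[:cut], series[cut:]


def min_max_scale(values, lowest, highest):
    """Map values linearly so that [lowest, highest] becomes [0, 1]."""
    return [(v - lowest) / (highest - lowest) for v in values]


# Toy close-price series with a rising trend.
close = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]
train, test = chronological_split(close)

# Assumption: the scaling range comes from the training data only.
lowest, highest = min(train), max(train)
train_scaled = min_max_scale(train, lowest, highest)
test_scaled = min_max_scale(test, lowest, highest)
```

Under this assumption, test prices above the training maximum map to values greater than 1, which is worth keeping in mind when the test period contains a rising trend that the training period did not.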

In Step 3, we select the optimal hyperparameters for each model using trial runs, settings reported for the same models in the literature (e.g. [40]), and the default values in the library implementation (e.g. PyTorch [128]). We note that we use RMSE as the accuracy measure for all the models in our framework. Once the best hyperparameters have been determined, we continue with our investigations that compare univariate and multivariate deep learning models.

In Step 4, we provide data analysis by first implementing a volatility analysis of the close price of the selected cryptocurrencies, which can reveal fluctuations and patterns throughout the selected period. Historically, cryptocurrencies have featured a wide price range, with major fluctuations across the COVID-19 period [129, 130]. We use the volatility analysis to review these fluctuations, since they can lead to instability during model training, causing slower convergence and poor generalisation on the test dataset. Also in Step 4, taking into account our multivariate models, we provide a feature correlation analysis to find out how the different features affect each other.

We next compare the respective models in Step 5 with Experiment 1 (pre-COVID-19 training data) and select the two best-performing models for the next step. We develop and compare multivariate and univariate deep learning models, including LSTM, BD-LSTM, ED-LSTM, CNN, Conv-LSTM, and Transformer, which predict the close price of the selected cryptocurrencies. We use MLP and ARIMA models as baselines for the Bitcoin dataset.

In Step 6, we use Dataset 2 for Experiment 2 to predict the close price during COVID-19 using training data that features the effect of COVID-19 on the cryptocurrencies. We do this to determine whether the prediction accuracy improves, and hence compare the results with Experiment 1. We also incorporate a shuffled data-splitting strategy to improve model performance, where the initial 70% of the data (the training portion) is randomly rearranged.
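One way to read the shuffled split in Step 6 is sketched below. It assumes that the shuffle is applied to already-built training samples (input window and target pairs) so that the ordering within each window is preserved, and that the test portion keeps its chronological order; the paper does not spell out these details, and `split_and_shuffle` is a hypothetical helper.

```python
import random


def split_and_shuffle(samples, train_percent=70, seed=0):
    """Chronological split, then shuffle only the training portion."""
    cut = len(samples) * train_percent // 100
    train, test = list(samples[:cut]), list(samples[cut:])
    random.Random(seed).shuffle(train)  # seeded for reproducible runs
    return train, test


# Toy samples standing in for (input window, target) pairs.
samples = [([i, i + 1], [i + 2]) for i in range(10)]
shuffled_train, ordered_test = split_and_shuffle(samples)
```

Shuffling only the first 70% changes the order in which the model sees training samples without leaking any test-period observation into training.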

3.6 Technical details

In order to distinguish the model performance, we use RMSE as the criterion for the different prediction horizons. The smaller the RMSE values, the better the prediction accuracy:

$$RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2} \qquad (6)$$

where $y_i$ and $\hat{y}_i$ are the observed data and predicted data, respectively, and $N$ is the length of the observed data.
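Equation (6) can be evaluated separately for each prediction horizon, as in the following sketch. The function names and the toy observed and predicted values are illustrative only and are not taken from the paper's code or data.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equally long sequences."""
    n = len(observed)
    squared = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    return math.sqrt(squared / n)


def rmse_per_horizon(observed_rows, predicted_rows):
    """One RMSE per prediction step; each row holds a multi-step target."""
    steps = len(observed_rows[0])
    return [
        rmse([row[s] for row in observed_rows], [row[s] for row in predicted_rows])
        for s in range(steps)
    ]


# Two toy samples with a 2-step horizon.
observed = [[0.10, 0.20], [0.30, 0.40]]
predicted = [[0.10, 0.25], [0.30, 0.35]]
horizon_rmse = rmse_per_horizon(observed, predicted)
mean_rmse = sum(horizon_rmse) / len(horizon_rmse)
```

Here the first step is predicted exactly, so its RMSE is zero, while the second step carries all of the error; averaging over the steps gives a single summary value per model.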

As indicated earlier, we use the Adam optimiser for all the deep learning models, where we use default values for the hyperparameters, i.e. $\alpha = 0.001$, $\beta_a = 0.9$, $\beta_b = 0.999$, and $\epsilon' = 10^{-8}$.

In the case of the Transformer and ARIMA models, we reviewed the literature [39, 131, 70] to obtain the hyperparameters. Table 9 describes the details of the model hyperparameters, including the number of input layers, output layers, hidden layers and other hyperparameters. We use the ReLU activation function in the respective deep learning models, with a maximum training time of 200 epochs via the Adam optimiser [113].

We implemented specific experiments to determine the appropriate hyperparameters for each model. Based on related models in the literature [40], we used the following model architectures: the CNN and LSTM variants feature one hidden layer with a selected number of hidden units. We refer to previous research on the MLP model for time series prediction [132] and evaluate performance for a selected number of hidden neurons, as shown in Table 8. We use the first 70% of the Bitcoin close-price in Dataset 1 for training and the remainder for testing. For each hyperparameter configuration, we repeated model training 5 times with different initial parameters and reported the average. Table 8 presents the performance (RMSE) of each model on the train and test datasets for the respective hyperparameters, with the best values in bold.

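| Model | Hidden | Train | Test |
| --- | --- | --- | --- |
| LSTM | 20 | 0.0194 | 0.0696 |
| LSTM | 50 | 0.0176 | 0.0490 |
| LSTM | 100 | 0.0165 | 0.0370 |
| BD-LSTM | 20 | 0.0195 | 0.0239 |
| BD-LSTM | 50 | 0.0177 | 0.0210 |
| BD-LSTM | 100 | 0.0170 | 0.0204 |
| ED-LSTM | 20 | 0.0192 | 0.0930 |
| ED-LSTM | 50 | 0.0169 | 0.0609 |
| ED-LSTM | 100 | 0.0164 | 0.0373 |
| Conv-LSTM | 20 | 0.0124 | 0.0176 |
| Conv-LSTM | 50 | 0.0123 | 0.0181 |
| Conv-LSTM | 100 | 0.0132 | 0.0194 |
| CNN | 20 | 0.0126 | 0.0206 |
| CNN | 50 | 0.0128 | 0.0209 |
| CNN | 100 | 0.0135 | 0.0235 |
| MLP | 5 | 0.0137 | 0.0278 |
| MLP | 10 | 0.0130 | 0.0268 |
| MLP | 20 | 0.0122 | 0.0195 |

| Model | Input layers | Hidden layers | Output layers | Comments |
| --- | --- | --- | --- | --- |
| MLP | (6,1) | 3 | (1,5) | Include three hidden layers. |
| ARIMA | - | - | - | Construct ARIMA(1,0,1) model. |
| LSTM | (6,1) | 2 | (1,5) | Include two LSTM layers. |
| BD-LSTM | (6,1) | 2 | (1,5) | Include Forward & Backward LSTM layer. |
| ED-LSTM | (6,1) | 4 | (1,5) | Two LSTM networks with a time distributed layer. |
| Conv-LSTM | (6,1) | 3 | (1,5) | Include Conv1D layer, LSTM network and dense layer. |
| CNN | (6,1) | 4 | (1,5) | Include Conv1D layer, pooling layer and two dense layers. |
| Transformer | (6,1) | 2 | (1,5) | Include a Multi-Head Attention mechanism and a position-wise fully connected feed-forward network. |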
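As a small worked example of reading Table 8, the sketch below picks, for each model, the number of hidden units with the lowest test RMSE. The values are copied from Table 8; selecting purely by minimum test RMSE is an assumption about the selection rule, and `best_hidden_units` is a hypothetical helper.

```python
# Test RMSE values copied from Table 8, keyed by model and hidden units.
table8_test_rmse = {
    "LSTM": {20: 0.0696, 50: 0.0490, 100: 0.0370},
    "BD-LSTM": {20: 0.0239, 50: 0.0210, 100: 0.0204},
    "ED-LSTM": {20: 0.0930, 50: 0.0609, 100: 0.0373},
    "Conv-LSTM": {20: 0.0176, 50: 0.0181, 100: 0.0194},
    "CNN": {20: 0.0206, 50: 0.0209, 100: 0.0235},
    "MLP": {5: 0.0278, 10: 0.0268, 20: 0.0195},
}


def best_hidden_units(results):
    """For each model, return the hidden-unit count with the lowest RMSE."""
    return {model: min(scores, key=scores.get) for model, scores in results.items()}


best = best_hidden_units(table8_test_rmse)
```

Under this rule the LSTM variants favour the largest setting tried (100 hidden units), while Conv-LSTM and CNN favour the smallest (20).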

4 Results

In this section, we provide comprehensive information about the datasets and present research design with computational results.

4.1 Data analysis

The coronavirus disease 2019 (COVID-19) pandemic [133] originated on 17th November 2019 in Wuhan, China, and began spreading extensively worldwide from March 2020 [134]. COVID-19 had a devastating effect on the world economy, and its impact included finance, supply chains, politics, and mental health [135], with further effects in the post-pandemic era [136, 137, 138]. Therefore, it is necessary to analyse the price trends of the four cryptocurrencies in our study.

We investigate the trends for the four selected cryptocurrencies over the given period covering COVID-19, including all of its phases: initiation, spread, and decline. Figure 10 presents the close price of Bitcoin, Ethereum, Dogecoin and Litecoin across the selected period, with the shaded region (pink) indicating COVID-19. We observe that the closing price of each cryptocurrency exhibited large fluctuations within the shaded region. Litecoin experienced significant volatility before the beginning of COVID-19, while the price fluctuations of the other three cryptocurrencies (Bitcoin, Dogecoin and Ethereum) before COVID-19 were not significant. This suggests that cryptocurrency prices became more volatile after the onset of COVID-19. We observe that the Ethereum trend is highly correlated with Bitcoin before and during COVID-19. In the case of Bitcoin and Ethereum, there is a significant price increase from 2020 to 2022, followed by a decrease and then another increase as the price recovered.

Next, we present the monthly volatility plot in Figure 11, where we observe that the monthly volatility of Ethereum and Litecoin generally lies below 10% during COVID-19 (highlighted in pink). The Bitcoin monthly volatility stays below 6% during the same time; however, Dogecoin presents a different trend during COVID-19. The monthly volatility of Dogecoin reached above 20% in January and May 2021. During other months, it remained consistently at a value of 15%. Our analysis reveals that the volatility of the four cryptocurrencies decreases significantly in the month following a period of high volatility. The monthly volatility during COVID-19 is generally similar to the monthly volatility prior to the pandemic (2018 onwards). Although the monthly volatility does not change significantly, the daily close price fluctuates significantly across the entire period.
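A monthly volatility series of the kind plotted in Figure 11 could be computed as sketched below. The definition used here, the standard deviation of daily log returns within each calendar month, is an assumption chosen for illustration and may differ from the definition used for the figure; the helper name and the toy prices are hypothetical.

```python
import math
import statistics
from collections import defaultdict


def monthly_volatility(daily_closes):
    """Standard deviation of daily log returns, grouped by calendar month.

    daily_closes: list of (ISO date string, close price), sorted by date.
    Each return is assigned to the month of the later of its two days.
    """
    returns_by_month = defaultdict(list)
    for (_, prev_close), (date, close) in zip(daily_closes, daily_closes[1:]):
        returns_by_month[date[:7]].append(math.log(close / prev_close))
    # A standard deviation needs at least two returns in the month.
    return {m: statistics.stdev(r) for m, r in returns_by_month.items() if len(r) > 1}


# Toy data: a flat January followed by a more turbulent February.
prices = [
    ("2021-01-29", 100.0), ("2021-01-30", 100.0), ("2021-01-31", 100.0),
    ("2021-02-01", 110.0), ("2021-02-02", 95.0), ("2021-02-03", 120.0),
]
volatility = monthly_volatility(prices)
```

Multiplying the resulting values by 100 expresses them as percentages, which is the scale used in the discussion above.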

Since we will develop a multivariate model, we also need to analyse how the different features of each cryptocurrency (low, high, open, and close price) are correlated with the Gold price. Figure 12 shows the correlations between the features of the multivariate model for each cryptocurrency using Pearson correlation. We observe that the close-price is highly correlated with the low-price, high-price and open-price. We observe a lower correlation between Gold and the other features; however, we use Gold in our multivariate model as data that is outside the crypto ecosystem but linked to it. We also find that the Gold price has the highest correlation with Bitcoin, followed by Ethereum and Litecoin, and the least with Dogecoin. Figure 13 presents the Pearson correlation for the respective features, including the close, high, low and opening price of a given cryptocurrency, with the Gold price and the most correlated other cryptocurrency (using Figure 12), which is Ethereum in the case of Bitcoin, i.e. Figure 13, Panel (a). We use this for the multivariate prediction strategy, with data processing as shown in Figure 8.
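The Pearson correlation used for Figures 12 and 13 can be computed as in the following sketch. The feature values are toy numbers rather than data from the paper, and `pearson` and `correlation_matrix` are hypothetical helper names.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(features):
    """Pairwise Pearson correlation for a dict of named feature series."""
    return {
        (a, b): pearson(features[a], features[b])
        for a in features
        for b in features
    }


# Toy features: high tracks close exactly, gold moves differently.
features = {
    "close": [1.0, 2.0, 3.0, 4.0],
    "high": [1.5, 2.5, 3.5, 4.5],
    "gold": [4.0, 1.0, 3.0, 2.0],
}
matrix = correlation_matrix(features)
```

A feature that differs from the close price only by a constant offset has a correlation of 1 with it, which mirrors the high correlation between close, high, low and open prices reported above.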

4.2 Results: pre-COVID-19

We next implement the investigations outlined in Step 5 (Experiment 1) of the framework (Figure 9), where we compare the selected deep learning models and the univariate and multivariate strategies using the pre-COVID-19 training dataset. Note that our test dataset includes the first phase of COVID-19 (Table 3).

We present the results for each prediction horizon (step) obtained from 30 independent experimental runs (mean RMSE and 95% confidence interval), where each run trains the model from different initial weights and biases. We note that robustness is the degree of confidence in a forecast, which is indicated by a narrow confidence interval. Moreover, scalability refers to the capacity to maintain consistent performance as the prediction horizon expands. Our main focus is the performance (RMSE) on the test dataset, both in terms of the mean over the 5 prediction horizons and the individual prediction horizons. Therefore, in the rest of the discussion, we focus on the test dataset.
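The mean and 95% confidence interval over independent runs can be summarised as sketched below. The normal approximation (mean ± 1.96 standard errors) is an assumption, since the exact interval construction is not stated in the text, and the run scores are toy values.

```python
import math
import statistics


def mean_and_ci95(run_scores):
    """Mean and half-width of a normal-approximation 95% confidence interval."""
    n = len(run_scores)
    mean = statistics.mean(run_scores)
    standard_error = statistics.stdev(run_scores) / math.sqrt(n)
    return mean, 1.96 * standard_error


# Toy test RMSE values from five hypothetical runs with different initial weights.
scores = [0.031, 0.033, 0.032, 0.034, 0.030]
mean_rmse, half_width = mean_and_ci95(scores)
summary = f"{mean_rmse:.4f} ± {half_width:.4f}"
```

A smaller half-width means the runs agree more closely, which corresponds to the notion of robustness described above.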

We first use Bitcoin data to evaluate conventional models (MLP and ARIMA) against deep learning models (LSTM, ED-LSTM, BD-LSTM, CNN, Conv-LSTM, Transformer) for the univariate (Figure 14) and multivariate (Figure 15) strategies. The results show that MLP and ARIMA perform worse than the deep learning models. MLP exhibits a lack of robustness, and the ARIMA model struggles in test prediction accuracy when compared to the deep learning models. We note that ARIMA does best on the train dataset due to over-training, and struggles to generalise. The deep learning model results are consistent with the finding by Chandra et al. [40], where the prediction accuracy of deep learning models is better than conventional machine learning models for multi-step-ahead time series forecasting. The prediction performance of each model shows a trend where the best model for the Multivariate strategy (ED-LSTM) provides more consistent accuracy as the prediction horizon changes than the best model for the Univariate strategy (BD-LSTM). In Figure 15, the Multivariate strategy shows that Conv-LSTM provides the lowest prediction accuracy, while the ED-LSTM and BD-LSTM models provide the most accurate predictions. In Figure 14, contrary to the results of the Multivariate strategy, the most robust Univariate model for predicting Bitcoin is Conv-LSTM.

Figure 16 presents the results for Ethereum using the Univariate strategy, where we observe that LSTM provides the best test performance, followed by BD-LSTM. Figure 17 provides the results for the Multivariate strategy, where CNN provides the best performance, followed by Conv-LSTM. Notably, the Transformer model provides the worst performance. In comparison to the Multivariate strategy, we notice that the Univariate strategy provides a much better test accuracy, which is also more robust and scalable, i.e. higher prediction horizons maintain better accuracy (Table 10). Furthermore, we note that Conv-LSTM provides the worst performance in the Univariate case, but one of the best in the Multivariate strategy.

In the case of Dogecoin, Figures 18 and 19 reveal that BD-LSTM exhibits the best accuracy for both the Univariate and the Multivariate strategies, and also provides similar stability for higher prediction horizons. This could be due to the price and volatility trends in Figures 10 and 11 (Panels c), where we notice that Dogecoin has a similar trend pre-COVID-19 and during the first phase of COVID-19, which together make up Dataset 1 used for these experiments. Furthermore, we also note that in Figure 13 (Panel c), Dogecoin is least correlated with the Gold price, which is the major factor making a difference in the multivariate model. We notice that CNN provides the worst accuracy, followed by the Transformer model, in both strategies.

Finally, we present the results for Litecoin for both the Univariate and Multivariate strategies. Figures 20 and 21 show the results, where we find that Conv-LSTM and BD-LSTM show the best performance for the Univariate strategy, whereas LSTM, ED-LSTM and Conv-LSTM provide the best performance for the Multivariate strategy. In contrast, CNN provides the worst performance for the Multivariate strategy, with an RMSE much higher in magnitude than the rest of the models. We also notice that the Multivariate strategy provides better stability as the prediction horizon increases when compared to the Univariate strategy, while the Univariate strategy provides much better accuracy for the best models when compared to the Multivariate strategy.

We summarise the results further in Table 10, which features the model prediction accuracy on the test dataset. We report the RMSE mean and 95% confidence interval for the four cryptocurrencies, and the best models for the different steps are highlighted in bold. The test mean provides the average of the five steps. It is clear that the Univariate models are better than the Multivariate models; however, we find that the accuracy of both strategies is close (test mean) for Bitcoin and Dogecoin.

| Data | Strategy | Model | Step 1 | Step 2 | Step 3 | Step 4 | Step 5 | Test Mean |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BTC | Univariate | CNN | 0.0380 ± 0.0015 | 0.0451 ± 0.0017 | 0.0476 ± 0.0013 | 0.0537 ± 0.0015 | 0.0616 ± 0.0021 | 0.0492 ± 0.0016 |
| BTC | Univariate | LSTM | 0.0223 ± 0.0013 | 0.0337 ± 0.0020 | 0.0425 ± 0.0025 | 0.0496 ± 0.0026 | 0.0515 ± 0.0022 | 0.0399 ± 0.0021 |
| BTC | Univariate | ED-LSTM | 0.0250 ± 0.0014 | 0.0363 ± 0.0029 | 0.0421 ± 0.0030 | 0.0448 ± 0.0028 | 0.0477 ± 0.0025 | 0.0392 ± 0.0025 |
| BTC | Univariate | BD-LSTM | 0.0196 ± 0.0008 | 0.0296 ± 0.0013 | 0.0333 ± 0.0014 | 0.0385 ± 0.0012 | 0.0424 ± 0.0015 | 0.0327 ± 0.0012 |
| BTC | Univariate | Conv-LSTM | 0.0244 ± 0.0012 | 0.0302 ± 0.0011 | 0.0381 ± 0.0015 | 0.0434 ± 0.0020 | 0.0468 ± 0.0023 | 0.0366 ± 0.0016 |
| BTC | Univariate | Transformer | 0.0360 ± 0.0041 | 0.0431 ± 0.0038 | 0.0500 ± 0.0036 | 0.0563 ± 0.0037 | 0.0617 ± 0.0039 | 0.0494 ± 0.0038 |
| BTC | Multivariate | CNN | 0.0416 ± 0.0011 | 0.0477 ± 0.0016 | 0.0534 ± 0.0018 | 0.0575 ± 0.0019 | 0.0606 ± 0.0021 | 0.0522 ± 0.0017 |
| BTC | Multivariate | LSTM | 0.0310 ± 0.0018 | 0.0395 ± 0.0029 | 0.0449 ± 0.0036 | 0.0458 ± 0.0031 | 0.0467 ± 0.0022 | 0.0416 ± 0.0027 |
| BTC | Multivariate | ED-LSTM | 0.0290 ± 0.0022 | 0.0356 ± 0.0017 | 0.0384 ± 0.0015 | 0.0405 ± 0.0013 | 0.0430 ± 0.0014 | 0.0373 ± 0.0016 |
| BTC | Multivariate | BD-LSTM | 0.0247 ± 0.0018 | 0.0336 ± 0.0022 | 0.0415 ± 0.0026 | 0.0462 ± 0.0026 | 0.0511 ± 0.0026 | 0.0394 ± 0.0024 |
| BTC | Multivariate | Conv-LSTM | 0.0500 ± 0.0037 | 0.0466 ± 0.0019 | 0.0678 ± 0.0095 | 0.0886 ± 0.0144 | 0.1327 ± 0.0266 | 0.0771 ± 0.0112 |
| BTC | Multivariate | Transformer | 0.0382 ± 0.0026 | 0.0418 ± 0.0024 | 0.0459 ± 0.0021 | 0.0501 ± 0.0020 | 0.0526 ± 0.0022 | 0.0457 ± 0.0023 |
| ETH | Univariate | CNN | 0.0384 ± 0.0012 | 0.0453 ± 0.0017 | 0.0478 ± 0.0026 | 0.0521 ± 0.0030 | 0.0586 ± 0.0035 | 0.0484 ± 0.0024 |
| ETH | Univariate | LSTM | 0.0280 ± 0.0013 | 0.0341 ± 0.0012 | 0.0372 ± 0.0011 | 0.0441 ± 0.0016 | 0.0470 ± 0.0017 | 0.0381 ± 0.0014 |
| ETH | Univariate | ED-LSTM | 0.0261 ± 0.0012 | 0.0373 ± 0.0019 | 0.0405 ± 0.0018 | 0.0456 ± 0.0013 | 0.0471 ± 0.0012 | 0.0393 ± 0.0015 |
| ETH | Univariate | BD-LSTM | 0.0265 ± 0.0012 | 0.0351 ± 0.0019 | 0.0401 ± 0.0019 | 0.0423 ± 0.0022 | 0.0498 ± 0.0025 | 0.0388 ± 0.0019 |
| ETH | Univariate | Conv-LSTM | 0.0327 ± 0.0020 | 0.0410 ± 0.0021 | 0.0497 ± 0.0030 | 0.0609 ± 0.0048 | 0.0751 ± 0.0063 | 0.0519 ± 0.0036 |
| ETH | Univariate | Transformer | 0.0337 ± 0.0035 | 0.0412 ± 0.0042 | 0.0449 ± 0.0039 | 0.0521 ± 0.0032 | 0.0535 ± 0.0036 | 0.0451 ± 0.0037 |
| ETH | Multivariate | CNN | 0.1007 ± 0.0038 | 0.1385 ± 0.0261 | 0.1564 ± 0.0444 | 0.1147 ± 0.0034 | 0.1276 ± 0.0023 | 0.1276 ± 0.0160 |
| ETH | Multivariate | LSTM | 0.1913 ± 0.0156 | 0.1868 ± 0.0190 | 0.1993 ± 0.0178 | 0.1887 ± 0.0180 | 0.1767 ± 0.0224 | 0.1886 ± 0.0186 |
| ETH | Multivariate | ED-LSTM | 0.1815 ± 0.0362 | 0.1955 ± 0.0423 | 0.1862 ± 0.0441 | 0.1881 ± 0.0459 | 0.1913 ± 0.0476 | 0.1885 ± 0.0432 |
| ETH | Multivariate | BD-LSTM | 0.1033 ± 0.0159 | 0.1788 ± 0.0354 | 0.1799 ± 0.0314 | 0.1741 ± 0.0285 | 0.1960 ± 0.0302 | 0.1664 ± 0.0283 |
| ETH | Multivariate | Conv-LSTM | 0.0799 ± 0.0243 | 0.1011 ± 0.0247 | 0.1185 ± 0.0422 | 0.1786 ± 0.0526 | 0.1950 ± 0.0517 | 0.1346 ± 0.0391 |
| ETH | Multivariate | Transformer | 0.2464 ± 0.0141 | 0.2485 ± 0.0136 | 0.2590 ± 0.0150 | 0.2508 ± 0.0147 | 0.2470 ± 0.0153 | 0.2503 ± 0.0145 |
| DOGE | Univariate | CNN | 0.1714 ± 0.0490 | 0.2620 ± 0.0831 | 0.2786 ± 0.1056 | 0.4862 ± 0.0956 | 0.5171 ± 0.0968 | 0.3431 ± 0.0860 |
| DOGE | Univariate | LSTM | 0.1492 ± 0.0122 | 0.1523 ± 0.0134 | 0.1577 ± 0.0123 | 0.1637 ± 0.0127 | 0.1681 ± 0.0133 | 0.1582 ± 0.0128 |
| DOGE | Univariate | ED-LSTM | 0.1386 ± 0.0136 | 0.1419 ± 0.0129 | 0.1401 ± 0.0127 | 0.1422 ± 0.0121 | 0.1428 ± 0.0118 | 0.1411 ± 0.0126 |
| DOGE | Univariate | BD-LSTM | 0.0509 ± 0.0014 | 0.0562 ± 0.0017 | 0.0722 ± 0.0025 | 0.0648 ± 0.0021 | 0.0645 ± 0.0020 | 0.0617 ± 0.0019 |
| DOGE | Univariate | Conv-LSTM | 0.0590 ± 0.0198 | 0.1554 ± 0.0913 | 0.1430 ± 0.0914 | 0.1456 ± 0.0893 | 0.1413 ± 0.0830 | 0.1289 ± 0.0750 |
| DOGE | Univariate | Transformer | 0.2116 ± 0.0361 | 0.2228 ± 0.0517 | 0.2435 ± 0.0601 | 0.2219 ± 0.0503 | 0.2188 ± 0.0461 | 0.2237 ± 0.0489 |
| DOGE | Multivariate | CNN | 0.8122 ± 0.0212 | 0.6364 ± 0.0792 | 0.6725 ± 0.0700 | 0.6656 ± 0.0840 | 0.5972 ± 0.0871 | 0.6768 ± 0.0683 |
| DOGE | Multivariate | LSTM | 0.1706 ± 0.0087 | 0.1746 ± 0.0074 | 0.1828 ± 0.0083 | 0.1810 ± 0.0083 | 0.1825 ± 0.0095 | 0.1783 ± 0.0084 |
| DOGE | Multivariate | ED-LSTM | 0.1829 ± 0.0211 | 0.1823 ± 0.0205 | 0.1816 ± 0.0200 | 0.1821 ± 0.0196 | 0.1823 ± 0.0192 | 0.1822 ± 0.0201 |
| DOGE | Multivariate | BD-LSTM | 0.0616 ± 0.0065 | 0.0603 ± 0.0051 | 0.0619 ± 0.0039 | 0.0653 ± 0.0032 | 0.0720 ± 0.0042 | 0.0642 ± 0.0046 |
| DOGE | Multivariate | Conv-LSTM | 0.2103 ± 0.0975 | 0.1962 ± 0.0897 | 0.1912 ± 0.0859 | 0.2347 ± 0.1085 | 0.2160 ± 0.1012 | 0.2097 ± 0.0966 |
| DOGE | Multivariate | Transformer | 0.2472 ± 0.0200 | 0.2482 ± 0.0217 | 0.2501 ± 0.0211 | 0.2345 ± 0.0215 | 0.2332 ± 0.0203 | 0.2426 ± 0.0209 |
| LTC | Univariate | CNN | 0.0390 ± 0.0008 | 0.0470 ± 0.0015 | 0.0627 ± 0.0033 | 0.0948 ± 0.0128 | 0.1131 ± 0.0170 | 0.0713 ± 0.0071 |
| LTC | Univariate | LSTM | 0.0382 ± 0.0019 | 0.0479 ± 0.0024 | 0.0619 ± 0.0031 | 0.0768 ± 0.0038 | 0.0861 ± 0.0038 | 0.0622 ± 0.0030 |
| LTC | Univariate | ED-LSTM | 0.0369 ± 0.0028 | 0.0487 ± 0.0025 | 0.0631 ± 0.0036 | 0.0756 ± 0.0058 | 0.0838 ± 0.0066 | 0.0616 ± 0.0043 |
| LTC | Univariate | BD-LSTM | 0.0318 ± 0.0019 | 0.0401 ± 0.0019 | 0.0507 ± 0.0029 | 0.0588 ± 0.0039 | 0.0699 ± 0.0034 | 0.0503 ± 0.0028 |
| LTC | Univariate | Conv-LSTM | 0.0265 ± 0.0012 | 0.0362 ± 0.0017 | 0.0443 ± 0.0015 | 0.0513 ± 0.0019 | 0.0581 ± 0.0017 | 0.0433 ± 0.0016 |
| LTC | Univariate | Transformer | 0.0726 ± 0.0035 | 0.0712 ± 0.0041 | 0.0746 ± 0.0046 | 0.0818 ± 0.0038 | 0.0891 ± 0.0032 | 0.0779 ± 0.0038 |
| LTC | Multivariate | CNN | 0.4648 ± 0.0468 | 0.4553 ± 0.0444 | 0.4810 ± 0.0445 | 0.5086 ± 0.0460 | 0.5167 ± 0.0492 | 0.4853 ± 0.0462 |
| LTC | Multivariate | LSTM | 0.0625 ± 0.0052 | 0.0788 ± 0.0070 | 0.0921 ± 0.0058 | 0.0885 ± 0.0044 | 0.0919 ± 0.0044 | 0.0828 ± 0.0054 |
| LTC | Multivariate | ED-LSTM | 0.0940 ± 0.0130 | 0.1121 ± 0.0125 | 0.1032 ± 0.0116 | 0.0833 ± 0.0036 | 0.0880 ± 0.0037 | 0.0961 ± 0.0089 |
| LTC | Multivariate | BD-LSTM | 0.0859 ± 0.0111 | 0.1520 ± 0.0268 | 0.2513 ± 0.0441 | 0.2838 ± 0.0523 | 0.2760 ± 0.0501 | 0.2098 ± 0.0369 |
| LTC | Multivariate | Conv-LSTM | 0.0542 ± 0.0054 | 0.0713 ± 0.0094 | 0.0952 ± 0.0126 | 0.1028 ± 0.0136 | 0.1123 ± 0.0127 | 0.0872 ± 0.0107 |
| LTC | Multivariate | Transformer | 0.1532 ± 0.0131 | 0.1624 ± 0.0108 | 0.1724 ± 0.0083 | 0.1671 ± 0.0129 | 0.1650 ± 0.0128 | 0.1640 ± 0.0116 |

4.3 Results: Data featuring COVID-19

The previous section presents the results given by the respective models using training data from before COVID-19. We found that the Univariate strategy was better than the Multivariate strategy (Table 10); therefore, we only use the Univariate strategy for Experiment 2 (during COVID-19) and present the results in Table 11. Compared with the results of Experiment 1, the prediction accuracy for Bitcoin, Ethereum, and Dogecoin has improved to a certain extent. The greatest improvement is in forecasting the close price of Dogecoin. After training using data from the COVID-19 period, the prediction performance for Dogecoin is close to that for Bitcoin and Ethereum. Nevertheless, the forecast accuracy for Litecoin decreased. Additionally, we find that the robustness of the models improved after training with data from the COVID-19 period. The confidence intervals for all prediction horizons of Bitcoin and Ethereum are kept within ±0.0007. In the case of Dogecoin and Litecoin, the robustness of the models in 1-step-ahead prediction has generally improved.

| Data | Model | Step 1 | Step 2 | Step 3 | Step 4 | Step 5 | Test Mean |
|---|---|---|---|---|---|---|---|
| BTC | BD-LSTM | 0.0194 ± 0.0002 | 0.0258 ± 0.0003 | 0.0311 ± 0.0004 | 0.0367 ± 0.0003 | 0.0414 ± 0.0004 | 0.0309 ± 0.0003 |
| | ED-LSTM | 0.0199 ± 0.0001 | 0.0284 ± 0.0004 | 0.0339 ± 0.0006 | 0.0381 ± 0.0005 | 0.0418 ± 0.0003 | 0.0324 ± 0.0004 |
| | LSTM | 0.0283 ± 0.0004 | 0.0326 ± 0.0003 | 0.0369 ± 0.0003 | 0.0410 ± 0.0003 | 0.0447 ± 0.0002 | 0.0367 ± 0.0003 |
| | CNN | 0.0293 ± 0.0006 | 0.0342 ± 0.0006 | 0.0388 ± 0.0005 | 0.0431 ± 0.0004 | 0.0467 ± 0.0003 | 0.0384 ± 0.0005 |
| | Conv-LSTM | 0.0209 ± 0.0004 | 0.0263 ± 0.0004 | 0.0317 ± 0.0004 | 0.0372 ± 0.0003 | 0.0421 ± 0.0002 | 0.0316 ± 0.0004 |
| | Transformer | 0.0484 ± 0.0055 | 0.0514 ± 0.0051 | 0.0546 ± 0.0047 | 0.0585 ± 0.0047 | 0.0609 ± 0.0047 | 0.0548 ± 0.0050 |
| ETH | BD-LSTM | 0.0230 ± 0.0002 | 0.0304 ± 0.0004 | 0.0361 ± 0.0005 | 0.0426 ± 0.0004 | 0.0484 ± 0.0006 | 0.0361 ± 0.0004 |
| | ED-LSTM | 0.0230 ± 0.0001 | 0.0307 ± 0.0004 | 0.0362 ± 0.0004 | 0.0426 ± 0.0004 | 0.0481 ± 0.0006 | 0.0361 ± 0.0004 |
| | LSTM | 0.0238 ± 0.0005 | 0.0306 ± 0.0005 | 0.0361 ± 0.0005 | 0.0426 ± 0.0005 | 0.0480 ± 0.0005 | 0.0362 ± 0.0005 |
| | CNN | 0.0321 ± 0.0004 | 0.0397 ± 0.0008 | 0.0481 ± 0.0014 | 0.0536 ± 0.0016 | 0.0585 ± 0.0015 | 0.0464 ± 0.0012 |
| | Conv-LSTM | 0.0250 ± 0.0004 | 0.0332 ± 0.0007 | 0.0405 ± 0.0013 | 0.0481 ± 0.0020 | 0.0550 ± 0.0025 | 0.0404 ± 0.0014 |
| | Transformer | 0.0269 ± 0.0008 | 0.0359 ± 0.0010 | 0.0411 ± 0.0011 | 0.0486 ± 0.0017 | 0.0542 ± 0.0019 | 0.0413 ± 0.0013 |
| DOGE | BD-LSTM | 0.0291 ± 0.0001 | 0.0618 ± 0.0041 | 0.0673 ± 0.0046 | 0.0742 ± 0.0044 | 0.0795 ± 0.0041 | 0.0624 ± 0.0035 |
| | ED-LSTM | 0.0290 ± 0.0001 | 0.0626 ± 0.0030 | 0.0656 ± 0.0029 | 0.0708 ± 0.0026 | 0.0757 ± 0.0024 | 0.0607 ± 0.0022 |
| | LSTM | 0.0660 ± 0.0039 | 0.0683 ± 0.0042 | 0.0728 ± 0.0047 | 0.0792 ± 0.0048 | 0.0835 ± 0.0043 | 0.0740 ± 0.0044 |
| | CNN | 0.0646 ± 0.0041 | 0.0630 ± 0.0040 | 0.0655 ± 0.0039 | 0.0762 ± 0.0039 | 0.0806 ± 0.0031 | 0.0700 ± 0.0038 |
| | Conv-LSTM | 0.0538 ± 0.0021 | 0.0593 ± 0.0022 | 0.0612 ± 0.0020 | 0.0662 ± 0.0020 | 0.0726 ± 0.0019 | 0.0626 ± 0.0020 |
| | Transformer | 0.0536 ± 0.0071 | 0.0586 ± 0.0077 | 0.0639 ± 0.0073 | 0.0724 ± 0.0070 | 0.0789 ± 0.0064 | 0.0655 ± 0.0071 |
| LTC | BD-LSTM | 0.0577 ± 0.0007 | 0.0797 ± 0.0022 | 0.0968 ± 0.0018 | 0.1096 ± 0.0015 | 0.1205 ± 0.0016 | 0.0929 ± 0.0016 |
| | ED-LSTM | 0.0578 ± 0.0006 | 0.0797 ± 0.0012 | 0.0962 ± 0.0010 | 0.1096 ± 0.0009 | 0.1198 ± 0.0009 | 0.0926 ± 0.0009 |
| | LSTM | 0.0587 ± 0.0021 | 0.0804 ± 0.0017 | 0.0971 ± 0.0015 | 0.1100 ± 0.0014 | 0.1207 ± 0.0013 | 0.0934 ± 0.0016 |
| | CNN | 0.0823 ± 0.0011 | 0.1060 ± 0.0042 | 0.1163 ± 0.0025 | 0.1268 ± 0.0029 | 0.1403 ± 0.0049 | 0.1143 ± 0.0031 |
| | Conv-LSTM | 0.0809 ± 0.0073 | 0.1043 ± 0.0095 | 0.1201 ± 0.0087 | 0.1286 ± 0.0086 | 0.1383 ± 0.0072 | 0.1145 ± 0.0083 |
| | Transformer | 0.0890 ± 0.0056 | 0.1046 ± 0.0048 | 0.1163 ± 0.0042 | 0.1273 ± 0.0038 | 0.1354 ± 0.0033 | 0.1146 ± 0.0043 |
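
| Data | Strategy | LSTM | ED-LSTM | BD-LSTM | CNN | Conv-LSTM | Transformer |
|---|---|---|---|---|---|---|---|
| BTC | Univariate | 4 | 3 | 1 | 5 | 2 | 6 |
| | Multivariate | 3 | 1 | 2 | 5 | 6 | 4 |
| ETH | Univariate | 1 | 3 | 2 | 5 | 6 | 4 |
| | Multivariate | 5 | 4 | 3 | 1 | 2 | 6 |
| DOGE | Univariate | 4 | 3 | 1 | 6 | 2 | 5 |
| | Multivariate | 2 | 3 | 1 | 6 | 4 | 5 |
| LTC | Univariate | 4 | 3 | 2 | 5 | 1 | 6 |
| | Multivariate | 1 | 3 | 5 | 6 | 2 | 4 |
| Mean Rank | | 3 | 2.875 | 2.125 | 4.875 | 3.125 | 5 |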
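
| Data | LSTM | ED-LSTM | BD-LSTM | CNN | Conv-LSTM | Transformer |
|---|---|---|---|---|---|---|
| BTC | 4 | 3 | 1 | 5 | 2 | 6 |
| ETH | 3 | 2 | 1 | 4 | 5 | 6 |
| DOGE | 6 | 4 | 1 | 5 | 2 | 3 |
| LTC | 3 | 1 | 2 | 4 | 5 | 6 |
| Mean Rank | 4 | 2.5 | 1.25 | 4.5 | 3.5 | 5.25 |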

5 Discussion

In this section, we provide a discussion based on the results, taking into consideration the model architecture as well as the characteristics of the data. In summary, we evaluated the predictive performance of all models and presented the results through Figures 14 to 21. We also provide a ranking of the performance accuracy for each type of model in Tables 12 and 13.
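The mean ranks in the ranking tables are simple averages of each model's rank over the datasets. The sketch below reproduces the "Mean Rank" row of Table 13 from the per-cryptocurrency ranks listed there; the variable and function names are illustrative and are not taken from the released code.

```python
# Ranks per cryptocurrency, copied from Table 13 (1 = best).
MODELS = ["LSTM", "ED-LSTM", "BD-LSTM", "CNN", "Conv-LSTM", "Transformer"]
RANKS = {
    "BTC": [4, 3, 1, 5, 2, 6],
    "ETH": [3, 2, 1, 4, 5, 6],
    "DOGE": [6, 4, 1, 5, 2, 3],
    "LTC": [3, 1, 2, 4, 5, 6],
}


def mean_ranks(ranks, models):
    """Average each model's rank over all datasets."""
    n = len(ranks)
    return {
        model: sum(row[i] for row in ranks.values()) / n
        for i, model in enumerate(models)
    }


result = mean_ranks(RANKS, MODELS)
best = min(result, key=result.get)  # model with the lowest mean rank
print(result)
print(best)
```

With these inputs, BD-LSTM obtains the lowest mean rank (1.25), which matches the last row of Table 13.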

We first review the results of the first experiment, in which the models were trained on pre-COVID-19 data and evaluated on the early COVID-19 period. Our results show that LSTM, BD-LSTM and ED-LSTM provide strong predictive performance across the four cryptocurrencies and both strategies for selecting model variables (univariate and multivariate). The CNN, Conv-LSTM, and Transformer models perform well only under particular conditions. For all the cryptocurrencies, the univariate strategy outperformed the multivariate strategy, although in some cases (Bitcoin and Dogecoin) the accuracy of the two strategies was close. Models with better prediction accuracy (lower RMSE) were mostly accompanied by narrower confidence intervals, that is, they were more robust to different weight initializations in independent experimental runs; conversely, higher RMSE values usually came with lower robustness. The accuracy generally declines as the prediction horizon increases, which is natural for multi-step ahead problems (Figure 14b). Every horizon is predicted from the same window of current values, so the information gap widens as the prediction step increases; this follows from our task being defined as a direct multi-step forecasting approach rather than an iterated prediction strategy. Comparing the models across prediction horizons, we find that CNN and Conv-LSTM are significantly worse than the other models. The forecast accuracy of these two models frequently declines more rapidly than that of the other models, and occasionally even fluctuates (Figure 18b), across Steps 1 to 5. We found that the CNN-related models using the convolution operation provided lower accuracy than the other models in predicting cryptocurrency prices; we analyse the cause of this issue later in this section. Among the four cryptocurrencies in Experiment 1 (Table 10), Dogecoin has the worst prediction performance: the RMSE values of the models are significantly higher than those for the other three cryptocurrencies. We believe this is due to the particular behaviour of the Dogecoin price, which leads to large prediction errors: the first 70% of the data fluctuates smoothly, while the last 30% fluctuates violently.
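The direct multi-step setup discussed above can be illustrated with a short sketch: each input window is paired with the next few values, all of which are predicted jointly, and the RMSE is then computed separately for each horizon. The window length of 6, the 5-step output, the toy series, and the naive "repeat the last value" predictor are assumptions chosen for illustration; they are not the trained models or the data from our experiments.

```python
import math


def make_windows(series, n_in, n_out):
    """Split a series into (input window, multi-step target) pairs.

    Direct strategy: each window of n_in values is paired with the
    next n_out values, which a model predicts in one forward pass.
    """
    pairs = []
    for start in range(len(series) - n_in - n_out + 1):
        x = series[start:start + n_in]
        y = series[start + n_in:start + n_in + n_out]
        pairs.append((x, y))
    return pairs


def rmse_per_step(y_true, y_pred):
    """RMSE computed separately for each prediction horizon."""
    n_out = len(y_true[0])
    errors = []
    for step in range(n_out):
        sq = [(t[step] - p[step]) ** 2 for t, p in zip(y_true, y_pred)]
        errors.append(math.sqrt(sum(sq) / len(sq)))
    return errors


# Toy upward-trending series (not real price data).
series = [float(v) for v in range(1, 13)]  # 1.0 .. 12.0
pairs = make_windows(series, n_in=6, n_out=5)
targets = [y for _, y in pairs]
# Naive baseline: repeat the last observed value for all five horizons.
preds = [[x[-1]] * 5 for x, _ in pairs]
errors = rmse_per_step(targets, preds)
print(len(pairs), errors)
```

In this toy example the per-step error grows with the horizon because every step is predicted from the same last observed window, which mirrors the widening information gap described above.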

In the second experiment, we evaluate the accuracy of the deep learning models in predicting cryptocurrency prices with a new dataset. We utilize the two models that performed best in the previous experiment to forecast with the new dataset, which includes all close prices from the start of COVID-19 to April 2024. The close-price forecasts for Bitcoin, Dogecoin, and Ethereum all improved. Also, as the prediction horizon increases, the prediction accuracy of the models deteriorates more slowly. We suggest two causes for the decrease in the accuracy of forecasting Litecoin. First, because our selection criterion is the overall performance of a model, we did not choose Conv-LSTM, which had the best performance in predicting Litecoin in the previous experiment; Conv-LSTM may be more suitable for predicting Litecoin. Second, Litecoin experienced violent price swings before COVID-19 and, due to our experimental design, the data from this period was not included in the training set.

Next, we investigate what might contribute to the lower accuracy of multivariate models compared to univariate models. In our analysis (Fig or Table xxx), we notice that cryptocurrency prices are extremely unstable and are strongly influenced by several variables both inside and outside the market [10]. Simply adding extra features may therefore not only fail to help the model forecast accurately, but may also mislead it into learning irrelevant data features.

Our analysis of volatility (Figures 10 and 11) shows the high degree of volatility exhibited by cryptocurrencies throughout the COVID-19 pandemic. Through a comparative analysis of the outcomes of the first and second experiments, we observed that using high-volatility data as the training set provides better prediction accuracy for the models. The robustness and scalability of the models are also improved.
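As a rough illustration of what "high volatility" means for a close-price series, the sketch below computes a rolling standard deviation of log returns. This estimator, the window length, and the toy prices are assumptions made for illustration only; the volatility measure used in our analysis may differ.

```python
import math
import statistics


def log_returns(prices):
    """Log return between each pair of consecutive close prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def rolling_volatility(prices, window):
    """Rolling sample standard deviation of log returns."""
    r = log_returns(prices)
    return [
        statistics.stdev(r[i - window:i])
        for i in range(window, len(r) + 1)
    ]


# Toy close prices (hypothetical): a calm stretch, then a turbulent one.
prices = [100.0, 100.5, 100.2, 100.6, 100.4, 100.8,
          92.0, 104.0, 90.0, 108.0, 95.0]
vol = rolling_volatility(prices, window=4)
print(vol[0], vol[-1])
```

The first value covers only the calm stretch and the last value only the turbulent stretch, so the second is much larger; a training set drawn from such a turbulent period exposes the model to a wider range of price movements.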

Next, we investigate the factors contributing to the advantages and disadvantages of each model. Conventional time series models and machine learning techniques are inadequate for addressing issues such as temporal dependency and gradient explosion. When we used MLP and ARIMA models to predict Bitcoin, their prediction performance was not as good as that of the deep learning models. Furthermore, according to [139], there is long-term memory in the cryptocurrency market. The LSTM network, a deep learning model initially designed to address long-term memory issues, distinguishes itself in this regard: its gated memory cells can better capture information in time series with long-term dependencies. The prediction performance of CNN is worse than that of LSTM, which is what we expected. Because the convolutional layer in a CNN is better at capturing local patterns and features, it gives better prediction results for sequences with obvious spatial correlation between data points. According to past research, CNNs seem to be more effective in handling image recognition problems. Next, we analyze the differences between the four models with LSTM layers (LSTM, ED-LSTM, BD-LSTM and Conv-LSTM). The ED-LSTM model was created for language modeling problems, particularly for sequence-to-sequence modeling in language translation. In this model, an encoder LSTM is used to transform a source sequence into a fixed-length vector, while a decoder LSTM is employed to convert the vector representation back into a variable-length target sequence [105]. In our study, the encoder maps an input time series to a vector of fixed length. Subsequently, the decoder LSTM translates the vector representation into several prediction horizons. Despite the differences in the application, the fundamental objective of mapping inputs to outputs remains unchanged. As a result, ED-LSTM models have proven to be highly effective for multi-step ahead prediction.
BD-LSTMs utilize two LSTM models to capture both forward and backward information about the sequence at each time step [102]. While these models have been shown to be effective for language modeling, our findings indicate that they are also useful for tracking both present and future states in time series modeling. In our experiments, BD-LSTM and ED-LSTM provided stable and strong prediction performance. Although Conv-LSTM uses convolutional layers as input with LSTM memory cells in the hidden layer, it differs from conventional LSTM models since the memory cells from different hidden layers update only in the time domain and are mutually independent [140]. Therefore, information at the top layer at time $t-1$ is ignored by the bottom layer at time $t$. In cryptocurrency price prediction, temporal information is crucial, which also explains why the prediction accuracy of Conv-LSTM in our experiments is often low and unstable at longer prediction horizons. Finally, the Transformer model provided unsatisfactory results, which may be attributed to the limited training data, as Transformer models are often better suited to large datasets.
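To make the role of the LSTM memory cell more concrete, the sketch below implements a single scalar LSTM step with the standard forget, input, and output gates, and shows that a stored value can survive many steps of unrelated input when the forget gate stays open. The weights are hand-picked and hypothetical (not trained), and this scalar cell is only an illustration of the gating mechanism, not the network implementation used in our experiments.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, w):
    """One step of a scalar LSTM cell.

    w maps each gate name to (input weight, recurrent weight, bias).
    """
    def pre(name):
        wx, wh, b = w[name]
        return wx * x + wh * h_prev + b

    f = sigmoid(pre("forget"))   # how much old memory to keep
    i = sigmoid(pre("input"))    # how much new information to write
    o = sigmoid(pre("output"))   # how much memory to expose
    g = math.tanh(pre("cand"))   # candidate memory content
    c = f * c_prev + i * g       # updated cell state
    h = o * math.tanh(c)         # updated hidden state
    return h, c


# Hypothetical weights: large biases hold the forget gate nearly open
# and the input gate nearly closed.
weights = {
    "forget": (0.0, 0.0, 6.0),
    "input": (0.0, 0.0, -6.0),
    "output": (0.0, 0.0, 6.0),
    "cand": (1.0, 0.0, 0.0),
}
h, c = 0.0, 1.0  # start with a stored memory value of 1.0
for x in [0.3, -0.2, 0.5, -0.4, 0.1] * 4:  # 20 steps of unrelated input
    h, c = lstm_step(x, h, c, weights)
print(round(c, 3))
```

After 20 steps the cell state remains close to its initial value, which is the mechanism that lets LSTM-based models carry information across long lags in a price series.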

6 Conclusions

In this study, we provide a rigorous evaluation of novel deep learning models for cryptocurrency price forecasting. We compared prominent deep learning models using univariate and multivariate strategies. The results show that the Bidirectional LSTM (BD-LSTM) provides the highest accuracy in predicting cryptocurrency prices. We also provided a comparison with baseline models such as the multilayer perceptron and ARIMA and found that deep learning models generally outperform them. We also found that multivariate models provided lower prediction accuracy than univariate models; however, there is scope for improvement given the availability of more highly correlated time series data as features. In terms of the effect of COVID-19, we found that the close-price volatility of cryptocurrencies was pronounced. Our experimental results show that utilising a training data set with high volatility enhances the accuracy of our predictions.

In future work, it would be worthwhile to improve the multivariate model by using more reliable features, possibly identified with techniques such as causal inference. This study framework could also be applied to predicting other financial indicators, such as the volatility of cryptocurrencies. Further applications to other problems could also be viable, such as predicting energy use and forecasting extreme weather.

7 Code and Data

We provide open source code and data via a GitHub repository: https://github.com/sydney-machine-learning/deeplearning-crypto

References

  • [1]S.Bose, G.Dong, A.Simpson, The financial ecosystem, Springer, 2019.
  • [2]J.Frankel, B.Smit, F.Sturzenegger, Fiscal and monetary policy in a commodity-based economy, Economics of transition 16(4) (2008) 679–713.
  • [3]F.A. Hayek, Denationalisation of money: the argument refined: an analysis of the theory and practice of concurrent currencies, Vol.70, Institute of economic affairs, 1990.
  • [4]M.M. Gross, C.Siebenbrunner, Money creation in fiat and digital currency systems, International Monetary Fund, 2019.
  • [5]J.H. Boyd, R.Levine, B.D. Smith, The impact of inflation on financial sector performance, Journal of monetary Economics 47(2) (2001) 221–248.
  • [6]U.Milkau, J.Bott, Digitalisation in payments: From interoperability to centralised models?, Journal of Payments Strategy & Systems 9(3) (2015) 321–340.
  • [7]D.Chaum, Blind signatures for untraceable payments, in: Advances in Cryptology: Proceedings of Crypto 82, Springer, 1983, pp. 199–203.
  • [8]S.Nakamoto, Bitcoin: A peer-to-peer electronic cash system (2008).
  • [9]A.Manimuthu, G.Rejikumar, D.Marwaha, et al., A literature review on Bitcoin: transformation of crypto currency into a global phenomenon, IEEE Engineering Management Review 47(1) (2019) 28–35.
  • [10]R.Farell, An analysis of the cryptocurrency industry, Wharton Research Scholars 130 (2015) 1–23.
  • [11]I.Eyal, Blockchain technology: Transforming libertarian cryptocurrency dreams to finance and banking realities, Computer 50(9) (2017) 38–49.
  • [12]H.Jang, J.Lee, An empirical study on modeling and prediction of bitcoin prices with bayesian neural networks based on blockchain information, IEEE access 6 (2017) 5427–5437.
  • [13]M.Saad, J.Choi, D.Nyang, J.Kim, A.Mohaisen, Toward characterizing blockchain-based cryptocurrencies for highly accurate predictions, IEEE Systems Journal 14(1) (2019) 321–332.
  • [14]S.Corbet, B.Lucey, L.Yarovaya, Datestamping the bitcoin and ethereum bubbles, Finance Research Letters 26 (2018) 81–88.
  • [15]J.Bhosale, S.Mavale, Volatility of select crypto-currencies: A comparison of bitcoin, ethereum and litecoin, Annu. Res. J. SCMS, Pune 6(1) (2018) 132–141.
  • [16]P.Katsiampa, An empirical investigation of volatility dynamics in the cryptocurrency market, Research in International Business and Finance 50 (2019) 322–335.
  • [17]H.Elendner, S.Trimborn, B.Ong, T.M. Lee, The cross-section of crypto-currencies as financial assets: An overview (2016).
  • [18]P.L. Seabe, C.R.B. Moutsinga, E.Pindza, Forecasting cryptocurrency prices using lstm, gru, and bi-directional lstm: A deep learning approach, Fractal and Fractional 7(2) (2023) 203.
  • [19]N.Kyriazis, S.Papadamou, S.Corbet, A systematic review of the bubble dynamics of cryptocurrency prices, Research in International Business and Finance 54 (2020) 101254.
  • [20]M.A. Ammer, T.H. Aldhyani, Deep learning algorithm to predict cryptocurrency fluctuation prices: Increasing investment awareness, Electronics 11(15) (2022) 2349.
  • [21]K.Murray, A.Rossi, D.Carraro, A.Visentin, On forecasting cryptocurrency prices: A comparison of machine learning, deep learning, and ensembles, Forecasting 5(1) (2023) 196–209.
  • [22]Y.Wang, G.Andreeva, B.Martin-Barragan, Machine learning approaches to forecasting cryptocurrency volatility: Considering internal and external determinants, International Review of Financial Analysis 90 (2023) 102914.
  • [23]N.A. Kyriazis, A survey on empirical findings about spillovers in cryptocurrency markets, Journal of Risk and Financial Management 12(4) (2019) 170.
  • [24]J.H. Stock, M.W. Watson, Vector autoregressions, Journal of Economic perspectives 15(4) (2001) 101–115.
  • [25]J.-C. Duan, The garch option pricing model, Mathematical finance 5(1) (1995) 13–32.
  • [26]T.L.D. Huynh, M.A. Nasir, X.V. Vo, T.T. Nguyen, “small things matter most”: The spillover effects in the cryptocurrency market and gold as a silver bullet, The North American Journal of Economics and Finance 54 (2020) 101277.
  • [27]T.Baltrušaitis, C.Ahuja, L.-P. Morency, Multimodal machine learning: A survey and taxonomy, IEEE transactions on pattern analysis and machine intelligence 41(2) (2018) 423–443.
  • [28]S.Wang, J.Cao, S.Y. Philip, Deep learning for spatio-temporal data mining: A survey, IEEE transactions on knowledge and data engineering 34(8) (2020) 3681–3700.
  • [29]B.Lim, S.Zohren, Time-series forecasting with deep learning: a survey, Philosophical Transactions of the Royal Society A 379(2194) (2021) 20200209.
  • [30]V.Jacques-Dumas, F.Ragone, P.Borgnat, P.Abry, F.Bouchet, Deep learning-based extreme heatwave forecast, Frontiers in Climate 4 (2022).
  • [31]S.Mahjoub, L.Chrifi-Alaoui, B.Marhic, L.Delahoche, Predicting energy consumption using lstm, multi-layer gru and drop-gru neural networks, Sensors 22(11) (2022) 4062.
  • [32]R.Chandra, Y.He, Bayesian neural networks for stock price forecasting before and during covid-19 pandemic, Plos one 16(7) (2021) e0253217.
  • [33]I.E. Livieris, E.Pintelas, S.Stavroyiannis, P.Pintelas, Ensemble deep learning models for forecasting cryptocurrency time-series, Algorithms 13(5) (2020) 121.
  • [34]F.Ferdiansyah, S.H. Othman, R.Z. R.M. Radzi, D.Stiawan, Y.Sazaki, U.Ependi, A lstm-method for bitcoin price prediction: A case study yahoo finance stock market, in: 2019 international conference on electrical engineering and computer science (ICECOS), IEEE, 2019, pp. 206–210.
  • [35]C.-H. Wu, C.-C. Lu, Y.-F. Ma, R.-S. Lu, A new forecasting framework for bitcoin price with lstm, in: 2018 IEEE international conference on data mining workshops (ICDMW), IEEE, 2018, pp. 168–175.
  • [36]S.Hochreiter, J.Schmidhuber, Long short-term memory, Neural computation 9(8) (1997) 1735–1780.
  • [37]Y.Yu, X.Si, C.Hu, J.Zhang, A review of recurrent neural networks: LSTM cells and network architectures, Neural computation 31(7) (2019) 1235–1270.
  • [38]Z.Jiang, J.Liang, Cryptocurrency portfolio management with deep reinforcement learning, in: 2017 Intelligent systems conference (IntelliSys), IEEE, 2017, pp. 905–913.
  • [39]S.Sridhar, S.Sanagavarapu, Multi-head self-attention transformer for dogecoin price prediction, in: 2021 14th International Conference on Human System Interaction (HSI), IEEE, 2021, pp. 1–6.
  • [40]R.Chandra, S.Goyal, R.Gupta, Evaluation of deep learning models for multi-step ahead time series prediction, Ieee Access 9 (2021) 83105–83123.
  • [41]R.S. Tsay, Analysis of financial time series, John wiley & sons, 2005.
  • [42]T.Fischer, C.Krauss, Deep learning with long short-term memory networks for financial market predictions, European journal of operational research 270(2) (2018) 654–669.
  • [43]O.B. Sezer, M.U. Gudelek, A.M. Ozbayoglu, Financial time series forecasting with deep learning: A systematic literature review: 2005–2019, Applied soft computing 90 (2020) 106181.
  • [44]V.Plakandaras, T.Papadimitriou, P.Gogas, K.Diamantaras, Market sentiment and exchange rate directional forecasting, Algorithmic Finance 4(1-2) (2015) 69–79.
  • [45]M.Nabipour, P.Nayyeri, H.Jabani, S.Shahab, A.Mosavi, Predicting stock market trends using machine learning and deep learning algorithms via continuous and binary data; a comparative analysis, Ieee Access 8 (2020) 150199–150212.
  • [46]A.Kong, H.Zhu, R.Azencott, Predicting intraday jumps in stock prices using liquidity measures and technical indicators, Journal of Forecasting 40(3) (2021) 416–438.
  • [47]J.L. Elman, Finding structure in time, Cognitive science 14(2) (1990) 179–211.
  • [48]S.Mehtab, J.Sen, A.Dutta, Stock price prediction using machine learning and lstm-based deep learning models, in: Machine Learning and Metaheuristics Algorithms, and Applications: Second Symposium, SoMMA 2020, Chennai, India, October 14–17, 2020, Revised Selected Papers 2, Springer, 2021, pp. 88–106.
  • [49]H.Rezaei, H.Faaljou, G.Mansourfar, Stock price prediction using deep learning and frequency decomposition, Expert Systems with Applications 169 (2021) 114332.
  • [50]G.Rilling, P.Flandrin, P.Goncalves, et al., On empirical mode decomposition and its algorithms, in: IEEE-EURASIP workshop on nonlinear signal and image processing, Vol.3, Grado: IEEE, 2003, pp. 8–11.
  • [51]M.E. Torres, M.A. Colominas, G.Schlotthauer, P.Flandrin, A complete ensemble empirical mode decomposition with adaptive noise, in: 2011 IEEE international conference on acoustics, speech and signal processing (ICASSP), IEEE, 2011, pp. 4144–4147.
  • [52]N.Jing, Z.Wu, H.Wang, A hybrid model integrating deep learning with investor sentiment analysis for stock price prediction, Expert Systems with Applications 178 (2021) 115019.
  • [53]S.Mehtab, J.Sen, Stock price prediction using machine learning and deep learning algorithms and models, Machine Learning in the Analysis and Forecasting of Financial Time Series (2022) 235–303.
  • [54]Y.Li, Y.Pan, A novel ensemble deep learning model for stock prediction based on stock prices and news, International Journal of Data Science and Analytics 13(2) (2022) 139–149.
  • [55]A.Kanwal, M.F. Lau, S.P. Ng, K.Y. Sim, S.Chandrasekaran, Bicudnnlstm-1dcnn—a hybrid deep learning-based predictive model for stock price prediction, Expert Systems with Applications 202 (2022) 117123.
  • [56]T.Swathi, N.Kasiviswanath, A.A. Rao, An optimal deep learning-based lstm for stock price prediction using twitter sentiment analysis, Applied Intelligence 52(12) (2022) 13675–13688.
  • [57]H.BenAmeur, S.Boubaker, Z.Ftiti, W.Louhichi, K.Tissaoui, Forecasting commodity prices: empirical evidence using deep learning tools, Annals of Operations Research (2023) 1–19.
  • [58]P.Baser, J.R. Saini, N.Baser, Gold commodity price prediction using tree-based prediction models, International Journal of Intelligent Systems and Applications in Engineering 11(1s) (2023) 90–96.
  • [59]S.Deepa, A.Alli, S.Gokila, et al., Machine learning regression model for material synthesis prices prediction in agriculture, Materials Today: Proceedings 81 (2023) 989–993.
  • [60]Y.Zhao, G.Yang, Deep learning-based integrated framework for stock price movement prediction, Applied Soft Computing 133 (2023) 109921.
  • [61]J.Almeida, S.Tata, A.Moser, V.Smit, Bitcoin prediciton using ann, Neural networks 7 (2015) 1–12.
  • [62]D.C. Mallqui, R.A. Fernandes, Predicting the direction, maximum, minimum and closing prices of daily bitcoin exchange rate using machine learning techniques, Applied Soft Computing 75 (2019) 596–606.
  • [63]S.G. Quek, G.Selvachandran, J.H. Tan, H.Y.A. Thiang, N.T. Tuan, et al., A new hybrid model of fuzzy time series and genetic algorithm based machine learning algorithm: a case study of forecasting prices of nine types of major cryptocurrencies, Big Data Research 28 (2022) 100315.
  • [64]A.Radityo, Q.Munajat, I.Budi, Prediction of bitcoin exchange rate to american dollar using artificial neural network methods, in: 2017 international conference on advanced computer science and information systems (ICACSIS), IEEE, 2017, pp. 433–438.
  • [65]A.Greaves, B.Au, Using the bitcoin transaction graph to predict the price of bitcoin, No data 8 (2015) 416–443.
  • [66]C.Cortes, V.Vapnik, Support-vector networks, Machine learning 20 (1995) 273–297.
  • [67]Y.Sovbetov, Factors influencing cryptocurrency prices: Evidence from bitcoin, ethereum, dash, litcoin, and monero, Journal of Economics and Financial Analysis 2(2) (2018) 1–27.
  • [68]T.Guo, A.Bifet, N.Antulov-Fantulin, Bitcoin volatility forecasting with a glimpse into buy and sell orders, in: 2018 IEEE international conference on data mining (ICDM), IEEE, 2018, pp. 989–994.
  • [69]C.G. Akcora, A.K. Dey, Y.R. Gel, M.Kantarcioglu, Forecasting bitcoin price with graph chainlets, in: Advances in Knowledge Discovery and Data Mining: 22nd Pacific-Asia Conference, PAKDD 2018, Melbourne, VIC, Australia, June 3-6, 2018, Proceedings, Part III 22, Springer, 2018, pp. 765–776.
  • [70]S.Roy, S.Nanjiba, A.Chakrabarty, Bitcoin price forecasting using time series analysis, in: 2018 21st International Conference of Computer and Information Technology (ICCIT), IEEE, 2018, pp. 1–5.
  • [71]V.Derbentsev, N.Datsenko, O.Stepanenko, V.Bezkorovainyi, Forecasting cryptocurrency prices time series using machine learning approach, in: SHS Web of Conferences, Vol.65, EDP Sciences, 2019, p. 02001.
  • [72]S.Aanandhi, S.Akhilaa, V.Vardarajan, M.Sathiyanarayanan, et al., Cryptocurrency price prediction using time series forecasting (arima), in: 2021 4th International Seminar on Research of Information Technology and Intelligent Systems (ISRITI), IEEE, 2021, pp. 598–602.
  • [73]N.Latif, J.D. Selvam, M.Kapse, V.Sharma, V.Mahajan, Comparative performance of lstm and arima for the short-term prediction of bitcoin prices, Australasian Accounting, Business and Finance Journal 17(1) (2023) 256–276.
  • [74]N.Maleki, A.Nikoubin, M.Rabbani, Y.Zeinali, Bitcoin price prediction based on other cryptocurrencies using machine learning and time series analysis, Scientia Iranica 30(1) (2023) 285–301.
  • [75]L.P. Kaelbling, M.L. Littman, A.W. Moore, Reinforcement learning: A survey, Journal of artificial intelligence research 4 (1996) 237–285.
  • [76]K.Lee, S.Ulkuatam, P.Beling, W.Scherer, Generating synthetic bitcoin transactions and predicting market price movement via inverse reinforcement learning and agent-based modeling, Journal of Artificial Societies and Social Simulation 21(3) (2018).
  • [77]B.Ly, D.Timaul, A.Lukanan, J.Lau, E.Steinmetz, Applying deep learning to better predict cryptocurrency trends, in: Midwest Instruction and Computing Symposium, 2018.
  • [78]G.Lucarelli, M.Borrotti, A deep reinforcement learning approach for automated cryptocurrency trading, in: Artificial Intelligence Applications and Innovations: 15th IFIP WG 12.5 International Conference, AIAI 2019, Hersonissos, Crete, Greece, May 24–26, 2019, Proceedings 15, Springer, 2019, pp. 247–258.
  • [79]S.Lahmiri, S.Bekiros, Cryptocurrency forecasting with deep learning chaotic neural networks, Chaos, Solitons & Fractals 118 (2019) 35–40.
  • [80]M.M. Patel, S.Tanwar, R.Gupta, N.Kumar, A deep learning-based cryptocurrency price prediction scheme for financial institutions, Journal of information security and applications 55 (2020) 102583.
  • [81]S.Marne, S.Churi, D.Correia, J.Gomes, Predicting price of cryptocurrency–a deep learning approach, NTASU-9 (3) (2020).
  • [82]S.Nasekin, C.Y.-H. Chen, Deep learning-based cryptocurrency sentiment construction, Digital Finance 2(1) (2020) 39–67.
  • [83]C.Betancourt, W.-H. Chen, Deep reinforcement learning for portfolio management of markets with a dynamic number of assets, Expert Systems with Applications 164 (2021) 114002.
  • [84]K.Arulkumaran, M.P. Deisenroth, M.Brundage, A.A. Bharath, Deep reinforcement learning: A brief survey, IEEE Signal Processing Magazine 34(6) (2017) 26–38.
  • [85]Z.Shahbazi, Y.-C. Byun, Improving the cryptocurrency price prediction performance based on reinforcement learning, IEEE Access 9 (2021) 162651–162659.
  • [86]V.D’Amato, S.Levantesi, G.Piscopo, Deep learning in predicting cryptocurrency volatility, Physica A: Statistical Mechanics and its Applications 596 (2022) 127158.
  • [87]M.Schnaubelt, Deep reinforcement learning for the optimal placement of cryptocurrency limit orders, European Journal of Operational Research 296(3) (2022) 993–1006.
  • [88]R.Parekh, N.P. Patel, N.Thakkar, R.Gupta, S.Tanwar, G.Sharma, I.E. Davidson, R.Sharma, Dl-guess: Deep learning and sentiment analysis-based cryptocurrency price prediction, IEEE Access 10 (2022) 35398–35409.
  • [89]G.Kim, D.-H. Shin, J.G. Choi, S.Lim, A deep learning-based cryptocurrency price prediction model that uses on-chain data, IEEE Access 10 (2022) 56232–56248.
  • [90]S.Goutte, H.-V. Le, F.Liu, H.-J. VonMettenheim, Deep learning and technical analysis in cryptocurrency market, Finance Research Letters 54 (2023) 103809.
  • [91]K.-C. Yen, H.-P. Cheng, Economic policy uncertainty and cryptocurrency volatility, Finance Research Letters 38 (2021) 101428.
  • [92]F.Woebbeking, Cryptocurrency volatility markets, Digital finance 3(3) (2021) 273–298.
  • [93]J.L. Cross, C.Hou, K.Trinh, Returns, volatility and the cryptocurrency bubble of 2017–18, Economic Modelling 104 (2021) 105643.
  • [94]Z.Ftiti, W.Louhichi, H.BenAmeur, Cryptocurrency volatility forecasting: What can we learn from the first wave of the covid-19 outbreak?, Annals of Operations Research 330(1) (2023) 665–690.
  • [95]L.Yin, J.Nie, L.Han, Understanding cryptocurrency volatility: The role of oil market shocks, International Review of Economics & Finance 72 (2021) 233–253.
  • [96]L.Catania, S.Grassi, F.Ravazzolo, Predicting the volatility of cryptocurrency time-series, Mathematical and Statistical Methods for Actuarial Sciences and Finance: MAF 2018 (2018) 203–207.
  • [97]L.Catania, S.Grassi, Forecasting cryptocurrency volatility, International Journal of Forecasting 38(3) (2022) 878–894.
  • [98]F.Ma, C.Liang, Y.Ma, M.I.M. Wahab, Cryptocurrency volatility forecasting: A markov regime-switching midas approach, Journal of Forecasting 39(8) (2020) 1277–1290.
  • [99]Y.Wei, Y.Wang, B.M. Lucey, S.A. Vigne, Cryptocurrency uncertainty and volatility forecasting of precious metal futures markets, Journal of Commodity Markets 29 (2023) 100305.
  • [100]G.E. Box, G.M. Jenkins, G.C. Reinsel, G.M. Ljung, Time series analysis: forecasting and control, John Wiley & Sons, 2015.
  • [101]S.Hochreiter, The vanishing gradient problem during learning recurrent neural nets and problem solutions, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 6(02) (1998) 107–116.
  • [102]A.Graves, J.Schmidhuber, Framewise phoneme classification with bidirectional lstm and other neural network architectures, Neural networks 18(5-6) (2005) 602–610.
  • [103]Y.Liu, C.Sun, L.Lin, X.Wang, Learning natural language inference using bidirectional lstm model and inner-attention, arXiv preprint arXiv:1605.09090 (2016).
  • [104]L.Chen, J.Tao, S.Ghaffarzadegan, Y.Qian, End-to-end neural network based automated speech scoring, in: 2018 IEEE international conference on acoustics, speech and signal processing (ICASSP), IEEE, 2018, pp. 6234–6238.
  • [105]I.Sutskever, O.Vinyals, Q.V. Le, Sequence to sequence learning with neural networks, Advances in neural information processing systems 27 (2014).
  • [106]K.Cho, B.VanMerriënboer, C.Gulcehre, D.Bahdanau, F.Bougares, H.Schwenk, Y.Bengio, Learning phrase representations using RNN encoder-decoder for statistical machine translation, arXiv preprint arXiv:1406.1078 (2014).
  • [107]H.Gunduz, Y.Yaslan, Z.Cataltepe, Intraday prediction of borsa istanbul using convolutional neural networks and feature correlations, Knowledge-Based Systems 137 (2017) 138–148.
  • [108]L.DiPersio, O.Honchar, et al., Artificial neural networks architectures for stock price prediction: Comparisons and applications, International journal of circuits, systems and signal processing 10 (2016) 403–413.
  • [109]E.Hoseinzade, S.Haratizadeh, Cnnpred: Cnn-based stock market prediction using a diverse set of variables, Expert Systems with Applications 129 (2019) 273–285.
  • [110]A.Siripurapu, Convolutional networks for stock trading, Stanford Univ Dep Comput Sci 1(2) (2014) 1–6.
  • [111]H.Jiang, E.Learned-Miller, Face detection with the faster r-cnn, in: 2017 12th IEEE international conference on automatic face & gesture recognition (FG 2017), IEEE, 2017, pp. 650–657.
  • [112]A.Garcia-Garcia, S.Orts-Escolano, S.Oprea, V.Villena-Martinez, J.Garcia-Rodriguez, A review on deep learning techniques applied to semantic segmentation, arXiv preprint arXiv:1704.06857 (2017).
  • [113]D.P. Kingma, J.Ba, Adam: A method for stochastic optimization, arXiv preprint arXiv:1412.6980 (2014).
  • [114]X.Shi, Z.Chen, H.Wang, D.-Y. Yeung, W.-K. Wong, W.-c. Woo, Convolutional lstm network: A machine learning approach for precipitation nowcasting, Advances in neural information processing systems 28 (2015).
  • [115]K.Cho, B.VanMerriënboer, D.Bahdanau, Y.Bengio, On the properties of neural machine translation: Encoder-decoder approaches, arXiv preprint arXiv:1409.1259 (2014).
  • [116]D.Bahdanau, K.Cho, Y.Bengio, Neural machine translation by jointly learning to align and translate, arXiv preprint arXiv:1409.0473 (2014).
  • [117]A.Vaswani, N.Shazeer, N.Parmar, J.Uszkoreit, L.Jones, A.N. Gomez, Ł.Kaiser, I.Polosukhin, Attention is all you need, Advances in neural information processing systems 30 (2017).
  • [118]S.-i. Amari, Backpropagation and stochastic gradient descent method, Neurocomputing 5(4-5) (1993) 185–196.
  • [119]S.Ruder, An overview of gradient descent optimization algorithms, arXiv preprint arXiv:1609.04747 (2016).
  • [120]J.Duchi, E.Hazan, Y.Singer, Adaptive subgradient methods for online learning and stochastic optimization, Journal of machine learning research 12(7) (2011).
  • [121]M.D. Zeiler, Adadelta: an adaptive learning rate method, arXiv preprint arXiv:1212.5701 (2012).
  • [122]G.Hinton, N.Srivastava, K.Swersky, Neural networks for machine learning lecture 6a overview of mini-batch gradient descent, Toronto University (2012).
  • [123]V.Buterin, et al., A next-generation smart contract and decentralized application platform, white paper 3(37) (2014) 2–1.
  • [124]C.Percival, S.Josefsson, The scrypt password-based key derivation function, Tech. rep. (2016).
  • [125]Cryptocurrency historical prices, last accessed 13 February 2024 (2021).
    URL https://www.kaggle.com/datasets/sudalairajkumar/cryptocurrencypricehistory/data
  • [126]F.Takens, Detecting strange attractors in turbulence, in: Dynamical Systems and Turbulence, Warwick 1980: proceedings of a symposium held at the University of Warwick 1979/80, Springer, 2006, pp. 366–381.
  • [127]A.Gnauck, Interpolation and approximation of water quality time series and process identification, Analytical and bioanalytical chemistry 380 (2004) 484–492.
  • [128]A.Paszke, S.Gross, F.Massa, A.Lerer, J.Bradbury, G.Chanan, T.Killeen, Z.Lin, N.Gimelshein, L.Antiga, et al., Pytorch: An imperative style, high-performance deep learning library, Advances in neural information processing systems 32 (2019).
  • [129]P.E. Mandaci, E.C. Cagli, Herding intensity and volatility in cryptocurrency markets during the COVID-19, Finance Research Letters 46 (2022) 102382.
  • [130]M.A. Naeem, E.Bouri, Z.Peng, S.J.H. Shahzad, X.V. Vo, Asymmetric efficiency of cryptocurrencies during COVID19, Physica A: Statistical Mechanics and its Applications 565 (2021) 125562.
  • [131]A.Tanwar, V.Kumar, Prediction of cryptocurrency prices using transformers and long short term neural networks, in: 2022 International Conference on Intelligent Controller and Computing for Smart Power (ICICCSP), IEEE, 2022, pp. 1–4.
  • [132]D.Wang, W.-Z. Lu, Forecasting of ozone level in time series using mlp model with a novel hybrid training algorithm, Atmospheric Environment 40(5) (2006) 913–924.
  • [133]D.Cucinotta, M.Vanelli, WHO declares COVID-19 a pandemic, Acta bio medica: Atenei parmensis 91(1) (2020) 157.
  • [134]K.G. Andersen, A.Rambaut, W.I. Lipkin, E.C. Holmes, R.F. Garry, The proximal origin of sars-cov-2, Nature medicine 26(4) (2020) 450–452.
  • [135]A.Brodeur, D.Gray, A.Islam, S.Bhuiyan, A literature review of the economics of covid-19, Journal of economic surveys 35(4) (2021) 1007–1044.
  • [136]M.Leach, H.MacGregor, I.Scoones, A.Wilkinson, Post-pandemic transformations: How and why COVID-19 requires us to rethink development, World development 138 (2021) 105233.
  • [137]G.Miao, Z.Chen, H.Cao, W.Wu, X.Chu, H.Liu, L.Zhang, H.Zhu, H.Cai, X.Lu, et al., From immunogen to COVID-19 vaccines: Prospects for the post-pandemic era, Biomedicine & Pharmacotherapy 158 (2023) 114208.
  • [138]D.Łaskawiec, M.Grajek, P.Szlacheta, I.Korzonek-Szlacheta, Post-pandemic stress disorder as an effect of the epidemiological situation related to the COVID-19 pandemic, in: Healthcare, Vol.10, 2022, p. 975.
  • [139]Y.Jiang, H.Nie, W.Ruan, Time-varying long-term memory in bitcoin market, Finance Research Letters 25 (2018) 280–284.
  • [140]Y.Wang, M.Long, J.Wang, Z.Gao, P.S. Yu, Predrnn: Recurrent neural networks for predictive learning using spatiotemporal lstms, Advances in neural information processing systems 30 (2017).