Erez Katz, CEO and Co-founder of Lucena Research
In the past decade, many of the most successful applications of deep neural networks (deep learning) have centered on image processing. Handwriting recognition and computer vision, and to a degree speech recognition and natural language processing, rely on some representation of the input, often a visual one, that the network attempts to categorize and generalize. In other words, the network learns to map a shape (x) to its corresponding outcome (y).
By identifying the unique properties of a visual representation, the network can classify close matches in new, unseen data, with the expectation that the outcome for such data will be similar to the outcomes it saw during training.
How To Use Deep Neural Networks To Forecast Stock Prices
Take a handwritten image of the number eight. Our brains are able to recognize 8 with ease. However, in order for a machine to recognize images and ultimately exceed the accuracy and speed of humans, it needs to deconstruct the image to its basic form.
Below is an example of a 28*28 pixel image of an 8. On the right side of the image you can see the input layer of a neural network that assigns a neuron for every pixel’s value, 784 input neurons in total (28*28).
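The mapping from pixels to input neurons can be sketched in a few lines. This is a minimal illustration, assuming a grayscale image stored as a list of rows with values from 0 to 255; the vertical stroke drawn here and the helper name `to_input_layer` are hypothetical stand-ins, not taken from the example image.

```python
# Hypothetical 28x28 grayscale image: 0 = background, 255 = ink.
ROWS, COLS = 28, 28
image = [[0] * COLS for _ in range(ROWS)]

# Draw a crude vertical stroke so the image is not empty (illustrative only).
for r in range(6, 22):
    image[r][14] = 255

def to_input_layer(img):
    """Flatten a 2-D image row by row and scale each pixel to the range [0, 1]."""
    return [pixel / 255.0 for row in img for pixel in row]

inputs = to_input_layer(image)
print(len(inputs))  # one input neuron per pixel: 28 * 28 = 784
```

Note that flattening discards the two-dimensional layout: the network only sees a list of 784 numbers, which is part of why shifted or differently written digits are hard for a plain fully connected network.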
If every 8 were written exactly as depicted, we wouldn’t need a sophisticated network to interpret it. But recognizing images becomes more complex because there are many ways to write 8. Moreover, the digit can appear in various regions of the 28*28 pixel space and is not always perfectly centered, as depicted below.
Convolutional Neural Networks
One successful deep learning architecture used for image recognition is the convolutional neural network (CNN). CNNs achieve a degree of translation invariance by applying the same small filters across every region of an image; the region a filter sees at one position is called its receptive field. More specifically, CNNs treat an image as a spatial arrangement of local features rather than a flat list of pixels.
For our number 8 example, rather than looking at the image pixel by pixel, CNNs measure groups of pixels together, no matter where they appear in the image space.
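To illustrate the idea, here is a minimal sketch of a single hand-written filter sliding over an image. The 3x3 ring pattern, the kernel weights, and the helper names are all hypothetical; a real CNN learns its kernels during training and stacks many of them.

```python
def convolve2d(img, kernel):
    """Valid (no padding) 2-D cross-correlation, the operation CNN layers compute."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(img) - kh + 1
    out_w = len(img[0]) - kw + 1
    return [[sum(img[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def place_pattern(size, pattern, top, left):
    """Return a blank size x size image with `pattern` pasted at (top, left)."""
    img = [[0] * size for _ in range(size)]
    for i, row in enumerate(pattern):
        for j, value in enumerate(row):
            img[top + i][left + j] = value
    return img

def strongest_response(feature_map):
    """Return (max activation, (row, col)) for the first maximum in the map."""
    best = max(max(row) for row in feature_map)
    for r, row in enumerate(feature_map):
        for c, value in enumerate(row):
            if value == best:
                return best, (r, c)

# Hypothetical 3x3 "ring" shape and a kernel hand-tuned to respond to it.
ring = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]
kernel = [[1, 1, 1], [1, -1, 1], [1, 1, 1]]

img_a = place_pattern(28, ring, top=2, left=3)
img_b = place_pattern(28, ring, top=20, left=15)

print(strongest_response(convolve2d(img_a, kernel)))
print(strongest_response(convolve2d(img_b, kernel)))
```

The peak activation is the same for both images; only its position in the feature map moves with the shape. Taking the maximum over the whole map (global max pooling) then gives an output that does not depend on where the shape was drawn.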
The notion of spatial reference bodes well for the timeseries data we collect at Lucena. We hold approximately 850 timeseries features that represent daily states of various securities over time (equities, FRX, futures, cryptocurrencies, etc.).
For example, we collect the social media sentiment score for each stock in the Russell 1K, daily. In practice, the social media sentiment score for a given stock is nothing more than a timeseries representation that can be graphed over time, just like a price of a stock.
With such a graphical representation we can provide a richer spatial training reference to a convolutional neural network learner than a single point-in-time value would. Intuitively, the formation of a trend over time carries more information than one value at one point in time.
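One simple way to turn a window of a timeseries into an image-like input is to rasterize it into a small binary grid, much like plotting it on a chart. The sketch below is an assumption about how such a transformation could look, not a description of Lucena's pipeline; the sentiment values, the grid height, and the function name `series_to_image` are hypothetical.

```python
def series_to_image(window, height=16):
    """Rasterize a 1-D window into a height x len(window) binary grid.

    Values are min-max normalized so the chart always fills the vertical range,
    which makes windows at different price or score levels comparable.
    """
    lo, hi = min(window), max(window)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat window
    grid = [[0] * len(window) for _ in range(height)]
    for col, value in enumerate(window):
        level = round((value - lo) / span * (height - 1))
        grid[height - 1 - level][col] = 1  # row 0 is the top of the chart
    return grid

# Hypothetical daily sentiment scores for one stock over eight days.
sentiment = [0.1, 0.2, 0.4, 0.3, 0.6, 0.9, 0.8, 1.0]
chart = series_to_image(sentiment)

for row in chart:
    print("".join("#" if cell else "." for cell in row))
```

Each column holds exactly one marked cell, so the grid encodes the shape of the trend rather than its absolute level. A grid like this can be fed to a convolutional layer in the same way as a handwritten digit.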
Applying The Image Recognition Concept To Equity Price Forecasting
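Let me start by making a bold statement: neural networks can approximate almost any function! More precisely, the universal approximation theorem says that a network with enough hidden neurons can approximate any continuous function on a bounded range as closely as desired. No matter how complicated and wiggly the graph is, a network that fits it exists, although the theorem does not promise that training will find it. For a deeper dive into neural networks computing different functions read more here.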
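A small constructive example shows why this is plausible. The sketch below builds, by hand rather than by training, a network with one hidden layer of ReLU neurons that traces a piecewise-linear approximation of a sine wave. The function names and the choice of ReLU are assumptions made for illustration; the point is only that adding hidden neurons shrinks the approximation error.

```python
import math

def relu(z):
    return max(0.0, z)

def build_relu_net(f, lo, hi, hidden_units):
    """Hand-build a one-hidden-layer ReLU net that linearly interpolates f on [lo, hi].

    Each hidden neuron switches on at one knot and changes the slope of the output,
    so the weights are simply the slope changes between neighboring segments.
    """
    step = (hi - lo) / hidden_units
    knots = [lo + k * step for k in range(hidden_units)]
    slopes = [(f(k + step) - f(k)) / step for k in knots]
    weights = [slopes[0]] + [slopes[i] - slopes[i - 1] for i in range(1, hidden_units)]
    bias = f(lo)

    def net(x):
        return bias + sum(w * relu(x - k) for w, k in zip(weights, knots))

    return net

def max_error(net, f, lo, hi, samples=1000):
    """Largest absolute gap between net and f on an evenly spaced grid."""
    xs = [lo + (hi - lo) * i / samples for i in range(samples + 1)]
    return max(abs(net(x) - f(x)) for x in xs)

LO, HI = 0.0, 2 * math.pi
err_10 = max_error(build_relu_net(math.sin, LO, HI, 10), math.sin, LO, HI)
err_100 = max_error(build_relu_net(math.sin, LO, HI, 100), math.sin, LO, HI)
print(err_10, err_100)
```

Going from 10 to 100 hidden neurons cuts the worst-case error by roughly two orders of magnitude. In practice the weights are found by training rather than construction, and nothing guarantees that training reaches a fit this good.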
Now, imagine you have a stock price timeseries that is a representation of a sine wave (let’s assume its symbol is XYZ, for reference). We want to test and see if we can derive a compelling forecasting model using neural nets that can accurately predict XYZ’s 21-day price return.
Chad Landis, one of our rising quant stars, constructed a simple neural net model with no hidden layers and trained it for 5,000 epochs (an epoch is a single pass of the training data through the neural network, repeated so that the network can adjust its weights and biases to get as close as possible to the desired label output).
Chad used simple logistic regression (similar to a neural network with no hidden layers) with a rolling 21-day mean value. After training on the first 5,000 epochs, Chad ran the model against the validation period of the subsequent 5,000 epochs and easily achieved a validation accuracy of 99.86%, with precision in excess of 99.96%.
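The experiment can be approximated with a short, self-contained sketch. This is not Chad's code, and several details are assumptions because the post does not give them: the 252-day cycle length, the 2,000-day history, the chronological 50/50 split, and the second input feature. The level of a rolling mean alone does not say whether a sine wave is rising or falling, so this sketch also feeds the one-day change in the rolling mean.

```python
import math

PERIOD = 252    # assumed cycle length in trading days (not given in the post)
WINDOW = 21     # rolling-mean window, as in the post
HORIZON = 21    # forecast horizon, as in the post
N_DAYS = 2000   # assumed length of the synthetic price history

prices = [100.0 + 10.0 * math.sin(2 * math.pi * t / PERIOD) for t in range(N_DAYS)]

def rolling_mean(series, end, window):
    """Mean of the `window` values ending at index `end` (inclusive)."""
    return sum(series[end - window + 1:end + 1]) / window

features, labels = [], []
for t in range(WINDOW, N_DAYS - HORIZON):
    mean_now = rolling_mean(prices, t, WINDOW)
    mean_prev = rolling_mean(prices, t - 1, WINDOW)
    features.append([mean_now, mean_now - mean_prev])  # level and 1-day change
    labels.append(1 if prices[t + HORIZON] > prices[t] else 0)  # 21-day return > 0

split = len(features) // 2  # first half trains, second half validates

def column_stats(rows, j):
    col = [row[j] for row in rows]
    mu = sum(col) / len(col)
    sd = math.sqrt(sum((v - mu) ** 2 for v in col) / len(col))
    return mu, sd

# Standardize with training-period statistics only, to avoid look-ahead.
stats = [column_stats(features[:split], j) for j in range(2)]
scaled = [[(row[j] - stats[j][0]) / stats[j][1] for j in range(2)] for row in features]
x_train, y_train = scaled[:split], labels[:split]
x_val, y_val = scaled[split:], labels[split:]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b)

def train_logistic(x, y, lr=0.5, steps=300):
    """Logistic regression (no hidden layers) fitted by batch gradient descent."""
    w, b, n = [0.0, 0.0], 0.0, len(x)
    for _ in range(steps):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for xi, yi in zip(x, y):
            err = predict_proba(w, b, xi) - yi
            grad_w[0] += err * xi[0]
            grad_w[1] += err * xi[1]
            grad_b += err
        w = [w[0] - lr * grad_w[0] / n, w[1] - lr * grad_w[1] / n]
        b -= lr * grad_b / n
    return w, b

w, b = train_logistic(x_train, y_train)
preds = [1 if predict_proba(w, b, xi) >= 0.5 else 0 for xi in x_val]

true_pos = sum(1 for p, t in zip(preds, y_val) if p == 1 and t == 1)
false_pos = sum(1 for p, t in zip(preds, y_val) if p == 1 and t == 0)
val_accuracy = sum(1 for p, t in zip(preds, y_val) if p == t) / len(y_val)
val_precision = true_pos / (true_pos + false_pos)
print(val_accuracy, val_precision)
```

Accuracy here is the share of validation days where the predicted direction matches the realized one, and precision is the share of predicted up-moves that were actually up. A noise-free sine wave is about as easy as forecasting gets, so high scores on it show that the model can exploit structure when it exists, not that real prices contain such structure.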
To put this into context, in theory, if we were to trade XYZ by inputting into the model its rolling 21-day price mean, XYZ’s performance could have looked like the chart below:
A few important takeaways about timeseries data and deep neural networks:
- Timeseries data can be used effectively by convolutional neural networks when transformed into visual representations.
- If there is information in timeseries data, it can be exploited by converting the data into a normalized graphical representation. Furthermore, with a sound CNN model, such data may ultimately provide enough of an edge to be profitable.
- Since neural nets can approximate any continuous function, if there is information in the data that can be represented by f(x) = y, no matter how complex it is, it could ultimately be exploited and used for effective forecasting by the proper neural network model.
In preliminary tests that passed point-in-time timeseries values to deep neural networks, we achieved validation accuracy of approximately 52%. However, by applying a graphical transformation to the features’ timeseries data, we are now seeing more than 60% validation (out-of-sample) accuracy, which is very exciting for us but still requires a significant amount of additional research and validation. There is more to come as we advance our research further.
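To give a rough sense of what separates 52% from 60%, the sketch below measures how many standard errors each accuracy sits above a 50% coin flip. The validation sample size of 1,000 is hypothetical, and the binomial model assumes independent forecasts, which overlapping multi-day forecasts generally are not, so treat the result as an optimistic back-of-the-envelope check.

```python
import math

def z_score_vs_coin_flip(accuracy, n_samples):
    """Standard errors by which `accuracy` exceeds 50% under a binomial model."""
    standard_error = math.sqrt(0.5 * 0.5 / n_samples)
    return (accuracy - 0.5) / standard_error

N_VALIDATION = 1000  # hypothetical number of out-of-sample forecasts

for accuracy in (0.52, 0.60):
    print(accuracy, round(z_score_vs_coin_flip(accuracy, N_VALIDATION), 2))
```

Under these assumptions, 52% is a little over one standard error above chance, while 60% is more than six, which is why the jump is notable even before further validation.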
Want more about CNNs and stock forecasting using deep learning? CEO Erez Katz discusses “How to Make Neural Nets Work for Forecasting Stock Prices.”