first article finished

rageSpin · Jun 18, 2024 · a14f92b
1 parent 481d17b
commit a14f92b
Showing 18 changed files with 250 additions and 54 deletions.
diff --git a/content/posts/finance/stock_prediction/GRU/images/test_results.png b/content/posts/finance/stock_prediction/GRU/images/test_results.png
diff --git a/content/posts/finance/stock_prediction/GRU/index.md b/content/posts/finance/stock_prediction/GRU/index.md
@@ -12,10 +12,9 @@ hero: images/stock-market-prediction-using-data-mining-techniques.jpg
 tags: ["Finance", "Deep Learning", "Forecasting"]
 categories: ["Finance"]
 ---
-
 ## Introduction
 
-In this article, we will explore time series data extracted from the **stock market**, focusing on prominent technology companies such as Apple, Amazon, Google, and Microsoft. Our objective is to equip data analysts and scientists with the essential skills to effectively manipulate and interpret stock market data. 
+In this article, we will explore time series data extracted from the **stock market**, focusing on prominent technology companies such as Apple, Amazon, Google, and Microsoft. Our objective is to equip data analysts and scientists with the essential skills to effectively manipulate and interpret stock market data.
 
 To achieve this, we will utilize the *yfinance* library to fetch stock information and leverage visualization tools such as Seaborn and Matplotlib to illustrate various facets of the data. Specifically, we will explore methods to analyze stock risk based on historical performance, and implement predictive modeling using **GRU/LSTM** models.
 
@@ -27,16 +26,20 @@ Throughout this tutorial, we aim to address the following key questions:
 4. What is the **correlation** between different stocks?
 5. How can we forecast future stock behavior, exemplified by predicting the closing price of Apple Inc. using LSTM or GRU?
 
-*** 
+---
 
 ## Getting Data
+
 The initial step involves **acquiring and loading** the data into memory. Our source of stock data is the **Yahoo Finance** website, renowned for its wealth of financial market data and investment tools. To access this data, we'll employ the **yfinance** library, known for its efficient and Pythonic approach to downloading market data from Yahoo. For further insights into yfinance, refer to the article titled [Reliably download historical market data from Yahoo! Finance with Python](https://aroussi.com/post/python-yahoo-finance).
 
 ### Install Dependencies
+
 ```bash
 pip install -qU yfinance seaborn
 ```
+
 ### Configuration Code
+
 ```python
 import pandas as pd
 import numpy as np
@@ -62,13 +65,16 @@ data = yf.download("MSFT", start, end)
 ```
 
 ## Statistical Analysis on the price
+
 ### Summary
+
 ```python
 # Summary Stats
 data.describe()
 ```
 
 ### Closing Price
+
 The closing price is the last price at which the stock is traded during the regular trading day. A stock’s closing price is the standard benchmark used by investors to track its performance over time.
 
 ```python
@@ -80,8 +86,11 @@ plt.title('Stock Price History')
 plt.legend()
 plt.show()
 ```
+
 ### Volume of Sales
+
 Volume is the amount of an asset or security that _changes hands over some period of time_, often over the course of a day. For instance, the stock trading volume refers to the number of shares of a security traded between its daily open and close. Trading volume, and changes to volume over the course of time, are important inputs for technical traders.
+
 ```python
 plt.figure(figsize=(14, 5))
 plt.plot(data['Volume'], label='Volume')
@@ -92,8 +101,8 @@ plt.show()
 ```
 
 ### Moving Average
-The moving average (MA) is a simple **technical analysis** tool that smooths out price data by creating a constantly updated average price. The average is taken over a specific period of time, like 10 days, 20 minutes, 30 weeks, or any time period the trader chooses.
 
+The moving average (MA) is a simple **technical analysis** tool that smooths out price data by creating a constantly updated average price. The average is taken over a specific period of time, like 10 days, 20 minutes, 30 weeks, or any time period the trader chooses.
 
 ```python
 ma_day = [10, 20, 50]
@@ -112,7 +121,9 @@ plt.show()
 ```
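As a self-contained illustration of the moving-average idea described above, here is a minimal sketch that runs without downloading anything. The synthetic price series and the column names `MA for N days` are assumptions made for the example; only `ma_day = [10, 20, 50]` comes from the article.

```python
import numpy as np
import pandas as pd

# Synthetic prices 1..60 stand in for the downloaded MSFT data (assumption).
prices = pd.DataFrame(
    {"Adj Close": np.arange(1.0, 61.0)},
    index=pd.date_range("2024-01-01", periods=60, freq="B"),
)

# One moving-average column per window length, mirroring the ma_day list.
ma_day = [10, 20, 50]
for ma in ma_day:
    prices[f"MA for {ma} days"] = prices["Adj Close"].rolling(window=ma).mean()

# The first (window - 1) rows of each column are NaN because the
# rolling window is not yet full.
print(prices.tail(3))
```

Longer windows give a smoother curve but react more slowly to recent price changes, which is why several window lengths are usually plotted together.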
 
 ## Statistical Analysis on the returns
+
 Now that we've done some baseline analysis, let's go ahead and dive a little deeper. We're now going to analyze the risk of the stock. In order to do so we'll need to take a closer look at the daily changes of the stock, and not just its absolute value. Let's go ahead and use pandas to retrieve the daily returns for the **Microsoft** stock.
+
 ```python
 # Compute the daily return as a fractional change (0.01 means 1%)
 data['Daily Return'] = data['Adj Close'].pct_change()
@@ -131,7 +142,9 @@ plt.title('MSFT Daily Return')
 plt.show()
 
 ```
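To make the daily-return computation concrete, the sketch below applies `pct_change()` to a tiny hand-made series and checks it against the explicit formula `P_t / P_{t-1} - 1`. The four prices are hypothetical values chosen so the result is easy to verify by hand.

```python
import pandas as pd

# Hypothetical adjusted closing prices for four consecutive trading days.
adj_close = pd.Series([100.0, 102.0, 99.96, 99.96])

# pct_change() computes P_t / P_{t-1} - 1 for each row;
# the first row has no predecessor and is therefore NaN.
daily_return = adj_close.pct_change()

# The same quantity written out explicitly with shift().
manual_return = adj_close / adj_close.shift(1) - 1

print(daily_return)
```

Note that the values are fractions rather than percentages: a result of `0.02` corresponds to a 2% gain.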
+
 ## Data Preparation
+
 ```python
 # Create a new dataframe with only the 'Adj Close' column
 X = data.filter(['Adj Close'])
@@ -147,7 +160,9 @@ scaled_data = scaler.fit_transform(X)
 
 scaled_data
 ```
+
 Split the training data into fixed-length windows to feed into the LSTM and GRU models.
+
 ```python
 # Create the training data set 
 # Create the scaled training data set
@@ -162,7 +177,7 @@ for i in range(seq_length, len(train_data)):
  if i<= seq_length+1:
  print(x_train)
  print(y_train, end="\n\n")
-  
+
 # Convert the x_train and y_train to numpy arrays 
 x_train, y_train = np.array(x_train), np.array(y_train)
 
@@ -171,7 +186,9 @@ x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))
 ```
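The windowing step above can be sketched in isolation as follows. The helper name `make_windows` and the toy series are hypothetical; the idea is that each sample holds the previous `seq_length` values and the target is the value that follows them.

```python
import numpy as np

def make_windows(series, seq_length):
    """Split a 1-D series into (samples, seq_length, 1) inputs and next-step targets."""
    x, y = [], []
    for i in range(seq_length, len(series)):
        x.append(series[i - seq_length:i])  # the previous seq_length values
        y.append(series[i])                 # the value to predict
    x, y = np.array(x), np.array(y)
    # Keras recurrent layers expect input shaped (samples, timesteps, features).
    return x.reshape(x.shape[0], x.shape[1], 1), y

# Toy series 0..9 standing in for the scaled training data (assumption).
x_train, y_train = make_windows(np.arange(10, dtype=float), seq_length=3)
print(x_train.shape, y_train.shape)  # (7, 3, 1) (7,)
```

With 10 points and a window of 3, the first sample is `[0, 1, 2]` with target `3`, and 7 samples are produced in total.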
 
 ## GRU
+
 This part uses a Gated Recurrent Unit (GRU) model.
+
 ```python
 from tensorflow.keras.models import Sequential
 from tensorflow.keras.layers import GRU, Dense, Dropout
@@ -184,11 +201,13 @@ lstm_model.add(Dropout(0.2))
 lstm_model.add(Dense(units=1))
 
 lstm_model.compile(optimizer='adam', loss='mean_squared_error')
-lstm_model.fit(x_train, y_train, epochs=10, batch_size=4)
+lstm_model.fit(x_train, y_train, epochs=10, batch_size=8)
 ```
 
 ## LSTM
+
 This part uses a Long Short-Term Memory (LSTM) model.
+
 ```python
 from tensorflow.keras.layers import LSTM
 
@@ -200,13 +219,89 @@ lstm_model.add(Dropout(0.2))
 lstm_model.add(Dense(units=1))
 
 lstm_model.compile(optimizer='adam', loss='mean_squared_error')
-lstm_model.fit(x_train, y_train, epochs=10, batch_size=4)
+lstm_model.fit(x_train, y_train, epochs=10, batch_size=8)
 ```
 
-
 ## Testing Metrics
-* mean squared error
+
+* root mean squared error (RMSE)
+
+```python
+
+# Create the testing data set
+# Create a new array containing scaled values from index 1543 to 2002 
+test_data = scaled_data[training_data_len - 60: , :]
+# Create the data sets x_test and y_test
+x_test = []
+y_test = dataset[training_data_len:, :]
+for i in range(60, len(test_data)):
+ x_test.append(test_data[i-60:i, 0])
+
+# Convert the data to a numpy array
+x_test = np.array(x_test)
+
+# Reshape the data
+x_test = np.reshape(x_test, (x_test.shape[0], x_test.shape[1], 1 ))
+
+# Get the models predicted price values 
+predictions_gru = gru_model.predict(x_test)
+predictions_gru = scaler.inverse_transform(predictions_gru)
+predictions_lstm = lstm_model.predict(x_test)
+predictions_lstm = scaler.inverse_transform(predictions_lstm)
+
+# Get the root mean squared error (RMSE)
+rmse_lstm = np.sqrt(np.mean(((predictions_lstm - y_test) ** 2)))
+rmse_gru = np.sqrt(np.mean(((predictions_gru - y_test) ** 2)))
+print(f"LSTM RMSE: {rmse_lstm:.4f}, GRU RMSE: {rmse_gru:.4f}")
+```
+
+> LSTM RMSE: 4.2341, GRU RMSE: 3.3575
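For reference, here is a minimal worked example of the RMSE formula on hand-checkable numbers. The helper `rmse` and the three price pairs are hypothetical. The shape check is an added safeguard, not part of the article's code: subtracting an `(n, 1)` array from an `(n,)` array would silently broadcast to `(n, n)` and give a misleading score.

```python
import numpy as np

def rmse(predictions, targets):
    """Root mean squared error between two arrays of identical shape."""
    predictions = np.asarray(predictions, dtype=float)
    targets = np.asarray(targets, dtype=float)
    if predictions.shape != targets.shape:
        # Guard against accidental NumPy broadcasting, e.g. (n, 1) vs (n,).
        raise ValueError(f"shape mismatch: {predictions.shape} vs {targets.shape}")
    return float(np.sqrt(np.mean((predictions - targets) ** 2)))

# Hypothetical actual and predicted prices, shaped (n, 1) like the model output.
y_true = np.array([[100.0], [102.0], [101.0]])
y_pred = np.array([[101.0], [100.0], [103.0]])

# Errors are 1, -2, 2, so RMSE = sqrt((1 + 4 + 4) / 3) = sqrt(3).
print(rmse(y_pred, y_true))
```

Because RMSE is computed after `inverse_transform`, it is expressed in the same unit as the price (dollars), which makes the two models directly comparable.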
 
 ### Test Plot
 
+{{< img src="/posts/finance/stock_prediction/GRU/images/test_results.png" align="center" title="Results">}}
+The GRU-based model shows slightly better results, both graphically and in terms of RMSE. However, this does not tell us anything about the actual profitability of these models.
+
 ## Possible trading performance
+
+The strategy implementation is:
+
+* BUY: if prediction > actual_price
+* SELL: if prediction < actual_price
+
+A position is closed at the next candle's _close_. However, the LSTM and GRU predictions show an offset relative to the actual price, which prevents this strategy from being applied properly.
+
+Hence, the **returns** of the predictions are used instead.
+
+```python
+# Assume a trading capital of $10,000
+trading_capital = 10000
+pred_gru_df = pd.DataFrame(predictions_gru, columns=['Price'])
+pred_test_df = pd.DataFrame(y_test, columns=['Price'])
+pred_gru_df['returns'] = pred_gru_df.pct_change(-1)
+pred_test_df['returns'] = pred_test_df.pct_change(-1)
+
+# Compute Wins
+wins = ((pred_gru_df.dropna().returns<0) & (pred_test_df.dropna().returns<0)) | ((pred_gru_df.dropna().returns>0) & (pred_test_df.dropna().returns>0))
+print(wins.value_counts())
+
+returns_df = pd.concat([pred_gru_df.returns, pred_test_df.returns], axis=1).dropna()
+total_pos_return = pred_test_df.dropna().returns[wins].abs().sum()
+total_neg_return = pred_test_df.dropna().returns[np.logical_not(wins)].abs().sum()
+
+# compute final capital and compare with BUY&HOLD strategy
+final_capital = trading_capital*(1+total_pos_return-total_neg_return)
+benchmark_return = (valid.Close.iloc[-1] - valid.Close.iloc[0])/valid.Close.iloc[0]
+bench_capital = trading_capital*(1+benchmark_return)
+print(final_capital, bench_capital)
+```
+
+> returns
+> True 81
+> False 72
+> Name: count, dtype: int64
+> 10535.325897548326 9617.616876598737
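The accounting used above can be reproduced on a handful of numbers. This is a simplified sketch with hypothetical predicted and actual prices: a trade counts as a win when the predicted and realized returns share the same sign, and returns are summed additively rather than compounded, as in the article's code.

```python
import pandas as pd

# Hypothetical predicted and actual prices; the real series come from the models.
predicted = pd.Series([100.0, 101.0, 103.0, 102.0, 104.0])
actual = pd.Series([100.0, 102.0, 101.0, 100.0, 103.0])

# As in the article, pct_change(-1) compares each value with the NEXT one,
# so the last row has no successor and is dropped.
pred_ret = predicted.pct_change(-1).dropna()
true_ret = actual.pct_change(-1).dropna()

# A "win" means the predicted move and the realized move have the same sign.
wins = (pred_ret * true_ret) > 0

# Additive (non-compounded) accounting: gain |r| on a win, lose |r| otherwise.
net_return = true_ret[wins].abs().sum() - true_ret[~wins].abs().sum()
final_capital = 10_000 * (1 + net_return)

print(wins.value_counts())
print(final_capital)
```

In this toy case three of the four moves are predicted in the right direction. The sketch ignores transaction costs and slippage, so the resulting capital is an optimistic upper bound on what the strategy could achieve.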
+
+## Conclusion
+As shown in the previous section, these two simple deep learning models exhibit promising results in terms of both regression and trading metrics.
+The latter is particularly important: a return of about **5%** is obtained while the stock price decreased by approximately 4%. This also leads to very high Sharpe and Calmar ratios.
diff --git a/public/categories/finance/index.html b/public/categories/finance/index.html
@@ -296,14 +296,14 @@
  <div class="card">
  <div class="card-head">
  <a href="/posts/finance/stock_prediction/gru/" class="post-card-link">
- <img class="card-img-top" src='/images/default-hero.jpg' alt="Hero Image">
+ <img class="card-img-top" src='/posts/finance/stock_prediction/gru/images/stock-market-prediction-using-data-mining-techniques.jpg' alt="Hero Image">
  </a>
  </div>
  <div class="card-body">
  <a href="/posts/finance/stock_prediction/gru/" class="post-card-link">
- <h5 class="card-title">Microsoft Stock Prediction using LSTM or GRU</h5>
- <p class="card-text post-summary">Pick a stock commodity &rsquo; &hellip; '
-Statistical Analysis on the price Statistical Analysis on the returns GRU Model Init Training Testing Metrics mean squared error LSTM Comparison Possible trading performance </p>
+ <h5 class="card-title">MSFT Stock Prediction using LSTM or GRU</h5>
+ <p class="card-text post-summary">Introduction In this article, we will explore time series data extracted from the stock market, focusing on prominent technology companies such as Apple, Amazon, Google, and Microsoft. Our objective is to equip data analysts and scientists with the essential skills to effectively manipulate and interpret stock market data.
+To achieve this, we will utilize the yfinance library to fetch stock information and leverage visualization tools such as Seaborn and Matplotlib to illustrate various facets of the data.</p>
  </a>
 
  <div class="tags">
@@ -326,7 +326,7 @@ <h5 class="card-title">Microsoft Stock Prediction using LSTM or GRU</h5>
  <div class="card-footer">
  <span class="float-start">
  Sunday, June 16, 2024
- | 1 minute </span>
+ | 6 minutes </span>
  <a
  href="/posts/finance/stock_prediction/gru/"
  class="float-end btn btn-outline-info btn-sm">Read</a>

diff --git a/public/categories/finance/index.xml b/public/categories/finance/index.xml
@@ -7,13 +7,13 @@
  <generator>Hugo -- gohugo.io</generator>
  <language>en</language>
  <lastBuildDate>Sun, 16 Jun 2024 00:00:00 +0100</lastBuildDate><atom:link href="http:https://localhost:1313/categories/finance/index.xml" rel="self" type="application/rss+xml" /><item>
- <title>Microsoft Stock Prediction using LSTM or GRU</title>
+ <title>MSFT Stock Prediction using LSTM or GRU</title>
  <link>http:https://localhost:1313/posts/finance/stock_prediction/gru/</link>
  <pubDate>Sun, 16 Jun 2024 00:00:00 +0100</pubDate>
 
  <guid>http:https://localhost:1313/posts/finance/stock_prediction/gru/</guid>
- <description>Pick a stock commodity &amp;rsquo; &amp;hellip; &#39;
-Statistical Analysis on the price Statistical Analysis on the returns GRU Model Init Training Testing Metrics mean squared error LSTM Comparison Possible trading performance </description>
+ <description>Introduction In this article, we will explore time series data extracted from the stock market, focusing on prominent technology companies such as Apple, Amazon, Google, and Microsoft. Our objective is to equip data analysts and scientists with the essential skills to effectively manipulate and interpret stock market data.
+To achieve this, we will utilize the yfinance library to fetch stock information and leverage visualization tools such as Seaborn and Matplotlib to illustrate various facets of the data.</description>
  </item>