Python's ecosystem of libraries and tools makes it well suited to implementing statistical arbitrage (stat arb) strategies. Traditional arbitrage focuses on immediate price gaps, whereas statistical arbitrage predicts and capitalizes on price adjustments that are expected to play out over a longer period. This is where supervised Machine Learning (ML) comes into play [2]. The idea is to train models that extract information about market inefficiencies from the pricing discrepancies between cointegrated assets.

The objective of this post is to improve the ROI of statistical arbitrage strategies using scikit-learn ML classifiers. In what follows, we'll consider the BBY and AAL close prices. According to the Macroaxis Correlation Matchups, these two securities move together with a correlation of +0.89.

Let’s get started!

import yfinance as yf
import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import coint

from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Download stock data 
tickers = ['AAL', 'BBY']
data = yf.download(tickers, start='2020-01-01')['Close']

# Preview the data
data.tail()

Ticker        AAL        BBY
Date
2025-05-19  11.86  71.599998
2025-05-20  11.65  71.150002
2025-05-21  11.24  70.150002
2025-05-22  11.40  70.760002
2025-05-23  11.19  69.919998

Plotting the stock close prices 2020–2025

plt.figure(figsize=(12, 6))
plt.plot(data['AAL'], label='AAL')
plt.plot(data['BBY'], label='BBY')
plt.title('Historical Stock Prices of AAL and BBY')
plt.xlabel('Date')
plt.ylabel('Price')
plt.legend()
plt.grid()
plt.show()

Historical Stock Prices of AAL and BBY

Performing the AAL-BBY cointegration test

score, p_value, _ = coint(data['AAL'], data['BBY'])

print(f'Cointegration test p-value: {p_value}')

# If p-value is low (<0.05), the pairs are cointegrated
if p_value < 0.05:
    print("The pairs are cointegrated.")
else:
    print("The pairs are not cointegrated.")

Cointegration test p-value: 0.009532269137951204
The pairs are cointegrated.

Calculating and plotting the spread between the two stocks

data['Spread'] = data['AAL'] - data['BBY']

# Plot the spread
plt.figure(figsize=(10, 6))
plt.plot(data.index, data['Spread'], label='Spread (AAL - BBY)')
plt.axhline(data['Spread'].mean(), color='red', linestyle='--', label='Mean')
plt.legend()
plt.xlabel('Date')
plt.ylabel('Spread')
plt.title('Spread between AAL and BBY')
plt.grid()
plt.show()

Spread between AAL and BBY
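The spread above is a raw price difference, which weights one share of AAL against one share of BBY even though the two trade at very different price levels. A common alternative in pairs trading is to scale one leg by a hedge ratio estimated with ordinary least squares. The sketch below illustrates that idea on synthetic prices; the function name `hedged_spread` and the synthetic series are assumptions for illustration, not part of the strategy backtested in this post.

```python
import numpy as np
import pandas as pd


def hedged_spread(y: pd.Series, x: pd.Series):
    """Return (beta, spread), where spread = y - beta * x and beta is the OLS slope of y on x."""
    # np.polyfit with deg=1 returns [slope, intercept]
    beta, _intercept = np.polyfit(x.to_numpy(), y.to_numpy(), deg=1)
    return float(beta), y - beta * x


# Synthetic example (assumed data): y tracks 2 * x plus small noise.
rng = np.random.default_rng(0)
x = pd.Series(50 + np.cumsum(rng.normal(0, 1, 500)))
y = 2 * x + 5 + rng.normal(0, 0.5, 500)

beta, spread = hedged_spread(y, x)
print(f"Estimated hedge ratio: {beta:.3f}")
```

With the article's data this could be called as `hedged_spread(data['BBY'], data['AAL'])`; whether it changes the backtest results is not tested here.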

Defining the Z-score to normalize the spread and setting upper/lower thresholds for entering and exiting trades

# Define z-score to normalize the spread
data['Z-Score'] = (data['Spread'] - data['Spread'].mean()) / data['Spread'].std()

# Set thresholds for entering and exiting trades
upper_threshold = 2
lower_threshold = -2

# Initialize signals
data['Position'] = 0

# Generate signals for long and short positions
data['Position'] = np.where(data['Z-Score'] > upper_threshold, -1, data['Position'])  # Short the spread
data['Position'] = np.where(data['Z-Score'] < lower_threshold, 1, data['Position'])   # Long the spread
data['Position'] = np.where((data['Z-Score'] < 1) & (data['Z-Score'] > -1), 0, data['Position'])  # Exit

# Plot the z-score with the entry thresholds
plt.figure(figsize=(10, 6))
plt.plot(data.index, data['Z-Score'], label='Z-Score')
plt.axhline(upper_threshold, color='red', linestyle='--', label='Upper Threshold')
plt.axhline(lower_threshold, color='green', linestyle='--', label='Lower Threshold')
plt.legend()
plt.title('Z-Score of the Spread with Trade Signals')
plt.xlabel('Date')
plt.ylabel('Z-Score')
plt.grid()
plt.show()

Z-Score of the Spread with Trade Signals
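The `np.where` rules above evaluate each day independently, so a position is only non-zero on days when the z-score is beyond ±2, and the exit band at ±1 never changes anything. If the intent is to enter at ±2 and hold until the z-score returns inside ±1, the position has to carry state from one day to the next. Below is a minimal sketch of that variant, with a hypothetical helper `stateful_positions`; it is an assumption about the intended logic, not the code used for the results in this post.

```python
import pandas as pd


def stateful_positions(z: pd.Series, entry: float = 2.0, exit_: float = 1.0) -> pd.Series:
    """Enter short above +entry, long below -entry, and hold until |z| < exit_."""
    position = 0
    out = []
    for value in z:
        if position == 0:
            if value > entry:
                position = -1   # short the spread
            elif value < -entry:
                position = 1    # long the spread
        elif abs(value) < exit_:
            position = 0        # spread has reverted, close the trade
        out.append(position)
    return pd.Series(out, index=z.index)


z = pd.Series([0.0, 2.5, 1.5, 0.5, -2.5, -1.5, -0.5])
print(stateful_positions(z).tolist())
# [0, -1, -1, 0, 1, 1, 0]
```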

Calculating and plotting the strategy cumulative return (backtesting)

# Calculate daily returns
data['AAL_Return'] = data['AAL'].pct_change()
data['BBY_Return'] = data['BBY'].pct_change()

# Strategy returns: long spread means buying AAL and shorting BBY
data['Strategy_Return'] = data['Position'].shift(1) * (data['AAL_Return'] - data['BBY_Return'])

# Cumulative returns
data['Cumulative_Return'] = (1 + data['Strategy_Return']).cumprod()

# Plot cumulative returns
plt.figure(figsize=(10, 6))
plt.plot(data.index, data['Cumulative_Return'], label='Cumulative Return from Strategy',lw=4)
plt.title('Cumulative Returns of Pairs Trading Strategy')
plt.legend()
plt.xlabel('Date')
plt.ylabel('Cumulative Return')
plt.grid()
plt.show()

Cumulative Returns of Pairs Trading Strategy
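The `shift(1)` in the strategy return matters: the position decided from today's close is applied to tomorrow's return, which avoids trading on information that was not yet available. A toy example with made-up numbers (the values are assumptions chosen for easy arithmetic) shows the mechanics:

```python
import pandas as pd

# Made-up daily return differences (AAL_Return - BBY_Return) and end-of-day positions.
spread_return = pd.Series([0.00, 0.10, -0.05, 0.02])
position = pd.Series([1, 1, 0, -1])

# Yesterday's position earns today's return; the first day has no prior position.
strategy_return = position.shift(1).fillna(0) * spread_return
cumulative_return = (1 + strategy_return).cumprod()

print(strategy_return.round(4).tolist())    # [0.0, 0.1, -0.05, 0.0]
print(cumulative_return.round(4).tolist())  # [1.0, 1.1, 1.045, 1.045]
```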

Calculating the Sharpe ratio and max Drawdown of the strategy

# Calculate Sharpe Ratio
sharpe_ratio = data['Strategy_Return'].mean() / data['Strategy_Return'].std() * np.sqrt(252)
print(f'Sharpe Ratio: {sharpe_ratio}')

# Calculate max drawdown
cumulative_max = data['Cumulative_Return'].cummax()
drawdown = (cumulative_max - data['Cumulative_Return']) / cumulative_max
max_drawdown = drawdown.max()
print(f'Max Drawdown: {max_drawdown}')

Sharpe Ratio: 0.6280509157958919
Max Drawdown: 0.30985471721466773
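Since the same two metrics are computed several times in this post, they can be wrapped in a small helper. The sketch below is one way to do it; `performance_summary` is a hypothetical name, and the 252 trading days per year and zero risk-free rate mirror the calculation above.

```python
import numpy as np
import pandas as pd


def performance_summary(returns: pd.Series, periods_per_year: int = 252) -> dict:
    """Annualized Sharpe ratio (zero risk-free rate) and max drawdown of a return series."""
    returns = returns.dropna()
    sharpe = returns.mean() / returns.std() * np.sqrt(periods_per_year)
    cumulative = (1 + returns).cumprod()
    running_max = cumulative.cummax()
    max_drawdown = ((running_max - cumulative) / running_max).max()
    return {"sharpe": float(sharpe), "max_drawdown": float(max_drawdown)}


# Toy returns (assumed values): +10%, -50%, +20%
stats = performance_summary(pd.Series([0.10, -0.50, 0.20]))
print(stats)
```

For example, `performance_summary(data['Strategy_Return'])` should reproduce the two numbers printed above, since it applies the same formulas.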

Preparing our dataset for ML training

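# Replace the NaNs created by pct_change() and shift(1) in the first row
data['AAL_Return'] = data['AAL_Return'].fillna(0)
data['BBY_Return'] = data['BBY_Return'].fillna(0)
data['Strategy_Return'] = data['Strategy_Return'].fillna(0)
# A cumulative return series starts at 1 (no gain, no loss)
data['Cumulative_Return'] = data['Cumulative_Return'].fillna(1)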

data.head()

Input dataset for ML training

Defining the model features X (Returns) and the target variable y (Position)

X = data[['AAL_Return', 'BBY_Return']]
y = data['Position']

Splitting the data into the train/test sets with test_size=0.2

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
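Note that `train_test_split` shuffles rows by default, so the test days are scattered among the training days. For time-ordered data, a common alternative is a chronological split, where the last 20% of days form the test set; scikit-learn's `train_test_split` supports this via `shuffle=False`. The sketch below does the same with plain pandas slicing on synthetic data and also shows why row order matters once `shift(1)` is applied to a subset. The helper `chronological_split` and the toy series are assumptions for illustration, not the split used for the results in this post.

```python
import pandas as pd


def chronological_split(X: pd.DataFrame, y: pd.Series, test_size: float = 0.2):
    """Split time-ordered data so that every test row comes after every training row."""
    cut = len(X) - int(round(len(X) * test_size))
    return X.iloc[:cut], X.iloc[cut:], y.iloc[:cut], y.iloc[cut:]


dates = pd.date_range("2024-01-01", periods=10, freq="D")
X = pd.DataFrame({"feature": range(10)}, index=dates)
y = pd.Series([1, 0, -1, 1, 0, -1, 1, 0, -1, 1], index=dates)

X_train, X_test, y_train, y_test = chronological_split(X, y)
print(X_train.index.max() < X_test.index.min())  # True

# shift(1) moves values down by one row, whatever the dates are.
shuffled = y.iloc[[4, 1, 3]]              # rows in a shuffled order
print(shuffled.shift(1).loc[dates[3]])    # 0.0: the value from the row above (2024-01-02)
print(y.shift(1).loc[dates[3]])           # -1.0: the value from the previous day (2024-01-03)
```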

Implementing the Random Forest Classifier (RFC)

rf = RandomForestClassifier()
rf.fit(X_train, y_train)

# Make predictions
predictions = rf.predict(X_test)

# Calculate the accuracy of the model
accuracy = accuracy_score(y_test, predictions)
print(f'Accuracy of the Random Forest Classifier: {accuracy}')

Accuracy of the Random Forest Classifier: 0.9522058823529411
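An accuracy figure is easier to interpret next to a naive baseline. Because the rule-based strategy is flat (`Position == 0`) unless the z-score is beyond ±2, the labels are likely to be dominated by the zero class, and a model that always predicts 0 could already score high. The sketch below computes that majority-class baseline on made-up labels; the class counts are assumptions, not the actual distribution in the AAL-BBY data.

```python
import pandas as pd


def majority_baseline_accuracy(y: pd.Series) -> float:
    """Accuracy obtained by always predicting the most frequent class in y."""
    return float(y.value_counts(normalize=True).max())


# Made-up labels: 18 flat days, 1 long day, 1 short day.
y_example = pd.Series([0] * 18 + [1] + [-1])
print(majority_baseline_accuracy(y_example))  # 0.9
```

On the article's data, `majority_baseline_accuracy(y_test)` would give the number to compare the classifier accuracy against.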

Using the test set to compare the cumulative strategy returns with/without RFC test predictions

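# Align the test-set predictions with the dates of the test rows
s = pd.Series(predictions, index=y_test.index)

# Strategy returns on the test rows: rule-based positions vs RFC-predicted positions
data['Test_Return'] = y_test.shift(1) * (data['AAL_Return'] - data['BBY_Return'])
data['RFC_Return'] = s.shift(1) * (data['AAL_Return'] - data['BBY_Return'])
# Cumulative returns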
data['Test_Cumulative_Return'] = (1 + data['Test_Return']).cumprod()
data['RFC_Cumulative_Return'] = (1 + data['RFC_Return']).cumprod()
# Plot cumulative returns
plt.figure(figsize=(10, 6))
plt.plot(data.index, data['Test_Cumulative_Return'], label='Test Cumulative Return',lw=4)
plt.plot(data.index, data['RFC_Cumulative_Return'], label='RFC Cumulative Return',lw=4)
plt.title('Test Data: Cumulative Returns of Pairs Trading Strategy with/without ML Classifier')
plt.legend()
plt.xlabel('Date')
plt.ylabel('Cumulative Return')
plt.grid()
plt.show()

Test set: cumulative strategy returns with/without RFC predictions

Observe an 8% loss and a 4% profit in the test-data strategy without and with RFC, respectively.

Implementing the KNN Classifier


knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# Make predictions
predictions = knn.predict(X_test)

# Calculate the accuracy of the model
accuracy = accuracy_score(y_test, predictions)
print(f'Accuracy of KNN: {accuracy}')

Accuracy of KNN: 0.9632352941176471

Using the test set to compare the cumulative strategy returns with/without KNN test predictions

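# Align the test-set predictions with the dates of the test rows
s = pd.Series(predictions, index=y_test.index)

# Strategy returns on the test rows: rule-based positions vs KNN-predicted positions
data['Test_Return'] = y_test.shift(1) * (data['AAL_Return'] - data['BBY_Return'])
data['KNN_Return'] = s.shift(1) * (data['AAL_Return'] - data['BBY_Return'])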
# Cumulative returns
data['Test_Cumulative_Return'] = (1 + data['Test_Return']).cumprod()
data['KNN_Cumulative_Return'] = (1 + data['KNN_Return']).cumprod()
# Plot cumulative returns
plt.figure(figsize=(10, 6))
plt.plot(data.index, data['Test_Cumulative_Return'], label='Test Cumulative Return',lw=4)
plt.plot(data.index, data['KNN_Cumulative_Return'], label='KNN Cumulative Return',lw=4)
plt.title('Test Data: Cumulative Returns of Pairs Trading Strategy with/without ML Classifier')
plt.legend()
plt.xlabel('Date')
plt.ylabel('Cumulative Return')
plt.grid()
plt.show()

Test set: cumulative strategy returns with/without KNN predictions

Observe an 8% loss and a ~3% profit in the test-data strategy without and with KNN, respectively.

Calculating the Sharpe ratio and max Drawdown for the test set without ML and with RFC/KNN predictions

# Test set without ML

# Calculate Sharpe Ratio
sharpe_ratio = data['Test_Return'].mean() / data['Test_Return'].std() * np.sqrt(252)
print(f'Sharpe Ratio: {sharpe_ratio}')

# Calculate max drawdown
cumulative_max = data['Test_Cumulative_Return'].cummax()
drawdown = (cumulative_max - data['Test_Cumulative_Return']) / cumulative_max
max_drawdown = drawdown.max()
print(f'Max Drawdown: {max_drawdown}')

Sharpe Ratio: -0.7233058926453204
Max Drawdown: 0.0817497058494896

# Test set with RFC
# Calculate Sharpe Ratio
sharpe_ratio = data['RFC_Return'].mean() / data['RFC_Return'].std() * np.sqrt(252)
print(f'Sharpe Ratio: {sharpe_ratio}')

# Calculate max drawdown
cumulative_max = data['RFC_Cumulative_Return'].cummax()
drawdown = (cumulative_max - data['RFC_Cumulative_Return']) / cumulative_max
max_drawdown = drawdown.max()
print(f'Max Drawdown: {max_drawdown}')

Sharpe Ratio: 0.7330645837264052
Max Drawdown: 0.023280253670126764

# Test set with KNN
# Calculate Sharpe Ratio
sharpe_ratio = data['KNN_Return'].mean() / data['KNN_Return'].std() * np.sqrt(252)
print(f'Sharpe Ratio: {sharpe_ratio}')

# Calculate max drawdown
cumulative_max = data['KNN_Cumulative_Return'].cummax()
drawdown = (cumulative_max - data['KNN_Cumulative_Return']) / cumulative_max
max_drawdown = drawdown.max()
print(f'Max Drawdown: {max_drawdown}')

Sharpe Ratio: 0.5483810027312457
Max Drawdown: 0.02328025367012676
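To compare the three test-set runs side by side, the figures printed above can be collected in one table. The snippet below only restates those reported numbers (rounded) and derives the drawdown ratio from them; nothing here is re-computed from market data.

```python
import pandas as pd

# Sharpe ratio and max drawdown as printed above, rounded to four decimals.
summary = pd.DataFrame(
    {
        "sharpe": [-0.7233, 0.7331, 0.5484],
        "max_drawdown": [0.0817, 0.0233, 0.0233],
    },
    index=["No ML", "RFC", "KNN"],
)

# How many times smaller the max drawdown is with RFC than without ML.
drawdown_ratio = summary.loc["No ML", "max_drawdown"] / summary.loc["RFC", "max_drawdown"]
print(summary)
print(f"Drawdown ratio (No ML / RFC): {drawdown_ratio:.1f}")  # 3.5
```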

Conclusions

In this post we employed supervised ML techniques [2] to improve the ROI of a stat arb strategy applied to the cointegrated pair AAL-BBY over 2020–2025.
We used backtesting with the Sharpe ratio and max drawdown to evaluate the viability of our strategies on test data.
On the test set, RFC turned an 8% loss into a 4% profit and reduced max drawdown roughly 3.5 times (8.2% vs 2.3%).
RFC slightly outperformed KNN in terms of ROI (4% vs ~3% profit) and Sharpe ratio (0.73 vs 0.55).
Both RFC and KNN yield the same max drawdown of about 2.3%.
These results come from a single pair and a single random train/test split, so robustness checks and sensitivity analyses are needed to corroborate them.

Can Supervised ML Classifiers Improve ROI of Pairs Trading?
