Acquiring and Analyzing Earnings Announcements Data in Python

Earnings announcements offer valuable insight into a company’s financial health and future prospects. These periodic disclosures sway market sentiment and shape investment strategies. The real challenge lies in analyzing them effectively to extract actionable insights, and in finding earnings data that is easy to retrieve in the first place.

In this article, we utilize open-source tools to access earnings information without incurring expenses. Furthermore, our objective is to transform the earnings data into insights on volatility, trend forecasting, and earnings surprise impact evaluation.

We will navigate through a series of Python-powered methodologies, starting from the initial step of fetching earnings data using Selenium, to advanced techniques involving data processing, visualization, and predictive analytics.

Figure 6. Chart depicting the normalized price movements in the days surrounding earnings announcements, highlighting market behavior in response to financial disclosures.

This guide is structured as follows:

1. Fetching Earnings Data

2. Stock Prices and Earnings Surprise

3. Price, Volatility and Volume Around Earnings

4. Advanced Analysis — Historical and Future Movement Probabilities

1. Fetching Earnings Data

We employ Selenium, a tool for web scraping, to extract earnings data directly from Yahoo Finance. This approach offers a dynamic and cost-effective alternative to traditional financial data services.

1.1 Scraping Earnings Announcement Data

The Python code below uses Selenium to set up a headless Chrome browser, navigates to the Yahoo Finance earnings calendar page for a specified stock ticker, and systematically retrieves the earnings information.

The data, which includes the symbol, company name, earnings date, EPS estimate, reported EPS, and earnings surprise percentage, is then parsed into a structured format.

1.2 Data Processing and Cleaning

For now, cleaning involves extracting and standardizing the time and timezone information and converting dates to a consistent format suitable for time-series analysis.

Furthermore, this segment of the code cleans numerical data for further analysis. See the complete code for data fetching and cleansing below.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
import pandas as pd

def fetch_earnings_data(ticker):
    # Set up Selenium to run headlessly
    options = Options()
    options.add_argument("--headless")
    options.add_argument("--disable-gpu")
    options.add_argument("--window-size=1920,1080")

    driver = webdriver.Chrome(options=options)
    url = f"https://finance.yahoo.com/calendar/earnings?symbol={ticker}"
    driver.get(url)

    # Find the rows of the earnings table
    rows = driver.find_elements(By.CSS_SELECTOR, 'table tbody tr')

    data = []

    for row in rows:
        cols = row.find_elements(By.TAG_NAME, 'td')
        cols = [elem.text for elem in cols]
        data.append(cols)

    # Close the WebDriver
    driver.quit()

    # Assuming the data structure is as expected, create a DataFrame
    columns = ['Symbol', 'Company', 'Earnings Date', 'EPS Estimate', 'Reported EPS', 'Surprise(%)']
    df = pd.DataFrame(data, columns=columns)

    return df

# Example usage:
ticker = "SAP"
earnings_data = fetch_earnings_data(ticker)

# Extract the time and timezone information into a new column (e.g. "2 AMEDT" or "2 AMEST")
earnings_data['Earnings Time'] = earnings_data['Earnings Date'].str.extract(r'(\d{1,2} [AP]M\s?[A-Z]{2,4})')

# Extract just the date part from the "Earnings Date" column
earnings_data['Earnings Date'] = earnings_data['Earnings Date'].str.extract(r'(\b\w+ \d{1,2}, \d{4})')

# Convert string date to datetime
earnings_data['Earnings Date'] = pd.to_datetime(earnings_data['Earnings Date'], format='%b %d, %Y')

# Convert datetime to desired string format
earnings_data['Earnings Date'] = earnings_data['Earnings Date'].dt.strftime('%Y-%m-%d')

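# Strip the leading '+' and convert to float; placeholders such as '-' become NaN
earnings_data['Surprise(%)'] = pd.to_numeric(earnings_data['Surprise(%)'].str.replace('+', '', regex=False), errors='coerce')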

earnings_data

Figure 1. A detailed snapshot displaying Historical Earnings Data extracted from Yahoo Finance, showcasing earnings dates, EPS estimates, reported EPS, surprise percentages, and corresponding earnings times.

2. Stock Prices and EPS

2.1 EPS Markers on Stock Price

Using Python’s yfinance library, we can fetch historical stock price data which, when contrasted with the EPS figures retrieved earlier, reveals the market's response to earnings announcements.

The following Python snippet fetches historical stock prices within the time frame of available earnings data. It then overlays significant earnings surprises, both positive and negative, on a time series plot of the stock price.

import yfinance as yf
import pandas as pd
import matplotlib.pyplot as plt


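# 'Earnings Date' was stored as a string above; convert it to datetime for indexing and plotting
earnings_data['Earnings Date'] = pd.to_datetime(earnings_data['Earnings Date'])

# Fetch stock price data
ticker = 'SAP'
stock_data = yf.download(ticker, start=earnings_data['Earnings Date'].min(), end=earnings_data['Earnings Date'].max())

# Newer yfinance versions may return MultiIndex columns even for a single ticker; flatten them
if isinstance(stock_data.columns, pd.MultiIndex):
    stock_data.columns = stock_data.columns.get_level_values(0)

# Plotting stock data
plt.figure(figsize=(25, 7))
stock_data['Close'].plot(label='Stock Price', color='blue')

# Plotting earnings surprise
for index, row in earnings_data.iterrows():
    # Skip rows without a date or a reported surprise (e.g. upcoming announcements)
    if pd.isna(row['Earnings Date']) or pd.isna(row['Surprise(%)']):
        continue

    date = row['Earnings Date']
    # If exact date is not available, use the closest available date
    if date not in stock_data.index:
        # get_indexer supports method='nearest'; Index.get_loc no longer does in pandas 2.x
        date = stock_data.index[stock_data.index.get_indexer([date], method='nearest')[0]]

    if row['Surprise(%)'] > 0:
        color = 'green'
        marker = '^'
    else:
        color = 'red'
        marker = 'v'

    plt.plot(date, stock_data.loc[date, 'Close'], marker, color=color, markersize=15)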

plt.title(f'{ticker} Stock Price with Earnings Surprise', fontsize = 13 )
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()

Figure 2. Long-Term Trend Analysis of SAP SE Stock Price with Markers Indicating Earnings Surprises Over the Time Horizon available in Earnings Data.

2.2 The Price Effect of Earnings Announcements

2.2.1 Calculating Price Effect

A crucial aspect of earnings analysis is assessing the price effect, which measures the stock’s price movement before and after the earnings announcement.
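As a minimal sketch of the arithmetic, the price effect can be read as the percentage change from the last close before the announcement to the first close after it. The function name `price_effect_pct` and the prices used here are hypothetical and serve only to illustrate the calculation.

```python
def price_effect_pct(price_before: float, price_after: float) -> float:
    """Percentage change from the close before earnings to the close after."""
    return (price_after - price_before) / price_before * 100


# Hypothetical closes: 100.00 before the announcement, 104.50 after it
effect = price_effect_pct(100.0, 104.5)
print(f"Price effect: {effect:.2f}%")
```

A positive value means the stock closed higher after the announcement than before it; a negative value means it closed lower.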

The following code enriches the earnings_data DataFrame with new columns that capture the stock's price before and after the earnings announcement, as well as the percentage change, which is the price effect.

import yfinance as yf
import pandas as pd

# Define the ticker symbol
tickerSymbol = 'SAP'

# Assuming 'earnings_data' is the DataFrame and has an 'Earnings Date' column in string format
earnings_data['Earnings Date'] = pd.to_datetime(earnings_data['Earnings Date'])

# Now add buffer days to the start and end dates
buffer_days = 10
startDate = earnings_data['Earnings Date'].min() - pd.Timedelta(days=buffer_days)
endDate = earnings_data['Earnings Date'].max() + pd.Timedelta(days=buffer_days)

# Fetch SAP SE stock price data with additional buffer days
stock_data = yf.download(tickerSymbol, start=startDate, end=endDate)

# Newer yfinance versions may return MultiIndex columns even for a single ticker; flatten them
if isinstance(stock_data.columns, pd.MultiIndex):
    stock_data.columns = stock_data.columns.get_level_values(0)

# Function to compute price effect
def compute_price_effect(earnings_date, stock_data):
    try:
        # For "Price Before", if missing, we use the most recent previous price
        price_before = stock_data.loc[:pd.Timestamp(earnings_date) - pd.Timedelta(days=1), 'Close'].ffill().iloc[-1]
        
        price_on = stock_data.loc[pd.Timestamp(earnings_date), 'Close']
        
        # For "Price After", if missing, we use the next available price
        price_after = stock_data.loc[pd.Timestamp(earnings_date) + pd.Timedelta(days=1):, 'Close'].bfill().iloc[0]
        
        price_effect = ((price_after - price_before) / price_before) * 100
    except (KeyError, IndexError):  # in case the date is missing in the stock_data even after filling
        return None, None, None, None
    return price_before, price_on, price_after, price_effect

# Apply the function
earnings_data['Price Before'], earnings_data['Price On'], earnings_data['Price After'], earnings_data['Price Effect (%)'] = zip(*earnings_data['Earnings Date'].apply(compute_price_effect, stock_data=stock_data))

#earnings_data['Surprise(%)'] = earnings_data['Surprise(%)'].str.replace('+', '').astype(float)

earnings_data

Figure 3. Tabulated Earnings Data Showing EPS Estimates, Reported EPS, Surprise Percentage, and the Calculated Price Effect.

2.2.2 Stock Price, Effect Percentage, EPS Surprise

To further our analysis, we visualize the relationship between stock prices around the earnings announcement dates, the price effect percentage, and the EPS surprise. This is achieved through the following Python code which generates a multi-faceted bar and line plot.

import pandas as pd
import matplotlib.pyplot as plt

#df = pd.DataFrame(data)
#df['Earnings Date'] = pd.to_datetime(df['Earnings Date'])

# Sort the dataframe by 'Earnings Date' in ascending order
latest_earnings_data = earnings_data.sort_values(by='Earnings Date').tail(14)

# Setting up the plot
fig, ax1 = plt.subplots(figsize=(30,8))

# Bar positions
positions = range(len(latest_earnings_data ))
width = 0.25
r1 = [pos - width for pos in positions]
r2 = positions
r3 = [pos + width for pos in positions]

# Clustered bar plots for prices
bars1 = ax1.bar(r1, latest_earnings_data ['Price Before'], width=width, label='Price Before', color='blue', edgecolor='grey')
bars2 = ax1.bar(r2, latest_earnings_data ['Price On'], width=width, label='Price On', color='cyan', edgecolor='grey')
bars3 = ax1.bar(r3, latest_earnings_data ['Price After'], width=width, label='Price After', color='lightblue', edgecolor='grey')

# Line plots for Surprise(%) and Price Effect (%)
ax2 = ax1.twinx()
ax2.plot(positions, latest_earnings_data ['Surprise(%)'], color='red', marker='o', label='Surprise(%)')
ax2.plot(positions, latest_earnings_data ['Price Effect (%)'], color='green', marker='o', label='Price Effect (%)')

# Annotations for the Surprise(%) and Price Effect (%)
for i, (date, surprise, effect) in enumerate(zip(latest_earnings_data ['Earnings Date'], latest_earnings_data ['Surprise(%)'], latest_earnings_data ['Price Effect (%)'])):
    ax2.annotate(f"{surprise}%", (i, surprise), textcoords="offset points", xytext=(0,10), ha='center', fontsize=16, color='red', fontweight='bold')
    ax2.annotate(f"{effect:.2f}%", (i, effect), textcoords="offset points", xytext=(0,10), ha='center', fontsize=16, color='green', fontweight='bold')

# Annotations for prices
def annotate_bars(bars, ax):
    for bar in bars:
        yval = bar.get_height()
        ax.text(bar.get_x() + bar.get_width()/2, yval, round(yval, 2), ha='center', va='bottom', fontsize=14, rotation=45)

annotate_bars(bars1, ax1)
annotate_bars(bars2, ax1)
annotate_bars(bars3, ax1)

# Setting x-axis with better spacing
ax1.set_xticks(positions)
ax1.set_xticklabels(latest_earnings_data ['Earnings Date'].dt.strftime('%Y-%m-%d'), rotation=45, ha='right', fontsize=14)

# Setting labels and title
ax1.set_xlabel('Earnings Date', fontweight='bold')
ax1.set_ylabel('Price', fontweight='bold')
ax2.set_ylabel('Percentage (%)', fontweight='bold')
ax1.set_title('Earnings Data with Surprise and Price Effect', fontsize=18)

# Add legends
ax1.legend(loc='upper left')
ax2.legend(loc='upper right')

plt.tight_layout()
plt.show()

Figure 4. Combined Bar and Line Chart Illustrating the Stock Price Before, On, and After Earnings Dates Alongside Earnings Surprises and Price Effects.

2.3 The Price Effect and EPS Surprise Relationship

To explore the correlation between earnings surprises and the subsequent price effect, we employ a scatter plot with a fitted regression line. This scatter plot offers a quantitative insight into how surprising earnings figures might influence stock price, thus informing analysts about the potential predictive power of earnings surprises on stock performance.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Drop rows with NaN values in 'Surprise(%)' and 'Price Effect (%)' columns
filtered_earnings_data = earnings_data.dropna(subset=['Surprise(%)', 'Price Effect (%)'])

# Linear regression
slope, intercept = np.polyfit(filtered_earnings_data['Surprise(%)'], filtered_earnings_data['Price Effect (%)'], 1)
x = np.array(filtered_earnings_data['Surprise(%)'])
y_pred = slope * x + intercept

# Compute r-squared
correlation_matrix = np.corrcoef(filtered_earnings_data['Surprise(%)'], filtered_earnings_data['Price Effect (%)'])
correlation_xy = correlation_matrix[0,1]
r_squared = correlation_xy**2

# Scatter plot with regression line
plt.figure(figsize=(30, 8))
plt.scatter(filtered_earnings_data['Surprise(%)'], filtered_earnings_data['Price Effect (%)'], color='blue', marker='o')
plt.plot(x, y_pred, color='red', label=f'y={slope:.3f}x + {intercept:.3f}')  # regression line
plt.title('Earnings Surprise vs. Price Effect', fontsize = 20)
plt.xlabel('Earnings Surprise(%)')
plt.ylabel('Price Effect(%)')
plt.grid(True)
plt.legend(loc="upper right")
plt.annotate(f'R-squared = {r_squared:.3f}', xy=(0.05, 0.95), xycoords='axes fraction', fontsize=15, color='green')
plt.show()

Figure 5. Scatter Plot Demonstrating the Relationship Between Earnings Surprise Percentages and the Subsequent Price Effect, with a Fitted Regression Line and R-squared Value.

3. Price, Volatility and Volume Around Earnings

3.1 Price Movement Around Earnings

The period surrounding earnings announcements is typically marked by heightened investor attention, often translating into significant price movements.

To analyze these fluctuations, we normalize the stock prices retrieved with yfinance to the closing price five days before the earnings date, so that relative changes can be compared across announcements.

The Python script below sets up the necessary parameters and iterates through each earnings date to create a time series of normalized prices.

These are subsequently plotted to visualize price movements around earnings dates, providing a clear depiction of market behavior during these critical periods.

import yfinance as yf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from datetime import timedelta

# Define the ticker symbol
tickerSymbol = 'SAP'

# Check the minimum and maximum earnings dates in the 'earnings_data' DataFrame
min_earnings_date = pd.to_datetime(earnings_data['Earnings Date']).min()
max_earnings_date = pd.to_datetime(earnings_data['Earnings Date']).max()

# Get data on this ticker
stock_data = yf.Ticker(tickerSymbol)

# Get the historical prices, padded by 10 days so the first and last earnings windows are fully covered
hist = stock_data.history(start=min_earnings_date - pd.Timedelta(days=10), end=max_earnings_date + pd.Timedelta(days=10))

# Make the datetime index timezone-naive for compatibility with earnings dates
hist.index = hist.index.tz_localize(None)

# Initialize an empty list to hold the price series
price_series_list = []

# Extract relevant price data
for index, row in earnings_data[['Earnings Date']].iterrows():
    earnings_date = pd.to_datetime(row['Earnings Date']).date()

    # Adjust the start date to ensure there's data available for forward-filling
    extended_start_date = earnings_date - timedelta(days=7)  # extending to ensure we have data to forward-fill
    start_date = earnings_date - timedelta(days=5)
    end_date = earnings_date + timedelta(days=5)

    # Select the stock prices for the extended date range
    prices = hist.loc[extended_start_date:end_date, 'Close']

    if prices.empty:
        print(f"No price data available for the range {extended_start_date} to {end_date}. Skipping.")
        continue

    # Forward-fill missing values, this time with available data due to extended range
    all_days = pd.date_range(start=extended_start_date, end=end_date, freq='D')
    prices = prices.reindex(all_days, method='ffill')

    # Truncate the prices Series to only the date range we're interested in (i.e., -5 to +5 days around earnings)
    prices = prices.loc[start_date:end_date]

    # Normalize prices based on the closing price 5 days before earnings
    prices /= prices.iloc[0]

    # Add the series to the list with the days relative to earnings as the new index
    price_series_list.append(prices.reset_index(drop=True))  # reset_index for proper alignment during concatenation

# Check if the price_series_list is empty
if not price_series_list:
    raise ValueError("No price data was added to the list. Please check your input data and date ranges.")

# Concatenate all the series into a single DataFrame
price_data = pd.concat(price_series_list, axis=1)

# Correcting the index to represent days relative to earnings
price_data.index = np.arange(-5, 6)

# Now, let's plot each series correctly
plt.figure(figsize=(25, 10))

# Iterate over each series and plot
for column in price_data.columns:
    plt.plot(price_data.index, price_data[column])  # Each series represents a different earnings date

plt.axvline(x=0, color='red', linestyle='--', label='Earnings Date', linewidth=5)
plt.xticks(np.arange(-5, 6, 1))  # Ensuring the x-axis reflects -5 to +5 days

# Adding title and labels
plt.title('SAP Stock Prices Around Earnings Announcements', fontsize=15)
plt.xlabel('Days Relative to Earnings', fontsize=12)
plt.ylabel('Normalized Price', fontsize=12)

# Set the tick size
plt.tick_params(axis='both', which='major', labelsize=12)  # Increase tick label size

#plt.legend(loc='upper left', bbox_to_anchor=(1.05, 1), fontsize='small')  # Adjusted the legend position so it doesn't overlap the plot
plt.grid(True)
plt.show()

Figure 6. Chart depicting the normalized price movements in the days surrounding earnings announcements, highlighting market behavior in response to financial disclosures.

3.2 Volatility Action Around Earnings

Volatility often increases around earnings announcements due to the uncertainty and potential for surprises. By computing the 20-day rolling volatility of the stock price, we aim to capture and visualize this phenomenon.

The following Python script calculates the daily returns and rolling volatility, then isolates the volatility around each earnings date. The plotted data reflects the market’s anticipation and reaction to new information, a vital consideration for risk assessment and trading strategies.

This visualization offers investors a lens through which to view the risk profile of a stock around earnings announcements, critical for managing portfolios during earnings seasons.

import yfinance as yf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from datetime import timedelta

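# Define the ticker symbol and time range
tickerSymbol = 'SAP'
# Start 60 days early so the 20-day rolling window is already populated at the first earnings date
startDate = pd.to_datetime(earnings_data['Earnings Date']).min() - pd.Timedelta(days=60)
endDate = pd.to_datetime(earnings_data['Earnings Date']).max() + pd.Timedelta(days=10)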

# Get data on this ticker
stock_data = yf.Ticker(tickerSymbol)

# Get the historical prices for this ticker
hist = stock_data.history(start=startDate, end=endDate)
hist.index = hist.index.tz_localize(None)  # Make timezone-naive

# Calculate daily returns
hist['Returns'] = hist['Close'].pct_change()

# Calculate 20-day rolling volatility
hist['20D Volatility'] = hist['Returns'].rolling(window=20).std()


# Initialize an empty list to hold the volatility series
volatility_series_list = []

# Extract relevant volatility data
for index, row in earnings_data[['Earnings Date']].iterrows():
    earnings_date = pd.to_datetime(row['Earnings Date']).date()
    extended_start_date = earnings_date - timedelta(days=7)
    start_date = earnings_date - timedelta(days=5)
    end_date = earnings_date + timedelta(days=5)

    # Select the volatility for the extended date range
    volatilities = hist.loc[extended_start_date:end_date, '20D Volatility']

    if volatilities.empty:
        print(f"No volatility data available for the range {extended_start_date} to {end_date}. Skipping.")
        continue

    # Forward-fill missing values
    all_days = pd.date_range(start=extended_start_date, end=end_date, freq='D')
    volatilities = volatilities.reindex(all_days, method='ffill')

    # Truncate the volatilities Series to only the date range we're interested in (i.e., -5 to +5 days around earnings)
    volatilities = volatilities.loc[start_date:end_date]

    # Reindex the series to start at 0 on day -5
    volatilities = volatilities - volatilities.iloc[0]
    
    # Add the series to the list with the days relative to earnings as the new index
    volatility_series_list.append(volatilities.reset_index(drop=True))

if not volatility_series_list:
    raise ValueError("No volatility data was added to the list. Please check your input data and date ranges.")

# Concatenate all the series into a single DataFrame
volatility_data = pd.concat(volatility_series_list, axis=1)

# Correcting the index to represent days relative to earnings
volatility_data.index = np.arange(-5, 6)

# Now, let's plot each series correctly
plt.figure(figsize=(25, 10))
for column in volatility_data.columns:
    plt.plot(volatility_data.index, volatility_data[column])  # Each series represents a different earnings date

plt.axvline(x=0, color='red', linestyle='--', label='Earnings Date', linewidth=5)
plt.xticks(np.arange(-5, 6, 1))
plt.title(f'20-Day Rolling Volatility Around Earnings Announcements for {tickerSymbol}', fontsize=15)
plt.xlabel('Days Relative to Earnings', fontsize=12)
plt.ylabel('20-Day Volatility',fontsize=12)

plt.tick_params(axis='both', which='major', labelsize=12)
plt.grid(True)
plt.show()

Figure 7. Illustrating the 20-day rolling volatility of stock price before and after earnings dates, capturing the market’s anticipation and reaction to earnings reports.

3.3 Volume Changes Around Earnings

Trading volume is another key indicator that can signal the market’s reaction to earnings reports. A surge in volume often accompanies significant earnings surprises, reflecting increased trading activity as investors reassess the stock’s value.

Utilizing yfinance, we extract and analyze the trading volume around earnings dates. The script below processes this data, aligning it with the earnings dates and reindexing the series to highlight changes in trading activity.

By visualizing the reindexed volume data, investors can gauge the intensity of the market’s response to earnings announcements, which can be a useful proxy for market sentiment and investor interest.

import yfinance as yf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from datetime import timedelta

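# Define the ticker symbol and time range you want to analyze
tickerSymbol = 'SAP'
# Pad the range by 10 days so the first and last earnings windows are fully covered
startDate = pd.to_datetime(earnings_data['Earnings Date']).min() - pd.Timedelta(days=10)
endDate = pd.to_datetime(earnings_data['Earnings Date']).max() + pd.Timedelta(days=10)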

# Get data on this ticker
stock_data = yf.Ticker(tickerSymbol)

# Get the historical prices for this ticker
hist = stock_data.history(start=startDate, end=endDate)
hist.index = hist.index.tz_localize(None)  # Make timezone-naive

# Initialize an empty list to hold the volume series
volume_series_list = []

# Extract relevant volume data
for index, row in earnings_data[['Earnings Date']].iterrows():
    earnings_date = pd.to_datetime(row['Earnings Date']).date()
    extended_start_date = earnings_date - timedelta(days=7)
    start_date = earnings_date - timedelta(days=5)
    end_date = earnings_date + timedelta(days=5)

    # Select the volume for the extended date range
    volumes = hist.loc[extended_start_date:end_date, 'Volume']

    if volumes.empty:
        print(f"No volume data available for the range {extended_start_date} to {end_date}. Skipping.")
        continue

    # Forward-fill missing values
    all_days = pd.date_range(start=extended_start_date, end=end_date, freq='D')
    volumes = volumes.reindex(all_days, method='ffill')

    # Truncate the volumes Series to only the date range we're interested in (i.e., -5 to +5 days around earnings)
    volumes = volumes.loc[start_date:end_date]

    # ### START OF REINDEXING ###
    # Reindex the series to start at 0 on day -5
    volumes = volumes - volumes.iloc[0]
    # ### END OF REINDEXING ###

    # Add the series to the list with the days relative to earnings as the new index
    volume_series_list.append(volumes.reset_index(drop=True))

if not volume_series_list:
    raise ValueError("No volume data was added to the list. Please check your input data and date ranges.")

# Concatenate all the series into a single DataFrame
volume_data = pd.concat(volume_series_list, axis=1)

# Correcting the index to represent days relative to earnings
volume_data.index = np.arange(-5, 6)


# Now, let's plot each series correctly
plt.figure(figsize=(25, 10))
for column in volume_data.columns:
    plt.plot(volume_data.index, volume_data[column])  # Each series represents a different earnings date

plt.axvline(x=0, color='red', linestyle='--', label='Earnings Date', linewidth=5)
plt.xticks(np.arange(-5, 6, 1))
plt.title(f'Reindexed Volume Around Earnings Announcements for {tickerSymbol}',fontsize=15)
plt.xlabel('Days Relative to Earnings',fontsize=12)
plt.ylabel('Reindexed Volume',fontsize=12)
plt.tick_params(axis='both', which='major', labelsize=12)
plt.grid(True)
plt.show()

Figure 8. Showcasing the trading volume reindexed to the days leading up to and following earnings announcements, reflecting shifts in market activity and investor interest.

4. Historical and Future Movement Probabilities

We now consider three types of analysis to estimate potential future price movements and their associated probabilities: historical price movements around earnings, market-implied movements derived from option prices, and Monte Carlo simulated prices based on the historical earnings data.

4.1 Historical Movement Probabilities

Understanding historical movement probabilities can be invaluable. It allows for an estimation of how the stock price might behave around earnings announcements, based on historical volatility and price movements in previous earnings announcements.

The approach is as follows: we normalize the price data to the period leading up to the earnings announcement, and then analyze the distribution of these paths to determine the probability of various outcomes, e.g. a 10% increase or decrease within n days.

Figure 9. Normalized price movements around the earnings date, with the estimated probability of moves beyond a given percentage threshold.
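The approach described in this subsection can be sketched as follows, assuming a DataFrame shaped like `price_data` from Section 3.1 (one column per past announcement, normalized to 1.0 on day -5). The function name `threshold_probabilities`, the 5% threshold, and the example paths are hypothetical; this is one possible reading of the method, not the notebook's implementation.

```python
import numpy as np
import pandas as pd


def threshold_probabilities(price_data: pd.DataFrame, threshold: float = 0.05) -> dict:
    """Share of historical earnings windows whose normalized price moved
    at least `threshold` up or down at any point in the window."""
    # Paths are normalized to 1.0 on the first day, so path - 1 is the relative move.
    # Windows without any price data are dropped before counting.
    moves = (price_data - 1.0).dropna(axis=1, how="all")
    n_events = moves.shape[1]
    if n_events == 0:
        raise ValueError("No earnings windows with price data.")
    prob_up = (moves.max(axis=0) >= threshold).sum() / n_events
    prob_down = (moves.min(axis=0) <= -threshold).sum() / n_events
    return {"up": float(prob_up), "down": float(prob_down)}


# Hypothetical normalized paths for four past announcements (days -5 to +5)
example = pd.DataFrame(
    {
        "event_1": [1.00, 1.01, 1.00, 1.02, 1.01, 1.03, 1.08, 1.07, 1.09, 1.10, 1.11],
        "event_2": [1.00, 0.99, 0.98, 0.99, 0.97, 0.96, 0.90, 0.91, 0.89, 0.90, 0.88],
        "event_3": [1.00, 1.00, 1.01, 1.01, 1.02, 1.02, 1.03, 1.02, 1.03, 1.04, 1.03],
        "event_4": [1.00, 1.01, 1.02, 1.01, 1.00, 0.99, 1.06, 1.05, 1.04, 1.05, 1.06],
    },
    index=np.arange(-5, 6),
)
probs = threshold_probabilities(example, threshold=0.05)
print(probs)
```

With only a handful of past announcements per ticker, these frequencies are coarse estimates; each event shifts the probability by a large step.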

4.2 Market Implied Movement Probabilities

Moving beyond historical data, we can also utilize market-implied probabilities to gauge expectations about future stock price movements.

This involves analyzing option prices to extract the implied volatility, which reflects the market’s forecast of a stock’s potential to undergo significant price changes. Implied volatility can be retrieved from the options chain on Yahoo Finance.

By applying this market-implied information in Monte Carlo simulations, we can predict a range of potential price outcomes and their associated probabilities for the period around an earnings announcement.
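A minimal sketch of this idea is given here, assuming an annualized implied volatility is already at hand (for example from the `impliedVolatility` column of a yfinance option chain). The function name, the zero-drift geometric Brownian motion model, the 252-day year, and all input values are assumptions made for illustration only.

```python
import numpy as np


def implied_move_probabilities(spot, implied_vol, days, threshold, n_paths=100_000, seed=0):
    """Simulate terminal prices under zero-drift geometric Brownian motion
    (a modelling assumption) and estimate the chance of a move beyond +/- threshold."""
    rng = np.random.default_rng(seed)
    t = days / 252.0  # horizon as a fraction of a trading year
    z = rng.standard_normal(n_paths)
    # Terminal price: S_T = S_0 * exp(-0.5 * sigma^2 * t + sigma * sqrt(t) * z)
    terminal = spot * np.exp(-0.5 * implied_vol**2 * t + implied_vol * np.sqrt(t) * z)
    returns = terminal / spot - 1.0
    return {
        # Approximate one-standard-deviation move in price units
        "expected_move": float(spot * implied_vol * np.sqrt(t)),
        "prob_up": float((returns >= threshold).mean()),
        "prob_down": float((returns <= -threshold).mean()),
    }


# Hypothetical inputs: spot 150, 40% implied volatility, 5 trading days, 5% threshold
result = implied_move_probabilities(spot=150.0, implied_vol=0.40, days=5, threshold=0.05)
print(result)
```

The fixed seed keeps the estimate reproducible; a different seed shifts the probabilities only slightly at this number of paths.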

4.3 Monte Carlo Simulated Movement Probabilities

To quantify the impact of earnings announcements, we also investigate historical price movements using Monte Carlo simulations. This advanced analysis provides a probabilistic assessment of potential price trajectories leading up to and following earnings releases.

A Monte Carlo simulation employs repeated random sampling to predict the behavior of a system that cannot easily be predicted due to the intervention of random variables.

We combine both historical data and market-implied statistics to run comprehensive Monte Carlo simulations. This fusion provides a framework for forecasting prices around earnings announcements, considering both past performance and current market sentiment.
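One simple way to run such a simulation on the historical side is a bootstrap: daily returns observed in past earnings windows are resampled with replacement and compounded into simulated paths. The sketch below assumes this bootstrap variant, which the article does not specify; the function name `bootstrap_paths` and the sample returns are hypothetical.

```python
import numpy as np


def bootstrap_paths(daily_returns, horizon=5, n_paths=10_000, seed=0):
    """Resample historical daily returns with replacement to build
    simulated cumulative-return paths over `horizon` days."""
    rng = np.random.default_rng(seed)
    samples = rng.choice(np.asarray(daily_returns), size=(n_paths, horizon), replace=True)
    # Compound the sampled daily returns along each path
    return np.cumprod(1.0 + samples, axis=1) - 1.0


# Hypothetical daily returns observed in past earnings windows
hist_returns = [0.012, -0.008, 0.045, -0.031, 0.004, 0.022, -0.015, 0.060, -0.052, 0.009]

paths = bootstrap_paths(hist_returns, horizon=5)
final = paths[:, -1]  # cumulative return at the end of each simulated path

prob_up_5 = float((final >= 0.05).mean())
prob_down_5 = float((final <= -0.05).mean())
low, high = np.percentile(final, [5, 95])
print(f"P(+5% or more): {prob_up_5:.3f}, P(-5% or worse): {prob_down_5:.3f}")
print(f"5th to 95th percentile of 5-day return: {low:.3f} to {high:.3f}")
```

Bootstrapping keeps the empirical shape of past earnings-window returns but treats days as independent, so it ignores any clustering of volatility around the announcement.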

5. Further Trading Applications

5.1 Machine Learning in Earnings Prediction

By analyzing historical earnings data alongside other financial indicators, predictive models can be developed to forecast earnings beats or misses. This approach leverages the data available to uncover patterns that might not be immediately evident.

5.2 Earnings Revision Models

Tracking how analysts’ earnings forecasts change over time is another insightful approach. Earnings revision models focus on the trends in these revisions, which can signal market sentiment and potential future performance of a stock.

5.3 Volatility Arbitrage around Earnings

For traders focusing on volatility, analyzing the differences between actual and implied volatility around earnings announcements can present arbitrage opportunities.
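To make the comparison concrete, the sketch below computes an annualized realized volatility from daily closes and sets it against an implied volatility figure. The function name, the 252-day annualization, and every number are hypothetical; a gap between the two values is a starting point for analysis, not a trading signal in itself.

```python
import math


def annualized_realized_vol(closes, trading_days=252):
    """Annualized sample standard deviation of daily log returns."""
    log_returns = [math.log(b / a) for a, b in zip(closes[:-1], closes[1:])]
    mean = sum(log_returns) / len(log_returns)
    variance = sum((r - mean) ** 2 for r in log_returns) / (len(log_returns) - 1)
    return math.sqrt(variance * trading_days)


# Hypothetical closes around an announcement and a hypothetical implied volatility
closes = [100.0, 101.5, 99.8, 103.2, 102.1, 104.0, 103.5]
implied_vol = 0.45

realized = annualized_realized_vol(closes)
spread = implied_vol - realized
print(f"Realized: {realized:.3f}, implied: {implied_vol:.3f}, spread: {spread:.3f}")
```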

5.4 Post-Earnings Announcement Drift (PEAD) Strategies

Post-earnings announcement drift refers to the tendency of a stock’s price to continue moving in the direction of an earnings surprise for several weeks. By identifying and acting on statistically significant earnings surprises, traders can exploit this drift for potential gains.
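A rough way to check for drift in historical data is sketched below, assuming a Series of daily closes and a table with the 'Earnings Date' and 'Surprise(%)' columns used earlier. The function name `post_earnings_drift`, the 20-day horizon, and the synthetic prices are hypothetical.

```python
import numpy as np
import pandas as pd


def post_earnings_drift(closes: pd.Series, events: pd.DataFrame, horizon: int = 20) -> pd.DataFrame:
    """For each event, measure the return from the first close on or after the
    announcement date to the close `horizon` trading days later."""
    rows = []
    for date, surprise in zip(events["Earnings Date"], events["Surprise(%)"]):
        # Position of the first trading day on or after the announcement date
        start = closes.index.searchsorted(pd.Timestamp(date))
        end = start + horizon
        if end >= len(closes):
            continue  # not enough price history after the announcement
        drift = closes.iloc[end] / closes.iloc[start] - 1.0
        rows.append({
            "Earnings Date": date,
            "Surprise(%)": surprise,
            "Drift(%)": drift * 100,
            "Same Direction": np.sign(drift) == np.sign(surprise),
        })
    return pd.DataFrame(rows)


# Synthetic example: prices rise by 1 per business day over 60 days
dates = pd.bdate_range("2023-01-02", periods=60)
closes = pd.Series(np.linspace(100.0, 159.0, 60), index=dates)
events = pd.DataFrame({
    "Earnings Date": [pd.Timestamp("2023-01-10"), pd.Timestamp("2023-03-20")],
    "Surprise(%)": [5.0, -3.0],
})

drift_table = post_earnings_drift(closes, events, horizon=20)
print(drift_table)
```

The second event is dropped because fewer than 20 trading days of prices follow it. Measuring from the announcement-day close means part of the immediate reaction may already be in the starting price, which is a simplification of this sketch.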

6. Conclusion

This article has outlined how to harness historical data, market sentiment, and probabilistic modeling to gain a comprehensive understanding of potential market behaviors surrounding earnings announcements.
