Information in noise. Two techniques for visualizing many… | by Lenix Carter | Sep, 2024

Two techniques to visualize many time series at once

Picture this: You have a bunch of line charts, and you’re sure there’s at least one trend hiding somewhere among all that data. Whether you’re tracking sales of your company’s thousands of products or analyzing stock market data, your goal is to uncover those subtrends and make them stand out in your visualization. Let’s explore a couple of techniques that will help you do just that.

Hundreds of lines have been drawn, but it is not clear what the subtrends are. This synthetic data can show the benefit of these strategies. (Image by the author)
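The article does not show how the synthetic data was generated. As a rough sketch only, a frame with a similar shape could be built as below, assuming two sine-wave subtrends buried among random walks. The function name make_synthetic_frame, the series counts, and the noise scales are all hypothetical choices, not the author's.

```python
import numpy as np
import pandas as pd


def make_synthetic_frame(n_noise=300, n_trend=40, n_points=200, seed=0):
    """Hypothetical stand-in for synth_df: random walks plus two sine subtrends."""
    rng = np.random.default_rng(seed)
    # A datetime index, so that index.astype('int64') works as in the article.
    index = pd.date_range("2024-01-01", periods=n_points, freq="D")
    t = np.linspace(0, 4 * np.pi, n_points)

    columns = {}
    # Background noise: independent random walks starting near 0.
    for i in range(n_noise):
        columns[f"noise_{i}"] = rng.normal(0, 1, n_points).cumsum()
    # Two subtrends: sine waves with opposite phase plus small jitter.
    for i in range(n_trend):
        columns[f"trend_a_{i}"] = 10 * np.sin(t) + rng.normal(0, 1, n_points)
        columns[f"trend_b_{i}"] = 10 * np.sin(t + np.pi) + rng.normal(0, 1, n_points)
    return pd.DataFrame(columns, index=index)


synth_df = make_synthetic_frame()
```

Each column is one time series and each row is one time step, which matches how synth_df is used in the plotting code in this article.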

Density line plots are a clever plotting technique introduced by Dominik Moritz and Danyel Fisher in their article, “Visualizing a million time series with the density line chart.” This method transforms numerous line graphs into heat maps, revealing areas where the lines overlap the most.
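To make the idea concrete, here is a minimal NumPy sketch of turning lines into a heat-map grid. It is a simplified approximation, not the algorithm from the paper or from any library: it only bins the sampled points of each series rather than rasterizing the segments between them, and it assumes the series do not all share one constant value. The function name density_grid is hypothetical.

```python
import numpy as np


def density_grid(ys, ny=100):
    """Count, per time step, how many series fall into each vertical bin.

    ys: array of shape (n_series, n_points).
    Returns an array of shape (ny, n_points); row 0 is the lowest y bin.
    """
    ys = np.asarray(ys, dtype=float)
    n_series, n_points = ys.shape
    y_min, y_max = ys.min(), ys.max()
    # Map each value to a bin index in [0, ny - 1] (assumes y_max > y_min).
    scaled = (ys - y_min) / (y_max - y_min)
    bins = np.minimum((scaled * ny).astype(int), ny - 1)

    grid = np.zeros((ny, n_points))
    cols = np.broadcast_to(np.arange(n_points), bins.shape)
    # np.add.at accumulates correctly when several series share a cell.
    np.add.at(grid, (bins, cols), 1.0)
    return grid


# Three series: two overlap exactly, one sits apart.
demo = np.array([[0.0, 1.0, 2.0],
                 [0.0, 1.0, 2.0],
                 [4.0, 4.0, 4.0]])
grid = density_grid(demo, ny=4)
print(grid)
```

Cells where the two identical series coincide hold a count of 2, while the isolated series contributes 1. Passing such a grid to plt.imshow(grid, origin="lower", aspect="auto") would render it as a heat map.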

When we apply density line plots to the synthetic data shown above, the results look like this:

PyDLC allows us to see “hot spots” where many lines overlap. (Image by the author)

This implementation allows us to see where our trends appear and identify the subtrends that make this data interesting.

For this example, we use the Python library PyDLC by Charles L. Berubé. Implementation is quite straightforward, thanks to the library's user-friendly design.

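import matplotlib.pyplot as plt
from pydlc import dense_lines

plt.figure(figsize=(14, 14))
# Transpose so that each row is one time series
im = dense_lines(
    synth_df.to_numpy().T,
    x=synth_df.index.astype('int64'),
    cmap='viridis',
    ny=100,
    y_pad=0.01,
)
plt.ylim(-25, 25)
plt.axhline(y=0, color='white', linestyle=':')
plt.show()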

When using density line graphs, keep in mind that parameters such as ny and y_pad may need some adjustment to obtain the best results.

The second technique, drawing the lines with transparency, hasn't been discussed as much and doesn't have a universally recognized name. However, it is essentially a variation on “line density charts” or “line density visualizations,” where we use thicker lines with low opacity to reveal areas of overlap and density.

This technique shows subtrends quite well and reduces the cognitive load of the many lines. (Image by the author)

We can clearly identify what appear to be two distinct trends and observe the high degree of overlap during the downward movements of the sine waves. However, it is a little more complicated to determine where the effect is strongest.

The code for this approach is also pretty straightforward:

plt.figure(figsize=(14, 14))
for column in synth_df.columns:
    plt.plot(
        synth_df.index,
        synth_df[column],
        alpha=0.1,      # low opacity so overlapping lines darken
        linewidth=2,
        label=column,
        color='black',
    )
plt.show()

Here, the two parameters that might require some adjustment are alpha and linewidth.
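As a rough guide for picking alpha, the arithmetic of overlapping transparent lines can be worked out directly. The sketch below assumes standard source-over alpha compositing, where each new line covers a fraction alpha of whatever is still visible beneath it. Both function names are hypothetical helpers, not part of matplotlib.

```python
def stacked_opacity(alpha, n_overlaps):
    """Opacity where n lines with the same alpha are drawn on top of each other."""
    return 1 - (1 - alpha) ** n_overlaps


def alpha_for_target(target_opacity, n_overlaps):
    """Per-line alpha so that n overlapping lines reach the target opacity."""
    return 1 - (1 - target_opacity) ** (1 / n_overlaps)


# How dark does a region get as more lines pass through it at alpha=0.1?
for n in (1, 5, 10, 50):
    print(n, round(stacked_opacity(0.1, n), 3))
```

With alpha=0.1, ten overlapping lines are already about 65% opaque, so regions where many more lines coincide saturate and become hard to tell apart. Lowering alpha when there are more series keeps the dense regions distinguishable.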

Let's imagine we are looking for subtrends in the daily returns of 50 stocks. The first step is to extract the data and calculate the daily returns.

import yfinance as yf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

stock_tickers = [
    'AAPL', 'MSFT', 'GOOGL', 'AMZN', 'TSLA', 'META', 'NVDA', 'BRK-B', 'UNH', 'V',
    'HD', 'MA', 'KO', 'DIS', 'PFE', 'NKE', 'ADBE', 'CMCSA', 'NFLX', 'CSCO',
    'INTC', 'AMGN', 'COST', 'PEP', 'TMO', 'AVGO', 'QCOM', 'TXN', 'ABT', 'ORCL',
    'MCD', 'MDT', 'CRM', 'UPS', 'WMT', 'BMY', 'GILD', 'BA', 'SBUX', 'IBM',
    'MRK', 'WBA', 'CAT', 'CVX', 'T', 'MS', 'LMT', 'GS', 'WFC', 'HON'
]

start_date = '2024-03-01'
end_date = '2024-09-01'

percent_returns_df = pd.DataFrame()

for ticker in stock_tickers:
    stock_data = yf.download(ticker, start=start_date, end=end_date)
    # Fill gaps forward, then backward (fillna(method=...) is deprecated in recent pandas)
    stock_data = stock_data.ffill().bfill()

    if len(stock_data) >= 2:
        stock_data['Percent Daily Return'] = stock_data['Close'].pct_change() * 100
        stock_data['Ticker'] = ticker
        percent_returns_df = pd.concat(
            [percent_returns_df, stock_data[['Ticker', 'Percent Daily Return']]],
            axis=0,
        )

percent_returns_df.reset_index(inplace=True)

# display() is available in Jupyter/IPython; use print() in a plain script
display(percent_returns_df)
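To see what the daily return calculation does, here is a small worked example on made-up prices. The closing values and dates are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import pandas as pd

# Hypothetical closing prices for three consecutive trading days.
close = pd.Series(
    [100.0, 102.0, 99.96],
    index=pd.to_datetime(["2024-03-01", "2024-03-04", "2024-03-05"]),
)

# pct_change() returns (today - yesterday) / yesterday; the first day has no
# predecessor, so its return is NaN.
returns = close.pct_change() * 100
print(returns.round(2))
```

The price moves from 100 to 102 (a 2% gain) and then from 102 to 99.96 (a 2% loss). The first value of every ticker is NaN, which is why the pivoted frame is filled before plotting.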

Then we can graph the data.

# One column per ticker, one row per date
pivot_df = percent_returns_df.pivot(index='Date', columns='Ticker', values='Percent Daily Return')
pivot_df = pivot_df.ffill().bfill()

plt.figure(figsize=(14, 14))
sns.lineplot(data=pivot_df, dashes=False)
plt.title('Percent Daily Returns of Top 50 stocks')
plt.xlabel('Date')
plt.ylabel('Percent Daily Return')
plt.legend(title='Stock Ticker', bbox_to_anchor=(1.05, 1), loc='upper left')
plt.grid(True)
plt.tight_layout()
plt.show()

A very messy plot with many lines and little discernible information. (Image by the author)

The density line chart faces some challenges with this data because daily returns are so sporadic. However, it still provides valuable insights into overall market trends. For example, you can spot periods where the densest areas correspond to significant declines, highlighting tough days in the market.

plt.figure(figsize=(14, 14))
# Transpose so that each row is one ticker's return series
im = dense_lines(
    pivot_df[stock_tickers].to_numpy().T,
    x=pivot_df.index.astype('int64'),
    cmap='viridis',
    ny=200,
    y_pad=0.1,
)
plt.axhline(y=0, color='white', linestyle=':')
plt.ylim(-10, 10)
plt.show()

However, we have found that the transparency technique works much better for this particular problem. The market declines we mentioned earlier become much clearer and more noticeable.

plt.figure(figsize=(14, 14))
for ticker in pivot_df.columns:
    plt.plot(
        pivot_df.index,
        pivot_df[ticker],
        alpha=0.1,      # low opacity so overlapping lines darken
        linewidth=4,
        label=ticker,
        color='black',
    )
plt.show()
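If you want to double-check the declines you see in the plot, one option is a simple numeric cross-check that flags days on which most tickers fell together. This is only a sketch under stated assumptions: the helper name broad_decline_days and the 75% threshold are hypothetical, and the small frame below is made-up data standing in for pivot_df.

```python
import pandas as pd


def broad_decline_days(returns_wide, min_share_down=0.75):
    """Return the share of falling columns for dates where it reaches the threshold."""
    # Fraction of tickers with a negative return on each date.
    share_down = (returns_wide < 0).mean(axis=1)
    return share_down[share_down >= min_share_down]


# Tiny hypothetical frame: 4 days x 5 tickers of percent returns.
demo = pd.DataFrame(
    [[0.5, -0.2, 0.1, 0.3, -0.1],
     [-1.2, -0.8, -2.0, -1.5, 0.2],
     [0.4, 0.6, -0.3, 0.2, 0.1],
     [-0.9, -1.1, -0.7, -1.3, -2.2]],
    index=pd.date_range("2024-08-01", periods=4, freq="D"),
    columns=["A", "B", "C", "D", "E"],
)
flagged = broad_decline_days(demo)
print(flagged)
```

Calling broad_decline_days(pivot_df) on the frame built earlier should list the dates that show up as dark downward bands in the transparency plot.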

Both strategies have their own merits and strengths, and the best approach for your work may not be obvious until you've tried both. I hope you find one of these techniques useful for your future projects. If you know of other techniques or use cases for handling massive line charts, I'd love to hear about them!

Thanks for reading and take care.
