How to accurately handle time zones and timestamps with Pandas

Author's image | Midjourney

Time-based data becomes harder to work with when it spans different time zones, because the same timestamp string can refer to different moments depending on where it was recorded. This guide shows how to manage time zones and timestamps using the Pandas library in Python.

Preparation

In this tutorial, we will be using the Pandas package. If it is not already installed, we can install it with pip by running pip install pandas.

Now, we will explore how to work with time-based data in Pandas with practical examples.

Handling Timezones and Timestamps with Pandas

Time data provides a specific time reference for events. The most precise form is the timestamp, which carries detailed time information from the year down to fractions of a second (Pandas timestamps support up to nanosecond resolution).

Let's start by creating a sample dataset.

import pandas as pd

data = {
    'transaction_id': [1, 2, 3],
    'timestamp': ['2023-06-15 12:00:05', '2024-04-15 15:20:02', '2024-06-15 21:17:43'],
    'amount': [100, 200, 150]
}

df = pd.DataFrame(data)
# Parse the timestamp strings into datetime values
df['timestamp'] = pd.to_datetime(df['timestamp'])

The 'timestamp' column in the above example contains time data with second-level precision. Because the values start out as plain strings, we use the pd.to_datetime function to convert the column to a datetime format.
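As a quick check on the conversion described above, the sketch below confirms that the parsed values form a datetime column with no time zone attached. The names `raw`, `parsed`, and `messy` and the sample strings are illustrative and not part of the original dataset.

```python
import pandas as pd

# Hypothetical sample strings in the same format as the tutorial's data
raw = pd.Series(['2023-06-15 12:00:05', '2024-04-15 15:20:02'])
parsed = pd.to_datetime(raw)

# The result is a datetime column without a time zone (tz-naive)
print(parsed.dtype)
print(parsed.dt.tz)  # None for tz-naive data

# errors='coerce' turns unparseable strings into NaT instead of raising
messy = pd.to_datetime(
    pd.Series(['2023-06-15 12:00:05', 'not a date']), errors='coerce'
)
print(messy.isna().tolist())
```

Checking for NaT after parsing is one way to spot malformed rows before any time zone is attached.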

We can then make the datetime data timezone-aware. For example, we can label the data as Coordinated Universal Time (UTC) with tz_localize, which attaches the time zone without changing the clock values.

df['timestamp_utc'] = df['timestamp'].dt.tz_localize('UTC')
print(df)

Output>>>
   transaction_id           timestamp  amount             timestamp_utc
0               1 2023-06-15 12:00:05     100 2023-06-15 12:00:05+00:00
1               2 2024-04-15 15:20:02     200 2024-04-15 15:20:02+00:00
2               3 2024-06-15 21:17:43     150 2024-06-15 21:17:43+00:00
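Localizing to UTC never fails, but localizing to a zone with daylight saving time can, because some wall-clock times do not exist there. The sketch below assumes hypothetical readings recorded in New York (the `naive` and `local` names and the sample values are not from the original dataset) and uses the `nonexistent` parameter of tz_localize to handle the skipped hour.

```python
import pandas as pd

# Hypothetical wall-clock readings recorded in New York, without zone info
naive = pd.Series(pd.to_datetime(['2024-03-10 02:30:00', '2024-06-15 12:00:05']))

# 02:30 on 2024-03-10 does not exist in New York (clocks jump from 02:00
# to 03:00), so a plain tz_localize would raise an error here.
# nonexistent='shift_forward' moves such values to the next valid time.
local = naive.dt.tz_localize('America/New_York', nonexistent='shift_forward')
print(local)
```

Whether shifting forward is appropriate depends on the data; passing `nonexistent='NaT'` instead would mark such rows as missing.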

The 'timestamp_utc' values now include the time zone offset (+00:00). Once a column is timezone-aware, we can convert it to another time zone. For example, we can take the UTC column and convert it to Japan time.

df['timestamp_japan'] = df['timestamp_utc'].dt.tz_convert('Asia/Tokyo')
print(df)

Output>>>
   transaction_id           timestamp  amount             timestamp_utc  \
0               1 2023-06-15 12:00:05     100 2023-06-15 12:00:05+00:00   
1               2 2024-04-15 15:20:02     200 2024-04-15 15:20:02+00:00   
2               3 2024-06-15 21:17:43     150 2024-06-15 21:17:43+00:00   

            timestamp_japan  
0 2023-06-15 21:00:05+09:00  
1 2024-04-16 00:20:02+09:00  
2 2024-06-16 06:17:43+09:00
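Note that tz_convert changes only how the moment is displayed, not the moment itself. A minimal sketch using one value from the dataset (the `utc_time`, `japan_time`, and `japan_naive` names are illustrative):

```python
import pandas as pd

utc_time = pd.Timestamp('2024-06-15 21:17:43', tz='UTC')
japan_time = utc_time.tz_convert('Asia/Tokyo')

# Wall-clock time in Tokyo, nine hours ahead of UTC
print(japan_time)

# Both values describe the same instant, so they compare as equal
print(utc_time == japan_time)

# tz_localize(None) drops the zone but keeps the Tokyo wall-clock time
japan_naive = japan_time.tz_localize(None)
print(japan_naive)
```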

With the converted column, we can filter the data using bounds expressed in a particular time zone. For example, we can select the transactions that fall within a window given in Japan time.

start_time_japan = pd.Timestamp('2024-06-15 06:00:00', tz='Asia/Tokyo')
end_time_japan = pd.Timestamp('2024-06-16 07:59:59', tz='Asia/Tokyo')

filtered_df = df[(df['timestamp_japan'] >= start_time_japan) & (df['timestamp_japan'] <= end_time_japan)]

print(filtered_df)

Output>>>
   transaction_id           timestamp  amount             timestamp_utc  \
2               3 2024-06-15 21:17:43     150 2024-06-15 21:17:43+00:00   

            timestamp_japan  
2 2024-06-16 06:17:43+09:00
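The same filter can be sketched with Series.between, which is inclusive on both ends by default. The example below rebuilds a small timezone-aware column (the `ts` name and its two values are a hypothetical stand-in for the dataset) and also shows that the bounds themselves must be timezone-aware.

```python
import pandas as pd

# Hypothetical tz-aware column in Japan time
ts = (
    pd.Series(pd.to_datetime(['2023-06-15 12:00:05', '2024-06-15 21:17:43']))
    .dt.tz_localize('UTC')
    .dt.tz_convert('Asia/Tokyo')
)

start = pd.Timestamp('2024-06-15 06:00:00', tz='Asia/Tokyo')
end = pd.Timestamp('2024-06-16 07:59:59', tz='Asia/Tokyo')

# between() includes both endpoints, like the >= / <= filter
mask = ts.between(start, end)
print(mask.tolist())

# Bounds expressed in UTC select the same rows, because timezone-aware
# values are compared by the instant they represent
mask_utc = ts.between(start.tz_convert('UTC'), end.tz_convert('UTC'))

# A tz-naive bound cannot be ordered against tz-aware data
try:
    ts >= pd.Timestamp('2024-06-15 06:00:00')
    naive_comparison_failed = False
except TypeError:
    naive_comparison_failed = True
print(naive_comparison_failed)
```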

Timezone-aware timestamps also let us perform time series resampling. Let's look at an example that counts the non-null values of each column per hour in our dataset.

# 'h' is the hourly frequency alias; recent Pandas versions deprecate the uppercase 'H'
resampled_df = df.set_index('timestamp_japan').resample('h').count()
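Because the sample transactions are months apart, hourly resampling of the dataset yields mostly empty bins. The sketch below uses a hypothetical `events` table with closely spaced times to make the behavior easier to see; the column names and values are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical events: two in the 09:00 hour and one in the 11:00 hour (Tokyo time)
events = pd.DataFrame({
    'time': pd.to_datetime(
        ['2024-06-16 09:05', '2024-06-16 09:40', '2024-06-16 11:15']
    ).tz_localize('Asia/Tokyo'),
    'amount': [100, 200, 150],
})

# Group rows into hourly bins, then count and sum the amounts in each bin
hourly = events.set_index('time').resample('h')['amount'].agg(['count', 'sum'])
print(hourly)
```

The empty 10:00 bin still appears in the result with a count of 0, since resampling produces a regular hourly index between the first and last timestamps.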

Understanding how Pandas handles time zones and timestamps helps you get the most out of its time series features.

Additional Resources

Cornellius Yudha Wijaya is a Data Science Assistant Manager and Data Writer. While working full-time at Allianz Indonesia, he loves sharing Python and data tips through social media and writing. Cornellius writes on a variety of AI and machine learning topics.
