Simplifying Date and Time with Python

Introduction

In this post, we will dive into the date and time modules of Python and Pandas. Handling dates and times is often seen as a boring part of any programming language, and we can usually meet most requirements without delving deep into these modules. But once we understand them structurally, they are not that boring, and they make life much easier when handling a time-series dataset.
We will try to develop a mind map along the way. We will cover:

  • Datetime objects in Python
  • Operations and Arithmetic on Python Datetime object
  • Read DateTime from String and format back to String
  • Datetime objects in Pandas
  • Learning to operate TimeSeries data based on Datetime Index
  • Understanding and applying Delta, Offsets, Timezone

Python Date and Time ecosystem

Blog_Py_time.PNG

We will cover the modules shown above in this post. We will dive deep into Python's datetime module and all the Pandas modules shown. These are enough for all our date and time needs.

Python Internal packages/modules

Time module

time is the first module that we will discuss. You may not need it often, because the datetime module covers most of what this module offers.

Create a Time object

There are 3 ways we can input the information for a time:

  • epoch - Seconds since a reference instant, known as the epoch. Midnight UTC of January 1, 1970, is a popular epoch used on both Unix and Windows platforms.
  • As a tuple - As an alternative to seconds since the epoch, a time instant can be represented by a tuple of nine integers, called a timetuple, for example tm_year=2005, tm_mon=8, tm_mday=7, tm_hour=23, tm_min=21, tm_sec=29, tm_wday=6, tm_yday=219, tm_isdst=0
    This is an intuitive approach since every field has a name and can be read back as an attribute. The approach is common across different modules, but the underlying class has a different name in each. In the time module the class is struct_time.
  • From String - We can also read from strings like '2020-11-18 23:59:59'
    Let's see the functions that are required to achieve the above methods.
import time

tm = time.gmtime(1123456889.5) # epoch --> time.struct_time object (in UTC)
# mktime treats tm as local time, so it returns the original epoch only when the local timezone is UTC
time.mktime(tm) # time.struct_time object --> epoch
time.struct_time((2005, 8, 7, 23, 21, 29, 6, 219, 0)) # Create struct_time explicitly
time.time() # Current time in epoch

# Get the individual attributes
print(tm.tm_year, tm.tm_mon, tm.tm_mday, tm.tm_hour, tm.tm_min, tm.tm_sec)

Code-explanation
We have simply used 3 functions of the time module (gmtime, mktime, and time) along with its struct_time class.
The rest of the code is fairly self-explanatory.
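
One detail is worth a quick check: time.gmtime returns a struct_time in UTC, while time.mktime interprets its argument as local time, so the two are inverses only on a machine whose local timezone is UTC. Below is a minimal sketch of round trips that do not depend on the local timezone. It assumes the standard-library calendar.timegm function, which is not used elsewhere in this post, as the UTC counterpart of mktime.

```python
import calendar
import time

epoch = 1123456889  # same instant as in the snippet above, without the fraction

# UTC round trip: gmtime pairs with calendar.timegm
utc_tm = time.gmtime(epoch)              # epoch --> struct_time in UTC
back_utc = calendar.timegm(utc_tm)       # UTC struct_time --> epoch

# Local round trip: localtime pairs with mktime
local_tm = time.localtime(epoch)         # epoch --> struct_time in local time
back_local = int(time.mktime(local_tm))  # local struct_time --> epoch

print(utc_tm.tm_year, utc_tm.tm_mon, utc_tm.tm_mday)  # 2005 8 7
print(back_utc == epoch, back_local == epoch)         # True True
```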

With the above code snippet, we are equipped to read and save time data. Let's read from a string and format back to a string.

read_time = time.strptime("2018-04-02 23:59:50", '%Y-%m-%d %H:%M:%S')
str_time = time.strftime('%d-%b-%Y %H:%M:%S', read_time)

Code-explanation
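We have two functions at our service: strptime parses a string into a struct_time, and strftime formats a struct_time back into a string.
The meaning of each format code can be checked in the strftime and strptime section of the Python documentation.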
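
To make the format codes a little more concrete, here is a small sketch that parses the same string once and formats it in a few different ways. The variable names are only illustrative, and the output of %p depends on the locale, so no expected value is shown for it.

```python
import time

parsed = time.strptime("2018-04-02 23:59:50", "%Y-%m-%d %H:%M:%S")

iso_date = time.strftime("%Y-%m-%d", parsed)      # year-month-day
clock_12h = time.strftime("%I:%M:%S %p", parsed)  # 12-hour clock with AM/PM marker
day_of_year = time.strftime("%j", parsed)         # zero-padded day of the year

print(iso_date)     # 2018-04-02
print(clock_12h)
print(day_of_year)  # 092
```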

Datetime module

The datetime module covers most everyday uses of the time module and adds many APIs on top of it. So, you can usually skip the time module.
The datetime module has classes for date, time, and datetime. The first two hold a calendar date and a time of day respectively, and the last one combines the two. Hence the datetime class is sufficient for most of our tasks.
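
As a quick illustration of how the three classes relate, the sketch below builds a date and a time separately and joins them with datetime.combine. The alias time_of_day is an assumption made here only to avoid shadowing the time module used earlier.

```python
from datetime import date, datetime, time as time_of_day

d = date(2018, 4, 2)          # calendar date only
t = time_of_day(23, 59, 50)   # time of day only

dt = datetime.combine(d, t)   # one object holding both
print(dt)                     # 2018-04-02 23:59:50

# A datetime can be split back into its two parts
print(dt.date() == d, dt.time() == t)  # True True
```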

Why we need datetime when we have the time module
The high-level reason is that the time module mainly handles time as a float (seconds since the epoch). It is not designed with humans in mind. datetime has the API a human needs to handle date and time. Check this Reddit answer.

Let's check the datetime module with the required code. Be mindful that the class whose objects store the values is named datetime, and that the top-level module is also named datetime.

from datetime import datetime # Both are named datetime

dtm = datetime(2000, 5, 23, hour=0, minute=0, second=0, microsecond=0, tzinfo=None) # From individual fields
dtm = datetime.fromtimestamp(1123456889.5) # epoch --> datetime in local time. Similar to time.localtime

datetime.now() # Current time

# Read from String
datetime.strptime("2018-04-02 23:59:50", '%Y-%m-%d %H:%M:%S') # string--> datetime

# Back to String
d = datetime.strptime("2018-04-02 23:59:50", '%Y-%m-%d %H:%M:%S')
datetime.strftime(d, '%d-%b-%Y %H:%M:%S') # datetime --> string

# Individual attributes of datetime
d.year, d.month, d.day, d.minute, d.second

# Weekday names are not directly available as attributes
d.strftime("%A"), d.strftime("%a")

Code-explanation
The code is quite intuitive. In addition, we now have an option for the timezone (the tzinfo parameter). We will use it later.

Datetime arithmetic and Timedelta class

We now know how to input, format, and print datetime values. So, let's learn how to do arithmetic with datetime.
timedelta is the class used to create and manage the difference between two datetime objects. We can also calculate a future date if the delta is known.
Instances of the timedelta class represent time intervals with three read-only integer attributes: days, seconds, and microseconds.
Let's check the timedelta class with the required code.

from datetime import timedelta, datetime
d1 = datetime.strptime("2018-04-02 23:59:50", '%Y-%m-%d %H:%M:%S')
d2 = datetime.strptime("2019-05-03 23:57:12", '%Y-%m-%d %H:%M:%S')

d2 - d1 # >>> datetime.timedelta(days=395, seconds=86242) # this is timedelta object

delta = timedelta(days=395, seconds=86242) # this is timedelta object
delta.days, delta.seconds # Check attributes

# Add the delta to d1
datetime.strftime(d1 + delta, '%d-%b-%Y %H:%M:%S') # Same as d2
str(d1 + delta) # str() of the resulting datetime: '2019-05-03 23:57:12'

Code-explanation
As mentioned above, a timedelta is stored in only 3 attributes: days, seconds, and microseconds.
The difference of two datetime objects is a timedelta object.
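
Because only those three attributes are stored, whatever units go into the constructor are normalized. The short sketch below shows that normalization and the total_seconds method, which is usually the easier way to read a delta as one number. The values are chosen only for illustration.

```python
from datetime import timedelta

# Inputs can use other units; they are folded into days, seconds, microseconds
delta = timedelta(hours=25, minutes=30)
print(delta.days, delta.seconds, delta.microseconds)  # 1 5400 0
print(delta.total_seconds())                          # 91800.0

# For a negative delta, days carries the sign and seconds stays non-negative
negative = timedelta(seconds=-158)
print(negative.days, negative.seconds)                # -1 86242
print(negative.total_seconds())                       # -158.0
```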

Pytz package

pytz is a third-party package to handle timezone-related manipulations. Timezone handling can be prone to bugs and issues. Here are the words of wisdom from "Python in a Nutshell":

The best way to program around the traps and pitfalls of time zones is to always use the UTC time zone internally, converting from other time zones on input, and to other time zones only for display purposes.

Let's check a quick code snippet to handle timezone with datetime.

!pip install pytz
import pytz 

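# Get the list of commonly used timezones (pytz.all_timezones has the full list)
pytz.common_timezones #1
# Timezones for a particular country # Use the ISO 3166 country code
pytz.country_timezones('IN') # >>> ['Asia/Kolkata'] #2

# Attach the timezone with localize(); passing a pytz timezone to tzinfo= picks up the zone's historical local mean time offset
inp_ny = pytz.timezone('America/New_York').localize(datetime(2021, 11, 11)) # Return datetime with New York time #3
# use the astimezone method of datetime object
out_ind = inp_ny.astimezone(pytz.timezone('Asia/Kolkata')) #4

Code-explanation
#1 - Fetch the list of commonly used timezones
#2 - Fetch the list of all timezones for a country [India here]
#3 - Use the localize method of the pytz timezone to attach it to a naive datetime
#4 - Convert to the desired timezone

When we create a datetime without a tzinfo, it is a naive datetime, i.e. just a datetime without any timezone attached. Once a timezone is attached, it becomes an aware datetime for that timezone. With pytz, attach the timezone through localize rather than through the tzinfo parameter of the constructor, because the constructor route silently uses the zone's local mean time offset (about -4:56 for New York) instead of the offset in force on that date.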
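
The difference between naive and aware datetimes can be seen without pytz. The sketch below assumes the standard library's fixed-offset timezone.utc, which is safe to pass directly as tzinfo, and shows that the two kinds cannot be ordered against each other.

```python
from datetime import datetime, timezone

naive = datetime(2021, 11, 11)                       # no timezone attached
aware = datetime(2021, 11, 11, tzinfo=timezone.utc)  # pinned to UTC

print(naive.tzinfo)       # None
print(aware.utcoffset())  # 0:00:00

# Ordering a naive datetime against an aware one raises TypeError
try:
    naive < aware
    mixed_comparison_failed = False
except TypeError as exc:
    mixed_comparison_failed = True
    print("cannot compare:", exc)
```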
Let's do a small exercise and create two datetime with the same values but pinned to different timezones. Then calculate the timedelta of the two.

# localize() makes pytz apply the correct UTC offset for that date
time_1 = pytz.timezone('America/New_York').localize(datetime(2021, 11, 11))
time_2 = pytz.timezone('Pacific/Auckland').localize(datetime(2021, 11, 11))

time_1 - time_2 # >>> datetime.timedelta(seconds=64800) | Equivalent to 18 hours
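
The advice quoted from "Python in a Nutshell" can be sketched in a few lines: convert to UTC on input, do arithmetic in UTC, and convert back only for display. This is a minimal sketch, not a definitive pattern. The helper names to_utc and for_display are hypothetical, and IST is modeled as a fixed +05:30 offset from the standard library, which is an assumption that holds because India does not observe daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-in for India Standard Time (assumption: no DST to handle)
IST = timezone(timedelta(hours=5, minutes=30), "IST")

def to_utc(local_dt):
    """Convert an aware datetime to UTC for internal use (hypothetical helper)."""
    return local_dt.astimezone(timezone.utc)

def for_display(utc_dt, tz):
    """Convert an internal UTC datetime to a display timezone (hypothetical helper)."""
    return utc_dt.astimezone(tz)

user_input = datetime(2021, 11, 11, 9, 0, tzinfo=IST)  # input arrives in local time
stored = to_utc(user_input)                            # kept internally in UTC
later = stored + timedelta(hours=12)                   # arithmetic happens in UTC
shown = for_display(later, IST)                        # converted only for display

print(stored)  # 2021-11-11 03:30:00+00:00
print(shown)   # 2021-11-11 21:00:00+05:30
```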

Conclusion

That is all for this post. If you keep these few snippets in mind, datetime will never haunt you. We will continue in a follow-up post on the Pandas library. That post will focus not just on the core Pandas datetime objects but also on time-series data.

You may also try:

  • The dateutil module - a third-party package that offers modules to manipulate dates.
  • The calendar module - a standard-library module that supplies calendar-related functions.
  • The arrow library - It offers a sensible and human-friendly approach to creating, manipulating, formatting and converting dates, times and timestamps.