Don’t predict too far into the future.
I was exploring the pandas docs while preparing a pandas course for calmcode.io when I stumbled on an interesting fact: there are bounds for timestamps in pandas.
To quote the docs:
Since pandas represents timestamps in nanosecond resolution, the time span that can be represented using a 64-bit integer is limited to approximately 584 years:
import pandas as pd

pd.Timestamp.min  # Timestamp('1677-09-21 00:12:43.145224193')
pd.Timestamp.max  # Timestamp('2262-04-11 23:47:16.854775807')
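To see where the "approximately 584 years" comes from, here is a rough back-of-the-envelope check. It assumes an average year of 365.25 days, which is my simplification rather than anything pandas does internally, and the names `NS_PER_YEAR` and `span_in_years` are made up for this sketch.

```python
import pandas as pd

# Nanoseconds in an average year of 365.25 days (an assumption for this estimate).
NS_PER_YEAR = 365.25 * 24 * 60 * 60 * 1e9

# A 64-bit integer has 2**64 distinct values, one per nanosecond tick.
span_in_years = 2**64 / NS_PER_YEAR
print(round(span_in_years, 1))

# The upper bound is the largest signed 64-bit integer,
# counted in nanoseconds since 1970-01-01.
print(pd.Timestamp.max.value == 2**63 - 1)
```

The estimate lands between 584 and 585 years, which matches the gap between the two bounds above.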
It makes sense when you consider that pandas handles nanoseconds and there's only so much information that you can store in a 64-bit integer. If you have a use case outside of this span of time, pandas does have a trick up its sleeve: you can create a date-like Period that could work as a datetime instead.
Here's how to generate periods.
span = pd.period_range("1215-01-01", "1381-01-01", freq="D")
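As a small sketch of what you get back, the snippet below builds the same range and pokes at it. The variable names (`span`, `first`, `last`) are only for illustration; the point is that the result behaves like a regular date-like index even though every value sits outside the timestamp bounds.

```python
import pandas as pd

# Daily periods from 1215 to 1381, well before pd.Timestamp.min.
span = pd.period_range("1215-01-01", "1381-01-01", freq="D")

first, last = span[0], span[-1]
print(span.dtype)              # the index carries a period dtype with daily frequency
print(first.year, last.year)   # calendar fields work on each Period
print(len(span))               # number of days in the range, both ends included
```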
You can also cast dates manually, as an alternative to pd.to_datetime, if you like.
s = pd.Series(['1111-01-01', '1212-12-12'])

def convert(item):
    year = int(item[:4])
    month = int(item[5:7])
    day = int(item[8:10])
    return pd.Period(year=year, month=month, day=day, freq="D")

s.apply(convert)
# 0 1111-01-01
# 1 1212-12-12
# dtype: period[D]
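If the strings are already in ISO format, you may not need to slice them by hand. The sketch below is an alternative I'd try, not the approach above: it assumes pd.PeriodIndex can parse these strings directly, and the names `raw`, `periods`, `years` and `gap_in_days` are made up for the example.

```python
import pandas as pd

raw = pd.Series(["1111-01-01", "1212-12-12"])

# Let pandas parse the ISO-formatted strings straight into daily periods.
periods = pd.Series(pd.PeriodIndex(raw.to_numpy(), freq="D"), index=raw.index)

# Calendar fields are available through the .dt accessor.
years = periods.dt.year

# Subtracting two periods gives an offset; .n is the number of days between them.
gap_in_days = (periods[1] - periods[0]).n
print(years.tolist(), gap_in_days)
```

The manual convert function is still handy when the input isn't in a format pandas can parse on its own.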