Extract the Year from a Date in Python

In a previous post, we explained how to convert a string to a date object in Python. You often do this because you need to generate summary statistics, by year for instance. To accomplish that, you need to be able to extract the year from the date object.

In this tutorial, we’ll explain how.

Use the year attribute from the date object

When you create a date object (from scratch or by converting a string), a number of attributes are associated with it. You can list them using the following code:

from datetime import datetime

# create an object for today
today = datetime.today()

# list attributes and methods, skipping dunder names
print([element for element in dir(today) if '__' not in element])

Output:

['astimezone', 'combine', 'ctime', 'date', 'day', 'dst', 'fold', 'fromisocalendar', 'fromisoformat', 'fromordinal', 'fromtimestamp', 'hour', 'isocalendar', 'isoformat', 'isoweekday', 'max', 'microsecond', 'min', 'minute', 'month', 'now', 'replace', 'resolution', 'second', 'strftime', 'strptime', 'time', 'timestamp', 'timetuple', 'timetz', 'today', 'toordinal', 'tzinfo', 'tzname', 'utcfromtimestamp', 'utcnow', 'utcoffset', 'utctimetuple', 'weekday', 'year']

This list mixes methods and attributes. One of the attributes, year, lets us extract the year from a date object. The code is simple:

from datetime import datetime

# create an object for today
today = datetime.today()

# get year
year = today.year  # e.g. 2022

Other date attributes such as day, month or minute can be accessed using the same logic.
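
As a minimal sketch of the same logic, the snippet below builds a datetime from a string with strptime() and reads several attributes from it. The sample string and its day/month/year layout are assumptions chosen for illustration.

```python
from datetime import datetime

# hypothetical sample value, assumed to be in day/month/year order
parsed = datetime.strptime("19/12/2000 08:30", "%d/%m/%Y %H:%M")

# attributes are read without parentheses
print(parsed.year)    # 2000
print(parsed.month)   # 12
print(parsed.day)     # 19
print(parsed.minute)  # 30
```

Because these are attributes rather than methods, writing parsed.year() would raise a TypeError.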

Use the dt.year attribute if you’re working with Pandas

When you work with large datasets loaded into a Pandas DataFrame, you can use similar code to extract this information at scale. Let’s imagine that you load a CSV file containing the following information and that you want to add an extra column containing the year.

date         invoices
1/10/2021    525
2/10/2022    136
4/09/2010    125
7/07/2005    84
19/12/2000   4469

First, load the dataset into a Pandas DataFrame using the read_csv() function.

import pandas as pd

df = pd.read_csv('file.csv')

By default, read_csv() does not convert date columns to datetime objects, so we need to convert the column explicitly.

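# the sample dates are day-first (e.g. 19/12/2000), so state the format explicitly
df['date'] = pd.to_datetime(df['date'], format='%d/%m/%Y')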
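
As an alternative sketch, the conversion can also happen while the file is read, using the parse_dates and dayfirst parameters of read_csv(). The in-memory CSV text below is a hypothetical stand-in for file.csv, built from two rows of the sample table and assuming a comma-separated layout.

```python
import io

import pandas as pd

# hypothetical stand-in for file.csv (comma-separated layout is an assumption)
csv_text = "date,invoices\n1/10/2021,525\n19/12/2000,4469\n"

# parse_dates converts the column while reading;
# dayfirst=True reads 1/10/2021 as 1 October rather than 10 January
df_parsed = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"], dayfirst=True)

print(df_parsed["date"].dt.year.tolist())
```

This saves the separate to_datetime() call, at the cost of a little less control over how malformed values are handled.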

Finally, we can create the extra column we needed:

df['year'] = df['date'].dt.year

Output:

date         invoices   year
1/10/2021    525        2021
2/10/2022    136        2022
4/09/2010    125        2010
7/07/2005    84         2005
19/12/2000   4469       2000

Simple and efficient!
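
For reference, the steps of this section can be combined into one runnable sketch that ends with the kind of per-year summary mentioned in the introduction. The CSV text is an in-memory copy of the sample table (the comma-separated layout is an assumption), and the name invoices_per_year is introduced here for illustration only.

```python
import io

import pandas as pd

# in-memory copy of the sample table (comma-separated layout is an assumption)
csv_text = """date,invoices
1/10/2021,525
2/10/2022,136
4/09/2010,125
7/07/2005,84
19/12/2000,4469
"""

df = pd.read_csv(io.StringIO(csv_text))

# the sample dates are day-first, so the format is stated explicitly
df["date"] = pd.to_datetime(df["date"], format="%d/%m/%Y")

# extract the year into its own column
df["year"] = df["date"].dt.year

# total invoices per year
invoices_per_year = df.groupby("year")["invoices"].sum()
print(invoices_per_year)
```

With one row per year in this sample, each total equals the original invoice count; on a real dataset the same groupby would aggregate all rows that share a year.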
