How Can I Take The First Four Elements Of A Column In Each Row and Append it To A Newly Created Column Using Python Pandas?

I'm working on a project to compute the average stock price for each year, but I'm currently stuck. I have a CSV file with two columns: Date (YYYY-MM-DD) and High. I want to create a third column called 'Year' that holds, for every row, just the year taken from the Date column.

Here is my initial table:

[image: initial table with the Date and High columns]

Here is my desired output table:

[image: desired table with the Date, High, and Year columns]

Note: I just know how to add a column but I am not sure how to index the date of each row and append it to the 'Year' column for each row. So for example, for the row with the date '1980-12-12', I want the year column to have just '1980', for the row with the date '1980-12-18', I want the year column to have just '1980', etc.

Here is my code currently:

import pandas as pd
appleStock = pd.read_csv("Apple_stock_history.csv")
for i in appleStock["Date"]:
  appleStock["Year"] = i[0:4]
print(appleStock.head())

My output for the code is:

[image: output of appleStock.head(), with the same value in every row of the Year column]

I figured out that my code is pretty inconsistent; basically, there are more rows in the original CSV file... The last row has a date of '2022-01-03' (which probably explains why I am getting that in my Year column every time). In line 4 of my code, when I change it to appleStock["Year"] = i[0:], it gives me the entire date (2022-01-03).

If your df['Date'] column holds strings, like this:

df = pd.DataFrame({
    'Date' : ['1980-12-12','1981-12-12'],
    'High' : [0.1, 0.2]
    })

print(df['Date'][0],type(df['Date'][0]))
1980-12-12 <class 'str'>

You can try this:

df['year'] = df['Date'].str[0:4]
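
To show why the loop in the question ends up with the same year in every row, and how the string slice fits the stated goal of a yearly average, here is a minimal sketch. The sample values are made up for illustration, and the names looped, YearInt, and yearly_avg are hypothetical; the pd.to_datetime route is an alternative that was not part of the original answer.

```python
import pandas as pd

# Small stand-in for Apple_stock_history.csv (values are made up).
df = pd.DataFrame({
    "Date": ["1980-12-12", "1980-12-18", "1981-01-02", "2022-01-03"],
    "High": [0.1, 0.3, 0.2, 1.0],
})

# Assigning a scalar to a column fills every row with that scalar,
# so after the loop only the last date's year is left.
looped = df.copy()
for d in looped["Date"]:
    looped["Year"] = d[0:4]

# Vectorized string slice: one year per row, kept as strings.
df["Year"] = df["Date"].str[0:4]

# Alternative: parse the dates and take the year as an integer.
df["YearInt"] = pd.to_datetime(df["Date"]).dt.year

# Average of High per year.
yearly_avg = df.groupby("Year")["High"].mean()
print(yearly_avg)
```

If the Date column may contain malformed entries, the parsed-date version is probably the safer choice, since pd.to_datetime raises on values it cannot parse, while the string slice silently keeps whatever the first four characters are.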
