Airflow s3Hook - read files in s3 with pandas read_csv
Solution 1:
The format you are looking for is the following:
filepath = f"s3://{bucket_name}/{key}"
So in your specific case, something like:
for file in keys:
    filepath = f"s3://{s3_bucket}/{file}"
    df = pd.read_csv(filepath, sep='\t', skiprows=1, header=None)
Just make sure you have s3fs installed (pip install s3fs), since pandas uses it under the hood to open s3:// URLs.
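Putting it together, here is a minimal self-contained sketch of the path-building step. The bucket name and key list are hypothetical placeholders; the actual read_csv call is left commented out because it needs s3fs plus valid AWS credentials at runtime:

```python
# Build "s3://bucket/key" URLs that pandas.read_csv can open directly.
# "my-bucket" and the keys below are made-up examples, not real objects.
bucket_name = "my-bucket"
keys = ["exports/2023/file1.tsv", "exports/2023/file2.tsv"]

filepaths = [f"s3://{bucket_name}/{key}" for key in keys]
print(filepaths[0])

# With s3fs installed and credentials configured, each path can be read as:
# import pandas as pd
# df = pd.read_csv(filepaths[0], sep="\t", skiprows=1, header=None)
```

If the keys come from Airflow itself, `S3Hook.list_keys(bucket_name, prefix=...)` from the Amazon provider returns exactly such a list of keys to feed into this loop.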