Solution 1:

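import boto3
import pandas as pd
from sagemaker import get_execution_role

role = get_execution_role()
bucket = 'my-bucket'  # placeholder: replace with your own bucket name
data_key = 'train.csv'
data_location = 's3://{}/{}'.format(bucket, data_key)

# pandas can read s3:// URLs directly (this relies on the s3fs package)
df = pd.read_csv(data_location)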


Solution 2:

In the simplest case you don't need boto3 at all, because you are only reading data.
Then it's even simpler:

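import pandas as pd

bucket = 'my-bucket'  # placeholder: replace with your own bucket name
data_key = 'train.csv'
data_location = 's3://{}/{}'.format(bucket, data_key)

# pandas can read s3:// URLs directly (this relies on the s3fs package)
df = pd.read_csv(data_location)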


But as Prateek stated, make sure your SageMaker notebook instance is configured to have access to S3. This is done at the configuration step, under Permissions > IAM role.
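
For illustration, the access mentioned above typically comes down to an IAM policy attached to the notebook's execution role. The following is only a minimal sketch, assuming a hypothetical bucket called `bucket-name` and read-only access; `s3:ListBucket` and `s3:GetObject` are standard S3 actions, but your setup may well need more (for example `s3:PutObject` if you write results back).

```python
import json

# Hypothetical bucket name (an assumption for this sketch); replace with your own.
BUCKET = 'bucket-name'

# Minimal read-only policy sketch for the notebook's execution role.
read_only_policy = {
    'Version': '2012-10-17',
    'Statement': [
        {
            # Listing is granted on the bucket itself.
            'Effect': 'Allow',
            'Action': ['s3:ListBucket'],
            'Resource': ['arn:aws:s3:::{}'.format(BUCKET)],
        },
        {
            # Reading is granted on the objects inside the bucket.
            'Effect': 'Allow',
            'Action': ['s3:GetObject'],
            'Resource': ['arn:aws:s3:::{}/*'.format(BUCKET)],
        },
    ],
}

# Serialize to the JSON form that the IAM console expects.
policy_json = json.dumps(read_only_policy, indent=2)
print(policy_json)
```

Note the two different resources: listing applies to the bucket ARN, while reading applies to the objects under it (`/*`). Mixing them up is a common cause of "Access Denied" errors.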

Solution 3:

If you have a look here, it seems you can specify this in the InputDataConfig. Search for "S3DataSource" (ref) in the document. The first hit is even in Python, on page 25/26.
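
To make that more concrete, here is a rough sketch of what an InputDataConfig entry with an S3DataSource can look like when written as a Python structure. The bucket name, prefix, channel name and content type are assumptions made up for this example; the field names follow the SageMaker CreateTrainingJob request format, so please check them against the documentation for your version.

```python
# Hypothetical bucket and prefix (assumptions for this sketch); replace with your own.
bucket = 'bucket-name'
prefix = 'data/train'

# One input channel that points the training job at a prefix in S3.
input_data_config = [
    {
        'ChannelName': 'train',
        'DataSource': {
            'S3DataSource': {
                # Use every object found under the given prefix.
                'S3DataType': 'S3Prefix',
                'S3Uri': 's3://{}/{}'.format(bucket, prefix),
                # Copy the full dataset to each training instance.
                'S3DataDistributionType': 'FullyReplicated',
            }
        },
        'ContentType': 'text/csv',
    }
]

print(input_data_config[0]['DataSource']['S3DataSource']['S3Uri'])
```

A list like this would be passed as the `InputDataConfig` argument when creating a training job, for instance through the boto3 SageMaker client's `create_training_job` call, alongside the other required arguments.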

Solution 4:

You could also access your bucket as if it were a file system, using s3fs:

import s3fs
fs = s3fs.S3FileSystem()

# List the first 5 files in your accessible bucket
fs.ls('s3://bucket-name/data/')[:5]

# open it directly
with fs.open('s3://bucket-name/data/image.png') as f:
    data = f.read()  # for example, read the raw bytes of the object