Ray Dataset: ArrowInvalid: Unrecognized filesystem type in URI: gs://
PyArrow (as of 6.0.1, the latest release at the moment) has no native GCS filesystem and does not recognize gs:// URIs. You can instead create a GCS filesystem through an fsspec adapter such as gcsfs and pass it in explicitly. I think you can use this in Ray with:
import gcsfs
import ray

# fsspec-compatible GCS filesystem; substitute your own project name
fs = gcsfs.GCSFileSystem(project='my-google-project')
ds = ray.data.read_parquet("path", filesystem=fs)
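If your paths are already written as gs:// URIs, you may want to hand the explicit filesystem a plain bucket/key path instead. Whether the scheme actually has to be removed depends on the library versions involved, so treat this as a conservative sketch. The helper name strip_gs_scheme is hypothetical and not part of Ray, PyArrow, or gcsfs:

```python
from urllib.parse import urlparse


def strip_gs_scheme(uri: str) -> str:
    """Return 'bucket/key' for a 'gs://bucket/key' URI (hypothetical helper).

    Paths that carry no scheme are returned unchanged; any other scheme
    is rejected so a wrong URI does not reach the GCS filesystem silently.
    """
    parsed = urlparse(uri)
    if parsed.scheme == "gs":
        # netloc holds the bucket name, path holds '/key/inside/bucket'
        return parsed.netloc + parsed.path
    if parsed.scheme == "":
        return uri
    raise ValueError(f"expected a gs:// URI or a bare path, got: {uri!r}")
```

The result can then be passed as the path argument to ray.data.read_parquet together with filesystem=fs.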
PyArrow 7.0.0 (which will likely be released in the next month) should include some support for a native GCS filesystem. I'm not entirely clear on how much will make it into 7.0.0, but it is actively being worked on, so check the release notes. Support for gs:// URIs does not yet appear to have been implemented.