Difference Between Amazon S3 Select and Amazon Redshift Spectrum
Solution 1:
S3 Select is focused on retrieving data from S3 using SQL:
S3 Select enables applications to retrieve only a subset of data from an object by using simple SQL expressions. By using S3 Select to retrieve only the data needed by your application, you can achieve drastic performance increases – in many cases you can get as much as a 400% improvement compared with classic S3 retrieval.
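To make the "subset of an object" idea concrete, here is a minimal sketch using the boto3 S3 client's `select_object_content` call. It assumes a CSV object with a header row; the bucket, key, and column names in the usage comment are hypothetical, and the client is passed in so the function has no hard dependency on credentials.

```python
def build_select_request(bucket, key, expression):
    """Assemble keyword arguments for the S3 client's select_object_content call."""
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": expression,
        # Assumption: the object is a CSV file whose first line is a header row.
        "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
        "OutputSerialization": {"CSV": {}},
    }


def select_rows(s3_client, bucket, key, expression):
    """Run the query and return the matching rows as one decoded string."""
    response = s3_client.select_object_content(
        **build_select_request(bucket, key, expression)
    )
    chunks = []
    # The payload is an event stream; only "Records" events carry row data.
    for event in response["Payload"]:
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
    return b"".join(chunks).decode("utf-8")


# Usage (hypothetical names; needs boto3 and AWS credentials):
#   import boto3
#   rows = select_rows(
#       boto3.client("s3"), "example-bucket", "data/orders.csv",
#       "SELECT s.order_id, s.total FROM S3Object s WHERE s.country = 'DE'",
#   )
```

Only the rows and columns matched by the expression travel over the network, which is where the speedup over downloading the whole object comes from.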
Redshift Spectrum enables querying S3 data directly from your Amazon Redshift cluster:
Amazon Redshift Spectrum enables you to run Amazon Redshift SQL queries against exabytes of data in Amazon S3. With Redshift Spectrum, you can extend the analytic power of Amazon Redshift beyond data stored on local disks in your data warehouse to query vast amounts of unstructured data in your Amazon S3 “data lake”.
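In practice, Spectrum works by mapping an external schema and external tables onto files in S3, then querying them with ordinary Redshift SQL. The sketch below only generates the statements; the `sales` table, its columns, and the comma-delimited text layout are assumptions for illustration, not something from the quoted description.

```python
def spectrum_statements(schema, catalog_db, iam_role_arn, s3_location):
    """Return SQL that exposes an S3 prefix to Redshift and then queries it."""
    return [
        # Map a Redshift schema onto a database in the external data catalog.
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {schema} "
        f"FROM DATA CATALOG DATABASE '{catalog_db}' "
        f"IAM_ROLE '{iam_role_arn}' "
        "CREATE EXTERNAL DATABASE IF NOT EXISTS",
        # Describe the files (hypothetical columns); the data itself stays in S3.
        f"CREATE EXTERNAL TABLE {schema}.sales ("
        "order_id INT, country VARCHAR(2), total DECIMAL(10,2)) "
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
        f"STORED AS TEXTFILE LOCATION '{s3_location}'",
        # Query it like any other Redshift table.
        f"SELECT country, SUM(total) FROM {schema}.sales GROUP BY country",
    ]


if __name__ == "__main__":
    for sql in spectrum_statements(
        "spectrum",
        "lake_db",
        "arn:aws:iam::123456789012:role/ExampleSpectrumRole",  # placeholder ARN
        "s3://example-bucket/sales/",
    ):
        print(sql + ";")
```

Run the statements in order, one at a time, from any SQL client connected to the cluster. Unlike S3 Select, the external table can also be joined against tables stored locally in Redshift.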
Athena is a serverless, interactive query service for analyzing data in S3 with standard SQL, without a separate ETL step, and it integrates well with AWS Glue:
Athena is easy to use. Simply point to your data in Amazon S3, define the schema, and start querying using standard SQL. Most results are delivered within seconds. With Athena, there’s no need for complex ETL jobs to prepare your data for analysis. This makes it easy for anyone with SQL skills to quickly analyze large-scale datasets.
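The "point, define the schema, and query" flow looks roughly like this with the boto3 Athena client. This is a sketch under assumptions: the table is already defined (for example in the Glue Data Catalog), the database name and output location are yours to supply, and only the first page of results is read.

```python
import time


def run_athena_query(athena_client, sql, database, output_location, poll_seconds=1.0):
    """Start a query, wait for it to finish, and return the first page of rows."""
    started = athena_client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Athena runs queries asynchronously, so poll until a terminal state.
    while True:
        execution = athena_client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)
    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query {query_id} ended in state {state}")

    results = athena_client.get_query_results(QueryExecutionId=query_id)
    # Each row is a list of cells; NULL cells have no "VarCharValue" key.
    return [
        [cell.get("VarCharValue") for cell in row["Data"]]
        for row in results["ResultSet"]["Rows"]
    ]


# Usage (hypothetical names; needs boto3 and AWS credentials):
#   import boto3
#   rows = run_athena_query(
#       boto3.client("athena"),
#       "SELECT country, COUNT(*) FROM sales GROUP BY country",
#       "lake_db", "s3://example-bucket/athena-results/",
#   )
```

For a SELECT, the first returned row holds the column names. There is no cluster to manage here, which is the main practical difference from Redshift Spectrum.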
References: Athena, Spectrum and S3 Select