Question 15

Question

A company stores time-series data about user clicks in an Amazon S3 bucket. The raw data consists of
millions of rows of user activity every day. ML engineers access the data to develop their ML models.
The ML engineers need to generate daily reports and analyze click trends over the past 3 days by
using Amazon Athena. The company must retain the data for 30 days before archiving the data.
Which solution will provide the HIGHEST performance for data retrieval?

Accepted Answer

Organize the time-series data into partitions by date prefix in the S3 bucket. Apply S3 Lifecycle
policies to archive partitions that are older than 30 days to S3 Glacier Flexible Retrieval.
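A minimal sketch of what the accepted answer describes, in Python. The `clicks` prefix, the `dt=` partition key name, the rule ID, and both helper functions are assumptions made for illustration; they are not part of the question. The rule dictionary follows the shape that boto3's `put_bucket_lifecycle_configuration` expects inside `LifecycleConfiguration={"Rules": [...]}`, where `GLACIER` is the storage class name for S3 Glacier Flexible Retrieval.

```python
from datetime import date


def partition_key(prefix: str, day: date, filename: str) -> str:
    """Build a date-partitioned S3 object key (Hive-style dt=YYYY-MM-DD)."""
    return f"{prefix}/dt={day.isoformat()}/{filename}"


def archive_rule(prefix: str, days: int = 30) -> dict:
    """Build one S3 Lifecycle rule that transitions objects under the
    prefix to S3 Glacier Flexible Retrieval once they are `days` old."""
    return {
        "ID": "archive-clicks-after-30-days",  # hypothetical rule name
        "Status": "Enabled",
        "Filter": {"Prefix": f"{prefix}/"},
        "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
    }


if __name__ == "__main__":
    print(partition_key("clicks", date(2024, 1, 5), "part-0.parquet"))
    print(archive_rule("clicks"))
```

The lifecycle transition is driven by object age, so each day's partition is archived about 30 days after it was written without any per-partition bookkeeping.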

Casey · Answer

Option C

Sanjay K. · Answer

Option C, but if Athena ever changed how it handled non-partitioned buckets, that would flip this. Otherwise, partitioning by date still wins.

Morgan · Answer

C, pretty sure that's what Athena is optimized for. Partitioning by date prefix lets you scan just what you need, so queries are way faster than hitting everything. Splitting the data into separate buckets like D isn't really how Athena likes it. Open to debate if anyone has a different thought though.
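To make the "scan just what you need" point concrete, here is a rough Python sketch that builds a query restricted to the last 3 daily partitions. The table name `clicks`, the partition column `dt`, and both helper functions are hypothetical, and whether "past 3 days" includes the current day is an assumption made here.

```python
from datetime import date, timedelta


def last_n_days(today: date, n: int = 3) -> list[str]:
    """Return ISO dates for the n most recent days, newest first
    (assumes the current day counts as one of them)."""
    return [(today - timedelta(days=i)).isoformat() for i in range(n)]


def trend_query(table: str, today: date, n: int = 3) -> str:
    """Build a query that filters on the partition column, so Athena
    can prune every partition outside the listed dates."""
    days = ", ".join(f"'{d}'" for d in last_n_days(today, n))
    return (
        f"SELECT dt, COUNT(*) AS clicks FROM {table} "
        f"WHERE dt IN ({days}) GROUP BY dt ORDER BY dt"
    )


if __name__ == "__main__":
    print(trend_query("clicks", date(2024, 1, 5)))
```

Because the filter is on the partition column, only three date prefixes need to be read instead of the full 30 days of retained data.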

Hannah M. · Answer

C, partitioning by date prefix is key for Athena speed. D is tempting, but separate buckets don't help query performance here.

Owen R. · Answer

B or D. Both work, but D looks simpler for archiving. I might be missing something here.
