Question 2

Question

[Data Engineering]
A retail company is ingesting purchasing records from its network of 20,000 stores to Amazon S3 by
using Amazon Kinesis Data Firehose. The company uses a small, server-based application in each
store to send the data to AWS over the internet. The company uses this data to train a machine
learning model that is retrained each day. The company's data science team has identified existing
attributes on these records that could be combined to create an improved model.
Which change will create the required transformed records with the LEAST operational overhead?

Accepted Answer

Create an AWS Lambda function that can transform the incoming records. Enable data
transformation on the ingestion Kinesis Data Firehose delivery stream. Use the Lambda function as
the invocation target.
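
For anyone who hasn't written one of these, here is a minimal sketch of what such a transformation function could look like. It follows the Firehose data-transformation contract (each incoming record carries a `recordId` and base64-encoded `data`, and each returned record needs `recordId`, `result`, and `data`). The attribute names `quantity`, `unit_price`, and `total_amount`, and the helper `combine_attributes`, are assumptions for illustration only; the question doesn't say which attributes the data science team wants combined, or that the records are JSON.

```python
import base64
import json


def combine_attributes(record: dict) -> dict:
    """Derive a new feature from existing attributes.

    The field names here are hypothetical; substitute the attributes
    the data science team actually identified.
    """
    enriched = dict(record)
    enriched["total_amount"] = record["quantity"] * record["unit_price"]
    return enriched


def lambda_handler(event, context):
    """Transform each Firehose record and return it in the expected shape."""
    output = []
    for rec in event["records"]:
        try:
            # Firehose delivers each payload base64-encoded (assumed JSON here).
            payload = json.loads(base64.b64decode(rec["data"]))
            transformed = combine_attributes(payload)
            # Newline-delimit the output so records stay separable in S3.
            encoded = base64.b64encode(
                (json.dumps(transformed) + "\n").encode("utf-8")
            ).decode("utf-8")
            output.append(
                {"recordId": rec["recordId"], "result": "Ok", "data": encoded}
            )
        except (ValueError, KeyError, TypeError):
            # Hand the original data back marked as failed so Firehose
            # can route it to its error output instead of dropping it.
            output.append(
                {
                    "recordId": rec["recordId"],
                    "result": "ProcessingFailed",
                    "data": rec["data"],
                }
            )
    return {"records": output}
```

The stores keep sending to the same delivery stream, so nothing changes on the 20,000 store applications; only the stream's transformation setting and this one function need maintaining.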

Piya · Answer

Going with A here too. Using Lambda for transformation within Firehose avoids managing infrastructure, and AWS handles the scaling. The other choices involve running clusters or EC2, which is more to maintain. Pretty sure this is the least ops work, correct me if I missed something!

Liam · Answer

A. Using Lambda for transformation with Firehose means no server management, automatic scaling, and it's built right into the delivery stream. The other options add far more operational work. As long as the transformation fits in a Lambda function, this is the easiest path. Tell me if I'm missing any curveballs here.

Riley M. · Answer

C/D? Both need more ops than A, and I can't tell whether option D does anything the question asks for.

Quinn L. · Answer

It's A.

Anita · Answer

It's A. EMR in B is tempting, but it's overkill for just transforming Firehose records and needs far more management than Lambda.
