Q: 5
A financial company receives a high volume of real-time market data streams from an external
provider. The streams consist of thousands of JSON records every second.
The company needs to implement a scalable solution on AWS to identify anomalous data points.
Which solution will meet these requirements with the LEAST operational overhead?
Options
Discussion
Makes sense to pick A here, since Flink has built-in anomaly detection and keeps operational work minimal. Anyone disagree with that approach?
C/D? Both use Lambda, but D is set up for batch, not real-time. Since the question wants real-time anomaly detection with low ops, I’m pretty sure A is right over these. Anyone seeing something I missed in B or C?
A. The built-in Random Cut Forest in Flink is the lowest-ops option here.
Why not B? It needs Lambda and SageMaker, so more ops than A.
It's D if you batch, but since you need real-time and as little ops work as possible, option A is built for this. The built-in RANDOM_CUT_FOREST in Flink means no custom ML or extra infra. Pretty sure that's what AWS wants here.
I don’t think it’s B. A is using Flink’s built-in RANDOM_CUT_FOREST, so no extra model training or Lambda wiring needed. B adds more moving parts and higher ops with SageMaker and Lambda, which isn’t as streamlined. Unless you need custom logic, A should be best here, but happy if someone sees a gap.
Probably A, but if they needed a custom ML model or more flexible logic, B would edge it.
It's B for me. Had something like this in a mock and went with a SageMaker endpoint plus Lambda, since it's real-time and feels more flexible than using Apache Flink's built-in function. Kinesis feeds into Lambda, which can call the SageMaker model for anomaly detection. Pretty sure this satisfies scalability, though maybe with not quite as little ops as A. Agree?
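For anyone weighing B, here is a rough sketch of the Lambda glue it implies: decode the Kinesis records, then call a SageMaker endpoint once per record. This is only an illustration under assumptions. The endpoint name `anomaly-endpoint`, the JSON request and response format, and the helper names are all hypothetical and do not come from the question. The `boto3` `sagemaker-runtime` client and its `invoke_endpoint` call are real, but the client is passed in as a parameter so the logic can be exercised without AWS access.

```python
import base64
import json


def decode_kinesis_records(event):
    """Return the JSON payloads carried by a Kinesis-triggered Lambda event."""
    payloads = []
    for record in event.get("Records", []):
        # Kinesis delivers each record's data to Lambda base64-encoded.
        raw = base64.b64decode(record["kinesis"]["data"])
        payloads.append(json.loads(raw))
    return payloads


def score_records(payloads, runtime_client, endpoint_name):
    """Send each payload to the endpoint and collect the parsed responses."""
    results = []
    for payload in payloads:
        response = runtime_client.invoke_endpoint(
            EndpointName=endpoint_name,
            ContentType="application/json",
            Body=json.dumps(payload),
        )
        # Assumption: the (hypothetical) model replies with a JSON document.
        results.append(json.loads(response["Body"].read()))
    return results


def handler(event, context, runtime_client=None,
            endpoint_name="anomaly-endpoint"):  # hypothetical endpoint name
    """Lambda entry point: decode the batch, then score every record."""
    if runtime_client is None:
        # Imported lazily so the module loads even where boto3 is absent.
        import boto3
        runtime_client = boto3.client("sagemaker-runtime")
    payloads = decode_kinesis_records(event)
    return score_records(payloads, runtime_client, endpoint_name)
```

Even in this stripped-down form you still own the function code, the event source mapping, the endpoint, and the model behind it, which is the extra operational surface the A voters are pointing at.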
I'm thinking B this time. SageMaker with Lambda seems almost as streamlined as Flink for anomaly detection, with less vendor lock-in than A.
A imo. Automating this with Flink's built-in Random Cut Forest on a managed service is less ops than setting up SageMaker/Lambda or EC2. B adds too much maintenance risk in practice, I think. Disagree?