Q: 8
You are designing a data processing pipeline. The pipeline must be able to scale automatically as load
increases. Messages must be processed at least once, and must be ordered within windows of 1
hour. How should you design the solution?
Options
Discussion
Option D is the way to go. Pub/Sub plus Dataflow are both cloud-native, can autoscale automatically, and Dataflow supports windowed ordering for that 1-hour requirement. B is tempting if you like Kafka, but it isn't as integrated or fully managed in GCP as Pub/Sub. Anyone see a use case where B would actually be better?
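For anyone unsure what "ordered within 1-hour windows" actually means: Dataflow (Apache Beam) assigns each message to a fixed window based on its event timestamp, and ordering is then only needed inside each window, not globally. Here's a rough stdlib-only Python sketch of that idea (this is illustrative, not real Beam code; the function names and sample timestamps are made up):

```python
from collections import defaultdict

WINDOW_SECONDS = 3600  # 1-hour fixed windows, like Beam's FixedWindows(3600)

def window_start(event_ts: int) -> int:
    """Return the start of the 1-hour window this event timestamp falls into."""
    return event_ts - (event_ts % WINDOW_SECONDS)

def group_into_windows(messages):
    """Group (timestamp, payload) pairs by 1-hour window, sorted within each window.

    Mirrors the idea of windowed ordering: no global order is enforced,
    only order inside each fixed window.
    """
    windows = defaultdict(list)
    for ts, payload in messages:
        windows[window_start(ts)].append((ts, payload))
    return {w: sorted(msgs) for w, msgs in windows.items()}

# Messages arriving out of order across two hourly windows
msgs = [(3700, "b"), (100, "a"), (3650, "c"), (200, "d")]
grouped = group_into_windows(msgs)
print(grouped)
# Windows start at 0 and 3600; messages end up sorted within each window
```

In the real service, Pub/Sub gives you at-least-once delivery without global ordering, and Dataflow's windowing plus event-time sorting is what satisfies the per-hour ordering requirement.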
I don't think it's B. D is more cloud-native and actually autoscaling, plus Dataflow gives that windowed ordering you need.
It's D since Pub/Sub and Dataflow together are cloud-native, fully managed, and both autoscale with load. Dataflow specifically gives you windowing for that 1-hour ordering, which is what the question asks for. The Kafka options don't fit as seamlessly on GCP for this use case. Pretty sure about this but open to pushback if I missed some requirement.
Cloud Pub/Sub plus Dataflow (D) is serverless and autoscaling, which fits the scalable pipeline need. Dataflow handles windowed ordering for that 1-hour window. Pretty sure that's what they want here but happy if someone has another angle.
D. Pub/Sub and Dataflow are fully managed and actually autoscale on demand. Also, Dataflow supports windowed ordering, so you can order messages per hour just like the question needs. Not 100% sure if there's any gotcha, but this combo is pretty standard for GCP. Let me know if I missed anything.
Probably D, since Pub/Sub with Dataflow is the only fully managed combo here that autoscales and handles windowed ordering properly.
D imo. Had something like this in a mock, Pub/Sub plus Dataflow is made for scaling and windowed order.
Definitely D for this one
D not C. Dataflow handles windowed ordering and scales automatically. Pub/Sub plus Dataproc won't guarantee the window semantics here.
D