The best option for optimizing the data processing pipeline for run time and compute resource utilization is to embed the augmentation functions dynamically in the tf.data pipeline. This option has the following advantages:
It allows the data augmentation to be performed on the fly, without creating or storing additional
copies of the data. This saves storage space and reduces the data transfer time.
It leverages the parallelism and performance of the tf.data API, which can efficiently apply the
augmentation functions to multiple batches of data in parallel using multiple CPU cores, and can
overlap preprocessing with accelerator execution. The tf.data API also supports various optimization
techniques, such as caching, prefetching, and autotuning, to improve the data processing speed and
reduce the latency.
It integrates seamlessly with TensorFlow and Keras models, which can consume tf.data
datasets as inputs for training and evaluation. The tf.data API also supports various data formats,
such as images, text, audio, and video, and various data sources, such as files, databases, and web
services.
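A minimal sketch of such a pipeline illustrates the idea; the augmentation function and data shapes here are hypothetical, but the `map`/`prefetch` pattern with `tf.data.AUTOTUNE` is the standard way to apply augmentations on the fly:

```python
import tensorflow as tf

# Hypothetical augmentation applied on the fly; no augmented copies
# of the data are ever created or stored.
def augment(image, label):
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_brightness(image, max_delta=0.2)
    return image, label

def build_dataset(images, labels, batch_size=32):
    ds = tf.data.Dataset.from_tensor_slices((images, labels))
    ds = ds.shuffle(1024)
    # num_parallel_calls=AUTOTUNE runs augment across multiple CPU cores.
    ds = ds.map(augment, num_parallel_calls=tf.data.AUTOTUNE)
    ds = ds.batch(batch_size)
    # prefetch overlaps preprocessing with model execution.
    return ds.prefetch(tf.data.AUTOTUNE)
```

The resulting dataset can be passed directly to `model.fit`, and caching (`ds.cache()`) can be added before the `map` step if the raw decode work is the bottleneck.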
The other options are less optimal for the following reasons:
Option B: Embedding the augmentation functions dynamically as part of Keras generators introduces
limitations and overhead. Keras generators are Python generators that yield batches of data for
training or evaluation. However, Keras generators are not compatible with the tf.distribute API,
which is used to distribute training across multiple devices or machines. Moreover, Keras
generators are not as efficient or scalable as the tf.data API: they run on a single Python thread
and do not support the same parallelism or optimization techniques.
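The single-thread limitation described above can be seen in a minimal generator sketch (data shapes are hypothetical): each batch, including any augmentation work, is produced sequentially in Python, so preprocessing cost adds directly to every training step.

```python
import numpy as np

# Minimal Keras-style Python generator (hypothetical data shapes).
# Batches are yielded one at a time on a single Python thread, so
# augmentation cannot overlap with model execution.
def batch_generator(images, labels, batch_size=32):
    n = len(images)
    while True:  # Keras expects generators to loop indefinitely
        idx = np.random.permutation(n)
        for start in range(0, n, batch_size):
            sel = idx[start:start + batch_size]
            yield images[sel], labels[sel]
```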
Option C: Using Dataflow to create all possible augmentations and storing them as TFRecords
introduces additional complexity and cost. Dataflow is a fully managed service that runs Apache
Beam pipelines for data processing and transformation. However, precomputing every possible
augmentation means generating and storing a large number of augmented images, which consumes
substantial storage space and incurs storage and network costs. Moreover, it requires writing and
deploying a separate Dataflow pipeline, which can be tedious and time-consuming.
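The storage multiplication that makes option C costly is easy to see in a small sketch. This is not the Dataflow pipeline itself, just a hypothetical illustration of what materializing augmented variants as TFRecords implies: output size grows linearly with the number of variants per image.

```python
import tensorflow as tf

# Hypothetical sketch of what precomputing augmentations implies:
# every augmented variant is serialized and written out, multiplying
# storage by n_variants.
def write_augmented_tfrecords(images, path, n_variants=4):
    count = 0
    with tf.io.TFRecordWriter(path) as writer:
        for image in images:
            for _ in range(n_variants):
                aug = tf.image.random_flip_left_right(image)
                feature = {"image": tf.train.Feature(
                    bytes_list=tf.train.BytesList(
                        value=[tf.io.serialize_tensor(aug).numpy()]))}
                example = tf.train.Example(
                    features=tf.train.Features(feature=feature))
                writer.write(example.SerializeToString())
                count += 1
    return count
```

With the dynamic tf.data approach, none of these records exist; each variant is generated in memory and discarded after the training step that consumed it.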
Option D: Using Dataflow to create the augmentations dynamically per training run and staging them
as TFRecords introduces additional complexity and latency. It requires running a Dataflow pipeline
every time the model is trained, which delays the start of training. Moreover, as with option C, it
requires writing and deploying a separate Dataflow pipeline, which can be tedious and
time-consuming.