Q: 18
You want to build a managed Hadoop system as your data lake. The data transformation process is
composed of a series of Hadoop jobs executed in sequence. To accomplish the design of separating
storage from compute, you decided to use the Cloud Storage connector to store all input data,
output data, and intermediary dat
a. However, you noticed that one Hadoop job runs very slowly with Cloud Dataproc, when compared
with the on-premises bare-metal Hadoop environment (8-core nodes with 100-GB RAM). Analysis
shows that this particular Hadoop job is disk I/O intensive. You want to resolve the issue. What
should you do?
Options
Discussion
Option A Similar questions in the GCP practice exams highlight in-memory processing as key for performance issues.
Be respectful. No spam.