Prepare Better for the Associate-Data-Practitioner Exam with Our Free and Reliable Associate-Data-Practitioner Exam Questions – Updated for 2025.

At Cert Empire, we are committed to providing the most accurate and up-to-date exam questions for students preparing for the Cisco Associate-Data-Practitioner Exam. To make studying easier, we’ve made parts of our Associate-Data-Practitioner exam resources free for everyone. You can practice as much as you want with Free Associate-Data-Practitioner Practice Test.

Get Associate-Data-Practitioner Exam Dumps

GOOGLE Cloud Associate-Data-Practitioner Free Exam Questions

Disclaimer

Please keep a note that the demo questions are not frequently updated. You may as well find them in open communities around the web. However, this demo is only to depict what sort of questions you may find in our original files.

Nonetheless, the premium exam dumps files are frequently updated and are based on the latest exam syllabus and real exam questions.

1 / 60

You are a data analyst working with sensitive customer data in BigQuery. You need to ensure that only authorized personnel within your organization can query this data, while following the principle of least privilege. What should you do?

Encrypt the data by using customer-managed encryption keys (CMEK)

Export the data to Cloud Storage, and use signed URLs to authorize access

Update dataset privileges by using the SQL GRANT statement

Enable access control by using IAM roles

2 / 60

You manage a BigQuery table that is used for critical end-of-month reports. The table is updated weekly with new sales data. You want to prevent data loss and reporting issues if the table is accidentally deleted. What should you do?

Create a clone of the table. On deletion, re-create the deleted table by copying the content of the clone

Create a view of the table. On deletion, re-create the deleted table from the view and time travel data

Configure the time travel duration on the table to be exactly seven days. On deletion, re-create the deleted table solely from the time travel data

Schedule the creation of a new snapshot of the table once a week. On deletion, re-create the deleted table using the snapshot and time travel data

3 / 60

Your organization needs to implement near real-time analytics for thousands of events arriving each second in Pub/Sub. The incoming messages require transformations. You need to configure a pipeline that processes, transforms, and loads the data into BigQuery while minimizing development time. What should you do?

Use Cloud Run functions to process the Pub/Sub messages, perform transformations, and write the results to BigQuery

Create a Cloud Data Fusion instance and configure Pub/Sub as a source. Use Data Fusion to process the Pub/Sub messages, perform transformations, and write the results to BigQuery

Load the data from Pub/Sub into Cloud Storage using a Cloud Storage subscription. Create a Dataproc cluster, use PySpark to perform transformations in Cloud Storage, and write the results to BigQuery

Use a Google-provided Dataflow template to process the Pub/Sub messages, perform transformations, and write the results to BigQuery

4 / 60

You used BigQuery ML to build a customer purchase propensity model six months ago. You want to compare the current serving data with the historical serving data to determine whether you need to retrain the model. What should you do?

Evaluate data drift

Compare the two different models

Evaluate the data skewness

Compare the confusion matrix

5 / 60

Your company’s ecommerce website collects product reviews from customers. The reviews are loaded as CSV files daily to a Cloud Storage bucket. The reviews are in multiple languages and need to be translated to Spanish. You need to configure a pipeline that is serverless, efficient, and requires minimal maintenance. What should you do?

Load the data into BigQuery using Dataproc. Use Apache Spark to translate the reviews by invoking the Cloud Translation API. Set BigQuery as the sink.U

Use a Dataflow templates pipeline to translate the reviews using the Cloud Translation API. Set BigQuery as the sink

Load the data into BigQuery using a Cloud Run function. Create a BigQuery remote function that invokes the Cloud Translation API. Use a scheduled query to translate new reviews

Load the data into BigQuery using a Cloud Run function. Use the BigQuery ML create model statement to train a translation model. Use the model to translate the product reviews within BigQuery

6 / 60

You need to create a new data pipeline. You want a serverless solution that meets the following requirements:

• Data is streamed from Pub/Sub and is processed in real-time.

• Data is transformed before being stored.

• Data is stored in a location that will allow it to be analyzed with SQL using Looker.

Which Google Cloud services should you recommend for the pipeline?

Dataflow, BigQuery

Cloud Composer, Cloud SQL for MySQL

BigQuery, Analytics Hub

Dataproc Serverless, Bigtable

7 / 60

You need to transfer approximately 300 TB of data from your company's on-premises data center to Cloud Storage. You have 100 Mbps internet bandwidth, and the transfer needs to be completed as quickly as possible. What should you do?

Request a Transfer Appliance, copy the data to the appliance, and ship it back to Google

Compress the data, upload it to multiple cloud storage providers, and then transfer the data to Cloud Storage

Use Cloud Client Libraries to transfer the data over the internet

Use the gcloud storage command to transfer the data over the internet

8 / 60

Your team wants to create a monthly report to analyze inventory data that is updated daily. You need to aggregate the inventory counts by using only the most recent month of data, and save the results to be used in a Looker Studio dashboard. What should you do?

Create a BigQuery table that uses the SUM( ) function and the DATE_DIFF( ) function

Create a BigQuery table that uses the SUM( ) function and the _PARTITIONDATE filter

Create a materialized view in BigQuery that uses the SUM( ) function and the DATE_SUB( ) function

Create a saved query in the BigQuery console that uses the SUM( ) function and the DATE_SUB( ) function. Re-run the saved query every month, and save the results to a BigQuery table

9 / 60

You are designing an application that will interact with several BigQuery datasets. You need to grant the application’s service account permissions that allow it to query and update tables within the datasets, and list all datasets in a project within your application. You want to follow the principle of least privilege. Which pre-defined IAM role(s) should you apply to the service account?

roles/bigquery.connectionUser and roles/bigquery.dataViewer

roles/bigquery.user and roles/bigquery.filteredDataViewer

roles/bigquery.admin

roles/bigquery.jobUser and roles/bigquery.dataOwner

10 / 60

Your organization’s business analysts require near real-time access to streaming data. However, they are reporting that their dashboard queries are loading slowly. After investigating BigQuery query performance, you discover the slow dashboard queries perform several joins and aggregations.

You need to improve the dashboard loading time and ensure that the dashboard data is as up-to-date as possible. What should you do?

Disable BiqQuery query result caching

Create materialized views

Create a scheduled query to calculate and store intermediate results

Modify the schema to use parameterized data types

11 / 60

You need to create a data pipeline for a new application. Your application will stream data that needs to be enriched and cleaned. Eventually, the data will be used to train machine learning models. You need to determine the appropriate data manipulation methodology and which Google Cloud services to use in this pipeline. What should you choose?

ELT; Cloud SQL -> Analytics Hub

ETL; Cloud Data Fusion -> Cloud Storage

ELT; Cloud Storage -> Bigtable

ETL; Dataflow -> BigQuery

12 / 60

You are using your own data to demonstrate the capabilities of BigQuery to your organization’s leadership team. You need to perform a one-time load of the files stored on your local machine into BigQuery using as little effort as possible. What should you do?

Execute the bq load command on your local machine

Create a Dataflow job using the Apache Beam FileIO and BigQueryIO connectors with a local runner

Create a Dataproc cluster, copy the files to Cloud Storage, and write an Apache Spark job using the spark-bigquery-connector

Write and execute a Python script using the BigQuery Storage Write API library

13 / 60

Your team is building several data pipelines that contain a collection of complex tasks and dependencies that you want to execute on a schedule, in a specific order. The tasks and dependencies consist of files in Cloud Storage, Apache Spark jobs, and data in BigQuery. You need to design a system that can schedule and automate these data processing tasks using a fully managed approach. What should you do?

Create directed acyclic graphs (DAGs) in Cloud Composer. Use the appropriate operators to connect to Cloud Storage, Spark, and BigQuery

Use Cloud Tasks to schedule and run the jobs asynchronously

Create directed acyclic graphs (DAGs) in Apache Airflow deployed on Google Kubernetes Engine. Use the appropriate operators to connect to Cloud Storage, Spark, and BigQuery

Use Cloud Scheduler to schedule the jobs to run

14 / 60

Your organization uses scheduled queries to perform transformations on data stored in BigQuery. You discover that one of your scheduled queries has failed. You need to troubleshoot the issue as quickly as possible. What should you do?

Navigate to the Scheduled queries page in the Google Cloud console. Select the failed job, and analyze the error details

Set up a log sink using the gcloud CLI to export BigQuery audit logs to BigQuery. Query those logs to identify the error associated with the failed job ID

Request access from your admin to the BigQuery information_schema. Query the jobs view with the failed job ID, and analyze error details

Navigate to the Logs Explorer page in Cloud Logging. Use filters to find the failed job, and analyze the error details

15 / 60

Your organization has a petabyte of application logs stored as Parquet files in Cloud Storage. You need to quickly perform a one-time SQL-based analysis of the files and join them to data that already resides in BigQuery. What should you do?

Create external tables over the files in Cloud Storage, and perform SQL joins to tables in BigQuery to analyze the data

Use the bq load command to load the Parquet files into BigQuery, and perform SQL joins to analyze the data

Launch a Cloud Data Fusion environment, use plugins to connect to BigQuery and Cloud Storage, and use the SQL join operation to analyze the data

Create a Dataproc cluster, and write a PySpark job to join the data from BigQuery to the files in Cloud Storage

16 / 60

You are a database administrator managing sales transaction data by region stored in a BigQuery table. You need to ensure that each sales representative can only see the transactions in their region. What should you do?

Grant the appropriate 1AM permissions on the dataset

Add a policy tag in BigQuery

Create a data masking rule

Create a row-level access policy

17 / 60

You work for a financial organization that stores transaction data in BigQuery. Your organization has a regulatory requirement to retain data for a minimum of seven years for auditing purposes. You need to ensure that the data is retained for seven years using an efficient and cost-optimized approach. What should you do?

Set the dateset-level retention policy in bigQuery to seven years

Export the bigQuery tables to cloud storage daily, and enforce a lifecycle management policy that has a seven-year retention rule

Create a partition by transaction date, and set the partition expiration policy to seven years

Set the table-level retention policy in bigQuery to seven years

18 / 60

You need to create a weekly aggregated sales report based on a large volume of data. You want to use Python to design an efficient process for generating this report. What should you do?

Create a Dataflow directed acyclic graph (DAG) coded in Python. Use Cloud Scheduler to schedule the code to run once a week

Create a Colab Enterprise notebook and use the bigframes.pandas library. Schedule the notebook to execute once a week

Create a Cloud Data Fusion and Wrangler flow. Schedule the flow to run once a week

Create a Cloud Run function that uses NumPy. Use Cloud Scheduler to schedule the function to run once a week

19 / 60

Your organization has decided to move their on-premises Apache Spark-based workload to Google Cloud. You want to be able to manage the code without needing to provision and manage your own cluster. What should you do?

Migrate the Spark jobs to Dataproc on Compute Engine

Migrate the Spark jobs to Dataproc on Google Kubernetes Engine

Configure a Google Kubernetes Engine cluster with Spark operators, and deploy the Spark jobs

Migrate the Spark jobs to Dataproc Serverless

20 / 60

You are developing a data ingestion pipeline to load small CSV files into BigQuery from Cloud Storage. You want to load these files upon arrival to minimize data latency. You want to accomplish this with minimal cost and maintenance. What should you do?

Create a Cloud Run function to load the data into BigQuery that is triggered when data arrives in Cloud Storage

Create a Dataproc cluster to pull CSV files from Cloud Storage, process them using Spark, and write the results to BigQuery

Create a Cloud Composer pipeline to load new files from Cloud Storage to BigQuery and schedule it to run every 10 minutes

Use the bq command-line tool within a Cloud Shell instance to load the data into BigQuery

21 / 60

You need to create a data pipeline that streams event information from applications in multiple Google Cloud regions into BigQuery for near real-time analysis. The data requires transformation before loading. You want to create the pipeline using a visual interface. What should you do?

APush event information to a Pub/Sub topic. Create a Dataflow job using the Dataflow job builder.

BPush event information to a Pub/Sub topic. Create a Cloud Run function to subscribe to the Pub/Sub topic, apply transformations, and insert the data into BigQuery.

CPushPush event information to a Pub/Sub topic. Create a BigQuery subscription in Pub/Sub.

D event information to Cloud Storage, and create an external table in BigQuery. Create a BigQuery scheduled job that executes once each day to apply transformations.

PushPush event information to a Pub/Sub topic. Create a BigQuery subscription in Pub/Sub

Event information to Cloud Storage, and create an external table in BigQuery. Create a BigQuery scheduled job that executes once each day to apply transformations

Push event information to a Pub/Sub topic. Create a Cloud Run function to subscribe to the Pub/Sub topic, apply transformations, and insert the data into BigQuery

Push event information to a Pub/Sub topic. Create a Dataflow job using the Dataflow job builder

22 / 60

Your organization's ecommerce website collects user activity logs using a Pub/Sub topic. Your organization's leadership team wants a dashboard that contains aggregated user engagement metrics. You need to create a solution that transforms the user activity logs into aggregated metrics, while ensuring that the raw data can be easily queried. What should you do?

Create a Cloud Storage subscription to the Pub/Sub topic. Load the activity logs into a bucket using the Avro file format. Use Dataflow to transform the data, and load it into a BigQuery table for reporting

Create an event-driven Cloud Run function to trigger a data transformation pipeline to run. Load the transformed activity logs into a BigQuery table for reporting

Create a BigQuery subscription to the Pub/Sub topic, and load the activity logs into the table. Create a materialized view in BigQuery using SQL to transform the data for reporting

Create a Dataflow subscription to the Pub/Sub topic, and transform the activity logs. Load the transformed data into a BigQuery table for reporting

23 / 60

Your company currently uses an on-premises network file system (NFS) and is migrating data to Google Cloud. You want to be able to control how much bandwidth is used by the data migration while capturing detailed reporting on the migration status. What should you do?

Use Storage Transfer Service

Use gcloud storage commands

Use a Transfer Appliance

Use Cloud Storage FUSE

24 / 60

You work for a home insurance company. You are frequently asked to create and save risk reports with charts for specific areas using a publicly available storm event dataset. You want to be able to quickly create and re-run risk reports when new data becomes available. What should you do?

Reference and query the storm event dataset using SQL in a Colab Enterprise notebook. Display the table results and document with Markdown, and use Matplotlib to create charts

Reference and query the storm event dataset using SQL in BigQuery Studio. Export the results to Google Sheets, and use cell data in the worksheets to create charts

Export the storm event dataset as a CSV file. Import the file to Google Sheets, and use cell data in the worksheets to create charts

Copy the storm event dataset into your BigQuery project. Use BigQuery Studio to query and visualize the data in Looker Studio

25 / 60

You work for an online retail company. Your company collects customer purchase data in CSV files and pushes them to Cloud Storage every 10 minutes. The data needs to be transformed and loaded into BigQuery for analysis. The transformation involves cleaning the data, removing duplicates, and enriching it with product information from a separate table in BigQuery. You need to implement a low-overhead solution that initiates data processing as soon as the files are loaded into Cloud Storage. What should you do?

Use Dataflow to implement a streaming pipeline using an OBJECT_FINALIZE notification from Pub/Sub to read the data from Cloud Storage, perform the transformations, and write the data to BigQuery

Create a Cloud Data Fusion job to process and load the data from Cloud Storage into BigQuery. Create an OBJECT_FINALI ZE notification in Pub/Sub, and trigger a Cloud Run function to start the Cloud Data Fusion job as soon as new files are loaded

Schedule a direct acyclic graph (DAG) in Cloud Composer to run hourly to batch load the data from Cloud Storage to BigQuery, and process the data in BigQuery using SQL

Use Cloud Composer sensors to detect files loading in Cloud Storage. Create a Dataproc cluster, and use a Composer task to execute a job on the cluster to process and load the data into BigQuery

26 / 60

You are working with a large dataset of customer reviews stored in Cloud Storage. The dataset contains several inconsistencies, such as missing values, incorrect data types, and duplicate entries. You need to clean the data to ensure that it is accurate and consistent before using it for analysis. What should you do?

Use Cloud Run functions to clean the data and load it into BigQuery. Use SQL for analysis

Use Storage Transfer Service to move the data to a different Cloud Storage bucket. Use event triggers to invoke Cloud Run functions to load the data into BigQuery. Use SQL for analysis

Use the PythonOperator in Cloud Composer to clean the data and load it into BigQuery. Use SQL for analysis

Use BigQuery to batch load the data into BigQuery. Use SQL for cleaning and analysis

27 / 60

Your retail organization stores sensitive application usage data in Cloud Storage. You need to encrypt the data without the operational overhead of managing encryption keys. What should you do?

Use customer-supplied encryption keys (CSEK)

Use customer-supplied encryption keys (CSEK) for the sensitive data and customer-managed encryption keys (CMEK) for the less sensitive data

Use customer-managed encryption keys (CMEK)

Use Google-managed encryption keys (GMEK)

28 / 60

Your company's customer support audio files are stored in a Cloud Storage bucket. You plan to analyze the audio files' metadata and file content within BigQuery to create inference by using BigQuery ML. You need to create a corresponding table in BigQuery that represents the bucket containing the audio files. What should you do?

Create an object table

Create a native table

Create a temporary table

Create an external table

29 / 60

You work for a financial services company that handles highly sensitive dat

a. Due to regulatory requirements, your company is required to have complete and manual control of data encryption. Which type of keys should you recommend to use for data storage?

Use Google-managed encryption keys (GMEK)

Use a dedicated third-party key management system (KMS) chosen by the company

Use customer-managed encryption keys (CMEK)

Use customer-supplied encryption keys (CSEK)

30 / 60

Your team needs to analyze large datasets stored in BigQuery to identify trends in user behavior. The analysis will involve complex statistical calculations, Python packages, and visualizations. You need to recommend a managed collaborative environment to develop and share the analysis. What should you recommend?

Create a Looker Studio dashboard and connect the dashboard to BigQuery. Share the dashboard with your team. Analyze the data and generate visualizations in Looker Studio

Create a statistical model by using BigQuery ML. Share the query with your team. Analyze the data and generate visualizations in Looker Studio

Create a Colab Enterprise notebook and connect the notebook to BigQuery. Share the notebook with your team. Analyze the data and generate visualizations in Colab Enterprise

Connect Google Sheets to BigQuery by using Connected Sheets. Share the Google Sheet with your team. Analyze the data and generate visualizations in Gooqle Sheets

31 / 60

Your organization has several datasets in BigQuery. The datasets need to be shared with your external partners so that they can run SQL queries without needing to copy the data to their own projects. You have organized each partner's data in its own BigQuery dataset. Each partner should be able to access only their dat

a. You want to share the data while following Google-recommended practices. What should you do?

Grant the partners the bigquery.user IAM role on the BigQuery project

Export the BigQuery data to a Cloud Storage bucket. Grant the partners the storage.objectUser IAM role on the bucket

Create a Dataflow job that reads from each BigQuery dataset and pushes the data into a dedicated Pub/Sub topic for each partner. Grant each partner the pubsub. subscriber IAM role

Use Analytics Hub to create a listing on a private data exchange for each partner dataset. Allow each partner to subscribe to their respective listings

32 / 60

Your organization has decided to migrate their existing enterprise data warehouse to BigQuery. The existing data pipeline tools already support connectors to BigQuery. You need to identify a data migration approach that optimizes migration speed. What should you do?

Use the existing data pipeline tool's BigQuery connector to reconfigure the data mapping

Use the BigQuery Data Transfer Service to recreate the data pipeline and migrate the data into BigQuery

Use the Cloud Data Fusion web interface to build data pipelines. Create a directed acyclic graph (DAG) that facilitates pipeline orchestration

Create a temporary file system to facilitate data transfer from the existing environment to Cloud Storage. Use Storage Transfer Service to migrate the data into BigQuery

33 / 60

You are responsible for managing Cloud Storage buckets for a research company. Your company has well-defined data tiering and retention rules. You need to optimize storage costs while achieving your data retention needs. What should you do?

Configure the buckets to use the Archive storage class

Configure a lifecycle management policy on each bucket to downgrade the storage class and remove objects based on age

Configure the buckets to use the Standard storage class and enable Object Versioning

Configure the buckets to use the Autoclass feature

34 / 60

You are a Looker analyst. You need to add a new field to your Looker report that generates SQL that will run against your company's database. You do not have the Develop permission. What should you do?

Create a new field in the LookML layer, refresh your report, and select your new field from the field picker

Create a custom field from the field picker in Looker, and add it to your reportCreate a custom field from the field picker in Looker, and add it to your report

Create a table calculation from the field picker in Looker, and add it to your report

Create a calculated field using the Add a field option in Looker Studio, and add it to your report

35 / 60

Your organization is building a new application on Google Cloud. Several data files will need to be stored in Cloud Storage. Your organization has approved only two specific cloud regions where these data files can reside. You need to determine a Cloud Storage bucket strategy that includes automated high availability. What should you do?

Create a single-region bucket in each of the two regions, and use the gcloud storage command to replicate the data across the buckets in both regions

Create a single-region bucket in each of the two regions, and use Storage Transfer Service to replicate the data across the buckets in both regions

Create a multi-region bucket, and upload the files to this bucket

Create a dual-region bucket, and upload the files to this bucket

36 / 60

You work for a gaming company that collects real-time player activity dat

a. This data is streamed into Pub/Sub and needs to be processed and loaded into BigQuery for analysis. The processing involves filtering, enriching, and aggregating the data before loading it into partitioned BigQuery tables. You need to design a pipeline that ensures low latency and high throughput while following a Google-recommended approach. What should you do?

Use Dataflow to create a streaming pipeline that reads the data from Pub/Sub, processes the data, and writes it to BigQuery using the streaming API

Use Cloud Run functions to subscribe to the Pub/Sub topic, process the data, and write it to BigQuery using the streaming API

Use Dataproc to create an Apache Spark streaming job that reads the data from Pub/Sub, processes the data, and writes it to BigQuery

Use Cloud Composer to orchestrate a workflow that reads the data from Pub/Sub, processes the data using a Python script, and writes it to BigQuery

37 / 60

Another team in your organization is requesting access to a BigQuery dataset. You need to share the dataset with the team while minimizing the risk of unauthorized copying of data. You also want to create a reusable framework in case you need to share this data with other teams in the future. What should you do?

Enable domain restricted sharing on the project. Grant the team members the BigQuery Data Viewer IAM role on the dataset

Create authorized views in the team’s Google Cloud project that is only accessible by the team

Create a private exchange using Analytics Hub with data egress restriction, and grant access to the team members

38 / 60

You are designing a pipeline to process data files that arrive in Cloud Storage by 3:00 am each day. Data processing is performed in stages, where the output of one stage becomes the input of the next. Each stage takes a long time to run. Occasionally a stage fails, and you have to address the problem. You need to ensure that the final output is generated as quickly as possible. What should you do?

Design the processing as a directed acyclic graph (DAG) in Cloud Composer. Clear the state of the failed task after correcting any stage output data errors

Design the workflow as a Cloud Workflow instance. Code the workflow to jump to a given stage based on an input parameter. Rerun the workflow after correcting any stage output data errors

Design a Spark program that runs under Dataproc. Code the program to wait for user input when an error is detected. Rerun the last action after correcting any stage output data errors

Design the pipeline as a set of PTransforms in Dataflow. Restart the pipeline after correcting any stage output data errors

39 / 60

You work for an ecommerce company that has a BigQuery dataset that contains customer purchase history, demographics, and website interactions. You need to build a machine learning (ML) model to predict which customers are most likely to make a purchase in the next month. You have limited engineering resources and need to minimize the ML expertise required for the solution. What should you do?

Use Colab Enterprise to develop a custom model for purchase prediction

Export the data to Cloud Storage, and use AutoML Tables to build a classification model for purchase prediction

Use Vertex AI Workbench to develop a custom model for purchase prediction

Use BigQuery ML to create a logistic regression model for purchase prediction

40 / 60

You manage a large amount of data in Cloud Storage, including raw data, processed data, and backups. Your organization is subject to strict compliance regulations that mandate data immutability for specific data types. You want to use an efficient process to reduce storage costs while ensuring that your storage strategy meets retention requirements. What should you do?

Create a Cloud Run function to periodically check object metadata, and move objects to the appropriate storage class based on age and access patterns. Use object holds to enforce immutability for specific objects

Use object holds to enforce immutability for specific objects, and configure lifecycle management rules to transition objects to appropriate storage classes based on age and access patterns

Move objects to different storage classes based on their age and access patterns. Use Cloud Key Management Service (Cloud KMS) to encrypt specific objects with customer-managed encryption keys (CMEK) to meet immutability requirements

Configure lifecycle management rules to transition objects to appropriate storage classes based on access patterns. Set up Object Versioning for all objects to meet immutability requirements

41 / 60

You work for a healthcare company that has a large on-premises data system containing patient records with personally identifiable information (PII) such as names, addresses, and medical diagnoses. You need a standardized managed solution that de-identifies PII across all your data feeds prior to ingestion to Google Cloud. What should you do?

Use Apache Beam to read the data and perform the necessary cleaning and transformation operations. Store the cleaned data in BigQuery

Load the data into BigQuery, and inspect the data by using SQL queries. Use Dataflow to transform the data and remove any errors

Use Cloud Run functions to create a serverless data cleaning pipeline. Store the cleaned data in BigQuery

Use Cloud Data Fusion to transform the data. Store the cleaned data in BigQuery

42 / 60

You manage a Cloud Storage bucket that stores temporary files created during data processing. These temporary files are only needed for seven days, after which they are no longer needed. To reduce storage costs and keep your bucket organized, you want to automatically delete these files once they are older than seven days. What should you do?

Create a Cloud Run function that runs daily and deletes files older than seven days

Develop a batch process using Dataflow that runs weekly and deletes files based on their age

Set up a Cloud Scheduler job that invokes a weekly Cloud Run function to delete files older than seven days

Configure a Cloud Storage lifecycle rule that automatically deletes objects older than seven days

43 / 60

You want to process and load a daily sales CSV file stored in Cloud Storage into BigQuery for downstream reporting. You need to quickly build a scalable data pipeline that transforms the data while providing insights into data quality issues. What should you do?

Create a batch pipeline in Dataflow by using the Cloud Storage CSV file to BigQuery batch template

Load the CSV file as a table in BigQuery, and use scheduled queries to run SQL transformation scripts

Load the CSV file as a table in BigQuery. Create a batch pipeline in Cloud Data Fusion by using a BigQuery source and sink

Create a batch pipeline in Cloud Data Fusion by using a Cloud Storage source and a BigQuery sink

44 / 60

Your company is building a near real-time streaming pipeline to process JSON telemetry data from small appliances. You need to process messages arriving at a Pub/Sub topic, capitalize letters in the serial number field, and write results to BigQuery. You want to use a managed service and write a minimal amount of code for underlying transformations. What should you do?

Use a Pub/Sub to BigQuery subscription, write results directly to BigQuery, and schedule a transformation query to run every five minutes

Use a Pub/Sub to Cloud Storage subscription, write a Cloud Run service that is triggered when objects arrive in the bucket, performs the transformations, and writes the results to BigQuery

Use a Pub/Sub push subscription, write a Cloud Run service that accepts the messages, performs the transformations, and writes the results to BigQuery

Use the “Pub/Sub to BigQuery” Dataflow template with a UDF, and write the results to BigQuery

45 / 60

You created a customer support application that sends several forms of data to Google Cloud. Your application is sending:

1. Audio files from phone interactions with support agents that will be accessed during trainings.

2. CSV files of users' personally identifiable information (Pll) that will be analyzed with SQL.

3. A large volume of small document files that will power other applications.

You need to select the appropriate tool for each data type given the required use case, while following Google-recommended practices. Which should you choose?

1. Cloud Storage 2. CloudSQL for PostgreSQL 3. Bigtable

1. Filestore 2. Cloud SQL for PostgreSQL 3. Datastore

1. Cloud Storage 2. BigQuery 3. Firestore

1. Filestore 2. Bigtable 3. BigQuery

46 / 60

You are migrating data from a legacy on-premises MySQL database to Google Cloud. The database contains various tables with different data types and sizes, including large tables with millions of rows and transactional data. You need to migrate this data while maintaining data integrity, and minimizing downtime and cost. What should you do?

Use Database Migration Service to replicate the MySQL database to a Cloud SQL for MySQL instance

Export the MySQL database to CSV files, transfer the files to Cloud Storage by using Storage Transfer Service, and load the files into a Cloud SQL for MySQL instance

Use Cloud Data Fusion to migrate the MySQL database to MySQL on Compute Engine

Set up a Cloud Composer environment to orchestrate a custom data pipeline. Use a Python script to extract data from the MySQL database and load it to MySQL on Compute Engine

47 / 60

You work for a global financial services company that trades stocks 24/7. You have a Cloud SGL for PostgreSQL user database. You need to identify a solution that ensures that the database is continuously operational, minimizes downtime, and will not lose any data in the event of a zonal outage. What should you do?

Configure and create a high-availability Cloud SQL instance with the primary instance in zone A and a secondary instance in any zone other than zone A.

Create a read replica in the same region but in a different zone

Continuously back up the Cloud SGL instance to Cloud Storage. Create a Compute Engine instance with PostgreSCL in a different region. Restore the backup in the Compute Engine instance if a failure occurs

Create a read replica in another region. Promote the replica to primary if a failure occurs

48 / 60

Your team uses the Google Ads platform to visualize metrics. You want to export the data to BigQuery to get more granular insights. You need to execute a one-time transfer of historical data and automatically update data daily. You want a solution that is low-code, serverless, and requires minimal maintenance. What should you do?

Export the historical data to BigQuery by using BigQuery Data Transfer Service. Use BigQuery Data Transfer Service for daily automation

Export the historical data to Cloud Storage by using Storage Transfer Service. Use Pub/Sub to trigger a Dataflow template that loads data for daily automation

Export the historical data as a CSV file. Import the file into BigQuery for analysis. Use Cloud Composer for daily automation

Export the historical data to BigQuery by using BigQuery Data Transfer Service. Use Cloud Composer for daily automation

49 / 60

Your organization has a BigQuery dataset that contains sensitive employee information such as salaries and performance reviews. The payroll specialist in the HR department needs to have continuous access to aggregated performance data, but they do not need continuous access to other sensitive dat

a. You need to grant the payroll specialist access to the performance data without granting them access to the entire dataset using the simplest and most secure approach. What should you do?

Create a table with the aggregated performance data. Use table-level permissions to grant access to the payroll specialist

Create a SQL query with the aggregated performance data. Export the results to an Avro file in a Cloud Storage bucket. Share the bucket with the payroll specialist

Use authorized views to share query results with the payroll specialist

Create row-level and column-level permissions and policies on the table that contains performance data in the dataset. Provide the payroll specialist with the appropriate permission set

50 / 60

Your organization sends IoT event data to a Pub/Sub topic. Subscriber applications read and perform transformations on the messages before storing them in the data warehouse. During particularly busy times when more data is being written to the topic, you notice that the subscriber applications are not acknowledging messages within the deadline. You need to modify your pipeline to handle these activity spikes and continue to process the messages. What should you do?

Retry messages until they are acknowledged

Implement flow control on the subscribers

Forward unacknowledged messages to a dead-letter topic

Seek back to the last acknowledged message

51 / 60

Your data science team needs to collaboratively analyze a 25 TB BigQuery dataset to support the development of a machine learning model. You want to use Colab Enterprise notebooks while ensuring efficient data access and minimizing cost. What should you do?

Copy the BigQuery dataset to the local storage of the Colab Enterprise runtime, and analyze the data using Pandas

Create a Dataproc cluster connected to a Colab Enterprise notebook, and use Spark to process the data in BigQuery

Export the BigQuery dataset to Google Drive. Load the dataset into the Colab Enterprise notebook using Pandas

Use BigQuery magic commands within a Colab Enterprise notebook to query and analyze the data

52 / 60

You have a Dataproc cluster that performs batch processing on data stored in Cloud Storage. You need to schedule a daily Spark job to generate a report that will be emailed to stakeholders. You need a fully-managed solution that is easy to implement and minimizes complexity. What should you do?

Use Dataproc workflow templates to define and schedule the Spark job, and to email the report

Use Cloud Composer to orchestrate the Spark job and email the report

Use Cloud Run functions to trigger the Spark job and email the report

Use Cloud Scheduler to trigger the Spark job. and use Cloud Run functions to email the report

53 / 60

Your organization uses Dataflow pipelines to process real-time financial transactions. You discover that one of your Dataflow jobs has failed. You need to troubleshoot the issue as quickly as possible. What should you do?

Navigate to the Dataflow Jobs page in the Google Cloud console. Use the job logs and worker logs to identify the error

Use the gcloud CLI tool to retrieve job metrics and logs, and analyze them for errors and performance bottlenecks

Create a custom script to periodically poll the Dataflow API for job status updates, and send email alerts if any errors are identified

Set up a Cloud Monitoring dashboard to track key Dataflow metrics, such as data throughput, error rates, and resource utilization

54 / 60

You are a data analyst at your organization. You have been given a BigQuery dataset that includes customer information. The dataset contains inconsistencies and errors, such as missing values, duplicates, and formatting issues. You need to effectively and quickly clean the data. What should you do?

Use BigQuery's built-in functions to perform data quality transformations

Export the data from BigQuery to CSV files. Resolve the errors using a spreadsheet editor, and re-import the cleaned data into BigQuery

Develop a Dataflow pipeline to read the data from BigQuery, perform data quality rules and transformations, and write the cleaned data back to BigQuery

Use Cloud Data Fusion to create a data pipeline to read the data from BigQuery, perform data quality transformations, and write the clean data back to BigQuery

55 / 60

You are predicting customer churn for a subscription-based service. You have a 50 PB historical customer dataset in BigQuery that includes demographics, subscription information, and engagement metrics. You want to build a churn prediction model with minimal overhead. You want to follow the Google-recommended approach. What should you do?

Use the BigQuery Python client library in a Jupyter notebook to query and preprocess the data in BigQuery. Use the CREATE MODEL statement in BigQueryML to train the churn prediction model

Create a Looker dashboard that is connected to BigQuery. Use LookML to predict churn

Export the data from BigQuery to a local machine. Use scikit-learn in a Jupyter notebook to build the churn prediction model

Use Dataproc to create a Spark cluster. Use the Spark MLlib within the cluster to build the churn prediction model

56 / 60

Your company uses Looker to generate and share reports with various stakeholders. You have a complex dashboard with several visualizations that needs to be delivered to specific stakeholders on a recurring basis, with customized filters applied for each recipient. You need an efficient and scalable solution to automate the delivery of this customized dashboard. You want to follow the Google-recommended approach. What should you do?

Use the Looker Scheduler with a user attribute filter on the dashboard, and send the dashboard with personalized filters to each stakeholder based on their attributes

Create a script using the Looker Python SDK, and configure user attribute filter values. Generate a new scheduled plan for each stakeholder

Embed the Looker dashboard in a custom web application, and use the application's scheduling features to send the report with personalized filters

Create a separate LookML model for each stakeholder with predefined filters, and schedule the dashboards using the Looker Scheduler

57 / 60

You have a Cloud SQL for PostgreSQL database that stores user data. To ensure high availability and minimize downtime in the event of a zonal outage, what should you do?

Use Cloud Spanner to migrate the database for automatic replication and high availability

Continuously back up the Cloud SQL instance to Cloud Storage. Create a Compute Engine instance with PostgreSQL in a different region. Restore the backup in the Compute Engine instance if a failure occurs

Implement a manual failover strategy using Cloud SQL replicas in different zones

Configure and create a high-availability Cloud SQL instance with the primary instance in zone A and a secondary instance in any zone other than zone A

58 / 60

Your company uses Looker as its primary business intelligence platform. You want to use LookML to visualize the profit margin for each of your company's products in your Looker Explores and dashboards. You need to implement a solution quickly and efficiently. What should you do?

Apply a filter to only show products with a positive profit margin

Create a new dimension that categorizes products based on their profit margin ranges (e.g., high, medium, low)

Create a derived table that pre-calculates the profit margin for each product, and include it in the Looker model

Define a new measure that calculates the profit margin by using the existing revenue and cost fields

59 / 60

Your company collects customer feedback from various sources, including online transactions, customer surveys, and social media activity. You need to design a data pipeline to extract this data and store it in Google Cloud for further analysis and machine learning model training. Which Google Cloud storage system(s) should you select for each data source?

Online transactions: Cloud Storage; Customer feedback: Cloud Storage; Social media activity: Cloud Storage

Online transactions: Cloud SQL for MySQL; Customer feedback: BigQuery; Social media activity: Cloud Storage

Online transactions: Bigtable; Customer feedback: Cloud Storage; Social media activity: Cloud SQL for MySQL

Online transactions: BigQuery; Customer feedback: Cloud Storage; Social media activity: BigQuery

60 / 60

You are designing a data pipeline that ingests data from CSV, Avro, and Parquet files into Cloud Storage. The data includes raw user input. You need to remove all malicious SQL injections before storing the data in BigQuery. Which data manipulation methodology should you choose?

ETL (Extract, Transform, Load)

ETLT (Extract, Transform, Load, Transform)

EL (Extract, Load)

ELT (Extract, Load, Transform)

Your score is

The average score is 0%

By Wordpress Quiz plugin

Free Associate-Data-Practitioner Exam Questions – 2025 Updated

Prepare Better for the Associate-Data-Practitioner Exam with Our Free and Reliable Associate-Data-Practitioner Exam Questions – Updated for 2025.

Disclaimer

Contact Us

[email protected]

Helpful links

Top Exams

Popular Exams

FLASH OFFER

avail $6 DISCOUNT on YOUR PURCHASE