Free Practice Test

Free Generative AI Engineer Associate Practice Exam – 2025 Updated

Prepare Smarter for the Generative AI Engineer Associate Exam with Our Free and Trusted Generative AI Engineer Associate Exam Questions – Updated for 2025.

At Cert Empire, we are focused on providing the most accurate and up-to-date exam questions for students preparing for the Databricks Generative AI Engineer Associate Exam. To help learners prepare more effectively, we’ve made parts of our Generative AI Engineer Associate exam resources free for everyone. You can practice as much as you want with the free Generative AI Engineer Associate Practice Test.

Databricks Generative-AI-Engineer-Associate Free Exam Questions

Disclaimer

Please note that the demo questions are not updated frequently, and you may also find them in open communities around the web. This demo is only meant to show the sort of questions you will find in our original files.

The premium exam dump files, however, are updated frequently and are based on the latest exam syllabus and real exam questions.

1 / 60

Which technique helps reduce hallucinations in LLMs running on Databricks?

2 / 60

Which Databricks feature is used to securely share AI model outputs and datasets across organizations?

3 / 60

Which Databricks feature can be used to track experiment results and hyperparameter tuning for AI models?

4 / 60

How does MLflow assist in LLM deployment in Databricks?

5 / 60

What is the role of Lakehouse architecture in Generative AI workloads?

6 / 60

How does Databricks optimize GPU usage for AI workloads?

7 / 60

What is the function of Feature Store in Databricks AI workflows?

8 / 60

Which technique can optimize LLM inference speed in Databricks?

9 / 60

How does MLflow facilitate model versioning in Databricks?

10 / 60

What is an advantage of using Apache Spark on Databricks for AI workloads?

11 / 60

Which Databricks component is crucial for storing and managing AI training datasets?

12 / 60

What is the key advantage of fine-tuning an LLM in Databricks instead of using a pre-trained model directly?

13 / 60

In an LLM-based chatbot built on Databricks, how can retrieval-augmented generation (RAG) improve responses?

14 / 60

What is the primary benefit of using Databricks AutoML for AI projects?

15 / 60

Which Databricks feature enables parameter tuning for AI models?

16 / 60

What role does Databricks Unity Catalog play in Generative AI workflows?

17 / 60

Which file format is commonly used in Databricks for efficient AI data storage and retrieval?

18 / 60

What is a key challenge when using Generative AI models on Databricks?

19 / 60

Which feature in Databricks allows you to monitor LLM performance metrics?

20 / 60

What is the purpose of the Databricks Model Serving feature in AI applications?

21 / 60

Which Databricks service is essential for managing LLM fine-tuning experiments?

22 / 60

What is the role of vector databases in Generative AI workflows on Databricks?

23 / 60

Which of the following is a key advantage of using Databricks for training large language models (LLMs)?

24 / 60

In Databricks, which feature is used to store and manage machine learning models, including LLMs?

25 / 60

What is the primary function of Databricks MosaicML in Generative AI?

26 / 60

A Generative AI Engineer is building a Generative AI system that suggests the best-matched team member for newly scoped projects. The team member is selected from a very large team. The match should be based on availability for the project dates and how well the employee profile matches the project scope. Both the employee profile and the project scope are unstructured text. How should the Generative AI Engineer architect their system?

27 / 60

A Generative AI Engineer is creating an LLM-powered application that will need access to up-to-date news articles and stock prices.

The design requires using stock prices stored in Delta tables and finding the latest relevant news articles by searching the internet.

How should the Generative AI Engineer architect their LLM system?

28 / 60

A Generative AI Engineer is designing a chatbot for a gaming company that aims to engage users on its platform while its users play online video games.

Which metric would help them increase user engagement and retention for their platform?

29 / 60

A company has a typical RAG-enabled, customer-facing chatbot on its website.

[Diagram: sequence of components in the RAG-enabled chatbot]
Select the correct sequence of components a user's questions will go through before the final output is returned. Use the diagram above for reference.

30 / 60

A team wants to serve a code generation model as an assistant for their software developers. It should support multiple programming languages. Quality is the primary objective.

Which of the Databricks Foundation Model APIs, or models available in the Marketplace, would be the best fit?

31 / 60

A Generative AI Engineer is building a RAG application that will rely on context retrieved from source documents that are currently in PDF format. These PDFs can contain both text and images. They want to develop a solution using the fewest lines of code.

Which Python package should be used to extract the text from the source documents?

32 / 60

A Generative AI Engineer received the following business requirements for an external chatbot.

The chatbot needs to identify the type of question a user asks and route it to the appropriate model to answer it. For example, one user might ask about upcoming event details, while another might ask about purchasing tickets for a particular event.

What is an ideal workflow for such a chatbot?

33 / 60

A Generative AI Engineer at an electronics company has just deployed a RAG application for customers to ask questions about products that the company carries. However, they received feedback that the RAG responses often return information about an irrelevant product.
What can the engineer do to improve the relevance of the RAG’s response?

34 / 60

A Generative AI Engineer is using the code below to test setting up a vector store:

[Code snippet: vector store setup using the Databricks Vector Search client]
Assuming they intend to use Databricks managed embeddings with the default embedding model, what should be the next logical function call?
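For orientation, here is a minimal sketch (not the exam’s code) of what vector store setup with Databricks managed embeddings typically looks like using the databricks-vectorsearch package; the endpoint, catalog, table, and column names are hypothetical.

from databricks.vector_search.client import VectorSearchClient

client = VectorSearchClient()

# Create (or reuse) a Vector Search endpoint.
client.create_endpoint(name="demo-endpoint", endpoint_type="STANDARD")

# With managed embeddings, the index is built from a source Delta table and an
# embedding_source_column; Databricks computes the embeddings with the chosen
# embedding model serving endpoint.
index = client.create_delta_sync_index(
    endpoint_name="demo-endpoint",            # hypothetical endpoint name
    index_name="main.default.docs_index",     # hypothetical index name
    source_table_name="main.default.docs",    # hypothetical source Delta table
    pipeline_type="TRIGGERED",
    primary_key="id",
    embedding_source_column="text",
    embedding_model_endpoint_name="databricks-bge-large-en",
)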

35 / 60

A Generative AI Engineer wants to build an LLM-based solution to help a restaurant improve its online customer experience with bookings by automatically handling common customer inquiries. The goal of the solution is to minimize escalations to human intervention and phone calls while maintaining a personalized interaction. To design the solution, the Generative AI Engineer needs to define the input data to the LLM and the task it should perform.
Which input/output pair will support their goal?

36 / 60

A Generative AI Engineer is developing an LLM application that users can use to generate personalized birthday poems based on their names.
Which technique would be most effective in safeguarding the application, given the potential for malicious user inputs?

37 / 60

A Generative AI Engineer is developing a patient-facing, healthcare-focused chatbot. If the patient’s question is not a medical emergency, the chatbot should solicit more information from the patient to pass to the doctor’s office and suggest a few relevant pre-approved medical articles for reading. If the patient’s question is urgent, it should direct the patient to call their local emergency services.
Given the following user input:
“I have been experiencing severe headaches and dizziness for the past two days.”
Which response is most appropriate for the chatbot to generate?

38 / 60

A Generative AI Engineer has developed an LLM application to answer questions about internal company policies. The Generative AI Engineer must ensure that the application doesn’t hallucinate or leak confidential data.
Which approach should NOT be used to mitigate hallucination or confidential data leakage?

39 / 60

A Generative AI Engineer would like an LLM to generate formatted JSON from emails. This will require parsing and extracting the following information: order ID, date, and sender email. Here’s a sample email:

[Sample email containing an order ID, a date, and the sender’s email address]
They will need to write a prompt that will extract the relevant information in JSON format with the highest level of output accuracy.
Which prompt will do that?
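As a purely illustrative sketch (not one of the exam’s answer choices), a structured-extraction prompt of this kind can be sent to a Databricks Foundation Model serving endpoint through the MLflow Deployments client; the endpoint name and email text below are placeholders.

from mlflow.deployments import get_deploy_client

client = get_deploy_client("databricks")

email_text = "..."  # placeholder for the sample email body from the question

# Ask for a JSON object with explicit keys and nothing else.
prompt = (
    "Extract the order ID, date, and sender email from the email below. "
    'Respond with only a JSON object using the keys "order_id", "date", '
    'and "sender_email".\n\n'
    f"Email:\n{email_text}"
)

response = client.predict(
    endpoint="databricks-meta-llama-3-1-70b-instruct",  # placeholder endpoint
    inputs={"messages": [{"role": "user", "content": prompt}]},
)
print(response["choices"][0]["message"]["content"])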

40 / 60

When developing an LLM application, it’s crucial to ensure that the data used for training the model complies with licensing requirements to avoid legal risks.
Which action is NOT appropriate to avoid legal risks?

41 / 60

A Generative AI Engineer interfaces with an LLM that has prompt/response behavior trained on customer calls inquiring about product availability. The LLM is designed to output “In Stock” if the product is available, or only the term “Out of Stock” if it is not.
Which prompt will allow the engineer to obtain the correct call classification labels?

42 / 60

A Generative AI Engineer has been asked to build an LLM-based question-answering application. The application should take into account new documents that are frequently published. The engineer wants to build this application with the least development effort and have it operate at the lowest cost possible.
Which combination of chaining components and configuration meets these requirements?

43 / 60

A small and cost-conscious startup in the cancer research field wants to build a RAG application using Foundation Model APIs.
Which strategy would allow the startup to build a good-quality RAG application while being cost-conscious and able to cater to customer needs?

44 / 60

A Generative AI Engineer is creating an LLM-based application. The documents for its retriever have been chunked to a maximum of 512 tokens each. The Generative AI Engineer knows that cost and latency are more important than quality for this application. They have several context length levels to choose from.
Which will fulfill their need?

45 / 60

A Generative AI Engineer has a provisioned throughput model serving endpoint as part of a RAG application and would like to monitor the serving endpoint’s incoming requests and outgoing responses. The current approach is to include a micro-service in between the endpoint and the user interface to write logs to a remote server.
Which Databricks feature should they use instead to perform the same task?

46 / 60

Ingest documents from a source –> Index the documents and save to Vector Search –> User submits queries against an LLM –> LLM retrieves relevant documents –> LLM generates a response –> Evaluate model –> Deploy it using Model Serving
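As an illustration of the retrieval and generation steps in this flow, the sketch below assumes an existing Vector Search index (with managed embeddings) and a Foundation Model serving endpoint; all endpoint, index, and column names are hypothetical.

from databricks.vector_search.client import VectorSearchClient
from mlflow.deployments import get_deploy_client

vsc = VectorSearchClient()
index = vsc.get_index(
    endpoint_name="demo-endpoint", index_name="main.default.docs_index"
)

question = "What is the refund policy?"

# Retrieve the chunks most relevant to the user's query.
hits = index.similarity_search(
    query_text=question, columns=["text"], num_results=3
)
context = "\n\n".join(row[0] for row in hits["result"]["data_array"])

# Generate a response grounded in the retrieved context.
llm = get_deploy_client("databricks")
answer = llm.predict(
    endpoint="databricks-meta-llama-3-1-70b-instruct",  # placeholder endpoint
    inputs={
        "messages": [
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ]
    },
)
print(answer["choices"][0]["message"]["content"])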

47 / 60

A Generative AI Engineer is designing a RAG application for answering user questions on technical regulations as they learn a new sport.
What are the steps needed to build this RAG application and deploy it?

48 / 60

Which indicator should be considered to evaluate the safety of the LLM outputs when qualitatively assessing LLM responses for a translation use case?

49 / 60

What is an effective method to preprocess prompts using custom code before sending them to an LLM?

50 / 60

A Generative AI Engineer is tasked with deploying an application that takes advantage of a custom MLflow Pyfunc model to return some interim results.
How should they configure the endpoint to pass the secrets and credentials?

51 / 60

A Generative AI Engineer is developing a chatbot designed to assist users with insurance-related queries. The chatbot is built on a large language model (LLM) and is conversational. However, to maintain the chatbot’s focus and to comply with company policy, it must not provide responses to questions about politics. Instead, when presented with political inquiries, the chatbot should respond with a standard message:
“Sorry, I cannot answer that. I am a chatbot that can only answer questions around insurance.”
Which framework type should be implemented to solve this?

52 / 60

A Generative AI Engineer has been asked to design an LLM-based application that accomplishes the following business objective: answer employee HR questions using HR PDF documentation.
Which set of high-level tasks should the Generative AI Engineer’s system perform?

53 / 60

A Generative AI Engineer is creating an agent-based LLM system for their favorite monster truck team. The system can answer text-based questions about the monster truck team, look up event dates via an API call, or query tables on the team’s latest standings.
How could the Generative AI Engineer best design these capabilities into their system?

54 / 60

A Generative AI Engineer is tasked with developing a RAG application that will help a small internal group of experts at their company answer specific questions, augmented by an internal knowledge base. They want the best possible quality in the answers, and neither latency nor throughput is a huge concern given that the user group is small and they’re willing to wait for the best answer. The topics are sensitive in nature and the data is highly confidential, so, due to regulatory requirements, none of the information is allowed to be transmitted to third parties.
Which model meets all the Generative AI Engineer’s needs in this situation?

55 / 60

A Generative AI Engineer is creating an LLM system that will retrieve news articles from the year 1918 that are related to a user’s query and summarize them. The engineer has noticed that the summaries are generated well but often also include an explanation of how the summary was generated, which is undesirable.
Which change could the Generative AI Engineer make to mitigate this issue?

56 / 60

What is the most suitable library for building a multi-step LLM-based workflow?

57 / 60

A Generative AI Engineer is responsible for developing a chatbot to enable their company’s internal HelpDesk Call Center team to more quickly find related tickets and provide resolutions. While creating the GenAI application work-breakdown tasks for this project, they realize they need to start planning which data sources (either Unity Catalog volumes or Delta tables) they could choose for this application. They have collected several candidate data sources for consideration:

call_rep_history: a Delta table with primary keys representative_id and call_id. This table is maintained to calculate representatives’ call resolution from the fields call_duration and call_start_time.

transcript Volume: a Unity Catalog Volume of all call recordings as *.wav files, along with text transcripts as *.txt files.

call_cust_history: a Delta table with primary keys customer_id and call_id. This table is maintained to calculate how much internal customers use the HelpDesk, to make sure that the charge-back model is consistent with actual service use.

call_detail: a Delta table that includes a snapshot of all call details, updated hourly. It includes root_cause and resolution fields, but those fields may be empty for calls that are still active.

maintenance_schedule: a Delta table that lists both HelpDesk application outages and planned upcoming maintenance downtimes.
They need sources that could add context to best identify ticket root cause and resolution.
Which TWO sources do that? (Choose two.)

58 / 60

A Generative AI Engineer is tasked with improving the RAG quality by addressing its inflammatory outputs.
Which action would be most effective in mitigating the problem of offensive text outputs?

59 / 60

A Generative AI Engineer is designing an LLM-powered live sports commentary platform. The platform provides real-time updates and LLM-generated analyses for any users who would like to have live summaries, rather than reading a series of potentially outdated news articles.
Which tool below will give the platform access to real-time data for generating game analyses based on the latest game scores?

60 / 60

A Generative AI Engineer just deployed an LLM application at a digital marketing company that assists with answering customer service inquiries.
Which metric should they monitor for their customer service LLM application in production?
