Free Practice Test

Free Certified Professional Data Scientist Practice Exam – 2025 Updated

Study Smarter for the Certified Professional Data Scientist Exam with Our Free and Trusted Certified Professional Data Scientist Exam Questions – Updated for 2025.

At Cert Empire, we are focused on delivering the most accurate and up-to-date exam questions for students preparing for the Databricks Certified Professional Data Scientist Exam. To help learners prepare more effectively, we’ve made parts of our Certified Professional Data Scientist exam resources free for everyone. You can practice as much as you want with the Free Certified Professional Data Scientist Practice Test.

Databricks Certified Professional Data Scientist Free Exam Questions

Disclaimer

Please note that the demo questions are not frequently updated. You may also find them in open communities around the web. This demo is only meant to show the sort of questions you may find in our original files.

The premium exam dump files, by contrast, are frequently updated and are based on the latest exam syllabus and real exam questions.

1 / 60

You recommend a movie with three stars but the user loves it (he'd rate it five stars). So which statement correctly applies?

2 / 60

A fruit may be considered to be an apple if it is red, round, and about 3" in diameter. A naive Bayes classifier considers each of these features to contribute independently to the probability that this fruit is an apple, regardless of the

3 / 60

Refer to image below

[Exhibit image not reproduced]

4 / 60

Select the choice where regression algorithms are not the best fit.

5 / 60

What describes a true limitation of the logistic regression method?

6 / 60

Stories appear on the front page of Digg as they are "voted up" (rated positively) by the community. As the community becomes larger and more diverse, the promoted stories can better reflect the average interest of the community members.

Which of the following techniques is used to build such a recommendation engine?

7 / 60

A data scientist wants to predict the probability of death from heart disease based on three risk factors: age, gender, and blood cholesterol level.

What is the most appropriate method for this project?

8 / 60

A researcher is interested in how variables such as GRE (Graduate Record Exam) scores, GPA (grade point average), and the prestige of the undergraduate institution affect admission into graduate school. The response variable, admit/don't admit, is a binary variable.

The above is an example of

9 / 60

A problem statement is given below:

Hospital records show that of patients suffering from a certain disease, 75% die of it.

What is the probability that of 6 randomly selected patients, 4 will recover?

Which of the following models will you use to solve it?
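
As a rough check on the arithmetic in the problem statement above, the sketch below treats each patient as an independent trial with a 25% chance of recovery and reads "4 will recover" as "exactly 4 recover". Both readings are assumptions, and the function name `binomial_pmf` is hypothetical.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# 75% of patients die, so the assumed per-patient recovery probability is 0.25.
p_recover = 1 - 0.75

# Assumption: "4 will recover" means exactly 4 of the 6 patients recover.
prob = binomial_pmf(4, 6, p_recover)
print(f"P(exactly 4 of 6 recover) = {prob:.4f}")
```

Under these assumptions the result is 135/4096, or roughly 0.033.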

10 / 60

Which of the following is a correct example of the target variable in regression (supervised learning)?

11 / 60

What is the probability that the total of two dice will be greater than 8, given that the first die is a 6?
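
One way to sanity-check a conditional probability like this is to enumerate the outcomes directly. The sketch below assumes two fair six-sided dice; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes for two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

# Condition on the first die showing a 6.
given = [(a, b) for a, b in outcomes if a == 6]

# Within that conditioned set, keep the outcomes whose total exceeds 8.
favourable = [(a, b) for a, b in given if a + b > 8]

prob = Fraction(len(favourable), len(given))
print(prob)
```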

12 / 60

In which phase of the data analytics lifecycle do Data Scientists spend the most time in a project?

13 / 60

You have data on 10,000 people who make purchases at a specific grocery store. The data also includes their income details. You have created 5 clusters using this data, but one of the clusters contains only 30 people, with values such as 30, 2400, 2600, 2700, 2270, etc.

What would you do in this case?

14 / 60

Under which circumstance do you need to implement N-fold cross-validation after creating a regression model?

15 / 60

Clustering is a type of unsupervised learning with the following goals

16 / 60

You have modeled a dataset with 5 independent variables called A, B, C, D, and E, whose relationships do not depend on each other. Variables A, B, and C are continuous, and variables D and E are discrete (mixed mode).

Now you have to compute the expected value of one variable, say A. Which of the following computations would you prefer?

17 / 60

In machine learning, feature hashing, also known as the hashing trick (by analogy to the kernel trick), is a fast and space-efficient way of vectorizing features (such as the words in a language), i.e., turning arbitrary features into indices in a vector or matrix. It works by applying a hash function to the features and using their hash values modulo the number of features as indices directly, rather than looking the indices up in an associative array.

So what is the primary reason for using the hashing trick when building classifiers?
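
The mechanism described above (hash each feature, then use the hash value modulo the vector size as an index) can be sketched as follows. This is a minimal illustration, not a production implementation: the function name `hash_features`, the choice of MD5 as the hash function, and the vector size of 16 are all assumptions made for the example.

```python
import hashlib


def hash_features(tokens, n_features=16):
    """Turn a list of string features into a fixed-size count vector."""
    vec = [0] * n_features
    for tok in tokens:
        # A stable hash (MD5 here, chosen only for illustration) is used so
        # that the same token always maps to the same index across runs.
        digest = hashlib.md5(tok.encode("utf-8")).hexdigest()
        index = int(digest, 16) % n_features
        # Distinct tokens may collide on the same index; counts then add up.
        vec[index] += 1
    return vec


tokens = "the quick brown fox jumps over the lazy dog".split()
vec = hash_features(tokens)
print(vec)
```

No dictionary from feature to index is built or stored: the vector length is fixed up front, and unseen words are handled the same way as known ones.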

18 / 60

Which technique would you use to solve the problem statement below? "What is the probability that an individual customer will not repay the loan amount?"

19 / 60

Refer to exhibit

[Exhibit: image of the regression results, not reproduced here]

You are asked to write a report on how specific variables impact your client's sales using a data set provided to you by the client. The data includes 15 variables that the client views as directly related to sales, and you are restricted to these variables only. After a preliminary analysis of the data, the following findings were made:

1. Multicollinearity is not an issue among the variables.
2. Only three variables (A, B, and C) have significant correlation with sales.

You build a linear regression model on the dependent variable of sales with the independent variables of A, B, and C. The results of the regression are seen in the exhibit. You cannot request additional data.

What is a way that you could try to increase the R² of the model without artificially inflating it?

20 / 60

RMSE measures error of a predicted

21 / 60

You are designing a recommendation engine for a website that can generate more personalized recommendations by analyzing information from the past activity of a specific user, or the history of other users deemed to be of similar taste to a given user. These resources are used for user profiling and help the site recommend content on a user-by-user basis. The more a given user makes use of the system, the better the recommendations become, as the system gains data to improve its model of that user.

What kind of recommendation engine is this?

22 / 60

Which is an example of supervised learning?

23 / 60

The method based on principal component analysis (PCA) evaluates the features according to

24 / 60

You are working on a problem where you have to predict whether a claim is valid or not. You find that most invalid claims have more spelling errors and corrections in the manually filled claim forms compared to honest claims.

Which of the following techniques is suitable for finding out whether a claim is valid or not?

25 / 60

Your customer provided you with 2,000 unlabeled records and asked you to separate them into three groups.

What is the correct analytical method to use?

26 / 60

Which of the following is not a correct application of classification?

27 / 60

What is the best way to evaluate the quality of the model found by an unsupervised algorithm like k-means clustering, given metrics for the cost of the clustering (how well it fits the data) and its stability (how similar the clusters are across multiple runs over the same data)?

28 / 60

Regularization is a very important technique in machine learning to prevent overfitting. Optimizing with an L1 regularization term is harder than with an L2 regularization term because

29 / 60

Which of the following techniques can be used in the design of recommender systems?

30 / 60

As a data science consultant at ABC Corp, you are working on a recommendation engine that suggests learning resources to end users.

Which recommender system technique benefits most from additional user preference data?

31 / 60

What is one modeling or descriptive statistical function in MADlib that is typically not provided in a standard relational database?

32 / 60

Consider flipping a coin for which the probability of heads is p, where p is unknown, and our goal is to estimate p. The obvious approach is to count how many times the coin came up heads and divide by the total number of coin flips. If we flip the coin 1000 times and it comes up heads 367 times, it is very reasonable to estimate p as approximately 0.367.

However, suppose we flip the coin only twice and we get heads both times. Is it reasonable to estimate p as 1.0? Intuitively, given that we only flipped the coin twice, it seems a bit rash to conclude that the coin will always come up heads, and ____________ is a way of avoiding such rash conclusions.

33 / 60

A bio-scientist is analyzing cancer cells. To identify whether a cell is cancerous or not, hundreds of tests have been run with small variations. Given the test results for a sample of healthy and cancerous cells, which of the following techniques will you use to determine whether a cell is healthy?

34 / 60

You are working on an email spam filtering assignment. While working on it, you find that a new word, e.g. HadoopExam, appears in an email. Your solution has never come across this word before, so the probability of this word appearing in either class of email could be zero.

Which of the following algorithms can help you avoid the zero probability?

35 / 60

Refer to Exhibit

[Exhibit: image not reproduced here]

In the exhibit, the x-axis represents the derived probability of a borrower defaulting on a loan. Also in the exhibit, the pink represents borrowers that are known to have not defaulted on their loan, and the blue represents borrowers that are known to have defaulted on their loan.

Which analytical method could produce the probabilities needed to build this exhibit?

36 / 60

The figure below shows a plot of the data of a data matrix M that is 1000 x 2.

[Figure: plot of the data matrix M, not reproduced here]

Which line represents the first principal component?

37 / 60

Logistic regression is a model used for prediction of the probability of occurrence of an event. It makes use of several variables that may be ...

38 / 60

You are working as a data science consultant for a gaming company. You have a three-member team, and all other stakeholders, such as the project managers, project sponsor, and data team, are from the company itself.

During a discussion, the project manager asks you when you will be able to say that the model you are using is robust enough. After which step can you answer this question?

39 / 60

In which of the following scenarios can you use regression to predict values?

40 / 60

If E1 and E2 are two events, how do you represent the conditional probability that E2 occurs given that E1 has occurred?

41 / 60

What type of output is generated by linear regression?

42 / 60

You are creating a regression model with the inputs income, education, and current debt of a customer. What could be a possible output from this model?

43 / 60

What are the advantages of the mutual information over the Pearson correlation for text classification problems?

44 / 60

You are analyzing data in order to build a classifier model. You discover non-linear data and discontinuities that will affect the model. Which analytical method would you recommend?

45 / 60

Which of the following is a continuous probability distribution?

46 / 60

Suppose there are three events. Which formula must always be equal to P(E1|E2,E3)?

47 / 60

What is the best way to ensure that the k-means algorithm will find a good clustering of a collection of vectors?

48 / 60

Which activity is performed in the Operationalize phase of the Data Analytics Lifecycle?

49 / 60

The feature hashing approach is described as follows: "SGD-based classifiers avoid the need to predetermine vector size by simply picking a reasonable size and shoehorning the training data into vectors of that size." Now, with large vectors or with multiple locations per feature in feature hashing?

50 / 60

Suppose A, B, and C are events. The probability of A given B, relative to P(·|C), is the same as the probability of A given B and C (relative to P). That is,

51 / 60

What is the considerable difference between L1 and L2 regularization?

52 / 60

While working with Netflix, the movie rating website, you have developed a recommender system that has produced rating predictions that are consistently exactly 1 higher for the user-item pairs in your dataset than the ratings given in the dataset. There are n terms in the dataset. What will be the calculated RMSE of your recommender system on the dataset?
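
The scenario above can be checked numerically. The sketch below uses the standard root-mean-square error definition on a small set of made-up ratings; the `actual` values and the helper name `rmse` are hypothetical and only serve to illustrate predictions that are always exactly 1 too high.

```python
from math import sqrt


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length sequences."""
    n = len(actual)
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)


# Hypothetical ratings, invented for this example.
actual = [3, 5, 2, 4, 1]

# Every prediction is exactly 1 higher than the true rating.
predicted = [a + 1 for a in actual]

print(rmse(predicted, actual))
```

Because every squared error equals 1, the mean of the squared errors is 1 regardless of n, and so is its square root.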

53 / 60

Which of the following are point estimation methods?

54 / 60

You are working in an ecommerce organization, where you are designing and evaluating a recommender system. Which of the following metrics will always have the largest value?

55 / 60

Marie is getting married tomorrow, at an outdoor ceremony in the desert. In recent years, it has rained only 5 days each year. Unfortunately, the weatherman has predicted rain for tomorrow. When it actually rains, the weatherman correctly forecasts rain 90% of the time. When it doesn't rain, he incorrectly forecasts rain 10% of the time. Which of the following will you use to calculate the probability that it will rain on the day of Marie's wedding?
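
For readers who want to see the numbers worked through, the sketch below combines the base rate of rain with the two forecast accuracy figures from the scenario by conditioning on the forecast. It assumes a 365-day year, so that "5 days each year" becomes a prior of 5/365; the variable names are illustrative.

```python
# Prior: it rains on 5 days out of an assumed 365-day year.
p_rain = 5 / 365
p_dry = 1 - p_rain

# Forecast behaviour, taken from the scenario.
p_forecast_given_rain = 0.9  # correct rain forecast when it rains
p_forecast_given_dry = 0.1   # false rain forecast when it stays dry

# Total probability that the weatherman forecasts rain.
p_forecast = p_forecast_given_rain * p_rain + p_forecast_given_dry * p_dry

# Probability of rain given that rain was forecast.
posterior = p_forecast_given_rain * p_rain / p_forecast
print(f"P(rain | forecast of rain) = {posterior:.3f}")
```

Under these assumptions the posterior works out to 1/9, or about 0.111: even with a rain forecast, rain remains unlikely because the base rate is so low.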

56 / 60

Refer to the exhibit.

[Exhibit: image listing four variables with their info-gain values, not reproduced here]

You are building a decision tree. In this exhibit, four variables are listed with their respective values of info-gain.

Based on this information, on which attribute would you expect the next split to be in the decision tree?

57 / 60

You are working in a data analytics company as a data scientist. You have been given data on the various types of pizza available across premium food centers in a country. This data is given as numeric values such as Calorie, Size, and Sale per day. You need to group all the pizzas with similar properties. Which of the following techniques would you use for that?

58 / 60

You are asked to create a model to predict the total number of monthly subscribers for a specific magazine. You are provided with 1 year's worth of subscription and payment data, user demographic data, and 10 years' worth of content of the magazine (articles and pictures). Which algorithm is the most appropriate for building a predictive model for subscribers?

59 / 60

Which of the following best describes principal component analysis?

60 / 60

RMSE is a good measure of accuracy, but only to compare forecasting errors of different models for a ______, as it is scale-dependent.
