Q: 14
Identify how the count_if function and the count where x is null can be used
Consider a table random_values with below data.
What would be the output of below query?
select count_if(col > 1) as count_
a. count(*) as count_b.count(col1) as count_c from random_values col1
0
1
2
NULL -
2
3
Options
Discussion
C/D? I feel like it's one of those, mainly because count_if can be tricky if the example data includes NULLs. My guess is D since usually count(*) covers all and count(col1) skips NULLs, but not totally sure if there are three or four hits for col > 1. If I'm missing a detail in the sample table let me know.
D imo
Its A. The trap here is thinking count_if counts NULLs, but it doesn't-it only counts where the condition is true. Also, count(*) includes nulls, while count(col1) skips those. So if there are three values >1, six total rows, and five non-null col1 entries, A matches up. Pretty sure this lines up based on Databricks docs but happy to hear other takes.
B or D. I thought count_if(col > 1) would give 4 because I assumed there are four values greater than 1 in the example data, not three. The count(*) and count(col1) totals make sense for 6 and 5, but that first number keeps tripping me up. Maybe I'm missing a NULL trap in the sample? Let me know if anyone reads it differently.
Is this in the official practice set or only in the exam guide? Want to check details.
Option D for me, since count_if(col > 1) should count rows where the value is above 1, and I figured that plus standard counts would match up. It looks like 4 for the first one fits if there are four values greater than 1. Not totally sure on the rest but D seems close.
Be respectful. No spam.