1. Barocas, S., & Selbst, A. D. (2016). Big Data's Disparate Impact. California Law Review, 104(3), 671-732. In the section on "Proxies," the authors explain how seemingly neutral attributes such as zip codes can act as proxies for protected class membership, so that a model produces proxy discrimination even when the protected attribute itself is excluded (p. 683). https://doi.org/10.15779/Z38BG31
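The proxy mechanism the entry describes can be illustrated with a small synthetic sketch (all names and numbers here are hypothetical, not from the cited paper): residential segregation makes zip code strongly correlated with group membership, so a decision rule that never sees the protected attribute still disadvantages one group.

```python
import random

random.seed(0)

def make_person():
    """Hypothetical population: group strongly predicts zip code."""
    group = random.choice(["A", "B"])
    if group == "A":
        zip_code = "94100" if random.random() < 0.9 else "94200"
    else:
        zip_code = "94200" if random.random() < 0.9 else "94100"
    return group, zip_code

people = [make_person() for _ in range(10_000)]

def approve(zip_code):
    """A 'neutral' rule that uses only zip code, never the group."""
    return zip_code == "94100"

counts = {"A": 0, "B": 0}
approvals = {"A": 0, "B": 0}
for group, z in people:
    counts[group] += 1
    approvals[group] += approve(z)
approval = {g: approvals[g] / counts[g] for g in "AB"}
print(approval)  # group A is approved far more often than group B
```

Even though the rule conditions only on zip code, the approval rates diverge sharply by group, which is exactly the proxy effect the authors analyze.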
2. Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K., & Galstyan, A. (2021). A Survey on Bias and Fairness in Machine Learning. ACM Computing Surveys, 54(6), Article 115. The section on sampling bias details how non-representative data samples, such as those drawn from a single geographic location, yield biased models that generalize poorly to populations outside the sample. https://doi.org/10.1145/3457607
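A minimal synthetic sketch of this sampling-bias failure mode (the regions, ranges, and the trivial majority-class "model" are illustrative assumptions, not from the survey): a model fit only on one region's data performs well there but collapses on another region whose feature distribution differs.

```python
import random

random.seed(1)

def sample(region, n):
    """The true rule (label = feature > 5) is shared across regions,
    but the regions cover different feature ranges."""
    data = []
    for _ in range(n):
        x = random.uniform(0, 6) if region == "north" else random.uniform(4, 10)
        data.append((x, x > 5))
    return data

train = sample("north", 2_000)       # biased sample: one region only
test_north = sample("north", 2_000)
test_south = sample("south", 2_000)

# "Model": predict the majority label observed in training.
majority = sum(y for _, y in train) >= len(train) / 2

def accuracy(data):
    return sum(majority == y for _, y in data) / len(data)

print(accuracy(test_north), accuracy(test_south))
```

In the north, positives are rare, so predicting the training majority scores well there yet fails badly in the south, where the label distribution is reversed: the model's apparent quality is an artifact of the unrepresentative sample.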
3. Stanford University. (n.d.). CS 182: Ethics, Public Policy, and Technological Change [Course materials]. The course materials discuss how biased datasets resulting from unrepresentative sampling are a primary source of algorithmic unfairness and discriminatory outcomes; proxy variables such as zip codes are a key topic in the lectures on algorithmic bias.