Prepare smarter for your AIF-C01 exam with our free, accurate, and 2025-updated questions.
At Cert Empire, we are committed to providing the best and most current exam questions to students preparing for the AWS AIF-C01 exam. To help students prepare, we have made sections of our AIF-C01 exam preparation resources free for all. You can practice as much as you like with the free AIF-C01 practice test.
Question 1
1. AWS Documentation: The Amazon Bedrock User Guide explains that fine-tuning adapts a model for specific tasks or domains. It states, "Fine-tuning is the process of taking a pre-trained foundation model (FM) and further training it on your own dataset... to make it more specialized for your specific application."
Source: Amazon Bedrock User Guide, "Custom models," section on "Fine-tuning."
2. University Courseware: Stanford University's course on Large Language Models distinguishes between in-context learning (prompting) and fine-tuning. Fine-tuning modifies the model's weights to specialize it, which is necessary when the task requires deep domain knowledge that cannot be conveyed in a few examples.
Source: Stanford University, CS324: Large Language Models, Winter 2022, Lecture 3: "Capabilities," section on "Adaptation."
3. Academic Publication: A foundational paper on language models explains that fine-tuning is a critical step for adapting large pre-trained models to specific downstream tasks or domains, which significantly improves performance over using the base model alone.
Source: Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 4171–4186. Section 4: "Experiments." (https://doi.org/10.18653/v1/N19-1423)
Question 2
1. Guardrails for Amazon Bedrock: The official AWS documentation states, "With Guardrails for Amazon Bedrock, you can... configure a set of policies to safeguard your generative AI applications... You can create multiple guardrails, each with a different combination of policies. The policies in a guardrail include... Content filters to filter harmful content... [and] Denied topics to avoid unwanted topics."
Source: AWS Documentation, "Guardrails for Amazon Bedrock," Introduction.
2. Monitoring Guardrails with CloudWatch: The documentation further explains the notification mechanism: "Amazon Bedrock integrates with Amazon CloudWatch Events to notify you of interventions by a guardrail... You can create rules in CloudWatch Events that trigger programmatic actions in response to an event."
Source: AWS Documentation, "Monitor Guardrails for Amazon Bedrock."
3. Amazon Macie Functionality: "Amazon Macie is a data security service that discovers sensitive data by using machine learning and pattern matching... Macie automatically detects a large and growing list of sensitive data types, including personally identifiable information (PII)... in your Amazon S3 buckets."
Source: AWS Documentation, "What is Amazon Macie?"
4. AWS CloudTrail Functionality: "AWS CloudTrail is an AWS service that helps you enable operational and risk auditing, governance, and compliance of your AWS account. Actions taken by a user, role, or an AWS service are recorded as events in CloudTrail."
Source: AWS Documentation, "What Is AWS CloudTrail?"
Question 3
1. Vendor Documentation: Google Cloud. (2024). Introduction to prompt design. Vertex AI Documentation. In the section "Prompt types," the "Persona prompt" is described as a way to assign a role to the model (e.g., "You are an expert in...") to tailor its response style, which aligns directly with the proposed solution.
2. Academic Publication: Wei, J., et al. (2022). Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. Advances in Neural Information Processing Systems 35. This paper defines Chain-of-Thought (CoT) as a method for solving arithmetic, commonsense, and symbolic reasoning problems (Section 2), distinguishing its purpose from stylistic control.
3. University Courseware: Jurafsky, D., & Manning, C. (2023). Lecture 10: Prompting, Instruction-Tuning, and RLHF. Stanford University, CS224N: Natural Language Processing with Deep Learning. The lecture discusses prompting as a low-effort way to guide model behavior without updating model weights, contrasting it with the higher effort of fine-tuning (instruction-tuning). Assigning a persona is a fundamental prompting technique.
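As a minimal sketch of the persona technique these references describe, the snippet below places a role statement ahead of the task so that it frames the rest of the prompt. The function name `build_persona_prompt` and the template wording are illustrative assumptions, not part of any vendor SDK.

```python
def build_persona_prompt(persona: str, task: str) -> str:
    """Return a prompt that assigns a role to the model before stating the task."""
    # The role statement comes first so it frames everything that follows.
    return f"You are {persona}.\n\n{task}"


prompt = build_persona_prompt(
    "an expert financial analyst who writes in a formal tone",
    "Summarize the quarterly report in three sentences.",
)
print(prompt)
```

Because only the prompt text changes, no model weights are updated, which is what makes this a lower-effort option than fine-tuning.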
Question 4
1. Dinan, E., et al. (2020). Multi-dimensional Gender Bias Classification. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). This paper introduces a benchmark dataset specifically for evaluating gender bias, illustrating the role of such datasets. (See Section 3: "A New Dataset for Gender Bias Classification", pp. 2-4). DOI: https://doi.org/10.18653/v1/2020.emnlp-main.391
2. Mehrabi, N., et al. (2021). A Survey on Bias and Fairness in Machine Learning. ACM Computing Surveys, 54(6), 1-35. This survey discusses evaluation methodologies, highlighting the use of benchmark datasets as a primary tool for auditing and quantifying bias in models. (See Section 4: "BIAS MITIGATION"). DOI: https://doi.org/10.1145/3457607
3. Stanford University. (2023). CS224N: Natural Language Processing with Deep Learning. Course materials frequently emphasize the use of standardized benchmark datasets (e.g., GLUE, SQuAD) for model evaluation to ensure comparability and reproducibility, a principle that extends to fairness and bias evaluation. (See Lecture on "Model Evaluation").
Question 5
1. University Courseware:
Stanford University, CS231n: Convolutional Neural Networks for Visual Recognition, Module 1, "Setting up the data and the model". The course notes explain the necessity of a test set: "Finally, after the best hyperparameters are found, we evaluate the best model on the test set to get a measurement of how well the model is expected to perform on new data." This test set is a form of benchmark dataset.
2. Official Vendor Documentation:
Amazon Web Services (AWS), Amazon SageMaker Developer Guide, "Evaluate a Model". The documentation states: "After you have trained a model, you need to evaluate it to get an estimation of its quality on new data... by comparing the predictions that the model makes with the ground truth labels from a labeled test dataset." This directly supports using a benchmark dataset to measure accuracy.
3. Peer-reviewed Academic Publications:
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Fei-Fei, L. (2009). ImageNet: A large-scale hierarchical image database. 2009 IEEE Conference on Computer Vision and Pattern Recognition, 248-255. This paper introduced the ImageNet dataset, which became a fundamental benchmark for evaluating the performance of image classification models. The entire premise of the work is to provide a standardized, labeled dataset for robust model evaluation. (DOI: https://doi.org/10.1109/CVPR.2009.5206848)
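The evaluation step these references describe, comparing predictions against ground-truth labels from a held-out labeled dataset, can be sketched as follows. The labels and predictions are made-up illustration data, and `accuracy` is a hypothetical helper rather than a SageMaker API.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the ground-truth labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty dataset")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical held-out benchmark labels and the model's predictions for them.
test_labels = ["cat", "dog", "dog", "cat", "bird"]
model_predictions = ["cat", "dog", "cat", "cat", "bird"]

test_accuracy = accuracy(model_predictions, test_labels)
print(f"{test_accuracy:.2f}")  # prints 0.80 (4 of 5 correct)
```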
Question 6
1. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention Is All You Need. In Advances in Neural Information Processing Systems 30 (NIPS 2017). Section 1, "Introduction," describes the transformer model's suitability for transduction tasks, which include translation between languages (like English to SQL). Available from: https://proceedings.neurips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf
2. Stanford University. (2023). CS224N: NLP with Deep Learning, Lecture 11: Transformers and Pretraining. This lecture material discusses how transformer-based models like GPT are pre-trained on vast text and code corpora, enabling them to perform tasks like code generation from natural language prompts.
3. He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 770-778. The abstract and introduction clearly state that ResNets were developed to address challenges in training very deep networks for image recognition. DOI: 10.1109/CVPR.2016.90
4. Stanford University. (2022). CS229: Machine Learning, Course Notes: Support Vector Machines. Section 1, "Margins, Intuition," describes SVMs as a method for finding an optimal separating hyperplane for classification tasks.
5. van den Oord, A., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., & Kavukcuoglu, K. (2016). WaveNet: A Generative Model for Raw Audio. The abstract explicitly states, "This paper introduces WaveNet, a deep neural network for generating raw audio waveforms." Available from: https://arxiv.org/pdf/1609.03499.pdf
Question 7
1. Google Cloud AI Platform Documentation: In the documentation for Model Monitoring, "latency" (response time) is listed as a primary performance metric for prediction nodes. It is defined as the "distribution of the amount of time, in seconds, that it takes for AI Platform Prediction to return a prediction."
Source: Google Cloud. "Understanding model monitoring." Vertex AI Documentation. Accessed October 2023. (Specifically, see the table of metrics under the "Drift detection" or "Performance monitoring" sections).
2. AWS SageMaker Documentation: The official documentation for monitoring SageMaker endpoints lists ModelLatency as a key invocation metric. This metric is defined as "the time elapsed, in microseconds, from when a request enters the container until the container is ready to return a response." This directly corresponds to response time.
Source: Amazon Web Services. "Monitor Amazon SageMaker with Amazon CloudWatch." Amazon SageMaker Developer Guide, section on "SageMaker Endpoint Invocation Metrics."
3. University Courseware (Stanford): In Stanford's course on Machine Learning Systems Design (CS 329S), lecture materials on "Model Serving" emphasize latency (response time) and throughput as the two main performance metrics for a deployed model. Latency is critical for user-facing applications.
Source: Chip Huyen. "CS 329S: Machine Learning Systems Design, Lecture 7: Model Serving." Stanford University, Winter 2021, Slides 11-14.
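To make the latency metric concrete, here is a rough sketch that times repeated calls to a prediction function and reports the median and 95th-percentile response time. `fake_predict` is a hypothetical stand-in for a deployed model; a real endpoint would be measured with the provider's own monitoring (for example the ModelLatency metric cited above) rather than client-side timers alone.

```python
import statistics
import time


def fake_predict(x):
    """Hypothetical stand-in for a call to a deployed model."""
    return x * 2


def measure_latencies(predict, inputs):
    """Return the wall-clock time in seconds taken by each predict() call."""
    latencies = []
    for item in inputs:
        start = time.perf_counter()
        predict(item)
        latencies.append(time.perf_counter() - start)
    return latencies


latencies = measure_latencies(fake_predict, range(200))

# quantiles(n=100) returns 99 cut points; index 49 is p50, index 94 is p95.
cuts = statistics.quantiles(latencies, n=100)
p50, p95 = cuts[49], cuts[94]
print(f"p50={p50:.6f}s p95={p95:.6f}s")
```

Reporting a high percentile alongside the median is a common choice for user-facing applications, since a small share of slow responses can dominate the perceived experience.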
Question 8
1. Gururangan, S., Marasović, A., Swayamdipta, S., Lo, K., Beltagy, I., Downey, D., & Smith, N. A. (2020). Don't Stop Pretraining: Adapt Language Models to Domains and Tasks. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (pp. 8342–8356). The paper's central thesis, summarized in the abstract and demonstrated in Section 4 ("Results"), is that "pretraining on data from the target domain (domain-adaptive pretraining) leads to performance gains." (DOI: https://doi.org/10.18653/v1/2020.acl-main.740)
2. Bommasani, R., et al. (2021). On the Opportunities and Risks of Foundation Models. Stanford University Center for Research on Foundation Models (CRFM). In Section 4.2.1 ("Adaptation"), the report discusses methods for adapting FMs, stating, "The goal of adaptation is to steer the behavior of a foundation model to better perform a desired downstream task." Continued pre-training is a key method to achieve this performance improvement. (Page 63).
3. Stanford University. (2023). CS224N: NLP with Deep Learning, Winter 2023 Lecture 12: Pretraining and Transfer Learning. The lecture notes explain that continued pretraining on a domain-specific corpus before fine-tuning helps the model learn the specific statistics and vocabulary of that domain, which improves final task performance.
Question 9
1. Official Vendor Documentation: Amazon Web Services (AWS) documentation for "Amazon Titan Multimodal Embeddings" explicitly states its primary use case: "By converting images and short text into numerical representations (known as embeddings), the model supports a wide variety of multimodal search, recommendation, and ranking tasks." This directly aligns with the question's scenario. (Source: AWS Documentation, Amazon Bedrock, "Amazon Titan models").
2. Peer-Reviewed Academic Publication: The foundational paper on CLIP, a model that creates a joint embedding space for images and text, describes its utility for retrieval tasks. The model learns a "multi-modal embedding space" to perform tasks like zero-shot image retrieval from text queries. (Source: Radford, A., et al. (2021). Learning Transferable Visual Models From Natural Language Supervision. Proceedings of the 38th International Conference on Machine Learning, PMLR 139:8748-8763. Section 3.1.3).
3. University Courseware: Stanford University's course CS231n discusses models that create joint embeddings for vision and language. These models are designed to map images and text to a shared vector space, enabling tasks like retrieving images based on text descriptions (and vice-versa), which is a form of multi-modal search. (Source: Stanford University, CS231n: Convolutional Neural Networks for Visual Recognition, Spring 2023, Lecture 11: "Vision + Language").
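The shared-vector-space idea behind multimodal search can be illustrated with a toy example: once images and a text query are mapped to embeddings, retrieval reduces to ranking by similarity. The three-dimensional vectors below are hand-written placeholders, not the output of Titan, CLIP, or any real model, and cosine similarity is assumed as the ranking measure.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up image embeddings; a real system would obtain these from an embedding model.
image_embeddings = {
    "beach.jpg": [0.9, 0.1, 0.0],
    "forest.jpg": [0.1, 0.9, 0.1],
    "city.jpg": [0.0, 0.2, 0.9],
}

# Made-up embedding standing in for a text query such as "sunny coastline".
query_embedding = [0.8, 0.2, 0.1]

# Rank images from most to least similar to the query.
ranked = sorted(
    image_embeddings,
    key=lambda name: cosine_similarity(query_embedding, image_embeddings[name]),
    reverse=True,
)
print(ranked)  # ['beach.jpg', 'forest.jpg', 'city.jpg']
```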
Question 10
1. Amazon Bedrock User Guide, "Provisioned Throughput": The documentation explicitly states, "To use your custom models for inference, you must purchase Provisioned Throughput for them. You can't use custom models with the On-Demand throughput mode." This confirms that purchasing Provisioned Throughput is a required action. (Source: AWS Official Documentation).
2. Amazon Bedrock User Guide, "Custom models": The section on using custom models details the workflow, which involves fine-tuning or importing a model, followed by purchasing Provisioned Throughput to make it available for inference. The guide does not mention deploying to a SageMaker endpoint as a step for using the model within Bedrock. (Source: AWS Official Documentation).
Question 11
1. Papineni, K., Roukos, S., Ward, T., & Zhu, W. J. (2002). Bleu: a Method for Automatic Evaluation of Machine Translation. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, 311–318. In Section 2, "The Bleu Metric," the paper states, "The closer a machine translation is to a professional human translation, the better it is. This is the central idea behind our work." DOI: https://doi.org/10.3115/1073083.1073135
2. Jurafsky, D., & Martin, J. H. (2023). Speech and Language Processing (3rd ed. draft). Stanford University. In Chapter 9, "Machine Translation and Encoder-Decoder Models," Section 9.5, "Evaluation of Machine Translation," the text introduces BLEU as the "dominant metric" for MT evaluation.
3. Manning, C., & Jurafsky, D. (2021). CS224N: Natural Language Processing with Deep Learning, Lecture 8: Machine Translation, Seq2seq, and Attention. Stanford University. The lecture notes state, "BLEU (Bilingual Evaluation Understudy) is a popular metric for MT [Machine Translation] evaluation." (Slide 10).
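BLEU's core building block is the modified (clipped) n-gram precision, which the sketch below computes for a tokenized candidate against one or more references. This is only one component of the metric: full BLEU also combines several n-gram orders and applies a brevity penalty, both omitted here. The function name `modified_precision` is an illustrative choice.

```python
from collections import Counter


def modified_precision(candidate, references, n=1):
    """Clipped n-gram precision of a tokenized candidate against references."""

    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    candidate_counts = ngrams(candidate)
    if not candidate_counts:
        return 0.0

    # For each n-gram, the largest count observed in any single reference.
    max_reference_counts = Counter()
    for reference in references:
        for gram, count in ngrams(reference).items():
            max_reference_counts[gram] = max(max_reference_counts[gram], count)

    # Clip each candidate count so repeated words are not over-credited.
    clipped = sum(
        min(count, max_reference_counts[gram])
        for gram, count in candidate_counts.items()
    )
    return clipped / sum(candidate_counts.values())


candidate = "the the the the the the the".split()
reference = "the cat is on the mat".split()
score = modified_precision(candidate, [reference])
print(round(score, 4))  # 0.2857, i.e. 2/7: "the" appears only twice in the reference
```

Clipping is what stops a degenerate candidate that repeats a common word from scoring highly, which plain unigram precision would allow.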
Question 12
1. AWS Bedrock Developer Guide - Security in Amazon Bedrock: "AWS Identity and Access Management (IAM) is an AWS service that helps an administrator securely control access to AWS resources... We recommend that you grant only the permissions that are required to perform a task (principle of least privilege). You can do this by defining an IAM policy that grants permissions for specific resources and conditions." (See the "Identity and access management for Amazon Bedrock" section).
2. AWS Bedrock Developer Guide - Guardrails for Amazon Bedrock: This feature demonstrates the importance of controlling user inputs (prompts) and model outputs. "You can create guardrails to implement safeguards that are customized to your applications and aligned with your responsible AI policies. Guardrails can be applied to all large language models (LLMs) on Amazon Bedrock". This aligns with the principle of securing model interactions, which starts with prompt design.
3. Perez, F., & Ribeiro, I. (2022). Ignore Previous Prompt: Attack Techniques For Language Models. This paper details prompt injection vulnerabilities. Section 3, "Attack Techniques," demonstrates how malicious prompts can hijack language models. This underscores the security importance of designing and validating prompts to prevent such attacks. (Available via arXiv:2211.09527, often cited in university AI security coursework).
Question 13
1. Sutton, R. S., & Barto, A. G. (2018). Reinforcement Learning: An Introduction. MIT Press. In Section 1.1, the text defines reinforcement learning as learning "what to do - how to map situations to actions - so as to maximize a numerical reward signal." This directly maps to the chatbot learning from feedback.
2. Jurafsky, D., & Martin, J. H. (2023). Speech and Language Processing (3rd ed. draft). In Chapter 25, Section 25.6, "Reinforcement Learning for Dialogue Systems," it is explained that RL is used to learn a dialogue policy, where the policy is improved by rewarding the system for successful dialogues (e.g., task completion or user satisfaction).
3. Li, J., Monroe, W., Ritter, A., Galley, M., Gao, J., & Jurafsky, D. (2016). Deep Reinforcement Learning for Dialogue Generation. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. The paper's abstract states, "We use deep reinforcement learning to model future reward in a chatbot dialogue." This demonstrates the direct application of RL for improving conversational agents based on interaction outcomes. (https://doi.org/10.18653/v1/D16-1127)
Question 14
1. AWS Well-Architected Framework - Security Pillar (July 31, 2023): This whitepaper outlines key security principles. The "Detective Controls" section (p. 27) details the importance of monitoring and threat detection. The "Data Protection" section (p. 31) emphasizes classifying and protecting data through encryption and access control, which are fundamental to compliance.
2. AWS SageMaker Developer Guide: The "Security in Amazon SageMaker" chapter details compliance and data protection. The "Data Protection in Amazon SageMaker" section explicitly covers encryption at rest and in transit. The "Logging and Monitoring in Amazon SageMaker" section describes using AWS CloudTrail and Amazon CloudWatch for auditing and threat analysis.
3. AWS Compliance Programs: The official AWS Compliance page lists various frameworks (e.g., SOC, PCI DSS, HIPAA). The services covered under these programs, including SageMaker, provide the underlying controls for data protection and security monitoring (threat detection) that customers use to achieve their own compliance. For example, the "AWS Services in Scope by Compliance Program" page confirms SageMaker's eligibility for these frameworks.
Question 15
1. Amazon Comprehend: According to the official documentation, "Amazon Comprehend uses natural language processing (NLP) to extract insights about the content of documents... The sentiment analysis operation determines the overall sentiment of a text (Positive, Negative, Neutral, or Mixed)."
Source: AWS Documentation, "Amazon Comprehend Developer Guide," section: "Sentiment analysis."
2. Amazon Bedrock: The documentation states, "Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs)..." These FMs, such as Amazon Titan Text and Anthropic's Claude, are designed for a wide range of NLP tasks, including text classification and sentiment analysis.
Source: AWS Documentation, "Amazon Bedrock User Guide," section: "What is Amazon Bedrock?" and "Foundation models."
3. Amazon Lex (Incorrect): The official guide states, "Amazon Lex is an AWS service for building conversational interfaces into any application using voice and text." This confirms its purpose is for interaction, not bulk text analysis.
Source: AWS Documentation, "Amazon Lex V2 Developer Guide," section: "What Is Amazon Lex?".
Question 16
1. Vanderbilt University, "A Guide to Prompt Engineering for Generative AI." This guide explains that prompts are the primary tool for controlling AI output. It states, "You can specify the tone, style, and format of the response. For example, you could ask the AI to write in a formal, informal, humorous, or serious tone." This directly supports using prompts to align with a brand voice. (Accessed from the Vanderbilt University Digital Commons, Prompt Engineering Guide, Section: "Crafting Effective Prompts").
2. White, J., et al. (2023). "A Prompt Pattern Catalog to Enhance Prompt Engineering with ChatGPT." This peer-reviewed paper introduces the "Persona Pattern," where the prompt instructs the model to act as a specific character or entity. The paper notes, "The Persona Pattern is used to assign a role to the model... This can be useful for generating text that is in a particular style or from a particular point of view," which is precisely what is needed to match a company's brand voice. (ArXiv:2302.11382, Section 3.1: The Persona Pattern).
3. Stanford University, Human-Centered Artificial Intelligence (HAI). (2023). "Generative AI for Digital Humanities: A Coursebook." In the section on prompt engineering, the coursebook details how providing context and examples within the prompt (in-context learning) allows users to steer the model's output. It emphasizes that "the quality of the output is highly dependent on the quality of the prompt," reinforcing that prompt creation is the key to achieving specific generative goals. (Chapter 2: "Prompt Engineering").
Question 17
1. Stanford University, Office of Community Standards. The Honor Code and Fundamental Standard: An Interpretation for the AI Era. This document clarifies that submitting work generated by an AI tool without permission or proper citation is a violation of the Honor Code, framing it as a form of plagiarism. It states, "Unless a faculty member has stated otherwise, students should assume that the use of an AI tool to complete any part of an assignment is a violation of the Honor Code." (See Section: "AI and the Honor Code").
2. AWS Acceptable Use Policy. This policy, which governs the use of AWS services including generative AI, explicitly prohibits activities that constitute academic dishonesty. The policy states users may not use the services for any illegal, harmful, or fraudulent activity, which includes "offering or obtaining services that are fraudulent in nature, such as... academic dishonesty services." (Retrieved from AWS site, Section: "No Illegal, Harmful, or Offensive Use or Content").
3. Sullivan, M., Kelly, A., & McLaughlan, P. (2023). ChatGPT in higher education: Considerations for academic integrity and student learning. TechTrends, 67, 1214–1221. This academic paper discusses the challenges generative AI poses to academic integrity, stating, "The most immediate and obvious concern for academic integrity is that students will use ChatGPT to write their essays and other assignments for them... this would constitute a form of plagiarism or cheating." (Page 1215, Section: "Academic Integrity"). https://doi.org/10.1007/s11528-023-00853-0
Question 18
1. Google Cloud. (2023). Introduction to prompt design. Vertex AI Documentation. Retrieved from https://cloud.google.com/vertex-ai/docs/generative-ai/learn/prompt-design.
Reference Details: In the "Prompt components" section, it states, "Prompts can include instructions... You can also use prompts to give the model a persona." This directly supports using prompts to define a specific tone or character for the model's output.
2. Amazon Web Services. (2023). Prompt engineering guidelines. Amazon Bedrock User Guide.
Reference Details: The guide explains that a well-designed prompt provides context and instructions to guide the model's response. The section on "Prompt components" highlights the importance of the "Instruction" part of a prompt, stating it is "A specific task or instruction you want the model to perform," which includes adopting a specific persona or tone.
3. Potts, C. (2023). Lecture 5: Capabilities. CS324: Large Language Models, Stanford University.
Reference Details: The lecture notes discuss "In-context learning," where the model's behavior is steered by the examples and instructions provided in the prompt. This is the core principle behind refining a prompt to achieve a desired output style, such as a specific company tone.
Question 19
1. Google. (n.d.). Responsible AI Practices. In the "Identify types of bias" section, Sampling bias is defined: "occurs when a dataset doesn't reflect the realities of the environment in which a model will run. This can happen if data is collected in a way that over-represents or under-represents certain groups or characteristics." This directly aligns with the scenario.
2. Barocas, S., Hardt, M., & Narayanan, A. (2019). Fairness and machine learning: Limitations and opportunities. In Chapter 2, "Sources of Bias," the authors detail how the composition of a dataset, if not representative of the broader population (i.e., sampling bias), is a primary source of allocative and quality-of-service harms, where a system unfairly disadvantages certain groups.
3. Amazon Web Services. (2023). Amazon SageMaker Developer Guide. In the section "Fairness and Explainability with SageMaker Clarify," the documentation discusses pre-training bias metrics. It explains how imbalances in the training data, where certain facets or groups are under- or over-represented, must be identified and addressed to prevent biased model outcomes. This describes the core problem of sampling bias.
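One simple way to spot the over- and under-representation these references describe is to compare each group's share of the training data with its share of the population the model will serve. The sketch below does that with made-up group labels and an assumed 50/50 population split. It is a rough illustration only and does not reproduce any specific SageMaker Clarify metric.

```python
from collections import Counter


def representation_gap(samples, population_share):
    """Dataset share minus expected population share for each group.

    Positive values mean the group is over-represented in the dataset,
    negative values mean it is under-represented.
    """
    counts = Counter(samples)
    total = len(samples)
    return {
        group: counts[group] / total - share
        for group, share in population_share.items()
    }


# Hypothetical training data: 80 records from group A, 20 from group B.
training_groups = ["A"] * 80 + ["B"] * 20
# Assumed real-world split between the two groups.
population_share = {"A": 0.5, "B": 0.5}

gaps = representation_gap(training_groups, population_share)
print({group: round(gap, 2) for group, gap in gaps.items()})
```

A large gap does not prove the resulting model is biased, but it flags where the data collection should be reviewed before training.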
Question 20
1. Amazon Web Services (AWS) Documentation: The Amazon Bedrock User Guide describes prompt engineering techniques. It states, "Few-shot prompting – You provide a few examples in the prompt that demonstrate the format and content that you expect in the model response. Use few-shot prompting when the model needs examples to understand the nature of the task." This directly supports providing examples for a classification task.
Source: Amazon Bedrock User Guide, Section: "Prompt engineering guidelines".
2. Academic Publication: The foundational paper on GPT-3 introduced the concept of in-context learning. The authors demonstrate that providing a few examples in the prompt (few-shot) dramatically improves model performance on downstream tasks like sentiment analysis, compared to providing no examples (zero-shot).
Source: Brown, T., et al. (2020). Language Models are Few-Shot Learners. Advances in Neural Information Processing Systems, 33, 1877-1901. (Section 3, "In-Context Learning").
3. University Courseware: Stanford University's course on Natural Language Processing explains that providing demonstrations or examples within the prompt is a key technique for steering model behavior for specific tasks. This method, often called in-context learning, helps the model understand the desired output format and task constraints.
Source: Stanford University, CS224N: Natural Language Processing with Deep Learning, Lecture Notes on "Prompting and In-Context Learning".
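A few-shot prompt for a classification task can be assembled as in the sketch below: an instruction, a handful of labeled demonstrations, and then the unlabeled input with the label left blank for the model to fill in. The helper name `build_few_shot_prompt`, the "Review:" / "Sentiment:" layout, and the sample reviews are illustrative assumptions, not a format prescribed by the cited guides.

```python
def build_few_shot_prompt(examples, query):
    """Build a sentiment-classification prompt from (text, label) examples."""
    lines = ["Classify the sentiment of each review as Positive or Negative.", ""]
    # Each demonstration shows the model both the input and the expected output format.
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # The final item repeats the pattern but leaves the label for the model to complete.
    lines.append(f"Review: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)


examples = [
    ("The battery lasts all day and charges quickly.", "Positive"),
    ("The screen cracked within a week.", "Negative"),
]
prompt = build_few_shot_prompt(examples, "Setup was painless and support was helpful.")
print(prompt)
```

Passing an empty `examples` list yields the zero-shot version of the same prompt, which makes it easy to compare the two approaches on the same inputs.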
Question 21
1. Amazon SageMaker Developer Guide: "Amazon SageMaker JumpStart helps you quickly and easily get started with machine learning. JumpStart provides one-click deployment and fine-tuning of a wide variety of pre-trained models from popular model hubs, including foundation models."
Source: AWS Documentation, Amazon SageMaker Developer Guide, "Amazon SageMaker JumpStart".
2. Amazon SageMaker Developer Guide: "To control access to your models, we recommend that you configure your SageMaker to use a private VPC... When you use a private VPC, you can configure it so that your model containers aren't accessible over the internet."
Source: AWS Documentation, Amazon SageMaker Developer Guide, "Protect Endpoints by Using a Virtual Private Cloud".
3. AWS Machine Learning Blog: "With SageMaker JumpStart, you can choose from a growing list of best-performing foundation models... With one-click deployment, you can get a dedicated endpoint for your chosen model and use it in your applications."
Source: AWS Machine Learning Blog, "Amazon SageMaker JumpStart simplifies and accelerates access to foundation models for generative AI", May 25, 2023.
Question 22
1. AWS Glue Developer Guide: In the "What Is AWS Glue?" section, the documentation states, "AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development. AWS Glue provides all of the capabilities needed for data integration so that you can start analyzing your data and putting it to use in minutes instead of months." This directly aligns with the user's need to prepare data for ML.
Source: AWS Glue Developer Guide, "What Is AWS Glue?", Introduction.
2. AWS Documentation - Data Lakes and Analytics on AWS: This official documentation describes the architecture for data processing, highlighting AWS Glue's role. It states, "AWS Glue is a fully managed ETL (extract, transform, and load) service... It simplifies and automates the difficult and time-consuming tasks of data discovery, conversion, mapping, and job scheduling. AWS Glue crawls your data sources, identifies data formats, and suggests schemas and transformations."
Source: AWS Official Documentation, "Data Lakes and Analytics on AWS", Data Processing section.
3. Carnegie Mellon University, Cloud Computing Course (15-319/15-619): Course materials often describe AWS Glue as the primary serverless ETL service on AWS for preparing large datasets. Lecture notes on "Data Lakes" explain that Glue is used to catalog and transform raw, unstructured data stored in services like Amazon S3 into a queryable, structured format for analytics and machine learning.
Source: Based on typical curriculum content for advanced cloud computing courses covering AWS data services. For a specific example, see lecture materials on Data Warehousing and ETL in similar university courses.
Question 23
1. AWS Documentation for Amazon Comprehend: "Amazon Comprehend is a natural language processing (NLP) service that uses machine learning to uncover valuable insights and connections in text... The service can identify key phrases... and automatically organize a collection of text files by topic." This directly supports its use for identifying FAQs and insights.
Source: AWS Developer Guide, "What Is Amazon Comprehend?", Section: "Amazon Comprehend".
2. AWS Documentation for Amazon Lex: "Amazon Lex is an AWS service for building conversational interfaces for applications using voice and text." This defines its purpose as building chatbots, not analyzing existing text corpora.
Source: AWS Developer Guide, "What Is Amazon Lex?", Section: "Amazon Lex".
3. AWS Documentation for Amazon Transcribe: "Amazon Transcribe is an automatic speech recognition (ASR) service that makes it easy for you to add speech-to-text capabilities to your applications." This clarifies its role is transcription, not analysis.
Source: AWS Developer Guide, "What Is Amazon Transcribe?", Section: "Amazon Transcribe".
4. University Courseware: In materials discussing applied NLP, services like Amazon Comprehend are categorized under text analytics tools used for topic modeling and information extraction from large datasets, which is the core task described in the question.
Source: Carnegie Mellon University, 11-411/11-611 Natural Language Processing, Course materials on Text Classification and Topic Modeling. (Illustrates the general academic principle behind the service's function).
Question 24
1. Official AWS Documentation: Amazon SageMaker Developer Guide. "Amazon SageMaker Model Cards." The documentation states, "Amazon SageMaker Model Cards provide a single source of truth for model information, helping to centralize and standardize model documentation throughout the model lifecycle." It further explains that they are used to "document model information, such as its intended uses, risk ratings, and performance metrics."
Source: AWS. (2023). Amazon SageMaker Developer Guide. Section: "Amazon SageMaker Model Cards".
2. Peer-Reviewed Academic Publication: The concept was formally introduced in this paper, which emphasizes standardized reporting. The abstract states, "To this end, we propose a framework that we call Model Cards, a multi-faceted reporting structure that provides benchmarked evaluation in a variety of conditions... to encourage transparency and accountability in the machine learning community."
Source: Mitchell, M., Wu, S., Zaldivar, A., Barnes, P., Vasserman, L., Hutchinson, B., Spitzer, E., Raji, I. D., & Gebru, T. (2019). Model Cards for Model Reporting. Proceedings of the Conference on Fairness, Accountability, and Transparency (FAT '19). Page 220. DOI: https://doi.org/10.1145/3287560.3287596
3. University Courseware: Stanford University's courseware on Human-Centered AI discusses tools for transparency and accountability, where model cards are a key example of documenting a model's performance and limitations in a structured way.
Source: Stanford University Human-Centered Artificial Intelligence (HAI). (2022). AI Index Report 2022. Chapter 5: "Responsible AI". The report discusses the growing importance of documentation standards like model cards for AI governance.
Question 25
1. Amazon Bedrock Documentation: The official user guide lists summarization as a primary use case. It states, "With a large language model, you can summarize long-form documents such as articles, reports, research papers, and even books to produce a condensed version that captures the key information."
Source: Amazon Bedrock User Guide, "What Is Amazon Bedrock?", Section: "Common use cases for foundation models".
2. Stanford University Courseware: Lecture materials on Large Language Models explicitly cover their application in summarization tasks. LLMs are trained to understand context and generate human-like text, making them ideal for abstractive summarization.
Source: Stanford CS224N: Natural Language Processing with Deep Learning, Winter 2023, Lecture 11: "Practical Tips for Large Language Models", Slide 12, "Prompting for different tasks".
3. Amazon Personalize Documentation: The service's purpose is clearly defined for personalization, not text analysis. "Amazon Personalize enables you to personalize your website, apps, ads, emails, and more, using the same machine learning technology as used by Amazon.com, without requiring any prior machine learning experience."
Source: Amazon Personalize Developer Guide, "What Is Amazon Personalize?".
4. Amazon Rekognition Documentation: The service's documentation specifies its function for visual analysis. "Amazon Rekognition makes it easy to add image and video analysis to your applications...This includes identifying objects, people, text, scenes, and activities..."
Source: Amazon Rekognition Developer Guide, "What Is Amazon Rekognition?".