AWSFundamentals of GenAI

Temperature Parameter Exam Question Explained

Q: What is the correct answer for this AWS Certified AI Practitioner question?

The correct answer is A. Decrease temperature toward 0. Decreasing temperature toward 0 is the best answer because temperature controls how random the model sampling process is. A lower temperature makes the model favor the most likely next tokens, which produces more stable and repeatable responses. That is what a legal document summarizer needs: consistency, conservative wording, and low variation across runs. A higher temperature does the opposite. It increases randomness and can make outputs more creative, diverse, or surprising, which may be useful for brainstorming but is risky for legal summarization. Increasing the context window affects how much source text the model can read in a single request; it does not make the model more deterministic. Increasing top-p toward 1.0 broadens the pool of candidate tokens considered during nucleus sampling, which can also increase variation rather than reduce it. On the exam, look for words such as deterministic, consistent, repeatable, conservative, legal, medical, compliance, or factual. Those clues usually point to lowering temperature.

This AWS AI Practitioner question tests whether you understand inference controls for foundation models. The key rule: lower temperature when the scenario needs deterministic, consistent answers.

Short answer

The correct answer is A. Decrease temperature toward 0.

Decreasing temperature toward 0 is the best answer because temperature controls how random the model sampling process is. A lower temperature makes the model favor the most likely next tokens, which produces more stable and repeatable responses. That is what a legal document summarizer needs: consistency, conservative wording, and low variation across runs. A higher temperature does the opposite. It increases randomness and can make outputs more creative, diverse, or surprising, which may be useful for brainstorming but is risky for legal summarization. Increasing the context window affects how much source text the model can read in a single request; it does not make the model more deterministic. Increasing top-p toward 1.0 broadens the pool of candidate tokens considered during nucleus sampling, which can also increase variation rather than reduce it. On the exam, look for words such as deterministic, consistent, repeatable, conservative, legal, medical, compliance, or factual. Those clues usually point to lowering temperature.

Practice Question

A developer is calling a large language model for a legal document summarizer and wants responses to be highly deterministic and consistent across runs. Which change to the inference parameters best supports that goal?

Correct Answer: A

Decrease temperature toward 0

Why A is correct

Why the other options are wrong

Option B: Increase temperature toward 1.0

Increasing temperature makes the model sample more creatively and less predictably. That is useful for ideation, but it conflicts with the requirement for deterministic legal summaries.

Option C: Increase the context window size

A larger context window lets the model read more input tokens. It can prevent truncation, but it does not directly reduce randomness or make repeated generations more consistent.

Option D: Increase top-p toward 1.0

Top-p, also called nucleus sampling, controls how much probability mass is eligible for sampling. Raising it toward 1.0 broadens the candidate token set and can increase variation.

How temperature works on foundation model inference

Temperature is an inference parameter that changes how strongly a foundation model prefers the most likely next token. At low temperature, the probability distribution is sharper, so the model repeatedly selects high-probability tokens and produces more predictable answers. At high temperature, the distribution is flatter, so lower-probability tokens have more chance to appear, making the response more varied. Temperature does not change what the model knows, how many tokens it can read, or whether retrieved documents are available. It changes the sampling behavior at generation time. For exam scenarios, match temperature to the risk profile. Use low temperature for summarization, classification, extraction, legal work, medical work, compliance, documentation, and any workflow where consistent wording matters. Use higher temperature for creative writing, brainstorming, marketing copy variants, or exploratory ideation where diversity is valuable. Top-p is related but not identical. Top-p limits sampling to the smallest set of tokens whose cumulative probability reaches a threshold. Lower top-p narrows choices; higher top-p broadens them. The exam often pairs temperature with context window, token limits, and embeddings as distractors. Context window controls input length. Max tokens controls output length. Embeddings represent text for search. Temperature controls randomness.

Ready to see how you'd score?

Take a free 20-question diagnostic and find out which AWS Certified AI Practitioner domains you need to focus on. No signup required.

Take Free Diagnostic See all 500+ practice questions

Practice 5 similar questions

Same cert, same or adjacent domain. Use these after reviewing the explanation.

Retake diagnostic →

1Prompt Engineering Patterns for the AWS AI Practitioner ExamApplications of Foundation Models 2Foundation Models vs Traditional ML — When to Use EachFundamentals of GenAI 3RAG vs Fine-Tuning on AWS — When to Use EachApplications of Foundation Models 4Amazon Bedrock vs Amazon SageMaker — When to Use EachFundamentals of Generative AI 5Amazon Bedrock Knowledge Bases — Managed RAG on AWSApplications of Foundation Models

Quick FAQ

What is the correct answer for this AWS Certified AI Practitioner question?

The correct answer is A. Decrease temperature toward 0. Decreasing temperature toward 0 is the best answer because temperature controls how random the model sampling process is. A lower temperature makes the model favor the most likely next tokens, which produces more stable and repeatable responses. That is what a legal document summarizer needs: consistency, conservative wording, and low variation across runs. A higher temperature does the opposite. It increases randomness and can make outputs more creative, diverse, or surprising, which may be useful for brainstorming but is risky for legal summarization. Increasing the context window affects how much source text the model can read in a single request; it does not make the model more deterministic. Increasing top-p toward 1.0 broadens the pool of candidate tokens considered during nucleus sampling, which can also increase variation rather than reduce it. On the exam, look for words such as deterministic, consistent, repeatable, conservative, legal, medical, compliance, or factual. Those clues usually point to lowering temperature.

How should I study similar AWS Certified AI Practitioner questions?

Review the explanation, compare every distractor, then practice related questions in the same domain: Fundamentals of GenAI.