Self-Consistency Sampling
Voting over multiple reasoning paths.
One Chain Is Fragile
A single chain-of-thought can take a wrong turn early and never recover. Self-consistency (Wang et al., 2022) addresses this by sampling many independent reasoning paths and aggregating their final answers, typically by majority vote.
The intuition: correct reasoning converges on the same answer through diverse routes, while errors scatter. Aggregation amplifies the consistent signal.
def self_consistency(prompt, n=20, temperature=0.7):
answers = []
for _ in range(n):
chain = llm(prompt, temperature=temperature)
answers.append(extract_answer(chain))
return majority_vote(answers)Why Marginalizing Over Paths Works
Self-consistency approximates marginalizing over reasoning paths: instead of trusting one latent derivation, you estimate the most probable final answer integrated across many derivations.
This is a Monte Carlo estimate of the answer distribution. The mode of that distribution is usually more reliable than any single sampled path, especially when the answer space is small and discrete.
from collections import Counter
def majority_vote(answers):
norm = [normalize_answer(a) for a in answers]
counts = Counter(norm)
answer, votes = counts.most_common(1)[0]
confidence = votes / len(norm)
return answer, confidenceAll lessons in this course
- Chain-of-Thought Prompting
- Self-Consistency Sampling
- Tree-of-Thought Exploration
- When Reasoning Prompts Help