Artificial Idea | AI careers · practical prompts · no hype Thursday, November 27, 2025 · Issue #34 · Prompt Tutorial

The thinking instruction

Chain-of-thought prompting explained simply: how to get AI to think before it answers

The most common reason AI gives a wrong answer is not that it lacks the information. It is that it did not work through the problem before responding. One instruction fixes that.

Issue #33 established that the salary premium for AI-fluent professionals is concentrated in five specific capabilities, the first and most frequently cited of which is the ability to evaluate AI-generated outputs for accuracy and appropriateness in domain-specific contexts. That capability presupposes something this issue addresses directly: understanding how AI models arrive at their outputs well enough to know when to trust them and when to question them.

Chain-of-thought prompting is the technique that most directly addresses this. It is also the technique with the largest body of peer-reviewed research behind it, the clearest mechanism of action, and the most consistent track record of improving output quality on the specific class of tasks where AI most commonly fails: multi-step reasoning, analytical problems, and decisions that require working through intermediate steps before arriving at a conclusion.

It is not a complicated technique. It is an underused one. The gap between those two facts is where this issue lives.

Why AI gets analytical problems wrong

To understand why chain-of-thought prompting works, it helps to understand the specific failure mode it addresses.

Language models generate responses by predicting the most likely next token given everything that has come before. For factual recall tasks, this mechanism works well: the most likely next token after a question about a well-documented fact is usually the correct answer, because the training data contains the answer in many forms and the model has learned to reproduce it reliably.

For analytical tasks requiring multiple reasoning steps, the mechanism is less reliable. The model is predicting tokens in sequence, which means it is generating an answer before it has worked through the reasoning that should support the answer. In human terms, it is writing the conclusion before thinking through the argument, and then generating an argument that sounds plausible for the conclusion it has already committed to rather than reasoning from evidence to conclusion.

A 2022 paper by researchers at Google Brain (Wei et al., "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models"), which introduced the term chain-of-thought prompting and remains the foundational research in this area, demonstrated this failure mode clearly. When models were asked complex reasoning questions without an explicit instruction to show their working, they produced confident wrong answers far more often than when they were asked to reason through the problem step by step before stating a conclusion. The information required to answer correctly was available to the model in both conditions. The difference was whether the model was prompted to use it through intermediate reasoning steps or allowed to shortcut directly to a conclusion.

The improvements the Google Brain researchers reported were substantial: on GSM8K, a benchmark of multi-step maths word problems, chain-of-thought prompting roughly tripled the accuracy of the largest model tested, and the gains were concentrated in problems requiring several dependent reasoning steps. Subsequent research has replicated the finding across model families, task categories, and difficulty levels.

The instruction itself

In its simplest form, chain-of-thought prompting reduces to one addition to any prompt where multi-step reasoning is required.

At the start of your prompt, before the question or task itself, add one of the following instructions:

"Think through this step by step before giving your answer."

Or:

"Work through your reasoning before stating your conclusion."

Or, more specifically for analytical tasks:

"Before answering, identify the key variables involved, consider how they interact, and then state your conclusion with the reasoning that supports it."

The specific phrasing matters less than the instruction to reason before concluding. What the instruction does is prevent the model from generating a confident conclusion before completing the reasoning that should support it. Instead of predicting the most likely answer token immediately, the model generates reasoning tokens first, and those reasoning tokens constrain the conclusion tokens that follow in a way that significantly improves their accuracy.

This is the entire mechanism. It is not mysterious. It is the computational equivalent of telling someone to show their working rather than just giving you the answer, for the same reason that showing working catches errors that a straight answer would conceal.
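In code, the technique really is just a prefix on the prompt. A minimal sketch; the helper name `build_cot_prompt` is ours, not part of any library, and any LLM client could consume the string it returns:

```python
# Minimal chain-of-thought wrapping. The instruction text is one of the
# phrasings given above; the helper name is illustrative.

COT_INSTRUCTION = "Think through this step by step before giving your answer."

def build_cot_prompt(task: str, instruction: str = COT_INSTRUCTION) -> str:
    """Prepend a reasoning instruction so the model generates reasoning
    tokens before conclusion tokens, rather than concluding immediately."""
    return f"{instruction}\n\n{task}"

prompt = build_cot_prompt(
    "Our churn rate rose from 2% to 5% after the pricing change. "
    "Was the pricing change the likely cause?"
)
```

Whichever client you use, you send `prompt` in place of the bare question; the intermediate reasoning then comes back as part of the completion, where it can be read and checked.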

Where it makes the most difference

Chain-of-thought prompting produces its largest improvements on specific task types that are worth identifying, because applying it indiscriminately adds length without always adding value.

The tasks where it makes the most difference are those requiring causal reasoning, specifically problems where identifying the cause of an outcome requires working through multiple contributing factors and their interactions. Business diagnosis tasks, root cause analysis, and strategy evaluation are the professional applications where this class of reasoning is most commonly required and where chain-of-thought prompting most reliably produces better outputs.

It also makes a significant difference on tasks requiring comparison and trade-off analysis, specifically decisions between options where each option has different strengths across multiple dimensions. Without the chain-of-thought instruction, models tend to identify the option that sounds most generically appealing. With it, they work through each dimension systematically and arrive at conclusions that are better calibrated to the specific context.

It makes a smaller but still meaningful difference on tasks requiring information synthesis, where multiple pieces of information need to be combined to produce a conclusion that none of them individually supports. Research synthesis, due diligence analysis, and strategic planning tasks fall into this category.

It makes almost no difference on factual recall tasks: tasks with a single clear correct answer that does not require multi-step reasoning to reach. Adding chain-of-thought instructions here adds length without improving accuracy, so the instruction is best omitted.
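The triage above can be captured in a few lines. The category names and the decision rule below are illustrative, not a standard taxonomy:

```python
# Sketch: decide whether a task warrants a chain-of-thought instruction,
# following the task categories discussed above. Category names are ours.

MULTI_STEP_TASKS = {
    "causal_reasoning",    # root cause analysis, business diagnosis
    "trade_off_analysis",  # comparisons across multiple dimensions
    "synthesis",           # combining sources into a new conclusion
}

def wants_chain_of_thought(task_type: str) -> bool:
    """Multi-step analytical tasks gain the most from the instruction;
    factual recall gains nothing and just gets longer."""
    return task_type in MULTI_STEP_TASKS
```

Applied before prompt construction, a check like this keeps factual lookups short and reserves the longer reasoning format for the tasks where it pays off.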

The professional prompts

These five prompts apply chain-of-thought instruction to the specific analytical tasks most commonly encountered in professional contexts. Each one is designed to be used as a starting point rather than a rigid template.

Prompt 1: The root cause analyser

Think through this problem step by step 
before stating your conclusion.

Problem I am trying to diagnose: 
[describe the outcome you are trying 
to explain: a metric that has changed, 
a process that is failing, 
a situation that has deteriorated]

What I know about the context: 
[describe the relevant background, 
what has changed recently, 
what the normal baseline looks like, 
and what interventions have 
already been attempted]

Please work through the following 
reasoning sequence:

Step 1: Identify all the plausible 
causes of this outcome, without 
filtering for likelihood yet. 
List them exhaustively.

Step 2: For each cause, assess 
the evidence that supports it 
and the evidence that argues against it, 
based on what I have described.

Step 3: Identify which causes 
can be ruled out based on 
the available evidence, 
and why.

Step 4: Rank the remaining causes 
by their probability given 
the evidence, with your reasoning.

Step 5: State your conclusion 
about the most likely root cause 
and what additional information 
would confirm or disconfirm it.

Do not skip to Step 5. 
The value is in the steps that precede it.

The instruction not to skip to Step 5 is the one that makes this prompt work rather than just sound like it should work. Models will shortcut to conclusions if not explicitly prevented from doing so. The instruction enforces the reasoning sequence.
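That enforcement can also be done mechanically rather than on trust: check the response for the step headers before accepting its conclusion. A hedged sketch; the regex assumes the response echoes the "Step N:" labels the prompt requests:

```python
import re

def completed_steps(response: str, required: int = 5) -> bool:
    """Return True only if the response contains every 'Step N:' header
    from 1 to `required`, i.e. the model did not skip to the conclusion."""
    found = {int(n) for n in re.findall(r"Step (\d+):", response)}
    return set(range(1, required + 1)) <= found
```

If the check fails, the useful move is to re-prompt for the missing steps rather than accept a conclusion whose supporting reasoning was never produced.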

Prompt 2: The strategic trade-off analyser

Work through your reasoning carefully 
before stating your recommendation.

Decision I need to make: 
[describe the choice between 
two or more options]

What matters most in this decision: 
[list the criteria by which 
the options should be evaluated, 
in order of importance]

Options under consideration: 
[describe each option with 
enough detail that the analysis 
will be specific rather than generic]

Please reason as follows:

Step 1: Evaluate each option 
against each criterion, 
one combination at a time. 
Be specific about the evidence 
or reasoning behind each assessment.

Step 2: Identify where the options 
diverge most significantly, 
the dimensions on which the 
choice actually turns.

Step 3: Identify the assumption 
that most influences the outcome 
of this decision, the belief 
about the future that, 
if wrong, would change 
the recommendation.

Step 4: State your recommendation 
with the reasoning that supports it 
and the conditions under which 
a different option would be preferable.

Do not state your recommendation 
until you have completed Steps 1 through 3.

Step 3, identifying the assumption most influencing the outcome, is the chain-of-thought element that most directly serves the professional's actual decision-making need. Most strategic decisions do not turn on the quality of the analysis. They turn on one key assumption about the future. Surfacing that assumption explicitly, before the recommendation is stated, tells the decision-maker where the real uncertainty lies and what information would most reduce it.

Prompt 3: The argument evaluator

Reason through this carefully 
before stating your assessment.

Argument or proposal I need to evaluate: 
[paste or describe the argument, 
proposal, or recommendation 
you are assessing]

Context: [who is making the argument, 
what decision it is meant to support, 
and what is at stake if the argument 
is accepted or rejected]

Please work through the following:

Step 1: Reconstruct the argument 
in its strongest form. 
What is the best version 
of the case being made?

Step 2: Identify the logical 
structure of the argument. 
What are the premises? 
What conclusion do they support? 
Is the logical structure valid?

Step 3: Evaluate each premise 
separately. Which are well-supported? 
Which are asserted without 
adequate evidence? 
Which are false or questionable?

Step 4: Identify the weakest link 
in the argument chain, 
the premise whose failure 
would most undermine the conclusion.

Step 5: State your overall assessment 
of the argument's strength 
and what it would take 
to make it more robust.

Show each step before proceeding 
to the next.

The instruction to reconstruct the argument in its strongest form before evaluating it prevents the most common failure mode in argument evaluation: attacking the weakest version of a position rather than the strongest one. Evaluating the steel-manned version and finding it wanting is a more useful and more honest analytical exercise than finding weaknesses in a position you have already misrepresented.

Prompt 4: The forecast builder

Think through each component 
of this forecast before 
stating the overall projection.

What I am trying to forecast: 
[describe the outcome you are 
projecting: revenue, demand, 
cost, headcount, or any 
other quantifiable variable]

The relevant inputs and 
what I know about them: 
[describe each variable 
that influences the outcome, 
what its current value is, 
and what you know about 
its likely direction]

Please build the forecast 
as follows:

Step 1: Identify all the variables 
that significantly influence 
the outcome, including any 
I may have omitted.

Step 2: For each variable, 
state your best estimate 
of its value over the 
forecast period and 
the range of uncertainty 
around that estimate.

Step 3: Identify the interactions 
between variables: 
where changes in one 
affect the value of another.

Step 4: Build the base case 
forecast from the most 
likely values of each variable.

Step 5: Build a downside scenario 
by applying the pessimistic 
end of the uncertainty range 
to the two or three variables 
with the highest impact 
on the outcome.

Step 6: State the single variable 
whose actual value will 
most determine whether 
the base case is accurate, 
and why.

Do not state the base case 
until you have completed 
Steps 1 through 4.

Step 6, identifying the variable that will most determine accuracy, is the chain-of-thought element that produces the most immediate practical value. Forecasts are not equally uncertain in all their components. The uncertainty is concentrated in specific variables, and understanding which variable matters most tells the decision-maker where to focus monitoring attention rather than treating the forecast as a single number to be revised uniformly when reality diverges from it.
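The same question can be answered numerically with a one-at-a-time sensitivity check: hold every variable at its base value, swing one across its range, and see which swing moves the forecast most. The toy revenue model and the ranges below are invented for illustration:

```python
# Sketch: find the variable whose uncertainty moves the forecast most,
# which is what Step 6 asks the model to identify. Model and ranges
# are invented for illustration.

def forecast(price, volume, churn):
    """Toy revenue model: price * volume, reduced by churn."""
    return price * volume * (1 - churn)

# (low, base, high) estimates for each input variable
ranges = {
    "price":  (9.0, 10.0, 11.0),
    "volume": (950, 1000, 1100),
    "churn":  (0.02, 0.05, 0.10),
}

base = forecast(*(r[1] for r in ranges.values()))

def swing(name):
    """Forecast spread when only `name` moves across its range."""
    args = {k: v[1] for k, v in ranges.items()}
    lo = forecast(**{**args, name: ranges[name][0]})
    hi = forecast(**{**args, name: ranges[name][2]})
    return abs(hi - lo)

most_sensitive = max(ranges, key=swing)
```

With these numbers, price produces the widest spread, so price is the variable to monitor most closely; the real analytical work, as in the prompt, is choosing honest ranges for each input.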

Prompt 5: The scenario planner

Work through each scenario 
fully before comparing them.

Decision or strategy I am planning for: 
[describe what you are trying to plan]

The key uncertainties: 
[identify the two or three 
variables most uncertain 
and most consequential 
for the outcome of your decision]

Please build scenarios as follows:

Step 1: Define the extreme values 
of your two most important 
uncertainties, the optimistic 
and pessimistic ends of 
the realistic range for each.

Step 2: Build four scenarios 
from the combinations of 
those extreme values. 
For each scenario, describe 
what the world looks like 
in concrete terms, not just 
what the variables are worth 
but what it means for the 
organisations and people involved.

Step 3: Identify which of your 
current plans or strategies 
performs well across all four scenarios, 
which performs well in some 
and poorly in others, 
and which performs poorly 
in scenarios that are 
realistically probable.

Step 4: Identify the strategy 
most robust to the range 
of scenarios, with the 
trade-offs that robustness requires.

Step 5: Identify the early 
signal that would tell you 
which scenario is materialising 
at the earliest possible moment, 
creating the maximum window 
for strategic adjustment.

Complete each step before 
proceeding to the next. 
The value is in the 
reasoning, not the conclusions.

The closing instruction, that the value is in the reasoning rather than the conclusions, is the principle that underlies all five prompts and chain-of-thought prompting as a technique. The conclusions of AI-assisted analysis are only as good as the reasoning that precedes them. When the reasoning is made visible, it can be evaluated, challenged, and improved by the professional applying it to a context the model cannot fully understand. When it is hidden behind a confident conclusion, it cannot.

Making the reasoning visible is the core of what chain-of-thought prompting does. The professional who understands why it works is better positioned to apply it, adapt it, and evaluate the outputs it produces than one who uses it as a recipe without understanding the mechanism.

The mechanism is simple. The value it produces is not.

Monday we are examining the data on the 2025 job market overall, what was created, what was eliminated, what surprised the researchers who track this most closely, and what the year's patterns suggest about where the transition is heading in 2026. It is the most comprehensive jobs analysis this newsletter will publish and it draws on sources that have not been synthesised in this way elsewhere.

The year told a more complicated story than either the optimists or the pessimists predicted. Monday covers the full picture.

— The Artificial Idea team
