Tree of Thoughts (ToT) Prompting
- Tree of Thoughts (ToT) prompting enables models to explore and evaluate multiple reasoning paths, enhancing decision-making and solution accuracy.
- ToT mimics human problem-solving by using a tree structure where nodes represent partial solutions, allowing the model to backtrack when necessary.
- Two key components: Propose prompts generate possible solutions, and value prompts evaluate and guide the model toward the best path.
- ToT outperforms other methods in tasks like math reasoning, creative writing, and puzzles, with higher success rates and more coherent results.
- Limitations include increased resource consumption and inefficiency for simpler tasks that don't require extensive reasoning.
What is Tree of Thoughts Prompting?
Large Language Models (LLMs) have become increasingly popular and are now deployed in a wide range of applications across many industries. However, LLM inference is still confined to a token-level, left-to-right decision-making process. As a result, models fall short in tasks that require exploration or that require anticipating future states from the present one.
Tree of Thoughts (ToT) prompting is a framework for LLM inference that allows LLMs to make an informed decision by considering and self-evaluating multiple different reasoning paths that will likely lead to an optimal solution. ToT also empowers the model to backtrack when a path is unlikely to lead to a valid solution. ToT is similar to the best-first search algorithm in Computer Science.
ToT aims to mimic the way humans solve problems. Research shows that, given a problem, humans search through a combinatorial problem space: a tree in which nodes represent partial solutions and branches represent operators that modify those nodes. They use heuristics to pick the next branch that brings them closer to the solution, and the process continues until the problem is solved.
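The analogy with best-first search can be made concrete with a toy sketch. The code below is not ToT itself: no LLM is involved, and the start value, target, operators, and distance heuristic are all assumptions chosen for illustration. It only shows the search pattern of expanding the most promising partial solution first and falling back to an earlier branch when the current one dead-ends.

```python
import heapq

def best_first_search(start, target, operators, heuristic):
    """Return a list of values from start to target, or None if none is found."""
    # Frontier entries are (score, value, path); the lowest score is popped first.
    frontier = [(heuristic(start, target), start, [start])]
    seen = {start}
    while frontier:
        _, value, path = heapq.heappop(frontier)
        if value == target:
            return path
        for op in operators:
            child = op(value)
            # Toy pruning rule: both operators only increase the value,
            # so anything above the target can never come back down.
            if child > target or child in seen:
                continue
            seen.add(child)
            heapq.heappush(
                frontier, (heuristic(child, target), child, path + [child])
            )
    return None

# Hypothetical toy problem: reach 20 from 1 using "+3" and "*2".
operators = [lambda v: v + 3, lambda v: v * 2]
distance = lambda value, target: abs(target - value)

path = best_first_search(1, 20, operators, distance)
print(path)
```

In this toy run the greedy branch 1, 4, 8, 16, 19 gets stuck just below the target, so the search resumes from the next-best stored node. In ToT, the operators and the heuristic are replaced by LLM prompts rather than hand-written functions.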
How to Use Tree of Thoughts Prompting?
Let's look at how we can implement ToT using two distinct examples:
- Game of 24
- Creative writing
Game of 24
Game of 24 is a mathematical puzzle where the goal is to combine four given numbers using the four basic operators (+, -, *, /) to obtain 24. At each step, the propose prompt generates three candidate next steps, and the value prompt evaluates each candidate and decides whether it is worth pursuing further. The example below makes this concrete.
Problem statement: Using the numbers 4, 9, 10, and 13 and the four basic operators (+, -, *, /), generate an expression that evaluates to 24.
- In the first step, prompt the model to get candidate solutions.
- In the second step, prompt the model to evaluate each of its generated candidates.
 Evaluating 13, 10, 13
Evaluating 6, 9, 13
The value prompt evaluates every node that the propose prompt generates. The nodes judged likely to reach the solution are then expanded with the propose prompt again. Breadth-first search (BFS) is used, so all nodes at one level are expanded before moving on to the next level. The process continues until a node contains only the number 24.
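The propose, evaluate, and expand loop described above can be sketched without an LLM. In the sketch below, `propose` and `value` are deterministic stand-ins for the two prompts, and both are assumptions: `propose` enumerates every legal operation instead of sampling three candidates, and `value` uses exhaustive look-ahead where a model would only estimate whether 24 is still reachable. The function names, the state representation, and the parameter `b` (how many states are kept per level) are illustrative.

```python
from fractions import Fraction
from itertools import combinations

TARGET = 24

def propose(state):
    """Stand-in for the propose prompt: all states one operation away.

    A state is a tuple of (value, expression) pairs for the numbers left.
    """
    children = []
    for i, j in combinations(range(len(state)), 2):
        (a, ea), (b, eb) = state[i], state[j]
        rest = [state[k] for k in range(len(state)) if k not in (i, j)]
        results = [
            (a + b, f"({ea} + {eb})"),
            (a * b, f"({ea} * {eb})"),
            (a - b, f"({ea} - {eb})"),
            (b - a, f"({eb} - {ea})"),
        ]
        if b != 0:
            results.append((a / b, f"({ea} / {eb})"))
        if a != 0:
            results.append((b / a, f"({eb} / {ea})"))
        children.extend(tuple(rest + [r]) for r in results)
    return children

def value(state):
    """Stand-in for the value prompt: 1.0 if 24 is still reachable, else 0.0.

    Uses exhaustive look-ahead; an LLM would only estimate this.
    """
    if len(state) == 1:
        return 1.0 if state[0][0] == TARGET else 0.0
    return max(value(child) for child in propose(state))

def tot_bfs(numbers, b=5):
    """Breadth-first ToT: expand every kept state, score, keep the best b."""
    frontier = [tuple((Fraction(n), str(n)) for n in numbers)]
    for _ in range(len(numbers) - 1):  # each step merges two numbers into one
        candidates = [c for s in frontier for c in propose(s)]
        candidates.sort(key=value, reverse=True)
        frontier = [c for c in candidates[:b] if value(c) > 0]
    return [expr for s in frontier for val, expr in s if val == TARGET]

solutions = tot_bfs([4, 9, 10, 13], b=5)
print(solutions[0] if solutions else "no solution found")
```

One solution for this input is (10 - 4) * (13 - 9), which passes through the intermediate state 6, 9, 13. Because the stand-in value function is exact, this sketch never prunes a solvable state; with a real model, the value prompt can misjudge states, which is why keeping more than one candidate per level matters.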
Creative Writing
The creative writing task helps evaluate the creative thinking and planning abilities of the LLM.
Problem statement: Given four random sentences, generate a passage of four paragraphs in which each paragraph ends with the corresponding input sentence.
Step 1:
First, ask the LLM to generate 5 different plans for the passage.
Plan 1
Plan 2
Step 2:
Present all 5 plans to the LLM and ask it to choose the best one. A simple Zero-Shot voting prompt, "analyze choices below, then conclude which is most promising for the instruction," is used.
Repeat this step 5 times and choose the plan (say Plan 1) that gains the maximum votes.
Step 3:
Use the chosen plan to generate the passage.
The image below illustrates the use of ToT for creative writing tasks.
ToT for creative writing
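The voting in Step 2 can be sketched as follows. This is a minimal illustration under stated assumptions: `ask_llm_for_vote` is a hypothetical callable standing in for a real model call, the "Choice N" layout of the prompt is invented for the example, and the canned votes replace five actual model responses. Only the instruction text comes from the voting prompt quoted above.

```python
from collections import Counter

VOTE_INSTRUCTION = (
    "Analyze choices below, then conclude which is most promising "
    "for the instruction."
)

def build_vote_prompt(plans):
    """Format the candidate plans into a single zero-shot voting prompt."""
    choices = "\n".join(f"Choice {i + 1}: {plan}" for i, plan in enumerate(plans))
    return f"{VOTE_INSTRUCTION}\n\n{choices}"

def choose_plan(plans, ask_llm_for_vote, n_votes=5):
    """Collect n_votes votes and return (winning plan, vote tally).

    ask_llm_for_vote is a hypothetical callable: it takes the prompt text
    and returns the 1-based number of the choice the model prefers.
    """
    prompt = build_vote_prompt(plans)
    tally = Counter(ask_llm_for_vote(prompt) for _ in range(n_votes))
    winner, _ = tally.most_common(1)[0]
    return plans[winner - 1], tally

# Canned votes stand in for five real model calls.
canned_votes = iter([1, 3, 1, 2, 1])
plans = [f"Plan {i}" for i in range(1, 6)]
best, tally = choose_plan(plans, lambda prompt: next(canned_votes))
print(best, dict(tally))
```

Repeating the vote and taking the majority smooths out the variance of any single model response, at the price of one extra request per vote.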
What Are Tree of Thoughts Prompting Results?
- In the Game of 24 task, a mathematical reasoning problem, ToT with b = 1 (retaining only the best candidate at each step) is comparable to Chain-of-Thought (CoT) prompting, and ToT with b = 5 reaches a 74% success rate, 25 percentage points above the best-of-100 CoT result (49%).
| Method | Success | 
|---|---|
| IO (best of 100) | 33% | 
| CoT (best of 100) | 49% | 
| ToT (b=5) | 74% |
- In creative writing, ToT generates more coherent passages compared to passages generated using Input-Output (IO) prompting and CoT prompting.
ToT coherency score for creative writing task
- In crossword puzzles, ToT significantly outperforms IO and CoT techniques in word-level success rate and also wins 20% of the games compared to the 1% win rate of CoT.
| Method | Letter success rate (%) | Word success rate (%) | Game success rate (%) |
|---|---|---|---|
| IO | 38.7 | 14 | 0 |
| CoT | 40.6 | 15.6 | 1 |
| ToT | 78 | 60 | 20 |
Limitations of Tree of Thoughts Prompting
- Although the ToT framework helps LLMs solve problems that require planning and decision-making, it may not be the most efficient prompting technique for common Natural Language Processing (NLP) tasks, which models like GPT-4 already handle easily.
- ToT is resource-intensive: it costs more and needs many more requests than simpler prompting techniques.
Cost analysis of the Game of 24
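To get a feel for why the number of requests grows, here is a back-of-the-envelope counting model. It does not reproduce the cost analysis above, and it contains no measured data: it simply assumes one propose request per kept state, `k` candidates per propose request, `n_eval` value requests per candidate, and at most `b` states kept per level. All parameter names and defaults are assumptions for illustration.

```python
def count_requests(steps, k, b, n_eval=1):
    """Estimate LLM requests for a breadth-first ToT run.

    steps:  number of tree levels to expand
    k:      candidates returned by each propose request
    b:      maximum number of states kept per level
    n_eval: value requests spent on each candidate
    """
    frontier = 1  # the search starts from the single input state
    total = 0
    for _ in range(steps):
        candidates = frontier * k
        total += frontier             # one propose request per kept state
        total += candidates * n_eval  # value requests for every candidate
        frontier = min(b, candidates)
    return total

# Three levels with three candidates per propose request.
print(count_requests(steps=3, k=3, b=5))
print(count_requests(steps=3, k=3, b=1))
```

Under these assumptions, b = 5 needs 36 requests and b = 1 needs 12, against a single request for one IO or CoT sample. Real costs also depend on prompt and completion lengths, which this model ignores.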
Conclusion
Tree of Thoughts (ToT) is a practical framework for intellectually demanding tasks that require some planning and look-ahead. However, implementing ToT is demanding in terms of resources consumed and effort required. Consequently, it is wise to use it to solve only those tasks that cannot be solved using techniques like IO prompting and CoT prompting.
Find more on Decomposition Prompting methods.
Bhuwan Bhatt
Bhuwan Bhatt, a Machine Learning Engineer with over 5 years of industry experience, is passionate about solving complex challenges at the intersection of machine learning and Python programming. Bhuwan has contributed his expertise to leading companies, driving innovation in AI/ML projects. Beyond his professional endeavors, Bhuwan is deeply committed to sharing his knowledge and experiences with others in the field. He firmly believes in continuous improvement, striving to grow by 1% each day in both his technical skills and personal development.

 Learn Prompting