Endless Reasoning Loop

#2
by dzur658 - opened

Hi Unsloth team, first thank you for getting these quants out so fast!

I wanted to bring to your team's attention a problem I found with my initial testing of the GGUFs. The 2B model seems very prone to getting caught in endless thinking loops and never outputting the correct answer. I'm not sure if this is a problem with the quants themselves, or the actual Qwen 3.5 models but nonetheless I thought I'd bring it to your attention.

Tested on Llama Server

Following the "General" Guidance from Unsloth's Guide

  • temp: 1.0
  • top-p: 0.95
  • top-k: 20
  • min-p: 0.00
  • context: 8192
  • Reasoning Enabled
  • Quantization Level: bf16
  • llama-server version: 8132

Example Output

User Query:

Imagine three boxes in a row: Box A (left), Box B (middle), and Box C (right).
Inside Box A is a coin.

I swap Box A and Box C.

Then, I move the coin from the box it is currently in, directly into Box B.

Finally, I swap Box B and Box A.
Where is the coin right now, and what are the current positions of the boxes from left to right?

Model Response:

<think>
Here's a thinking process that leads to the solution:

1.  **Analyze the Request:**
    *   Initial Setup: Three boxes in a row: Box A (left), Box B (middle), Box C (right).
    *   Initial State of Content: Box A contains a coin. Box B is empty. Box C is empty.
    *   Operation 1: Swap Box A and Box C.
    *   Operation 2: Move the coin from its current location directly into Box B.
    *   Operation 3: Swap Box B and Box A.
    *   Goal: Determine the final position of the coin and the final order of the boxes (left to right).

2.  **Step-by-Step Simulation:**

    *   **Initial State:**
        *   Box A: Coin
        *   Box B: Empty
        *   Box C: Empty
        *   Order (Left to Right): A, B, C

    *   **Step 1: Swap Box A and Box C.**
        *   This is a physical swap of the boxes themselves.
        *   Old positions: A (Coin), B (Empty), C (Empty).
        *   Swap A and C:
            *   Box A gets what was in Box C (Empty).
            *   Box C gets what was in Box A (Coin).
            *   Box B stays the same (Empty).
        *   Current State:
            *   Box A: Empty
            *   Box B: Empty
            *   Box C: Coin
        *   Order (Left to Right): A, B, C

    *   **Step 2: Move the coin from the box it is currently in directly into Box B.**
        *   Where is the coin? It is in Box C.
        *   Action: Move the coin from Box C to Box B.
        *   Result:
            *   Box C: Empty
            *   Box B: Coin
        *   Current State:
            *   Box A: Empty
            *   Box B: Coin
            *   Box C: Empty
        *   Order (Left to Right): A, B, C

    *   **Step 3: Swap Box B and Box A.**
        *   Current positions of boxes: A (left), B (middle), C (right).
        *   Swap B and A:
            *   Old positions: A (Empty), B (Coin), C (Empty).
            *   Box A gets what was in Box B (Coin).
            *   Box B gets what was in Box A (Empty).
            *   Box C stays the same (Empty).
        *   Current State:
            *   Box A: Coin
            *   Box B: Empty
            *   Box C: Empty
        *   Order (Left to Right): A, B, C

3.  **Review and Verification:**
    *   Let's double-check the logic.
    *   Start: [A:Coin, B:Empty, C:Empty]
    *   Swap A & C: [A:Empty, B:Empty, C:Coin]
    *   Move Coin from C to B: [A:Empty, B:Coin, C:Empty]
    *   Swap B & A: [A:Coin, B:Empty, C:Empty]

    *   Wait, let's re-read the operations carefully. Sometimes "swap boxes" implies swapping contents of boxes A and C, or swapping the physical objects. Usually, in these riddles, it means swapping the boxes (which contain whatever is inside them).
    *   Let's check the wording: "I swap Box A and Box C." -> This means Box A and Box C exchange places.
    *   "Then, I move the coin from the box it is currently in, directly into Box B." -> This means the coin is the object being moved.
    *   "Finally, I swap Box B and Box A." -> This means Box B and Box A exchange places.

    *   Let's trace again.
    *   **Start:**
        *   A: Coin
        *   B: Empty
        *   C: Empty
        *   Layout: A | B | C
    *   **1. Swap A and C:**
        *   A gets Coin (from C)? No, A gets what C had. C gets what A had.
        *   A was Coin, C was Empty.
        *   Now A is Empty, C is Coin.
        *   Layout: A | B | C
        *   Contents: A (Empty), B (Empty), C (Coin)
    *   **2. Move coin from current box to B:**
        *   Current box with coin = C.
        *   Move to B.
        *   A stays Empty. B gets Coin. C stays Empty.
        *   Layout: A | B | C
        *   Contents: A (Empty), B (Coin), C (Empty)
    *   **3. Swap B and A:**
        *   A gets what B had (Coin). B gets what A had (Empty).
        *   A becomes Coin. B becomes Empty. C is Empty.
        *   Layout: A | B | C
        *   Contents: A (Coin), B (Empty), C (Empty)

    *   **Conclusion:** The coin is in Box A. The order is A, B, C.

    *   **Wait, is there a trick?**
    *   Let's re-read "I swap Box A and Box C."
    *   Start: A has coin. C has nothing.
    *   After swap: A has nothing. C has coin.
    *   "Move the coin from the box it is currently in, directly into Box B."
    *   Coin is in C. Move to B.
    *   Now B has coin. C has nothing. A has nothing.
    *   "Finally, I swap Box B and Box A."
    *   Swap B and A.
    *   A gets coin (from B). B gets nothing (from A).
    *   Result: A has coin. B is empty. C is empty.
    *   Positions: A, B, C.
    *   Wait, is it possible the question implies the boxes are shuffled physically? "Where is the coin right now, and what are the current positions of the boxes from left to right?"
    *   If A, B, C are the labels, the physical boxes are distinct.
    *   Let's assume the question implies the *labels* A, B, C are on the boxes.
    *   Let's assume the question implies the *positions* A, B, C are left, middle, right.
    *   The question asks: "Where is the coin right now, and what are the current positions of the boxes from left to right?"
    *   So, I need to identify the box containing the coin (e.g., "Box A") and the order of labels A, B, C on the boxes.
    *   Since the labels are fixed to the boxes (A is a specific box, B is a specific box, C is a specific box), asking for "positions of the boxes from left to right" usually means listing the labels on the boxes.
    *   So, if the coin is in Box A, and the boxes are in order A, B, C, the answer is "Coin is in Box A, boxes are in order A, B, C."
    *   Wait, let's look at the wording again: "what are the current positions of the boxes from left to right?"
    *   This could mean: Box A is now on the left, Box B is on the right? No, A, B, C are the boxes.
    *   It means: Which box is on the left, which is in the middle, which is on the right?
    *   Since the swap operation 1 was "Swap Box A and Box C", Box C is now on the left.
    *   Wait, the question asks "what are the current positions of the boxes from left to right?"
    *   This phrasing is slightly ambiguous. Does it mean "In what order are the boxes arranged?" or "Where is the coin?"
    *   Given the context of "Where is the coin right now", the second part likely asks for the arrangement of the boxes.
    *   If the coin is in Box A, and the boxes are A, B, C.
    *   Initial: A, B, C.
    *   Swap A, C: A becomes C's spot, C becomes A's spot.
    *   Move coin to B: B has coin.
    *   Swap B, A: A has B's spot, B has A's spot.
    *   Wait, let's trace the content again.
    *   Start: A (Coin), B (Empty), C (Empty).
    *   Step 1 (Swap A, C): A (Empty), B (Empty), C (Coin).
    *   Step 2 (Move Coin to B): A (Empty), B (Coin), C (Empty).
    *   Step 3 (Swap B, A): A (Coin), B (Empty), C (Empty).
    *   Final State:
        *   Box A: Coin
        *   Box B: Empty
        *   Box C: Empty
        *   Order (Left to Right): A, B, C.

    *   Is it possible the question implies the boxes themselves move? "Swap Box A and Box C".
    *   If I swap Box A and Box C, Box A moves to where Box C was (Right), and Box C moves to where Box A was (Left).
    *   Then I move the coin from C (now right) to B (middle).
    *   Then I swap Box B (middle) and Box A (Left).
    *   Box A goes to middle. Box B goes to Left.
    *   Final State:
        *   Box A (Left) now has Coin.
        *   Box B (Middle) is Empty.
        *   Box C (Right) is Empty.
    *   So the coin is in Box A. The boxes are in order A, B, C.

    *   Wait, I need to check if I interpreted "Swap Box A and Box C" correctly.
    *   Does it mean swap the *contents* of A and C?
    *   "I swap Box A and Box C." -> Usually means the boxes themselves. If they meant contents, they would say "I swap the contents of Box A and Box C".
    *   However, if they swap the boxes, the coin moves from A to C.
    *   Let's check the alternative interpretation: "Swap the contents".
    *   Start: A (Coin), B (Empty), C (Empty).
    *   Swap contents A & C: A (Empty), B (Empty), C (Coin).
    *   Move Coin from C to B: A (Empty), B (Coin), C (Empty).
    *   Swap Box B and Box A (contents swap): A (Coin), B (Empty), C (Empty).
    *   Result is the same. Whether A and C are boxes or contents, the final state of the coin's location is the same because the content of A is empty in the end.
    *   Wait, if I swap contents A and C, then A is empty. If I swap boxes A and C...
    *   Case 1 (Boxes Swap): A gets C's content (Empty). C gets A's content (Coin).
    *   Case 2 (Contents Swap): A gets C's content (Empty). C gets A's content (Coin).
    *   The state is the same.
    *   So, A has Empty, B has Empty, C has Coin.
    *   Move Coin from C to B.
    *   A (Empty), B (Coin), C (Empty).
    *   Swap B and A (Boxes Swap).
    *   A gets B's content (Coin). B gets A's content (Empty).
    *   A (Coin), B (Empty), C (Empty).
    *   So the coin is in Box A. The boxes are in order A, B, C.

    *   Wait, I need to check the question one more time: "what are the current positions of the boxes from left to right?"
    *   Does it mean "A, B, C" or "Box A, Box B, Box C"?
    *   If the boxes are labeled A, B, C, then the order is A, B, C.
    *   If the boxes are physical objects, and I swap them, their positions change.
    *   Let's assume the boxes are distinct entities with labels.
    *   Start: Box A (Left), Box B (Middle), Box C (Right).
    *   Swap Box A and Box C: Box C is now Left, Box A is now Middle, Box B is Right. (This is confusing).
    *   Better interpretation: The boxes have fixed labels.
    *   Left Box = A. Middle Box = B. Right Box = C.
    *   Swap Box A and Box C -> Left Box becomes C, Middle Box becomes A, Right Box becomes B.
    *   Move coin from C to B.
    *   Swap B and A -> Middle Box becomes C, Left Box becomes B.
    *   Wait, this interpretation changes the position of the boxes based on labels.
    *   Let's simplify.
    *   Box A is the box labeled "A". Box B is the box labeled "B". Box C is the box labeled "C".
    *   They are in a row.
    *   Start: A | B | C
    *   Swap Box A and Box C: The box labeled "A" moves to the position of "C", and the box labeled "C" moves to the position of "A".
    *   So now: C | B | A (Positions Left to Right).
    *   Move coin (which was in A) to B.
    *   Wait, if I swap Box A and Box C, the coin is in Box C (because Box A started with the coin).
    *   So coin moves from C to B.
    *   Now: B | A | C (Positions Left to Right).
    *   Wait, is this correct?
    *   Start: A, B, C. A has coin.
    *   Swap A and C.
    *   Box A (Left) takes content of Box C (Empty).

(I stopped the model at this point since it fell into a CoT groove)

However basic prompts do not seem to cause endless reasoning behavior.

Basic Test

User Query:

hello

Model Response:

<think>Okay, the user just said "hello". That's pretty straightforward. They're probably starting a conversation or checking in. Since it's a greeting, I should respond warmly and offer help. I need to make sure to keep it friendly and open-ended. Maybe ask how they're doing or what they need assistance with. Let me keep it simple and friendly.</think>Hello! πŸ‘‹ How can I help you today? Whether it's a question, a task, or just chat, I'm here! 😊

This review video I watched also seemed to encounter similar issues with the 9B version of Qwen 3.5's Unsloth Quantization as well.

Video

Unsloth Mention
Infinite Thinking of 9B model

According to the tester in the video, prompting with "do not overthink" appeared to solve the issue.

Again not sure if this is an issue with the Qwen 3.5 models themselves or the quantization but thought I'd let you all know regardless!

Sign up or log in to comment