This had me thinking about how to best incorporate calls_to_tools + their_output into the conversation.
Below you can see a conversation with an LLM (I removed the part where it was going away from my plan).
When using the OpenAI API, each entry in the list of messages must have one of ["system", "user", "assistant", "tool"] as its role.
Example
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Hello!"
}
I have read that removing the calls to tools, after the LLM has received the response and had time to use it, improves the flow of the conversation.
I don't doubt it: to use the tools the LLM has to read the tools' schema, which is dense, and once the information has been received, what it needed to know to call the tools becomes irrelevant.
What do you think about that?
The concept of streamlining the message list by removing calls to tools once their utility is served makes sense and aligns well with the principles of efficiency and clarity. Let's break it down further to consider the implications and potential best practices:
Why Removing Tool Calls Can Improve Flow
Reduced Cognitive Load:
The LLM has a finite context window, and adding unnecessary context can dilute its focus. If the tools' schema becomes irrelevant after the tools are used, removing it allows the LLM to concentrate on the essential parts of the conversation.
Simplified Message History:
A leaner message history implies fewer distractions for the model and potentially quicker parsing of relevant context, which can improve response time and accuracy.
Resource Efficiency:
Every message counts against the model's token limit. By removing irrelevant tool-invocation messages, you free up valuable token space, allowing more useful context to be retained in the conversation history.
Potential Best Practices
Contextual Pruning:
Implement logic to prune tool-related messages once their role is fulfilled. For example, after the LLM processes the tool's output and returns a result to the user, you can programmatically remove those tool invocation messages.
Message Summarization:
Before pruning, consider summarizing the tool's output (if it's substantial) and storing this summary in the conversation. This ensures that the core insights or results derived from the tool remain accessible without the overhead of full tool call details.
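For instance, here is a minimal sketch of summarize-then-prune; summarize_output() is a placeholder for whatever summarizer you use (a cheap LLM call, truncation, etc.), not a real library function:

def summarize_output(text, max_len=100):
    # Naive truncation stands in for a real summarizer.
    return text if len(text) <= max_len else text[:max_len].rstrip() + " ..."

def prune_with_summary(messages):
    # Replace each tool message with a short assistant-side summary.
    result = []
    for message in messages:
        if message["role"] == "tool":
            result.append({
                "role": "assistant",
                "content": "Tool result (summarized): " + summarize_output(message["content"]),
            })
        else:
            result.append(message)
    return result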
Tagged Memory:
Maintain a separate structure where key outputs from tool calls are stored as tagged memory. This can be helpful in recalling tool outputs without cluttering the main message flow.
Example:
"tool_memory": {
"weather_tool": "It will be sunny tomorrow with a high of 25°C."
}
This storage could be referenced during conversations when needed without polluting the main message history.
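As a rough sketch (the ToolMemory class and its method names are illustrative, not from any library), such a store could look like:

class ToolMemory:
    def __init__(self):
        self._store = {}

    def remember(self, tag, output):
        # Keep only the latest output per tool tag.
        self._store[tag] = output

    def recall(self, tag):
        return self._store.get(tag)

memory = ToolMemory()
memory.remember("weather_tool", "It will be sunny tomorrow with a high of 25°C.")
print(memory.recall("weather_tool"))  # available later without cluttering the message list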
Automated Cleanup Scripts:
Incorporate scripts or functions within your codebase that periodically clean up messages after a fixed interval or number of subsequent interactions to maintain a streamlined conversation history.
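A minimal sketch of such interval-based pruning, assuming your program stamps each message with a hypothetical "turn" field:

def prune_old_tool_messages(messages, current_turn, max_age=3):
    # Drop tool messages older than max_age turns; keep everything else.
    kept = []
    for message in messages:
        is_stale_tool = (
            message["role"] == "tool"
            and current_turn - message.get("turn", current_turn) > max_age
        )
        if not is_stale_tool:
            kept.append(message)
    return kept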
Role-specific Nodes:
Introduce specialized message nodes for tool invocation and results in your conversation structure. After processing, these nodes can be archived or moved to a separate storage while keeping the core conversational nodes (system, user, assistant) clean.
Example Implementation
For illustration, here is how you might implement a simple cleanup strategy post-tool invocation:
def cleanup_tool_messages(messages):
    """Removes tool-related messages from the conversation history once their utility is served."""
    cleaned_messages = []
    for message in messages:
        if message["role"] != "tool":
            cleaned_messages.append(message)
    return cleaned_messages

# Example conversation history
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather?"},
    {"role": "tool", "content": "Fetching weather information..."},
    {"role": "assistant", "content": "It will be sunny tomorrow with a high of 25°C."}
]

# Cleanup logic post-tool usage
messages = cleanup_tool_messages(messages)

# Updated conversation history
print(messages)
# [{'role': 'system', 'content': 'You are a helpful assistant.'},
#  {'role': 'user', 'content': "What's the weather?"},
#  {'role': 'assistant', 'content': 'It will be sunny tomorrow with a high of 25°C.'}]
By incorporating such practices and mechanisms, you can maintain an efficient, focused, and token-friendly conversation flow, thereby enhancing the overall experience and performance of the LLM.
Ok, clear.
But let's say that the response from 'calls to external resources' needs to be kept in the conversation because it will be used downstream.
I think, if I include the answer to a call, I need to also include the call itself, otherwise it doesn't make much sense.
I don't know how to incorporate them in a way that feels natural to the LLM. In particular I am thinking about the "role" of the message that includes the 'input and output' of the external call.
Summarizing: once the call format and the call itself have been used, remove the info related to the call format, and include the call input and call output in the conversation in a way that feels natural/clear/simple to the LLM.
Example:
The response from the LLM has the following calls:
function_name1,kwargs1
function_name2,kwargs2
Suggestions of how to 'incorporate' the calls and the output in the 'conversation'?
Here would be the response that was going away ... (removed)
What do you think about this?
In the real conversation we received this:
{
"role": "assistant",
"content": """
sketchpad: I need to find ... so I will call ... and do ... with the responses.
calls: [(func1,kwargs1),(func2,kwargs2),]
"""
}
My program makes the calls, gathers the responses, and REPLACES the above message with 2 messages:
{
"role": "assistant",
"content": """
sketchpad: I need to find ... so I will call ... and do ... with the responses.
"""
},
{
"role": "assistant",
"content": """
# Calls
---
tool: func1
input: kwargs1
output: outargs1
---
---
tool: func2
input: kwargs2
output: outargs2
---
"""
}
Your proposed method of replacing the original message with a more structured approach is sensible, and it efficiently consolidates the function calls and their results. This method keeps the conversation history concise and organized while maintaining all necessary context. Let’s refine and verify its clarity and simplicity.
Analysis of Your Proposed Method
Segmentation and Clarity: By segmenting the assistant's initial planning message and the details of tool calls and responses, you enhance readability and maintain a clear flow of information.
Role Consistency: Keeping the same role ("assistant") for both the planning message and the results maintains role consistency, which is good for coherence.
Structured Output: The structured format for detailing each function call and its input/output is very clean and clear.
Example Refined Implementation
Since the proposal is already effective, only minor tweaks are needed; here's a refined example to make it even clearer and ensure the LLM parses it reliably:
Initial Message (Before Replacement):
{
"role": "assistant",
"content": """
sketchpad: I need to find ... so I will call ... and do ... with the responses.
calls: [(func1,kwargs1),(func2,kwargs2),]
"""
}
Revised Messages (After Replacing with Results):
{
"role": "assistant",
"content": """
sketchpad: I need to find ... so I will call ... and do ... with the responses.
"""
},
{
"role": "assistant",
"content": """
# Calls and Results
---
tool: func1
input: kwargs1
output: outargs1
---
tool: func2
input: kwargs2
output: outargs2
---
"""
}
Explanation
Message Formatting:
The initial message outlines the intention and planned calls. Post-replacement, this is slightly trimmed to maintain the sketchpad context without the redundant "calls" line.
Results Segmentation:
The results are presented under a clear # Calls and Results section. This makes it readable and easy for the LLM to parse and understand each call and its outcome.
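As a minimal sketch of the replacement step (the function and variable names are illustrative, not from any library):

def replace_with_results(sketchpad_text, call_results):
    # call_results is a list of (tool_name, kwargs, output) tuples.
    blocks = []
    for tool, kwargs, output in call_results:
        blocks.append(f"---\ntool: {tool}\ninput: {kwargs}\noutput: {output}")
    calls_content = "# Calls and Results\n" + "\n".join(blocks) + "\n---"
    return [
        {"role": "assistant", "content": f"sketchpad: {sketchpad_text}"},
        {"role": "assistant", "content": calls_content},
    ]

Feeding it the two calls from the example above reproduces the revised messages exactly.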
Considerations for Future Refinements
Error Handling: Ensure there's a plan for including error messages or handling failures in these external calls; one possible convention is sketched after this list.
Summarization: Optionally, you could include a brief summary before the detailed outputs to provide context at a glance.
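For instance, a failed call could reuse the same block format with an explicit status line (this convention is an assumption, not a standard):

---
tool: func2
input: kwargs2
output: null
status: error (connection timed out)
---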
Example with Summarization
To enhance clarity, you might insert a brief summarization:
{
"role": "assistant",
"content": """
summary: I have completed the required calls for the requested data.
"""
},
{
"role": "assistant",
"content": """
sketchpad: I need to find ... so I will call ... and do ... with the responses.
"""
},
{
"role": "assistant",
"content": """
# Calls and Results
---
tool: func1
input: kwargs1
output: outargs1
---
tool: func2
input: kwargs2
output: outargs2
---
"""
}
This way, the conversation remains intuitive to follow, and the logical separation between the planning and result communication is clear and coherent for both the LLM and any human observer.