Releases: deepset-ai/haystack

v2.23.0

27 Jan 09:14

⭐️ Highlights

🔄 Human-in-the-Loop for Agents

Agents can now pause for human confirmation before executing tools. You can define confirmation behavior per tool: always ask, ask only on first use, or never ask, and you can fully customize the confirmation UI.
This makes it much easier to build safer, more transparent agent workflows, especially when tools trigger side effects or access sensitive data.

# Assumes the confirmation strategy, policy, and console UI classes are imported
# from Haystack's human-in-the-loop utilities, and that `cons` is a rich Console instance.
agent = Agent(
    chat_generator=OpenAIChatGenerator(model="gpt-4.1"),
    tools=[balance_tool, addition_tool, phone_tool],
    system_prompt="You are a helpful financial assistant. Use the provided tool to get bank balances when needed.",
    confirmation_strategies={
        balance_tool.name: BlockingConfirmationStrategy(
            confirmation_policy=AlwaysAskPolicy(), confirmation_ui=RichConsoleUI(console=cons),
        ),
        phone_tool.name: BlockingConfirmationStrategy(
            confirmation_policy=AskOncePolicy(), confirmation_ui=SimpleConsoleUI(),
        ),
        addition_tool.name: BlockingConfirmationStrategy(
            confirmation_policy=NeverAskPolicy(), confirmation_ui=SimpleConsoleUI(),
        )
    },
)

For a detailed walkthrough of confirmation strategies and UI customization, see the tutorial Human-in-the-Loop with Haystack Agents.

🖼️ Image Support for Tool Results

Tool classes can now return images alongside text, enabling richer agent responses.
ToolCallResult.result supports lists of TextContent and ImageContent, allowing agents to retrieve, pass around, and describe images when used with providers that support it (e.g., OpenAIResponsesChatGenerator, AnthropicChatGenerator, and more).

This unlocks new use cases such as image-based tool outputs from custom retrievers, visual search and inspection via MCP tools that return base64-encoded images, and multimodal agent reasoning.

from haystack import component
from haystack.components.agents import Agent
from haystack.components.generators.chat import OpenAIResponsesChatGenerator
from haystack.dataclasses import ChatMessage, ImageContent
from haystack.tools import ComponentTool

@component
class ImageRetriever:
    """Toy component that loads an image from disk."""

    @component.output_types(images=list[ImageContent])
    def run(self):
        return {"images": [ImageContent.from_file_path("/content/image.jpg")]}

# raw_result=True returns the images as-is instead of converting them to a string
image_retriever_tool = ComponentTool(
    component=ImageRetriever(), outputs_to_string={"raw_result": True, "source": "images"}
)

agent = Agent(
    chat_generator=OpenAIResponsesChatGenerator(model="gpt-5-nano"),
    system_prompt="You are an Agent that can retrieve images and describe them.",
    tools=[image_retriever_tool],
)

user_message = ChatMessage.from_user("Retrieve the image and describe it. Tell me if you cannot see the image")
result = agent.run(messages=[user_message])

print(result["last_message"].text)

🧩 Simpler Serialization for Custom Components

Custom components now serialize and deserialize automatically in most cases via component_from_dict() and component_to_dict(), even when they contain complex attributes such as DocumentStore, Secret, ComponentDevice, or any object that implements to_dict()/from_dict().

This means most custom components no longer need to implement to_dict()/from_dict() themselves. Pipeline snapshots and YAML definitions are easier to create and restore, and custom components are more portable and less error-prone.
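
As a minimal sketch (the component here is illustrative), a custom component holding a DocumentStore now round-trips without custom serialization hooks:

from haystack import component
from haystack.core.serialization import component_from_dict, component_to_dict
from haystack.document_stores.in_memory import InMemoryDocumentStore

@component
class MyRetriever:
    """A hypothetical custom component with a DocumentStore attribute."""

    def __init__(self, document_store: InMemoryDocumentStore, top_k: int = 5):
        self.document_store = document_store
        self.top_k = top_k

    @component.output_types(documents=list)
    def run(self, query: str):
        return {"documents": self.document_store.filter_documents()[: self.top_k]}

retriever = MyRetriever(document_store=InMemoryDocumentStore())
data = component_to_dict(retriever, name="retriever")                # no custom to_dict() needed
restored = component_from_dict(MyRetriever, data, name="retriever")  # no custom from_dict() needed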

⬆️ Upgrade Notes

  • Pipeline snapshot file saving is now disabled by default. You must explicitly enable it by setting the environment variable HAYSTACK_PIPELINE_SNAPSHOT_SAVE_ENABLED=true.
  • Removed backward-compatibility support for deserializing pipeline snapshots with the old pipeline_outputs format. Pipeline snapshots created before Haystack 2.22.0 that contain pipeline_outputs without the serialization_schema and serialized_data structure are no longer supported. Users should recreate their pipeline snapshots with the current Haystack version before upgrading to 2.23.0.
  • The return_empty_on_no_match parameter has been fully removed from the RegexTextExtractor component. In Haystack 2.22.0, this parameter was ignored. Starting with Haystack 2.23.0, passing this parameter during component initialization will raise an error. During pipeline deserialization, the parameter is ignored to avoid breaking existing pipelines.

🚀 New Features

  • Added a snapshot_callback parameter to Pipeline.run() that allows users to customize how pipeline snapshots are handled. When a callback is provided, it is invoked instead of the default file-saving behavior whenever a snapshot is created (e.g., during breakpoints or error handling). This enables use cases like saving snapshots to a database, sending them to a remote service, or implementing custom logging. If no callback is provided, the default behavior of saving to a JSON file remains unchanged.
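
    For example, a minimal sketch of routing snapshots to an in-memory list instead of JSON files (the callback is assumed here to receive the snapshot object as its only argument):

    from haystack import Pipeline

    snapshots = []

    def keep_in_memory(snapshot):
        """Collect pipeline snapshots in memory instead of writing them to disk."""
        snapshots.append(snapshot)

    pipeline = Pipeline()
    # ... add and connect components here ...
    pipeline.run(data={}, snapshot_callback=keep_in_memory)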

  • component_from_dict() and component_to_dict() now work out of the box with custom components, even when the component has a ComponentDevice as an attribute. Users no longer need to explicitly define to_dict() and from_dict() methods in their custom components to call ComponentDevice.from_dict() or device.to_dict(); component_from_dict() and component_to_dict() handle this automatically.

  • component_from_dict/component_to_dict now work out of the box with custom components that have an object as an init parameter, as long as the object defines to_dict/from_dict methods. Users no longer need to explicitly define to_dict/from_dict methods in their custom components in such cases. For example, a custom retriever that takes a DocumentStore as an init parameter does not need explicitly defined to_dict/from_dict methods; component_from_dict/component_to_dict handle such cases automatically.

  • component_from_dict() and component_to_dict() now work out of the box with custom components, even when the component has a Secret as an attribute. Users no longer need to explicitly define to_dict() and from_dict() methods in their custom components to call deserialize_secrets_inplace() or api_key.to_dict(); component_from_dict() and component_to_dict() handle this automatically.

  • Added HAYSTACK_PIPELINE_SNAPSHOT_SAVE_ENABLED environment variable. When set to "true" or "1", pipeline snapshots are saved to disk. Disabled by default. Note: Custom snapshot_callback functions are still invoked regardless of this setting.
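
    For example, to enable the default file-saving behavior from Python before running a pipeline:

    import os

    # Accepts "true" or "1"; snapshot saving stays off unless this is set.
    os.environ["HAYSTACK_PIPELINE_SNAPSHOT_SAVE_ENABLED"] = "true"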

  • Expanded the ToolCallResult.result field to accept not only strings but also lists of TextContent and ImageContent objects. This enables tools to return images for providers that support this capability. This feature is already available when using OpenAIResponsesChatGenerator, and support for additional providers will be added soon. The Chat Completions API does not support this functionality, so the classic OpenAIChatGenerator cannot be used with it.

  • The outputs_to_string parameter of the Tool class now supports returning raw results without string conversion using the raw_result key. This is intended for tools that return images. ComponentTool and PipelineTool also support this feature.
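
    Here is an example of how to use it:

    from haystack.components.agents import Agent
    from haystack.components.generators.chat import OpenAIResponsesChatGenerator
    from haystack.dataclasses import ChatMessage, ImageContent, TextContent
    from haystack.tools import create_tool_from_function

    def retrieve_image():
        """Tool to retrieve an image"""
        return [
            TextContent("Here is the retrieved image."),
            ImageContent.from_file_path("test/test_files/images/apple.jpg"),
        ]

    image_retriever_tool = create_tool_from_function(
        function=retrieve_image, outputs_to_string={"raw_result": True}
    )

    agent = Agent(
        chat_generator=OpenAIResponsesChatGenerator(model="gpt-5-nano"),
        system_prompt="You are an Agent that can retrieve images and describe them.",
        tools=[image_retriever_tool],
    )

    user_message = ChatMessage.from_user("Retrieve the image and describe it in max 10 words.")
    result = agent.run(messages=[user_message])

    print(result["last_message"].text)
    # Red apple with stem resting on straw.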

⚡️ Enhancement Notes

  • Added haystack.component.fully_qualified_type field to component tracing output. This new field provides the full module path and class name (e.g., haystack.components.generators.chat.openai.OpenAIChatGenerator) alongside the existing haystack.component.type field that only contains the class name. This enables dynamic component loading and better tooling integration.

  • In OpenAIChatGenerator, streaming now handles cases where a ChatCompletionChunk has a delta set to None in choices. This can occur with some OpenAI-compatible providers, and the component will now handle it gracefully.

  • Components no longer handle the (de-)serialization of ComponentDevice explicitly. Instead, the components rely on the behavior implemented in default_to_dict/default_from_dict.

  • Components no longer handle the (de-)serialization of init parameter objects explicitly if the objects define to_dict/from_dict themselves. Instead, the components rely on the behavior implemented in default_to_dict/default_from_dict.

  • Components no longer handle the (de-)serialization of Secrets explicitly. Instead, the components rely on the behavior implemented in default_to_dict/default_from_dict.

  • Support for flattened generation_kwargs in OpenAIResponsesChatGenerator

    The OpenAIResponsesChatGenerator component now supports flattened generation keyword arguments, allowing users to specify reasoning parameters directly without nesting them under the reasoning key. This simplifies configuration and improves usability.

    Example:

    from haystack.components.generators.chat import OpenAIResponsesChatGenerator
    generator = OpenAIResponsesChatGenerator(
        model="gpt-5-mini",  # reasoning parameters require a reasoning-capable model
        generation_kwargs={
            "reasoning_effort": "low",
            "reasoning_summary": "auto"
        }
    )
  • Support for flattened verbosity in generation_kwargs of OpenAIResponsesChatGenerator

    The OpenAIResponsesChatGenerator component now supports a flattened verbosity generation keyword argument, allowing users to specify verbosity directly without nesting it under the text key. This simplifies configuration and improves usability.

    Example:

    from haystack.components.generators.chat import OpenAIResponsesChatGenerator
    generator = OpenAIResponsesChatGenerator(
        model="gpt-5-mini",  # verbosity is supported by the GPT-5 model family
        generation_kwargs={
            "verbosity": "low",
        }
    )
  • Added the outputs_to_string parameter to create_tool_from_function and the @tool decorator to provide additional customization options for these convenience constructors.

⚠️ Deprecation Notes

  • deserialize_document_store_in_init_params_inplace is deprecated and will be removed in Haystack version 2.24. It is no longer used internally and should not be used in new code. The deserialization of DocumentStores is now handled automatically by default_from_dict.

🐛 Bug Fixes

  • Fixed ComponentTool, create_tool_from_function, and the @tool decorator failing to create a tool schema when Callable type parameters are present (such as snapshot_callback). This enables using Agent as a ComponentTool without raising SchemaGenerationError.
  • Fixed a bug in OpenAIResponsesChatGenerator where empty reasoning items were discarded during streaming. This caused subsequent requests to the OpenAI Responses API to fail when the message history was sent back, as the API requires every tool call to be preceded by its associated reasoning item. These items are now correctly preserved in the ChatMessage history, even when the summary text is empty.
  • Fixed SASEvaluator to work when using numpy>=2.4 by manually squeezing a PyTorch tensor to the correct dimension.
  • Fixed usage info extraction in streaming responses for OpenAI-compatible chat generators.

v2.23.0-rc1

26 Jan 10:03

Pre-release

v2.22.0

08 Jan 14:25

⭐️ Highlights

✂️ Smarter Document Chunking with Embedding-Based Splitting

Introducing the new EmbeddingBasedDocumentSplitter, a component that takes an embedder and splits documents based on semantic similarity rather than fixed sizes or rules.

from haystack.components.embedders import SentenceTransformersDocumentEmbedder
from haystack.components.preprocessors import EmbeddingBasedDocumentSplitter

# Initialize an embedder to calculate semantic similarities
embedder = SentenceTransformersDocumentEmbedder()

# Configure the splitter with parameters that control splitting behavior
splitter = EmbeddingBasedDocumentSplitter(
    document_embedder=embedder,
    sentences_per_group=2,      # Group 2 sentences before calculating embeddings
    percentile=0.95,            # Split when cosine distance exceeds 95th percentile
    min_length=50,              # Merge splits shorter than 50 characters
    max_length=1000             # Further split chunks longer than 1000 characters
)
result = splitter.run(documents=[doc])

🔥 warm_up Runs Automatically on First Use

Components that define a warm_up method now run it automatically on first execution, removing the need for manual calls and preventing errors in standalone usage.

from haystack.components.embedders import SentenceTransformersTextEmbedder

text_embedder = SentenceTransformersTextEmbedder()
# text_embedder.warm_up() # ❌ Don't need this step anymore
print(text_embedder.run("I love pizza!"))

# {'embedding': [-0.07804739475250244, 0.1498992145061493, ...]}

🛠️ Multiple Tool String Outputs with outputs_to_string

Tools can now expose multiple string outputs via the new outputs_to_string configuration, giving you fine-grained control over how tool results are surfaced to the LLM, without changing the underlying tool logic.

from haystack.tools import Tool

def format_documents(documents):
    return "\n".join(f"{i+1}. Document: {doc.content}" for i, doc in enumerate(documents))

def format_summary(metadata):
    return f"Found {metadata['count']} results"

tool = Tool(
    name="search",
    description="Search for documents",
    parameters={...},
    function=search_func,  # Returns {"documents": [Document(...)], "metadata": {"count": 5}, "debug_info": {...}}
    outputs_to_string={
        "formatted_docs": {"source": "documents", "handler": format_documents},
        "summary": {"source": "metadata", "handler": format_summary}
        # Note: "debug_info" is not included, so it won't be converted to a string
    }
)

# After the tool invocation, the tool result includes:
# {
#     "formatted_docs": "1. Document Title\n   Content...\n2. ...",
#     "summary": "Found 5 results"
# }

🐍 Python 3.10+ Only

Haystack now requires Python 3.10 or later, as Python 3.9 reached End of Life (EOL) in October 2025.

⬆️ Upgrade Notes

  • HuggingFaceLocalChatGenerator now uses Qwen/Qwen3-0.6B as the default model, replacing the previous default.

⚡️ Enhancement Notes

  • The parameters query_suffix and document_suffix have been added to SentenceTransformersSimilarityRanker to support the Qwen3 reranker model family.

    Here is an example of using these new parameters with Qwen3-Reranker-0.6B:

    from haystack import Document
    from haystack.components.rankers.sentence_transformers_similarity import SentenceTransformersSimilarityRanker
    
    ranker = SentenceTransformersSimilarityRanker(
        model="tomaarsen/Qwen3-Reranker-0.6B-seq-cls",
        query_prefix='<|im_start|>system\nJudge whether the Document meets the requirements based on the Query and the Instruct provided. Note that the answer can only be "yes" or "no".<|im_end|>\n<|im_start|>user\n<Instruct>: Given a web search query, retrieve relevant passages that answer the query\n<Query>: ',
        query_suffix="\n",
        document_prefix="<Document>: ",
        document_suffix="<|im_end|>\n<|im_start|>assistant\n<think>\n\n</think>\n\n",
    )
    
    result = ranker.run(
        query="Which planet is known as the Red Planet?",
        documents=[
            Document(content="Venus is often called Earth's twin because of its similar size and proximity."),
            Document(content="Mars, known for its reddish appearance, is often referred to as the Red Planet."),
            Document(content="Jupiter, the largest planet in our solar system, has a prominent red spot."),
            Document(content="Saturn, famous for its rings, is sometimes mistaken for the Red Planet."),
        ],
    )
    
    print(result)

    NOTE: This only works with the Qwen3 reranker models that use the sequence classification architecture. For example, you can find some on tomaarsen's Hugging Face profile.

  • Added reasoning content support to HuggingFaceAPIChatGenerator. The component now extracts reasoning content from models that support chain-of-thought reasoning (e.g., DeepSeek R1). Both streaming and non-streaming modes are supported. Access via reply.reasoning.reasoning_text.

  • When an Agent runs as part of a Pipeline, the agent's tracing span now uses the component span as its parent. This enables proper nested trace visualization in tracing tools like Datadog, Braintrust, or OpenTelemetry backends.

  • The _handle_async_stream_response() method in OpenAIChatGenerator now handles asyncio.CancelledError exceptions. When a streaming task is cancelled mid-stream, the async for loop gracefully closes the stream using asyncio.shield() to ensure the cleanup operation completes even during cancellation.

  • A new enable_thinking parameter has been added to enable thinking mode in chat templates for thinking-capable models, allowing them to generate intermediate reasoning steps before producing final responses.

  • Add support for PEP 604 type syntax. This means that when defining types in components, you can use X | Y instead of Union[X, Y] and X | None instead of Optional[X]. The codebase has been migrated to the new syntax, but both syntaxes are fully supported.
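
    For example, a minimal component sketch using the new syntax:

    from haystack import component

    @component
    class Greeter:
        @component.output_types(greeting=str)
        def run(self, name: str | None = None):
            # `str | None` replaces Optional[str]; both spellings remain supported
            return {"greeting": f"Hello, {name or 'world'}!"}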

  • Support Multiple Tool String Outputs

    Added support for tools to define multiple string outputs using the outputs_to_string configuration. This allows users to specify how different parts of a tool's output should be converted to strings, enhancing flexibility in handling tool results.

    • Updated ToolInvoker to handle multiple output configurations.
    • Updated Tool to validate and store multiple output configurations.
    • Added tests to verify the functionality of multiple string outputs.

    This enables tools to provide rich, varied context to language models or downstream components without requiring multiple tool calls, while keeping full control over which outputs are stringified.

  • Added validation for inputs_from_state and outputs_to_state parameters in the Tool class. Tools now validate at construction time that state mappings reference valid tool parameters and outputs, catching configuration errors early instead of at runtime. The validation uses function introspection and JSON schema to ensure parameter names exist, and subclasses like ComponentTool validate against component input/output sockets.

🐛 Bug Fixes

  • Improved error messages in ConditionalRouter when non-string values are provided as route outputs. Users now receive clear guidance (e.g., "use '2' instead of 2") instead of the cryptic "Can't compile non template nodes" error.
  • Fixed jinja2 variable detection in ConditionalRouter, ChatPromptBuilder, PromptBuilder, and OutputAdapter by properly skipping variables that are assigned within the template. Previously, under specific scenarios, variables assigned within a template would falsely be picked up as input variables to the component. For more information, see the parent issue in the Jinja2 library: pallets/jinja#2069
  • Fixed deserializing an instance of NamedEntityExtractor when pipeline_kwargs is stored in the deserialization dict with the value None.
  • When creating an HTTP client object from a dictionary, we now convert the limits parameter to an httpx.Limits object to avoid AttributeError.
  • Raise a ValueError when an async function is passed to the Tool class. Async functions are not supported as tools. This change provides a clear error message instead of silent failures where coroutines are never awaited.

⚠️ Deprecation Notes

  • The return_empty_on_no_match parameter has been removed from the RegexTextExtractor component. The component now always returns a dictionary with the key "captured_text"; the value is the captured text, or an empty string if no match is found. Currently, the return_empty_on_no_match parameter is ignored. Starting from Haystack 2.23.0, initializing the component with this parameter will raise an error.

💙 Big thank you to everyone who contributed to this release!

@anakin87, @ArzelaAscoIi, @bilgeyucel, @Bobholamovic, @davidsbatista, @dfokina, @GunaPalanivel, @majiayu000, @OliverZhangA, @sjrl, @TaMaN2031A, @tommasocerruti, @tstadel, @vblagoje, @YassineGabsi

v2.22.0-rc1

07 Jan 13:31

Pre-release

v2.21.0

08 Dec 15:47

⭐️ Highlights

🔍 Smarter, Broader Retrieval with Multi-Query RAG

This release introduces three new components that significantly boost retrieval recall in RAG systems by expanding the user query and retrieving documents across multiple reformulations:

  • QueryExpander generates semantically similar variations of a user query to broaden search coverage.
  • MultiQueryTextRetriever runs multiple queries in parallel using a text-based retriever (e.g., BM25) and merges results by score.
  • MultiQueryEmbeddingRetriever performs the same multi-query retrieval flow using embeddings, enabling richer semantic recall.

Used together, these components create a multi-query retrieval pipeline that improves recall, especially when queries are short or ambiguous.

🧪 Example: Expanding a Query and Retrieving More Relevant Documents

from haystack import Document
from haystack.components.query import QueryExpander
from haystack.components.retrievers import InMemoryBM25Retriever, MultiQueryTextRetriever
from haystack.components.writers import DocumentWriter
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.document_stores.types import DuplicatePolicy

# Sample documents
docs = [
    Document(content="Renewable energy comes from natural sources like wind and sunlight."),
    Document(content="Geothermal energy is heat from beneath the Earth's surface."),
    Document(content="Hydropower generates electricity using flowing water."),
]

# Store documents
store = InMemoryDocumentStore()
writer = DocumentWriter(document_store=store, policy=DuplicatePolicy.SKIP)
writer.run(documents=docs)

# Components
expander = QueryExpander()
retriever = InMemoryBM25Retriever(document_store=store, top_k=1)
multi_retriever = MultiQueryTextRetriever(retriever=retriever)

# Expand and retrieve
expanded = expander.run(query="renewable energy")
results = multi_retriever.run(queries=expanded["queries"])

for doc in results["documents"]:
    print(doc.content)

This pipeline expands "renewable energy" into multiple related queries, retrieves documents for each in parallel, and returns a richer set of relevant results — demonstrating how multi-query retrieval improves recall with minimal effort.

⬆️ Upgrade Notes

  • Updated the default Azure OpenAI model from gpt-4o-mini to gpt-4.1-mini and the default API version from 2023-05-15 to 2024-12-01-preview for both AzureOpenAIGenerator and AzureOpenAIChatGenerator.
  • The default OpenAI model has been changed from gpt-4o-mini to gpt-5-mini for OpenAIChatGenerator and OpenAIGenerator. If you rely on the default model and need to continue using gpt-4o-mini, explicitly specify it when initializing these components: OpenAIChatGenerator(model="gpt-4o-mini").

🚀 New Features

  • Three new components were added: QueryExpander, MultiQueryEmbeddingRetriever, and MultiQueryTextRetriever. When used together, they allow a query to be expanded, with each expansion used to retrieve a potentially different set of documents.

⚡️ Enhancement Notes

  • Added a return_empty_on_no_match parameter (default True) to RegexTextExtractor.__init__(). When set to False, the component returns {"captured_text": ""} instead of {} when no regex match is found. Provides a consistent output structure for pipeline integration.
  • The FilterRetriever and AutoMergingRetriever components now support asynchronous execution.
  • Previously, when using tracing with objects like ByteStream and ImageContent, the payload sent to the tracing backend could become too large, hitting provider limits or causing performance degradation. We now replace these objects with string placeholders to avoid oversized payloads.
  • The default OpenAI model for OpenAIChatGenerator and OpenAIGenerator has been updated from gpt-4o-mini to gpt-5-mini.

🐛 Bug Fixes

  • Ensure request header keys are unique in link_content to prevent 400 Bad Request errors.

    Some image providers return a 400 Bad Request when using ImageContent.from_url() because the User-Agent header appears multiple times with different casing (e.g., user-agent, User-Agent). This update normalizes header keys in a case-insensitive way, removes duplicates, and preserves only the last occurrence.

  • Fixed a bug where components explicitly listed in include_outputs_from would not appear in the pipeline results if they returned an empty dictionary. Now, any component specified in include_outputs_from will be included in the results regardless of whether its output is empty.

  • Fixed the serialization and deserialization of pipeline_outputs in pipeline_snapshot so that it uses the same schema as the rest of the pipeline state when running pipelines with breakpoints. Deserialization of the older pipeline_outputs format without a serialization schema is supported until Haystack 2.23.0.

  • Fixed ToolInvoker missing tools after warmup for lazy-initialized toolsets. The invoker now refreshes its tool registry post-warmup, ensuring replaced placeholders (e.g., MCPToolset with eager_connect=False) resolve to the actual tool names at invocation time.

💙 Big thank you to everyone who contributed to this release!

@Amnah199, @anakin87, @davidsbatista, @dfokina, @mrchtr, @OscarPindaro, @schwartzadev, @sjrl, @TaMaN2031A, @vblagoje, @YassineGabsi, @ZeJ0hn

v2.21.0-rc1

03 Dec 20:33

Pre-release

v2.20.0

13 Nov 15:06

⭐️ Highlights

Support for OpenAI's Responses API

Haystack now integrates OpenAI's Responses API through the new OpenAIResponsesChatGenerator and AzureOpenAIResponsesChatGenerator components.

This unlocks several advanced capabilities, such as:

  • Retrieving concise summaries of the model’s reasoning process.
  • Using native OpenAI or MCP tool formats alongside Haystack Tool objects and Toolset instances.

Example with reasoning and a web search tool:

from haystack.components.generators.chat import (
    AzureOpenAIResponsesChatGenerator,
    OpenAIResponsesChatGenerator,
)
from haystack.dataclasses import ChatMessage

# with `OpenAIResponsesChatGenerator`
chat_generator = OpenAIResponsesChatGenerator(
    model="o3-mini",
    generation_kwargs={"summary": "auto", "effort": "low"},
    tools=[{"type": "web_search"}],
)
response = chat_generator.run(messages=[ChatMessage.from_user("What's a positive news story from today?")])

# with `AzureOpenAIResponsesChatGenerator`
chat_generator = AzureOpenAIResponsesChatGenerator(
    azure_endpoint="https://example-resource.azure.openai.com/",
    azure_deployment="gpt-5-mini",
    generation_kwargs={"reasoning": {"effort": "low", "summary": "auto"}},
)
response = chat_generator.run(messages=[ChatMessage.from_user("What's Natural Language Processing?")])

print(response["replies"][0].text)

🚀 New Features

  • Added the AzureOpenAIResponsesChatGenerator, a new component that integrates Azure OpenAI's Responses API into Haystack.
  • Added the OpenAIResponsesChatGenerator, a new component that integrates OpenAI's Responses API into Haystack.
  • If logprobs are enabled in the generation kwargs, return logprobs in ChatMessage.meta for OpenAIChatGenerator and OpenAIResponsesChatGenerator.
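
    For example (a minimal sketch; the meta key name is assumed to be "logprobs"):

    from haystack.components.generators.chat import OpenAIChatGenerator
    from haystack.dataclasses import ChatMessage

    generator = OpenAIChatGenerator(generation_kwargs={"logprobs": True})
    reply = generator.run(messages=[ChatMessage.from_user("Hello!")])["replies"][0]
    print(reply.meta["logprobs"])  # per-token log probabilities returned by the API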
  • Added an extra field to ToolCall and ToolCallDelta to store provider-specific information.
  • Updated serialization and deserialization of PipelineSnapshots to work with pydantic BaseModels.
  • Added async support to SentenceWindowRetriever with a new run_async() method, allowing the retriever to be used in async pipelines and workflows.
  • Added warm_up() method to all ChatGenerator components (OpenAIChatGenerator, AzureOpenAIChatGenerator, HuggingFaceAPIChatGenerator, HuggingFaceLocalChatGenerator, and FallbackChatGenerator) to properly initialize tools that require warm-up before pipeline execution. The warm_up() method is idempotent and follows the same pattern used in Agent and ToolInvoker components. This enables proper tool initialization in pipelines that use ChatGenerators with tools but without an Agent component.
  • The AnswerBuilder component now exposes a new parameter return_only_referenced_documents (default: True) that controls if only documents referenced in the replies are returned. Returned documents include two new fields in the meta dictionary:
    • source_index: the 1-based index of the document in the input list
    • referenced: a boolean value indicating if the document was referenced in the replies (only present if the reference_pattern parameter is provided).
      These additions make it easier to display references and other sources within a RAG pipeline.
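
    For example, a minimal sketch (the reference pattern shown is illustrative):

    from haystack import Document
    from haystack.components.builders import AnswerBuilder

    builder = AnswerBuilder(
        reference_pattern=r"\[(\d+)\]",  # replies cite documents as [1], [2], ...
        return_only_referenced_documents=True,
    )
    result = builder.run(
        query="What color is the sky?",
        replies=["The sky is blue [1]."],
        documents=[Document(content="The sky is blue."), Document(content="Grass is green.")],
    )
    answer = result["answers"][0]
    # Only the referenced document is returned, with source_index and referenced in its meta.
    print([(d.meta["source_index"], d.meta["referenced"]) for d in answer.documents])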

⚡️ Enhancement Notes

  • Adds generation_kwargs to the Agent component, allowing for more fine-grained control at run-time over chat generation.
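
    For example (a minimal sketch):

    from haystack.components.agents import Agent
    from haystack.components.generators.chat import OpenAIChatGenerator
    from haystack.dataclasses import ChatMessage

    agent = Agent(chat_generator=OpenAIChatGenerator())
    result = agent.run(
        messages=[ChatMessage.from_user("Summarize this sentence in five words.")],
        generation_kwargs={"temperature": 0.2},  # applies to this run only
    )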
  • Added a revision parameter to all Sentence Transformers embedder components (SentenceTransformersDocumentEmbedder, SentenceTransformersTextEmbedder, SentenceTransformersSparseDocumentEmbedder, and SentenceTransformersSparseTextEmbedder) to allow users to specify a specific model revision/version from the Hugging Face Hub. This enables pinning to a particular model version for reproducibility and stability.
  • Updated the components Agent, LLMMetadataExtractor, LLMMessagesRouter, and LLMDocumentContentExtractor to automatically call self.warm_up() at runtime if they have not been warmed up yet. This ensures that the components are ready for use without requiring an explicit warm-up call. This differs from previous behavior where warm-up had to be manually invoked before use, otherwise a RuntimeError was raised.
  • Improved log-trace correlation for DatadogTracer by using the official ddtrace.tracer.get_log_correlation_context() method.
  • Improved Toolset warm-up architecture for better encapsulation. The base Toolset.warm_up() method now warms up all tools by default, while subclasses can override it to customize initialization (e.g., setting up shared resources instead of warming individual tools). The warm_up_tools() utility function has been simplified to delegate to Toolset.warm_up().

🐛 Bug Fixes

  • Fixed deserialization of state schema when it is None in Agent.from_dict.

  • Fixed a bug where components explicitly listed in include_outputs_from would not appear in the pipeline results if they returned an empty dictionary. Now, any component specified in include_outputs_from will be included in the results regardless of whether its output is empty.

  • Fixed type compatibility issue where passing list[Tool] to components with a tools parameter (such as ToolInvoker) caused static type checker errors.
    In version 2.19, the ToolsType was changed to Union[list[Union[Tool, Toolset]], Toolset] to support mixing Tools and Toolsets. However, due to Python's list invariance, list[Tool] was no longer considered compatible with list[Union[Tool, Toolset]], breaking type checking for the common pattern of passing a list of Tool objects.

    The fix explicitly lists all valid type combinations in ToolsType: Union[list[Tool], list[Toolset], list[Union[Tool, Toolset]], Toolset]. This preserves backward compatibility for existing code while still supporting the new functionality of mixing Tools and Toolsets.

    Users who encountered type errors like "Argument of type 'list[Tool]' cannot be assigned to parameter 'tools'" should no longer see these errors after upgrading. No code changes are required on the user side.

  • When creating a pipeline snapshot, we now ensure use of _deepcopy_with_exceptions when copying component inputs to avoid deep copies of items like components and tools since they often contain attributes that are not deep-copyable.
    For example, the LinkContentFetcher has httpx.Client as an attribute, which throws an error if deep-copied.

💙 Big thank you to everyone who contributed to this release!

@Amnah199, @anakin87, @cmnemoi, @davidsbatista, @dfokina, @HamidOna, @Hansehart, @jdb78, @mrchtr, @sjrl, @swapniel99, @TaMaN2031A, @tstadel, @vblagoje

v2.20.0-rc2

13 Nov 10:55

Pre-release

v2.20.0-rc1

11 Nov 14:59

Pre-release

v2.19.0

20 Oct 12:53

⭐️ Highlights

🛡️ Try Multiple LLMs with FallbackChatGenerator

Introduced FallbackChatGenerator, a resilient chat generator that runs multiple LLMs sequentially and automatically falls back when one fails. It tries each generator in order until one succeeds, handling errors like timeouts, rate limits, or server issues. Ideal for building robust, production-grade chat systems that stay responsive across providers.

from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.google_genai import GoogleGenAIChatGenerator
from haystack_integrations.components.generators.anthropic import AnthropicChatGenerator
from haystack.components.generators.chat.openai import OpenAIChatGenerator
from haystack.components.generators.chat.fallback import FallbackChatGenerator

anthropic_generator = AnthropicChatGenerator(model="claude-sonnet-4-5", timeout=1) # force failure with low timeout
google_generator = GoogleGenAIChatGenerator(model="gemini-2.5-flashy") # force failure with typo in model name
openai_generator = OpenAIChatGenerator(model="gpt-4o-mini") # success

chat_generator = FallbackChatGenerator(chat_generators=[anthropic_generator, google_generator, openai_generator])
response = chat_generator.run(messages=[ChatMessage.from_user("What is the plot twist in Shawshank Redemption?")])

print("Successful ChatGenerator: ", response["meta"]["successful_chat_generator_class"])
print("Response: ", response["replies"][0].text)

Output:

WARNING:haystack.components.generators.chat.fallback:ChatGenerator AnthropicChatGenerator failed with error: Request timed out or interrupted...
WARNING:haystack.components.generators.chat.fallback:ChatGenerator GoogleGenAIChatGenerator failed with error: Error in Google Gen AI chat generation: 404 NOT_FOUND...
Successful ChatGenerator:   OpenAIChatGenerator
Response:  In "The Shawshank Redemption," ....

🛠️ Mix Tool and Toolset in Agents

You can now combine both Tool and Toolset objects in the same tools list for Agent and ToolInvoker components. This update brings more flexibility, letting you organize tools into logical groups while still adding standalone tools in one go.

from haystack.components.agents import Agent
from haystack.tools import Tool, Toolset

math_toolset = Toolset([add_tool, multiply_tool])
weather_toolset = Toolset([weather_tool, forecast_tool])

agent = Agent(
    chat_generator=generator,
    tools=[math_toolset, weather_toolset, calendar_tool],  # ✨ Now supported!
)

⚙️ Faster Agents with Tool Warmup

Tool and Toolset objects can now perform initialization during Agent or ToolInvoker warmup. This allows setup tasks such as connecting to databases, loading models, or initializing connection pools before the first use.

from haystack.tools import Toolset
from haystack.components.agents import Agent

# Custom toolset with initialization needs
class DatabaseToolset(Toolset):
    def __init__(self, connection_string):
        self.connection_string = connection_string
        self.pool = None
        super().__init__([query_tool, update_tool])
        
    def warm_up(self):
        # Initialize connection pool
        self.pool = create_connection_pool(self.connection_string)

🚀 New Features

  • Updated our serialization and deserialization of PipelineSnapshots to work with Python Enum classes.

  • Added FallbackChatGenerator, which automatically retries different chat generators and returns the first successful response, with detailed information about which providers were tried.

  • Added pipeline_snapshot and pipeline_snapshot_file_path parameters to BreakpointException to provide more context when a pipeline breakpoint is triggered.
    Added pipeline_snapshot_file_path parameter to PipelineRuntimeError to include a reference to the stored pipeline snapshot so it can be easily found.

  • A new component, RegexTextExtractor, extracts text from chat messages or string inputs based on a custom regex pattern.
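
    A minimal sketch (the import path, init parameter, and run signature here are assumptions based on this note; only the "captured_text" output key is confirmed in the v2.21.0 notes above):

    from haystack.components.extractors import RegexTextExtractor

    # Hypothetical usage: pull the first <answer>...</answer> span out of a reply.
    extractor = RegexTextExtractor(pattern=r"<answer>(.*?)</answer>")
    result = extractor.run(texts=["<answer>42</answer>"])
    print(result["captured_text"])  # "42"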

  • CSVToDocument: added conversion_mode='row' with an optional content_column parameter. Each row becomes a Document, remaining columns are stored in meta, and the default 'file' mode is preserved.
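
    A minimal sketch (the file name and column name are illustrative; parameter names are taken from this note):

    from haystack.components.converters import CSVToDocument

    converter = CSVToDocument(conversion_mode="row", content_column="text")
    result = converter.run(sources=["data.csv"])  # each row becomes one Document
    # The remaining columns are stored in each Document's meta.
    print(result["documents"])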

  • Added the ability to resume an Agent from an AgentSnapshot while specifying a new breakpoint in the same run call. This allows stepwise debugging and precise control over chat generator inputs and tool inputs before execution, improving flexibility when inspecting intermediate states. This addresses a previous limitation where passing both a snapshot and a breakpoint simultaneously would throw an exception.

  • Introduce SentenceTransformersSparseTextEmbedder and SentenceTransformersSparseDocumentEmbedder components. These components embed text and documents using sparse embedding models compatible with Sentence Transformers. Sparse embeddings are interpretable, efficient when used with inverted indexes, combine classic information retrieval with neural models, and are complementary to dense embeddings. Currently, the produced SparseEmbedding objects are compatible with the QdrantDocumentStore.

    Usage example:

    from haystack.components.embedders import SentenceTransformersSparseTextEmbedder
    
    text_embedder = SentenceTransformersSparseTextEmbedder()
    text_embedder.warm_up()
    
    print(text_embedder.run("I love pizza!"))
    # {'sparse_embedding': SparseEmbedding(indices=[999, 1045, ...], values=[0.918, 0.867, ...])}
  • Added a warm_up() function to the Tool dataclass, allowing tools to perform resource-intensive initialization before execution. Tools and Toolsets can now override the warm_up() method to establish connections to remote services, load models, or perform other preparatory operations. The ToolInvoker and Agent automatically call warm_up() on their tools during their own warm-up phase, ensuring tools are ready before use.

  • Fixed a serialization issue related to function objects in a pipeline; they are now converted to type None, since functions cannot be serialized. This issue was preventing breakpoints in agents from being set successfully and used as resume points. If an error occurs during Agent execution, for instance during tool calling, a snapshot of the last successful step is raised, allowing the caller to inspect the possible reason for the crash and use it to resume pipeline execution from that point onwards.

⚡️ Enhancement Notes

  • Added tools to agent run parameters to enhance the agent's flexibility. Users can now choose a subset of tools for the agent at runtime by providing a list of tool names, or supply an entirely new set by passing Tool objects or a Toolset.
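
    For example (a minimal sketch):

    from haystack.components.agents import Agent
    from haystack.components.generators.chat import OpenAIChatGenerator
    from haystack.dataclasses import ChatMessage
    from haystack.tools import tool

    @tool
    def add(a: int, b: int) -> int:
        """Add two numbers."""
        return a + b

    @tool
    def shout(text: str) -> str:
        """Uppercase a piece of text."""
        return text.upper()

    agent = Agent(chat_generator=OpenAIChatGenerator(), tools=[add, shout])
    # Restrict this run to a subset of the configured tools by passing tool names:
    result = agent.run(messages=[ChatMessage.from_user("What is 2 + 3?")], tools=["add"])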
  • Enhanced the tools parameter across all tool-accepting components (Agent, ToolInvoker, OpenAIChatGenerator, AzureOpenAIChatGenerator, HuggingFaceAPIChatGenerator, HuggingFaceLocalChatGenerator) to accept either a mixed list of Tool and Toolset objects or just a Toolset object. Previously, components required either a list of Tool objects OR a single Toolset, but not both in the same list. Now users can organize tools into logical Toolsets while also including standalone Tool objects, providing greater flexibility in tool organization. For example: Agent(chat_generator=generator, tools=[math_toolset, weather_toolset, standalone_tool]). This change is fully backward compatible and preserves structure during serialization/deserialization, enabling proper round-trip support for mixed tool configurations.
  • Refactored _save_pipeline_snapshot to consolidate try-except logic and added a raise_on_failure option to control whether save failures raise an exception or are logged. _create_pipeline_snapshot now wraps _serialize_value_with_schema in try-except blocks to prevent failures from non-serializable pipeline inputs.

🐛 Bug Fixes

  • Fix Agent run_async method to correctly handle async streaming callbacks. This previously triggered errors due to a bug.
  • Prevent duplication of the last assistant message in the chat history when initializing from an AgentSnapshot.
  • We were setting response_format to None in OpenAIChatGenerator by default, which doesn't follow the API spec. We now omit the parameter if response_format is not passed by the user.
  • Ensure that the OpenAIChatGenerator is properly serialized when response_format in generation_kwargs is provided as a dictionary (for example, {"type": "json_object"}). Previously, this caused serialization errors.
  • Fixed parameter schema generation in ComponentTool when using inputs_from_state. Previously, parameters were only removed from the schema if the state key and parameter name matched exactly. For example, inputs_from_state={"text": "text"} removed text as expected, but inputs_from_state={"state_text": "text"} did not. This is now resolved, and such cases work as intended.
  • Refactored SentenceTransformersEmbeddingBackend to ensure unique embedding IDs by incorporating all relevant arguments.
  • Fixed Agent to correctly raise a BreakpointException when a ToolBreakpoint with a specific tool_name is provided in an assistant chat message containing multiple tool calls.
  • The OpenAIChatGenerator implementation uses ChatCompletionMessageCustomToolCall, which is only available in OpenAI client >=1.99.2. We now require openai>=1.99.2.

💙 Big thank you to everyone who contributed to this release!

@anakin87, @bilgeyucel, @davidsbatista, @dfokina, @...
