-
Notifications
You must be signed in to change notification settings - Fork 2.6k
feat: Enable components to automatically accept multiple connections if their input type is list
#10441
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub. 1 Skipped Deployment
|
Pull Request Test Coverage Report for Build 21361994530Warning: This coverage report may be inaccurate.This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
Details
💛 - Coveralls |
|
@anakin87 you might also find this interesting to review given your work on the OutputAdapter issue |
releasenotes/notes/add-auto-joiners-for-lists-types-b07771e6d8add3da.yaml
Outdated
Show resolved
Hide resolved
…add3da.yaml Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
| # We automatically set the receiver socket as variadic if: | ||
| # - it has at least one sender already connected | ||
| # - it's not already variadic | ||
| # - its origin type is list | ||
| if not receiver_socket.is_variadic and _safe_get_origin(receiver_socket.type) == list: | ||
| receiver_socket.is_lazy_variadic = True | ||
| # We also disable wrapping inputs into list so the sender outputs matches the type of the receiver | ||
| # socket. | ||
| receiver_socket.wrap_input_in_list = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about making this it's own function so I could reuse it but it felt too short and unnecessarily obfusicated the logic of what was going on so I decided to leave as is
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we automatically set the receiver socket as variadic could that change the execution order of components in a pipeline in some edge cases? If we set it to variadic we need to postpone execution until we can be sure that there won't be any other inputs, right? Can we be sure that because of that change the component is not executed later than before ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that should all be fine. I'm utilizing the same logic that already exists. Since the execution queue is not filled until the pipeline is run this receiver is just treated like a prexisting variadic component. I added a behavioural test that copied a complicated joiner scenario with branches that didn't always fire and it passes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for changing the order of this receiver component compared to before. I don't think that would really be an issue since before, a joiner component would have lived right before the receiver so it wouldn't execute until the joiner fired anyway.
| @@ -580,14 +579,26 @@ def connect(self, sender: str, receiver: str) -> "PipelineBase": # noqa: PLR091 | |||
| # This is already connected, nothing to do | |||
| return self | |||
|
|
|||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add some tests that cover changes in this file to test/core/pipeline/test_pipeline_base.py?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
If you merge main, Coveralls message should reflect the new coverage.
| if socket.senders and socket_name in component_inputs: | ||
| # We automatically set the receiver socket as lazy variadic if: | ||
| # - it has at least one sender already connected | ||
| # - it's not already variadic | ||
| # - its origin type is list | ||
| if not socket.is_variadic and _safe_get_origin(socket.type) == list: | ||
| socket.is_lazy_variadic = True | ||
| # We also disable wrapping inputs into list so the sender outputs matches the type of the | ||
| # receiver socket. | ||
| socket.wrap_input_in_list = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is where we make a component auto variadic if a user input is a second connection to a component.
|
I'd also explain this new feature in docs. |
Good idea, do you know where in the docs this would ideally go? |
I'm not sure. Somewhere in https://docs.haystack.deepset.ai/docs/pipelines or in child pages. |
|
Writing it here so I don't forget, but a follow up idea would be to getting this to also work for components that have Any as their input type. E.g. ChatPromptBuilders. Where it would be cool that if both senders have the same type and the senders origin type are |
…into auto-variadic-input
| class TestPipelineConnect: | ||
| def test_connect(self): | ||
| comp1 = component_class("Comp1", output_types={"value": int})() | ||
| comp2 = component_class("Comp2", input_types={"value": int})() | ||
| pipe = PipelineBase() | ||
| pipe.add_component("comp1", comp1) | ||
| pipe.add_component("comp2", comp2) | ||
| assert pipe.connect("comp1.value", "comp2.value") is pipe |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I collected all connect tests under this class
| def test_connect_auto_variadic(self): | ||
| @component | ||
| class ListAcceptor: | ||
| @component.output_types(result=list[int]) | ||
| def run(self, numbers: list[int]) -> dict[str, list[int]]: | ||
| return {"result": numbers} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is the new test I added to test the new auto-variadic behavior on pipeline connection
| class TestValidateInput: | ||
| def test_validate_input_wrong_comp_name(self): | ||
| pipe = PipelineBase() | ||
| pipe.add_component("comp", FakeComponent()) | ||
| with pytest.raises(ValueError, match="Component named 'wrong_comp_name' not found in the pipeline."): | ||
| pipe.validate_input(data={"wrong_comp_name": {"input_": "test"}}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function looked to be completely untested at least from unit tests so I added quite few under this class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Is it fine if I do this in a separate PR? I can create a follow up issue |
Ok! |
anakin87
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
I'd like @julian-risch to take a look too.
julian-risch
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me too. I just would like to briefly discuss if this PR can change the order of execution of different components. Would be an edge case. Or maybe you already thought about it?
anakin87
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Related Issues
Proposed Changes:
Natively support connecting multiple outputs directly to another component without needing the user to define a
Joinercomponent.For example, a user may want to connect multiple output edges of the
FileTypeRouterto all go to the same converter (e.g.ImageConverter). Before you would have to define aListJoineras an in-between component. The changes introduced in this PR would allow you to directly connect all outputs from theFileTypeRouterto theImageConverter.Here is an example
How did you test it?
Notes for the reviewer
Checklist
fix:,feat:,build:,chore:,ci:,docs:,style:,refactor:,perf:,test:and added!in case the PR includes breaking changes.