Releases: pytask-dev/pytask
v0.4.6
What's Changed
A bug fix release that fixes an error when users used `from pytask import mark` to mark their tasks.
- Skip collection of `MarkGenerator` when using `from pytask import mark`. by @tobiasraabe in #576
Full Changelog: v0.4.5...v0.4.6
v0.4.5
Highlights
pytask is moving towards v0.5, which will remove lots of deprecated features. Please upgrade your syntax or pin pytask to <0.5 in your requirements files.
The release comes with two new features and lots of little improvements.
- pytask can now handle files that are referenced via HTTP(S) URLs or files on popular cloud storage like AWS, all thanks to universal_pathlib. Explanations can be found in this guide.
- It is now easier to extend pytask than before. The new `hook_module` configuration option allows adding modules that contain hook implementations. This guide offers more explanation.
What's Changed
- CI: remove unneeded install of graphviz on ubuntu by @NickCrews in #515
- Raise error when non-existing task paths are added to the config. by @tobiasraabe in #517
- Do not allow builtin functions as tasks. by @tobiasraabe in #519
- Enhance issue templates. by @tobiasraabe in #522
- Refactor `get_file`. by @tobiasraabe in #523
- Improve some linter and formatter rules. by @tobiasraabe in #524
- Bump actions/setup-python from 4 to 5 by @dependabot in #527
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #526
- Enable `PathNode` and `PickleNode` to deal with URLs, S3, etc.. by @tobiasraabe in #525
- Add error message for not collected tasks with `@task` decorator. by @tobiasraabe in #521
- Improve codecov and coverage. by @tobiasraabe in #528
- Bump sigstore/gh-action-sigstore-python from 2.1.0 to 2.1.1 by @dependabot in #533
- Bump actions/download-artifact from 3 to 4 by @dependabot in #531
- Bump actions/upload-artifact from 3 to 4 by @dependabot in #532
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #530
- Reenable and fix tests with Jupyter. by @tobiasraabe in #535
- Allow task functions to be partialed. by @tobiasraabe in #536
- Fix coverage. by @tobiasraabe in #537
- Change CLI entrypoint and allow passing task function to pytask.build. by @tobiasraabe in #540
- Refactor the plugin manager. by @tobiasraabe in #542
- Implement `hook_module` config option. by @tobiasraabe in #539
- Update imports in tests. by @tobiasraabe in #543
- Some changes to the docs. by @tobiasraabe in #538
- Require sqlalchemy>=2 and upgrade code. by @tobiasraabe in #544
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #541
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #546
- Release v0.4.5. by @tobiasraabe in #545
Full Changelog: v0.4.4...v0.4.5
v0.4.4
What's Changed
- Fix typing issues with the DataCatalog. by @tobiasraabe in #510
- [automated] Update plugin list by @github-actions in #511
- Improve the documentation. by @tobiasraabe in #509
Full Changelog: v0.4.3...v0.4.4
v0.4.3
What's Changed
This release contains a lot of smaller improvements and bug fixes. Here is a short list.
- #484 raises an error message when a `PathNode` is used with a directory instead of a file.
- #496 makes pytask even lazier. When a preceding task is executed and produces the same outputs, the following task will no longer be executed.
- Objects in task modules that overwrite `__getattr__` should not cause any problems anymore (#507 was fixed in #508). The same applies to importing `Task` in task modules.
Complete list of changes
- Simplify the teardown of a task. by @tobiasraabe in #483
- Correctly unconfigure pytask. by @tobiasraabe in #485
- Raise informative error when path nodes point to directories. by @tobiasraabe in #484
- Add default names to `PPathNode`s. by @tobiasraabe in #486
- Modernize `TopologicalSorter`. by @tobiasraabe in #458
- Raise error for invalid value in return annotation. by @tobiasraabe in #488
- Refactor and better test products. by @tobiasraabe in #489
- Refactor and better test parsing of dependencies. by @tobiasraabe in #490
- Addition to #489. by @tobiasraabe in #491
- Make pytask even lazier. by @tobiasraabe in #496
- Bump sigstore/gh-action-sigstore-python from 1.2.3 to 2.1.0 by @dependabot in #495
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #494
- Remove unnecessary code from the collection of tasks. by @tobiasraabe in #497
- Fix errors when using `Task` and `TaskWithoutPath` in task modules. by @tobiasraabe in #498
- Allow tasks to depend on other tasks. by @tobiasraabe in #493
- Move test dependencies to `pyproject.toml` by @tobiasraabe in #500
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #499
- Remove `MetaNode`. by @tobiasraabe in #501
- [automated] Update plugin list by @github-actions in #505
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #504
- Catch objects pretending to be `PTask`. by @tobiasraabe in #508
Full Changelog: v0.4.2...v0.4.3
v0.4.2
Highlights
This release contains a new feature and some improvements for users.
- 🚀 The new feature is the `pytask.DataCatalog` that allows users to manage dependencies and products in projects more easily. Read the tutorial to get started. 🚀
- File changes are now detected by hashes instead of modification timestamps. It should prevent accidental executions when working with cloud storage providers like Dropbox or OneDrive and in many other situations. To save runtime, pytask uses a cache for the hashes when the modification timestamp has not changed.
- Nodes now have signatures that separate how nodes are named and displayed from how nodes are identified internally. If you have written a custom node, please update it according to the how-to guide.
- All of pytask's internal files are now stored in a `.pytask` folder in your project. The file `.pytask.sqlite3` is moved to this location as well. Add `.pytask` to your `.gitignore` to prevent accidentally committing the folder.
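The hash-based change detection described above can be illustrated with a small standalone sketch. It is not pytask's implementation: the cache dictionary and the function names `file_hash` and `has_changed` are hypothetical, and SHA-256 is an assumed choice of hash. The idea is only that a file is re-hashed when its modification timestamp differs from the cached one, and that a file counts as changed only when its content hash differs.

```python
import hashlib
from pathlib import Path

# Hypothetical in-memory cache: path -> (modification time, content hash).
_HASH_CACHE: dict[str, tuple[float, str]] = {}


def file_hash(path: Path) -> str:
    """Return the SHA-256 of a file, reusing the cached hash if the mtime is unchanged."""
    mtime = path.stat().st_mtime
    cached = _HASH_CACHE.get(str(path))
    if cached is not None and cached[0] == mtime:
        # Same timestamp as last time: skip reading the file again.
        return cached[1]
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    _HASH_CACHE[str(path)] = (mtime, digest)
    return digest


def has_changed(path: Path, recorded_hash: str) -> bool:
    """A file counts as changed only if its content hash differs from the recorded one."""
    return file_hash(path) != recorded_hash
```

With this scheme, a sync client that only touches the timestamp of a file forces a re-hash but does not mark the file as changed, which is the situation the release note mentions for Dropbox or OneDrive.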
What's Changed
- Simplify building the plugin manager. by @tobiasraabe in #449
- Rename `graph.py` to `dag_command.py` and improve `collect_command.py`. by @tobiasraabe in #451
- Remove more `.svg`s and replace them with animations. by @tobiasraabe in #454
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #452
- [automated] Update plugin list by @github-actions in #453
- Add more explanation when `PNode.load()` fails during execution. by @tobiasraabe in #455
- Refer to source code on Github in API docs. by @tobiasraabe in #456
- Refactor code for `format_node_name`. by @tobiasraabe in #457
- Add hook to sort `__all__`. by @tobiasraabe in #459
- Simplify removing internal tracebacks from exceptions with cause. by @tobiasraabe in #460
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #461
- Fix import error for pluggy<1.3. by @tobiasraabe in #462
- Raise error when function is defined outside the loop body. by @tobiasraabe in #463
- Improve pins. by @tobiasraabe in #464
- Test that internal tracebacks are removed by reports. by @tobiasraabe in #465
- Add `is_product` to `PNode.load()`. by @tobiasraabe in #472
- Add a data catalog. by @tobiasraabe in #419
- Hash files instead of relying on modification timestamps. by @tobiasraabe in #469
- Move `.pytask.sqlite3` to `.pytask/`. by @tobiasraabe in #470
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #471
- Update PyPI action. by @tobiasraabe in #477
- Add node signatures. by @tobiasraabe in #473
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #476
- Add snapshot tests. by @tobiasraabe in #475
- Switch from black to ruff-format. by @tobiasraabe in #478
- Rework reports and tracebacks. by @tobiasraabe in #474
- Give skips higher precedence than ancestor failed as outcome. by @tobiasraabe in #479
- Remove checks for missing root nodes. by @tobiasraabe in #480
- Improve coverage. by @tobiasraabe in #481
- Fix handling of names and signatures of `PythonNode`s. by @tobiasraabe in #482
Full Changelog: v0.4.1...v0.4.2
v0.4.1
What's Changed
Of course, it's a mandatory bug fix release after a bigger release.
Using the product annotation, Annotated[..., Product] did not work with multiple products.
- Fix setting the name of `PythonNode`. by @tobiasraabe in #443
- Move content of `setup.cfg` to `pyproject.toml`. by @tobiasraabe in #444
- [automated] Update plugin list by @github-actions in #445
- Fix when multiple product annotations are used. by @tobiasraabe in #448
- Fix `PythonNode` when used as return. by @tobiasraabe in #446
- Simplify the `tree_map` code for generating the DAG. by @tobiasraabe in #447
Full Changelog: v0.4.0...v0.4.1
v0.4.0
News
pytask became three years old in July, which is a suitable event to rethink pytask's design and blow dust off of some of its oldest components.
Here are the highlights of v0.4.0 🚀 ⭐
Highlights
New interfaces for products.
Every argument can be declared as a product with the new `Product` annotation. The path can be passed as a default value.
```python
from pathlib import Path

from pytask import Product
from typing_extensions import Annotated


def task_hello_earth(path: Annotated[Path, Product] = Path("hello_earth.txt")):
    path.write_text("Hello, earth!")
```
More explanation can be found at https://tinyurl.com/yrezszr4.
It is also possible to use the return of the task function as a product, which allows wrapping any third-party function as a task function. Read more about it here: https://tinyurl.com/pytask-return.
```python
from pathlib import Path

from pytask import Product
from typing_extensions import Annotated


def task_hello_earth() -> Annotated[str, Path("hello_earth.txt")]:
    return "Hello, earth!"
```
Every task argument is a dependency
In older pytask versions, only paths were treated as task dependencies. That meant when you passed other arguments to the task, and they changed, it did not trigger a rerun of the task.
Now, every argument to a task can be a dependency, and you can hash them if they should trigger a rerun. It is explained in https://tinyurl.com/pytask-hash.
```python
from pathlib import Path
from typing import Annotated

from pytask import Product
from pytask import PythonNode


def task_example(
    text: Annotated[str, PythonNode(value="Hello, World", hash=True)],
    path: Annotated[Path, Product] = Path("file.txt"),
) -> None:
    path.write_text(text)
```
A new functional interface
The functional interface for pytask has been reworked and accepts a list of task functions. You can use it within your terminal or a Jupyter notebook. Read this guide to learn more about it: https://tinyurl.com/pytask-functional.
```python
from pathlib import Path
from typing import Annotated

from pytask import build


def create_text() -> Annotated[str, Path("hello_earth.txt")]:
    return "Hello, earth!"


session = build(tasks=[create_text])
```
Custom Nodes through Protocols
In the newest version, nodes (dependencies and products) and tasks follow protocols. It allows for customizations like PickleNodes that store any Python object as a pickle file and inject the object into the task when used as a dependency. It is explained in more detail in this guide: https://tinyurl.com/pytask-custom-nodes.
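To make the protocol idea concrete, here is a minimal standalone sketch that uses only the standard library. It does not reproduce pytask's actual protocols: `NodeLike` and `SimplePickleNode` are hypothetical names, and the exact set of members (`name`, `state`, `load`, `save`) is an assumption loosely based on the `node.state()` and `PNode.load()` calls mentioned elsewhere in these notes. The point is that any class with the right attributes and methods satisfies the protocol without inheriting from a base class.

```python
import pickle
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Optional, Protocol, runtime_checkable


@runtime_checkable
class NodeLike(Protocol):
    """Hypothetical stand-in for a node protocol; pytask's real protocol may differ."""

    name: str

    def state(self) -> Optional[str]: ...

    def load(self) -> Any: ...

    def save(self, value: Any) -> None: ...


@dataclass
class SimplePickleNode:
    """Stores any Python object as a pickle file. No inheritance from NodeLike is needed."""

    name: str
    path: Path

    def state(self) -> Optional[str]:
        # Use the modification time as the state; None signals a missing file.
        if self.path.exists():
            return str(self.path.stat().st_mtime)
        return None

    def load(self) -> Any:
        return pickle.loads(self.path.read_bytes())

    def save(self, value: Any) -> None:
        self.path.write_bytes(pickle.dumps(value))
```

Because the check is structural, `isinstance(SimplePickleNode("data", Path("data.pkl")), NodeLike)` evaluates to `True` even though the class never references the protocol.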
Other notable changes
- Python 3.12 is supported, and support for Python 3.7 is dropped.
- `@pytask.mark.depends_on` and `@pytask.mark.produces` are deprecated. There are better options to define dependencies and products explained in https://tinyurl.com/yrezszr4.
- `@pytask.mark.task` is also deprecated and replaced by `from pytask import task` and `@task`.
What's Changed
- Remove Python 3.7 support and add a new action for mamba. by @tobiasraabe in #323
- Replace pony with sqlalchemy>=1.4.36. by @tobiasraabe in #387
- Remove `@pytask.mark.parametrize`. by @tobiasraabe in #391
- Parse dependencies from all args if `depends_on` is not used. by @tobiasraabe in #384
- Add products with `typing.Annotation`. by @tobiasraabe in #394
- Refactor pybaum to `_pytask.tree_util`. by @tobiasraabe in #395
- Replace pybaum with optree and add paths to PythonNode names. by @tobiasraabe in #396
- Add support for `NamedTuple` and attrs classes in `@pytask.mark.task(kwargs=...)`. by @tobiasraabe in #397
- Deprecate decorators for `depends_on` and `produces`. by @tobiasraabe in #398
- Use protocols instead of ABCs. by @tobiasraabe in #402
- Allow tasks to return products. by @tobiasraabe in #404
- Tracking changes in v0.4.0. by @tobiasraabe in #400
- Bump peter-evans/create-pull-request from 5.0.1 to 5.0.2 by @dependabot in #390
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #388
- Allow to use prefix trees as nodes to parse function returns. by @tobiasraabe in #406
- Remove `.value` from `Node` protocol. by @tobiasraabe in #408
- Make `.from_annot` an optional feature of nodes. by @tobiasraabe in #409
- Allow to pass functions to `PythonNode(hash=...)`. by @tobiasraabe in #410
- Add protocols for tasks. by @tobiasraabe in #412
- Remove scripts to generate `.svg`s. by @tobiasraabe in #413
- Allow more ruff rules. by @tobiasraabe in #414
- A new functional interface. by @tobiasraabe in #411
- Deprecate `@pytask.mark.task` in favor of `@pytask.task`. by @tobiasraabe in #417
- Simplify and fix code in `dag.py`. by @tobiasraabe in #418
- Convert `DeprecationWarning` to `FutureWarning` for deprecated decorators. by @tobiasraabe in #420
- Remove deprecation warning for `produces`. by @tobiasraabe in #421
- Document new interface. by @tobiasraabe in #392
- Fix `import_path`. by @tobiasraabe in #424
- Publish `pytask.tree_util`. by @tobiasraabe in #426
- Fix type annotations of `task.depends_on` and `task.produces`. by @tobiasraabe in #427
- Document functional interface. by @tobiasraabe in #423
- Update example in `README.md`. by @tobiasraabe in #428
- Add better error message when `node.state()` throws error during DAG validation. by @tobiasraabe in #429
- Update parts of the documentation. by @tobiasraabe in #430
- Enable colors in WSL. by @tobiasraabe in #431
- Fix type checking for `pytask.mark.x`. by @tobiasraabe in #432
- Fix ids of `PythonNode`s. by @tobiasraabe in #433
- Add support for Python 3.12. by @tobiasraabe in #434
- Fix detection of task functions. by @tobiasraabe in #437
- Clarify some types. by @tobiasraabe in #438
- Refine typing. by @tobiasraabe in #440
Full Changelog: v0.3.2...v0.4.0
v0.4.0rc4
The last pre-release.
v0.4.0rc3
A couple of new fixes. Most notably a fix for the ids of PythonNodes that should prevent rebuilds.
v0.4.0rc2
Another release candidate that fixes the installation via conda and adds full support for pytask-parallel.