Add a deployment mode to skip non-essential validation checks during kedro run #4671

Open
DimedS opened this issue Apr 14, 2025 · 2 comments
Labels
Issue: Feature Request New feature or improvement to existing feature

Comments

@DimedS
Member

DimedS commented Apr 14, 2025

Description

Introduce a dedicated deployment mode for the kedro run command, which could be enabled via a flag (e.g., --deployment) or through project settings. This mode would skip non-essential validation checks that are only useful during pipeline development.

This ticket is a follow-up to the discussion in #4603 (comment)

Currently, there are more than five checks prefixed with _validate_ that verify the pipeline structure. However, once a pipeline has been developed and tested locally, these checks add little value when the pipeline is run repeatedly in a deployment environment with no changes to the pipeline code.
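
To make the idea concrete, here is a minimal sketch of how such a mode could gate the checks. It is a toy model, not Kedro's actual code: the `Node` tuple, the two `_validate_*` functions, the `CHECKS` registry, and the `deployment` parameter are all hypothetical, and which checks count as "essential" is an assumption that would need to be decided per check.

```python
from collections import Counter

# Hypothetical toy model: a node is (name, inputs, outputs).
Node = tuple[str, list[str], list[str]]


def _validate_node_names(nodes: list[Node]) -> None:
    """Illustrative development-time check: node names must be unique."""
    counts = Counter(name for name, _, _ in nodes)
    dupes = [name for name, count in counts.items() if count > 1]
    if dupes:
        raise ValueError(f"Duplicate node names: {dupes}")


def _validate_outputs(nodes: list[Node]) -> None:
    """Illustrative check: each output is produced by exactly one node."""
    counts = Counter(out for _, _, outs in nodes for out in outs)
    dupes = [out for out, count in counts.items() if count > 1]
    if dupes:
        raise ValueError(f"Outputs produced by more than one node: {dupes}")


# (check, essential) pairs. The essential/non-essential split is an assumption.
CHECKS = [
    (_validate_node_names, False),
    (_validate_outputs, True),
]


def run_checks(nodes: list[Node], deployment: bool = False) -> list[str]:
    """Run the checks and return the names of those that were executed."""
    executed = []
    for check, essential in CHECKS:
        if deployment and not essential:
            continue  # skipped in deployment mode
        check(nodes)
        executed.append(check.__name__)
    return executed
```

With `deployment=False` both checks run; with `deployment=True` only the one marked essential runs. Keeping the split in a single registry would make it easy to review which checks a deployment run gives up.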

Benefits

  • Reduced runtime overhead in production
  • More control for advanced users deploying stable pipelines
  • Aligns with patterns seen in other tools that offer dev vs. prod execution modes
@DimedS DimedS added the Issue: Feature Request New feature or improvement to existing feature label Apr 14, 2025
@datajoely
Contributor

How are we ensuring dependencies are valid? We have a wider uv ticket; it would be great to look into compiling just the necessary requirements.

@datajoely
Contributor

I'd also like to see how we handle the following requirement:

  • as a user I'm deploying my Kedro pipeline to a k8s based orchestrator
  • part of my Kedro pipeline is data engineering focused and needs Spark, JDK etc.
  • part of my Kedro pipeline is ML focused and needs Tensorflow/Pytorch

I only want to deploy the relevant dependencies to each target container. Note that deploying the same code twice is not an issue (it is very cheap), even if only part of it is executed in each location.
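
As a rough illustration of the requirement above, the sketch below resolves a dependency set per target container from the pipelines that container will run. Everything in it is hypothetical: the pipeline names, the `PIPELINE_REQUIREMENTS` mapping, and the `requirements_for` helper are not existing Kedro or uv features, and the package names are only examples.

```python
# Hypothetical mapping from pipeline name to the extra packages it needs.
PIPELINE_REQUIREMENTS = {
    "data_engineering": {"pyspark"},
    "data_science": {"tensorflow"},
}

# Packages every container needs regardless of which pipeline it runs.
BASE_REQUIREMENTS = {"kedro"}


def requirements_for(pipelines: list[str]) -> list[str]:
    """Return the sorted package list for a container running `pipelines`."""
    reqs = set(BASE_REQUIREMENTS)
    for name in pipelines:
        reqs |= PIPELINE_REQUIREMENTS[name]
    return sorted(reqs)


# One requirement list per target container; the code itself ships to both.
spark_container = requirements_for(["data_engineering"])
ml_container = requirements_for(["data_science"])
```

The same project code is deployed to both containers, and only the installed packages differ. Each resulting list could then be written out as a requirements file for the corresponding image.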
