Skip to content

Conversation

@amoghrajesh
Copy link
Contributor

@amoghrajesh amoghrajesh commented Dec 31, 2025

Why

As of today, workers unfortunately need plugins for macros (template rendering), listeners, and operator extra links. Now, the plugins manager is present in core airflow and sdk code needs to have a cross package import for it.

What

I am trying to move the generic plugin management code to a shared library (shared/plugins_manager) to support client-server separation between airflow-core and task-sdk. This ensures both components use the same plugin discovery and integration logic without code duplication which can happen if we just copy paste the entire code in both packages.

Changes of note

What was moved into shared library

  • Generic methods, classes, ABCs etc

For example:

  • AirflowPlugin base class
  • AirflowPluginSource, PluginsDirectorySource, EntryPointSource
  • _load_plugins_from_plugin_directory(), _load_entrypoint_plugins()
  • integrate_macros_plugins(), integrate_listener_plugins()
  • is_valid_plugin(), make_module()

The principle I used here was that we have to keep the library standalone and hence move no provider discovery, logic to get plugins (read from providers too), anything similar like this.

In order to do this, we maintain wrappers in core airflow and sdk now and the code within each of those use their own versions.

Some function signature changed for dependency injection

Some methods needed a change in signature to be able to inject library specific values to these functions.

Shared code couldn't import airflow.settings or airflow.configuration.conf, so:

Function Old Params New Params
PluginsDirectorySource.__init__ (self, path) (self, path, plugins_folder: str)
_load_plugins_from_plugin_directory () (plugins_folder: str, load_examples: bool = False, example_plugins_module: str | None = None)
integrate_macros_plugins () (macros_module: ModuleType, macros_module_name_prefix: str, plugins: list[AirflowPlugin])
integrate_listener_plugins () (listener_manager: ListenerManager, plugins: list[AirflowPlugin])

These values are injected via core airflow / task sdk wrappers

Providers

Added AirflowPlugin to common.compat.sdk. Added # use next version to affected providers.

TODO / Open Qns

  1. Docs: Unsure what to do about it, should we continue endorsing imports of format: from airflow.plugins_manager or move to from airflow.sdk.plugins_manager? I am doubtful because plugins can also be used for CORE only modifications like react_apps or fastapi_apps.
  2. Guess I can move most of the tests to shared

^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in airflow-core/newsfragments.

@amoghrajesh amoghrajesh changed the title Move plugins manager to shared Dec 31, 2025
@amoghrajesh amoghrajesh marked this pull request as ready for review December 31, 2025 13:52
@amoghrajesh amoghrajesh self-assigned this Dec 31, 2025
Comment on lines +30 to +32
# TODO: these should be moved into module_loading i think
from airflow.utils.entry_points import entry_points_with_dist
from airflow.utils.file import find_path_from_directory
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will send in a precursor for this as well.

@@ -0,0 +1,280 @@
#
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe do it as I did with modules_loading and move all that to __init__.py - i found it much nicer as you would not have to repeat plugins_manager twice in imports.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes, i should do that


if macros_module_instance:
sys.modules[macros_module_instance.__name__] = macros_module_instance
# Register the newly created module on airflow.macros such that it
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is not valid -> the macros module is now injected from outside.



def integrate_macros_plugins(
macros_module: ModuleType, macros_module_name_prefix: str, plugins: list[AirflowPlugin]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
macros_module: ModuleType, macros_module_name_prefix: str, plugins: list[AirflowPlugin]
target_macros_module: ModuleType, macros_module_name_prefix: str, plugins: list[AirflowPlugin]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think also betted docstrings in those methods would be really helpful to understand what they do.

Copy link
Member

@potiuk potiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really cool. Just few comments.

@amoghrajesh
Copy link
Contributor Author

Thanks @potiuk! It's almost complete :)

Do you have any thoughts on open qns I posted in the PR desc?

Docs: Unsure what to do about it, should we continue endorsing imports of format: from airflow.plugins_manager or move to from airflow.sdk.plugins_manager? I am doubtful because plugins can also be used for CORE only modifications like react_apps or fastapi_apps.

Especially

from typing import TYPE_CHECKING, Any

# TODO: these should be moved into module_loading i think
from airflow.utils.entry_points import entry_points_with_dist
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handled in: #59971

@potiuk
Copy link
Member

potiuk commented Dec 31, 2025

Docs: Unsure what to do about it, should we continue endorsing imports of format: from airflow.plugins_manager or move to from airflow.sdk.plugins_manager? I am doubtful because plugins can also be used for CORE only modifications like react_apps or fastapi_apps.

I think what we should do now is to define sdk.AirflowProviderPlugin and airflow.AirflowCorePlugin and make the old "AirflowPlugin" (for now) - and possibly we can already start importing the right ones in the right places as base classes. There is a bit of difficulty with Listeners though - because some listeners are "core" listeners, and some are "worker" listeners - but I think making it clear which listener is "task-sdk" and which one is "core" is something that we should do long time ago anyway. Our users (and ousrselves) are utterly confused where each listener is executed and it would be nice to make it clear by separating the "core" and "task-sdk" plugin interfaces now - even if for now we will use the same discovery mechanism in both. Possibly also eventually those two different plugin types should be advertised using different entrypoints - but that's something we can do later likely.

A bit more context why and what is my "North Star" here.

I think eventually we should have two (or maybe even 3) types of plugins. I thought about it in the past when discussing the task isolation with @ashb and splitting the distributions, and I am quite sure we will have to do it sooner or later (and rather sooner) and have different distribution for "core" and "task.sdk" plugins - when things are different than regular "providers" (i.e. when we have provider-specific executors, listeners and macros).

This is related to something we discussed yesterday with @kacpermuda in #59921 (comment) -> and it's clearly visible there because as of Airflow 3.2. openlineage is a bit in a Shroedinger state - it both needs, and does not need sqlalchemy, depending if it is installed with airflow-core (needs), or with task-sdk (might use when installed but does not need it). We get-by for now by declaring sqlalchemy as an optional dependency, and handling the optionality in the "worker" side of things.

I don't think we are "ready" to discuss the exact way how to do the split - with naming and implementation details - (we need to complete core isolation from task.sdk first).

But I think I have a very clear idea how it can work on high level (and would be the ultimate distribution split I had in mind when we started discussing task-isolation).

My thinking is that we might have two types of distribution - current task-sdk providers (hooks, operators, macros and other task-sdk side plugins) - and core-plugins (apache-airflow-core-plugins-openlineage) where we will only have "core" types of plugins. Possibly even later we can split out UI plugin types apache-airflow-ui-plugins-openlineage ??) - but that would only be needed if we decide to split out "airflow-ui" from "airflow-core" - which we might not want to do at all - depending on how different "UI" and "core" dependencies turn out after we complete current isolation work.

That would be very, very visible with edge provider - which (if we do all the split till the end) should have all 3 (!) types of distributions (cc: @jscheffl):

  • apache-airflow-providers-edge3 -> worker side (edge3 does not have task-sdk plugins I think but they coudl be defined here)
  • apache-airflow-core-plugins-edge3 -> executor
  • apache-airlfow-ui-plugins-edge3 -> UI plugin

Similarly:

  • apache-airflow-providers-amazon (including amazon-specific macros if there are any).
  • apache-airflow-core-plugins-amazon -> executor

A little bit of problem there is that those "separate" distributions will necessarlily need to share some code. But this problem is largely solved now with shared concept of ours. We would just have to extend the shared concept we have now to providers - because those different distributions (say for amazon) will have some common code. We should have for example shared.amazon_common and use it in both apache-airflow-providers-amazon and apache-airflow-providers-core-plugins for example.

That is my dream structure of Airflow 3 that is my "North Star".

We do not have to decide on all that now - naming, folders, etc. can (and should be) discussed later when we complete the current isolation work. But we can do some preliminary work - i.e. start separating "types" of plugins.

Guess I can move most of the tests to shared

Yes.

fastapi_root_middlewares: list[Any] = []
external_views: list[Any] = []
react_apps: list[Any] = []
menu_links: list[Any] = []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a long time it bothered me that most of the element lists are of type Any - as we are moving here, would it be (non breaking interface?) possible to change them to a TypedDict such that elements needed are at least described?

Copy link
Contributor

@jscheffl jscheffl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good also in my view.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment