This issue has been rescoped (original scope)

Problem/Motivation

Coming from #3566536: [meta] eliminate core .module files

Proposed resolution

  • Create a new service Drupal\filter\TextFormatter
  • Move the code from check_markup() into a new method Drupal\filter\TextFormatter::format().
  • Deprecate check_markup()

Remaining tasks

None.

User interface changes

None.

Introduced terminology

None.

API changes

  • New service and method Drupal\filter\TextFormatter:: format()
  • check_markup() is deprecated

Data model changes

None.

Original report

In my opinion (and I fully expect to get shot down, but that's what suggestions are for) the name of check_markup() is problematic in several ways:

1.) False friends: check_markup()'s nominal counterpart is historically the check_plain() function, which fits on some level since you call them in similar ways and sometimes for similar reasons. But check_plain() is in common.inc, while check_markup() is part of filter.module. This distinction goes very deep; check_plain() is a basic library function while check_markup() is high-level and tied to a site's input format settings.

2.) Code refactoring: As mentioned above, check_markup() is part of filter.module. Almost all non-theme module functions are prefixed with the name of the module as a general policy.

3.) Purpose: When you "check" something, you make sure it is safe, correct, valid, etc. This may work for check_plain(), which is an output security feature. check_markup() can take the same role with certain filters (HTML correction, HTML restriction). But it can also do something completely different, like evaluate PHP, render BBCode, generate syntax-highlighted source code... none of which have to do with output security.

-

I don't have an alternative suggestion myself, except that it should probably begin with "filter_". But I do think what we have now could be improved, and since we are renaming functions already (drupal_execute to drupal_form_submit, for example) it might be a good idea to look at it for Drupal 7.

Issue fork drupal-455724

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Comments

greggles’s picture

check_plain - sanitizes text
check_markup - filters text, may sanitize, may not
filter_xss_admin - sanitizes text more liberally

I agree, this is a naming inconsistency and a DX issue that leads to a potential security issue.

Maybe check_markup could be changed to something about "running the filter" or filter_text or apply_filter.

cburschka’s picture

Status: Active » Needs review
StatusFileSize
new9.55 KB

filter_output() is probably not a perfect name either, but here is a patch to get things rolling.

Edit: This comes from not reloading the page in time! :)

We have a number of possibilties beside the one in this patch, including filter_apply(), filter_run(), filter_text(), filter_markup() or filter_execute().

heine’s picture

As filtertext and filterformat are always needed both, why not make a richtext object offered by filter module with a method that outputs HTML?

greggles’s picture

I think in 7.x we are moving to "[verb]_[noun]" style, which is where I got my suggestions. filter_output is in that format as well, and does a good job of communicating the idea that Drupal filters on output, but we aren't really filtering "output" we're filtering input and making it output.

If we do this, then perhaps we should reevaluate filter_xss and any other helper functions (maybe there aren't any, but I don't know) as well.

heine’s picture

check_plain converts plain text to HTML.
check_markup converts richtext (RT) to HTML.

filter_output doesn't tell me that either.

greggles’s picture

check_markup does a ton of other things as well.

Another related function worth reconsidering:

drupal_html_to_text

Heine pointed out in irc (and I somewhat agree) that check_markup will always return HTML. That's kind of true, though it's possible that a chunk of php would return an empty string or that an input format which has no filters assigned to it can return any text instead of HTML. To be really accurate the new name for check_markup should either be agnostic or we should change the way it works to always return HTML (I prefer the first solution).

I'd be happy with something like:

drupal_html_to_text -> leave alone
check_plain -> drupal_text_to_html

filter_xss, filter_xss_admin leave alone.
check_markup -> filter_text

This follows the logic that the functions with known output be called drupal_[something]_to_[otherthingthing] and that functions which do a more complex filtering or which have a specific filtering purpose are called filter_*

cburschka’s picture

drupal_html_to_text -> leave alone

It makes sense for drupal_html_to_text to be named this way, and it also makes sense for drupal_text_to_html to be named this way. But they will need comments to clarify that their operation is not symmetric or even related (check_plain escapes HTML-relevant characters, while d_h_t_t generates a "close-enough" plaintext version of HTML specifically for emails - it lives in mail.inc).

filter_text() sounds good, though.

Still leaving on NR for more comments.

heine’s picture

That's kind of true, though it's possible that a chunk of php would return an empty string or that an input format which has no filters assigned to it can return any text instead of HTML.

That's kind of beside the point as the output is slung into the content as if it _was_ HTML.

catch’s picture

filter_text($text, $format) is really tidy.

And we have 'text formats' now instead of input formats so it's pretty self explanatory there as well.

Status: Needs review » Needs work

The last submitted patch failed testing.

cburschka’s picture

Status: Needs work » Needs review
cburschka’s picture

StatusFileSize
new9.56 KB

Here is a new patch using filter_text().

Note that check_markup is a popular function that contrib may call as well, so this change should be mentioned in the module updating guide.

Status: Needs review » Needs work

The last submitted patch failed testing.

cburschka’s picture

Status: Needs work » Needs review

Testbot is silly.

sun’s picture

Good.

This API change would be pretty nonsense on its own, but luckily we can accompany this change with some real value: #446518: Remove $check argument from check_markup()

Most probably, however, this means that we need to merge this patch into the other issue.

dmitrig01’s picture

Check_plain doesn't check if something is plain - it converts it.

sun’s picture

And?

dries’s picture

(Somewhat off-topic: newer versions of PHP sports some filter mechanisms. I haven't looked at them recently, but I wonder if it makes sense to bring ours closer to PHP's? It could help developers.)

agentrickard’s picture

Semantically, I suppose I'd like to see some consistent use of filter_[name] where the second element indicates what will be output. That might give us:

filter_text() --> check_plain()
filter_html() --> check_markup()
filter_xss_safe() --> filter_xss() and filter_xss_admin()

And so forth. Not sure about the last function, since it effectively returns HTML.

dmitrig01’s picture

to me, filter_X implies that I want the input to be X (filter text means I want to do some filtering on a piece of text, not that I want to output text).

cburschka’s picture

And so forth. Not sure about the last function, since it effectively returns HTML.

In fact, all of these functions return HTML. check_plain() takes plain text and escapes characters that are reserved in HTML. The result is valid HTML markup that will show the plain content (albeit with collapsed whitespace) when viewed in a browser.

Consistency is good, but "filter_[type]" sounds like [type] is what goes in, not what comes out. One would expect filter_html to accept html to be filtered (which it may do, but which isn't its defining purpose). filter_markup() may be better.

There is one point this change *wouldn't* address, though: check_plain() is not a filter.module function. It is a common.inc function. It is used at levels that don't even know what a module or a database is - see conf_init() in bootstrap.inc. check_plain() cannot be moved out of common.inc for this reason. If we rename it to filter_plain(), we've got the opposite of our current problem: Instead of a filter.module function being named as if it was elsewhere, we'd have a common.inc function named as it if was part of filter.module.

agentrickard’s picture

/me almost concedes the point.

PHP uses filter_* for these functions -- http://us.php.net/manual/en/book.filter.php -- which effectively negates dmitri's argument.

If we did not have Drupal-specific functions for this, we would use something like:

// Print a user name securely.
global $user;
print filter_var($user->name, FILTER_SANITIZE_STRING);

See -- http://us.php.net/manual/en/filter.filters.sanitize.php -- for the sanitize tokens.

Would it make sense to use secure_ as the function namespace?

e.g.

secure_plain()
secure_html()
secure_xss()

-or-

secure_output_plain()
secure_output_html()
secure_prevent_xss()

This namespace is not used by PHP or Drupal ATM.

The other option, it seems to me is to use the new PHP 5 functions and dustbin our old Drupal-specific ones.

sun’s picture

Mixing check_plain() in here and trying to unify these function names is wrong. check_plain() and check_markup() are completely different functions. The resulting content of a variable processed by check_markup() can be anything but not necessarily safe for HTML. What you really mean are the filter_xss*() functions, which are about to be moved out of filter.module: #470632: Move filter_xss*() into common.inc

That's why I agree that check_markup() is poorly named and should using a prefix of filter_ instead. It applies a configurable set of filters - or none at all - to some arbitrary input. The input - as well as the output - can be plain text, ASCII, raw HTML, safe HTML, XML, PHP, BBCode or whatever. The result can be safe HTML, but can also be the original, unaltered input. There are modules that use filters to not output HTML, but plain-text that is safe for sending via e-mail.

agentrickard’s picture

@sun -- cross post while I was gathering info. Does #22 change anything for you?

sun’s picture

Nope.

1) Renaming of filter_xss*() as well as check_plain(), check_url(), check_file(), valid_email_address(), valid_url(), and possible more should happen in a separate issue. Those functions validate and/or sanitize (ensure) a certain syntax or format. check_markup() has nothing to do with them.

2) Which also means that we'll simply move the filter_xss*() functions in #470632: Move filter_xss*() into common.inc without renaming, so the renaming may or may not happen later on.

3) This issue is solely about renaming check_markup() in filter.module.

agentrickard’s picture

I am in agreement with sun.

On this patch, I like filter_text, so this is a vote for RTBC.

Other issues moved to #471184: Reconcile Drupal's input security functions with PHP filter_*, including the renaming of other functions and possibly replacing this function with filter_var.

sun’s picture

cburschka’s picture

Should all discussion of this issue move there?

sun’s picture

No - I repeat: check_markup() is not a validation/sanitation function. It is a tool to apply a set of configurable filters to a string.

heine’s picture

check_plain is also not just a validation/sanitation function. It is a tool to convert plain text strings (Ladies & Gentlemen) to HTML (Ladies & Gentlemen).

While not apparant in the PHPDoc for check_markup, its intended output is HTML. See Handle text in a secure fashion. I consider it a hack that some modules use this in another way.

Instead of renaming check_markup to a very ambiguous filter_text, it is time for a good look at how Drupal handles different string types, how to make the relation between function and the conversion being done can be made obvious. and how this all can be made easier for developers to prevent foe #1: XSS.

dmitrig01’s picture

as per http://acko.net/blog/safe-string-theory-for-the-web what about filter($context, $string, $param) which would do everything?

cburschka’s picture

Intriguing idea - but what is "everything"?

- filtering the request's cookie domain while finding the site's settings.php file on bootstrap
- querying an in input format from the database, getting its component filters and dispatching the content to the various modules

There's no way to do all of this in the same function without playing havoc with encapsulation.

agentrickard’s picture

We are also moving away from $context style functions (replace $context with $op to get a clear idea). So having filter('email', $input_string, $param); seems like a step backwards to me.

greggles’s picture

Just some more brainstorming on this... we also have db_escape_table and db_escape_string that will sanitize strings for use in different contexts.

Regardless of whether the $context is in the function name or the first argument to the function, there is a potential solidify a whole lot of different "escape strings for use in context X" functions.

sun’s picture

Good catches! However, those should go into #471264: Consistently name validation/sanitation functions.

Even if the purpose of check_markup() was (or is) to produce HTML, it still applies a configurable set of filters to the value (the input format). The major difference to all other functions is that a developer cannot rely on an expected, validated and sanitized result in a certain syntax or format.

Let's keep this issue on focus. I would be very happy to discuss potential ideas for changes to check_markup() in a separate issue (please leave a pointer here).

heine’s picture

Even if the purpose of check_markup() was (or is) to produce HTML, it still applies a configurable set of filters to the value (the input format).

Right, it converts richtext (with an associated format) to HTML.

The major difference to all other functions is that a developer cannot rely on an expected, validated and sanitized result in a certain syntax or format.

Nor does the developer have to.

Changing check_markup to filter_text doesn't add a thing to understanding its function. And change for the sake of change is IMO never a good argument for an API change.

sun’s picture

That's why I referred to #446518: Remove $check argument from check_markup() in #15.

If we can find a better name than filter_text(), sure. However, putting the function into the namespace of filter.module makes a lot sense to me.

agentrickard’s picture

Well, thinking about this, I would call it filter_output(), since it is designed to sanitize data before outputting markup.

Status: Needs review » Needs work

The last submitted patch failed testing.

sun’s picture

Priority: Normal » Critical

Bumping to critical, since #446518: Remove $check argument from check_markup() has been committed.

We need a new function name for check_markup() ASAP now.

cburschka’s picture

We should probably avoid getting caught on the output/text bikeshed. What makes this issue critical now seems to be the "check_".

The proposals so far have been, unless I glossed over something:

1. Renaming s/check_markup/filter_[foobar]/g (where foobar is still to be determined)
1b) Accompanied by renaming other assorted filtering functions like check_plain, drupal_html_to_text, filter_xss_admin, etc.
2. Restructure filtering into an OO form, with a RichText::html() class/method.
3. Consolidate all filtering into a single filter($op, ...) function.

The last is a deprecated pattern being removed from D7, so we're not using that I guess.

What about the others?

agentrickard’s picture

StatusFileSize
new12.77 KB

Now is not the time to introduce new OO, IMHO, so I would rule #2 out.

So let's just check for namespace collisions on filter_foo() --> http://us2.php.net/manual/en/book.filter.php

filter_markup() is fine by me, and legal. Patch attached.

agentrickard’s picture

Status: Needs work » Needs review

See also #557148: Rename check_plain() to drupal_htmlspecialchars() for discussion of check_plain() and a patch.

sun’s picture

Status: Needs review » Reviewed & tested by the community

I thought about this and I think that the proposed change is very good, because

- "markup" in filter_markup() can mean both input and output, which is actually the case for this function. You can filter text to HTML, other markup languages to HTML, and HTML to HTML; and even if it was not intended to allow for it, it also allows to filter HTML to text.

- filter_markup() is in the namespace of the implementing module.

- filter_markup() is both understandable by existing developers and new developers, while it's an easily recallable API change for existing developers.

damien tournoud’s picture

Status: Reviewed & tested by the community » Needs work

I'm with Heine on the richtext_to_html() proposal (http://drupal.org/node/557148#comment-1959718).

agentrickard’s picture

But the function is _in_ filter module.

So namespaces without filter_ are no good.

This is a horrible bikeshed debate, IMO. filter_markup is a perfectly good incremental improvement. If you want a name that doesn't start with filter_, then you have to move the function to common.inc.

sun’s picture

Status: Needs work » Needs review

ok, seems like we need a bit more discussion.

richtext_to_html() is wrong, because

- it is not guaranteed that it returns HTML; whether it returns HTML or not depends on the enabled filters in a format

- you can pass plain text, HTML, or other markup languages in; whether the input is properly converted into the intended output language depends on the enabled filters in a format

- the function is, unlike all other filter_* and check_* functions, a highly configurable converter that applies multiple string/text processors to the input; only each (enabled) text processor is supposed to return some expected result, while not all processors need to be enabled.

- it's not in the namespace of filter.module

catch’s picture

I like filter_markup().

agentrickard’s picture

That's 3 votes for filter_markup().

Any other filter_* suggestions?

sun’s picture

Status: Needs review » Reviewed & tested by the community
heine’s picture

Status: Reviewed & tested by the community » Needs work

This is not a bikeshed debate. Proper text handling appears to be one of the things most taxing on the Drupal developer's brain. Just count the XSS issues reported in the last two years.

I'm disappointed by the focus on the namespace and the imo callous disregard for central role of the check_* and filter_* functions in string handling. Getting these function names right could minimize developer confusion and prevent countless XSS issues in the coming years.

Even in this issue I see confusion about what check_markup does: check_markup takes richtext (an 'arbitrary' format defined by the associated INPUT FORMAT) and converts it to HTML. That the resulting string doesn't have to contain a single tag doesn't mean the resultant string type isn't HTML; the function implicitly returns a string that is destined to be used as HTML. If check_markup did not return HTML, how could we show nodes and comments to visitors?

(BTW Filter use by the messaging module is a hack; the module will be rewritten.)

cburschka’s picture

From the DX standpoint, Drupal is a web interface. Everything it outputs is converted to HTML (and if some day we want alternatives to that, the entire filtering system would need a rewrite anyway). In that sense, "Filter this" is equivalent to "[convert] this to HTML".

If "filter_richtext" and "richtext_to_html" are equivalent in meaning, then the first fits a lot better.

sun’s picture

Status: Needs work » Needs review

You seem to insist on the mysterious assumption that check_markup() always returns HTML that is safe for output.

Please have one more look at http://api.drupal.org/api/function/check_markup/7 and tell me where filter_xss() or any other sanitation function is invoked there. It is not.

check_markup() applies an arbitrary set of text processors to some string. That set can contain some processors or can be empty. Whatever string or formatted text you pass in may or may not be converted to something else.

Sanitation only happens if at least one security filter is enabled in the given text format, for example "HTML filter" (but there are more advanced security filters in contrib).

check_markup() is able to process many markup languages, especially lightweight markup languages, such as Markdown or BBCode. It is processing markup and is supposed to return HTML. The result may or may not be sanitized, depending on whether any security filter is additionally enabled in the given format.

For that sake, the "check" in check_markup() referred to the user access check for the given format that was previously contained. The $check parameter is gone now, so all what this function does now is to apply arbitrary filters to some potentially formatted text in some markup language and potentially convert that to another markup language (HTML is supposed). Therefore, filter_markup().

agentrickard’s picture

sun made the argument much better than I ever could. This is RTBC still.

cburschka’s picture

For the sake of completeness, whether we call something "markup" or "richtext" really /is/ a bikeshed. However, markup (as in Hyper-Text Markup Language) is the term we're already using throughout Drupal, whereas the first thing anyone connects with "richtext" is Microsofts completely unrelated RTF format.
So yeah, +1 for [filter_] and +1 for [_markup] from me.

heine’s picture

You seem to insist on the mysterious assumption that check_markup() always returns HTML that is safe for output.

Did I?

That is absolutely NOT my "assumption". Check_markup returns HTML. That's it. Whether or not it is "safe" is immaterial and something for filter_access to resolve.

check_markup() is able to process many markup languages, especially lightweight markup languages, such as Markdown or BBCode. It is processing markup and is supposed to return HTML. The result may or may not be sanitized, depending on whether any security filter is additionally enabled in the given format.

Excellent so we agree. check_markup processes a markup language defined by the input format to HTML.

For that sake, the "check" in check_markup() referred to the user access check for the given format that was previously contained. The $check parameter is gone now, so all what this function does now is to apply arbitrary filters to some potentially formatted text in some markup language and potentially convert that to another markup language (HTML is supposed). Therefore, filter_markup().

I didn't think I pledged for keeping the check_* function names. Those are pretty lame. filter_markup IMO sucks just as bad.

agentrickard’s picture

Is there a filter_* named function you thinks sucks less?

heine’s picture

On this question:

Is there a filter_* named function you thinks sucks less?

I can only answer with a quote from my earlier post (emphasis added):

I'm disappointed by the focus on the namespace and the imo callous disregard for central role of the check_* and filter_* functions in string handling. Getting these function names right could minimize developer confusion and prevent countless XSS issues in the coming years.

Status: Needs review » Needs work

The last submitted patch failed testing.

cburschka’s picture

Status: Needs work » Needs review
StatusFileSize
new12.83 KB

Broken by a change in the function code.

Status: Needs review » Needs work

The last submitted patch failed testing.

cburschka’s picture

Status: Needs work » Needs review
StatusFileSize
new12.99 KB

Renaming {box} to {block_custom} changed one line too close to a check_markup call.

Status: Needs review » Needs work

The last submitted patch failed testing.

cburschka’s picture

Status: Needs work » Needs review

#394182: DBTNG search.module broke HEAD. Retest.

agentrickard’s picture

Personally, I still like one of the following:

filter_output
filter_markup
drupal_filter
drupal_sanitize
drupal_secure_text
drupal_secure_output
drupal_filter_markup
filter_markup

Any chance we can pick something and move forward? And if the namespace is not filter_*, do we have to move the function to common.inc?

roychri’s picture

I personally do not like *_secure_* *_sanitize because the output may not be secure at all.

We can also specify in the PHPDoc that the output will be valid (perhaps insecure) HTML format (even if no tags are present in the output). Security depends on the actual filters that were applied based on access permissions.

The documentation says "Run all the enabled filters on a piece of text."

What about drupal_apply_filters(), apply_filters() or filters_apply()?

This is very technical name and refer to the implementation details. It performs its job by applying the necessary filters.

As mentioned in the original description, this is a very high-level functions.
Why do we need to apply all the filters anyway? To prepare the input and making it ready to display/output. But why? What is the ultimate goal from a user's perspective? Remove this function and the input will not be formatted the way you intended. It seems to me this is more about formatting than filtering. Filtering is only a mean to an end.

format_output()? format_input()?

There are other format_* function in common.inc which takes an input and sometimes some formatting option and return an altered input based on the formatting options. Surprisingly, they (almost) all take the $langcode as well.

Just my 2 cents.

Status: Needs review » Needs work

The last submitted patch failed testing.

catch’s picture

Version: 7.x-dev » 8.x-dev

D8.

catch’s picture

Priority: Critical » Major

No such thing as a critical task.

pillarsdotnet’s picture

How about "filter_apply" ?

quicksketch’s picture

Priority: Major » Normal

Even "major" tasks are now preventing development of new features per the new policy at http://drupal.org/node/1201874. This would be a nice change, but largely just to fulfill OCD naming conventions. Considering we don't even have a consensus here, this shouldn't hold up development of new features.

greggles’s picture

just to fulfill OCD naming conventions.

I disagree, it's about a consistent API being easier to understand when first encountered which makes it easier to apply _safely_.

That said, I'm fine with the new priority.

wim leers’s picture

Version: 8.0.x-dev » 9.x-dev
catch’s picture

Version: 9.x-dev » 8.1.x-dev
Status: Needs work » Postponed

I think we could add the newly named function in a minor release and leave check_markup() as a deprecated wrapper if we wanted to.

We've halved the calls to check_markup() in core since 7.x - 16 down to 9 https://api.drupal.org/api/drupal/core%21modules%21filter%21filter.modul...

So contrib interaction with this should be minimal now.

Version: 8.1.x-dev » 8.2.x-dev

Drupal 8.1.0-beta1 was released on March 2, 2016, which means new developments and disruptive changes should now be targeted against the 8.2.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.2.x-dev » 8.3.x-dev

Drupal 8.2.0-beta1 was released on August 3, 2016, which means new developments and disruptive changes should now be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.3.x-dev » 8.4.x-dev

Drupal 8.3.0-alpha1 will be released the week of January 30, 2017, which means new developments and disruptive changes should now be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.4.x-dev » 8.5.x-dev

Drupal 8.4.0-alpha1 will be released the week of July 31, 2017, which means new developments and disruptive changes should now be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.5.x-dev » 8.6.x-dev

Drupal 8.5.0-alpha1 will be released the week of January 17, 2018, which means new developments and disruptive changes should now be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.6.x-dev » 8.7.x-dev

Drupal 8.6.0-alpha1 will be released the week of July 16, 2018, which means new developments and disruptive changes should now be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.7.x-dev » 8.8.x-dev

Drupal 8.7.0-alpha1 will be released the week of March 11, 2019, which means new developments and disruptive changes should now be targeted against the 8.8.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.0-alpha1 will be released the week of October 14th, 2019, which means new developments and disruptive changes should now be targeted against the 8.9.x-dev branch. (Any changes to 8.9.x will also be committed to 9.0.x in preparation for Drupal 9’s release, but some changes like significant feature additions will be deferred to 9.1.x.). For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 8.9.x-dev » 9.1.x-dev

Drupal 8.9.0-beta1 was released on March 20, 2020. 8.9.x is the final, long-term support (LTS) minor release of Drupal 8, which means new developments and disruptive changes should now be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 9.1.x-dev » 9.2.x-dev

Drupal 9.1.0-alpha1 will be released the week of October 19, 2020, which means new developments and disruptive changes should now be targeted for the 9.2.x-dev branch. For more information see the Drupal 9 minor version schedule and the Allowed changes during the Drupal 9 release cycle.

Version: 9.2.x-dev » 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

quietone’s picture

Status: Postponed » Active

I think no longer postponed.

catch’s picture

Very unlikely we'd rename this to another procedural function now, but the problem is still there 13 years later.

Reading back through some of the terminology discussions, a big problem is that filter module is called filter module. Filter module now provides 'text formats' which are configured collections of 'filter plugins', so it should be called 'format' module or 'text_format' module. You could then have a service like format.formatter or similar with a ::formatMarkup() or ::applyFormat() method.

wim leers’s picture

a big problem is that filter module is called filter module

Amen!

And a related big problem is: #3231354: [PP-2] [META] Discuss: merge the Editor config entity into the FilterFormat config entity.

IMHO we should finally do that, and merge filter and editor module into text_format module. Neither https://www.drupal.org/project/format nor https://www.drupal.org/project/text_format are taken. Perhaps I should pre-emptively register both? 🤓

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.5.x-dev » 10.1.x-dev

Drupal 9.5.0-beta2 and Drupal 10.0.0-beta2 were released on September 29, 2022, which means new developments and disruptive changes should now be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 10.1.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch, which currently accepts only minor-version allowed changes. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 11.x-dev » main

Drupal core is now using the main branch as the primary development branch. New developments and disruptive changes should now be targeted to the main branch.

Read more in the announcement.

claudiu.cristea’s picture

We probably want to deprecate as a procedural function and move it somewhere. Where?

claudiu.cristea’s picture

claudiu.cristea’s picture

Title: Rename "check_markup()" » Deprecate check_markup()
Issue tags: -Needs title update
claudiu.cristea’s picture

Initially I was thinking that could go in the renderer service but now I see it depends on \Drupal\filter\Element\ProcessedText, which is part of filter module. Also function to be deprecated is also provided by filter.

Another option would be to not provide any replacement as the code is only 2 instructions. But I know it's a very popular function. According to https://search.tresbien.tech/search?q=check_markup it has ~450 usages

claudiu.cristea’s picture

Issue summary: View changes
Issue tags: -Needs issue summary update

I don't see any service in filter that could host this code. So probably, the suggestion from #88 is most likely to fit this case, even we ignore all remarks you the name of filter module.

claudiu.cristea’s picture

Issue tags: -Needs change record

Added CR

claudiu.cristea’s picture

Status: Active » Needs review

Ready for review.

I see the concerns from #88 and #89 but I think that should be subject of more debate and decisions. For now we only want to solve the deprecation issue. For the same reason (#88), intentionally I didn't prefix the class name with "Filter"

nicxvan’s picture

I don't really want to start the naming discussion all over, but I think we either need to create a follow up or find a name here first.

I think it should be applyFilters.

After scanning the issue again, I think that was commonly suggested, it was only rejected because it didn't have the filter_ prefix. Now that we are oop that doesn't matter so let's update the method to applyFilters then i think this is ready.

claudiu.cristea’s picture

If we want to include the verb in the method’s name, I think it’s the format not the filters. Because we’re applying a format. The format is passed as parameter to the method

berdir’s picture

Do we even need this function at all anymore?

It's just a wrapper to create a render array + force-rendering it early in isolation.

Rendering early is discouraged and it might actually result in cacheability bugs since this drops any cacheablity related to this.

Scanning through the MR, there is only a single call to it in core? For the filter tests, we could have a test trait if it's just about making it convenient to test different filter format configurations?

The comments that document ->processed are actually wrong. \Drupal\text\TextProcessed::getValue does not use check_markup(), that was changed in #2626924: Include processed text in normalizations: "text" field type's "processed" computed property should be non-internal and carry cacheability metadata exactly to fix the cacheability issues.

There are 400 usages in contrib, many off them possibly affected cache bugs too.

Another idea would be add a new method to \Drupal\filter\Element\ProcessedText that allows to create the render array without force-rendering it, ProcessedText::create() or so

claudiu.cristea’s picture

I agree, but contrib will cry

berdir’s picture

We could still add the create wrapper just for the render array, but I'm not sure we need to.

https://search.tresbien.tech/search?q=check_markup%20-r%3Adrupal&num=50&...

I don't think it's that bad, and compared to bigger things like oop hooks pretty harmless. Where it's used quite a bit is around mails from what I see. If (and that's still a pretty big if) we manage to provide a stable alternative mail API by then, then that will be part of that refactoring.

nicxvan’s picture

It's applying the filters of a format isn't it?

From the comment:

Runs all the enabled filters on a piece of text.
...
The text to be filtered.
...
(optional) An array of filter types to skip, or an empty array (default)
   *   to skip no filter types. All the format's filters will be applied, except
   *   for filters of the types that are marked to be skipped.

It does seem to be used quite a bit. I'm not sure removing it outright is correct, but I don't feel super strongly.

berdir’s picture

> It does seem to be used quite a bit. I'm not sure removing it outright is correct, but I don't feel super strongly.

Yeah, I'm not suggesting to remove it because it's unused but because using it's implementation is outdated and is a source of cache bugs, as shown by #2626924: Include processed text in normalizations: "text" field type's "processed" computed property should be non-internal and carry cacheability metadata.

Consider how incredibly old check_markup is, I think there are surprisingly few usages. The function was renamed in #27551: Change check_output to check_markup 21 years go and was added by Dries 25 years ago, before there were issues, it's older than the concept of modules.

nicxvan’s picture

There is no filter maintainer so I think it escalates to framework manager.

We definitely need sign-off if we are to just deprecate without replacement except a test trait.

nicxvan’s picture

I rebased, there were three conflicts, all easy, one services.yml, one use statements in filter.module and the last one was two tests with use statements and using the newly deprecated fallback format function.

catch’s picture

Retagging for release manager review since the question is mostly about the impact on contrib.

I am not really here so didn't look at the usages in depth. But first one I saw was assuming the results to #markup when instead it could use the render element directly. That seems like an obvious case for deprecation without replacement. We'd need instructions for doing it correctly.

What do we tell people using it to generate emails? To inline the logic?

This will be for removal in 13.0 so plenty of time to update.

berdir’s picture

Status: Needs review » Needs work

> What do we tell people using it to generate emails? To inline the logic?

As an intermediate step, yes. Core already does that, an example for that would be \Drupal\contact\Hook\ContactHooks::mail(). Trying to support render arrays in there is tricky because it's pluggable and has alter hooks. We could add something to \Drupal\Core\Mail\MailManager::doMail(), but at which point exactly? before alter hook, after, both?

With #3539651: Introduce email plugins, both text and html mails will likely go through twig and might remove the need to manually render this, although the direction there is far from final.

I added a comment on the single use in core, which I think we can directly convert to a render array, and we should also fix the token hook comments which are wrong in HEAD per #104, we could refer to Drupal\text\TextProcessed::getValue, but I'm not sure that implementation detail is even needed? It possibly was added when it was converted from an inline check_markup() to that, but these days, using the processed property is more common than check_markup, so we don't need to explain why we are doing that?

With those changes, there will be no non-test reference/usage left in core.

claudiu.cristea’s picture

Issue tags: +Needs change record
claudiu.cristea’s picture

I think I need some help in improving the deprecation message and building a good CR

nicxvan’s picture

First we need guidance if we're removing or replacing this. I think we should just wait for a bit.

berdir’s picture

The changes in asked for in #112 make sense whether or not we deprecate or remove as the token references were wrong and the usage in the views plugin unnecessary/could lead to bugs. Not sure if a test-only helper might be useful are there are quite a few repeating instances.

The views tests now also weirdly mix renderInIsolation() and renderRoot().

Also a bit concerning is the langcode stuff. A default value of an empty string for the langcode makes no sense and only works if code relying on it isn't using strict types. It's rarely used, but I saw one filter plugin passing it to \Drupal\Core\Entity\EntityRepository::getTranslationFromContext(), which only works as that does an empty() check and not a === NULL. Probably out of scope, but requires to explicitly pass the langcode to keep the current behavior such as in the views case.

claudiu.cristea’s picture

Not sure if a test-only helper might be useful are there are quite a few repeating instances.

Added a helper to see how it looks.

The views tests now also weirdly mix renderInIsolation() and renderRoot().

I think the test doesn't lose its scope if we replace renderRoot() with renderInIsolation()

claudiu.cristea’s picture

Status: Needs work » Needs review

We still need the release manager review about the direction

catch’s picture

Thanks for #112, removing without replacement makes sense to me.

smustgrave’s picture

Status: Needs review » Needs work

Thanks @catch for taking a look, think the last thing to do is update the CR string in the MR with the actual CR.

If you are another contributor eager to jump in, please allow the previous poster(s) at least 48 hours to respond to feedback first, so they have the opportunity to finish what they started!

claudiu.cristea’s picture

Assigned: Unassigned » claudiu.cristea

Assigning to continue the work

claudiu.cristea’s picture

Assigned: claudiu.cristea » Unassigned
Status: Needs work » Needs review
Issue tags: -Needs change record

Added a CR but it would help a look from @berdir and/or @catch to see if we can do it better.