Improve handling of disable/enable/ignore directives #123

anomiex · 2023-12-04T16:21:08Z

Description

The current method, listing codes to disable and a list of exceptions to that list, still has trouble with some cases. For example, disabling a standard, re-enabling a category within that standard, then ignoring or disabling a sniff within that category cannot be handled. We'd need a list of exceptions to the exceptions, and possibly a list of exceptions to that list too, and figuring out how to keep those lists up to date as new directives are encountered could prove to be confusing.

Since the standard→category→sniff→code hierarchy is supposed to be thought of as a tree, let's store the ignore list that way instead. Manipulating the branches of the tree is straightforward no matter what directives are encountered.

In this implementation I've favored speed over space: there are cases where we could prune a subtree that would evaluate to "ignore" or "don't ignore" for any possible input, but detecting that doesn't seem worth the time when it's not likely there will be so many enable or disable directives that the wasted space will be a problem.

Suggested changelog entry

Fixed bug #111: Improve handling of disable/enable/ignore directives

Related issues/external references

Follow up on squizlabs/PHP_CodeSniffer#1986

Fixes #111
Fixes squizlabs/PHP_CodeSniffer#3889
Closes squizlabs/PHP_CodeSniffer#3891

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
- This change is only breaking for integrators, not for external standards or end-users.
Documentation improvement

Technically this could be considered a breaking change, since PHP_CodeSniffer\Tokenizers\Tokenizer->ignoredLines happens to be public and this changes the values stored there. Is that really a public API though?

PR checklist

I have checked there is no other PR open for the same change.
I have read the Contribution Guidelines.
I grant the project the right to include and distribute the code under the BSD-3-Clause license (and I have the right to grant these rights).
I have added tests to cover my changes.
I have verified that the code complies with the projects coding standards.
[Required for new sniffs] I have added XML documentation for the sniff.

jrfnl · 2023-12-04T16:40:50Z

Thanks for porting this over @anomiex ! I will look at this at a later point in time and am tentatively earmarking this PR for the 4.0.0 release as it constitutes a bigger change than I deem responsible for the 3.x branch.

anomiex · 2023-12-04T17:52:15Z

Since it sounds like 4.0.0 is probably coming fairly soon after 3.8.0 now, that seems fine with me. 👍

Also, since 4.0.0 is dropping old PHP support, I'm going to ignore the failing PHP 5.4 test. 🙂 Too bad it cancelled the rest of the tests after that one failed.

Links in the link lists have to be unique and as all issue/PR sources (this repo, Squizlabs repo, PEAR repo) use numeric issue references, there will be duplicate/overlapping numbers at some point. To prevent this, I'm proposing the following change: * Links to _this repo_ will continue to use simple issue number links, i.e. `[#123]`. This will keep things most straight forward for new content being added to the changelog. * Links to the _Squizlabs repo_ will get a `sq` prefix in the link reference. This means the link reference will need to be added where the link is used, i.e `[#123][sq-123]`. * Links to the _PEAR repo_ will get a `pear` prefix in the link reference. Again, this means the link reference will need to be added where the link is used, i.e `[#123][pear-123]`. This commit makes it so for all existing issue/PR links in the changelog. Related to 244

jrfnl · 2025-04-17T23:15:56Z

@anomiex It's been a while, but 4.0 has finally been recreated in the new 4.x branch.

With that (mostly) done, I'd like to have a good look at this PR next week.
Considering it's been a while and I don't know whether you have any availability in the near future: would you like to rebase the PR yourself and for me to review it ? Or would you prefer I took over the PR, rebased it and fixed up anything I deem necessary ?

The current method, listing codes to disable and a list of exceptions to that list, still has trouble with some cases. For example, disabling a standard, re-enabling a category within that standard, then ignoring or disabling a sniff within that category cannot be handled. We'd need a list of exceptions to the exceptions, and possibly a list of exceptions to that list too, and figuring out how to keep those lists up to date as new directives are encountered could prove to be confusing. Since the standard→category→sniff→code hierarchy is supposed to be thought of as a tree, let's store the ignore list that way instead. Manipulating the branches of the tree is straightforward no matter what directives are encountered. In this implementation I've favored speed over space: there are cases where we could prune a subtree that would evaluate to "ignore" or "don't ignore" for any possible input, but detecting that doesn't seem worth the time when it's not likely there will be so many enable or disable directives that the wasted space will be a problem. Fixes PHPCSStandards#111

anomiex · 2025-04-18T13:50:32Z

Did you want me to rebase this on master or on the 4.x branch? 🙂

jrfnl · 2025-04-18T17:22:45Z

Did you want me to rebase this on master or on the 4.x branch? 🙂

@anomiex Against the 4.x branch please as this PR was earmarked for the 4.0 release.

anomiex · 2025-04-18T19:32:15Z

Ok. Let's see if I can do it without getting GitHub all confused. 😅

anomiex · 2025-04-18T19:34:13Z

Whew, GitHub didn't get confused into thinking there were 1000 commits added. 😀

jrfnl · 2025-04-18T20:44:39Z

Whew, GitHub didn't get confused into thinking there were 1000 commits added. 😀

Nice ;-) As you've probably figured out - I found that if you change the base branch ahead of pushing the rebased branch, it seems to work smoothly (and CI works properly then too).

jrfnl

@anomiex Thank you for this PR and for your patience.

I've had a good look now and I've left a number of remarks inline, some nitpicks, some about conventions etc.
Aside from that, I have a couple of questions regarding the implementation, for which I'd like to understand your reasoning for the choices made.

The class has some static constructors, but the code in the Tokenizer class still uses clone $ignoring. How about a IgnoreList::get[New]InstanceFrom($ignoring) method which does clone $this ? What is your opinion on this ?
The IgnoreList has a "ValueObject" feel to it, but isn't a value object currently. Any particular reason for that ?
Should there be any validation in the set() method that the sniff code being ignored is actually a potentially valid error code ? I.e. max 4 parts.
In the set() method, some sub-arrays get a .default key, some don't. This feels error prone. Is there a reason to set the sub-arrays like this ?
Am I correct in saying that the isAll() method only works by the grace of the top-level always having the .default key ?
I'm a bit concerned about the performance implications of this PR what with isAll() being called for every single message being added (providing anything is being ignored). Did you run any performance checks on this PR ?

Nitpick: what is the logic behind the method order in the IgnoreList class ? If I look at the methods, it feels like set() should come before check() ?

Other than that:

The tests nowadays follow the directory structure of the source code more closely.
So with IgnoreList class being in the Util subdirectory, the IgnoreListTest class(es) should be in a tests/Core/Util/IgnoreList subdirectory.

And while not (yet) required, as a rule of thumb, each test class should only contain tests for one method, or one "set" of methods, to reduce code churn for when PHPUnit 12 will start to be supported.
In practice, this means that preferably @covers tags should be at class level and all tests in the class should "cover" the same code (which will be required in PHPUnit 12).
In this case, that would mean splitting the test class into three/four classes, though I'm okay with leaving this for now and doing that when the PHPUnit 12 compat PR will be done.

For the record:

✅ Confirmed that the new tests in the ErrorSuppressionTest class would fail without this fix and pass with.
✅ Checked code coverage for the changed and new code.

src/Util/IgnoreList.php

tests/Core/IgnoreListTest.php

src/Util/IgnoreList.php

tests/Core/IgnoreListTest.php

anomiex · 2025-05-06T17:40:59Z

The class has some static constructors, but the code in the Tokenizer class still uses clone $ignoring. How about a IgnoreList::get[New]InstanceFrom($ignoring) method which does clone $this ? What is your opinion on this ?

Doesn't matter to me. Done.

The IgnoreList has a "ValueObject" feel to it, but isn't a value object currently. Any particular reason for that ?

I've never been all that big a fan of strictly doing value objects. If you really want I could convert it (it would just mean having ->set() clone first), with the drawback that something like @phpcs:disable Foo, Bar, Baz would make some extra clones that immediately get thrown away.

Should there be any validation in the set() method that the sniff code being ignored is actually a potentially valid error code ? I.e. max 4 parts.

Other than the @var array<string, bool|array<string, bool|array<string, bool|array<string, bool>>>> you had me add, everything works fine if there are more parts. I'm inclined to leave it, unless you really want that check added.

In the set() method, some sub-arrays get a .default key, some don't. This feels error prone. Is there a reason to set the sub-arrays like this ?

The .default is only being set when necessary. For example,

@phpcs:disable Standard
@phpcs:enable Standard.Category.Sniff
@phpcs:disable Standard.Category.Sniff.Message

there's no need to flag Standard.Category as being disabled when it can easily inherit that from Standard.

While we could add it in, there would be zero benefit while it would make set() more complex.

Am I correct in saying that the isAll() method only works by the grace of the top-level always having the .default key ?

Yes, as written it needs the top-level .default being set to false by default. isIgnored() also currently requires the top-level .default is set.

We could work around that in both cases if we wanted, by changing the $returnValue = $data['.default']; to $returnValue = $data['.default'] ?? false; in the latter and adding something like if ( ! isset( $data['.default'] ) ) { return false; } to the top of the former. Is saving one array key per instance for some small added code complexity worth it? I don't know.

I'm a bit concerned about the performance implications of this PR what with isAll() being called for every single message being added (providing anything is being ignored). Did you run any performance checks on this PR ?

Most of the time isAll() shouldn't have to do much work, as either ->data['.default'] will be false and the first thing checked so it'll exit on the first iteration of the loops, or ->data will be [ '.default' => true ] so the loops will have only one iteration.

The question made me realize another test case that I could add, though:

// @phpcs:disable
// @phpcs:enable Foo.Bar

Current code doesn't honor the enable on the second line, while this PR does.

As for performance, I ran 4.x and this PR over Automattic/jetpack with the config deleted so it produced tons of messages (almost 1.8 million 😀). Took 4:48.61 for 4.x and 4:47.36 for this PR, which seems close enough to say there's no performance impact either way. Then a second run with 4.x took 5:03:26. 🤷

Nitpick: what is the logic behind the method order in the IgnoreList class ? If I look at the methods, it feels like set() should come before check() ?

First the static methods, then the check and set, then the methods to check for ignoring nothing or everything. No particular reason for check() coming before set().

github-actions bot added Type: bug Core Component: File Core Component: Tokenizer Core Component: Utils labels Dec 4, 2023

jrfnl added the Status: triage label Dec 4, 2023

jrfnl added this to the 4.0.0 milestone Dec 4, 2023

jrfnl added the Type: breaking change label Dec 10, 2023

anomiex force-pushed the fix/ignore-list-handling branch from 5ea7447 to 7338e5a Compare December 10, 2023 22:40

github-actions bot added the Status: has merge conflict label Dec 31, 2023

jrfnl force-pushed the master branch from e06559a to aab0862 Compare December 7, 2024 10:54

anomiex changed the base branch from master to 4.x April 18, 2025 19:32

anomiex force-pushed the fix/ignore-list-handling branch from 7338e5a to 9a0ee0c Compare April 18, 2025 19:32

github-actions bot removed the Status: has merge conflict label Apr 18, 2025

jrfnl force-pushed the 4.x branch from d853095 to 4a0c674 Compare May 4, 2025 17:00

jrfnl reviewed May 6, 2025

View reviewed changes

jrfnl removed the Status: triage label May 6, 2025

Apply suggestions from code review

5b3d2c8

Use FQCNs in doc comments

0ff1f11

Missed one arrow alignment

3e4bcb0

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve handling of disable/enable/ignore directives #123

Improve handling of disable/enable/ignore directives #123

anomiex commented Dec 4, 2023 •

edited by jrfnl

Loading

jrfnl commented Dec 4, 2023 •

edited

Loading

anomiex commented Dec 4, 2023

jrfnl commented Apr 17, 2025 •

edited

Loading

anomiex commented Apr 18, 2025

jrfnl commented Apr 18, 2025

anomiex commented Apr 18, 2025

anomiex commented Apr 18, 2025

jrfnl commented Apr 18, 2025

jrfnl left a comment •

edited

Loading

anomiex commented May 6, 2025 •

edited by jrfnl

Loading

Improve handling of disable/enable/ignore directives #123

Are you sure you want to change the base?

Improve handling of disable/enable/ignore directives #123

Conversation

anomiex commented Dec 4, 2023 • edited by jrfnl Loading

Description

Suggested changelog entry

Related issues/external references

Types of changes

PR checklist

jrfnl commented Dec 4, 2023 • edited Loading

anomiex commented Dec 4, 2023

jrfnl commented Apr 17, 2025 • edited Loading

anomiex commented Apr 18, 2025

jrfnl commented Apr 18, 2025

anomiex commented Apr 18, 2025

anomiex commented Apr 18, 2025

jrfnl commented Apr 18, 2025

jrfnl left a comment • edited Loading

Choose a reason for hiding this comment

anomiex commented May 6, 2025 • edited by jrfnl Loading

anomiex commented Dec 4, 2023 •

edited by jrfnl

Loading

jrfnl commented Dec 4, 2023 •

edited

Loading

jrfnl commented Apr 17, 2025 •

edited

Loading

jrfnl left a comment •

edited

Loading

anomiex commented May 6, 2025 •

edited by jrfnl

Loading