Skip to content

Improve handling of disable/enable/ignore directives #123

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: 4.x
Choose a base branch
from

Conversation

anomiex
Copy link
Contributor

@anomiex anomiex commented Dec 4, 2023

Description

The current method, listing codes to disable and a list of exceptions to that list, still has trouble with some cases. For example, disabling a standard, re-enabling a category within that standard, then ignoring or disabling a sniff within that category cannot be handled. We'd need a list of exceptions to the exceptions, and possibly a list of exceptions to that list too, and figuring out how to keep those lists up to date as new directives are encountered could prove to be confusing.

Since the standard→category→sniff→code hierarchy is supposed to be thought of as a tree, let's store the ignore list that way instead. Manipulating the branches of the tree is straightforward no matter what directives are encountered.

In this implementation I've favored speed over space: there are cases where we could prune a subtree that would evaluate to "ignore" or "don't ignore" for any possible input, but detecting that doesn't seem worth the time when it's not likely there will be so many enable or disable directives that the wasted space will be a problem.

Suggested changelog entry

Fixed bug #111: Improve handling of disable/enable/ignore directives

Related issues/external references

Follow up on squizlabs/PHP_CodeSniffer#1986

Fixes #111
Fixes squizlabs/PHP_CodeSniffer#3889
Closes squizlabs/PHP_CodeSniffer#3891

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
    • This change is only breaking for integrators, not for external standards or end-users.
  • Documentation improvement

Technically this could be considered a breaking change, since PHP_CodeSniffer\Tokenizers\Tokenizer->ignoredLines happens to be public and this changes the values stored there. Is that really a public API though?

PR checklist

  • I have checked there is no other PR open for the same change.
  • I have read the Contribution Guidelines.
  • I grant the project the right to include and distribute the code under the BSD-3-Clause license (and I have the right to grant these rights).
  • I have added tests to cover my changes.
  • I have verified that the code complies with the projects coding standards.
  • [Required for new sniffs] I have added XML documentation for the sniff.

@jrfnl
Copy link
Member

jrfnl commented Dec 4, 2023

Thanks for porting this over @anomiex ! I will look at this at a later point in time and am tentatively earmarking this PR for the 4.0.0 release as it constitutes a bigger change than I deem responsible for the 3.x branch.

@anomiex
Copy link
Contributor Author

anomiex commented Dec 4, 2023

Since it sounds like 4.0.0 is probably coming fairly soon after 3.8.0 now, that seems fine with me. 👍

Also, since 4.0.0 is dropping old PHP support, I'm going to ignore the failing PHP 5.4 test. 🙂 Too bad it cancelled the rest of the tests after that one failed.

@anomiex anomiex force-pushed the fix/ignore-list-handling branch from 5ea7447 to 7338e5a Compare December 10, 2023 22:40
jrfnl added a commit that referenced this pull request Mar 20, 2024
Links in the link lists have to be unique and as all issue/PR sources (this repo, Squizlabs repo, PEAR repo) use numeric issue references, there will be duplicate/overlapping numbers at some point.

To prevent this, I'm proposing the following change:
* Links to _this repo_ will continue to use simple issue number links, i.e. `[#123]`.
    This will keep things most straight forward for new content being added to the changelog.
* Links to the _Squizlabs repo_ will get a `sq` prefix in the link reference. This means the link reference will need to be added where the link is used, i.e `[#123][sq-123]`.
* Links to the _PEAR repo_ will get a `pear` prefix in the link reference. Again, this means the link reference will need to be added where the link is used, i.e `[#123][pear-123]`.

This commit makes it so for all existing issue/PR links in the changelog.

Related to 244
jrfnl added a commit that referenced this pull request Mar 20, 2024
Links in the link lists have to be unique and as all issue/PR sources (this repo, Squizlabs repo, PEAR repo) use numeric issue references, there will be duplicate/overlapping numbers at some point.

To prevent this, I'm proposing the following change:
* Links to _this repo_ will continue to use simple issue number links, i.e. `[#123]`.
    This will keep things most straight forward for new content being added to the changelog.
* Links to the _Squizlabs repo_ will get a `sq` prefix in the link reference. This means the link reference will need to be added where the link is used, i.e `[#123][sq-123]`.
* Links to the _PEAR repo_ will get a `pear` prefix in the link reference. Again, this means the link reference will need to be added where the link is used, i.e `[#123][pear-123]`.

This commit makes it so for all existing issue/PR links in the changelog.

Related to 244
@jrfnl
Copy link
Member

jrfnl commented Apr 17, 2025

@anomiex It's been a while, but 4.0 has finally been recreated in the new 4.x branch.

With that (mostly) done, I'd like to have a good look at this PR next week.
Considering it's been a while and I don't know whether you have any availability in the near future: would you like to rebase the PR yourself and for me to review it ? Or would you prefer I took over the PR, rebased it and fixed up anything I deem necessary ?

The current method, listing codes to disable and a list of exceptions to
that list, still has trouble with some cases. For example, disabling a
standard, re-enabling a category within that standard, then ignoring or
disabling a sniff within that category cannot be handled. We'd need a
list of exceptions to the exceptions, and possibly a list of exceptions
to that list too, and figuring out how to keep those lists up to date as
new directives are encountered could prove to be confusing.

Since the standard→category→sniff→code hierarchy is supposed to be
thought of as a tree, let's store the ignore list that way instead.
Manipulating the branches of the tree is straightforward no matter what
directives are encountered.

In this implementation I've favored speed over space: there are cases
where we could prune a subtree that would evaluate to "ignore" or "don't
ignore" for any possible input, but detecting that doesn't seem worth
the time when it's not likely there will be so many enable or disable
directives that the wasted space will be a problem.

Fixes PHPCSStandards#111
@anomiex
Copy link
Contributor Author

anomiex commented Apr 18, 2025

Did you want me to rebase this on master or on the 4.x branch? 🙂

@jrfnl
Copy link
Member

jrfnl commented Apr 18, 2025

Did you want me to rebase this on master or on the 4.x branch? 🙂

@anomiex Against the 4.x branch please as this PR was earmarked for the 4.0 release.

@anomiex
Copy link
Contributor Author

anomiex commented Apr 18, 2025

Ok. Let's see if I can do it without getting GitHub all confused. 😅

@anomiex anomiex changed the base branch from master to 4.x April 18, 2025 19:32
@anomiex anomiex force-pushed the fix/ignore-list-handling branch from 7338e5a to 9a0ee0c Compare April 18, 2025 19:32
@anomiex
Copy link
Contributor Author

anomiex commented Apr 18, 2025

Whew, GitHub didn't get confused into thinking there were 1000 commits added. 😀

@jrfnl
Copy link
Member

jrfnl commented Apr 18, 2025

Whew, GitHub didn't get confused into thinking there were 1000 commits added. 😀

Nice ;-) As you've probably figured out - I found that if you change the base branch ahead of pushing the rebased branch, it seems to work smoothly (and CI works properly then too).

Copy link
Member

@jrfnl jrfnl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anomiex Thank you for this PR and for your patience.

I've had a good look now and I've left a number of remarks inline, some nitpicks, some about conventions etc.
Aside from that, I have a couple of questions regarding the implementation, for which I'd like to understand your reasoning for the choices made.

  1. The class has some static constructors, but the code in the Tokenizer class still uses clone $ignoring. How about a IgnoreList::get[New]InstanceFrom($ignoring) method which does clone $this ? What is your opinion on this ?
  2. The IgnoreList has a "ValueObject" feel to it, but isn't a value object currently. Any particular reason for that ?
  3. Should there be any validation in the set() method that the sniff code being ignored is actually a potentially valid error code ? I.e. max 4 parts.
  4. In the set() method, some sub-arrays get a .default key, some don't. This feels error prone. Is there a reason to set the sub-arrays like this ?
  5. Am I correct in saying that the isAll() method only works by the grace of the top-level always having the .default key ?
  6. I'm a bit concerned about the performance implications of this PR what with isAll() being called for every single message being added (providing anything is being ignored). Did you run any performance checks on this PR ?

Nitpick: what is the logic behind the method order in the IgnoreList class ? If I look at the methods, it feels like set() should come before check() ?

Other than that:

The tests nowadays follow the directory structure of the source code more closely.
So with IgnoreList class being in the Util subdirectory, the IgnoreListTest class(es) should be in a tests/Core/Util/IgnoreList subdirectory.

And while not (yet) required, as a rule of thumb, each test class should only contain tests for one method, or one "set" of methods, to reduce code churn for when PHPUnit 12 will start to be supported.
In practice, this means that preferably @covers tags should be at class level and all tests in the class should "cover" the same code (which will be required in PHPUnit 12).
In this case, that would mean splitting the test class into three/four classes, though I'm okay with leaving this for now and doing that when the PHPUnit 12 compat PR will be done.

For the record:

  • ✅ Confirmed that the new tests in the ErrorSuppressionTest class would fail without this fix and pass with.
  • ✅ Checked code coverage for the changed and new code.

@anomiex
Copy link
Contributor Author

anomiex commented May 6, 2025

  1. The class has some static constructors, but the code in the Tokenizer class still uses clone $ignoring. How about a IgnoreList::get[New]InstanceFrom($ignoring) method which does clone $this ? What is your opinion on this ?

Doesn't matter to me. Done.

  1. The IgnoreList has a "ValueObject" feel to it, but isn't a value object currently. Any particular reason for that ?

I've never been all that big a fan of strictly doing value objects. If you really want I could convert it (it would just mean having ->set() clone first), with the drawback that something like @phpcs:disable Foo, Bar, Baz would make some extra clones that immediately get thrown away.

  1. Should there be any validation in the set() method that the sniff code being ignored is actually a potentially valid error code ? I.e. max 4 parts.

Other than the @var array<string, bool|array<string, bool|array<string, bool|array<string, bool>>>> you had me add, everything works fine if there are more parts. I'm inclined to leave it, unless you really want that check added.

  1. In the set() method, some sub-arrays get a .default key, some don't. This feels error prone. Is there a reason to set the sub-arrays like this ?

The .default is only being set when necessary. For example,

@phpcs:disable Standard
@phpcs:enable Standard.Category.Sniff
@phpcs:disable Standard.Category.Sniff.Message

there's no need to flag Standard.Category as being disabled when it can easily inherit that from Standard.

While we could add it in, there would be zero benefit while it would make set() more complex.

  1. Am I correct in saying that the isAll() method only works by the grace of the top-level always having the .default key ?

Yes, as written it needs the top-level .default being set to false by default. isIgnored() also currently requires the top-level .default is set.

We could work around that in both cases if we wanted, by changing the $returnValue = $data['.default']; to $returnValue = $data['.default'] ?? false; in the latter and adding something like if ( ! isset( $data['.default'] ) ) { return false; } to the top of the former. Is saving one array key per instance for some small added code complexity worth it? I don't know.

  1. I'm a bit concerned about the performance implications of this PR what with isAll() being called for every single message being added (providing anything is being ignored). Did you run any performance checks on this PR ?

Most of the time isAll() shouldn't have to do much work, as either ->data['.default'] will be false and the first thing checked so it'll exit on the first iteration of the loops, or ->data will be [ '.default' => true ] so the loops will have only one iteration.

The question made me realize another test case that I could add, though:

// @phpcs:disable
// @phpcs:enable Foo.Bar

Current code doesn't honor the enable on the second line, while this PR does.

As for performance, I ran 4.x and this PR over Automattic/jetpack with the config deleted so it produced tons of messages (almost 1.8 million 😀). Took 4:48.61 for 4.x and 4:47.36 for this PR, which seems close enough to say there's no performance impact either way. Then a second run with 4.x took 5:03:26. 🤷

Nitpick: what is the logic behind the method order in the IgnoreList class ? If I look at the methods, it feels like set() should come before check() ?

First the static methods, then the check and set, then the methods to check for ignoring nothing or everything. No particular reason for check() coming before set().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

phpcs:enable can sometimes override phpcs:ignore
2 participants