Narrow explode() when the delimiter is a known substring #5959

Open
paulbalandan wants to merge 3 commits into phpstan:2.2.x from paulbalandan:explode-known-substring-two-elements

@paulbalandan

Contributor

Summary

PHPStan did not narrow explode($delimiter, $string, 2) to a two-element list even when a
str_contains($string, $delimiter) guard proved the delimiter is present, so destructuring
[$first, $rest] = explode(...) reported Offset 1 might not exist on non-empty-list<string>
(under reportPossiblyNonexistentGeneralArrayOffset). The report framed this as a while-only
problem, but it reproduced identically inside a plain if. This teaches the explode
return-type extension to use a known-present delimiter.
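
A minimal sketch of the pattern described above, assuming PHP 8.0 or later. The function names `splitOnce` and `segments` and the sample inputs are hypothetical and do not come from the PR or the linked issue. In both the `if` and the `while` form, the guard makes the two-element destructuring safe at runtime:

```php
<?php

/**
 * Hypothetical helper: splits $string at the first occurrence of $delimiter.
 *
 * @param non-empty-string $delimiter
 * @return array{string, string}|null
 */
function splitOnce(string $string, string $delimiter): ?array
{
    if (str_contains($string, $delimiter)) {
        // The guard proves the delimiter occurs, so a limit of 2 always
        // yields exactly two elements at runtime.
        [$first, $rest] = explode($delimiter, $string, 2);

        return [$first, $rest];
    }

    return null;
}

/**
 * Hypothetical helper using the `while` form: peels off one segment per iteration.
 *
 * @param non-empty-string $delimiter
 * @return list<string>
 */
function segments(string $string, string $delimiter): array
{
    $segments = [];
    while (str_contains($string, $delimiter)) {
        // Same guard, same destructuring, but inside a loop condition.
        [$first, $string] = explode($delimiter, $string, 2);
        $segments[] = $first;
    }
    $segments[] = $string;

    return $segments;
}
```

Per the summary, it is the destructuring lines in code of this shape that were reported under reportPossiblyNonexistentGeneralArrayOffset.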

Changes

  • src/Type/Php/ExplodeFunctionDynamicReturnTypeExtension.php — when the scope proves the
    delimiter occurs in the string, return array{string, string} for a limit of exactly 2,
    array{0: string, 1: string, 2?: string, ...} for a larger constant limit, and
    array{string, string, ...<string>} otherwise. Presence is probed by reconstructing
    str_contains()/str_starts_with()/str_ends_with() on the same haystack and delimiter
    (with a fully-qualified name, so the node key matches the resolved call) and checking whether
    the scope knows the result to be true.
  • src/Type/Php/StrContainingTypeSpecifyingExtension.php — for str_contains(),
    str_starts_with() and str_ends_with(), also remember the call's value as true in the
    truthy branch, so that a bare if/while guard carries the same information that an explicit
    ... === true comparison already did.
  • tests/PHPStan/Analyser/nsrt/bug-14651.php — type-inference regression covering the
    reporter's while reproducer, the if form, str_starts_with/str_ends_with guards, a
    non-empty-string variable needle, and the untouched cases (no guard, different delimiter,
    limit === 1, negative limit).
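
The three return shapes listed for the explode extension can be checked against explode()'s runtime behaviour. The following is a plain-PHP sketch, not the extension's code: `explodeShapes` is a hypothetical demo helper, and the comments only repeat the types named in the list above.

```php
<?php

/**
 * Hypothetical demo helper: returns explode() results for several limits
 * when the delimiter is known to be present.
 *
 * @param non-empty-string $delimiter
 * @return array<string, list<string>>
 */
function explodeShapes(string $string, string $delimiter): array
{
    if (str_contains($string, $delimiter)) {
        return [
            // Limit of exactly 2: always two elements (array{string, string}).
            'limit 2' => explode($delimiter, $string, 2),
            // Larger constant limit: two required elements, the rest optional.
            'limit 3' => explode($delimiter, $string, 3),
            // No limit: at least two elements, possibly more.
            'no limit' => explode($delimiter, $string),
        ];
    }

    return [];
}
```

For example, with `'a:b:c:d'` and `':'` the limit-3 result has three elements, while with `'a:b'` it has only two, which is why the third offset has to stay optional.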

Root cause

Two gaps combined:

  1. ExplodeFunctionDynamicReturnTypeExtension always returned non-empty-list<string>; it
    never consulted whether the delimiter was known to be present, so it could not prove a
    second element exists.
  2. A plain if (str_contains(...)) did not remember the call's truthiness. When a
    FunctionTypeSpecifyingExtension handles a call, FuncCallHandler::specifyTypes() returns
    the extension's SpecifiedTypes directly and never unions
    handleDefaultTruthyOrFalseyContext(), so inside the branch str_contains($x, $y) stayed
    bool. (Contrast is_numeric(), which has no extension and is remembered as true, and
    str_contains($x, $y) === true, where the Identical handler pins the call to true.)
    With the call unremembered, the guard scope held only the haystack narrowing
    (non-falsy-string), which carries no "contains delimiter" information for explode to use.

The first gap is fixed in the explode extension; the second by having the string-containment
extension remember the value of the three literal-substring-proving functions. explode then
reconstructs the guard call and asks the scope for its type — the reconstructed name must be
fully-qualified because parsed calls are stored under their resolved (\str_contains(...)) key.
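
Why the reconstructed name must be fully qualified can be illustrated with a toy model. This is only a sketch under loud assumptions: `ToyScope` and its string keys are hypothetical and are not PHPStan's Scope class or its real key format. It shows only that a lookup under the unqualified name misses an entry stored under the resolved name.

```php
<?php

/**
 * Toy stand-in for a scope that remembers expression types by string key.
 * Hypothetical: this is not PHPStan's Scope implementation.
 */
final class ToyScope
{
    /** @var array<string, string> */
    private array $expressionTypes = [];

    public function remember(string $exprKey, string $type): void
    {
        $this->expressionTypes[$exprKey] = $type;
    }

    public function typeOf(string $exprKey): string
    {
        // Expressions that were never remembered fall back to plain bool.
        return $this->expressionTypes[$exprKey] ?? 'bool';
    }
}

$scope = new ToyScope();

// The guard call is stored under its resolved, fully-qualified name.
$scope->remember('\str_contains($string, $delimiter)', 'true');

// An unqualified reconstruction misses the entry; the qualified one finds it.
$unqualified = $scope->typeOf('str_contains($string, $delimiter)');
$qualified = $scope->typeOf('\str_contains($string, $delimiter)');
```

In this toy model `$unqualified` is `'bool'` and `$qualified` is `'true'`, mirroring the mismatch the description warns about.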

Test

  • tests/PHPStan/Analyser/nsrt/bug-14651.php asserts the inferred explode() type across
    guarded and unguarded forms; it fails before the fix (guarded cases inferred
    non-empty-list<string>) and passes after.
  • Other delimiters, a possibly-empty needle, limit === 1 and negative limits keep their
    previous types, so no new false positives are introduced.
  • Probed the remembering change for fallout: make phpstan (self-analysis) is clean, and
    NodeScopeResolverTest, AnalyserIntegrationTest, and the Comparison, DeadCode,
    Functions and Variables rule suites pass — no spurious always-true reports.
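
The cases that keep their previous types match what explode() does at runtime. The sketch below uses illustrative values only (the variable names and sample string are not taken from the test file) and shows why none of them can guarantee a second element:

```php
<?php

$string = 'a:b';

// A limit of 1 returns the whole string as a single element.
$limitOne = explode(':', $string, 1);

// A negative limit drops elements from the end, so the result can have
// fewer than two elements even though the delimiter is present.
$negativeOne = explode(':', $string, -1);
$negativeTwo = explode(':', $string, -2);

// A delimiter other than the guarded one may not occur at all.
$otherDelimiter = explode(',', $string, 2);
```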

Fixes phpstan/phpstan#14651

Comment thread tests/PHPStan/Analyser/nsrt/bug-14651.php
@paulbalandan paulbalandan force-pushed the explode-known-substring-two-elements branch from bb53127 to 2a372eb Compare July 1, 2026 05:40
Comment thread src/Type/Php/ExplodeFunctionDynamicReturnTypeExtension.php Outdated
Comment thread src/Type/Php/ExplodeFunctionDynamicReturnTypeExtension.php Outdated
@paulbalandan paulbalandan force-pushed the explode-known-substring-two-elements branch from 2a372eb to 3ead043 Compare July 1, 2026 05:47
@paulbalandan paulbalandan force-pushed the explode-known-substring-two-elements branch from 3ead043 to 774bd6b Compare July 1, 2026 06:05
Comment thread src/Type/Php/ExplodeFunctionDynamicReturnTypeExtension.php Outdated
Comment thread src/Type/Php/ExplodeFunctionDynamicReturnTypeExtension.php
@staabm

staabm commented Jul 1, 2026

Contributor

Feel free to add more commits to the PR instead of rewriting the same commit over and over and force-pushing.

It's easier to review as separate commits.
The maintainer will squash them on merge.

@paulbalandan paulbalandan changed the title Narrow explode() when the delimiter is a known substring Narrow explode() when the delimiter is a known substring Jul 1, 2026
Comment thread tests/PHPStan/Analyser/nsrt/bug-14651.php
@staabm staabm requested a review from VincentLanglet July 1, 2026 07:12
return null;
}

$finiteTypes = $limitType->getFiniteTypes();


Why are we calling getFiniteTypes and then getConstantScalarValues? Can't we call getConstantScalarValues directly?

Development

Successfully merging this pull request may close these issues.

str_contains($s, $needle) does not narrow explode($needle, $s, 2) to a 2-element list inside a while loop

3 participants