Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update swift/main #665

Open
wants to merge 8 commits into
base: swift/main
Choose a base branch
from
Open

Conversation

milseman
Copy link
Member

No description provided.

natecook1000 and others added 8 commits April 19, 2023 14:06
The fast path for quantification incorrectly discards the last save
position when the quantification used up all possible trips, which is
only possible with range-based quantifications (e.g. `{0,3}`). This
bug shows up when a range-based quantifier matches the maximum - 1
repetitions of the preceding pattern.

For example, the regex `/a{0,2}a/` should succeed as a full match any
of the strings "aa", "aaa", or "aaaa". However, the pattern fails
to match "aaa", since the save point allowing a single "a" to match
the first `a{0,2}` part of the regex is discarded.

This change only discards the last save position when advancing the
quantifier fails due to a failure to match, not maxing out the number
of trips.
These changes remove several seconds of type-checking time from the
RegexBuilder test cases, bringing all expressions under 150ms (on
the tested computer).
Clean up and refactor the processor

* Simplify instruction fetching

* Refactor metrics out, and void their storage in release builds

*Put operations onto String
Calls to `ranges(of:)` and `firstRange(of:)` with a string parameter
actually use two different string searching algorithms. `ranges(of:)`
uses the "z-searcher" algorithm, while `firstRange(of:)` uses a
two-way search. Since it's better to align on a single path for these
searches, the z-searcher has lower requirements, and the two-way
search implementation has a correctness bug, this change removes
the two-way search algorithm and uses z-search for `firstRange(of:)`.

The correctness bug in `firstRange(of:)` appears only when searching
for the second (or later) occurrence of a substring, which you have
to be fairly deliberate about. In the example below, the substring
at offsets `7..<12` is missed:

    let text = "ADACBADADACBADACB"
    //          =====  -----=====
    let pattern = "ADACB"
    let firstRange = text.firstRange(of: pattern)!
    // firstRange ~= 0..<5
    let secondRange = text[firstRange.upperBound...].firstRange(of: pattern)!
    // secondRange ~= 12..<17

This change also removes some unrelated, unused code in Split.swift,
in addition to removing an (unused) usage of `TwoWaySearcher`.

rdar://92794248
Bug fix in newline hot path, and apply hot path to quantified dot
Finish refactoring logic onto String
@milseman milseman requested a review from natecook1000 April 19, 2023 20:08
@milseman milseman changed the title Update main Update swift/main Apr 19, 2023
@milseman
Copy link
Member Author

@swift-ci please test

extension String {
// TODO: Should the below have a `limitedBy` parameter?

func matchAnyNonNewline(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self (and @natecook1000), we should unit test these methods.


extension String {

func match(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, we can unit test these

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants