Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgr fixes #8259

Merged
merged 3 commits into from
Jun 30, 2024
Merged

Upgr fixes #8259

merged 3 commits into from
Jun 30, 2024

Conversation

sbulen
Copy link
Contributor

@sbulen sbulen commented Jun 26, 2024

Fixes #7799

Also adds $smcFunc['truncate'], used when building log_search_subjects via text2words() in old 1.0.x migrations. This was causing fatal errors when running the upgrader against 1.0.x & yabbse1.5.5.

Needed to get this line:

$returned_words[] = $max_chars === null ? $word : $smcFunc['truncate']($word, $max_chars);

To behave like the prior version for the upgrade:

$returned_words[] = $max_chars === null ? $word : substr($word, 0, $max_chars);

Note that $smcFunc['truncate'] is actually pretty involved & factors in entity encoding...

'truncate' => function($string, $length) use ($utf8, $ent_check, $ent_list, &$smcFunc)

But to port over all the portions of Load.php into upgrade.php to make that work felt like overkill. Opted to just match the prior logic, which has been used for years.

other/upgrade.php Outdated Show resolved Hide resolved
@Sesquipedalian Sesquipedalian merged commit 95a03e2 into SimpleMachines:release-2.1 Jun 30, 2024
8 checks passed
@sbulen
Copy link
Contributor Author

sbulen commented Jul 1, 2024

@Sesquipedalian -

The latest change introduced an error:

+++ Indexing topic subjects...Error: Undefined variable $string File: \yadayada\van1023\upgrade.php Line: 723Error: preg_split(): Passing null to parameter # 2 ($subject) of type string is deprecated File: \yadayada\van1023\upgrade.php Line: 723

Wrong var name used:
image

@Sesquipedalian
Copy link
Member

D'oh!

@sbulen
Copy link
Contributor Author

sbulen commented Jul 1, 2024

Please note also that I'd be hesitant to assume the "u" modifier is OK - a lot of these old DBs are probably not utf8.

Many aren't even ISO-8859-1... Some are ISO-8859-2+, for other language support back in the day...

I helped one VERY LARGE forum (lcdtvthailand) migrate forward from TIS-620...

I'm not sure I know what the "u" modifier will do to that.

Part of this is why I left the truncate function alone, as noted in my comment.

OTOH... I do think that it is quite possible that the EXACT logic used in Load.php might work, because all the vars are defined at that point in the upgrader (maybe moved a few lines lower after $utf8 is declared)...

'truncate' => function($string, $length) use ($utf8, $ent_check, $ent_list, &$smcFunc)

@sbulen
Copy link
Contributor Author

sbulen commented Jul 1, 2024

I know I am often too conservative on these things, but... I'd rather run with what's been working for the last umpteen years...

@Sesquipedalian
Copy link
Member

Sesquipedalian commented Jul 1, 2024

It will be fine. All those old 8-bit character sets will be divided up by this regex correctly. If we were trying to analyze the characters, we'd get into trouble, because the bytes beyond 7F would mean different things in different character sets. But we're just splitting them, so as long as each character is just one byte (which they are), it doesn't matter what they map to in this or that character set. Indeed, the only reason I used the u flag was to get access to the \X shorthand, which is like . except that it will match newlines, too. Beyond that, the u flag will make no discernible difference in this context.

I know I am often too conservative on these things, but... I'd rather run with what's been working for the last umpteen years...

That's the thing. It hasn't been. It cuts in the middle of entities. That's how the problems with broken entities in the log_search_subjects table happened.

@Sesquipedalian
Copy link
Member

... Actually, just in case, I will go ahead and change it so that it doesn't use the u flag after all.

@sbulen sbulen deleted the upgr_215_fixes branch July 1, 2024 17:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants