Skip to content

Fix mb_strimwidth RC info #9254

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

iluuu1994
Copy link
Member

@iluuu1994 iluuu1994 commented Aug 5, 2022

return zend_string_copy(input);

This function can now return values with RC != 1.

Associated failure on nightly:

https://dev.azure.com/phpazuredevops/PHP/_build/results?buildId=27444&view=ms.vss-test-web.build-test-results-tab&runId=608358&resultId=107821&paneView=debug

val is an embedded array in zend_string. The value would decay to a
pointer which can never be NULL.
@alexdowad
Copy link
Contributor

Thanks very much!!

@@ -118,7 +118,6 @@ function mb_strcut(string $string, int $start, ?int $length = null, ?string $enc

function mb_strwidth(string $string, ?string $encoding = null): int {}

/** @refcount 1 */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this hint is used for some kind of optimizations by the JIT? Probably for cases where the returned string doesn't "escape" and can be immediately freed at the end of a block?

I just looked through the entire list and the annotations look accurate for the other functions. There are some other cases where potentially we could just return the input string unmodified, which might help to reduce the number of dynamic allocations. If I ever do that I will remember to adjust the annotations here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do wonder why mb_scrub isn't annotated as @refcount 1. I believe that function should always return a newly allocated string with refcount 1. Or... is there something I'm missing?

Copy link
Member

@cmb69 cmb69 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thank you! This looks good, except maybe the change to PHP_INI_MH(OnUpdate_mbstring_http_input).

cc @alexdowad

@@ -761,7 +761,7 @@ static PHP_INI_MH(OnUpdate_mbstring_http_input)
php_error_docref("ref.mbstring", E_DEPRECATED, "Use of mbstring.http_input is deprecated");
}

if (!new_value || !ZSTR_VAL(new_value)) {
if (!new_value) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you change this deliberately? If so, the implementation of the function could be simplified; basically to an if-else, then swap the branches, then drop the else, since the then branch guards.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it produced a compiler warning (as the check is redundant) but I didn't want to change anything else.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that check really redundant? What about an empty zend_string (or more precisely, a zend_string where the first byte is NUL)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cmb69 Then it's still decayed to a pointer that points to a '\0'.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to demonstrate: https://godbolt.org/z/6x9z7d8a9

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like adding a new regression test is in order.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is a bug stemming from the time we changed from char* to zend_string*

Doesn't look like it. It was added here: 379d9a1 But maybe it was just always wrong.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just tried checking what the manual says about mbstring.http_input: https://www.php.net/manual/en/mbstring.configuration.php#ini.mbstring.http-input

It doesn't document what is "supposed" to happen if mbstring.http_input is set to an empty string. I suggest we stick to whatever the historical behavior was.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, might be best for now. After all, these deprecated INI settings were supposed to be removed long ago.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll merge this without that commit for now.

@iluuu1994 iluuu1994 closed this in a6f489b Aug 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants