-
Notifications
You must be signed in to change notification settings - Fork 7.9k
[RFC] Add RFC 3986 and WHATWG compliant URL parsing support #14461
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
@@ -3715,7 +3715,6 @@ function uniqid(string $prefix = "", bool $more_entropy = false): string {} | |||
|
|||
/** | |||
* @return int|string|array<string, int|string>|null|false | |||
* @compile-time-eval |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed for benchmarking purposes
ext/url/php_url.c
Outdated
|
||
static void cleanup_parser(void) | ||
{ | ||
if (++URL_G(urls) % 500 == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approach is copy-pasted from lexbor/lexbor#206
823e4c6
to
e65deb7
Compare
0ccffa0
to
04d4e6e
Compare
In preparation of php#14461 (https://wiki.php.net/rfc/url_parsing_api)
In preparation of php#14461 (https://wiki.php.net/rfc/url_parsing_api)
In preparation of php#14461 (https://wiki.php.net/rfc/url_parsing_api)
In preparation of php#14461 (https://wiki.php.net/rfc/url_parsing_api)
In preparation of php#14461 (https://wiki.php.net/rfc/url_parsing_api)
|
||
zval property_errors_default_value; | ||
ZVAL_UNDEF(&property_errors_default_value); | ||
zend_string *property_errors_name = zend_string_init("errors", sizeof("errors") - 1, 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add the new strings from zend_string.h to PropertyInfo::PHP_85_KNOWN
in gen_stub.php so they can be used here
lexbor_str_init_append(&str, lexbor_parser->mraw, (const lxb_char_t *) Z_STRVAL_P(value), Z_STRLEN_P(value)); \ | ||
} else if (Z_TYPE_P(value) == IS_LONG && Z_LVAL_P(value) != 0) { \ | ||
ZVAL_STR(value, zend_long_to_str(Z_LVAL_P(value))); \ | ||
lexbor_str_init_append(&str, lexbor_parser->mraw, (const lxb_char_t *) Z_STRVAL_P(value), Z_STRLEN_P(value)); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a few lines here indented with spaces rather than tabs
case URI_COMPONENT_READ_NORMALIZED_UNICODE: /* Intentional fallthrough */ \ | ||
case URI_COMPONENT_READ_NORMALIZED_ASCII: { \ | ||
ZVAL_STRINGL(retval, (const char *) start, len); \ | ||
break; \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
break; \ | |
break; \ |
|
||
if (lexbor_uri->path.opaque) { | ||
LEXBOR_READ_ASCII_URI_COMPONENT(lexbor_uri->path.str.data, lexbor_uri->path.str.length, read_mode, retval); | ||
} else if (lexbor_uri->path.str.length) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these two blocks look the same - can they be combined?
return smart_str_extract(&uri_str); | ||
} | ||
|
||
static void lexbor_free_uri(void *uri) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if this function isn't needed, suggest removing the handler and making it optional in the lexbor_uri_handler
specification above
if (uri_handler_register(&parse_url_uri_handler) == FAILURE) { | ||
return FAILURE; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about just return uri_handler_register(&parse_url_uri_handler);
?
ZVAL_STRINGL(retval, (const char *) start, len); \ | ||
break; \ | ||
} \ | ||
EMPTY_SWITCH_DEFAULT_CASE() \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
given the EMPTY_SWITCH_DEFAULT_CASE() and the fact that all of the cases fall through, what is the point of even having a switch
?
zend_string *that_str = that_internal_uri->handler->uri_to_string( | ||
that_internal_uri->uri, URI_RECOMPOSITION_NORMALIZED_ASCII, exclude_fragment); | ||
if (that_str == NULL) { | ||
zend_throw_exception_ex(NULL, 0, "Cannot recompose %s to string", ZSTR_VAL(that_object->ce->name)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this_str
be released here?
|
||
ZEND_BEGIN_ARG_WITH_RETURN_TYPE_INFO_EX(arginfo_class_Uri_Rfc3986_Uri_parse, 0, 1, IS_STATIC, 1) | ||
ZEND_ARG_TYPE_INFO(0, uri, IS_STRING, 0) | ||
ZEND_ARG_OBJ_INFO_WITH_DEFAULT_VALUE(0, baseUrl, Uri\\Rfc3986\\\125ri, 1, "null") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this has the parameter type as Uri\\Rfc3986\\\125ri
- any idea why it isn't Uri\\Rfc3986\\Uri
? Does it matter? Similar below
}; | ||
|
||
#define URIPARSER_PARSE_STR(uri_str, internal_uri, errors) do { \ | ||
zend_string *str = smart_str_extract(&uri_str); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
line got indented with spaces rather than a tab
In preparation of php#14461 (https://wiki.php.net/rfc/url_parsing_api)
2nd take after the failed experiment with #11315
RFC: https://wiki.php.net/rfc/url_parsing_api