-
Notifications
You must be signed in to change notification settings - Fork 30.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
url: Improve WHATWG URLSearchParams spec compliance #9484
Conversation
/cc @jasnell |
Awesome! I won't have a chance to review in depth until Monday but it's awesome to see you jump in on this. Thank you! |
20a610e
to
3c099fa
Compare
update(this, search); | ||
this[searchParams][searchParams] = querystring.parse(this.search); | ||
this[searchParams][searchParams] = querystring.parse(this[context].query); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is more for consistency with the constructor and cosmetics than anything else: it's a bit odd to see the property being read in its setter.
About making
While the standard says to:
which essentially means to skip all instances of leading, trailing, or consecutive ampersands, querystring treats it as querystring.parse('')
// => ✓ {}
querystring.parse('&')
// => ✗ { '': '' }
// => Expected: {}
querystring.parse('=')
// => ✓ { '': '' }
querystring.parse('&=')
// => ✗ { '': [ '', '' ] }
// => Expected: { '': '' }
querystring.parse('&&')
// => ✗ { '': [ '', '' ] }
// => Expected: {}
querystring.parse('&&=')
// => ✗ { '': [ '', '', '' ] }
// => Expected: { '': '' }
// In Chrome 55 with #enable-experimental-web-platform-features
[...new URLSearchParams('')]
// => ✓ []
[...new URLSearchParams('&')]
// => ✓ []
[...new URLSearchParams('=')]
// => ✓ [ [ '', '' ] ]
[...new URLSearchParams('&=')]
// => ✓ [ [ '', '' ] ]
[...new URLSearchParams('&&')]
// => ✓ []
[...new URLSearchParams('&&=')]
// => ✓ [ [ '', '' ] ]
const { URL } = require('url')
// '\dc00' is an unpaired trail Unicode surrogate
const url = new URL('https://asdf.com/?a=\udc00asdf')
url.search
// => '?a=%EF%BF%BDasdf'
// This is correct: `%EF%BF%BD` corresponds to U+FFFD REPLACEMENT CHARACTER,
// as stipulated by the Web IDL spec:
// https://heycam.github.io/webidl/#dfn-obtain-unicode
url.searchParams.set('a', '\udc00asdf')
url.search
// => '?a=%F0%90%81%A1sdf'
// Here a URL with an invalid Unicode string results, as there isn't a check for
// unpaired surrogates in the userland JS code.
url.search = 'a=\udc00asdf'
url.search
// => '?a=%EF%BF%BDasdf'
// search's setter forces a reparse, so C++ code's Unicode
// sanitation kicks in here So in addition to making the querystring module conformant to WHATWG standards, this issue would have to be fixed somehow, either by doing some string wizardry in JS or add a C++ binding for sanitizing. |
Yep, you're absolutely right. In the initial implementation I decided to punt on 100% compliance with the URLSearchParams in order to get it landed, knowing that we could continue to iterate on it while the feature was still experimental. My plan has been to move the processing into the node_url.cc native code to echo the main URL parsing. |
3c099fa
to
9795284
Compare
@jasnell, have you gotten a chance to look at this PR? |
I have not and I am sorry. I have been in Tokyo for nodefest and have been On Mon, Nov 14, 2016 at 7:32 AM Timothy Gu [email protected] wrote:
|
@jasnell, ping. |
d04241e
to
fe08218
Compare
Just a heads up, in case anyone is interested I've reimplemented the core parsing and serializing routine in C++, in TimothyGu/node@urlsearchparams...urlsearchparams-cpp. I will submit a PR for that after this one is merged. |
@TimothyGu ... I've taken an initial look at things look good but have not yet had the opportunity to do a proper complete review. Haven't forgotten about it tho! |
9754f70
to
796213f
Compare
- Make URLSearchParams constructor spec-compliant - Strip leading `?` in URL#search's setter - Spec-compliant iterable interface - More precise handling of update steps as mandated by the spec - Add class strings to URLSearchParams objects and their prototype - Make sure `this instanceof URLSearchParams` in methods Also included are relevant tests from W3C's Web Platform Tests (https://github.com/w3c/web-platform-tests/tree/master/url). Fixes: nodejs#9302
796213f
to
8ea0a65
Compare
@jasnell, ping 3x... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM if CI is green! Thank you very much!
@TimothyGu ... things look good except for a couple of linter nits. Can you run |
- Make URLSearchParams constructor spec-compliant - Strip leading `?` in URL#search's setter - Spec-compliant iterable interface - More precise handling of update steps as mandated by the spec - Add class strings to URLSearchParams objects and their prototype - Make sure `this instanceof URLSearchParams` in methods Also included are relevant tests from W3C's Web Platform Tests (https://github.com/w3c/web-platform-tests/tree/master/url). Fixes: #9302 PR-URL: #9484 Reviewed-By: James M Snell <[email protected]>
Not in 7.2.1? It's not in the changelog, anyway. |
- Make URLSearchParams constructor spec-compliant - Strip leading `?` in URL#search's setter - Spec-compliant iterable interface - More precise handling of update steps as mandated by the spec - Add class strings to URLSearchParams objects and their prototype - Make sure `this instanceof URLSearchParams` in methods Also included are relevant tests from W3C's Web Platform Tests (https://github.com/w3c/web-platform-tests/tree/master/url). Fixes: #9302 PR-URL: #9484 Reviewed-By: James M Snell <[email protected]>
Notable changes: - buffer: - fix single-character string filling (Anna Henningsen) #9837 - handle UCS2.fill() properly on BE (Anna Henningsen) #9837 - url: - including base argument in originFor (joyeecheung) #10021 - improve URLSearchParams spec compliance (Timothy Gu) #9484 - http: - remove stale timeout listeners (Karl Böhlmark) #9440 - build: - fix node_g target (Daniel Bevenius) #10153 PR-URL: #10277
Notable changes SEMVER-MINOR - url: - add inspect function to TupleOrigin (Safia Abdalla) #10039 - crypto: - allow adding extra certs to well-known CAs (Sam Roberts) #9139 SEMVER-PATCH - buffer: - fix single-character string filling (Anna Henningsen) #9837 - handle UCS2 .fill() properly on BE (Anna Henningsen) #9837 - url: - including base argument in originFor (joyeecheung) #10021 - improve URLSearchParams spec compliance (Timothy Gu) #9484 - http: - remove stale timeout listeners (Karl Böhlmark) #9440 - build: - fix node_g target (Daniel Bevenius) #10153 - fs: - remove unused argument from copyObject() (Ethan Arrowood) #10041 - timers: - fix handling of cleared immediates (hveldstra) #9759 PR-URL: #10277
Notable changes: * **crypto**: - Allow adding extra certificates to well-known CAs. (Sam Roberts) [#9139](#9139) * **buffer**: - Fix single-character string filling. (Anna Henningsen) [#9837](#9837) - Handle UCS2 `.fill()` properly on BE. (Anna Henningsen) [#9837](#9837) * **url**: - Add inspect function to TupleOrigin. (Safia Abdalla) [#10039](#10039) - Including base argument in originFor. (joyeecheung) [#10021](#10021) - Improve URLSearchParams spec compliance. (Timothy Gu) [#9484](#9484) * **http**: - Remove stale timeout listeners. (Karl Böhlmark) [#9440](#9440) * **build**: - Fix node_g target. (Daniel Bevenius) [#10153](#10153) * **fs**: - Remove unused argument from copyObject(). (EthanArrowood) [#10041](#10041) * **timers**: - Fix handling of cleared immediates. (hveldstra) [#9759](#9759) * **src**: - Add wrapper for process.emitWarning(). (SamRoberts) [#9139](#9139) - Fix string format mistake for 32 bit node.(Alex Newman) [#10082](#10082)
Notable changes: * **crypto**: The built-in list of Well-Known CAs (Certificate Authorities) can now be extended via a NODE_EXTRA_CA_CERTS environment variable. (Sam Roberts) [#9139](#9139) * **buffer**: buffer.fill() now works properly for the UCS2 encoding on Big-Endian machines. (Anna Henningsen) [#9837](#9837) * **url**: - Including base argument in URL.originFor() to meet specification compliance. (joyeecheung) [#10021](#10021) - Improve URLSearchParams to meet specification compliance. (Timothy Gu) [#9484](#9484) * **http**: Remove stale timeout listeners in order to prevent a memory leak when using keep alive. (Karl Böhlmark) [#9440](#9440)
Notable changes: * buffer: - buffer.fill() now works properly for the UCS2 encoding on Big-Endian machines. (Anna Henningsen) nodejs#9837 * cluster: - disconnect() now returns a reference to the disconnected worker. (Sean Villars) nodejs#10019 * crypto: - The built-in list of Well-Known CAs (Certificate Authorities) can now be extended via a NODE_EXTRA_CA_CERTS environment variable. (Sam Roberts) nodejs#9139 * http: - Remove stale timeout listeners in order to prevent a memory leak when using keep alive. (Karl Böhlmark) nodejs#9440 * tls: - Allow obvious key/passphrase combinations. (Sam Roberts) nodejs#10294 * url: - Including base argument in URL.originFor() to meet specification compliance. (joyeecheung) nodejs#10021 - Improve URLSearchParams to meet specification compliance. (Timothy Gu) nodejs#9484 PR-URL: nodejs#10277
Notable changes: * buffer: - buffer.fill() now works properly for the UCS2 encoding on Big-Endian machines. (Anna Henningsen) nodejs#9837 * cluster: - disconnect() now returns a reference to the disconnected worker. (Sean Villars) nodejs#10019 * crypto: - The built-in list of Well-Known CAs (Certificate Authorities) can now be extended via a NODE_EXTRA_CA_CERTS environment variable. (Sam Roberts) nodejs#9139 * http: - Remove stale timeout listeners in order to prevent a memory leak when using keep alive. (Karl Böhlmark) nodejs#9440 * tls: - Allow obvious key/passphrase combinations. (Sam Roberts) nodejs#10294 * url: - Including base argument in URL.originFor() to meet specification compliance. (joyeecheung) nodejs#10021 - Improve URLSearchParams to meet specification compliance. (Timothy Gu) nodejs#9484 PR-URL: nodejs#10277
Notable changes: * buffer: - buffer.fill() now works properly for the UCS2 encoding on Big-Endian machines. (Anna Henningsen) #9837 * cluster: - disconnect() now returns a reference to the disconnected worker. (Sean Villars) #10019 * crypto: - The built-in list of Well-Known CAs (Certificate Authorities) can now be extended via a NODE_EXTRA_CA_CERTS environment variable. (Sam Roberts) #9139 * http: - Remove stale timeout listeners in order to prevent a memory leak when using keep alive. (Karl Böhlmark) #9440 * tls: - Allow obvious key/passphrase combinations. (Sam Roberts) #10294 * url: - Including base argument in URL.originFor() to meet specification compliance. (joyeecheung) #10021 - Improve URLSearchParams to meet specification compliance. (Timothy Gu) #9484 PR-URL: #10277
Notable changes: * buffer: - buffer.fill() now works properly for the UCS2 encoding on Big-Endian machines. (Anna Henningsen) nodejs/node#9837 * cluster: - disconnect() now returns a reference to the disconnected worker. (Sean Villars) nodejs/node#10019 * crypto: - The built-in list of Well-Known CAs (Certificate Authorities) can now be extended via a NODE_EXTRA_CA_CERTS environment variable. (Sam Roberts) nodejs/node#9139 * http: - Remove stale timeout listeners in order to prevent a memory leak when using keep alive. (Karl Bohlmark) nodejs/node#9440 * tls: - Allow obvious key/passphrase combinations. (Sam Roberts) nodejs/node#10294 * url: - Including base argument in URL.originFor() to meet specification compliance. (joyeecheung) nodejs/node#10021 - Improve URLSearchParams to meet specification compliance. (Timothy Gu) nodejs/node#9484 PR-URL: nodejs/node#10277 Signed-off-by: Ilkka Myller <[email protected]>
Checklist
make -j8 test
(UNIX), orvcbuild test nosign
(Windows) passesdocumentation is changed or added(WHATWG URL interface is still experimental and undocumented)Affected core subsystem(s)
url, test
Description of change
This PR aims to improve the general compatibility of
URLSearchParams
class with WHATWG's URL Standard. Specifically, the PR consists of the following parts:URLSearchParams
constructor signature the same as the spec?
inURL#search
's setterkeys
,entries
, andforEach
methodsURLSearchParamsIterator
object instead of a generator function for@@iterator
, per spec%20
vs+
, are still commented out.TODOs and points of discussion:
URLSearchParams
public@@iterator
,keys
,values
, andentries
application/x-www-form-urlencoded
byte serializer disagrees on certain characters such as0x20
(space; encodeURIComponent escapes it as%20
while querystring should serialize it to+
),0x27
(left paren; encodeURIComponent does not escape but querystring should), etc.Fixes: #9302