-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix "#valid? doesn't work for protocol-less urls" #58
Conversation
Makes for easier future refactoring
Add a guard for PublicSuffix makes #valid? simpler. Introduce a helper method to keep things tidy.
There were lots of options to go with there: * nil check in #scheme or #normalized_scheme * .to_s in #scheme or #normalized_scheme * use Addressable’s #normalized_scheme I don’t think nil checks are that pretty, so that one was quickly discarded. I also don’t think we should #to_s “randomly” for some properties it’s either all or nothing in my opinion. Using Addressable’s own #normalized_scheme was the top candidate, but it did additional things to the url (not just downcasing it) and I think it’s a good thing that we try to control what we can when it comes to normalization. So ultimately I went with this, I changed so that we check against #scheme instead of #normalized_scheme as that makes a whole lot of sense to me at least. Fix #55.
I did this change $ git d
diff --git a/spec/lib/twingly/url_spec.rb b/spec/lib/twingly/url_spec.rb
index e8cccdc..7eab2b2 100644
--- a/spec/lib/twingly/url_spec.rb
+++ b/spec/lib/twingly/url_spec.rb
@@ -37,7 +37,7 @@ end
describe Twingly::URL do
let(:test_url) do
- "http://www.blog.twingly.co.uk/2015/07/01/language-detection-changes/"
+ "//www.blog.twingly.co.uk/2015/07/01/language-detection-changes/"
end
let(:url) { described_class.parse(test_url) } And I got this result:
Yes, failure is expected with this change, but not in this way I think. As you said, this may not be the place to fix (not return I have no idea what's going on with 2) 😄
|
@@ -123,7 +127,7 @@ def normalized_path | |||
end | |||
|
|||
def valid? | |||
addressable_uri && public_suffix_domain && SCHEMES.include?(normalized_scheme) | |||
!!(scheme =~ ACCEPTED_SCHEMES) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks funny, seems like we only require a scheme to be valid, but I know that's not the case (this is the same as before). But I just wanted to write something about it.
I think making what we consider being an valid URL even more clear would be great. That might be future tasks for @twingly-mob.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks funny, seems like we only require a scheme to be valid, but I know that's not the case (this is the same as before). But I just wanted to write something about it.
My first attempt was actually this method returning true
and I took care of this in .internal_parse
(ensuring NullURL
return), but then I realized that calling #new
could lead to weird stuff (at least how it's currently implemented) so I went with the least amount of changes.
Yes, I was thinking of that one.
Haha, me neither. Probably somethnig we should fix.
Hm, maybe here or maybe not. You really found a new problem (you effectively try to normalize an invalid url, not sure how it should work) :). The intended scope here was to ensure that "#valid?" does not lie to us. I think that #52 will solve this issue for us. Alternatively add another issue for "normalizing invalid urls behavior". |
Maybe all invalid URLs should return NullURL? But yeah, sounds like we need to have a new discussion elsewhere (not this issue/PR).
|
Hehe, my thoughts exactly, just added that in #58 (comment) :) |
We can’t test it now, but we think it’s worth it.
Will return NullURL. Closes #61.
The current implementation of .parse can't pass nil to .internal_parse, but just to be sure we want to test this, because we didn't notice this until now.
Just a note about the "diff test" above $ git d
diff --git a/spec/lib/twingly/url_spec.rb b/spec/lib/twingly/url_spec.rb
index 8fd4c7c..3cad381 100644
--- a/spec/lib/twingly/url_spec.rb
+++ b/spec/lib/twingly/url_spec.rb
@@ -37,7 +37,7 @@ end
describe Twingly::URL do
let(:test_url) do
- "http://www.blog.twingly.co.uk/2015/07/01/language-detection-changes/"
+ "//www.blog.twingly.co.uk/2015/07/01/language-detection-changes/"
end
let(:url) { described_class.parse(test_url) } Now the failures are as expected:
|
Fix "#valid? doesn't work for protocol-less urls"
There were lots of options to go with there:
I don’t think nil checks are that pretty, so that one was quickly discarded. I also don’t think we should #to_s “randomly” for some properties it’s either all or nothing in my opinion. Using Addressable’s own #normalized_scheme was the top candidate, but it did additional things to the url (not just downcasing it) and I think it’s a good thing that we try to control what we can when it comes to normalization.
So ultimately I went with this, I changed so that we check against #scheme instead of #normalized_scheme as that makes a whole lot of sense to me at least.
Fix #55.
Additionally, I did some refactoring which I felt was related to fixing this issue.