Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve validation of urls #736

Closed
yanokwa opened this issue Mar 22, 2017 · 7 comments
Closed

Improve validation of urls #736

yanokwa opened this issue Mar 22, 2017 · 7 comments

Comments

@yanokwa
Copy link
Member

yanokwa commented Mar 22, 2017

In UrlUtils, the way we check URLs is like so:

try {
  new URL(url);
  return true;
} catch (MalformedURLException e) {
  return false;
}

This fails with things that are probably not URLs, like https:appspot.com which technically is a URL, but you know, it's not a URL folks are likely to enter. I'd think we should do better.

We could search for the perfect URL regex, but it doesn't exist. We can add UrlValidator but it seems strange to add a library for just this.

I propose we define a URL to be a string that starts with http:// or https:// and has a least one character. So in regex, ^(http|https)://.+$

@dcbriccetti
Copy link
Contributor

A small improvement: ^https?:\/\/.+$

@nribeka
Copy link
Contributor

nribeka commented Mar 22, 2017

Are we just check the format of the url or do you want to actually check the url?

If we want to check the url whether it is valid and can be used, we probably want to issue HEAD call and see what we're getting? Or this is not gonna too much for folks in the field (consuming their data allocation)? This probably gonna help them making sure that they're actually entering correct server location.

@yanokwa
Copy link
Member Author

yanokwa commented Mar 22, 2017

I don't think we want to actually check the URL, we just want to check the format. Checking the URL requires the server to be up (duh) but there are times when you want to pre-configure the devices first.

@ykayacan
Copy link

Hello! Do you consider about using Android's built in WEB_URL pattern matcher?

Example utility code is below.

public static boolean isUrl(String url) {
Pattern p = Patterns.WEB_URL;
Matcher m = p.matcher(url.toLowerCase());
return m.matches();
}

@yanokwa
Copy link
Member Author

yanokwa commented Mar 23, 2017

@krialix Yes, I considered it. For one, it depends on a fixed list of top level domains which change quickly and so you get different behavior on different versions of Android. See http://stackoverflow.com/a/29771086. Have you found this tradeoff acceptable in your use of Patterns.WEB_URL? Have you confirmed that it reports URLs like https:appspot.com as invalid?

@ykayacan
Copy link

@yanokwa Yes it reports invalid. We use this pattern in our app and worked well for now, but I like to test it for other Android versions (currently we tested it on Android 7.1.1).

Moreover, if it is not valid for older Android versions maybe you can consider to add URL_VALID_DOMAIN from Regex class which is actively developed by Twitter.

@yanokwa
Copy link
Member Author

yanokwa commented Apr 12, 2017

I think a simple regex is strictly better because it:

  1. Has the same behavior across all version of Android.
  2. Does not require an external library.

@lognaturel does not feel strongly about this, so I'm going to proceed with a PR.

@lognaturel lognaturel added this to the April 2017 milestone May 2, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants