Support for https in urllib
#70
Comments
This is still happening. Any solution?
I don't seem to be able to load data from any URLs, even [UPDATE - the issue may be related to the URL I am trying to load (World Bank) and how it responds:
import urllib.request
url = "http://api.worldbank.org/v2/country/all/indicator/SP.POP.TOTL?date=2000:2001"
urllib.request.urlopen(url)
---------------------------------------------------------------------------
OSError Traceback (most recent call last)
File /lib/python3.10/urllib/request.py:1348, in AbstractHTTPHandler.do_open(self, http_class, req, **http_conn_args)
1347 try:
-> 1348 h.request(req.get_method(), req.selector, req.data, headers,
1349 encode_chunked=req.has_header('Transfer-encoding'))
1350 except OSError as err: # timeout error
File /lib/python3.10/http/client.py:1282, in HTTPConnection.request(self, method, url, body, headers, encode_chunked)
1281 """Send a complete request to the server."""
-> 1282 self._send_request(method, url, body, headers, encode_chunked)
File /lib/python3.10/http/client.py:1328, in HTTPConnection._send_request(self, method, url, body, headers, encode_chunked)
1327 body = _encode(body, 'body')
-> 1328 self.endheaders(body, encode_chunked=encode_chunked)
File /lib/python3.10/http/client.py:1277, in HTTPConnection.endheaders(self, message_body, encode_chunked)
1276 raise CannotSendHeader()
-> 1277 self._send_output(message_body, encode_chunked=encode_chunked)
File /lib/python3.10/http/client.py:1037, in HTTPConnection._send_output(self, message_body, encode_chunked)
1036 del self._buffer[:]
-> 1037 self.send(msg)
1039 if message_body is not None:
1040
1041 # create a consistent interface to message_body
File /lib/python3.10/http/client.py:975, in HTTPConnection.send(self, data)
974 if self.auto_open:
--> 975 self.connect()
976 else:
File /lib/python3.10/http/client.py:941, in HTTPConnection.connect(self)
940 sys.audit("http.client.connect", self, self.host, self.port)
--> 941 self.sock = self._create_connection(
942 (self.host,self.port), self.timeout, self.source_address)
943 # Might fail in OSs that don't implement TCP_NODELAY
File /lib/python3.10/socket.py:845, in create_connection(address, timeout, source_address)
844 try:
--> 845 raise err
846 finally:
847 # Break explicitly a reference cycle
File /lib/python3.10/socket.py:833, in create_connection(address, timeout, source_address)
832 sock.bind(source_address)
--> 833 sock.connect(sa)
834 # Break explicitly a reference cycle
OSError: [Errno 23] Host is unreachable
During handling of the above exception, another exception occurred:
URLError Traceback (most recent call last)
Cell In[12], line 2
1 import urllib
----> 2 u = urllib.request.urlopen(url)
3 u
File /lib/python3.10/urllib/request.py:216, in urlopen(url, data, timeout, cafile, capath, cadefault, context)
214 else:
215 opener = _opener
--> 216 return opener.open(url, data, timeout)
File /lib/python3.10/urllib/request.py:519, in OpenerDirector.open(self, fullurl, data, timeout)
516 req = meth(req)
518 sys.audit('urllib.Request', req.full_url, req.data, req.headers, req.get_method())
--> 519 response = self._open(req, data)
521 # post-process response
522 meth_name = protocol+"_response"
File /lib/python3.10/urllib/request.py:536, in OpenerDirector._open(self, req, data)
533 return result
535 protocol = req.type
--> 536 result = self._call_chain(self.handle_open, protocol, protocol +
537 '_open', req)
538 if result:
539 return result
File /lib/python3.10/urllib/request.py:496, in OpenerDirector._call_chain(self, chain, kind, meth_name, *args)
494 for handler in handlers:
495 func = getattr(handler, meth_name)
--> 496 result = func(*args)
497 if result is not None:
498 return result
File /lib/python3.10/urllib/request.py:1377, in HTTPHandler.http_open(self, req)
1376 def http_open(self, req):
-> 1377 return self.do_open(http.client.HTTPConnection, req)
File /lib/python3.10/urllib/request.py:1351, in AbstractHTTPHandler.do_open(self, http_class, req, **http_conn_args)
1348 h.request(req.get_method(), req.selector, req.data, headers,
1349 encode_chunked=req.has_header('Transfer-encoding'))
1350 except OSError as err: # timeout error
-> 1351 raise URLError(err)
1352 r = h.getresponse()
1353 except:
URLError: <urlopen error [Errno 23] Host is unreachable>
In pandas:
import pandas as pd
pd.read_json(url)
---------------------------------------------------------------------------
OSError Traceback (most recent call last)
File /lib/python3.10/urllib/request.py:1348, in AbstractHTTPHandler.do_open(self, http_class, req, **http_conn_args)
1347 try:
-> 1348 h.request(req.get_method(), req.selector, req.data, headers,
1349 encode_chunked=req.has_header('Transfer-encoding'))
1350 except OSError as err: # timeout error
File /lib/python3.10/http/client.py:1282, in HTTPConnection.request(self, method, url, body, headers, encode_chunked)
1281 """Send a complete request to the server."""
-> 1282 self._send_request(method, url, body, headers, encode_chunked)
File /lib/python3.10/http/client.py:1328, in HTTPConnection._send_request(self, method, url, body, headers, encode_chunked)
1327 body = _encode(body, 'body')
-> 1328 self.endheaders(body, encode_chunked=encode_chunked)
File /lib/python3.10/http/client.py:1277, in HTTPConnection.endheaders(self, message_body, encode_chunked)
1276 raise CannotSendHeader()
-> 1277 self._send_output(message_body, encode_chunked=encode_chunked)
File /lib/python3.10/http/client.py:1037, in HTTPConnection._send_output(self, message_body, encode_chunked)
1036 del self._buffer[:]
-> 1037 self.send(msg)
1039 if message_body is not None:
1040
1041 # create a consistent interface to message_body
File /lib/python3.10/http/client.py:975, in HTTPConnection.send(self, data)
974 if self.auto_open:
--> 975 self.connect()
976 else:
File /lib/python3.10/http/client.py:941, in HTTPConnection.connect(self)
940 sys.audit("http.client.connect", self, self.host, self.port)
--> 941 self.sock = self._create_connection(
942 (self.host,self.port), self.timeout, self.source_address)
943 # Might fail in OSs that don't implement TCP_NODELAY
File /lib/python3.10/socket.py:845, in create_connection(address, timeout, source_address)
844 try:
--> 845 raise err
846 finally:
847 # Break explicitly a reference cycle
File /lib/python3.10/socket.py:833, in create_connection(address, timeout, source_address)
832 sock.bind(source_address)
--> 833 sock.connect(sa)
834 # Break explicitly a reference cycle
OSError: [Errno 23] Host is unreachable
During handling of the above exception, another exception occurred:
URLError Traceback (most recent call last)
Cell In[13], line 2
1 url="http://api.worldbank.org/v2/country/all/indicator/SP.POP.TOTL?date=2000:2001"
----> 2 pd.read_json(url)
File /lib/python3.10/site-packages/pandas/util/_decorators.py:207, in deprecate_kwarg.<locals>._deprecate_kwarg.<locals>.wrapper(*args, **kwargs)
205 else:
206 kwargs[new_arg_name] = new_arg_value
--> 207 return func(*args, **kwargs)
File /lib/python3.10/site-packages/pandas/util/_decorators.py:311, in deprecate_nonkeyword_arguments.<locals>.decorate.<locals>.wrapper(*args, **kwargs)
305 if len(args) > num_allow_args:
306 warnings.warn(
307 msg.format(arguments=arguments),
308 FutureWarning,
309 stacklevel=stacklevel,
310 )
--> 311 return func(*args, **kwargs)
File /lib/python3.10/site-packages/pandas/io/json/_json.py:588, in read_json(path_or_buf, orient, typ, dtype, convert_axes, convert_dates, keep_default_dates, numpy, precise_float, date_unit, encoding, encoding_errors, lines, chunksize, compression, nrows, storage_options)
585 if convert_axes is None and orient != "table":
586 convert_axes = True
--> 588 json_reader = JsonReader(
589 path_or_buf,
590 orient=orient,
591 typ=typ,
592 dtype=dtype,
593 convert_axes=convert_axes,
594 convert_dates=convert_dates,
595 keep_default_dates=keep_default_dates,
596 numpy=numpy,
597 precise_float=precise_float,
598 date_unit=date_unit,
599 encoding=encoding,
600 lines=lines,
601 chunksize=chunksize,
602 compression=compression,
603 nrows=nrows,
604 storage_options=storage_options,
605 encoding_errors=encoding_errors,
606 )
608 if chunksize:
609 return json_reader
File /lib/python3.10/site-packages/pandas/io/json/_json.py:673, in JsonReader.__init__(self, filepath_or_buffer, orient, typ, dtype, convert_axes, convert_dates, keep_default_dates, numpy, precise_float, date_unit, encoding, lines, chunksize, compression, nrows, storage_options, encoding_errors)
670 if not self.lines:
671 raise ValueError("nrows can only be passed if lines=True")
--> 673 data = self._get_data_from_filepath(filepath_or_buffer)
674 self.data = self._preprocess_data(data)
File /lib/python3.10/site-packages/pandas/io/json/_json.py:710, in JsonReader._get_data_from_filepath(self, filepath_or_buffer)
703 filepath_or_buffer = stringify_path(filepath_or_buffer)
704 if (
705 not isinstance(filepath_or_buffer, str)
706 or is_url(filepath_or_buffer)
707 or is_fsspec_url(filepath_or_buffer)
708 or file_exists(filepath_or_buffer)
709 ):
--> 710 self.handles = get_handle(
711 filepath_or_buffer,
712 "r",
713 encoding=self.encoding,
714 compression=self.compression,
715 storage_options=self.storage_options,
716 errors=self.encoding_errors,
717 )
718 filepath_or_buffer = self.handles.handle
720 return filepath_or_buffer
File /lib/python3.10/site-packages/pandas/io/common.py:667, in get_handle(path_or_buf, mode, encoding, compression, memory_map, is_text, errors, storage_options)
664 codecs.lookup_error(errors)
666 # open URLs
--> 667 ioargs = _get_filepath_or_buffer(
668 path_or_buf,
669 encoding=encoding,
670 compression=compression,
671 mode=mode,
672 storage_options=storage_options,
673 )
675 handle = ioargs.filepath_or_buffer
676 handles: list[BaseBuffer]
File /lib/python3.10/site-packages/pandas/io/common.py:336, in _get_filepath_or_buffer(filepath_or_buffer, encoding, compression, mode, storage_options)
334 # assuming storage_options is to be interpreted as headers
335 req_info = urllib.request.Request(filepath_or_buffer, headers=storage_options)
--> 336 with urlopen(req_info) as req:
337 content_encoding = req.headers.get("Content-Encoding", None)
338 if content_encoding == "gzip":
339 # Override compression based on Content-Encoding header
File /lib/python3.10/site-packages/pandas/io/common.py:236, in urlopen(*args, **kwargs)
230 """
231 Lazy-import wrapper for stdlib urlopen, as that imports a big chunk of
232 the stdlib.
233 """
234 import urllib.request
--> 236 return urllib.request.urlopen(*args, **kwargs)
File /lib/python3.10/urllib/request.py:216, in urlopen(url, data, timeout, cafile, capath, cadefault, context)
214 else:
215 opener = _opener
--> 216 return opener.open(url, data, timeout)
File /lib/python3.10/urllib/request.py:519, in OpenerDirector.open(self, fullurl, data, timeout)
516 req = meth(req)
518 sys.audit('urllib.Request', req.full_url, req.data, req.headers, req.get_method())
--> 519 response = self._open(req, data)
521 # post-process response
522 meth_name = protocol+"_response"
File /lib/python3.10/urllib/request.py:536, in OpenerDirector._open(self, req, data)
533 return result
535 protocol = req.type
--> 536 result = self._call_chain(self.handle_open, protocol, protocol +
537 '_open', req)
538 if result:
539 return result
File /lib/python3.10/urllib/request.py:496, in OpenerDirector._call_chain(self, chain, kind, meth_name, *args)
494 for handler in handlers:
495 func = getattr(handler, meth_name)
--> 496 result = func(*args)
497 if result is not None:
498 return result
File /lib/python3.10/urllib/request.py:1377, in HTTPHandler.http_open(self, req)
1376 def http_open(self, req):
-> 1377 return self.do_open(http.client.HTTPConnection, req)
File /lib/python3.10/urllib/request.py:1351, in AbstractHTTPHandler.do_open(self, http_class, req, **http_conn_args)
1348 h.request(req.get_method(), req.selector, req.data, headers,
1349 encode_chunked=req.has_header('Transfer-encoding'))
1350 except OSError as err: # timeout error
-> 1351 raise URLError(err)
1352 r = h.getresponse()
1353 except:
URLError: <urlopen error [Errno 23] Host is unreachable>
With requests:
import requests
requests.get(url).text
---------------------------------------------------------------------------
JsGenericError Traceback (most recent call last)
Cell In[14], line 3
1 import requests
2 import json
----> 3 requests.get(url).text
File /lib/python3.10/site-packages/requests/api.py:50, in <lambda>(url, **kwargs)
46 # build and return response object
47 return Response(xmlr, elapsed)
---> 50 get = lambda url, **kwargs: request(method="GET", url=url, **kwargs)
51 put = lambda url, **kwargs: request(method="PUT", url=url, **kwargs)
52 post = lambda url, **kwargs: request(method="POST", url=url, **kwargs)
File /lib/python3.10/site-packages/requests/api.py:41, in request(method, url, headers, auth, data, params)
39 t0 = time.time()
40 if data is None:
---> 41 xmlr.send(js_null())
42 else:
43 request.send("")
File <string>:42, in <lambda>(self, *args)
File <string>:92, in apply(js_function, args)
File <string>:213, in error_to_py_and_raise(err)
JsGenericError: {"stack":"Error: Failed to execute 'send' on 'XMLHttpRequest': Failed to load 'http://api.worldbank.org/v2/country/all/indicator/SP.POP.TOTL?date=2000:2001'.\n at Module._apply_try_catch (https://ouseful-demos.github.io/learn-to-code-jupyterlite/extensions/@jupyterlite/xeus-python-kernel/static/xpython_wasm.js:9:1690)\n at __emval_call (https://ouseful-demos.github.io/learn-to-code-jupyterlite/extensions/@jupyterlite/xeus-python-kernel/static/xpython_wasm.js:9:203415)\n at https://ouseful-demos.github.io/learn-to-code-jupyterlite/extensions/@jupyterlite/xeus-python-kernel/static/xpython_wasm.wasm:wasm-function[3521]:0x2bc54d\n at https://ouseful-demos.github.io/learn-to-code-jupyterlite/extensions/@jupyterlite/xeus-python-kernel/static/xpython_wasm.wasm:wasm-function[210]:0x118e29\n at method_call_trampoline (https://ouseful-demos.github.io/learn-to-code-jupyterlite/extensions/@jupyterlite/xeus-python-kernel/static/xpython_wasm.js:9:40898)\n at https://ouseful-demos.github.io/learn-to-code-jupyterlite/extensions/@jupyterlite/xeus-python-kernel/static/xpython_wasm.wasm:wasm-function[5127]:0x353164\n at method_call_trampoline (https://ouseful-demos.github.io/learn-to-code-jupyterlite/extensions/@jupyterlite/xeus-python-kernel/static/xpython_wasm.js:9:40898)\n at https://ouseful-demos.github.io/learn-to-code-jupyterlite/extensions/@jupyterlite/xeus-python-kernel/static/xpython_wasm.wasm:wasm-function[4206]:0x312ab5\n at https://ouseful-demos.github.io/learn-to-code-jupyterlite/extensions/@jupyterlite/xeus-python-kernel/static/xpython_wasm.wasm:wasm-function[6283]:0x3dc453\n at https://ouseful-demos.github.io/learn-to-code-jupyterlite/extensions/@jupyterlite/xeus-python-kernel/static/xpython_wasm.wasm:wasm-function[6273]:0x3d9e53"}
For quick tests / demos, it might also be useful if a JupyterLite site running the latest
@psychemedia looks like an issue when trying to make an HTTP request when the origin is served over HTTPS. Maybe worth trying with
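Browsers generally block plain HTTP requests made from a page served over HTTPS (mixed content), which would be consistent with the XMLHttpRequest failure above. A minimal sketch of upgrading the URL scheme before fetching, assuming the server offers the same API over HTTPS; `to_https` is a hypothetical helper name, not part of any library:

```python
from urllib.parse import urlsplit, urlunsplit


def to_https(url: str) -> str:
    """Return `url` with an http scheme upgraded to https.

    URLs with any other scheme are returned unchanged.
    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    if parts.scheme == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


url = "http://api.worldbank.org/v2/country/all/indicator/SP.POP.TOTL?date=2000:2001"
print(to_https(url))
```

This only avoids the mixed-content block; the request can still fail for other reasons, for example if the kernel's `urllib` has no HTTPS support.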
@jtpio I have been trying all the off-the-shelf World Bank Python API packages I can find, and they all fail in various ways. I've also been trying simpler loaders (as above) with both http and https (
The only solution I've found so far is to write my own package to run in the pyodide kernel that tests whether it can load pyodide and, if it can, then use
Example for
%pip install wbpy
from pyodide.http import open_url
import wbpy

def fetch_patch(url):
    # open_url returns an io.StringIO holding the response body
    with open_url(url) as f:
        return f.getvalue()

# Pass the Pyodide-based fetcher to wbpy in place of its default one
api = wbpy.IndicatorAPI(fetch=fetch_patch)
api.BASE_URL = "https://api.worldbank.org/v2/"
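The "test whether pyodide can be loaded, and use it if so" idea described above could be sketched roughly as follows. This is an illustration rather than the commenter's actual package: `in_pyodide` and `fetch_text` are hypothetical names, and the non-Pyodide branch simply falls back to the standard library.

```python
import importlib.util
import urllib.request


def in_pyodide() -> bool:
    """Return True when the `pyodide` package can be found (assumed to mean a Pyodide kernel)."""
    return importlib.util.find_spec("pyodide") is not None


def fetch_text(url: str) -> str:
    """Fetch `url` and return the response body as text."""
    if in_pyodide():
        # pyodide.http.open_url makes a synchronous browser request
        # and returns an io.StringIO with the body.
        from pyodide.http import open_url
        return open_url(url).getvalue()
    # Outside Pyodide, fall back to the standard library.
    with urllib.request.urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)
```

A function like `fetch_text` could then be passed wherever a library accepts a custom fetcher, as with `fetch=fetch_patch` above. It does not help in the xeus-python kernel, where `pyodide` is not available.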
Description
urllib can't handle https URLs in the XPython kernel. A similar issue was reported for the Pyodide kernel (jupyterlite/jupyterlite#413), but the proposed solution is Pyodide-specific.
Reproduce
Using xeus-python-kernel, try to execute a cell:
You will get the error: <urlopen error unknown url type: https>
The detailed error report shows the exception is raised inside the urllib module.
Expected behavior
The referenced CSV file should be loaded into a data frame.
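In CPython, `urllib.request` only defines `HTTPSHandler` when the `ssl` module can be imported, and `unknown url type: https` is what `urlopen` reports when no handler is registered for the scheme. A small diagnostic sketch for checking this in a given kernel (the function name `https_support_report` is made up for illustration):

```python
import urllib.request


def ssl_importable() -> bool:
    """Return True if the ssl module can be imported in this interpreter."""
    try:
        import ssl  # noqa: F401
    except ImportError:
        return False
    return True


def https_support_report() -> dict:
    """Summarise whether urllib can open https URLs in this interpreter."""
    return {
        "ssl_importable": ssl_importable(),
        # HTTPSHandler is only defined when http.client has HTTPSConnection,
        # which in turn requires ssl.
        "https_handler": hasattr(urllib.request, "HTTPSHandler"),
    }


print(https_support_report())
```

If both values come back `False`, the error is likely due to the kernel build lacking `ssl` rather than to the specific URL being requested.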
Context
Error traceback