Exceeded maxRedirects with nytimes.com links #76

Open
dannguyen opened this issue Jul 12, 2016 · 7 comments

@dannguyen

(Just leaving this here, will investigate a bit later)

Given a New York Times URL such as this:

http://www.nytimes.com/2016/07/12/technology/pokemon-go-brings-augmented-reality-to-a-mass-audience.html

The request will fail with this error:

Error: Exceeded maxRedirects. Probably stuck in a redirect loop http://www.nytimes.com/2016/07/12/technology/pokemon-go-brings-augmented-reality-to-a-mass-audience.html?_r=4

Note that nytimes.com has a somewhat convoluted server configuration and responds with an HTTP status code of 303 (See Other).

...you'll get the same redirection behavior with cURL:

$ curl -IL http://www.nytimes.com/2016/07/12/technology/pokemon-go-brings-augmented-reality-to-a-mass-audience.html
HTTP/1.1 303 See Other
Server: Varnish
location: https://myaccount.nytimes.com/auth/login?URI=http%3A%2F%2Fwww.nytimes.com%2F2016%2F07%2F12%2Ftechnology%2Fpokemon-go-brings-augmented-reality-to-a-mass-audience.html%3F_r%3D5&REFUSE_COOKIE_ERROR=SHOW_ERROR
Accept-Ranges: bytes
Date: Tue, 12 Jul 2016 12:12:38 GMT
Age: 0
X-API-Version: 5-0
X-PageType: article
Connection: close
X-Frame-Options: DENY
Set-Cookie: RMID=007f010123545784deb60008;Path=/; Domain=.nytimes.com;Expires=Wed, 12 Jul 2017 12:12:38 UTC

HTTP/1.1 200 OK
Date: Tue, 12 Jul 2016 12:12:41 GMT
Content-Type: text/html; charset=UTF-8
Connection: keep-alive
Set-Cookie: __cfduid=dce29bea6d432f3d2e44a8bbe3e1220aa1468325561; expires=Wed, 12-Jul-17 12:12:41 GMT; path=/; domain=.nytimes.com; HttpOnly
Vary: Accept-Encoding
Cache-Control: max-age=0, no-cache
Cneonction: close
Server: cloudflare-nginx
CF-RAY: 2c1467a827722507-ORD

Heavier HTTP clients, such as wget with its default settings, can deal with this, as can libraries such as Python's Requests. I'm new to Node, so I'm not sure what the best-practice route is.

@dannguyen
Author

Ah OK, now I remember what the hack for nytimes.com is: keep the cookies during the redirects. I'm not sure whether setting the jar option to true has implications for the general use case, so I'll leave it here as an FYI:

https://github.com/request/request#examples

request( {jar: true, url: url}, (err, resp, body) => { console.log(body) });
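To illustrate why keeping cookies would break the loop, here is a toy model that runs without any network access. It is only a sketch under assumptions: `fakeServer` and `fetchFollowingRedirects` are hypothetical names, not part of request, and the server logic (redirect with a 303 and a cookie until the cookie comes back) is a guess at what a cookie-gated site does, not the actual nytimes.com configuration.

```javascript
// Hypothetical stand-in for a cookie-gated site: it answers 303 and sets a
// cookie until the client sends that cookie back. This is an assumption made
// for illustration, not the real server behaviour.
function fakeServer(url, cookies) {
  if (cookies.visited === '1') {
    return { status: 200, headers: {}, body: 'article body' };
  }
  const next = new URL(url);
  const hops = Number(next.searchParams.get('_r') || 0) + 1;
  next.searchParams.set('_r', String(hops));
  return {
    status: 303,
    headers: { location: next.toString(), 'set-cookie': 'visited=1' },
    body: '',
  };
}

// Minimal redirect follower. With jar: false every request goes out without
// cookies; with jar: true cookies from earlier responses are replayed.
function fetchFollowingRedirects(url, { jar = false, maxRedirects = 10 } = {}) {
  const cookies = {};
  let current = url;
  for (let hop = 0; hop <= maxRedirects; hop++) {
    const res = fakeServer(current, jar ? cookies : {});
    if (jar && res.headers['set-cookie']) {
      // Keep only the name=value pair; attributes such as Path are ignored.
      const [name, value] = res.headers['set-cookie'].split(';')[0].split('=');
      cookies[name] = value;
    }
    if (res.status < 300 || res.status >= 400) return res;
    current = res.headers.location;
  }
  throw new Error('Exceeded maxRedirects. Probably stuck in a redirect loop ' + current);
}

// With a cookie jar the second request carries the cookie and succeeds.
console.log(fetchFollowingRedirects('http://example.com/article.html', { jar: true }).status);

// Without a jar the fake server keeps redirecting until the limit is hit.
try {
  fetchFollowingRedirects('http://example.com/article.html');
} catch (err) {
  console.log(err.message);
}
```

In this model the only difference between the two calls is whether the `Set-Cookie` value from the first 303 is sent back on the next request, which is what `jar: true` appears to change in the real case.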

@felipe-augusto

+1 for this, having the same issue here

@VigneshPT

+1, had the same issue with washingtonpost.com URLs. Fixed by setting jar: true.

@swathik313

request( {jar: true, url: url}, (err, resp, body) => { console.log(body) });

@nfederico

Hi guys, I was trying out Cheerio to scrape some info from: http://www.bna.com.ar/Personas

My code:
`const request = require('request');
const cheerio = require('cheerio');

// jar: true keeps cookies between redirects
request({ jar: true, url: 'http://www.bna.com.ar/Personas' }, (err, res, html) => {
  if (!err && res.statusCode === 200) {
    console.log(html);
  } else {
    console.log(err);
  }
});`

But it gives me the following error:

PS C:\Users\Federico\source\repos\Ws> node index
(node:12196) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 pipe listeners added. Use emitter.setMaxListeners() to increase limit
Error: Exceeded maxRedirects. Probably stuck in a redirect loop http://www.bna.com.ar/Error?aspxerrorpath=/Error/ErrorPage
at Redirect.onResponse (C:\Users\Federico\source\repos\Ws\node_modules\request\lib\redirect.js:98:27)
at Request.onRequestResponse (C:\Users\Federico\source\repos\Ws\node_modules\request\request.js:993:22)
at ClientRequest.emit (events.js:198:13)
at HTTPParser.parserOnIncomingClient [as onIncoming] (_http_client.js:556:21)
at HTTPParser.parserOnHeadersComplete (_http_common.js:109:17)
at Socket.socketOnData (_http_client.js:442:20)
at Socket.emit (events.js:198:13)
at addChunk (_stream_readable.js:288:12)
at readableAddChunk (_stream_readable.js:269:11)
at Socket.Readable.push (_stream_readable.js:224:10)

I tried adding jar:true, but it is not working. Any clue?

@892042158

jar:true solved my problem. Thumbs up.

@emanton

emanton commented Oct 27, 2020

Hi, please write in English.

7 participants