Can't download youtube playlists or channels #14083

Closed
4 tasks
pythonoma opened this issue Aug 31, 2017 · 24 comments
Comments

@pythonoma

Please follow the guide below

  • You will be asked some questions and requested to provide some information, please read them carefully and answer honestly
  • Put an x into all the boxes [ ] relevant to your issue (like this: [x])
  • Use the Preview tab to see what your issue will actually look like

Make sure you are using the latest version: run youtube-dl --version and ensure your version is 2017.08.27.1. If it's not, read this FAQ entry and update. Issues with outdated version will be rejected.

  • [ x] I've verified and I assure that I'm running youtube-dl 2017.08.27.1

Before submitting an issue make sure you have:

  • [ x] At least skimmed through the README, most notably the FAQ and BUGS sections
  • [ x] Searched the bugtracker for similar issues including closed ones

What is the purpose of your issue?

  • [ x] Bug report (encountered problems with youtube-dl)
  • Site support request (request for adding support for a new site)
  • Feature request (request for a new functionality)
  • Question
  • Other

The following sections concretize particular purposed issues, you can erase any section (the contents between triple ---) not applicable to your issue


If the purpose of this issue is a bug report, site support request or you are not completely sure provide the full verbose output as follows:

Add the -v flag to your command line you run youtube-dl with (youtube-dl -v <your command line>), copy the whole output and insert it here. It should look similar to one below (replace it with your log inserted between triple ```):

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['https://www.youtube.com/playlist?list=UUXDTqHPpvXuBewj_zFhAyxg', '-v']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2017.08.27.1
[debug] Python version 3.5.2 - Linux-4.4.0-87-generic-x86_64-with-Ubuntu-16.04-xenial
[debug] exe versions: none
[debug] Proxy map: {}
[youtube:playlist] UUXDTqHPpvXuBewj_zFhAyxg: Downloading webpage
[download] Downloading playlist: Uploads from شبكة أخبار باباعمرو
[youtube:playlist] UUXDTqHPpvXuBewj_zFhAyxg: Downloading page #1
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.5/dist-packages/youtube_dl/__init__.py", line 465, in main
    _real_main(argv)
  File "/usr/local/lib/python3.5/dist-packages/youtube_dl/__init__.py", line 455, in _real_main
    retcode = ydl.download(all_urls)
  File "/usr/local/lib/python3.5/dist-packages/youtube_dl/YoutubeDL.py", line 1958, in download
    url, force_generic_extractor=self.params.get('force_generic_extractor', False))
  File "/usr/local/lib/python3.5/dist-packages/youtube_dl/YoutubeDL.py", line 787, in extract_info
    return self.process_ie_result(ie_result, download, extra_info)
  File "/usr/local/lib/python3.5/dist-packages/youtube_dl/YoutubeDL.py", line 939, in process_ie_result
    ie_entries, playliststart, playlistend))
  File "/usr/local/lib/python3.5/dist-packages/youtube_dl/extractor/youtube.py", line 272, in _entries
    content_html = more['content_html']
KeyError: 'content_html'

<end of log>

If the purpose of this issue is a site support request please provide all kinds of example URLs support for which should be included (replace following example URLs by yours):

Note that youtube-dl does not support sites dedicated to copyright infringement. In order for site support request to be accepted all provided example URLs should not violate any copyrights.


Description of your issue, suggested solution and other information

Can't download (some) playlists or channels
youtube-dl 'https://www.youtube.com/playlist?list=UUXDTqHPpvXuBewj_zFhAyxg' -v
gives the error above
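For readers unfamiliar with the last frame of the traceback: `_entries` indexes the JSON response with `more['content_html']`, which raises `KeyError` as soon as that key is absent. The following is only a minimal sketch of that failure mode. The helper name `get_content_html` and both response dicts are hypothetical; the real payload YouTube returned is not shown in this issue.

```python
def get_content_html(more):
    """Return the 'content_html' field, or None when it is missing.

    Hypothetical helper, not part of youtube-dl.
    """
    if not isinstance(more, dict):
        return None
    return more.get('content_html')


# Hypothetical response shapes, made up for illustration only.
old_style = {'content_html': '<li>video</li>', 'load_more_widget_html': ''}
new_style = {'response': {}}

print(get_content_html(old_style))  # the HTML fragment
print(get_content_html(new_style))  # None instead of a KeyError

try:
    new_style['content_html']  # what the traceback's last frame does
except KeyError as exc:
    print('KeyError:', exc)
```

Returning `None` only makes the failure visible earlier; it does not recover the playlist entries.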

@Zuccace

Zuccace commented Aug 31, 2017

I can confirm the same behaviour with git version 7998520.

@awei78

awei78 commented Aug 31, 2017

YouTube changed its site today. You can modify this file:
youtube_dl\extractor\youtube.py

    def _entries(self, page, playlist_id):
        more_widget_html = content_html = page
        for page_num in itertools.count(1):
            for entry in self._process_page(content_html):
                yield entry
                
            pattern = r'"nextContinuationData":{"continuation":"(?P<ctoken>[^\"]+)(?:[\s|\S]+?)"clickTrackingParams":"(?P<itct>[^\"]+)'
            mobj = re.search(pattern, more_widget_html)
            if mobj:
                more_url = 'https://www.youtube.com/browse_ajax?ctoken=%s&itct=%s' % (mobj.group('ctoken'), mobj.group('itct'))
            else:
                mobj = re.search(r'data-uix-load-more-href="/?(?P<more>[^"]+)"', more_widget_html)
                if not mobj:
                    break

                more_url = 'https://www.youtube.com/' + mobj.group('more')

            headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Maxthon/4.9.5.1000 Chrome/39.0.2146.0 Safari/537.36'}
            more = self._download_json(
                more_url,
                'Downloading page #%s' % page_num,
                headers=headers,
                transform_source=uppercase_escape)
            content_html = more['content_html']
            if not content_html.strip():
                # Some webpages show a "Load more" button but they don't
                # have more videos
                break
            more_widget_html = more['load_more_widget_html']

    def extract_videos_from_page(self, page):
        ids_in_page = []
        titles_in_page = []
        druation_in_page = []

        pattern = self._VIDEO_RE if 'playlistVideoRenderer' in page else self._VIDEO_RE_OLD
        for mobj in re.finditer(pattern, page):
            ...

_VIDEO_RE_OLD = r'href="\s*/watch\?v=(?P<id>[0-9A-Za-z_-]{11})&amp;[^"]*?index=(?P<index>\d+)(?:[^>]+>(?P<title>[^<]+))?'
_VIDEO_RE = r'"playlistVideoRenderer":{"videoId":"(?P<id>[^\"]+)(?:[\s|\S]+?)"title":(?:[\s|\S]+?)"simpleText":"(?P<title>[^\"]+)(?:[\s|\S]+?)"lengthText":(?:[\s|\S]+?)"simpleText":"(?P<duration>[^\"]+)'

        playlist_title = self._html_search_regex(
            r'(?s)<h1 class="pl-header-title[^"]*"[^>]*>\s*(.*?)\s*</h1>',
            page, 'title', fatal=False, default=None)
        if not playlist_title:
            playlist_title = self._og_search_title(page)

OK
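To make the continuation step in the snippet above easier to follow, here is a minimal standalone sketch of how the `ctoken` and `itct` values could be pulled out of a page and turned into the `browse_ajax` URL. It uses a slightly tidied variant of the pattern (escaped brace, `[\s\S]` instead of `[\s|\S]`). The function name `build_more_url` and the sample string are assumptions made up for illustration, not real YouTube output.

```python
import re

# Tidied variant of the pattern from the snippet above: the brace is escaped
# and the lazy filler is written as [\s\S]+? (the original [\s|\S] already
# matches every character, so the '|' inside it is redundant).
CONTINUATION_RE = re.compile(
    r'"nextContinuationData":\{"continuation":"(?P<ctoken>[^"]+)'
    r'[\s\S]+?"clickTrackingParams":"(?P<itct>[^"]+)')


def build_more_url(page):
    """Return the browse_ajax URL for the next page, or None if no token is found."""
    mobj = CONTINUATION_RE.search(page)
    if mobj is None:
        return None
    return 'https://www.youtube.com/browse_ajax?ctoken=%s&itct=%s' % (
        mobj.group('ctoken'), mobj.group('itct'))


# Hypothetical page fragment, made up for illustration only.
sample = ('{"nextContinuationData":{"continuation":"TOKEN123",'
          '"clickTrackingParams":"ITCT456"}}')
print(build_more_url(sample))
```

When no continuation token is present, the sketch returns `None`; the snippet above falls back to the old `data-uix-load-more-href` lookup in that case.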

@DowningC

I experienced this issue as well. awei78's suggested changes to extractor/youtube.py fixed the error for me; thanks!

@shirishag75

shirishag75 commented Aug 31, 2017

Could somebody release a new version with the updates to youtube.py shared above, pretty please? Also, @awei78, could you please list the line numbers where the modifications need to go? The snippets seem to start at the following lines:

Line 258 for the first one.
Line 294 for the second one. @awei78, shouldn't the second one have

duration_in_page = []

instead of

druation_in_page = []

A spelling error, perhaps?

I don't know where the third one goes :(

@Zuccace
Copy link

Zuccace commented Aug 31, 2017

Yeah. A simple .patch file would be nice, for example.
The fix above doesn't indicate which line numbers to replace.

@pythonoma
Author

There should be a new release to fix this; downloading from YouTube is the main feature of this script.
I hope it happens soon.

@Zuccace

Zuccace commented Aug 31, 2017

@awei78 Why don't you create a PR for your patch?

@shirishag75

I just saw that youtube.com has changed quite a bit:

https://home.bt.com/tech-gadgets/tech-news/youtube-new-logo-redesign-new-features-11364208101487

I don't like the new minimalistic look and, more importantly, they have made it harder to find playlists :(

@GTechAlpha

GTechAlpha commented Aug 31, 2017

Here is a patch using @awei78's fix, until there is an official fix:

https://github.com/GTechAlpha/youtube-dl/commit/0ae0db13d1cecefde301bf5ba6770f409b3bb6bf.patch

Full credit and thanks to @awei78 .

@Zuccace

Zuccace commented Aug 31, 2017

I confirm. The patch does work.

@pythonoma
Author

I haven't checked it, but can someone push @awei78's patch to the master branch?

@fdaniele85

With the patch applied, when I download the last 5 videos of a channel, it does not download the right videos...

@marabu88

marabu88 commented Sep 1, 2017

If I try to download some channels, I see this:
[youtube:channel] playlist UC_some_channekl_ID: Downloading 0 videos

@ivan
Contributor

ivan commented Sep 1, 2017

Patch tested on a user and a channel; don't blame me if it breaks something else...

From 953ca568dcde08fc55a0c245036e390c2e8061bc Mon Sep 17 00:00:00 2001
From: Ivan Kozik <[email protected]>
Date: Fri, 1 Sep 2017 02:26:41 +0000
Subject: [PATCH] Get non-Polymer YouTube pages because channel/playlist
 download is broken for Polymer pages

---
 youtube_dl/extractor/youtube.py | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/youtube_dl/extractor/youtube.py b/youtube_dl/extractor/youtube.py
index ea6f12f8e..d892db943 100644
--- a/youtube_dl/extractor/youtube.py
+++ b/youtube_dl/extractor/youtube.py
@@ -267,7 +267,7 @@ class YoutubeEntryListBaseInfoExtractor(YoutubeBaseInfoExtractor):
                 break
 
             more = self._download_json(
-                'https://youtube.com/%s' % mobj.group('more'), playlist_id,
+                'https://www.youtube.com/%s&disable_polymer=true' % mobj.group('more'), playlist_id,
                 'Downloading page #%s' % page_num,
                 transform_source=uppercase_escape)
             content_html = more['content_html']
@@ -310,7 +310,7 @@ class YoutubePlaylistsBaseInfoExtractor(YoutubeEntryListBaseInfoExtractor):
                 r'<h3[^>]+class="[^"]*yt-lockup-title[^"]*"[^>]*><a[^>]+href="/?playlist\?list=([0-9A-Za-z-_]{10,})"',
                 content)):
             yield self.url_result(
-                'https://www.youtube.com/playlist?list=%s' % playlist_id, 'YoutubePlaylist')
+                'https://www.youtube.com/playlist?list=%s&disable_polymer=true' % playlist_id, 'YoutubePlaylist')
 
     def _real_extract(self, url):
         playlist_id = self._match_id(url)
@@ -2325,7 +2325,7 @@ class YoutubePlaylistIE(YoutubePlaylistBaseInfoExtractor):
 class YoutubeChannelIE(YoutubePlaylistBaseInfoExtractor):
     IE_DESC = 'YouTube.com channels'
     _VALID_URL = r'https?://(?:youtu\.be|(?:\w+\.)?youtube(?:-nocookie)?\.com)/channel/(?P<id>[0-9A-Za-z_-]+)'
-    _TEMPLATE_URL = 'https://www.youtube.com/channel/%s/videos'
+    _TEMPLATE_URL = 'https://www.youtube.com/channel/%s/videos?disable_polymer=true'
     _VIDEO_RE = r'(?:title="(?P<title>[^"]+)"[^>]+)?href="/watch\?v=(?P<id>[0-9A-Za-z_-]+)&?'
     IE_NAME = 'youtube:channel'
     _TESTS = [{
@@ -2382,7 +2382,7 @@ class YoutubeChannelIE(YoutubePlaylistBaseInfoExtractor):
         if channel_playlist_id and channel_playlist_id.startswith('UC'):
             playlist_id = 'UU' + channel_playlist_id[2:]
             return self.url_result(
-                compat_urlparse.urljoin(url, '/playlist?list=%s' % playlist_id), 'YoutubePlaylist')
+                compat_urlparse.urljoin(url, '/playlist?list=%s&disable_polymer=true' % playlist_id), 'YoutubePlaylist')
 
         channel_page = self._download_webpage(url, channel_id, 'Downloading page #1')
         autogenerated = re.search(r'''(?x)
@@ -2416,7 +2416,7 @@ class YoutubeChannelIE(YoutubePlaylistBaseInfoExtractor):
 class YoutubeUserIE(YoutubeChannelIE):
     IE_DESC = 'YouTube.com user videos (URL or "ytuser" keyword)'
     _VALID_URL = r'(?:(?:https?://(?:\w+\.)?youtube\.com/(?:(?P<user>user|c)/)?(?!(?:attribution_link|watch|results)(?:$|[^a-z_A-Z0-9-])))|ytuser:)(?!feed/)(?P<id>[A-Za-z0-9_-]+)'
-    _TEMPLATE_URL = 'https://www.youtube.com/%s/%s/videos'
+    _TEMPLATE_URL = 'https://www.youtube.com/%s/%s/videos?disable_polymer=true'
     IE_NAME = 'youtube:user'
 
     _TESTS = [{
-- 
2.14.1
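The patch above appends `disable_polymer=true` by hand, using `?` or `&` depending on whether each URL already has a query string. As a rough illustration of the same idea in one place, here is a minimal sketch built on the standard library. The helper name `with_disable_polymer` is hypothetical and not part of youtube-dl; only the `disable_polymer=true` parameter itself comes from the patch.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_disable_polymer(url):
    """Return url with disable_polymer=true in its query string.

    Hypothetical helper: picks '?' or '&' automatically and does not
    add the parameter twice.
    """
    parts = urlsplit(url)
    # Keep existing parameters, dropping any earlier disable_polymer value.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != 'disable_polymer']
    query.append(('disable_polymer', 'true'))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_disable_polymer(
    'https://www.youtube.com/playlist?list=UUXDTqHPpvXuBewj_zFhAyxg'))
print(with_disable_polymer('https://www.youtube.com/channel/UC_example/videos'))
```

Note that `urlencode` re-encodes the existing parameters, so unusual characters in a query string may come out escaped differently than in the input.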

@Rayniax

Rayniax commented Sep 1, 2017

I can confirm that only the patch from @ivan works properly.
Thank you very much for your work!

@Zuccace

Zuccace commented Sep 1, 2017

Patch from @ivan works here as well.

@pythonoma
Author

I would consider @ivan's patch a temporary fix that relies on backward compatibility on YouTube's side; they may disable this option at any time.

@Balun-Symmetrie

Balun-Symmetrie commented Sep 1, 2017

I am new; how can I easily apply this patch?
I installed via curl.

@Zuccace

Zuccace commented Sep 1, 2017

The patch can be applied by first locating extractor/youtube.py.
Then copy the patch and save it, preferably with a .patch extension.
Then run
patch <path_to>/extractor/youtube.py <the_patch_file_you_just_saved>.patch

You need the patch utility, of course.

If you happen to be running youtube-dl on Windows, I'd guess the Linux environment that's included in some Windows versions could contain the patch utility. Otherwise I cannot help much more. It's been more than 10 years since I had Windows.

On macOS I assume the patch utility is already part of the default install.
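For the "locating extractor/youtube.py" step, a minimal sketch that asks Python where the package lives might help. It assumes youtube-dl is installed as a regular Python package (as in the traceback at the top of this issue, under dist-packages), and it will not find anything for the single-file download installed via curl. The function name `find_extractor_file` is hypothetical.

```python
import importlib.util
import os


def find_extractor_file(package='youtube_dl'):
    """Return the expected path of extractor/youtube.py, or None.

    Hypothetical helper: only works when the package is installed as a
    normal directory on the import path.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    package_dir = list(spec.submodule_search_locations)[0]
    return os.path.join(package_dir, 'extractor', 'youtube.py')


path = find_extractor_file()
if path is None:
    print('youtube_dl is not importable as a package from this Python')
else:
    print(path)
```

Run it with the same Python interpreter that runs youtube-dl, then pass the printed path to the patch command above.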

@Zuccace

Zuccace commented Sep 1, 2017

@youtubeuser2017, that's probably the simplest solution.
However, I wonder how long it will work. All the patches here look like they circumvent the problem, except maybe @awei78's patch.

@taewookim

@ivan Thanks. The patch works 100% on playlists.

@SRCoughlin

Patch does not work for me:

[youtube:subscriptions] playlist Youtube Subscriptions: Collected 0 video ids (downloading 0 of them)

@dstftw dstftw closed this as completed in 8d81f3e Sep 1, 2017
@ytdl-org ytdl-org locked and limited conversation to collaborators Sep 1, 2017