A subclass can override def item_completed(self, results, item, info) to rename downloaded files #9

Open
dingyuanhong2006 opened this issue Aug 27, 2019 · 0 comments


from scrapy.pipelines.images import ImagesPipeline
from scrapy import Request
from ImageSpider.settings import IMAGES_STORE as images_store
import os

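class ImagespiderPipeline(ImagesPipeline):

    def get_media_requests(self, item, info):
        # Download every image URL in turn; if item['imgurl'] is a single URL
        # rather than a collection, skip the loop and yield one Request directly.
        for image_url in item['imgurl']:
            yield Request(image_url)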

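    # def file_path(self, request, response=None, info=None):
    #     # Rename the file. If this method is not overridden, the image name
    #     # is a hash, i.e. an unreadable string of characters.
    #     image_guid = request.url.split('/')[-1]  # last path segment of the URL becomes the image name
    #     return image_guid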

    # def item_completed(self, results, item, info):
    #     # Rename the downloaded files, moving them from the default location
    #     # D:\ImageSpider\full\<image> to D:\ImageSpider\*.jpg, using the last
    #     # path segment of each image URL as the file name.
    #     # Functionally similar to file_path.
    #     for ok, x in results:
    #         if ok:  # skip failed downloads so URLs and paths cannot get out of step
    #             os.rename(os.path.join(images_store, x['path']),
    #                       os.path.join(images_store, x['url'].split('/')[-1]))
    #     return item
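
The renaming step can also be pulled out of the pipeline so it can be tried without running a crawl. The following is only a sketch: the helper names `filename_from_url` and `rename_downloads` are made up for illustration, and it assumes `results` has the shape Scrapy passes to `item_completed`, a list of `(ok, info)` tuples where `info` is a dict with `"url"` and `"path"` keys for successful downloads.

```python
import os
from urllib.parse import unquote, urlparse


def filename_from_url(url):
    """Return the last path segment of a URL, ignoring query string and fragment."""
    return os.path.basename(unquote(urlparse(url).path))


def rename_downloads(results, store):
    """Move each successfully downloaded file to <store>/<name taken from its URL>.

    Hypothetical helper: `results` is assumed to look like the list that
    item_completed receives, and `store` is the IMAGES_STORE directory.
    Returns the new file names, relative to `store`.
    """
    renamed = []
    for ok, info in results:
        if not ok:
            continue  # failed download: nothing was written, nothing to rename
        name = filename_from_url(info["url"])
        if not name:
            continue  # URL ends in "/", so there is no usable file name
        # os.replace overwrites an existing target on every platform
        os.replace(os.path.join(store, info["path"]), os.path.join(store, name))
        renamed.append(name)
    return renamed
```

Inside the pipeline, `item_completed` would then call `rename_downloads(results, images_store)` and finish with `return item`. Note that two images whose URLs end in the same file name will overwrite each other, which the default hashed names avoid.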