
Crawling ZOL Wallpapers with PHP and Saving Them to Local Files

A few days ago I came across a PHP crawling tool, QueryList. After a quick look I adapted the demo from its official site to crawl and download images.

Note: only desk.zol.com.cn is supported.
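The script further down enforces this restriction by comparing the first 22 characters of the URL with http://desk.zol.com.cn. That comparison would reject https:// links and would accept a look-alike host such as desk.zol.com.cn.example.org. A stricter check could look roughly like the sketch below; the function name isZolDeskUrl is hypothetical and not part of the original code, and allowing https is an assumption.

```php
<?php
// Hypothetical helper (not in the original script): returns true only when
// the URL's host is exactly desk.zol.com.cn and the scheme is http or https.
function isZolDeskUrl(string $url): bool
{
    $parts = parse_url(trim($url));
    // parse_url returns false on badly malformed input; a usable URL
    // must carry both a scheme and a host.
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return false;
    }
    $scheme = strtolower($parts['scheme']);
    $host   = strtolower($parts['host']);

    return in_array($scheme, ['http', 'https'], true)
        && $host === 'desk.zol.com.cn';
}
```

Comparing the parsed host rather than a string prefix means extra characters after the domain name can no longer slip through.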

Requirements

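  • Linux (or Windows) + nginx (or Apache) + PHP
  • PHP version >= 7.0
  • Tested locally with Docker; any bundled stack should work (e.g. XAMPP, phpStudy, WAMP)
  • Download and unzip the package (bdquerylist) and place it in your server's document root
  • Open: http://127.0.0.1/queryList/indexx.html
  • Once opened, the page shows the download form.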

Usage

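  • Enter the URL to download from (e.g. http://desk.zol.com.cn/bizhi/571_5680_2.html)
  • Choose the resolution to download (default 1440x900)
  • Choose where to save the files (default: the image folder inside the package; tested only on macOS, so a custom path on Windows may not work)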
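These three inputs reach the script as the POST fields bz_url, bz_dpi and bz_path, with bz_dpi sent as a comma-separated "width,height" string. As a rough sketch of how the resolution field could be parsed on its own, the function below falls back to 1440x900 when the field is empty and returns null for anything it cannot validate. The function name parseResolution and the two constant names are hypothetical; the allowed values are copied from the script's own lists.

```php
<?php
// Allowed widths and heights, copied from the script's validation lists.
const ALLOWED_WIDTHS  = [2880, 2560, 1920, 1680, 1600, 1440, 1366, 1280];
const ALLOWED_HEIGHTS = [1800, 1600, 1080, 1050, 900, 768, 1024, 800];

// Hypothetical helper: turns a "width,height" string into an array,
// or returns null when the value cannot be validated.
function parseResolution(?string $dpi): ?array
{
    // An empty field means "use the default resolution".
    if ($dpi === null || trim($dpi) === '') {
        return ['width' => '1440', 'height' => '900'];
    }
    $parts = array_map('trim', explode(',', $dpi));
    if (count($parts) !== 2) {
        return null;
    }
    [$width, $height] = $parts;
    // Accept plain digits only, so values like "1440abc" are rejected.
    if (!ctype_digit($width) || !ctype_digit($height)) {
        return null;
    }
    // Like the script, this checks width and height independently.
    if (!in_array((int) $width, ALLOWED_WIDTHS, true)
        || !in_array((int) $height, ALLOWED_HEIGHTS, true)) {
        return null;
    }
    return ['width' => (string) (int) $width, 'height' => (string) (int) $height];
}
```

For example, parseResolution('1920, 1080') yields width 1920 and height 1080, while parseResolution('1920') yields null.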

The source file is below for anyone interested. I'm a beginner, so please go easy on me!

<?php
$data = $_POST;
if(empty($data) || empty($data['bz_url'])){
    echo json_encode(['code'=>500,'msg'=>'Required parameters must not be empty']);die;
}
if(substr($data['bz_url'] , 0 , 22) != 'http://desk.zol.com.cn'){
    echo json_encode(['code'=>501,'msg'=>'Please provide a valid ZOL URL!']);die;
}
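$rs['width']  = '1440';// default width
$rs['height'] = '900';// default height
if(!empty($data['bz_dpi'])){
    // bz_dpi arrives as "width,height"; trim both parts before validating them
    $dpi = array_map('trim', explode(',',$data['bz_dpi']));
    if(count($dpi) != 2){
        echo json_encode(['code'=>502,'msg'=>'Failed to parse the resolution']);die;
    }
    // strict string comparison, so values such as "1440abc" are rejected on PHP 7 as well
    if(!in_array($dpi[0] , ['2880','2560','1920','1680','1600','1440','1366','1280'] , true) || !in_array($dpi[1] , ['1800','1600','1080','1050','900','768','1024','800'] , true)){
        echo json_encode(['code'=>503,'msg'=>'Unrecognized resolution']);die;
    }
    $rs['width']  = $dpi[0];// requested width
    $rs['height'] = $dpi[1];// requested height
}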
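$path = './image/';
if(!empty($data['bz_path'])){
    if(!is_dir($data['bz_path'])){
        $rk = mkdir($data['bz_path'],0777);
        // the braces matter: without them die runs even when mkdir succeeds
        if(!$rk){
            echo json_encode(['code'=>505,'msg'=>'Failed to create the save directory']);die;
        }
    }
    // sub-folder names are appended directly, so the path must end with a separator
    $path = rtrim($data['bz_path'],'/\\').'/';
}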
 
// download ZOL wallpapers with QueryList
require('./autoload.php');
use QL\QueryList;
$withDownImageUrl = trim($data['bz_url']);
$ql = QueryList::getInstance();
$ql->bind('downloadImage',function ($path,$rs){
    $data = $this->getData()->map(function ($item) use($path,$rs){
        $imageUrl = '';
        if(!empty($item['image'])){
            $imageUrl .= $item['image'];
        }
        if(!empty($item['images'])){
            $imageUrl .= $item['images'];
        }
        if(empty($imageUrl)) return false;
 
        // thumbnail URLs contain "144x90"; swap it for the requested resolution
        $urlArr = explode('144x90',$imageUrl);
        if(count($urlArr) !=2) return false;
        $imageUrl = $urlArr[0].$rs['width'].'x'.$rs['height'].$urlArr[1];
        // fetch the image
        $img = file_get_contents($imageUrl);
        if($img === false){
            $item['issave']  = '0';
            $item['message'] = 'Download failed';
            return $item;
        }
        if(!is_dir($path.$rs['title'].'_'.$rs['count'])){
            $isMkDir = mkdir($path.$rs['title'].'_'.$rs['count'],0777);
            if(!$isMkDir){
                $item['issave']  = '0';
                $item['message'] = 'Failed to create the folder';
                return $item;
            }
        }
        // storage path and file name
        $localPath = $path.$rs['title'].'_'.$rs['count'].'/'.date('YmdHis').mt_rand(10000,99999).'.jpg';
        // save the image to the local path
        $res = file_put_contents($localPath,$img);
        $item['issave']  = !empty($res)?'1':'0';
        $item['message'] = !empty($res)?'Saved':'Save failed';
        return $item;
    });
    // update the data property
    $this->setData($data);
    return $this;
});
$html = $ql->get($withDownImageUrl);
// title
$rs['title'] = $ql->find('#titleName')->text();
// total number of images
$rs['count'] = mb_substr(explode('/',$ql->find('h3>span')->text())[1],0,-1);
$data = $html->rules([
    'image' =>['#showImg>li>a>img','src'],
    'images' =>['#showImg>li>a>img','srcs'],
])->query()->downloadImage($path,$rs)->getData()->all();
$responseData['title'] = $rs['title'];
$responseData['count'] = $rs['count'];
$responseData['succ']  = 0;
$responseData['fail']  = 0;
foreach ($data as $v) {
    // skipped items are false or null rather than arrays, so test with !empty()
    if(!empty($v['issave'])){
        $responseData['succ']+=1;
    }else{
        $responseData['fail']+=1;
    }
}
 
echo json_encode(['code'=>200,'msg'=>'Download complete','data'=>$responseData]);die;
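
The core trick in the script is the URL rewrite: each thumbnail address contains the size marker 144x90, and replacing that marker with the requested resolution gives the address of the larger image. Pulled out on its own, the step could be sketched as below. The function name buildFullSizeUrl is hypothetical, and the example address in the usage note is made up purely to show the substitution.

```php
<?php
// Hypothetical helper: swaps the "144x90" thumbnail marker for the requested
// resolution. Returns null unless the marker occurs exactly once, mirroring
// the count($urlArr) != 2 guard in the script.
function buildFullSizeUrl(string $thumbUrl, string $width, string $height): ?string
{
    $parts = explode('144x90', $thumbUrl);
    if (count($parts) !== 2) {
        return null;
    }
    return $parts[0] . $width . 'x' . $height . $parts[1];
}
```

With the made-up address https://img.example.com/t_s144x90c5/abc.jpg and a resolution of 1440x900, the function returns https://img.example.com/t_s1440x900c5/abc.jpg. An address without the marker, or with the marker twice, returns null, so the caller can skip that item.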

Source download
