feat: add japan.zdnet.com custom parser #410

Merged
toufic-m merged 3 commits into postlight:master from feat-japan-zdnet-com-extractor on May 8, 2019

Conversation

kik0220
Contributor

@kik0220 kik0220 commented Apr 30, 2019

japan.zdnet.com custom parser
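
For orientation, a custom parser in this project is a plain object that maps each output field to a list of selectors. The sketch below shows only the general shape. The `div.article_body` content selector is taken from the parsed content in the previews on this page; every other selector is a placeholder assumption and is not the selector set that was merged.

```javascript
// Sketch of a custom extractor object for japan.zdnet.com.
// ASSUMPTION: all selectors except 'div.article_body' are illustrative
// placeholders, not the ones in the merged index.js.
const JapanZdnetComExtractor = {
  domain: 'japan.zdnet.com',

  title: {
    selectors: ['h1'], // assumed
  },

  author: {
    // A [selector, attribute] pair reads an attribute instead of the text.
    selectors: [['meta[name="author"]', 'value']], // assumed
  },

  date_published: {
    selectors: [['meta[name="article:published_time"]', 'value']], // assumed
  },

  lead_image_url: {
    selectors: [['meta[name="og:image"]', 'value']], // assumed
  },

  content: {
    // This class appears in the "content" field of the parsed JSON previews.
    selectors: ['div.article_body'],
    transforms: {},
    clean: [],
  },
};
```

In the repository, an object like this would be exported from the extractor's `index.js` and checked against the saved HTML fixture by the test suite.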

@postlight-org
Collaborator

🤖 Automated Parsing Preview 🤖

Commit: feat: add japan.zdnet.com custom parser

Screenshot of fixture (this embed should work after repo is public)

Original Article | HTML Fixture | Parsed Content Preview

Parsed JSON
{
  "title": "Raspberry Pi競合--Ubuntuが動く「UP Core」を見る",
  "content": "<div><div class=\"article_body\">\n\n  <p>\n&#x3000;2017&#x5E74;&#x306B;&#x30EA;&#x30EA;&#x30FC;&#x30B9;&#x4E88;&#x5B9A;&#x306E;&#x4F4E;&#x4FA1;&#x683C;&#x30DD;&#x30B1;&#x30C3;&#x30C8;&#x30B5;&#x30A4;&#x30BA;&#x30B3;&#x30F3;&#x30D4;&#x30E5;&#x30FC;&#x30BF;&#x306F;&#x3001;&#x300C;Android Marshmallow&#x300D;&#x3068;&#x300C;Windows 10&#x300D;&#x306E;&#x30D5;&#x30EB;&#x30D0;&#x30FC;&#x30B8;&#x30E7;&#x30F3;&#x304C;&#x52D5;&#x4F5C;&#x3059;&#x308B;&#x3002;\n</p><p>\n&#x3000;&#x300C;UP Core&#x300D;&#x306F;&#x3001;&#x300C;ubilinux&#xFF08;Debian&#xFF09;&#x300D;&#x3084;&#x300C;Ubuntu&#x300D;&#x300C;Yocto&#x300D;&#x306A;&#x3069;&#x3001;&#x3055;&#x307E;&#x3056;&#x307E;&#x306A;OS&#x304C;&#x52D5;&#x4F5C;&#x3059;&#x308B;&#x30B7;&#x30F3;&#x30B0;&#x30EB;&#x30DC;&#x30FC;&#x30C9;&#x30B3;&#x30F3;&#x30D4;&#x30E5;&#x30FC;&#x30BF;&#x3060;&#x3002;&#x305D;&#x3057;&#x3066;&#x3001;&#x305D;&#x306E;&#x6027;&#x80FD;&#x306F;&#x300C;Raspberry Pi 3 Model B&#x300D;&#x3092;&#x4E0A;&#x56DE;&#x308B;&#x3088;&#x3046;&#x3060;&#x3002;\n</p>\n\n<p>\n&#xFF08;&#x672C;&#x8A18;&#x4E8B;&#x306F;&#x3001;<a href=\"https://japan.techrepublic.com/article/35102407.htm\">TechRepublic Japan&#x3067;2017&#x5E74;6&#x6708;8&#x65E5;&#x306B;&#x63B2;&#x8F09;&#x3057;&#x305F;&#x8A18;&#x4E8B;</a>&#x304B;&#x3089;&#x306E;&#x8EE2;&#x8F09;&#x3067;&#x3059;&#x3002;&#x7D9A;&#x304D;&#x306F;TechRepublic Japan&#x3067;&#x304A;&#x8AAD;&#x307F;&#x3044;&#x305F;&#x3060;&#x3051;&#x307E;&#x3059;&#xFF09;\n</p>\n\n\n</div></div>",
  "author": "ZDNet Japan Staff 2019年04月28日 08時00分",
  "date_published": "2019-04-27T23:00:00.000Z",
  "lead_image_url": "https://japan.zdnet.com/storage/2019/04/26/97670c9f883bf3f9e11a492df245717c/190426_original_1280x960.jpg",
  "dek": null,
  "next_page_url": null,
  "url": "https://japan.zdnet.com/article/35136396/",
  "domain": "japan.zdnet.com",
  "word_count": 1,
  "direction": "ltr",
  "total_pages": 1,
  "rendered_pages": 1
}

null fields

  • dek

  • next_page_url

✅ All tests passed

src/extractors/custom/japan.zdnet.com/index.js (review thread: outdated, resolved)
src/extractors/custom/japan.zdnet.com/index.js (review thread: outdated, resolved)
@kik0220 kik0220 force-pushed the feat-japan-zdnet-com-extractor branch from 591180b to 14b66e2 Compare May 7, 2019 21:08
@postlight-org
Collaborator

🤖 Automated Parsing Preview 🤖

Commit: fix: author and date_published selector

Screenshot of fixture (this embed should work after repo is public)

Original Article | HTML Fixture | Parsed Content Preview

Parsed JSON
{
  "title": "Raspberry Pi競合--Ubuntuが動く「UP Core」を見る",
  "content": "<div><div class=\"article_body\">\n\n  <p>\n&#x3000;2017&#x5E74;&#x306B;&#x30EA;&#x30EA;&#x30FC;&#x30B9;&#x4E88;&#x5B9A;&#x306E;&#x4F4E;&#x4FA1;&#x683C;&#x30DD;&#x30B1;&#x30C3;&#x30C8;&#x30B5;&#x30A4;&#x30BA;&#x30B3;&#x30F3;&#x30D4;&#x30E5;&#x30FC;&#x30BF;&#x306F;&#x3001;&#x300C;Android Marshmallow&#x300D;&#x3068;&#x300C;Windows 10&#x300D;&#x306E;&#x30D5;&#x30EB;&#x30D0;&#x30FC;&#x30B8;&#x30E7;&#x30F3;&#x304C;&#x52D5;&#x4F5C;&#x3059;&#x308B;&#x3002;\n</p><p>\n&#x3000;&#x300C;UP Core&#x300D;&#x306F;&#x3001;&#x300C;ubilinux&#xFF08;Debian&#xFF09;&#x300D;&#x3084;&#x300C;Ubuntu&#x300D;&#x300C;Yocto&#x300D;&#x306A;&#x3069;&#x3001;&#x3055;&#x307E;&#x3056;&#x307E;&#x306A;OS&#x304C;&#x52D5;&#x4F5C;&#x3059;&#x308B;&#x30B7;&#x30F3;&#x30B0;&#x30EB;&#x30DC;&#x30FC;&#x30C9;&#x30B3;&#x30F3;&#x30D4;&#x30E5;&#x30FC;&#x30BF;&#x3060;&#x3002;&#x305D;&#x3057;&#x3066;&#x3001;&#x305D;&#x306E;&#x6027;&#x80FD;&#x306F;&#x300C;Raspberry Pi 3 Model B&#x300D;&#x3092;&#x4E0A;&#x56DE;&#x308B;&#x3088;&#x3046;&#x3060;&#x3002;\n</p>\n\n<p>\n&#xFF08;&#x672C;&#x8A18;&#x4E8B;&#x306F;&#x3001;<a href=\"https://japan.techrepublic.com/article/35102407.htm\">TechRepublic Japan&#x3067;2017&#x5E74;6&#x6708;8&#x65E5;&#x306B;&#x63B2;&#x8F09;&#x3057;&#x305F;&#x8A18;&#x4E8B;</a>&#x304B;&#x3089;&#x306E;&#x8EE2;&#x8F09;&#x3067;&#x3059;&#x3002;&#x7D9A;&#x304D;&#x306F;TechRepublic Japan&#x3067;&#x304A;&#x8AAD;&#x307F;&#x3044;&#x305F;&#x3060;&#x3051;&#x307E;&#x3059;&#xFF09;\n</p>\n\n\n</div></div>",
  "author": "ZDNet Japan Staff",
  "date_published": "2019-04-27T23:00:00.000Z",
  "lead_image_url": "https://japan.zdnet.com/storage/2019/04/26/97670c9f883bf3f9e11a492df245717c/190426_original_1280x960.jpg",
  "dek": null,
  "next_page_url": null,
  "url": "https://japan.zdnet.com/article/35136396/",
  "domain": "japan.zdnet.com",
  "word_count": 1,
  "direction": "ltr",
  "total_pages": 1,
  "rendered_pages": 1
}

null fields

  • dek

  • next_page_url

✅ All tests passed
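
The first preview's `author` value carried the publication timestamp ("ZDNet Japan Staff 2019年04月28日 08時00分"), while this preview shows the name alone. The merged fix did this by changing selectors. The sketch below is not that fix; it is a hypothetical helper (`splitByline` is an invented name) showing how the two parts of such a byline relate to the `author` and `date_published` values in the JSON, assuming the timestamp is in Japan Standard Time (UTC+9).

```javascript
// Hypothetical helper, not part of the merged extractor.
// Splits a byline such as "Name 2019年04月28日 08時00分" into an author
// and an ISO 8601 UTC timestamp.
function splitByline(byline) {
  const match = byline.match(
    /^(.*?)\s*(\d{4})年(\d{2})月(\d{2})日\s*(\d{2})時(\d{2})分$/
  );
  if (!match) {
    // No trailing timestamp: treat the whole string as the author.
    return { author: byline.trim(), date_published: null };
  }
  const [, author, year, month, day, hour, minute] = match;
  // ASSUMPTION: the site prints times in JST (UTC+9), so subtract 9 hours.
  // Date.UTC normalizes out-of-range hours, rolling back a day if needed.
  const utcMillis = Date.UTC(
    Number(year),
    Number(month) - 1,
    Number(day),
    Number(hour) - 9,
    Number(minute)
  );
  return {
    author: author.trim(),
    date_published: new Date(utcMillis).toISOString(),
  };
}

const result = splitByline('ZDNet Japan Staff 2019年04月28日 08時00分');
console.log(result.author); // ZDNet Japan Staff
console.log(result.date_published); // 2019-04-27T23:00:00.000Z
```

Under the JST assumption, 08:00 on April 28 corresponds to 23:00 UTC on April 27, which matches the `date_published` value in both previews.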

@postlight-org
Collaborator

🤖 Automated Parsing Preview 🤖

Commit: Merge branch 'master' into feat-japan-zdnet-com-extractor

Screenshot of fixture (this embed should work after repo is public)

Original Article | HTML Fixture | Parsed Content Preview

Parsed JSON
{
  "title": "Raspberry Pi競合--Ubuntuが動く「UP Core」を見る",
  "content": "<div><div class=\"article_body\">\n\n  <p>\n&#x3000;2017&#x5E74;&#x306B;&#x30EA;&#x30EA;&#x30FC;&#x30B9;&#x4E88;&#x5B9A;&#x306E;&#x4F4E;&#x4FA1;&#x683C;&#x30DD;&#x30B1;&#x30C3;&#x30C8;&#x30B5;&#x30A4;&#x30BA;&#x30B3;&#x30F3;&#x30D4;&#x30E5;&#x30FC;&#x30BF;&#x306F;&#x3001;&#x300C;Android Marshmallow&#x300D;&#x3068;&#x300C;Windows 10&#x300D;&#x306E;&#x30D5;&#x30EB;&#x30D0;&#x30FC;&#x30B8;&#x30E7;&#x30F3;&#x304C;&#x52D5;&#x4F5C;&#x3059;&#x308B;&#x3002;\n</p><p>\n&#x3000;&#x300C;UP Core&#x300D;&#x306F;&#x3001;&#x300C;ubilinux&#xFF08;Debian&#xFF09;&#x300D;&#x3084;&#x300C;Ubuntu&#x300D;&#x300C;Yocto&#x300D;&#x306A;&#x3069;&#x3001;&#x3055;&#x307E;&#x3056;&#x307E;&#x306A;OS&#x304C;&#x52D5;&#x4F5C;&#x3059;&#x308B;&#x30B7;&#x30F3;&#x30B0;&#x30EB;&#x30DC;&#x30FC;&#x30C9;&#x30B3;&#x30F3;&#x30D4;&#x30E5;&#x30FC;&#x30BF;&#x3060;&#x3002;&#x305D;&#x3057;&#x3066;&#x3001;&#x305D;&#x306E;&#x6027;&#x80FD;&#x306F;&#x300C;Raspberry Pi 3 Model B&#x300D;&#x3092;&#x4E0A;&#x56DE;&#x308B;&#x3088;&#x3046;&#x3060;&#x3002;\n</p>\n\n<p>\n&#xFF08;&#x672C;&#x8A18;&#x4E8B;&#x306F;&#x3001;<a href=\"https://japan.techrepublic.com/article/35102407.htm\">TechRepublic Japan&#x3067;2017&#x5E74;6&#x6708;8&#x65E5;&#x306B;&#x63B2;&#x8F09;&#x3057;&#x305F;&#x8A18;&#x4E8B;</a>&#x304B;&#x3089;&#x306E;&#x8EE2;&#x8F09;&#x3067;&#x3059;&#x3002;&#x7D9A;&#x304D;&#x306F;TechRepublic Japan&#x3067;&#x304A;&#x8AAD;&#x307F;&#x3044;&#x305F;&#x3060;&#x3051;&#x307E;&#x3059;&#xFF09;\n</p>\n\n\n</div></div>",
  "author": "ZDNet Japan Staff",
  "date_published": "2019-04-27T23:00:00.000Z",
  "lead_image_url": "https://japan.zdnet.com/storage/2019/04/26/97670c9f883bf3f9e11a492df245717c/190426_original_1280x960.jpg",
  "dek": null,
  "next_page_url": null,
  "url": "https://japan.zdnet.com/article/35136396/",
  "domain": "japan.zdnet.com",
  "word_count": 1,
  "direction": "ltr",
  "total_pages": 1,
  "rendered_pages": 1
}

null fields

  • dek

  • next_page_url

✅ All tests passed

@toufic-m toufic-m merged commit 5e1113b into postlight:master May 8, 2019
@kik0220 kik0220 deleted the feat-japan-zdnet-com-extractor branch May 8, 2019 11:11
3 participants