feat: add jvndb.jvn.jp custom parser #345

Merged
toufic-m merged 2 commits into postlight:master from feat-jvndb-jvn-jp-extractor
Apr 9, 2019

Conversation

kik0220
Contributor

kik0220 commented Mar 30, 2019

Adds a custom parser for jvndb.jvn.jp.

@postlight-org
Collaborator

🤖 Automated Parsing Preview 🤖

Commit: feat: add jvndb.jvn.jp custom parser

Screenshot of fixture (this embed should work after repo is public)

Original Article | HTML Fixture | Parsed Content Preview

Parsed JSON
{
  "title": "JVNDB-2018-013542 - JVN iPedia - 脆弱性対策情報データベース",
  "author": null,
  "date_published": null,
  "dek": null,
  "lead_image_url": null,
  "content": "<div id=\"mainarea\"> <div id=\"leftarea\"> <div class=\"contents\"> <div id=\"news-list\">\n<br>\n<table class=\"vuln_table_clase\"> <tr><td> </td></tr> <tr><td> </td></tr> <tr></tr>\n<tr><td>\n<br>\n<blockquote>\nNETWAVE MNG6200 &#x30C7;&#x30D0;&#x30A4;&#x30B9;&#x306B;&#x306F;&#x3001;&#x8A3C;&#x660E;&#x66F8;&#x30FB;&#x30D1;&#x30B9;&#x30EF;&#x30FC;&#x30C9;&#x306E;&#x7BA1;&#x7406;&#x306B;&#x95A2;&#x3059;&#x308B;&#x8106;&#x5F31;&#x6027;&#x304C;&#x5B58;&#x5728;&#x3057;&#x307E;&#x3059;&#x3002;\n</blockquote>\n</td></tr>\n<tr></tr>\n<tr><td>\n<br>\n<div> </div>\n</td></tr> <tr></tr> <tr><td>\n<br>\n<blockquote> </blockquote>\n</td></tr> <tr><td>\n<br>\n<blockquote> NETWAVE&#xA0;Networks,&#xA0;Inc. <ul> <li>MNG6200&#xA0;&#x30D5;&#x30A1;&#x30FC;&#x30E0;&#x30A6;&#x30A7;&#x30A2; C4835805jrc12FU121413.cpr</li> </ul>\n</blockquote>\n</td></tr> <tr><td>\n<br>\n<blockquote> </blockquote>\n</td></tr> <tr></tr> <tr><td>\n<br>\n<blockquote>\n&#x60C5;&#x5831;&#x3092;&#x53D6;&#x5F97;&#x3055;&#x308C;&#x308B;&#x3001;&#x60C5;&#x5831;&#x3092;&#x6539;&#x3056;&#x3093;&#x3055;&#x308C;&#x308B;&#x3001;&#x304A;&#x3088;&#x3073;&#x30B5;&#x30FC;&#x30D3;&#x30B9;&#x904B;&#x7528;&#x59A8;&#x5BB3; (DoS) &#x72B6;&#x614B;&#x306B;&#x3055;&#x308C;&#x308B;&#x53EF;&#x80FD;&#x6027;&#x304C;&#x3042;&#x308A;&#x307E;&#x3059;&#x3002;\n</blockquote>\n</td></tr> <tr></tr> <tr><td>\n<br>\n<blockquote>\n&#x53C2;&#x8003;&#x60C5;&#x5831;&#x3092;&#x53C2;&#x7167;&#x3057;&#x3066;&#x9069;&#x5207;&#x306A;&#x5BFE;&#x7B56;&#x3092;&#x5B9F;&#x65BD;&#x3057;&#x3066;&#x304F;&#x3060;&#x3055;&#x3044;&#x3002;\n</blockquote>\n</td></tr> <tr></tr> <tr><td>\n<br>\n<blockquote>\n</blockquote>\n</td></tr> <tr></tr> <tr><td>\n<br>\n<blockquote>\n<ol> <li><a href=\"https://jvndb.jvn.jp/ja/cwe/CWE-255.html\">&#x8A3C;&#x660E;&#x66F8;&#x30FB;&#x30D1;&#x30B9;&#x30EF;&#x30FC;&#x30C9;&#x306E;&#x7BA1;&#x7406;(CWE-255)</a> [NVD&#x8A55;&#x4FA1;]</li>\n</ol>\n</blockquote>\n</td></tr> <tr></tr> <tr><td>\n<br>\n<blockquote> 
</blockquote>\n</td></tr> <tr></tr> <tr><td>\n<br>\n<blockquote> </blockquote>\n</td></tr> <tr></tr>\n<tr><td>\n<br>\n<blockquote> </blockquote>\n<br>\n</td></tr>\n</table> </div> </div> </div> </div>",
  "next_page_url": null,
  "url": "http://example.com",
  "domain": "example.com",
  "word_count": 15,
  "direction": "ltr",
  "total_pages": 1,
  "rendered_pages": 1
}
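
The `content` value stores Japanese text as hexadecimal character references (for example `&#x30C7;`), which makes the preview hard to read. A minimal sketch of decoding them with Python's standard `html.unescape`; the sample strings are copied from the JSON above, and the helper name `decode_entities` is hypothetical, not part of the parser:

```python
import html


def decode_entities(fragment: str) -> str:
    """Replace HTML character references with the characters they encode."""
    return html.unescape(fragment)


# Sample fragments copied from the "content" field of the parsed JSON.
vendor = "NETWAVE&#xA0;Networks,&#xA0;Inc."
word = "&#x30C7;&#x30D0;&#x30A4;&#x30B9;"

print(decode_entities(word))          # デバイス ("device")
print(repr(decode_entities(vendor)))  # non-breaking spaces appear as \xa0
```

Decoding is only a reading aid here; the parser output itself keeps the references as shown.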

null fields

  • author

  • date_published

  • dek

  • lead_image_url

  • next_page_url
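
The null-fields list above follows directly from the parsed JSON. A minimal sketch of that derivation, assuming the result is available as a JSON string; the helper name `null_fields` is hypothetical, the `content` value is shortened, and this is not the bot's actual implementation:

```python
import json

# Abbreviated copy of the parsed result shown above; "content" is shortened.
parsed_json = """
{
  "title": "JVNDB-2018-013542 - JVN iPedia - 脆弱性対策情報データベース",
  "author": null,
  "date_published": null,
  "dek": null,
  "lead_image_url": null,
  "content": "<div>shortened</div>",
  "next_page_url": null,
  "url": "http://example.com",
  "domain": "example.com",
  "word_count": 15,
  "direction": "ltr",
  "total_pages": 1,
  "rendered_pages": 1
}
"""


def null_fields(result: dict) -> list[str]:
    """Return the keys whose value is JSON null, in document order."""
    return [key for key, value in result.items() if value is None]


result = json.loads(parsed_json)
print(null_fields(result))
# ['author', 'date_published', 'dek', 'lead_image_url', 'next_page_url']
```

A check like this makes it easy to see which selectors a custom parser still lacks, such as `author` and `date_published` here.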

✅ All tests passed

kik0220 force-pushed the feat-jvndb-jvn-jp-extractor branch from 590ee1d to 9050ffe on April 1, 2019 22:01
@postlight-org
Collaborator

🤖 Automated Parsing Preview 🤖

Commit: Merge branch 'master' into feat-jvndb-jvn-jp-extractor

✅ All tests passed

kik0220 force-pushed the feat-jvndb-jvn-jp-extractor branch from 4d42051 to 78af26a on April 6, 2019 20:59
@postlight-org
Collaborator

🤖 Automated Parsing Preview 🤖

Commit: feat: add jvndb.jvn.jp custom parser

✅ All tests passed

kik0220 force-pushed the feat-jvndb-jvn-jp-extractor branch from 78af26a to 757cc16 on April 7, 2019 00:37
@postlight-org
Collaborator

🤖 Automated Parsing Preview 🤖

Commit: feat: add jvndb.jvn.jp custom parser

✅ All tests passed

toufic-m merged commit c389c96 into postlight:master on Apr 9, 2019
@postlight-org
Collaborator

🤖 Automated Parsing Preview 🤖

Commit: Merge branch 'master' into feat-jvndb-jvn-jp-extractor

✅ All tests passed

kik0220 deleted the feat-jvndb-jvn-jp-extractor branch on April 9, 2019 14:40
3 participants