Various improvements #332

zuko7177 · 2020-09-04T03:56:51Z

This change contains several improvements, mainly to DMM scraping.

1. When scraping DMM for English, we now scrape the website's English version instead of fetching the Japanese page and translating it via grammarchecker.net. This should improve scraping reliability and speed.
2. Add a menu option to disable DMM scraping for actresses. Skipping actress scraping on DMM improves speed, and R18 provides better actress data.
3. Increase the accuracy of search result matches by first matching the movie ID against the search result's Label field. If no match is found, fall back to matching against the URL as before. For example:
   - A search for URE-051 on R18 will no longer return PURE-051.
   - A search for DV-1195 on JavBus will no longer return HODV-21195.
4. Add the Gradle v5.6.4 wrapper, which lets you run Gradle without installing it (using gradlew.bat or gradlew).
   - Example on Linux: ./gradlew assemble
   - Example on Windows: gradlew.bat assemble

Known issue with DMM scraping: sometimes we still get the JP version even though we request EN. This may be due to web server caching. When this happens, the DMM scraper waits 5 seconds and retries. Occasionally the second try still returns the JP version; if that happens, just restart the scrape.
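For reviewers, the label-first matching described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `SearchResultMatcher` and `SearchResult` names, the normalization rule (upper-case, hyphens removed), and the example.com URLs used to exercise it are all assumptions made for the example.

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;

final class SearchResultMatcher {

    /** Hypothetical stand-in for one scraped search result (not a project class). */
    static final class SearchResult {
        final String label;
        final String url;

        SearchResult(String label, String url) {
            this.label = label;
            this.url = url;
        }
    }

    /** Assumed normalization: trim, upper-case, and drop hyphens. */
    static String normalize(String s) {
        return s == null ? "" : s.trim().toUpperCase(Locale.ROOT).replace("-", "");
    }

    /** Tries an exact Label match first, then falls back to a URL substring match. */
    static Optional<SearchResult> findMatch(String movieId, List<SearchResult> results) {
        String wanted = normalize(movieId);

        // Pass 1: exact comparison, so URE-051 does not match the label PURE-051.
        for (SearchResult r : results) {
            if (normalize(r.label).equals(wanted)) {
                return Optional.of(r);
            }
        }

        // Pass 2: the looser check, used only when no label matched.
        for (SearchResult r : results) {
            if (normalize(r.url).contains(wanted)) {
                return Optional.of(r);
            }
        }
        return Optional.empty();
    }
}
```

With results labelled PURE-051 and URE-051 in that order, a URL-only substring check would pick the first entry, because "PURE051" contains "URE051". The exact label pass is what avoids that false positive.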

…if DMM is slow for you.
2. Upgrade jsoup to 1.9.2 in build.gradle.
3. When DMM English scraping returns Japanese (due to server caching), scrape again.
4. Add a new UserAgent.
5. More code refactoring.
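Item 3 above (scraping again when an English request comes back in Japanese) could look roughly like the sketch below. It is an illustration under assumptions, not the PR's implementation: `EnglishPageRetry`, the `Supplier<String>` fetcher, and the script-based Japanese check are hypothetical, and a real page may need a narrower check (for example, testing only the title element), since an English page can legitimately contain some Japanese text.

```java
import java.util.function.Supplier;

final class EnglishPageRetry {

    /** Heuristic (assumed): true if the text has any Hiragana, Katakana, or Han character. */
    static boolean containsJapanese(String text) {
        return text.codePoints().anyMatch(cp -> {
            Character.UnicodeScript script = Character.UnicodeScript.of(cp);
            return script == Character.UnicodeScript.HIRAGANA
                    || script == Character.UnicodeScript.KATAKANA
                    || script == Character.UnicodeScript.HAN;
        });
    }

    /**
     * Calls the fetcher, and while the result still looks Japanese, waits and
     * fetches again, up to maxAttempts calls in total. Returns the last page
     * fetched, which may still be Japanese if every attempt failed.
     */
    static String fetchEnglish(Supplier<String> fetcher, int maxAttempts, long delayMillis) {
        String page = fetcher.get();
        for (int attempt = 1; attempt < maxAttempts && containsJapanese(page); attempt++) {
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up retrying.
                Thread.currentThread().interrupt();
                return page;
            }
            page = fetcher.get();
        }
        return page;
    }
}
```

The PR description mentions a 5-second wait and a single retry, which in this sketch would correspond to `fetchEnglish(fetcher, 2, 5000L)`. Because the last page is returned even when it is still Japanese, the caller has to decide whether to accept it or restart the scrape.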

zuko7177 and others added 14 commits August 24, 2020 23:19

Apply spotlessJava to DmmParsingProfile.java.

9bec1ef

Fix deprecated in build.gradle

Add Gradle v5.6.4 wrapper.

a978742

Users can use gradle (gradlew.bat or gradlew) without installing gradle. Example on linux: ./gradlew assemble Example on windows: gradlew.bat assemble

Improve DMM parsing by scraping English version instead of using tran…

5fca96c

…slator service. This improves speed and reliability of English parsing.

Fix deprecated.

2964b71

Merge branch 'master' of https://github.com/zuko7177/JAVMovieScraper

db844f1

Refactor scrapeActors(). We stop using grammarchecker.net to translat…

300f619

…e for English names. Grammarchecker.net is unreliable. Instead, we scrape DMM's English version.

Minor updates.

7c7204c

More refactoring of DMM. Will now scrape either EN or JP version of t…

2a24fde

…he site instead of both versions. This should improve scraping speed.

splotlessJavaApply

f2ba5fb

Minor change

68b81bb

Merge pull request #1 from zuko7177/dmm-improve

78c9eef

Dmm improve

Merge branch 'master' of https://github.com/zuko7177/JAVMovieScraper

9b7b032

Conflicts: build.gradle, src/main/java/moviescraper/doctord/controller/siteparsingprofile/specific/DmmParsingProfile.java

Forgot to include these in previous commit.

c4b091e

Increase accuracy of search result matches by matching movie ID with search results Label field first. If no matches found, then try matching to URL as before.

zuko7177 mentioned this pull request Sep 6, 2020

DMM english scraper source javinizer/Javinizer#67

Closed

Suppress warning message "Security framework of XStream not initializ…

e159e28

…ed, XStream is probably vulnerable."

zuko7177 mentioned this pull request Sep 11, 2020

Scrapper for DMM Seems to be Broken Again using the Latest Code #333

Open

zuko7177 added 2 commits September 18, 2020 12:34

add useragent to R18 scraper

77f394f

suppress gradle IOException parameter is incorrect

90c4df8
