MyAnimeList Importer #734

Merged
45 changes: 45 additions & 0 deletions server/lib/data_import/my_anime_list.rb
@@ -0,0 +1,45 @@
module DataImport
  class MyAnimeList
    ATARASHII_API_HOST = 'https://hbv3-mal-api.herokuapp.com/2.1/'.freeze

    include DataImport::Media
    include DataImport::HTTP

    attr_reader :opts

    def initialize(opts = {})
      @opts = opts.with_indifferent_access
      super()
    end

    def get_media(external_id) # anime/1234 or manga/1234
      media = Mapping.lookup('myanimelist', external_id)
      # should return Anime or Manga
      klass = external_id.split('/').first.classify.constantize
      # initialize the class
      media ||= klass.new

      get(external_id) do |response|
        details = Extractor::Media.new(response)

        media.assign_attributes(details.to_h.compact)
        media.genres = details.genres.map { |genre|
          Genre.find_by(name: genre)
        }.compact

        yield media
      end
    end

    private

    def get(url, opts = {})
      super(build_url(url), opts)
    end

    def build_url(path)
      return path if path.include?('://')
      "#{ATARASHII_API_HOST}#{path}"
    end
  end
end
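A minimal standalone sketch of how `build_url` treats its argument, for readers skimming the diff: relative paths such as `manga/2` get the Atarashii host prepended, while anything that already contains a scheme passes through untouched. The method is lifted out of the class here purely for illustration; `http://example.com/x` is a made-up URL.

```ruby
ATARASHII_API_HOST = 'https://hbv3-mal-api.herokuapp.com/2.1/'.freeze

# Same logic as the private method in the diff, extracted so it can run alone.
def build_url(path)
  # Absolute URLs (anything with a scheme) are returned unchanged.
  return path if path.include?('://')
  # Relative paths like "anime/1234" are resolved against the API host.
  "#{ATARASHII_API_HOST}#{path}"
end

puts build_url('manga/2')
puts build_url('http://example.com/x')
```

The first call prints the host followed by `manga/2`; the second prints its argument as given.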
168 changes: 168 additions & 0 deletions server/lib/data_import/my_anime_list/extractor/media.rb
@@ -0,0 +1,168 @@
module DataImport
  class MyAnimeList
    module Extractor
      class Media
        attr_reader :data

        def initialize(json)
          @data = JSON.parse(json)
        end

        def age_rating
          return unless data['classification']
          rating = data['classification'].split(' - ')

          case rating[0]
          when 'G', 'TV-Y7' then :G
          when 'PG', 'PG13' then :PG
          when 'R', 'R+' then :R
          when 'Rx' then :R18
          end
        end

        def episode_count
          data['episodes']
        end

        def episode_length
          data['duration']
        end

        def synopsis
          clean_desc(data['synopsis'])
        end

        def youtube_video_id
          data['preview']&.split('/')&.last
        end

        def poster_image
          data['image_url']
        end

        def age_rating_guide
          return unless data['classification']
          rating = data['classification'].split(' - ')

          return 'Violence, Profanity' if rating[0] == 'R'
          return rating[1] if rating[1].present?

          # fallback
          case rating[0]
          when 'G' then 'All Ages'
          when 'PG' then 'Children'
          when 'PG13', 'PG-13' then 'Teens 13 or older'
          # when 'R' then 'Violence, Profanity'
          # this will NEVER happen because of return
          when 'R+' then 'Mild Nudity'
          when 'Rx' then 'Hentai'
          end
        end

        def subtype # will be renamed to this hopefully
          # anime matches [TV special OVA ONA movie music]
          # manga matches [manga novel manhua oneshot doujin]

          case data['type'].downcase
          # anime
          when 'tv' then :TV
          when 'special' then :special
          when 'ova' then :OVA
          when 'ona' then :ONA
          when 'movie' then :movie
          when 'music' then :music
          # manga
          when 'manga' then :manga
          when 'novel' then :novel
          when 'manuha' then :manuha
Member

It's manhua not manuha. Also is there a reason why you don't just use else data['type'].downcase.to_sym for most of these cases?

Member Author @toyhammered Jul 7, 2016

oops.. sorry bout that you are correct (actually when have you ever been wrong?)

Actually this is god damn brilliant lol.
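
A minimal sketch of the `else ... to_sym` fallback suggested in this thread, assuming the three acronym subtypes (TV, OVA, ONA) should keep their upper-case symbols. `subtype_for` is a hypothetical standalone helper, not code from this PR.

```ruby
# Hypothetical helper: maps a MAL "type" string to a subtype symbol.
def subtype_for(type)
  key = type.downcase
  case key
  # Acronyms keep their upper-case form, as in the original case statement.
  when 'tv', 'ova', 'ona' then key.upcase.to_sym
  # Everything else (special, movie, manga, manhua, ...) maps to itself.
  else key.to_sym
  end
end

puts subtype_for('Manga').inspect
puts subtype_for('TV').inspect
```

One trade-off worth noting: the case statement in the diff returns nil for an unrecognised type, whereas this version returns a symbol for any string it is given.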

          when 'oneshot' then :oneshot
          when 'doujin' then :doujin
          end
        end

        def start_date
          data['start_date']&.to_date
        end

        def end_date
          data['end_date']&.to_date
        end

        def titles
          {
            en_jp: data['title'],
            en_us: data['other_titles']['english'].try(:first),
            ja_jp: data['other_titles']['japanese'].try(:first)
          }
        end

        def abbreviated_titles
          data['other_titles']['synonyms']
        end

        # Manga Specific

        def chapters
          data['chapters']
        end

        def volumes
          data['volumes']
        end

        # removed subtype (show_type, manga_type issue)
Member

Can you drop in a TODO so that we can track this easier?

        # missing status on manga (anime does automagically)
        def to_h
          %i[age_rating episode_count episode_length synopsis youtube_video_id
             poster_image age_rating_guide start_date end_date
             titles abbreviated_titles chapters volumes]
            .map { |k|
              [k, send(k)]
            }.to_h
        end

        def genres
          data['genres']
        end

        # synopsis: seriously don't touch this unless you are Nuck.
        def br_to_p(src)
          src = '<p>' + src.gsub(/<br>\s*<br>/, '</p><p>') + '</p>'
          doc = Nokogiri::HTML.fragment src
          doc.traverse do |x|
            next x.remove if x.name == 'br' && x.previous.nil?
            next x.remove if x.name == 'br' && x.next.nil?
            next x.remove if x.name == 'br' && x.next.name == 'p' && x.previous.name == 'p'

Line is too long. 91/80

Member Author

nor this
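
One way the 91-column line could be brought under 80, sketched here only as an illustration: pull the condition into a small predicate. `br_between_paragraphs?` is a hypothetical name, and `Node` is a plain Struct standing in for a Nokogiri node so the snippet runs without the gem. The safe-navigation operators are there only so the helper works on its own; in the diff, the two earlier guards already rule out nil neighbours.

```ruby
# Stand-in for a Nokogiri node: only the three accessors used below.
Node = Struct.new(:name, :previous, :next)

# True for a <br> whose previous and next siblings are both <p> elements.
def br_between_paragraphs?(node)
  node.name == 'br' &&
    node.next&.name == 'p' &&
    node.previous&.name == 'p'
end

before = Node.new('p')
after = Node.new('p')
puts br_between_paragraphs?(Node.new('br', before, after))
puts br_between_paragraphs?(Node.new('br', nil, after))
```

Inside the traverse block the guard would then read `next x.remove if br_between_paragraphs?(x)`.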

            next x.remove if x.name == 'p' && x.content.blank?
          end
          doc.inner_html.gsub(/[\r\n\t]/, '')
        end

        # synopsis: seriously don't touch this unless you are Nuck.
        def clean_desc(desc)
          desc = Nokogiri::HTML.fragment br_to_p(desc)
          desc.css('.spoiler').each do |x|
            x.name = 'span'
            x.inner_html = x.css('.spoiler_content').inner_html
            x.css('input').remove
          end
          desc.css('.spoiler').wrap('<p></p>')
          desc.xpath('descendant::comment()').remove
          desc.css('b').each { |b| b.replace(b.content) }
          desc.traverse do |node|
            next unless node.text?
            t = node.content.split(/: ?/).map { |x| x.split(' ') }
            if t.length >= 2
              if t[0].length <= 3 && t[1].length <= 20

Favor modifier if usage when having a single-line body. Another good alternative is the usage of control flow &&/||. (https://github.com/bbatsov/ruby-style-guide#if-as-a-modifier)

Member Author

not even touching that @NuckChorris you can deal with it

Member

That's fair, I can handle these

                node.remove
              end
            else
              node.remove if /^\s+\*\s+.*/ =~ node.content

Convert if nested inside else to elsif.

Member Author

nor this.
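
For whoever picks up the two style comments on this block, a rough sketch of what the combined change might look like: the inner `if` becomes a modifier and the `else` plus nested `if` collapse into an `elsif`, with the conditions left exactly as in the diff. `TextNode` and `prune` are hypothetical stand-ins so the snippet runs without Nokogiri, and the sample strings are invented.

```ruby
# Stand-in for a Nokogiri text node; remove just records that it was called.
TextNode = Struct.new(:content, :removed) do
  def remove
    self.removed = true
  end
end

# Same conditions as the traverse body in the diff, restructured.
def prune(node)
  t = node.content.split(/: ?/).map { |x| x.split(' ') }
  if t.length >= 2
    # Modifier form for the single-line body.
    node.remove if t[0].length <= 3 && t[1].length <= 20
  elsif /^\s+\*\s+.*/ =~ node.content
    # Former "else" branch with a nested modifier if.
    node.remove
  end
  node
end

puts prune(TextNode.new('Source: ANN')).removed.inspect
puts prune(TextNode.new('Guts is out for revenge.')).removed.inspect
```

The behaviour should be unchanged: a node is removed under the same two conditions as before, and left alone otherwise.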

            end
          end
          desc.inner_html
        end
      end
    end
  end
end
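
A side note on the classification handling in this file: `age_rating` matches only `'PG13'`, while the fallback in `age_rating_guide` accepts both `'PG13'` and `'PG-13'`. The sketch below assumes the hyphenated spelling can actually appear in the API's `classification` field, which this PR does not confirm. `AGE_RATINGS` and `age_rating_for` are hypothetical names, and the sample strings are modelled on the fallback descriptions in the diff rather than taken from real responses.

```ruby
# Lookup table mirroring the case statement in age_rating, plus 'PG-13'.
AGE_RATINGS = {
  'G' => :G, 'TV-Y7' => :G,
  'PG' => :PG, 'PG13' => :PG, 'PG-13' => :PG,
  'R' => :R, 'R+' => :R,
  'Rx' => :R18
}.freeze

# Takes the raw classification string, e.g. "CODE - description".
def age_rating_for(classification)
  return unless classification
  # The code is the part before the first " - " separator.
  AGE_RATINGS[classification.split(' - ').first]
end

puts age_rating_for('PG-13 - Teens 13 or older').inspect
puts age_rating_for('Rx - Hentai').inspect
```

Because the separator is `' - '` with surrounding spaces, the hyphen inside `PG-13` is not split on.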
Binary file added server/spec/fixtures/image.jpg
79 changes: 79 additions & 0 deletions server/spec/fixtures/my_anime_list/berserk-manga.json
@@ -0,0 +1,79 @@
{
  "id": 2,
  "title": "Berserk",
  "other_titles": {
    "english": [
      "Berserk"
    ],
    "synonyms": [
      "Berserk: The Prototype"
    ],
    "japanese": [
      "\u30d9\u30eb\u30bb\u30eb\u30af"
    ]
  },
  "rank": 1,
  "popularity_rank": 8,
  "image_url": "http:\/\/cdn.myanimelist.net\/images\/manga\/1\/157931.jpg",
  "type": "Manga",
  "status": "publishing",
  "members_score": 9.24,
  "members_count": 104891,
  "favorited_count": 24537,
  "synopsis": "Guts, a former mercenary now known as the \"Black Swordsman,\" is out for revenge.",
  "genres": [
    "Action",
    "Adventure",
    "Demons",
    "Drama",
    "Fantasy",
    "Horror",
    "Supernatural",
    "Military",
    "Psychological",
    "Seinen"
  ],
  "tags": [

  ],
  "anime_adaptations": [
    {
      "anime_id": 33,
      "title": "Berserk",
      "url": "http:\/\/myanimelist.net\/anime\/33\/Berserk"
    },
    {
      "anime_id": 10218,
      "title": "Berserk: Ougon Jidai-hen I - Haou no Tamago",
      "url": "http:\/\/myanimelist.net\/anime\/10218\/Berserk__Ougon_Jidai-hen_I_-_Haou_no_Tamago"
    },
    {
      "anime_id": 12113,
      "title": "Berserk: Ougon Jidai-hen II - Doldrey Kouryaku",
      "url": "http:\/\/myanimelist.net\/anime\/12113\/Berserk__Ougon_Jidai-hen_II_-_Doldrey_Kouryaku"
    },
    {
      "anime_id": 12115,
      "title": "Berserk: Ougon Jidai-hen III - Kourin",
      "url": "http:\/\/myanimelist.net\/anime\/12115\/Berserk__Ougon_Jidai-hen_III_-_Kourin"
    },
    {
      "anime_id": 32379,
      "title": "Berserk (2016)",
      "url": "http:\/\/myanimelist.net\/anime\/32379\/Berserk_2016"
    }
  ],
  "related_manga": [
    {
      "manga_id": 92299,
      "title": "Berserk: Shinen no Kami 2",
      "url": "http:\/\/myanimelist.net\/manga\/92299\/Berserk__Shinen_no_Kami_2"
    }
  ],
  "alternative_versions": [

  ],
  "personal_tags": [

  ]
}
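
To show how this fixture lines up with the extractor's `titles` method, here is a small sketch using only the standard library. `FIXTURE` is a trimmed copy of the JSON above and `titles_from` is a hypothetical standalone version of `titles`; it uses `&.first` in place of ActiveSupport's `try(:first)` so it runs outside Rails.

```ruby
require 'json'

# Trimmed copy of berserk-manga.json: only the keys that titles reads.
FIXTURE = <<~'JSON'
  {
    "title": "Berserk",
    "other_titles": {
      "english": ["Berserk"],
      "synonyms": ["Berserk: The Prototype"],
      "japanese": ["\u30d9\u30eb\u30bb\u30eb\u30af"]
    }
  }
JSON

# Standalone equivalent of Extractor::Media#titles.
def titles_from(data)
  other = data['other_titles']
  {
    en_jp: data['title'],
    en_us: other['english']&.first,
    ja_jp: other['japanese']&.first
  }
end

data = JSON.parse(FIXTURE)
puts titles_from(data).inspect
puts data['other_titles']['synonyms'].inspect
```

The `\u` escapes are decoded by `JSON.parse`, so `ja_jp` comes back as the katakana title rather than the escaped form.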