no intersection more awesome

Anime and the Nepomuk Metadata Extractor

October 12th, 2012 | Posted by Jason "moofang" in KDE

Here’s a long-procrastinated update on my endeavors with Nepomuk and anime, first introduced here. I mentioned then that the way forward would be to create a plugin-based framework so that one could fetch anime metadata from different online sources. As it turned out, Joerg’s Nepomuk Metadata Extractor can already do that, so I scurried over, grabbed the sources, and did my subsequent work on top of Joerg’s program. The anime use case is now in a fairly workable state. Basically, the program lets you manually or automatically fetch metadata (series title, episode number, synopsis, etc.) from online sources for the anime video files on your disk, and stores all of it on the files as Nepomuk metadata.



A small, similar patch gave Nepomuk Metadata Extractor the same anime compatibility features outlined in my previous post. One of the more significant limitations of the previous program was that theTVDB’s web API, which Trueg’s program used, did not support show aliases. That is a huge problem for anime: theTVDB lists “Tonari no Kaibutsu-kun” as “My Little Monster”, for example, so searching for “Tonari no Kaibutsu kun”, which is what everyone else calls the series, yields no result. I mentioned that MyAnimeList’s API, which does support searching by alias, was more resilient in cases like these, but when I set out to write a MyAnimeList plugin I discovered the obvious problem: unlike theTVDB, MyAnimeList is a per-series service and doesn’t have data for individual episodes. In the end I settled on a modified version of the tvdb plugin, which I called “tvdbmal”. The plugin uses the MAL API to find a list of show aliases first, then runs each alias against theTVDB. This significantly increased the number of shows that could be fetched automatically compared to the vanilla tvdb plugin:

Dantalian no Shoka now fetches automatically!

Of course, this still isn’t exhaustive, so another feature I added was the ability to manually specify alias overrides: if tvdbmal can’t find the correct alias by itself, I can just tell it which one to use.
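To make the lookup order concrete, here is a rough Python sketch of the idea. It is not the actual tvdbmal code: `resolve_series`, `fetch_aliases`, `search_tvdb` and the stand-in data are all hypothetical names made up for illustration, and the real MAL and theTVDB calls are passed in as plain functions so the sketch doesn't pretend to know their APIs.

```python
from typing import Callable, Dict, Iterable, List, Optional


def resolve_series(
    title: str,
    fetch_aliases: Callable[[str], Iterable[str]],
    search_tvdb: Callable[[str], Optional[dict]],
    overrides: Optional[Dict[str, str]] = None,
) -> Optional[dict]:
    """Try a manual override first, then the title, then each MAL alias."""
    candidates: List[str] = []

    # A manually configured alias (hypothetical config format) is tried first.
    if overrides and title in overrides:
        candidates.append(overrides[title])

    # Then the title as given, followed by every alias the MAL lookup returns.
    for name in [title, *fetch_aliases(title)]:
        if name not in candidates:
            candidates.append(name)

    # The first candidate that theTVDB recognises wins.
    for name in candidates:
        result = search_tvdb(name)
        if result is not None:
            return result
    return None


# Stand-in data instead of real network calls (assumption, for illustration).
fake_mal = {"Tonari no Kaibutsu-kun": ["My Little Monster"]}
fake_tvdb = {"My Little Monster": {"series": "My Little Monster"}}

match = resolve_series(
    "Tonari no Kaibutsu-kun",
    lambda t: fake_mal.get(t, []),
    fake_tvdb.get,
)
print(match)
```

Keeping the two lookups as injected functions also makes it easy to swap in a different alias source later.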

As I worked through my own video collection with the program, I added features as needed to cover all my shows, so the alias configuration system is now advanced enough to support special custom season rules. The problem is that new seasons of anime often have a different title altogether, and are not labelled as “old anime title season 2”. For example, “To Love Ru Darkness” is really “To Love Ru” season 3. Worse: “Fate/Zero” episodes 1 through 13 are considered “Fate/Stay Night” Season 2, and episodes 14 onwards are considered Season 3. The alias configuration system is now powerful enough to deal with oddities like these.
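For a feel of what such rules boil down to, here is a small Python sketch using the two examples above. The `SeasonRule` structure and `apply_rules` function are made up for illustration and are not the plugin's real configuration format. Restarting the episode count at 1 when a rule begins partway through a title is also my assumption here.

```python
from typing import List, NamedTuple, Optional, Tuple


class SeasonRule(NamedTuple):
    title: str                    # title as it appears on the files
    first_episode: int            # first episode the rule covers
    last_episode: Optional[int]   # None means "and everything after"
    series: str                   # series name to look up instead
    season: int                   # season number within that series


RULES: List[SeasonRule] = [
    SeasonRule("To Love Ru Darkness", 1, None, "To Love Ru", 3),
    SeasonRule("Fate/Zero", 1, 13, "Fate/Stay Night", 2),
    SeasonRule("Fate/Zero", 14, None, "Fate/Stay Night", 3),
]


def apply_rules(
    title: str, episode: int, rules: List[SeasonRule] = RULES
) -> Optional[Tuple[str, int, int]]:
    """Return (series, season, episode within season), or None if no rule fits."""
    for rule in rules:
        if rule.title != title or episode < rule.first_episode:
            continue
        if rule.last_episode is not None and episode > rule.last_episode:
            continue
        # Assumption: episodes are renumbered from 1 inside the mapped season.
        return rule.series, rule.season, episode - rule.first_episode + 1
    return None


print(apply_rules("Fate/Zero", 14))
```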

What does having all that metadata do for you? Well, for starters, you can see all the stored data in Dolphin’s information panel. If you have Trueg’s tvshow kioslave, you can also browse all your series, seasons, and episodes via Dolphin. Even better, you can grab a little gem I recently discovered called Bangarang media player, which already has plenty of hooks out of the box:

Browsing series

Series Seasons

Episodes and all their information

I’m sure even more things can be done with the metadata, but this is just the beginning. The MAL alias search could also be improved. In the meantime, if you’d like to give this thing a whirl, you can grab the latest master for metadata extractor with

git clone git://anongit.kde.org/nepomuk-metadata-extractor

And bangarang with

git clone git://anongit.kde.org/bangarang

Have fun!



22 Responses

  • NatsuPower says:

    Very cool :D As an Otaku, Linux fan, and KDE lover, I think this is very interesting.

    When will this feature be in KDE?

    • Jason "moofang" says:

      Well, you can already try it if you want :) It’s still in playground now but I think Joerg will try to have it moved into extragear or something when he feels it’s stable enough.

  • Wind says:

    Now this kind of post is exactly where this blog takes its title from :) Great work

  • Quintasan says:

    Hmm, I have managed to get the metadata-extractor to compile and run, but for the life of me I can’t figure out why I do not have a Search Engine available there. Any ideas?

    • Jason "moofang" says:

      Do you mean no search plugins are loaded? You may be missing some of the plugin dependencies. For the tvdbmal plugin which this post is about, you’ll need, in addition to PyQt and PyKDE, the tvdb python module, and the python requests module.

      You can run metadata-extractor from the command line and look at the output – if it fails to load a plugin it should report why.
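      A quick way to check for missing Python dependencies is a short script like the sketch below. The module names in `ASSUMED_DEPENDENCIES` are my guesses at the import names (the tvdb module in particular may be named differently on your system), so adjust them as needed. It also only tells you about the Python interpreter you run it with, which may not be the one Kross uses.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the names that cannot be found by the current interpreter."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised for dotted names with a missing parent, or odd modules.
            found = False
        if not found:
            missing.append(name)
    return missing


# Assumed import names for the dependencies mentioned above; adjust as needed.
ASSUMED_DEPENDENCIES = ["PyQt4", "PyKDE4", "tvdb_api", "requests"]

if __name__ == "__main__":
    print("Missing:", missing_modules(ASSUMED_DEPENDENCIES))
```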

      • Quintasan says:

        No sir, it’s more like there are NO search plugins available. Output yields no error, the window itself looks like this -> http://wstaw.org/m/2012/10/22/plasma-desktopNA2336.png and the output is -> http://paste.ubuntu.com/1298214

        I’m not entirely sure where the tvdbmal source code is. Are we talking about git://git.code.sf.net/p/nepomukoracle/code ?

        • Jason "moofang" says:

          Oh blah, all the error messages are kDebug messages, so they won’t be visible unless you enable debug output for metadata extractor. That probably needs to be fixed. You can enable debug output for metadataextractor by running ‘kdebugdialog’ then searching for and checking off metadataextractor in the list. I think you’re almost certainly missing dependencies.

          The plugins are in lib/webextractor/plugins, so tvdbmal is at lib/webextractor/plugins/tvdbmal.py. You should have pulled all the plugins along with the rest of metadataextractor if you cloned it from playground.

          • Quintasan says:

            Well, I double checked and installed all those dependencies you mentioned. Nothing went better -> http://paste.ubuntu.com/1327972
            It looks for plugins in /usr/share/kde4/apps/nepomukmetadataextrator/plugins, the files are there but I still have no search plugins. One funny thing tho. When you open up any of the scripts the shebang line looks like #!/usr.bin/python. Shouldn’t that be #!/usr/bin/python instead? I tried changing that but it didn’t work so I assume that’s not the case

          • Jason "moofang" says:

            Hmmm, two things:

            From your output it looks like Kross couldn’t find python somehow. ‘python’ should be among the list of ‘available interpreters’. This is most likely the main problem. I have a package installed on my system (opensuse) called kross-python, I suspect you’ll need the equivalent. That part of the output should look like this:

            metadataextractor(9892) NepomukMetaDataExtractor::Extractor::ExtractorFactory::loadScriptInfo: available interpreters ("javascript", "python", "qtscript")
            Kross: "Loading the interpreter library for javascript"
            Kross: "Successfully loaded Interpreter instance from library."
            Kross: "Loading the interpreter library for python"
            Kross: "Successfully loaded Interpreter instance from library."
            Kross: "Loading the interpreter library for qtscript"
            Kross: "Successfully loaded Interpreter instance from library."

            If you still have issues after that, also try removing the config file at ~/.kde4/share/config/nepomukmetadataextractorrc. I think it may not be doing its caching properly for missing dependency cases. I’ll be sure to take a closer look when I next have time (busy period now >.<)

  • Larso says:

    Great work.

    By the way, AniDB has per episode and per file information and a UDP API for accessing it. See http://wiki.anidb.net/w/API

  • Loose Control says:

    I have a question regarding the alias configuration where can i find it?
    Is it in the Plugins window of the KCM? (there are 2 buttons which i can’t press)

    @Quintasan
    i had the same problem none of the plugins were working the solution for me was to install the kdebindings python kross package

    • Jason "moofang" says:

      I haven’t built the latest version with the KCM module yet, but if things are the same, in the plugins page you should be able to select a plugin in the list and click ‘configure’ to show the config interface, if the plugin has one. The alias configuration stuff in this post is the config interface of the ‘tvdbmal’ plugin, so you just select that in the list, and click the ‘configure’ button.

      • Loose Control says:

        sorry for replying this late…
        there are 2 buttons info and configure but they are greyed out for all plugins i have (imdb, tvdb and tvdbmal)

        • Jason "moofang" says:

          That is strange. The info button at least should become clickable if you select any plugin. Try grabbing the latest source again and rebuild? I am travelling now so I can’t help you in too much detail till I get back.

          • Loose Control says:

            I tried the git version same problem.

          • Jason "moofang" says:

            If you pull the latest changes from git you should see all plugins now with the failed ones redded out. Can you enable debug output (run kdebugdialog and check off “metadataextractor”), run metadataextractor from the command line (just pass it a random file as parameter), go into the plugins page (hit the settings icon at the top-right), then screenshot the plugins page, as well as pastebin all the output in the terminal?

          • Loose Control says:

            omg i just found the bug…
            i have to double click on the plugins then the buttons start to work

          • Jason "moofang" says:

            That’s… weird O_O not sure why that happens.

  • sabutilnik says:

    Hi,
    First of all, thanks for your work, i have a very big anime collection and this software will really help me to organize it. How do you manage to get information for long anime series like One Piece or Naruto?. Currently for tv shows one must specify season and episode numbers, but for this kind of series every episode has only episode number and not season. Is there a way to get this information?

    • Jason "moofang" says:

      Hello. Since the plugins for tv show use thetvdb, you can go to http://thetvdb.com to look at how the season/episode relationships are arranged there. For example, for One Piece, the tvdb entry is here, and looking at it, you’ll see that Season 1 is the first eight episodes. That means that episode 9 is considered Season 2 Episode 1. And since Season 2 has 22 listed episodes, episode 30 (22 + 8) will be the last episode of Season 2 and episode 31 would be Season 3 Episode 1.
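      The arithmetic above can be sketched in a few lines of Python. This is just an illustration, not part of the plugin: `split_absolute_episode` is a made-up name, the season lengths (8 and 22) are the ones quoted above, and treating anything past the last listed season as belonging to the next season is my assumption.

```python
from typing import Sequence, Tuple


def split_absolute_episode(
    absolute: int, season_lengths: Sequence[int]
) -> Tuple[int, int]:
    """Convert an absolute episode number into (season, episode in season)."""
    if absolute < 1:
        raise ValueError("episode numbers start at 1")

    remaining = absolute
    for season, length in enumerate(season_lengths, start=1):
        if remaining <= length:
            return season, remaining
        remaining -= length

    # Assumption: episodes past the listed seasons fall into the next season.
    return len(season_lengths) + 1, remaining


# Season 1 has 8 episodes and Season 2 has 22, as described above.
lengths = [8, 22]
for number in (9, 30, 31):
    print(number, split_absolute_episode(number, lengths))
```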

      I don’t follow very long shows like One Piece so it never struck me before, but looking at it now it does look like a pain to be micromanaging so many seasons in this way. When I’m free again I’ll look into maybe automating some of that, but unfortunately for the time being you’ll need to manage the seasons manually.


