YOLO is an acronym for a horrible quip that hopefully nobody uses anymore. Also is the name of an neural netwrok architecture designed by Joseph Redmon. Yolo work on
Darknet a framework written by Mr Redmon himself.
The algorithm is a state of the art object detection algorithm that runs on real time. You can check the paper
online Warning: Very math heavy .
In short this algorithm takes an images and detects all known objects on them:
By drawing a colorful box surrounding each recognition. It is trained on the
COCO Dataset a bunch of images with tags. This is important as it can only recognize things that are contained on this big iamge dump (300k images!)
So how does i work on the comic characters? See yourself:
Note None of the boxes were drawn by any human, the pieces of software itself decides were and how to draw them.
As you can see it classifies most characters as persons, the closes category on the dataset.
This looks all fine and dandy until i tell you i had to tweak some parameters to get this result. YOLO works by estimating the probability some area in an image belongs to a category, by default if the probability is over 25 % it declares it is indeed that and draws a box around it. However unless i drop the threshold to 5% i can't get that kind of results...
For example in this lovely picture the detector is unable too see anything but the glass Maeve is carrying on the platter
Well it might simply be because Maeve is a cat/snow leopard thing and Maddie some cat-cougar hybrid right? Not necessarily:
As you can see it has no problem seeing older Maeve and even correctly classifying baby Maeve as a cat. However if you look closely it even mis classifies her face as a clock...
Why is this? Well i am actually stress testing this thing. It has never seen a cartoon in its "life" much less an anthropic cat so it is still surprising, at least for me, that it can pull this kind of feat. What i am trying to say here is that there is no good reason why this thing is performing as well as it is.
Since this post will get exceedingly long if i keep posting images directly here i am gonna link you to my imgur account
TechnicBot's account were you can see more of my experiments under the yolo tag. I uploaded only the the ones that struck me as the most significant and interesting.
Also if for whatever reason you want to give it a whirl: You can download it from its
Github link thought is a bit hard to set up. If you want to make it run on Windows you should use
this fork a warning though: You will need to compile it from source. Also it is completely free!
So in any case some insights of the results:
- Most humans it manages to detect although for some reason Trace is almost invisible to the software.
- If you are good with YOLO and tell it to relax it predictions it can get Keith most of the time. As I said: most of the time: Occasionally confuses him for a baseball glove or a ball.
- And since we are dealing with our favorite rabbit humanoid SO we can see YOLO is quite food of Natani
- But seems to highly dislike Flora, seriously it was pretty damn hard to get her to get properly labeled:
If you want me to check how any comic page or sketch is processed by YOLO feel free to ask, and i will uploaded it the imgur repo y mentioned. Also any doubt, questions, comments, death threats etc. Feel free to post here
Now at this point i imagine most of you are begging for a mod to ban me so you don't have to read me anymore so i will leave this here for now.