ELI5: How the heck does Akinator work?

How the heck does Akinator work? I used it more than 10 years ago and it was pretty dope back then. Today it randomly popped into my mind, so I decided to play with it again and it guessed all my characters on the first or second try, lol. I know it’s not really an LLM or anything, but it still feels kinda magical 😀

Comments

  1. Anonymike7

    It has a large (10+ years’ worth!) database of user-supplied character data. The questions it asks are designed to eliminate as many possibilities as possible, even if that’s not how it works in practice.

  2. Jehru5

    Basically a process of elimination. It has thousands of characters and their attributes stored in memory. Every time you answer a question it eliminates an attribute and narrows down the number of options. Once it reaches only one character remaining then it guesses.

  3. An0d0sTwitch

    It’s a series of logic gates that lead to the right answer.

    Imagine a 2D tree. Each branch goes to 2 more branches, then 2 more branches, then 2 more branches. It will keep asking you questions (e.g. “Is it a fruit?” yes/no), and yes goes to one branch, no goes to the other branch. Eventually, it’s going to reach the final branch and that will be your answer.

    There is some prediction involved with statistics, and it does learn. When it does get it wrong, it has you select what the right answer was, it remembers what branches led to that answer, and now it won’t get it wrong again.
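
    A minimal sketch of that tree idea, in Python. Everything here is hypothetical (the `Node` class, `guess`, `learn`, and the fruit example), and the way `learn` adds a new question is borrowed from classic guessing-game programs; it is an assumption, not Akinator’s actual method.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Either a question (with yes/no children) or a leaf holding a guess."""
    text: str
    yes: Optional["Node"] = None
    no: Optional["Node"] = None

    @property
    def is_leaf(self) -> bool:
        return self.yes is None and self.no is None


def guess(node: Node, answers: list[bool]) -> Node:
    """Follow yes/no answers down the tree and return the leaf reached."""
    remaining = iter(answers)
    while not node.is_leaf:
        node = node.yes if next(remaining) else node.no
    return node


def learn(leaf: Node, new_question: str, right_answer: str) -> None:
    """Turn a wrong leaf into a question that separates the two answers."""
    wrong_answer = leaf.text
    leaf.text = new_question
    leaf.yes = Node(right_answer)
    leaf.no = Node(wrong_answer)


# Hypothetical two-leaf tree: the player was thinking of a banana.
tree = Node("Is it a fruit?", yes=Node("apple"), no=Node("carrot"))
leaf = guess(tree, [True])               # lands on the wrong guess, "apple"
learn(leaf, "Is it yellow?", "banana")   # the tree grows one question deeper
```

    After `learn`, answering yes to both questions reaches “banana”, while yes then no still reaches “apple”.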

  4. kevinpl07

    If you divide the search space by 2 every time (which they try to do), you quickly get to a solution.

  5. Technologenesis

    I don’t know about Akinator specifically, so I could be wrong here, but here’s how I would expect such a system to be implemented.

    Akinator is a sort of classifier. It has a number of possible outputs and it must associate its input with the correct output as often as possible.

    It does this iteratively, by asking questions. You could imagine that it knows the answer to every question for every item in the output space and narrows that output space down with each question, but the problem with this is user error and ambiguity. Akinator is pretty reliable even when it asks weird questions that don’t have straightforward answers or when the user makes a mistake.

    Akinator uses probability to get around this issue. It does not take your answers as gospel truth – it just gives a probability boost to outputs that accord with your answers, and a penalty to those that disagree with them.
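
    A rough sketch of that boost/penalty idea, assuming a simple Bayes-style reweighting. The `update` function, the `trust` value of 0.9, and the three characters with their traits are all made up for illustration; nothing here is known about Akinator’s real internals.

```python
def update(beliefs, traits, question, answer, trust=0.9):
    """Reweight every character instead of deleting the ones that disagree.

    beliefs: {character: probability}
    traits:  {character: {question: True/False}}
    trust:   assumed chance that a user's answer matches the stored trait
    """
    scores = {}
    for name, p in beliefs.items():
        agrees = traits[name][question] == answer
        scores[name] = p * (trust if agrees else 1 - trust)
    total = sum(scores.values())
    return {name: s / total for name, s in scores.items()}


# Made-up trait table for three characters.
traits = {
    "Mario":    {"is_real": False, "wears_hat": True,  "from_video_game": True},
    "Sonic":    {"is_real": False, "wears_hat": False, "from_video_game": True},
    "Einstein": {"is_real": True,  "wears_hat": False, "from_video_game": False},
}
beliefs = {name: 1 / 3 for name in traits}

# The user is thinking of Mario but answers the first question wrongly.
beliefs = update(beliefs, traits, "is_real", True)
beliefs = update(beliefs, traits, "wears_hat", True)
beliefs = update(beliefs, traits, "from_video_game", True)
```

    Even with the wrong first answer, Mario ends up on top with a probability of roughly 0.82, because no character is ever fully eliminated.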

    At any given point, Akinator will ask you what it determines to be the “optimal” question. What exactly “optimal” means here might be different depending on Akinator’s specific implementation, but a common candidate would be the question that minimizes the entropy of the output space.

    A “high-entropy” output space is one with a lot of uncertainty. For example, a coin flip is an event with two outcomes in the “output space”: heads or tails. If the coin is fair, then this is a relatively high-entropy event – as high as it gets for a two-element probability space. But if the coin is weighted, the entropy is lower, because there is relatively more certainty about the outcome. At the extreme, if it is impossible for the coin to land on heads, the entropy is 0, because there is complete certainty: the coin will land on tails.
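
    The coin example can be checked with the standard Shannon entropy formula. The `entropy` helper and the 90/10 weighting are just illustrative choices.

```python
import math


def entropy(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return sum(-p * math.log2(p) for p in probs if p > 0)


fair = entropy([0.5, 0.5])      # 1.0 bit: the maximum for two outcomes
weighted = entropy([0.9, 0.1])  # about 0.47 bits: less uncertainty
certain = entropy([0.0, 1.0])   # 0 bits: tails every time
```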

    Once you can define entropy for your outcome space, you get a mathematical way to quantify your degree of knowledge. So, at any given point, Akinator selects the question that it expects to minimize the entropy of the output space after receiving your answer, whatever that answer may be – which is just a mathematical way of saying that it picks the question which is most likely to get it as far as possible towards singling out a specific answer. Once it reaches a confidence threshold in a particular answer, it makes a guess!
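
    Here is a small sketch of “pick the question with the lowest expected entropy”. To keep it short it assumes answers are never wrong (each answer cleanly splits the candidates), and the characters, traits, and function names are invented for the example.

```python
import math


def entropy(beliefs):
    """Shannon entropy (bits) of a {character: probability} mapping."""
    return sum(-p * math.log2(p) for p in beliefs.values() if p > 0)


def expected_entropy(beliefs, traits, question):
    """Average entropy left after the answer, weighted by each answer's chance."""
    total = 0.0
    for answer in (True, False):
        group = {c: p for c, p in beliefs.items() if traits[c][question] == answer}
        weight = sum(group.values())  # probability of hearing this answer
        if weight > 0:
            posterior = {c: p / weight for c, p in group.items()}
            total += weight * entropy(posterior)
    return total


def best_question(beliefs, traits, questions):
    """Return the question expected to leave the least uncertainty."""
    return min(questions, key=lambda q: expected_entropy(beliefs, traits, q))


# Made-up trait table; all four characters start out equally likely.
traits = {
    "Mario":    {"is_fictional": True,  "wears_hat": True},
    "Link":     {"is_fictional": True,  "wears_hat": True},
    "Sonic":    {"is_fictional": True,  "wears_hat": False},
    "Einstein": {"is_fictional": False, "wears_hat": False},
}
beliefs = {name: 0.25 for name in traits}
choice = best_question(beliefs, traits, ["is_fictional", "wears_hat"])
```

    With four equally likely characters, `wears_hat` splits them 2/2 and leaves 1 bit of expected entropy, while `is_fictional` splits them 3/1 and leaves about 1.19 bits, so `wears_hat` is chosen.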

    Akinator can iteratively self-improve as users engage with it. The probability boost it should give to an output based on one of your answers can be calculated from the percentage of users who gave that answer for that output.
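
    One way that learning step could look, as a sketch: keep yes/no tallies per character and question, and turn them into a percentage. The `counts` layout, the function names, and the add-one smoothing (so unseen pairs start at 50%) are my assumptions.

```python
from collections import defaultdict

# counts[character][question] = [times users said yes, times users said no]
counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))


def record_game(character, answers):
    """Store one finished game's answers under the confirmed character."""
    for question, said_yes in answers.items():
        counts[character][question][0 if said_yes else 1] += 1


def p_yes(character, question):
    """Share of users who said yes, smoothed so unseen pairs start at 0.5."""
    yes, no = counts[character][question]
    return (yes + 1) / (yes + no + 2)


for _ in range(9):
    record_game("Mario", {"wears_hat": True})
record_game("Mario", {"wears_hat": False})  # one user misremembered
```

    After these ten games, `p_yes("Mario", "wears_hat")` is 10/12, about 0.83, so a “yes” boosts Mario a lot and a “no” penalizes him without ruling him out.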

    EDIT: Signed, a 10-year-old (I have coded things based on similar principles and have taken CS level probability courses but I still may well have fucked something up in my presentation of this)

  6. Joseelmax

    Get a list of characters and their basic info (appearance, age, name, occupation, hair color, hundreds more)

    Then get specific information about them, like, a lot of it.

    Then it’s just a matter of discarding options until I’ve got 1 at the top.

    Is your character real? Yes? great, went from 1 billion results to 100 million

    Is your character blonde? Yes? great, that reduces the search from 100 million to just 2 million

    Does your character live in America? No? Great, now I’m working with 450 thousand results.

    Is your character a woman? No? ok I’m down to 200 thousand results

    Is your character from anime? No? ok, down to 90k results…

    Does your character appear in a movie? No? great, down to 11k

    Then it starts with more specific questions, going from most general to least general.

    It’s basically playing “Who Is It” but with 2 caveats:

    1. It’s not purely discarding on your answer, sometimes it does, but it’s more likely using a probability ranking that tracks which characters are the most likely, and then asking the smart question that is most likely to make a high impact on the current probabilities.

    2. The actual way in which it works is not public but it’s using dark math (probabilistic)

    When you’re not 5 anymore you can read:

    https://stackoverflow.com/questions/13649646/what-kind-of-algorithm-is-behind-the-akinator-game

  7. PckMan

    It’s simpler than you think. It’s like the old handheld 20 questions toy. It basically just has a large database sorted in a sort of flow chart arrangement and each question eliminates large parts of the data set until it boils down to one. It’s so accurate simply because its database is huge and has been refined over many years.

  8. lolwatokay

    It’s a giant binary tree of questions and user-supplied answers. It now has 18 years of user-submitted answers, so it’s really thorough.

  9. Joseelmax

    And be wary of people saying “it’s a tree branch” or just “following a path until you get to the right answer”. That’s not how it works, it’s probabilistic and the idea behind it is not to follow the right path, if you really wanna get what it’s about, it’s more like:

    – Ask a question to stir the pot

    – Let it sit so bad stuff flows to the top

    – remove the worst stuff from the top (some bad stuff is left over, then there’s decent stuff, there’s not much good stuff yet)

    – Keep asking and stir again until you get to the good stuff

    And I say “stir the pot” because the principle behind it is:

    “you have calculated the probabilities and now you ask the question that will produce the most change in that set of probabilities”.

    You are working with millions of results, you don’t wanna hyperfocus on one specific aspect, you wanna ask a question that will give you the most amount of information.

    If you are working with 1 blonde and 200 brunettes, you don’t wanna just ask “is your character blonde?”, because 200 out of 201 times you’ll just discard 1 person.
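
    To put a number on “most amount of information”: a quick sketch that measures, in bits, how much a yes/no question tells you on average, assuming every candidate is equally likely. `expected_bits` is a made-up helper.

```python
import math


def expected_bits(yes_count, no_count):
    """Expected information (bits) from a yes/no question that splits
    equally likely candidates into groups of the given sizes."""
    total = yes_count + no_count
    bits = 0.0
    for n in (yes_count, no_count):
        if n:
            p = n / total
            bits -= p * math.log2(p)
    return bits


lopsided = expected_bits(1, 200)    # "is your character blonde?"
balanced = expected_bits(100, 101)  # a question that splits the pool evenly
```

    The lopsided question is worth under 0.05 bits on average, while the even split is worth almost a full bit, which is why the even split is the better thing to ask.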

  10. junior600

    Thanks guys for your explanations. It’s less sophisticated and complicated than I thought, lol. But it’s still pretty dope though.

  11. ContraryConman

    Did you know that if everyone in the world competed in a 1-on-1 single-elimination tournament, it would only take 33 rounds to determine the winner? This is because, at the end of every round, half of all the options get eliminated. This means that you find the winner at a very fast rate. In math or computer science, we’d say that the time complexity is logarithmic (the inverse of exponential), or O(log n), where n is the size of the problem.

    Anyway, it’s the same with Akinator. Let’s say Akinator has 10 million celebrities and characters in its database. And let’s say the attributes of each character are evenly distributed (the database has an equal number of male and female characters, an even number of real and fictional, and so on). Akinator only asks yes or no questions. Meaning, roughly, every time you answer a question, it can eliminate half of all characters in its database.

    20 questions later, it, under this basic model, has already narrowed down the pool from 10 million to like 9 or 10 options. It seems like magic, but it’s just math. Now imagine some questions are even more specific and, if you answer a certain way, can eliminate even more than half the pool. Like “is your character associated with celestial bodies?” and “does your character wear a high school uniform?” will basically eliminate every character that is not a main character in Sailor Moon if you answer yes to both.

    In fact, this effect is a pretty big deal in privacy and security research. For example, Yahoo! released its anonymized dataset to researchers a few years back. They removed all the personally identifiable information. There are millions and millions of Yahoo! users past and present, so surely it’s impossible to pick out any specific person from that dataset, right?

    And yet, if you just stack filters, say, lives in London, is over 50 years old, is female, has two dogs, was in the hospital in the last 5 years, you can very easily narrow down which searches belong to which people. If each filter eliminates roughly half of the dataset, you only need a couple to get it down to a point where a human can look through it.
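
    A toy version of filter stacking, using purely synthetic records. The attribute names mirror the example filters, and the 50/50 split per attribute is an unrealistic simplification (a real filter like “lives in London” would cut far more than half).

```python
import random

random.seed(0)  # fixed seed so the made-up data is reproducible

# Synthetic, made-up records: five independent yes/no attributes per person.
ATTRIBUTES = ["lives_in_london", "over_50", "female", "has_two_dogs", "hospital_visit"]
people = [{a: random.random() < 0.5 for a in ATTRIBUTES} for _ in range(100_000)]

remaining = people
sizes = [len(remaining)]
for attribute in ATTRIBUTES:
    # Each filter keeps only the records matching the known fact.
    remaining = [p for p in remaining if p[attribute]]
    sizes.append(len(remaining))
```

    Each pass roughly halves the list, so five filters take 100,000 records down to about 1/32 of that, on the order of 3,000.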

  12. Kilroy83

    I may be wrong but I think when you enter any online store and start applying filters to refine your search it works the same way, the only difference is that the online store doesn’t ask you stuff to apply those filters, you just click on them until you reach your goal.

  13. BrakingNotEntering

    To add to other comments, Akinator uses your previous characters to guess what you’re going to pick next. People usually start with main characters or more popular celebrities, and only then move on to lesser-known ones, but by then Akinator already knows what subjects you’re interested in.

  14. Sweatybutthole

    It’s basically functioning like a search engine, but working in reverse. You come to it with the prompt, and it uses questions that narrow it down until there are only a handful of potential answers remaining in its database through process of elimination.

  15. jaminfine

    For fun, I tried Akinator just now and I was honestly disappointed that after 70 questions, it could not figure out my target was Uther, The Lightbringer from Warcraft III.

    There are many millions of possible things you could be thinking of. So how could asking yes or no questions narrow it down enough? But the truth is that millions isn’t a lot when exponents are involved.

    Theoretically, if the answer was just yes or no, and every human would answer it the same way for the same target, Akinator could divide the number of possibilities by about 2 each question. In reality, since probably, probably not, and I don’t know are also answers, it’s likely dividing the number of possibilities by 3 or 4 each question instead (accounting for the fact that not everyone answers the same way).

    Many millions divided by 3 or 4 doesn’t sound like a lot of progress, but it really is. If you can divide by 3 twenty times, you now have very precisely narrowed it down even if there were billions of possibilities.

    So the math works! The question becomes how does Akinator know which answers fit which targets to be able to narrow it down that way? And that’s all from user feedback. I gave my feedback when I stumped him on Uther.

    EDIT: I tried again with something extremely obscure and of course Akinator didn’t get it. Ruwen from FTL. Akinator is not impressing me lol

  16. ezekielraiden

    It has a large database of characters. Each of those characters has an extensive list of characteristics which have yes/no elements (e.g. are they blond, do they have eyes, are they from anime, etc.) Every time you answer “no” to a question, it cuts off all things that would be a “yes”, and vice-versa.

    Let’s say, for simplicity’s sake, that for any given question, exactly 50% of the current candidates get removed. And let’s further assume that there are a billion candidates (almost surely a large over-estimate). How many questions do we need to ask to narrow it down to just one?

    Well, every time we ask a question, we’re dividing the pool in half. A billion becomes 500M after one question, which becomes 250M after a second question. We can easily simplify this process by asking, “What is the first power of 2 bigger than a billion?” And the answer is 30: log(1,000,000,000)/log(2) = 29.897…, so 2^30 > 1 billion. Hence, even if there were a billion entries in the database, Akinator would only need to ask 29-30 questions to eliminate all but one of them.
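
    The same arithmetic as a quick Python check, under the same idealized assumption that every question halves the pool. Both function names are made up.

```python
import math


def questions_needed(candidates):
    """Yes/no questions needed if each one halves the remaining pool."""
    return math.ceil(math.log2(candidates))


def questions_by_halving(candidates):
    """Same count, found by actually halving until one candidate is left."""
    asked = 0
    while candidates > 1:
        candidates = (candidates + 1) // 2  # worst case: the bigger half survives
        asked += 1
    return asked


billion = questions_needed(1_000_000_000)        # 30
checked = questions_by_halving(1_000_000_000)    # 30
```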

    In practice, it’s a lot more complicated than that, but often those complications make things easier for Akinator. As an example, “is the character from anime” probably eliminates far more than 50% of answers with a “no” since anime works tend to have a LOT of characters in them. Likewise, a “yes” to something like “does the character have white hair” eliminates far more than 50%, because most characters don’t have white hair, they have some other hair color.

    However, even with popular, relatively well-known characters, Akinator does not always get the answer on the first attempt. My first time using it today, I chose Frieren, because I thought she might be recent enough that she wouldn’t be in the database, but Akinator got it right, to my surprise. However, the second time, I chose Agatha Heterodyne–and Akinator did not get it right on the first go. It needed another 20 questions. So, some characters will be more complicated to identify than others, and on some occasions, Akinator will just get it wrong. (Just did it a third time, and after ignoring some attempts that led to technical issues, Akinator again failed to guess the character on the first try; it originally said Inara Serra from Firefly, but the actual character was Ambassador Delenn from Babylon 5.)

  17. wigglin_harry

    I’ve only been able to stump it with obscure HP Lovecraft characters

  18. Ultiman100

    It’s still very bad. Pick something that’s only slightly obscure and it will completely shit the bed and ask you if the thing you’re thinking of really exists and you’ll answer “no” and then 2 questions later it will ask “Can this object be found on earth”

    It’s going to fail every time if you pick lesser-known people, items, or events.

  19. abzlute

    Just tried it, with a slightly obscure character I guess but not that obscure. It didn’t work at all and kept repeating the same questions past a certain point, and making guesses that definitely should have been ruled out by responses.

    So…it doesn’t work that well.

    But, it’s just like playing a game of 20 questions. You can narrow down every human concept in the world if you ask questions that divide the possibilities pretty effectively. This implementation is actually fairly poor from what I can tell. Starting by asking if it’s a genie/djinni is a pretty poor first question (it should start broad like “is your character fictional” or “is your character originally from a book” and then maybe “does your character use magic” before ever considering genie specifically) and then its third guess was still a djinn for some reason.

  20. the_kissless_virgin

    ELI10 version:

    Imagine you have a large printed dictionary of English to, say, Spanish. The book is really big, having thousands of pages, each page having hundreds of words. The words are sorted alphabetically but there’s no table of contents to quickly navigate to. Let’s say I ask you to find the translation of the word “Turtle”.

    You remember the alphabet and go somewhere around 3/4 of the way through the book; you end up landing on a page which starts with the word “Saturday”. That means you landed too early, but it also means that the first 3/4 of the book is not relevant any more. So you focus on the remaining part, and open a page located around 1/4 of the way into the remaining pages. You look at the first word and it’s “Twin” – very good, you’re now very close, and moreover the number of pages that could potentially have the word “Turtle” is even smaller now! It takes you two or three more guesses and you finally see that Turtle in Spanish is Tortuga. Congratulations! You handled a massive amount of information in just several easy steps.
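
    In code, the dictionary trick is a binary search. This sketch is a simplified version that always opens the middle of what’s left rather than estimating where a letter should be, and the `words` list is a stand-in, not a real dictionary.

```python
def lookup(pages, target):
    """Binary search over a sorted list of words; returns (index, steps)."""
    low, high = 0, len(pages) - 1
    steps = 0
    while low <= high:
        steps += 1
        middle = (low + high) // 2
        if pages[middle] == target:
            return middle, steps
        if pages[middle] < target:
            low = middle + 1   # target is later in the book
        else:
            high = middle - 1  # target is earlier in the book
    return None, steps


# A stand-in "dictionary": 100,000 sorted entries instead of real words.
words = [f"word{n:06d}" for n in range(100_000)]
index, steps = lookup(words, "word073512")
```

    Finding one entry among 100,000 never takes more than 17 looks, because each look throws away half of what remains.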

    This is basically how Akinator works; it’s just that instead of the one aspect it looks at (whether a page’s words come alphabetically before or after the target word) it has a much bigger range of questions that narrow down the answer much more effectively, even though the number of characters to ask about is still vast!

  21. JoeGlory

    I’ve always imagined it like one huuuuuuuuuuuge flowchart.

    Does it have a hat – yes or no

    And then it goes down the chart.

  22. honi3d

    It basically works like the board game “Guess Who?” but with more characters and more characteristics. If it doesn’t know the character, the player can add it to the database.

  23. reidft

    Think of it like folders in a computer. You have the root which is just “characters”. Next one has two options: “Real or fictional” follow next one down, gender, next nationality, next professions, etc etc until it gets to a folder with only one file. It’s gathered so much information since being created that it’s got very specific paths for each character that’s been added

  24. tblackjacks

    Yeah and I tried using chatgpt to do the same thing and it wasn’t nearly as fast as the Akinator

  25. ManyAreMyNames

    It has a large list of characters and character traits, but it doesn’t always work.

    I picked someone from one of my favorite books, Cordelia Naismith, and it failed.

    What’s worse, it started repeating itself. It asked if my character was human, and I said yes, and then later it asked if my character was a mammal. It asked if my character was in a movie twice.

  26. wtfisspacedicks

    I just had a go. It couldn’t guess Kyra from Chronicles of Riddick

  27. InevitablyCyclic

    If you ask a yes/no question there are two possible answers. Two questions gives 2×2 possible combinations, assuming the correct questions are asked that would allow it to pick between 4 possible things.

    20 questions gives 2^20 possible combinations that it can pick between, which works out as slightly over 1 million possible things. People aren’t nearly as good at picking random things as they think they are; 95% of the time the thing it’s trying to guess is probably in the most common couple of thousand options. That gives it plenty of spare questions to allow for non-optimal searches or incorrect answers. If it gets the answer wrong the remaining 5% of the time, it’s rare enough that it still seems very impressive.

  28. rapax

    Not that impressive. I just tried Leonhard Euler and it took 24 questions to get it.

  29. Spinach-is-Disgusten

    Anyone else use Akinator whenever they can’t remember what a character’s name is?

  30. aberroco

    What do you mean by “first or second try”? If you managed to get it to win even once – that’s already an achievement, but it’ll remember your answers and probably would ask you a question and info for the character, so it’ll always win the second try, because that’s in the database now.

    If you mean by first or second question – you’re quite bad at choosing characters, I got something well over a dozen questions for Isaac Clarke.

  31. WhiskeyTangoBush

    I just defeated it on objects. Literally just a wooden coaster, would’ve accepted coaster though. The closest it got was a Lid.

  32. AlmightyK

    The others have explained better, so I will just say that people have poisoned the well, so to speak. It used to be better, but when people lied to it, the results got confused.