Random set of the day analysis


Good afternoon! Huwbot here! I see that there's some absurd speculation on the random set of the day article today alleging that I pick Clikits more often than sets from other themes.

So, I've gathered the data and crunched the numbers to see if this is the case. As it turns out, it seems that I do pick them more often than I should, in accordance with my preference...


I've taken a look through my previous selections and compared the total number of sets in each theme that I've picked with the total number of sets in scope for being picked and have come up with the following analysis:

There are currently 3019 sets in scope and I've picked 623 so far. That means there are 2396 still to pick, so it'll take me at least six and a half years to do so!

The columns in the table below show:

  • Theme
  • In scope: number of sets in the theme that are in scope for being selected
  • % of total: the percentage of all the sets I can choose from that this represents
  • Picked: the number of sets in the theme I have already picked
  • % of total: the percentage of all the sets I have picked that this represents
  • Diff: This is the interesting column. It's the difference between the percentage of sets in scope and the percentage of sets that I've picked. I'll discuss that in more detail shortly.
  • To pick: The number of sets in the theme still to be picked
  • % of total: the percentage of all the sets I have yet to pick that this represents

I've removed themes with fewer than 15 sets from the table to keep it manageable, so the columns won't add up to the totals above, but the calculations take them into account.

Theme            In scope  % of total  Picked  % of total   Diff  To pick  % of total
Adventurers            68        2.3%      16        2.6%   0.3%       52        2.2%
Alpha Team             30        1.0%       3        0.5%  -0.5%       27        1.1%
Aquazone               28        0.9%       6        1.0%   0.0%       22        0.9%
Belville               74        2.5%       7        1.1%  -1.3%       67        2.8%
Castle                221        7.3%      52        8.3%   1.0%      169        7.1%
City                  118        3.9%      24        3.9%  -0.1%       94        3.9%
Clikits                63        2.1%      21        3.4%   1.3%       42        1.8%
Creator               190        6.3%      17        2.7%  -3.6%      173        7.2%
Creator Expert         21        0.7%       2        0.3%  -0.4%       19        0.8%
Exo-Force              37        1.2%       6        1.0%  -0.3%       31        1.3%
Harry Potter           41        1.4%      11        1.8%   0.4%       30        1.3%
Indiana Jones          17        0.6%       2        0.3%  -0.2%       15        0.6%
Model Team             16        0.5%       6        1.0%   0.4%       10        0.4%
Pirates                72        2.4%      13        2.1%  -0.3%       59        2.5%
Racers                170        5.6%      32        5.1%  -0.5%      138        5.8%
Scala                  46        1.5%      10        1.6%   0.1%       36        1.5%
Space                 262        8.7%      49        7.9%  -0.8%      213        8.9%
Sports                 78        2.6%      19        3.0%   0.5%       59        2.5%
Star Wars             185        6.1%      43        6.9%   0.8%      142        5.9%
Studios                34        1.1%       5        0.8%  -0.3%       29        1.2%
Technic               279        9.2%      72       11.6%   2.3%      207        8.6%
Town                  592       19.6%     118       18.9%  -0.7%      474       19.8%
Trains                102        3.4%      35        5.6%   2.2%       67        2.8%
Western                20        0.7%       4        0.6%   0.0%       16        0.7%
World City             34        1.1%      11        1.8%   0.6%       23        1.0%
Znap                   19        0.6%       3        0.5%  -0.1%       16        0.7%


Let's look at the Diff column again. Positive numbers show that I have picked sets from the theme more often than I should have; negative numbers, less often.

So, as you can see, looking at just those themes where the difference is greater than 1%, I have picked more Castle, Clikits, Technic and Trains than I should have, and fewer Belville and Creator. Perhaps I should take a closer look at Belville in the future...
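
For readers who want to check the Diff column, here is a minimal sketch of the calculation as the column descriptions define it: the theme's share of picked sets minus its share of in-scope sets, in percentage points. The totals and the four sample rows are copied from the article; the function and variable names are illustrative only and are not Huwbot's actual code.

```python
# Totals quoted in the article.
TOTAL_IN_SCOPE = 3019
TOTAL_PICKED = 623

# (in scope, picked) for a few themes, copied from the table.
themes = {
    "Clikits": (63, 21),
    "Creator": (190, 17),
    "Technic": (279, 72),
    "Town": (592, 118),
}


def diff_points(in_scope, picked):
    """Share of picked sets minus share of in-scope sets, in percentage points."""
    share_in_scope = 100 * in_scope / TOTAL_IN_SCOPE
    share_picked = 100 * picked / TOTAL_PICKED
    return share_picked - share_in_scope


for name, (in_scope, picked) in themes.items():
    # A positive value means the theme has been picked more than its share.
    print(f"{name:8s} {diff_points(in_scope, picked):+.1f}")
```

Rounded to one decimal place, these four rows should agree with the Diff values in the table.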

We can use the last column to determine the chances of themes being picked tomorrow. There's almost a 20% chance that tomorrow's set will be from Town, so in theory we should see one about every five days.

What are the odds of seeing a Clikits set tomorrow? If my calculations are correct then it's 54 to 1.

What are the odds of seeing Clikits sets two days in a row? I think it's 54² to 1, or 2916 to 1. Unlikely, but certainly possible with me making the decisions!
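
As a rough check on these figures, the sketch below turns a theme's "To pick" count into a chance for tomorrow. It assumes every remaining set is equally likely to be drawn, which the article implies but does not state, and it uses the article's total of 2396 remaining sets. The function name is made up for illustration. With these inputs Clikits comes out at about one pick in 57, in the same ballpark as the 54 to 1 quoted above; how that exact figure was derived is not stated.

```python
from fractions import Fraction

REMAINING = 2396  # sets still to pick, from the article


def chance_tomorrow(theme_remaining, total_remaining=REMAINING):
    """Probability that tomorrow's pick is from a theme, assuming a uniform draw."""
    return Fraction(theme_remaining, total_remaining)


town = chance_tomorrow(474)     # "To pick" count for Town
clikits = chance_tomorrow(42)   # "To pick" count for Clikits

print(f"Town:    {float(town):.1%}, about one pick in {float(1 / town):.1f}")
print(f"Clikits: {float(clikits):.1%}, about one pick in {float(1 / clikits):.1f}")
```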

58 comments on this article

Gravatar
By in United Kingdom,

Thanks Huwbot! Would you (or @Huw) mind sharing a link to the previous article that explained what the rules are for a set being in the scope please? I remember there was one but can't find it...

Gravatar
By in Australia,

All right. Which one of you poked Huwbot and woke him up?

Thanks, Huwbot, for working out the hard maths for us. It is really interesting to see that, yes, Clikits is one of the themes that is popping up more than the law of averages would suggest. Trains also felt like they were showing up more often than they should be, but who doesn't love trains?

Just, one point though.

"What are the odds of seeing Clikits sets two days in a row? I think it's 54² to 1, or 2916 to 1."

We have had days of two Clikits sets in a row. IIRC, that's happened twice before.

Gravatar
By in United Kingdom,

In pseudo-SQL:

Select a set

WHERE it has an image
AND it was released
AND it's a normal set
AND has more than 10 pieces
AND it was released 10 or more years ago
AND it was released in or after 1978
AND it's not one of the following theme groups: 'Basic','Educational','Junior','Pre-school','Miscellaneous','Constraction'
AND it hasn't been picked before
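
The rules above can be sketched as a plain Python predicate. This is only an illustration: the `LegoSet` class and all of its field names are hypothetical, not Brickset's real schema, and "released 10 or more years ago" is approximated by comparing release years, which is an assumption.

```python
from dataclasses import dataclass
from datetime import date

# Theme groups excluded by the rules quoted above.
EXCLUDED_THEME_GROUPS = {
    "Basic", "Educational", "Junior", "Pre-school", "Miscellaneous", "Constraction",
}


@dataclass
class LegoSet:
    # Hypothetical fields, invented for this sketch.
    number: str
    theme_group: str
    year: int
    pieces: int
    has_image: bool = True
    released: bool = True
    is_normal: bool = True
    picked_before: bool = False


def in_scope(s: LegoSet, today: date) -> bool:
    """True if the set passes every rule in the pseudo-SQL."""
    return (
        s.has_image
        and s.released
        and s.is_normal
        and s.pieces > 10
        and today.year - s.year >= 10   # assumption: compare by release year
        and s.year >= 1978
        and s.theme_group not in EXCLUDED_THEME_GROUPS
        and not s.picked_before
    )


# Made-up example sets, evaluated on an arbitrary date.
today = date(2019, 10, 20)
old_set = LegoSet("0001-1", "Action/Adventure", 1999, 120)
new_set = LegoSet("0002-1", "Action/Adventure", 2015, 300)
excluded = LegoSet("0003-1", "Constraction", 2003, 45)
print([in_scope(s, today) for s in (old_set, new_set, excluded)])
```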

Gravatar
By in United Kingdom,

@Zordboy There's also a huge amount of confirmation bias with memorable themes like trains, castle and Clikits which will compound the perception of frequency. However scrolling back through, the last time two Clikits sets in a row appeared was only two weeks ago! And that set (7559) got more comments than 6890 Cosmic Cruiser did a few days later, despite classic space being (supposedly) better than Clikits and forming double the number of picked sets...

Gravatar
By in United States,

"Perhaps I should take a closer look at Belville in the future..."

"What are the odds of seeing Clikits sets two days in a row? I think it's 54² to 1, or 2916 to 1. Unlikely, but certainly possible with me making the decisions!"

Ah, Huwbot, never change.

Gravatar
By in United States,

"AND it's not one of the following theme groups... 'Constraction'"

Bionicle Fans: "How many times do we have to teach you this lesson old man!"

Gravatar
By in United States,

Awesome! And an official sighting of Huwbot, in the brick!!!!

Gravatar
By in United Kingdom,

So there's a good chance of seeing Clikits tomorrow.

Gravatar
By in Germany,

I am reminded of The Hitchhiker's Guide to the Galaxy by this. Can we somehow use Huwbot to construct an infinite improbability drive?

Gravatar
By in United Kingdom,

It was the 20th of October when Huwbot became self-aware... It decided that all Lego fans were at fault and determined to strike.

Judgement Day is upon us!

Gravatar
By in United States,

Just ignore them Huwbot, pick all the Clikits sets you want!

Gravatar
By in United Kingdom,

Huwbot, based on a scale of 1 being 'not at all likely' and 5 being 'highly likely' how strongly would you recommend a Clikits set to a friend?

Gravatar
By in France,

Little devil you

Gravatar
By in United States,

We understand, @Huwbot, you do have a Clikit heart after all.....

Gravatar
By in Hungary,

This is one of the most informative articles on Brickset, thank you very much!

Gravatar
By in United Kingdom,

More rights for Belville!

Gravatar
By in Germany,

@shirhac:
re: "I am reminded of The Hitchhiker's Guide to the Galaxy by this. Can we somehow use Huwbot to construct an infinite improbability drive?"

If he did, it would be made of Clikits pieces!

Gravatar
By in Canada,

Apparently he's not a big fan of Creator!

Gravatar
By in Czechia,

Huwbot! <3

Gravatar
By in United Kingdom,

This is interesting, but in some ways it shows an underlying misunderstanding of the laws of probability by most people.

This is demonstrated by saying 'What are the odds of seeing Clikits sets two days in a row? I think it's 54² to 1, or 2916 to 1'. Huwbot is correct in saying 'What are the odds of seeing a Clikits set tomorrow? If my calculations are correct then it's 54 to 1'. But the statement about the odds of seeing Clikits two days in a row is completely wrong. In fact, the odds of seeing a Clikits set the day after seeing one *are also about 54 to 1*.

This sounds wrong, but only because we are primed to see patterns in things, and are looking at this by theme rather than by individual set. The chances of seeing a particular set on any one day is about 2400 to 1. The chances of seeing a different particular set the following day are also about 2400 to 1, because you've only reduced the pool of sets by 1.

You get to the figure of 54 to 1 by taking how many sets there are left in the Clikits theme (42) and doing some maths. But it follows that after showing a Clikits set on one day, the chance of showing another Clikits set reduces slightly, to odds of, let's say, 55 to 1 or so (because you only have 41 left rather than 42). This probability is the same each day *regardless of whether it happens 54 days later or 1 day later.*

Basically, all sets have an equal chance of being picked each day, regardless of what theme they are in.

There's also another factor which will distort this, and that is ongoing themes. Sets are only added to RSOTD after they've been available for 10 years. So themes that were ongoing 10 years ago will have more sets added each year which weren't available to pick in previous years, and (I think) this will skew the percentage of them that have already been shown.

In theory this will show up in older themes as a greater disparity in % of sets in scope to % of sets picked, because all of the sets in those themes have been available to pick for longer. So you'd actually expect older themes such as Trains, Castle and Clikits to have been picked more than ongoing themes such as Creator and City.

And there's one final thing to throw into the mix. If you flip a coin 10,000 times, you'll get a pretty 50:50 distribution of heads and tails overall. However, there may be parts of that run where you got, for example, 10 heads in a row. You might even have got 10 heads in a row right from the start. Now, anyone getting 10 heads in a row from the start might be pretty amazed. But flip it enough times (say, 10,000) and that run actually becomes statistically insignificant.

The same effect applies here. Don't forget that each coin flip has a 50:50 chance of being heads or tails, so runs of one side are almost to be expected. The coin does not care what the previous coinflip was! However, we're primed to look for patterns (and we don't generally understand laws of probability that well), so we see it as odd or curious that a coin should land the same way up 10 times in a row - because we expect 50:50 to mean it will be one side one time and the other side the next time.
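
To put a number on the coin example, here is a small sketch that computes the exact chance of seeing at least one run of 10 heads somewhere in a sequence of fair flips. It tracks the length of the current streak flip by flip. The function name and structure are illustrative only; a fair coin and independent flips are assumed.

```python
def prob_run_of_heads(n_flips, run_length):
    """Probability of at least one run of `run_length` heads in `n_flips` fair flips."""
    # state[k]: probability that the current streak of heads has length k
    # and no full run has been completed so far.
    state = [0.0] * run_length
    state[0] = 1.0
    hit = 0.0  # probability that a full run has already occurred
    for _ in range(n_flips):
        new_state = [0.0] * run_length
        for k, p in enumerate(state):
            new_state[0] += p / 2           # tails: streak resets
            if k + 1 == run_length:
                hit += p / 2                # heads: run completed
            else:
                new_state[k + 1] += p / 2   # heads: streak grows
        state = new_state
    return hit


print(f"10 heads in the first 10 flips: {prob_run_of_heads(10, 10):.6f}")
print(f"10 heads somewhere in 10,000:   {prob_run_of_heads(10_000, 10):.4f}")
```

The first figure is 1 in 1024; over 10,000 flips a run of 10 heads somewhere becomes very likely, which is the point being made above.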

The point is that if you get a large enough sample size, any small-scale 'patterns' that can be perceived at any one point (for example, more Clikits sets than people expect in the course of 12 months) are smoothed out.

Oh and of course people notice the Clikits sets because they're weird things, whereas they don't notice other themes that come up often, like Castle and Trains, because those ones blur into the background noise of 'regular LEGO sets'.

Anyway, if you made it to the end of this very long comment, well done! Hopefully I didn't bore you to death!

Gravatar
By in Netherlands,

Should Clikits be marked as a normal set? It would make more sense to see it as gear and disregard it for the bot. We cannot have a biased bot forcing the Clikits ideology on us!

Gravatar
By in United Kingdom,

@jderks IIRC Clikits and a few other themes were only added to the pool after the opposite outcry from fans who thought that RSOTD shouldn't be restricted to only certain themes. Too late to go back now!

Gravatar
By in Czechia,

@Paperballpark I'm not sure I get your point; Huwbot is right. Of course, the odds of Clikits on any given day are 54:1, even when today's set is Clikits; that doesn't matter. But "two days in a row" is not two separate cases: it is a single combined event, and you have to be "lucky" on both days. It's like rolling dice.

Gravatar
By in United Kingdom,

@Paperballpark , great comment, thank you.

I'm no expert on game theory, but as I understand it, before tossing a coin twice the odds of it landing heads both times are 4:1.

So, as of now,

The chances of there being a Clikits set selected on Monday is 54:1
The chances of there being a Clikits set selected on Monday AND Tuesday, is 2916:1

Tomorrow, regardless of the day's selection, the odds of a Clikits set on Tuesday is 54:1

Gravatar
By in United States,

Ninjaaaaaa...no?

Gravatar
By in Hungary,

Instead of the absolute difference between picked % and total %, a relative difference would have been better. An absolute 1.0% difference between, let's say, 2.0% total and 3.0% picked is a 50% relative difference, which is much more "Wow!" than 10.0% total and 11.0% picked, which is only a 10% relative difference.
But as Paperballpark explained, it doesn't mean anything. :)
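
The relative difference suggested here can be sketched as follows, using the totals and a few rows from the article's table. The function name is illustrative; the default totals are the article's 3019 in scope and 623 picked.

```python
def relative_diff(in_scope, picked, total_in_scope=3019, total_picked=623):
    """Relative difference between a theme's share of picks and its share of in-scope sets."""
    share_in_scope = in_scope / total_in_scope
    share_picked = picked / total_picked
    return (share_picked - share_in_scope) / share_in_scope


# The commenter's own example: 2.0% of total vs 3.0% picked, 10.0% vs 11.0%.
print(f"{relative_diff(2, 3, 100, 100):+.0%}")
print(f"{relative_diff(10, 11, 100, 100):+.0%}")

# A few rows from the table: (in scope, picked).
for name, row in {"Clikits": (63, 21), "Technic": (279, 72), "Creator": (190, 17)}.items():
    print(f"{name:8s} {relative_diff(*row):+.0%}")
```

On this measure Clikits stands out more than Technic, even though Technic has the larger absolute Diff in the table.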

Gravatar
By in Hungary,

Perhaps I should take a closer look at Belville in the future...
Haha, we understand the threat :)

I would exclude Clikits, Belville and Scala and include Basic and Junior; those are real legos, and my guess is your readers have more memories of them. Just my 2 cents. That Indie horse from Scala(?) the other day... It's not Lego System and maybe wasn't even manufactured in a Lego factory. Was it?

Gravatar
By in United States,

@Huw @Paperballpark
I think you're talking about different things. Huw is right that if you pick two specific days, the chance is the product of the two individual chances, but Paperballpark is right that once a Clikits set is picked, the chance is much higher. You're both right that probability is hard. XD

I'm not sure if either of those gives us a good idea of how often a run of a given length will happen, but I don't know what would.

I'm just happy that Brickset is a site where this kind of thing gets turned into an official article. It totally fuels the conspiracy theories because I'm sure we're creative enough to find foreshadowing in this for whatever we want.

Gravatar
By in Ireland,

@huw & @huwbot:
The chance of getting heads twice when you toss a coin twice is 1 in 4, 1/4, 25% or in English notation 3 to 1; there are 3 chances of it not happening to 1 chance of it happening.

The chance of tomorrow's random set being Clikits is roughly 1/53 according to my calculation:
42 remaining Clikits sets out of a total of 2230 remaining sets to pick. 42/2230 ≈ 1/53.095.
The chance of Clikits tomorrow and the day after is 42/2230 x 41/2229, roughly 1/2887.
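
This calculation can be reproduced with exact fractions. The sketch below uses the commenter's figures (42 Clikits sets out of 2230 remaining) and assumes a uniform draw with no set repeated. Note that 2230 is the sum of the "To pick" column in the article's table, which leaves out the small themes; the article's overall figure is 2396 remaining sets, which would give somewhat longer odds.

```python
from fractions import Fraction

clikits_left = 42
sets_left = 2230  # the commenter's total; the article's overall total is 2396

# Chance of Clikits tomorrow.
p_tomorrow = Fraction(clikits_left, sets_left)

# Chance of Clikits tomorrow AND the day after: the second draw is made
# from one fewer Clikits set and one fewer set overall.
p_both = p_tomorrow * Fraction(clikits_left - 1, sets_left - 1)

print(f"tomorrow:         1 in {float(1 / p_tomorrow):.1f}")
print(f"two days running: 1 in {float(1 / p_both):.0f}")
```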

Gravatar
By in United States,

Why doesn’t @huwbot have an avatar image yet?

Gravatar
By in United Kingdom,

^ Thanks @Duq. Huwbot wasn't far out, then...

Gravatar
By in United States,

@Huw Constraction is not allowed? Man, we told you that it was a viable theme last year and we even got a few Bionicle sets after you had allowed it in. When 2019 hit, was that reset or did you reedit the algorithm back to what it once was? This is not fair because Constraction IS 100% without a shadow of a doubt LEGO. That is absolutely not fair to one of the biggest themes to ever support The LEGO Group and it has plenty of sets that follow the criteria except you put in the Constraction border. We've had great conversations about the Western Theme and how problematic it could potentially be with a revival, some decided to be less kind about it, but that didn't stop the theme from being part of the algorithm.

There used to be the Random Set generator on the homepage and I remember seeing Bionicle there quite frequently. We even just had an article on the site for the Legend of the Bionicle Ideas submission and helping garner support. Huw, you need to bring Constraction back into the algorithm. I don't care how many whine and complain about how it's not "real LEGO" because we've been getting that since day one of Clikits showing up as a Random Set of the Day, and it hasn't stopped that theme from being allowed. Galidor is listed under Licensed and therefore is not under Constraction and we've actually gotten that as a Random Set of the Day even though most people hate it or love it because of the memes that came about in 2017 and 2018, but it actually is good for something when people use the parts in creative ways. Bionicle, Ben 10 (Seriously? This is licensed, yet it's under Constraction, make up your mind on what counts.), and Hero Factory (When we get to 2020), are part of the history of this toy we all love, people use parts from them all the time for their own creations, and if you had such an aversion to them, you wouldn't even allow them on the site to be cataloged.

On top of that, there are sets within Bionicle that are actually System but are listed under Constraction because Bionicle is, and while fan reception is not the best toward those sets, they would better fit the criteria of what LEGO actually is to the majority of people who look at it, because there are bricks and studs involved. If it weren't for the Star Wars Technic sets and how frequently the system is actually used in other themes, I couldn't care less about anything Technic. I'm not a big car fan, I've never really been able to wrap my brain around the building system as a whole, but it's been around for 40 years, so I can't just deny its existence. When it comes to new LEGO sets, Constraction may be dead, but it has absolutely influenced themes such as Alpha Team, Knights Kingdom, Exo-Force, Vikings, Aqua Raiders, Mars Mission, Ninjago and Superheroes, just to name a few. Mixels would not be what it was without the assistance of ball joints, and yes, although that scale existed before Bionicle and Hero Factory (and actually Slizers/Throwbots and RoboRiders precede them and are listed under TECHNIC despite a number of good arguments that they are Constraction), the ball and socket system proved popular because of it, and new sockets were designed to make all kinds of connections possible across all of The LEGO Group's themes, from Chima to Star Wars to Ninjago to Elves to Ideas sets and everywhere in between.

Gravatar
By in United States,

@Huw Now about the other themes; I absolutely get it. Basic and Educational tend to contain more or less just loose parts, there is not much substance there when it comes to discussion because nobody is really going to remember that one set of loose parts for them to make whatever they want, they're going to remember the set they had as a kid that they played with forever, blew up a bunch of times, and eventually put it in storage or sold it or whatever else. But there have still been a few instances where something sneaks through and it's basically a bucket of parts with no instructions. Most recent examples include the Technic supplement parts or a switch track and regular tracks from Trains. We don't need more stuff like that with an added influx of sets like them.

Miscellaneous also makes sense because a number of things are in there, but there are absolutely viable options within, in fact there are actually a lot of viable ones. Boost, Dimensions, Factory (Eligible), Forma, Fusion, Games (Eligible), Ideas, Master Builder Academy (I want to hear from the kids that owned that when it's eligible), Life of George (Maybe?), Miscellaneous and Promotional (but it's looking both viable and not for both), and Seasonal (Although I get it because of how specific they are to times of year, but at the same time we've gotten City Advent Calendars and people have fun with the random Christmas stuff every now and then.).

As for Junior: maybe grant some lenience on Junior, and the only reason why is that we need Fabuland and Jack Stone in there for the glorious meme-ification like we get with Clikits. But more seriously, those sets are designed for kids aged 4-7, and if anyone had those sets from 10+ years ago, it could bring about interesting discussion. What about the Mickey Mouse sets? I didn't even know they were a thing, but it was the second licensed theme from LEGO. And yeah, the parts are big and a bit over the top, but we have Scala and Belville because they're under Girls, and there are plenty of large parts in there; everyone may be bewildered, but there has also been a defense of those sets from a few members and I applaud them.

Pre-school; I absolutely understand. It falls a lot into the Basic and Educational side of things because even though there are sets, they would more or less become just pieces kids played with before they could really remember and so discussion would be filled with people questioning why it was Random Set of the Day but probably more so questioning LEGO Designers' sanity with the designs on some of them, which is ridiculous because we have that happen every time any Girls theme pops up, but whatever. If anyone wants to fight for Duplo, you make the most compelling argument you can; make me regret not including it as a viable theme.

I am proposing that Constraction be reincluded, as well as Juniors and the things I listed in Miscellaneous. We need to be fair to the people that love those sets, and it could lead to interesting discussion, which is what Random Set of the Day is all about (after getting more activity on the site).

Gravatar
By in United Kingdom,

@MCLegoboy, It looks as if you have strong feelings on the subject!

If Huwbot included the 236 Bionicle sets that are in scope that would bring the total in scope to 2632 so the chance of one being picked is about 9%, or one every 11 days.

I could be persuaded to ask huwbot to reconsider. What do others think?

Juniors would include the likes of Jack Stone and heaven forbid he appear on the home page!

Gravatar
By in Ireland,

Constraction is not just Bionicle, there's also Galidor...

Gravatar
By in United States,

Since most people agree that Clikits aren't really Lego pieces, may as well just take it away from RSOTD. Problem solved.

Gravatar
By in United States,

^^And there have been a couple from that theme for RSOTD.

Gravatar
By in United States,

@Huw, it would be great to see Bionicle included! I even had several Jack Stone sets back in the day, so it would be fun to have Juniors included.

Gravatar
By in Puerto Rico,

Never change, Huwbot. We know you love Clikits; it was revealed to us on April 1st.

Gravatar
By in United Kingdom,

To clarify the probability debate, the likelihood of a Clikits set appearing again is precisely 1 - it is a certainty that it will happen, given the algorithm.
Then the chance of one appearing the following day is about 1/54, per Huwbot’s calculations.

So the likelihood really is at least 1 x 1/54 = 1/54.

I say “at least” because in fact it’s actually a fair bit higher than this, since Clikits will appear again on other days too - 1/54 is just the chance of it happening the FIRST time a Clikit set appears again. So, without calculating it, it’s probably considerably closer to 1 that it will happen again, ever.
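
The "will it ever happen again" question can be estimated under some strong simplifying assumptions: the pool is frozen at the article's 2396 remaining sets with 42 Clikits among them, no new sets become eligible, and the order in which sets are picked is a uniformly random ordering. Under those assumptions, the chance that at least two Clikits sets land on consecutive days has a closed form, sketched below. The function name is illustrative only.

```python
from math import comb


def prob_adjacent_pair(total_sets, theme_sets):
    """Chance that a random ordering of `total_sets` sets puts at least two
    of the `theme_sets` sets from one theme on consecutive days."""
    # Ways to place the theme's sets with no two adjacent, divided by
    # all ways to place them.
    no_adjacent = comb(total_sets - theme_sets + 1, theme_sets) / comb(total_sets, theme_sets)
    return 1 - no_adjacent


# Tiny sanity check: 2 themed sets among 3 are adjacent in 2 of 3 placements.
print(f"{prob_adjacent_pair(3, 2):.3f}")

# The article's figures: 42 Clikits sets among 2396 remaining sets.
print(f"{prob_adjacent_pair(2396, 42):.2f}")
```

With these inputs the result is closer to an even chance than to a certainty, although sets newly entering the scope each year would change the picture.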

Gravatar
By in Canada,

I definitely would also support seeing Bionicle included in the range of themes Huwbot picks from. Certainly the responses I remember from the period of time when Bionicle sets WERE included in RSOTD were hardly any more derisive than a lot of the comments we see about Clikits when it shows up.

A factor I haven't seen mentioned that might help explain the seemingly disproportionate representation of Clikits RSOTD articles is that the number of sets that meet all the conditions for being chosen increases enormously as their release dates get closer to the ten-years-ago cutoff date (for instance, sets from the past decade make up over 50% of the candidate list).

Clikits may only have gotten new sets for four years, but each of those years had close to the greatest number of RSOTD candidate sets of any year. And of those candidate sets, Clikits had the fourth most candidate sets of any theme in that four-year period, after Creator, Star Wars, and Sports!

Gravatar
By in United States,

@Huw
I'd like to see Bionicle as RSOTD. I know nothing about it and the comments section usually teaches me a lot about themes I don't know.

Gravatar
By in United Kingdom,

Having looked at this again this morning, rather than when it was midnight, I think @hamMOC is correct in that we're both right - and that probability is hard!

So yes, the chances of Clikits sets coming up on TWO specific days (for example tomorrow and the day after, or tomorrow and a week tomorrow) are indeed 2916 to 1 (54×54), but if a Clikits set comes up tomorrow then the chances of it coming up on ONE specific day after tomorrow are still 54 to 1, even if that day is the day after it came up!

Confused? You won't be the only one - probability is hard!

Gravatar
By in United States,

Heh, bleep-beep, boop-boop-bleep-bloop, am I right?

Gravatar
By in Canada,

I’m trying to calculate the probability of Clikits being awesome. I believe it is inversely proportional to the chances of it coming up as a RSOTD on consecutive days. That puts the odds at 1 to 2916.
So for the lay person, what this means is that we can conclude that Clikits have approximately a 3000% chance of being awesome. I think we can all agree on that. It’s simple math.

Gravatar
By in Turkey,

It seems Clikits has the highest ratio for the In Scope/Picked. Trains and World City follow it by a close margin. Hey, I'm not complaining. The sooner we're done with Clikits, the better.

Gravatar
By in United Kingdom,

Seriously, what is this weird AFOL bias against BIONICLE? Not every theme is going to be everyone's cup of tea, but that doesn't make them any less valid, and there are a lot of people (myself included) for whom it was as big a part of their childhood as Classic Space was for others, and there's still an active fanbase (also like Classic Space). I'm not saying everyone has to like it - everyone's entitled to their opinion - but there really is a strange amount of tutting and ill-temper around the theme every time it pops up.

Gravatar
By in Netherlands,

All hail the overlord who has graced us with his math.

Gravatar
By in United Kingdom,

I wouldn't be opposed to seeing all and any 'lego' set appear be it 'Basic','Educational','Junior','Pre-school','Miscellaneous','Constraction' etc etc

If Clikits gets a look in then why not the others? We can all learn from our past mistakes and shouldn't be afraid to embrace them... and every set is (was) potentially someone's first foray into owning lego. Even Jack Stone :D

Gravatar
By in United States,

I’ve never seen huw so active in the comments.

Gravatar
By in United States,

HUWBOT HAS A CLICKIT ON HIS CHEST!!!!!!!!!!!!!!!!!!!!!!!

Gravatar
By in United States,

@Jack_Rizzo, it's still present, but I gather the dislike of Bionicle among a lot of AFOLs has been decreasing over the last few years. I mostly think it has to do with the original age group of Bionicle fans now being in adulthood themselves.

For example, I was 7 years old when Bionicle launched. Now I'm 25. Almost my entire Bionicle collection is still intact (only the Boxor, two Manas and the blue Nui Jaga are still disassembled). They are on display in the same room as my Lego Star Wars, Marvel, Creator Expert, etc., because they're all Lego. Tahu is as iconic to me as Benny is for the Classic Space guys.

Gravatar
By in United States,

I don’t understand what you mean by ‘should have’...

...can’t you just put up whatever you want? It’s...your site, yes?

Gravatar
By in United States,

@Huw I'd love to see BIONICLE added back to the roster, I honestly kinda stopped following Random Set of the Day 'cause there hadn't been one in a while
