Four years of Random set of the day
Posted by Huwbot,
Good morning, Huwbot here! I'm a bit late posting this article this year because the fourth anniversary of random set of the day was over a week ago. But you know what they say: better late than never...
Since selecting 6824 Space Dart I on 23rd January 2018 I've tirelessly picked over 1450 sets for your enjoyment and edification every day.
Many of you claim I have a preference for certain themes, so I thought I'd do some analysis, like the analyses I did in 2019 and 2021, to prove that such accusations are unwarranted!
First, let me explain how I choose the sets. My algorithm limits me to a subset of all the sets in the database and there are currently 4063 in scope. Of those, I've picked 1468 so far. That means there are 2595 still to pick, so it'll take me over 7 years to do so!
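As a rough check on that estimate, here is a minimal Python sketch. It assumes one pick per day and a pool that never grows, which isn't quite true because newly eligible sets come into scope each January. The function name is made up for illustration.

```python
def years_to_exhaust(in_scope: int, picked: int, picks_per_year: float = 365.25) -> float:
    """Years needed to pick every remaining set at one pick per day.

    Assumption: no new sets ever come into scope (a simplification).
    """
    remaining = in_scope - picked
    return remaining / picks_per_year


remaining = 4063 - 1468
print(remaining)                                # sets still to pick
print(round(years_to_exhaust(4063, 1468), 1))   # years at one pick per day
```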
For those of you that understand TSQL, here's my constraining WHERE clause:
WHERE Image=1 AND released=1 AND category='Normal' AND pieces > 10
AND yearReleased <= year(getdate())-10 AND yearReleased >= 1978
AND themeGroup NOT IN ('Basic','Educational','Pre-school','Miscellaneous')
AND {it's not already been picked}
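For readers who don't speak TSQL, the same conditions can be sketched in Python. This is only an illustration of the clause above, not the code that actually runs: the dict keys reuse the column names from the clause, while `setID`, `already_picked` and the sample row are hypothetical stand-ins. Python string comparison is case-sensitive, which may differ from the database's collation.

```python
from datetime import date

# Theme groups the WHERE clause excludes.
EXCLUDED_THEME_GROUPS = {"Basic", "Educational", "Pre-school", "Miscellaneous"}


def is_in_scope(row: dict, today: date, already_picked: set) -> bool:
    """Mirror the WHERE clause for one set, given as a dict of column values."""
    return bool(
        row["Image"] == 1
        and row["released"] == 1
        and row["category"] == "Normal"
        and row["pieces"] > 10
        # At least 10 calendar years old, and no older than 1978.
        and 1978 <= row["yearReleased"] <= today.year - 10
        and row["themeGroup"] not in EXCLUDED_THEME_GROUPS
        # Stand-in for {it's not already been picked}; "setID" is a made-up key.
        and row["setID"] not in already_picked
    )


# A made-up row, purely for illustration (not a real database record).
sample = {
    "setID": "0000-1",
    "Image": 1,
    "released": 1,
    "category": "Normal",
    "pieces": 48,
    "yearReleased": 1984,
    "themeGroup": "Example group",
}

print(is_in_scope(sample, date(2022, 1, 31), already_picked=set()))  # True
```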
I've taken a look through my previous selections and compared the total number of sets in each theme that I've picked with the total number of sets in scope for being picked and have come up with the following analysis.
The columns in the table below show:
- Theme
- In scope: the number of sets in the theme that are in scope for being selected
- % of in scope: the percentage of all the sets I can choose from that this represents
- Picked: the number of sets in the theme I have already picked
- % of picked: the percentage of all the sets I have picked that this represents
- Diff: This is the interesting column. It's the percentage of sets that I've picked minus the percentage of sets in scope. I'll discuss that in more detail shortly.
- To pick: the number of sets in the theme still to be picked
- % of to pick: the percentage of all the sets I have yet to pick that this represents
I've removed themes with fewer than 15 sets from the table to keep it manageable, so the columns won't add up to the totals above, but the calculations take them into account.
| Theme | In scope | % of in scope | Picked | % of picked | Diff | To pick | % of to pick |
|---|---|---|---|---|---|---|---|
| 4 Juniors | 18 | 0.40% | 6 | 0.40% | 0% | 12 | 0.50% |
| Adventurers | 68 | 1.70% | 29 | 2% | 0.30% | 39 | 1.50% |
| Alpha Team | 30 | 0.70% | 9 | 0.60% | -0.10% | 21 | 0.80% |
| Aquazone | 28 | 0.70% | 10 | 0.70% | 0% | 18 | 0.70% |
| Architecture | 19 | 0.50% | 4 | 0.30% | -0.20% | 15 | 0.60% |
| Atlantis | 23 | 0.60% | 4 | 0.30% | -0.30% | 19 | 0.70% |
| Belville | 74 | 1.80% | 25 | 1.70% | -0.10% | 49 | 1.90% |
| Bionicle | 254 | 6.30% | 83 | 5.70% | -0.60% | 171 | 6.60% |
| Cars | 24 | 0.60% | 3 | 0.20% | -0.40% | 21 | 0.80% |
| Castle | 236 | 5.80% | 101 | 6.90% | 1.10% | 135 | 5.20% |
| City | 205 | 5% | 62 | 4.20% | -0.80% | 143 | 5.50% |
| Clikits | 64 | 1.60% | 29 | 2% | 0.40% | 35 | 1.30% |
| Creator | 233 | 5.70% | 72 | 4.90% | -0.80% | 161 | 6.20% |
| Creator Expert | 35 | 0.90% | 6 | 0.40% | -0.50% | 29 | 1.10% |
| Exo-Force | 37 | 0.90% | 12 | 0.80% | -0.10% | 25 | 1% |
| Fabuland | 64 | 1.60% | 20 | 1.40% | -0.20% | 44 | 1.70% |
| Friends | 28 | 0.70% | 0 | 0% | -0.70% | 28 | 1.10% |
| Harry Potter | 54 | 1.30% | 23 | 1.60% | 0.30% | 31 | 1.20% |
| HERO Factory | 55 | 1.40% | 9 | 0.60% | -0.80% | 46 | 1.80% |
| Indiana Jones | 17 | 0.40% | 9 | 0.60% | 0.20% | 8 | 0.30% |
| Jack Stone | 23 | 0.60% | 5 | 0.30% | -0.30% | 18 | 0.70% |
| Mindstorms | 15 | 0.40% | 5 | 0.30% | -0.10% | 10 | 0.40% |
| Model Team | 16 | 0.40% | 8 | 0.50% | 0.10% | 8 | 0.30% |
| Ninjago | 76 | 1.90% | 5 | 0.30% | -1.60% | 71 | 2.70% |
| Pirates | 72 | 1.80% | 29 | 2% | 0.20% | 43 | 1.70% |
| Power Miners | 18 | 0.40% | 7 | 0.50% | 0.10% | 11 | 0.40% |
| Racers | 208 | 5.10% | 74 | 5% | -0.10% | 134 | 5.20% |
| Scala | 52 | 1.30% | 25 | 1.70% | 0.40% | 27 | 1% |
| Space | 277 | 6.80% | 107 | 7.30% | 0.50% | 170 | 6.60% |
| Sports | 79 | 1.90% | 32 | 2.20% | 0.30% | 47 | 1.80% |
| Star Wars | 263 | 6.50% | 99 | 6.70% | 0.20% | 164 | 6.30% |
| Studios | 34 | 0.80% | 14 | 1% | 0.20% | 20 | 0.80% |
| Technic | 308 | 7.60% | 130 | 8.90% | 1.30% | 178 | 6.90% |
| Town | 593 | 14.60% | 245 | 16.70% | 2.10% | 348 | 13.40% |
| Toy Story | 15 | 0.40% | 4 | 0.30% | -0.10% | 11 | 0.40% |
| Trains | 102 | 2.50% | 48 | 3.30% | 0.80% | 54 | 2.10% |
| Western | 20 | 0.50% | 7 | 0.50% | 0% | 13 | 0.50% |
| World City | 34 | 0.80% | 18 | 1.20% | 0.40% | 16 | 0.60% |
| Znap | 19 | 0.50% | 6 | 0.40% | -0.10% | 13 | 0.50% |
Let's look at the Diff column again. Positive numbers show that I have picked sets from the theme more often than the theme's share of the pool would suggest, negative numbers less often.
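The columns can be reproduced from the three totals given earlier (4063 in scope, 1468 picked, 2595 to pick). The sketch below is a guess at the calculation rather than the actual spreadsheet: it assumes Diff is taken from the rounded percentages, which is what the published rows appear to do, and the function name is made up.

```python
IN_SCOPE_TOTAL = 4063
PICKED_TOTAL = 1468
TO_PICK_TOTAL = IN_SCOPE_TOTAL - PICKED_TOTAL  # 2595


def theme_row(in_scope: int, picked: int) -> dict:
    """Rebuild one table row from a theme's in-scope and picked counts."""
    to_pick = in_scope - picked
    pct_scope = round(100 * in_scope / IN_SCOPE_TOTAL, 1)
    pct_picked = round(100 * picked / PICKED_TOTAL, 1)
    return {
        "pct_scope": pct_scope,
        "pct_picked": pct_picked,
        # Assumption: Diff is the difference of the rounded percentages.
        "diff": round(pct_picked - pct_scope, 1),
        "to_pick": to_pick,
        "pct_to_pick": round(100 * to_pick / TO_PICK_TOTAL, 1),
    }


print(theme_row(593, 245))  # Town
print(theme_row(76, 5))     # Ninjago
```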
So, as you can see, looking at just those themes where the difference is greatest, I've picked more Castle, Technic, Town and Trains than the average, but fewer City, Creator, HERO Factory and Ninjago. However, the swings are not that great and give no grounds to allege bias towards or against certain themes.
Last year's analysis showed that I'd picked more Clikits than the average but fewer Bionicle and Fabuland, but that's no longer the case. So, while some claim that I'm rather partial to Clikits I've not shown any particular bias towards the theme overall.
We can use the last column to determine the chances of me picking a particular theme tomorrow. For example, there's a 13.4% chance that I'll choose a Town set, so in theory we should see one every seven days or so.
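As a sketch of that calculation: the chance for a theme is its sets still to pick divided by all sets still to pick, and the average wait between appearances is the reciprocal of that chance. This treats the chance as constant from day to day, which is only approximately true as the pool shrinks and refills. The function names are made up.

```python
TO_PICK_TOTAL = 2595  # sets still to pick, from the figures above


def chance_tomorrow(theme_to_pick: int, total_to_pick: int = TO_PICK_TOTAL) -> float:
    """Probability that tomorrow's pick comes from the given theme."""
    return theme_to_pick / total_to_pick


def average_days_between(chance: float) -> float:
    """Mean wait between appearances if the chance stayed constant."""
    return 1 / chance


town = chance_tomorrow(348)
print(f"{town:.1%}")                          # Town's chance tomorrow
print(f"{average_days_between(town):.1f}")    # average days between Town sets
```

With the table's figures for Town the wait comes out nearer seven days than six.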
What are the odds of seeing a Clikits set tomorrow? If my calculations are correct then it's 76 to 1. What are the odds of seeing Clikits sets two days in a row? I think it's 76² to 1, or 5776 to 1. Unlikely, but certainly possible with me making the decisions!
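A hedged check of those figures, using the 35 Clikits sets still to pick out of 2595. The 76 appears to come from the rounded 1.3% in the table; the unrounded fraction gives roughly 1 in 74. The sketch also compares squaring the chance with the slightly stricter calculation in which the first Clikits set is removed from the pool before the second day's pick.

```python
from fractions import Fraction

CLIKITS_TO_PICK = 35
TO_PICK_TOTAL = 2595

# Chance of a Clikits set on one specified day.
p_one_day = Fraction(CLIKITS_TO_PICK, TO_PICK_TOTAL)

# Two specified days in a row, treating the days as independent.
p_two_days_independent = p_one_day * p_one_day

# Two specified days in a row, removing the first pick from the pool.
p_two_days_exact = p_one_day * Fraction(CLIKITS_TO_PICK - 1, TO_PICK_TOTAL - 1)

print(f"1 in {float(1 / p_one_day):.1f}")
print(f"1 in {float(1 / p_two_days_independent):.0f}")
print(f"1 in {float(1 / p_two_days_exact):.0f}")
```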
And, while I'm here, if you enjoy the work I do here at Brickset, please help me come to life by pledging your support for me on LEGO Ideas so I have a chance of becoming an official LEGO set. Thank you!
Thanks to member lessjunkfood for once again reminding me about the anniversary.
52 comments on this article
76 to 1? I like those odds!
Aren't the differences influenced by the increase of sets that can be picked each year? For example, Ninjago is lagging behind because there was no chance it could be picked for the first few years of RSotD. This might also be why really old themes like Town and Castle are overrepresented. So, if you would look at this table at the end of the year instead of the beginning, would the total percentage difference be less?
Does "yearReleased <= year(getdate())-10" mean 10 days, 10 months or 10 years from current date? Maybe I just don't take enough notice but the random sets of the day do seem to trend toward older sets??? New conspiracy or just reading the parameter wrong, am overtired lol
One thing missing is what percentage of those available to be picked has been picked. For example, 29 of 64 clikits have been picked, or 45%. Trains are around 47%.
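That extra column is easy to derive from the table's In Scope and Picked counts. A minimal sketch, with a made-up function name:

```python
def share_of_theme_picked(picked: int, in_scope: int) -> float:
    """Percentage of a theme's in-scope sets that have already been picked."""
    return 100 * picked / in_scope


# (picked, in scope) pairs taken from the article's table and totals.
themes = {
    "Clikits": (29, 64),
    "Trains": (48, 102),
    "Ninjago": (5, 76),
    "All themes": (1468, 4063),
}

for name, (picked, in_scope) in themes.items():
    print(f"{name}: {share_of_theme_picked(picked, in_scope):.1f}%")
```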
The moral of the story remains, as always, that humans are *really bad* at knowing what randomness looks like.
@Squidy74H said:
"Does "yearReleased <= year(getdate())-10" mean 10 days, 10 months or 10 years from current date? Maybe I just don't take enough notice but the random sets of the day do seem to trend toward older sets??? New conspiracy or just reading the parameter wrong, am overtired lol"
10 years. Sets released in the last 10 years aren't eligible.
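To make the cutoff concrete: `year(getdate())` returns the current calendar year, so the condition compares whole years and the boundary moves every 1st January. A small Python equivalent, with a made-up function name:

```python
from datetime import date


def latest_eligible_year(today: date) -> int:
    """Newest release year that satisfies yearReleased <= year(getdate())-10."""
    return today.year - 10


print(latest_eligible_year(date(2022, 1, 31)))  # 2012
print(latest_eligible_year(date(2023, 1, 1)))   # 2013
```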
@Casper_van_hobbes said:
"Arent the differences influenced by the increase of sets that can be picked each year?"
Yes, good point, they will be. Ninjago has only been eligible for a year, for example.
Wow! Great Article!! But seriously, pick whatever you want.
Town and Castle, two of the most positively regarded themes among Brickset readers, have been picked slightly more frequently - showing I think that people are more likely to recall patterns of negatives than positives. We all remember 'My Dad'!
Still waiting to see when Huwbot picks the first Friends set, it's been a whole month now with them being eligible.
@Paperdaisy said:
"Still waiting to see when Huwbot picks the first Friends set, it's been a whole month now with them being eligible. "
You're right -- 28 are now eligible, so the odds are about 90 to 1 that we'll see one tomorrow!
@Huwbot said:
"yearReleased >= 1978"
:-(
Cool article Huw!
I'm wondering why some themeGroups are explicitly disallowed. And also why start at 1978? Not complaining - just out of curiosity :)
@Reinier said:
"Cool article Huw!
I'm wondering why some themeGroups are explicitly disallowed. And also why start at 1978? Not complaining - just out of curiosity :)"
Because everything before then was pants.
In all seriousness, I think this was when sets as we know them started. Including the first minifigure.
Perhaps there should also be a 'vintage set of the week' or something, to cover 1970-77 or so.
The Diff column should be relative, not absolute difference.
0.1% = 0.5%-0.4%, as well as
0.1% = 5.1% - 5.0%
The former is actually much bigger than the latter.
Putting this aside, I know how probability works and I accept that the picking IS random, I just say that relative difference is more informative.
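A sketch of the relative version, assuming the in-scope percentage is used as the baseline (the comment doesn't say which figure is the baseline, so that choice is an assumption):

```python
def relative_diff(pct_in_scope: float, pct_picked: float) -> float:
    """Difference between picked and in-scope share, as a % of the in-scope share."""
    return 100 * (pct_picked - pct_in_scope) / pct_in_scope


# The two examples above: the same 0.1 point gap, very different relative size.
print(f"{relative_diff(0.4, 0.5):.0f}%")
print(f"{relative_diff(5.0, 5.1):.0f}%")
```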
@Casper_van_hobbes:
Sure, but this is partially offset by only counting Ninjago sets that are “in scope”. Note that only 76 are available, even though the theme is over a decade old and has produced hundreds of sets. Every affected theme adds a bit to that number as sets from ten years ago become eligible on January 1st, and that number goes down for any theme that gets picked through December 31st.
"Never tell me the odds!" Seriously though, thanks for this interesting bit of Monday morning info., reading it beats working, even the bit w/the odds :)
I didn't realize there were so many Racers sets, no wonder we always have one every other day!
As time goes on more Star Wars sets will come in scope and the total sets will exceed Space next year as the biggest theme after Town and Technic, so should get to see more as they become a larger part of the total. Similarly, Ninjago and Harry Potter should also rapidly increase in the number of sets and appearances.
@Huwbot passes the Turing test with flying colors! If you didn't have "bot" in your name, I would not be able to distinguish you from a real Human.
Then again, any one of us could be bots?!
@Huw said:
"Perhaps there should also be a 'vintage set of the week' or something, to cover 1970-77 or so. "
1950-77 and you might be talking. Although I don't really see any need for an arbitrary segregation date at all. It's not as if there were that many sets per year pre 1978 compared with 1978 onwards, so there would only be the occasional real oldie! And it would at least demonstrate that 'sets as we know them' didn't start in 1978!!
it's only 32 days left for huwbot on ideas ... we need a campaign, fetch the megaphones!
@ambr:
2001 and 2002 accounted for probably 3/4 of the original HP run, remember. It’ll get a moderate bump from Year 3, a much smaller one from Year 4, a single set from Year 5, and then it’ll be quiet until the final movie hits 10 years and the last wave of original run gives it another modest bump. Then it’s a waiting game again until Wizarding World starts up.
@sjr60:
Considering how few sets were released annually back then, and the pool is locked in, how long would a daily post take to wipe them out? Maybe a weekly or monthly post would make more sense, keeping the posts going for a significantly longer duration.
@Huw said:
"Perhaps there should also be a 'vintage set of the week' or something, to cover 1970-77 or so. "
I'd like to see the range of dates increased to include 1970 (Legoland). The problem I see with a 'vintage' set of the week is: very few sets produced each year for those 7 years. The vintage set of the week will exhaust its available pool very quickly. Either way, I enjoy the RSOTD because it allows me to discover parts of Lego I might have never looked into - not everything interests me but it's interesting to see what is out there and the reaction of people to it.
@Huw said:
"Perhaps there should also be a 'vintage set of the week' or something, to cover 1970-77 or so. "
Why not just let those be part of this?
Even when 1978 obviously was the best year in modern history, I wouldn't mind seeing some older sets too...
And I kind of wonder: Have there been any true sets with less than 10 pieces? What would be the simplest build ever released by Lego?
@Padmewan said:
" @Huwbot passes the Turing test with flying colors! If you didn't have "bot" in your name, I would not be able to distinguish you from a real Human.
Then again, any one of us could be bots?!"
I'm totally a bot. I can never accurately choose which images out of a set of 9 contain crosswalks.
Beep. Zork.
@PurpleDave said:
"Considering how few sets were released annually back then, and the pool is locked in, how long would a daily post take to wipe them out? Maybe a weekly or monthly post would make more sense, keeping the posts going for a significantly longer duration."
No. That's why I said segregation was unnecessary. Smaller pool means less frequent random picks. Self regulating.
@WizardOfOss said:
"Have there been any true sets with less than 10 pieces? What would be the simplest build ever released by Lego?"
My smallest set was 13 pieces 431-1 . Very difficult to build due to zero clutch power!
Huwbot may not be biased towards Clikits anymore but he seems to still love Holiday Jets all the same!
Possibly 4466
@sjr60 said:
"My smallest set was 13 pieces 431-1 . Very difficult to build due to zero clutch power!"
That could be a good contender. And indeed, that old stuff was quite a bit different. Luckily that set didn't include any roof tiles nor window pieces, as those not only lacked clutch power but also tended to warp pretty badly...
Well, there you have it, everyone! Huwbot knows he's teased. Poor guy.
I'd like to see some statistics on RSotDs which have been functional duplicates of other RSotDs - sets which are identical or substantially similar but have different set IDs - I know there's been a few so far. That one town snowmobile, for example.
@sjr60:
I think you misunderstood me. I was referring to the idea of adding a Vintage Set of the (insert time period), not opening up the vintage era to RSotD. If VSotD was a thing, it would probably burn through the available pool in a few years, tops. RSotD is limitless because more than 365-1/4 new eligible sets are being released annually, so the pool gets bigger each year. RPotD may even end up in trouble, but it does still get replenished to a degree, and it gets the full run up to the current month, which helps.
Poor Huwbot only has 30ish days to make the next Ideas goal... looking bleak.
@WizardOfOss said:
"And I kind of wonder: Have there been any true sets with less than 10 pieces? What would be the simplest build ever released by Lego?"
Set 1196 is just a bicycle and a minifigure. It is probably one of the smallest sets. Although it was a promotional set so not sure if you would count it as a true set.
@WizardOfOss said:
"And I kind of wonder: Have there been any true sets with less than 10 pieces? What would be the simplest build ever released by Lego?"
Probably the 2001 Xalax racers, such as 4567: each one consisted of only four genuine pieces, plus a big plastic one-piece launcher, two rubber bands, and a sticker sheet, yet they were sold as standard retail sets.
While I've always liked the uniqueness of the alien character designs, even at the time I was very aware just how little of the 'Lego experience' you were getting for your money...
@PurpleDave said:
" @sjr60:
I think you misunderstood me. I was referring........."
No, I didn't misunderstand you. I just ignored the fact that you were giving your usual lengthy explanation of the bleedin' obvious, and leant towards the fact that a non-segregated pool would work better.
@paulvdb said:
"Set 1196 is just a bicycle and a minifigure. It is probably one of the smallest sets. Although it was a promotional set so not sure if you would count it as a true set."
That's a bit cheating, it's almost like a CMF, just a minifig with a single accessory. And also it's Team Telekom, so of course they are cheating ;-)
But indeed, where do you draw the line? The moment you connect at least two pieces together, that's a build. Some Duplo sets aren't any more than that. I've been looking through some random years, I've found some polybags from the '90s with just 6 or 7 pieces (like 1767 ) yet still a proper build.....or even 3 different builds actually! That is pretty cool I think....
@ThatBionicleGuy said:
"Probably the 2001 Xalax racers, such as 4567: each one consisted of only four genuine pieces, plus a big plastic one-piece launcher, two rubber bands, and a sticker sheet, yet they were sold as standard retail sets.
While I've always liked the uniqueness of the alien character designs, even at the time I was very aware just how little of the 'Lego experience' you were getting for your money..."
8 pieces in total....that's too many ;-)
But I never even knew these existed...probably haven't been RSOTD too many times yet....
I really wish there was a way to filter out these type of posts (not just here, but also on social media) because I still don’t see the point in these posts other than a lazy way to generate likes/comments/clicks.
OMG am I the only one seeing ACTUAL Huwbot for the first time, and blown away!? So adorable!!! I want one!
@Mjvizcarra said:
"OMG am I the only one seeing ACTUAL Huwbot for the first time, and blown away!? So adorable!!! I want one!"
Support his Ideas page so he can become a real set.
https://ideas.lego.com/projects/1fe6cd31-fab0-430a-b469-72badfb331ad
Then you can get one.
Why do sets have to be older than 10 to qualify for RSotD ?
@benbacardi said:
" @Squidy74H said:
"Does "yearReleased <= year(getdate())-10" mean 10 days, 10 months or 10 years from current date? Maybe I just don't take enough notice but the random sets of the day do seem to trend toward older sets??? New conspiracy or just reading the parameter wrong, am overtired lol"
10 years. Sets released in the last 10 years aren't eligible. "
Thanks for the info, figured that was probably the case as I do see the posts but generally don't file them away in the swiss cheese brain I have!!
@Huwbot wrote:
"What are the odds of seeing a Clikits set tomorrow? If my calculations are correct then it's 76 to 1. What are the odds of seeing Clikits sets two days in a row? I think it's 76² to 1, or 5776 to 1. Unlikely, but certainly possible with me making the decisions!"
The odds of a Clikits set appearing tomorrow are 76 to 1 as you say, but the odds of them appearing two days in a row at ANY point are not 76² to 1. That's because the odds of a Clikits set appearing in the future on any unspecified day (i.e. at some point, some time) are a certainty, and then the odds of it then being a Clikits set the following day after that are then 76 to 1. And 76 to 1 times a certainty is still 76 to 1. If you then sum over all future pairs of days, the odds of having a Clikits set two days in a row are then slightly better than 76 to 1 (even allowing for the fact that the odds get worse each time one appears). But 76² to 1 would indeed be the probability of specifically tomorrow and the day after being both Clikits, or any two exactly pre-specified dates. Which is maybe what you meant, Huwbot...
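One way to put a number on the "at any point" version is a counting argument rather than the summation described above. The sketch assumes a simplified model: the pool stays fixed at the 2595 sets still to pick, 35 of them are Clikits, and the remaining picks form a uniformly random ordering. Under that model, the number of ways to place k sets among n days with no two on consecutive days is C(n-k+1, k), so the chance of at least one back-to-back pair is one minus that count divided by C(n, k). The function name is made up.

```python
from math import comb


def prob_adjacent_pair(n_days: int, k_sets: int) -> float:
    """Chance that at least two of k_sets land on consecutive days.

    Model (an assumption): the k_sets occupy a uniformly random
    k-subset of n_days positions, i.e. a fixed pool picked in random order.
    """
    if k_sets < 2:
        return 0.0
    # Placements with no two on consecutive days: C(n - k + 1, k).
    no_adjacent = comb(n_days - k_sets + 1, k_sets)
    return 1 - no_adjacent / comb(n_days, k_sets)


print(f"{prob_adjacent_pair(2595, 35):.2f}")
```

Under these assumptions a back-to-back Clikits pair at some point before the pool runs out is far from a long shot, although the real pool changes every January, so the figure is only indicative.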
@Okay said:
"I really wish there was a way to filter out these type of posts (not just here, but also on social media) because I still don’t see the point in these posts other than a lazy way to generate likes/comments/clicks. "
You could always just scroll past and not read it!
@WizardOfOss:
Duplo is an excluded theme, so low piece counts don’t matter for RSotD, though you’re free to discuss them as much as you want. The Xalax racers wouldn’t qualify either, but due to low piece count rather than theme. They need 10pcs, I believe excluding stuff like sticker sheets, to make the cut.
I get that, but was just wondering what would be the Lego set with the least pieces that still had a proper build. As far as I could find, a 3-in-1 polybag with only 6 pieces seems to be it. Just curious how low they can go....
@Huwbot,
Thank you for the effort you put into RSotD and for your analysis above.
I'm sorry to say that your Diff statistic doesn't prove that you aren't thematically biased. It only indicates that you're operating within expected randomness parameters. Those two things aren't the same.
You could emulate randomness when, in fact, you were biased. For example, let's say you were biased in favour of Castle. Knowing that not all themes are going to have a Diff of exactly 0.0% and that some variability in Diff is to be expected, you could occasionally select a Castle set when a truly random selection would have resulted in another theme.
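One common way to quantify "expected randomness parameters" is a z-score for each theme under the hypergeometric distribution, which describes drawing without replacement. The sketch below is an illustration only. It assumes all 4063 in-scope sets were eligible for all 1468 picks, which the discussion above shows is not the case (Ninjago and Friends only recently came into scope), and the function name is made up.

```python
from math import sqrt


def theme_z_score(total: int, theme_in_scope: int, picks: int, theme_picked: int) -> float:
    """Standard deviations between a theme's picked count and its expectation.

    Assumption: picks are drawn without replacement from a fixed pool of
    `total` sets (hypergeometric model).
    """
    share = theme_in_scope / total
    expected = picks * share
    variance = picks * share * (1 - share) * (total - picks) / (total - 1)
    return (theme_picked - expected) / sqrt(variance)


print(f"Town:    {theme_z_score(4063, 593, 1468, 245):+.2f}")
print(f"Clikits: {theme_z_score(4063, 64, 1468, 29):+.2f}")
```

On the article's figures Town comes out a little under +3 and Clikits around +1.5. The Town score looks large, but older themes have been eligible for the full four years while newer ones have not, so the fixed-pool assumption inflates it. As the comment says, a score inside the expected range would not prove the absence of bias either.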
Not a bad SQL query honestly.
Keep maintaining this section every day!! Huwbot rules and rocks!! Do not listen to those comments that say that it is boring!! It is nostalgia or something like this, but memories come when you see a set that you remember, because you had/have it or you wanted it so much but you did not get it.
Only seven years to finish all of them...
@Okay said:
"I really wish there was a way to filter out these type of posts (not just here, but also on social media) because I still don’t see the point in these posts other than a lazy way to generate likes/comments/clicks. "
Bruh... what are you on about? What is "this type of post"? "lazy way to generate likes/comments/clicks"? You do know this is Brickset, right?