Using AI to generate minifigures

Posted by ,

Today's guest author covers a fascinating subject that I suspect, like me, you've not read about before:

My name is Pawel and I'm from Warsaw in Poland. LEGO has been my hobby for as long as I can remember, especially Star Wars, which I have been collecting since the beginning of the line in 1999. I am also very interested in mathematics and programming.

Now I work as an AI engineer, so I decided to link my two interests…


Recently you may have noticed a lot of information in the media about Artificial Intelligence. You have probably heard of AI being used for deepfakes and in mobile apps like FaceApp. AI is a very wide field of science, one part of which is deep neural networks (DNNs).

If you have seen somebody's picture converted into a Vincent van Gogh-style painting, or a photo of someone's face artificially aged, it all relies on DNNs. What I find especially interesting are generative neural networks, which can generate new images based on the images they have seen before: for example, LEGO minifigures.

And here I present to you what may be the first AI-generated LEGO minifigures, dreamed up entirely by my computer:

How does it work? The algorithm needs some minifigure pictures, the training data, to learn what a minifigure looks like. It works best when the pictures are as uniform as possible, so the Bricklink/Brickset database is excellent for this use. I extracted 4,500 minifigure pictures from a few themes, such as City, Collectable Minifigures and Star Wars, so the data contains not only human figures but also fancy aliens, disguised characters and so on.
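The extraction step isn't described in detail, but making the pictures "as uniform as possible" typically means resizing them all to the same small square before training. A minimal sketch, assuming Pillow is installed and the images sit in a folder of PNGs (the paths and function name are illustrative, not part of Pawel's pipeline):

```python
from pathlib import Path

from PIL import Image


def preprocess(src_dir: str, dst_dir: str, size: int = 128) -> int:
    """Resize every picture to a uniform square so the network sees
    consistent inputs; returns the number of images processed."""
    out = Path(dst_dir)
    out.mkdir(parents=True, exist_ok=True)
    count = 0
    for path in Path(src_dir).glob("*.png"):
        img = Image.open(path).convert("RGB")  # drop alpha, force 3 channels
        img = img.resize((size, size))
        img.save(out / path.name)
        count += 1
    return count
```

A 128-pixel square keeps training cheap, at the cost of the blurriness visible in the results.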

It all works thanks to two algorithms: the generator and the discriminator. The generator generates a fake minifigure picture from some random input numbers. Those generated fake images are then fed to the discriminator, along with real minifigure images. The discriminator's task is to distinguish the fake images from the real ones.

The better the discriminator does its job, the higher the reward it gets, but it also means that the performance of the generator is not good enough, so it has to do better next time. Conversely, when the generator produces more realistic images, it is the discriminator's performance that is not good enough. So it is like a game between two players, where each one wants to outsmart the other and each gradually learns over time how to do that.
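This two-player game is known as a generative adversarial network (GAN). Here is a minimal sketch of the training loop in PyTorch, using tiny fully connected networks and random vectors in place of real images so the structure of the game stays visible; all sizes and learning rates are illustrative, not the settings used for the minifigures:

```python
import torch
import torch.nn as nn

# Toy stand-ins for the two players; a real image GAN would use
# convolutional networks and 128x128 pictures instead.
latent_dim, data_dim = 8, 16
generator = nn.Sequential(
    nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, data_dim))
discriminator = nn.Sequential(
    nn.Linear(data_dim, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())

loss_fn = nn.BCELoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

for step in range(100):
    real = torch.randn(64, data_dim)     # stand-in for real minifigure images
    noise = torch.randn(64, latent_dim)  # the "random input numbers"
    fake = generator(noise)

    # Discriminator is rewarded for labelling real as 1 and fake as 0.
    d_loss = (loss_fn(discriminator(real), torch.ones(64, 1))
              + loss_fn(discriminator(fake.detach()), torch.zeros(64, 1)))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator is rewarded for fooling the discriminator into saying 1.
    g_loss = loss_fn(discriminator(fake), torch.ones(64, 1))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
```

Each step improves whichever player is currently losing the game, which is why training can run for hours and keep producing sharper fakes.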

I used Google's cloud service to do the computing. After just about ten minutes of training on real images you could already make out the shape of a minifigure in the generated images.

Then I left the algorithm to learn for about eight hours, and below are some of the generated images that I got. They are rather blurry, but you can easily make out torso and face prints on most of them.

Here you can see an ordinary man with a brown jacket and an orange shirt. Judging by the skin tone, he is probably from a licensed theme.

And here we have something that looks like a Stormtrooper leaked from a fourth STAR WARS trilogy.

To oppose the Stormtrooper, we have a ninja rebel pilot.

Here are some more examples that my computer dreamed up:

What is most interesting to me is that it learned some consistency: if one leg has printing, the other has it too; minifigures with yellow heads have yellow hands; and so on.

Which is your favourite generated minifigure? What may be the next step? Maybe AI generated LEGO sets?

26 comments on this article

By in United States,

This is really wack, but I love the possibilities.

By in Australia,

Wow! This is something I haven’t thought of till now, but it sounds really fascinating! Keep it up!

By in Puerto Rico,

That 501st fake Stormtrooper looks wild, if I am being honest.

By in United States,

I find this absolutely fascinating. It of course could be great with improved image quality. I wonder how else it can be applied? This would be complex especially with computing power, but could you somehow preload all LEGO parts, and it determine with 3 - 5 pieces all the different ways they can be assembled?

By in United States,

Your "common man with brown jacket and orange shirt" looks like Elvis! Fascinating article.

By in Poland,

@STL_Brick_Co said:
"I find this absolutely fascinating. It of course could be great with improved image quality. I wonder how else it can be applied? This would be complex especially with computing power, but could you somehow preload all LEGO parts, and it determine with 3 - 5 pieces all the different ways they can be assembled?"

I'll try to make it less blurry (and more impressive ;)) in the future. Regarding all the different combinations of connecting pieces: that is a totally different thing, and I think it has been possible for quite some time.

By in United States,

Nice one, using LEGO to make me more smarter. Zing... but seriously, thanks for sharing your post. I enjoyed it and can't wait for this to hit the market someday.

By in United Kingdom,

Interesting stuff
Now make 3D printer models out of them :)

By in United States,

@STL_Brick_Co said:
"I find this absolutely fascinating. It of course could be great with improved image quality. I wonder how else it can be applied? This would be complex especially with computing power, but could you somehow preload all LEGO parts, and it determine with 3 - 5 pieces all the different ways they can be assembled?"

In mathematics we would just call your questions a "counting problem," and it wouldn't be hard to solve. Mathematically, Pawel's AI with deep neural networks is vastly more complex.

By in United Kingdom,

Looks like the software is sponsored by Specsavers.
Why didn't it learn that minifigures have crisp, definable edges rather than vague, nebulous, smudge-like areas of both colour and form?

By in Poland,

@Bricklunch said:
"Looks like the software is sponsored by Specsavers.
Why didn't it learn that Minifigures have crisp, definable edges rather than vague nebulous smudge like areas of both colour and form? "


Training pictures were 128x128 pixels, so quite a low resolution. The quality of the generated minifigures depends on the training time and on the values of some parameters, and setting the best values of these parameters is the key thing. Unfortunately, there are only general rules for what values these parameters should take, and those rules are often task-specific.

By in Canada,

This is an extremely interesting article! I'm always fascinated by AI/neural networks. Although so far this sort of technology has hardly ever lived up to a lot of the practical expectations people have had for it, its failures often turn out to be just as fascinating as its successes!

I wonder if there'd be any way to train it on both images and descriptions/content tags to see if it would be able to generate new minifigure designs in response to particular queries (for instance, "female firefighter with glasses" or "historic pilot with eyepatch"). I suspect it would need to be trained on higher resolution images to pick up on certain nuances, of course — at this resolution, I doubt it would be able to draw super clear distinctions between some of those sorts of details, like glasses vs. no glasses. In any event, I'm sure given your interest in this field of study, you already have plenty of further possibilities like that in mind!

By in Greece,

I left uni when plain neural networks were being implemented in financial forecasts and the K-means algorithm was the king of Matlab, and now I am reading about DNNs. I must go back to reading to learn the difference between a good ol' NN and a deep one!

A really fascinating article taking an experimental look at our beloved minifigure, although it would be even better if those virtually produced minis were more clearly defined (I have no idea how that can be achieved, though).

In any case, thanks for showing us that AI and LEGO can be combined to very interesting results!!!

By in United States,

Cool. Get the resolution up and then sell it to LEGO. They really should be calling you right now. :)

By in Poland,

This is a super cool idea that will be amazing when the resolution improves. I am intrigued by the process.

By in United States,

@Bricklunch:
It's a computer program. It only knows what you tell it. In this case, you feed it a bunch of images and tell it which ones are Subject A and which ones are not. You then ask it to generate an image that fits Subject A. Where we can look at something and intuitively determine whether it belongs, all the computer has to go by is what it can glean from the pixelated pattern of the images. Unlike you, it doesn't understand the actual 3D object that's depicted. It doesn't even know what a minifig is. It just knows that some pictures are flagged "yes" and others are flagged "no".

And in terms of this program, it's still early days. It's going to take a lot of back-and-forth between the two halves of the program before it starts producing crisp edges consistently (basically, once the Discriminator figures out that all minifigs do have crisp edges and starts rejecting any that don't). Complicating things is the fact that the minifig images are probably sourced from multiple photographers, meaning the lighting, framing, background, and camera will be inconsistent across the entire group, but at the same time there will be clusters of similarity.

The people program seems to have the benefit of only dealing with portrait-style photos (head and a bit of shoulder), not the full body. However, it still has to deal with two genders and multiple racial types, on top of the fact that there's no consistently defined shape to anything. The cat program has it worst, as there are long-hairs, short-hairs, and several different coloration patterns... and then the cats are all in completely different poses, sometimes showing the full body and other times in closeup. And it can't just get the _cat_ right, as doing a bad enough job on the surroundings will give the entire image away as a fake.

By in United Kingdom,

Hi Pawel. These are really promising results! Having worked a bit with GANs myself, I know how hard they can be to train. I think before you move on to a new project, it would be really cool to see more progress on this one :) It would be amazing if you could get the network to generate minifigs that look good to humans (e.g. with sharper edges/less blurring).

Can you share some more details on the specifics? For example, what network architecture and hyperparameters? What loss function?

Again, such a cool project! Great job :)

P.S. my favourite is the construction worker guy with the orange hard hat.

By in United States,

I really like the "Fourth trilogy Stormtrooper".
To me it almost looks like one of those android chicks you see in sci-fi films, with the brightly colored fake hair, pale casing, and headset.

By in Germany,

... and that is when Skynet became sentient....
Just kidding, nice work.

By in United States,

What a neat surprise! I honestly did not expect NNs to be able to accomplish these kinds of results yet. This is definitely giving me some ideas for potential projects at my uni, haha. Aside from the impressiveness of the NN generating its own figures, I find it particularly impressive that it was able to keep the whole figure consistent, as you also noted. Anyway, a fantastic article!

By in United States,

Thisminifigdoesnotexist.com incoming?

By in Poland,

@discodisco said:
"Hi Pawel. These are really promising results! Having worked a bit with GANs myself, I know how hard they can be to train. I think before you move on to a new project, it would be really cool to see more progress on this one :) It would be amazing if you could get the network to generate minifigs that look good to humans (e.g. with sharper edges/less blurring).

Can you share some more details on the specifics? For example, what network architecture and hyperparameters? What loss function?

Again, such a cool project! Great job :)

P.S. my favourite is the construction worker guy with the orange hard hat."


After the feedback I am definitely planning on developing the project further. :)
The generator is just a simple transposed-convolutional architecture that upsamples the latent space to the size of the image; its output is then fed to the discriminator, which downsamples it to get a class probability: 1 for a real image and 0 for a fake one. So the discriminator loss compares its output for real images to ones and for fake images to zeros, while the generator loss compares the discriminator's output for the fake images to ones. There are also some tricks that help make training more stable, like adding noise.
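For readers curious what "transposed convolutions to upsample the latent space" and "downsampling to a class probability" look like in code, here is a rough DCGAN-style sketch in PyTorch. The channel counts, layer depths, and 64-pixel output are illustrative assumptions, not Pawel's exact network:

```python
import torch
import torch.nn as nn


class Generator(nn.Module):
    """Upsamples a latent vector to an RGB image with transposed convolutions."""

    def __init__(self, latent_dim: int = 100):
        super().__init__()
        self.net = nn.Sequential(
            # 1x1 latent -> 4x4 -> 8x8 -> 16x16 -> 32x32 -> 64x64
            nn.ConvTranspose2d(latent_dim, 256, 4, 1, 0), nn.BatchNorm2d(256), nn.ReLU(),
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Tanh(),  # RGB in [-1, 1]
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # Treat the latent vector as a stack of 1x1 feature maps.
        return self.net(z.view(z.size(0), -1, 1, 1))


class Discriminator(nn.Module):
    """Downsamples an image with strided convolutions to one probability."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            # 64x64 -> 32x32 -> 16x16 -> 8x8 -> 4x4 -> 1x1 score
            nn.Conv2d(3, 32, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.Conv2d(128, 256, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.Conv2d(256, 1, 4, 1, 0), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).view(-1)  # one real/fake probability per image
```

With these two modules, the losses Pawel describes are just binary cross-entropy against targets of ones and zeros.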

By in United States,

@pablo94:
Probably the most interesting question here is whether it will eventually hit a point where the generator realizes it has to copy existing deco to fool the discriminator, or whether it will always be able to get away with creating new deco from scratch. And the answer to that may depend on whether or not you keep feeding it the annual selection of new minifig designs, so that the discriminator is never able to lock onto any group of elements as "legit" and flag anything it doesn't recognize as fake.
