Using AI to generate minifigures
Posted by Huw,
Today's guest author covers a fascinating subject that I suspect, like me, you've not read about before:
My name is Pawel and I’m from Warsaw in Poland. LEGO has been my hobby for as long as I can remember, especially Star Wars, which I have been collecting since the beginning of the line in 1999. I am also very interested in mathematics and programming.
Now I work as an AI engineer, so I decided to link my two interests…
Recently you may have noticed a lot of information in the media about Artificial Intelligence. You have probably heard of AI being used for deep fakes and in mobile apps like FaceApp. AI is a very wide field of science, part of which is deep neural networks (DNNs).
If you see somebody’s picture converted into a Vincent van Gogh-style painting, or someone's face photo aged, it all relies on DNNs. Especially interesting to me are generative neural networks, which can generate new images based on images they have seen before, for example LEGO minifigures.
And here I present what may be the first AI-generated LEGO minifigures, dreamed up entirely by my computer:
How does it work? The algorithm needs some minifigure pictures -- training data -- to learn what a minifigure looks like. It works best when the pictures are as uniform as possible, so the Bricklink / Brickset database is excellent for that purpose. I extracted 4500 minifigure pictures from a few themes such as City, Collectable Minifigures and Star Wars, so the data contains not only human figures but also fancy aliens, disguised characters and so on.
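To give a concrete idea of the preprocessing involved, here is a minimal sketch of turning one photo into a uniform training example. This is my own illustration rather than the exact pipeline used here: the nearest-neighbour resize to 128x128 and the scaling of pixels to [-1, 1] are common GAN conventions, not confirmed details of the project.

```python
import numpy as np

def to_training_example(img, size=128):
    """Resize an H x W x 3 uint8 image to size x size (nearest neighbour)
    and scale pixel values from [0, 255] to [-1, 1].

    The 128x128 target and the [-1, 1] range are common GAN training
    conventions, assumed here for illustration.
    """
    h, w, _ = img.shape
    rows = np.arange(size) * h // size   # source row for each output row
    cols = np.arange(size) * w // size   # source column for each output column
    small = img[rows][:, cols].astype(np.float32)
    return small / 127.5 - 1.0
```

Making every picture the same size and value range is what lets the network treat thousands of differently photographed minifigures as one uniform dataset.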
It all works thanks to two algorithms: the generator and the discriminator. The generator produces fake minifigure pictures from random input numbers. Those fake images are then fed to the discriminator along with real minifigure images, and the discriminator's task is to distinguish the fakes from the real ones.
The better the discriminator does its job, the higher the reward it gets, but that also means the generator's performance is not good enough, so it has to do better next time. Conversely, when the generator produces more realistic images, the discriminator’s performance suffers. It is like a game between two players, each trying to outsmart the other, and each gradually learning over time how to do so.
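The two-player game can be sketched end-to-end on a toy problem. The example below is my own illustration, not the project's code: the "real data" are just numbers drawn around 4.0, the generator is a tiny linear model, and the discriminator is a logistic-regression classifier, with gradients worked out by hand. The same adversarial loop, scaled up to convolutional networks, is what generates the minifigure images.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# Generator g(z) = a*z + b and discriminator d(x) = sigmoid(w*x + c)
a, b = 1.0, 0.0          # generator starts producing samples centred at 0
w, c = 0.0, 0.0          # discriminator starts undecided (always outputs 0.5)
lr, batch = 0.05, 64

for step in range(3000):
    real = rng.normal(4.0, 1.0, batch)    # "real minifigures"
    z = rng.normal(0.0, 1.0, batch)       # random input numbers
    fake = a * z + b                      # generated fakes

    # Discriminator update: push d(real) toward 1 and d(fake) toward 0
    dr, df = sigmoid(w * real + c), sigmoid(w * fake + c)
    grad_w = np.mean(-(1 - dr) * real) + np.mean(df * fake)
    grad_c = np.mean(-(1 - dr)) + np.mean(df)
    w -= lr * grad_w
    c -= lr * grad_c

    # Generator update: push d(fake) toward 1, i.e. fool the discriminator
    df = sigmoid(w * fake + c)
    dfake = -(1 - df) * w                 # gradient w.r.t. each fake sample
    a -= lr * np.mean(dfake * z)
    b -= lr * np.mean(dfake)

print(f"fake samples now centred near {b:.2f} (real data centred at 4.0)")
```

After training, the generator's output has drifted from 0 toward the real data's centre, purely because that is what fools the discriminator: neither player is ever told what the real distribution looks like.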
I used Google’s cloud service to do the computing. After only about ten minutes of training on real images, you could easily see the shape of a minifigure emerging.
Then I left the algorithm to learn for about eight hours, and below are some of the generated images that I got. They are rather blurry, but you can easily make out torso and face prints on most of them.
Here you can see a common man with a brown jacket and an orange shirt. He is probably from a licensed theme, judging by the skin tone.
And here we have something that looks like a Stormtrooper leaked from the fourth STAR WARS trilogy.
To oppose the Stormtrooper, we have a ninja rebel pilot.
Here are some more examples that my computer dreamed up:
What is most interesting to me is that it learned some consistency: if one leg has printing, the other has it too; yellow-headed minifigures have yellow hands; and so on.
Which is your favourite generated minifigure? What may be the next step? Maybe AI generated LEGO sets?
141 likes
26 comments on this article
This is really wack, but I love the possibilities.
Wow! This is something I haven’t thought of till now, but it sounds really fascinating! Keep it up!
Doing marginally better than https://thiscatdoesnotexist.com/
That 501st fake Stormtrooper looks wild, if I am being honest.
I find this absolutely fascinating. It could of course be great with improved image quality. I wonder how else it can be applied? This would be complex, especially in terms of computing power, but could you somehow preload all LEGO parts and have it determine, given 3-5 pieces, all the different ways they can be assembled?
Your "common man with brown jacket and orange shirt" looks like Elvis! Fascinating article.
@STL_Brick_Co said:
"I find this absolutely fascinating. It of course could be great with improved image quality. I wonder how else it can be applied? This would be complex especially with computing power, but could you somehow preload all LEGO parts, and it determine with 3 - 5 pieces all the different ways they can be assembled?"
I'll try to make it less blurry (and more impressive ;)) in the future. Regarding all the different combinations of connecting pieces - that is a totally different problem, and I think it has been solvable for quite some time.
Excellent material!
Nice one, using LEGO to make me more smarter. Zing... but seriously, thanks for sharing your post. I enjoyed it and can't wait for this to hit the market someday.
Interesting stuff
Now make 3D printer models out of them :)
@STL_Brick_Co said:
"I find this absolutely fascinating. It of course could be great with improved image quality. I wonder how else it can be applied? This would be complex especially with computing power, but could you somehow preload all LEGO parts, and it determine with 3 - 5 pieces all the different ways they can be assembled?"
In mathematics we would just call your questions a "counting problem," and it wouldn't be hard to solve. Mathematically, Pawel's AI with deep neural networks is vastly more complex.
Looks like the software is sponsored by Specsavers.
Why didn't it learn that Minifigures have crisp, definable edges rather than vague nebulous smudge like areas of both colour and form?
@Bricklunch said:
"Looks like the software is sponsored by Specsavers.
Why didn't it learn that Minifigures have crisp, definable edges rather than vague nebulous smudge like areas of both colour and form? "
The training pictures were 128x128 pixels, so quite a low resolution. The quality of the generated minifigures depends on the training time and on the values of certain parameters, and setting the best values for these parameters is the key thing. Unfortunately, there are only general rules about what values these parameters can take, and those rules are often task-specific.
This is an extremely interesting article! I'm always fascinated by AI/neural networks. Although so far this sort of technology has hardly ever lived up to a lot of the practical expectations people have had for it, its failures often turn out to be just as fascinating as its successes!
I wonder if there'd be any way to train it on both images and descriptions/content tags to see if it would be able to generate new minifigure designs in response to particular queries (for instance, "female firefighter with glasses" or "historic pilot with eyepatch"). I suspect it would need to be trained on higher resolution images to pick up on certain nuances, of course — at this resolution, I doubt it would be able to draw super clear distinctions between some of those sorts of details, like glasses vs. no glasses. In any event, I'm sure given your interest in this field of study, you already have plenty of further possibilities like that in mind!
I left uni when plain neural networks were being implemented in financial forecasts and the K-means algorithm was the king of Matlab, and now I am reading about DNNs. I must get back to reading to learn the difference between a good ol' NN and a Deep one!
A really fascinating article about an experimental take on our beloved minifigure, although it would be best shown off if those virtually produced minis were more clearly defined (I have no idea how that can be achieved, though).
In any case, thanks for showing us that AI and LEGO can be combined to very interesting results!!!
Cool. Get the resolution up and then sell it to LEGO. They really should be calling you right now. :)
I like https://thispersondoesnotexist.com/ better than the cat one. :)
This is a super cool idea that will be amazing when the resolution improves. I am intrigued by the process.
@Bricklunch:
It's a computer program. It only knows what you tell it. In this case, you feed it a bunch of images, and tell it which ones are Subject A and which ones are not Subject A. You then ask it to generate an image that fits Subject A. Where we can look at something and intuitively determine if it belongs or not, all the computer has to go by is what it can glean from the pixelated pattern of the images. Unlike you, it doesn't understand the actual 3D object that's depicted. It doesn't even know what a minifig is. It just knows that some pictures are flagged "yes" and others are flagged "no".

And in terms of this program, it's still early days. It's going to take a lot of back-and-forth between the two halves of the program before it will start producing crisp edges consistently (basically, once the Discriminator figures out that all minifigs do have crisp edges and starts rejecting any that don't). Complicating things is the fact that the minifig images are probably sourced from multiple photographers, meaning the lighting, framing, background, and camera will be inconsistent across the entire group, but at the same time there will be clusters of similarity.

The people program seems to have the benefit of only dealing with portrait-style photos (head and a bit of shoulder), but not the full body. However, it still has to deal with two genders, and multiple racial types, on top of the fact that there's no consistently defined shape to anything. The cat program has it worst, as there are long-hairs, short-hairs, several different coloration patterns...and then the cats are all in completely different poses, sometimes showing the full body and other times in closeup. And it can't just get the _cat_ right, as doing a bad enough job on the surroundings will give the entire image away as a fake.
Hi Pawel. These are really promising results! Having worked a bit with GANs myself, I know how hard they can be to train. I think before you move on to a new project, it would be really cool to see more progress on this one :) It would be amazing if you could get the network to generate minifigs that look good to humans (e.g. with sharper edges/less blurring).
Can you share some more details on the specifics? For example, what network architecture and hyperparameters? What loss function?
Again, such a cool project! Great job :)
P.S. my favourite is the construction worker guy with the orange hard hat.
I really like the "Fourth trilogy Stormtrooper"
To me it almost looks like one of those android chicks you see in sci fi films, with the brightly colored fake hair, pale casing, and headset.
... and that is when Skynet became sentient....
Just kidding, nice work.
What a neat surprise! I honestly did not expect NNs to be able to accomplish these kinds of results yet. Definitely giving me some ideas for potential projects at my uni, haha. Aside from the impressiveness of the NN generating its own figures, I find it particularly impressive that it was able to keep the consistency of the entire figure, as you also noted. Anyway, a fantastic article!
Thisminifigdoesnotexist.com incoming?
@discodisco said:
"Hi Pawel. These are really promising results! Having worked a bit with GANs myself, I know how hard they can be to train. I think before you move on to a new project, it would be really cool to see more progress on this one :) It would be amazing if you could get the network to generate minifigs that look good to humans (e.g. with sharper edges/less blurring).
Can you share some more details on the specifics? For example, what network architecture and hyperparameters? What loss function?
Again, such a cool project! Great job :)
P.S. my favourite is the construction worker guy with the orange hard hat."
After the feedback I am definitely planning on developing the project further. :)
The generator is just a simple transposed convolutional architecture that upsamples the latent space to the size of the image; its output is then fed to the discriminator, which downsamples it to produce a class probability - 1 for a real image and 0 for a fake. So the discriminator loss compares its output for real images to ones and for fake images to zeros, while the generator loss compares the discriminator's output for fake images to ones. There are also some tricks that help make training more stable, like adding noise.
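In code, the losses described above come down to binary cross-entropy against constant labels. This NumPy sketch is my reading of that description rather than Pawel's actual implementation (which would use a deep-learning framework's built-in loss):

```python
import numpy as np

def bce(p, label):
    """Binary cross-entropy between discriminator outputs p (probabilities
    in (0, 1)) and a constant target label (1.0 = real, 0.0 = fake)."""
    p = np.clip(p, 1e-7, 1 - 1e-7)   # guard against log(0)
    return -np.mean(label * np.log(p) + (1 - label) * np.log(1 - p))

def discriminator_loss(d_real, d_fake):
    # compare output for real images to ones and for fake images to zeros
    return bce(d_real, 1.0) + bce(d_fake, 0.0)

def generator_loss(d_fake):
    # compare the discriminator's output for fake images to ones
    return bce(d_fake, 1.0)
```

A confident, correct discriminator (outputs near 1 on real images and near 0 on fakes) drives both terms of its own loss toward zero, while the generator's loss only falls when its fakes get classified as real.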
@pablo94:
Probably the most interesting question here is whether it will eventually hit a point where the generator realizes it has to copy existing deco to fool the discriminator, or whether it will always be able to get away with creating new deco from scratch. And the answer to that may depend on whether or not you keep feeding it the annual selection of new minifig designs, so the discriminator is never able to lock on to any group of elements as "legit" versus anything it doesn't recognize being fake.