Lego is something that has always fascinated so many of us. With just a few pieces, we can imagine and build anything! However, we have all faced the problem of finding one particular piece in a large pile of pieces. How can we automate this process of finding and sorting lego pieces?
Last summer I wanted to build a lego spaceship, but it took me a really long time to find each particular piece in my tub of legos. That's when I thought: is it possible to find and sort lego pieces automatically?
To get a machine to classify lego pieces, we first need to understand how we humans do the job. We can find a piece because we have seen it numerous times; in other words, we have made a mental map of what the piece looks like and how it differs from other pieces.
Hence, to classify the pieces, we first need a large number of images of these pieces. I searched for online datasets, but I couldn't find any with a large number of images. What do we do now?
Creating the images from scratch
Creating the data yourself is usually the last resort, and here it was the only option left. How do we create over 50K images of 150+ lego pieces from scratch? After googling for some time I found a website called LDraw, which had 3D models of ALL lego pieces. That's great, at least now I had a source of 3D models. The next step was to generate images from these 3D models. For this, I used Blender, a 3D tool which allowed me to open the models and render them to images. Now all that was left was to write a 100 line script which would automatically iterate over the 3D models and render each of them from various angles. This seems simple, but it took quite a lot of time to write the script. Once it was written, it took a whole day to actually generate the images. The final lego image set contains 64800 images of 200 lego pieces!
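To give a feel for the "various angles" part, here is a rough sketch of how camera positions around a piece could be generated. This is not the actual script: the function name `camera_positions`, the radius, and the 36 x 9 grid are all assumptions for illustration. I picked 36 x 9 only because it gives 324 views, which matches 64800 / 200 = 324 images per piece. The Blender rendering step itself is left out.

```python
import math

def camera_positions(n_azimuth, n_elevation, radius=10.0):
    """Return (x, y, z) camera positions on a sphere centred on the piece.

    Hypothetical helper: the grid layout and radius are illustrative
    assumptions, not values taken from the original script.
    """
    positions = []
    for i in range(n_elevation):
        # Spread elevations evenly between the poles, never exactly on them
        elevation = math.radians(-90 + 180 * (i + 0.5) / n_elevation)
        for j in range(n_azimuth):
            # Walk a full circle around the piece at this elevation
            azimuth = math.radians(360 * j / n_azimuth)
            x = radius * math.cos(elevation) * math.cos(azimuth)
            y = radius * math.cos(elevation) * math.sin(azimuth)
            z = radius * math.sin(elevation)
            positions.append((x, y, z))
    return positions

views = camera_positions(n_azimuth=36, n_elevation=9)
print(len(views))  # 324
```

In a real rendering script, each of these positions would be assigned to the camera, which is then pointed at the piece before rendering one image.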
Below are a few examples of the images that the program generated:
Great! Now we have the images for building our lego classifier. Now let's actually build the classifier!
Building the classifier - Transfer Learning on a ResNet50 architecture
Machine learning does the actual "intelligence" part of my application. I took an existing model architecture called ResNet50 and fine-tuned it to classify the lego images.
After writing the ML program, it took the model around 2 days to train, and its final accuracy was ~83%. That is really good considering the image set had 200 different pieces. If the model had instead predicted randomly, it would be right only 0.5% of the time (1 in 200), which is far lower than the model's real accuracy.
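The comparison with random guessing is simple arithmetic, sketched below. It assumes a guesser that picks each of the 200 pieces with equal probability and a dataset where each piece appears equally often.

```python
num_classes = 200       # number of different lego pieces in the image set
model_accuracy = 0.83   # approximate final accuracy reported above

# A uniform random guesser is right 1 time in num_classes
random_accuracy = 1 / num_classes
improvement = model_accuracy / random_accuracy

print(f"{random_accuracy:.1%}")  # 0.5%
print(f"{improvement:.0f}x")     # 166x
```

So the model is roughly 166 times more accurate than guessing at random.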
A few drawbacks
Since the model was trained only on artificially generated images, it had no idea how the lego pieces look in real lighting conditions and hence gave poor results on real photos. I could, however, fix this issue if I had real-life images to train my model on.
Another drawback, which caused the accuracy to stabilise at 83% instead of increasing, was that some images were taken from uncommon angles and so did not show the distinguishing features of the piece. The diagrams below explain this drawback:
The first 2 images belong to the same piece, but because of the angle from which the second image was taken, the third one resembles it even though it is an entirely different piece.
Future Steps and A Fun Project
Going from just an idea, to generating images, to building a trained model with 83% accuracy was a lot of fun! The next step for such an application would be to integrate it with some hardware so that it can physically sort the pieces. This could also be a fun project for anyone else to play around with, so I have open sourced the generated image dataset:
Here is the code I used to train the model:
If you made it this far, that's awesome! I hope you liked reading this blog. I would love to know your thoughts on it and how you would use this :)