Where does Brickset's data come from?

Posted by ,

A couple of weeks ago, after I'd added some links to other websites, I said I would create a diagram showing where Brickset gets its data. I've now finished the first draft and it's available for your perusal as a PDF.

It may surprise you to see just how little is actually maintained locally. Of course the core of the database (information about sets, collected and curated since 1997) is, but almost all the peripheral data is imported from elsewhere. For simplicity, many largely static lookup tables that support the main data sets, such as the list of currencies, haven't been included, and neither has user-generated data such as reviews and collections.

The diagram shows some 'data flows' that don't actually go anywhere near the Brickset database but appear to the user as if they have come from it, such as Rebrickable inventories.
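As a rough illustration of that last point (a sketch only, not the site's actual code), the snippet below assembles a set page from two sources: a locally maintained record and an inventory fetched from elsewhere at display time, which is never stored locally. Every name in it (`LOCAL_SETS`, `fetch_inventory_from_partner`, `build_set_page`, and the placeholder set number) is hypothetical and made up for the example.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for the locally maintained core data about sets.
LOCAL_SETS: Dict[str, Dict[str, object]] = {
    "0000-1": {"name": "Example set", "year": 2000},
}


def fetch_inventory_from_partner(set_number: str) -> List[str]:
    """Hypothetical stand-in for a request to an external inventory source.

    A real implementation would call the partner site; this one returns
    placeholder data so the sketch runs without network access.
    """
    return [f"part-for-{set_number}"]


def build_set_page(
    set_number: str,
    fetch_inventory: Callable[[str], List[str]] = fetch_inventory_from_partner,
) -> Dict[str, object]:
    """Combine local core data with pass-through data for one page view."""
    core = LOCAL_SETS[set_number]            # comes from the local database
    inventory = fetch_inventory(set_number)  # fetched on demand, never stored
    return {
        "set_number": set_number,
        "name": core["name"],
        "year": core["year"],
        "inventory": inventory,
        # Record where each field came from, as the diagram does.
        "sources": {"name": "local", "year": "local", "inventory": "external"},
    }


page = build_set_page("0000-1")
```

To the visitor the page looks like a single record, but the inventory never touches the local store, so errors in it have to be fixed at the source.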

I don't suppose this will stem the number of emails we receive requesting that we add this minifig to that set, or telling us that inventories are incomplete, but I live in hope...

27 comments on this article

By in Venezuela,

great work!

By in United States,

Very nice @Huw. It is not very often that a developer documents willingly :)

By in United Kingdom,

I'm not a developer, but I work in IT in the NHS and to have this level of sharing and integration of data at this speed in my role would be a dream come true! - This is one of my favourite web sites, not just because I like LEGO, but because the site is so good!! Cheers Huw

By in United Kingdom,

^ Thanks!

By in Hong Kong,

It's really a great work! Managing information is not that easy!

By in Germany,

This is so elegant! Your site is an example of how different internet data can be combined and presented to a wider audience. Some other industries could learn a thing or two from your network, I guess. Thanks for all your effort and good work!

By in United States,

Flowcharts are cool. Nice, easy to understand diagram, Huw.

By in {Unknown country},

Good job!

By in United States,

This is very nice. I was very curious how this site works under the hood. Great job putting all of this together in one place in such a useful system

By in United States,

I think I want that printed up on a Brickset.com t-shirt!

By in United States,

Interesting chart. Could you tell me where in the menu to find the PaB info? Also, are you considering integrating http://www.brickbuildr.com/view/pab/ as well, if that's even possible?

By in Germany,

And where is website 1000steine.de where you store your images? ;)

By in United Kingdom,

^^ PaB availability is shown on individual parts details pages, such as this one, http://brickset.com/parts/300101, on the right hand side. I suspect Brickbuildr duplicates Wall of Bricks data to some extent but if there's an API or a data export function, and the site owner wants to collaborate, I'd be happy to integrate it too.

^ The image repository is not really part of the database as such and thus not on the diagram, but it is of course an extremely important part of the site, and I am grateful to Rene at 1000steine for providing the hosting and bandwidth.

By in United States,

Very nice, makes me like this site even more (if that were possible).

By in Canada,

Very nice Huw, is there another one showing which sites pull data from Brickset?

By in United States,

Very cool.

By in United States,

Huw, you put Brickimedia.com instead of Brickimedia.org. :O No big deal, lol. :P Great PDF though! I myself didn't know that so many third parties were involved. The only thing it doesn't include is when we manually add information, which I do kinda frequently. :P

By in Romania,

It looks like brickset is the center of the Lego universe... :P :D

By in United Kingdom,

^^ Sorry, I'll get that corrected.

The manually maintained information is that in the orange circles. The effort we all put in to maintaining it should not be underestimated, should it!

@OTISsoft, no, that would be a good addition, but I'm not sure I know all that do.

By in Puerto Rico,

Great work Huw.

By in United Kingdom,

@bjtpro, the date stamp isn't on that page, because it's generic to a number of suppliers and there's no obvious place to put it, but it is on the Amazon price comparison page, http://brickset.com/buy/amazon, now.

By in United States,

Very cool, Huw. Great to be able to see (and for you to share) the level of complexity that you deal with on this site, so that the work that goes into maintaining the site and database is understood. Keep up the great work!

By in United States,

Very nice Huw!! I'm a data warehouse developer here in the States and actually love to create Visio diagrams of the data movement processes I build. I love the behind-the-scenes look at processes, as it gives you a different perspective and a better picture of the complexity and/or simplicity. That is very cool. Thanks for everything you do! This has been my favorite site for the Lego addiction.

By in United Kingdom,

^^ & ^ Thanks!

By in United States,

You put Brickimedia as Brickimedia.com, when the real address is Brickimedia.org

By in United Kingdom,

Yes, as noted above, I will change it in the next version.
