Garbage In Garbage Out, on the Blockchain

As we all get excited about the limitless potential of the blockchain and imagine ever new ways to reinvent old things on new principles, it is a good time to reflect on a hard fact: where you have garbage going in, you will have garbage coming out, regardless of the many redeeming qualities of the thing you put the garbage into. The proverbial GIGO.

This has been historically true for any transformative technology — at least all the way back to the printing press — and this new one, the Blockchain, is no different.

Moreover, with previous technologies you could go back after the fact and fix or improve the input to raise the quality of the output. Blockchain's immutability makes that impossible.

Now, immutability is a feature, definitely not a bug; in fact it is among blockchain's most valuable features. Yet it does mean that when we design blockchain-powered solutions, our first job is to control the garbage that goes in. And not allow any.

This is a particular concern in the impact space, where the quality of traditionally collected data is very, very low to start with. Previous waves of transformative technology have failed to change this unfortunate reality. In fact, the crowded space of mobile data-collection solutions has almost certainly worsened the signal-to-noise ratio: most of these tools focus on new ways to collect or store data, while very few try to address the fundamental flaws in the data itself (related).

The cornerstone of traditional impact monitoring (and evaluation) is forms. Designed mostly around donor and government needs, they are badly built and tedious to use. All over the world, in hospitals, clinics, schools and everywhere else communities rely on donor support, millions of the things are filled in by baffled, overworked people, essentially feeding the monster known as the Monitoring and Evaluation Industry.

For years I have been arguing that transferring these forms onto a tablet (innovation, yeah!) is just as bad as filling them in on paper: it remains just as tedious and prone to human error as the paper form, with the added complication of maintaining devices. I have also argued that it is better not to collect such data at all than to build high-stakes programs on deeply flawed data and spurious insights (a.k.a. "garbage").

Now we have blockchain added to the mix. Rarely does a day pass without someone sending me a whitepaper or, more often, a link to some hype-heavy news about a recent "game-changing blockchain project" working on putting data on the blockchain to better serve donors/ communities/ punters.

Many of these projects are still at a very early stage of the baking-of-the-idea process. And by early I mean the ingredients of the dough have not yet been defined, never mind prototypes, teams or even basic models being articulated. That, however, doesn't prevent the hype machine from pushing these projects on, mostly, the sole merit of having used the word "blockchain" in some vague, buzzword-infested copy.

Then there are the ones that are a bit more advanced in their development — there is a core team, maybe even some money raised/ contracts signed. Quite a few of these operate around the following theoretical opportunity:

Take data from the impact/ aid sector and put it on the blockchain (“Tokenize it”).

(Much awesome. Amaze)

Any project in that space raises three important questions:

1. Who exactly is using that data?

2. Why do they need it on the blockchain, rather than in a good quality database?

3. This data that we are tokenizing. Where does it come from?

Usually, the answer to the first two questions is:


The answer to the third question: "We are taking this data from government reports/ charity databases." And where do governments/ donors/ charities get this data? Forms.

Forms, you say?

(The irony of data originating from charity/ donor/ government reports being "tokenized" for the benefit of charities/ donors/ governments is not lost on me, but that is a different topic.)

Now that I’ve been hating on this model, let me say this: I actually believe there is a huge opportunity in tokenizing data. This is why I believe it is worth doing the right way.

That means figuring out better ways to generate this data, ways that are free of human error. Personally I am a big fan of event-generated data. Small, cheap, resilient sensors — more and more ubiquitous even in the cheapest phones — are a great source of data. Payment events. Pictures of things, geo-tagged and time-stamped. Unique QR codes. Spatial/ weather data.

Now that is the stuff we should put on the blockchain. In fact, we should build entire blockchain platforms around such events: payment tokens, insurance. I also think we can use such events as a way to evolve proof-of-work frameworks into more sustainable models.
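As a rough illustration of what "putting an event on the blockchain" could mean in practice, here is a minimal Python sketch (all field names and values are hypothetical, not from any real project): it canonicalizes an event record, say a geo-tagged, time-stamped photo of a delivered supply crate, and derives a SHA-256 fingerprint. Only that fingerprint would need to be anchored on-chain; the raw record can live in an ordinary database, while the chain provides tamper-evidence.

```python
import hashlib
import json

def event_fingerprint(event: dict) -> str:
    """Canonicalize an event record and return its SHA-256 hex digest.

    Sorting keys and stripping whitespace makes the serialization
    deterministic, so the same event always yields the same fingerprint.
    """
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Hypothetical event: a geo-tagged, time-stamped photo.
event = {
    "type": "photo",
    "image_sha256": "placeholder",  # hash of the image bytes themselves
    "lat": -1.2921,
    "lon": 36.8219,
    "timestamp": "2018-05-14T09:32:11Z",
    "device_id": "tablet-0042",
}

print(event_fingerprint(event))  # 64-char hex string to anchor on-chain
```

The point of the design is that no human fills in a form anywhere in this flow: the device emits the event, the hash commits to it, and any later edit to the record is detectable.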

What thinks? Know someone doing this already?
