A big know know

I’m no expert on such matters, but a friend of mine was talking about different tools for different times and how they use certain technologies for different services.

In particular, his interests are data visualisation techniques and how these can identify fraudulent activity or improve investment strategies.

He equated it to Donald Rumsfeld’s knowns and unknowns, with basic relational databases dealing with known knowns, data mining with known unknowns, complex event processing with unknown knowns and visualisation with unknown unknowns.

It moves a little beyond my ability with technology to really get this, except that there’s this constant stream of data these days, often referred to as 'big data'.

Data is everywhere, streaming across social networks, banking systems, commercial ecosystems and government intelligence services.

All of this data – gigabytes, petabytes and exabytes of it – is growing daily and exponentially.

There’s too much data.

With all of this data something or someone needs to be able to crunch through it, analyse it and see patterns within it.

Cloud computing helps, as it allows an organisation to sift through terabytes of data in seconds without the internal processing power required to do this, but it’s more than just processing power.

It’s the intelligence to discover.

And that’s why this discussion helped: you can apply massively parallel processing, virtualisation and grid computing to this tsunami of data, but without a map to show what’s important and what is not, it’s pretty worthless.

So here’s a simple map:

[Image: Data management 2011]

Be interested to see if folks agree whether this covers the bases or not.



About Chris M Skinner

Chris Skinner is best known as an independent commentator on the financial markets through his blog, the Finanser.com, as author of the bestselling book Digital Bank, and Chair of the European networking forum the Financial Services Club. He has been voted one of the most influential people in banking by The Financial Brand (as well as one of the best blogs), a FinTech Titan (Next Bank), one of the Fintech Leaders you need to follow (City AM, Deluxe and Jax Finance), as well as one of the Top 40 most influential people in financial technology by the Wall Street Journal’s Financial News.


  • Really neat framework, there is a lot more to Big Data than that, but it’s probably the neatest few paragraphs I’ve seen on the subject.
    I found this a great read for getting my head around the concept at the next layer down.

  • df

    Good read!

  • Babis Ermidis

    Reminds me of what Socrates said 2500 years ago: “EN OIDA OTI OYDEN OIDA” (I know one thing, I know nothing).
    Plato later defined 4 types of ignorance:
    1. Simple, when somebody ignores something, but he acknowledges his ignorance.
    2. Double, when somebody ignores something and at the same time doesn’t know he ignores it.
    3. Maximum, when somebody ignores something, he acknowledges his ignorance, but at the same time insists on his opinions, not wanting to escape from his ignorance.
    4. Sophistical, when somebody tries to cover his ignorance with conjectures, unexamined opinions and arbitrary conclusions.
    Not very related to IT, but interesting.

  • As someone experienced in the power and potential of visualization, I’d like to add that the ‘unknown unknowns’ often reside ‘under the nose’ in even the cleanest databases.
    Low bandwidth mediums for ‘seeing’ large scale data make stakeholders at large organizations akin to the blind scholars in the parable describing an elephant (each ‘sees’ only what they can touch immediately before them and comes up with a totally different description/view of what they are all looking at).
    Simply put, the more dynamic and ‘high bandwidth’ the view the easier it is to see the elephants in the data when they move. This is especially true if you consider there is a larger scale structure embedded in most data systems (the bones and muscle) which – if exposed to end users – allows more agile anticipatory decision making to system wide trends and abrupt changes.
    Great post and clearly articulated expression of where the next frontier of competitive advantage for banks resides.

  • Thomas Joseph

    It’s amusing that Mr. Rumsfeld’s pearls of wisdom have their uses. Good food for thought! Babis’ comments resonate with me well.
    In the Rumsfeldian phrases, the noun (the second word – ‘knowns,’ ‘unknowns’) is framed as an absolute, while the adjective (the first word – ‘known,’ ‘unknown’) is a modifier describing our transient state of recognition/awareness. (Rumsfeld was referring to ‘known’ or ‘unknown’ before an incident when we may or may not be aware; ‘unknowns’ may become ‘knowns’ post event, but that is too late.)
    Thus, the ‘unknown’ state may turn to ‘known’ with effort. But the ‘unknowns’ remain so however much effort you put in – you can at best be aware they exist.
    So I wonder about the ‘unknowns’ column in the graphic. If you can apply data mining or visualization to ‘unknowns’ and they turn to ‘knowns,’ aren’t these just ‘unknown knowns’ waiting to be discovered? If we are unable to use tools to turn ‘unknowns’ into ‘knowns’ before consequent events, isn’t the exercise pedantic? (Using visualization for ‘unknown unknowns’ conjures images of someone poring over an abstract visual endlessly, perhaps knowing that it has a story to tell but with no means to decipher it.)
    Perhaps, my thinking is muddled and I got Rumsfeld all wrong (I wouldn’t be the only one!) – and would love to be set straight!

  • Chris Skinner

    @Thomas Joseph
    An unknown unknown is only an unknown unknown until it is known, at which point it is a known unknown we didn’t know was unknown until it was known.