Finding nano-needles in giant data haystacks (#sibos 7 #innotribe)

September 17, 2013 by Chris Skinner

Super caffeinated, I prowled the exhibit hall for an hour or
so and bumped into lots of old and some new friends. There’s not a great deal of stuff happening,
although there are a few stands that stand out such as the Agricultural Bank of
China because the Chinese weren’t here last year. That was when SIBOS was in Osaka and there
was some spat over ownership of Japanese islands … or are they Chinese?

There’s a nice stand from the Azerbaijan bank Pasha.

I like the name Pasha Bank, as I have a passion for banking,
but do have slight concern when I see the bank is owned by the Azerbaijan president’s
father-in-law … but then my bank’s owned by the Prime Minister of Britain, so
why does that worry me?

Finally, I bump into the SEB booth and find the watch on my
wrist has disappeared, along with the belt on my trousers.

What’s going on?

Oh, they have a magician on their stand. He’s very impressive but I’m not sure I like
my trousers dropping around my ankles (although it does get a big laugh).

Ah well, time to find out about Big Data (not that old nugget
again) … yes, it’s the second big innotribe session and this one has a long
list of speakers:

Neil Bartlett, CTO and Head of Development of Risk Analytics, IBM
Daniel Erasmus, Owner, Digital Thinking Network
Matthew Gordon, Forward Deployed Engineer, Palantir
Walid Jelassi, Transformation Consultant, HP
Piotr Kulczakowicz, Vice President, Quantum4D
JP Rangaswami, Chief Scientist, Salesforce
Simon Small, Founding Director, ARRIA NLG
Kimmo Soramaki, Founder & CEO, Financial Network Analytics

Kimmo, a Finnish entrepreneur, kicked off the session
talking about What is Data?

Seems like a simple thing, but maybe not, as everyone in the
room defined it a different way.
Wikipedia gives a good definition though:

“The word data is the plural of datum which is Latin for "something given".”

Maybe a better elaboration is that:

Data are typically the
results of measurements and can be visualised using graphs or images. Raw data,
i.e., unprocessed data, refers to a collection of numbers, characters and is a
relative term; data processing commonly occurs by stages, and the
"processed data" from one stage may be considered the "raw
data" of the next.

This leads logically to a dialogue around data versus
processed data versus information and knowledge, and Kimmo talked about how
data is dirty because it is so unstructured and meaningless but information is beautiful
because we translate the data into meaning.

Walid Jelassi talked about the impact of data interpretation
when Twitter wiped $136 billion off the S&P 500 in ten minutes recently, due to a tweet from a strong market influencer.

The tweet related to an Associated Press breaking news item
saying that there had been two explosions in the White House and Barack Obama
had been injured.

[Image: the AP breaking news tweet]

The tweet turned out to be false – the Twitter account for
the Associated Press had been hacked – but you see the impact of real-time
newsfeeds from social media in action.

In another spin on the view of data, Daniel Erasmus talked
about it being a matter of interpretation as illustrated by the cartoon map of
the world according to Ronald Reagan.

[Image: Reagan's world]

Very funny, but data is processed to create meaning based
upon interpretation and it’s your interpretation that is creating the
meaning. Be aware of that.

We then talked about visualisation tools, graphs, charts and
media and the many ways to look at data these days, with Piotr Kulczakowicz
showing us Quantum4D’s visualisation tools and how they allow you to play with
complex data to find unknowns, such as who was exposed to whom when the Greek
economy collapsed.

[Image: Quantum4D]

Piotr was followed by Simon Small who argued that data is
not what it’s about anyway. It’s about words. It’s about language.
It’s about taking the numbers and telling stories. It’s using the
analytics to create understanding that we can articulate through language.

JP then gave reflections across all the presentations,
talking about dirty data being cleansed and how language is also not consistent. You say toma-to and I say tom-a-to. He made us all smile by saying that it doesn’t
matter whether you say toma-to or tom-a-to, nor whether you know that a tomato is a vegetable
or a fruit; what matters is having the knowledge not to put the tomato
in a fruit salad. That is turning data into knowledge.

This led us into a dialogue about analytics and how to
analyse data.

We talked about how Facebook uses analytics to serve
relevant adverts to users (or not!), and how Google uses the network metric PageRank to rank search
results.

This led to the proposition that the whole financial system is
a set of network interdependencies, and the data flows between these
connections are what is critical today.
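
For the technically curious, here is a rough sketch of how a PageRank-style metric could be applied to a network of banks. It was not shown in the session: the bank names, the exposure links and the `pagerank` function are all made up for illustration, and the code is a plain power iteration rather than anything Google or the panellists actually use.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict of node -> list of nodes it points to."""
    # Collect every node that appears as a source or as a target.
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Every node starts each round with the "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Split this node's rank evenly across the nodes it points to.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node with no outgoing links spreads its rank over everyone.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical exposures: "Bank A": ["Bank C"] means Bank A is exposed to Bank C.
exposures = {
    "Bank A": ["Bank C"],
    "Bank B": ["Bank C"],
    "Bank C": ["Bank D"],
    "Bank D": ["Bank C"],
}

ranks = pagerank(exposures)
most_central = max(ranks, key=ranks.get)
print(most_central)
for bank, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{bank}: {score:.3f}")
```

In this toy network the bank that most others are exposed to comes out with the highest score, which is the sort of interdependency the panel was talking about surfacing.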

That is why we need to sift through the data, identify
relationships through interdependencies and time series, place a lens on the
data to filter out irrelevancies and highlight relevancies.

We talked about automated tools to trawl through all the
social media, video, audio, email, texts, transactions, documents,
spreadsheets, presentations and stuff that we produce these days to find the nano-needles
in giant data haystacks that are relevant.

We talked about taking all this stuff and throwing it into Hadoop,
a framework that transparently provides both reliability and data motion to
applications.

We then talked about how to take all of this mined data and
start to visualise it so we can see the interdependencies.

We talked about hierarchical and relational databases, and
it is quite clear that today we need neural databases that think like a brain
to take all of these analytics and make sense of them.

We talked about a lot and, if this area is of real interest
to you, then you can come along to a FREE discussion I
am chairing on September 24th.

The meeting will debate the relevance of Big Data Analytics
in relation to AML and Fraud at Level 39, Canary Wharf, and features Forrester
followed by a panel discussion featuring myself, as moderator, and:

Derek Wylde, Global Head of Fraud Management, HSBC
Stephen Foster, Director Anti-Money Laundering, Group Financial Crime, Barclays Bank
Ram Chinta, PolarisFT
Paul Phillips, Hortonworks
Martha Bennett, Principal Analyst, Forrester

The event is organised by Polaris and Hortonworks, and takes
place from 18:00 on September 24.

ANYONE WHO WORKS FOR A BANK CAN GET A FREE TICKET BY
REGISTERING HERE

Anyways, I’m now off to do more exhibit touring – must get
those free gifts! – and find out about the new cyberworld insecurities of
banks.


Chris M Skinner

Chris Skinner is best known as an independent commentator on the financial markets through his blog, TheFinanser.com, as author of the bestselling book Digital Bank, and Chair of the European networking forum the Financial Services Club. He has been voted one of the most influential people in banking by The Financial Brand (as well as one of the best blogs), a FinTech Titan (Next Bank), one of the Fintech Leaders you need to follow (City AM, Deluxe and Jax Finance), as well as one of the Top 40 most influential people in financial technology by the Wall Street Journal's Financial News.