“If we have data, let’s look at data. If all we have are opinions, let’s go with mine” (James Love Barksdale)
The Emperor’s New Clothes
A few months ago I went for a job interview. I admit I am rather straightforward and blunt which can sometimes come across as rudeness, especially in British culture.
I was being interviewed by the CTO, who, I felt, was a bit of a narcissist. With years of experience dealing with such people, I worked my way through and felt that the interview had gone well, until he said, “Yes, well, Machine Learning is nothing more than statistics with good PR”, showing a reckless disregard for a Deep Learning solution that I had suggested for an allocation problem he had mentioned. He simply refused to listen to my reasoned argument.
But sadly he is not alone. A couple of years ago, there was a meme all over social media stating that there is nothing to be excited about in ‘Machine Learning’, as it is just a makeover of some statistical techniques.
That might seem true to those who approach ML as a one-dimensional science that consists of data, and a lot of it. But it is not true. There is far more to Machine Learning than glorified statistics.
I am definitely not the first person to point out that the ‘Emperor Has No Clothes’ and a lot has been written about it, but I am entitled to an opinion as we all are and I promise, it won’t only be about Artificial Intelligence.
Three Apples of a Story:
My last article suggested a Fourth Apple that will change the world. So it is only a natural continuation to stick with Apples, even when we are talking about something like data. But to put your mind at ease, it will be a very different set of Apples.
‘Three Apples Fell From Heaven’ – this tale is told in various countries in different ways. The Turkish version is: “Three apples fell from the sky; one for the teller of this story, the second for the hearer of this story, and the third for the child who might someday read this book”. From Iraq, there is a tale that ends: “Three apples fell from the sky; one for me, one for the storyteller, and one for the listener, and the peel for the Sultan”.
But, what may be considered as a standard distribution is: “one for the teller, one for the listener, and one for the eavesdropper”, and of course, what holds everything together is the story.
Three Types of Artificial Intelligence:
As promised, this is not all about Artificial Intelligence (AI), so let’s start with some basic understanding of what this is all about: Artificial Intelligence is the concept of giving machines the ability to learn.
If you are wondering why it is called ‘Artificial Intelligence’ and what is artificial about it, believe me, you are not alone. There is much dispute over this, with some claiming that ‘Artificial’ should stand for ‘false’ because it’s not human (one might ask what the definition of human is, but that is a discussion for some other time), and others saying ‘Artificial’ only because it is created by humans and didn’t originate from natural causes. The scientific community hasn’t firmly decided one way or the other, and neither have we.
But, as we know, there are multiple levels of acquiring knowledge in learning, and the same is true of AI. We can divide AI into three types (I promise, this is not me making it up to align with my three themes here, I just happen to be lucky):
- Artificial Narrow Intelligence (ANI), or Weak AI.
- Artificial General Intelligence (AGI), or Strong AI.
- Artificial Super Intelligence (ASI), or Conscious AI.
Data, Data and More Data:
We started this debate with the view of AI as dressed-up statistics. Simply put, statistics is the branch of mathematics that deals with data. So if we are to get down to the nuts and bolts of this argument, we should start by talking about data. Bear with me, this data is going to be entertaining. Big or small, it all starts with a piece of information.
Whether you like it or not, Big Data is here to stay and, to some extent, is bound to envelop every field of human activity. There is a tremendous amount of content online on the topic of Big Data and its related fields, starting from Business Intelligence (BI), Internet of Things (IoT), Cloud Computing, Automation and, obviously, what we are here to discuss – Artificial Intelligence (AI). But, first of all, let us understand what data is. The best way is probably to understand the ‘Data Pyramid’ or the ‘Knowledge Pyramid’.
The Data Pyramid
“Data is not information, information is not knowledge, knowledge is not understanding, understanding is not wisdom” (Clifford Stoll – American Astronomer and Author)
The ‘DIKW Pyramid’ or ‘DIKW Hierarchy’ refers to a model representing structural and functional relationships between Data, Information, Knowledge, and Wisdom. According to Danny P. Wallace, a professor of Library and Information Science, the origin of the DIKW Pyramid is uncertain, although many authors agree that DIKW, or at least IKW, originated in the 1934 play ‘The Rock’ by T. S. Eliot.
Like other hierarchy models, the Knowledge Pyramid has rigidly set building blocks – data comes first, information is next, then knowledge follows and finally wisdom is on the top.
Each step up the pyramid answers questions about the initial data and adds value to it. The more questions we answer, the higher we move up the pyramid, (somewhat like Maslow’s Hierarchy of Needs). In other words, the more we enrich our data with meaning and context, the more knowledge and insights we get out of it. At the top of the pyramid, we have turned the knowledge and insights into a learning experience that guides our actions.
It is about time to take a deep dive into each component of the pyramid:
Data (or: The Story)
“Data are just summaries of thousands of stories” (Dan Heath)
We started our journey with data by exploring the ‘Data Pyramid’, but the foundation of this pyramid is data, so what is it? Data is a collection of facts, such as numbers, words, measurements, observations or just descriptions of things. Data can be raw or organized, but if data is not put into context, it is meaningless to both human and computer. Data on its own has little direct value, but knowing the accuracy of the data underpins everything else.
It seems like pyramids are a great way to structure data, so I have another one for you, and it has to do with stories:
Most great stories, whether a Pixar film or a novel, follow a certain dramatic structure. Dramatic structure is an idea, originating in Aristotle’s ‘Poetics’, that effective stories can be broken down into five elements, usually Exposition, Rising Action, Climax, Falling Action, and Resolution, and that writers constructing a story should include these five elements.
Gustav Freytag originally formulated ‘Freytag’s Pyramid’ in his 1863 book ‘Freytag’s Technique of the Drama’, and it has become one of the most commonly taught dramatic structures in the world.
When people engage with data and try to visualize it, they often say, “Let’s tell a story with this data”, and in the end they create some sort of chart or other representation of the data. But most of the time we are not, in fact, telling stories with our data; we are instead making a point or illustrating a fact or argument.
Data in AI:
“Machine Learning tends to work best if you give it enough data and the rawest data you can” (Frank Petterson, AI Specialist and VP of AliveCor)
While AI scenarios might sound like Sci-Fi, AI’s practical, effective applications begin with data. Indeed, data is the foundational element that makes AI so powerful. Similar to ‘Maslow’s Hierarchy of Needs’, Monica Rogati’s ‘Data Science Hierarchy of Needs’ is a pyramid (yes! another one) showing what is necessary to add intelligence to a production system. At the bottom is the need to gather the right data, in the right formats and systems, and in the right quantity.
It is crucial to acknowledge that any application of AI and ML will only be as good as the quality of the data collected.
What makes the data useful and more meaningful is putting it into context. This way, we have transformed the raw sequence of characters into information, which is the next building block of the Data Pyramid.
Information (or: The Storyteller)
“What makes a story unique is not necessarily the information in the story but what the writer chooses to put in or leave out” (Roland Smith, American Author)
Information is data that has been ‘cleaned’ of errors and further processed in a way that makes it easier to measure, visualize and analyze for a specific purpose.
By asking relevant questions, we can derive valuable information from the data and make it more useful for our purpose. Basic information is created when the data is analyzed and assessed.
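As a rough illustration of the step from data to information, the sketch below takes a handful of made-up raw readings, discards the entries that cannot be parsed, and then answers one simple question about what remains. The readings, the function names and the question itself are all hypothetical and only meant to show the idea.

```python
from statistics import mean

# Hypothetical raw data: temperature readings as they might arrive, errors included.
RAW_READINGS = ["21.5", "22.1", "n/a", "", "23.4", "abc", "20.0"]


def clean(readings):
    """Turn raw strings into numbers, discarding anything that is not a number."""
    cleaned = []
    for value in readings:
        try:
            cleaned.append(float(value))
        except ValueError:
            continue  # not a number: treat it as an error in the data and skip it
    return cleaned


def summarize(values):
    """Answer a simple question: how many valid readings are there, and what is their average?"""
    return {"count": len(values), "average": round(mean(values), 2)}


# Data (raw strings) becomes information (an answer to a specific question).
information = summarize(clean(RAW_READINGS))
print(information)
```

The raw list on its own says very little; it is the cleaning and the question being asked that turn it into something we can use.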
The Pyramid Principle:
Assuming we have our story in place, gathering and processing it into valuable information might be a bit challenging. This is where another pyramid comes into play: ‘The Pyramid Principle’ was developed in the 1970s by Barbara Minto, the first female post-MBA hired at McKinsey. The Pyramid Principle is a methodology for structured communication, but it can work really well with data processing, as it is a natural partner for hypothesis-driven thinking.
The Pyramid Principle starts with the end in mind. Give your conclusion or answer first, follow it up with your main arguments, and then follow those with data that supports each one. That is the core of The Pyramid Principle: a principle that allows you to quickly seize your data and get the information required for your purpose, by creating a compelling story that is easy to understand and remember.
This is the art of telling your story tailored to your audience.
Back to AI – the first level of AI is considered the basic form of AI and is called ‘Narrow AI’ or ‘Weak AI’. It is a kind of intelligence that can handle just one particular task at a time, getting smarter at it by improving its execution. The main goal is to solve a problem or inconvenience, or simply to improve something that already works.
A good example of Narrow AI is a voice assistant such as ‘Alexa’ or ‘Siri’, which performs a certain action upon a voice command. Ben Goertzel, an AI Researcher, stated on his blog in 2010 that ‘Siri’ was “VERY narrow and brittle”, as evidenced by the annoying results you get if you ask questions outside the limits of the application.
Even IBM’s ‘Watson’ supercomputer, which out-performed human contestant Ken Jennings to become the champion on the popular game show ‘Jeopardy!’, can only be classified as an example of ‘Narrow AI’. Watson is a type of expert system, combined with cognitive computing, machine learning and natural language processing, built to perform as a ‘question answering’ machine and simulate the knowledge and cognitive ability of a human within a particular realm.
Narrow AI tends to be software that is automating an activity typically performed by humans. Currently, most of Artificial Intelligence is Narrow AI.
We can think of Narrow AI as information. Although the concept of information has different meanings in different contexts, if we are to simplify it, it consists of input and output. It is the answer to the question of ‘what an entity is’.
I will argue that, broken down to its simplest form, this is the same basic concept as Narrow AI: an intelligent, constantly learning output that is the result of an input.
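To make the input-output idea a little more concrete, here is a minimal sketch of a toy ‘narrow’ learner. It is not how Siri or Watson work; it is a hypothetical nearest-neighbour example with invented data (hours studied mapped to an exam result), meant only to show an output that is the result of an input and that depends on the examples seen so far.

```python
class NearestNeighbour:
    """A toy narrow learner: it answers with the output of the most similar input it has seen."""

    def __init__(self):
        self.examples = []  # (input, output) pairs seen so far

    def learn(self, x, label):
        """Store one more example; this is all the 'learning' this toy does."""
        self.examples.append((x, label))

    def predict(self, x):
        """Return the output attached to the closest known input."""
        closest = min(self.examples, key=lambda pair: abs(pair[0] - x))
        return closest[1]


# Hypothetical training data: hours studied -> exam result.
model = NearestNeighbour()
for hours, result in [(1, "fail"), (2, "fail"), (8, "pass"), (9, "pass")]:
    model.learn(hours, result)

print(model.predict(7))  # closest known input is 8
print(model.predict(3))  # closest known input is 2
```

The model can only ever answer this one kind of question. Ask it anything outside hours and exam results and it has nothing to offer, which is exactly what makes it narrow.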
Information answers a question, but when we get to the ‘how’, this is what makes the leap from information to knowledge.
Knowledge (or: The Listener)
“When you talk, you are only repeating what you already know; But when you listen, you may learn something new” (Dalai Lama)
Knowledge, the next building block of the DIKW Pyramid, refers to awareness of or familiarity with various objects, events, ideas, or ways of doing things.
When we don’t just view information as a description of collected facts, but also understand how to apply it to achieve our goals, we turn it into knowledge. This knowledge is often the edge that one can have over others playing in the same field. As we uncover relationships that are not explicitly stated as information, we get deeper insights that take us higher up the DIKW Pyramid.
The next step towards more comprehensive machine intelligence is ‘Artificial General Intelligence (AGI)’, or ‘Strong AI’. It is also called by some ‘The True AI’. As its name suggests, it is general-purpose, and rather than focusing on a single task, the machine has the ability to comprehend and reason on a wide level, like a human, “at least as smart as a typical human”.
An easy way to understand Strong AI is to think of it as being more like a brain. It does not classify but uses clustering and association to process data. As opposed to Weak AI, there isn’t a set answer to your keywords. The function will mimic the result, but in this case, we aren’t certain of the result. It is like talking to a human: you can guess how someone may reply to a question, but you don’t know for sure.
This is the sort of AI we see in Sci-Fi movies (like Samantha in ‘Her’, or Ava in ‘Ex Machina’, both favourites of mine), in which humans interact with machines and operating systems that are both conscious and self-aware.
I recently wrote about Immortality and Singularity and mentioned the futurist Ray Kurzweil, Google’s Director of Engineering. Kurzweil describes Strong AI as when the computer acts as though it has a mind, irrespective of whether it actually does.
Tests for AGI:
Various tests have been proposed for deciding whether a machine qualifies as Strong AI. The most familiar (and limited) is the ‘Turing Test’, but others have been proposed: ‘The Coffee Test’, by Stephen Gary Wozniak, Apple’s Co-Founder, in which a machine is given the simple task of going into a home and figuring out how to make coffee; ‘The Robot College Student Test’, by Ben Goertzel, an AI researcher, in which a machine enrols in a university and obtains a degree; and a test suggested by Nils John Nilsson, an American Computer Scientist, in which a machine is employed and performs at least as well as humans in the same job.
Strong AI does not currently exist. Some experts predict it may be developed by 2030 or 2045. Others more conservatively predict that it may be developed within the next century, or that the development of Strong AI may not be possible at all.
But for machines to achieve true human-like intelligence, data is clearly not enough. They will need to be capable of experiencing consciousness, and this is something that even a vast amount of data cannot achieve.
Common Sense Knowledge:
Common Sense is the knowledge that all humans have. Such knowledge is unspoken and unwritten – we take it for granted. We acquire it imperceptibly from the day we are born. This knowledge is often used by human experts even when solving a task.
John McCarthy, one of the Fathers of AI, who also coined the term ‘Artificial Intelligence’, was amongst the first to realize its importance. He wrote a paper that was the first to propose common-sense reasoning, through a hypothetical program called ‘Advice Taker’ in 1959. This paper only described a specification for what a common-sense program should do. However, it soon became apparent that there was a need for working common sense knowledge programs to assist decision making in AI expert systems.
Despite many valiant efforts, there is a general feeling that insufficient progress has been made in common sense applications for AI. One of the problems is that it is very difficult to formulate because it is a very messy unstructured domain.
As Napoleon Bonaparte famously said: “Ability is nothing without opportunity”. Knowledge is nothing without taking pro-active decisions, and this is when we reach the final step of the Knowledge Pyramid, Wisdom.
Wisdom (or: The Eavesdropper)
“We don’t receive wisdom, we must discover it for ourselves” (Marcel Proust)
A lot has been said about Knowledge and Wisdom and what lies between them. It is a very debatable topic which cannot be fully covered in a few paragraphs, so we will only bring up a few of the popular concepts, which are obviously only the tip of the iceberg.
Scientia Potentia Est (Knowledge is Power):
This Latin phrase is commonly attributed to Sir Francis Bacon, but Dale Carnegie, an American Writer and Lecturer and the author of ‘How to Win Friends and Influence People’, famously said: “Knowledge isn’t power until it is applied”.
Philosophers, Psychologists, Spiritual Leaders, Poets, Novelists, Life Coaches, and a variety of other important thinkers have tried to understand the concept of wisdom, yet no one has ever come up with a single, irrefutable description for it. According to ‘Wikipedia’ and ‘Psychology Today’ definitions, “Wisdom is experience and knowledge”.
Actions and Experience:
This emphasis is fairly common. Aristotle distinguished between two different kinds of wisdom, ‘Theoretical Wisdom’ and ‘Practical Wisdom’. Theoretical Wisdom is, according to Aristotle, “scientific knowledge, combined with intuitive reason, of the things that are highest by nature”. Aristotle’s idea that scientific knowledge is knowledge of necessary truths and their logical consequences is no longer a widely accepted view.
Both Knowledge and Experience do seem to be necessary conditions for being, or becoming, wise. But this definition doesn’t sit well with me. It is certainly possible to have experience but little, if any, wisdom. Not everyone with lots of life experience is wise, which is just to say that not every elderly person knows how to live well.
Wisdom as Knowing How To Live Well:
Some people have an abundance of very important factual knowledge, yet lack the kind of practical know-how that is the mark of a wise person. Wise people know how to get on in the world in all kinds of situations and with all kinds of people. Extensive factual knowledge is not enough to give us what a wise person knows. As Robert Nozick points out, “Wisdom is not just knowing fundamental truths, if these are unconnected with the guidance of life or with a perspective on its meaning”. There is more to wisdom than Intelligence and Knowledge of Science and Philosophy or any other subject matter.
Or as Eleanor Roosevelt said, “Never mistake knowledge for wisdom. One helps you make a living; the other helps you make a life”.
There are a number of important criticisms to consider here, and this entire discussion is open for an interesting, deep and long, philosophical debate. As the saying goes, “the more you know, the more you know you don’t know”, so we will leave this wise discussion for some other time.
Do you remember my fourth apple, kindness? Well, when I was doing my research about wisdom, I came across this saying from ‘The Talmud’: “The highest form of wisdom is kindness”. Let it be.
Artificial Super Intelligence (ASI) – Conscious AI:
“By the time we get to the 2040s, we’ll be able to multiply human intelligence a billionfold. That will be a profound change that’s singular in nature. Computers are going to keep getting smaller and smaller. Ultimately, they will go inside our bodies and brains and make us healthier, make us smarter.” (Ray Kurzweil, Futurist)
‘Artificial Super Intelligence’ or ‘Conscious AI’ is an aspect of intelligence far surpassing that of the brightest and most gifted human minds. Nick Bostrom, a philosopher at Oxford University, defines Super Intelligence as “any intellect that greatly exceeds the cognitive performance of humans in virtually all domains of interest”. Where Strong AI is barely on the horizon, Super Intelligence is much more uncertain.
Researchers all around the world are working towards developing smarter AI by mimicking the human brain and its cognitive abilities. The ‘Blue Brain Project’ is an initiative by EPFL that is trying to achieve a total digital reconstruction of the mammalian brain (so far, they have simulated synapses on the scale of a bee’s brain). Google DeepMind, Google Brain and Neumeta are all examples of companies and projects sharing the same mission of trying to push the boundaries of AI and develop an intelligence similar or equal to human-level.
Many AI researchers are calling ASI “the last invention man will ever have to make”. If convinced to work alongside humans, it could help us solve all of the world’s most pressing problems. We could even ask it to eliminate all disease and end ageing as we know it. Humanity could, for the first time, cheat death permanently and enter a new age of prosperity. Or ASI could take over and, if it no longer requires humans, lead us to annihilation.
Our future, like a neural network, is a black box we know little about. But this is all speculation; we might even coexist happily.
‘Artificial Wisdom’ can be described as Artificial Intelligence reaching the top level of decision-making when confronted with the most complex and challenging situations. The term ‘Artificial Wisdom’ is used when the intelligence is based on more than collecting and interpreting data by chance, and is also enriched with the smart and conscious strategies that wise people would use.
Yoon Lee, Senior Vice President and Division Head at Samsung Electronics America, says: “You can’t be dumb and have wisdom. In order to have wisdom, you have to have some level of intelligence. So I think this AI if it reached to a level where we can call it Artificial Wisdom, I think that’s going to be when… It’ll be much more contextual to you.”
Data is the King?
We have spent years making neural nets crazy smart by feeding them vast amounts of data, trying to teach them to think like human brains.
But they have absolutely no common sense.
What if we have been doing it all wrong?
We started with the popular opinion in the AI community that we need data, and a lot of it, in order to make better AI, and that with the rise of Big Data we will achieve a better and stronger AI. But, as I hope I have managed to explain, if we want to achieve Strong AI, we need much more than data. Data is awesome for building Weak AI. You can become an expert in one task by endlessly repeating the same set of actions. You can even build an AI that surpasses humans in a single task by feeding it all the possibilities. And a tremendous amount of data that would overwhelm a human brain is easily processed by a machine.
Saying that AI is all about data is a very shallow way to look at it. It is like saying that humans are nothing more than data consumers. If that were the case, we would have achieved world peace by now. Humans are the most complex species that exists; they may lack the ability to process large quantities of data, but they are far more advanced in other skills.
I am not arguing against data and its importance in building AI. It is indeed right for the short term, Weak AI. But I will argue that this might be the wrong direction if we want to make the leap to AGI and eventually ASI. Though we’re years away from ASI, researchers predict that the leap from AGI to ASI will be a short one.
Data is NOT the King
Gary Marcus, professor of Psychology and Neuroscience at NYU, presents in his paper, ‘Deep Learning: A Critical Appraisal‘, ten concerns for deep learning, but sums it up by saying: “Despite all the problems I have sketched, I don’t think we need to abandon deep learning. Rather, we need to reconceptualize it: not as a universal solvent, but simply as one tool among many”.
As per Marcus, Deep Learning can’t really reach General AI because it’s missing deep understanding. If you get enough data, it can look like you’ve got understanding, but it’s actually a very shallow understanding.
“We don’t need massive amounts of data to learn,” Marcus says. A kid doesn’t need to see a million cars before being able to recognize one, and can even generalize when seeing a tractor for the first time, understanding that it is a sort of car.
Data is one tool, “a power screwdriver in a world in which we also need hammers, wrenches, and pliers, not to mention chisels and drills, voltmeters, logic probes and oscilloscopes”, as per Marcus.
One thing we can all agree on: the human mind is one of the most impressive creations of evolution. The question we need to answer is how to condense an evolutionary process hundreds of millennia long into algorithms and code, and do it much faster. And if you ask my humble opinion, it is really important for AI to collaborate with neuroscience.
We are not in the Empire Business, we don’t even have a Kingdom. So we should give data its place, but stop crowning it as a King. In fact, we might just have a ‘Technocracy’.
Oh! And needless to say, I didn’t get that job. Apparently, “I was not a good fit”.
In the past few months, I’ve been looking into Computational Neuroscience. I think this might bring some remedy. But more about it in my next post. In the meantime, remember, “Truth is in the eye of the beholder” (The Lion King).
You are right, we’ve been doing it all wrong. The best comment on the nature of Neural Networks was offered two millennia ago, and to this day it remains the most profound insight into the subject:
“… Not holding to the truth, for there is no truth in it; when it lies, it speaks its native language.” — John 8:44
This one is recent, but short and to the point: “Lies, damn lies and Data Science.”
“We don’t receive wisdom, we must discover it for ourselves” — that’s because, unfortunately, no one can be told what the Matrix is, you have to see it for yourself. OK, maybe I can try — it’s everywhere, a world pulled over your eyes to prevent you from seeing the truth. — What truth? — That you are a slave, Neo. <== every single word in there is true, save for "Matrix" and your name is not Neo.
But drop a line and I'll show you how deep the rabbit hole goes.