The rise of the supercompetitor (Part 1)

I have previously defined cognitive advantage as ‘the demonstrable superiority gained through comprehending and acting to shape a competitive environment at a pace and with an ingenuity that an adversary’s ability to comprehend and act is compromised’. The enabler of such a capability is the wide-scale adoption of AI to maximise knowledge discovery and to hyper-accelerate decision-making, either through automated action or by augmenting a human.

But what would the impact be if a company, or a government, held a cognitive advantage? There is a lot to unpack here – and I am still feeling my way – and so I am going to cover this topic over a series of posts.

Holding a cognitive advantage could equip its holder with a profound and deeply defensible position. If you are able to control the environment that your adversary is trying to make sense of, then you not only hold the best cards but you also know what cards your adversary has, because you chose which cards they should have. The odds are stacked completely in favour of the holder.

Out-thinking and out-manoeuvring rivals and adversaries, continuously, will require hyper-accelerated decision-making with the agency to act with precision, foresight and an understanding of complex system effects. I explore this in my book ‘Cognitive Advantage‘ and, in summary, it could result from the optimal orchestration of artificial and human intellects alongside beneficial data accesses to yield an ability to act with deep sagacity (‘the trait of solid judgment and intelligent choices’).

I propose a simple* goal for AI: KK > UK + KU + UU

Where KK = Known Knowns, UK = Unknown Knowns, KU = Known Unknowns, and UU = Unknown Unknowns.
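One way to read this (my own framing rather than a formal definition) is to treat each category as a set of facts and compare sizes:

```latex
\[ \lvert \mathrm{KK} \rvert \;>\; \lvert \mathrm{UK} \rvert + \lvert \mathrm{KU} \rvert + \lvert \mathrm{UU} \rvert \]
```

That is, keep what you reliably know larger than everything you don’t.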

What does this mean? Simply that if a company, or a country, deploys AI to continually discover new knowledge faster than adversaries, and turns that knowledge (Known Knowns) into actionable steps, for either a machine or a human, then that could yield a cognitive advantage (or, if you like, a decision advantage).

With such a capability, a competitor may decide to actively shape a competitive environment (e.g. our online and offline worlds) at a pace, and in ways, that its competitors simply cannot keep up with or make sense of. Compromising the cognitive abilities of competitors in this way may lead to a decline in their ability to act with any kind of accuracy or insight. After all, they are making sense of a world that the competitor with a cognitive advantage has created. This is likely to lead to poor decision-making, further reducing their ability to mount an effective response. In the limit, this could feed a regressive cycle in which competitors’ perception and reasoning decline until their decisions are no better than random chance.

As the ability of competitors to compete continues to decline, would this allow a single competitor – one that holds a cognitive advantage – to become a supercompetitor that dominates in an unassailable winner-takes-all outcome? What might this mean? What would the impact be? And what if it becomes impossible to regulate such organisations? A supercompetitor that does not wish to be understood may not be possible to regulate.

How would we know which companies, or countries, are already on (or have the potential to get on) a supercompetitor trajectory? How might we begin to prepare for such a possibility? Are there already supercompetitors amongst us, such as big tech firms like Microsoft or Google, enabled as they are by their strategic investments in world-class AI labs such as OpenAI and DeepMind? Or is a cognitive advantage already being developed as a sovereign capability by the United States or China?

I am unsure if we can prevent the rise of a supercompetitor. If such a state were to occur, we should hope that the supercompetitor is guided by strong ethical principles. But there is an inherent paradox in such a statement: an ethical supercompetitor is unlikely to want to be a supercompetitor, as it would deem the position itself to be unethical. So which is more unethical? Becoming a benevolent supercompetitor to prevent bad actors from becoming one? Or deciding that it is not a state any single organisation or country should occupy, and leading by example, at the risk that this leaves the field open to less ethical competitors?

Image generated with Stable Diffusion using the prompt “Side profile of wise woman, city in the background, dark colors”

* Ultimately, such an equation is not that helpful. When we act on Known Knowns we inevitably increase the number of Unknowns (on the right-hand side of the equation) that we then have to discover and turn back into Known Knowns! Nevertheless, the principle this equation illustrates – using AI to maximise knowledge discovery and sagacity – does convey a motivating factor that could lead to the rise of a supercompetitor.

We’re overloading the term ‘Data Scientist’

I work in a team that is navigating a way forward for a large, high-tech organisation to get on the right trajectory with AI. This technology is not new to them and, indeed, they have been pioneers of AI over the last couple of decades. However, their use of AI has been in very specialised areas whereas the opportunity now is to use AI much more widely across their entire global enterprise. We are now into our second year of helping them and we have unearthed a number of insights. One insight, in particular, is quite key to what we do next: the term ‘data scientist’ is being overloaded.

What do we mean? Using AI to help people make better decisions is one of the most-cited and most-pursued application areas for AI, whether it’s in medicine, law, or elsewhere. However, data science doesn’t train us to be good psychologists or behavioural scientists or user researchers. This most important and deeply human activity – understanding a situation, knowing what you are trying to achieve, considering your options, trusting (or not) the information you have – is complex, personal and ambiguous. Extracting useful information from data is an important job of the competent data scientist. Understanding what a decision-maker is trying to achieve, the decisions they need to make, the information they need to make a good decision, and so on, is not within the toolkit of any data scientist I know. And arguably shouldn’t be.

Should we, instead, look to the disciplines of software engineering and user research / user experience design to cover some of this? I don’t think so, primarily because software engineering seeks to answer the question “how do I maximise the utility of this piece of software for this user?” rather than “what top-level outcomes is my user trying to achieve?”.

Market research and focus groups approach this question from the fields of economics and behavioural science. But, yet again, the goal is different; in this case it is: how do I best understand the latent and unmet needs of certain types of people and convince them to buy this product?

The only discipline that I have found that solely focuses on helping a person make good decisions is, unsurprisingly, decision intelligence (which builds on related disciplines such as decision analysis and decision engineering). To illustrate, consider this extract from Lorien Pratt’s upcoming book The Decision Intelligence Handbook:

Copyright © 2022 Quantellia LLC. Reprinted with permission.

Now, we don’t need to be too concerned with the actual steps described in Lorien’s diagram. The point here is that there is (a) considerably more to decision-making than extracting information from data, and (b) a different framing of the problem that starts from the outcomes someone is trying to achieve and the decisions they believe will directly affect those outcomes. Whether it’s a solo decision or a team decision, the scope of enquiry that a person considers is broader than simply analysing the information that is presented to them.

By taking a more decision-centric approach to understanding where to use data and AI, we stand a better chance of investing our time and resources in those things that directly support the decisions that count towards achieving outcomes.

Decision intelligence (or, if you are averse to that phrase, think of it as a decision-centric approach to developing safe and effective AI applications) is a discipline that is rapidly entering the mainstream of business and operations management. When we come to consider where best to use data and AI, we should start by understanding the outcomes we are trying to achieve and, therefore, the detailed nature of the decisions we need to make to pursue them.
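To make that concrete, here is a minimal sketch of what a decision-centric framing might capture before any data science work starts. It is illustrative only – the class names, fields and example values are hypothetical and are not taken from Pratt’s framework:

```python
from dataclasses import dataclass, field

@dataclass
class Decision:
    """A decision the decision-maker believes directly drives the outcome."""
    question: str                      # e.g. "Which regions do we expand into next year?"
    information_needed: list[str]      # what the decision-maker needs to know to decide well
    candidate_data_sources: list[str]  # where data science / AI work could supply it

@dataclass
class Outcome:
    """A top-level outcome the decision-maker is trying to achieve."""
    statement: str
    decisions: list[Decision] = field(default_factory=list)

# Hypothetical example: the outcome comes first; the data science asks fall out of it.
outcome = Outcome(
    statement="Grow revenue 10% without increasing headcount",
    decisions=[
        Decision(
            question="Which regions do we expand into next year?",
            information_needed=["forecast demand by region", "cost to serve by region"],
            candidate_data_sources=["sales history", "logistics cost data"],
        )
    ],
)

# Each item of information_needed becomes an aiming point for data scientists,
# traceable back to a decision and, ultimately, to an outcome.
for decision in outcome.decisions:
    for info in decision.information_needed:
        print(f"{outcome.statement} -> {decision.question} -> needs: {info}")
```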

To make valuable use of data and AI, we need to provide clear, consistent and good-quality aiming points for our data scientists and software engineers, who will very much welcome the clarity and confidence that this brings.

A 21st century Taylorism?

Whether it’s Amazon’s infamous efficiency metric ’the rate’, used to set the desired productivity of its workforce, or the continuous 360 feedback app used by NextJump employees, firms are increasingly using data generated about their own business to surveil the productivity and effectiveness of their workforce. Some regard this as the promise of management information and business intelligence finally making it into the real world; others see it as fundamentally inhumane and unethical. Where one person may be motivated by relentless feedback and short-term goal-setting (regardless of who sets that goal), another may find it smothering and repressive. With the investment going into data science and artificial intelligence, we can expect such working practices to be adopted ever more widely.

A timely question seems to be: is this an acceptable form of managing employees? Or are we about to embark on a journey that makes the de-humanisation of the workforce pioneered by Frederick Taylor’s time-and-motion studies of the early 20th century look like a side project – a de-humanisation from which, as an advanced society, we have still not recovered (ask anyone who works at McDonald’s)? Is data-driven decision-making, much of it automated through the use of AI, going to introduce a new management/employee paradigm that creates a deeply unfair society? Are we on the brink of a 21st century Taylorism?


Amazon’s ‘the rate’ algorithm

Amazon is probably one of the better-known examples of a company using machine learning and data analytics to automatically assign performance targets for the human workforce in its huge distribution centres. The now-notorious efficiency metric ‘the rate’ is determined and set by an algorithm.

“Every task in an Amazon fulfilment centre has an efficiency rate. The two most demanding tasks are ’stow’ and ‘pick’. They represent the input to a fulfilment centre (deliveries coming in) and the output from the fulfilment centre (orders going out to customers). Stowing requires removing products from their boxes and then stacking them on shelves. Picking requires checking and labelling products before they are sent out to customers. Robots assist both of these processes. As a box is handled, a bar-code is scanned. This creates a data point that Amazon’s automated system then monitors. An employee is informed how well they are doing by a visual graph shown on the monitor at their workstation. The graph changes colour (green, yellow, red) depending on how well they are doing. Task performance is determined by an algorithm. If a worker falls behind the target efficiency rate then they receive a warning. After the fourth warning, a worker is fired. Apparently, supervisors rarely question the targets set by the algorithm and, hence, affected workers do not have a ‘human in the loop’ ensuring that ‘the rate’ is fair. As is to be expected, this simple management tool has come to be feared by Amazon workers.” – ‘the rate’ (extracts from The Verge article)

The issue here is not that Amazon is using quantitative metrics to objectively assess employee performance – most manufacturing industries use similar methods. The issue is that an algorithm automatically determines what the rate should be, and appears to do so with one aim in mind: continually increase the rate over time. For example, when Mohamed began her job at an Amazon fulfilment centre she had a target stow rate of 120 items per hour. Within three years that had increased to 280 items per hour, even though the type of packages and parcels that Mohamed has to deal with hasn’t changed.
The constant uncertainty over what ’the rate’ will be from one day to the next is causing anxiety.


Of course, Amazon’s leadership are loath to change anything as they are achieving remarkable results. The Institute for Local Self-Reliance found that Amazon requires only half the employees of a traditional retailer for every $10 million in goods sold. Furthermore, the Institute also notes that the general trend is downwards: Amazon requires fewer and fewer employees over time to fulfil the same amount of business. All of this, naturally, feeds profitability. Automation (robots and machine learning algorithms) and efficiency (’the rate’) are credited as the main enablers of such performance.


“Amazon essentially has developed factory-line technology for retail” – Spencer Cox, PhD candidate and ex-Amazon worker.


NextJump‘s 360 feedback app


NextJump is an e-commerce company that generated $2 billion in revenue in 2018 by linking employees of other companies to perks and rewards – discounts on things like cinema tickets, eating out at the local restaurant, and so on. But that is not the most interesting thing about them. Rather, it is their relentless focus on changing their own culture on, it seems, a daily basis that has gained the company attention. NextJump has invested in so-called HR technology: an app on every employee’s phone that allows them to give feedback on their colleagues constantly.

Ordinarily, feedback about your performance is something done between a member of staff and their line manager, probably once or twice a year in line with setting personal objectives for the next 6 to 12 months. Not so at NextJump. Here you can expect to receive, and are expected to give, constant feedback after every meeting or day in the office. The person giving you feedback attaches a happy, neutral or sad emotion along with a text entry, ranging from ‘your enthusiasm is always inspiring to me’ and ‘you forgot to mention next week’s meeting’ to ‘I thought you came across as a bit negative’.

Now, I personally don’t see this as a bad thing. After all, it can be difficult to accurately interpret someone else’s feelings at a conscious level unless the other party feels that they can be honest and explicit. As for whether I would want constant, 360-degree feedback every day: I am not sure that my personality type – I have a tendency towards introversion – would find it comfortable. Nevertheless, I think that any technology that can improve the quality of human dialogue and interactions is valuable if used carefully.


Where NextJump appear to stray over the line from optimal performance into questionable ethics is in how their managers analyse and respond to each employee’s feedback trends. Constantly getting negative feedback? You can expect a visit from the ‘happy police’. Constantly getting positive feedback? That’s great, but are you – perhaps – getting too comfortable? Apparently, getting some negative feedback is seen by NextJump managers as an indication that you are not in your comfort zone, and this is deemed good: they don’t want you getting too comfortable, otherwise you are not learning. As a colleague of mine once put it: NextJump employees appear to never be ‘in flow’. Now, I am firmly signed up to the belief that for any organisation – whether private or public sector – to remain viable and valuable, it needs to be able to adapt appropriately to its environment. But a relentless focus on staying out of your comfort zone? There’s just too much constant anxiety going on there for it to be a healthy, long-term mode of working.

Too much change, too often, runs the risk that the organisation never actually gets sufficiently good at doing one job well enough to be efficient and cost-effective at it. Likewise, too little change, and an organisation runs the real risk of becoming disconnected from its environment and therefore irrelevant. There is a pretty sizeable middle ground that, to take a term from Ulanowicz (a researcher in the field of ecology), is called ’the window of viability’, and I advocate that an organisation should strive to understand where such a window exists within its ecosystem, and how it needs to position itself strategically within that window. I will talk more about this ‘window of viability’ in a future article.


For NextJump, the ‘feedback app’ is an important part of how they work, and this is exemplified in their company slogan ‘dedicated to changing workplace culture’. They are aware that their approach to workplace culture is not ideal for everyone, and so they recruit individuals with certain demonstrable characteristics: a willingness to be coached, and a healthy dose of humility. To understand how NextJump reached this point, we need to consider their history.

NextJump was founded in 1994 by Charlie Kim in his ‘dorm room’. He successfully grew the company to 150 people during the dot-com boom, and then saw the business scaled back to just four people in the subsequent dot-com bust. As can be imagined, a lot of soul-searching was done and lessons were learned. Charlie and his surviving team came to the conclusion that the type of people they had employed – technologists – were not suited to a business that required constant innovation and change in response to external drivers. After all, the characteristics of a good technologist – expertise, depth of knowledge – are hard-earned and, some may argue, should not be given up easily. However, that is not necessarily in line with the need to be open-minded and humble about your own capabilities compared to your competitors, or about the quality of your product compared to what your customers actually need. Charlie Kim and his team subsequently identified that the most desirable characteristic for NextJump employees is an open, growth mindset.


I believe that NextJump’s efforts to create a culture of continuous change are innovative and admirable. I do, however, have concerns over their use of data insights from the ‘feedback app’ to perform constant micro-adjustments of people’s tasks and behaviours to keep them out of their comfort zones. Now, it may be that for a company of NextJump’s size (as of November 2019, 200 people) this is not a problem, and that the people who work there find such constant change and challenge exhilarating. After all, NextJump specifically recruit people who are going to be up for such a way of working. However, where I do take issue is with the almost cult-like zeal with which NextJump wish to share their cultural practices with other companies across the world. I don’t believe it is sensible to assume that NextJump have solved the problem of cultural change. I believe they have developed an approach to organisation that appears to work well for them within their environment.

What do I mean by that? The world of technology has moved on from the dot-com bust of the early-to-mid 2000s. What once required staff with scarce technical skills – writing web pages from scratch in HTML – can now be done with content management systems, which means that, as an e-commerce company, NextJump operates in an ecosystem where it doesn’t need people with deep technical skills. Instead, NextJump’s value comes from being a platform in the ‘perks for employees’ ecosystem. Linking employees to businesses primarily requires empathy, creativity and a willingness to move quickly to constantly identify new offerings for customers. That is admirable. However, there is no one-size-fits-all when it comes to culture. Some organisations generate value from being deep technologists, inventors and innovators. Recruiting people who have invested heavily in becoming deep scientific or technical experts may not be what NextJump want, because those types of individuals tend to be more reflective, thoughtful and introverted. Neurodiversity also needs to be considered here. NextJump’s culture is deeply social, and neurodiverse people may struggle to feel included in a culture where they are expected to express their thoughts and feelings about their colleagues constantly. The leadership imperative should be one of diversity and inclusion rather than only wanting people who can efficiently fit into a mould.


Conclusion
The case for caution over the use of data insights to drive employee performance is clear with Amazon. Lawsuits, employee strikes, negative news coverage, and so on are a clear indicator that Amazon has crossed a line from efficiency to exploitation. Human rights legislation will have something to say about that.

The case with NextJump is more nuanced and, in some ways, potentially more concerning. Whilst their immersion courses tend to excite and repel participants in equal measure, I have now heard, on several occasions, people in the former camp say that ‘we need to be like NextJump’. I acknowledge the success that NextJump have had in adapting their culture to their environment; however, that is quickly followed by the caveat that the key words there are ’their culture’ and ’their environment’. By all means let’s be inspired by the success of NextJump, but let’s not fool ourselves into believing that what is good for the goose is also good for the gander. I think this advice is most strongly targeted at leaders who can be attracted by NextJump’s impressive growth trajectory. However, perhaps I worry too much; how many members of the C-suite would wholeheartedly adopt NextJump’s ‘feedback app’ within their own organisation, knowing that they themselves would be on the receiving end too?

Research into causal AI has grown a bit in the last 3 years…. if NeurIPS is anything to go by

I recently wrote a paper for a client on how AI could be engineered to maximise the cognitive performance of a human who routinely needs to make complex decisions with uncertain, incomplete and ambiguous information (no mean feat). I made the point that investing in equipping machines with a causal understanding of some world of interest would be key.

Now, whilst causal inference is not new to me – it’s my main area of academic research – I did recall that I’d been seeing more published papers on the subject in recent times. This piqued my curiosity so, after logging in to the NeurIPS 2020 site, I did a (very) quick search on all papers that had ever been presented at NeurIPS with ‘causal’ or ‘causation’ or ‘causality’ in their title. Here’s what I found:

Now, as I said, this was a quick and dirty piece of analysis (i.e. there may have been any number of papers that covered topics related to causal inference such as probabilistic graphical models, and so on, that I have not included here).
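If you want to repeat the exercise, it amounts to a keyword count over paper titles. Here is a minimal sketch in Python; it assumes you have already exported the proceedings titles into a CSV with ‘year’ and ‘title’ columns (the file name and column names are placeholders, not an official NeurIPS export):

```python
import csv
from collections import Counter

KEYWORDS = ("causal", "causation", "causality")

def count_causal_titles(path: str) -> Counter:
    """Count papers per year whose title contains one of the keywords."""
    counts = Counter()
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):  # expects 'year' and 'title' columns
            title = row["title"].lower()
            if any(keyword in title for keyword in KEYWORDS):
                counts[int(row["year"])] += 1
    return counts

if __name__ == "__main__":
    # 'neurips_titles.csv' is a placeholder for however you export the proceedings.
    for year, n in sorted(count_causal_titles("neurips_titles.csv").items()):
        print(year, n)
```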

But I think the results are somewhat illuminating and, if I were Gary Marcus, I’d be quietly encouraged that the AI community is beginning to enter the 3rd wave of AI, where robust, reliable and trustworthy AI will reign.

Accessing the hidden structure of complex systems using information theory

One of the more useful tools in the complexity scientist’s toolbox is information theory. Now, don’t worry, I’m not going to dive into this much, but I do want to talk about the central concept of information theory: Shannon entropy (or information entropy, as it is also known).

In 1948, Claude Shannon – a research engineer at Bell Laboratories – published a method for improving the transmission of information between a transmitter and a receiver. His invention – which he called ‘a mathematical theory of communication’ – was based on a very simple idea: surprising events carry more information than routine events. In other words, if you wake up in the morning and the sun has turned green, that is going to jolt you into a hyper-aware state of mind where your brain is working overtime to try and make sense of what is going on. When our interactions with friends or our environment reveal information that we were not expecting, we seek to make sense of it. We process that information with a heightened sense of awareness.

This response to surprise is no different whether we are individuals, in a team (discovering that a colleague is also a part-time taxidermist), an organisation (the sacking of a well-respected CEO) or an entire country (the death of Princess Diana). We seek to understand why and, in seeking to answer this, we traverse Judea Pearl’s ladder of causation. However, there is one key difference. When we are dealing with a complex system, or situation, there is uncertainty over cause-and-effect. This uncertainty is the result of a structural motif of complex systems – feedback loops, which I will discuss in a future post – that leads to what is called non-linear behaviour.

Information as a level of surprise is measured in binary digits (bits). The more unlikely an event is to occur, the higher the information that is generated if it should occur. Let me illustrate this with the example of flipping a coin. 
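For reference, the standard definitions (not specific to this post) of the information of an event with probability p(x), and of the Shannon entropy of a distribution over states x, are:

```latex
\[ I(x) = -\log_2 p(x) \]
\[ H(X) = -\sum_{x} p(x)\,\log_2 p(x) \]
```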

When you flip an unbiased coin there is a 50/50 chance of it landing on heads or tails. Because both outcomes are equally likely, our uncertainty about the result is at its peak: we could not be less certain about whether the coin will land heads up. Here, the Shannon entropy of flipping an unbiased coin is 1 bit, which is the maximum information that can be obtained from a system (a coin flip) that can only generate two outcomes (heads or tails).

Now, let’s assume that we’ve been given a biased coin that always lands on tails. We know that the coin is biased and so there is no surprise for us when the coin always lands on tails. If there is no surprise, then there is no information. The chances of the coin landing on tails is 100%. In this case, the Shannon entropy is 0 bits. Certainty does not yield new information.
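A quick check of those two numbers in code, using the standard entropy formula (a minimal sketch, not tied to any particular library):

```python
import math

def shannon_entropy(probabilities):
    """Shannon entropy in bits of a discrete distribution (zero-probability outcomes contribute nothing)."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

print(shannon_entropy([0.5, 0.5]))  # unbiased coin: 1.0 bit, the maximum for two outcomes
print(shannon_entropy([1.0, 0.0]))  # coin that always lands tails: 0.0 bits, no surprise
```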

Now, we don’t need to be too concerned with whether something is 1 bit, or 0.5 bit, or 0 bits or whatever. The point I am making here is that the greater the uncertainty we have about something, the greater the information we can gain from that situation. Likewise, if we have total certainty then there is no information, no knowledge, to be gained. Intuitively this makes sense – if I am observing something that is not changing then I am not learning anything new about it. However, if I perturb that system – add a component, remove a component – then I may be cajoling the system into a different state. This new state may yield new information, especially if I have managed to move the system into an improbable state. (Incidentally, this is why the modes of creativity – breaking, bending, blending – are fundamental to discovering new knowledge).

For Shannon entropy to be used in more practical ways, a probabilistic model of a system needs to be constructed. This simply means that we have identified the different states that a system can occupy, and we have estimated the likelihood of the system being in each state at a given moment in time. We can construct such a model by observing and recording the frequency with which the different states occur: the more frequently we observe the system in a given state, the more likely we may infer it is to be found in that state at a future point. Ordinarily we need to capture enough of the system’s history to have sufficient confidence in the probabilistic model we are building. This learning takes time and requires continual sampling of the environment, and there are some challenges to solve – like how to represent the environment – but the idea is to invest time in building a probability distribution, a probabilistic model, of our environment. Novelty is a previously unseen state, and so that too should trigger a response, not least an update of our probabilistic model.
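As a rough sketch of what ‘building a probabilistic model’ could look like in code (the state labels and the usage shown are placeholders; a real system would need a far more careful representation of the environment):

```python
from collections import Counter

class EmpiricalModel:
    """Estimate a probability distribution over system states from observed frequencies."""

    def __init__(self):
        self.counts = Counter()

    def observe(self, state: str) -> bool:
        """Record one observation; return True if this state has never been seen before (novelty)."""
        novel = state not in self.counts
        self.counts[state] += 1
        return novel

    def distribution(self) -> dict:
        """Return the current {state: probability} estimate."""
        total = sum(self.counts.values())
        return {state: n / total for state, n in self.counts.items()}

# Hypothetical usage: each sample is whatever label we give the environment's current state.
model = EmpiricalModel()
for state in ["nominal", "nominal", "degraded", "nominal"]:
    if model.observe(state):
        print("novel state seen:", state)
print(model.distribution())  # e.g. {'nominal': 0.75, 'degraded': 0.25}
```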

As we build our probabilistic model we are forming a hypothesis, an untested belief, about how the environment behaves. Every time we observe and capture the state that the system is in, we are testing that hypothesis. The Law of Large Numbers is relevant here. We expect to see a system move in and out of different states. It may spend more time in one state than we have observed before, or the opposite. We would need to see a persistent, recurring change in the frequency with which each state of the system is observed before we begin to suspect that our hypothesis of the system may need to be re-visited.

Now that we have constructed a probabilistic model of our environment (or, indeed, any system of interest), we can calculate its Shannon entropy. If we have a good degree of confidence that our probabilistic model is sufficiently correct, then we can baseline this measure. We can then set a sampling rate for how often we re-calculate the Shannon entropy of the probabilistic model (we may use machine learning techniques to optimise the sampling rate). If the Shannon entropy measurement begins to diverge from the baseline value – by some pre-determined tolerance of +/- x bits – then we could infer that the system may be changing in some way. This out-of-tolerance measurement could flag the need for further investigation, either by an intelligent agent or a human.
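A minimal sketch of that monitoring loop, building on an empirical distribution like the one above; the baseline, the tolerance and the example distributions are all assumptions to be tuned for a real system:

```python
import math

def entropy_bits(distribution: dict) -> float:
    """Shannon entropy in bits of a {state: probability} distribution."""
    return -sum(p * math.log2(p) for p in distribution.values() if p > 0)

class EntropyMonitor:
    """Flag when the entropy of an observed distribution drifts from a baseline by more than a tolerance."""

    def __init__(self, baseline_bits: float, tolerance_bits: float = 0.1):
        self.baseline = baseline_bits
        self.tolerance = tolerance_bits

    def check(self, distribution: dict) -> bool:
        """Return True if the current entropy is out of tolerance (i.e. worth investigating)."""
        drift = abs(entropy_bits(distribution) - self.baseline)
        return drift > self.tolerance

# Hypothetical usage: baseline taken once we trust the model, then re-checked on a schedule.
baseline = entropy_bits({"nominal": 0.75, "degraded": 0.25})
monitor = EntropyMonitor(baseline, tolerance_bits=0.1)
if monitor.check({"nominal": 0.5, "degraded": 0.3, "failed": 0.2}):
    print("entropy out of tolerance: the environment may be changing")
```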

What I am describing here is an idea. I am not aware of any existing technique, or concept, that achieves this. Neither do I know if there is much utility in what I have described. I believe it is technically feasible – updating a probabilistic model and calculating its Shannon entropy can be done in polynomial time (i.e. very efficiently). As such, you should interpret this for what it is: an idea that I hope interests people enough to pursue it further.

I believe this technique – parsing the environment and comparing it against a probabilistic model – could be a very efficient way to automate, at scale, the monitoring of an environment for changes that may warrant further investigation. Of course, this ‘further investigation’ would call into play more expensive resources such as AI and/or humans.

My motivation for conceiving of this idea comes back to the need for any organisation to become highly proficient at anticipating change. When the organisation’s environment (internal or external) may be changing in unexpected ways, we want to be observing the change as it happens, in real time, rather than analysing it after the event. Why is this important? If we are observing the genesis of an enduring change in our operating environment, then we have the opportunity to gain insights into the causes that led to that change.

Applying Shannon entropy as an early-warning system raises an alarm that our knowledge of our environment may no longer be accurate. We can respond to these warning signals by expending effort to understand the changes that may be occurring. From this we may create new knowledge and, therefore, update our semantic graph to represent that new understanding. The semantic graph is critical, because all of our collective intelligence draws on it to make good decisions. If that semantic graph is erroneous or significantly out of date, the quality of our decisions is impacted. As an organisation harnesses AI to the fullest – where we are talking about millions, if not billions, of decisions being taken every second – the accuracy of the semantic graph becomes a critical asset that must be protected.

Anticipation gives us time to prepare; yet to accurately anticipate our environment we need to be sufficiently open to detecting changes that suggest that our understanding of the environment may no longer be up-to-date.

I’d like to finish this discussion by making one final point. The use of information theory to measure the behaviour of a dynamic system is not a new concept. Indeed, information theory is one of the most promising tools in the complexity scientist’s toolbox for unravelling the mysteries of a complex system. One of the biggest challenges for the complexity scientist is having access to information about the system of interest. Most of the time we simply cannot access a complex system with the tools we have. To give just a few examples: the brain, the weather, genetics. It is neither practical nor feasible for a complexity scientist to have access to every element or aspect of systems of this kind. Yet we are not without hope. As long as we can capture the signals, the data, the transmissions, from these systems, then we can begin to understand the system, even though it is hidden from us. Of course, as we gain more knowledge of these systems, we can then devise precise interventions that may yield crucial insights that either confirm our hypotheses or take us completely by surprise.

Up until recently, I had been researching the use of information theory to infer the causal architecture of a system. Techniques such as Feldman & Crutchfield’s causal state reconstruction, Schreiber’s transfer entropy, and Tononi’s Integrated Information Theory were all part of my toolkit. They are all valuable, as they can tell us something interesting about a complex system. However, they do not have the explanatory power of causal frameworks such as Judea Pearl’s do-calculus. I pass on this observation here for those readers who may be more familiar with these subjects.

Some thoughts on ‘AI superpowers’ by Kai-Fu Lee

I’ve been reading AI Superpowers by Kai-Fu Lee and some of the points he raises are worth debating.

Firstly, let me say that this is a good book and I am enjoying it. Kai-Fu Lee is a uniquely credible commentator on how a China vs. United States competition to dominate in the AI space might play out. Unique because he has spent most of his career in the States but has, for the last decade or so, been firmly embedded in China.

He raises two points:

(1) We are now in the Age of Implementation, and the Age of Discovery (which we have been in for the last 400 years) is now less relevant to economic success. His argument is that it is now the speed and scale at which you can implement AI that is critical to competitive success. He claims that we are now in this age courtesy of the success of machine learning techniques such as deep learning. What is now less relevant for competitive success, he claims, are those attributes that were born in the Age of Discovery and which the West has traditionally dominated: the discovery of new knowledge and the commercialisation of inventions. China appears to be better placed to succeed in the Age of Implementation vis-à-vis the rest of the world (including the United States).

Here we get to one of the premises of his book – there are only two AI superpowers: China and the United States. They are the only AI superpowers by virtue of their inherent characteristics, such as geography (the sheer size of their landmass supports large populations, which generate huge markets and, of course, huge amounts of data), entrepreneurial culture, and access to the technology necessary to engineer and deploy machine learning and AI products and services at massive scale.

(I have not yet explored his claim here and, for now, I’m assuming that this isn’t a crazy assumption. What is interesting is that the AI Readiness Index 2020 report places the United States in 1st place and China in 18th place. However, as the authors themselves admit, their research evaluates readiness for AI, not the scale of implementation, on which, arguably, China would score significantly higher.)

(2) We are now in the Age of Data and we are leaving behind the Age of Expertise. Lee believes that, as long as you have access to data, we now have the capabilities (such as machine learning) to quickly get close to, if not surpass, the performance of human experts. He cites the well-known example of a deep learning algorithm classifying breast cancer scans. What he doesn’t mention, however, is that – at the moment – the best-performing algorithm is 92% accurate, a human expert is about 94% accurate, and a deep learning algorithm combined with a human expert is 99% accurate. For these kinds of applications near-perfect performance is essential. So the equation ‘more data + machine learning + compute’ does not always result in better performance than a human. I am a strong supporter of the idea that we should engineer AI to complement human capabilities, not supplant them, which is why I love examples like this.

I am also reminded of ‘centaur’ (human + AI) teams in international chess competitions. As Garry Kasparov describes in his book ‘Deep Thinking’, a team of two amateur human players with two laptops saw off the competition, which included a supercomputer + grandmaster team.

For the foreseeable future, and even when we develop responsible AI, we will still want humans in the loop, especially as the criticality of a decision increases. (A talk by IBM at NeurIPS 2020 last month, December 2020, made a similar point.)

I think we can see where Lee is going here. He has deftly set out two arguments for why the traditional strengths of the Western world – discovery, expertise – are now less relevant. But I think this is somewhat disingenuous and, what is more, he doesn’t consider how the West might respond.

Let’s use one example to illustrate this. One of Lee’s claims as to why China will out-compete the rest of the world in an Age of Implementation is that Chinese culture (a) relentlessly embraces the ideas of others, copying and improving on them, thus focusing on the application of knowledge rather than the discovery of new knowledge, and (b) sustains a ferociously competitive domestic market in which very few competitors are left standing, and those that are have discovered a defensible market position (a very Darwinian view of the Chinese economy). Think of it as an accelerated evolution of a dominant design. And the driving force behind both of these? A culture that is relentlessly focused on being busy and industrious.

Now, I must admit, I don’t know a great deal about Chinese culture, but I suspect that this is a fairly accurate description. Whilst the Chinese have shaped themselves into a phenomenal and extremely hard-working society, the Western world is moving away from such cultural norms. Nowadays, hard work isn’t seen as the path to success. Personal satisfaction and a more balanced lifestyle are the new hallmarks of Western society. We need look no further than the experiments in Scandinavian countries with 6-hour days and the 4-day working week (Scandinavian countries are often a good barometer of what is coming down the line for other Western countries).

As I say, it is hard to dispute that – on effort alone – China is already winning. However, Kai-Fu Lee is writing a book about AI, and yet he doesn’t appear to have considered how a more automated world will mean that human effort is not only less necessary, but actually will not be able to keep up. What use is it to have a nation of hard-working, ‘gladiatorial entrepreneurs’ if AI can out-code, out-innovate, and out-experiment them? Those countries that can automate the entrepreneurial process – from the inception of an idea, to trialling a multitude of different products and services in parallel, to rapidly refining and re-deploying new variants on a steadily improving trajectory – will be the ones that win. We don’t have such technology at the moment, and deep learning alone won’t cut it, so we will be reliant on the discovery of new AI capabilities, which will require elite expertise to develop. Discovery and expertise: the same two competencies that Lee claims are now less important!

(Microsoft’s ‘low-code/no-code’ Power Platform is a good example of where this may be happening already.)

I’ve not yet finished Lee’s book and it’s certainly thought-provoking. We are just at the beginning of an AI revolution that has been ignited by deep learning. But our future AI capabilities are likely to take us well beyond a reliance on a technique that was first developed in the 1980s. Don’t get me wrong: this is an important book, and I’ve encouraged my mentees, colleagues and clients to read it.

And, on a final note, I’ve just started to read Ian Hogarth’s excellent blog post on ‘AI Nationalism’ which dovetails quite nicely with Kai-Fu Lee’s thinking. I’ll have more to say on that too.

In an AI world winning in business and politics will go to those that have a ‘cognitive advantage’

When I was at the Complex Systems Conference in Singapore in September 2019, I found myself musing on the question: in a world where we have all maxed out our use of AI, how will that change the way a business outcompetes its rivals? In a world where automated decision-making takes over more and more of the running of businesses and entire countries, how do you compete to win? The conclusion I came to was that it is about out-thinking and out-manoeuvring your rivals and adversaries to the extent that you shape the environment in which you are competing in such a way that your adversaries (human and AI) can no longer accurately comprehend it and thus begin to make increasingly bad decisions.

Now, this has happened throughout history and, to quote Sun Tzu, “… the whole secret lies in confusing the enemy, so that he cannot fathom our real intent”. However, the key difference this time is that AI will play a significant role in shaping that competitive environment, at a speed and with a capacity for handling big data that leaves humans simply behind. We may enter a cognitive war of AI versus AI.

I am hypothesising that AI will come to dominate the global action that shapes our offline and online worlds. So, if you want to compete, you will need to shape the digital environment that AI is attempting to predict, understand and act in. In other words, the competitive moves we make in the future will (a) be made automatically by AI on our behalf, and (b) need to be designed with an eye to how other AI will perceive them (recognising that, for the moment at least, most AI is dependent on big data). If the long-established practice of marketing to convince people to buy your product is extended to marketing to artificial intellects too – persuading an AI to behave in a way that you want it to – then you start to get the point.

I call this having a cognitive advantage which I define as:

the demonstrable superiority gained through comprehending and acting to shape a competitive environment at a pace and with an ingenuity that an adversary’s ability to comprehend and act is compromised

I wrote a paper about this last year:

I will also be giving a talk on Cognitive Advantage at this year’s Future of Information & Communication conference in Vancouver (FICC 2021). A version of the conference paper will also be published in Springer’s ‘Advances in Intelligent Systems’.