Executive Summary
I have been working in various fields of AI since the early ‘90s (and yes, AI is older than that). I started off in Machine Translation, went on to Automatic Text Summarisation using Artificial Neural Networks (the precursor to today’s Deep Learning and LLMs), and then for 25 years focused on designing Conversational Human-Machine Interfaces, before having to practically reinvent the existing wheel a few years ago, when Generative AI and LLM chatbots came along.
Artificial Intelligence has transitioned, as will be shown below, from the “Science of Explanations” back in the early ‘90s to a black box that we all hope will work and, when it does, we are not sure exactly why it did. The current version of “AI”, nowadays almost synonymous with Generative AI, has mastered English and other human languages, which has rendered it most persuasive, even when it is wrong or falls into the wrong hands. And it is often wrong, unsafe or unfair, given that it was trained on our data, which unfortunately reflects our human weaknesses, biases and blind spots too.
At the same time, organisations and Governments are thinking hard about selecting AI use cases that bring value and positive impact, while remaining accountable to their users, maintaining their clients’ trust and keeping their own reputation intact. Local, national and international laws and regulations have started to impose the transparency requirement; AI applications need to operate in a predictable and controllable manner and their creators need to be able to explain why a decision was made or not made, or why something went wrong. We simply cannot have Responsible and Ethical AI, if we don’t make the AI black box more transparent. Transparency forces accountability.
And yet, it is very difficult to understand what is going on inside an Artificial Neural Network (ANN). Rather than trying to decipher how an ANN works or trying to finetune an LLM altogether, I put the argument forward that instead, we should “parent the AI”, educate and guide it to work on what we want and, more crucially, the way we want it to work. This approach represents the golden mean, a symbolic – connectionist hybrid solution that imposes human top-down rules but allows the bottom-up flexibility and efficient operation of Deep Learning.
I go further by showing exactly how this should be done through Language Operations or LangOps, namely integrating Knowledge Engineering with Prompt Engineering. This means applying AI to carefully selected use cases in very specific domains with well-defined rules and clear terminology and vocabulary. By controlling the definition of objects and their properties, actors and processes, relationships and interdependencies among those, rules and decision points for our specific world, we can both prescribe and predict, and thus also explain, how the AI uses the corresponding terms, how it reasons about them and how it behaves in that world and why. This is done through detailed ontological modelling of that domain curated by human experts and Subject Matter Experts (SMEs). This domain model then translates into specific concepts and vocabulary (terminology) that will be embedded in the prompts that instruct the AI. This type of data and knowledge curation satisfies the human-in-the-loop requirement.
LangOps is the difference between a generic out-of-the-box LLM and a customised domain- and use case-specific AI tool without needing to build a domain-specific LLM from scratch. It creates a type of evolved, highly efficient Expert System or well-behaved, mature ANN by turning the AI black box into a more transparent box.
To Predict the Future, You need to Understand the Past
AI has been around for decades, since the 1940s, when Alan Turing was breaking the German Enigma cipher, even if the discipline was only baptised later (1956). Back in the early ‘90s, when I was doing my Masters and then my PhD, AI was almost synonymous with Expert Systems, i.e. systems that had been designed, implemented, tested and tuned by humans, who would model their field of expertise in terms of explicit rules, conditions and decision trees. The goal was to create exactly what the name indicates, “Artificial” intelligence that mimics human intelligence as closely as possible. Mimicking was not sufficient by itself, however. AI systems were also expected to be able to explain their reasoning and decisions. Human observers or users were supposed to be able to understand exactly how the system worked and why it produced a specific output. Anything else was considered a whimsical ad-hoc application and, as such, lacking credibility and reliability.
Enter Artificial Neural Networks (ANNs), also in the ‘40s, a parallel stream of research trying to create automata which mimic the complex structure of the human brain in order to simulate efficient, concurrent, emergent and usually obscure computation. When I encountered Neural Networks back in the early ‘90s, I was fascinated and really excited about the possibility of simulating intelligent complex behaviour, even if we couldn’t exactly understand, let alone control, what the interim processing was. ANNs were used then for simple low-level language tasks, such as phoneme and grapheme sequence prediction, i.e. which sound or letter will most probably follow which other sound or letter in a specific language (English, German etc.). I immediately recognised that ANNs would be ideal for higher-level language processing, such as semantic and pragmatic processing, language meaning and contextual interpretation, all of which were necessary for my Text Summarisation task. Of course, that is exactly what they ended up doing: Generative AI now produces perfectly formed and meaningful text and dialogue, as well as speech, images and videos that fit specific requirements and contexts. Back in the early ‘90s, however, ANNs created a schism in the field of AI.
The AI Schism: Symbolic vs Connectionist AI
The AI of Expert Systems represented the Symbolic school of thought (and practice); Symbolism focused on manipulating symbols and concepts that were clearly defined and transparently processed through rules and decision trees. A medical expert system used explicit rules about disease symptoms, causes and ways to treat them, rules that had been crafted by medical experts in collaboration with the developers of the system. Thus, in an operational environment the expert system would give a clear and unambiguous explanation of how it diagnosed a condition, as well as how it should best be treated. This rendered symbolic systems transparent “white boxes”.
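The separation of explicit knowledge from a generic reasoning process, which made such explanations possible, can be sketched in a few lines of Python. The medical facts and rules below are invented for illustration; only the structure of the sketch matters:

```python
# Symbolic "white box" sketch: knowledge is declarative data,
# kept separate from the generic inference process that applies it.
# All medical facts and rules here are invented for illustration.
facts = {"fever", "cough"}

# Each rule pairs a set of conditions with a conclusion -- no control flow.
rules = [
    ({"fever", "cough"}, "flu_suspected"),
    ({"flu_suspected"}, "recommend_rest"),
]

def forward_chain(known, rules):
    """Generic forward-chaining loop, independent of the domain knowledge."""
    derived = set(known)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= derived and conclusion not in derived:
                derived.add(conclusion)  # every step is traceable to a rule
                changed = True
    return derived

result = forward_chain(facts, rules)
print(sorted(result - facts))
```

Because every derived conclusion points back to an explicit rule, such a system can explain exactly why it recommended rest: the flu was suspected, and the flu was suspected because of the fever and the cough.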
The other school of AI thought and practice was Connectionism, which brought in practically a revolution. Connectionist ANNs were, however, looked down upon by AI traditionalists – and certainly by the developers of expert systems – because they were considered “black boxes”. Even though you could clearly define their inputs (e.g. disease symptoms) and the desired output (different diseases that the symptoms could be attributed to), there was no way to predefine, let alone observe, what was happening in the appropriately named “hidden layers” in between. Computation was simply emerging from the connections between the different units of the network, which adjusted their individual weights in the process in order to “learn”. The only things one could somehow control were how many layers an ANN has (you need at least 3), how many units each layer has (the more the better), whether computation should take place forwards, backwards or in a cycle (recurrent ANNs winning the race), and which connections should have a bigger weight (something you could only find out after endless numbers of experiments). The ANN would, in time, discover the optimal, most efficient way to learn how to associate a specific (training) input with a specific (training) output in order to decide on the most appropriate output for previously unseen (test) data. This introduced incredible speed in AI processing, but also a near complete lack of transparency, let alone explainability.
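To make the “hidden layer” opacity concrete, here is a minimal sketch of a 3-layer feed-forward pass in Python. The layer sizes, random weights and input values are arbitrary assumptions for illustration; the point is that the intermediate `hidden` vector is fully computable yet carries no human-readable meaning:

```python
import math
import random

random.seed(42)  # fixed seed so the sketch is reproducible

def layer(n_in, n_out):
    # Random connection weights: the part we cannot easily interpret.
    return [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]

def forward(x, weights):
    # Each unit computes a sigmoid of the weighted sum of its inputs.
    return [1 / (1 + math.exp(-sum(w * xi for w, xi in zip(ws, x))))
            for ws in weights]

hidden_w = layer(4, 3)  # input layer (4 units) -> hidden layer (3 units)
output_w = layer(3, 1)  # hidden layer (3 units) -> output layer (1 unit)

x = [0.2, 0.7, 0.1, 0.9]       # e.g. encoded disease symptoms
hidden = forward(x, hidden_w)  # opaque intermediate representation
y = forward(hidden, output_w)  # e.g. likelihood of one diagnosis
print(y[0])
```

The numbers in `hidden` are perfectly well-defined, but nothing in them corresponds to a symptom, a rule or a concept a human could point to.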
What is and What is Not AI (or is only partly)
This lack of explainability of the ANN black boxes was unacceptable to most in the AI community of the early ‘90s. There is a famous article from 1989 by Yves Kodratoff, “Enlarging Symbols to More Than Numbers or Artificial Intelligence is The Science of Explanations”, which captures the spirit of the time, but also gives us a path forward for the AI of 2025, 36 years later. Kodratoff argued against Connectionism being AI, because connectionist systems exclusively deal with numbers and not symbols and, hence, are obscure and unexplainable and – as such – “unusable”. He went on to declare that “AI is the science of explanations”!
According to Kodratoff, real AI has the following characteristics:
1) “AI refuses to work with intelligent black boxes.” He gives the example of “intelligent chess playing machines” of the time, which play so well for reasons only available to their designers, not to their users, and that is not good enough.
2) “AI tends to avoid using implicit knowledge.” According to him, “In an AI system, knowledge should be stated explicitly, preferably in a declarative way”. He advocates for the Expert Systems of the time and declarative programming languages (back then, LISP and PROLOG), which separate “knowledge” from “process”, i.e. from the order in which this knowledge is applied.
3) “An AI system is supposed to make heavy use of background knowledge, provided by the user in a knowledge base” – a knowledge base that makes sense to its users.
4) To render an AI system intelligible to its users, it “… has to speak its users’ language”, rather than a non-existent ‘language of AI’. This is in contrast to other fields, such as Chemistry, which impose their own jargon.
Kodratoff then lists the traits of systems that are “only partly AI”:
a) AI cannot be reduced to Expert Systems, even though those were the most successful in the early ‘90s and – in a way – still are.
b) AI cannot be reduced to a programming language, such as LISP or PROLOG, or their descendants, such as JAVA and PYTHON. These languages are only a means to an end, i.e. addressing AI needs.
c) AI cannot be reduced to Pattern Recognition or one of its real-life applications, such as Robotics, Machine Translation or Games. “AI was born from the need … for declarative components, and for large knowledge bases, understandable to the user”.
d) “It is ridiculous to believe that any AI model has to be a faithful representation of what is happening in our brains“. This means that AI cannot be restricted or hindered by the artificial need to discover models that accurately reflect human cognition. AI does not need to be anthropomorphic.
Out of the 4 traits characterising “proper AI” listed by Kodratoff, AI nowadays (35+ years on) only fulfils the 4th eligibility criterion; it now speaks the users’ language, wonderfully well – we might add. It can even correct said users both in terms of spelling and grammar but also, incredibly, even on an intellectual argument level, providing alternative takes and giving voice to minority opinions too. Arguably, AI in 2025 also partly covers the 3rd point about making “heavy use of background knowledge”; it was built after all on the basis of millions of human texts, conversations and tagged photos. Thus, it has in a way captured some type of background knowledge. You never know, however, what this specific knowledge is or where it was taken from. It has just been learned in a “wholesale manner” using simple pattern recognition and statistics. As a result, this 3rd point is tempered by AI nowadays lacking severely on points 1 and 2 (transparency and explicitness, in short explainability).
AI and Anthropomorphism
Kodratoff’s last point about AI not needing to faithfully represent what is happening in our brains is also worth mentioning. Back in 1989 he didn’t see the argument about mimicking the structure of the brain as proving the superiority of ANNs over the handwritten rules of Symbolism. Why should AI exactly mirror the brain? As long as it can provide a reasonable result or behave in an acceptable, expected manner, that is sufficient. Nevertheless, nowadays, we are in the impossible and very unexpected position of having AI systems that are both black boxes (in true ANN fashion) and – at the same time – simulacra of human reasoning, creativity and even “conscience” (or so they would have you believe). We are finding ourselves in a new reality where AI chatbots assume or are automatically assigned anthropomorphic qualities, just because they use natural language and arguments that make perfect sense. People are more prone to believe something as true, if it is phrased using correct grammar and correct or – even better – sophisticated vocabulary. Persuasive and relatable language is, after all, how Marketing and Politics work! We have to be very clear, however: LLMs have mastered human language use like educated adults, but are still babbling babies when it comes to the semantics and pragmatics of language, i.e. language meaning and language context. They do not understand what you ask; they just look for patterns, finding a similar question and the answer that came with it (from another human). There is no originality, no creativity, no empathy, no intelligence and certainly no “conscience”. Everything else is just statistics, imitation and coincidence.
The Middle Way: A Symbolic-Connectionist Hybrid
Symbolism, then, involves humans controlling AI using explicit, painstakingly handwritten and expert-curated rules, which also render AI straightforward to interpret. Connectionism involves fast emergent computation that helps discover new patterns and carries out tasks more efficiently, but takes away control in the process. Like a few others back in the early ‘90s, I myself argued for a combination of both approaches, a hybrid that combines their individual strengths, while overcoming their weaknesses at the same time.
I put the idea into practice when I was doing my PhD on automatic text summarisation. I realised that the only promising way to capture the complexity, ambiguity and “fuzziness” of natural language, meaning and “common sense” is through a hybrid approach: combining human hand-crafted “rules” (i.e. symbolic processing) with the automatic weight distribution and semi-supervised learning of an artificial neural network (connectionist processing). Thus, I used text annotations generated by Linguists, which encoded the morphosyntactic / grammatical, lexical-semantic and discourse pragmatic features of each sentence in a news article. I would then feed these text annotations into a basic feed-forward backpropagation ANN that would calculate the degree of “importance” of each sentence in the whole article and generate a YES or NO answer to the question whether that specific sentence would be included in the final summary of that news article (not necessarily verbatim). It was a neat idea, very imperfectly executed, as the data set was not that large by today’s standards (just 1,100 sentences representing only 55 news articles) and the ANN barely had 3 layers, with a single hidden layer that only had 30 units (that is, very skin-deep learning!). Still, this hybrid approach, which I have since continued to both preach and practise, is certainly gaining ground again now that we are all fighting to render LLMs reliably usable and their output more controllable and accountable to both its users and Governments. It’s back to the future for AI!
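The core of that hybrid pipeline can be caricatured in a few lines of Python: human-annotated linguistic features per sentence (the symbolic part) feed a single trainable unit (standing in for the small ANN) that answers the YES/NO inclusion question. The feature names and toy training data are invented for illustration:

```python
import random

random.seed(0)

# Hand-crafted (symbolic) sentence features, invented for this sketch:
# [is_first_sentence, mentions_main_actor, is_direct_quote] -> include in summary?
training = [
    ([1, 1, 0], 1), ([1, 0, 0], 1), ([1, 1, 1], 1),
    ([0, 0, 1], 0), ([0, 1, 1], 0), ([0, 0, 0], 0),
]

# A single perceptron unit stands in for the small feed-forward ANN.
w = [random.uniform(-0.5, 0.5) for _ in range(3)]
b = 0.0
lr = 0.5

for _ in range(50):  # simple error-driven weight updates ("learning")
    for x, target in training:
        pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
        err = target - pred
        w = [wi + lr * err * xi for wi, xi in zip(w, x)]
        b += lr * err

def include(features):
    # The YES/NO answer: should this sentence go into the summary?
    return "YES" if sum(wi * xi for wi, xi in zip(w, features)) + b > 0 else "NO"

print(include([1, 1, 0]))  # a lead sentence mentioning the main actor
```

The symbolic half (the annotated features) stays fully inspectable, while the connectionist half (the learned weights) does the fast, automatic part of the decision.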
Back to the Future of AI
So how does this all translate to today? How can we leverage and learn from the evolution of AI up to now?
Artificial Intelligence has gone from the “Science of Explanations” back in the early ‘90s to a “black box” that we all hope will work and, when it does, we are not sure exactly why it did (in fact, we couldn’t pinpoint it, even if our lives depended on it!). The current version of “AI”, nowadays synonymous with Generative AI, with its impeccable English (German, French) can even offer elegantly formulated “explanations” of its reasoning and answers in our chats, even for the “hallucinated” ones; answers based on often flimsy evidence found in huge, obscure, disconnected and disjointed data sets, and completely unpredictable pattern matching. Masterly language manipulation renders LLMs most persuasive, even when they are wrong, unfair or indeed in the hands of malicious actors. And wrong and unfair they often are, as AI was trained on data produced by all of us on the internet, and as such reflects our human weaknesses, biases and blind spots too.
At the same time everybody talks now about the need for AI Governance and Compliance, Cybersecurity, IP and data protection, and Ethical and Responsible AI. We cannot possibly have any of these without going back to Kodratoff’s requirement for explainability from last century, a position I have held for a long time myself and have been preaching more vehemently in the past 3 years, since Generative AI became a mainstream and ubiquitous technology. With the various tools now available it is difficult to know where your data will end up or how it will be used by the AI tool developers, but also by other users. There are, for example, adversarial prompting and prompt attacks designed to steal a system’s core instructions (and data).
The Need for Explainability
Organisations and Governments are thinking hard about selecting AI use cases that bring value, efficiencies and positive impact, while remaining accountable to their users, maintaining their clients’ trust and keeping their own reputation intact. An ever increasing number of local, national and international laws and regulations impose the transparency requirement; AI applications need to operate in a predictable and controllable manner and their creators or operators need to be able to explain why a decision was made or not made, or why something went wrong. An example is denying someone insurance cover because of their skin colour or rejecting a job candidate because of their accent. We simply cannot have Responsible and Ethical AI, if we don’t make the AI black box more transparent. Transparency forces accountability.
To (re)introduce Explainability, we need to “parent” AI
It is very difficult to understand what is going on inside an ANN. This has always been the case, from the time I was experimenting with my tiny 3-layer ANN, but even more so now. Nowadays there are hundreds of hidden layers (compared to the one I had in the early ‘90s) with hundreds of thousands of units and millions of connections among them. The combinatorics increase exponentially with the size of the ANN, which renders the whole process even more obscure and unfathomable. Recent attempts at bringing some type of so-called “mechanistic interpretability” are really exciting, but still very basic and potentially also ultimately misleading. Humans can always find “meaning” everywhere, if they look for it. They do it with the formation of stars and the shapes of clouds in the sky, ghost apparitions in haunted houses, hopeful astrology predictions, and the way their love interest looked at them in passing. Equally, current experimentation with rendering the workings of ANNs interpretable is bound by the way the human brain identifies patterns that make sense to us.
So what should we do? Instead of trying to decipher how an ANN works or trying to finetune an LLM altogether, we should “parent the AI”, educate and guide it to work on what we want and, more crucially, the way we want it to work. This translates to applying AI to carefully selected use cases in very specific domains with well-defined rules and clear terminology and vocabulary. By selecting and defining the specific domain world in which the given AI application is expected and allowed to reason and act, we get rid of the need to specify and model the whole of human experience, a massive and ultimately impossible and rather pointless task. Researchers tried that too at the end of last century: it is a painstaking task involving thousands of volunteers and it never ends. By controlling the definition of objects and their properties, actors and processes, relationships and interdependencies among those, rules and decision points for our specific world, we can both prescribe and predict, and thus also explain, how the AI uses the corresponding terms, how it reasons about them and how it behaves in that world and why. This is done through detailed ontological modelling of that domain curated by human experts and Subject Matter Experts (SMEs). This domain model will translate into specific concepts and vocabulary (terminology) that will be used in the prompts that instruct the AI. This type of data and knowledge curation satisfies the human-in-the-loop requirement.
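As a purely hypothetical sketch of how a curated domain model could flow into prompts, the toy insurance ontology below (its concepts, rules and vocabulary are all invented for illustration) is compiled into a system prompt that confines the AI to the curated terminology:

```python
# A tiny, invented domain ontology: concepts with properties,
# explicit rules, and a controlled vocabulary mapping user terms
# to domain concepts. In practice this would be curated by SMEs.
ontology = {
    "concepts": {
        "Claim": ["claim_id", "policy_number", "incident_date"],
        "Policyholder": ["name", "policy_number"],
    },
    "rules": [
        "A Claim is valid only if its incident_date falls within the policy period.",
        "Never infer properties that are not listed for a concept.",
    ],
    "vocabulary": {"claim": "Claim", "customer": "Policyholder"},
}

def build_system_prompt(onto):
    """Translate the domain model into explicit prompt instructions."""
    lines = ["You operate ONLY within the following domain model."]
    for concept, props in onto["concepts"].items():
        lines.append(f"Concept {concept} has properties: {', '.join(props)}.")
    for rule in onto["rules"]:
        lines.append(f"Rule: {rule}")
    for term, concept in onto["vocabulary"].items():
        lines.append(f'The user term "{term}" always means {concept}.')
    lines.append("Refuse requests outside this model and explain why.")
    return "\n".join(lines)

prompt = build_system_prompt(ontology)
print(prompt)
```

Because every instruction in the prompt traces back to a specific entry in the ontology, we can point to the exact concept, rule or term that produced (or should have prevented) a given behaviour.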
The Solution is LangOps
By integrating Knowledge Engineering with Language and Prompt Design and Engineering, we bring together language and meaning, merging Linguistics with Cognitive Science and Machine Learning. I call this approach Language Operations, or LangOps for short; it leverages the fact that there is no language without meaning and – more importantly – no meaning unless it is expressed through language. Similarly, we cannot define a domain outside of language; we have to use words to “model” it. This is exactly where LLMs fail us. They use sophisticated words, but have nothing to “express”. LLMs don’t work with meaning; they just bring phrases together that statistically and historically belong or simply appear together.
The interdependence between language and meaning is something that all Linguists, like myself, know. It is, however, something that people forget now that everybody is doing Prompt Engineering to some degree; using Generative AI in everyday life has made us all into Prompt Engineers in some form. Nevertheless, Linguists, Computational Linguists and professionals already working with language (e.g. Conversation Designers and Copywriters) are the only ones who can write surgically precise, effective and impactful prompts. The choice of words almost always determines whether you get a merely relevant result or the best one. In an enterprise environment, where you write prompts for your customer’s AI applications, it often makes the difference between a safe and reliable tool and a cybersecurity threat. The precise selection of words and phrases in a prompt is also crucial when defining a domain-specific task, as well as controlling what its output should be. Thus, precision affects accuracy; effectiveness affects relevance; usability and user experience affect impact.
The LangOps approach of integrating domain-specific knowledge in prompt templates is the difference between a generic out-of-the-box LLM and a customised domain- and use case-specific AI tool without needing to build a domain-specific LLM from scratch. It creates a type of evolved, highly efficient Expert System or well-behaved, mature ANN by turning the AI black box into a more transparent box.