The current approach to improving AI systems is captured by a simple phrase: "Bigger is better".
The philosophy behind the approach is simple: take an AI architecture that works, and scale it up to improve performance. The approach has some merit: compared to GPT-2 with 1.5 billion parameters or GPT-3 with 175 billion, GPT-4’s remarkable improvement in performance is ascribable to some degree to a massive increase in parameter count and training data. It is rumoured to have over a trillion parameters, and at least four times the training data of GPT-3.
On the other hand, scaling costs more money and faces diminishing returns. In addition, some problems with large language models persist no matter the scale. Indeed, for certain problems, such as opacity, increased complexity can make these systems harder rather than easier to understand.
Against this background, an approach has emerged that focuses on combining several AI systems to improve performance. The philosophy here is different: each type of AI system has different strengths and weaknesses depending on its architecture and, possibly, its training regime. Why not combine these models in a hybrid fashion to harness the strengths and patch the weaknesses of each?
Hybrid AI has typically focused on combining symbolic and non-symbolic systems. Nonetheless, as the broad applications of LLMs have become apparent, attention has turned to more complex combinations of AI systems. This has led to speculation about combining edge and cloud AI, and about linking smaller domain-specific models.
Symbolic & Non-Symbolic Hybridisation
Non-symbolic AI describes the kind of neural networks and algorithms we typically think of when we imagine AI. These systems don't process information in a particularly intelligible way; instead, they represent it via numerical values and patterns within mathematical structures, then process it according to self-learnt rules to generate predictions and insights.
Symbolic or "good old-fashioned" AI, on the other hand, uses explicit, human-readable symbols to represent knowledge. It then manipulates these symbols on the basis of explicit, human-coded rules and algorithms.
The distinction is most easily understood by imagining an AI system aimed at diagnosing a heart condition in patients. The symbolic approach would be an expert system which encodes the triaging process in an explicit, rule-based fashion. The non-symbolic approach might be a neural network which learns from labelled patient data how to predict whether a patient has the condition.
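To make the contrast concrete, here is a minimal sketch in Python. The rules, thresholds, features, and data are entirely illustrative, not real clinical logic.

```python
# Symbolic approach: the diagnostic logic is written by humans,
# so every decision can be traced to an explicit rule.
# (Thresholds below are hypothetical, purely for illustration.)
def symbolic_triage(systolic_bp: int, cholesterol: int, chest_pain: bool) -> str:
    if chest_pain and systolic_bp > 180:
        return "urgent referral"
    if cholesterol > 240 or systolic_bp > 140:
        return "further tests"
    return "low risk"

# Non-symbolic approach: a network learns its own internal rules
# from labelled data; we never code the logic ourselves.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))               # toy patient measurements
y = (X[:, 0] + X[:, 1] > 0).astype(int)     # toy labels standing in for diagnoses

model = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500).fit(X, y)
model.predict(X[:1])  # a prediction, but no human-readable rationale
```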
Non-symbolic AI is powerful, able to key into patterns humans may be unaware of, or incapable of discerning due to their complexity. On the other hand, the scale these models require to function properly means they run into the problems of efficiency and opacity noted above.
Large language models are illustrative of this: they can produce adaptable, natural-sounding communications compared to explicitly encoded symbolic AIs. At the same time, however, they suffer from hallucinations, making incorrect claims with the same confidence as correct ones.
The increasingly obvious solution to the inaccuracy of LLMs and the rigidity of symbolic AI is to combine them, harnessing the flexibility of LLMs alongside the accuracy of symbolic systems. The advantages of such a hybrid system apply to many domains.
Imagine, for example, an AI search engine which uses existing search algorithms to retrieve relevant, accurate information, then uses an LLM to distil that information into a response. This would be more reliable than current approaches, which trust the LLM to generate that information from its own internal model.
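A minimal sketch of that retrieve-then-summarise pipeline is below; `search` and `llm_complete` are hypothetical stand-ins for a real search index and LLM client, not actual APIs.

```python
def search(query: str) -> list[str]:
    """Stand-in for a conventional search index lookup (hypothetical)."""
    return ["<retrieved snippet 1>", "<retrieved snippet 2>"]

def llm_complete(prompt: str) -> str:
    """Stand-in for a call to an LLM API (hypothetical)."""
    return "<model response>"

def answer(query: str) -> str:
    # Step 1: a classical search algorithm retrieves accurate sources.
    snippets = search(query)
    # Step 2: the LLM only distils the retrieved text, rather than
    # generating facts from its own parameters.
    context = "\n".join(snippets)
    prompt = (f"Using only the sources below, answer the question.\n"
              f"Sources:\n{context}\n\nQuestion: {query}")
    return llm_complete(prompt)
```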
In addition to accuracy, another barrier to the use of many non-symbolic AI models is interpretability. There are some domains in which humans also process information implicitly, for example speech or object recognition. In these fields we aren't concerned with understanding exactly how a model correctly identifies an object, because we find the experience of "just seeing it" relatable.
On the other hand, we are less amenable to "black box" approaches in domains that typically involve explicit reasoning in humans. When AIs guide insurance or loan assessments, company strategy, or other complex decisions, consumers often feel unsettled if they are unaware of *how* the AI arrives at its decision.
To return to our original example, patients may feel unsettled if an LLM provides a diagnosis or risk assessment without any apparent reasoning. Hybrid models would allay this concern by combining natural language processing with an expert triage system: an LLM could funnel a patient’s medical history into an expert symbolic triage system, then distil the outcome of that explicit assessment into a natural language response.
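That pipeline might look something like the sketch below, assuming a hypothetical `llm` client and illustrative triage rules; none of this is a real clinical system.

```python
import json

def llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM API call."""
    return '{"age": 67, "chest_pain": true, "systolic_bp": 185}'

def assess(history: str) -> str:
    # Step 1: the LLM converts free-text history into structured fields.
    features = json.loads(llm(f"Extract vitals as JSON from:\n{history}"))
    # Step 2: an explicit, auditable rule base makes the actual decision.
    # (Thresholds here are illustrative, not clinical guidance.)
    if features["chest_pain"] and features["systolic_bp"] > 180:
        decision = "urgent referral"
    else:
        decision = "routine follow-up"
    # Step 3: the LLM distils the explicit outcome into plain language.
    return llm(f"Explain this triage decision to the patient: {decision}")
```

The crucial design choice is that the decision itself is made by the rule base, so the system can always point to the rule that fired; the LLM handles only the language at either end.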
In these critical domains, hybridisation could provide the explicit reasoning demanded by human agents, whilst still using LLMs to respond flexibly to natural language inputs.
Cloud & Edge Hybridisation
Another strand of this discussion comes from combining edge and cloud systems. Edge devices are those which sit near the source of the data, such as phones or sensors.
Qualcomm and IBM have both proposed combining edge with cloud AI. One model proposed by Qualcomm involves an edge device such as a phone running a smaller LLM or symbolic AI capable of most tasks, which references a full LLM in the cloud only when necessary.
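The routing logic behind such a scheme can be sketched in a few lines; the confidence threshold and both model stubs below are hypothetical, not Qualcomm's actual design.

```python
def edge_model(query: str) -> tuple[str, float]:
    """Small on-device model: returns an answer and a confidence score."""
    return "<local answer>", 0.92

def cloud_llm(query: str) -> str:
    """Full-sized LLM behind a network call (slower, costs money)."""
    return "<cloud answer>"

def respond(query: str, threshold: float = 0.8) -> str:
    answer, confidence = edge_model(query)
    if confidence >= threshold:
        return answer           # most queries stay on-device
    return cloud_llm(query)     # escalate only the hard cases
```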
The most obvious advantage of this for enterprises is the cost saving of not routing all information through the cloud.
Beyond cost, however, hybridisation also improves the service itself. Using a local model creates the opportunity for personalised AIs trained on your particular needs and behaviours, and obviates the need to constantly refer to the cloud, reducing response latency and allowing offline use.
It also creates room for secure AI systems, where sensitive data isn't transmitted to a cloud-based system or used to train it. This is an obvious concern for companies looking to reap the productivity benefits of LLMs without risking precious IP, or for those using AI alongside sensitive data, say in healthcare or child education settings.
Domain Specific Model Hybridisation
The final strand focuses on combining several domain-specific models. Creating a domain-specific model involves a variety of techniques, outlined by Arsalan Mosenia (AI lead at Google) in a recent report: you can use prompt engineering, connect the model to an external database, or train the model on domain-specific data. Google has been making advances in this field, developing an LLM for medical responses.
They used a process of “instruction prompt tuning”, where the network itself is frozen and isn’t trained; instead, additional prompts are added as prefixes to the original question, and those prompts are themselves trained to minimise error. These “soft prompts” are trained to maximise performance on domain-specific datasets without requiring complete retraining of their very large LLM, Flan-PaLM.
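The mechanics are easier to see in code. Below is a minimal PyTorch sketch of the general soft-prompt idea, assuming a base model that accepts input embeddings directly; it is not Google's actual implementation, and all names and dimensions are illustrative.

```python
import torch
import torch.nn as nn

class SoftPromptWrapper(nn.Module):
    def __init__(self, base_model: nn.Module, embed_dim: int, prompt_len: int = 20):
        super().__init__()
        self.base_model = base_model
        for p in self.base_model.parameters():
            p.requires_grad = False  # freeze the pre-trained network
        # The trainable "soft prompt": prompt_len virtual token embeddings.
        self.soft_prompt = nn.Parameter(torch.randn(prompt_len, embed_dim) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # Prefix every sequence in the batch with the soft prompt.
        batch_size = input_embeds.size(0)
        prefix = self.soft_prompt.unsqueeze(0).expand(batch_size, -1, -1)
        return self.base_model(torch.cat([prefix, input_embeds], dim=1))

# Only the prompt embeddings are handed to the optimiser, so training
# updates prompt_len * embed_dim values rather than billions of weights:
# optimiser = torch.optim.Adam([wrapper.soft_prompt], lr=3e-4)
```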
Other researchers in the biomedical field have been developing domain-specific models from the ground up, such as PubMedGPT, which, though roughly 200x smaller, achieved 50% on the MedQA benchmark compared to Flan-PaLM’s 67.4%. There are clearly pros and cons to each approach, though given that very large LLMs have already been trained, adapting pre-trained LLMs may represent a cost-effective solution without compromising on efficiency.
Until now, a single LLM has been expected to respond to all queries irrespective of domain. The development of domain-specific models and implementations offers a vision of a future where LLMs are not overly general and spread thin, with the attendant risks of hallucination and computational expense.
How these hybridised models are connected is also under discussion. Some propose that querying a chatbot would in future involve a general, gating LLM or symbolic system, which passes queries on, in the appropriate format, to an LLM trained specifically to respond to such requests. Given that these domain-specific models can be trained with fewer parameters and less data, we may reduce the overall cost of producing these models whilst simultaneously improving response depth and reducing hallucination rates.
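A gating layer of this kind could be as simple as the sketch below; the domains, classifier, and expert models are all hypothetical placeholders.

```python
# Hypothetical domain-specific "expert" models behind a single gate.
EXPERTS = {
    "medical": lambda q: "<answer from medical LLM>",
    "legal":   lambda q: "<answer from legal LLM>",
    "general": lambda q: "<answer from general LLM>",
}

def classify_domain(query: str) -> str:
    """Gating step: a small LLM or symbolic classifier picks a domain.
    A keyword check stands in for that classifier here."""
    return "medical" if "symptom" in query.lower() else "general"

def route(query: str) -> str:
    domain = classify_domain(query)
    return EXPERTS[domain](query)   # hand off to the specialist model

print(route("What does this symptom suggest?"))
```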
The Outlook
These different strands emphasise various aspects of hybridised AI systems, but they all point to the same conclusion: combining various AI systems in a sophisticated, top-down fashion can preserve the strengths of each system whilst mitigating its drawbacks.
This direction of travel also opens interesting opportunities within the sector, where we may see highly efficient domain-specific models linked up (for a cost) with a platform gating the use of the different models. Perhaps a market might emerge similar to that of entertainment apps such as Spotify, where consumers pay for a Spotify subscription, and Spotify pays the artists played on its platform.
The moat for a company in a hybrid age won't, then, necessarily be having the greatest resources to train the greatest number of parameters, but perhaps proprietary data that can be used to train a highly efficient model linked to that service.
Whilst the precise outcomes of hybridisation are unclear, the benefits are obvious enough that it will undoubtedly play a large role in the years to come.