What are the core technologies behind generative AI?

8 December 2024 - Updated on 27 August 2025

As mentioned in our article “The development of generative AI: what impact on business?”, generative AI is not a recent concept. The neural networks it relies on have been known for decades. So why did it take until 2022 for this technology to gain widespread attention? The answer lies in a series of technological breakthroughs, which we’ll explore in this article.

Before diving into the details, it’s worth noting that we’ll focus on the core technologies behind ChatGPT, which in 2023 was the most mature generative AI model in terms of reasoning, creativity, generation capacity, and its ability to connect information.

First wave: AlexNet and deep learning

It all started in 2012, when Alex Krizhevsky implemented neural networks on graphics cards (GPUs). Why was this revolutionary? Neural networks are extremely demanding in terms of computing power and data volume. They are also prone to what is called overfitting—in other words, overinterpreting patterns in training data. Until then, the only way to avoid overfitting was to reduce the number of parameters, which inevitably meant lowering model performance. GPUs changed the game by cutting training time by a factor of 100.

At the same time, very large datasets became available, such as ImageNet, which helped improve model performance. These advances were made visible to the public during the 2012 ImageNet competition. Frequently cited as a milestone in computer vision, the AlexNet architecture is considered the first convincing deep learning approach, marking the end of AI’s “winter” and the beginning of a new era. This breakthrough democratized decades of research in applied mathematics, computer science, and neuroscience, paving the way for the rise of generative AI.


Second wave: GANs

In 2014, the arrival of GANs (Generative Adversarial Networks), introduced by Ian Goodfellow, brought a new set of architectures and unsupervised methods. The principle was to pit two neural networks against each other: one generating images and the other trying to determine whether those images were real or fake. This step is often considered the first truly effective generation capability in AI. However, GANs were notoriously unstable to train.
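The adversarial principle can be illustrated with a toy sketch. The code below is a simplified illustration, not a real GAN: the “generator” output and discriminator parameters are fixed toy values, chosen only to show the two opposing loss functions.

```python
import numpy as np

rng = np.random.default_rng(0)

def discriminator(x, w, b):
    # Logistic "real vs fake" score in (0, 1)
    return 1.0 / (1.0 + np.exp(-(w * x + b)))

# Toy data: "real" samples cluster around 3, "fake" (generated) ones around 0
real = rng.normal(3.0, 0.5, 100)
fake = rng.normal(0.0, 0.5, 100)

w, b = 1.0, -1.5  # hand-picked discriminator parameters for illustration

# Discriminator objective: classify real samples as 1 and fakes as 0
d_loss = (-np.mean(np.log(discriminator(real, w, b)))
          - np.mean(np.log(1.0 - discriminator(fake, w, b))))

# Generator objective: fool the discriminator into scoring fakes as real
g_loss = -np.mean(np.log(discriminator(fake, w, b)))

print(d_loss, g_loss)
```

In a real GAN, both networks are updated alternately by gradient descent on these two losses, which is exactly the tug-of-war that makes training unstable.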

Third wave: transformers

Introduced in 2017, transformer models developed by Google Brain brought the concept of attention mechanisms to the fore. The idea is to let the model weigh all parts of its input and focus on the most relevant information during processing. Transformers quickly gained traction in natural language processing (NLP) and delivered strong results, especially with models like BERT, which remain widely used in applications today.
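The heart of the transformer is scaled dot-product attention. A minimal sketch in numpy, with random toy vectors standing in for token representations:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each query attends to every key; values are mixed accordingly."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # Row-wise softmax turns scores into attention weights summing to 1
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V               # weighted mix of value vectors

# Three tokens, each represented by a 4-dimensional vector (toy values)
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))

out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4): one context-aware vector per token
```

Unlike recurrent networks, every token attends to every other token in parallel, which is what makes transformers so well suited to GPU training.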

Fourth wave: GPT-3 and GPT-4

From there, major players such as Microsoft (OpenAI), Meta (Facebook), and Google doubled down on the development of these models to push generative AI forward. By dramatically increasing the number of parameters and combining multiple training modes, OpenAI succeeded in creating ChatGPT.

Its performance rests on several factors: the integration of four complementary training methods (detailed below) and the use of high-performance graphics processors such as Nvidia’s H100.

Self-supervised learning

In 2018, OpenAI developed the first GPT model, applying self-supervised learning to text with transformers. This approach requires far less manual supervision of data but relies on very large volumes of unlabeled data.

It also includes language learning from text corpora, largely based on massive amounts of content published on the internet. This step explains why ChatGPT can operate in multiple languages. It is also the most computationally intensive stage, even though its objective is simply to predict the next word in a sentence or to fill in a missing word.
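The key point of self-supervision is that no human labeling is needed: the text itself supplies the targets. A minimal sketch of how next-word training pairs are carved out of raw text (the corpus string is a made-up example):

```python
# Self-supervised next-word prediction: each training pair is
# (context so far, next word), extracted directly from unlabeled text.
corpus = "generative ai relies on transformer models trained at scale"
tokens = corpus.split()

pairs = [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

for context, target in pairs[:3]:
    print(context, "->", target)
```

Real pipelines operate on subword tokens over trillions of words, but the principle is the same: the model is rewarded for guessing the next token correctly.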

Supervised layer

More recently, in 2022, InstructGPT was introduced. This is a GPT model trained on conversations in supervised mode. The goal is to teach GPT how to answer questions. The process involves providing the model with a set of questions along with the expected answers. The main complexity lies in the context required for the AI to respond accurately to a given question (also known as the “prompt”).

This context essentially contains the instructions provided to the model to guide its answers. In ChatGPT, this context is hidden and embedded in the InstructGPT layer, which was trained using supervised learning. The result is a chatbot-type AI, trained on a very large text corpus and able to provide raw answers to almost any question, as long as relevant elements are present in its training data.
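The structure of such supervised examples can be sketched as follows. This is a hypothetical illustration: the variable names and the prompt template are invented for clarity, not OpenAI's actual format.

```python
# Hidden context: instructions prepended to every user question,
# invisible to the end user (illustrative wording, not OpenAI's).
HIDDEN_CONTEXT = "You are a helpful assistant. Answer concisely."

# Supervised training data: questions paired with expected answers
training_examples = [
    {"prompt": "What is a transformer?",
     "answer": "A neural network architecture built around attention."},
    {"prompt": "Who introduced GANs?",
     "answer": "Ian Goodfellow."},
]

def build_input(prompt: str) -> str:
    # The hidden context is concatenated before the user's question
    return f"{HIDDEN_CONTEXT}\n\nQuestion: {prompt}\nAnswer:"

print(build_input(training_examples[0]["prompt"]))
```

Fine-tuning then adjusts the model so that, given the assembled input, it produces the expected answer.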

Reinforcement learning

While the first three learning modes build the substance of a response, this fourth mode shapes the form, making AI replies more user-friendly. A purely supervised chatbot often produces blunt answers with little politeness or nuance.

Here, a team of human trainers asks ChatGPT a large number of questions, requests multiple answers per question, then ranks them by preference while discarding inappropriate outputs. This process, known as RLHF (Reinforcement Learning from Human Feedback), helps refine how the AI communicates. Technically, it is straightforward but requires significant human resources.
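The rankings collected from human trainers are typically turned into a training signal through a pairwise loss: the reward model should score the preferred answer higher than the rejected one. A sketch using the standard Bradley-Terry formulation (the scores below are made-up numbers for illustration):

```python
import math

def pairwise_loss(score_preferred: float, score_rejected: float) -> float:
    # Low when the preferred answer already outscores the rejected one,
    # high when the reward model disagrees with the human ranking.
    margin = score_preferred - score_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

print(pairwise_loss(2.0, 0.5))  # small loss: ranking already respected
print(pairwise_loss(0.5, 2.0))  # large loss: reward model needs correcting
```

The reward model trained this way is then used to steer the chatbot's responses via reinforcement learning, which is what gives ChatGPT its polished tone.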

The future of generative AI in the hands of the GAFAMs

By increasing dataset size and avoiding heavy reliance on manual data labeling, it became possible to build foundation models that could then be progressively refined. With InstructGPT, this refinement brought interactivity close to natural human exchanges.

Today, only the major players—OpenAI with Microsoft, Google, and Meta—have the computing power required to train ever-larger multilingual foundation models based on transformers. This is why the future evolution of generative AI foundation models partly depends on the GAFAMs.

It is important to remember that a model with massive parameters but no InstructGPT layer would be far less usable than a smaller model trained with supervised learning. This balance between size and training methodology explains why leading groups continue to focus on fine-tuning approaches as much as on scaling up parameters.

Generative AI may soon reach its limits

In 2022, DeepMind published the paper known as Chinchilla, which introduced new scaling laws for data (also called the Chinchilla scaling laws, or Hoffmann's laws) to determine the optimal data requirements for training Large Language Models (LLMs). According to these laws, a 70B-parameter LLM should ideally be trained with 1.4 trillion tokens, or about 20 tokens of text per parameter.
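The rule of thumb is easy to apply. A short sketch of the arithmetic (20 tokens per parameter, as stated above):

```python
# Chinchilla rule of thumb: ~20 training tokens per model parameter
def chinchilla_optimal_tokens(n_params: float, tokens_per_param: int = 20) -> float:
    return n_params * tokens_per_param

print(chinchilla_optimal_tokens(70e9))  # 1400000000000.0, i.e. 1.4 trillion tokens
print(chinchilla_optimal_tokens(1e12))  # 20000000000000.0, i.e. 20 trillion tokens
```

A 1-trillion-parameter model would thus call for roughly 20 trillion training tokens, which makes the data-sourcing challenge described below concrete.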

This means that, based on Chinchilla’s findings, training GPT-5 would require 11 times more data than GPT-3 and similar models. In practice, this implies sourcing, cleaning, and filtering around 33 TB of text data to train a 1-trillion-parameter model.

For context, GPT-3 has approximately 175 billion parameters. The challenge is clear: as models grow larger, we risk eventually running out of high-quality textual data to match the increasing number of parameters we want to add.

It is therefore pointless to chase ever more parameters as a way to significantly improve model performance. Future advances in this field will need to come from other directions, most notably from architecture. Why not imagine discovering a model architecture more efficient than transformers? For example, breakthroughs in fundamental mathematics, particularly in the study of high-dimensional vector spaces, could pave the way for progress beyond the architectures currently used in AI.

Finally, while machine learning research is not intended to faithfully replicate the functioning of the human brain, certain mechanisms studied in neuroscience may inspire new ideas for future AI developments. Early neuron models, hierarchical models of vision, and attention mechanisms are just some examples of how insights from neuroscience have already influenced AI, and may continue to do so.
