Your AI specialists will explain that an AI project typically starts with preparing the data, cleaning and organizing it to make it usable, followed by selecting, testing, and refining an appropriate model through tuning and evaluation. This has been the standard machine-learning workflow for years, and data scientists in many companies still follow it.
However, this view overlooks recent advancements in AI, particularly the rise of GPT, and misses the significance of the “P” in GPT. “GPT” stands for “Generative Pre-trained Transformer”: because the model has already been trained on vast amounts of data, you can often skip the data preparation and cleaning steps entirely.
I’m not suggesting that the traditional approach is obsolete. Rather, there are many use cases where data preparation can be replaced by prompt tuning, allowing you to move directly to the testing and (prompt) refinement phase.
This approach is especially efficient when data preparation would be a massive task. In such cases, you can implement a “4-eyes control” model: GPT generates a response, and a human validates it. Validation typically takes far less time than preparing the data would have. Whether a response was modified by the reviewer is valuable information in itself, so be sure to record it; the validated results you accumulate effectively prepare your data as a by-product of using the AI.
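The review loop above can be sketched in a few lines. This is a minimal illustration, not a production implementation; the names `ReviewedResponse` and `four_eyes_review` are hypothetical, and the key point is simply that the “was it modified?” signal is preserved alongside the validated answer.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ReviewedResponse:
    prompt: str
    generated: str      # raw model output
    validated: str      # text after human review
    was_modified: bool  # preserved signal: did the reviewer change it?

def four_eyes_review(prompt: str, generated: str,
                     human_correction: Optional[str]) -> ReviewedResponse:
    """Apply the '4-eyes' control: a human either approves the generated
    answer as-is or replaces it with a corrected version."""
    if human_correction is None or human_correction == generated:
        return ReviewedResponse(prompt, generated, generated, was_modified=False)
    return ReviewedResponse(prompt, generated, human_correction, was_modified=True)
```

Logging these records over time gives you exactly the cleaned, validated dataset the traditional workflow would have required up front.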
“Yes, but GPT doesn’t know my data.” That’s true. However, methods like RAG (Retrieval-Augmented Generation) let you prepare a context for GPT so that, at runtime, it can retrieve the information it needs. Since no training is involved, the volume of data isn’t a concern: you simply gather the relevant documents to feed into the context.
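The mechanics can be sketched as follows. This is a deliberately naive illustration: a real RAG system would use embeddings and a vector store rather than word overlap, and the corpus and function names here are invented for the example.

```python
def retrieve(query: str, documents: list, k: int = 2) -> list:
    """Rank documents by naive word overlap with the query (a stand-in
    for real embedding-based similarity search)."""
    q = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query: str, documents: list) -> str:
    """Assemble the context the model sees at runtime: no training step,
    just the relevant documents plus the user's question."""
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

Swapping the retrieval function for a proper semantic search is the only change needed to scale this pattern to a large document collection.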
In conclusion, while the traditional machine learning approach remains relevant for many use cases, it’s important not to overlook the power of the “P” in GPT.