I used to think fine-tuning a model was the only way to get good results
For the last project I was on, we needed an AI to summarize very specific technical reports. My first thought was to gather a huge dataset and spend weeks fine-tuning a base model, which seemed like the proper way to do it. After a month of slow progress and high cloud compute costs, a friend suggested I try prompt engineering with a much larger, general-purpose model first.

I was skeptical, but I gave it a shot with GPT-4, crafting a detailed prompt that included examples of the input and the exact output format we needed. To my surprise, after about three rounds of tweaking the instructions, it started producing summaries that were 95% as good as what we wanted, for a fraction of the time and money.

It completely changed my view on where to start a project. Now I always run a prompt engineering test before even considering a fine-tuning pipeline. Has anyone else had a similar experience where a simpler prompt fix solved what you thought was a complex training problem?
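For anyone curious what I mean by a "detailed prompt with examples," here is a rough sketch of the structure. The helper name, field names, and output format below are all made up for illustration; the real prompt was domain-specific:

```python
# Minimal sketch of a few-shot summarization prompt builder.
# All names and the output format here are illustrative, not our real ones.

FORMAT_SPEC = (
    "Return exactly three sections:\n"
    "FINDINGS: <one paragraph>\n"
    "RISKS: <bulleted list>\n"
    "ACTIONS: <bulleted list>"
)

def build_summary_prompt(report: str, examples: list[tuple[str, str]]) -> str:
    """Assemble instructions, few-shot examples, and the new report into one prompt."""
    parts = [
        "You summarize technical reports for engineers.",
        FORMAT_SPEC,
    ]
    # Each (input, output) pair shows the model the exact format we expect.
    for i, (src, summary) in enumerate(examples, 1):
        parts.append(f"--- Example {i} input ---\n{src}")
        parts.append(f"--- Example {i} output ---\n{summary}")
    parts.append(f"--- Report to summarize ---\n{report}")
    return "\n\n".join(parts)

prompt = build_summary_prompt(
    "Pump B vibration exceeded limits during test 4.",
    [("Valve A leaked at 80 psi.", "FINDINGS: ...\nRISKS: ...\nACTIONS: ...")],
)
```

The resulting string just goes to whatever chat API you use; the improvement came from iterating on the instructions and examples, not from any training.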