I used to think fine-tuning a model was the only way to get good results
For the last project I was on, we needed an AI to summarize very specific technical reports. My first thought was to gather a huge dataset and spend weeks fine-tuning a base model, which seemed like the proper way to do it. After a month of slow progress and high cloud compute costs, a friend suggested I try prompt engineering with a much larger, general model first. I was skeptical, but I gave it a shot with GPT-4, crafting a really detailed prompt that included examples of the input and the exact output format we needed. To my surprise, after about three tries tweaking the instructions, it started producing summaries that were 95% as good as what we wanted, for a fraction of the time and money. It completely changed my view on where to start a project. Now I always run a prompt engineering test before even considering a fine-tuning pipeline. Has anyone else had a similar experience where a simpler prompt fix solved what you thought was a complex training problem?
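For illustration, here is a minimal Python sketch of how a detailed prompt like the one described might be assembled: task instructions, one worked input/output example, and an exact output format, followed by the new report. The report text, field names, and helper function are hypothetical placeholders, not the actual prompt from the project.

```python
# Hypothetical worked example pair shown to the model (few-shot prompting).
EXAMPLE_REPORT = "Vibration readings on pump P-104 exceeded 7 mm/s during the night shift."
EXAMPLE_SUMMARY = (
    "Component: P-104\n"
    "Issue: Excess vibration\n"
    "Action: Bearing replacement scheduled"
)

def build_summary_prompt(report_text: str) -> str:
    """Assemble a detailed summarization prompt: instructions,
    the required output format, one example, then the new report."""
    return (
        "You are a technical writer. Summarize the maintenance report "
        "below using exactly this format:\n"
        "Component: <name>\nIssue: <one line>\nAction: <one line>\n\n"
        f"Example report:\n{EXAMPLE_REPORT}\n"
        f"Example summary:\n{EXAMPLE_SUMMARY}\n\n"
        f"Report to summarize:\n{report_text}\n"
        "Summary:"
    )

prompt = build_summary_prompt("Coolant leak detected at valve V-22 during startup.")
# The assembled prompt string would then be sent to a large general model
# (e.g. via an LLM chat API) instead of fine-tuning a smaller one.
```

The point of the structure is that the model sees both a concrete example and the exact format spec, which is what made the "three tries of tweaking" loop fast compared to a training pipeline.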
3 comments
nancy_wood · 24d ago
Last year I spent two months trying to fine-tune a model for customer service emails. My team was about to give up because the results were so bad. I finally tried a different approach and wrote a very simple, step-by-step prompt for a bigger model. It understood the task perfectly on the first try. I felt pretty silly for wasting all that time and money. Now my first step is always to see if a good prompt can do the job.
wells.christopher · 24d ago
Spent way too long trying to train a model before realizing I just needed better instructions.
richard_young80 · 24d ago
Totally get that, @wells.christopher. I wasted a weekend trying to make a model sort data before I just wrote a clearer prompt and it worked fine.