TLDR: I go from wanting a machine learning model to getting that trained model, without actually having a dataset.
I finally got GPT-3 access (big thanks to GDB at OpenAI) and took a stab at a fun side project with it.
I decided that I wanted to be able to say "train me a classifier for X" and have a procedure spit out a trained classifier, without manually collecting a dataset or doing any annotation.
I was curious if GPT-3 lets me do this: can I use generation to create a dataset for a language task and then train a model for that task that actually works?
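To make the idea concrete, here's a minimal sketch of the kind of pipeline I have in mind. This is just an illustration, not the code from this post: I'm assuming a made-up sentiment task, the legacy GPT-3 Completions endpoint (the pre-1.0 `openai` client), and a small scikit-learn model trained on the generated examples. The prompt format, label set, and helper names are all my own.

```python
# Rough sketch: ask GPT-3 to generate labeled examples, then train a small
# supervised model on them. Illustration only; task, prompt, and names are made up.
import openai  # legacy (<1.0) client with the Completions endpoint GPT-3 launched with
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

openai.api_key = "sk-..."  # your API key


def generate_examples(label, n=50):
    """Prompt GPT-3 for short texts matching `label` (hypothetical prompt format)."""
    prompt = f"Write a short movie review that is {label}.\nReview:"
    texts = []
    for _ in range(n):
        resp = openai.Completion.create(
            engine="davinci", prompt=prompt, max_tokens=60, temperature=0.9, stop="\n"
        )
        texts.append(resp["choices"][0]["text"].strip())
    return texts


# Build a tiny generated "dataset" and train an ordinary classifier on it.
data = [(t, "positive") for t in generate_examples("positive")] + \
       [(t, "negative") for t in generate_examples("negative")]
texts, labels = zip(*data)
clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(texts, labels)
print(clf.predict(["One of the best films I have seen in years."]))
```

The interesting question is the first half of that sketch: whether GPT-3's generations are faithful enough to the task that the second half produces a classifier that actually works.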
Why? My main motivator was research curiosity, specifically understanding the fidelity of few-shot generation with large language models. I do a lot of work on data augmentation (e.g. model patching, preprint coming soon), so I'm always interested in ways we can cheaply increase the amount of data available for a task.
Another, more practical consideration is that it would be nice to bootstrap a model very quickly without data collection, which is a fairly expensive and time-consuming process.
Let me add here that I don't actually expect this to work very well.
Building machine learning models is hard, and reducing this process to the press of a button is highly non-trivial. But it's still fun to try to understand what's possible and what's not, and hopefully learn a few things along the way.
Lastly, you could just use GPT-3 itself as a few-shot model, but there are a lot of reasons you might not want to: maybe you're in a resource-constrained setting and want a small model that's very good at one thing only, or maybe you don't want to share your (test) data with OpenAI.
So to summarize, I want a procedure that looks like: