Improve question text
23 March, 2023
The demo shows an example interface for form creators writing questions. Enter the title of a question and get a possibly better title back.
How it works
This passes the question to the gpt-3.5-turbo model using a very simple prompt:
> I want you to act as a GDS content designer. I will give you the text of a question in a markdown code block and you will correct my grammar mistakes, typos, and spelling errors. I want you to make the question clear and simple to understand, using common language, so that an 11 year old can read it. The question should be phrased the way you would ask it in person. Respond with only the question text.
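Roughly, the call looks like this - a minimal sketch assuming the pre-1.0 openai Python package with an API key in the environment, and abbreviating the prompt above:

```python
import openai  # pre-1.0 package; reads OPENAI_API_KEY from the environment

# The system prompt is the one quoted above, shortened here for space.
PROMPT = (
    "I want you to act as a GDS content designer. I will give you the text "
    "of a question in a markdown code block [...] Respond with only the "
    "question text."
)

def improve_question(question: str) -> str:
    # The question goes in a markdown code block, as the prompt promises.
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": PROMPT},
            {"role": "user", "content": f"```\n{question}\n```"},
        ],
        temperature=0,  # deterministic-ish output; the demo's settings aren't shown
    )
    return response["choices"][0]["message"]["content"].strip()
```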
What I've learned from this example
Results are mixed
There are some promising results for examples from real forms - here are some from the MCA form and then the Insolvency Service.
These questions from the same forms didn't do so well. The first question almost works, but lacks the context that this is a form for people volunteering to become coastguards, so 'respond' is replaced by the weaker 'reply'. This might be solved by using more (or all) of the form as context to get better results.
Things really fall apart with the National Insurance question, resulting in something totally different.
To fix this it might be possible to use more context. A second call to the model, with a different prompt that judges the quality of the output or gives it a confidence score, might also make it easier to present the results to the user.
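As a sketch of what that second call could look like (the judging prompt and the 0-10 scale here are my own invention, not from the demo):

```python
import openai

JUDGE_PROMPT = (
    "You will be shown an original form question and a rewritten version, "
    "each in a markdown code block. Give a score from 0 to 10 for how "
    "confident you are that the rewrite asks the same question more clearly. "
    "Respond with only the number."
)

def confidence_score(original: str, rewrite: str) -> int:
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": JUDGE_PROMPT},
            {"role": "user", "content": f"```\n{original}\n```\n\n```\n{rewrite}\n```"},
        ],
        temperature=0,
    )
    return int(response["choices"][0]["message"]["content"].strip())
```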
Leaking implementation
Here the model is betraying how it works and triggering anti-abuse measures from the API. Changing the prompt and wrapping the question in markdown code blocks seems to fix this, resulting in the second table below.
This kind of failure is hard to detect directly, so we add extra steps to check that the result is a rephrasing of the original question.
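One cheap first check might be a textual-similarity guard - a genuine rephrasing should share some wording with the original. A sketch, with an arbitrary threshold to tune against real examples:

```python
from difflib import SequenceMatcher

def looks_like_rephrasing(original: str, rewrite: str, threshold: float = 0.3) -> bool:
    # Ratio is 1.0 for identical strings, near 0.0 for unrelated ones.
    ratio = SequenceMatcher(None, original.lower(), rewrite.lower()).ratio()
    return ratio >= threshold
```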
We might also be able to encode the outputs better, giving the model the chance to return an improvement, skip the question, or decline to modify it.
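For example, we could ask the model to reply with a small JSON object and treat anything that doesn't parse as a refusal. The shape and the labels here are assumptions, not something the demo does:

```python
import json

# Assumed response shape: {"action": "improve" | "skip" | "decline",
#                          "question": "<only present when action is improve>"}
def parse_result(raw: str) -> dict:
    try:
        result = json.loads(raw)
    except json.JSONDecodeError:
        return {"action": "decline"}
    if result.get("action") not in ("improve", "skip", "decline"):
        return {"action": "decline"}
    return result
```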
Encoding the original question in markdown code blocks suggests that people could escape them by closing the blocks themselves and then adding a new prompt. This didn't work in practice, but it's probably possible and something we need to guard against.
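One defence would be to strip backticks from the question before embedding it, so it can't close the code block early - a sketch:

```python
import re

def sanitise(question: str) -> str:
    # Remove backtick runs so the embedded question can't end the code block.
    return re.sub(r"`+", "", question)

def wrap_question(question: str) -> str:
    return f"```\n{sanitise(question)}\n```"
```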
Preventing abuse is a challenge
What happens when someone enters nonsense or abusive questions? I'm not sure what the result *should* be here. I'm sure there are ways to trigger abusive or offensive messages though.
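One option would be to screen the question with OpenAI's moderation endpoint before it reaches the chat model at all - sketched here with the same pre-1.0 package:

```python
import openai

def is_flagged(question: str) -> bool:
    # True when the moderation endpoint flags the text as abusive or unsafe.
    response = openai.Moderation.create(input=question)
    return response["results"][0]["flagged"]
```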
Still robust
These examples show that the model still produces useful results given highly stylised inputs, maybe from the Bristol/London offices:
Future work
Here are some things to consider trying:
- What's the best interface for offering these suggestions to users? Does a 'Check your answers' screen work?
- Can we use more context to improve the results? Maybe the whole form?
- Can we use a more 'conversational' interface to get better results? Maybe we could ask the user for more context if something isn't clear.
- We need help from content designers to define what good looks like and what considerations we need to get there.
- Would this fit as the last stage in a pipeline for extracting questions from document-based forms?