Inferring answer type from question text

Should question text be used to work out answer type?

Demo

This example shows how we can infer the answer type from the question text.

Given the text of the question, it will suggest an answer type and show a preview of how the question will be presented to the user filling out the form.

It classifies questions into the following six labels: text, number, date, gender, national insurance number and telephone number.


How it works

This example uses a few-shot approach to build a classifier: a small number of labelled examples are included in the prompt, and the model is asked to complete it with a label for the new question.

The prompt ends up looking like this:

       Classify into 6 labels: text, number, date, gender, national insurance number, telephone number

       Text: What is your date of birth?
       Label: date

       Text: How many pets do you have?
       Label: number

       Text: What is your pet's name?
       Label: text

       Text: What is your gender?
       Label: gender

       Text: What is your national insurance number?
       Label: national insurance number

       Text: How can we contact you by phone?
       Label: telephone number

       Text: What’s your phone number?
       Label:
      
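Here's a minimal sketch of how the completion call could look in Python, assuming the OpenAI chat completions client. The post doesn't say which model or library the demo actually used, so the model name is a placeholder:

    # Sketch of the few-shot classification call. Assumes the OpenAI
    # Python client; the model name is a placeholder.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    EXAMPLES = [
        ("What is your date of birth?", "date"),
        ("How many pets do you have?", "number"),
        ("What is your pet's name?", "text"),
        ("What is your gender?", "gender"),
        ("What is your national insurance number?", "national insurance number"),
        ("How can we contact you by phone?", "telephone number"),
    ]

    def build_prompt(question):
        """Assemble the few-shot prompt shown above."""
        lines = ["Classify into 6 labels: text, number, date, gender, "
                 "national insurance number, telephone number", ""]
        for text, label in EXAMPLES:
            lines += [f"Text: {text}", f"Label: {label}", ""]
        lines += [f"Text: {question}", "Label:"]
        return "\n".join(lines)

    def classify(question):
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[{"role": "user", "content": build_prompt(question)}],
            max_tokens=10,
            temperature=0,  # we want a deterministic label
        )
        return response.choices[0].message.content.strip()

    print(classify("What's your phone number?"))  # expected: telephone number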

A more sophisticated approach would be to use a larger pool of examples and to select the most relevant ones based on the question text being presented, as in the sketch below. This would keep the prompt within token limits and sharpen the answers.
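One hypothetical way to do that selection is to score the labelled pool by token overlap with the incoming question; a real implementation would more likely use embedding similarity. The pool shape and scoring here are illustrative:

    # Hypothetical example selection: score the labelled pool by token
    # overlap with the incoming question and keep the top k. A real
    # implementation would probably use embedding similarity instead.
    def select_examples(question, pool, k=6):
        q_tokens = set(question.lower().split())
        def overlap(example):
            text, _label = example
            return len(q_tokens & set(text.lower().split()))
        return sorted(pool, key=overlap, reverse=True)[:k]

    # e.g. select_examples("What's your mobile number?", EXAMPLES, k=3)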

What I've learned from this example

LLMs don't just model English

Questions taken from a form produced by the Singapore forms team still produce the correct results:

  • Gender/ 性别/ பாலினம்/ লিঙ্গ
  • Date of Birth / 出生日期 / জন্ম তারিখ / பிறந்த தேதி / tarikh lahir / วันที่เกิด / မွေးနေ့
  • Phone Number / 手机号码 / মোবাইল নম্বর / கைபேசி எண் / telefon / หมายเลขโทรศัพท์ / ဖုန်းနံပါတ်
  • Test site/ 检验地点/ சோதனை தளம்/ পরীক্ষার সাইট

It's not as brittle as I thought it would be

It's surprisingly robust to typos and other errors.

  • give us yer nino
  • snaffle goop yog

Better specification of the task leads to better results

Some questions are classified as numbers when really they should be text. I've added national insurance number and telephone number to differentiate between numbers and text.

For example, “What is your reference number?” is marked as a number, when most digital services would treat it as text: it would not be manipulated as a number in validation or further processing.

This hints that “number” might not be the correct term for this answer type. A better label might be “amount”, “quantity” or maybe “natural number”.

There are ways this could be fixed:

  • The prompt could be adapted dynamically to include more examples of numbers to produce more precise results
  • Further processing of the question text might help to determine the correct answer type, as sketched below
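As a sketch of that second option, a rule-based override could catch identifier-style questions that the model labels as numbers. The keyword list here is made up for illustration:

    # Illustrative post-processing: questions about identifiers are
    # usually stored as text, not manipulated as numbers.
    IDENTIFIER_HINTS = {"reference", "account", "serial", "passport", "case"}

    def refine_label(question, label):
        if label == "number" and set(question.lower().split()) & IDENTIFIER_HINTS:
            return "text"
        return label

    refine_label("What is your reference number?", "number")  # -> "text"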

Interestingly, “What is your fax number?” is correctly classified as a phone number.

LLMs make prototyping ideas very fast

This was easy to implement. The only 'programmer' work is a bit of frontend and managing the prompt generation steps. The actual prompt for classification is very simple.

This isn't production code though - it hasn't been tested well and operational concerns like token limits haven't been considered. There are other issues too, like prompt injection attacks, and there may be inputs which produce incorrect or abusive results.

Testing may end up being the bulk of the work needed to deploy features using LLMs.
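Even a basic regression suite needs a labelled set of question/label pairs and some tolerance for the model shifting underneath you. A hypothetical pytest-style sketch, reusing the classify() function from the earlier sketch; the examples and threshold are illustrative:

    # Hypothetical regression test: run the classifier over a labelled
    # set and fail if accuracy drops below a threshold.
    LABELLED = [
        ("What is your date of birth?", "date"),
        ("give us yer nino", "national insurance number"),
        ("What is your fax number?", "telephone number"),
    ]

    def test_classifier_accuracy():
        correct = sum(classify(q) == label for q, label in LABELLED)
        assert correct / len(LABELLED) >= 0.9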

Possible use cases

Could this be part of a form making interface?

Could this be used to build forms faster - maybe asking for a list of questions, then presenting a 'check your answers' screen for the form creator to review and refine? This would neatly turn the creator's problem from a 'design' task into a 'selection' task.

It's not clear how well this would scale to more answer types though, or how well it would hold up against real users' question text. It also doesn't account for answer type options - there is no way to specify that a name requires a title, for instance.

Our current approach is very clear, captures users' intent well, and is easy to understand and verify. The strongest argument against the approach above is that it uses something very complicated and hard to test to solve a problem which is already solved using much simpler, more intentional techniques.

Migrating existing questions

I've chosen gender here because it's an answer type we don't currently use.

If, in the future, the design system introduces a best practice for asking users for gender, would we be able to upgrade existing forms which currently use a text box or options list?

Using this classifier, we could check every question and 'upgrade' any that match to use the new pattern.
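A sketch of that migration pass, assuming questions are stored as simple records with a text and an answer type - the storage shape is made up for illustration:

    # Hypothetical migration pass: flag existing questions whose stored
    # answer type disagrees with the classifier's suggestion of gender.
    def find_upgrade_candidates(questions):
        for q in questions:
            suggested = classify(q["text"])  # classify() from the earlier sketch
            if suggested == "gender" and q["answer_type"] != "gender":
                yield q, suggested

    form = [{"text": "What is your gender?", "answer_type": "text"}]
    for question, suggestion in find_upgrade_candidates(form):
        print(f"Upgrade {question['text']!r} to the {suggestion} pattern")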

More points to help creators improve their forms

Gender is also a difficult topic in forms - we would like to discourage creators from asking the question unless they need it. If they do need it, ideally they would be using a best practice pattern.

Currently we have no way of stopping creators using a text box or selection list to ask the question. Maybe, by being able to detect the question, we could intervene by dispatching a content designer to tell them off.