Statistics.gov.scot improvement project: alpha user research report

Research to improve statistics.gov.scot, the Scottish Government’s site for open access to Scotland’s official statistics, by testing redesigned portal prototypes and publishing platforms with current and potential users. This user research is part of the alpha project to enhance the service.


Overall findings, needs and recommendations

This final section combines analyses of all qualitative and quantitative data to tell the overall story of the user research. These overall findings are then translated into broad user needs for portal users and data publishers, and recommendations for Beta.

Findings

Finding 1: High expectations for search, filtering, and accessibility

When using the open data portal prototypes, the instincts of most participants were to use the main search bar to discover data. They expected this to function like a Google search, with autocomplete, suggestions, tolerance for typos and approximate string matching (fuzzy search). At the same time, where users had a clearer idea of what they were looking for, they tended to try finding this by navigating by theme.
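As an illustration of the typo tolerance participants described, the following is a minimal sketch of approximate string matching using Python's standard `difflib` module. The dataset titles, the `suggest` function, and the `limit` and `cutoff` values are assumptions made for this example; they are not taken from the prototypes, and a production search would more likely rely on a dedicated search index.

```python
import difflib

# Hypothetical dataset titles, invented for illustration only.
DATASET_TITLES = [
    "Population estimates",
    "House prices",
    "School attendance",
    "Recorded crime",
]


def suggest(query, titles=DATASET_TITLES, limit=3, cutoff=0.6):
    """Return titles that approximately match the query, best match first."""
    # Map lower-cased titles back to their display form so matching ignores case.
    lowered = {title.lower(): title for title in titles}
    # get_close_matches ranks candidates by similarity ratio and drops
    # anything scoring below the cutoff (an assumed threshold of 0.6).
    matches = difflib.get_close_matches(
        query.lower(), list(lowered), n=limit, cutoff=cutoff
    )
    return [lowered[match] for match in matches]


if __name__ == "__main__":
    # A misspelt query can still surface the intended dataset.
    print(suggest("populaton estimates"))
```

Raising the cutoff makes matching stricter and lowering it makes it more forgiving, which is the kind of trade-off that would need testing with users.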

More experienced participants were wary of the relatively minimal filtering options on the portal prototypes, given the potentially high number of datasets, although this may be resolved by an efficient search. Participants expected the portal and publishing platform to be accessible to a wide range of users, including disabled users and users of assistive technology. Given the potentially complex nature of navigating tables, charts, and maps for some disabled users and users of assistive technology, ensuring these are accessible will require planning and testing.

Finding 2: Labels and terminology are limiting comprehension

Many participants stumbled on labels and terminology such as ‘featurecode’, ‘datacode’, ‘statistic’, ‘API’, and the use of S-codes, and asked for simpler language wherever possible. This was particularly apparent when it came to data dictionaries, filtering data, visualising data, and working with tabular data. Participants wanted keys or legends for charts, and asked for a short explanation of what each element means. Some also struggled to understand broader categories such as ‘themes’ and ‘organisation’, and suggested examples of each would help. Finding language that balances specificity with broad familiarity and comprehension requires close attention to content design, including specific work on the language, labelling, and hierarchy that is visible to users through the interface, as well as the underlying information architecture.

Finding 3: Design, metadata and lineage affect trust and confidence

Participants felt reassured by the use of the Scottish Government Design System, but this confidence could be quickly undermined without a prominent ‘last updated’ (and ideally ‘next expected update’) date, surfaced from search results onwards. Participants expected short, prominent, dataset-specific summaries at the top of the dataset landing page, and would like links to related/parent datasets. Participants also looked for a data dictionary and licence information, as these inspired trust and confidence in using data; when these were missing, even experienced users hesitated. In this context, lineage refers to where the data has come from and any processing it has undergone before being made available. As with Finding 2, acting to find the right language and hierarchy here requires close attention to content design.

Finding 4: Data viewer and charting need a gentler on-ramp and should be accessible

Participants hoped for a default chart or table to preview data, with a brief explanation of what they are seeing. Participants were initially enthused by chart and table-building tools but were soon confounded by unintuitive steps, confusing labels, unclear axes and difficulty recovering from errors. These tools were also effectively inaccessible to most participants using assistive technology, including keyboard-only users.

Finding 5: Local area discovery is a strong public hook

Many participants wanted to enter a postcode, neighbourhood, or local area and see relevant local statistics and an interactive map, which were seen as a desirable ‘way in’ to understanding the purpose and scope of the open data portal, alongside search and themes. Where datasets included GeoJSON formats, a map preview was appreciated and again helpful to understand context.

Finding 6: Publishers are reassured by guidance, progress, QA, and preview

Publishers found CKAN Admin and Workflow Manager broadly intuitive and more logical than the current system, but repeatedly asked for required-field indicators and help in-context (e.g. beside fields). Brief, first-time guidance and a glossary may help with onboarding new users and prompting less frequent users. Publishers felt reassured by progress indicators, the ability to save progress, and clear confirmations for large uploads. Automated QA was seen as very helpful (if possible) to enforce standards. Publishers would like to preview a dataset before they commit to publishing on the live portal, perhaps in a ‘staging’ or ‘test’ area that mimics the live portal. Publishers also valued minimal jargon and plain-English labelling where appropriate.
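The required-field indicators and automated QA that publishers asked for could take many forms. The sketch below shows one simple possibility: checking a dataset's metadata before publishing and returning plain-English messages. The field names in `REQUIRED_FIELDS`, the `check_metadata` function, and the date rule are hypothetical and chosen for illustration; they do not describe CKAN, Workflow Manager, or any agreed publishing standard.

```python
from datetime import date

# Hypothetical required metadata fields, assumed for this example only.
REQUIRED_FIELDS = ("title", "description", "licence", "last_updated")


def check_metadata(metadata):
    """Return a list of plain-English problems; an empty list means all checks passed."""
    problems = []

    # Flag any required field that is absent or blank.
    for field in REQUIRED_FIELDS:
        value = metadata.get(field)
        if value is None or str(value).strip() == "":
            problems.append(f"'{field}' is required but missing")

    # If a 'last updated' value is present, check that it parses as an ISO 8601 date.
    last_updated = str(metadata.get("last_updated") or "").strip()
    if last_updated:
        try:
            date.fromisoformat(last_updated)
        except ValueError:
            problems.append(
                "'last_updated' must be a valid ISO 8601 date, for example 2025-01-31"
            )

    return problems


if __name__ == "__main__":
    draft = {"title": "Example dataset", "last_updated": "31/01/2025"}
    for problem in check_metadata(draft):
        print(problem)
```

Returning every problem at once, rather than stopping at the first, would let a publisher correct a draft in a single pass before previewing it.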

Finding 7: Help and clear contact routes are valuable to all participants

Prominent, easily findable ‘how-to’ help was reassuring to all participants, across all prototypes. On the portal, clear contact details for both the team managing the portal and teams responsible for individual datasets were appreciated. Participants were happy to use a contact form but appreciated email contact too, with information on realistic response times helping to set expectations.

User needs

The user needs catalogue created during Discovery was reviewed in light of the Alpha research, resulting in some minor updates. The revised user needs catalogue can be found in Appendix C. At a high level, taking in the findings from Discovery and Alpha, we determined five overall user needs for those using the Open Data Portal:

  • Find and Discover
  • Understand
  • Visualise
  • Access and Download
  • Support and Feedback

We also determined four overall user needs for those using the data publishing platform:

  • Create Publishing Task
  • Manage Publishing Task
  • Run Publishing Task
  • Support and Feedback

In addition, for administration and support on both the Open Data Portal and the data publishing platform, we determined two further overall user needs:

  • Manage Platform
  • Other (non-functional, e.g. performance, scalability, security, maintenance)

These high-level user needs are helpful to organise lower-level (i.e. specific, granular) user needs in the form of user stories, which can then be brought into sprint backlogs for planning purposes as required. These have been incorporated into the revised user needs catalogue in Appendix C.

In summary, for an Open Data Portal, users need:

  • minimal jargon and plain-English labelling wherever possible
  • efficient, intuitive search (e.g. typo-tolerant, fuzzy string searching)
  • a simple preview of tabular data
  • example visualisations with clear labelling (e.g. an example graph with uncoded x and y axes)
  • prominent and useful metadata (including easy-read data dictionaries)
  • short paths to multiple data download formats and APIs
  • quick access to key statistics about their local area

In summary, for a data publishing platform, data publishers need:

  • a reliable, efficient, and logical service
  • clear guidance (in context) and help documentation
  • minimal jargon and plain-English labelling where appropriate
  • clear progress indicators with the option to save progress
  • automated quality assurance (QA)
  • preview and/or staging before publishing

User groups

The six user groups identified during Discovery research and maintained for this Alpha research were reviewed for suitability and applicability for Beta, with the recommendation that these groups be maintained for further testing in Beta:

  • general citizens: someone who may be occasionally interested in what’s behind the headlines that affect them (including those who self-identify as having low digital literacy)
  • inquiring citizens: someone who maintains a keen interest in specific issues and may occasionally use statistics (e.g. charity, third sector or think-tank employee)
  • commercial users: someone who is interested in specific datasets that are useful for achieving business objectives, working in the private sector (e.g. at an energy company or bank)
  • technical/expert users: someone for whom working with and talking about data is part of daily life (e.g. academics, data journalists, developers)
  • public sector/policy influencers: someone who researches specific issues and uses statistics as evidence to inform others (e.g. policy advisers, statisticians, councils, health boards)
  • data publishers: someone who collates and provides data to be published on the site (both regular and infrequent)

Recommendations for Beta (and beyond)

The recommendations below offer general principles distilled from the research to date, along with examples of suggested actions to take forward. Implementation should take a phased approach, beginning with the basics of easier publishing, updating, finding and accessing datasets, before moving on to more advanced features such as visualisations, automated QA and local statistics. Some of these may extend beyond Beta in terms of implementation, but all are worth considering now for inclusion later.

Recommendations for beta
Recommendation | Suggested actions | Domain
Serve the fundamentals of findability and discoverability | Implement autocomplete with fuzzy search/synonyms/typo-tolerance | Open Data Portal
 | Surface ‘last updated’ prominently in search results and ensure metadata is easy to find on dataset landing pages | Open Data Portal
 | Replace codes with descriptors | Open Data Portal
 | Include prominent ‘how-to’ guidance and in-context help text | Open Data Portal, Data Publishing Platform
Surface and explain previews | Show a meaningful table and an example chart by default on dataset pages | Open Data Portal
 | Place a short ‘you can’t break the data’ explainer within API information and any analyse/visualise tools | Open Data Portal
 | Allow data publishers to fully preview their dataset in a staging/preview/test version of the live site | Data Publishing Platform
Develop and test for accessibility | Schedule and act on a WCAG 2.2 (AA) audit | Open Data Portal, Data Publishing Platform
Support local area exploration | Add postcode search and a small set of curated, topical collections | Open Data Portal
 | Include two or three ready-made visuals (e.g. infographics) for popular datasets to interest less experienced users | Open Data Portal
Clear steps, guidance and reassurance for data publishers | Offer help text on field labels and links to ‘how-to’ guidance | Data Publishing Platform
 | Allow ‘save progress’ | Data Publishing Platform
 | Clarify status language (e.g. ‘in progress’, ‘last run – successful/failed’) | Data Publishing Platform
 | Offer concise run logs | Data Publishing Platform
 | Provide options to assign datasets for review and a lightweight audit trail | Data Publishing Platform

Concluding remarks

The Alpha research supported the core findings from Discovery and informed requirements for Beta. The analysis of qualitative and quantitative data generated through engagement with an appropriate range of users suggests that a search-centred, plain-language portal with simple data previews, quick multi-format downloads and simple API access will meet the broadest set of user needs. For publishers, testing demonstrated that a guided, auditable workflow with staged validation and safe publishing controls is a sensible model to pursue. Both the portal and the publishing platform should conform to the Scottish Government Design System and WCAG 2.2 (AA) as far as possible, with clarity and accessibility as guiding principles.

The next sections offer further detail on each round of user testing, including analyses of qualitative and quantitative data, and interim findings.

Contact

Email: auren.clarke@gov.scot
