As an administrator in the higher education sector I have discussed, researched, pondered and weighed up approaches to running successful student survey campaigns, with varying degrees of success. Some ran within defined parameters (e.g. NSS[1] and DLHE[2], each with its own restrictions on how the survey must operate); others were internal surveys that I had a fairly free hand to oversee.

Anyone given the task of running a survey (who, like me, came to it as a novice) considers survey method and promotion, and forms their own views about what works. The response rate tends to be the most pressing concern and I’ve listened to debates about whether 20% or 30% is ‘enough’ to be considered representative.

The first point to make is that statistical models or assumptions can only be indicative: surveys are not precise tools and often rely on those who ‘self-select’ by taking part. An anecdotal theme often cited in support of survey promotion in institutions (particularly related to the NSS) is that those in the survey population with a grievance of some kind will respond first, and those generally satisfied need to be mobilised to take part. This illustrates the self-selection bias (a form of non-response bias) inherent in survey data.

#### Sample size

The most surprising element to non-statisticians is the mathematical relationship between sample and population. It’s obvious that estimates become more precise as the sample size increases, and that resource restrictions (and apathy, access, time…) prevent a full census from being taken. What is less obvious is that the gain in accuracy from each additional response gets smaller and smaller as the sample grows. Assuming a confidence level[3] of 95%, once the sample reaches around 500 the margin of error is already below +/-5%, and an increase in population size has virtually no effect on it, provided the population is much larger than the sample.
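To make the relationship concrete, here is a minimal sketch of the usual textbook calculation for the margin of error of a proportion. It assumes a simple random sample, the worst-case proportion of 50% and a 95% confidence level (z ≈ 1.96), and it applies the standard finite population correction when a population size is given. The function name `margin_of_error` and the population sizes are illustrative choices of mine, not figures from any particular survey.

```python
import math


def margin_of_error(n, population=None, p=0.5, z=1.96):
    """Approximate margin of error for a proportion.

    Assumes a simple random sample of size n. p=0.5 is the worst case
    and z=1.96 corresponds to a 95% confidence level.
    """
    standard_error = math.sqrt(p * (1 - p) / n)
    if population is not None:
        # Finite population correction: shrinks the error as the sample
        # approaches the whole population (a census has no sampling error).
        standard_error *= math.sqrt((population - n) / (population - 1))
    return z * standard_error


# Same sample of 500, increasingly large (hypothetical) populations.
for population in (10_000, 100_000, 1_000_000):
    moe = margin_of_error(500, population)
    print(f"population {population:>9,}: +/-{moe * 100:.1f}%")

# The rule of thumb 1/sqrt(n) gives a similar, slightly more cautious figure.
print(f"rule of thumb: +/-{100 / math.sqrt(500):.1f}%")
```

With these assumptions the margin of error for 500 responses stays a little over +/-4% across all three population sizes. The population only starts to matter when the sample is a sizeable share of it, which is what the correction term captures.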

This relationship (a consequence of how the standard error is calculated) is demonstrated in a handy blog post [external link: SurveyMonkey blog post]. The decreasing gain with larger population sizes assumes a uniform, homogeneous population. The reality of student surveys is of sub-populations, either designated for reporting purposes (mode of study; domicile; year groups) or defined by their level of engagement and interest in the survey itself (which links to the validity of the questions). A smaller sample of high quality data covering all sub-populations is likely to be more valuable than a larger number of brief, homogeneous responses.
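The effect of sub-populations can be sketched in the same way. The example below uses made-up response counts for three reporting groups and the same textbook formula (simple random sampling, worst-case proportion of 50%, 95% confidence, no finite population correction). The group names and numbers are hypothetical and only there to show the pattern.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from n responses."""
    return z * math.sqrt(p * (1 - p) / n)


# Hypothetical response counts per reporting group (not real survey data).
responses = {"full-time": 400, "part-time": 75, "international": 25}

overall = margin_of_error(sum(responses.values()))
by_group = {group: margin_of_error(n) for group, n in responses.items()}

print(f"overall ({sum(responses.values())} responses): +/-{overall * 100:.1f}%")
for group, moe in by_group.items():
    print(f"{group} ({responses[group]} responses): +/-{moe * 100:.1f}%")
```

Under these assumptions the 500 responses look comfortable as a whole, but the 25 responses in the smallest group carry a margin of error of nearly +/-20%, so a result reported for that group alone says very little.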

An example of reporting purposes affecting response rate targets: the Higher Education Statistics Agency (HESA) oversees the graduate survey (DLHE) and sets out response rate targets for institutions that vary from 50% to 80% depending on the domicile status of students (see point 5 of the linked guidance for institutions running the survey). HESA says that the ‘response rates are set to ensure that detailed data can be published and that the results of the survey genuinely reflect the outcomes for students leaving institutions’ (i.e. they are meaningful). It is worth noting that no link is identified between acceptable response rates and population size, nor are institution-specific targets set, which would be unworkable given the wide variation between institutions.

#### Validity

Aside from sample size there are other considerations determining whether conclusions can be drawn from surveys. The phrasing of survey questions will have a profound and often unforeseen impact on the results: ‘validity’ is partly about asking the questions that elicit an accurate measure. Other factors determining the validity of responses are: comprehension of the questions being asked; retrieval of relevant information when making a response (particularly key for a survey like the NSS, which asks about experiences across the whole course, normally two-and-a-half years in); formation of a judgement based on this information; and mapping the judgement onto the response scale (or equivalent)[4]. Any newly designed survey should be piloted first to test how questions are answered and whether the responses given are relevant or contain any inherent bias. The task of making surveys meaningful, representative and valid is challenging (it is not an exact science) and relies upon a full understanding of the population being surveyed.

[1] National Student Survey

[2] Destination of Leavers from Higher Education: the graduate survey conducted six months after graduation

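[3] How often, over repeated surveys, the true population value would fall within the margin of error of the survey result. The margin of error is half the width of the confidence interval and is usually expressed as something like +/-5%. *NB: if you have a sample and want a rough margin of error at 95% confidence, divide 1 by the square root of the number in the sample (e.g. 400 responses gives 1/20 = 0.05, or +/-5%)*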

[4] See Tourangeau, Rips, Rasinski (2000). *The psychology of survey response*. Cambridge: Cambridge University Press
