Annex 1: Research Methods
This annex explains the research methods that were used to identify and evaluate relevant empirical studies.
It was originally intended that the search would encompass material published between 2000 and 2017, but it soon became apparent that a number of significant studies had been published before this period, so these were also included in the review. The search focused on the main English-speaking jurisdictions that use juries in criminal cases, namely England and Wales, Ireland, Northern Ireland, Canada, the Australian jurisdictions, New Zealand, and the US jurisdictions (while remaining mindful of the fact that empirical research may have been undertaken by researchers working in other jurisdictions). It also became apparent that a number of significant studies had been undertaken in the context of civil jury trials (as opposed to criminal trials). Where these were assessed as relevant to the review, they were included. Studies undertaken in the Scottish context were not excluded from the remit of the research, but no such studies were identified.
In order to identify relevant sources, two exercises were undertaken. First, an extensive search of electronic databases was carried out. This encompassed legal databases (such as Westlaw, Lexis Nexis and HeinOnline), scientific databases (such as PsycARTICLES), multi-disciplinary databases (such as JSTOR and Ingenta Connect) and the databases of the major academic journal publishers (such as Wiley Online Library, Sage Journals Online and Springer Link). A specific search of other sources (including Google Scholar, the Index to Common Law Festschriften (covering 2000-2004) and library catalogues) was carried out to identify relevant material in book form. A snowballing technique (checking the references cited in each study) was used to identify other relevant sources of evidence.
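The snowballing step can be illustrated with a short sketch. This is not the procedure actually followed in the review, which was carried out by hand; the study identifiers, the `references` mapping, the relevance test and the `max_depth` limit are all hypothetical and serve only to show the general idea of following reference lists outward from an initial set of studies.

```python
from collections import deque


def snowball(seeds, references, is_relevant, max_depth=2):
    """Backward snowballing: follow reference lists outward from seed studies.

    seeds       -- studies found by the initial database search
    references  -- hypothetical mapping: study -> list of studies it cites
    is_relevant -- hypothetical screening function applied to each cited study
    max_depth   -- assumed limit on how many rounds of references are followed
    """
    found = set(seeds)
    queue = deque((study, 0) for study in seeds)
    while queue:
        study, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not follow references beyond the chosen depth
        for cited in references.get(study, []):
            if cited not in found and is_relevant(cited):
                found.add(cited)
                queue.append((cited, depth + 1))
    return found


# Hypothetical citation data, for illustration only.
references = {
    "study_A": ["study_B", "study_C"],
    "study_B": ["study_D"],
    "study_C": ["off_topic_X"],
    "study_D": ["study_E"],
}


def relevant(study):
    # Stand-in for a human judgement about relevance.
    return study.startswith("study_")


result = snowball(["study_A"], references, relevant, max_depth=2)
print(sorted(result))
```

In this toy example the search stops two rounds out from the seed study, so `study_E` is never reached and the off-topic source is screened out.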
Secondly, a search was undertaken to identify relevant Government reports, reports published by law reform bodies such as Law Commissions, and work that has been published in PhD theses. Government reports were identified by searching the individual Government websites and the catalogues of national libraries in the relevant jurisdictions. Law Commission reports were identified by searching the British Columbia Law Institute Law Reform Database, but also by directly searching the websites of the official law reform bodies of the jurisdictions concerned. The search for PhD theses was undertaken using the ProQuest UK database (which also indexes PhD theses written in English outside the UK).
Evaluation of Research Methods
Broadly speaking, there are two main empirical research methods that have been used to assess jury communication methods: field studies and mock jury studies. Field studies are undertaken with real jurors who have sat on real criminal cases. Mock jury studies simulate the experience of sitting on a jury by recruiting members of the public to act as jurors and asking them to engage with simulated trial materials. Both methods have advantages and disadvantages.
The main advantage of field studies is their realism - participants have sat on real trials in which they determined the fate of an accused person. Their main disadvantage is that the potential for controlled experiments - where variables are manipulated in order to test the impact of different interventions - is limited. There is also the practical consideration that in many jurisdictions - including Scotland - it is not permissible to ask real jurors to reveal any information about their deliberations. Even in those jurisdictions where it is possible to do so, it is not permissible to record or observe jury deliberations in criminal cases. This limits the type of information that can be obtained. It means, for example, that researchers must rely on jurors' recollections of how the discussion progressed, which may not be accurate. There have been several large-scale field experiments on methods of conveying information to jurors - primarily in the US - which were discussed in chapter 2.
It is safe to say, however, that the primary method that has been used to evaluate the effectiveness of methods of conveying information to jurors is mock jury studies. Mock jury studies' great advantage is that they can easily be used to conduct controlled experiments where the impact of a particular method is tested against a control group who have not been given that method. Their main disadvantage is that it is difficult and expensive to replicate the real trial experience authentically. If this is not done well, the external validity of such studies - the extent to which their findings are generalisable beyond the experimental setting - is open to question. For this reason, we have been careful to evaluate the research methods used by mock jury studies when drawing conclusions. There are five main areas where methodological scrutiny is necessary, each of which will be briefly discussed.
Studies need to be evaluated in terms of sample size and sample composition. Many studies use a convenience sample of undergraduate students (usually psychology students who participate in the study in exchange for course credit) rather than a sample that is representative of real juries' characteristics. The extent to which this affects external validity is debatable - a recent meta-analysis has suggested that it makes little difference, although others have disagreed. In the present context, however, studies that use student samples do need to be treated with some caution, given that the dependent variable most commonly used to assess the effectiveness of jury communication methods is the extent to which jurors understand legal directions. Student samples are more likely to score highly on this measure than the general population, given their level of education. In this review, we use the term "community jurors" to refer to a sample of jurors that is not composed solely of students.
The Stimulus Materials
Studies need to be evaluated in terms of the materials they use to simulate the trial experience. Some have - for reasons of convenience and cost - used written trial transcripts or study packs instead of visual materials (a video or a live trial re-enactment). Others have tested comprehension of legal directions in isolation - with no attempt to place them in the context of a trial. These studies (which in the present review only arise in the context of plain language directions) must be regarded with particular caution, given that in a real trial jurors will be exposed to many other stimuli that may make remembering and understanding directions more challenging. Even where a visual trial simulation has been used, it is important to scrutinise studies in terms of the extent to which the materials reflect the reality of a trial. This is particularly important in the present context, where what is being tested is the extent to which particular methods aid juror memory and understanding. A method that is effective in a highly simplified trial may not be so effective in the longer and more complex setting of a real trial. A recent meta-analysis of studies that have attempted to improve juror understanding found, for example, that the longest jury direction in the empirical studies it examined was 936 words, or only six minutes of listening.
There is, of course, a limit to what can be done in terms of replicating the real trial experience - real trials sometimes run for several days or weeks - but it is important that the stimulus materials (the trial and the legal directions) are not too short and that legal procedures and witness testimony are re-created as accurately as possible.
A further consideration is whether a study allowed mock jurors to deliberate in groups, as jurors would in a real criminal trial. There is broad consensus that studies that do not include an element of deliberation lack external validity and thus must be treated with some caution. This is especially true in the present context, where there is evidence that deliberation itself can improve juror comprehension (albeit to a limited extent) and where some studies have found decision aids to be effective only where jurors have deliberated. Even where deliberation is included in the study design, the time for which jurors are allowed to deliberate is sometimes very short and the size of juries can be much smaller than it would be in Scotland - groups of six to eight are particularly common (reflecting in part the fact that in many US states, criminal juries can have as few as six members). Extrapolating from a group of this size to the Scottish context, where the normal jury size is 15, must be done with care.
Unlike those in mock jury studies, real jurors know that their decision will have consequences for the accused and for the other parties involved. The extent to which this affects the way that jurors behave is unclear. It is possible that mock jurors are not engaged in their task to the same extent as real jurors and that this might negatively affect their memory and understanding, making extrapolation from the experimental setting difficult. That said, there is evidence from some studies that mock jurors engage very conscientiously with their role and express stress regarding their verdict choices. To increase the likelihood of mock juror engagement, it is important that studies take as many steps as possible to maximise the solemnity of proceedings, such as using appropriate venues and directing mock jurors about their role in a similar way to real jurors.
Methods of Measuring Effectiveness
In evaluating the effectiveness of methods of conveying information to jurors it is important to consider how effectiveness is measured. Most studies have done so by using juror comprehension as the dependent variable, but care does need to be taken in a number of respects when interpreting the results. First, studies that rely on self-reporting must be treated with particular caution, as research has shown that jurors tend to over-estimate both how well they remember evidence and how well they understand directions on the law.
Secondly, when assessing whether information has been effectively conveyed to jurors, there is a need to distinguish between difficulties that are due to understanding and those that are due to memory. One of the largest field experiments (which tested the effectiveness of a number of methods, such as note-taking and written directions) assessed juror comprehension via a postal questionnaire that jurors could take home and complete, with many jurors completing it several days after the trial had concluded. None of the methods tested improved comprehension, but it was impossible to tell whether this was because they were ineffective or because jurors' memories of the trial had faded.
Thirdly, it is necessary to examine the measure of comprehension used, as different measures can lead to different results. Lieberman and Sales note that various methods have been adopted, including:
(a) agree/disagree statements,
(b) multiple choice questionnaires,
(c) short answer questions (where respondents have to answer in their own words),
(d) asking respondents to paraphrase legal instructions,
(e) asking respondents to apply legal tests to novel situations,
(f) asking respondents to give a verdict in the instant case (on the assumption that a specific verdict would be legally appropriate if the instructions were followed) and/or
(g) asking jurors to provide a justification of their verdict (which is then independently rated in terms of its legal accuracy).
Some of these methods have obvious potential weaknesses - in (a) and (b), respondents might answer correctly simply by guessing (so the sample size must be sufficient to overcome the effects of guessing and the wording of the questions must not make it easy to guess the correct answer). But what is important is that each method is assessing something slightly different - a juror could score highly on (d) simply by having accurate recall, but this does not necessarily mean that she understood the concept or was able to apply it to the facts. Finally, in determining the weight to be attributed to the various studies, care must also be taken to examine what it was that jurors were being asked to understand and the design of the intervention. Where studies fail to show that a particular intervention was effective, this could be because the concepts were already very simple to understand or because the aid provided to jurors was poorly designed.
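The point about guessing in measures (a) and (b) can be made concrete with a short calculation. The sketch below is illustrative only and is not drawn from any of the studies reviewed: it assumes a hypothetical ten-item test, a respondent who guesses every item independently, and a hypothetical "pass" threshold of seven correct answers, and it uses the binomial distribution to compare an agree/disagree format with a four-option multiple choice format.

```python
from math import comb


def prob_score_by_guessing(n_items, min_correct, p_guess):
    """Probability that pure guessing yields at least min_correct right answers.

    n_items     -- number of questions on the (hypothetical) test
    min_correct -- score treated as showing comprehension (an assumption)
    p_guess     -- chance of guessing one item correctly
    """
    return sum(
        comb(n_items, k) * p_guess**k * (1 - p_guess) ** (n_items - k)
        for k in range(min_correct, n_items + 1)
    )


# Agree/disagree items: a guess is right half of the time.
true_false = prob_score_by_guessing(10, 7, 0.5)

# Four-option multiple choice items: a guess is right a quarter of the time.
four_option = prob_score_by_guessing(10, 7, 0.25)

print(f"agree/disagree:  {true_false:.3f}")   # about 0.172
print(f"multiple choice: {four_option:.4f}")  # about 0.0035
```

Under these assumptions, roughly one in six respondents who understood nothing would still reach the threshold on the agree/disagree test, compared with fewer than one in two hundred on the multiple choice test, which illustrates why the format and length of the test matter when interpreting comprehension scores.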