The use of race, ethnicity and ancestry in human genetic research

The HUGO Journal

Official Journal of the Human Genome Organisation

Table 2 Sample set coding frequencies, N (%)

Variables coded	N = 170 (%)
Basic features
Hypothesis	169 (99.4%)
Limitations	87 (52.4%)
Sample origin	163 (95.9%)
Reason for using population
Why populations	112 (65.9%)
Why this population	117 (68.8%)
Basis for assigning population label	150 (88.2%)
Use of genotyping data to infer genetic ancestry	88 (51.8%)
SNP genotypes or ancestry informative markers (AIMs) used to infer ancestry proportions of individual participants’ DNA samples	20 (23.3%)
Genotype data used to assess the genetic homogeneity of population by principal components cluster analysis. Samples outlying from population clusters of interest excluded from further analysis	36 (41.9%)
Text briefly states that potential population stratification was examined in the research populations, but no further details are provided	32 (36.4%)
Defines generic ‘race and ethnicity’ or ‘ancestry’	0 (0%)
Defines specific population label/describes population group	102 (60.0%)
Ways of using populations in research
Label for study population only	78 (45.9%)
Independent variable	87 (51.2%)
Dependent variable	1 (0.6%)
DNA with a label	23 (13.5%)
Discusses social and ethical implications	0 (0%)