RelMix 1.4
Guro Dorum
2024-04-29
relMix.Rmd
This vignette gives an introduction to RelMix 1.4, the graphical user interface of the relMix package described in Dørum et al. (2017). We will present the main features of the GUI with two examples.
Getting started
Start RelMix by typing the following command in the R console
which will open the user interface seen in Figure 1.
Example 1: A paternity case
In this example we consider a paternity case where we have a mixed DNA profile of a mother and child, in addition to reference profiles from the mother and the alleged father.
1. Import data
The first step in RelMix is to import data. The import files must be either comma, semicolon or tab-separated and must have dots as decimal separators. Examples will be given for the three types of files that are needed: mixture profile, reference profiles and allele frequencies. These files are also found in the folder ‘extdata’ of the relMix installation. The mixture profile and reference profiles should contain the same loci in the same order, while the allele frequency file can contain additional loci not in the profiles. RelMix will give an error message if the data file has an error that is fatal for the program. RelMix will return a warning if there is a minor error in the data file that is not fatal for the program.
1a. Mixture profile
Table 1 shows the structure of the mixture file. The first line should contain column names as shown here: ‘SampleName’, ‘Marker’, ‘Allele1’ etc. The mixture profile in this example has a maximum of three unique alleles per markers, but the file may be expanded with columns ‘Allele4’, ‘Allele5’ and so on if there are additional observed alleles. In this example there are 22 markers, but only the first five are shown here. Only autosomal markers can be included. Marker names are case sensitive, and an error message will be given if markers are not found in the database.
SampleName | Marker | Allele1 | Allele2 | Allele3 |
---|---|---|---|---|
relMix | CSF1PO | 10 | 11 | 12 |
relMix | D10S1248 | 16 | ||
relMix | D13S317 | 9 | 12 | 14 |
relMix | D14S1434 | 10 | 13 | 14 |
relMix | D16S539 | 12 | 14 | |
… | … | … | … | … |
If there are multiple replicates, the next replicate profile must have a different sample name (e.g. “relMix2”) and follow directly after the first replicate (i.e. no blank lines between).
1b. Reference profiles
Table 2 shows the format of the file containing the reference profiles (all reference profiles have to be in the same file). The first line should correspond to the column names seen here. The different reference profiles follow directly after each other, and each profile should have a specific sample name. In our example there are two reference profiles, “Mother” and “Father”. The file shoud only contain profiles of individuals defined in the pedigree. Note: as long as the built-in pedigrees are used, the reference profiles should be named either “Father”, “Mother” or “Child” because these are the only individual IDs that are defined in these pedigrees. If custom pedigrees are provided, any sample name can be used (see example 2 below). Only the first two markers and the last one are shown for each profile.
SampleName | Marker | Allele1 | Allele2 |
---|---|---|---|
Mother | CSF1PO | 10 | 11 |
Mother | D10S1248 | 16 | 16 |
… | … | … | … |
Mother | TPOX | 8 | 10 |
Father | CSF1PO | 12 | 12 |
Father | D10S1248 | 14 | 15 |
… | … | … | … |
Father | TPOX | 8 | 11 |
1c. Database
When clicking the database button in the main interface, the window in Figure 2 appears. Table 3 shows the format of the allele frequencies to be imported here. There is one row per allele and one column per marker. A blank space indicates that the allele is not observed for that marker. Only an excerpt of the database is shown here.
Allele | CSF1PO | D10S1248 | D13S317 | D16S539 | … |
---|---|---|---|---|---|
9 | 0.013850416 | 0.077562327 | 0.106648199 | … | |
9.1 | … | ||||
9.3 | … | ||||
10 | 0.220221607 | 0.047091413 | 0.056786704 | … | |
10.1 | … | ||||
10.2 | … | ||||
10.3 | … | ||||
11 | 0.308864266 | 0.001385042 | 0.325484765 | 0.314404432 | … |
… | … | … | … | … | … |
Clicking the ‘Option’ button opens up the window in Figure 3 with three additional settings: a theta value to correct for population substructure, a silent allele frequency and a minimum allele frequency. Any alleles with frequencies below the minimum allele frequency will by default be set to 0.001, this also applies to previously unobserved alleles. For this example we will keep the default values.
When clicking ‘Save’ here we return to the window in Figure 2, where we need to click ‘ok’ to exit the database settings. If a previously unobserved allele (i.e. not in the database) is found in the imported profiles, a notification will be given that the alllele has been added to the database with the minimum allele frequency. If the allele frequencies do not sum to 1, a question will appear of whether you would like to scale the frequencies. If not, a rest allele will be added.
2. Mutations
In this example we will not consider mutations, so we can ignore the ‘Mutations’ button.
3. Pedigree
The pedigrees define the hypotheses. There are three choices of pedigrees: ‘Paternity’, ‘Unrelated’ and ‘Custom’. We choose ‘Paternity’ as the first pedigree and ‘Unrelated’ (alleged father and child are unrelated) as the second pedigree (Figure 4). In example 2 we show how to define a custom pedigree.
4. Contributors
Clicking the ‘Specify contributors’ button opens the window in Figure 5. We tick the individuals in the pedigree that are in the mixture, in this case the mother and child. It is possible to have different contributors for each pedigree, but here the contributors are the same under both hypotheses.
5. Dropout and drop-in
The last step is to specify dropout and drop-in values. See Dørum et al. (2017) for a detailed description about the drop-in/dropout model. By default all values are set to 0. In this case we expect no dropout from the mother so we set her dropout probability to 0, while the child is given a dropout probability of 0.1. The drop-in value is set to 0.05 (Figure 6).
Note that if the dropout values are not set, an error window will occur. Even if you do not include dropout and drop-in, you must open the dropout window and click on the save button.
6. Compute LR
Finally, we are ready to click the ‘Compute LR’ button. The result appears in a new pop-up window (Figure 7). The chosen parameters are displayed on the left, top right shows the pedigrees, followed by the total LR and a table displaying the LRs per marker. By clicking “Save results to file”, the LRs together with the mixture profiles, reference profiles and the parameters will be saved to file.
Example 2: Mutations, silent alleles and custom pedigree
Click the RESTART button before starting on a new case.
In this example we have a two-person mixture with one known contributor, and it is hypothesised whether the second contributor is the child of the first contributor, or someone unrelated. We take silent alleles and mutations into account. Only one locus is considered.
1. Import data
1a. Mixture profile
Table 4 shows the mixture file. There is only one observed allele for this locus.
SampleName | Marker | Allele1 |
---|---|---|
relMix1 | M1 | 1 |
1b. Reference profiles
In this example there is only one known profile. Since we will provide a custom pedigree, the sample name of the profile can be anything as long as it corresponds to the ID in the pedigree.
SampleName | Marker | Allele1 | Allele2 |
---|---|---|---|
C1 | M1 | 1 | 1 |
1c. Database
The allele frequencies of the one diallelic marker are given in Table 6.
Allele | M1 |
---|---|
1 | 0.4 |
2 | 0.6 |
In the database options we set the silent allele frequency equal to 0.1 (Figure 8).
When clicking the ‘ok’ button we are asked if we want to scale or add a rest allele since the frequencies no longer sum to 1 now that a silent allele frequency has been added. We click yes to scale.
Note 1: A rest allele can only be addded if the sum of frequencies is less than 1. Therefore, enforced scaling will be performed if we click ‘No’ and the sum is larger than 1.
Note 2: The added rest allele is given the name ‘r’. If a stepwise mutation model is chosen (see section below), numeric allele names are required, so in this case scaling must be chosen.
2. Mutations
Clicking the mutation button in the main interface opens a new window. By default mutations are not accounted for (mutation rate set to 0). There are three mutation models to choose from. The ‘Equal’ mutation model assumes that all mutations are equally likely, the ‘Proportional’ model assumes mutation rates to be proportional to allele frequencies, and the ‘Stepwise’ mutation model assumes that mutations to closer allelles (fewer sequence repeats difference) are more probable. When ‘Stepwise’ is chosen, a range must also be specified that indicates the relative probability of mutating n+1 steps versus mutating n steps. This mutation model requires that the allele names are numerics indicating the number of sequence repeats. It does not account for microinvariants, and will treat mutations to and from microinvariants as practically impossible. This option corresponds to the model ‘Stepwise (unstationary)’ in Familias 3. Mutations to and from silent alleles are not possible for any of the models. See the Familias manual for more details on the mutation models.
In this example we will use an equal mutation model with mutation rate 0.1 for both males and females.
3. Pedigree
Since the hypotheses in this example are not covered by the built-in pedigrees, we will provide RelMix with custom pedigrees defined by a text file in ped format. The ped file can be created e.g. by the online pedigree creator QuickPed, but can also be generated manually. Table 7 shows the ped file specifying the first pedigree where the two contributors are mother and child.
id | fid | mid | sex |
---|---|---|---|
C1 | 0 | 0 | 2 |
C2 | 0 | C1 | 1 |
The second pedigree, where the two contributors are unrelated, is specified by the ped file in Table 8.
id | fid | mid | sex |
---|---|---|---|
C1 | 0 | 0 | 2 |
C2 | 0 | 0 | 1 |
Note: The names of the individuals in a pedigree (the id) must correspond to the names of individuals in the reference profiles.
Figure 11 shows how to import the custom pedigrees from R scripts, one script per pedigree.
4. Contributors
Both individuals defined in the pedigrees are contributors under both hypotheses. It is only their relationship that changes between the two pedigrees.
5. Dropout and drop-in
In this example we believe there may be dropout in both contributors’ profiles. We set their dropout probabilities to 0.1 and the drop-in value to 0.05.
References
Dørum et al. (2017) doi:doi.org/10.1007/s00414-016-1526-x
Hernandis et al. (2019) doi:doi.org/10.1016/j.fsigss.2019.09.085
Kaur et al. (2016) doi:doi.org/10.1007/s00414-015-1276-1