Return to the main ET-TDT page

ET-TDT Example 2

The Demo2.hap file is shown here:
0 4 a cx
1 4 b cx
2 3 b cx
3 4 c cx
4 3 a cx
5 3 a ut
6 2 a cx
7 2 a ut
8 1 a cx
Nine haplotypes are listed, with haplotype codes from "0" to "8". Each haplotype has 3 markers. The first marker has 4 different forms, the second has 3, and the third has 2. (The cladistic relations are not coded in this file.)
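The counts quoted above can be checked with a short script. This is only an illustrative sketch in Python, not part of ET-TDT; the names parse_hap and forms_per_marker are hypothetical, and the file contents are pasted in as a string rather than read from Demo2.hap.

```python
# Hypothetical helpers (not part of ET-TDT) that summarise a .hap file.
HAP_TEXT = """\
0 4 a cx
1 4 b cx
2 3 b cx
3 4 c cx
4 3 a cx
5 3 a ut
6 2 a cx
7 2 a ut
8 1 a cx
"""


def parse_hap(text):
    """Map each haplotype code to the tuple of its marker alleles."""
    haplotypes = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        haplotypes[fields[0]] = tuple(fields[1:])
    return haplotypes


def forms_per_marker(haplotypes):
    """Count the distinct alleles seen at each marker position."""
    columns = zip(*haplotypes.values())
    return [len(set(column)) for column in columns]


haps = parse_hap(HAP_TEXT)
print(len(haps), forms_per_marker(haps))
```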

The first and ninth lines of the marker file, markerDemo2.dat, are shown here:

3 2 a a ut cx na na na na na na 2 1 a a ut cx
3 3 b a ut cx 3 2 na na cx cx 3 3 b b na na
With "na" as the missing value code, the first trio has a missing father, coded as all missing values. The child of this trio is heterozygous for the first marker, homozygous for allele "a" at the second marker, and heterozygous for the third marker. The second parent of the ninth trio is heterozygous "3/2" for marker 1, was not typed for the second marker, and is homozygous "cx" for the third marker. The child of the ninth trio was not typed for the third marker.
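The layout described above (first parent, second parent, child, with two alleles per marker for each) can be sketched as a small parser. This is a hedged illustration in Python: parse_trio is a hypothetical name, the layout is inferred from the two lines shown, and treating a marker as untyped when either allele is "na" is an assumption.

```python
MISSING = "na"  # the same code that is passed to inferhap as mi=na

TRIO_1 = "3 2 a a ut cx na na na na na na 2 1 a a ut cx"
TRIO_9 = "3 3 b a ut cx 3 2 na na cx cx 3 3 b b na na"


def parse_trio(line, n_markers=3, missing=MISSING):
    """Split one marker-file line into [parent 1, parent 2, child].

    Each individual is a list with one entry per marker: a pair of
    alleles, or None when the marker was not typed (assumption: a
    marker counts as untyped if either allele is the missing code).
    """
    tokens = line.split()
    per_person = 2 * n_markers
    if len(tokens) != 3 * per_person:
        raise ValueError("expected %d fields, got %d" % (3 * per_person, len(tokens)))
    people = []
    for start in range(0, len(tokens), per_person):
        chunk = tokens[start:start + per_person]
        genotype = []
        for i in range(0, per_person, 2):
            pair = (chunk[i], chunk[i + 1])
            genotype.append(None if missing in pair else pair)
        people.append(genotype)
    return people


trio1 = parse_trio(TRIO_1)
trio9 = parse_trio(TRIO_9)
print(trio1[1])  # second parent of the first trio: every marker untyped
print(trio9[1])  # second parent of the ninth trio
```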

The cladogram is in Demo2.clade:

0 1
1 2
1 3
0 4
4 5
4 6
6 7
6 8

The first line means that there is a connection from haplotype "0" to haplotype "1". Overall the form of the cladogram is:

3-1-0-4-6-7
  |   | |
  2   5 8
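As a cross-check on the diagram, the edge list can be turned into a neighbour table. The following Python sketch is illustrative only; read_edges and neighbours are hypothetical names, and the edges are treated as undirected connections.

```python
from collections import defaultdict

CLADE_TEXT = """\
0 1
1 2
1 3
0 4
4 5
4 6
6 7
6 8
"""


def read_edges(text):
    """Return the connections as (haplotype, haplotype) pairs."""
    return [tuple(line.split()) for line in text.splitlines() if line.strip()]


def neighbours(edges):
    """Build an undirected adjacency table from the edge list."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


adjacency = neighbours(read_edges(CLADE_TEXT))
# Terminal nodes are the haplotypes with exactly one connection.
terminal = sorted(node for node, linked in adjacency.items() if len(linked) == 1)
print(terminal)
```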
We will perform our ET-TDT analysis using a pre-collapsed cladogram in which rare haplotypes are collapsed as follows: "2" and "3" are collapsed into "1", "7" and "8" are collapsed into "6", and "5" is collapsed into "4". In addition we will eliminate any trios with more than 50 possible haplotype reconstructions, under the assumption that these are non-informative.

This means we use the following haplotype inference step:

inferhap ha=Demo2.hap ma=Demo2 dr=50 re=2:1,3:1,7:6,8:6,5:4 mi=na

The output is:

Recoding table
Old New
0 0
1 1
2 1
3 1
4 2
5 2
6 3
7 3
8 3
inferhap dropped trio 62 with 100 possibilities
... (13 more)
inferhap dropped trio 1109 with 72 possibilities

This recoding table means that the new, pre-collapsed, cladogram is:

1[1,2,3]-0[0]-2[4,5]-3[6,7,8]
as specified in Demo2.coll. The notation here is that the number to the left of the square bracket is the new haplotype code, and the numbers inside the square brackets are the old haplotype codes that correspond to each new code.
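The recoding can be reproduced by hand or with a few lines of code. The Python sketch below is not how inferhap is implemented; recoding_table and collapse_edges are hypothetical names, and numbering the surviving codes in ascending order is an assumption that happens to reproduce the table shown above. Chained collapses (such as "2:1,1:0") are not handled.

```python
EDGES = [(0, 1), (1, 2), (1, 3), (0, 4), (4, 5), (4, 6), (6, 7), (6, 8)]


def recoding_table(codes, re_option):
    """Return {old code: new code} for an re= style string such as '2:1,3:1'."""
    target = {code: code for code in codes}
    for pair in re_option.split(","):
        old, new = pair.split(":")
        target[int(old)] = int(new)
    # Assumption: surviving codes are renumbered 0, 1, 2, ... in ascending order.
    survivors = sorted(set(target.values()))
    new_code = {old: i for i, old in enumerate(survivors)}
    return {old: new_code[target[old]] for old in codes}


def collapse_edges(edges, table):
    """Recode both ends of every edge and drop edges inside one group."""
    kept = set()
    for a, b in edges:
        na, nb = table[a], table[b]
        if na != nb:
            kept.add((min(na, nb), max(na, nb)))
    return sorted(kept)


table = recoding_table(range(9), "2:1,3:1,7:6,8:6,5:4")
collapsed = collapse_edges(EDGES, table)
print(table)
print(collapsed)
```

The three remaining edges connect new codes 0 and 1, 0 and 2, and 2 and 3, which is the chain 1-0-2-3.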

Here is the start of countsDemo2.dat, the first of the three output files.

4 249 936
4 4 4 4 4 
4 4 4 2 19 
This indicates that there are 4 haplotypes, 249 unambiguous trios and 936 ambiguous trios. The first 8 ambiguous trios have 4 possibilities each and the 9th has 2 possibilities. Together these 9 trios account for the first 32+2=34 lines of ambDemo2.dat. The ambDemo2.dat file has the same format as unambDemo2.dat, the first 2 lines of which are shown here:
1 2 1 3 
3 3 3 0 
This indicates that the first unambiguous trio has parent one consistent with haplotypes "1" and "2" with "1" transmitted to the offspring, and the second parent is consistent with haplotypes "1" and "3" with "1" transmitted.
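The bookkeeping for these output files can be sketched as follows. This Python is illustrative only, and all function names are hypothetical. It assumes that each possibility occupies one line of ambDemo2.dat (as the 32+2=34 count implies) and that each line lists, for each parent in turn, the transmitted haplotype followed by the untransmitted one (inferred from the description of the "1 2 1 3" line). Only the excerpt of countsDemo2.dat shown above is used, so the list of possibilities is incomplete.

```python
COUNTS_EXCERPT = """\
4 249 936
4 4 4 4 4
4 4 4 2 19
"""


def read_counts(text):
    """Return (n_haplotypes, n_unambiguous, n_ambiguous, possibilities)."""
    numbers = [int(token) for token in text.split()]
    n_haplotypes, n_unambiguous, n_ambiguous = numbers[:3]
    possibilities = numbers[3:]  # one entry per ambiguous trio, in file order
    return n_haplotypes, n_unambiguous, n_ambiguous, possibilities


def lines_used(possibilities, n_trios):
    """Number of ambDemo2.dat lines taken up by the first n_trios trios."""
    return sum(possibilities[:n_trios])


def parse_transmission(line):
    """Split a line such as '1 2 1 3' into per-parent haplotypes.

    Assumption: the order is transmitted, untransmitted for parent one,
    then transmitted, untransmitted for parent two.
    """
    t1, u1, t2, u2 = line.split()
    return {
        "parent1": {"transmitted": t1, "untransmitted": u1},
        "parent2": {"transmitted": t2, "untransmitted": u2},
    }


n_hap, n_unamb, n_amb, possibilities = read_counts(COUNTS_EXCERPT)
print(n_hap, n_unamb, n_amb, lines_used(possibilities, 9))
print(parse_transmission("1 2 1 3"))
```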

The analysis produced by

ettdt cl=Demo2.coll fi=Demo2 th=2 ve=1
is
Cladogram:
  0: (0) -> [1,2]
 T1: (1) -> [0]
  2: (2) -> [0,3]
 T3: (3) -> [2]

249 unambiguous trios and 936 ambiguous trios.
Check 0 x 1 
Score test p-value=0.328340  accept
Check 2 x 3 
Score test p-value=0.000000  reject
Cladogram:
 T0: (0,1) -> [2]  theta=1.44
 T2: (2) -> [0,3]  theta=1
 O3: (3) -> [2]  theta=2.34

Check 2 x 0 1 
Score test p-value=0.010410  reject
Cladogram:
 O0: (0,1) -> [2]  theta=1.44
 O2: (2) -> [0,3]  theta=1
 O3: (3) -> [2]  theta=2.34
The starting (pre-collapsed) cladogram has 2 terminal nodes, "T1" and "T3", so iteration one attempts to collapse "1" into "0" as indicated by "Check 0 x 1". This collapse is accepted. Then the collapse of "3" into "2" is tried as indicated by "Check 2 x 3", and the collapse is rejected with p-value <0.000001. This leaves the cladogram
(0,1)[0,1,2,3]-(2)[4,5]-(3)[6,7,8]
where the numbers in parentheses refer to the new (pre-collapsed) haplotype codes and the numbers in square brackets refer to the original haplotype codes.

In iteration two, the only collapse to be tested is the "0" and "1" groups into the "2" group. This is rejected using a Bonferroni cutoff of p=0.05/3 (since there are 3 collapses to test for this cladogram).
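The arithmetic behind this decision is small enough to check directly. The Python lines below are a sketch of the comparison only, not of the score test itself; the decision function is a hypothetical name, and the use of a strict "less than" comparison is an assumption.

```python
ALPHA = 0.05
N_COLLAPSES = 3  # collapses to test for this cladogram

cutoff = ALPHA / N_COLLAPSES  # Bonferroni cutoff, roughly 0.0167


def decision(p_value, cutoff):
    """Reject the collapse when the score test p-value is below the cutoff."""
    return "reject" if p_value < cutoff else "accept"


# p-value reported for "Check 2 x 0 1" in iteration two
print(round(cutoff, 4), decision(0.010410, cutoff))
```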

This means that we infer that the haplotypes group as follows:

  (0,1)[0,1,2,3]-(2)[4,5]-(3)[6,7,8]
(Original) haplotypes "4" and "5" have the lowest relative risk, and haplotypes "6", "7" and "8" have the highest relative risk.

If you prefer extremely brief output, e.g. for a simulation study, you can monitor some (or all) collapses using the form:

ettdt cl=Demo2.coll fi=Demo2 rejectids=0:1,2:3,2:0
The result is simply "2 0 1 1", which means that a total of two collapses are rejected: the collapse between haplotypes (groups) "2" and "3" and the collapse between "2" and "0".

For more complex cladograms, you may see something like "Check 4 5 x 0 1 2 3 x 6 7 8". This means that we are testing the collapse of a haplotype group consisting of haplotypes "4" and "5" with a group consisting of haplotypes "0" through "3", while assuming that haplotypes "6", "7" and "8" are also grouped together to have a common haplotype relative risk. For the brief output format, this collapse is specified as "4:0", which is sufficient for unambiguous identification of this specific collapse step for this cladogram.
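For a simulation study, the brief output is easy to post-process. The Python sketch below is illustrative; parse_brief is a hypothetical name, and it assumes (from the example above) that the first number is the total count of rejected collapses and the remaining numbers are 0/1 flags in the same order as the rejectids list.

```python
def parse_brief(rejectids, output):
    """Pair each monitored collapse with its rejected (True) / not rejected flag."""
    ids = rejectids.split(",")
    numbers = [int(token) for token in output.split()]
    total, flags = numbers[0], numbers[1:]
    if len(flags) != len(ids):
        raise ValueError("expected one flag per monitored collapse")
    if total != sum(flags):
        raise ValueError("leading count does not match the flags")
    return {collapse: bool(flag) for collapse, flag in zip(ids, flags)}


result = parse_brief("0:1,2:3,2:0", "2 0 1 1")
print(result)
```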
