Guide to Building Your Family's Haplotree


I've been working on a pet project to supplement my genealogical research, and in this deep dive I'm going to share it with you as follows:

I. Intro to Family HaploTree Building
II. Constructing Your Family HaploTree
III. Proving Family Anecdote with HaploTree Building
IV.  Mitochondrial-DNA & Y-DNA Testing Options

As a genetic genealogist I'm keen to know if there are haplogroups in my pedigree that are rare, newly discovered or only found in specific populations and biogeographical regions. I'm also looking to use haplogroups for ancient ancestral research, to help me trace family surnames that have disappeared in the bowels of the Trans-Atlantic slave trade as well as the migration paths of my immigrant forebears from their homelands to the Americas. 

Since learning about my own Maternal (or mitochondrial DNA) and Paternal (or Y-chromosome DNA) Haplogroups, I'm naturally inquisitive about the ones that I DIDN'T inherit from my parents and other direct pedigree relatives (grandparents, great-grandparents, 2nd-great-grandparents, 3rd-great-grandparents, etc).

Based on the unique inheritance patterns of human Mitochondrial DNA (Mt-DNA) and Y-chromosome DNA (Y-DNA), a child only inherits their mother's mt-DNA haplogroup through a direct matrilineal line (ie from his/her mother, her mother, her mother, etc), and if male his father's Y-DNA haplogroup through a direct patrilineal line (ie from his father, his father, his father, etc). 

This means a child (me) never inherits their father's maternal haplogroup, nor their mother's father's paternal haplogroup. Going to the next generation [my 4 grandparents] this leaves four more haplogroups (3 mt-DNA and 1 Y-DNA) that I wouldn't inherit. And even more in the next generation [from my 8 great-grandparents]. 
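The inheritance rules above can be sketched in a few lines of Python. This is only an illustration: the path notation (a string of "F" for father and "M" for mother steps, so "MF" is the mother's father) and the function names are assumptions made for this example.

```python
from itertools import product

def carried(path):
    """Haplogroup types an ancestor carries; a path ending in 'F' is a male."""
    return ("mt", "Y") if path[-1] == "F" else ("mt",)

def inherits(path, kind, child_is_male):
    """mtDNA follows an all-mother path; Y-DNA follows an all-father path (sons only)."""
    if kind == "mt":
        return set(path) == {"M"}
    return child_is_male and set(path) == {"F"}

def not_inherited(generation, child_is_male=True):
    """List (ancestor path, haplogroup type) pairs the child does NOT inherit."""
    missing = []
    for steps in product("FM", repeat=generation):
        path = "".join(steps)
        for kind in carried(path):
            if not inherits(path, kind, child_is_male):
                missing.append((path, kind))
    return missing

# Generation 2 = the four grandparents, seen from a male child.
print(not_inherited(2))
# [('FF', 'mt'), ('FM', 'mt'), ('MF', 'mt'), ('MF', 'Y')]
```

For a male child this gives the same count as the text: three mt-DNA haplogroups and one Y-DNA haplogroup among the four grandparents that are not passed down to him.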

Yet these are my direct forebears and even though I didn't inherit their haplogroups directly it means that by extension I biologically descend from an ancestor bearing the haplogroup. So it becomes genetically and genealogically relevant for me. [Be sure to read Section III  to learn about an intriguing haplogroup discovery in my family pedigree.]

Ultimately I want to identify, document and trace all of my other direct fore-parents' haplogroups back to their root populations to reveal what stories they tell. ... But how do I find out about these other haplogroups if my forebears are unavailable for DNA testing?
I. Intro to Family HaploTree Building
Source: National Genographic screenshot
Tracing haplogroups may be helpful if you want to learn more about your direct matrilineal and patrilineal ancient origins and population migration patterns, to trace a surname or prove paternity, or to find out whether a haplogroup is common or exclusive in certain populations (e.g. Native Americans, Malagasy, Jewish), all of which may lend more clarity about the origins or source population of the haplogroup. And you can do the same thing with your other direct line pedigree ancestors too.

In the past, genetic genealogists and scientists have effectively used mtDNA and Y-DNA testing to prove genetic connections to a specific ancestor, ancestral population, tribe and/or biogeographical region, and to break down the proverbial brick wall. [For deeper discussion see Roberta Estes's blogs here and here.]

For example, genetic genealogist Andre Kearns used Y-DNA testing to confirm whether or not his African-American 3rd-great-grandfather James Henry Johnson was the child of one of his slave master John Smyer's sons. Mr. Kearns cleverly tested his cousin Greg's Y-DNA at FamilyTreeDNA. Greg is a direct-line patrilineal descendant of James Henry Johnson. 

If Mr. Kearns' hypothesis is correct, then his strategically-placed cousin Greg and the direct-line patrilineal male descendants of John Smyer will have the same Y-DNA haplogroup. Here are the results: 
Screenshot of Andre Kearns' presentation 
As you can see in the screenshot above, Mr. Kearns's cousin Greg shares the same Y-DNA haplogroup assignment J-M172 with not one but two Smyers (also spelled Smoyer/Schmier) descendants, and with a genetic distance of "0" (zero), which means they match perfectly (ie without differentiating mutations). We also learn this branch of Mr. Kearns's family tree comes from men who most likely had origins in Deutschland (Germany). [For Andre Kearns's full and amazing story please see his blog here and video presentation here.]
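A genetic distance of zero can be pictured as comparing two men's Y-STR marker values one by one. The sketch below is a simplified illustration: the marker values are made up (they are not Greg's or the Smyers descendants' actual results), and it counts every repeat difference as one step, whereas testing companies apply their own counting rules for certain markers.

```python
def genetic_distance(profile_a, profile_b):
    """Sum of repeat-count differences over the markers both men tested."""
    shared = profile_a.keys() & profile_b.keys()
    return sum(abs(profile_a[m] - profile_b[m]) for m in shared)

# Hypothetical Y-STR repeat counts, for illustration only.
tester = {"DYS393": 12, "DYS390": 23, "DYS19": 14, "DYS391": 10}
exact_match = dict(tester)               # identical on every marker
near_match = {**tester, "DYS391": 11}    # one repeat off on one marker

print(genetic_distance(tester, exact_match))  # 0
print(genetic_distance(tester, near_match))   # 1
```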

Furthermore it is important to know that females don't have a Paternal Haplogroup because the Y-chromosome is absent in them. However you can learn about the Y-DNA of a female's father by testing her father, full brother, paternal grandfather, paternal uncle or paternal uncle's son.

Lately at personal genome service 23andMe, if a female customer's father or full brother takes the test, then the company will provide the female customer's account with a Y-DNA haplogroup assignment, which has caused some confusion. 23andMe explains: 
23andMe screenshot on Y-chromosome (paternal) haplogroup assignments
The point here is you can use the same methods utilized by genetic genealogist Andre Kearns and 23andMe, along with strategic DNA testing, to discover the maternal and paternal haplogroups of any of your ancestors, even one of your 3rd-great-grandparents. 

II. Constructing Your Family's HaploTree
Source: Mark Orwig, smarterhobby.com, used by permission.
In this section I'm going to discuss how you can learn about your other fore-parents' haplogroups by testing strategically-placed relatives. Obviously, if you can test a parent, grandparent or great-grandparent, that would be fantastic. But if you can't, then the following four steps will help: 

Step 1. Getting Started

  • Construct your family tree/pedigree and add as many generations as possible, especially your fore-parents (ie grandparents, great-grandparents) and their siblings, as well as their progeny (children, grandchildren). 
  • Document the haplogroups you have already discovered in your family. For example, since you inherited an exact copy of your mt-DNA from your mother, her mother, her mother etc, you already know the haplogroups of those fore-mothers. If you are a male, you also inherited an exact copy of your Y-chromosome from your father, his father, his father etc, so you already know the haplogroups of those fore-fathers. Both of these direct lines are accounted for no matter how far you go back in generations. 
  • Check to see if you have relatives who have already tested and who have a direct link to one of your fore-parents. 

Step 2. Identify Strategically Placed Relatives

  • Identify the ancestor whose haplogroup you want to document. For example, I've always wanted to know about the mtDNA of my maternal grandfather. But he is deceased and so is my mother and her brother. Remember a male does not pass down his mtDNA. [Be sure to read Section III to learn about an intriguing haplogroup discovery in my family pedigree.]
    • Identify all living relatives with a direct matrilineal or patrilineal link to one of your fore-parents. For example, I know my maternal grandfather had 9 siblings, so they all would have the same maternal haplogroup. However, my grandfather is deceased, so I turn to my grandfather's nieces from his sisters as well as my grandfather's maternal aunt's children. They are all living and would have the same maternal haplogroup as my maternal grandfather. 
    • So if you (like myself) wanted to know about your grandfather's maternal haplogroup (which he gets from his mother, your great-grandmother) but your grandfather is deceased, you can test living relatives with a direct matrilineal link to your grandfather's mother (including his sisters, his sisters' children, his sisters' daughters' children, etc) as they would've inherited the same maternal haplogroup as your grandfather.
    • Each of these living DNA relatives MUST have proven genetic ties to you and your fore-parent and have a direct matrilineal or patrilineal path to them.
    • Test each of these strategic relatives at a DNA company that offers an autosomal-DNA product to make sure the genetic relationship between you and your relative is legitimate.
    King Genome's Tip: 
    • For the best haplogroup results I highly recommend taking the Full Mitochondrial Sequence and next generation sequencing Y-DNA tests [see Section IV], as well as reading Mark Orwig's blog on types of DNA tests here.
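Step 2 can be sketched as a small search over a family tree. The tree below is hypothetical (generic relationship labels, not real people), and the dictionary layout and function names are assumptions made for this example. The idea: two people share an mt-DNA haplogroup if their mother-to-mother lines meet, and two males share a Y-DNA haplogroup if their father-to-father lines meet.

```python
def line(tree, person, parent_key):
    """Follow one parent link ('mother' or 'father') upward as far as the tree goes."""
    chain = []
    while person is not None:
        chain.append(person)
        person = tree.get(person, {}).get(parent_key)
    return chain

def candidates(tree, target, parent_key="mother"):
    """Living relatives whose direct line meets the target ancestor's direct line."""
    target_line = set(line(tree, target, parent_key))
    found = []
    for name, info in tree.items():
        if name == target or not info["living"]:
            continue
        if parent_key == "father" and info["sex"] != "M":
            continue  # only males carry a Y chromosome
        if target_line & set(line(tree, name, parent_key)):
            found.append(name)
    return sorted(found)

def person(sex, mother=None, father=None, living=False):
    return {"sex": sex, "mother": mother, "father": father, "living": living}

# Hypothetical pedigree for illustration only.
tree = {
    "great_grandmother": person("F"),
    "grandfather": person("M", mother="great_grandmother"),
    "grand_aunt": person("F", mother="great_grandmother"),
    "niece": person("F", mother="grand_aunt", living=True),
    "niece_son": person("M", mother="niece", living=True),
    "mother": person("F", father="grandfather"),
    "me": person("M", mother="mother", living=True),
}

print(candidates(tree, "grandfather"))  # ['niece', 'niece_son']
```

Note that "me" is not a candidate: a grandfather does not pass his mt-DNA to his children, so his line has to be reached through his sisters or other matrilineal relatives.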

    Step 3. Charting Your Family Haplogroups

    Once you've identified your living relatives with a direct link to your fore-parents,  you will now chart them on: (a) MtDNA & Y-DNA Inheritance Descendants Chart AND (b) Master MtDNA & Y-DNA Pedigree chart. 

    (a) MtDNA & Y-DNA Inheritance Descendants Chart
    In the example below I start with a blank mtDNA & Y-DNA Inheritance chart:

    On the chart above, choose a fore-parent's haplogroup that you want to uncover and then list all of their direct descendants. This fore-parent would be listed at the top of the chart.
    • On the Y-DNA Inheritance portion of the chart, I started with my cousin's great-grandfather (the top black square). All of the squares below him represent his descendants who would have inherited his Y-DNA, including his son (my cousin's grandfather), grandson (my cousin's father) and great-grandson (my cousin). 
    • On the mt-DNA Inheritance portion of the chart, I started with my own maternal grandfather's mother (the top black circle) and all of the black  circles below her represent her descendant daughters who would have inherited her mt-DNA haplogroup, and the black squares represent her descendant sons who would have inherited her mt-DNA haplogroup. Here are my findings:


    As you can see, my cousin's paternal great-grandfather is Y-DNA Q1a3a and my maternal grandfather's mother (or my great-grandmother) is mt-DNA B2. 

    Step 4. Master Charting Your Results 

    Finally create a Master Mt-DNA & Y-DNA Pedigree Chart to document your family's haplogroup information, and of course to show off the fruits of your labor.

    In my chart, I only went to the third generation because it gets more complicated after that. In the "Relationship Column" I listed my parents, all four grandparents, and all 8 great-grandparents by generation. For privacy reasons I excluded the names of my fore-parents but please add yours to make things easier.  

    I also took the liberty of adding the origins of my family's haplogroups and indicated the type of test taken. For my female ancestors I used "inferred" when listing their Y-DNA haplogroups because they don't have a Y-chromosome. Essentially you can customize your Master Chart to suit your needs. Without further ado, here are my results:

    (b) TL Dixon's Master Mt-DNA & Y-DNA Pedigree Chart

    1ST GENERATION & 2ND GENERATION (combined)

    3RD GENERATION

    On my Master Chart above, as you can see the majority of my family haplogroups are most likely of West African origin (L1b1a, L1c2a1, L3f1b1a) but there is a European (I-M253) and Native American (B2) one too. Also, my 2nd-great-grandfather on my mother's paternal side and I have the same paternal Y-DNA haplogroup, E-U290. 

    I learned about my 2nd-great-grandfather's Y-DNA haplogroup by testing one of his great-grandsons who has a direct patrilineal link to him. I would need to test my cousin's Y-DNA further to learn if it's the same specific sub-group as mine. However, I now know the inferred Y-DNA haplogroup of my maternal grandfather's mother is E-U290! ... It's worth noting that when I tested my Y-DNA via next generation sequencing, my terminal haplogroup E-3950* formed a NEW sub-clade, and until more people test I'm the only person in the world with this signature mutation. 

    Each of the "Unknown" and "To be determined" entries in my Master Chart means a missing branch of my family's haplotree. For example, in the 2nd Generation, in order to find my maternal grandmother's inferred Y-DNA haplogroup [which is carried by her father] I need to test a direct patrilineal male descendant of her father.  

    In the 3rd generation, I need to determine my maternal grandmother's father's mt-DNA haplogroup, so I will need to test one of my great-grandfather's sisters' children, or the children of his sisters' daughters (and so on down the female line). 
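The bookkeeping behind a Master Chart can be sketched as follows. This is an illustrative sketch, not the chart shown above: the path notation ("M" = mother, "F" = father, so "MFM" is the maternal grandfather's mother) and the function names are assumptions, and only results spelled out in the text are entered. Every other cell falls back to "To be determined", which says nothing about what the actual chart contains.

```python
from itertools import product

TBD = "To be determined"

def line_key(path, kind):
    """Ancestors on the same direct line share a key.

    mt-DNA: an ancestor has the same haplogroup as her/his mother, so trailing
    'M' steps are dropped. Y-DNA (actual or inferred): same as the father's.
    """
    return path.rstrip("M" if kind == "mt" else "F")

def build_chart(tested, generations=3):
    """Fill every (ancestor path, kind) cell up to the given generation."""
    known = {(line_key(p, k), k): hg for (p, k), hg in tested.items()}
    chart = {}
    for depth in range(generations + 1):
        for steps in product("FM", repeat=depth):
            path = "".join(steps)
            for kind in ("mt", "Y"):
                chart[(path, kind)] = known.get((line_key(path, kind), kind), TBD)
    return chart

# Results named in the text; "" is the tester himself.
tested = {
    ("", "Y"): "E-U290",      # my own Y-DNA
    ("MFM", "mt"): "B2",      # maternal grandfather's mother
    ("MFMF", "Y"): "E-U290",  # 2nd-great-grandfather on my mother's paternal side
}
chart = build_chart(tested)

print(chart[("MF", "mt")])   # B2 (maternal grandfather, same as his mother)
print(chart[("MFM", "Y")])   # E-U290 (inferred from her father)
print(chart[("MM", "Y")])    # To be determined
print(chart[("MMF", "mt")])  # To be determined
```

The last two cells are the missing branches described above: the maternal grandmother's inferred Y-DNA and her father's mt-DNA.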

    I've a lot of work to do. 
     I must find — and convince — living relatives with direct matrilineal and patrilineal links back to my fore-parents to test. Now I'll put this exercise into genealogical perspective by sharing a personal family haplotree discovery with you.

    III.  Proving Family Anecdote with Haplogroup Tree Building
    From left: my great-grandaunt Virginia Jackson Van Ness; my great-grandaunt Clara Jackson Van Horn; and my great-grandmother Mary Louise Jackson Winkey
    I didn't grow up with my maternal grandfather's side of the family so when I found them a few years ago, I was eager to get to know more about them. It was bittersweet because my mother, her only sibling, my grandfather and all of his siblings had passed away. Around the same time I began DNA testing myself at all of the major companies to maximize my genealogical research. 

    Of course we had the ubiquitous Native American rumor. According to my grandfather's niece and family historian Gwen Crews, my grandfather's maternal grandmother Sophia Shipley, born 1862 in White House, Hunterdon, New Jersey, knew our Native American relatives. 

    My 2nd-great-grandmother Sophia Shipley married Claiborne Jackson, who migrated from Louisa, Virginia, and together they had 11 children: 
    (1) Ida; (2) Inda; (3) Virginia; (4) Nelson Franklin; (5) Mabel; (6) Gladys; (7) Clara Evelyn;
    (8) Leroy; (9) Claiborne Jr; (10) John; and my great-grandmother, (11) Mary Louise.

    Cousin Gwen says my 2nd-great-grandparents and their big family were the only black folks living in the unincorporated community of Finderne, Somerset, New Jersey. My own research shows that Sophia Shipley and her ancestors were free people of color with colonial roots in New Jersey. Sophia's mother was Mary Jane Wyckoff (born 1828) and her mother was Jane (born 1800, maiden name unknown). They were all either described as "Colored," "Mulatto" or "Black" on most federal census records. Notably indigenous populations were often classified in the same manner in New Jersey and other places as well. 

    From a historic standpoint, the colonial New Jersey hinterland had an African presence (enslaved, indentured, freed and escaped) since the 1600's and they intermixed with indigenous and European populations. Many came directly from Africa (Guinea Coast) into New York harbor while others arrived after first being seasoned in the Caribbean (Barbados). My ancestors were among them. 

    When I saw photos of my great-grandmother Mary Louise and two of her siblings, Virginia and Clara (all pictured above), I knew there must be something to the family rumors. Although phenotypes are an unreliable indicator of ethnicity, the faces of my ancestresses looked to have Native American and/or Asian influences. The good news is they had lots of children (including my great-grandmother with 10), grandchildren and great-grandchildren to test. 

    Enter cousin Richard Oakley. I met him on AncestryDNA in 2012 when we were doing research on the same family. According to Richard, his great-grandmother was Sophia Shipley, his grandmother was her daughter Virginia "Jenny" Jackson, and his mother was Virginia's daughter Clara Harvey (she was named after her aunt Clara Jackson pictured above). This would make cousin Richard my second cousin once removed. This is because Richard's grandmother Virginia and my great-grandmother Mary Louise were sisters and both were daughters of Sophia Shipley.  
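The cousin arithmetic can be checked with a short sketch. The function name and the convention of counting generations down from the shared ancestor (child = 1, grandchild = 2, and so on) are assumptions made for this example, and it only handles cousin relationships, not direct lines or aunts and uncles.

```python
ORDINALS = {1: "first", 2: "second", 3: "third", 4: "fourth", 5: "fifth"}
REMOVALS = {1: "once", 2: "twice"}

def cousin_label(gen_a, gen_b):
    """Name the cousin relationship from each person's generations below the shared ancestor."""
    if min(gen_a, gen_b) < 2:
        raise ValueError("not a cousin relationship (direct line, sibling, aunt/uncle)")
    degree = min(gen_a, gen_b) - 1
    removed = abs(gen_a - gen_b)
    label = f"{ORDINALS.get(degree, f'{degree}th')} cousin"
    if removed:
        label += f" {REMOVALS.get(removed, f'{removed} times')} removed"
    return label

# Sophia Shipley is my 2nd-great-grandmother (4 generations up),
# Richard's great-grandmother (3) and his mother Clara Harvey's grandmother (2).
print(cousin_label(4, 3))  # second cousin once removed
print(cousin_label(4, 2))  # first cousin twice removed
```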

    And like cousin Gwen, our newly found relative Richard told me that he'd heard Sophia Shipley had some sort of Mohawk Indian heritage, and that there was a photo of her in buckskin shoes and holding a peace pipe. I was thinking Lenni Lenape or Carib Indian. 

    Thus I identified cousin Richard as the perfectly strategic candidate to test. His mt-DNA haplogroup result would tell us more about our shared ancestor Sophia Shipley on her direct matrilineal line. Needless to say I convinced Richard to test. As a control bonus, I was able to test Richard's 94-year-old mother Clara Harvey, his 89-year-old uncle L. Van Ness and our cousin Catherine Brunson, whose maternal grandmother was my great-grandaunt Clara Jackson. ... They should all have the exact same mt-DNA haplogroup! 

    First let's start with my 23andMe autosomal-DNA test results and triangulation tool to prove that we are in fact related to each other:
    Part 1: TL Dixon vs. cousins Richard Oakley, Clara Harvey, L. Van Ness and Catherine Brunson

    Part 2: TL Dixon vs. cousins Richard Oakley, Clara Harvey, L Van Ness and Catherine Brunson
    As you can see, the 23andMe results support that we are related to each other in the range of first cousins twice removed and second cousins once removed. Although I didn't post them here for brevity, I also compared each of my cousins to each other and to myself, and we are all related to each other. In fact we form what is known as a Triangulation Group (TG). [See Blaine Bettinger's blog on triangulation proof standards here.]

    Further proof — I match Richard, his mother and uncle on the X-chromosome (as seen above). This is crucial because as males Richard, his uncle and I could have only inherited our one X-chromosome from our mothers. Therefore the possibility of a paternal-side match between us as a TG can be ruled out. 
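The X-chromosome reasoning can be expressed as a simple rule: a male receives his X only from his mother, so a father-to-son link can never carry X-DNA. The sketch below uses an assumed path notation ("M" = mother, "F" = father, read from the tester upward) and a hypothetical function name.

```python
def x_path_possible(path, tester_is_male):
    """True if X-DNA could have travelled down this ancestral path to the tester."""
    if tester_is_male and path.startswith("F"):
        return False      # a son gets no X from his father
    return "FF" not in path  # nor does any father along the way get one from his father

# Me (male) to Sophia Shipley: mother, her father, his mother, her mother.
print(x_path_possible("MFMM", tester_is_male=True))  # True
# Richard (male) to Sophia Shipley: mother, her mother, her mother.
print(x_path_possible("MMM", tester_is_male=True))   # True
# Any path that starts on a male tester's paternal side is ruled out.
print(x_path_possible("FM", tester_is_male=True))    # False
```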

    Next I examined the mt-DNA haplogroup results for Richard, his mother, his uncle and our cousin Catherine Brunson. First I reveal Richard and his mother's 23andMe mt-DNA haplogroup results:


    23andMe old screenshot

    As you can see, their mtDNA haplogroup is B2, one of the five founder haplogroups found among the indigenous peoples of the Americas. We were shocked. This result could prove our Native American ancestry through our shared ancestor Sophia Shipley. 

    On my 23andMe DNA Relatives list below you can see Richard, his mother Clara Harvey, his uncle L. Van Ness and our cousin Catherine Brunson show as mt-DNA haplogroup B2:
    Here is a close-up of me and Catherine Brunson, who is my second cousin once removed and Richard's second cousin. Her maternal grandmother was Clara Jackson Van Horn and as you can see her mt-DNA haplogroup is B2:

    I added my new family haplogroup discovery to an mtDNA & Y-DNA Inheritance chart, but this time using AncestryDNA's FamilyTree program to show my, Richard's and Catherine's links to Sophia Shipley. Here are the screenshots (click to enlarge). 
    My connection to Sophia Shipley 
    Cousin Richard Oakley's connection to Sophia Shipley
    Cousin Catherine Brunson's connection to Sophia Shipley
    I can now say for certain that Richard's maternal grandmother and my great-grandmother (via my maternal grandfather) were sisters (both had the same parents) — and they would all be mtDNA B2, including my grandfather.  

    Yet we were worried. 23andMe only tests a small amount of the mt-DNA and Y-DNA. At present their genomic build for haplogroup testing is outdated, leading to only basic haplogroup assignments. We needed to be sure Richard's maternal haplogroup was in fact B2 and not another sub-group of its parent haplogroup B4'5, which is found outside of the Americas. [See Section IV for DNA testing options.]

    So I convinced cousin Richard Oakley to take FamilyTreeDNA's Full Mitochondrial Sequence test, which is the most complete mt-DNA test you can take because the whole mitochondrial DNA code is sequenced. And it gives you the most specific terminal haplogroup available. Here are the results (click to enlarge): 




    As you can see, Richard's FamilyTreeDNA (FTDNA) results were the same as 23andMe's. This was a bit disappointing because mt-DNA haplogroup B2 has a lot of sub-clades, and we were hoping for a more specific one.


    Richard then decided to test at National Genographic 2.0, which at the time was on a more recent phylo-build than FTDNA, so we hoped his haplogroup assignment might be more specific. Here are the results:
    At the time Richard tested at Nat Geno 2.0, customers could transfer their raw data file to FamilyTreeDNA, which in turn would assign a confirmed haplogroup (but since Nat Geno switched to the Helix platform this is no longer available). Here are his revised results:

    Houston, there is a problem. The FamilyTreeDNA results are the same B2 as before and Nat Geno is showing B2b3. The results should actually be the same. Hmmm...

    I then uploaded Richard's FamilyTreeDNA FASTA file to James Lick's mtDNA Haplogroup Analysis program to see if I could find an issue. Here are the results (click to enlarge):



    In Richard's James Lick mtDNA Haplogroup Analysis (above), the best mtDNA haplogroup match was shared between B2 and B2b3, the latter of which is a sub-group of B2b.

    That's when I noticed the discrepancy: Richard is a mismatch for the defining marker for B2b (indicated above in red as marker 6755A). This means Richard's terminal haplogroup could not be B2b3 if he's negative for the defining mutation of its parent B2b. 

    I contacted genetic genealogy expert James Lick about the result, and he told me that either there was a reversion of the defining mutation for B2b back to its ancestral state (usually haplogroup mutations are confirmed in their derived state), or it was not really B2b3. ... Huh? What's going on here?

    Next I reached out to Argentinian anthropologist and geneticist Dr. Claudio Bravi and asked him about this B2 discrepancy. Dr. Bravi asked to see my cousin's mt-DNA raw data files. After analyzing the files, here is what Dr. Bravi wrote back to me: 
    Hi TL,
    No, it is not B2b3 with reversion. Instead, it is a new clade with homoplasic polymorphism at 13708. 13708 is a rather hotspot that appears over and over again in different mtDNA lineages. 
    Your cousin´s sequence has an almost perfect match with one  published one from the US, unfortunately without info regarding ethnicity or geographic origin. See below a comparison of these sequences. I only listed the polymorphisms accumulated since the arrival to America: both sequences share the same five mutations and differ at hypervariable 16092. This indicates that they are really very close as seen here. 


    As you probably know, there is a dearth of data regarding US Native Americas.
    Best,
    Claudio
    What is a homoplasy and how does it apply to my family's rare maternal haplogroup assignment B2?  

    Well, a homoplasy occurs when the same mutation appears in two haplogroups but was not inherited from a common ancestor; it arose independently in each. Homoplasy often results from what is known as convergent evolution. In other words, the marker behind Richard's haplogroup assignment B2b3 was not inherited through its parent sub-clade B2b, which is defined by 6755A (derived state), whereas Richard has 6755G (ancestral state) in that location. This indicates that Richard's terminal haplogroup is a new branch of B2 which has not been defined and which is also positive for 13708.
    • To provide another example let's look at maternal haplogroup M42, which is found mostly in Australia according to Nagle et al. The existence of a novel (new) branch within the M42 clade "was first suggested by Ballantyne et al, in which some samples were labelled M42*(xM42a) because they carried the transition at np A9156G (the defining mutation of M42) but not G12771A (one of the defining mutations of M42a)."
    I've designated Richard and my maternal grandfather's potential new maternal haplogroup assignment as  B2* + 13708. ... Am I correct? I'm not sure. 
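The logic of the discrepancy can be sketched as a walk up the haplogroup tree: a sample only belongs to a sub-clade if it also carries the defining mutations of every parent clade above it. The tiny tree below is a simplified illustration, not the real phylogeny. Only 6755A for B2b comes from the text; treating 13708A as the marker for B2b3 is an assumption drawn from Dr. Bravi's note, and B2's own defining markers are left out entirely.

```python
# Simplified, partly assumed tree (see the caveats above).
PARENT = {"B2b3": "B2b", "B2b": "B2", "B2": None}
DEFINING = {
    "B2": set(),         # defining markers omitted in this sketch
    "B2b": {"6755A"},    # from the text
    "B2b3": {"13708A"},  # assumption for illustration
}

def missing_markers(clade, derived_alleles):
    """Defining markers the sample lacks, for the clade and each of its parents."""
    missing = {}
    while clade is not None:
        lacking = DEFINING[clade] - derived_alleles
        if lacking:
            missing[clade] = lacking
        clade = PARENT[clade]
    return missing

# Richard is positive for 13708 but ancestral (6755G) at position 6755.
richard = {"13708A"}

print(missing_markers("B2b3", richard))  # {'B2b': {'6755A'}}
print(missing_markers("B2", richard))    # {}
```

Because the parent clade B2b fails the check, the deepest assignment this sketch supports is B2, with 13708 as an extra mutation on top, which is what the B2* + 13708 label expresses.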

    But I've proven that my maternal grandfather, and by extension myself, biologically descends from a foremother of Native American descent on his direct matrilineal line. These are facts I can add to my Family Haplogroup Tree project and to my genealogical research efforts!
    IV. Mt-DNA & Y-DNA Testing Options

    In this final section I discuss mtDNA and Y-DNA testing options. Some DNA companies offer mtDNA and Y-DNA testing as stand-alone products (like FamilyTreeDNA) and others include them in with autosomal-DNA testing (like 23andMe). Reminder: I highly recommend each of your strategically placed relatives do an autosomal DNA test first to confirm the relationship between you and your shared fore-parent. Here are some great testing options: 

    FamilyTreeDNA  (price varies) — offers a range of comprehensive mtDNA and Y-DNA tests (from $69 US), including the highly recommended Full  Mitochondrial Sequence test ($199 US),  and the Big Y ($695 US), as well as individual SNPs. FTDNA's mtDNA and Y-DNA tests are separate from its autosomal-DNA FamilyFinder product so testing can get pricey. The Big Y goes on sale sometimes and can be as low as $395. 
    Noteworthy: FTDNA is the only major company offering mtDNA and Y-DNA relative matching as well as haplogroup projects you can join. FTDNA is on Build 17 as of March 24, 2017 (it was previously on Build 15). As of October 2017, FTDNA's Big Y product is upgrading from hg19 to hg38.

    23andMe  ($99 US) — offers basic haplogroup assignment predictions based on low coverage mtDNA and Y-DNA, included with its autosomal DNA test product. Haplogroup assignment generally accurate, but may not be as specific because 23andMe utilizes the outdated Build 7 platform. 
    Noteworthy: If you're watching your budget 23andMe is the most economical option. 

    National Genographic 2.0 Helix ($149 US) — offers moderate-coverage mtDNA and Y-DNA testing, included with autosomal DNA test product. Haplogroups are very accurate; Nat Geno is on Build 17. At present Nat Geno 2.0 Helix does not offer any genetic relative matching, which may be problematic for confirming genetic relationships. 
    Noteworthy: Nat Geno 2.0 Helix offers more Mt-DNA and Y-DNA SNP coverage than FTDNA with the exception of FTDNA's Full Mitochondrial Sequence and Big Y products. 

    Full Genomes Corp. (price varies) — offers an array of whole genome sequencing (from $675) and Y-DNA products, including the high resolution Y-Elite test ($645). Interpretation services included in the cost of some products but offered as stand-alone product as well. 
    Noteworthy: Full Genomes Corp. offers a payment plan option to help defray costs. 

    YSEQ (price varies) — offers whole genome sequencing tests (from $740), and an array of Y-DNA STR and SNP panels (from $17.50). Includes deep sequencing for mt-DNA and Y-DNA with whole genome sequencing product. Interpretation services included. 
    Noteworthy: YSEQ helped discover the oldest Y-DNA haplogroup ever found to date changing what we know about Y-DNA phylogeny. [Read my A00 Cameroon story here]. 

    King Genome's Tips:
    •  James Lick's excellent mtDNA Haplogroup Analysis tool has been updated to reflect the most recent Build 17 update. When you upload your mt-DNA raw data files from AncestryDNA, 23andMe and FamilyTreeDNA (FASTA), it will be conformed to the new build. This could lead to a better mt-DNA haplogroup prediction. 
    • For males taking high-resolution Y-DNA tests you can upload your raw data file (BAM or vcf) to YFull.com ($40 US), an interpretation service for next generation sequencing products (Big Y, Y-Elite). Please read Linda Thompson Jonas's excellent blogs here and here to learn more about the importance of deep Y-DNA testing. 
    • Males can learn their general Y-DNA haplogroup by uploading their AncestryDNA raw data file to Promethease ($5 US). For instructions see Blaine Bettinger's blog here.

    #END#
