Introduction to GEDmatch

Using GEDmatch

Use of this document signifies your agreement to the Web Site Agreement, Privacy Statement and Terms of Use

of DNAGedcom and DNAAdoption. This content is copyrighted by DNAAdoption.

Benefits of GEDmatch (Why should I upload my results to GEDmatch?)

Do yourself a favor: Upload today

As soon as you get your test results, immediately upload the raw data, as explained in this course. It

will take some weeks for the data to be processed and available for all of the optional searches.

However, in a day, you should be able to do one-to-one searches. For example, if you have a relative

who is on GEDmatch, get their kit number and run a one-to-one search. It’s to everyone’s benefit to

take advantage of this opportunity.

Lesson written and formatted by Susan MacLaughlin and Diane Harman-Hoog

Layout and Design by Mesa Foard

© 2015 DNAAdoption.com

Objective:

Compare your DNA test results with a large database of results from multiple testing companies.

Tools: GEDmatch.com website; raw autosomal DNA data from AncestryDNA,

Family Tree DNA, and/or 23andMe.

Exercises: Practice Exercises throughout the lesson help you apply what you’re learning.

Relative Roadmap: Your Journey with Autosomal DNA

GEDmatch enables you to compare your autosomal DNA data with other testers' data, regardless of which company processed your sample.

GEDmatch provides the only way to see and compare your AncestryDNA data.

Using GEDmatch, you may find additional “DNA relatives” whom you had not previously identified.

Introduction to GEDmatch

GEDmatch (pronounced JED-match) is a free, "do-it-yourself" genomics website that allows DNA testers to

upload their raw data from Family Tree DNA (FTDNA), 23andMe and AncestryDNA in order to compare their

autosomal DNA (atDNA) with a large database of data that has been voluntarily uploaded by other testers. The

website address is http://www.gedmatch.com

GEDmatch is a free, non-profit utility website that is not affiliated with any of the DNA testing companies. It is

staffed by a team of dedicated volunteers. In order to maintain and upgrade its services, GEDmatch offers a

$10/month subscription add-on called Tier 1. This add-on provides subscribers with additional search capabilities.

GEDmatch uses a slightly different algorithm to compare DNA data than that used by the “big three” testing

companies. So, with GEDmatch you may discover additional matches, including contact information for some

matches. The algorithm also gives you different views of the comparisons, which may help you begin to separate

your matches along maternal and paternal lines. Stay tuned to learn more.

Register with GEDmatch

GEDmatch requires registration to enter the website. Log in and continue.

What is Raw Data?

Raw data sounds like it could be the saliva you sent in to be tested. It is not. Raw data is the

computer generated file of information about your DNA. It contains a list of the locations of the

bits of your DNA that were tested and what was found at each location. This file can be

downloaded from your DNA testing company and uploaded to GEDmatch.
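
To make the idea of a raw data file more concrete, here is a minimal Python sketch. It assumes a tab-separated layout with one tested location per line (a marker ID, the chromosome, the position, and the two letters found there) and comment lines that start with #. The exact columns differ from one testing company to another, so the column names and the sample rows below are illustrative assumptions, not any company's actual format.

```python
import csv
import io

# Hypothetical sample in an assumed tab-separated layout. Real files from
# each testing company may use different columns and header lines.
SAMPLE = (
    "# comment lines start with '#'\n"
    "rsid\tchromosome\tposition\tallele1\tallele2\n"
    "rs0000001\t1\t16748649\tA\tG\n"
    "rs0000002\t1\t23990559\tC\tC\n"
    "rs0000003\tX\t5000000\tT\tT\n"
)


def read_raw_data(text):
    """Return a list of dicts, one per tested location."""
    # Drop blank lines and comment lines before parsing the table.
    data_lines = [ln for ln in text.splitlines() if ln and not ln.startswith("#")]
    reader = csv.DictReader(io.StringIO("\n".join(data_lines)), delimiter="\t")
    rows = []
    for row in reader:
        row["position"] = int(row["position"])  # positions are whole numbers
        rows.append(row)
    return rows


rows = read_raw_data(SAMPLE)
for row in rows:
    # Print where the location is and what was found there.
    print(row["chromosome"], row["position"], row["allele1"] + row["allele2"])
```

Each printed line is one "location plus what was found", which is all a raw data file really is. You never need to read the file yourself to use GEDmatch; the sketch only shows what kind of information is inside.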

Your GEDmatch Home Page (Main Menu)

When you log in, your home page will look something like this. Your name and email address appear in the

upper left under Your Log-in Profile.

Let’s take a tour of the page.

The top half of the page includes Information in the upper left. Below your name, in the Learn More section,

click on the bullet points to explore the topics listed. In the upper right of the screen you will find File Uploads.

We will cover in detail the steps to take to upload your raw data.

The lower left section of the page lists any or all atDNA kits that are currently associated with your email

address. Opposite it, on the right, you will find the Analyze Your Data section.

Note that Diane’s profile shows that she is a Tier 1 Member (green text). Tier 1 is an extra analysis area that was

added by GEDmatch at the end of 2014. You get access to it by subscribing to the site for $10 per month. Tier 1

Utilities appear in the lower right. More details will follow later in this course.

GEDmatch Process

Now that you understand how GEDmatch.com is organized, you are ready to take advantage of all that

GEDmatch has to offer. The process is straightforward. Follow these three steps.

1. Download your raw data from the testing company

2. Upload your raw data to GEDmatch.com

3. Analyze your data using GEDmatch utilities

How to Download and Upload Your Raw Data

Go to the top of the opening GEDmatch page for good instructions on working with your raw data files.

For example, click on Ancestry.com.

A new screen will open. Near the top of the page, click Here (blue text) for detailed instructions on downloading

and uploading your data.

This course expands on some of those directions.

Which Data Should I Upload?

Each of the three testing companies enables you to download your raw data. If you have test results from all

three companies, you could upload all of them to GEDmatch. However, the data would be very similar. We now

suggest you upload results from only one company. If you have AncestryDNA results, upload those.

If you only have test results available from one of the other testing companies (FTDNA or 23andMe), that’s okay.

Go ahead and upload that set of raw data. Doing so will allow you to get the process started so that you can use

GEDmatch utilities right away to see more of your matches. When your AncestryDNA results are ready, upload

your Ancestry raw data, also. Working with two sets of data on GEDmatch.com is not a problem.

If you have your AncestryDNA results, use them for GEDmatch analysis. Download your

raw data from AncestryDNA; then upload it to GEDmatch.com, as described in this

lesson. This is the only way to see your AncestryDNA matches’ data, and the only way

to compare your AncestryDNA data with others.

AncestryDNA Raw Data

Log in to your Ancestry account. Click on the DNA tab and then on Your DNA Home Page. On the right of the

page, you will see a button with a little gear and Settings. Click on it; this will take you to your settings page.

In the right hand panel of the Settings page you will see Actions. Click on Get Started to download your raw

data.

A form will open in a new window. Enter your password and click Confirm.

An email will be sent to you. Follow the directions in the email to download your data. Save the files to your

download folder or to another folder on your PC. Pat yourself on the back! You have downloaded your

AncestryDNA raw data.

Now it’s time to upload the raw data to GEDmatch.com. Click on Ancestry.Com to begin.

Fill in the form, following the instructions (in blue). Check Yes to allow your data to be compared with others’.

Browse your PC for the raw DNA file name. Choose the file and click Upload to send your raw data to GEDmatch.

Wait patiently while each chromosome number (1, 2, 3, etc.) appears on the screen as it completes processing.

Be sure to wait for all of the data to process before you leave the screen.

When Finished shows on your screen, you have completed uploading your AncestryDNA raw data.

Congratulations! Before you leave the page, write down the GEDmatch number, starting with the letter A.

Family Tree DNA (FTDNA) Raw Data

FTDNA provides raw data in two files: an autosomal file and an X Chromosome file. You must download both

files from FTDNA, and upload both to GEDmatch.com, in order for your results to be accepted and compared to

others’ results. Go to FTDNA’s website, www.familytreedna.com, and log in.

You will be taken to your opening page. Welcome to myFTDNA is in the upper left, above your profile. Under

Family Finder, in the lower right, you will see a line of orange text at the bottom of the section.

Click on Download Raw Data.

You will download two files, one at a time. First, click on Build 36 Autosomal Raw Data. Save this file “as is”

(do not open it) to your download folder or other folder on your PC. Make a note of where it is stored. Second,

click on Build 36 X Chromosome Raw Data. Save this file to the same folder on your PC.

Congratulations! You have downloaded your FTDNA raw data!

Now upload the two files from your PC to GEDmatch, one at a time.

Begin by clicking on FTDNA Family Finder on the GEDmatch opening page.

This form opens. Follow the instructions below (in blue). Check Yes to allow your data to be compared with

others’. Browse your PC for the autosomal raw DNA file name; it will begin with ffo and end in .gz. Choose this

file and click Upload to send your raw data to GEDmatch.

It takes several minutes for the data to process. Wait patiently while the numbers 1 to 36 appear sequentially on

your screen. Do not click away from the page while it processes.

When the autosomal data upload has finished, follow the same upload process for your FTDNA X-DNA raw data.

A similar form will launch on GEDmatch. Complete it using the same name, account number, etc. as you did

previously. The X-DNA raw data file begins with xo and ends with .gz. Choose it and click Upload.

When Finished shows on your screen, you have completed uploading your FTDNA raw data. Well done!

23andMe Raw Data

Sign in to your account at www.23andme.com.

Click on the down-arrow (inverted triangle) next to your name. Select Browse Raw Data from the drop-down

menu.

When you open that you will see a page that looks kind of like a calendar. In the top right corner you will see a

Download link. Click on Download.

Fill out this form, following the instructions below (in blue).

Then click Download Data at the bottom right of the form to download a raw data file to your PC.

Go to GEDmatch.com (log in if your session has expired), and click on 23andme.

Fill in the form. Check Yes to allow your data to be compared with others’. Browse your PC for the raw DNA file

name; it will end in .zip. Choose this file and click Upload to send your raw data to GEDmatch.

It takes several minutes for the data to process. Wait patiently while the numbers 1, 2, 3 and so on appear on

your screen. Do not click away from the page while it processes.

When Finished shows on your screen, you have completed uploading your 23andMe raw data. Well done!

Before you leave this page, write down the GEDmatch number, starting with the letter M.

Analyze Your Data

One-to-many – This is the most common way to analyze your data. Enter your GEDmatch number (kit number).

Set the cMs in both blanks to 5 cM for the first round. (The ISOGG tables use this as a baseline for determining

relationship predictions. Or, use our Combined Prediction Chart.)

Click Display Results, and GEDmatch gives you a table with up to 1,500 of your matches in it. The table below is

a truncated version of a very long table (multiple pages). During this exercise, be sure to copy a couple of the

kit numbers to use in other exercises so you do not have to rerun this report.

Let’s go through the match listing report, column by column.

Kit Nbr – Your matches’ GEDmatch kit numbers.

Type – Type of DNA test and/or chip used in testing the raw data. “A”-prefix files are from AncestryDNA.

List – Clicking on the L link will take you to the matches for that particular kit number.

Select – Check the boxes for the kit numbers you wish to compare when you want to see chromosome results

for more than two individuals. Then select the Chromosome option in the top part of the page to run the analysis.

This will open a chromosome browser so you can see where (on which chromosome segments) you and your

matches share DNA. AncestryDNA does not provide chromosome data to testers. Therefore, it is particularly

important for you and your AncestryDNA matches to upload your raw data to GEDmatch.

Sex – Male or Female match. This could be important when a male has X matches because we then know that

the connection must be through his mother, as males only receive their X-DNA from their mother.

Under the Autosomal heading

Details – “A” – This link will take you to a page that shows you where exactly (on which chromosome, location

and length) you match this person. It will be similar to the chromosome browser at FTDNA or the IBDData.csv

that you can download from “Family Inheritance: Advanced” at 23andMe.

Total cMs – Total CentiMorgans (across all chromosomes) of your match.

Largest cM – Longest length of match in CentiMorgans on one specific chromosome.

Gen – Estimated number of generations going back to a common ancestor for you and your match. We must

stress that the estimate at GEDmatch may be too generous.

Under the X chromosome heading

Details – “X” – This link will take you to a page that shows you exactly where on the X chromosome (location and

length) you match this person.

Adj. cM – GEDmatch’s best guess. Remember, the X chromosome is inherited and passed down differently.

Females inherit their X from both their mother and father; males inherit their X from their mother only.

Total cMs – Total CentiMorgans of your X match.

Largest cM – Longest length of CentiMorgans on the X chromosome.

Email – Email address of your match.
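
The idea of a cM threshold, such as the 5 cM setting suggested above, can be sketched in a few lines of Python. This is only an illustration of filtering a match table by the Total cMs and Largest cM columns; it is not GEDmatch's own algorithm, and the kit numbers and values are invented.

```python
from dataclasses import dataclass


@dataclass
class Match:
    kit: str           # GEDmatch kit number of the match (invented here)
    total_cm: float    # value from the Total cMs column
    largest_cm: float  # value from the Largest cM column


# Invented rows for illustration only.
matches = [
    Match("M000002", 42.5, 14.6),
    Match("A000001", 3400.0, 280.0),
    Match("A000003", 9.1, 4.8),
]


def filter_matches(rows, min_total_cm=5.0, min_largest_cm=5.0):
    """Keep matches at or above both thresholds, closest matches first."""
    kept = [
        m for m in rows
        if m.total_cm >= min_total_cm and m.largest_cm >= min_largest_cm
    ]
    return sorted(kept, key=lambda m: m.total_cm, reverse=True)


kept = filter_matches(matches)
for m in kept:
    print(m.kit, m.total_cm, m.largest_cm)
```

Here the third kit drops out because its largest segment is under 5 cM. Raising the thresholds shortens the list; lowering them lengthens it but lets in more small matches that may not prove useful.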

We loosely define a centiMorgan (cM) as a unit for measuring genetic linkage. A cM measures

distance along a chromosome, but it is not a fixed physical length like an inch or a centimeter.

First you have base pairs: each is a single position on your DNA strand. CentiMorgans have to do

with how likely the DNA is to recombine along a given stretch. Gaye Tannenbaum’s analogy is mile

markers vs. exits: base pairs are mile markers and centiMorgans are exits. On some stretches of

road (chromosome) the exits are close together, and on other stretches of road they are far apart.
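
The mile markers vs. exits analogy can be sketched in Python. The small "genetic map" below is invented purely for illustration (real maps have many more points); it pairs base-pair positions with a running cM total, and the function estimates the cM length of a segment by interpolating between map points.

```python
from bisect import bisect_right

# Hypothetical genetic map for one chromosome:
# (base-pair position, cumulative cM). All numbers are invented.
GENETIC_MAP = [
    (0, 0.0),
    (10_000_000, 5.0),   # quiet stretch: 10 million base pairs, only 5 cM
    (20_000_000, 25.0),  # busy stretch: 10 million base pairs, 20 cM
    (30_000_000, 30.0),
]


def cm_at(position):
    """Estimate cumulative cM at a base-pair position by linear interpolation."""
    positions = [bp for bp, _ in GENETIC_MAP]
    if position <= positions[0]:
        return GENETIC_MAP[0][1]
    if position >= positions[-1]:
        return GENETIC_MAP[-1][1]
    i = bisect_right(positions, position)
    (bp0, cm0), (bp1, cm1) = GENETIC_MAP[i - 1], GENETIC_MAP[i]
    return cm0 + (cm1 - cm0) * (position - bp0) / (bp1 - bp0)


def segment_cm(start, end):
    """cM length of the segment between two base-pair positions."""
    return cm_at(end) - cm_at(start)


print(segment_cm(0, 10_000_000))           # same number of base pairs...
print(segment_cm(10_000_000, 20_000_000))  # ...but a different number of cM
```

Both segments cover 10 million base pairs (mile markers), yet on this made-up map one is 5 cM and the other is 20 cM (exits). That is why two matching segments of the same physical length can carry very different weight.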

Exercise 1: Run a One-to-many comparison using your own GEDmatch number.

You will have to wait until your data is completely processed to do this and it can

take from a few days to a few weeks.

You may use Diane’s if you want to try it against her uploaded data and do not have your

own processed yet. Diane’s GEDmatch number is M100424. It is preceded by an M, which

tells you the account is from 23andMe. The results for Diane’s kit will include people who

match Diane, not you. Once you get the first table, you will have the account numbers of

her matches. In GEDmatch you may run any of the analyses (exercises) using any of her

matches. So experiment and see what you get. Can you identify Diane’s brother?

Now put a check mark next to several matches and go to the options above the report and

try them all. These are all different presentations of data. The one I use the most often is

the Chromosome option. This is data you can eventually put into a spreadsheet as you will

learn in later classes.

One-to-One Compare

When you want to compare just two kits, choose One-to-one Compare from the Analyze Your Data list.

Example: Let’s say you find two matches who both have overlaps with you on one chromosome. But you need to

find out if these two matches match each other. Your goals are to:

Find your matches at FTDNA, 23andMe or GEDmatch

See where there are segment overlaps with other matches

Analyze the other matches to see if they match each other

If so, then you and your matches share a common ancestor.
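
Checking for a segment overlap is simple arithmetic on start and end positions. Here is a minimal Python sketch; the Segment type and the sample positions are invented for illustration and are not something GEDmatch provides.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    chromosome: str
    start: int  # start position on the chromosome
    end: int    # end position on the chromosome


def overlap(a: Segment, b: Segment) -> Optional[Segment]:
    """Return the shared stretch of two segments, or None if there is none."""
    if a.chromosome != b.chromosome:
        return None
    start = max(a.start, b.start)
    end = min(a.end, b.end)
    if start >= end:
        return None
    return Segment(a.chromosome, start, end)


# Invented example: where you match two different people on chromosome 7.
you_vs_match1 = Segment("7", 10_000_000, 30_000_000)
you_vs_match2 = Segment("7", 20_000_000, 45_000_000)

shared = overlap(you_vs_match1, you_vs_match2)
print(shared)
```

An overlap on its own is not proof of a common ancestor. The two matches could each match a different one of your two copies of chromosome 7, which is why the next step is a One-to-one Compare of the two matches against each other.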

One-to-one analysis is immediately available. You can use it while your results are being fully processed by

GEDmatch. So if you are in a hurry to see how you match with a particular person, and you know the kit

number, you can run this comparison now. Click on One-to-one Compare and a page will open where you

can enter two different kit numbers to compare:

X One-to-One Compare

This is the same as the One-to-one Compare but only on the X chromosome.
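
Because males receive their X-DNA from their mother only, while females receive an X from each parent, only some of your ancestors could have passed X-DNA down to you. The following Python sketch lists those ancestral lines. It applies nothing beyond that one inheritance rule, and the function name is our own.

```python
def x_paths(sex, generations):
    """List the ancestral lines that could have passed down X-DNA.

    sex is "male" or "female". Each line is a tuple such as
    ("mother", "father"), meaning the mother's father.
    """
    paths = [((), sex)]
    for _ in range(generations):
        next_paths = []
        for path, person_sex in paths:
            # Everyone receives an X from their mother.
            next_paths.append((path + ("mother",), "female"))
            # Only females also receive an X from their father.
            if person_sex == "female":
                next_paths.append((path + ("father",), "male"))
        paths = next_paths
    return [path for path, _ in paths]


for line in x_paths("male", 2):
    print(" -> ".join(line))
```

For a male tester, two generations back gives only the mother's mother and the mother's father; nobody on his father's side appears. For a female tester, the father's father never appears, because a father passes his X along from his own mother only.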

Admixture Utilities

This screen gives you a way to estimate your possible ethnic origins.

Exercise 2: Run a One-to-one comparison using your own GEDmatch number.

You can run this exercise as soon as you upload your data.

Run a One-to-one against Diane’s kit number if you wish. However, you may not get any

matches that way. If you like, run a One-to-one compare of Diane’s kit number against one

of her matches from the One-to-many exercise.

Look at the results. What do you see when the two subjects’ DNA overlaps? The longer the

overlap, the closer the relationship. Try several different comparisons and notice the

difference in results. Now put a check mark next to several matches and go to the options

above the report and try them all. These are all different presentations of data. The one we

use most often is the Chromosome option. This is data you can eventually put into a

spreadsheet for further analysis as you will learn in later classes.

Exercise 3: Run a One-to-one X comparison.

Pick two people out of the results list who both have a cM amount listed under X. You will

learn more about X-DNA in the Working with Autosomal DNA class.

Admixture analysis is still an area of improving results. Be aware that there is a speculative part to this.

Select the Project

Phasing – Refers to the process of separating the mixed DNA results into the DNA obtained from your mother

and the DNA obtained from your father. Phasing is typically done by comparing your results to your parents’

results and determining which parent could have and/or must have contributed each SNP.

Phasing your data requires that both parents and a child have been tested and all their data has been uploaded

to GEDmatch. For adoptees who have yet to identify their birth families, this obviously cannot be done. New

technologies are being discovered every day, however; hopefully, phasing of siblings may soon be available.
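
To show what "could have and/or must have contributed" means at a single SNP, here is a minimal Python sketch. It is a simplified illustration of the idea, not GEDmatch's phasing utility, and the function name and sample letters are invented.

```python
def phase_snp(child, mother, father):
    """Work out which letter of the child's result came from which parent.

    Each argument is a two-letter result at one SNP, such as "AG".
    Returns (from_mother, from_father), or None when the SNP cannot be
    phased (both assignments fit, or neither fits).
    """
    first, second = child
    options = set()
    # Try both ways of assigning the child's two letters to the parents.
    for from_mother, from_father in ((first, second), (second, first)):
        if from_mother in mother and from_father in father:
            options.add((from_mother, from_father))
    if len(options) == 1:
        return options.pop()
    return None


print(phase_snp("AG", "AA", "GG"))  # mother must have given A, father G
print(phase_snp("AG", "AG", "AG"))  # either parent could have given either
```

When all three people have the same two different letters at a SNP, that SNP cannot be phased, so a phased result always has some gaps. This also shows why both parents and the child need to be on GEDmatch for this to work.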

GEDCOMS

GEDCOMs (family tree files) are not always available on GEDmatch. When the GEDmatch.com server is overloaded,

GEDmatch turns off the GEDCOM capability. When it is working, you can compare your GEDCOM with all other

uploaded GEDCOMs, looking for matches.

Are Your Parents Related?

This is the other relevant utility. Run it if you think that your parents might have been related.

Tier 1

You have to pay $10/month for a subscription to GEDmatch to see and use Tier 1 utilities. We have permission

for you to use my login for this one - class username = [email protected] and password = me1lissa. The

Tier 1 Utilities include these options:

GEDmatch Matching Segment Length

Usually the only thing you need to enter in the form is the kit number of your kit or any other one you are

interested in. This can take a long time to run. It will show you the matching segments in your matches and the

length of those segments. This utility is still in extremely early beta, and the meaning of its results is not yet well understood.

Exercise 4: Google these projects. Try them all or none of them if you wish.

You have a number of ways in which you can process the data. All of them have their

pluses and minuses. Remember this is an emerging field of analysis. Just experiment

with the different choices.

Relationship Tree Projection

No one seems to know just what this means or how you use it yet.

Lazarus

If you have sufficient information on close ancestors you may be able to reconstruct a chromosome profile for

an ancestor. Most adoptees will not have it.

Triangulation

Triangulation is the process we use to identify people who are both in common with (ICW) each other and have

overlapping cMs. This utility has been reworked. It now takes the top 400 of your matches and looks to see which ones have overlapping

DNA with each other. Triangulate my account number.

Just enter the kit number of the kit you want to use as the basis of the triangulation. Each time you run this

analysis, it takes about 45 minutes.

Exercise 5:

Put in the kit number you want to use as a basis; normally it would be your own. In

this case, use any one you got from my matches if you are using my kit #. The first

time, use the default setting, but be sure to check the second circle in the list. Look

at the results.

Result of Running Triangulation

This is the result of running the Triangulation Utility

This is a 4 line representation of a table of about 1200 lines.

In this Donna triangulates with Bob and the owner of the kit id you input. (yellow)

Maria triangulates with Bob and the owner of the kit id you input. (green)

And basquebrent triangulates with Bob and the owner of the kit id you input. (pink)

So what does this mean? In order to look for a Common Ancestor between two people, you need overlapping

DNA, ICW status and tree data. Three things.

16,748,649 23,990,559

16,748,649 is the start and 23,990,559 is the end of the segment where the original kit that was entered,

Donna and Bob overlap. The overlap is 14.6 cM, which you will learn is about equivalent to a 5th cousin. Since

they are also ICW, that means they also share great great great great grandparents. Any ancestors of

this couple are the ancestors of all three of them.

You will have to take the program’s word for the ICW part. I prefer to say blood related. Remember

that your paternal grandmother and your maternal grandfather are most likely not blood related to each

other, so they are not ICW with each other.

You have two copies of each chromosome. The addresses on both copies are the same. One copy comes from the

paternal side of your family and the other from the maternal side. You cannot tell which side a match is on

unless you know of someone who has tested on one side and whether they are paternal or maternal. The other

factor that might be a tip is ethnicity. So if you know one side of your family is all Italian, that might help.

Definitions

Overlapping DNA segments – DNA segments where the starting and ending numbers show that the

segments overlap with each other. A very small match is probably not going to prove useful, but

longer ones are a clue. The longer they are, the closer the relationship.

ICW (in common with) – people that the testing company or the interpreter of raw data says are blood related.

Both ICW with overlapping DNA – they have common ancestors

That is what triangulation is all about. You then proceed wherever possible to try to find these common

ancestors.
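
The two conditions, overlapping DNA and ICW status, can be combined in a short Python sketch. This is a simplified illustration of the idea and not how the GEDmatch Triangulation utility is actually implemented; the names, positions and ICW pairs are all invented.

```python
from itertools import combinations

# Invented data: for each match, the segment shared with the base kit
# as (chromosome, start, end).
segments = {
    "Alex": ("5", 16_000_000, 24_000_000),
    "Blair": ("5", 15_000_000, 25_000_000),
    "Casey": ("5", 17_000_000, 22_000_000),
    "Dana": ("9", 1_000_000, 9_000_000),
}

# Invented ICW pairs: matches who are also reported as matching each other.
in_common_with = {
    frozenset(("Alex", "Blair")),
    frozenset(("Alex", "Dana")),
}


def triangulate(segments, in_common_with):
    """Find pairs of matches that overlap each other AND are ICW."""
    groups = []
    for first, second in combinations(sorted(segments), 2):
        chrom1, start1, end1 = segments[first]
        chrom2, start2, end2 = segments[second]
        if chrom1 != chrom2:
            continue  # different chromosomes can never overlap
        start, end = max(start1, start2), min(end1, end2)
        if start >= end:
            continue  # same chromosome, but no shared stretch
        if frozenset((first, second)) in in_common_with:
            groups.append((first, second, chrom1, start, end))
    return groups


for group in triangulate(segments, in_common_with):
    print(group)
```

Only Alex and Blair pass both tests. Casey overlaps them on chromosome 5 but is not ICW with either, so Casey probably matches on the other side of the family. Dana is ICW with Alex but has no overlapping segment, so there is nothing to triangulate. The third thing, tree data, still has to come from your own research.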

If this is being run against your kit #, they are also your common ancestors and you now have the start of your

very own tree. For an adoptee this can be a thrilling moment.

You can save these tables into a spreadsheet. Open the spreadsheet. If you checked the

“show results sorted by kit_number, chromosome, segment start position” option, highlight the table as follows.

Put your cursor in the middle of the table and press Ctrl+A; this will highlight all the data. Then press Ctrl+C

to copy it onto the clipboard. (For Macs, replace Ctrl with the Apple key in these instructions.)

Go to your spreadsheet and put your cursor in the top left cell. Press Ctrl+V to paste the data into your

spreadsheet, and you can then save it.
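
If you prefer to handle the copied table with a short program instead of a spreadsheet, here is a minimal Python sketch. It assumes the copied text is tab-separated, which is what most browsers put on the clipboard when you copy a table, and that the first line holds the column headings. The column names and rows are invented for illustration and may not match the headings GEDmatch uses.

```python
import csv
import io

# Invented example of text copied from a results table (tab-separated).
PASTED = (
    "Kit\tChr\tStart\tEnd\tcM\n"
    "M000002\t5\t16000000\t24000000\t12.0\n"
    "A000001\t2\t1000000\t9000000\t11.2\n"
    "A000001\t1\t5000000\t15000000\t20.0\n"
)


def chromosome_order(name):
    """Sort numbered chromosomes first, then anything else (such as X)."""
    return int(name) if name.isdigit() else 23


def pasted_table_to_csv(text):
    """Convert tab-separated table text to CSV, sorted by kit, chromosome, start."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    rows = list(reader)
    rows.sort(key=lambda r: (r["Kit"], chromosome_order(r["Chr"]), int(r["Start"])))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


csv_text = pasted_table_to_csv(PASTED)
print(csv_text)
```

The result is plain CSV text that any spreadsheet program can open. Sorting by kit number, chromosome and start position puts neighboring segments next to each other, which makes overlaps easier to spot by eye.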

Thank you for taking this class. I hope it gets you started in evaluating your results.

A full explanation of our Triangulation Methodology is located on http://dnaadoption.com under Getting

Started. This class is intended as only an introduction to get you started.

In the meantime, there is a lot to read about DNA and lots of resources on the same website.

Good luck in your hunt.

Exercise 6:

Change the Upper Segment Threshold Limit. Observe what it does.

Now try it with one of the kit numbers you observed in the triangulation results. The

results are different. Why?