DNA Clustering Using the Leeds Method

For many of you, getting a list of your DNA matches was very exciting. Suddenly, you are presented with hundreds, if not thousands, of people who are genetically connected to you. However, sorting through those matches can be a daunting exercise. Where do you even begin to confirm these matches? The most obvious approach is to start with the DNA matches that are closest to you, such as immediate family and first and possibly second cousins. Once you have some initial matches confirmed, moving beyond them to your more distant cousins can be a struggle. One technique often used to sort your matches is called CLUSTERING. This involves creating groups of matches based on the DNA they share with each other, in an effort to confirm a common ancestor.

This process can be done manually using your match list from any DNA testing company. This methodology is known as the “Leeds Method”, named after Dana Leeds, who first introduced it. The minimum and maximum amounts of shared DNA you use for this method will affect your results. Generally, I recommend a minimum of 90cM and a maximum of 400cM. Keep in mind that matches in this range can include individuals from 1st cousins to 5th or even 6th cousins. It is recommended that you leave 1st cousins out of the list. Here are the general steps:

1. List your DNA matches in a spreadsheet down the first column, from the highest amount of shared DNA to the lowest. In a second column, include the amount of shared DNA (cM) between you and each match. I have used my Ancestry matches as an example but have privatized the names.

2. Starting with the first DNA match, look at the shared matches they have at the testing site. In a new column, assign a colour to that match and mark the same colour for everyone who shares DNA with them.

3. Select the next person on the list who does not have a colour (DNA Match #2) and repeat Step #2, assigning a different colour.

4. Continue this process, each time selecting the next person who has not been assigned a colour. Mark all of their shared matches with a new colour until everyone has been assigned at least one colour. In my example, which includes all my Ancestry DNA matches between 40cM and 500cM, I end up with 16 colours. Note that I extended my range because I was looking for more distant relationships. I would still recommend sticking to 90-400cM.

5. You may notice some overlap, where a match falls into multiple clusters. This is normal and simply means that one cluster may represent a different generation than another.

6. Add 2 more columns beyond your cluster colours. In the first, indicate the relationship (if known). In the second, indicate the common ancestral couple you share with the DNA match (if known).

7. Finally, copy the known MRCA (most recent common ancestors) into each of the coloured clusters for that match.

8. You will notice that each coloured cluster represents a common ancestral line. Again, you may find multiple generations within a cluster. Generally, a cluster represents the DNA shared with anyone descended from the furthest generation in it.

9. Continue to research your DNA matches, focusing on those with the largest online trees, as their common ancestors will likely be the easiest to determine. Continue to add the information to your clustering spreadsheet.
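
If you are comfortable with a little programming, the colouring in Steps 1 to 5 can also be sketched in Python. This is only a rough illustration of the manual process described above, not an official tool. The function name, the match names, the cM values and the shared-match lists are all made up for the example, and it assumes you have already copied each person's shared matches from the testing site by hand.

```python
from typing import Dict, List, Set, Tuple


def leeds_clusters(
    matches: List[Tuple[str, float]],
    shared: Dict[str, Set[str]],
    min_cm: float = 90,
    max_cm: float = 400,
) -> Dict[str, List[int]]:
    """Assign cluster ("colour") numbers to matches (hypothetical helper)."""
    # Step 1: keep matches inside the cM range, sorted highest to lowest.
    in_range = sorted(
        ((name, cm) for name, cm in matches if min_cm <= cm <= max_cm),
        key=lambda m: m[1],
        reverse=True,
    )
    names = [name for name, _ in in_range]
    colours: Dict[str, List[int]] = {name: [] for name in names}

    next_colour = 0
    for name in names:
        # Steps 3-4: only a match with no colour yet starts a new cluster.
        if colours[name]:
            continue
        next_colour += 1
        colours[name].append(next_colour)
        # Step 2: give the same colour to everyone who shares DNA with them.
        for other in shared.get(name, set()):
            if other in colours and other != name:
                colours[other].append(next_colour)
    return colours


# Made-up data: (match name, shared cM) and each person's shared matches.
example_matches = [
    ("Match A", 350), ("Match B", 280), ("Match C", 210),
    ("Match D", 150), ("Match E", 95), ("Match F", 60),
]
example_shared = {
    "Match A": {"Match C", "Match E"},
    "Match B": {"Match D", "Match E"},
    "Match C": {"Match A"},
    "Match D": {"Match B"},
    "Match E": {"Match A", "Match B"},
}

result = leeds_clusters(example_matches, example_shared)
for name, cluster_numbers in result.items():
    print(name, cluster_numbers)
```

In this made-up data, Match E ends up in both clusters, which is the overlap described in Step 5, and Match F is left out because 60cM falls below the 90cM minimum. The colours are just numbers here; you would still add the relationship and common ancestor columns yourself.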

The advantage of taking this step in your genetic genealogy process is that you can then determine, to the best of your ability, which family line a DNA match is likely to belong to before contacting them. This allows you to provide a filtered list of potential surnames that you may have in common. You are much more likely to receive a response if you can show that you have taken the time to do some research.

If you tested with MyHeritage (https://www.myheritage.com/), or have uploaded your raw DNA file to MyHeritage and have access to their DNA tools, you are in luck! Autoclusters was introduced in 2019 and can automatically produce a file with your ancestral clusters, without you having to perform all the steps above. Similarly, for those who tested with FamilyTreeDNA or 23andMe, Autoclusters is also available at the Genetic Affairs website (https://geneticaffairs.com/).



1 thought on “DNA Clustering Using the Leeds Method”

  1. Sandra K Peters

    Very interesting! Ben in my DNA I find a high Orsborn count. My father was adopted, and my mom’s mother does not know who her father was. I have tried to contact the person, but they have never answered. Do you think it is possible that he may belong to one of them?
