May9

22 Comments

UPDATE (11 July 2013): It has been more than a year since I created this tool and process, and the 23andme site has gone through some extensive changes. I have not had time to test if the process in this post nor the spreadsheet is still working properly with any possible data format changes.

Feel free to give it all a shot, but I recommend that you may first want to take a look at the DNAGedcom.com site, which has automated things in a much more user-friendly way!

Ancestry Finder Drop DownIf you are a 23andme customer, you may have noticed that they updated their Ancestry Finder (AF) tool to make it possible to see the matches of the people with whom you share genomes. You can also download this data in comma-separated-variable (.csv) format.

That. is. great. stuff.

I’m sure that people will soon come up with interesting new ways to triangulate their genetic matches and learn more about their ancestry. But, there is one problem: If you are sharing your results with a few hundred people and want to get all of their AF data files, it takes a drop-down selection, wait a few seconds for AF to reload, then a button click and a Save File As click FOR EACH PERSON IN YOUR LIST.

Holy carpal tunnel.

I suppose 23andme might make it possible within their system to get them all at once which will make this all moot. But, I thought there must be a better way to get all of these data files and I found a slightly clunky, yet completely workable method.

There may be more elegant ways to do this, but for now I created an Excel spreadsheet that will help you get you all the data with only a bit copying and pasting – along with some help from a Firefox add-on.

First off, some Caveats

  • I am sorry to say I cannot promise I will be able to support or answer all questions on how to use this sheet. I am trying to make it as simple as possible, but for those that might be a bit technologically challenged you may feel it got ugly.
  • My hope with this post is that if you just follow the step-by-step you’ll be able to get there. (Most of this info is also found in my spreadsheet.) I suppose if you find this method difficult, then there’s a good chance you don’t want to download and massage the data files anyway.
  • This tool is designed to help people get the 23andme AF tool data they have legitimate access to more efficiently. 23andme’s terms and agreements obviously still apply and nothing here is supposed to work around them in any way.

In any case – Please Link Responsibly.

A bit more background and then on with the process.

Code with Profile IDWhen you visit the AF tool, part of the interface that loads is a drop-down box with the names of ALL the users you share with and whose results you can view. If you look at the page source, it turns out that the code for the drop down box also includes the user’s encrypted profile_id (nothing non-secure, I believe, just a way to pull up the right data.)

Using that profile_id and the code attached to the Download Results button, I was able to reconstruct the link for each profile in the list. Then I simply created a spreadsheet that pulled out the profile_id and appended it to a simple download link.

Download Button Code

Executive Briefing

Below is a detailed, step-by-step process for the downloads. Since that might be confusing at first look, here’s an Executive Briefing on what we will be doing:
1. Logging into 23andme and going to Ancestry Finder.
2. Copying a section of the Ancestry Finder web page HTML source code and pasting it into my pre-created handy-dandy spreadsheet.
3. Saving the Excel spreadsheet as an HTML web page and opening that in Firefox.
4. Using a Firefox download manager add-on to download all the links at once.
5. Play with the data!

Step By Step

Read through this process before you attempt it – down near the bottom you will see I use a Firefox browser add-on and you will need to have it (or a similar one) installed to download the data.

First Things First – Getting Ready

Download and open the Excel spreadsheet I created.

Download and install the DownThemAll! Firefox Extension.
(You will need to restart the browser before it will work properly.)

Login to 23andme. IMPORTANT: Keep your browser open and remain logged in to 23andme during this process or the downloads will not work!

Ancestry Finder ControlsGo to the Ancestry Finder Page

Before you worry about working with the Advanced Controls: They only appear to affect the visual display and do not affect the content of the downloaded data file.

Get the Source Code

View Source MenuSomewhere on the WHITE SPACE of the page, RIGHT click your mouse. A browser menu will appear: LEFT click “View Page Source.” This will open a new page with the HTML code of the web page. (Note: Do not try to do this on or near the chromosome visual or you will get a different menu.) You do not need to understand the code, just follow on.

Update: I realized later that due to the way I wrote the Excel formula, there is a simpler method to this next part than selecting just a section of the HTML code. I will leave the selection process here for reference, but:

You can select the entire sheet and paste it into the Excel sheet and it will work fine! When on the HTML Source page, use the keyboard shortcut <Ctrl A> to select all of the page, then <Ctrl C> to copy it to the clipboard. (Use the <Command> key on a Mac.)

Now move down to the Paste the Code Into the Spreadsheet” section below to continue.

Start of Code to SelectScroll down the page source until you find a section that starts with:
<!– Start profile select section. –>

You can use the find function <Ctrl F> to get there quicker, but it will be about a third of the way down the page.

It should be easy to spot because following it you will see the code for the list of user profiles you can see in AF.

Select Code

Note: Code edited to make example image shorter

Use your mouse to select the text starting with the <!– Start profile select section. –> line and highlight ALL THE WAY DOWN to the end of the last line with a profile name in it. (They all start with <option value=””).

If you select more lines than that, don’t worry. I have built in some cleanup in this sheet to ignore it.

With that entire section of code selected, RIGHT Click on the highlighted text and then LEFT click on Copy.

This will copy the text to your clipboard. (For those who are used to using keyboard shortcuts, <Ctrl C> works too.)

Paste the Code Into the Spreadsheet

Paste Into ExcelReturn to the “Code Cleanup” tab in the spreadsheet and paste the text you copied starting in Cell A9. There is a note above that cell that says PASTE BELOW HERE. (All other cells in that sheet are protected.)

Excel Clean Data & LinksTo the right of that code, you should now see a cleaner version of the User ID and Profile Name, along with a live link to each data file and the link text itself in the last two columns.

At this point you could click on each individual link to download the data files – but this is almost as cumbersome as before. So, we will use a browser add-on to automate the process.

If you have not done so already, download and install the DownThemAll! extension referenced at the start of this list.

Remember: If you just installed DownThemAll! you will have had to restart Firefox for it to work. If you need to restart Firefox now you will probably need to log back in to 23andme again and keep the browser open.

Convert the Spreadsheet to a Web Page

To use the DownThemAll! extension, you will need to save the spreadsheet as an HTML file and open it in Firefox. So: In Excel, click File… Save As then in the “Save As Type” box, select “Web Page” then click Save.

(If you get a message warning about compatibility, just click Yes.)
Save As HTML

Now that you have the file as a simple web page of links, you can just drag it into Firefox to open it. (Preferably in a new tab.)

The web page should look just like it appears in the Excel spreadsheet. If, when you open the page you find you are looking at the reference information instead of the links, simply click the “Code Cleanup” tab at the bottom of the page.

Download the Data Files

Download AllNow, we’ll use the add-on to get the files – RIGHT Click on some white space in the page, then LEFT click on DownThemAll!

Select AllA DownThemAll! window will appear with all of the links listed. RIGHT click in the window and LEFT click “Select All”

Check AllOnce all are selected, again RIGHT click in the window, then LEFT click “Check Selected Items”

Select a location to save all of the files in the bottom of the window, then click Start! To begin the process

DownThemAll! will show you a progress listing of the files being downloaded. Once completed, you can close the window and any browser tabs and Excel files if you wish.

At this point, the download folder you selected should have one .csv data file for each person in your Ancestry Finder list (including yourself.)

While that took a little bit of work, much easier than doing it manually – especially if you plan to download the data at regular intervals as more people are added to the database!

Be Sociable, Share!

22 Comments

  • avatar

    Comment by Robert P — May 10, 2012 @ 11:54 am

    John,

    Thanks for developing the excel file and downloading tool. I came to your blog after seeing your post on the Triangulation thread at 23andme. I have followed the steps and I have a question about downloading all the links with Down Them All. I get a message that says: File name conflict!

    “A file named “signin.htm” already exists. What do you want to do with download “https://www.23andme.co…….

    The choices are:
    Rename into “signin_001.htm”
    Overwrite existing file
    Skip/Cancel

    Which option should I select?

    Thanks

  • avatar

    Comment by John — May 10, 2012 @ 12:05 pm

    Hi Robert,
    You’re Welcome!

    Luckily I had seen that same error when I was testing things out and that should be easy to work around: What I believe is happening is that you are not logged into 23andme in Firefox.
    I would open a new tab in Firefox and go directly 23andme and log back in, then re-run the DownThemAll! steps.

    More info on what happened:
    If you are not logged into 23andme when you run the DownThemAll! steps, as the add-on tries to access those data file links, the 23andme site redirects your request to a login page – once for each file it is trying to download.

    So, instead of getting the data file, when hitting the first link DownThemAll! downloads the “signin_001.htm” web page. Then when it tried to access the 2nd link it did the same thing and tried to download another copy of “signin_001.htm”.

    So, DownThemAll! was asking if you wanted to overwrite the one you already had.

    .JT.

  • avatar

    Comment by Kasandra — May 12, 2012 @ 2:15 pm

    Great Help!

    I had already macroed bringing my shared matches to the top and shrinking the pesky AJ columns. This makes it a few button pushes total to get what I want. Thanks!

    The only suggestion I would make is that you can delete the line for peeps you are sharing with but don’t have anything in common with prior to turning it into an html doc.

    Thanks a lot!
    Kasa

  • avatar

    Comment by John — May 12, 2012 @ 5:31 pm

    The html code on the original 23andme AF page doesn’t have any data in it – just that list of people and their IDs, so there’s no way to tell whom you share segments with.
    When I get back to massaging my csv data, I’ll see if I can come up with a simple way to check that in the imported data sheet. Perhaps comparing your own csv results and then filtering other rows based on your own non-sharers.
    I started using a nice filtering interface I wrote for some Engineering data analysis that might port over for this purpose with some tweaks…

  • avatar

    Comment by Mark — May 15, 2012 @ 11:49 am

    Thank you very much for this very useful utility it saved me a ton of time and well worth the time required to set everything up.

    Mark

  • avatar

    Comment by Rob Warthen — June 6, 2013 @ 9:52 am

    As another option, you can download this file from the DNAGedcom.com web site. The site logs into the 23andMe web site for you and does all these steps. It gives you your Family Inheritance: Advanced, Ancestry Finder and matches sheets all in one big shot. You do have to give the site your password, but the password is only used the one time to download the information and is not retained. You can also change your password, run the data and then change it back if you are concerned.

  • avatar

    Comment by John — June 6, 2013 @ 10:03 am

    Hi Rob – Terrific! Thanks for your comment.

    I went through all of this when it first came available, so a more automated tool is certainly welcome… I had taken a look at your site for a bit back in January, but hadn’t had time to go back and really work with it again.

  • avatar

    Comment by eas — July 11, 2013 @ 3:15 pm

    So I did everything according to the instructions, but I do not get the links after I paste the data. I am VERY familiar with excel, so I started to remove protection and play around with it, but I thought I would post here first. Any ideas on why nothing happens?

    Thanks!

  • avatar

    Comment by John — July 11, 2013 @ 3:21 pm

    Hello eas,
    Please feel free to play around – I didn’t do anything too advanced in there and power users should be able to figure things out.

    I wrote this download tool more than a year ago and 23andme has gone through a drastic site design change. I haven’t tested the download steps in several months, so not sure if everything still works the same, or even if the data is in the same format. (They have changed their data download formats before.)

    You might want to go check out the DNAGedcom.com site, which has automated the process since my early days at creating this.

    In the meantime I will try to see if my spreadsheet and process is still useful at all or should just be retired in favor of the other folks’ tools.

    I’m going to put a caveat at the beginning of this post so others are aware it might no be working properly now. Thanks!

  • avatar

    Comment by eas — July 11, 2013 @ 3:24 pm

    well never mind, sort of. I see the problem. The spaces are all completely different in the source code than what the formula is actually looking for.

  • avatar

    Comment by eas — July 11, 2013 @ 3:26 pm

    The formulas just need to be reworked. I did it with the first line to check, but then I read about the other tool at DNAGedcom.com. As much as I love excel, I think I will test that out first. LOL.
    Thanks for the quick reply!

  • avatar

    Comment by John — July 11, 2013 @ 3:35 pm

    I hear you! Yep – took a peek and see that while the content is still the same, there’s some extra blank characters that my formulas don’t take into account. Probably should rewrite them using the TRIM() function to just blank out all of the spaces and then pull the data from that.

    But… probably better to let the other site’s tool do it for everyone. ;)

  • avatar

    Comment by Rob Warthen — July 11, 2013 @ 3:46 pm

    Let me know if you have any problems with DNAGedcom.com. I’ve tried to keep it up to date with any new information that people may want and there were some changes made when 23andMe went to their new format.

  • avatar

    Comment by Rob Warthen — July 11, 2013 @ 3:47 pm

    By the way John, we are always looking for more citizen scientist programmers at the site. :)

  • avatar

    Comment by John — July 11, 2013 @ 3:54 pm

    Rob,
    Sounds great!

    Although I’m definitely not so much a programmer as a competent slicer and dicer of data. Probably time for me to pick up some new skills so I don’t spend so much time making problems fit into my limited toolset. ;)

  • avatar

    Comment by eas — July 11, 2013 @ 4:44 pm

    I actually just tried DNAGedcom.com and I noticed that there are at least 2 people that were left out of my Family Inheritance Advanced file. Can’t win.

  • avatar

    Comment by eas — July 11, 2013 @ 4:53 pm

    Actually, there were 3 out of 53 that were left off.

  • avatar

    Comment by Rob Warthen — July 11, 2013 @ 4:54 pm

    Can you send me the details at support@dnagedcom.com and I’ll get it fixed.

  • avatar

    Comment by John — July 11, 2013 @ 4:57 pm

    How’s that for support?
    I’ll see if I can hack my Excel formula to account for extra spaces in the page source html so mine might work again too…

  • avatar

    Comment by eas — July 11, 2013 @ 5:01 pm

    Awesome. Will do. Not so sure about what I got in the AC file either. Looks like quite a bit of info was missing. I’ll send an email to you asap.

  • avatar

    Comment by John — July 11, 2013 @ 5:11 pm

    OK – I have updated my AF Download spreadsheet. Looks like it is working when I used my own data on Ancestry Finder / Countries of Ancestry, or whatever they are calling it today.

    Note: You are probably better off using my original steps of just copying the lines starting with “< !– Start profile select section. –>” in the page source…

    Hope it works!

    https://app.box.com/s/32217716434a1df24758

  • avatar

    Comment by Rob Warthen — July 11, 2013 @ 5:15 pm

    The AC file is the Ancestry Composition Information. All the data is there. I eventually will get the data into the right level, but for now, you get the different levels (continent, country, etc) all in the one value.

RSS feed for comments on this post. TrackBack URL

Leave a comment


+ nine = 18