How to export and backup your Google Hangouts chat history

TL;DR: I’ve written a web tool you can use to view and download your Hangouts chat history. Click this link and follow the instructions.

Google Hangouts as a messaging platform has some advantages over other widely used services. On my smartphone, I mainly use WhatsApp and Threema, but both lack proper additional clients so far: WhatsApp Web can only be used from one device at a time, and Threema has no web or desktop client at all, so communication is limited to the smartphone, which is somewhat annoying for longer conversations. This is what I like about Hangouts: it is available on all my devices, and I can hop in and out of a conversation wherever I am at that time.

The problem

However, a major downside for me is the inability to export your chat. Sometimes you want to re-read some conversation you’ve had, or search for something the chat partner has said, like a link to a website. Now I am used to using IRC (Internet Relay Chat), which is kind of oldschool but – depending on the chat client configuration – can log everything to text files. You can read them, grep them, search for links, nicknames, everything. Log files of conversations are great.

Since I do not have Gmail, I can’t tell how accessible or not the chat history is in the Gmail web interface, but in the Chrome app, Android app and the chat box on the Google+ website, you can only see the last 50 or so messages. If you scroll up more than that, it dynamically loads more messages from the server. Then again, scroll up, and it will load some more. If you had a couple of long conversations, this can take quite a while, and even when you do this, the output produced by selecting all and copy pasting the chat to a text editor is ugly and not useful. So I googled how to export the chat. WhatsApp for example lets you send conversation histories via e-mail, which is great. It seems, however, that Hangouts cannot do this – there is no export/archive function whatsoever.

Getting closer

Yet there is a way to access that data; it’s just not as trivial as clicking an “export to .txt” button. One of Google’s tools, Google Takeout, allows you to download all the data related to your Google account, e.g. your Gmail messages, calendar data, contacts, Google+ posts… and Hangouts conversations. When you do this, it creates a .zip archive containing a file with the chat data, encoded as JSON.

[Screenshot: what Hangouts.json looks like]

This file is full of unnecessary rubbish. I estimate that less than 10% of the file contents are actual chat messages; the rest is metadata such as participant IDs, conversation IDs, read status, timestamps, and lots of other things. This makes the file messy and practically unreadable (note that there is not even one actual chat message in the part I screencaptured).

My Solution

I spent some time analyzing the structure of that JSON and wrote a parser in PHP to turn the JSON into different useful formats. It offers the following views and download options:

  • HTML view, displays the chat nicely formatted in a messenger-like style so you can easily read through it (features clickable links, embedded images etc.)
  • Text view, shows the chat in an IRC-like format that you could copy paste to a text editor.
  • XML view, in a textbox, to be copy-pasted somewhere.
  • XLSX download, all conversations in a single Excel 2007 (Office Open XML) document
  • CSV download, this can be opened in Excel, LibreOffice Calc etc.
  • ZIP with CSVs, all conversations as single CSV files, bundled in a .zip archive to download
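To give an idea of what the IRC-like text view amounts to, here is a minimal Python sketch. It is not the PHP code behind the tool; the message fields (timestamp, sender, text) and the exact line layout are assumptions made for illustration.

```python
from datetime import datetime

def format_irc_line(timestamp, sender, text):
    """Render one message as an IRC-style log line: [date time] <nick> text."""
    stamp = timestamp.strftime("%Y-%m-%d %H:%M:%S")
    return f"[{stamp}] <{sender}> {text}"

def format_conversation(messages):
    """Sort hypothetical message dicts by time and join them into one log."""
    ordered = sorted(messages, key=lambda m: m["timestamp"])
    return "\n".join(
        format_irc_line(m["timestamp"], m["sender"], m["text"]) for m in ordered
    )
```

A log in this shape can be searched with grep for nicknames or links, which is the main appeal of the text format.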

In order to use the parser, you have to…

  1. download your chat history from Google Takeout
  2. (optional: extract the JSON file from the .zip archive you get from Google)
  3. upload it to the parser (zip or extracted JSON).
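If you would rather inspect the file yourself first, the following Python sketch shows one way to read Hangouts.json either straight from the Takeout .zip or from the extracted file. It is only an illustration, not the tool’s own code; the function name is made up, and the only thing it assumes about the archive is that some member’s name ends in Hangouts.json.

```python
import json
import zipfile

def load_hangouts_json(path):
    """Load Hangouts.json from a Takeout .zip, or from an extracted .json file."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            # Take the first archive member whose name ends in Hangouts.json.
            names = [n for n in archive.namelist() if n.endswith("Hangouts.json")]
            if not names:
                raise ValueError("no Hangouts.json found in the zip file")
            with archive.open(names[0]) as handle:
                return json.load(handle)
    # Not a zip archive: treat the path as the extracted JSON file.
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

If the file is not valid JSON, json.load raises an error, which is a quick way to check an export before uploading it.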

It will then show you a list of all your Hangouts conversations, be it group or one-to-one chats, and you can then view or download single conversations.

Screenshots

[Screenshot: conversation list]
[Screenshot: HTML view]
[Screenshot: text view]
[Screenshot: XML view]
[Screenshot: CSV view]
[Screenshot: Print view]

Don’t worry, your text will not be all blurry

Caveats

Google switched from Google Talk to Google Hangouts on May 15th, 2013, and the file you get from the Google Takeout export only contains Hangouts chats. This means that there won’t be messages from before that date in your export.

Older chat messages can be found in the “Chats” folder in Gmail. My tool can only work with Hangouts data; if you want to export Google Talk chats too, there’s a Python script by Clint Olson (@coandco) here.

Click here to use my parser

If you have a suggestion or found a bug, please do leave a comment below! You can also chat with me in #JayCorner on freenode and QuakeNet.

FAQ

Privacy notice aka What happens to my file when I upload it to your server?

The uploaded file will be stored on the server for 24 hours (there is an hourly cron job that deletes uploads older than 24h), so you have enough time to review your conversations and work with the data. After that, you’ll have to re-upload. I am the only person with access to the server, and I promise I won’t read your chats; you’ll have to trust me on that, though. There is now (as of 2015-03-15) also a “delete immediately” button at the bottom of the conversation overview, so you can delete your upload when you’re done.
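For the curious, a cleanup job of this kind could look roughly like the Python sketch below. This is not the actual script running on the server; the function name and the idea of judging age by file modification time are assumptions for illustration.

```python
import os
import time

MAX_AGE_SECONDS = 24 * 60 * 60  # uploads are kept for 24 hours

def delete_old_uploads(upload_dir, now=None):
    """Delete regular files in upload_dir last modified more than 24h ago."""
    now = time.time() if now is None else now
    with os.scandir(upload_dir) as it:
        entries = list(it)  # snapshot the listing before deleting anything
    removed = []
    for entry in entries:
        if entry.is_file() and now - entry.stat().st_mtime > MAX_AGE_SECONDS:
            os.remove(entry.path)
            removed.append(entry.name)
    return removed
```

Run once per hour, such a job means an upload lives for at most about 25 hours.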

Why does the parser show “unknown_102900492317555965172” instead of the name for some people?

I don’t know. Every person has what Google calls a “gaia_id”, it’s just a long number like 102900492317555965172. This is what identifies that person. In most cases, the person’s name is given in the Hangouts export file too, but sometimes it’s missing. I don’t really know why, maybe they have some stricter privacy settings than most of the people, or maybe they don’t have a Google+ profile, or a private one, or changed their name, … I could only guess. I couldn’t find out the reason so far. Sorry.
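The fallback itself is simple. A Python sketch of the idea (again not the tool’s PHP code; names_by_id is a hypothetical mapping from gaia_id to whatever name the export provides):

```python
def display_name(gaia_id, names_by_id):
    """Return the participant's name, or "unknown_<gaia_id>" if none is known."""
    name = names_by_id.get(gaia_id)
    return name if name else f"unknown_{gaia_id}"
```

So whenever the export contains no name for an ID, the placeholder is all the parser can show.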

How can I save the HTML view with all the images?

Just open the HTML view and use your browser’s “File -> Save as…” function. Make sure to choose “Complete page” or “HTML with all images” or something like that as format, to avoid only saving the html page but nothing else.

Will you publish the source code so I can run the parser on my own server?

Yes and no. I will not publish the source code of the whole parser, and by that I mean the website layout, the uploader, and all the methods for exporting/viewing CSV, XLSX, HTML etc. What I did publish, though, is the PHP method that actually parses a Hangouts JSON file into a PHP array of conversations. You are free to build your own parser around that.

Updates

Update #1 2014-12-29: the CSV link now generates a file download instead of showing a large text box with comma separated text.

Update #2 2014-12-30: I have made some more improvements to my parser and moved the actual parsing code to a separate function, which I published here.

Update #3 2015-01-02: Memory optimizations, fancy HTML5 uploader, view as XML, proper error messages, date picker in HTML view, subtle CSS improvements, bugfixes

Update #4 2015-01-04: Button to download a .zip containing CSVs of all convos.
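A rough Python sketch of how such a zip of CSVs can be built in memory follows. It is an illustration only, not the tool’s implementation; the column names, the message fields, and the numbered file names are assumptions.

```python
import csv
import io
import zipfile

def conversations_to_zip(conversations):
    """Build an in-memory .zip with one CSV file per conversation.

    `conversations` is a list of conversations, each a list of hypothetical
    message dicts with "timestamp", "sender" and "text" keys.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for index, messages in enumerate(conversations, start=1):
            text = io.StringIO()
            writer = csv.writer(text)
            writer.writerow(["timestamp", "sender", "message"])
            for msg in messages:
                writer.writerow([msg["timestamp"], msg["sender"], msg["text"]])
            # Number the files so that two conversations never share a name.
            archive.writestr(f"{index:03d}.csv", text.getvalue())
    return buffer.getvalue()
```

The csv module takes care of quoting, so messages containing commas or line breaks still open correctly in Excel or LibreOffice Calc.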

Update #5 2015-01-10: Added XLSX export (1 conversation per sheet)

Update #6 2015-03-15: Added “delete my upload now” button to the conversation overview after being asked by someone to delete their upload

Update #7 2015-03-21: Added sortable “time of last message” column to conversation list, added expand/collapse to members column when there are more than eight members in a conversation, added message date range filter (makes the conversation list show only conversations with at least one message in the given date range), added selection checkboxes to conversation overview so you can select one, many or all conversations to download as XLSX or zipped CSV. Updated the screenshot of the conversation overview to reflect the changes. Also, the message order in the HTML view can now be reversed (newest message at the top, oldest at the bottom). Some of these changes were based on suggestions by @JMG, kudos to him!
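The date range filter boils down to an “at least one message in range” check. A minimal Python sketch, assuming each conversation is a dict with a "messages" list and each message carries a "timestamp" (both are hypothetical structures, not the tool’s actual data model):

```python
from datetime import datetime

def filter_by_date_range(conversations, start, end):
    """Keep conversations with at least one message in [start, end]."""
    return [
        conv for conv in conversations
        if any(start <= msg["timestamp"] <= end for msg in conv["messages"])
    ]
```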

Update #8 2015-04-26: You can now upload the zip file directly instead of first extracting the Hangouts.json file from it and uploading that, this will make uploads much faster and reduce traffic. Thanks @HIM357 for the suggestion.

Update #9 2015-05-29: Fix for chat timestamps being in GMT+1 instead of the user’s local timezone. Thanks @SC for the bug report and @amh for his help with testing the fix.
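The idea behind the fix is that the browser reports its time zone and the server shifts the timestamps accordingly. Here is a Python sketch of that conversion (the real tool does this in PHP). It assumes the browser reports its offset the way JavaScript’s Date.getTimezoneOffset() does, and it uses one fixed offset for all messages, so it ignores daylight saving changes.

```python
from datetime import datetime, timedelta, timezone

def to_local_time(utc_time, js_offset_minutes):
    """Convert an aware UTC datetime using a JavaScript-style offset.

    JavaScript's Date.getTimezoneOffset() returns UTC minus local time in
    minutes, so GMT+8 is reported as -480; the sign is flipped here.
    """
    local_zone = timezone(timedelta(minutes=-js_offset_minutes))
    return utc_time.astimezone(local_zone)
```

For a user in GMT+8, a message sent at 12:00 UTC would then be displayed as 20:00.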

Update #10 2015-06-13: Added .txt download for single conversations and zipped .txt download for multiple conversations. Thanks @Tomcat for the suggestion.

Update #11 2015-07-06: The hangout parser is now SSL encrypted! Requests to http://hangoutparser.jay2k1.com will be redirected to https://hangoutparser.jay2k1.com. Shoutout to StartCom for providing the certificate!

Update #12 2015-08-07: In the HTML view, the current date is now always visible at the top. It took some CSS fiddlery to make it the way I wanted it to be, and now I’m quite satisfied with it. Thanks to @Ray890 for joining me on IRC and suggesting it!

Update #13 2015-09-23: I made a secondary CSS stylesheet for printing, so if you print the HTML view now, it will look kind of good. I also added a screenshot above to show what it looks like.

Update #14 2015-10-15: Wooooo, 10,000 uploads! I’d never have expected this to be so successful and useful to so many people. Thank you guys! Having coded something that many people use feels great!

Update #15 2016-04-30: Wooooo, 20,000 uploads! This is great. The second 10,000 uploads took just about six and a half months, which is about half the time of the first 10,000. It’s really great to know something I coded is useful to so many people!

426 Comments

  1. Needed some informations from two years ago, and your work helped me go through all my archive without having to scroll up through it! Great job, will definitely recommend.

  2. I tested your parser, and all but one very big conversation was included. Does your parser have an upper limit of the size of conversations? The title of the conversations includes the Hangout emojis \uD83D \uDCF1 – does your parser have a problem with this?

    1. I haven’t encountered a conversation size limit yet, and I didn’t set any. Emojis are mostly being replaced, but I can see errors in the log file. I’m looking into it and will get back to you asap.

  3. Every time I try to browse to the file, it loads to 40% and then the following message pops up. I tried to re-download, but it does not help.
    “Error: The JSON inside the zip file does not seem to be a valid Hangouts.json file”

    1. That most probably means that it is indeed not a valid file. You could unpack the zip file on your computer and open the Hangouts.json in a text editor to check. My logs say that the uploaded file did not contain any text…

  4. Will this work on deleted chats? My Google Hangouts history got deleted when someone disabled my account. G+ is up and running fine, but it says I have no chat history. It was devastating, and I have searched the internet in the weeks since, hoping to find a solution. I was told that once they get deleted they are gone forever. I can’t accept that; I know I can’t magically put them back in the chat, but surely there is a way to recover the content. Surely they must be stored on my hard drive? Please can you help me?

    1. Hi Jacqq, I am not sure about that, but I don’t think the chats are actually on your computer. I would recommend following the instructions in this blog post and downloading the chat history from Google Takeout. If it’s not in there, I’m afraid it’s irretrievably gone. I wish you best of luck!

  5. Could you help me to extract data from json file, which is google search history.

  6. Good work!
    I really appreciate your idea and the tool!

    However could you please add an option “Save as…” direct raw text file ?
    It would be more convenient, than opening TXT as HTML page, and then selecting whole text, then Copy/Paste into text editor.

      1. Thank you very much!
        It is so convenient now

        One suggestion anyway:
        could you please decode LT/GT chars around username in the TXT file ?
        It is already well done in the TXT when downloaded as ZIP,
        but it is not decoded when downloading a single conversation as TXT.

    1. Jay I read the thread on exporting hangouts to use in emails. I am computer illiterate. Can you please update step by step for 2015? Also can you use a cdma sim card from straight talk and put it in any cdma phone and it will work? Same thing for gsm?
      Lastly my phone is not recognizing my sd card. It is an att Samsung go phone. What do I do to get data on there?
      Thank you
      Terri

  7. This is very well developed, thanks for sharing the idea and service.

    I was actually looking for a way to archive the chat that happens *inside* the hangout meetings as notes; this seems to pickup the chat that takes place outside (or am I looking at it wrong??)

    1. Well, this tool was meant to backup the hangouts chats as I use them – and I use the hangouts chat only, like a messenger like WhatsApp. I don’t think I ever used the video chat/conference function which I believe is what you mean. I don’t know if the chat messages that were sent during a video conference are part of the Takeout archive, but if they are, I can make that happen I guess. Would you mind me contacting you on hangouts via the address you provided to talk about some details?

    1. I don’t use it, so I neither know what it is, nor what data you get from Takeout, so I’m afraid that’s a no.

  8. Hi, I am a user in Hong Kong (GMT +8). It is a very useful tool and I rely heavily on it to extract the messages. However, I guess owing to the time zone difference, the date / time presented by your tool is not correct. Any chance you can get this fixed? Much appreciated. Thanks.

    1. I actually have an idea on how to accomplish this. Since the script is fully PHP, it cannot know which time zone you are in, but I will determine this with some JavaScript and store your time zone in a cookie, which will then be read out by PHP so it can adjust the displayed timestamps accordingly.
      I am somewhat busy at the moment, so it might take a few days until I get around to code this.
      Thanks for the great idea though! I didn’t even think about this.

      1. Holy smokes, I can’t believe nobody brought this up yet! Anyway, I was able to confirm this bug and code a fix. It should work properly now. Thanks to @amh from Australia for helping me test it.

  9. Worked amazingly,, didnt expect this much.. Great effort thank u so much.. Google has to learn this one.

  10. This application is so useful!! Exactly what I have long needed. Thank you so much for developing it! Awesome!

  11. Hi Jay,
    I was absolutely looking for a way to work with these JSON files. Actually, I’m getting an error after I upload the JSON…

    Notice: Undefined offset: 0 in /home/Jay2k1/wwwroot/hangoutparser.jay2k1.com/www/parsehangouts.php on line 257 Notice: Undefined offset: -1 in /home/Jay2k1/wwwroot/hangoutparser.jay2k1.com/www/parsehangouts.php on line 258

    This is shown at the top of the page , and im not getting the lines with the conversations on the arrays, am i doing something wrong?

    I tried with the JSON, and the zip file, same error on both (it’s only 4KB)

    1. Hi, I think this could happen if you submit a file without messages. I fixed it, check again. ~Jay

      1. Thank You Jay! now it is working, i found out i was running on GTalk til now, i just switched to Hangout and try it out! its working great now! that u fix it even without messages
        I really appreciate your support and your tool!
        ill be in touch, C ya around

  12. Thank you Jay so much for putting this together! I think it speaks well about a person that is willing to put the work into creating something that can be used anonymously by thousands or millions of people. Maybe only a small percent express their due appreciation to you but I think you deserve much more praise!
    “What we do in life, echos in eternity”

    You’ve inspired me to get moving on creating more tools like this that others can use.
    Be well!
    ~brad

  13. I am excited to use your program and thank you for the effort put in to it! My archive is 800MB and your limit is 500 on the server. Are there any options out there? Thank you!!

    1. Sure, as of yesterday you can upload the zip file directly, that should work. Or are you saying your zip archive is 800MB?

      1. Yes, the archive is 800MB in this case

        It’s an ongoing conversation with my girl for the last two years. We were hoping to back it up and your program and the HTML output looks amazing.

  14. This looks like just the tool I’ve been looking for.

    Would you be willing to add support to extract the JSON from the Google Takeout archive on the server? 16MB is much easier to upload than 300MB.

    1. I’m glad you like it
      I’ve been thinking about allowing upload of the ZIP archive already. The reasons against this were that I can’t validate the file before uploading, and, much more importantly, do not know the file size of the actual JSON file before uploading. It couldn’t be too large because of memory usage issues. I have since managed to improve the code so it now consumes less memory, and also I have upgraded to a new server with more memory and a faster CPU, so this shouldn’t be an issue anymore.
      I think I’ll look into this again now. Thank you for your suggestion!

      1. Thank you for looking into it so quickly. I’ll test it out now.

        Another option could be to create an offline version that processes the file on the user’s machine.
