TL;DR: I’ve written a web tool you can use to view and download your Hangouts chat history. Click this link and follow the instructions.
Google Hangouts as a messaging platform has some advantages over other commonly used services. On my smartphone, I mainly use WhatsApp and Threema, but both lack other clients so far: WhatsApp Web can only be used from one device at a time, and Threema lacks a web/desktop client, so communication is limited to the smartphone, which is somewhat annoying for longer conversations. This is what I like about Hangouts: it works on all my devices and I can hop in and out of a conversation wherever I am at that time.
The problem
However, a major downside for me is the inability to export your chat. Sometimes you want to re-read some conversation you’ve had, or search for something the chat partner has said, like a link to a website. Now I am used to using IRC (Internet Relay Chat), which is kind of oldschool but – depending on the chat client configuration – can log everything to text files. You can read them, grep them, search for links, nicknames, everything. Log files of conversations are great.
Since I do not have Gmail, I can’t tell how accessible or not the chat history is in the Gmail web interface, but in the Chrome app, Android app and the chat box on the Google+ website, you can only see the last 50 or so messages. If you scroll up more than that, it dynamically loads more messages from the server. Then again, scroll up, and it will load some more. If you had a couple of long conversations, this can take quite a while, and even when you do this, the output produced by selecting all and copy pasting the chat to a text editor is ugly and not useful. So I googled how to export the chat. WhatsApp for example lets you send conversation histories via e-mail, which is great. It seems, however, that Hangouts cannot do this – there is no export/archive function whatsoever.
Getting closer
Yet there is a way to access that data; it’s just not as trivial as clicking an “export to .txt” button. One of Google’s several tools, Google Takeout, allows you to download all your Google account related data, i.e. your Gmail messages, calendar data, contacts, Google+ posts… and Hangouts conversations. When you do this, it will create a .zip archive containing a file which contains the chat data, JSON encoded.
This file is full of unnecessary rubbish: I estimate that less than 10% of the file contents are actual chat messages. The rest is metadata such as participant IDs, conversation IDs, read status, timestamps, and lots of other things. This makes the file unreadable and messy (note that there is not even one actual chat message in the part I screencaptured).
My Solution
I spent some time analyzing the structure of that JSON and wrote a parser in PHP to turn the JSON into different useful formats. It offers the following views and download options:
- HTML view, displays the chat nicely formatted in a messenger-like style so you can easily read through it (features clickable links, embedded images etc.)
- Text view, shows the chat in an IRC-like format that you could copy paste to a text editor.
- XML view, in a textbox, to be copy-pasted somewhere.
- XLSX download, all conversations in a single Excel 2007 (Office Open XML) document
- CSV download, this can be opened in Excel, LibreOffice Calc etc.
- ZIP with CSVs, all conversations as single CSV files, bundled in a .zip archive to download
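To make the difference between the text and CSV outputs concrete, here is a minimal sketch in Python (not the PHP the parser is written in). The message fields `time`, `sender` and `text`, the sample data, and the exact line layout are assumptions for illustration only; they are not taken from the parser or from the Hangouts export.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical parsed messages; the keys "time", "sender" and "text" are
# illustrative and do not come from the Hangouts export format.
SAMPLE_MESSAGES = [
    {"time": datetime(2015, 1, 2, 18, 30, 5, tzinfo=timezone.utc),
     "sender": "Jay", "text": "Hello, world"},
    {"time": datetime(2015, 1, 2, 18, 31, 0, tzinfo=timezone.utc),
     "sender": "Harry", "text": "Hi!"},
]


def to_irc_text(messages):
    """Render messages as one IRC-style log line each."""
    lines = []
    for msg in messages:
        stamp = msg["time"].strftime("%Y-%m-%d %H:%M:%S")
        lines.append(f"[{stamp}] <{msg['sender']}> {msg['text']}")
    return "\n".join(lines)


def to_csv(messages):
    """Render messages as CSV text with one column per field."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["time", "sender", "message"])
    for msg in messages:
        # csv.writer quotes fields that contain commas or quotes.
        writer.writerow([msg["time"].isoformat(), msg["sender"], msg["text"]])
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_irc_text(SAMPLE_MESSAGES))
    print(to_csv(SAMPLE_MESSAGES))
```

Writing the CSV through a proper CSV writer rather than joining strings matters because chat messages routinely contain commas, quotes and line breaks.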
In order to use the parser, you have to…
- download your chat history from Google Takeout
- (optional: extract the JSON file from the .zip archive you get from Google)
- upload it to the parser (zip or extracted JSON).
It will then show you a list of all your Hangouts conversations, be it group or one-to-one chats, and you can then view or download single conversations.
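Accepting either the .zip or the extracted JSON can be sketched roughly like this in Python. This is an illustration under assumptions, not the parser's actual upload code: the function name `load_hangouts_json` is made up, and picking the first `.json` entry in the archive is a simplification.

```python
import io
import json
import zipfile


def load_hangouts_json(data: bytes):
    """Decode an upload that is either a Takeout .zip or a bare JSON file."""
    buffer = io.BytesIO(data)
    if zipfile.is_zipfile(buffer):
        with zipfile.ZipFile(buffer) as archive:
            # Assumption: the first .json entry is the chat export.
            names = [n for n in archive.namelist() if n.lower().endswith(".json")]
            if not names:
                raise ValueError("no .json file found in the archive")
            with archive.open(names[0]) as handle:
                return json.load(handle)
    # Not a zip archive, so treat the bytes as JSON text.
    return json.loads(data.decode("utf-8"))
```

Because the JSON compresses well, handling the archive on the server side keeps the upload itself small.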
Screenshots
Don’t worry, your text will not be all blurry
Caveats
Google switched from Google Talk to Google Hangouts on May 15th, 2013, and the file you get from the Google Takeout export only contains Hangouts chats. This means that there won’t be messages from before that date in your export.
Older chat messages can be found in the “Chats” folder in Gmail. My tool can only work with Hangouts data – if you want to export Google Talk chats too, there’s a python script by Clint Olson (@coandco) here.
Click here to use my parser
If you have a suggestion or found a bug, please do leave a comment below! You can also chat with me in #JayCorner on freenode and QuakeNet.
FAQ
Privacy notice aka What happens to my file when I upload it to your server?
The uploaded file will be stored on the server for 24 hours (there is an hourly cron job that deletes uploads older than 24h), so you have enough time to review your conversations and work with the data. After that, you’ll have to re-upload. I am the only person with access to the server, and I promise I won’t read your chats; you’ll have to trust me on that, though. There is now (as of 2015-03-15) also a “delete immediately” button at the bottom of the conversation overview, so you can delete your upload when you’re done.
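For readers curious what such a cleanup job amounts to, here is a minimal Python sketch of the idea. It is not the script running on the server; the function name, the directory argument and the use of the file modification time as the upload time are all assumptions.

```python
import time
from pathlib import Path

MAX_AGE_SECONDS = 24 * 60 * 60  # the 24-hour retention period


def delete_expired_uploads(upload_dir, now=None, max_age=MAX_AGE_SECONDS):
    """Delete files in upload_dir older than max_age; return their names."""
    now = time.time() if now is None else now
    removed = []
    for path in Path(upload_dir).iterdir():
        # Assumption: the modification time marks when the file was uploaded.
        if path.is_file() and now - path.stat().st_mtime > max_age:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

Run once an hour, a job like this means an upload lives for at most roughly 25 hours.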
Why does the parser show “unknown_102900492317555965172” instead of the name for some people?
I don’t know. Every person has what Google calls a “gaia_id”, it’s just a long number like 102900492317555965172. This is what identifies that person. In most cases, the person’s name is given in the Hangouts export file too, but sometimes it’s missing. I don’t really know why, maybe they have some stricter privacy settings than most of the people, or maybe they don’t have a Google+ profile, or a private one, or changed their name, … I could only guess. I couldn’t find out the reason so far. Sorry.
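The fallback itself is simple. A sketch in Python, where `display_name` and the `names_by_id` mapping are hypothetical names used only for this illustration:

```python
def display_name(gaia_id, names_by_id):
    """Return the known name for a gaia_id, or an "unknown_<id>" placeholder."""
    name = names_by_id.get(gaia_id)
    # A missing or empty name falls back to the ID-based placeholder.
    return name if name else f"unknown_{gaia_id}"


if __name__ == "__main__":
    known = {"1": "Jay"}  # made-up ID and name
    print(display_name("1", known))
    print(display_name("102900492317555965172", known))
```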
How can I save the HTML view with all the images?
Just open the HTML view and use your browser’s “File -> Save as…” function. Make sure to choose “Complete page” or “HTML with all images” or something like that as format, to avoid only saving the html page but nothing else.
Will you publish the source code so I can run the parser on my own server?
Yes and no. I will not publish the source code of the whole parser, and by that I mean the website layout, the uploader, all the methods for exporting/viewing CSV, XLSX, HTML etc. – what I did publish though is the very PHP method that actually parses a Hangouts JSON file into a PHP array of conversations. You are free to build your own parser around that.
Updates
Update #1 2014-12-29: the CSV link now generates a file download instead of showing a large text box with comma separated text.
Update #2 2014-12-30: I have made some more improvements to my parser and moved the actual parsing code to a separate function, which I published here.
Update #3 2015-01-02: Memory optimizations, fancy HTML5 uploader, view as XML, proper error messages, date picker in HTML view, subtle CSS improvements, bugfixes
Update #4 2015-01-04: Button to download a .zip containing CSVs of all convos.
Update #5 2015-01-10: Added XLSX export (1 conversation per sheet)
Update #6 2015-03-15: Added “delete my upload now” button to the conversation overview after being asked by someone to delete their upload
Update #7 2015-03-21: Added sortable “time of last message” column to conversation list, added expand/collapse to members column when there are more than eight members in a conversation, added message date range filter (makes the conversation list show only conversations with at least one message in the given date range), added selection checkboxes to conversation overview so you can select one, many or all conversations to download as XLSX or zipped CSV. Updated the screenshot of the conversation overview to reflect the changes. Also, the message order in the HTML view can now be reversed (newest message at the top, oldest at the bottom). Some of these changes were based on suggestions by @JMG, kudos to him!
Update #8 2015-04-26: You can now upload the zip file directly instead of first extracting the Hangouts.json file from it and uploading that, this will make uploads much faster and reduce traffic. Thanks @HIM357 for the suggestion.
Update #9 2015-05-29: Fix for chat timestamps being in GMT+1 instead of the user’s local timezone. Thanks @SC for the bug report and @amh for his help with testing the fix.
Update #10 2015-06-13: Added .txt download for single conversations and zipped .txt download for multiple conversations. Thanks @Tomcat for the suggestion.
Update #11 2015-07-06: The hangout parser is now SSL encrypted! Requests to http://hangoutparser.jay2k1.com will be redirected to https://hangoutparser.jay2k1.com. Shoutout to StartCom for providing the certificate!
Update #12 2015-08-07: In the HTML view, the current date is now always visible at the top. It took some CSS fiddlery to make it the way I wanted it to be, and now I’m quite satisfied with it. Thanks to @Ray890 for joining me on IRC and suggesting it!
Update #13 2015-09-23: I made a secondary CSS stylesheet for printing, so if you print the HTML view now, it will look kind of good. I also added a screenshot above to show what it looks like.
Update #14 2015-10-15: Wooooo, 10,000 uploads! I’d never have expected this to be so successful and useful to so many people. Thank you guys! Having coded something that many people use feels great!
Update #15 2016-04-30: Wooooo, 20,000 uploads! This is great. The second 10,000 uploads just took about six and a half months, that’s about half the time of the first 10,000. It’s really great to know something I coded is useful to so many people!
This is really great. Thanks! It’s exactly what I needed….
I would love to see if your php works, but I am a bit new to this and don’t know how to extract the .zip to a .json file. when I use an extract provided by google drive, no option is given by a .json.
I’d suggest you do not use the Google Drive option, but “send download link by e-mail”, and then download the .zip to your computer. Once it’s there, you can just right click the zip file and select “extract to…”.
Same issue. Also, I noted that the first message shown in the html and txt selections is not near the top of the file, so I think that I’m getting it all downloaded, but it’s not all presented when I upload it.
With your permission, I could examine the file and see if it’s a problem of the parser or if there’s actually data missing in the file. If you want, you can join my chatroom to discuss this.
Hmm, I have a small file, 4.7M, but it’s not getting all the messages. I know the messages go back to early 2014, but the first one I see is Dec 2014. Suggestions? I’m not getting any errors.
Try requesting another Hangouts export from Takeout. Some people have reported getting partial exports in some cases.
After many hours, I found your solution! Thank you very much, you just saved my day! Google should learn from you.
Hi, This worked great once, now I can’t seem to get the HTML to display at all. The text and CSV work, but with the HTML I can print to a PDF with images.
If you want, you can chat with me and we can try to resolve the issue. When I’m not responding for a few minutes, I’m probably not at the computer (UTC+1 here).
Turns out I had broken something when adding the theme changer. Fixed now. Kudos to Tim for reporting the issue!
I tried your parser. But nothing is happening after I upload the .json file. The page gets stuck at ‘Uploading… 100%’.
Console log :
InterYield clickbind 1.0-SNAPSHOT.34,044 20150226-1458
nocoverage.do?callback=AA3RsriL.NoCoverage&product=iy&title=&matchedKeyword=&affiliate=monad&subid=…:6 InterYield click bind handler had no ad coverage.
inj_sprk_starter.js?pid=LTEsMTQ0MjY0LDk2NzQ1LDU3MTA5&subid=src45_pr&appname=Smartbar:1 Resource interpreted as Script but transferred with MIME type text/plain: “http://ext1-api.engageya.com/gas-api/feed.json?cb=inj_sprk_callback&format=…-us&cs=UTF-8&pid=LTEsMTQ0MjY0LDk2NzQ1LDU3MTA5&subid=src45_pr&title=&kwrds=”.
VM667:1 Resource interpreted as Script but transferred with MIME type text/plain: “http://s.krbfjs.info/dealdo/shoppingjs4?b=Chy9mZaMDhnSptaMzgf0yt0Ln0iLmJjOm…5RpszPBNn0z3jWpszPywC9y2XPzw50mtaWlI4My29VA2LLC1n0yxr1CZ1JB29RAwvfBMfIBgvK”.
Hi Yash, I’ve looked into the server logs and could see you’ve tried to upload the file four times, the files are there, so the uploads worked. There is no message in the error log, so I don’t know exactly what went wrong…
The console log messages you posted don’t have anything to do with the upload parser. Googling for “interyield clickbind” shows that you are infected with some malware/virus though. See also this similar problem from someone else. I suggest you google this yourself to find out how to remove it. This will probably solve your problems with the uploads. And you should think about using an anti-virus software.
Best of luck, Jay
Hi, this is a fantastic piece of work. Well done. I’m surprised that Google does not have something akin to this within their range of tools.
I have noticed that when I collate all my hangouts, it only goes back to a date in 2013, however, I have chats dating back further than that. Do you have any idea as to why this might be?
Thank you again!
Hi Harman,
thank you for your comment, I am happy you like my work.
The user in this comment chain had the same problem, and it seems it is a bug with the Google Takeout export not exporting everything. I had found a thread regarding this issue on Google forums, but can’t seem to find it again. Please try requesting another Google Takeout or two and see if they will have a different size.
EDIT 2015-03-15: I found out that Google switched from Google Talk to Google Hangouts on May 15th, 2013, and only messages after that date are included in the export you get from Google Takeout.
Thanks so much for making this Jay! My wife and I are currently working through a very lengthy spousal visa application and one of the requirements is proof that we still maintain contact so this has proved essential.
Are there any plans to introduce PDF export? It would be fantastic if such an export was formatted in a similar way to the HTML export (i.e., with embedded images) but on a white background for ease of printing.
Also, any plans to create a downloadable version of the app (like on Android)? I would happily pay for it.
Hi, thank you very much for your feedback! I am glad my tool has proven useful for an actual purpose other than the fact that having a backup of your chats is nice to have. Your comment made me happy. Also, good luck with your application!
As for your suggestion about PDF export, that’s actually a good idea. I thought about this already, but discarded the idea because I assumed you could just copy paste the TXT version to e.g. Word and save as PDF, but that of course would not include images. I won’t make a promise here, but I’ll definitely look into it and see what I can do.
Regarding a standalone version, be it for Windows/Mac OS X/Linux or a mobile app for iOS or Android, I am not a developer and know nothing about making actual apps, so I’m afraid that won’t happen. There is, however, an android app already that can convert your Hangouts.json to XML and optionally import that to an SMS app, maybe you want to check it out. And if you look for something to run locally on your PC, there’s a Python script for that.
Hope that helps, Jay
I figured the easiest and quickest way to achieve what you requested is to change the dark background color to white, so I made a color switcher in the top bar of the HTML view which lets you switch between the dark and a new white color scheme. You can then use your browser to print to PDF.
Great! I’ll be sure to give it a go later tonight.
Thanks & Bravo Mate
Hello!
The tool worked like a charm, and was quite easy to work with. However, I was wondering if there was a way to raise the file size.
My past Hangouts .json files were only a hundred megabytes or so, which I tried out with the parser, and work amazingly. However, I just did a Takeout today, and found out that it was more than 400 MB, which was above the maximum file size.
Thank you. =3
Hi Harry,
the reason for the file size limit is that I am using PHP’s json_decode() function which unfortunately reads the whole JSON into memory in order to parse it. This makes it quite memory intensive (with a 278MB JSON file, the peak RAM consumption of the parsing process is 2.6GB). However, because my server has plenty of RAM, I increased the file size limit to 500MB and I’ll just see how it goes.
This looks wonderful, I have yet to try it but it looks perfect. However I can’t let myself upload my chat histories to a 3rd party like this one. Was there a link to your source code, or any way I could get this running locally?
Thank you very much for your work!
Hi Ian,
I can understand your worries. Rest assured though that I have no interest in reading other people’s chats (so far, more than 400 files have been uploaded, and I really have better things to do than to read them). Besides, as a system administrator in a web hosting company, I have access to sensitive data of a large number of private and business clients, so being trusted with other people’s data is part of my job (being able to install and run my very own server in our datacenter is a perk of that job too). Still I can understand if you’d rather not upload your chats to some random guy’s server.
At the end of the blog post you’ll find a link to the source code of the parser function (read update #2). That function will give you a PHP array of your conversations and messages so you can further process them (display them, save them, you name it). Depending on the size of your Hangouts.json file and CPU power, you probably have to increase some limits in php.ini (max_execution_time, memory_limit etc.) — it seems the parsing process consumes roughly ten times the file size worth of RAM.
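As a rough aid for picking a `memory_limit`, the rule of thumb above can be written down as a tiny calculation. This is a sketch in Python: the factor of ten is only the rough estimate quoted above, and the function names are made up for this example.

```python
RAM_FACTOR = 10  # rough estimate: peak RAM is about ten times the file size


def estimated_peak_ram_mb(file_size_mb, factor=RAM_FACTOR):
    """Estimate the peak RAM in MB needed to decode a JSON file of this size."""
    return file_size_mb * factor


def fits_in_memory(file_size_mb, memory_limit_mb, factor=RAM_FACTOR):
    """Check whether the estimate stays within a given memory limit in MB."""
    return estimated_peak_ram_mb(file_size_mb, factor) <= memory_limit_mb


if __name__ == "__main__":
    print(estimated_peak_ram_mb(278))  # 2780
    print(fits_in_memory(18, 256))     # True
```

For the 278MB file mentioned in an earlier reply, the estimate is 2780MB, which is in the same range as the 2.6GB peak that was actually observed.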
FWIW, I also found python and ruby scripts to parse a Hangouts.json file.
Perfect easy to use and just what I needed. Thanks
Searched for about an hour online to find a way to get a history of a chat that I needed. I can’t believe that Google doesn’t have a simpler way in their interface to do this. Thanks so much for your tool. I bookmarked it to use in the future. It worked great!
When I check the “chats” tab in gmail, I have hangout chats all the way to 11/29/2014. When I downloaded the json and parsed it, it only goes back to 12/29/2014. Am I not downloading it right?
Hi Vu, I don’t know how the chats tab works since I do not have a gmail account, but to my knowledge there is no way you could “not download it right”, Takeout doesn’t let you specify a date range or something like that. I can hardly imagine you would get different data.
I could analyze the file though, maybe in your chat something strange happened and my parser has trouble parsing it, and is only showing one part of your conversation(s). If you want me to look at it, please send me the hangoutparser link you got (the long ID string in it is actually enough) using either the contact form or the chat on the “About” page — I’m afraid that’s all I can offer.
It turned out that Takeout is extremely buggy. I tried 5 consecutive times to download the Hangouts data, and each time it was different (in file size ranging from <1MB to 3MB). I just ended up downloading the "largest" one and hope for the best. I hope they escalate this problem up the chain because it's a pretty embarrassing problem considering it's Google.
That is weird. I just requested two Hangouts exports on Takeout (as ZIP via e-mail link), downloaded both and they were identical in size and my parser reported the same amount of messages in all conversations. Also, during development of the parser, I have requested quite a few exports, and I haven’t ever noticed anything unusual…
I guess I’m lucky. Apparently others have experienced very similar issues as well, so it’s not just me.
Awesome! Just used it and worked like a charm!
Thanks!
Wow thank you for building this, you have provided me with the exact output format I was looking for! (CSV)
Great, thank you
So, when I coded this thing, I looked at my own Hangouts.json file and it was 18MB in size. So when setting a size limit for files to be uploaded, I thought 100MB was high enough. Apparently, I was mistaken. This limit was the reason David, the commenter above, ran into errors.
I raised the limit to 300MB now. Additionally, because I now realized there are people with files this large, I added a remaining time estimate display to the uploader, and I changed the code to display a proper error message when the file is too large. Before the actual upload process. And because coding is fun, I added another display option: XML.
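The two uploader changes boil down to a size check before the transfer starts and a simple rate calculation while it runs. The real uploader runs in the browser; the following Python sketch only illustrates the logic. The function names are hypothetical, and treating 1MB as 1024 × 1024 bytes is an assumption.

```python
MAX_UPLOAD_BYTES = 300 * 1024 * 1024  # the 300MB limit, assuming 1MB = 1024 * 1024 bytes


def validate_upload_size(size_bytes, limit=MAX_UPLOAD_BYTES):
    """Return an error message if the file is too large, otherwise None."""
    if size_bytes > limit:
        size_mb = size_bytes / (1024 * 1024)
        limit_mb = limit / (1024 * 1024)
        return f"File is too large ({size_mb:.0f}MB, limit is {limit_mb:.0f}MB)."
    return None


def remaining_seconds(bytes_sent, bytes_total, elapsed_seconds):
    """Estimate the remaining upload time from the average rate so far."""
    if bytes_sent <= 0 or elapsed_seconds <= 0:
        return None  # no data yet, so no estimate is possible
    rate = bytes_sent / elapsed_seconds
    return (bytes_total - bytes_sent) / rate
```

Checking the size first saves the user from waiting through a long upload that would be rejected anyway.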
Hi, every time I try to use the parser, I get the message “ERROR: requested file not found. It has probably expired.” Is there a fix for this? I have tried downloading the data again, but it didn’t fix the problem.
Hey Jason, I am glad you find it useful. Please see the bottom of the post for an update on sharing the source.
As for Hanna’s problem, this is fixed now by providing CSV download directly.
Hey, this is really awesome and exactly what I’m looking for (I think), however, I can’t seem to download the CSV file in the right format? (e.g. each cell has the full date, person, convo, instead of one cell having the date, another cell having the name and another cell having the convo)
Hi this is excellent – would you again consider sharing your source code? I would really like to see how you did this.
Thanks so much. This worked perfect for what I needed it for.
This is exactly what I am looking for. I would love to have a copy of the source code, if you don’t mind sharing it.
Thanks!