So the Landgericht Muenchen today handed down what could be an important judgement, because (a) it seems to be doubling down on Schrems, (b) it is from a not unimportant court in Germany (Laender level), and (c) it suggests a somewhat limited understanding of Internet technology within the judiciary. Having said this – I have not read the judgement (it is available for purchase [here][hm]) and there is one redeeming factor: whilst the outcome was not satisfactory, the fine imposed (€100) does not seem particularly draconian. Still. A bad precedent in my view.
So what was this judgement about? See here and here for a summary in German, and here for a shorter summary in English. Also for reference my threads today – which contain interesting responses – are here, here, and here. After the judges in Schrems found that the use of Google Analytics was unlawful for website operators (!) because of its intrusiveness, this judgement found that the use of Google Fonts is also illegal under GDPR. This, in my considered view, is nonsense. I will write down my thoughts here in the form of a Q&A – I may find time later to structure them, but at the moment I want to make sure they are all together.
How does Google Fonts work?
Google Fonts is a free service by Google where it hosts certain resources – in this case fonts – at a well known URL where everyone can download them. The fact that it is a font does not matter – it could be an image file or a text file.
What personal data is transmitted?
The data that is transmitted in the request is the following:
The IP address of the user; this is absolutely necessary because otherwise the data cannot be routed back to the intended recipient
Google's third-party cookies in the browser, provided the user did not opt out of sending those (which is the default in many configurations); this data is not necessary, and the user / their browser can easily opt out
The referrer, i.e. the website (and exact URL) that caused this request; this data is not necessary to transmit, but for reasons that are not clear to me you cannot, as a user, opt out of transmitting it
The privacy issues here are as follows: firstly, if a user is at home, then usually the IP address identifies a single household. IP addresses are usually long-lived, from a few days to weeks or even months. On the assumption that the user is active on the Internet it is fair to assume that Google can associate an IP address with a household. If the user is at work the IP address is often not particularly useful – it will simply be a generic exit node of the corporate network, shared by hundreds or thousands of users.
If the user also submits a third-party cookie (which, as a reminder, they can easily opt out of or even have to actively opt in to) then Google can identify the user itself, and not only the household. Note that in this case the IP does not matter – other than helping Google to associate an IP with a real world identity.
Finally the referrer is what can make this sensitive – let’s say the URL contains .../what-to-do-if-you-go-bankrupt/... then this alone contains sensitive information. Moreover, as Google has crawled those pages and is aware of correlation patterns amongst people accessing them, it is reasonable to say that the fact that an identified person accessed a certain page can be significant, and is indeed personal information.
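To make the three items above a bit more tangible, here is a minimal sketch of the kind of request a browser builds for a hotlinked font. Everything in it is invented for illustration: the helper name, the hosts fonts.example.com and webdoctor.example, and the cookie value. It only constructs the request and prints its headers; nothing is sent.

```python
from urllib.request import Request


def build_font_request(font_url, page_url, cookie=None):
    """Build (but do not send) a request resembling a browser's font fetch.

    Hypothetical helper: real browsers add many more headers.
    """
    # The Referer header (spelled with one 'r' in HTTP) names the embedding page.
    headers = {"Referer": page_url}
    if cookie is not None:
        # Only present if the browser is configured to send third-party cookies.
        headers["Cookie"] = cookie
    return Request(font_url, headers=headers)


req = build_font_request(
    "https://fonts.example.com/css?family=Demo",  # made-up font host
    "https://webdoctor.example/what-to-do-if-you-go-bankrupt/",  # made-up page
    cookie="id=abc123",  # made-up cookie value
)

# The user's IP address is not a header; it travels with the connection itself.
for name, value in req.header_items():
    print(f"{name}: {value}")
```

Leaving out the cookie argument models a browser with third-party cookies disabled; the Referer header is still there.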
Can’t this data be read by everyone else anyway?
No. Assuming the user uses https (and not http) then all the nodes in between see is that a certain IP address exchanges information with Google. Nodes that are close to the user may see that the user is accessing a certain website (webdoctor.com) but not the URL (how-to-treat-herpes). Also, as the headers and cookies are encrypted, there is no way for an eavesdropper to understand that the request to download Google Fonts is related to a request on webdoctor.com, let alone at which URL. So on the assumption that the website is in the EU, and all intra-EU traffic never leaves the EU (a big assumption; read up on the Border Gateway Protocol to see why this may not always be the case) then no one outside the EU will ever be able to associate the IP address of the user with the sites they browse.
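As a rough illustration of that split, the sketch below separates a URL into what a passive observer on the path can typically see and what stays inside the encrypted connection. It is a simplification under stated assumptions: the function name is my own, and it assumes the host name leaks through the DNS lookup or the TLS handshake, which is the common case but not guaranteed.

```python
from urllib.parse import urlsplit


def visible_on_the_wire(url):
    """Return the URL parts a passive network observer can typically read.

    Hypothetical helper; assumes the host name leaks via DNS or the TLS handshake.
    """
    parts = urlsplit(url)
    if parts.scheme != "https":
        # Plain http exposes the whole request, path and query included.
        return {"host": parts.hostname, "path": parts.path, "query": parts.query}
    # With https the path, query, headers and cookies are encrypted.
    return {"host": parts.hostname}


print(visible_on_the_wire("https://webdoctor.com/how-to-treat-herpes"))
print(visible_on_the_wire("http://webdoctor.com/how-to-treat-herpes"))
```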
Can the user (reasonably) prevent this data being transmitted?
The term “reasonably” here does a lot of work. The user can absolutely prevent this data from being transmitted. Examples of how to do this include:
1. program your own browser (not that hard; a lot of it is open source) and ensure that it does not transmit data that you do not want to transmit
2. disable third-party cookies (that’s pretty much a given; everyone should do that); that does not however get you all the way, you also need (3)
3. start Chrome with the option --disable-remote-fonts, which may seem daunting for many users but is actually extremely easy
4. set up your system to block all requests that do not go to EU IP addresses; this is not that hard to do, but will most likely break the Internet for you
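The last option boils down to an allowlist check on the destination address. Here is a minimal sketch of that check, assuming you have obtained a list of EU address ranges from somewhere. The ranges below are placeholders from the documentation address space and are NOT real EU allocations, and the names ALLOWED_NETWORKS and is_allowed are my own.

```python
import ipaddress

# Placeholder ranges (reserved for documentation), standing in for a real
# list of EU address blocks that you would have to source and keep up to date.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_allowed(ip):
    """Return True if the destination address falls inside an allowed range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in ALLOWED_NETWORKS)


print(is_allowed("192.0.2.17"))   # inside the first placeholder range
print(is_allowed("203.0.113.5"))  # outside both ranges
```

A real setup would apply this in a firewall or proxy rather than in a script, but the decision it takes is the same.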
What is special about Google Fonts?
Nothing – other than that the website owner could have easily embedded them into the website itself, at which point the request would no longer have gone to Google. But other than that this is relevant for any kind of content that is hotlinked: scripts, images, sounds etc. Special mention to videos – if you embed a YouTube video or a SlideShare etc you do the same.
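To see how much of a page is hotlinked, one can list the foreign hosts it tells the browser to contact. Below is a rough sketch using Python's standard HTML parser. The class and function names and all the example hosts are made up, and it simplifies: it treats every src or href outside of a normal link as something the browser would fetch, which is not strictly true for every tag.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class ThirdPartyHosts(HTMLParser):
    """Collect hosts other than the page's own that appear in src/href attributes."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.own_host = urlsplit(page_url).hostname
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        # <a href> is followed only when the user clicks, so it is skipped here.
        if tag == "a":
            return
        for name, value in attrs:
            if name in ("src", "href") and value:
                # Resolve relative references against the page URL first.
                host = urlsplit(urljoin(self.page_url, value)).hostname
                if host and host != self.own_host:
                    self.hosts.add(host)


def third_party_hosts(page_url, html):
    parser = ThirdPartyHosts(page_url)
    parser.feed(html)
    return sorted(parser.hosts)


sample = """
<link rel="stylesheet" href="https://fonts.example.com/css?family=Demo">
<img src="/logo.png">
<iframe src="https://video.example.net/embed/123"></iframe>
<a href="https://elsewhere.example.org/">a normal link</a>
"""
print(third_party_hosts("https://mysite.example/page", sample))
```

In the sample, the font stylesheet and the embedded video show up as third parties; the local image and the plain link do not.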
Does it make a difference to ensure everything is locally hosted?
Now this is where it gets interesting. Say you host your website in the US – maybe you are using wix.com, or wordpress.com, or netlify.com. At this point all is lost anyway – you may as well request resources from Google in the US. Now you can go with a hosting company that is in the EU, or at least offers EU servers, but I am not sure that this works, and probably not on free plans. For example, this site is hosted on Netlify, and I am not aware of any option to choose where this server is placed, and I am also not aware of any decent EU competitor.
Also there are not only websites, there are also CDNs. Interestingly, the request for the font may never hit the Google servers. The reason is that Google uses a CDN (“content delivery network”) like CloudFlare. A CDN is a network that pushes copies of the data closer to the user, and files can be delivered from the CDN without ever hitting the servers. In which case Google would never see the IP of the person requesting the fonts. That’s the good news. The bad news is that you now have the CDN – and the CDN is not just someone in the middle: as they are replicating data they are genuine end points, so they see all IPs and cookies and all the rest.
Who transmits the data?
Now this is where it gets legally interesting – and of course legally I may be on thin ice, just like lawyers seem to be technically on thin ice. A casual reading of the GDPR suggests that for there to be a data controller and data processor relationship, there must be a data transmission between the data controller and the data processor. The way the GDPR reads, this is a subcontractor relationship: I have the customer relationship and the data, so I am the controller. If I give it to someone else – the data processor – then (a) I am responsible for their actions, and (b) if they are not in the EU there are additional steps I need to take.
Those assumptions do not apply here! There is no data transmission between website and Google. What happens is that the website suggests to the user (or rather, their browser) “if you want to go ahead you should really get this resource from Google”. In case of fonts it is actually well understood that this is not a necessary request (see above) and that the user (or rather, their browser) can just ignore it, at the expense of a slightly less pretty site. For other content this of course is not as easy – if the site embeds a video, and the browser chooses not to load the video, then chances are that the site is not working as intended. However – this is all at the discretion of the user. They – or rather their browser – can at any point decide that they do not want to interact with anyone outside the EU. This may break a number of sites – probably 2/3 of the Internet – but that’s a personal choice. There is no general service obligation on providers of web servers to make them accessible to everyone, respecting everyone’s data protection preferences.
So to drive home this point: in this case the data is not transmitted from the website operator to Google; the website operator merely suggests to the user’s browser to fetch that data, and it is up to the user’s browser whether or not to follow this suggestion.
Can websites ensure that they do not include data hosted outside the EU?
No, not really. Data is referred to by its URL, and the IP address part is encoded in something like fonts.google.com. There is not necessarily a relationship between an IP address and the domain. For example I doubt that many .io websites are hosted in the British Indian Ocean Territory. And whilst the .com TLD is often associated with the US, it is really international and can point to an IP address anywhere in the world. Even .eu domains, which must be operated by EU citizens or residents, do not as far as I know come with a requirement of being hosted in the EU.
So if a website owner includes a link via a URL that contains a domain name then a priori they have no idea where this resource is located. They can of course try to find out: they just type it in a browser, and check to which IP address it resolves. There are a few issues with this however:
what they see is the IP address now; however, IP addresses may change, so tomorrow or the day after the name may resolve to a different address and move from an EU address to a non-EU address
what they see is one possible IP address associated with this DNS entry; especially high traffic sites (like Google’s real estate for example) may implement load balancing at the DNS level, so you may see a different IP address if you access a site from say Portugal (where it may route to the US) or from Estonia (where it may route to a continental data center)
the IP address is a load balancer or other form of proxy that distributes the load to other servers behind it; those servers may be located anywhere in the world, and there is no way to find out
So again, this is a very important point: it is not possible for website owners to ensure that the resources they include are hosted in the EU. The only one who has a chance to ensure that the user only accesses resources from inside the EU is the user (or rather, their browser), and even the user may be defeated by proxying.
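The lookup a website owner could do is easy enough to script, which also shows its limits. The sketch below asks the local resolver for all addresses it currently offers for a name. The function name is my own, and the result is only a snapshot: it depends on when and from where you ask, and it says nothing about servers sitting behind a proxy.

```python
import socket


def resolve_all(hostname, port=443):
    """Return the sorted addresses the local resolver offers for hostname right now.

    Hypothetical helper: a snapshot from one vantage point, not a statement
    about where the resource is actually served from.
    """
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

Calling it for the same domain from two different countries, or on two different days, may well give two different answers.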
So what about the referrer tag?
Indeed, what about it? The referrer header is really the main culprit in this whole story – the fact that a certain IP has requested a certain commodity resource is in itself not particularly interesting, and definitely not personal data. Sure, if the font is some special font that is only used on one particular website then there is some information contained in the fact that a particular IP address requests it. But for your standard definitely-not-Arial font that 1000+ websites are using the value of this information is de minimis.
So there is really a very simple way around this: allow users to prevent sending of referrer headers. This simple change will massively increase your privacy. Now websites like Google may not like it and stop providing you certain services, i.e. you may not get those fonts unless you enable sending referrer information. But that is now your choice as a user: send referrer information and get the font, or don’t send it and look at Arial the whole day.
UPDATE. As kindly pointed out to me there are ways of controlling the referrer behaviour in HTML, as described here. The key table is this one
As an example, this is the request that topaze.blue makes from this page when retrieving fonts (of course I also use Google Analytics so this does not make much of a difference…)
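For anyone who wants to try this on their own site, here is a small sketch that generates a stylesheet link carrying the standard referrerpolicy attribute, so the browser is asked not to send a referrer for that one request. The helper name and the fonts.example.com URL are invented for illustration, and this is not necessarily the markup this site uses. The policy names are the ones defined in the Referrer Policy specification.

```python
from html import escape

# Policy values defined by the Referrer Policy specification.
VALID_POLICIES = {
    "no-referrer",
    "no-referrer-when-downgrade",
    "origin",
    "origin-when-cross-origin",
    "same-origin",
    "strict-origin",
    "strict-origin-when-cross-origin",
    "unsafe-url",
}


def stylesheet_link(href, policy="no-referrer"):
    """Return a <link> tag that asks the browser to apply the given referrer policy.

    Hypothetical helper for illustration only.
    """
    if policy not in VALID_POLICIES:
        raise ValueError(f"unknown referrer policy: {policy}")
    # Escape the URL so characters like & are valid inside the attribute.
    return (
        f'<link rel="stylesheet" href="{escape(href, quote=True)}" '
        f'referrerpolicy="{policy}">'
    )


print(stylesheet_link("https://fonts.example.com/css?family=Demo"))
```

A page-wide alternative is a meta tag with name "referrer" and one of the same policy values as its content.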
In other words:
Is there any good reason, in 2022, to still send referrer headers, or should browsers just stop doing it? (Oscar D Þorson, @odtorson, January 31, 2022)