Why the reports about PRISM are wrong


I feel a bit strange about writing this.

I distrust authority. I’m the sort of person who believes that powerful people have a tendency to be corrupt and that governments manipulate the truth for political purposes. I support the free press and believe journalists rather than government spokespeople and large companies.

We are being sold a lie about PRISM. Only it’s not governments selling us the lie, it’s the free press.

I strongly believe that the descriptions of PRISM in The Guardian, The Washington Post and on many websites are inaccurate and misleading. They are little more than conspiracy theories. The claims made are not technically possible or realistic.

Those of you who have a tendency to believe conspiracy theories may disregard my comments. You may believe that I am a mouthpiece of government spin. I am not. I work in the media and technology. I have no affiliation with any government or private company. I have no special knowledge of PRISM, but I know how technology and the public sector work. And the PRISM described in the press does not and cannot exist.

In the following article I will explain two things.

Firstly, why PRISM cannot exist in the form that the media is portraying, and why these claims are nothing more than conspiracy theories and should be treated as such.

Secondly, that our privacy is being eroded in ways that we should be worried about, and that the spurious concerns about PRISM are distracting us from the real problems.

PRISM is not what we think it is


If you’ve been reading the press you will probably have an idea of what PRISM is. According to The Guardian:

The National Security Agency has obtained direct access to the systems of Google, Facebook, Apple and other US internet giants, according to a top secret document obtained by the Guardian.

The NSA access is part of a previously undisclosed program called Prism, which allows officials to collect material including search history, the content of emails, file transfers and live chats

Their evidence for this is a leaked PowerPoint document.


The presentation is amateurish; the formatting and phrasing are imprecise. The Guardian has focused on the phrase “collection directly from the servers”. They use this phrase to theorize about a whole range of activities. But this is groundless speculation. The phrase “directly from servers” means nothing. This is not a technical document, and these words are vague. There’s a mix of companies and products; YouTube belongs to Google, for example, and Skype to Microsoft, yet both are listed.

Former general counsel of the NSA Stewart Baker says:

The PowerPoint is suffused with a kind of hype that makes it sound more like a marketing pitch than a briefing — we don’t know what its provenance is and we don’t know the full context

Why would a top secret US Government programme even have a logo?

Do we seriously think that an organisation that can tap and read “the internet” in real time would produce a presentation as sloppy as this? As ZDNet says:

we strongly suspect that the leaked PowerPoint slides are probably not written by technical people. It’s likely that these slides were prepared as an internal marketing tool for new recruits. So, when the slides say: “direct access to servers,” that statement may well be an oversimplification of the facts.

The US Government clearly has data on individuals. We already know that. They can legally request it from companies by a subpoena, and we know they already do that a lot. Too much in fact.

But that is a legal process. The police go to the courts, get a court order and request companies like Google or Facebook to export data from their servers and hand it over. At no point do the security services have access to the servers, direct or otherwise.

Other than this vague and sloppy phrase, there is no evidence, either in these documents or in any other information released, that PRISM does anything other than store data collected through court orders. Really the burden of proof should be on the media to prove that something more is going on. However, the reports have now reached epidemic levels, and one cannot satisfactorily disprove them by saying there is no evidence, in the same way one cannot disprove the existence of God by saying that.

A huge number of sources have rejected these reports. Insiders have come forward to multiple journalists:

Recent reports in The Washington Post and The Guardian […] are incorrect and appear to be based on a misreading of a leaked PowerPoint document, according to a former government official who is intimately familiar with this process of data acquisition and spoke today on condition of anonymity.

“It’s not as described in the histrionics in The Washington Post or The Guardian,” the person said. “None of it’s true. It’s a very formalized legal process that companies are obliged to do.”

That former official’s account — that the process was created by Congress six years ago and includes judicial oversight — was independently confirmed by another person with direct knowledge of how this data collection happens at multiple companies.

Larry Page and Mark Zuckerberg both stated that they’re not giving direct access to their servers. Google said:

The U.S. government does not have direct access or a ‘back door’ to the information stored in our data centers. We provide user data to governments only in accordance with the law.

But, hey, they run large companies so we can’t trust them.

The New York Times has cited anonymous sources that cast doubt on the initial reports, but maybe they’re lying too.

Maybe one loosely phrased statement in a non-technical, sloppy PowerPoint presentation is correct, and all of the industry experts, anonymous sources, government statements and publicly available legal records are incorrect, and the government is doing this.

So let’s dig a bit deeper.


The slides say that the budget for PRISM is $20m a year. In large scale IT projects, $20m is peanuts. The BBC’s DMI project recently failed, after spending $100m on trying to build an internal database. They even had all the content already and didn’t need to steal it from protected, encrypted sources.

In 2005, the FBI spent $170m trying to build a digital system for managing case work. It failed.

I’ve written before about how IT projects fail. Large scale IT projects are incredibly complex. They are too complex for humans to comprehend, and as more developers start working, communication becomes harder and harder to manage.

As ZDNet says:

One source speaking to ZDNet under the condition of anonymity said $20 million — the amount quoted by the NSA in the leaked document that covers the cost of the PRISM program — wouldn’t even cover the air conditioning costs and the electrical bill for the datacenter. Taking the datacenter out of the equation, $20 million would even not cover 3-6 months worth of data storage required to store keep copies of the wiretap data.

Even The Guardian struggles to make sense of this:

“The Prism budget – $20m – is too small for total surveillance,” one data industry source told the Guardian. Twitter, which is not mentioned in the Prism slides, generates 5 terabytes of data per day, and is far smaller than any of the other services except Apple. That would mean skyrocketing costs if all the data were stored. “Topsy, which indexes the whole of Twitter, has burned through about $20m in three years, or about $6m a year,” the source pointed out. “With Facebook much bigger than Twitter, and the need to run analysts etc, you probably couldn’t do the whole lot on $20m.”

It is unthinkable that such a project could be run for $20m a year. The press can’t find a single expert to support this. And you can be sure they’ve been desperately looking.

The budget given in the presentation is comparatively tiny – just $20m per year. That has puzzled experts because it’s so low.

But maybe, somehow, the NSA has found a way of cutting costs, way beyond anything anyone can understand. After all, the public sector is famous for being run efficiently and getting the best value for money for the taxpayer.


Let’s have a think about what PRISM is doing. The claims are that it is “tapping” the network. Images come to mind of Gene Hackman in The Conversation listening in with headphones. Unfortunately, that only works with analogue communications. The Internet isn’t analogue. You simply can’t “listen” in and see what websites someone is looking at. The internet does not work that way. Any claims like this show a shocking misunderstanding of the technology. As the PRISM slides say:

A target’s phone call, e-mail or chat will take the cheapest path, not the physical most direct path – you can’t always predict the path

If I send an email to my girlfriend sitting in the living room, that email will be sent in thousands of packets, some of which may go via Australia, or anywhere else in the world. The packets pick the best route. You can’t “listen in on them”.

Now, what you could do is connect into my wireless network illegally. To do that you’d need to crack the WEP or WPA key. There are tools available for that such as Aircrack-ng or Kismet. You could then use something like an ARP spoofing attack with Wireshark or Ettercap to view the data packets flowing into and out of my house.

But to do that you’d need to be physically close enough to my house to connect to my wireless network. And I’m just one person. To be able to “tap” the internet this way, you’d need a surveillance team outside every house in the country. You couldn’t do that to the whole world without 7 billion spies in vans. And although the traffic on my road is bad, it’s not that bad.


The Guardian, however, has described “a couple of methods” that PRISM may be using. Before we start analyzing these, remember, these were thought up by people working in the data-processing business. They have no special knowledge of PRISM, they do not work for the government and, although experts in their field, they have no information that we don’t have. These are theories that have been dreamt up because of that one phrase on a PowerPoint presentation.

First, lots of data bound for those companies passes over what are called “content delivery networks” (CDNs), which are in effect the backbone of the internet. Companies such as Cisco provide “routers” which direct that traffic. And those can be tapped directly.

I was dubious about this claim. The Guardian links to a Cisco technical document about a specific Cisco router. So I read it (and boy, was it boring). It says:

The Cisco Service Independent Intercept Architecture Version 3.0 document describes implementation of LI for VoIP networks using the Cisco BTS 10200 Softswitch call agent, version 5.0, in a non-PacketCable network.

In layman’s terms, this is a description of how you can connect directly into a specific router to access a VOIP phone call. VOIP, by the way, is things like Skype. It’s using the internet to have a phone call. This does not mean that the government can “tap” into CDNs. It means that someone could technically connect into one particular brand of router, if there was a court order to do so.

Another Guardian source, who said that $20m wasn’t enough to do anything useful, suggests:

“they might have search interfaces (at an administrator level) into things like Facebook, and then when they find something of interest can request a data dump. These localised data dumps are much smaller.”

The other day, I needed to find a receipt in my Gmail inbox. Sadly, it turns out I’ve bought quite a few things. It was a really big job to find it. Imagine searching every Gmail inbox in the world for something. You’d never be able to find anything in the noise.

And that’s even assuming it is technically possible. I have no insider knowledge into Google. But I can’t see why they’d build that. I do have knowledge of Exchange (the Microsoft email servers) and can categorically say you cannot do that on there (I’ve actually been asked a couple of times, and have been involved in email search operations. They are not easy or cheap). Even if Google had built this admin level functionality, it would be slow.


When you use Google search it is very quick. But that speed comes at a cost. Google spend a huge amount of money optimizing and caching searches to deliver the content to you quickly. Why would they spend a similar amount of time optimizing search across all Gmail inboxes? There’s a reason they optimize search: there’s money in it for them. Lots of money. The more you search, the more you see ads.

There’s no money in it for them to build a snoop search. It’s hardly as if Google are going to advertise to NSA officials alongside their search: “customers who searched for Jihad also bought the Koran”.

But let’s assume that the $20m figure is wrong. That the people that made this presentation were incredibly precise with their wording about the way data was collected, but then missed five zeroes off the end of the budget figure. ZDNet has produced a theory, which begins with the reassuring statement: “The following article should be treated as strictly hypothetical.”

Their suggestion is that PRISM taps into Tier 1 networks:

The Internet may be distributed and decentralized in nature, but there is a foundation web of connectivity that enables major sites and services to operate. These are referred to as “Tier 1” network providers. Think of these as pipes of the main arteries of the Internet, in simple terms.

There are 12 companies that provide Tier 1 networks. ZDNet’s theoretical paper suggests that the NSA could “tap” these networks.

these Tier 1 network providers have a far smaller employee base working in these divisions than the aforementioned companies. This allows the NSA to either send its own employees in as “virtual” employees — working under the guise of these companies — while the NSA gags those companies from disclosing this fact to other staff. They could look like special contractors that only work with the special wiretapping routers.

We’re heading into conspiracy theory territory again now. We’re suggesting that the NSA put undercover staff into twelve private companies, attached equipment to their computers and extracted all information that comes out of the servers. They then put gagging orders on all of the companies, and someone stopped all the individuals who knew about it from leaking it to the press.

Oh, yes, and then they built a secret database that could contain the whole Internet.

But let’s ignore that. Let’s pretend they managed to build this magic technology that the rest of the world doesn’t have without anyone knowing.

They still wouldn’t have access to the Internet. They’d just see a load of data flowing through CDNs. And most of it would be iPlayer, YouTube and pictures. They wouldn’t even have all of the Internet. Only some data flows through these.

And that’s ignoring the problem of encryption. Facebook, Google, Hotmail, all the interesting stuff, is encrypted. What this means is, even if you got all of the packets of every request, you wouldn’t be able to read them. Even if I tweet, publicly, the tweet is encrypted when it’s sent to Twitter. Although it’s displayed publicly on the website, you wouldn’t be able to read it by “tapping” my internet; you’d just get a load of encrypted nonsense.

Maybe they have special servers optimized to crack encryption. And maybe they set them to work decrypting every single encrypted internet session in use. Seems even more unlikely, but let’s imagine that this happened.

The Guardian reported that:

last year GCHQ was handling 600m “telephone events” each day, had tapped more than 200 fibre-optic cables and was able to process data from at least 46 of them at a time.

Each of the cables carries data at a rate of 10 gigabits per second, so the tapped cables had the capacity, in theory, to deliver more than 21 petabytes a day


21 petabytes is big. Really big. And that’s just in one day. In 30 days, this would create 630 petabytes.
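The arithmetic behind that figure is easy to check. Here is a rough sketch, assuming all 200 tapped cables run flat out at 10 gigabits per second all day, and that a petabyte means 10^15 bytes (both assumptions are mine, not The Guardian’s):

```python
# Back-of-the-envelope check of the quoted cable figures.
# Assumptions: every cable is saturated 24 hours a day, decimal petabytes.
GIGABIT_IN_BITS = 10**9
BITS_PER_BYTE = 8
SECONDS_PER_DAY = 24 * 60 * 60
PETABYTE_IN_BYTES = 10**15


def petabytes_per_day(cables: int, gigabits_per_second: float) -> float:
    """Theoretical maximum volume carried by the cables in one day."""
    bytes_per_second = cables * gigabits_per_second * GIGABIT_IN_BITS / BITS_PER_BYTE
    return bytes_per_second * SECONDS_PER_DAY / PETABYTE_IN_BYTES


daily = petabytes_per_day(cables=200, gigabits_per_second=10)
print(f"{daily:.1f} PB per day")
print(f"{daily * 30:.0f} PB in 30 days")
```

That comes out at a shade over 21 petabytes a day, which matches the “more than 21” in the quote. Multiply the unrounded figure by 30 and the monthly total is slightly higher than the one you get from a flat 21.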

To put this to scale, IBM recently built the largest hard drive array in the world. It is highly experimental and no one else in the world has come close to something like this. It is 120 petabytes. The next biggest is little more than 40 petabytes.

The storage for processing this much data just does not exist. We’re suggesting that GCHQ and the NSA have secretly built databases that are a similar size to the internet and no one noticed. It is just not possible.

Even The Washington Post is starting to back down on its claims.

And then a funny thing happened the next morning. If you followed the link to that story, you found a completely different story, nearly twice as long, with a slightly different headline. The new story wasn’t just expanded; it had been stripped of key details, with no acknowledgment of the changes. That updated version, time-stamped at 8:51 AM on June 7, backed off from key details in the original story.

Before naming their source as Edward Snowden, The Washington Post and Guardian both referred to him as “a career intelligence officer [who exposed the materials] in order to expose what he believes to be a gross intrusion on privacy.”

Edward Snowden is not an intelligence officer but an “infrastructure analyst”. He had been in his current position with an external contractor for three months. I don’t mean to discredit him, but if The Guardian managed to get his job title wrong, what else did they get wrong?

All evidence from governments, from legal proceedings, from technology experts and from leaked documents that we’ve seen suggests that PRISM is simply gathering up information obtained legally through court orders. There is simply no evidence that anything else is happening. U.S. Director of National Intelligence James Clapper said:

“PRISM is not an undisclosed collection or data mining program […] it is an internal government computer system” designed to “facilitate […] authorized collection of foreign intelligence.” NSA Director Gen. Keith Alexander says of Snowden’s claims “I know of no way to do that.”

It is absolutely unfeasible that the PRISM described by the Guardian exists.

But you should be worried

However, there is a problem that the hype around PRISM is overlooking. The US Government is requesting huge amounts of private data from companies. Governments are making hundreds of thousands of legal requests for information from Google, Microsoft, Facebook and many others.

These are perfectly legal, and all of the government officials questioned say their data is obtained in this way. Apple received 4,000 requests, and Facebook 10,000 in the last quarter. Google were forced to respond to 8,000 by the US government alone.

Should governments legally be allowed to make all these requests? Shouldn’t we be more worried about what our current legal system is allowing to happen? Rather than becoming hysterical over conspiracy theories of illegal activities that clearly aren’t happening, maybe we should focus more on stopping what is actually going on.

Government rebuttals of PRISM are actually shocking. “No,” they’re saying, “we didn’t really tap your networks to get all this data illegally. We did it by the perfectly legal method and that’s fine.”

The biggest tragedy of PRISM is not the spurious and ignorant claims that are being made, but that it has distracted us from the real problem. The press claims about PRISM have made the government’s standard activities look reasonable. But they are not. And we should stop being bamboozled by fantasy computer systems that seem like something out of a Hollywood film.

Operation BlackBriar


Apps and Websites

I have mixed feelings about apps. There’s an XKCD comic that pretty much sums up my experience:


It’s quite common to come across apps that are just content from the website but in a more limited container. You can’t interact with them as you would with webpage content. Sometimes you can’t copy and paste from them or use features that have been in webpages as standard for twenty years.

People who are familiar with computers tend to forget that most normal users are worse with computers than we think. I once told a story about a project to build an appstore at work:

A colleague of mine had to give a presentation about a Corporate iPhone AppStore that we’re building. Half way through, he realised the audience weren’t feeling it, and so said, “Who here knows what an appstore is?” About four people put their hands up.

The excellent web app Forecast.io has a blog that talks about the confusions people get into when trying to pin their webclip to the homescreen:

I’m fairly certain none of them will ever know that Forecast is actually a web app. To them, it’s just an app you install from the web.

Users don’t really understand what they’re doing. But they do get frustrated and confused when things don’t behave as they expect. The user who wrote in to Forecast.io was confused that he couldn’t download the web app from the Apple Appstore.

Apps are too much like 1990s CD-ROMs and not enough like the Web. I feel like I’m always updating my apps. Every time I pick up my phone there’s a little red box next to the Appstore, telling me I have more updates to download.

Perhaps it’s my OCD coming out, but I just can’t leave the updates sitting there. I have to download them all. But if I ever look at the reason for the updates, I see it fixes an issue like “error with Japanese timezone settings for people living in Iceland” or “fixes an issue when you plug an iPhone 1 into a particular model of ten year old HP TV” that I’ll never encounter. Sometimes they change apps so they no longer work in the way I’ve got used to.

It’s probably worth adding: I’m not complaining that they’re fixing problems. Someone in Iceland is probably on Japanese time. And even if they’re not, I’m idealistic enough to think it needs fixing just because it’s wrong. The problem is the nature of apps. I never have to update Amazon.com before I buy a book, or update Facebook before I poke someone. Websites don’t need updating. They are always the latest version.

There are, of course, some good things about apps. Jeff Atwood wrote about how much better the eBay app is than the website. And he’s right. It’s slicker, simpler, easier to use:

Above all else, simplify! But why stop there? If building the mobile and tablet apps first for a web property produces a better user experience – why do we need the website, again?

But maybe the solution here is to build a better website.

Of course, some apps carry out functions on the device, or display static data. And it makes sense for them to be native apps. On my iPhone, I have a torch app (it forces the flash on my camera to remain on), and I have a tube map app (it essentially shows me a picture of the tube map). One of these interacts with the base firmware, so that one has to be a native app. The other displays static data. It would be unnecessary to connect to the Internet to pull the map down every time I want to look at it.

But other apps, like Facebook or LinkedIn are just a native wrapper around the website.

So, what’s the solution?

Of course, there isn’t one. It’s a compromise. At the moment we’re in an era obsessed with native apps. All companies have to have an “app”, if only just to show that they’re up to date.

I was in the pub the other day and accidentally got chatting to someone. He told me that his company had just released an app. “What does it do?” I asked him. “No idea,” he said, “but you’ve got to have an app.” He was the managing director of the company.

Hopefully, when we’ve got over the novelty of the technology, we can start using apps for what they’re good at, rather than just having apps because they are there.

A lot happened on 1st January 1970


If you’ve spent any time playing with code and dates, you will at some point have come across the date 1st January 1970.

In fact, even if you’ve never touched any code, you’ll have probably come across it. I came across it today when I was looking at the stats on WordPress:


Bizarrely, the WordPress hit counter starts in 1970. Not so bizarrely, no one read my blog that day. But then they were probably all so excited by Charles “Chub” Feeney becoming president of baseball’s National League. Or something.

Most likely, this is caused by the Unix Timestamp, a number I wrote about the other day. As I said, time is a real faff, but numbers are great, so computers sometimes store time as numbers. Specifically, the number of seconds since midnight on the 1st January 1970. It’s a real oddity when you first encounter it, but it makes a lot of sense.

It’s not, though, the only way of storing time. Microsoft, typically, do it a different way, and use a value that’s affectionately known as Integer8, which is an even bigger number. This is the number of 100-nanosecond intervals since midnight on January 1st, 1601.

With both of these, you need to do a calculation along the lines of:

January 1st 1970 + number of seconds

To turn the number into a date. Of course, this means that if you report the Timestamp as 0, the computer adds 0 to January 1st 1970, and gets January 1st, 1970.
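That “epoch plus a number” sum is short enough to write out. This is only a sketch in Python, with function names I’ve made up for illustration, and it assumes the Microsoft value counts 100-nanosecond ticks from 1601. It is not a claim about what WordPress actually does under the hood:

```python
from datetime import datetime, timedelta, timezone

# The two starting points described above, both in UTC.
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
WINDOWS_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def from_unix(seconds: int) -> datetime:
    """January 1st 1970 + number of seconds."""
    return UNIX_EPOCH + timedelta(seconds=seconds)


def from_windows_ticks(ticks: int) -> datetime:
    """January 1st 1601 + number of 100-nanosecond ticks (assumed unit)."""
    # timedelta only goes down to microseconds, and 10 ticks make 1 microsecond.
    return WINDOWS_EPOCH + timedelta(microseconds=ticks // 10)


print(from_unix(0))           # a timestamp of 0 lands on the epoch itself
print(from_unix(86_400))      # one day's worth of seconds later
print(from_windows_ticks(0))  # the Windows equivalent of "zero"
```

Feed either function a zero, which is what you tend to get when a value is missing, and you land exactly on the starting date.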

Presumably, it’s something along these lines that has resulted in WordPress reporting my hit stats from 1970. According to computers, a lot of things happened on 1st January 1970.

Coding is hard

Coding is easy

“Coding”, to misquote Douglas Adams, “is hard. Really hard. You just won’t believe how vastly, hugely, mindbogglingly hard it is. I mean, you may think it’s difficult to walk down the road to the chemist’s, but that’s just peanuts to code, listen…”

Except it’s not. Not really. And that bit about the chemist’s makes no sense. Thanks for nothing, Simon.

Writing a little bit of code is easy. So easy, that I’ll do some now. Here:

<b>Make this bold</b>

See, that wasn’t so hard, was it? That’s one line of HTML. HTML is what’s called “markup”, code for styling text on a website. The angled brackets indicate that what is within them is a command. This command is sent to the “code interpreter”, which reads the code and turns it into:

Make this bold

At the end, there is another set of angled brackets, with a forward slash to indicate that you can stop making it bold now, thank you very much. So to the interpreter, it reads:

Start making the next bit bold until I tell you to stop

This is the specific text that I want you to print out, so just print it out

Stop making it bold now
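You can watch something read the line in exactly those three steps. Below is a small sketch that uses Python’s built-in HTML parser as a stand-in for the browser’s interpreter (a real browser does a great deal more, and the `Narrator` class is just a name I’ve invented for this example):

```python
from html.parser import HTMLParser


class Narrator(HTMLParser):
    """Records each thing the parser meets, in the order it meets it."""

    def __init__(self):
        super().__init__()
        self.steps = []

    def handle_starttag(self, tag, attrs):
        # An opening tag: start doing something until told to stop.
        self.steps.append(f"start <{tag}>")

    def handle_data(self, data):
        # Plain text between the tags: just print it out.
        self.steps.append(f"text: {data}")

    def handle_endtag(self, tag):
        # A closing tag with a forward slash: stop now.
        self.steps.append(f"end </{tag}>")


narrator = Narrator()
narrator.feed("<b>Make this bold</b>")
narrator.close()

for step in narrator.steps:
    print(step)
```

It reports a start, some text, and an end, in that order, which is all the interpreter ever really sees.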

That’s quite straightforward, but using essentially that principle (and a few others), you can start building most websites in the world. And end up with code that looks like this:

[Image: a screenshot of densely packed HTML code]

A little bit of code is easy. A lot of code is hard.

The way computers “think” is just very different to humans. Densely written text like that is almost impossible for a human to read. But for computers, this isn’t a problem. They just start at the beginning, and work their way through. Very quickly. In milliseconds. It would take us humans with our monkey brains much, much longer to go through it all.

The upshot of this is that once you go beyond writing a couple of simple lines of code, managing the code becomes a bigger job than actually writing the damn stuff. And when you throw some other coders into the mix, with their own way of coding, and their own idea of how code should be indented (it’s tabs, for Christ’s sake, not spaces!), you start to see why coding is so hard, and why so many software projects fail.

In this blog, I’m going to take a walk through coding, development and software. Looking at what goes right (and wrong), why coding is the way it is, and talk about some of the concepts involved in coding and development. You won’t learn anything useful from this site (it’s not Coding for Dummies or anything like that). But you will learn lots of useless things. And, if I’m honest, I’ve always preferred useless information anyway.