Later, SHAMROCK turns into BLARNEY (no really)

… So the question has to be not so much ‘Is Big Brother watching?’ but ‘How in hell can it cope?’ We know what the NSA’s job is, but we don’t know how it does it. How would you, as a junior analyst in S2C41, the branch of the Signals Intelligence Directorate responsible for monitoring Mexico’s leadership, navigate the millions of call records and pieces of ‘digital network intelligence’ logged from Mexico daily, in order to find that nugget of information about energy policy that’s going to get you noticed? For all the doomsaying certainty of the news stories that have periodically filled front pages since early June, we are still in the dark about most of the NSA’s actual methods and day-to-day activities. The NSA employs more than thirty thousand people and has an annual budget of nearly $11 billion; outside its headquarters at Fort Meade in Maryland, it operates major facilities in Georgia, Texas, Hawaii and Colorado, and staffs listening posts around the world. The leaks are, at best, a series of tiny windows into a giant fortress. It’s still hard to spy on the activity within.

The documents we’ve seen – a fraction of the total number in the hands of Guardian and Washington Post journalists – are a blur of codenames. EVILOLIVE, MADCAPOCELOT, ORANGECRUSH, COBALTFALCON, DARKTHUNDER: the names are beguiling. But they don’t always tell us much, which is their reason for existing: covernames aren’t classified, and many of them – including the names of the NSA’s main databases for intercepted communications data, MAINWAY, MARINA, PINWALE and NUCLEON – have been seen in public before, in job ads and resumés posted online (these have been collected over the years by a journalist called William Arkin, who has written several books on American secrecy and maintains a useful blog). It’s been a feature of the coverage that the magic of the words has been used to stand for a generalised assertion of continuous mass surveillance. On 29 September the New York Times ran a story reporting that MAINWAY was being used ‘to create sophisticated graphs of some Americans’ social connections’. The next day, not wanting to have its thunder stolen, the Guardian, which after all owned the Snowden story, having broken it, ran a front-page piece saying that MARINA provided the ability to look back on the past 365 days of a user’s internet browsing behaviour. The only new piece of information in the story – new in the sense that it hadn’t already been reported in the Guardian – was the business of the year’s worth of history. It was a case of my database is scarier than yours.

One reason for the uncertainty over what these things are for and how they work is that the leaked documents aren’t everything you might hope. The ones which have been relied on most heavily in the coverage are PowerPoint presentations that are usually described as ‘training slides’, even though – in the sections which have been made public, at least – they tend not to explain how a particular system is used. They are more like internal sales brochures aimed at the analysts, bigging up the benefits of one method over all the others. ‘PRISM,’ one introductory slide says, ‘The SIGAD Used Most in NSA Reporting.’ A series of bar charts shows how relatively rubbish other forms of collection are by comparison. The presentation’s author, PRISM’s own collection manager, proudly notes the ‘exponential’ growth in the number of requests made through the system for Skype data: 248 per cent. ‘Looks like the word is getting out about our capability against Skype.’

The system about which most detail is given, thanks to a presentation that begins with the question ‘What can you do with XKEYSCORE?’, sells itself by advertising – in a bullet-pointed list – its ‘small, focused team’ that can ‘work closely with the analysts’. There’s some geeky speak of Linux clusters and the Federated Query Mechanism – which simultaneously searches current traffic at all of the NSA’s collection sites around the globe – as well as a strong sense of startup culture: XKEYSCORE’s philosophy is ‘deploy early, deploy often’, a weaponised version of the Silicon Valley mantra beloved of Facebook engineers, ‘ship early, ship often’. Some handy use cases are listed: find everyone using PGP encryption in Iran, find everyone in Sweden visiting an extremist web forum. ‘No other system’ – these words highlighted in red – ‘performs this on raw unselected bulk traffic.’ There’s an endorsement from the Africa team, declaring that XKEYSCORE gave it access to stuff from the Tunisian Interior Ministry that no other surveillance system had managed to catch. It’s not unlike a washing powder ad. One of the things these slides are most revealing of is the marketplace within the NSA. At your desk in S2C41, as you sit down to find the best way to home in on dodgy goings-on by senior Mexicans, you have a whole menu of sexy tools to choose from.

The sales-speak nature of this material means that it can be misleading. It was the PRISM system – which the reports said gave the NSA ‘direct access’ to the servers of some of Silicon Valley’s biggest and most beloved companies, including Facebook, Google, Apple and YouTube – that dominated the headlines when the leaks first hit. The idea that the genius behind your perfectly engineered iPhone and the friendly souls behind the colourful Google logo had willingly collaborated with the electronic eavesdroppers to hand over the full set of keys to their multibillion-dollar server farms – when there was no law that could require them to do so – was a shock to many. It was also at some level outlandish: in most cases (if you leave aside Apple), the data the company possesses is what generates its phenomenal value, and it was hard to imagine that this commercially priceless property would be freely shared with anyone, let alone with the government. (Ayn Randist libertarian capitalists don’t like government.) The internet companies themselves categorically denied any knowledge of the PRISM programme, or anything like it.

But ‘collection directly from the servers’ was what the slides said, and the implication was that the full unencrypted traffic from everyone’s favourite web services was being piped wholesale into the NSA’s databases. The implication turned out to be wrong. What happens is that an NSA analyst ‘tasks’ PRISM by nominating a ‘selector’ – meaning an email address or username – for collection and analysis. In other words, PRISM allows an NSA worker to submit a request, which is invariably granted, to monitor an individual Gmail account or Yahoo identity or Facebook profile and have all its activity sent back to the NSA. (In this context, ‘direct access’ is accurate: if a selector has been approved for monitoring, the NSA has access to it in real time.) One of the slides the Guardian didn’t disclose – it appeared a few days later in the Washington Post – showed a screenshot of the tool used to search records retrieved through PRISM. The total count of records in the database – in April, when the slide was made – was 117,675. It’s worth looking at that number. Facebook has a billion users: half of the internet-connected population of the planet has an account. The fraction of those whose full unencrypted activity the NSA was actively monitoring can be little more than 0.01 per cent. This isn’t to pretend that the NSA high-mindedly refrains from seeking access to our baby pictures or inane comments on other people’s baby pictures. But it does suggest that you don’t fill in a form to access a random Mexican’s timeline unless you expect to get something out of it.
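The back-of-the-envelope sum behind that fraction is easy to check. The sketch below uses only the two figures quoted above, and makes the same generous assumption the comparison itself makes: that every one of the 117,675 records stands for a separate monitored account, all of them on Facebook. The variable names are illustrative, nothing more.

```python
# Figures quoted in the text; everything else here is illustrative.
prism_records = 117_675          # records shown in the April screenshot
facebook_users = 1_000_000_000   # 'a billion users'

# Assumption: one record == one monitored account, all on Facebook.
share = prism_records / facebook_users
share_percent = share * 100
one_in = round(facebook_users / prism_records)

print(f"{share_percent:.4f} per cent of accounts")
print(f"about one account in {one_in:,}")
```

Even on that worst-case reading the share comes out at a shade over a hundredth of one per cent, or something like one account in eight and a half thousand.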

Another slide the Guardian withheld – it published only five of the 41 in the full presentation, citing security concerns, though the wish for maximum impact could be another reason for the choice – describes the PRISM ‘tasking process’. The slide shows a flowchart of mind-numbing complexity. After the analyst puts selectors into the Unified Targeting Tool, they are passed to S2 FAA Adjudicators in Each Product Line and to Special FISA Oversight and Processing (SV4), before going to a third department, Targeting and Mission Management (S343), pending Final Targeting Review and Release. Somewhere at the bottom of the line the approved request gets handed over to the FBI’s Data Intercept Technology Unit (DITU), the external body which actually interfaces with whichever internet company the NSA needs data from. (You can see why Facebook, Google et al have found it so easy to maintain that they aren’t systematically feeding the NSA.) The internet company hands over the requested data to the FBI – in 90 per cent of cases with no questions asked – and the information is then processed and ingested into NSA databases for all analysts to enjoy.

As ever, the blandly obscurantist codes give little sense of what is actually going on, and it’s easy to suppose – as many do – that all this meaningless superstructure is designed merely to give a semblance of due process to a system that has none. But in fact the arrangement has its devilish logic, each coded unit standing for a whole subsection of the NSA’s huge, hydra-headed military bureaucracy. The full extent of this bureaucracy is one of the most valuable lessons of the leaks. S2 is ‘analysis and production’, S3 ‘data acquisition’. S35 and its subcodes refer to Special Source Operations, the department responsible for conducting the delicate task of arranging ‘partnerships’ with entities that can give the NSA access to data that can’t be reached by any other means: cable companies, internet backbone providers, the maintainers of the switches and relays that keep global communications whirring. It is these arrangements that give rise to many of the more spectacular covernames that have been seen recently: MONKEYROCKET, SHIFTINGSHADOW, YACHTSHOP, SILVERZEPHYR. The type of data these sources provide, whether phone or internet records, is lightly classified: it’s merely secret. The area the source is targeted at – say, counterterrorism in the Middle East – is classified top secret. How the NSA has actually gone about getting hold of these data streams – through what pressure put on what companies by what means – is so sensitive that none of the documents we’ve seen even hints at it.

SILVERZEPHYR (SIGAD US-3273) is a source of particular interest to our man on the Mexico desk. It delivers data from Central and South America, serving up phone and fax metadata, as well as internet records – both metadata and content. An impressive demonstration of what can be achieved with it appears in an NSA presentation that was released last month to Fantástico, a Brazilian news programme, by Glenn Greenwald, the chief shepherd of the Snowden leaks. The presentation is a case study to show the benefits of creating ‘contact graphs’, ‘a useful way of visualising and analysing the structure of communication networks’. The slides describe a two-week ‘surge’ operation that S2C41 carried out in the final month of the 2012 presidential campaign against Enrique Peña Nieto, who was then leading in the polls, and nine of his closest advisers.

The analysts first tasked their systems with ‘seed’ selectors, representing the phone numbers of Peña Nieto and the advisers. Using MAINWAY – the database, you’ll remember, that allows for analysis of phone metadata and the relationships between numbers – S2C41 then produced a ‘two-hop’ contact graph, to show everyone each seed communicated with, and everyone those people communicated with too. Further analysis of the graph showed who in the network was most significant, including targets who until then hadn’t been known. It was then a cinch to run the content of all text messages sent from and received by these significant numbers through a system called DISHFIRE, which extracted any messages that were ‘interesting’. Among these messages were lists of names of the people who would be given senior positions in a Peña Nieto administration. Six months after Peña Nieto’s election, all the people listed had joined the government. A case study like this shows why you really do need all the systems at your disposal to do useful work at the NSA. It’s also a good primer in how to learn things that are unknown to anybody other than the Mexican president-elect, and perhaps his wife …
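For anyone wondering what a ‘two-hop’ contact graph amounts to in practice, the idea can be sketched in a few lines of Python. This is a toy illustration and not a description of how MAINWAY works: the record format (bare pairs of phone numbers), the function names and the sample data are all invented for the example. The slides don’t say how ‘significance’ was measured, so the ranking below simply counts contacts inside the graph, which is an assumption.

```python
from collections import defaultdict


def build_graph(call_records):
    """Build an undirected contact map from (number_a, number_b) pairs."""
    graph = defaultdict(set)
    for a, b in call_records:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def two_hop(graph, seeds):
    """Return {number: hops from nearest seed} for everything within two hops."""
    distance = {seed: 0 for seed in seeds}
    frontier = set(seeds)
    for hop in (1, 2):
        next_frontier = set()
        for number in frontier:
            for contact in graph.get(number, ()):
                if contact not in distance:
                    distance[contact] = hop
                    next_frontier.add(contact)
        frontier = next_frontier
    return distance


def rank_contacts(graph, distance):
    """Rank non-seed numbers by how many contacts they have inside the graph.

    Assumption: contact count stands in for 'significance'.
    """
    def degree(number):
        return sum(1 for contact in graph[number] if contact in distance)

    candidates = [n for n, hops in distance.items() if hops > 0]
    return sorted(candidates, key=lambda n: (-degree(n), n))


# Hypothetical metadata: who was in contact with whom.
records = [
    ("seed1", "a"), ("seed1", "b"), ("seed2", "b"),
    ("a", "c"), ("b", "c"), ("c", "d"),
]
graph = build_graph(records)
distance = two_hop(graph, ["seed1", "seed2"])
ranked = rank_contacts(graph, distance)
print(distance)
print(ranked)
```

In the sample data, ‘d’ is three steps from any seed and so falls outside the graph, while ‘b’, in touch with both seeds, comes top of the ranking: the kind of previously unknown number an analyst would then want to look at more closely.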