2015-04-23

Journos writing about trading and high-speed computing

I have to admit, this amused me - the Daily Mail trying to write about high-frequency trading:

Suspected rogue trader Navinder Sarao lived in his parents' modest home because it gave him a split-second advantage worth millions of pounds, it was claimed yesterday.
His family's semi-detached house in suburban West London is closer to an internet server used by one of the major financial exchanges, giving him a nanosecond advantage over rivals in the City.
[...]
Sarao, 36, was dubbed the 'Hound of Hounslow' after it emerged he lived at home with his parents, despite allegedly making £26.7million in just four years of dealing from their home.
And yet you'd think that renting a small flat in Slough and paying for Internet access there would have improved his speed advantage; at a cost of about £50K for four years, that would have been a bargain. Why, it's almost as if the Daily Mail journalists had no idea what they were talking about....

2015-04-02

Active attack on an American website by China Unicom

I wondered what the next step in the ongoing war between Western content and Chinese censorship might be. Now we have our answer.

"Git" is a source code repository system which allows programmers around the world to collaborate on writing code: you can get a copy of a software project's source code onto your machine, play around with it to make changes, then send those changes back to Git for others to pick up. Github is a public website (for want of a more pedantic term) which provides a repository for all sorts of software and similar projects. The projects don't actually have to be source code: anything which looks like plain text would be fine. You could use Github to collaborate on writing a book, for instance, as long as you used mostly text for the chapters and not e.g. Microsoft Word's binary format that makes it hard for changes to be applied in sequence.

Two projects on Github are "greatfire" and "cn-nytimes" which are, respectively, a mirror for the Greatfire.org website focused on the Great Firewall of China, and a Chinese translation of New York Times stories. These are, obviously, not something to which the Chinese government wants its citizenry to have unfettered access. However, Github hosts many other non-controversial software projects, and is actually very useful to many software developers in China. What to do?

Last week a massive Distributed Denial of Service (DDoS) attack hit Github:

The attack began around 2AM UTC on Thursday, March 26, and involves a wide combination of attack vectors. These include every vector we've seen in previous attacks as well as some sophisticated new techniques that use the web browsers of unsuspecting, uninvolved people to flood github.com with high levels of traffic. Based on reports we've received, we believe the intent of this attack is to convince us to remove a specific class of content. [my italics]
Blocking Github at the Great Firewall - which is very easy to do - was presumably regarded as undesirable because of its impact on Chinese software businesses. So an attractive alternative was to present the Github team with a clear message that until they discontinued hosting these projects they would continue to be overwhelmed with traffic.

If this attack were just a regular DDoS by compromised PCs around the world it would be relatively trivial to stop: just block the Internet addresses (IPs) of the compromised PCs until traffic returns to normal levels. But this attack is much more clever. It intercepts legitimate requests from web browsers worldwide for a particular JavaScript file hosted by China's Baidu search engine, and substitutes a response containing code that commands repeated requests for pages from the two controversial projects on Github. There's a good analysis from Netresec:

In short, this is how this Man-on-the-Side attack is carried out:
1. An innocent user is browsing the internet from outside China.
2. One website the user visits loads a JavaScript from a server in China, for example the Baidu Analytics script that often is used by web admins to track visitor statistics (much like Google Analytics).
3. The web browser's request for the Baidu JavaScript is detected by the Chinese passive infrastructure as it enters China.
4. A fake response is sent out from within China instead of the actual Baidu Analytics script. This fake response is a malicious JavaScript that tells the user's browser to continuously reload two specific pages on GitHub.com.
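
To make the mechanics concrete, here's a rough Python equivalent of what the injected script did to each hijacked browser: endlessly re-request the two targeted project pages. The URLs, the timing and the use of the third-party requests library are mine for illustration - the real payload was JavaScript running in the victim's browser, not Python.

import time
import requests

# Illustrative guesses at the targeted pages, based on the two project names above.
TARGETS = [
    "https://github.com/greatfire/",
    "https://github.com/cn-nytimes/",
]

while True:
    for url in TARGETS:
        try:
            requests.get(url, timeout=10)   # the response doesn't matter; the load on Github does
        except requests.RequestException:
            pass                            # keep hammering regardless of errors
    time.sleep(2)                           # pause briefly, then go round again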

The interesting question is: where is this fake response happening? We're fairly sure that it's not at Baidu themselves, for reasons you can read in the above links. Now Errata Security has done a nice bit of analysis that points the finger at the Great Firewall implementation in ISP China Unicom:

By looking at the IP addresses in the traceroute, we can conclusively prove that the man-in-the-middle device is located on the backbone of China Unicom, a major service provider in China.
That existing Great Firewall implementors have added this new attack functionality fits with Occam's Razor. It's technically possible for China Unicom infrastructure to have been compromised by patriotically-minded independent hackers in China, but given the alternative that China Unicom have been leant on by the Chinese government to make this change, I know what I'd bet my money on.

This is also a major shift in Great Firewall operations: it's the first major case I'm aware of where the Firewall has been turned on traffic from users outside China rather than on its own citizens.

Github look like they've effectively blocked the attack, after a mad few days of scrambling, and kudos to them. Now we have to decide what the appropriate response is. It seems that any non-encrypted query to a China-hosted website would be potentially fair game for this kind of attack. Even encrypted (https) requests could be compromised, but that would be a huge red arrow indicating that the company owning the original destination (Baidu in this case) had itself been compromised by the attacker: that would make it 90%+ probable that the attacker had State-level influence.

If this kind of attack persists, any USA- or Europe-focused marketing effort by Chinese-hosted companies is going to be thoroughly torpedoed by the reasonable expectation that web traffic is going to be hijacked for government purposes. I wonder whether the Chinese government has just cut off its economic nose to spite its political face.

2015-03-04

What does "running your own email server" mean?

There's a lot of breathless hyperbole today about Hillary Clinton's use of a non-government email address during her tenure as Secretary of State. The Associated Press article is reasonably representative of the focus of the current debate:

The email practices of Hillary Rodham Clinton, who used a private account exclusively for official business when she was secretary of state, grew more intriguing with the disclosure Wednesday that the computer server she used traced back to her family's New York home, according to Internet records reviewed by The Associated Press.
[...]
It was not immediately clear exactly where Clinton's computer server was run, but a business record for the Internet connection it used was registered under the home address for her residence in Chappaqua, New York, as early as August 2010. The customer was listed as Eric Hoteham.
Let's apply a little Internet forensics to the domain in question: clintonemail.com. First, who owns the domain?
$ whois clintonemail.com
[snip]
Domain Name: CLINTONEMAIL.COM
Registry Domain ID: 1537310173_DOMAIN_COM-VRSN
Registrar WHOIS Server: whois.networksolutions.com
Registrar URL: http://networksolutions.com
Updated Date: 2015-01-29T00:44:01Z
Creation Date: 2009-01-13T20:37:32Z
Registrar Registration Expiration Date: 2017-01-13T05:00:00Z
Registrar: NETWORK SOLUTIONS, LLC.
Registrar IANA ID: 2
Registrar Abuse Contact Email: abuse@web.com
Registrar Abuse Contact Phone: +1.8003337680
Reseller:
Domain Status:
Registry Registrant ID:
Registrant Name: PERFECT PRIVACY, LLC
Registrant Organization:
Registrant Street: 12808 Gran Bay Parkway West
Registrant City: Jacksonville
Registrant State/Province: FL
Registrant Postal Code: 32258
Registrant Country: US
Registrant Phone: +1.5707088780
Registrant Phone Ext:
Registrant Fax:
Registrant Fax Ext:
Registrant Email: kr5a95v468n@networksolutionsprivateregistration.com
So back in January this year the record was updated, and we don't necessarily know what it contained before that, but currently Perfect Privacy, LLC are the owners of the domain. They register domains on behalf of people who don't want to be explicitly tied to that domain. That's actually reasonably standard practice: any big company launching a major marketing initiative wants to register domains for their marketing content, but doesn't want the launch to leak. If Intel are launching a new microbe-powered chip, they might want to register microbeinside.com without their competitors noticing that Intel are tied to that domain. That's where the third party registration companies come in.

The domain record itself was created on the 13th of January 2009, which is a pretty strong indicator of when it started to be used. What's interesting, though, is who operates the mail server which receives email to this address. To determine this, you look up the "MX" (mail exchange) records for the domain in question, which is what any email server wanting to send email to hillary@clintonemail.com would do:

$ dig +short clintonemail.com MX
10 clintonemail.com.inbound10.mxlogic.net.
10 clintonemail.com.inbound10.mxlogicmx.net.
mxlogic.net were an Internet hosting company, bought by McAfee in 2009. So they are the ones running the mail servers that accept inbound email for clintonemail.com - and quite possibly also the mailboxes that Hillary's email client (e.g. MS Outlook) connected to in order to retrieve her new mail, though strictly the MX records only tell us about the inbound path.

We do need to take into account though that all we can see now is what the Internet records point to today. Is there any way to know where clintonemail.com's MX records pointed last year, before the current controversy? Basically, no. Unless someone has kept an email from her hdr22@clintonemail.com account, whose headers will show the servers it passed through, or has detailed logs from their own email server which dispatched an email to hdr22@clintonemail.com, it's probably not going to be feasible to determine definitively where she was receiving her email. However, CBS News claims that the switch to mxlogic happened in July 2013 - that sounds fairly specific, so I'll take their word for it for now. I'm very curious to know how they determined that.

All of this obscures the main point, of course, which is that a US federal government representative using a non-.gov email address at all for anything related to government business is really, really bad. Possibly going-to-jail bad, though I understand that the specific regulation requiring a government employee to use a .gov address came into force after Hillary left the role of SecState (February 2013). Still, if I were the Russian or Chinese foreign intelligence service, I'd definitely fancy my chances of completely compromising either a home-run server or a relatively small-scale commercial email service (mxlogic, for instance).

Desperately attempting to spin this whole situation, Heidi Przybyla from Bloomberg pointed at Jeb Bush's jeb.org email domain as an equivalent example.

OK, let's apply our forensics to jeb.org:
$ dig +short jeb.org MX
5 mx1.emailsrvr.com.
10 mx2.emailsrvr.com.
emailsrvr.com is, like mxlogic.net, a third-party email hosting service, apparently specialising in blocking spam. I'm not surprised that someone like Jeb Bush uses it. And, like Hillary, he isn't "running his own email server", he's using an existing commercial email service. It's not Gmail/Outlook.com/Yahoo, but there's no reason to think it's not perfectly serviceable, and it's not controlled by Bush - so if the provider logs or archives incoming or outgoing email, his correspondence is legally discoverable.

The difference between Jeb Bush and Hillary Clinton of course, as many others note, is that Jeb is not part of the US federal government and hence not subject to federal rules on government email...

2015-02-28

No cash for CASH

For those following along with our previous adventures with the prodnoses of Consensus Action on Salt and Health (CASH), their 2014 accounts make an entertaining read, with not a little schadenfreude.

Deprived of the £100K that our friends at the Marcela Trust sent in their direction in 2013 via OMC Investments, they maintained their fairly steady expenditure rate of £150K per year, but since their income was £30K rather than £140K they ended up with a £120K deficit, eroding their capital down to £766K. At this rate, in 6-7 more years they will be out of funds and out of luck. It seems that no-one really likes CASH or wants to give them money in any quantity - at least, not while the world is watching.

The note in the "Movement in funds" section on p.33 is amusing:

The designated fund will provide working capital to the charity to enable it to continue its unique activities whilst the trustees implement their fundraising strategy.
Yes, I'd be interested in what that strategy is going to be. Are they going to try to tap government funds in the classic fakecharity game - lobby the government to give them money to lobby the government? I'll be watching the CASH website and their subsidiary organisation Action on Sugar to see what they're up to.

2015-02-26

Net neutrality - be careful what you wish for

I'm driving my forehead into an ever-deepening dent on my desk in despair at the news that the US Federal Communications Commission has approved new rules governing net neutrality in the USA. This may seem like the sort of news that a progressive geek like your humble bloghost would welcome, but it turns out to involve some inconvenient wrinkles.

The EFF, guardians of liberty, were originally cheering on behalf of net neutrality. Then, 2 days ago, they started to get a little concerned with some of the details being proposed by the FCC:

Unfortunately, if a recent report from Reuters is correct, the general conduct rule will be anything but clear. The FCC will evaluate "harm" based on consideration of seven factors: impact on competition; impact on innovation; impact on free expression; impact on broadband deployment and investments; whether the actions in question are specific to some applications and not others; whether they comply with industry best standards and practices; and whether they take place without the awareness of the end-user, the Internet subscriber.
In essence, the proposed rules for Net Neutrality gave the FCC - a US government agency, headed by a former lobbyist for the cable and wireless industry - an awfully wide scope for deciding whether innovations in Internet delivery were "harmful" or not. There's no way that this could go horribly wrong, surely?

Broadband in the USA

Now, let's start with the assertion that there is an awful lot wrong with broadband provision in the USA currently. It's a lot more expensive than in the UK, it's almost always supplied by the local cable TV provider, and in general there is very little if any choice in most regions. See the broadband provider guide and choose a minimum and maximum of 1 provider - there's an awful lot of the USA with monopoly provision of wired high-speed internet.

The dominant ISPs with high-speed provision are Comcast, AT+T, Time Warner, CenturyLink and Verizon. It would be fair to say that they are not particularly beloved. Comcast in particular is the target of a massive amount of opprobrium: type "Comcast are " into your favourite search engine and you get autocompletion suggestions including "liars", "crooks" and "criminals". American broadband is approximately twice the price of British, and you generally get lower speeds and higher contention ratios (you share a pipe of fixed size with a lot of people, so if your neighbours are watching streaming video then you're out of luck). As effective monopolies, ISPs were in a very powerful position to charge Internet services for streaming data to their customers, as last year's Comcast-Netflix struggle showed - it ended with Netflix effectively forced to pay Comcast to ship the bytes that Netflix customers in Comcast regions were demanding.

Google's upstart "Google Fiber" offering of 1 Gbps (125 MB per second) fiber-optic service tells a story in itself. It's targeting a relatively short list of cities but has been very popular wherever it has opened signups. It has spurred other broadband providers to respond, but in a very focused way: AT+T is planning to offer 1Gbps service, but only in Google Fiber's inaugural area of Kansas City, which is impressive in its brazenness. Other community-based efforts are starting to bear fruit, e.g. NAP is proposing their Avalon gigabit offering in part of Atlanta, Georgia. However, most of the USA is still stuck with practical speeds that have not changed noticeably in half a decade. Entrenched cable ISPs have spent plenty of money on lobbyists to ensure that states and cities make it expensive and difficult for newcomers to compete with them, requiring extensive studies and limiting rights to dig or string fiber-optic cable to residential addresses.

So there's clearly a problem; why won't Net Neutrality solve it?

The ISP problem

Net neutrality essentially says that you (an ISP) can't discriminate between bytes from one service and bytes from a different service. Suppose you have two providers of streaming Internet movies: Netflix and Apple iTunes. Suppose Comcast subscribers in rural Arkansas pay Comcast for a 20Mbps service, easily sufficient for HD streaming video. Comcast controls the network which ends at their customers' home routers, and when it receives a TCP or UDP packet (a small chunk of data) from a customer it will look at the destination address and forward the packet either towards its destination - e.g. a server in the Comcast network - or to one of the other Internet networks Comcast "peers" with. Peering is a boundary across which Internet entities exchange Internet data. When data comes back across that boundary addressed to one of its customers, Comcast routes the data to the customer in question. So far, so good.

Now the customer is paying Comcast for their connection, so it's not really reasonable for Comcast to force them to pay more for data above and beyond the plan they've agreed to. If you've got a 20 Mbps connection, you expect to be able to send / receive 20Mbps more or less forever. Comcast might have a monthly bandwidth cap beyond which you pay more or get a lower speed, but that should be expressed in your plan. Comcast might weight certain kinds of traffic lower than others, so that when 20 people are contending for use of a 100 Mbps pipe, traffic which is less sensitive to being dropped (e.g. streaming video) is dropped more often than more sensitive traffic (web page fetches), but that's all reasonable as long as you know how many people you're contending with and what the rules are.

Streaming video is one kind of traffic that's problematic for ISPs: it requires very little upstream bandwidth from the paying customer. They send an initial "I want to see this video" message and then a low volume of follow-up messages to control the video stream and assure the video streaming service that someone really is still watching it. From Comcast's point of view, though, they have a large amount of latency-sensitive traffic coming into their network from a peering point, so they need to route it through to the destination user and use up a large chunk of their network capacity in the process. If lots of people want to watch videos at once, Comcast will have to widen the incoming pipe from their peer; that will involve buying extra hardware and paying for its associated management overhead so that they can handle the traffic, as long as they are the limiting factor. (Their peer might also be the limiting factor, but that's less likely.)

So the more data users stream concurrently, the more it costs Comcast. This can be mitigated to some extent by caching - storing frequently used data within the Comcast network so that it doesn't have to be fetched from a peer each time - and indeed this is a common strategy used by content delivery networks like Akamai and video streaming firms like YouTube. They provide a bunch of their own PCs and hard disks which Comcast hosts inside its datacenters, and when a user requests a resource (video, image, music file, new operating system image) which might be available in that cache they will be directed to the cache computers. The cache will send the data directly if it's available; if not, it will download it, send it on, and store it locally so that if someone else requests it then it's ready to send to them directly. This massively reduces the bandwidth needed for popular data (large ad campaigns, "Gangnam Style" videos, streaming video releases), and also increases reliability and reduces latency from the user's perspective, but it costs the provider a substantial overhead (and operational expertise) to buy, emplace and maintain the hardware and enable the software to use it.
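
As a minimal sketch of that caching idea - not any particular CDN's implementation - here's the logic in Python, with an in-memory dict standing in for the racks of disks; the function and variable names are mine:

import requests

cache = {}   # url -> bytes; a real edge cache persists to disk and expires entries

def fetch_cached(url):
    """Serve from the local cache if we already have the bytes; otherwise fetch
    from the origin once, keep a copy, and return it."""
    if url in cache:
        return cache[url]                            # cache hit: nothing crosses the peering link
    data = requests.get(url, timeout=10).content     # cache miss: fetch from the origin
    cache[url] = data                                # keep it for the next requester
    return data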

The non-neutral solution

If Netflix aren't willing or able to pay for this, Comcast is stuck with widening their pipe to their peers. One might argue that that's what they're supposed to do, and that their customers are paying them to be able to access the Greater Internet at 20Mbps, not just Comcast's local services. But Comcast might not see it this way. They know what destination and source addresses belong to Netflix, so they might decide "we have 100 Gbps of inbound connectivity on this link, and 50 Gbps of that is Netflix video streaming source addresses at peak. Let's reduce Netflix to a maximum of 20 Gbps - at peak, any packet from Netflix video streaming sources has a 60% chance of being dropped - and see what happens."
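
To show how little machinery this takes, here's a toy Python sketch of that kind of source-address throttling. The prefix and drop rate are invented for illustration; a real ISP would do this with router ACLs and policers rather than anything like this code.

import ipaddress
import random

# Invented example prefix standing in for "Netflix video streaming sources".
THROTTLED_PREFIXES = [ipaddress.ip_network("198.51.100.0/24")]
DROP_PROBABILITY = 0.6   # at peak, drop 60% of packets from those sources

def should_forward(src_ip):
    """Decide whether to forward a packet, based purely on its source address."""
    addr = ipaddress.ip_address(src_ip)
    if any(addr in prefix for prefix in THROTTLED_PREFIXES):
        return random.random() >= DROP_PROBABILITY
    return True   # everyone else's traffic passes untouched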

You see where the "neutrality" aspect comes in? Comcast is dropping inbound traffic based solely on its source address - what company it comes from. Only internal Comcast configuration needs to be changed. From the customer's point of view, Netflix traffic is suddenly very choppy or even nonfunctional at peak times - but YouTube, Facebook, Twitter etc. all work fine. So Netflix must be the problem. Why am I paying them money for this crap service? (Cue angry mail to Netflix customer support).

Net Neutrality says that Comcast can't do this - it can't discriminate based on source or destination address. Of course, it's not really neutral, because ISPs might still blacklist traffic from illegal providers, e.g. the Pirate Bay, but since that's normally done at the request of law enforcement it's regarded as OK by most.

The problem

The USA has handed the Federal Communications Commission, via the "general conduct" rules, a massive amount of control of and discretion in the way in which ISPs handle Internet traffic. It presumes that the FCC has the actual best interests of American consumers at heart, and is intelligent and foresighted enough to apply the rules to that effect. Given the past history of government agencies in customer service and in being effectively captured by the industries they are supposed to regulate, this seems... unwise.

2015-02-15

Failing to listen to the sounds of Chinese silence

I was moved by an interesting yet flawed piece by John Naughton in the Grauniad, analysing the kinds of censorship applied by the Chinese government:

So they [researchers] clicked on the URLs associated with a sample of posts and found that some – but not all – had vanished: the pages had disappeared from cyberspace.
The question then was: what was it about the "disappeared" posts that had led to them being censored? And at that point the experiment became very interesting indeed. First of all, it confirmed what other researchers had found, namely that, contrary to neoliberal fantasy, speech on the Chinese internet is remarkably free, vibrant and raucous. But this unruly discourse is watched by a veritable army (maybe as many as 250,000-strong) of censors. And what they are looking for is only certain kinds of free speech, specifically, speech that has the potential for engendering collective action – mobilising folks to do something together in the offline world.

The study quoted is indeed interesting, and highlights one particular and significant aspect of Chinese censorship. Where Naughton goes wrong, though, is in failing to note the unseen, and this is picked up by CiF commentator steviematt:

The Harvard research and Gary King's opinion are both flawed beyond belief.
It only factors the number of posts that were originally published and then disappeared over the course of weeks and months. It ignores the fact that most posts that are critical never have a chance of passing through the filters in the first place.
Indeed, Naughton fails to notice that many of the websites the West takes for granted as outlets for expressing opinion are completely blocked in China. Within China, sites like Twitter and Facebook are essentially completely unavailable. YouTube: no chance. You can get to a limited set of Google sites (search and maps are on-and-off accessible in my experience), but it's very iffy. Blogger seems completely blocked. Bing search seems to work fine though. Why is that?

It's because if you are a Western firm which wants to provide an Internet site within China, you have to partner with a Chinese company and accept the conditions of serving users within China - key among these is agreeing to provide identity information about your users (source IP addresses, times logged on, etc.) at the "request" of the government. The case of Yahoo and the Chinese dissident Shi Tao is illuminating:

According to a letter Amnesty International received from Yahoo! (YHOO), and Yahoo!'s own later public admissions, Yahoo! China provided account-holder information, in compliance with a government request, that led to Shi Tao's sentencing.
Jerry Yang, then-CEO of Yahoo, got roasted by Congress for providing this information when this story came out. Truth be told, though, he really didn't have much choice - Yahoo had presumably agreed to these conditions when it started serving China-based users. If you don't want to play ball with those conditions, and it seems that Google, Twitter and Facebook don't, you're going to be serving outside China and prone to getting blocked by the Great Firewall.

So when Naughton comments "only some kinds of activities are blocked" it's actually in the context of "only some users are willing to discuss these kinds of activities on sites where they know the government has the right to waltz in and demand their details at any time" (before presumably visiting them at home and offering them an extended stay at a pleasant little camp out in the country, for a year or ten.)

Rumours suggest that Facebook might announce something aimed at Chinese users but it's not obvious how they're going to deal with the existing restrictions. Still, Zuckerberg's a smart guy and doesn't seem to be an obvious patsy for the Chinese regime, so it's possible he's got something clever up his sleeve. Stay tuned.

2015-01-22

Mendacity from Amy Nicholson

In Slate, L.A. Weekly movie critic Amy Nicholson takes aim at deceased sniper and Navy SEAL Chris Kyle:

Take American Sniper, one of the most mendacious movies of 2014. Clint Eastwood was caught in a trap: His subject, murdered Navy SEAL Chris Kyle, lied a lot. In his autobiography, he said he killed two carjackers in Texas, sniped looters during Hurricane Katrina, and punched Jesse Ventura in the face. None of that was true. So Eastwood was stuck. Should he repeat Kyle’s lies as truth? Expose him as a liar?
Ironically, her article is titled "Clint Eastwood's American Sniper is one of the most mendacious movies of 2014", because she clearly hasn't read Kyle's autobiography. In his autobiography he does not discuss either of the first two situations she describes, at all. The third situation is described, but Jesse Ventura is not mentioned (Kyle calls the participant "Scruffy" and although some of Scruffy's background is consistent with Ventura's, it's not an obvious link). So Nicholson is doing at least one of two things: (1) making claims about a book she hasn't read, or (2) making knowingly false claims about a book she has read.

It's slightly clearer when you read the New Yorker article which she links, because it reports third-party recountings of the first two stories: people who claim to have heard Kyle tell them. Kyle may or may not have told these stories, and they may or may not have been accurately recounted by the third parties. The Scruffy story was later confirmed by Kyle in a video interview to pertain to Ventura, and a court subsequently decided that Ventura had been libelled by it. It's a pretty misleading recounting by Nicholson though, whether or not the claim turns out to be substantially true - if you aspire to be an actual journalist, you'd be expected to have a clear understanding of first- vs second- vs third-party sources and to make the distinction clear in your articles. Perhaps Ms. Nicholson has no such aspiration and is happy being a partisan hack.

2015-01-06

BBC booze bill shocker

The shocker is, it's extremely reasonable:

The Corporation stated that the figure related to 'non-production related and production related spend'.
It added: 'The total spent on alcohol for the period 1st October 2013 to 26th October 2014 with the BBC's single preferred supplier Majestic Wine PLC was £43,000.'

I'm not the greatest fan of the BBC's compulsory TV licence, but I really don't think that this is worthy even of a Daily Mail throwaway article:

  • Use of bulk supplier for savings: check
  • Cost per employee per year: £2, eminently reasonable, no reason to think this is taxpayer-funded employee booze
  • Cost per day: £130 over all channels and events. That's about 3 bottles of Veuve Clicquot NV at Sainsbury's prices. Assuming the BBC allocates half a bottle per top echelon (MP, MEP, sleb) guest, that's 6 top echelon guests per day which sounds about right.
It comes as up to 50 MPs called for the licence fee to be scrapped and replaced with a voluntary subscription service in its place.
Talk about tenuous connections. This is possibly one of the strongest signals of thrifty BBC spending there is, and you're linking it to a call for licence fee repeal? Your logic is not like our Earth logic, Daily Mail.

2014-12-24

Scentrics, "Key Man" and mobile security, oh my

From a story in the Daily Mail today I found this October article in the Evening Standard about security firm Scentrics, which has been working with UCL.

In technical parlance, Scentrics has patented the IP for “a standards-based, fully automatic, cryptographic key management and distribution protocol for UMTS and TCP/IP”. What that translates as in layman’s language is “one-click privacy”, the pressing of a button to guarantee absolute security.
Where issues of national security are concerned, the ciphers used are all government-approved, which means messages can be accessed if they need to be by the security services. What it also signals in reality is a fortune for Scentrics and its wealthy individual shareholders, who each put in £500,000 to £10 million.
Hmm. That's a fairly vague description - the "government-approved" language makes it look like key escrow, but it's not clear. I was curious about the details, but there didn't seem to be any linked from the stories. Scentrics' Paran Chandrasekaran was also touting this in the Independent in October, and it's not clear why the Mail ran with the story now.

I tried googling around for any previous news from Scentrics. Nada. So I tried "Paran Chandrasekaran" and found him back in 2000 talking about maybe netting £450M from the prospective sale of his company Indicii Salus. I couldn't find any announcements of the sale happening, but it looks like email security firm Comodo acquired the IP from Indicii Salus in March 2006. According to Comodo's press release:

The core technology acquired under this acquisition includes Indicii Salus Limited's flagship security solution which, unlike other PKI offerings, is based on server-centric architecture with all information held securely in a central location thus providing a central platform necessary to host and administer central key management solutions.
That's a single-point-of-failure design, of course - when your central server is down, you are screwed, and all clients need to be able to authenticate your central server, so they all need its current public key or similar signature-validation material. It's not really world-setting-on-fire stuff, but hey, it was 8 years ago.

Then LexisWeb turns up an interesting court case: Indicii Salus Ltd v Chandrasekaran and others with summary "Claimant [Indicii Salus] alleging defendants [Chandrasekaran and others] intending to improperly use its software - Search order being executed against defendants - Defendants applying to discharge order - Action being disposed of by undertakings not to improperly use software"

Where the claimant had brought proceedings against the defendants, alleging that they intended to improperly use its software in a new business, the defendants' application to discharge a search order, permitting a search of the matrimonial home of the first and second defendants, would be dismissed.
The case appears to be fairly widely quoted in discussions of search-and-seizure litigation. I wonder whether Paran Chandrasekaran was one of the defendants here, or whether they were other family members? There's no indication of what happened subsequently.

How odd. Anyway, here's a sample of the Scentrics patent (USPTO Patent Application 20140082348):

The invention extends to a mobile device configured to:
send to a messaging server, or receive from a messaging server, an electronic message which is encrypted with a messaging key;
encrypt a copy of the message with a monitoring key different from the messaging key; and
send the encrypted copy to a monitoring server remote from the messaging server.
[...]
Thus it will be seen by those skilled in the art that, in accordance with the invention, an encrypted copy of a message sent securely from the mobile device, or received securely by it, is generated by the device itself, and is sent to a monitoring server, where it can be decrypted by an authorized third party who has access to a decryption key associated with the monitoring key. In this way, an authorized third party can, when needed, monitor a message without the operator of the messaging server being required to participate in the monitoring process.
Because both the message and its copy are encrypted when in transit to or from the mobile device, unauthorized eavesdropping by malicious parties is still prevented.
This reads to me like "given a message and a target, you encrypt it with a public key whose private key is held by your target and send it to the target as normal, but you also encrypt it with a separate key known to a potential authorized snooper and send it to their server so that they can access it if they want to."
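
If I'm reading the patent right, the scheme boils down to something like the following Python sketch. I'm using symmetric Fernet keys from the cryptography package purely as stand-ins for whatever government-approved asymmetric ciphers the patent actually envisages, and the names are mine, not Scentrics':

from cryptography.fernet import Fernet

# Stand-ins: in the patent, the "monitoring key" would be one whose decryption
# key is held by the authorised third party, not by the sender or recipient.
recipient_key = Fernet(Fernet.generate_key())
monitoring_key = Fernet(Fernet.generate_key())

def send(plaintext: bytes):
    messaging_copy = recipient_key.encrypt(plaintext)    # normal path, via the messaging server
    monitoring_copy = monitoring_key.encrypt(plaintext)  # escrow copy, sent to the monitoring server
    return messaging_copy, monitoring_copy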

WTF? That's really not a world-beating million-dollar idea. Really, really it's not. Am I reading the wrong patent here? Speaking personally, I wouldn't invest in this idea with five quid I found on the street.

2014-12-16

The 2038 problem

I was inspired - perhaps that's not quite the right word - by this article on the Year 2038 bug in the Daily Mail:

Will computers be wiped out on 19 January 2038? Outdated PC systems will not be able to cope with time and date, experts warn
Psy's Gangnam Style was recently viewed so many times on YouTube that the site had to upgrade the way figures are shown on the site.
  1. The site 'broke' because it runs on a 32-bit system, which uses four-bytes
  2. These systems can only handle a finite number of binary digits
  3. A four-byte format assumes time began on 1 January, 1970, at 12:00:00
  4. At 03:14:07 UTC on Tuesday, 19 January 2038, the maximum number of seconds that a 32-bit system can handle will have passed since this date
  5. This will cause computers to run negative numbers, and dates [sic]
  6. Anomaly could cause software to crash and computers to be wiped out
I've numbered the points for ease of reference. Let's explain to author Victoria Woollaston (Deputy Science and Technology editor) where she went wrong. The starting axiom is that you can represent 4,294,967,296 distinct numbers with 32 binary digits of information.

1. YouTube didn't (as far as I can see) "break".

Here's the original YouTube post on the event on Dec 1st:

We never thought a video would be watched in numbers greater than a 32-bit integer (=2,147,483,647 views), but that was before we met PSY. "Gangnam Style" has been viewed so many times we had to upgrade to a 64-bit integer (9,223,372,036,854,775,808)!
When they say "integer" they mean it in the correct mathematical sense: a whole number which may be negative, 0 or positive. Although 32 bits can represent 4bn+ numbers as noted above, if you need to represent negative numbers as well as positive then you need to reserve one of those bits to represent that information (all readers about to comment about two's complement representation can save themselves the effort, the difference isn't material.) That leaves you just over 2bn positive and 2bn negative numbers. It's a little bit surprising that they chose to use integers rather than unsigned (natural) numbers as negative view counts don't make sense but hey, whatever.
Presumably they saw Gangnam Style approaching 2 billion views and decided to pre-emptively upgrade their views field from a signed 32-bit to a signed 64-bit integer. This is likely not a trivial change - if you're using a regular database, you'd do it via a schema change that requires reprocessing the entire database, and I'd guess that YouTube's database is quite big - but it seemed to be in place by the time the count hit the signed 32-bit limit.

2. All systems can only handle a finite number of binary digits.

For fuck's sake. We don't have infinite storage anywhere in the world. The problem is that the finite number of binary digits (32) in a 4-byte representation is too small. An 8-byte representation has twice the number of binary digits (64, which is still finite) and so can represent many more numbers.

3. The number of bytes has nothing to do with when time is assumed to begin.

Unix computers (Linux, BSD, OS X etc.) represent time as seconds since the epoch. The epoch is defined as 00:00:00 Coordinated Universal Time (UTC - for most purposes, the same as GMT), Thursday, 1 January 1970. The Unix standard was to count those seconds in a 32 bit signed integer. Now it's clear that 03:14:08 UTC on 19 January 2038 will see that number of seconds exceed what can be stored in a 32 bit signed integer, and the counter will wrap around to a negative number. What happens then is anyone's guess and very application dependent, but it's probably not good.
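
You can check that date for yourself: the last second a signed 32-bit time counter can hold is 2**31 - 1 seconds after the epoch. A quick check in Python:

from datetime import datetime, timezone

last_second = 2**31 - 1   # 2,147,483,647 seconds after 00:00:00 UTC, 1 January 1970
print(datetime.fromtimestamp(last_second, tz=timezone.utc))
# 2038-01-19 03:14:07+00:00 - one second later, a 32-bit signed counter wraps negative
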
There is a move towards 64-bit computing in the Unix world, which will include migration of these time representations to 64 bits. Because this move is happening now, we have 23 years to complete it before we reach our Armageddon date. I don't expect there to be many 32-bit systems left operating by then - their memory will have rotted and their disk drives seized. Only emulated systems will still be working, and everyone knows about the 2038 problem.

4. Basically correct, if grammatically poor

5. Who taught you English, headline writer?

As noted above, what will actually happen on the date in question is heavily dependent on how each program using the information behaves. The most likely result is a crash of some form, but you might see corruption of data before that happens. It won't be good. Luckily it's easy to test programs by just advancing the clock forwards and seeing what happens when the time ticks over. Don't try this on a live system, however.

6. Software crash, sure. Computer being "wiped out"? Unlikely

I can see certain circumstances where a negative date could cause a hard drive to be wiped, but I'd expect it to be more common for hard drives to be filled up - if a janitor process is cleaning up old files, it'll look for files with modification time below a certain value (say, all files older than 5 minutes ago). Files created before the positive-to-negative date point won't be cleaned up by janitors running after that point. So we leave those stale files lying around, but files created after that will still be eligible for clean-up - they have a negative time which is less than the janitor's negative measurement point.
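
Here's that janitor logic as a small Python sketch, with made-up second counts, to show why pre-wrap files never look "old" to a post-wrap janitor:

def is_stale(mtime, now, max_age=300):
    """The janitor's test: anything modified more than max_age seconds ago is deleted."""
    return mtime < now - max_age

now = -(2**31) + 600                 # ten minutes after the 2038 wrap, as a raw 32-bit count
print(is_stale(2**31 - 10, now))     # False: a file written just before the wrap never looks old
print(is_stale(now - 600, now))      # True: post-wrap files older than five minutes still get cleaned up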

I'm sure there will be date-related breakage as we approach 2038 - if a bank system manages 10-year bonds, then we will see breakage as their expiry dates go past January 2038, so the bank will see breakage in 2028. But hey, companies are already selling 50-year bonds, so bank systems have had to deal with this problem already.

Thank goodness that I can rely on the Daily Mail journalists' expertise in all the articles that I don't actually know anything about.