Data replication and recovery strategies

You can’t get the train from Jersey City, a former blue-collar town across the Hudson from Manhattan, directly to Wall St anymore; the station is now smashed steel and rock and lies underwater. But it’s not such a long walk after all from the Christopher Street station, and there’s always the ferry. People are managing.

Even before 11 September, though, this part of the shoreline in northern New Jersey, the state famously home to Tony Soprano, was up and coming. In what was until recently a Polish immigrant neighbourhood, workers in the financial district had been buying up condominiums.

On the same block stand newer buildings, begun long before the terrorist attacks, in which brokerage firms had already been installing satellite trading rooms, data centres and clearing facilities. Not as glam as downtown, but cheaper. Welcome to “Wall St West”. Nothing says you have to keep doing what you do where you used to.

The same is true of corporate data. Once upon a time it lived in the mainframe.

Now it lives essentially on the network, and customers have many options on where to put it to protect it as best they can. “We live in a world of shared data, particularly in financial services,” says Richard Hall, chief technology strategist at specialist data replication consultancy Evolution.

For James Governor, analyst in the application development practice of US analyst firm Illuminata, the issues of data replication and data recovery have little to do with disaster of any stripe, and more to do with bigger trends. “It’s all about wider issues about the corporate approach to data and the creation of data models, as well as about where the data gets held. In many ways data replication is almost trying to do 3G networking without the networks, with innovative services on top of a highly distributed architecture. It’s partly Java and partly a number of other things, such as repositories of data and ontologies of metadata. This is the way the industry’s headed anyway.”

Not just technologically. In 1999 the Institute of Chartered Accountants in England and Wales published its final guidance on the implementation of the internal control requirements of the Combined Code on Corporate Governance. The Turnbull Report, endorsed by the Stock Exchange, specifically says companies need to have clearly laid out policies on risk management, including IT problems. Having a business continuity plan is now pretty much a prerequisite to getting listed. Plus, the Data Protection Act and RIPA have implications for data collection and management, as do the usual financial and regulatory commandments of institutions like the SEC.

And “we’ve always had terrorism in the UK, thanks to the IRA. What’s happening now is a lot of testing of crisis management plans, not a new market”, says Chris Keeling, managing principal at information security and risk management firm Insight Consulting, whose clients include UBS Warburg and Deutsche Bank.

And don’t forget that sometimes you want the computers to be down – for system upgrades, office moves and such. Planned downtime needs to be catered for just as much as unplanned downtime needs to be combated. According to Hitachi Data Systems, while unplanned downtime may be only a few hours a year, planned downtime can often run to as much as 70 hours.

It used to be so simple. You had a mainframe, halon gas dispensers to put out any fires, and a big battery in case the power failed. But as the architecture of a typical large company grew more heterogeneous, so did the ways to cover all eventualities.

The more established solutions typically centre on a big UPS (uninterruptible power supply) network and a remote or mirror site where a duplicate of the main computing platform sits. This is nothing new for City firms.

But as networks have burgeoned – especially fibre-optic channels, especially in metropolitan areas – so has the notion of storing data on the network, so there is no one central place it lives in and thus could be “killed” in.

But let’s be sure whatever option is taken works. According to research from McGladrey and Pullen, any company unable to access its data for 10 days will never make a full financial recovery, and 43% of businesses will close altogether.

As a result, consultants have a major opportunity – and responsibility – to help organisations decide on the best data replication and recovery strategy. The tragic events of last September sharpened the issue rather than creating it. “Every manager was already worried about how to protect his data from logical and system disasters,” says Mark Woodford, a consultant at storage consultancy Posetiv, a spin-off of Computacenter. Steve Delmege, StorageTek’s marketing director for UK and Ireland, agrees. “It’s wrong to focus on 11 September; this has always been an issue,” he says.

In some ways, nothing and everything’s changed in the world of data security.

“The process of protecting data by getting a backup copy hasn’t changed in 25 years,” says Steve Pearce, chief operating officer of a company called InTechnology. Pearce’s company specialises in, yes, you guessed it, tape – or more specifically, in replacing at least some of that medium with an automated virtual backup system, Vbak.

But despite the mountains of tape out there – Delmege estimates there is seven times as much tape as attached hard disk – there are many new options to consider about where your data should live before being consigned to that magnetic grave. “There’s still demand for more data and more access to that data, and the ability to use it,” he adds.

The spectrum of ways of doing that, with confidence that you’re reading the most kosher and protected data, now runs from traditional disaster recovery and UPS arrangements, through networked storage (either NAS, network attached storage, or SAN, storage area network), to software solutions such as disk mirroring and database replication, and on to co-location and dark fibre options.

But how to choose? “There are lots of different categories and needs for data,” says Delmege. “There’s a horses for courses aspect to determining a data strategy. Of all your terabytes of data, how much do you need all the time? Only need to see once a month? Could afford not to have for a day? Different individuals can have different requirements even within the same organisation. One size data recovery strategy does not fit all.”

“The waters are quite muddy,” notes Graeme Rowe, marketing director at Posetiv. “Should it be on a hardware or a software level, on a systems or application level? It depends very much on a client’s needs.” Insight’s Keeling adds: “The issue is impact assessment. What needs to be recovered? What are the financial implications of any loss?”

Customers don’t want to find out these answers the hard way. “We’ve had people standing in our drive with their broken servers before we come in to work, saying we either fix it or they were told not to bother coming back to work,” says Todd Johnson, general manager UK of data forensics specialist Ontrack. Johnson’s services have been used not just by the IT grunts but by the corporate generals: one challenging assignment was remotely fixing the laptop of a main speaker at the 1998 Winter Olympics in Nagano so he could go ahead with his presentation.

But life and human nature being what they are, not all companies have implemented a data security strategy. “We always offer security, but our customers don’t always take it,” says Ionut Ionescu, security director in the Professional Services division of troubled co-location leader Exodus Communications.

After all, “everybody talks and gives lip service to backup and recovery, but it’s actually quite a tedious, time- and resource-consuming operation. Start people talking about it and they all acknowledge it’s a problem”, says InTechnology’s Pearce. “That’s why we chose this area; where there’s muck there’s brass, as they say.”

UPS is not a glam business to be in either. Being given the job of writing the annual UPS feature is the least sought-after assignment in IT journalism.

Nonetheless, UPS is a vital part of any data recovery and replication strategy, says Michael Adams, UK general manager for the biggest player, American Power Conversion: “Availability and reliability are the only ways to prevent the need for disaster recovery in the first place.” But even UPS vendors like APC know that not all data is platinum stuff. “You can get 100% availability, but what’s important is deciding which of that data is important and which you can live without if you have to.”

True. But the problem with UPS, say critics, is that it’s a vital product but only good at protecting a single site. “What’s needed is to protect the business as opposed to just the system,” warns Posetiv’s Woodford.

“To really protect against site failure you need multiple protected locations.”

There’s also – at least for some organisations – going to be the issue of the time lag it takes to get to the safer remote site. “If IT is off-site and remote, fine; if not, then you can’t work until those staff get to the new site,” says Insight’s Keeling. “There’s a growing issue about where you actually should locate these staff.” One thinks of Wall St West: Lloyd’s of London’s main processing centre is now in Chatham, according to Keeling.

Hence data replication. There are two main approaches: in the database, as with Oracle’s “Unbreakable” database clustering and its more established Replication Server; and in storage, especially networked storage. Disk mirroring and running a remote duplicate of the production database are the usual database options. A US company called Quest has an interesting product in this area for Oracle, called Shareplex, which maintains current, available copies of the production database by saving only certain updates. This, it claims, can increase OLTP performance by up to 80%, guarantee higher service levels, and protect against downtime due to disaster, data block corruption, and software and/or human error.
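The change-only approach Quest describes can be illustrated with a minimal sketch: a primary records committed changes and ships only those deltas to a hot standby, rather than copying the whole database. This is an illustration of change-based replication in general, not Shareplex’s actual mechanism; all names here are hypothetical.

```python
# Minimal sketch of change-based replication to a hot standby.
# Illustrative only - not how Shareplex itself is implemented.

class Primary:
    def __init__(self):
        self.data = {}   # production tables: key -> value
        self.log = []    # committed changes awaiting shipment

    def commit(self, key, value):
        """Apply a change locally and record it for replication."""
        self.data[key] = value
        self.log.append((key, value))

    def ship_changes(self, standby):
        """Send only changes made since the last shipment, never the full dataset."""
        for key, value in self.log:
            standby.apply(key, value)
        self.log.clear()  # shipped deltas need not be resent


class Standby:
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value


primary, standby = Primary(), Standby()
primary.commit("acct:1001", 250)
primary.commit("acct:1002", 980)
primary.ship_changes(standby)        # only two records cross the wire
assert standby.data == primary.data  # the standby is now a current copy
```

The point of the design is load: however large the production database grows, only the delta travels to the remote site – which is why a vendor can plausibly claim both a current standby copy and minimal impact on OLTP throughput.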

In storage, companies like EMC – especially with its AutoIS (Automated Information Storage) WideSky middleware – offer storage-to-storage device replication, says Woodford, and Veritas is also strong in this area, according to Illuminata’s Governor. StorageTek has a technology called snapshot that creates a virtual copy of data in seconds. This copy can then be used in place of the original to make a remote backup copy – a job that can take hours – freeing up the original for use while the backup takes place.

As the copy is virtual, it takes up no extra storage space and therefore reduces cost, claims the firm.
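The “virtual copy” idea is, in essence, copy-on-write: the snapshot initially shares every block with the live volume, and space is consumed only when a block changes after the snapshot is taken. A minimal sketch of the idea (illustrative only; the names and structure are not StorageTek’s):

```python
# Minimal copy-on-write snapshot sketch. Illustrative only.

class Volume:
    def __init__(self, blocks):
        self.blocks = dict(blocks)  # block number -> data on the live volume
        self.snapshot = None        # pre-change blocks preserved for the snapshot

    def take_snapshot(self):
        """Effectively instant: nothing is copied yet, so the 'copy' costs no space."""
        self.snapshot = {}

    def write(self, block_no, data):
        """Preserve the old block the first time it changes after the snapshot."""
        if self.snapshot is not None and block_no not in self.snapshot:
            self.snapshot[block_no] = self.blocks.get(block_no)
        self.blocks[block_no] = data

    def read_snapshot(self, block_no):
        """Snapshot reads fall through to the live volume for unchanged blocks."""
        if self.snapshot is not None and block_no in self.snapshot:
            return self.snapshot[block_no]
        return self.blocks.get(block_no)


vol = Volume({0: "ledger-v1", 1: "index-v1"})
vol.take_snapshot()                         # the instantaneous "virtual copy"
vol.write(0, "ledger-v2")                   # the live volume moves on...
assert vol.read_snapshot(0) == "ledger-v1"  # ...but the snapshot sees the original
assert vol.read_snapshot(1) == "index-v1"   # unchanged block: no extra space used
```

A backup process can then read the snapshot at leisure – the hours-long remote copy described above – while production carries on writing to the live volume.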

Storage fans also say their solution obviates the need for UPS. “Disaster recovery is all about backup,” says Hendrik Wacker, sales and marketing director of JNI Corp, a US company which provides specialist host bus adaptor kit for storage area networks over fibre channel. “A SAN is always built in a redundant way. Our clients at the World Trade Centre were able to access replicated sites and get back up quickly without disaster recovery. It’s designed for reassurance.”

But any network, even fibre, is just a transmission medium – an attached platform or device is still needed to hold the information. If the pipe is big and fast enough, though, physical proximity becomes less of an issue.

Data can live outside the main computer room – and thus outside the building, and eventually outside the company.

And after all, why not outsource your data storage? “Even small to medium businesses or small offices should be looking at doing this,” suggests StorageTek’s Delmege. “But some organisations will always want the comfort factor of being able to keep and look at their data.”

This was the main argument of the co-location industry, which has been quite badly battered by last year’s downturn in dotcommery (viz the collapse of CityReach, the most highly capitalised European tech start-up of all time, and the filing for Chapter 11 bankruptcy protection in the US of Exodus, the company which basically started the whole movement). But co-lo isn’t dead by a long chalk. “The argument behind it remains sound,” as Illuminata’s Governor notes. “We’re seeing business as usual in Europe,” claims Exodus’ Ionescu. After all, in a June 2001 report Forrester Research estimated firms could save 25% to 80% of their Web infrastructure costs by hosting.

Exodus says it has the largest operational security team in the world, looking after both its own 300,000 node network and customer premises’ equipment, and that it can give a company the highest levels of security expertise available at a third of the cost of an internal IT department.

The co-lo argument for data recovery and replication is that it’s part of the whole outsourcing service, he goes on. “We’ll either manage the backup device you leave with us, or we’ll rent and manage a slice of someone else’s.”

The most intriguing trend or option is at the leading edge of networking.

This essentially means migrating from the private network to using metropolitan broadband, according to Evolution’s Hall. “There are some players struggling in this market – it’s still very new and raw – but they’re talking about a 10 Gigabit fibre network which will give you storage almost at the utility level,” he points out. “This is appealing since most companies would prefer to outsource their data storage if they could. There’s a large potential for this service, but no-one with a strong enough brand yet.”

One reason: “This is expensive,” stresses Insight’s Keeling. “Using dark fibre in a triangulated topology between main site and remote site – at the moment I think this is really mainly of interest to the investment banking community, where the issue is continuous upkeep and recovery of vital data. Government departments, in contrast, don’t need so much resilience and we can implement much simpler and less expensive ‘mutual fallback’ scenarios for them.”

Slightly behind the bleeding edge is a service like that of business continuity specialist GuardianIT, with its Scalenet offering. According to Jim Mathieson, the company’s service development manager, the firm has strung all its services together across a metropolitan area network linking all its sites around the M25, so customers can pick the services they want à la carte. This solution means there is no single vulnerable point of presence, he adds.

So data recovery, and the techniques for implementing it, look likely to remain “hot”. The promise of networked storage and backup in particular suggests there is much opportunity for consultants in helping to build the Wall St Wests of the future.

Gary Flood is a freelance journalist


One company that backs the database replication solution to the data recovery problem is the RAC. The company holds data in its Oracle production systems on some two million individual customers, says its database administration team leader Andrew Woolley. In 1998 the company chose a tool from Quest to implement a replication policy.

“We wanted something that could give us maximum availability with minimum overhead. A hot standby database is the best way of doing this for us,” he says. “In terms of simplicity, performance and recoverability this is great.” The alternative was a complete disk mirroring approach with separate sites, but this was rejected as too complex – the RAC estimated it could take as long as nine months to implement and risked disruption to the main production databases during that time. Instead, the Shareplex system from Quest replicates only those transactions that need to be duplicated to the remote site.

Disasters that eat data don’t have to be explosions; Mother Nature does quite well enough on her own. One such unfortunate event befell Hanover Displays, a Sussex-based company that makes electronic destination signs for public transport, which found its ground-floor computers and servers completely submerged under 12 feet of the Ouse.

The company’s back-up tapes were stored in a non-waterproofed firesafe.

“We were particularly keen to recover software that we had developed ourselves for the production of illuminated signs,” says its IT manager, Ben Richardson.

Richardson sent tapes that had already been soaked in filthy water to specialist recovery firm Vogon – which was able to help. “Losing e-mail archives was inconvenient, but if we had lost the software it would have had serious implications for the company,” he adds. “We would have had to try to recreate it from people’s heads.”

Another aspect of data recovery is the litigious one. During the US presidential election last year, for example, Ontrack was commissioned to copy and analyse data from Florida Secretary of State Katharine Harris’s office computers, after she agreed to release them to a consortium of news agencies for public scrutiny. It was alleged that she had destroyed data, including some public documents, so her hard drives needed to be forensically analysed for the legal case brought against her.

One imagines that Enron’s computers are being similarly scrutinised even as we speak.
