One of the surprising facts about any new IT project these days is the huge bite that storage is likely to take out of the overall budget. Analyst houses like Gartner and Forrester put the figure at somewhere in excess of 60% of total project costs.
As Gartner analyst Carolyn DiCenzo points out, storage is not just about adding more hard disk capacity. The storage market has three components: devices to store data, storage networking devices and storage management software. Each area has any number of subdivisions, which have spawned a vast array of products.
There is no doubt about the growing torrent of data requiring storage. A recent study – How Much Information?, by the University of California School of Information Management and Systems – demonstrated that the world’s total yearly production of print, film, optical and magnetic content would already require about 1.5bn gigabytes of storage if it were all in digital readable form.
DiCenzo says: “As more information is captured in digital format at creation, the demand for disk space and tape automation (for back up) will grow, regardless of project priorities. The need to compete effectively while containing costs, and government mandates to manage electronic information better, will also increase the need for storage. Companies are going to have to find better ways to organise and manage these resources.”
This kind of pressure creates excellent consultancy opportunities, of course. To take advantage of these opportunities, however, consultants are going to have to focus more closely on storage architecture issues.
David Liff, regional vice president at Computer Associates, reckons that larger companies now have no option but to bring storage under control. “Many large organisations today have no clear understanding of what is going on in their data storage. When you analyse their systems, you find that they have multiple copies of the same data all over the shop, with no clarity as to which generation of the data various users are looking at.”
There is nothing surprising about this, he adds, when one thinks that every two years we are currently storing more data than has been created in the entire history of the human race.
Advances in hard disk manufacturing mean that the cost of storage is falling all the time. However, there is a compensatory force at work – the need to store more – which has so far kept the price more or less stable.
Large disk arrays these days still cost hundreds of thousands of pounds, which means that it is not sensible for organisations that can’t tell what their current disk utilisation is to simply keep on adding more new storage capacity. It quickly becomes worth investing in new storage architectures and management processes to make better use of what the corporate has already.
“Around 80% of an enterprise’s data is not critical. If something went wrong, the organisation could live without access to that data for a day or two. Conversely, it would need probably 20% of the data immediately or it would start losing money. The problem is, very few organisations could tell you with any certainty which bits of data fall into which category,” he says. It takes sophisticated storage resource management software to answer these kinds of questions.
Once an organisation understands which data to prioritise, it can start to build a sensible data management system that will move non-critical data to tape, which is at least two orders of magnitude cheaper than disk storage. Companies who go down this route, he suggests, will generally be able to defer the purchase of additional disk systems for some time.
“Delaying buying storage is a winning formula, since each new generation of products offers more for less, and generally does things better than the preceding generation.”
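The arithmetic behind this argument can be sketched in a few lines of Python. The prices and capacity below are hypothetical round numbers chosen only to reflect the "two orders of magnitude" ratio and the 80/20 split quoted above; they are not vendor figures.

```python
# Hypothetical prices for illustration only; none come from the article.
DISK_COST_PER_GB = 100   # assumed cost of disk array capacity, pounds per GB
TAPE_COST_PER_GB = 1     # assumed: two orders of magnitude cheaper than disk


def storage_cost(total_gb, critical_percent, tiered):
    """Return the cost of holding total_gb, with or without a tape tier."""
    if not tiered:
        # Everything stays on disk, critical or not.
        return total_gb * DISK_COST_PER_GB
    # Only the critical share stays on disk; the rest moves to tape.
    critical_gb = total_gb * critical_percent // 100
    non_critical_gb = total_gb - critical_gb
    return critical_gb * DISK_COST_PER_GB + non_critical_gb * TAPE_COST_PER_GB


all_disk = storage_cost(10_000, 20, tiered=False)
with_tape = storage_cost(10_000, 20, tiered=True)
print(f"all on disk:              {all_disk:,}")
print(f"20% on disk, 80% on tape: {with_tape:,}")
```

With these assumed figures the all-disk estate costs 1,000,000 and the tiered one 208,000, which is the kind of gap that lets a company defer its next disk purchase.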
Chris Atkins, storage manager at Sun Microsystems, points out that storage growth is generally not predicated on future projects which companies can defer if times get hard. It comes from the company’s existing systems.
In boom times, company boards were likely to simply open the purse strings and give the IT director more money every time the need for more storage surfaced. In today’s harsher climate, that cycle of spending is now under much tighter scrutiny.
“In the past storage was managed by specialists. It was a ‘geekie’, complex subject, and company boards avoided it. Now senior directors are looking at the escalating history of storage costs and saying: ‘Hold it! It’s time we imposed some normal director level rigour over these areas’,” he says. “Directors are now asking questions such as: How much storage do we actually have, and what are our utilisation levels? The problem is that in many organisations, there is just no easy way of providing answers to these questions.”
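The two questions directors are asking reduce to a simple roll-up once an inventory exists. The following is a minimal sketch, assuming a hypothetical inventory of volumes with known capacity and usage; the `Volume` type and the server names are illustrative and do not describe any storage resource management product.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    """One disk volume in a hypothetical inventory."""
    server: str
    capacity_gb: int
    used_gb: int


def utilisation_report(volumes):
    """Summarise total capacity, total usage and utilisation per server."""
    per_server = {}
    for v in volumes:
        cap, used = per_server.get(v.server, (0, 0))
        per_server[v.server] = (cap + v.capacity_gb, used + v.used_gb)

    total_capacity = sum(cap for cap, _ in per_server.values())
    total_used = sum(used for _, used in per_server.values())
    return {
        "total_capacity_gb": total_capacity,
        "total_used_gb": total_used,
        "overall_utilisation": total_used / total_capacity if total_capacity else 0.0,
        "per_server": {
            name: (used / cap if cap else 0.0)
            for name, (cap, used) in per_server.items()
        },
    }


inventory = [
    Volume("mail01", 100, 50),
    Volume("erp01", 200, 30),
    Volume("erp01", 100, 20),
]
print(utilisation_report(inventory))
```

The hard part in practice is not this calculation but collecting a trustworthy inventory from storage scattered across many directly attached servers.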
There are three major storage architecture “models”: SAN, NAS and DAS, standing for Storage Area Networking, Network Attached Storage and Direct Attached Storage, respectively. Each has its strengths and its weaknesses.
The traditional practice of buying servers and attaching storage directly to the server (the DAS model) has brought companies to their present, largely unsustainable position, where they have little or no overall view of disk utilisation. However, the one thing this model has going for it is flexibility. You put the server where it is needed and it comes with all the storage that users require, at least for a while.
Keep doing this, however, and you end up with an escalating control problem. SAN and NAS are both attempts to manage storage more efficiently. NAS came into being when companies such as Auspex created storage devices that were very much better than ordinary servers at handling file and print tasks. A NAS device sits “on” the network and shares data between users and servers. It is a way of consolidating a company’s storage requirements for the kind of data that you want large numbers of users to be able to share and pass between themselves.
A SAN is very different. SANs sit “back” of the network and are centralised islands of storage. Chunks of a SAN’s storage capacity are allocated to specific servers. This is great where you don’t want to share data. Transaction engines, for example, generating enormous amounts of real time information, need to be protected from users running analysis programs directly on the raw data. As Sun’s Aitken says, you may want to feed your data warehouse that transaction data in near real time, but you want the two data stores kept distinct to avoid corrupting core business data.
This “back of the network” and “on the network” distinction between SAN and NAS devices led some to think of these two storage models as competing alternatives. In point of fact it is now generally accepted that they are complementary solutions. As Glenn Fitzgerald, ICL Storage Solutions Centre manager, explains, it is now commonplace to design storage solutions that have both NAS components and SAN components.
He also argues that the perceived weakness in SANs, namely the difficulty of building a shared data component across servers, is itself a temporary thing. “We definitely see SANs achieving the capability of being able to dynamically share data across multiple servers. In fact, we see the present distinction between NAS and SAN storage solutions vanishing over the next few years. The delivery of individual storage capabilities and functions will be mapped by software in the network, not by the devices,” he says.
Today, the maintenance and resource management of SANs has not yet matured to the point where the software can remap resources dynamically. We are still trapped in a world where storage has to be allocated statically to individual locations.
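Static allocation of this kind can be pictured with a short sketch. The `SanPool` class and the server names are hypothetical, and the model is deliberately simplified: it shows only that capacity carved out for one server is unavailable to any other, whether or not it is actually in use.

```python
class SanPool:
    """A simplified model of a SAN whose capacity is statically carved up."""

    def __init__(self, capacity_gb):
        self.capacity_gb = capacity_gb
        self.allocations = {}  # server name -> GB reserved for that server

    def free_gb(self):
        """Capacity not yet reserved for any server."""
        return self.capacity_gb - sum(self.allocations.values())

    def allocate(self, server, size_gb):
        """Reserve size_gb for one server; other servers cannot touch it."""
        if size_gb > self.free_gb():
            raise ValueError(
                f"only {self.free_gb()} GB unallocated, {size_gb} GB requested"
            )
        self.allocations[server] = self.allocations.get(server, 0) + size_gb


pool = SanPool(1000)
pool.allocate("transactions", 600)
pool.allocate("warehouse", 300)
print(pool.free_gb())  # 100 GB left, however empty the two reservations are

try:
    pool.allocate("mail", 200)
except ValueError as err:
    print("allocation refused:", err)
```

The dynamic remapping Fitzgerald anticipates would amount to letting software move reserved but idle capacity between servers without manual reallocation.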
FIBRE CHANNEL AND IP – THE TRANSPORT ISSUES
The big move forward in SANs right now is not in dynamic data sharing, but in the speed at which data can be moved between the SAN and its attendant servers. SANs use Fibre Channel (FC) and special purpose FC switches from specialist vendors such as Brocade and Gadzooks to move data. The server has a special card installed inside it called a host bus adaptor (HBA), which provides FC connectivity to the SAN storage device. As such, SANs are a distinct and different technology from the rest of the network, which uses IP over either Gigabit Ethernet or 100Mb Ethernet.
However, as Paul Trowbridge, marketing director at Brocade Communications Systems, notes, the two co-exist quite happily since what happens “back of the network” is shielded from the network. Servers have the usual Gb port to the network and an FC connection to the SAN. An easy way to conceptualise the IP/FC distinction is to see FC as a data centre connectivity medium and IP as the campus and Wide Area Network connectivity.
Within FC interconnect technology, SAN vendors are increasingly transitioning from the present one gigabit interconnect standard to the new two gigabit standard. Hitachi, one of the larger specialist SAN storage vendors, already has a 2Gb SAN product available, and, according to Trowbridge, Brocade will have 2Gb product very shortly. The industry has also made sure that new products will co-exist with existing product, in that the switches will be able to “auto-sense” whether the connecting storage hardware is 2Gb or 1Gb enabled.
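Conceptually, "auto-sensing" means the link settles on the fastest speed that both ends support. The sketch below illustrates only that idea; it is not the actual Fibre Channel negotiation procedure, and the function name is hypothetical.

```python
def negotiate_link_speed(switch_speeds_gb, device_speeds_gb):
    """Return the fastest speed (in Gb) supported by both ends of a link."""
    common = set(switch_speeds_gb) & set(device_speeds_gb)
    if not common:
        raise ValueError("switch and device share no common speed")
    return max(common)


# A new 2Gb-capable switch port connected to older and newer storage.
print(negotiate_link_speed([1, 2], [1]))     # older 1Gb array
print(negotiate_link_speed([1, 2], [1, 2]))  # newer 2Gb array
```

The first call settles on 1Gb and the second on 2Gb, which is why existing 1Gb hardware can stay in service alongside the new switches.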
The next step forward in Fibre Channel speeds, according to Trowbridge, will be a big leap to 10Gb. “The storage industry has progressed so far in 2x steps, looking for a doubling of capacities and speeds. However, we are now taking a leaf out of the network vendors’ and telecom vendors’ book. They are leapfrogging from 1Gb Ethernet to 10Gb Ethernet and the storage vendors are now doing the same order of magnitude shift,” he says.
CENTRALISATION – SERVERS OR STORAGE?
Sun storage manager Aitken argues that where many user organisations go wrong when thinking about storage is that they fail to bring the kind of rigour to bear that companies use when thinking about server policies.
“All big organisations have at least a dual vendor strategy where servers are concerned, in order to keep some competitive pressure going. Yet when it comes to storage, they generally always go for a single vendor. This is just not sensible,” he says.
A similar muddle occurs when companies think about consolidating and centralising storage. “A very good tip to give to consultants and data centre managers is to take any sentence they are formulating that contains the word ‘storage’ and substitute ‘server’ instead. Companies understand server strategies; they don’t yet really grasp storage strategies,” he says. With Aitken’s exercise, if the thought still makes sense, they are probably on the right track. This approach, he suggests, will help IT managers to steer a course through all the tricky discussions about the relative merits of DAS, NAS and SAN.
“No IT manager gives a hoot what bus sits at the back of a server. They look at the business application and the value to the business. They should do the same with storage. The aim always should be to come up with a storage solution that best meets the needs of a particular stack of business applications at the lowest possible cost.”
This exercise runs into problems, however, when companies try to think about centralising in order to increase manageability and control. Here you have to focus on either servers or storage as your route forward.
As Aitken notes, either starting point gives control of the other. If you centralise your servers, you inevitably end up with a centralised storage policy, and if you centralise storage, even virtually, you regain control over your servers. “Customers are either in love with one policy or the other. Some love modularity and total flexibility on the server front, so they like centralised storage. Others want to consolidate all their servers back to the data centre. Neither approach is right or wrong and both give you the same results in the end – a managed IT infrastructure.”
IP STORAGE AND iSCSI
To move “in one bound” from IP as a network data transfer protocol to IP storage, the storage industry had to find a viable way of reading data off disk in the form that storage systems expect. Storage systems pass block data, not files, and the key lies with a new protocol based on SCSI (Small Computer System Interface). Normally SCSI is the interface used with DAS (direct attached storage), and links a RAID disk array, for example, to a server.
To pass data over an IP network between a server and a storage device, or between storage devices, something different has to happen and that something different is provided by iSCSI, or Internet SCSI, a new, still emerging standard.
As the interface card specialist vendor, Adaptec, points out in a very good white paper on iSCSI (see www.adaptec.com), the whole point of this new standard is that it enables the transport of block-level storage traffic over IP networks. iSCSI builds on the SCSI command set on the one hand, and on traditional IP, on the other.
Normally servers (and PCs) connect to an IP network using NICs (network interface cards) designed to transfer packetised file level data. To transform a NIC into a block data transport engine, it has to be made to operate like a Host Bus Adaptor (HBA). Special chip sets on the iSCSI NIC take the block data and break it into frames to be placed inside TCP/IP packets.
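The encapsulation step can be illustrated with a short sketch. The 16-byte header used here (block address, byte offset, payload length) is a hypothetical layout invented for the example and is not the iSCSI wire format; the point is only to show block data being cut into payloads small enough to travel inside network packets and then put back together at the far end.

```python
import struct

# Hypothetical 16-byte header: logical block address, byte offset, payload length.
# This is NOT the iSCSI wire format, only an illustration of the idea.
HEADER = struct.Struct("!QII")


def encapsulate(lba, block, max_payload=1024):
    """Split one block of data into framed payloads ready to hand to TCP/IP."""
    frames = []
    for offset in range(0, len(block), max_payload):
        payload = block[offset:offset + max_payload]
        frames.append(HEADER.pack(lba, offset, len(payload)) + payload)
    return frames


def reassemble(frames):
    """Rebuild the original block from frames, whatever order they arrive in."""
    parts = {}
    for frame in frames:
        _lba, offset, length = HEADER.unpack_from(frame)
        parts[offset] = frame[HEADER.size:HEADER.size + length]
    return b"".join(parts[offset] for offset in sorted(parts))


block = bytes(range(256)) * 16          # a 4,096-byte block of sample data
frames = encapsulate(lba=7, block=block)
print(len(frames), "frames of up to", HEADER.size + 1024, "bytes each")
assert reassemble(reversed(frames)) == block
```

Doing this framing for every block read or written is the work that an iSCSI adaptor performs in hardware.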
Once this is done, the storage data travels just like file data over the IP network. The advantage of this manoeuvre is that, at a stroke, storage vendors and user organisations can leverage the power of IP networks to send data over very long distances. IP storage thus has tremendous potential for business continuity and disaster recovery. It also has the benefit of offering corporate data centres a storage medium that maps perfectly to their existing network infrastructure. A third benefit is that as vendors such as Adaptec bring iSCSI adaptors to market, organisations can create IP SANs using their existing IP infrastructure without having to retrain on Fibre Channel technology.
One disadvantage of IP storage at present is that the business of breaking down block level storage data into IP packets is carried out, at least in part, by the host server’s CPU and is therefore tremendously CPU intensive.
The storage industry has responded by developing TCP/IP off-load engines (TOEs) and siting them on the adaptor card, thereby freeing the server CPU.
Anthony Harrington is a freelance journalist