
Disk arrays

Nov 27, 2010

A disk array is a disk storage system that contains multiple disk drives. It differs from a disk enclosure in that an array has cache memory and advanced functionality such as RAID and virtualization.

The term disk array refers to a linked group of physically independent hard disk drives, generally used to replace larger, single disk drive systems. The most common disk arrays are in a daisy chain configuration or implement RAID (Redundant Array of Independent Disks) technology. A disk array may contain several disk drive trays, and is structured to improve speed and increase protection against loss of data.

Disk arrays organize their data storage into Logical Units (LUs), which appear as linear block spaces to their clients. A small disk array, with a few disks, might support up to 8 LUs; a large one, with hundreds of disk drives, can support thousands.
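To make "linear block space" a little more concrete, here is a minimal sketch of how an LU looks from the client side: a run of fixed-size blocks numbered from 0. The class name `LogicalUnit`, its methods and the in-memory backing store are all hypothetical and exist only for illustration; a real array maps these blocks onto physical disks behind the controller.

```python
class LogicalUnit:
    """Hypothetical LU: a linear space of fixed-size blocks numbered 0..n-1."""

    def __init__(self, num_blocks: int, block_size: int = 512):
        self.num_blocks = num_blocks
        self.block_size = block_size
        # Assumption: an in-memory buffer stands in for the array's disks.
        self._store = bytearray(num_blocks * block_size)

    def _check(self, lba: int) -> None:
        # Clients may only address blocks inside the linear space.
        if not 0 <= lba < self.num_blocks:
            raise IndexError(f"block {lba} is outside the logical unit")

    def write_block(self, lba: int, data: bytes) -> None:
        self._check(lba)
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        start = lba * self.block_size
        self._store[start:start + self.block_size] = data

    def read_block(self, lba: int) -> bytes:
        self._check(lba)
        start = lba * self.block_size
        return bytes(self._store[start:start + self.block_size])
```

The client only ever sees block numbers; which disk, tray or RAID set actually holds a block is hidden behind the LU.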

Disk arrays are an integral part of high-performance storage systems, and their importance and scale are growing as continuous access to information becomes critical to the day-to-day operation of modern business.

Components of a typical disk array include:

  • Disk array controllers
  • Cache memories
  • Disk enclosures
  • Power supplies
Typically, a disk array provides increased availability, resiliency and maintainability by using additional, redundant components (controllers, power supplies, fans, etc.), often to the point where all single points of failure (SPOFs) are eliminated from the design. Additionally, those components are often hot-swappable.

Typically, disk arrays are divided into categories:
  • Network attached storage (NAS) arrays
  • Storage area network (SAN) arrays:
    • Modular SAN arrays
    • Monolithic SAN arrays
    • Utility Storage Arrays
  • Storage virtualization
A hard disk, while vital to any computer system, is often also its weakest link. It is one of the few critical components of a computer system that is not purely electronic, but relies on intricate moving mechanical parts that can fail. When this happens, data may be irretrievable, and unless a backup system has been employed, the user is out of luck. This is where disk arrays make a difference.

Disk arrays incorporate controls and a structure that pre-empts disaster. The most common disk array technology is RAID (Redundant Array of Independent Disks). RAID utilizes disk arrays in a number of optional configurations that benefit the user.

One advantage of RAID disk arrays is redundancy of data writes, so that if a file is damaged or stored in a bad cluster or disk, it can be instantly and transparently replaced from another disk in the array. RAID also allows hot-swapping of bad disks and increased flexibility in scalable storage. Performance is also enhanced through a process called "striping," which spreads data across several disks so that they can be accessed in parallel.
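Striping, the performance technique mentioned above, places consecutive chunks of the logical block space on different disks in round-robin order. The sketch below shows the address arithmetic for a simple RAID 0 style layout with no redundancy; the function name `locate_block` and its parameters are assumptions made for this example, not any controller's actual interface.

```python
def locate_block(lba: int, num_disks: int, stripe_blocks: int) -> tuple[int, int]:
    """Map a logical block address to (disk index, block number on that disk).

    Assumes a plain round-robin striped layout where each stripe unit
    holds `stripe_blocks` consecutive logical blocks.
    """
    if lba < 0 or num_disks < 1 or stripe_blocks < 1:
        raise ValueError("arguments must be positive (lba may be zero)")
    stripe_unit = lba // stripe_blocks      # which stripe unit overall
    offset = lba % stripe_blocks            # position inside that unit
    disk = stripe_unit % num_disks          # units rotate across the disks
    row = stripe_unit // num_disks          # completed rows before this one
    return disk, row * stripe_blocks + offset
```

With four disks and two blocks per stripe unit, logical blocks 0 and 1 land on disk 0, blocks 2 and 3 on disk 1, and so on, so a large sequential transfer keeps all four disks busy at once.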

There are many varieties of RAID, and though designed primarily for servers, disk arrays have become increasingly popular among individuals because of their many benefits. RAID is particularly suited for gamers and multimedia applications.

RAID controllers, often built into motherboards, must set parameters for interacting with disk arrays. The controller sets the performance parameter to match the slowest disk. If it were to use the fastest disk as the benchmark, writes to disks that cannot sustain that speed could fail and data could be lost. For this reason, all disks in the array should be the same brand, speed, size and model for optimal performance. A mix of capacities, speeds and types of disks will negatively impact performance. A common choice for disk arrays is SATA (Serial ATA) drives sold as RAID drives. These drives are optimized for RAID use and, being SATA, support hot-swapping.
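The cost of mixing disks can be illustrated with a rough, idealized model: in a striped set every member contributes only as much capacity as the smallest disk and only as much throughput as the slowest one. The function `striped_array_estimate` and the figures fed to it are made up for illustration, and real controllers will differ in the details.

```python
def striped_array_estimate(disks: list[tuple[int, int]]) -> tuple[int, int]:
    """Estimate usable capacity and throughput of a striped array.

    `disks` is a list of (capacity_gb, throughput_mb_s) pairs.
    Idealized assumption: every disk is limited to the smallest
    capacity and the slowest throughput found in the set.
    """
    if not disks:
        raise ValueError("an array needs at least one disk")
    count = len(disks)
    smallest = min(capacity for capacity, _ in disks)
    slowest = min(rate for _, rate in disks)
    return count * smallest, count * slowest
```

Under this model, two 1000 GB disks rated at 150 MB/s plus one 500 GB disk rated at 100 MB/s yield about 1500 GB at 300 MB/s, whereas three of the larger disks would give 3000 GB at 450 MB/s.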

A disk array controller is a device that manages the physical disk drives and presents them to the computer as logical units. It almost always implements hardware RAID, so it is sometimes referred to as a RAID controller. It also often provides additional disk cache. The name "disk array controller" is often improperly shortened to "disk controller." The two should not be confused, as they provide very different functionality.

A disk array controller provides front-end and back-end interfaces.
  • The back-end interface communicates with the controlled disks, so its protocol is usually ATA (a.k.a. PATA; incorrectly called IDE), SATA, SCSI, FC or SAS.
  • The front-end interface communicates with a computer's host adapter (HBA, Host Bus Adapter) and uses:
    • one of ATA, SATA, SCSI, FC; these are popular protocols used by disks, so by using one of them a controller may transparently emulate a disk for a computer
    • a somewhat less popular protocol dedicated to a specific solution: FICON/ESCON, iSCSI, HyperSCSI, ATA over Ethernet or InfiniBand
A single controller may use different protocols for back-end and for front-end communication. Many enterprise controllers use FC on front-end and SATA on back-end.
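The front-end/back-end split can be written down as a small data structure. This is only a sketch: the `ControllerConfig` class and its method are hypothetical, and the protocol sets simply repeat the lists given above rather than any vendor's support matrix.

```python
from dataclasses import dataclass

# Protocol names taken from the lists above.
BACK_END_PROTOCOLS = {"ATA", "SATA", "SCSI", "FC", "SAS"}
DISK_LIKE_FRONT_END = {"ATA", "SATA", "SCSI", "FC"}
DEDICATED_FRONT_END = {"FICON", "ESCON", "iSCSI", "HyperSCSI",
                       "ATA over Ethernet", "InfiniBand"}


@dataclass(frozen=True)
class ControllerConfig:
    """Hypothetical description of one controller's two interfaces."""
    front_end: str
    back_end: str

    def __post_init__(self):
        # Reject protocol names that the text does not list for that side.
        if self.front_end not in DISK_LIKE_FRONT_END | DEDICATED_FRONT_END:
            raise ValueError(f"unknown front-end protocol: {self.front_end}")
        if self.back_end not in BACK_END_PROTOCOLS:
            raise ValueError(f"unknown back-end protocol: {self.back_end}")

    def emulates_disk(self) -> bool:
        # A disk protocol on the front end lets the controller look
        # like an ordinary disk to the host.
        return self.front_end in DISK_LIKE_FRONT_END
```

The enterprise combination mentioned above would be `ControllerConfig("FC", "SATA")`, which reports that it can emulate a disk, while `ControllerConfig("iSCSI", "SAS")` does not.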

In a modern enterprise architecture, disk array controllers are parts of physically independent enclosures, such as disk arrays placed in a storage area network (SAN) or network-attached storage (NAS) servers.

Those external disk arrays are usually purchased as an integrated subsystem of RAID controllers, disk drives, power supplies, and management software. It is up to controllers to provide advanced functionality (various vendors name these differently):
  • automatic failover to another controller (transparent to computers transmitting data)
  • long-running operations performed without downtime
    • forming a new RAID set
    • reconstructing a degraded RAID set (after a disk failure)
    • adding a disk to an online RAID set
    • removing a disk from a RAID set (rare functionality)
    • partitioning a RAID set into separate volumes/LUNs
  • snapshots
  • Business Continuance Volumes (BCV)
  • replication with a remote controller.
In the array types listed below, A/P stands for active/passive.

A/P disk array

Multipathed disk array type that allows one path to a disk to be designated as primary and used to access the disk at any time. Using a path other than the designated active path results in severe performance degradation in some disk arrays. When such arrays are configured in autotrespass mode, path failover routes I/O to the passive (standby) path.

A/P-C disk array

Multipathed disk array type that permits concurrent I/O. This allows one or more paths to a disk to be designated as primary, and used to access the disk at any time. Using a path other than the designated active paths results in severe performance degradation in some disk arrays. When such arrays are configured in autotrespass mode, path failover routes I/O to the passive (standby) path.

A/PF disk array

Active/passive disk array in explicit failover mode. Such arrays require a special failover command to be executed. ASL (Array Support Library) support for such arrays cannot be added dynamically.

A/PF-C disk array

One or more active/passive disk arrays in explicit failover mode. Such arrays require a special failover command to be executed. ASL support for such arrays cannot be added dynamically.

A/PG disk array

Active/passive disk array in which failover occurs for a group of LUNs.

A/PG-C disk array

One or more active/passive disk arrays in which failover occurs for a group of LUNs.
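The failover behaviours described in these entries can be sketched as a small path-selection model. Everything here is hypothetical (the `Path` and `ActivePassiveLun` classes, the `fail_over` method standing in for an array-specific failover command, and `fail_over_group`); it illustrates the ideas only and is not how any particular multipathing driver is written.

```python
from dataclasses import dataclass


@dataclass
class Path:
    name: str
    healthy: bool = True


@dataclass
class ActivePassiveLun:
    primary: Path
    standby: Path
    explicit_failover: bool = False  # True models an A/PF style array
    failed_over: bool = False

    def fail_over(self) -> None:
        # Stand-in for the special failover command an A/PF array needs.
        self.failed_over = True

    def select_path(self) -> Path:
        # Normal case: use the designated primary path.
        if self.primary.healthy and not self.failed_over:
            return self.primary
        if not self.standby.healthy:
            raise RuntimeError("no usable path to the LUN")
        # Explicit-failover arrays do not switch on their own.
        if self.explicit_failover and not self.failed_over:
            raise RuntimeError("explicit failover command required")
        # Autotrespass style: I/O is routed to the standby path.
        self.failed_over = True
        return self.standby


def fail_over_group(luns: list[ActivePassiveLun]) -> None:
    # A/PG style: the whole group of LUNs moves together.
    for lun in luns:
        lun.fail_over()
```

In this model an autotrespass array switches to the standby path as soon as the primary is marked unhealthy, an explicit-failover array refuses I/O until `fail_over()` has been called, and a group failover moves every LUN in the list at once.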
