About This Case

Closed

30 May 2008, 11:59PM PT

Bonus Detail

  • Qualifying Insights Split a $12,000 Bonus

Posted

2 May 2008, 12:00AM PT

Industries

  • Enterprise Software & Services
  • Hardware
  • IT / IT Security
  • Internet / Online Services / Consumer Software
  • Start-Ups / Small Businesses / Franchises

More Insights Into The Rapidly Evolving Storage Area Network Market

 


Last month, we kicked off The Future of Storage conversation with Dell, and now we're continuing that conversation.

Top insights from the community are being posted to http://thefutureofstorage.com/, with some also appearing on Ars Technica's website -- as well as in ad promotions for the site.

This conversation focuses on the SAN market and its current and future directions, including topics like iSCSI, FCoE, deduplication, virtualization, thin provisioning, encryption, etc. We're looking for good insight into any of those topics. To get an idea of the type of insights we're looking for, just look over http://thefutureofstorage.com/ to see some earlier posts. While that site is sponsored by Dell, the topic for discussion is the SAN market in general -- and is not vendor specific.

In continuing this conversation, feel free to write about any topic in this space that you think would fit on the site, such as where you think the SAN market is heading. Alternatively, feel free to write thoughtful insights that address some of the points raised in earlier posts on The Future of Storage site.

Also, as we look to continue this ongoing conversation, feel free to email us what types of questions you think we should be asking in future months to keep the conversation going.

The insights selected to be on this site will each get a "share" of the bonus pool. You can write multiple insights to get multiple shares.

PLEASE NOTE: We are looking for unique insights that delve into a single subject concerning this topic. Don't try to cover too many things in a single insight submission. Again, look over the existing Future of Storage site to get an idea of what's appropriate.

Like last time, please get your insights in early. We will be closing the case once we feel there are enough insights for the month (somewhere between 10 and 20 insights).

19 Insights

 



I have been researching iSCSI implementations on the server to try to understand the differences between them and to come to grips with how they work. This article compares the various methods of connecting to an iSCSI network.

It seems that many people do not know or understand that the generation and transmission of IP packets is a CPU-intensive process. In some operating systems it can also add significant latency, since there are many transfers across the memory bus and the PCI bus before the data is actually transmitted.

When handling IP traffic, the server OS must manage the following:

  • session state for every TCP connection
  • buffer for each session
  • memory management and control for every IP session
  • data transfer on and off the PCI bus

 


And when using TCP, the driver must handle the following items:

  • TCP Window Size
  • TCP Window Scale
  • TCP Timestamps
  • TCP Delayed ACKs
  • TCP Selective ACK
  • 802.3x Flow Control / Ethernet Pause
  • Maximum Segment Size (MSS) – Jumbo Frames

 

iSCSI Implementations

So you can implement an iSCSI connection on a server using three basic modes:

  1. Software initiators
  2. TCP Offload Engines (TOE)
  3. Host Bus Adapters

Software Initiators

This is the most common implementation. You can download the Microsoft iSCSI Initiator and get started immediately. It seems that some other vendors also offer initiators, e.g. Dell.

There are several performance issues with this approach:

  • you are using a general purpose CPU to perform the data transformation
  • you are performing multiple data copy functions across the internal bus of your computer
  • it is not optimised for performance

For a desktop or low intensity server this might work OK. But for VM platforms and high intensity servers, spending a lot of CPU cycles generating iSCSI packets will impact performance.
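
To put a rough number on that CPU cost, here is a minimal back-of-the-envelope sketch in Python. It assumes the often-quoted rule of thumb of roughly 1 Hz of CPU per 1 bit/s of TCP throughput; the real ratio varies widely with OS, NIC and driver, so treat the output as an order-of-magnitude illustration only.

```python
# Back-of-the-envelope estimate of the CPU cost of software iSCSI.
# Assumption: roughly 1 Hz of CPU per 1 bit/s of TCP traffic, a rough and
# widely quoted rule of thumb -- real numbers depend heavily on the OS,
# the NIC and the driver.

CYCLES_PER_BIT = 1.0      # assumed rule of thumb, not a measured value
CPU_HZ = 3.0e9            # one 3 GHz core

def cpu_fraction(throughput_gbps: float) -> float:
    """Fraction of one core consumed just moving TCP/iSCSI traffic."""
    bits_per_second = throughput_gbps * 1e9
    return (bits_per_second * CYCLES_PER_BIT) / CPU_HZ

for gbps in (0.1, 0.5, 1.0):
    print(f"{gbps:.1f} Gb/s -> ~{cpu_fraction(gbps):.0%} of a 3 GHz core")
```

Even at these rough numbers, a single gigabit of iSCSI traffic can eat around a third of a core, which is exactly the overhead the offload options below try to remove.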

TCP Offload Engines (TOE)

For a while I thought that TOE cards and HBAs were the same thing, but this is not true. TOE cards are able to improve the TCP performance of a server. Since iSCSI uses TCP for data transfer, this improves storage performance and reduces latency.

Many servers now ship with TOE cards as standard, but most drivers have the TOE feature disabled. Device driver quality also seems to be important, so make sure you get the latest versions and use quality vendors.

For servers that exchange a lot of data, enabling TOE will improve performance and reduce the server CPU utilisation. This will improve the performance of your iSCSI session.

Host Bus Adapters

"Host bus adapter" is a generic term for an adapter that connects the I/O bus of your server to an external network such as Ethernet or FC. Thus an Ethernet adapter is also an HBA. Note that both FC and iSCSI people use the term HBA.

Header and Data Digests were added by the iSCSI Working Group as a more robust mechanism for ensuring data integrity compared to TCP checksums. However, iSCSI Header and Data Digest calculations are very CPU intensive.

Only a full iSCSI offload HBA has the logic built into the ASIC to accelerate these calculations. General-purpose NICs and TOEs do not have this innate capability, so the calculations (if enabled) must be performed by the host CPU. If these calculations are performed by the host CPU, both throughput and IOPS will degrade further, potentially slowing application performance to an unacceptable level.
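
To get a feel for why digest calculation is expensive, here is a minimal Python sketch that times a software checksum over a gigabyte of data. Note that iSCSI digests actually use CRC32C; zlib's plain CRC32 is used here only as a readily available stand-in to illustrate the per-byte CPU cost on the host.

```python
import time
import zlib

# Time a software checksum over 1 GB of data to illustrate the per-byte
# CPU cost of computing digests on the host. Note: real iSCSI header/data
# digests use CRC32C; zlib.crc32 (plain CRC32) is only a convenient stand-in.

CHUNK = b"\x00" * (1024 * 1024)   # 1 MB buffer
TOTAL_MB = 1024                   # checksum 1 GB in total

start = time.perf_counter()
crc = 0
for _ in range(TOTAL_MB):
    crc = zlib.crc32(CHUNK, crc)
elapsed = time.perf_counter() - start

print(f"Checksummed {TOTAL_MB} MB in {elapsed:.2f} s "
      f"({TOTAL_MB / elapsed:.0f} MB/s on one core)")
```

Whatever rate your CPU reaches here, it is capacity stolen from applications; an offload HBA does the same work in silicon.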

Full iSCSI offload HBAs offer SAN administrators what they need in a storage adapter, including:

  • consistently low CPU overhead
  • iSCSI digest reliability at line speed without impacting the performance of host applications
  • tools that show the capacity and performance of your iSCSI connection.


It is the management tools that are particularly interesting. Extended reporting on the iSCSI data flows provides a real benefit when locating performance or network problems: it provides stats on packet drops and connection failures, letting you know that a problem exists and giving you the tools to resolve it.

Conclusion

After researching HBAs, it seems clear that you are more likely to be successful if you purchase iSCSI HBAs for your servers. It seems that many of the features that make Fibrechannel popular are actually derived from the FC HBA and not from any inherent superiority in the protocol or in the switches.

However, the most vital feature is the management tools. Server administrators need information to communicate with the network team and to measure the results of tests or changes in either the server or the network. The ability to inspect and analyze the iSCSI data flow is vital to resolving problems that span both the server and the network. Many server admins will choose FC so that they can have the management tools and the simplicity of managing their own network. iSCSI offers the same features (but with more choices and options), but you must proactively choose and implement them.

When implementing an iSCSI backbone you should ensure that you get iSCSI HBAs for your servers. You will improve your server performance and get better visibility into your iSCSI service and network overlay.

Seven fundamental reasons why FCoE will fail in the market


I have been evaluating FCoE for a while now, and have been researching the technology for my latest project. It's a fine technology, if you believe in Fibrechannel. I believe it has some nice features and will offer customers some very compelling reasons to purchase.

But if you stand back and look at the market space as a whole, I cannot see FCoE being successful. Here are my reasons.

There are no standards

The first round of the FCoE standards process is estimated (yes, estimated) to be finished by the end of 2008, and all areas of FCoE won't be finished until 2009/2010. This assumes that none of the vendors start bickering over the details (think of the 802.11 wireless standards). If any of the participants start bickering, you could be looking at another WiMAX / 802.11n debacle.

Have a look at the list of FCoE vendors and decide for yourself if they can deliver: Cisco, Infinera, Ericsson, Qlogic, Emulex, Juniper Networks, Fulcrum Micro, Brocade, IBM, Fujitsu Microelectronics, Broadcom, Intel, HP.

The year of 10Gb Ethernet won't be until 2010.

We are seeing some early-stage shipments of 10 gigabit Ethernet switches from the usual players. However, these are very simple Ethernet switches that are missing many of the features that we expect and want to be there. Sure, they have Ethernet and some basic layer 3 routing, but the rest of the features will not be mainstream until 2010.

And of the 10Gb switches today, only the Cisco Nexus 5000 supports pre-standard FCoE. A single vendor does not make a market (no matter how large you are). And take note of the "pre-standard" - being stuck with pre-standard FCoE would be a very, very expensive mistake.

For FCoE to be successful, you must buy new switches that support PFC, ETS, and DCBX Data Centre Ethernet extensions.

These features were submitted to the IEEE in February 2008 as draft submissions. Just when will these be approved, and then implemented by the switch manufacturers? This year? Next year?

DCE-capable switches are not here yet, they are going to cost a bomb, and it will take a while to be confident of their performance and delivery. It is going to take quite some time before the big customers will purchase these switches. I am sure that there will be a lot of smoke around DCE, but I find it hard to believe that substantive volumes of equipment will ship before 2010.

You must buy FCoE HBA for servers, and then wait for the drivers to be certified by all the storage vendors

Each FCoE HBA will need design, production and software drivers. But even when this is done, these adapters will need validation from the storage vendors. This process is going to take at least six months for the slowest and least capable FCoE adapters. The high performance and feature rich adapters will take about a year to move through the production cycle.

Why did Cisco buy Nuova Systems so quickly?

FCoE is largely developed and driven by Nuova Systems, a Cisco-funded startup. Cisco unexpectedly purchased the remaining interest in Nuova Systems in April 2008. Many people commented at the time that this was quicker than normal and an unusual move by Cisco.

Did Cisco fast-track the acquisition of Nuova to head off the iSCSI movement? It's very possible. The vendors who make iSCSI HBAs will now have to split their resources to develop FCoE adapters as well, and this will slow down iSCSI development.

FCoE is a transition technology

One of the most surprising aspects is that everyone agrees that FCoE is a transition technology; it is not the final destination for network-attached storage. You can see this repeatedly in early user-group and forum postings. Even the latest submission for DCBX to the IEEE clearly shows FCoE is regarded as a temporary measure to transition Fibrechannel systems to an Ethernet physical connection and then an IP backbone. It also talks about FCoE being for legacy or existing Fibrechannel sites; greenfield sites are not expected to use FCoE.

Surprised? Well, yes I am. I don't want to be investing in short-term technology. Living in a niche marketplace is not a happy place and is bad for career aspirations.

iSCSI will move into the gaps

By the time these issues have stabilised, iSCSI will have moved into these gaps and made FCoE obsolete before it achieves significant market penetration. Even though many people call iSCSI an SMB technology, or an unreliable or untrustworthy system, they will also tell you that many of their servers are using iSCSI today. It won't take long before the price and simplicity advantage outweighs the supposed benefits of FCoE.

icon
Michael Kramer
Tue May 6 9:07pm

iSCSI will continue strong growth, Infiniband in our future.

Many companies have not yet deployed SANs in their data centers. They will likely look to iSCSI because it has been around the block and is a proven technology, with decreased cost and a lesser learning curve than FC switched fabric. However are Ethernet and twisted pair near the end of its roadmap for bandwidth? CAT7 requires more costly shielded twisted pair and before you know it we'll be back to fiber. Why aren't we hearing more about Infiniband? It seems to have great potential for low-latency big-bandwidth. Is it fear of the unknown, is it too new? With its open standard, it may do for interconnections what LTO did for tape drives.

I think iSCSI is great for many companies and will continue to spread tremendously in new implementations. Smart SAN manufacturers will make arrays that support both FC and iSCSI if they haven't already. This will take us into 2009 or so when people will wonder why they didn't invest in an Infiniband array. I think SANs should start being offered with iSCSI and Infiniband, or FC and Infiniband, or perhaps all three! I'm not sure I see a good fit for FCoE. I would pursue Infiniband instead.

The only way that won't take off is if the storage market follows the path of mobile technology; consumers may give up quality in pursuit of convenience. It's possible that convergence technology such FCoCEE (Fibre Channel over Convergence Enhanced Ethernet) may take off, but I personally don't see it. Companies will deploy iSCSI before diving into unproven convergence technology and risk their data network integrity.

Storage Virtualization - Where Should It Go

Writing about storage virtualization, I should start by defining which virtualization I mean. Generally, a simple RAID volume can be called "virtual" too, as it is a logical representation of some more complex logic behind it. Don't worry, I'm not going to write about RAID. Instead, my mind is full of mirrors, snapshots, clusters, recovery sites and a single question: at which layer of the SAN infrastructure should these features live?

Today, we find storage virtualization implemented mostly in two places:

  1. Built into array controllers, or
  2. running as software installed in a hardware appliance or Fibre Channel switch placed between the arrays and the SAN clients.

The first usually doesn't go far beyond a semi-working mirroring feature you receive a huge bill for. Another common problem with these vendors is that their scope ends at block level; they don't really care about host applications. Sure, the primary function of a storage controller is different, and mixing it with a complete stack of virtualization features in one box might create more trouble both in design and during operation.

The second kind of virtualization many people presume to be an in-path obstacle wearing another vendor's label: a box they have to learn how to manage, pay maintenance fees for, and so on. It doesn't matter how much you explain that it's full of features, that it's not necessarily a single point of failure, or that it adds only minor latency.

As a result, there is a large set of SAN installations lacking modern virtualization features. Is that bad? It is, I think. Safety features like mirroring with transparent failover or consistent snapshot replication should be an obligatory part of each SAN installed in 2008. At the least, they should be available as part of storage solutions from SMB up through all the marketing-labelled tiers.

What's a way to avoid these drawbacks and bring storage virtualization to more SAN users? As always, I believe the way goes through simplicity and standardization. Let's divide each virtualization feature into two parts: one that inevitably needs to be implemented at the controller level, and a second that would reside at the host. No need for any third, in-band level in this design.

Suppose you set up synchronous mirroring in such a design. The host would then send the blocks it writes to all arrays configured as part of the mirror. The benefit is that there is no retransmission from the primary controller to the mirror one and no central in-band appliance. In case of array failure, the host itself selects another array. From this perspective, it could be just an improved MPIO driver. I'm an optimist, so I believe there is a way to write such a driver to be vendor-independent. Thus you could mirror your HP to IBM, LeftHand to EqualLogic ;-)
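
Purely to illustrate the idea, here is a conceptual Python sketch of such host-side fan-out mirroring. This is not how an MPIO driver is actually written; the file paths standing in for arrays are hypothetical, and the Unix-only os.pread/os.pwrite calls are used for brevity.

```python
import os

class HostSideMirror:
    """Conceptual sketch of host-side synchronous mirroring: every write is
    fanned out to all configured backends (stand-ins for LUNs on different
    vendors' arrays) before it is acknowledged."""

    def __init__(self, backend_paths):
        # Each path is a hypothetical stand-in for a LUN on a separate array.
        self.fds = [os.open(p, os.O_RDWR | os.O_CREAT) for p in backend_paths]

    def write_block(self, offset, data):
        for fd in self.fds:          # send the block to every "array"
            os.pwrite(fd, data, offset)
            os.fsync(fd)             # don't acknowledge until it is durable
        return len(data)

    def read_block(self, offset, length):
        # Read from the first backend; a real driver would fail over on error.
        return os.pread(self.fds[0], length, offset)

mirror = HostSideMirror(["/tmp/array_a.img", "/tmp/array_b.img"])
mirror.write_block(0, b"hello SAN")
print(mirror.read_block(0, 9))
```

The point of the sketch is only the shape of the design: the fan-out and the failover decision live on the host, so no in-band box sits between client and array.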

There is already a similar implementation of such out-of-band virtualization in the Fibre Channel world: LSI StoreAge. Most of its features work on Windows only, however, and it still requires a hardware appliance to be set up in the SAN. I know of no similar implementation in the iSCSI world.

Having the host part of storage virtualization brings another advantage: it's close to applications. It's application data we need to protect, not low-level blocks. Application support is especially important for creating snapshots and replicating data to remote sites. We could manage SAN data much more safely and in a simpler way if the SAN border moved closer to applications. Of course, some standardization work has yet to be done here.

To summarize, I see the current storage virtualization as too in-band-ish. Although there are some rare efforts to put selected tasks on SAN hosts (e.g. FalconStor's DiskSafe agent), they stand alone, without further plans to replace the central appliance. If I were an array vendor, I would consider pairing with FalconStor to strengthen the market for interoperable, application-centric SANs, bringing more ways to use "my" arrays.

The future of storage is a really wide, and open ended topic. I've been in
the data center/systems administration/storage business for 30 years, so I
can look back a long way. The one thing that's been a constant during all
that time is that things change, and they change in ways we never expected
at the time.

So looking very far into the future is going to be rife with speculation and
in most cases just plain wrong. But what I think we can do is look at
current technology which is just being introduced, and make some intelligent
guess about how it might be used in the very near future, say within the
next couple of years.

I think that there are a few sea change kinds of technologies which are just
starting to become mature enough that they are being deployed into
production right now. Things like storage virtualization, deduplication,
thin provisioning, etc. are really starting to catch on and I think that
they are going to get combined in some interesting ways. For example, I
think that server virtualization when combined with thin provisioning,
network attached storage, and deduplication can have a very profound impact
on some of the real issues that we are fighting in the data center today.

Specifically, if you combine those technologies, you will be able to deploy
systems much more quickly, and manage them much more efficiently than we
ever have before and you will be able to do it using less disk space than
you might be today.

That's a good thing since systems are proliferating like mad. Server
virtualization has created more "servers" than ever and created more demand
for storage than ever. Being able to grow a virtual server's available space
on the fly without disruption to the users of that virtual server will soon
become a requirement in this 24x7x365 world a lot of companies are finding
themselves living in. Not only will you have to be able to grow the storage
on a virtual server at the drop of a hat, but you will need to be able to
manage all of the storage behind all of those virtual servers much more
efficiently than you ever have before. That means dealing with the issues of
end of lease, outages, backup and recovery, and DR. Some of those new
technologies are going to be very helpful in dealing with all of those
issues, and combining them will create some synergies as well.

Soon we will be able to grow the storage on a virtual system on the fly,
maybe even automate the growth from a pool of storage set aside just for
that purpose.
The OS and applications stored on those virtual servers will be stored using
deduplication technology reducing the on disk footprint by a significant
margin which will help to slow the growth of storage a little. We will back
up these virtual servers to disk, again using deduplication technology so
that the backups don't take up huge amounts of disk space and we will use
CDP to replicate that data to our backup data center, again using network
deduplication technology to reduce the quantity of data flowing over our
network link. All of the above we can do by putting together existing,
shipping technology.

Some of the issues that we are beginning to run into, however, don't have a
cost effective solution yet. For example, how are we going to address the
performance issue of current disk drive technology? Back in the day, when we
had to throw a lot of drives at a big database just to get enough space,
bandwidth to disk wasn't an issue. But drives have been getting bigger and
bigger for a long time, while the rate at which I can pump data to/from a
drive has remained relatively flat in comparison to the quantity of data
stored on a drive. We've reached the point where we end up running drives
1/2 empty in order to spread the database across enough spindles to get the
performance we need.

One interesting technology that is just starting to ship from array vendors
is SSD. Sure, it's expensive as heck. Some say 300x what a regular disk
drive costs. On the other hand, if you're running your database on 1/2 empty
drives, that isn't exactly cost effective either. Besides, like with any
technology I'm sure that as more and more of it gets shipped the price per
GB will come down.  So I think that SSD is something we are going to see a
lot more of over the next 3 or 4 years. Of course it will start with the
people who have the deep pockets and the applications that really need the
speed. ERP, for example, is one application that springs to mind on the
business side that can take advantage of SSD. There are certainly a lot of
other applications on the scientific side that will also be able to take
advantage of SSD, and I'm sure that certain 3 letter government agencies
already have it deployed (that's a guess, please don't send the guys with
the mirrored sunglasses to my house to find out everything I know about
this, I don't know anything, really).

The bottom line is that the business side of IT is going to be looking to
reduce costs, just like always, while they continue to grow their dependence
on IT. The seemingly never ending thirst for more storage is not going to be
ending any time soon. If anything new technologies will just continue to
accelerate this growth, and our challenge is going to be to figure out how
to manage these massive amounts of storage as efficiently as possible.
Because if we don't, upper management is going to find someone who can.

My advice to everyone who's managing storage is: don't be afraid of new
technology. Find ways to combine it so that you can be more efficient and
effective in managing the storage. Remember, you have been entrusted with
something that is rapidly becoming the crown jewels of your company, the
company's information, and it's up to you to manage and protect that
information. That's an awesome responsibility, but don't let it scare you;
it's also a heck of an opportunity!

--joerg

Ubiquitous Storage - Data Everywhere

How many times have we heard someone saying: "Send me the quarterly numbers…I will check them on my Blackberry…" "It's a big file…gmail it…or put it on the thumb drive…" "Let's look up everything on xyz product – designs, manufacturing records, marketing materials and support cases." When you start connecting the dots of data management, it is no longer limited to the data center or desktop...

If you follow the bit trail....There is data everywhere...

Data management revolves around a simple information lifecycle and associated tools:

  • Data Creation – MS Office, ERP
  • Distribution – Email, Web, FTP
  • Action – Document management systems & other applications
  • Dispose – Backup and archival applications

On further study, you will realize there are three distinct classes of storage:

Personal Storage

  • iPod generation
  • Smart phone, Blackberry, PDA
  • Thumb drive

Internet Storage

  • iDisk, SkyDrive, Gdrive
  • Flickr, YouTube, Gmail
  • Online backup

Enterprise Storage

  • Gigabyte mailboxes
  • Terabyte data warehouses
  • Petabyte file archives
The history of personal storage can be traced back to punch cards, microfilm, large floppy disks (5 1/4-inch or larger) and tape. The adoption of solid state drives into consumer electronics has driven down the cost of personal storage devices. The evolution from kilobyte- to terabyte-sized personal storage devices creates a compelling need for data management.

There is a notion that Internet storage is always available and accessible. It does not eliminate the need for data management. There is a sense of loss of control and lack of privacy with Internet storage, and this in turn creates new opportunities for data management for Internet storage. There is no need to emphasize the need for data management in the enterprise storage space.

Innovative companies like Fabrik (http://www.fabrik.com/) are tackling these data management challenges by marrying personal storage with Internet storage and topping it up with online backup services through its recently acquired personal storage vendor (http://www.Simpletech.com). There are others, like Apple, who have integrated simplified backup capability into a wireless access point, with deep integration into the operating system, tying it to .Mac-based Internet backup services. There are players like Synchronica offering solutions to back up mobile-phone-based personal storage. We cannot ignore the Google Gears approach to always-available data for Google Apps. Microsoft SkyDrive and Live integration with its Cloud OS (Internet Cloud) strategy is a step in the same direction.

There are hardly any vendors seriously pitching enterprise storage in the cloud. The new trends in virtualization like Virtual Desktop Infrastructure, cloud computing, grids, managed applications and Software-as-a-Service are first steps toward moving enterprise data into the cloud.

Ubiquitous Storage can be viewed as a convergence of personal, Internet and enterprise storage that's always in sync. This convergence is inevitable; it's happening on mobile devices like laptops and smart phones today, and it is taking shape in the data access layer in the enterprise with mash-ups and enterprise portals. The future challenges of data management revolve around providing consistent means to manage personal, Internet and enterprise storage. I firmly believe that next generation operating environments and web computing platforms will realize the vision of seamless, always-available data everywhere... a.k.a. ubiquitous storage.

The greatest change that the SAN market could make is to jump from the exclusivity of the enterprise into the home market.

As the multimedia home starts to develop, and more and more devices enter the space previously occupied by the PC, a cost-effective SAN could become the centre of the home.

Why a home SAN?

Server to storage - traditional transport, but the high bit rates offered by SANs would allow large files (think high-definition movies) to be moved around the home. Servers are replaced by PCs, DVD players, games consoles and hi-fis, most of which have some element of computing technology built into them.

Server to server - using a home SAN as a fast transport service would allow fast sharing and recycling of files. Download a file from the internet on your PC (say a purchase from iTunes) and have it moved to your media centre or the hard disk on your DVR for playback on your HD TV.

Storage to storage - moving files between storage elements without the intervention of a server. Back up files to DVD, or move files from your DVR (TiVo) to your PC to watch while you're travelling.

What home devices would you connect to your home SAN?

  • PC
  • Laptop
  • Games console (Playstation, Wii, Xbox360)
  • DVR (DVD recorder, TiVo)
  • DVD player
  • Podcast device
  • Photo storage
As home working becomes more viable, having SANoIP would also provide a backup service for your work, to aid disaster recovery and the sharing of data between users.

A vision for virtualization.

As a current user of storage virtualization, I couldn't feel stronger
about the strategic place this has in today's datacenter. Living with
such a rapidly changing/evolving technology as SAN storage has its own
inherent limitations on some of the key benefits and features it is
intended to provide. Technical requirements for where you can use things
like Flash Copy, what types of storage can be pooled together, etc.,
force admins into purchasing decisions and technical directions. I don't
know about you, but I do not like being forced down any technical
roadmap!

Many vendors offer this technology today, and they all have subtle
differences. But there are some key features they all have in common,
and I would highly recommend anyone who is not currently using this, to
take a long hard look at it. I run a mixture of SCSI, SATA, and SAS disk
technologies. Yet I manage it all through one centralized console, and
view it all as one large storage pool. I've used Flash Copy between
these dissimilar systems, I've (temporarily) grown a virtual disk from
one to another in an emergency, and can toss a host from one to another
as performance requirements change.

Great example - I deployed Exchange 2003 years ago with the logs on R10,
and the databases on R5 (yea, I know...). As our user base increased,
performance started to suffer. I had just brought in some SAS disk for
some new storage requirements, so I had some unallocated disk to borrow.
On a Saturday from my home office (cold beer in hand), I striped out the
new disk, lifted Exchange off its current SCSI R5 disk and dropped it on
the new arrays. I added a little more SCSI disk to that system, blew
away the original R5 arrays, combined that disk with the new SCSI disk
and re-striped all of that as R10. On Sunday (with a fresh beer in
hand), I tossed Exchange back on to the R10 SCSI disk. And all of this
was with 1500 users logged on, with absolutely no down time, and an
unnoticeable performance effect. What is that type of flexibility worth?


I could do (and maybe I will) an entire separate write-up on the
performance enhancements as well. Tired of playing the spindle game???
The performance I get from mid-range 300 GB SAS drives is intrinsically
better than what I got from 73 GB SCSI drives on my FC disk system
natively. This was another thorn in my side that virtualization has
removed.

Do yourself a favor, read up on the different storage virtualization
offerings out there, make your case to whomever you have to, and once
you shove your storage behind this type of technology, you will be like
me and wonder how you ever managed without it!      

The Future of Home Storage

Along with my professional focus on enterprise storage systems, I'm enamored of home networking, and recently passed the two terabyte mark at home!  Along with David Mould's post on the 19th, this got me thinking about where home storage is heading.

Past Failures: Home Servers

Home storage appliances and servers have come and gone over the years, with none seeming to make much of a mark.  The market remains littered with UPnP media servers and home NAS boxes dashed on the shoals of an unappreciative public.  Nearly every home network device company has produced one or two home storage servers, none of which have succeeded.  Although I use a Linksys NSLU2 at home, I had to hack its Linux software and completely replace Linksys' features to create a useful device!  The un-hacked NAS devices from Buffalo, Western Digital, Netgear, and the rest have generally failed to find buyers as well.  So far, consumers seem content with simple USB and FireWire external drives.

The most adventurous home storage servers came from Zetera and Ximeta, both of whom relied on proprietary IP SAN protocols. Note that these were SAN products, sharing block storage over Ethernet, rather than conventional NAS solutions.  Both required drivers, limiting client support.  The one Zetera buyer I know was pleased by the performance but never used the device as anything but a large hard drive for one PC.

Then there is Microsoft.  Recall that Windows Home Server is only their latest attempt to enter this market, and yet I know of no one who has adopted it.  The same can be said of the various media center servers from Microsoft and others.  At this point, it seems likely that the future of home storage servers will not come from Microsoft, though their two Xbox generations have great potential as clients.

Even EMC has entered the market with their nifty (but largely unnoticed) LifeLine product.  Supporting file services and backup for computers as well as audio and video for media players, EMC positions LifeLine much like their Retrospect backup product, but goes further in offering a complete software solution for hardware OEMs wanting to offer a non-Windows home server. Although an impressive offering, it is too early to tell if EMC will have much success with this product.

The Sleek, Shiny Elephant in the Living Room

Of course, there is one company that sells media players and servers by the bushel, complete with sleek, shiny interfaces.  Apple's tremendous success with the iPod has led to their iTunes software becoming the dominant media organization platform, complete with its own proprietary discovery and sharing protocol.  Now, with the Apple TV and video iPods, the company is broadening into more media categories.  Surely their dominance here puts them in a special position when it comes to setting the stage for a home server/storage revolution.

They also have a strong position in the world of dedicated home storage.  Their Airport products are among the only routers to be widely implemented with shared storage.  Although many other companies offer similar products, low customer understanding means that these functions are not widely used.  And the new Time Capsule device is surely already the most widely-used home NAS product.

But Apple has not yet shown any home server strategy.  Administering multiple iTunes servers can be frustrating for users, with no inter-iTunes synchronization or centralization capability.  Although the Mac Mini, Apple TV, or Time Capsule could certainly be seen as a home server, the company does not position them as such in the market.  Indeed, some iTunes users like myself rely on compatible third party media servers like Firefly and TwonkyVision rather than using iTunes itself.

One issue for Apple is their reliance on proprietary protocols.  Although the Bonjour discovery protocol is certainly simpler than UPnP in practice, Apple stands alone in relying on it.  They also steadfastly stick to AFP for NAS and DAAP for remote media streaming.  This limits the number of third-party clients and servers that can be used with their hardware and software.

The Future is Friendly

Although Apple has not yet tipped a home storage strategy beyond Time Capsule and Airport Extreme, they are best positioned to deliver a real home storage solution.  A simple step would be to create an iTunes media server integrated with Time Capsule and add client/server media synchronization. The company already has OS X backup and file services integrated, and this move would further centralize the digital home around Apple products. But the company's reliance on closed protocols like DAAP is worrisome, since it locks consumers into nearly all-Apple solutions.

Microsoft's Media Center/Home Server combination, based around UPnP, shows great promise, with many compatible third-party clients and servers already available.  But my own experience with the solution has not been at all positive (I still can't get my Roku SoundBridge, Vista Ultimate laptop, and Media Center PC to see each other!), leading me to question the viability of this option.

Although Apple or Microsoft could come to dominate, I suspect the future of home storage is out of both companies' hands.  A number of others are working on improved home server experiences, including EMC's LifeLine and the expanding use of Debian Linux and open source tools.  But all could be sidelined by improved Internet-based services.  Google, Microsoft, and Apple continue to expand their online consumer suites with greater storage, synchronization, and multimedia integration, and all have the potential to reduce or eliminate the need for in-home storage.

Although I cannot yet tell which service will win, one thing is certain: Consumers demand friendly, flexible solutions.  They don't want to fuss with their media, and they don't want simple shared storage.  They want integration with multiple devices and flexibility to access their content on any device.  The first company to offer a simple, flexible storage server for the home will surely be on the right track!

Years ago, everybody thought that IT was about computers with an internal disk running an operating system, and tape drives to store corporate data. And look where we are now. An evolution in the way we use storage space has already taken place.

Where hard disk space used to grow enormously internally in computers, that has come to a halt, and external storage has taken part of its place. All kinds of people walk around with USB drives, and NAS has become a regular term like spaghetti and cheese. 

The corporate world has also seen an evolution. It went from tape drives to network-attached disks, and now to storage solutions where you don't even speak of disks anymore, as the total amount of storage space is enormous and space just gets added to servers in need. Although the servers believe they see disks, they do not. So where is this all going?

Recently someone said that the exploding use of storage space could be reduced by 20% if you taught users how to delete stuff they don't need. As the cost went down, we all started using more and more. The big question is: "Is this better than before? Are we working more efficiently?" Not necessarily. As an independent consultant, I see that most of the big companies out there have no clue how to manage their storage space and make good use of it, but merely see it as a place to pile up documents. But how do you find the data you need efficiently?

In other words, now that we have the storage we always wished for at a good price, we don't even bother using it efficiently, we just waste it without caring. So what have we gained?  

Common SAN Performance Issues

Performance of storage devices often becomes a subject of controversy between the storage reseller and the customer. There are always false expectations, wrong setups and hot heads when it comes to performance. I decided to write about the trouble sources I meet most often. I believe enumerating and discussing them might not only help storage beginners, but hopefully also inspire storage vendors to improve existing features or develop new ones.

Issue 1 - NAS Protocols' Single Client Limits

Many of us see this every now and then: having attached his shiny new array to a Windows server, an unlucky new owner benchmarks it by copying an ISO file between his laptop and the Windows server's share. When it passes 40 MBps we start celebrating, which leaves the man wondering all the more.


More seriously now. Don't expect any common NAS client to get much above 40-50 MBps. File sharing protocols like SMB/CIFS (Windows sharing) or NFS (Unix file sharing) are designed to serve multiple parallel clients. A single client is not able to get the most out of them, and thus cannot demonstrate the SAN's performance. No, even copying multiple files in parallel from a single client doesn't help.

If you really need to prove SAN performance through a NAS protocol, consider deploying more NAS clients, more laptops. Some 10 to 15 might suffice to fill the uplink from the array to the NAS appliance.

Future?: Is single-client performance of NAS protocols something needing improvement in the future? I'm quite sure it is. File sizes are increasing, and users and even application backends need to transfer larger files between machines. The backend block-level storage infrastructure is speeding up too; we have 10 Gbps there now. The traditional protocols' improvements don't keep up with this trend, however. There are a few vendors, like F5 or Ibrix, developing their own ways to overcome these limits. Their coverage is minor, however, compared to how many users depend on CIFS. There is a lot of room for improvement, in my opinion.

Issue 2 - Random vs. Sequential vs. Rotating Drives

Many people believe the rising new 10 Gbps Ethernet or 8 Gbps FC infrastructure will help their applications run faster. Will it? To a large extent it won't. How large an extent? Well, as large as the share of database-like applications they run. The core of the problem is a common misunderstanding based on the expectation that all applications need big pipes to perform "fast".

This brings me to the important point, which is to realize what traffic pattern each application generates: whether it is close to random (relational databases, mail systems, ERP applications) or sequential (file servers, streaming applications). As long as we think of conventional rotating drives, different drives are differently suited to each sort of traffic. Perhaps everyone has heard that SAS is for databases and SATA for files. The difference is in the drive's design: latency, seeking algorithms, rotational speed. This set of features allows SAS (or SCSI, FC) drives more head movement and more sectors reached throughout the drive to serve the database. Database applications are eager for multiple small blocks residing in various places on the drive's platters. That's why SAS drives are able to feed them faster, but almost never at the speed of a sequential transfer.

Today, the simplest performance solution for sequential traffic is to use "big pipes", while for random traffic - apart from choosing the right drive type - a higher number of drives helps to gain more I/O operations.
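
For anyone who wants to feel the difference on their own hardware, here is a minimal Python sketch comparing sequential and random 8 kB reads against a scratch file. It is an illustration only, not a substitute for a proper tool like IOmeter; on rotating drives the random pass is typically far slower, although the OS page cache can mask the effect unless the file is much larger than RAM.

```python
import os
import random
import time

# Minimal sequential-vs-random read comparison (Unix-only os.pread).
# Illustration only: the OS page cache can hide the difference unless
# FILE_SIZE is much larger than the machine's RAM.

PATH = "scratch.bin"
BLOCK = 8 * 1024                  # 8 kB blocks, a typical database I/O size
FILE_SIZE = 512 * 1024 * 1024     # 512 MB scratch file
BLOCKS = FILE_SIZE // BLOCK

buf = os.urandom(BLOCK)
with open(PATH, "wb") as f:       # create the scratch file
    for _ in range(BLOCKS):
        f.write(buf)

def run(offsets, label):
    fd = os.open(PATH, os.O_RDONLY)
    start = time.perf_counter()
    for off in offsets:
        os.pread(fd, BLOCK, off)
    elapsed = time.perf_counter() - start
    os.close(fd)
    print(f"{label}: {FILE_SIZE / elapsed / 1e6:.0f} MB/s, "
          f"{BLOCKS / elapsed:.0f} IOps")

run([i * BLOCK for i in range(BLOCKS)], "sequential")
run([random.randrange(BLOCKS) * BLOCK for _ in range(BLOCKS)], "random")
os.remove(PATH)
```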

Future?: I'm quite sure the future will bring a great change to this conventional picture. As memory-based drives have no rotating parts, they have no problem with time-consuming seeks to the sectors required by the application. Theoretically, their random performance is the same as their sequential performance, and it is really high, tens of times faster than rotating drives. Practically, however, today's NAND-based SSD drives suffer from a big latency caused by the need to erase each block before it can be re-written. I believe, and there are even some notes by anonymous engineers to this effect, that disk engineers are working hard to solve this problem and thus allow flash drives to become the best performance option for database applications.

Issue 3 - IOps vs. MBps

The last performance issue I'm going to discuss is not a real problem; it's just a common misunderstanding of the units used to measure performance. In promotional materials, magazine articles, discussions and many other places I read statements that random (database) performance is measured in IOps while sequential performance is measured in MBps. What these sources usually don't mention is that both of these units are related to each other and both of them can be used to describe any sort of traffic. It is just that IOps better express database requirements, which are operations-related, and MBps better express the throughput numbers used for file transfers.

The third unit which binds all this together is blocksize, i.e. the amount of data an application uses to form each I/O request. All these units can be put into a simple equation:

throughput (kBps) = operations (IOps) x blocksize (kB)

How do we interpret these numbers? For example, take a common mail system generating a load of 500 I/O operations per second. It is a database-based application, using 8 kB blocks. Placing these numbers in the equation above, you get a throughput of 4000 kilobytes per second. Such a number would be really bad on a file server. For that mail system, however, it means 500 operations per second, which should suffice for about 500 users. That is quite enough and, what is more, you only employ some 4 or 5 SAS drives for it.
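
A quick Python check of the arithmetic (the 500 IOps mail system is the example from the paragraph above; the helper function and the 64 kB comparison are added only for illustration):

```python
def throughput_kbps(iops: float, blocksize_kb: float) -> float:
    """throughput (kBps) = operations (IOps) x blocksize (kB)"""
    return iops * blocksize_kb

# The mail-system example above: 500 IOps at an 8 kB block size.
print(throughput_kbps(500, 8))    # 4000 kBps, i.e. roughly 3.9 MBps

# A hypothetical sequential workload at the same operation rate but with
# 64 kB blocks, to show how the same IOps figure maps to a higher throughput.
print(throughput_kbps(500, 64))   # 32000 kBps, roughly 31 MBps
```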

Look at the attached picture (sorry, I couldn't embed it into this page). It displays the results of a single IOmeter test running a random-operations benchmark. The two graphs display MBps (left) and IOps (right) on their y-axes, with blocksize on the x-axis. Just notice that the two graphs are two interpretations of the same test, the same performance under a given blocksize.

There is no "future" paragraph for this issue; I just hope I've helped someone understand these units a little more ;-)


Flash Forward or Flash Back?

The tech industry has been buzzing about solid state drives (SSDs) again lately, but many questions remain.  Even after many major vendors (Apple, EMC, and Dell to name a few) have introduced NAND flash-based disk into their core products, it is unclear whether non-disk storage will fly or flop.  I'm betting it will find a nice niche, but that traditional spinning disks are here for a good long time.

Apple's Flashing Success

When Apple switched from hard disks to flash in a mainstream product line, the world was abuzz with the novelty:  Would flash displace hard drives?  Sure, the company still offered disk-based storage for those needing vast capacity, but most people found that 8 GB or so of storage was plenty for daily use.  Of course, instead of the MacBook Air, I'm talking about the massively successful iPod nano, which rules the standalone iPod market with 4 or 8 GB of NAND flash storage in a tiny but full-featured package.

Like the Air, the nano demonstrates that what matters in the "take it with you" market is portability in the form of low weight, perceived durability, and compact dimensions.  And NAND flash excels when it comes to packaging.  The flash-based iPod is an excellent semaphore for this market segment in other ways, too.  Audio files are fairly small, so music users don't need as much storage as movie lovers.  They will gladly ignore the cost per GB, too, at such small capacity points:  iPod nano buyers pay ten times more per GB than iPod Classic buyers.

In the case of the iPod, the compact size and joggable durability afforded by the nano is worth the money to most buyers, now that the flash player has sufficient capacity to meet their needs.  The MacBook Air teaches a slightly different lesson:  Although reviewers are quick to point out that the speed and battery life differences between the hard disk and NAND flash versions of the mini notebook are negligible, many buyers have been happy to pay the extra $1000 to skip the disk.  In this case, they are paying for quick access time, light weight, and durability that exist as much in their perception as in real-world benchmarks.

EMC's Heavyweight Champion

In the exact opposite corner of the data storage world lurks EMC's top-line Symmetrix DMX storage array.  When the company announced the availability of NAND flash drives as their top-tier choice for storage, it turned the heads of the whole enterprise storage industry.  Although the technology implementation is substantially different from Apple's iPod, EMC's move suggests that another group of customers exists who are similarly unimpressed by a low cost per GB: Enterprise application managers.

Many have suggested that enterprise flash is not yet competitive in terms of price, capacity, reliability, or even performance.  And they have publicly disagreed with EMC CEO, Joe Tucci, who claimed effective parity after 2010 at the recent EMC World event.  After all, today’s enterprise flash drives are far more than ten times more expensive than their spinning brothers, and disk capacity continues to march higher by the month.

But the comparison is not about the cost of apples or oranges.  In the enterprise storage space, flash drives sit at the top of the pyramid, with just a few units added into the traditional tiered storage mix as a “tier zero” of maximum performance.  It is not as simple as pulling out a set of 146 GB FC drives and replacing them with a similar number of flash units.  Instead, a few key applications or data sets are migrated up to the pinnacle, with the rest of the stack remaining the same.

There is huge promise when this tiered model is combined with storage virtualization, especially the automated variety.  If the tiny percentage of storage that truly needs top-tier performance could be moved to a few solid state disks, the whole stack will benefit from reduced device contention.  If automation could make the decision on a block-by-block basis, the effectiveness would be much greater.
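
As a toy illustration of what block-by-block automation might look like (purely a conceptual sketch of the idea, not how any vendor's auto-tiering actually works), a policy could track per-block access counts and keep only the hottest blocks on the small flash tier:

```python
from collections import Counter

class TieringPolicy:
    """Toy block-level tiering: keep the most frequently accessed blocks on
    a small 'tier zero' of flash and leave the rest on spinning disk."""

    def __init__(self, flash_blocks: int):
        self.flash_blocks = flash_blocks   # how many blocks fit on flash
        self.heat = Counter()              # access count per block id

    def record_io(self, block_id: int) -> None:
        self.heat[block_id] += 1

    def placement(self) -> dict:
        hot = {b for b, _ in self.heat.most_common(self.flash_blocks)}
        return {b: ("flash" if b in hot else "disk") for b in self.heat}

policy = TieringPolicy(flash_blocks=2)
for block in [7, 7, 7, 3, 3, 9, 1, 7, 3]:
    policy.record_io(block)
print(policy.placement())   # blocks 7 and 3 land on flash, 9 and 1 on disk
```

The real engineering problem, of course, is doing this continuously, online, and without the metadata overhead swamping the benefit.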

I Still Remember

There is another kind of solid state disk in play, too.  For over two decades, company after company has pushed the idea of packaging high-performance DRAM as a disk substitute for enterprise storage, just as EMC has now done with NAND.  These RAM-based disks offer even higher performance and prices than their flash-based cousins, and none has taken the industry by storm.

Way back when a tiny EMC was one purveyor of solid state storage, I recall the philosophical conundrum posed by the devices:  Is it better to package DRAM as storage and use it in a conventional manner or to use that same memory as a cache for actual disks?  The market voted for the latter, with EMC and others introducing in-array cache to accelerate RAID to great effect.  System memory expanded in parallel, with modern servers optimally caching data in three or more levels internally as well.

Where Does the Flash Go?

For most uses, this is precisely the correct configuration.  The priciest and quickest “storage” is placed close to the CPU, with performance and cost dropping and capacity increasing as one moves outward.

Where does flash belong, then?  Apple teaches us that NAND flash delivers the goods when it comes to the portable market, and it is likely that the use of this technology in this area will only continue to grow.  And EMC shows that there is a need for higher performance in the enterprise storage world as well, though perhaps not enough for pure DRAM devices.

The message is clear:  As long as the cost of disk continues to lead, NAND flash will remain a niche product.  There are certainly markets for NAND-based devices, from portable computing to the enterprise, but disk just works too well to be displaced.  While one can never see too far into the future of storage, it seems clear that conventional hard disks will remain the dominant media for a few more generations of technology at least.

Greg Ferro
Wed May 28 7:31am
I find it hard to agree with this prima facie. The flash manufacturers have a substantial oversupply of manufacturing capacity and have been desperately looking for new opportunities to expand production and recover their sunk investment. I believe the rapidity with which Samsung developed and announced its 256GB SSD is a testimonial to this fact.

The only impediments to rapid SSM adoption will be user reluctance, supply shortages, and write-cycle limits.

Many people believe that storage must be proven ultra-reliable and validated before new technologies are even considered for deployment, and they will be slow to take up this opportunity, despite the fact that their storage farms are already massively redundant and overprovisioned for availability.

This will be coupled with concern about the number of write cycles before flash begins to fail. The real impact is small, since hard drives already compensate for sector failures at the device level.

The small and medium market segments are likely to adopt this technology rapidly, mostly because of the hype and marketing. This will drive a surge in the manufacture of flash drives, probably overrunning the available capacity, and a short-term price spike will occur before production ramps up. As evidence, visit Digg, Reddit, et al. and listen to the hysteria that every SSM announcement generates.

Given that everyone knows about SSM, this is another technology that will grow from the bottom up and force its way into the data centre by brute force. As a comparison, look at SATA drives being extensively used in SANs today.

The big end of the market will be slow to adapt and change, as usual, which will provide them time to carefully evaluate SSM. It is inevitable that SSM will replace spindles, but probably not as fast as I think it will. Spindle drives will have greater capacity and that will be important for the short term.
Stephen Foskett
Wed May 28 9:59am
It sounds as if you agree with me, actually. Flash in general is unlikely to "take over" from disk in a broad sense for another few technology generations at least, and may not ever due to inherently higher cost. Sure, there are times and instances that a smaller capacity of flash is sufficient and price sensitivity isn't there, but this will not describe the bulk of the market for a long time to come.

I guess I'm more skeptical about your bottom-up argument for the simple reason that consumers at the bottom tend to be far more price sensitive than those at the top. And even as more vendors are rushing in, the price differential remains massive. Note that the latest announcement from Samsung did not include price... I simply don't see a time in the near future when consumers and small businesses, who currently snap up laptops by the handful at $500 to $1000, will be willing to add 50% or more just to have flash.

Note too that not all flash is equal. Enterprise flash is a very different animal from consumer flash - MLC vs. SLC, added logic and integration, lots of DRAM cache for optimization, etc.

Of all the anti-NAND arguments made, the one that doesn't hold water (any more) is reliability. Math is on our side here - at 100 GB, a well-engineered device simply can't write fast enough to wear out in under a few years.
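
A rough back-of-the-envelope calculation shows why. The figures below are assumptions for the sake of argument - SLC-class endurance of about 100,000 program/erase cycles, perfect wear leveling, and no write amplification - not the specification of any particular drive.

```python
capacity_gb = 100                 # drive capacity from the point above
pe_cycles = 100_000               # assumed SLC-class endurance per cell
write_rate_mb_s = 100             # assumed sustained host write rate, 24/7

total_writable_tb = capacity_gb * pe_cycles / 1000             # ~10,000 TB
writes_per_year_tb = write_rate_mb_s * 3600 * 24 * 365 / 1e6   # ~3,150 TB/yr
years_to_wear_out = total_writable_tb / writes_per_year_tb

print(f"Total writable data: {total_writable_tb:,.0f} TB")
print(f"Years of continuous writes to wear out: {years_to_wear_out:.1f}")
# Roughly three years of non-stop 100 MB/s writes -- far beyond real workloads.
```

Real devices lose some of that headroom to write amplification and gain some back from spare area, but the order of magnitude is what matters here.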

iSCSI vs Fibre Channel

For brand-new SAN implementations, iSCSI appears to be not only the likely choice but the superior one for most IT managers.  Although Fibre Channel retains some technical advantages, in most cases these appear unlikely to justify its higher costs compared to iSCSI.

Security concerns about iSCSI are largely unjustified and stem from misunderstandings about the architecture.  Contrary to some suggestions, iSCSI does not carry most of the types of security risks associated with remotely hosted and remotely accessed internet applications.

Also, iSCSI in many cases offers performance comparable to Fibre Channel because the performance bottleneck in most applications is not the data transfer speed of the fabric but disk I/O.  Thus, even though Fibre Channel is theoretically about four times faster than gigabit iSCSI, applications are often constrained by disk I/O and will not benefit from Fibre Channel's theoretical bandwidth advantage.
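
A quick sanity check illustrates the point. The workload and hardware figures below are generic assumptions (a modest set of 15K RPM spindles serving small random I/O), not measurements of any particular product.

```python
# Assumed workload and hardware figures -- illustrative only.
disks = 12                    # spindles behind the LUN
iops_per_disk = 180           # typical 15K RPM drive, small random I/O
io_size_kb = 8                # e.g. a database page size

array_mb_s = disks * iops_per_disk * io_size_kb / 1024    # ~17 MB/s
iscsi_link_mb_s = 1_000 / 8 * 0.9      # 1 GbE at ~90% efficiency -> ~112 MB/s
fc_link_mb_s = 4_000 / 8 * 0.9         # 4 Gb FC, same efficiency -> ~450 MB/s

print(f"Random-I/O throughput the disks can deliver: {array_mb_s:.0f} MB/s")
print(f"iSCSI (1 GbE) link ceiling: {iscsi_link_mb_s:.0f} MB/s")
print(f"Fibre Channel (4 Gb) link ceiling: {fc_link_mb_s:.0f} MB/s")
# For this kind of workload the spindles saturate long before either link does,
# so the faster fabric buys nothing. Large sequential streams are the exception.
```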

Another factor in "betting on" iSCSI's future is that hybrid implementations are possible, so companies can migrate from FC to iSCSI gradually, without disruption or undue risk.

Although some IT managers may stick to their guns and continue with theoretically faster but much more expensive Fibre Channel configurations, over time it seems that iSCSI will "win" and - at least for the short term - become the dominant approach, until other SAN methods bring a superior price/performance ratio.


Article: The Storage Market Doesn’t Innovate, it Mimics Existing Innovations.


One of the things I love about the storage industry is the continual emergence of innovative technology.  New features in our market can go from inception to implementation in as little as 18 months.  Seeing this continuous hyper-evolution of storage features, it has occurred to me that the storage industry doesn't really innovate at all; it mimics existing innovations. No, not the technology itself, but the evolution of the technology.  Let me explain.


Remember back in 1997, when 10-megabit Ethernet switching was the new kid on the block?  The ISP I worked at finally dumped our antiquated hubs and bought a shiny new USRobotics 10-megabit switch for our core network.  Soon after that came 100-megabit hubs, then 100-megabit switches, then core switches with backplanes, and so on.


Then something special happened.  Switches weren’t just switches anymore.  Now they had integrated routers, packet filters, intrusion detection, cache engines and load balancers.  Switches were becoming intelligent.  These days, the network switch does a lot more than simply network things together. 


Compare that to the evolution of Fibre Channel switches over the last 8 years.  See any similarities?


Along those same lines, aren't zones basically VLANs?  Aren't WWNs just MAC addresses?  Yet FC doesn't seem to have many of the problems that TCP/IP and Ethernet have: as a single combined protocol stack, it eliminates the need to assign IP addresses, set up DHCP servers, tweak MTU sizes, or worry about dropped packets.


For management protocols, look at how similar SNMP is to SMI-S.  For certifications, there are CISSP and SNIA.


Another great example is de-duplication.  I remember first seeing this technology about 15 years ago, when it was called compression.


On a side note, I always secretly smile when I hear the term “De-dupe.”  De-dupe is a great marketing term.  I doubt it would be so popular if it were called compression2.


The only problem with mimicking another industry is that there doesn't tend to be a whole lot of natural innovation.  For example, some features that have long been desired still aren't quite there.  Thin provisioning comes to mind – what took so long?


Another is ILM. 


ILM is a great concept on paper, but not really practical.  Currently it is just too difficult to move data from expensive disk to cheap disk and vice versa.  And now that solid state is becoming affordable, customers are going to want to move data around their environment easily and online, without having to engage project managers and data migration appliances.  This is changing, of course: there are some great LUN virtualization products maturing, and I suspect that in 12 months I will be singing a different tune on this topic.


Even though there is an intelligent evolution happening in the storage market, there is still a fair amount of stumbling. For example, the constant churn about iSCSI vs. FCoE vs. FC vs. InfiniBand is getting a little old.  I see this topic in industry ‘expert’ blogs 100 times more often than I hear about it from actual customers with money.  I don’t think people realize that until there is something deployable in a customer environment, all this discussion is just academic.  Let the companies innovate, the customers purchase, and the market decide.  Worry not, Chicken Little: FC is not going away anytime soon.


I am happy that my industry is trying to evolve in a more efficient way than other industries.  Fast emerging technologies keep us early-implementers employed.  It will be interesting to see what is innovated, er, mimicked next.

Stephen Foskett
Fri May 30 1:36pm
Good points, Chip! Nice to hear a little reality injected into our world... Most customers have no opinion about FCoE versus InfiniBand simply because they've never heard of them. iSCSI and even NAS are new concepts to lots of the people I talk to!

But then again, this is the "future of storage", not the past...

SAN for video is poised to become a key market.  Innovation in this space will affect data compression, content delivery, and other major storage and file sharing concerns.  

Here's a brief review of four popular integrated solutions for high performance video editing as well as other large file sharing applications such as broadcast media, film, and medical imagery:

Global SAN: http://www.studionetworksolutions.com/products/product.php?pci=10

Gives workgroups the ability to edit audio/video projects and other large files directly on RAID-protected storage over a high-throughput iSCSI/IP SAN.  Advantage: avoids the expense of Fibre Channel storage, cards, and switches.

Avid Unity: http://www.avid.com/products/unitymedianetwork/

Avid's experience in video editing is reflected in its large number of high-end installs worldwide.  Avid Unity's advantages are support for collaborative HD workflows, huge storage capacity of up to 40 TB, and support for both Windows and Macintosh.  Another Avid SAN advantage is this active online forum community:
http://community.avid.com/forums/p/13609/76665.aspx

Apple Xsan 2: http://www.apple.com/xsan/
At $999, Apple's Xsan 2 offers low cost and high performance. Key features of Xsan 2 for the Macintosh are powerful collaboration tools and the pooling of storage resources to increase capacity.  Also, Apple's Spotlight search feature allows users to search *within* video files for content relevant to their query.  This type of video search will be increasingly important to SAN environments as the number of video files skyrockets.

Metasan: http://www.rorke.com/av/metasan.cfm
metaSAN is designed for high-bandwidth workgroup environments such as film editing and healthcare.  It allows simultaneous access to groups of files such as video clips, satellite imagery, medical data, and CAD files.

We Need a Storage Revolution

Many discussions in The Future of Storage have focused on the relative merits of one protocol or another, but I have been pleased to see a few touch on the core issue at hand:  We continue to patch together a system based on outdated concepts.  Most storage protocols continue to mimic direct-attached storage, and most of our so-called networks act as point-to-point channels.  An ultra-modern virtualized storage infrastructure with all the latest bells and whistles still holds the concepts of block and file at its core.  Whenever the storage industry has tried to bring about real storage management, it has been stymied by a lack of context for data.  No amount of virtualization, and no new protocol, will fix this.  Put simply, we need a storage revolution.

Channels, Blocks, and Files

Most innovation in the 1980s and early 1990s focused on moving storage out of the server.  SCSI allowed disk to exist in a separate cabinet, RAID allowed multiple physical disks to become a virtual one, and these were mixed to become the prototype storage array.  Although SCSI allowed one-to-many connectivity, it was never a true peer-to-peer network, even once it was mixed with network concepts in the form of Fibre Channel.

Even today, SAN storage is focused on providing faster, more flexible, and feature-packed direct-attached storage.  A modern virtual SAN hides a complex arrangement of caching, data protection, tiered storage, replication, and deduplication, masquerading the lot as a simple, lowly disk drive.  It is sad but true that all of our work as an industry has been dedicated to recreating what we started with.

Networked file-based storage is no better.  Although NAS devices have all the advanced features of their SAN cousins, they must present a simple file tree to the host to retain compatibility.  File virtualization merely presents a larger homogeneous tree.

Inside the server, too, features and complexity are hidden to retain a familiar file system format.  Volume managers can do anything a virtualization device can, but must present their output as a simple (though virtual) disk drive.  File systems, too, have added features but still present a familiar tree of mount points, inodes, and files.  Even ZFS, possibly the most advanced combination of volume management and file system technology yet, must present a simple tree of storage to applications.

The Metadata Roadblock

This outdated paradigm, of disks and file trees, is ill-suited to today’s storage challenges.  Data must be categorized so actions can be taken to preserve or destroy it based on policies.  Data must be searchable so users and applications can find what they want.  Data must be flexible so it can be used in new ways.  Our antiquated notions are not capable of meeting these challenges.

One simple problem is that we lack context for our data.  Most file systems merely assign to a file a name, location, owner, and security attributes.  The most advanced can contain extended metadata, but this is rarely seen in practice since many applications cannot agree on how to use this data.  Microsoft’s Office suite can store and share extended file attributes, for example, but these live inside the file rather than in the file system.  The promise of expanded Office attributes is only realized in conjunction with a content management system like SharePoint which lies above the lowly file system.

What if the storage system could keep this data instead?  What if it could logically group files by project or client, mine keywords and authors, and maintain revisions?  These concepts are not new: they have been implemented in content management systems for years, and certain elements have appeared in file systems, such as Apple's HFS and VMS's Files-11, for decades.
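
As a small, concrete illustration of file-system-level metadata, the sketch below uses Linux extended attributes (xattrs) to tag a file with project and client context outside the file's contents.  It is only a hint of one existing mechanism, not the context-aware storage system argued for here, and the path and attribute names are invented.

```python
import os

path = "/srv/projects/q2-forecast.xlsx"   # hypothetical file

# Attach context to the file in the file system, not inside the document.
# (Requires a Linux file system with user xattrs enabled, e.g. ext4 or XFS.)
os.setxattr(path, "user.project", b"Q2 Forecast")
os.setxattr(path, "user.client", b"Acme Corp")
os.setxattr(path, "user.revision", b"7")

# Any application -- or a storage management policy engine -- can read it back.
for attr in os.listxattr(path):
    print(attr, "=", os.getxattr(path, attr).decode())
```

A real implementation would index these attributes so they are searchable across the whole store, which is exactly where most file systems stop short today.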

Cut Down the Tree

File metadata would allow advanced features, but truly taking advantage of them requires a more fundamental shift in the way applications access files.  Rather than sticking to a traditional hierarchy of directories in a tree (which was, after all, simply a primitive metadata system), we should remove the tree altogether.  Allow files to become data objects, identified by arbitrary attributes and managed according to an overarching policy.

This future vision is decidedly different from our current notion of storage, but is not so far off.  Many organizations now rely on central data warehouses based on SQL-language relational databases.  As many storage managers have grumbled, databases tend to ignore storage management concepts entirely, managing their own content independently.

But not all applications need a database back-end, so another initiative seeks to provide generic object storage for wider use.  Called content-addressable storage or CAS, these devices have traditionally been used only for archival purposes, since that was their first market application.  As vendors break free of proprietary interfaces in favor of open ones like XAM, CAS could transform storage itself by eliminating both file and block storage at once.
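
The core idea behind CAS can be sketched in a few lines: an object's identity is derived from its content (a SHA-256 hash here), and arbitrary metadata rides alongside it in place of a path in a tree.  This toy in-memory store is purely illustrative and does not reflect XAM or any vendor's interface.

```python
import hashlib

class ToyObjectStore:
    """Content-addressed objects with attribute-based lookup instead of paths."""

    def __init__(self):
        self.objects = {}    # content address -> (data, metadata)

    def put(self, data: bytes, **metadata) -> str:
        address = hashlib.sha256(data).hexdigest()   # identity comes from content
        self.objects[address] = (data, metadata)
        return address

    def get(self, address: str) -> bytes:
        return self.objects[address][0]

    def find(self, **criteria):
        """Return addresses whose metadata matches all given attributes."""
        return [addr for addr, (_, meta) in self.objects.items()
                if all(meta.get(k) == v for k, v in criteria.items())]

store = ToyObjectStore()
addr = store.put(b"quarterly results...", project="Q2 Forecast", client="Acme Corp")
print(addr[:16], store.find(client="Acme Corp") == [addr])
```

Because identical content hashes to the same address, deduplication essentially falls out of the model for free - one reason archive products adopted the approach first.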

Similar concepts are already at work in the so-called Web 2.0 world.  Non-traditional data stores like Google BigTable, Amazon S3, and Hadoop allow massive scalability for object storage.  The APIs that many Web 2.0 companies share can be seen as prototypes of similar object storage frameworks.  Any of these could be leveraged to provide a new world of data storage, and many are gaining traction even now.

Although traditional block storage is here to stay for disk drives, and tree-type file systems are likely to remain the foundation of operating system storage, new object-based concepts could change the world in fundamental ways.  As applications become “web aware”, they also become object aware, increasing the likelihood of such a storage revolution.  For the majority of applications, this new world would be a welcome one indeed.

A Common Fear Of Going Thin - What If I Run Out Of Space?

If you're like me, your first thought after discovering thin provisioning was something like: "Hmm, that sounds like my SAN space is getting out of control." Nobody likes the idea of losing a precise overview of their systems. And that's what thin provisioning does - while the array has promised you 100TB of space (on your command, of course), it actually has only six 1TB drives inside. What if I fill them up? I decided to test it for you and show what happens to a Windows server when a thin-provisioned LUN runs out of physically available space.
  1. I configured the storage system so that it had only about 2 GB of free space, then created a 10GB thin-provisioned LUN.
  2. The LUN was attached to a Windows Server 2003 host and formatted as an NTFS volume.
  3. I started copying approximately 3 gigabytes of large files to the volume. When the amount written crossed the 2GB limit, the system stopped responding.
  4. It remained alive but refused to perform any task. I left it in that state for about an hour. [screenshot]
  5. As a last step, I freed up some space by deleting another LUN from the array. Almost immediately, Windows finished the copy operation as if there had been no interruption. [screenshot]
What's the result of the test? Well, the Windows behaviour was strange; apparently the system was not prepared for a situation where a drive remains online and claims to have free space but doesn't accept more writes. On the other hand, it didn't fall into a BSOD, nor did it close any application. It just slept. I can't say this will never hurt your application, and I'm not saying you shouldn't monitor your real capacity. I just wanted to test what happens. Knowing what happens may help you overcome this common fear of thin provisioning.

I love thin provisioning. I've taught a few customers to love it too. They no longer have to calculate their storage space requirements for a horizon of tens of months; they just purchase as much storage as they need now. Yet they still configure their volumes many times bigger than they currently need, without having to pay for the white space. The average oversubscription I'm seeing is around 2-3 times physical capacity. The fear of losing precise space control is gone!
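
Since the whole trick is keeping an eye on real capacity, here is a minimal sketch of the kind of check I mean: compute the oversubscription ratio and warn before the physical pool fills. The numbers and the 80% threshold are illustrative, and a real script would pull these figures from the array's management interface rather than hard-coding them.

```python
def thin_provisioning_report(physical_tb, used_tb, provisioned_lun_sizes_tb,
                             warn_at=0.80):
    """Summarize a thin-provisioned pool and flag low physical free space."""
    provisioned_tb = sum(provisioned_lun_sizes_tb)
    oversubscription = provisioned_tb / physical_tb
    utilization = used_tb / physical_tb

    print(f"Provisioned {provisioned_tb:.0f} TB on {physical_tb:.0f} TB physical "
          f"(oversubscription {oversubscription:.1f}x)")
    print(f"Physical utilization: {utilization:.0%}")
    if utilization >= warn_at:
        print("WARNING: buy or reclaim capacity before the pool fills up!")

# Example: six 1 TB drives backing LUNs that promise far more.
thin_provisioning_report(physical_tb=6, used_tb=5,
                         provisioned_lun_sizes_tb=[10, 4, 2])
```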

SAN vs cloud storage - a gray or silver lining?

There will clearly be a large SAN market for secure enterprise computing for years to come, but cloud computing, in my view, is likely to become the standard for an increasing number of businesses and applications.  In an excellent Techworld article about the effect of cloud computing on the future of storage, Chris Mellor observes:

Google does not use a storage area network (SAN). It has no world-wide network-attached storage (NAS) infrastructure. Instead it uses thousands of Linux servers with cheap disks - direct-attached storage (DAS) - and organises their contents inside its own Google File System (GFS).

Cloud computing storage is the antithesis of traditional SAN and NAS storage. The good news is that relatively few organisations will have the size needed to build out cloud computing infrastructures. The bad news for SAN and NAS storage vendors is that they could be so incredibly massive as to trigger a significant migration of their customers to using storage-as-a-service on the massive clouds provided by Google, Amazon and others.

His last point raises the key issue - will the growing number and size of cheap cloud storage options destroy the SAN market by absorbing customers who would otherwise be setting up their own storage networks?  I think the answer is no, but the cloud will put several constraints on the growth of the SAN market.  The key constraint will be the cost of SAN vs cloud storage.  Where SAN costs run in the neighborhood of $20 per gigabyte, the (internal) cost of cloud storage at Google is reported to be about $1 per gigabyte, and at Amazon S3 the cost is about $1.80 per gigabyte per year.  Unless SAN makes some incredible price breakthroughs fast, we'll see a tendency in the direction of ... cheap storage options.  The exception will be legacy users who will not want to abandon existing high-performance architectures for the untested waters of large-scale cloud storage, and applications where security is a key concern, such as banking and medical file storage and sharing.  This is a huge and growing market and should keep SAN a profitable business for some time.
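
Using the figures quoted above - roughly $20 per gigabyte for SAN capacity and about $1.80 per gigabyte per year for Amazon S3 - a three-year comparison for 10 TB looks like this. It deliberately ignores bandwidth, administration, performance, and security differences, so treat it as an order-of-magnitude sketch only.

```python
data_gb = 10 * 1024            # 10 TB of data
years = 3

san_capex = data_gb * 20                  # ~$20/GB up front (figure from this post)
s3_cost = data_gb * 1.80 * years          # ~$1.80/GB/year (figure from this post)

print(f"SAN capacity cost (one-time): ${san_capex:,.0f}")
print(f"Amazon S3 over {years} years:  ${s3_cost:,.0f}")
# ~$205,000 versus ~$55,000 -- the gap is why cost-sensitive workloads drift
# toward the cloud, even before counting power, cooling, and admin time.
```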

In the small and mid-sized business markets we will see more of what is happening right now - a lot of storage moving into the cloud along with applications and services.  Even Microsoft - a key beneficiary of the small business server market - has begun to push cloud computing while continuing to promote the "small business server," which I think is doomed to a slow death as small businesses seek to cut costs, minimize downtime, and avoid service calls and contracts.

Further constraints on the growth of the global SAN market are the increasing reliance on remote enterprise computing by globally distributed businesses, and mobile computing.  Businesses increasingly need their workforce to be connected 24/7 via low-capacity mobile devices that still require access to extensive amounts of data.  For many companies a cloud storage solution would be much cheaper and easier to manage than, for example, a VPN into an enterprise SAN.

Relevant Linkage:

Chris Mellor, Techworld:
http://www.techworld.com/storage/features/index.cfm?featureid=3893

Jon Stokes at Future of Storage:
http://thefutureofstorage.com/archives/45

Structure 08 Cloud Computing Conference:
http://events.gigaom.com/structure/08/


This post is simply a collection of links to articles that were very relevant to this Storage Area Networks discussion. 
OK to use this without counting it as an "insight".

SAN Encryption - a collaboration between Cisco and EMC.  More on this SAN Encryption project.

Storage area network basics every SQL Server DBA must know
Denny Cherry  (MySpace.com SQL and SAN Expert)
05.29.2008

A tour of the CNBC SAN Environment on YouTube via Computerworld:
Video Clip of CNBC SAN

SAN vs NAS Video  by CyGem.

Second Life's IBM SAN Structure - a virtual conference session in Second Life.

More YouTube Clips about Storage Area Networks

Netapp blogs - Netapp is a key SAN solutions provider, and most of their top staff maintain blogs.  The blogs are obviously "Netapp centric" but offer good insight into some aspects of the market.  I would say they are far too harsh about cloud computing, probably because it's going to make it harder for them to thrive.

From NetworkWorld:

Battling power costs with a sleeping SAN
A Naval systems center devises power-saving SAN architecture to drive down electricity costs by 40%.

The rise of iSCSI
Is Fibre Channel dead?

Classified savings
Watch the savings roll in when you rework your storage tiers.

See the whole SAN
New tools provide app-level view of SAN performance.

SAN action plan
Sophisticated change-management tools are a must.

SAN Fine-tuning
Smart ways to keep enterprise storage performing at its best.

Pet SAN project
Vet school operates with storage virtualization.

Onaro - mapping storage networks

Onaro bought by Netapp