| Author |
Message |
Russ Kaufmann [MCT]
Guest
|
Posted:
Thu Dec 23, 2004 1:21 am Post subject:
Re: Need a good hardware cluster solution |
|
|
"Clayton Sutton" <none@none.com> wrote in message
news:%23W6i97E6EHA.1120@TK2MSFTNGP11.phx.gbl...
| Quote: | Thanks for the input Scott,
What do you mean by: "If you build fault tolerance into your NLB/server
cluster design"? Using a product like... (what?)
|
As Rich pointed out, we can do many things to improve uptime through fault
tolerance. For example, your servers can have dual power supplies, RAID
configurations for hard drives, dual connections to dual fiber switches,
mirrored RAM, and so on. You can then take the fault tolerance within a
server to the next level by having multiple servers in either an NLB cluster
or a server cluster. Which technology you choose will most likely depend on
the application requirements.
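As a rough illustration of why adding redundant servers raises uptime, the relationship can be sketched in a few lines of Python. This is a sketch under assumptions, not anything measured: the function names and the 99% figure are made up for the example, and it assumes the servers fail independently of one another, which shared components (storage, power, network) can easily violate.

```python
def parallel_availability(single: float, count: int) -> float:
    """Availability of `count` redundant units when any one can carry the load.

    Assumes failures are independent, which real systems only approximate.
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be between 0 and 1")
    if count < 1:
        raise ValueError("count must be at least 1")
    # The group is down only when every unit is down at the same time.
    return 1.0 - (1.0 - single) ** count


def series_availability(*parts: float) -> float:
    """Availability of a chain in which every part must be up."""
    result = 1.0
    for part in parts:
        result *= part
    return result


if __name__ == "__main__":
    # Hypothetical figure, chosen only for illustration.
    one_server = 0.99
    two_servers = parallel_availability(one_server, 2)
    print(f"one server:  {one_server:.4%}")
    print(f"two servers: {two_servers:.4%}")
```

With these made-up inputs, two independent 99% servers come out at 99.99% combined. The series function shows the opposite effect: every non-redundant dependency in the chain pulls the total down.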
From my years of working with lockstep products like Marathon, I really
haven't seen much improvement in application uptime over a Microsoft
cluster. When you start getting to 99.99%, you can rest pretty easy.
|
|
|
 |
Russ Kaufmann [MCT]
Guest
|
Posted:
Thu Dec 23, 2004 1:47 am Post subject:
Re: Need a good hardware cluster solution |
|
|
"Scott Schnoll [MSFT]" <scschnol@online.microsoft.com> wrote in message
news:eMzgCsF6EHA.2512@TK2MSFTNGP09.phx.gbl...
| Quote: | You may want to clarify that then, as the definition is not accurate for
NLB or server clusters.
|
I think I just did, but thanks, I will expand here. Just to be clear, this
is not a Microsoft take on the world of HA.
Clayton, "continuous availability" in the strictest sense of the term is not
necessarily what I meant, but I have found that the term resonates with many
people and helps them begin to understand what HA means.
Don't worry that Microsoft uses the terms a little differently; if you
decide to do some googling, you will find the terms are often used
interchangeably by many major players in the industry. For example, one
definition states:
"In information technology, high availability refers to a system or
component that is continuously operational for a desirably long length of
time. Availability can be measured relative to "100% operational" or "never
failing." A widely-held but difficult-to-achieve standard of availability
for a system or product is known as "five 9s" (99.999 percent)
availability."
Source:
http://searchcio.techtarget.com/sDefinition/0,,sid19_gci761219,00.html
Obviously, "never failing" just isn't possible over extremely long periods.
We all need to understand that HA includes not just the design of the
hardware/software solution, but also the backup/restore solution and
failover processing. Some experts will also contend that a true HA
environment includes a development, test, and production migration process
with in-depth documentation. There is much to achieving HA; however, it
simply comes down to application availability through processes, software,
and hardware implementations.
If you use NLB to provide application availability to your users over the
Internet for your web-based app, then that is fantastic. It helps keep the
application available to your users. The same can be said for server
clustering; however, you need to take into account the non-availability
during the actual failover of your application. Sometimes it is a matter of
seconds; in other cases it can be several minutes. In all cases, a
clustering solution will significantly drive down non-availability and
increase the uptime of your application as run on your servers.
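To see how failover time eats into availability, here is a small Python sketch. The function name and the sample numbers (four failovers a year at five minutes each) are hypothetical, picked only to show the arithmetic. It also assumes a 365-day year and that failovers are the only source of downtime.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def availability_with_failovers(failovers_per_year: int,
                                minutes_per_failover: float) -> float:
    """Return availability as a fraction, counting only failover outages."""
    if failovers_per_year < 0 or minutes_per_failover < 0:
        raise ValueError("inputs must not be negative")
    downtime = failovers_per_year * minutes_per_failover
    if downtime > MINUTES_PER_YEAR:
        raise ValueError("downtime exceeds one year")
    return 1.0 - downtime / MINUTES_PER_YEAR


if __name__ == "__main__":
    # Hypothetical workload: four failovers a year, five minutes each.
    availability = availability_with_failovers(4, 5.0)
    print(f"{availability:.5%}")
```

Four five-minute failovers add up to 20 minutes a year, which still leaves the application above 99.99%. Failovers measured in seconds barely register.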
Many experts state that, for any application or system to be highly
available, the parts need to be designed around availability and the
individual parts need to be tested before being put into production. As an
example, if you are using third-party products with your Exchange environment
that have not been properly tested, you may find that they are a weak link
that results in loss of availability. Implementing an Exchange server
cluster will not necessarily result in HA. |
|
|
 |
Scott Schnoll [MSFT]
Guest
|
Posted:
Thu Dec 23, 2004 1:51 am Post subject:
Re: Need a good hardware cluster solution |
|
|
It's not a product. It's all in your design: things like RAID arrays, SMP
processors, ECC memory, redundant power supplies, redundant networks, etc.
You can have an HA solution with NLB and server clusters. As an example,
have a look at the Exchange Server 2003 High Availability Guide at
http://www.microsoft.com/technet/prodtechnol/exchange/2003/library/highavailgde.mspx.
--
Scott Schnoll
This posting is provided "AS IS" with no warranties, and confers no
rights. Please do not send email directly to this alias. This alias is for
newsgroup purposes only.
"Clayton Sutton" <none@none.com> wrote in message
news:%23W6i97E6EHA.1120@TK2MSFTNGP11.phx.gbl...
| Quote: | Thanks for the input Scott,
What do you mean by: "If you build fault tolerance into your NLB/server
cluster design"? Using a product like... (what?)
Clayton
"Scott Schnoll [MSFT]" <scschnol@online.microsoft.com> wrote in message
news:uKm2LKE6EHA.2540@TK2MSFTNGP09.phx.gbl...
I believe 'continuously available' is Russ' definition (after all, they
were his words <g>). We do not consider clustering to be a continuously
available solution (or what might be described as a fault tolerant
solution). Clustering provides high availability, which is a level of
availability approaching 100%.
Both NLB and the Windows Cluster Service provide high availability. If you
build fault tolerance into your NLB/server cluster design, you can achieve
very high uptime (99.99% or more).
--
Scott Schnoll
"Rich Matheisen [MVP]" <richnews@rmcons.com.NOSPAM.COM> wrote in message
news:npkhs0tjnnk2iarqrdj9amd18a932r21nv@4ax.com...
"Russ Kaufmann [MCT]" <russ@exchangemct.nospam.com> wrote:
"Clayton Sutton" <none@none.com> wrote in message
news:Oh$3Hmu5EHA.1204@TK2MSFTNGP10.phx.gbl...
Well, while I admit I don't know a lot about clustering, I thought that if I
used NLB I didn't get "High Availability" (failover). I thought it just did
"Load Balancing", so if one server went down then you were down. Then what's
the difference between NLB and "High Availability"? Why can you use Windows
2003 Standard for one and you "HAVE" to have Enterprise + for the other?
High availability is a term that is often confusing for many people.
Basically, a high availability solution is continuously available despite
the failure of individual components and even the failure of complete
systems. NLB Clustering and Server Clustering both provide high
availability.
And just to show how variable that definition may be, "continuously
available" refers to "non-stop computing" and MS clusters certainly aren't
that.
Specialized hardware is needed to survive failures of "individual
components" (disk controllers, motherboards, CPUs, etc.). Think of machines
like Stratus and hardware schemes like Marathon Technologies. This type of
hardware is "fault tolerant", not "fault resistant" (which is what MS
clusters are). Fault tolerant hardware "fails out" a component. MS clusters
"fail over". That's a big difference if you really need continuity of
service.
MS clusters have a single point of failure: the share-nothing disks. Lose
one of those and you might as well be running on a single machine.
The "continuously available" claim falls apart when it comes to the time it
takes to "fail over" a node of an MS cluster.
--
Rich Matheisen
MCSE+I, Exchange MVP
MS Exchange FAQ at http://www.swinc.com/resource/exch_faq.htm
|
|
|
|
 |
Scott Schnoll [MSFT]
Guest
|
Posted:
Thu Dec 23, 2004 2:01 am Post subject:
Re: Need a good hardware cluster solution |
|
|
Comments inline...
--
Scott Schnoll
"Russ Kaufmann [MCT]" <russ@exchangemct.nospam.com> wrote in message
news:ObYu66F6EHA.3124@TK2MSFTNGP11.phx.gbl...
| Quote: | "Scott Schnoll [MSFT]" <scschnol@online.microsoft.com> wrote in message
news:eMzgCsF6EHA.2512@TK2MSFTNGP09.phx.gbl...
You may want to clarify that then, as the definition is not accurate for
NLB or server clusters.
I think I just did, but thanks, I will expand here. Just to be clear, this
is not a Microsoft take on the world of HA.
Clayton, "continuous availability" is not necessarily what I meant when it
comes to the truest definition of the term, but it is a term that I have
found resonates with many people and they begin to understand what HA
means.
|
Please don't confuse these terms. We go to great lengths in our content and
documentation to clearly state what you get from our products and
technologies. We do not have any products or technologies that provide
continuous availability. Whether it resonates or not, if you are using the
term 'continuous availability' then you are actually creating confusion for
Microsoft customers interested in high availability.
| Quote: | Don't worry that Microsoft uses terms a little differently, but if you
decide to do some googling, you will find the terms are often used
interchangeably by many major players in the industry. For example, one
definition states:
"In information technology, high availability refers to a system or
component that is continuously operational for a desirably long length of
time. Availability can be measured relative to "100% operational" or
"never failing." A widely-held but difficult-to-achieve standard of
availability for a system or product is known as "five 9s" (99.999
percent) availability."
Source:
http://searchcio.techtarget.com/sDefinition/0,,sid19_gci761219,00.html
|
I personally disagree with this definition of high availability, and not just
because it is different from Microsoft's definition. Continuously available
is the term for a fault tolerant system. We are talking about high
availability. These are two different concepts which are often confused and
co-mingled, as is the case in this thread.
| Quote: | Obviously, "never failing" just isn't possible over extremely long
periods. We all need to understand that HA includes not just the design of
the hardware/software solution, but it also includes the backup/restore
solution, and it includes failover processing. Some experts will also
contend that a true HA environment includes a well documented development,
test, and production migration process with in-depth documentation. There
is much to achieving HA, however, it simply comes down to application
availability through processes, software, and hardware implementations.
If you use NLB to provide application availability to your users over the
Internet for your web based app, then that is fantastic. It helps keep the
application available to your users. The same can be said for server
clustering, however, you need to take into account the non-availability
during the actual failover of your application. Sometimes, it is a matter
of seconds, in other cases it can be several minutes. In all cases, a
clustering solution will significantly drive down non-availability and
increase the uptime of your application as run on your servers.
Many experts state that, for any application or system to be highly
available, the parts need to be designed around availability and the
individual parts need to be tested before being put into production. As an
example, if you are using 3rd party products with your Exchange
environment that have not been properly tested, you may find that they are
a weak link that results in loss of availability. Implementing an Exchange
server cluster will not necessarily result in HA.
|
When the messaging infrastructure is implemented properly and in accordance
with Microsoft best practices, an Exchange cluster can most certainly result
in HA. |
|
|
 |
Rodney R. Fournier [MVP]
Guest
|
Posted:
Thu Dec 23, 2004 2:13 am Post subject:
Re: Need a good hardware cluster solution |
|
|
I really didn't want to get involved, but please read Microsoft's
Achieving High Availability with Exchange Server at Microsoft at
http://download.microsoft.com/download/a/5/8/a58cbb94-06c6-4deb-8ca7-4eae5227f7ca/ExchangeHighAvailabilityTSB.doc
Great stuff. The article is about how Microsoft takes its five 9's
seriously. It talks about the No Excuse approach. HA should be way more than
Clustering or NLB, as the article states.
Just my two cents - not directed at Scott or Russ or Clayton, but the topic
in general.
Cheers,
Rod
MVP - Windows Server - Clustering
http://www.nw-america.com - Clustering
http://msmvps.com/clustering - Blog
|
|
|
 |
Russ Kaufmann [MCT]
Guest
|
Posted:
Thu Dec 23, 2004 2:59 am Post subject:
Re: Need a good hardware cluster solution |
|
|
"Scott Schnoll [MSFT]" <scschnol@online.microsoft.com> wrote in message
news:e$BLBEG6EHA.208@TK2MSFTNGP12.phx.gbl...
| Quote: | "Russ Kaufmann [MCT]" <russ@exchangemct.nospam.com> wrote in message
news:ObYu66F6EHA.3124@TK2MSFTNGP11.phx.gbl...
Clayton, "continuous availability" is not necessarily what I meant when
it comes to the truest definition of the term, but it is a term that I
have found resonates with many people and they begin to understand what
HA means.
Please don't confuse these terms. We go to great lengths in our content
and documentation to clearly state what you get from our products and
technologies. We do not have any products or technologies that provide
continuous availability.
|
I agree with that, and I did not state that Microsoft does. However, I did
provide a very common definition and clarified that it is not Microsoft's.
| Quote: | "In information technology, high availability refers to a system or
component that is continuously operational for a desirably long length of
time. Availability can be measured relative to "100% operational" or
"never failing." A widely-held but difficult-to-achieve standard of
availability for a system or product is known as "five 9s" (99.999
percent) availability."
Source:
http://searchcio.techtarget.com/sDefinition/0,,sid19_gci761219,00.html
I personally disagree with this definition of high availability and not
just because it is different from Microsoft's definition. Continuously
available is the term for a fault tolerant system.
|
Continuous availability (CA) is self-defining. In a CA solution there is
zero downtime. We all agree with that, right? However, I think everyone
also agrees that there really is no such thing as a CA solution when it
comes to extended lengths of time. Solutions include hardware and software,
both of which fail, and not every failure can be engineered so that it
causes no downtime. CA is a dream, but it is a nice goal.
To be clear, Microsoft server clustering results in minimal downtime
(failover time) and can be used along with other components to achieve an HA
solution. Microsoft NLB clustering also results in minimal (practically
zero) downtime. Both are components that can be used as part of an HA
solution.
Other components, such as hardware, can contain fault tolerant components
and still be part of an HA solution. Right? If you are trying to say that HA
cannot include fault tolerant components, then we will just have to agree
to disagree. HA comes from implementation of well-designed processes,
hardware, and software. You can't achieve HA solutions without properly
implementing all three. You can disagree with this, but I think I have
enough support in the industry to back me up here.
| Quote: | We are talking about high availability. These are two different concepts
which are often confused and co-mingled, as is the case in this thread.
|
You can disagree all you want. The definition for HA comes from the goal of
CA. Replace minimal downtime with no downtime, and HA becomes CA. Again, I
can't stress this enough, CA is not achievable over extended lengths of
time. The industry has several overlapping definitions because of the common
elements between CA and HA solutions. I agree with the ones that state that
HA includes well-designed processes, hardware, and software implementations
resulting in minimal downtime, thus HA. Continuous availability, as defined,
just isn't achievable. I use it to help people understand what HA is, though,
and it works for most people.
| Quote: | Obviously, "never failing" just isn't possible over extremely long
periods. We all need to understand that HA includes not just the design
of the hardware/software solution, but it also includes the
backup/restore solution, and it includes failover processing. Some
experts will also contend that a true HA environment includes a well
documented development, test, and production migration process with
in-depth documentation. There is much to achieving HA, however, it simply
comes down to application availability through processes, software, and
hardware implementations.
If you use NLB to provide application availability to your users over the
Internet for your web based app, then that is fantastic. It helps keep
the application available to your users. The same can be said for server
clustering, however, you need to take into account the non-availability
during the actual failover of your application. Sometimes, it is a matter
of seconds, in other cases it can be several minutes. In all cases, a
clustering solution will significantly drive down non-availability and
increase the uptime of your application as run on your servers.
Many experts state that, for any application or system to be highly
available, the parts need to be designed around availability and the
individual parts need to be tested before being put into production. As
an example, if you are using 3rd party products with your Exchange
environment that have not been properly tested, you may find that they
are a weak link that results in loss of availability. Implementing an
Exchange server cluster will not necessarily result in HA.
When the messaging infrastructure is implemented properly and in
accordance with Microsoft best practices, an Exchange cluster can most
certainly result in HA.
|
You obviously missed what I said, Scott. I said that if you include
third-party stuff that has not been properly tested or implemented, it may
result in loss of availability that clustering will not fix.
Can Exchange clustering result in HA? Absolutely. Can clustering Exchange
servers that include untested 3rd party code fix the problems? Absolutely
not. Clustering alone does not result in HA solutions. |
|
|
 |
Russ Kaufmann [MCT]
Guest
|
Posted:
Thu Dec 23, 2004 3:00 am Post subject:
Re: Need a good hardware cluster solution |
|
|
"Rodney R. Fournier [MVP]" <rod@die.spam.die.nw-america.com> wrote in
message news:ehX09KG6EHA.2804@TK2MSFTNGP15.phx.gbl...
I absolutely agree with you. HA takes way more than just implementing server
clustering or NLB clustering. However, they can be components of an HA
solution. |
|
|
 |
Scott Schnoll [MSFT]
Guest
|
Posted:
Thu Dec 23, 2004 3:23 am Post subject:
Re: Need a good hardware cluster solution |
|
|
Comments inline...
--
Scott Schnoll
"Russ Kaufmann [MCT]" <russ@exchangemct.nospam.com> wrote in message
news:eqSPNjG6EHA.2196@TK2MSFTNGP11.phx.gbl...
| Quote: | "Scott Schnoll [MSFT]" <scschnol@online.microsoft.com> wrote in message
news:e$BLBEG6EHA.208@TK2MSFTNGP12.phx.gbl...
"Russ Kaufmann [MCT]" <russ@exchangemct.nospam.com> wrote in message
news:ObYu66F6EHA.3124@TK2MSFTNGP11.phx.gbl...
Clayton, "continuous availability" is not necessarily what I meant when
it comes to the truest definition of the term, but it is a term that I
have found resonates with many people and they begin to understand what
HA means.
Please don't confuse these terms. We go to great lengths in our content
and documentation to clearly state what you get from our products and
technologies. We do not have any products or technologies that provide
continuous availability.
To which I agree and did not state that Microsoft does. However, I did
provide a very common definition and clarified it is not Microsoft's.
|
OK.
| Quote: | "In information technology, high availability refers to a system or
component that is continuously operational for a desirably long length
of time. Availability can be measured relative to "100% operational" or
"never failing." A widely-held but difficult-to-achieve standard of
availability for a system or product is known as "five 9s" (99.999
percent) availability."
Source:
http://searchcio.techtarget.com/sDefinition/0,,sid19_gci761219,00.html
I personally disagree with this definition of high availability and not
just because it is different from Microsoft's definition. Continuously
available is the term for a fault tolerant system.
Continuous availability (CA) is self defining. In a CA solution there is
zero down time. We all agree with that, right? However, I think everyone
also agrees that there really is no such thing as a CA solution when it
comes to extended lengths of time. Since solutions include hardware and
software and both fail and not all failures can be configured to not
result in downtime, CA is a dream, but it is a nice goal.
|
I don't know if I agree with that or not. Define down time.
| Quote: | To be clear, Microsoft server clustering results in minimal downtime
(failover time) and can be used along with other components to achieve an
HA solution. Microsoft NLB clustering also results in minimal (practically
zero) downtime. Both are components that can be used as part of an HA
solution.
Other components, such as hardware, can contain fault tolerant components
and still be part of an HA solution. Right? If you are trying to say that
HA can not include fault tolerant components, then we will just have to
agree to disagree. HA comes from implementation of well designed
processes, hardware, and software. You can't achieve HA solutions without
properly implementing all three. You can disagree with this, but I think I
have enough support in the industry to back me up here.
|
HA does not equal FT. Here are the definitions I use:
Fault tolerance is the ability of a system to continue functioning when part
of it fails (i.e., experiences a fault). This term is used to describe disk
subsystems (e.g., RAID), symmetric multiprocessing (SMP), redundant
power supplies (with separate power sources), uninterruptible power
supplies, redundant network adapters, etc. Fault tolerance is designed to
alleviate the problems caused by component failures, power outages, or other
similar occurrences.
High availability refers to system uptime that approaches 100%. For
example, an availability level of 99.999%, calculated on a round-the-clock
basis, would mean that an organization experiences no more than about five
minutes of unscheduled downtime per year. A level of 99.99% translates to
about 53 minutes of downtime, a level of 99.9% to about 8.8 hours, and a
level of 99% to about 3.7 days of downtime per year.
HA systems are often built with a fault-tolerant design using
fault-tolerant components.
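The downtime figures above follow directly from the percentages. A minimal Python sketch of the conversion, assuming a 365-day year measured around the clock (the function name is made up for this example):

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years ignored


def downtime_hours_per_year(availability_percent: float) -> float:
    """Hours of downtime per year allowed by an availability percentage."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability_percent must be between 0 and 100")
    return (1.0 - availability_percent / 100.0) * HOURS_PER_YEAR


if __name__ == "__main__":
    for level in (99.999, 99.99, 99.9, 99.0):
        hours = downtime_hours_per_year(level)
        print(f"{level}% -> {hours * 60:.1f} minutes ({hours:.2f} hours)")
```

Run as written, 99.999% works out to a little over five minutes a year and 99% to about 87.6 hours, or roughly 3.7 days.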
| Quote: | We are talking about high availability. These are two different concepts
which are often confused and co-mingled, as is the case in this thread.
You can disagree all you want. The definition for HA comes from the goal
of CA. Replace minimal downtime with no downtime, and HA becomes CA.
Again, I can't stress this enough, CA is not achievable over extended
lengths of time. The industry has several overlapping definitions because
of the common elements between CA and HA solutions. I agree with the ones
that state that HA includes well designed processes, hardware, and
software implementations resulting in minimal downtime, thus HA.
Continuous availability, as defined, just isn't achievable. I use it to
help understand what HA is, though, and it works for most people.
|
That is all great information, but it has nothing whatsoever to do with this
thread. Let's get back to Clayton's original question if we can:
"The company that I work for is wanting to move to a Windows 2003 Server and
Exchange 2003 clustered environment. I know that Windows 2003 (Standard)
will do a "Network Load Balancing" and the Enterprise Edition will do both
"Network Load Balancing" and "High Availability" clustering but not BOTH.
If you want to do BOTH "Network Load Balancing" and "High Availability" you
need a third party solution. That's what I'm looking for, anyone have any
ideas? Also, any white papers on Windows and Exchange clustering would be
great too. Thanks for any input."
Clayton is asking about NLB and HA for his Exchange cluster, and he is
asking it in a Microsoft newsgroup (albeit an unmanaged newsgroup). However
industry pundits define CA, HA or FT, the question here is specific to
Windows 2003 and Exchange 2003, and what can be achieved. I'm not against
dreaming about 100% or CA or whatever you want to call it, but I do get
concerned when I see a customer being confused by misleading terms and
incomplete information.
| Quote: | Obviously, "never failing" just isn't possible over extremely long
periods. We all need to understand that HA includes not just the design
of the hardware/software solution, but it also includes the
backup/restore solution, and it includes failover processing. Some
experts will also contend that a true HA environment includes a well
documented development, test, and production migration process with
in-depth documentation. There is much to achieving HA, however, it
simply comes down to application availability through processes,
software, and hardware implementations.
If you use NLB to provide application availability to your users over
the Internet for your web based app, then that is fantastic. It helps
keep the application available to your users. The same can be said for
server clustering, however, you need to take into account the
non-availability during the actual failover of your application.
Sometimes, it is a matter of seconds, in other cases it can be several
minutes. In all cases, a clustering solution will significantly drive
down non-availability and increase the uptime of your application as run
on your servers.
Many experts state that, for any application or system to be highly
available, the parts need to be designed around availability and the
individual parts need to be tested before being put into production. As
an example, if you are using 3rd party products with your Exchange
environment that have not been properly tested, you may find that they
are a weak link that results in loss of availability. Implementing an
Exchange server cluster will not necessarily result in HA.
When the messaging infrastructure is implemented properly and in
accordance with Microsoft best practices, an Exchange cluster can most
certainly result in HA.
You obviously missed what I said, Scott. I said that if you include 3rd
party stuff that has not been properly tested or implemented, it may result
in loss of availability that clustering will not fix.
|
No, I read that part. Don't let my absence of a comment lead you to believe
I missed information. What you have said is generically true about every
computer, clustered or not.
| Quote: | Can Exchange clustering result in HA? Absolutely. Can clustering Exchange
servers that include untested 3rd party code fix the problems? Absolutely
not. Clustering alone does not result in HA solutions.
|
I don't think anyone ever said anything contrary to these statements. |
|
Russ Kaufmann [MCT]
Guest
|
Posted:
Thu Dec 23, 2004 3:37 am Post subject:
Re: Need a good hardware cluster solution |
|
|
"Scott Schnoll [MSFT]" <scschnol@online.microsoft.com> wrote in message
news:uajB2xG6EHA.2788@TK2MSFTNGP15.phx.gbl...
| Quote: | Continuous availability (CA) is self defining. In a CA solution there is
zero down time. We all agree with that, right? However, I think everyone
also agrees that there really is no such thing as a CA solution when it
comes to extended lengths of time. Since solutions include hardware and
software and both fail and not all failures can be configured to not
result in downtime, CA is a dream, but it is a nice goal.
I don't know if I agree with that or not. Define down time.
|
Time when the application is not available. This can be a tricky definition,
though. For example, if the Exchange VS is failing over, it isn't available
at that time. However, one of the features that I love with Exchange Server
2003 and Outlook 2003 is the cached mode, so that the users do not experience
a loss of service in that they can respond to messages and write new
messages while this failover is happening. So, I don't know how to define
that time. <G>
| Quote: | HA does not equal FT. Here are the definitions I use:
Fault tolerance is the ability of a system to continue functioning when
part of it fails (e.g., experiences a fault). This term is used to
describe disk subsystems (e.g., RAID), symmetric multiple processors
(SMP), redundant power supplies (with separate power sources),
uninterruptible power supplies, redundant network adapters, etc. Fault
tolerance is designed to alleviate the problems caused by component
failures, power outages, or other like occurrences.
|
And yet, isn't FT (for hardware like the above) part of an HA solution? I
believe it is. Without it being part of the solution, the overall solution
would have greater non-availability.
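To put a rough number on that point, here is a minimal Python sketch of how redundant components change availability. It assumes failures are independent, which real hardware only approximates, and the 99% figures are made-up illustrative values, not measurements of any product. The function names are hypothetical.

```python
def parallel_availability(availability: float, copies: int) -> float:
    """Availability of N redundant copies where any one copy is enough.

    Assumes independent failures, which is an idealization.
    """
    return 1.0 - (1.0 - availability) ** copies


def series_availability(*availabilities: float) -> float:
    """Availability of components that must all be up at the same time."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


# Hypothetical figures for illustration only.
single = 0.99
redundant = parallel_availability(single, 2)

# A server needs power AND disks; compare with and without redundancy.
no_ft = series_availability(single, single)
with_ft = series_availability(redundant, redundant)

print(f"one component:        {single:.4%}")
print(f"redundant pair:       {redundant:.4%}")
print(f"server without FT:    {no_ft:.4%}")
print(f"server with FT parts: {with_ft:.4%}")
```

Under these assumptions the server built from redundant parts comes out well ahead of the one without them, which is the sense in which FT hardware feeds into an HA solution.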
| Quote: | High availability refers to a system uptime that approaches 100%. For
example, an availability level of 99.999%, calculated on a round-the-clock
basis, would mean that an organization would experience no more than about
five minutes of unscheduled downtime per year. A level of 99.99% translates
to about 52 minutes of downtime. A level of 99.9% translates to about 8.8
hours, and a level of 99% equals about 3.7 days of downtime per year.
|
Agreed. The magic 100% comes under CA. That is why they are both used in the
same discussions most of the time.
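The downtime figures in the quote above are simple arithmetic and can be checked with a few lines of Python. This is only a sketch: it assumes a 365-day year and round-the-clock measurement, and the function name is made up for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; ignores leap years


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Maximum downtime per year allowed by an availability percentage."""
    return MINUTES_PER_YEAR * (1.0 - availability_percent / 100.0)


for level in (99.0, 99.9, 99.99, 99.999):
    minutes = downtime_minutes_per_year(level)
    print(
        f"{level}% -> {minutes:.1f} min "
        f"({minutes / 60:.2f} h, {minutes / 1440:.2f} days)"
    )
```

Note that each figure is an upper bound on downtime for that level: "five nines" allows roughly 5.3 minutes per year, not a minimum of five minutes.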
| Quote: | HA systems are often built with a fault tolerance design using fault
tolerance components.
|
Absolutely, and I believe I had said that. That is why I don't equate FT with
CA only.
| Quote: | "The company that I work for is wanting to move to a Windows 2003 Server
and Exchange 2003 clustered environment. I know that Windows 2003 (Standard)
will do a "Network Load Balancing" and the Enterprise Edition will do both
"Network Load Balancing" and "High Availability" clustering but not BOTH.
If you want to do BOTH "Network Load Balancing" and "High Availability" you
need a third party solution. That's what I'm looking for, anyone have any
ideas? Also, any white papers on Windows and Exchange clustering would be
great too. Thanks for any input."
Clayton is asking about NLB and HA for his Exchange cluster, and he is
asking it in a Microsoft newsgroup (albeit an unmanaged newsgroup).
However industry pundits define CA, HA or FT, the question here is
specific to Windows 2003 and Exchange 2003, and what can be achieved. I'm
not against dreaming about 100% or CA or whatever you want to call it, but
I do get concerned when I see a customer being confused by misleading
terms and incomplete information.
|
I see your point, but I think this thread has helped make that clear not
just to Clayton, but to everyone else. This has been a very helpful thread. |
|
Bob Christian
Guest
|
Posted:
Thu Dec 23, 2004 6:19 am Post subject:
Re: Need a good hardware cluster solution |
|
|
Then there are such things as user-impacting downtime and non-user-impacting
downtime.
I had a manager that said to me once... zero impact on the user is our goal.
If it's down and it does not impact the user or generate a helpdesk ticket,
that does not impact the user. We must strive for the same thing internally
because the monitoring system is not a user and it will rat us out without a
guilty feeling.
Bob
|
|
Russ Kaufmann [MCT]
Guest
|
Posted:
Thu Dec 23, 2004 10:33 pm Post subject:
Re: Need a good hardware cluster solution |
|
|
"Bob Christian" <BobChristian@removethis.gmail.com> wrote in message
news:uSwzvUI6EHA.3756@TK2MSFTNGP14.phx.gbl...
| Quote: | Then there are such things as user-impacting downtime and
non-user-impacting downtime.
I had a manager that said to me once... zero impact on the user is our goal.
If it's down and it does not impact the user or generate a helpdesk ticket,
that does not impact the user. We must strive for the same thing internally
because the monitoring system is not a user and it will rat us out without a
guilty feeling.
|
Absolutely. The goal is to be able to provide the application despite the
failure of a component or a complete system.
Where I work, we have such incredibly high levels of automation that a server
cluster failover will actually generate several different tickets. A ticket
is generated for the failover of every resource, so even if you manually
move the resources to another node, management is made aware of the change.
You can't hide downtime from the automated tools, even if it doesn't impact
the user community. |
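The distinction between downtime the monitoring tools see and downtime the users feel can be made concrete with a small Python sketch. The outage log below is entirely hypothetical, as are the class and function names; the point is only that the same month produces two different availability figures depending on which outages are counted.

```python
from dataclasses import dataclass


@dataclass
class Outage:
    description: str
    minutes: float
    user_impacting: bool


def availability_percent(outage_minutes: float, period_minutes: float) -> float:
    """Availability over a period, given total minutes of outage."""
    return 100.0 * (1.0 - outage_minutes / period_minutes)


# Hypothetical outage log for a 30-day month.
PERIOD_MINUTES = 30 * 24 * 60
outages = [
    Outage("manual move of resources to another node", 2.0, False),
    Outage("unplanned failover, clients in cached mode", 3.0, False),
    Outage("store offline, clients could not send mail", 12.0, True),
]

total = sum(o.minutes for o in outages)
user_visible = sum(o.minutes for o in outages if o.user_impacting)

print(f"monitored availability:    {availability_percent(total, PERIOD_MINUTES):.4f}%")
print(f"user-visible availability: {availability_percent(user_visible, PERIOD_MINUTES):.4f}%")
```

Which of the two numbers gets reported is an organizational decision, not a technical one.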
|
Scott Schnoll [MSFT]
Guest
|
Posted:
Thu Dec 23, 2004 10:42 pm Post subject:
Re: Need a good hardware cluster solution |
|
|
Inline...
--
Scott Schnoll
This posting is provided "AS IS" with no warranties, and confers no
rights. Please do not send email directly to this alias. This alias is for
newsgroup purposes only.
"Russ Kaufmann [MCT]" <russ@exchangemct.nospam.com> wrote in message
news:%23z56tzQ6EHA.2196@TK2MSFTNGP11.phx.gbl...
| Quote: | "Bob Christian" <BobChristian@removethis.gmail.com> wrote in message
news:uSwzvUI6EHA.3756@TK2MSFTNGP14.phx.gbl...
Then there are such things as user-impacting downtime and
non-user-impacting downtime.
I had a manager that said to me once... zero impact on the user is our goal.
If it's down and it does not impact the user or generate a helpdesk ticket,
that does not impact the user. We must strive for the same thing internally
because the monitoring system is not a user and it will rat us out without a
guilty feeling.
Absolutely. The goal is to be able to provide the application despite the
failure of a component or a complete system.
|
Right, so what is the application in this case? Before you answer, it's
hypothetical. :-) This is the primary challenge we have with Exchange
because we really serve two distinct groups of users: (1) IT Pros, the folks
who manage servers, such as AD and Exchange; and (2) Information workers,
such as those who use Outlook or ActiveSync or some other Exchange client.
This is why each organization must determine for itself what downtime means
and how to measure and combat it.
| Quote: | Where I work, we have such incredibly high levels of automation that a
server cluster failover will actually generate several different tickets. A
ticket is generated for the failover of every resource, so even if you
manually move the resources to another node, management is made aware of
the change. You can't hide downtime from the automated tools, even if it
doesn't impact the user community.
|
Yes, you can. :-) Configure the tools with exclusions so that certain
failures don't trigger tickets or alerts. I have yet to see a monitoring
solution that doesn't provide some sort of exclusion features so that you
can "hide" downtime from the tool. |
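As a rough illustration of what such exclusions amount to, here is a minimal Python sketch of an alert filter. It is not the configuration format of any real monitoring product; the event fields, the rule, and the function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    source: str    # e.g. a cluster resource name (hypothetical)
    kind: str      # e.g. "failover" or "service_down"
    planned: bool  # True if an operator initiated it


# Hypothetical exclusion rules: each rule is a predicate over an event.
# Here, planned failovers are excluded from ticketing.
EXCLUSIONS = [
    lambda e: e.kind == "failover" and e.planned,
]


def should_alert(event: Event) -> bool:
    """Raise a ticket unless some exclusion rule matches the event."""
    return not any(rule(event) for rule in EXCLUSIONS)


events = [
    Event("Exchange VS", "failover", planned=True),
    Event("Exchange VS", "failover", planned=False),
    Event("Disk Q:", "service_down", planned=False),
]

alerts = [e for e in events if should_alert(e)]
for e in alerts:
    print(f"ticket: {e.kind} on {e.source}")
```

In this sketch the manual move generates no ticket, while the unplanned failover and the service outage still do.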
|
Russ Kaufmann [MCT]
Guest
|
Posted:
Thu Dec 23, 2004 10:54 pm Post subject:
Re: Need a good hardware cluster solution |
|
|
"Scott Schnoll [MSFT]" <scschnol@online.microsoft.com> wrote in message
news:uTNOZ5Q6EHA.2452@TK2MSFTNGP14.phx.gbl...
| Quote: | Absolutely. The goal is to be able to provide the application despite the
failure of a component or a complete system.
Right, so what is the application in this case? Before you answer, it's
hypothetical. :-) This is the primary challenge we have with Exchange
because we really serve two distinct groups of users: (1) IT Pros, the
folks who manage servers, such as AD and Exchange; and (2) Information
workers, such as those who use Outlook or ActiveSync or some other Exchange
client. This is why each organization must determine for itself what
downtime means and how to measure and combat it.
|
Yep, many different definitions for different users. It really makes uptime
reporting a nightmare.
| Quote: | Where I work, we have such incredibly high levels of automation that a
server cluster failover will actually generate several different tickets. A
ticket is generated for the failover of every resource, so even if you
manually move the resources to another node, management is made aware of
the change. You can't hide downtime from the automated tools, even if it
doesn't impact the user community.
Yes, you can. :-) Configure the tools with exclusions so that certain
failures don't trigger tickets or alerts. I have yet to see a monitoring
solution that doesn't provide some sort of exclusion features so that you
can "hide" downtime from the tool.
|
Regrettably, the monitoring tool is managed by a different team (gotta keep
us honest) and it is a combination of a well known vendor tool and an
in-house tool. I have been trying to ply the monitoring team with lots of
beer, but it doesn't work. <G> |
|
Jo K
Guest
|
Posted:
Thu Dec 23, 2004 11:38 pm Post subject:
Re: Need a good hardware cluster solution |
|
|
Clayton,
Two possibilities are NSI Doubletake (www.nsisoftware.com) or
Neverfail group (www.neverfailgroup.com).
NSI is more of a replication solution.
Neverfail is a high availability solution and replication.
It depends which one you are looking for and what you want to achieve.
Both are inexpensive; we've purchased Neverfail because we needed the
high availability it provides.
Jo K
Clayton Sutton wrote:
| Quote: | Hi everyone,
The company that I work for is wanting to move to a Windows 2003 Server and
Exchange 2003 clustered environment. I know that Windows 2003 (Standard)
will do a "Network Load Balancing" and the Enterprise Edition will do both
"Network Load Balancing" and "High Availability" clustering but not BOTH.
If you want to do BOTH "Network Load Balancing" and "High Availability" you
need a third party solution. That's what I'm looking for, anyone have any
ideas? Also, any white papers on Windows and Exchange clustering would be
great too. Thanks for any input.
Clayton |
|
|
Clayton Sutton
Guest
|
Posted:
Fri Dec 24, 2004 3:39 am Post subject:
Re: Need a good hardware cluster solution |
|
|
First off, let me say that I have ENJOYED this thread! You guys' input
(your give and take back and forth) has been VERY enlightening AND helpful!
ALL of you guys are the best!!!
In trying to get my arms around the subject at hand, can you guys clear up
the difference between an "NLB" cluster and a "Server" cluster? I know that
"NLB" stands for "Network Load Balancing" and I know what that is. However,
in a "Server" cluster if one server goes down another one takes over.
Right?
Does an "NLB" cluster do the same? What if I had a four-node "NLB" cluster
and I lost a node? Would the whole cluster be down?
I am trying to understand the technical difference (other than "Load
Balancing") between an "NLB" cluster and a "Server" cluster.
By the way, I am reading as many white papers as I can, I just haven't
gotten to that part yet. I am buried in "White Papers" right now (if you
know what I mean) :).
Thanks again for all of your help.
Clayton
|
|