| Author |
Message |
bc
Guest
|
Posted:
Fri Jan 21, 2005 2:41 am Post subject:
Database Size, Database Corruption, Message Store & Storage |
|
|
We're just recovering from an Exchange disaster. We discovered that we could
not get database backups off of two of our four storage groups. We then
discovered we were down one drive in our RAID 5 array. Then we discovered
that we could not rebuild onto a new drive, probably due to issues on a
second drive. We then attempted to save as much data as we could by
exmerging data out of the two damaged stores and moving mailboxes to a second
server. In the midst of this, the server crashed and would not reboot.
We've since rebuilt and recovered what we could, but we want to know if we
could do things differently in the future that might help us recover more
quickly. Especially since we found that one of the moved stores wouldn't
back up.
We have 4 storage groups on the server in question. Two databases in each
storage group with a total of about 100 GB between the 8 of them. We found
exmerging and/or moving the data to try to save it to be extremely time
consuming. We'd like to hear from anyone who might have suggestions or
similar experiences. Should we decrease the size of our databases by
splitting them up? Should we try doing an off-line consistency check? Are
there online alternatives to repairing a database that won't back up?
|
|
|
 |
Ben Winzenz [Exchange MVP]
Guest
|
Posted:
Fri Jan 21, 2005 3:44 am Post subject:
Re: Database Size, Database Corruption, Message Store & Stor |
|
|
To be honest? More careful monitoring of the server. Checking daily that
the backup job completes successfully probably would have also helped.
Daily checking of the event logs for any warnings or errors. Quarterly (or
thereabouts) testing restores to make sure that they are good. I'm not
trying to come across as being hard on you, but there really is no good
reason for discovering all the things that you mentioned "at time of
disaster". They are all things that should have been discovered as soon as
they happened. What you can do differently in the future is make a list of
things that you should be checking on a daily basis so that problems can be
remediated as soon as they occur. Again - take this as constructive
feedback rather than bashing on you. :-) Most of us have been there at one
point or another. The best thing you can do is learn from what went wrong.
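One way to make the daily backup check routine is to script it. The following is only a minimal sketch: it assumes you can already obtain the time of the last successful backup for each store (from your backup software's reports, for example), and the store names, the 24-hour threshold and the `stale_backups` helper are all hypothetical.

```python
from datetime import datetime, timedelta


def stale_backups(last_success, now, max_age=timedelta(hours=24)):
    """Return the names of stores whose last good backup is missing or too old.

    last_success maps a store name to the datetime of its last successful
    backup, or to None if no successful backup is known.
    """
    stale = []
    for store in sorted(last_success):
        finished = last_success[store]
        if finished is None or now - finished > max_age:
            stale.append(store)
    return stale


# Hypothetical data for illustration only.
now = datetime(2005, 1, 21, 8, 0)
last_success = {
    "SG1\\Mailbox Store 1": datetime(2005, 1, 21, 2, 30),
    "SG2\\Mailbox Store 3": datetime(2005, 1, 18, 2, 30),
    "SG3\\Mailbox Store 5": None,
}
for store in stale_backups(last_success, now):
    print("No good backup in the last 24 hours:", store)
```

A check like this only tells you that a backup job reported success. It does not replace the periodic test restores mentioned above.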
Another thing I want to mention is that you may be better served filling up
an entire storage group before creating a new one. The reason I say this is
because each extra database requires additional memory when Exchange starts.
Each extra storage group also requires more memory at startup, and quite a
bit more than an extra mailbox store does. Microsoft generally
indicates that the best practice for server utilization is to fill up the
entire storage group first for this very reason. If you have 8 mailbox
stores, you may consider consolidating down to 2 storage groups. If you are
using too much physical memory, you will start to have excessive disk
paging, which can have an impact on performance.
Also it is important to know if there was actually database corruption. If
there was, then there will be corresponding events logged to the server's
application log. Look for ESE errors such as -1018, -1019 or -1022 in the
event descriptions (these point to damaged pages or read/write failures in
the edb file - there is another set corresponding to the stm file). If there
was database corruption, then more than likely it was the result of faulty
hardware. In this case, it could have possibly been flaky hard drives, but
the RAID controller is also something you should look
at. The figure that gets tossed around is that 99.9% of all database
corruption in Exchange is caused by faulty hardware.
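The 1018, 1019 and 1022 errors mentioned above normally show up in event descriptions as negative ESE error codes (-1018, -1019, -1022). As a rough sketch, assuming you have exported the application log to plain text lines, a script can flag them for you. The export format, the sample lines and the `find_ese_errors` helper are assumptions for illustration, not something Exchange provides.

```python
import re

# ESE error codes commonly tied to database page problems.
ESE_ERRORS = {
    -1018: "checksum mismatch on a database page",
    -1019: "database page not initialized",
    -1022: "disk I/O error",
}

# Match the code only when it stands alone, so "-10180" or "10-1018" is ignored.
_PATTERN = re.compile(r"(?<![\w-])-(1018|1019|1022)(?!\d)")


def find_ese_errors(lines):
    """Return (line_number, code, description) for each ESE error code found."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for match in _PATTERN.finditer(line):
            code = -int(match.group(1))
            hits.append((number, code, ESE_ERRORS[code]))
    return hits


# Hypothetical exported log lines, for illustration only.
sample = [
    "Information Store: page read failed verification with error -1018",
    "Backup of storage group completed",
    "Information Store: write to database file failed with error -1022",
]
for number, code, description in find_ese_errors(sample):
    print("line", number, "error", code, "-", description)
```

Any hit is a reason to check the disks and the RAID controller, not just the database.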
In terms of backups, you should look at "how" you are performing backups.
Are you simply doing a normal online backup of the Information Store? Are
you also doing Brick-level (Mailbox-level) backups via a 3rd party backup
agent? Do you have file-level antivirus installed on the Exchange server?
As far as individual database size, that is really up to you. Normally, I
would say that database size should be dictated by your SLAs. If your
restore window is only 4 hours, then you should size your databases so that
they can be restored within that time frame. You should also factor in
maintenance. Normally, Exchange doesn't require daily maintenance - it is
pretty good at taking care of itself. However, if you should have to run
eseutil against the database, it can take a while.
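The sizing rule above is simple arithmetic, sketched below. The restore rate and the overhead allowance are assumptions that you would have to measure with your own test restores. Only the 4-hour window and the figure of roughly 100 GB across 8 databases come from this thread.

```python
def max_database_gb(restore_window_hours, restore_rate_gb_per_hour, overhead_hours=0.0):
    """Largest database that can be restored inside the window.

    overhead_hours covers work that is not raw data transfer, such as
    locating media, replaying logs and remounting the store.
    """
    usable_hours = restore_window_hours - overhead_hours
    if usable_hours <= 0 or restore_rate_gb_per_hour <= 0:
        raise ValueError("restore window and restore rate must leave usable time")
    return usable_hours * restore_rate_gb_per_hour


def restore_hours(database_gb, restore_rate_gb_per_hour):
    """Estimated transfer time for one database at the given restore rate."""
    return database_gb / restore_rate_gb_per_hour


# Assumed figures: 10 GB/hour restore rate and 1 hour of overhead.
print("Max database size (GB):", max_database_gb(4, 10, overhead_hours=1))
# About 100 GB across 8 databases is roughly 12.5 GB per database.
print("Hours per 12.5 GB database:", restore_hours(12.5, 10))
print("Hours for all 100 GB:", restore_hours(100, 10))
```

Under those assumed numbers a single database fits the window comfortably, but restoring everything at once does not. That is the case an SLA should spell out.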
--
Ben Winzenz
Exchange MVP
|
|
|
 |
GT
Guest
|
Posted:
Fri Jan 21, 2005 4:20 am Post subject:
Re: Database Size, Database Corruption, Message Store & Stor |
|
|
Well said!
-GT
|
|
|
 |
Drew Halevy
Guest
|
Posted:
Fri Jan 21, 2005 4:25 am Post subject:
Re: Database Size, Database Corruption, Message Store & Stor |
|
|
Backups are handled elsewhere in my department, but the backup guy is both
very good and very conscientious. He is an alert addict, and I get alerts on
overall backups, F/E backups and B/E backups. I check them first thing in
the morning. If there is an error, we can jump on it right away. Most 3rd
party backup software should have notifications. I also do monitoring using
IP sentry for SMTP service, free drive space and some Exchange services.
Plus the checks Ben listed (event logs, etc.). -Drew
|
|
|
 |
bc
Guest
|
Posted:
Fri Jan 21, 2005 5:57 am Post subject:
Re: Database Size, Database Corruption, Message Store & Stor |
|
|
A note in my defense -
I was brought in after the fact, and I'm just trying to make things better in
the future.
Good ideas. Now, if you've got some ideas on how to suggest changes without
ruffling others' feathers... probably another newsgroup though. ;-)
However, the more people I get saying similar things, the better. How
about those database utilities like ISINTEG and ESEUTIL, is it worthwhile to
dismount a store and run them? Ever?
|
|
|
 |
Susan
Guest
|
Posted:
Fri Jan 21, 2005 6:11 am Post subject:
Re: Database Size, Database Corruption, Message Store & Stor |
|
|
the consensus on running these utilities against your databases is that you
should not, unless it is necessary...if/when you do, ensure you take a full
backup before, and immediately after running these procedures...you might
want to read up on these utilities to see how they affect your
databases...and speaking of backups, the importance of these cannot be
overstressed...you can recover from any calamity if you have good backups...
|
|
|
 |
GT
Guest
|
Posted:
Fri Jan 21, 2005 7:30 am Post subject:
Re: Database Size, Database Corruption, Message Store & Stor |
|
|
Usually, if you are able to do a full backup of the databases, there is nothing
wrong with them. Assuming you are doing a full backup at least once a week,
this should be sufficient verification.
-GT
|
|
|
 |
Ben Winzenz [Exchange MVP]
Guest
|
Posted:
Fri Jan 21, 2005 9:13 pm Post subject:
Re: Database Size, Database Corruption, Message Store & Stor |
|
|
It wasn't my intent to "ruffle your feathers". I wasn't aware that you were
brought in after the fact. In that case, many of my comments would have been
directed at the people already there, not you :-)
As far as the utilities go, I completely agree with Susan. In 6+ years of
working with Exchange, I have had exactly one case where I needed to run
either tool against a production server, and that was running isinteg
against a public folder store. I did have to use eseutil and isinteg once
when restoring a 5.5 database (DR test restore to an alternate server), but
that is another story and was due to some other problems. Further, as Susan
indicated, they really ought to be run only if absolutely necessary. There
are some who would argue that you ought to run eseutil and *defragment* the
databases every so often. I am not one of them. Removing the *whitespace*
with an offline defrag makes Exchange work harder, IMHO, because it must then
create new pages in the database instead of re-using existing ones. It's all
about efficiency: it's more efficient to re-use the whitespace.
Most often, if you find you need to run eseutil against a database
(recovery or repair), it is better to restore from backup, especially in
the case of a repair. If you speak to Microsoft PSS (Product
Support Services), they will tell you the same thing. Running a hard repair
should always be the last option. Exchange has a lot of built-in
recoverability features. For instance, if you perform a restore, Exchange
will replay the log files to bring the database up to the point of failure.
There are obviously situations where that can't be done, but I'm just
illustrating a point.
I'd also recommend becoming familiar with the disaster recovery documents.
Since I don't know whether you are running 2000 or 2003, I'll post both.
http://support.microsoft.com/default.aspx?scid=kb;en-us;326052 - Exchange 2000
http://www.microsoft.com/downloads/details.aspx?FamilyID=A58F49C5-1190-4FBF-AEDE-007A8F366B0E&displaylang=en - Exchange 2003
Along with those, this is a great resource:
http://www.microsoft.com/technet/prodtechnol/exchange/2003/library/default.mspx
--
Ben Winzenz
Exchange MVP
bc
Guest
|
Posted:
Fri Jan 21, 2005 11:37 pm Post subject:
Re: Database Size, Database Corruption, Message Store & Stor |
|
|
Two things:
First, I ran eseutil /d against two of the databases, as we had over 50
GB of whitespace after moving mailboxes around and our backups were taking an
extra 50 GB worth of time. Things have since improved dramatically. Maybe it
was the wrong thing to do, but ...
Second, one of the techs keeps mentioning an "online defrag/consistency
check" that I cannot find anything about. He's not sure either. It may be
that he's talking about the maintenance that happens when the backups run?
BTW, this is an Exchange 2003 installation and I badly underestimated our
total store size. It's approx 180 GB.
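To put rough numbers on the backup-time point: as the experience above suggests, an online backup reads the whole database file, so its duration tracks file size rather than mailbox content. The same arithmetic gives the largest database that fits a restore window, such as the 4-hour example mentioned earlier in the thread. A minimal sketch in Python; the 25 GB/hour rate and both function names are hypothetical placeholders, not figures or tools from this thread, so measure your own backup and restore rates before relying on it.

```python
def transfer_hours(size_gb: float, rate_gb_per_hour: float) -> float:
    """Hours needed to back up or restore size_gb at a sustained rate."""
    if rate_gb_per_hour <= 0:
        raise ValueError("rate must be positive")
    return size_gb / rate_gb_per_hour


def max_database_gb(window_hours: float, rate_gb_per_hour: float) -> float:
    """Largest database that can be restored inside the given window."""
    if window_hours < 0 or rate_gb_per_hour <= 0:
        raise ValueError("window must be non-negative and rate positive")
    return window_hours * rate_gb_per_hour


# Assumption: sustained throughput of the backup/restore path.
# This number is invented for illustration only.
ASSUMED_RATE_GB_PER_HOUR = 25.0

# Time spent backing up 50 GB of whitespace (the figure from the post above).
whitespace_cost = transfer_hours(50.0, ASSUMED_RATE_GB_PER_HOUR)

# Largest single database that fits a 4-hour restore window.
largest_db = max_database_gb(4.0, ASSUMED_RATE_GB_PER_HOUR)

print(f"Whitespace costs about {whitespace_cost:.1f} h per backup")
print(f"A 4 h restore window allows roughly {largest_db:.0f} GB per database")
```

At the assumed rate, 50 GB of whitespace adds two hours to every backup, and a 4-hour window caps each database at about 100 GB. The real restore time also includes log replay, which this sketch ignores.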
bc
Guest
|
Posted:
Fri Jan 21, 2005 11:45 pm Post subject:
Re: Database Size, Database Corruption, Message Store & Stor |
|
|
Nevermind! I found the on-line defrag! Sorry to waste anyone's time.
bc
Guest
|
Posted:
Sat Jan 22, 2005 12:19 am Post subject:
Re: Database Size, Database Corruption, Message Store & Stor |
|
|
That third error, error 1022, is that the correct error number? This seems
to be an error regarding users trying to log into mailboxes while maintenance
is running on them. We ARE seeing this error, but not the other two.
Ben Winzenz [Exchange MVP]
Guest
|
Posted:
Sat Jan 22, 2005 12:32 am Post subject:
Re: Database Size, Database Corruption, Message Store & Stor |
|
|
http://support.microsoft.com/?id=314917
Basically, there is a 1022 error that relates to Exchange database corruption,
but the same event ID can also refer to other things. If it were database
corruption, the event description would say so. An example of a corruption
event is:
MSExchangeIS (248) Synchronous read page checksum error -1018
((1:3106 1:3106)(0-310013)(0-312215)) occurred.
Please restore the databases from a previous backup.
You shouldn't have anything to worry about. You can look up the error you
are seeing on eventid.net for more details on what might be causing it.
It's not database corruption, though :-)
--
Ben Winzenz
Exchange MVP