Discussion:
DB2 deletion
r***@in.ibm.com
2007-09-18 09:55:50 UTC
Hello,

I am trying to delete records from a huge table based upon record creation date. My deletion statement was:
"delete from <table_name> where record_created < current timestamp - 60 days"

There were 191 million records to delete. My database is configured for infinite logging (LOGSECOND = -1). The total number of records in the table is 420 million.

I ran the statement for 6 hours and more than 80 secondary log files were generated, but it still did not complete. Finally, I had to cancel it.

Do you have a better solution through which
- I can monitor the progress of the deletion
- I can break up the deletion into smaller chunks and run them sequentially.

*** Please note that I cannot drop the table, as it is required for report generation.
b***@to.com
2007-09-18 11:18:44 UTC
Hi,

Interesting problem :o) In order to break this operation into smaller chunks, you could do:

"delete from <table name> where record_created in (select record_created from <table name> where record_created < current timestamp - 60 days fetch first <x> rows only)"

where <x> is set to a reasonable value. This way you can "follow" the delete process.
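
If you want to automate this, a small shell driver can rerun the chunked delete until nothing qualifies anymore. A rough sketch (untested; MYDB and the chunk size of 10000 are placeholders, and older DB2 releases may reject FETCH FIRST inside a subquery):

  #!/bin/sh
  # the CLP keeps this connection for the rest of the shell session
  db2 connect to MYDB

  # with autocommit on (the CLP default) every chunk commits on its own,
  # so the log space needed stays bounded; the CLP returns a nonzero
  # code once the delete finds no rows (SQLCODE +100), ending the loop
  while db2 "delete from <table_name> where record_created in
               (select record_created from <table_name>
                where record_created < current timestamp - 60 days
                fetch first 10000 rows only)"
  do
    echo "deleted another chunk at $(date)"
  done

  db2 connect reset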

I don't know your exact setup, but wouldn't it be an option to partition this huge table into data partitions, for example one data partition per two months (~60 days)? This way you could attach/detach data partitions on the table, and drop the partitions that you no longer need. Just a thought...

Greetings from Stuttgart,
Bogdan

----------------------
Bogdan Taru
System Engineer
http://www.bogdant.net
Mark A
2007-09-18 22:07:29 UTC
Post by r***@in.ibm.com
I am trying to delete records from a huge table based upon record
creation date. [...]
- I can monitor the progress of the deletion
- I can break up the deletion into smaller chunks and run them
sequentially.
Here are two options:

Write a stored procedure that deletes the rows using a cursor and
commits every 100 deletes. I do this frequently, as I recently had to
delete 41 million rows. I actually have logic which commits every 100
deletes or every 2 minutes, whichever comes first, since this is an
OLTP system.
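
The skeleton of such a procedure in SQL PL might look like this (untested sketch; the procedure name, table T1, and column record_created are made up, so adjust to your schema; create it with a non-default statement terminator, e.g. "db2 -td@ -vf purge_proc.sql", then run it with CALL PURGE_OLD_ROWS()):

  CREATE PROCEDURE PURGE_OLD_ROWS ()
  LANGUAGE SQL
  BEGIN
    DECLARE v_ts      TIMESTAMP;
    DECLARE v_deleted INTEGER  DEFAULT 0;
    DECLARE v_done    SMALLINT DEFAULT 0;
    -- WITH HOLD keeps the cursor open across the intermediate commits
    DECLARE c1 CURSOR WITH HOLD FOR
      SELECT record_created FROM T1
      WHERE record_created < CURRENT TIMESTAMP - 60 DAYS
      FOR UPDATE;
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET v_done = 1;

    OPEN c1;
    FETCH c1 INTO v_ts;
    WHILE v_done = 0 DO
      DELETE FROM T1 WHERE CURRENT OF c1;
      SET v_deleted = v_deleted + 1;
      -- commit every 100 deletes to release log space and locks
      IF MOD(v_deleted, 100) = 0 THEN
        COMMIT;
      END IF;
      FETCH c1 INTO v_ts;
    END WHILE;
    CLOSE c1;
    COMMIT;
  END @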

Create a Command Line Processor script (not a shell script) with the
following operations:

- Alter the table using ACTIVATE NOT LOGGED INITIALLY.
- Then do the delete.
- Then do a commit.

The entire script must be run with autocommit off (the +c option). The
delete operation will not be logged, and after the commit, logging will
resume. The only problem is that the table is hosed if you have a
failure mid-delete (you can recover from a backup, but cannot roll
forward to the end of the logs).
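
A sketch of such a script (untested; MYDB and <table_name> are placeholders), saved as purge.clp and run with "db2 +c -tvf purge.clp":

  -- +c keeps autocommit off so the ALTER, the DELETE, and the COMMIT
  -- all run in one unit of work; NOT LOGGED INITIALLY is deactivated
  -- again at the commit
  CONNECT TO MYDB;
  ALTER TABLE <table_name> ACTIVATE NOT LOGGED INITIALLY;
  DELETE FROM <table_name> WHERE record_created < CURRENT TIMESTAMP - 60 DAYS;
  COMMIT;
  CONNECT RESET;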
Ian
2007-09-19 21:47:37 UTC
Post by r***@in.ibm.com
I am trying to delete records from a huge table based upon record
creation date. [...]
*** Please note that I cannot drop the table, as it is required for
report generation.
One more option (if you can take a very short outage):

Create a duplicate of the table, e.g. <table_name>_2, with all indexes, etc.

Then use LOAD from a cursor to populate this table with all records from
<table_name> where record_created >= current timestamp - 60 days.


Finally, rename the tables to swap in your new table.

Deleting 40% of the table will cause issues -- you'll need to reorganize
the table, which can take a while. This solution is fast and will avoid
the reorg.
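
In CLP terms, the whole swap might look roughly like this (untested sketch; it assumes CREATE TABLE ... LIKE for the copy, which does not copy the indexes, so recreate those on <table_name>_2 before the rename):

  CREATE TABLE <table_name>_2 LIKE <table_name>;
  DECLARE c1 CURSOR FOR
    SELECT * FROM <table_name>
    WHERE record_created >= CURRENT TIMESTAMP - 60 DAYS;
  -- NONRECOVERABLE avoids the backup-pending state after the load
  LOAD FROM c1 OF CURSOR REPLACE INTO <table_name>_2 NONRECOVERABLE;
  -- the short outage: swap the names
  RENAME TABLE <table_name> TO <table_name>_old;
  RENAME TABLE <table_name>_2 TO <table_name>;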
Knut Stolze
2007-09-24 14:31:55 UTC
Post by Ian
Post by r***@in.ibm.com
I am trying to delete records from a huge table based upon record
creation date. [...]
[...]
This solution is fast and will avoid the reorg.
One more idea: if this situation will recur in the future, you may want
to consider range-partitioned tables (available in DB2 V9). Then you can
roll out entire partitions and drop the rolled-out tables.
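
A rough sketch of the DDL (DB2 9 syntax; the table, column, and partition names here are invented, and you should check SYSCAT.DATAPARTITIONS for the names DB2 actually generates):

  CREATE TABLE hist (
    id             BIGINT    NOT NULL,
    record_created TIMESTAMP NOT NULL
  )
  PARTITION BY RANGE (record_created)
    (STARTING '2007-01-01-00.00.00'
     ENDING '2008-01-01-00.00.00' EXCLUSIVE
     EVERY 2 MONTHS);

  -- roll out the oldest two months into a standalone table, then drop it
  ALTER TABLE hist DETACH PARTITION part0 INTO hist_rolled_out;
  DROP TABLE hist_rolled_out;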
--
Knut Stolze
DB2 z/OS Utilities Development
IBM Germany