Warning

 

Close
Confirm Action

Are you sure you wish to do this?

Cancel Confirm
AR15.COM
7/13/2014 5:29:30 PM EDT
I have a table with about 180 million rows (65gb of data when it's in uncompressed .txt files).

I've identified a subset of the rows in this table with a query that joins to another table, and put the keys for those rows, into a third table.

I want to now delete from the the table of 180 million, all rows for which the primary key appears in that third table.

I could do this as:


delete from table1 t1 where
t1.key in (Select t3.key from table3 t3)


but it would be horribly slow.


What's the best way to do this?

Should I just do a join and create a new table and then drop the old one?

something like this:


create table1new as
select
t1.*
from table1 t1 left join table3 t3 on t1.key = t3.key
where t3.key is null
;

drop table1;
rename table1new to table1;



7/13/2014 5:35:14 PM EDT
[#1]
Try rewriting using WHERE EXISTS and see if that helps
7/13/2014 5:39:20 PM EDT
[#2]
use the query planner to see what the cost is.
7/13/2014 5:40:27 PM EDT
[#3]
SQL Server guy here but if the majority of your records are not what you're deleting, why not just move the new records to a new table.  Truncate (delete without writing to transaction log (kinda)) the table, drop the table, rename the new table to the old table's name - all in a transaction.  This makes assumptions - but it's fairly low effort ETA: and fast.

ETA2:  I derped.

I meant truncate the table, move the data back from the fake temp table and drop that.  Obviously if you were going to truncate the data you could have just dropped it and did a rename as you suggested in your OP.
7/13/2014 5:45:03 PM EDT
[#4]
Look at foreign Keyes. That's a big delete, esp for your row count. I have seen 3 million row deletes take a few days. Depending on the referential integrity, you may be able to drop some table constraints and speed it up a bit. Can you give us a little more info? Maybe explain the relationships with different names if you are worried about security.

Eta what sweatpants said regarding a truncate is faster as well, truncate is faster but you'd need to restore the data you do want.


More edit: an explain plan would tell you where the slowdown is