I came across some code that used the forall construct to update rows in a large table. Quite a few rows were updated and the batch took some time to complete. Usually, using forall is a good idea, but sometimes other constructs can perform better.
Using the forall construct can often reduce the overhead of switching between the PL/SQL and SQL engines in Oracle. There is a single context switch with an array of bind values, and the given statement is then re-executed for each value in the array. This can be a very good thing when the context switches between PL/SQL and SQL are the main contributor to the processing time.
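Schematically, the difference looks like this – a sketch only, using a hypothetical table t(id, text) and a small hard-coded collection:

```sql
declare
  type t_ids is table of number;
  l_ids t_ids := t_ids(1, 2, 3);
begin
  -- row-by-row loop: one PL/SQL-to-SQL context switch per element
  for i in l_ids.first .. l_ids.last loop
    update t set text = 'looped' where id = l_ids(i);
  end loop;

  -- forall: a single context switch for the whole array,
  -- but the update still executes once per element
  forall i in l_ids.first .. l_ids.last
    update t set text = 'bulk' where id = l_ids(i);

  rollback; -- demonstration only
end;
```

Note that forall only removes the per-element context switches; it does not turn the per-element updates into one set-based statement.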
It is, however, important to note that the statement is executed once for every bind value in the array, which in reality amounts to a form of row-by-row processing. To illustrate the concept, I've created a simplified case with two tables: one to be updated, and one holding the ids of the rows to be updated. I've deliberately omitted an index on the first table to make the case even clearer.
So – setting up the tables and filling in some data to make them ready. I've kept the filling of the blocks low (pctfree 95) so the table takes up more space:
drop table t1 purge;
create table t1
(
id number,
text varchar2(50)
)
pctfree 95;
drop table t2 purge;
create table t2
(
id number
);
begin
insert into t1
select rownum, 'first text'
from dual
connect by level <= 100000;
insert into t2
select id
from t1 sample (1); -- sampling ~1 percent of the rows
commit;
end;
select count(*)
from t2 t;
COUNT(*)
1013
So – now let's try to update the rows in T1, based on the ids in T2:
declare
type t_id_table is table of t1.id%type;
l_id_table t_id_table := t_id_table();
begin
select id bulk collect
into l_id_table
from t2 t;
forall i in l_id_table.first .. l_id_table.last
update /*+forall*/ t1
set t1.text = 'first edit'
where t1.id = l_id_table(i);
commit;
end;
completed in 7 s 984 ms
1013 updates took almost 8 seconds on my laptop. Let’s try a different, set-based approach:
begin
update /*+noforall*/ t1
set t1.text = 'second edit'
where t1.id in (select id from t2);
commit;
end;
completed in 104 ms
Why the big difference? Both statements have to use a full table scan as the access method, since T1 has no indexes. The forall version has to scan the table once for every row in the array, while the second one, using where … in, gets away with a single scan. A quick look in v$sql also reveals that the forall version performs about 5.3 million consistent gets, as opposed to only about 6,300 for the second (noforall) version!
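The hint-style comments (/*+forall*/ and /*+noforall*/) are not real optimizer hints here; they simply make the two statements easy to find in v$sql. A lookup along these lines (a sketch; executions and buffer_gets are standard v$sql columns, where buffer_gets counts total logical reads) shows the difference:

```sql
select sql_text, executions, buffer_gets
  from v$sql
 where sql_text like 'UPDATE /*+%forall%'
   and sql_text not like '%v$sql%'; -- exclude this query itself
```

The forall statement shows one execution per array element, while the set-based statement shows a single execution.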
Now – this was maybe an extreme case, but it is still not very unlike the production case I came across. What if we change the case to update a larger portion of the rows in T1, and add a primary key to help the forall lookup:
begin
delete t2 t;
insert into t2 (id)
select id from t1 sample(50);
commit;
end;
select count(*) from t2;
COUNT(*)
50014
alter table t1 add primary key (id);
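To verify that the per-element update can now use the new primary key index, one can check the execution plan – a sketch using the standard explain plan / dbms_xplan combination, with a hypothetical bind variable :id:

```sql
explain plan for
update t1
   set t1.text = 'first edit'
 where t1.id = :id;

select * from table(dbms_xplan.display);
```

The plan should now show an index access on the primary key instead of a full table scan of T1.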
Then we run the updates again:
declare
type t_id_table is table of t1.id%type;
l_id_table t_id_table := t_id_table();
begin
select id bulk collect
into l_id_table
from t2 t;
forall i in l_id_table.first .. l_id_table.last
update /*+forall2*/ t1
set t1.text = 'first edit'
where t1.id = l_id_table(i);
commit;
end;
completed in 534 ms
begin
update /*+no_forall2*/ t1
set t1.text = 'second edit'
where t1.id in (select id from t2);
commit;
end;
completed in 217 ms
Now the time difference is smaller. v$sql also shows that the forall version used 128,392 consistent gets, while the noforall version used 52,863 – less than half.
In summary: Forall can often be a very useful construct, but there are cases where there are significantly better options available.
The examples were run on Oracle 19.3.