I read a blog post by Connor McDonald the other day about Advanced Network Compression and network transfer savings. It reminded me of a feature that not many know of and that comes without an extra-cost option. In fact, it's on by default.
I've sometimes had discussions with developers who were reluctant to join master-detail queries on the database side, because they expected the transfer size of the result set to be a problem due to the many repeated values on the master side. They argued that a separate query against each table would be more efficient. They're wrong *).
The test case below demonstrates how a simple optimization technique in Oracle Net Services avoids sending duplicated column values, compared row by row. That is: if the next row's column value is the same as the previous one, it won't be shipped over the network. But if the two values differ, the value will be sent. I've illustrated this by returning 1,000 rows. The text column has a length of 100 characters with only two possible values: "-xxxx…" and "-yyyy…". The transferred size is vastly smaller, 22,519 bytes, when the result set is ordered by the text column, vs 121,971 bytes when ordered so that the text alternates every second row.
SQL*Plus: Release 12.2.0.1.0 Production on Wed Jan 10 11:16:48 2018
Copyright (c) 1982, 2016, Oracle. All rights reserved.
Last Successful login time: Wed Jan 10 2018 11:01:15 -05:00
Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
SQL> SET AUTOTRACE TRACE STAT
SQL> SELECT
rownum rn,
mod(ROWNUM, 2) mod_result,
rpad('-', 100, mod(ROWNUM, 2)) text
FROM dual
CONNECT BY LEVEL <= 1000
ORDER BY rn;
1000 rows selected.
Statistics
----------------------------------------------------------
13 recursive calls
0 db block gets
32 consistent gets
0 physical reads
0 redo size
121971 bytes sent via SQL*Net to client
1333 bytes received via SQL*Net from client
68 SQL*Net roundtrips to/from client
6 sorts (memory)
0 sorts (disk)
1000 rows processed
Next, I've ordered the result set by the long text column, and that does wonders for the amount of data that's transferred.
SQL> SELECT
rownum rn,
mod(ROWNUM, 2) mod_result,
rpad('-', 100, mod(ROWNUM, 2)) text
FROM dual
CONNECT BY LEVEL <= 1000
ORDER BY text;
1000 rows selected.
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
0 consistent gets
0 physical reads
0 redo size
22519 bytes sent via SQL*Net to client
1333 bytes received via SQL*Net from client
68 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1000 rows processed
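To make the effect concrete, here is a minimal Python sketch of the idea, not the actual SQL*Net wire protocol: a column value is shipped in full only when it differs from the same column in the previous row, otherwise only a small "repeated" marker is sent. The function name and the 1-byte marker size are assumptions for illustration.

```python
# Simplified model of consecutive-duplicate suppression for one column.
# Assumption: a repeated value costs one marker byte instead of its full length.

def transferred_bytes(values, repeat_marker_size=1):
    """Rough byte count for one column under duplicate suppression."""
    total = 0
    prev = None
    for value in values:
        if value == prev:
            total += repeat_marker_size  # same as previous row: marker only
        else:
            total += len(value)          # value changed: send it in full
        prev = value
    return total

# 1,000 rows, a 100-character text column with two possible values,
# mirroring the test case above.
alternating = [("-" + "x" * 99) if i % 2 else ("-" + "y" * 99) for i in range(1000)]
sorted_rows = sorted(alternating)

print(transferred_bytes(alternating))  # every row differs from the previous one
print(transferred_bytes(sorted_rows))  # only two runs of identical values
```

Under this model the alternating order pays the full 100 bytes for every row, while the sorted order pays it only twice, once per distinct run, which matches the direction of the AUTOTRACE numbers above.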
This means that your typical Customer -> Orders -> OrderLines join, with an ORDER BY that keeps the repeated master values adjacent, is safe to do in the database – there will be no extra CPU or network cost related to the repetition, as opposed to doing client-side joins (which some ORM frameworks prefer).
Actually, the overhead of separate queries is usually a lot higher, on the database, the network and the middle tier alike, as demonstrated in a different case by Toon Koppelaars! This is also worth taking into consideration when moving to the cloud, where you often pay for CPU and network bandwidth.
So – another reason to do joins in the database instead of the middle tier or the client.
*) Of course, there might be a few exceptions to this, but in general this statement holds.