Getting in control with PL/Scope

I’m currently working on a project that involves a system that is basically unknown to me. We also have to make changes (additions) to the system as part of our project. The system is a PL/SQL hard-shell application with all business logic and queries in the Oracle database and has about 1 million lines of code. Being a PL/SQL application, it has the great advantage that the source code is easily available for quick searches through views like all_source. We also know that all queries are in the database.

My requirement was to intercept and change the logic around the update of a specific column in a specific table. This means that I will have to find and examine all statements that inserts into or updates this table.

A quick search through all_source using the table name (where upper(text) like ‘%MYTABLE%’) gave way too many hits to handle. Searching for the column name itself is not sufficient as there may or may not be code doing insert and updates using a variable declared as a MyTable%rowtype as illustrated below:

declare
  lRowVariable MyTable%rowtype;
begin
  select *
  into lRowVariable
  from MyTable
  where id = 1;
  
  lRowVariable.id := MyTable_seq.nextval;
  
  insert into MyTable
  values lRowVariable;
end;
/

These (and other) examples may not exist, but they will have to be checked out of the case.

So this gave me an opportunity to try out PL/Scope, which of 12c also collects information about SQL Statements.

To get going, all my source had to be recompiled to collect the PL/Scope-info:

alter <object_type> <object_name> compile PLSCOPE_SETTINGS='IDENTIFIERS:ALL';

I wrote a PL/SQL loop, but there are other ways like the UTL_RECOMP-package, utlrp.sql-script and so forth. You may also use the dependency-views to just recompile referring objects if your scope is limited.

After my initial testing I decided to just turn it on as default in our development-database, since it will obviously come in handy later:

alter system set plscope_settings = 'IDENTIFIERS:ALL' scope=both;

I would not recommend doing this on the production system, as the compile time increases a little bit.

I then started playing with the all_statements and all_identifiers-views. Getting a quick overview over the statements in the system is real easy and might give you some information that you’re not aware of:

-- Overview of statement types
select type, count(*)
from user_statements
group by type
order by 2 desc;

You might, for instance, have a look at execute immediates and so forth.

The nice thing is that the SQL_ID is the same that you can find in v$sql, and you can use it to find duplicate statements (as an indicator of duplicate code):

-- Listing duplicate SQL statements in the code:
select distinct sql_id, count(*) occurrences, min(text) sql_text
from user_statements
where sql_id is not null
group by sql_id
having count(*) > 1
order by occurrences desc;

This also means that even though production runs on 11g, we can use 12c on development and test and still identify statements back and forth.

I also tested a quick way of getting the context where a particular query is located, displaying 5 lines before and after the line where the statement is (i used the materialize-hint to speed up the query):

with s as (
 select /*+materialize*/ object_name, object_type, line
 from user_statements s
 where s.sql_id = '&1'
)
select decode(src.line, s.line, '*****', ' . ') tag, 
  src.name, src.type, src.line, src.text
from user_source src
join s on src.name = S.Object_Name 
  and src.type = S.Object_Type 
  and src.line between s.line -5 and s.line + 5
order by src.type, src.name, src.line;

Now – back to my case – I want to identify insert and update-statements related to a certain table. By combining the all_statements and all_identifiers-views and utilizing the hieriarcical structure, I came up with this query to list all statements inserting and updating my table:

WITH
allReferences AS (
 SELECT 
 'ID' plscope_type,
 ai.object_name,
 ai.object_type,
 ai.usage usage,
 ai.usage_id,
 ai.usage_context_id,
 ai.TYPE || ' ' || ai.name name,
 ai.line,
 ai.signature,
 NULL as sql_id
 FROM all_identifiers ai
UNION ALL
 SELECT 
 'ST' plscope_type,
 st.object_name,
 st.object_type,
 st.TYPE usage,
 st.usage_id,
 st.usage_context_id,
 'STATEMENT' name,
 st.line,
 st.signature,
 st.sql_id
 FROM all_statements st
)
SELECT
 myStatements.object_name,
 myStatements.object_type,
 myStatements.usage,
 myStatements.line,
 myStatements.sql_id,
 st.FULL_TEXT
FROM (
 SELECT DISTINCT -- quick-fix for duplicates
 LEVEL reference_context_level,
 r.plscope_type,
 r.object_name,
 r.object_type,
 r.usage,
 r.usage_id,
 r.usage_context_id,
 r.line,
 r.name,
 r.signature,
 r.sql_id
 FROM allReferences r
 CONNECT BY PRIOR object_name = object_name 
 AND PRIOR object_type = object_type 
 AND PRIOR usage_context_id = usage_id
 AND LEVEL <= 2
 START WITH r.signature = (
 SELECT DISTINCT signature
 FROM all_identifiers i
 WHERE type = 'TABLE' AND name = 'MYTABLE'
 )
) myStatements
LEFT JOIN all_statements st on (myStatements.signature = st.signature)
WHERE myStatements.reference_context_level = 2 -- We want the context where this is used
 AND myStatements.plscope_type = 'ST' -- We want statements
 AND myStatements.usage IN ('UPDATE', 'INSERT')
ORDER BY
  myStatements.object_type, 
  myStatements.object_name, 
  myStatements.usage_id, 
  myStatements.line

The list of queries from the above query was considerably shorter and well manageable: Well suited for a further manual inspection of the code. Notice that it will also list occurrences where MyTable is used as a select source in a insert into MyOtherTable select … from MyTable.

Of course – after doing all this work I realized that I could also just have searched the all_statements-view alone:

SELECT *
FROM all_statements st
WHERE st.type IN ('UPDATE', 'INSERT')
 AND st.full_text LIKE '%MYTABLE%';

This runs a lot faster than the large query above and basically gives the same result.

I have not checked out the PL/Scope-behaviour when you use synonyms, since I don’t have that in my code base.

We also have very few and controllable occurrences of dynamic SQL (EXECUTE IMMEDIATE and OPEN … FOR <string>). These are impossible to trace the contents of, but the location is recorded, so you know where to look. But it’s yet another reason to avoid dynamic SQL.

In this case, I’m lucky: All statements are in the database, so I can be in control. I would not have gotten far with this method if statements were coming from the client/web-server. In a same size system with SQL in the middle/end-tier this work would have taken days or maybe even weeks.

I’ll follow up with more posts where I use PL/Scope to identify dead and possibly dead code.

For more on PL/Scope, check out the fine blogs of Steven Feuerstein, Philipp Salvisberg (who also has a great git-repo and additions to SQL Developer), Tim Hall and of course the documentation.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s