May 16, 2013

Oracle 10g Wildcard NCLOB Bug

I ran into a peculiar Oracle 10g bug that took a bit of head-scratching to diagnose.

Our application stores links to other application content in text strings within the database. These are (roughly) in this format /xid-123_2, with 123 being a file_id and the 2 referring to the data source. I wanted to extract them, but had some problems...

13:29:54 SQL> -- This works
Table dropped.
13:29:54 SQL> create table mytest (a clob);
Table created.
13:29:54 SQL> insert into mytest (a) values ('There is an /xid-123_2 in here.
Then comes an /xid-2222_3');
1 row created.
13:29:54 SQL> select instr(a,'/xid-') from mytest;
INSTR(A,'/XID-')
----------------
              13 
13:29:54 SQL> -- find the first link...
13:29:54 SQL> select substr(a,instr(a,'/xid-')+5
,instr(a,'_',instr(a,'/xid-'))- (instr(a,'/xid-')+5) )
file_id,substr(a,instr(a,'_',instr(a,'/xid-')) + 1,1 ) source from mytest;
FILE_ID
--------------------------------------------------------------------------------
SOURCE
--------------------------------------------------------------------------------
123
2 
13:29:54 SQL> -- find the second link...
13:29:54 SQL> select substr(a,instr(a,'/xid-',1,2)+5
,instr(a,'_',instr(a,'/xid-',1,2))- (instr(a,'/xid-',1,2)+5) )
file_id,substr(a,instr(a,'_',instr(a,'/xid-',1,2)) + 1,1 ) source from mytest;
FILE_ID
--------------------------------------------------------------------------------
SOURCE
--------------------------------------------------------------------------------
2222
3
All well and good! However, while it works on a CLOB, running the same script against an NCLOB fails.
13:29:54 SQL> -- This doesn't work:
13:29:54 SQL> drop table mytest;
Table dropped.
13:29:54 SQL> create table mytest (a nclob);
Table created.
13:29:54 SQL> insert into mytest (a) values ('There is an /xid-123_2 in here.
Then comes an /xid-2222_3');
1 row created.
13:29:54 SQL> select instr(a,'/xid-') from mytest;
INSTR(A,'/XID-')
----------------
              13
13:29:54 SQL> -- find the first link...
13:29:54 SQL> select substr(a,instr(a,'/xid-')+5
,instr(a,'_',instr(a,'/xid-'))- (instr(a,'/xid-')+5) )
file_id,substr(a,instr(a,'_',instr(a,'/xid-')) + 1,1 ) source from mytest;
FILE_ID
--------------------------------------------------------------------------------
SOURCE
--------------------------------------------------------------------------------
x
13:29:54 SQL> -- find the second link...
13:29:54 SQL> select substr(a,instr(a,'/xid-',1,2)+5
,instr(a,'_',instr(a,'/xid-',1,2))- (instr(a,'/xid-',1,2)+5) )
file_id,substr(a,instr(a,'_',instr(a,'/xid-',1,2)) + 1,1 ) source from mytest;
FILE_ID
--------------------------------------------------------------------------------
SOURCE
--------------------------------------------------------------------------------
x
After a lot of testing, it looks like the the '_' is treated as a single-character wildcard when you're using it with an NCLOB, but not if you're using it with a CLOB. I've not found any way around this, aside from using TO_CLOB() to convert the NCLOB to a CLOB column.
13:29:54 SQL> drop table mytest;
Table dropped.
13:29:54 SQL> create table mytest (a nclob);
Table created.
13:29:54 SQL> insert into mytest (a) values ('There is an /xid-123_2 in here.
Then comes an /xid-2222_3');
1 row created.
13:29:54 SQL> select instr(to_clob(a),'/xid-') from mytest;
INSTR(TO_CLOB(A),'/XID-')
-------------------------
                       13
13:29:54 SQL> -- find the first link...
13:29:54 SQL> select substr(to_clob(a),instr(to_clob(a),'/xid-')+5
,instr(to_clob(a),'_',instr(to_clob(a),'/xid-'))-
(instr(to_clob(a),'/xid-')+5) )
file_id,substr(to_clob(a),instr(to_clob(a),'_',instr(to_clob(a),'/xid-')) +
1,1 ) source from mytest;
FILE_ID
--------------------------------------------------------------------------------
SOURCE
--------------------------------------------------------------------------------
123
2
13:29:54 SQL> -- find the second link...
13:29:54 SQL> select substr(to_clob(a),instr(to_clob(a),'/xid-',1,2)+5
,instr(to_clob(a),'_',instr(to_clob(a),'/xid-',1,2))-
(instr(to_clob(a),'/xid-',1,2)+5) )
file_id,substr(to_clob(a),instr(to_clob(a),'_',instr(to_clob(a),'/xid-',1,2))
+ 1,1 ) source from mytest;
FILE_ID
--------------------------------------------------------------------------------
SOURCE
--------------------------------------------------------------------------------
2222
3
Anyway, I think I've run into a similar issue before, but didn't remember it. Oracle 11g fixes this behavior, so I'm going to say both that a) it's definitely a bug and b) by writing up this blog post, I have a better chance of remembering this problem exists.
Posted by jeffreyb at 02:04 PM | Comments (0)