Jared Still ponders this question in a recent blog post. He runs some benchmarks against the various approaches and comes to the conclusion that using TRANSLATE() is the fastest approach. Which is fine, as far as it goes. It’s a solution which works for Jared’s situation but is not universally applicable.
Note that I have slightly complicated Jared’s approach to allow for fake_number values of varying length:
SQL> select * from detect_numeric 2 order by 1 3 /FAKE_NUMBER——————–000000000001000002000010000011000012000020000021000022
9 rows selected.
SQL> select fake_number 2 from detect_numeric 3 where lpad(’|', length(fake_number), ‘|’) 4 = translate(fake_number,’0123456789′,’||||||||||’) 5 order by 1 6 /FAKE_N——000000000001000002000010000011000012000020000021000022
9 rows selected.SQL>
So far so good. Let’s add another record….
SQL> insert into detect_numeric values (’123.45′) 2 /
1 row created.
SQL> select fake_number 2 from detect_numeric 3 where lpad(’|', length(fake_number), ‘|’) 4 = translate(fake_number,’0123456789′,’||||||||||’) 5 order by 1 6 /FAKE_N——000000000001000002000010000011000012000020000021000022
9 rows selected.
SQL>
Wha’pen? Isn’t 123.45 is a number? Well, no, not in this context. The TRANSLATE() call is only counting digits. Hmmm, obviously we need to allow for decimal points.
SQL> select fake_number 2 from detect_numeric 3 where lpad(’|', length(fake_number), ‘|’) 4 = translate(fake_number,’0123456789.’,'|||||||||||’) 5 order by 1 6 /FAKE_N——000000000001000002000010000011000012000020000021000022123.45
10 rows selected.
SQL>
Problem solved? Not quite. There’s more to being numeric than just comprising digits and decimal points. They have to be in the right quantities and right places. Let’s add an IP address to the mix….
SQL> insert into detect_numeric values (’127.0.0.1′) 2 /
1 row created.
SQL> select fake_number 2 from detect_numeric 3 where lpad(’|', length(fake_number), ‘|’) 4 = translate(fake_number,’0123456789.’,'|||||||||||’) 5 order by 1 6 /FAKE_NUMBER——————–000000000001000002000010000011000012000020000021000022123.45127.0.0.1
11 rows selected.
SQL>
And that’s why we might need a function like IS_NUMERIC(), which wraps a TO_NUMBER call:
SQL> create or replace function is_numeric 2 (p_str in varchar2, p_fmt_msk in varchar2 := null) 3 return varchar2 4 as 5 return_value varchar2(5); 6 n number; 7 begin 8 begin 9 if p_fmt_msk is null then 10 n := to_number(p_str); 11 else 12 n := to_number(p_str, p_fmt_msk); 13 end if; 14 return_value := ‘TRUE’; 15 exception 16 when others then 17 return_value := ‘FALSE’; 18 end; 19 return return_value; 20 end; 21 /
Function created.
SQL> column is_numeric format a10SQL> select fake_number, is_numeric(fake_number) is_numeric 2 from detect_numeric 3 order by 2,1 4 /FAKE_NUMBER IS_NUMERIC——————– ———-127.0.0.1 FALSE000000 TRUE000001 TRUE000002 TRUE000010 TRUE000011 TRUE000012 TRUE000020 TRUE000021 TRUE000022 TRUE123.45 TRUE
11 rows selected.
SQL>
All of which underlines the importance of understanding the data with which we are working. If we just need to assert that a string consists solely of digits then a simple TRANSLATE() will suffice and will be very efficient. But if we need to assert something more precise - that the string contains a valid number - then we may require a slower but more reliable approach.
Incidentally, anyone who is interested in seeing how to use 10g’s regex functionality to winnow numeric strings from non-numeric strings should read this OTN Forum post from CD.