SQL*Plus is commonly used to extract data from an ORACLE database, by simply spooling query results out to a file. Generally by CSV, SQL*Plus extracts can be used for reporting data or viewing meta-data. There are various limitations to extracting to basic CSV however, one particular one being CLOBs; a good way to partially resolve this and extract at least some data from a CLOB is by outputting to an Excel spreadsheet file.
Wrapper Script
For the purpose of this article, I’ll use the example of extracting some log data from a “TBL_LOG” table for a given timeframe. The base script for this is fairly simple, and would follow the same layout of any other SQL*Plus CSV extract, something along the lines of the following:
#!/bin/bash _date=`date '+%Y%m%d-%H%M%S'` _start="201602151100" _end="201602162000" sqlplus ${USERNAME}/${PASSWORD}@${ORACLE_SID} @extractLog.sql TBL_LOG_${_date}_${_start}-${_end}.xls ${_start} ${_end}
As usual, the script should be redesigned to allow arguments giving the start and end date/time, but the idea is that this would extract every log entry from the start time to the end time using an associated SQL*Plus sql file.
SQL Script
Again, a very simple log table to extract data, the log_timestamp
field being between the arguments given.
SET PAGES 0 SET FEEDBACK OFF SET TERMOUT OFF SET SERVEROUTPUT OFF SET LINESIZE 32767 SET LONG 32767 SET LONGCHUNKSIZE 32767 SET TRIMOUT ON SET TRIMSPOOL ON SET ECHO OFF SET MARKUP HTML ON SPOOL ON SPOOL &1 SELECT /*+ parallel(tbl_log,32) */ ''''||to_char(tbl_log.log_timestamp,'yyyy-mm-dd hh24:mi:ss.ff')||'''' as log_timestamp_formatted, qlog.* FROM tbl_log where TO_TIMESTAMP(tbl_log.log_timestamp) between TO_TIMESTAMP('&2','YYYYMMDDHH24MI') and TO_TIMESTAMP('&3','YYYYMMDDHH24MI') ORDER BY tbl_log.log_timestamp DESC / SPOOL OFF SET MARKUP HTML OFF SET ECHO ON
Using TO_TIMESTAMP
to explicitly define the date/time format is good practice, as often the NLS_DATE_FORMAT
of a database is poorly defined on setup and protected within the SPFILE (in other words, if you don’t have access to set the NLS_DATE_FORMAT
yourself, always TO_TIMESTAMP
or TO_DATE
dates and times). This is also why in the above query the log_timestamp_formatted
field is defined and output with string qualifiers, to ensure that we have a good-quality date coming out for every record. Obviously if your NLS settings are good, this is less important.
Exporting to Excel
The above script will export that data to an Excel .xls file, the key parameter is SET MARKUP HTML ON
which will output to HTML which can then be read by Excel. Breaking the SQL down, we have:
SET PAGES 0 SET FEEDBACK OFF SET TERMOUT OFF SET SERVEROUTPUT OFFStandard settings to block feedback and outputs.
SET LINESIZE 32767 SET LONG 32767 SET LONGCHUNKSIZE 32767Set
LINESIZE
to it’s maximum,LONG
(the parameter that defines the length of CLOBs) to a suitable number (the maximum for this value is 2 billion), andLONGCHUNKSIZE
(the amount of data grabbed at a time (per chunk) for a CLOB) to a suitable number. TheLONG
value is important here to grab the CLOBs.
SET TRIMOUT ON SET TRIMSPOOL ON SET ECHO OFF SET MARKUP HTML ON SPOOL ONTrim the lines, turn echo off, add the
SET MARKUP HTML ON
key parameter, then turn on spooling.
SPOOL &1 SELECT /*+ parallel(tbl_log,32) */ ''''||to_char(tbl_log.log_timestamp,'yyyy-mm-dd hh24:mi:ss.ff')||'''' as log_timestamp_formatted, qlog.* FROM tbl_log where TO_TIMESTAMP(tbl_log.log_timestamp) between TO_TIMESTAMP('&2','YYYYMMDDHH24MI') and TO_TIMESTAMP('&3','YYYYMMDDHH24MI') ORDER BY tbl_log.log_timestamp DESC /This is obviously our actual SQL, again make sure to cast timestamps to avoid any NLS issues where necessary.
SPOOL OFF SET MARKUP HTML OFF SET ECHO ONEnd with turning off the spool, turning off the
SET MARKUP HTML
, and turning back on the echo.
The output of this will be a workable Excel .xls file, complete with CLOBs. It’s common for Excel to report an error on trying to open the file due to compatibility, but the content of the file will still be fine.