Monthly Archives: July 2016

Synthetic data – nice little SQL feature

This blog was inspired by a Twitter conversation between Jeffrey Kemp and Franck Pachot about little SQL tricks.
Countless times I’ve created synthetic data to model a case or just play around in general. For quick and dirty and small volumes I prefer “CONNECT BY”. When doing it properly, I use recursive CTEs.

But this puts a smile on my face every time I look at it.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
select
    item.getNumberVal() num
from xmltable('1 to 10,20,30,40 to 50') item
;
 
       NUM
----------
     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    20
    30
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
 
23 rows selected.

Here’s what I usually start with as a base and then tweak the script to my current needs.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
drop table t1 purge;
create table t1 (
    id number
  , num_small number (5, 0)
  , num_big number
  , vc2_short varchar2(200 byte)
  , vc2_long varchar2(4000 byte)
  , dt date
  , cl clob
)
  tablespace users
  lob (cl) store as securefile t1_cl_seg (
    tablespace users
    disable storage in row
    retention auto
    nocache logging
    chunk 32K)
;
  
insert /*+ append */ into t1 (id, num_small, num_big, vc2_short, vc2_long, dt)
with datagen as (
  select --+ materialize
      rownum as id
  from
      dual
  connect by level <= 1e4
)
select
    rownum as id
  , trunc(dbms_random.value(1, 100)) as num_small
  , round(dbms_random.value(10000, 1000000), 4) as num_big
  , dbms_random.string('L', trunc(dbms_random.value(10, 200))) as vc2_short
  , dbms_random.string('A', trunc(dbms_random.value(200, 4000))) as vc2_long
  , trunc(sysdate + dbms_random.value(0, 366)) as dt
from
    datagen dg1
  , datagen dg2
where
    rownum <= 1e4
;
  
commit;
  
<<populate_lob>>
declare
  num_rows number;
  cl clob;
  chunk_size integer := 1024; -- 1 KB
  max_cl_size integer := 8192; -- 8 KB
begin
  select count(*) cnt
  into num_rows
  from t1
  ;
   
  for row_num in 1 .. num_rows loop
    dbms_lob.createtemporary(cl, false, dbms_lob.call);
  
    for idx in 1 .. trunc(dbms_random.value(1, (max_cl_size / chunk_size))) loop
      dbms_lob.writeappend(cl, chunk_size, dbms_random.string('A', chunk_size));
      null;
    end loop;
  
    update t1
      set t1.cl = populate_lob.cl
    where id = row_num
    ;
    commit;
  
    dbms_lob.freetemporary(cl);
  end loop;
end populate_lob;
/
  
--alter table t1 modify lob (cl) (cache logging);

I hope above little trick amuses you as much as it did me 🙂
Imagine, it made me blog about it!