The second extension that you will add works well with the filesize function. Given the name of a directory, the filelist function returns a list of all files (and subdirectories) contained in that directory. The filesize function (from the previous example) returns a single value; filelist will return multiple rows. An extension function that can return multiple results is called a set-returning function, or SRF.
When you are finished creating the filelist function, you can use it like this:
movies=# SELECT filelist( '/usr' );
  filelist
------------
 .
 ..
 bin
 dict
 etc
 games
 html
 include
 kerberos
 lib
 libexec
 local
 sbin
 share
 src
 tmp
 X11R6
(17 rows)
In this example, the user has invoked the filelist function only once, but 17 rows were returned. An SRF is actually called multiple times: once for each row it returns, plus one final call. In this case, the filelist() function is called 18 times. The first time through, filelist() does any preparatory work required and then returns the first result. For each subsequent call, filelist() returns another row until the result set is exhausted. On the 18th call, filelist() returns a status that tells the server that there are no more results available.
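The calling pattern described above can be sketched in plain C without any PostgreSQL headers. This is only an illustration of the protocol, not the server's actual API: the names srf_status, srf_state, next_row, count_calls, and demo_rows are hypothetical, and the three-row result set is made up for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for this sketch; these are not PostgreSQL types. */
typedef enum { SRF_ROW, SRF_DONE } srf_status;

typedef struct
{
    const char ** rows;     /* result set prepared on the first call */
    int           count;    /* number of rows in the result set      */
    int           current;  /* index of the next row to return       */
    int           ready;    /* nonzero once the first call has run   */
} srf_state;

/* A made-up result set standing in for a directory listing. */
static const char * demo_rows[] = { ".", "..", "bin" };

/* Called once per row, plus one final call that reports SRF_DONE. */
static srf_status next_row( srf_state * state, const char ** row )
{
    assert( state != NULL && row != NULL );

    if( !state->ready )
    {
        /* First call: do the preparatory work. */
        state->rows    = demo_rows;
        state->count   = 3;
        state->current = 0;
        state->ready   = 1;
    }

    if( state->current < state->count )
    {
        /* Return the next row and advance the position. */
        *row = state->rows[state->current++];
        return SRF_ROW;
    }

    /* Result set exhausted: tell the caller there are no more rows. */
    *row = NULL;
    return SRF_DONE;
}

/* Drives next_row() the way a caller would, counting every call. */
static int count_calls( void )
{
    srf_state    state = { 0 };
    const char * row;
    int          calls = 0;

    do
    {
        calls++;
    } while( next_row( &state, &row ) == SRF_ROW );

    return calls;
}
```

With three rows in the result set, count_calls() makes four calls to next_row(), just as the 17-row listing above requires 18 calls to filelist().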
Like the filesize function, filelist takes a single argument: a directory name in the form of a TEXT value. This function returns a set of TEXT values (SETOF TEXT). Listing 6.3 shows the first part of the filelist.c source file:
Listing 6.3. filelist.c (Part 1)
 1 /*
 2 ** Filename: filelist.c
 3 */
 4
 5 #include "postgres.h"
 6 #include "fmgr.h"
 7 #include "nodes/execnodes.h"
 8
 9 #include
10
11 typedef struct
12 {
13   int              dir_ctx_count;
14   struct dirent ** dir_ctx_entries;
15   int              dir_ctx_current;
16 } dir_ctx;
17
18 PG_FUNCTION_INFO_V1(filelist);
19
filelist.c #includes four header files, the first three of which are supplied by PostgreSQL. postgres.h and fmgr.h provide data type definitions, function prototypes, and macros that you will need to create extensions. The nodes/execnodes.h header file defines a structure (ReturnSetInfo) that you need because filelist returns a set of values. You will use the scandir() function to retrieve the directory contents from the operating system. The fourth header file defines a few data types that are used by scandir(); on POSIX systems, scandir() and struct dirent are declared in <dirent.h>.
Line 11 defines a structure that keeps track of your progress. In the first invocation, you will set up a context structure (dir_ctx) that you can use for each subsequent call. The dir_ctx_count member indicates the number of files and subdirectories in the given directory. The dir_ctx_entries member is a pointer to an array of pointers to struct dirent structures. Each element of this array points to a description of a file or subdirectory. dir_ctx_current keeps track of the current position as you traverse the dir_ctx_entries array.
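To see how the three members work together, here is a minimal standalone sketch that fills a dir_ctx with scandir() and then walks it one entry at a time. It assumes a POSIX system and uses malloc-style cleanup rather than PostgreSQL memory contexts. The helpers dir_ctx_open, dir_ctx_next, and dir_ctx_close are hypothetical names introduced for this illustration, and sorting with alphasort is an assumption; it is not necessarily how filelist() itself is written.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose scandir() and alphasort() */
#endif

#include <assert.h>
#include <dirent.h>
#include <stdlib.h>

typedef struct
{
    int              dir_ctx_count;    /* number of entries found        */
    struct dirent ** dir_ctx_entries;  /* array allocated by scandir()   */
    int              dir_ctx_current;  /* index of the next entry to use */
} dir_ctx;

/* Hypothetical helper: read a directory into ctx. Returns 0 on success. */
static int dir_ctx_open( dir_ctx * ctx, const char * path )
{
    assert( ctx != NULL && path != NULL );

    ctx->dir_ctx_entries = NULL;
    ctx->dir_ctx_current = 0;

    /* No filter (NULL); alphasort is an assumption made for this sketch. */
    ctx->dir_ctx_count = scandir( path, &ctx->dir_ctx_entries, NULL, alphasort );

    if( ctx->dir_ctx_count < 0 )
    {
        ctx->dir_ctx_count = 0;
        return -1;
    }

    return 0;
}

/* Hypothetical helper: return the next entry name, or NULL when exhausted. */
static const char * dir_ctx_next( dir_ctx * ctx )
{
    if( ctx->dir_ctx_current < ctx->dir_ctx_count )
        return ctx->dir_ctx_entries[ctx->dir_ctx_current++]->d_name;

    return NULL;
}

/* Hypothetical helper: free each entry, then the array itself. */
static void dir_ctx_close( dir_ctx * ctx )
{
    for( int i = 0; i < ctx->dir_ctx_count; i++ )
        free( ctx->dir_ctx_entries[i] );

    free( ctx->dir_ctx_entries );

    ctx->dir_ctx_entries = NULL;
    ctx->dir_ctx_count   = 0;
    ctx->dir_ctx_current = 0;
}
```

Each call to dir_ctx_next() plays the role of one invocation of the set-returning function: it hands back one name and advances dir_ctx_current, and it returns NULL once the position reaches dir_ctx_count.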
Line 18 tells PostgreSQL that filelist() uses the version-1 calling convention.
Listing 6.4 shows the filelist() function:
Listing 6.4. filelist.c (Part 2)
20 Datum filelist(PG_FUNCTION_ARGS)
21 {
22   FmgrInfo      * fmgr_info  = fcinfo->flinfo;
23   ReturnSetInfo * resultInfo = (ReturnSetInfo *)fcinfo->resultinfo;
24   text          * startText  = PG_GETARG_TEXT_P(0);
25   int             len        = VARSIZE( startText ) - VARHDRSZ;
26   char          * start      = (char *)palloc( len+1 );
27   dir_ctx       * ctx;
28
29   memcpy( start, startText->vl_dat, len );
30   start[len] = '