The TABLE data type is a specialized data type that has some similarities to a table within a database. It's more like a temporary table than a permanent table, as it is created on the fly and has a limited life span.
Two types of UDF definitions exist – single statement table function and multi-statement table function. As the name suggests, a single statement table function doesn't have a function body and the whole SELECT statement returning the data table is placed within the RETURN statement.
The second type is more like other UDFs, where we wish to complete processing before actually returning the data. This data type is not unique to functions and can be found in stored procedures or inline (as a part of) T-SQL (statements). By defining the TABLE data type as a local variable we can populate it from within any T-SQL statement, just as we would do to any other table.
The TABLE data type provides a rowset of data, as you might find from a temporary table, or even from a view. When used within a function it can provide a predefined subset of data for a query.
Important | Temporary tables, such as those defined within stored procedures or ad-hoc T-SQL, cannot be defined within UDFs. |
To demonstrate the TABLE data type in action, let's create a function returning a set of unshipped orders from the Northwind database. The function receives a customer ID and returns one row per order date, required date, and shipping company.
Note | When coding a function such as this, it is advisable to check the T -SQL first, including checking the execution plan, before placing it into a function. |
CREATE FUNCTION fn_CustUnShippedOrders (@CustId NCHAR(5)) RETURNS TABLE AS RETURN (SELECT o.OrderDate, o.RequiredDate, s.CompanyName AS 'Shipper', ROUND(SUM((UnitPrice * Quantity) - Discount),2) AS 'Order Value' FROM dbo.Orders o JOIN dbo.[Order Details] d ON o.OrderId = d.OrderId JOIN dbo.Shippers s ON o.ShipVia = s.ShipperId WHERE o.ShippedDate IS NULL AND o.CustomerId = @CustId GROUP BY customerid, o.OrderDate, o.RequiredDate, s.CompanyName)
We can execute the above statements as if we were returning data from a view, or a table.
This following statement should return two rows:
SELECT OrderDate, RequiredDate, Shipper, [Order Value] FROM fn_CustUnShippedOrders('ERNSH')
The output is:
OrderDate RequiredDate Shipper Order Value --------------------------------------------------------- ------- 1998-04-08 00:00:00.000 1998-05-06 00:00:00.000 Federal Shipping 4903.3999999999996 1998-05-05 00:00:00.000 1998-06-02 00:00:00.000 United Package 5218.0
Please note that we have formatted this output to fit it onto our page.
The above function is written in such a way that it assumes the input value is always known and is passed in through the parameter. Suppose we wanted to cater for a specific customer or a list of all customers who have unshipped orders; in other words, what if we didn't know the customer ID (that is, the customer ID is NULL)?
The following example demonstrates this situation. First, we need to modify the function so that it can work with either a valid or a NULL customer ID. Also, in the previous example we did not list the CustomerId column, as it wasn't needed. If we call the function with NULL then we need to know what customer each row relates to, so here we have added an extra column to the TABLE data type being returned:
CREATE FUNCTION fn_CustUnShippedOrders (@CustId NCHAR(5)) RETURNS TABLE AS RETURN (SELECT o.CustomerId, o.OrderDate, o.RequiredDate, s.CompanyName AS 'Shipper', ROUND(SUM((UnitPrice * Quantity) - Discount),2) AS 'Order Value' FROM dbo.Orders o JOIN dbo.[Order Details] d ON o.OrderId = d.OrderId JOIN dbo.Shippers s ON o.ShipVia = s.ShipperId WHERE o.ShippedDate IS NULL AND (@CustId IS NULL OR o.CustomerId = @CustId) GROUP BY o.CustomerId, o.OrderDate, o.RequiredDate, s.CompanyName)
Note | Note tha, the execution plan for this function will differ depending on whether a customer ID is entered or not. |
In the above example, we simply received a parameter and then returned a table of information within the RETURN statement, but things change when more processing is involved. If we have to perform more processing, within the function, prior to returning the table data, then the function is known as a multi-statement table function.
Multi-statement table functions are quite similar to stored procedures, in that you can process any information and use any functions within the T-SQL statements. The difference between a single statement function and multi-statement function is that we have to define a table variable, as part of the RETURNS statement, and then place data into that table, which is then returned.
Important | Remember that within a function we cannot define temporary tables other than the table variable itself. |
It is possible to create a table within a function, work with it as if it were a temporary table, and then pass the rowset out of the function from the defined table variable. We come across this in a multi-statement table function rather than a single-statement table function.
The syntax for a multi-statement table function is:
CREATE FUNCTION name (@parametername datatype[(datalength)][,...]) RETURNS @tablename TABLE (column datatype[(datalength)][, ...]) AS BEGIN ... -- There must be some sort of population of @tablename in here RETURN END
We would define the table in the RETURNS statement itself, as you can see in our altered procedure shown below. This method is quite a bit slower, when compared to our last example. This comes from building the table and then having to insert data, instead of the server creating the table on the fly.
Important | Unless it's necessary to have a multi-statement table function avoid using them; whenever possible, aim for a single statement function. |
Now, let's look at the implementation of the fn_CustUnShippedOrders function by using multi-statement tables:
CREATE FUNCTION fn_CustUnShippedOrders (@CustId NCHAR(5)) RETURNS @UnShipped TABLE (CustomerId NCHAR(5), OrderDate DATETIME NULL, RequiredDate DATETIME NULL, Shipper NVARCHAR(40), [Order Value] MONEY ) AS BEGIN INSERT INTO @UnShipped SELECT o.CustomerId, o.OrderDate, o.RequiredDate, s.CompanyName AS 'Shipper', ROUND(SUM((UnitPrice * Quantity) - Discount),2) AS 'Order Value' FROM dbo.Orders o JOIN dbo.[Order Details] d ON o.OrderId = d.OrderId JOIN dbo.Shippers s ON o.ShipVia = s.ShipperId WHERE o.ShippedDate IS NULL AND (@CustId IS NULL OR o.CustomerId = @CustId) GROUP BY customerid,o.OrderDate, o.RequiredDate, s.CompanyName RETURN END
This function inserts the data into the table variable, as we would with any temporary table. However, the link between this table variable and its return to the calling T-SQL is through the RETURNS statement of the CREATE FUNCTION statement.
An example of using the above function for joining data from the multi-statement table UDF is given below. We perform a JOIN operation on the CustomerId column from both the function's table variable and the Customer table to give the relevant result:
SELECT c.CompanyName, f.OrderDate, f.RequiredDate, f.Shipper, f.[Order Value] FROM dbo.Customers c JOIN fn_CustUnShippedOrders(NULL) f ON f.CustomerId = c.CustomerId
Multi-statement table functions can help stored procedures to remove the need for creating a temporary table within the calling procedure. Consider the above example; if this processing was being done in several stored procedures by using a temporary table, it would be better to create a single function and place all the code within this, to reduce the amount of duplicate code. It would also give us one central code repository for carrying out any changes.
As we can see from the table definition, there are many similarities to building any other table, such as allowing NULL values. It is also possible to place keys and CHECK and DEFAULT value constraints on columns, but it is not possible to place a foreign key reference to this table from any other table.
Important | Just like any other local variable, the table is scoped to the function in which it is defined. Indexes cannot be added at a later stage, once the function has returned the data. |
As we mentioned earlier, there are in-built functions that return a scalar value, but there are also in-built functions that return a TABLE data type. It is easy to distinguish them, as they all have the prefix fn_. These functions are:
fn_helpcollations
This lists all collations supported by SQL Server
fn_listextendedproperty
If you use extended properties in your database, then you can use this function to return this information in a table
fn_servershareddrives
This function lists the names of shared drive on the clustered server
fn_trace_geteventinfo
When using tracing, this function will return information about the events
fn_trace_getfilterinfo
When using tracing, this function will return information about the filters in place
fn_trace_getinfo
If you are using tracing, this function will return information about it, such as the options, file name of the trace, current trace status and so on
fn_trace_gettable
This function returns trace information in a table format
fn_virtualfilestats
This provides Input/Output statistics for database and transaction log files
fn_virtualservernodes
This lists all the nodes on which the virtual server can run
Working with functions such as these is a bit different to scalar system functions. It is necessary to prefix the function name with a double colon mark ::. For example, here's how we can use the fn_helpcollations function:
SELECT * FROM ::fn_helpcollations()
This statement will return a list of all the collations that are available in SQL Server, along with a description of each collation. It might have been more useful if the function could either take a NULL parameter to list all the functions, or a collation identifier so that you could list either a specific collation or even a subset of collations based on the input. This is where not being able to work with the source has a downfall, and we have to write our own function for doing this. However, you could write your own wrapper around this function to complete the filtering of the collations.