It is common to provide simple counter-type values for tables that don't have a natural or efficient primary key. Columns such as cust_id are usually simple counter fields. The IDENTITY property makes generating unique numeric values easy. IDENTITY isn't a data type; it's a column property that you can declare on a whole-number data type such as tinyint, smallint, int, bigint, or numeric/decimal (with which only a scale of zero makes any sense). Each table can have only one column with the IDENTITY property. The table's creator can specify the starting number (seed) and the amount that this value increments or decrements. If not otherwise specified, the seed value starts at 1 and increments by 1, as shown in this example: CREATE TABLE customer ( cust_id smallint IDENTITY NOT NULL, cust_name varchar(50) NOT NULL ) To find out which seed and increment values were defined for a table, you can use the IDENT_SEED(tablename) and IDENT_INCR(tablename) functions. Take a look at this statement: SELECT IDENT_SEED('customer'), IDENT_INCR('customer') It produces the following result for the customer table because values weren't explicitly declared and the default values were used. 1 1 This next example explicitly starts the numbering at 100 (seed) and increments the value by 20: CREATE TABLE customer ( cust_id smallint IDENTITY(100, 20) NOT NULL, cust_name varchar(50) NOT NULL ) The value automatically produced with the IDENTITY property is normally unique, but that isn't guaranteed by the IDENTITY property itself, nor are the IDENTITY values guaranteed to be consecutive. (I will expand on the issues of non-unique and non-consecutive IDENTITY values later in this section.) For efficiency, a value is considered used as soon as it is presented to a client doing an INSERT operation. If that client doesn't ultimately commit the INSERT, the value never appears, so a break occurs in the consecutive numbers. An unacceptable level of serialization would exist if the next number couldn't be parceled out until the previous one was actually committed or rolled back. (And even then, as soon as a row was deleted, the values would no longer be consecutive. Gaps are inevitable.) Note
To temporarily disable the automatic generation of values in an identity column, you use the SET IDENTITY_INSERT tablename ON option. In addition to filling in gaps in the identity sequence, this option is useful for tasks such as bulk-loading data in which the previous values already exist. For example, perhaps you're loading a new database with customer data from your previous system. You might want to preserve the previous customer numbers but have new ones automatically assigned using IDENTITY. The SET option was created exactly for cases like this. Because the SET option allows you to determine your own values for an IDENTITY column, the IDENTITY property alone doesn't enforce uniqueness of a value within the table. Although IDENTITY generates a unique number if IDENTITY_INSERT has never been enabled, the uniqueness is not guaranteed once you have used the SET option. To enforce uniqueness (which you'll almost always want to do when using IDENTITY), you should also declare a UNIQUE or PRIMARY KEY constraint on the column. If you insert your own values for an identity column (using SET IDENTITY_INSERT), when automatic generation resumes, the next value is the next incremented value (or decremented value) of the highest value that exists in the table, whether it was generated previously or explicitly inserted. Tip
The keyword IDENTITYCOL automatically refers to the specific column in a table that has the IDENTITY property, whatever its name. If that column is cust_id, you can refer to the column as IDENTITYCOL without knowing or using the column name or you can refer to it explicitly as cust_id. For example, the following two statements work identically and return the same data: SELECT IDENTITYCOL FROM customer SELECT cust_id FROM customer The column name returned to the caller is cust_id, not IDENTITYCOL, in both cases. When inserting rows, you must omit an identity column from the column list and VALUES section. (The only exception is when the IDENTITY_INSERT option is on.) If you do supply a column list, you must omit the column for which the value will be automatically supplied. Here are two valid INSERT statements for the customer table shown earlier: INSERT customer VALUES ('ACME Widgets') INSERT customer (cust_name) VALUES ('AAA Gadgets') Selecting these two rows produces this output: cust_id cust_name ------- --------- 1 ACME Widgets 2 AAA Gadgets (2 row(s) affected) In applications, it's sometimes desirable to immediately know the value produced by IDENTITY for subsequent use. For example, a transaction might first add a new customer and then add an order for that customer. To add the order, you probably need to use the cust_id. Rather than selecting the value from the customer table, you can simply select the special system function @@IDENTITY, which contains the last identity value used by that connection. It doesn't necessarily provide the last value inserted in the table, however, because another user might have subsequently inserted data. If multiple INSERT statements are carried out in a batch on the same or different tables, the variable has the value for the last statement only. In addition, if there is an INSERT trigger that fires after you insert the new row and if that trigger inserts rows into a table with an identity column, @@IDENTITY will not have the value inserted by the original INSERT statement. To you, it might look like you're inserting and then immediately checking the value: INSERT customer (cust_name) VALUES ('AAA Gadgets') SELECT @@IDENTITY However, if a trigger was fired for the INSERT, the value of @@IDENTITY might have changed. There are two other functions that you might find useful when working with identity columns: SCOPE_IDENTITY and IDENT_CURRENT. SCOPE_IDENTITY returns the last identity value inserted into a table in the same scope, which could be a stored procedure, trigger, or batch. So if we replace @@IDENTITY with the SCOPE_IDENTITY function in the preceding code snippet, we can see the identity value inserted into the customer table. If an INSERT trigger also inserted a row that contained an identity column, it would be in a different scope: INSERT customer (cust_name) VALUES ('AAA Gadgets') SELECT SCOPE_IDENTITY() In other cases, you might want to know the last identity value inserted in a specific table from any application or user. You can get this value using the IDENT_CURRENT function, which takes a table name as an argument: SELECT IDENT_CURRENT('customer') This doesn't always guarantee that you can predict the next identity value to be inserted because another process could insert a row between the time you check the value of IDENT_CURRENT and the time you execute your INSERT statement. You can't define the IDENTITY property as part of a UDT, but you can declare the IDENTITY property on a column that uses a UDT. A column that has the IDENTITY property must always be declared NOT NULL (either explicitly or implicitly); otherwise, error number 8147 will result from the CREATE TABLE statement and CREATE won't succeed. Likewise, you can't declare the IDENTITY property and a DEFAULT on the same column. To check that the current identity value is valid based on the current maximum values in the table, and to reset it if an invalid value is found (which should never be the case), use the DBCC CHECKIDENT(tablename) statement. Identity values are fully recoverable. If a system outage occurs while insert activity is taking place with tables that have identity columns, the correct value is recovered when SQL Server is restarted. SQL Server does this during the checkpoint processing by flushing the current identity value for all tables. For activity beyond the last checkpoint, subsequent values are reconstructed from the transaction log during the standard database recovery process. Any inserts into a table that have the IDENTITY property are known to have changed the value, and the current value is retrieved from the last INSERT statement (post-checkpoint) for each table in the transaction log. The net result is that when the database is recovered, the correct current identity value is also recovered. In rare cases, the identity value can get out of sync. If this happens, you can use the DBCC CHECKIDENT command to reset the identity value to the appropriate number. In addition, the RESEED option to this command allows you to set a new starting value for the identity sequence. See the online documentation for complete details. |