The previous chapter focused on some of the advanced scenarios with the Crystal Enterprise Object Model. Most of these scenarios focused around creating scheduled report jobs and the associated settings. This chapter continues the coverage of advanced Crystal Enterprise Object Model scenarios and covers the following topics:
In Chapter 34, "Introduction to the Crystal Enterprise Professional Object Model," the concept of an InfoStore query was introduced. This is the programmatic entry point into the Crystal Enterprise Repository where all items such as reports, folders, users, and groups are stored. These items are all represented as objects called InfoObjects. From the COM, .NET, and Java object models, a query can be defined that retrieves objects and their corresponding properties. This section covers some of the advanced scenarios when dealing with InfoStore queries.
There is a sample application that comes with Crystal Enterprise 10 that is very useful when working with InfoStore queries. Its called the Query Builder. There is a link to it from the User Launchpad or you can find it at the following location:
The Query Builder application (shown in Figure 36.1) provides a simple text box that you can type InfoStore queries into.
When you click the Submit Query button, it runs the query and displays the results in a simple table. Also, it enables you to type in a user account to use to execute the statement. This is very useful for testing different levels of security. For example, logging on as one user should bring back different reports than logging on as another user. The nice thing is that any valid InfoStore query can be used here, including queries that return objects like users, groups, and servers. Figure 36.2 shows the results of running a query retrieving the name of all reports in the system (SELECT SI_NAME FROM CI_INFOOBJECTS WHERE SI_PROGID=CrystalEnterprise.Report).
This is an indispensable debugging tool for developers. If a query is returning an error through your application, copy and paste it into the Query Builder and make sure it runs successfully there before worrying about any application coding problems.
Its important to understand that when running an InfoStore query, the query command that is passed in is not the actual query that is run against the Crystal Enterprise repository database. The Crystal Enterprise InfoStore query is a meta-query, meaning that it is meant to be simple and high-level but when run, gets translated into a more complex query against the repository, which has a different database schema. The main reason for this kind of design is to abstract the developer away from the actual raw data source and instead provide an application-level interface to get at the underlying information.
Although the InfoStore query is a meta-query and not the actual database query, the syntax and structure of the queries passed in are still a performance factor. There are three basic rules of query optimization:
The first step is to reduce the width of your queries. When looking at example applications and sample code, you often see the SELECT * clause being used in InfoStore queries. Although this is a short and easy way to write a query, the result can be greatly degraded performance. As any developer familiar with basic database concepts understands, the general theory of querying a database is to bring back as little data as possible. The same principle applies here. Performing a "SELECT *" query actually returns upward of 60 properties, and some of those properties even have subproperties. To see this for yourself, run the query SELECT * FROM CI_INFOOBJECTS WHERE SI_PROGID=CrystalEnterprise.Report. You might think you are only bringing back a list of reports, but there is a lot of metadata stored in the Crystal Enterprise repository about each report.
Using "SELECT *" queries would almost certainly result in a large number of unused properties and thus wasted bandwidth. Because performance is always a factor is any large-scale enterprise software deployment, its important to understand which properties you need and only bring those back. For example, the following query only returns the ID and name of reports and folders:
SELECT SI_ID, SI_NAME FROM CI_INFOOBJECTS
For reference, consult the Crystal Enterprise developer documentation and look for the Query Language Reference section. This contains a topic collection called InfoObject properties that provides a description of every single InfoObject property. This is useful for determining which properties are needed as opposed to just using a "SELECT *".
The SI_ID property is always returned, even if it is not specified in the SELECT clause. You don need to ask for it explicitly although its usually a good idea just for clarity.
When looking at reducing the "length" of your queriesthat is, keeping the number of records returned to a minimumyoull need to employ the use of the WHERE and TOP N clauses. These standard SQL constructs work pretty much the way you would expect.
The WHERE clause is used to filter the records returned from the query. Ideally, if you only need to obtain information about a single object, use a WHERE clause such as
If properties for a single object are needed, try to use the WHERE clause to filter on the SI_ID (ID) or SI_NAME (name) of the object. If all reports at a certain level of the folder structure need to be brought back, use the SI_PARENTID property in the WHERE clause. The following sample query returns all reports at the root level of the folder structure:
SELECT SI_NAME FROM CI_INFOOBJECTS WHERE SI_PARENTID=0
The other common type of operation is to bring back all objects of a certain kind, for example, all reports, users, events, and so on. This is where the SI_PROGID property should be used in the WHERE clause, as shown in the following example that returns a list of all usernames:
SELECT SI_NAME FROM CI_SYSTEMOBJECTS WHERE SI_PROGID=CrystalEnterprise.User
In addition to using a WHERE clause to filter the query, a big performance gain can be made by only filtering on indexed properties. The following properties are indexed in the Crystal Enterprise repository database:
These indexed properties are used as actual fields in the Crystal Enterprise repository. All other properties are stored in a single binary data field in the repository. This means if a query is passed to the InfoStore with a WHERE clause using an indexed property, that WHERE clause can be used in the actual repository database query. This is the ideal situation. Alternatively, if a non-indexed property such as SI_DBNEEDLOGON is used, the query against the repository database must be run without a WHERE clause, thus bringing back extra records and then the InfoStore component must make a pass through the records and filter out the unnecessary ones. If there are large numbers of reports, folders, user, or other similar types of objects present in Crystal Enterprise, filtering on a non-indexed property can result in a performance bottleneck.
The TOP N clause is used for limiting the results of a query. The N would be replaced by a number, for example, TOP 10 returns the top 10 records. If no ORDER BY clause is used, the TOP N clause simply returns the first N values it finds. This can be useful for preventing a large query that would affect performance. An example of this is allowing the user to search for a report by name or keyword. If the system were to have several thousand reports, then depending on the search term used, a search could potentially bring back far too many results. This is analogous to performing a Google search for a word like "crystal." It would bring back too many results for a user to really be able to review.
By default, if a TOP N clause is not used, the InfoStore limits the results to 1,000. Effectively, the InfoStore inserts a TOP 1000 clause into all queries. You can determine if the list of objects returned has been truncated due to a TOP N by comparing the number of objects in the result collection (InfoObjects Count property) with the real number of records resulting from the query (InfoObjectss ResultCount property). If the Count is less than the ResultCount, the object collection has been truncated.
If there is a valid reason to run a report and return more than 1,000 objects, you can specify an arbitrary number in the TOP N clause such as TOP 5000. If you want to change the default limit of 1,000 to a different number, lower or higher, modify the InfoStoreDefaultTopNValue Registry key values found in the following location:
The MACHINENAME is the name of the Crystal Enterprise CMS server. This needs to be configured for each instance of the CMS.
The other way the TOP N clause can be used is to rank the objects being returned. This is done by using an ORDER BY statement. The following query returns the name and start time of all scheduled jobs ordered by the start time in descending order.
SELECT SI_NAME, SI_STARTTIME FROM CI_INFOOBJECTS WHERE SI_PROGID=CrystalEnterprise.Report AND SI_INSTANCE_OBJECT=1 ORDER BY SI_STARTTIME DESC
This query essentially provides a log of scheduled reports. However, you can imagine how this list could get very large with a large system with many reports and many schedules. By adding the TOP 25 clause, only the last 25 scheduled reports would be returned.
The final principle to InfoStore query optimization is to reduce the number of overall queries to the database. Typically this is done by using some little-known InfoObject properties that can answer a question and eliminate the need for an additional query to find out that answer.
A typical scenario is to call a recursive function to display reports in the system. Typically the process is started by running a function like the following:
SELECT SI_ID, SI_NAME, SI_PROGID FROM CI_INFOOBJECTS WHERE SI_PARENTID=0
This would return a collection of objects that could be a report, folder, or favorites folder that exist as children of the object with an ID of 0, which is the root folder. After that, a function could be recursively called that would list objects in each subfolder. Sometimes this statement would run and that folder would have no children objects. In this case, that query could have been avoided if the SI_CHILDREN property was brought back. This property is a number representing the number of children objects that exist for a given InfoObject. If the value were 0, there would be no need to run a query to retrieve the children. This property also works for reports and instances. A report with four instances returns an SI_CHILDREN value of four. Its also a nice feature to show this number in the user interface to indicate how many objects are below a certain level.
Another property that is useful for reducing additional queries is the SI_INSTANCE_OBJECT property. It is a Boolean value that indicates whether the current object is a report instance. Because this is an indexed property, it could be used in the WHERE clause to limit the results to only report instances or perhaps the opposite: only report objects.
Thus far in this book, the SI_ID property has been used as the unique identifier of a report. This is valid, but the ID is technically only unique to the current Crystal Enterprise deployment. You might find that a report when moved from the development to production environment will have a different SI_ID. Depending on how your application works, this might be a concern. In a very generic application, the only ID hard-coded anywhere should be the ID of the root folder in Crystal Enterprise, which is always 0. However, sometimes the ID is hard-coded or perhaps stored somewhere outside of Crystal Enterprise. If this is the case, the SI_CUID property can be used as a unique identifier. This property is a globally unique identifier within a Crystal Enterprise cluster, so that when reports are moved from one deployment to another, the SI_CUID property stays the same. Keep in mind that this value is a string instead of a number and contains alphanumeric data, so it needs to be handled appropriately. To see what the SI_CUID values are for your current reports, run the following statement in the Query Builder:
SELECT SI_NAME, SI_CUID FROM CI_INFOOBJECTS WHERE SI_PROGID=CrystalEnterprise.Report
So far the InfoObject properties that have been discussed have been simple string, date, or numeric properties. There are some types of properties that exist in the InfoStore that need to provide more than a single value. For example, to return a list of parameters (prompts) defined in the report, instead of having properties such as SI_PROMPT1, SI_PROMPT2, SI_PROMPT3, and so on there is simply an SI_PROMPTS property that contains information for multiple prompts. Hierarchical properties generally map to an object in the Crystal Enterprise object model, in this case, the ReportParameters collection.
As an example, the SI_FILES property returns a list of files that are stored with an InfoObject. To retrieve the hierarchical properties, SI_FILES must be included in the SELECT clause:
SELECT SI_NAME, SI_FILES FROM CI_INFOOBJECTS WHERE SI_PROGID=CrystalEnterprise.Report
Its values can be accessed via the Files collection attached to the InfoObject. Obviously there is more than one piece of data retrieved to make up the collection, but that is handled behind the scenes. Consult the documentation for a list of query properties and which object collections they map to.
When dealing with report objects specifically, there are two additional categories of properties that are available. They are processing information and scheduling information. Notice they map to the ISchedulingInfo and IProcessingInfo interfaces from the previous chapter on scheduling. These properties are accessed by the SI_PROCESSINFO and SI_SCHEDULEINFO prefixes, respectively. The following example retrieves the name and record selection formula via the SI_NAME and SI_RECORD_FORMULA properties:
SELECT SI_NAME, SI_PROCESSINFO.SI_RECORD_FORMULA FROM CI_INFOOBJECTS WHERE SI_PROGID=CrystalEnterprise.Report
These properties do need to be prefixed or they will not work. There is a full list of these processing info and scheduling info properties in the Crystal Enterprise documentation. Look for the Query Language Reference section. The following is a list of some of the more useful properties.
Processing Info (SI_PROCESSINFO):
Scheduling Info (SI_SCHEDULEINFO):
Based on the material provided so far, you should be able to add searching capability to your application by using the WHERE clause to limit results returned. As an example, the following query would return all reports that are called Sales Report:
SELECT SI_ID FROM CI_INFOOBJECTS WHERE SI_NAME=Sales Report
Although this is useful, the user would need to know the exact name of the report, which somewhat defeats the purpose of the search. This section describes some advanced methods for providing more robust searching within a Crystal Enterprise-based application.
Up until this point in the coverage of InfoStore queries, all conditions used in the WHERE clause have used the = operator. In fact there are several additional types of operators:
These operators are type-sensitive as you would expect from standard database query operators. As an example, the LIKE operator works against string items only, and the equality operators work only against numeric data. Most of these are self-explanatory but some require some additional information.
The LIKE operator is a great way to handle searching for reports by a keyword or partial report name. Its syntax is as follows:
The pattern can contain the following wildcard characters:
As you can see, the LIKE operator is quite powerful and can be used to provide some robust searching. This becomes important as the number of reports, folders, and users grows.
The IN operator is useful for querying for reports from a known list of IDs or names. The syntax for the IN operator is
IN ( , , ...)
Some examples of using the IN operator are shown here:
SELECT SI_NAME FROM CI_INFOOBJECTS WHERE SI_ID IN (223, 732, 442, 334, 743) SELECT SI_ID FROM CI_SYSTEMOBJECTS WHERE SI_PROGID=CrystalEnterprise.User AND SI_NAME IN (jsmith, sbecker, mblouin)
Finally, the BETWEEN operator is a quick way to express a range condition. The syntax is as follows:
Some examples of using the BETWEEN operator are listed here:
SELECT SI_NAME FROM CI_INFOOBJECTS WHERE SI_ID BETWEEN 250 and 260 SELECT SI_ID FROM CI_INFOOBJECTS WHERE SI_PROGID=CrystalEnterprise.Folder AND SI_CHILDREN BETWEEN 1 AND 20
Youve probably figured out so far that when InfoStore queries are run, any properties listed in the SELECT clause are available from the InfoObjects Properties collection. This is a properties() method in the Java object model and a Properties collection in the COM and .NET object models. Assume the following query was run against the InfoStore:
SELECT SI_NAME, SI_DESCRIPTION FROM CI_INFOOBJECTS WHERE SI_PROGID=CrystalEnterprise.Report
At this point, the following JSP code could be used to access the SI_DESCRIPTION property:
IInfoObject infoObject = (IInfoObject) results.get(0); IProperties props = infoObject.properties(); IProperty prop = props.getProperty("SI_DESCRIPTION"); out.println(prop.getValue());
You might have noticed that there is an add method on the IProperties interface. This means you can add any number of properties and subproperties yourself. The following code adds a property called Project to the report, so that reports can be tied to specific projects.
IInfoObject infoObject = (IInfoObject) results.get(0); IProperties props = infoObject.properties(); IProperty newProp = props.add("Project", "Project X", IProperty.DIRTY); iStore.commit(results);
After this custom property is set, a developer can retrieve this property at any time to display to the user. The new property Project can be used in the InfoStore query like this:
SELECT SI_NAME, Project FROM CI_INFOOBJECTS WHERE SI_ID=289
This new property can even be used in the WHERE clause to perform a filter as shown here:
SELECT SI_NAME FROM CI_INFOOBJECTS WHERE Project=Project ABC
Note that custom properties are not indexed so are not optimal to filter on, but can be useful when used appropriately.