USEABLE FILE FORMATS


Even if the data is in a format that appears to be one you already use, conversion still may be necessary. The format may be too new. For example, you will not be able to open a Word 97 file if you are using WordPerfect 5.1 or even Word 7.0. The problem is a basic one: when those programs were written, Word 97 did not yet exist, so they do not contain the code needed to read Word 97 files. You will need to find a machine with a word processing package capable of reading Word 97 files. Alternatively, you will need to get a program, such as Word for Word, that can recognize and work with many different file types. In a similar vein, you may have to get the data converted if it comes to you in a format that is too old or that runs on a different operating system.

Finally, you may encounter problems of the WordPerfect-versus-Word kind. Although simpler files created with one company’s software generally can be opened without a problem using a competitor’s comparable product, this often does not hold true for more complex files. Thus, Word documents formatted using “styles” or containing complex tables may not be fully readable by WordPerfect (the same holds true when going from WordPerfect to Word).






UNUSABLE FILE FORMATS

You may get electronic data in a format that you cannot use “out of the box.” When that happens, you have to convert the files to a format you can use or find someone to do the conversion for you. You may have already encountered these issues with a variety of files, including electronic mail files, database files from mainframe systems, and “.txt” files containing data dumped from database files. Anyone who has undertaken this task can attest that it is potentially a difficult and painstaking process.

Whenever you suspect that you will have to convert data, there are some steps you can take to facilitate the process. Initially, try to get as much information about how the files were created and maintained as you can. Whether you intend to try the conversion yourself or rely on outside resources to get the work done, the more you know about the files, the better your chances of a successful conversion. For example, if you receive a “.txt” file that appears to contain information from a database file, try to find out, among other things, the make and model of the computer the file came from; the name and version of the operating system the computer ran; the name and version of the database program used; the name of the database file; a list of all fields in the database; and descriptions of each field with the descriptions including the type, length, and other characteristics of the field.

Furthermore, get sample printouts if possible. If you get these, they may provide answers to some of the questions previously listed. They may show how the data was laid out—and, hence, how it was used. They also may give clues about electronic data that you should have received, but did not.






CONVERTING FILES

If you are going to attempt converting the data yourself, you may be fortunate enough to have received electronic data that you can convert directly into programs such as Access or Excel using the Wizards built into those programs. This can be the case with “.txt” files. Sometimes the first row in a file you are converting may even contain the names of the fields that need to be created, further simplifying your task. If that information is not in the file itself, then try to get the field names and descriptions from the producing party. Should you fail at that, you may have an exceedingly difficult time carrying out a meaningful conversion.
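As a rough illustration of the conversion described above, the following Python sketch reads a delimited “.txt” file into records. It assumes a tab-delimited layout; the function name load_delimited_text and the sample field names are hypothetical and do not come from any particular production. When the first row holds the field names, they are used as-is; otherwise the names obtained from the producing party can be passed in.

```python
import csv
import io


def load_delimited_text(text, delimiter="\t", field_names=None):
    """Parse delimited text into a list of dicts keyed by field name.

    If field_names is None, the first row is treated as the header row.
    Otherwise the supplied names are used and every row is treated as data.
    """
    reader = csv.DictReader(
        io.StringIO(text), fieldnames=field_names, delimiter=delimiter
    )
    return [dict(row) for row in reader]


# Hypothetical sample: a tab-delimited dump whose first row names the fields.
sample = (
    "InvoiceNo\tCustomer\tAmount\n"
    "1001\tAcme\t250.00\n"
    "1002\tBaker\t75.50\n"
)
records = load_delimited_text(sample)
print(records[0]["Customer"])  # Acme
```

All values come back as text; deciding which fields are numbers or dates still depends on the field descriptions you were able to obtain.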

Sometimes data will not be in a format amenable to immediate conversion. E-mail files are a common example.

Get the Right Software, Hardware, and Personnel

Concomitant with getting the data into a useable format is getting the right software, hardware, and personnel to work with the format you choose. For software, you may have already found that Access, Excel, and Concordance meet most of your needs, but there are, of course, a plethora of other good tools available.

Hardware requirements will vary greatly depending on specific circumstances. Ten kilobytes of data can be handled by most any machine and across most any network. Ten gigabytes, however, pose substantial challenges in terms of hard drive space, back-ups, network traffic, and, for that matter, performance. When faced with data of that quantity, you need to set up dedicated machines that do not pass queries or results across your network.

Personnel requirements present the greatest challenge. If you are going to make sense of the electronic data you have received, converted, and loaded, you need to know how to use the tools yourself, or, failing that, rely on someone who can use the tools for you. As previously discussed, you may already have the personnel you need in your own office or you may have to turn to outside resources. Also, once you are in a position to work with the electronic data you got from the other side, check that the data is what it ought to be.

Did You Get All the Data?

Check to see whether you received all the data you should have received. Prepare an inventory of what you received and compare it against what you requested. This may be as simple as preparing and comparing lists of file names. More likely, however, it will require that you develop short descriptions of the data you received and then match the descriptions with your discovery requests. It may even mean that you will have to closely analyze the data to see whether gaps emerge that indicate the producing party failed to produce all that it ought to have produced.
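For the simplest case, comparing lists of file names, a short Python sketch along these lines may help. The function name compare_inventory and the sample file names are hypothetical, and the comparison assumes that names differing only in capitalization or surrounding spaces refer to the same file.

```python
def compare_inventory(requested, received):
    """Return (missing, unexpected) file names, compared case-insensitively."""
    requested_set = {name.strip().lower() for name in requested}
    received_set = {name.strip().lower() for name in received}
    missing = sorted(requested_set - received_set)      # asked for, not produced
    unexpected = sorted(received_set - requested_set)   # produced, not asked for
    return missing, unexpected


# Hypothetical lists of requested and received files.
requested = ["Budget.xls", "Memo.doc", "Sales.mdb"]
received = ["memo.doc", "sales.mdb", "notes.txt"]
missing, unexpected = compare_inventory(requested, received)
print(missing)     # ['budget.xls']
print(unexpected)  # ['notes.txt']
```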

You also can search the electronic data for references to electronic files that should have been given to you, but were not. This can be done through a manual review. The manual review can be enhanced if the software you are using to review the data allows you to search for strings of characters. If it does, you can search for filename extensions that are typically associated with the types of files you want to find. Examples include .doc, .htm, .html, .htx, .rtf, .mcw, .txt, .wps, and .wpd for word processing files; .csv, .dbf, .dif, .txt, .wk1, .wk3, .wk4, .wks, .wq1, .xls, and .xlw for spreadsheet files; and .asc, .csv, .dbe, .dbf, .htm, .html, .mda, .mdb, .mde, .mdw, .tab, .txt, and .xls for database files.
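If your review software does not support string searches, the same idea can be sketched in a few lines of Python. The extension list below is only a subset of those named above, the function name find_file_references is hypothetical, and the pattern assumes file names made of letters, digits, underscores, and hyphens; names containing spaces would need a different pattern.

```python
import re

# A subset of the extensions listed in the text; extend as needed.
EXTENSIONS = ["doc", "htm", "html", "rtf", "txt", "wpd", "csv", "dbf", "xls", "mdb"]

# Matches a name made of word characters or hyphens, a dot, then a known extension.
FILE_REFERENCE = re.compile(
    r"\b[\w\-]+\.(?:" + "|".join(EXTENSIONS) + r")\b",
    re.IGNORECASE,
)


def find_file_references(text):
    """Return the distinct file names mentioned in text, lowercased and sorted."""
    return sorted({match.group(0).lower() for match in FILE_REFERENCE.finditer(text)})


# Hypothetical e-mail text that mentions other files.
message = (
    "Please see Q3_forecast.xls and the attached memo-final.doc. "
    "The numbers came from SALES.MDB."
)
print(find_file_references(message))
# ['memo-final.doc', 'q3_forecast.xls', 'sales.mdb']
```

The resulting names can then be checked against your inventory of what was actually produced.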

If you received spreadsheet or database files in their native format, you can scrutinize them for signs of links to files that were used in connection with the files you got, but nonetheless were not given to you. In a spreadsheet file such as an Excel file, this might mean searching the cells for extensions such as the ones previously listed. It also can mean checking the “properties.” If you are asked whether you want to reestablish a link when you open the file, that is a clear sign of potentially missing files; keep track of the file names and check to see whether you received them. In a database file such as an Access file, this means closely examining all tables, queries, forms, reports, macros, and modules for references to other files.

Did the Evidence Come from the People You Thought It Would?

Files often contain indications as to who created them, who worked on them, and who last saved them. If you go to File | Properties, you can sometimes find this information.

Look for “Hidden” Data

Electronic files often contain “hidden” data (information that does not show up on any printouts of the file) that can potentially prove useful. You should go to File | Properties, where you may be able to find out a host of details about the file that the people sending it to you may never have known went with it. These can include: when the file was created; when it was last modified; who created it; what comments have been added; what title was given to the file, whether intentionally or automatically; which subjects have been assigned to the file; who last saved the file; and how many revisions the file has gone through.

In word processing files, look for comments that display on the screen, but do not automatically print out. If there are tables containing numbers, check them for a formula that calculates the figures displayed in the tables. If there are objects embedded in the word processing file, such as portions of spreadsheet files, try to ascertain the names of source files.

In spreadsheet files, look at the formulas; these show the true work being done by the spreadsheet file in a way that a printout never can. Check the formulas for references to other files. Look for hidden columns. If the column listing across the top goes “A B C E H,” that means that there are at least three hidden columns (D, F, and G) that might contain information of greater value than anything shown. Watch for comments; in Excel, these may initially only show up as small red triangles at the upper right corners of cells. Beware of cells that appear to be empty, but are not.
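The hidden-column check can be expressed as a small Python sketch that takes the column letters visible across the top and reports the gaps. The function names are hypothetical. Note that this only finds columns hidden between the first and last visible ones, which is why the count is a minimum.

```python
def column_index(label):
    """Convert a spreadsheet column label such as 'A' or 'AB' to a 1-based index."""
    index = 0
    for ch in label.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def column_label(index):
    """Convert a 1-based column index back to its label (1 -> 'A', 27 -> 'AA')."""
    label = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        label = chr(ord("A") + remainder) + label
    return label


def missing_columns(visible_labels):
    """Return labels absent between the first and last visible columns."""
    visible = sorted(column_index(label) for label in visible_labels)
    if not visible:
        return []
    shown = set(visible)
    return [
        column_label(i)
        for i in range(visible[0], visible[-1] + 1)
        if i not in shown
    ]


print(missing_columns("A B C E H".split()))  # ['D', 'F', 'G']
```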

In database files, look for an explanation of field names or contents; in Access, you might find this by looking at the database tables in “design” mode. Look for links to files you did not receive; in Access, this might be indicated by small arrows to the left of the table icons. Look for tables, queries, forms, reports, macros, and modules that you did not know about. In tables, look for hidden fields.

Test the Data

Test the electronic data to determine how complete, accurate, and reliable it is. You can test the data against itself. Look for inconsistencies. Look for errors as well.
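What testing the data against itself looks like depends on the data. As one hedged illustration, the Python sketch below flags two kinds of internal inconsistency: record identifiers that appear more than once, and records paid before they were invoiced. The field names id, invoiced, and paid, and the function name find_inconsistencies, are hypothetical; substitute whatever rules your own data ought to obey.

```python
from collections import Counter
from datetime import date


def find_inconsistencies(records):
    """Return plain-language descriptions of internal inconsistencies."""
    problems = []

    # An identifier that should be unique but appears more than once.
    id_counts = Counter(record["id"] for record in records)
    for record_id, count in id_counts.items():
        if count > 1:
            problems.append(f"id {record_id} appears {count} times")

    # A payment dated earlier than the invoice it supposedly pays.
    for record in records:
        if record["paid"] < record["invoiced"]:
            problems.append(f"id {record['id']} paid before it was invoiced")

    return problems


# Hypothetical records converted from a produced database.
records = [
    {"id": 1, "invoiced": date(2003, 1, 10), "paid": date(2003, 2, 1)},
    {"id": 2, "invoiced": date(2003, 3, 5), "paid": date(2003, 2, 20)},
    {"id": 2, "invoiced": date(2003, 4, 1), "paid": date(2003, 4, 15)},
]
for problem in find_inconsistencies(records):
    print(problem)
```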

Where feasible, the electronic data can be compared to underlying documents, again to determine the completeness, accuracy, and reliability of the data. This comparison can highlight coding errors made when creating the database such as wrong numbers, dates, and names. It also can reveal categories of information that were not added to the electronic data, which if they had been added, would have affected the results one obtains by searching the data. Just as electronic data can be compared to underlying documents, so also can it be compared to data in other electronic files, the contents of other documents, and information available through the Internet.

Work the Evidence

Examples of how one can work with the other side’s electronic data are offered in the preceding paragraphs, and what one can do is limited more by one’s imagination than by anything else. That said, several general recommendations can be offered. Put the data into tools you can use. Spreadsheet programs allow one to perform calculations, prepare pivot tables that quickly summarize data across several dimensions, develop charts that graphically present trends in the data, and map out information geographically. Database programs permit one to search or query the databases in complex and subtle ways, perform calculations, and generate a broad range of reports. Finally, share the data you receive, and the knowledge you glean from it, with your client, experts, and other colleagues as appropriate; working together to reconstruct past events can help you handle your case more effectively.
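To show what summarizing data across two dimensions amounts to, here is a minimal Python sketch of a pivot-style total. It is not how any spreadsheet program implements pivot tables; the function name pivot_sum and the sample fields region, year, and amount are hypothetical.

```python
from collections import defaultdict


def pivot_sum(rows, row_key, column_key, value_key):
    """Total value_key for each combination of row_key and column_key."""
    table = defaultdict(lambda: defaultdict(float))
    for row in rows:
        table[row[row_key]][row[column_key]] += row[value_key]
    # Convert to plain dicts so the result prints and compares cleanly.
    return {r: dict(columns) for r, columns in table.items()}


# Hypothetical sales records.
rows = [
    {"region": "East", "year": 2002, "amount": 100.0},
    {"region": "East", "year": 2002, "amount": 50.0},
    {"region": "East", "year": 2003, "amount": 75.0},
    {"region": "West", "year": 2003, "amount": 20.0},
]
summary = pivot_sum(rows, "region", "year", "amount")
print(summary["East"][2002])  # 150.0
```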


