In this section, you learn the fundamentals of gnuplot and, after some simple examples, move to PostgreSQL and see how it can be combined with gnuplot to generate database-driven graphics. To start gnuplot, use the following: [hs@ duron hs]$ gnuplot -background white -background white makes sure that the background of the graphs we want to generate is white. After starting gnuplot, we will be in an interactive shell. Now we can draw graphs by using the plot command: gnuplot> plot tanh(x) - cos(x) We want to see the graph of tanh(x) - cos(x) ( tanh is tangens hyperbolicus; cos is cosinus). Figure 20.1 shows the result. Figure 20.1. A simple mathematical function.
gnuplot finds the best interval for the graph so that only the "useful" part is displayed. gnuplot provides a large number of functions; you explore the most important ones in this section. Let's get back to PostgreSQL. We have compiled a small table storing incomes for certain years . We will use this table for the next example: mygnuplot=# SELECT * FROM income; year income ------+-------- 1997 24000 1998 26300 1999 26000 2000 29000 2001 32100 2002 32000 2003 32700 (7 rows) The data shown does not have the right format to be used for gnuplot directly; we have to use some parameters when retrieving the data from the table. In this example, we want only the first column to be displayed, so we select only the first column from the table: [hs@duron hs]$ psql -c "SELECT income FROM income" -A -F " " -t mygnuplot 24000 26300 26000 29000 32100 32000 32700 The data shown already fits our demands. Let's look at the command-line parameters: -A tells PostgreSQL to use the unaligned table output mode. -F defines the field separator ”this option is useful only when multiple columns are returned by the query. -t makes sure that the header is silently omitted. Now that we have written a shell command, which returns the data in the way we need it, we have to find a way to pass it to gnuplot in order to draw the graph. Luckily, gnuplot supports some easy methods to read the data. The following shows how we compute the result of the shell command we have shown previously and draw a graph: gnuplot> plot "< psql -c 'SELECT income FROM income' -A -F ' ' -t mygnuplot " The command between the two double quotes is evaluated, and the result is returned to gnuplot, which displays the graph on the screen (see Figure 20.2). Figure 20.2. gnuplot displays a window containing the graph.
We can see an array of points representing the values in the database. Up to now, we have created all graphics with the help of gnuplot's shell. This might be a bit of a problem for you when building huge applications. Let's see how the graphics shown previously can be created with the help of one simple Bourne shell command: [hs@duron hs]$ echo "plot \ "< psql -c 'SELECT income FROM income' -A -F ' ' -t mygnuplot \ " " gnuplot -background white -persist We simply pipe the command to gnuplot. All double quotes passed to the program have to be masqueraded using a backslash so that the shell does not mix up the double quotes used for the echo command with the double quotes used for gnuplot. The -persist flag is necessary; otherwise , gnuplot will close the X windows when the execution of the script is ready. Using -persist makes sure that the window is still alive after gnuplot has terminated . Having the result displayed in an X window might be nice, but in some cases it is necessary to store the result generated by gnuplot in a file that you can use for further transformation. The following includes a file that contains a short gnuplot script. First we set the terminal to pbm , which means that a pbm file is generated instead of X11 output. Then we include the command we showed previously: [hs@duron gnuplot]$ cat config.plot set terminal pbm color plot "< psql -c 'SELECT income FROM income' -A -F ' ' -t mygnuplot " We start gnuplot with config.plot as the first parameter and redirect the output to a file called graph.pbm : [hs@duron gnuplot]$ g nuplot config.plot > graph.pbm We can use that file now. If pbm is not our desired file format, we can easily convert the file to jgp or any other format by using a program such as mogrify : [hs@duron gnuplot]$ mogrify -format jpg *.pbm Now we want to change the labels of the graphics we have just generated. The first thing we want to do is to eliminate the ugly string containing psql on the upper edge of the graph. This can be done by using set nokey . The title of the graph should be Income table , and the axis should be labeled with year and income in USD per year . Up to now we have also seen that gnuplot finds the right scale for the graph by itself. In some cases, this might not be what we need, so we define the range of values displayed manually using xrange and yrange . The following is a file that does exactly what we need: set nokey set title "Income table" set xlabel "year" set ylabel "income in USD per year" xmax=6 ymin=0 ymax=40000 set xrange [0 : xmax] set yrange [ymin : ymax] plot "< psql -c 'SELECT income FROM income' -A -F ' ' -t mygnuplot " We can generate the graph using a simple shell command: [hs@duron gnuplot]$ gnuplot -persist -background white config.plot The result will be the graph shown in Figure 20.3. Figure 20.3. Redesigning the graph.
Often it is necessary to plot two independent data sources in one graph ” especially when you want to compare data. The following script shows how you can plot two data sources (they contain the same data) in two different ways: set nokey set grid set title "Income table" set xlabel "year" set ylabel "income in USD per year" xmax=6 ymin=0 ymax=40000 set yrange [ymin : ymax] plot "< psql -c 'SELECT year, income FROM income' -A -F ' ' -t mygnuplot" \ with lines linewidth 3, \ "< psql -c 'SELECT year, income FROM income' -A -F ' ' -t mygnuplot" \ with boxes linewidth 3 We want a grid to be displayed in the background of our graph. The last four lines of code are responsible for plotting. The first data source is displayed as a line, and linewidth is set to 3 . The second data source is displayed using boxes. gnuplot makes sure that the two components of the graph are displayed in different colors (see Figure 20.4). Figure 20.4. Plotting two data sources.
The script plots the graph as we expected it to be, but there are still two problems that have to be solved . We have plotted two data sources, and therefore we have queried the database twice. This leads to two problems: It might happen that the data in the database changes during the execution of the first query, so the result of the second query might differ from the result of the first query. Another problem is that executing two queries takes much longer than executing just one. Our two problems can be solved easily. To show you how this can be done, we have compiled a table containing the income of males and females: mygnuplot=# SELECT * FROM income; year male female ------+-------+-------- 1997 24000 19200 1998 26300 21400 1999 26000 21800 2000 29000 21000 2001 32100 26500 2002 32000 26700 2003 33700 28200 (7 rows) We use a simple Makefile to generate the result: [hs@duron gnuplot]$ cat Makefile x : config.plot psql -c 'SELECT year, male, female FROM income' -t -A -F ' ' \ mygnuplot > file.data gnuplot -background white -persist config.plot rm -f file.data Let's analyze the Makefile first. We select all data from table income and store the result of the query in file.data . Then we start gnuplot and pass the name of the configuration script to it. Here is the configuration script: set nokey set grid set time set title "Income table" set xlabel "year" set ylabel "income in USD per year" set yrange[0 : ] plot 'file.data' using 1:2 with lines linewidth 2, \ 'file.data' using 1:3 with lines linewidth 2 The time the image was created is displayed by using set time . We don't know the highest income in the database, so we set the lower limit for yrange " to and do not specify the upper value. gnuplot will make sure that a suitable border for the upper limit will be used. In the next step, we tell gnuplot to use the first and the second column in file.data as the data source for plotting the first line. The second line will be plotted using the first and the third column. Using one plot command is enough, because every additional graph is simply added to the list (see Figure 20.5). Figure 20.5. Plotting the data in an external file.
The previous examples showed basic tasks that can be accomplished with gnuplot and PostgreSQL. Sometimes it might be necessary to add explanations to the graph so that the reader can easily understand what you want to say. Two components for adding explanations to a graph are essential: arrows and labels. In the next example, we added two labels and one arrow to the scenery : set nokey set grid set time set arrow from 1999, 15000 to 2000, 20000 set label "significant changes" at 1998.35, 13000 set label "(c) SAMS" at 2002.3, 2200 set format y "%g$" set title "Income table" set xlabel "year" set ylabel "income" set yrange[0 : ] plot 'file.data' using 1:2 with lines linewidth 2, \ 'file.data' using 1:3 with lines linewidth 2 First we add the arrow to the graph. As you can see, the coordinates can simply be defined. Then we add the labels to the scenery. The first component that we define for the label is the text we want to be displayed. Then we tell gnuplot where to place it, as we did for the arrow. We start gnuplot with the file shown. The result can be seen in Figure 20.6. Figure 20.6. A plot including arrows and comments.
For many applications, it is necessary to produce three-dimensional plots. Because gnuplot is a highly developed software, it is also capable of generating three-dimensional graphics. To show you how such plots can be created, we compiled a table that contains two additional columns, which store the number of people we used to compute the average income ( wf is an abbreviation for workforce ): mygnuplot=# SELECT * FROM income; year avg_male wf_male avg_female wf_female ------+----------+---------+------------+----------- 1997 24000 732 19200 932 1998 26300 1412 21400 1054 1999 26000 1930 21800 1320 2000 29000 2065 21000 1150 2001 32100 2163 26500 1259 2002 32000 2254 26700 1292 2003 33700 2620 28200 1721 (7 rows) The following is the config file for our three-dimensional plot: set nokey set time set format y "%g$" set title "Income table" set xlabel "year" set ylabel "income" splot 'file.data' using 1:2:3 with lines linewidth 2, \ 'file.data' using 1:4:5 with lines linewidth 2 Here, we have to use splot instead of plot . Because we have one additional dimension, we have to define three columns containing the data ”we select the numbers of the columns according to the order they can be found in the input file. The screenshot of the plot is shown in Figure 20.7. Figure 20.7. A simple 3-D plot.
|