Normally, when one plots a function or file, the command has the following structure
plot 'foo' using 1:2 with line, f(x) with line
That was the old syntax. In the new version of gnuplot, we can insert arithmetic expressions in the plot command as follows
f(x) = a*sin(x) plot a = 1.0, f(x), a = 2.0, f(x)
Now, this has some implications. First, one has to be a bit careful, because the arithmetic expressions are separated from the actual function by a comma. However, the 'for' loop that we discussed a week ago, reads statements up to the comma, and then returns to the beginning of the statement. In other words,
plot for [i=1:10] a = i, f(x)will evaluate the expression a = i ten times, and then plots f(x). At that point, the value of 'a' will be 10, therefore, we have only one plot, and that will be 10*sin(x).
The second implication is that the notion of a function has completely changed. What we do in a plot command now is no longer a mapping of the form
x -> f(x)but rather, the evaluation of a set of instructions, one of which is the above-mentioned mapping. But the crucial point here is that the mapping is not the only allowed statement. The upshot is that "functions" have become a set of operations, and the following statement is completely legal
f(x) = (a = a+1.0, a*sin(x)) a = 0.0 plot for [i=1:10] f(x)
(It is a complete different question, whether this plot makes any sense...)
What we should notice is the fact that now a function can have the form of
f(x) = (statement1, statement2, statement3, return value)and when the function is called, statement1, statement2, statement3 are evaluated, and 'return value' is returned. We should not underestimate the significance of this! Many things can be done with this. I will show a few of them below.
The first thing I would like to dive into is calculating some statistics of a file. Let us see how this works!
reset set table 'inline.dat' plot sin(x) unset table num = 0 sum = 0.0 sumsq = 0.0 f(x) = (num = num+1, sum = sum+x, sumsq = sumsq+x*x, x) plot 'inline.dat' using 1:(f($2)) print num, sum, sumsqwhich will print out
100 -1.77635683940025e-15 0.295958848441We expected this, for the number of samples is 100 by default, and the sum should be 0 in this case.
So, what about finding the minimum and its position in a data file? This is quite easy. All we have to do is to modify our function definition, and insert a statement that determines whether a value is minimal or not.
reset set table 'inline.dat' plot sin(x) unset table num = 0 min = 1000.0 min_pos = 0 min_pos_x = 0 f(x,y) = ((min > y ? (min = y, min_pos_x = x, min_pos = num) : 1), num = num+1, y) plot 'inline.dat' using 1:(f($1, $2)) print num, min, min_pos_x, min_poswhich prints
100 -0.999385 4.74747 73i.e., the minimum is at the 73rd record (we count from 0), at x = 4.74747, and its value is -0.999385. Note that instead of an 'if' statement, we use the ternary operator to decide whether min, min_pos_x, and min_pos should be updated.
The implementation of calculating the standard deviation, e.g., should be trivial:
sum = 0.0 sumsq = 0.0 f(x) = (num = num+1, sum = sum + x, sumsq = sumsq + x*x, x) plot 'inline.dat' using 1:(f($1)) print num, sqrt(sumsq/num - (sum/num)*(sum/num))We have thus seen how the "inline" arithmetic can be used for calculating quantities, e.g., various moments, minima/maxima and their respective positions. These involve the sequential summing or inspection of the data set. But this trick with the function definition can be used for back-referencing, too. This is what we will discuss next.
The trick is to use a construct similar to this
backshift(x) = (prev = pres, pres = x, prev)
which will store the last but one value in the variable 'prev', and return it. That is, the following code shift the whole curve to the right by one
reset set table 'inline.dat' plot sin(x) unset table pres = 0.0 backshift(x) = (prev = pres, pres = x, ($0 > 0 ? prev : pres)) plot 'inline.dat' using 1:(backshift($2)) with line, '' u 1:2 with line(In cases like this, we always have to decide what to do with the first/last data record. In this particular case, I opted for duplicating the first record, - this is what happens in the ternary operator - but this is not the only possibility.) If, for some reason, you have to shift the curve by more, you do the same thing, but multiple times. E.g., the following code shifts by 3 places.
backshift(x) = (prev1 = prev2, prev2 = prev3, prev3 = x, prev1)
Once we have this option of back-referencing, we should ask the question what it can be used for. I show two examples for this.
The first example is drawing arrows along a line given by the data set. Drawing arrows one by one is done by using
set arrow from x1,y1 to x2,y2but we have to use a different method, if we want to plot the arrows from a file. Incidentally, there is a plotting style, 'with vectors', that works as
plot 'foo' using 1:2:3:4 with vectorswhere the first two columns specify the coordinates of the beginning, and the second two columns specify the relative coordinates of the vectors. So, it works on four columns. What should we do, if we want to plot vectors from the points in a file. Well, we use the back shift that we defined above. Our script is as follows:
reset unset key set sample 30 set table 'arrowplot.dat' plot [0:3] sin(x)+0.2*rand(0) unset table px = NaN py = NaN dx(x) = (xd = x-px, px = ($0 > 0 ? x : 1/0), xd) dy(y) = (yd = y-py, py = ($0 > 0 ? y : 1/0), yd) plot 'arrowplot.dat' using (px):(py):(dx($1)):(dy($2)) with vectorwhich results in the following figure:
In the second example, we will turn this around. In my post in last August, plotting the recession, I showed how the background of a plot can be changed, based on whether the the curve is dropping, or increasing. Let us take the following script
reset set sample 20 set table 'inline.dat' plot [0:10] exp(-x)+1.0+rand(0) unset table unset key px = 0 py = 1000 dx(x) = (xd = x-px, px = x, xd) dyn(y) = (yd = y-py, py = y, (yd < 0 ? yd : 1/0)) dyp(y) = (yd = y-py, py = y, (yd >= 0 ? yd : 1/0)) plot 'inline.dat' using (px):(py):(dx($1)):(dyp($2)) with vector nohead lt 1 lw 3, \ px = 0, py = 0, '' using (px):(py):(dx($1)):(dyn($2)) with vector nohead lt 3 lw 3which creates the following graph
Another utilisation of the back reference can be found on gnuplot's web site, under running averages.
Next time I will try to go a bit further, and demonstrate some other uses of the inline data processing.
Cheers,
Gnuplotter
Hey,
ReplyDeleteThanks for your post. It helped me defining a gnuplot routine for collapsing graphs of data automatically. I think your blog/website would be a good place to put it. Here is the script:
reset
files="a.gdat b.gdat c.gdat"
collapse_at_y=8
x_at_collapse=sqrt(8)
#generate the files (for the example)
set table "a.gdat"
plot [0:4] (x+.5*(rand(0)-.5))**2
set table "b.gdat"
plot [0:4] (x+.5+.5*(rand(0)-.5))**2
set table "c.gdat"
plot [0:4] (x+1+.5*(rand(0)-.5))**2
unset table
set term x11
set key bottom
FILE(i)=word(files,ceil(i/2.))
firstpass(i)=((i-floor(i/2)*2)==1)
collapse_y(x,y,i)= \
(ref_x=(firstpass(i))?(y<collapse_at_y)?x:ref_x:ref_x, \
(firstpass(i))?1/0:y)
collapse_x(x,i)=(firstpass(i))?x:x-ref_x+x_at_collapse
set term png
set output "collapse.png"
set multiplot
plot for [i=1:words(files)] word(files,i) using ($1):($2) title "raw ".i with points lt 1 pt i+3
set size 0.5,0.48
set origin 0.1,0.45
set key top left Left reverse
plot for [i=1:2*words(files)] FILE(i) using (collapse_x($1,i)):(collapse_y($1,$2,i)) title (firstpass(i))?"":"collapsed ".ceil(i/2.) with points lt 2 pt ceil(i/2.)+3, x**2 w l lt 3
Best wishes,
Jocelyn Etienne
Hello Jocelyn,
ReplyDeleteThanks for the script, it is really interesting, and I am glad that this blog was of some use! Just a side note: don't forget to unset the multiplot, when you are done;)
Cheers,
Zoltán
The arrow/vector plot works better - i.e. doesn't start at (0;0) all the time - with the following modification:
ReplyDeletepx=NaN
py=NaN
dx(x) = (xd = x-px, px = x, xd)
dy(y) = (yd = y-py, py = y, yd)
Thanks for the comment!
ReplyDeleteCheers,
Zoltán
Sorry for disturbing you.
ReplyDeleteI quite new for gnuplot, totally noob.
At my studies I should generate a graph, which is like an impulse plot of a discrete distribution, but at the "y" axis shows how many times has that data showed up. I would like to change this plot style, like one vertically unit would be an arrow aiming down and if the count is bigger than one than it would be shown more (equally to the count) arrows.
Please inform me about that do you have a solution or not, but I don't want to waste your time.
Thank you.
P.S.: If you would like, I can send you a picture how the graph should look like.
Greetings Béla,
ReplyDeleteI think it would be useful, if you could post a picture or a link that shows what exactly you want, because I am not sure of what you have in mind. E.g., should the arrows be next to each other, or should the second one connect to the end of the first, and so on. If you had an image, I could try to hack something.
Cheers,
Zoltán
Dear Zoltan.
ReplyDeleteI love your blog. Bringing out the power of gnuplot. I tried to make a nummerical integrator. But gnuplot "jumps" at the first integration. Have you seen this before?
Heres my gnuplot commands:
reset
cd 'C:\Users\tlinnet\Documents\My Dropbox\Speciale\5NT-project\Litteratur\FRET-litt\Atto-dyes\Spectre\Test'
set ytics nomirror
set y2tics
set y2label 'Integral'
set xrange [250:255]
set yrange [0.350:0.400]
##Set zero values
Nr488Ext = 0; nmprev488Ext = 0.0; nmpres488Ext = 0.0; Intprev488Ext = 0.0; Intpres488Ext = 0.0; Area488Ext = 0.0
##Make area functions
Areafunc488Ext(nm,Int) = (Nr488Ext = Nr488Ext +1, nmprev488Ext = nmpres488Ext, nmpres488Ext = nm, Intprev488Ext = Intpres488Ext, Intpres488Ext = Int, Area488Ext=Intprev488Ext*(nmpres488Ext-nmprev488Ext) + (Intpres488Ext-Intprev488Ext)*(nmpres488Ext-nmprev488Ext)/2 + Area488Ext, Area488Ext)
plot "ATTO488.txt" using 1:(Areafunc488Ext($1,$2)) with lines axis x1y2,\
"ATTO488.txt" using 1:2 title 'Atto488 excitation' with lines
set label 1 "Area488Ext A= %g", Area488Ext at graph 0.7, 0.55
replot
pause -1
set table 'Test.txt'
plot "ATTO488.txt" using 1:(Areafunc488Ext($1,$2)) with lines axis x1y2
unset table
DATA: ATTO488.txt
Absorption Emission
Wavelength. nm Absorbance Wavelength. nm Intensity. a.u.
250.0 0.395 480 0.000
250.2 0.395 481 0.000
250.4 0.394 482 0.001
250.6 0.393 483 0.002
250.8 0.392 484 0.002
251.0 0.389 485 0.002
251.2 0.388 486 0.004
251.4 0.387 487 0.006
251.6 0.387 488 0.007
251.8 0.387 489 0.009
252.0 0.385 490 0.011
252.2 0.384 491 0.015
252.4 0.381 492 0.019
252.6 0.380 493 0.025
252.8 0.379 494 0.029
253.0 0.379 495 0.038
253.2 0.378 496 0.049
253.4 0.376 497 0.064
253.6 0.373 498 0.081
253.8 0.371 499 0.102
254.0 0.368 500 0.127
254.2 0.367 501 0.162
254.4 0.365 502 0.203
254.6 0.363 503 0.244
254.8 0.360 504 0.277
255.0 0.357 505 0.307
And so I found out why. I have to skip the first data point, since the evalutaion (x2-x1) will give a high value for the first datapoint, if x1=0 and x2=some value.
ReplyDeleteThe numerial integration function should be:
Areafunc488Ext(nm,Int) = (Nr488Ext > 1? (Nr488Ext = Nr488Ext +1, nmprev488Ext = nmpres488Ext, nmpres488Ext = nm, Intprev488Ext = Intpres488Ext, Intpres488Ext = Int, Area488Ext=Intprev488Ext*(nmpres488Ext-nmprev488Ext) + (Intpres488Ext-Intprev488Ext)*(nmpres488Ext-nmprev488Ext)/2 + Area488Ext):(Nr488Ext = Nr488Ext +1, nmpres488Ext = nm, Intpres488Ext = Int), Area488Ext)
Or shorter:
Nr=0; xprev=0.0; xpres=0.0; yprev=0.0; ypres=0.0; sum=0.0
sumf(x,y)=(Nr>1? (Nr=Nr+1,xprev=xpres,xpres=x,yprev=ypres,ypres=y,sum=yprev*(xpres-xprev)+(ypres-yprev)*(xpres-xprev)/2+sum):(Nr=Nr+1,xpres=x,ypres=y), sum)
What a nice blog indeed! It took me some time until I found it while looking for a way to do a simple numerical derivative in gnuplot (in fact it was second derivative). So, to make it easier for others:
Deleteprev3=0.;prev2=0.;prev=0.;h=0.01;
f(y)=(prev3=prev2,prev2=prev,prev=y,(prev+prev3-2*prev2)/h/h)
plot "data.dat" u 1:($1>0.125?f($2):f($2)*1/0)
Try what's below for data.dat
0.11000000 -344.2391374
0.12000000 -344.8596850
0.13000000 -345.5359616
0.14000000 -346.2684164
0.14999999 -347.0575356
0.16000000 -347.9038481
0.17000000 -348.8079189
0.17999999 -349.7703532
0.19000000 -350.7918026
0.19999999 -351.8729526
0.20999999 -353.0145395
0.22000000 -354.2173366
wow, great post. I didn't know that. The recession can probably be used e.g. to smooth data with a boxcar, and the new definition of function to perform calculations of columns from different files (two things that I was never able to do and I have been missing for years).
ReplyDelete