Wednesday, 10 February 2010

The map, the inline function, and the macro

Some time ago, on the 26th of July, to be more accurate, I showed how somewhat decent-looking maps can be created with gnuplot. With the wisdom of hindsight, that was a rather ugly hack, I must say. Even worse, it seems that it is not quite fail-safe. At least, I have obtained reports complaining about it. Could we, then, do better? Could we, perhaps, throw out that disgusting gawk script, with all the hassle that comes with it? Could we, possibly, manage the whole affair in gnuplot? Sure, we could. And here is how, just keep reading!

On the 17th of January, we saw that in the new version of gnuplot, functions take on a funny property, namely, they can contain algebraic statements not related to the return value. We also saw that this feature can be used to perform searches of some sort: as we "plot" a file, and step through the numbers in the file, we can assign values to variables, provided that some conditions are fulfilled. It is easy to see that in this way, we can determine the minimum or the maximum of a data file, e.g. But we can do much more, than that.

We should also recall from that old post on the map what the contour file looks like. In case you have forgotten, here is a small section of it
# Contour 0, label:        2
-0.391812  3.63636  2
...
-0.959596  3.50978  2

-0.959596  3.50978  2
...
-0.391812  3.63636  2


# Contour 1, label:      1.5
-1.20098  4.51515  1.5
-1.16162  4.54423  1.5
-1.15982  4.54545  1.5
...

What we have to realise is the following: first, contours lines belonging to the same level are not necessarily contiguous (this is quite obvious, for there is no reason why they should be), and if there is a discontinuity, it manifests itself in a single blank line in the contour file, and second, contour lines belonging to different levels are separated by two blank lines. So, in the data file above, there is a blank between the lines -0.959596 3.50978 2, and
-0.959596 3.50978 2, and there are two blanks between -0.391812 3.63636 2, and # Contour 1, label: 1.5. By the way, the third column is the value of that particular contour line.

This observation has at least one important consequence: we can decide which contour line we want to plot, simply by using the index keyword. You might recall, that indexing the data file pulls out one data block, which is defined by a chunk of data flanked by two blank lines.

Now, what about the labels, and the white space that they need? Well, the white space is quite easy: what we will plot is not the contour line, but a function, which returns an undefined value at the place of the white space, e.g., this one (whatever eps and xtoy mean)
f(x,y) = ((x-x0)*(x-x0)+(y-y0)*(y-y0)*xtoy*xtoy > eps ? y : 1/0)
Normally we would plot the contour lines as
plot 'contour.dat' using 1:2 with lines
but instead of this, now we will use this
plot 'contour.dat' using 1:(f($1,$2)) with lines
This will leave out those points which are too close to (x0, y0). And the labels? Well, that is not difficult either. Take this function
lab(x,y) = ( (x == x0 && y == y0) ? stringcolumn(3) : "")
and this plot
plot 'contour.dat' using 1:2:(lab($1,$2)) with labels
This will put the labels at (x0, y0), and even better, we haven't got to set the labels by hand, they are taken from the data file.

So, we have seen how we can plot the contour, leave out some white space, and then put a label at that position. The only remaining question is how we determine where the label should be. And this is where we come back to our inline functions. For the sake of example, let us take this function and the accompanying plot

g(x,y)=(((x > xl && x < xh && y > yl && y < yh) ? (x0 = x, y0 = y) : 1), 1/0)
plot 'contour.dat' using 1:(g($1,$2))
What on Earth does this plot do? The plot itself does absolutely nothing: it is always 1/0. However, while we are doing this, we set the value of x0, and y0, if the two arguments are not too close to the edge of the plot. This latter condition is needed, otherwise labels could fall on the border, which doesn't look particularly nice.

By now, we have all the bits and pieces, we have only got to put them together. Let us get down to business, then!

I will split the script into two: the first produces the dummy data, while the second does the actual plotting. So, first, the data production.
reset
filename = "cont.dat"
xi = -5; xa = 0; yi = 2; ya = 5;
xl = xi + 0.1*(xa - xi); xh = xa - 0.1*(xa-xi);
yl = yi + 0.1*(ya - yi); yh = ya - 0.1*(ya-yi);
xtoy = (xa-xi) / (ya-yi)
set xrange [xi:xa]
set yrange [yi:ya]
set isosample 100, 100
set table 'test.dat'
splot sin(1.3*x)*cos(.9*y)+cos(.8*x)*sin(1.9*y)+cos(y*.2*x)
unset table
set cont base
set cntrparam level incremental -3, 0.5, 3
unset surf
set table filename
splot sin(1.3*x)*cos(0.9*y)+cos(.8*x)*sin(1.9*y)+cos(y*.2*x)
unset table

What we should pay attention to here is the definition of a handful of variables at the very beginning. Some are already obvious, like xi, xa and the like, and some will become clear in the second part. Now, the plotting takes place here

reset
unset key
set macro
set xrange [xi:xa]
set yrange [yi:ya]

set tics out nomirror
set palette rgbformulae 33,13,10
eps = 0.05

g(x,y)=(((x > xl && x < xh && y > yl && y < yh) ? (x0 = x, y0 = y) : 1), 1/0)
f(x,y) = ((x-x0)*(x-x0)+(y-y0)*(y-y0)*xtoy*xtoy > eps ? y : 1/0)
lab(x,y) = ( (x == x0 && y == y0) ? stringcolumn(3) : "")

ZERO = "x0 = xi - (xa-xi), y0 = yi - (ya-yi), b = b+1"
SEARCH = "filename index b using 1:(g($1,$2))"
PLOT = "filename index b using 1:(f($1,$2)) with lines lt -1 lw 1"
LABEL = "filename index b using 1:2:(lab($1,$2)) with labels"

b = 0
plot 'test.dat' with image, \
@SEARCH, @PLOT, @LABEL, @ZERO, \
@SEARCH, @PLOT, @LABEL, @ZERO, \
@SEARCH, @PLOT, @LABEL, @ZERO, \
@SEARCH, @PLOT, @LABEL, @ZERO, \
@SEARCH, @PLOT, @LABEL, @ZERO, \
@SEARCH, @PLOT, @LABEL, @ZERO, \
@SEARCH, @PLOT, @LABEL, @ZERO, \
@SEARCH, @PLOT, @LABEL, @ZERO, \
@SEARCH, @PLOT, @LABEL, @ZERO, \
@SEARCH, @PLOT, @LABEL, @ZERO, \
@SEARCH, @PLOT, @LABEL, @ZERO, \
@SEARCH, @PLOT, @LABEL

A bit convoluted, isn't it? OK, we will walk through the script, line by line.

First is the range setting, and then ticks go to the outside, just for aesthetic reasons. We also define eps here, which determines how much "white space" we have for the labels. Then we define the three functions that we discussed above. We have already seen eps, and the meaning of it, but what about xtoy? Despite its name, this is not something to play with, rather the ratio of x to y, or more precisely, the ratio of the xrange to the yrange. This is needed, if the two ranges are of different order of magnitude, e.g., if xrange is something like [0:1], while yrange is [0:1000]. But this ratio is automatically calculated at the beginning, you haven't got to worry about it.

After this, we define 4 macros. These are abbreviations for longer chunks of code, and make life really easier. The idea is that when confronted with a macro, gnuplot expands it as a string, and then acts accordingly. In my opinion, if written properly, macros can make the script rather readable.

The first of the macros, ZERO, is needed, because in our SEARCH macro, which is nothing but a call to the function g(x,y), if the condition is not satisfied for a particular data block, then x0, y0 wouldn't be updated, therefore, the label would end up at the wrong position. At the same time, ZERO also increments the value of b, which determines which data block we are actually plotting. b is used in the indexing in the macros SEARCH, PLOT, and LABEL. We have already mentioned SEARCH, PLOT plots the contour with the white space at the position given by x0, and y0 (this is calculated in the SEARCH macro), and finally, LABEL places the value of the contour line at that position.

At this point, we have defined everything, all that is left is plotting. We do it 13 times, because our zrange, or the contour lines were given between -3, and 3, with steps of 0.5. In this particular case, there are only 10 contour lines, and gnuplot will complain that the last 3 data blocks are empty, but this is not an error, only a warning. Shouldn't we look at the figure, perhaps? But of course! Here it is:


The only thing that I should like to point out is that the white space is made for a particular contour line, but there is no guarantee that, if the contour lines are too close to each other, the label does not cover a neighbouring contour line. If that happens, I would simply suggest to increase the contour spacing by incrementing the parameter in the set cntrparam line.

I hope that this method proves better, than the other one, and that it will be easier to use. In the next post, I will re-visit the inline functions, and show a nifty trick with them. Cheers,
Gnuplotter

19 comments:

  1. Dear Gnuplotter,

    just wanted to say that it is AWESOME what you manage to create with Gnuplot!
    I have tried to implement the plot with contours, adapted to my needs, and have encountered a problem: Apparently, when plotting with the image style, there is some trouble with logscales. Quite surprisingly, everything runs smoothly, as long as one does not tell Gnuplot specific plotting ranges, but if done, then the contours are properly plotted, not so the colour map...Any idea/suggestion why and how to fix it?

    Another (not related) question is about canvas orientation: Is it somehow possible to rotate the canvas (i.e. into a portrait orientation) so as to maximize the plotting area when plotting e.g. three figures on a column using multiplot?

    Thanks in advance, and especially for all the great tips!

    Thor.

    ReplyDelete
  2. Hello Thoro,

    First, thanks for the kind words! As to your questions, could you please, post the script that you want to use? Perhaps, the data file too, or if it is too big, a function that would generate similar data? It might not be important what exactly it is, for it seems that you have problems with the log scale, so I would just like to see the range settings, and so on. The script should probably be enough.
    As for the second of your questions, you can produce "portrait" figures, if you set the size. So, e.g., you could have

    size 0.3, 1

    which will compress your image in the x direction by a factor of 3. You can also put all three images in the same figure as

    set multiplot
    set size 0.3, 1
    set origin 0,0

    plot something

    set origin 0.33, 0
    plot something else

    set origin 0.66, 0
    plot something else
    unset multiplot

    The only thing you have to watch out for is the bounding box in the postscript file, if you want to produce that for a publication. Once you produced the figure, you should just open the postscript file, and set the bounding box by hand. It should be in the 6th line of the file, I believe.
    Cheers,
    Zoltán

    ReplyDelete
  3. Hello Zoltán,

    Time for a challenge ... ;-)

    One thing I miss with gnuplot is being able to plot colour maps from finite element type data, eg :

    #Element 1
    # x y val
    0 0 1
    1 0 2
    0 1 2

    #Element 2
    # x y val
    1 0 2
    1 1 3
    0 1 2

    This data corresponds to the function f(x,y)=x+y+1 on a mesh composed of 2 triangles.

    Do you think one could plot this sort of data ? And would it be manageable with 1000s of triangles ?

    Gnuplot is such a great tool, because once you can plot then any analytic transformation/comparison is possible...

    joce

    ReplyDelete
  4. Hello Joce,

    I am not sure I understand the problem, so perhaps, we should further discuss it. Let us take one element for a start! So, in your example, you have the triangle given by the following triplet

    (0,0)-(1,0)-(0,1)

    First I thought that the idea was to colour this triangle as red, say, but your points have different numbers assigned to them. Do you, perhaps want to have a colour gradient on this triangle, in such a way that the (0,0) point is read, the (1,0) and (0,1) points are blue, say? Could you, perhaps, post an example of such a plot?
    Cheers,
    Zoltán

    ReplyDelete
  5. Yes, a colour gradient in (each of) the triangle(s) is what I'm looking for. It's actually the colour-map plot of a piecewise linear function f(x,y). This corresponds to what you get from a finite elements PDE resolution.

    Alternatively, an elevation plot of f(x,y) would also be interesting.

    I'm unfortunately quite unable to produce such a colour-map plot with gnuplot from data of the type I've given. Gnuplot handles this type of data as a 1D curve (x,y,z=f(x,y)).

    In the case of elevation it is doable, as splot readily gives (nearly) what I want from the data file above, except that the last edge of the triangle is missing. One could add it either by repeating the first line of each block at the end of the block, or possibly using the new `for' command.

    Oh, and one thing of importance, the triangle summits are generally not aligned on any grid, the only property of these triangles is that they don't overlap (and that across a given edge, if they have a neighbour then this neighbour shares the full length of the edge, but that's not important for plotting)

    The software plotmtv (in all good linux repos) for instance will produce the colour map, if you add the commands
    $ DATA = CONTCURVE
    %contstyle=2
    %nsteps=50
    to the head of the data file

    ReplyDelete
  6. Dear Gnuplotter,

    I want to plot a potential energy map which has several peaks. The script presented here works well for a smooth gradient of values, but doesn't really for more complicated shapes. In particular, I found difficult to control which iso surface has to be labelled.
    Furthermore, the file containing the isosurfaces also includes some isocurves that appear on the plot as straight lines. To fix it, I have to manually edit such file and then run again the script.

    Could you have a look at my files? They are here.

    Thank you very much for this useful blog :)

    ReplyDelete
  7. > Yes, a colour gradient in (each of) the triangle(s) is what I'm looking for. It's actually the colour-map plot of a piecewise linear function f(x,y). This corresponds to what you get from a finite elements PDE resolution.

    I haven't yet worked out the details, but here is how you could start: given the triplet (x1,y1,z1)-(x2,y2,z2)-(x3,y3,z3), take an arbitrary parametrisation of the triangle. Then, read out the numbers in the file, and add parametric plots based on your parametrisation to a string. Once you are done with it, evaluate the string. I believe that all you have to do is to find the parametrisation, and then use what I discussed in my post "Parametric plot from a file". Let me know, if this works! If you still have problems, I will just write it up in a post.

    > The software plotmtv (in all good linux repos) for instance will produce the colour map, if you add the commands

    I hadn't known this software, but after you mentioned it, I checked it out. However, it seems that it is no longer maintained.
    Cheers,
    Zoltán

    ReplyDelete
  8. Dear Gnuplotter,

    I agree with Thoro, this is just awesome!

    Like Joce, tried to adapt your method to a slightly different scenario, in my case : squattered data points...

    Basically I have a number of squattered data points (x,y,z), x and y being the "coordinates" and z the value.
    Adding the contour as you did it in your example would be great as this would make the plot easier to read and make it easy to identify the extrema area. But unfortunately a parametric approach is not possible there.
    (Will stick with the traditional -not so nice- contour line then!..)

    For record I can show you what I tried to do with awk (which did not work very well...probably because I tried to add it as part of a multiplot that I started with splot's )(yeah...life is complicated! ;-) ) :

    reset
    set xrange [*:*]
    set yrange [*:*]
    set isosample 20, 20
    set table 'test.dat'
    splot 'acidYplus0.1.txt' using 2:3:(sqrt(($4-21.263)**2)/21.263*100):(sqrt(($4-21.263)**2)/21.263*100) w pm3d
    unset table
    set cont base
    set cntrparam level incremental -3, 0.5, 3
    unset surf
    set table 'cont.dat'
    splot 'acidYplus0.1.txt' using 2:3:(sqrt(($4-21.263)**2)/21.263*100):(sqrt(($4-21.263)**2)/21.263*100) w pm3d
    unset table
    reset
    set xrange [*:*]
    set yrange [*:*]
    unset key
    set palette rgbformulae 33,13,10
    l '<./awk.sh cont.dat 0 15 0'
    p 'test.dat' w ima, '<./awk.sh cont.dat 1 15 0' w l lt -1 lw 1.5

    ReplyDelete
  9. Dear gnuplotter,

    Many thanks for this truly inspirational post! I just finished a contour plot for my paper using a variation of your method. The only snag in my case was with the positioning of labels, which were not nicely scattered around as in your example, but piled along one of the edges (this seems to be related to the order of points in the data file, which is outside of my control). Positioning tends to work ok for gnuplot-generated meshes, but even then it becomes a bit suspect for smooth functions: try using "xi = -5; xa = -4; yi = 4; ya = 5;" in your script above.

    One solution to this problem is to use some measure of the distance from label to the border as the positioning criterion. This is what I came up with after few experiments:

    reset
    unset key
    set macro
    set xrange [xi:xa]
    set yrange [yi:ya]

    set tics out nomirror
    set palette rgbformulae 33,13,10
    eps = 0.001

    min(a,b) = ((a=dist )?( x0=x,y0=y,dist=min(min(x-xi,xa-x)/xd,min(y-yi,ya-y)/yd) ):1),1/0);
    f(x,y) = ((x-x0)*(x-x0)+(y-y0)*(y-y0)/xtoy/xtoy > eps ? y : 1/0)
    lab(x,y) = ( (x == x0 && y == y0) ? stringcolumn(3) : "")

    ZERO = "x0 = xi - (xa-xi), y0 = yi - (ya-yi), b = b+1, dist = 0"
    SEARCH = "filename index b using 1:(g($1,$2))"
    PLOT = "filename index b using 1:(f($1,$2)) with lines lt -1 lw 1"
    LABEL = "filename index b using 1:2:(lab($1,$2)) with labels"

    b = 0
    dist = 0
    plot 'test.dat' with image, \
    @SEARCH, @PLOT, @LABEL, @ZERO, \
    @SEARCH, @PLOT, @LABEL, @ZERO, \
    @SEARCH, @PLOT, @LABEL, @ZERO, \
    @SEARCH, @PLOT, @LABEL, @ZERO, \
    @SEARCH, @PLOT, @LABEL, @ZERO, \
    @SEARCH, @PLOT, @LABEL, @ZERO, \
    @SEARCH, @PLOT, @LABEL, @ZERO, \
    @SEARCH, @PLOT, @LABEL, @ZERO, \
    @SEARCH, @PLOT, @LABEL, @ZERO, \
    @SEARCH, @PLOT, @LABEL, @ZERO, \
    @SEARCH, @PLOT, @LABEL, @ZERO, \
    @SEARCH, @PLOT, @LABEL

    Who knows, maybe, after few more iterations this will help someone to produce that perfect map!

    ReplyDelete
  10. I am sorry, it seems that HTML processing just messed up the code. Let me try again:

    reset
    unset key
    set macro
    set xrange [xi:xa]
    set yrange [yi:ya]

    set tics out nomirror
    set palette rgbformulae 33,13,10
    eps = 0.001

    min(a,b) = ((a<b)?a:b)
    xd = xa - xi
    yd = ya - yi
    g(x,y) = ((( min(min(x-xi,xa-x)/xd,min(y-yi,ya-y)/yd)>=dist )?( x0=x,y0=y,dist=min(min(x-xi,xa-x)/xd,min(y-yi,ya-y)/yd) ):1),1/0);
    f(x,y) = ((x-x0)*(x-x0)+(y-y0)*(y-y0)/xtoy/xtoy > eps ? y : 1/0)
    lab(x,y) = ( (x == x0 && y == y0) ? stringcolumn(3) : "")

    ZERO = "x0 = xi - (xa-xi), y0 = yi - (ya-yi), b = b+1, dist = 0"
    SEARCH = "filename index b using 1:(g($1,$2))"
    PLOT = "filename index b using 1:(f($1,$2)) with lines lt -1 lw 1"
    LABEL = "filename index b using 1:2:(lab($1,$2)) with labels"

    b = 0
    dist = 0
    plot 'test.dat' with image, \
    @SEARCH, @PLOT, @LABEL, @ZERO, \
    @SEARCH, @PLOT, @LABEL, @ZERO, \
    @SEARCH, @PLOT, @LABEL, @ZERO, \
    @SEARCH, @PLOT, @LABEL, @ZERO, \
    @SEARCH, @PLOT, @LABEL, @ZERO, \
    @SEARCH, @PLOT, @LABEL, @ZERO, \
    @SEARCH, @PLOT, @LABEL, @ZERO, \
    @SEARCH, @PLOT, @LABEL, @ZERO, \
    @SEARCH, @PLOT, @LABEL, @ZERO, \
    @SEARCH, @PLOT, @LABEL, @ZERO, \
    @SEARCH, @PLOT, @LABEL, @ZERO, \
    @SEARCH, @PLOT, @LABEL

    ReplyDelete
  11. Greetings penguinny,

    Thanks for visiting, and for your comments! I am glad that you came across some useful tricks here. I found your solution to the problem you mentioned especially brilliant. I have to admit that, while I was aware of the difficulty with the positioning of the labels, I was just too lazy to think about the actual solution. Many thanks for sharing!
    Cheers,
    Zoltán

    ReplyDelete
  12. Hi Gnuplotter! First of all, thank you for this post, which was really useful.

    I found a bug in the code. Namely, you ought to multiply, not divide, by xtoy^2 in order to achieve the correct result. The code should therefore be

    f(x,y) = ((x-x0)*(x-x0)+(y-y0)*(y-y0)*xtoy*xtoy > eps ? y : 1/0)

    Again, thanks a lot.

    ReplyDelete
  13. > I found a bug in the code. Namely, you ought to multiply, not divide, by xtoy^2 in order to achieve the correct result. The code should therefore be

    > f(x,y) = ((x-x0)*(x-x0)+(y-y0)*(y-y0)*xtoy*xtoy > eps ? y : 1/0)

    Yes, you are absolutely right. Thanks for posting it! I will just update the post.
    Cheers,
    Zoltán

    ReplyDelete
  14. Hi...
    I'm using this script as a cooking recipe.. :) sorry for this...
    I would create a graph of a energy landscape.
    My only problem (maybe) is that when I give the "g(x,y)=(((x > xl && x < xh && y > yl && y < yh) ? (x0 = x, y0 = y) : 1), 1/0)" instruction, gnuplot reply me with this: " ^
    ')' expected
    "

    I don't know why... but if you can help me... Thank you!!!

    ReplyDelete
  15. Hi Ricardo,

    Do you run gnuplot 4.4?
    Cheers,
    Zoltán

    ReplyDelete
  16. In the older post the gawk script was modified such that the labels were rotated parallel to the contour lines. Has anyone come up with a way to do this within gnuplot?

    The ideas presented here are incredibly helpful. Thank you to everyone who has chipped in.

    ReplyDelete
  17. Genius!
    wow, gnuplot is really mind-bending...

    ReplyDelete
  18. This comment has been removed by the author.

    ReplyDelete
  19. Hi Gnuplotter,
    Thank you for your generous and thoughtful posts. They are truly helpful.
    I am trying to implement the above gnuplot command sequence but have
    not yet succeeded. When I press enter after the last line I get an warning
    message that says: Skipping data file with no valid points. I have typed
    your code verbatim. The image that is generated in the end is the contour
    map without the level curves. Do you have any idea what might be causing
    this error? Thank you very much for your blog postings, and any help you
    might be ableto offer would be greatly appreciated

    ReplyDelete