Monday, July 27, 2015

Ready Player One, by Ernest Cline

Book cover If you were born before the 80s, Ready Player One is going to fill you with melancholy. Ernest Cline combines several classical young adult themes - like a battle versus an oppressive corporate evil, true and pure love, villainy and lack of honor defeated through friendship and good feelings - with (often obscure) geek memes of the 70s and 80s. If you are the kind of person who likes to impress by quoting lines from movies or telling of your adventures in games that are older than your children, this is the book for you. OK, I won't be mean, the book is going to be fun no matter when you were born, but the level of enjoyment may vary.

However, the book is still a young adult book at its core and, besides the overall message that you actually have to make an effort to reach your goals - an often neglected tidbit in young adult books and movies - it reads like one. Young heroes, with enough skill to pass through the challenges of the story, but awkward enough to also be endearing, manage to save the world through the power of their dedication and ideals. Also Chekhov's Gun has been used so much that it left gaping holes in the story. Amazing how random things in the story come perfectly together at the end. Let a bitter Harry Potterish aftertaste.

But let's start with the plot, which is pretty fun. It all happens in a dystopian world where energy reserves dwindled, gasoline became way too scarce and expensive, Elon Musk never happened and most people live their lives in a virtual world called OASIS, created by a brilliant yet reclusive visionary who made sure the system will remain secure and anonymous. Kind of like the Internet, but without the MPAA or the NSA. Yeah, it already sounds like an impossible dream, doesn't it? Well, the maker of the game world dies and leaves his entire estate (hundreds of billions of dollars and complete control over OASIS) to whoever finds the Easter Egg he his inside the game. The heroes of the story are young "egg hunters", while the villains are corporate drones who have been hired to find the egg for their company.

The writing style irked me a little. I know it is Cline's first book, and it certainly was a decent effort, but it had that way of explaining things that I call "fake past" in which the narrator explains things as if he is telling a story from the past. "The OASIS was...". Since this is supposed to happen in the future, it took a while until I could stop feeling irritated by it. However this has the advantage of being very easy to read.

I think it matters a lot if the reader is into cultural references. I could understand some, I could remember some, most of the references in the book, though, were ancient or obscure enough that even I didn't recognize them, and I am a pretty geeky person. I felt rewarded when I could "get it" and frustrated when I didn't, so probably it will be the same for most readers. If you don't care about these things, I think it is better to wait for the movie.

...which is in the works, with Steven Spielberg attached to the project. It might be difficult to put the story on the screen, though, since the book made an effort to describe a future world where everybody is obsessed with this specific period of the late 20th century, kind of like the Star Trek episodes that happened in the past or that required Kirk or Picard to know some specific book from the school curriculum. There is even a Ready Player One web site, that might have some Easter eggs in it (they would be dumb not to program some) but to me it seems both way too geeky and way to social at the same time.

Bottom line: a fun book for geeks. I hope it inspires the younger generations to look at the world a little bit differently, but I don't have my hopes up. My guess is that they will go all 'Meh' on a story that references anything that happened last century.

Wednesday, July 22, 2015

T-SQL: Joining two very large tables (or not) when trying to determine the hierarchical relationship between the records

Update: after another very useful comment from NULLable, I tried several new ideas:
  • range queries - trying to specify that the child StartIp, for example, is not only greater or equal to the parent StartIp, but also less or equal to the parent EndIp. In my case the query didn't go faster and adding new indexes as recommended in the comment made is slower. I believe it is because the range values are not static or just because clustering the start/end IP index is really way faster than any logical implementation of the search algorithm
  • cursor hints - obviously a very important hint that I should add to almost any cursor is LOCAL. A GLOBAL cursor can be accessed from outside the stored procedure and weird things can happen when running the stored procedure twice at the same time. NULLable recommended also STATIC READ_ONLY and FORWARD_ONLY. In truth the performance of the query doesn't really depend on the speed of the cursor, anyway, but I found an article that discusses the various cursor hints and ends up recommending LOCAL FAST_FORWARD. Check it out, it is very informative. My tests showed no real difference in this particular scenario.
  • RI-Tree implementation in SQL - the article that NULLable linked to is amazing! I just don't get it :) I will update this more when I gain more IQ points.

Update 2: I kind of understood the Relational Interval Tree implementation, but I couldn't find a way for it to help me. The code there creates a computed column of the same type as the IP columns then makes a BETWEEN comparison and/or a join or an apply with two table functions. I can't imagine how it could help me since the original query is basically just two BETWEEN conditions. But still a very interesting article.

I wanted to have a database of all Ripe records, in order to quickly determine the Internet Service Provider for an IP. We are discussing IPv4 only, so the structure of the table in the database looked like this:
CREATE TABLE [dbo].[RipeDb](
  [Id] [int] IDENTITY(1,1) NOT NULL,
  [StartIp] [bigint] NULL,
  [EndIp] [bigint] NULL,
  [NetName] [nvarchar](450) NULL,
  [StartTime] [datetime2](7) NULL,
  [EndTime] [datetime2](7) NULL,
  [ParentId] [int] NULL)

As you can see, I translate IPs into BIGINT so that I can quickly sort and select stuff. I also added a ParentId column that represents the parent ISP, as you have some huge chunk of IPs, split and sold to other ISPs, which in turn are selling bits of the IP range they own to others and so on. The data I receive, though, is a simple text file with no hierarchical relations.

The task, therefore, is to take a table like described above, with more than four million records, and for each of them find their parent, if any.

The simplest idea is to join the table with itself like this:
SELECT rp.Id as ParentId, 
   r.Id 
FROM   RipeDb r 
   INNER JOIN RipeDb rp 
       ON rp.StartIp <= r.StartIp 
          AND rp.EndIp >= r.EndIp 
          AND rp.EndIp - rp.StartIp > r.EndIp - r.StartIp 
This gets all ancestors for each record, so we need to use a RANK() OVER() in an inner select in order to select only the parent, but that's beyond the scope of the article.

Since we have conditions on the StartIp and EndIp columns, we need an index on them. But which?

Through trial and error, more than anything else, I realised that the best solution is a clustered index on StartIp,EndIp. That is why the first column (Id) is not marked as PRIMARY KEY in the definition of the table, because it has to look like this:
[Id] [int] PRIMARY KEY NONCLUSTERED IDENTITY(1,1) NOT NULL
. Yes, primary keys don't have to be clustered.

But now you hit the snag. The process is EXTREMELY slow. Basically on my computer this query would end in a few days (as opposed to twice as much with a nonclustered index). What the hell is going on?

I tried several things:
  • JOIN hints (Merge, Loop and Hash joins) - the query optimizer seems to choose the best solution anyway
  • Various index combinations - nothing beats a clustered index
  • Taking a bunch of records and joining only them in a WHILE loop - it doesn't fill up the temp db, but it is just as slow, if not worse

At this point I kind of gave up. Days of work trying to figure out why this is going so slow reached a simple solution: 4 million records squared means 16 thousand billion comparisons. No matter how ingenious SQL would be, this will be slow. "But, Siderite, I have tables large like this and joining them is really fast!" you will say. True, with equality the joins are orders of magnitude faster. Probably there is either place for improvement in the way I used the indexes or in the way they are implemented. If you have any ideas, please let me know.

So did I solve the problem? Yes, of course, by not relying on an SQL join. Think about how the ranges are arranged. If we order the IP ranges on their start and end values, you get something like this:



For each range, the following is either a direct child or a sibling. I created a stored procedure that called itself recursively, which should have worked, but then it reached the maximum level of recursion in SQL (32 - a value that one cannot change!) and so I had to do everything myself. How? With a cursor. Here is the final code:
DECLARE @ParentIds TABLE (Id INT,StartIp BIGINT, EndIp BIGINT)
DECLARE @ParentId INT
DECLARE @Id INT
DECLARE @StartIp BIGINT
DECLARE @EndIp BIGINT
DECLARE @OldParentId INT

DECLARE @i INT=0
DECLARE @c INT

DECLARE curs CURSOR LOCAL FAST_FORWARD FOR
SELECT r.Id, r.StartIp, r.EndIp, r.ParentId
FROM RipeDb r
WHERE r.EndTime IS NULL
ORDER BY StartIp ASC, EndIp DESC

OPEN curs

FETCH NEXT FROM curs
INTO @Id, @StartIp, @EndIp, @OldParentId

WHILE @@FETCH_STATUS=0
BEGIN

    DELETE FROM @ParentIds WHERE EndIp<@StartIp

    SET @ParentId=NULL
    SELECT TOP 1 @ParentId=Id FROM @ParentIds 
    ORDER BY Id DESC

    SELECT @c=COUNT(1) FROM @ParentIds

    IF (@i % 1000=0)
    BEGIN

    PRINT CONVERT(NVARCHAR(100),SysUtcDatetime())+' Updated parent id for ' + CONVERT(NVARCHAR(100),@i) +' rows. ' + CONVERT(NVARCHAR(100),@c) +' parents in temp table.'
    RAISERROR ('', 0, 1) WITH NOWAIT

    END
    SET @i=@i+1

    IF (ISNULL(@OldParentId,-1) != ISNULL(@ParentId,-1))
    BEGIN
        UPDATE RipeDb SET ParentId=@ParentId WHERE Id=@Id
    END

    INSERT INTO @ParentIds VALUES(@Id,@StartIp,@EndIp)

    FETCH NEXT FROM curs
    INTO @Id, @StartIp, @EndIp
END

CLOSE curs
DEALLOCATE curs

I will follow the explanation of the algorithm, for people hitting the exact issue that I had, but let me write the conclusion of this blog post: even if SQL is awesome in sorting and indexing, it doesn't mean that is the only solution. In my case, the SQL indexes proved to be a golden hammer that wasted days of my work.

So, the logic here is really simple, which makes this entire endeavour educational, but really frustrating to me:
  1. Sort the table by start IP ascending, then end IP descending - this makes the parents come before the children in the list
  2. Create a table variable to store the previous parents - so when you finished with a range you will automatically find yourself in its parent
  3. Use a cursor to move through all the items and for each one:
  4. Remove all parents that ended before the current item starts - removes siblings for the list
  5. Get the last parent in the list - that is the current parent range
  6. Set the parent id to be the one of the last parent

It's that deceptively simple and the query now ends in 15 minutes instead of days.

Another issue that might be interesting is that after the original import is created, the new records added to the table should be just a few. In that case, the first join and update might work faster! The next thing that I will do is count how many items I need to update and use one method or another based on that.

Hope that helps someone.

Tuesday, July 21, 2015

Villains by Necessity, by Eve Forward

Book cover Villains by Necessity is not a masterpiece of literature, but it is a fun fantasy book that doesn't feel the need to be part of a trilogy or take itself too seriously. Perfect for when you want to pick up a book because you are tired, not because you want to work your brain to dust. First work of Eve Forward, it is rather inconsistent, moving from silly to dead serious and back and making the heroes of the story oscillate between pointlessly evil and uncharacteristically good.

The best part of the book, though, is the concept. The world enjoys an age of light and good after a bitter war that saw all the forces of darkness and evil be defeated. The world is filled with happy people, no conflict, beautiful weather, lush vegetation. In a word, it is fucking boring. An unlikely party of evildoers gets together to save the world by bringing darkness back!

Alas, it was a concept that was not really well used. The characters, borrowed from classical fantasy, are "evil" by their professions only, but not by behaviour. Sam is an assassin, but he doesn't enjoy the suffering of his victims and is proud of his prowess. Archie is a mischievous thief, but other than that he is an OK fellow. Even Valerie, the dark sorceress, eats sentient beings just because it is her race's culture and her evil is more often artificial. Not to mention Blackmail, who acts as the classic stoic hero. Similarly, the forces of good are blood thirsty thugs that want to either kill everything dark or brainwash them, as a humane solution. This basically makes our heroes... err... heroes, not villains, and viceversa.

Now, the book wasn't bad. The style was amateurish, but it is Eve Forward's first book, after all. I could read it and I got caught by the story. I was more attracted to the original concept, though, and I was very curious how it would go. It is so difficult to present bad people as the protagonists, I know, because many people, including writers as they write, want them to be redeemed somehow. In the end, the moral of the story - excruciatingly laid out in a few paragraphs that shouldn't have existed - felt really heavy handed and simplistic. Ok, good people can do bad things and bad people can do good things, but it is important to explore what makes them good and bad, not just lazily assign them dark fantasy classes and be done with it.

Bottom line: fun read, but nothing special.

Sunday, July 19, 2015

We are doomed to be hit by asteroids, repeatedly

This is how reality stands right now: even if the danger of an asteroid hit is great, the risk of one hitting is small. That means that they hit (very) far apart and cause a lot of damage. Now, all governments in the world are run by politicians, who are by their very nature bureaucrats. They are reactive, not proactive, and they have insulated themselves from responsibility by manipulating laws and creating committees and departments that they can behead at any time, as they keep their fat asses on their chairs of power. This is not a rant, it's just the ugly truth, evolving, but never really changing since we were barely smarter than monkeys.

The logical conclusion of these facts is that politicians will not do anything about asteroids until we are hit by one. Even worse, since the probability that a really big one will hit without us knowing in advance has been reduced by space advances, the asteroid that will hit us will probably be small. The Tunguska and the Chelyabinsk events, real things that happened, changed nothing. The one that is going to change anything will be when something similar happens on top of a city.

This is not a doomsday prophecy, either. The probability that this will happen is extremely small. First of all the asteroid has to be small enough and/or fast enough so that we don't detect it in time. Then it has to hit at a certain angle to not be deflected by the atmosphere. Then it has to reach a populated area, which one would think is simple, since we can't seem to be able to fart without someone smelling it, but in truth, with the oceans and the human propensity of congregating for no good reason, it is less probable. However, with enough time, even a small probability becomes certainty.

So, the scenario goes like this: we all pretend to care, but we don't. We want less taxes, not asteroid protection. Politicians use our shortsightedness and our greed to enhance their own and do nothing. Then an asteroid hits, causing massive damage, death and loss of property. This is the moment when something happens. They implement new laws, launch asteroid defense programs, create new departments and committees. But, since the probability that an asteroid hits is small, the hype will fade, the budgets with it, politicians will rotate, people will forget. By the time the next asteroid hits, no one will be prepared for it any more than for the previous one.

In the end, the only things that ever made a dent in the probability that something will hurt us as a species or even as a larger group were technological. Not technology per se, just its price. Just more scientists with cheaper tech getting more done. When space launches become cheaper, satellites smaller, we can do more with them at the same relative price. That is why now we are discovering millions of asteroids in the Solar System, not because of some sort of scientific awakening. It's cheaper, probably as cheap as it was for amateur astronomers to buy telescopes in 1801, when Giuseppe Piazzi discovered the first asteroid, Ceres. I just hope this all gets cheap enough fast enough so we can do something by the time the big asteroid is coming. Well, if we don't destroy ourselves in some other way by then.

I know I'm a month late, but Happy Asteroid Day!

Wednesday, July 15, 2015

Fun debugging stuff

I needed to get all IP addresses in a range, so I applied the mask to my current IP and started to work through the minimum and maximum values of bytes to get the result. I had a code that looked like this:
for (var b = min; b <= max; b++) { //do stuff }
where min was 0 and max was 255.

The expected result: all values from 0 to 255. The result: infinitely running code. Can you spot why?

Yes, when min and max are bytes, b is also a byte, so when it gets to 255 and does b++ the value becomes 0 again. Fun, eh?

Thursday, July 09, 2015

The White Luck Warrior (Aspect Emperor book 2), by R. Scott Bakker

There are several stories happening at the same time in The White Luck Warrior, with almost no direct connection between them. There is the Great Ordeal, advancing slowly towards Golgoterath while being besieged by hordes of Sranc, also containing the story of this kid prince forced to march with it; then there is the palace life, with Esmenet left to rule the empire while Kellhus is away, while various factions are ready to take advantage of the lack of man power of the leadership and her half Dunyain children prove to be either insane or really insane; there is the trek of Achamian in search of the origin place of Kellhus. Among these there is a vague and a few paragraphs long subplot of The White Luck Warrior, a mysterious figure that seems to know all of its future, making him an automaton, I guess and some bits about the Fanim.

Why the smallest and insignificant portion of the book gave its title I do not know, but remember that the first book in the Aspect-Emperor series was called The Judging Eye, which is most prominently used or described in this volume. By far the most interesting and captivating storyline is that of Achamian, although I have to say that the logistics of long duration travel within enemy territory and the psychological factors involved seemed to me poorly described by Bakker.

What I knew will happen happened. I finished the book before the third volume in the series was released and now I am in withdrawal pains. That proves that the book captivated me. At very few moments I felt the need to "fast forward" and, considering the amount of distraction and that I had resolved to draw this book out a little bit in the hope that the third volume would be released, I finished it rather quickly.

Even if enjoyable, to me it felt more like a filler. I couldn't empathize with Esmenet or any of her demented children, nor could I care less what happened to Maithanet, who is one of the less fleshed out characters in the book. Similarly, the Sorweel story arch described a confused and frustrated teen, which was relatable, but uninteresting as a character. Unlike in the first four books, Kellhus sounds less godly and dominating and is mostly relegated to a minor role in the overall story. No, the most interesting characters and storyline revolve around Achamian, Mimara, The Captain and the mysterious Cleric, plus any of other members of the crazy bunch of mercenaries known as The Skin Eaters. And they just walk and walk and walk, only to end the book in a cliffhanger. While I await eagerly the sixth book, I have my misgivings and fears that it will not be as good as this one, just as this one felt a little bit short of the first.

Monday, July 06, 2015

Possible SQL injection using SQL parameters

This post will discuss the possibility of creating an SQL injection using the name of a parameter.

We have known about SQL injection since forever: a programmer is constructing an SQL command using user provided information and some malicious hacker inserts a specially crafted string that transforms the intended query into something that returns the content of a user table or deletes some data on the server. The solution for this, in the .NET world, is to use parameterized queries. Even if someone dynamically creates an SQL query, they should use parameter names and then provide the parameters to the SQL command. The idea behind this is that any value provided by the users will be escaped correctly using parameters. Today I found that this solution works perfectly for parameter values, but less perfectly for parameter names. Try to use a parameter with a name containing a single quote and you will get an error. Let's analyse this.

Here is a piece of code that just executes some random SQL text command, but it also adds a parameter containing a single quote in the name:
using (var conn = new SqlConnection("Server=localhost;Database=Test;UID=sa;Trusted_Connection=True;"))
{
    conn.Open();
    using (var comm = new SqlCommand())
    {
        var paramName = "a'";

        comm.Connection = conn;
                    
        comm.CommandText = "SELECT 1";
        comm.Parameters.Add(new SqlParameter(paramName, SqlDbType.NVarChar, 100)
        {
            Value="text"
        });

        comm.ExecuteNonQuery();
    }
    conn.Close();
}

As you can see, the text content of the SQL command is irrelevant. the name of the parameter is important and for this I just used a single quote in the name. Running SQL Profiler, we get this query string that is actually executed:
exec sp_executesql N'SELECT 1',N'@a'' nvarchar(100)',@a'=N'text'
In this example the name of the parameter is properly handled in the string defining the name and type of the parameters, but it is NOT escaped during the parameter value declaration. A small change of the code with paramName="a='';DELETE * FROM SomeTable --" results in an interesting query string in the SQL Profiler:
exec sp_executesql N'SELECT 1',N'@a='''';DELETE FROM SomeTable -- nvarchar(100)',@a='';DELETE FROM SomeTable --=N'text'
Strangely enough, when inspecting the SomeTable table, the values are still there, even if copying the text into SQL Management Studio actually deletes the values. A similar construction using stored procedures leads to a completely legal SQL that is recorded by SQL Profiler, but it doesn't really do anything:
using (var conn = new SqlConnection("Server=localhost;Database=Test;UID=sa;Trusted_Connection=True;"))
{
    conn.Open();
    using (var comm = new SqlCommand())
    {
        var paramName = "a='';DELETE FROM SomeTable --";

        comm.Connection = conn;
        comm.CommandText = "DoTest";
        comm.CommandType = CommandType.StoredProcedure;
        comm.Parameters.Add(new SqlParameter(paramName, SqlDbType.NVarChar, 100)
        {
            Value="text"
        });

        int k = 0;
        using (var reader = comm.ExecuteReader())
        {
            while (reader.Read()) k++;
        }
        Console.WriteLine(k);
    }
    conn.Close();
}
... with the resulting SQL:
exec DoTest @a='';DELETE FROM SomeTable --=N'text'

I have demonstrated a method of sending a maliciously crafted parameter name to the SqlCommand class and managing to send to the SQL Server a query that should achieve a destructive result. So why doesn't it actually do anything?

The explanation is in the EventClass column of the SQL Profiler. While a normally executed SQL command (let's say from SQL Management Studio) has event classes of SQL:BatchStarting and SQL:BatchCompleted, the query resulting from my attempts have an EventClass of RPC:Completed. It appears that the RPC method of sending queries to the SQL Server doesn't allow for several commands separated by a semicolon. The result is that the first command is executed and the rest are apparently being ignored. The TDS protocol documentation shows that the RPC method is a system of executing stored procedures by sending a binary structure that contains stuff like the name of the procedure, the parameters and so on. Since an SQL text is actually translated into a call to the sp_executesql stored procedure, RPC is used for both types of SqlCommand: Text and StoredProcedure.

I don't have the time to explore this further, but I wonder if this can be used for any type of SQL injection. To make sure, try to check the names of the parameters, if they come from user input.

Sunday, July 05, 2015

Getting a blue screen of death when stopping debugging in Visual Studio (Ping, PROCESS_HAS_LOCKED_PAGES)

As you know from the previous post, I am working on a network monitor application that displays information about the devices in my local network. For that I need to ping them to see if they are available. The simplest solution is to use the out-of-the-box solution of System.Net.NetworkInformation.Ping. It was no little shock to stop debugging my application and find myself facing a Blue Screen of Death, you know, the blue thing with white text that means everything is fucked, with the error message PROCESS_HAS_LOCKED_PAGES.

Googling around I found that the problem is, indeed, coming from the Ping class. Since the application is full with BackgroundWorkers and threads and stuff like that, Ping was actually the furthest from my mind. I even suspected that my laptop is dying. Microsoft, blessed be their hearts, not only did they ignore the bug, which is logged on their Connect site, but they eventually closed it with no resolution. Therefore I resolved to find a solution by myself.

I tried to see if there are any alternatives to the Ping class. I downloaded two implementations: PingWin and Pinger, both seemingly functional. PingWin, was using the Windows IcmpSendEcho function, via Pinvoking icmp.dll. Pinger was using raw sockets. After replacing the Ping class with a class with the same members and properties I was using, but wrapping PingWin, I checked to see if the application was behaving as expected, then stopped debugging. Again I got the BSOD, suggesting what I had read from other people with the same problem, that it is not a .NET managed code issue, but instead it is a problem stemming from the lower level code in the Windows operating system.

As noted in the IcmpSendEcho documentation, there are two implementations, one included in Icmp.dll, which comes from Windows 2000, and one included in Iphlpapi.dll, which comes included in Windows XP and later. Replacing the target of the DllImport attribute to Iphlpapi.dll resulted in an application that works exactly the same, crashing just the same. Looking at the source code of the Ping class, I see that it is using the same mechanism, calling the functions exported by Iphlpapi.dll.

Using the raw socket implementation did not cause a crash, though, so I can recommend you use it, if you can, since everything using Iphlpapi.dll or Icmp.dll seems to be affected by this. There are some limitations to using raw sockets in Windows, but I think ICMP may be exempt. You should research some more if you want to use it in an application that must work in all kind of security environments.

Saturday, July 04, 2015

A curvy WPF MVVM graphical chart using Bezier curves

In this post I will discuss the following:
First, let's discuss the project that I was working on which led to this work. I wanted to do a program that manages the devices on my network. There is a router and several Wi-fi extenders, all of them with an HTTP interface. I wanted to know when they are reachable through the network, connect to the HTTP interface and gather data or perform actions like resetting the device and so on. In order to see if they were reachable, I was pinging them every second, so I thought I would like to see the evolution of the ping roundtrip time in a visual way, therefore the chart.

All of the values that I was displaying and all the commands that were available on the interface were using MVVM, the pattern developed by Microsoft for a better separation of presentation and data model. MVVM presents some difficulties, though, since most of the time directly getting the data and displaying it is more efficient and easier to do. It does allow for fantastic flexibility and good maintenance of the project. So, since I am a fan, I wanted to draw this chart via MVVM as well.

The MVVM chart


In order to do that I needed a viewmodel that abstracted the chart. Since I had several devices, each of them with a collection of pings containing the time of the ping and a nullable rountrip value, it would have been way too annoying to try to chart the values directly, so on the main viewmodel I created a specific chart model. This model contained a BindingList of items of various custom types: GraphLines, GraphStarts and GraphEnds. When the ping failed I added an "end" to the model. When the ping succeeded after a fail, I would add a "start". And when the ping was continuously successful, I would add a "line" connecting the previous ping to the current one.

So, in order to draw anything, I used a Canvas. The Canvas is a very simple container that can position stuff at absolute values. The first thing you need to realize is that it is not a vectorial type of container, so when you draw something on a small canvas and you resize the window, everything remains at the same position and size. The other thing that quickly becomes apparent is that there are various ways of positioning objects on the Canvas. The attached properties Canvas.Top, Canvas.Left, Canvas.Bottom and Canvas.Right can define the position of TextBlocks or other elements, including Rectangles and Ellipses. Lines, on the other hand, whether simple Line objects or something more complex, like Paths, are positioned using Points and X,Y coordinates. This would come to bite me on the ass later on.

WPF is very flexible. In order to add things to a Canvas, all one needs to do is to declare an ItemsControl and then redefine the ItemsPanel property to be a Canvas. The way objects are represented on the Canvas can be defined via DataTemplates, in my case one for each type of item. So I created a template that contains a Line for the GraphLine type, another for GraphStart, containing a Rectangle, and one for GraphEnd, containing an Ellipse. Forget the syntax right now, first I had to solve the problem of the different ways to position something on a Canvas and the ItemsControl. You see, in order to position a Line, all you have to do is set the X1,Y1,X2,Y2 properties, but for Ellipses and Rectangles you need to set Canvas.Left and Canvas.Top. The problem with the ItemsControl is that for each of these not primitive objects it creates a ContentPresenter to encapsulate them, therefore setting Canvas properties to the inner shape did nothing. The solution is to set a style for the ContentPresenter and set the Canvas properties on it. Surprise! Then the Lines stop working! The solution was to add several Canvases, one for the lines and one for the rectangles and ellipses, as ItemsControls, and one for static text and stuff like that, all in the same Container so that they overlap. But it worked. Then I started the program and watched the chart being displayed.

<ItemsControl ItemsSource="{Binding GraphItems}" Name="GraphLines">
    <ItemsControl.Resources>
        <DataTemplate DataType="{x:Type local:GraphLine}">
            ...
        </DataTemplate>
        <DataTemplate DataType="{x:Type local:GraphSpline}">
            ...
        </DataTemplate>
        <DataTemplate DataType="{x:Type local:GraphStart}">
            ...
        </DataTemplate>
        <DataTemplate DataType="{x:Type local:GraphEnd}">
            ...
        </DataTemplate>
    </ItemsControl.Resources>
    <ItemsControl.ItemsPanel>
        <ItemsPanelTemplate>
            <Canvas/>
        </ItemsPanelTemplate>
    </ItemsControl.ItemsPanel>
</ItemsControl>

But how did I calculate the coordinates of all of these items? As I said, the Canvas is a pretty static thing. If I resized the window, the items would remain in the same position and with the same size. Also, the viewmodel didn't have (and shouldn't have had) an idea of the actual size of the drawing Canvas. My solution was to use a MultiBinding with a custom converter. It would get two values, one would be a computed double value, from 0 to 1, that represented either vertical or horizontal position, the second would be the value of the dimension, the height or the width. The result would be, of course, the product of the two values. Luckily WPF has a very flexible Binding syntax, so it was no problem two define a value from the viewmodel and a value of the ActualWidth or ActualHeight properties of the Canvas object. This resulted in a very nice graph that adapted to my resizing of the window in real time without me having to do anything.

<Line Stroke="{Binding Ip, Converter={StaticResource TextToBrushConverter}}" StrokeThickness="2" >
    <Line.X1>
        <MultiBinding Converter="{StaticResource ResizeConverter}">
            <Binding Path="X"/>
            <Binding Path="ActualWidth" RelativeSource="{RelativeSource Mode=FindAncestor, AncestorType={x:Type Canvas}}"/>
        </MultiBinding>
    </Line.X1>
    <Line.Y1>
        <MultiBinding Converter="{StaticResource ResizeConverter}">
            <Binding Path="Y"/>
            <Binding Path="ActualHeight" RelativeSource="{RelativeSource Mode=FindAncestor, AncestorType={x:Type Canvas}}"/>
        </MultiBinding>
    </Line.Y1>
    <Line.X2>
        <MultiBinding Converter="{StaticResource ResizeConverter}">
            <Binding Path="X2"/>
            <Binding Path="ActualWidth" RelativeSource="{RelativeSource Mode=FindAncestor, AncestorType={x:Type Canvas}}"/>
        </MultiBinding>
    </Line.X2>
    <Line.Y2>
        <MultiBinding Converter="{StaticResource ResizeConverter}">
            <Binding Path="Y2"/>
            <Binding Path="ActualHeight" RelativeSource="{RelativeSource Mode=FindAncestor, AncestorType={x:Type Canvas}}"/>
        </MultiBinding>
    </Line.Y2>
</Line>

Performance


The next issue in the pipeline was performance. Clearing the GraphItems collection and adding new items to it was very slow and presented some ugly visual artifacts. For this I used the inner mechanisms of the BindingList object. First I set the RaiseListChangedEvents property to false, so that the list would not fire any events to the WPF mechanism. Then I cleared the list,added every newly calculated GraphItem to the list, set RaiseListChangedEvents back to true and fired a ListChanged event forcefully using the (badly named) ResetBindings method.

GraphItems.RaiseListChangedEvents = false;
GraphItems.Clear();
foreach (var item in items)
{
    GraphItems.Add(item);
}
GraphItems.RaiseListChangedEvents = true;
this.Dispatcher.Invoke(GraphItems.ResetBindings, DispatcherPriority.Normal);

All good, but then the overall performance of the application was abysmal. I would move to another program, then switch back to it and it wouldn't show up, or I would press a button and it wouldn't show up pressed, or the values of the data from the devices were not displayed sometimes. It wasn't that it used too much CPU or memory or anything like that, it was just a very sluggish user experience.

First idea was that the binding to the parent Canvas object to get the ActualWidth and the ActualHeight values was slow. I was right. In order to test this I removed any bindings to the Canvas and instead set the values directly to the converter, via the SizeChanged event of the Canvas object. This made things slightly faster, but also made them look weird, since I would resize the window and only see a difference after SizeChanged fired. The performance gain was significant, but not that large. The UI was still sluggish.

void Canvas_SizeChanged(object sender, SizeChangedEventArgs e)
{
    var resizeConverter = (ResizeConverter)this.Resources["ResizeConverter"];
    resizeConverter.Size = e.NewSize;
}

Now, you would ask yourself, what is the purpose of my using this ItemsControl and Canvas combination? It is in order to use the MVVM pattern. Just drawing directly on the Canvas would violate that, wouldn't it? Or would it? In this case the binding of the values in the viewmodel to the chart is one way. I only need to display stuff and nothing that happens on the chart UI changes the viewmodel. Also, since I chose to recreate all the chart items at every turn, it just means I am delegating clearing the Canvas and drawing everything to the WPF mechanism, nothing more. In fact, if I would just subscribe to the GraphItems ListChanged event I would be able to draw everything and not really have any strong link between data model and presentation. So I did that. The side effect of this was that I didn't need two ItemsControl/Canvas instances. I only needed one Canvas and I would add items to it as I saw fit.

Of course, the smart reader that you are, you realized that I need to know the type of the viewmodel in order to subscribe to the items list. The very correct way to do it would have been to encapsulate the Canvas into a control that would have received a list of items as a model and it would have handled all the drawing itself. It makes sense: you don't want a Canvas, what you really want is a Chart component that handles everything for you. I leave that to the enterprising reader, since it is outside the scope of this post.

Another thing that I did not do and it probably made sense in terms of performance, was to add items to the chart, somehow translate the position of the chart and remove the items that were outside the visible portion of the chart. That sounds like a good feature of the Chart control :) Again, I leave it to the reader to try to do something like that.

Bezier curves instead of lines


The last thing that I want to cover is making the chart less jagged. The roundtrip ping values were all over the place resulting in a jagged line kind of chart. I wanted something smoother, like a continuous curvy line. So I decided to replace the Line representation with a Bezier curve one. I am not a graphical person, neither a math geek. I had no idea what a Bezier curve is, only that it helps in creating these nice looking curves that blend into each other. Each Bezier curve is defined by four points so, in my ignorance, I thought that I just have to pass four points from the list instead of the two required to form a Line. The result was hilarious, but not what I wanted.

Reading the theory we learn that... what the hell is that on Wikipedia? How can anyone understand that?!... Ugh!

So let's start with some experiments. Let's use the wonderful XamlPadX application to see some examples of that using WPF. First, let's draw a jagged three line graphic and try to use the four points to define a Bezier curve and see what happens.

<Page xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation" xmlns:sys="clr-namespace:System;assembly=mscorlib" xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml" >
  <Canvas>
    <Line X1="100" Y1="100" X2="200" Y2="300" Stroke="Gray" StrokeThickness="2"/>
    <Line X1="200" Y1="300" X2="300" Y2="150" Stroke="Gray" StrokeThickness="2"/>
    <Line X1="300" Y1="150" X2="400" Y2="200" Stroke="Gray" StrokeThickness="2"/>
    <Path Stroke="Red" StrokeThickness="2">
        <Path.Data>
            <PathGeometry>
                <PathGeometry.Figures>
                    <PathFigureCollection>    
                        <PathFigure StartPoint="100,100">
                            <PathFigure.Segments>
                                <PathSegmentCollection>
                                    <BezierSegment Point1="200,300" Point2="300,150" Point3="400,200" />
                                </PathSegmentCollection>
                            </PathFigure.Segments>
                        </PathFigure>
                    </PathFigureCollection>
                </PathGeometry.Figures>
            </PathGeometry>
        </Path.Data>
    </Path>
</Canvas>
</Page>



As we can see, the curve does touch the first and the fourth points and sort of approximates the line, but not very clearly. The problem becomes even more obvious when we add another point and we create two Bezier curves, from the first and last four points. The two curves intersect, they are not continuous. Even if you take points four by four, the resulting curves, even if they continue each other, they do it with straight corners, the opposite of what I wanted.




Let's try the opposite, let's draw one Bezier curve and then lines that connect the first and second and then the third and fourth points. We see that the lines define tangents to the two arcs comprising the Bezier curve. That intuitively tells us something: if two Bezier curves would to seamlessly blend into each other, then the straight lines that define them would also have to be continuous. We try that in XamlPadX and yes! It works.




So, from this we learn something. First of all, the first and last points of the Bezier have to be the points used in a normal Line. Then the last two points need to be part of the same line for the first two points of the next curve. So what about the second and third points? How do I choose those? Can I choose any lines to define my curves? Thinking of the chart that I am looking for, I just want that the jagged edges turn into nice little curves. I also don't want to think of other points than the points that would normally define a single line, that means I shouldn't use future data in defining the middle points of the curve that defines current data. So I just made the decision to use only horizontal lines to define curves. That means for any pair of coordinates X1,Y1, X2,Y2 I would create four pairs like this: X1,Y1 X1+something,Y1 X2-something,Y2 X2,Y2. That value could be anything, but I've decided it would be a percentage of the horizontal distance between two points.

Final result: using a percentage, let's say 20%, I would turn the pair of coordinates into X1,Y1 X1+(X2-X1)*0.2 X1+(X2-X1)*(1-0.2) X2,Y2. Let's see how that looks on the original jagged line. Let's use 50% instead. And for some fun, let's put it to 80%, 100% and even 200%.

<Page xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation" xmlns:sys="clr-namespace:System;assembly=mscorlib" xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml" >
  <Canvas>
    <Line X1="100" Y1="100" X2="200" Y2="300" Stroke="Gray" StrokeThickness="2"/>
    <Line X1="200" Y1="300" X2="300" Y2="150" Stroke="Gray" StrokeThickness="2"/>
    <Line X1="300" Y1="150" X2="400" Y2="200" Stroke="Gray" StrokeThickness="2"/>
    <Path Stroke="Red" StrokeThickness="2">
        <Path.Data>
            <PathGeometry>
                <PathGeometry.Figures>
                    <PathFigureCollection>    
                        <PathFigure StartPoint="100,100">
                            <PathFigure.Segments>
                                <PathSegmentCollection>
                                    <BezierSegment Point1="150,100" Point2="150,300" Point3="200,300" />
                                </PathSegmentCollection>
                            </PathFigure.Segments>
                        </PathFigure>
                        <PathFigure StartPoint="200,300">
                            <PathFigure.Segments>
                                <PathSegmentCollection>
                                    <BezierSegment Point1="250,300" Point2="250,150" Point3="300,150" />
                                </PathSegmentCollection>
                            </PathFigure.Segments>
                        </PathFigure>
                        <PathFigure StartPoint="300,150">
                            <PathFigure.Segments>
                                <PathSegmentCollection>
                                    <BezierSegment Point1="350,150" Point2="350,200" Point3="400,200" />
                                </PathSegmentCollection>
                            </PathFigure.Segments>
                        </PathFigure>
                    </PathFigureCollection>
                </PathGeometry.Figures>
            </PathGeometry>
        </Path.Data>
    </Path>
</Canvas>
</Page>







That's it, folks. I hope you enjoyed this as much as I did and it helps you in future projects.