Tuesday, August 30, 2011

Ghost in the Wires, by Kevin Mitnick

Book coverWhen I was a child I watched with huge eyes movies like Hackers, enjoying the shenanigans of computer rebels fighting the stupid law enforcement and the "evil" hackers. Of course, there was also Angelina Jolie. Even then I knew that my pleasure was a guilty one: no way could the police be that stupid, no way it would be that easy to penetrate all kinds of systems and produce effects so flashy. A while after that I watched Skeet Ulrich in the movie Operation Takedown, which was a more realistic hacker movie (and one I think Skeet did a great job in). It depicted how Kevin Mitnick has been apprehended by the authorities. I really loved that movie, although it had a lot of eye rolling moments.

Fast forward to now, reading Ghost in the Wires, Kevin Mitnick's book about himself, practically a hacking autobiography, and I loved this book every bit as much as I liked those movies as a kid. Not only I couldn't leave the book out of my hands once I started reading it, but was shocked to see that reality is not that far away from what was depicted in hacking movies. It was also interesting to read how the script of Operation Takedown came to be, which Kevin considers defamatory and mostly untrue.

Long story short, Mitnick is a smart kid with a great memory, an absent father and no real friends. He starts dabbling with radio and telephones and manages to get access to phone systems way before computers where personal or connected to each other. He's a kid, though, and he gets caught a few times. Nobody seems to understand he does it just for the fun of it and he can't seem to understand why nobody gets him. In the end, pushed by the desire to challenge himself, but also by authorities baiting him all the time, he becomes a life long hacker and eventually gets caught.

A shocking part of the book is how easy it is to penetrate any system, not by whatever technical wizardry, but by simply tricking people into giving you information and access. Called "social engineering" it was Mitnick's strongest point and at several times in the book, when the technology would not allow him to enter one system or another, he would just abandon the tech stuff and go with tricking people. Already having knowledge on how to manipulate phone systems made that a lot easier, as well.

Another, less shocking, but utterly disappointing part is about authorities. Just as they are now about file sharing and whatever "crisis" they are in, law enforcement agencies are basing their entire existence on pure power of coercion, ignoring the rules that they themselves are enforcing and being motivated only by keeping that power in their hands. Technical morons, they only seem to be getting into the action when their pride is affected. In this book Kevin Mitnick dances around security personnel, local cops, FBI, NSA several steps ahead of them, but they only seem to really mind when newspapers start publishing articles that makes law enforcement look bad. And once they have him, caught only with the help of other hackers, they are using all the dirty tricks in the book to bring Mitnick to his knees. Nothing has changed from then to now, just look at cases like Gary McKinnon's. Intimidation is a bully's greatest strength. That's sad.

I would have to say that the most unexpected thing was the tone of the book, which is almost exuberant. Mitnick has not become a bitter and paranoid man after countless personal betrayals and authority abuse and he is not angry at all. If anything, the guy is happy to have lived as the lead actor in the "Myth of Kevin Mitnick", which has grown way bigger than the real person. There is a scene when he gets outside of a building and there are hundreds of fans there, shouting, and he looks behind to see if there is a celebrity around.

Bottom line: this is a book you can't miss. It is easy to read to the point of instantly addictive, it is well written with enough juicy technical details to keep one interested and, most of all, makes you feel good, even in the horrible moments of his detention. It makes one wonder, did Mitnick socially engineer himself into remaining an open and cool guy in the face of adversity? Or is it he had this strength all along and that is his most powerful "magic"?

Sunday, August 28, 2011

A Dance with Dragons, by George R.R. Martin

Book coverI've always felt that a lot of movies and books lack realism, that when the hero survives and gets the girl, while the villains get their rightful punishment the whole story becomes null and void. Therefore I believed that the opposite of this - a character centric plot, a realistic story, unexpected outcomes - would spell a good story and wonderful books/movies. Such is the tale in A Song of Ice and Fire, the seven (until now) book saga that I am currently reading, in which A Dance with Dragons is the fifth book.

Imagine my surprise in discovering that the other extreme is just as bad. In ASOIAF, there is no real main character. A lot of heroes and villains to choose from, but, completely unexpected, heroes suffer and die, while villain prevail. Maybe it's the other way around. Maybe nobody truly prevails. Certainly they all suffer. So what is wrong with this picture? George R. R. Martin went too far. His tale is now more of an alternative history than a true story. Maybe the sixth book will be different, but how can it be and still remain faithful to the first five? How can it continue the way the story went so far without a pointless struggle, where random things happen to random people? I mean, if I like this, I should start reading history books. My father would be pleased, I am sure :)

A Dance with Dragons takes place in a period in time that mostly overlaps the fourth book. The author acknowledged that there were too many characters doing too much in that short amount of time and, rather than split the storyline in the middle and create the "left hanging" effect, he opted for a geographical (more or less) division of the plot.

I liked the book. It started with a lot of setup, and that is what most of the book is about, but it also got deeper into magic and Arya's assassin training and what Jon and Tyrion did. I would say they are my favourite characters so, in that respect, the fifth book is better than the fourth. Can't say more about it without ruining the fun, but I was not disappointed, only slightly annoyed to see that what I really wanted in a book I got and I was still not completely satisfied.

My theory is that the stern characters in the book represent Martin's own father, while Tyrion and Jon are reflections of his own persona. Sansa may be the clueless girl who broke his heart, while Arya would be his daughter or younger sibling. I am just theorizing here, as I have not read any bio information of the author. Perhaps he represented himself as Wun Wun :)

I am waiting for the sixth book to appear, but the second book in the saga appeared two years after the first, the third two years later, then five years, then six! At this rate the sixth book will be published in 2018, with Martin of 70 years of age and waiting for the seventh eight or nine years later.

Wednesday, August 17, 2011

Weird 'DataTableReader is invalid for current DataTable' exception

I was debugging an application when I've noticed that there was an exception thrown in one of the legacy DAL modules. A method was using regular ADO.Net to run a stored procedure, fill a DataSet, then proceeded on getting an IDataReader for the set, using the CreateDataReader method. On reader.Read() the exception DataTableReader is invalid for current DataTable 'Table' was thrown.

I've investigated the issue only to notice that the DataSet was correctly retrieved from the database, the only problem came when creating the reader. Any meaningful property threw this error. I've found a thread that discussed this problem here, but the answer was not there. Instead, I read an obscure line somewhere that said this exception is thrown when the DataSet has changes and tried that solution: before running CreateDataReader, I ran DataSet.AcceptChanges. And it worked!

The strange part comes now: I did the AcceptChanges bit in the Watch window during debug, not as a permanent change to the code. From then on, the code worked, no matter how many times I ran iisreset or restarted the browser. I've added the solution in the code for good measure, but I am still not sure if this is the solution or simply some fluke of the universe. One possible answer is the "race condition" described in this discussion, which also suggests this happens in debug mode only. Strange, innit?

Update: AcceptChanges did not solve the issue on the production server. I am still investigating, but if you know what this is about, please share :)

Saturday, August 13, 2011

A bit of randomness

A colleague of mine asked a question that seemed trivial, but then it revealed interesting layers of complexity: how would you build an algorithm for a random number in any integer interval assuming that you already have a function that returns a random binary bit? The distribution of the bit is perfectly random and so it should be that of your function.

My first attempt was to divide the interval in two, then choose the first or second half based on the random bit function. This worked perfectly for intervals of even length, but there were issues with odd sized intervals. Let's take the most basic version there is: we want a random number between 7 and 9. The interval has a size of 3, which is not divisible by 2.

One solution is to split it in half anyway, ignoring one number, then use the random bit function one more time to determine in which half the remaining number should be added. For example the random bit yields 1, so we add the odd number to the second half: 7,8,9 -> 7 and 8,9 . Now the random bit is 0, thus choosing the first half, which is 7. This sounds good enough, let's see how this works:

Possible random bit results:
  • 0 (7,8|9)
    • 0 (7|8)
      • 0 (=7)
      • 1 (=8)
    • 1 (=9)
  • 1 (7|8,9)
    • 0 (=7)
    • 1 (8|9)
      • 0 (=8)
      • 1 (=9)

The interesting part is coming when deciding (pun not intended) what type of probability we would consider. From the tree above, if we take the terminal leafs and count them, there are exactly 6. Each of the numbers in the interval appear exactly twice. There is a perfectly balanced probability that a number will appear in the leaf nodes. But if we decide that each random bit run divides the total probability by two, then we have a 50% chance for 0 or 1 and thus the probability that 7 would be chosen is 1/4 + 1/8 (3/8), the same for 9, but then 8 would have a 2/8 probability to be chosen, so not so perfect.

What is the correct way to compute it? As I see it, the terminal graph leaf way is the external method, the algorithm can end in just 6 possible states and an external observer would not care about the inner workings of the algorithm; the second is an internal view of the use of the "coin toss" function inside the algorithm. The methods could be reconciled by continuing the algorithm even when the function has terminated, until all the possible paths have the same length, something akin to splitting 7 in two 7 nodes, for example, so that the probability would be computed between all the 2 to the power of the maximum tree height options. If the random bit yielded 0, then 0, we still toss the coin to get 000 and 001; now there are 8 terminal nodes and they are divided in 3,2,3 nodes per numbers in the interval. But if we force this method, then we will never get a result. No power of two can be equally divided by 3.

Then I came with another algorithm. What if we could divide even an odd number in two, by multiplying it with two? So instead of solving for 7,8,9 what if we could solve it for 7,7,8,8,9,9 ? Now things become interesting because even for a small finite interval length like 3, the algorithm does not have a deterministic running length. Let's run it again:

Possible random bit results:
  • 0 (7,7,8)
    • 0 (7,7,7)
    • 1 (7,8,8)
      • 0 (7,7,8)... and so on
      • 1 (8,8,8)
  • 1 (8,9,9)
    • 0 (8,8,9)
      • 0 (8,8,8)
      • 1 (8,9,9)... and so on
    • 1 (9,9,9)

As you can see, the tree looks similar, but the algorithm never truly completes. There are always exactly two possibilities in each step that the algorithm will continue. Now, the algorithm does end most of the time, with a probability to end increasing exponentially with each step, but its maximum theoretical length is infinity. We are getting into Cantoresque sets of infinite numbers and we want to calculate what is the probability that a random infinite number would be part of one set or another. Ugh!

And even so, for the small example above, it does seem that the probability for each number is 25%, while there is another 25% chance to continue the algorithm, but if you look at the previous stage you have a 25% chance for 7 or 9, but no chance for 8 at all. If we arbitrarily stop in the middle of the algorithm, not only does it invalidate the result, but also makes no sense to compute any probability.

You can look at it another way: this new algorithm is splitting probability in three equal integer parts, then it throws the rest into the future. It is a funny way of using time and space equivalence, as we are trading interval space for time. (See the third and last algorithm in the post)

My conclusion is that the internal method of computing the probability of the result was flawed. As a black box operator of the algorithm I don't really care how it spews its output, only that it does so with an as perfect probability as possible (pun, again, not intended). That means that if I use the algorithm two times there is no way it can output equals amounts of three values. The probability can't be computed like that. If we use it a million times we would expect a rough 333333 times of each value, but still one would be off one side or another. So the two algorithms are just as good.

Also, some people might ask: how can you possible use the second algorithm for large intervals. You are not going to work with arrays of millions of items for million size intervals, are you? In fact, you only need five values for the algorithm: the limits of the interval (a and b), the amount of lower edge values (p), the amount for the higher edge (r), then the amount for any number in between (q). Example: 7778888888899999 a=7, b=9, p=3, q=8, r=5 . You split this in two and (for the coin toss of 0) you get 7778888 a=7, b=8, p=3, q=1 (don't care at this point), r=4. The next step of the algorithm you multiply by two p,q and r and you go on until a=b.

You can consider a simpler version though: there are three values in the interval so we need at least a number equal or bigger than three that is also a power of two. That means four, two coin tosses. If the coin toss is 00, the result is 7; if the coin toss is 01, the result is 8; for 10, the result is 9. What happens when you get 11? Well, you run the algorithm again.

Thursday, August 11, 2011

Sending an array to a stored procedure in Sql Server 2008

I needed to pass an array of IDs to a stored procedure on SQL Server 2008. This version of the server supports user defined table types and a way to access it from .Net, of course. A comprehensive resource for sending arrays to any version of SQL Server can be found here.

Long story short, for 2008 you first define a user table type that has a single int column (we are talking about an array of integers here, obviously), then a stored procedure that takes a parameter of that type. A way to send the array from .Net code is detailed here. As you can see, you create an array of something called SqlMetaData, holding the information of each column as defined in the user defined type, then you use an SqlParameter of SqlDbType Structured and with the TypeName the name of the user defined table in SQL Server. The parameter will have a list of SqlDataRecord instances that have the integer values in their first columns. Yes, there is an even longer story and I consider this short :-P

All nice and easy, but there is a caveat, something that is not immediately obvious from the code. The column metadata is set as a property value for any of the records that are added to the sql parameter value list. What if the list is empty? In this case it appears that there is a bug somewhere. The stored procedure fails, I guess because it does not receive the structure of the user defined table declared in the metadata and cannot map it to the user defined type.

A solution for this is to add a dummy SqlDataRecord with no values and then, in the stored procedure, check for NULL. A very ugly solution. The solution on Erland Sommarskog's blog did not say anything about this specifically, but I did find this: There are a few peculiarities, though. This does not work:
EXEC get_product_names NULL

but results in this error message:
Operand type clash: void type is incompatible with integer_list_tbltype

It is quite logical when you think of it: NULL is a scalar value, and not a table value. But what do you think about this:
EXEC get_product_names

You may expect this to result in an error about missing parameters, but instead this runs and produces an empty result set!
. Therefore the solution I used was to check in code if the .Net list of integers was empty and, in that case, do not send a parameter to the stored procedure. And it worked.

Tuesday, August 09, 2011

Removing User Access Control from Windows 7, even if enforced by domain policy

UAC is the most annoying feature of Windows 7, one that just has to be specifically designed to annoy the user with useless "Do you want to" alerts. Not even regular alerts, but screen dimming modal panic dialogs. I want it off. There are several options to do that, but the automated way to get rid of UAC is to change a value in the registry. Here is the reg file for it:
Windows Registry Editor Version 5.00


Warning! If you don't know what a registry entry is, then you'd better not do anything stupid! Now, if I could dim the screen while you read that, I would be Microsoft material.

But I digress. In order to execute the reg file above you need to save it into a file with the .reg extension (let's say nouac.reg) and then load it with regedt32 /s nouac.reg. The /s switch removes any confirmation messages and silently loads the file into the registry effectively disabling UAC.

However, my laptop is in a domain with some ridiculous policies enforcing the UAC setting so that when I next restart the computer I get the annoying popups again. Now if I could only run nouac.reg at logoff, I would be all set. And we can do that, also. Just run gpedit.msc, go to User Configuration -> Windows Settings -> Scripts (logon/logoff) and double click logoff. Create a batch file that contains the line to load the registry file and then add it as a logoff script. All set.

Update: Hold the celebration. After I restarted my computer, the hated UAC was back. Either the script was not executed, or it doesn't work like that because the policy is enforced after logoff scripts or before logon. Drat it!

Saturday, August 06, 2011

Computer vs. Human in chess

Ok, I am cheating now. I was feeling bad for not playing chess lately (or playing badly when I had other stuff to do, generating even more guilt) and having nothing to blog about except maybe books and also thinking about all the other directions of the blog that I failed to cover: programming, music, tech news.

So I bring you Brute force or intelligence? The slow rise of computer chess, which is an article about chess, it is from Ars Technica (tech news) and it involves some notions of programming. All I need for this to be complete is music!

Seriously now, I went to a friend's last night and played a bit of chess. We were both a little tired and drunk, so we played chess "for fun" (which translates to incredibly bad), but it really felt fun as opposed to playing a computer at a very low level. Why is that? I believe it is all about prioritization.

When a human plays, he is trying to use the principles of chess, but he doesn't have the time or mental resources to take each one and analyse each piece or position. Humans do use subconscious mechanisms to quickly scan a table, but that only comes with a lot of chess training. So basically, what a beginner human player is left with is finding a strategy that would quickly and (preferably) forcibly win the game. That means that we use something akin with the "Type B" algorithm from the article above. But it's not quite it, because it is a bit of everything, something that is traditionally hard to implement in a computer program (and that has more to do with the psychology of programming engineers than with a specific level of difficulty). Basically we look at the pieces, prioritised by their power and reach as well as their position relative to an area of attack or defence. That is why we don't see the queen or bishop in the corner of the table, because, looking in ever wider circles around the area we are focused on, we suddenly stop and start doing something else. Compare that with a computer which can take the measly 32 pieces on the board and computer in a few fractions of a second all their possible moves and the resulting board position.

Then, when we see a possible good move, we take it forward as many steps as we can. Does a chess beginner do a comprehensive tree of all possible moves in that scenario? Of course not. Not only we do not see all (or most) of the moves, but when we see a possibility for the opponent to play a counter move, we quickly analyse the likelihood that the other guy would see it and sometimes we even gamble that they won't do it, just because we wish they didn't. This is also psychological: the gambler way of thinking has been documented for a while, they are motivated by loss which gives them more of an adrenaline rush than winning or that makes winning ever sweeter; also the guy we play with is probably our friend and we partly root for the guy as well. Program that into a computer! I've had games where I took huge risks on the hope that my friend would a) not see the move, which would make me look good when playing a cool game and b) that he would see the move, making his game look cool, thus making the entire session interesting.

Back to programming, I think that the easiest way of implementing this kind of bad human play in a computer game is to take a normal computer algorithm for playing chess, like mini-max, then program a sort of Alzheimer routine, that would remove bits of its reasoning based on a probability computed from the following factors: proximity of pieces to a region of interest (which would also have to be defined, but let's just assume it would be the average of positions of the pieces involved in the current line of thought), the artistic value of a line of thought (which would be defined either by massive sacrifices for important gains, or by how severely we limit the opponent options - in other words: power), the probability that the opponent would see a move (computed based on current history of play) and also by the artistic value of the entire game, as described in the last paragraph.

In other words, what I am proposing here is that we have a perfect algorithm for playing chess, one that is limited by computing power alone. What we don't have is a good algorithm for bad play, for fun play. Most computer programs I've seen, including ChessMaster, which boasts with its ability to simulate human players of varying abilities, have incredibly stupid ways of limiting performance. For example: a knight wants to attack f7, the black soft spot; it has plans to move a bishop there as well. I move a pawn to prevent the bishop from attacking that spot and the computer takes with the knight, sacrificing a minor piece for a pawn and my king's ability to castle. Or a rook attacks a knight. It then takes the knight, even if defended. In other words, random, pointless moves. Every human move is purposeful, even if the purpose if flawed by bad judgement. Random moves won't do, they have to be moves that follow a plan, no matter how bad that plan is. We need a perfect algorithm for throttling the playing chess level. We need to look at human bad games, make their own chess database, extract rules for bad play and implement this into computers.

Wednesday, August 03, 2011

Dancing Naked in the Mind Field by Kary Mullis

Book coverKary Mullis is a chemist who, in 1983, invented Polimerase Chain Reaction, something that would revolutionize DNA analysis in terms of increased speed. He won the 1993 Nobel prize for that. He also is a controversial scientist who claims possible alien encounters and telepathy, denies global warming as an effect of human intervention, is skeptic about HIV causing AIDS and generally believes that most scientists are inventing reasons to get funded rather than doing anything scientific. He also admits smoking pot, taking LSD and generally experimenting with any mind altering chemical that he can make. He likes women and some of them like him. That is what this book is all about, a sort of "I am Kary Mullis, hear me roar!".

I started reading the book because I was falsely led to believe that he describes how training his mind with LSD lead him to the idea of PCR. The book is not about that at all, and if the article above is true about Mullis had an advantage over his colleagues: he had trained his brain to think differently by using hallucinogenic drugs, there is no mention of that in this book.

It is simply an autobiography, but written with gusto and sincerity. Some of the things he says are both logical and hard to accept (because so many others are of opposite views), some of them are simply personal beliefs. As many a talented person, he is intelligent, he had early opportunity to practice his passion (chemistry), a close friend to share it with and support from a local chemistry business owner who kind of adopted him for the summers and gave him the tools he needed to grow. The way he writes his book reminds me of the style of another scientist friend of mine: devoid of bullshit and intolerant of stupidity.

Bottom line, it is a nice book, simply written, short, I've read it in a few hours. It is a window in the life of an interesting person, and as such, I liked it. I can't say I've learned much from it, though, and that is somewhat of a disappointment coming from a book written by a man of science.