dotNoted


Observations of .Net development in the wild

Nice new Visual Studio Orcas feature

The window-switching dialog (Ctrl+Tab by default) in Visual Studio has been updated to show a snapshot of the window you are about to jump to. A very nice little feature to have added… it shaves a crucial few seconds off the information-seek time, which is more psychological than actual time saved, but each little boost adds up.


Filed under: Software Engineering

There will be a PDC07 – PDC in 2007!

http://msdn.microsoft.com/events/pdc/

Gist: "More Information on PDC 2007 coming soon!"

Filed under: Software Engineering

DynamicMethod in .Net 2.0 is very powerful, but a debugger visualizer is almost needed to take advantage of it

We’re creating a dynamic thunk layer to host .Net managed objects in Sybase’s PowerBuilder (both visual and nonvisual objects). We want a way to mark up our classes with a bit of metadata in the form of attributes, and then let reflection take over and write the PowerBuilder metadata needed to describe the objects. This works pretty well – reflection runs over the .Net assemblies at design time and the PowerBuilder metadata is generated (it isn’t dynamic). At runtime there were few options, since the calls are dynamic – late-bound calls need to be made in this case. But what about all that great .Net metadata…? Couldn’t we use it to bring a bit more load-time (or JIT-time) binding into the picture, so that the parameters are mapped out in memory to specific method handle locations? Well, in comes DynamicMethod. It is a dynamically built bit of IL which is then, of course, JITted into machine code… at load time. MS calls it Lightweight Code Generation, in case you haven’t found this bit already. It’s an important component of the dynamic-language groundswell in the .Net code space. All sorts of fun ideas, from dynamic app reconfiguration and optimization to evolutionary algorithms, stand out as easy to take advantage of. Sure, you could always do this with System.Reflection.Emit, but it’s much easier now, since all the assembly bookkeeping is taken out – methods belong to the current assembly.
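For readers outside .Net, the shape of the idea – build a method from nothing at runtime, then invoke it like any other method – has analogues everywhere. A rough Python sketch (the names here are invented for illustration; Python compiles to bytecode rather than JITting IL the way DynamicMethod does):

```python
# A rough analogue of lightweight code generation: build a function
# from source text at runtime, then call it like any other method.
# (DynamicMethod emits IL that gets JITted; Python compiles to bytecode.)
def make_adder(n):
    # Generate source for a brand-new function specialized to n.
    src = f"def adder(x):\n    return x + {n}\n"
    namespace = {}
    exec(compile(src, "<generated>", "exec"), namespace)
    return namespace["adder"]

add5 = make_adder(5)
print(add5(37))  # 42
```

The payoff is the same as with DynamicMethod: the generated code is a first-class method with no separate assembly (or module) ceremony around it.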

Debugging one of these dynamic methods is a big pain, though. I went through the trouble of wiring up some debug WriteLines to get an idea of why my hand-crafted IL worked but the dynamically generated stuff didn’t… and ran into an InvalidOperationException trying to access the GetILAsByteArray method. The exception text is very, very misleading: "System.InvalidOperationException : Operation is not valid due to the current state of the object." What this really means is "not implemented." For shame! The MethodBase base class throws this. This is what NotImplementedException or NotSupportedException are for, I believe. At least give some textual clues which are not misleading.

So, on a hunt for a better method of debugging these dynamic methods, I came upon this debugger visualizer. I was skeptical at first, but I tried it out and found it was very nice! It handles the most easily accessible examples, like those in the MSDN docs and Joel Pobar’s articles. Do try it when you go down this path… it will save you time and headache. However, you will still need to know IL.

Filed under: Software Engineering

WiX being dogfooded

Looks like WiX (Windows Installer XML – MS’s first open source project, and the installer technology we use for a number of products) is going mainstream, showing up in actual Microsoft products, not just obscure development tools…


Filed under: Software Engineering

Cyclomatic Complexity and Program Analysis

Cyclomatic complexity is a well-established software engineering metric that measures, in seemingly straightforward terms, how complex a program is, based on the number of branches within a module (function, procedure, subsystem, etc.) and how many modules it references. There are many definitions in circulation which essentially borrow from the same understanding – one which has a good grasp either on what the metric means intuitively or on the requisite math. Both approaches are valid, but as usual, a more formal understanding contributes to a more carefully constructed cognitive model, lets us move from the general to the specific more easily without logic errors, and thus yields more value. You can get the former by simply saying "the number of paths you can take through a module without duplication" without seeming to confuse anyone – we all know what a "path" and a "module" are, right? Probably not exactly. The latter formality takes a little work. Since the definitions of cyclomatic complexity I’ve found rarely give a formal definition of "linearly independent" alongside them, let’s see if we can raise the level of dialog and infuse a bit more rigor into the art of measuring. I won’t give equations – just narrative – unless sufficiently provoked.


A program consisting of a set of statements and data can be thought of as a graph – a mathematical construct which relates a "vertex" or "node", which is an element in a set of objects – conceptual or physical, doesn’t matter – to other vertexes or nodes (nodes from now on, since I want to avoid the contention between "vertices" and "vertexes"). These relations are "edges". In graphs, edges can be "directed" or "undirected". We’re interested in the directed edges between nodes, since programs have "flow" as the CPU marches forward, pushed by the voltage given it. Now, the set of all edges is, effectively, the program. If we start at the starting node (static void Main()) and follow each directed edge until the end, the program completes (or should, since a program which doesn’t have a terminating state is, in practical terms, an error). Finally, an edge exists between nodes (statements in the program) if they are logically adjacent in the program, due to the progression of the CPU. This includes branching and function calls.


Another way to look at the set of directed edges is as a set of vectors. A vector is a mathematical construct which relates certain quantities together in a structure – an ordered set, basically. Commonly, a physical force is modeled by a vector because it has more than one component (amount, or magnitude, and direction), and thus can’t be measured by a "scalar", or single, value. So, each edge in our program graph is a vector quantity whose magnitude and direction give us the next statement, from any starting statement, in our program space.


Now, take the set of all the vectors we can form in our program space and keep only the ones which don’t repeat edges in the graph. (Looked at another way, repeating an edge would mean executing the same two statements in succession in the same program state, which really can’t happen in a computer program anyway, barring some external influence on the instruction pointer register from debugging or an exploit: even in a loop, while the same statements are being executed, the loop counter is incrementing or a stream is being read and exhausted, so the program state is changing.) If the set of vectors we choose also covers all the nodes in our program graph, then we have a set of "linearly independent" edges, or paths, through our program. This is also known as a basis for the program space.


Once we have a basis – a set of linearly independent paths through the program – we can then proceed to evaluate the complexity of the now well-determined system. The added formalism could potentially buy us a lot: reducing the problem to a set of linear equations could allow us to develop optimizers for both compile time and run time, or just help us root out our own inefficiencies. The bare term “linearly independent” doesn’t allow for that immediate insight (Math 341 isn’t a requisite for CompSci at most schools, though most developers are comfortable with matrix math), but rather makes our work appear to have only “truthiness” instead of trustworthiness. So if you talk about cyclomatic complexity, please explain linear independence carefully, or make sure your references do.
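Consider this sufficient provocation for one small equation after all. McCabe’s standard formula counts the linearly independent paths as V(G) = E − N + 2P, where E is the number of edges, N the number of nodes, and P the number of connected components. A minimal sketch on a made-up control-flow graph (the graph below is a toy if/else example, not taken from any real program):

```python
# Cyclomatic complexity V(G) = E - N + 2P for a control-flow graph,
# represented as an adjacency list of directed edges.
# Toy graph: entry -> cond, which branches to then/else, rejoining at join.
edges = {
    "entry": ["cond"],
    "cond":  ["then", "else"],   # the single branch point
    "then":  ["join"],
    "else":  ["join"],
    "join":  ["exit"],
    "exit":  [],
}
N = len(edges)                           # nodes
E = sum(len(v) for v in edges.values())  # directed edges
P = 1                                    # one connected component (one module)
V = E - N + 2 * P
print(V)  # 2: the straight-line path plus one extra for the branch
```

Each additional decision point adds one edge beyond the one node it introduces on each arm, so the basis grows by exactly one path per branch.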

Filed under: Software Engineering

2^128 is a really phenomenally large number

I’ve been playing around with WinFS beta 1 and I noticed that the item ids are GUIDs. Now, I’ve been told that it is nearly impossible to come up with 2 identical GUIDs – so very nearly that, for practical purposes, it is and always will be. I believed this, since 2^128 seems large. However, I couldn’t help wondering: with all the databases that use GUIDs as primary keys, all the devices, all the transactions, etc., etc., and now WinFS, isn’t it just slightly possible that we could "run out" and start having collisions?
 
In a word: not a chance. I did some quick calculations on the matter to see how many GUIDs could be assigned to each man, woman and child on the planet now, and even to cover population estimates over the next 50 years, and the answer turns out to be another very large, mind-staggeringly proportioned number. Even if that many people had been using GUIDs since the beginning of the universe, we still wouldn’t come close to exhausting the space.
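The back-of-the-envelope arithmetic goes something like this (my round figures for population and the age of the universe are assumptions, not precise values):

```python
# How big is the GUID space per person? Round-number assumptions:
# ~10 billion people (a generous 50-year population estimate) and
# ~13.8 billion years since the big bang.
guid_space = 2 ** 128                    # ~3.4e38 possible GUIDs
population = 10 ** 10
per_person = guid_space // population
seconds_since_big_bang = int(13.8e9 * 365.25 * 24 * 3600)   # ~4.4e17 s
per_person_per_second = per_person // seconds_since_big_bang

print(per_person)             # ~3.4e28 GUIDs for every person
print(per_person_per_second)  # ~7.8e10 GUIDs per person per second, forever
```

So every person could have burned through tens of billions of GUIDs per second since the beginning of time without emptying the space, and that is before considering that random collision probability is governed by the birthday bound, which is still astronomically small at any realistic generation rate.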

Filed under: Software Engineering

Obtained MCSD

Obtained the MCSD certification today. I’m not as happy as I thought I’d be, for a number of reasons. First, I feel more relief than anything – it’s been hanging over my head for almost 2 years, and I’ve wanted it for about 6. Whew – no more of that soft, yet incessant, buzzing in the back of my head. Second, I was disappointed with the tests, for two reasons. One, they were a lot easier than the practice tests and study material let on… I even skipped studying for the last tests and relied on my experience, and I flew through the .Net C# tests and the XML test with scores in the low-to-mid 900s. Two, the tests really hammer on the aspects of the technology which MS throws in there but which are not really useful in actual real-world scenarios. I’ve heard this complaint before, but now I actually understand and appreciate it. The SQL Server test (70-229) focused, disproportionately I think, on the slight efficiencies and the touted "features" of user functions and cursor extensions, and on how to use all the cool SQL Server tools like Profiler. I never use this junk, since it makes code less portable, harder to debug, puts too much business logic in the DB, and is generally not used by other developers. As for the tools, sure, Profiler and the Index Tuning Wizard are cool, but I can almost always pop open Query Analyzer and rewrite the query or add needed indexes just by looking at the query plan. There were also inversions of this: questions focused on edge cases which would most likely be taken care of by SQL Server’s auto-tuning capabilities, a rewrite of the query, or some DBA work (which is what test 70-228 measures). I suppose I can see some value in this latter category of questions, but if they were replaced by more SQL questions, it would have been a better "developer" test.
The other tests were similar but less pronounced, or else I had played with the technologies they were testing enough that I didn’t notice it as much (databinding, for example – never use it, but I know how it works).
 
Ah well. At least I don’t have to worry about it. If you are reading this and you do, I’d recommend just playing with all the stuff in the .Net tutorials and samples – that will be more helpful than cert prep books and cert prep tools (with a possible exception for the 70-300 test).

Filed under: Software Engineering

Another switch

I’m moving again. This time not to escape a bad situation, but to embrace a greater opportunity… and hey, we need health care. I’ll miss Intel and the folks around here… the experience has done much to solidify my desire and ability to code excellently. A lot of that was the driving, type-A culture; the remainder was the separation from my family (2 hrs away) and my solitude during the week. A perfect storm of opportunity for an obsessive learner.

Filed under: Software Engineering

Passed 70-300

FWIW, I passed the 70-300 today (oops, May 31, forgot to publish). The MS free second shot promotion helped me finally commit. They extended it to August 31… let’s see if I can get a couple more out of the way.

Filed under: Software Engineering

SqlCommand.ExecuteReader deficiency

.Net exceptions have two purposes: one is to signal that there was an exceptional situation during program flow which needs to be handled by routines separate from the main program flow (usually a catch block), and the other is to allow decisions to be made about how to handle what happened and to take appropriate action based on the type of the exception. HRESULTs did the same thing, but exceptions are more elegant because the exception-handling code sits outside the main program flow. Now, imagine if all you had was the Exception type (or HRESULT 0x80004005) – we would know that an exception occurred, but the program would have only one recourse that maintains runtime integrity: exit. With no information about the exception, we wouldn’t know if it was an input validation error or something more critical, like an access violation or the system being out of memory. Strongly typed exceptions allow corrective action or graceful degradation of program execution in a reliable fashion. While most of the .Net Framework provides an excellent variety of strongly typed exceptions which allow good programming structure and responsiveness, there is one notable exception: the ExecuteReader method on the SqlCommand class. Check out the docs: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpref/html/frlrfSystemDataSqlClientSqlCommandClassExecuteReaderTopic2.asp


Notice that when a statement can’t be executed, a plain old generic Exception is thrown! How are we supposed to know that this is what really happened, and not some other error like OutOfMemoryException or ExecutionEngineException, which apparently can happen at any time? One might say, "Well, those are all subclasses of SystemException, so just trap that and handle it differently." However, there are many, many SystemExceptions – should we differentiate in our catch statements among all of them, some of them, or what? Obviously this is a serious oversight, since SQL code which doesn’t run is a quite common occurrence. Perhaps this is fixed in 2.0….
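The underlying point holds in any language with typed exceptions: a specific type lets the caller recover from exactly the failures it understands while letting critical ones propagate, whereas a bare Exception forces an all-or-nothing handler. A hedged sketch in Python (the functions and exception types here are stand-ins I made up, with ValueError and MemoryError playing the roles a SqlException and OutOfMemoryException would in .Net):

```python
# Why typed exceptions matter: the caller reacts differently per failure.
def run_query(sql):
    if not sql.strip():
        raise ValueError("empty statement")   # recoverable input problem
    if sql == "boom":
        raise MemoryError("out of memory")    # critical, must not be swallowed
    return "rows"

def execute(sql):
    try:
        return run_query(sql)
    except ValueError:
        return "retry with fixed input"       # graceful degradation
    # MemoryError deliberately not caught: critical errors propagate

print(execute(""))        # retry with fixed input
print(execute("select"))  # rows
```

Catching a broad base class here (the analogue of trapping SystemException) would have silently eaten the out-of-memory case too, which is exactly the hazard a generic Exception from ExecuteReader forces on the caller.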

Filed under: Software Engineering