Monday, July 20, 2015

WCF Extensibility - message call-chain tracing

I'm getting closer to implementing a correlation id through an entire WCF call chain. It was working for single-threaded calls. Then my parallel-calls test executed, and the result was not what I needed.


Here's what I mean by all of this:


Suppose we have three services - Sa, Sb, and Sc. And a single client - Ca.


When Ca calls Sa and Sa calls Sb and Sc in the same call, then responds to Ca I want to be able to trace all of those related calls in a call log so that I know which calls contributed to an event (perhaps an error or a specific data update).


Currently, we have all calls being logged to a database via a custom operation invoker which implements IOperationInvoker. That implementation sends a one-way call to a singleton logger service, which logs the pre- and post-invoke details including operation, call time, who called, etc.
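The actual invoker is C# against WCF's IOperationInvoker, but the shape of the pre-/post-invoke logging is easy to sketch. Here's a rough, hypothetical Python version where a decorator stands in for the invoker and a plain list stands in for the one-way call to the logger service (all names here are illustrative, not WCF's):

```python
import time
import uuid

call_log = []  # stand-in for the one-way call to the singleton logger service

def logged(operation):
    """Wrap an operation with pre- and post-invoke log entries, roughly what
    the custom operation invoker does around each service call."""
    def invoke(*args, **kwargs):
        message_id = str(uuid.uuid4())
        call_log.append(("pre", operation.__name__, message_id))
        started = time.monotonic()
        try:
            return operation(*args, **kwargs)
        finally:
            # post-invoke entry is written even if the operation throws
            elapsed = time.monotonic() - started
            call_log.append(("post", operation.__name__, message_id, elapsed))
    return invoke

@logged
def get_order(order_id):
    return {"id": order_id}

get_order(42)
# call_log now holds one "pre" and one "post" entry for get_order
```

The try/finally mirrors the fact that the post-invoke entry should be logged even when the operation faults.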


I'm adding a messageId from the IncomingMessage's built-in header property "MessageId" (only the Guid part, though). I'm also adding a CorrelationId to the header, and this is where I'm currently sort of in the weeds.


CorrelationId is a custom header value that will be logged to the call log for each call in all services. For all calls related to a single origin call, it will have the same value. For each origin call, it will have a different value. An origin call is a call to a service which begins a call-chain. It would originate from outside the system (collection of services), typically from a UI or a service outside the system.


The service context that handles the first incoming call would need to create the Guid before the call is logged. It would need to be passed via a message header before the call is made to the service. Each service would need a way to know if it is the first service in the call chain. The absence of the correlation id in the header should suffice.


Implementing an IClientMessageInspector should be perfect for this. The BeforeSendRequest method has the message and adding the header value is trivial.


I created a correlationId member on the class that implements this interface. The only thing left is initializing this Guid when the CorrelationId message header is missing and adding it to the headers. Hopefully that will wrap up the implementation. Then it's just a matter of adding the extension to each service.
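The real implementation lives in a C# IClientMessageInspector, but the header logic itself is small. Here's a minimal sketch in Python (the function and header handling are mine; WCF's actual MessageHeaders API differs), assuming the incoming and outgoing headers can be treated as dictionaries:

```python
import uuid

CORRELATION_HEADER = "CorrelationId"  # the custom header name from the post

def ensure_correlation_id(incoming_headers, outgoing_headers):
    """If the incoming message carries a CorrelationId, propagate it.
    Otherwise this service handled the origin call, so mint a new Guid."""
    correlation_id = incoming_headers.get(CORRELATION_HEADER)
    if correlation_id is None:
        correlation_id = str(uuid.uuid4())  # origin call: start a new chain
    outgoing_headers[CORRELATION_HEADER] = correlation_id
    return correlation_id

# Ca -> Sa: the origin call carries no CorrelationId yet
sa_out = {}
chain_id = ensure_correlation_id({}, sa_out)

# Sa -> Sb and Sa -> Sc reuse the same id within the chain
sb_out, sc_out = {}, {}
ensure_correlation_id(sa_out, sb_out)
ensure_correlation_id(sa_out, sc_out)
assert sb_out[CORRELATION_HEADER] == sc_out[CORRELATION_HEADER] == chain_id
```

The key property is that only the origin call mints a Guid; every downstream hop copies it forward, so all rows in the call log for one chain share a single CorrelationId.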

Various Types of Fun!


I've been busy lately, busy reading up on MS SQL Server query optimizations, WCF header message ids for logging call chains, and Disciplined Agile Delivery.


Here are a few nuggets about each topic:


SQL - I got here because I was looking up Hash Searching: http://blogs.msdn.com/b/craigfr/archive/tags/Joins/
Craig has some really deep explanations about all sorts of SQL Server related topics. If you really want to improve your SQL chops, you gotta read his stuff. The blog is written in a "we" context because he worked on SQL Server at the time - got it? A quick search of the net shows his imprint.


WCF - when you have a service-oriented architecture, or a polyglot architecture which includes services (micro-services, queues, whatever), you will want a running log of the calls. Then you will want to correlate calls within a call chain, and you will want to know what happened and when. Fortunately, if you are using WCF as your message-based framework of choice, you can use the message headers to pass this information from call to call. Additionally, you can do this in an AOP type of way so you don't have to alter every contract to add traceability. Here's a link with some of that info: http://www.brendanwhelan.net/2013/wcf-custom-authentication-headers


DAD - no, it's not about dads; it's about process for software delivery. More specifically, it's a framework to guide the design of a process for software delivery. Or just guess and hope you are right... Here's the entry point for making better choices: http://www.disciplinedagiledelivery.com/start-here/


But wait, there's more...


Mel Conway - after catching a glimpse of Conway's Law (how system designs will follow the same patterns as organizational communications), I delved deeper into Mel's experiences. It turns out that he has a lot to say on the public education system. After teaching 10th grade geometry in one of the poorest schools in Massachusetts, he shares his thoughts and experiences here http://melconway.com/Home/Home.html


IBM's cloud - after joining TopCoder (which is a thing where coders sign up for challenges and build real software for prizes) I started receiving newsletters. One newsletter led me to Watson (IBM's famous Jeopardy champion) Services. Apparently even Watson is on the cloud. Check it out: http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/visual-recognition.html
I'd really like to dig in on that someday.


I finally installed a Linux-based OS - Ubuntu. I did this on a VM running on Hyper-V on Windows Server 2012. I know, right?!? Anyway, I did it this way because I wanted to explore, and it seemed like the best way to experiment. Oh, first I tried this in the Azure cloud on 2 VMs - one with Ubuntu, another with Oracle's Red Hat clone. I'd really like to do more with these (MEAN stack).

Tuesday, July 14, 2015

Long Post: Thoughts on Conway's Law


After pondering Conway's Law I've come to think of it like this:


A typical organization will have departments that specialize in certain activities. There may be a marketing department, a sales department, a finance department, another for human resources, and so on. The natural design of any system (a set of processes and workflows that are systematic and repeatable, whether automated or not) will be organized along the same structure, with communications occurring between departments.


In a typical IT system, where the system is defined as all of the software which communicates with other software by design, the programs/sub-systems will typically be organized by the same functional areas of specialization. It takes some specialized work to organize the system in a way that is more efficient.


The human brain, for example, is organized in a way that areas have functional specializations. In the brain there are areas that work together to solve math problems, other areas work together to understand and produce speech, and there are areas specializing in regulation of the body. It seems that the lower-level functions of our brain, those concerned with keeping the body alive, are specialized in a more specific sense, while higher-order thinking is comprised of areas composed dynamically for more general purposes.


Math problems are so varied and dynamic that it may take the stitching together of more parts of our brain to solve some than it would for others. We commit some math problems to memory - what's 2 + 2? Most of us don't even have to think about it because we created an area that specializes in 2 + 2 long ago and it has solidified in our minds. How about divining a proof of the quadratic formula? Proving the existence of the Higgs boson? Those require a special organization of areas of our brain to produce the result (unless we are math experts, of course).


Imagine if our business organizations were aligned this way...there might be a math department which performs all math functions, an email department which is responsible for writing all emails, a bad news department...well, you get the picture. This sort of organizational construct would not serve well in an environment where people have to communicate across those lines of specialization. Human communications are relatively inefficient when compared to communications between computers or between nerve synapses.


Given that human communication is the main inefficiency between people, we organize into specializations based on higher-functioning areas of business instead of specific functional needs. Our communications protocols are based on human language and are transmitted via blog, email, memos, telephones, texts, etc. We need contextual understanding in order to decipher the messages, which have specialized meanings in specific contexts. Our communications are Klumsy, Lame, Ugly, Dumb, But Good Enough - and barely at that! If not for poor communications, the world wouldn't need therapists!


What is the solution to removing the weak links in the chain? Automate, of course. We made calculators, and now no one needs to learn basic math or consult a math specialist through our various means of poor communications. Instead, we each have a math wizard in our pockets and under our fingertips at all times. What's more, these math wizards can use their mathematical prowess to present and communicate information (and misinformation) to humans at near-light speed.


Our ability to communicate effectively pales, on the whole, in comparison with the ability of computers to communicate with each other. What's more, computers have software. With software, we can teach a computer how to understand nearly any communication it can receive simply by installing software. For humans to do this would require years of specialized training for each individual. Computers require specific training by a few individuals who produce the software. It may take years to produce, but the outcome is that each computer takes minutes to learn how to communicate in this specialized way.


Thus far, I have been setting the stage for an argument for a concerted effort toward the design of IT systems that seeks to maximize efficiency in ways that humans cannot. Given that communication across contextual boundaries is the main barrier to human efficiency, the way to produce maximum gains is to address the barriers between contexts.


Since computers can specialize in low-level functioning and do not suffer the same communications barriers that people do, systems comprised of computer software can be organized in a way that humans cannot. They can be organized in a way that maximizes efficiencies. Be warned, however, that there are different types of efficiencies when it comes to software, some of which apply to maintainability and development. For the sake of argument, let's call these indirect efficiencies.


Processing efficiency is one direct efficiency that comes to mind in regards to software. When some work is performed, how quickly will the system arrive at the result? This depends on a number of things, including the architecture (the organization of the system), the hardware in the system, and the software design. Two of these can be easy to change; one cannot. Hardware is frequently upgraded, and so is software. The single most difficult thing to change in a system is the architecture - how the pieces are organized. In fact, the ability to upgrade software, and sometimes hardware, is directly impacted by the architecture of the system.


An architecture can be defined in terms of the coupling of its components - a system can be tightly coupled or loosely coupled. A tightly coupled system is one in which the components depend on one another in a way that makes it difficult to change any of the functional areas without affecting one or more of the others. Loose coupling is when the functional areas of a system can be changed at will without much impact on other functional components. So far, these descriptions of coupling are qualitative. To quantify, let's consider the following: if a system has n functional components (logical chunks of work that it performs), and each component is coupled to x other components such that a change to one will affect another in a way that causes the other to change in order to provide the same functional work, then the coupling can be quantified as the sum of all couplings divided by the number of components:


SUM(x1, ..., xn) / n


We would strive for a score of 0 so that any component can be exchanged, or even moved to different hardware, and the overall efficiency of the system can be improved.
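As a toy illustration of that metric, here's a short Python sketch (the component names are made up) that scores a dependency map:

```python
def coupling_score(dependencies):
    """dependencies maps each component to the set of components that must
    change when it changes. Score = SUM(x1, ..., xn) / n."""
    n = len(dependencies)
    return sum(len(deps) for deps in dependencies.values()) / n

# A tightly coupled system: changing billing ripples into two other components
tight = {"billing": {"orders", "shipping"},
         "orders": {"billing"},
         "shipping": {"billing"}}

# The ideal: no component forces a change in any other
loose = {"billing": set(), "orders": set(), "shipping": set()}

coupling_score(tight)  # (2 + 1 + 1) / 3 = 1.33...
coupling_score(loose)  # 0.0 - the score we would strive for
```

A score of 0 means any component can be swapped out or relocated without touching its neighbors.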


Another nuance in the meaning of efficiency of a system is tied to how it improves the efficiency of the work being done - be that assisting humans in performing their own tasks more efficiently, or automating the tasks outright. Some tasks are automatable, and some require the elasticity of the human brain and its ability to compose its pieces in order to solve a problem. This is the way we should view IT systems as well, except that the barrier to composing IT systems this fluidly is human communications.


Most computer software requires humans to translate some thought into a computer language. This activity is called programming, and it requires a specialized skill that takes years to learn and master. It requires human interactions in order to understand the problem. It suffers the same inefficiencies as other human interactions between groups of specialists. It takes time. Human brains can pull their pieces together very quickly, while it takes much longer to tell a computer how to solve a new problem.


Because it does take quite a bit of effort to tell a computer how to solve a problem, it is necessary to build the basic low level components that can be pulled together to solve some specific problem in an efficient way. The problem solving components will be higher level components in the system. All components will need to communicate efficiently. Each component should be able to change in implementation or in location with minimal impact on other components. If an architecture supports these goals, it can be changed to make the pieces more efficient.

Monday, July 6, 2015

Learn Web Development

Starting with a list of essentials, here is a guide to starting out with web development:


HTTP verbs
HTML
Client-Server model
CSS
JavaScript
jQuery
OWASP Top 10
1 or more server-side languages/platforms (PHP, Ruby, Node.js, Go, etc)
RESTful services, how to build and interact with them
SOLID, OOP (Java tutorial, GoF design patterns)
TDD
SQL
at least 1 NoSQL database
some HTML templating language (Mustache, Handlebars, Underscore.js, etc)
MVC/MVVM patterns and some SPA implementation of them (AngularJS, Ember.js, Knockout.js)


I would suggest reading this entire article, then diving into the links to get an overview of each topic, then diving in a bit deeper. If you get to a point where you feel blocked from understanding a topic, take a step back and dive deeper into a previous link. I tried my best to list them in order of dependencies - HTML, CSS, and JavaScript come first, and the security deep dive requires an understanding of so many topics that it appears later.



A good place to start is http://www.w3schools.com/
That site has tutorials with an option to try each concept as you go. It's not a fully interactive tutorial, which is great because it also serves as a good point of reference.


Of course, the real heart of the web as we know it is based on HTTP. Here's a great way to understand the model: http://stackoverflow.com/questions/2001773/understanding-rest-verbs-error-codes-and-authentication
I once read a fantastic article about HTTP but can't seem to locate it. That post on Stack Overflow has some good info. https://en.m.wikipedia.org/wiki/Client%E2%80%93server_model is sufficient for now as an intro to the client-server model.
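To make the REST verb model concrete, here's a toy Python sketch of what each verb conventionally does to a resource collection - no framework involved, just an in-memory dict (all names here are illustrative):

```python
# A toy in-memory "resource collection" showing what each HTTP verb
# conventionally means in a RESTful API.
resources = {}

def handle(verb, resource_id=None, body=None):
    if verb == "GET":        # read: never modifies state
        return resources.get(resource_id, "404 Not Found")
    if verb == "POST":       # create: the server assigns the id
        new_id = len(resources) + 1
        resources[new_id] = body
        return new_id
    if verb == "PUT":        # create/replace at a known id (idempotent)
        resources[resource_id] = body
        return body
    if verb == "DELETE":     # remove (also idempotent)
        resources.pop(resource_id, None)
        return "204 No Content"
    return "405 Method Not Allowed"

new_id = handle("POST", body="hello")
handle("GET", new_id)       # "hello"
handle("DELETE", new_id)    # "204 No Content"
handle("GET", new_id)       # "404 Not Found"
```

The idempotency notes are the part worth internalizing: repeating a PUT or DELETE leaves the server in the same state, while repeating a POST creates another resource.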


StackOverflow is one of the best community-driven resources on the web. I'll say no more.




Everyone should know about the World Wide Web Consortium (W3C) http://www.w3.org/. They set and publish standards such as the HTML spec, the JavaScript spec, and the CSS spec, so if you really want to learn the standards, that's where to go.


There are some great tools available for editing html and other types of code. I've found that Notepad++ works well for several languages. It's lightweight, and works for a variety of languages including semantic markups. https://notepad-plus-plus.org/

It is free and "easy to use".


All of the most popular web browsers have debuggers built in for debugging JavaScript. Usually pressing F12 gets you there. Debugging is a sort of dark art that takes time to do well. It takes practice. All of that being said, anyone can start debugging in almost any full-blown browser. Look for a future post on debugging tips and tricks.


Client-side debugging is one thing, but when you own the server-side code, you will need to be able to debug that too. In that case you will need an editor capable of debugging. Without going into too much detail, this is basically where the editor is hooked into the server-side host (typically on your local machine, not a remote host), and any breakpoint you set in code will result in halted execution. You will be able to control the progress of execution in a number of ways - continue, step over, step into. While debugging, you can examine the flow of control and the variables that are in scope. A common free tool for Java is Eclipse https://eclipse.org/ide/. This IDE (Integrated Development Environment) also works for C/C++ and PHP. Definitely put this on your dev box and learn this tool. Also, while you are at it, learn Object Oriented Programming - perhaps by way of Java. PHP has an object model; so do C++, JavaScript, and most other common languages.


What's Object Oriented Programming all about? Check it out on this Java tutorial https://docs.oracle.com/javase/tutorial/java/concepts/object.html
I cut to the chase; feel free to navigate back to the root and read the intro. Invest in Java or don't, but learn what they have to teach.


OWASP is a community focused on web security. They publish a wealth of information about how to protect your applications from being hacked - mitigate risk rather. You can find the top 10 threats and a wealth of other info listed on their site https://www.owasp.org/index.php/Main_Page


PluralSight is a hugely influential training site that has videos covering a range of IT related topics. They have some free videos, but a monthly membership gets you open access to all. This is a great way for more experienced developers to grow their knowledge base.