Friday, January 30, 2015

Use DNS Entries

Today's lesson is about using DNS entries for every connection in an enterprise environment, or really for any internal connection. It's a bit more work up front, but it can save a lot of time later.


Use aliases for databases as well. This is probably old news, but it might save someone a round of headaches later.



To illustrate the purpose, consider what happens when a server's OS reaches end of life for support and it's time to upgrade. If you've written your code well and you have no hard-coded connections to servers, the upgraded server can live anywhere as long as the DNS points to it.


For example, a billing system may communicate with a number of services and have a connection to a database. The database connection should use an alias such as "billingData" rather than a server name. The service connections might be something like "net.tcp://services.example.com/customers/" for customer data. However, that offers less flexibility than "net.tcp://customers.services.example.com/",
since the latter can be moved independently of the other services in that sub-domain. Partitioning by host name rather than by path means any one service can be routed to another physical location on its own.


But if your code connects to server1.example.com directly, you'll have to hunt down every such connection and update it. So from the start, add a DNS entry for every database, application, or resource.
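To make that concrete, here is a minimal sketch of an alias-based connection in C#; the alias, config key, and class names are illustrative, not from any particular system.

using System.Configuration;
using System.Data.SqlClient;

// Illustrative config (in web.config/app.config) that points at a DNS alias,
// never at a physical server name:
//   <connectionStrings>
//     <add name="billingData"
//          connectionString="Server=billingData;Database=Billing;Integrated Security=true" />
//   </connectionStrings>
public static class BillingDb
{
    public static SqlConnection Open()
    {
        // "billingData" resolves via DNS, so the database can move to new
        // hardware without a code or config change for the callers.
        var cs = ConfigurationManager.ConnectionStrings["billingData"].ConnectionString;
        var connection = new SqlConnection(cs);
        connection.Open();
        return connection;
    }
}

// Anti-pattern: a hard-coded physical host that must be hunted down later.
// var bad = new SqlConnection("Server=server1.example.com;Database=Billing;Integrated Security=true");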

SOC in an Automated Approval Process

Recently, I was involved in a project on an application that has an approval process for onboarding an account from another system. The design duplicates the data in both systems, and the data either drifts out of sync over time or is out of sync from the start.




For sake of example, let's say that we have a shipping system and an ordering system.


The shipping system contains all the source data for transportation companies (USPS, UPS, FedEx, Yellow, RoadRunner, etc). The shipping system is responsible for quoting and booking shipments.




The ordering management system uses the shipping system to send orders to customers. When a user of the ordering system has a new transportation vendor that they would like to use, they must enter the request in the ordering system for approval. Once approved, users may select the shipping vendor to use for shipping orders to customers.




The current design uses the record from within the ordering system, with the shipping company name and other info exactly as it was entered for the approval process. This design has led to data maintenance issues. I came up with a design that is less fragile in terms of data consistency.




The proposed design for automated approval is as follows and may be used whenever this type of approval pattern is needed. The Automated Approval Pattern is a processing pattern that can be used whenever an entity must be entered into a system for use - whether the entity is foreign or native to that system.




The main benefit of the pattern is to keep the approval data separate from the usage data. The usage data may change; once approved, the approval data is historical only. The data should be copied to a usage data store, and all usage should relate to that store. The store may reside in another system or be local to the usage system.




This separation implies that the approval process could be completely isolated in its own system. That, in turn, enables reuse of code and processing.




The basic flow follows different paths depending on the state of the entity in the source system. For our shipping system, a vendor may be non-existent, active, inactive, or in some other state. If the vendor exists, the approval data should be copied from the source data. If not, the user must enter the approval data in the approval system.




The approval process itself may involve one or many steps, and the details will vary. The outcome of an approved vendor is that the primary key of the entity will be referenced by the usage system. The reference entity will be a proxy entity for the true entity in the source system. The proxy entity contains some usage-system-specific data as needed.




The usage system should always present relevant data from the source system - perhaps via a proxy class that represents the proxy entity in the usage system. The proxy class could use data from both systems, but only save to the usage system.
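A rough sketch of what such a proxy class could look like for the shipping-vendor example; the interfaces, member names, and fields here are hypothetical, chosen only to show source data being read while only usage data is saved.

// Hypothetical interfaces; names and members are illustrative only.
public interface IShippingSystemReader
{
    string GetVendorName(int vendorId);
}

public interface IOrderingSystemWriter
{
    void SaveVendorUsage(int vendorId, bool isPreferred, string defaultServiceLevel);
}

// Proxy class: presents source-system data, persists only usage-system data.
public class ShippingVendorProxy
{
    private readonly IShippingSystemReader sourceReader;
    private readonly IOrderingSystemWriter usageWriter;

    public ShippingVendorProxy(IShippingSystemReader sourceReader, IOrderingSystemWriter usageWriter)
    {
        this.sourceReader = sourceReader;
        this.usageWriter = usageWriter;
    }

    // Primary key of the true entity in the source (shipping) system.
    public int SourceVendorId { get; set; }

    // Usage-system-specific data owned by the ordering system.
    public bool IsPreferredVendor { get; set; }
    public string DefaultServiceLevel { get; set; }

    // Always read from the source system; never duplicated locally.
    public string VendorName
    {
        get { return this.sourceReader.GetVendorName(this.SourceVendorId); }
    }

    public void Save()
    {
        // Only usage data is written; the shipping system stays the
        // system of record for the vendor itself.
        this.usageWriter.SaveVendorUsage(this.SourceVendorId, this.IsPreferredVendor, this.DefaultServiceLevel);
    }
}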




This pattern can be followed whether approval is required or not. The approval process should be separate from the loading and integrating of data. The approval process is volatile and should be contained as its own separate concern within the approval system.


When approval is not required, there may yet be some benefit to using the approval system for adding entities. Perhaps some automation can be contained within the entry process. Another benefit would be the audit trail. Perhaps the most relevant would be the flexibility to add an approval process later.


This pattern can be further normalized into an entry subsystem, an approval subsystem, and an integration subsystem. The entry subsystem would be strictly responsible for lookup and validation, while some kind of hook would allow the approval subsystem to manage the approval workflow (whether one exists or not). After the approval process, the integration subsystem would handle the setup of data in one or many usage systems and perhaps the source system.
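As one possible way to picture that normalization, the three subsystems might be expressed as interfaces along these lines; the names, signatures, and DTOs are purely illustrative, not from the actual project.

// Purely illustrative interfaces for the entry/approval/integration split.
public interface IEntrySubsystem
{
    // Lookup and validation only: find the entity in the source system (or
    // capture its data if it doesn't exist yet) and validate the request.
    EntryResult SubmitRequest(VendorRequest request);
}

public interface IApprovalSubsystem
{
    // Owns the approval workflow, which may be one step, many steps, or a
    // no-op when approval is not required.
    ApprovalResult Approve(EntryResult entry);
}

public interface IIntegrationSubsystem
{
    // After approval, set up the proxy/usage data in one or many usage
    // systems and, if needed, in the source system.
    void Integrate(ApprovalResult approval);
}

// Supporting DTOs carrying data between the subsystems; their shapes depend
// on the domain.
public class VendorRequest { /* ... */ }
public class EntryResult { /* ... */ }
public class ApprovalResult { /* ... */ }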


The benefits of this pattern are greater flexibility, reuse, and data consistency. Costs may include greater complexity, especially at scale and when building for reuse. I recommend using this pattern with a relevant high-level diagram of the architecture clearly available to developers and technical support, so as to make the intent of the design clear. Basic component and workflow diagrams to follow....



Wednesday, January 28, 2015

Refactoring to Clean Code - step 1

A mess, that's what it is. It looks like it was always a mess. Then more mess got piled on top of that mess...then I inherited the mess.




Legacy code is not always a pleasure to inherit. I've been working with a particularly nasty piece of legacy for about 2 years now. For this round of changes I'm applying the boy scout rule of leaving it cleaner than I found it.




Currently, the component I'm working on is a web service that runs in a web project. It's written in C# .NET and was upgraded to 4.0 during a previous release, but was likely written in 3.5. It makes connections to SQL Server databases and sends SQL statements using the SqlCommand class. No slick ORMs or anything, and all the calls are intermingled with the business logic. No SOC, SRP, ISP, no nothing. Just a singleton that runs everything and a few long, messy methods.




It's about time to clean up! But where to start? Well, I'm not sure it's worth investing in a full suite of unit tests because there are no units to test yet; they would be integration tests. I don't want to operate on the actual data yet, so my first goal is to push all the SQL calls into their own methods. Next, into a class with an interface that can be mocked.




Once I have a mockable interface for all the DB calls, I'll feel a bit more comfortable running tests. There's just one problem...the god class is a singleton. So I either need to break it up, or find a way to set up the singleton using some type of creational pattern, perhaps a factory. For now, I'm going to add a setter so I can inject the dependency. I don't like this because it doesn't express intent the way a constructor does, but I'll tackle that later.




So far, I've moved all the SQL calls into their own methods. The SQL stands alone, and I've abstracted the plumbing into six types of calls that take either SQL text, or SQL text and parameters, and return a DataTable for queries, an object for scalar queries, or an int for commands. I still have a couple more abstractions to make, but I'll defer those.
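A sketch of what those six plumbing methods might look like; the class name and method names are mine, not the actual legacy code.

using System.Data;
using System.Data.SqlClient;

// Illustrative sketch of the six plumbing calls: query, scalar, and command,
// each with and without parameters.
public class SqlPlumbing
{
    private readonly string connectionString;

    public SqlPlumbing(string connectionString)
    {
        this.connectionString = connectionString;
    }

    public DataTable ExecuteQuery(string sql)
    {
        return ExecuteQuery(sql, new SqlParameter[0]);
    }

    public DataTable ExecuteQuery(string sql, params SqlParameter[] parameters)
    {
        using (var connection = new SqlConnection(this.connectionString))
        using (var command = new SqlCommand(sql, connection))
        using (var adapter = new SqlDataAdapter(command))
        {
            command.Parameters.AddRange(parameters);
            var table = new DataTable();
            adapter.Fill(table);   // Fill opens and closes the connection itself
            return table;
        }
    }

    public object ExecuteScalar(string sql)
    {
        return ExecuteScalar(sql, new SqlParameter[0]);
    }

    public object ExecuteScalar(string sql, params SqlParameter[] parameters)
    {
        using (var connection = new SqlConnection(this.connectionString))
        using (var command = new SqlCommand(sql, connection))
        {
            command.Parameters.AddRange(parameters);
            connection.Open();
            return command.ExecuteScalar();
        }
    }

    public int ExecuteCommand(string sql)
    {
        return ExecuteCommand(sql, new SqlParameter[0]);
    }

    public int ExecuteCommand(string sql, params SqlParameter[] parameters)
    {
        using (var connection = new SqlConnection(this.connectionString))
        using (var command = new SqlCommand(sql, connection))
        {
            command.Parameters.AddRange(parameters);
            connection.Open();
            return command.ExecuteNonQuery();
        }
    }
}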




The next step is to push all the data access methods into their own interfaced classes: one for queries and another for commands. I want to do this to further express intent; a delete is different from a query.




After that, I'm going to continue eradicating all the DataTable objects from the business logic. I'll create more DTOs to properly represent the data as classes. The DTOs will simply have a public property for each column.


The data access classes will return collections of objects rather than DataTable objects. One benefit of separating this concern is that the DAL can now be implemented using an ORM, or the underlying store can change to something else entirely. And...the first benefit is mocking!
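Something along these lines, where the interface, DTO, table, and SQL are illustrative placeholders rather than the real schema (SqlPlumbing is the plumbing sketch from above):

using System.Collections.Generic;
using System.Data;

// Illustrative DTO: one public property per column.
public class OrderDto
{
    public int OrderId { get; set; }
    public string CustomerName { get; set; }
    public decimal Total { get; set; }
}

// Query and command concerns split into separate, mockable interfaces.
public interface IOrderQueries
{
    IList<OrderDto> GetOpenOrders();
}

public interface IOrderCommands
{
    int DeleteOrder(int orderId);
}

// One possible implementation on top of the plumbing class sketched earlier.
public class OrderQueries : IOrderQueries
{
    private readonly SqlPlumbing db;

    public OrderQueries(SqlPlumbing db)
    {
        this.db = db;
    }

    public IList<OrderDto> GetOpenOrders()
    {
        DataTable table = this.db.ExecuteQuery("SELECT OrderId, CustomerName, Total FROM Orders WHERE Status = 'Open'");
        var orders = new List<OrderDto>();
        foreach (DataRow row in table.Rows)
        {
            orders.Add(new OrderDto
            {
                OrderId = (int)row["OrderId"],
                CustomerName = (string)row["CustomerName"],
                Total = (decimal)row["Total"]
            });
        }
        return orders;
    }
}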




With the business layer depending on an interface that can be mocked, we can control the data and test the business logic in isolation. We still need to verify the SQL, but that's a separate concern and will be verified using a different approach - a discussion for the future.
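For example, a test using a mocking library such as Moq against the hypothetical interface and DTO sketched above; the OrderBilling class here is also made up for illustration.

using System.Collections.Generic;
using System.Linq;
using Moq;
using NUnit.Framework;

// Hypothetical business class that depends only on the query interface.
public class OrderBilling
{
    private readonly IOrderQueries queries;

    public OrderBilling(IOrderQueries queries)
    {
        this.queries = queries;
    }

    public IList<string> CreateInvoices()
    {
        // Trivial stand-in for real business logic.
        return this.queries.GetOpenOrders()
            .Select(o => string.Format("Invoice for order {0}: {1:C}", o.OrderId, o.Total))
            .ToList();
    }
}

[TestFixture]
public class OrderBillingTests
{
    [Test]
    public void Creates_one_invoice_per_open_order()
    {
        // Arrange: mock the query interface so no database is involved.
        var queries = new Mock<IOrderQueries>();
        queries.Setup(q => q.GetOpenOrders()).Returns(new List<OrderDto>
        {
            new OrderDto { OrderId = 1, CustomerName = "Acme", Total = 100m }
        });

        // Act: the business logic runs against the mocked data.
        var billing = new OrderBilling(queries.Object);
        var invoices = billing.CreateInvoices();

        // Assert: behavior verified in isolation from SQL Server.
        Assert.AreEqual(1, invoices.Count);
    }
}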

Monday, January 26, 2015

BDD and SpecFlow for .NET

SpecFlow is a tool for creating automated user acceptance tests in .NET. It consumes tests written in the Gherkin syntax. The syntax is as follows:




Given 20 races
  And 5 of them failed with the same engine issue
When preparing for another race
Then a failure rate of over .10 should trigger an investigation into the cause of the engine failures




This test is parsed and a unit test is generated. At that point, it's up to the test engineer/developer to implement the unit test. The unit tests are the mechanism by which the acceptance tests are run. You can choose from a variety of common test frameworks.


SpecFlow is available as a Visual Studio extension. After the extension is installed, you add the NuGet package to a test project. From there, you add a feature file, which contains the specification. Then you generate the steps file, which contains the implementation.
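A step definition class for the scenario above might look roughly like this; the RaceHistoryTracker class and the exact regular expressions are illustrative, not generated output.

using NUnit.Framework;
using TechTalk.SpecFlow;

// Hypothetical domain class; the real implementation would live in the
// production code under test.
public class RaceHistoryTracker
{
    private readonly int totalRaces;
    private int engineFailures;

    public RaceHistoryTracker(int totalRaces) { this.totalRaces = totalRaces; }

    public void RecordEngineFailures(int count) { this.engineFailures = count; }

    public bool ShouldInvestigateEngine(double threshold)
    {
        return (double)this.engineFailures / this.totalRaces > threshold;
    }
}

[Binding]
public class EngineFailureSteps
{
    private RaceHistoryTracker tracker;
    private bool investigationTriggered;

    [Given(@"(\d+) races")]
    public void GivenRaces(int raceCount)
    {
        this.tracker = new RaceHistoryTracker(raceCount);
    }

    [Given(@"(\d+) of them failed with the same engine issue")]
    public void GivenEngineFailures(int failureCount)
    {
        this.tracker.RecordEngineFailures(failureCount);
    }

    [When(@"preparing for another race")]
    public void WhenPreparingForAnotherRace()
    {
        this.investigationTriggered = this.tracker.ShouldInvestigateEngine(0.10);
    }

    [Then(@"a failure rate of over \.10 should trigger an investigation into the cause of the engine failures")]
    public void ThenAnInvestigationIsTriggered()
    {
        Assert.IsTrue(this.investigationTriggered);
    }
}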


The value of doing tests this way lies both in the automation of acceptance tests and in the format of the specification. The Given-When-Then structure ensures that all specs follow a consistent syntax. If implemented properly, these tests provide a reliable, repeatable way to prove the code against the requirements.


These tests must be supplemented by unit tests that further flesh out the details of the code; they should not serve as tests for low-level details. They should be written by and/or with the business - hence the value of the ubiquitous-language syntax. They should also use domain terms for entities and processes, which will help define and communicate the domain-specific language.


Some example entities:


Driver
|Name|Car|
|Ricky-Bobby|FirstNotLast|


RaceHistory
|Date|Track|RaceName|Result|
|1/1/10|Daytona|Daytona500|Crash|
|1/1/11|Wis|Cheese300|First|


These entities are passed to the test context and can be used to drive the tests.
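In the step definitions, SpecFlow passes tables like these in as a Table argument, which can be converted to typed objects with the SpecFlow.Assist helpers. A rough sketch, where the Driver and RaceHistory classes and the step texts are assumptions for illustration:

using System;
using System.Collections.Generic;
using TechTalk.SpecFlow;
using TechTalk.SpecFlow.Assist;   // CreateInstance / CreateSet table helpers

// Hypothetical entity classes matching the table headers above.
public class Driver
{
    public string Name { get; set; }
    public string Car { get; set; }
}

public class RaceHistory
{
    public DateTime Date { get; set; }
    public string Track { get; set; }
    public string RaceName { get; set; }
    public string Result { get; set; }
}

[Binding]
public class DriverSteps
{
    [Given(@"the driver")]
    public void GivenTheDriver(Table table)
    {
        // Matches the |Name|Car| headers to property names on Driver.
        Driver driver = table.CreateInstance<Driver>();
        ScenarioContext.Current.Set(driver);
    }

    [Given(@"the race history")]
    public void GivenTheRaceHistory(Table table)
    {
        // Converts each row of |Date|Track|RaceName|Result| into a RaceHistory.
        IEnumerable<RaceHistory> history = table.CreateSet<RaceHistory>();
        ScenarioContext.Current.Set(history);
    }
}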



Tuesday, January 20, 2015

More better code

I took a piece of code today and applied some techniques from the book "Clean Code" by Robert C. Martin. I have to say that I really like the results!

It was a test, it was only a test. I only applied some of the conventions regarding class and method style. I took some code from Microsoft's CQRS Journey project (via MSDN) and did a little refactoring. The result was an easy-to-understand class and methods, with new classes that handled some of the responsibilities in a more SRP sort of way.

So the following code from here https://github.com/mspnp/cqrs-journey-code/blob/master/source/Infrastructure/Infrastructure/Messaging/Handling/EventDispatcher.cs

public void Register(IEventHandler handler)
{
    var handlerType = handler.GetType();

    foreach (var invocationTuple in this.BuildHandlerInvocations(handler))
    {
        var envelopeType = typeof(Envelope<>).MakeGenericType(invocationTuple.Item1);

        List<Tuple<Type, Action<Envelope>>> invocations;
        if (!this.handlersByEventType.TryGetValue(invocationTuple.Item1, out invocations))
        {
            invocations = new List<Tuple<Type, Action<Envelope>>>();
            this.handlersByEventType[invocationTuple.Item1] = invocations;
        }
        invocations.Add(new Tuple<Type, Action<Envelope>>(handlerType, invocationTuple.Item2));

        if (!this.dispatchersByEventType.ContainsKey(invocationTuple.Item1))
        {
            this.dispatchersByEventType[invocationTuple.Item1] = this.BuildDispatchInvocation(invocationTuple.Item1);
        }
    }
}

becomes

public void Register(IEventHandler handler)
{
    foreach (var invocationDTO in this.BuildHandlerInvocations(handler))
    {
        this.RegisterHandlers(invocationDTO);
        this.RegisterDispatchers(invocationDTO);
    }
}

which is far simpler to read and understand. I've extracted the details into a domain-specific language represented by private methods.
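For reference, the extracted private methods could look something like this. I'm assuming here that BuildHandlerInvocations was changed to return a small DTO that also carries the handler's type, since the refactored Register no longer captures it - a sketch, not the exact refactoring:

// One possible shape for the extracted pieces, derived from the original body.
private class HandlerInvocationDTO
{
    public Type HandlerType { get; set; }
    public Type EventType { get; set; }
    public Action<Envelope> Invocation { get; set; }
}

private void RegisterHandlers(HandlerInvocationDTO invocationDTO)
{
    List<Tuple<Type, Action<Envelope>>> invocations;
    if (!this.handlersByEventType.TryGetValue(invocationDTO.EventType, out invocations))
    {
        invocations = new List<Tuple<Type, Action<Envelope>>>();
        this.handlersByEventType[invocationDTO.EventType] = invocations;
    }
    invocations.Add(new Tuple<Type, Action<Envelope>>(invocationDTO.HandlerType, invocationDTO.Invocation));
}

private void RegisterDispatchers(HandlerInvocationDTO invocationDTO)
{
    if (!this.dispatchersByEventType.ContainsKey(invocationDTO.EventType))
    {
        this.dispatchersByEventType[invocationDTO.EventType] = this.BuildDispatchInvocation(invocationDTO.EventType);
    }
}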

This is part of the essence of writing clean code.

Thursday, January 15, 2015

Robert C. "Uncle Bob" Martin and professionalism

I saw Uncle Bob give his sermon on professionalism in software development yesterday. It was well grounded in over 50 years of practicing the craft. He made some great points about software being a young field (200% growth every 5 years, which means half of us have very little experience).

The main point he makes is that the way to be professional is to write clean code. This is in line with something I heard before about entropy. Entropy is the tendency for things to fall apart, become a mess, and move toward chaos. It takes energy to counteract entropy. To counteract software entropy, it takes practice, skill, and discipline.

Uncle Bob alluded to using automated testing to protect software from ourselves so that we may work toward countering software entropy.

In New York City, the way to keep neighborhoods from going to hell was to ensure that not one broken window went unfixed. They figured out that one broken window quickly led to more. My guess is that people figure "heck, there's a broken window and obviously the owner doesn't care about broken windows. So...let's break some more!"

In software, if some code is broken or messy, then I guess we figure "Oh boy, someone left an awful mess in there. Well, we must not care much about messes, so I'll just make some more messes." Eventually you've got one big, hot, smelly mess! Now who is going to clean that up?

No, perhaps we need to clean up all our playthings before moving on to the next. Somehow I try to instill this in my son, who is too young to really get why. He will thank me one day, I'm sure! Now if only we adults could learn that lesson ourselves...we'd really be making some clean code then!

https://cleancoders.com/
http://objectmentor.com/

Tuesday, January 13, 2015

On Event Messages Raised from Persistence Layer

One common pattern I've been using is to wrap all persistence logic in a resource class, or perhaps two (one reader and one writer). Lately, I've been seeing more and more about message-driven systems. A common pattern in message-oriented systems is to publish commands and events to a message queue, which can have zero to many message handlers.

The resource classes mentioned above would be consumed by middle-layer services, which encapsulate business logic and program flow. One policy of this architecture is that resources cannot communicate with anything else. However, this topology combined with event-driven design leaves the system open to some holes.

If a middle-layer service calls a resource to make some update and then sends the event message, then every consumer of that resource would need to do the same. If not, some important events may be missed. I don't believe the resource implementation should raise the event either, because that is beyond its concern.

A possible solution would be an event publisher via AOP. AOP in WCF can be implemented via a message interceptor that passes a copy off to some event publisher. This interceptor would need information about the transaction (success or failure); the transaction starts in the client. It also needs some basic data to send.
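In WCF, one way to hook in that kind of interception is a dispatch message inspector. A bare-bones sketch, where IEventPublisher and the success check are hypothetical placeholders:

using System.ServiceModel;
using System.ServiceModel.Channels;
using System.ServiceModel.Dispatcher;

// Hypothetical publishing contract; the real publisher would decide if, how,
// and what event data to publish.
public interface IEventPublisher
{
    void Publish(string operation, bool succeeded);
}

// Dispatch-side interceptor that passes a copy of the outcome to the publisher.
public class EventPublishingInspector : IDispatchMessageInspector
{
    private readonly IEventPublisher publisher;

    public EventPublishingInspector(IEventPublisher publisher)
    {
        this.publisher = publisher;
    }

    public object AfterReceiveRequest(ref Message request, IClientChannel channel, InstanceContext instanceContext)
    {
        // Capture what we need from the request; the returned value flows
        // into BeforeSendReply as the correlation state.
        return request.Headers.Action;
    }

    public void BeforeSendReply(ref Message reply, object correlationState)
    {
        // Very rough success check: no fault means the operation succeeded.
        var succeeded = reply != null && !reply.IsFault;
        this.publisher.Publish((string)correlationState, succeeded);
    }
}

The inspector would be attached through an endpoint or service behavior (for example, in IEndpointBehavior.ApplyDispatchBehavior), and capturing the client-side transaction outcome would still need its own handling, as noted above.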

The event publisher would know if, how, and what event data to publish. Event subscribers would register to queues, and the event publisher would be completely unaware of subscribers. The message would be sent one-way to a queue; something else could broadcast it if desired, or relay it to multiple subscribers in a different way (coordinated program flow in response to an event).

One pitfall I can think of is if multiple subscribers react to a broadcast and those subscribers are not coordinated in some way. Either the subscribers would need to handle contention, cascading events, race conditions, throttling, and other timing-based issues; or a coordinator should handle the event rather than many separate subscribers to a broadcast.

One possibility for handling broadcast events while minimizing the impact of timing issues would be to use bounded contexts. Some data from other bounded contexts would be duplicated in any given bounded context, and read-only data could be fetched from other sources. However, any updates would need to be managed from within the context by some command coordinator.

Here's how that would work - let's say context 1 sends an update event and context 2 receives it. The event handler in context 2 sends a command to update the data from the event within its datastore. The command processor, which would need to handle commands in a certain order, would queue up the command from the event handler and do its updating. If updates to its own entities raise events, those events are published by the event publisher. Those events can be handled by any bounded context, even context 2 (the source of the event).

In any case, I'd like to really tackle this concept and put it into use for a bit before I commit to any of these ideas. Perhaps there will be some Greenfield application in the near future that will benefit from this sort of architecture/design. If so, I'm sure I will have many lessons to learn in the process (and to post).

Thursday, January 8, 2015

Tips for Improving Your Dev Skills (beginners)

1. Learn what SOLID means and why you should apply it.

2. Learn to Unit Test.

3. Study GOF Patterns and learn to properly apply them.

4. Learn JavaScript (JSON, closures, jQuery, study libraries).

5. Study open-source source code.

6. Look up Martin Fowler and Pinal Dave.

7. Practice, practice, practice, study, study, study.

More in depth -
1. SOLID - These five principles form the foundation of object-oriented development. Follow them to write reusable, understandable, testable code.

2. Unit Testing - Unit tests enable more rapid development, define the software, and allow a safety margin when making changes. They make testing during development easier and nudge code toward better structure when written properly.

3. Patterns are everywhere; the original Gang of Four patterns help us understand how to recognize and apply basic patterns in software development. They have been translated to several languages. Other collections of patterns exist and can be learned from as well.

4. JavaScript is good, I like JavaScript. You should learn it. It's the only language that I know of that can be used in a full web stack natively. One can do a great deal with JavaScript.

5. Open source code is a wealth of knowledge. See how others are coding - how the programs or libraries are structured. Take a look at the bug logs and see what's happening. Even try to resolve some bugs for practice; usually you can grab the source code and just run it.

6. Martin Fowler is a master of the OOP craft, read his work. Pinal Dave is a great source for learning all about SQL Server - gain a deep understanding of how SQL Server stores and indexes data, optimize queries, learn ninja tricks.

7. Practice and Study - goes without saying, right?

Go forth...

Tuesday, January 6, 2015

Assembly Not Found - .net runtime error

If you use a build server to build via msbuild and deploy to a web server, you may come across an error where some assembly is not found after the built code is run on the server. It works locally and perhaps even on other servers, but just not server x.

Following is my explanation of how this happens and what options are available to resolve the issue.

First off, when msbuild is run, it typically gets some information from the project file about how the project should be built. The project file is XML and contains some build details such as how to run nuget and which build targets to use during execution.

Build targets (.targets files) can be included in other build scripts; they are XML files that declaratively set properties, run build tasks, and do some other voodoo. Some examples of tasks are compiling via csc, copying files, and running NuGet.

One build target that is run is Microsoft.Common.targets. This file contains information about where to look for dependent assemblies (references). It directs msbuild to look in all sorts of places like program files directories, the current target framework directory, the GAC, and several others. The first assembly it finds is the one that will be used.

When a referenced assembly is found installed on the machine, msbuild assumes there is no need to copy that assembly to the bin. So when the build server has the assembly installed, it is not copied to the bin even if Copy Local is set to true (I've only witnessed this for NuGet package references so far and haven't thoroughly tested whether it's true for other assemblies).

After the projects are built and staged into the build drop folder, they can be copied out to the target server - or perhaps built directly out to the target server, whichever you do. Now suppose the target server does not have the referenced assembly that the build server has installed. This will be a runtime error! And who knows when it will rear its ugly head!

Here's what can be done:

Option 1: uninstall anything that's available as a NuGet package from your build server and let NuGet pull it in.
Option 2: install the same components on all target servers.
Option 3: customize the build script to always copy local somehow (I haven't explored this, but it seems viable).

Really, just do option 1. The others aren't better. Make sure those assemblies are set to copy local and that package restore is on.

msbuild reference-
http://msdn.microsoft.com/en-us/library/0k6kkbsd.aspx

Example targets file from the Roslyn compiler
http://source.roslyn.codeplex.com/#MSBuildFiles/C/ProgramFiles(x86)/MSBuild/14.0/bin_/amd64/Microsoft.Common.targets