Showing posts with label SOLID. Show all posts

Wednesday, July 5, 2017

What is Dependency Injection and Why Does it Matter?

While DI may be second nature to most working developers, it recently occurred to me that entire batches of newly minted grads are entering the industry who, by and large, are not familiar with the concept. Don't worry, you aren't the only ones! Although this topic has been bludgeoned to death by bloggers, I feel the need to bubble it up here as well - sorry, it's a fun topic for me and I strongly feel that it's necessary to understand the principle. So if you are reading this and thinking "here we go again!", just give it a glance. But if you're thinking "what's this cool thing about?", or you're still foggy on the principle, read on and see if this doesn't clear things up.


I like to think about Dependency Injection/Inversion (DI) like this: a consumer should not be responsible for producing a resource that it consumes; the execution context should do that.

For a concrete example, consider the need to check whether or not a user is currently enrolled for mentoring. I'm going to be abstract in terms of implementation in order to make things a bit more language and platform agnostic (aside from the example being a web UI)...


We have a web site for managing mentoring opportunities. Users who are mentors can navigate to the mentor page; users who are not mentors cannot. For users who are not mentors, we also want to check whether they have enough XP to serve as a mentor - if they do, we encourage them to sign up for mentoring.

There are a few scenarios at play here so let's iterate those with specific examples:


Background:

| user  | XP   | is mentor |
| Tyler | 564  | yes       |
| Julie | 8274 | no        |
| Jeff  | 128  | no        |


Scenario: User is a mentor.


Given user "Tyler" is logged in
When the navigation is shown
Then the user should see the mentors link.

Given user "Tyler" is logged in
When "Tyler" goes to the Mentors page
Then the Mentors page should be shown.


Scenario: User is not a mentor and is not qualified to mentor.



Given user "Jeff" is logged in
When the navigation is shown
Then the user should not see the mentors link.

Given user "Jeff" is logged in
When "Jeff" goes to the Mentors page
Then the user should be redirected to the home page.


Scenario: User is not a mentor and is qualified to mentor.


Given user "Julie" is logged in
When the navigation is shown
Then the user should see the mentors link, but it should be disabled.

Given user "Julie" is logged in
When "Julie" goes to the Mentors page
Then the user should be redirected to sign up for mentoring.

Given user "Julie" is logged in
When messages are displayed
And the user has not accepted the offer to become a mentor
Then an invite to become a mentor should be included in the messages.


Scenario: User becomes a mentor.


Given user "Julie" is logged in
And "Julie" has signed up to become a mentor
When the navigation is shown
Then the user should see the mentors link.

Given user "Julie" is logged in
And "Julie" has signed up to become a mentor
When "Julie" goes to the Mentors page
Then the Mentors page should be shown.



That's a lot of description, but it's worth noting the various scenarios in this example. Now, back to focusing on the DI part...

So we have a module (note: I'm using "module" to mean a class or function or whatever the unit is called in whatever language you use) for each component of the website - the navigation bar, the mentor page, the mentor signup page. Those modules each need information about the user: is the user a mentor, and can they become a mentor? Let's make a User module - one that can be passed into each web module and that can answer those questions. Let's see how that would look:

User:
        isMentor() : boolean
        isQualifiedForMentor() : boolean
        becomeMentor() : boolean

Navigation:
        getLinks(user) : links

MentorPage:
        authorizeUser(user)

MentorSignupPage:
        authorizeUser(user)
        signup(user)

The User module would be passed to the web modules by the request context (usually in the web framework) - the same thing that initializes the given web module.

So, something in the web request process will authenticate the user, load the user data, and pass the user details to the web page module as a User module. That User will be used to answer questions for each module that depends on it.

Notice what we are not going to do here - we are NOT going to call the database from within authorizeUser(), getLinks(), or signup()...those are details those modules simply should not care about! What they care about is whether or not the user is a mentor or is qualified to be one. Even the Signup page doesn't care about HOW to sign up the user, it just needs to direct something else to do the actual signup.
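To make that concrete, here is a minimal Python sketch of the outline above. The snake_case names, the link labels, the page names and the `handle_request` function are my own stand-ins rather than any real framework's API, and I set the "qualified" flag directly because the XP threshold isn't specified here.

```python
from dataclasses import dataclass


@dataclass
class User:
    """Answers the questions the web modules ask; how it was loaded is not their concern."""
    name: str
    mentor: bool
    qualified: bool

    def is_mentor(self) -> bool:
        return self.mentor

    def is_qualified_for_mentor(self) -> bool:
        return self.qualified


class Navigation:
    def get_links(self, user) -> list:
        # No database access here: the injected user already has the answers.
        links = ["home"]
        if user.is_mentor():
            links.append("mentors")
        elif user.is_qualified_for_mentor():
            links.append("mentors (disabled)")
        return links


class MentorPage:
    def authorize_user(self, user) -> str:
        """Return the name of the page the request should end up on."""
        if user.is_mentor():
            return "mentors"
        if user.is_qualified_for_mentor():
            return "mentor-signup"
        return "home"


def handle_request(load_user, user_name: str, page) -> str:
    """Stand-in for the request context: it loads the User and injects it."""
    user = load_user(user_name)
    return page.authorize_user(user)
```

Calling `handle_request(users.get, "Julie", MentorPage())` with a plain dictionary of users is enough to drive the page. Neither `Navigation` nor `MentorPage` knows or cares where the user came from.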

A level deeper...

The User itself could have some dependencies. Let's say the User needs to send a message to some signup processor - perhaps via a message queue, so that the signup processor can pick messages off the queue and process them at its own pace (and the user won't have to wait for processing to complete).

Let's say that at first it handles the signup right there in the web process. Now let's say the site scales massively, and running the signup process within the web processes impacts all users of the site due to the additional load on the web servers - we would want to offload that process. When we use DI, we can change the way the User handles the signup without changing the consumer (the User module).

User:
    internals/private members/curried functions:
        _mentorSignup/signupForMentoring(user_id)
    public function:
        becomeMentor() : boolean
             calls _mentorSignup -> signup(user_id) or signupForMentoring(user_id)

Whatever happens when a user signs up is passed into the user module by the module that constructs the entire web request handling process (or whatever consumes the User for that matter).

The MentorSignup dependency (the one User is depending on) can either handle the signup itself OR pass off the signup to some other process that runs asynchronously so that the current process can continue execution without delay. It can be changed without touching the User module or any module that consumes User (theoretically).
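Here is a rough Python sketch of that swap. The factory names (`make_inline_signup`, `make_queued_signup`) and the message shape are made up for illustration, and a `deque` stands in for a real message queue.

```python
from collections import deque
from typing import Callable


class User:
    def __init__(self, user_id: int, mentor_signup: Callable[[int], bool]):
        self._user_id = user_id
        self._mentor_signup = mentor_signup  # injected; User never builds it

    def become_mentor(self) -> bool:
        return self._mentor_signup(self._user_id)


def make_inline_signup(mentor_ids: set) -> Callable[[int], bool]:
    """Handle the signup right away, inside the web process."""
    def signup(user_id: int) -> bool:
        mentor_ids.add(user_id)
        return True
    return signup


def make_queued_signup(queue: deque) -> Callable[[int], bool]:
    """Hand the signup off; a separate processor drains the queue at its own pace."""
    def signup(user_id: int) -> bool:
        queue.append({"type": "mentor_signup", "user_id": user_id})
        return True  # accepted for processing, not yet processed
    return signup
```

`User(42, make_inline_signup(mentors))` and `User(42, make_queued_signup(pending))` behave identically from the caller's point of view. The only code that changes when the site outgrows inline processing is the code that wires things together.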

A goal of good system design is to produce a product that can be changed at minimal cost - because software is soft (malleable, changeable).

Being able to change the MentorSignup implementation without changing its consumers is the ideal. However, there is one problem - if MentorSignup is consumed in such a way that the web modules must be recompiled, or the MentorSignup module must otherwise be replaced inside those modules, then we haven't achieved the ultimate goal. We are still coupled architecturally.

Instead, we could have some communication between the web process and the signup process using a common interchange protocol such as HTTP and a mechanism to tell the signup process that a specific user would like to sign up to become a mentor. That's getting in a bit deeper than DI so I'll back off a little...

So here's what we have now: web modules, a User, a MentorSignup module, and something to build it all up and pass those dependencies in (MentorSignup into User, User into web modules).

We can do some interesting things with that - we can create some fake User code that gives us pre-defined answers so that we can build and test the web modules without depending on all the details of how a User is signed up, how user data is stored and retrieved, etc. We've decoupled the web modules from the dependency on the user data.



How else could we do it? Let's say we started out with the user data in a db and coded everything into the web modules directly:

Navigation:
        lookupUserLinks ... open a db connection, run query, map results, close connection, use results

MentorPage:
        authorizeUser(user) ... open db connection, run query, map results, close connection, use results

MentorSignupPage:
        authorizeUser(user) ... open db connection, run query, map results, close connection, use results
        signup(user) ... open db connection, execute command...open smtp...build email...send email...close db connection...close smtp...open different db connection....execute another command...close different db connection...etc

For one thing, we have a lot more room for errors...and for duplicating those errors. For another, our web modules would be difficult to understand - what does all this db code have to do with redirecting a user??? Ok, you say...so let's pull that out and put it into a User module, then call up the User module to run the code...

Navigation:
        lookupUserLinks ... create User module...use User module to query (get user)...use results

MentorPage:
        authorizeUser(user) ... create User module...use User module to query (get user)...use results

MentorSignupPage:
        authorizeUser(user) ... create User module...use User module to query (get user)...use results
        signup(user) ... create User module...use User module to query...use results...create Email module...use Email module to create and send email

At least the "goo" is centralized and has some re-usability - we no longer have a copy and paste scenario. But now we're still coupled to the specific User module that the functions know about directly! So we can't change those without changing the web modules - we can't do cool things like swap the User module on the fly or in test scenarios!

Now we get into programming to abstractions...we've already abstracted the details somewhat, but we've got this wicked coupling to only one specific implementation. So let's level that up and program to an abstraction...

For the purpose of building the web module, we don't care about the details of how to get the user, we don't even care that the user was got at all! What we really care about is that we have something to tell us certain things about the user and something that will sign the user up for mentoring. That's the abstraction we care about. We'll get into the details of implementation later in the development process - regardless of what those are, this thing should still work as described above in those Scenarios.

An important note on abstractions - they can leak! While we may only care logically that the user is a mentor, we also care that finding out does not take ages...if it did, that would be a leak in the abstraction! We really want to find out almost immediately when that block of consuming code runs - therefore we may want to load up all the user data BEFORE the web page is shown to the user. And of course we know this needs to be done quickly as well so we'll have to consider that in the implementation of the User module.
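One way to plug that particular leak is to do the slow lookup once, up front, and hand the page a snapshot. This is only a sketch: `fetch_row` stands in for whatever actually reads the user data, and the row keys are made up.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class LoadedUser:
    """Snapshot of user data; every answer comes from memory."""
    mentor: bool
    qualified: bool

    def is_mentor(self) -> bool:
        return self.mentor

    def is_qualified_for_mentor(self) -> bool:
        return self.qualified


def load_user(fetch_row: Callable[[str], dict], name: str) -> LoadedUser:
    """Do the one slow lookup up front, before any page code asks a question."""
    row = fetch_row(name)
    return LoadedUser(mentor=row["is_mentor"], qualified=row["is_qualified"])
```

However many times the page asks `is_mentor()` afterwards, `fetch_row` has been called exactly once. The trade-off is that the snapshot can go stale during a long request, which is usually acceptable for a single page render.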

Now that we've gotten this far and (hopefully) you can see how DI and many other fundamentals (programming to abstractions) are important to producing good software, let's take a look at one of the primary benefits of having done so...testing.

Now that we've designed the system in a way that depends on abstractions whose implementations are injected, we can swap out the implementation. For each of the defined scenarios described waaay above in this post, we can create a User that gives the pre-defined responses and exercise the code without depending on complicated - one-time only, db interactive - test setups.

We can have setups for the first three Scenarios similar to:

MentorUser : User
    isMentor(): true


NonMentorNotQualifiedUser : User
    isMentor(): false
    isQualifiedForMentor() : false


NonMentorQualifiedUser : User
    isMentor(): false
    isQualifiedForMentor() : true
    becomeMentor(): true

Well, there you have a whole set of subtypes, all based on User, and the simplest of them needs only isMentor()! Of course each has its own place...its own domain...we could further break those down to become specific dependencies of each web module, but that'll be enough detail for now...
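In Python, those test users and a test that uses them might look something like this. `mentor_page_destination` is a made-up stand-in for the code under test, and the page names are illustrative.

```python
class MentorUser:
    def is_mentor(self) -> bool:
        return True


class NonMentorNotQualifiedUser:
    def is_mentor(self) -> bool:
        return False

    def is_qualified_for_mentor(self) -> bool:
        return False


class NonMentorQualifiedUser:
    def is_mentor(self) -> bool:
        return False

    def is_qualified_for_mentor(self) -> bool:
        return True

    def become_mentor(self) -> bool:
        return True


def mentor_page_destination(user) -> str:
    """Hypothetical code under test: where a visit to the Mentors page ends up."""
    if user.is_mentor():
        return "mentors"
    if user.is_qualified_for_mentor():
        return "mentor-signup"
    return "home"


def test_mentor_page_scenarios() -> None:
    # One canned user per scenario; no database, no fixtures to reset.
    assert mentor_page_destination(MentorUser()) == "mentors"
    assert mentor_page_destination(NonMentorNotQualifiedUser()) == "home"
    assert mentor_page_destination(NonMentorQualifiedUser()) == "mentor-signup"
```

Note that `MentorUser` gets away with a single method because the code under test never asks a mentor whether they are qualified.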

Think about the DI principle and other scenarios where you may want to swap some dependency based on context...connected to the internet vs. not connected, testing vs. production, logging per environment, exception handling per environment, multi-tenant apps...just to name a few.

If you've now come to understand the DI principle then you've gained some valuable XP! Next you'll probably want to exercise the idea...look for something soon that'll help with that (hint:subscribe now to see what happens on Friday)!

Wednesday, July 9, 2014

SOLID DB principles

In my last post, I rattled off a little bit about creating a db model that conforms to SOLID development principles. I presented the idea to some colleagues and it seemed to go over well. Today I will attempt to expound upon the principle and how a good database design can conform to each piece of the SOLID acronym.

S - single responsibility - each table should have a single responsibility. This means no god tables with nearly infinite nullable columns that can capture more than one type (sub-type actually) of thing.

O - open-closed - open to extension, closed to modification. Create linking tables or type tables to extend a base table, don't change the base table. Add views to create reusable components. Always create list/type tables so that new types of data can be linked to metadata. Reuse data when possible (no is no). If you really, really need to change the attributes frequently, consider EAV - but with caution.

L - Liskov substitution - Comment is a base table, PostComment is a sub table. They map to types in object-land. Learn the inheritance mapping strategies: TPT (table per type), TPH (table per hierarchy), etc.

I - interface segregation - there aren't really interfaces. Just make sure again that your tables, and perhaps views and other objects such as stored procedures/functions (especially these) aren't doing more than one thing.

D - dependency inversion (DI) - this applies more to the DAL than anything. Using DI allows for switching the database technology later, or even spreading your data model across technologies. Either way, all other code/modules/layers should be agnostic of the data store - abstract that baby away! One could go so far as to write adapters that adapt the db access technology to a custom interface, should one want to switch from something like Entity Framework to NHibernate. EmberJS does something like that in the ember-data library: consumers make all calls through ember-data objects, but can map the calls to whatever their storage technology requires. I truly can't think of any way to apply DI in SQL itself; perhaps noSQL implementations will have varying abilities to do so with map/reduce functions. I would say that the best way to practice good DI with SQL is to avoid any sort of business logic in stored procedures. I have seen some pretty tricky DI being simulated in stored procedure chains, with EXEC calls on dynamic SQL and things like that. I avoid those sorts of things like the plague these days unless it's SQL generated by Entity Framework.
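For what the adapter idea could look like on the application side, here is a small Python sketch. `TaskStore`, its two implementations, `plan_sprint` and the `Title` column are all invented for illustration, and SQLite simply stands in for "some database technology".

```python
import sqlite3
from typing import Protocol


class TaskStore(Protocol):
    """The custom interface the rest of the code depends on."""
    def add(self, title: str) -> int: ...
    def titles(self) -> list: ...


class InMemoryTaskStore:
    def __init__(self) -> None:
        self._titles = []

    def add(self, title: str) -> int:
        self._titles.append(title)
        return len(self._titles)

    def titles(self) -> list:
        return list(self._titles)


class SqliteTaskStore:
    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS Task "
            "(TaskID INTEGER PRIMARY KEY, Title TEXT NOT NULL)"
        )

    def add(self, title: str) -> int:
        cur = self._conn.execute("INSERT INTO Task (Title) VALUES (?)", (title,))
        return cur.lastrowid

    def titles(self) -> list:
        rows = self._conn.execute("SELECT Title FROM Task ORDER BY TaskID")
        return [row[0] for row in rows]


def plan_sprint(store: TaskStore) -> list:
    """Business code: knows the TaskStore interface, not the storage technology."""
    store.add("write schema")
    store.add("review schema")
    return store.titles()
```

`plan_sprint(InMemoryTaskStore())` and `plan_sprint(SqliteTaskStore(sqlite3.connect(":memory:")))` return the same list. Swapping the store means changing the one line that constructs it.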

Anyways, that's my attempt to SOLIDify database design. I feel kind of floppy about the whole DI rambling, so if anyone has some good ideas about that one, feel free to chip in!

Happy Coding!

Monday, July 7, 2014

SOLID data relationships

One approach I like to take when designing a database schema is to focus on the relationships before adding columns that are not involved in the relationships (in a relational database that is).

Structure and integrity are the most important components of a good database design. One approach is to model the data structures first. If you are working in an object-oriented language, you could build your objects first with minimal properties. By doing this and thinking about SOLID development principles, we can keep our database design in alignment with our code development practices.

In C#:
public class Person
{
    public int PersonID { get; set; }
    // Navigation property: the tasks this person is assigned to.
    public virtual ICollection<Task> Tasks { get; set; }
}
Or even using "class/table diagrams":

Person
-PersonID int PK
-Tasks Task[]

Task
-TaskID int PK
-Assignees Person[]

In this model we can already see that a person can have many tasks and a task can be assigned to many people. We can also see that the Task class will not map directly to relational tables: there is only one task, yet it can be assigned to multiple people. Therefore we will need a linking table to represent the many-to-many relationship between people and tasks. We would likely want to collect more data about each task assignment, so the intermediary table should probably have its own model.

TaskAssignment
-TaskAssignmentID int PK
-TaskID int FK
-PersonID int FK
-details...

Person
-PersonID int PK
-TaskAssignments TaskAssignment[]

Task
-TaskID int PK
-TaskAssignments TaskAssignment[]

Now we can track each assignment and later add props like CompletedDate, or hook up another link table to comments so that we can save multiple comments, with metadata related to each one.

While many-to-many relationships with a single comments table would work, there is a slight issue with data integrity. A single comment should relate to a single TaskAssignment (or whatever else we'd like to tie comments to), and there should be zero-to-many Comments for a TaskAssignment. Note that we might be tempted to create a combo PK on TaskAssignment using TaskID and PersonID and do away with TaskAssignmentID, but what if it's a long-running task and a person can be assigned to it again?
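The re-assignment question can be tried out quickly with SQLite from Python. This is just a sketch: `TaskAssignmentCompositeKey` is a throwaway table I added to contrast the combo PK with the surrogate key, and the `assign` helper is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Person (PersonID INTEGER PRIMARY KEY);
    CREATE TABLE Task (TaskID INTEGER PRIMARY KEY);

    -- Surrogate key: the same person/task pair may appear more than once.
    CREATE TABLE TaskAssignment (
        TaskAssignmentID INTEGER PRIMARY KEY,
        TaskID INTEGER NOT NULL REFERENCES Task (TaskID),
        PersonID INTEGER NOT NULL REFERENCES Person (PersonID)
    );

    -- Combo PK, for contrast: each person/task pair is allowed only once.
    CREATE TABLE TaskAssignmentCompositeKey (
        TaskID INTEGER NOT NULL REFERENCES Task (TaskID),
        PersonID INTEGER NOT NULL REFERENCES Person (PersonID),
        PRIMARY KEY (TaskID, PersonID)
    );

    INSERT INTO Person (PersonID) VALUES (1);
    INSERT INTO Task (TaskID) VALUES (10);
""")


def assign(table: str, task_id: int, person_id: int) -> bool:
    """Try to record an assignment; False means a constraint rejected it."""
    try:
        conn.execute(
            f"INSERT INTO {table} (TaskID, PersonID) VALUES (?, ?)",
            (task_id, person_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

Calling `assign("TaskAssignment", 10, 1)` twice succeeds both times, while the second `assign("TaskAssignmentCompositeKey", 10, 1)` is rejected by the primary key. In both tables the foreign keys still refuse assignments to tasks or people that don't exist.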

One way is to add a comments table each time we need comments to relate to another entity. But then our query, should we want to view all comments related to some specific root, would be really complicated. Another approach would be to have a single Comments entity with a CommentType attribute; whenever we need to add comments to some other table, we add an entity that relates a specific comment to the target table.

Comments
-CommentID int PK
-CommentType CommentType FK
-details...

PersonComments
-CommentID int PK, FK
-PersonID int FK

TaskComments
-CommentID int PK, FK
-TaskID int FK

TaskAssignmentComments
-CommentID...etc

Now we don't have to violate the Open-Closed principle to add comments, our comments can be typed in code using TPT inheritance (Comments is the abstract base), and we maintain data integrity.

The CommentType is there so that we can query the Comments table and know what type of entity each comment relates to without joining all the ___Comments tables. We will still have to join them if we want to know which specific entity each comment relates to. In .NET, Entity Framework helps significantly with this.
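A rough SQLite sketch of that layout, driven from Python. The `Body` column stands in for "details...", the `CommentType` lookup table and the helper functions are my own additions, and the Person and Task tables are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CommentType (CommentType TEXT PRIMARY KEY);
    CREATE TABLE Comments (
        CommentID INTEGER PRIMARY KEY,
        CommentType TEXT NOT NULL REFERENCES CommentType (CommentType),
        Body TEXT NOT NULL
    );
    -- One sub table per target entity; CommentID is both PK and FK.
    CREATE TABLE PersonComments (
        CommentID INTEGER PRIMARY KEY REFERENCES Comments (CommentID),
        PersonID INTEGER NOT NULL
    );
    CREATE TABLE TaskComments (
        CommentID INTEGER PRIMARY KEY REFERENCES Comments (CommentID),
        TaskID INTEGER NOT NULL
    );
    INSERT INTO CommentType VALUES ('Person'), ('Task');
""")


def add_task_comment(task_id: int, body: str) -> int:
    cur = conn.execute(
        "INSERT INTO Comments (CommentType, Body) VALUES ('Task', ?)", (body,)
    )
    comment_id = cur.lastrowid
    conn.execute(
        "INSERT INTO TaskComments (CommentID, TaskID) VALUES (?, ?)",
        (comment_id, task_id),
    )
    return comment_id


def comment_types() -> list:
    """Which kind of entity each comment belongs to, with no joins."""
    return conn.execute(
        "SELECT CommentID, CommentType FROM Comments ORDER BY CommentID"
    ).fetchall()


def task_for_comment(comment_id: int):
    """Finding the specific task still needs the join to the sub table."""
    row = conn.execute(
        "SELECT t.TaskID FROM Comments c "
        "JOIN TaskComments t ON t.CommentID = c.CommentID "
        "WHERE c.CommentID = ?",
        (comment_id,),
    ).fetchone()
    return row[0] if row else None
```

Adding comments to a new kind of entity means adding one more sub table and one more CommentType row; the Comments table itself is never altered.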

Hope anyone who gets this far will have some neurons firing with regards to structuring a database to support SOLID development practices.