Showing posts with label data. Show all posts

Wednesday, August 16, 2017

Where to Begin the Agile Project

So, you're a programmer and you get a new project to work on. Let's assume this project involves some data source(s), probably some persistence, and user interaction. How do we approach the construction of this system?


In the first place, we'll want to understand what the system needs to do and what it will generally "look like". When I say look like, I mean how it will generally be architected - how does the data flow? What layers? What components? Where do they interface? This involves some basic design and there are several patterns to draw from that would get the job done.


However you design the general structure of the system, you'll need to understand the data flow. You'll be presenting some data to a user, and that data will come from somewhere. It seems obvious that the first step is to begin where the data flow starts... begin at the beginning. However, I say NO!


Through years of trial and error with beginning at the beginning, I've found that you generally have to change things near the data source at least a few times over the life of an "Agile" project.


I've spent much time modeling databases and building data layers for applications, only to find that once the front-end was built on top, the data model had some fundamental flaws. The main trouble is that those flaws only came out when the business stakeholders were able to interact with the system - when they were working with something concrete. Only then did the details truly solidify. But by then, there were layers built up below the presentation layer - interfaces, implementations, tests - that all needed to change to support this new understanding. This leads to much rework!


If we can have a better understanding of the system before there is a full foundation baked into the system, then we can reduce the amount of wasteful rework on a project.


To solve this problem, we have some options. We can build wireframes and prototypes in a Design Sprint. This sort of design-up-front is useful only insofar as our understanding of the system is complete at the beginning of the project. If the requester has a good general understanding, mockups will be relatively more useful - they may trigger insight and understanding earlier in the "Agile" project. It's a way of saying "if I were to build this, would it solve your problem?" I'm for it and it does a lot of good, but it's not where we should end the game!


Another technique I like to use, and I'm urging you to do the same, is to "start at the end"! By starting at the end - meaning the desired outcome of the feature we're building this cycle - we can set some concrete early in the sprint (preferably on Day 1 - Sprint Planning). If you're building an analysis system, start with the graphs and charts. Mock up the data that feeds them. Then build backwards from there. It's best to start where the system provides its value. Is it an invoicing system? Start with the invoice itself! Work back to entering the billing info.
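To make "start with the invoice itself" a bit more concrete, here is a minimal sketch in C#. Everything in it is an assumption made up for illustration (the Invoice and InvoiceLine classes, the IInvoiceSource interface, the sample customer and amounts); none of it comes from a real project. The idea is only that the screen talks to an interface, and a hard-coded mock stands behind it until the real data layer is worth building.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of the finished invoice; all names are illustrative.
public class InvoiceLine
{
    public string Description { get; set; } = "";
    public int Quantity { get; set; }
    public decimal UnitPrice { get; set; }
    public decimal Total => Quantity * UnitPrice;
}

public class Invoice
{
    public string Number { get; set; } = "";
    public string Customer { get; set; } = "";
    public List<InvoiceLine> Lines { get; set; } = new List<InvoiceLine>();
    public decimal Total => Lines.Sum(line => line.Total);
}

// The seam the presentation layer depends on. The real data layer comes later.
public interface IInvoiceSource
{
    Invoice GetInvoice(string number);
}

// Hard-coded stand-in so stakeholders can react to a working screen early.
public class MockInvoiceSource : IInvoiceSource
{
    public Invoice GetInvoice(string number)
    {
        return new Invoice
        {
            Number = number,
            Customer = "Acme Ltd.", // made-up sample data
            Lines = new List<InvoiceLine>
            {
                new InvoiceLine { Description = "Consulting hours", Quantity = 10, UnitPrice = 120.00m },
                new InvoiceLine { Description = "Travel", Quantity = 1, UnitPrice = 85.50m },
            }
        };
    }
}

public static class InvoiceView
{
    // Plain-text rendering; stands in for whatever UI the project really uses.
    public static string Render(Invoice invoice)
    {
        var rows = invoice.Lines.Select(l => l.Description + " x" + l.Quantity);
        return "Invoice " + invoice.Number + " for " + invoice.Customer + "\n"
             + string.Join("\n", rows);
    }
}
```

Once the stakeholder agrees on what the invoice should show, a database-backed IInvoiceSource can replace the mock without the view having to change.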


When we do this - and it doesn't have to look perfectly pretty yet - we give the business stakeholder something to interact with BEFORE we build up all the supporting layers underneath it and put on the fresh coat of paint. It's like building a custom bike starting with the seat and handlebars first...then you put on the wheels and a temporary engine to test the rake of the front forks, footpegs, shifter, grips, brake levers, etc...the user interface. Then they know it's built to what they need - after all it is a custom!


But software is a bit different than building a hog. Users can't see or feel the engine directly. They interact through whatever UI you put on there - there can be mockups and design drawings and diagrams, but it's the concretion that will have the most meaning. Besides, mockups are throwaway - waste. You can build a mockup with the actual code; it's easy these days to get a basic UI up and running quickly. Better to build the real thing and then iterate from there, one little manageable decision at a time. Better to have to refactor early in one layer than have to chase a change up the whole stack!

Monday, July 7, 2014

SOLID data relationships

One approach I like to take when designing a database schema is to focus on the relationships before adding columns that are not involved in the relationships (in a relational database, that is).

Structure and integrity are the most important components of a good database design. One approach is to model the data structures first. If you are working in an object-oriented language, you could build your objects first with minimal properties. By doing this and thinking about SOLID development principles, we can keep our database design in alignment with our code development practices.

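In C#:

    using System.Collections.Generic;

    public class Person
    {
        // Primary key
        public int PersonID { get; set; }
        // Navigation property: the tasks assigned to this person
        public virtual ICollection<Task> Tasks { get; set; }
    }

Or even using "class/table diagrams":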

Person
-PersonID int PK
-Tasks Task[]

Task
-TaskID int PK
-Assignees Person[]

In this model we can already see that a person can have many tasks and a task can be assigned to many people. We can also see that these classes will not map directly to relational tables: there is only one row for a task, yet that task can be assigned to multiple people. Therefore we will need a linking table to represent the many-to-many relationship between people and tasks. We would likely want to collect more data about each task assignment, so the intermediary table should probably have its own model.

TaskAssignment
-TaskAssignmentID int PK
-TaskID int FK
-PersonID int FK
-details...

Person
-PersonID int PK
-TaskAssignments TaskAssignment[]

Task
-TaskID int PK
-TaskAssignments TaskAssignment[]
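
The diagram above could be written in C# roughly as follows. This is a minimal sketch that sticks to the names in the diagram; the Task and Person navigation properties on TaskAssignment and the Assignments.Assign helper are assumptions added for illustration, and no ORM mapping is shown.

```csharp
using System;
using System.Collections.Generic;

public class Person
{
    public int PersonID { get; set; }
    public virtual ICollection<TaskAssignment> TaskAssignments { get; set; } = new List<TaskAssignment>();
}

public class Task
{
    public int TaskID { get; set; }
    public virtual ICollection<TaskAssignment> TaskAssignments { get; set; } = new List<TaskAssignment>();
}

// The link table promoted to its own entity so it can carry its own details.
public class TaskAssignment
{
    public int TaskAssignmentID { get; set; }
    public int TaskID { get; set; }
    public int PersonID { get; set; }
    // Assumed navigation properties back to each side of the relationship.
    public virtual Task Task { get; set; }
    public virtual Person Person { get; set; }
}

public static class Assignments
{
    // Hypothetical helper: wires up both ends of the relationship in memory.
    public static TaskAssignment Assign(int assignmentId, Person person, Task task)
    {
        var assignment = new TaskAssignment
        {
            TaskAssignmentID = assignmentId,
            TaskID = task.TaskID,
            PersonID = person.PersonID,
            Task = task,
            Person = person
        };
        person.TaskAssignments.Add(assignment);
        task.TaskAssignments.Add(assignment);
        return assignment;
    }
}
```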

Now we can track each assignment and later add properties like CompletedDate, or hook up another link table to comments so that multiple comments, each with its own metadata, can be saved against an assignment.

While a many-to-many relationship with a single comments table would work, there is a slight issue with data integrity. A single comment should relate to a single TaskAssignment (or whatever else we'd like to tie comments to), and there should be zero-to-many Comments for a TaskAssignment. Note that we might be tempted to create a composite PK on TaskAssignments using TaskID and PersonID and do away with TaskAssignmentID, but what if it's a long-running task and a person can be assigned to it again?
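
The re-assignment problem can be shown with a small in-memory sketch. This is not database code: a HashSet stands in for a table with a composite primary key, a Dictionary stands in for a table keyed by TaskAssignmentID, and the KeyChoiceDemo class and its method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class KeyChoiceDemo
{
    // Simulates a composite primary key on (TaskID, PersonID).
    // Returns false when the same pair is inserted a second time.
    public static bool TryAddWithCompositeKey(
        HashSet<(int TaskID, int PersonID)> table, int taskId, int personId)
    {
        return table.Add((taskId, personId));
    }

    // Simulates a surrogate key: each row gets its own TaskAssignmentID,
    // so the same (TaskID, PersonID) pair may appear more than once.
    // Count + 1 is a simplification standing in for an identity column.
    public static int AddWithSurrogateKey(
        Dictionary<int, (int TaskID, int PersonID)> table, int taskId, int personId)
    {
        int taskAssignmentId = table.Count + 1;
        table.Add(taskAssignmentId, (taskId, personId));
        return taskAssignmentId;
    }
}
```

With the composite key, the second assignment of the same person to the same task is rejected; with the surrogate key, both assignments are kept as separate rows.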

One way is to add a separate comments table each time we need comments to relate to another entity. But then a query to view all comments related to some specific root would be really complicated. Another approach is to have a single Comments entity with a CommentType attribute; whenever we need to add comments to some other table, we add an entity that relates a specific comment to the target table.

Comments
-CommentID int PK
-CommentType CommentType FK
-details...

PersonComments
-CommentID int PK, FK
-PersonID int FK

TaskComments
-CommentID int PK, FK
-TaskID int FK

TaskAssignmentComments
-CommentID...etc

Now we don't have to violate the Open-Closed principle to add comments to a new entity, our comments can be typed in code using TPT (table-per-type) inheritance with Comments as the abstract base, and we maintain data integrity.
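
In code, the typed comments might look something like the sketch below. It is plain C# with no ORM mapping shown; the CommentType enum values and the Text property (standing in for "details...") are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed lookup values; in the schema, CommentType is an FK to a lookup table.
public enum CommentType { Person, Task, TaskAssignment }

// Abstract base: corresponds to the shared Comments table.
public abstract class Comment
{
    public int CommentID { get; set; }
    public string Text { get; set; } = ""; // stands in for "details..."
    public abstract CommentType CommentType { get; }
}

// Each subtype corresponds to one ___Comments table and adds only its FK.
public class PersonComment : Comment
{
    public int PersonID { get; set; }
    public override CommentType CommentType => CommentType.Person;
}

public class TaskComment : Comment
{
    public int TaskID { get; set; }
    public override CommentType CommentType => CommentType.Task;
}

public class TaskAssignmentComment : Comment
{
    public int TaskAssignmentID { get; set; }
    public override CommentType CommentType => CommentType.TaskAssignment;
}
```

Adding comments to another entity then means adding one more subclass and one more link table, leaving the existing classes untouched.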

The CommentType is there so that we can query the Comments table and know what type of entity each comment relates to without joining all the ___Comments tables. We will still need those joins if we want to know which specific entity each comment relates to. In .NET, Entity Framework helps significantly with this.
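
Here is a rough sketch of that difference, using LINQ over in-memory rows instead of a real database. The row classes, the string-valued CommentType, and the CommentQueries helpers are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Flat rows, as they might come back from the Comments table alone.
public class CommentRow
{
    public int CommentID { get; set; }
    public string CommentType { get; set; } = ""; // assumed to hold the lookup name
    public string Text { get; set; } = "";
}

// Rows from the TaskComments link table.
public class TaskCommentRow
{
    public int CommentID { get; set; }
    public int TaskID { get; set; }
}

public static class CommentQueries
{
    // Needs only the Comments table: CommentType tells us the kind of target.
    public static Dictionary<string, int> CountByType(IEnumerable<CommentRow> comments)
    {
        return comments
            .GroupBy(c => c.CommentType)
            .ToDictionary(g => g.Key, g => g.Count());
    }

    // Needs the link table: only it knows WHICH task a comment belongs to.
    public static List<string> TextsForTask(
        IEnumerable<CommentRow> comments, IEnumerable<TaskCommentRow> taskComments, int taskId)
    {
        return (from link in taskComments
                join c in comments on link.CommentID equals c.CommentID
                where link.TaskID == taskId
                select c.Text).ToList();
    }
}
```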

Hope anyone who gets this far will have some neurons firing with regard to structuring a database to support SOLID development practices.