When modeling data an entity refers to a unit that has properties which we are modeling. Some of those properties of the entity are intrinsic (natural properties) and some are extrinsic (externally applied/relative properties). If we model our entities such that the intrinsic properties belong to a root model (Person) and the extrinsic properties belong to subclasses of the root model (Employee) we can establish a proper representation of actual relationships of the entities in our model.
To illustrate, let's take a common example of modeling a person who is an employee at a specific company.
The person has some natural properties: Name, DOB, Gender, etc. These properties describe the entity in its base state. As an employee, the person may have additional properties that relate specifically to their state as an employee: id, email address, extension, location, title, etc. This person will also have some personal properties that may be interesting to model: email(s), phone number(s), address(es), family, etc.
If we are modeling the data as a relational model, none of the columns in the data tables should be nullable. Nullable columns are a clear indicator that the data can be further normalized and that it does not likely represent the entities accurately.
To achieve the goal of creating our model, we need to be able to create the proper relationships:
Person - Email (1:*)
Person - Address (*:*)
These relationships bring up interesting questions (and leave ambiguity in the model).
If someone has more than one address and one primary address is a requirement (assuming we are not modeling homeless), then how does a many-to-many relationship enforce the many-to-one relationship that must exist to model the persons primary residence? Address may be either of two types: primary xor non-primary. We could set a bit flag to identify primary. In order to enforce the *:1 (an address can be primary for more than one person) constraint with the primary, we would need an FK on the Person table and a reference to the address table. In the *:* relationship above, how can that FK be constrained to the PrimaryAddress type? The answer lay in table-per-type (TPT). We want to get to this:
Person - Address (*:*)
Pesrson - PrimaryAddress (*:1)
To implement the TPT for this model, we would need an Address table a PrimaryAddress table and a AdditionalAddress table. The Address would contain the details of the address and the PK. But each typed table would contain an FK to the Address table which is also its own PK. The Person record would contain a FK ref to the PrimaryAddress. An additional xref table would be needed to link the Address table to the Person table in general, this would be a *:* relationship. This PersonAddress table can better represent the fact that more than one Person can live at the same Address and a Person may have more than one Address. The FK to the PrimaryAddress will enforce and show that the Person must have a PrimaryAddress and the 1:1 relationship between PrimaryAddress and Address will establish which Address is that Person's primary Address.
It is important to recognize the boundaries of the entity since any entity may serve as the root. It is equally important to establish distinctions between different types/classifications of entities. Here are a few examples of these relationships:
Person (base type) -> [Employee, Parent, Volunteer, Student, Teacher] (sub-types)
Contact (base type) -> [Email, Phone -> [Internal], Skype, Signal -> [Tree, Smoke, Call]] (sub-types)
Relationship (base type) -> [Friend -> [Best, Acquaintance], Family -> [Sibling -> [Brother, Sister], Parent -> [Mother, Father], Cousin, ...] (sub-types)
In this way we can establish meaningful relationships in the data model, while accurately representing the entities we are modeling.