C# Quick Reference

Last year I started working on a C# quick reference for myself, to have a good list of the “main” features of the language.

It might be missing some features, but I tried to focus on the ones that are primary and most used.

I’ve included the following C# versions at the time of writing this blog post:

  • C# 2.0
  • C# 3.0
  • C# 4.0
  • C# 5.0
  • C# 6.0
  • C# 7.0
  • C# 7.1
  • C# 7.2
  • C# 8.0
  • C# 9.0
  • C# 10.0
  • C# 11.0

https://github.com/lionadi/csharp-quick-reference-guide
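
As a small taste of what the reference covers, here is a hedged sketch combining a few features from the listed versions (records and target-typed new from C# 9, file-scoped namespaces from C# 10, raw string literals from C# 11). The types and values are made up purely for illustration.

// C# 10: file-scoped namespace
namespace QuickReferenceSamples;

// C# 9: positional record
public record Point(double X, double Y);

public static class Samples
{
    public static void Run()
    {
        // C# 9: target-typed new
        Point p = new(1, -2);

        // C# 8/9: switch expression with property and relational patterns
        var quadrant = p switch
        {
            { X: > 0, Y: > 0 } => "I",
            { X: < 0, Y: > 0 } => "II",
            { X: < 0, Y: < 0 } => "III",
            { X: > 0, Y: < 0 } => "IV",
            _ => "on an axis"
        };

        // C# 11: raw string literal
        var json = """
                   { "x": 1, "y": -2 }
                   """;

        System.Console.WriteLine($"Quadrant {quadrant}: {json}");
    }
}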

Lessons Learned: Working with legacy or existing systems

I recently found out about a book on working with legacy code, and it got me thinking about all the times I have worked with legacy projects or existing systems.

So I wondered what I have learned and remember from working with legacy or existing systems. I decided to write my own list before starting the book and then compare notes.

The book in question is: Working Effectively with Legacy Code – Michael C. Feathers

Architecture

  • Try to understand the present architecture by using the system, preferably testing it for some purpose. Try to understand both the system and the application architecture.
  • If you have the possibility and the know-how, try to refactor parts of the architecture. Changing part of an architecture can be a “safe” way to make changes and improve both development quality and performance.
  • If documentation and diagrams are missing, don’t be afraid to make some, especially if things are complex and large. This way you don’t have to re-think things constantly as you move from one part of the architecture to another.

Database

  • Do not read from the database multiple (or a hundred) times if you can do it with one request.
  • Do not write to the database in many separate operations if you can do it in fewer. Favor bulk operations (see the sketch after this list).
  • Do not retrieve more data than you need, in both columns and rows. Verify that you are actually using all of the rows and columns you retrieve.
  • In SQL Server, functions and table definitions can take a toll on performance; prefer views or procedures if possible.
  • Cache data retrieved from the database if possible.
  • Understand your database. For larger, complex queries it might be a good idea to use temp tables or similar to cache intermediate results if they are used later in the same process.
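
To illustrate the batching point above, here is a minimal C# sketch. The IOrderRepository interface and Order record are hypothetical; the idea is only to show the shape of the N-round-trips anti-pattern versus a single batched query.

using System.Collections.Generic;
using System.Linq;

// Hypothetical repository abstraction, for illustration only.
public interface IOrderRepository
{
    Order GetOrder(int orderId);                          // one database round trip per call
    IReadOnlyList<Order> GetOrders(IEnumerable<int> ids); // single batched query
}

public record Order(int Id, decimal Total);

public static class OrderLoader
{
    // Anti-pattern: N round trips, one per id.
    public static List<Order> LoadOneByOne(IOrderRepository repo, IEnumerable<int> ids)
        => ids.Select(repo.GetOrder).ToList();

    // Preferred: one round trip that retrieves only the rows and columns needed.
    public static List<Order> LoadBatched(IOrderRepository repo, IEnumerable<int> ids)
        => repo.GetOrders(ids).ToList();
}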

Data Tools like Azure Data Factory

  • Avoid loops if possible; replace them with a single explicit call or a few calls. Each request adds several layers of latency:
    • ADF adds its own latency while starting your activity
    • Network and other communication latencies, request and response
    • Then, and only then, your actual operation with its own processing time

Code

  • Parallelization is not always the solution; depending on your available resources, there may not be much benefit in running things in parallel. Understand the hardware limitations of where your code runs: the number of CPU cores, memory, drive speed, etc.
  • Data structures matter. Know your data and, based on it, use the data structures that provide the best functionality and performance. Sometimes 3rd-party code brings bloated functionality that negatively impacts your application’s performance. In such cases, do not be afraid to identify the problem areas and create your own data structures with the corresponding functionality.
  • Remove unused code.
  • Comment code that you are not going to refactor but that is hard to understand and maintain, especially once you have built an understanding by working with it. You will most likely forget the finer details when returning to it after a long pause.
  • Use different patterns and guidelines to help you refactor your code.
  • Use your knowledge of cohesion and coupling to refactor your code.
  • Use extract-method and extract-class refactorings to modularize your code (a small sketch follows this list).
  • If you do not understand what the code of an application does, try understanding it by testing parts of it either under a debugger and/or with input data.
  • The best place to start understanding an application is its topmost interface, the layer the outside world uses to communicate with it. This can be an API call, a file being loaded from a main function, etc. The idea is to exercise these interfaces and see what happens: what kind of data they consume, what they do internally, and what kind of data they output.
  • Regarding the previous step, combine domain knowledge with any other functionality knowledge to focus your investigation. You may encounter “dead” code that does nothing, very little, or is not used often. This should give you an understanding of what is important in the system and where to focus your time and energy.
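
As referenced above, here is a minimal before/after sketch of the extract-method refactoring. The pricing example is made up; the point is only that each concern ends up in its own small, testable method.

public static class PricingExample
{
    // Before: one method mixing validation, calculation, and formatting.
    public static string Summarize(decimal price, int quantity)
    {
        if (quantity <= 0) throw new System.ArgumentOutOfRangeException(nameof(quantity));
        var total = price * quantity;
        return $"Total: {total:C}";
    }

    // After extract method: the same behavior, split into intention-revealing pieces.
    public static string SummarizeRefactored(decimal price, int quantity)
    {
        Validate(quantity);
        var total = CalculateTotal(price, quantity);
        return Format(total);
    }

    private static void Validate(int quantity)
    {
        if (quantity <= 0) throw new System.ArgumentOutOfRangeException(nameof(quantity));
    }

    private static decimal CalculateTotal(decimal price, int quantity) => price * quantity;

    private static string Format(decimal total) => $"Total: {total:C}";
}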

Testing and performance/load/stress testing

  • Verify system performance with performance tests like load tests
  • Use a baseline where requests are run one after the other; this will tell you how things work and how fast. Notice: this is specific to your project, architecture, and implementation details. If the system is a heavy data-processing entity, then running requests one after the other is an excellent way to get a baseline of how things are and how they have improved after changes.
  • Use stress testing to discover slow operations or operations that clog your system. After fixing them, rerun the same test to see how things have improved, then run an even heavier set of tests to find the next point where your system clogs. Repeat until nothing clogs your system anymore or until performance satisfies your needs.
  • Run integration tests comparing the old code with the new code until all refactoring is done.
  • Take snapshots of the outputs the code provides and compare the new code’s outputs against them. Cover all the variations the code produces. You can compare subsets of the output or the final result, such as what ends up in a table (a minimal sketch follows this list).
  • If you are refactoring something and there are no prior tests, create them. It will help you and your team keep quality higher and avoid regressions.
  • The important thing is to discover the slow parts of your systems and applications. A method I use is isolation: by looking at a system or application, you try to discover the different parts that make up the whole. This creates an imaginary isolation that lets you test how each part performs. When you identify a slow portion, take that portion and subdivide it further until you have found the most taxing part(s) of the system or application.
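
Here is a minimal sketch of the snapshot-comparison idea, assuming NUnit and two hypothetical classes standing in for the legacy and refactored code paths.

using NUnit.Framework;

// Hypothetical stand-ins for the old and new implementations.
public static class LegacyCalculator { public static decimal Vat(decimal net) => net * 0.24m; }
public static class RefactoredCalculator { public static decimal Vat(decimal net) => net * 0.24m; }

[TestFixture]
public class CharacterizationTests
{
    [Test]
    public void RefactoredOutputMatchesLegacySnapshot()
    {
        // The "snapshot": known inputs covering the variations the old code handles.
        var inputs = new decimal[] { 0m, 10m, 99.99m, 100000m };

        foreach (var net in inputs)
        {
            Assert.That(RefactoredCalculator.Vat(net), Is.EqualTo(LegacyCalculator.Vat(net)));
        }
    }
}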

Communication

  • Communication is hard to get right; not all people are good at it, and we have different views and knowledge. Have the courage to ask for help and express your ideas, and listen to others.
  • Use common channels where all related people, technical and otherwise, can communicate and follow what has happened, what is happening, and what will be done.
  • In general, avoid private messages related to project-specific work, problem solving, solutions, etc.
  • In my opinion, open communication between team members helps morale, keeps everyone informed, and allows anyone to step in and help in any situation.

Documentation

  • If there is no documentation, do not be afraid to create some, especially on topics and areas that are complex to understand, be they system or application related.
  • If possible use visuals to demonstrate how something works.
  • You don’t need to document everything, but business-critical areas, functionality, and complex logic are good to have described.

Last words

There are probably more things that I could add but this is a good start. I’ll have to come back to this list and revise it at some point.

Lessons Learned – Pulumi Infrastructure as Code, for real :)

Intro

I’ve worked with Pulumi on a couple of projects to create infrastructure, and not only that: I’ve also used it to integrate and deploy code and applications into our architecture.

I have come to like it a lot. There are alternatives, but at the moment I think it is the best tool for handling your infrastructure and code deployments. Here are my reasons why.

The good

  • It is easy to use and get started with. I’ve used it with both AWS and Azure and got my application resources and code deployed in less than an hour, just to test that things work.
  • It is so easy that I’ve seen junior developers and non-DevOps engineers pick Pulumi up and deploy their applications into AWS or Azure without much guidance or help.
  • Using Pulumi has really changed the teams I have worked with. Developers have full control over all aspects of the application they are working on: development, testing, integration, deployment, and infrastructure.
  • Support for unit testing your infrastructure.
  • State management and secret support are good. You can use the Pulumi service out of the box, create your own self-hosted instance of the Pulumi service, or use AWS or Azure features and resources to manage the state.
  • You can create your infrastructure code in any language you like. This is good because a developer working on a React frontend in TypeScript can use TypeScript to create the infrastructure and deployment code.
  • When Pulumi does not support something, you can pick it up with a library from your cloud provider, in the same language and within the same infrastructure project as the rest of your infrastructure code. No extra bash or PowerShell scripts, etc.
  • One of my favorite parts is how easily you can use resource output values, with the full support of the programming language you are using.
  • You can use the full support and features of your chosen programming language. You can use design patterns, OOP, functional approach, you name it.
  • Support for passing infrastructure outputs, secrets, ids, etc. from one project to another through stack references (see the sketch after this list).
  • Good documentation to get you started and some good examples.
  • Damn cost-effective and fast to produce results.
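
As referenced above, here is a minimal C# sketch of consuming another stack’s output through a stack reference. The stack name "myorg/networking/dev" and the output name "vnetId" are assumptions for illustration.

using Pulumi;

class MyAppStack : Stack
{
    public MyAppStack()
    {
        // Reference another Pulumi stack by its fully qualified name (organization/project/stack).
        var networking = new StackReference("myorg/networking/dev");

        // Read an output exported by that stack and use it like any other Output<T>.
        var vnetId = networking.RequireOutput("vnetId").Apply(v => (string)v);
    }
}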

The bad

So what are the bad parts of Pulumi?

  • You have to learn the approach Pulumi has for infrastructure, how it handles secrets, state, how resource outputs are handled, etc. This is true for any infrastructure tool but it will be something new to learn.
  • Since it is not a tool native to your favorite cloud provider, not all features are supported. However, I’ve found that the things that matter most are there, and I haven’t run into a situation where something was missing from Pulumi but available in CloudFormation or ARM/Bicep. Usually, if it was missing from the cloud provider’s own tools, it was missing from Pulumi, and not the other way around.
  • Documentation is good for getting you started, but unfortunately some of the more complex issues have to be solved by searching the internet or figuring things out yourself. It doesn’t have as many examples as you might find for ARM, Terraform, CloudFormation, etc. For me this hasn’t been a big issue; I have gotten around it by creating my own solutions.
  • Organizations might not want to pick this up because it is not the “mainstream” option or native to a cloud provider.

Some thoughts on Pulumi

I would generally recommend testing it out and using it. The biggest downside I see is the possible objection that it is not mainstream or native to a cloud provider, like ARM, Bicep, CloudFormation, or AWS CDK.

These are valid points, but for me the upsides are greater. It is so easy to pick up, build your infrastructure, and deploy it. All the developers I have worked with who were not DevOps engineers picked it up quickly or instantly and got results.

The ability to use time-tested patterns and approaches from software development, with the full support of a programming language, is golden, and I think in the long run it increases the quality of the solution and brings down costs.

https://www.pulumi.com/docs/intro/vs/

Examples

Now that I have written about Pulumi and how it works I’d like to give some tips, tricks, and examples that might help you if you pick Pulumi up and start using it.

To get started quickly head over here:

https://www.pulumi.com/docs/get-started/

https://www.pulumi.com/docs/get-started/install/

https://www.pulumi.com/docs/intro/concepts/

https://www.pulumi.com/docs/guides/continuous-delivery/

https://www.pulumi.com/registry/packages/azure-native/

https://www.pulumi.com/registry/packages/aws/

https://www.pulumi.com/registry/packages/aws-native/ (In Preview)

Manual State Management

In this example, I’ll show you a script that you can use to create an Azure storage account + Key Vault and configure Pulumi to use them.

https://www.pulumi.com/docs/intro/concepts/state/

https://www.pulumi.com/docs/guides/self-hosted/

The example below will create the needed resources and configure your local Pulumi installation to use them.

# PowerShell variables used in the script 
$location="West Europe"
$rgName="Infrastructure"
$saName="saiacstate"
$kvName="kv-iacstate"

az group create -n $rgName -l $location

# Configure the Azure Blob Storage that will contain the state 
az storage account create -g $rgName -n $saName -l $location --sku Standard_LRS --allow-blob-public-access false
# Set environment variables needed to write on the storage account
$env:AZURE_STORAGE_KEY=$(az storage account keys list -n $saName -g $rgName -o tsv --query '[0].value')
$env:AZURE_STORAGE_ACCOUNT=$saName
az storage container create -n iacstate

# Configure the Key Vault that will be used to encrypt the sensitive data
$vaultId=az keyvault create -g $rgName -n $kvName --query "id" -o tsv
# Use az cli to authenticate to key vault instead of using environment variables 
$env:AZURE_KEYVAULT_AUTH_VIA_CLI="true"
$myUserId=az ad signed-in-user show --query "id" -o tsv
az keyvault key create -n encryptionState --vault-name $kvName
az role assignment create --scope $vaultId --role "Key Vault Crypto Officer" --assignee $myUserId 


If you already have state management configured in your cloud provider and need to connect to it, run the following in bash or PowerShell:

#!/bin/sh
# Bash version
rgName="Infrastructure"
saName="saiacstate"
azKey=$(az storage account keys list -n $saName -g $rgName -o tsv --query '[0].value')

export AZURE_STORAGE_KEY=$azKey
export AZURE_STORAGE_ACCOUNT=$saName
export AZURE_KEYVAULT_AUTH_VIA_CLI="true"

# PowerShell version
$rgName="Infrastructure"
$saName="saiacstate"

# Set environment variables needed to write on the storage account
$env:AZURE_STORAGE_KEY=$(az storage account keys list -n $saName -g $rgName -o tsv --query '[0].value')
$env:AZURE_STORAGE_ACCOUNT=$saName
$env:AZURE_KEYVAULT_AUTH_VIA_CLI="true"

Notice that the following Environmental variables need to be set for Pulumi to work with manual state management:

  • AZURE_STORAGE_KEY
  • AZURE_STORAGE_ACCOUNT
  • AZURE_KEYVAULT_AUTH_VIA_CLI

Setting Pulumi Stack Secrets Provider

Run in your shell the following command in your Pulumi project if the project is an existing one:

pulumi stack change-secrets-provider "azurekeyvault://{myKvName}.vault.azure.net/keys/encryptionState"

If your project is a new one use the following:

pulumi new azure-csharp -n AzureStorageBackend -s dev -y --secrets-provider="azurekeyvault://{myKvName}.vault.azure.net/keys/encryptionState"

Showing a secret output

Run in the shell:

pulumi stack output MyPassword --show-secrets

Keeping the Infrastructure code “clean”

I’ve noticed that once you start to write your infrastructure as actual code in a programming language, you run into the same problems as you would in normal software projects.

Mainly, writing clean, easy-to-understand, and maintainable code that has good cohesion and low coupling.

If you look at the examples you find for Pulumi, AWS CDK, or Terraform CDK, you will notice that they are quite simple and do not take complex situations into consideration.

In my experience, the code can easily grow wild and complex, but there is a positive side: you can use time-tested approaches with your code. Whether you are doing OOP or functional programming, you can use all your best practices.

With OOP, I find that some basic creational patterns can be very good at keeping the code clear. I highly recommend going over some of these patterns, and design patterns in general, to see how they can help.

I’ll give an example of how using the Builder pattern, in my opinion, kept the code clear when creating an Azure Function App with deployable code.

public class FunctionAppBuilder
{
    private string? _projectName;
    private string? _environment;
    private StorageAccount? _storageAccount;
    private AppServicePlan? _appServicePlan;
    private BlobContainer? _blobContainer;
    private Blob? _deploymentBlobArtifact;
    private Component? _appInsights;
    private AppConfigPulumi? _appConfigPulumi;
    private WebApp? _webApp;
    private string? _resourceGroupName;
    private InputList<NameValuePairArgs> _appConfigs;
    public FunctionAppBuilder WithResourceGroup(string? resourceGroupName)
    {
        _resourceGroupName = resourceGroupName;
        return this;
    }

    public FunctionAppBuilder WithEnvironment(string? environment)
    {
        _environment = environment;
        return this;
    }
    
    public FunctionAppBuilder WithFunctionAppConfigs(AppConfigPulumi? appConfigPulumi)
    {
        _appConfigPulumi = appConfigPulumi;
        return this;
    }

    public FunctionAppBuilder WithProjectName(string? projectName)
    {
        _projectName = projectName;
        return this;
    }

    public WebApp? Build()
    {
        _storageAccount = CreateStorageAccount();
        _appServicePlan = CreateAppServicePlan();
        _blobContainer = CreateBlobContainer();
        _deploymentBlobArtifact = CreateDeploymentBlobArtifact();
        _appInsights = CreateAppInsights();
        _webApp = CreateWebApp();
        
        return _webApp;
    }

    private StorageAccount? CreateStorageAccount()
    {
        return new StorageAccount($"{_environment}projectsa", new StorageAccountArgs
        {
            ResourceGroupName = _resourceGroupName,
            Sku = new SkuArgs
            {
                Name = SkuName.Standard_LRS,
            },
            Kind = Pulumi.AzureNative.Storage.Kind.StorageV2,
            Tags = DeploymentHelper.GetTags()
        });
    }

    private AppServicePlan? CreateAppServicePlan()
    {
        return new AppServicePlan($"{_environment}-func-asp", new AppServicePlanArgs
        {
            ResourceGroupName = _resourceGroupName,

            Kind = "functionapp",

            // Consumption plan SKU
            Sku = new SkuDescriptionArgs
            {
                Tier = "Dynamic",
                Name = "Y1",
                Size = "Y1",
                Family = "Y",
                Capacity = 0
            },

            Tags = DeploymentHelper.GetTags()
        });
    }

    private BlobContainer? CreateBlobContainer()
    {
        return new BlobContainer($"{_environment}-artifact-container", new BlobContainerArgs
        {
            AccountName = _storageAccount?.Name,
            PublicAccess = PublicAccess.None,
            ResourceGroupName = _resourceGroupName,
        });
    }

    private Blob? CreateDeploymentBlobArtifact()
    {
        var functionAppPublishFolder = "../myapp/bin/Release/net6.0/publish/";

        return new Blob($"{_environment}-myapp-blob", new BlobArgs
        {
            AccountName = _storageAccount?.Name,
            ContainerName = _blobContainer?.Name,
            ResourceGroupName = _resourceGroupName,
            Source = new FileArchive(functionAppPublishFolder),
            Type = BlobType.Block,
        });
    }

    private Component? CreateAppInsights()
    {
        return new Component($"{_environment}-myapp-appInsights", new ComponentArgs
        {
            ApplicationType = ApplicationType.Web,
            Kind = "web",
            ResourceGroupName = _resourceGroupName,
            Tags = DeploymentHelper.GetTags()
        });
    }

    private WebApp? CreateWebApp()
    {
        _appConfigs = CreateAppSystemConfigs();
        _appConfigs.AddRange(CreateAppPrimaryConfigs());

        return new WebApp($"{_environment}-func-myapp", new WebAppArgs
        {
            Kind = "FunctionApp",
            ResourceGroupName = _resourceGroupName,
            ServerFarmId = _appServicePlan?.Id,
            Tags = DeploymentHelper.GetTags(),
            SiteConfig = new SiteConfigArgs
            {
                AppSettings = _appConfigs
            },
            Identity = new ManagedServiceIdentityArgs()
            {
                Type = ManagedServiceIdentityType.SystemAssigned
            }
        });
    }

    private InputList<NameValuePairArgs> CreateAppSystemConfigs()
    {
        return new[]
        {
            new NameValuePairArgs
            {
                Name = "AzureWebJobsStorage",
                Value = DeploymentHelper.GetConnectionString(_resourceGroupName, _storageAccount?.Name),
            },
            new NameValuePairArgs
            {
                Name = "FUNCTIONS_EXTENSION_VERSION",
                Value = "~4",
            },
            new NameValuePairArgs
            {
                Name = "FUNCTIONS_WORKER_RUNTIME",
                Value = "dotnet",
            },
            new NameValuePairArgs
            {
                Name = "WEBSITE_RUN_FROM_PACKAGE",
                Value = DeploymentHelper.SignedBlobReadUrl(_deploymentBlobArtifact, _blobContainer, _storageAccount, _resourceGroupName),
            },
            new NameValuePairArgs
            {
                Name = "APPLICATIONINSIGHTS_CONNECTION_STRING",
                Value = Output.Format($"InstrumentationKey={_appInsights?.InstrumentationKey}"),
            },
        };
    }

    private InputList<NameValuePairArgs> CreateAppPrimaryConfigs()
    {
        return new[]
        {
            new NameValuePairArgs
            {
                Name = "AppConfig:myConfig",
                Value = "MyConigValue",
            },
            
        };
    }
}

Using the function app builder would then look something like this:

var app = new FunctionAppBuilder()
            .WithResourceGroup(resourceGroupName)
            .WithEnvironment(environment)
            .WithProjectName(projectName)
            .WithFunctionAppConfigs(DeploymentHelper.GetFunctionAppConfigurations(config))
            .Build();

You can use this approach for different resources, then return the needed output params from a created resource to other builders. It can keep your main stack very clean and readable.
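
As a hypothetical follow-up sketch, the output of one builder can be handed straight to the next one, so the main stack only wires pieces together. StorageAccountBuilder, QueueBuilder, and WithStorageAccount are illustrative names, not part of the example above.

// Create a resource with one builder...
var storageAccount = new StorageAccountBuilder()
    .WithResourceGroup(resourceGroupName)
    .WithEnvironment(environment)
    .Build();

// ...and pass the created resource (or just the outputs you need) to the next builder.
var queue = new QueueBuilder()
    .WithResourceGroup(resourceGroupName)
    .WithStorageAccount(storageAccount)
    .Build();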

Waiting for resources to finish to make additional operations

Sometimes you may need to wait for a resource to be provisioned or changed before you can do something with it.

For example, you might need to add a test to Azure Load Testing, and this is not supported by ARM, Bicep, Pulumi, etc., but it is supported through the C# API. Before you can do this, you need to wait for the Azure Load Testing resource to be created.

Here is an example of how to wait for a resource group to be created; you would use the Apply function in Pulumi:

var resourceGroup = new ResourceGroupBuilder()
    .WithResourceGroupName(resourceGroupName)
    .WithEnvironment(environment)
    .Build();

resourceGroup?.Name.Apply(resourceGroupNameOutput =>
{
    // Do something with the group name
    return resourceGroupNameOutput;
});

Metadata of the deployment that is currently running

In some situations you might need details about your Pulumi project; in such cases, use the following approach:

var environment = Pulumi.Deployment.Instance.StackName;
var projectName = Pulumi.Deployment.Instance.ProjectName;

Building resources IDs

Here is an example of how to get an existing resource group, created outside your Pulumi configuration, with a specific name and under the current Azure subscription.

public ResourceGroup Build()
    {
        var clientConfigResult = Pulumi.AzureNative.Authorization.GetClientConfig.InvokeAsync().Result;
        return new Pulumi.AzureNative.Resources.ResourceGroup(_resourceGroupName, new()
        {
            ResourceGroupName = _resourceGroupName,
        }, new CustomResourceOptions() {ImportId = $"/subscriptions/{clientConfigResult.SubscriptionId}/resourceGroups/{_resourceGroupName}", RetainOnDelete = true});
    }

The magic is in the Pulumi.AzureNative.Authorization.GetClientConfig.

Handling existing resources

Situation 1: Getting an existing resource and making sure it is not deleted when you delete your project’s resources: the “RetainOnDelete” parameter.

public ResourceGroup Build()
    {
        var clientConfigResult = Pulumi.AzureNative.Authorization.GetClientConfig.InvokeAsync().Result;
        return new Pulumi.AzureNative.Resources.ResourceGroup(_resourceGroupName, new()
        {
            ResourceGroupName = _resourceGroupName,
        }, new CustomResourceOptions() {ImportId = $"/subscriptions/{clientConfigResult.SubscriptionId}/resourceGroups/{_resourceGroupName}", RetainOnDelete = true});
    }

Situation 2: Disallowing the deletion of an existing resource group altogether, including by your own project: the “Protect” parameter.

public ResourceGroup Build()
    {
        var clientConfigResult = Pulumi.AzureNative.Authorization.GetClientConfig.InvokeAsync().Result;
        return new Pulumi.AzureNative.Resources.ResourceGroup(_resourceGroupName, new()
        {
            ResourceGroupName = _resourceGroupName,
        }, new CustomResourceOptions() {ImportId = $"/subscriptions/{clientConfigResult.SubscriptionId}/resourceGroups/{_resourceGroupName}", Protect = true});
    }

Generally, when creating a resource in Pulumi, you can configure how that resource is managed by supplying CustomResourceOptions.

Assigning roles to a resource

First, check out the existing Azure roles to see if you can get by with something built in: https://learn.microsoft.com/en-us/azure/role-based-access-control/built-in-roles

Here is an example of how to add the “Load Test Contributor” role to a web application’s identity.

Output.Tuple(_webApp?.Identity, _loadTest?.Id).Apply(items =>
{
    return new Pulumi.AzureNative.Authorization.RoleAssignment($"roleAssignment-mywebapp-LoadTesting-{_environment}", new()
    {
        PrincipalId = items.Item1?.PrincipalId,
        PrincipalType = "ServicePrincipal",
        RoleDefinitionId = "/providers/Microsoft.Authorization/roleDefinitions/749a398d-560b-491b-bb21-08924219302e",
        RoleAssignmentName = $"{_environment}-mywebapp-loadtesting",
        Scope = items.Item2,
    });
});

Testing your infrastructure code

First, head over to these links on Pulumi’s site for more details:

https://www.pulumi.com/blog/infrastructure-testing-concepts/

https://www.pulumi.com/docs/guides/testing/unit/

https://www.pulumi.com/blog/unit-testing-cloud-deployments-with-dotnet/

https://www.pulumi.com/docs/guides/testing/

Here are some example testing utilities for Pulumi, covering mocking and running your stack in tests:

class Mocks : IMocks
{
    public Task<(string? id, object state)> NewResourceAsync(MockResourceArgs args)
    {
        var outputs = ImmutableDictionary.CreateBuilder<string, object>();

        // Forward all input parameters as resource outputs, so that we could test them.
        outputs.AddRange(args.Inputs);

        // Default the resource ID to `{name}_id`.
        // We could also format it as `/subscription/abc/resourceGroups/xyz/...` if that was important for tests.
        args.Id ??= $"{args.Name}_id";
        return Task.FromResult<(string? id, object state)>((args.Id, (object)outputs));
    }

    public Task<object> CallAsync(MockCallArgs args)
    {
        var outputs = ImmutableDictionary.CreateBuilder<string, object>();

        return Task.FromResult((object)outputs);
    }
}

/// <summary>
/// Helper methods to streamlines unit testing experience.
/// </summary>
public static class TestUtility
{
    /// <summary>
    /// Run the tests for a given stack type.
    /// </summary>
    public static Task<ImmutableArray<Resource>> RunAsync<T>() where T : Stack, new()
    {
        var configJson = @"
        { 
            ""project:environment"": ""prod"",
            ""project:dbName"": ""databasename"",
            ""project:dbAdmin"": ""admin"",
            ""project:name"": ""my project name""
        }";

        System.Environment.SetEnvironmentVariable("PULUMI_CONFIG", configJson);

        return Deployment.TestAsync<T>(new Mocks(), new TestOptions
        {
            IsPreview = false
        });
    }

    /// <summary>
    /// Extract the value from an output.
    /// </summary>
    public static Task<T> GetValueAsync<T>(this Output<T> output)
    {
        var tcs = new TaskCompletionSource<T>();

        output.Apply(v =>
        {
            tcs.SetResult(v);
            return v;
        });

        return tcs.Task;
    }
}

[TestFixture]
public class MainStackTests
{
    [Test]
    public async Task ShouldHaveResourcesAsync()
    {
        var resources = await TestUtility.RunAsync<MainStack>();

        resources.Should().NotBeNull();
    }

    [Test]
    public async Task ShouldHaveSingleVpcAsync()
    {
        var resources = await TestUtility.RunAsync<MainStack>();
        var vpcs = resources.OfType<Vpc>().ToList();

        vpcs.Count.Should().Be(1, "should be a single VPC");
    }
}

In the example above, MainStack is your Pulumi stack with the resources you would provision into the cloud.

Final Words

I think these are a good start on good-to-know things with Pulumi. There are definitely more situations worth knowing about, but these are the ones I could remember and found useful.

Personally, and this is my biased opinion, writing infrastructure in a programming language is the way of the future, at least for developers who are not DevOps engineers.

Learn Store – Interesting Learning Sources: Part 1

I’ve decided to start a new post series with learning and knowledge sources I find interesting: things that might help me, or someone else, understand things better. From time to time, I’ll post a new one with new sources I find.

I might make one of these once a month or once every three months; it depends on how much time I have. Too many things to learn and do :).

Backend

https://codeopinion.com/

https://www.youtube.com/c/Elfocrash

Frontend

https://www.youtube.com/c/JackHerrington

Architecture

https://simonbrown.je/

https://leanpub.com/b/software-architecture

https://awesome-architecture.com/

https://jimmybogard.com/

DevOps and Related

https://www.youtube.com/c/ContinuousDelivery

Security

https://leanpub.com/web-hacking-101

https://owasp.org/www-project-top-ten/

Patterns and Guidelines

https://refactoring.guru/design-patterns

https://refactoring.guru/refactoring

https://thevaluable.dev/single-responsibility-principle-revisited/

https://thevaluable.dev/cohesion-coupling-guide-examples/

https://thevaluable.dev/open-closed-principle-revisited/

https://thevaluable.dev/guide-inheritance-oop/

Youtube channels with lectures

https://www.youtube.com/c/GotoConferences

https://www.youtube.com/channel/UC3PGn-hQdbtRiqxZK9XBGqQ

https://www.youtube.com/nctv

https://www.youtube.com/c/Gdconf

https://www.youtube.com/c/StrangeLoopConf

Tools

https://newrelic.com/

https://serilog.net/

https://opentelemetry.io/

https://grafana.com/

https://www.sonarsource.com/products/sonarlint/

https://docs.sonarqube.org/latest/setup/install-server/

Misc

https://www.synopsys.com/software-integrity/security-testing/software-composition-analysis.html

https://www.toptal.com/nodejs/software-reengineering

https://github.com/JuanCrg90/Clean-Code-Notes

Fun stuff 🙂

https://www.youtube.com/c/TheCodingTrain

Software Architecture – Part 4.3: Concepts and Guidelines – Cohesion, Coupling, and Modularity

In this post, I will cover cohesion, coupling, and modularity. I wanted to have a better understanding of these subjects and not just rely on patterns and guidelines without having a deeper understanding of why they are the way they are.

Notes for myself and someone else :).

That’s one of the reason why software development is so hard: because it is coupled to the ever-changing real world, and we need to use our incomplete perception to represent it as accurately as we can.

It’s not that easy to be functionally cohesive when you’re coding the business domain of a company. The Boundaries between different functionalities are not that clear or stable in the real world, with different concepts “leaking” into each other.

https://thevaluable.dev/cohesion-coupling-guide-examples/

Introduction: Modularity and Cohesion

Large systems are hard to understand and keep mental track of; they are too complicated for the human brain to process correctly and efficiently.

Modularization is a key component in fighting the software entropy caused by complexity. All software suffers from complexity, but how that complexity is expressed and managed matters. With modules, the complexity is served in smaller parts and is thus easier to handle and comprehend.

What is a module? Anything that creates clear boundaries between knowledge and concepts. In software engineering, this can be a function, a class, a namespace, a package, a whole microservice, or even a monolith. There should be a boundary between the module and the “outside” world, and you think about and design things from the viewpoint of the module.

How do cohesion and coupling fit into this? Depending on which languages and platforms you are developing with, each approach will have its own challenges in creating a code base with a good balance between cohesion and coupling, preferably high cohesion with low coupling. Here is a quick list of different programming paradigms:

  • Imperative: Programming with an explicit sequence of commands.
  • Declarative: Programming by specifying the result a user wants, instead of how to get it.
  • Structured: Programming with clean control structures.
  • Procedural: Imperative programming with procedure calls.
  • Functional: Programming with function calls that avoid any global state.
  • Function-Level: Programming with no variables at all.
  • Object-Oriented: Programming by defining objects that send messages to each other.
  • Event-Driven: Programming with emitters and listeners of asynchronous actions.
  • Flow-Driven: Programming processes communicating with each other over predefined channels.
  • Logic: Programming by specifying a set of facts and rules. 
  • Constraint: Programming by specifying a set of constraints. 
  • Aspect-Oriented: Programming cross-cutting concerns applied transparently.
  • Reflective: Programming by manipulating the program elements.
  • Array: Programming with powerful array operators.

https://www.indicative.com/resource/programming-paradigm/

I will focus primarily on functional and object-oriented programming since they are the best-known and most used paradigms.

Cohesion and coupling are essential to your code base, not only at a small level like a function but all the way up to your architectural approach. There are two things that affect your coupling and cohesion levels:

  • How you write your code at a low level: a function, a class, etc.
  • How your code talks to other code at a higher level.

Some programming paradigms are older than others, and the older a paradigm is, the more likely it is that code written in it suffered from poor cohesion and high coupling. Notice that newer approaches like object-oriented and functional programming are not exempt from these problems, even though both try to solve them in their own ways. It is also good to note that modern languages can be multi-paradigm, which can add to the risk of creating poor-quality software.

Coupling is about connections across the boundaries of different modules, while cohesion is about the connections between the elements inside the boundary of a module.

https://thevaluable.dev/cohesion-coupling-guide-examples/

Coupling

Coupling is the meaning we give when we want to talk about a connection between two or more modules. When the connection is strong, we speak about strongly coupled modules; when the connection is weak, we speak about loosely coupled modules. It’s a measure of how much they “know” about each other.

The structured design movement makes it clear that neither coupling nor cohesion are absolute truth: in design, everything is a trade-off. We should learn about, experience, and remember the benefits of the solutions we propose, but also the drawbacks. That’s why exploring and experimenting is so important. That’s also partly why software development is so damn difficult.

According to the structured design movement, the strength of coupling depends on:

  1. The types of connections between modules.
  2. The complexity of the interfaces of the modules.
  3. The type of information going through the connection.

https://thevaluable.dev/cohesion-coupling-guide-examples/

What increases coupling?

Before we go forward, we should define what an interface means in relation to coupling and modules. An interface between modules is anything that you call or operate on to get or set something: a function, a variable, a parameter, an interface construct, a namespace, an API call, etc.

Notice that when one module connects to another, the connecting module makes an assumption about that connection; it is a form of contract. This can cause two problems if you are not aware of it: 1. the connecting interface is poorly designed or implemented and does not do what it says it does; 2. once a connection has been established, changing it becomes hard, if not impossible.

Here are a few examples of how coupling is created:

  • Different modules become more strongly coupled if they have many different interfaces, because, as a result, they potentially have many different types of connections.
  • The number of input and output parameters affects how strongly modules become coupled. In other words, data affects your coupling intensity.
  • In an object-oriented approach or similar, passing an instance of a class exposes not only its data but also all of its public functionality. You might be getting more than you wanted (see the sketch after this list).
  • Common/global coupling through common or global modules.
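
To make the class-instance bullet above concrete, here is a hedged C# sketch with made-up types: handing over the whole Customer couples the caller to data and behaviour the method never needed, while passing only the required value keeps the coupling at the data level.

// Hypothetical type, for illustration only.
public class Customer
{
    public string Name { get; set; } = "";
    public string Email { get; set; } = "";
    public decimal CreditLimit { get; set; }
    public void Delete() { /* dangerous behaviour exposed to every consumer */ }
}

public static class InvoiceService
{
    // Tighter coupling: the whole Customer crosses the boundary, even though only Name is used.
    public static string BuildGreeting(Customer customer) => $"Dear {customer.Name}";

    // Looser (data) coupling: only the value that is actually needed crosses the boundary.
    public static string BuildGreeting(string customerName) => $"Dear {customerName}";
}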

Different coupling levels

  • Structural coupling
  • Dynamic coupling
  • Semantic coupling
  • Logical coupling

https://www.javatpoint.com/software-engineering-coupling-and-cohesion

Structural coupling

From stronger to weaker coupling:

Content coupling – Modules directly accessing the content of each others, without using an interface.

Common coupling – Modules mutating common variables with bigger scope (like global variables).

Control coupling – Modules controlling the logic (control flow) of other ones.

External coupling – Modules exchanging information using an external mean, like a file.

Stamp coupling – Modules exchanging elements, but the receiving end doesn’t act on all elements. For example, a module receiving an array via its interface but not using all its elements.

Data coupling – Modules exchanging elements, and the receiving end use all of them.

https://thevaluable.dev/cohesion-coupling-guide-examples/

For object-oriented approaches:

CBO (Coupling Between Object) – How much objects acts upon another.

CBE (Coupling Between Element) – More precise variation of the CBO. It considers two (or more) elements coupled if there is any dependency between them, like access, or modification of implementation details to one another.

CTM (Coupling Through Message passing) – Measures the number of messages sent by a considered class to the other classes in the system.

IC (Inheritance Coupling) – Calculate the coupling due to inheritance.

https://thevaluable.dev/cohesion-coupling-guide-examples/

Dynamic coupling

Dynamic coupling is the coupling happening at runtime; by using interface constructs, for example (parametric polymorphism).

https://thevaluable.dev/cohesion-coupling-guide-examples/

Logical Coupling

Logical coupling happens when parts of different modules change at the same time, without visible connections between them in the codebase itself. It can happen, for example, when the same behavior is duplicated in different modules; said differently, the same knowledge has been codified in two different places. When a developer changes one representation of this behavior in one module, she needs to change it everywhere it’s repeated.

https://thevaluable.dev/cohesion-coupling-guide-examples/

Semantic Coupling

Semantic coupling happens when one module use the knowledge of another one. For example, when one module assume that another module does something specific.

https://thevaluable.dev/cohesion-coupling-guide-examples/

Additional explanations for Types of Coupling (best to worst): 

  • Data Coupling: If the dependency between the modules is based on the fact that they communicate by passing only data, then the modules are said to be data coupled. In data coupling, the components are independent of each other and communicate through data. Module communications don’t contain tramp data. Example-customer billing system.
  • Stamp Coupling In stamp coupling, the complete data structure is passed from one module to another module. Therefore, it involves tramp data. It may be necessary due to efficiency factors- this choice was made by the insightful designer, not a lazy programmer.
  • Control Coupling: If the modules communicate by passing control information, then they are said to be control coupled. It can be bad if parameters indicate completely different behavior and good if parameters allow factoring and reuse of functionality. Example- sort function that takes comparison function as an argument.
  • External Coupling: In external coupling, the modules depend on other modules, external to the software being developed or to a particular type of hardware. Ex- protocol, external file, device format, etc.
  • Common Coupling: The modules have shared data such as global data structures. The changes in global data mean tracing back to all modules which access that data to evaluate the effect of the change. So it has got disadvantages like difficulty in reusing modules, reduced ability to control data accesses, and reduced maintainability.
  • Content Coupling: In a content coupling, one module can modify the data of another module, or control flow is passed from one module to the other module. This is the worst form of coupling and should be avoided.

https://www.geeksforgeeks.org/software-engineering-coupling-and-cohesion/
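
The sort example from the control coupling bullet above, sketched in C#: instead of a flag that selects hidden behaviour, the caller passes the comparison itself, which is the reusable, “good” form of control coupling.

using System;
using System.Collections.Generic;

public static class ControlCouplingExample
{
    // The caller supplies behaviour (a comparison) that the sort reuses,
    // rather than a flag parameter that switches between hidden code paths.
    public static void SortBy<T>(List<T> items, Comparison<T> comparison) => items.Sort(comparison);

    public static void Demo()
    {
        var names = new List<string> { "Charlie", "alice", "Bob" };
        SortBy(names, (a, b) => string.Compare(a, b, StringComparison.OrdinalIgnoreCase));
    }
}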

Other types of coupling

  • Vendor-specific coupling
  • Coupling to external and 3rd-party solutions and libraries

Solutions to coupling

  • Limit the number of interfaces you expose from your modules, and remember to think of the big picture as well. While a class may have one or two interfaces for communicating with the outside world, a namespace may have multiple classes with multiple public interfaces each. So thinking about your design and how you expose your modules is essential. The complexity increases with each possible new interface, and the cost of upkeep rises.
  • Limit the number of input and output parameters of a connecting module.
  • Be aware of what you pass along to other modules. Design your code so that if you are passing an instance of a class, it exposes minimal data and functionality. In other words, be aware in the module that takes this instance that you may be coupling to that class more tightly than needed. Use refactoring and design principles to help (a hedged sketch follows this list).
  • Be aware of the inner workings of the coupled module: how it uses the data and the interfaces of the other module.
  • Be aware of global modules; using them incorrectly can cause system-wide problems. Use global modules with care and only when needed. Logging is a good example of global coupling that is useful.
  • Understand the domain you are working in, and understand different design and architectural patterns and approaches. Understand how and when to use them.
  • Think before you generalize code to be used in many places (closely related to DRY). You have imperfect information about an uncertain future; sometimes code duplication can be the answer.
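
As referenced above, here is a hedged sketch of exposing minimal data and functionality: the consumer depends on a narrow, read-only abstraction instead of on the full repository class. All names are made up for illustration.

// Narrow abstraction: consumers see only what they need.
public interface IReadOnlyOrders
{
    decimal TotalFor(int customerId);
}

public class OrderRepository : IReadOnlyOrders
{
    public decimal TotalFor(int customerId) => 0m;   // query logic omitted
    public void Save(object order) { }               // write path hidden from IReadOnlyOrders consumers
}

public class ReportGenerator
{
    private readonly IReadOnlyOrders _orders;

    public ReportGenerator(IReadOnlyOrders orders) => _orders = orders;

    // Coupled only to the narrow interface, not to the repository's write operations.
    public string MonthlyTotal(int customerId) => $"Total: {_orders.TotalFor(customerId):C}";
}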

Thoughts on cohesion and coupling

So what is the answer? Should we decouple everything into its own module? Or into very, very small modules?

As expected, the answer is not so simple: the smaller the modules you create, the greater the complexity and upkeep you introduce to your code base or system. There are many more things to be aware of and to use correctly, and your solution can become a maze of indirection.

With cohesion, we can decrease the level of coupling. For good cohesion, the elements of a module should aim for the same goal; they should try to solve the same domain problem. Considering the cohesion of a module first has benefits:

  • Easier to make changes when the related functionality is in the same place with the same goal.
  • Change happens in one location, a module.
  • Increasing cohesion decreases coupling: there are fewer connections to modules outside the current one.

Notice that you can’t create fully decoupled, 100% cohesive code. You will always couple something to something else. The aim is to achieve a good cohesion level with low coupling.

Cohesion tells us how strongly the elements of a module or class are related to each other. Cohesion is the degree to which all of the methods and data structures in a class or module are related to one another and belong together. A module or class with a high level of cohesion will have elements that all share a common purpose.

Different cohesion levels

From worst to best:

  • Coincidental Cohesion
  • Logical Cohesion
  • Temporal Cohesion
  • Procedural Cohesion
  • Communicational Cohesion
  • Sequential Cohesion
  • Functional Cohesion

Coincidental Cohesion

Coincidental cohesion appears when the elements of a module don’t have any meaningful relationship.

Changing modules with coincidental cohesion is difficult: their elements are independent of each other, so there are big chances they’re used (and therefore coupled) from other modules.

https://thevaluable.dev/cohesion-coupling-guide-examples/

A module is said to have coincidental cohesion if it performs a set of tasks that are associated with each other very loosely, if at all.

https://www.javatpoint.com/software-engineering-coupling-and-cohesion

Logical Cohesion

When the elements of a module have some weak relationships, we can qualify its cohesion as logical. For example:

  • Elements which have similar interfaces.
  • Elements which work with the same kind of input, and/or output.
  • Elements which are all using a database.

The category of these elements is often vague, or too big to be really meaningful. These categories can be technical ones (like “every element using a database”), but not only. We could also encapsulate a wide and meaningless domain problem in a module.

In short, the commonality between the different elements often feel superficial.

https://thevaluable.dev/cohesion-coupling-guide-examples/

A module is said to be logically cohesive if all the elements of the module perform a similar operation. For example Error handling, data input and data output, etc.

https://www.javatpoint.com/software-engineering-coupling-and-cohesion

Temporal Cohesion

This one is considered a tad better than logical cohesion because a temporally cohesive module has its elements bounded to an important dimension: time.

Indeed, the elements of such modules are executed in the same time frame. For example, good old modules containing some sort of temporal indication in their names, like “init”, “first”, “next”, “when”, “startup”, “termination”, or “cleanup”.

https://thevaluable.dev/cohesion-coupling-guide-examples/

When a module includes functions that are associated by the fact that all the methods must be executed in the same time, the module is said to exhibit temporal cohesion.

https://www.javatpoint.com/software-engineering-coupling-and-cohesion

Procedural Cohesion

A module is said to be procedural cohesion if the set of purpose of the module are all parts of a procedure in which particular sequence of steps has to be carried out for achieving a goal, e.g., the algorithm for decoding a message.

https://www.javatpoint.com/software-engineering-coupling-and-cohesion

Communicational Cohesion

Modules communicationally cohesive have different elements operating on the same data. As such, it’s the first category of cohesion we see in this article where the elements are likely to be about the same domain problem; they use the data of the problem at hand.

This kind of cohesion is quite common in e-commerce: for example, the module “stocks” can have multiple elements manipulating the same data related to products. The module then match a precise domain problem in E-commerce, namely how to represent the concept of “stock” in our code.

That said, you might also consider our “stocks” module only logically cohesive, if it’s too big to properly reason about it. Again, it depends on your codebase.

https://thevaluable.dev/cohesion-coupling-guide-examples/

A module is said to have communicational cohesion, if all tasks of the module refer to or update the same data structure, e.g., the set of functions defined on an array or a stack.

https://www.javatpoint.com/software-engineering-coupling-and-cohesion

Sequential Cohesion

Sequential cohesion is similar to communicational cohesion. The difference: elements of such modules take the output of the other elements and use them as their inputs. It often follows a linear transformation of data, like the good old pipelines.

Achieving sequential cohesion with languages supporting the functional programming paradigm is easier. You’ll have access to many constructs facilitating the creation of sequential transformation of data, like the famous “map” or “reduce” functions for example.

https://thevaluable.dev/cohesion-coupling-guide-examples/

A module is said to possess sequential cohesion if the element of a module form the components of the sequence, where the output from one component of the sequence is input to the next.

https://www.javatpoint.com/software-engineering-coupling-and-cohesion
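
To illustrate the map/reduce note in the quote above, here is a hedged C# sketch of a sequentially cohesive pipeline using LINQ, where each step consumes the output of the previous one. The data and parsing rules are made up.

using System.Collections.Generic;
using System.Linq;

public static class SequentialCohesionExample
{
    // parse ("map") -> filter -> total ("reduce"); each step feeds the next.
    public static decimal TotalOfValidAmounts(IEnumerable<string> rawValues) =>
        rawValues
            .Select(v => decimal.TryParse(v, out var amount) ? amount : (decimal?)null)
            .Where(a => a.HasValue)
            .Sum(a => a.Value);
}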

Functional Cohesion

functional cohesion, or trying to put everything related to a single functionality together. Element of such modules try to achieve the same goal, try to solve the same problem.

https://thevaluable.dev/cohesion-coupling-guide-examples/

Functional Cohesion is said to exist if the different elements of a module, cooperate to achieve a single function.

https://www.javatpoint.com/software-engineering-coupling-and-cohesion

Architectural Cohesion

  • Technical cohesion.
  • Domain cohesion.

Technical Cohesion

This is when you use horizontally layered architectures that allow a developer, intentionally or unintentionally, to create coupled code within a layer. All code within the layer can be considered cohesive to that layer, but it can still be highly coupled.

This can be both good and bad, depending on the context. Again you need to understand what to use and when to use it.

You have to be aware that technical cohesion is technical in nature and does not take into consideration any domain-related problem-solving or knowledge.

Examples of horizontal technical cohesion: MVC, ports-and-adapters approaches like onion, hexagonal, and clean architecture, and microservices.

Domain Cohesion

A domain cohesive approach is one where you take the time to understand the domains of the problem you are trying to solve and create cohesive modules to solve these problems.

Your code tries to reflect the real-world problem and its solution, and the technical side comes afterward. You can use domain discovery tools like EventStorming and strategic domain-driven design approaches.

An actual implementation approach for this is a feature-based codebase.

Sources

https://thevaluable.dev/cohesion-coupling-guide-examples/

https://blog.ttulka.com/how-cohesion-and-coupling-correlate/

https://codeopinion.com/solid-nope-just-coupling-and-cohesion/

https://www.geeksforgeeks.org/software-engineering-differences-between-coupling-and-cohesion/

https://www.geeksforgeeks.org/software-engineering-coupling-and-cohesion/

https://www.newthings.co/blog/coupling-and-cohesion-guiding-principles-for-clear-code/

https://www.javatpoint.com/software-engineering-coupling-and-cohesion

https://www.baeldung.com/cs/cohesion-vs-coupling

https://ammarmerakli.medium.com/pure-functions-high-cohesion-low-coupling-174b0a47ef24

https://www.ombulabs.com/blog/learning/software-development/coupling-and-cohesion.html

https://www.indicative.com/resource/programming-paradigm/

https://github.com/Phantas0s/alexandria-library/blob/master/computing/_PAPERS/1974_structured_design.pdf

https://www.goodreads.com/book/show/946145.Structured_Design

https://www.goodreads.com/en/book/show/1441004.Practical_Guide_to_Structured_Systems_Design

https://www.goodreads.com/book/show/4845.Code_Complete

https://www.cs.utexas.edu/users/EWD/transcriptions/EWD04xx/EWD447.html

https://github.com/Phantas0s/alexandria-library/blob/master/computing/_PAPERS/1971_parnas_on_the_criteria_to_be_used_in_decomposing_systemsinto_modules.pdf

http://homepages.cs.ncl.ac.uk/brian.randell/NATO/nato1968.PDF

https://fpalomba.github.io/pdf/Journals/J16.pdf

Software Architecture – Part 4.2: Concepts and Guidelines – SOLID Principles

Hello again.

In this post I aim to understand the SOLID principles; primarily, I want to understand the pros and cons of using each principle. I will not focus on how you need to implement them. I am more interested in the why, the what, and the when.

The reason for this approach is that I have found different explanations and usages during my time as a developer. I have also found the principles difficult to implement correctly, and I have seen them misused in such a way that the code was not easy to understand, hard to maintain, and sometimes “bad”. I have been guilty of all of this myself.

The SOLID principles are good and helpful, but they do not fit every solution, and you have to understand when to use them and why. I also want to see how the principles work with different architectural approaches.

I don’t claim to have all the answers, but I do want to write this for myself and for anyone else who wonders about SOLID along the same lines of questioning.

As always, I will add at the end of this post different sources that I used for this blog post. If you want to learn more about the SOLID principles, please follow some of the links.

And the last reason for this blog post is the following quote which I like:

We should question what we do. We should thrive to understand the principles we use. We should understand if they’re good to use in a precise context. We should look at their potential benefits and trade-offs. We should listen to different opinions, to be sure ours are valid. We should look at every piece of information we find on the Internet (including this article) with a critical eye.

But our designs shouldn’t be solid like a stone we can’t really shape for our own needs, but fluid and flexible depending on the ever-changing business domain we want to codify. In this context, the OCP can make everything SOLID indeed, creating abstractions which are hard to modify. We don’t want that.

https://thevaluable.dev/open-closed-principle-revisited/

Introduction

So, SOLID is a mnemonic for a set of design principles:

  • Single Responsibility Principle
  • Open/Closed Principle
  • Liskov Substitution Principle
  • Interface Segregation Principle
  • Dependency Inversion Principle

These design principles aim to help you create a solution that is:

  • Flexible towards changes, both in individual parts and in the solution as a whole
  • Extendable
  • Easy to understand and read
  • Maintainable

Single Responsibility Principle

Each software module or a class should have one and only one reason to change

https://procodeguide.com/design/solid-principles-with-csharp-net-core/

“A class should have one, and only one, reason to change.”

(Robert C. Martin)

“There should never be more than one reason for a class to change”

Clean Code, Robert C. Martin

a class should only have one responsibility. Furthermore, it should only have one reason to change.

How does this principle help us to build better software? Let’s see a few of its benefits:

  1. Testing – A class with one responsibility will have far fewer test cases.
  2. Lower coupling – Less functionality in a single class will have fewer dependencies.
  3. Organization – Smaller, well-organized classes are easier to search than monolithic ones.
https://www.baeldung.com/solid-principles

Overall, one of the key points is that SRP is not limited to a class but applies at any level in any programming paradigm. SRP can be applied to a class, namespace, package, microservice, function, etc.

I have found that the single responsibility principle is good to follow but also ambiguous on what it really means and how to implement it. Yes, there are many examples, but when you start to create your application, the simple examples are harder to translate into your specific domain problem.

To fully utilize the SRP, I think that you need to have a good understanding of the problem you are trying to solve, both on a micro and macro level. For this, you should speak the same language as your product owner(s) and also do some problem discovery or domain discovery.

This should allow you to decompose your problem into smaller and smaller bits for you to see what is the right amount of decomposition.

You can’t achieve 100% decoupling and 100% cohesion, but avoid unnecessary coupling and make your modules as cohesive as possible.

the goal is not to be 100% decoupled and 100% cohesive, it’s doing our best to avoid unnecessary coupling and making our modules as cohesive as possible.

https://thevaluable.dev/single-responsibility-principle-revisited/

Benefits:

  • Classes with single responsibility are easier to design & implement
  • Promotes separation of concern by restricting single functionality to a class
  • Improves readability as it is a single class per functionality which is much easier to explain and understand.
  • The maintainability of the code is better as a change in one functionality does not affect other functionality.
  • Improves testability, as the single functionality in a class reduces complexity when writing unit test cases for that class.
  • Isolating each functionality in a different class also helps to limit changes to that class only, which eventually helps to reduce the number of bugs caused by modifications for new requirements.
  • It is easier to debug errors as well, i.e. if there is an error in email functionality then you know which class to look for.
  • It also allows reuse of the same code in other places as well, i.e. if you build an email functionality class, the same can be used for user registration, OTP over email, forgot passwords, etc.
https://procodeguide.com/design/solid-principles-with-csharp-net-core/
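
To make this a bit more concrete, here is a minimal C# sketch of the email-functionality idea from the list above; all class names are made up for illustration and not taken from any specific project:

using System;

// Violates SRP: validation, persistence, and email sending all live in one class,
// so the class has several reasons to change.
public class UserRegistrationGodClass
{
    public void Register(string email)
    {
        // validate, save to the database, send a confirmation email...
    }
}

// Applying SRP: each class has a single responsibility and a single reason to change.
public class UserValidator
{
    public bool IsValid(string email) => email.Contains("@");
}

public class UserRepository
{
    public void Save(string email) { /* persistence details */ }
}

public class EmailService
{
    // Reusable for registration, OTP over email, forgotten passwords, etc.
    public void SendConfirmation(string email) { /* SMTP details */ }
}

public class UserRegistrationService
{
    private readonly UserValidator _validator = new();
    private readonly UserRepository _repository = new();
    private readonly EmailService _email = new();

    public void Register(string email)
    {
        if (!_validator.IsValid(email)) throw new ArgumentException("Invalid email");
        _repository.Save(email);
        _email.SendConfirmation(email);
    }
}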

Open/Closed Principle

A software class or module should be open for extension but closed for modification.

If we have written a class then it should be flexible enough that we do not need to change it (closed for modification) unless there are bugs, but a new feature can be added (open for extension) by adding new code without modifying its existing code.

https://procodeguide.com/design/solid-principles-with-csharp-net-core/

“A class is closed, since it may be compiled, stored in a library, baselined, and used by client classes. But it is also open, since any new class may use it as parent, adding new features. When a descendant class is defined, there is no need to change the original or to disturb its clients.”

(Bertrand Meyer)

classes should be open for extension but closed for modification. In doing so, we stop ourselves from modifying existing code and causing potential new bugs in an otherwise happy application.

Of course, the one exception to the rule is when fixing bugs in existing code.

https://www.baeldung.com/solid-principles

Key Points:

  • Adding new code that does not modify existing working code. The benefit of this is that you will not introduce new bugs into the working code.
  • You could use class inheritance to add new functionalities without modifying old functionality. Beware of problems that can be introduced by inheritance, like tight coupling, reduced readability and flexibility, unnecessary inherited functionality, broken encapsulation, etc.
  • You could use interface constructs to achieve even more loose coupling between classes implementing the interface.

Possible downsides:

  • Increased complexity if abstractions are created too often without considering other, more direct approaches.
  • Requirements can change often, thus creating complexity if inheritance or abstractions are used too carelessly.
  • New abstractions should only be created when we have concrete reasons to use them, not each time we want to change our code.
  • If the complexity increases too much, after a certain point the maintenance and readability of the code become difficult.
  • Think of preferring the Single Responsibility Principle to keep the complexity down.
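
As a hedged illustration of the interface-based extension mentioned in the key points above, here is a minimal C# sketch; the shape example is a commonly used one and not taken from the sources above:

using System;
using System.Collections.Generic;
using System.Linq;

public interface IShape
{
    double Area();
}

public class Rectangle : IShape
{
    public double Width { get; init; }
    public double Height { get; init; }
    public double Area() => Width * Height;
}

public class Circle : IShape
{
    public double Radius { get; init; }
    public double Area() => Math.PI * Radius * Radius;
}

// Closed for modification: this class never changes when new shapes are added.
// Open for extension: adding a Triangle only requires a new class implementing IShape.
public static class AreaCalculator
{
    public static double TotalArea(IEnumerable<IShape> shapes) => shapes.Sum(s => s.Area());
}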

Liskov Substitution Principle

Any function or code that uses pointers or references to a base class must be able to use any class derived from that base class without any modifications.

This principle suggests that you should write your derived classes in such a way that any child class (derived class) should be perfectly substitutable in place of its parent class (base class) without changing its behaviour.

This principle says that if you have a function in the base class that is also present in the derived class then the derived class should implement that function with the same behaviour i.e. it should give the same output for the given input. If the behaviour in the derived class is the same then the client code using the base class function can safely use the same function from derived classes without any modifications.

This principle focuses more on the behaviour of base and extended classes rather than the structure of these classes.

https://procodeguide.com/design/solid-principles-with-csharp-net-core/

“Subtypes must be substitutable for their base types.”

(Barbara Liskov)

if class A is a subtype of class B, we should be able to replace B with A without disrupting the behavior of our program.

https://www.baeldung.com/solid-principles

The LSP is applicable when there’s a supertype-subtype inheritance relationship by either extending a class or implementing an interface. We can think of the methods defined in the supertype as defining a contract. Every subtype is expected to stick to this contract. If a subclass does not adhere to the superclass’s contract, it’s violating the LSP.

https://dev.to/tamerlang/understanding-solid-principles-liskov-substitution-principle-46an

Key Points:

  • A base class and a derived class should implement the same functionality with the same behavior; the same input should give the same output.
  • This allows the user of a base class and a derived class to safely use the same functionality without worries.
  • A function overridden from the base class should have the same signature and return the same output value for the same input.
  • A function in the derived class should not implement stricter rules, as this will cause problems if it is called with an object of the base class.
  • This principle can be a little difficult to implement, i.e. it requires lots of planning and code design effort right at the start of the project. It needs manual checks, code reviews, and testing to follow this principle.

Rules to consider:

  • Parameter types in a method of a subclass should match or be more abstract than the parameter types in the superclass.
  • The return type in a method of a subclass should match or be a subtype of the return type in the superclass.
  • A method in a subclass shouldn’t throw types of exceptions that the base method isn’t expected to throw.
  • A subclass shouldn’t strengthen pre-conditions.
  • A subclass shouldn’t weaken post-conditions.
  • Invariants of a superclass must be preserved.
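
A minimal C# sketch of a violation and one possible fix, using the common bird example; all names here are made up for illustration:

using System;

// Violates LSP: code written against Bird breaks when it receives a Penguin,
// because the subclass changes the expected behavior of Fly().
public class Bird
{
    public virtual void Fly() => Console.WriteLine("Flying");
}

public class Penguin : Bird
{
    public override void Fly() => throw new NotSupportedException("Penguins can't fly");
}

// Respects LSP: only birds that can actually fly expose Fly(),
// so any IFlyingBird can be substituted safely.
public interface IFlyingBird
{
    void Fly();
}

public class Sparrow : IFlyingBird
{
    public void Fly() => Console.WriteLine("Sparrow flying");
}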

Interface Segregation Principle

A client should not be forced to implement an interface that it will never use or that is irrelevant to it.

This principle promotes the implementation of many small interfaces instead of one big interface, as it allows clients to select only the interfaces they require and implement those.

https://procodeguide.com/design/solid-principles-with-csharp-net-core/

larger interfaces should be split into smaller ones. By doing so, we can ensure that implementing classes only need to be concerned about the methods that are of interest to them.

https://www.baeldung.com/solid-principles

No client should be forced to depend on methods it does not use.

Robert C. Martin

Key points:

  • The goal is to break your code into smaller pieces/modules that have clearer specific usage. This prevents too large classes, keeping things focused, lean, and decoupled as much as possible.
  • This allows your code to have a more fine-grained choice of what to implement, only picking what is needed.
  • Allows the class to be closely related to the implemented interface.
  • Supports designing with the SRP principle in mind.
  • Smaller interfaces allow for smaller responsibilities.
  • Clearer distribution of responsibilities.
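
A minimal C# sketch of the idea, using the common multi-function printer example; the names are made up for illustration:

using System;

// One fat interface would force every device to implement methods it may not support.
public interface IMachine
{
    void Print(string document);
    void Scan(string document);
    void Fax(string document);
}

// Segregated interfaces: clients implement only what they actually need.
public interface IPrinter { void Print(string document); }
public interface IScanner { void Scan(string document); }

public class SimplePrinter : IPrinter
{
    public void Print(string document) => Console.WriteLine($"Printing {document}");
}

public class MultiFunctionMachine : IPrinter, IScanner
{
    public void Print(string document) => Console.WriteLine($"Printing {document}");
    public void Scan(string document) => Console.WriteLine($"Scanning {document}");
}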

Dependency Inversion Principle

High-level classes should not depend on low-level classes; instead, both should depend upon abstractions.

Abstractions should not depend upon details; in fact, details should depend upon abstractions.

This principle suggests that there should be loose coupling between high level and low-level classes and to achieve this loose coupling components should depend on abstraction. In simple terms, it says that classes should depend on interfaces/abstract classes and not on concrete types.

https://procodeguide.com/design/solid-principles-with-csharp-net-core/

The principle of dependency inversion refers to the decoupling of software modules. This way, instead of high-level modules depending on low-level modules, both will depend on abstractions.

https://www.baeldung.com/solid-principles

High-level modules should not depend upon low-level modules. Both should depend upon abstractions

https://www.c-sharpcorner.com/article/solid-with-net-core/

Some clarifications:

  1. Client: Your main class/code that runs the high-level module.
  2. High-Level Modules: Interface/Abstraction that your client uses.
  3. Low-Level Modules: Details of your interfaces/abstraction.

Key points:

  • Avoid classes depending on each other directly; this creates tightly coupled classes.
  • Low-level classes should implement contracts using an interface or abstract classes, and high-level classes should make use of these contracts instead of the concrete types.
  • Related to Open/Closed Principle and Liskov Substitution Principle.
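
A minimal C# sketch of the high-level/low-level split described above; the message-sender example and its names are made up for illustration:

using System;

// The abstraction both sides depend on.
public interface IMessageSender
{
    void Send(string message);
}

// Low-level module: an implementation detail behind the contract.
public class EmailSender : IMessageSender
{
    public void Send(string message) => Console.WriteLine($"Email: {message}");
}

// High-level module: depends only on the abstraction, never on EmailSender directly.
public class NotificationService
{
    private readonly IMessageSender _sender;

    public NotificationService(IMessageSender sender) => _sender = sender;

    public void Notify(string message) => _sender.Send(message);
}

public static class Program
{
    public static void Main()
    {
        // The client composes the object graph, often via a DI container.
        var notifications = new NotificationService(new EmailSender());
        notifications.Notify("Order shipped");
    }
}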

Problems and Complaints

I think that to truly understand something, I have to know the good and the bad, what works and what does not. This way I can have a better idea of what to use, when, and how. So, on this note, I’ll add some thoughts that I found on why the SOLID principles are not always so good.

  • The principles are vague. Depending on your skill level and experience, the principles and concepts within them can be hard to understand and put into good practice.
  • They can make your code complex. This can happen when you are using the principles too explicitly without considering the context you are applying them.
  • The YAGNI problem can easily surface with certain principles like SRP, OCP, ISP, and DIP. Or in other words, abstracting too early and too much when you might not know the real requirements of your solution.
  • Another problem is splitting things into too small parts without proper context and categorization. This can lead to hard-to-understand code or mental complexity by having to keep in mind many small details.

Summary

What did I learn from all of this? Well, making software is never easy, with lots of concepts, ideas, and thoughts. You can easily understand something wrong and apply it even more wrongly.

At the end of the day, it’s about understanding your requirements, your domain, and what you are trying to achieve. So understanding the good and the bad things in a certain approach helps you make better judgments, keeping in mind that no single solution, set of principles, patterns, etc. is the silver bullet you are looking for.

I’ll close with the following quotes:

Generally, software should be written as simply as possible in order to produce the desired result. However, once updating the software becomes painful, the software’s design should be adjusted to eliminate the pain.

Often, these principles, in addition to the more general Don’t Repeat Yourself principle, can be used as a guide while refactoring the software into a better design.

https://deviq.com/principles/solid

  1. make it as simple as possible;
  2. pain and friction will indicate when your design needs to grow;
  3. use SOLID (and other) principles as guidelines while refactoring.

And a design is simple when it:

  1. passes all tests;
  2. clearly expresses intent;
  3. contains no duplication;
  4. minimizes the number of classes and methods.

https://blog.spinthemoose.com/2012/12/17/solid-as-an-antipattern/

Sources

https://www.baeldung.com/solid-principles

https://medium.com/backticks-tildes/the-s-o-l-i-d-principles-in-pictures-b34ce2f1e898

https://betterprogramming.pub/5-problems-faced-when-using-solid-design-principles-and-how-to-fix-them-df6dbf3699fb

https://thevaluable.dev/single-responsibility-principle-revisited/

https://thevaluable.dev/open-closed-principle-revisited/

https://www.c-sharpcorner.com/article/solid-with-net-core/

https://github.com/thangchung/clean-code-dotnet

https://github.com/procodeguide/ProCodeGuide.Sample.SolidPrinciples

https://blog.ndepend.com/defense-solid-principles/

https://blog.ndepend.com/introduction-clean-architecture/

https://blog.ndepend.com/cargo-cult-programming/

https://blog.spinthemoose.com/2012/12/17/solid-as-an-antipattern/

Software Architecture – Part 4.1: Concepts and Guidelines – Abstractions

This is a topic that I wanted to investigate further. For many years I have listened, discussed, and read about different programming languages, patterns, architectural styles, etc. In some cases you will get similar answers, and in other cases different and even conflicting answers. I also had the idea that I had a grasp on this, at least on some level, but the truth is that it has been challenging to give a precise and simple definition.

Abstractions are one of the things that people, including myself, have had a hard time putting into words, and most of the answers to the question “How to best make use of abstractions?” are just as hard to pin down. So this blog post is an attempt to go deeper into abstractions and get a better understanding.

Again, I’ll post all my sources for this blog post at the end of this post.

The goal of this blog post:

  • Define what an abstraction is.
  • Abstractions in software development.
  • The difference between abstraction and indirection.

What’s an Abstraction

Abstraction in our day-to-day life

  • Something that is loosely related to reality, conveying something difficult to understand.
  • The quality of dealing with ideas rather than events.
  • “An abstraction is a general idea rather than one relating to a particular object, person, or situation.” From Dictionary
  • A generalization of something in the real life, a general idea.
  • “Abstraction is the purposeful suppression, or hiding, of some details of a process or artifact, in order to bring out more clearly other aspects, details, or structure.” From Oregon State University book.

Abstraction can be said to have these three properties:

  • Hiding, or removing, details.
  • Generalization
  • Idea versus reality.

Or:

  • Hiding useless information.
  • Generalizing a concept.
  • Dealing with an idea representing the reality.
https://thevaluable.dev/abstraction-type-software-example/

Examples of abstraction in the real world:

  • Washing machines
  • Microwave
  • Dishwasher

The concept of a machine that abstracts its inner workings from you through a “user interface”. The “user interface” can be different, but the general concept of a washing machine or a microwave remains the same across different brands.

Notice: Abstractions are not the objects they abstract.

Abstraction layers

Abstractions also come in layers. Think of a car: within the car you have a steering wheel that abstracts away the inner workings of turning the car, through different mechanical and software-related functionalities, all the way to your tires turning on the road. The same goes for the gas and brake pedals: they abstract away how fuel is fed into the engine to get the car moving, or how the brakes are applied to the wheels so that the car stops.

In software engineering, you might see something like this in a web solution:

  • User interface.
  • High-level language (PHP, for example).
  • Low-level language (C).
  • Machine language.
  • Architecture (registers, memory, arithmetic unit…).
  • Circuit elements (logic gates).
  • Transistors.

https://thevaluable.dev/abstraction-type-software-example/

The highest layer on the abstraction layer stack is the closest to reality and an actual person. Each layer acts as an interface to the layers below it.

A user should not be able to bypass the user interface; the user does not need to know what happens after the first layer of abstraction.

This concept is known as the abstraction barrier. A layer of abstraction should not know or care about the inner workings of the abstraction below it. A layer should only care about the interface it uses to communicate with an abstraction.

Abstractions in Software Engineering

Types of abstractions

In software engineering, we can say safely that an application can be divided into two parts: data and behavior/logic/control flow.

From this, we can derive two abstraction types:

  • Data abstractions, data stored somewhere somehow
  • Behaviors or control flow operating on stored data

Data Abstractions

Data abstractions are available in many programming languages with different programming paradigms, OOP, functional, etc.

A data abstraction tells you and a compiler (or interpreter):

  • What type of data a variable holds.
  • What operations the variable allows you to perform.

Notice the differences between the primitive data types in a language and the richer data types you can define in a modern programming language. See more info on the Abstract Data Type (ADT) at https://dotnettutorials.net/lesson/abstract-data-type/

Data abstraction is meant to:

  • Simplifying, by hiding the complex memory management (for some languages) and behavioral mechanisms.
  • Providing general behaviors you can reuse everywhere.
  • Giving the power for developers to create new abstractions with ADTs.
https://thevaluable.dev/abstraction-type-software-example/
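
As a hedged sketch of a small abstract data type in C# (the Money type here is made up for illustration): callers work with the idea of money, not with the stored decimal or the rounding rule.

// A small ADT: the representation and the rounding rule are hidden,
// only the behaviors that make sense for money are exposed.
public readonly struct Money
{
    private readonly decimal _amount; // hidden implementation detail

    public Money(decimal amount) => _amount = decimal.Round(amount, 2);

    public Money Add(Money other) => new(_amount + other._amount);

    public override string ToString() => _amount.ToString("0.00");
}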

Control Abstraction

Generally, your control abstraction in most languages is a function.

Why a function is an abstraction:

  • The function name simplifies and hides the internal details. The implementation details do not matter to a function caller. What matters is the function signature: name, input, and output when you want to use it.
  • Generalizes behavior, reusable.
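
For example, here is a small, made-up function sketch in C# (a loan payment calculation, purely illustrative): the caller relies only on the signature, while the formula inside stays an implementation detail.

using System;

public static class Loan
{
    // The caller only needs the signature: name, inputs, and output.
    // The annuity formula inside is an implementation detail.
    public static decimal MonthlyPayment(decimal principal, decimal annualRate, int months)
    {
        if (annualRate == 0) return principal / months;

        var monthlyRate = (double)(annualRate / 12m);
        var factor = 1 - Math.Pow(1 + monthlyRate, -months);
        return principal * (decimal)(monthlyRate / factor);
    }
}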

Abstractions in OOP

Let’s look at some abstractions that you can use with OOP.

Classes and Objects

A class is a construct which gather data (called properties or attributes) and behaviors acting on its data (called methods).

https://thevaluable.dev/abstraction-type-software-example/
  • A class should simplify the operations on its data, by encapsulating and hiding. A class should have a clear purpose by which it operates on data.
  • A class tries to generalize its purpose and operations of the required functionality.
  • The outside world will only know a representation of the internal behavior of the class, a user interface to the class. We care that the class does what it says it should, but not how it does it (except security and performance).

Abstract Classes

Depending on your programming language, you can create templates for other classes to inherit from, abstract classes.

  • An abstract class hides the details we don’t care about when using it.
  • An abstract class can generalize functionality, giving inheriting classes default behaviors.
  • An abstract class represents a general idea of some desired requirement or functionality.

Notice: Using abstract classes comes with some problems:

  • A normal class is enough to hide details.
  • You can use composition for the generalization of functionalities from many classes.
  • Abstract classes can add indirection, and with it complexity, when you use them for generalization.

Interface (Construct)

I like the description of an interface as a construct by the thevaluable.dev website. I think it gives a better sense of what an interface is, since people may think of different things depending on their background. I am going to use the same name.

Interface constructs:

  • Hide implementation details; you should only see a method signature with no implementation.
  • Generalization of a concept/functionality that can be implemented by an unlimited amount of classes.

Differences between interface constructs and abstract classes:

  • The interface construct is a limited form of multiple inheritance, and allow you to use another important concept in programming (not only in OOP) called polymorphism.
  • Methods of an abstract class can be implemented directly in the abstract class. Methods of an interface construct can’t be implemented directly in the interface construct.
https://thevaluable.dev/abstraction-type-software-example/

Notice:

  • Interfaces allow you to swap implementations; their primary focus is not abstraction.
  • Using interfaces in your application does not automatically make it incredibly abstract or tremendously scalable.

Benefits and Costs of Abstractions

Simplifying ideas and concepts

Benefits

As human beings and developers, we only have a finite capacity to process and remember ideas, thoughts, logic, and concepts before accuracy and quality start to deteriorate.

Abstracting things helps us manage things in our minds and in our code. Abstraction is simplicity.

The Costs

The Naming Problem

You need to give proper names to your functions, classes, interfaces, etc. These names should be clear on what they do; the naming should not hide details, such as a function doing more than its name implies. Your colleagues and others looking at your code should easily understand what it is about; it should not give them the wrong idea.

Over Simplification

If you simplify too much, you may end up with functionality that does far more than its simple interface says it does.

Say your functionality claims to support a dozen different external data sources, but only through a single interface point. This, of course, is a bad implementation, but the idea is that it would be unclear how a programmer should use this functionality: is it through some sort of configuration, or some other way of delivering the configuration to connect to the external data sources, etc.?

An abstraction should draw away details, that’s true, but it should bring out other details as well.

https://thevaluable.dev/abstraction-type-software-example/

You want to keep the necessary complexity and avoid unnecessary complexity.

Washing Away Too Many Details

Avoid oversimplifying to the point where you no longer know what the code is trying to do. This can be seen in situations related to your business requirements where you name a class in such a way that it is too abstract and does not make clear what it is used for.

Stay as close as possible to the business problems you solve, close to the reality of your features, to the real life. Don’t use abstractions at first.

https://thevaluable.dev/abstraction-type-software-example/

(Image: an abstract representation of a heart, from https://computersciencewiki.org/images/thumb/e/e2/Abstract_heart.png/797px-Abstract_heart.png)

Leaky Abstractions

Abstractions should only hide things, never remove them. An abstraction is leaking when the details it is supposed to hide show through, for example when a bug in the underlying implementation forces you to understand what is underneath.

Depending on who created it and where the bug is located, you or someone else needs to fix it if you want the abstraction to work.

It is not always clear how an abstraction works, especially if it is not yours. Even if problems are not obvious when you use the abstraction, they might still be there and cause trouble in the future.

For external libraries and tools, you can mitigate possible future problems by using design patterns and interfaces to create a layer of abstraction between your code and the external dependency.

All in all, you need to know what’s going on in the closest layer of abstraction you work with.
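
As a hedged sketch of that mitigation in C# (ThirdPartyPdfEngine is a hypothetical stand-in for whatever external library you actually use):

using System.Text;

// Stand-in for an external library type; in reality this would come from a NuGet package.
public class ThirdPartyPdfEngine
{
    public byte[] ConvertHtmlToPdf(string html) => Encoding.UTF8.GetBytes(html);
}

// Our own small interface is the only thing the rest of the code base sees.
public interface IPdfRenderer
{
    byte[] Render(string html);
}

// The adapter is the single place that knows about the external library,
// so a leak or a library swap is contained here.
public class ThirdPartyPdfAdapter : IPdfRenderer
{
    private readonly ThirdPartyPdfEngine _engine = new();

    public byte[] Render(string html) => _engine.ConvertHtmlToPdf(html);
}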

Generalization

The Benefits

It is quite clear: from your choice of programming language to your libraries, frameworks, APIs, business logic, patterns, concepts, etc., they are all in one way or another abstractions built through generalization.

As stated earlier, it will help you comprehend what you need to do and how to do it.

Pitfalls

Upfront Generalizations or Excessive Generalizations

To be able to create working solutions for a client, we as developers need to understand many aspects like the client’s business, their requirements and needs, their domains, the common language, etc.

This is all to be able to create generalizations and abstractions that allow you to create the kind of solutions your client needs, to turn everything into working code.

When we abstract things we must be aware of how our abstractions affect our code. With different architectural approaches and design patterns, we create different levels of cohesion and coupling in our code.

Unnecessary generalization upfront can be problematic because you as a developer can’t know anything beyond what you currently know. Requirements change, real-life needs change, and when you need to make these changes to your code, they can lead to bugs and hard-to-maintain situations.

Doing too much generalization is guesswork; base your solution on what you know and remember YAGNI: you aren’t gonna need it.

Some thoughts on premature generalization:

  • Stick to what you know; remember YAGNI
  • If something is needed in multiple places, start to think about generalization
  • If unsure whether something is a good candidate for generalization, then copy and paste it for the time being. (Notice: DRY is a good guideline but not always; unchecked, DRY can lead to more highly coupled code.)
  • A generalization used in one place is not useful.
  • Try generalization first with functions instead of complex patterns or guidelines.
  • Follow clean code and refactoring guidelines when examining what to and how to generalize.

Generalization can also make things harder to understand: things can be so generalized that you don’t really have a good idea of what is going on, and you have a hard time understanding what business requirements the code is trying to solve.

The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise.

https://en.wikipedia.org/wiki/Edsger_W._Dijkstra

Abstract Classes and Inheritance

Abstract classes and inheritance can create problems if used too carelessly. Inheritance is an IS-A relationship between the parent and the child class.

Trying to abstract and generalize things too early or together can lead to situations where there is code pollution between the parent and child classes. You may end up complying with the needs of the different derived child classes through your parent class in such a way that it pollutes your child classes with things they do not need. Your child classes need to start being aware of things they need not be.

In such a situation, a HAS-A relationship as composition is a better approach.
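
A minimal C# sketch of the HAS-A approach; the report example and its names are made up for illustration:

using System.IO;

// Instead of a Report base class that keeps growing to serve every child,
// the report HAS-A formatter and HAS-A exporter.
public interface IReportFormatter { string Format(string content); }
public interface IReportExporter { void Export(string formatted); }

public class HtmlFormatter : IReportFormatter
{
    public string Format(string content) => $"<html><body>{content}</body></html>";
}

public class FileExporter : IReportExporter
{
    public void Export(string formatted) => File.WriteAllText("report.html", formatted);
}

public class Report
{
    private readonly IReportFormatter _formatter;
    private readonly IReportExporter _exporter;

    public Report(IReportFormatter formatter, IReportExporter exporter)
    {
        _formatter = formatter;
        _exporter = exporter;
    }

    public void Publish(string content) => _exporter.Export(_formatter.Format(content));
}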

Abstraction and Indirection

Abstraction and indirection are two different things but related based on our implementation of functionality and code. Let’s get some groundwork done before going forward into abstraction and indirection.

The Two Dimensions of Software Variability

There are two different dimensions in play here. Both of them are important for creating simple, flexible, usable software. The first dimension is the degree of coupling. High coupling occurs when things are directly connected. Low coupling exists where things are indirectly connected. Neither high-coupling nor low-coupling is inherently good or bad. It’s contextual. Things that are directly connected are simple, faster to build, and easier to understand. However, they are inflexible. When there are expected dimensions of software change, it can be worth paying the cost to create software with higher flexibility. Flexible software is harder to create, more complex, harder to understand, and harder to maintain.

The second dimension is about level of abstraction. Low abstraction (concrete) occurs when details are very specific and explicit. High abstraction occurs when all details are left out, and only general concepts are communicated. Low abstraction in software operates with full knowledge of details such as memory addresses, pointers, threads, registers, encodings, http status codes, etc. High abstraction in software operates in business concepts such as calendars, schedules, orders, customers, slides, pictures, etc. Things which are more abstract are more generally usable, and are easier to reason about. Things which are more concrete are what enables and empowers the abstractions – they make everything work. Without concrete elements, software is nothing more than a beautiful shell of uselessness.

Directness is about how many elements are in between two parties.

Abstraction is about how many details are expressed and involved.

https://www.silasreinagel.com/blog/2018/10/30/indirection-is-not-abstraction/

Guidelines to Directness and Abstractness:

Directness Heuristics

Make your components relate to each other indirectly when they will change for different reasons.

Make your components relate to each other directly when they are trivial, or when they won’t have a need to change.

If you don’t know whether or not a component will change, your software will be best if you assume it will not change. (YAGNI).

Abstractness Heuristics

Make your components concrete when you don’t have a need to solve general problems.

Make your components abstract when you will need to solve a very similar class of problems with numerous permutations.

If you don’t know whether or not you will need to solve a whole class of problems, your software will be best if you assume that you don’t need a general solution. (YAGNI).

https://www.silasreinagel.com/blog/2018/10/30/indirection-is-not-abstraction/

Now let’s talk about indirection a bit more. What is indirection?

As developers, we may think that abstraction allows us to replace part of an implementation with something else, like interfaces or abstract classes. This is not abstraction but a sign of indirection.

You can use an interface to hide the implementation of something and then swap the concrete class that is passed in place of the interface. While this looks like abstraction, in reality it is indirection, because:

  • You might have a class that implements a 1:1 relationship to an interface, thus not really hiding anything, and nothing is really generalized.
  • Or in other words, introducing an interface does not change the abstraction level if the implementation is not at a lower level of abstraction than the interface it implements.
  • Having a single implementation of an interface without a change in the abstraction level is indirection, not abstraction.
  • Adding indirection to your system doesn’t change the abstraction level. There are many indirections that are not abstract at all.

Notice: Indirection can bring flexibility by letting you swap one implementation detail for another. Say you are building a game engine and need to support multiple controllers to control the game: introducing an indirection through an interface allows you to easily swap which controller the user wants to use, and the game reacts properly to this change.

Indirections are also used in different software architectural approaches (the ports and adapters approach) and in unit testing to be able to test edge cases (external dependencies like databases, APIs, etc.). One use case is to create isolation between different layers of an architecture, or between a framework and your logic, and also when you want to decouple from the implementation details of a specific database. This allows you, for example, to swap the database implementation details and create unit tests without hassle.

An example of many different indirections in a .NET vertical slice / feature-based approach:
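
(The original post showed a diagram here; as a rough, hypothetical sketch of the kind of indirections involved, with all names made up and the handler interface hand-rolled rather than any specific mediator library:)

using System.Threading.Tasks;

// Hypothetical "GetOrder" feature slice.
public record GetOrderQuery(int OrderId);
public record OrderDto(int Id, string Status);

// Indirection 1: the handler is invoked through an interface, not called directly.
public interface IQueryHandler<TQuery, TResult>
{
    Task<TResult> Handle(TQuery query);
}

// Indirection 2: data access sits behind its own interface so the database
// can be swapped or replaced with a test double in unit tests.
public interface IOrderRepository
{
    Task<OrderDto?> GetById(int id);
}

public class GetOrderHandler : IQueryHandler<GetOrderQuery, OrderDto?>
{
    private readonly IOrderRepository _orders;

    public GetOrderHandler(IOrderRepository orders) => _orders = orders;

    public Task<OrderDto?> Handle(GetOrderQuery query) => _orders.GetById(query.OrderId);
}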

Some thoughts on abstraction versus indirection:

  • Some abstractions create indirections, like an abstract class or an interface construct.
  • Creating an indirection doesn’t mean you create a (useful) abstraction.
  • Indirections can make your software more flexible to changes.
  • Be sure that you need this flexibility! If you intend to swap some implementation, be sure that you have more than one implementation in your codebase available right now.
https://thevaluable.dev/abstraction-type-software-example/

All in all, indirection affects two things:

  • Mental complexity: Your ability to keep track of details and logic in the code base.
  • Performance: Indirection can cause performance issues when the underlying implementation slows your application down and you don’t understand what is going on or how to properly use the indirection.

Sources

https://thevaluable.dev/abstraction-type-software-example/

https://thevaluable.dev/kiss-principle-explained/

https://thevaluable.dev/open-closed-principle-revisited/

https://thevaluable.dev/cohesion-coupling-guide-examples/

https://thevaluable.dev/dry-principle-cost-benefit-example/

https://web.archive.org/web/20071214085409/http://www.itmweb.com/essay550.htm

https://web.engr.oregonstate.edu/~budd/Books/oopintro3e/info/chap02.pdf

http://infolab.stanford.edu/~ullman/focs/ch01.pdf

https://www.joelonsoftware.com/2002/11/11/the-law-of-leaky-abstractions/

http://www.principles-wiki.net/principles:law_of_leaky_abstractions

https://tuhrig.de/programming-to-an-interface/

https://blog.acolyer.org/2016/10/20/programming-with-abstract-data-types/

https://hackernoon.com/abstract-programmers-acada09df860

https://www.abstractionlayeredarchitecture.com/

https://matthiasnoback.nl/2018/02/lasagna-code-too-many-layers/

https://betterprogramming.pub/avoiding-premature-software-abstractions-8ba2e990930a

https://codeopinion.com/when-not-to-write-an-abstraction-layer/

https://codeopinion.com/whats-the-cost-of-indirection-abstractions/

https://dotnettutorials.net/lesson/abstract-data-type/

https://www.silasreinagel.com/blog/2018/10/30/indirection-is-not-abstraction/

https://www.youtube.com/watch?v=DNjDZ0E6GUs

Software Testing – Part 1: Unit testing

Hello, this is the start of a new series on testing in software development. I wanted to understand it better because I felt that there were different views and opinions on the matter. Too many times I and other developers have agreed or disagreed on matters of testing while not being entirely sure why and how testing is to be done.

So I started to read some books, posts, and articles and have some discussions to get a better view. This blog post is primarily based on the book Unit Testing Principles, Practices, and Patterns by Vladimir Khorikov; these are my notes on the book with additional details and information from other sources.

Key terms

  • A test double is an object that looks and behaves like its release-intended counterpart but is actually a simplified version that reduces the complexity and facilitates testing.
  • Unit Testing: Principles, Practices, and Patterns
  • A method under test (MUT) is a method in the SUT called by the test. The terms MUT and SUT are often used as synonyms, but normally, MUT refers to a method while SUT refers to the whole class.
  • Unit Testing: Principles, Practices, and Patterns
  • A mock is a special kind of test double that allows you to examine interactions between the system under test and its collaborators.
  • A SUT, a.k.a. system under test
  • Unit Testing: Principles, Practices, and Patterns
  • Test-driven development:
  • Unit Testing: Principles, Practices, and Patterns: A software development process that relies on tests to drive the project development
    • Write a failing test to indicate which functionality needs to be added and how it should behave.
    • Write just enough code to make the test pass. At this stage, the code doesn’t have to be elegant or clean.
    • Refactor the code. Under the protection of the passing test, you can safely clean up the code to make it more readable and maintainable.
  • Collaborator:
    • A collaborator is a dependency that is either shared or mutable. For example, a class providing access to the database is a collaborator since the database is a shared dependency. Store is a collaborator too, because its state can change over time.
  • Dependency:
    • Values or Value Objects, like a product or number 5.
  • A test?:
    • Each test should tell a story. This story is an individual, atomic scenario or fact about the problem domain, and the passing test is proof that this scenario or fact holds true. If the test fails, it means either the story is no longer valid and you need to rewrite it, or the system itself has to be fixed.
  • Test fixture:
    • A test fixture is an object the test runs against. This object can be a regular dependency—an argument that is passed to the SUT. It can also be data in the database or a file on the hard disk. Such an object needs to remain in a known, fixed state before each test run, so it produces the same result. Hence the word fixture.
  • Refactoring
    • Refactoring means changing existing code without modifying its observable behavior. The intention is usually to improve the code’s nonfunctional characteristics: increase readability and reduce complexity. Some examples of refactoring are renaming a method and extracting a piece of code into a new class.

Goals of unit testing

  • Code tends to deteriorate. Each time you change something in a code base, the amount of disorder in it, or entropy, increases. With time the project becomes complex and disorganized. Tests act as a safety net—a tool that provides insurance against the vast majority of regressions.
  • Write good unit tests. Bad tests or no tests have the same impact: either stagnation or a lot of regressions with every new release.
  • Good unit tests gives you the confidence that changes won’t lead to regressions and make it easier to refactor or add new features.
  • Each test has a cost and a benefit component, and you need to carefully weigh one against the other. Keep only tests of positive net value in the suite, and get rid of all others.
  • Both the application code and the test code are liabilities, not assets.
  • Test coverage numbers aren’t everything, and imposing a particular coverage number creates a perverse incentive. It’s good to have a high level of coverage in core parts of your system, but it’s bad to make this high level a requirement. Demanding too high coverage numbers might make developers create poorer quality tests.
  • A successful test suite exhibits the following attributes:
    • It is integrated into the development cycle.
    • It targets only the most important parts of your code base.
    • It provides maximum value with minimum maintenance costs.
  • The only way to achieve the goal of unit testing (that is, enabling sustainable project growth) is to:
    • Learn how to differentiate between a good and a bad test.
    • Be able to refactor a test to make it more valuable.
  • Primary Goals of testing:
    • Tests provide an early warning when you break existing functionality. Thanks to such early warnings, you can fix an issue long before the faulty code is deployed to production, where dealing with it would require a significantly larger amount of effort.
    • You become confident that your code changes won’t lead to regressions. Without such confidence, you will be much more hesitant to refactor and much more likely to leave the code base to deteriorate.
  • Tests must focus on the whats, not the hows.

What is a unit test

  • Unit tests key features:
    • A unit test verifies a single unit of behavior (a unit can be interpreted in different ways, related to two different views on test isolation)
    • The test executes quickly, giving fast feedback
    • Performed in isolation from other tests.
  • There are two schools of testing the Chicago/Detroit and the London style of testing.
  • This difference of opinion affects the view of what constitutes a unit and the treatment of the system under test’s (SUT’s) dependencies.
    • The London school states that the units under test should be isolated from each other. A unit under test is a unit of code, usually a class. All of its dependencies, except immutable dependencies, should be replaced with test doubles in tests.
    • The classical school states that the unit tests need to be isolated from each other, not units. Also, a unit under test is a unit of behavior, not a unit of code. Thus, only shared dependencies should be replaced with test doubles. Shared dependencies are dependencies that provide means for tests to affect each other’s execution flow.
    • Or: “In the Classicist world, the unit is a module. A module is a single class, just a function, or a set of closely related classes, which implement a particular functionality. It doesn’t matter how small and simple, or complex and full of inside collaborations it is. A module’s functionality is exposed by public exports (API). Hence, in the Detroit style, you write a test against the module.” – https://blog.devgenius.io/detroit-and-london-schools-of-test-driven-development-3d2f8dca71e5
    • This approach can be interpreted as an integration test in the London School approach.
  • The London school provides the benefits of better granularity, the ease of testing large graphs of interconnected classes, and the ease of finding which functionality contains a bug after a test failure.
    • Introduces also some problems:
      • The problem of overspecification—coupling tests to the SUT’s implementation details.
  • An integration test is a test that doesn’t meet at least one of the criteria for a unit test. End-to-end tests are a subset of integration tests; they verify the system from the end user’s point of view. End-to-end tests reach out directly to all or almost all out-of-process dependencies your application works with.
  • Thought: Tests shouldn’t verify units of code. Rather, they should verify units of behavior: something that is meaningful for the problem domain and, ideally, something that a business person can recognize as useful. The number of classes it takes to implement such a unit of behavior is irrelevant. The unit could span across multiple classes or only one class, or even take up just a tiny method.
  • A test should tell a story about the problem your code helps to solve, and this story should be cohesive and meaningful to a non-programmer.
  • Things to consider:
    • Having a large set of classes to test with a single test may be a sign of code design problems.
    • The Chicago School approach is more on “digging” deeper into the implementation and a more vertical slice approach. Focus on an exploration of the core logic with all its complexities from the very first moment.
    • The London School approach is more on doing a few things first and exploring little by little what comes next.
  • Unit Testing: Principles, Practices, and Patterns
  • Integration test is a test that verifies that your code works in integration with shared dependencies, out-of-process dependencies, or code developed by other teams in the organization.
  • End-to-end tests are a subset of integration tests. They, too, check to see how your code works with out-of-process dependencies. The difference between an end-to-end test and an integration test is that end-to-end tests usually include more of such dependencies. This usually includes also UI tests, from the end-users point of view.
  • Common pitfalls:
    • Tests are too complex: You have too many things going on in your tests; probably you are testing too many things. Time to refactor your tests: split them, use factories, builders, or object mothers to help.
    • Spending more time fixing tests than writing production code: Might be a sign of over-reliance on testing implementation details and heavy mock usage. Time to refactor your tests; consider introducing a real collaborator (an instance of an object) instead of a mock.
    • The TDD approach does not give a good design – big ball of mud: Going tests-first without understanding good design and architectural patterns is not a guarantee of good quality code. Being aware of design and architectural patterns helps you when you are doing TDD. TDD does not guarantee good quality, good architectural approaches, etc.
School            | Isolation of | A unit is                   | Uses test doubles for
London            | Units        | A class                     | All but immutable dependencies
Classical/Chicago | Unit tests   | A class or a set of classes | Shared dependencies

Example: Classical/Chicago Style Unit Tests

[Fact]
    public void TicketBookingSucceedsWhenTicketsAvailable()
    {
        // Arrange
        var store = new TicketStore();
        store.AddSeats(Location.Balcony, 10);
        var customer = new Customer();
        
        // Act
        bool success = customer.Book(store, Location.Balcony, 5);
        
        // Assert
        Assert.True(success);
        Assert.Equal(5, store.GetSeats(Location.Balcony));
    }

Example: London Style Unit Tests

[Fact]
    public void TicketBookingSucceedsWhenTicketsAvailable()
    {
        // Arrange
        var storeMock = new Mock<ITicketStore>();
        // Stub the quantity actually requested below (5), otherwise the mock returns false.
        storeMock.Setup(x => x.HasEnoughInventory(Location.Balcony, 5)).Returns(true);
        var customer = new Customer();
        
        // Act
        bool success = customer.Book(storeMock.Object, Location.Balcony, 5);
        
        // Assert
        Assert.True(success);
        storeMock.Verify(x => x.RemoveSeats(Location.Balcony, 5), Times.Once);
    }

Unit Tests Anatomy

Unit Testing: Principles, Practices, and Patterns

  • All unit tests should follow the AAA pattern: arrange, act, assert. Each unit test should have one arrange, one act, and one assert section; if it has multiple of them, then split it into multiple tests.
    • Notice: In integration tests it is normal to have more than one act section.
    • In the arrange section, you bring the system under test (SUT) and its dependencies to a desired state. Setting up the needed data and collaborator used with the SUT.
    • In the act section, you call methods on the SUT, pass the prepared dependencies, and capture the output value (if any).
    • In the assert section, you verify the outcome. The outcome may be represented by the return value, the final state of the SUT and its collaborators, or the methods the SUT called on those collaborators. Verifying that the outcome is in a certain way and that certain things did happen.
    • Notice: This is similar to the Given-When-Then pattern (similar to AAA with no real difference; it is just more readable to non-developers):
      • Given – Corresponds to the arrange section
      • When – Corresponds to the act section
      • Then – Corresponds to the assert section
  • More than one line in the act section is a sign of a problem with the SUT’s API. It requires the client to remember to always perform these actions together, which can potentially lead to inconsistencies. Such inconsistencies are called invariant violations. The act of protecting your code against potential invariant violations is called encapsulation.
  • You can distinguish the SUT in tests by naming it sut, which helps identify more clearly what is being tested.
  • As a good practice in helping test readability, differentiate the three test sections either by putting Arrange, Act, and Assert comments before them or by introducing empty lines between these sections.
  • Reuse test fixture initialization code by introducing factory methods, not by putting this initialization code into the constructor (see the sketch after this list). Such reuse helps maintain a high degree of decoupling between tests and also provides better readability.
  • Don’t use a rigid test naming policy. Name each test as if you were describing the scenario in it to a non-programmer who is familiar with the problem domain.
    • Separate words in the test name by underscores
    • Don’t include the name of the method under test in the test name.
    • Name the test as if you were describing the scenario to a non-programmer who is familiar with the problem domain. A domain expert or a business analyst is a good example.
  • Parameterized tests help reduce the amount of code needed for similar tests by running the test multiple times with different given parameters. The drawback is that the test names become less readable as you make them more generic. If you have too many parameters, you might need to split the test.
  • Assertion libraries help you further improve test readability by restructuring the word order in assertions so that they read like plain English.
  • General guidance:
    • Arrange section: Can grow large, and this is OK. If your arrange section becomes bigger than the act and assert sections combined, there are things you can do to make it more readable and maintainable:
      • extract the arrangements either into private methods within the same test class
      • Separate factory class
        • Object Mother
        • Test Data Builder
    • Assert section:
      • You can assert multiple things as long as they are related to a unit of behaviour, this is especially true if you are doing unit testing in the classical style. A single unit of behavior can exhibit multiple outcomes, and it’s fine to evaluate them all in one test.
        • Of course, be aware that they do not grow too large; this may be a sign of refactoring needs in your code.
    • Teardown phase: This should not be a concern in unit testing, since you should not be depending on outside dependencies and needing cleanups. This is the realm of integration tests.
    • High coupled tests:
      • Coupling tests together is a test anti-pattern, modification of one test should not affect other tests.
        • Notice: this rule doesn’t apply to integration tests, where you will need a common shared dependency between all or most tests.
      • Avoid general test fixture initialization like in the constructor of the test class; this leads to coupling between tests and diminishes test readability. Prefer factory methods or classes.
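
A small sketch of the factory-method style mentioned in the list above, reusing the TicketStore/Customer types from the earlier examples (the exact API is assumed for illustration, not taken from the book):

using Xunit;

public class CustomerTests
{
    [Fact]
    public void Booking_fails_when_not_enough_seats()
    {
        // Arrange: factory methods keep tests decoupled and readable.
        var store = CreateStoreWithSeats(Location.Balcony, 3);
        var customer = CreateCustomer();

        // Act
        bool success = customer.Book(store, Location.Balcony, 5);

        // Assert
        Assert.False(success);
    }

    // Private factory methods instead of shared constructor/fixture initialization.
    private static TicketStore CreateStoreWithSeats(Location location, int seats)
    {
        var store = new TicketStore();
        store.AddSeats(location, seats);
        return store;
    }

    private static Customer CreateCustomer() => new();
}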

Pillars of a good unit test

  • A good unit test has four foundational attributes that you can use to analyze any automated test, whether unit, integration, or end-to-end:
    • Protection against regressions
      • Is a measure of how good the test is at indicating the presence of bugs (regressions). The more code the test executes (both your code and the code of libraries and frameworks used in the project), the higher the chance this test will reveal a bug.
    • Resistance to refactoring
      • Is the degree to which a test can sustain application code refactoring without producing a false positive.
      • A false positive is a false alarm: a result indicating that the test fails, whereas the functionality it covers works as intended. False positives can have a devastating effect on the test suite:
        • They dilute your ability and willingness to react to problems in code, because you get accustomed to false alarms and stop paying attention to them.
        • They diminish your perception of tests as a reliable safety net and lead to losing trust in the test suite.
      • False positives are a result of tight coupling between tests and the internal implementation details of the system under test. To avoid such coupling, the test must verify the end result the SUT produces, not the steps it took to do that. Tests should approach SUT verification from the end user’s point of view and check only the outcome meaningful to that end user.
    • Fast feedback
      • Is a measure of how quickly the test executes.
      • A short and fast feedback loop encourages you to test, keeps tests in good quality, and reduces the cost of fixing bugs to almost zero. Slow tests have the opposite effect.
    • Maintainability
      • How hard it is to understand the test. The smaller the test, the more readable it is and easier to change it. The quality of the test code matters as much as the production code. Don’t cut corners when writing tests; treat the test code as a first-class citizen.
      • How hard it is to run the test. The fewer out-of-process dependencies the test reaches out to, the easier it is to keep them operational.
  • Protection against regressions and resistance to refactoring contribute to test accuracy. A test is accurate insofar as it generates a strong signal (is capable of finding bugs, the sphere of protection against regressions) with as little noise (false positives) as possible (the sphere of resistance to refactoring).
  • False positives don’t have as much of a negative effect in the beginning of the project, but they become increasingly important as the project grows: as important as false negatives (unnoticed bugs).
  • The Test Pyramid advocates for a certain ratio of unit, integration, and end-to- end tests: end-to-end tests should be in the minority, unit tests in the majority, and integration tests somewhere in the middle.
  • If you want your tests to give the best protection, you should also include code outside your own written code: libraries, frameworks, and external systems. This is to test the assumptions you are making about these dependencies.
  • Significance to a test can be determined by the following:
    • The amount of code that is executed during the test
    • The complexity of that code
    • The code’s domain significance
    • Notice: Code that represents complex business logic is more important than boilerplate/trivial code; bugs in business-critical functionality are the most damaging.
  • Code correctness and test results possible outcomes:
    • True Negative: The situation when the test passes and the underlying functionality works as intended is a correct inference: the test correctly inferred the state of the system (there are no bugs in it). Another term for this combination of working functionality and a passing test is true negative.
    • True Positive: When the functionality is broken and the test fails, it’s also a correct inference. That’s because you expect to see the test fail when the functionality is not working properly. That’s the whole point of unit testing. The corresponding term for this situation is true positive.
    • False Negative: When the functionality is broken but the test still passes; the test doesn’t catch the error. Tests with good protection against regressions help you minimize the number of false negatives.
    • False Positive: When the functionality is correct but the test still shows a failure.
  • Use the black-box testing method when writing tests. Use the white-box method when analyzing the tests.
    • Make all tests—be they unit, integration, or end-to-end—view the system as a black box and verify behavior meaningful to the problem domain. If you can’t trace a test back to a business requirement, it’s an indication of the test’s brittleness. Either restructure or delete this test; don’t let it into the suite as-is.

Mocks and test fragility

  • Test double is an overarching term that describes all kinds of non-production-ready, fake dependencies in tests. The major use of test doubles is to facilitate testing; they are passed to the system under test instead of real dependencies, which could be hard to set up or maintain.
    • There are five variations of test doubles: dummy, stub, spy, mock, and fake.
    • These can be grouped in just two types: mocks and stubs.
    • Spies are functionally the same as mocks. Spies are written manually, whereas mocks are created with the help of a mocking framework. Sometimes people refer to spies as handwritten mocks.
    • Dummies and fakes serve the same role as stubs.
    • The difference between a stub, a dummy, and a fake is in how intelligent they are.
      • A dummy is a simple, hardcoded value such as a null value or a made-up string. It’s used to satisfy the SUT’s method signature and doesn’t participate in producing the final outcome.
      • A stub is more sophisticated. It’s a fully fledged dependency that you configure to return different values for different scenarios.
      • A fake is the same as a stub for most purposes. The difference is in the rationale for its creation: a fake is usually implemented to replace a dependency that doesn’t yet exist.
  • Mocks help emulate and examine outcoming interactions: calls from the SUT to its dependencies that change the state of those dependencies. Stubs help emulate incoming interactions: calls the SUT makes to its dependencies to get input data.
  • A mock (the tool) is a class from a mocking library that you can use to create a mock (the test double) or a stub.
  • Asserting interactions with stubs leads to fragile tests. Such an interaction doesn’t correspond to the end result; it’s an intermediate step on the way to that result, an implementation detail. This practice of verifying things that aren’t part of the end result is also called over-specification.
  • The notions of mocks and stubs tie to the command query separation (CQS) principle, which states that every method should be either a command or a query, but not both.
    • Test doubles that substitute commands are mocks; test doubles that substitute queries are stubs (a sketch follows at the end of this list).
    • Commands are methods that produce side effects and don’t return any value (return void). Examples of side effects include mutating an object’s state, changing a file in the file system, and so on. Queries are the opposite of that—they are side-effect free and return a value.
      • In other words, asking a question should not change the answer. Code that maintains such a clear separation becomes easier to read. You can tell what a method does just by looking at its signature, without diving into its implementation details.
  • All production code can be categorized along two dimensions:
    • Public API versus private API
    • Observable behavior versus implementation details.
    • Code publicity is controlled by access modifiers, such as private, public, and internal keywords.
    • Code is part of observable behavior when it meets one of the following requirements (any other code is an implementation detail):
      • It exposes an operation that helps the client achieve one of its goals. An operation is a method that performs a calculation or incurs a side effect.
      • It exposes a state that helps the client achieve one of its goals. State is the current condition of the system.
  • Well-designed code is code whose observable behavior coincides with the public API and whose implementation details are hidden behind the private API.
    • A code leaks implementation details when its public API extends beyond the observable behavior.
    • Actions for a well-designed code:
      • Expose an operation that helps the client achieve one of its goals.
      • Expose a state that helps the client achieve one of its goals.
    • A good rule of thumb can help you determine whether a class leaks its implementation details: if the number of operations the client has to invoke on the class to achieve a single goal is greater than one, then that class is likely leaking implementation details. Ideally, any individual goal should be achieved with a single operation.
  • Encapsulation is the act of protecting your code against invariant violations.
    • Exposing implementation details often entails a breach in encapsulation because clients can use implementation details to bypass the code’s invariants.
    • An invariant is a condition that should be held true at all times.
    • Encapsulation is crucial for code base maintainability in the long run. The reason why is complexity. Code complexity is one of the biggest challenges you’ll face in software development. The more complex the code base becomes, the harder it is to work with, which, in turn, results in slowing down development speed and increasing the number of bugs.
    • When the code’s API doesn’t guide you through what is and what isn’t allowed to be done with that code, you have to keep a lot of information in mind to make sure you don’t introduce inconsistencies with new code changes. This brings an additional mental burden to the process of programming. Remove as much of that burden from yourself as possible. You cannot trust yourself to do the right thing all the time—so, eliminate the very possibility of doing the wrong thing.
    • Ways to achieve good code encapsulation:
      • Hiding implementation details helps you remove the class’s internals from the eyes of its clients, so there’s less risk of corrupting those internals.
      • Bundling data and operations helps to make sure these operations don’t violate the class’s invariants.
  • There are two types of communications in an application: intra-system and inter-system.
    • Intra-system communications are communications between classes inside the application.
    • Inter-system communication is when the application talks to external applications.
  • Intra-system communications are implementation details. Inter-system communications are part of observable behavior, with the exception of external systems that are accessible only through your application. Interactions with such systems are implementation details too, because the resulting side effects are not observed externally.
  • Using mocks to assert intra-system communications leads to fragile tests. Mocking is legitimate only when it’s used for inter-system communications.
    • Communications that cross the application boundary—and only when the side effects of those communications are visible to the external world.
  • Not all out-of-process dependencies should be mocked out
    • If an out-of-process dependency is only accessible through your application, then communications with such a dependency are not part of your system’s observable behavior.
    • An out-of-process dependency that can’t be observed externally, in effect, acts as part of your application.
    • When your application acts as a proxy to an external system, and no client can access it directly, the backward-compatibility requirement vanishes. Now you can deploy your application together with this external system, and it won’t affect the clients. The communication pattern with such a system becomes an implementation detail.
    • The use of mocks for out-of-process dependencies that you have full control over also leads to brittle tests. You don’t want your tests to turn red every time you split a table in the database or modify the type of one of the parameters in a stored procedure. The database and your application must be treated as one system.
    • Out-of-Process dependencies:
      • Shared dependency: A dependency shared by tests (not production code)
      • Out-of-process dependency: A dependency hosted by a process other than the program’s execution process (for example, a database, a message bus, or an SMTP service)
      • Private dependency: Any dependency that is not shared
  • Using mocks to verify behavior
    • Mocks are often said to verify behavior. In the vast majority of cases, they don’t. The way each individual class interacts with neighboring classes in order to achieve some goal has nothing to do with observable behavior; it’s an implementation detail.
    • Mocks have something to do with behavior only when they verify interactions that cross the application boundary and only when the side effects of those interactions are visible to the external world.
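
A minimal sketch of the mock/stub and command/query split referenced above (assuming xUnit and Moq; IProductCatalog, IEmailGateway, and OrderProcessor are hypothetical names): the stub emulates an incoming query the SUT uses to get data, while the mock verifies an outcoming command whose side effect crosses the application boundary.

[Fact]
    public void StubsForQueries_MocksForCommands()
    {
        // Stub: emulates an incoming interaction (a query returning input data to the SUT).
        var catalogStub = new Mock<IProductCatalog>();
        catalogStub.Setup(c => c.GetPrice("ticket")).Returns(10m);

        // Mock: examines an outcoming interaction (a command with an externally visible side effect).
        var emailMock = new Mock<IEmailGateway>();

        var sut = new OrderProcessor(catalogStub.Object, emailMock.Object);

        sut.Process("ticket", "john.doe@email.com");

        // Assert only against the command; verifying calls on the stub would be over-specification.
        emailMock.Verify(e => e.SendReceipt("john.doe@email.com"), Times.Once);
    }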

Styles of unit testing

  • The Three Styles of unit testing
    • Output-based testing is a style of testing where you feed an input to the SUT and check the output it produces.
      • This style of testing assumes there are no hidden inputs or outputs, and the only result of the SUT’s work is the value it returns.
      • This is only applicable to code that doesn’t change a global or internal state, so the only component to verify is its return value.
      • The output-based style of unit testing is also known as functional. This name takes root in functional programming, a method of programming that emphasizes a preference for side-effect-free code.
      • Output-based testing produces tests of the highest quality. Such tests rarely couple to implementation details and thus are resistant to refactoring. They are also small and concise and thus are more maintainable.
    • State-based testing verifies the state of the system after an operation is completed.
      • The term state in this style of testing can refer to the state of the SUT itself, of one of its collaborators, or of an out-of-process dependency, such as the database or the filesystem.
      • State-based testing requires extra prudence to avoid brittleness: you need to make sure you don’t expose a private state to enable unit testing.
      • Because state-based tests tend to be larger than output-based tests, they are also less maintainable.
      • Maintainability issues can sometimes be mitigated (but not eliminated) with the use of helper methods and value objects.
    • In communication-based testing, you use mocks to verify communications between the system under test and its collaborators.
      • Communication-based testing also requires extra prudence to avoid brittleness.
      • You should only verify communications that cross the application boundary and whose side effects are visible to the external world.
      • Maintainability of communication-based tests is worse compared to output-based and state-based tests.
      • Mocks tend to occupy a lot of space, and that makes tests less readable.
  • The classical school of unit testing prefers the state-based style over the communication-based one. The London school has the opposite preference. Both schools use output-based testing.
  • Comparison of the three styles to the four attributes of good unit tests:
    • Protection against regressions
      • Characteristics:
        • The amount of code that is executed during the test
        • The complexity of that code
        • Its domain significance.
      • Output-based:
        • No specific advantage regarding the amount of code exercised or test speed
      • State-based:
        • No specific advantage regarding the amount of code exercised or test speed
      • Communication-based:
        • Overusing this style can lead to tests that verify only a thin slice of the code while mocking out everything else.
        • Speed is similar to the other styles; it could become slower if you have thousands of tests.
    • Resistance to refactoring
      • Output-based:
        • Provides the best protection against false positives because the resulting tests couple only to the method under test. The only way for such tests to couple to implementation details is when the method under test is itself an implementation detail.
      • State-based:
        • State-based testing is usually more prone to false positives. In addition to the method under test, such tests also work with the class’s state. State-based tests tie to a larger API surface, and hence the chances of coupling them to implementation details are also higher.
      • Communication-based:
        • Tests end up using more test doubles, which may make them brittle.
        • You can reduce the number of false positives to a minimum by maintaining proper encapsulation and coupling tests to observable behavior only. Admittedly, though, the amount of due diligence varies depending on the style of unit testing.
    • Maintainability
      • Output-based:
        • Are almost always short and concise and thus are easier to maintain. This benefit of the output-based style stems from the fact that this style boils down to only two things: supplying an input to a method and verifying its output, which you can often do with just a couple lines of code.
        • Because the underlying code in output-based testing must not change the global or internal state, these tests don’t deal with out-of-process dependencies. Hence, output-based tests are best in terms of both maintainability characteristics.
      • State-based:
        • State-based tests are normally less maintainable than output-based ones. This is because state verification often takes up more space than output verification.
        • Can be mitigated with helper methods, but these require effort to write and maintain.
        • Also, be careful of code pollution scenarios: code that exists for the sole purpose of making unit tests easier or simpler is an anti-pattern.
      • Communication-based:
        • Communication-based tests score worse than output-based and state-based tests on the maintainability metric.
        • Communication-based testing requires setting up test doubles and interaction assertions, and that takes up a lot of space.
        • Tests become even larger and less maintainable when you have mock chains (mocks or stubs returning other mocks, which also return mocks, and so on, several layers deep).
    • Notice: While the output-based testing style looks optimal, things are not that easy: this style of unit testing is only applicable to code that is written in a functional way.

Example: Output Based Testing Style

[Fact]
    public void PriceShouldHaveNoDiscount()
    {
        // Arrange
        var tickets = new List<Ticket>()
        {
            new Ticket()
            {
                Price = 10,
                SeatNumber = 1
            },
            new Ticket()
            {
                Price = 10,
                SeatNumber = 2
            },
            new Ticket()
            {
                Price = 10,
                SeatNumber = 3
            },
        };
        var sut = new PriceManager();
        
        // Act
        var result = sut.CalculateTotalPrice(tickets);
        
        // Assert
        Assert.Equal(tickets.Sum(t => t.Price), result);
    }

Example: State Based Testing Style

[Fact]
    public void ShouldAddProductsToAnOrder()
    {
        // Arrange
        var ticket1 = new Ticket() { Price = 20, SeatNumber = 1};
        var ticket2 = new Ticket() { Price = 30, SeatNumber = 2};
        var sut = new Order();

        // Act
        sut.AddProduct(ticket1);
        sut.AddProduct(ticket2);
        
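        // Assert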
        Assert.Equal(2, sut.Tickets.Count);
        Assert.Equal(ticket1, sut.Tickets[0]);
        Assert.Equal(ticket2, sut.Tickets[1]);
    }

Example: Communication Based Testing Style

[Fact]
    public void ShouldSendEmailVerification()
    {
        // Arrange
        var emailGatewayMock = new Mock<IEmailGateway>();
        var sut = new Order(emailGatewayMock.Object);
        
        // Act
        sut.FinalizeOrder();

        // Assert
        emailGatewayMock.Verify(o => o.SendVerificationEmail("john.doe@email.com"), Times.Once);
    }

Valuable Unit Tests

  • Code complexity is defined by the number of decision-making points in the code:
    • Explicit (made by the code itself)
    • Implicit (made by the libraries the code uses).
  • Domain significance shows how significant the code is for the problem domain of your project.
    • Complex code often has high domain significance and vice versa, but not in 100% of all cases.
      • The domain layer has a direct connection to the end users’ goals and thus exhibits a high domain significance.
      • Utility code doesn’t have such a connection.
    • Complex code and code that has domain significance benefit from unit testing the most because the corresponding tests have greater protection against regressions.
      • Note that the domain code doesn’t have to be complex, and complex code doesn’t have to exhibit domain significance to be test-worthy.
  • Unit tests that cover code with a large number of collaborators have high maintenance costs.
    • Such tests require a lot of space to bring collaborators to an expected condition and then check their state or interactions with them afterward.
  • All production code can be categorized into four types of code by its complexity or domain significance and the number of collaborators:
    • Domain model and algorithms (high complexity or domain significance, few collaborators) provide the best return on unit testing efforts.
    • Trivial code (low complexity and domain significance, few collaborators) isn’t worth testing at all.
    • Controllers (low complexity and domain significance, large number of collaborators) should be tested briefly by integration tests.
    • Overcomplicated code (high complexity or domain significance, large number of collaborators) should be split into controllers and complex code.
  • The more important or complex the code is, the fewer collaborators it should have.
  • The Humble Object pattern helps make overcomplicated code testable by extracting business logic out of that code into a separate class. As a result, the remaining code becomes a controller—a thin, humble wrapper around the business logic.
  • Think of the business logic and orchestration responsibilities in terms of code depth versus code width.
    • Your code can be either deep (complex or important) or wide (work with many collaborators), but never both.
  • Test preconditions if they have a domain significance; don’t test them otherwise.
  • There are three important attributes when it comes to separating business logic from orchestration:
    • Domain model testability—A function of the number and the type of collaborators in domain classes
    • Controller simplicity—Depends on the presence of decision-making points in the controller
    • Performance—Defined by the number of calls to out-of-process dependencies
  • You can have a maximum of two of these three attributes at any given moment:
    • Pushing all external reads and writes to the edges of a business operation—Preserves controller simplicity and keeps the domain model testability, but concedes performance
    • Injecting out-of-process dependencies into the domain model—Keeps performance and the controller’s simplicity, but damages domain model testability
    • Splitting the decision-making process into more granular steps—Preserves performance and domain model testability, but gives up controller simplicity
  • Splitting the decision-making process into more granular steps—Is a trade-off with the best set of pros and cons. You can mitigate the growth of controller complexity using the following two patterns:
    • The CanExecute/Execute pattern introduces a CanDo() for each Do() method and makes its successful execution a precondition for Do(). This pattern essentially eliminates the controller’s decision-making because there’s no option not to call CanDo() before Do() (a sketch follows at the end of this list).
    • Domain events help track important changes in the domain model, and then convert those changes to calls to out-of-process dependencies. This pattern removes the tracking responsibility from the controller.
  • It’s easier to test abstractions than the things they abstract. Domain events are abstractions on top of upcoming calls to out-of-process dependencies. Changes in domain classes are abstractions on top of upcoming modifications in the data storage.
    • Abstracting away the application of side effects to external systems.
    • You achieve such abstraction by keeping those side effects in memory until the very end of the business operation, so that they can be tested with plain unit tests without involving out-of-process dependencies.
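
A minimal sketch of the CanExecute/Execute pattern mentioned above (Order, CanShip, and ShippingController are made-up names for illustration): the precondition lives in the domain class, so the controller only relays the result and makes no business decision of its own.

public class Order
{
    public bool IsPaid { get; set; }
    public bool IsShipped { get; private set; }

    // The "CanDo" part: the domain class owns the precondition.
    public string CanShip() => IsPaid ? null : "Order is not paid yet.";

    // The "Do" part: calling it when the precondition fails is a programmer error.
    public void Ship()
    {
        if (CanShip() != null)
            throw new InvalidOperationException(CanShip());

        IsShipped = true;
    }
}

public class ShippingController
{
    // The controller stays humble: no decision-making, it just relays the precondition result.
    public string ShipOrder(Order order)
    {
        string error = order.CanShip();
        if (error != null)
            return error;

        order.Ship();
        return "OK";
    }
}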

Sources

https://blog.devgenius.io/detroit-and-london-schools-of-test-driven-development-3d2f8dca71e5

https://khorikov.org/files/infographic.pdf

Unit Testing Principles, Practices, and Patterns, Vladimir Khorikov

Software Architecture – Part 4: Vertical-Slice Architecture

Introduction

I have noticed over the years that the answer to the question of how to write testable, flexible, and maintainable software is to focus on the features. A feature is a use case from a problem you are trying to solve. I want to point out that there is a place for different architectural approaches in different situations; you have to be aware of them. Some are better suited for some jobs than others.

There are many approaches and patterns out there, but many of them lead to the opposite of testable, flexible, and maintainable software. I always think that if I add complexity to something, it must solve a more complex problem than the newly introduced complexity. Some approaches and patterns are suited for different situations, but the vertical-slice approach is a good fit for many situations, though not all. As always, there is no one way to rule them all :).

With vertical slice architecture, you’re organizing your code across a feature rather than layers. The focus is on features and capabilities rather than technical concerns. You can think of your object’s relationships in two ways: technical and business.

Technical relationships are those between objects that reside in the same technical spectrum, that is, they have the same kind of usage. Two controllers are technically related, but a controller and an application service are not related; their purpose is different.

A business relationship is something that supports the fulfillment of the same use case. The objects can definitely be different from a technical point of view, for example a validator, DTOs, and a command handler.

Within a slice, the vertical slice architecture accepts close coupling; the coupling is limited by the scope of the feature and the size of the vertical slice.

Instead of separating based on technical concerns, Vertical Slices are about focusing on features. This means you’ll take all the technical concerns related to a feature and organize that code together. Focus on organizing by the features and capabilities of your system.

By doing this you’re dealing with coupling in a different way, because slices are for the most part self-contained. They don’t couple to other slices. By self-contained I mean in relation to the domain problem-solving, logic, security, data access, etc.

You can keep related things together, even in one file if that makes sense for your application. Keep things together that change together. Besides reducing context switching, such a split also improves understanding of what is happening in the business, managing dependencies, and ultimately even scaling out. It’s easier to extract features into dedicated microservices.

In vertical slices, each slice can have its own method of data access into a data store. One slice could use micro-ORMs while another a full ORM, and another could use an external API. You may end up with code duplication and this is OK. This is the price to pay for self-containment and loose coupling with vertical slices.

When adding business functionality the changes are “full stack,” meaning that they can include everything from the user interface, downwards. This way we minimize side effects by removing shared code or abstractions between different slices.

You try to encapsulate the slice so that the application does not know what happens inside the slice, you only need to pass input to the slice and receive output.

The vertical slice architecture is also closely related to CQRS (Command Query Responsibility Segregation). A slice usually is a command or a query that your code operates upon.

CQRS is a pattern where we’re segregating application behaviours. We’re splitting them into commands and queries. Commands are intents to do something (e.g. change the state). Queries should return data but do not create side effects. Just like a question should not change the answer. Simple as that. We’re slicing our business domain vertically by operations we can do on it. The split can help us to focus on the specific business operation, reduce complexity and cognitive load, and enable optimisations and scaling.

https://event-driven.io/en/cqrs_is_simpler_than_you_think_with_net6/
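
As a rough sketch of that split (all names here are hypothetical and no particular mediator library is assumed), a slice is typically centered on one command or one query:

// Command: an intent to change state; it causes a side effect and returns no query data.
public record ReserveSeatCommand(int ScreeningId, int SeatNumber);

public class ReserveSeatHandler
{
    public void Handle(ReserveSeatCommand command)
    {
        // Load the screening, reserve the seat, persist. The side effect lives here.
    }
}

// Query: returns data and has no side effects ("asking a question should not change the answer").
public record GetAvailableSeatsQuery(int ScreeningId);

public class GetAvailableSeatsHandler
{
    public IReadOnlyList<int> Handle(GetAvailableSeatsQuery query)
    {
        // Read-only lookup; nothing is modified.
        return new List<int> { 1, 2, 3 };
    }
}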

Because of the slice encapsulation, the vertical slices can choose whatever implementation strategy makes the most sense for that particular slice.

Lastly, I would like to quote Jimmy Bogard’s words on vertical slices:

While layered architectures and vertical slice architecture can safely co-exist in the same application, a vertical slice architecture ensures that any abstractions, encapsulations, or just plain refactorings are introduced when the need arises, and not before. By following a simpler approach from the start, we ensure that our code is only as complex as we need it.

https://headspring.com/2020/08/18/how-vertical-slice-architecture-fulfills-clean-architectures-broken-promises/

What does Vertical Slice architecture try to solve?

  • Tight coupling between layers or sections of code
  • The big ball of mud problem
  • A “glued” together system
  • Complex solution and project hierarchies and structures
  • Minimizing the need for abstractions to what is actually needed
  • Avoiding premature complexity or being as complex as it is necessary and no more
  • Hard and costly to make changes to existing code
  • Avoiding the problems of communication between developers and non-developers

Where to start

I think that vertical slice architecture will benefit from other software architecture approaches. For me, by using the strategic part of Domain-Driven Design together with a domain discovery you can find out what really matters to the client and the application(s) you are building.

Depending on your overall system architecture you could take some pointers from the modular monolith approach and use the vertical slice architecture for your modules.

I think your approach should answer the following questions for any developer opening the code and looking at it:

  • What business problems does it solve?
  • What problems does your company care about?
  • What are the capabilities of the system?

Features and Folders

One crucial factor of vertical-slice architecture is that you must start thinking differently about your code structure and organization. If you have been doing layers-based development, it is easy to create your code files and general code structure to represent layers-based architecture. With vertical slices, you need to start to think in terms of features.

You would normally have folders for the different layers of your code, like controllers, models, commands, queries, repositories, etc. With a feature-based approach you can do things differently: once you know your domain and its parts, you can create a feature file containing all of the classes and logic needed to process a request. Or you could have a folder named after the feature use case that contains all of the needed classes. You could also combine all of the feature use cases into a folder that defines a domain’s worth of feature use cases.

Example feature slices

Here are some slice examples for a game backend:
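
For instance (all names below are made up for illustration), a “get player inventory” slice could keep its request, response, and handler together in one file; sibling slices such as “submit score” or “matchmake player” would live in their own files or folders next to it.

// Feature: GetPlayerInventory. Everything the slice needs lives in this one file.
public record GetPlayerInventoryRequest(Guid PlayerId);

public record InventoryItemDto(string ItemId, int Quantity);

public class GetPlayerInventoryHandler
{
    private readonly GameDbContext _db; // this slice chose a full ORM; other slices may differ

    public GetPlayerInventoryHandler(GameDbContext db) => _db = db;

    public IReadOnlyList<InventoryItemDto> Handle(GetPlayerInventoryRequest request)
    {
        return _db.InventoryItems
            .Where(i => i.PlayerId == request.PlayerId)
            .Select(i => new InventoryItemDto(i.ItemId, i.Quantity))
            .ToList();
    }
}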

Lessened need for abstractions

When moving to vertical slices, you begin to stop thinking about layers and abstractions. The reason is that a narrow vertical slice doesn’t necessarily need an abstraction. The slices are for the most part self-contained. You don’t have to swap out or change entire layers; you swap out the features/slices you want.

Abstractions become less of a concern and you can start leveraging concrete implementations. It’s worth noting that anytime you create your own abstraction you’re likely going to lose functionality.

The problem comes when abstractions are introduced prematurely, i.e., before they are solving a real non-theoretical problem. Adding abstractions always comes at the cost of complexity and, when done excessively, starts slowing down the speed of development and the ability for people to easily comprehend the codebase.

Please understand that software development is all about abstractions, but adding an abstraction when it is not needed does not serve the end result: Keep It Simple, Stupid (KISS) and You Aren’t Gonna Need It (YAGNI).

The points discussed are grounded in principles like Keep it Simple, Stupid (KISS) and You Aren’t Gonna Need It (YAGNI), meaning that we want to keep abstractions to a minimum and only introduce complexity when it provides a significant and real benefit.

https://betterprogramming.pub/avoiding-premature-software-abstractions-8ba2e990930a

The first major difference between Vertical Slices and layered architectures is the lack of abstractions in our slices. Because each slice is encapsulated, it’s straightforward to change the implementation of a handler without affecting any other slice. Typically, an abstraction is introduced to shield the application from implementation details that would make it difficult to swap an implementation for another.

In practice, it is quite difficult to swap implementations behind abstractions, especially in an incremental fashion. For example, if we abstract our database access or ORM behind a repository, it’s not feasible to swap out an implementation on a case-by-case basis, since the concretion is configured at the application layer.

With vertical slices, I can have one slice use an ORM, another use a micro ORM, and another use external APIs. These concrete dependencies may be used amongst many vertical slices, but the decision to move to another strategy doesn’t affect any other slice.

We don’t add abstractions or patterns until the code inside a single slice exhibits code smells that guide our refactoring.

https://headspring.com/2020/08/18/how-vertical-slice-architecture-fulfills-clean-architectures-broken-promises/

Common Premature Abstractions

  1. Responsibilities are abstracted too granularly
  2. Design patterns are used without real benefit
  3. Performance is optimized prematurely
  4. Low coupling is introduced everywhere

Refactoring and Code Quality

As stated before, using the vertical slices architectural approach does require the development team to be familiar with common code smells and the refactoring techniques to get rid of them.

Possible code smells and problems in logic and/or behavioral code:

Refactoring techniques that can be used to get rid of the above problems:

Key Points

  1. Things that change together, belong together
  2. Maximize cohesion along axes of change and minimize coupling between them

Benefits

  • Project structure
    • More cohesive feature folders
    • More discoverable folder structures
    • Less “cognitive load” flipping around files/folders
    • Screaming architecture (easy to remember what the system does)
  • Overall system simplicity
    • Consistent testing strategy with a high ROI (acceptance test for each feature)
    • Simpler architecture
    • Easy to keep tabs on the coupling between features/vertical slices (SRP)
  • Refactoring becomes easier because the changes are self-contained within a slice. You don’t need to worry about breaking something in another location in the application, like when using a layered architecture and having a shared method.
  • Testing becomes clearer, precise, and easier to understand.
  • Things are easier to understand and maintain because: Things that change together, belong together
  • Development starts to be more about additions than modifications. You end up adding new functionality and files rather than modifying existing classes and methods.
  • It is easier for developers to have a mental picture of what is happening because of the self-contained nature of slices. A slice’s size and complexity should be rather low, and if this is not the case, some analysis and refactoring is in order.
  • Each slice can evolve independently from other slices and logic. This gives you the freedom and the sense of security to make changes.
  • It forces the developers to focus on the business and products, rather than technical aspects and how to tie things together.
  • The system as a whole is not “glued” together; changes are easy to make, and removing things is equally easy and fast.

Drawbacks

  • Needs a new understanding of how to write code and organize it
  • Need to have refactoring skills and knowledge related to refactoring
  • Need to understand when things get too large or complex and need refactoring
  • Needs knowledge of your domain to take good advantage of the vertical slice architecture and create good code

Sources

https://headspring.com/2020/08/18/how-vertical-slice-architecture-fulfills-clean-architectures-broken-promises/

https://headspring.com/2019/11/05/why-vertical-slice-architecture-is-better/

https://jimmybogard.com/vertical-slice-architecture/

http://www.codingthearchitecture.com/2014/06/01/an_architecturally_evident_coding_style.html

https://dev.to/htech/exploring-vertical-slices-in-dotnet-core-3mik
https://blog.ttulka.com/package-by-component-with-clean-modules-in-java
https://www.ghyston.com/insights/architecting-for-maintainability-through-vertical-slices
http://www.codingthearchitecture.com/2015/03/08/package_by_component_and_architecturally_aligned_testing.html
https://codeopinion.com/organizing-code-by-feature-using-vertical-slices/
https://www.kenneth-truyers.net/2016/02/02/vertical-slices-in-asp-net-mvc/
https://www.markhneedham.com/blog/2012/02/20/coding-packaging-by-vertical-slice/
https://timgthomas.com/2013/10/feature-folders-in-asp-net-mvc/
http://www.kamilgrzybek.com/design/feature-folders/
https://lostechies.com/jimmybogard/2013/10/29/put-your-controllers-on-a-diet-gets-and-queries/
https://lostechies.com/jimmybogard/2013/12/19/put-your-controllers-on-a-diet-posts-and-commands/
https://builtwithdot.net/blog/changing-how-your-code-is-organized-could-speed-development-from-weeks-to-days
https://jimmybogard.com/migrating-contoso-university-example-to-razor-pages/
https://ardalis.com/api-feature-folders/
https://ardalis.com/moving-from-controllers-and-actions-to-endpoints-with-mediatr/
https://medium.com/@jacobcunningham/out-with-the-onion-in-with-vertical-slices-c3edfdafe118
https://www.reddit.com/r/dotnet/comments/m1t6g3/no_abstractions_in_vertical_slice_architecture/
https://khalilstemmler.com/articles/software-design-architecture/feature-driven/
https://jimmybogard.com/composite-uis-for-microservices-vertical-slice-apis/
https://event-driven.io/en/how_to_slice_the_codebase_effectively/
https://codeopinion.com/fat-controller-cqrs-diet-vertical-slices/
https://event-driven.io/en/cqrs_facts_and_myths_explained/
https://www.betterask.erni/news-room/slices-vs-layers/
https://codeopinion.com/clean-architecture-example-breakdown/
https://headspring.com/2020/09/02/testing-done-right-with-vertical-slice-architecture/
https://codeopinion.com/restructuring-to-a-vertical-slice-architecture/
https://betterprogramming.pub/avoiding-premature-software-abstractions-8ba2e990930a
https://event-driven.io/en/cqrs_is_simpler_than_you_think_with_net6/
https://codeopinion.com/organize-by-feature/

https://thevaluable.dev/abstraction-type-software-example/

Software Architecture – Part 3: Modular Monolith

Introduction

I am one of those developers who have been a part of creating monoliths (bad and good ones 🙂 ) and also created distributed systems using microservices or serverless architectures. I also have come to realize that a monolith architecture is not bad at all but is actually the way to go unless specific criteria are met to go for a distributed architecture.

But making a monolith using poor design and architectural approaches does lead to the dreaded big ball of mud problem. This is why I have come to like and partially love the modular monolith approach combined with Domain-Driven Design. Also closely related is the vertical slice architecture which will have its own post later.

Each approach in design and architecture you choose will have pros and cons. You have to be aware of them before you start using them. There rarely is a silver bullet that fixes all problems. In the real world, you have a good architecture when it addresses and solves most of the things you are trying to solve. You have to understand the requirements and needs of a project and have knowledge of architecture and critical thinking.

Modular Monolith vs Distributed Systems

Let’s start by examining the modular monolith architecture compared to the distributed systems architecture. The reason we do this is that they are actually quite similar in many ways. Technically the system architecture is different for them both but the underlying concepts you use in a distributed systems architecture like microservices will apply to the modular monolith one also.

The way you discover your key concepts, features, use cases, modules, services, and logic will be quite similar. In both cases, I would prefer to use Domain-Driven Design to find out what needs to be done and how to do it and perhaps combine the process with a domain discovery methodology like EventStorming.

In a microservices architecture, you will be creating separate services for your domains, you would have domain models, events, persistence layers, etc. With a modular monolith approach, you would have the same things but instead of having your architecture distributed physically and logic-wise, you most likely will have one logic (code) artifact and far fewer physical resources.

Monolith                                | Microservice
One artifact                            | Many individual services
Entanglement risk                       | Focus on clear, small units
Simple method calls                     | Unreliable network calls
All parts are always up and available   | Service discovery + internal load balancing + circuit breakers
Easy interface refactoring              | Difficult to refactor
Application scales as a unit            | Services scale individually
One database                            | Polyglot persistence
Transactions                            | Eventual consistency
One platform                            | Platform choice
https://lukashajdu.com/post/majestic-modular-monolith/

Monoliths are easier to start with and expand from there. If you use the modular monolith as your starting approach, your code and logic will be quite close to a microservices architecture services and you could start creating services quite easily.

Generally, a monolith approach is definitely the better approach for smaller teams and companies with “non-complex” requirements and functionality. A distributed architecture like microservices is better for a large organization with complex requirements and functionality with a large workforce and division of labor.

Keep in mind that both architectural approaches have their benefits and uses, but using them wrong will end up with bad results. For example, going from a monolith to microservices to get the benefits of a loosely coupled services architecture is no good if you end up calling services without clear boundaries. You will just end up with a distributed monolith. Similarly, using the modular monolith approach without equally clear boundaries between modules just produces another tightly coupled monolith.

Productivity

Also, the project productivity will differ wildly in both architectural styles depending on the size and complexity of what you are building. The monolith approach will generally be more productive when used in low complexity with small teams but will likely become a problem when the complexity rises to such a level that a microservices approach is better suited, especially if the number of modules rises to the hundreds.

Performance

The same applies to performance: when the complexity and calculations are small and easy, a monolith is more than fine. But eventually, as things grow, a monolith might not be enough, and you start to run into scalability and deployment considerations. At some point, a single microservice will kick ass in performance compared to the same module running in a monolith instance. A microservice can be allocated just the right amount of resources and thus does not slow down any other parts of the architecture.

Scaling

Scaling is more efficient in a microservices architecture compared to a monolith because you have more fine-grained control: you only scale those modules that need to be scaled. When scaling a monolith vertically (more resources), those resources are not used efficiently within the monolith, and scaling horizontally (new nodes/instances) suffers from the same resource inefficiency.

Failure impact

Failure impact: a monolith runs as a single process, and when something breaks it can take down the whole application. The way around this is of course to have multiple nodes/instances. With a microservice, only the service that failed will go down, but this is not without its own problems either. You need different approaches to deal with broken nodes/services, and this adds a layer of complexity that is hard to manage.

Heterogeneous Technology

A monolith can be, and usually is, developed around a certain technology stack, like C# + .NET. This can be good for productivity and performance, but it can also lead to legacy code and functionalities. In such a case some things can’t be upgraded to newer technologies and features. Sometimes even security updates are hard to apply without refactoring a sizable chunk of code to comply with the new versions of languages and libraries.

With a distributed system like microservices, each service can operate on a different technology stack, thus allowing the use of totally new and fresh choices without any past baggage.

Key features and concepts

Let’s start with what you should do first when starting with a modular monolith: use Domain-Driven Design with a domain discovery method like EventStorming. You want to do this to avoid the most common problems associated with monoliths, that is, creating code that eventually is highly coupled, hard to understand, and hard to change and maintain. I want to stress this point again: you need something like DDD to discover your modules and how they work internally and externally with each other.

DDD should give you a clear design and boundaries for your application, it should show you your modules. Then you can start to focus on the main parts of any module:

  • Code
  • Data
  • Communication
  • Scalability

Code

What you need to do first is to find out your modules. As I wrote above, I would suggest using something like DDD and a domain discovery method. Once you have discovered your domain, your boundaries, and the ubiquitous language, you can start to work on the code implementation.

If we think of traditional layered architecture, we can introduce the modules as vertical slices that go through all the layers, to give you an idea:

These slices each represent a module or boundary that needs to be respected and protected so that it is not misused. To have a loosely coupled monolith, you do not call another module’s implementation from your code; you communicate with other modules through their contracts.

This approach can also be thought of as a feature-based approach or vertical slice architecture. The idea is that the slice/module is self-contained: it owns everything in the slice and can’t be accessed from the outside world except through pre-defined ways and through specific location(s). The modules expose and communicate to the outside world only what is necessary, keeping the internal workings and data ownership to themselves.

It should be pointed out that you can use different architectural implementations to create the code like:

  • DDD Tactical Design
  • Clean or Hexagonal Architecture
  • Vertical Slice Architecture
  • CQRS
  • etc…

Code organization

Depending on your technology stack and choice of programming languages + tools, you may have different options:

  • A separate project for each module (a library or package that is a module of the same app)
  • The same project, organized through a folder structure where root folders represent the different modules with their internal folder and code structures
  • Your code can reside in different repositories or in the same one, but the end product is one combined app
  • Features, domain concepts, aggregate roots, etc

Data

When it comes to the data, the most important thing is that the module has to own its data. Other modules do not have direct access to your module’s data. The other thing is that if you need data from another module, you can take that data and put it into a data schema that helps your module with its data processing, without burdening the other modules.

So, each module can store its data in a separate database or in a single shared database, as long as each module owns its schemas and no other boundary can access them.

Modular application database structure

Database object per module: we need to ensure that each module only accesses its tables or database objects.

No shared tables between modules: if we take a closer look at the database on the image above we can see that for every module that we have on the top, we have an equivalent set of database objects on the bottom and every module is only accessing its collection. There is no sharing of tables or objects between modules.

No cross-module joins: we don’t want joins between modules because we want to keep our options open for the spectrum of code isolation. We want to have joins only between tables of the same module. Once we get to a point where the integrated system isn’t enough for a module, we want to be able to pull it out into an independent service. It will be difficult to achieve if we would have joins on the database level. Joins should be handled by the APIs.

Referential integrity: we can still maintain referential integrity and transactions because if we want to separate a module later, we could just remove the foreign key relationship and then move that part of the data in its separate persistence entity away from the rest of the database. Until we need to do that we can still reap all the benefits.

https://lukashajdu.com/post/majestic-modular-monolith/

Data isolation

Separate tables: the simplest option is to use the same database technology with the same database and the same schema. Every module has its own set of tables in this option. However, this option gets more complicated with the increasing size of the database. It’s easy to accidentally create joins between tables belonging to different modules.

Separate schema: another step up is to use a database schema which provides namespacing. It allows to group together a bunch of database objects logically.

Separate database: following option is to move some of the schemas along with modules into their separate databases. We keep the same database technology to have just one set of expertise we need to manage in-house on the operations side.

Other persistence: if our current technology to underpin the persistence of our modules isn’t appropriate for what we need, and we need to move to a vastly different type of persistence, the last step is to move out.

https://lukashajdu.com/post/majestic-modular-monolith/
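
As a small illustration of the “separate schema” option (assuming EF Core; BillingDbContext and Invoice are hypothetical), each module’s DbContext can pin its tables to its own schema so no other module’s code touches them:

public class BillingDbContext : DbContext
{
    public DbSet<Invoice> Invoices => Set<Invoice>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Every Billing table lives in the "billing" schema; other modules use their own schemas.
        modelBuilder.HasDefaultSchema("billing");
    }
}

public class Invoice
{
    public Guid Id { get; set; }
    public decimal Total { get; set; }
}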

Communication

As stated earlier, the communication between different modules should be done through contracts. You need to avoid modules directly using each other’s implementations or data stores to manipulate state or otherwise alter or request certain behavior in another module.

A contract can be an interface or a delegate with DTOs. It can also be an event/message with the support of a message broker and a message processor to handle the message.
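
A minimal sketch of such a contract (the Billing and Ordering module names are made up for illustration): the Billing module exposes an interface and a DTO, and other modules depend only on these, never on Billing’s internal classes or tables.

// Public contract of the Billing module; this is all that other modules are allowed to see.
public interface IBillingContract
{
    InvoiceDto GetInvoice(Guid orderId);
}

public record InvoiceDto(Guid OrderId, decimal Total);

// Inside the Ordering module: it talks to Billing only through the contract.
public class OrderSummaryService
{
    private readonly IBillingContract _billing;

    public OrderSummaryService(IBillingContract billing) => _billing = billing;

    public string Summarize(Guid orderId)
    {
        var invoice = _billing.GetInvoice(orderId);
        return $"Order {orderId} total: {invoice.Total}";
    }
}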

Scalability

As stated earlier, you can scale a monolith vertically, that is, adding more resources: CPU, memory, hard drive, etc. Or you can scale horizontally, that is, adding more instances of your monolith to run in your system architecture. You usually do this by adding a load balancer if you are building an API. You may also need to consider scaling the data storage horizontally; you may have to build logic that takes into account the data storage needs of your monolith instances.

Sources

https://files.gotocon.com/uploads/slides/conference_12/515/original/gotoberlin2018-modular-monoliths.pdf

http://www.kamilgrzybek.com/design/modular-monolith-primer/

https://martinfowler.com/bliki/MonolithFirst.html

https://chrisrichardson.net/post/refactoring/2019/10/09/refactoring-to-microservices.html

https://sookocheff.com/post/architecture/making-modular-monoliths-work/

https://codeopinion.com/loosely-coupled-monolith/

https://www.thereformedprogrammer.net/evolving-modular-monoliths-1-an-architecture-for-net/

https://threedots.tech/post/microservices-or-monolith-its-detail/

https://lukashajdu.com/post/majestic-modular-monolith/

https://medium.com/design-microservices-architecture-with-patterns/monolithic-to-microservices-architecture-with-patterns-best-practices-a768272797b2

https://codeopinion.com/long-live-the-monolith-monolithic-architecture-big-ball-of-mud/

https://codeopinion.com/scaling-a-monolith-horizontally/

https://blog.ttulka.com/good-and-bad-monolith

https://itnext.io/easy-modular-monolith-part-1-mvp-d57f47935e24

https://www.youtube.com/watch?v=BOvxJaklcr0
