Coupling in software development

In this blog post I will share some ideas on coupling and how to approach it. Some of these ideas are based on a course on system design by Udi Dahan that I’m going through.

To start with: in object-oriented programming there is almost always going to be coupling. If a piece of code had no coupling at all, it most likely would not be very useful.

It is the amount of coupling that is of concern. As developers we usually aim for loosely coupled code, but gradually, as a project grows and the code grows, what was once a nice and easy-to-understand piece of code becomes highly coupled and hard to read/maintain.

Also things in real life aren’t that simple; it isn’t just about aiming for loosely coupled code but more of understanding your system and making the best decisions based on that information. You may end up adding some tightly coupled elements to avoid other bigger problems.

Afferent and Efferent coupling

There are two types of coupling terms:

  • Afferent => Incoming
  • Efferent => Outgoing

In English from wikipedia:

Afferent Couplings (Ca): The number of classes in other packages that depend upon classes within the package is an indicator of the package’s responsibility. Afferent = incoming.

Efferent Couplings (Ce): The number of classes in other packages that the classes in the package depend upon is an indicator of the package’s dependence on externalities. Efferent = outgoing.

So coupling is a measure of dependencies.

Afferent Coupling :

  • Who depends on you.
  • Measure of how many other packages use a specific package.
  • Incoming dependencies.
  • Example: Self-contained, used in a number of places
    • Library
    • Framework
    • Logging

Efferent Coupling :

  • Who do you depend on.
  • Measure of how many different packages are used by a specific package.
  • Outgoing dependencies.
  • Example: Usually used in a presentation layer; user interactions
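
To make the two metrics concrete, here is a small illustrative sketch in Java; the packages and class names are made up for the example.

// File: com/example/logging/AppLogger.java
// High afferent coupling: many packages call this class, while it depends on almost nothing itself.
package com.example.logging;

public final class AppLogger {
    public static void info(String message) {
        System.out.println("[INFO] " + message);
    }
}

// File: com/example/web/OrderController.java
// High efferent coupling: a presentation-layer class reaching out to many other packages.
package com.example.web;

import com.example.logging.AppLogger;
import com.example.orders.OrderService;     // hypothetical
import com.example.pricing.PriceCalculator; // hypothetical
import com.example.security.CurrentUser;    // hypothetical

public class OrderController {
    private final OrderService orders = new OrderService();
    private final PriceCalculator pricing = new PriceCalculator();

    public void placeOrder(CurrentUser user, String productId) {
        AppLogger.info("Placing order for " + user.name());
        orders.place(user, productId, pricing.priceFor(productId));
    }
}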

Hidden coupling

  • Shared resources between applications, like a database. Actions performed by one application may affect the other application negatively.

Coupling aspects for systems

Platform

Coupling between two or more pieces of code or applications; interoperability. This means using a mechanism of communication that is limited to certain applications.

Temporal

Service A calls Service B, Service A waits for B to finish.

If the communication is synchronous, the services are tightly coupled in time.

If the communication is asynchronous, they are more loosely coupled.
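
As a rough illustration (the service and message types below are made up), the difference looks something like this in code:

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// A minimal sketch of temporal coupling; ServiceB and the message type are hypothetical.
public class TemporalCouplingSketch {

    record CreateInvoiceRequested(String orderId) {}

    interface ServiceB { String createInvoice(String orderId); }

    private final ServiceB serviceB;
    private final Queue<CreateInvoiceRequested> outbox = new ConcurrentLinkedQueue<>();

    TemporalCouplingSketch(ServiceB serviceB) { this.serviceB = serviceB; }

    // Tightly coupled in time: this code cannot make progress until Service B answers.
    void placeOrderSynchronously(String orderId) {
        String invoiceId = serviceB.createInvoice(orderId); // blocks; B must be up right now
        System.out.println("Shipped order " + orderId + " with invoice " + invoiceId);
    }

    // Loosely coupled in time: the request is queued and Service B can handle it later.
    void placeOrderAsynchronously(String orderId) {
        outbox.add(new CreateInvoiceRequested(orderId));    // fire-and-forget message
        System.out.println("Shipped order " + orderId + ", invoice requested");
    }
}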

Spatial

How bound your solution is to specific machines or services.

If your solution stops functioning when a machine or service goes down, it is tightly coupled.

If your solution continues to work while a machine or service goes down and restarts, it is loosely coupled.

Solutions

General

  • In your deployment pipelines, add a check on how many incoming or outgoing dependencies a piece of code has. If the count is higher than a certain threshold, fail the build (see the sketch after this list).
  • Use code reviews to talk about the failed build to see what to do about the dependencies.
  • Use tools that analyze your project and code to get insights into your project.
  • Minimize afferent and efferent coupling. Zero coupling isn’t really possible.
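
As an example of the pipeline check mentioned above, here is a rough sketch assuming the JDepend library (jdepend.framework); the exact API, directory and threshold are assumptions used only for illustration:

import java.util.Collection;
import jdepend.framework.JDepend;
import jdepend.framework.JavaPackage;

// A build-time coupling gate: fail the build if any package exceeds a coupling threshold.
public class CouplingGate {
    public static void main(String[] args) throws Exception {
        JDepend jdepend = new JDepend();
        jdepend.addDirectory("target/classes");      // compiled classes of the project
        Collection<?> packages = jdepend.analyze();

        int threshold = 20;                           // arbitrary example limit
        for (Object obj : packages) {
            JavaPackage p = (JavaPackage) obj;
            if (p.efferentCoupling() > threshold || p.afferentCoupling() > threshold) {
                System.err.println("Coupling too high in " + p.getName());
                System.exit(1);                       // fail the build
            }
        }
    }
}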

Specific: Platform

  • Text-based representation on the wire (XML/JSON), with or without schema
    • Notice: Schemas are more about developer productivity than interoperability; they are about saving development time by not writing everything yourself
  • Use standards based transfer protocol like HTTP, SMTP, UDP
    • Consider which protocol is best suited for you based on different functionality and aspects. Different protocols have different level of functionality and reliability. Use the right tool for the job
  • Consider the benefits of existing standards to solve your problems: SOAP / WSDL / REST
  • Generally, be aware of existing standards and technologies. Know what they do and what suits your needs, and choose the right tool. If one technology supports the things you need and another does not, consider using the one that is time-tested and ready to use rather than creating your own version of the same functionality

Specific: Temporal

  • Avoid multi-threaded solutions to the problem; they bring about other problems such as:
    • Deadlock – Occurs when two competing processes are waiting for each other to finish, allowing neither to finish.
    • Starvation – Occurs when a process never gains access to resources, never allowing the program to finish.
    • Race Conditions – Occur when operations that must happen in a particular order happen out of order because of multiple threads; more specifically, a data race.
    • Livelock – Occurs when two threads depend on each other's signals and both keep responding to each other's signals. In this case the threads can end up in a loop, somewhere between a deadlock and starvation.
    • Increased Complexity – Multithreaded processes are quite complicated, and coding for them is best handled by experienced programmers.
    • Complications due to Concurrency – It is difficult to handle concurrency in multithreaded processes. This may lead to complications and future problems.
    • Difficult to Identify Errors – Identification and correction of errors is much more difficult in multithreaded processes than in single-threaded processes.
    • Testing Complications – Testing is more complicated in multithreaded programs than in single-threaded programs, because defects can be timing-related and not easy to identify.
    • Unpredictable Results – Multithreaded programs can sometimes lead to unpredictable results, as they are essentially multiple parts of a program running at the same time.
    • Complications for Porting Existing Code – A lot of testing is required when porting existing code to multithreading. Static variables need to be removed and any code or function calls that are not thread safe need to be replaced.
  • Consider a publisher/subscriber pattern (see the sketch after this list)
    • Subscriber must be able to make decisions based on somewhat stale data. In a distributed system this is inevitable.
    • Requires a strong division of responsibility between publishers and subscribers.
    • Only one logical publisher should be able to publish a given kind of event, being the source of truth; a single source of truth.
    • Where (and why) not to do pub/sub: when business requirements demand consistency
    • Note: If you can’t do pub/sub between two services, you can’t really do request/response either, which means that you need to consider combining the two services into one
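
To make the pub/sub idea concrete, here is a minimal in-process sketch; the event and class names are made up, and a real system would use messaging infrastructure instead of an in-memory list:

import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// A minimal in-process publish/subscribe sketch.
public class PubSubSketch {

    record OrderAccepted(String orderId) {}            // an event: something that happened

    interface Subscriber { void handle(OrderAccepted event); }

    static class Publisher {
        private final List<Subscriber> subscribers = new CopyOnWriteArrayList<>();

        void subscribe(Subscriber s) { subscribers.add(s); }

        // The publisher is the single logical source of truth for this event type.
        void publish(OrderAccepted event) {
            for (Subscriber s : subscribers) {
                s.handle(event);                        // subscribers cannot "reject" the event
            }
        }
    }

    public static void main(String[] args) {
        Publisher sales = new Publisher();
        sales.subscribe(e -> System.out.println("Billing saw " + e.orderId()));
        sales.subscribe(e -> System.out.println("Shipping saw " + e.orderId()));
        sales.publish(new OrderAccepted("order-42"));
    }
}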

Events design

  • Avoid requests/commands; Bad: “SaveCustomerRequested”
  • State something that happened (past tense). Subscribers shouldn’t be able to invalidate this; Good: “OrderAccepted”
  • If you have to talk about data, state its validity; ProductPriceUpdated { Price: $5, ValidTo: 1/1/15 }
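
For illustration, the guidelines above could look like this as simple event types (the names and fields are made up):

import java.math.BigDecimal;
import java.time.LocalDate;

// Illustrative events following the guidelines above.
public class EventDesignSketch {

    // Bad: a command in disguise; subscribers could effectively "refuse" to save the customer.
    record SaveCustomerRequested(String customerId) {}

    // Good: states a fact in past tense that no subscriber can invalidate.
    record OrderAccepted(String orderId) {}

    // Good: when data is included, its validity is stated explicitly.
    record ProductPriceUpdated(String productId, BigDecimal price, LocalDate validTo) {}

    public static void main(String[] args) {
        ProductPriceUpdated event =
            new ProductPriceUpdated("product-7", new BigDecimal("5.00"), LocalDate.of(2015, 1, 1));
        System.out.println(event);
    }
}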

Specific: Spatial

  • Spatial coupling is about first considering the logical elements and only then the physical elements like consumers, load balancing etc. => Routing
    • Strongly-typed messages simplify routing vs document-centric messaging
      • Avoid content-based routing because it creates a lot of logical coupling (a large amount of logic and properties defining the many ways a message can be processed, like a real-life document)
      • Prefer smaller more specific messages; split larger ones into smaller ones
        • They should clearly articulate what needs to be done
  • Application-level code should not need to know where cooperating services are on the network
  • Delegate communications to a lower layer – the service agent pattern: myAgent.Send(message); (see the sketch below)
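
A minimal sketch of the service agent idea mentioned above, with made-up names; the point is that application code only calls the agent and never sees the network location:

public class ServiceAgentSketch {

    record PlaceOrder(String orderId) {}

    interface ServiceAgent { void send(PlaceOrder message); }

    // The agent owns the spatial details: endpoint address, transport, retries.
    static class QueueBackedAgent implements ServiceAgent {
        private final String endpoint;
        QueueBackedAgent(String endpoint) { this.endpoint = endpoint; }
        @Override public void send(PlaceOrder message) {
            System.out.println("Routing " + message + " to " + endpoint);
        }
    }

    public static void main(String[] args) {
        ServiceAgent myAgent = new QueueBackedAgent("queue://orders"); // resolved from config, not from application code
        myAgent.send(new PlaceOrder("order-42"));                      // application-level call
    }
}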

Software Development fallacies every software developer should know

These are my notes on a course on distributed systems by Udi Dahan.

I’m listening to it and decided that the best way to learn is to take notes while I listen, and maybe share what I learn once in a while.

Let’s start by defining what a system is:

Systems are not applications, they are made up of multiple executable elements on multiple machines, with multiple sources of information. A system deals with connectivity.

An application is a single executable that runs on a single machine and usually doesn’t know about connectivity.

Each executable within a system is not necessarily an application; it could be a script, but it must deal with connectivity.

Fallacies by developers and architects

  • The network is reliable
  • Latency isn’t a problem
  • Bandwidth isn’t a problem
  • The network is secure
  • The topology won’t change
  • The administrator will know what to do
  • Transport cost isn’t a problem
  • The network is homogeneous
  • The system is atomic/monolithic
  • The system is finished
  • Business logic can and should be centralized

The network is reliable

  • Hardware can fail
  • Software crashes and has bugs
  • Security may be a problem

Solutions

Networks are unreliable, so using different design and architecture solutions you can get around the problem or minimize it.

  • Retry and acknowledge (see the sketch after this list)
    • Synchronous situations
  • Store & Forward
    • Asynchronous situations
  • Transactions
    • Usually during integrations with multiple data sources
  • Reliable Messaging Infrastructure (Message Queues)
    • MSMQ
    • SQS
    • Azure Service Bus
    • ActiveMQ
    • RabbitMQ
    • NServiceBus
    • AMQP
    • Notice: Once you move to an MQ-based architecture you lose the synchronous request/response style of communication. Note that some MQ solutions do provide two-way communication, back and forth.
  • Testing:
    • Test with unreliable network configurations or by simulating them (your browser may include this in the developer tools)
  • Caching: This is not optimal, but it may offer a solution when part of the system architecture is failing; with caching you can still provide data while the problem is being fixed
    • HTTP Caching headers
    • Redis
    • Application In-Memory cache
  • Avoid distribution of objects in your system, otherwise the above solutions have to be taken into consideration
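
As a rough sketch of the first two solutions above (retry and acknowledge, store & forward), assuming a hypothetical transport interface:

import java.util.ArrayDeque;
import java.util.Queue;

// "Retry and acknowledge" plus "store & forward" over an unreliable transport.
public class UnreliableNetworkSketch {

    interface Transport { boolean sendAndAwaitAck(String message); } // true = acknowledged

    private final Transport transport;
    private final Queue<String> outbox = new ArrayDeque<>();         // store & forward buffer

    UnreliableNetworkSketch(Transport transport) { this.transport = transport; }

    // Synchronous style: retry a few times and insist on an acknowledgement.
    boolean sendWithRetry(String message, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (transport.sendAndAwaitAck(message)) {
                return true;                                          // acknowledged
            }
        }
        return false;
    }

    // Asynchronous style: store locally now, forward later when the network cooperates.
    void store(String message) { outbox.add(message); }

    void forwardPending() {
        while (!outbox.isEmpty() && transport.sendAndAwaitAck(outbox.peek())) {
            outbox.poll();                                            // remove only after the ack
        }
    }
}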

Latency isn’t a problem

Latency measures the delay between an action and a response. Over the Internet, individual delays at the application or packet level can accumulate, as each element in the chain of communication is linked to others.

Latency can occur from several sources:

  • Your application code
  • The libraries and development methods
    • Like ORM with lazy-loading
    • We code as if latency is zero, we expect that data is immediately available
  • Network
  • Disk
  • Operating system

Modern-day solutions rely more and more on the network and on distributed systems. Think of cloud-based solutions, microservices and serverless computing.

Solutions

  • Use DTOs to pack your data and send it less frequently over the network
  • Minimize the effects:
    • Asynchronous programming
    • Parallel programming
    • WebSockets
  • Network I/O
    • Use faster networking
    • Eliminate network hops. In clustered queuing and storage systems, data can be horizontally scaled across many host machines, which can help you avoid extra network round-trip connections.
    • Keep client and server processes close together, ideally within the same datacenter and on the same physical network switch.
    • If your application is running in the cloud, keep all processing in one availability zone.
  • Disk I/O
    • Avoid writing to disk. Instead use write-through caches or in-memory databases or grids: CDN, Redis etc
    • If you do need to write to disk, combine writes where possible. The goal is to optimize algorithms to minimize the impact of disk I/O.
    • Use fast storage systems, such as SSDs
  • Operating environment / Operating System
    • Run your application on dedicated hardware so other applications can’t inadvertently consume system resources and impact your application’s performance.
    • Be wary of virtualization—even on your own dedicated hardware, hypervisors impose a layer of code between your application and the operating system.
    • Understand the nuances of the programming language environment used by your application, like garbage collection.
  • Your code
    • Inefficient algorithms are the most obvious sources of latency in code.
    • Multithreaded locks stall processing and thus introduce latency. Consider design patterns to avoid the issue.
    • Blocking operations cause long wait times, so use an asynchronous (nonblocking) programming model to better utilize hardware resources, such as network and disk I/O.
  • Generally
    • Don’t cross the network if you don’t have to; if you do have to, take as much data with you as possible
    • Inter-object communication shouldn’t cross the network; consider caching
  • Take into account geography; keep things as close as possible or use technologies like CDN

Bandwidth isn’t a problem

More bandwidth is available year by year but the amount of data grows faster.

Networks can congest and slow things down if lots of data is being transferred.

ORMs eagerly fetching data can also consume bandwidth unnecessarily.

General Common Causes of Bandwidth Issues

Bandwidth issues can usually be traced to specific activities involving large amounts of data or extended duration:

  • Watching videos from Internet (YouTube, Netflix) 
  • Video calls
  • Cloud storage
  • Large file transfers
  • Constant stream of data
  • Downloading files from internet

Solutions

  • Design your architecture so that you have the option to split things into their own resources
  • Divide data sources into different networks with their own network resources, based on the needs of the data.
  • If possible, avoid eager fetching and lazy loading; set up restrictions and limitations on how data is used
  • Have more than one domain model to resolve the competing forces of bandwidth and latency
    • Having different sets of objects for different sets of operations, like read, write and delete (see the sketch after this list)
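
A small sketch of the last point, with made-up types; one model for writes, a slimmer one for reads:

import java.math.BigDecimal;

// Separate models for separate operations, so queries don't drag the full object graph over the network.
public class SeparateModelsSketch {

    // Write model: everything needed to validate and persist a change.
    record CreateOrder(String customerId, String productId, int quantity) {}

    // Read model: a slim projection sized for what a listing screen actually needs.
    record OrderSummary(String orderId, String productName, BigDecimal total) {}

    interface OrderWriteService { String handle(CreateOrder command); }
    interface OrderReadService  { OrderSummary summaryFor(String orderId); }
}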

The network is secure

Networks are not secure; many things can contribute to this:

  • Human beings
  • Software bugs
  • Firewall configurations
  • Network configurations
  • Access Rights
  • A virus or Trojan
  • Physical devices like USB memory sticks, DVDs, CDs

Solutions

I have a previous blog post that describes quite well what you need to do as a developer: https://lionadi.wordpress.com/2020/03/23/lessons-learned-from-building-microservices-part-2-security/

The topology won’t change

Today a system’s topology is far more complex than it was in the past. This is due to several factors like cloud computing, serverless, complex architecture solutions etc.

Servers can go down or be moved to a different location, like a different subnet. With cloud computing, changes in network topology have become a constant thing; consider Kubernetes and Docker for example.

Solutions

  • Testing to see if everything is working; early in development and also in production
  • Don’t hard-code IP addresses and check to see that this hasn’t happened accidentally
  • Consider using resilient protocols (multicast)
  • Discovery mechanisms can work, but hard to get right
  • For async full-duplex communications be aware of locking threads if clients disconnect and the server is trying to reach them
  • Simulating failures and problems
  • Be aware of modern day devops and practices where topology related details are configurable and automated; this can cause problems if the changes are not taken into consideration
  • Run performance tests with resources going down or being slow to come back up
  • Run performance tests early and often; before going to production
  • In very rare cases, if you have failed at the above, you may as a last resort run all of your system's parts on one gigantic monolithic server to avoid the problems discussed. Notice: Not recommended; do the things above early on.

The administrator will know what to do

  • No single person can know everything
  • People change jobs, positions or retire
  • Documentation is hard to keep up to date
  • Multiple admins may have problems managing things coherently; people make changes and updates that cause problems others are not aware of
  • There is usually a large amount of configuration to keep track of, and with automation people have a false sense of security
  • Configuration problems can lead to time consuming debugging and problem solving
  • Upgrades = Downtime; delaying upgrades causes them to pile up and break the system when a major upgrade finally occurs

Solutions

Transport cost isn’t a problem

  • Serialization before crossing the network (and de-serialization on the other side) takes time.
    • In the cloud, it can be a big cost factor.
    • We don’t usually measure the impact of serialization and de-serialization
  • Encryption and decryption add a compute cost factor
  • Hardware network infrastructure has upfront and ongoing costs

Solutions

  • Test the impact of serialization and de-serialization
  • Test to see how expensive your system is; see how much money your system will eat up under load, especially in the cloud
  • Avoid “chatting” over the network
  • You need to make trade-offs between infrastructure costs and development costs – upfront vs. ongoing

The network is homogeneous

  • Today there are many more programming languages (Java, JS, C#, Ruby), database types (SQL, NoSQL, Graph), API types (REST, WebAPI)
  • All these add to complexity and interoperability
  • Being “Open” software doesn’t necessarily mean that it integrates well
  • Semantic interoperability will be hard; for example, time representations differ between Linux and SQL Server

Solutions

It is hard to come up with specific solutions due to the complexity of the issue. You could start by trying to minimize the number of technologies within your system and keep things as homogeneous as possible.

The system is atomic/monolithic

  • A change in one part of the system affects other parts; maintenance is hard. This concerns the logic, the code, more than the physical parts of the system.
    • Tests in any form may give you false positives that everything is working as it should
  • Integration through the DB creates coupling
    • Code logic that is vastly different (different domains) becomes tightly coupled through shared database structure and data.
    • Changes to the database then add complexity, to the point where the DB schema is no longer allowed to change; then you start working around it by doing things like adding JSON or XML into a database column
  • Based on the above, over time things are going to get even more coupled and harder to maintain.
  • If the system wasn’t designed to scale out to multiple machines, doing so may actually hurt performance

Solutions

  • Internal loose coupling
  • Modularization
  • Design for scale out in advance, or you just may end up being stuck with scale up.

The system is finished

  • Maintenance costs over the lifetime of a system are greater than its development costs
  • A project may start small, like a proof of concept or an MVP, but after release the project will face constant new needs, requirements and additions that start to balloon. This becomes a problem if it is not anticipated and planned for.
  • Release dates may be unrealistic or change and be unknown to developers due to business requirements; the new release dates do not realistically take into consideration past, present and future requirements
  • Under these circumstances, after the release date the actual work begins, because of all the implementations that were done in a hurry
    • Refactoring
    • Bugs
    • Tech debt
    • Re-architecture
    • Scalability problems
    • New features
  • It is more expensive to fix bugs in maintenance, after a “release”
  • Adding more features to an existing code base requires more time and skill than adding them at the start of a project
  • Adding a new version on top of an already released version is harder than at the start of a project
  • Building a project and maintaining one require the same level of skill, and maintaining often requires even more
    • Software development is different from, say, constructing a building, where maintaining the finished building requires less skill than the actual construction
  • Senior developers put themselves at the start of a project because:
    • It’s more fun to build something from scratch
    • They can do it
    • And it looks good for business
  • On the previous note: it is hard to design a code base or project so that others can maintain it in the future with ease and at low cost
    • This is especially hard if the people who end up maintaining the code are not as skilled or knowledgeable as the initial developers
  • This usually leads later to a situation where a rewrite is suggested, but the above problem may still be present in many situations
  • The system is never “finished” as long as there is work done on it, in any form
  • Estimates are hard to give and people don’t necessarily know how to do it

Solutions

  • Projects are a poor model for software development. Long-lived products are better
  • Think of a project as a product, the mindset will be different with a product. Software should be a product with a long term viability, it should live for a long time.
  • Design for long term viability which might require at start more effort
  • You need to have a team that can work together in the long term, so that knowledge and skills are retained by the team, not just by individuals
  • There’s no such thing as a “maintenance programmer”
  • Beware the rewrite that will solve everything
  • Give estimates taking these things into consideration:
    • Have a well-formed team that has worked in the past and has the suitable knowledge and skills
    • The team is not working on anything else
    • An estimate has a time range and a percentage of confidence that the work will take place within this time range; “I’m C% confident work will take between T1 & T2”. This is to avoid misunderstanding with business where you say that it might be possible and they interpret that it is possible
    • The confidence percentage goes up the smaller the pieces the work is divided into
    • Consider a “negotiation” approach to estimation: test and create proofs of concept, and then give estimates
    • If there are still problems or differences of opinion on estimates, ask the business to prioritize; now they have to think about what they really want and truly understand the scope and risks
    • Avoid taking literal example from organizations or methods that do not correspond to your particular situation, like Google, Facebook, Apple and methods they may use.
      • This is because you need to understand the surrounding details about the project and organization you are working with and apply things that work for your situation.
      • Things that work for a large project may not work at all for a medium sized or small sized project.
      • Learn from big things and see what parts can be applied and used wisely

Business logic can and should be centralized

  • As developers we strive to centralize logic to avoid error, bugs and forgetting to fix things everywhere when the rules/requirements change
    • There are a lot of dependencies between entities, validations, services that end up creating coupling between code
    • Re-usable code invites heavy use, and heavy use means many dependencies = tight coupling

Solutions

  • One possible solution to re-usable code creating tight coupling is to keep the re-usable code within specific domains or layers of the code (see the sketch after this list). Example:
    • Have specific entities, services and validations for the data access layer, and have the DAL return entities that are used only within the DAL
    • In the next layer, say the business logic layer, again have specific entities, services and validations; then map the DAL entities to BLL entities based on the logic you need.
  • Consider incorporating functional programming patterns or practices within your code to avoid re-usable code problems
  • Code traceability: Add tags to code that implements a specific piece of business logic or a feature, so that you can later trace back where the changes took place and what they were
  • Explore different development views:
    • The Open Group Architecture Framework (TOGAF)
    • 4+1 architectural view model
    • Federal Enterprise Architecture (FEA)
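
Here is a small sketch of the layered-entities idea from the first bullet above; the entity names and the business rule are made up:

// Layer-specific entities with an explicit mapping between them.
public class LayeredEntitiesSketch {

    // Data access layer entity: mirrors the table, never leaves the DAL.
    record CustomerRecord(long id, String firstName, String lastName, String countryCode) {}

    // Business logic layer entity: shaped by business rules, not by the schema.
    record Customer(long id, String displayName, boolean requiresVatNumber) {}

    // Mapping lives at the boundary, so each layer can change without dragging the other along.
    static Customer toBusinessEntity(CustomerRecord record) {
        String displayName = record.firstName() + " " + record.lastName();
        boolean requiresVat = !"US".equals(record.countryCode());   // example business rule
        return new Customer(record.id(), displayName, requiresVat);
    }
}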

Lessons learned from building Microservices – Part 3: Patterns, Architecture, Implementation And Testing

Introduction

In this blog post I will go over things I’ve learned when working with microservices. I will cover things to do and things that you should not do. There won’t be a lot of code here, mostly theory and ideas.

I will discuss most of the topics from a smaller development team's point of view. Some of these things are applicable to larger teams and projects, but not necessarily all. It all depends on your own project and needs.

  • Architecture
  • Sharing functionality (Common code base)
  • Continuous Integration / Continuous Delivery (CI/CD)
  • High Availability
  • Messaging and Event driven architecture
  • Security
  • Templates and scripting
  • Logging and Monitoring (+metrics)
  • Configuration pattern
  • Exception handling and Errors
  • Performance and Testing

General advice

Generally I advise considering and using existing technologies, products and solutions in your application and architecture to speed up your development and keep the amount of custom code to a minimum.

This will save you errors, time and money.

Still, make sure that you choose technologies and products that support your and your clients' solutions; not things that you think are “cool” right now or would be fun to use. Your choices should fit the needs and requirements of not only your project but the architecture as a whole and the future vision of your project.

Architecture

To keep things simple I would say there are two ways to approach microservices.

Approach one: Starting big but small

The first approach is the one you probably are familiar with. These are big projects by big companies like Amazon, Netflix, Google, Uber etc.

Usually this involves creating hundreds or even thousands of microservices on multiple platforms and technologies. It usually requires large teams of people developing, deploying and maintaining the microservice solution they are working on.

This approach is definitely not for everyone; it may require a lot of people, resources and money.

In this approach you can minimize the impact of needing many resources and people by sharing code, but this creates coupling, which may or may not be what you are looking for. I’ll explain the benefits of shared code in the second approach.

You could also minimize the needed resources and people by going very small on microservice size, allowing services to be easily deleted or recreated in any language and technology. In this approach you should be ready to just delete something and start from scratch with ease. The idea is to avoid permanence, so you may end up having few or no unit tests, which add permanence.

Also, if there is a need for coupling between services, create a service that provides the needed functionality.

Approach two: Starting small but plan big

Most likely you have a team of a few people and limited resources. In this case I recommend starting somewhere between a microservice architecture and a monolithic one.

By this I mean that you start by designing and implementing all the infrastructure of a microservice, but do not split your application into microservices from the start. Create one microservice, expand it, then split it when things have grown to the point where a new microservice feels needed.

By this time you have had time to understand and analyze your business domain. Now you have an idea of what kind of communication between microservices you need; perhaps HTTP-based or decoupled, messaging-based.

When you are creating your microservices, keep your design patterns simple. Do not implement overly complicated patterns to impress anyone, including yourself; it will make the upkeep of your microservices hell. You want to keep the code as simple and as uniform as possible inside a microservice and between them.

Share as much as possible between microservices.

Create good Continuous Integration and Continuous Deployment procedures as soon as possible. It will save you time.

Verify you have proper availability and scalability based on your application needs.

Prefer scripting to automate and speed up development and upkeep.

Use templates everywhere you can, especially for creating new microservices.

Have a common way to do exception handling and logging.

Have a good configuration plan when deploying your Microservices.

You also need team members who are not afraid to do many different things with multiple technologies. You need people who can learn anything, adapt and develop with any tool, tech or language.

With these approaches and this checklist you should be able to manage a microservice architecture with only a handful of people. For upkeep even one person is enough, but for constant development at least two or three would be a good number.

Sharing functionality (Common code base)

When it comes to code I prefer the “golden” rule of programming, don’t repeat yourself, but with microservices you will end up with duplication.

The wise thing to do with microservices is to know when not to duplicate and to have common access to shareable code and functionality. Why do this?

  • Developers end up writing similar code that is used again and again in multiple microservices.
  • These common pieces of code and functionality end up having the same kind of problems and bugs, which have to be corrected in every place
  • The problems and bugs cause security issues
  • As well as performance issues
  • And possibly hard-to-understand code, where the logic and functionality are the same but the implementations end up slightly or vastly different.
  • And lastly, all of the above combined cause you to spend time and money that you may not have

The next question to ask is:

What should you share? The main rule is that it must be common. The code should not be specific to a certain microservice or domain.

Again it all depends on your project but here are a few things:

  • Logging; I highly recommend a unified logging output format that is easy to index and analyze.
  • Access Logs
  • Security
    • User authorization but not authentication or registration. Registration is better suited as an external microservice, as its own domain.
    • Encryption
    • JSON Web Token related security code and processing
    • API Key
    • Basic Auth
  • Metrics
  • HTTP Client class for HTTP requests. Create an instance of this class with different parameters to share common functionality for logging, metrics and security.
  • Code to use and access Cloud based resources from AWS, Azure
    • CloudWatch
    • SQL Database
    • AppInsights
    • SQS
    • ServiceBus
    • Redis
    • etc…
  • Email Client
  • Web Related Base classes like for controllers
  • Validations and rules
  • Exception and error handling
  • Metrics logic
  • Configuration and settings logic

How should you distribute your shared functionality and code? Well it all depends on your project but here are a few ways:

  • One library to rule them all :D. Create one library which all projects need to use. Notice: This might become a problem later on, when the amount of code in your common library grows. You will end up with functionality that you may not need in a particular microservice.
  • Create multiple libraries that are used on a need basis, thus using only the bits of functionality which you need.
  • Create Web APIs or similar services that you ask to perform a certain action. This might work for things like logging or caching, but not for all functionality. Also notice that you will trade code speed for latency if you outsource your common functionality to a service that runs independently from the code that needs it.
  • A combination of all of the above.

Dependency injection

Use your preferred dependency injection library to manage your classes and dependencies.

When using your DI I recommend thinking of combining classes into “packages” of functionality by feature, domain, logic, data source etc. By doing this you can target specific parts of your code without “contaminating” the project with unneeded code even if you have a large library like a common library.

For example you could pack a set of classes that provide you the functionality to communicate with a CRM, get, modify and add data.

This would include your model classes, a CRM client class, some logic that operates on the models to clean them up, etc.

Once you identify these packages, structure your code so that you can add them to a project with the least amount of code.
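
As a sketch of such a “package” of functionality, here is what it might look like with Spring’s Java configuration; the CRM classes are stand-ins for real ones:

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// A set of related beans packaged as one unit: import this one class to pull in the whole CRM feature.
@Configuration
public class CrmIntegrationConfig {

    // Stand-in collaborators, just enough to make the wiring visible.
    static class CrmClient {
        final String baseUrl;
        CrmClient(String baseUrl) { this.baseUrl = baseUrl; }
    }
    static class CrmCustomerMapper {}
    static class CrmCustomerService {
        CrmCustomerService(CrmClient client, CrmCustomerMapper mapper) {}
    }

    @Bean
    CrmClient crmClient() {
        return new CrmClient("https://crm.example.internal");     // URL would come from configuration
    }

    @Bean
    CrmCustomerMapper crmCustomerMapper() {
        return new CrmCustomerMapper();
    }

    @Bean
    CrmCustomerService crmCustomerService(CrmClient client, CrmCustomerMapper mapper) {
        return new CrmCustomerService(client, mapper);
    }
}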

Also consider creating logic that automatically tells a developer which configurations are missing once a set of functionality is added. The easiest way to achieve this is with checks at compile time and/or runtime.

See my previous article on this matter for a more detailed description:

https://lionadi.wordpress.com/2019/10/01/spring-boot-bean-management-and-speeding-development/

Continuous Integration/Continuous Delivery (CI/CD)

There are many ways of doing CI and CD, but the main point is to add it and automate as much as possible. This is especially important with microservices and small team sizes.

It will speed things up and keep things working.

Here are a few things to take into consideration:

  1. Create unit tests that are run during your pipelines
  2. Create API or Service level tests that verify that things work with mock or real life data. You can do this by mocking external dependencies or using them for real if available.
  3. Add performance tests and stability tests to your pipelines if possible to verify that things run smoothly.
  4. Think of using the same tool for creating your API or service tests when developing and when running the same tests in a pipeline. You can just reuse the same tests and be sure that what you test manually is the same that should work in production. For example: https://www.postman.com/ and https://github.com/postmanlabs/newman
  5. Script as much as possible and parametrize your scripts for reuse. Identify which scripts can be used and shared to avoid doing things twice.
  6. Use semantic versioning https://semver.org/
  7. Have a deployment plan on how you are going to use branches to deploy to different environments (https://www.atlassian.com/git/tutorials/comparing-workflows/gitflow-workflow), for example:
    1. You can use release branches to deploy to different environments based on pipeline steps
    2. Or have specific branches for specific environment. Once things are merged into them certain things start to happen
  8. Use automated build, test and deployment for dev environment once things are merged to your development branch.
  9. Use manual steps for deployment to other environments; this avoids testers receiving bad builds in QA, or production crashing on bugs that were not caught.
  10. If you do decide to automate everything all the way to production, make sure you have good safe guards that things don’t blow up.
  11. And lastly; nothing is eternal. Experiment and re-iterate often and especially if you notice problems.

Common tools to CI/CD

https://www.sonatype.com/product-nexus-repository

https://www.ansible.com/

https://www.rudder.io/

https://www.saltstack.com/

https://puppet.com/try-puppet/puppet-enterprise/

https://cfengine.com/

https://about.gitlab.com/

https://www.jenkins.io/

https://codenvy.com/

https://www.postman.com/

https://www.sonarqube.org/

High Availability

The main point in high availability is that your solution will continue to work as well as possible or as normal even if some parts of it fail.

Here are the three main points:

  • Redundancy—ensuring that any elements critical to system operations have additional, redundant components that can take over in case of failure.
  • Monitoring—collecting data from a running system and detecting when a component fails or stops responding.
  • Failover—a mechanism that can switch automatically from the currently active component to a redundant component, if monitoring shows a failure of the active component.

Technical components enabling high availability

  • Data backup and recovery—a system that automatically backs up data to a secondary location, and recovers back to the source.
  • Load balancing—a load balancer manages traffic, routing it between more than one system that can serve that traffic.
  • Clustering—a cluster contains several nodes that serve a similar purpose, and users typically access and view the entire cluster as one unit. Each node in the cluster can potentially failover to another node if failure occurs. By setting up replication within the cluster, you can create redundancy between cluster nodes.

Things that help in high availability

  • Make your application stateless
  • Use messaging/events to ensure that business critical functionality is performed at some point in time. This is especially true to any write, update or delete operations.
  • Avoid heavy coupling between services if possible; if you do have to couple them, use a lightweight messaging system. The most troublesome way of communicating between microservices is going to be over HTTP.
  • Have good health checks that respond quickly when requested. These can be divided into two categories (see the sketch after this list):
    • Liveness: Checks if the microservice is alive, that is, if it’s able to accept requests and respond.
    • Readiness: Checks if the microservice’s dependencies (Database, queue services, etc.) are themselves ready, so the microservice can do what it’s supposed to do.
  • Use a “circuit breaker” to quickly cut off unresponsive services and quickly bring them back up
  • Make sure that you have enough physical resources (CPU, memory, disk space etc.) to run your solution and your architecture
  • Make sure you have enough request threads supported in your web server and web application
  • Verify how large an HTTP request your web server and application are allowed to receive; the headers are usually what will fail your application.
  • Test your solution broadly with stress and load tests to identify problems. Attach a profiler during these tests to see how your application performs, what bottlenecks there are in your code, what hogs resources etc.
  • Keep your microservice image sizes to a minimum for optimal runtime and deployment. Don’t add things that you don’t use; it will slow your application down and deployment will suffer, all of which leads to more physical resources and more money being needed.
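
As a sketch of the liveness/readiness split mentioned above, here are two bare-bones endpoints in Spring style; the database dependency is a stand-in for whatever your service actually relies on:

import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class HealthController {

    interface DatabaseHealth { boolean isReachable(); }

    private final DatabaseHealth database;

    HealthController(DatabaseHealth database) { this.database = database; }

    // Liveness: the process is up and able to answer at all.
    @GetMapping("/health/live")
    public ResponseEntity<String> live() {
        return ResponseEntity.ok("UP");
    }

    // Readiness: the dependencies we need are also available.
    @GetMapping("/health/ready")
    public ResponseEntity<String> ready() {
        return database.isReachable()
                ? ResponseEntity.ok("READY")
                : ResponseEntity.status(503).body("NOT_READY");
    }
}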

Messaging and Event driven architecture

I will be covering this topic in an upcoming post but until then here are a few pointers.

Because of the nature of microservices, that they can be quickly scaled up and down based on need, I would very highly recommend that you use messaging for business-critical operations and logic.

The most important ones I would say are: writing, updating and deleting data.

I would also recommend using messaging for all long-running operations.

Notice: One of the most important things, in my opinion, is to log and monitor the success of messages being sent, processed and finished, with a trail back to the original request so that logs and metrics can be connected and you get the whole picture when troubleshooting.

I have covered this in my previous logging post: https://lionadi.wordpress.com/2019/12/03/lessons-learned-from-building-microservices-part-1-logging/

Security

Generally security is an important aspect of any application and has many different topics and details to cover.

Related on security I covered this extensively in my last post in this series on Microservices, go check it out: https://lionadi.wordpress.com/2020/03/23/lessons-learned-from-building-microservices-part-2-security/

Templates and scripting

To speed up development, keep things consistent and thus avoid duplicate errors and unnecessary fixes, use templates where possible. This is especially true for microservices.

What are possible templates that you could have:

  • Templates for deploying Cloud resources like ARM for Azure or Cloudformation for AWS.
  • Backend application templates
  • Frontend application templates
  • CI/CD templates
  • Kubernetes templates
  • and so on…

Anything that you know you will end up having multiple copies of is good to standardize and turn into a template.

Also, for applications (frontend or backend), it is very good practice that they are runnable as soon as you clone them from your repository. They should be up and running as soon as you start them.

Script as much as possible and make the scripts reusable.

Parametrize all of the variables you can in your scripts and templates

Here are a few things you would need for a backend application template:

  • Security such as authentication and authorization.
  • Logging and metrics
  • Configuration and settings logic
  • Access Logs
  • Exception handling and errors
  • Validations

Logging and Monitoring (+metrics)

Again as with security this is a large topic, I’ve also written about this in my previous post in the series and recommend go checking it out:

https://lionadi.wordpress.com/2019/12/03/lessons-learned-from-building-microservices-part-1-logging/

Configurations pattern

For microservice configurations I recommend the following pattern: the configuration/settings values in your deployment environments' (DEV, QA, PROD etc.) configuration files are left empty. You still have the configuration/setting keys in your configuration/settings files, but you leave their values empty.

Next you need to make sure that your code knows how to report empty configuration values when your application is started. You can achieve this by creating a common way to retrieve configuration/settings values and being able to analyze which of the needed and loaded configurations are present (a sketch follows below).

This way, when your Docker image is started and the application inside the image starts running and retrieving configurations, you should be able to see what is missing in your environment.

This is mostly because you don’t want to have your environment-specific configurations set in your git repository, especially the secrets. You will end up setting these values in your actual QA, PROD etc. environments through some mechanism. If you forget to add a setting/configuration in that mechanism, your Docker image may crash and you will end up searching for the problem for a long time; even with proper logging it may not be immediately clear.
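
Here is a rough sketch of such a startup check in Spring Boot style; the property names are examples only:

import java.util.ArrayList;
import java.util.List;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.ApplicationArguments;
import org.springframework.boot.ApplicationRunner;
import org.springframework.stereotype.Component;

// Reports configuration values that exist as keys but were left empty for this environment.
@Component
public class ConfigurationCheck implements ApplicationRunner {

    @Value("${app.database.url:}")
    private String databaseUrl;

    @Value("${app.queue.connection-string:}")
    private String queueConnectionString;

    @Override
    public void run(ApplicationArguments args) {
        List<String> missing = new ArrayList<>();
        if (databaseUrl.isBlank()) missing.add("app.database.url");
        if (queueConnectionString.isBlank()) missing.add("app.queue.connection-string");

        if (!missing.isEmpty()) {
            // Fail fast with a clear message instead of crashing later on an obscure error.
            throw new IllegalStateException("Missing configuration values: " + missing);
        }
    }
}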

I’ve written a previous post on this matter which opens things up at the code level:

https://lionadi.wordpress.com/2019/10/01/spring-boot-bean-management-and-speeding-development/

Exception handling and Errors

The main points with exceptions and errors:

  • Global exception handling
  • Make sure you do not “leak” exceptions to your clients
  • Use a standardized error response
  • Log things properly
  • And take into consideration security issues with errors

Again, for details on logging and security check my previous posts:

https://lionadi.wordpress.com/2019/12/03/lessons-learned-from-building-microservices-part-1-logging/

https://lionadi.wordpress.com/2020/03/23/lessons-learned-from-building-microservices-part-2-security/

For error responses, you have two choices:

  1. Make up your own
  2. Or use an existing system

I would say avoid making your own if possible but it all depends on your application and architecture.

Consider first existing ones for a reference:

https://www.hl7.org/fhir/operationoutcome.html

https://developers.google.com/search-ads/v2/standard-error-responses

https://developers.facebook.com/docs/graph-api/using-graph-api/error-handling/

Still here is also an official standard which you can use and may be supported by your preferred framework or library: https://www.rfc-editor.org/rfc/rfc7807.html

The RFC 7807 specifies the following for error responses and details from https://www.rfc-editor.org/rfc/rfc7807.html:

  • Error responses MUST use standard HTTP status codes in the 400 or 500 range to detail the general category of error.
  • Error responses will be of the Content-Type application/problem, appending a serialization format of either json or xml: application/problem+json, application/problem+xml.
  • Error responses will have each of the following keys(Internet Engineering Task Force (IETF)):
    • detail (string) – A human-readable description of the specific error.
    • type (string) – a URL to a document describing the error condition (optional, and “about:blank” is assumed if none is provided; should resolve to a human-readable document).
    • title (string) – A short, human-readable title for the general error type; the title should not change for given types.
    • status (number) – Conveying the HTTP status code; this is so that all information is in one place, but also to correct for changes in the status code due to the usage of proxy servers. The status member, if present, is only advisory as generators MUST use the same status code in the actual HTTP response to assure that generic HTTP software that does not understand this format still behaves correctly.
    • instance (string) – This optional key may be present, with a unique URI for the specific error; this will often point to an error log for that specific response.

RFC 7807 example error response:

HTTP/1.1 403 Forbidden
Content-Type: application/problem+json
Content-Language: en

{
  "type": "https://example.com/invalid-account",
  "title": "Your account is invalid.",
  "detail": "Your account is invalid, your account is not confirmed.",
  "instance": "/account/34122323/data/abc",
  "balance": 30,
  "accounts": ["/account/34122323", "/account/8786875"]
}
HTTP/1.1 400 Bad Request
Content-Type: application/problem+json
Content-Language: en

{
  "type": "https://example.net/validation-error",
  "title": "Your request parameters didn't validate.",
  "invalid-params": [
    {
      "name": "age",
      "reason": "must be a positive integer"
    },
    {
      "name": "color",
      "reason": "must be 'green', 'red' or 'blue'"
    }
  ]
}
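
For illustration, a global handler producing an RFC 7807-shaped response could look roughly like this in Spring style; the exception type and URLs are made up, and many frameworks offer problem-details support out of the box:

import java.util.LinkedHashMap;
import java.util.Map;
import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.RestControllerAdvice;

@RestControllerAdvice
public class ProblemDetailsHandler {

    static class AccountInvalidException extends RuntimeException {
        AccountInvalidException(String message) { super(message); }
    }

    @ExceptionHandler(AccountInvalidException.class)
    public ResponseEntity<Map<String, Object>> handle(AccountInvalidException ex) {
        Map<String, Object> problem = new LinkedHashMap<>();
        problem.put("type", "https://example.com/invalid-account");
        problem.put("title", "Your account is invalid.");
        problem.put("detail", ex.getMessage());            // human-readable, but never a stack trace
        problem.put("status", 403);

        return ResponseEntity.status(403)
                .contentType(MediaType.valueOf("application/problem+json"))
                .body(problem);
    }
}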

Performance and Testing

Testing

To make sure that your solution and architecture work and perform well I recommend doing extensive testing. Familiarize yourself with the testing pyramid, which holds the following test types:

  • Unit tests:
    • Small tests of units of code, preferably testing one specific thing in your code
    • The tests make sure things work as intended
    • The number of unit tests will outnumber all other tests
    • Your unit tests should run very fast
    • Mock things used in your tested functionality: replace a real thing with a fake version
    • Stub things; set up test data that is then returned and tests are verified against
    • You end up leaving out external dependencies for better isolation and faster tests.
    • Test structure (see the sketch after this list):
      • Set up the test data
      • Call your method under test
      • Assert that the expected results are returned
  • Integration tests:
    • Here you test your code with external dependencies
    • Replace your real-life dependencies with test doubles that behave and return the same kind of data
    • You can run them locally by spinning them up using technologies like docker images
    • You can run them as part of your pipeline by creating and starting a specific cluster that holds test double instances
    • Example database integration test:
      • start a database
      • connect your application to the database
      • trigger a function within your code that writes data to the database
      • check that the expected data has been written to the database by reading the data from the database
    • Example REST API test:
      • start your application
      • start an instance of the separate service (or a test double with the same interface)
      • trigger a function within your code that reads from the separate service’s API
      • check that your application can parse the response correctly
  • Contract tests
    • Tests that verify how two separate entities communicate and function with each other based on a commonly predefined contract (provider/publisher and consumer/subscriber). Common communication styles between entities:
      • REST and JSON via HTTPS
      • RPC using something like gRPC
      • building an event-driven architecture using queues
    • Your tests should cover both the publisher and the consumer logic and data
  • UI Tests:
    • UI tests test that the user interface of your application works correctly.
    • User input should trigger the right actions, data should be presented to the user
    • The UI state should change as expected.
    • UI tests do not need to be performed end-to-end; the backend can be stubbed
  • End-to-End testing:
    • These tests are covering the whole spectrum of your application, UI, to backend, to database/external services etc.
    • These tests verify that your applications work as intended; you can use tools such as Selenium with the WebDriver Protocol.
    • Problems with end-to-end tests
      • End-to-end tests require a lot of maintenance; even the slightest change somewhere will affect the end result in the UI.
      • Failures are common and the reason may be unclear
      • Browser issues
      • Timing issues
      • Animation issues
      • Popup dialogs
      • Performance and long wait times for a test to be verified; long run times
    • Consider keeping end-to-end to the bare minimum due to the problems described above; test the main and most critical functionalities
  • Acceptance testing:
    • Making sure that your application works correctly from a user’s perspective, not just from a technical perspective.
    • These tests should describe what the users sees, experiences and gets as an end result.
    • Usually done through the user interface
  • Exploratory testing:
    • Manual testing by human beings who try to find creative ways to break the application, or unexpected ways an end user might use the application that could cause problems.
    • After these findings you can automate these cases further down the testing pyramid, in unit, integration or UI tests.
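
As a small illustration of the set up / call / assert structure from the unit test section above, here is a JUnit 5 sketch with a made-up class under test:

import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

class PriceCalculatorTest {

    // A tiny stand-in for the real code under test.
    static class PriceCalculator {
        int totalInCents(int unitPriceInCents, int quantity) {
            return unitPriceInCents * quantity;
        }
    }

    @Test
    void multipliesUnitPriceByQuantity() {
        // Set up the test data
        PriceCalculator calculator = new PriceCalculator();

        // Call the method under test
        int total = calculator.totalInCents(500, 3);

        // Assert that the expected result is returned
        assertEquals(1500, total);
    }
}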

All of these automated tests can be integrated into your integration and deployment pipeline, and you should consider doing so for as many of them as possible.

Performance

For performance tests, the only good way to get an idea of how your solution and architecture perform is to break them and see how they behave under long, sustained load.

Two test types are good for this:

  • Stress testing: Trying to break things by scaling the load up constantly until your application stops working entirely. Then you analyze your findings based on logs, metrics, test tool results etc.
  • Load testing: A sustained test where you keep on making the same requests as you would expect in real life to get an idea how things work in the long run; these tests can go on from a few hours to a few days.

The main idea is that you see problems in your code like:

  • Memory leaks
  • CPU spikes
  • Resource hogging pieces of code
  • Slow pieces of code
  • Network problems
  • External dependencies problem
  • etc

One of my favorite tools for this is JMeter https://jmeter.apache.org/.

And to get the most out of these tests I recommend attaching a code profiler to your solution and seeing what happens during the tests.

There is a HUGE difference between how your code behaves when you manually test it under a profiler and how it behaves when thousands or millions of requests are performed. Some problems only become evident when code is called thousands of times, especially memory allocations and releases.

And lastly: cover at least the most important and critical sections of your solution and keep adding new tests when possible or when problem areas are discovered.

These tests can also be added as part of the pipelines.

Topology tests

Do your performance tests while simulating possible errors in your architecture or downtime.

  • Simulate slow start times for servers.
  • Simulate slow response times from servers.
  • Simulate servers going down, either specific ones or randomly

Test how your system works under missing resources and problems.

Test how expensive your system is

When you are creating tests consider testing the financial impact of your overall system and architecture. By doing different levels of load tests and stress tests you should be able to get a view on what kind of costs you will end up with.

This is especially important with cloud resources, where what you pay is related to what you consume.

Azure AD and Azure Functions authentication 401 problems with access tokens

This is a very annoying thing since most documentation describing Azure AD user authentication is not very clear about using access tokens to authenticate a user.

If you follow the example on the Microsoft page you will be doing all the right things, but if you intend to use an access token to authenticate you will likely encounter a 401 even if you pass a proper access token. Especially if you are using Postman.

https://docs.microsoft.com/en-us/azure/app-service/configure-authentication-provider-aad

https://docs.microsoft.com/en-us/azure/active-directory/develop/v2-oauth2-auth-code-flow

This is because you are using the wrong version of the authentication API URLs for Azure AD.

The fix is to use the v2.0 of the login URLs and scopes.

Auth URL:

https://login.microsoftonline.com/{tenant}/oauth2/v2.0/authorize

Access Token URL:

https://login.microsoftonline.com/{tenant}/oauth2/v2.0/token

Scope:

{clientId}/.default

I finally found the fix on Stack Overflow after a lot of searching. It’s hard to find the exact documentation that you need: https://stackoverflow.com/questions/57496143/azure-functions-returns-401-unauthorized-only-with-postman

In Postman, if you use the Authorization tab in your request, you can ask Azure AD to generate a new access token.

Lessons learned from building microservices – Part 2: Security

In this blog post I will go through some of the things I have learned regarding security when it comes to micro-services. It is not a comprehensive guide and things change constantly, so keep learning and investigating.

The best advice here is to avoid re-inventing the wheel: avoid making your own solutions related to security. This is because someone else with more resources and time has done it before you. Think of the libraries provided by .NET Core or Java; these have been developed and tested for years. A good example of this would be encryption libraries.

Topics on this post are the following:

  • JSON Web Tokens
  • Monitoring, Logging and Audit trailing
  • Identity and access management
  • Encryption
  • Requests and data validations
  • Error handling
  • CORS & CSP & CSRF
  • OWASP (Open Web Application Security Project)
  • Configurations
  • Quality
  • Security Audit
  • Logs
  • Architecture

JSON Web Tokens

Basically they are JSON objects compacted and secured to transfer data between two or more entities, depending on usage.

https://jwt.io/introduction/

https://cheatsheetseries.owasp.org/cheatsheets/JSON_Web_Token_Cheat_Sheet_for_Java.html

The most common usage is to use them with authentication and authorization.

Another usage could be when you want a person to take some action, but the action is delayed to a later date and moment. This is common with registration and verifying the person.

At some point during the registration process you need to verify the user, so you generate a token with the needed metadata and send the user an email. Later the user clicks a link in the received email containing the token as a parameter. Once the data gets to your application you can open and validate the token and finish the registration.

There are many other uses, but these are just examples. Any time you need to pass data over the internet, and that data needs to be secured and not long-lived, you can consider using a token.

Security things to do with tokens

Summary: Always validate your tokens

  • Set the audience
  • Set the issuer
  • Set an expiration time
    • You don’t want your tokens in the outside world to live forever, meaning that they should not be used after a certain amount of time.
  • Sign the token to detect tampering attempts
    • Notice: In this scenario the token will not hide the data in the payload, it will only verify that the token hasn’t been tampered with. You need to combine it with encryption
  • Encrypt the token to hide the data in the token
    • Be aware of different encryption methods and when to use them. Generally symmetric encryption is preferred for data that is not in transit; like data that reside in a database. Asymmetric encryption is good for data that is moving and not stationary; data that is moving through the internet is a good example.

Monitoring, Logging and Audit Trailing

From a security perspective I consider logging a very important thing to do. When talking about this category the following things are important to your microservice (or any other application also).

  • Being able to trace activity in your application.
    • Who is operating
    • What is being done
    • How long things take and not just your application requests but also any external resources
    • Errors/warnings and successes
  • Being able to tell if possible attacks happen that want to cause damage or steal something valuable to you or your clients
  • Being able to tell the health of your solutions
  • Have a monitoring tool that can aggregate and display detailed information about your solutions
  • Being able to create alerts that inform you and your team of possible problems
  • Consider automatic actions to avoid issues if certain things are triggered, like possible attack attempts.
  • Consider having a plan on what to do when things seem to break or go bad based on gathered data, alerts, monitoring tools etc. Having an idea what will happen next will make things easier and help avoid public problems.

For more details on logging check out my previous blog entry on these series: https://lionadi.wordpress.com/2019/12/03/lessons-learned-from-building-microservices-part-1-logging/

Identity and access management

Application users

First things first: as before with the example of encryption libraries, I would recommend using a ready solution, especially if you plan to do a cloud based solution or an app with thousands of users or more.

Consider AWS Cognito or Azure AD B2C.

The reason for this is that they provide all the security you need and more in some cases. You will have a huge possible security risk off your shoulders. There are many details to take into consideration if you go the route of manually creating an identity solution with authentication and authorization.

These ready solutions allow you to modify many details of how you will use the authentication and authorization tokens in your app. You can add custom attributes, use social media to create accounts, get support for MFA, mobile users etc.

Require Re-authentication for Sensitive Features.

Proxy

Does the above mean that you can’t create a proxy service with custom logic and logging when users authenticate against Cognito or AD B2C? The answer is NO but consider if you really need it.

Possible situations where you might need an identity proxy are:

  • You need to verify that the authenticated user is allowed to authenticate. The account may not be disabled but might require a human step to be performed somewhere
  • A custom registration flow with custom business logic; for example a person can’t register if his/her data is not in a certain state, or if the data is in certain states then the registration will look, behave and end differently for different users.
  • Custom security logging; for maximum traceability and analysis. You might want to create custom logging and use proper tools to analyze what each person is doing. Especially things related to registration are critical and it is very common that people forget passwords, don’t know how to reset their password, problems logging in might occur etc. In all these cases logging saves hours if not days of troubleshooting.

Admin users

For admin users there are many good best practices to follow and I recommend looking over them for your particular needs and technology choices. Here are a few links on the matter for AWS and Azure:

https://docs.aws.amazon.com/IAM/latest/UserGuide/best-practices.html

https://docs.microsoft.com/en-us/azure/security/fundamentals/identity-management-best-practices

Here is a quick list on the top things at the moment:

  • Require MFA
  • Limit the number of Super Admins
  • Enforce a limited session lifetime (Reducing the time a malicious 3rd party can take advantage of an open active session)
  • Enable user notifications for new sign-ons, password resets, account changes etc
  • Consider limiting access from specific locations/IPs etc, or between a certain date and time range, require SSL
  • Use Strong Password Policies (although some of these may make people “lazy” when changing password and pick low security or bad passwords)
    • Lockout
    • Password history
    • Password age
    • Minimum length
  • Create individual user accounts, do not share accounts
  • Grant least privilege
  • Do not share access keys
  • Remove unnecessary credentials
  • Monitor admin activity and send alerts of things that are suspicious

Encryption

Encryption comes in many forms, usually via one of two methods: symmetric and asymmetric.

Notice: Strong recommendation: use existing common libraries for encryption all the way. Do not re-invent the wheel.

Symmetric

You should prefer symmetric encryption for stationary data, data that resides in databases. Also remember to add a salt when hashing data such as passwords, to avoid possible guesswork by an attacker. A salt is added to the hashing process to force uniqueness, increase complexity without increasing user requirements, and to mitigate password attacks like rainbow tables.

Probably the most recommended symmetric encryption is AES.

More info: https://cheatsheetseries.owasp.org/cheatsheets/Password_Storage_Cheat_Sheet.html
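As an illustration, here is a minimal sketch of symmetric encryption with AES-GCM using only the standard Java crypto APIs. The key handling is simplified on purpose; in a real system the key would come from a secrets store such as Key Vault or KMS.

import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;

public class AesGcmExample {

    public static byte[] encrypt(SecretKey key, byte[] plaintext) throws Exception {
        byte[] iv = new byte[12]; // 96-bit IV, generated fresh for every encryption
        new SecureRandom().nextBytes(iv);

        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] ciphertext = cipher.doFinal(plaintext);

        // Prepend the IV so the decrypting side can read it back
        byte[] out = new byte[iv.length + ciphertext.length];
        System.arraycopy(iv, 0, out, 0, iv.length);
        System.arraycopy(ciphertext, 0, out, iv.length, ciphertext.length);
        return out;
    }

    public static void main(String[] args) throws Exception {
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(256);
        SecretKey key = keyGen.generateKey();

        byte[] encrypted = encrypt(key, "sensitive data".getBytes(StandardCharsets.UTF_8));
        System.out.println("Encrypted bytes: " + encrypted.length);
    }
}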

Asymmetric

Asymmetric encryption is best suited and recommended for data in transit, data that is moving from one place to another over the web. Also, in my opinion, data that is leaving your secure environment for another location. The most common asymmetric algorithm is RSA, which is used very broadly, for example any time you use the HTTPS protocol to view a site.

Requests and data validations

Generally, authentication and authorization are based on user sign-in and user roles. While this is a good option there are drawbacks, which I will discuss later.

So I will discuss here a way of achieving security through individual permissions for each action the person tries to do. The proper name for this is: Permission Based Access Control

Notice: Still take into consideration the needs of your project. As always, some methods of doing things may be suited for vastly different purposes. I would say that Permission Based Access Control makes the most sense when you have many users, in the thousands or many times greater than that. It is also a good solution if your users’ permissions need to change dynamically.

Other options are:

  • Role-Based Access Control (RBAC)
  • Discretionary Access Control (DAC)
  • Mandatory Access Control (MAC)

The main focus is on individual permissions that are defined in a policy/access map. These maps can then be assigned on the fly, if necessary, to users and/or groups. If you were to choose role-based access control instead, you would likely narrow down very much what a person can do in the system. Also, as your application grows it may start to become very rigid: all persons under a role must comply with the role(s) exactly in the locations you apply that role.

This will most likely force you to apply multiple roles to a location or to users, which adds security options/accesses greater than what a particular functionality requires. You may be opening up your application to security vulnerabilities by giving too much access.

So I would suggest creating, or starting to work from, the idea of individual access management based on permissions. The data in your system(s) should be able to tell you which permission/access maps a user can use.

Now, in this situation I am talking about the authorization of a user. When a request happens the code will check the user’s data and determine which permissions he/she has. Your permission/access maps are usually created manually and shared in your system in a secure manner so that they can only be read, and not modified, by any entity that reads them.

This permission/access map can be used in two very important situations; that is to control what a person can do and what a person can see, so below are our two main requirements:

  1. Test whether the person can use the requested functionality
  2. If the first step has passed: test what the user can see. A person may see only parts of the data or none at all

Notice: Steps 1 and 2 above are not the same thing. Step 1 is usually something you do on a controller level, on the level where your requests starts to be processed. Step 2 is something you should be doing at the data level; like a service that operates on a data source.

Steps for security validations:

  • Find out whether the user can access the system
  • Based on the authentication find out who the user is
  • Gather user related data to generate a permission/access map
    • User ID(s)
    • Permission/access map(s); this should be determined based on persons data in the system. A person can have multiple access maps.
    • For each role, have a list of read/create/write/delete for each “category” of importance/bounded context/models etc. This depends on your application and the size of the application and what you are trying to achieve.
  • Use this permission/access map to determine which requests the person can access and which data he/she can see

Taking this approach you are able to:

  • be as loose as you want
  • as rigid as you want
  • exactly where you want.
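For example, a generated permission/access map for one user, serialized as JSON, could look like the following: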
{
    "AccessCategory": {
      "USER": [
        "READ",
        "UPDATE"
      ],
      "ORDERS": [
        "READ",
        "UPDATE",
        "CREATE"
      ]
    },
    "id": "DEFAULT"
  }

For the requests in the controller, make checks on what kind of operations you want the person to be able to do. Have a generated “access/permission map” that knows what the person can do based on his/her data and states. Have the access/permission map generated frequently, preferably on each request.

API Request check example at controller level:

hasUserRights(EnumSet.of(AccessCategory.USER, AccessCategory.ORDERS), EnumSet.of(Permission.READ, Permission.UPDATE));

The above function will go through the access/permission map defined above in the JSON data and see if the requested categories have the requested permissions.
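As a minimal sketch of that check (the enum values and map structure below mirror the JSON example above but are otherwise illustrative, not a finished implementation):

import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

public class UserAccess {

    public enum AccessCategory { USER, ORDERS }
    public enum Permission { READ, UPDATE, CREATE, DELETE }

    // The deserialized "AccessCategory" section of the JSON map for one user
    private final Map<AccessCategory, Set<Permission>> accessMap;

    public UserAccess(Map<AccessCategory, Set<Permission>> accessMap) {
        this.accessMap = accessMap;
    }

    // True only if every requested category grants every requested permission
    public boolean hasUserRights(EnumSet<AccessCategory> categories, EnumSet<Permission> permissions) {
        for (AccessCategory category : categories) {
            Set<Permission> granted = accessMap.get(category);
            if (granted == null || !granted.containsAll(permissions)) {
                return false;
            }
        }
        return true;
    }
}

In a controller you would call this before executing the request and return a 403 response if it evaluates to false.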

Data request: Does the person have the right to view all of the data requested; if partial show only partial or nothing.

So when you read the access/permission map you need to associate that map to the data that the users can view. This connection can be done inside the code based on the access categories.

Then, when a user requests data, you have to have internal business logic that will determine whether the user can view the requested data.

Your access/permission map by itself can’t tell your code how the code should behave, you have to associate the business logic by which to filter out data or deny data access.

I would recommend having a user access service that is responsible for generating the permissions and performs the main logic for the security checks. This way you can ask your service to generate an access service for any user just by providing a user id. Then you can use this user specific access service to make security checks.

A good example on this would be AWS access permissions and policies:

https://docs.aws.amazon.com/IAM/latest/UserGuide/access_controlling.html

Or Azure:

https://docs.microsoft.com/en-us/azure/governance/policy/tutorials/create-custom-policy-definition

Error handling

It is important that you do not “leak” or give away any exceptions to the outside world.

I would recommend that you have a global way of catching all of your exceptions and replacing the response of your application with a client friendly message that tells the client what possibly went wrong but does not give away sensitive information that can be used against your application or their users. Always return a generic error: https://cipher.com/blog/a-complete-guide-to-the-phases-of-penetration-testing/

Remember to log your error properly.

Also, regarding your responses to the outside world, consider inserting your friendly error message in the body of the response as custom JSON with data that might help your client app respond properly to the end user. This might include:

  • An error id
  • Possible translation error id for fetching the appropriate error message
  • Error source ID, like a database, 3rd party API, CRM etc, but be careful not to give away this info too carelessly. Think about how this info can be used against you.
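As a sketch of this idea in Spring Boot (the error body fields here are illustrative), a global exception handler could look like this:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.ControllerAdvice;
import org.springframework.web.bind.annotation.ExceptionHandler;

import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

@ControllerAdvice
public class GlobalExceptionHandler {

    private static final Logger logger = LoggerFactory.getLogger(GlobalExceptionHandler.class);

    @ExceptionHandler(Exception.class)
    public ResponseEntity<Map<String, String>> handleAnyException(Exception ex) {
        // Log the full details internally, never send them to the client
        String errorId = UUID.randomUUID().toString();
        logger.error("Unhandled exception, errorId={}", errorId, ex);

        // Return a generic, client friendly body with just enough to correlate later
        Map<String, String> body = new HashMap<>();
        body.put("errorId", errorId);
        body.put("message", "Something went wrong, please try again later.");
        return new ResponseEntity<>(body, HttpStatus.INTERNAL_SERVER_ERROR);
    }
}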

Whatever you do with error responses, definitely think about how the things you send to the outside world might be used against you.

This is especially true regarding authentication, authorization and registration situations. Depending on what you are doing you need to mask as much as possible in your responses when something goes wrong, even to the point of sending a 200 HTTP status code in error situations. https://cheatsheetseries.owasp.org/cheatsheets/Authentication_Cheat_Sheet.html#authentication-and-error-messages

CORS & CSP & CSRF

The following security measures are a combination of procedures and steps you need to take in both client and server side applications. I won’t go into the details of implementing them or in-depth knowledge of them. There are many ways to implement these measures in your preferred technology stack. The end result should be the same, but how you get there depends on your choice of technology: in an Azure App Service enabling CORS can be as simple as pressing a button, but in a microservice within Kubernetes things change drastically. Just be aware of these measures and seek out examples of how to implement them.

Important: Don’t just pick one of them; use them in combination for maximum security.

Cross-Origin Resource Sharing (CORS): With this security measure you can specify who can communicate with your server, which HTTP methods are OK and which headers are allowed. This applies to requests that originate from a different origin (domain, protocol, or port) than the server’s own.

Implementation has to be done in your server configuration and/or code. The client application (browser) will usually make an OPTIONS request to the server stating what it wants to do and from where it tries to do it; the server will then say whether it is OK to continue by sending back what it knows is allowed. The browser will then continue or stop the request there.

https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS
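For example, in a Spring Boot service a minimal CORS configuration could look like this (the origin, paths, methods and headers are placeholders to adjust to your own needs):

import org.springframework.context.annotation.Configuration;
import org.springframework.web.servlet.config.annotation.CorsRegistry;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurer;

@Configuration
public class CorsConfig implements WebMvcConfigurer {

    @Override
    public void addCorsMappings(CorsRegistry registry) {
        // Only allow the known frontend origin and the methods and headers it actually needs
        registry.addMapping("/api/**")
                .allowedOrigins("https://app.example.com")
                .allowedMethods("GET", "POST", "PUT", "DELETE")
                .allowedHeaders("Content-Type", "Authorization");
    }
}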

Content Security Policy (CSP):

With this measure you define which resources, from which sources, are allowed to be used in your client applications. This includes fonts, media, images, JavaScript, objects etc.

Notice that for dynamic script/content you need a nonce value for those contents. This nonce value needs to be generated on the server each time the web application is loaded. If you assign a static nonce value you leave an attack opportunity in your application to execute things which you do not intend.

https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP

https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Security-Policy

https://cheatsheetseries.owasp.org/cheatsheets/Content_Security_Policy_Cheat_Sheet.html
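One way to emit such a policy from a Java backend is a simple servlet filter that sets the header with a per-request nonce. This is only a sketch; the policy string itself is a placeholder and needs to be tailored to your application.

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.security.SecureRandom;
import java.util.Base64;

public class ContentSecurityPolicyFilter implements Filter {

    private final SecureRandom random = new SecureRandom();

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        // Generate a fresh nonce for every response; a static nonce would defeat the purpose
        byte[] nonceBytes = new byte[16];
        random.nextBytes(nonceBytes);
        String nonce = Base64.getEncoder().encodeToString(nonceBytes);

        HttpServletResponse httpResponse = (HttpServletResponse) response;
        httpResponse.setHeader("Content-Security-Policy",
                "default-src 'self'; script-src 'self' 'nonce-" + nonce + "'");

        // Expose the nonce so the view layer can add it to its inline script tags
        request.setAttribute("cspNonce", nonce);

        chain.doFilter(request, response);
    }
}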

Cross-Site Request Forgery (CSRF):

By this I mean unwanted actions performed in your user’s browser. For more detailed information I recommend the OWASP sources:

https://cheatsheetseries.owasp.org/cheatsheets/Cross-Site_Request_Forgery_Prevention_Cheat_Sheet.html

https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html

OWASP (Open Web Application Security Project)

OWASP is a great resource for information related to web application security. If you want to know more I very strongly recommend looking at their material. I’ll post here some of it which I consider a must to know, or at least to have an idea of and come back to.

A good thorough cheat sheet: https://cheatsheetseries.owasp.org/

Top Ten security issues project: https://owasp.org/www-project-top-ten/

Top API security issues: https://github.com/OWASP/API-Security

Top Serverless application issues: https://github.com/OWASP/Serverless-Top-10-Project

A tool to check for security holes and vulnerabilities within your 3rd party libraries and dependencies:

https://owasp.org/www-project-dependency-check/

https://jeremylong.github.io/DependencyCheck/

Configurations

For configurations, the most important thing, which I think all developers have done at least once by accident, is pushing production credentials into git. So avoid this :).

But other than that here are a few tips:

  • Only add configurations in your configuration files for local development
  • For any other environment, keep the values in the environment configuration file empty. What I mean is that your configuration keys are there but they are empty. You want to do this to make sure that once your test, qa or prod environment configuration files are loaded they are empty unless an outside source sets them, which is the next step.
  • In your non local development environment, load your application with the desired environment configuration like test, qa or prod and replace the empty configurations from a secure secrets store. For example Kubernetes secrets files, Key Vault in Azure or the Key Management Service in AWS.
  • Now at this point you should also have a piece of code that can determine if a configuration key is not set and thus is empty. At this stage you should throw an exception and stop the application from running. This is usually something that can happen during application start processing. For this I have a post that gives sample code: https://lionadi.wordpress.com/2019/10/01/spring-boot-bean-management-and-speeding-development

The steps here will improve both the security and the quality of your code, which I think go hand in hand.

Quality

Quality is important for security because if you have the time and take the interest to create good code that can live for years, then it is likely that your code will be secure, or at least more secure.

Simple things like having good coding practices, common tools and ways of doing things within your team can reduce the number of errors, which in turn reduces the number of security problems.

Here are a few tools that can help improve your code quality and workflows:

https://www.sonarqube.org/

https://www.sonarlint.org/

https://www.sonatype.com/product-nexus-repository

I will write more about quality in my next post in this series and link it here.

Security Audit

Lastly, have someone do a security audit on your application and, if possible, the entire ecosystem. Have them try to hack into your application and your ecosystem, create a threat analysis and so on.

If you can’t afford someone, then consider learning the basics yourself. This will also improve your code quality and the things you automatically take into consideration when you work on your code.

Logs

The important thing is that you have logs about your system that reveal possible security problems or threats. In the previous part I went into logging details.

Architecture

When designing your Microservices architecture be aware of every detail and entity within your design.

  • Be aware of the traffic between your containers.
  • Be aware of encrypting data between your containers.
  • Be aware of access to your resources within your architecture. Can resource X access resources Y? Are the given privileges too much? etc.
  • Only open ports and routes to your resources that are truly needed.
  • Don’t store sensitive information in places that are not secure, prefer ready made products like KeyVault in Azure or Key Management Service in AWS.
  • Use system identities between resources in Cloud environments, they are more secure than manually handled security accounts.
  • Prefer ready-made solutions over creating/reinventing the wheel, if possible. Usually a good popular product has a large team and resources to keep things secure and up to date.
  • Have audit trails on what happens in your architecture, who does what etc.
  • Give the least amount of privileges to people within your architecture, only what is needed for that person or group of people to do their job.
  • Set expiration dates to secrets and privileges to resources, where applicable.

Lessons learned from building microservices – Part 1: Logging

This is a part in a series of posts discussing things learned while I worked with micro-services. The things I write here are not absolute truths and should be considered as the best solutions at the time I and my team used these methods. You might choose to do things differently, and I highly recommend finding out for yourself the best practices and approaches that work for you and your project.

I also assume that you have a wide range of pre-existing knowledge on building microservices, API, programming languages, programming, cloud providers etc.

I recommend looking at the OWASP cheat sheet to get even a more in depth view: https://cheatsheetseries.owasp.org/cheatsheets/Logging_Cheat_Sheet.html

UPDATE – 17.3.2020: I’ve improved this post based on the OWASP logging cheat sheet

Notice: In the examples below I will omit “boilerplate” code to save space.

Introduction

By logging I mean the records created by a piece of software: the operating system, a web application, a mobile app, a load balancer, databases, mail servers and so on. Logs are created by many different types of sources.

Their importance comes from their ability to let you understand what is happening in your system or application. They should show you that everything is all right, and if it is not, you should be able to determine what is wrong.

Base requirements for logging

General requirements for logging are:

  • Identifying security incidents
  • Providing information about problems and unusual conditions
  • Business process monitoring
  • Audit trails
  • Performance monitoring

Which events to log:

  • Input validation failures
  • Output validation failures
  • Authentication successes and failures
  • Authorization (access control) failures
  • Session management failures
  • Application errors and system events
  • Application and related systems start-ups and shut-downs, and logging initialization
  • Use of higher-risk functionality (user management, critical system changes etc)

Things to exclude:

  • Application source code
  • Access tokens
  • Sensitive personal data and some forms of personally identifiable information
  • Authentication passwords
  • Database connection strings
  • Encryption keys and other master secrets
  • Bank account or payment card holder data
  • Data of a higher security classification than the logging system is allowed to store
  • Information a user has opted out of collection, or not consented

In some of the cases to exclude you can obscure/remove/alter sensitive data to provide partial benefits without exposing all of the sensitive data to malicious people:

  • File paths
  • Database connection strings
  • Internal network names and addresses
  • Non sensitive personal data

Still, be very careful with this information, especially with user related data.

Now as much it is important to log and have a good view on what is happening in your system and application, it is also a fine art to understand when not to log things.

Having too much log will make it hard to find out the relevant critical information you need. Having too little logging you risk not being able to understand your problem properly.

So, there is a fine balance between logging too much or too little.

A possible solution to this issue is to have more verbose logging during development, and when deploying to production your application will only log what is determined important by the developers, so that someone will be able to troubleshoot a problem in production without having too much or too little logging. This is also a process that needs refactoring during the lifetime of the application.

This leads us to a requirement of logs: logs should be structured and easily indexed, filtered and searched.

Logging audience

When you are logging, I recommend considering who you are logging for.

You need to ask yourself: Why add logging to an application?

Someone someday will read that log entry and to that person the log entry should make sense and help that person. So, when you log things, think of your audience and the following things:

  • What is the content of the message
  • Context of the message
  • Category
  • Log Level

All of these can be quite different depending on who is looking at your logs. As a developer, you can easily understand quite complex logs, but as a non-developer you most likely would not be able to make much sense of complex log entries. So, adapt your language to the intended target audience; you can even dedicate separate categories for this.

Also, think if the log entries can be visualized. For example, metrics logs should have categories, dates and numbers which can be translated into charts that show how long things last or succeed.

Write meaningful log messages

When writing log entries, avoid writing them so that the reader needs in-depth knowledge of the application internals or code logic, even if you are a developer or the person who will look at the logs is a developer.

There are a few reasons to write log messages that are not dependent on knowing the application code or the technicalities behind your application:

  • The log messages will most likely be read by someone who is not a technical person, and even if they are, you may need to prove something about your application to a non-technical person.
  • Even if you are the only developer working on your application, will you remember all your logic and the meaning of log entries a year or two from now? If you must go to your code and check what the heck a log entry means, then your log entry was not meaningful enough. Yes, you do have to go back to the code anyway if you have problems, but if you have to do this frequently then you definitely need to refactor your logging logic and the log content in your application.
  • If you have multiple developers and they do an analysis of a problem, they may not understand what is going on. This is because they might not have any correlation or understanding of a log entry if they have not been a part of the initial solution. They must find out what is going on from the code.

Logging is about the four Ws:

  • When
  • Where
  • Who
  • What

Add context to your log messages

By context I mean that your log message should usually tell what is going on by giving all the needed details to understand what is happening.

So, this is not OK:

“An order was placed”

If you were to read that one, you would ask: “What order? Who placed the order? When did this happen?”

A much more detailed and helpful log message would be:

“Order 234123-A175 was placed by user 9849 at 29.3.2019 13:39”

This message will allow someone to get that order from a system, look at what was ordered and by whom and at what time.

Log at the proper level

When you create a log entry your log entry should have an associated level of severity and importance. The common levels that are used are the following:

  • TRACE: The most verbose logging, will produce A LOT of log entries and is used to track very difficult problems. Never use it in production; if you have to use it in production you have a design problem in your application. It is the finest grained log level.
  • DEBUG: This is mostly used for debugging purposes during development. At this level you want to log additional and extra information about the workings of your application that help you track down problems. This could be enabled in production if necessary, but only temporarily and to troubleshoot an issue.
  • INFO: Actions that are user-driven or system specific like scheduled operations.
  • NOTICE: Notable events that are not considered an error.
  • WARN: Events that could potentially become an error or might pose a security risk.
  • ERROR: Error conditions that might still allow the application to continue running.
  • FATAL: This should not happen a lot in your application but if it does it usually terminates your program and you need to know why.

Service instances

In a microservice architecture the most important thing is to be able to see what each microservice instance is doing. This means, in the case of Kubernetes, each pod, or each container with Docker etc.

So if you have a service named Customer and you have three instances of this service you would want to know what each service is doing when logging. So here is a check list of things to consider:

  • You need to know what each service instance is doing, because each instance will process logic and each instance will have its own output based on what it is doing or requested to do
  • Each log entry should identify which service instance performed the log entry by providing a unique service instance id
  • Each log entry should identify which application version the service instance is using
  • Each log entry should tell in which environment the service instance is operating, for example: development, test, qa, prod
  • If possible each log entry should tell where the service instance is running, like the IP address or host name

Monitoring

The first thing I would recommend is to have an understanding of where your logs will end up and how you are going to analyze them.

The simplest form would be a log file where you would push your log entries and then use a common text editor or development editor to look at the entries. This works fine if your application is very small or you are dealing with a script. The amount of log entries will likely be small, and they won’t be stored for a long period of time.

But, if you know your application or system will produce thousands, hundreds of thousands or even millions of log entries each day, and you need to store them for a longer period of time, then you need a good monitoring tool that can robustly read log entries. You also need a good place to store your log entries.

What you would need normally is something that would:

  • Receive and process a log entry, then transform it and send it to a data store
  • At the data store you would need a tool that will index the data.
  • Then you would need to be able to search and analyze your indexed log entries

A very common tech stack for storing log entries and analysis would be Elasticsearch, Logstash and Kibana. You would use Logstash to process a log entry, transform it and send it to a data store like Elasticsearch, where you would index, search and analyze the data. Finally you would use Kibana, which is a UI on top of Elasticsearch, to visually do the searching and analysis of logs.

Log types

Next I’ll cover the different logging types you might need and that will make your life easier.

General logging details

Before we cover the different types of logs you might need, we first need some common data with each log entry. This data will help us in different ways depending on the solution you are making. In my example here this data is related to an API backend, but you might find it useful in other types of solutions.

So consider adding these logging fields to other logs as metadata.

public class LogData {

    private String requestId;
    private String userId;
    private String environmentId;
    private String appName;
    private String appVersion;
    private Instant createdAt;
}
  • requestId (sample: 6f88dcd0-f628-44f1-850e-962a4ba086e3): This is a value that should represent a request to your API. This request id should be applied to all log entries to be able to group all log entries from a request. Should be unique.
  • userId (sample: 9ff4016d-d4e6-429f-bca8-6503b9d629e1): Same as with the request id but a user id that represents a possible user that made the API request. Should be unique.
  • environmentId (sample: DEV, TEST, PROD): This should tell a person looking at a log entry which environment the log entry came from. This is important in cases where all log entries are pushed into one location and not separated physically.
  • appName (sample: Your Cool API): Same as with the environment id but concerns the app name.
  • appVersion (sample: 2.1.7): Same as with the environment id but concerns the app version.
  • createdAt (sample: 02/08/2019 12:37:59): This should represent when the log entry was created. This will help very much in tracking the progress of the application logic in all environments in case of troubleshooting. Preferably in UTC time.

As you can see, with these baseline details we get a pretty good view on where things are happening, who is doing things and when. I can’t stress enough how important these details are!

General log entry

Well this is the base line log entry with an added message field and perhaps a title field. That’s it.

This is what you would need at a bare minimum to find out what is going on.

Access log

Access logs are a great way to keep track of your API requests and their responses to a client. It’s a way for the server to keep records of all requests it processes. I won’t go deeper into them; there are plenty of detailed descriptions available which I recommend going through, here are a couple:

https://httpd.apache.org/docs/2.4/logs.html#accesslog

https://en.wikipedia.org/wiki/Server_log

Here is some sample code:

public class AccessLog {
    private String clientIP;
    private String userId;
    private String timestamp;
    private String method;
    private String requestURL;
    private String protocol;
    private int statusCode;
    private int payloadSize;
    private String browserAgent;
    private String requestId;
}
  • clientIP (sample: 127.0.0.1): The IP address of the client that made the request to your API.
  • userId (sample: aa10318a-a9b7-4452-9616-0856a206da75): Preferably this should be the same user id that was used in the LogData class above.
  • timestamp (sample: 02/08/2019 12:37:59): A date time format of your choice for when the request occurred.
  • method (sample: GET, POST, PUT etc.): HTTP method of the request.
  • requestURL (sample: https://localhost:9000/api/customer/info): The URL of the request.
  • protocol (sample: HTTP/1.1): The protocol used for the API request.
  • statusCode (sample: 200, 201, 401, 500 etc.): HTTP status code of the request response.
  • payloadSize (sample: 2345): The size of the payload returned to the client.
  • browserAgent (sample: Mozilla/4.08 [en] (Win98; I ;Nav)): “The User-Agent request header contains a characteristic string that allows the network protocol peers to identify the application type, operating system, software vendor or software version of the requesting software user agent.” – https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent
  • requestId: This should be the same request id used in the LogData class earlier.
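Here is a hedged sketch of how such an entry could be produced in a Spring Boot service, assuming AccessLog has the usual getters and setters (omitted above like the rest of the boilerplate):

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;

import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.time.Instant;

@Component
public class AccessLogFilter extends OncePerRequestFilter {

    private static final Logger logger = LoggerFactory.getLogger(AccessLogFilter.class);

    @Override
    protected void doFilterInternal(HttpServletRequest request, HttpServletResponse response, FilterChain chain)
            throws ServletException, IOException {
        try {
            chain.doFilter(request, response);
        } finally {
            // Populate the access log entry after the request has been processed
            AccessLog entry = new AccessLog();
            entry.setClientIP(request.getRemoteAddr());
            entry.setTimestamp(Instant.now().toString());
            entry.setMethod(request.getMethod());
            entry.setRequestURL(request.getRequestURL().toString());
            entry.setProtocol(request.getProtocol());
            entry.setStatusCode(response.getStatus());
            entry.setBrowserAgent(request.getHeader("User-Agent"));

            // In practice you would serialize the whole entry to your structured log output
            logger.info("Access: {} {} -> {}", entry.getMethod(), entry.getRequestURL(), entry.getStatusCode());
        }
    }
}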

Message Queue Log

This is related to a decoupling pattern between two or more entities. You push a message to a storage location and someone or something reads and processes it, this is a simplified description of course.

This is a sample log which you could use with events/message queues. Depending on what message queue you use and what kind of configurations, you would most likely have minimal information about the message pushed to a queue.

From a troubleshooting point of view, and to be able to track things, I would recommend passing additional metadata with the message, related to the message’s original situation.

Let’s take an API request as an example. What I did was add additional metadata from the original request to the message.

This is a bit of a complex thing to go into, but the main focus here is that depending on what kind of message queue or event queue technology and applications you use, you might not get a very detailed view of who did what and when.

An example: You have an API that a client application invokes, and this request has to do an asynchronous save to a CRM; you have to make sure that this is completed and retried if things go bad. This is fine, but what if things go bad and even after several attempts nothing has happened? A common practice is that the message goes to a dead letter queue, for troubleshooting and future processing.

Now, to be able to find out what the problem was you need detailed information, and by default messages in queues have little detail. So I would recommend adding additional data to the message in a queue so that when the receiving end gets it you can log and associate that message with the original API request. Then later, when using analysis tools, you can get a history of the events that have happened, for example using the requestId/correlationId.

public class MessageQueueLog {
    private String sourceHostname;
    private String sourceAppName;
    private String sourceAppVersion;
    private String sourceEnvironmentId;
    private String sourceRequestId;
    private String sourceUserId;
    private String message;
    private String messageType;
    private Instant createdAt; // when the message queue entry was created, see the field list below
}
  • sourceHostname: Look at the LogData example earlier.
  • sourceAppName: Look at the LogData example earlier.
  • sourceAppVersion: Look at the LogData example earlier.
  • sourceEnvironmentId: Look at the LogData example earlier.
  • sourceRequestId: Look at the LogData example earlier.
  • sourceUserId: Look at the LogData example earlier.
  • message (sample: JSON data): JSON data representing a serialized object that holds important data to be used by the receiving end.
  • messageType (sample: UPDATE_USER, DELETE_USER): A simple unique static ID for the message. This ID will tell the receiving end what it needs to do with the data in the message field.
  • createdAt (sample: 02/08/2019 12:37:59): This should represent when the message queue entry was created. Preferably in UTC time.

Metrics log

With metrics logs the idea is to be able to track desired performance and successes in your application. A common thing that you might like to track is how external requests from your own code are performing. This will allow you to set up alerts and troubleshoot problems with external sources; especially if combined with an access log, you can see from a metrics log how long your request took in total to finish.

Depending on what kind of tools you use, you might get automatic metrics for your application, like CPU usage, memory usage, data usage etc. Here I will focus on metrics logs you would produce manually from your application.

So you could track the following metrics:

  • External source like database, API, service etc.
  • Your request’s total processing time from start to end to return a response
  • Some important section of your code
public class MetricsLog {

    private String title;
    private String body;
    private String additional;
    private String url;
    private int statusCode;
    private Double payloadSize;
    private Long receivedResponseAtMillis = 0L;
    private Long sentRequestAtMillis = 0L;
    private MetricsLogTypes logType;
    private double elapsedTimeInSeconds = 0;
    private double elapsedTimeInMS = 0;
    private String category;
}
  • title (sample: User Database)
  • body (sample: Update user)
  • additional (sample: Some additional data)
  • url (sample: http://localhost:9200/api/car/types): If this is an API request to an external service you should log the request URL.
  • statusCode (sample: 200, 401, 500 etc.): The HTTP status code returned by the external source.
  • payloadSize (sample: 234567): The size of the returned data.
  • receivedResponseAtMillis (sample: 1575364455): When the response was received, this could be in UNIX epoch time.
  • sentRequestAtMillis (sample: 1575363455): When the request was sent, this could be in UNIX epoch time.
  • logType (sample: API, DATABASE, CODE etc.): The type of external source or code section being measured.
  • elapsedTimeInSeconds (sample: 1): Calculate and write how long it took for the response to be received.
  • elapsedTimeInMS (sample: 1000): Calculate and write how long it took for the response to be received.
  • category (sample: Category1/2/3 etc.): This could be used to group different metrics together.
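As a usage sketch (again assuming setters and the MetricsLogTypes values, which are not shown above), timing an external call could look like this:

// Hypothetical helper that measures an external call and fills in a MetricsLog entry
public MetricsLog timeExternalCall(String url) {
    MetricsLog metrics = new MetricsLog();
    long sentAt = System.currentTimeMillis();

    // ... perform the actual HTTP or database call here and record its status code ...

    long receivedAt = System.currentTimeMillis();
    metrics.setTitle("User Database");
    metrics.setUrl(url);
    metrics.setSentRequestAtMillis(sentAt);
    metrics.setReceivedResponseAtMillis(receivedAt);
    metrics.setElapsedTimeInMS(receivedAt - sentAt);
    metrics.setElapsedTimeInSeconds((receivedAt - sentAt) / 1000.0);
    metrics.setLogType(MetricsLogTypes.API); // assumed enum value for an external API call
    return metrics;
}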

Security Logs

I would also consider creating a separate security log that is logged and indexed by the logging indexer into its own pattern or category etc.

This is to speed up troubleshooting related to security issues like when someone signs in, signs out, registers etc.

A security log provides tools to establish an audit trail. It allows you to record, track and investigate security related operations that happen in your system. This is a hard thing to do right, since you must have enough information to troubleshoot while keeping secrets and sensitive information hidden.

Start by using the default features of the technology you are using, like Azure AD or Cognito, and then complement them with security logs that you write manually from your application.

For each recorded event, the record in the security log includes at least the following:

  • Date and time of event.
  • User identification including associated terminal, port, network address, or communication device etc.
  • Type of event.
  • Names of resources accessed.
  • Success or failure of the event.

For security logging you can combine the general log data with just a title and body, the bare minimum. The idea is to log an event that is related to a security issue and, if possible, separate it into its own index pattern/category.

Aggregated log entry

This is an example where you would have a main log class that will contain your desired log entry data and details for a system.

A possible use case is when streaming to CloudWatch or perhaps to Elasticsearch.

public class CloudLog {
    private LocalDateTime timeStamp;
    private String logger;
    private Map<String, Object> metadata;
    private String message;
    private String level;
}
  • timeStamp: A timestamp for when the log entry was created.
  • logger: The logger entity name.
  • metadata: A map of key value pairs, full of data which can be serialized into JSON for indexing.
  • message: The main message of the log entry.
  • level: Severity level of the log entry: DEBUG, INFO, ERROR, etc.

Spring Boot: Bean management and speeding development

Intro

In this blog post I’ll show a way to use Spring Boot functionality to create a more automated way to use beans that are created somewhat as components or features.

The idea is that we may have functionalities or features which we want to access easily and clearly through code, so the following things should be true:

  • If I want I can use a set of beans easily
  • If I want I can use a specific bean or beans within the previous set of beans
  • It should be easy to tell Spring which beans to load, preferably a one-liner
  • Configuration of beans should not be hidden from a developer; the developer should be notified if a configuration is missing for a required bean (by configuration I mean application properties)
  • A bean or set of beans should be usable from a common library so that when the library is referenced in a project the beans will not be automatically created, thus creating mandatory dependencies that would break the other project’s code and/or add functionality which is not required

All of the above will happen if the following three things are created and used properly within a code base:

  1. Custom annotations to represent features or functionalities by tagging wanted code
  2. Usage of component scan to load up the wanted features or functionalities based on the set annotations
  3. Usage of properties classes which extend from a properties base class handling application properties dependencies and configuration logic and logging

Notice: I assume that you are familiar with Java and Spring Boot, so I’ll skip some of the minor details regarding the implementation.

Implementation

Custom annotation

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
public @interface MyFeature {
   
}

To use this annotation you need to apply it to the bean creation process which you want the component scan to pick up.

@Bean(name = "MY_FEATURE_BEAN")
@Autowired
@Profile({"primary"})
@MyFeature
public MyFeatureClass createMyFeatureBean(MyFeatureProperties myfeatureProperties) {
    MyFeatureClass myFeature = new MyFeatureClass(myfeatureProperties);
    // Do something else with the class

    return myFeature; // Return the class to be used as a bean
}

You can also directly apply it to a class. This way the class is used directly to create a bean out of it.

Component Scanning

You can use the Spring Boot component scanning in many different ways (I recommend looking at what the component scan can do).

In this example it is enough for you to tell which annotation to include in your project, notice that you have to create a configuration class for this to work:


@Configuration
@ComponentScan(basePackages = "com.my.library.common",
        includeFilters = @ComponentScan.Filter(MyFeature.class))
public class MyFeaturesConfiguration {
}

Extended properties configuration

For this example we need two things to happen for the custom properties configuration and handling/logging to work:

  1. Create a properties class that represents a set of properties for a feature or set of features and/or functionalities
  2. Extend it from a base properties class that will examine each field in the class and determine if a property has been set, not set or if it is optional.

What we want to achieve here is to show a developer which properties of a feature or functionality are missing and which are not. We don’t show the values, since the values may contain sensitive data; we only list ALL of the properties in a properties class no matter if they have set values or not. This is to show the developer all the needed fields and which are invalid, including optional properties.

This approach will significantly decrease a developer’s or a system admin’s daily workload. You won’t have to guess what is missing. Combined with good documentation at the property level of a configuration class, you should easily figure out what needs to be set.

BaseProperties class

Extend this class in all classes that you want to define properties.

import com.sato.library.common.general.exceptions.SettingsException;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.util.StringUtils;

import javax.annotation.PostConstruct;
import java.lang.reflect.Field;
import java.util.Optional;

public class BaseProperties {
    @PostConstruct
    private void init() throws Exception {
        boolean failedSettingsCheck = false;
        StringBuilder sb = new StringBuilder();

        // Go through every field in the class and log its situation if it has problems (missing property value). NOTICE: A report of the settings properties is only logged IF a required field is not set
        for (Field f : getClass().getDeclaredFields()) {
            f.setAccessible(true);
            String optionalFieldPostFixText = " ";
            boolean isOptionalSetting = false;
            String classConfigurationPropertyFieldPrefixText = "";

            // Check to see if the class has a configuration properties annotation, if so add the defined property path to the logging
            if (getClass().getDeclaredAnnotation(ConfigurationProperties.class) != null) {
                final ConfigurationProperties configurationPropertiesAnnotation = getClass().getDeclaredAnnotation(ConfigurationProperties.class);
                if (!StringUtils.isEmpty(configurationPropertiesAnnotation.value()))
                    classConfigurationPropertyFieldPrefixText = configurationPropertiesAnnotation.value() + ".";

                if (StringUtils.isEmpty(classConfigurationPropertyFieldPrefixText) && !StringUtils.isEmpty(configurationPropertiesAnnotation.prefix()))
                    classConfigurationPropertyFieldPrefixText = configurationPropertiesAnnotation.prefix() + ".";
            }

            // Check to see if this field is optional
            if (f.getDeclaredAnnotation(OptionalProperty.class) != null) {
                optionalFieldPostFixText = " - Optional";
                isOptionalSetting = true;
            }

            // Check to see if a settings field is empty, if so then set the execution of the application to stop and log the situation
            if (f.get(this) == null || (f.getType() == String.class && StringUtils.isEmpty(f.get(this)))) {
                // Skip empty fields if they are set as optional
                if (!isOptionalSetting) {
                    failedSettingsCheck = true;
                }
                sb.append(classConfigurationPropertyFieldPrefixText + f.getName() + ": Missing" + optionalFieldPostFixText + System.lineSeparator());
            } else {
                // If the field is OK then mark that in the logging to give a better overview of the properties
                sb.append(classConfigurationPropertyFieldPrefixText + f.getName() + ": OK" + optionalFieldPostFixText + System.lineSeparator());
            }
        }

        // If even one required setting property is empty then stop the application execution and log the findings
        if(failedSettingsCheck) {
            throw new SettingsException(Optional.of(System.lineSeparator() + "SETTINGS FAILURE: You can't use these settings values of " + this.getClass() + " without setting all of the properties: " + System.lineSeparator() + sb.toString()));
        }
    }
}

Optional Annotation for optional properties

Use the following annotation to mark optional properties in properties classes. This means that in the properties base class a missing optional property will not cause a fatal exception that stops the execution of the application.

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
public @interface OptionalProperty {
}

Using all of the above

@ConfigurationProperties(prefix = "myfeature")
@MyFeature
public class MyFeatureProperties extends BaseProperties {
    @OptionalProperty
    private String secretKey;
    private String region;

    public String getSecretKey() {
        return secretKey;
    }

    public void setSecretKey(String secretKey) {
        this.secretKey = secretKey;
    }


    public String getRegion() {
        return region;
    }

    public void setRegion(String region) {
        this.region = region;
    }
}

Notice: In the usage example code above I do not set a @Configuration annotation on the class. This is because the component scan will pick up this class and automatically determine it is a configuration class because of the @ConfigurationProperties annotation; yep, this is a trick but it works nicely.

My Kubernetes Cheat Sheet, things I find useful everyday

Hi,

Here is a list of my personal most used and useful commands with Kubernetes.

kubectl config current-context # Get the Kubernetes context where you are operating

kubectl get services # List all services in the namespace
kubectl get pods # Get all pods
kubectl get pods --all-namespaces # List all pods in all namespaces
kubectl get pods -o wide # List all pods in the namespace, with more details
kubectl get deployment my-dep # List a particular deployment
kubectl get pods --include-uninitialized # List all pods in the namespace, including uninitialized ones

kubectl describe nodes my-node
kubectl describe pods my-pod
kubectl describe deployment my-dep

kubectl scale --replicas=0 deployment/my-dep # roll a deployment down to zero instances
kubectl scale --replicas=1 deployment/my-dep # roll a deployment up to the desired instance count

kubectl set image deployment/my-dep my-container=my-image --record # Update the image of the given deployment's containers

kubectl apply -f my-file.yaml # apply a Kubernetes specific configuration, secrets file, deployment file

kubectl logs -f --tail=1 my-pod # Attach to the pod's output and print one line at a time

kubectl exec my-pod -- printenv | sort # print all environment variables from a pod and sort them

kubectl get deployment my-dep --output=yaml # Print the yaml the deployment is using

kubectl get pod my-pod --output=yaml # Print the pod related configuration it is using

kubectl logs -p my-pod # Print the logs of the previous container instance, you can use this if there was a crash

kubectl run -i --tty busybox --image=busybox --restart=Never -- sh # run a busybox pod for troubleshooting

More useful commands: https://kubernetes.io/docs/reference/kubectl/cheatsheet/

Redis caching with Spring Boot

Hi,

A few examples on how to handle Redis usage with Spring Boot, plus some examples on how to handle exceptions and issues with Redis.

The code below will help you initialize your Redis connection and shows how to use it. One thing to take note of is that Redis keys are global, so you must make sure that any method parameters you use with your keys are unique. For this reason below you have samples of custom key generators.
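As a quick, hypothetical usage sketch (the service, cache name and User class are illustrative; the actual configuration follows below), selecting one of the custom key generators looks like this:

import org.springframework.cache.annotation.CacheConfig;
import org.springframework.cache.annotation.CacheEvict;
import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Service;

@Service
@CacheConfig(cacheNames = "users")
public class UserService {

    // The cache key is built from the first parameter (userId) only
    @Cacheable(keyGenerator = "keyGeneratorFirstParamKey")
    public User getUser(String userId, boolean includeDetails) {
        // ... load the user from the database ...
        return new User(userId);
    }

    // Evict using the same key so stale entries are removed on update
    @CacheEvict(keyGenerator = "keyGeneratorFirstParamKey")
    public void updateUser(String userId, User updated) {
        // ... update the user in the database ...
    }
}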

Redis Samples

 

Redis main configurations


import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.cache.CacheManager;
import org.springframework.cache.annotation.*;
import org.springframework.cache.interceptor.CacheErrorHandler;
import org.springframework.cache.interceptor.KeyGenerator;
import org.springframework.context.annotation.*;
import org.springframework.context.support.PropertySourcesPlaceholderConfigurer;
import org.springframework.data.redis.cache.RedisCacheManager;
import org.springframework.data.redis.connection.jedis.JedisConnectionFactory;
import org.springframework.data.redis.core.RedisTemplate;

import org.springframework.data.redis.serializer.StringRedisSerializer;
import org.springframework.util.StringUtils;

import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;


@Configuration
@ComponentScan
@EnableCaching
@Profile({"dev","test"})
public class RedisCacheConfig extends CachingConfigurerSupport {
    @Override
    public CacheErrorHandler errorHandler() {

        return new CustomCacheErrorHandler();

    }

    protected final org.slf4j.Logger logger = LoggerFactory.getLogger(RedisCacheConfig.class);


    // This is a custom default keygenerator that is used if no other explicit key generator is specified
    @Bean
    public KeyGenerator keyGenerator() {
        return new KeyGenerator() {
            protected final org.slf4j.Logger logger = LoggerFactory.getLogger(RedisCacheConfig.class);

            @Override
            public Object generate(Object o, Method method, Object... objects) {
                return RedisCacheConfig.keyGeneratorProcessor(logger, o, method, null, objects);

            }
        };
    }

    // A custom key generator that generates a key based on the first method parameter while ignoring all other parameters
    @Bean("keyGeneratorFirstParamKey")
    public KeyGenerator keyGeneratorFirstParamKey() {

        return new KeyGenerator() {
            protected final org.slf4j.Logger logger = LoggerFactory.getLogger(RedisCacheConfig.class);

            @Override
            public Object generate(Object o, Method method, Object... objects) {

                return RedisCacheConfig.keyGeneratorProcessor(logger, o, method, 0, objects);
            }
        };
    }

    // A custom key generator that generates a key based on the second method parameter while ignoring all other parameters

    @Bean("keyGeneratorSecondParamKey")
    public KeyGenerator keyGeneratorSecondParamKey() {

        return new KeyGenerator() {
            protected final org.slf4j.Logger logger = LoggerFactory.getLogger(RedisCacheConfig.class);

            @Override
            public Object generate(Object o, Method method, Object... objects) {

                return RedisCacheConfig.keyGeneratorProcessor(logger, o, method, 1, objects);
            }
        };
    }

    // This is the main logic for creating cache keys
    public static String keyGeneratorProcessor(org.slf4j.Logger logger, Object o, Method method, Integer keyIndex, Object... objects) {

        // Retrieve all cache names from each annotation and compose a cache key prefix
        CachePut cachePutAnnotation = method.getAnnotation(CachePut.class);
        Cacheable cacheableAnnotation = method.getAnnotation(Cacheable.class);
        CacheEvict cacheEvictAnnotation = method.getAnnotation(CacheEvict.class);
        org.springframework.cache.annotation.CacheConfig cacheConfigClassAnnotation = o.getClass().getAnnotation(org.springframework.cache.annotation.CacheConfig.class);
        String keyPrefix = "";
        String[] cacheNames = null;

        if (cacheConfigClassAnnotation != null)
            cacheNames = cacheConfigClassAnnotation.cacheNames();


        if (cacheEvictAnnotation != null)
            if (cacheEvictAnnotation.value() != null)
                if (cacheEvictAnnotation.value().length > 0)
                    cacheNames = org.apache.commons.lang3.ArrayUtils.addAll(cacheNames, cacheEvictAnnotation.value());

        if (cachePutAnnotation != null)
            if (cachePutAnnotation.value() != null)
                if (cachePutAnnotation.value().length > 0)
                    cacheNames = org.apache.commons.lang3.ArrayUtils.addAll(cacheNames, cachePutAnnotation.value());

        if (cacheableAnnotation != null)
            if (cacheableAnnotation.value() != null)
                if (cacheableAnnotation.value().length > 0)
                    cacheNames = org.apache.commons.lang3.ArrayUtils.addAll(cacheNames, cacheableAnnotation.value());

        if (cacheNames != null)
            if (cacheNames.length > 0) {
                for (String cacheName : cacheNames)
                    keyPrefix += cacheName + "_";
            }

        StringBuilder sb = new StringBuilder();


        int parameterIndex = 0;
        for (Object obj : objects) {
            if (obj != null && !StringUtils.isEmpty(obj.toString())) {
                if (keyIndex == null)
                    sb.append(obj.toString());
                else if (parameterIndex == keyIndex) {
                    sb.append(obj.toString());
                    break;
                }
            }
            parameterIndex++;
        }


        String fullKey = keyPrefix + sb.toString();

        logger.debug("REDIS KEYGEN for CacheNames: " + keyPrefix + " with KEY: " + fullKey);

        return fullKey;
        //---------------------------------------------------------------------------------------------------------

        // Another example how to do custom cache keys
        // This will generate a unique key of the class name, the method name,
        // and all method parameters appended.
                /*StringBuilder sb = new StringBuilder();
                sb.append(o.getClass().getName());
                sb.append("-" + method.getName() );
                for (Object obj : objects) {
                    if(obj != null)
                        sb.append("-" + obj.toString());
                }

                if(logger.isDebugEnabled())
                    logger.debug("REDIS KEYGEN: " + sb.toString());
                return sb.toString();*/
    }

    // NOTE: the auth token is assumed to come from an externalized property; the property name "redis.token" is just an example
    @Value("${redis.token:}")
    private String mytoken;

    // Create the redis connection here
    @Bean
    public JedisConnectionFactory jedisConnectionFactory() {
        JedisConnectionFactory jedisConFactory = new JedisConnectionFactory();

        jedisConFactory.setUseSsl(true);
        jedisConFactory.setHostName("127.0.0.1");
        jedisConFactory.setPort(6379);

        if (!StringUtils.isEmpty(mytoken)) {
            jedisConFactory.setPassword(mytoken);
        }

        jedisConFactory.setUsePool(true);
        jedisConFactory.afterPropertiesSet();

        return jedisConFactory;
    }

    @Bean
    public static PropertySourcesPlaceholderConfigurer propertySourcesPlaceholderConfigurer() {
        return new PropertySourcesPlaceholderConfigurer();
    }

    @Bean
    public RedisTemplate redisTemplate() {
        RedisTemplate redisTemplate = new RedisTemplate();
        redisTemplate.setConnectionFactory(jedisConnectionFactory());
        redisTemplate.setKeySerializer(new StringRedisSerializer());

        return redisTemplate;
    }

    // Cache configurations like how long data is cached
    @Bean
    public CacheManager cacheManager(RedisTemplate redisTemplate) {
        RedisCacheManager cacheManager = new RedisCacheManager(redisTemplate);

        Map<String, Long> cacheExpiration = new HashMap<>();

        // Expiration per cache name, in seconds
        cacheExpiration.put("USERS", 120L);
        cacheExpiration.put("CARS", 3600L);

        // Number of seconds before expiration. Defaults to unlimited (0)
        cacheManager.setDefaultExpiration(60);
        cacheManager.setExpires(cacheExpiration);
        return cacheManager;
    }
}
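
With this configuration, entries in the USERS cache expire after 120 seconds, entries in the CARS cache after an hour, and entries in any other cache after the 60-second default. Also note the @Profile({"dev","test"}) annotation at the top of the class: this cache configuration is only registered when one of those profiles is active.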

 

Redis Error/Exception Handling

 

import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import org.slf4j.LoggerFactory;
import org.springframework.cache.Cache;
import org.springframework.cache.interceptor.CacheErrorHandler;

public class CustomCacheErrorHandler implements CacheErrorHandler {


    protected final org.slf4j.Logger logger = LoggerFactory.getLogger(this.getClass());

    protected Gson gson = new GsonBuilder().create();


    @Override
    public void handleCacheGetError(RuntimeException exception, Cache cache, Object key) {

        logger.error("Error in REDIS GET operation for KEY: " + key, exception);
        try
        {
            if (cache.get(key) != null && logger.isDebugEnabled())
                logger.debug("Existing data found in the REDIS cache for the failed GET operation, KEY: " + key + " with TYPE: " + cache.get(key).get().getClass() + " and DATA: " + this.gson.toJson(cache.get(key).get()));
        } catch (Exception ex)
        {
            // NOTICE: This exception is deliberately not logged. It most likely occurs because the cache
            // connection is not established, in which case the original exception was the same (no connection
            // to the cache server) and has already been logged above, before the try/catch.
        }
    }

    @Override
    public void handleCachePutError(RuntimeException exception, Cache cache, Object key, Object value) {

        logger.error("Error in REDIS PUT operation for KEY: " + key, exception);
        if (logger.isDebugEnabled())
            logger.debug("Error in REDIS PUT operation for KEY: " + key + " with TYPE: " + value.getClass() + " and DATA: " + this.gson.toJson(value), exception);
    }

    @Override
    public void handleCacheEvictError(RuntimeException exception, Cache cache, Object key) {

        logger.error("Error in REDIS EVICT operation for KEY: " + key, exception);
        try
        {
            if (cache.get(key) != null && logger.isDebugEnabled())
                logger.debug("Existing data found in the REDIS cache for the failed EVICT operation, KEY: " + key + " with TYPE: " + cache.get(key).get().getClass() + " and DATA: " + this.gson.toJson(cache.get(key).get()));
        } catch (Exception ex)
        {
            // NOTICE: This exception is deliberately not logged. It most likely occurs because the cache
            // connection is not established, in which case the original exception was the same (no connection
            // to the cache server) and has already been logged above, before the try/catch.
        }
    }

    @Override
    public void handleCacheClearError(RuntimeException exception, Cache cache) {
        logger.error("Error in REDIS CLEAR operation ", exception);
    }

}
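
Because these handlers only log the error instead of rethrowing it, a Redis outage does not take the application down: a failed GET is treated as a cache miss and the underlying method is executed as usual, while failed PUT/EVICT/CLEAR operations only produce a log entry. The handler is activated by returning it from the errorHandler() override in RedisCacheConfig above.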

Custom Key Generator Example

 
    @Cacheable(value = "USERS", keyGenerator = "keyGeneratorFirstParamKey")
    public UserData getUsers(String userId, Object data)
    {
        // Do something here; the cache key is built from userId only, the data parameter is ignored
    }
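
For completeness, here is a sketch of how the other key generator beans from the configuration above could be used (the service and the UserData type are hypothetical, as in the example above): keyGeneratorFirstParamKey keeps the put key aligned with the cached read, while keyGeneratorSecondParamKey keys on the second parameter only.

import org.springframework.cache.annotation.CacheEvict;
import org.springframework.cache.annotation.CachePut;
import org.springframework.stereotype.Service;

// Hypothetical service showing the other key generator beans from RedisCacheConfig in use.
@Service
public class UserService {

    // Same key as getUsers above ("USERS_" + userId), so the cached entry is replaced when a user is updated.
    @CachePut(value = "USERS", keyGenerator = "keyGeneratorFirstParamKey")
    public UserData updateUser(String userId, UserData data) {
        // Persist the user here; the returned value is written to the cache
        return data;
    }

    // Keys on the second parameter only, so the entry cached for userId is evicted
    // no matter what the first parameter contains.
    @CacheEvict(value = "USERS", keyGenerator = "keyGeneratorSecondParamKey")
    public void removeUser(Object auditInfo, String userId) {
        // Delete the user here; the cached entry for "USERS_" + userId is removed
    }
}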