Include Visibility in Your Design

This is a follow-up to my blog post “Include Automation in your Design.”

Like automation, visibility is an essential element of software applications that is often overlooked. Visibility is the product feature that lets operations support and users understand what is happening on the platform. Transactions occur because of users, background tasks, API calls, reports and infrastructure-related activities.

Most visibility is created by the engineer, for the engineer. If a user encounters an error, a user-friendly engineer will pop up a message telling the user there is a problem. More often than not, the message will be cryptic, providing the user with little more than “you have a problem.” To compound the problem, the error is not put into a well-thought-out data store for review or real-time alerting.
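As a minimal sketch of the alternative, the snippet below records the failure in a store that operations can review and gives the user a message with a reference they can quote. Everything here is hypothetical and only for illustration: `ERROR_STORE` is an in-memory list standing in for a real data store, and `report_error` and `save_invoice` are made-up names.

```python
import uuid
from datetime import datetime, timezone

# Hypothetical stand-in for a real, queryable error store (database, log service, ...).
ERROR_STORE = []


def report_error(exc, user_id, action):
    """Record the failure for operations and return a message for the user."""
    reference = uuid.uuid4().hex[:8]  # short id the user can quote to support
    ERROR_STORE.append({
        "reference": reference,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "action": action,
        "error_type": type(exc).__name__,
        "detail": str(exc),
    })
    return (
        f"We could not complete '{action}'. "
        f"Please contact support and quote reference {reference}."
    )


def save_invoice(user_id, amount):
    """Made-up business operation that fails on invalid input."""
    try:
        if amount <= 0:
            raise ValueError("amount must be positive")
        return "Invoice saved."
    except ValueError as exc:
        return report_error(exc, user_id, "save invoice")
```

The user sees something actionable, and support can look up the same reference in the store instead of asking the user to reproduce the problem.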

Visibility begins with centralized logging

If you’ve done this, and we all have, you’re probably thinking: “I log all error traffic. I use try {} catch {}, I use log4j. I try very hard to keep the user from experiencing unhandled exceptions.” In most cases the exceptions are indeed caught and logged, but not in a visible fashion. Centralized logging must be part of the product, not a set of disparate log files spread across a server farm. There must be a well-thought-out strategy for capturing all errors, storing them logically, and providing consoles and alerts to help operations manage the environment.
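One way to picture this, sketched here with Python's standard `logging` module rather than log4j, is a handler that forwards every record to a single shared store and raises an alert for errors. The `CentralStoreHandler` class, the `central_store` list and the `alerts` list are assumptions for illustration; in a real system the store would be a database or log service and the alert would be a pager or chat hook.

```python
import logging


class CentralStoreHandler(logging.Handler):
    """Forwards every log record to one shared store and flags errors for alerting."""

    def __init__(self, store, alert):
        super().__init__()
        self.store = store  # hypothetical central store; a list stands in here
        self.alert = alert  # hypothetical callback, e.g. a pager or chat hook

    def emit(self, record):
        event = {
            "logger": record.name,
            "level": record.levelname,
            "message": record.getMessage(),
            "created": record.created,
        }
        self.store.append(event)
        # Anything at ERROR or above is also pushed to the alert channel.
        if record.levelno >= logging.ERROR:
            self.alert(event)


central_store = []
alerts = []

logger = logging.getLogger("orders")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the example from also printing via the root logger
logger.addHandler(CentralStoreHandler(central_store, alerts.append))

try:
    int("not a number")
except ValueError:
    # logger.exception logs at ERROR level and attaches the traceback to the record.
    logger.exception("could not parse order quantity")

logger.info("order service started")
```

The application code still uses try/except and an ordinary logger; the difference is that the records end up somewhere a console or alert can reach them.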

Design a centralized data store for all types of events. You may use different mechanisms and different stores, but centralize by log type, so that all events of a given type land in one place. Remember that all logs will need to be visible through a series of consoles. Building a lightweight API that fronts your log store(s) will help when you put a UI in front of it.
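A rough sketch of such a lightweight API follows. It assumes one backing store per log type, with plain lists standing in for whatever stores you actually use; the `LogApi` class and its `write`, `query` and `types` methods are hypothetical names, not an existing library.

```python
from collections import defaultdict


class LogApi:
    """Thin facade over one store per log type."""

    def __init__(self):
        # Each log type gets its own store; lists stand in for real backends.
        self._stores = defaultdict(list)

    def write(self, log_type, source, message):
        """Append an event to the store for its log type."""
        self._stores[log_type].append(
            {"type": log_type, "source": source, "message": message}
        )

    def query(self, log_type, source=None):
        """Return events of one type, optionally filtered by originating server."""
        events = self._stores.get(log_type, [])
        return [e for e in events if source is None or e["source"] == source]

    def types(self):
        """List the log types that currently hold events."""
        return sorted(self._stores)


api = LogApi()
api.write("error", "web-01", "payment gateway timeout")
api.write("error", "web-02", "null customer id")
api.write("audit", "web-01", "user 42 changed password")
```

A console only needs `types()` and `query()` to render its views, so the stores behind the API can change without touching the UI.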

Most importantly, envision someone having to use and support your code. If you have to spend time sifting through decentralized log files to track down an issue, then you’re not spending that time building new product. Worse yet, if other people have to sift through logs, you have failed to deliver an important part of your product.