Simplifying Monitoring

Tony Orme, Editor at The Broadcast Bridge

As IP, COTS and IT systems advance through broadcast infrastructures, monitoring continues to play a critical role in the efficient and reliable running of a facility. Combining traditional video and audio monitoring with network and system analysis quickly increases complexity, so what do we need to do to optimize monitoring and provide the most reliable system possible?

Few broadcasters have the opportunity to build a new facility from the ground up. Instead, many find themselves having to bolt on hardware and software to provide new functionality. Even more challenging are the systems that need to run workflows in parallel as new technology is installed. Consequently, monolithic architectures soon emerge, bringing complexity and interdependencies that are not always clear.

Maintaining and supporting imposing structures is a difficult task. Designs are often difficult to document and acquiring a picture of how the individual components connect together is often a near impossible challenge. Any system can only be managed effectively if it can be monitored and broadcast infrastructures are no different.

Monitoring Data

Recording monitoring data is easier said than done. It’s easy enough to install probes that can gather terabytes of data a second, but knowing what to do with the information after acquisition can often be as difficult as maintaining the facility in the first place. False positives and false negatives plague monitoring, so too much time is spent on problems that don’t exist and not enough on problems that do.

Monitoring is the process of abstracting away the underlying signal data to put it into a form that we can easily understand and interpret. However, if we abstract too much then the acquired monitoring information is virtually meaningless, and if we don’t abstract enough then we haven’t succeeded in making the signal data easy to understand.

Tuning Measurement

Measuring data rates is a prime example of the challenges abstraction presents. With wide sampling windows it’s very easy to average out data rates so that they look innocent at first glance. With sufficient integration the averaged data rate gives a value that helps us understand the utilization of a network. But the devil is in the detail, and a wide windowing function is probably masking significant underlying peaks and troughs in the data stream. If the windowing function is too narrow, then we run the risk of producing erratic measurements that are virtually meaningless.
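The effect of the window width can be illustrated with a minimal Python sketch. The traffic pattern, packet size, and window widths below are assumptions chosen for illustration, not measurements from a real facility, and the function name windowed_rates is hypothetical. Timestamps are integer microseconds to keep the arithmetic exact.

```python
from typing import List, Tuple

def windowed_rates(packets: List[Tuple[int, int]],
                   window_us: int,
                   duration_us: int) -> List[float]:
    """Return the bit rate (bits per second) in each consecutive window.

    packets is a list of (timestamp in microseconds, size in bytes).
    """
    n_windows = duration_us // window_us
    bits = [0] * n_windows
    for timestamp_us, size_bytes in packets:
        index = timestamp_us // window_us
        if index < n_windows:
            bits[index] += size_bytes * 8
    # Scale the bits counted in each window up to a per-second rate.
    return [b * 1_000_000 / window_us for b in bits]

# Assumed synthetic stream: every 100 ms, a burst of 100 packets of
# 1250 bytes is sent back to back over 10 ms, for one second in total.
packets = [
    (period * 100_000 + i * 100, 1250)
    for period in range(10)
    for i in range(100)
]

wide = windowed_rates(packets, window_us=1_000_000, duration_us=1_000_000)
narrow = windowed_rates(packets, window_us=10_000, duration_us=1_000_000)

print(wide)          # [10000000.0]  -> a steady-looking 10 Mb/s
print(max(narrow))   # 100000000.0   -> 100 Mb/s bursts
print(min(narrow))   # 0.0           -> idle gaps between bursts
```

The one-second window reports a comfortable 10 Mb/s, while the 10 ms window shows the same stream alternating between 100 Mb/s bursts and silence. Neither view is wrong; they answer different questions.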

Understanding data rate flow in a network is incredibly important as it has direct implications for buffer management in switches and receiving devices. If a buffer is configured to be too large, then it can introduce unnecessary latency. If the buffer is too small, then data packets will be dropped, resulting in sound and video dropout and distortion. In this instance, knowing the values of the data rate’s peaks and troughs, or short-term rates, is as important as knowing the value of the long-term average data rate, so that the buffers can be correctly configured and managed.
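The trade-off between buffer size, latency, and packet loss can be sketched with a simple queue model. This is a deliberately simplified illustration: the 1 ms tick, the burst pattern, the drain rate, and the buffer capacities are all assumed values, and simulate_buffer is a hypothetical helper rather than a model of any particular switch.

```python
from typing import List, Tuple

def simulate_buffer(arrivals_bytes: List[int],
                    drain_bytes_per_tick: int,
                    capacity_bytes: int) -> Tuple[int, int]:
    """Feed per-tick arrivals into a fixed-size buffer drained at a constant rate.

    Returns (dropped_bytes, peak_occupancy_bytes).
    """
    occupancy = 0
    dropped = 0
    peak = 0
    for arriving in arrivals_bytes:
        space = capacity_bytes - occupancy
        accepted = min(arriving, space)
        dropped += arriving - accepted      # overflow is lost
        occupancy += accepted
        peak = max(peak, occupancy)
        occupancy = max(0, occupancy - drain_bytes_per_tick)
    return dropped, peak

# Assumed traffic, one tick = 1 ms: 10 ms bursts at 100 Mb/s
# (12,500 bytes per tick) followed by 90 ms of silence, repeated 10 times.
# The long-term average is 10 Mb/s.
arrivals = ([12_500] * 10 + [0] * 90) * 10

# Assumed drain rate of 20 Mb/s (2,500 bytes per tick), twice the average.
DRAIN = 2_500

large = simulate_buffer(arrivals, DRAIN, capacity_bytes=1_000_000)
small = simulate_buffer(arrivals, DRAIN, capacity_bytes=50_000)

print(large)                 # (0, 102500): no loss, 102,500 bytes queued at peak
print(large[1] / DRAIN)      # 41.0: worst-case queuing delay in ms
print(small)                 # (525000, 50000): buffer overflows on every burst
```

Sizing the buffer from the 10 Mb/s average alone would suggest almost no buffering is needed, yet the bursts require roughly 100 kB to avoid loss, at the cost of up to 41 ms of added delay in this model.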

Interconnected Complexity

Configuring monolithic systems is a challenge in itself due to the influence of interconnected devices. This is further exacerbated because the influence of one part of the system on another may not be immediately obvious, and a fault may not manifest itself for some time. This can often send the maintenance engineer on a wild goose chase as they try to rectify the fault.

Bridge Technologies VB440 can provide the basis of an integrated services monitoring solution.

Simplifying systems goes a long way to improving their reliability and resilience. Predictable functionality with well-defined interfaces helps isolate parts of the system that may be exhibiting a fault or just strange behavior. This further encourages effective documentation to help engineers understand signal workflows quickly and easily.

Strategic Monitoring

Although we have to be very careful with our use of abstraction, there is a middle ground: format autodetect. High-level strategic screens show at a glance whether a facility is working correctly or not. These screens further allow an engineer to click deeper into a monitoring system to understand what is going on if a problem occurs. Although this is a fairly standard method of operation, it works much better if format autodetect is employed.

Format autodetect is a method of intelligent anomaly detection. That is, the monitoring devices are capable of detecting the video and audio formats and decoding them for monitoring. This is useful as a network may appear to have excessive jitter or burstiness without the audio or video being affected. Conversely, the network may look fine, but the audio or video could be breaking up. In both instances, being able to automatically detect the higher-level applications, such as the video and audio, is critical in both detecting and locating problems with systems.

Furthermore, a system may appear to be working perfectly satisfactorily but could be outputting the wrong video or audio format. Again, standard monitoring systems may report that there is no fault, but broadcasting the wrong format is a serious issue.
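The three cases described above (network anomaly with healthy media, healthy network with impaired media, and wrong format) can be expressed as a small decision rule. The sketch below is illustrative only: the ProbeReading fields, the status strings, and the format names are hypothetical and do not reflect the data model of any specific monitoring product.

```python
from dataclasses import dataclass

@dataclass
class ProbeReading:
    network_ok: bool       # jitter and burstiness within configured limits
    media_ok: bool         # decoded video and audio free of errors
    detected_format: str   # format reported by autodetect
    expected_format: str   # format the service is supposed to carry

def classify(reading: ProbeReading) -> str:
    """Rank content-level findings above network-level findings."""
    if reading.detected_format != reading.expected_format:
        return "alarm: wrong format"
    if not reading.media_ok:
        return "alarm: media impaired"
    if not reading.network_ok:
        return "warning: network anomaly, media unaffected"
    return "ok"

# Network looks noisy but the pictures and sound are fine.
print(classify(ProbeReading(False, True, "1080i50", "1080i50")))
# Network looks clean but the media is breaking up.
print(classify(ProbeReading(True, False, "1080i50", "1080i50")))
# Everything measures well, but the wrong format is on air.
print(classify(ProbeReading(True, True, "720p50", "1080i50")))
```

Placing the format and media checks ahead of the network check reflects the point made above: what reaches the viewer decides the severity, and network statistics provide the supporting detail.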

Aggregating Monitoring Data

Harvesting data is more important now than ever as vendors have the opportunity to learn fault and anomaly patterns using thousands of probes deployed around hundreds of broadcasters. The anonymized data can be used to help teach monitoring systems what type of faults to look out for based on identified anomaly patterns in the broadcast infrastructure.

All this leads to integration within monitoring. Simplification may only get us so far, but having monitoring systems that can integrate with a complex infrastructure, as well as with each other, helps enormously in detecting faults and pointing to the possible source of the issue.

Integrated services monitoring helps with reporting as the connected probes are able to share data and then collate it into a single place. Not only does this help with fault and anomaly detection but it also allows us to maintain a high-level view of the system as a whole. The monitoring function succeeds in abstracting enough of the underlying infrastructure to effectively simplify the whole broadcast facility, no matter how complex it is.
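As a rough illustration of collating probe data into a single high-level view, the sketch below reduces individual probe reports to the worst status per area and then to one facility-wide status. The probe names, area names, and three-level severity scale are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Assumed severity scale, lowest to highest.
SEVERITY = {"ok": 0, "warning": 1, "alarm": 2}

def summarize(reports: Iterable[Tuple[str, str, str]]) -> Dict[str, str]:
    """Collapse (probe_id, area, status) reports to the worst status per area."""
    worst = defaultdict(lambda: "ok")
    for _probe_id, area, status in reports:
        if SEVERITY[status] > SEVERITY[worst[area]]:
            worst[area] = status
    return dict(worst)

# Hypothetical reports from four probes in three areas.
reports = [
    ("probe-1", "studio", "ok"),
    ("probe-2", "studio", "warning"),
    ("probe-3", "playout", "ok"),
    ("probe-4", "contribution", "alarm"),
]

by_area = summarize(reports)
facility = max(by_area.values(), key=SEVERITY.__getitem__)

print(by_area)    # {'studio': 'warning', 'playout': 'ok', 'contribution': 'alarm'}
print(facility)   # alarm
```

The facility-level status is what a strategic screen would show, while the per-area dictionary is the first level an engineer would click into.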

Direct link to the article by Tony Orme, The Broadcast Bridge