Distilling massive data streams.

Aug 03, 2018

Network analytics poses daunting data pipeline challenges.

Niometrics’ network analytics software has been engineered from the ground up with one non-negotiable attribute in mind: high performance (i.e. extracting more computing power out of fewer resources). That enables a wide range of practical benefits, the most tangible of which is a significantly lower hardware ‘thirst’ that translates into reduced TCO (total cost of ownership) for our solutions.

High-performance principles influence not only how each individual component of our stack is engineered but also how its separate modules are interwoven. They therefore shape a critical aspect of our architecture: the deliberate split and parallelisation of processes across a geographically distributed infrastructure, with the degree of data summarisation gradually increasing as information moves from the edge to more central instances.

That design derives directly from the requirements our solutions must satisfy. Chief amongst them are:

  1. The need to deliver ‘true’ real-time insights from massive data streams (which support the deployment of 100%-live visualisations, triggers and alarms);
  2. The need to deliver ‘comprehensive’ offline analyses (which support the development of broad subsequent studies).

Our systems’ data pipeline begins with the processing of logs generated by our DPI engine and ends either on visualisation dashboards or on API events sent to third-party systems. To ensure genuine real-time delivery, a first layer of metadata processing runs locally in each probe, imprinting an early structure onto the vast information streams that traffic inspection incessantly produces.

Consequently, the network probe machines not only run NCORE (our DPI engine), as would be expected, but also initiate decentralised processes in NLIVE (our in-memory data analytics streaming module), NSTORE (our high-insertion record indexing database) and NOLAP (our proprietary OLAP cube, which must be continuously populated with relevant data from the fringes).

Those distributed processes are ultimately combined within more centralised computational resources that sit higher up in our processing chain. Building on the heavy lifting already done at the edge, those consolidating steps make the final contribution to turning our entire pipeline of data streams into increasingly synthesised information sets.
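The edge-then-centre pattern described above can be illustrated with a minimal Python sketch. It is not Niometrics’ implementation: the record fields and the function names (`FlowRecord`, `summarise_at_probe`, `consolidate_centrally`) are hypothetical, chosen only to show how partial summaries built on each probe can be merged centrally into a coarser aggregate.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class FlowRecord(NamedTuple):
    """Hypothetical metadata record, as a probe might emit per flow."""
    subscriber_id: str
    application: str
    bytes_transferred: int


def summarise_at_probe(records: Iterable[FlowRecord]) -> Counter:
    """Edge step: collapse raw records into bytes per (subscriber, application)."""
    summary: Counter = Counter()
    for record in records:
        summary[(record.subscriber_id, record.application)] += record.bytes_transferred
    return summary


def consolidate_centrally(probe_summaries: Iterable[Counter]) -> Counter:
    """Central step: merge probe summaries into a coarser bytes-per-application view."""
    per_application: Counter = Counter()
    for summary in probe_summaries:
        for (_subscriber, application), total in summary.items():
            per_application[application] += total
    return per_application
```

In this sketch only the compact per-probe summaries travel upstream, not the raw records, which is one way of reading the “increasing degree of summarisation” as data moves from the edge to the centre.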

Instant queries then become a staple of our solutions. Triggers and alarms are configured on business rules that are checked continuously against a constant flow of events through complex event processing. Subscriber-level lookups reveal what each specific client is doing right now, making it much easier to troubleshoot diverse types of complaints. Cross-sell and upsell campaigns can make context-specific, time-sensitive offers. And active network management decisions take place when and where they matter most: at the very moment quality deteriorates.
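As a rough illustration of rules being checked against a constant flow of events, the sketch below evaluates each incoming event against a list of configured rules and yields an alarm whenever one matches. It is a deliberately simplified, stateless stand-in for complex event processing, which in practice usually also involves time windows and multi-event patterns. The event fields, the `Rule` type and the `evaluate` function are hypothetical names, not part of any Niometrics API.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Tuple


@dataclass(frozen=True)
class QualityEvent:
    """Hypothetical first-order event derived from probe metadata."""
    subscriber_id: str
    cell_id: str
    latency_ms: float


@dataclass(frozen=True)
class Rule:
    """A named business rule: the predicate decides whether an event raises an alarm."""
    name: str
    predicate: Callable[[QualityEvent], bool]


def evaluate(
    events: Iterable[QualityEvent], rules: Iterable[Rule]
) -> Iterator[Tuple[str, QualityEvent]]:
    """Lazily check every event against every rule, yielding (rule name, event) on a match."""
    rule_list = list(rules)  # materialise once so the rules can be reused for each event
    for event in events:
        for rule in rule_list:
            if rule.predicate(event):
                yield rule.name, event
```

Because `evaluate` is a generator, it can consume an unbounded stream and emit alarms as events arrive, rather than waiting for a batch to complete.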

Ultimately, our Deep Network Analytics (DNA) platform gives CSPs the speed advantage of going from trillions of raw metadata records to first-order events, and from first-order events to aggregates, in a split second, without compromising on the accessibility, retainability and timeliness of critical data streams. It is a capability that, when used correctly (with the right business goals and enablers in place), can open up a wide range of opportunities for CSPs to explore, whether in enhancing their own operational health or in experimenting with whole new propositions to upend their businesses.