Pragma Analytics Software Suite

Once deployed and adapted to your needs, the PASS software suite offers a powerful graphical front-end interface built with the latest Web development technologies. This animation gives you an overview of the graphical capabilities of this interface, including the ways it can represent data.

We invite you to follow the rest of this presentation to understand our solution.

Pragma Analytics Software Suite design

The PASS suite is based on the reference model of big data architectures. From the bottom up, the key elements that keep the system running are:

  • The data ingestion layer formats, enriches, and standardizes the incoming data,
  • The message broker exchanges data between the ingestion layer and the backend,
  • The backend provides storage, querying, and the desired level of resiliency for the data.

Finally, relying on the API of the backend, we find:

  • Data consultation tools and dashboard organization,
  • Security use cases,
  • Engineering,
  • Business workflow optimization.

For each of these reference functions, we have validated software that we can assemble to meet a given set of specifications. We can also offer our expertise to validate your working hypotheses.

Ingest layer

The data ingestion layer is undoubtedly the most critical for obtaining the optimal end result. Because our aim is to make information available in graphical form in near real time, an efficient ingestion layer is essential. The software must scale horizontally and, as far as possible, without any need for synchronization (a “shared-nothing” approach). It is also key to build a data model that the chosen backend can ingest easily. The data model always favors speed and low processing cost, even if that means accepting a reasonable increase in data volume.
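The shared-nothing idea can be illustrated with a minimal Python sketch. It is not PASS code: the record fields (exporter, bytes), the worker count, and the helper names are assumptions chosen for illustration. Each record is routed to exactly one worker by a stable hash of its key, so workers never need to synchronize and their results can be merged without conflict.

```python
import zlib
from collections import defaultdict


def partition_for(key: str, workers: int) -> int:
    """Map a record key to a worker index with a stable hash (CRC32)."""
    return zlib.crc32(key.encode("utf-8")) % workers


def ingest(records, workers: int):
    """Route each record to exactly one worker; every worker keeps private state only."""
    state = [defaultdict(int) for _ in range(workers)]  # one private table per worker
    for rec in records:
        idx = partition_for(rec["exporter"], workers)
        state[idx][rec["exporter"]] += rec["bytes"]
    return state


# Hypothetical flow records, for illustration only.
records = [
    {"exporter": "router-a", "bytes": 1200},
    {"exporter": "router-b", "bytes": 300},
    {"exporter": "router-a", "bytes": 800},
]
per_worker = ingest(records, workers=4)

totals = {}
for table in per_worker:
    totals.update(table)  # a key lives on one worker only, so a plain merge is enough
print(totals)
```

Because a given key always lands on the same worker, adding capacity is a matter of adding workers and repartitioning, with no lock or shared counter involved.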

The modules we currently use are either open source modules such as pmacct (NetFlow and sFlow data) or modules we developed ourselves in Go (SNMP processing, and Charging Data Records (CDR) from voice networks).

Message broker and inter-process communication

We offer two solutions for communication within the PASS solution. For communication between software layers, whatever their programming language, we prefer Kafka. It brings a good level of resilience and load sharing to the solution. Moreover, when ingested data arrives in peaks (bursty traffic such as NetFlow), Kafka also smooths the load and acts as an application-level traffic shaper.
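The smoothing effect of a buffer placed between a bursty producer and a steady consumer can be sketched with a small simulation. This is a toy model in plain Python, not the Kafka API: the tick-based timing, the arrival counts, and the consumer rate are all assumptions made for illustration.

```python
from collections import deque


def drain(arrivals, consumer_rate):
    """Simulate a buffer between a bursty producer and a fixed-rate consumer.

    arrivals[i] is the number of messages produced during tick i.
    Returns (messages processed per tick, backlog size after each tick).
    """
    backlog = deque()
    processed, depth = [], []
    for tick, count in enumerate(arrivals):
        backlog.extend((tick, n) for n in range(count))  # the burst lands in the buffer
        done = 0
        while backlog and done < consumer_rate:  # the consumer never exceeds its rate
            backlog.popleft()
            done += 1
        processed.append(done)
        depth.append(len(backlog))
    return processed, depth


# A burst of 10 messages at tick 1, then a small one at tick 4.
arrivals = [0, 10, 0, 0, 2, 0]
processed, depth = drain(arrivals, consumer_rate=3)
print(processed)
print(depth)
```

The consumer sees at most three messages per tick regardless of the burst size, while the buffer absorbs the difference and empties once traffic calms down.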

For communication between modules of the same layer, we favor the ZeroMQ library. Data is exchanged in binary or in a simple format such as MessagePack. JSON can become a performance problem because of the cost of serialization and deserialization.
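To give a feel for the difference between a binary encoding and JSON, here is a minimal sketch using only the Python standard library. The fixed binary layout built with struct stands in for a compact format; it is not MessagePack and not the PASS wire format, and the record fields are hypothetical.

```python
import json
import struct

# Hypothetical flow record: unix timestamp, source IPv4, destination IPv4 (as integers), byte count.
RECORD = struct.Struct("!IIIQ")  # network byte order: three uint32 and one uint64


def encode_binary(ts, src, dst, nbytes):
    """Pack the record into a fixed-size binary payload."""
    return RECORD.pack(ts, src, dst, nbytes)


def decode_binary(payload):
    """Unpack a binary payload back into a tuple of fields."""
    return RECORD.unpack(payload)


def encode_json(ts, src, dst, nbytes):
    """Encode the same record as UTF-8 JSON text."""
    return json.dumps({"ts": ts, "src": src, "dst": dst, "bytes": nbytes}).encode("utf-8")


sample = (1700000000, 3232235777, 167772161, 1500)
binary = encode_binary(*sample)
text = encode_json(*sample)
print(len(binary), len(text))  # payload sizes in bytes for the two encodings
```

The binary payload has a fixed size and needs no text parsing on the receiving side, which is where most of the JSON cost comes from. Actual gains depend on the record shape and the library used, so they should be measured on real traffic.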

Backend

Pragma Innovation positions itself on a very specific segment of big data. We focus on event analysis where the notion of time is crucial. We can meet needs such as centralized log repositories and the analysis of billing records or of records from network equipment or production lines.

To meet these needs, we use two types of backend: Druid and ClickHouse.

These two solutions are quite close to each other. Both are column-oriented databases designed for OLAP workloads. Druid offers automatic temporal aggregation, which ClickHouse does not. ClickHouse offers flexibility and an SQL interface that Druid does not always provide. Depending on your specific use cases, we will direct you to the best solution for your needs. You can also consult our existing use cases.
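Temporal aggregation can be pictured as truncating each event timestamp to a bucket boundary and summing the metric per bucket and dimension. The sketch below shows that idea in plain Python. It is not Druid's implementation, and the event layout (timestamp in seconds, dimension, value) is an assumption for illustration.

```python
from collections import defaultdict


def rollup(events, granularity_s):
    """Aggregate raw events into time buckets, summing the metric per dimension."""
    buckets = defaultdict(int)
    for ts, dimension, value in events:
        bucket_start = ts - (ts % granularity_s)  # truncate to the bucket boundary
        buckets[(bucket_start, dimension)] += value
    return dict(buckets)


# Hypothetical events: (timestamp in seconds, dimension, metric value).
events = [
    (5, "router-a", 100),
    (42, "router-a", 250),
    (61, "router-a", 40),
    (59, "router-b", 7),
]
rolled = rollup(events, granularity_s=60)
print(rolled)
```

Four raw events collapse into three stored rows here. On real traffic with many events per bucket the reduction is much larger, at the cost of losing per-event detail inside each bucket.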

As always, we rely on the very robust PostgreSQL database, which handles the metadata of our big data solutions. The frontend of the PASS solution also uses this database.

Big data, the icing on the cake! We take this opportunity to clarify what big data means for Pragma Innovation. We select our solutions so that they scale horizontally. A big data solution must also be able to scale down to a single server holding a few terabytes of data, and scale up to petabytes to support our customers as they grow. Evaluating the PASS stack is therefore simple and inexpensive.

Frontend and dashboards

Every data consultation system needs a graphical interface that is both easy to use and rich in graphical functionality. We chose the open source Superset, which was initially developed by Airbnb and more recently taken over by the Apache Software Foundation.

This frontend has a driver for the Druid backend, and it can also interface with any system offering an SQL interface. It connects to SQL databases through the SQLAlchemy library.

Superset is built on the Flask framework, uses JavaScript (React) to provide good graphical performance, and draws on the wide range of possibilities of the D3.js library. We recommend this frontend for our deployments.

If an existing tool has to be taken into account, an integration can be considered. For example, a graphical tool such as Grafana has plug-ins for our backends.