Overview

Problem Statement

Security is essential for any organization that stores, processes, and exposes sensitive data. These organizations must adhere to strict corporate security policies.

The punch platform collects data from geographically distributed systems and applies large-scale data processing to it: parsing, correlation, aggregation, and machine learning.

Securing such data-centric, distributed applications is challenging. One basic reason is that few interactions follow the traditional client-server pattern, and the data is spread across potentially many different servers:

  • In Elasticsearch and distributed Ceph or S3 storage, the data is partitioned and replicated, requiring authorization checks at multiple points.
  • Punch applications are distributed. Even though they are started (i.e. submitted) from an authenticated server, they actually run as one or several processes on many servers.
  • Secondary or internal services such as the Spark or Storm UI components also access the running applications and data on behalf of users.
  • Storm, Spark, and Kafka clusters scale to tens or hundreds of servers and run many concurrent tasks.

All this significantly increases the number of data access points. Worse, a data lake that concentrates diverse data from across your enterprise in a common, shared platform increases the impact and criticality of any security breach.

These issues are serious, and of course not specific to the punch. Whatever data platform you use, you face them as soon as you rely on distributed and big data technologies.

Solution Overview

The security and governance of every business and organisation is crucial to the punch. To protect its customers effectively, the punch provides many security components that together allow you to set up a holistic security strategy.

Quote

Like holistic medicine, holistic cyber security isn’t just about treating a certain area of the “body” or business but allowing for the health of the whole entity. An holistic approach means to understand the core activities that are needed to operate a business, understand the benefit of the activities to the business, craft or refactor the cybersecurity approach to be minimally disruptive and use the systems created for cybersecurity to add value to the business.

In the rest of the security chapters, we go through the security areas that, put in place together, ensure that your platform and your data as a whole are controlled and protected.