
Before You Start

How It Works

To deploy a punchplatform you need a deployer laptop or server and the target servers where you want to deploy your platform. The starting situation is illustrated next.

image

The punchplatform-deployer.sh tool is delivered as part of the installation package. All you have to provide is a description of your target platform, i.e. which component you want to deploy on which server.

Running that tool is easy and fully automated. What is not automated, and what is key for you to have, is a clear idea of your design. Because the punchplatform (and its deployer) is extremely modular, you have a wide range of options, from a small one-node system up to a full-fledged clustered platform.

In short: make sure you go through the rest of this chapter before you start deploying. It will help you.

Configuration Management

To start with, it is important to understand the punch configuration logic and how users interact with the platform. The deployer helps you create the working environment for the platform operators and users. An operator typically acts upon the platform through a few command-line utilities, from a well-identified administration server where these commands have been installed. This is depicted next, where the yellow server is the one where the operator environment has been deployed:

image

These commands are of two kinds. The first two (punchplatform-getconf.sh and punchplatform-putconf.sh) let the operator load or save the per-tenant configuration folder. The other two let the operator start or stop channels and services. To store the configuration, the punch relies on ZooKeeper, a centralized service for maintaining configuration information. Loading or saving the configuration is as simple as executing getconf and putconf:

image

At the very start, somebody has to create a new configuration tree for the tenant. Here is how it typically works: right after the punch is deployed, no configuration is defined, i.e. the deployer is not in charge of creating a tenant configuration. The platform is up and ready, but empty. It looks like this:

image

Somebody must therefore create the first configuration. This is typically a post-deployment operation. It can be done from the deployer machine as illustrated next:

image

The operator is then ready to go and use the platform.

Note

As you can see, it cannot be simpler. The punch design driver is to make deployment and operation as straightforward as possible.

Methodology

Deploying a complete PunchPlatform requires four steps.

Design

The PunchPlatform can be architected in many different ways, to serve different purposes and to take into account various networking and security constraints. The point is: do not start deploying a PunchPlatform if you are unclear on your architecture:

  • where do logs/data come from? Do you need virtual IPs, load balancing, or failover for some of the components?
  • will your platform scale?
  • are you sure you have enough kafka, elasticsearch or ceph storage capacity to fulfill your SLAs?
  • etc.
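
A rough first-order estimate can help answer the storage question above. The sketch below multiplies an event rate by an average event size, a retention period and a number of copies. The formula and every figure in it are illustrative assumptions, not punchplatform sizing rules, and it ignores indexing overhead and compression.

```shell
# Rough storage estimate in GB (1 GB = 10^9 bytes). Illustrative only.
# $1 = events per second, $2 = average event size in bytes,
# $3 = retention in days, $4 = number of copies (replication)
estimate_storage_gb() {
  echo $(( $1 * $2 * 86400 * $3 * $4 / 1000000000 ))
}

# Hypothetical figures: 5000 events/s, 500 bytes each, 30 days, 2 copies
estimate_storage_gb 5000 500 30 2   # prints 12960
```

Treat the result as an order of magnitude to compare with the capacity you plan to provision, not as a guarantee.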

The PunchPlatform stack has been designed from day one to make it easy to deploy a production setup. This is a significant advantage, as it drastically reduces the build costs of your project. That said, part of the success depends on you. If you read the deployment documentation carefully and understand your architecture and components, you can deploy a complete system in a few hours. It will take days or months if your requirements (OSes, networking, VMs, etc.) are not met, just like any large-scale distributed application.

The good news is that the training material of the PunchPlatform is quite rich and easy to work with on any laptop, VM or workbench. Our recommendation for building a platform is the following.

Let us assume that you have system administrator skills, and that (i) you are completely new to the PunchPlatform and (ii) you are asked to deploy a distributed multi-tenant, multi-site setup with a PetaByte of storage.

How do you proceed? Here is our recommendation:

Day 1: deploy a standalone PunchPlatform on your laptop, and follow the 2 and 5 minutes tours of the Getting started guide. This is quick and fun, and you end up with an LTR and LMC combination with every functionality at hand. Take some time to see how you operate and run the platform: starting and stopping channels, having a look at the various UIs, understanding the key concepts.

Day 2: deploy a production setup, again on your laptop. Do it for an LTR, and for a mono-server LMC. Check the PunchPlatform confluence space: there we have blogs explaining how we do that (daily) using vagrant or docker, or natively on our linux or macos laptops. In case you do not have such a laptop, our recommendation is: buy one! If and only if you cannot have one, go to the Thales Cloud, Amazon or OVH and use VMs.

After this second step you will understand what a production deployment is about, how the various components are monitored and supervised, how you can decide in what folder/partition the data is, where the logs are, what version is installed, etc. You will also understand how to write the two key configuration files that describe a platform.

Day 3: you are all set! Check that your architect/project leader gave you clear and concise instructions. Make sure you have the required target architecture, the hardware and/or servers, networking, firewalls. Do not go further before making sure that all that is up and running. We have great tooling for you to check all that.

Then and only then, start deploying your project PunchPlatform.

Select the Right Package

First, download a deployer package from the download area of the punch website. Next, identify the configuration package that fits your needs. You have two packaging options:

  • 1-Node deployer: this option is the easiest to start with. The configuration files are provided and supported by the punchplatform team in order to run all components and services on a single server. If you need to scale to more servers, start from the 1-Node configuration you have chosen and adapt it.
  • Deployer package: this option lets you define your platform on your own. We recommend you use it only with the assistance of the punch expert service.

Tip

These packages are meant to accelerate your deployment. For more details, do not hesitate to contact us at contact@punchplatform.com. We also provide trainings to help your integrators during the installation.

1-node

LTR or LTR-light (DataTransport)

A ready-to-use configuration to run a data collector and forwarder node. It receives data on syslog tcp sockets, and forwards it using the lumberjack protocol to another punchplatform.

This package has only basic processing capabilities. It is meant to collect the data on remote sites and forward it to a (typically) central punchplatform where you run your processing.

LMR (DataManagement)

A ready-to-use configuration to receive, process, index your data and save it locally.

This is a typical ELK-like all-in-one package. A good starting point to setup a log management server use case.

ML (DataAnalytics)

This package is similar to the LMR (DataManagement) package but adds some advanced machine learning capabilities.

You are ready to install and setup the deployer environment. Refer to the Deployer setup guide.

Tip

If you have chosen the one node package, please refer to the 1-node installation guide.

Install Prerequisites

Once you have your Punchplatform up and ready, what you must do next is create your channels, according to your business-specific use cases.

Deployer Installation Guide

Requirements

The punchplatform deployer is supported on Macos, Ubuntu 16.04 or later, Centos 7 or Redhat 7 (not tested) operating systems.
We recommend:

  • 2 CPUs
  • 4 GB of memory
  • 30 GB of additional storage
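
The recommended figures above can be checked with a few standard commands. The following is a minimal sketch for a Linux deployer; the function names, the /data default and the 3800 MB threshold are assumptions made for illustration and are not part of the punchplatform tooling.

```shell
# Compare an actual value with a recommended minimum (both integers).
# $1 = label, $2 = actual value, $3 = recommended minimum
check_min() {
  if [ "$2" -ge "$3" ]; then
    echo "OK   $1: $2 (recommended >= $3)"
  else
    echo "FAIL $1: $2 (recommended >= $3)"
    return 1
  fi
}

# Linux-only probe of the current host; call it explicitly when needed.
# $1 = partition that will hold the deployer package (assumed default: /data)
check_deployer_host() {
  failed=0
  check_min "CPU cores" "$(nproc)" 2 || failed=1
  # A 4 GB machine reports slightly less than 4096 MB, hence the 3800 threshold
  check_min "Memory (MB)" "$(free -m | awk '/^Mem:/ {print $2}')" 3800 || failed=1
  check_min "Free storage (GB)" \
    "$(df -Pk "${1:-/data}" | awk 'NR==2 {print int($4 / 1048576)}')" 30 || failed=1
  return "$failed"
}
```

Running `check_deployer_host /data` prints one OK or FAIL line per recommendation and returns a non-zero status if any of them is not met.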

In the rest of this paragraph we describe the required setup for each supported deployer type.

Ubuntu

Execute the following packages installation:

sudo apt install \
  unzip \
  curl \
  git \
  jq \
  python \
  python-pip \
  sshpass

Install ansible:

# ansible 2.3.0 
sudo pip install ansible==2.3.0

If you encounter problems setting up this required version of ansible from your available repositories, you will find an offline setup tool in the punchplatform deployment package:

unzip punchplatform-deployer-x.y.z.zip
cd deployment_dependencies
unzip ansible-2.3.0-pippackages.zip
cd ansible-2.3.0-pippackages
sudo ./install.sh
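
Whichever installation path you used, it is worth confirming that the installed release is the required one. The helper below is a small sketch with a hypothetical name; it inspects the first line printed by `ansible --version`. Some ansible releases print a four-part number such as 2.3.0.0, so the pattern accepts both forms.

```shell
# Succeeds when the given `ansible --version` first line reports release 2.3.0.
# $1 = first line of the output, e.g. "ansible 2.3.0.0"
has_required_ansible() {
  version=$(printf '%s\n' "$1" | awk '{print $2}')
  case "$version" in
    2.3.0|2.3.0.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Typical use on the deployer:
#   has_required_ansible "$(ansible --version | head -n 1)" \
#     || echo "unexpected ansible version"
```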

Last, only if you need to deploy a CEPH cluster, install the ceph packages on the deployer machine:

# on Ubuntu 18; otherwise use the provided ceph-13 archive
sudo apt install ceph

The CEPH packages are needed on the deployer because some of the deployment steps require CEPH tools.

Centos / RedHat

Execute the following packages installation:

sudo yum install \
    vim \
    wget \
    unzip \
    curl \
    git \
    jq \
    python \
    sshpass \
    python-pip 

CentOS: No package available.

If yum reports that no python-pip package is available, you have to enable the EPEL repository:
sudo yum --enablerepo=extras install epel-release
This command installs the correct EPEL repository for the CentOS version you are running.
After this you will be able to install python-pip.

Install ansible (if you do not have internet access or a local pip repository, use the offline variant below):

# ansible 2.3.0 
sudo pip install ansible==2.3.0

If you encounter a problem with the previous command, or cannot set up this specific required version of ansible from your available repositories, you will find an offline setup tool in the punchplatform deployment package:

unzip punchplatform-deployer-x.y.z.zip
cd deployment_dependencies
unzip ansible-2.3.0-pippackages.zip
cd ansible-2.3.0-pippackages
sudo ./install.sh

Last, perform the following actions:

# disable firewalld on all devices
sudo systemctl disable firewalld
sudo systemctl stop firewalld
sudo vi /etc/sysconfig/selinux
# change the following line :
# SELINUX=enforcing
# by 
# SELINUX=disabled
# and restart the machine 
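
The manual vi edit above can also be scripted. The following is a minimal sketch, assuming GNU or BSD sed; the function name disable_selinux_in is purely illustrative. It rewrites the SELINUX= line of the file you pass and keeps a .bak copy next to it.

```shell
# Set SELINUX=disabled in the given SELinux configuration file.
# $1 = path to the file; a backup is kept as <file>.bak
disable_selinux_in() {
  sed -i.bak 's/^SELINUX=.*/SELINUX=disabled/' "$1"
}

# Typical use (as root), followed by a reboot:
#   disable_selinux_in /etc/selinux/config
```

On CentOS 7, /etc/sysconfig/selinux is usually a symbolic link to /etc/selinux/config. Since sed -i replaces the file it is given rather than following the link, the sketch is best pointed at /etc/selinux/config.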

MacOS

MacOS works as a deployer server, except that it does not allow you to deploy a CEPH cluster. Be careful to install ansible version 2.3.0, as explained below.

First install Xcode. Then install the following packages:

sudo easy_install pip
sudo pip install ansible==2.3.0
# the Homebrew formula providing the GNU core utilities is named coreutils
brew install coreutils

Additional Package Installation

Download one of the deployer packages provided by the punchplatform team or from punchplatform.com.
We recommend moving it to a large storage partition (for instance /data):

wget <link>
sudo mkdir -p /data
# give the current user ownership of the target partition
sudo chown $(whoami) /data
unzip punchplatform-<package>-<version> -d /data

If you plan to deploy a CEPH cluster you need additional steps.

Download the external archives corresponding to your deployer:

then move it to the punchplatform-<package>-<version>/archives directory and rename it ceph_<version>.tgz.

Download, rename and move in one command

wget https://punchplatform.com/artefacts/ceph/ceph_13.2.5_[distribution].tgz -O punchplatform-[package]-[version]/archives/ceph_13.2.5.tgz

Then install the ceph archives on your deployment server:

cd punchplatform-<package>-<version>/archives
tar -xvf ceph_<version>.tgz
sudo yum install -y lttng-ust
sudo yum install -y ceph<version>/*

Remember deploying CEPH requires a Redhat or Centos deployer.

Deployer Environment Setup

Update your PATH so that punchplatform-deployer.sh is available:

Ubuntu / Centos / Redhat

cd punchplatform-<package>-<version>
echo "export PATH=$(pwd)/bin:\$PATH" >> ~/.bashrc

MacOS

cd punchplatform-<package>-<version>
echo "export PATH=$(pwd)/bin:\$PATH" >> ~/.bash_profile

Do not forget to reload your shell configuration to take this environment update into account in your terminal: either log in again or source ~/.bashrc (~/.bash_profile on MacOS).
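
Running the echo command several times adds the same line to your startup file again and again. If you expect to repeat the setup, a small guard avoids that. The helper below is a sketch with a hypothetical name; it appends a line only when the file does not already contain it.

```shell
# Append a line to a file unless an identical line is already present.
# $1 = exact line to add, $2 = target file (created if missing)
append_once() {
  touch "$2"
  grep -qxF -- "$1" "$2" || printf '%s\n' "$1" >> "$2"
}

# Typical use, from the unzipped deployer directory:
#   append_once "export PATH=$(pwd)/bin:\$PATH" ~/.bashrc
```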

Deployer Configuration directory

Next, create your platform configuration directory.
This directory will hold the description of your target platform with the punchplatform.properties and the punchplatform-deployment.settings files.

Ubuntu/Centos/Redhat:

cd ~
mkdir pp-deployment-conf
cd pp-deployment-conf
echo "export PUNCHPLATFORM_CONF_DIR=$(pwd)" >> ~/.bashrc
cd ..
mkdir pp-deployment-logs
cd pp-deployment-logs
echo "export PUNCHPLATFORM_LOG_DIR=$(pwd)" >> ~/.bashrc

Macos:

cd ~
mkdir pp-deployment-conf
cd pp-deployment-conf
echo "export PUNCHPLATFORM_CONF_DIR=$(pwd)" >> ~/.bash_profile
cd ..
mkdir pp-deployment-logs
cd pp-deployment-logs
echo "export PUNCHPLATFORM_LOG_DIR=$(pwd)" >> ~/.bash_profile

Do not forget to reload your shell configuration to take this environment update into account in your terminal (either log in again or source the file):

# Ubuntu/Centos/Redhat
source ~/.bashrc

# Macos
source ~/.bash_profile

Check it worked as expected. The result of the env command must look like:

env | grep PUNCH
PUNCHPLATFORM_LOG_DIR=/Users/dimi/pp-deployment-logs
PUNCHPLATFORM_CONF_DIR=/Users/dimi/pp-deployment-conf
echo $PATH
/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Users/dimi/bin:/data/deployer/punchplatform-deployer-5.1.0/bin
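
The same verification can be scripted so that it fails loudly when something is missing. The function below is a minimal sketch with a hypothetical name; it only relies on the two environment variables and the punchplatform-deployer.sh command introduced in this guide.

```shell
# Check that the deployer environment described above is in place.
# Prints one OK/FAIL line per item and returns non-zero if any check fails.
check_deployer_env() {
  failed=0
  for var in PUNCHPLATFORM_CONF_DIR PUNCHPLATFORM_LOG_DIR; do
    # Read the variable whose name is stored in $var (empty if unset)
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
      echo "FAIL $var is not set"; failed=1
    elif [ ! -d "$value" ]; then
      echo "FAIL $var=$value is not a directory"; failed=1
    else
      echo "OK   $var=$value"
    fi
  done
  if command -v punchplatform-deployer.sh >/dev/null 2>&1; then
    echo "OK   punchplatform-deployer.sh found in PATH"
  else
    echo "FAIL punchplatform-deployer.sh not found in PATH"; failed=1
  fi
  return "$failed"
}

# Typical use, in a freshly opened terminal:
#   check_deployer_env || echo "fix the items marked FAIL before deploying"
```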

Target Servers Installation Guide

Requirements

The punchplatform is only supported on Ubuntu 16.04 or later, Centos 7 or Redhat 7 operating systems.

The choice of these operating systems is the result of extensive testing, running and tuning platforms on production systems for years. Leveraging these OSes, you will benefit from a great deal of feedback to tune and debug common issues.

The target servers are going to be installed with binaries and configurations to run the punchplatform components. The punchplatform deployer tool internally uses ansible to do that.

In this chapter we list the requirements and checks you must ensure before deploying.

Infrastructure Prerequisites

Make sure:

  • All network interfaces are up and configured.
  • Storage partitions are mounted and writable, except for the block devices intended to be managed by Ceph.

Ceph specific requirements:

You must prevent the updatedb process (standard on Debian-like distributions) from scanning the whole system, and especially the Ceph data partition or the punchplatform partition. You can do that in two ways.

Use the Ansible playbook provided in the official Punchplatform deployer to automatically patch the configuration file on multiple nodes.
This playbook is in the updatedb_patch directory, at the root of the deployer directory. Its use is documented in the playbook itself.

# add your ceph nodes in the inventory
vim inventory_updatedb_patch.inv

# apply playbook (ssh access from deployer to all servers required)
ansible-playbook -i inventory_updatedb_patch.inv updatedb_patch.yml

Manual process

On Ceph nodes, manually patch the /etc/updatedb.conf configuration file, adding /var/lib/ceph to the PRUNEPATHS value.

On all servers, manually patch the /etc/updatedb.conf configuration file, adding /data to the PRUNEPATHS value.

For example, /etc/updatedb.conf must then contain both paths in its space-separated PRUNEPATHS value: PRUNEPATHS="/var/lib/ceph /data"
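
The manual patch can be scripted for a handful of servers. The sketch below assumes the usual updatedb.conf layout, where PRUNEPATHS is a single double-quoted, space-separated list; the function name add_prunepath is hypothetical, and this is not the playbook shipped with the deployer.

```shell
# Add a directory to the PRUNEPATHS list of an updatedb configuration file.
# $1 = configuration file, $2 = directory to exclude from updatedb scans
# Does nothing when the directory is already listed; keeps a .bak copy otherwise.
add_prunepath() {
  if ! grep -q '^PRUNEPATHS=' "$1"; then
    # No PRUNEPATHS line yet: create one
    printf 'PRUNEPATHS="%s"\n' "$2" >> "$1"
  elif ! grep -Eq "^PRUNEPATHS=\"(.* )?${2}( .*)?\"" "$1"; then
    # Append the directory at the end of the existing list
    sed -i.bak "s|^PRUNEPATHS=\"\(.*\)\"|PRUNEPATHS=\"\1 ${2}\"|" "$1"
  fi
}

# Typical use (as root):
#   add_prunepath /etc/updatedb.conf /var/lib/ceph   # Ceph nodes only
#   add_prunepath /etc/updatedb.conf /data           # all servers
```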

Note

Preventing updatedb from scanning the whole system is necessary on a server exposing many files (typically the situation on a Ceph server), as the updatedb internal database can grow quickly and dramatically.

Additional Elasticsearch Prerequisites

The execution prerequisites are described in the Elasticsearch public documentation. This section highlights some specifics that are often overlooked during server/OS setup and may cause deployment problems:

  • the /tmp partition must not be mounted with the 'noexec' option, otherwise Elasticsearch will fail its bootstrap checks with the following error message in its log file: system call filters failed to install

  • Elasticsearch by default requires that its process memory can be locked to prevent swapping. The target servers (or virtual servers) must be able to deliver this feature, i.e. no specific hardening should prevent Elasticsearch from requesting a memory lock from the kernel. Elasticsearch checks this at startup time during its 'bootstrap checks' and fails if no memory, or not enough memory, could be locked.
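
The first point can be checked before deploying. The helper below is a small sketch with a hypothetical name; it only inspects a comma-separated list of mount options, which you can obtain on Linux with findmnt from util-linux.

```shell
# Succeeds when a comma-separated mount option list contains "noexec".
# $1 = mount options, e.g. "rw,nosuid,nodev,noexec,relatime"
mount_has_noexec() {
  case ",$1," in
    *,noexec,*) return 0 ;;
    *) return 1 ;;
  esac
}

# Typical use on a Linux target server:
#   if mount_has_noexec "$(findmnt -n -o OPTIONS --target /tmp)"; then
#     echo "/tmp is mounted noexec: Elasticsearch bootstrap checks are likely to fail"
#   fi
```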

System Prerequisites

  • Administration access: an administration account must be provided, reachable through SSH from the installation environment to the servers. This account must have sudo rights, for instance to update the systemctl configuration.
  • Naming resolution: naming resolution must be configured (short and long hostnames are resolved). When resolving, from any target machine, its own hostname or that of any other target machine (i.e. the output of the 'hostname' command), the result must be the address of the production network interface (as opposed to any administration or supervision/monitoring dedicated network interface).
  • Time synchronisation: a full time synchronisation infrastructure like NTP must be configured and running.
  • Repository: the standard repositories of the chosen operating system must be available. To check that they are correctly configured, we recommend updating all servers before the punch deployment. Internet access works, and so does a private repository. For centos deployments, the standard 'epel' repository must be enabled and available.
  • System language: en_US.UTF-8

localectl | grep LANG
     System Locale: LANG=en_US.utf8

# If this is not the case, change it :
sudo localectl set-locale LANG=en_US.utf8
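
The naming resolution requirement listed above is easy to get wrong when a hosts entry maps the server's own name to a loopback address, as some distributions do by default. The helper below is a sketch with a hypothetical name that flags this case; it does not verify that the address belongs to the production interface, which remains a manual check.

```shell
# Succeeds when the given address is a loopback address (IPv4 127.x.x.x or IPv6 ::1).
# $1 = IP address, e.g. the first field printed by `getent hosts <name>`
is_loopback_address() {
  case "$1" in
    127.*|::1) return 0 ;;
    *) return 1 ;;
  esac
}

# Typical use on a Linux target server:
#   addr=$(getent hosts "$(hostname)" | awk '{print $1; exit}')
#   if is_loopback_address "$addr"; then
#     echo "$(hostname) resolves to $addr: fix the naming resolution first"
#   fi
```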

The following packages and configuration are mandatory on all target servers:

# Ubuntu
sudo apt install python

# Centos/Redhat
sudo yum install python

# Centos only: disable firewalld 
sudo systemctl disable firewalld
sudo systemctl stop firewalld
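
Since the deployer relies on ansible, a quick way to confirm SSH access and a usable python on every target is the ansible ping module. The sketch below only generates a throwaway inventory; the group name punch_targets, the file name targets.inv and the host names are assumptions chosen for illustration.

```shell
# Print a minimal INI inventory with one group holding the given hosts.
# $@ = target host names, as resolvable from the deployer
make_inventory() {
  echo "[punch_targets]"
  for host in "$@"; do
    echo "$host"
  done
}

# Typical use from the deployer (host names are placeholders):
#   make_inventory server1 server2 server3 > targets.inv
#   ansible punch_targets -i targets.inv -m ping
```

A host that cannot be reached over SSH, or that lacks python, is reported as unreachable or failed by ansible, which points to the prerequisite to fix before deploying.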

Deploy the platform

You must now precisely define your platform by selecting the components you need. This is easy if you work with a ready-to-use package, as all the choices have been made for you.

In case you selected the full deployer package, it is your task to create and fill two files: punchplatform.properties and punchplatform-deployment.settings.

We strongly suggest you first try the tutorials; they will help you progress step by step.
