Gateway

Abstract

The Punchplatform Gateway is a RESTful service placed in front of Punchplatform services such as Elasticsearch, the Punchctl client for channels or PML jobs. It gives any external client transparent access to the Punchplatform features through a dedicated set of endpoints.

Start

On a fresh standalone, run :

# run in background
punchplatform-gateway.sh --start
# run in foreground
punchplatform-gateway.sh --start-foreground

Logs

Check the Gateway's status with :

punchplatform-gateway.sh --status

In standalone, check the application logs on your file system :

tail -f $PUNCHPLATFORM_CONF_DIR/../external/punchplatform-gateway-6.0.0/logs/punchplatform-gateway.log

Or check the REST API logs in Kibana, in the platform-logs index.

Feature redirection

All the routing endpoints are described in the REST API doc.

Elasticsearch

Every Elasticsearch cluster is accessible through /es/{cluster_id}.

Requests are forwarded transparently to the Elasticsearch clusters.
Each path matching the pattern /es/{cluster_id}/** is rerouted directly to the corresponding cluster.

Example :

curl -XGET localhost:4242/v1/mytenant/es/es_search/_cat/indices
curl -XPUT localhost:4242/v1/mytenant/es/es_search/newindex

Channels

Channel management is accessible through /channels.

The GET method is used to request channel status, while the POST method is used to execute start and stop actions.

Example :

curl -v -XGET localhost:4242/v1/mytenant/channels
curl -v -XPOST localhost:4242/v1/mytenant/channels/admin/start
curl -v -XGET localhost:4242/v1/mytenant/channels/admin
curl -v -XPOST localhost:4242/v1/mytenant/channels/admin/stop
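
The channel calls can be wrapped in a small shell helper. This is only a sketch: the function name channel_action and the GATEWAY and TENANT variables are hypothetical, and it relies solely on the endpoints shown in the example above.

```shell
# Hypothetical helper around the channel endpoints shown above.
GATEWAY="${GATEWAY:-localhost:4242}"
TENANT="${TENANT:-mytenant}"

channel_action() {
  # $1: channel name, $2: one of status, start, stop
  channel="$1"
  action="$2"
  case "$action" in
    status)
      # GET returns the status of the channel
      curl -s -XGET "$GATEWAY/v1/$TENANT/channels/$channel" ;;
    start|stop)
      # POST executes the start or stop action
      curl -s -XPOST "$GATEWAY/v1/$TENANT/channels/$channel/$action" ;;
    *)
      echo "unknown action: $action (expected status, start or stop)" >&2
      return 2 ;;
  esac
}
```

For instance, channel_action admin start followed by channel_action admin status starts the admin channel and then requests its status. Any other action is rejected locally, before a request is sent.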

Punchline

Punchline application features are accessible through /punchline.

It allows a client to save a punchline to Elasticsearch, query the saved ones, execute them and request the execution results.

Examples :

# save punchline
curl -XPOST localhost:4242/v1/mytenant/punchline/save \
  -F file=@/tmp/dataset_generator.hjson
# scan saved punchlines
curl -XGET localhost:4242/v1/mytenant/punchline/scan
# execute saved punchline
curl -XPOST localhost:4242/v1/mytenant/punchline/{punchline_id}
# directly execute punchline
curl -XPOST localhost:4242/v1/mytenant/punchline \
  -F file=@/tmp/dataset_generator.hjson
# get punchline execution
curl -XGET localhost:4242/v1/mytenant/punchline/{punchline_id}/executions/{execution_id}
# delete punchline
curl -XDELETE localhost:4242/v1/mytenant/punchline/{punchline_id}

Puncher

The Puncher tool for punchlets processing is accessible through /puncher.

It allows a client to directly execute a grok or a dissect operator on inputs, or execute a complete punchlet over a log file.

Examples :

# grok operator on input
curl -XPOST localhost:4242/v1/puncher/grok \
  -F input=@/tmp/inputfile \
  -F pattern=@/tmp/patternfile
# dissect operator on input
curl -XPOST localhost:4242/v1/puncher/dissect \
  -F input=@/tmp/inputfile \
  -F pattern=@/tmp/patternfile
# execute punchlet
curl -XPOST localhost:4242/v1/puncher/dissect \
  -F input=@/tmp/inputfile \
  -F logFile=@/tmp/logfile

Resource Manager

The resource manager to store data is accessible through /resources.

It allows a client to :

  • Upload a resource
  • Download a resource
  • Copy a resource
  • Move a resource
  • Delete a resource
  • List resources inside a tenant
  • Register an external resource

Info

Check Resource Manager Reference Guide to learn more.

Upload

pattern : curl -vX PUT http://localhost:4242/v1/mytenant/resources/upload/<resource_name>

You must provide a form-data body with properties :

  • input : mandatory, input path of the file to store
  • properties: optional, list of custom properties with format ["key=value"]
  • version: optional, specific version to store
  • embedded: optional, set to true if you want to store the data in metadata

curl --location --request PUT 'http://localhost:4242/v1/mytenant/resources/upload/tests/test.txt' \
--form 'input=@/home/lca/Pictures/wp3136254.png' \
--form 'properties={"description":"hello world, and aliens","test":true}' \
--form 'version=1' \
--form 'embedded=true'

Download

pattern : curl -vX GET http://localhost:4242/v1/mytenant/resources/download/<resource_name>?<parameters>

You can provide parameters like :

  • version: specific version to download
  • output: output path to store the downloaded resource on local filesystem

curl --location --request GET 'http://localhost:4242/v1/mytenant/resources/download/tests/test.txt?output=/tmp/output.txt&version=42'

Copy

pattern : curl -vX PUT http://localhost:4242/v1/mytenant/resources/copy/<resource_name>

You must provide a form-data body with properties :

  • destination : mandatory, future path of the file to copy
  • version: optional, specific version to copy
  • embedded: optional, set to true if you want to copy the data in metadata

curl --location --request PUT 'http://localhost:4242/v1/mytenant/resources/copy/tests/test.txt' \
--form 'destination=copies/test.txt' \
--form 'version=42' \
--form 'embedded=true'

Move

pattern : curl -vX PUT http://localhost:4242/v1/mytenant/resources/move/<resource_name>

You must provide a form-data body with properties :

  • destination : mandatory, future path of the file to move
  • version: optional, specific version to move
  • embedded: optional, set to true if you want to move the data in metadata

curl --location --request PUT 'http://localhost:4242/v1/mytenant/resources/move/tests/test.txt' \
--form 'destination=moves/test.txt' \
--form 'version=42' \
--form 'embedded=true'

Delete

pattern : curl -vX DELETE http://localhost:4242/v1/mytenant/resources/delete/<resource_name>?<parameters>

You can provide parameters like :

  • version: specific version to delete

curl --location --request DELETE 'http://localhost:4242/v1/mytenant/resources/delete/images?version=42'

List

pattern : curl -vX GET http://localhost:4242/v1/mytenant/resources/list?<parameters>

You can provide parameters like :

  • pattern: wildcard pattern name to filter metadata resources according to this pattern
  • all: set it to true if you want to list all the versions matching the pattern. If false, only the last version of each resource will be provided
  • filter: filter the results matching the provided filter with format key=value. This parameter can be repeated
  • output: output path to store the list on local filesystem
  • simplify: set it to true if you want to simplify the provided list with only name, version and timestamp

curl --location --request GET 'http://localhost:4242/v1/mytenant/resources/list?all=true&pattern=mytests/*&filter=owner=bob&simplify=true&output=/tmp/simple.txt'
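
Wildcard patterns and key=value filters contain characters such as *, = and & that are safer to URL-encode. A possible sketch uses curl's -G and --data-urlencode options, which append the encoded parameters to the query string. The list_resources function name is hypothetical, and the host and tenant are those of the example above.

```shell
# Hypothetical helper: list resources matching a pattern, with optional filters.
# usage: list_resources <pattern> [key=value ...]
list_resources() {
  pattern="$1"
  shift
  # Turn each remaining argument into a repeated, URL-encoded 'filter' parameter.
  for filter in "$@"; do
    set -- "$@" --data-urlencode "filter=$filter"
    shift
  done
  curl -s -G "http://localhost:4242/v1/mytenant/resources/list" \
    --data-urlencode "pattern=$pattern" \
    --data-urlencode "simplify=true" \
    "$@"
}
```

For example, list_resources 'mytests/*' owner=bob type=json sends one pattern parameter and two filter parameters, each encoded by curl.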

Update properties

pattern : curl -vX POST http://localhost:4242/v1/mytenant/resources/update/<resource_name>

You must provide a raw data body in JSON format with properties :

  • version: optional, specific version to update
  • properties: optional, list of custom properties with format ["key=value"]

curl --location --request POST 'http://localhost:4242/v1/mytenant/resources/update/tests/test.txt' \
--form 'version=12' \
--form 'properties={"description":"hello world, and aliens","type":"json"}'

Register

pattern : curl -vX PUT http://localhost:4242/v1/mytenant/resources/register/<resource_name>

You must provide a raw data body in JSON format with properties :

  • url : mandatory, URL of the file to register. This URL must be complete and must allow a user to retrieve the corresponding data
  • version: optional, specific version to register
  • properties: optional, list of custom properties with format ["key=value"]
  • embedded: optional, set to true if you want to store the data in metadata

curl --location --request PUT 'http://localhost:4242/v1/mytenant/resources/register/tests/test.txt' \
--form 'version=42' \
--form 'url=/tmp/test/test.txt' \
--form 'properties={"description":"hello world, and aliens","test":true}'

Manual configuration

Configuration file :

  • $PUNCHPLATFORM_GATEWAY_INSTALL_DIR/conf/punchplatform-gateway.yml

Basic example :

spring:
  servlet:
    multipart:
      max-file-size: -1
      max-request-size: -1

# Internal server configuration for Gateway
server:
  address: localhost
  port: 4242

# One Gateway is related to a tenant
# The requested tenant is specified inside each request's path and a wrong tenant leads to a 404 error
punchplatform:
  tenant: "mytenant"

# The Gateway has its own reporters sending the gateway metrics inside the ES metric cluster
reporters:
  elasticsearch:
    - hosts:
        - "localhost:9200"
      index_name: "mytenant-gateway-logs"

# This configuration is used for ES forwarding feature
# It MUST contain 2 sections, one to store data and one to store metrics 
# There is no need to configure a 'credentials' section. If either the data cluster or the metric cluster is secured
# with authentication, each forwarded request MUST contain an Authorization header 
elasticsearch:
  data_cluster:
    cluster_id: "es_data"
    hosts:
      - "server1:9200"
    settings:
      - "es.index.read.missing.as.empty: yes"
      - "es.nodes.discovery: true"
  metric_cluster:
    cluster_id: "es_metrics"
    hosts:
      - "server2:9200"
    settings:
      - "es.index.read.missing.as.empty: yes"
      - "es.nodes.discovery: true"
    index_name: "mytenant-metrics"

# Related to channel management
# Disabling this service will lead to a 404 error if requested
channels:
  enabled: true

# Related to Puncher tool
# Disabling this service will lead to a 404 error if requested
puncher:
  enabled: true

# Related to Punchlines executions and management 
# Disabling this service will lead to a 404 error if requested
punchline:
  enabled: true

# Related to forwarding service to ES, with a filtering action according to the configured punchline in this section
# Disabling this service will lead to no filtering applied on forwarded requests to ES
forwarding:
  enabled: true
  punchlet: "file:///home/lca/Applications/punch-standalone-6.1.0-linux/external/punch-gateway-6.1.0/conf/forwarding.punch"
  reload : "0 * * * * *"

# Related to the extraction service
# Disabling this service will lead to a 404 error if requested
services:
  extraction:
    enabled: true
    formats:
      - "csv"
      - "json"

# Related to resources management files and services like documentation and archive files
# Disabling the resource manager service will lead to a 404 error if requested
resources:
  doc_dir: "<path_to_documentation_html_page>"
  tmp_dir: "/tmp"
  archives_dir: "/tmp/extractions"
  manager:
    enabled: true
    timeout: 15000
    metadata:
      elasticsearch:
        - hosts:
            - "server2:9200"
          index: "resources-metadata"
    data:
      file:
        - root_path: "/tmp/punchplatform/manager/resources"

management:
  endpoint:
    httptrace:
      enabled: false
    mappings:
      enabled: false
  endpoints:
    enabled-by-default: false

Security

Authentication forwarding

The Gateway forwards any Authorization header to the Elasticsearch cluster.

The concerned endpoints are :

  • Elasticsearch

All token types supported by Elasticsearch Rest API are also supported by the Punchplatform Gateway.

Abstract

How to get the token?
In the case of standalone with Opendistro, the token is the base64 encoding of the "login:password" string.
You can generate a token using, for example, the website base64encode.org.
The token for the standalone corresponding to the credentials admin:admin is YWRtaW46YWRtaW4=.
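
The same token can be produced locally from a shell, which avoids pasting credentials into a website. A minimal sketch, assuming the base64 utility is installed; the variable names are illustrative:

```shell
# Credentials of the standalone; adapt to your own account.
credentials='admin:admin'

# printf avoids the trailing newline that echo would add to the encoded input.
token=$(printf '%s' "$credentials" | base64)

echo "Authorization: Basic $token"
# prints: Authorization: Basic YWRtaW46YWRtaW4=
```

The resulting header can be passed as is to curl with the -H option.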

Example :

curl -v -XGET localhost:4242/v1/mytenant/es/_cat/indices -H "Authorization: Basic YWRtaW46YWRtaW4="

yellow open platform-logs-2020.01.28             JVGEA2xsRUWDhNCVn18vdg 5 1   10 0  55.2kb  55.2kb
yellow open .kibana_92668751_admin               MdP8UNobT8SmW3U276K6iQ 1 1    1 0   3.7kb   3.7kb
green  open .kibana_1                            Zq5w1fBtSPeIQvZ45vhdyQ 1 0    0 0    261b    261b
yellow open security-auditlog-2020.01.28         W12oeFkYT7qXc3B_pcREog 5 1   11 0 174.8kb 174.8kb
yellow open platform-metricbeat-6.8.6-2020.01.28 UnTZL_U5QZqMU8bZtao94g 1 1 1479 0 962.9kb 962.9kb
green  open .opendistro_security                 Su-xHUevSL2IarcTfhu-lA 1 0    5 0  25.6kb  25.6kb

Authentication for other services

The services that connect to Elasticsearch should be configured with credentials if needed.

The services that may need an Elasticsearch authentication configuration are :

  • ES reporters
  • Resource Manager

The authentication configurations for these services are the same. Example :

reporters:
  elasticsearch:
    - hosts:
        - "server2:9200"
      index_name: "mytenant-gateway-logs"
      credentials:
        user: "admin"
        password: "admin"

Warning

For a production context, make sure this file is properly protected by an appropriate Unix account and permissions.

SSL

There are two ways to activate SSL for the Punchplatform Gateway :

  1. Client to Gateway
  2. Gateway to endpoints

Both features are independent and disabled by default in standalone, but you can enable them in the Gateway configuration file.

SSL for clients to Gateway

A keystore is provided by the standalone in $PUNCHPLATFORM_CONF_DIR/../external/punchplatform-gateway-6.0.0/res/ssl/gateway.keystore

To activate SSL from any client to the Gateway's REST API, set server.ssl.enabled to true :

vi $PUNCHPLATFORM_CONF_DIR/../external/punchplatform-gateway-6.0.0/conf/application-gateway.yml
# conf for standalone
server:
  address: 127.0.0.1
  port: 4242
  ssl:
    enabled: true
    key-alias: "gateway"
    key-store: "/path/to/gateway.keystore"
    key-store-type: "jks"
    key-store-password: "gateway"
    key-password: "gateway"

You can also create your own keystore with :

keytool -genkey -alias myalias -keyalg RSA -keystore gateway.keystore \
          -validity 3650 -storetype JKS \
          -dname "CN=localhost, OU=Spring, O=Pivotal, L=Kailua-Kona, ST=HI, C=US" \
          -keypass changeit -storepass changeit

Then change the configuration according to your new keystore.
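
Once SSL is enabled, clients need to trust the Gateway certificate. As a sketch, assuming the keystore file, alias and password of the keytool command above, the certificate can be exported in PEM format and handed to curl. The function names and the gateway.pem output file are hypothetical, and the host name in the URL must match the certificate's CN (localhost here).

```shell
# Export the certificate stored under the 'myalias' entry, in PEM format (-rfc).
export_gateway_cert() {
  keytool -exportcert -alias myalias -keystore gateway.keystore \
          -storepass changeit -rfc -file gateway.pem
}

# Query the Gateway over HTTPS, trusting only the exported certificate.
list_channels_over_ssl() {
  curl --cacert gateway.pem https://localhost:4242/v1/mytenant/channels
}
```

Run export_gateway_cert once in the directory containing the keystore, then call list_channels_over_ssl to check that the HTTPS endpoint answers.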

SSL for Gateway to endpoints

Each cluster can be referenced as a protected endpoint with ssl_enabled: true.

Additional SSL settings are available, depending on your security architecture :

  • ssl_private_key: Optional, Gateway's private key to connect to the ES cluster.
  • ssl_certificate: Optional, Gateway's certificate to connect to the ES cluster.
  • ssl_trusted_certificate: Optional, Gateway's CA file to connect to the ES cluster. This option enables server-side authentication using the ES certificates.

elasticsearch:
  enabled: true
  data_cluster:
    cluster_id: "es_search"
    hosts:
      - "localhost:9200"
    settings:
      - "es.index.read.missing.as.empty: yes"
      - "es.nodes.discovery: true"
    ssl_enabled: true
    ssl_private_key: "/data/certs/key.pem"
    ssl_certificate: "/data/certs/cert.pem"
    ssl_trusted_certificate: "/data/certs/cafile.pem"

Updating static punchlines nodes resources

If a patch to the spark, pyspark or storm runtime changes the node configurations, you may need to update the static punchline resources found in $PUNCHPLATFORM_GATEWAY_INSTALL_DIR/punchlines/ :

  • spark_nodes.json
  • storm_nodes.json
  • pyspark_nodes.json

punchplatform-inspect-node.sh -h

Spark with mllib

# generates a list of json documents
punchplatform-inspect-node.sh --packages org.thales --runtime spark --mllib --base-class org.apache.spark.ml.PipelineStage > mllib_nodes
# generates a list of json documents
punchplatform-inspect-node.sh --packages org.thales --runtime spark --jar $PUNCHPLATFORM_INSTALL_DIR/lib/spark/punch-spark-uber-*.jar > spark_nodes

# manually concatenate both lists from mllib_nodes and spark_nodes into a single one and replace the content of
# $PUNCHPLATFORM_GATEWAY_INSTALL_DIR/punchlines/spark_nodes.json with the new concatenated one.

Storm

# generates a list of json documents
punchplatform-inspect-node.sh --packages org.thales --runtime storm --jar $PUNCHPLATFORM_INSTALL_DIR/lib/storm/punch-topology-app-*-jar-with-dependencies.jar > $PUNCHPLATFORM_GATEWAY_INSTALL_DIR/punchlines/storm_nodes.json

Pyspark

# generates a list of json documents
punchplatform-inspect-node.sh --packages punchline_python --runtime pyspark > $PUNCHPLATFORM_GATEWAY_INSTALL_DIR/punchlines/pyspark_nodes.json

API Documentation

You can check the API documentation for more information.

The associated javadoc is a part of the user documentation, though some of it targets only the developers community :