# 5.4.0 to 5.5.0

This document explains what configuration changes MUST be performed during a PunchPlatform update from version 5.4.0 to 5.5.0.
## General changes
**Important**

Before upgrading, it is recommended to stop all your topologies. If this is a concern to you, make sure each topology file has a `name` property that matches the file name. See the example below.
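For example, for a topology file named `parsing_topology.json` (a hypothetical name, the rest of the topology content is elided), the `name` property should match the file name:

```json
{
    "name": "parsing_topology",
    "spouts": [ ... ],
    "bolts": [ ... ]
}
```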
## Configuration files
### punchplatform.properties
**Note**

These changes make it possible to standardize the deployment of Kibana and Elasticsearch plugins, like the other components, and to allow the deployment of the Opendistro Kibana and Elasticsearch plugins.

The configuration of the Punch Kibana plugin, initially located in `punchplatform-deployment.settings`, has moved to the `plugins` section of the `kibana` component in `punchplatform.properties`. Only the version parameter has been removed; the other parameters remain unchanged.
```json
"kibana" : {
    "domains" : { },
    "servers" : { },
    "plugins": {
        "opendistro_security": {
            "ssl_verification_mode": "none",
            "elasticsearch_username": "kibanaserver",
            "elasticsearch_password": "kibanaserver"
        },
        "punchplatform": {
            "zookeeper_cluster": "punchplatform",
            "spark_cluster": "spark_main",
            "extraction_path": "/home/vagrant/extractions",
            "tmp_path": "/home/vagrant/tmp",
            "job_editor_tenant": "mytenant",
            "job_editor_index": "jobs",
            "platform_editor_tenants": ["mytenant"]
        }
    }
}
```
### punchplatform-deployment.settings
As a consequence, the `punchplatform-deployment.settings` file now only refers to the version of each plugin to install:
```json
"kibana_plugins": {
    "opendistro_security": {
        "version": "0.10.0.1"
    },
    "punchplatform": {
        "version": "5.4.0"
    }
},
```
## TLS for spouts and bolts using sockets
The components concerned by this migration are:

- Lumberjack spouts and bolts
- HTTP spouts
- Syslog spouts and bolts

These components must take the following new configuration into account:
```json
{
    "ssl_private_key": "/path/to/private/key",
    "ssl_certificate": "/path/to/certificate",
    "ssl_trusted_certificate": "/path/to/trusted/certificate"
}
```
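For example, a Syslog spout listening over TCP could carry these keys in its `settings` section. This is a minimal sketch, not taken from the release notes: the listener layout, host, port, and file paths are illustrative, and the exact placement of the `ssl_*` keys should be checked against your reference topologies.

```json
{
    "type": "syslog_spout",
    "settings": {
        "listen": {
            "proto": "tcp",
            "host": "0.0.0.0",
            "port": 9999
        },
        "ssl_private_key": "/etc/pki/server.key",
        "ssl_certificate": "/etc/pki/server.crt",
        "ssl_trusted_certificate": "/etc/pki/ca.crt"
    }
}
```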
## PML (Punch Machine Learning)
- The PML acronym is deprecated in favor of PL (Punch Line).
- Added the ability to execute UDFs from the mllib and sql nodes.
- Breaking changes for the `sql_statement` and `sql` nodes:
    - Both nodes have been merged into a single `sql` node, but you can still execute a single query or multiple queries as needed.
    - In your configuration files, replace `type: sql_statement` with `type: sql`.
    - To execute a single query, use the `statement` parameter as before.
    - To execute multiple queries, note that the `statements` parameter has been renamed to `statement_list`:
        - `statement_list` is a list of maps, each containing two keys: `output_table_name: name_of_the_resulting_dataset_from_statement` (String parameter) and `statement: query_to_execute_on_the_input_dataset` (String parameter).
- A new parameter, `register_udf`, has been introduced to register custom UDFs within the running pipeline; see the sketch after this list. It is a list of maps, each containing three keys:
    - `function_name`: name_you_wish_to_attribute_to_your_custom_udf (String parameter)
    - `class_name`: package_name.ClassName (String parameter)
    - `schema_ddl`: return_type_of_your_custom_udf_function (String parameter). Refer to Spark Schema DDL for the available output types.
- New parameter for the mllib node: the `register_udf` parameter is also available there; see the sql node above for more details.
- If you want to implement your own UDF, refer to our starting maven project: LINK
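Below is a hedged sketch of a sql node registering and using a custom UDF. Everything specific here is illustrative and not taken from the release notes: the function name `my_upper`, the class `com.mycompany.udf.MyUpper`, and the table and stream names are hypothetical.

```hjson
{
    type: sql
    component: sql
    settings: {
        # Hypothetical registration: function, class, and return type are placeholders
        register_udf: [
            {
                function_name: my_upper
                class_name: com.mycompany.udf.MyUpper
                schema_ddl: STRING
            }
        ]
        statement: SELECT my_upper(data) AS upper_data FROM input_data
    }
    subscribe: [
        {
            component: input
            stream: data
        }
    ]
    publish: [
        {
            stream: data
        }
    ]
}
```

As for the UDF class itself, here is a minimal sketch, assuming the node accepts plain Spark java UDF implementations (`org.apache.spark.sql.api.java.UDF1`); the starting maven project referenced above remains the reference template:

```java
package com.mycompany.udf;

import org.apache.spark.sql.api.java.UDF1;

// Hypothetical UDF matching the registration sketch above:
// function_name: my_upper, class_name: com.mycompany.udf.MyUpper, schema_ddl: STRING
public class MyUpper implements UDF1<String, String> {
    @Override
    public String call(String value) {
        // Upper-case the incoming column value, passing nulls through.
        return value == null ? null : value.toUpperCase();
    }
}
```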
Example for migrating a `sql_statement` node to the `sql` node:

FROM:

```hjson
{
    job: [
        ...
        {
            type: sql_statement
            component: sql_statement
            settings: {
                statement: SELECT * FROM input_data
            }
            subscribe: [
                {
                    component: input
                    stream: data
                }
            ]
            publish: [
                {
                    stream: data
                }
            ]
        }
    ]
}
```

TO:

```hjson
{
    job: [
        ...
        {
            type: sql
            component: sql_statement
            settings: {
                statement: SELECT * FROM input_data
            }
            subscribe: [
                {
                    component: input
                    stream: data
                }
            ]
            publish: [
                {
                    stream: data
                }
            ]
        }
    ]
}
```
Example for migrating the `sql` node:

FROM:

```hjson
{
    job: [
        ...
        {
            type: sql
            component: sql
            settings: {
                statements: [
                    query1 = SELECT * FROM input_data
                    query2 = SELECT * FROM query1 LIMIT 1
                ]
            }
            subscribe: [
                {
                    component: input
                    stream: data
                }
            ]
            publish: [
                {
                    stream: query1
                }
                {
                    stream: query2
                }
            ]
        }
    ]
}
```

TO:

```hjson
{
    job: [
        ...
        {
            type: sql
            component: sql
            settings: {
                statement_list: [
                    {
                        output_table_name: query1
                        statement: SELECT * FROM input_data
                    }
                    {
                        output_table_name: query2
                        statement: SELECT * FROM query1 LIMIT 1
                    }
                ]
            }
            subscribe: [
                {
                    component: input
                    stream: data
                }
            ]
            publish: [
                {
                    stream: query1
                }
                {
                    stream: query2
                }
            ]
        }
    ]
}
```