DAVE-6.1.0 release notes¶
This document summarizes the content, changes, limitations and fixes of this release compared to the DAVE-6.0.1 release.
For full documentation of the release, a guide to get started, and information about the project, see the Punchplatform project site.
The documentation for this release can be found inside the deployment archives (standalone and deployer versions), and at https://doc.punchplatform.com/6.1.0/index.html.
The documentation for the most recent release can be found at https://doc.punchplatform.com.
Note about upgrades: Please review the upgrade documentation for this release thoroughly before upgrading your clusters. The upgrade notes (e.g. upgrade from 6.0 to 6.1) discuss critical information about incompatibilities and breaking changes, performance changes, and any other configuration changes that might impact your production deployment of Punchplatform.
Do NOT use this version
Version 6.1.0 has a major bug (#998) preventing Kibana from starting normally.
Release main features and enhancements¶
6.1 compared to 6.0¶
Storm-like punchlines¶
- The File Output node now supports writing archives using Avro format.
- added SFTP Input node
- added Extraction Input node
- added File Transfer Output node
Pyspark & Spark punchlines¶
- added Kafka Batch Input node for spark and pyspark runtime.
- deprecated plan_settings in favor of settings in plan.hjson.
- drastically improved the exception returned by the spark application for --runtime spark. It now follows the ECS 1.5 convention.
- added more properties to logger configuration to hide unwanted information.
- pyspark runtime now brings pyarrow as a dependency. In case you are using a pyarrow version <= 1.5, use 'export ARROW_PRE_0_15_IPC_FORMAT=1' before launching your punchline.
- fixed multiple scenarios where submitting a spark application using punchlinectl/planctl/shiva to a spark cluster would not work
- both pyspark and spark runtimes now support, out of the box, reading/storing dataframes from/to S3/minio by using the file_input/file_output nodes
- sql node now supports the registration of custom udf written in python for pyspark runtime
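The pyarrow compatibility variable mentioned in the list above can also be set from a Python launcher script instead of the shell. The following is only a minimal sketch: the helper name with_arrow_compat is hypothetical, and how you then start your punchline is up to you. Only the variable name and its value come from these notes.

```python
import os


def with_arrow_compat(env=None):
    """Return a copy of an environment with the Arrow compatibility flag set.

    Hypothetical helper: only the variable name and value are taken from
    the release notes. The input mapping is never modified.
    """
    new_env = dict(os.environ if env is None else env)
    new_env["ARROW_PRE_0_15_IPC_FORMAT"] = "1"
    return new_env


if __name__ == "__main__":
    env = with_arrow_compat()
    # The resulting mapping can be passed as env=... to subprocess.run
    # when launching your punchline command.
    print(env["ARROW_PRE_0_15_IPC_FORMAT"])
```

Returning a copy rather than mutating os.environ keeps the setting local to the launched process.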
planctl command¶
- reduced the number of processes spawned during planctl execution by 2. This improves RAM usage when a plan is launched by shiva.
Shiva tasks scheduler refactoring and enhancement:¶
- Shiva now relies on Kafka topics to store its command/management data, instead of relying directly on zookeeper. This prepares for a reduced dependency on the zookeeper framework.
- Shiva now provides an isolated configuration environment between applications, using the configuration at the time of the channel submit. (Before, the latest resource versions of a tenant were used for all restarting applications previously submitted, without an explicit operator decision.) See Shiva architecture and Shiva Protocol.
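The "configuration at the time of the channel submit" behaviour can be pictured with a small model. This is an illustrative sketch only, not Shiva's implementation: the class SubmittedApplication and its fields are hypothetical, and it only shows the snapshot idea.

```python
import copy


class SubmittedApplication:
    """Hypothetical model of an application submitted to a scheduler."""

    def __init__(self, name, tenant_config):
        self.name = name
        # Snapshot taken at submit time: later edits to tenant_config
        # are not visible to this application.
        self.config = copy.deepcopy(tenant_config)

    def restart_config(self):
        """Configuration used on restart: always the submit-time snapshot."""
        return self.config


if __name__ == "__main__":
    tenant = {"punchline": {"workers": 1}}
    app = SubmittedApplication("demo", tenant)
    tenant["punchline"]["workers"] = 4  # operator edits resources afterwards
    print(app.restart_config())
```

In this model an application only picks up new resources when it is submitted again, which matches the explicit operator decision described above.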
Runtime configuration resolver¶
- A tool called 'resolver' has been added to change/complement channel files at start time (see Resolver). It reduces configuration verbosity and helps make the configuration more platform-independent.
- All channel configuration files (channel_structure, punchlines, monitoring services configuration files) can now be updated/completed on-the-fly by the "resolver" mechanism. This allows for more platform-independent channel configuration if desired. The use of the "resolver" mechanism is still optional: all needed settings can be provided in the channel files themselves. The resolv.hjson file is now mandatory at deployment time, but can contain no resolving rules if plain, complete configuration files are provided.
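The idea of completing a configuration file at start time can be sketched as follows. This does not reflect the actual resolv.hjson syntax or the real resolver: the rule format (a dotted path mapped to a default value) and the function name resolve are assumptions made purely for illustration, and the sketch only completes missing settings without overriding existing ones.

```python
import copy


def resolve(config, rules):
    """Return a copy of config completed by rules.

    Hypothetical rule format: {"dotted.path": value}. A value is only
    added when the setting is absent. Intermediate values along a path
    are assumed to be dictionaries.
    """
    resolved = copy.deepcopy(config)
    for path, value in rules.items():
        *parents, leaf = path.split(".")
        node = resolved
        for key in parents:
            node = node.setdefault(key, {})
        node.setdefault(leaf, value)
    return resolved


if __name__ == "__main__":
    punchline = {"settings": {"topic": "logs"}}
    rules = {"settings.brokers": "local"}
    print(resolve(punchline, rules))
```

With an empty rule set the function returns an identical copy, which mirrors the case of a resolv.hjson file that contains no resolving rules.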
Channels monitoring service has been fixed and improved to:¶
- provide periodic health reports even for applications running inside Shiva
- prevent "RED" health status display for applications that have just been started by the operator (the monitoring will activate a few minutes later, once all needed metrics and events are available to compute health)
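The delayed activation described above amounts to a grace period after start. The sketch below is an assumption about the general technique, not the monitoring service's code: the function name, the 300-second value (the notes only say "a few minutes") and the use of None for "no health yet" are all hypothetical.

```python
GRACE_PERIOD_SECONDS = 300  # hypothetical value; the notes say "a few minutes"


def displayed_health(computed_health, started_at, now, grace=GRACE_PERIOD_SECONDS):
    """Return the health to display, or None while still in the grace period.

    started_at and now are timestamps in seconds.
    """
    if now - started_at < grace:
        # Metrics and events may not be available yet: report nothing
        # rather than a misleading RED status.
        return None
    return computed_health


if __name__ == "__main__":
    print(displayed_health("RED", started_at=0, now=60))
    print(displayed_health("RED", started_at=0, now=600))
```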
Elastalert¶
- enhanced our packaged elastalert to add external PEX dependencies at runtime per instance. See examples and documentation.
Deployment configuration¶
- We merged punchplatform.properties & punchplatform-deployment.settings into one single file, punchplatform-deployment.settings. At deployment time, only the "punchplatform-deployment.settings" file is now used; it includes the deployment information that was previously distributed between this file and the 'punchplatform.properties' file. There is no longer a "punchplatform.properties" file accessible in the runtime operator configuration filesystem, as it contained only static information that is not supposed to change outside of the deployment phase.
Provided mandatory resources¶
- Some runtime configuration files (Elasticsearch index templates containing mappings) are now MANDATORY, as their inclusion in the deployed resources of the platform is needed to ensure proper behaviour of the inbuilt services (monitoring services, API Gateway, punchplatform plugin functions for extraction and jobs monitoring...). Based on this data, standard dashboards are now available for platform and channels monitoring and troubleshooting, and should be imported on all platforms.
Custom nodes development¶
To help design custom nodes for enhancing punchlines :
- documentation to develop a java storm custom node
- documentation to develop a java spark custom node
- documentation to develop a pyspark custom node
Other main features compared to 5.x (Craig) release¶
Note: more information about those changes can be found in the previous versions' Release notes for 6.0.0 and 6.0.1, and in the related Migration Guides (migration to 6.0.0 and migration to 6.0.1).
- (Since 5.7) The punchplatform gateway can filter Elasticsearch requests coming in from the Kibana server OR from an external source (if allowed by the platform), based on a user-provided punchlet. This reduces the overload/crash risk of the Elasticsearch backend in a mutualized context due to overly heavy queries. All forwarded or rejected queries are traced in an Elasticsearch history.
- (Since 6.0) Minio / Amazon S3 API is supported by File Output node and archiving. (Optional) Minio deployment is proposed as a way of providing a clustered S3 storage (although for production, off-the-shelf S3-providing hardware appliances are advised for ease of operation).
- (Since 6.0) Elastalert is now packaged as a standard shiva task that can be easily included as an application inside a channel.
- (Since 6.0) An optional "Feedback GUI" can be deployed as a kibana plugin, allowing a user to interactively annotate or score some data. This can be customized depending on the feedback need, especially when working with AI use-cases involving "supervised training" phases.
- (Since 6.0) A punchplatform 'gateway' daemon is now mandatory between any Kibana providing the punchplatform HMI plugin and Elasticsearch. The Kibana server no longer requires the punchplatform operator tools and access rights to be deployed on the same server; it relies on the gateway APIs. This helps enhance the overall platform security in case of a breach of the kibana layer.
- (Since 6.0) The 'pex' python packaging system is used to reduce version-dependency side effects of the python commands of the punchplatform framework (e.g. elasticsearch housekeeping service), reducing the need for deploying many python modules in the operating system environment at deployment time.
- (Since 6.0) added the ability to use pyspark udf (User Defined Functions); see here for an example on how to use it.
- (Since 6.0) added the ability to specify a custom python packaging (pex-based) per punchline execution. This makes it possible to use multiple versions of the same python module across different Pyspark punchlines (especially useful with user-provided UDF or custom nodes).
Release main open-sources components integrated versions¶
Open-source punch is integrated with | Version by default for this Punchplatform release | Previously |
---|---|---|
Elastic stack (Elasticsearch, Kibana, Beats, Logstash) | upgraded to 7.8.0 | 7.4.2 since Dave 6.0.0 |
Spark | 2.4.3 (hadoop 2.7) | unchanged since Dave 6.0.0 |
Storm | upgraded to 2.2.0 | 2.1.0 since Dave 6.0.0 |
Kafka | 2.11_2.4.0 | unchanged since Dave 6.0.0 |
Python | 3.6.8 | unchanged since Dave 6.0.0 |
Zookeeper | 3.5.5 | unchanged since Dave 6.0.0 |
Elastalert | 0.2.4 | unchanged since Dave 6.0.0 |
Opendistro stack (opendistro-security) | 1.9.0.0 | 1.4.0.0 since Dave 6.0.0 |
Clickhouse server | 20.4.6.53 | unchanged since Dave 6.0.0 |
Minio | 2020-08-26T00-00-49Z | 2019-12-17T23-16-33Z since Dave 6.0.0 |
Ansible | 2.9.7 | unchanged since Dave 6.0.0 |
Apache Modsecurity module | 2.9.0-1 | unchanged since Dave 6.0.0 |
Ceph | 13.2.5 | unchanged since Craig 5.0 |
Elasticsearch Curator | 5.5.4 | 5.5.4 since Dave 6.0.0 |
siddhi-core library | 4.3.17 | unchanged since Dave 6.0.0 |
New security features for Dave-6.1.0 need at least JDK 8 with a version greater than or equal to 1.8.0_241. Please ensure at deployment time that the openjdk version provided by your repositories is at least this version.
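A pre-deployment check of the JDK requirement above could look like the following. This is a minimal sketch under stated assumptions: the function names are hypothetical, and only bare JDK 8 style strings such as 1.8.0_241 are handled (build suffixes and the newer numbering used by later JDKs are rejected with an error rather than guessed at).

```python
import re

MINIMUM = (1, 8, 0, 241)  # 1.8.0_241, the minimum stated in these notes


def parse_jdk8_version(text):
    """Parse a JDK 8 style version string such as '1.8.0_241' into a tuple."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)_(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unsupported version format: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_supported(text, minimum=MINIMUM):
    """True when the version is greater than or equal to the minimum."""
    return parse_jdk8_version(text) >= minimum


if __name__ == "__main__":
    for candidate in ("1.8.0_232", "1.8.0_241", "1.8.0_265"):
        print(candidate, is_supported(candidate))
```

Comparing integer tuples avoids the pitfall of string comparison, where an update number such as 99 would sort after 241.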
Known limitations¶
The following issues are reducing the functional scope of the release :
- #983 - Shiva monitoring does not work properly: Although available in the platform-monitoring service output, the report about shiva clusters health is incomplete, missing the list of the faulty nodes. This will be fixed in a coming minor release.
- #967 - Punch Node fails in Spark runtime: An exception is raised when using a Punch node with a Spark runtime. This will be fixed in a coming minor release.
Fine-Grain traceability to included fixes/enhancement tracking¶
Bug Fixes¶
- #1001 method TupleFactory.getObjectMapper(boolean) is missing
- #999 plan start date and end date wrong timedateparser
- #995 Missing "ever-running task is running" events when multiple tasks are running
- #994 hadoop 2.8.5 version conflict with spark 2.4.x
- #992 Incorrect platform.storm.container_id for storm topologies in cluster
- #988 Deployer does not update Shiva binaries
- #986 puncher does not work on 6.1
- #984 Digest in filebolt is not yet implemented
- #977 No punchctl event for starting applications
- #976 Channel_monitoring task doesn't work in 6.1
- #974 file bolt : digest computed on uncompressed content instead of the compressed file
- #973 No exception raised when using same component name in Spark
- #972 Elastic params not taken into account in PySpark runtime
- #969 Gateway stops responding when Kafka is down
- #943 Error not captured when shiva punchline fails fast
- #897 Properly process the platform logs from kafka reporters
- #982 punchplatform-setup-kibana.sh --import reports SUCCESS when kibana is not running
- #979 Gateway doesn't return all punchlines
- #978 opendistro demo in getting started not working
- #975 Too many variables in operator .bashrc
- #971 ES Batch Input fails in PySpark runtime
- #968 Gateway crashes when fetching punchlines in fresh standalone
- #966 null pointer in channels monitoring on getActiveJobs
- #964 'brokers' should always be mandatory in kafka settings (reporters, input/output)
- #963 No way to clear task started with mistyped shiva cluster
- #959 Shiva runner stops starting tasks after null pointer
- #957 cannot launch an extraction after creating it through the Gateway
- #956 ES housekeeper not working due to changed 6.1 environment
- #955 Install punch standalone DARWIN not working
- #953 Unable to resolve manually channels-monitoring.json config
- #952 Invalid config item error in platform monitoring logs
- #950 Dangerous (non-temporary) working directory for shiva tasks
- #949 Resilience of channelctl in case of kafka single-node failure
- #948 Bad topology name in Kafka consumer group default name and in metrics topology_name
- #947 shiva does not start in shiva with kafka backend enabled
- #946 Gateway doesn't log anything
- #942 Unusable local subprocess log for shiva tasks due to missing task name/metadata in log lines.
- #940 'platform' tenant is mandatory in platform monitoring configuration
- #939 Bad reload of systemd definition for gateway server
- #938 shiva runner running old task version when stop start is too fast
- #937 resolv.hjson is mandatory for punchlinectl, although not for deployment
- #935 shiva runner does not start if kafka reporter bootstrap servers are not set
- #934 ClickHousDriver trace in punchctl
- #929 Permission denied and bad idempotence on operator punchplatform.properties generation
- #928 Make mandatory 'brokers_with_ids' for kafka cluster
- #927 missing group_vars inventory templates (modsecurity, pyspark)
- #923 Gateway fails to start if ES hosts are configured with suffix
- #920 Unexpected logs displayed on screen when using channelctl
- #915 MLFLOW | Gunicorn not found Error
- #913 Event type not standardized between shiva and storm job
- #912 filebolt crashes when kafkaspout uses key value decoder
- #895 channel_structure can be hjson but not with hjson extension
- #893 Shiva deployment without tags fail
- #892 Gateway start with failure because of dependency management
- #888 When error in bolt settings, no bolt name is provided in exception stack/error message
- #884 resource manager registers two metadata for the same resource version
- #878 Incorrect offsets reported for partitions >1 by punchplatform-kafka-topics.sh --offsets
- #877 kafka reporter uses only 1 partition (scalability issue)
- #874 Wrong PEX_ROOT definition causes pex files creation in working directory
- #873 Helper options for punchplatform-kafka scripts not working
- #872 No punchplatform plugin deployment when using deployer for upgrade
- #870 Punchctl Kafka metrics reporter should provide error message if target topic is missing
- #867 plans only launch spark punchlines
- #866 elasticsearch masters list is wrongly built
- #860 unchecked punch gateway mandatory resources including resources.doc_dir
- #858 pyspark-tmp directory is filling up
- #856 python_elastic_input spark node has no credential settings
- #854 spark jars cannot be updated on production platforms
- #853 archiving getting started documentation is out of date
- #852 operator role enriches path with wrong priority
- #851 In production, ansible deployer must NEVER reapply opendistro security config to opendistro plugin
- #850 significantly improve the error and exception message handling of plans
- #846 Spark plan ignores spark settings in punchlines
- #845 not all error tuples are acked in case of punch exception
- #844 ipv6 range operator does not work
- #840 Minio port configuration not taken into account in mononode deployment
- #837 world.meta not working anymore
- #836 Punchplatform (Dave) shells uses ambiguous 'python' command instead of using 'python3'
- #835 make all punch unchecked exceptions use the correct usage api
- #831 pyspark plans fail to start in shiva
- #830 not all storm options are considered when submitting a topology
- #829 Dave kibana uselessly requires operator environment
- #828 Deployer uses ambiguous 'python' command instead of using 'python3'
- #827 Hard-coded /opt in gateway deployment
- #824 Feedback visualization not taking Kibana advanced settings into account
- #819 fail to start a channel or to access application executions in the Kibana plugin
- #816 improve the shiva health monitoring document format
- #808 identical cluster name conflicts in generated ansible inventories
- #805 platform monitoring dashboards cannot be imported
- #802 properly deal with elasticsearch GET url parameters in gateway protection punchlet
New Features¶
- #945 Allow resolving of channel_structure to facilitate tests of other platforms channels
- #941 Environment-provided default tenant/channel in punchlines and monitoring configuration
- #931 Upgrade Minio version
- #981 Upgrade Storm to 2.2.0
- #930 PySpark Load MLFlow UDF
- #910 Spark Monitoring | Implementation and update of some spark metrics
- #909 improve the toflat operator to support an array friendly strategy
- #905 Upgrade kafka version to 2.6.0
- #901 Punchplatform-plugin doesn't update with deployer installation
- #894 Upgrade Elastic Stack from 7.4.2 to 7.8.0
- #883 make the gateway refuse incoming elastic request with elastic compatible error formats
- #881 Allow custom additional JVM options for STORM NIMBUS and ELASTICSEARCH (helps hardening)
- #871 handle elastalert rules in resource manager
- #869 provides punch public api for custom nodes
- #857 add autocompletion and syntax highlighting for punch language in the resource manager
- #843 make punchplatform.properties generated at deploy time
- #842 Adding minio, clickhouse and gateway support to the platform monitoring
- #838 support Kibana 6.8 in feedback plugin
- #825 make clickhouse deployed through the punch deployer
- #820 provide a helper ansible role to install certificates wherever needed
- #818 provide the Storm SMTP node
- #801 make gateway protection punchlet two-ways
Improvements¶
- #958 Spark Monitoring Dashboard | ECS Format
- #949 Resilience of channelctl in case of kafka single-node failure
- #945 Allow resolving of channel_structure to facilitate tests of other platforms channels
- #928 Make mandatory 'brokers_with_ids' for kafka cluster
- #925 Elastalert custom modules packaging
- #918 make shiva worker use a persistent identifier
- #917 Normalize spark logs
- #909 improve the toflat operator to support an array friendly strategy
- #907 remove bootstrap.servers from kafka reporter configuration
- #906 improve the support of multi shiva clusters
- #899 shiva.clusters..storage.root and other shiva control settings should be optional
- #890 display feedback table like a saved search table
- #887 improve the eps channel monitoring dashboard
- #886 improve and document the spark application packaging
- #883 make the gateway refuse incoming elastic request with elastic compatible error formats
- #880 make tenant channel and name not mandatory settings in punchline
- #875 make gateway use the resource manager to save and load punchlets and other resource files
- #865 resource manager audit and deployment through gateway
- #864 improve elasticsearch configuration settings for housekeeping and monitoring services
- #863 add channels security configuration support through resolv.conf
- #861 shell enhancement for security validation related to punchbox
- #849 remove the dependency from plan lib from the spark analytics artefacts
- #848 make shiva local logs human readable
- #847 improve the standalone example channels to properly deal with parsing errors
- #839 missing metrics to fine tune shiva topologies
- #834 support the execution of plan through threads instead of fork-exec process
- #826 remove punch tools in Kibana plugin