
Python Elastic Output

Before you start...

This node is intended for pipelines that do not require Spark features, for instance when you are not manipulating Spark dataframes at all.

With this node, you can save an incoming stream of lists (where each element is a python dictionary) to an Elasticsearch index.
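
To illustrate the data this node consumes, the sketch below is a hypothetical, simplified stand-in for what the node does, assuming the official elasticsearch Python client and an Elasticsearch instance reachable on localhost:9200. It only shows how such a list of dictionaries maps to documents in an index; it is not the node's implementation.

from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

# One stream element as this node expects it: a list of python dictionaries.
records = [
    {"user": "alice", "value": 42},
    {"user": "bob", "value": 7},
]

# Assumption: Elasticsearch is reachable on localhost:9200.
es = Elasticsearch("http://localhost:9200")

# Each dictionary becomes one document in the target index (index name is illustrative).
bulk(es, ({"_index": "mydata", "_source": record} for record in records))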


Examples

Use-cases

Our "hello world" punchline configuration.

beginner_use_case.punchline

{
    type: punchline
    version: "6.0"
    runtime: pyspark
    tenant: default
    dag: [
        {
            type: python_elastic_input
            component: python_elastic_input
            settings: {
                index: mydata
                nodes: [
                    localhost
                ]
            }
            subscribe: [

            ]
            publish: [
                {
                    stream: data
                }
            ]
        }
        {
            type: python_elastic_output
            component: python_elastic_output
            settings: {
                nodes: [
                    localhost
                ]
                index: multiquerytest
            }
            subscribe: [
                {
                    stream: data
                    component: python_elastic_input
                }
            ]
        }
    ]
}

Run beginner_use_case.punchline using the command below:

CONF=beginner_use_case.punchline
punchlinectl start -p $CONF
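
Once the punchline has run, you can check that documents landed in the target index. A minimal verification sketch, assuming Elasticsearch on localhost:9200 and the official elasticsearch Python client:

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Number of documents written by the python_elastic_output node.
print(es.count(index="multiquerytest")["count"])

# Inspect a few of them.
for hit in es.search(index="multiquerytest", size=3)["hits"]["hits"]:
    print(hit["_source"])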

More examples coming soon.

Parameters

Common Settings

| Name  | Type           | Mandatory | Default value | Description |
|-------|----------------|-----------|---------------|-------------|
| index | String         | true      | NONE          | The name of the Elasticsearch index to which data will be written. |
| port  | Integer        | false     | 9200          | Port of your Elasticsearch server. |
| nodes | List of String | true      | NONE          | Hostnames of your Elasticsearch nodes. In general, only one hostname is needed. |
| type  | String         | false     | NONE          | Document type under which documents will be indexed. |
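
For reference, a settings block combining all of the common settings could look like the snippet below (the values are placeholders; only index and nodes are mandatory):

settings: {
    index: mydata
    port: 9200
    nodes: [
        localhost
    ]
    type: _doc
}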

Advanced Settings

No advanced settings