
Python File Input

Overview

This node covers simple use cases where you do not need to manipulate dataframe APIs. Its output is a list of strings, where each element is one line of your file.
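To make the output shape concrete, the following plain-Python sketch mimics what the node publishes. It is an illustration only, not the node's actual implementation: the helper name `read_lines` is hypothetical, and dropping the trailing newline from each element is an assumption.

```python
import os
import tempfile
from typing import List


def read_lines(file_path: str) -> List[str]:
    """Return the file content as a list of strings, one per line.

    Hypothetical helper; assumes trailing newlines are dropped.
    """
    with open(file_path, "r", encoding="utf-8") as handle:
        return handle.read().splitlines()


# Build a small sample file so the sketch is self-contained.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("first line\nsecond line\n")
    sample_path = tmp.name

lines = read_lines(sample_path)
os.remove(sample_path)
print(lines)  # ['first line', 'second line']
```

A downstream node subscribing to the published stream would then receive one string per line of the file.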

Runtime Compatibility

  • PySpark :
  • Spark :

Example

Our "hello world" punchline configuration.

Warning

Each line of your file should be a valid Elasticsearch query.

Here is a simple example:

{
    type: punchline
    version: "6.0"
    runtime: spark
    tenant: default
    dag: [
        {
            type: python_file_input
            component: queries
            publish: [
                {
                    stream: data
                }
            ]
            settings: {
                file_path: /full/path/to/file/query
            }
        }
        {
            type: python_elastic_input
            component: python_elastic_input
            settings: {
                index: mydata
                nodes: [
                    localhost
                ]
            }
            subscribe: [
                {
                    stream: data
                    component: queries
                }
            ]
            publish: [
                {
                    stream: data
                }
            ]
        }
        {
            type: python_elastic_output
            component: python_elastic_output
            settings: {
                nodes: [
                    localhost
                ]
                index: multiquerytest
            }
            subscribe: [
                {
                    stream: data
                    component: python_elastic_input
                }
            ]
        }
    ]
}

You can run this punchline with the following command:

punchlinectl start -p punchline.hjson
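Because each line of the file is used as a query in this example, it can help to check the file before starting the punchline. The sketch below writes a sample query file and reports the lines that do not parse as a JSON object. The sample queries, the file location, and the helper `invalid_query_lines` are all hypothetical. Parsing as JSON is only a necessary check: it does not prove that Elasticsearch will accept the query, and the exact query shape the downstream node expects is an assumption.

```python
import json
import os
import tempfile
from typing import List

# Hypothetical sample queries, one JSON document per line.
SAMPLE_QUERIES = [
    {"query": {"match_all": {}}},
    {"query": {"term": {"user": "alice"}}},
]


def invalid_query_lines(file_path: str) -> List[int]:
    """Return the 1-based numbers of lines that are not JSON objects."""
    bad = []
    with open(file_path, "r", encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # blank lines are skipped by this check
            try:
                parsed = json.loads(line)
            except json.JSONDecodeError:
                bad.append(number)
                continue
            if not isinstance(parsed, dict):
                bad.append(number)
    return bad


query_dir = tempfile.mkdtemp()
query_path = os.path.join(query_dir, "query")
with open(query_path, "w", encoding="utf-8") as handle:
    for query in SAMPLE_QUERIES:
        # json.dumps emits no newlines by default, so each query stays on one line.
        handle.write(json.dumps(query) + "\n")
    handle.write("{not json}\n")  # deliberately broken third line

print(invalid_query_lines(query_path))  # [3]
```

An empty list means every non-blank line parsed as a JSON object; any reported line number is worth fixing before pointing `file_path` at the file.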

Parameters

Name      | Type   | Mandatory | Default value | Description
file_path | String | true      | NONE          | Full path to the file you want to ingest.