
Python File Input

Overview

Use this node for simple use cases that do not require the dataframe APIs. Its output is a list of strings, where each element is one line of the input file.
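To make the output shape concrete, here is a minimal Python sketch of the data the node is described as producing: one string per line of the file. This is not the node's actual implementation. The helper name `read_lines` is hypothetical, and dropping the trailing newline characters is an assumption.

```python
from pathlib import Path
from typing import List


def read_lines(file_path: str) -> List[str]:
    """Return the content of a text file as a list of strings, one per line.

    Hypothetical helper for illustration only; the real node may differ,
    for instance in how it treats trailing newlines or empty lines.
    """
    text = Path(file_path).read_text(encoding="utf-8")
    # splitlines() drops the line terminators (an assumption about the node).
    return text.splitlines()
```

A downstream node subscribing to the published stream would then receive a value of this shape: a plain Python list, not a dataframe.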

Runtime Compatibility

  • PySpark :
  • Spark :

Example

Our "hello world" punchline configuration.

Warning

Each line of your file should be a valid Elasticsearch query.

Here is a simple example:

{
  type: punchline
  version: "6.0"
  runtime: spark
  tenant: default
  dag:
  [
    {
      type: python_file_input
      component: queries
      publish:
      [
        {
          stream: data
        }
      ]
      settings:
      {
        file_path: /full/path/to/file/query
      }
    }
    {
      type: python_elastic_input
      component: python_elastic_input
      settings:
      {
        index: mydata
        nodes:
        [
          localhost
        ]
      }
      subscribe:
      [
        {
          stream: data
          component: queries
        }
      ]
      publish:
      [
        {
          stream: data
        }
      ]
    }
    {
      type: python_elastic_output
      component: python_elastic_output
      settings:
      {
        nodes:
        [
          localhost
        ]
        index: multiquerytest
      }
      subscribe:
      [
        {
          stream: data
          component: python_elastic_input
        }
      ]
    }
  ]
}

You can run this punchline using the following command:

punchlinectl start -p punchline.hjson
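Because each line of the file is expected to be a valid Elasticsearch query, it can help to check the file before starting the punchline. The sketch below only verifies that every line is a well-formed JSON object; it does not check that Elasticsearch will accept the query. The function name `find_invalid_lines` is hypothetical, and treating empty lines as errors is an assumption.

```python
import json
from typing import List, Tuple


def find_invalid_lines(lines: List[str]) -> List[Tuple[int, str]]:
    """Return (line_number, reason) pairs for lines that are not JSON objects.

    Hypothetical pre-flight check: it validates JSON syntax only, not
    Elasticsearch query semantics.
    """
    problems: List[Tuple[int, str]] = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            # Assumption: an empty line cannot be a usable query.
            problems.append((number, "empty line"))
            continue
        try:
            parsed = json.loads(line)
        except json.JSONDecodeError as error:
            problems.append((number, f"invalid JSON: {error.msg}"))
            continue
        if not isinstance(parsed, dict):
            # A query body is a JSON object, not an array or a scalar.
            problems.append((number, "not a JSON object"))
    return problems


if __name__ == "__main__":
    sample = ['{"query": {"match_all": {}}}', '{"query": ']
    for line_number, reason in find_invalid_lines(sample):
        print(f"line {line_number}: {reason}")
```

An empty result means every line at least parses as a JSON object. Any reported line should be fixed in the file referenced by file_path before running the punchline.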

Parameters

Name        Type     Mandatory   Default value   Description
file_path   String   true        NONE            Full path to the file you want to ingest.