Python File Input
Overview¶
This node covers simple use cases where you do not need to manipulate dataframe APIs. Its output is a list of strings, where each element is one line of your file.
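To make the output shape concrete, here is a minimal plain-Python sketch of the same idea: read a text file and return its lines as a list of strings. It does not use the node's implementation. The function name `read_file_lines` is hypothetical, and whether the node strips line terminators or keeps blank lines is an assumption of this sketch.

```python
import os
import tempfile
from typing import List


def read_file_lines(file_path: str) -> List[str]:
    """Return the content of a text file as a list of strings, one per line."""
    with open(file_path, encoding="utf-8") as handle:
        # splitlines() removes the line terminators; whether the node itself
        # strips them is an assumption of this sketch.
        return handle.read().splitlines()


# Small demonstration on a temporary two-line file.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("first line\nsecond line\n")

lines = read_file_lines(tmp.name)
os.remove(tmp.name)
print(lines)  # ['first line', 'second line']
```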
Runtime Compatibility¶
- PySpark : ✅
- Spark : ❌
Example¶
Warning
Each line of your file must be a valid Elasticsearch query.
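One way to catch malformed lines before starting the punchline is a small pre-flight check. The sketch below assumes each query is written as a single-line JSON object (the usual form of an Elasticsearch query body) and only checks that each line parses as one. This is a necessary condition, not proof that Elasticsearch will accept the query. The function name `find_invalid_queries` and the `status` field are hypothetical.

```python
import json
from typing import List, Tuple


def find_invalid_queries(lines: List[str]) -> List[Tuple[int, str]]:
    """Return (line_number, reason) for every line that is not a JSON object."""
    problems = []
    for number, line in enumerate(lines, start=1):
        try:
            parsed = json.loads(line)
        except json.JSONDecodeError as error:
            problems.append((number, f"not valid JSON: {error.msg}"))
            continue
        if not isinstance(parsed, dict):
            problems.append((number, "JSON value is not an object"))
    return problems


sample = [
    '{"query": {"match_all": {}}}',
    '{"query": {"term": {"status": "error"}}}',  # hypothetical field name
    '{"query": {"match_all": {}}',  # missing closing brace
]

for line_number, reason in find_invalid_queries(sample):
    print(f"line {line_number}: {reason}")
```

Only the third sample line is reported, because its closing brace is missing.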
Here is a simple example:
---
type: punchline
version: '6.0'
runtime: spark
tenant: default
dag:
- type: python_file_input
  component: queries
  publish:
  - stream: data
  settings:
    file_path: "/full/path/to/file/query"
- type: python_elastic_input
  component: python_elastic_input
  settings:
    index: mydata
    nodes:
    - localhost
  subscribe:
  - stream: data
    component: queries
  publish:
  - stream: data
- type: python_elastic_output
  component: python_elastic_output
  settings:
    nodes:
    - localhost
    index: multiquerytest
  subscribe:
  - stream: data
    component: python_elastic_input
You can run this punchline with the following command:
punchlinectl start -p punchline.yaml
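The file referenced by `file_path` has to exist before the punchline starts, with one query per line. A minimal sketch for generating such a file from Python dictionaries follows. The function name `write_query_file`, the output file name, and the `status` field are hypothetical, and the single-line JSON layout is an assumption based on the one-query-per-line requirement above.

```python
import json
import os
import tempfile
from typing import Dict, Iterable


def write_query_file(file_path: str, queries: Iterable[Dict]) -> int:
    """Write one compact JSON query per line; return the number of lines written."""
    count = 0
    with open(file_path, "w", encoding="utf-8") as handle:
        for query in queries:
            # json.dumps emits no newlines unless `indent` is set,
            # so each query stays on a single line.
            handle.write(json.dumps(query) + "\n")
            count += 1
    return count


queries = [
    {"query": {"match_all": {}}},
    {"query": {"term": {"status": "error"}}},  # hypothetical field name
]

# Written to a temporary directory here; point this at your own file_path.
target = os.path.join(tempfile.mkdtemp(), "query")
written = write_query_file(target, queries)
print(written)  # 2
```

Avoid pretty-printing the queries (for example with `indent=2`), since that would spread one query over several lines and each line would then be read as a separate element.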
Parameters¶
Name | Type | Mandatory | Default value | Description |
---|---|---|---|---|
file_path | String | true | NONE | Full path to the file you want to ingest. |