Python File Input¶
Overview¶
Compatible with Pyspark only
This node is intended to be used when Spark features are not required, for instance when your pipeline does not manipulate Spark dataframes at all.
The output of this node is a list of strings, where each element is one line of your file.
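Conceptually, the node's behavior can be sketched in plain Python. This is an illustrative sketch, not the node's actual implementation; the helper name `read_lines` is hypothetical:

```python
def read_lines(path):
    # Read the file and return a list of strings, one element per line,
    # stripping the trailing newline (assumption: the node publishes
    # lines without their newline characters).
    with open(path) as f:
        return [line.rstrip("\n") for line in f]
```

Each element of the returned list is then published downstream as one record.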
Example(s)¶
1: Multiple queries¶
Below is an example where our Python Elastic Input node subscribes to our Python File Input node. The Python File Input node publishes a list of strings, where each element of the list is a query to be executed against an Elasticsearch cluster. The result is then saved in a new index.
Note
Each line of your file must be a valid Elasticsearch query.
```
job: [
  {
    type: python_file_input
    component: queries
    publish: [
      {
        stream: data
      }
    ]
    settings: {
      file_path: /full/path/to/file/query
    }
  }
  {
    type: python_elastic_input
    component: python_elastic_input
    settings: {
      index: mydata
      nodes: [
        localhost
      ]
    }
    subscribe: [
      {
        stream: data
        component: queries
      }
    ]
    publish: [
      {
        stream: data
      }
    ]
  }
  {
    type: python_elastic_output
    component: python_elastic_output
    settings: {
      nodes: [
        localhost
      ]
      index: multiquerytest
    }
    subscribe: [
      {
        stream: data
        component: python_elastic_input
      }
    ]
  }
]
```
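For illustration only, the file referenced by `file_path` could contain one query per line. The exact query syntax accepted here depends on the `python_elastic_input` node, so treat the following as a hypothetical sketch:

```
{"query": {"match": {"status": "ERROR"}}}
{"query": {"match": {"level": "WARN"}}}
```

Each line becomes one element of the published list, and hence one query executed against the cluster.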
Configuration(s)¶
- file_path: String
  Description: [Required] Full path to the file you want to ingest.