
File Model Input

Before you start...

Before using...

This node loads into memory a machine learning model generated by an mllib pipeline. The file should be a binary blob.

Warning

To keep parameter names consistent across nodes, we have kept file_path as the parameter name. Despite that name, this parameter expects only the file name. The absolute or relative path of the model file must be set in your settings, through the spark.files entry, as shown in the example below.
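As a hedged illustration of the convention described in this warning, the fragment below pairs a bare file name in file_path with an absolute path in spark.files. The location /tmp/model.bin is only an assumed example, and the other top-level keys of the punchline (type, version, runtime, tenant) are omitted for brevity.

```hjson
{
    dag: [
        {
            type: file_model_input
            component: file_model_input
            settings: {
                // Only the file name, as stated in the warning above
                file_path: model.bin
            }
            publish: [
                {
                    stream: model
                }
            ]
        }
    ]
    settings: {
        // Assumed location of the model file, given here as an absolute path
        spark.files: /tmp/model.bin
    }
}
```

The same pairing applies to a relative path: only the spark.files value changes, while file_path keeps the bare file name.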

Pyspark ->

Spark ->

Examples

Use-cases

Our "hello world" punchline configuration.

beginner_use_case.punchline

{
    type: punchline
    version: "6.0"
    runtime: spark
    tenant: default
    dag: [
        { 
            type: file_model_input
            component: file_model_input
            settings: {
                // Only the file name; its location is given by spark.files
                file_path: model.bin
            }
            publish: [
                {
                    // You will most probably use this model in an mllib node.
                    // You must therefore name the stream model. This is explained in
                    // the mllib node documentation.
                    stream: model
                }
            ]
        }
    ]
    settings: {
        // Location of the model file. This path must be reachable
        // from wherever spark runs, i.e. from every spark node.
        // You can also use a relative path such as './model.bin' as long
        // as you launch your punchline in foreground mode from the same directory.
        // Here, model.bin is located in the same directory as the punchline you want to launch.
        spark.files: ./model.bin
    }
}

Run beginner_use_case.punchline using the commands below:

CONF=beginner_use_case.punchline
punchlinectl start -p $CONF

Coming soon

Description: [Required] Absolute file path.

Parameters

Common Settings

| Name | Type | Mandatory | Default value | Description |
|------|------|-----------|---------------|-------------|
| file_path | String | true | NONE | The name of the file specified in the spark.files parameter. |

Advanced Settings

No advanced settings