Cast errors in a PML model

In this chapter, we present a ClassCastException that can occur when you read a fitted model from a PML graph, and how to correct it.

When does it occur

When you create your own PML graph and want to post-process the data (for example with k-means), you may need the metadata of the fitted model. To get it, you fetch the corresponding stage by its index, with a line such as:

Transformer stage = stages[3];

If that index points to a stage of another type, you get a stack trace like the following:

18/06/05 10:29:25 ERROR Executor task launch worker for task 11 org.apache.spark.executor.Executor: Exception in task 0.0 in stage 11.0 (TID 11)
java.lang.ClassCastException: org.apache.spark.ml.feature.VectorAssembler cannot be cast to
    org.apache.spark.ml.clustering.KMeansModel
at com.thales.services.cloudomc.punchplatform.punch.PunchletImplementation1.execute(PunchletImplementation1.java:19)
at org.thales.punch.ml.plugins.punch.PunchStage.lambda$transform$f0911526$1(PunchStage.java:140)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:234)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:228)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

How to correct it

This means that the index 3 in your line does not point to the position of the expected stage in your MLlib node; it points to another stage. Indexes start at 0, so index 3 designates the fourth stage. In this example the fourth stage is a VectorAssembler, whereas the k-means model is the fifth stage. The fix is to correct the index:

Transformer stage = stages[4];
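
To avoid depending on a hard-coded position, one option is to look the stage up by its type. The following is only a minimal sketch under stated assumptions, not the PunchPlatform or Spark API: `Transformer`, `Tokenizer`, `VectorAssembler` and `KMeansModel` are hypothetical stand-in classes declared locally so that the example compiles without Spark, and `indexOfStage` is an illustrative helper name.

```java
public class StageLookup {
    // Hypothetical stand-ins for the Spark ML types, declared here only so
    // that the sketch compiles and runs without Spark on the classpath.
    interface Transformer {}
    static class Tokenizer implements Transformer {}
    static class VectorAssembler implements Transformer {}
    static class KMeansModel implements Transformer {}

    /** Returns the zero-based index of the first stage of the given type, or -1 if none. */
    static int indexOfStage(Transformer[] stages, Class<? extends Transformer> type) {
        for (int i = 0; i < stages.length; i++) {
            if (type.isInstance(stages[i])) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Same layout as in the example above: the fourth stage (index 3) is a
        // VectorAssembler and the fifth stage (index 4) is the k-means model.
        Transformer[] stages = {
            new Tokenizer(), new Tokenizer(), new Tokenizer(),
            new VectorAssembler(), new KMeansModel()
        };

        int idx = indexOfStage(stages, KMeansModel.class);
        if (idx < 0) {
            throw new IllegalStateException("No KMeansModel stage found");
        }
        // Safe cast: the type was checked by indexOfStage.
        KMeansModel model = (KMeansModel) stages[idx];
        System.out.println("KMeansModel found at index " + idx);
        System.out.println("stages[3] is a " + stages[3].getClass().getSimpleName());
    }
}
```

With this layout the helper finds the model at index 4, while `stages[3]` is the VectorAssembler that caused the ClassCastException. If the graph later gains or loses a stage, a lookup by type keeps working where a fixed index would not.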