Troubleshooting: cast errors in a PML model
In this chapter, we present job and plan use cases.
When does it occur
When you build your own PML graph, you may want to post-process the data (for example with k-means) and read the metadata of the fitted model. To do so, you fetch the corresponding stage with a line such as:
Transformer stage = stages[3];
If the index does not match the fitted model, you get the following stack trace:
18/06/05 10:29:25 ERROR Executor task launch worker for task 11 org.apache.spark.executor.Executor: Exception in task 0.0 in stage 11.0 (TID 11)
java.lang.ClassCastException: org.apache.spark.ml.feature.VectorAssembler cannot be cast to org.apache.spark.ml.clustering.KMeansModel
at com.thales.services.cloudomc.punchplatform.punch.PunchletImplementation1.execute(PunchletImplementation1.java:19)
at org.thales.punch.ml.plugins.punch.PunchStage.lambda$transform$f0911526$1(PunchStage.java:140)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:234)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:228)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
How to correct it
The exception means that the index 3 in your line does not point to the right stage of your Mllib node. Indexes start at 0, so index 3 designates the fourth stage, which here is a VectorAssembler, whereas the k-means model is the fifth stage. You only have to correct the index:
Transformer stage = stages[4];
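To avoid guessing the index, one option is to list the stages with their classes, or to look the stage up by type before casting. The following is a minimal sketch under stated assumptions: `Transformer`, `VectorAssembler` and `KMeansModel` are local stand-ins for the Spark ML classes named in the stack trace, `OtherStage` is a hypothetical placeholder for the earlier stages, and `describe` and `indexOf` are illustrative helpers, not part of the punchlet API.

```java
public class StageLookup {
    // Stand-ins for the Spark ML types (assumption: in a real punchlet these
    // come from org.apache.spark.ml and are not redefined like this).
    interface Transformer {}
    static final class OtherStage implements Transformer {}      // hypothetical placeholder
    static final class VectorAssembler implements Transformer {}
    static final class KMeansModel implements Transformer {}

    // Build one "index -> class name" line per stage, to read off the right index.
    static String describe(Transformer[] stages) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < stages.length; i++) {
            sb.append(i).append(" -> ")
              .append(stages[i].getClass().getSimpleName()).append('\n');
        }
        return sb.toString();
    }

    // Return the index of the first stage of the given type, or -1 if none matches.
    static int indexOf(Transformer[] stages, Class<? extends Transformer> type) {
        for (int i = 0; i < stages.length; i++) {
            if (type.isInstance(stages[i])) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Same layout as in the example above: index 3 is the vector
        // assembler, index 4 is the fitted k-means model.
        Transformer[] stages = {
            new OtherStage(), new OtherStage(), new OtherStage(),
            new VectorAssembler(), new KMeansModel()
        };
        System.out.print(describe(stages));

        int i = indexOf(stages, KMeansModel.class);
        if (i >= 0) {
            // The cast is safe here because the type was checked first.
            KMeansModel model = (KMeansModel) stages[i];
            System.out.println("Found " + model.getClass().getSimpleName() + " at index " + i);
        }
    }
}
```

Looking the stage up by type keeps the punchlet working if a stage is later added or removed before the k-means stage, at the cost of a short loop over the stages.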