TROUBLESHOOTING cope with oom
Why do that¶
Storm topologies are Java applications that require the right amount of memory to run properly. Some act as proxy-forwarder applications and can run in a limited amount of memory (100 MB). Others run many spouts and bolts, or run archiving logic, and require more memory (> 1 GB).
In all cases, the punchplatform provides an internal mechanism that makes topologies fail fast should they consume too much of their heap memory. Failing fast is key to preventing a topology from significantly slowing down traffic processing. Failed topologies are quickly restarted. That said, you can detect topology restarts, and you should of course grant each topology the right amount of memory.
By default, punch topologies stop should their heap memory usage reach a threshold of 85%. They simply exit with an error log:
    message="FATAL ERROR OutOfMemory treshold exceeded" treshold=0,85 used=0.8635
The uptime metric can be used to detect these restarting topologies on your platform.
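To make the fail-fast behaviour more concrete, here is a minimal Java sketch of a heap usage check. It is not the punchplatform's actual implementation: the class name `HeapGuard`, its methods, and the exit code are hypothetical, and only the 85% default and the "exit when the threshold is reached" rule come from the text above. It relies on the standard `java.lang.management` API to read heap usage.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical illustration only; not the punchplatform's own code.
class HeapGuard {

    // Fraction of the maximum heap currently in use, between 0 and 1.
    // getMax() may return -1 when the maximum is undefined, so guard against it.
    static double usedRatio(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            return 0.0;
        }
        return (double) usedBytes / (double) maxBytes;
    }

    // True when heap usage has reached the exit threshold.
    static boolean exceeds(double usedRatio, double threshold) {
        return usedRatio >= threshold;
    }

    // Reads the current heap usage of this JVM.
    static double currentHeapRatio() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return usedRatio(heap.getUsed(), heap.getMax());
    }

    public static void main(String[] args) {
        double threshold = 0.85; // default threshold stated in this chapter
        double used = currentHeapRatio();
        if (exceeds(used, threshold)) {
            // Fail fast: log and exit so the process can be restarted.
            System.err.println("heap threshold exceeded: used=" + used);
            System.exit(1); // exit code chosen arbitrarily for this sketch
        }
        System.out.println("heap usage ok: used=" + used);
    }
}
```

A real check would typically run periodically in a background thread rather than once at startup; the single check here only keeps the sketch short.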
What to do¶
Grant enough memory to your topology¶
Refer to the performance memory chapter to grant each topology the memory it needs.
Change the OutOfMemory threshold¶
You can change the out-of-memory exit threshold by setting the [punchplatform.oom_exit_treshold] property. Note that the property name itself is spelled "treshold". For example, to use a 95% threshold, add the following to the topology storm_settings section:
    "storm_settings" : {
      "topology.worker.childopts" : "-server -Xms100m -Xmx100m -Dpunchplatform.oom_exit_treshold=0.95"
      ...
    }
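A `-D` option passed through `topology.worker.childopts` becomes a JVM system property of the worker process. The sketch below shows how such a property could be read with a fallback to the 85% default. How the punchplatform actually parses the value is not documented here, so the class `OomThresholdSetting` and its validation rules (falling back to the default for missing, malformed, or out-of-range values) are assumptions made for illustration.

```java
// Hypothetical helper; only the property name and the 0.85 default
// come from this chapter.
class OomThresholdSetting {

    static final String PROPERTY = "punchplatform.oom_exit_treshold";
    static final double DEFAULT_THRESHOLD = 0.85;

    // Parses a raw property value into a ratio in (0, 1].
    // Falls back to the default when the value is missing or invalid
    // (an assumption of this sketch, not documented behaviour).
    static double parse(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_THRESHOLD;
        }
        try {
            double value = Double.parseDouble(raw.trim());
            return (value > 0.0 && value <= 1.0) ? value : DEFAULT_THRESHOLD;
        } catch (NumberFormatException e) {
            return DEFAULT_THRESHOLD;
        }
    }

    public static void main(String[] args) {
        // System.getProperty returns null when no -D option was given.
        double threshold = parse(System.getProperty(PROPERTY));
        System.out.println("threshold=" + threshold);
    }
}
```

Running this class with `java -Dpunchplatform.oom_exit_treshold=0.95 OomThresholdSetting` is a quick way to confirm that the option string is well formed before adding it to a topology.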