Suggestion:
Add a separate ‘Scale Up’ stage to provisioning workflows, so that the Post-Provisioning stage runs only when the resource itself is provisioned. The ‘Scale Up’ stage can then safely run scale-up tasks that rely on the Post-Provisioning stage having already completed. It would complement the existing ‘Scale Down’ stage.
Description:
When a node (VM) is added to an Instance whose provisioning workflow defines Post-Provisioning tasks, every node in the Instance runs all tasks from the Post-Provisioning stage, not just the newly added node.
This is somewhat unexpected: you would normally expect the Post-Provisioning stage to run only when the node itself is provisioned, not again when another node in the same Instance is provisioned as part of a horizontal scaling operation. To my knowledge this behavior is not documented, but (based on a raised support case) it is ‘as designed’.
Looking at the Provisioning Workflows a bit more, it looks like the “Post-Provisioning” stage is being used as a stand-in for a missing “Scale Up” stage. I say missing because a “Scale Down” stage does exist. So why would the counterpart “Scale Up” stage not exist?
Use case:
During provisioning, Morpheus handles a failure in the Provisioning stage differently from a failure in the Post-Provisioning stage. So without horizontal scaling, you’d put the operations that are allowed to fail and can be retried in the Post-Provisioning stage. But if you also need horizontal scaling to run tasks as part of a ‘Scale Up’ event, your Post-Provisioning stage workflow can suddenly get triggered multiple times.
If you want to use the ‘Scale Up’ run of the Post-Provisioning stage, you’d have to include a ‘did I run previously?’ switch in every script that runs from a Post-Provisioning task. On the resource itself this is trivial, but for tasks in the workflow that run on a remote resource or on the Morpheus appliance, it becomes trickier to ensure. It is especially non-trivial if those tasks are not of a script type.
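To illustrate the ‘did I run previously?’ switch for a script that runs on the resource itself, here is a minimal sketch in Python. It is only an assumption of how such a guard could look, not something Morpheus provides: the `run_once` helper and the marker-file approach are my own illustration, and the marker path is whatever you choose.

```python
from pathlib import Path
from typing import Callable


def run_once(marker: Path, task: Callable[[], None]) -> bool:
    """Run `task` only if `marker` does not exist yet.

    Returns True if the task ran, False if it was skipped.
    `run_once` is a hypothetical helper, not a Morpheus API.
    """
    if marker.exists():
        # A previous Post-Provisioning run already completed on this node.
        return False

    task()

    # Write the marker only after the task succeeded, so that a failed
    # run is retried the next time the stage is triggered.
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.touch()
    return True
```

A task script would wrap its real work in a function and call something like `run_once(Path("/var/lib/example/post-provision.done"), configure_node)`, where both the path and `configure_node` are placeholders. This only works because the marker lives on the node being provisioned; a task running on a remote resource or on the appliance would need its own per-node record, which is exactly the part that gets tricky.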
-Yaron.