Azure Cloud Service Deployment Slots

Deployment slots are one of the most useful features in Azure App Service. They let us deploy different versions of an app to different slots, swap them, and route a specific percentage of user traffic to one or more slots. I guess I fell into the trap of treating the Cloud Service staging slot like the Website staging slot and leaving a deployment in it permanently, which is no issue with Websites. With a Cloud Service, however, you are billed for all the minutes the staging slot runs. So I guess my workflow should be: 1) deploy to staging, 2) test, 3) swap to production, 4) delete the staging slot.


Summary

When you deploy instances to a Cloud Service or add new web or worker role instances, Microsoft Azure allocates compute resources. You may occasionally receive errors when performing these operations even before you reach the Azure subscription limits. This article explains the causes of some of the common allocation failures and suggests possible remediation. The information may also be useful when you plan the deployment of your services.

  1. The deployment slot has its own host name and is also a live app. To limit public access to the deployment slot, see Azure App Service IP restrictions. The new deployment slot has no content, even if you clone the settings from a different slot. For example, you can publish to this slot with Git.
  2. Creating a slot takes a few clicks in the Azure Portal. To create your first App Service slot, navigate to the Azure App Service that you created in your environment, click “Deployment Slots” in the left panel, and then click “Add Slot”.

If your Azure issue is not addressed in this article, visit the Azure forums on MSDN and Stack Overflow. You can post your issue in these forums, or post to @AzureSupport on Twitter. You also can submit an Azure support request. To submit a support request, on the Azure support page, select Get support.

Background – How allocation works

The servers in Azure datacenters are partitioned into clusters. A new cloud service allocation request is attempted in multiple clusters. When the first instance is deployed to a cloud service (in either staging or production), that cloud service gets pinned to a cluster. Any further deployments for the cloud service will happen in the same cluster. In this article, we'll refer to this as 'pinned to a cluster'. Diagram 1 below illustrates the case of a normal allocation which is attempted in multiple clusters; Diagram 2 illustrates the case of an allocation that's pinned to Cluster 2 because that's where the existing Cloud Service CS_1 is hosted.

Why allocation failure happens

When an allocation request is pinned to a cluster, there's a higher chance of failing to find free resources since the available resource pool is limited to a single cluster. Furthermore, if your allocation request is pinned to a cluster but the type of resource you requested is not supported by that cluster, your request will fail even if the cluster has free resources. Diagram 3 below illustrates the case where a pinned allocation fails because the only candidate cluster does not have free resources. Diagram 4 illustrates the case where a pinned allocation fails because the only candidate cluster does not support the requested VM size, even though the cluster has free resources.
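The two failure modes above can be illustrated with a small simulation. This is only a sketch: the Cluster class, its fields, the core counts, and the VM size names are hypothetical and are not part of any Azure API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Cluster:
    """Hypothetical model of a datacenter cluster (not an Azure API)."""
    name: str
    free_cores: int
    supported_sizes: set = field(default_factory=set)

    def can_host(self, vm_size: str, cores_needed: int) -> bool:
        # A cluster must both support the VM size and have enough free capacity.
        return vm_size in self.supported_sizes and self.free_cores >= cores_needed


def allocate(clusters, vm_size, cores_needed, pinned_to: Optional[str] = None):
    """Return the name of the first cluster that can host the request, else None.

    When pinned_to is set, only that one cluster is a candidate.
    """
    candidates = [c for c in clusters if pinned_to is None or c.name == pinned_to]
    for cluster in candidates:
        if cluster.can_host(vm_size, cores_needed):
            cluster.free_cores -= cores_needed
            return cluster.name
    return None


def demo_clusters():
    # Illustrative numbers only.
    return [
        Cluster("Cluster 1", free_cores=64, supported_sizes={"Small", "Large"}),
        Cluster("Cluster 2", free_cores=2, supported_sizes={"Small"}),
        Cluster("Cluster 3", free_cores=32, supported_sizes={"Small", "Large"}),
    ]


if __name__ == "__main__":
    print(allocate(demo_clusters(), "Small", 4))                          # unpinned
    print(allocate(demo_clusters(), "Small", 4, pinned_to="Cluster 2"))   # no capacity
    print(allocate(demo_clusters(), "Large", 1, pinned_to="Cluster 2"))   # unsupported size
```

In this toy model the unpinned request succeeds because any cluster may serve it, while both pinned requests fail: the first for lack of free cores, the second because the pinned cluster does not support the requested size.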

Troubleshooting allocation failure for cloud services

Error Message

You may see the following error message:

'Azure operation '{operation id}' failed with code Compute.ConstrainedAllocationFailed. Details: Allocation failed; unable to satisfy constraints in request. The requested new service deployment is bound to an Affinity Group, or it targets a Virtual Network, or there is an existing deployment under this hosted service. Any of these conditions constrains the new deployment to specific Azure resources. Please retry later or try reducing the VM size or number of role instances. Alternatively, if possible, remove the aforementioned constraints or try deploying to a different region.'
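If deployments are scripted, it can help to recognize this specific error in the tool's output so that the remediation steps below are suggested automatically. The following is a minimal sketch that assumes the message has the shape quoted above; the function names are hypothetical.

```python
import re

# Matches a dotted error code such as "Compute.ConstrainedAllocationFailed"
# without swallowing the sentence-ending period that follows it.
_CODE_PATTERN = re.compile(r"failed with code (\w+(?:\.\w+)*)")


def extract_error_code(message):
    """Return the error code embedded in the message, or None if absent."""
    match = _CODE_PATTERN.search(message)
    return match.group(1) if match else None


def is_constrained_allocation_failure(message):
    """True when the message reports a pinned (constrained) allocation failure."""
    return extract_error_code(message) == "Compute.ConstrainedAllocationFailed"
```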

Common Issues

Here are the common allocation scenarios that cause an allocation request to be pinned to a single cluster.

  • Deploying to Staging Slot - If a cloud service has a deployment in either slot, then the entire cloud service is pinned to a specific cluster. This means that if a deployment already exists in the production slot, then a new staging deployment can only be allocated in the same cluster as the production slot. If the cluster is nearing capacity, the request may fail.
  • Scaling - Adding new instances to an existing cloud service must allocate in the same cluster. Small scaling requests can usually be allocated, but not always. If the cluster is nearing capacity, the request may fail.
  • Affinity Group - A new deployment to an empty cloud service can be allocated by the fabric in any cluster in that region, unless the cloud service is pinned to an affinity group. Deployments to the same affinity group will be attempted on the same cluster. If the cluster is nearing capacity, the request may fail.
  • Affinity Group vNet - Older Virtual Networks were tied to affinity groups instead of regions, and cloud services in these Virtual Networks would be pinned to the affinity group cluster. Deployments to this type of virtual network will be attempted on the pinned cluster. If the cluster is nearing capacity, the request may fail.

Solutions

  1. Redeploy to a new cloud service - This solution is likely to be most successful as it allows the platform to choose from all clusters in that region.

    • Deploy the workload to a new cloud service
    • Update the CNAME or A record to point traffic to the new cloud service
    • Once zero traffic is going to the old site, you can delete the old cloud service. This solution should incur zero downtime.
  2. Delete both production and staging slots - This solution will preserve your existing DNS name, but will cause downtime to your application.

    • Delete the production and staging slots of an existing cloud service so that the cloud service is empty, and then
    • Create a new deployment in the existing cloud service. This will re-attempt allocation on all clusters in the region. Ensure the cloud service is not tied to an affinity group.
  3. Reserved IP - This solution will preserve your existing IP address, but will cause downtime to your application.

    • Create a ReservedIP for your existing deployment using PowerShell

    • Follow #2 from above, making sure to specify the new ReservedIP in the service's CSCFG.

  4. Remove affinity group for new deployments - Affinity Groups are no longer recommended. Follow steps for #1 above to deploy a new cloud service. Ensure cloud service is not in an affinity group.

  5. Convert to a Regional Virtual Network - See How to migrate from Affinity Groups to a Regional Virtual Network (VNet).

Modern-day data centers are extremely complex and have many moving parts. VMs can restart or move, systems are upgraded, and file servers are scaled up and down. All these events are to be expected in a cloud environment. However, you can make your cloud application resilient to these events by following best practices. This document outlines 13 crucial steps that you can take to ensure that your app is cloud ready. By taking these steps, you will ensure that any events in the data center will have negligible effects on your app and that your app will be more resilient and future proof.

As mentioned above, your instances are expected to and will restart. They will be upgraded and will sometimes suffer from file server movements. However you can make your app resilient to all these incidents. In order to guarantee the maximum uptime for your app, please ensure that you follow all practices.

Use Multiple Instances

Running your app on only one VM instance is an immediate single point-of-failure. By ensuring that you have multiple instances allocated to your app, if something goes wrong with any particular instance, your app will still be able to respond to requests going to the other instances. Keep in mind that your app code should be able to handle multiple instances without synchronization issues when reading from or writing to data sources. You can allocate multiple instances to your app using the “Scale out (App Service Plan)” blade:

To avoid a single point-of-failure, run your app with at least 2-3 instances. This is especially important if your app takes considerable time to start (known as cold start). Running more than one instance ensures that your application is available when App Service moves or upgrades the underlying VM instances. You can also configure rules to scale out automatically based on conditions such as:

  • Time of day (when the app has the most traffic)
  • Resource utilization (memory, CPU, etc.)
  • A combination of both!
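As a rough illustration of how a time-of-day rule and a resource rule can be combined, here is a minimal sketch. It is not how App Service autoscale is configured; the function name and every threshold in it are assumptions chosen for the example.

```python
from datetime import time


def desired_instances(now, cpu_percent, *, base=2, peak=4, maximum=6,
                      peak_start=time(9, 0), peak_end=time(18, 0),
                      cpu_threshold=70.0):
    """Combine a time-of-day rule with a CPU rule (all values illustrative)."""
    # Time-of-day rule: run more instances during the busy window.
    count = peak if peak_start <= now < peak_end else base
    # Resource rule: add one instance when CPU is above the threshold.
    if cpu_percent > cpu_threshold:
        count += 1
    # Never exceed the configured ceiling.
    return min(count, maximum)
```

The base count of 2 reflects the recommendation above to never run on a single instance.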


Update your default settings

App Service has many settings that let developers configure the web app for their use case. Always On keeps your VM instances alive even when no requests have been received in the last 20 minutes. By default, Always On is disabled; enabling it will limit application cold starts. ARR Affinity creates sticky sessions so that clients will connect to the same app instance on subsequent requests. However, ARR Affinity can cause unequal distribution of requests between your instances and possibly overload an instance. For production apps that are aiming to be robust, it is recommended to set Always On to On and ARR Affinity to Off. Disabling ARR Affinity assumes that your application is either stateless, or that the session state is stored on a remote service such as a cache or database.
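To see why sticky sessions can skew load, consider the toy comparison below. It is a sketch under simplifying assumptions (one very active client, two instances, and a round-robin balancer when affinity is off); the function and instance names are hypothetical and do not model the actual ARR implementation.

```python
from itertools import cycle


def sticky_load(requests, instances):
    """Each client stays on the instance chosen for its first request."""
    load = {name: 0 for name in instances}
    assigned = {}
    picker = cycle(instances)
    for client in requests:
        if client not in assigned:
            assigned[client] = next(picker)
        load[assigned[client]] += 1
    return load


def round_robin_load(requests, instances):
    """Every request goes to the next instance, regardless of client."""
    load = {name: 0 for name in instances}
    picker = cycle(instances)
    for _client in requests:
        load[next(picker)] += 1
    return load


# One chatty client plus two occasional ones (illustrative traffic).
REQUESTS = ["heavy"] * 8 + ["a", "b"]
INSTANCES = ["instance-0", "instance-1"]

if __name__ == "__main__":
    print(sticky_load(REQUESTS, INSTANCES))
    print(round_robin_load(REQUESTS, INSTANCES))
```

With affinity, the instance that received the chatty client handles 9 of the 10 requests; without it, each instance handles 5.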

You can change these settings in the configurations section of the Azure Portal, under the General Settings tab:



Use Production Hardware

App Service offers a variety of hardware tiers (also known as SKUs) to suit different customer needs. When creating a new App Service Plan, you have an option to select a different hardware tier for your plan:

If your App Service Plan is used for production, please ensure that it is running on one of the recommended “production” pricing tiers. Moreover, if your application is resource intensive, make sure to select the pricing tier within the recommended ones that matches the needs of your app. For example, if your application consumes a lot of CPU cycles, running on an S1 pricing tier will not be ideal, as it could lead to high CPU usage that causes downtime or slowness in your app.


Leverage Deployment Slots

Before deploying your new code to production, you can leverage the Deployment Slots feature in App Service to test your changes. Deployment slots are live apps with their own host names. App content and configuration elements can be swapped between two deployment slots, including the production slot.

Deploying your application to a non-production slot has the following benefits:

  • You can validate app changes in a staging environment before swapping it into the production slot.
  • Deploying an app to a slot first and swapping it into production makes sure that all instances of the staging slot are warmed up before swapping into production. This eliminates downtime when you deploy your app. The traffic redirection is seamless, and no requests are dropped because of swap operations. You can automate this entire workflow by configuring auto swap.
  • After a swap, the slot with the previously staged app now has the previous production app. If the changes swapped into the production slot aren’t as you expect, you can perform the same swap immediately to get your “last known good site” back.

Please note that Deployment Slots are only available for Standard, Premium, or Isolated App Service plan tiers.

We highly recommend using Swap with Preview. Swap with Preview allows you to test the app in your staging slots against your production settings and also warm up the app. After doing your tests and warming up all the necessary paths, you can then complete the swap and the app will start receiving production traffic without restarting. This has a high impact on your app’s availability & performance.
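The swap-and-roll-back behavior described in this section can be pictured with a small model. This is a conceptual sketch only: the dictionary layout, the swap_slots function, and the ENVIRONMENT setting are hypothetical, and the real swap involves warm-up and routing steps that are not modeled here.

```python
def swap_slots(slots, source="staging", target="production"):
    """Return a new mapping with app content swapped between two slots.

    Slot-specific settings stay with their slot; only the app version moves.
    """
    swapped = {name: dict(state) for name, state in slots.items()}
    swapped[source]["app_version"] = slots[target]["app_version"]
    swapped[target]["app_version"] = slots[source]["app_version"]
    return swapped


SLOTS = {
    "production": {"app_version": "v1", "slot_settings": {"ENVIRONMENT": "production"}},
    "staging": {"app_version": "v2", "slot_settings": {"ENVIRONMENT": "staging"}},
}

if __name__ == "__main__":
    after = swap_slots(SLOTS)
    print(after["production"]["app_version"])   # new version is live
    rolled_back = swap_slots(after)
    print(rolled_back == SLOTS)                 # swapping again restores the old state
```

Performing the same swap a second time returns both slots to their earlier content, which is the "last known good site" recovery path mentioned above.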


Set your Health Check path

App Service allows you to specify a health check path on your apps. The platform pings this path to determine if your application is healthy and responding to requests. When your site is scaled out to multiple instances, App Service will exclude any unhealthy instance(s) from serving requests, improving your overall availability. Your app’s health check path should poll the critical components of your application, such as your database, cache, or messaging service. This ensures that the status returned by the health check path is an accurate picture of the overall health of your application.

  1. Go to Monitoring > Health Check in the Web App blade of the Azure portal.

  2. Set the value of the path that our service will ping.

  3. Hit save to save the configuration.

Please note that the Health Check feature works only when you have two or more instances, which is a very strong recommendation. For a single instance web app, the traffic is never blocked even if that single instance is encountering issues.
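A health check endpoint that polls critical components could be structured along these lines. This is a framework-independent sketch: the function names and the choice of 503 for failures are assumptions, and it assumes the platform treats any 2xx response as healthy.

```python
def health_response(checks):
    """Run each dependency check and return (status_code, failing_names).

    checks maps a component name (database, cache, ...) to a callable that
    returns a truthy value when the component is reachable.
    """
    failing = []
    for name, check in checks.items():
        try:
            healthy = bool(check())
        except Exception:
            # A check that raises counts as a failed dependency.
            healthy = False
        if not healthy:
            failing.append(name)
    status = 200 if not failing else 503
    return status, failing


def instances_in_rotation(statuses):
    """Keep only the instances whose health check returned a 2xx status."""
    return [name for name, status in statuses.items() if 200 <= status < 300]
```

For example, health_response({"database": ping_db, "cache": ping_cache}) would return 200 only when both hypothetical ping functions succeed, so a broken dependency takes the instance out of rotation rather than letting it serve failing requests.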


Use Application Initialization

Application Initialization ensures that your app instances have fully started before they start serving requests. Application Initialization is used during site restarts, auto scaling, and manual scaling. This is a critical feature where hitting the site’s root path is not sufficient to start the application. For this purpose, an unauthenticated warm-up path must be created on the app, and App Init should be configured to use this URL path.

Try to make sure that the method behind the warm-up URL exercises the functions of all important routes and returns a response only when warm-up is complete. The site will be put into production as soon as the warm-up URL returns a response (success or failure), at which point app initialization will assume “everything is fine with the app”. App Initialization can be configured for your app within the web.config file.
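The shape of such a warm-up method might look like the sketch below. It is an illustration only: warm_up and the step names are hypothetical, and the web.config wiring is not shown.

```python
import time


def warm_up(steps):
    """Run every warm-up step in order and return only after all have finished.

    steps is a list of (name, callable) pairs, one per important route or
    component that should be exercised before the instance takes traffic.
    """
    started = time.monotonic()
    completed = []
    for name, step in steps:
        step()  # Blocks until this route or component is warm.
        completed.append(name)
    return {"completed": completed, "seconds": time.monotonic() - started}
```

Because the platform treats any response as a signal that the app is ready, the function deliberately does not return early; an exception raised by a step would surface as a failed response, so steps should be written to tolerate transient errors.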


Enable Local Cache

When this feature is enabled, the site content is read from and written to the local virtual machine instance instead of being fetched from Azure storage (where site content is stored). This will reduce the number of recycles required for the app. It can be enabled through the Azure portal from “General -> Application settings”. On this page, under the App settings section, add WEBSITE_LOCAL_CACHE_OPTION as the key and 'Always' as the value. Also add WEBSITE_LOCAL_CACHE_SIZEINMB with the desired local cache size, up to 2000 MB (if not provided, it defaults to 300 MB). Setting the cache size helps especially when the site contents are larger than 300 MB. Ensure that the site contents are less than 2000 MB for this feature to work. It is also good practice to keep it as a slot setting so that it does not get removed with a swap.

The most important thing to keep in mind here is that the app should not write to the local disk to persist state for its data or transactions. External storage, such as storage containers, a database, or Cosmos DB, should be used for storage purposes.

Please note that the behavior of Local Cache depends on the language and CMS you are using. For best results, we recommend using it for .NET and .NET Core apps, as long as the app does not perform local writes.
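The sizing rules above can be captured in a small helper that produces the two app settings. This is a sketch: the local_cache_settings function and its headroom_mb parameter are assumptions made for the example, while the setting names, the 300 MB default, and the 2000 MB limit come from the text.

```python
DEFAULT_SIZE_MB = 300
MAX_SIZE_MB = 2000


def local_cache_settings(content_size_mb, headroom_mb=100):
    """Build the Local Cache app settings for a site of the given size.

    headroom_mb is a hypothetical safety margin on top of the content size.
    """
    if content_size_mb >= MAX_SIZE_MB:
        raise ValueError("site content must be smaller than 2000 MB for Local Cache")
    # Never go below the default and never exceed the maximum.
    size = min(MAX_SIZE_MB, max(DEFAULT_SIZE_MB, content_size_mb + headroom_mb))
    return {
        "WEBSITE_LOCAL_CACHE_OPTION": "Always",
        "WEBSITE_LOCAL_CACHE_SIZEINMB": str(size),
    }
```

For a 500 MB site this yields a cache size of 600 MB; the resulting key/value pairs would then be entered as app settings (ideally as slot settings) in the portal.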


Auto-Heal

Sometimes your application might experience unexpected behaviors that could be resolved by a simple restart. The Auto-Heal feature allows you to do exactly that! It allows you to define the ‘condition’ that would trigger Auto-Heal and the ‘action’ that Auto-Heal will initiate when the condition is met.

You can create an Auto-Heal mitigation rule by going to the “Diagnose and Solve problems” section -> “Diagnostic Tools” tile and then selecting “Auto-Heal” under the Proactive Tools section.

Below are example filter values to set up. If a different error code or frequency suits your application better, please modify the values accordingly:

Condition               Value
Request Count           70
Status Code             500
Sub-status code         0
Win32-status code       0
Frequency in seconds    60

Once the condition above is met, we recommend configuring an action to:

  • Recycle Process

and add an ‘Override when Action Executes’:

  • Startup Time for process before auto heal executes: 3600 seconds (1 hour)
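The rule above can be read as: recycle the process once 70 responses with status 500 occur within 60 seconds, but only if the process has already been running for an hour. The sketch below models that reading with a sliding window; the AutoHealRule class is hypothetical, and the interpretation of the startup-time override is an assumption rather than a description of the platform's internals.

```python
from collections import deque


class AutoHealRule:
    """Sliding-window model of a status-code Auto-Heal trigger (illustrative)."""

    def __init__(self, request_count=70, status_code=500,
                 window_seconds=60, startup_grace_seconds=3600):
        self.request_count = request_count
        self.status_code = status_code
        self.window_seconds = window_seconds
        self.startup_grace_seconds = startup_grace_seconds
        self._hits = deque()

    def record(self, timestamp, status_code, process_uptime):
        """Record one response; return True when the recycle action should fire."""
        if status_code == self.status_code:
            self._hits.append(timestamp)
        # Drop matching responses that have fallen out of the time window.
        while self._hits and timestamp - self._hits[0] >= self.window_seconds:
            self._hits.popleft()
        if (len(self._hits) >= self.request_count
                and process_uptime >= self.startup_grace_seconds):
            self._hits.clear()
            return True
        return False
```

Feeding the rule 70 failing responses half a second apart with an uptime above 3600 seconds fires the action on the 70th response; the same burst during the first hour after startup does not.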


Minimize App Service Plan Density

Ensure that no more than 8 apps are running on the App Service plan to maintain healthy performance. All the apps running on the App Service plan can be seen under “Apps” in the “Settings” section of your App Service plan in the Azure portal.


Monitor Disk Space usage


Ensure that the disk space used by the www folder is less than 1 GB. Keeping it small reduces downtime during app restarts and hence improves application performance. File system usage can be tracked from the “App Service Plan -> Quotas” section in the Azure portal.
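Outside the portal, folder size can also be checked with a short script, for instance as part of a build or deployment step. The following is a minimal sketch; the function names are hypothetical, and the path to check is left to the caller.

```python
import os
import tempfile

ONE_GB = 1024 ** 3


def directory_size_bytes(root):
    """Sum the sizes of all regular files under root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def is_within_limit(root, limit_bytes=ONE_GB):
    """True when the folder uses less than the given limit (1 GB by default)."""
    return directory_size_bytes(root) < limit_bytes


def demo():
    """Create a throwaway folder with one 10-byte file and measure it."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "nested"))
        with open(os.path.join(root, "nested", "file.bin"), "wb") as handle:
            handle.write(b"x" * 10)
        return directory_size_bytes(root), is_within_limit(root)
```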

Enable Application Insights

Application Insights offers a suite of features that empower you to troubleshoot incidents that happen on your app. You can use it to debug code errors, diagnose performance degradations caused by dependencies and more.


One of the powerful features of Application Insights is the App Insights Profiler. Enabling Application Insights Profiler provides you with performance traces for your applications that are running in production in Azure. Profiler captures the data automatically at scale without negatively affecting your users. Profiler helps you identify the “hot” code paths that take the longest when handling a web request. Profiler works with .NET applications. To enable it, go to your Application Insights resource in the Azure portal and click on Performance under Investigate.

  1. In the Performance pane click on “Configure Profiler”

  2. In the pane that opens after that, click on “Profile Now” to start profiling.

  3. When Profiler is running, it profiles randomly about once per hour and for a duration of two minutes. If your application is handling a steady stream of requests, Profiler uploads traces every hour. To view traces, in the Performance pane, select Take Actions, and then select the Profiler Traces button.

  4. App Insights also allows you to track dependencies in your application. You can leverage this feature to troubleshoot slow requests. To automatically track dependencies from .NET console apps, install the NuGet package Microsoft.ApplicationInsights.DependencyCollector and initialize DependencyTrackingTelemetryModule.

  5. Each request event is associated with the dependency calls, exceptions, and other events that are tracked while your app is processing the request. So if some requests are doing badly, you can find out whether it’s because of slow responses from a dependency. You can see a waterfall view of the requests in the performance blade as well under the “Dependencies” tab:


You can also leverage our newly released App Insights integration with App Service Diagnostics.


Deploy in Multiple Regions

You can deploy Azure Front Door or Azure Traffic Manager to intercept traffic before it hits your site. They help in routing and distributing traffic between your instances and regions. By investing in one of them, you can still guarantee that your app will run and serve requests in the event that a catastrophic incident happens in one of the Azure datacenters.
There are additional benefits to using Front Door or Traffic Manager, such as routing incoming requests based on the customers’ geography to provide the shortest response time, and distributing the load among your instances so that none of them is overloaded with requests.
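The routing ideas in this section (skip unhealthy regions, prefer the customer's geography, spread the load) can be sketched as a simple selection function. This does not use or reproduce the Front Door or Traffic Manager APIs; pick_region, the record fields, and the region data are all hypothetical.

```python
def pick_region(regions, client_geo):
    """Choose a region for a request, or None when no region is healthy.

    Unhealthy regions are skipped, regions in the client's geography are
    preferred, and ties are broken by the lowest number of active requests.
    """
    healthy = [r for r in regions if r["healthy"]]
    if not healthy:
        return None
    local = [r for r in healthy if r["geo"] == client_geo]
    pool = local or healthy  # Fall back to any healthy region.
    return min(pool, key=lambda r: r["active_requests"])["name"]


# Illustrative deployment spread over two geographies.
REGIONS = [
    {"name": "west-europe", "geo": "EU", "healthy": True, "active_requests": 40},
    {"name": "north-europe", "geo": "EU", "healthy": True, "active_requests": 10},
    {"name": "east-us", "geo": "US", "healthy": True, "active_requests": 5},
]
```

If both European regions in this example were marked unhealthy, a European client would be routed to east-us instead of receiving an error, which is the failover behavior the section describes.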


Check App Service Diagnostics

Finally, you can check the progress you’ve made in making your app resilient by leveraging the “Risk Assessments” section available in App Service Diagnostics.

You’ll be presented with 2 options:

  • Best Practices for Availability & Performance
  • Best Practices for Optimal Configuration

We recommend that you follow all the best practices listed in those detectors and get them all to green!


Finally, we also recommend that you take a look at the Cloud Design Patterns document to minimize the application start time and follow more resiliency recommendations.

Feel free to post any questions about App Resiliency on the MSDN Forum.