Sudarshan's Blog

My Thoughts, Findings & Experiences

Managing Azure App Service Routing Rules

June 12, 2016 21:35

Azure App Service has a very good and powerful feature: traffic routing. Traffic routing allows you to test your production website with live customers!! This feature lets you expose a new version of your website to a subset of live users, so that you can verify functionality, performance, and any bugs. If you see issues, you can move traffic back to the old site; otherwise, you can gradually move users to the new version of the application. This allows you to do rapid deployments without application downtime.

You can use this feature if your Azure App Service is running in Standard or Premium mode, and the web application must be deployed to one or more deployment slots. Traffic routing can be configured in two ways:

  1. Through the new Azure Portal
  2. Using PowerShell


In this post, we will take a look at how to do traffic routing using Azure PowerShell. If you are planning to use automated release management, then PowerShell is the way to go. Here is how you can route traffic using PowerShell:

# Create a routing rule object (classic Azure PowerShell module)
$RoutingRule = New-Object Microsoft.WindowsAzure.Commands.Utilities.Websites.Services.WebEntities.RampUpRule
$RoutingRule.ActionHostName = $ActionHostName
$RoutingRule.ReroutePercentage = $ReroutePercentage
$RoutingRule.Name = "ProductionRouting"

# Apply the rule to the production slot of the site
Set-AzureWebsite $WebsiteName -Slot Production -RoutingRules $RoutingRule

Here is a description of the parameters:
  • $WebsiteName: Name of the Azure web app
  • $ActionHostName: Host name of the deployment slot you want to route traffic to
  • $ReroutePercentage: Percentage of users you want to move to the new deployment

You can route 100% of the traffic to the new deployment at once, or you can move it gradually. Gradual traffic movement can be configured through PowerShell as well. The settings below start the ramp at 1%, change the percentage in steps of 5 every 10 minutes, and stop at 80%:

# These properties can be used for gradual ramp-up or ramp-down
$RoutingRule.ChangeIntervalInMinutes = 10   # re-evaluate the percentage every 10 minutes
$RoutingRule.ChangeStep = 5                 # change the percentage by 5 each interval
$RoutingRule.MinReroutePercentage = 1       # lower bound of the ramp
$RoutingRule.MaxReroutePercentage = 80      # upper bound of the ramp
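
Putting it together, here is a minimal sketch of a gradual ramp-up rule. The site name and slot hostname are placeholders (slot hostnames follow the sitename-slotname.azurewebsites.net pattern), and this assumes the classic Azure PowerShell module used throughout this post:

```powershell
# Placeholders: substitute your own web app and slot names
$WebsiteName    = "mysite"
$ActionHostName = "mysite-staging.azurewebsites.net"  # hostname of the target slot

$RoutingRule = New-Object Microsoft.WindowsAzure.Commands.Utilities.Websites.Services.WebEntities.RampUpRule
$RoutingRule.ActionHostName = $ActionHostName
$RoutingRule.Name = "ProductionRouting"
$RoutingRule.ReroutePercentage = 1           # start with 1% of live traffic
$RoutingRule.ChangeIntervalInMinutes = 10    # adjust every 10 minutes...
$RoutingRule.ChangeStep = 5                  # ...in steps of 5 percentage points
$RoutingRule.MinReroutePercentage = 1
$RoutingRule.MaxReroutePercentage = 80       # never send more than 80% to the new slot

Set-AzureWebsite $WebsiteName -Slot Production -RoutingRules $RoutingRule
```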

BUT, how do you remove a traffic routing rule?

Let's say we want to move all traffic back to the old website, i.e., to the Production slot. There is very little documentation available about how to remove a traffic routing rule, and I spent a lot of time researching how to do it. Here is how it can be achieved:

Set-AzureWebsite $WebsiteName -Slot Production -RoutingRules @()

This will remove the traffic routing rules that you have added.

Here is a parameterized version of the PowerShell script for traffic routing management:

param(
    [string] [Parameter(Mandatory=$true)] $WebsiteName,
    [string] [Parameter(Mandatory=$true)] $DeploymentSlot,
    [string] [Parameter(Mandatory=$true)] $ReroutePercentage
)

Write-Host "$WebsiteName : Moving Traffic to Slot - $DeploymentSlot => Start"

if ($DeploymentSlot.ToLower() -eq "production")
{
    # Routing back to production: simply remove all routing rules
    Write-Host "Removing Routing Rules"
    Set-AzureWebsite $WebsiteName -Slot Production -RoutingRules @()
}
else
{
    # Slot hostnames follow the <site>-<slot>.azurewebsites.net pattern
    $ActionHostName = $WebsiteName + "-" + $DeploymentSlot + ".azurewebsites.net"
    Write-Host $ActionHostName

    $RoutingRule = New-Object Microsoft.WindowsAzure.Commands.Utilities.Websites.Services.WebEntities.RampUpRule
    $RoutingRule.ActionHostName = $ActionHostName
    $RoutingRule.ReroutePercentage = $ReroutePercentage
    $RoutingRule.Name = "ProductionRouting"

    # The properties below can be used for gradual ramp-up or ramp-down
    #$RoutingRule.ChangeIntervalInMinutes = 10
    #$RoutingRule.ChangeStep = 5
    #$RoutingRule.MinReroutePercentage = 1
    #$RoutingRule.MaxReroutePercentage = 80

    Set-AzureWebsite $WebsiteName -Slot Production -RoutingRules $RoutingRule
}

Write-Host "$WebsiteName : Moving Traffic to Slot - $DeploymentSlot => End"
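
Assuming the script above is saved as, say, Set-TrafficRouting.ps1 (the file name and site name here are placeholders), moving 10% of traffic to a staging slot and then rolling back would look like this:

```powershell
# Route 10% of live traffic to the "staging" slot of "mysite"
.\Set-TrafficRouting.ps1 -WebsiteName "mysite" -DeploymentSlot "staging" -ReroutePercentage 10

# Roll back: route everything to production again (removes the routing rules)
.\Set-TrafficRouting.ps1 -WebsiteName "mysite" -DeploymentSlot "production" -ReroutePercentage 0
```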

Happy programming!

Azure App Service's hidden gem: the Local Cache feature

May 25, 2016 21:42

The Azure App Service Local Cache feature will help you make your web application highly available, performant, and resilient to maintenance and upgrades in Azure. The Local Cache feature can be enabled for any web application running on any platform (.NET, Java, or PHP). You will see visible improvements in performance and response times, and a reduced number of site failures and downtimes. If your web application does heavy I/O, this feature is really helpful. Let's see how the Local Cache feature works.

Azure Web App (also known as Websites) is a PaaS (Platform as a Service) offering to host your web applications. It gives you some really nice features like dynamic scaling in terms of the number of instances or machine sizes, sticky sessions, load balancing, traffic routing etc. without worrying about maintenance and patching of the servers.

Let's understand the deployment of Azure Web Apps. You might think that if a website is running on one instance, a single virtual machine is allocated to it, and if it is running on ten instances, ten virtual machines are allocated. This is true, but it is not the complete picture. Here is what an Azure Web App deployment looks like:

Conceptual Deployment of Azure App Service


As shown in the diagram, the deployment consists of three components:

  1. Front-end server
    All web requests terminate on this server.
    Its job is to handle the HTTPS certificate and forward each request to an appropriate worker role for execution.

  2. Worker role
    This is responsible for hosting the web application.
    It executes the web request and returns the response to the user.

  3. Shared network drive
    The website contents are stored on this shared network drive.
    Content stored here is shared across all worker roles (VM instances).
    This content can be accessed via FTP or the SCM website (Kudu portal).

When you deploy the website, the contents are copied to the shared network drive. The worker role hosts the application in IIS (assuming it's a .NET application), with the content location pointing to the shared network drive. This is why your application scales out so fast: a new instance only needs to point at the existing content.

This deployment structure works in most cases, BUT it does not work well in these scenarios:

  • If you want a high performing application
  • If you want a highly available application

Why doesn't it work for the above scenarios?

  • When content is stored at a shared location, latency is added to every content access.
  • The application is dependent on the availability of the shared network drive. If the connection to the shared network drive is lost, the application goes down.
  • If the connection to the shared network drive is lost and then restored, the application restarts. If your application bootstraps certain data at start-up, this can add further delay. For example, an Umbraco application creates indexes, generates XML files, etc. as part of its start-up process, so it takes time to start.
  • We have also observed that if your application does a lot of disk I/O, frequent storage connection failures can occur, which result in application restarts.

How do we avoid shared network drive connection failures and make our application highly available and performant?

Answer is: Use "LOCAL CACHE" feature

When you enable the Local Cache feature for your website, each worker role (VM instance) gets its own copy of the website contents. This is a write-but-discard cache of your shared storage content, created asynchronously at site startup. Once the local cache is ready on a worker role, the site switches to run against the locally cached contents. This gives you the following benefits:

  • It eliminates the latency of accessing shared content
  • Your website is unaffected if the shared storage is undergoing planned upgrades, unplanned downtime, or any other disruption
  • There are fewer application restarts caused by shared storage issues


How to enable "LOCAL CACHE" feature?

Enabling the Local Cache feature is simple. You just need to add two settings in the Application Settings section of your web app:


Local Cache Portal Settings

The default size of the local cache is 300 MB, and you can increase it up to 1 GB, so the valid value range is 300-1000 (in MB).
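
For reference, the two app settings shown in the portal screenshot are Microsoft's documented WEBSITE_LOCAL_CACHE_OPTION and WEBSITE_LOCAL_CACHE_SIZEINMB settings. Here is a sketch of setting them with the classic Azure PowerShell module used earlier in this post (the site name is a placeholder; note that -AppSettings replaces the existing settings, so we merge with the current ones first):

```powershell
# Placeholder site name
$site = Get-AzureWebsite "mysite"

# Merge the Local Cache settings into the existing app settings
$settings = $site.AppSettings
$settings["WEBSITE_LOCAL_CACHE_OPTION"]   = "Always"
$settings["WEBSITE_LOCAL_CACHE_SIZEINMB"] = "500"   # valid range: 300-1000 (MB)

Set-AzureWebsite "mysite" -AppSettings $settings
```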

Once you save the settings, you need to restart the site. After the restart, the platform sees the settings and copies the website contents (the D:\home folder) locally onto the worker role. Once the content copy is done, the website runs in Local Cache mode.

Important: Every time the web application is restarted, the content from the shared location is copied to the worker role machine again.


Downsides of "LOCAL CACHE" feature

The Local Cache feature sounds fascinating and makes your web application more performant and highly available. But it has some limitations too. Let's discuss some of them; you need to carefully evaluate your application and deployment process against these limitations.

  • Newly deployed code changes will not be reflected until you restart the site
  • If your web application writes logs into the web content (for example, the 'App_Data' folder), these log files will be discarded when the web application is restarted or moved to a different virtual machine
  • If your application uploads media files or any other files into the web content, these files will NOT be shared across instances (if you have multiple instances), and newly added content will be discarded when the web application is restarted or moved to a different virtual machine
  • If your web application content is larger than 1 GB, you cannot use this feature

To address these limitations, you might need to update your application code. For example, in our case we changed the code to store media files in blob storage. When a user uploads a new media file (regardless of which web instance the user is connected to), it is stored in blob storage and hence available to serve from all instances. And even if the web application is restarted or moved to a different virtual machine, we do not lose any media files.
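
As an illustration of that design, here is a minimal sketch using the classic Azure Storage PowerShell cmdlets. The storage account, key, container, and file names are all placeholders, not the actual code we ran:

```powershell
# Placeholders: storage account credentials and container name
$ctx = New-AzureStorageContext -StorageAccountName "mystorageacct" `
                               -StorageAccountKey $storageKey

# Store the uploaded media file in blob storage instead of App_Data,
# so it survives restarts and is visible to every instance
Set-AzureStorageBlobContent -File "C:\uploads\photo.jpg" `
                            -Container "media" `
                            -Blob "photo.jpg" `
                            -Context $ctx
```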


Final Thoughts

Local Cache is a very useful feature and a hidden gem of Azure App Service. It gives the web application a performance boost and high resiliency.

We have seen a ~100% performance boost in response times for our applications (response time went down from ~250 ms to ~120 ms) and have not seen storage connection failure issues since.

We recommend the Local Cache feature if your application is ready for it!