URL Rewriting in EPiServer CMS

EPiServer CMS supports arbitrary rewriting of incoming and outgoing URLs. This text describes the architecture and how to use some of the capabilities.

What is ‘real’ URL Rewriting?

The topic of URL rewriting is common on the Internet and in ASP.NET-related documentation. However, the term most frequently refers only to rewriting incoming URLs. EPiServer CMS supports seamless and arbitrary rewriting of both incoming and outgoing URLs, including all HTML rendered by the system. This means that all links in all HTML are automatically rewritten and rebased to reflect the situation in the client browser. Rewriting outgoing HTML prevents a host of problems and ensures that the user sees URLs in the proper format regardless of how they were originally generated. It also isolates the developer from the issue, so that normal pages need no special consideration for rewriting. In effect, EPiServer includes a built-in, fully featured reverse proxy.

The architecture of URL Rewriting in EPiServer CMS

The EPiServer URL Rewrite module

Rewriting of incoming request URLs and outgoing HTML is handled by a loadable module implementing the System.Web.IHttpModule interface. The standard module provided with EPiServer is EPiServer.Web.UrlRewriteModule, which inherits its basic functionality from EPiServer.Web.UrlRewriteModuleBase. The module does not perform the actual rewriting itself; it depends on a dynamically loaded UrlRewriteProvider for that task.

The UrlRewriteProvider

Any code that needs to rewrite URLs does so with an instance of a class derived from EPiServer.Web.UrlRewriteProvider that is accessed via EPiServer.Global.UrlRewriteProvider. EPiServer will create this instance from information found in the <episerver><urlRewrite> element in Web.Config, thus making it fully replaceable with any code implementing the contract defined by the base class.

URL rewriting is about mapping an internal representation of a URL to an external representation, and vice versa. The external representation is what the user sees in the client browser; the internal representation is what the code sees (and what the user would see if rewriting were disabled). Note that the mapping between internal and external must be one-to-one (bijective): each internal URL maps to exactly one external URL, and each external URL maps back to exactly one internal URL. A UrlRewriteProvider implements the following methods and properties:

Configuring the UrlRewriteProvider

Regardless of whether any friendly URL rewriting is performed, a UrlRewriteProvider must still be configured.

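EPiServer comes with three predefined providers, all of which appear in the configuration below: EPiServer.Web.FriendlyUrlRewriteProvider, EPiServer.Web.IdentityUrlRewriteProvider (the identity URL rewriter), and EPiServer.Web.NullUrlRewriteProvider (the bypass URL rewriter).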

Configuration is done via Web.Config in the <episerver><urlRewrite> section:

<urlRewrite defaultProvider="EPiServerFriendlyUrlRewriteProvider">
  <providers>
    <add name="EPiServerFriendlyUrlRewriteProvider"
        type="EPiServer.Web.FriendlyUrlRewriteProvider,EPiServer"
        enableSimpleAddress="true"
        friendlyUrlCacheAbsoluteExpiration="0:0:10" />
    <add description="EPiServer identity URL rewriter"
        name="EPiServerIdentityUrlRewriteProvider"
        type="EPiServer.Web.IdentityUrlRewriteProvider,EPiServer" />
    <add description="EPiServer bypass URL rewriter"
        name="EPiServerNullUrlRewriteProvider"
        type="EPiServer.Web.NullUrlRewriteProvider,EPiServer" />
  </providers>
</urlRewrite>

Use the <providers><add> element to add any custom providers that you implement, giving each provider a name, and a fully qualified type and assembly reference.

The provider that will be loaded and actually used is determined by the defaultProvider attribute on the <urlRewrite> element, which refers to the name attribute of the provider listed in the <providers><add> element.
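
As an illustration of these two steps, the fragment below registers a custom provider and selects it as the default. This is only a sketch: the provider name MyUrlRewriteProvider, the type MyCompany.Web.MyUrlRewriteProvider, and the assembly MyCompany.Web are hypothetical placeholders, not part of EPiServer.

```xml
<urlRewrite defaultProvider="MyUrlRewriteProvider">
  <providers>
    <!-- Hypothetical custom provider; replace the type and assembly with your own -->
    <add name="MyUrlRewriteProvider"
        type="MyCompany.Web.MyUrlRewriteProvider,MyCompany.Web" />
    <!-- Built-in provider kept in the list so that it can be reselected via defaultProvider -->
    <add name="EPiServerFriendlyUrlRewriteProvider"
        type="EPiServer.Web.FriendlyUrlRewriteProvider,EPiServer"
        enableSimpleAddress="true"
        friendlyUrlCacheAbsoluteExpiration="0:0:10" />
  </providers>
</urlRewrite>
```

Keeping the built-in providers registered alongside the custom one makes it possible to switch back by changing only the defaultProvider attribute.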

Note that a provider may have custom configuration settings.

Custom settings for FriendlyUrlRewriteProvider

The Friendly URL provider supports two custom settings: enableSimpleAddress, which enables or disables the simple address feature, and friendlyUrlCacheAbsoluteExpiration, the timeout for friendly URL resolution, which includes a not-found cache. This timeout is intentionally very brief, 10 seconds by default, so that changes in the underlying site structure take effect quickly. Note that this cache does not support invalidation events, only invalidation by timeout. Its primary purpose is to quickly resolve requests that are not found as friendly URLs.

Configuring the UrlRewriteModule

To have EPiServer perform URL rewriting according to the configured provider, incoming requests, outgoing redirections, and outgoing HTML must all be intercepted and rewritten. This is the job of UrlRewriteModule, an implementation of System.Web.IHttpModule. Enabling URL rewriting for a site is thus a two-step process: first configure how URLs are rewritten by selecting a UrlRewriteProvider, then configure UrlRewriteModule so that the appropriate requests and responses are rewritten using that provider. The built-in module is configured in Web.Config under the <system.web><httpModules> element as follows:

<httpModules>
  <add name="UrlRewriteModule"
       type="EPiServer.Web.UrlRewriteModule, EPiServer" />
</httpModules>

If the <add> element is not included in the <httpModules> element, URL rewriting is not enabled for the site. In that case you should also configure the NullUrlRewriteProvider as the default UrlRewriteProvider.
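
A sketch of a site with rewriting switched off, assuming the provider names used in the provider configuration shown earlier: the UrlRewriteModule entry is left out of <httpModules>, and defaultProvider points at the NullUrlRewriteProvider. Other modules that a real site registers are omitted here.

```xml
<!-- In <episerver>: select the bypass provider -->
<urlRewrite defaultProvider="EPiServerNullUrlRewriteProvider">
  <providers>
    <add description="EPiServer bypass URL rewriter"
        name="EPiServerNullUrlRewriteProvider"
        type="EPiServer.Web.NullUrlRewriteProvider,EPiServer" />
  </providers>
</urlRewrite>

<!-- In <system.web>: no UrlRewriteModule entry -->
<httpModules>
</httpModules>
```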

How to work with URLs and HTML in EPiServer

As a consequence of the integrated support for URL rewriting, your code must not assume that it knows the final form of any URL. If you work with EPiServer pages and content, your code does not need to take any special action; everything is handled transparently. If, however, your code outputs URLs or HTML through channels other than regular page rendering and redirection, it must explicitly ensure that any necessary rewriting takes place. For this, the code uses the UrlRewriteProvider instance, just as EPiServer itself does in UrlRewriteModule and elsewhere. Methods are provided to conveniently rewrite from internal to external format and vice versa, as well as to rewrite strings and filter streams with automatic parsing of HTML and rewriting of URLs. Please read the SDK documentation for EPiServer.Web.UrlRewriteProvider for details.
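
For example, code that emits a page link outside regular page rendering could rewrite it explicitly. The following is a minimal sketch, not a definitive implementation: the class ExternalLinkHelper and the method GetExternalUrl are hypothetical names, and the sketch assumes that PageData.LinkURL holds the internal form of the URL and that ConvertToExternal modifies the UrlBuilder in place. Check the SDK for the exact behavior in your version.

```csharp
using System.Text;
using EPiServer;
using EPiServer.Core;

// Hypothetical helper class, not part of EPiServer.
public static class ExternalLinkHelper
{
    // Returns the external (client-facing) form of a page link.
    public static string GetExternalUrl(PageData page)
    {
        // Assumption: LinkURL is the internal representation of the page URL.
        UrlBuilder url = new UrlBuilder(page.LinkURL);

        // Let the configured provider rewrite the URL.
        // Assumption: the UrlBuilder is updated in place; the bool return value is ignored here.
        EPiServer.Global.UrlRewriteProvider.ConvertToExternal(url, page.PageLink, Encoding.UTF8);

        return url.ToString();
    }
}
```

Going through EPiServer.Global.UrlRewriteProvider rather than building the URL by hand means the result follows whichever provider is configured, including the NullUrlRewriteProvider.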

Extending and modifying URL and HTML rewriting

Since the architecture is based on replaceable providers and modules and is separate from the basic EPiServer functionality, customization can be performed to just about any degree. Several models of customization are possible; choose the simplest one that meets the need. In order of increasing implementation complexity and flexibility, the models are event subscription, inheritance, and independent implementation, each described below.

Customization via event subscription

UrlRewriteModule events

Since UrlRewriteModule implements IHttpModule, any number of instances may be active in an application. Subscribing to an event is thus a two-stage process: first subscribe to a static event that is raised every time ASP.NET creates a new instance, then, in that event handler, subscribe to the relevant instance events.

HttpRewriteInit event
public static event EventHandler<UrlRewriteEventArgs> HttpRewriteInit;

To subscribe to events associated with the rewriting of HTTP request URLs, use code like this:

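// Static constructor of a class named Module: runs once, so the static
// HttpRewriteInit event is subscribed to exactly once.
static Module()
{
    UrlRewriteModuleBase.HttpRewriteInit += UrlRewriteModuleBase_HttpRewriteInit;
}

// Raised each time ASP.NET creates a new module instance; sender is that instance.
static void UrlRewriteModuleBase_HttpRewriteInit(object sender, EPiServer.Web.UrlRewriteEventArgs e)
{
    UrlRewriteModuleBase module = sender as UrlRewriteModuleBase;
    if (module != null)
    {
        // Subscribe to the instance event of the newly created module.
        // module_HttpRewritingToInternal is a handler defined elsewhere in the class.
        module.HttpRewritingToInternal += module_HttpRewritingToInternal;
    }
}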

Note that since the event is static, the subscription must be done only once, either in a static constructor or via some other mechanism that guarantees once-only initialization.

UrlRewriteModule instance events

The following events are supported as instance events that can be subscribed to via the HttpRewriteInit event. See the SDK for details about these events.

public event EventHandler<UrlRewriteEventArgs> HttpRewritingToInternal;
public event EventHandler<UrlRewriteEventArgs> HttpRewroteToInternal;
public event EventHandler<UrlRewriteEventArgs> HttpRewritingToExternal;
public event EventHandler<UrlRewriteEventArgs> HttpRewroteToExternal;
public event EventHandler<UrlRewriteEventArgs> HtmlAddingRewriteToExternalFilter;
public event EventHandler<UrlRewriteEventArgs> HtmlAddedRewriteToExternalFilter;

The Http-prefixed events are for URLs that are used as HTTP request URLs, such as the incoming URL in the original request, as well as URLs sent to the client for redirection.
The Html-prefixed events are part of the process to rewrite outgoing URLs in HTML. To actually modify the rewriting of these URLs you must subscribe to the specific events provided by the HTML rewriting.

HtmlRewriteToExternal events

public static event EventHandler<HtmlRewriteEventArgs> HtmlRewriteInit;

EPiServer.Web.HtmlRewriteToExternal is an abstract base class that handles common tasks in rewriting outgoing HTML. To modify the way HTML is rewritten in EPiServer, your code must subscribe to a static initialization event that is raised every time an HTML rewriter is instantiated. Use this event to subscribe to the instance events.

[PagePlugIn()]
public class MyClass
…
public static void Initialize(int optionFlag)
{
    HtmlRewriteToExternal.HtmlRewriteInit += HtmlRewriteToExternal_HtmlRewriteInit;
}

static private void HtmlRewriteToExternal_HtmlRewriteInit(object sender, HtmlRewriteEventArgs e)
{
    MyClass myClass = new MyClass();
    e.RewritePipe.HtmlRewriteName += myClass.HtmlRewriteNameEventHandler;
    e.RewritePipe.HtmlRewriteValue += myClass.HtmlRewriteValueEventHandler;
}

The above sample uses the PagePlugIn mechanism to ensure that the static Initialize is called once and only once.

Event handlers will be called as the HtmlRewriteName and HtmlRewriteValue events are raised, enabling selective rewriting of the HTML.

Please note that the HTML is parsed in a streaming model, so the events are raised as the HTML is parsed. There is no DOM available, unless your code builds one (which should be avoided for performance reasons). See the ViewStateMover.cs example in the sample code package for a fully functioning sample of how to do some advanced manipulation of all HTML sent out by EPiServer.

HtmlRewrite instance events

At a higher level of abstraction than the pure HTML stream provided by the HtmlRewritePipe, events are raised for every URL that is rewritten in outgoing HTML. These events can be subscribed to via the HtmlRewriteInit event.

public event EventHandler<UrlRewriteEventArgs> HtmlRewritingUrl;
public event EventHandler<UrlRewriteEventArgs> HtmlRewroteUrl;

Please see the SDK for details of how to use these events.

HtmlRewritePipe instance events

As HTML is being parsed, a number of events are raised. In fact, at least one event is raised for every part of the parsed HTML, enabling any transformation, including clean-up, compression, injection, validation etc. The following events are provided:

public event EventHandler<HtmlRewriteEventArgs> HtmlInit;
public event EventHandler<HtmlRewriteEventArgs> HtmlRewriteBegin;
public event EventHandler<HtmlRewriteEventArgs> HtmlRewriteUrl;
public event EventHandler<HtmlRewriteEventArgs> HtmlRewriteName;
public event EventHandler<HtmlRewriteEventArgs> HtmlRewriteValue;
public event EventHandler<HtmlRewriteEventArgs> HtmlRewriteEnd;

Please see the SDK for details, as well as the ViewStateMover.cs sample code.

UrlRewriteProvider events

All actual rewriting of URLs goes through the configured UrlRewriteProvider. The provider raises pre- and post-events as it processes URLs. A reference to the global provider instance is available as:

EPiServer.Global.UrlRewriteProvider

Using this instance, the following events are available for subscription:

public event EventHandler<UrlRewriteEventArgs> ConvertingToInternal;
public event EventHandler<UrlRewriteEventArgs> ConvertedToInternal;
public event EventHandler<UrlRewriteEventArgs> ConvertingToExternal;
public event EventHandler<UrlRewriteEventArgs> ConvertedToExternal;
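
A subscription to one of these events might look like the sketch below. The class RewriteLogging and the method Attach are hypothetical names, and the Url property on UrlRewriteEventArgs is an assumption that should be checked against the SDK. Because the provider instance is global, the subscription only needs to be made once, for example at application start-up.

```csharp
using System.Diagnostics;
using EPiServer.Web;

// Hypothetical class, not part of EPiServer.
public static class RewriteLogging
{
    // Call once, for example at application start-up.
    public static void Attach()
    {
        EPiServer.Global.UrlRewriteProvider.ConvertedToExternal += OnConvertedToExternal;
    }

    // Runs after the provider has converted a URL to its external form.
    private static void OnConvertedToExternal(object sender, UrlRewriteEventArgs e)
    {
        // Assumption: UrlRewriteEventArgs exposes the processed URL as e.Url.
        Debug.WriteLine("Rewrote to external: " + e.Url);
    }
}
```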

Customization via inheritance

Both the UrlRewriteProvider and the UrlRewriteModule are dynamically loaded as configured in Web.Config above. It’s thus possible to replace the EPiServer-provided classes with custom implementations derived from these.

Inheriting from EPiServer.Web.UrlRewriteProvider

The following methods may be overridden in a derived class to customize behavior:

public abstract bool ConvertToInternal(UrlBuilder url, out object internalObject);
public abstract bool ConvertToExternal(UrlBuilder url, object internalObject, Encoding toEncoding);
public abstract HtmlRewriteToExternal GetHtmlRewriter();
public abstract bool IsIdKeep { get; }

Please see the SDK for details.

Inheriting from EPiServer.Web.UrlRewriteModule

The following methods may be overridden in a derived class to customize behavior:

protected virtual void OnHttpRewriteInit(UrlRewriteEventArgs e);
protected virtual void OnHttpRewritingToInternal(UrlRewriteEventArgs e);
protected virtual void OnHttpRewroteToInternal(UrlRewriteEventArgs e);
protected virtual void OnHttpRewritingToExternal(UrlRewriteEventArgs e);
protected virtual void OnHttpRewroteToExternal(UrlRewriteEventArgs e);
protected virtual void OnHtmlAddingRewriteToExternalFilter(UrlRewriteEventArgs e);
protected virtual void OnHtmlAddedRewriteToExternalFilter(UrlRewriteEventArgs e);

public virtual void Init(HttpApplication application);

protected abstract string HttpUrlRewriteToExternal(string url, bool bRemoveId);
protected abstract void HttpUrlRewriteToInternal(UrlBuilder url);
protected abstract void HtmlAddRewriteToExternalFilter(HttpApplication httpApplication);

The On… methods provide an alternative to subscribing to the events.

See the SDK for further details.

Customization via independent implementation

If the custom rewriting scheme is fully independent of the default scheme provided with EPiServer, it might make sense to reimplement the functionality completely. In most such cases it is sufficient to provide a custom UrlRewriteProvider, which governs all rewriting of URLs.
In rare cases there might be a need to replace the UrlRewriteModule as well. The only requirements are that the replacement implements IHttpModule and that it uses the UrlRewriteProvider for all actual URL rewriting.