Wednesday, April 11, 2018

Using WebP Format in Sitecore

This is a copy of this blog post, kept here so that all my articles are in one place.
WebP is a promising format that offers better compression than JPEG and PNG. According to Google's tests, lossless WebP images are 26% smaller than PNGs, and lossy WebP images are 25-34% smaller than comparable JPEG images at an equivalent SSIM quality index.
The format is becoming more and more popular on the web. Google promotes WebP and has released it as open source, so anyone can work with it and suggest improvements. But is it supported by the major browsers? According to CanIUse, not by all of them:


(support details as of 2018/01/21; they may change over time)

It is supported by Chrome, Opera, and Chrome for Android. Safari and Firefox are experimenting with WebP images but do not support them yet. IE and Edge don’t support it.
But how can it be used now, when not all browsers support it? Each browser (web client) sends an Accept header when requesting resources from a server. If this header contains image/webp, the web server knows that it can return the WebP format. Akamai CDN, for example, uses this behavior: it returns optimized WebP images to web clients that support them, and JPEG or PNG to those that don’t.
Let’s consider how this could be used in Sitecore. It makes no sense to add WebP support to the media library for now, because we can’t return this file format to all web clients. But it does make sense to return WebP to the web clients that can use it. Saving about 25% of image loading time can make a big difference to the user experience, especially on mobile. I decided not to write a new image optimizer from scratch, and instead added WebP support to a well-known image optimization tool for Sitecore: Dianoga.
To make it work, we need to understand how the Sitecore Media Library and its cache work. On each media request, Sitecore either creates files on disk or reuses previously created ones. By default these files are located under \Website\App_Data\MediaCache\website\{some folders}. Each folder contains files of two types. The first is the cached image itself: Sitecore doesn’t need to re-apply width, height, and other parameters on every media request; it does the processing only once. The second is an .ini file with metadata for the cached object. The metadata contains the following values:
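The Accept check described above can be sketched as a small helper. This is a minimal illustration that assumes the raw header value is available as a string; the class and method names are hypothetical. It also treats an explicit q=0 as a refusal, which goes a little beyond a plain "contains image/webp" test.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class WebPNegotiation
{
    // Returns true when the raw Accept header lists image/webp
    // and does not rule it out with a zero (or unparsable) q-value.
    public static bool AcceptsWebP(string acceptHeader)
    {
        if (string.IsNullOrWhiteSpace(acceptHeader)) return false;

        foreach (var entry in acceptHeader.Split(','))
        {
            // Each entry looks like "type/subtype" or "type/subtype;q=0.8".
            var parts = entry.Split(';').Select(p => p.Trim()).ToArray();
            if (!parts[0].Equals("image/webp", StringComparison.OrdinalIgnoreCase)) continue;

            var q = parts.Skip(1)
                .FirstOrDefault(p => p.StartsWith("q=", StringComparison.OrdinalIgnoreCase));
            if (q == null) return true; // no q-value means full preference

            double value;
            return double.TryParse(q.Substring(2), NumberStyles.Float,
                       CultureInfo.InvariantCulture, out value) && value > 0;
        }

        return false;
    }
}
```

For example, a header such as image/webp,image/apng,image/*,*/*;q=0.8 is accepted, while image/png,image/*;q=0.8 is not, because image/webp is not listed explicitly.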
Key, e.g. ?as=False&bc=0&h=0&iar=False&mh=0&mw=0&sc=0&thn=False&w=0. The key contains all media query parameters.
Extension - the file extension.
Headers - cached headers that should be returned to the web client.
DataFile - the file name of the object that should be returned to the web client.

We need to extend how the key is formed by adding one more parameter that indicates WebP support. This requires extending MediaRequestHandler:

using System.Linq;
using System.Web;
using Sitecore.Resources.Media;

public class MediaRequestHandler : Sitecore.Resources.Media.MediaRequestHandler
{
 protected override bool DoProcessRequest(HttpContext context, MediaRequest request, Media media)
 {
  // Mark the request so that WebP-capable clients get their own media cache key
  if (AcceptWebP(context))
  {
   request.Options.CustomOptions["extension"] = "webp";
  }

  return base.DoProcessRequest(context, request, media);
 }

 // True when the client's Accept header lists image/webp
 private static bool AcceptWebP(HttpContext context)
 {
  return context?.Request.AcceptTypes != null && context.Request.AcceptTypes.Contains("image/webp");
 }
}

The key now contains one additional parameter, extension, e.g.: ?as=False&bc=0&h=0&iar=False&mh=0&mw=0&sc=0&thn=False&w=0&extension=webp
Let’s add an optimizer, based on CommandLineToolOptimizer, that processes WebP-compatible requests:

public class WebPOptimizer : CommandLineToolOptimizer
{
 public override void Process(OptimizerArgs args)
 {
  //If WebP optimization was executed then abort running other optimizers
  //because they don't accept webp input file format
  if (args.AcceptWebP)
  {
   base.Process(args);
   args.AbortPipeline();
  }
 }

 protected override string CreateToolArguments(string tempFilePath, string tempOutputPath)
 {
  return $"\"{tempFilePath}\" -o \"{tempOutputPath}\" ";
 }
}

It is very simple: it runs the cwebp.exe tool, which converts JPEG or PNG to WebP. It doesn’t use all the available command-line options and could be tuned depending on requirements. All the other code changes relate to Dianoga configuration and unit tests, so I will not dwell on them in this article. If you want more details, you can review all the changes in the GitHub repository.
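As an illustration of such tuning, here is a minimal sketch of an argument builder that uses two standard cwebp options: -q (quality factor, 0 to 100) and -lossless. The class name, method name, and the default quality of 80 are assumptions made for this example and are not part of Dianoga.

```csharp
using System;
using System.Globalization;

public static class CwebpArguments
{
    // Builds a cwebp command line: lossless, or lossy with the given quality factor.
    // The default of 80 is an arbitrary choice for this sketch.
    public static string Build(string inputPath, string outputPath, int quality = 80, bool lossless = false)
    {
        if (quality < 0 || quality > 100)
        {
            throw new ArgumentOutOfRangeException(nameof(quality), "Quality must be between 0 and 100.");
        }

        var mode = lossless
            ? "-lossless"
            : "-q " + quality.ToString(CultureInfo.InvariantCulture);

        // Paths are quoted so that spaces in temp folders do not break the command line.
        return $"{mode} \"{inputPath}\" -o \"{outputPath}\"";
    }
}
```

An optimizer could then return CwebpArguments.Build(tempFilePath, tempOutputPath, 75) from CreateToolArguments. Which values work best depends on the images, so they are worth testing before settling on one.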

How to enable Dianoga WebP support for your project:
  1. Clone the GitHub repository and build the project
  2. Enable the Dianoga.WebP.config.disabled config
  3. Open web.config and change the line <add verb="*" path="sitecore_media.ashx" type="Sitecore.Resources.Media.MediaRequestHandler, Sitecore.Kernel" name="Sitecore.MediaRequestHandler" /> to <add verb="*" path="sitecore_media.ashx" type="Dianoga.MediaRequestHandler, Dianoga" name="Sitecore.MediaRequestHandler" />
  4. If you have a custom MediaRequestHandler (e.g. when Habitat is used), skip step 3 and override the DoProcessRequest method to detect WebP support. See the MediaRequestHandler code listing above.

P.S. It is an experimental feature; use it at your own risk. :-)

Use of Solr Search Provider in Sitecore

This is a copy of this blog post, kept here so that all my Sitecore articles are in one place.

The Solr search provider is becoming more and more popular in Sitecore solutions. Solr is built on top of Lucene and provides additional capabilities. Unlike Lucene, it is server-based:

  • You don’t have a separate index for each CM/CD server, so you don’t have problems syncing indexes across machines
  • CD servers don’t build indexes, which frees up server resources
  • You are able to scale your search servers
  • The bigger your Sitecore solution is, the more likely you are to use the Solr provider.

Out of the box, Sitecore provides three types of Solr indexes: SolrSearchIndex, SwitchOnRebuildSolrSearchIndex and SwitchOnRebuildSolrCloudSearchIndex. SwitchOnRebuildSolrSearchIndex is built on top of SolrSearchIndex, and SwitchOnRebuildSolrCloudSearchIndex is built on top of SwitchOnRebuildSolrSearchIndex. Why do you need different implementations? The answer is simple: SwitchOnRebuildSolrSearchIndex solves a big problem of SolrSearchIndex. When an index rebuild starts on SolrSearchIndex, the index is temporarily empty, so users see no search results during the rebuild.



SwitchOnRebuildSolrSearchIndex solves this problem by having two cores: one core serves searches while the other is rebuilt, and the cores are swapped once the rebuild finishes, so the freshly rebuilt core starts to be used. This leads to two requirements: a separate core per index, and double the number of Solr cores. A swap atomically swaps the names used to access two existing cores. The prior core remains available and can be swapped back if necessary. After the swap, each core is known by the name of the other.
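For reference, a swap is requested through Solr's CoreAdmin API with action=SWAP. The sketch below only builds the request URL. The helper name and the base URL are assumptions for illustration, and Sitecore issues this request itself when SwitchOnRebuildSolrSearchIndex is used.

```csharp
using System;

public static class SolrCoreSwap
{
    // Builds the CoreAdmin URL that swaps the names of two existing cores.
    // solrBaseUrl is expected to end with the Solr root, e.g. ".../solr".
    public static string BuildSwapUrl(string solrBaseUrl, string core, string other)
    {
        return solrBaseUrl.TrimEnd('/')
            + "/admin/cores?action=SWAP"
            + "&core=" + Uri.EscapeDataString(core)
            + "&other=" + Uri.EscapeDataString(other);
    }
}
```

Calling BuildSwapUrl("http://localhost:8983/solr", "sitecore_web_core", "sitecore_web_core_rebuild") produces a URL that exchanges the two core names. Requesting it a second time swaps them back.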



Note: the names of the cores remain unchanged. “sitecore_web_core” and “sitecore_web_core_rebuild” change places only to highlight that the content of the cores was exchanged by the swap.

But what happens when we start to use several Solr servers, a master and a slave:



From the diagram above we can conclude that the “rebuild” cores could be redundant. SwitchOnRebuildSolrSearchIndex could be replaced with SolrSearchIndex, provided that we disable replication during the index rebuild. This can easily be done by handling two Sitecore events:
Disable replication on indexing:start (for a full index rebuild, not an incremental update).
Enable replication on indexing:end (for a full index rebuild, not an incremental update).

Both event handlers make a web request to the Solr server:

  • http://master_host:port/solr/replication?command=disablereplication
  • http://master_host:port/solr/replication?command=enablereplication

There is one non-obvious thing to keep in mind: the indexing:end event can be raised when the server is shutting down. Before enabling replication, check that the server is not shutting down.

Now we get a simpler process and can use SolrSearchIndex.
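The decision logic of the two handlers can be sketched as follows. This is a minimal illustration and not actual Sitecore event handler code: the class and method names are hypothetical, and how a handler learns that a rebuild is full or that the server is shutting down is left to the caller. The URLs follow the form of the two listed above.

```csharp
using System;

public static class SolrReplicationToggle
{
    // Builds the replication handler URL for the master server.
    public static string BuildCommandUrl(string solrBaseUrl, bool enable)
    {
        var command = enable ? "enablereplication" : "disablereplication";
        return solrBaseUrl.TrimEnd('/') + "/replication?command=" + command;
    }

    // Returns the URL to request for an indexing event, or null when nothing should be sent.
    public static string CommandFor(string solrBaseUrl, bool indexingStarted, bool isFullRebuild, bool isShuttingDown)
    {
        // Incremental updates leave replication untouched.
        if (!isFullRebuild) return null;

        // indexing:start of a full rebuild: stop replicating.
        if (indexingStarted) return BuildCommandUrl(solrBaseUrl, false);

        // indexing:end raised during shutdown: leave replication disabled.
        // Assumption: the rebuild may not have completed in this case.
        if (isShuttingDown) return null;

        // indexing:end of a completed full rebuild: resume replicating.
        return BuildCommandUrl(solrBaseUrl, true);
    }
}
```

A real handler would send the returned URL with an HTTP client. In ASP.NET, one possible source for the shutdown flag is System.Web.Hosting.HostingEnvironment.ShutdownReason, although the right check may depend on the hosting setup.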



Conclusion: when you have several Solr instances (master/slave) in your Sitecore environments, you can use SolrSearchIndex with enabling/disabling of replication instead of SwitchOnRebuildSolrSearchIndex.