IndexOutOfRangeException in the SharePoint 2013 SetThreadCulture method

The problem

Recently I had to work on a tricky SharePoint bug. From time to time, users called about a correlation ID that was displayed when they accessed a SharePoint publishing site. The issue was sporadic: in some cases refreshing the page was enough, but in other cases the correlation ID kept being displayed for several minutes. We were running SharePoint Server 2013 with SP1 and no cumulative updates.

So we did what any SharePoint developer would have done: we scanned the ULS logs. Here is a copy of the exception:

System.IndexOutOfRangeException: Index was outside the bounds of the array.
at System.Collections.Generic.Dictionary`2.FindEntry(TKey key)
at System.Collections.Generic.Dictionary`2.TryGetValue(TKey key, TValue& value)
at Microsoft.SharePoint.SPFallback`1.GetPath(T node)
at Microsoft.SharePoint.SPFallback`1.GetPathLength(T startNode, T endNode, Int32& length)
at Microsoft.SharePoint.SPFallback`1.TryDetermineBestMatch(T[] supporteds, T[] desireds, IsMatched isMatched, LanguageDecisionOptions options, T& bestMatch)
at Microsoft.SharePoint.SPLanguageSettings.TryDetermineLanguage(UInt32[] supportedLcids, UInt32[] desiredLcids, LanguageDecisionOptions options, UInt32& bestMatchedLcid)
at Microsoft.SharePoint.SPLanguageSettings.TryDetermineLanguage(String[] supportedLanguages, String[] desiredLanguages, LanguageDecisionOptions options, String& language)
at Microsoft.SharePoint.WebPartPages.Utility.ComputeUICultureForMUIWebs(String strMUILanguages, UInt32 WebLanguage, UInt32& uiCultureLcid)
at Microsoft.SharePoint.WebPartPages.Utility.SetThreadCulture(SPWeb spWeb, Boolean force)
at Microsoft.SharePoint.ApplicationRuntime.SPRequestModuleData.GetFileForRequest(HttpContext context, SPWeb web, Boolean exclusion, String virtualPath)
at Microsoft.SharePoint.ApplicationRuntime.SPRequestModule.InitContextWeb(HttpContext context, SPWeb web)
at Microsoft.SharePoint.WebControls.SPControl.SPWebEnsureSPControl(HttpContext context)
at Microsoft.SharePoint.ApplicationRuntime.SPRequestModule.GetContextWeb(HttpContext context)
at Microsoft.SharePoint.ApplicationRuntime.SPRequestModule.PostResolveRequestCacheHandler(Object oSender, EventArgs ea)
at System.Web.HttpApplication.SyncEventExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
at System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously)

Note: the text in bold was found in several places in the ULS logs. The rest of the trace varies, since the “SetThreadCulture” method is called by many other methods, so you can encounter the same behavior with a different stack trace.

Why? (Skip if you are not a developer)

The problem happens when SharePoint sets the thread culture. We have the French language pack installed on our farm, and the SharePoint site on which the error occurs is multilingual. Moreover, we developed a custom product to translate our publishing pages. This tool sets the thread culture depending on the current user’s language preference. Since we have custom code that plays with the thread culture, we thought that was a good lead. Unfortunately our code seemed OK and we didn’t find any problem with it.

At that point we decided to decompile the Microsoft.SharePoint.dll assembly and look for clues.

Here is the code of the “GetPath” method you can see in the stack trace:

The m_path field that is highlighted is instantiated in the constructor of the generic SPFallback class:

And the SPFallback instance used by SPLanguageSettings (check the stack trace if you are lost) is actually static (I invite you to decompile the SPLanguageSettings class if you are curious). To summarize, the m_path Dictionary is shared by all the requests made on this particular Web Application…

Did you see that coming?

A lot of .NET developers are not aware that a Dictionary is not thread-safe in .NET. Worse: you can end up with high CPU usage when reading and writing a Dictionary concurrently!

A Dictionary<TKey, TValue> can support multiple readers concurrently, as long as the collection is not modified. Even so, enumerating through a collection is intrinsically not a thread-safe procedure (MSDN).

See the thread safety section of the Dictionary reference: https://msdn.microsoft.com/en-us/library/xfhwa508(v=vs.110).aspx

The funny thing is that the developer who wrote this piece of code was aware of a potential concurrency issue, so he (or she) decided to put a lock (Monitor.Enter actually) around the block that clears the m_path Dictionary. Unfortunately that is not enough! A lock only helps if every read and every write of the dictionary takes it, not just the clear.
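
To make this concrete, here is a minimal C# sketch of a cache where every access goes through the same lock as the clear. This is not the decompiled SharePoint code: the LockedPathCache class, its members, the ComputePath placeholder and the LockDemo helper are hypothetical names I made up for illustration.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical cache for illustration only, not the SharePoint implementation.
public sealed class LockedPathCache
{
    private readonly object _sync = new object();
    private readonly Dictionary<int, int[]> _paths = new Dictionary<int, int[]>();

    // Lookups and inserts take the same lock as Clear, so no thread can
    // touch the dictionary while another thread is modifying it.
    public int[] GetPath(int node)
    {
        lock (_sync)
        {
            int[] path;
            if (!_paths.TryGetValue(node, out path))
            {
                path = ComputePath(node);
                _paths[node] = path;
            }
            return path;
        }
    }

    public void Clear()
    {
        lock (_sync)
        {
            _paths.Clear();
        }
    }

    public int Count
    {
        get { lock (_sync) { return _paths.Count; } }
    }

    // Placeholder for the real path computation, which is assumed here.
    private static int[] ComputePath(int node)
    {
        return new[] { node, 0 };
    }
}

public static class LockDemo
{
    // Hammers the cache from many threads, with occasional clears,
    // then refills it sequentially and returns the number of entries.
    public static int Run()
    {
        var cache = new LockedPathCache();
        Parallel.For(0, 10000, i =>
        {
            cache.GetPath(i % 500);
            if (i % 1000 == 0) cache.Clear();
        });
        for (int node = 0; node < 500; node++) cache.GetPath(node);
        return cache.Count;
    }
}
```

If you remove the lock from GetPath and keep it only in Clear, you get the same kind of pattern as the one described above, and the parallel loop may then corrupt the dictionary.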

The solution

Install the May 2014 Cumulative Update! In the description you can read:

When you change the fallback language on a SharePoint Server 2013 server, you experience high CPU usage on the server.

Here is the link: https://support.microsoft.com/en-us/kb/2878240

If you are a developer and you are curious about how this has been fixed, you can decompile the code after applying the CU and you will see that the Dictionary I was talking about has been replaced by a ConcurrentDictionary.
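
For readers who have never used it, here is a minimal sketch of what a cache built on ConcurrentDictionary can look like. Again, this is not the patched SharePoint code: ConcurrentPathCache, ComputePath and ConcurrentDemo are hypothetical names, and the path computation is a placeholder.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical cache for illustration only, not the SharePoint implementation.
public sealed class ConcurrentPathCache
{
    private readonly ConcurrentDictionary<int, int[]> _paths =
        new ConcurrentDictionary<int, int[]>();

    // GetOrAdd is safe to call from many threads without an explicit lock.
    // Under contention the factory may run more than once for the same key,
    // but only one result is stored, so the factory should be side-effect free.
    public int[] GetPath(int node)
    {
        return _paths.GetOrAdd(node, n => ComputePath(n));
    }

    public void Clear()
    {
        _paths.Clear();
    }

    public int Count
    {
        get { return _paths.Count; }
    }

    // Placeholder for the real path computation, which is assumed here.
    private static int[] ComputePath(int node)
    {
        return new[] { node, 0 };
    }
}

public static class ConcurrentDemo
{
    // Reads, writes and clears run concurrently without corrupting the cache;
    // the final sequential pass refills it so the count is predictable.
    public static int Run()
    {
        var cache = new ConcurrentPathCache();
        Parallel.For(0, 10000, i =>
        {
            cache.GetPath(i % 500);
            if (i % 1000 == 0) cache.Clear();
        });
        for (int node = 0; node < 500; node++) cache.GetPath(node);
        return cache.Count;
    }
}
```

Compared with a plain Dictionary guarded by a lock, this keeps the calling code short and makes it much harder to forget the synchronization on one of the access paths.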