Tuesday, February 18, 2020

Sitecore Identity Server Error: Sorry, there was an error : unauthorized_client in Sitecore 9.1 login to CMS Azure App Service PAAS


 
Hello Sitecore Engineers,


This is kind of an easy one, but I thought I'd throw it into a blog post in case it helps someone:


After adding a new binding in IIS on my Sitecore 9.1 instance, I was getting an "unauthorized_client" error when logging in through the newly added binding URL.

The official doc for this is here, but it doesn't talk about using the pipe operator.


File Name : Sitecore.IdentityServer.Host.xml
Location : C:\inetpub\wwwroot\9Dot1.identityserver\Config\production

After adding another AllowedCorsOriginsGroup tag for the new URL in the Sitecore.IdentityServer.Host.xml file, I was able to log in properly with the newly added URL.

You can also fix this issue by adding the new URL to the existing AllowedCorsOriginsGroup value, separated by a pipe operator, like this:

<AllowedCorsOriginsGroup1>http://9Dot1.sc | http://jss.9Dot1.sc</AllowedCorsOriginsGroup1>
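
For the other option (a separate tag instead of the pipe), here is a rough sketch of what that section of the file might look like. The AllowedCorsOriginsGroup2 tag name and the surrounding element nesting are assumptions based on a default 9.1 Identity Server install, so check them against your own file before copying anything.

```xml
<!-- Sketch only: the nesting and the Group2 tag name are assumptions;
     verify against your own Sitecore.IdentityServer.Host.xml -->
<Clients>
  <DefaultClient>
    <AllowedCorsOrigins>
      <!-- existing binding -->
      <AllowedCorsOriginsGroup1>http://9Dot1.sc</AllowedCorsOriginsGroup1>
      <!-- newly added binding -->
      <AllowedCorsOriginsGroup2>http://jss.9Dot1.sc</AllowedCorsOriginsGroup2>
    </AllowedCorsOrigins>
  </DefaultClient>
</Clients>
```

Either way, you will probably need to recycle the Identity Server app before the change is picked up.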

Hope this helps!

Colin Cooper

Sitecore 9.1 Reference Data database HIGH DTU AZURE PAAS - Sitecore KB 595419 - DTU FIRE




Hello Sitecore Engineers!

I recently worked with a 9.1 deployment that was affected by this issue and so I just wanted to take a moment to discuss Sitecore KB 595419 and what the actual problem looked like in the wild in my scenario.

Here's the official KB from Sitecore on this:

https://kb.sitecore.net/articles/595419

In the KB, Sitecore mentions "Might Degrade Site Performance".

What this actually meant for us when it occurred was a 10-18 second TTFB when loading the site homepage. This was easy to see in Chrome developer tools. Initially it looked a little like a CDN issue, however when testing by loading the content directly from the origin nodes, bypassing the CDN, the 10-18 second delay was still there.

I was actually led to the right troubleshooting path by looking at a stack trace taken from the CD App service.

In the stack trace there were MANY references to sitecore.analytics.blahblahblah, but there was one important reference to sitecore.xdb.referencedata.client



After seeing sitecore.xdb.referencedata.client in the trace, I looked at the Reference Data database and saw that it was pegged at 100% DTU usage. After finding the KB, I applied Sitecore.Support.312397.sql and restarted the Reference Data App Service. The 10-18 second delay on the homepage disappeared and the problem was resolved.

Apparently this issue can present itself with different symptoms. In my case the DB was completely pegged, however if you look at other references like this one, you will see that it can look very different DTU usage wise.

So a few things to pass on regarding this issue:

I recommend that Sitecore Production Environments be instrumented with an APM solution. Azure Application Insights will work, but I personally prefer AppDynamics and New Relic.

If you don't have APM and you are working in an Azure PAAS environment, it's going to be important to know your Azure App Service Diagnostic Tools. Specifically, knowing how to collect a Profiler trace is extremely important. Read up on that HERE
 


To make problems like this easier to spot, I would suggest creating DTU alert rules that fire when DTU usage stays above 90% over the last 15 minutes, for all databases within the Sitecore resource group in question.

Here is a great way to do that in PowerShell:  LINK  (Thanks Georgi!)



I hope this helps someone out there!

Colin Cooper
Sitecore Engineer

Tuesday, July 23, 2019

Broken Sitecore Analytics: Missing SQL Stored procs for the Marketing Automation DB on XP 9.0.2 Azure

So while troubleshooting some xDB related issues on a 9.0.2 Azure IAAS deployment, I was looking at the logs for the Marketing Automation continuous job (the one that runs as a Windows service) and I found this error that I'd never seen before (truncated):

[Error] An error occurred while communicating with the SQL Server
System.Data.SqlClient.SqlException (0x80131904): Arithmetic overflow error for data type smallint, value = 32768.
Arithmetic overflow error for data type smallint, value = 32768.
The statement has been terminated.
   at System.Data.SqlClient.SqlCommand.<>c.<ExecuteDbDataReaderAsync>b__180_0(Task`1 result)
   at System.Threading.Tasks.ContinuationResultTaskFromResultTask`2.InnerInvoke()
   at System.Threading.Tasks.Task.Execute()
--- End of stack trace from previous location where exception was thrown ---


At first look, it appears that the cause of this could be some corrupted entries in the AutomationPool table of the Marketing Automation database.


One (not very attractive or advisable IMHO) approach would be to try removing the non-processed entries from the DB by doing something like this:

 
DELETE FROM [xdb_ma_pool].[AutomationPool]
WHERE Attempts = 32767;

Hacking stuff out of the database kinda creeps me out though, and I tend to consider stuff like this a last resort.
 
The more attractive possibility is that these non-processed records might be related to the following KB article (specific to 9.0 update 1 and 2): https://kb.sitecore.net/articles/065636

From the KB:
 
"In Sitecore XP 9.0.2, several of the new stored procedures are not present in the DACPAC for Azure for the Marketing Automation database." 

More coming on this issue soon...