From April 10 to April 26, an issue caused us to lose all incoming analytics data from roughly 10% of viewers. Specifically, we lost data from viewers whose reported timezone name contained non-ASCII characters, such as accented letters.
Viewers whose browser language is set to something other than English can report timezone names containing non-ASCII characters. For example, a viewer in the Brasilia Standard Time timezone whose browser language is Portuguese reports a timezone of “Horário Padrão de Brasília”. Data from a viewer like this would have been lost during this period.
On April 10, we added date and timezone information to the payload our video player sends to our analytics service, to help us debug an issue. Non-ASCII characters in the timezone caused the payloads to be encoded incorrectly and become corrupted. When our analytics service received these corrupted payloads, it deemed them invalid and discarded them. Regrettably, we did not have alerts in place to notify us of an uptick in invalid incoming data, which is why it took us far too long to notice the issue. On April 26, we applied a fix that resolved the issue going forward.
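To make the failure mode concrete, here is a minimal sketch of one common way this kind of corruption happens: the sender and receiver disagree about the byte encoding. This is an assumption for illustration only, not a description of the actual bug. The payload fields and the function names (`encode_wrong`, `encode_right`, `parse`) are hypothetical.

```python
import json

# Hypothetical payload; the field names are illustrative, not the real schema.
payload = {"event": "play", "timezone": "Horário Padrão de Brasília"}


def encode_wrong(data: dict) -> bytes:
    """Serialize to JSON but emit Latin-1 bytes (the assumed mistake)."""
    return json.dumps(data, ensure_ascii=False).encode("latin-1")


def encode_right(data: dict) -> bytes:
    """Serialize to JSON and emit UTF-8 bytes, which the receiver expects."""
    return json.dumps(data, ensure_ascii=False).encode("utf-8")


def parse(body: bytes):
    """Receiver side: decode as UTF-8 and parse JSON; return None if invalid."""
    try:
        return json.loads(body.decode("utf-8"))
    except ValueError:
        # Both UnicodeDecodeError and json.JSONDecodeError subclass ValueError.
        return None


# A payload with accented characters is rejected when the encodings disagree.
print(parse(encode_wrong(payload)) is None)

# The same payload survives when both sides use UTF-8.
print(parse(encode_right(payload)) == payload)

# An ASCII-only payload is byte-identical in both encodings, so it still parses.
ascii_payload = {"event": "play", "timezone": "Pacific Standard Time"}
print(parse(encode_wrong(ascii_payload)) == ascii_payload)
```

Under this assumed mismatch, ASCII-only payloads are unaffected, which would match the pattern of only some viewers losing data.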
Losing analytics data like this is a serious failure, and since resolving the issue we have begun work to ensure that this, or something like it, does not happen again. Our analytics service has appropriate logging and alerts for cases where we fail to process valid data payloads correctly, but it had insufficient logging and alerts for invalid or corrupt payloads. We’re adding those this week, and we will also store all corrupted payloads so that we can reprocess them later if needed. Finally, we will add automated tests to our player to ensure we don’t accidentally introduce a change that produces malformed payloads.
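As a rough sketch of what counting invalid payloads, alerting on their rate, and keeping the raw bytes for later reprocessing could look like, consider the following. Everything here is hypothetical: the `IngestMonitor` class, the 2% threshold, the minimum sample size, and the in-memory list standing in for durable storage are assumptions for illustration, not our production design.

```python
import json
from dataclasses import dataclass, field


@dataclass
class IngestMonitor:
    alert_threshold: float = 0.02  # assumed: alert when over 2% of payloads are invalid
    min_samples: int = 100         # assumed: avoid alerting on very small samples
    total: int = 0
    invalid: int = 0
    # Stand-in for durable storage of raw corrupted payloads.
    quarantine: list = field(default_factory=list)

    def ingest(self, body: bytes):
        """Parse one payload; on failure, count it and keep the raw bytes."""
        self.total += 1
        try:
            return json.loads(body.decode("utf-8"))
        except ValueError:
            # Covers both invalid UTF-8 and invalid JSON.
            self.invalid += 1
            self.quarantine.append(body)
            return None

    def invalid_rate(self) -> float:
        """Fraction of payloads seen so far that failed to parse."""
        return self.invalid / self.total if self.total else 0.0

    def should_alert(self) -> bool:
        """True once enough payloads were seen and too many were invalid."""
        return (
            self.total >= self.min_samples
            and self.invalid_rate() > self.alert_threshold
        )


# Usage: nine valid payloads and one corrupted one.
monitor = IngestMonitor(min_samples=10)
for _ in range(9):
    monitor.ingest(b'{"event": "play"}')
monitor.ingest('{"timezone": "Brasília"}'.encode("latin-1"))
print(monitor.invalid_rate(), monitor.should_alert())
```

Keeping the raw bytes, rather than a best-effort decoded string, is what makes later reprocessing possible once the cause of the corruption is understood.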