Options Aggregates WebSockets are down; no aggregate data is streaming for per-second and per-minute frequencies.
Incident Report for PolygonIO
Postmortem

Postmortem on Options Aggs not being produced after 11am

What went wrong

At 11am EST on January 17th, Options aggregates stopped being produced.

Impact

Customers stopped seeing Options aggregate data for approximately 105 minutes.

Detection, Mitigation and Resolution

Customers informed us of the missing data at approximately 11:48am EST. At 11:59am EST we restarted the options stream service, which had no effect. At 12:30pm EST, in our staging environment, we rolled back to a previous deployment. After verifying that aggregates were being produced in that environment, we applied the same rollback to production at 12:45pm EST.

Root cause

We are in the process of refactoring a few of our services, which involves adjusting configuration files. As part of this refactor, we released a new version of the options aggregates service into production. Unfortunately, a line in the Dockerfile that explicitly set the time zone to New York (EST) was inadvertently removed. Without that setting the service no longer ran on New York time, so at 11am EST (4pm UTC) the code read the clock as 4pm, thought it was market close, and stopped producing aggregates.
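
The failure mode described above can be illustrated with a minimal Python sketch. This is not the actual service code: the function name `is_market_open`, the 9:30am to 4pm session hours, and the use of `zoneinfo` are assumptions made for the example. It shows how converting to an explicit New York time zone in code keeps the market-hours check correct regardless of the container's time zone setting.

```python
from datetime import datetime, time, timezone
from zoneinfo import ZoneInfo

# Assumed values for illustration only; weekends and holidays are ignored.
NEW_YORK = ZoneInfo("America/New_York")
MARKET_OPEN = time(9, 30)
MARKET_CLOSE = time(16, 0)


def is_market_open(now: datetime) -> bool:
    """Return True if the timezone-aware `now` falls inside the assumed session.

    The instant is converted to New York time explicitly, so the result does
    not depend on the host or container time zone.
    """
    if now.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    local = now.astimezone(NEW_YORK)
    return MARKET_OPEN <= local.time() < MARKET_CLOSE


# The instant the incident began: 16:00 UTC is 11:00 in New York.
incident_instant = datetime(2024, 1, 17, 16, 0, tzinfo=timezone.utc)

# Comparing the raw UTC wall clock against the close time reproduces the bug:
# 16:00 looks like market close even though New York is mid-session.
looks_closed_in_utc = incident_instant.time() >= MARKET_CLOSE

print(is_market_open(incident_instant), looks_closed_in_utc)
```

With the explicit conversion the check reports the market as open at that instant, while the raw UTC wall clock compares as already at the close.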

Calls to Action

  • We will be adding alerting to our monitors that track streams being up/down.
  • We’ve updated our release cycle process by extending our testing times in our QA environment.
  • We will be adding additional diff checks in our PR review process to help identify configuration deltas.
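
As a rough sketch of the first action item, a stream monitor could raise an alert when a stream has been silent for too long. Everything here is hypothetical: the function `stale_streams`, the stream names, and the two-minute threshold are invented for illustration and do not describe the monitoring actually in place.

```python
from datetime import datetime, timedelta, timezone


def stale_streams(last_seen, now, max_silence=timedelta(minutes=2)):
    """Return the sorted names of streams whose latest message is older than max_silence.

    `last_seen` maps a stream name to the timezone-aware time of its last message.
    """
    return sorted(
        name for name, seen_at in last_seen.items() if now - seen_at > max_silence
    )


# Hypothetical snapshot five minutes after the aggregates stopped.
now = datetime(2024, 1, 17, 16, 5, tzinfo=timezone.utc)
last_seen = {
    "options_aggs_minute": datetime(2024, 1, 17, 16, 0, tzinfo=timezone.utc),
    "options_trades": datetime(2024, 1, 17, 16, 4, 30, tzinfo=timezone.utc),
}

alerts = stale_streams(last_seen, now)
print(alerts)
```

A check like this would only be meaningful during market hours, when a steady flow of messages is expected.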

We deeply apologize for any disruptions and inconvenience caused by this incident. We do not take this event lightly. Our team worked diligently to address all problems and restore normal functionality to the affected services as quickly as possible. By implementing these mitigation measures and refining our incident response strategy, we aim to improve the reliability and availability of our services and prevent future outages.

Please don’t hesitate to reach out with any additional questions about this matter.

Thank you for being a loyal customer of Polygon.io,

Polygon Engineering Team

Posted Jan 19, 2024 - 18:14 EST

Resolved
This incident has been resolved.
Posted Jan 17, 2024 - 15:07 EST
Update
We are continuing to monitor for any further issues.
Posted Jan 17, 2024 - 12:59 EST
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Jan 17, 2024 - 12:59 EST
Update
We are continuing to investigate this issue.
Posted Jan 17, 2024 - 12:59 EST
Investigating
We are currently investigating this issue.
Posted Jan 17, 2024 - 11:57 EST
This incident affected: Options (Market Data WebSocket Channels).