Historical record of incidents for PDF Generator API
Report: "General Cloud Service: Release Window 2.206.0"
Last update: The scheduled maintenance has been completed.
Scheduled maintenance is currently in progress. We will provide updates as necessary.
A weekly release window to deploy updates and fixes. The update can cause temporary downtime of up to 5 seconds. If the deploy includes customer-facing updates, we will also publish release notes here: https://support.pdfgeneratorapi.com/en/category/release-notes-uouwx9/
Report: "General Cloud Service: Release Window 2.205.0"
Last update: The scheduled maintenance has been completed.
Scheduled maintenance is currently in progress. We will provide updates as necessary.
A weekly release window to deploy updates and fixes. The update can cause temporary downtime of up to 5 seconds. If the deploy includes customer-facing updates, we will also publish release notes here: https://support.pdfgeneratorapi.com/en/category/release-notes-uouwx9/
Report: "Document Generation Delays"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
Our document generation processes are taking longer than usual. We have identified the root cause and are working on solving the issue.
Report: "Document Generation Downtime in Enterprise Deployment (EU)"
Last update: The issue affected Enterprise EU customers using API v4 for document generation. The problem was caused by a misconfiguration of the Generator API endpoints that the latest release's updates to the generation logic require. We are looking into how we can improve configuration management to mitigate such risks in the future.
A fix has been implemented and we are monitoring the results.
A configuration issue caused downtime in Enterprise Deployment (EU). We have found the root cause and are updating the deployment configuration.
Report: "Document Generation Delays"
Last update: This incident has been resolved.
Our document generation processes are taking longer than usual. We have identified the root cause and are working on solving the issue.
Report: "Elevated API Errors on Document Generation APIs"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
A fix has been implemented and we are monitoring the results.
We're experiencing an elevated level of API errors and are looking into the issue. The API returns correct response content, but the status code is returned as 500, which can cause false negative results in customer applications.
Report: "Redis connection issues on Enterprise Deployment (EU)"
Last update: Redis connection issues resolved.
We are investigating Redis connection issues on Enterprise Deployment (EU).
Report: "Document Generation Downtime"
Last update: Today, we experienced our longest downtime in the ten years we have provided our document generation service. We apologize to all of our customers who were affected by the downtime. The saddest part of this incident is that we could have avoided it. We are improving our processes to avoid repeating our mistakes.

**CHRONOLOGY OF THE INCIDENT**
* Logged to Slack monitoring channel: 18:17 20.06.2024 UTC
* Reports from customers via Crisp: 01:00 21.06.2024 UTC
* Incident seen by technical team member: 08:00 21.06.2024 UTC
* Issue resolved: 08:36 21.06.2024 UTC
* Issue closed: 08:40 21.06.2024 UTC

**IMPACT OF THE INCIDENT**
The Cloud US Generator API service downtime affected all Cloud Service users, and none of the customers could generate documents during the incident.

**THE ROOT CAUSE**
The root cause of the incident was a misconfiguration of nginx. We synced configuration changes in Argo CD on 20.06.2024 but didn't restart all the PODs. When new PODs were added or existing ones restarted, they failed to come up and caused the downtime. The existing alerting system logged issues to our Slack monitoring channel, but the health check considered the service running and didn't send out SMS notifications to the technical team.

**LESSONS LEARNED**
* After syncing configuration changes, we must restart all PODs and validate that everything is working as expected. This would have saved us from the incident.
* The US-based support team should call the European-based technical team if they validate an incident outside EU working hours. This would have allowed us to act much faster and solve the incident.
* All health checks need to include the sub-systems and microservices on which they depend. This would have notified the technical team much faster and reduced the downtime.
This incident has been resolved.
A fix has been implemented and we are monitoring the results.
Our document generation processes are taking longer than usual. We have identified the root cause and are working on solving the issue.
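The last lesson from the postmortem above, that a health check should cover the sub-systems it depends on, can be illustrated with a minimal sketch. All names here (`deep_health`, `check_web_proxy`, `check_cache`) are hypothetical and are not taken from the PDF Generator API codebase; the only point is that the service reports healthy when every dependency probe passes, so a failing sub-system flips the result and can trigger paging.

```python
from typing import Callable, Dict, Tuple


def deep_health(checks: Dict[str, Callable[[], bool]]) -> Tuple[bool, Dict[str, str]]:
    """Run every dependency probe; report healthy only if all of them pass."""
    results: Dict[str, str] = {}
    for name, probe in checks.items():
        try:
            results[name] = "ok" if probe() else "failing"
        except Exception as exc:  # a crashing probe counts as a failure
            results[name] = f"error: {exc}"
    healthy = all(status == "ok" for status in results.values())
    return healthy, results


# Hypothetical probes; real ones would open a connection or send a request.
def check_web_proxy() -> bool:
    return True


def check_cache() -> bool:
    raise ConnectionError("cache unreachable")


healthy, details = deep_health({"web_proxy": check_web_proxy, "cache": check_cache})
print(healthy, details)
```

In this sketch a single failing probe is enough to mark the whole service unhealthy, which is the behavior the postmortem says the original health check lacked.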
Report: "Document Generation Service Downtime"
Last update: Document Generation Service major outage.
Report: "Document Generation issues in EU deployment"
Last update: This incident has been resolved.
Our document generation processes are taking longer than usual. We have identified the root cause and are working on solving the issue.
Report: "Issues with the service API endpoints"
Last update: This incident has been resolved.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue.
Report: "Issues with the service API endpoints"
Last update: We had issues with our database instances, which prevented some users from using the service. We are investigating the root cause of the problem and working on improving reliability. The incident lasted from 7 a.m. to 8:30 a.m. EEST.
Report: "Issues with the service API endpoints"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are currently investigating this issue.
Report: "Issues with the service API endpoints"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We have identified the root cause of the issue and the team is working on a fix.
We are experiencing issues with the API endpoints. We have identified the issues and will deploy a fix as soon as possible.
Report: "Issues with API v4 Open Template endpoint"
Last update: This incident has been resolved.
We are experiencing issues with the API v4 Open Template endpoint which causes the endpoint not to work properly. We have identified the issues and will deploy a fix ASAP. Endpoint: https://docs.pdfgeneratorapi.com/v4/#tag/Templates/operation/openEditor
Report: "Document Generation Errors"
Last update: Our document generation service experienced issues (HTTP 503 errors) and was unavailable to customers between 02:20 and 02:30 UTC. We have identified that the root cause was a Redis Cache cluster crash, which was automatically resolved, and the service resumed working as expected.
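For client applications, transient HTTP 503 responses like the ones described above are commonly handled by retrying with exponential backoff. The following is a minimal sketch under stated assumptions: `call_with_retry` and its `send` callable are hypothetical and not part of any PDF Generator API client library, and the delays are illustrative. Retries of this size help with brief interruptions; they would not have bridged the full ten-minute outage.

```python
import time
from typing import Callable, Tuple


def call_with_retry(
    send: Callable[[], Tuple[int, bytes]],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, bytes]:
    """Call send() and retry on HTTP 503, doubling the delay each time."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    status, body = send()
    for attempt in range(1, max_attempts):
        if status != 503:
            break
        sleep(base_delay * 2 ** (attempt - 1))  # 0.5 s, 1 s, 2 s, ...
        status, body = send()
    return status, body


# Simulated responses (hypothetical): two 503s followed by a success.
responses = iter([(503, b""), (503, b""), (200, b"%PDF-1.7")])
delays = []  # record the delays instead of actually sleeping
status, body = call_with_retry(lambda: next(responses), sleep=delays.append)
print(status, delays)
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; a production client would also cap the total wait time.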
Report: "Authentication errors on API v3 using legacy authentication methods"
Last update: We are experiencing authentication errors on API v3 when using legacy authentication methods (key and secret in the query parameters). We have identified the root cause and deployed a hotfix.
Report: "Document Generation Errors on API v4"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are experiencing issues with the API v4, which causes some of the document generations to fail. We are investigating the root cause and working on solving the issue.
Report: "Document Generation Delays in API v3"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
Our document generation processes are taking longer than usual. We have identified the root cause and are working on solving the issue.
Report: "Document Generation Issues: API v3"
Last update: We have rolled back API v3 to the previous version to fix the issue.
API v3 document generation endpoint returns base64 content instead of a raw file when output=I is used.
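A client that expects a raw file can guard against this kind of regression by checking the payload before using it. The sketch below is an illustration only: `to_pdf_bytes` is a hypothetical helper, not part of the API, and it assumes the payload is either a raw PDF (which starts with the `%PDF-` signature) or a single base64 string encoding one.

```python
import base64
import binascii


def to_pdf_bytes(payload: bytes) -> bytes:
    """Return raw PDF bytes whether payload is a raw file or base64 text."""
    if payload.lstrip().startswith(b"%PDF-"):
        return payload  # already a raw PDF
    try:
        decoded = base64.b64decode(payload.strip(), validate=True)
    except binascii.Error as exc:
        raise ValueError("payload is neither a raw PDF nor valid base64") from exc
    if not decoded.startswith(b"%PDF-"):
        raise ValueError("decoded payload is not a PDF")
    return decoded


raw = b"%PDF-1.7\n%%EOF"  # tiny stand-in for a real document
encoded = base64.b64encode(raw)
print(to_pdf_bytes(raw) == to_pdf_bytes(encoded))
```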
Report: "Document Generation Errors"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are continuing to investigate this issue.
Some users may experience failed document generation requests for the first request they make with a new template.
Report: "Document Generation Errors"
Last update: This incident has been resolved.
After deploying the new Template Service, some users may experience failed document generation requests for the first request they make with a new template. We have identified the issue and are working on a fix to ensure that the Template Service creates a template definition cache on the very first request.
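The behavior the fix aims for, a cache entry that exists as soon as the first request has been served, resembles a cache-aside lookup. The sketch below is purely illustrative: `TemplateDefinitionCache` and its `load` callable are hypothetical names, and nothing here reflects the actual Template Service implementation.

```python
from typing import Callable, Dict


class TemplateDefinitionCache:
    """Cache-aside lookup: a miss loads the definition and stores it at once."""

    def __init__(self, load: Callable[[int], dict]) -> None:
        self._load = load
        self._cache: Dict[int, dict] = {}
        self.loads = 0  # number of times the loader was actually called

    def get(self, template_id: int) -> dict:
        if template_id not in self._cache:
            self.loads += 1
            self._cache[template_id] = self._load(template_id)
        return self._cache[template_id]


# Hypothetical loader standing in for a database or service call.
cache = TemplateDefinitionCache(lambda tid: {"id": tid, "name": f"template-{tid}"})
first = cache.get(7)   # miss: loads and caches the definition
second = cache.get(7)  # hit: served from the cache
print(cache.loads, first is second)
```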
Report: "Document Generation Errors"
Last update: This incident has been resolved.
We have issues with the document generation service. We have identified the root cause and are working on solving the problem.
Report: "General Cloud Service DB connection issues"
Last update: We have fixed the database connection issue caused by high usage. We have upgraded the database instances to better support high loads. The issue caused API downtime for 30 minutes, from 14:54 to 15:24 UTC.
We are currently investigating the database connection issues that affect our Cloud service.
Report: "Document Generation Errors"
Last update: Our document generation processes are taking longer than usual. We have identified the root cause and are working on solving the issue.
Report: "Document Generation Delays"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
Our document generation processes are taking longer than usual. We have identified the root cause and are working on solving the issue.
Report: "Document Generation Delays"
Last update: The issue was caused by a memory configuration that made PODs run out of memory. Our team is reviewing the configuration and will update it to prevent future issues.
Our document generation processes are taking longer than usual. We have identified the root cause and are working on solving the issue.
Report: "Document generation service issues"
Last update: Our document generation service experienced issues and was unavailable to customers between 01:02 and 01:22 UTC.
Report: "Document Generation Delays"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
A fix has been implemented and we are monitoring the results.
Our document generation processes are taking longer than usual. We have identified the root cause and are working on solving the issue.
Report: "Multi-template generation issues"
Last update: This incident has been resolved.
We are continuing to work on a fix for this issue.
We're experiencing an elevated level of errors for the multi-template generation endpoint and are currently looking into the issue.
Report: "Document Generation Delays"
Last update: Our document generation processes are taking longer than usual. We have identified the root cause and are working on solving the issue.
Report: "Issues with the Cloud service"
Last update: The PDF Generator API Cloud service is back up. We have rolled back to the previous version.
The PDF Generator API Cloud service currently has issues related to rolling back a deployment.
Report: "Status page shows invalid information."
Last update: This incident has been resolved.
The status page is currently showing invalid information about a system outage. All services are operational; the issue is caused by a bug in the status page logic. There was a partial system outage from 20:30 to 21:00 UTC on 19 November, which has since been resolved.
Report: "General Cloud Service: Document Generation Delays"
Last update: Our document generation processes are taking longer than usual. We have identified the root cause and are working on solving the issue.
Report: "Document Generation Delays"
Last update: Our document generation processes are taking longer than usual. We have identified the root cause and are working on solving the issue.
Report: "Issues with the Cloud on Enterprise US service"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are currently experiencing network issues and our team is working on solving the problem.
Report: "Document Generation Service not accessible (Enterprise Deployment EU)"
Last update: We have identified that our users are no longer able to access the API endpoints after the infrastructure migration. Our DevOps team is investigating the root cause.
Report: "Document Generation Delays"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are receiving more document generation requests than usual, and our auto-scaling processes failed. Due to high CPU load, we are currently adding new PODs manually to serve all incoming requests.
We are continuing to investigate this issue.
Our document generation processes are taking longer than usual. We have identified the root cause and are working on solving the issue.
Report: "Document Generation Delays"
Last update: This incident has been resolved.
Our document generation processes are taking longer than usual. We have identified the root cause and are working on solving the issue.
Report: "Interference in the document generation service"
Last update: The Document Generation service suffered a 2-minute downtime. DevOps is investigating further to understand the root cause.
Report: "Partial interference in the document generation service"
Last update: Release 2.56.0 introduced a bug that caused some of the generation requests to fail. We rolled back to the previous version to fix the issue.
Report: "Partial interference in the document generation service"
Last update: We updated the configuration of the database service to finish the migration. The database service was rebooted, which caused some requests to fail.
Report: "Document Generation Service not accessible"
Last update: Today we migrated our main cloud service, US1, to a new infrastructure to provide a more scalable and reliable service. The new infrastructure uses best practices and tools to provide a world-class service to our growing customer base. Unfortunately, during the migration we had issues with the Ingress configuration and suffered a 30-minute downtime (Severity Fatal) from 7:35 UTC to 8:04 UTC. Our DevOps team is monitoring the service to make sure that everything is running as expected. Up-time monitoring: http://stats.pingdom.com/6kk6o9wsksha/4741594
We have identified that our users are no longer able to access the API endpoints after the infrastructure migration. Our DevOps team is investigating the root cause.