Onfleet

Is Onfleet Down Right Now? Check whether an outage is currently ongoing.

Onfleet is currently Operational

Last checked from Onfleet's official status page

Historical record of incidents for Onfleet

Report: "iOS Provisioning errors"

Last update
investigating

There is currently an issue blocking initial logins with new devices on the iOS app. Most devices that have previously logged in will not be affected. The team is investigating this now.

Report: "ETA calculation errors"

Last update
resolved

The ETA calculation issue was traced to a configuration problem, which has now been resolved. We will be improving our monitoring and response process to handle similar issues more quickly in the future.

investigating

We are experiencing errors related to ETA calculations.

Report: "ETA calculation errors"

Last update
Resolved

The ETA calculation issue has been tracked to a configuration issue, and has now been identified and resolved. We will be improving our monitoring and response process to handle similar issues more expediently in the future

Investigating

We are experiencing errors related to ETA calculations

Report: "Webhooks not firing for image upload"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently experiencing an issue where webhooks are not firing when an image is uploaded. A fix has been found and we are in the process of testing and deploying it.

Report: "Webhooks not firing for image upload"

Last update
Resolved

This incident has been resolved.

Monitoring

A fix has been implemented and we are monitoring the results.

Identified

The issue has been identified and a fix is being implemented.

Investigating

We are currently experiencing an issue where webhooks are not firing when an image is uploaded. A fix has been found and we are in the process of testing and deploying

Report: "Elevated Response Times"

Last update
resolved

System performance has improved significantly and most, if not all, functions should be operating without excessive delay. Further adjustments to database queries are in progress to avoid this issue in the future.

identified

The team has identified some changes to database queries to alleviate this issue, and will be publishing these changes as soon as possible.

investigating

Monitoring indicates increased response times overall. Our infrastructure teams are investigating the issue.

Report: "Increased API and Dashboard response times"

Last update
postmortem

The increase in system latency was caused by a query that depended on a new database index which had not yet been created in production at the time of deployment. We are reviewing our deployment practices to add safeguards for this type of issue in the future (see the sketch after this report).

resolved

We briefly experienced partial service outage and increased response times following an application update. All systems are now operating normally. We apologize for any inconvenience caused during this incident.
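
As context for the safeguard mentioned in the postmortem above, here is a minimal sketch of a pre-deploy check that refuses to release when a query depends on an index that does not yet exist in production. It assumes a PostgreSQL-style database; the table, index, and connection names are hypothetical and this is not Onfleet's actual tooling.

```python
# Illustrative sketch only, not Onfleet's tooling: abort a deploy when a
# required database index is missing in production. Assumes PostgreSQL;
# the table/index names and connection string are hypothetical.
import sys

import psycopg2

# (table, index) pairs the new release depends on -- hypothetical examples
REQUIRED_INDEXES = {
    ("tasks", "tasks_eta_lookup_idx"),
}


def missing_indexes(dsn: str):
    """Return required (table, index) pairs that are absent in production."""
    with psycopg2.connect(dsn) as conn:
        with conn.cursor() as cur:
            cur.execute(
                "SELECT tablename, indexname FROM pg_indexes WHERE schemaname = 'public'"
            )
            existing = set(cur.fetchall())
    return [pair for pair in REQUIRED_INDEXES if pair not in existing]


if __name__ == "__main__":
    missing = missing_indexes("postgresql://deploy-check@db.example.internal/app")
    if missing:
        print(f"Aborting deploy; missing indexes: {missing}")
        sys.exit(1)
    print("All required indexes present.")
```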

Report: "Chat service unavailable"

Last update
resolved

This incident has been resolved.

identified

We are continuing to work on a fix for this issue.

identified

We are continuing to work on a fix for this issue.

identified

The provider has confirmed the issue and identified a root cause. A fix should be available shortly.

investigating

Chat is currently unavailable due to an issue with an upstream provider.

Report: "Batch Task Creation Jobs Failing"

Last update
postmortem

Between 11:45 a.m. and 12:20 p.m. PST, a batch request began to loop unexpectedly, causing contention between containers. Because of how batch creations are processed, this prevented other jobs from being processed until the looping job was cleared manually. Once that job was stopped, the system returned to normal batch processing. The team will implement enhanced monitoring to avoid this scenario in the future (see the sketch after this report).

resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

The internal process that manages task creation queues for the asynchronous batch task creation endpoint is currently experiencing issues. This is being investigated with high urgency.
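
The postmortem above notes that the looping batch job had to be stopped manually. Below is a minimal, hypothetical sketch of the kind of iteration guard that would abort a runaway batch job automatically and free the worker for other jobs; the names and limits are illustrative, not Onfleet's implementation.

```python
# Illustrative sketch only: cap the work a single batch job may do so a job
# that loops unexpectedly is aborted instead of blocking other jobs until
# someone clears it manually. All names and limits are hypothetical.
import logging

MAX_ITERATIONS_PER_JOB = 10_000  # generous ceiling for a legitimate batch


class RunawayJobError(RuntimeError):
    """Raised when a batch job exceeds its iteration budget."""


def process_batch_job(job_id, pending_tasks, create_task):
    iterations = 0
    queue = list(pending_tasks)
    while queue:
        iterations += 1
        if iterations > MAX_ITERATIONS_PER_JOB:
            logging.error("batch job %s exceeded %d iterations; aborting",
                          job_id, MAX_ITERATIONS_PER_JOB)
            raise RunawayJobError(job_id)  # frees the worker for other jobs
        create_task(queue.pop(0))
```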

Report: "Route Optimization"

Last update
resolved

This incident has been resolved.

monitoring

We have detected a slight decrease in errors for Route Optimization.

investigating

The upstream provider has confirmed the issue and is currently investigating the problem.

investigating

Route Optimizations are currently running very slowly or failing. Initial signs point to an issue with an upstream provider.

Report: "Pages for courier clients unavailable due to deployment issue"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

Report: "Onfleet integrations subsystem experiencing issues"

Last update
resolved

Our DevOps team found a glitch within our AWS infrastructure that required a manual override. All systems are operational.

investigating

The Onfleet integrations subsystem is currently experiencing issues, blocking creation of manifests as well as some other 3rd party integrations.

Report: "Route Optimization issues."

Last update
resolved

Route optimization was briefly unavailable due to an outage at a partner provider. Their issues have now been resolved, and monitoring indicates that all features are once again available.

Report: "Brief Outage Due to System Restart"

Last update
resolved

We experienced a brief one-minute outage due to a system restart around 2:40 Pacific. We apologize for any inconvenience caused. Services were fully restored shortly after the incident.

Report: "Login failures."

Last update
resolved

The incident has been resolved.

monitoring

A misconfigured firewall rule briefly caused login failures. The rules have been adjusted, and this should now be resolved.

Report: "Increased API and Dashboard response times"

Last update
resolved

We briefly experienced partial service outage and increased response times following a planned infrastructure update. All systems are now operating normally. We apologize for any inconvenience caused during this incident.

Report: "Service Unavailability in Australia and Southeast Asia."

Last update
resolved

Impacted APAC customers have confirmed that they can now access the Onfleet application without any issues.

monitoring

While monitoring indicates that Onfleet systems are running normally, we are receiving reports of service unavailability affecting users in Australia and Southeast Asia. Initial assessments suggest this may be due to a network issue with ISPs in this region. We apologize for the inconvenience and will provide updates once we have more information. Thank you for your patience.

Report: "Dashboard search is not working on tasks created after Aug 18"

Last update
resolved

This incident has been resolved. We apologize for the disruption and appreciate your understanding.

identified

The issue has been identified and a fix is being implemented.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating this issue.

Report: "Elevated Error Rates in Image Uploads"

Last update
resolved

We experienced an issue with elevated error rates in image uploads following our recent release and deployment. This occurred on August 15th between 5:39 PM and 6:43 PM Pacific. Our team identified the root cause and rolled back the release. The image uploads are now functioning normally. We apologize for the disruption and appreciate your understanding.

Report: "Service instability and outage."

Last update
resolved

A package upgrade was necessary due to a vulnerability and required additional updates to other packages. During deployment, some services did not restart correctly because of a usage change in a newer container version, which requires the container service to be restarted after upgrades. This was the root cause: the container service was not restarted, and all containers stopped. Our operating procedures have been updated to reflect the proper usage pattern for newer versions.

Report: "Dispatchers aren’t able to view driver details"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

Report: "Drivers locations not updating"

Last update
resolved

This incident has been resolved.

monitoring

Our team has successfully identified and reverted the changes that caused the disruption. Services are being restored to full functionality.

identified

We have received reports of driver locations not updating on the Map view. We are currently investigating our location service and will provide updates as we learn more.

Report: "Service unavailable"

Last update
resolved

This incident has been resolved.

monitoring

The dashboard rollback has completed and we are verifying the fix.

identified

The team has confirmed an issue with inconsistent sidebar loading in the recent release and is initiating a rollback.

Report: "Issues with SMS deliverability for some customers"

Last update
resolved

Some customers experienced issues with SMS task notifications wherein those SMS messages were not delivered to end recipients. Upon investigation, we observed that the issues were caused by a recent change to our production telephony service. Upon discovery of the issue, we immediately rolled back our changes and verified that normal service and SMS deliverability had resumed.

Report: "Network Connectivity in California"

Last update
resolved

Per AWS, between 11:43 AM and 1:30 PM PST some customers in California may have experienced connectivity issues between their ISP and AWS destinations. Onfleet was able to reproduce the issue from San Francisco over Comcast connections but not AT&T connections.

investigating

AWS reports connectivity issues for some customers in California. Onfleet has received customer reports of problems connecting to the service.

Report: "Issues with Task Exports"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently seeing issues with task export emails being sent out.

Report: "Service Degraded"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

A code error led to an unintended functionality change. The problematic release is being rolled back.

Report: "Service Degraded"

Last update
resolved

Between 7:25 PM January 4th and 1:43 AM January 5th, tasks appeared out of order on the dashboard side panel. The issue has been resolved.

Report: "System Unavailable"

Last update
resolved

All changes were reverted, and the service is now fully operational.

identified

Our team has successfully identified and reverted the changes that caused the disruption. Services are being restored to full functionality.

investigating

We are currently addressing an unexpected disruption that occurred during the deployment/upgrade process. Our team is actively working to resolve the issue.

Report: "Telephony - US shared toll-free number degradation"

Last update
resolved

This incident has been resolved.

monitoring

We have worked with our telephony provider to resolve the issue and are now monitoring traffic from the shared high throughput toll-free phone number.

investigating

We are investigating issues affecting some customers with the shared high throughput US toll-free phone number used for notifications and call anonymization.

Report: "System unavailable"

Last update
resolved

Our API and Dashboard experienced intermittent errors and elevated response times between 1:01 AM and 1:08 AM PDT. We restarted all affected servers and API requests resumed processing normally.

Report: "Service degraded"

Last update
resolved

Our dashboard experienced intermittent errors between 4:32PM PDT and 4:48PM PDT. Some users may have experienced issues loading the dashboard and tracking pages.

Report: "System unavailable"

Last update
resolved

Our API and Dashboard experienced intermittent errors and elevated response times between 1:33PM and 1:46PM PDT. We restarted all affected servers and API requests resumed processing normally.

Report: "Service degraded"

Last update
resolved

One of the web proxy instances failed, causing transient errors for API and dashboard services from 1:57 PM PDT to 2:04 PM PDT.

Report: "System unavailable"

Last update
resolved

Our API experienced intermittent errors and elevated response times between 03:14 and 03:30 PDT. We restarted all affected servers and API requests resumed processing normally.

Report: "Issues performing imports"

Last update
resolved

The incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are investigating issues experienced by some users in performing imports in the dashboard.

Report: "System Unavailable"

Last update
resolved

We identified elevated response times on our API between 10:00 PM and 10:12 PM PDT. We restarted all affected servers and API requests resumed processing normally.

Report: "System Unavailable"

Last update
postmortem

The service was unavailable for approximately 7 minutes due to our database provider losing an instance which hosted multiple databases.

resolved

The service is unavailable / inaccessible.

Report: "Email Delivery Issues Affecting Exports and Other Transactions"

Last update
resolved

This incident has been resolved.

monitoring

Exports from the dashboard are starting to deliver normally. We will continue to monitor until the transactional email provider has declared the all clear.

identified

The transactional email provider has confirmed the issue and is investigating it on their end.

investigating

We are investigating issues with email delivery which appear to stem from our transactional email provider.

Report: "Geo-coding errors preventing task creation in certain regions"

Last update
postmortem

Around 2023-03-03 06:00 UTC, a strict address validation check was introduced to task creation, which disrupted API workflows for a range of customers. After noticing a spike in user errors, we rolled back this change (see the sketch after this report).

resolved

Errors in geo-coding temporarily prevented tasks from being created, especially in regions of the world with inaccurate or incomplete mapping data.
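
The rolled-back change above was a stricter validation that rejected live traffic as soon as it shipped. As a hedged illustration (not Onfleet's code), here is a sketch of rolling such a check out in a log-only mode first, so a spike in would-be rejections shows up in logs before any request is blocked; the flag, field names, and check are hypothetical.

```python
# Illustrative sketch only: ship a stricter address check in log-only mode
# first, and flip to enforcement once the failure rate looks sane.
# The flag, field names, and check are hypothetical.
import logging
import os

VALIDATION_MODE = os.getenv("STRICT_ADDRESS_VALIDATION", "log_only")  # or "enforce"


def passes_strict_check(address: dict) -> bool:
    # Hypothetical stricter rule, e.g. requiring a postal code.
    return bool(address.get("postalCode"))


def check_address(address: dict) -> None:
    if passes_strict_check(address):
        return
    if VALIDATION_MODE == "enforce":
        raise ValueError("address failed strict validation")
    # Log-only mode: record the would-be rejection but let the task through.
    logging.warning("strict address validation would reject: %r", address)
```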

Report: "Unavailable Driver Location Data"

Last update
postmortem

At 2023-01-27 9:45 UTC, a failover event occurred due to a transient networking issue for a database cluster related to location storage. This event lasted for around a minute. As a result, several processes related to persisting location data entered a state that prevented them from persisting locations. Due to a lack of effective monitoring, we were not immediately aware of this issue. The root issue was resolved at 2023-01-30 22:11 UTC, soon after we became aware of it. However, as a result of this issue, complete location information for about 20% of tasks during this time period was not collected. We subsequently backfilled derived location information when appropriate, providing distance information for about 75% of these tasks by 2023-01-31 19:07 UTC. We understand how crucial location information is and apologize for any impact this incident may have caused. We have since improved our monitoring so a similar incident would now immediately page an on-call engineer (see the sketch after this report).

resolved

We are currently investigating this issue.
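
Regarding the improved monitoring mentioned in the postmortem above, here is a minimal, hypothetical sketch of a staleness check that pages on-call when no driver locations have been persisted recently; the threshold and paging hook are placeholders, not Onfleet's actual alerting.

```python
# Illustrative sketch only: page on-call if the location-persistence pipeline
# goes quiet, instead of relying on someone noticing missing data days later.
# The threshold and the paging callback are hypothetical placeholders.
import time

STALE_THRESHOLD_SECONDS = 5 * 60  # alert if nothing has been written for 5 minutes


def check_location_pipeline(last_persisted_at: float, page_oncall) -> None:
    """last_persisted_at is the unix timestamp of the newest persisted location."""
    age = time.time() - last_persisted_at
    if age > STALE_THRESHOLD_SECONDS:
        page_oncall(f"No driver locations persisted for {int(age)} seconds")
```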

Report: "Route Optimization / Auto-dispatch experiencing increased failure rates."

Last update
resolved

Optimization services are fully operational.

monitoring

We no longer see elevated failure rates from the route optimization provider. We will continue to monitor the situation.

identified

We have detected an elevated failure rate from our route optimization provider and are in contact with their support team. They are working on the issue but do not have an ETA.

Report: "Errors sending email"

Last update
resolved

This incident has been resolved.

monitoring

Mailchimp Transactional has reported that the issue is resolved. Onfleet is monitoring the recovery.

identified

Mailchimp Transactional error rates are decreasing.

identified

Mailchimp Transactional is continuing their investigation to resolve connection issues.

identified

Onfleet has identified an upstream issue with our Mailchimp email provider. Mailchimp is working to resolve the issue.

investigating

We are currently investigating the issue.

Report: "Increased API error rate"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Export processing delayed"

Last update
postmortem

At 2022-11-01 6:50 UTC, an export error occurred that caused the corresponding queue worker to stop processing other exports. The same error had gone unnoticed previously and had caused other export queue workers to fail in the same way, so no further exports could be processed until it was resolved. Once we became aware of the issue, we resolved it by 2022-11-01 10:52 UTC, and all of the previously deferred exports were processed by 2022-11-01 11:07 UTC. We apologize for any interruption this deferred processing may have caused. We are validating a fix for the root cause of this issue, which will be released when ready, and we will improve our monitoring procedures to ensure that we respond more quickly if similar situations arise (see the sketch after this report).

resolved

This incident has been resolved.

monitoring

An issue with exports caused processing to be delayed. Previously delayed exports have now been processed. We are currently monitoring.
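
For the failure mode described in the postmortem above, where one bad export stopped a queue worker entirely, here is a minimal sketch of per-job error isolation; it is a generic pattern with hypothetical names, not Onfleet's worker code.

```python
# Illustrative sketch only: catch failures per export so one bad job is logged
# and skipped rather than halting the whole worker. Names are hypothetical.
import logging
import queue


def run_export_worker(jobs: "queue.Queue", process_export) -> None:
    while True:
        export_job = jobs.get()  # blocks until an export is queued
        try:
            process_export(export_job)
        except Exception:
            # A single failing export must not stall the queue; record and move on.
            logging.exception("export %r failed; continuing with next job", export_job)
        finally:
            jobs.task_done()
```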

Report: "Increased error rates for routing functions"

Last update
resolved

Routing has been stable since 8:42 AM PDT.

monitoring

The routing functions are working correctly. We're going to continue monitoring and stay in touch with the upstream provider.

investigating

We are working with an upstream provider to determine the cause of this issue.

Report: "Increased error rates for routing functions"

Last update
resolved

Routing has been stable since 19:20 PDT.

investigating

We are working with an upstream provider to determine the cause of this issue.

Report: "API & analytics unavailable"

Last update
postmortem

On 2022-09-01 03:06 UTC, we were notified that some users were experiencing issues accessing certain features through the dashboard and API. A database update earlier in the day resulted in some documents being modified incorrectly, which impacted feature access for the associated users. We performed a fix at 2022-09-01 04:43 UTC that resolved the issue for most users. We were notified again at 2022-09-01 14:08 UTC that some users continued to experience issues with the API. We determined that this was due to unexpected behavior with our caching system and performed a full cache refresh at 2022-09-01 15:50 UTC. This fully resolved the issue for all affected users. We are enacting procedural changes to prevent similar database errors in the future and to resolve related incidents more quickly. We are also implementing changes to our caching system to fix the issue experienced and to detect such issues sooner (see the sketch after this report).

resolved

This incident has been resolved.

monitoring

We have applied a fix and are watching to make sure access to the API and analytics has been restored.

identified

We discovered an issue after a recent database update. This issue affects some customers and prevents their use of our API and analytics, as well as other functionality. We are working on a fix currently.
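
As a hedged illustration of the caching change mentioned in the postmortem above (not Onfleet's implementation), here is a sketch of invalidating only the cache entries for documents touched by a data repair, rather than relying on a full cache refresh; the key scheme and connection details are hypothetical.

```python
# Illustrative sketch only: after repairing documents affected by a bad
# database update, delete their cache entries directly instead of waiting on
# TTLs or a full cache flush. Uses redis-py; the key scheme is hypothetical.
import redis


def invalidate_repaired_documents(doc_ids, redis_url="redis://cache.example.internal:6379/0"):
    client = redis.Redis.from_url(redis_url)
    keys = [f"org:{doc_id}" for doc_id in doc_ids]  # hypothetical cache key scheme
    if keys:
        client.delete(*keys)
```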

Report: "Increased error rates for routing functions"

Last update
resolved

The routing services are fully operational.

monitoring

Routing services are functioning normally. We are monitoring the service status.

identified

Our VRP provider has identified the problem and is working on resolving the issue.

investigating

We have escalated this issue to the affected provider and are working with them to identify & correct it.

Report: "Dashboard error when editing drivers"

Last update
postmortem

On Aug 29 at 17:18 PDT (Aug 30 00:18 UTC) we deployed a change to support upcoming improvements in our backend systems. This change introduced a bug in the way data for driver addresses was being sent to the dashboard. We investigated the issue and determined that rolling back this change was necessary. The rollback was completed at 21:29 PDT (04:29 UTC), and we monitored for some time to make sure all dashboard functionality had been restored.

resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are looking into reports of an error when editing drivers in the dashboard.

Report: "API Instability"

Last update
resolved

Beginning at 11:15 PDT on September 2, 2022, Onfleet services were unstable for approximately 3 minutes. This outage was due to a server under maintenance that stopped responding unexpectedly. The maintenance was rolled back, and service was restored to normal.

Report: "System unavailable"

Last update
resolved

Our API was responding slowly between 02:08 and 02:37 PDT. We have restarted all affected servers and API requests are currently being processed normally.

Report: "Issues with US SMS deliverability for some numbers"

Last update
resolved

This incident has been resolved.

identified

We are continuing to work with our provider to restore SMS traffic on this number. If these efforts do not yield any results, we will start working on provisioning a new number.

identified

We're working with our communications provider to restore message deliverability.