August 28th Incident Report [RESOLVED]

Updated by Elizabeth Smith

This article logs the incident that affected network connectivity in August 2023.

Connectivity Interruption on Some SIM Card Partners

Incident Report

Update: Sep 5, 2023 - 16:20 UTC

METER Group connected devices should be running normally at this time. If you are seeing connection issues, please reach out to support.environment@metergroup.com

Update: Aug 31, 2023 - 16:56 UTC

Our provider says that 97% of devices are connecting normally. They continue to observe a small subset of devices with unstable data connections, with sessions dropping more often than expected. They are diagnosing root causes.

Update: Aug 31, 2023 - 00:23 UTC

Over 95% of provider devices have restored connectivity, with more devices coming back online each hour. Our provider is still working with their upstream partners to ensure the same success on the remaining devices.

The provider has observed a small subset of devices with unstable data connections, with sessions dropping more often than expected. They are diagnosing root causes.

Update: Aug 30, 2023 - 21:11 UTC

Our monitoring shows that connectivity has been restored for the majority of our affected SIM cards. We're still working with our upstream partners to ensure the same success on the remaining ones.

METER Group is continuing to monitor the connections from your devices. Please allow time for your device to catch up on sending its data.

To see a device's last connection, visit the DEVICE LOG.

If you have questions or concerns, reach out to support.environment@metergroup.com

Update: Aug 30, 2023 - 13:09 UTC

Our upstream partner is continuing to make improvements, and our SIM provider is starting to see devices come online. To prevent congestion and signaling overload on the network, the number of new devices connecting is being rate limited, and our upstream partner is continuing to increase capacity incrementally. METER Group, our SIM provider, and the upstream provider are all closely monitoring the recovery to ensure network stability and to work through it in a controlled manner.
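For readers who want a feel for why a rate-limited recovery takes time, here is a minimal sketch. It assumes a simple model in which a fixed backlog of devices is admitted in intervals and the admission rate grows by a fixed step up to a ceiling. The function name and every number are hypothetical and do not reflect the actual configuration used by our SIM provider or the upstream partner.

```python
def intervals_to_drain(backlog: int, start_rate: int, step: int, max_rate: int) -> int:
    """Count the admission intervals needed to reconnect every waiting device.

    Illustrative model only: all parameters are assumptions, not real network settings.
    """
    if backlog < 0 or start_rate <= 0 or step < 0 or max_rate < start_rate:
        raise ValueError("invalid rate-limit parameters")

    rate = start_rate
    intervals = 0
    while backlog > 0:
        # Admit at most `rate` devices in this interval.
        backlog -= min(rate, backlog)
        intervals += 1
        # Raise the limit incrementally, never above the ceiling.
        rate = min(rate + step, max_rate)
    return intervals


# Hypothetical example: 1,000 waiting devices, starting at 100 per interval.
print(intervals_to_drain(backlog=1000, start_rate=100, step=50, max_rate=300))
```

In this model, doubling the backlog from 1,000 to 2,000 devices raises the interval count from 5 to 9, which is why recovery time depends so heavily on how many devices are waiting to reconnect.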

Update: Aug 30, 2023 - 07:42 UTC

Our upstream partner is continuing to implement changes to bring devices back online, and is currently working to mitigate signaling congestion. With the current congestion, Hologram continues to see devices failing to register with carrier networks. We are continuing to monitor the situation.

Update: Aug 30, 2023 - 05:19 UTC

Our upstream partner has shifted traffic to a new node and is beginning to see improvements in IoT device connectivity. We are continuing to monitor the situation. We will continue to post updates here as we learn more.

Update: Aug 30, 2023 - 02:01 UTC

The previously identified fix for the failures in our upstream partner's network interfaces was not ultimately viable and was not implemented. Our partner believes they have now identified a common problem with the interface failures and are working on a fix.

Update: Aug 30, 2023 - 01:06 UTC

The likely root cause of the regression has been identified on a node in our upstream partner's network interface. They engaged with their vendors and a solution has been identified, which will be executed in the next 30 minutes. The updated node will come back online and traffic will be gradually increased to it to ensure a stable recovery.

The expected recovery time depends on the depth of the backlog of connection requests. Traffic will be released slowly to avoid a signaling overload.

Update: Aug 29, 2023 - 22:59 UTC

Our upstream partner has identified the root cause of the regression.

They are currently assessing a solution in which they will migrate the signaling services to a different node that is operating normally and has the bandwidth to support the additional traffic.

Update: Aug 29, 2023 - 20:59 UTC

After connectivity had been trending toward resolution, we noticed a regression on SIMs that had already recovered. We escalated to our upstream partners, and they confirmed a major regression in the solution that was implemented.

We're monitoring the impact and working with our partners to identify the root cause of this regression.

Update: Aug 29, 2023 - 15:58 UTC

The latest update from our upstream partner indicates that signaling has stabilized and the congestion has been overcome; monitoring is ongoing. The network is processing traffic normally, though it will take some time for all affected devices to reattach.

Based on the timelines of similar past incidents, full recovery will likely take until well past midnight UTC.

All teams are focusing on improving these timelines wherever possible.

Update: Aug 29, 2023 - 12:39 UTC

The cause of this incident has now been rectified and stability has been confirmed, but connectivity issues remain on our upstream partner's network.

They are now facing a signaling storm caused by congestion. To resolve it, they are restricting traffic and slowly increasing throughput.

We're still unable to provide an ETA, but we have started to see improvements.

Update: Aug 29, 2023 - 09:23 UTC

Our upstream partner continues to work on the remaining network instability issues adversely affecting subscriber attachments.

There is no ETA yet on resolution. We will continue to post updates here as we learn more.

 

Update: Aug 29, 2023 - 08:11 UTC

Our upstream partner continues to work on the remaining network instability issues adversely affecting subscriber attachments.

There is no ETA yet on resolution. We will continue to post updates here as we learn more.

Update: Aug 29, 2023 - 07:26 UTC

Our upstream partners have stabilized the replacement hardware and continue to work on the remaining network instability issues that are adversely affecting subscriber attachments. There is no ETA yet on resolution. We will continue to post updates here as we learn more.

Update: Aug 29, 2023 - 07:25 UTC

 We are continuing to work on a fix for this issue.

Update: Aug 29, 2023 - 06:58 UTC

 We are continuing to work on a fix for this issue. 

Update: Aug 29, 2023 - 05:39 UTC

Our upstream partners have replaced faulty hardware, have begun bringing interconnect links back online, and are continuing to work to resolve the issues that remain. Unfortunately, bringing back the interconnect links has not yet had the expected effect on connectivity.

Subscriber attachments are still adversely affected.

There is no ETA yet on resolution. We will continue to post updates here as we learn more.

Identified: Aug 28, 2023 - 23:45 UTC

The issue has been identified. We will post updates here as we learn more.

Update: Aug 28, 2023 - 19:47 UTC

This issue affects SIMs starting with the 8944* prefix, including h.meter.apn connections.

It currently appears to be preventing any attachment to the cellular network. This has been escalated to the highest level.
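If you manage many devices and want to check which SIMs fall in the affected range, the prefix test can be sketched as below. This is a minimal illustration: the function name is hypothetical, and the sample SIM numbers are made up and do not belong to real devices.

```python
def is_affected_sim(sim_number: str, prefix: str = "8944") -> bool:
    """Return True if the SIM number starts with the affected prefix."""
    # Drop spaces or dashes that sometimes appear in printed SIM numbers.
    digits = "".join(ch for ch in sim_number if ch.isdigit())
    return digits.startswith(prefix)


# Made-up sample numbers, for illustration only.
sims = ["8944 5000 0000 0000 001", "8901260000000000000"]
affected = [s for s in sims if is_affected_sim(s)]
print(affected)
```

Only the first sample number starts with 8944, so it is the only one reported as affected.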

Investigating: Aug 28, 2023 - 19:39 UTC

We are seeing an issue with a network partner causing some SIM cards to be unable to pass data on the network. The issue has been escalated and we are investigating along with our partner. We will post updates here as we learn more.

