Cloud Services

Archive for Cloud Services

How AI Enabled Analytics Can Learn From Your Own Alerts – A Feedback Loop for Physical Security Video Platforms

Posted by Zack Hamm on February 7, 2026

When an AI-enabled camera sees something odd, such as a person climbing a fence, a vehicle moving the wrong way, or a suspicious package in a hallway, the system raises an alarm. Today, most commercial platforms either stop there or dump every alarm into a human analyst’s to‑do list. The next wave of smart video management shows its value when those alerts become data points that teach the AI how to get better.

Below is a quick rundown of why you should care about event tagging, batch upload, and vendor‑side model fine‑tuning, and how each works in practice.

The Core Idea – Turn Every Alert into a Teaching Sample

What Happens Today	What We Want Tomorrow
Alarm fires → analyst decides manually whether to investigate.	Alarm fires → system records the event. Optionally, a user manually tags its alarm quality and object type. After a batch is collected, those tags are sent back to the manufacturer’s AI factory for model fine‑tuning.
Model improvement relies only on pre‑collected public datasets (often outdated).	Your own field data continuously refines the base model without you having to rebuild anything.

In short: Your “alarms” are gold; each one tells the AI system where its current understanding needs a tweak.

The Three Simple Actions You Can Take

Tagging – Assign three pieces of information to each stored event:
1. Alarm status: “true alarm” (verified threat) or “false alarm” (benign).
2. Confirmation flag: Did a security officer verify the incident? “Yes/No”.
3. Object classification label – optional but powerful: person, vehicle, gun, fire, smoke, animal, etc.
Batching & Uploading – When you hit a preset size (e.g., 100 to 500 events), the platform packages the tagged data and pushes it to a secure portal at your vendor (“AI Factory”).
Model Refresh Cycle – Raw batch → fine‑tune base model → updated model

The updated model is automatically downloaded to all devices in the field and applied with no manual intervention.
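
As a rough illustration of the tagging step, the three fields could be captured in a small record like the one below. This is only a sketch: the class name, the field names, the `event_id` field, and the fixed label set are assumptions made for the example (the label list in this post is open-ended), not any vendor's actual schema.

```python
from dataclasses import dataclass, asdict
from typing import Optional

# Assumed label set, taken from the examples in this post; a real platform
# would define its own list.
KNOWN_LABELS = {"person", "vehicle", "gun", "fire", "smoke", "animal"}


@dataclass(frozen=True)
class EventTag:
    """The three pieces of information attached to one stored event."""
    event_id: str                        # hypothetical identifier for the stored event
    true_alarm: bool                     # True = verified threat, False = benign
    officer_confirmed: bool              # did a security officer verify the incident?
    object_label: Optional[str] = None   # optional classification label

    def __post_init__(self) -> None:
        # Reject labels outside the assumed set so bad tags never reach a batch.
        if self.object_label is not None and self.object_label not in KNOWN_LABELS:
            raise ValueError(f"unknown object label: {self.object_label!r}")


tag = EventTag(event_id="gate3-0214", true_alarm=False,
               officer_confirmed=True, object_label="animal")
print(asdict(tag))
```

Validating the label at tagging time is a design choice for this sketch; it keeps mistyped labels out of the data that later goes to the vendor.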

What Happens Under the Hood?

Detection Stage – The AI generates an alarm whenever its confidence exceeds a configurable threshold for one of the pre‑defined classes (person, vehicle, gun, animal, etc.).
Event Posting – Each alarm is stored with metadata: timestamp, camera ID, motion trajectory, and the raw classification probabilities.
Human In‑the‑Loop Confirmation
- When a guard or operator reviews the alert, they select “True” or “False.”
- They can also pick an optional secondary label that corrects the classification (e.g., an alert raised as “bag left unattended” relabeled as “animal”).
Accumulation & Chunking – The VMS platform accumulates these confirmed events in a local buffer until it reaches the configured batch size.
Secure Shipping to the vendor’s “AI Model Factory” – Using an encrypted web connection, the VMS platform uploads the batch to the vendor’s cloud fine‑tuning service.
AI Model Update – The Model Factory feeds the newly labeled data into a specialized training loop that adjusts only the “head” layers (the part responsible for detecting your specific threat types) while leaving the base model untouched.
Rollout & Deployment – The refreshed model is packaged, signed, and pushed back to the customer’s VMS platform for delivery to every camera or NVR in your network. From this point forward, detection accuracy reflects both the original manufacturer’s expertise and the nuances you’ve taught it with your own footage.
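
To make the "adjust only the head, leave the base untouched" idea concrete, here is a toy sketch in plain Python. It is not the vendor's training loop: the two-feature "base model", its weights, the logistic head, the learning rate, and the four sample points are all invented for illustration. The only point is that training updates the head's parameters while the base weights stay exactly as they were.

```python
import math

# Frozen "base model": a fixed feature extractor whose weights never change.
# These numbers are arbitrary and exist only for the illustration.
BASE_WEIGHTS = [[0.5, -0.2], [0.1, 0.8]]


def base_features(x):
    """Apply the frozen base layer (a single tanh layer in this toy)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in BASE_WEIGHTS]


def predict(head, x):
    """The head is a logistic layer on top of the frozen features."""
    feats = base_features(x)
    z = head["bias"] + sum(w * f for w, f in zip(head["weights"], feats))
    return 1.0 / (1.0 + math.exp(-z))


def fine_tune_head(head, samples, lr=0.5, epochs=200):
    """Gradient descent on the head only; BASE_WEIGHTS is never modified."""
    for _ in range(epochs):
        for x, label in samples:
            feats = base_features(x)
            err = predict(head, x) - label  # log-loss gradient w.r.t. the logit
            head["weights"] = [w - lr * err * f for w, f in zip(head["weights"], feats)]
            head["bias"] -= lr * err
    return head


# Toy "tagged events": label 1 = true alarm, 0 = false alarm.
samples = [([2.0, 0.0], 1), ([1.5, 0.5], 1), ([-2.0, 0.0], 0), ([-1.5, -0.5], 0)]

base_before = [row[:] for row in BASE_WEIGHTS]
head = fine_tune_head({"weights": [0.0, 0.0], "bias": 0.0}, samples)

print(round(predict(head, [2.0, 0.0]), 2), round(predict(head, [-2.0, 0.0]), 2))
print(BASE_WEIGHTS == base_before)  # the base layer is unchanged by training
```

Training only the head is cheaper than retraining everything and limits how far a small customer batch can pull the model away from what the manufacturer shipped.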

Why This Matters for Your Business

Continuous Accuracy Improvement – False‑positive rates can drop dramatically after just a few hundred confirmed events, cutting wasted response time and unnecessary security dispatches.
Tailored Threat Profiling – If you operate an oil platform where penguins keep triggering alarms, labeling those events as animal stops the AI from over‑responding to wildlife.
Compliance & Auditing – Every tagged event creates an immutable audit trail that shows exactly how the system learned—a compelling narrative for regulators or internal risk teams.
Operational Simplicity – No need for separate training projects, no heavy‑weight on‑site compute resources. The vendor does the heavy lifting; you just keep feeding it good data.

A Minimal Example Flow

Alarm: “Person is detected at Gate 3 at 02:14 am.” – Confidence 78%.
Guard Review: “False alarm – animal present (deer).” → Tag = false, label = animal.
Batch Fill: After 500 such events accumulate (e.g., many false alarms that turned out to be “wildlife”), the platform sends the data.
Model Update Received – Cameras at your site stop raising alerts on animal detections, reducing unnecessary dispatches by ~30% in this example.
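
The accumulate-then-send part of this flow can be sketched as a small buffer that hands off a batch once the configured size is reached. Everything named here is hypothetical: `BatchBuffer`, the event dictionary keys, and the `upload` callback, which stands in for whatever encrypted upload the platform really performs.

```python
from typing import Callable, Dict, List


class BatchBuffer:
    """Collects confirmed events and hands off a full batch for upload."""

    def __init__(self, batch_size: int, upload: Callable[[List[Dict]], None]) -> None:
        if batch_size <= 0:
            raise ValueError("batch_size must be positive")
        self.batch_size = batch_size
        self.upload = upload          # hypothetical stand-in for the secure upload
        self.pending: List[Dict] = []

    def add(self, event: Dict) -> bool:
        """Store one tagged event; return True if this call triggered an upload."""
        self.pending.append(event)
        if len(self.pending) < self.batch_size:
            return False
        # Swap in a fresh list before uploading so new events start a new batch.
        batch, self.pending = self.pending, []
        self.upload(batch)
        return True


uploaded: List[List[Dict]] = []
buffer = BatchBuffer(batch_size=500, upload=uploaded.append)

# Simulate 1,200 reviewed alarms from one camera.
for i in range(1200):
    buffer.add({"camera": "Gate 3", "true_alarm": False, "label": "animal", "seq": i})

print(len(uploaded), len(buffer.pending))  # prints: 2 200
```

With a batch size of 500, the 1,200 simulated events produce two uploads and leave 200 events waiting for the next batch.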

Looking Ahead – What’s Next?

Self‑Optimizing Networks – Systems that automatically suggest which cameras or zones need more frequent tagging based on detection confidence spikes.
Edge‑to‑Cloud Continuous Learning Loops – Real‑time model adaptation without waiting for a batch, using federated learning concepts to keep edge devices private yet collaborative.
Hybrid Human‑AI Decision Boards – Visual dashboards where analysts can overlay “probability heatmaps” with their own tags, creating a shared intelligence layer.
LLM AI Model Training – Using large language models to do the adjudication and automatically label the alerts for you.

Bottom Line: Turn Every Alarm Into Learning

Modern security platforms already spot threats—what’s missing is the feedback that lets them get smarter on their own. By tagging events as true / false and attaching simple classifications like “person” or “vehicle,” you hand the manufacturer a curated training set. Upload a batch, let them fine‑tune, install the updated model, and watch false alarms shrink while true positives hold steady.

That’s not “just another feature.” It’s a closed‑loop intelligence engine that keeps your security posture ahead of emerging threats—without hiring a team of data scientists.

Posted in: AI, Cloud Services, IP Video


Affected Services and Companies by the AWS DNS Outage on October 20, 2025

Posted by Zack Hamm on October 22, 2025

So, apart from bean counters constantly purporting that corporations are saving money in the cloud, the general premise is that the cloud is more “resilient” than your own corporate data center, and by offloading that infrastructure to the cloud you are more robust, resilient, and secure. Then… AWS has a major outage and proves all of this wrong.

On October 20, 2025, a DNS resolution failure in AWS’s US-EAST-1 region (Northern Virginia) caused widespread disruptions, primarily affecting DynamoDB and cascading to other services like EC2, Lambda, and SQS. The outage began around 3:11 AM ET (12:11 AM PDT) and the core DNS issue was fully mitigated by approximately 6:35 AM ET (3:35 AM PDT), lasting about 3-4 hours at its peak. However, full recovery for many dependent services took longer due to backlogs and cached DNS issues, with some disruptions extending up to 7-15 hours or into the afternoon ET. AWS advised flushing DNS caches to speed recovery, and all services were reported back to normal by around 6:53 PM ET.

Below is a comprehensive list of affected companies and services, aggregated from reports. Downtime estimates are approximate based on user reports and official updates, as exact durations varied by region and service. Services are categorized for clarity.

This list covers over 100 reported services, though not all may have been equally impacted globally. The outage highlighted dependencies on AWS, affecting millions and costing businesses potentially billions in lost productivity. No cyberattack was involved; it was an internal technical fault.

Social Media and Communication

Snapchat: Down for ~4-6 hours; login and messaging issues.
Reddit: Down for ~3-5 hours; site access and loading failures.
Slack: Down for ~3-4 hours; connectivity and messaging disruptions.
Signal: Down for ~4 hours; messaging outages.
Zoom: Down for ~3-5 hours; meeting and connectivity issues.
Discord: Down for ~3-4 hours; partial outages in voice and chat.
WhatsApp: Partial downtime of ~2-4 hours; messaging disruptions.
Facebook (partial): Down for ~2-3 hours in affected regions.
Instagram (partial): Down for ~2-3 hours; loading issues.
Kik: Down for ~3 hours; app access failures.
Life360: Down for ~3-4 hours; tracking disruptions.
Pinterest: Down for ~3 hours; site loading issues.

Finance and E-Commerce

Coinbase: Down for ~4-6 hours; login and balance issues, but funds safe.
Venmo: Down for ~3-5 hours; payment failures.
Robinhood: Down for ~4 hours; trading disruptions.
Chime: Down for ~3-4 hours; banking app access issues.
Lloyds Bank: Down for ~2-4 hours; online banking disruptions (UK).
Bank of Scotland: Down for ~2-4 hours; similar banking issues (UK).
Navy Federal Credit Union: Down for ~3 hours; account access failures.
Square: Down for ~3 hours; payment processing issues.
Polymarket: Down for ~3-4 hours; betting platform disruptions.
Xero: Down for ~3 hours; accounting software issues.

Gaming and Entertainment

Fortnite (Epic Games): Down for ~4-6 hours; server and login issues.
Roblox: Down for ~4-5 hours; game access failures.
Pokémon GO: Down for ~3-4 hours; app disruptions.
PUBG Battlegrounds: Down for ~3-4 hours; server issues.
PlayStation Network: Down for ~3-5 hours; online gaming affected.
Xbox Live: Down for ~3-4 hours; similar gaming disruptions.
Steam: Down for ~3 hours; store and multiplayer issues.
Rainbow Six Siege: Down for ~3-4 hours; Ubisoft Connect affected.
Dead By Daylight: Down for ~3 hours; game access issues.
Battlefield (EA): Down for ~3 hours; server disruptions.
League of Legends: Down for ~3 hours; login failures.
VRChat: Down for ~3 hours; virtual reality app issues.
Disney+: Down for ~3-4 hours; streaming failures.
Hulu: Down for ~3 hours; video loading issues.
HBO Max: Down for ~3 hours; similar streaming disruptions.
Prime Video (Amazon): Down for ~4 hours; video access issues.
Roku: Down for ~3 hours; device connectivity failures.
IMDb: Down for ~3 hours; site loading issues.
Apple Music: Down for ~3-4 hours; streaming disruptions.
Apple TV: Down for ~3 hours; app issues.
Tidal: Down for ~3 hours; music streaming failures.
YouTube (minor): Partial disruptions for ~2-3 hours.

Productivity, Media, and Education

Canva: Down for ~4 hours; design tool access issues.
Duolingo: Down for ~3-4 hours; app loading failures.
Perplexity AI: Down for ~4 hours; AI search disruptions.
CharacterAI: Down for ~3 hours; AI chat issues.
Microsoft 365 (incl. Outlook, Teams): Down for ~3-4 hours; productivity suite disruptions.
Wordle: Down for ~3 hours; game access issues.
Strava: Down for ~3 hours; fitness tracking failures.
The New York Times: Down for ~3 hours; site and app issues.
Wall Street Journal: Down for ~3 hours; news access disruptions.
CollegeBoard: Down for ~3-4 hours; educational platform issues.
Goodreads: Down for ~3 hours; book tracking app failures.
Adobe Creative Cloud: Down for ~3-4 hours; design software disruptions.
Airtable: Down for ~3 hours; database tool issues.
Asana: Down for ~3 hours; project management disruptions.
Atlassian (incl. Jira, Trello): Down for ~3-4 hours; collaboration tools affected.
Canvas by Instructure: Down for ~3 hours; education platform issues.
Smartsheet: Down for ~3 hours; spreadsheet tool disruptions.
MyFitnessPal: Down for ~3 hours; fitness app issues.
Peloton: Down for ~3 hours; workout platform failures.
Teachable: Down for ~3 hours; online course platform issues.
Substack: Down for ~3 hours; newsletter disruptions.
ChatGPT (SSO): Partial downtime of ~3 hours; login issues.
Postman: Down for ~3 hours; API tool disruptions.
NPM: Down for ~3 hours; package manager issues.
GitHub: Down for ~3-4 hours; code repository access failures.
New Relic: Down for ~3 hours; monitoring tool issues.
ShipStation: Down for ~3 hours; shipping software disruptions.
GoDaddy: Down for ~3 hours; hosting issues.
Shutterstock: Down for ~3 hours; stock media access failures.
HMRC (UK tax site): Down for ~2-3 hours; government services affected.

Retail, Delivery, and Consumer Services

Amazon (incl. Alexa, Ring, Blink): Down for ~4-6 hours; smart devices, shopping, and video feeds affected (Ring lingered longer for some users).
McDonald’s App: Down for ~3-4 hours; ordering issues.
DoorDash: Down for ~3 hours; delivery app failures.
Starbucks: Down for ~3 hours; app ordering disruptions.
Instacart: Down for ~3 hours; grocery delivery issues.
Lyft: Down for ~3 hours; ride-sharing app failures.
Grubhub: Down for ~3 hours; food delivery disruptions.
Fetch: Down for ~3 hours; rewards app issues.
Whatnot: Down for ~3-4 hours; shopping platform disruptions.
Zillow: Down for ~3 hours; real estate site issues.
Ancestry: Down for ~3 hours; genealogy platform failures.
Eight Sleep: Down for ~3-4 hours; smart bed cooling disruptions.
Hinge: Down for ~3 hours; dating app issues.

Telecom and Infrastructure

Verizon: Down for ~3 hours; service disruptions.
AT&T: Down for ~3 hours; network issues.
T-Mobile: Down for ~3 hours; mobile service disruptions.
Boost Mobile: Down for ~3 hours; similar issues.
Xfinity by Comcast: Down for ~3 hours; internet and TV disruptions.
Vodafone: Down for ~2-3 hours; telecom issues (international).

Travel and Airlines

United Airlines: Down for ~3-4 hours; app issues and minor flight delays.
Delta Air Lines: Down for ~3 hours; minor delays and app issues.

Other/Miscellaneous

FanDuel: Down for ~3 hours; sports betting issues.
Vercel: Down for ~3 hours; serverless functions offline.
Nintendo: Down for ~3 hours; online services affected.
Twitch: Down for ~3 hours; streaming disruptions.

Posted in: Cloud Services


Axis Cloud Outage 2025: Navigating the New Normal in Cybersecurity and Remote Access

Posted by Zack Hamm on May 6, 2025

Here is yet another installment of “Why You Don’t Put Security Infrastructure in the Cloud.” I try to cover these whenever possible, but other parties such as IPVM publish more informative and comprehensive articles with more detail.

This latest cloud services outage, this time at Axis Communications, strikes an ominous tone for security in the cloud, especially with the latest discovery of over 19 billion passwords up for grabs on the dark web. (See Forbes article here.)

In early May 2025, Axis experienced a significant cloud outage that has since stirred discussions about cybersecurity, cloud reliability, and the importance of upgrading legacy systems. While Axis’ Secure Remote Access (SRA) version 1 was the center of attention, a combination of factors—from a sophisticated Distributed Denial of Service (DDoS) attack to evolving cyber threats—has made it clear that organizations need to stay ahead of potential vulnerabilities. Recent web searches and industry reports paint a broader picture of the cybersecurity landscape that helped shape the narrative behind this outage.

On May 1, 2025, Axis customers in the Americas encountered major disruptions in accessing AXIS Camera Station services via SRA v1. According to public reports and status page updates, the outage was triggered by a targeted DDoS attack that effectively blocked access to crucial authentication services. News outlets and cybersecurity blogs noted that such attacks have been on the rise in 2025, with a marked increase in frequency, volume, and sophistication. Cloud service providers and security experts have been warning that the arms race between attackers and defenders is intensifying, with recent data from independent security research groups highlighting a dramatic spike in global DDoS events.

Trending cybersecurity articles point out that this Axis incident is not isolated. In recent months, several major cloud providers have reported disruptions linked to DDoS attacks. These events underscore a common theme in today’s digital environment: as businesses increasingly rely on remote access and cloud technologies, robust security and continuous system upgrades become non-negotiable. For Axis, the outage served not only as a serious operational setback but also as a wake-up call to accelerate the migration from SRA v1 to the safer and more advanced SRA 2.0.

Industry analyses from cybersecurity thought leaders suggest that the attack on Axis was emblematic of broader trends seen in 2025. For example, multiple online publications have detailed how threat actors are leveraging automation and AI to launch more effective DDoS attacks, which overwhelm networks within minutes. The Axis event fits into this larger narrative, illustrating the potentially devastating impact such incidents can have on remote access infrastructures. Many businesses, especially those with remote monitoring systems, are now rethinking their strategies to include not just reactive measures but proactive defense mechanisms, such as increased redundancy and constant monitoring.

A key recommendation emerging from these discussions is the importance of upgrading legacy solutions. Axis has been urging its customers to shift from SRA v1 to SRA 2.0, a newer platform that utilizes WebRTC-based encrypted connections along with improved backend architectures to thwart similar attacks. Migrating to SRA 2.0 is not just about staying current with technology trends; it is about building a resilient infrastructure that can adapt to rapidly changing threat landscapes. Web-based resources and forum discussions among integrators emphasize that switching to newer versions may require minimal changes for some users but could be more demanding for those relying on older hardware configurations. Nonetheless, the consensus remains that the benefits—better security, enhanced performance, and longer-term support—outweigh the transitional challenges.

Another aspect highlighted in web searches is the value of diversified access strategies. In the Axis outage, customers who had set up alternative remote access methods, such as manual port forwarding or VPN solutions, found themselves insulated from the worst effects of the disruption. This revelation has prompted many organizations to revisit their disaster recovery and business continuity plans. Multiple cybersecurity experts recommend that businesses implement layered security measures, ensuring that if one pathway is compromised, others remain fully operational.
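
The layered-access idea can be sketched as a prioritized list of access paths, each with a health probe, where the first working path wins. This is a generic illustration, not an Axis API: the path names, the probe functions, and the simulated outage are all made up for the example.

```python
from typing import Callable, List, Optional, Tuple

# Each access path is a (name, probe) pair; the probe returns True if the path works.
AccessPath = Tuple[str, Callable[[], bool]]


def first_working_path(paths: List[AccessPath]) -> Optional[str]:
    """Return the name of the first path whose probe succeeds, in priority order."""
    for name, probe in paths:
        try:
            if probe():
                return name
        except OSError:
            # Connection errors mean "this path is unavailable"; try the next one.
            continue
    return None


def cloud_relay_down() -> bool:
    """Simulated outage: the cloud authentication service cannot be reached."""
    raise ConnectionError("authentication service unreachable")


paths: List[AccessPath] = [
    ("cloud remote access", cloud_relay_down),  # simulated outage
    ("site VPN", lambda: True),                 # simulated healthy fallback
    ("port forwarding", lambda: True),
]

print(first_working_path(paths))  # prints: site VPN
```

In a real deployment each probe would be an actual connectivity check, and the ordering would reflect which path you trust most rather than which is most convenient.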

The broader narrative emerging from this event also includes reflections on cloud dependency. With an increasing shift to cloud-based services, the Axis Cloud Outage has reignited debates on the practical and security-related downsides of heavy reliance on centralized systems. Future-proofing remote access capabilities, therefore, may involve a hybrid approach—melding on-premises resilience with the scalability and convenience of the cloud.

In summary, the Axis Cloud Outage of 2025 is a pivotal reminder of our interconnected digital vulnerabilities. It situates itself within a larger context of escalating DDoS threats and emphasizes the urgent need for continual system upgrades and diversified security approaches. For organizations using Axis Camera Station services, the move to SRA 2.0 isn’t merely a recommended upgrade—it’s an essential strategic shift toward establishing a more secure and reliable remote access framework. As companies adapt to the evolving cybersecurity environment, lessons from this outage will undoubtedly drive innovations and reinforce the importance of proactive risk management in today’s cyber-driven landscape.

This is the constant threat with cloud, internet, or public-facing network services in general. It’s a race, always a race, to keep ahead of criminals trying to gain access for information, control, or disruption. Worse still, you’re paying for it. Sometimes corporations are on the hook for data breaches, but more often than not, the customer, consumer, or taxpayer ends up footing the bill for the breach, and the subsequent hardware/firmware/software replacement needed to fix it.

Tags: Axis, Axis SRA, Cloud, DDOS

Posted in: Cloud Services


Another Oops for Cloud Services – InfluxDB Halts Service in Belgium/Sydney with Insufficient Customer Notice

Posted by Zack Hamm on July 18, 2023

In the latest example of “If you don’t own the server you don’t own the data” cloud events, InfluxData recently closed operations in their Belgium and Sydney locations, with apparently woefully inadequate customer notification and follow-up. In both instances, users were apparently notified only via email and via the InfluxDB documentation or status website. The hows and whys are a little fuzzy, but suffice it to say that InfluxData management made some very unusual decisions to turn off services and delete customer data… with what is overwhelmingly being called “insufficient notice”. It appears that this event may have cost InfluxData some customers, or new customers at least, as they try to dig out from under this fiasco.

In fairness, they did provide notification via their cloud status page, but who looks at that unless there’s an outage or service degradation? You can follow the thread here if you want to see the drama unfold: https://community.influxdata.com/t/getting-weird-results-from-gcp-europe-west1/30615/19

While we are not aware of any security product that uses InfluxDB for its cloud database, there are plenty of examples of video and access control products that use cloud-based database instances or other cloud-dependent services. InfluxData uses Google, Azure, and AWS for its hosting services, so this wasn’t a case of a company that suffered a catastrophic site failure or bankruptcy. This was more likely a financial decision to discontinue services in poorly performing regions and focus on better ones. It certainly was well within InfluxData’s rights to do so, but it apparently could’ve been communicated much better. Further, there was no attempt to migrate users’ data to another region, or even provide backups of the data for users to migrate themselves.

Responses from users on the support page were scathing, if not somewhat in disbelief:

Users from the Sydney region weren’t so lucky, as apparently there were no measures taken to be able to restore their data:

All of this is just to say that thousands of businesses run on cloud services every day, and many of them probably have no idea what their hosting provider’s service level guarantees or disruption notification policies are. Further, just because your cloud service guarantees it is backing up your data doesn’t mean you shouldn’t be backing it up also… to your own storage… that you own. If you must use security software in the cloud and store your data there, have a business continuity plan that includes your cloud provider services and the recovery of the data that is stored there.

Now repeat after me, “If you don’t own the server, you don’t own the data”…

Posted in: Cloud Services, Security Technology, Vulnerability Analysis

