Software testing and bug stories in the news

I wonder what other questions we can ask ourselves about bug stories. :thinking:

What was the root of the problem?
Who could have helped spot it?
How much impact has it really had?
Who did it impact?
What processes should be reviewed?
What can we learn from it?

3 Likes

Yes, great questions!

And I found this handy set of post-mortem questions.

Some that stood out that I think are particularly helpful:

  • Were there any factors beyond our control that contributed to the problem?
  • Did we have the necessary resources to prevent the problem?
  • Did we have a clear understanding of the problem and its impact?
  • Was the incident well-documented, and can it be replicated?

1 Like

Spotted this one on X via @bobs

1 Like

We used to do a thing called “Five Whys” following any production incident that required notifying users of a degraded experience. The purpose of the technique is to drive a root cause analysis and then generate a corrective action. Like many of these kinds of ceremonies, it feels awkward at first, but it gets better over time.

1 Like

Global McDonald’s outage:

In New Zealand, X user Germin van Royen: “The Mcdonalds outage is crazy. Went in tonight and drive thru + all kiosks were down. A system that can fail nation wide is bad but across multiple countries too!? Bonkers.”

In other news, citizens of those countries were reported to be slightly healthier than those in countries where McDonald’s did not fail.

(Yes, that’s sarcasm)

2 Likes

“Sainsbury’s says it will not be able to fulfil the ‘vast majority’ of online deliveries on Saturday because of ‘technical issues’.”

Yikes, feeling for team Sainsbury’s. :purple_heart:

And now Greggs!

h/t to @bethtesterleeds for sharing this on LinkedIn.

1 Like

This one is a true national crisis. If this doesn’t raise the profile of software and systems issues, nothing can.

3 Likes

@therockertester’s flight was delayed. Here’s what happened!

4 Likes

Anonymous reviews that weren’t actually anonymous, oops.

4 Likes

Triage closed the bug reports as “By Design”.

That is no “oops”.

1 Like

It’s “oops” on the part of whoever made that design decision.

I might consider using the site to look for jobs, but there’s no way I’m posting anything there.

Just found this one: Software glitch saw Aussie casino give away millions in cash

2 Likes

I like it:

  • It is possible to insert two receipts into TICO machines.

  • That was a feature, not a bug, and allowed gamblers to redeem two receipts and be paid the aggregate amount.

  • But a software glitch meant that the machines would return one of those tickets and allow it to be re-used – the barcode it bore was not recognized as having been paid.

I’m dream-guessing that someone in the dev team thought, “we can’t allow redeeming two receipts, that’s a bug”, and set about ‘fixing’ it: redeem one and reject the other. That seems especially plausible given this line: ‘An internal investigation found “numerous failures (human, process and technological) that more than likely prevented the fraud from being identified at an earlier opportunity.”’
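
For anyone who wants to poke at the idea, here is a rough sketch of the invariant that seems to have been broken: every ticket that contributes to a payout must end up marked as paid. This is not the casino's actual software; `TicketLedger`, its method names, and the barcodes are all made up for illustration.

```python
class TicketLedger:
    """Hypothetical ledger of cash-out tickets, keyed by barcode."""

    def __init__(self, tickets):
        self.values = dict(tickets)  # barcode -> amount owed
        self.paid = set()            # barcodes that have already been paid out

    def redeem(self, barcodes):
        """Pay the aggregate amount and mark EVERY inserted ticket as paid."""
        barcodes = list(barcodes)
        if len(set(barcodes)) != len(barcodes):
            raise ValueError("same ticket inserted twice")
        # Validate everything first so a bad ticket never causes a partial payout.
        for code in barcodes:
            if code not in self.values:
                raise ValueError(f"unknown ticket: {code}")
            if code in self.paid:
                raise ValueError(f"ticket already paid: {code}")
        total = sum(self.values[code] for code in barcodes)
        # The reported glitch is like skipping this step for one of the tickets.
        self.paid.update(barcodes)
        return total


ledger = TicketLedger({"A1": 50, "B2": 70})
print(ledger.redeem(["A1", "B2"]))  # 120
```

A test worth having here is the "redeem two, then try to reuse one" case: a second `ledger.redeem(["B2"])` should raise rather than pay out again.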

2 Likes

I hope that this one is not an airline any of us have used.

Here’s a boundary bug with an airline not liking people over 100 years old :smiley:

Found via @bethtesterleeds on LinkedIn

2 Likes

In my experience, this limitation (max 99 years) is fairly widespread in aviation systems.
It might be handled differently per system: an error in the interface when booking a flight, or allowing the booking and dealing with it afterwards.
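
A quick sketch of how a limit like this can bite, assuming (purely as a guess, not something confirmed about any airline's system) that age ends up stored in a two-digit field. The functions `age_on` and `two_digit_age` are made up for illustration.

```python
from datetime import date


def age_on(birth_date: date, on_date: date) -> int:
    """Full years between birth_date and on_date."""
    years = on_date.year - birth_date.year
    # Subtract one if the birthday has not happened yet this year.
    if (on_date.month, on_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def two_digit_age(age: int) -> int:
    """Hypothetical buggy storage that keeps only the last two digits."""
    return age % 100


# Boundary values around the assumed 99-year limit.
for age in (98, 99, 100, 101):
    print(age, "->", two_digit_age(age))
# 98 -> 98
# 99 -> 99
# 100 -> 0
# 101 -> 1
```

The interesting test cases are the ones either side of the boundary: 99 is fine, while 100 and 101 come back looking like an infant's age.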

1 Like

Lol, I’ve used them quite a lot actually and not come across this, but something tells me I may try this on BA. :smiley: :sweat_smile:

1 Like

There is the famous aviation case of a Chinook crash, where computer software was treated as “infallible”, and which was used as case law in the Post Office scandal. It involved GPS coordinate calculations or something being wrong: Faulty software could have caused Chinook crash in 1994 | Military | The Guardian