I helped define and test similar data capture systems in HMG (utilities regulation) some 25 years ago, and we had a particular emphasis on end-to-end testing because our senior management concentrated very sharply on data accuracy and security. Individual data points had to be discrete, unique and to survive data scrape from the capture system into our master database. Our data schema had been designed as a paper system and our first five years concentrated on digitising that data. We collected a large amount of data annually from utility companies and used this to take decisions on utility bills that would apply nationally.
(Oh, and we outgrew Excel inside three years in terms of the amount of data we were requesting [though Microsoft code bloat was also an issue]; but it was a yearâs development to create a capture system that still looked like a familiar spreadsheet but actually used database architecture.)
My later private sector experience in end-to-end testing of business processes following the upgrade of one of the component apps of the data pipeline would focus me very much on the interface between different apps. And anyone in public sector IT (where the small-p political stakes are high and could seriously impact people in so many ways) who doesnât carry out some sort of risk analysis is just storing up trouble for themselves. âWhatâs the worst thing that could happen?â ought to be a daily challenge to any decision.
Itâs like making POST requests but not validating they are actually in the DB.
Funny nobody noticed for 16.000 records ⌠. Insane how it still happens these days but yea more common then we think.
So yea what would I do?
Write a validation that the record has been added
@progrockmicky: Thereâs nothing wrong with Excel as a data capture solution, especially when youâve only been given six weeks to develop a âworld-beatingâ app. But it does help if you understand how it worksâŚ
TL;DR: I wouldnât. Thereâs more than one way this might have happened. Well, or not happened.
That word âwouldâ invites to some ⌠fantasising. â So, had I been on the team right from the beginning, I may have accepted exploring the problem space with Excel (or any other spreadsheet).
Taking risk into account (in COVID-19 cases, that would be the death of patients that may have been prevented), weâd have gotten into answering one question: Can the solution deal with the worst case? - Letâs assume the worst case is that every person in the country gets infected (Yes, unlikely. Yes, in that case we wouldnât need that data collection anymoreâŚ), the system would need do deal with â70.000.000 cases (according to wikipedia).
At that point, I would have strongly recommended against using any spreadsheet, and turning to a data base instead.
At that point I imagine 3 possible paths:
Everythingâs cool: The project decides to use a DB system (PostgreSQL, MySQL, maybe even SQLite is good enough (not sure about the amount of data and/or performance). Everything goes smoothly, we progress using DBB, add load and performance testing and let a security expert have a look at things.
The messenger gets shot: The project fires me, and heads into trouble without me. I will later probably be blamed anyway.
Weâve always done it this way: The project decides to continue using a spreadsheet. I reraise my concerns, am ignored and decide to leave (not without a written memo that explains my concerns).
Iâd really like to see more details about this. That said: Given the information I have and the importance of a working solution, I can imagine that this is considered malpractice.
In order of likeliness, the paths would probably be 2, 3, 1.
If PHE follow usual UK public sector practice, at least your concerns could be registered in your exit interview, though the findings from that would most likely get buried in the bureaucracy. Then again, I was never high enough in the civil service to know what happens to exit interview notes.
As I said in my first post, faced with a similar situation twenty-five years ago, we eventually adopted a database solution (MS-Access!). But that position evolved over time. The question this whole affair raises to me is to what extent PHEâs IT team are at the current cutting edge of the industry. Given that they are a public sector organisation, I suspect that they do not have access to new kit, the latest thinking on software architectures, the time to go on training courses, or are in a position to recruit the brightest and best as their salary rates will not match those on offer in the wider industry. But this is the elephant in the room; itâs far easier to shoot the messenger. And hence your option 2.
In which case, the political masters have to put up with a lead time to MVP of more than six weeks (which was what was promised way back when the UKâs lockdown first started).
I still find it telling that the bug was actually in the interface between 2 components of the system, although the excel component was at fault (technically you could design around the columns limit by adding sheets for each week number), the fault was really for me, between two parts. And thatâs where many bugs can be discovered through static analysis?