Hi there! I'm not very active here, but right now I feel like the community can help me come up with ideas because I very much need them.
Our product is a C2C and B2C solution (kinda like Vinted) and we're organized by tribes split by user experience, i.e. Seller, Buyer, Marketplace, etc. I was recently moved to a different tribe in the company and one particular team is eager for me to help them avoid lots of bugs (it's happened) whenever there's a refactor. I'm lost and I feel like a complete impostor. I am a user of the app but I don't know "everything" (can anyone even?) and I'm struggling to help them find solutions to plan things better and avoid breaking a lot of things from other teams whenever they refactor something.
I'm currently working on a doc I call "Trigger questions" in which I speak with different people and come up with a list of things we should take into account (and encourage communication with the other teams!) when we plan a refactor. I've also suggested modeling before refactoring (like a mindmap) so we can have an idea of different areas that could be affected by our changes. We'll try these things and see how they work.
My question is, have you been in a similar situation? If yes, what helped you help the team?
PS: I tried to provide enough context, I hope it is
The first thing that comes to mind when I see "to help them avoid lots of bugs" is to find out what your role is and their concrete expectations:
Product manager - help them define better specifications and offer technical support if possible;
Development lead - define, design and lead development practices;
Quality Assurance Engineer - analyze development issues, fix them, review code, fix bugs, improve the code, write low-level checks, build scripts/tools to help developers test (Google was using QAE with tasks like these);
Software Tester - identify quickly the problems or potential issues impacting the quality of the product and inform the relevant people that matter;
Release manager - lead the process of release and not go through if there are pending bugs that would require fixing.
Other?! Maybe an Automation engineer - who codes hundreds of checks in the logic/flows of the application as a net for the obvious issues.
Keeping backward compatibility is key - by doing integration regression testing, you're making sure everything works as it did before. You don't need to test anything new, just that the integration still works after refactoring. Refactoring is tricky - there's no magic solution to avoid breaking stuff and having bugs. Devs from different teams gotta talk and figure out how to handle everything - integrations, APIs, schemas, etc - without breaking the whole system. In some situations, you might run almost 2 systems/interfaces for a bit to switch from the old to the new code.
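To make the "works as it did before" idea concrete, here is a minimal sketch of a parity check that runs the old and the new code side by side on the same inputs. Everything in it is hypothetical: `Order`, `legacy_total`, `refactored_total` and the sample values are made up for illustration and just stand in for whatever is being refactored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    # Hypothetical input type; amounts are in cents to avoid float rounding.
    item_price_cents: int
    shipping_cents: int
    buyer_fee_percent: int


def legacy_total(order: Order) -> int:
    """Hypothetical pre-refactor implementation (stands in for the old code)."""
    fee = order.item_price_cents * order.buyer_fee_percent // 100
    return order.item_price_cents + fee + order.shipping_cents


def refactored_total(order: Order) -> int:
    """Hypothetical post-refactor implementation that must keep the same behaviour."""
    parts = [
        order.item_price_cents,
        order.item_price_cents * order.buyer_fee_percent // 100,
        order.shipping_cents,
    ]
    return sum(parts)


def find_regressions(old, new, samples):
    """Return (input, old_result, new_result) for every input where the two disagree."""
    return [(s, old(s), new(s)) for s in samples if old(s) != new(s)]


# Made-up sample inputs; in practice these could come from recorded real cases.
SAMPLES = [
    Order(1000, 300, 5),
    Order(0, 0, 0),
    Order(1999, 450, 7),
]

if __name__ == "__main__":
    print("regressions:", find_regressions(legacy_total, refactored_total, SAMPLES))
```

An empty list means the refactor behaves like the old code for those samples. It says nothing about inputs that are not in the list, so the value depends on how representative the samples are.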
Planning is important - break down the refactor into smaller iterations so you can do it step by step. It's gonna take more time and resources, but it's way easier to manage and less likely to cause critical issues. Make sure there's a code review process across all teams, so everyone understands and accepts the changes, and plans any extra refactoring that needs to be done on their sides.
The real issue is probably having a solid dev process that all affected teams get, and strong engineering management to maintain the needed formalities in processes. As QA, finding every bug or cutting down on them significantly might not be possible, but keeping them manageable is. Refactoring is basically a feature. If you're implementing a big feature or several across different services, you're in the same situation.
Try to use automation - think e2e API tests, unit testing, contract testing, etc. Use feature flags to ship refactoring so you can switch between old and new code as and when and if needed. Keep tech docs and diagrams up to date, have cross-team meetings, and make sure everyone who needs to know, does.
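As a rough illustration of shipping a refactor behind a feature flag, here is a minimal sketch. It assumes the flag is read from an environment variable; the flag name `USE_REFACTORED_SHIPPING` and both shipping functions are hypothetical, and a real project would more likely use whatever flag service it already has.

```python
import os
from typing import Callable, Mapping


def flag_enabled(name: str, env: Mapping[str, str] = os.environ) -> bool:
    """Treat '1', 'true' or 'on' (case-insensitive) as enabled; anything else is off."""
    return env.get(name, "").strip().lower() in {"1", "true", "on"}


def legacy_shipping_label(country: str) -> str:
    """Hypothetical old code path."""
    return "Domestic" if country == "ES" else "International"


def refactored_shipping_label(country: str) -> str:
    """Hypothetical refactored code path with the same expected behaviour."""
    labels = {"ES": "Domestic"}
    return labels.get(country, "International")


def choose_shipping_impl(env: Mapping[str, str] = os.environ) -> Callable[[str], str]:
    """Pick the implementation based on the flag, defaulting to the old code."""
    if flag_enabled("USE_REFACTORED_SHIPPING", env):
        return refactored_shipping_label
    return legacy_shipping_label


def shipping_label(country: str, env: Mapping[str, str] = os.environ) -> str:
    """Single entry point for callers; they never need to know which path ran."""
    return choose_shipping_impl(env)(country)


if __name__ == "__main__":
    print(shipping_label("ES"))
    print(shipping_label("FR", {"USE_REFACTORED_SHIPPING": "true"}))
```

Defaulting to the old path means that a missing or misspelled flag falls back to known behaviour, and switching back is a config change rather than a redeploy.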
In the past I worked for a company with two products sharing some code and dealing with different users. The devops engineers had the following things implemented:
For each feature or change, a developer had to write automated tests to ensure that the code worked as required. The bulk of the tests were unit tests, which could be executed within seconds. Test Driven Development was their way to develop the code.
In order to manage the code, they used git, a version control system. There was only a single branch. If the code broke, it would become obvious within minutes.
A developer could only integrate code into this single branch after passing the automated tests of all parts of the system.
There were other house rules for development, but the described ones had a major impact on the quality of the products.
Welcome back @antonella. Yeah, I'm curious as to why the company keeps on refactoring. I mean, that's not actually a customer-value piece of work, but maintenance work for engineering's sake, which is a velocity enabler or a velocity killer.
I am not so sure you are preventing "bugs" so much as preventing "regressions", and I find that the friction a very strong, fully automated regression testing suite imposes means that every refactor has to work the same as the one before. In reality, that's not true all of the time. During a refactoring, you may find many reasons to update or fix the UI (or, for that matter, to fix any API interfaces and integrations). I think it has been unhelpful to create the myth that regression tests must always pass, so for me that would be my first "trigger" or myth to bust in any conversation. Refactoring is a chance to make security as well as UI and workflow changes.
As for ways you might actively help, it might be in championing the socialising of internal test environments where the teams integrate all their components into an environment useful not just for automated testing, but also for those B2C demos, as a way of turning the integration testing you already do into something far more "social" as well as better understood by all teams. Things become less "chuck it over the fence" when there is a test sandbox that is always live and up to date. Hope that's an idea that is highly technical but only requires communications and people-networking effort to do.
Having a place where everyone can see their refactoring and show it off is a step towards the 3 teams/divisions in the company being able to share by seeing new features early, and being allowed to play because nothing can really break is powerful.
Thanks for reaching out and sharing your thoughts.
It sounds like you're navigating some new territory within your company, but you're not alone in feeling that way.
Remember you were chosen to be part of this team for a reason!
Your approach of creating a "Trigger questions" doc and suggesting modelling before refactoring shows great initiative and problem-solving skills. It's all about fostering open communication and collaboration between teams, which is key to successful refactoring.
In similar situations, I've found that transparency and teamwork are crucial. Building relationships with others and fostering a culture of knowledge sharing can help immensely. Don't be afraid to lean on your colleagues for support and bounce ideas off each other.
Keep up the great work, you've got this!!!
NO MATTER HOW SMALL, EVERY STEP FORWARD IS PROGRESS.
I think I have an understanding that:
You don't have the Subject Matter Expertise that longer-term folks have
Your application is subject to changes that teams external to yours are making, causing regressions that your team didn't create.
First, find the SMEs that do exist. Often they are in Product Support (they know all the ugly warts in the system), Product Management (they know all the business rules) and engineering management (they know how often their teams have to fix the same things). Learn from them. Always create test documentation, even if it's brief: either a planned set of tests OR a record of tests that have been done. This is so that all of those disciplines have something to go over and examine for gaps in testing. I find it also helps me articulate my understanding of the feature or the change being made.
If there are tools in the build process that articulate upstream changes, get to know them. If there aren't, encourage the implementation of them. Work with engineering leadership on how you all can be aware of dependencies changing in the codebase. Don't just accept a build. Look at the pull requests involved. Over time you will learn to recognize areas of risk. Or at least ask. In my last role this would happen often. It had taken a long time but I knew which changes were risky. A single external developer had created a particular email service. He liked to just push changes. Now, we couldn't detect those changes (and he was a jerk about bothering to announce them), but when we had a feature change that would touch any aspect of email, I would alert on that and loudly communicate that the development and testing needed to be broad and deep around any email dependency.
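If no such tooling exists yet, a very small script can be a starting point. This is only a sketch under assumptions: the risk areas, the path patterns and the file names are all invented, and the list of changed files is assumed to come from something like `git diff --name-only` for the pull request being reviewed.

```python
from fnmatch import fnmatch

# Hypothetical map of known risky areas to the path patterns that belong to them.
RISK_AREAS = {
    "email": ["services/email/*", "*/mailer/*"],
    "checkout": ["services/checkout/*"],
}


def risk_areas_touched(changed_files, risk_areas=RISK_AREAS):
    """Return {area: [matching files]} for every risk area the change touches."""
    touched = {}
    for area, patterns in risk_areas.items():
        hits = [f for f in changed_files if any(fnmatch(f, p) for p in patterns)]
        if hits:
            touched[area] = hits
    return touched


if __name__ == "__main__":
    # Made-up file list standing in for the changed paths of one pull request.
    changed = ["services/email/smtp.py", "libs/mailer/send.py", "README.md"]
    for area, files in risk_areas_touched(changed).items():
        print(f"Heads up: this change touches '{area}': {files}")
```

The map itself is the valuable part: it is a written-down version of the "I knew which changes were risky" knowledge, so it does not live in one person's head.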
And do continue to create a culture of that communication. We continually struggled with it. Our product managers and engineering leads were instrumental in those comms: a PM would call out that a feature change or business rule change they were making would impact other teams' features, and engineering leads would call out when a PR they were reviewing was depended on by other teams.
It will never be perfect and it takes relentless leaning on others to get that culture to shift slowly. Keep at it!
If this problem is an "issue" for you, you may want to dig into finding the cause first before trying to plan "how to avoid" it.
The first thing I would do is understand the dev team's ability, e.g.
Does the team have too many junior devs?
Does this problem only happen in this team / a specific person's work?
Is the team scale too large to manage?
Do pull requests pile up for a long time before review and merge?
This is not to blame anyone, but to check whether we need to talk to managers about how people are allocated.
Another checkpoint is the timeline: are the devs finishing the refactor in a rush for a deadline? Mistakes happen more easily if they don't have enough time to review and check their work.
You could also suggest a retrospective with the team to learn more about the reasons behind it. Communication is important to solve problems.
Of course, also check the testing process, but this mostly keeps the situation from getting worse rather than tackling the root cause.
How early and how frequently are the tests run? Projects with many bugs may need earlier testing to shorten the feedback loop with developers.
Is the defect list well managed? If bugs keep growing, it will be hard to see the priority of issues.
Are the high-value issues prioritised for reporting and fixing? If clean builds are impossible, we can at least ensure the main features are not breaking.
Hi @ipstefan! I'm sorry I forgot to mention my role. It's called "QA Engineer", but what's expected of us is to coach the teams when needed so they can come up with solutions that work for them. Also, we aren't embedded in a team; we split our time working with multiple teams.
Hi @shad0wpuppet I'm super happy it inspired you to write a post!
I'm happy to say that the list of potential things to take into account that we've been working on has all the things you mentioned, so it seems like we're on the right path and I find that a relief, to be honest.
Thank you so much for taking the time to answer my question so thoroughly!
Hello @han_toan_lim thank you for your reply. I believe our devs are already doing this with every piece of code they merge, but I'll ask to learn if they're doing it with new features mostly or also with refactors.
Thank you so much!
Hi @conrad.braam, thank you for taking the time to answer my question!
Currently, the problem these refactors cause isn't really affecting the regression or the automated tests we have (or not in a big way, because no one has complained about it yet haha).
We do have a beta environment and we've been suggesting for years that perhaps we should have one more environment, since beta is "everyone's sandbox" and sometimes things that should work don't because someone's deployed something there... but it hasn't been a problem for at least the past two years, since I think communication improved and we're alerted when someone's going to merge something and cause some temporary disruptions.
I think they're refactoring these important parts because they're legacy, and they have two goals: 1) implement the latest design system and 2) update the code so that they can carry out their next ideas on the roadmap... however, reading this question of yours made me think... so I'll surely be asking the team WHY we're doing it because I sense there might be more than I think.
Once again, thank you for your reply! Reading all the different replies is making me think that perhaps the issue here (I'm new in this tribe, so I don't have the full context) might be linked to speed and pressure to deliver... so I think I'll push some buttons there to see what answers I get.
Hi @ansha_batra thank you thank you thank you for this. Perhaps that's the solution I was really looking for, someone to remind me that I'm already trying and that I'm trying to find a collaborative solution because, truth is, I think that's the key to solving this problem: collaborate more, often and better.
Hi @msh, thank you for taking the time to answer my question.
I think we're already doing many of the things you mentioned BUT you're telling me something I've been feeling for the little while I've been working with this team (3 weeks-ish), which is "if we don't know many things about the area of the product we're refactoring, then let's find someone who does!". I'll follow that path to see what I find out! Thank you!!!
Hi @joyz thank you for taking the time to answer my question.
Your initial questions are super interesting and combined with some suggestions given by other people here, I think they'll allow me to get a better picture of what could be the actual problem here, so I marked it as a solution.
Fortunately, we're monitoring bugs and defects pretty well and also prioritizing them well enough, so I think that the actual team + the speed they commit themselves to + not breaking refactors into smaller deliverable chunks might be causing some pain here... I'll keep investigating and asking questions to make people reflect.