In my experience, doing continuous performance testing in fully integrated environments is very difficult, at least in orgs larger than an early startup.
There are a few reasons for this, but the biggest is that you generally need large and diverse data sets to do good performance testing. Maintaining that data (and keeping it in sync across the whole system) on a continuous basis takes a lot of engineering and process discipline.
The last consulting gig I did was primarily as a performance engineer, and the stack was a mixture of legacy eCommerce products and modern microservices. Configuration management of the legacy parts was one of our biggest headaches - we’d quite often run two tests believing the configuration was identical (or had only varied in the way we intended), only to find an uncontrolled change had altered the performance profile.
I also find with full-system perf tests that there are so many confounding factors that you can't just automate a red/green result at the end - you need a human in the loop and time spent on analysis.
Having said all this, I've seen a few teams successfully doing continuous perf testing at the component level, mocking all of an app/service's dependencies and running short tests on commit. Quite often they record and plot key metrics, such as latency quantiles, over time to establish when a new build has degraded performance relative to the previous one.
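That kind of build-over-build check is simple enough to sketch. Here's a minimal, hypothetical version in Python: compute a few latency quantiles from raw samples and flag any quantile that's slower than the previous build's by more than a tolerance (the 10% threshold and millisecond units are illustrative, not from any team I worked with):

```python
# Hypothetical sketch: flag a perf regression when a new build's latency
# quantiles degrade beyond a tolerance relative to the previous build.
from statistics import quantiles


def latency_quantiles(samples_ms):
    """Return p50/p95/p99 from raw latency samples (milliseconds)."""
    qs = quantiles(samples_ms, n=100)  # cut points p1..p99
    return {"p50": qs[49], "p95": qs[94], "p99": qs[98]}


def regressions(baseline, candidate, tolerance=0.10):
    """Quantiles where the candidate build is >tolerance slower than baseline."""
    return {
        q: (baseline[q], candidate[q])
        for q in baseline
        if candidate[q] > baseline[q] * (1 + tolerance)
    }


if __name__ == "__main__":
    previous = latency_quantiles([float(i) for i in range(1, 201)])
    current = latency_quantiles([float(i) * 1.5 for i in range(1, 201)])
    print(regressions(previous, current))  # every quantile is ~50% slower
```

In practice you'd persist the per-build quantiles somewhere queryable and plot them, since (as above) a single threshold still benefits from a human sanity-checking the trend.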