The Page Config Model

Many of you will be familiar with the page object model.

A page object model wraps an HTML page or fragment, with an application-specific API, allowing you to manipulate page elements without digging around in the HTML.

I’d like to propose a variant on this idea: the page config model.

The main difference is that the page config model uses configuration, not code, to abstract selectors. This follows the rule of least power. It should remove the need to, for example, add a whole new class whenever a new page is built.

Example configuration, in type-safe YAML:

login:
  element:
    username: "#id_username"
    password: "#id_password"
    ok: "#id_ok_button"
dashboard:
  element:
    message: "#id_dashboard_message"

with corresponding code that uses it:

pcm.next_page("login")
pcm.element("username").fill("myusername")
pcm.element("password").fill("mypassword")
pcm.element("ok").click()
pcm.next_page("dashboard")
expect(pcm.element("message")).to_be_visible()

This code is from my new library.
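To make the idea concrete, here is a minimal sketch of how such a model could map page and element names to selectors. It is not the library's actual implementation: the class name `PageConfigModel` and the `selector()` method are hypothetical, and a plain dict stands in for the parsed YAML so the example has no dependencies.

```python
from typing import Dict, Optional

# Stand-in for the parsed YAML shown above (assumption: same shape as the file).
CONFIG: Dict[str, dict] = {
    "login": {
        "element": {
            "username": "#id_username",
            "password": "#id_password",
            "ok": "#id_ok_button",
        }
    },
    "dashboard": {"element": {"message": "#id_dashboard_message"}},
}


class PageConfigModel:
    """Hypothetical sketch: resolves element names on the current page."""

    def __init__(self, config: Dict[str, dict]) -> None:
        self._config = config
        self._current: Optional[str] = None

    def next_page(self, name: str) -> None:
        # Switch the page that later element lookups refer to.
        if name not in self._config:
            raise LookupError(f"unknown page: {name!r}")
        self._current = name

    def selector(self, element: str) -> str:
        # Return the raw selector string for an element on the current page.
        if self._current is None:
            raise RuntimeError("call next_page() before looking up elements")
        elements = self._config[self._current]["element"]
        if element not in elements:
            raise LookupError(
                f"page {self._current!r} has no element {element!r}"
            )
        return elements[element]


pcm = PageConfigModel(CONFIG)
pcm.next_page("login")
login_ok_selector = pcm.selector("ok")
```

In a real test the selector string would be handed to the browser driver (for example, passed to a locator call) rather than returned to the test itself.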

There are a few reasons for doing this:

  • Treating code as a liability - moving code to configuration, as a “lower powered” version of code, makes it easier to maintain, at least in theory (I think this would be untrue for type-unsafe YAML - the kind used in Kubernetes, etc.).
  • It makes it easy to measure “selector debt” - e.g. by giving low scores to the selectors which are riskier/more likely to break and high scores to “good selectors”, and tracking the selector debt over time.

Thoughts?

Nice idea, @hitchdev. I want to do some config-driven page objects too, but I had a different idea, and a cross-platform use case as well. So…

Okay, I’m seeing a few things here that tell me we are either using different terminology or just shifting the selectors from code into config or metadata. Is that right?

You mention “selector debt”. I mean, if you have no problem with your debt, is moving it into a YAML file going to make life better for engineers? Sure, code is still a liability, but inflexible config is also a liability. Notice how your start page is “initialized” by the call to .next_page(). To me this is a code cognitive “load” because “login” is no longer a static; it’s only possible to detect typos in variations of page names or missing pages at runtime now - is that intended?

I’m assuming you can support composition of pages using this model? How would a person compose pages? For example, what if a new login page was added that supported federation-type buttons (Google etc.) instead of username + password fields? Can config-driven page object creation still handle that kind of product change? I think it can, but you would need to add a few things which are going to help us really get to a rule-of-least-power solution that won’t frustrate people and cause them to just abandon the pcm module and write their own.

I’m not even sure I ever want to type pcm.element("ok"), because "ok" is a constant and itself becomes problematic from a DRY perspective.

I’m probably sounding mega critical, but that’s only because I’m working towards a platform-agnostic page object model, where elements on pages on different platforms or on compact screen sizes change selector, and where I do NOT want the coder to have to worry about environment and test case portability too much. I think it’s possible, but it needs covering if we want to scale the idea without resorting to the ugly code I currently have, which is why I am keen to pursue your idea.
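One way the platform-agnostic case could fit into a config-driven model is to let an element entry be either a plain selector or a mapping of per-platform overrides. This is only a sketch under that assumption: the `default` key, the platform names and the `resolve_selector` helper are all hypothetical, and a dict stands in for the parsed config.

```python
from typing import Dict, Union

# An entry is either one selector for every platform, or a mapping with a
# "default" selector plus optional per-platform overrides (assumed layout).
Entry = Union[str, Dict[str, str]]

ELEMENTS: Dict[str, Entry] = {
    "ok": "#id_ok_button",
    "menu": {"default": "#main-nav", "mobile": "#hamburger-menu"},
}


def resolve_selector(entry: Entry, platform: str) -> str:
    """Pick the selector for `platform`, falling back to the default."""
    if isinstance(entry, str):
        return entry
    return entry.get(platform, entry["default"])


mobile_menu = resolve_selector(ELEMENTS["menu"], "mobile")
desktop_menu = resolve_selector(ELEMENTS["menu"], "desktop")
```

The test code keeps using the same element name on every platform; only the config knows that the selector differs.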

…the place where things fell down for me in an earlier project (skunkworks and abandoned, but might be revived) was where some fields in a form can be optional. I attributed or decorated those so that the engine would know that some fields are disabled when the form loads, and that some fields, like the password field, are invisible until the user enters either an SSO name or a user ID, at which point we can detect whether to federate or not. So my configuration suddenly got full of little decorations. Rule of least power should, however, remain a goal, but it needs to fit the 80/20 rule very well. Going the route of being config driven gave me some advantages, but ultimately that project got mothballed before I could explore their superpowers fully.

One reason I also love config driven is that it removes problems with things like circular dependencies, so page composition tricks get easier to perform too. You can also get smart and use dynamic properties if your underlying language allows you to attribute objects at runtime. That converts code from
pcm.element('ok').click()
to
pcm.button_ok().click()
which has advantages in IDE editors, which then cache the button_ok() method because they know the lvalue actually has such a property (although this is not foolproof). It makes coding a lot easier because you can literally be duck typing half of the time. But be careful not to slip.

Notice how I converted element('ok') into button_ok() to imply that a thing is a button, not an edit box. This was super helpful whenever things needed to be smarter, e.g. when a control was really an image and not a button or an edit box. So the config could actually tell whether an element was supposed to be modifiable or not - for example, radio-button lists get much easier, but also more complicated once they are dynamic. Having to either disallow radio buttons in your UI by product policy or make test code deal with dynamic radio buttons becomes a time sink if you cannot get some power back.
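A rough sketch of the dynamic-property idea in Python, assuming each config entry carries a `kind` next to its selector. The `Page` and `Element` classes and the `kind`/`selector` keys are hypothetical names for illustration, not part of any existing library.

```python
class Element:
    """Hypothetical holder for one configured element."""

    def __init__(self, name, kind, selector):
        self.name = name
        self.kind = kind
        self.selector = selector


class Page:
    """Exposes each configured element as a `<kind>_<name>()` accessor."""

    def __init__(self, elements):
        self._accessors = {
            f"{spec['kind']}_{name}": Element(name, spec["kind"], spec["selector"])
            for name, spec in elements.items()
        }

    def __getattr__(self, attr):
        # Only reached when normal attribute lookup fails.
        accessors = self.__dict__.get("_accessors", {})
        if attr in accessors:
            element = accessors[attr]
            return lambda: element
        raise AttributeError(f"no element accessor named {attr!r}")

    def __dir__(self):
        # Lets interactive shells and some tooling list the generated names.
        return sorted(set(super().__dir__()) | set(self._accessors))


login = Page(
    {
        "ok": {"kind": "button", "selector": "#id_ok_button"},
        "username": {"kind": "textbox", "selector": "#id_username"},
    }
)
ok_button = login.button_ok()
```

Asking for `login.textbox_ok` raises `AttributeError`, so treating a button as an edit box fails at the point of access rather than later in the browser. Static IDE completion will generally not see these names without extra help such as generated stubs.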

Good idea, but doesn’t (or can’t) the page object model encapsulate/contain your page config model?

You still have to write code to implement the page object methods to interact with the page object. Good page objects extract the page behavior to higher level descriptive actions (that combine a sequence of actions performed behind the scenes) rather than low level individual atomic actions like click, sendkeys, etc.

I’ve not delved into the specifics of page object model guidelines for element locators but, based on my experience, nothing says you must follow the page object model exactly to the official spec. So in that sense, there’s no concrete rule to say how you organize the locators/selectors:

  • they could be (hard) coded in code
  • they could be abstracted out to a config file (YAML, JSON, XML, INI)
  • they could be stored in and fetched from a database

I assume the default is that folks store the locators in the code, so that everything is managed in a single codebase. But if you treat locators like test data, it is better to store them in config/input files. The code then just reads them into variables for use.

So it sounds like internally a page object implementation could just utilize your design for handling locators.
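As a sketch of that arrangement, a page object can take its locators from any source that yields a name-to-selector mapping, so the storage choice (code, config file, database) stays swappable. The names `LocatorSource`, `InlineSource`, `JsonSource` and `LoginPage` are hypothetical, and JSON text is used only because it needs no third-party parser.

```python
import json
from typing import Dict, Protocol


class LocatorSource(Protocol):
    """Anything that can produce a mapping of element name to selector."""

    def load(self) -> Dict[str, str]: ...


class InlineSource:
    """Locators hard-coded in the codebase."""

    def __init__(self, locators: Dict[str, str]) -> None:
        self._locators = locators

    def load(self) -> Dict[str, str]:
        return dict(self._locators)


class JsonSource:
    """Locators read from JSON text, e.g. the contents of a config file."""

    def __init__(self, text: str) -> None:
        self._text = text

    def load(self) -> Dict[str, str]:
        return json.loads(self._text)


class LoginPage:
    """A conventional page object that does not care where locators live."""

    def __init__(self, source: LocatorSource) -> None:
        self._locators = source.load()

    def username_selector(self) -> str:
        return self._locators["username"]


from_code = LoginPage(InlineSource({"username": "#id_username"}))
from_config = LoginPage(JsonSource('{"username": "#id_username"}'))
```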

I’m pretty sure, @daluu (I have not looked at your code yet, Colm), that Colm is using a “page object” thingy underneath and on top of the config loader layer, and following a good bit of SOLID software principles all along. And yes, a way to abstract the config to allow storage choices would help, because some config schemas have so many problems that we cannot go into them here. But yes, David, I agree: when you read Martin Fowler between the lines and read the Selenium implementation, both point to the topic not being a closed or complete pattern at all.

Nice idea, @hitchdev

Thanks! And thank you for really engaging with the idea. I do appreciate it.

Okay, I’m seeing a few things here that tell me we are either using different terminology or just shifting the selectors from code into config or metadata. Is that right?

Yes, that’s exactly it.

You mention “selector debt”. I mean, if you have no problem with your debt, is moving it into a YAML file going to make life better for engineers?

Maybe. Two reasons:

  • If the selectors are no longer spread out all across the code base and are consolidated in a few files, you can see all of the selectors at a glance. That makes it easy to spot general problems and to do general maintenance - e.g. a front end developer could clean it up by adding IDs to the elements that tests interact with and altering the selectors without having to change any tests.

  • It opens up a new use case - giving scores to selectors and aggregating those scores. For example “//xpath/[0]/1/a/lot/of//rubbish[@xyz=43431]” could be given a score of 0.1 because it’s got flakiness written all over it, and “#next-button” could be given a score of 1 because ID selectors are about as solid as you can get. A script can give your code base an average of 0.85 (good) or 0.3 (not good). It could list the selectors by order of “problematicness”.

The second is just an idea for now, but I’m interested in pursuing it a bit further.
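A sketch of what such a scoring script might look like. The heuristics and the intermediate scores (0.3, 0.6) are made-up assumptions for illustration; only the 1.0 for ID selectors and 0.1 for a long XPath follow the examples above. `score_selector` and `selector_debt_report` are hypothetical names.

```python
import re
from statistics import mean


def score_selector(selector: str) -> float:
    """Crude robustness score between 0 (fragile) and 1 (solid)."""
    if re.fullmatch(r"#[\w-]+", selector):
        return 1.0  # a bare ID selector
    if selector.startswith("/"):
        return 0.1  # XPath, usually tied to document structure
    if re.search(r":nth-(child|of-type)\(", selector):
        return 0.3  # positional CSS, breaks when siblings are reordered
    return 0.6  # any other CSS selector


def selector_debt_report(config):
    """Return (average score, entries sorted worst first)."""
    scored = [
        (score_selector(selector), page, name, selector)
        for page, body in config.items()
        for name, selector in body["element"].items()
    ]
    scored.sort(key=lambda entry: entry[0])
    return mean(score for score, *_ in scored), scored


CONFIG = {
    "login": {"element": {"ok": "#id_ok_button", "help": "//div[3]/form/a[2]"}},
    "dashboard": {"element": {"second_row": "ul > li:nth-child(2)"}},
}

average, ranked = selector_debt_report(CONFIG)
for score, page, name, selector in ranked:
    print(f"{score:.1f}  {page}.{name}  {selector}")
```

Tracking `average` over time would give the trend in selector debt, and the sorted list shows where to start cleaning up.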

To me this is a code cognitive “load” because “login” is no longer a static; it’s only possible to detect typos in variations of page names or missing pages at runtime now - is that intended?

It’s not intentional, and I would like to prevent it. I’m very keen on type safety and would love for this to fail quickly on a typo.
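Short of real static typing, one possible way to fail quickly is to check every page and element name a test suite refers to against the config before any browser starts, and to suggest the closest known name. This is a sketch only: `validate_references`, `check_name` and `UnknownNameError` are hypothetical, and how the list of references is collected is left open.

```python
import difflib


class UnknownNameError(LookupError):
    """Raised when a test refers to a page or element the config lacks."""


def check_name(name, known, what):
    """Return `name` if it is known, else raise with a 'did you mean' hint."""
    if name in known:
        return name
    close = difflib.get_close_matches(name, list(known), n=1)
    hint = f"; did you mean {close[0]!r}?" if close else ""
    raise UnknownNameError(f"unknown {what} {name!r}{hint}")


def validate_references(config, references):
    """Check (page, element) pairs up front, before running any test."""
    for page, element in references:
        check_name(page, config, "page")
        check_name(element, config[page]["element"], "element")


CONFIG = {
    "login": {"element": {"username": "#id_username", "ok": "#id_ok_button"}},
    "dashboard": {"element": {"message": "#id_dashboard_message"}},
}

# These references are all valid, so nothing is raised.
validate_references(CONFIG, [("login", "username"), ("dashboard", "message")])
```

A misspelt reference such as `("logni", "ok")` raises `UnknownNameError` during this check instead of halfway through a test run.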

I think it would be possible to use page.ok.click() by dynamically constructing classes from the pages and returning the page as an object, exactly as you suggest in your second post. I’ll have a long think about this.

I’m assuming you can support composition of pages using this model? How would a person compose pages? For example, what if a new login page was added that supported federation-type buttons (Google etc.) instead of username + password fields? Can config-driven page object creation still handle that kind of product change?

Wouldn’t that just mean adding a new selector for the new “log in with Google” button and writing a new story that uses that?

What exactly do you mean by composition of pages?

I’m not even sure I ever want to type pcm.element("ok"), because "ok" is a constant and itself becomes problematic from a DRY perspective.

Why so?

I’m probably sounding mega critical,

No, mega critical is exactly what I want, thank you.

but that’s only because I’m working towards a platform-agnostic page object model, where elements on pages on different platforms or on compact screen sizes change selector.

What’s the scenario you are thinking of here? I have some pages which appear very differently on, say, phone and PC, but while some elements might disappear or resize, selectors generally remain the same.

Notice how I converted element('ok') into button_ok() to imply that a thing is a button, not an edit box. This was super helpful whenever things needed to be smarter, e.g. when a control was really an image and not a button or an edit box. So the config could actually tell whether an element was supposed to be modifiable or not - for example, radio-button lists get much easier, but also more complicated once they are dynamic. Having to either disallow radio buttons in your UI by product policy or make test code deal with dynamic radio buttons becomes a time sink if you cannot get some power back.

Can you give an example use case here? I’m not really sure why you’d need to use “page.button_ok.click()” instead of “page.ok.click()”.

Whether it’s an image, a text box or a radio button, it’s still a thing with a selector which you can click. If you try to modify something which is not modifiable then, well, the test will fail.

You still have to write code to implement the page object methods to interact with the page object. Good page objects extract the page behavior to higher level descriptive actions (that combine a sequence of actions performed behind the scenes) rather than low level individual atomic actions like click, sendkeys, etc.

This could fit in the page config model. For example:

login:
  element:
    username: "#username"
    password: "#password"
    ok: "#ok"
  steps:
    login as ben:
      - fill form:
          username: ben
          password: password
      - click: ok

That’s edging on programming though. I don’t want to fall into the trap of creating another YAML programming language like GitHub Actions or Ansible :)
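For illustration, an interpreter for the two step types in that config could stay very small. This is a sketch under assumptions: `run_steps` and `RecordingPage` are hypothetical names, a dict stands in for the parsed YAML, and the recording class replaces a real browser page so the example runs on its own.

```python
class RecordingPage:
    """Stand-in for a browser page; records the actions it is asked to do."""

    def __init__(self):
        self.actions = []

    def fill(self, selector, value):
        self.actions.append(("fill", selector, value))

    def click(self, selector):
        self.actions.append(("click", selector))


def run_steps(page, page_config, step_name):
    """Run one named step list, resolving element names to selectors."""
    elements = page_config["element"]
    for step in page_config["steps"][step_name]:
        ((action, argument),) = step.items()  # each step is a one-key mapping
        if action == "fill form":
            for name, value in argument.items():
                page.fill(elements[name], value)
        elif action == "click":
            page.click(elements[argument])
        else:
            raise ValueError(f"unsupported step type: {action!r}")


LOGIN = {
    "element": {"username": "#username", "password": "#password", "ok": "#ok"},
    "steps": {
        "login as ben": [
            {"fill form": {"username": "ben", "password": "password"}},
            {"click": "ok"},
        ]
    },
}

page = RecordingPage()
run_steps(page, LOGIN, "login as ben")
```

Rejecting any step type outside a short fixed list is one way to keep the config from growing into a programming language.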

Controversial opinion here:

I actually don’t like putting these atomic actions inside the page object model or page config model. I find that those higher level descriptive actions usually end up throwing critical details of the spec under the bonnet. I’ve read loads of tests that use a vague page object method like “fill in user details”, where the critical details which explained the what and the why were buried inside the page model.

I get why this happens. They belong in some kind of abstraction because they will obviously become very repetitive, but I don’t necessarily think that they should share the same abstraction that you use to abstract CSS selectors and XPaths. They belong, I think, under some kind of separate “specification” abstraction - one that could theoretically be drilled into by the client, rather than an HTML interaction pattern that nobody but testers and front end devs care about.

So it sounds like internally a page object implementation could just utilize your design for handling locators.

In theory yes, just as you could put one page object model inside another, but my goal with this pattern is to pull apart “specification details” and “interaction details” and keep them in two separate layers with their own abstraction mechanisms.

That for me has been the underlying distraction quite often too. I want my page object to be used in a “fluent” test writing style, a bit like Cucumber or the Robot test language does. But both of those hide stuff under the bonnet and fail to split the two problems of “specification” and “natural” interactions that the tester sometimes needs to separate. I refuse to believe that I cannot have separation of business logic and nice, pretty, easy to use and consistent page objects at once. I really do hope the new web project team we are running invites me to join them; then I can give this idea a different stab. The model is just that: a way of seeing all web pages. The spec or business logic needs to live entirely in the test case somehow, just without all the copy-paste and repetition if possible. Perhaps a really clean and pure page object is the right route.