Introduction
Welcome!
In the realm of software test automation, Page Objects are indispensable. They serve as a bridge between the intricacies of the applications and the automated test scripts that ensure their functionality and reliability. When implemented effectively, Page Objects can simplify the creation and maintenance of test automation scripts, making them more robust and adaptable to changes in the user interface (UI).
However, like any powerful tool, Page Objects are not immune to pitfalls that can hinder the effectiveness of your test automation efforts. These pitfalls, if left unaddressed, can lead to brittle tests, increased maintenance overhead, and frustrated automation teams. In this article, we’ll delve into the world of Page Objects and explore the common mistakes and missteps that can plague their implementation.
So, fasten your seatbelts and get ready to navigate the terrain of Page Object best practices. Feel free to follow my Telegram channel as well, where you can find more content about QA and development.
Keep locators open for interaction
Once I got a rather simple implementation task. Something changed in a simple user action on a certain page. An additional click was added or removed — it doesn’t matter, the focus is on changes in the behavior of our page.
“Well, what could be complicated here, you just need to change the corresponding method” — I thought. And I was right… Partly…
As soon as the changes were made, I ran all the tests to make sure they would apply correctly. And as a result… several tests kept failing.
After a little research, it turned out that the locators on this page were open for external interaction and someone took advantage of this, bypassing the Page Object layer in the test itself. So the test failures after correction are a consequence of violating the encapsulation principle of the page-object.
Here is how it looked in a simple application:
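// Main page Page Object class
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.support.FindBy;
import org.openqa.selenium.support.How;
import org.openqa.selenium.support.PageFactory;
public class MainPage {
    private final WebDriver driver;
    // The field is public, so any test can reach the element directly
    @FindBy(how = How.CSS, using = "a.btn-primary.sweets")
    public WebElement buttonBrowse;
    public MainPage(WebDriver driver) {
        this.driver = driver;
        PageFactory.initElements(this.driver, this);
    }
    public void clickBrowseButton() {
        this.buttonBrowse.click();
    }
}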
In the code, you can see that the locator is open for interaction directly from the test. This is a bit puzzling, because now I have two options for interacting with the element:
Call the Selenium click() method for buttonBrowse
Call the clickBrowseButton() method
Take a pause and think about which way you would choose to implement and why.
I still prefer the second one, because the first one does not take into account the following points:
I don’t know exactly what should happen when you click on a button. This can be either a simple click or a complex logic like “wait until the button is active and then click on it”. The click() method only covers the first option. It's better to delegate the click logic to a special page object that is "trained" to work with elements correctly
Let’s now think about what happens if we need to rename the element. If tests use the fields of the Page Object directly, every test that uses this page and reads this field ends up in the list of changes after the rename. This significantly increases the number of files in the Pull Request for a relatively simple task, which complicates the Code Review process
And what if the click logic changes? For example, the button is now not immediately displayed on the page, but is loaded depending on the data received from the API. And again, we return to the fact that we will have to change the logic in all the tests this field is used in. In addition to this, the use of “direct” interaction in tests does not give an understanding from the code of how this logic works (unlike the method, which I can open and see that we are waiting for a response from the API)
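To make the second option concrete, here is a minimal sketch of an encapsulated page object. It is not the code from the project: `Button`, `EncapsulatedMainPage` and `DelayedButton` are hypothetical names, and `Button` is a stand-in for Selenium's `WebElement` so that the example runs without a browser. In a real Selenium project the polling loop would most likely be replaced by `WebDriverWait` with `ExpectedConditions.elementToBeClickable`.

```java
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for Selenium's WebElement, so the sketch runs without a browser.
interface Button {
    boolean isEnabled();
    void click();
}

// The element is private: tests can only reach it through clickBrowseButton().
class EncapsulatedMainPage {
    private final Button buttonBrowse;

    EncapsulatedMainPage(Button buttonBrowse) {
        this.buttonBrowse = buttonBrowse;
    }

    // All click logic lives here, so a behavior change touches one method only.
    void clickBrowseButton() {
        waitUntil(buttonBrowse::isEnabled, 50);
        buttonBrowse.click();
    }

    // Naive polling wait, used only to keep the sketch dependency-free.
    private static void waitUntil(BooleanSupplier condition, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (condition.getAsBoolean()) {
                return;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting", e);
            }
        }
        throw new IllegalStateException("Element did not become enabled in time");
    }
}

// Fake button that becomes enabled on the third check, imitating a delayed UI.
class DelayedButton implements Button {
    int checks = 0;
    int clicks = 0;

    @Override
    public boolean isEnabled() {
        checks++;
        return checks >= 3;
    }

    @Override
    public void click() {
        clicks++;
    }
}
```

Because the field is private, a test has no way to call `click()` on the element itself. When the button starts loading with a delay, only `clickBrowseButton()` changes and the tests stay untouched.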
Store elements, not locators in class fields
One day I decided to improve our page objects to make developing automated tests more readable and convenient. My suggestion was to have a “Pages hub”: a place where all pages are stored. It seemed very convenient to call the pages like this:
Pages.main.clickBrowseButton();
The main advantage was that I no longer had to keep the whole set of pages in the project (and all their class names) in mind. Now I had only one class and could easily find the required page.
After I refactored the code, my tests started failing whenever more than one test was run. After a little research I understood that the problem was related to the proxy pattern used for initializing the elements, so you need to be careful with the pages hub initialization (where and how you do it).
The final solution I came up with was to store locators instead of elements in page objects:
public class AjaxDataPage {
    private By buttonAjax = By.id("ajaxButton");
    private By ajaxContent = By.cssSelector("#content > p");
    // ...
    public void loadContent() {
        this.driver.findElement(buttonAjax).click();
    }
    // ...
}
The approach is pretty much the same, but:
The WebDriver can be extracted elsewhere, so I don’t need a constructor in every page object (e.g. create a BasePage class and initialize the driver there, or write a singleton class for WebDriver)
No more PageFactory.initElements() calls to initialize the page object
No @FindBy annotation on each field, so the class looks much cleaner
No more headaches about the proxy pattern that Selenium annotations use
I’m pretty sure this point is arguable, as the case is rare. Anyway, the page object class looks cleaner, and it simplifies working with such objects. The next step could be to create a wrapper class for the element where complicated logic can be stored, not only clicks, sending keys, etc. But let’s get back to our main topic.
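Here is a rough sketch of how a “Pages hub” can coexist with locators stored as plain data. It reflects one plausible reading of the failure described above, not a confirmed diagnosis: elements bound to the first test's session go stale, while a locator is resolved against whatever session is current at call time. All names (`Browser`, `BrowserHolder`, `AjaxPage`, `RecordingBrowser`) are hypothetical, and `Browser` is a simplified stand-in for Selenium's `WebDriver` so the example runs on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Selenium's WebDriver: it only records clicked locators.
interface Browser {
    void click(String cssLocator);
}

// Hypothetical holder for the current browser session (one per test).
final class BrowserHolder {
    private static Browser current;

    private BrowserHolder() {
    }

    static void set(Browser browser) {
        current = browser;
    }

    static Browser get() {
        if (current == null) {
            throw new IllegalStateException("Browser is not initialized");
        }
        return current;
    }
}

// Pages ask for the browser at call time, so they need no constructor of their own.
abstract class BasePage {
    protected Browser browser() {
        return BrowserHolder.get();
    }
}

class AjaxPage extends BasePage {
    // A locator is plain data: nothing is bound to a particular session here.
    private final String buttonAjax = "#ajaxButton";

    void loadContent() {
        browser().click(buttonAjax);
    }
}

// The "Pages hub": created once, reused by every test.
final class Pages {
    static final AjaxPage ajax = new AjaxPage();

    private Pages() {
    }
}

// Fake browser used to observe what the page object does.
class RecordingBrowser implements Browser {
    final List<String> clicks = new ArrayList<>();

    @Override
    public void click(String cssLocator) {
        clicks.add(cssLocator);
    }
}
```

With this layout, `Pages.ajax.loadContent()` works in the first test and in every test after it, because the page holds no reference to a session that has already been closed. A static holder like this is not safe for parallel runs as written; a `ThreadLocal` would be the usual adjustment.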
Page Object contains the full page URL
One of the advantages the Page Object pattern can provide is opening a page directly. To use this advantage, the Page Object has to store the URL the page can be opened by. The obvious trick here (at least I hope so 🙂) is to store the URL not as an absolute one, but as a relative one.
Let me tell you what happened with absolute URLs. There was only one environment for tests, so at first glance it did not matter whether we had a configuration class or not. There was also someone who had chosen the most complicated way to open the required page in the tests: open the main page and then click through to the required one. Is it worth saying that such an approach increases execution time compared to opening the page directly by its link? In such a situation you are tempted to use the full URL of the page, but please don’t.
To avoid hardcoding, store the base (application) URL in a config file that is read before test execution and applied to the Page Object dynamically. We can choose the environment by passing the ENV variable when running the tests, or by setting a parameter on the CI/CD server for a specific build.
Another benefit of parameterizing the page URL is that it becomes easy to move between applications (especially if you have a multi-domain architecture).
Let’s see the example:
// Ajax page we're going to interact with
// Page domain has changed? Just replace the UiTestingPlaygroundPage class with the new one
public class AjaxDataPage extends UiTestingPlaygroundPage {
    public AjaxDataPage(WebDriver driver) {
        super(driver);
        this.url = "/ajax";
    }
    // loadContent(), isAjaxContentLoaded() etc.
}
// Base page (application page)
public class UiTestingPlaygroundPage {
    // Not using hardcode but the config file instead
    private final String baseUrl = Config.getBaseTestingPlaygroundBaseUrl();
    protected final WebDriver driver;
    protected String url;
    public UiTestingPlaygroundPage(WebDriver driver) {
        this.driver = driver;
    }
    // Keep this method in base page so we can use it in any page with the specific base url
    // Basically, it's worth creating a "Base" class for all the pages and implementing the common methods there
    public void open() {
        this.driver.get(String.format("%s%s", this.baseUrl, this.url));
    }
}
// Test example
@Test
void testAjaxPage() {
    AjaxDataPage page = new AjaxDataPage(this.driver);
    page.open();
    page.loadContent();
    // Assumes isAjaxContentLoaded() returns a boolean; the check itself stays in the test
    assertTrue(page.isAjaxContentLoaded());
}
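The `Config` class the base page relies on is not shown here, so below is a minimal sketch of how it might resolve the base URL from the ENV variable. Everything in it is an assumption made for illustration: the environment names, the hosts, and the `baseUrlFor` and `pageUrl` helpers. A real project would more likely read these values from a properties or YAML file.

```java
import java.util.Map;

final class Config {
    // Hypothetical environment names and hosts, for illustration only.
    private static final Map<String, String> BASE_URLS = Map.of(
            "local", "http://localhost:3000",
            "staging", "https://staging.example.com");

    private Config() {
    }

    // Resolves the base URL for an environment name; falls back to "local".
    static String baseUrlFor(String env) {
        String key = (env == null || env.isBlank()) ? "local" : env;
        String baseUrl = BASE_URLS.get(key);
        if (baseUrl == null) {
            throw new IllegalArgumentException("Unknown environment: " + key);
        }
        return baseUrl;
    }

    // Reads the ENV variable, e.g. ENV=staging set locally or by the CI/CD build.
    static String getBaseTestingPlaygroundBaseUrl() {
        return baseUrlFor(System.getenv("ENV"));
    }

    // Joins the base URL with a page's relative path, the same way open() does.
    static String pageUrl(String env, String relativePath) {
        return baseUrlFor(env) + relativePath;
    }
}
```

Failing fast on an unknown environment name is a deliberate choice in this sketch: a typo in the CI parameter surfaces immediately instead of silently running the suite against the wrong host.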
Operating with WebElements in complex methods
When we talk about methods in Page Objects, keep in mind that the page object itself is a full-fledged entity, responsible for actions on this page. This formulation seems clear when it comes to simple interactions like a click or text input — you need to create separate methods for these operations, not call fields from the page object in the test. However, what to do when it comes to non-atomic actions?
A good example is user login. Being on one page, we perform several actions — we enter the login, password, and press the login button. An example of code describing this function could be as follows:
public class LoginPage extends BasePage {
    // locators, constructors etc.
    public void login(String username, String password) {
        this.usernameTextbox.type(username);
        this.passwordTextbox.type(password);
        this.loginButton.click();
    }
}
Unfortunately, such an organization repeats the mistake of interacting with elements directly and carries the same consequences: if something changes in the behavior of the application, we have to change the actions performed on an element in every place that element is used.
Let’s imagine that I have a second method that is responsible for the user’s login by saving session data:
public class LoginPage extends BasePage {
    // locators, constructors etc.
    public void login(String username, String password) {
        this.usernameTextbox.type(username);
        this.passwordTextbox.type(password);
        this.loginButton.click();
    }
    // Epic naming, I know ...
    public void loginForever(String username, String password) {
        this.usernameTextbox.type(username);
        this.passwordTextbox.type(password);
        this.rememberMeCheckbox.check();
        this.loginButton.click();
    }
}
Now, if something changes in the login behavior, I will need to update the code in 2 methods.
This example is quite trivial, it is easy to understand, but in practice there are significantly more complex cases that will require more mental effort from you.
What to do? The answer is obvious: create atomic methods for each operation. This moves the responsibility for repeated actions into separate methods and gives more flexibility, since you can use the atomic methods in tests for specific cases where a long scenario is not required. But be careful: I do not recommend storing long scenarios in page objects. All page object methods must be short and logically complete:
public class LoginPage extends BasePage {
    // locators, constructors etc.
    public void fillLogin(String username) {
        this.usernameTextbox.type(username);
    }
    public void fillPassword(String password) {
        this.passwordTextbox.type(password);
    }
    public void clickLogin() {
        this.loginButton.click();
    }
    public void setRememberMe(boolean state) {
        // A ternary cannot be used as a statement in Java, hence if/else
        if (state) {
            this.rememberMeCheckbox.check();
        } else {
            this.rememberMeCheckbox.uncheck();
        }
    }
    public void login(String username, String password) {
        fillLogin(username);
        fillPassword(password);
        clickLogin();
    }
    public void loginForever(String username, String password) {
        fillLogin(username);
        fillPassword(password);
        setRememberMe(true);
        clickLogin();
    }
}
Now you can assemble complex actions from simple ones, like a construction set. At the same time, when the behavior of a simple method changes, you only need to correct one place; the other methods remain untouched.
Asserting in Page Object
At first glance, this idea seems logical. We have a page object, so why can’t we delegate to it the verification of the presence of a certain element/text/something else? It seems that such delegation is a continuation of the previous point, where we make a more flexible structure of pages by creating atomic methods and constructing more complex ones from them.
Okay, let’s say we go this way. What could go wrong?
Firstly, we smear the verifications across the framework instead of keeping them in the tests. If we move part of the verification logic into the page object, the page object classes start to depend on the library used for running tests and making assertions. Not convincing? Probably.
The second point to consider is the redundant methods that will inevitably appear when transferring the layer of checks to the page object. I’ll show this with an example:
public class HomePage extends BasePage {
    // locators, constructors etc.
    public String getHeaderText() {
        return this.header.getText();
    }
    public void checkHeaderText(String text) {
        String actual = getHeaderText();
        assertEquals(actual, text, "expected text: '" + text + "', but was: '" + actual + "'");
    }
}
It seems like our structure has become slightly overloaded. Instead of having one method to retrieve some data from the page, we’re adding another one responsible for validation. But what if we need to add a check that the text differs from the given one, contains a specific sequence of characters, etc.? For each variation, we’ll have to write a separate method in the page object. As a result, we’ll end up with quite a large chunk of code.
This argument looks more convincing already. Is there another one?
Certainly. It’s worth mentioning that the Selenium developers themselves recommend keeping validations in the tests rather than in the page objects.
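A small sketch of the alternative: the page object only returns data, and every variation of the check lives on the test side. The names are hypothetical, the `Supplier<String>` stands in for a real element lookup, and the tiny `check` helper replaces the assertion library (JUnit, TestNG, AssertJ) that a real test would use.

```java
import java.util.function.Supplier;

// Page object: exposes data only and knows nothing about assertions.
class ShopHomePage {
    // Hypothetical stand-in for a driver lookup of the header element's text.
    private final Supplier<String> headerSource;

    ShopHomePage(Supplier<String> headerSource) {
        this.headerSource = headerSource;
    }

    String getHeaderText() {
        return headerSource.get();
    }
}

// Test side: every variation of the check reuses the same getter.
class ShopHomePageTest {
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    static void headerChecks(ShopHomePage page) {
        String header = page.getHeaderText();
        check(header.equals("Sweet shop"), "header should match exactly");
        check(header.contains("shop"), "header should mention the shop");
        check(!header.equals("Log in"), "header should differ from the login title");
    }
}
```

Three different checks need one page object method. Adding a fourth (a regular expression, a length limit) changes only the test, and the page object never imports an assertion library.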
Describe same elements in different Page Objects
Modern web applications are built on frameworks such as React, Vue, Angular (add your own to this list 🙂). This means that all pages use components and layouts to display various elements, i.e., during development, a piece of code is written once and then embedded into the necessary pages.
I’ve also encountered this in my practice. Instead of following a similar approach in tests, a framework was created that adhered to the principle of “tests should be as simple as possible.” I’m not disputing this statement by the way, but it’s important to distinguish between “simple” and “clumsy” 🙂
As expected, the structure of the page changed at some point, and all tests using a certain component started failing. Do I need to explain how much time was spent changing the old logic in all pages of the project and how large the Pull Request became afterwards? This could have been easily avoided by not duplicating elements on the page.
So what should we do? As mentioned above, reusable page elements should be extracted into components and then connected to the page itself as dependencies. As for layouts, I prefer the solution of inheriting from specific page objects:
/*
We inherit the layout here to get access to common admin elements like Edit/Delete buttons, Sidebars etc
Do it for every admin page with the same layout
*/
public class ContactListPage extends AdminLayout {
    // And here we have a component injection
    public final ContactListComponent contactsList;
    public ContactListPage(WebDriver driver) {
        super(driver); // assumes AdminLayout stores the driver and exposes adminBar
        this.contactsList = new ContactListComponent(driver);
    }
    // ...
}
// Example of component code
public class ContactListComponent {
    private final WebDriver driver;
    private final By selectAllCheckbox = By.cssSelector("#select-all");
    public ContactListComponent(WebDriver driver) {
        this.driver = driver;
    }
    public void selectAllContacts() {
        // A By is only a locator: find the element first, then click it
        driver.findElement(selectAllCheckbox).click();
    }
}
@Test
public void allContactsDelete() {
    ContactListPage page = new ContactListPage(driver); // driver is already initialized
    page.contactsList.selectAllContacts();
    /*
    This is a common component for admin view so we can use it thanks to AdminLayout
    inheritance.
    */
    page.adminBar.delete();
}
Actually, using components opens up more possibilities for us. For example, we can use this approach for easy automation of responsive design or implementation of the strategy pattern if we’re writing E2E tests in a microservices architecture. I understand that it may not sound very clear, but that’s not the goal of this article. If you’re interested in what is meant by easy automation of responsive design and the strategy pattern then 👏, I’ll definitely delve into these topics in the future.
Not using DSL for Page Object methods
I really like the idea of OOP. Mostly because we write programs not as we were taught in school (scripts, a set of commands), but as if we interact with the world through objects. From the outside, we don’t know how a particular object behaves, but we know that we can fully trust it to solve a certain task and not interfere, altering its behavior with our logic. This approach significantly changes the way we live and think, but that’s not the point right now 🙂
Since a Page Object is also an object, it should be autonomous and independent. This requirement imposes certain restrictions on the naming of methods within. For example, if we consider an object as just an instance of a class, then the method name clickStartButton() doesn't seem so bad. However, when we perceive an object as an autonomous unit capable of solving tasks for us, such naming doesn't seem like a good idea anymore.
And why is that? Let me try to explain. A page object is created to solve tasks related to interacting with the page. But clicking a button itself is not a fully fledged task; rather, it’s an action. It’s more important to ask the question: “Why are we clicking the button? What result do we want to achieve by pressing it?”
If we pay attention to these two questions during the design and implementation of the page object, our methods automatically become more deliberate, giving the user (QA automation engineer) more information about what we are doing and why. Compare these two method names: clickStartButton() and startQuiz(). Which one provides more context? Which one represents a logically complete business action?
Try writing a test using this technique, and you’ll be surprised how pleasant it is to read afterwards. Try letting someone who is unfamiliar with the product read this test and ask them what it does. I bet $10 they’ll understand much more from the DSL description than from interpreting a sequence of clicks and data entry.
When I talk about the DSL approach in tests, I’m not just voicing my own thoughts, which are the result of a long search and work on numerous projects; I’m also referring to the Selenium developers, who recommend this approach when designing Page Objects. If you’re hearing about this for the first time, I highly recommend exploring it and trying it out.
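As a rough illustration of the naming idea, here is a sketch of a quiz page whose public methods describe business actions while clicks and locators stay private. The page, its locators and the `performedActions()` helper are all hypothetical; the action log stands in for real driver calls so the example runs on its own.

```java
import java.util.ArrayList;
import java.util.List;

class QuizPage {
    // Log of low-level UI actions; a stand-in for real driver calls.
    private final List<String> uiActions = new ArrayList<>();

    // Business-level methods: the names say why we touch the page.
    void startQuiz() {
        click("#start");
    }

    void answer(String option) {
        click("input[value='" + option + "']");
        click("#next");
    }

    void submitQuiz() {
        click("#submit");
    }

    // Exposed only so the sketch can show what happened under the hood.
    List<String> performedActions() {
        return List.copyOf(uiActions);
    }

    // Low-level detail stays private; tests never see locators or clicks.
    private void click(String cssLocator) {
        uiActions.add("click " + cssLocator);
    }
}
```

A test built on this page reads as `quiz.startQuiz(); quiz.answer("B"); quiz.submitQuiz();`, which someone unfamiliar with the product can follow without knowing that answering takes two clicks.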
Don’t utilize Fluent API pattern
I don’t know why, but automation testers don’t seem to like using this pattern on real projects. In my opinion, that’s quite undeserved. Chaining methods is much more convenient than dragging a bunch of page objects into a test or scenario class and “losing context” while writing the test itself.
I already have a couple of articles on this topic that show how much neater the code becomes.
If we’re talking about testing, this pattern has another huge advantage. When our code “knows” which page should open after a certain action, we stop thinking about it. This means that we can focus on the test logic instead of constantly returning to the question “Which page will open after the action I perform?” Just try it, and you’ll see how convenient it is 🙂
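A minimal sketch of the idea, under the assumption of a simple login flow: each method returns the page object the user ends up on, so the compiler carries the navigation knowledge. `FluentLoginPage`, `DashboardPage` and `ProfilePage` are hypothetical names, and the shared `steps` list stands in for real driver calls.

```java
import java.util.List;

// Each method returns the page the user lands on, so calls can be chained.
class FluentLoginPage {
    private final List<String> steps;

    FluentLoginPage(List<String> steps) {
        this.steps = steps;
    }

    FluentLoginPage fillLogin(String username) {
        steps.add("type username " + username);
        return this; // still on the login page
    }

    FluentLoginPage fillPassword(String password) {
        steps.add("type password");
        return this; // still on the login page
    }

    DashboardPage clickLogin() {
        steps.add("click login");
        return new DashboardPage(steps); // this action leads to another page
    }
}

class DashboardPage {
    private final List<String> steps;

    DashboardPage(List<String> steps) {
        this.steps = steps;
    }

    ProfilePage openProfile() {
        steps.add("open profile");
        return new ProfilePage();
    }
}

class ProfilePage {
    String title() {
        return "Profile";
    }
}
```

A test can then be written as one chain: `new FluentLoginPage(steps).fillLogin("john").fillPassword("secret").clickLogin().openProfile()`. Calling `openProfile()` before `clickLogin()` does not compile, which is exactly the "code knows which page opens next" effect.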
This pattern seems extremely simple to me, but I rarely encounter such implementation in practice. Am I mistaken, and does it actually cause difficulties in work? 🤔 Share your opinion in the comments.
Conclusion
Today we’ve discussed typical mistakes in designing page objects. If you want to recap, keep in mind that the subtitles describe the mistakes, so each one should be read in the negative. For example, if the subtitle says “Describe same elements in different Page Objects”, add a negation to turn the bad practice into a good one: “Don’t describe same elements in different Page Objects”.
Don’t forget to share your thoughts in the comments, and keep automating! Bye for now!