Having years of experience with big Selenium clusters, we in the Aerokube team are trying to change the Selenium testing world with really efficient tools that are free for everyone to use. You may have heard about our lightweight Selenium server replacement called Selenoid and our extremely efficient load balancer for big clusters called Ggr. If you already have a Selenium cluster or are planning to deploy one, an important question to think about is: “How would I maintain such a cluster?” or “Is it clear how to work with such a cluster from a systems administrator’s point of view?”. Today we are going to make your systems administrator happy with a Selenium cluster as clear as a bell.

Clear Documentation

Short but sufficiently detailed documentation is an important part of any self-respecting software project. Selenium has traditionally had rather poor, scrappy and very often outdated documentation stored on wiki pages. This is very annoying: the official documentation mainly covers usage aspects such as creating tests and does not provide a strong foundation for setting up reliable infrastructure. Numerous articles about Selenium Grid continue to copy-paste the same commands, and tens of thousands of StackOverflow questions about correctly using CSS selectors distance your systems administrator even further from making your development team happy. We have made an effort to deliver a single point of truth about our tools. The following links lead to genuine documentation:

  • Selenoid UI — a short demonstration of the Selenoid UI, our standalone user interface for Selenoid.
  • Configuration Manager — more details about cm, a magic wand doing all the routine Selenoid installation and configuration work for you.
  • Ggr — everything you wish to know about Ggr.

Responsive Support

It is impossible to describe every aspect of using a piece of software, even if you are an 80th-level documentation expert. You should always be able to quickly find answers to tricky or platform-specific questions. This is why a good open-source project should always have at least one responsive support channel. We provide at least three official support channels:

  1. GitHub issues in the respective repositories — a good place for bug reports and feature requests.
  2. Then — a faster Telegram support chat. We are proud to have almost 300 permanent members in this chat, so most newbie questions are answered almost immediately.
  3. Finally — we have a StackOverflow tag and from time to time ask our users to post tricky questions there.

You are welcome to use the best-fitting channel if you have any questions. Let’s now move to the most interesting part — the practice.

Measuring Software Efficiency

Health API

Large-scale software is often installed behind load balancers that proxy requests only to healthy application instances. Every application should therefore provide a way to determine whether it is alive. One possible approach for HTTP-based applications is to return health status via HTTP, i.e. to provide a health API. Our tools, Ggr and Selenoid, follow this pattern and return their health status on /ping. A healthy instance always returns HTTP 200 and some additional information in JSON format:

$ curl -s -D- http://my-ggr-host.example.com:4444/ping
HTTP/1.1 200 OK
Date: Mon, 11 Dec 2017 03:36:39 GMT
Content-Length: 125
Content-Type: text/plain; charset=utf-8
{"uptime":"60h9m36.257828483s","lastReloadTime":"2017-12-08 18:27:03.632220529 +0300 MSK m=+0.009773971","numRequests":4082}

Dealing with Logs

Now that you know how to easily find sick Selenoid and Ggr instances, let’s move on to logging. Application logs are used by systems administrators every day, so a handy logging architecture is essential for efficient work. Modern application deployments tend to use centralized log storage, which gives a single entry point to all logs as well as fast log-searching capabilities. A typical stack consists of:

  • Elasticsearch — a storage engine keeping your logs
  • Logstash — a daemon to process your logs
  • Kibana — a cool user interface to view the logs

All three can be started with a docker-compose.yml like this:

version: '3'
networks:
  elk:
volumes:
  elasticsearch:
    driver: local
services:
  elasticsearch:
    environment:
      http.host: 0.0.0.0
      transport.host: 127.0.0.1
    image: docker.elastic.co/elasticsearch/elasticsearch:6.2.1
    networks:
      elk: null
    ports:
      - 9200:9200
    restart: unless-stopped
    volumes:
      - elasticsearch:/usr/share/elasticsearch/data:rw
  logstash:
    image: docker.elastic.co/logstash/logstash-oss:6.2.1
    depends_on:
      - elasticsearch
    networks:
      elk: null
    ports:
      - 5044:5044
    restart: unless-stopped
    volumes:
      - ./etc/logstash/pipeline:/usr/share/logstash/pipeline:ro
  kibana:
    depends_on:
      - elasticsearch
    environment:
      ELASTICSEARCH_PASSWORD: changeme
      ELASTICSEARCH_URL: http://elasticsearch:9200
      ELASTICSEARCH_USERNAME: elastic
    image: docker.elastic.co/kibana/kibana-oss:6.2.1
    networks:
      elk: null
    ports:
      - 5601:5601
    restart: unless-stopped

The Logstash pipeline configuration, mounted into the container from ./etc/logstash/pipeline in the Compose file above, parses Ggr and Selenoid log lines with grok and routes them to separate Elasticsearch indices:

input {
  beats {
    port => "5044"
  }
}

filter {
  if [docker][container][name] == "ggr" {
    grok {
      match => {
        "message" => "%{YEAR:year}\/%{MONTHNUM:month}\/%{MONTHDAY:day} %{TIME:time} \[(-|%{NONNEGINT:request_id})\] \[(-|%{NUMBER:duration}s)\] \[%{NOTSPACE:status}\] \[(-|%{NOTSPACE:user})\] \[(-|%{IPORHOST:user_host})\] \[(-|%{NOTSPACE:browser})\] \[(-|%{NOTSPACE:browser_host})\] \[(-|%{NOTSPACE:session_id})\] \[(-|%{POSINT:counter})\] \[(-|%{DATA:msg})\]"
      }
    }
    mutate {
      remove_field => [ "message" ]
    }
  } else if [docker][container][name] == "selenoid" {
    grok {
      match => {
        "message" => "%{YEAR:year}\/%{MONTHNUM:month}\/%{MONTHDAY:day} %{TIME:time} \[(-|%{NONNEGINT:request_id})\] \[%{NOTSPACE:status}\] \[%{DATA:data}\]( \[%{DATA:optional_data}\])?( \[%{NUMBER:duration}s\])?"
      }
    }
    mutate {
      remove_field => [ "message" ]
    }
  }
  mutate {
    remove_field => [ "beat", "source", "prospector", "tags", "stream" ]
    convert => {
      "request_id" => "integer"
      "duration" => "float"
      "counter" => "integer"
    }
  }
}

output {
  if [docker][container][name] == "ggr" {
    elasticsearch {
      hosts => "elasticsearch:9200"
      index => "ggr-%{+YYYY.MM.dd}"
    }
  } else if [docker][container][name] == "selenoid" {
    elasticsearch {
      hosts => "elasticsearch:9200"
      index => "selenoid-%{+YYYY.MM.dd}"
    }
  } else {
    elasticsearch {
      hosts => "elasticsearch:9200"
      index => "browsers-%{+YYYY.MM.dd}"
    }
  }
}

  1. Start the stack using the docker-compose.yml file shown before:

$ docker-compose -f /path/to/docker-compose.yml up -d

  2. On every host running Selenoid or Ggr, create a filebeat.yml file that ships Docker container logs to your Logstash instance:

filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        - condition:
            not:
              contains:
                docker.container.image: filebeat
          config:
            - type: docker
              containers.ids:
                - "${data.docker.container.id}"
logging.metrics.enabled: false
output.logstash:
  hosts: ["elk.example.com:5044"]

  3. Start a Filebeat container on every such host, for example with the following Compose file:

version: '3'
services:
  filebeat:
    image: docker.elastic.co/beats/filebeat:6.2.1
    user: root
    restart: unless-stopped
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - /etc/filebeat/filebeat.yml:/usr/share/filebeat/filebeat.yml:ro
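
Once the stack and Filebeat are running, a quick way to check that log indices are actually being created is to query Elasticsearch directly; the host name below is just an example:

$ curl -s 'http://elk.example.com:9200/_cat/indices?v'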

Dealing with Selenium-specific Metrics

In addition to system metrics such as load average or memory consumption, it is very important to analyze application-specific metrics to better understand what happens inside the application. For Selenium, possible specific metrics could be: browser usage — overall and per version, the total number of sessions running in parallel, the total number of tests waiting for browsers, and so on. With the standard Java-based Selenium server this seemingly trivial task is not possible out of the box. How could you do this? One way is to add a custom servlet to the hub, for example like this:

package com.aerokube.selenium;

import com.google.common.io.ByteStreams;
import org.openqa.grid.web.servlet.RegistryBasedServlet;

import javax.servlet.ServletConfig;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hub servlet printing the number of used and total browser slots per browser version
public class HubStatServlet extends RegistryBasedServlet {

    public HubStatServlet() {
        super(null);
    }

    @Override
    public void init(ServletConfig config) throws ServletException {
        super.init(config);
    }

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response) throws IOException {
        response.setContentType("application/text");
        response.setCharacterEncoding("UTF-8");
        response.setStatus(200);

        // Accumulates per-browser counters; toString() prints one "browser-version used total" line per entry
        Map<String, Browser> table = new HashMap<String, Browser>() {
            @Override
            public String toString() {
                return entrySet().stream()
                        .map(e -> String.format("%s %s", e.getKey(), e.getValue()))
                        .collect(Collectors.joining("\n"));
            }
        };

        // Walk through all registered nodes and their slots, counting free and busy browsers
        getRegistry().getAllProxies().forEach(
                p -> p.getTestSlots().forEach(
                        slot -> {
                            Map<String, Object> caps = slot.getCapabilities();
                            String browserName = String.format(
                                    "%s-%s",
                                    caps.get("browserName"),
                                    caps.get("version")
                            );

                            if (!table.containsKey(browserName)) {
                                table.put(browserName, new Browser());
                            }

                            Browser browser = table.get(browserName);
                            if (slot.getSession() == null) {
                                browser.increaseFree();
                            } else {
                                browser.increaseUsed();
                            }
                        }
                )
        );

        byte[] out = table.toString().getBytes("UTF-8");
        response.setContentLength(out.length);
        try (InputStream in = new ByteArrayInputStream(out)) {
            ByteStreams.copy(in, response.getOutputStream());
        } finally {
            response.getOutputStream().close();
        }
    }

    private class Browser {

        private int used;
        private int free;

        void increaseUsed() {
            this.used++;
        }

        void increaseFree() {
            this.free++;
        }

        @Override
        public String toString() {
            return String.format("%s %s", used, used + free);
        }
    }
}
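
The servlet has to be compiled, packed into a jar and registered when starting the hub. A minimal sketch, assuming Selenium server 3.x on Linux and a hypothetical hubstat.jar containing the class above:

$ java -cp "selenium-server-standalone-3.9.1.jar:hubstat.jar" \
    org.openqa.grid.selenium.GridLauncherV3 -role hub \
    -servlets com.aerokube.selenium.HubStatServlet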

Per-browser statistics then become available with a simple HTTP request:

$ curl -s http://selenium-hub.example.com:4444/grid/admin/HubStatServlet

Selenoid, in contrast, requires no custom code: detailed statistics are returned out of the box by its /status endpoint:

$ curl http://selenoid-host.example.com:4444/status
{
  "total": 10,
  "used": 1,
  "queued": 0,
  "pending": 0,
  "browsers": {
    "chrome": {
      "62.0": {},
      "63.0": {}
    },
    "firefox": {
      "57.0": {
        "my-user": {
          "count": 2,
          "sessions": [
            {
              "id": "37809fc9-37b5-4537-a23e-34df28637228",
              "container": {
                "id": "2a82d79b690a0148fdf59c3af97d3a73df63108090318746df2fa48642410a6e",
                "ip": "172.17.2.221"
              },
              "vnc": false,
              "screen": "1920x1080x24",
              "caps": {
                "browserName": "firefox",
                "version": "57.0",
                "screenResolution": "1920x1080x24",
                "enableVNC": false,
                "enableVideo": false,
                "videoName": "",
                "videoScreenSize": "1920x1080",
                "videoFrameRate": 0,
                "name": "",
                "timeZone": "",
                "containerHostname": "",
                "applicationContainers": "",
                "hostsEntries": ""
              }
            }
          ]
        }
      },
      "58.0": {}
    },
    "opera": {
      "50.0": {},
      "51.0": {}
    }
  }
}
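
A single number from this response is easy to extract for ad-hoc monitoring, for example the current queue length with jq (installed separately):

$ curl -s http://selenoid-host.example.com:4444/status | jq .queued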

To turn these numbers into nice charts you need two more things:

  • A time-series database instance to store metrics. In this example we’ll use InfluxDB, but there are a lot of alternatives.
  • A cool web UI to show the charts. That is Grafana. Period.

Both can be started with a docker-compose.yml like this:

version: '3'
services:
  influxdb:
    image: influxdb:alpine
    container_name: influxdb
    ports:
      - "8086:8086"
    volumes:
      - ./data/influxdb:/var/lib/influxdb
    environment:
      INFLUXDB_REPORTING_DISABLED: "true"
      INFLUXDB_DB: telegraf
      INFLUXDB_USER: telegraf
      INFLUXDB_USER_PASSWORD: supersecret
  grafana:
    build: ./grafana
    container_name: grafana
    volumes:
      - ./data/grafana:/var/lib/grafana
    ports:
      - "3000:3000"
    links:
      - influxdb
    environment:
      GF_AUTH_ANONYMOUS_ENABLED: "true"
      GF_AUTH_ANONYMOUS_ORG_ROLE: "Admin"
      INFLUXDB_URI: "http://influxdb:8086"
      INFLUXDB_DB: telegraf
      INFLUXDB_USER: telegraf
      INFLUXDB_USER_PASSWORD: supersecret

Metrics are collected and shipped to InfluxDB by Telegraf, started with its own Compose file:

version: '3'
services:
  telegraf:
    image: telegraf:latest
    container_name: telegraf
    network_mode: "host"
    volumes:
      - ./telegraf.conf:/etc/telegraf/telegraf.conf:ro
    environment:
      INFLUXDB_URI: "http://grafana.example.com:8086"
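
The contents of telegraf.conf depend on what exactly you want to collect. A minimal sketch, assuming the httpjson input plugin polling Selenoid’s /status endpoint; the Selenoid host name is a placeholder, and the output section reuses the INFLUXDB_URI variable from the Compose file above:

[agent]
  interval = "10s"

# Poll Selenoid /status and store its numeric fields (total, used, queued, pending, ...)
[[inputs.httpjson]]
  name = "selenoid"
  servers = ["http://selenoid-host.example.com:4444/status"]
  method = "GET"

# Write collected metrics to the InfluxDB instance started earlier
[[outputs.influxdb]]
  urls = ["${INFLUXDB_URI}"]
  database = "telegraf"
  username = "telegraf"
  password = "supersecret"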

Conclusion

I hope you now have a lot more arguments to convince your systems administrator to install and maintain a lightweight Selenium cluster for your team. Browser automation has never been so simple and efficient. While you are reading these lines, we certainly continue doing our best to make you even happier. See you soon!
