August 9, 2021

Load Testing Nim WebSockets with Artillery

Testing simultaneous WebSocket connections in Nim


Tags: websocket nim artillery

WebSocket is a communication protocol widely used on the web. It has some similarities with HTTP: both run over TCP and often share the same ports (80, or 443 when secured with TLS), and a WebSocket connection starts as an HTTP request that is then upgraded. However, while HTTP is built around individual request-response exchanges, WebSocket opens a connection once and keeps it open, allowing data to flow freely in both directions and reducing the communication overhead. This is particularly useful for streaming APIs, for applications that would otherwise poll the server, and more generally whenever data has to be streamed continuously between client and server.

In order to sustain many connections with a high quality of service (QoS), an efficient implementation of the WebSocket protocol is necessary. Here, we measure the performance of ws, a WebSocket library written in Nim, using a load-testing tool called Artillery.

Echo Server

Let's begin by installing the WebSocket module. If you have nimble installed, you can use the following command:

nimble install ws

For these tests, we are going to use an echo server provided in the library documentation. For reference, we report here the Nim code:

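import asyncdispatch
import asynchttpserver
import ws

# Every accepted WebSocket is stored here; closed sockets are never removed.
var connections = newSeq[WebSocket]()

proc cb(req: Request) {.async, gcsafe.} =
  if req.url.path == "/ws":
    try:
      # Upgrade the incoming HTTP request to a WebSocket connection.
      var ws = await newWebSocket(req)
      connections.add ws
      await ws.send("Welcome to simple echo server")
      # Echo each text packet back for as long as the socket stays open.
      while ws.readyState == Open:
        let packet = await ws.receiveStrPacket()
        # Send without awaiting, so the next packet can be read right away.
        asyncCheck ws.send(packet)
    except WebSocketClosedError:
      echo "Socket closed"
    except WebSocketProtocolMismatchError:
      echo "Socket tried to use an unknown protocol: ", getCurrentExceptionMsg()
    except WebSocketError:
      echo "Unexpected socket error: ", getCurrentExceptionMsg()

  # Reached for any other path, and also once a WebSocket session has ended.
  await req.respond(Http200, "Hello World")

# Serve on port 9001 until the process is stopped.
var server = newAsyncHttpServer()
waitFor server.serve(Port(9001), cb)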

To compile and run the code, you can use the following command, assuming the code has been stored in a file called server.nim:

nim c -r -d:release server.nim

Artillery

We first have to install Artillery with this command (Node.js is required):

npm install -g artillery@latest

Artillery will be installed globally, so it can be executed from any directory.

Once installed, we need a configuration file that we call config.yml. Inside this file we can specify the test procedure that we want to try against the server.

The configuration that we are going to use is the following:

scenarios:
  - name: Nim WebSocket load test
    engine: ws
    flow:
      - send: "Stress test"

config:
  target: "ws://localhost:9001/ws"
  phases:
    - duration: 30
      arrivalRate: 10

The two main parameters are the duration of the test phase, set to 30 seconds, and the arrival rate, set to 10 new virtual users per second. We can vary these values to test different traffic conditions, and add more steps to the flow.

In this case we are using an echo server, but it is also possible to create more thorough load tests as explained in the Artillery documentation.
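
As a sketch of how these values can be varied, the snippet below writes a second configuration with a warm-up phase that ramps up the arrival rate, followed by a sustained phase, and a flow with an extra step. The file name config-ramp.yml, the scenario and phase names and all the numbers are illustrative assumptions, not values used for the measurements in this post. The rampTo, name and think options are described in the Artillery documentation.

```shell
# Write an illustrative Artillery configuration.
# File name, names and numbers are hypothetical, chosen only for this example.
cat > config-ramp.yml <<'EOF'
scenarios:
  - name: Nim WebSocket ramp test
    engine: ws
    flow:
      - send: "Stress test"
      - think: 1
      - send: "Stress test again"

config:
  target: "ws://localhost:9001/ws"
  phases:
    - name: warm up
      duration: 20
      arrivalRate: 1
      rampTo: 10
    - name: sustained load
      duration: 30
      arrivalRate: 10
EOF

# It can then be run like any other configuration:
# artillery run config-ramp.yml
```

Here each virtual user sends a message, waits one second, sends a second message and disconnects, while the arrival rate grows from 1 to 10 new users per second before holding steady.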

To run the test, simply use the following command, specifying the configuration file:

artillery run config.yml

The result on an Intel Core m7 (1.3 GHz) is the following.

Started phase 0, duration: 30s @ 14:23:37(+0200) 2021-08-07
Report @ 14:23:47(+0200) 2021-08-07
Elapsed time: 10 seconds
  Scenarios launched:  99
  Scenarios completed: 99
  Requests completed:  99
  Mean response/sec: 10.01
  Response time (msec):
    min: 0.1
    max: 1.8
    median: 0.2
    p95: 0.3
    p99: 1.1
  Codes:
    0: 99

Report @ 14:23:57(+0200) 2021-08-07
Elapsed time: 20 seconds
  Scenarios launched:  100
  Scenarios completed: 100
  Requests completed:  100
  Mean response/sec: 10.02
  Response time (msec):
    min: 0.1
    max: 0.5
    median: 0.2
    p95: 0.3
    p99: 0.4
  Codes:
    0: 100

Report @ 14:24:07(+0200) 2021-08-07
Elapsed time: 30 seconds
  Scenarios launched:  100
  Scenarios completed: 100
  Requests completed:  100
  Mean response/sec: 10.01
  Response time (msec):
    min: 0.1
    max: 2.7
    median: 0.1
    p95: 0.3
    p99: 2.5
  Codes:
    0: 100

Report @ 14:24:08(+0200) 2021-08-07
Elapsed time: 30 seconds
  Scenarios launched:  1
  Scenarios completed: 1
  Requests completed:  1
  Mean response/sec: 2
  Response time (msec):
    min: 0.2
    max: 0.2
    median: 0.2
    p95: 0.2
    p99: 0.2
  Codes:
    0: 1

All virtual users finished
Summary report @ 14:24:08(+0200) 2021-08-07
  Scenarios launched:  300
  Scenarios completed: 300
  Requests completed:  300
  Mean response/sec: 9.87
  Response time (msec):
    min: 0.1
    max: 2.7
    median: 0.2
    p95: 0.3
    p99: 1.9
  Scenario counts:
    Nim WebSocket load test: 300 (100%)
  Codes:
    0: 300

This is a simple working example. In this case, Artillery creates a number of connections equal to the arrivalRate every second for the specified duration; each connection sends the message "Stress test" and then closes. However, we can look at more specific scenarios, such as the number of connections open at the same time while performing a given task. In the configuration file, we have specified a single step under flow. Let's make it more complex.

scenarios:
  - name: Nim WebSocket load test
    engine: ws
    flow:
      - loop:
        - send: "Stress test"
        - think: 0.01

config:
  target: "ws://localhost:9001/ws"
  phases:
    - duration: 10
      arrivalRate: 1000

With this example, we can simulate many connections that stay open over a longer period of time, with clients and server exchanging the string "Stress test". Each client sends the message continuously, pausing 10 milliseconds between messages. The phase duration is set to 10 seconds. With an arrival rate of 1000, we should expect 10k connections, each sending up to 100 messages per second.
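
The expected load follows directly from the configuration values, and the arithmetic can be sketched in a few lines of shell. The function names are made up for this example, and the per-client rate is only an upper bound, since it ignores the time needed to send each message and receive the echo.

```shell
# Virtual users created in one phase: arrivalRate (users/s) * duration (s).
expected_users() {
  echo $(( $1 * $2 ))
}

# Upper bound on messages per second for one client that pauses
# the given number of milliseconds between sends.
max_rate_per_client() {
  echo $(( 1000 / $1 ))
}

users=$(expected_users 1000 10)   # arrivalRate 1000 for 10 seconds
rate=$(max_rate_per_client 10)    # think: 0.01 is a 10 ms pause

echo "connections:            $users"                # 10000
echo "messages/s per client:  $rate"                 # 100
echo "messages/s overall:     $(( users * rate ))"   # 1000000
```

Calling expected_users 10 30 for the first configuration gives 300, which matches the "Scenarios launched: 300" line of its summary report. The overall figure for the second configuration is a theoretical ceiling; what is actually reached depends on how fast both Artillery and the server can go.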

Note: the loop option accepts a count parameter. If it is omitted, the loop runs indefinitely.
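
To make each virtual user stop on its own, a count can be added next to the loop. The sketch below writes such a configuration; the file name config-loop.yml and the value 1000 are illustrative assumptions and were not used for the measurements in this post. With a 10 ms pause, 1000 iterations keep each connection busy for at least 10 seconds.

```shell
# Write a variant of the configuration with a bounded loop.
# File name and count are hypothetical, chosen only for this example.
cat > config-loop.yml <<'EOF'
scenarios:
  - name: Nim WebSocket load test
    engine: ws
    flow:
      - loop:
          - send: "Stress test"
          - think: 0.01
        count: 1000

config:
  target: "ws://localhost:9001/ws"
  phases:
    - duration: 10
      arrivalRate: 1000
EOF

# It can then be run like any other configuration:
# artillery run config-loop.yml
```

Note that count sits at the same level as loop, not inside the list of looped steps.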

This time let's run the test on a beefier machine, a server with an Intel Xeon E5-2620 v3 (6 cores, 12 threads, Turbo Boost up to 3.2 GHz). Note that the WebSocket server runs on a single core; distributing the load over multiple cores should increase the performance. The result on a single thread after running for 30 seconds is reported below.

Note: if you see the error "[OSError]: Too many open files", you have to increase the number of file descriptors, and therefore sockets, that can be open at the same time. This can be done with ulimit -n <no_socket>. The value of no_socket cannot be larger than the hard limit, which can be seen with ulimit -aH. Notice that this change is temporary: it is valid only for the current shell session.

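The limits can be inspected and raised as sketched below. The target of 20000 is an arbitrary illustrative value, and -S and -H select the soft and the hard limit respectively. The commands have to be run in the same shell session that later starts the server (and Artillery, when it runs on the same machine), because the limit applies to the current shell and the processes it launches.

```shell
# Desired number of open file descriptors (hypothetical value).
target=20000

soft=$(ulimit -Sn)   # current soft limit
hard=$(ulimit -Hn)   # hard limit, the ceiling for the soft limit
echo "soft limit: $soft, hard limit: $hard"

# The soft limit cannot exceed the hard limit, so cap the target if needed.
if [ "$hard" != "unlimited" ] && [ "$target" -gt "$hard" ]; then
  target=$hard
fi

ulimit -Sn "$target"
echo "new soft limit: $(ulimit -Sn)"
```
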
Started phase 0, duration: 10s @ 14:57:09(+0200) 2021-08-09
Report @ 14:57:20(+0200) 2021-08-09
Elapsed time: 10 seconds
  Scenarios launched:  973
  Scenarios completed: 0
  Requests completed:  224179
  Mean response/sec: 20738.11
  Response time (msec):
    min: 0
    max: 1.5
    median: 0
    p95: 0
    p99: 0
  Codes:
    0: 224179

Warning: 
CPU usage of Artillery seems to be very high (pids: 6585)
which may severely affect its performance.
See https://artillery.io/docs/faq/#high-cpu-warnings for details.

Report @ 14:57:29(+0200) 2021-08-09
Elapsed time: 20 seconds
  Scenarios launched:  261
  Scenarios completed: 0
  Requests completed:  191773
  Mean response/sec: 20358.07
  Response time (msec):
    min: 0
    max: 5.1
    median: 0
    p95: 0
    p99: 0
  Codes:
    0: 191773

Warning: High CPU usage warning (pids: 6585).
See https://artillery.io/docs/faq/#high-cpu-warnings for details.

Report @ 14:57:40(+0200) 2021-08-09
Elapsed time: 30 seconds
  Scenarios launched:  202
  Scenarios completed: 0
  Requests completed:  199241
  Mean response/sec: 20372.29
  Response time (msec):
    min: 0
    max: 7
    median: 0
    p95: 0
    p99: 0
  Codes:
    0: 199241

Conclusions

With this simple example, we have shown that the WebSocket library ws written in Nim, in a test configured for 10k connections communicating at 10-millisecond intervals, is able to satisfy about 20k requests each second using a single thread. Note that the interval reports list 1,436 launched scenarios in total (973 + 261 + 202), so the number of connections actually opened was well below the 10k target. It is also worth noting that 99% of the responses took less than 0.1 ms, and the worst response time was about 7 ms.

Running the same test a few more times gave very similar results, with about 20k requests per second satisfied.
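
The figure of about 20k requests per second can be cross-checked from the interval reports with a few lines of shell arithmetic. The numbers are copied from the "Requests completed" lines of the output above; the variable names are only for illustration.

```shell
# "Requests completed" from the three interval reports (10 s, 20 s, 30 s).
requests="224179 191773 199241"
elapsed=30   # seconds covered by the three reports

total=0
for r in $requests; do
  total=$(( total + r ))
done

echo "requests completed: $total"                    # 615193
echo "average per second: $(( total / elapsed ))"    # 20506 (integer division)
```

The average of roughly 20.5k requests per second is in line with the "Mean response/sec" values printed in each report.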

In a real-world scenario, the logic of the server would be much more complex, increasing resource utilization and causing longer response times. In this case, Artillery was executed on the same machine as the server. Considering that the WebSocket server was running on a single thread and that the CPU has 12 threads, Artillery should not have taken CPU time away from the server. However, the high CPU usage warnings in the output suggest that Artillery itself may have limited the load that was generated.

It would be interesting to see the performance of the same echo server taking advantage of multiple cores.