libpaf  1.1.9
Pathfinder Client Library API

This is the documentation for the Pathfinder Client Library API.

Author
Mattias Rönnblom
Version
0.1 [API]
1.1.9 [Implementation]

Overview

The Pathfinder Client Library API is used to access one or more Pathfinder service discovery domains, either as a service producer or consumer.

All the functions in this API are non-blocking in the sense that no blocking system calls are made.

For a description of the Pathfinder data model, refer to the Pathfinder Protocol Specification. Note: there are important semantic differences between certain protocol-level operations and their counterparts in this API (e.g., paf_publish() doesn't have exactly the same semantics as the publish protocol-level command).

Service Discovery Domains

A Pathfinder service discovery domain is a namespace shared by all Pathfinder clients attached to that domain. A service published by one client can be seen by all other clients attached to that domain. A domain is served by one or more Pathfinder server instances.

In order to participate in a domain, an application issues paf_attach() with the appropriate service discovery domain name. It need not know what servers are currently serving that domain.

Domain Configuration

The mapping between a service discovery domain name and the set of addresses of the Pathfinder servers serving that domain is kept in a file. The configuration for a particular domain must be stored in a file with the same name as the domain, located in the domain files directory. The compile-time default location is /run/paf/domains.d/.

The directory may contain an arbitrary number of domains.

In case the domain file does not exist at the time of the paf_attach() call, libpaf will periodically check if it has been created.

In case the file is modified (e.g., a server is added, removed or has its address changed), the file will be re-read by libpaf. If the file is removed, the set of servers is considered empty.

The environment variable PAF_DOMAINS may be set in case a non-standard directory is preferred over the default.
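The lookup rule described above can be sketched as follows. This is an illustration only, not libpaf's actual implementation: the function name domain_file_path() and the DEFAULT_DOMAINS_DIR macro are hypothetical, and the caller is assumed to pass in the value of PAF_DOMAINS (or NULL if it is unset).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Documented compile-time default; an actual build may differ. */
#define DEFAULT_DOMAINS_DIR "/run/paf/domains.d"

/*
 * Hypothetical helper (not part of the libpaf API). Writes the path of
 * the domain file for 'domain' into 'buf'. 'domains_dir' is the value
 * of PAF_DOMAINS, or NULL to use the default directory. Returns 0 on
 * success, or -1 if the path does not fit in 'buf'.
 */
int domain_file_path(const char *domains_dir, const char *domain,
                     char *buf, size_t capacity)
{
    assert(domain != NULL && buf != NULL);

    const char *dir =
        domains_dir != NULL ? domains_dir : DEFAULT_DOMAINS_DIR;

    /* Directory, '/' separator, file name and terminating NUL. */
    size_t needed = strlen(dir) + 1 + strlen(domain) + 1;
    if (needed > capacity)
        return -1;

    snprintf(buf, capacity, "%s/%s", dir, domain);
    return 0;
}
```

An application could call this as domain_file_path(getenv("PAF_DOMAINS"), "mydomain", path, sizeof(path)) to find out which file to create or edit for the domain "mydomain".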

File Format

libpaf supports two file formats. The file contains either a newline-separated list of XCM addresses or a JSON object.

The newline-separated format allows for comments. In this format, empty lines and lines beginning with '#' are ignored. JSON does not support comments.
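The rules of the newline-separated format can be sketched as below. This is not libpaf's parser: parse_domain_lines() and MAX_ADDR_LEN are hypothetical names, and the sketch assumes that lines are not trimmed (so only strictly empty lines are skipped) and that over-long lines are silently dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ADDR_LEN 128 /* hypothetical limit, including the NUL */

/*
 * Hypothetical helper (not part of the libpaf API). Extracts the
 * server addresses from 'contents', which holds a domain file in the
 * newline-separated format. Empty lines and lines beginning with '#'
 * are ignored. Returns the number of addresses stored in 'addrs'.
 */
size_t parse_domain_lines(const char *contents,
                          char addrs[][MAX_ADDR_LEN], size_t max_addrs)
{
    assert(contents != NULL && addrs != NULL);

    size_t count = 0;
    const char *line = contents;

    while (*line != '\0' && count < max_addrs) {
        /* strcspn() stops at the newline or at the end of the string. */
        size_t len = strcspn(line, "\n");

        if (len > 0 && line[0] != '#' && len < MAX_ADDR_LEN) {
            memcpy(addrs[count], line, len);
            addrs[count][len] = '\0';
            count++;
        }

        /* Advance past the line and its newline, if any. */
        line += len;
        if (*line == '\n')
            line++;
    }

    return count;
}
```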

A domain file in the JSON format must contain a root JSON object, with a key "servers". The value of "servers" must be an array of zero or more JSON objects, each representing a server.

The server object must have a key "address", with the server's address in XCM format as its value.

A server object may have a key "localAddress", in which case this XCM-formatted address will be bound to before establishing an outgoing connection.

A server object may have a key "minProtocolVersion", used to increase the minimum protocol version advertised as supported by libpaf to that server. If "minProtocolVersion" is set higher than the maximum version supported by libpaf (i.e., > 3), no connections will be initiated to that server.

A server object may have the key "maxProtocolVersion", used to decrease the maximum protocol version advertised by libpaf. If the "maxProtocolVersion" is set lower than the minimum version supported by libpaf (i.e., < 2), no connections will be initiated to that server.

If both "minProtocolVersion" and "maxProtocolVersion" are set, "minProtocolVersion" must be equal to or lower than "maxProtocolVersion".
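How the two keys narrow the advertised version range can be sketched as follows. The names effective_version_range(), struct version_range and VERSION_UNSET are hypothetical and not part of libpaf. The supported range of 2 to 3 is taken from the text above. Treating an inverted configuration (minimum above maximum) as "do not connect" is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Protocol versions supported by libpaf, per the text above. */
#define LIB_MIN_VERSION 2
#define LIB_MAX_VERSION 3

#define VERSION_UNSET (-1) /* hypothetical marker for an absent key */

struct version_range {
    int min;
    int max;
};

/*
 * Hypothetical helper (not part of the libpaf API). Computes the
 * version range advertised to a server, given the optional
 * "minProtocolVersion" and "maxProtocolVersion" values. Returns false
 * if the resulting range is empty, in which case no connection should
 * be initiated.
 */
bool effective_version_range(int cfg_min, int cfg_max,
                             struct version_range *range)
{
    assert(range != NULL);

    range->min = LIB_MIN_VERSION;
    range->max = LIB_MAX_VERSION;

    /* "minProtocolVersion" can only raise the minimum. */
    if (cfg_min != VERSION_UNSET && cfg_min > range->min)
        range->min = cfg_min;

    /* "maxProtocolVersion" can only lower the maximum. */
    if (cfg_max != VERSION_UNSET && cfg_max < range->max)
        range->max = cfg_max;

    return range->min <= range->max;
}
```

With "minProtocolVersion" set to 4, or "maxProtocolVersion" set to 1, the range comes out empty and the function returns false, matching the two "no connections" cases described above.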

A server object may have the keys "minIdleTime" and/or "maxIdleTime" to configure client-side Liveness Tracking. They override the PAF_IDLE_MIN and PAF_IDLE_MAX values, or the compile-time defaults in case the environment variables are not set.

If both "minIdleTime" and "maxIdleTime" are set, "minIdleTime" must be equal to or lower than "maxIdleTime".

A server object may include a key "networkNamespace". If present, the library will make sure the outgoing transport layer connection originates from a Linux network namespace named per the key's value. To switch between network namespaces, the process needs the CAP_SYS_ADMIN capability. The network namespace needs to be named as per iproute2 conventions.

In case the transport protocol uses TLS, a number of optional keys may be present in the server object:

  • "tlsCertificateFile": the leaf certificate to use.
  • "tlsKeyFile": the private key corresponding to the leaf certificate.
  • "tlsTrustedCaFile": a file containing the trusted CA certificates.
  • "tlsCrlFile": a file containing Certificate Revocation Lists (CRLs).

Setting tlsCrlFile will enable certificate revocation verification, and requires libpaf to be linked to libxcm version v1.9.0 or later.

In case some/all of the certificate file related keys are left out, libpaf will fall back to using the XCM defaults.

Below is an example of a domain file in JSON format:

{
    "servers": [
        {
            "address": "tls:1.2.3.4:4444",
            "tlsCertificateFile": "/etc/paf/certs/cert.pem",
            "tlsKeyFile": "/etc/paf/certs/key.pem",
            "tlsTrustedCaFile": "/etc/paf/certs/ca-bundle.pem"
        },
        {
            "address": "tls:5.6.7.8:8888",
            "minProtocolVersion": 3,
            "localAddress": "tls:9.9.9.9:0"
        },
        {
            "address": "tcp:fqdn:1111",
            "networkNamespace": "oam",
            "minIdleTime": 10
        },
        {
            "address": "ux:foo",
            "maxProtocolVersion": 2
        }
    ]
}

The same set of servers (without the per-server settings, such as the network namespace and the certificate-related configuration), but in the newline-separated format:

tls:1.2.3.4:4444
tls:5.6.7.8:8888
tcp:fqdn:1111
ux:foo

Domain File Rescan

For all domains the application is currently attached to, libpaf tracks domain file changes. This check is performed roughly every 5 s. A small random component is added to avoid load spikes, in case there are many clients on the same system.

This default interval may be changed by setting the PAF_RESCAN environment variable. The value is a floating-point number (in seconds). If set to zero, rescanning is disabled.
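A minimal sketch of how such a value could be interpreted is shown below. The function names are hypothetical. How libpaf reacts to a malformed or negative PAF_RESCAN is not stated here, so falling back to the default in those cases is an assumption of this sketch. The random component is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

#define DEFAULT_RESCAN_S 5.0 /* documented default, in seconds */

/*
 * Hypothetical helper (not part of the libpaf API). Converts the
 * value of PAF_RESCAN ('value' is NULL if the variable is unset) into
 * a rescan period in seconds. Assumption: malformed or negative
 * values fall back to the default.
 */
double rescan_period(const char *value)
{
    if (value == NULL)
        return DEFAULT_RESCAN_S;

    char *end;
    errno = 0;
    double period = strtod(value, &end);

    /* Reject empty input, trailing garbage, range errors, NaN and
       negative values. */
    if (end == value || *end != '\0' || errno != 0 || !(period >= 0.0))
        return DEFAULT_RESCAN_S;

    return period;
}

/* A period of zero disables rescanning. */
bool rescan_enabled(double period)
{
    assert(period >= 0.0);
    return period > 0.0;
}
```

An application-side check could then read rescan_enabled(rescan_period(getenv("PAF_RESCAN"))).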

Connection Reestablishment

In case the connection to a server is lost, or never was successfully established in the first place, libpaf will perform another attempt at a later time.

libpaf uses exponential back-off. The first retry is scheduled to occur after 10 ms. Every failed attempt doubles the retry interval, up to a maximum of 5 s. These two defaults may be changed by setting the PAF_RECONNECT_MIN and/or PAF_RECONNECT_MAX environment variables.
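The resulting retry schedule can be sketched as follows. The function retry_delay_ms() is a hypothetical illustration of the doubling-with-ceiling rule, not libpaf code. It works in milliseconds and numbers the retries from zero.

```c
#include <assert.h>

#define RECONNECT_MIN_MS 10L   /* documented default first retry delay */
#define RECONNECT_MAX_MS 5000L /* documented default ceiling */

/*
 * Hypothetical helper (not part of the libpaf API). Returns the delay
 * in milliseconds before retry number 'retry_no', where 0 denotes the
 * first retry. The delay doubles for every failed attempt and is
 * capped at 'max_ms'. Assumes 'max_ms' is small enough for the
 * doubling not to overflow.
 */
long retry_delay_ms(unsigned retry_no, long min_ms, long max_ms)
{
    assert(min_ms > 0 && min_ms <= max_ms);

    long delay = min_ms;

    /* Stop doubling as soon as the ceiling has been reached. */
    for (unsigned i = 0; i < retry_no && delay < max_ms; i++)
        delay *= 2;

    return delay < max_ms ? delay : max_ms;
}
```

With the default values, the delays are 10, 20, 40, 80, 160, 320, 640, 1280 and 2560 ms, after which every further retry waits 5000 ms.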

Liveness Tracking

On connections where the Pathfinder protocol version 3 is negotiated to be used, libpaf performs server liveness tracking on the level of the Pathfinder protocol.

On v3 connections, libpaf imposes an upper limit on how long the remote peer is allowed to remain idle. When the maximum idle time is approaching, libpaf will query the server to ensure it is still alive. In case the server also employs liveness checking, any server queries will be treated as a sign of life, and make libpaf postpone any liveness query.

The maximum idle time is 30 seconds by default, and may be overridden by setting the PAF_IDLE_MAX environment variable.

The actual max idle time used may be lower than PAF_IDLE_MAX, in case low-TTL services have been published by the application, or have been matched in one of its subscriptions.

The actual max idle time will never be set to lower than PAF_IDLE_MIN, which is 4 seconds by default. To protect the server, libpaf will treat PAF_IDLE_MIN set lower than 1 second as set to 1 second.
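The bounds described above can be sketched as a clamping function. How libpaf derives an idle time from service TTLs is not described here, so the sketch simply takes a 'wanted' idle time as input. All names are hypothetical, and letting the lower bound win if it exceeds the upper bound is an assumption.

```c
#include <assert.h>

#define IDLE_MIN_FLOOR_S 1.0 /* PAF_IDLE_MIN below 1 s is treated as 1 s */

/*
 * Hypothetical helper (not part of the libpaf API). Returns the
 * maximum idle time (in seconds) actually used, given the idle time
 * 'wanted' by the library (e.g., because of low-TTL services) and the
 * configured PAF_IDLE_MIN and PAF_IDLE_MAX values.
 */
double effective_max_idle(double wanted, double idle_min, double idle_max)
{
    assert(wanted >= 0.0 && idle_max >= 0.0);

    /* Protect the server from overly frequent liveness queries. */
    double lower = idle_min < IDLE_MIN_FLOOR_S ? IDLE_MIN_FLOOR_S : idle_min;

    /* Never exceed the configured maximum... */
    double idle = wanted < idle_max ? wanted : idle_max;

    /* ...and never go below the (floored) minimum. */
    return idle < lower ? lower : idle;
}
```

With the defaults (minimum 4 s, maximum 30 s), a wanted idle time of 2 s yields 4 s, 10 s is used as-is, and 100 s is reduced to 30 s.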

On version 2 connections, libpaf depends on the transport protocol (e.g., TCP) for liveness checking.

On version 3 connections, TCP keepalive is disabled.

The minimum idle time is also used as an upper bound for the total amount of time the initial transport connection establishment (e.g., TCP three-way handshake and TLS hello) and the Pathfinder protocol-level hello transaction are allowed to take.

DNS and Multihomed Servers

The host part of the XCM server address in the Domain Configuration may either be a DNS hostname or an IP address in string format. If a Pathfinder server DNS hostname resolves to multiple A or AAAA records, libpaf will interpret that as a single, multihomed, server.

In such a scenario, libpaf will attempt to establish a TCP connection to the server via all available IP addresses, but will employ at most one connection for the actual Pathfinder protocol signaling. The Happy Eyeballs (RFC 6555) method is used.

Multihomed servers are only supported when libpaf is running linked to XCM v1.9.0 (or later). For older XCM versions, only the first (i.e., most preferred) IP address will be considered.

Service TTL

A service published using libpaf has a time-to-live (TTL) of 30 s. This default may be changed by setting the PAF_TTL environment variable, before the paf_publish() call.

The paf_set_ttl() function may be used to update the TTL for a specific service.

For a description of how service TTLs work in Pathfinder, please refer to the Pathfinder protocol specification.

Tracing

libpaf comes with built-in support for tracing. The library supports writing traces to stderr in human-readable format, or directing them to LTTng. The former is always available, and the latter is available if the library is built with LTTng support.

To enable stderr-type tracing, set the PAF_DEBUG environment variable to "1", before starting the application.

To enable LTTng tracing, enable the relevant libpaf LTTng tracepoints.

Multi-thread Safety

All API calls are multi-thread (MT) safe when called on different contexts (for paf.h API calls) or service properties (for paf_props.h API calls). Thus, one thread may safely call paf_publish(), while another thread calls the same (or a different) paf_*() function, but on another context.

No API calls are MT safe when called on the same context or service properties. For that to work, external synchronization (e.g., a mutex lock) is required.