Ephemeral client publications

Centrifugo PRO provides schema validation for client publications, which enables ephemeral messaging: client publications can pass through Centrifugo directly without involving backend proxy logic, reducing backend load and delivery latency. Normally publications go through the backend so that it can validate messages and store them in the main database. For certain types of messages, such as typing notifications in a chat room, that backend involvement adds unnecessary overhead, and Centrifugo PRO offers an efficient way to avoid it.

Overview

The feature consists of three parts that together form the basis for ephemeral client publications:

  • Validation layer - validate client publications based on JSON schema
  • Bandwidth optimization - optionally exclude client info from publications to reduce message size
  • Server-side tagging - attach custom tags to publications that cannot be spoofed by clients

Configuration

Defining schemas

Schemas are defined at the top level of Centrifugo configuration. Centrifugo supports two types of schemas:

  • JSON Schema (jsonschema_draft_2020_12) - Validates publication data against JSON Schema Draft 2020-12
  • Empty Binary (empty_binary) - Only allows empty binary data (useful for presence-like signals)
Security Default

For JSON schemas, Centrifugo automatically sets "additionalProperties": false on object-type schemas unless explicitly specified otherwise. This prevents clients from injecting unexpected fields into validated data.
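
The effect of this default can be illustrated with a short Python sketch. It is not Centrifugo's implementation: the helper name `with_security_default` is hypothetical, and the sketch only handles the top-level schema, since the text above does not say how nested object schemas are treated.

```python
import copy


def with_security_default(schema: dict) -> dict:
    """Return a copy of `schema` with the documented default applied.

    Only the top-level schema is handled; nested object schemas are
    left untouched because their treatment is not specified here.
    """
    result = copy.deepcopy(schema)
    if result.get("type") == "object" and "additionalProperties" not in result:
        result["additionalProperties"] = False
    return result


# No explicit setting: the default closes the object to unknown fields.
strict = with_security_default(
    {"type": "object", "properties": {"text": {"type": "string"}}}
)
# Explicit setting: the author's choice is preserved.
relaxed = with_security_default({"type": "object", "additionalProperties": True})

print(strict["additionalProperties"], relaxed["additionalProperties"])
```

With the default in place, a payload such as `{"text": "hi", "admin": true}` would be rejected by a schema that only declares `text`.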

JSON Schema (default)

The type field is optional and defaults to jsonschema_draft_2020_12. You can define schemas directly in your configuration file:

config.json
{
"schemas": [
{
"name": "chat_message",
"type": "jsonschema_draft_2020_12",
"definition": "{\"type\":\"object\",\"properties\":{\"text\":{\"type\":\"string\",\"maxLength\":500}},\"required\":[\"text\"]}"
}
]
}
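
Escaping a schema by hand into a JSON string is easy to get wrong. As a minimal sketch, assuming the inline form shown above (the `definition` value is a JSON-encoded string), the escaped string can be generated with Python's standard `json` module. The helper name `schema_entry` is hypothetical.

```python
import json

# Schema written as a normal Python dict, mirroring the chat_message example.
chat_message_schema = {
    "type": "object",
    "properties": {"text": {"type": "string", "maxLength": 500}},
    "required": ["text"],
}


def schema_entry(name: str, schema: dict) -> dict:
    """Build one entry for the top-level "schemas" array.

    The schema is serialized to a compact JSON string, so the outer
    json.dumps call takes care of escaping the inner quotes.
    """
    return {
        "name": name,
        "type": "jsonschema_draft_2020_12",
        "definition": json.dumps(schema, separators=(",", ":")),
    }


config = {"schemas": [schema_entry("chat_message", chat_message_schema)]}
config_text = json.dumps(config, indent=2)
print(config_text)
```

Generating the string this way keeps the schema readable in source control while the configuration file still receives a correctly escaped value.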

For better readability in YAML, use multiline strings:

config.yaml
schemas:
  - name: chat_message
    type: jsonschema_draft_2020_12 # Optional, this is the default
    definition: |
      {
        "type": "object",
        "properties": {
          "text": {"type": "string", "maxLength": 500},
          "mentions": {"type": "array", "items": {"type": "string"}}
        },
        "required": ["text"]
      }

Empty Binary Schema

The empty_binary schema type validates that publication data is empty. This is useful for presence-like signals where the fact of publication itself carries meaning (e.g., "user is typing"):

config.json
{
"schemas": [
{
"name": "typing_indicator",
"type": "empty_binary"
}
]
}
config.yaml
schemas:
  - name: typing_indicator
    type: empty_binary
note

Empty binary schemas don't require a definition field since they only validate that data is empty (0 bytes).

Schema from file

For complex schemas, you can reference external JSON schema files. This provides better readability, IDE support, and easier maintenance:

Create a schema file:

schemas/chat_message.json
{
"type": "object",
"properties": {
"text": {
"type": "string",
"maxLength": 500,
"minLength": 1
},
"mentions": {
"type": "array",
"items": {"type": "string"}
},
"metadata": {
"type": "object",
"properties": {
"timestamp": {"type": "integer"}
}
}
},
"required": ["text"]
}

Reference it in your config:

config.json
{
"schemas": [
{
"name": "chat_message",
"definition": "./schemas/chat_message.json"
},
{
"name": "reaction",
"definition": "./schemas/reaction.json"
}
]
}

Or in YAML:

config.yaml
schemas:
  - name: chat_message
    definition: ./schemas/chat_message.json
  - name: reaction
    definition: ./schemas/reaction.json
  - name: typing
    definition: ./schemas/typing.json
Benefits of schema files
  • Better IDE support - Syntax highlighting, validation, and autocomplete
  • Easier testing - Validate schema files independently
  • Cleaner diffs - Track schema changes separately in version control
  • Reusability - Share schemas across environments or services
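
Because schema files can be validated independently, a small pre-deployment check may be useful. The following Python sketch verifies that every referenced file exists and parses as a JSON object. The function names are hypothetical, and the rule used to tell a file path from an inline definition (a value starting with `{` is inline) is an assumption of this sketch, not documented Centrifugo behavior.

```python
import json
import tempfile
from pathlib import Path


def load_definition(value: str, base_dir: Path) -> dict:
    """Return the parsed schema for a `definition` value.

    Assumption for this sketch: a value starting with "{" is inline JSON,
    anything else is a file path relative to `base_dir`.
    """
    if value.lstrip().startswith("{"):
        text = value
    else:
        text = (base_dir / value).read_text(encoding="utf-8")
    schema = json.loads(text)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(schema, dict):
        raise ValueError("schema definition must be a JSON object")
    return schema


def check_schema_files(config: dict, base_dir: Path) -> list[str]:
    """Collect readable problems instead of stopping at the first one."""
    problems = []
    for entry in config.get("schemas", []):
        definition = entry.get("definition")
        if definition is None:
            continue  # e.g. empty_binary schemas carry no definition
        try:
            load_definition(definition, base_dir)
        except (OSError, ValueError) as exc:
            problems.append(f"{entry.get('name')}: {exc}")
    return problems


# Demo in a temporary directory: one schema file exists, one is missing.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "schemas").mkdir()
    (base / "schemas" / "chat_message.json").write_text(
        '{"type": "object"}', encoding="utf-8"
    )
    demo_config = {
        "schemas": [
            {"name": "chat_message", "definition": "./schemas/chat_message.json"},
            {"name": "reaction", "definition": "./schemas/reaction.json"},
        ]
    }
    problems = check_schema_files(demo_config, base)

print(problems)
```

A check like this could run in CI, so that a missing or malformed schema file is caught before Centrifugo is restarted with the new configuration.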
info

"additionalProperties": false is automatically added to object schemas for security. You can explicitly set "additionalProperties": true in your schema file if you need to allow extra fields.

Applying schemas to channels

Use client_publication_data_schemas in channel or namespace configuration to apply validation:

config.json
{
"schemas": [
{
"name": "typing",
"definition": "{\"type\":\"object\",\"properties\":{},\"additionalProperties\":false}"
}
],
"channel": {
"namespaces": [
{
"name": "typings",
"publication_data_format": "json",
"client_publication_data_schemas": ["typing"],
"allow_publish_for_subscriber": true
}
]
}
}
Schema Type Compatibility

Schemas must be compatible with the channel's publication_data_format setting:

  • JSON schemas (jsonschema_draft_2020_12) require publication_data_format: "json"
  • Empty binary schemas (empty_binary) require publication_data_format: "binary"

Centrifugo validates this configuration at startup and will reject incompatible combinations.

Multiple schemas

When multiple schemas are configured, the publication data must match at least one of them. This allows supporting different message types in the same channel:

{
"schemas": [
{
"name": "typing",
"definition": "{\"type\":\"object\",\"properties\":{\"is_typing\":{\"type\":\"boolean\"}},\"required\":[\"is_typing\"]}"
},
{
"name": "reaction",
"definition": "{\"type\":\"object\",\"properties\":{\"emoji\":{\"type\":\"string\",\"enum\":[\"👍\",\"👎\",\"❤️\",\"😂\",\"😮\",\"😢\",\"😡\"]}},\"required\":[\"emoji\"],\"additionalProperties\":false}"
}
],
"channel": {
"namespaces": [
{
"name": "ephemeral",
"publication_data_format": "json",
"client_publication_data_schemas": ["typing", "reaction"],
"allow_publish_for_subscriber": true
}
]
}
}
note

All schemas referenced in client_publication_data_schemas must have the same type (either all jsonschema_draft_2020_12 or all empty_binary) since they share the same publication_data_format setting.
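
The "match at least one" rule can be sketched in Python as follows. The two predicate functions are hand-written stand-ins for the `typing` and `reaction` JSON schemas above, not a real JSON Schema validator. Returning the first match in configured order is an assumption of this sketch; the text does not state which name is reported when several schemas match.

```python
from typing import Callable, Optional

Validator = Callable[[object], bool]

ALLOWED_EMOJI = {"👍", "👎", "❤️", "😂", "😮", "😢", "😡"}


def is_typing(data: object) -> bool:
    # Stand-in for the "typing" schema: {"is_typing": <bool>} and nothing else.
    return (
        isinstance(data, dict)
        and set(data) == {"is_typing"}
        and isinstance(data["is_typing"], bool)
    )


def is_reaction(data: object) -> bool:
    # Stand-in for the "reaction" schema: {"emoji": <allowed value>} only.
    return (
        isinstance(data, dict)
        and set(data) == {"emoji"}
        and isinstance(data["emoji"], str)
        and data["emoji"] in ALLOWED_EMOJI
    )


def first_matching_schema(
    data: object, validators: dict[str, Validator]
) -> Optional[str]:
    """Return the name of the first schema that accepts `data`, else None."""
    for name, validate in validators.items():
        if validate(data):
            return name
    return None  # no schema matched: the publication would be rejected


VALIDATORS = {"typing": is_typing, "reaction": is_reaction}

print(first_matching_schema({"is_typing": True}, VALIDATORS))
print(first_matching_schema({"emoji": "👍"}, VALIDATORS))
print(first_matching_schema({"text": "hello"}, VALIDATORS))
```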

Client publication tags

Client publication tags allow you to attach server-side metadata to publications that clients cannot forge. This is useful for analytics, routing, or adding contextual information.

Configuration

Tags are defined as key-value pairs. Tag values can be:

  1. Literal strings - Used as-is without any processing
  2. CEL expressions - Wrapped in ${...} for dynamic values and conditional logic

CEL expressions are pre-compiled and validated at startup, ensuring type safety and optimal runtime performance.

config.json
{
"channel": {
"without_namespace": {
"client_publication_tags": [
{"key": "user_id", "value": "${user}"},
{"key": "client_id", "value": "${client}"},
{"key": "environment", "value": "production"}
]
}
}
}
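
To make the literal versus expression distinction concrete, here is a small Python sketch that classifies tag values the way the list above describes. It assumes an expression is a value wrapped entirely in `${...}`; whether Centrifugo accepts `${...}` embedded inside a longer string is not covered here. The function name is hypothetical.

```python
import re

# Assumption: an expression is a value wrapped entirely in ${...}.
_EXPRESSION = re.compile(r"\$\{(.+)\}", re.DOTALL)


def classify_tag_value(value: str) -> tuple[str, str]:
    """Return ("cel", expression_body) or ("literal", value)."""
    match = _EXPRESSION.fullmatch(value)
    if match:
        return ("cel", match.group(1))
    return ("literal", value)


tags = [
    {"key": "user_id", "value": "${user}"},
    {"key": "client_id", "value": "${client}"},
    {"key": "environment", "value": "production"},
]

for tag in tags:
    kind, body = classify_tag_value(tag["value"])
    print(f"{tag['key']}: {kind} -> {body}")
```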

Available CEL variables

CEL expressions in client publication tags have access to the following variables:

  • user (string) - User ID from connection credentials
  • client (string) - Client ID (unique connection identifier)
  • timestamp_ms (int) - Current server timestamp in milliseconds (Unix epoch)
  • meta (map) - Connection metadata (access nested fields like meta.tenant_id or meta.user.role)
  • vars (map) - Channel pattern variables (requires channel patterns)
  • schema_name (string) - Name of the schema that matched the publication data (empty string if no schemas configured)

All CEL expressions must return a string type and are validated at configuration load time.

Examples

Literal strings and simple variables:

{
"client_publication_tags": [
{"key": "user_id", "value": "${user}"},
{"key": "environment", "value": "production"}
]
}

Conditional logic with ternary operator:

{
"client_publication_tags": [
{"key": "tier", "value": "${meta.premium ? 'premium' : 'free'}"},
{"key": "msg_type", "value": "${vars.room_id == 'chat' ? 'reaction' : 'typing'}"}
]
}

String concatenation:

{
"client_publication_tags": [
{"key": "label", "value": "${user + ':' + meta.role}"}
]
}

Complex boolean logic:

{
"client_publication_tags": [
{"key": "access", "value": "${meta.role == 'admin' || meta.role == 'moderator' ? 'full' : 'limited'}"}
]
}

Using matched schema name:

{
"client_publication_tags": [
{"key": "msg_type", "value": "${schema_name}"},
{"key": "priority", "value": "${schema_name == 'urgent_message' ? 'high' : 'normal'}"}
]
}

Multi-tenant with channel patterns:

config.json
{
"channel": {
"patterns": true,
"namespaces": [
{
"name": "tenant_chat",
"pattern": "/tenants/:tenant_id/chat",
"client_publication_tags": [
{"key": "tenant", "value": "${vars.tenant_id}"},
{"key": "user", "value": "${user}"},
{"key": "region", "value": "${meta.region}"}
],
"allow_publish_for_subscriber": true
}
]
}
}
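
The relationship between a channel pattern and the `vars` map can be sketched in Python. This is a simplified segment-by-segment matcher written for illustration, not Centrifugo's pattern router. The function name, the user ID `user-42`, and the region `eu` are hypothetical values.

```python
from typing import Optional


def match_pattern(pattern: str, channel: str) -> Optional[dict[str, str]]:
    """Match `channel` against `pattern`, returning extracted variables.

    Segments starting with ":" capture the corresponding channel segment.
    Returns None when the channel does not match the pattern.
    """
    pattern_parts = pattern.split("/")
    channel_parts = channel.split("/")
    if len(pattern_parts) != len(channel_parts):
        return None
    variables = {}
    for expected, actual in zip(pattern_parts, channel_parts):
        if expected.startswith(":"):
            if not actual:
                return None  # a variable segment must not be empty
            variables[expected[1:]] = actual
        elif expected != actual:
            return None
    return variables


pattern_vars = match_pattern("/tenants/:tenant_id/chat", "/tenants/acme/chat")

# What the three tags from the example would evaluate to for a hypothetical
# connection with user "user-42" and meta {"region": "eu"}.
resolved_tags = {
    "tenant": pattern_vars["tenant_id"],
    "user": "user-42",
    "region": "eu",
}
print(resolved_tags)
```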
Performance

CEL expressions are pre-compiled at startup and validated to return string types. At runtime, only the evaluation happens, making the performance impact minimal. Connection metadata is only accessed when CEL expressions reference it.

Excluding client info

By default, Centrifugo includes client information in publications. For bandwidth optimization or privacy reasons, you can exclude this information:

config.json
{
"channel": {
"without_namespace": {
"client_publication_exclude_client_info": true,
"allow_publish_for_subscriber": true
}
}
}

This prevents the info field from being included in publications.

tip

Use this option when:

  • You want to reduce bandwidth usage
  • Client identity is not needed by subscribers
  • You're using client publication tags to provide necessary metadata

Complete example

Here's a comprehensive example combining all features:

config.json
{
"schemas": [
{
"name": "reaction",
"type": "jsonschema_draft_2020_12",
"definition": "{\"type\":\"object\",\"properties\":{\"emoji\":{\"type\":\"string\"},\"message_id\":{\"type\":\"string\"}},\"required\":[\"emoji\",\"message_id\"],\"additionalProperties\":false}"
}
],
"channel": {
"patterns": true,
"namespaces": [
{
"name": "room_chat_reactions",
"pattern": "/rooms/:room_id/reactions",
"publication_data_format": "json",
"client_publication_data_schemas": ["reaction"],
"client_publication_tags": [
{"key": "user_id", "value": "${user}"},
{"key": "room_id", "value": "${vars.room_id}"}
],
"client_publication_exclude_client_info": true,
"allow_publish_for_subscriber": true
}
]
}
}

Example with Empty Binary Schema

Here's an example using empty_binary schema for a typing indicator:

config.json
{
"schemas": [
{
"name": "typing",
"type": "empty_binary"
}
],
"channel": {
"patterns": true,
"namespaces": [
{
"name": "room_typing",
"pattern": "/rooms/:room_id/typing",
"publication_data_format": "binary",
"client_publication_data_schemas": ["typing"],
"client_publication_tags": [
{"key": "user_id", "value": "${user}"},
{"key": "room_id", "value": "${vars.room_id}"}
],
"client_publication_exclude_client_info": true,
"allow_publish_for_subscriber": true
}
]
}
}

Behavior

Schema validation

  • Publications are validated before being broadcast to subscribers
  • If validation fails, the client receives an error and the publication is rejected
  • Multiple schemas act as an OR condition - data must match at least one schema
  • Schema names must reference schemas defined in the top-level schemas array

Configuration validation

Centrifugo validates schema configurations at startup:

  • Schema type defaults to jsonschema_draft_2020_12 if not specified
  • JSON schemas (jsonschema_draft_2020_12) must have a definition field
  • Empty binary schemas (empty_binary) must not have a definition field
  • Schema type must be compatible with channel's publication_data_format:
    • jsonschema_draft_2020_12 requires publication_data_format: "json"
    • empty_binary requires publication_data_format: "binary"
  • All schemas referenced in client_publication_data_schemas must exist
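
The rules above can be mirrored in a pre-flight check that runs before a configuration is deployed. The Python sketch below is an approximation written from the list, not Centrifugo's own validator. The function name is hypothetical, and treating an unset `publication_data_format` as incompatible is an assumption of this sketch.

```python
DEFAULT_TYPE = "jsonschema_draft_2020_12"
# Required publication_data_format for each schema type, per the rules above.
REQUIRED_FORMAT = {"jsonschema_draft_2020_12": "json", "empty_binary": "binary"}


def validate_config(config: dict) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    errors = []
    schema_types = {}
    for entry in config.get("schemas", []):
        name = entry.get("name", "<unnamed>")
        schema_type = entry.get("type", DEFAULT_TYPE)
        schema_types[name] = schema_type
        if schema_type not in REQUIRED_FORMAT:
            errors.append(f"schema {name}: unknown type {schema_type}")
        elif schema_type == DEFAULT_TYPE and "definition" not in entry:
            errors.append(f"schema {name}: definition is required")
        elif schema_type == "empty_binary" and "definition" in entry:
            errors.append(f"schema {name}: definition is not allowed")

    channel = config.get("channel", {})
    scopes = list(channel.get("namespaces", []))
    if "without_namespace" in channel:
        scopes.append({"name": "<without_namespace>", **channel["without_namespace"]})

    for scope in scopes:
        scope_name = scope.get("name", "<unnamed>")
        # Assumption: an unset format is reported as incompatible here.
        data_format = scope.get("publication_data_format")
        for ref in scope.get("client_publication_data_schemas", []):
            if ref not in schema_types:
                errors.append(f"{scope_name}: schema {ref} is not defined")
                continue
            required = REQUIRED_FORMAT.get(schema_types[ref])
            if required is not None and data_format != required:
                errors.append(
                    f"{scope_name}: schema {ref} requires "
                    f"publication_data_format {required}"
                )
    return errors


good = {
    "schemas": [{"name": "typing", "type": "empty_binary"}],
    "channel": {"namespaces": [{
        "name": "room_typing",
        "publication_data_format": "binary",
        "client_publication_data_schemas": ["typing"],
    }]},
}
bad = {
    "schemas": [{"name": "typing", "type": "empty_binary"}],
    "channel": {"namespaces": [{
        "name": "room_typing",
        "publication_data_format": "json",  # wrong format for empty_binary
        "client_publication_data_schemas": ["typing", "reaction"],  # undefined
    }]},
}

print(validate_config(good))
print(validate_config(bad))
```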

Bottom line

In general, all existing namespace options, such as recovery/positioning, delta compression, and channel batching controls, also apply to namespaces with ephemeral client publications. Whether to enable them depends on the specific use case.
