Query Topics

Invoke the query-topics operation

query-topics is an operation for efficiently searching Volga topics across multiple sites. You can tailor the search by:

  • Choosing which sites to run the query on
  • Choosing which topics to run the query against
  • Choosing what to search for
  • Choosing how the output result should be presented

All parameters are individually documented, but here we present some example inputs.

Note that the query is always issued at the Control Tower.

By default, the Control Tower will make a best-effort attempt at sorting the incoming messages by timestamp, but it will prioritize throughput, so the order is not guaranteed to be correct. If timestamp ordering is important, use the sort option.

First example: search the topic named mytext on all sites and print only the data from that topic.

{
  "follow": true,
  "all-sites": true,
  "topics": [
    {
      "topic-names": ["mytext"],
      "output": {"payload-only": true}
    }
  ]
}

The next example searches the same topic, but only entries from one hour ago and onward, and it filters for entries that match the string "Login:".

{
  "follow": false,
  "since": "1h",
  "all-sites": true,
  "topics": [
    {
      "topic-names": ["mytext"],
      "filter": {"re-match": "Login:"},
      "output": {"payload-only": true}
    }
  ]
}

When applications are deployed, the system creates several useful system: topics. For example, scheduling information is available in a topic called system:scheduler-events. If an application called cowboy-app is deployed on many sites, the following JSON queries all sites. The data on this topic is JSON, so we can use the fields output modifier to output only certain parts of the JSON log structure.

{
  "follow": false,
  "topics": [
    {
      "topic-names": ["system:scheduler-events"],
      "filter": {"re-match": "\"application\": \"cowboy-app\""},
      "output": {
        "payload-only": true,
        "fields": "data/[application,oper-status]"
      }
    }
  ]
}

When containers are deployed, the output on stdout and stderr from each container is collected in a container topic. For application developers, searching and streaming these container output topics is important. The following query searches for errors in all containers called cowboy. It uses a few advanced features. First, the query is not run at all sites, but only at the sites where the container is actually deployed. Second, it chooses the topics using labels instead of topic names. Finally, follow is true, so the query effectively tails all the application logs, looking for errors.

{
  "follow": true,
  "sites-from-application-deployment": "cowboy-dep",
  "topics": [
    {
      "match-topic-labels": "container-name = cowboy",
      "filter": {
        "re-opts": ["caseless"],
        "re-match": "error|emerg|critical"
      },
      "output": {"payload-only": true}
    }
  ]
}
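To get a feel for which log lines the filter above would keep, the pattern can be tried locally. This is only an approximation: Python's re module stands in for the regular expression engine used by Volga, re.IGNORECASE stands in for the caseless option, and the sample log lines are invented for the illustration.

```python
import re

# Approximation only: Python's re module stands in for the server-side
# regex engine, and re.IGNORECASE stands in for the "caseless" option.
pattern = re.compile(r"error|emerg|critical", re.IGNORECASE)

# Hypothetical log lines, invented for this illustration.
sample_lines = [
    "2024-01-01 12:00:00 ERROR: connection refused",
    "2024-01-01 12:00:01 request served in 12 ms",
    "2024-01-01 12:00:02 Critical: disk almost full",
]

# Keep only the lines in which the pattern occurs anywhere.
matching = [line for line in sample_lines if pattern.search(line)]
for line in matching:
    print(line)
```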
Security: accessToken
Request
Request Body schema:
One of:
sort
boolean
Default: false

Sort all messages by timestamp. When true, the Control Tower will wait for all sites to respond before returning anything to the caller, ensuring that messages are delivered in timestamp order. Note that when follow is also set, it is not possible to wait for all sites to respond, as that would entail waiting indefinitely for possible future messages. Instead, messages are buffered and sorted for up to three seconds, first at the site level and then at the Control Tower, to account for network lag and system clock discrepancies. Under poor conditions it is still possible that some messages will be delivered out of order.
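One way to picture the non-follow case is as a merge of per-site result lists by timestamp, which can only produce a correct order once every list is available. The sketch below is a conceptual illustration and not the Control Tower implementation; the message keys ts, site and payload are invented, and each per-site list is assumed to be ordered already.

```python
import heapq

# Hypothetical per-site results. Each list is assumed to be ordered by
# timestamp already; the "ts", "site" and "payload" keys are invented.
site_a = [
    {"ts": 1, "site": "a", "payload": "first"},
    {"ts": 4, "site": "a", "payload": "fourth"},
]
site_b = [
    {"ts": 2, "site": "b", "payload": "second"},
    {"ts": 3, "site": "b", "payload": "third"},
]

# Merge the sorted per-site lists into one list ordered by timestamp.
merged = list(heapq.merge(site_a, site_b, key=lambda message: message["ts"]))
timestamps = [message["ts"] for message in merged]
print(timestamps)
```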

follow
boolean
Default: false

If follow is true, the output from the search will continue indefinitely. If follow is false, all sites that participate in the query will use the end-marker feature in Volga for the chosen topics and report to the Control Tower when there is no more data. If one or more sites chosen by the query fail to respond, the unfollow-timeout will trigger and the query is aborted.

site-timeout
string <duration>
Default: "10s"

A duration in years, days, hours, minutes and seconds.

Format is [<digits>y][<digits>d][<digits>h][<digits>m][<digits>s].

Examples: 1y2d5h, 5h or 10m30s

Valid when follow is not true

Timeout controlling how long to wait for unresponsive sites to deliver log data.
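For clients that want to validate a duration string before sending it, the format can be checked with a small parser. This is a minimal sketch and not part of the API: the function name parse_duration is hypothetical, the hour unit is taken from the examples (5h), and a year is assumed to be 365 days because the reference does not say how long a year is.

```python
import re

# Units in the documented order; every part is optional.
_DURATION = re.compile(
    r"(?:(?P<y>\d+)y)?(?:(?P<d>\d+)d)?(?:(?P<h>\d+)h)?"
    r"(?:(?P<m>\d+)m)?(?:(?P<s>\d+)s)?"
)

# Assumption: a year is counted as 365 days; the reference does not say.
_SECONDS = {"y": 365 * 86400, "d": 86400, "h": 3600, "m": 60, "s": 1}


def parse_duration(text: str) -> int:
    """Return the number of seconds in a duration such as "10m30s"."""
    match = _DURATION.fullmatch(text)
    # The empty string matches the pattern, so it is rejected explicitly.
    if not text or match is None:
        raise ValueError(f"not a valid duration: {text!r}")
    return sum(
        int(value) * _SECONDS[unit]
        for unit, value in match.groupdict().items()
        if value is not None
    )


print(parse_duration("10m30s"))
```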

compact-output
boolean
Default: false

Valid when count-matches is not true

Produce newline-delimited JSON messages. This can make life easier for a parser which consumes the output data.
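With newline-delimited output, a consumer can decode one message per line. The sketch below assumes the response text is already available as a string; the payload key and the sample lines are invented for the illustration and do not describe the actual message structure.

```python
import json


def parse_ndjson(text):
    """Decode newline-delimited JSON into a list, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Hypothetical compact output, invented for this illustration.
sample = '{"payload": "Login: alice"}\n{"payload": "Login: bob"}\n'
messages = parse_ndjson(sample)
for message in messages:
    print(message["payload"])
```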

dry-run
boolean
Default: false

This call will report which sites and topics are chosen by the query. Furthermore, when a query is issued towards a set of sites and one or several of them are down or unresponsive, this flag will report all sites that never responded, using the site-timeout value.

drop-until-n-remain
integer <int32>

Valid when follow is not true

At the Control Tower, collect all data according to the provided filters, and deliver the last n items gathered from all the specified sites. This implies that follow is false.
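The effect of keeping only the last n items can be pictured with a bounded queue. This is a conceptual sketch under the assumption that "last" refers to the order in which items are gathered; the function name last_n and the sample items are invented, and the code is not the Control Tower implementation.

```python
from collections import deque


def last_n(items, n):
    """Return the final n items of an iterable, in their original order."""
    # A deque with maxlen discards the oldest entry when it is full.
    return list(deque(items, maxlen=n))


# Hypothetical gathered items, invented for this illustration.
gathered = ["m1", "m2", "m3", "m4", "m5"]
kept = last_n(gathered, 3)
print(kept)
```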

count-matches
boolean
Default: false

Count the number of occurrences that match the provided filters. Counting occurs at the edge, and starts after the start-time settings and drop settings have been applied. No data is returned, only the count of matches as a single JSON object, for example: {"count": 36}

topics
Array of topic-names (object) or re-match-topic-name (object) or match-topic-labels (object)
since
string <duration>

A duration in years, days, hours, minutes and seconds.

Format is [<digits>y][<digits>d][<digits>h][<digits>m][<digits>s].

Examples: 1y2d5h, 5h or 10m30s

By default, queries will start at the beginning of all selected topics. This parameter will make the query start at the specified time relative to the current time. For example:

$ supctl do volga query-topics --since 1h .....

will search all topics starting with messages from one hour ago.

duration
string <duration>

A duration in years, days, hours, minutes and seconds.

Format is [<digits>y][<digits>d][<digits>h][<digits>m][<digits>s].

Examples: 1y2d5h, 5h or 10m30s

Valid when follow is not true

This parameter limits the timespan within which messages are searched for.

$ supctl do volga query-topics --since 1h --duration 10m ....

will search all topics for messages that are between 50 and 60 minutes old.
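The arithmetic behind this example can be checked with a short calculation: the window starts since before the current time and ends duration after that start. The function name search_window and the fixed clock value are invented for the illustration.

```python
from datetime import datetime, timedelta, timezone


def search_window(now, since, duration):
    """Return the (start, end) of the searched timespan."""
    start = now - since      # --since moves the start back from "now"
    end = start + duration   # --duration limits how far the search runs
    return start, end


# Hypothetical current time, fixed so the result is reproducible.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
start, end = search_window(now, timedelta(hours=1), timedelta(minutes=10))

# Age of the oldest and newest messages in the window.
print(now - start, now - end)
```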

site-names
Array of strings

A list of site names

Responses
204

No Content

400

Bad Request

401

Unauthorized

403

Forbidden

404

Not Found

503

Service Unavailable (strongbox sealed)

POST /v1/state/volga/query-topics
Request samples
sort: false
follow: false
site-timeout: 30s
compact-output: false
dry-run: false
since: 1h
duration: 30s
drop-until-n-remain: 3
count-matches: false
all-sites: true
topics:
  - re-match-topic-name: audit
    filter:
      re-opts:
        - caseless
        - ungreedy
      merged-drop-until-re-match: joe
      merged-re-match: joe
      drop-until-last-re-match: path.*volga
      re-match: joe
    output:
      payload-only: true
      format: "%t %s %h - %p"
      fields: client-ip
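
As a rough sketch of how a client might prepare a call to this endpoint, the following builds the POST request with the Python standard library, using the "Login:" example body from earlier in this section. Several details are assumptions: the host name is a placeholder, the token value is a placeholder, and sending the access token as a Bearer Authorization header is a guess at how the accessToken security scheme is carried. The request is only constructed here, not sent.

```python
import json
import urllib.request

# Hypothetical values: replace with a real Control Tower address and token.
BASE_URL = "https://control-tower.example.com"
TOKEN = "REPLACE_ME"

# Request body taken from the "Login:" example in this section.
body = {
    "follow": False,
    "since": "1h",
    "all-sites": True,
    "topics": [
        {
            "topic-names": ["mytext"],
            "filter": {"re-match": "Login:"},
            "output": {"payload-only": True},
        }
    ],
}

request = urllib.request.Request(
    BASE_URL + "/v1/state/volga/query-topics",
    data=json.dumps(body).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        # Assumption: the access token is sent as a Bearer token.
        "Authorization": "Bearer " + TOKEN,
    },
    method="POST",
)

# urllib.request.urlopen(request) would send it; omitted in this sketch.
print(request.get_method(), request.full_url)
```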