Polarity Source Analytics with Elasticsearch
  • 24 Dec 2024
The following guide walks through how to collect Polarity Source Analytics (PSA) logs from your Polarity Server using Elasticsearch.

Enable Source Analytics

Prior to setting up collection of your source analytic (PSA) logs, please ensure that source analytics logging is enabled on your Polarity Server.

Create a Polarity Agent Policy

Once source analytics are being collected on your Polarity Server, you can configure Elasticsearch to receive those logs.

  1. Login to your Elasticsearch Kibana instance.
  2. Navigate to the "Management" -> "Fleet" page
  3. Click on "Agent Policies"
  4. Click on "Create agent policy"
  5. Name the policy.  For example "polarity-source-analytics"
    Text
    polarity-source-analytics
  6. Decide if you would also like to collect system logs and metrics (note that this is not required for Source Analytics collection)
  7. The default "Advanced options" will work, but you may want to make changes depending on your organization. For example, you can add an optional description or modify the "Default namespace".
  8. Click on "Create agent policy"

Your new policy will be created but still needs to be configured.
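
If you prefer to script this step, Kibana's Fleet API can create the same agent policy. The sketch below assumes Kibana is reachable at https://your-kibana:5601 and that "elastic:changeme" is a user with Fleet privileges; both are placeholders you will need to replace:

curl -X POST "https://your-kibana:5601/api/fleet/agent_policies" \
  -u "elastic:changeme" \
  -H "kbn-xsrf: true" \
  -H "Content-Type: application/json" \
  -d '{
        "name": "polarity-source-analytics",
        "namespace": "default",
        "description": "Agent policy for Polarity source analytic (PSA) log collection"
      }'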

Configure the Agent Policy

Your new policy will show up in the Fleet list under "Agent Policies". Click on it to view the details:

  1. Click on "Add integration"
  2. Search for the "Docker" integration and click on it:
  3. Click on the "Add Docker" button to add the Docker integration to your agent policy.
  4. Under the Docker integration configuration, set the integration name:
    Text
    polarity-telemetry-collection
  5. Set the description.
    Text
    Collects polarity source analytic (PSA) logs
  6. Uncheck "Collect Docker Metrics"
  7. Check "Collect Docker container logs" and then expand the "Change defaults" section
  8. Set the "Condition" variable to only collect logs from the polarity_platform container.
    Text
    ${docker.container.name} == 'polarity_platform'
    Note
    This condition is optional; if you'd like to collect logs from all containers, you can leave this option blank.
  9. Click on "Advanced Options"
  10. Leave the "Docker container log path" set to the default value, which should be:
    Text
    /var/lib/docker/containers/${docker.container.id}/*-json.log

  11. Set the "Container parser's stream configuration" to "all"
  12. Leave the "Additional parsers configuration" text area blank.
  13. Under the "Processors" text area add the following configuration:
# Parse the JSON log line into fields under the "polarity" prefix
- decode_json_fields:
    fields: ["message"]
    target: polarity
    process_array: true
# Keep only source analytic (PSA) events; drop all other container logs
- drop_event:
    when:
      not:
        equals:
          polarity.msgType: "integration-lookup"
# Use the PSA timestamp as the event @timestamp
- timestamp:
    field: polarity.timestamp
    layouts:
      - '2006-01-02T15:04:05Z'
      - '2006-01-02T15:04:05.999Z'
      - '2006-01-02T15:04:05.000000Z'
      - '2006-01-02T15:04:05.999-07:00'
    test:
      - '2019-06-22T16:33:51Z'
      - '2019-11-18T04:59:51.123Z'
      - '2020-08-03T07:10:20.123456+02:00'
      - '2024-02-19T18:28:23.650027Z'


Note
The drop_event directive will drop all logs from the container that are not specifically telemetry logs.  If you'd like to collect all logs (including all HTTP request traffic), leave out the drop_event directive.
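
For reference, here is a rough sketch of the kind of log line these processors operate on. The exact PSA field set depends on your Polarity version; every field below other than msgType and timestamp is illustrative:

{"msgType":"integration-lookup","timestamp":"2024-02-19T18:28:23.650027Z","integration":"VirusTotal","username":"analyst1","entityCount":3}

The decode_json_fields processor unpacks this message into polarity.msgType, polarity.timestamp, and so on; drop_event then discards any event whose polarity.msgType is not "integration-lookup"; and the timestamp processor copies polarity.timestamp into @timestamp.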

Deploy Agent Policy

  1. Once you've finished configuring the Docker integration, click on "Save and continue".
  2. When prompted, click on "Add Elastic Agent to your hosts"
  3. Leave the default settings.  Copy the "Linux Tar" command and run it on your Polarity server to install the Elastic Agent and enroll it in Fleet.
  4. Run a search in Polarity to generate a telemetry log that will be shipped to Elasticsearch.
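
Before moving on, you can confirm the agent enrolled correctly. Assuming you used the "Linux Tar" install command above, the agent provides a status subcommand you can run on the Polarity server:

sudo elastic-agent status

The agent and its components should report as Healthy, and the agent should also show up as "Healthy" on the Fleet "Agents" page in Kibana.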

View your Data Stream

If you used the default namespace of "default", then your Polarity telemetry logs will be collected in the data stream "logs-docker.container_logs-default".

If you modified the namespace when configuring the Docker integration, your data stream will be named in the format "logs-docker.container_logs-{namespace}".

To find this data stream navigate to "Stack Management" -> "Index Management" -> "Data Streams":
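
You can also check for the data stream from the command line by querying Elasticsearch directly. The host and credentials below are placeholders; adjust the namespace if you changed it:

curl -u "elastic:changeme" "https://your-elasticsearch:9200/_data_stream/logs-docker.container_logs-default"

If the request returns a 404, no documents have been ingested into the data stream yet.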

Configure a Data View

If you do not see the data stream and your Agent is reporting as "Healthy", ensure you have PSA enabled on the server and that a search has been run since you enabled it.

  1. To make your data stream searchable you have to create a "Data View".  Navigate to "Kibana" -> "Data Views" and click on "Create data view".
  2. Give the data view a name:
    Text
    Polarity Source Analytics
  3. Then set the "Index Pattern":
    Text
    logs-docker.container_logs-default
  4. You can leave the Timestamp field with the default setting of "@timestamp".
  5. Click on "Save data view to Kibana"
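
As with the agent policy, this step can also be scripted. Recent Kibana versions (8.x) expose a data views API; the host and credentials below are placeholders:

curl -X POST "https://your-kibana:5601/api/data_views/data_view" \
  -u "elastic:changeme" \
  -H "kbn-xsrf: true" \
  -H "Content-Type: application/json" \
  -d '{
        "data_view": {
          "name": "Polarity Source Analytics",
          "title": "logs-docker.container_logs-default",
          "timeFieldName": "@timestamp"
        }
      }'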

View your Data

You can view the raw source analytics by navigating to "Analytics" -> "Discover".  In the top left, filter to only show data from your newly created "Polarity Source Analytics" data view.

You should now see your Source Analytics data available in Kibana.  To view the Source Analytics-specific fields, you can click on a log record and then filter the fields by the term "polarity".
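
To narrow the Discover view down to just the PSA events, you can also enter a KQL query in the search bar. For example, using the msgType field produced by the processors configured earlier:

polarity.msgType : "integration-lookup"

If you kept the drop_event processor, this filter is redundant, but it becomes useful if you chose to collect logs from all containers.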

Ingest Pipelines (Optional)

You can optionally add a custom ingest pipeline to remove fields added by Elasticsearch.  Elasticsearch adds these fields as part of the Docker integration and many of them do not provide any value and unnecessarily increase the size of your index.  

Note
The additional fields are automatically added after any processors run, which means they cannot be removed using the drop_fields directive in the processors configuration; instead, a custom ingest pipeline must be used.

To add a custom ingest pipeline:

  1. Navigate to the "Agent Policies" page and open the agent policy you created earlier (for example, "polarity-source-analytics").  Click on the "polarity-telemetry-collection" integration to edit it.
  2. Expand the "Advanced options" dropdown and click on "Add custom pipeline".
  3. Click on "Add a processor" and select "Remove" for the type of the processor.
  4. Under the "Fields" option we recommend adding the following fields for removal:
    Text
    container.labels.maintainer
    container.labels.description
    container.labels.com_docker_compose_service
    container.labels.license
    container.labels.summary
    container.labels.vendor
    container.labels.com_docker_compose_project_config_files
    container.labels.com_docker_compose_project_working_dir
    container.labels.com_docker_compose_project
    container.labels.com_docker_compose_oneoff
    container.labels.com_docker_compose_depends_on
    container.labels.com_docker_compose_container-number
    container.labels.com_docker_compose_config-hash
    log.file.path
  5. Check the "Ignore missing" and "Ignore failures for this processor" options.
  6. Once you're done, click on "Add processor".

Note
The "Fields" input will only accept a certain number of fields to remove.  If you hit the limit, click "Add processor", and then add another "Remove" processor with the additional fields you'd like to remove from the logs.

