This document introduces Nexthink Chatbot SDK and describes its concepts, API, and use cases. The information contained herein is subject to change without notice and is not guaranteed to be error-free. If you find any errors, please report them to us via the Nexthink Support Portal. This document is intended for readers with a detailed understanding of Nexthink technology.

This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited.

Last Release

Version: 2.0.0

Overview and intended use

Nexthink Chatbot SDK is an add-on component of the Nexthink core product. Chatbot SDK diagnoses IT issues and provides automatic resolutions for them by creating an abstraction layer in front of the Nexthink solution that any third-party chatbot vendor can integrate with via a standard REST API.

The main functionalities provided by Chatbot SDK include:

  • Isolating internal Nexthink components by creating a new REST API.

  • Identifying the devices used by employees and reporting when each employee last interacted with a device via mouse or keyboard.

  • Providing diagnosis and remediation to the chatbot so that it can make decisions based on a specific use case (topic) and device, for example by triggering remote actions.

  • Caching information retrieved from Nexthink Engines in order to improve the response time. This information is updated via a scheduled discovery process.

Chatbot integration solution

The solution for the chatbot integration comprises the following subsystems:

  • One or more Engine appliances with their Web API V2.0 available

  • One Nexthink Portal appliance with available remote actions API

  • One Nexthink appliance (not an Engine or Portal) with Chatbot SDK installed

  • A third-party chatbot that integrates with Nexthink Chatbot SDK

High-level architecture

Chatbot SDK is composed of several microservices which run as Docker containers. The microservices are described in the following sections.

Nexthink Chatbot SDK

Ingress

The Ingress service is the entry point and the single channel of communication between Nexthink Chatbot SDK and the third-party chatbot. This service exposes a public port and a series of endpoints. By default, the port is 8090, but it can be changed.

All external communication between the chatbot and Ingress is HTTPS-based. The same applies to internal communication between Ingress and the rest of the microservices.

Additionally, the Ingress service is in charge of authenticating each call from the chatbot using the following process:

  • An API Key must be generated by providing the necessary Nexthink credentials (basic authentication).

  • The generated API Key must be used for all the communications between the third-party chatbot and Chatbot SDK.

  • Ingress verifies the API Key, allowing or rejecting the communication with Chatbot SDK. The number of API Key requests, either for generation or retrieval, is limited to 1 every 10 seconds.
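The authentication flow above can be sketched as follows. This is a minimal illustration using only header construction; the `X-API-Key` header name and the credential values are assumptions for illustration, not the documented Chatbot SDK API:

```python
import base64

def basic_auth_header(username: str, password: str) -> dict:
    """Build the Basic-Auth header used when requesting an API key."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return {"Authorization": f"Basic {token}"}

def api_key_header(api_key: str) -> dict:
    """Build the header carrying the API key on every subsequent call.
    The header name here is a hypothetical placeholder."""
    return {"X-API-Key": api_key}

# The chatbot first requests an API key using basic authentication,
# then reuses the returned key for all other calls to Chatbot SDK.
auth_headers = basic_auth_header("admin", "secret")
call_headers = api_key_header("generated-key-123")
```

Remember that the api_key endpoint itself is rate-limited to one request every 10 seconds, so the key should be generated once and cached by the chatbot, not requested per call.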

Config Reader

The Config Reader reads all the configurations provided by several YAML files. It stores the configuration data locally and then provides it to the other services on demand. This configuration information includes:

  • The topics to be used. Each topic is stated in a YAML file with a given syntax. Ideally, each topic should be defined in its own file.

  • The list of Engines that will be taken into consideration by Chatbot SDK. Please note that the SDK provides cross-engine functionality.

  • The list of remote actions that can be triggered by Chatbot SDK. These remote actions can be executed either as part of a diagnosis, when the information they gather is needed to detect a given issue, or as part of a remediation, when the remote actions assist with or heal a given problem.

  • The list of fields that will be included in the cache. This file is automatically generated based on the device-only NXQL queries of the different topics during the installation process. It can be modified at any time to customize which fields should be cached. Refer to the Discovery and cache section for more information about the cache.
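As an illustration only, a topic file might look roughly like the sketch below. Every key shown here is hypothetical and does not reflect the actual Chatbot SDK topic syntax, which is defined in the product's configuration reference:

```yaml
# Hypothetical topic definition -- illustrative sketch, not the real syntax.
topic: slow-computer
query: |
  (select (name cpu_usage) (from device))
remediation:
  remote_action: clean-temp-files
```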

NXQL Wrapper

NXQL Wrapper is an interface between the Engines and the rest of the services, with the exception of the Discovery service, which has direct access to the Engines. It parses NXQL queries and either forwards them to the corresponding Engine (the last one that had the device connected) or retrieves the data from the cache.
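The routing decision can be sketched as follows. The criterion used here (queries whose fields are all cached are answered from the cache, everything else goes to the device's last Engine) is a simplified assumption, and real NXQL parsing is not shown:

```python
def route_query(query_fields, cached_fields, last_engine_by_device, device_id):
    """Decide whether a device query can be answered from the cache or
    must be forwarded to the Engine the device last connected to.
    Simplified sketch: assumes fields are already extracted from NXQL."""
    if set(query_fields) <= set(cached_fields):
        return ("cache", None)
    # Forward to the last Engine that had this device connected.
    return ("engine", last_engine_by_device[device_id])

# Example: all requested fields are cached, so the cache answers.
target, engine = route_query(
    ["name", "last_seen"],
    ["name", "last_seen", "platform"],
    {"dev-1": "engine-2"},
    "dev-1",
)
```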

Topic Manager

Topic Manager is responsible for all the topic logic. It triggers remote actions or executes the queries that correspond to each topic and returns the information (field values and suggested remediations) according to the topic configuration.

Remote Action Manager

Remote Action Manager handles two things:

  • Dispatches a remote action execution to Portal synchronously or asynchronously. When it is done synchronously, it regularly checks for the remote action completion until the specified timeout.

  • Queries Engines to check the values of all the fields corresponding to a remote action for a given device.

To check remote action execution results, a grouping remote action poller minimizes the number of queries sent to the Engines: it collects all pending remote action requests for all devices and merges them into a single query that is executed every 5 seconds.
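The grouping behaviour can be sketched as follows. The data shapes are assumptions for illustration; the real poller issues a merged NXQL query against the Engines rather than returning a dictionary:

```python
from collections import defaultdict

def merge_pending_requests(pending):
    """Merge pending remote-action result checks into one query per cycle.

    `pending` is a list of (remote_action_id, device_id) tuples; the
    result groups all devices per remote action so that a single query
    can cover every device at once."""
    grouped = defaultdict(set)
    for action_id, device_id in pending:
        grouped[action_id].add(device_id)
    return {action: sorted(devices) for action, devices in grouped.items()}

pending = [("ra-restart", "dev-1"), ("ra-restart", "dev-2"),
           ("ra-clean", "dev-1")]
merged = merge_pending_requests(pending)
# One query per 5-second cycle can now check ra-restart for dev-1 and
# dev-2 together instead of issuing one request per device.
```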

Remote Action Poller flow

Discovery and cache

The Discovery service together with the PostgreSQL database is responsible for maintaining the information cache, which is used by the NXQL Wrapper service, thus minimizing the number of requests sent directly to Engines.

Discovery process

The process that feeds information into the local cache is called discovery. The steps are as follows:

  • The discovery service requests the list of Engines and the list of cached fields from the Config Reader.

  • Static and dynamic fields preparation:

    • It queries Engines for all the existing dynamic fields.

    • It builds the final list of fields to be cached from the cached fields list and the dynamic fields from the Engine.

  • Devices and employee-device relationship population:

    • It queries the information about the devices and employees with activity in the last USER_ACTIVITY_PERIOD days, as defined in the environment configuration in /var/nexthink/nexthink-chatbot-adapter/.env (30 days by default). The NXQL queries are run sequentially for the Windows and MacOS platforms for each Engine and are retried 3 times.

      • Query 1: MacOS devices and users with activity in the last USER_ACTIVITY_PERIOD days

      • Query 2: MacOS device fields

      • Query 3: Windows devices and users with activity in the last USER_ACTIVITY_PERIOD days

      • Query 4: Windows device fields

    • All the information from previous queries is prepared to be stored in the PostgreSQL cached database.

      • Take the last user activity date for each device-user pair.

      • Merge the information of this user activity query with the rest of the fields of the device.

      • For devices that don’t have any user activity, the employee-device relationship is derived from the last_logged_on_user and last_logon_time fields.

  • Clean-up

    • It deletes old users and their relationships with devices. Old users are those whose last activity is older than the period configured in USER_ACTIVITY_PERIOD.

    • It deletes old devices. Old devices are those that were last seen by an Engine before the period configured in USER_ACTIVITY_PERIOD.

  • Next loop: the service waits for the period specified in the CHATBOT_CACHE_REFRESH environment variable (one hour by default, measured from the beginning of the previous discovery) and then starts again from the static and dynamic fields preparation step listed above. The CHATBOT_CACHE_REFRESH environment variable is set in /var/nexthink/nexthink-chatbot-adapter/.env
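The clean-up criterion above can be sketched as a simple age filter; the record shape and function name are assumptions for illustration:

```python
from datetime import datetime, timedelta

USER_ACTIVITY_PERIOD = 30  # days, the default from the .env file

def is_stale(last_activity: datetime, now: datetime,
             period_days: int = USER_ACTIVITY_PERIOD) -> bool:
    """A user or device is stale (and removed from the cache) when its
    last activity or last-seen time is older than USER_ACTIVITY_PERIOD
    days."""
    return now - last_activity > timedelta(days=period_days)

now = datetime(2024, 6, 1)
assert is_stale(datetime(2024, 4, 1), now)       # ~61 days old -> removed
assert not is_stale(datetime(2024, 5, 20), now)  # 12 days old -> kept
```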

Discovery flow

Error handling

If for any reason the information from an Engine cannot be successfully updated during the discovery process after the third attempt, that Engine is considered down and its data is removed from the PostgreSQL database. All Engines continue to be queried on subsequent refresh cycles, and the removed information becomes available again in the cache once the Engine replies in a new discovery sequence.

Overload protection

Nexthink Chatbot SDK was designed to protect Portal, the Engines, and its own appliance from overload. There are three mechanisms to avoid overload:

Discovery

The discovery process identifies which Engine a device belongs to and avoids repeated queries by caching the device information stored in Engines.

Remote actions group poller

Queries to retrieve the result of remote actions are grouped and sent every 5 seconds to Engines as described in the Remote Action Manager section.

Throttling

There is a set of variables in the /var/nexthink/nexthink-chatbot-adapter/.env file that control the maximum rate of calls per second that can be made to each Engine, to Portal, and to the database:

MAX_CACHE_REQUESTS_PER_SECOND=500

MAX_PORTAL_REQUESTS_PER_SECOND=20

MAX_ENGINE_REQUESTS_PER_SECOND=40

These are the default values and can be adjusted depending on the installation. The limits were calculated to protect Portal and the Engines and to minimize the number of errors they return. The CPU usage in the Engines, Portal and the adapter is also monitored.

If the request rate exceeds these limits, Chatbot SDK returns an HTTP 429 error. In this situation, the client should lower the request rate or retry the request after a certain amount of time.
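A client hitting the throttling limits can back off and retry. The sketch below simulates the HTTP 429 responses with a stub instead of real network calls, and the function name and delay policy are assumptions, not part of the Chatbot SDK API:

```python
import time

def call_with_backoff(send, max_retries=5, base_delay=0.0):
    """Retry a request while Chatbot SDK answers HTTP 429, waiting an
    increasing delay between attempts. `send` returns (status, body);
    base_delay is 0 here so the example runs instantly."""
    for attempt in range(max_retries):
        status, body = send()
        if status != 429:
            return status, body
        time.sleep(base_delay * (2 ** attempt))  # exponential backoff
    return status, body

# Stub transport: the first two calls are throttled, the third succeeds.
responses = iter([(429, None), (429, None), (200, "ok")])
status, body = call_with_backoff(lambda: next(responses))
```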

There is also a non-configurable limit on the rate of requests to the api_key endpoint: for both POST and GET methods, only one request can be made every 10 seconds. This limitation was set to protect against brute-force attacks rather than to provide overload protection.