The last view is the list of error tickets watched in the environment. A ticket shows relevant information about the incident. For example, it shows the date of the last occurrence, the current state, and the owner of the ticket (person in charge of solving it).
Hayei should show error tickets as soon as it connects. However, sometimes the environment does not connect, or the servers do not generate enough data to train a model, so no ticket arrives. The figure below shows this condition.
When the user clicks in the center, Hayei tries to connect to the environment. It can take a few minutes before tickets appear.
Once the connection succeeds, Hayei lists all open tickets by default. Hayei offers several filters and sorting options.
Adding Environments
A data source can have multiple environments. All of them share the same ELK cluster, but each one uses a different index.
An environment is an index pattern from ElasticSearch that points to a specific source of logs such as an App/Webapp. Therefore, the user should create environments with similar logs that share the same format established when setting up the ELK cluster and the App/Webapp.
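As a rough illustration of the idea that each environment corresponds to its own index pattern on a shared cluster, the sketch below maps index names to environments. The environment names, the patterns, and the helper `environment_for_index` are hypothetical, and the glob matching from Python's `fnmatch` module only approximates how wildcard index patterns behave.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Hypothetical environments, each mapped to an illustrative index pattern.
# All of them are assumed to live on the same ELK cluster.
ENVIRONMENTS = {
    "webapp-production": "webapp-prod-*",
    "webapp-staging": "webapp-staging-*",
}


def environment_for_index(index_name: str) -> Optional[str]:
    """Return the first environment whose pattern matches the index name."""
    for environment, pattern in ENVIRONMENTS.items():
        if fnmatchcase(index_name, pattern):
            return environment
    return None


matched = environment_for_index("webapp-prod-2024.01.15")
unmatched = environment_for_index("billing-2024.01.15")
```

Here `matched` is the production environment, while `unmatched` is `None` because no configured pattern covers that index.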
Tickets Information
This section shows the #ID, the current name (which the user can change), the creation date, the last occurrence, the environment (the source of the alert), and other parameters that indicate the current state and the person taking care of the ticket.
Ticket State
By default, a ticket starts in the Discovered state. During the solution process (open incidents), it can move through intermediate states such as investigating, resolving, simulating the solution, and reproducing or applying the solution. After the problem is solved, the ticket moves to a closed state: resolved, rejected, or archived. Assigning a state is helpful, for example, for filtering tickets by state.
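The states listed above can be sketched as an enumeration with a helper that separates open from closed tickets. This is only an illustration: the identifiers (`TicketState`, `is_open`, `filter_by_state`) are assumptions, and treating "reproducing or applying the solution" as a single state is a guess about how Hayei models it.

```python
from enum import Enum


class TicketState(Enum):
    # Initial state
    DISCOVERED = "discovered"
    # Intermediate states (open incidents)
    INVESTIGATING = "investigating"
    RESOLVING = "resolving"
    SIMULATING_SOLUTION = "simulating the solution"
    REPRODUCING_OR_APPLYING_SOLUTION = "reproducing or applying the solution"
    # Closed states
    RESOLVED = "resolved"
    REJECTED = "rejected"
    ARCHIVED = "archived"


CLOSED_STATES = frozenset(
    {TicketState.RESOLVED, TicketState.REJECTED, TicketState.ARCHIVED}
)


def is_open(state):
    """A ticket is open while it has not reached a closed state."""
    return state not in CLOSED_STATES


def filter_by_state(tickets, state):
    """tickets: iterable of (ticket_id, TicketState) pairs."""
    return [ticket_id for ticket_id, current in tickets if current == state]


# Hypothetical tickets used only to exercise the helpers.
tickets = [
    (1, TicketState.DISCOVERED),
    (2, TicketState.RESOLVED),
    (3, TicketState.INVESTIGATING),
]
open_ids = [ticket_id for ticket_id, state in tickets if is_open(state)]
```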
Ticket State History
Hayei keeps track of each ticket’s state history. In a dedicated view, anyone can see the progress of each ticket.
Ticket Owner
In Hayei, a ticket can be assigned to a user (a support team member) who is responsible for looking into it.
Incident Insights
For incident tickets, Hayei shows the current label and severity, highlighted with a color that depends on the severity. It also displays how often the error appears, where it occurs (host), and who triggers the ticket (logger). An unknown label or severity means that Hayei does not have the criteria to assign a value.
Sometimes the alert arrives with a predefined name and severity based on previous feedback from the user. Those assigned values (tags) might not reflect reality. In that case, the user can correct both fields by clicking on the highlighted box or on the link below it.
Labeling a ticket
The first time a user labels a ticket, Hayei shows the log archetypes to help understand the issue. After that, it is possible to reuse a label. In this scenario, Hayei merges the current ticket (child) with the labeled one (parent) into a single ticket that holds the union of their archetypes.
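The merge described above can be pictured as a set union. The sketch below is illustrative only: the `Ticket` class, the `apply_label` helper, and the registry dictionary are assumptions made for the example, not Hayei's internal data model.

```python
from dataclasses import dataclass, field


@dataclass
class Ticket:
    ticket_id: int
    label: str = "unknown"
    archetypes: set = field(default_factory=set)


def apply_label(ticket, label, labeled_tickets):
    """Label a ticket; if the label already exists, merge into its parent.

    labeled_tickets maps each label to the parent ticket that carries it.
    Returns the ticket that holds the label after the operation.
    """
    parent = labeled_tickets.get(label)
    if parent is None:
        # First use of this label: the ticket becomes the parent.
        ticket.label = label
        labeled_tickets[label] = ticket
        return ticket
    # Label reused: the child's archetypes are folded into the parent.
    parent.archetypes |= ticket.archetypes
    return parent


registry = {}
first = Ticket(1, archetypes={"archetype-A", "archetype-B"})
second = Ticket(2, archetypes={"archetype-B", "archetype-C"})
apply_label(first, "disk full", registry)
merged = apply_label(second, "disk full", registry)
```

After the second call, `merged` is the first ticket and its archetypes are the union of both sets.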
Set Ticket Severity
Hayei presents the severity of a ticket on a five-level scale that indicates how much attention the ticket demands.
Healthy
No loss of service. The system is working as expected.
Low
Minor issues that require action but do not affect the customer’s use of the product. Such defects can cause some malfunction but keep the software up and running. The workaround is easy and obvious.
Medium
Product features are unavailable, but a workaround exists, and most software functions are still usable. This includes minor function or feature failures that the customer can easily circumvent or avoid. The customer experiences a small loss of operational functionality.
High
Important product features are completely unavailable with no acceptable workaround.
Critical
A complete loss of service. No workaround is immediately available.
After applying the changes, the ticket appearance changes.
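Because the five severity levels are ordered, they map naturally onto an integer enumeration, which makes sorting tickets by urgency straightforward. The following is a minimal sketch; the numeric values and the `most_severe_first` helper are assumptions chosen for illustration.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Numeric values are illustrative; only their order matters here.
    HEALTHY = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def most_severe_first(tickets):
    """tickets: iterable of (name, Severity) pairs, sorted by severity."""
    return sorted(tickets, key=lambda ticket: ticket[1], reverse=True)


# Hypothetical tickets used only to show the ordering.
ranked = most_severe_first(
    [
        ("slow dashboard", Severity.LOW),
        ("service down", Severity.CRITICAL),
        ("export broken", Severity.MEDIUM),
    ]
)
ranked_names = [name for name, _ in ranked]
```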
Incident logs fingerprint, occurrence timeline, and navigation
Hayei also describes the ticket. For Hayei, a ticket is a collection of logs within a time window. Each log belongs to an archetype, and the set of log archetypes defines an incident type. Hayei calls it an incident fingerprint. Every time Hayei sees the same pattern, it adds an occurrence to the internal counter, which is shown as a distribution of events on a timeline. The arrows allow navigating the timeline day by day.
The user can label and refine the incident (a.k.a. ticket) by clicking the small brain icon in the upper-right corner of each log archetype.
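One way to picture the fingerprint and its occurrence counter is shown below. It is a sketch under stated assumptions: the fingerprint is taken to be the order-independent set of archetype identifiers seen in a window, occurrences are counted per day, and the names `fingerprint` and `OccurrenceTracker` are hypothetical. How Hayei actually computes fingerprints is not described in this guide.

```python
from collections import Counter
from datetime import date


def fingerprint(log_archetypes):
    """Order-independent fingerprint: the set of archetype ids in a window."""
    return frozenset(log_archetypes)


class OccurrenceTracker:
    def __init__(self):
        # Maps each fingerprint to a Counter of day -> number of occurrences.
        self._per_day = {}

    def record(self, log_archetypes, day):
        """Count one occurrence of the window's fingerprint on the given day."""
        fp = fingerprint(log_archetypes)
        self._per_day.setdefault(fp, Counter())[day] += 1
        return fp

    def timeline(self, fp):
        """Return (day, count) pairs in chronological order."""
        return sorted(self._per_day.get(fp, Counter()).items())


tracker = OccurrenceTracker()
fp = tracker.record(["timeout", "db-error"], date(2024, 1, 1))
tracker.record(["db-error", "timeout"], date(2024, 1, 1))  # same pattern
tracker.record(["timeout", "db-error"], date(2024, 1, 2))
timeline = tracker.timeline(fp)
```

The second call records the same fingerprint even though the archetypes arrive in a different order, so the first day counts two occurrences.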
The new view allows splitting a ticket into two different incidents by moving a subset of logs from the parent ticket to another incident. This option helps the internal machine learning model learn based on user feedback.
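Splitting a ticket amounts to partitioning its logs into the part that stays with the parent and the part that moves to another incident. The sketch below illustrates that partition; `split_ticket` and the use of plain strings as log identifiers are assumptions made for the example.

```python
def split_ticket(parent_logs, logs_to_move):
    """Partition parent_logs into (remaining, moved), preserving order.

    Raises ValueError if a log to move is not in the parent, or if the
    split would leave either side empty.
    """
    to_move = set(logs_to_move)
    unknown = to_move - set(parent_logs)
    if unknown:
        raise ValueError(f"logs not in parent ticket: {sorted(unknown)}")
    remaining = [log for log in parent_logs if log not in to_move]
    moved = [log for log in parent_logs if log in to_move]
    if not remaining or not moved:
        raise ValueError("a split must leave logs on both sides")
    return remaining, moved


remaining, moved = split_ticket(["log-1", "log-2", "log-3"], ["log-2"])
```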
Solutions
For tickets, Hayei shows possible solutions. The solution is common to all the tickets that share the same incident type.
In this tab, a user can propose or edit a solution.
Here, users can vote (thumb up or down) for a solution or even edit the current one.
Hayei uses the Log Archetypes to search for solutions (on StackExchange, Launchpad, and Google). In some cases, Hayei extracts code snippets that might help solve the problem.
Hayei also provides a list of URLs that might discuss ways to overcome the current incident.
Actions
Hayei can trigger actions to solve a problem automatically when a ticket reappears over time. The purpose of actions is to improve the user experience and to avoid opening support tickets for the same incident type (same log archetypes) when it appears in other environments.
Actions fall into two categories: API and SSH. API actions are calls that Hayei makes to fix a problem in the environment where it created the ticket. SSH actions are commands that run directly on a server. For this feature, Hayei needs specific permissions to make changes.
For now, Hayei supports only API calls via cURL.
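To give a sense of what an API action issued through cURL might look like, the sketch below assembles a `curl` command line without running it. The endpoint URL, the header, the request body, and the helper `build_curl_command` are all hypothetical; the guide does not specify how Hayei stores or executes its actions.

```python
import shlex


def build_curl_command(method, url, headers=None, body=None):
    """Assemble a curl argument list for an API action (not executed here)."""
    command = ["curl", "--silent", "--show-error", "--request", method.upper(), url]
    for name, value in (headers or {}).items():
        command += ["--header", f"{name}: {value}"]
    if body is not None:
        command += ["--data", body]
    return command


# Hypothetical action: ask a service to restart through its API.
command = build_curl_command(
    "post",
    "https://example.invalid/api/restart",
    headers={"Content-Type": "application/json"},
    body='{"service": "web"}',
)
# Shell-quoted form, convenient for display or logging.
printable = shlex.join(command)
```

Keeping the command as a list of arguments avoids shell-quoting problems if it is later passed to a process runner.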
All the API actions are visible.
Hayei allows testing each API action in the Manage section.
For SSH command execution, the logic is similar to the API calls.
For a quick view, Hayei also displays all SSH actions.
For traceability, Hayei keeps track of the execution history.
Alerts
The last section is Alerts. Here a user can see and add alerts by Incident Type or Logger.
Setting a new alert is similar to the method explained earlier in the ElasticSearch Alerts section. The main difference is that a user can add an alert that fires when a ticket is opened for a known incident type.
It is also possible to create an alert that fires when Hayei detects a given logger combination in a new ticket. This is useful, for example, when an issue triggers logs from two different loggers (a.k.a. applications).
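The two kinds of alert conditions described above, by incident type or by logger combination, can be sketched as a simple matching rule. The `Alert` class and `alert_matches` function are hypothetical, and the sketch assumes that a logger-combination alert fires when every logger named in the alert appears in the new ticket.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Alert:
    # Exactly one of the two conditions is expected to be set.
    incident_type: Optional[str] = None
    loggers: FrozenSet[str] = frozenset()


def alert_matches(alert, ticket_incident_type, ticket_loggers):
    """Return True if a new ticket meets the alert's condition."""
    if alert.incident_type is not None:
        return alert.incident_type == ticket_incident_type
    # Logger combination: all alert loggers must be present in the ticket.
    return bool(alert.loggers) and alert.loggers <= set(ticket_loggers)


# Hypothetical alerts and ticket data.
by_type = Alert(incident_type="disk full")
by_loggers = Alert(loggers=frozenset({"auth-service", "api-gateway"}))

type_hit = alert_matches(by_type, "disk full", ["storage"])
logger_hit = alert_matches(by_loggers, "unknown", ["api-gateway", "auth-service", "db"])
logger_miss = alert_matches(by_loggers, "unknown", ["api-gateway"])
```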
Hayei keeps track of the alert in the alert history section.
In a view, Hayei shows details about the new tickets that meet the conditions defined in the alert.