Optional Missing Data

Intent:

This pattern allows a model to continue an inference when desirable facts, data or inferences are unavailable, rather than stopping the inference altogether. Information is not always available to Rainbird at run time; however, you may want the query to continue the inference with reduced confidence.

Applicability

This pattern is applicable when: 

  • Answers have been skipped by a user in a Rainbird-driven interaction (allowUnknown enabled).
  • Data is not available from a datasource, from a stored fact in the Knowledge Map, or from injection into the model via the API.
  • Facts have not been inferred at run time (no fact was inferred, or it was inferred with too low a certainty).
  • The missing facts are not essential to a decision being made.
  • The model is using uncertainty in its inference.
  • The missing data is not essential to make a required fact.

Limitations:

Primary missing data cannot be set to optional if it is the only factor considered within the rule, i.e. the only rule condition or the only weighted condition.

Implementation:

Rules that have decreased confidence due to missing data may still not produce a fact if the impact of the unmet condition produces a certainty below the rule's minimum certainty: either the default 20% rule minimum or a different minimum certainty set on the rule.
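
As a rough illustration of this minimum-certainty check, the Python sketch below treats a rule result as a certainty percentage and discards it when it falls below the minimum. It is not Rainbird code: the function name apply_minimum_certainty and the constant are hypothetical, and only the 20% default is taken from this article.

```python
from typing import Optional

# Default rule minimum quoted in this article (percent).
DEFAULT_MINIMUM_CERTAINTY = 20.0


def apply_minimum_certainty(
    certainty: float, minimum: float = DEFAULT_MINIMUM_CERTAINTY
) -> Optional[float]:
    """Return the certainty if the rule may produce a fact, otherwise None.

    Hypothetical helper: a result below the minimum produces no fact.
    """
    if certainty < minimum:
        return None
    return certainty


# A rule left at 15% after an unmet optional condition produces no fact.
low = apply_minimum_certainty(15.0)
# The same result survives if a lower minimum is assumed for the rule.
kept = apply_minimum_certainty(15.0, minimum=10.0)
```

In this sketch a result exactly at the minimum is kept, which is an assumption about how "below" is interpreted.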

We will use the example of an insurance company wanting to assess liability on a motor claim. The involved parties have completed a questionnaire with information on the circumstances of the incident. The model will produce a summary of the driver’s behaviour as either ‘Appropriate’ or ‘Inappropriate’ in these circumstances, together with a confidence in the result.

We will work through two examples in which some of this information, deemed non-essential, is missing.

Examples:

  1. The model includes the driver’s description of the road conditions (wet, dry, icy, muddy etc.) as part of their testimony. If the driver does not remember or has not included this information, we can still be reasonably confident that their behaviour in the incident was appropriate/inappropriate based on other key factors. The fact that the driver describes the road as ‘wet’ is used directly in the rule making the inference; we will call this primary missing data.
  2. Calculations around the vehicle’s required and estimated stopping distance are also included in the assessment of the driver’s behavioural appropriateness. The calculation takes into account the road speed limit and the actual vehicle speed (which the driver does not provide), and an assessment is then made as to whether the stopping distance was suitable (true/false). This inferred result is then used in the rule making the behavioural assessment; the suitability is also a non-essential fact in this inference. As the missing data is not used directly in the assessment of the driver’s behaviour, but by a condition that forms part of it, we will call this secondary missing data.

The important difference between these two examples is where the optionality of the missing data is described, and where the impact of it not being provided is captured.

Fig. 1 – Subject matter describing the factors involved in deeming a driver’s behaviour to be inappropriate when:

  • The road conditions were described by the driver as “wet” (optional)
  • The visibility of the road was classified as “Poor” (mandatory)
  • The calculated stopping distance was “not suitable” (optional)

These criteria have an equal impact on the result of the decision (they all carry the same weight in the decision, a third each).
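
To make the effect of the equal weighting concrete, here is a minimal Python sketch of the rule in fig. 1. It assumes a simple weighted-sum model in which each met condition contributes its share of the total weight and an unmet mandatory condition prevents any fact from being produced. The Condition class and rule_confidence function are hypothetical illustrations, and Rainbird's own certainty calculation may differ.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Condition:
    name: str
    weight: float   # relative share of the rule's confidence
    optional: bool  # False means the condition is mandatory
    met: bool       # whether the condition was satisfied at run time


def rule_confidence(conditions: List[Condition]) -> Optional[float]:
    """Return a confidence percentage, or None when no fact is produced."""
    # An unmet mandatory condition stops the rule from producing a fact.
    if any(not c.met and not c.optional for c in conditions):
        return None
    total = sum(c.weight for c in conditions)
    achieved = sum(c.weight for c in conditions if c.met)
    return 100.0 * achieved / total


# Fig. 1 with the driver's road description missing (assumed scenario).
fig1 = [
    Condition("road described as wet", 1, optional=True, met=False),
    Condition("visibility classified as Poor", 1, optional=False, met=True),
    Condition("stopping distance not suitable", 1, optional=True, met=True),
]
confidence = rule_confidence(fig1)  # two of the three equal weights are met
```

Under these assumptions the missing road description costs a third of the confidence, while a missing visibility classification would leave the rule with no result at all.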

Fig. 2 – Subject matter for assessing the suitability of the vehicle’s speed (true/false) when:

  • A vehicle is on the road with a speed limit of X mph
  • The driver was travelling at a speed of Y mph
  • Y is greater than X

How do we decide where the optionality and impact are captured?

Within Rainbird, certain conditions within rules can be made optional (see Elements – Condition behaviour for a link to resources). In these circumstances we can use the optional functionality because the missing data is not essential to complete the required inference(s). However, it is important to apply the optional behaviour to the right condition, in the right way, so that the impact of the missing data is captured correctly (see Condition Weighting).

Primary Missing Data

In example 1, the missing data is used directly in the final, uncertain inference. The subject matter (fig. 1) states that if the driver cannot remember or has not provided the description of the road conditions, in this case ‘wet’, then we lose a third of the overall rule confidence. The missing data is expected to come from outside the Rainbird platform, from an external source such as user input, a datasource or injection via the API.

In these circumstances the optionality can be applied directly to the condition that seeks to use this data. This has the desired effect of decreasing the percentage confidence of the inferred fact.

Secondary Missing Data

In example 2, the missing data is used indirectly by the final uncertain inference. The missing data (the vehicle’s speed) forms part of a calculation within the model through the speed suitability relationship (explained in fig. 2), whose result is then taken into consideration in the categorisation of the driver’s behavioural summary. We are told that the fact that the vehicle’s speed was not suitable (false) is not essential for the inference to be made, and that if it is not available we are a third less confident in the overall inference.

In these circumstances the missing data is not used in the same rule in which the impact of it being missing is considered (what is missing is the vehicle’s speed, which in turn makes the speed suitability inference impossible). The model has been told that certain calculations of stopping distances require the road conditions to be taken into consideration; these calculations cannot be completed without this data being provided. The same would be true if the road’s speed limit were also missing as a variable.

In this case the rules that rely on the missing data fail to infer a fact, which in turn causes further missing data within the chain of inference. As per the subject matter, this chain of missing data becomes optional in the rule shown in fig. 1.
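
The chain described above can be sketched in Python as two steps: an intermediate rule whose inputs are both mandatory, and a final rule that treats the intermediate result as optional. This is an illustrative model only. The function names are hypothetical, the fig. 2 comparison stands in for the third condition of fig. 1, and each met condition is assumed to contribute an equal third of the confidence.

```python
from typing import Optional


def speed_exceeds_limit(
    limit_mph: Optional[float], speed_mph: Optional[float]
) -> Optional[bool]:
    """Intermediate rule based on fig. 2: both inputs are mandatory.

    Returns None (no fact inferred) when either value is missing.
    """
    if limit_mph is None or speed_mph is None:
        return None
    return speed_mph > limit_mph


def behaviour_confidence(
    visibility_poor: bool,
    road_wet: Optional[bool],
    speed_unsuitable: Optional[bool],
) -> Optional[float]:
    """Final rule based on fig. 1: visibility is mandatory, the rest optional."""
    if not visibility_poor:
        return None  # unmet mandatory condition: no fact is produced
    # Each met condition is assumed to carry an equal third of the confidence.
    met = 1 + sum(1 for fact in (road_wet, speed_unsuitable) if fact is True)
    return 100.0 * met / 3


# The driver did not provide a speed, so the intermediate rule infers nothing.
intermediate = speed_exceeds_limit(30, None)
# The final rule still completes, a third less confident.
reduced = behaviour_confidence(True, True, intermediate)
# With the speed supplied, every condition is met.
full = behaviour_confidence(True, True, speed_exceeds_limit(30, 45))
```

The optional behaviour sits on the final rule only: the intermediate rule stays strict, and its failure to infer a fact is what the final rule tolerates.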

Allow User to Skip (AllowUnknown=“true”)

If the information is provided to the Rainbird model directly by an end user, this relationship attribute must be set to ‘true’ to allow the end user to skip the question without providing an answer. This feature does not need to be included if the data is injected into the model ahead of the query; however, it is best practice to include it on any relationship that may not receive a fact/answer, to allow non-integrated testing.

Optional Condition behaviour

Optional conditions allow Rainbird to complete the inference of a fact without the condition itself being met. The unmet condition will affect the overall confidence of the inferred fact.

Mandatory Condition behaviour

Mandatory conditions mean that Rainbird cannot complete the inference of a fact without the condition itself being met. An unmet mandatory condition causes the rule to produce no fact. In this pattern, this creates secondary missing data that is dealt with as optional in further rules.

Condition Weighting

This attribute dictates the impact of an optional condition not being met on the overall confidence of the inferred fact.

Click on the ‘Export.rbird’ button to download the ‘Optional Missing Data’ map used in this example. The knowledge map can then be imported into your Rainbird Studio.

Query and Results

The export file below contains the main rules on ‘deemed to have’. Please note that only specific answers will show results; more information is given in the article above.


Version 1.01 – Last Update: 26/02/2021