In artificial intelligence, preference-based planning is a form of automated planning and scheduling which focuses on producing plans that additionally satisfy as many user-specified preferences as possible. In many problem domains, a task can be accomplished by various sequences of actions (also known as plans). These plans can vary in quality: there can be many ways to solve a problem but one generally prefers a way that is, e.g., cost-effective, quick and safe.
Preference-based planners take these preferences into account when producing a plan for a given problem. Examples of preference-based planning software include PPLAN[1] and HTNPlan-P[2] (preference-based HTN planning).
Overview
Preferences can be regarded as soft constraints on a plan. The quality of a plan increases when more preferences are satisfied but it may not be possible to satisfy all preferences in a single plan. This differs from hard constraints which must be satisfied in all plans produced by the planning software. These hard constraints are part of the domain knowledge while the soft constraints (or preferences) are separately specified by the user. This allows the same domain knowledge to be reused for various users who may have different preferences.
The use of preferences may also increase the length of a plan in order to satisfy more preferences. For example, when planning a journey from home to school, the user may prefer to buy a cup of coffee along the way. The planning software could now plan to visit Starbucks first and then continue to school.[3] This increases the length of the plan but the user's preference is satisfied.
Planning Domain Definition Language
The Planning Domain Definition Language (as of version 3.0[4]) supports the specification of preferences through preference
statements. For example, the statement
(preference (always (clean room1)))
indicates that the user prefers that room1
should be clean at each state of the plan. In other words, the planner should not schedule an action that causes room1
to become dirty. As this example shows, a preference is evaluated with regard to all states of a plan (if semantically required).
In addition to always
, other constructs based on linear temporal logic are also supported, such as sometime
(at least once during the plan), sometime-after
(to be planned after a particular state) and at-most-once
(the preference holds during at most one sequence of states in the plan).
Plan quality
In addition to determining whether a preference is satisfied, we also need to compute the quality of a plan based on how many preferences are satisfied. For this purpose, PDDL 3.0 includes an expression called is-violated <name>
which is equal to "the number of distinct preferences with the given name that are not satisfied in the plan".[4] For a plan, a value can now be computed using a metric function, which is specified with :metric
:
(:metric minimize (+ (* 5 (is-violated pref1)) (* 7 (is-violated pref2))))
This example metric function specifies that the calculated value of the plan should be minimized (i.e., a plan with value v1 and a plan with value v2 such that v1 < v2, the former plan is strictly preferred). The value of a plan is computed by the given function, which is expressed in Polish notation. In this case, violation of the second preference, pref2
, has been given a greater penalty than the first preference, pref1
.
Constraints satisfaction problem
In the area of constraint satisfaction problems, flexible variants exist that deal with soft constraints in a similar way to preferences in preference-based planning.
References
- โ PPLAN, Bienvenu et al.
- โ HTN Planning with Preferences, Sohrabi et al.
- โ Planning with Preferences using Logic Programming, Son and Pontelli
- 1 2 Deterministic planning in the fifth international planning competition: PDDL3 and experimental evaluation of the planners, Gerevini et al.