ExpressionScript - Quick Tutorial

=Introduction=

LimeSurvey uses the ExpressionScript module (formerly called Expression Manager and abbreviated as EM on this page), which lets LimeSurvey support more complex branching, assessments, validation, and tailoring. It replaces how LimeSurvey manages Replacements, Conditions, and Assessments on the back-end. It also speeds up processing considerably since it eliminates most run-time database reads. EM was developed by Dr. Thomas White (TMSWhite).

Key Definitions

 * Expression: Anything surrounded by curly braces:
 ** As long as there is no white space immediately after the opening curly brace or before the closing curly brace.
 ** The expression content is evaluated by EM, so it can contain mathematical formulas, functions, and complex string and date processing.
 * Tailoring: Sometimes called "piping". It is the process of conditionally modifying text:
 ** You have access to all 'replacement fields', participant data, and response data.
 ** You also have easier access to questions, answers, and their properties.
 * Relevance Equation: A new question attribute controlling question visibility:
 ** If there is a relevance equation, then the question is only shown if the relevance evaluates to true.
 ** Internally, all array_filter and array_filter_exclude commands become subquestion-level relevance.
 * Equation Question Type: A new question type that saves calculations or reports to the database:
 ** It is like a Boilerplate question, but its contents are saved to the database even if you set "Always Hide this Question".
 * Question Code: This is the preferred variable name for EM:
 ** This can be a descriptive name indicating the purpose of the question, making it easier to read complex logic.
 ** Valid question codes must NOT start with a number, so if you use the question code to number your questions, use something like "q1", "q1a", or "g1q2".
 ** This is what becomes the variable name if you export data to SPSS or R. So, if you do statistical analysis, make sure that every question code is unique.

Do I have to use EM?
The short answer is "no". However, this heavily depends on the complexity of the survey you want to create.

For example, the Conditions editor covers some basic expressions that can be applied to the questions of your survey. However, the Conditions editor is limited. That is why the EM is used - it expands the realm of customization possibilities.

Can I mix Conditions and Relevance equations?
Yes. You can use the Conditions editor for some questions and Relevance equations for others.

You cannot have both conditions and expressions set up in the same question! Once a condition is set up, it replaces whatever expression is written in the relevance equation field. Moreover, the Relevance equation field can no longer be edited manually.

Yet, there is a way to use both expressions and conditions within a question. As mentioned above, a condition replaces the content of the relevance equation field. Once the condition has been saved, look at the newly generated equation and copy it into a text editor. Then delete the condition from the Conditions editor and edit the question, pasting the condition-based expression from your text editor alongside the other expressions you wish to use.

How should I choose between Conditions and Relevance?
Here is a list of the pros and cons of each style:

=Getting Started=

The best way to get started with the EM is to:
 * Install the latest stable version from http://www.limesurvey.org/en/download.
 * Import and explore some sample surveys.
 * Explore the use cases and HowTos, and the step-by-step examples.
 * Explore the EM documentation (this page)
 * Explore the Unit Tests of Isolated Expressions (advanced):
 ** they show examples of using all EM functions and operators, and the PHP and JavaScript results;
 ** note that a few functions generate different results in the PHP and JavaScript versions, so this page lets you plan your EM logic accordingly.

=Terminology=

The following terms are commonly used to describe the capabilities of the EM, and they reflect how the EM "thinks" of its functionality:
 * Relevance-based Branching - if a question is relevant, then ask it, otherwise don't (e.g., make it invisible and mark it as NULL in the database). You can find the Relevance fields in the question editor panel as well as in the question group editor panel. The latter is used to apply a set of conditions to an entire group without having to copy the same condition to each question, and/or to combine group and question-level conditional logic.
 * Tailoring - Once you know which questions should be asked, tailoring (sometimes called piping) specifies how the question should be asked. This lets you support not only simple substitution (like {TOKEN:FIRSTNAME}), but also conjugation of verbs and declension of nouns based upon the gender or number of your subjects. It also lets you change the message you deliver to a survey respondent based upon whether they answered (or how they answered) other questions.
 * Equations - EM adds a new question type called Equation which stores the result of an Expression. The equation results are computed and written to the database, even if you hide them on the page. Thus, they are used for hidden scoring calculations, navigation based upon complex equations, assessments, and reports that will be generated and stored within the database.

Relevance and Cascading Relevance
Every question type now has a Relevance option which controls whether the question is displayed or not. The EM processes each Relevance Equation in the order they appear in the survey. If the expression is true (or missing - to support legacy surveys), the question will be displayed. If it is not relevant, then the question will be hidden, and the value will be NULLed in the database. If there are no relevant questions in a group, the entire group will be skipped.

Moreover, if any of the variables within an expression is irrelevant, then the expression always evaluates to false. This enables Cascading Relevance so that you do not have to write very long Relevance equations for each question.

Say you have five questions Q1-Q5, and you only want to show Q2 if Q1 was answered, and Q3 if Q2 was answered, etc.  The relevance equations might be:
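
One way to picture this chain is the rough Python model below. It is only a sketch of the rule described above (an irrelevant variable makes the expression false), not EM's actual implementation; the function name `cascade_relevance` and the sample answers are assumptions made for illustration.

```python
def cascade_relevance(chain, answers):
    """Return {code: bool} for a chain where each question depends on the previous one.

    chain   -- question codes in survey order, e.g. ["Q1", ..., "Q5"]
    answers -- dict of submitted values; a missing key or "" means unanswered
    """
    relevant = {}
    previous = None
    for code in chain:
        if previous is None:
            # The first question has no relevance equation, so it is always shown.
            relevant[code] = True
        else:
            # An irrelevant variable makes the whole expression false,
            # so a stale answer to a hidden question never counts.
            answered = answers.get(previous, "") != ""
            relevant[code] = relevant[previous] and answered
        previous = code
    return relevant


chain = ["Q1", "Q2", "Q3", "Q4", "Q5"]
# Q3 is unanswered; Q4 still holds a value from an earlier pass through the survey.
answers = {"Q1": "yes", "Q2": "no", "Q4": "stale value"}
result = cascade_relevance(chain, answers)
print(result)
```

In this model Q4 is hidden because Q3 is unanswered, and Q5 is hidden because Q4 is irrelevant, even though a leftover value for Q4 exists. That is why each question only needs to mention the question directly before it.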

Group-Level Relevance
ExpressionScript also supports group-level relevance. This makes it easier to implement looping. Say you want to collect information from up to 10 entities (such as products or people from a household), where you first determine how many entities need follow-up (such as by asking how many people live in a household or having people check which products they like from a long list). After knowing how many entities need follow-up, you can use group-level relevance like {count >= 1}, {count >= 2}, ... {count >= 10} for each of the 10 groups of follow-up questions. Within each group, you can have question-level conditional logic (e.g. gender or age-specific follow-up questions for each subject). The question-level and group-level relevance equations are ANDed together to determine which questions are shown.

To check such an example, import the following survey: [[Media:EM survey - Cohabs.zip|Census survey example]].
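
The looping pattern described in this subsection can be sketched in Python as follows. This is a simplified model under the assumption that group N carries the relevance `count >= N`; the helper names are hypothetical and do not correspond to LimeSurvey functions.

```python
def group_is_relevant(group_number, count):
    """Model of the group-level relevance {count >= N} for follow-up group N."""
    return count >= group_number


def question_is_shown(group_number, count, question_relevant):
    """Group-level and question-level relevance are ANDed together."""
    return group_is_relevant(group_number, count) and question_relevant


count = 3  # e.g. the respondent reported three household members
shown_groups = [g for g in range(1, 11) if group_is_relevant(g, count)]
print(shown_groups)
```

With `count = 3`, only the first three of the ten follow-up groups are presented, and a question inside one of those groups still has to pass its own relevance equation.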

Tailoring / Piping
Anything within curly braces is now treated as an Expression (with one exception described below). Expressions have access to all the LimeReplacementFields and variables (via several aliases), all typical equation operators (mathematical, logical, and comparison), and to dozens of functions (that even work dynamically on the client-side).

By using these equations, you can do things such as:
 * 1) Conditionally show tailored messages to the respondents based on prior responses;
 * 2) Create Assessments and show Assessment results (or conditionally branch or show messages) based upon those results, all without using the Assessments module itself;
 * 3) Conjugate verbs and decline nouns within questions, answers, and reports;
 * 4) Show summaries of responses before the "Show your answers" page at the end of the survey.
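
To make the idea of tailoring concrete, here is a small Python analogy. It only imitates what a tailored question text does (choosing wording from a prior response); the function name, the variable names, and the sentence itself are made up for this illustration and are not part of LimeSurvey.

```python
def tailor_kids_message(first_name, num_kids):
    """Pick wording based on an earlier answer, as a tailored question text would."""
    if num_kids == 0:
        return f"{first_name}, you said you have no children."
    # Decline the noun according to number.
    noun = "child" if num_kids == 1 else "children"
    return f"{first_name}, you said you have {num_kids} {noun}."


print(tailor_kids_message("Ann", 1))
print(tailor_kids_message("Ann", 3))
```

Within a survey, the same choice would be written as an expression inside curly braces in the question text rather than in a separate program.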

Equations
There is a new question type called Equation. Think of it as a Text display question type, except that it stores the value of what is displayed in the database. So, if the Equation Question text contains an Assessment computation, that value would be stored in the database in a variable that can be displayed in public or private statistics.

=Syntax=

Anything contained within curly braces is now considered an Expression (with one exception: there must be no leading or trailing whitespace - this is needed to ensure the ExpressionScript does not try to process embedded JavaScript).
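
The whitespace rule can be illustrated with a short Python sketch. The regular expression below is a deliberately simplified assumption about how such a rule could be checked; it is not the parser used by LimeSurvey and it ignores cases such as braces inside quoted strings.

```python
import re

# "{" not followed by whitespace, then content without braces,
# then "}" not preceded by whitespace. [^{}] also matches newlines,
# so expressions that span several lines are still found.
EXPRESSION_PATTERN = re.compile(r"\{(?!\s)([^{}]+?)(?<!\s)\}")


def find_expressions(text):
    """Return the contents of all brace pairs that would count as expressions."""
    return EXPRESSION_PATTERN.findall(text)


sample = "Hi {TOKEN:FIRSTNAME}! <script>function f() { return 1; }</script>"
found = find_expressions(sample)
print(found)
```

Here the JavaScript function body is skipped because its braces are padded with whitespace, while `{TOKEN:FIRSTNAME}` is picked up as an expression.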

Note that it is OK for expressions to span multiple lines, as long as there is no whitespace after the opening curly brace or before the closing curly brace. This is especially helpful for nested "if" statements like this:

The ExpressionScript supports the following syntax:
 * All standard mathematical operators (e.g. +,-,*,/,!);
 * All standard comparison operators (e.g. <,<=,==,!=,>,>=, plus their equivalents: lt, le, eq, ne, gt, ge);
 * Parentheses (so you can group sub-expressions);
 * Conditional operators (e.g. &&, || and their equivalents: and, or);
 * Single and double-quoted strings (which can each embed strings with the other quote type);
 * Comma operator (so you can have a list of expressions and just return the final result);
 * Assignment operator (=);
 * Pre-defined variables (to refer to questions, question attributes, and responses) - e.g., the SGQA codes;
 * Pre-defined functions (there are already 80+, and it is easy to add more).

Operators
EM syntax follows normal operator precedence:

Caution about using Assignment Operator (=)
You should avoid using the assignment operators unless absolutely necessary, since they may cause unexpected side-effects. For example, if you change the value of a previous response, the cascading relevance and validation logic between that question and the current question is not re-computed, so you could end up with internally inconsistent data (e.g., questions that stay answered but should have been NULLed, or questions that are skipped but should have been answered). In general, if you want to assign a value to a variable, you should create an Equation question type, and use an expression to set its value. However, there are some rare times that people really need this operator, so we made it available.

To help caution you about this operator, it is shown in red font within the syntax equations (so that you don't confuse it with "==").



Using Assignment Operator
The main reasons you may want to use the assignment operator are:
 * You need to set the default value via equation for a question that does not accept default values (such as list radio, where the user interface lets you pick one of the answer options, but does not let you enter an equation). However, be careful, as LimeSurvey will not be able to validate that your equation generates one of the allowable answers for that question;
 * You need to forcibly change the response to a previous question based upon a later response;
 * etc...

You can use the full expression syntax for this purpose. However, it is usually better to use an Equation question instead.

Some examples:
 * Set answer to a short text question in lowercase : ;
 * Set a default answer to an array question type at start of a survey : ;
 * Set a default answer to an array texts question type at start of a survey : ;
 * Set an answer with condition :.

= XSS security =

With XSS enabled, some parts of the expression manager system cannot be used:
 * starting an HTML tag in one expression but ending it in another expression;
 * using a complex expression within a URL.

Examples and workarounds:
 * is broken with XSS security, here you can use ;
 * , here you can use an equation question because using a complete question code is OK :.

=Access to Variables=

ExpressionScript provides read-only access to whichever variables you might need. For backwards compatibility, it provides access to the following:
 * TOKEN:xxx - the value of a TOKEN (e.g., TOKEN:FIRSTNAME, TOKEN:ATTRIBUTE_5) (only available in surveys that are not anonymous).
 * INSERTANS:SGQA - the display value of an answer (e.g., "Yes") - similar to using {QCODE.shown}.
 * All {XXX} values used by templates.
 * In question text, you can use {QID} replaced by the question id and {SGQ} replaced by the SGQA of the question.

In addition, ExpressionScript lets you refer to variables by the Question Code (the 'title' column in the questions table within the database). This is also the variable label used when you export your data to SPSS, R, or SAS. For example, if you have questions about name, age, and gender, you could call those variables name, age, and gender instead of 12345X13X22, 12345X13X23, and 12345X13X24. This makes equations easier for everyone to read and validate, and makes it possible to shuffle questions around without having to keep track of group or question numbers.

Important: It is safer to refer to variables that occur in the preceding pages or questions.

Furthermore, ExpressionScript lets you access many properties of the question:

HTML editor issue
If you use the HTML editor, some characters are replaced by HTML entities.
 * & by &amp;amp;
 * < by &amp;lt;
 * > by &amp;gt;

If you use the HTML editor, use the following word equivalents instead of the symbols:
 * and for &&
 * lt for <
 * le for <=
 * gt for >
 * ge for >=
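
The effect of the HTML editor, and the reason the word operators help, can be shown with a small Python sketch. `html.escape` is used here only to imitate the entity replacement; the `use_word_operators` helper is a naive, hypothetical rewrite that would also alter these symbols inside quoted strings.

```python
import html

# Longer symbols come first so that "<=" is not rewritten as "lt=".
WORD_OPERATORS = [("&&", "and"), ("<=", "le"), (">=", "ge"), ("<", "lt"), (">", "gt")]


def use_word_operators(expression):
    """Replace symbolic operators with their word equivalents (naive text rewrite)."""
    for symbol, word in WORD_OPERATORS:
        expression = expression.replace(symbol, f" {word} ")
    # Collapse the extra spaces introduced by the replacements.
    return " ".join(expression.split())


original = "age >= 18 && kids < 3"
escaped = html.escape(original)        # what an HTML editor may store
safe = use_word_operators(original)    # contains no characters that need escaping
print(escaped)
print(safe)
```

The rewritten expression passes through HTML escaping unchanged, which is the property the word operators are meant to provide.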

It is recommended to remove any HTML markup that ends up inside your expression. If you use the LimeSurvey HTML editor, click on the "Source" button (located in the upper left part of the editor) and delete all the characters that are not related to your expression (for example, leftover HTML tags).

=Qcode Variable Naming=

Here are the details of how to construct a Qcode (and access some properties) by question type. In general, Qcodes are constructed as:

QuestionCode . '_' . SubQuestionID . '_' . ScaleId

For comment and other, the corresponding question codes are QuestionCode_comment and QuestionCode_other, respectively.
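
The naming pattern can be sketched as a small Python helper. The function names `build_qcode` and `special_qcode` are hypothetical and only mirror the concatenation rule stated above; which parts actually apply depends on the question type.

```python
def build_qcode(question_code, subquestion_id=None, scale_id=None):
    """Join the available parts with underscores: QuestionCode_SubQuestionID_ScaleId."""
    parts = [question_code]
    if subquestion_id is not None:
        parts.append(str(subquestion_id))
    if scale_id is not None:
        parts.append(str(scale_id))
    return "_".join(parts)


def special_qcode(question_code, kind):
    """Build the code for the 'comment' or 'other' field of a question."""
    if kind not in ("comment", "other"):
        raise ValueError("kind must be 'comment' or 'other'")
    return f"{question_code}_{kind}"


print(build_qcode("Q1"))
print(build_qcode("Q1", "SQ1"))
print(build_qcode("Q1", "SQ1", 0))
print(special_qcode("Q1", "other"))
```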

=Usage of NAOK=

NAOK --> "Not Applicable" (NA) is alright (OK)

Using NAOK means that the expression is still evaluated when some or all of its variables are irrelevant (that is, "Not Applicable" (NA) is alright (OK)).

For example, count(Q1_SQ1,Q1_SQ2,Q1_SQ3,Q1_SQ4) always gives an empty string if any subquestion of Q1 is filtered out. To count the number of checked subquestions in such a question, use count(Q1_SQ1.NAOK,Q1_SQ2.NAOK,Q1_SQ3.NAOK,Q1_SQ4.NAOK). If a subquestion is hidden, the EM returns an empty string for it.

Without NAOK, if one question or one subquestion is hidden, the EM always returns an empty string for the whole expression (the same as returning false).

The .shown suffix always uses the NAOK behavior (an empty string if the question is hidden). If you need the code of the answer, it is almost always a good idea to add .NAOK after the question code, unless you specifically need the cascading behavior and know what you are doing.

More information is provided in the Overriding Cascading Conditions subsection.

=The reserved "this", "self", and "that" variables=

Quite often you want to evaluate all the parts of a question, such as counting how many subquestions have been answered or summing up the scores. Other times you want to process just certain rows or columns of a question (such as getting the row or column sums and storing them in the database). These reserved variables make that process relatively painless.

"This" variable
The "this" variable is used exclusively within the "Whole question validation equation" and "Subquestion validation equation" options (the latter is not available from the GUI). It expands to the variable names of each of the cells within those questions. So, if you want to make sure that each entry is greater than three, you would set the "Subquestion validation equation" to (this > 3).

"Self" variable
The "self" and "that" variables are more powerful, and serve as macros which are expanded prior to processing equations. The syntax choices for the "self" variable are:
 * self
 * self.suffix
 * self.sub-selector
 * self.sub-selector.suffix


In these forms:
 * suffix is any of the normal Qcode suffixes (e.g., NAOK, value, shown);
 * sub-selector can be one of the following:
 ** comments - only subquestions that are comments (e.g., multiple choice with comment and list with comment);
 ** nocomments - only subquestions that are not comments;
 ** sq_X - where X is a row or column identifier. Only subquestions matching pattern X are selected. Note that the search is performed on the complete code identifier, so sq_X matches and includes subquestions nX, X, and Xn (e.g., if you use sq_1, subquestions a1, 1a, 1, 11, and 001 are all included). Pay attention to the dual scale question type, where subquestion codes are QCODE_SQCODE_0 and QCODE_SQCODE_1, and to the ranking question type, where subquestion codes are QCODE_1, QCODE_2, and so on.

Examples:
 * Has any part of a question been answered? -> {count(self.NAOK)>0}
 * What is the assessment score for this question? -> {sum(self.value)}

You can also use these to get row and column totals. Say you have an array of numbers with rows A-E and columns 1-5.
 * What is the grand total? -> {sum(self.NAOK)}
 * What is the total of row B? -> {sum(self.sq_B.NAOK)}
 * What is the total of column 3? -> {sum(self.sq_3.NAOK)}
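
The row and column totals above can be modelled in Python as follows. This is a sketch under stated assumptions: the cell codes are written as `ROW_COLUMN` (for example `B_3`), the values are invented, and `expand_self` is a hypothetical stand-in for the macro expansion that only handles the `sq_X` sub-selector by substring matching.

```python
def expand_self(cells, sub_selector=None):
    """Return the cell codes that a 'self' reference would cover.

    cells        -- dict mapping a subquestion code (e.g. "B_3") to its value
    sub_selector -- None for every cell, or "sq_X" to keep codes containing X
    """
    if sub_selector is None:
        return list(cells)
    pattern = sub_selector[len("sq_"):]
    return [code for code in cells if pattern in code]


rows = "ABCDE"
# Invented values: row A holds 11..15, row B holds 21..25, and so on.
cells = {f"{row}_{col}": 10 * (i + 1) + col
         for i, row in enumerate(rows) for col in range(1, 6)}

grand_total = sum(cells[c] for c in expand_self(cells))
row_b_total = sum(cells[c] for c in expand_self(cells, "sq_B"))
column_3_total = sum(cells[c] for c in expand_self(cells, "sq_3"))
print(grand_total, row_b_total, column_3_total)
```

Because the match is a substring match, row and column identifiers should be chosen so that they do not overlap (letters for rows and digits for columns, as in this example).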

"That" variable
The "that" variable is like the "self" variable, but it allows you to refer to other questions. Its syntax is:
 * that.qname
 * that.qname.suffix
 * that.qname.sub-selector
 * that.qname.sub-selector.suffix

qname is the question name without any subquestion extensions. For example, for a question with the code 'q1', the qname is also 'q1'.

Examples:
 * Has any part of question q1 been answered? -> {count(that.q1.NAOK)>0}
 * What is the assessment score for q2? -> {sum(that.q2.NAOK)}
 * What is the grand total of q3? -> {sum(that.q3.NAOK)}
 * What is the total of row C in q4? -> {sum(that.q4.sq_C.NAOK)}
 * What is the total of column 2 in q4? -> {sum(that.q4.sq_2.NAOK)}

The "self" and "that" variables can be used in any relevance, validation, or tailoring.

The one caveat is that when you use the Show Logic File feature, it will show you the expanded value of "self" and "that". This lets you see the actual equation that will be generated so that you (and the EM) can validate whether the variables exist or not. This may seem confusing since you may see quite lengthy equations. However, if you edit the question, you will see the original equation using "self" and/or "that".

=Access to Functions=

The ExpressionScript provides access to mathematical, string, and user-defined functions, as shown below. It has PHP and JavaScript equivalents for these functions so that they work identically on server-side (PHP) and client-side (JavaScript). It is easy to add new functions.

Implemented Functions
The following functions are currently available:

=ExpressionScript Knows Which Variables are Local=

In order to properly build the JavaScript for a page, the ExpressionScript needs to know which variables are set on the page, and what their JavaScript ID is (e.g., for document.getElementById(x)). It must also know which variables are set on other pages (so that it can ensure that the needed fields are present and populated).

Cascading Conditions
If any of the variables are irrelevant, the whole equation will be irrelevant (false). For example, in the following table, N/A means that one of the variables was not relevant

Overriding Cascading Conditions
Say you want to show a running total of all relevant answers. You might try to use the equation {sum(q1,q2,q3,...,qN)}. However, this gets translated internally to LEMif(LEManyNA("q1","q2","q3",...,"qN"),"",sum(LEMval("q1"),LEMval("q2"),LEMval("q3"),...,LEMval("qN"))). So, if any of the values q1-qN are irrelevant, the equation will always return false. In this case, the sum will show "0" until all questions are answered.

To get around this, each variable can have a ".NAOK" suffix (meaning that Not Applicable is OK) added to it. In such cases, the following behavior occurs. Say you have a variable q1.NAOK:
 * 1) q1 is not added to the LEManyNA clause
 * 2) LEMval('q1') will continue to check whether the response is relevant and will return "" if it is not (so individual irrelevant responses will be ignored, but they will not void the entire expression).

So, the solution to the running total problem is to use the equation sum(q1.NAOK,q2.NAOK,q3.NAOK,...,qN.NAOK).
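
The difference between the two forms can be modelled with a few lines of Python. This is a simplified sketch of the behavior described in this subsection, not the generated LEMif/LEManyNA/LEMval code; the function name `lem_sum` and the sample data are assumptions.

```python
def lem_sum(names, responses, relevant, naok=False):
    """Model of sum(q1,...,qN) without NAOK and sum(q1.NAOK,...,qN.NAOK) with it.

    responses -- dict mapping a variable name to its numeric answer
    relevant  -- dict mapping a variable name to True or False
    """
    # Without NAOK, a single irrelevant variable voids the whole expression.
    if not naok and any(not relevant[name] for name in names):
        return ""
    total = 0
    for name in names:
        # An irrelevant variable contributes "" and is simply ignored.
        if relevant[name] and responses.get(name) not in (None, ""):
            total += responses[name]
    return total


names = ["q1", "q2", "q3"]
responses = {"q1": 5, "q2": 7, "q3": 3}
relevant = {"q1": True, "q2": True, "q3": False}

plain = lem_sum(names, responses, relevant)
with_naok = lem_sum(names, responses, relevant, naok=True)
print(repr(plain), with_naok)
```

In this model the plain form returns an empty string as soon as q3 is irrelevant, while the NAOK form keeps a running total of the relevant answers.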

The use of the .NAOK suffix also lets authors design surveys that have several possible paths but then converge on common paths later. For example, say subjects answer a survey in a way that is outside the normal range of responses. The author could alert the subjects that they may not get valid results, and ask them whether they really want to continue with the survey. If they say "Yes", then the rest of the questions will be shown. The condition for the "rest of the questions" would check whether the initial responses were answered within the normal range OR whether the subject said "Yes" to the question that is only relevant if they answered outside the normal range.

=How does ExpressionScript Support Conditional Micro-Tailoring?=

Here is an example of micro-tailoring (where Question Type=='expr' means an Equation):

To download the above survey example, click on the following link: [[Media:No_of_kids_-_Micro_Tailoring.zip|Number of kids survey example]].

All of these questions can be on a single page (e.g. in the same group), and only the relevant questions will display. Moreover, as you enter the ages of children, the sum expression in the last question will dynamically update on the page.

ExpressionScript provides this functionality by surrounding each expression with a named <span> element. Every time a value changes, it recomputes the expression that should appear in that <span> element and regenerates the display. You can have dozens, or even hundreds, of such tailored expressions on the same page, and the page will re-display all of them in a single screen refresh.

=Syntax Highlighting=

To help with entering and validating expressions, the EM provides syntax highlighting with the following features:

Types and Meanings of Syntax Highlighting
=Additional Reading=