Project Ideas for GSoC 2012

From LimeSurvey Manual

Revision as of 16:57, 1 March 2012 by TMSWhite (talk | contribs)

Welcome

Welcome, Google Summer of Code student aspirants!

This page lists project ideas developed by the LimeSurvey community. These tend to be the areas that will get the most support as projects, since they have been developed by people who know the project and what it needs most. However, if you have your own idea for a project, discuss it with us in our forums, on the mailing list, or at #limesurvey on irc.freenode.net, then submit your proposal. Good luck!


Project ideas


Refactor all question types to a modular OOP format


Implement a fully featured XML-RPC API complete with unit tests


Add a unit testing framework and write tests


Integrate the DBSE into Yii


Develop Improved 'bulk' Question Handling UI (and other UI enhancements)

The idea is to provide ways of updating and modifying multiple questions, or multiple groups, in one go. Often in a survey a user will need to make parallel changes to five or six questions at once across a range of different groups. Being able to select various elements of questions, and then select a range of questions to edit, would dramatically improve LimeSurvey's usability from the administrator's perspective.

Skills

PHP, SQL, Yii, Authentication protocols (openID, CAS, LDAP-bind, ...)

Difficulty

Medium

Probable Mentors

Jason Cleeland (jcleeland)


Implement a modular authentication framework

The idea is to design and implement a modular authentication "framework" for LimeSurvey, as well as some authentication modules. The authentication framework will define the API each authentication plugin must (or may) implement, and provide basic methods that plugins can use (for instance, how to store the plugin's parameters in the DB, or how to display a form using the survey template). A plugin will be responsible for implementing, for each particular authentication backend, specific methods (some mandatory, some optional) such as user authentication, user provisioning, user rights, ...

Currently, authentication is only used for the survey-administration GUI and not for the participants' interface (tokens are used for this). With the new authentication framework, it will be possible to define several authentication backends and use some of them for participant authentication.

The generic authentication framework must define the following services:

  • User authentication: this interface must return the identity of the authenticated user if authentication is successful.
    • Authentication may not always be based on a simple user/password form, so the proposed framework must be generic enough to enable authentication based on other schemes (using any number of interaction pages between the server and the end user, or any other contextual parameter such as Referrer, session variable, ...).
  • User provisioning:
    • When activated on the survey administration interface, an authentication module might be able to create a newly authenticated user in the LimeSurvey internal DB (which is required for setting the user rights on the platform). The newly created user can be assigned a default profile or a per-user profile (queried from an external database).
    • When activated on the participants' interface, the authentication module will provision the token table of the corresponding survey or, alternatively, the general-purpose cross-survey participants database as described in the above GSoC idea.

The existing authentication schemes (internal DB and web server authentication delegation) will be ported to the new framework, and at least one new scheme will be implemented (openID, CAS, Shibboleth).
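
As a rough illustration of the plugin contract described above, here is a minimal sketch. It is written in Python purely for brevity (LimeSurvey itself is PHP), and every name in it (AuthPlugin, InternalDbAuth, WebServerAuth, login, the request dictionary) is hypothetical rather than an existing LimeSurvey API. It only shows the shape of the idea: one mandatory method that returns an identity, one optional provisioning hook, and a loop that tries each enabled backend.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class AuthPlugin(ABC):
    """Hypothetical contract that every authentication plugin would implement."""

    name = "base"

    @abstractmethod
    def authenticate(self, request: Dict[str, str]) -> Optional[str]:
        """Mandatory: return the authenticated user's identity, or None on failure."""

    def provision(self, identity: str, users: Dict[str, dict]) -> None:
        """Optional hook: create the user locally with a default profile if missing."""
        users.setdefault(identity, {"profile": "default"})


class InternalDbAuth(AuthPlugin):
    """User/password form checked against an in-memory stand-in for the internal DB."""

    name = "internal-db"

    def __init__(self, credentials: Dict[str, str]):
        # Plain-text passwords keep the sketch short; real code would compare hashes.
        self._credentials = credentials

    def authenticate(self, request: Dict[str, str]) -> Optional[str]:
        user = request.get("user")
        password = request.get("password")
        if user in self._credentials and password == self._credentials[user]:
            return user
        return None


class WebServerAuth(AuthPlugin):
    """Delegated authentication: trust an identity already set by the web server."""

    name = "webserver"

    def authenticate(self, request: Dict[str, str]) -> Optional[str]:
        return request.get("REMOTE_USER") or None


def login(plugins: List[AuthPlugin], request: Dict[str, str],
          users: Dict[str, dict]) -> Optional[str]:
    """Try each enabled backend in order and provision the first identity found."""
    for plugin in plugins:
        identity = plugin.authenticate(request)
        if identity is not None:
            plugin.provision(identity, users)
            return identity
    return None
```

Because authenticate() receives the whole request context instead of a user/password pair, the same contract could in principle cover schemes that rely on server variables or multi-step redirects, which is the genericity the framework asks for.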

Skills

PHP, SQL, Yii, Authentication protocols (openID, CAS, LDAP-bind, ...)

Difficulty

Medium

Probable Mentors

Unknown

Want to know more

Please read the FAQ about this project


Custom Report Generation

The task is to build a module that generates custom reports. The module should be able to do at least the following:

  • Create various types of reports, e.g. using tables, pie charts, graphs, and bar charts, to name a few.
  • The resulting charts and other illustrations should be easy to read and understand, and should include the ability to export into standard office suites.
  • Reports can be general or survey specific.
    • General reports should show the basic findings of any survey graphically, such as the number of users completing the survey, the average time taken to complete the survey, the average number of correct responses, etc.
    • Survey-specific results should also be addressed properly, e.g. how many users chose the first option as their answer to a particular question, how many users didn't answer a specific question, etc.
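
The kind of figures listed above can be sketched in a few lines. The example below is in Python for illustration only (a real module would be PHP inside LimeSurvey), and the response layout, the field names "completed" and "seconds", and both function names are assumptions made for this sketch, not LimeSurvey structures.

```python
from collections import Counter
from typing import Dict, List, Optional


def summarize_question(responses: List[Dict[str, object]], question: str) -> dict:
    """Count how often each option was chosen and how many users skipped the question."""
    answers = [r.get(question) for r in responses]
    chosen = Counter(a for a in answers if a not in (None, ""))
    answered = sum(chosen.values())
    return {
        "answered": answered,
        "unanswered": len(answers) - answered,
        "counts": dict(chosen),
    }


def average_completion_seconds(responses: List[Dict[str, object]]) -> Optional[float]:
    """Mean time taken over completed responses, or None if nobody completed."""
    times = [r["seconds"] for r in responses if r.get("completed")]
    return sum(times) / len(times) if times else None
```

The resulting dictionaries are the raw numbers a charting or export layer would consume; the charts themselves are out of scope for this sketch.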

Skills

OOP experience in PHP, experience with a PHP framework like CakePHP or CodeIgniter, and jQuery. A strong mathematical background will help!

Difficulty

Moderate

Probable mentors

Carsten Schmitz (c_schmitz)

Jason Cleeland (jcleeland)

Diogo Gonçalves (dionet)


Adding new questions/subquestions and groups dynamically

Currently in LimeSurvey, once a survey is activated we cannot:

  • Add or delete groups
  • Add or delete questions
  • Add or delete subquestions or change their codes

The idea is to address these shortcomings. In addition, all participants who have already taken the survey must be notified about the changes.

Skills

Experience with a PHP framework like CakePHP or CodeIgniter, and knowledge of the inner workings of LimeSurvey, is a must!

Difficulty

Moderate

Probable mentors

Carsten Schmitz (c_schmitz)

Jason Cleeland (jcleeland)

Diogo Gonçalves (dionet)


Enhance Expression Manager

Starting with LimeSurvey 1.92, all of the front-end (survey-taking) processing is managed using Expression Manager (EM).  EM implements a safe subset of PHP syntax so that authors can write complex equations.  It is integrated into LimeSurvey to control navigation (branching via relevance), validation, and tailoring/piping of content.  Although it is quite powerful, users are already asking for enhancements.

  1. Add a GUI for Expression Manager
    • Background:  LimeSurvey has a nice GUI for building conditions, but it does not support complex equations such as those available within EM.  EM does have robust syntax highlighting after an equation is saved, which makes it easy to fix any syntax errors.  However, users would like a GUI to ease authoring of equations.
    • Strategy - use CodeMirror - see here
      • Create CodeMirror syntax file to do appropriate syntax highlighting (use C as base language), ensuring that CodeMirror knows the set of valid operators
      • Use the CodeMirror API to let it know the names of registered functions and variables
      • Utilize auto-complete, ideally letting users see and choose among the available function syntaxes (e.g. when a function can take several parameters).
      • Auto-completion of variable names should show the question and question type to make it easier for users to pick the correct variable.
  2. Support sub-question-level Relevance
    • Background:  EM already supports user-entered relevance equations at the group and question levels.  It also generates sub-question-level relevance for features like array_filter.  Users would like to be able to add additional relevance criteria at the sub-question level.
    • Strategy:
      • Add a relevance column to the display of sub-questions, and ensure entered relevance is saved to the database (the data model is already OK)
      • Refactor createFieldMap() (or its successor) to read sub-question-level relevance into the run-time data structures so that it is passed to EM
      • Enhance EM's sub-question-level data structures to hold sub-question-level relevance
      • Update EM's _ValidateQuestion() and JavaScript generation functions to AND any manually-entered sub-question-level relevance equations with those auto-generated for array_filter and array_filter_exclude
      • Update ShowSurveyLogicFile() to show sub-question-level relevance equations
      • Rigorously unit and regression test these changes.
  3. Add Sub-question-level Validation
    • Background:  The question-level validation supports regular-expression-based validation of each sub-question.  Users would like to have sub-question-level validation to implement things like a question for collecting user contact information, where there are different validation criteria for address parts (e.g. city, state, postal code) and phone numbers.
    • Strategy:  Similar to sub-question-level relevance
      • Add data entry fields for regular expression validation at the sub-question level, and ensure they are saved to the database (which already has a preg field at the sub-question level)
      • Ensure EM gets access to the new preg values (via upgrade to createFieldMap)
      • Upgrade EM's _ValidateQuestion() and the function for generating validation equations to include these new criteria.  EM already supports validation at the sub-question level (and changes the CSS style to show fields that fail this validation)
      • Upgrade ShowSurveyLogicFile() to show sub-question-level validation rules
      • Rigorously unit and regression test these changes.
  4. Add Sub-question-level mandatory criteria
    • Background:  Many users have asked, via the forums, for ways to make some parts of multi-part questions mandatory while leaving other parts optional.  LimeSurvey already provides many options, such as minimum and maximum numbers of answers, but users have asked for more fine-grained control, and currently can only achieve that via custom JavaScript.
    • Strategy:  Similar to sub-question-level relevance
      • Add checkbox to indicate mandatory status at the sub-question level
  5. Add native support for input masks at the question and sub-question level
    • Background:  One of the commonly used work-arounds deals with input masks, such as the jQuery meioMask plugin.
    • Strategy:
      • Add mask attribute at question and sub-question level
      • Have EM generate needed JavaScript code to create and manage those masks
  6. Add EM reporting functions (tables)
    • Background: Users commonly request enhancements to the print answers table at the end of the survey.  Some users want to generate custom reports mid-survey.
    • Strategy:
      • Implement a showAllResponsesExcept(attributeList,attributeTitleList,questionList) function.  questionList = list of question identifiers; attributeList = pipe-delimited list of attributes (like question#, title, text, type - so you can decide what to show); attributeTitleList = pipe-delimited list of table headers, so that the report can be internationalized.
      • Implement a showTheseResponses(attributeList,attributeTitleList,questionList) function, taking the same three arguments with the same meanings.
  7. Add better EM support for operations on array type questions
    • Background:  Many of the EM validation rules are effectively statements like "count the number of empty sub-questions" or "sum the values of the sub-questions".  EM generates these functions itself, so there is no burden on the user, even if there are dozens of sub-questions.  However, manual editing of these equations is cumbersome.  Users could benefit from special variables to access all of the elements of a question, so that they could write functions like (sum(this) == 10) and have them expanded to (sum(q1_1, q1_2, ..., q1_N) == 10).  Note, there is already a "this" variable, but it does not apply to sub-questions.
    • Strategy:
      • Extend EM so that the "this" variable gets expanded into a comma-separated list of sub-question references if there are sub-questions.  Have this expansion carry the suffix, so this.valueNAOK would become q_1.valueNAOK, q_2.valueNAOK, ..., q_N.valueNAOK.
      • This macro expansion should occur in group.php (e.g. in the process of generating JavaScript), rather than having this variable be resolved at run-time within JavaScript.
      • Provide similar array-expansion macros for all variables (not just "this") - such as qcode_vars.*.  This would also allow functions like sum(qcode_vars.NAOK), count(qcode_vars.NAOK), implode(' ', qcode_vars.valueNAOK)
      • For questions with comments, create macros like qcode_vars_nc and qcode_vars_oc for no-comments and  only-comments
      • For arrays that might need row- or column-level processing, create aliases like qcode_rowname_vars and qcode_colname_vars so that each can be expanded.  This would let us replace the current system for generating row and column sums with equations like sum(qcode_rowname_vars.NAOK) and be sure that the sums will honor array_filter and array_filter_exclude.
  8. Add EM functions to validate data entry against value sets managed by Enterprise Vocabulary Systems
    • Background:  Healthcare and biological sciences increasingly use large controlled vocabularies, terminologies, or ontologies.  Data entry systems for such domains require validation against those vocabularies.  Large open-source projects, like LexEVS and Apelon DTS, provide open APIs to access that content.  Such tools let one validate diagnostic codes, and even do incremental search into those vocabularies as one types.  The main open projects are standardizing on the CTS-2 (Clinical Terminology Services-2) specification.
    • Strategy:
      • Create EM-compatible PHP and JavaScript functions to access CTS-2 compliant EVS systems
      • Should include question and sub-question-level validation rules to validate the final entry.  Currently, there are em_validation_q and em_validation_sq advanced question attributes for validating questions and sub-questions based upon calls to external functions.  It may be desirable to support custom sub-question-level validation to validate each sub-question against different value sets.  If so, this can follow the model of adding sub-question-level validation and relevance.  Alternatively, it may be desirable to add sub-question-level advanced question options (if the LimeSurvey community feels that there may be enough such extensions that such customization should be stored in a general attribute table rather than making the questions table wider).
      • Should also include the ability to do incremental searches into the value sets
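
To make the proposed "this" expansion for array questions more concrete, here is a minimal sketch of the macro step. It is written in Python for illustration only (EM is PHP and JavaScript), the function name expand_this is hypothetical, and it works on raw equation text with a regular expression, whereas a real implementation would presumably operate on EM's own tokens so that string literals are left alone.

```python
import re
from typing import List


def expand_this(equation: str, subquestion_codes: List[str]) -> str:
    """Replace each `this` (with an optional .suffix) by a comma-separated
    list of sub-question references that carry the same suffix."""
    if not subquestion_codes:
        return equation  # no sub-questions: leave `this` untouched

    def replace(match):
        suffix = match.group(1) or ""
        return ", ".join(code + suffix for code in subquestion_codes)

    # \b keeps identifiers such as `thisYear` intact; the optional group
    # captures a suffix such as `.NAOK` or `.valueNAOK`.
    return re.sub(r"\bthis\b(\.\w+)?", replace, equation)
```

For example, with sub-question codes q1_1, q1_2 and q1_3, the equation sum(this) == 10 is rewritten to sum(q1_1, q1_2, q1_3) == 10 before any JavaScript is generated, which matches the goal of resolving the macro at generation time rather than at run-time.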

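The two reporting functions proposed for EM take pipe-delimited attribute and title lists. The sketch below shows one way those arguments could be interpreted. It is Python for illustration only, and the fourth parameter (questions, a dictionary standing in for EM's internal question data) is an assumption of this sketch and not part of the proposed signatures.

```python
from html import escape
from typing import Dict, List


def show_these_responses(attribute_list: str, attribute_title_list: str,
                         question_list: List[str],
                         questions: Dict[str, Dict[str, str]]) -> str:
    """Render the requested attributes of the listed questions as an HTML table."""
    attributes = attribute_list.split("|")
    titles = attribute_title_list.split("|")
    if len(attributes) != len(titles):
        raise ValueError("attributeList and attributeTitleList must have the same length")

    header = "".join(f"<th>{escape(title)}</th>" for title in titles)
    rows = []
    for code in question_list:
        # Missing attributes render as empty cells instead of raising.
        cells = "".join(
            f"<td>{escape(str(questions[code].get(attr, '')))}</td>" for attr in attributes
        )
        rows.append(f"<tr>{cells}</tr>")
    return f"<table><tr>{header}</tr>{''.join(rows)}</table>"


def show_all_responses_except(attribute_list: str, attribute_title_list: str,
                              question_list: List[str],
                              questions: Dict[str, Dict[str, str]]) -> str:
    """Same table, but for every question that is NOT named in question_list."""
    kept = [code for code in questions if code not in question_list]
    return show_these_responses(attribute_list, attribute_title_list, kept, questions)
```

Keeping the titles separate from the attribute names is what allows the headers to be translated without touching the attribute list.
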
Skills

PHP, SQL, PHP Debugging (e.g. xdebug), Yii, LimeSurvey's Expression Manager and code-base in general

Difficulty

Medium to Hard

Probable Mentors

Thomas White (TMSWhite)


Optimize LimeSurvey for Long Surveys (and better performance and maintenance in general)

Although LimeSurvey is excellent for short and mid-length surveys, it is not optimized for surveys with hundreds or thousands of questions - such as those used by epidemiologists or in clinical trials.  This project would tackle each of the main limitations.  The sub-tasks and development strategies include:

  1. Support for more than 1000 database columns
    • Background: LS creates a horizontal table for survey data collection, and such tables often are limited to at most 1000 columns
    • Strategies:
      • Remove unneeded columns from horizontal table (e.g. type 'X')
      • Conditionally remove unneeded equation columns (e.g. let users specify a prefix for variable names that should not be stored)
      • Add an option for an Entity Attribute Value data model for data collection (which can support an unlimited number of columns).  See details here.
  2. Memory and Code Optimization
    • Background: LS loads the survey definition data model into several different data structures, such that it uses at least twice as much storage as it really needs.  Expression Manager now holds all of the instrument definition data, so the other data stores are no longer needed.
    • Strategy
      • Remove need for buildsurveysession() - gap analysis to add in any missing content to EM
      • Remove need for createFieldMap() - similar gap analysis
      • Normalize EM data structures to avoid internal duplication (and refactor code to use normalized structures) (and document new data structures so that future developers know which to use, and how to use them)
        • gInfo - renaming of groupSeqInfo, plus add missing group-level attributes (relevance, description)
        • qInfo - renaming of questionSeq2relevance, plus add any missing attributes from $fieldarray; remove aid, sqid; move grelevance to gInfo
        • aInfo - for storing answer arrays?
        • sqInfo - renaming of q2subqInfo, indexed on sgqa(?); remove content gleaned from gInfo and qInfo
        • gStatus - renaming of indexGseq - make hold only dynamic values; so move gtext and gname to gInfo; also remove gRelInfo, keeping any unique variables it contains
        • qStatus - renaming of indexQseq - make hold only dynamic values; so move qtext, qhelp, gtext, gname to qInfo
        • sqStatus - renaming of subQrelInfo - make hold only dynamic values
        • groupRelevanceInfo - consolidate into indexGseq?
        • knownVars - instead of copying content, use reference to gInfo and qInfo (e.g. remove question, relevance, grelevance, qcode, ansList, ansArray, onlynum).
        • varNameAttr - remove; generate on the fly from gInfo, qInfo, and knownVars
        • alias2varName - remove; generate on the fly from knownVars
      • Refactor EM so that it stores normalized copies of secondary language text
      • Optionally load only current group for each page transition rather than holding entire survey definition in memory
      • Refactor EM for consistent variable naming
        • e.g. questionId => qid; groupNum => gid; groupSeq => gseq; questionSeq => qseq
  3. Run-Time Performance Optimization
    • Background:  qanda.php loads the content to create the questions and answers.  It used to do this by separate database queries per question.
    • Strategy
      • Refactor qanda.php to remove queries (like for "other", or sub-question text) - have it get that information from EM
      • Refactor LS so that language switching does not require a re-load of the core logic, but just the new language content  (and that it gets this from EM)
      • Refactor replacements.php and EM
        • Should only need to call replacements.php once per page, so set those values as locally static in EM per page.
        • Refactor group.php to pass {QUESTION_*} via replacements array, rather than as globals passed to templatereplace
  4. Design-Time Performance Optimization
    • Background:  LS used to only load the information from the data model that was needed for the given question or group.  EM had to load the entire data model to properly syntax highlight everything.  This can lead to some performance degradation in very long surveys
    • Strategy:
      • Add methods to EM to just load changes to model as questions or groups are added, removed, or updated
      • Pass those updates into EM internals so that syntax highlighting continues to be correct
      • Ensure that admin pages only call the subset of EM functions needed to do accurate syntax highlighting.
  5. Optimizations for Rapid Development
    • Background:  LS has a nice GUI for editing single questions at a time.  However, it is not optimized to make changes to multiple questions at a time.  Some competitor systems let authors design surveys using an Excel template (e.g. so that they can do bulk find and replace, or easily copy similar portions or answer lists).  Short of a full-blown AJAX-enabled admin system, this has the highest throughput potential
    • Strategy:
      • Create an Excel data model that would work for importing surveys
      • Create import and validation routines from that model
      • A similar model, which could be extended for LS, is noted here.
  6. Performance Validation
    • Load test surveys of varying lengths
    • Identify performance and memory bottlenecks
    • Identify minimum memory requirements for certain survey lengths and concurrent user volumes
    • Propose strategy to overcome those performance issues
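
The Entity Attribute Value option and the "do not store" prefix idea from the first sub-task can be sketched together. The example is Python for illustration only, and the tuple layout, the function names, and the skip_prefix parameter are assumptions of this sketch, not an existing LimeSurvey schema.

```python
from typing import Dict, Iterable, List, Optional, Tuple

EavRow = Tuple[int, str, str]  # (response_id, field_code, value)


def to_eav(response_id: int, wide_row: Dict[str, Optional[str]],
           skip_prefix: Optional[str] = None) -> List[EavRow]:
    """Turn one wide (one-column-per-question) row into narrow EAV rows.

    Unanswered fields produce no row, and fields whose code starts with
    skip_prefix are not stored at all.
    """
    rows = []
    for code, value in wide_row.items():
        if value is None:
            continue
        if skip_prefix and code.startswith(skip_prefix):
            continue
        rows.append((response_id, code, value))
    return rows


def to_wide(rows: Iterable[EavRow], codes: List[str]) -> Dict[int, Dict[str, Optional[str]]]:
    """Pivot EAV rows back into one dict per response, e.g. for export."""
    wide: Dict[int, Dict[str, Optional[str]]] = {}
    for response_id, code, value in rows:
        wide.setdefault(response_id, dict.fromkeys(codes))[code] = value
    return wide
```

Since each answer becomes its own row, the number of questions no longer maps to the number of table columns, at the cost of a pivot step whenever a wide view is needed.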

Skills

PHP, SQL, PHP Debugging (e.g. xdebug), Yii, LimeSurvey's Expression Manager and code-base in general, Load testing tool (e.g. webload), PHP Profiler

Difficulty

Medium to Hard

Probable Mentors

Thomas White (TMSWhite)


Idea template

Describe the idea here in general terms

Skills

Explain what sort of coding skills would be needed for a student to implement this project

Difficulty

Explain the level of difficulty involved

Probable Mentors

Put your name (and tag) here if you are willing to mentor a student for this idea


More information

Getting started

Check out our 'Get started' page for setting up the development environment, coding standards, and all the other important stuff that you need to know before the real fun begins!

Frequently Asked Questions

Check out our GSoC FAQ page.