Psychometric Properties of Persian Version of “Child-Initiated Pretend Play Assessment” For Iranian Children


Minoo Dabiri Golchin 1 , Navid Mirzakhani 2 , * , Karen Stagnitti 3 , Mahsa Dabiri Golchin 4 , Mehdi Rezaei 5

1 Faculty of Rehabilitation, Shahid Beheshti University of Medical Sciences, Tehran, Iran

2 Physiotherapy Research Centre, School of Rehabilitation, Shahid Beheshti University of Medical Sciences, Tehran, Iran

3 School of Health and Social Development, Deakin University, Geelong, Australia, 3220

4 Faculty of Nursing and Midwifery, Tehran University of Medical Sciences, Tehran, Iran

5 Faculty of Rehabilitation, Shahid Beheshti University of Medical Sciences, Tehran, Iran

Iranian Journal of Pediatrics: 27 (1); e7053
Published Online: November 6, 2016
Article Type: Research Article
Received: May 14, 2016
Revised: September 7, 2016
Accepted: September 24, 2016




Background: Play, particularly pretend play, has a cognitive basis and has been linked to the language and social ability.

Objectives: The goal of this study was to examine the face and content validity and the inter-rater, intra-rater and test-retest reliability of the Persian translation of the child-initiated pretend play assessment.

Methods: Ten occupational therapists consented to be in the content validity study. Face validity was examined by five occupational therapy specialists. For reliability, 31 typically developing children aged 4 - 6 years were chosen from kindergartens of four regions of Tehran, Iran. Two weeks after the initial assessment the children were re-tested for test-retest reliability. Intra-rater and inter-rater reliability was scored from videos of the children’s play assessment.

Results: To be culturally appropriate for Iran, some phrases were changed and the pigs were replaced by dogs. Content validity index (CVI) and content validity ratio (CVR) were acceptable for all items. The intra-class correlation coefficient was ICC = 0.99 for intra-rater reliability and ICC = 0.98 for inter-rater reliability. For test-retest reliability, the intra-class correlation coefficient for symbolic and combined object substitution scores and all elaborate play scores ranged from ICC = 0.69 to 0.99. For imitated actions, the majority of children scored 0 on both test and re-test.

Conclusions: The Persian version of the child-initiated pretend play assessment has appropriate face and content validity. Inter-rater and intra-rater reliability were excellent. PEPA and combined and symbolic NOS showed good to excellent test-retest reliability. Test-retest reliability for conventional NOS was moderate, and NIA was not stable, with more children imitating the examiner in the first test than in the re-test.


Keywords: Face Validity; Test-Retest Reliability; Reliability and Validity; Child; Play and Playthings

1. Background

Play appears to be a simple concept, yet it carries a complex meaning for the individual and may reflect personal life experiences (1). True play is a self-chosen activity that occurs spontaneously and at times may seem to have no aim (2). Play is flexible, challenging, transcends reality, is fun and totally charming (2). For typically developing children, it is a meaningful activity that they engage in as part of their daily routine (3). In most children, pretend play develops between 18 months and 6 years of age (4), when it takes the form of playing with toys and play materials. Pretend play ability is a potential reference for assessing pre-academic skills, as pretend play is related to skills that are essential for learning and pre-literate skills (5-7). The link between pretend play and pre-literate skills has been argued to lie in talking, organization of thinking and decontextualized language, abstract thinking, logical thinking, generalization and adaptation (7, 8). Children who are involved in pretend play are better at problem solving (9). Pretend play is an important index of the cognitive development of children (10). It is an appropriate and mature form of play for preschool-aged children (11).

In the last 45 years, occupational therapists have put forward views, models and definitions of play. Reilly contributed greatly to this area through her occupational behavior frame of reference and her description of play as a prioritization tool for the assessment of skills that underpin the further development of competencies. Parham, Lindquist and Mack (1982) later developed models that explained play as a connector between sensory integration and Reilly's frame of reference (12). Bundy developed one of the most influential models, which defined playfulness (13).

There are many play assessments; however, few of them are standardized and fewer are norm referenced. The child-initiated pretend play assessment (ChIPPA) assesses a child’s ability to spontaneously initiate pretend play (14). Stagnitti defines pretend play on the ChIPPA as conventional imaginative play and symbolic play. Conventional imaginative play is play with common play materials on which the child imposes meaning, for example, the doll is sitting and eating. Symbolic play is play with unstructured play materials, for example, when the child pretends that a box is a ship. The ChIPPA assesses both of these areas in one session (15).

Other assessments of pretend play include the Symbolic Play Test (15), which assesses only conventional imaginative play, and the Test of Pretend Play (16), which measures the child’s ability to substitute objects but does not consider conventional imaginative play. The ChIPPA differs from these two assessments because it measures both conventional imaginative play and symbolic play in one session for preschool-aged children (17).

The ChIPPA has been studied in Brazil, Finland and Australia and has been shown to be reliable and culturally suitable in these countries (18-20).

The concurrent validity of the ChIPPA with the Miller Assessment for Preschoolers (Miller, 1988) (17) and the predictive validity of the ChIPPA (21) have both been reported to be strong.

The ChIPPA was published in 2007 in Australia (14). Stagnitti worked on this assessment for 14 years before publication (personal communication, 2013). In the development of the ChIPPA, Stagnitti argued that the essential behaviors of pretend play were the child’s ability to: self-initiate play, use symbols in play, logically sequence play actions, attribute properties to objects, refer to absent objects, and refer to someone or something outside of self (13, 14). The play materials of the ChIPPA have been examined for gender neutrality and developmental appropriateness (22). More recently, the play materials have been examined for cultural appropriateness with Australian Indigenous (Aboriginal) children (19) and Brazilian children (20). Standardized assessments also have a standardized approach to administration, scoring and comparison of children to a norm score, which gives a more objective picture of how a child’s ability to play compares to that of a child of the same gender and age (23). Standardized assessments can also be used to monitor a child’s progress over time.

A norm-referenced standardized assessment, such as the ChIPPA, can be used to identify whether a child’s pretend play abilities are developing at a level comparable to his/her peers. As play is an important occupation of childhood that has implications for further learning, a norm-referenced standardized assessment such as the ChIPPA can fill the gap in available assessments for occupational therapists. In occupational therapy in Iran, there is a need for a valid tool that assesses play as important in itself and not as a means of assessing other skills. In Iran, there are no valid and reliable pretend play assessments, even though the ability to pretend play is an important aspect of a child’s development. The ChIPPA assesses the ability to self-initiate pretend play in children aged 3 years to 7 years and 11 months, and this age range reflects the age of the majority of children referred to occupational therapy clinics in Iran.

2. Objectives

The first aim of this study was to translate the ChIPPA and examine the face and content validity of the newly translated Persian version. The second aim was to establish the inter-rater, intra-rater and test-retest reliability of the Persian version of the ChIPPA. The translation is explained first, followed by the validity and reliability studies.

3. Methods

This study is a cross-sectional psychometric study. It has three parts: translation, validity and reliability. Experts served as the sample for the validity part, and 31 children for the reliability part. The sample size was determined using data from the Brazilian study (20), entered into this formula:

Equation 1.

Each part is described in detail below.

3.1. Translation of the ChIPPA

First, permission to work on the ChIPPA was obtained from its developer. The ChIPPA was then translated using the International Quality of Life Assessment (IQOLA) method. The IQOLA process was completed in these steps:

First, a description of the translation process was given to the translators. Translations were to emphasize concepts and meaning and be understandable to a 14-year-old person. In the first step, two native Persian translators who were fluent in English but not familiar with the ChIPPA (translators 1 and 2) wrote a list of possible translations for each item and ranked them from the weakest to the strongest. The difficulty of the translation was rated from 0 to 100 in a session with the translators and the first author, and a Persian version was chosen. Two other translators (translators 3 and 4) then scored the quality of the Persian translation using the LASA scale, a 100-millimeter line on which the translator rates the quality of the text from 0 to 100; the higher the score, the more precise the translation. Using the LASA scale, these translators scored the translation 83.2 out of 100. Translators 3 and 4 also noted problems with the quality of some items. Quality was judged on the clarity of the translation, the use of common language and the conceptual equivalence between the English and Persian versions. Next, two other translators (translators 5 and 6), native English speakers with knowledge of Persian, translated the Persian version back into English. This English version was sent to the test developer. Three professors then reviewed the Persian version and suggested changing the font of the text or some phrases. These suggestions were discussed in an expert panel with the translators and researchers, and a final version was developed. After revision, the Persian version of the ChIPPA was ready to use. This version was then sent to five occupational therapists to use clinically as a pretest of its suitability in Iran (24).

During the translation, some phrases had no equivalent in Persian, for example, “12 inches doll syndrome”. In other cross-cultural studies, such as the Brazilian study, this phrase was changed to “Barbie doll syndrome” (20), and the Brazilian term was used for the Persian version. For some phrases there was no agreement between translators. In these cases, the phrase closest to the meaning of the content was chosen. For example, the term “initiated” was translated as “creative” by one of the translators, who insisted on this translation. In this instance, the test developer was contacted about the meaning of the concept, and “initiated” was translated as “spontaneously” in Persian.

Some of the play materials were changed for cultural reasons; for example, the two pigs were replaced with two dogs. Eating pork is taboo in Islam, the main religion of Iran, and there are no pigs in Iran. In addition, in the clinical observations, the item referring to templates such as “Thomas the Tank” was replaced by a culturally appropriate example, a story named “Shangul and Mangul”, which is popular and familiar to all Iranian children.

The five occupational therapists, who used the ChIPPA clinically, reported no problems in the translation.

3.2. Validity Study of the Iranian Version of the ChIPPA

Ethics approval was granted by Shahid Beheshti University of Medical Sciences, faculty of rehabilitation in Tehran, Iran.

3.3. Validity Study

3.3.1. Participants

For content validity, 10 occupational therapists engaged in play and pediatric clinical work took part. Of these 10, three were PhD students in occupational therapy and seven were master's students in occupational therapy. For the face validity study, seven occupational therapists with a master's degree and a minimum of two years' experience working with children consented to be in the study.

3.3.2. Instrument

As the ChIPPA is a tool and not a questionnaire, all aspects of content, administration and scoring needed to be assessed. Thus a 21-item questionnaire that had previously been used in the content validity study of the Australian ChIPPA was used (14). The questionnaire had four sections: play materials, scoring, administration and content (see Table 2). Participants were asked to rate the importance and necessity of each item. They scored the questionnaire using a 6-point Likert scale ranging from completely agree through agree, almost agree and disagree to most disagree.

Table 1. Interpretation of CVR for Content Validity
| N (Panel Size) | CVR Critical Value |

An additional questionnaire on the ChIPPA test items was developed in which participants rated each item from 1 to 4 (1 means minimum and 4 means maximum). Because the CVI assesses whether items are related to the construct, the assessment's own items were used, and the experts were also asked about the clarity and simplicity of the items. This questionnaire therefore included questions on the clarity of the sentences, how simple they were to understand and whether the content was related to pretend play.

3.3.3. Procedure

To recruit participants, an invitation to take part in the study was sent by email to 11 occupational therapists. Ten consented to take part. After giving consent, the therapists were emailed the questionnaires and information on the ChIPPA. The therapists returned the questionnaires to the first author.

3.3.4. Data Analysis

In this study, the Lawshe method was used for data analysis. The Lawshe method is a quantitative way to assess content validity. It uses two indexes: Content Validity Ratio (CVR) and Content Validity Index (CVI). For the Lawshe method, responses were grouped into three units: completely agree and agree were counted as one unit, almost agree as one unit and the disagree responses as one unit.

The Content Validity Ratio (CVR) was then calculated for each item using this formula:

Equation 2.

$$\mathrm{CVR} = \frac{N_e - N/2}{N/2}$$

where $N_e$ is the number of specialists who gave the item the highest rating and $N$ is the total number of specialists.

The CVR was interpreted with reference to Table 1.
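As an illustration of the CVR calculation, the following minimal Python sketch applies Lawshe's ratio to a panel of experts. The function name `content_validity_ratio` and its parameter names are hypothetical and chosen only for this example; the panel size of 10 and the counts of 9 and 10 agreeing experts mirror the values reported for this study.

```python
def content_validity_ratio(n_essential: int, n_panel: int) -> float:
    """Lawshe's CVR = (Ne - N/2) / (N/2).

    n_essential: number of experts who gave the item the highest rating (Ne)
    n_panel:     total number of experts on the panel (N)
    """
    if n_panel <= 0 or not 0 <= n_essential <= n_panel:
        raise ValueError("need n_panel > 0 and 0 <= n_essential <= n_panel")
    half = n_panel / 2
    return (n_essential - half) / half


# A panel of 10 in which 9 (or all 10) experts give the top rating
print(content_validity_ratio(9, 10))   # 0.8
print(content_validity_ratio(10, 10))  # 1.0
```

Both values exceed 0.62, the critical value the text cites for this panel.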

Table 2. Content Validity Ratio (CVR) of ChIPPA (n = 10)

| Section | Question | Agree | Almost Agree | CVR |
| --- | --- | --- | --- | --- |
| Play materials | Developmentally appropriate | 10 | 0 | 1 |
| Play materials | Gender neutral | 10 | 0 | 1 |
| Play materials | Farm set suitable for conventional imaginative play | 9 | 1 | 0.8 |
| Play materials | Unstructured play materials suitable for symbolic play | 9 | 1 | 0.8 |
| Administration | The manual instructions are sufficiently detailed | 9 | 1 | 0.8 |
| Administration | The “cubby” house and floor sitting makes the ChIPPA more like a play situation | 9 | 1 | 0.8 |
| Administration | Administration allows self-initiation of play | 9 | 1 | 0.8 |
| Administration | 30 minutes is an adequate time frame to assess play | 9 | 1 | 0.8 |
| Administration | The ChIPPA is clinically viable | 10 | 0 | 1 |
| Scoring | Verb lists are helpful | 10 | 0 | 1 |
| Scoring | Scoring guidelines are clearly explained | 10 | 0 | 1 |
| Scoring | The score sheet is clearly laid out | 10 | 0 | 1 |
| Scoring | Clinical observations add useful information | 10 | 0 | 1 |
| Scoring | After training, I would be confident to score the ChIPPA | 10 | 0 | 1 |
| Scoring | PEPA measures sequences in imaginative play actions | 10 | 0 | 1 |
| Scoring | NOS measures the number of object substitutions | 10 | 0 | 1 |
| Scoring | NIA measures the number of imitated actions | 10 | 0 | 1 |
| Content | The ChIPPA assesses a child’s ability to self-initiate play | 10 | 0 | 1 |
| Content | PEPA assesses cognitive elements of play | 10 | 0 | 1 |
| Content | NOS assesses cognitive elements of play | 10 | 0 | 1 |
| Content | ChIPPA provides information on play not obtained through other assessments | 10 | 0 | 1 |

CVI analysis was applied to the questionnaire on the ChIPPA test items. The CVI was calculated using this formula:

CVI = (number of specialists giving a score of 3 or 4) / (total number of specialists)
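The CVI formula can be sketched in a few lines of Python. The function name `content_validity_index` and the example ratings are hypothetical and are not data from this study; the only assumption taken from the text is the 1 to 4 rating scale, with ratings of 3 and 4 counted as agreement.

```python
def content_validity_index(ratings: list[int]) -> float:
    """Proportion of experts who rated an item 3 or 4 on the 1-4 scale."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    if any(r not in (1, 2, 3, 4) for r in ratings):
        raise ValueError("ratings must be integers from 1 to 4")
    agreeing = sum(1 for r in ratings if r >= 3)
    return agreeing / len(ratings)


# Hypothetical ratings from five experts for a single item
print(content_validity_index([4, 3, 4, 4, 3]))  # 1.0
print(content_validity_index([4, 2, 4, 3, 1]))  # 0.6
```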

3.4. Reliability Studies

3.4.1. Participants

Thirty-one children aged 4 years to 5 years and 11 months participated in the study. These children had no physical or mental disabilities, and their parents and teachers reported no concerns related to their learning or development. The children were chosen from four regions (north, south, east and west) of Tehran, the capital of Iran. Exclusion criteria included refusal of the assessment by the child or parents, failure to complete the retest, or feeling unwell at the time of assessment. Children who had any acute or chronic physical problem were excluded from the study.

3.4.2. Instrument

The Child-Initiated Pretend Play Assessment is a standardized norm-referenced assessment designed for children aged 3 years to 7 years and 11 months. The tool consists of two sets of play materials, one for conventional imaginative play and one for symbolic play. The conventional imaginative set includes a farm set with animals and two dolls. The symbolic set contains unstructured play materials such as a shoe box and stones. The toys for 3-year-old children differ from the toys for children aged 4 years to 7 years and 11 months. In this study, we used the version of the ChIPPA for children aged 4 years to 7 years and 11 months.

The child plays with each set for 15 minutes. Each 15-minute session is divided into three 5-minute segments. In the first 5 minutes, the child is involved with the play materials and the examiner gives no guidance but encourages the child to engage with the play materials. In the second 5 minutes, the examiner models 5 actions in any order without disrupting the child’s play. One of the modelled actions is walking the doll. In the last 5 minutes, the modelling stops and the child continues to play.

The ChIPPA scores are the percentage of elaborate pretend play actions (PEPA), the number of object substitutions (NOS) and the number of imitated actions (NIA). These scores are reported for three parts: conventional imaginative, symbolic and combined, which gives a total of nine scores. To score the PEPA, each action of the child is rated as behavioral, repetitive, functional or elaborate. PEPA is scored as the percentage of elaborate actions over the total number of actions. NOS is scored as the number of objects used as a symbol during play. NIA is the number of times a child immediately copies the examiner in the middle 5-minute segment of each session.
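To make the PEPA arithmetic concrete, the sketch below computes the percentage of elaborate actions from a list of coded actions. It is a simplified illustration and not the ChIPPA manual's scoring procedure: the function name `pepa`, the string labels and the example session are hypothetical, and each action is assumed to carry exactly one of the four codes named above.

```python
from collections import Counter

# The four action codes described in the text (labels are illustrative)
CATEGORIES = {"behavioral", "repetitive", "functional", "elaborate"}


def pepa(action_codes: list[str]) -> float:
    """Percentage of elaborate actions over all coded actions in a session."""
    unknown = set(action_codes) - CATEGORIES
    if unknown:
        raise ValueError(f"unknown action codes: {sorted(unknown)}")
    if not action_codes:
        return 0.0  # assumption: an empty session scores 0
    counts = Counter(action_codes)
    return 100.0 * counts["elaborate"] / len(action_codes)


# Hypothetical session with five coded actions, three of them elaborate
session = ["functional", "elaborate", "elaborate", "repetitive", "elaborate"]
print(pepa(session))  # 60.0
```

A combined score would apply the same ratio to the actions of both sessions pooled together, under the same assumptions.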

3.4.3. Procedure

The children were recruited in two steps. First, we listed Tehran's kindergartens and randomly chose one from each region. The kindergartens were approached and invited into the study, and all consented to take part. Parents gave written consent for their child to take part. The child participants were then selected systematically, in random order, from a list of children whose parents had given written consent. The child’s assent was also obtained.

Children were tested using the ChIPPA. Each assessment was recorded by the kindergarten's CCTV, and the researcher scored the video while at the kindergarten. In this way, the child’s privacy was assured, as no videos were taken out of the kindergarten. For inter-rater reliability, these videos were scored by the first and second authors. For intra-rater reliability, the first author scored the videos twice, one week apart. For test-retest reliability, the first author assessed the same children again after two weeks.

3.4.4. Data Analysis

To estimate inter-rater and intra-rater reliability, the intra-class correlation coefficient (ICC) was used for all PEPA scores and for symbolic NOS and combined NOS. For conventional NOS and all NIA scores, Kappa was calculated.
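The two statistics named above can be sketched in plain Python. The text does not state which form of the ICC or of Kappa was used, so this example assumes Cohen's kappa for two sets of categorical scores and ICC(2,1) (two-way random effects, absolute agreement, single measure); the function names and the example data are hypothetical.

```python
from collections import Counter


def cohen_kappa(first: list[int], second: list[int]) -> float:
    """Cohen's kappa for two sets of categorical scores on the same children."""
    if len(first) != len(second) or not first:
        raise ValueError("score lists must be non-empty and of equal length")
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    freq_a, freq_b = Counter(first), Counter(second)
    # Agreement expected by chance from the marginal frequencies
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return float("nan")  # undefined when both lists use one category only
    return (observed - expected) / (1.0 - expected)


def icc_2_1(scores: list[list[float]]) -> float:
    """ICC(2,1) from a children-by-occasions (or children-by-raters) table."""
    n, k = len(scores), len(scores[0])
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a rectangular table with n >= 2 and k >= 2")
    grand = sum(map(sum, scores)) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ms_rows = ss_rows / (n - 1)                      # between children
    ms_cols = ss_cols / (k - 1)                      # between occasions/raters
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    denom = ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    return float("nan") if denom == 0 else (ms_rows - ms_err) / denom


# Hypothetical data: identical scores give perfect agreement on both statistics
print(cohen_kappa([0, 0, 1, 1], [0, 0, 1, 1]))   # 1.0
print(icc_2_1([[1, 1], [2, 2], [3, 3]]))         # 1.0
```

Kappa corrects for chance agreement, so it can be low even when most children receive the same score on both occasions, which is relevant when most scores are 0.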

4. Results

4.1. Face and Content Validity

No participants marked the ‘disagree’ column, and thus this column is omitted from Table 2. According to Table 1, the CVR must be more than 0.62. In this study, the CVR ranged from 0.8 to 1. Tables 2 and 3 present the CVR results and the CVI results for the face and content validity, respectively.

Table 3. Content Validity Index (CVI) of ChIPPA^a
ItemsBeing RelatedBeing ClearBeing Simple
Completely RelatedClearCompletely ClearSimpleCompletely Simple
B :non play behaviour, the child is not involved with play material5325
R: the child repeated one or more behaviours. The third time he would get R5145
f: functional behaviors are associated with using play materials functionally52323
e: elaborated behaviors are functional actions that child uses in a logical sequence54114
e’: attribution of features in a verbal way. Also refers to absent objects51423
Object substitution: The number of objects which are used in object substitution would record555
Imitative actions: when the child imitates the therapist tick the box. leave the play action part empty555

^a Not related, almost related and related did not have any responses, so they have been deleted from the table; not clear and almost clear did not have any responses, so they have been deleted from the table; not simple and almost simple did not have any responses, so they have been deleted from the table.

4.2. Reliability

Table 4 shows the ICC results for intra-rater, inter-rater and test-retest reliability. Table 5 shows the Kappa results for all imitated action scores and for the object substitution score in the conventional-imaginative play session. These scores (conventional NOS and NIA) have a small range, with most typically developing children scoring 0. That is, for the majority of children in the norm sample, the play materials in the conventional-imaginative session are used in play as they present (e.g., a truck for a truck and an animal for an animal). The majority of typically developing children who can initiate their own play also score 0 for imitation, as they have already planned their play and do not need to imitate the examiner. Table 6 shows the percentage of children who scored 0 in the test and retest.

Cronbach’s alpha was 0.752, which shows good internal consistency.
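For readers unfamiliar with Cronbach's alpha, the following sketch shows the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The text does not say which scores were treated as items, so the function name `cronbach_alpha` and the example data are purely hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores: list[list[float]]) -> float:
    """Cronbach's alpha; item_scores[i][j] is the score of child i on item j."""
    n, k = len(item_scores), len(item_scores[0])
    if n < 2 or k < 2 or any(len(row) != k for row in item_scores):
        raise ValueError("need a rectangular table with n >= 2 and k >= 2")
    # Sample variance of each item across children
    item_vars = [variance([row[j] for row in item_scores]) for j in range(k)]
    # Sample variance of each child's total score
    total_var = variance([sum(row) for row in item_scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical data: two items that rank three children identically
print(cronbach_alpha([[1, 1], [2, 2], [3, 3]]))  # 1.0
```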

Table 4. ICC for Reliability

| Reliability (ICC) | PEPA Conventional Imaginative | PEPA Symbolic | PEPA Combined | NOS Symbolic | NOS Combined |
| --- | --- | --- | --- | --- | --- |
| Intra-rater reliability | 0.995 | 1 | 0.998 | 0.998 | 0.998 |
| Inter-rater reliability | 0.984 | 0.995 | 0.995 | 0.998 | 0.998 |
| Test-retest reliability | 0.685 | 0.826 | 0.865 | 0.966 | 0.966 |

Table 5. Kappa for Reliability

| Reliability (Kappa) | NIA Conventional Imaginative | NIA Symbolic | NIA Combined | NOS Conventional Imaginative |
| --- | --- | --- | --- | --- |
| Intra-rater reliability | 1 | 1 | 1 | 1 |
| Inter-rater reliability | 1 | 1 | 1 | 1 |
| Test-retest reliability | 0.46 | 0.36 | 0.25 | 0.46 |

Table 6. Percentage of 0 in Test and Retest

| Score | Percentage of 0 in Test | Percentage of 0 in Retest |
| --- | --- | --- |
| NOS conventional | 87.1 | 93.5 |
| NIA conventional | 54.8 | 64.5 |
| NIA symbolic | 80.6 | 90.3 |
| NIA combined | 45.2 | 58.1 |

5. Discussion

One of the most important aspects of translating an assessment from one culture to another is the quality of the translation and the simplicity of the language. When test developers choose clear vocabulary and phrasing, the translation process is facilitated (25).

In the current study, six translators who were fluent and experienced in translation between English and Persian took part, and their input, together with forward and backward translation, achieved an acceptable translation. The ChIPPA was translated in two steps. First, the English text was translated into Persian and cultural adaptations were made. In the second step, the Persian version was back-translated into English and this version was confirmed by the test developer. Three professors also gave input into the translation. A final Persian version was trialled clinically by five occupational therapists and found to be relevant to occupational therapy practice in Iran.

The most fundamental issue in developing an assessment is its validity. While reliability concerns the accuracy and consistency of an assessment, validity concerns whether the assessment measures what it is intended to measure (26).

The content validity of the test was analysed using the CVR score. For all items, the CVR score was 0.8 to 1, indicating that all items were acceptable according to the Lawshe method. There was also no disagreement with any of the items. In a similar study by Stagnitti, eight experts were recruited and the results were similar to those of our study (27). Content validity of the Persian version of the ChIPPA was thus established.

Face validity concerns whether items are clear, simple and relevant to what is being measured. In addition, participants were asked whether the test measures what it is supposed to measure. In the Persian version of the ChIPPA, the explanations of the items were found to be simple, clear and related to the topic. CVI scores were 1 for every item, which is the maximum possible. When the ChIPPA was administered clinically, its face validity was supported once again: the participating children liked the test and enjoyed the play. They did not feel that they were being assessed and found it playful.

Reliability is an important index for the practical use of an assessment (28). In this study, three kinds of reliability were examined. Intra-rater reliability was excellent, with ICC values of 0.995 to 1; the highest scores were for symbolic PEPA, conventional NIA and NOS. In the Brazilian study, the ICC was 0.92 to 1, which is consistent with our study. In Pfeifer’s research, the highest scores also belonged to conventional NIA and NOS (20).

The ICC for inter-rater reliability was excellent, with scores ranging from 0.984 to 1, and the highest scores belonged to conventional NIA and NOS. In the Brazilian study, the ICC results were -0.13 to 0.76, and the lowest score was for conventional PEPA (-0.13). Pfeifer et al. argued that the English language of the test was difficult and that the researchers had difficulty learning the scoring for NOS (20). In another study, by Stagnitti et al., Kappa was used for inter-rater reliability, measured between three examiners, and the score was 0.96 to 1. These results were consistent with our study (17).

For test-retest reliability, the ICC for PEPA, symbolic NOS and combined NOS was 0.685-0.988. This indicates that the ChIPPA has moderate to excellent test-retest reliability. The lowest ICC was for conventional PEPA and the highest was for symbolic and combined NOS. Combined PEPA was the most stable score in Stagnitti and Unsworth’s study (ICC = 0.84). In our study, the ICC of combined PEPA was 0.87, but symbolic and combined NOS were the most stable (23). Stagnitti and Unsworth argued that one reason for differences in NOS between the test and retest is the difference between play topics, as this score is related to the narrative of the play script (23). In the current study, test-retest reliability using Kappa was moderate for conventional NOS and poor for NIA. In Stagnitti’s study, there were no differences between the test and re-test. In typically developing children, the vast majority score 0 for NIA and conventional NOS. In the current study, there was a difference between the test and re-test in NIA scores. Most of the children remembered the models in the retest session. The mean conventional NIA in the test for Iranian children was 0.93, which indicates less than one imitated action per child. For symbolic NIA, the mean was 0.14 in the test and decreased to 0.1 in the retest, which still translates to less than one imitated action per child. We speculate that the reason for this change is what the child thought at the first assessment when the examiner started modelling: the child thought that the therapist was going to play with her/him. Thus, when the therapist started waving the doll’s arm, the child responded by waving the doll’s arm. After a while, the child understood that the examiner was not initiating play with her/him and so did not imitate any further actions.
Stagnitti also noted that, for typically developing children, if there are any imitated actions, it is usually waving the doll’s arm in response to the examiner’s modelled action. This is more of a social response, even though it is scored as an imitated play action (14).

5.1. Limitations

The most important limitation of this research was the difficulty of finding cooperative kindergartens equipped with CCTV. Another problem was the difficulty of buying the test and accessing scientific articles because of political sanctions.

5.2. Conclusion

As the ChIPPA is an accepted norm-referenced assessment and there is no pretend play assessment in Iran, our goal was to establish its validity and reliability for Iranian children. We first translated the tool into Persian, and the test's developer confirmed the translation by reviewing the back-translated form.

In the validity part, we found appropriate content and face validity.

The Persian version of the ChIPPA had excellent inter-rater and intra-rater reliability. PEPA and combined and symbolic NOS had good to excellent test-retest reliability. Test-retest reliability was moderate for conventional NOS and poor for NIA. However, for NIA, the children had less than one imitated action per child in each assessment.

In conclusion, the ChIPPA is a suitable pretend play assessment for Iranian children. A normative study is suggested so that a child’s play score can be compared with that of peers.



