The feasibility of the ASQ-3 “home procedure” was assessed in a field trial in a North Indian urban setting. Ahead of the five-month data collection, there were 11 days of training, which included translation, cultural adjustments and standardization of the examiners. In the translation process, four items were changed to make them appropriate for the study population. The general feedback, both during the training of the examiners and during the clinical trial, indicated that children, caregivers and examiners commonly found the ASQ-3 “home procedure” enjoyable. The examiners experienced the ASQ-3 “home procedure” as a reasonable and feasible instrument to administer in the current clinical trial. During the study, all initiated sessions were completed, with very few missed observations. The ICC values show a high degree of inter-observer agreement both during standardization and in the main study, indicating that the ASQ-3 is feasible in terms of collecting reliable data.
Furthermore, the total ASQ scores were spread over the entire range of possible values, although some items did not show any variability. The correlation coefficients showed satisfactory concurrence between the five subscales and the total scale, but the standardized alpha values varied across subscales and age levels, indicating some weakness in internal consistency.
For the cultural adjustments of the ASQ-3 forms, four items in the 11 relevant forms were identified as inappropriate in a North Indian context and were changed. This is in accordance with other studies on the translation and adjustment of the ASQ to new cultural contexts, which report similar changes at item level [10, 11, 17]. Some items seem to be more challenging than others to use in other cultures. For example, the item concerning a fork was changed in the present study. Likewise, a study in Ecuador reports that items involving the use of a fork were removed because forks are not commonly used. In the previously mentioned study from India, the items with forks were also mostly left unanswered, indicating that the items were irrelevant for the children in the sample. Furthermore, mirrors were found to be uncommon in the present study population, as has also been demonstrated in previous adaptations. In the ASQ-3 manual, the mirror items are highlighted as possibly problematic for many cultures. This could suggest that some items are more culturally specific than others and should be considered with particular care when interpreting the results of studies, as well as when necessary adjustments are made in future studies.
In the present study, eighteen items showed no variability because all children in the specific age categories had developed the skill relevant to the item, for example the skill of walking in the Gross Motor subscale. This might have been incidental, since the groups at each age level were small (ranging from 16 to 52 participants in each age category). However, the number of constant items may also be an expression of cultural differences between North India and the US in child-rearing practices and in expectations of children’s development. This last assumption gives rise to the idea that the 18 constant items are not developmentally appropriate for this North Indian sample of infants and young children, and should be adjusted and/or regrouped by age prior to further use.
The internal consistency of the ASQ-3 when transferred to a North Indian setting was expressed by correlations between the total scores and the subscale scores, and by standardized alphas. The strong and consistent correlation coefficients between each of the five subscales and the total ASQ-3 scale indicate concurrence. The moderate correlation coefficients among the five subscales are expected: they indicate a certain degree of concurrence between the subscales, while at the same time underlining that the subscales measure different developmental skills. These results are in accordance with the correlations between the different subscales and the total scores described in the ASQ-3 manual. For the standardized alphas, however, the picture is not as clear. The 66 alpha values range from highly internally consistent to unsatisfactory and, in two instances, negative values. The standardized alphas for the total scale at the different age groups generally indicate that the scale is highly internally consistent and measures the same thematic areas. For the subscales, however, the values vary. The calculations of the standardized alphas therefore reveal additional problematic items that cause unsatisfactory alpha values and even negative item covariance. These items are inconsistent with the other items in the subscale, and therefore might not assess the same developmental area in this setting. Analyses of the relevant subscales with the adapted items removed do not consistently lead to improved internal consistency, indicating that the adapted items are not the primary cause of the poor internal consistencies. The problematic items should be scrutinized further in order to understand why certain items show inconsistency in this cultural setting. With further adjustments to certain items, it may be possible to improve the internal consistency of the scales and thereby increase the level of reliability.
The calculations of the standardized alphas are sensitive to the number of items included in the analysis. In the alpha calculations for the total scale, 30 items are included, while only six items are included in the calculations of the subscale alphas. Constant items are excluded from the analysis of standardized alphas, and the number of items may therefore be even fewer than six at certain age levels in this study, since a total of 18 items are constant. This may reduce the alpha values in the relevant subscales and age levels even further. Two alpha values are particularly problematic in our calculations. These are in the Personal Social scale at 24 and 36 months, where items cause a negative average covariance and therefore violate the assumptions of the calculations, resulting in no alpha values being shown in the results.
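The dependence of the standardized alpha on the number of items can be illustrated with a minimal sketch. It assumes the textbook definition of the standardized alpha, k·r̄ / (1 + (k − 1)·r̄), where k is the number of non-constant items and r̄ is the mean inter-item correlation; the text above does not spell out the formula or the software used, so this is an illustration under that assumption rather than a reproduction of the study's analysis. The function names and the toy data are hypothetical.

```python
import numpy as np


def alpha_from_mean_r(r_bar, k):
    """Standardized alpha for k items with mean inter-item correlation r_bar."""
    return k * r_bar / (1 + (k - 1) * r_bar)


def standardized_alpha(scores):
    """Standardized alpha from an (n_children, n_items) matrix of item scores.

    Constant items are dropped first, because their correlations with other
    items are undefined. Returns (alpha, k), where k is the number of items
    actually used in the calculation.
    """
    scores = np.asarray(scores, dtype=float)
    keep = scores.std(axis=0) > 0            # mask out items with no variability
    items = scores[:, keep]
    k = items.shape[1]
    if k < 2:
        return float("nan"), k               # alpha needs at least two items
    corr = np.corrcoef(items, rowvar=False)  # k x k inter-item correlations
    r_bar = corr[np.triu_indices(k, 1)].mean()  # mean of the off-diagonal pairs
    return alpha_from_mean_r(r_bar, k), k


if __name__ == "__main__":
    # Same mean correlation, different numbers of items.
    for k in (4, 6, 30):
        print(k, round(alpha_from_mean_r(0.3, k), 3))

    # Hypothetical toy subscale: item scores of 0, 5 or 10; last item constant.
    toy = [[10, 10, 10, 10],
           [5, 10, 0, 10],
           [0, 5, 5, 10],
           [10, 10, 5, 10],
           [5, 0, 10, 10]]
    print(standardized_alpha(toy))
```

With a mean inter-item correlation of 0.3, this formula gives about 0.72 for six items and about 0.93 for 30 items, so a six-item subscale that loses items because they are constant can be expected to show a lower alpha than the total scale even if the items are equally well correlated. A negative mean correlation yields a negative alpha, which corresponds to the situation described for the two problematic Personal Social values.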
In the technical report of the ASQ-3 manual, the standardized alpha values from the authors’ sample of 18 000 children are listed. It was concluded that the overall internal consistency of the subscales was good to acceptable. However, the table of alphas for all the age intervals has values from 0.51 to 0.87, and the Personal Social subscale has the poorest values. In a study on the cross-cultural adaptation of the ASQ-2 to a Korean setting, the standardized alpha values of all subscales ranged from 0.30 to 0.91, again with the poorest values in the Personal Social subscale. In their discussion of the study, Heo, Squires and Yovanoff argue that Personal Social items such as eating and dressing skills will give rise to differences between the Korean and the US samples. Gladstone et al. argue similarly, in their report on the modification of Western screening tools for a Malawian setting, that cultural differences often appear in the area of social development. These assumptions are in accordance with the present study, where the Personal Social subscale shows the poorest alpha values overall. In the process of further adjusting the ASQ-3 to a North Indian setting, the Personal Social subscale should be handled with particular care.
We administered the ASQ-3 as a “home procedure”. Feedback and observations during the sessions indicate that the ASQ-3 “home procedure” was in general an enjoyable time for both children and caregivers. Examiners experienced the adjusted ASQ-3 as reasonable for assessing children from the area. This indicates that the face validity of the adjusted ASQ-3 was satisfactory. Sessions were brief, and all 422 children completed their session once it was initiated. Children were given time during sessions to practice with possibly unfamiliar material and were scored based on their accomplishments during sessions. Because information could be collected both from observation and from the caregiver’s report, missing data were scarce. These factors support the feasibility of the ASQ-3 “home procedure” in large population-based studies. Furthermore, the developmental assessment was conducted at a low cost. The examiners were not psychologists, the ASQ-3 kit was purchased online, and only one kit was required for the study site. Necessary materials and equipment for the “home procedure” were purchased at local markets or downloaded from the Internet. Accessible, low-cost tools that are easy to use and enjoyable for the children in a given culture are in accordance with the recommendations of Fernald, Kariger, Engle and Raikes in their toolkit for the assessment of child development in low- and middle-income countries.
However, the “home procedure” approach does require some training of examiners, in addition to practice sessions after the initial training. In our study we conducted 11 days of training, which also included discussions of cultural adjustments. The ICCs, both from the standardization exercises during training and from the quality check during the study period, show that the examiners, through intensive training and subsequent practice, managed to obtain a high degree of concurrence in their scoring. The satisfactory ICCs serve as further support that the ASQ-3 “home procedure” may be a beneficial approach for efficiently obtaining reliable data on child developmental status for research purposes.
A challenge of the ASQ-3 “home procedure” for research purposes is that, although the examiners’ intention was to observe as many of the children’s skills as possible during sessions, some ASQ items do not allow this because of their inherent structure. Analysis shows that the Motor scales and the Problem Solving scale include the most items that may be observed by examiners during an assessment session. The two remaining scales, Communication and Personal Social, include more items that require information from the caregiver to score. The scales may therefore be perceived to provide data of different quality: three of the scales provide objective information scored by trained examiners, and two of the scales are more reliant on the subjective report of caregivers.
Parental reports do carry a risk of inaccuracy and/or overstatement of the child’s development due to factors such as social desirability, caregivers’ inexperience in interpreting their child’s skills and/or their inability to accurately report the child’s behaviour. However, the ASQ system is developed and based on the conviction that caregivers can provide information for proper assessments of their children. For instance, a study comparing ASQ completion by low- and middle-income parents in the US with subsequent assessment using the Bayley Scales of Infant and Toddler Development shows no differences in the accuracy of scoring between the two groups of parents, giving support to the idea that parent completion of child development questionnaires gives reliable data also in high-risk groups. For now, when utilizing the ASQ-3 “home procedure” for research purposes in this cultural setting, data should be carefully interpreted with the difference in the quality of information in mind.
The total ASQ-3 scores can range from zero (no scores) to 300 (full score); in our study the scores ranged from 30 to 300. The five subscale scores ranged from zero to 60 (full subscale score). Our results imply that although the data are not perfectly normally distributed, the ASQ-3 managed to identify children at both ends of the scale. The total ASQ scores have a mean of 231.9 and an SD of 50, while the mean subscale scores range from 44.8 to 47.8. A study by Kerstjens et al. compares mean subscale values between Dutch, US, Norwegian and Korean samples. The mean values from our study are generally lower on all subscales, except for the Fine Motor subscale, where the mean value from our study is slightly larger than in the Dutch and US samples but still lower than in the Norwegian and Korean samples. The intention of this study has not been to formally validate the ASQ-3 for a North Indian setting or to establish cut-off scores for developmental delay in the children. The differences in mean subscale values should therefore be interpreted with care. Fernald, Kariger, Engle and Raikes emphasize that when cut-off scores are not established for the culture where the screening tool is used, its use should be limited to that of comparing groups. The differences between the mean values in our study and those in other studies underline this statement. Until further validation of the ASQ-3 has been conducted for this particular population, no cut-off scores are available for this North Indian sample, and use of the data should be limited to the comparison of groups.
When evaluating the transfer of an assessment tool to a new cultural context, test-retest reliability is important. Within the framework of this study, such an evaluation was not possible. This is a definite weakness of the study. Furthermore, piloting the translated questionnaire prior to the study would have been preferable, and would have given room for further adjustments ahead of the study start based on preliminary calculations of internal consistency, variability and constant items. These limitations of the study, together with the other remarks in the Discussion section, should lay the groundwork for further attempts to transfer the ASQ-3 to new cultural settings.