This report examined the reproducibility and validity of an FFQ designed to capture the usual intake of nutrients and major foods in a rural Chinese population. The results demonstrated that the FFQ had reasonable reproducibility (correlation coefficients ≥0.58 and weighted κ statistic > 0.45) for all selected foods and nutrients and fair to moderate validity (correlation coefficients > 0.40 and weighted κ coefficients > 0.3) for most of the foods and nutrients.
The means of some nutrients and foods from FFQ1 were slightly higher than those from FFQ2. However, no significant difference was found for most items (retinol being the exception), indicating that the learning effect was not a major concern. In China, people tend to mix several food items together, which makes it difficult to estimate the amount of each item accurately, and they might overestimate the intake of some items when completing an FFQ. However, a noteworthy difference between FFQ1 and dietary records was seen only for tuber crops, fruits, white meat, carotene, and retinol, indicating that overestimation by the FFQ did not occur for most items.
In this study, the FFQ was administered twice, 1 month apart, to test its reproducibility, an interval similar to that used in other reports [29,30,31]. Because the two FFQs were completed a month apart, their recall periods overlapped by 11 months; however, the two surveys were intended to examine reproducibility, and as in many other studies [1, 29, 31], this overlap was unlikely to affect the results significantly. The interval should be long enough for participants to forget their previous responses but short enough that their dietary and lifestyle habits do not change. The length of an FFQ and the number of food items should be decided based on the objective of the study, food accessibility, and the variability of food consumption in the target population [1, 2]. In this study, the eating habits and lifestyle of residents had not changed over time as much as those of many other Chinese people. We selected the most commonly consumed dietary items, covering more than 97.5% of typical foods in the region, which could reflect usual dietary habits.
In testing reproducibility, both crude and adjusted Spearman correlation coefficients showed that FFQ1 and FFQ2 were moderately to strongly correlated for macronutrients (0.70–0.75), micronutrients (0.61–0.81), and foods (0.58–0.92). The correlation coefficients in this study were higher than those in other Chinese studies [4,5,6,7,8,9, 13, 14]. This might be because most Chinese studies adopted an interval of 9 to 24 months when testing the reproducibility of FFQs, which might increase the risk of changes in dietary habits. Masson and colleagues’ criteria require that more than 50% of participants be correctly classified into the same tertile and fewer than 10% into the opposite tertile. In this study, more than 50% of participants were correctly classified into the same tertile and fewer than 8% into the opposite tertile, which indicated reasonably good agreement and little misclassification for all foods and nutrients. The weighted κ statistic further showed moderate to good inter-rater agreement (0.45–0.81) for all foods and nutrients. Dietary consumption in the study population lacked diversity: the type and quantity of food consumed by local residents remained consistent over a relatively long period, which might also explain the stronger correlations and better agreement for foods and nutrients between the two FFQs.
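The tertile cross-classification described above can be illustrated with a minimal Python sketch. It is not the analysis code used in this study: the function names, the rank-based tertile assignment (ties broken by input order), and the intake values are all assumptions made for illustration.

```python
def tertile_labels(values):
    """Assign each value to tertile 0, 1 or 2 by rank (ties broken by input order)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = min(2, rank * 3 // n)
    return labels


def cross_classify(method_a, method_b):
    """Return (% in the same tertile, % in the opposite tertile) for paired intakes."""
    a = tertile_labels(method_a)
    b = tertile_labels(method_b)
    n = len(a)
    same = sum(1 for x, y in zip(a, b) if x == y)
    # "Opposite" means lowest tertile by one method and highest by the other.
    opposite = sum(1 for x, y in zip(a, b) if abs(x - y) == 2)
    return 100.0 * same / n, 100.0 * opposite / n


# Hypothetical intakes (g/day) for nine participants; NOT study data.
ffq1 = [120, 95, 210, 180, 60, 150, 300, 75, 240]
ffq2 = [110, 100, 190, 200, 70, 260, 280, 90, 160]

same_pct, opposite_pct = cross_classify(ffq1, ffq2)
# Thresholds as stated in the text: >50% same tertile, <10% opposite tertile.
meets_criteria = same_pct > 50 and opposite_pct < 10
print(f"same: {same_pct:.1f}%  opposite: {opposite_pct:.1f}%  ok: {meets_criteria}")
```

With these made-up values, five of nine participants fall into the same tertile and none into the opposite one, so both thresholds are met.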
Many factors may influence the evaluation of validity, such as the reference method, the number of days of diet tracked, the recording period, and the homogeneity of intake within participants. Dietary recall usually represents an optimal comparison method for measuring food intake, because the sources of error in dietary recalls are largely independent of the errors associated with a food frequency questionnaire [1, 2]. Some researchers have suggested that an optimal study design rarely requires more than four or five days of dietary recall for each participant [2, 34]. In this study, we collected two sets of three consecutive days of dietary recalls, which have some advantages for exploring day-to-day variation in intake. However, such a short recording period cannot account for seasonal or monthly variations in food consumption. This may be the major reason why the correlation coefficients and κ statistics for some nutrients and foods were relatively low between dietary records and FFQ1.
The validity of the FFQ was assessed by comparing food and nutrient intakes from FFQ1 with those from dietary records. This avoided some extra influences (such as learning effects) and made the results easier to interpret. Between FFQ1 and dietary records, correlations were moderate for energy (0.55), macronutrients (0.41–0.58), and most micronutrients and foods (0.40–0.68), although the correlation coefficients for a few micronutrients (riboflavin and selenium) and foods (white meat, and nuts and seeds) were below 0.40. Compared with other studies that used the same approach as ours, the correlation coefficients in this study were similar to or larger than those in other areas of China [6,7,8, 11, 13, 14]. The Spearman correlation coefficients for food items and nutrients decreased after adjustment for energy, which might be due to high inter-person variation in the frequency and amount of food intake among the study subjects. For most nutrients and foods, the percentage of participants correctly classified into the same tertile was higher than 50%, which indicated good agreement between FFQ1 and dietary records according to Masson and colleagues’ criteria. In addition, the percentages of participants classified into the opposite tertile were lower than 10% for most nutrients and foods, apart from white meat and nuts and seeds, which indicated that misclassification between FFQ1 and dietary records was small. Compared with results from other studies, the percentages of agreement were similar to those in Taiwan and some western countries [35,36,37,38] and higher than those in Belgium (32–76%) and Australia (35–54%). Meanwhile, misclassification for most items was lower than that in Taiwan and some western countries [36,37,38,39,40,41].
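For readers less familiar with the Spearman coefficient reported above, the following Python sketch shows one common way to compute it: rank both series (averaging tied ranks) and take the Pearson correlation of the ranks. The function names and the intake values are hypothetical, and the sketch does not reproduce the energy adjustment used in this study.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with nonzero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical daily intakes for five participants; NOT study data.
ffq_intake = [1.0, 2.0, 3.0, 4.0, 5.0]
record_intake = [2.0, 1.0, 4.0, 3.0, 5.0]
rho = spearman(ffq_intake, record_intake)
print(round(rho, 2))
```

Because the coefficient depends only on ranks, it is insensitive to a method that systematically over- or underestimates intake, provided the ordering of participants is preserved.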
The weighted κ statistic showed moderate agreement for fiber, vitamin E, calcium, cereals, and fruits (0.40–0.49), fair agreement for most other foods and nutrients (0.30–0.38), and fair but lower agreement for riboflavin, iron, white meat, and nuts and seeds (0.21–0.29). The weighted κ values in this study (0.21–0.49) were similar to those reported in Britain (0.23–0.66) and Belgium (0.10–0.71) [39], which indicated acceptable inter-rater agreement.
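As a rough illustration of how a weighted κ on tertile categories is obtained, the sketch below compares weighted observed agreement with the agreement expected by chance from the marginal distributions. Linear weights are an assumption here, since the weighting scheme is not stated in this section; the function name and the tertile labels are likewise hypothetical.

```python
def weighted_kappa(a, b, k=3):
    """Linearly weighted kappa for two lists of category labels in 0..k-1."""
    n = len(a)
    # Joint proportions: observed[i][j] = share rated i by method A and j by method B.
    observed = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        observed[x][y] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    p_obs = 0.0  # weighted observed agreement
    p_exp = 0.0  # weighted agreement expected by chance
    for i in range(k):
        for j in range(k):
            weight = 1.0 - abs(i - j) / (k - 1)  # assumed linear weights
            p_obs += weight * observed[i][j]
            p_exp += weight * row[i] * col[j]
    # Undefined (division by zero) if every label falls in a single category.
    return (p_obs - p_exp) / (1.0 - p_exp)


# Hypothetical tertile labels (0 = lowest, 2 = highest); NOT study data.
ffq_tertiles = [1, 0, 2, 1, 0, 1, 2, 0, 2]
record_tertiles = [1, 0, 1, 2, 0, 2, 2, 0, 1]
kappa = weighted_kappa(ffq_tertiles, record_tertiles)
print(round(kappa, 2))
```

In this made-up example, five of nine pairs agree exactly and four differ by one tertile, giving a weighted observed agreement of 7/9 against a chance expectation of 5/9, and hence κ = 0.5.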
We found a weak association and/or low agreement between FFQ1 and dietary records for a few foods and nutrients, especially white meat and nuts and seeds. The mean white meat intake from FFQ1 (2.2 g/day) was much lower than that from dietary records (4.7 g/day). This might be because the dietary recall was self-administered with open-ended questions, whereas the FFQ was administered by in-person interview with closed-ended questions. Although the errors from FFQs and dietary recalls are independent and dietary recall has been suggested as an adequate comparison method for the target instrument, self-monitoring of food intake in dietary recalls may lead to changes in eating behavior and may make participants pay more attention to their diet. The participants might have consumed more white meat, or overestimated their white meat intake, during the period in which they kept the dietary diary. However, the mean white meat intake from FFQ1 in this study was close to that reported in another study of a similar population, which tracked food intake over 1 year and found a comparably low intake of white meat (3 g/day). This suggested that the FFQ could reasonably reflect yearly white meat intake. Agreement between FFQ1 and dietary records was also lower for the consumption of nuts and seeds. Cross-classification analysis may assign participants whose intakes lie close to the cutoff points to different tertiles, which may increase the percentage of participants classified into the opposite tertile and lower the weighted κ statistic. Another reason may be that 6 days of dietary recalls may not reflect yearly consumption of nuts and seeds, because this consumption varies seasonally in rural areas. However, there was no significant difference in nuts and seeds intake between FFQ1 and dietary records.
Moreover, the mean nuts and seeds intake from FFQ1 in this study was close to that reported for Chinese adults and for the same target population, which suggested that the FFQ can to some degree reflect the consumption of nuts and seeds.
The major strengths of this study include the multiple tools and approaches adopted to estimate portion sizes during data collection, the high participation rate, and the ability to recruit a relatively representative sample. However, this study has several limitations. First, we acknowledge that two three-day dietary recalls might not be adequate to reflect seasonal effects and other poorly defined fluctuations in dietary consumption. Nonetheless, the dietary records covered 4 weekdays and 2 weekend days, which to some extent could capture day-to-day variation. Second, the sample size was relatively small, which may have lowered the statistical power. Finally, this study assessed only the relative validity of the FFQ by using dietary recalls, rather than its criterion validity by using biomarkers of dietary exposure.