Development and validation of a quantitative food frequency questionnaire to assess dietary intake among Lebanese adults

Background The food frequency questionnaire (FFQ) is the most frequently used method to assess dietary intake in epidemiological studies evaluating diet-disease association. The objective of this study was to validate a FFQ for use among Lebanese adults by evaluating various facets of validity and reproducibility. Methods The quantitative 164-item FFQ was validated against the average of six 24-h dietary recalls (DRs) in a sample of 238 Lebanese adults. Reproducibility of the FFQ was assessed by administering it twice within a 1-month interval. Results Positive statistically significant Pearson correlations were observed for most macro- and micronutrients between the FFQ and the six 24-h DRs, ranging from 0.16 to 0.65, with two thirds of the correlation coefficients exceeding 0.3. Energy, gender, and age-adjusted statistically significant Pearson correlation coefficients ranged from 0.14 to 0.64, with two thirds of the coefficients exceeding 0.2. Intakes from the FFQ were mostly higher than those of the 24-h DRs. Mean percent difference between nutrient intakes from both dietary methods decreased remarkably after using energy-adjusted mean intakes. Values were acceptable to good for all macronutrients and several micronutrients. Cross-classification analysis revealed that around 64.3 to 83.9% of participants were classified into the same or adjacent quartile, whereas grossly misclassified proportions ranged from 3.7 to 12.2%. Weighted kappa values ranged from 0.02 to 0.36, with most of them exceeding 0.2. In indirect validity analysis, key nutrient mean intakes estimated from the six 24-h DRs were significantly positively associated with tertiles of food groups derived from the FFQ. Bland Altman plots showed that the majority of data points fell within the limits of agreement (LOA) for all nutrients. As for reproducibility analysis, ICC values were all statistically significant, ranging from 0.645 to 0.959, and Bland Altman plots confirmed these results.
Conclusions Based on various aspects of validity and reproducibility, and an extensive range of statistical tests, the present FFQ developed for a Lebanese community is an acceptable tool for dietary assessment and is useful for evaluating diet-disease associations in future studies.


Introduction
Several epidemiological studies investigate the effect of diet on health and non-communicable diseases such as obesity, diabetes, cancers, as well as neurological, endocrine, and immunological disorders [1]. Such studies require precise methods to evaluate long-term dietary intake in order to carry out an extensive dietary assessment. A universal epidemiological method for nutritional assessment does not currently exist [2], and the selection of an adequate instrument depends on the study objectives [3]. Multiple dietary instruments are used to determine nutritional intake [2], and they are divided into objective and self-reported instruments [4]. Objective methods include nutritional biomarkers [3], whereas self-reported dietary instruments are generally divided into short-term methods (24-h dietary recalls (DRs) and food records) and long-term methods (food frequency questionnaire, FFQ) [3]. Each dietary assessment tool has specific advantages and disadvantages [5]. Nutritional biomarkers are the method of choice to assess micronutrient intakes [3]; however, they are relatively expensive and impose a burden on respondents [6]. Twenty-four-hour DRs provide reliable quantitative estimates of dietary intakes with no reactivity bias; however, the results may be affected by memory and do not represent the usual dietary intake [6]. Food records, especially weighed food records, have the advantage of being accurate without relying on memory (more accurate portion sizes with no food omission); nevertheless, they require relatively high cooperation from participants, whose motivation might decrease over time, and intakes can be affected by the process of regular food recording [6]. Time and economic restrictions make the above-mentioned methods unsuitable for use in epidemiological studies [7]. The FFQ is a simple, less invasive, and inexpensive tool [6,8] that captures the usual dietary intake because it covers a longer period of time [9].
In fact, when evaluating the association between diet and related diseases, measuring food intake over a period of months to years is more valuable than measuring the intake of a few days [8]. One main disadvantage of the FFQ is the overestimation of dietary intakes. Nevertheless, it is the most frequently used method to assess dietary intake in epidemiological studies [2], especially when investigating diet-disease associations [10]. Adopting a pre-existing FFQ can introduce inaccuracies, as the original objectives might not meet the requirements of the current study, and the FFQ yields different results in different demographic groups [9]. In Lebanon, previous FFQs have been validated for use among children [11] and pregnant women [12], and to assess the intake of antioxidant vitamins [13] and of Middle Eastern and Mediterranean food [14]. A new FFQ providing a detailed assessment of a wide array of foods and nutrients is needed in order to evaluate the association of diet with health and diseases.
Thus, the aim of the current study was to validate a FFQ for use among Lebanese adults by investigating various facets of validity and reproducibility. The objectives of the present study were to i) determine the relative validity of the developed FFQ in measuring energy and nutrient intakes as compared to six non-consecutive 24-h DRs; ii) compare the means of nutrient intakes across tertiles of food groups obtained by the FFQ; and iii) evaluate the reproducibility of the FFQ.

Study design and participants
A sample of 500 participants was drawn from the university database covering both students and employees, using a stratified random cluster sampling, with a status and sex distribution proportionate to that of the university population per campus. To be included in the study, the participants had to be Lebanese, aged between 18 and 64 years, not having medical conditions or taking medications that affect food intake. Selected participants received a letter by email explaining the procedure of the study. The agreement for participation was requested by phone, 5 days after sending the letter. Out of the 500 participants, 305 agreed to participate in the study. After providing written consent, sociodemographic, anthropometric, and dietary data were collected. Based on the "non-individualized method" [15], also called "recommended method" [16], participants with high or low reported energy intake, i.e. outside the range of 500-3500 kcal/day for women and 800-4000 kcal/day for men, were excluded from the analysis [17]. This resulted in a final sample of 238 participants.
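The energy-intake exclusion rule described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the record layout, the function name, and the choice to treat the range bounds as inclusive are assumptions, and the sample records are invented.

```python
# Plausible energy-intake ranges (kcal/day) quoted in the text.
ENERGY_LIMITS = {"female": (500, 3500), "male": (800, 4000)}


def has_plausible_energy(sex, energy_kcal):
    """Return True when reported energy intake lies inside the range for that sex.

    Bounds are treated as inclusive here; the text does not say how
    values exactly on a bound were handled (assumption).
    """
    low, high = ENERGY_LIMITS[sex]
    return low <= energy_kcal <= high


# Hypothetical records, for illustration only (not study data).
participants = [
    {"id": 1, "sex": "female", "energy_kcal": 1850},
    {"id": 2, "sex": "female", "energy_kcal": 3900},  # above 3500: excluded
    {"id": 3, "sex": "male", "energy_kcal": 650},     # below 800: excluded
    {"id": 4, "sex": "male", "energy_kcal": 2700},
]

retained = [p for p in participants
            if has_plausible_energy(p["sex"], p["energy_kcal"])]
print([p["id"] for p in retained])
```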

Development of the FFQ
The quantitative 164-item FFQ was developed by a panel of nutritionists who drafted a pre-final version of the questionnaire that was tested on a representative sample of the target population, composed of 50 students from the university. The respondents completed the FFQ during in-depth interviews where they were asked about its comprehensibility and acceptability [18]. The final version was composed of the following 14 food categories, including culture-specific food items: cereals and grains, dairy products, fruits, vegetables, legumes, meat, fast-food, nuts and seeds, oils and fats, salty snacks, sweets, and beverages (hot, alcoholic, and nonalcoholic). Portion size was determined according to food servings by the World Health Organization Eastern Mediterranean Region guide based on United States dietary guidelines [19]. During the development phase of the FFQ, respondents were asked about their usually consumed portion size for all food items. Final portion sizes were derived from the most commonly observed ones, reflecting consumption patterns in our target population. A standard portion size was designated for each food item, and participants estimated portion sizes by weight, household measures (cups, spoons, and plates), and customary packing size [3]. The number of portions consumed was determined, and the frequency of portion consumption was recorded per day, week, month, or year over the past year. Seasonality of certain food items was accounted for by adjusting the frequency of consumption for the period of the year during which they were consumed. The FFQ took around 30 min to complete, which is the usual reported duration [6]. A list of the food items included in the FFQ is available in table S1 in additional file 1.
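To illustrate how a reported frequency, number of portions, and seasonality could be turned into an average daily amount, a minimal sketch follows. The conversion constants (7 days per week, 30 per month, 365 per year), the function names, and the way seasonality is scaled by months in season are all assumptions made for illustration; the text does not give the exact conversion used.

```python
# Days per reporting unit. The week/month/year lengths are assumptions.
DAYS_PER_UNIT = {"day": 1, "week": 7, "month": 30, "year": 365}


def daily_portions(times, unit, portions_per_time=1.0, months_in_season=12):
    """Average number of standard portions per day over the past year.

    times             -- how often the item is eaten per `unit`
    unit              -- "day", "week", "month" or "year"
    portions_per_time -- standard portions eaten on each occasion
    months_in_season  -- months of the year the item is actually consumed
    """
    per_day = times * portions_per_time / DAYS_PER_UNIT[unit]
    if unit == "year":
        # A per-year frequency already covers the whole year: no scaling.
        return per_day
    # Scale seasonal items by the share of the year in which they are eaten.
    return per_day * months_in_season / 12


def daily_grams(times, unit, portion_g, portions_per_time=1.0,
                months_in_season=12):
    """Average grams per day, given the standard portion weight in grams."""
    return daily_portions(times, unit, portions_per_time,
                          months_in_season) * portion_g


# Hypothetical item: one 150 g portion per day, eaten 3 months of the year.
print(daily_grams(1, "day", 150, months_in_season=3))
```

The resulting daily amounts would then be multiplied by nutrient content per gram from a food composition table to obtain nutrient intakes.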

Dietary validation analysis
The FFQ was validated against the average of six non-consecutive 24-h DRs. Participants completed 24-h DRs on 3 days (two weekdays and one weekend day), repeated twice within a one-month interval, to obtain a total of six non-consecutive 24-h DRs. The average of the six 24-h DRs was used because collecting multiple days is recommended in order to estimate the usual intake from 24-h DRs and investigate its association with biochemical variables and indicators of health status. In general, 4 to 5 days are collected as an appropriate compromise between scientific rigor and practicability for assessing energy and macronutrient intake [7]. This approach reduces measurement errors [20] and provides more reliable associations [6]. The five-step multiple-pass method [21] proposed by the United States Department of Agriculture was adopted. This method provides a more complete and accurate food recall while making it easier on the participants. The five steps start with a quick list of foods consumed, followed by a list of potentially forgotten foods, the time and occasion of the meals for more precision, and detailed quantities and ingredients, ending with a final review [21]. The 24-h DR took 25-30 min to complete, in line with the recommendations [7].
The reproducibility of the FFQ was tested on a smaller sample from the same target population. A total of 52 participants completed the questionnaire twice within a 1-month interval [9]. The period was long enough for the participants to forget their previous responses, but too short for any considerable changes in dietary habits to occur [9]. A period longer than 1 month could lead to seasonal reporting bias [22]. Questionnaires were interviewer-administered rather than self-administered, which increases completion rates and enhances the consistency of the analysis [23,24]. They were filled in by a research assistant who was a trained dietitian with experience in both professional and research domains. The flow diagram of the study is shown in Fig. 1.

Analysis of food consumption data
Nutritional data derived from the FFQ and the 24-h DRs of 6 days were assessed using the Nutrilog software (Nutrilog SAS, Version 2.30, France). Nutrient composition of the Lebanese traditional dishes was derived from the American University of Beirut (AUB) food composition table [25]. For the rest of the food items that are not exclusively traditional Lebanese, data were extracted from the United States Department of Agriculture (USDA) nutrient database version 2010 [26] and the French food composition table (Ciqual) version 2008 [27]. These were carefully chosen to reflect our community's dietary habits in the most accurate way. For products from specific brands, we chose the exact item from the corresponding database. We extracted the nutrient content of each portion based on the amount specified by the abovementioned databases. Hence, nutritional intakes of energy, 18 macronutrients, 11 vitamins, and 10 minerals were retrieved from the FFQ and the average of the six 24-h DRs.

Data collection
Data were collected regarding age, gender, and crowding index. The latter is defined as the total number of coresidents per household, divided by the total number of rooms, excluding the kitchen and the bathrooms. Anthropometric measurements were taken for the description of the population. Weight and height were measured using a scale and stadiometer (Health o meter professional scale, United States). Body mass index (BMI) was calculated and the participants were classified as overweight or obese if the BMI value ranged between 25 and 29.9 kg/m² and ≥ 30 kg/m², respectively [28].
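The crowding index and the BMI classification defined above can be written as a short sketch. The function names and the label used for BMI values below 25 kg/m² are assumptions; only the overweight and obese cutoffs come from the text.

```python
def crowding_index(coresidents, rooms):
    """Coresidents per room, where `rooms` excludes kitchen and bathrooms."""
    if rooms <= 0:
        raise ValueError("rooms must be a positive number")
    return coresidents / rooms


def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def bmi_class(bmi_value):
    """Classify BMI with the cutoffs quoted in the text (25 and 30 kg/m^2)."""
    if bmi_value >= 30:
        return "obese"
    if bmi_value >= 25:
        return "overweight"
    return "not overweight"  # label for BMI < 25 is an assumption


# Hypothetical household and participant, for illustration only.
print(crowding_index(5, 4))
print(bmi_class(bmi(70, 1.75)))
```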

Statistical analysis
Frequencies and percentages were used for categorical variables and means (standard deviation, SD) were used for quantitative data. Validity of the FFQ as compared to the 24-h DRs was assessed for all nutrients using the Pearson correlation coefficient, with adjustment for energy intake, age, and gender. The analysis was repeated for men and women separately because they have several physiological and behavioral differences that distinctively affect their response to health problems [29]. Mean percent difference was calculated to test the difference between mean nutrient intakes from both dietary instruments (agreement at group level). The same test was done using energy-adjusted nutrient intakes (adjustment was done using the nutrient density method [30]). Mean percent difference = [(FFQ − 24-h recall) / 24-h recall] × 100. The outcome was judged according to the following criteria: good: 0.0-10.9%; acceptable: 11.0-20.0%; poor: > 20.0% [31]. The distribution of nutrient intakes was categorized into quartiles to test agreement at individual level including chance, while weighted kappa was calculated to examine agreement at individual level excluding chance. Interpretation was based on the following cutoffs: good: ≥ 0.61; acceptable: 0.20-0.59; poor: < 0.20 [31]. Cohen's kappa was calculated to evaluate the agreement between the two measures, based on dichotomized categories of nutrient intakes from both instruments. We used the following cutoffs for results interpretation: almost perfect agreement: 0.81-1.00; substantial: 0.61-0.80; moderate: 0.41-0.60; fair: 0.21-0.40; none to slight: 0.01-0.20; no agreement: ≤ 0 [32]. Indirect validity corresponds to "the extent to which a test measure of a concept agrees with a reference measure of that concept that has a greater degree of demonstrated validity, even if it is not an exact measure of the concept" [33].
It was examined using one-way ANOVA between food categories derived from the FFQ and nutrient intakes derived from the average of six 24-h DRs. Bland Altman plots were also produced to test the agreement between the two methods; the mean of nutrient intakes between the FFQ and 24-h DRs was plotted against the difference between the two methods. Reproducibility of the FFQ was assessed using the ICC (based on a mean-rating (k = 2), absolute-agreement, 2-way mixed-effects model) as well as the Bland Altman plots. Statistical analyses were performed using IBM SPSS (IBM SPSS Statistics for Windows, Version 20, IBM Corp., Armonk, NY).
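As a worked illustration of the mean percent difference formula and its rating criteria given in this section, a minimal sketch follows. Rating the absolute value of the difference, expressing nutrient density per 1000 kcal, and all function names are assumptions; the analyses themselves were run in SPSS.

```python
def mean_percent_difference(ffq_mean, recall_mean):
    """[(FFQ - 24-h recall) / 24-h recall] * 100, computed on group means."""
    return (ffq_mean - recall_mean) / recall_mean * 100


def rate_percent_difference(pct):
    """Rate a percent difference: good, acceptable or poor.

    The sign is ignored (assumption), so under- and overestimation
    of the same size get the same rating.
    """
    magnitude = abs(pct)
    if magnitude < 11.0:
        return "good"
    if magnitude <= 20.0:
        return "acceptable"
    return "poor"


def nutrient_density(nutrient_amount, energy_kcal, per_kcal=1000):
    """Energy-adjust an intake as amount per `per_kcal` kcal.

    The per-1000-kcal scale is an assumption; the text only names the
    nutrient density method.
    """
    return nutrient_amount / energy_kcal * per_kcal


# Hypothetical group means (kcal/day), for illustration only.
pct = mean_percent_difference(2300, 2000)
print(round(pct, 1), rate_percent_difference(pct))
```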

Results
General characteristics of the sample are presented in Table 1. A total of 238 participants completed the FFQ  Table 2. Statistically significant correlations were observed in most macro and micronutrients between the FFQ and the 24-h DRs of 6 days, ranging from 0.16 for monounsaturated fatty acids (%) to 0.65 for alcohol (%). Saturated and polyunsaturated fatty acids (%) did not show significant correlations. Adjustment for age, gender, and energy intake maintained the significant correlations in most nutrients, except for fat (g), saturated fatty acids (g), and polyunsaturated fatty acids (g). Correlation coefficient values decreased after the adjustment except for carbohydrates (CHO) (%), sugars (%CHO), fat (%), monounsaturated fatty acids (%), protein (%), alcohol (%), vitamin E, vitamin C. Adjusted significant correlation coefficients ranged from 0.14 for carbohydrates (g) to 0.64 for alcohol (%). Significant correlation coefficients of vitamins ranged from 0.21 for vitamin D to 0.50 for vitamin B6. Correlations were not significant for vitamin A and riboflavin. Significant correlation coefficients of minerals ranged from 0.28 for copper to 0.45 for phosphorus.
Pearson correlations between FFQ and six 24-h DRs for men and women are presented in Table S2 in the additional file 1. When correlations were unadjusted, values were higher for saturated fatty acid (%, g), alcohol (%), niacin, pantothenic acid, and magnesium among men whereas fat (%) and zinc were higher among women. When correlations were adjusted, values were higher for saturated fatty acids (g), monounsaturated fatty acids (g), cholesterol, protein (g), alcohol (%, g), niacin, sodium, and selenium among men whereas fat (%), fibers, vitamin D, zinc, and manganese among women.
The mean (SD) of energy intake and all nutrient intakes as estimated by the FFQ and the average of six 24-h DRs are presented in Table 2. Mean intakes from the FFQ are mostly higher than those of the 24-h DRs. Mean intakes stratified by gender are available in the additional file (Additional file 1: Table S3). Men generally reported higher intakes than women. Energy-adjusted nutrient intakes are also presented in Table 2. Energy-adjusted mean nutrient intakes stratified by gender are available in the additional file (Additional file 1: Table S4). Women reported higher sugar and micronutrient intakes compared to men, whereas men reported higher cholesterol and alcohol intakes. Mean percent difference was calculated to test the difference between mean nutrient intakes as well as energy-adjusted mean nutrient intakes from both dietary instruments among the total sample. Mean percent difference decreased remarkably after using energy-adjusted mean intakes. See Table 2 for more details. Table 3 shows the proportions of participants classified in the same category according to the two dietary assessment methods, as well as the corresponding kappa value. Around 49.2-70.6% of participants were classified in the same category. Kappa values ranged from − 0.02 for monounsaturated fatty acids (%) to 0.42 for alcohol (%).
The indirect validity of the FFQ is shown in table S5 in additional file 1, where the mean of energy, macronutrients, and micronutrients estimated from six 24-h DRs are presented for a range of food groups classified by the FFQ. Selective associations between a number of food groups and key nutrients are presented in Table 5. Key nutrient mean intakes were positively associated with tertiles of food groups. While refined cereals showed positive associations with CHO and sodium intake, whole-grain cereals were positively associated with fibers intake and several vitamins and minerals that did not show any associations with refined cereals, such as thiamin, B5, B6, zinc, magnesium, etc. Fruits and vegetables showed positive associations with fibers and vitamin C, while folate and other micronutrients were positively associated with cooked green leafy vegetables. Protein and fat intakes that were not among the associations observed with cereals, showed positive associations with legumes and animal products, including red meat, chicken, fish and shellfish, and dairy products. Intakes of vitamins A, D, and E were positively associated with fish and shellfish intake. Iron intake was positively associated with chicken and red meat, fish and shellfish, and legumes. Processed meats and fast food sandwiches were not positively associated with a noteworthy number of micronutrients but showed a positive association with sodium intake. Sugar intake and alcohol intake were positively associated with dessert and alcoholic beverages respectively. See Table 5 and additional file 1: Table S5 for more details.
Bland Altman analysis was conducted by plotting the mean of nutrient intakes between the FFQ and 24-h DRs against the difference between the two methods [31]. Bland Altman plots for selected macronutrients and micronutrients are presented in Figs. 2 and 3. The  Table 8.

Discussion
Based on various aspects of validity and an extensive range of statistical tests, we demonstrated that the present FFQ developed for a Lebanese community is a useful tool for dietary assessment, when compared to six 24-h DRs. We obtained an acceptable agreement between nutrient intakes of both dietary instruments, given that most participants were correctly classified into the same and adjacent quartiles, with a low level of misclassification. Weighted kappa statistics also showed acceptable results. These findings were further confirmed in the Bland Altman plots and the indirect validity analysis relating nutrient intakes from the 24-h DRs to food groups from the FFQ, indicating a satisfactory agreement between the two methods.
There is no perfect reference method in validation studies. Objective methods such as biochemical indicators are relatively invasive and expensive especially when they aim to test many nutrients. Moreover, biochemical indicators do not exist for some nutrients (total fat, total CHO, total fibers). They are also influenced by dietary factors including day-to-day variation and physiological factors such as nutrient absorption and metabolism, diurnal and menstrual cycles [34]. Diet records also hold several limitations, such as decreased cooperation from the respondents and modification of their dietary intake.
Therefore, multiple 24-h DRs appear to be the primary alternative [34], and they are used by most validation studies as a reference method [9]. In validation studies, it is important to cover many aspects of validity. An in-depth literature review carried out in 2015 showed that FFQ validation studies most commonly used combinations of two to three statistical tests, which may not be sufficient to provide a comprehensive view of the various facets of validity [31]. Moreover, the sole use of correlation analysis is not sufficient in validity studies, as it does not measure agreement between methods [31]. Hence, in the current study, we applied a large number of statistical tests for a more reliable analysis. In addition to correlation analysis, we used percent difference, cross-classification into quartiles, weighted kappa statistics, and Bland Altman plots to measure agreement between the two methods, as well as indirect validity analysis between nutrient intakes from 24-h DRs and food consumption categories derived from the FFQ. While correlation coefficient, kappa statistics, and cross-classification assess validity at the individual level, Bland Altman and percent difference do so at the group level [31]. Regarding correlation analyses, a desirable Pearson correlation coefficient generally ranges from 0.5 to 0.7 [34], with coefficients between 0.2 and 0.45 considered acceptable [31]. In the current study, correlation coefficient values fell within the acceptable range, with a good outcome for alcohol (> 0.5). They were similar to some FFQ validation studies [35,36] and lower than others [14,37]. Adjusting for factors such as age, gender, and energy intake is very important in validation studies. In line with the present findings, it is unrealistic to obtain high values of correlation coefficients after such an adjustment [34].
However, Pearson correlation coefficient cannot be considered the only determinant of validity as it does not test the level of agreement between the two dietary instruments [31].
Results showed that the FFQ tended to overestimate nutrient intakes as compared to 24-h DRs. This finding is consistent with most of the FFQ validation studies [35,36,38,39]. A possible reason for this overestimation is the relatively large number of food items participants have to recall while completing the FFQ in comparison with the 24-h DR [9]. We also described mean nutrient intakes by gender; they were generally higher in men than women, which is consistent with previous findings [37].
Regarding the mean percent difference, it was calculated for both crude and energy-adjusted nutrient intakes. The difference remarkably decreased with the energy-adjusted values. It showed acceptable to good results for macronutrients, vitamins such as vitamin D, thiamin, and niacin, and minerals like phosphorus, potassium, sodium, and iron. Given that the FFQ overestimates energy intake, it seemed more plausible to compare intakes when they are energy-adjusted. This allowed evaluating the nutrient composition of the diet as assessed by both dietary instruments, rather than only crude intakes. In future epidemiological studies, especially those evaluating diet-disease associations, it is crucial to consider adjusting for energy intake among other confounding factors; diet-disease associations should not be the sole result of differences in total energy intake between cases and non-cases [30]. Cross-classification of nutrient intakes into quartiles and weighted kappa calculation showed promising results as per the agreement between the dietary instruments. Regarding the quartile categorization, misclassification was less than 10% among most nutrients, while a relatively high proportion of participants were classified into the same or adjacent quartile. Results were similar to previous FFQ validation studies [35,37,40]. Moreover, most weighted kappa values fell within the acceptable range (between 0.2 and 0.6) [31] while Cohen's kappa values reflected fair agreement (between 0.2 and 0.4) [32]. These results are of utmost importance, given that ranking individuals according to their dietary intakes is fundamental in the investigation of diet-disease associations [31].
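The quartile cross-classification and weighted kappa discussed here can be sketched as follows. Several choices are assumptions, since the text does not specify them: quartiles are assigned from ranks, "grossly misclassified" is taken to mean opposite quartiles (lowest versus highest), and the kappa uses linear weights. The function names and the toy data are hypothetical.

```python
def quartiles(values):
    """Assign each value a quartile 0-3 from its rank (ties broken by order)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank * 4 // len(values)
    return result


def cross_classification(q_a, q_b):
    """Percent in same/adjacent quartiles and percent in opposite quartiles."""
    n = len(q_a)
    gaps = [abs(a - b) for a, b in zip(q_a, q_b)]
    return {
        "same_or_adjacent_pct": 100 * sum(g <= 1 for g in gaps) / n,
        "grossly_misclassified_pct": 100 * sum(g == 3 for g in gaps) / n,
    }


def weighted_kappa(q_a, q_b, k=4):
    """Linearly weighted kappa for two classifications into k ordered groups."""
    n = len(q_a)
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(q_a, q_b):
        observed[a][b] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    disagree_obs = disagree_exp = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)  # linear disagreement weight
            disagree_obs += weight * observed[i][j]
            disagree_exp += weight * row[i] * col[j]
    return 1 - disagree_obs / disagree_exp


# Toy intakes from two instruments, for illustration only (not study data).
ffq = [1, 2, 3, 4, 5, 6, 7, 8]
recall = [1, 2, 3, 4, 5, 6, 7, 8]
print(cross_classification(quartiles(ffq), quartiles(recall)))
print(weighted_kappa(quartiles(ffq), quartiles(recall)))
```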
Bland Altman plots showed a good level of agreement between the two methods. While the positive mean in most plots indicated that the FFQ overestimated intakes, the plots show that the majority of data points fell within the LOA around the mean intake. Indirect validity assesses the relationship between the food consumption categories derived from the FFQ and the nutrient intakes extracted from the 24-h DRs [41]. This type of analysis has rarely been conducted in previous FFQ validation studies [42,43]. Results suggested a good indirect validity; intakes of key nutrients significantly increased with the relative tertiles of food groups that they are usually and logically related to.
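For readers who want to reproduce the quantities behind a Bland Altman plot, a minimal sketch follows. It computes the per-person mean and difference of the two methods, the mean difference (bias), and limits of agreement taken as bias ± 1.96 SD of the differences, which is the conventional definition and is assumed here; the function name and data are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(ffq, recall):
    """Return plot coordinates, bias and limits of agreement (LOA).

    x-axis: mean of the two methods per participant
    y-axis: difference (FFQ - recall) per participant
    LOA   : bias +/- 1.96 * SD of the differences (assumed convention)
    """
    diffs = [f - r for f, r in zip(ffq, recall)]
    means = [(f + r) / 2 for f, r in zip(ffq, recall)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return {
        "means": means,
        "diffs": diffs,
        "bias": bias,
        "lower_loa": bias - spread,
        "upper_loa": bias + spread,
    }


# Toy intakes, for illustration only (not study data).
result = bland_altman([10, 12, 14], [9, 10, 11])
inside = sum(result["lower_loa"] <= d <= result["upper_loa"]
             for d in result["diffs"])
print(result["bias"], result["lower_loa"], result["upper_loa"], inside)
```

A positive bias corresponds to the FFQ giving higher values than the recalls.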
Test-retest reliability displays not only the degree of correlation but also the agreement between measurements. In contrast to Pearson correlation coefficient, paired t-test, and Bland Altman plots, ICC is an advisable measure of reliability that assesses both degree of correlation and agreement between two measures [44]. In the current study, the FFQ yielded good to excellent reproducibility according to ICC results [44], similarly to previous studies [12,14,36,45]. Bland Altman analysis for reproducibility confirmed these findings. Moreover, the interval between repeated measurements (1 month) is adequate in order to minimize dietary changes over time as well as the recall of previous answers [9]. In fact, following a time interval longer than 1 month (reaching 3 months), seasonality bias could emerge and affect food reporting during the second administration of the FFQ [22]. Hence, the resulting reproducibility correlation in the present study could be attenuated. Nevertheless, previous studies have adopted time intervals of 2 weeks [43,46,47], 3 weeks [12,38], 4 weeks [11,36,48], 4 to 6 weeks [13], and 6 weeks [49]. This is the first FFQ validation study conducted in Lebanon to assess most aspects of validity, for a complete range of macro- and micronutrients. In fact, only a few studies worldwide used an extensive number of statistical tests for FFQ validation. Another strength of this study is the number of 24-h DRs (6 days) collected as a reference method for the FFQ validation, which was not common in previous studies. In addition, the sample size, which is relatively larger than in other validation studies, appears sufficient in the context of deriving useful information on questionnaire validity, when combined with 24-h DRs of 6 days [34].
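The ICC form named in the methods (two-way model, absolute agreement, mean of k = 2 administrations) can be illustrated with a short sketch. It uses the usual ANOVA mean-square expression (MSR − MSE) / (MSR + (MSC − MSE) / n) for the point estimate, which is assumed to match what SPSS reports for that model; the function name and toy data are hypothetical, and confidence intervals are not computed.

```python
def icc_absolute_agreement_mean(first, second):
    """ICC for absolute agreement, mean of two administrations.

    first, second -- intakes from the first and second FFQ, one value
                     per participant, in the same participant order.
    """
    n, k = len(first), 2
    rows = list(zip(first, second))
    grand = (sum(first) + sum(second)) / (n * k)
    row_means = [sum(r) / k for r in rows]
    col_means = [sum(first) / n, sum(second) / n]

    # Sums of squares for participants, administrations and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-participant mean square
    msc = ss_cols / (k - 1)               # between-administration mean square
    mse = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return (msr - mse) / (msr + (msc - mse) / n)


# Toy data: second administration is uniformly 1 unit higher.
print(icc_absolute_agreement_mean([1, 2, 3], [2, 3, 4]))
```

In the toy example the ranking is preserved perfectly, yet the systematic shift between administrations lowers the coefficient, which is the property that distinguishes an absolute-agreement ICC from a simple correlation.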
We acknowledge that the present validation study has some limitations. First, the length of the FFQ could have increased the burden on participants, hence impairing the cooperation of the respondents and raising the risk of biased responses and the overestimation of intakes. Therefore, in order to account for this limitation, the FFQ was interviewer-administered, which ensured a more accurate completion of answers [23,24]. Despite this limitation, some studies suggested that food lists reaching 200 items could perform better than shorter ones with 100 items, and the resulting respondent burden "does not seem to be a decisive factor for FFQs" [6]. Second, the present sample of a university community is not necessarily representative of the total population; it includes a higher proportion of women, a higher education level, and a younger age distribution. Third, errors usually associated with both dietary instruments should be taken into consideration, including errors related to memory and estimation of energy and nutritional intakes. It would have been preferable to administer the multiple 24-h DRs several times over the period of 1 year. However, this was not possible due to technical and collaboration issues. In order to account for this limitation, we collected multiple 24-h DRs administered twice within an interval of 1 month. Finally, even though food composition databases were carefully chosen to reflect our community's dietary habits in the most accurate way, the use of multiple food composition tables could still induce a certain level of error.

Conclusion
In the present study, we performed an extensive range of statistical tests and evaluated various aspects of validity and reproducibility. We demonstrated that the FFQ developed for a Lebanese adult community is an acceptable tool for dietary assessment, namely in the context of nutrients distribution and ranking individuals according to their dietary intake. Hence, it is valuable for use in future epidemiological studies evaluating diet-disease associations. This study also showed that caution must be taken in the quantitative assessment of the diet by accounting for energy intake, in addition to gender and other confounding variables.
Additional file 1: Table S1. Food items included in the FFQ; Table S2. Validity of the FFQ: Pearson correlations between first FFQ and mean of six 24-h DRs, stratified by gender (n = 238); Table S3. Mean ± SD comparison of nutrient intake estimated by the FFQ and the average of six 24-h dietary recalls, stratified by gender (n = 238); Table S4. Mean ± SD comparison of energy-adjusted nutrient intake estimated by the FFQ and the average of six 24-h dietary recalls, stratified by gender (n = 238); Table S5. Indirect validity: mean daily nutrient intake as assessed by 24-h DRs of 6 days according to tertiles of food group consumption (FFQ) Additional file 2: Figure S1. Bland-Altman plots of difference between nutrients as predicted by the first FFQ and the mean of six 24-h recalls (n = 238); Figure S2. Bland-Altman plots of difference between nutrients as predicted by the first and second FFQs (n = 52)