We undertook a 2-group, parallel, double-blind, randomized clinical trial (RCT) with a 3-week washout period.
The protocol and consent form for this study were approved by the institutional health science research ethics committee of Université Laval, Quebec. Appointments were scheduled for eligible women, where the risks and benefits of their possible participation were reviewed in detail. The informed consent form was read and signed by them before study inclusion.
Between July 2011 and March 2012, we enrolled non-smoking healthy women aged 20 to 65 years who had normal skin types I or II, as described by Fitzpatrick.
We excluded patients with one or more of the following conditions: pregnancy or breast-feeding, photosensitivity, history of skin cancer, photosensitizing medication, sunbed tanning or sunbathing in the preceding 3 months, planned sunbed tanning or sunbathing during the study period, supplements of any kind (fish oil, coenzyme Q-10, garlic, lycopene, beta-carotene, etc.), except for medically-prescribed supplements or natural health products, consumption of ≥2 alcoholic drinks per day, allergy or intolerance to nuts or chocolate, body mass index (BMI) >35, hormone replacement therapy (HRT) or hormonal contraception in the 6 months before the pre-randomization visit, or planned HRT or hormonal contraception during the study period. Women with systolic blood pressure ≥160 mmHg, diastolic blood pressure ≥100 mmHg, or treated with antihypertensive medication(s) were also excluded.
Recruitment and randomization
Women were recruited from the general population of Quebec City through websites, email, newspapers, radio and television advertising, and flyers posted in clinics. Potential participants contacted the study coordinator, who explained the research project to them and verified inclusion and exclusion criteria.
Allocating participants to trial groups
At the randomization visit, participants were randomly assigned to either high-flavanol chocolate (HFC, experimental group) or low-flavanol chocolate (LFC, placebo group). The randomization schedule was prepared at the St-François d'Assise Research Centre statistics unit. A blocked randomization schedule (block size 4) was computer-generated by a statistician who was not involved in the study. It was stratified according to skin type (I and II) and age (30–35 years; 36–49 years; 50–65 years). A first randomization list was generated assuming an equal number of participants in each age and skin type stratum. After three months of recruitment, women with skin type II and aged 50–65 years were more prevalent than expected, and a new, independent randomization list was generated.
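The stratified, blocked allocation described above can be illustrated with a minimal sketch. It is not the statistician's actual procedure: the seed, the number of blocks per stratum, and the function names are hypothetical, and the age bands are copied as printed in the text.

```python
import random

ARMS = ("HFC", "LFC")

def permuted_block(rng, block_size=4):
    """Return one block with equal numbers of each arm, in random order."""
    block = list(ARMS) * (block_size // len(ARMS))
    rng.shuffle(block)
    return block

def stratified_schedule(strata, blocks_per_stratum, seed=2011):
    """Build an independent sequence of permuted blocks for each stratum."""
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    schedule = {}
    for stratum in strata:
        allocations = []
        for _ in range(blocks_per_stratum):
            allocations.extend(permuted_block(rng))
        schedule[stratum] = allocations
    return schedule

# Strata: skin type crossed with age band (labels as given in the text)
skin_types = ("I", "II")
age_bands = ("30-35", "36-49", "50-65")
strata = [(skin, age) for skin in skin_types for age in age_bands]

schedule = stratified_schedule(strata, blocks_per_stratum=3)
for stratum, allocations in schedule.items():
    print(stratum, allocations)
```

Because every block of 4 contains 2 HFC and 2 LFC assignments, the arms can never differ by more than 2 participants within a stratum, whatever the final stratum size turns out to be.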
Daily chocolate intake (30 g)
Study participants consumed 1 chocolate square 3 times per day (30 g/day) for 12 weeks, included in participants’ regular diet in place of an equivalent food in terms of energy and macronutrient content. The nutritional contents of each HFC and LFC square (10 g) are presented in Additional file 1: Table S7. HFC provided 600 mg of flavanols daily.
HFC and LFC were supplied as chocolate bars by Barry-Callebaut, Lebeke-Wieze, Belgium. All steps of chocolate production (fermentation, drying, roasting, and alkalinization) were optimized to preserve antioxidants. Chocolate bars were standardized for their flavanol and theobromine content and matched for caloric load, nutrients and caffeine. They were similar in taste and colour and were supplied in individual, opaque packaging. Each 30-g daily portion of chocolate contained less than 25 mg of caffeine.
Recruited participants presented at the Institute of Nutrition and Functional Foods (INAF) clinical facility for a total of 10 visits, including five 10-minute visits, each 24 hours after a main visit, for minimal erythema dose (MED) assessment.
Participants were asked to abstain from chocolate consumption, other than the study product, for the study’s duration, including the washout period, and for 7 days before the randomization visit. Intense physical activity was forbidden for 48 hours preceding each visit. Women could not apply any body lotion, gel or moisturizer on the skin for the 24 hours preceding each visit.
A short questionnaire documenting social and demographic characteristics, alcohol consumption, and medication was completed by participants. Anthropometric data (body weight, height and body fat percentage) were measured according to a standard protocol. Food habits and flavonoid consumption during the last month were estimated by a validated food frequency questionnaire (FFQ). Sun exposure (>30 minutes daily) and sun protection practices during the last summer and last week were evaluated by a validated self-administered questionnaire. Blood samples were collected. MED and skin elasticity were measured, as were hydration parameters.
Participants returned to our clinical research facility for follow-up visits at weeks 6, 9, and 12. All measurements at the 12th week visit were repeated after a 3-week washout period (15th week visit). MED, skin elasticity and hydration parameters were tested during each follow-up visit. Anthropometric data were collected. Blood samples were taken at every visit, in the morning after overnight fasting except for their morning 10-g intake of chocolate, for measurement of plasma flavanols and methylxanthines. Blood pressure was measured at every visit. Participants completed the FFQ to estimate their food consumption in the last month. Sun exposure and protection practices during the previous week were evaluated. Women returned 24 hours later for a 15-minute visit to assess the MED results.
Evaluation of side-effects
BMI (kg/m²) and body fatness were assessed by bioelectrical impedance according to the validated Tanita technique. Digestive and other symptoms (nausea, abdominal pain, constipation, and headache) were documented by questionnaire administered at randomization and at each study visit. Blood lipid profile and glucose were measured at baseline, at week 12 and after the 3-week washout period.
Minimal erythema dose
Defined as the lowest UV dose to elicit just perceptible erythema at 24 hours, MED was assessed by an automatic Durham erythema dose tester emitting narrowband ultraviolet B (UVB) light (emission peak 311 nm). This valid and reproducible method involves a small hand-held unit containing low-pressure TL-01 tubes and a 10-aperture plate with metal foil attenuators designed to administer a 1.26 dose series (i.e. dose increments of 1.26 times the previous dose). The duration of MED tester application on the skin, corresponding to maximum dosage at the open aperture, was determined by patients’ Fitzpatrick skin type. The attenuation factor of the other apertures produced a dose sequence, with each subsequent hole receiving a smaller dose than the previous one. The test device was switched on for 10 minutes to reach optimum performance. After 10 minutes, the unit was switched off, and the test commenced immediately. The exposure time listed for the previously selected dosage was set, and the device was placed on a forearm (the same side was then used for all subsequent measures). The timer and tester were switched on simultaneously, and good skin contact was maintained. When the alarm sounded, the tester was switched off. All tests on each patient were performed at the same vertical level. MED was evaluated clinically under controlled, artificial lighting by determining which aperture presented just perceptible erythema 24 hours after irradiation, in accordance with a validated method for MED testing with the Durham erythema dose system [30–34].
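To make the 1.26 dose series concrete, the following sketch computes the dose delivered through each of the 10 apertures, starting from the open aperture. The maximum dose used here (1.0 J/cm²) is a hypothetical value chosen for illustration; in the study it depended on the exposure time selected for each skin type.

```python
def dose_series(max_dose, n_apertures=10, ratio=1.26):
    """Doses from the open aperture downward; each is 1/ratio of the previous."""
    return [max_dose / ratio ** k for k in range(n_apertures)]

# Hypothetical maximum dose in J/cm^2 (not a value reported in the study)
doses = dose_series(1.0)
for aperture, dose in enumerate(doses, start=1):
    print(f"aperture {aperture}: {dose:.3f} J/cm^2")
```

Since 1.26 cubed is approximately 2, every third aperture roughly halves the dose, so a 10-aperture plate spans about an 8-fold dose range in a single exposure.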
Skin elasticity parameters
Skin parameters were assessed by Cutometer (MPA580, Courage & Khazaka, Cologne, Germany). Measurements were based on the suction method. Negative pressure was created in the device, and skin was drawn into the probe’s aperture. Penetration depth was ascertained by a non-contact optical system consisting of a light source and light receiver as well as 2 prisms facing each other, projecting light from the transmitter to the receiver. Light intensity varied with penetration depth. With larger probe apertures, deeper layers of the skin are deformed by suction. We, therefore, chose an 8-mm aperture. The resistance of skin sucked up by negative pressure (firmness) and its ability to return to its original position (elasticity) were displayed as curves at the end of each measurement. The cutometer generated a graph (Additional file 1: Figure S1) depicting immediate deformation or skin extensibility (Ue), delayed distention (Uv), final deformation (Uf) and immediate retraction (Ur). Ur/Ue ratio, or net elasticity, was the parameter of choice for quantifying skin aging, since it represented the ability of skin to recover after deformation. This parameter is independent of skin thickness. We evaluated final skin distension (distensibility), overall elasticity and net elasticity.
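As an illustration of how net elasticity follows from the curve parameters named above, the sketch below stores one set of readings and derives the Ur/Ue ratio. The class name and the numeric values are hypothetical and are not study data.

```python
from dataclasses import dataclass

@dataclass
class CutometerCurve:
    """Deformation parameters read from one suction curve (same length unit)."""
    ue: float  # immediate deformation (skin extensibility)
    uv: float  # delayed distension
    uf: float  # final deformation (distensibility)
    ur: float  # immediate retraction

    def net_elasticity(self):
        """Ur/Ue: share of the immediate deformation recovered immediately."""
        return self.ur / self.ue

# Hypothetical readings, for illustration only
curve = CutometerCurve(ue=0.20, uv=0.05, uf=0.25, ur=0.12)
print(f"net elasticity Ur/Ue = {curve.net_elasticity():.2f}")
```

Because the ratio divides one deformation by another measured on the same site, it is dimensionless, which is consistent with its stated independence from skin thickness.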
Before skin measurement, the participants remained in a seated position for 10 minutes, in an environmentally-controlled room (temperature: 22 ± 2°C, relative humidity: 40-60%), for acclimatization to ambient conditions.
Skin hydration parameters
The CM 825 Corneometer (Courage & Khazaka, Cologne, Germany), a well-established and accurate system, estimated skin surface hydration level. It is principally based on capacitance measurement of dielectric media. Changes in the dielectric constant due to variation of skin surface hydration alter capacitance of the precision-measuring capacitor. Reproducibility of the instrument is high (coefficient of variation ± 3%). Measurement time is about 1 second. Even slight modifications of hydration level can be detected.
Plasma flavanols and methylxanthines concentrations
Flavanols were measured in the INAF laboratory by P. Dubé. They were purified by solid extraction, followed by high-pressure liquid chromatography with a fluorescence detection system. Methylxanthines were quantified by high-pressure liquid chromatography.
Study participants received a telephone reminder in the week preceding each visit. To promote compliance with chocolate intake, the research coordinator telephoned them a few days before each study visit. A new appointment was scheduled if participants missed a visit. In addition, they documented their daily intake of chocolate bars on diary cards. Plasma theobromine concentrations served as a marker of cocoa consumption.
Blinding and methods for protecting against sources of bias
All clinical, biophysical, laboratory and statistical analyses were blinded. We applied the intent-to-treat principle to avoid attrition bias. Participants were compared in the groups to which they were originally assigned randomly. According to previous experience, we anticipated that 15% of randomized women would be lost to follow-up. To ensure support and motivation for substitution of equivalent foods by chocolate, participants received nutritional counselling at the randomization and 6th week visits. All efforts were made to ensure primary outcome (MED) measurements at the 6th, 9th, 12th and 15th week visits in all randomized patients.
To avoid selection bias, the study subjects, investigators, staff and all laboratory analyses were blinded to treatment assignment. All chocolate bars were matched for calorie load, nutrients and caffeine. Similar in appearance (e.g. colour and quantity), smell and taste, they were supplied in individual, opaque packaging. The proportion of women who guessed right about group allocation was documented with a short questionnaire at the 15th week visit.
To control for contamination bias, flavonoid consumption was measured by FFQ in the last month preceding each follow-up visit. Finally, data were collected according to a standardized procedure supervised by the research coordinator who had extensive experience in data monitoring. Sunbathing and tanning devices were not permitted during the study period.
Sample size and planned recruitment rate
Williams et al. reported MED mean ± standard error of the mean of 0.109 ± 0.011 J/cm² in a sample of healthy subjects, which rose to 0.223 ± 0.019 J/cm² after 12-week chocolate intake. Based on these estimates, with standard deviation (SD) of 0.043, a sample of 31 women was required in each group to detect a minimal difference of 0.031 (28%) between groups with 5% 2-sided significance and 80% power. A 15% loss to follow-up was anticipated, so the sample size was increased to a total of 73 women. Augmenting it to 74 allowed equal numbers of subjects in both study arms.
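The arithmetic behind these numbers can be reproduced with the standard normal-approximation formula for comparing two means. This is a sketch under that assumption; the text does not state which formula or software was used for the original calculation, and the function name is illustrative.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(sd, delta, alpha=0.05, power=0.80):
    """Two-sample size per arm, normal approximation, two-sided alpha."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = z.inv_cdf(power)           # about 0.84 for 80% power
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)

n = n_per_group(sd=0.043, delta=0.031)   # women per group
total = ceil(2 * n / (1 - 0.15))         # inflate for 15% loss to follow-up
total_even = total + total % 2           # round up to allow equal arms

print(n, total, total_even)
```

Under these assumptions the sketch returns 31 per group, 73 after allowing for attrition, and 74 once rounded up to an even number, matching the figures stated above.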
Statistical data analysis investigated the effect of HFC vs. LFC consumption on skin sensitivity to UV radiation according to MED criteria. Thus, the null hypothesis (H0) was that mean MED did not differ between those who consumed HFC and those who did not: mean difference (MD) = 0. Statistical analysis was carried out at the St-François d'Assise Hospital Research Centre, CHUQ, with SAS software (version 9.3), according to the intent-to-treat principle. Two-sided P values lower than or equal to 0.05 indicated significant differences. The baseline characteristics of participants in each group were compared by Chi-square test for categorical variables and independent sample t-test for continuous variables with normal distributions. These tests validated the randomization process. The primary outcome was the change in MED values, calculated as the difference for each person at weeks 6, 9 and 12 compared to baseline. MED at 15 weeks was compared to week 12. The secondary endpoints were assessed similarly. The primary outcome was also analyzed in a secondary repeated-measures analysis adjusted for potential factors affecting MED. Differences in the number and quality of side-effects between the 2 groups were compared as well.
The results were expressed as means ± SD at baseline and at 6, 9, 12 and 15 weeks; n was number of women. Missing data were imputed by a commonly-employed single imputation method with null values for mean differences. We explored different approaches to the imputation problem. All analyses are in general agreement qualitatively, and intent-to-treat results are presented. Multivariate analysis included baseline values for each outcome, Fitzpatrick skin phototype (I or II), season of study participation, age of participants (<50 years or ≥50 years), and BMI (for the skin hydration outcome only), as potential confounding variables. In the final multivariate model, baseline values for each analyzed outcome, season of study participation, age of participants (<50 years or ≥50 years), and BMI (for the skin hydration outcome only) were retained by backward step-wise elimination. If, by removing a variable from the model, change in the regression coefficient was more than 10% compared to the adjusted model including all variables, then that variable was retained in the final model. Skin phototype was not specifically accounted for in the final model for primary outcome since more precise measurement of skin photosensitivity was included in the form of MED values at baseline.
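Two of the rules described above lend themselves to a short illustration: imputing a missing change from baseline as a null (zero) difference, and the 10% change-in-estimate criterion for retaining a covariate. The sketch below is illustrative only; the function names and numeric values are hypothetical, and the actual analysis was run in SAS.

```python
def impute_null_change(changes):
    """Replace missing changes from baseline (None) with a null difference."""
    return [0.0 if change is None else change for change in changes]

def keep_covariate(beta_full, beta_reduced, threshold=0.10):
    """Retain a covariate if dropping it shifts the treatment coefficient
    by more than `threshold` relative to the fully adjusted model."""
    relative_change = abs(beta_reduced - beta_full) / abs(beta_full)
    return relative_change > threshold

# Hypothetical MED changes (J/cm^2) with one missing follow-up value
print(impute_null_change([0.010, None, -0.020]))

# Hypothetical coefficients: full model vs. model without the covariate
print(keep_covariate(beta_full=0.050, beta_reduced=0.056))  # 12% shift
print(keep_covariate(beta_full=0.050, beta_reduced=0.052))  # 4% shift
```

Imputing a zero change is conservative with respect to detecting an effect, since it assumes that participants with missing data did not respond to the intervention.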