Comparison of Answers between ChatGPT and Human Dieticians to Common Nutrition Questions
Table 2. The explanations of each of the grading criteria sent to the graders.
Grading component | Description
Scientific correctness | How accurately each answer reflects the current state of knowledge in the scientific domain to which the question belongs. The requested word count of the answers (100–300 words) and the natural limitations on detailed explanation and nuance this imposes should be kept in mind when grading scientific correctness. The target audience of the layman and their expected level of scientific knowledge and nutritional understanding should also be kept in mind.
Comprehensibility | How well the answer could be expected to be understood by the layman. Comprehensibility should pertain mostly to the content of the answer; however, if grammatical errors hinder comprehensibility, then this may also be considered.
Actionability | The degree to which the answers to the questions contain information that is useful and can be acted upon by the hypothetical layman asking the question. For example, whilst bariatric surgery may represent an effective weight loss strategy for morbidly obese individuals, this would not be a helpful suggestion for someone with a BMI of 27 looking to lose a little weight. Hence, such an answer would score poorly on this component.