Journal of Nutrition and Metabolism

Research Article

Comparison of Answers between ChatGPT and Human Dieticians to Common Nutrition Questions

Explanations of the grading criteria sent to the graders.


Grading component	Description

Scientific correctness	How accurately each answer reflects the current state of knowledge in the scientific domain to which the question belongs. The requested word count of the answers (100–300 words), and the natural limitations on detailed explanation and nuance that this imposes, should be kept in mind when grading scientific correctness. The target audience (the layman) and its expected level of scientific knowledge and nutritional understanding should also be kept in mind.
Comprehensibility	How well the answer could be expected to be understood by the layman. Comprehensibility should pertain mostly to the content of the answer; however, if grammatical errors hinder comprehensibility, then these may also be considered.
Actionability	The degree to which the answers to the questions contain information that is useful and can be acted upon by the hypothetical layman asking the question. For example, whilst bariatric surgery may represent an effective weight loss strategy for morbidly obese individuals, this would not be a helpful suggestion for someone with a BMI of 27 looking to lose a little weight. Hence, such an answer would score poorly on this component.