Arizona State University researchers led by Subbarao Kambhampati challenge the characterization of AI language models' intermediate text generation as "reasoning," arguing that this anthropomorphization fosters misconceptions about how the models actually work. Their analysis of models such as DeepSeek's R1 shows that these systems can produce lengthy intermediate outputs that mimic human scratch work without performing genuine reasoning, and can even perform better when trained on semantically meaningless data. The researchers caution against interpreting these outputs as valid reasoning, as doing so may instill misplaced confidence in AI capabilities and mislead users about the underlying problem-solving process.