Grok poses an epistemological challenge to Veriphysics
Will you try other LLM flavors, such as ChatGPT/Gemini/GabAI, in a similar fashion?
Not in this context. There isn't any point, given the limitations of the various models. Gemini is an overenthusiastic cheerleader, and ChatGPT invents nonexistent flaws. Both are unreliable. Grok has serious limitations too, but they can be managed.