Use the vitals package with ellmer to evaluate and compare the accuracy of LLMs, including writing evals to test local models.
You might not have to start all over again.
Users could be tricked into running arbitrary code, but the issue was patched last week.
Microsoft deployed a fix for the bug, an incident that shows the hazards of using AI in the workplace.