A quick look into one of the simplest attacks on LLM safety mitigations, revealing large gaps in current approaches from major tech companies.
An evaluation of recent consumer-grade open LLMs based on ratings generated through an LLM-as-a-judge framework.
Methodology details for how LLMs can rate the performance of other LLMs.
A short experiment on running larger LLMs on low-end consumer hardware, with comments on performance trade-offs and practicality.