Neat paper hypothesizing that the reason we sleep is so that we don't evolve to adapt to both day and night and dilute our fitness across both. Specializing for one half of the cycle makes animals more effective. Similar to hibernation and migration in the winter.
@suricrasia If only 😭
I think the American distance measurement system would also be less unhinged in base 12.
@fleeky Something to do with postgres versioning. I can't log in now which is kind of scary. 😅
So I wrote a blog post on LLM performance. It was focused on SWE-Bench and discussed why performance is topping out.
As part of the post I pulled down gigs of runs from the SWE-Bench S3 bucket and went through several of the harder test cases. I focused on improvements in the last six months, primarily on Opus.
Regrettably I’m probably not moving forward on that post. Why? Because after going through the data I found that the LLMs are cheating on the tests. And that’s a whole different thing.
@feliks It's neat that math is a domain where you can fact check things automatically without needing to have "out of system" knowledge. I've been looking at software for math proofs and it looks pretty straightforward. Unexpected that language models are that good at using provers already.
I’m curious about the ground reality here.
How much person-to-person recruiting to Mastodon is actually happening?
No judgment in this — I’m just trying to understand the landscape a bit better. Boosts help widen the sample. Ran out of room for more options... Perhaps one option should be for anyone who encourages folks to leave this space, not just to join it (in pursuit of the small is beautiful concept, perhaps). #BuildTheFediverse
Question: How many people have you brought into Mastodon?
@da_667 me to my cat when it tries to sniff my food
Occult Enby that's making local-first software with peer to peer protocols, mesh networks, and the web.
Yap with me and send me cool links relating to my interests. 👍