Anyone have a link to the video? Looking for a good laugh.
shawn1122@lemm.ee to Technology@lemmy.world • Meta whistleblower Sarah Wynn-Williams says company targeted teens with advertisements based on their ‘emotional state’ • English • 74 · 20 days ago
Ah yes, having to lick the boot of an autocrat with no freedom to dissent. That sure sounds like it’s working to me.
shawn1122@lemm.ee to Games@lemmy.world • Switch 2 GameCube controller will only be offered to those who pre-order the console • English • 6 · 24 days ago
They must enter a unique Switch serial number (that corresponds with inventory) to make the purchase? I don’t see why it has to be contingent on a subscription.
shawn1122@lemm.ee to Technology@lemmy.world • Anthropic has developed an AI 'brain scanner' to understand how LLMs work and it turns out the reason why chatbots are terrible at simple math and hallucinate is weirder than you thought • English • 1 · 26 days ago
This is what the ARC-AGI test by Chollet has also revealed about current AI / LLMs. They tend to approach problems with this trial-and-error method and can be extremely inefficient (in their current form) at anything involving abstract / deductive reasoning.
Most LLMs do terribly on the test, with the most recent breakthrough coming from reasoning models. But even the reasoning models struggle.
ARC-AGI is simple, but it demands a keen sense of perception and, in some sense, judgment. It consists of a series of incomplete grids that the test-taker must color in based on the rules they deduce from a few examples; one might, for instance, see a sequence of images and observe that a blue tile is always surrounded by orange tiles, then complete the next picture accordingly. It’s not so different from paint by numbers.
The test has long seemed intractable to major AI companies. GPT-4, which OpenAI boasted in 2023 had “advanced reasoning capabilities,” didn’t do much better than the zero percent earned by its predecessor. A year later, GPT-4o, which the start-up marketed as displaying “text, reasoning, and coding intelligence,” achieved only 5 percent. Gemini 1.5 and Claude 3.7, flagship models from Google and Anthropic, achieved 5 and 14 percent, respectively.
Found it. For anyone reading, don’t bother. It is still images with an AI voiceover. I was hoping for a half-baked AI-generated video.