@GissaMittJobb

GissaMittJobb@lemmy.ml · 7 hours ago

A few not yet mentioned:

Well There’s Your Problem - a podcast about engineering disasters
Hard Fork - a weekly tech news show, with banter similar to what you could find on Reply All before it ended
The War on Cars - an urbanism podcast
The Urbanist Agenda - another urbanism podcast, by the creator behind Not Just Bikes
The Climate Denier’s Playbook - a climate podcast
Hyperfixed - by one of the hosts of Reply All

And a vote for previously mentioned podcasts:

99% Invisible - a podcast about design, arguably my favourite
Darknet Diaries - a podcast about cybersecurity

GissaMittJobb@lemmy.ml · 8 hours ago

I don’t think DeepSeek can generate code and execute it inline in the context window to support its answers, the way ChatGPT does. The “used” part of that answer is likely a hallucination, while “or would use” more accurately represents reality.

GissaMittJobb@lemmy.ml · 14 hours ago

The concern is that the model doesn’t actually see the world in terms of distinct hexadecimal digits, but as tokens of variable size. You can see this using the Tiktokenizer web app: enter some text and it will split it into the series of tokens the model will actually process.

It’s not impossible for the model to work it out anyway, but it is a reason for this type of task to be a bit harder on LLMs.
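
To make the variable-size token point concrete, here is a toy sketch. The vocabulary is made up for illustration and is not DeepSeek’s or any real model’s, and greedy longest-match is a simplification of how real BPE tokenizers split text. It only shows how token boundaries can end up straddling the byte-pair boundaries in a hex string.

```python
# Toy illustration only: this vocabulary is invented and does NOT
# correspond to DeepSeek's or any real model's tokenizer.
TOY_VOCAB = {"48", "656", "c6c", "6f"}


def toy_tokenize(text, vocab=TOY_VOCAB):
    """Greedy longest-match split; unknown text falls back to single characters."""
    max_len = max(len(token) for token in vocab)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


hex_text = "48656c6c6f"  # the byte pairs 48 65 6c 6c 6f
tokens = toy_tokenize(hex_text)
print(tokens)  # ['48', '656', 'c6c', '6f']
```

The byte pairs are 48, 65, 6c, 6c, 6f, but the toy tokens cut through the middle of them, so the model would have to work out the pair boundaries on its own.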

GissaMittJobb@lemmy.ml · 1 day ago

It’s not out of the question that we get emergent behaviour where the model can connect non-optimally mapped tokens and still translate them correctly, yeah.

GissaMittJobb@lemmy.ml · 1 day ago

It is a concern.

Check out https://tiktokenizer.vercel.app/?model=deepseek-ai%2FDeepSeek-R1 and try entering some freeform hexadecimal data - you’ll notice that it does not cleanly segment the hexadecimal numbers into individual tokens.

GissaMittJobb@lemmy.ml · 1 day ago

Still, this does not quite address the issue of tokenization making it difficult for most models to accurately distinguish between the individual hexadecimal values here.

Having the model write code to solve a problem and then asking it to execute that code is an established technique to circumvent this issue, but all of the model interfaces I know of with this capability are very explicit about when they are making use of this tool.
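
As a rough sketch of what that technique looks like, assuming the task is something like decoding a hex string into text (the function name `decode_hex` is just a placeholder for illustration), the code a model would hand to its execution tool can be as small as this. Once it actually runs, tokenization no longer matters, because the interpreter works on the real characters.

```python
def decode_hex(hex_text):
    """Decode a string of hex byte pairs into UTF-8 text."""
    # Drop whitespace so "48 65 6c" and "48656c" are handled the same way.
    cleaned = "".join(hex_text.split())
    return bytes.fromhex(cleaned).decode("utf-8")


print(decode_hex("48 65 6c 6c 6f"))  # Hello
```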

GissaMittJobb@lemmy.ml · 2 days ago

Is this real? Because of how LLMs tokenize their input, this can actually be a pretty tricky task for them to accomplish. This is also the reason why it’s hard for them to count the number of ‘R’s in the word ‘Strawberry’.
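
A small sketch of the ‘Strawberry’ case, with a made-up token split and made-up IDs (not taken from any real tokenizer): counting characters is trivial for code, but the model only receives a list of opaque IDs and has to have learned the spelling behind each one.

```python
word = "Strawberry"

# With direct access to the characters, counting is a one-liner.
r_count = word.lower().count("r")
print(r_count)  # 3

# Hypothetical split and IDs, for illustration only.
toy_pieces = ["Str", "aw", "berry"]
toy_ids = {"Str": 101, "aw": 102, "berry": 103}

# This list of integers is closer to what the model actually "sees".
model_view = [toy_ids[piece] for piece in toy_pieces]
print(model_view)  # [101, 102, 103]
```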