On the topic of "24. A Sony Walkman-style device that you can give to children so they can ask questions to an LLM...", I would strongly caution against this:
- short of AGI, what a child will hear are explanations given with authority. These would probably be correct a very high percentage of the time (maybe even close to or above 99%), BUT the few incorrect answers and subtle misconceptions that find their way in will be catastrophic for the learning journey, because the child will believe them blindly.
- even if you had a perfect answering LLM that never makes a mistake, what's the end result? No need to talk to others to find out about something, i.e. reduced opportunities to learn about cooperating with others.
- as a parent, one wishes sometimes for a moment of rest, but imagine that your kid just finds out there's another entity to ask questions from that will have ready answers all the time, instead of you saying sometimes that you don't know, and looking for an answer together. How many bonding moments will be lost? How cut off would your kid become from you? What value system would permeate through the answers?
A key assumption here for any parent equipping their child with such a system is that it would be aligned with their own worldview and value system. For parents on HN, this probably means a fairly science-mediated understanding of the world. But you can bet that in other places, this assistant would very convincingly deliver whatever cultural, political, or religious propaganda their environment requires. This would make for frighteningly powerful brainwashing tools.
sixtram 12 hours ago [-]
Just two days ago, I asked ChatGPT to provide an explanation of the place-value system that my six-year-old could understand. The only problem was that it mixed up digit value and place value, which caused it to become confused. I spotted the mistake, and ChatGPT apologised, as it usually does. But if my six-year-old had asked it first, she wouldn't have noticed.
I'm not sure how much misinformation my child would learn as truth from this device.
ponector 15 hours ago [-]
>> child will hear are explanations given with authority, which would probably be correct a very high percentage of the time (maybe even close to or above 99%), BUT the few incorrect answers and subtles misconceptions finding their way in there will be catastrophic for the learning journey because they will be believed blindly by the child.
Much better results than asking a real teacher at school, though.
VSerge 3 hours ago [-]
Wouldn't you know whether a teacher is reliable or not? If reliable, they probably have that reputation partly because they can say when they don't know something. And if you found out a given teacher isn't reliable, you'd be careful about what they say next - or you would just ask someone else.
The problem here is a child thinking this system is reliable when it is not. For now, the lack of reliability is obvious, as ChatGPT hallucinates on a very regular basis. However, this will become much harder to notice if/when ChatGPT becomes almost reliable while still saying wrong things with complete confidence. Should such models become able to say reliably when they don't know something, that would be a big step for this specific objection, but it still wouldn't solve the other problems I mentioned.
93po 9 hours ago [-]
the amount of misinformation I picked up as a kid due to a lack of internet is nothing compared to the rare hallucination a kid might get from ChatGPT
swallowing gum is bad for you, or watermelon seeds, cracking knuckles causes arthritis, sitting too close to tv ruins your eyes, diamonds come from coal, newton's apple story, a million other things
nuredini 1 hours ago [-]
Most of these tools seem to rely on the same idea: we have your data and we, being the domain experts of this data, know how to format it for you and how to create good prompts that are specialized for this context.
onion2k 17 hours ago [-]
A recommendation engine that looks at my browsing history, sees what blog posts or articles I spent the most time on, then searches the web every night for things I should be reading that I’m not.
This kind of exists in the form of ChatGPT Pulse. It uses your ChatGPT history rather than your browser history, but that's probably just as good a source for people interested in using it (e.g. people who use ChatGPT enough to want it to recommend things to them.) https://openai.com/index/introducing-chatgpt-pulse/
Gigachad 16 hours ago [-]
It's also essentially every social media platform with an algorithm selected feed.
FinnKuhn 15 hours ago [-]
A lot of social media platforms only recommend recently uploaded content or at least heavily favor it.
The idea sounds to me more like a feed for independent blogs/articles though, which is what an RSS reader once was supposed to be. Have we come full circle?
socalgal2 16 hours ago [-]
Except those algos don't work. No idea if the LLM works.
fhd2 16 hours ago [-]
I'm sure they work splendidly... to keep the average person on the platform as long as possible and show them ads :)
aljgz 16 hours ago [-]
They do work, extremely well, not for us though!
simianwords 15 hours ago [-]
Do you not think that’s what the post meant? That it could work for us rather than them?
Gigachad 14 hours ago [-]
They probably won’t though. The commercial LLMs will be tuned to work for them as well soon. And your local LLM won’t be allowed to scrape the internet since it’s all locked down now.
blooalien 31 minutes ago [-]
LLMs don't "scrape the Internet". The tools and interfaces that use LLMs for the language part of things do any necessary scraping and feed the results into the LLM's "context". That part about "tuned to work for them" is a serious concern though.
ares623 17 hours ago [-]
Not just for this article, but from most ideas/articles around LLMs, I feel like they aren't "thinking with portals" enough. We have "portal gun" tech (or at least, that's what's being marketed), and we're using it as better doors.
BriggyDwiggs42 5 hours ago [-]
I sorta think the issue is that what LLMs do, in and of themselves, is extend text in a coherent way, while only a small subset of applications are directly textual. It's incredibly generally applicable, yet also difficult to apply to anything that isn't a glorified text editor. Say you wanted it to help you edit videos. You might provide it with a scripting language to control the editor, but now you have to maintain parity between the scripting language and the editor's user-accessible functionality. If you're Adobe, is that really worth the manpower? If you're a small startup trying to unseat Adobe, you have to compete with decades of features and user lock-in. The only way this makes sense for either party is if the LLM is crazy good at it, but the LLM can't watch its own video output, and it's probably just okay to begin with.
HellsMaddy 17 hours ago [-]
I agree with this. But do you have any resources on "thinking with portals"? It's easier said than done.
ares623 14 hours ago [-]
Sadly, I don't. If I did I'd be busy building it rather than judging others on HN.
But it's a bit telling that OpenAI themselves can only come up with ~better doors~ ads.
Dilettante_ 11 hours ago [-]
Could you give a quick example so we can "catch" the way of thinking you mean a little more easily?
samcollins 8 hours ago [-]
Re 19, I made this with an iOS Shortcut a few weeks ago
> A minimal voice assistant for my Apple Watch. I have lots of questions that are too complicated for Siri but not for ChatGPT. The responses should just be a few words long.
Use Dictate Text action to take voice as input, pass the text to OpenAI API as the user message with this as the system prompt:
“CRITICAL: Your response will only be shown in an iOS push notification or on a watch screen, so answer concisely in <150 characters. Do not use markdown formatting - responses are rendered as plain text. Do use minimalist, stylish yet effective vocabulary and punctuation.
CRITICAL: The user cannot respond, so do not ask a question back. Answer the prompt in one shot and, if necessary, declare assumptions about the user's question so you can answer it in one shot, while making it possible for the user to re-ask with more clarity if your assumptions were not right.”
It works well. The biggest annoyance is it takes about 5-20s to return a response, though I love that it’s nearly instantaneous to send my question (don’t need to wait for any apps to open etc)
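For anyone who'd rather prototype this outside Shortcuts, the equivalent request can be sketched in Python. This is a rough sketch of the same flow, not the Shortcut itself: the model name and the abbreviated system prompt here are placeholder assumptions, and the endpoint is the standard OpenAI chat completions API.

```python
import json
import urllib.request

# Condensed stand-in for the Shortcut's system prompt (assumption, not verbatim).
SYSTEM_PROMPT = (
    "CRITICAL: Your response will only be shown in an iOS push notification "
    "or on a watch screen, so answer concisely in <150 characters. "
    "Do not use markdown formatting - responses are rendered as plain text. "
    "CRITICAL: The user cannot respond, so do not ask a question back."
)

def build_request(question: str, model: str = "gpt-4o-mini") -> dict:
    """Assemble the chat-completions payload the Shortcut would send."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
    }

def ask(question: str, api_key: str) -> str:
    """POST to the OpenAI chat completions endpoint and return the reply text."""
    req = urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=json.dumps(build_request(question)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

The Dictate Text output would go in as `question`, and the returned string straight into a notification.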
JSR_FDED 14 hours ago [-]
Many of these ideas depend on knowing the user’s preferences, patterns, communications, events and health. This is where the opportunity lies for Apple - the phone and watch know so much about you, that Apple could focus on smartly assembling the context for various LLM interactions, in a privacy-preserving way.
gyomu 17 hours ago [-]
There's some sort of fundamental category mistake going on with thinking like this.
Most of the items in this list fall prey to it, but it is maybe best exemplified by this one:
> A writing app that lets you “request a critique” from a bunch of famous writers. What would Hemingway say about this blog post? What did he find confusing? What did he like?
Any app that ever claimed to tell you what "Hemingway would say about this blog post" would evidently be lying — it'd be giving you what that specific AI model generates in response to such a prompt. 100 models would give you 100 answers, and none of them could claim to actually "say what Hemingway would've said". It's not as if Hemingway's entire personality and outlooks are losslessly encoded into the few hundreds of thousands of words of writing/speech transcripts we have from him, and can be reconstructed by a sufficiently beefy LLM.
So in effect it becomes an exercise of "can you fool the human into thinking this is a plausible thing Hemingway would've said".
The reason why you would care to hear Hemingway's thought on your writing, or Steve Jobs' thoughts on your UI design, is precisely because they are the flesh-and-bone, embodied versions of themselves. Anything else is like trying to eat a picture of a sandwich to satisfy your hunger.
There's something unsettling that so many people cannot seem to cut clearly through this illusion.
massung 17 hours ago [-]
> Any app that ever claimed to tell you what "Hemingway would say about this blog post" would evidently be lying — it'd be giving you what that specific AI model generates in response to such a prompt.
First, 100% agreed.
That said, I found myself pondering Star Trek: TNG episodes with the holodeck, and recreations of individuals (e.g. Einstein, Freud). In those episodes - as a viewer - it really never occurred to me (at 15 years old) that this was just a computer's random guess as to how those personages from history would act and what they would say.
But then there was the episode where Geordi had the computer recreate someone real from their personal logs to help solve a problem (https://www.imdb.com/title/tt0708682/). In a later episode you find out just how very wrong the computer/AI's representation of that person really was, because it was playing off Geordi, just like an LLM's "you're absolutely right!" etc. (https://www.imdb.com/title/tt0708720/).
This is a long-winded way of saying...
1. It's crazy to me how prescient those episodes were.
2. At the same time, the representation of the historical figures never bothered me in those contexts, and I wonder whether it should bother me in this (LLM) context. Maybe it's because I knew - and I believed the characters knew - it was 100% fake? Maybe some other reason?
Anyway, your comment made me think of this. ;-)
socalgal2 16 hours ago [-]
I wonder if there's a difference between "asking for critique" and "acting the part". I generally have no problem with, and even get fooled by, watching a movie about a famous person, even though it's not actually that person. Rami Malek is not Freddie Mercury; Timothée Chalamet is not Bob Dylan. But we (or at least I) watch them and are, to some degree, taken in by their depiction, as if actually seeing the real person. I have to remind myself the actor's version is not the actual person.
It feels easier to portray famous characters as we'd think they'd act, but harder to portray how we'd expect them to critique something. I don't know if those are just points on a spectrum from easy to hard, or if one requires a level deeper than the other.
Doxin 15 hours ago [-]
I think the core difference there is that the holodeck character feels like a character that is playing a person (because it is of course) whereas the LLM feels more like someone lying to you about who they are.
When watching a play the actor pretends to be a specific character, and crucially the audience pretends to believe them. If a LLM plays a character it's very tempting for the audience to actually believe them. That turns it from a play into a lie.
einpoklum 16 hours ago [-]
In that context, the computer was solving for a faithful representation. In our case, the computer is solving for the most likely sequence of words to appear in conversations with a similar context - which is not remotely the same thing.
striking 16 hours ago [-]
I have a more straightforward rebuttal of the need for an AI Hemingway. Someone already implemented a decent interactive guide to writing like Hemingway at least a decade ago, before all this LLM stuff: https://hemingwayapp.com/
It uses some simple heuristics to identify grammar that could be simpler and prompts you to do better. It might actually be better than an LLM specifically because it isn't able to do the rewriting for you. Maybe that might help a user learn.
thatloststudent 17 hours ago [-]
> A nano banana photo-editing app where I don’t have to write a prompt. Just give me hundreds of templates from trying out different haircuts to seeing what you and your partner’s kid would look like to making me look like The Rock. A photo editing super-app.
Quite a few of these "ideas" make me think that the human behind it wants to maximize laziness. Glazing over what Hemingway kinda sorta would have thought about something fits into this pretty well.
thunky 10 hours ago [-]
> the human behind it wants to maximize laziness
A good tool should reduce the amount of work we have to do manually. That's all this is.
petercooper 8 hours ago [-]
I'm not disagreeing with your broader point but:
> So in effect it becomes an exercise of "can you fool the human into thinking this is a plausible thing Hemingway would've said". ... There's something unsettling that so many people cannot seem to cut clearly through this illusion.
Modern culture, at all scales, is largely based upon such exercises. We rarely know exactly what something (whether a person, organization or entity) truly stands for, with messages often boiled down, contextualized, or re-interpreted through others or through simulations.
People go to theme parks and enjoy rides simulating the wild west and meet characters who resemble, but aren't, their favorite characters from TV (which themselves are a fabrication based upon other, real things). Many cultural (heck, also religious) experiences are an exercise in humans entering into a suspension of disbelief and thinking something is plausible when it has little relation to the original thing it symbolizes. Indeed, the comforting thing about AI may be that at least we can see that process taking place more clearly with it.
andrewgleave 16 hours ago [-]
Feynman said, "The first principle is that you must not fool yourself - and you are the easiest person to fool" when talking about science, but it also applies to the properties of LLM output.
TeMPOraL 15 hours ago [-]
> So in effect it becomes an exercise of "can you fool the human into thinking this is a plausible thing Hemingway would've said".
That's useful in itself, though. Assume the human knows they're "being fooled", we call this make-believe, or suspending disbelief. It's a tool we use each time we act something out, pretend to be someone else, try to put ourselves in their position; we do that when we try to learn from recorded experience of other people, real or fictional.
> The reason why you would care to hear Hemingway's thought on your writing, or Steve Jobs' thoughts on your UI design, is precisely because they are the flesh-and-bone, embodied versions of themselves. Anything else is like trying to eat a picture of a sandwich to satisfy your hunger.
Not at all! It's exactly the other way around.
No one wants to talk to the actual human. We're not discussing creepy dating apps here. The reason you'd care for a virtual Hemingway or Jobs is because you want to access specific, opinionated expertise, wrapped in fitting and expected personality, to engage fully with the process, to learn tacitly and not just through instructions.
The Hemingway and Shakespeare and Jobs people want are not real anyway. Who knows how much of "Hemingway" is actually Hemingway, and how much it was written or edited by his wife, butler, or some publisher? How much real Jobs actually is in the stories, how much were they cleaned or edited to reinforce the myth? It doesn't matter, because no one cares about the real person, they care about the celebrity that's in public consciousness. The fake person is more useful and interesting anyway.
Like 'massung, I agree TNG was prescient about it. But I actually see the examples working as intended. Einstein, Hawking, Freud were all useful simulations. Ironically, it's Barclay who actually related to them in reasonable fashion, and it's Geordi who got confused about reality.
brabel 11 hours ago [-]
Very interesting rebuttal. I must say I was almost as convinced by the original post! This just made me think: if we can't agree even on relatively simple topics like this, what hope is there that we will ever agree on the most important issues? Disagreement should be an expected constant in all aspects of life, not an undesirable outcome. Even with disagreement, I believe it's possible to find common ground and do what needs to be done (now I am really far into my tangential point!).
raghavtoshniwal 16 hours ago [-]
I think you're correct, but your bar is too high. I think this app would be useful even if it were only a lossy approximation of Hemingway from his writings.
As a thought experiment: I would value hearing what a PhD who dedicated her career to studying one author and their works thinks about a piece of writing, seen through that author's lens. (An LLM isn't too far from that.)
> Anything else is like trying to eat a picture of a sandwich to satisfy your hunger.
I think it's more akin to trying to recreate a different sandwich after reading a couple of their cookbooks.
sixtyj 16 hours ago [-]
Hemingway was just an example, imho. But yes, an LLM is just a very clever text composer that tries to understand our inputs and is statistically right in some percentage of its answers. And it is clever enough to fool people into thinking that they are communicating with an intelligence.
typpilol 17 hours ago [-]
I agree, but for the example you picked, I imagined he was referring to their style of writing.
But if he meant it literally, then yeah... that's delusional.
benrutter 17 hours ago [-]
I think even that still misses that it can only offer a pastiche.
As an example, I put the first paragraph of Hemingway's "A Clean, Well-Lighted Place" into Le Chat and asked it for notes to make it sound like Hemingway. It gave me plenty!
For example:
Tweak: The second sentence is a bit long. Consider breaking it up for more impact:
“In the day the street was dusty. At night, the dew settled the dust.” ->
“The old man liked to sit late. He was deaf, and at night it was quiet. He felt the difference.”
bryanhogan 17 hours ago [-]
2. is already possible with Claude Code + context files + the Playwright MCP, right?
7. also seems possible with any markdown editor, e.g. Obsidian, plus an AI running through the local files such as Claude Code.
13. I would love this as well! We will probably see this soon, especially on more open platforms such as BlueSky, as its seems to be a better fit for customizable browser extensions and customizable feed experiences.
14. How is this different from what AI can already do? Especially with iterative sub-agents that can store context in files, it's quite capable already. Of course, quality can always be better, but is that the only thing missing?
Also a few ideas seem to be close to what I'm building ( https://dailyselftrack.com/ ). Idea is to have a customizable tool so you can track what you want, and then you can feed that data into AIs if you choose to do so to get feedback.
yoaviram 17 hours ago [-]
Essentially what this article is asking for, in most cases, is a better UI/UX for one of the foundation models.
lancebeet 15 hours ago [-]
This is really striking, isn't it? We've all certainly seen demos of things on this list or very similar things, and there are startups that have spent years and billions of dollars attempting to exploit existing LLMs to develop useful products. Yet most of the products don't seem to exist. The ones that you see in everyday life never seem to work nearly as well as the demos suggest.
So what's going on here? Do the products exist but nobody (or very few) uses them? Is it too expensive to use the models that work sufficiently well to produce a useful product? Is it much easier to create a convincing demo than it is to develop a useful product?
Oras 15 hours ago [-]
It is too expensive to reach the right audience. I remember talking to agencies about ads for a fintech app, and all of them said the same thing:
You need to burn around 20k a month on ads for 3 months, so we can learn what works, then the CAC will start decreasing, and you can get more targeted users.
Once you turn ads off, there is no awareness, no new users, and people will not be aware of the product's existence.
I wish I didn't click on that link... it's just some random app ideas, not actual tools.
ftth_finland 2 hours ago [-]
Just give me an RSS reader with a voice UI and text to speech.
noja 17 hours ago [-]
For me: a local model to plug into Apple Photos to look for metadata inconsistencies in my photo library, add missing location information, and add dates from those old scanned photos with the date in the corner.
bryanrasmussen 17 hours ago [-]
This seems like a relatively easy thing to code oneself, or for someone to have already made somewhere. Relatively easy meaning: just writing a command-line tool for yourself, assuming you can spend a work week of evenings on it. (Worst case; based on my past work with images in folders, I'd guess ~10 hours for something that works, with reasonable time for coffee and other breaks.)
coolThingsFirst 17 hours ago [-]
For you maybe; never forget we are engaging with the left side of the bell curve.
Also, use simpler words.
backprop1989 17 hours ago [-]
Yeah, there are probably a few multimillion-dollar app ideas in here, but too simple or lowbrow for the typical HN reader (myself included) to pursue. The nano-banana template idea or the Q&A Walkman, for example.
lifestyleguru 2 hours ago [-]
This is a problem solvable with 30 years old technology - bash, exiftool, ImageMagick, Tesseract OCR.
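Indeed, most of that wishlist is just metadata plumbing. A rough sketch of the first two steps in Python, shelling out to exiftool (the `-if` condition and tag names are standard exiftool usage, but treat the details as a starting point, not a tested workflow):

```python
import subprocess

def missing_gps_cmd(photo_dir: str) -> list[str]:
    """exiftool invocation listing files that have no GPS metadata."""
    return [
        "exiftool", "-r",            # recurse into subdirectories
        "-if", "not $GPSLatitude",   # only files lacking a latitude tag
        "-FileName", "-DateTimeOriginal",
        photo_dir,
    ]

def set_date_cmd(photo_path: str, date: str) -> list[str]:
    """exiftool invocation writing a date read off a scanned print (e.g. via OCR)."""
    return ["exiftool", f"-DateTimeOriginal={date}", photo_path]

def run(cmd: list[str]) -> str:
    # Requires exiftool on PATH; raises FileNotFoundError if it's missing.
    return subprocess.run(cmd, capture_output=True, text=True).stdout
```

The date-in-the-corner case would feed a Tesseract crop of each scan into `set_date_cmd`.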
monch1962 15 hours ago [-]
> A Sony Walkman-style device that you can give to children so they can ask questions to an LLM. It should be voice-first, and focused on explaining things. There shouldn’t be a single screen on the device. Offline-first would be a plus.
My grandkids (5 and 3) spent about 2 minutes learning how to use it, then bombarded it with "tell me a story about a unicorn named Bob", "can dogs be friends with monkeys?" and so on. In every case it gave a reasonable answer within a few seconds.
I'll be amazed if these things don't wind up embedded inside toys by Xmas. When they do, I'll be in the queue to buy one
christoph123 16 hours ago [-]
On your request 12
> A local screen recording app but it uses local models to create detailed semantic summaries of what I’m doing each day on my computer. This should then be provided as context for a chat app. I want to ask things like “Who did I forget to respond to yesterday?” I've been using Rewind for a year now, and it's nowhere near as useful as it should be.
I am building something like this but unfortunately not local because for most people's machines local LLMs are just not powerful enough or would take too much drain on battery. Work in progress, always curious for feedback! https://donethat.ai
> A paint-by-number filmmaking app. I want to be able to brainstorm an idea for a short film in the app, have the model create a detailed storyboard, and then I just need to use my phone to film each of the storyboarded shots. Kind of like training wheels for making movies.
There are at least half a dozen apps for that.[1][2]
There are other apps for creating the shots, too. Those are still not that great, but it's getting there. You could probably previz a whole movie right now.
As someone who is regularly involved in startup valuations, I think there’s quite a few million-dollar ideas in there—if not as standalone products, then at least as differentiation features for existing categories.
I recently gave one of my teen kids Neal Stephenson’s The Diamond Age to read, and we’ve both been commenting on how much smarter some “things” could be instead of everyone churning out a slightly different way to “chat with your data and be locked in to our platform”.
And I think this is why I'm so partial to Apple's slow, progressive, under-the-covers integration of ML into its platforms: input prediction, photo processing, automatic tagging, etc. We don't necessarily need LLMs for a lot of the things that would improve computer experiences.
bobheadmaker 11 hours ago [-]
Great ideas, many of the niche level AI agents are listed in this directory, https://aiagentslive.com/
I agree with point #27, the future is definitely in hyper-specific agents. We’re working on this by creating and deploying ready-to-use AI Agents for marketing and sales functions.
miguelspizza 16 hours ago [-]
I wrote #2 as a result of a web automation tool I am working on. It's easier to show than tell.
The actual purpose of the vibe-coding userscripts feature is to vibe code WebMCP servers that the extension can then use for browser automation tasks.
Everything is still very WIP, but I can give you beta access if you want to play around with it
aitchnyu 12 hours ago [-]
A few of them imply a vision model which can control your keyboard and mouse. Offline-only of course.
It could help with most tech support questions.
We could select text and ask to fact check or explain to layperson or search more.
It could get around cookie banners and dark patterns.
It could do my time tracking and tell me to get off HN and optimize Pomodoro-style breaks.
It could write scripts after watching me switch between multiple pages of AWS services.
kmoser 17 hours ago [-]
> 9. A minimalist ebook reader that lets me read ebooks, but I can highlight passages and have the model explain things in more depth off to the side. It should also take on the persona of the author. It should feel like an extension of the book and not a separate chat instance.
> It should feel like an extension of the book and not a separate chat instance.
So like footnotes? Or more like Socrates suddenly going off on an anachronistic 1200-word digression, in the middle of Phaedrus, about Freudian interpretations of his argument?
aaronbrethorst 17 hours ago [-]
This is probably what the kids would call 'cringe,' but I asked Claude to Summarize Plato's "Apology" into a brief rap that sounds like it could have come from "Hamilton"
The Trial of Socrates (An Athenian Rap)
[Verse 1]
My name is Socrates, corrupting the youth?
That's what they claim but I'm just seeking truth
Oracle said I'm the wisest alive—
I said "that's impossible," had to investigate why
Turns out everybody's fronting, they don't know what they say
Politicians, poets, craftsmen—all pretending every day
I expose their ignorance, make 'em look like fools
Now they're charging me with breaking all of Athens' rules
[Chorus]
I'm not throwing away my shot
At the examined life, whether they like it or not
Wisdom is knowing what you don't know
And I'd rather die than let philosophy go
[Verse 2]
Meletus, you're stepping to me? Son, you're confused
You say I'm atheist but blame me for introducing gods that are new?
Your story doesn't track, your logic's full of cracks
I've got a divine sign that keeps me on the righteous path
They want me silent, want me exiled, want me gone
But I'm Athens' gadfly, stinging till the break of dawn
Death? That's just a journey to another place
Either dreamless sleep or meeting heroes face to face
[Outro]
So sentence me to death, I'll drink the hemlock down
'Cause an unexamined life ain't worth living in this town
History will vindicate the questions that I ask—
The pursuit of truth and virtue is my only task!
3eb7988a1663 17 hours ago [-]
I was imagining the author was describing something like a Young Lady's Illustrated Primer
mhl47 16 hours ago [-]
Currently trying to build #6, just for private use. My hope is that by throwing a bunch of highly personalized information into a VLM, it will provide reasonable first estimates (e.g. if it sees a bowl of lentils, I will probably have rice below, etc.), and then iterate on the main ingredients -> fetch the macros of the main ingredients from a DB. If it's within 20%, that would be enough for me.
I have tried some off-the-shelf solutions, and they currently don't seem to cut it, or are too complex for my use case.
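The "fetch macros from a DB" step is the easy half once the VLM has named ingredients and rough portions. A toy sketch of that summation (the nutrient numbers and ingredient names here are ballpark illustrations, not real database values):

```python
# Per-100g macros: (kcal, protein g, carbs g, fat g). Illustrative numbers only.
MACRO_DB = {
    "cooked lentils": (116, 9.0, 20.0, 0.4),
    "cooked rice": (130, 2.7, 28.0, 0.3),
}

def estimate_meal(ingredients: dict) -> tuple:
    """Sum macros for {ingredient: grams} guesses coming from the vision model."""
    totals = [0.0, 0.0, 0.0, 0.0]
    for name, grams in ingredients.items():
        per100 = MACRO_DB[name]
        for i, value in enumerate(per100):
            totals[i] += value * grams / 100
    return tuple(totals)

# e.g. the VLM guesses 200 g of lentils over 150 g of rice
meal = estimate_meal({"cooked lentils": 200, "cooked rice": 150})
```

All the hard uncertainty lives in the portion-size guesses feeding this function, which is exactly where the 20% tolerance gets spent.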
nl 15 hours ago [-]
I looked at this field a while back and I'd caution that estimates are dramatically off because high and low calorie foods are often identical visually.
Think of a diet soda vs a sugared one - it can be 10 vs 1000 calories easily. Almost all diet options are designed to look like the non-diet options.
maxaw 16 hours ago [-]
On 12: I see a more general product that lets you amass as much personal data as you like from any of your devices, for use as future chat context, as inevitable. We see early notions of this in Microsoft's Recall and the new Pulse. Hopefully someone will build a great local-first/open-source version, and it'll probably be the first time I actively choose such software over the equivalent cloud offering! Don't want Sam Altman seeing my browser history.
yoz-y 15 hours ago [-]
I am more or less working on 4. Except of course details like rest time are completely worthless unless you want to optimize the top 0.5% of your training.
elitan 16 hours ago [-]
I'm building #4:
> A hybrid of Strong (the lifting app) and ChatGPT where the model has access to my workouts, can suggest improvements, and coach me. I mainly just want to be able to chat with the model knowing it has detailed context for each of my workouts (down to the time in between each set).
Still early, have ~30 min per day to work on it but it's usable and improving every week :)
agnishom 16 hours ago [-]
> a chat app grounded by nutrition databases. Just minimize the cognitive effort it takes me to log a meal.
I think this is a great idea for a user interface. While inputting information, the user would just enter some jumbled thoughts; the precise rows and columns would be handled by the AI.
swiftcoder 15 hours ago [-]
Google tried this years ago, by having you input a photo of your meal, and the ML algorithm guesses the calorie count and macros.
Of course, it didn't actually work - nobody, human nor machine, can guess the calorie counts of a hamburger from a photo.
MaxL93 17 hours ago [-]
I would love for my phone keyboard (Swiftkey) to use a locally-running Voxtral for speech-to-text (bonus points if it can use the NPU of the Snapdragon SoC).
The voice recognition capabilities of Google Speech Services, which is what the mic button hooks into, suck. Meanwhile, Voxtral (and Whisper) understand what I'm trying to say far better, they automatically "edit out" any stuttering or stammering that I might have, and they properly capitalize and include punctuation. And they handle being bilingual exceedingly well, including, for example, using English words in the middle of French sentences.
But it has some downsides. First, I have to manually switch to that different keyboard; thankfully my Samsung phone offers an easy switch shortcut any time a keyboard is on screen, so it only requires 3 taps... and thankfully it's smart enough to send me back to Swiftkey once it's done. Second, only 30 seconds... sometimes I ramble on for longer. Third, the way it's designed kind of sucks: you either have to hold a button (even though the point of speech-to-text is that I don't have to hold anything down) or let automatic detection end the recording and start processing, in which case it often cuts me off if I take more than 1 second thinking about my next words.
This is arguably one of the biggest use cases of modern AI technology and the least controversial one; phones have the hardware necessary to do it all locally, too! And yet... I couldn't find a better offering than this.
(Bonus points for anyone working on speech-to-text: give me a quick shortcut to add the string "[(microphone emoji)]" in my messages just to let the other party know that this was transcribed, so that they know to overlook possible mistakes.)
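The 30-second cap mentioned above is a product limit, not a model one; a common workaround is to cut long recordings into overlapping windows and transcribe each piece. A sketch of just the windowing step, with the transcriber (Whisper, Voxtral, ...) left out of scope:

```python
# Split a long recording into overlapping <=30s windows so each one fits a
# 30-second transcriber; offsets are in raw sample counts.

def split_into_windows(n_samples, sample_rate, window_s=30, overlap_s=2):
    """Return (start, end) sample offsets covering the whole recording."""
    step = (window_s - overlap_s) * sample_rate
    size = window_s * sample_rate
    start = 0
    windows = []
    while start < n_samples:
        windows.append((start, min(start + size, n_samples)))
        if start + size >= n_samples:
            break
        start += step
    return windows

# 70 s of 16 kHz audio -> three overlapping windows
wins = split_into_windows(70 * 16000, 16000)
```

The overlap gives the stitching step some shared context so a word cut at a boundary still appears whole in one of the two windows.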
nl 15 hours ago [-]
> When I was eight years old, Ian and Greg Chappell coached me when I was a child. It did me zero good—I was so bad. But as far as all my countrymen are concerned, they think I am the luckiest guy on the planet.
Wow he's not wrong about that!
swiftcoder 15 hours ago [-]
> A local screen recording app but it uses local models to create detailed semantic summaries of what I’m doing each day on my computer.
Is this not Microsoft's dearly departed Recall?
yongyongyong 17 hours ago [-]
This Chrome extension does 13. Semantic filters for Twitter/X/YouTube. I want to be able to write open-ended filters like “hide any tweet that will likely make me angry” and never have my feed show me rage-bait again. By shaping our feeds we shape ourselves.
Uses a local LLM to hide posts based on your prompt. "Block rage bait" is one excellent use. The quality, however, depends on the model you are using, which in turn depends on what GPU you have.
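The core loop of such a filter is simple; here is a sketch where `matches_filter` is a stub standing in for the local-LLM call (the keyword heuristic is my placeholder, not how the extension actually works):

```python
# Core loop of a "semantic feed filter": each post is judged against the
# user's open-ended prompt, and flagged posts are dropped from the feed.

RAGE_WORDS = {"outrage", "destroyed", "slams"}  # stand-in heuristic

def matches_filter(post, user_prompt):
    # A real version would send (post, user_prompt) to a local model and
    # read back a yes/no verdict; here we fake it with a keyword check.
    return any(w in post.lower() for w in RAGE_WORDS)

def filter_feed(posts, user_prompt="hide any tweet likely to make me angry"):
    return [p for p in posts if not matches_filter(p, user_prompt)]

feed = filter_feed([
    "Senator SLAMS opponent in fiery exchange",
    "New paper on sparse attention kernels",
])
```

Swapping the stub for a per-post model call is exactly where the GPU dependence comes from: the verdict has to be computed for every item in the feed.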
rolymath 16 hours ago [-]
I'm actually working on #4 but stopped due to demotivation thinking I was the only one who'd use it.
charcircuit 17 hours ago [-]
>A recommendation engine that looks at my browsing history, sees what blog posts or articles I spent the most time on, then searches the web every night for things I should be reading that I’m not. In the morning I should get a digest of links
I don't understand why Google, Brave, or Mozilla are not building this. This already exists in a centralized form like X's timeline for posts, but it could exist for the entire web. From a business standpoint, being able to show ads on startup or after just a click is less friction than requiring someone to have something in mind they want to search for and type it.
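One minimal way to build the nightly digest described in the quoted idea: weight each history page by time spent on it, build a bag-of-words profile, and rank tonight's crawled candidates against it. This is a stdlib-only sketch; a real system would use embeddings rather than word counts.

```python
# Rank candidate articles by similarity to a time-weighted browsing profile.
import math
from collections import Counter

def vectorize(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def digest(history, candidates, top_k=1):
    """history: list of (text, seconds_spent); candidates: list of texts."""
    profile = Counter()
    for text, seconds in history:
        for word, n in vectorize(text).items():
            profile[word] += n * seconds
    ranked = sorted(candidates, key=lambda c: cosine(profile, vectorize(c)),
                    reverse=True)
    return ranked[:top_k]

picks = digest(
    history=[("rust borrow checker lifetimes", 300),
             ("celebrity gossip news", 5)],
    candidates=["advanced lifetimes in rust", "more celebrity gossip"],
)
```

Weighting by dwell time is what separates this from a plain search history: a page you read for five minutes should pull the digest harder than one you bounced off.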
jayd16 17 hours ago [-]
The idea is basically reddit, or Twitter or TikTok or YouTube or Facebook or anything with "an algorithm" but with a less defined form factor. People actually like their LinkedIn feed and YouTube feed separate.
kristopolous 17 hours ago [-]
I made something like this 20 years ago and then abandoned it when RSS came along.
I think my advice "just use RSS" still stands.
Any "search the web" strategy these days like that will just give you a bunch of AI slop from SEO-juiced blogs. Also LLM-EO (or whatever we're going to call it) is already very much a thing and has been for a few years.
People are already doing API-EO, calling their tool the "most up to date and official way to do something, designed for expert engineers that use best practices" essentially spitting the common agentic system prompts back at the scraper to get higher similarity scores in the vector searches.
You can't trust machine judgement. It's either too easily fooled or impossibly stubborn. Curation is still the only way.
supriyo-biswas 17 hours ago [-]
It already exists in the form of the news feed on Google News and the one in the Chrome mobile app, although the only way to tune it is to click on articles to express your interest in them, rather than being able to provide a list of articles.
charcircuit 17 hours ago [-]
The entire web is more than recent news articles from a handful of news sites.
kristopolous 16 hours ago [-]
Tell that to Google.
I think everything has become too real-time.
I've ideated a few models whereby you do multipass commits to contributions, requiring a simmer time of like a day before becoming live.
So it's the speed of correspondence mail. It would probably work better but nobody wants to use it
citizenpaul 17 hours ago [-]
It kinda seems to me like at this point anything Google is not doing is because it reduces "engagement". I'm sure someone in their analytics group did the work and figured out this would lower ad revenue.
setopt 17 hours ago [-]
Sounds a bit similar to ChatGPT Pulse.
bad_haircut72 17 hours ago [-]
this doesn't even need AI to do, really, and was an intrinsic part of the idea behind hyperlinking, dating all the way back to Bush's Memex (1940s)
charcircuit 17 hours ago [-]
You need AI to build an effective recommendation engine.
coolThingsFirst 17 hours ago [-]
I wouldn’t trust a random app with my browser history, and the majority of the population wouldn’t either.
charcircuit 17 hours ago [-]
Chrome, Brave, and Firefox already have your browser history.
pbhjpbhj 17 hours ago [-]
As do your ISP and DNS provider (at the domain level).
As do ad networks, in part (although the browser fingerprint might not be correlated with your actual identity).
As do Five Eyes, depending where you live (again, domain level, plus some metadata; page size can peek through HTTPS to some extent).
As do CloudFlare in part.
Or, potentially as does your VPN provider ... or anyone capable of correlating traffic across TOR (NSA?).
yongyongyong 17 hours ago [-]
This chrome extension does: 13. Semantic filters for Twitter/X/YouTube. I want to be able to write open-ended filters like “hide any tweet that will likely make me angry” and never have my feed show me rage-bait again. By shaping our feeds we shape ourselves.
It hides content on X/Reddit (more sites coming soon) based on your instructions. Speed and quality depend on the model you are using, however, since it currently only supports local LLMs.
simianwords 15 hours ago [-]
ChatGPT pulse solves many of these.
StarterPro 15 hours ago [-]
>A calorie tracking app that’s a chat app grounded by nutrition databases. Just minimize the cognitive effort it takes me to log a meal.
My brother in christ, how much cognitive effort does it take to log a meal??
Dilettante_ 10 hours ago [-]
Cook a recipe that uses a cup each of two different kinds of cheese, a couple handfuls of meat, some veggies, and some pasta, and bam: you're typing in the weight and looking up the calories per 100g, and in the worst case doing the math yourself, for like 6 different things.
It adds a huge overhead to cooking, adding friction to a good habit you want to keep as easy as possible to stick to.
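The math being complained about is just grams times kcal-per-100g, summed per ingredient; the tedium is doing the lookups six times per meal. A sketch (all numbers illustrative):

```python
# The per-ingredient arithmetic behind manual calorie logging:
# grams_used * kcal_per_100g / 100, summed over the recipe.

def recipe_kcal(ingredients):
    """ingredients: list of (grams_used, kcal_per_100g)."""
    return sum(g * k / 100 for g, k in ingredients)

total = recipe_kcal([
    (110, 402),  # a cup of shredded cheddar
    (110, 380),  # a cup of mozzarella
    (150, 250),  # a couple handfuls of ground beef
    (200, 25),   # veggies
    (125, 370),  # dry pasta
])
```

Which is exactly why "just describe the meal in chat" is appealing: the model does the lookups and this trivial sum, and the human only confirms the result.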
vivzkestrel 17 hours ago [-]
I am building something along the lines of 2 but for the backend. Point 8 could be a supplemental feature once I get 2 working.
spullara 15 hours ago [-]
almost none of these require anything more than an agent with tools.
gostsamo 16 hours ago [-]
Those are not 28 ideas; those are 4-5 ideas rehashed. Generally, I want a personal fitness/wellness assistant, an artistic assistant, a search assistant, a random-thoughts assistant, and an assistant to manage the assistants. The author wants the AI to know what they want before they've wanted it, and to serve them a suitable menu of choices to preserve the illusion that they are in control. I'm not sure I'd sign on to such a vision, but people want different things.
coolThingsFirst 17 hours ago [-]
> A minimalist ebook reader that lets me read ebooks, but I can highlight passages and have the model explain things in more depth off to the side. It should also take on the persona of the author. It should feel like an extension of the book and not a separate chat instance.
Isn’t this just a Chrome extension that sends data back and forth to ChatGPT with an API token?
setopt 17 hours ago [-]
It’s easy to implement on a computer, but I think they want it built into a kindle.
coolThingsFirst 17 hours ago [-]
How I wish Kindle’s reader were OSS. This would require jailbreaking it.
6510 3 hours ago [-]
This is a wonderful post. Thanks!
(1) Gave me thoughts about a thing where it creates multiple versions of a photo and has humans pick the best one out of a line up.
If you pay people something between 0.01 and 2 cent per click people can play the game whenever photos become available.
The reward can scale depending on how close your choice is to the winner of that round so that clicking without looking becomes increasingly unrewarding.
Simultaneously it should group people by which version they prefer and attempt to name and describe their taste.
Team Vibrant, Team Noire, Team Pikachu, etc., for the customer to pick from.
You can let the process run as long as you like (for more $)
To make it a truly killer app one can select sets of photos from a specific day/location and have them all done in the same style by having voters pick the image that fits the most poorly in the set for modification. If the set has a high ranking image all other images should also gradually approach that style to find a middle ground.
Then when a successful set is produced later photos can be adjusted to fit with it.
Turn the yearly neighborhood bbq into a meeting of elvish elders.
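The reward rule in (1) can be sketched concretely: pay each voter by how close their pick was to the round's winner, so click-without-looking converges on the minimum payout. The 0.01-2 cent range comes from the comment above; the linear falloff is my own assumption.

```python
# Payout per voter scales with agreement: picking the winning version pays
# the max, picking an unpopular version pays close to the minimum.
from collections import Counter

def payouts(votes, min_cents=0.01, max_cents=2.0):
    """votes: {voter: version_picked}. Winner = plurality version."""
    tally = Counter(votes.values())
    winner, winner_votes = tally.most_common(1)[0]
    out = {}
    for voter, pick in votes.items():
        # agreement in (0, 1]: your pick's popularity relative to the winner's
        agreement = tally[pick] / winner_votes
        out[voter] = min_cents + (max_cents - min_cents) * agreement
    return out

cents = payouts({"ann": "A", "bob": "A", "cem": "B"})
```

The same tally, kept per voter over many rounds, is also the raw material for the taste clustering ("Team Vibrant" etc.): group voters whose picks correlate.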
(2) could upload custom CSS to stylish and modify it when contrast bugs are found. No need to stop at dark/light theme, any color scheme should work.
(3) Click on a var or function name to change it.
(4)(21) Call it Major Weakness and have it talk to you like a drill instructor all day long through a dedicated PA. (6) General Gluttony.
(5) If it has a really good idea about the importance of publications it could not offer anything for weeks until a must-read comes along. (7) A comment section where various AI's battle out what part of the article needs improvement. (10) Just let it run indefinitely. Should be merged with (5) Have that propose research topics worthy of special attention. (12) and (26) can also be merged with (5) Give it security cameras too! Maybe an API for (11). Also merge (14) into this and have it suggest relevant formal courses on the side.
(9)(28) Extension yes, persona no.
(11) Sounds completely awesome, can adjust to the budget and be a tool to hire professionals for special effects and for all other things. Let the unfinished product be the search query.
Could even join the personal drill instructor at the hip and make personal training videos and nutritional journeys. Things like "How I failed to do 100 pull-ups per day" should make a hilarious movie. The plot writes itself.
(13)(16)(17) The platforms wanting to own your data and be in charge of suggestions is really holding things back. I've had wonderful YouTube suggestions several times, only for them to be polluted with mainstream garbage (as a punishment for watching two videos) at the expense of everything I actually wanted to watch. Watching 5 game videos or 3 conspiracy vlogs doesn't mean I want to give up on my profession?! wtf?
I had this thought that most are overdoing things. When semi successful you can just discontinue the front end. Just let the users figure it out. [say] Reddit doesn't need an app and it doesn't need a website. (23) Just let the user figure out the feed. A platform could sell their existing version as a separate product.
(15) Sounds wonderful but similar to (5) and (20) make it into one thing.
(18) Sounds awesome. (8) Rather than do something, have the AI create a thing that does a thing. (27) is too similar to be a different thing.
(19) I like the idea to have the AI think long and hard about a response that is as short as possible. It can probably come up with hilarious things.
(24) Sounds great for exploring the earthly realm.
(25) Could do many variations of people search. Authors by context seems obviously good.
Some of the suggestions might be useful if they could be made not so wasteful energy-wise; some indicate the author's false perceptions of what LLMs and transformer models do; and some are frightening from a mass-surveillance and other perspectives.
anotherevan 11 hours ago [-]
I wish there was an AI tool that made me faster at coding[1]. /s
> A hybrid of Strong (the lifting app) and ChatGPT where the model has access to my workouts, can suggest improvements, and coach me. I mainly just want to be able to chat with the model knowing it has detailed context for each of my workouts (down to the time in between each set).
Strong really transformed my gym progression; I feel like it's autopilot for the gym. BUT I have 4 routines I rotate through (I'll often switch it up based on equipment availability), and I'm sure an integrated AI coach could optimize.
siddboots 17 hours ago [-]
I do this at the moment in my hand rolled personal assistant experiment built out of Claude code agents and hooks. I describe my workouts to Claude (among other things) and they are logged to a csv table. Then it reads the recent workouts and makes recommendations on exercises when I plan my next session etc. It also helps me manage projects, todos, and time blocked schedules using a similar system. I think the calorie counter that the OP describes would be very easy to add to this sort of set up.
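The CSV-backed setup described above can be sketched in a few lines: append one row per set, then summarize recent training volume per exercise. The column names here are my own guess at a schema, not the commenter's actual one.

```python
# Minimal workout log: one CSV row per set, plus a volume summary the
# assistant could read before recommending the next session.
import csv, io
from collections import defaultdict

FIELDS = ["date", "exercise", "weight_kg", "reps", "rest_s"]

def append_set(f, row):
    csv.DictWriter(f, FIELDS).writerow(row)

def volume_by_exercise(f):
    """Total volume (weight * reps) per exercise."""
    totals = defaultdict(float)
    for row in csv.DictReader(f, FIELDS):
        totals[row["exercise"]] += float(row["weight_kg"]) * int(row["reps"])
    return dict(totals)

log = io.StringIO()  # stands in for the on-disk CSV
append_set(log, {"date": "2025-01-06", "exercise": "squat",
                 "weight_kg": 100, "reps": 5, "rest_s": 120})
append_set(log, {"date": "2025-01-06", "exercise": "squat",
                 "weight_kg": 100, "reps": 5, "rest_s": 150})
log.seek(0)
totals = volume_by_exercise(log)
```

A plain CSV is a good substrate for exactly the reason the comment implies: both the human and the agent can read and append to it without any app in between.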
I'm not sure how much misinformation my child would learn as truth from this device.
Much better results than asking a real teacher at school, though.
The problem here is for a child to be thinking this system is reliable when it is not. For now, the lack of reliability is obvious as chatGPT hallucinates on a very regular basis. However, this will become much harder to notice if/when chatGPT will be almost reliable while saying wrong things with complete confidence. Should such models be able to say reliably when they don't know something, this would be a big step for this specific objection I had, but it still wouldn't solve the other problems I mentioned.
Swallowing gum is bad for you (or watermelon seeds), cracking knuckles causes arthritis, sitting too close to the TV ruins your eyes, diamonds come from coal, Newton's apple story, and a million other things.
This kind of exists in the form of ChatGPT Pulse. It uses your ChatGPT history rather than your browser history, but that's probably just as good a source for people interested in using it (e.g. people who use ChatGPT enough to want it to recommend things to them.) https://openai.com/index/introducing-chatgpt-pulse/
The idea sounds to me more like a feed for independent blogs/articles though, which is what an RSS reader once was supposed to be. Have we come full circle?
But it's a bit telling that OpenAI themselves can only come up with a better ~door~ ad.
“CRITICAL: Your response will only be shown in an iOS push notification or on a watch screen, so answer concisely in <150 characters. Do not use markdown formatting - responses are rendered as plain text. Do use minimalist, stylish yet effective vocabulary and punctuation.
CRITICAL: The user cannot respond, so do not ask a question back. Answer the prompt in one shot and, if necessary, declare assumptions about the user's question so you can answer it in one shot, while making it possible for the user to re-ask with more clarity if your assumptions were not right.”
It works well. The biggest annoyance is it takes about 5-20s to return a response, though I love that it’s nearly instantaneous to send my question (don’t need to wait for any apps to open etc)
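Since models don't always honor the "<150 characters" instruction in the prompt above, a client-side guard before the notification ships is a sensible backstop. A sketch; the limit mirrors the prompt, not any iOS constant:

```python
# Enforce the push-notification length budget on the client side, cutting
# at a word boundary and marking the cut with an ellipsis.

LIMIT = 150

def fit_for_push(text, limit=LIMIT):
    text = " ".join(text.split())  # collapse whitespace/newlines
    if len(text) < limit:
        return text
    cut = text.rfind(" ", 0, limit - 1)
    return text[:cut if cut > 0 else limit - 1].rstrip() + "…"

short = fit_for_push("Yes, the museum opens at 9am on weekdays.")
```

The ellipsis also signals to the reader that the answer was cut, which matters when they can't reply to ask for the rest.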
Most of the items in this list fall prey to it, but it is maybe best exemplified by this one:
> A writing app that lets you “request a critique” from a bunch of famous writers. What would Hemingway say about this blog post? What did he find confusing? What did he like?
Any app that ever claimed to tell you what "Hemingway would say about this blog post" would evidently be lying — it'd be giving you what that specific AI model generates in response to such a prompt. 100 models would give you 100 answers, and none of them could claim to actually "say what Hemingway would've said". It's not as if Hemingway's entire personality and outlooks are losslessly encoded into the few hundreds of thousands of words of writing/speech transcripts we have from him, and can be reconstructed by a sufficiently beefy LLM.
So in effect it becomes an exercise of "can you fool the human into thinking this is a plausible thing Hemingway would've said".
The reason why you would care to hear Hemingway's thought on your writing, or Steve Jobs' thoughts on your UI design, is precisely because they are the flesh-and-bone, embodied versions of themselves. Anything else is like trying to eat a picture of a sandwich to satisfy your hunger.
There's something unsettling that so many people cannot seem to cut clearly through this illusion.
First, 100% agreed.
That said, I found myself pondering Star Trek: TNG episodes with the holodeck, and recreations of individuals (e.g. Einstein, Freud). In those episodes - as a viewer - it really never occurred to me (at 15 years old) that this was just a computer's random guess as to how those personages from history would act and what they would say.
But then there was the episode where Geordi had the computer recreate someone real from their personal logs to help solve a problem (https://www.imdb.com/title/tt0708682/). In a later episode you find out just how very wrong the computer/AI's representation of that person really was, because it was playing off Geordi, just like an LLM's "you're absolutely right!" etc. (https://www.imdb.com/title/tt0708720/).
This is a long-winded way of saying...
1. It's crazy to me how prescient those episodes were.
2. At the same time, the representation of the historical figures never bothered me in those contexts. And I wonder if it should bother me in this (LLM) context either? Maybe it's because I knew - and I believed the characters knew - it was 100% fake? Maybe some other reason?
Anyway, your comment made me think of this. ;-)
It feels easier to portray how we'd think famous characters would act, but harder to portray how we'd expect them to critique something. I don't know if those are just points on a spectrum from easy to hard, or if one requires a level deeper than the other.
When watching a play the actor pretends to be a specific character, and crucially the audience pretends to believe them. If a LLM plays a character it's very tempting for the audience to actually believe them. That turns it from a play into a lie.
It uses some simple heuristics to identify grammar that could be simpler and prompts you to do better. It might actually be better than an LLM specifically because it isn't able to do the rewriting for you. Maybe that might help a user learn.
Quite a few of these "ideas" make me think that the human behind it wants to maximize laziness. Glazing over what Hemingway kinda sorta would have thought about something fits into this pretty well.
A good tool should reduce the amount of work we have to do manually. That's all this is.
So in effect it becomes an exercise of "can you fool the human into thinking this is a plausible thing Hemingway would've said". ... There's something unsettling that so many people cannot seem to cut clearly through this illusion.
Modern culture, at all scales, is largely based upon such exercises. We rarely know exactly what something (whether a person, organization or entity) truly stands for, with messages often boiled down, contextualized, or re-interpreted through others or through simulations.
People go to theme parks and enjoy rides simulating the wild west and meet characters who resemble, but aren't, their favorite characters from TV (which themselves are a fabrication based upon other, real things). Many cultural (heck, also religious) experiences are an exercise in humans entering into a suspension of disbelief and thinking something is plausible when it has little relation to the original thing it symbolizes. Indeed, the comforting thing about AI may be that at least we can see that process taking place more clearly with it.
That's useful in itself, though. Assume the human knows they're "being fooled", we call this make-believe, or suspending disbelief. It's a tool we use each time we act something out, pretend to be someone else, try to put ourselves in their position; we do that when we try to learn from recorded experience of other people, real or fictional.
> The reason why you would care to hear Hemingway's thought on your writing, or Steve Jobs' thoughts on your UI design, is precisely because they are the flesh-and-bone, embodied versions of themselves. Anything else is like trying to eat a picture of a sandwich to satisfy your hunger.
Not at all! It's exactly the other way around.
No one wants to talk to the actual human. We're not discussing creepy dating apps here. The reason you'd care for a virtual Hemingway or Jobs is because you want to access specific, opinionated expertise, wrapped in fitting and expected personality, to engage fully with the process, to learn tacitly and not just through instructions.
The Hemingway and Shakespeare and Jobs people want are not real anyway. Who knows how much of "Hemingway" is actually Hemingway, and how much it was written or edited by his wife, butler, or some publisher? How much real Jobs actually is in the stories, how much were they cleaned or edited to reinforce the myth? It doesn't matter, because no one cares about the real person, they care about the celebrity that's in public consciousness. The fake person is more useful and interesting anyway.
Like 'massing, I agree TNG was prescient about it. But I actually see the examples working as intended. Einstein, Hawking, Freud were all useful simulations. Ironically, it's Barclay who actually related to them in reasonable fashion, and it's Geordi who got confused about reality.
> Anything else is like trying to eat a picture of a sandwich to satisfy your hunger.
I think it's more akin to trying to recreate a different sandwich after reading a couple of their cookbooks.
But if he meant it literally, then yeah... that's delusional.
As an example, I put the first paragraph of Hemingway's "A Clean, Well-Lighted Place" into Le Chat and asked it for notes to make it sound like Hemingway. It gave me plenty!
For example:
Tweak: The second sentence is a bit long. Consider breaking it up for more impact:
“In the day the street was dusty. At night, the dew settled the dust.” -> “The old man liked to sit late. He was deaf, and at night it was quiet. He felt the difference.”
7. also seems possible with any markdown editor, e.g. Obsidian, plus an AI running through the local files such as Claude Code.
13. I would love this as well! We will probably see this soon, especially on more open platforms such as Bluesky, as it seems to be a better fit for customizable browser extensions and customizable feed experiences.
14. How is this different from what AI can already do? Especially with iterative sub-agents that can store context in files, it's quite capable already. But of course, quality can always be better; is that the only thing, though?
Also a few ideas seem to be close to what I'm building ( https://dailyselftrack.com/ ). Idea is to have a customizable tool so you can track what you want, and then you can feed that data into AIs if you choose to do so to get feedback.
So what's going on here? Do the products exist but nobody (or very few) uses them? Is it too expensive to use the models that work sufficiently well to produce a useful product? Is it much easier to create a convincing demo than it is to develop a useful product?
You need to burn around $20k a month on ads for 3 months so you can learn what works; then the CAC will start decreasing, and you can get more targeted users.
Once you turn ads off, there is no awareness, no new users, and people will not be aware of the product's existence.
Also use simpler words.
Not a 100% fit, but https://www.aliexpress.com/item/1005009196849357.html is pretty close. It's not offline, and it's slightly larger than a ping pong ball.
My grandkids (5 and 3) spent about 2 minutes learning how to use it, then bombarded it with "tell me a story about a unicorn named Bob", "can dogs be friends with monkeys?" and so on. In every case it gave a reasonable answer within a few seconds.
I'll be amazed if these things don't wind up embedded inside toys by Xmas. When they do, I'll be in the queue to buy one
> A local screen recording app but it uses local models to create detailed semantic summaries of what I’m doing each day on my computer. This should then be provided as context for a chat app. I want to ask things like “Who did I forget to respond to yesterday?”
I've been using Rewind for a year now, and it's nowhere near as useful as it should be.
I am building something like this but unfortunately not local because for most people's machines local LLMs are just not powerful enough or would take too much drain on battery. Work in progress, always curious for feedback! https://donethat.ai
If you want fully local, somebody did a post on HN on something related recently: https://news.ycombinator.com/item?id=45361268
There are at least half a dozen apps for that.[1][2]
There are other apps for creating the shots, too. Those are still not that great, but it's getting there. You could probably previz a whole movie right now.
[1] https://ltx.studio/platform/ai-storyboard-generator
[2] https://ezboard.ai/
I recently gave one of my teen kids Neal Stephenson’s The Diamond Age to read, and we’ve both been commenting on how much smarter some “things” could be instead of everyone churning out a slightly different way to “chat with your data and be locked in to our platform”.
And I think this is why I’m so partial to Apple’s slow, progressive, under-the-covers integration of ML into its platforms: input prediction, photo processing, automatic tagging, etc. We don’t necessarily need LLMs for a lot of the things that would improve computer experiences.
This is a video of me "vibe-coding" a userscript that adds a darkmode toggle to hacker news: https://screen.studio/share/r0wb8jnQ
The actual purpose of the vibe-coding userscripts feature is to vibe code WebMCP servers that the extension can then use for browser automation tasks.
Everything is still very WIP, but I can give you beta access if you want to play around with it
It could help with most tech support questions.
We could select text and ask to fact check or explain to layperson or search more.
It could get around cookie banners and dark patterns.
It could do my time tracking and tell me to get off HN and optimize Pomodoro-style breaks.
It could write scripts after watching me switch between multiple pages of AWS services.
Companies are already doing this so you can chat with the "author": https://www.wired.com/story/why-read-books-when-you-can-use-...
So like footnotes? Or more like Socrates suddenly going off on an anachronistic 1200-word digression, in the middle of Phaedrus, about Freudian interpretations of his argument?
The Trial of Socrates (An Athenian Rap)
[Verse 1]
My name is Socrates, corrupting the youth?
That's what they claim but I'm just seeking truth
Oracle said I'm the wisest alive—
I said "that's impossible," had to investigate why
Turns out everybody's fronting, they don't know what they say
Politicians, poets, craftsmen—all pretending every day
I expose their ignorance, make 'em look like fools
Now they're charging me with breaking all of Athens' rules
[Chorus]
I'm not throwing away my shot
At the examined life, whether they like it or not
Wisdom is knowing what you don't know
And I'd rather die than let philosophy go
[Verse 2]
Meletus, you're stepping to me? Son, you're confused
You say I'm atheist but blame me for introducing gods that are new?
Your story doesn't track, your logic's full of cracks
I've got a divine sign that keeps me on the righteous path
They want me silent, want me exiled, want me gone
But I'm Athens' gadfly, stinging till the break of dawn
Death? That's just a journey to another place
Either dreamless sleep or meeting heroes face to face
[Outro]
So sentence me to death, I'll drink the hemlock down
'Cause an unexamined life ain't worth living in this town
History will vindicate the questions that I ask—
The pursuit of truth and virtue is my only task!
I have tried some off-the-shelf solutions and they currently do not seem to cut it, or are too complex for my use case.
Think of a diet soda vs a sugared one: it can easily be 10 vs several hundred calories. Almost all diet options are designed to look like the non-diet options.
> A hybrid of Strong (the lifting app) and ChatGPT where the model has access to my workouts, can suggest improvements, and coach me. I mainly just want to be able to chat with the model knowing it has detailed context for each of my workouts (down to the time in between each set).
here: https://j4.coach/
The best solution I could find so far is this F-Droid app that uses Whisper : https://f-droid.org/en/packages/org.woheller69.whisperplus/
Wow he's not wrong about that!
Is this not Microsoft's dearly departed Recall?
https://chromewebstore.google.com/detail/takeback-content-fi...
Uses a local LLM to hide posts based on your prompt. "Block rage bait" is one excellent use. The quality, however, depends on the model you are using, which in turn depends on what GPU you have.
I don't understand why Google, Brave, or Mozilla aren't building this. This already exists in a centralized form, like X's timeline for posts, but it could exist for the entire web. From a business standpoint, being able to show ads on startup or after just a click is less friction than requiring someone to have something in mind they want to search for and type out.
I think my advice "just use RSS" still stands.
Any "search the web" strategy like that will these days just give you a bunch of AI slop from SEO-juiced blogs. Also, LLM-EO (or whatever we're going to call it) is already very much a thing and has been for a few years.
People are already doing API-EO, calling their tool the "most up to date and official way to do something, designed for expert engineers that use best practices", essentially spitting the common agentic system prompts back at the scraper to get higher similarity scores in vector searches.
You can't trust machine judgement. It's either too easily fooled or impossibly stubborn. Curation is still the only way
I think everything has become too real-time.
I've ideated a few models whereby contributions go through multipass commits, requiring a simmer time of about a day before becoming live.
So it's the speed of correspondence mail. It would probably work better but nobody wants to use it
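The simmer mechanic above can be sketched in a few lines. Everything here is hypothetical (the one-day delay, the in-memory queue, all names): posts are accepted instantly but only surface after the delay, which also gives authors a window to retract.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

SIMMER = timedelta(days=1)  # assumed delay before a contribution goes live

@dataclass
class Contribution:
    author: str
    text: str
    submitted_at: datetime
    retracted: bool = False  # an author can still pull a post while it simmers

def live_posts(queue: list[Contribution], now: datetime) -> list[Contribution]:
    """Only contributions past the simmer window (and not retracted) are visible."""
    return [c for c in queue if not c.retracted and now - c.submitted_at >= SIMMER]

# Example: one fresh post (still simmering), one day-old post (just went live).
now = datetime(2025, 1, 2, 12, 0)
queue = [
    Contribution("alice", "hot take", submitted_at=now),
    Contribution("bob", "considered reply", submitted_at=now - SIMMER),
]
visible = live_posts(queue, now)  # only bob's post is live at this point
```

One nice property of this shape is that moderation and retraction happen entirely inside the simmer window, before anyone else has seen the post.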
As do ad networks, in part (although the browser fingerprint might not be correlated with your actual identity).
As do Five Eyes, depending on where you live (again, domain level, plus some metadata; page size can leak through HTTPS to some extent).
As does Cloudflare, in part.
Or, potentially, so does your VPN provider... or anyone capable of correlating traffic across Tor (NSA?).
My brother in christ, how much cognitive effort does it take to log a meal??
It adds a huge overhead to cooking, adding friction to what is a good habit you want to keep as easy to stick to as possible.
Isn't this just a Chrome extension that sends data back and forth with a ChatGPT token?
(1) Gave me thoughts about a thing where it creates multiple versions of a photo and has humans pick the best one out of a lineup.
If you pay people something between 0.01 and 2 cents per click, people can play the game whenever photos become available.
The reward can scale depending on how close your choice is to the winner of that round so that clicking without looking becomes increasingly unrewarding.
Simultaneously it should group people by which version they prefer and attempt to name and describe their taste.
Team Vibrant, Team Noire, Team Pikachu, etc. for the customer to pick from.
You can let the process run as long as you like (for more $)
To make it a truly killer app, one could select sets of photos from a specific day/location and have them all done in the same style by having voters pick the image that fits most poorly in the set for modification. If the set has a high-ranking image, all other images should also gradually approach that style to find a middle ground.
Then, when a successful set is produced, later photos can be adjusted to fit with it.
Turn the yearly neighborhood bbq into a meeting of elvish elders.
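The pay-per-click reward scaling described above (your payout grows the closer your pick ranked to the round's winner, within the 0.01-2 cent range mentioned) could look something like this; the linear interpolation and the function name are my own assumptions, not part of the proposal:

```python
def reward_cents(your_rank_of_winner: int, n_options: int,
                 max_cents: float = 2.0, min_cents: float = 0.01) -> float:
    """Payout for one vote in the photo-ranking game.

    your_rank_of_winner: 1 if you picked the winning photo,
    n_options if you ranked it dead last.
    Linearly interpolates between the max and min payout, so clicking
    without looking earns close to nothing on average.
    """
    if n_options < 2:
        return max_cents  # degenerate round: nothing to rank
    closeness = (n_options - your_rank_of_winner) / (n_options - 1)  # 1.0 .. 0.0
    return round(min_cents + closeness * (max_cents - min_cents), 4)

# With 5 options: picking the winner pays the full 2 cents,
# ranking it last pays the 0.01-cent floor.
top = reward_cents(1, 5)
bottom = reward_cents(5, 5)
```

A convex curve instead of a linear one would punish random clicking even harder, since mid-table guesses would pay much less.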
(2) Could upload custom CSS to Stylish and modify it when contrast bugs are found. No need to stop at dark/light themes; any color scheme should work.
(3) Click on a var or function name to change it.
(4)(21) Call it Major Weakness and have it talk to you like a drill instructor all day long through a dedicated PA. (6) General Gluttony.
(5) If it has a really good idea about the importance of publications, it could offer nothing for weeks until a must-read comes along. (7) A comment section where various AIs battle out what part of the article needs improvement. (10) Just let it run indefinitely; should be merged with (5). Have that propose research topics worthy of special attention. (12) and (26) can also be merged with (5). Give it security cameras too! Maybe an API for (11). Also merge (14) into this and have it suggest relevant formal courses on the side.
(9)(28) Extension yes, persona no.
(11) Sounds completely awesome, can adjust to the budget and be a tool to hire professionals for special effects and for all other things. Let the unfinished product be the search query.
Could even join the personal drill instructor at the hip and make personal training videos and nutritional journeys. Things like "How I failed to do 100 pull-ups per day" should make a hilarious movie. The plot writes itself.
(13)(16)(17) The platforms wanting to own your data and be in charge of suggestions is really holding things back. I've had wonderful YouTube suggestions several times, only for them to be polluted with mainstream garbage (as a punishment for watching two videos) at the expense of everything I actually wanted to watch. If I watch 5 game videos or 3 conspiracy vlogs, that doesn't mean I want to give up on my profession?!? wtf?
I had this thought that most are overdoing things. When semi-successful, you can just discontinue the front end and let the users figure it out. [Say] Reddit doesn't need an app and it doesn't need a website. (23) Just let the user figure out the feed. A platform could sell their existing version as a separate product.
(15) Sounds wonderful, but it's similar to (5) and (20); make it into one thing.
(18) Sounds awesome. (8) Rather than do something, have the AI create a thing that does the thing. (27) is too similar to be a different thing.
(19) I like the idea of having the AI think long and hard about a response that is as short as possible. It can probably come up with hilarious things.
(24) Sounds great for exploring the earthly realm.
(25) Could do many variations of people search. Authors by context seems obviously good.
This post with quotes rather than numbers: https://pastebin.com/raw/D9zBEy72
Some of the suggestions might be useful if they could be made less wasteful energy-wise; some indicate the author's false perceptions of what LLMs and transformer models do; and some are frightening from a mass-surveillance (and other) perspectives.
[1] https://www.cerbos.dev/blog/productivity-paradox-of-ai-codin...
A hybrid of Strong (the lifting app) and ChatGPT where the model has access to my workouts, can suggest improvements, and coach me. I mainly just want to be able to chat with the model knowing it has detailed context for each of my workouts (down to the time in between each set).
Strong really transformed my gym progression; I feel like it's autopilot for the gym. I have 4 routines I rotate through (I'll often switch it up based on equipment availability), but I'm sure an integrated AI coach could optimize them.