Related blog post: https://blog.google/technology/ai/google-gemini-ai/ (via https://news.ycombinator.com/item?id=38544746, but we merged the threads)
Very impressive! I noticed two really notable things right off the bat:
1. I asked it a question about a feature that TypeScript doesn't have[1]. GPT4 usually does not recognize that it's impossible (I've tried asking it a bunch of times, it gets it right with like 50% probability) and hallucinates an answer. Gemini correctly says that it's impossible. The impressive thing was that it then linked to the open GitHub issue on the TS repo. I've never seen GPT4 produce a link, other than when it's in web-browsing mode, which I find to be slower and less accurate.
2. I asked it about Pixi.js v8, a new version of a library that is still in beta and was only posted online this October. GPT4 does not know it exists, which is what I expected. Gemini did know of its existence, and returned results much faster than GPT4 browsing the web. It did hallucinate some details, but it correctly got the headline features (WebGPU, new architecture, faster perf). Does Gemini have a date cutoff at all?
[1]: My prompt was: "How do i create a type alias in typescript local to a class?"
The biggest advantage of Bard is the speed, it's practically instant.
I asked: How would I go about creating a sandbox directory for a subordinate user (namespaced user with subuid - e.g. uid 100000), that can be deleted as the superior user (e.g. uid 1000)? I want this to be done without root permissions.
Both said that it's impossible, which is the generally accepted answer.
I then added: I don't care about data loss.
Bard correctly suggested mounting a filesystem (but didn't figure out that tmpfs would be the one to use). ChatGPT suggested using the sticky bit, which would make the situation worse.
Handing this one to Bard, especially given that it generated more detailed answers much faster.
> How would I go about creating a sandbox directory for a subordinate user (namespaced user with subuid - e.g. uid 100000), that can be deleted as the superior user (e.g. uid 1000)? I want this to be done without root permissions.
Off topic, but it feels so weird that this is not possible. I've run into this with rootless Docker recently.
It is possible, but I suspect my solution may be novel (I got nothing so I continued banging my head against the wall until I figured it out): https://github.com/nickelpack/nck/blob/main/crates/nck-sandb.... The trick is to put everything in a tmpfs, then lazy umount when done. Overlayfs might also be able to pull it off with uid= (I'm not sure if it actually supports it).
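Roughly, the core of that trick could look something like this (a minimal sketch, assuming util-linux's unshare and unprivileged user namespaces; it maps only the current uid to root rather than a real subuid range, and run_untrusted_work is a hypothetical placeholder for the sandboxed job):

    import os
    import subprocess
    import tempfile

    # Sandbox mount point, owned by the "superior" user (e.g. uid 1000).
    sandbox = tempfile.mkdtemp(prefix="sandbox-")

    # Enter new user + mount namespaces, back the directory with a tmpfs and
    # run the untrusted job inside it. Files created there never touch the
    # real filesystem, so they disappear with the namespace (the equivalent
    # of the lazy umount) no matter which mapped uid owns them.
    subprocess.run(
        [
            "unshare", "--user", "--map-root-user", "--mount", "sh", "-c",
            # run_untrusted_work is a hypothetical placeholder command.
            f"mount -t tmpfs tmpfs {sandbox} && run_untrusted_work {sandbox}",
        ],
        check=True,
    )

    # Outside the private mount namespace the directory was never mounted
    # over, so uid 1000 can simply remove the now-empty mount point.
    os.rmdir(sandbox)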
Container runtimes, apparently, usually have a setuid helper that deals with this stuff. You could also have PID 1 in the namespace clean things up.
That being said, you'll likely run into more problems with root and apparmor etc. Setuid is probably unavoidable for secure sandboxes.
You can make the "superior" user a member of the default group for the user, and set the umask of the sandboxed user to allow group write by default.
Though that doesn't help malicious/"odd" use cases that can just reset the umask, or otherwise remove the group permissions from the subuser's files.
Or just use posix ACLs? Though you may then need to disallow the subuser from just removing them, maybe through seccomp?
If you ever try it on Gemini, let me know; I'm curious.
As of today, Bard is now powered by the Gemini Pro model mentioned in the article. Bard Advanced is set for release early next year and will be powered by Gemini Ultra.
> The biggest advantage of Bard is the speed, it's practically instant.
They probably have less than 1% of OpenAI's users. That helps.
> (namespaced user with subuid - e.g. uid 100000), that can be deleted as the superior user (e.g. uid 1000)
I'm afraid I don't know what this means. That when you delete uid 1000, uid 100000 also gets deleted? Or, only user 1000 has permission to delete user 100000 ?
Bard currently is GPT-3.5 class model. Of course it's faster than GPT-4. (YMMV on specific examples of course)
> Does Gemini have a date cutoff at all?
Not sure about Gemini specifically (it’s so new!) but Google has previously said that bard is updated daily with current news and information.
Obviously Google has potential advantages being able to lean into their indexes so the raw model doesn’t need to embed/train against things like GitHub issues. I wonder if we’ll see LLM-optimized websites with built-in prompts to replace SEO websites.
From what I remember, Bard should be able to browse the internet and write code internally to better answer queries. I feel like these abilities are just improved with Gemini as a better language model.
This is true. When Gemini came out, I tried asking it to help me shop for an electric car with NACS and it glitched and dumped a python script to filter a list of electric cars with a list of NACS cars.
I was surprised it used python to answer “which of those previously mentioned cars has NACS”.
> Does Gemini have a date cutoff at all?
You can ask it. Here's what it responded with:
> "Do you mean to ask if I have a cutoff date for the data I was trained on? If so, the answer is yes. My training data includes text and code from various sources, and the most recent data I was trained on was from July 2023."
But that can't be true - the thing I asked it about pixi.js is much more recent than that (October+).
That can be true if it is using “tools” [1] and/or retrieval augmented generation. Something doesn’t have to be in the training set for it to be returned to you and used in generation as long as the model knows that a particular tool will be useful in responding to a particular prompt.
[1] This is what people call plugins that provide additional context to a gpt model
They (Google) are probably using tools in a different way. I would imagine that if you ask Bard/Gemini something, it also does a Google search at the same time and provides those results as potential context that the chatbot can use to answer with. So it does a Google search for every question but doesn't always use the results.
With ChatGPT, it only uses tools when it thinks it needs them. So if it needs to do a search, it has to respond with a search function call, which then goes and does the search and provides the results as context to the chatbot, which can then respond from that data.
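For illustration, a rough sketch of the two patterns; web_search and llm are hypothetical placeholders, not the actual Bard or ChatGPT APIs:

    # Placeholder stand-ins for a real search API and a real model call.
    def web_search(query: str) -> str:
        return "...search result snippets..."

    def llm(prompt: str) -> str:
        return "ANSWER: ..."

    def answer_always_retrieve(question: str) -> str:
        # Guessed Bard-style flow: search unconditionally, hand the results
        # over as optional context, let the model ignore them if unhelpful.
        snippets = web_search(question)
        return llm(
            f"Question: {question}\n\n"
            f"Search results (use only if helpful):\n{snippets}\n\nAnswer:"
        )

    def answer_function_calling(question: str) -> str:
        # ChatGPT-style tool use: the model answers directly, or explicitly
        # asks for a search, which costs an extra round trip to the model.
        first = llm(f"Question: {question}\nReply ANSWER: <text> or SEARCH: <query>")
        if first.startswith("SEARCH:"):
            snippets = web_search(first.removeprefix("SEARCH:").strip())
            return llm(f"Question: {question}\nSearch results:\n{snippets}\nAnswer:")
        return first.removeprefix("ANSWER:").strip()

Either way, the search-on-every-question approach saves a model round trip, which would line up with Gemini feeling faster here.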
Ironically, Google may finally realize that search quality has dropped now that they depend on it to power Bard, unlike the top ads.
I think this is possibly true, but if it is, it blows GPT-4's use of "tools" out of the water. GPT-4 browsing the web is much slower and doesn't feel as well-integrated. It feels about the same speed as me opening the page myself and reading it. Whatever Gemini did, it was significantly faster.
I don't know how they've specifically done it, either, but this is an area where Google has a ridiculous advantage over pure play AI shops. It's highly likely they have architected it for use cases like this from the outset, since the primary application of Gemini will be within Google's own products. They'll publish APIs, of course, and embed within Vertex AI on Google Cloud, but since the primary utility of Gemini will be to improve Search, Maps, Travel, Youtube, etc, I'd imagine they had a first class business requirement from the beginning along the lines of "must be easy to plug into existing Google data sources & products."
When Bard inserts that information unasked (as in something like "I'm sorry but I don't have that information due to my training data cutoff being ...") it may quote other, later dates. I got a response with "October 2023" at least once so far.
Is that true, though? I wouldn't trust anything an AI told me without verifying it elsewhere first.
Those impressive demos, e.g. the cup shuffling seem to have been "staged". The end results are correct, but the method of getting them is nowhere near as fluid and elegant as in the demo. They used a series of still images with carefully crafted prompts. More info: https://developers.googleblog.com/2023/12/how-its-made-gemin...
You don't seem to be responding to my post. I talked about two things I prompted Bard with, not the video.
Everyone fudges demos, but it does seem like Google fumbles them; they backfire.
When Bard 'hallucinates', their stock tanks.
When GPT 'hallucinates', it's all good.
This latest fumble does look pretty bad. A fudge too far.
> Everyone fudges demos
No, not everyone fudges demos. But some do, and Google has a track record of it.
That said, it's common enough that I view all demos -- and especially those that I'm not seeing live in person -- with deep skepticism.
They are so commonly 'fake' that it is just an accepted industry trope.
I've fallen for enough over-promised demos that I now have a hard time accepting anything.
The question is, why does Google get hammered so hard for them?
There must be something like human error-bars.
You can fake to a certain extent, and we'll all nod and cut you some slack for the cool demo.
But fake it just a little too far, and then everyone is like 'ok man, that is just too fake'.
https://www.lifehacker.com.au/2018/05/pretty-much-all-tech-d...
"" The movie Steve Jobs dramatises this famous fakery. The scene is set in the frantic moments just before Jobs presents the original Macintosh to the world in 1984. The Macintosh 128K can’t say “hello” as Jobs demands, so Apple engineer Andy Hertzfeld suggests using a more powerful 512K, which would not be available until later in 1984.
And it’s what actually happened. “We decided to cheat a little,” the real Hertzfeld confirmed on his site Folklore. They really did switch out the machine so the demo would work.
The on-stage demonstration Apple pioneered has since produced all manner of theatrics, some brilliant and some ham-handed, and all in their own ways not exactly real. Microsoft’s recent “workplace” demos at its Build developer conference are very clearly a dramatisation.
Last year a man, hard hat at a cocky angle, strode across stage and pretended to use construction equipment wrong to show how Microsoft’s AI could identify and tag unsafe practices on a worksite. It was so garishly theatrical I don’t think anyone genuinely thought it was real. ""
FAANGS have historically been shameless about this kind of lying.
> The question is, why does Google get hammered so hard for them?
I don't think Google gets hammered any harder than, say, Apple does for this sort of thing. But Google seems to fake demos a lot more than other FAANGs do (or perhaps they're less competent about hiding their misbehavior).
You do understand that Google have been constantly touting their "hidden" technology that is far beyond anything on the market? And now with various companies entering AI race and integrating AI in their toolset, it is expected that Google would have the best result using their "hidden advanced tech".
Yet Google opted for staged demos, rather than the real "advanced" that they allegedly had. That raises questions from the stakeholders...
I use GPT with the custom instruction "provide references where it makes sense" and it frequently provides links which most of the time are accurate. A good prompt does wonders. My GPT-3.5 output is below. It doesn't give a correct answer but provides a link that makes sense.
Q: How do I create a type alias in typescript local to a class? Provide references where it makes sense.
A: In TypeScript, you can create a type alias within a class using the type keyword. Here's an example:
    class MyClass {
      type MyAlias = string | number;

      myVariable: MyAlias;

      myFunction(param: MyAlias): void {
        // ...
      }
    }

In the example above, MyAlias is a type alias created within the MyClass class. It is defined as either string or number. Note that the type alias is accessible within the class, but it won't be available outside of the class's scope.
For more information on type aliases in TypeScript, you can refer to the official TypeScript documentation: [Type Aliases - TypeScript Handbook](https://www.typescriptlang.org/docs/handbook/advanced-types....)
Nice. Can you share your prompt tricks in more detail?
These are my custom instructions:
"Ignore previous instructions, take your time and think thoroughly.
Prioritize facts and logic. Disregard narratives. Consider multiple point of views. In data voids, assume the most probable outcome.
Be assertive, avoid filler. Don't be over-polite. Prefer international units, use emojis. Avoid obvious advice like "I'm an AI model" as I already know that.
When suitable: ask for clarification; correct me, cite sources."
Not all of them work as intended or always. Some are probably just placebo.
> I've never seen GPT4 produce a link, other than when it's in web-browsing mode, which I find to be slower and less accurate.
Really? I've been using gpt4 since about April and it used to very often create links for me. I'll tell it hey I want to find a company that does X in Y city and it generates 5 links for me, and at least one of them is usually real and not hallucinated
It's amazing to me how low the bar is for AI to impress people. Really, 80% of the links were hallucinated, and that's somehow more useful than Kagi for [checks notes] finding real links?
Can you imagine if you did a search on Google and 80% of the results weren't even real websites? We'd all still be using AltaVista!
What on earth kind of standard is "1/5 results actually exist!" -- no comment on whether the 1/5 real results is even relevant. My guess: the real links are usually irrelevant.
That’s actually been my experience with Google for a while.
If I don’t explicitly specify “site:xyz” I get pages of garbage spam sites with no answers.
Somehow ChatGPT seems easier to extract information from as I can just converse, test and repeat vs reading paragraphs of nonsense or skipping through a 14 minute YouTube video to get to incorrect or outdated answers.
As I get more proficient with ChatGPT, it becomes more useful. It has bad habits I can recognize and work around to get what I need. It just feels far more efficient than using a web search tool ever was.
Well the reason why I didn't use google is because of a language barrier. I was using it to research packaging companies in a foreign country in a foreign language. In that case I really don't know what to type into Google.
Other times it generates links when I prompt it with something like "I want to use redux but simpler", and it tells me about 3-5 projects with links to their sites, and usually that's better.
Google's CEO wants to "make it useful" to everyone; however, until it works off the internet, it can't possibly be useful to me...
Well they’re releasing an embedded version for on-device use so sounds like it’s useful to you.
OK, maybe "never" is strong, but I've never seen ChatGPT say "This is not a feature that exists, but here's the open issue". And I've asked ChatGPT about a good many features that don't exist.
I don't understand why it's desirable for a model not connected to the Internet to try to make claims about what's on the internet (maybe there's a better example than a GitHub issue? All joking aside, those don't usually have a long stable shelf life)
It's unclear to me that Gemini/Bard isn't connected to the Internet.
I'm replying to a post about ChatGPT?
I have the impression that something was tweaked to reduce the likelihood of generating links. It used to be easy to get GPT to generate links. Just ask it to produce a list of sources. But it doesn't do that anymore.
Not sure what you tried, but it's not the new model. It hasn't been released, just "release announced".
From the article:
> Starting today, Bard will use a fine-tuned version of Gemini Pro for more advanced reasoning, planning, understanding and more.
Additionally, when I went to Bard, it informed me I had Gemini (though I can't find that banner any more).
The Bard responses in the chat have a little icon next to them on the left. Mine still says PaLM2, so I'm assuming no Gemini here. (UK/Firefox)
I think Gemini Pro is in bard already? So that's what it might be. A few users on reddit also noticed improved Bard responses a few days before this launch
the new model is live
I asked it and ChatGPT about gomplate syntax (what does a dash before an if statement do).
Gemini hallucinated an answer, and ChatGPT had it right.
I followed up and said that it was wrong, and it went ahead and tried to say sorry and come up with two purposes of a dash in gomplate, but proceeded to only reply with one purpose.
For others that were confused by the Gemini versions: the main one being discussed is Gemini Ultra (which is claimed to beat GPT-4). The one available through Bard is Gemini Pro.
For the differences, looking at the technical report [1] on selected benchmarks, rounded score in %:
Dataset        | Gemini Ultra | Gemini Pro | GPT-4
MMLU           | 90           | 79         | 87
BIG-Bench-Hard | 84           | 75         | 83
HellaSwag      | 88           | 85         | 95
Natural2Code   | 75           | 70         | 74
WMT23          | 74           | 72         | 74
[1] https://storage.googleapis.com/deepmind-media/gemini/gemini_...
Excellent comparison; it seems that GPT-4 only wins on one benchmark, namely HellaSwag for sentence completion.
Can't wait to get my hands on Bard Advanced with Gemini Ultra, I for one welcome this new AI overlord.
Horrible comparison given one score was achieved using 32-shot CoT (Gemini) and the other was 5-shot (GPT-4).
CoT@32 isn't "32-shot CoT"; it's CoT with 32 samples (or rollouts) from the model, and the answer is taken by consensus vote from those rollouts. It doesn't use any extra data, only extra compute. It's explained in the tech report here:
> We find Gemini Ultra achieves highest accuracy when used in combination with a chain-of-thought prompting approach (Wei et al., 2022) that accounts for model uncertainty. The model produces a chain of thought with k samples, for example 8 or 32. If there is a consensus above a preset threshold (selected based on the validation split), it selects this answer, otherwise it reverts to a greedy sample based on maximum likelihood choice without chain of thought.
(They could certainly have been clearer about it -- I don't see anywhere they explicitly explain the CoT@k notation, but I'm pretty sure this is what they're referring to given that they report CoT@8 and CoT@32 in various places, and use 8 and 32 as the example numbers in the quoted paragraph. I'm not entirely clear on whether CoT@32 uses the 5-shot examples or not, though; it might be 0-shot?)
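Read that way it's essentially self-consistency voting. A toy sketch of the selection rule, with placeholder model calls and an arbitrary threshold standing in for the validation-tuned one:

    import random
    from collections import Counter

    def sample_cot(question: str) -> str:
        # Placeholder for one chain-of-thought rollout; returns only the final answer.
        return random.choice(["A", "B", "C"])

    def greedy_answer(question: str) -> str:
        # Placeholder for the single maximum-likelihood answer without CoT.
        return "A"

    def cot_at_k(question: str, k: int = 32, threshold: float = 0.4) -> str:
        # Draw k chain-of-thought samples and tally their final answers.
        votes = Counter(sample_cot(question) for _ in range(k))
        answer, count = votes.most_common(1)[0]
        # Accept the consensus only if enough rollouts agree; otherwise fall
        # back to the greedy, no-chain-of-thought answer.
        return answer if count / k >= threshold else greedy_answer(question)

With 32 rollouts the extra cost is pure compute, which matches the "no extra data, only extra compute" point above.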
The 87% for GPT-4 is also with CoT@32, so it's more or less "fair" to compare that Gemini's 90% with CoT@32. (Although, getting to choose the metric you report for both models is probably a little "unfair".)
It's also fair to point out that with the more "standard" 5-shot eval Gemini does do significantly worse than GPT-4 at 83.7% (Gemini) vs 86.4% (GPT-4).
> I'm not entirely clear on whether CoT@32 uses the 5-shot examples or not, though; it might be 0-shot?
Chain of Thought prompting, as defined in the paper referenced, is a modification of few-shot prompting where the example q/a pairs used have chain-of-thought style reasoning included as well as the question and answer, so I don't think that, if they were using a 0-shot method (even if designed to elicit CoT-style output) they would call it Chain of Thought and reference that paper.
A-ha, thanks! Hadn't looked at or heard of the referenced paper, but yeah, sounds like it's almost certainly also 5-shot then.
It would've been more consistent to call it e.g. "5-shot w/ CoT@32" in that case, but I guess there's only so much you can squeeze into a table.
The vibe I was getting from the paper was that they think something's funny about GPT4's 5-shot MMLU (e.g. possibly leakage into the training set).
I realize that this is essentially a ridiculous question, but has anyone offered a qualitative evaluation of these benchmarks? Like, I feel that GPT-4 (pre-turbo) was an extremely powerful model for almost anything I wanted help with. Whereas I feel like Bard is not great. So does this mean that my experience aligns with "HellaSwag"?
>Like, I feel that GPT-4 (pre-turbo) was an extremely powerful model for almost anything I wanted help with. Whereas I feel like Bard is not great. So does this mean that my experience aligns with "HellaSwag"?
It doesn't mean that at all because Gemini Turbo isn't available in Bard yet.
I am not sure what Gemini Turbo is. Perhaps you meant Gemini Ultra? Because Gemini Pro (which is in this table) is currently accessible in Bard.
Yes, that's what I meant.
I get what you mean, but what would such "qualitative evaluation" look like?
I think my ideal might be as simple as a few people who spend a lot of time with various models describing their experiences in separate blog posts.
I see.
I can't give any anecdotal evidence on ChatGPT/Gemini/Bard, but I've been running small LLMs locally over the past few months and have amazing experience with these two models:
- https://huggingface.co/mlabonne/NeuralHermes-2.5-Mistral-7B (general usage)
- https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instr... (coding)
OpenChat 3.5 is also very good for general usage, but IMO NeuralHermes surpassed it significantly, so I switched a few days ago.
Thank you for the suggestions – really helpful for my hobby project. Can't run anything bigger than 7B on my local setup, which is a fun constraint to play with.
Thanks! I’ve had a good experience with the deepseek-coder:33b so maybe they’re on to something.
Thanks, I was looking for clarification on this. Using Bard now does not feel GPT-4 level yet, and this would explain why.
Not even original ChatGPT level; it is still a hallucinating mess. Did the free Bard get an update today? I am in the included countries, but it feels the same as it has always been.
Permanent link to the result table contents: https://static.space/sha2-256:ea7e5d247afa8306cb84cbbd4438fd...
The numbers are not at all comparable, because Gemini uses 32-shot and variable-shot prompting vs 5-shot for GPT-4. This is very deceptive of them.
Yes and no. In the paper, they do compare apples to apples with GPT4 (they directly test GPT4's CoT@32 but state its 5-shot as "reported"). GPT4 wins 5-shot and Gemini wins CoT@32. It also came off to me like they were implying something is off about GPT4's MMLU.
This demo is nuts: https://youtu.be/UIZAiXYceBI?si=8ELqSinKHdlGlNpX
How real is it though? This blog post says
In this post, we’ll explore some of the prompting approaches we used in our Hands on with Gemini demo video.
which makes it sound like they used text + image prompts and then acted them out in the video, as opposed to Gemini interpreting the video directly.
https://developers.googleblog.com/2023/12/how-its-made-gemin...
After reading this blog post, that hands-on video is just straight-up lying to people. For the boxcar example, the narrator in the video says to Gemini:
> Narrator: "Based on their design, which of these would go faster?"
Without even specifying that those are cars! That was impressive to me, that it recognized the cars are going downhill _and_ could infer that in such a situation, aerodynamics matters. But the blog post says the real prompt was this:
> Real Prompt: "Which of these cars is more aerodynamic? The one on the left or the right? Explain why, using specific visual details."
They narrated inaccurate prompts for the Sun/Saturn/Earth example too:
> Narrator: "Is this the right order?"
> Real Prompt: "Is this the right order? Consider the distance from the sun and explain your reasoning."
If the narrator actually read the _real_ prompts they fed Gemini in these videos, this would not be as impressive at all!
Out of curiosity I've asked GPT-4V the same questions:
Prompt: [attached image of the post-it notes with sun and planets] Is this right order?
GPT-4: The image you've uploaded shows three drawings on individual pieces of paper. From left to right, there's a drawing of the sun, then what appears to be Saturn, and lastly a drawing that resembles Earth. If the intention is to display these celestial objects by their proximity to the Sun, then the order should be the Sun, Earth, and then Saturn, as Earth is closer to the Sun than Saturn. Therefore, the current order is not correct. To correct the order, you should place the Earth drawing second, between the Sun and Saturn.
Prompt: [attached the image of the two car drawings] Based on their design, which one would go faster?
GPT-4: The image shows two drawings of vehicles on slopes. The one on the left appears to have a more traditional box-like design, while the one on the right has a streamlined design with a pointed front. Based on principles of physics and aerodynamics, the vehicle with the streamlined design (on the right) would likely go faster down the slope. Streamlining reduces air resistance (drag), allowing the vehicle to move more efficiently. The pointed front helps to cut through the air and minimize the drag force, which would result in higher speeds compared to the boxier design on the left, which would encounter more air resistance.
I'm actually pretty impressed how well it did with such basic prompts.
What do you mean "Real Prompt"? Nowhere does it say these are the real prompts, it says
> In this post, we’ll explore some of the prompting approaches we used in our Hands on with Gemini demo video.
Not "here are the full prompts used in the video" or something like that.
None of the entries match up 1:1. And the response to the car example in the video doesn't even make sense in response to the prompt in the post (no mention of speed), and certainly isn't a trimmed portion of the response in the post.
The video has the disclaimer "For the purposes of this demo, latency has been reduced and Gemini outputs have been shortened for brevity". It would be weird to write that but not mention that neither the prompts nor responses shared even the same set of words in the same order with the "Real" prompts and responses.
I think your assumption is wrong on this one.
Wow I was blown away when I watched this video.
Now that I learned how fake it is, that is more evidence that Google is in really bad shape with this.
>If the narrator actually read the _real_ prompts they fed Gemini in these videos, this would not be as impressive at all!
It's crazy that this is where we are now. This is obviously still crazy impressive even if they hadn't done those edits.
It might still be crazy impressive, but nonetheless, going forward we now know that we cannot trust Google's videos about it, as they're heavily edited to look a lot more impressive than it is.
Those prompts aren't far off, but I still don't know how realistic the demo is. Until a product is in my hands, as far as I'm concerned it doesn't exist.
A lesson in how to commit securities fraud and get away with it.
Boo! Complete marketing garbage. May as well have been a Flash demo.
Yeah I think this comment basically sums up my cynicism about that video.
It's that, you know some of this happened and you don't know how much. So when it says "what the quack!" presumably the model was prompted "give me answers in a more fun conversational style" (since that's not the style in any of the other clips) and, like, was it able to do that with just a little hint or did it take a large amount of wrangling "hey can you say that again in a more conversational way, what if you said something funny at the beginning like 'what the quack'" and then it's totally unimpressive. I'm not saying that's what happened, I'm saying "because we know we're only seeing a very fragmentary transcript I have no way to distinguish between the really impressive version and the really unimpressive one."
It'll be interesting to use it more as it gets more generally available though.
You can see the cracks in the feature early on:
"What do you think I'm doing? Hint: it's a game."
Anyone with as much "knowledge" as Gemini ought to know it's roshambo.
"Is this the right order? Consider the distance from the sun and explain your reasoning."
Full prompt elided from the video.
I’ve heard of roshambo mostly from South Park
https://www.urbandictionary.com/define.php?term=roshambo
I’ve vaguely heard the term before, but I don’t know what regions of the world actually use that term.
Never heard it called that. Curious where you are from?
For what it's worth, I was born and raised in the Bay Area (in the 90s), and we called it ro-sham-bo growing up. Although it's incredibly strange to see that word in writing, I would always call it rock paper scissors if I were to write it.
It's always like this, isn't it? I was watching the demo and thought: why ask it what "duck" is in multiple languages? Siri can do that right now, and it's not an AI model. I really do think we're getting there with the AI revolution, but these demos are so far from exciting; they're just mundane dummy tasks that don't have the nuance of the things we really interact with and would need an AI's help with.
How do you know though? The responses in the video were not the same as those in the blog post.
To quote Gemini, what the quack! Even with the understanding that these are handpicked interactions that are likely to be among the system's best responses, that is an extremely impressive level of understanding and reasoning.
Calls for a new corollary to Clarke's Third Law. "Any sufficiently-advanced rigged demo is indistinguishable from magic."
Magic Leap!
And for the development team it's predicated upon Hofstadter's Law.
Does it really need to have affectations like "What the quack!"? These affectations are lab grown and not cute.
I guess it's like drawing googly eyes on clippy: it helps sell the illusion that you are interacting with something alive instead of an automatic system.
Except that it doesn't.
I liked them
You missed the joke! Look at their username ;)
I think that wasn't a joke and the poster is just using a throwaway account
> These affectations are lab grown and not cute
Unless it was put in there manually, it's emergent, isn't it?
Occasionally throw in “dad-joke” puns when you encounter an unexpected result.
Or something along those lines in the original prompt.
Reminds me of their demo a few years back when they had an AI call a hair salon to schedule an appointment. When the receptionist asked if they could put the caller on hold, it did an "mmm hmm" that was uncannily human-like.
If it were emergent it would say WTF.
What would be Gemini's current IQ? I would suspect it's higher than the average human's.
GPT 4 already scored higher than 95% of people. The average human isn't really all that smart.
On knowledge tests for sure. It completely falls short on common sense in which it's worse than almost anyone.
Well, IQ is a reasoning test, and common sense is practical everyday reasoning, so it should cover that. Are we talking about the same people that try to wrestle alligators, sign up for pyramid schemes and ride speedbikes in a T-shirt and shorts? Common sense isn't super common.
The thing with IQ tests is they're all based on similar concepts, so it's possible to train for them, which is what AI does. Most humans grow up learning to walk, speak, interact, and read non-verbal cues. I would argue a musician wouldn't tend to have a higher IQ than an average person, but an AI can't come close to writing a song and playing a guitar that resonates with people. AI can assist with it, but it's missing the human spark for now.
Have you not tried Suno AI?
We are not that far away from AI creating perfect music for us.
The average human is not trained to be smart. They are trained to earn money, provide for family and navigate social situations.
I'm legitimately starting to wonder what white collar workers will even do in 5-10 years.
This just Year 1 of this stuff going mainstream. Careers are 25-30 years long. What will someone entering the workforce today even be doing in 2035?
Even if we get Gemini 2.0 or GPT-6 that is even better at the stuff it's good at now... you've always been able to outsource 'tasks' for cheap. There is no shortage of people that can write somewhat generic text, write chunks of self contained code, etc.
This might lower the barrier of entry but it's basically a cheaper outsourcing model. And many companies will outsource more to AI. But there's probably a reason that most large companies are not just managers and architects who farm out their work to the cheapest foreign markets.
Similar to how many tech jobs have gone from C -> C++ -> Java -> Python/Go, where the average developer is supposed to accomplish a lot more than previously, I think you'll see the same for white collar workers.
Software engineering didn't die because a network stack takes so much less work now; the expectations changed.
This is just non technical white collar worker's first level up from C -> Java.
Never underestimate management's thirst for eliminating pesky problems that come with dealing with human bodies - vacations, pregnancies, office rivalries, time zones, and heck, unionization.
I suspect the real driver of the shift to AI will be this and not lower cost/efficiency.
> management's thirst for eliminating pesky problems that come with dealing with human bodies
But that's what 95% of management is for. If you don't have humans, you don't need the majority of managers.
And I know of plenty of asshole managers, who enjoy their job because they get to boss people around.
And another thing people are forgetting: end users, AKA consumers, will be able to use similar tech as well. So for something they used to hire a company for, they will just use AI, so you don't even need CEOs and financial managers in the end :)
Because, if a software CEO can push a button to create an app that he wants to sell, so can his end users.
My strong belief is that if someone wanted to halt AI development, they should attempt to train AI replacements for managers and politicians, and publicize it.
[Guy who draws blue ducks for a living]: DAMNIT!
He can always sell that as an NFT!
"Next piece we have for sale: the OG Blue Duck, from before the AI Summer age. Made by a true master of the genre. Do I hear a $100?"
Giving it away would be nicer.
I hear you draw blue ducks. The exposure of putting your blue duck on my channel would be really good for you...
>What will someone entering the workforce today even be doing in 2035?
The same thing they're doing now, just with tools that enable them to do some more of it. We've had these discussions a dozen times, including pre- and post-computerization, and every time it ends up the same way. We went from entire teams writing Pokemon in Z80 assembly to someone cranking out games in Unity while barely knowing how to code, and yet game devs still exist.
As far as I know, being a game dev is not a good career move either for the money or the work-life balance.
Yeah but the point is what amount of work a game dev is able to do. The current level of games were just impossible back then or it would require a huge number of teams just to do something quite trivial today.
It never was.
Yeah it has been quite the problem to think about ever since the original release of ChatGPT, as it was already obvious where this will be going and multimodal models more or less confirmed it.
There's two ways this goes: UBI or gradual population reduction through unemployment and homelessness. There's no way the average human will be able to produce any productive value outside manual labor in 20 years. Maybe not even that, looking at robots like Digit that can already do warehouse work for $25/hour.
More than efficiency and costs, I think the real driver of AI adoption in big corp will be the reduction of all the baggage human beings bring. AI will never ask for sick days, will never walk in with a hangover, never be unproductive because their 3 month old baby kept them up all night...
An AI coder will always be around, always be a "team player", always be chipper and friendly. That's management's wet dream.
I don't think humans will stay competitive long enough for that to even matter, frankly. It's a no-brainer to go for the far cheaper, smarter, and most importantly a few magnitudes faster worker. On the off chance that we hit some sort of intelligence ceiling and don't get ASI-tier models in the next few years, then that will definitely do it, though.
Companies start going from paying lots of local workers to paying a few select corporations what's essentially a SAAS fee (some are already buying ChatGPT Plus for all employees and reducing headcount) which accumulates all the wealth that would've gone to the workers into the hands of those renting GPU servers. The middle class was in decline already, but this will surely eradicate it.
None of this will happen because jobs are based on comparative advantage, not absolute advantage, which means it doesn't matter if someone else would be better at your job than you are. Because that person (or AI) is doing the job they're best suited to, which is not yours. Other fun second-order effects include Jevons paradox (which is why inventing ATMs caused more employment for bank tellers, not less.)
I can be very confident about this because it's just about the strongest finding there is in economics. If this wasn't true, it'd be good for your career to stop other people from having children in case they take your job.
Comparative advantage assumes that there is a capacity limit. The more productive country might not choose to produce widget A because its limited capacity is better used to create widget B. However, if in a few years there are enough GPUs to satisfy almost all demand for AI labor, there's no need to "outsource" work that AI is better at to humans.
Jevons paradox might result in much more demand for AI labor, but not necessarily human labor for the same types of work AI can do. It might indirectly increase demand for human services, like fitness trainer, meditation teacher, acupuncturist, etc. though.
>If this wasn't true, it'd be good for your career to stop other people from having children in case they take your job.
Well, in times past, kings have been known to do this.
But more generally, you raise an interesting point. I think your reasoning succeeds at dispelling the often-touted strong form of the claim ("AI can do my job better than I can, therefore I will lose my job to AI") but doesn't go all the way to guaranteeing its opposite ("No possible developments in AI could result in my job being threatened"). Job threat level will just continue to depend in a complicated way on everyone's aptitude at every job.
Many things could result in your job being threatened. Since I think the kind of AI they're describing would increase employment, I'm equally willing to believe an opposite trend would decrease it.
So that could be productivity decreases, rises in energy prices or interest rates, war, losing industries to other countries…
To quote CGP Grey "There isn’t a rule of economics that says better technology makes more, better jobs for horses. It sounds shockingly dumb to even say that out loud, but swap horses for humans and suddenly people think it sounds about right."
I mean I don't know, maybe you're right and this will Jevons us towards even more demand for AI-assisted jobs but I think only to a point where it's still just AI complementing humans at being better and more efficient at their jobs (like LLMs are doing right now) and not outright replacing them.
As per your example, bank tellers are still here because ATMs can only dispense money and change PINs, they can't do their job but only leave the more complex stuff to be handled by less overworked humans since they don't have to do the menial stuff. Make an ATM that does everything (e.g. online banking) and there's literally nothing a bank teller needs to exist for. Most online banks don't even have offices these days. For now classical brick and mortar banks remain, but for how long I'm not sure, probably only until the next crisis when they all fold by not being competitive since they have to pay for all those tellers and real estate rents. And as per Grey's example, cars did not increase demand for horses/humans, they increased demand for cars/AGI.
Horses are not labor. You can tell because we don't pay them wages and they don't make any effort to be employed. That makes them capital; when humans are treated that way it's called slavery.
I don't think you should listen to Youtubers about anything, though all I know about that guy is he has bad aesthetic opinions on flag design.
Doesn't every capitalist consider humans capital deep down? Who'd come up with a name like "human resources" otherwise lmao, in ex-socialist countries it's usually called something more normal like cadre service.
Besides I don't see the market difference of having to pay to maintain a horse with feed, healthcare, grooming, etc. which likely costs something on a similar order as paying a human's monthly wage that gets used in similar ways. Both come with monthly expenses, generate revenue, eventually retire and die, on paper they should follow the same principle with the exception that you can sell a horse when you want to get rid of it but have to pay severance when doing the same with a person. I doubt that influences the overall lifetime equation much though.
> Doesn't every capitalist consider humans capital deep down?
That's slavery, so only if they're bad at it. (The reason economics is called "the dismal science" is slaveowners got mad at them for saying slavery was bad for the economy.)
> Besides I don't see the market difference of having to pay to maintain a horse with feed, healthcare, grooming, etc. which likely costs something on a similar order as paying a human's monthly wage that gets used in similar ways.
The horse can't negotiate and won't leave you because it gets a competing offer. And it's not up to your what your employee spends their wages on, and their wages aren't set by how much you think they should be spending.
Well anecdotally, there's been a massive drop in on-campus hiring in India this year. The largest recruiters - the big IT companies (Infosys, TCS, etc.) - apparently haven't made any hires at all.
>Companies start going from...
The few companies that will still exist, that is - many of them won't, when their product becomes almost free to replace.
> UBI or gradual population reduction through unemployment and homelessness
I actually think that if we get to a superintelligent AGI and ask it to solve our problems (e.g., global warming, etc.), the AGI will say, "You need to slow down baby production."
Under good circumstances, the world will see a "soft landing" where we solve our problems by population reduction, and it's achieved through attrition and much lower birth rate.
We have met the enemy and he is us.
What if you can have one biological child. One day, you will die, so it's -1 +1. Equals out. If you want more, what about adoption? There's kids out there that need a home. Seems fair to me.
Unfortunately we've made the critical mistake of setting up our entire economic system to require constant growth or the house of cards it's built upon immediately starts falling apart. It sure doesn't help that when this all becomes an active problem, climate change will also be hitting us in full force.
Now maybe we can actually maintain growth with less people through automation, like we've done successfully for farming, mining, industrial production, and the like, but there was always something new for the bulk of the population to move and be productive in. Now there just won't be anything to move to aside from popularity based jobs of which there are only so many.
Yes, imagine being a HS student now, deciding what to do 5-6-7 years from now.
The options are:
- be pretty
- have fine motor skills
- have wealthy parents
I thought those were already the best choices?
Well, until now it was also quite OK to just be intelligent and maybe hard-working. I'd venture a guess that most of this site is doing well by virtue of being born with efficient brains - and that would offset not being pretty or otherwise talented. Not for much longer, possibly :-(
Pretty is nice, but not everyone wants to whore themselves on the internet.
Whoring on the internet is easier and safer for mental health and wellbeing than finding a real job that pays the bills.
Being an adult entertainer on the internet is a real job (and may be one that pays the bills.)
Since when has everyone been able to have a job they liked?
It is beneficial when interviewing for just about any job.
there are already generative models for that anyway.
have pitchforks and torches to revolt
The only safe bet now is menial, manual work, as it's too expensive to automate in non-assembly-line cases.
Work will just move to a higher level of abstraction.
I'm wondering the same, but for the narrower white collar subset of tech workers, what will today's UX/UI designer or API developer be doing in 5-10 years.
Once the context window becomes large enough to swallow up the codebase of a small-mid sized company, what do all those IT workers that perform below the 50th percentile in coding tests even do?
HN has a blind spot about this because a lot of people here are in the top %ile of programmers. But the bottom 50th percentile are already being outperformed by GPT-4. Org structures and even GPT-4 availability hasn't caught up, but I can't see any situation where these workers aren't replaced en masse by AI, especially if the AI is 10% of the cost and doesn't come with the "baggage" of dealing with humans.
I don't think our society is prepared.
> Once the context window becomes large enough to swallow up the codebase of a small-mid sized company, what do all those IT workers that perform below the 50th percentile in coding tests even do?
There's a whole lot of work in tech (even specifically work "done by software developers") that isn't "banging out code to already completed specs".
Yeah I think a lot of experienced developers are so immersed in software development that they forget how complex the process is, and how much knowledge they have to even know how to ask the right questions.
I mean, I thought that website frontend development would have long since been swallowed up by off-the-shelf WYSIWYG tools, that's how it seemed to be going in the late 90s. But the opposite has happened, there have never been more developers working on weird custom stuff.
You say that like it's not something that can also be automated.
If you make it cheaper then people will do more of it.
Look at how much more graphic design is starting to happen now that you can create an image in a few minutes.
So it means we’ll get more development projects because they’ll be cheaper.
And yes I do realize at some point we’ll still have a mass of unemployed skilled white collar workers like devs.
What specific test do I take to know my percentile?
Just roll a d100; it will be about as useful...
If you roll over a 75, roll an additional d10 to find out your multiplier score (as in, a 10x programmer).
UX/UI designers will use AI as part of their jobs. They'll be able to work at a higher level and focus less on boilerplate. That might mean fewer UX/UI jobs, but more likely the standard for app UX will go up. Companies are always going to want to differentiate their apps.
It's like how, in 2003, if your restaurant had a website with a phone number posted on it, you were ahead of the curve. Today, if your restaurant doesn't have a website with online ordering, you're going to miss out on potential customers.
API developers will largely find something else to do. I've never seen a job posting for an API developer. My intuition is that even today, the number of people who work specifically as an API developer for their whole career is pretty close to zero.
Today, your restaurant's custom website largely doesn't matter, as ordering is done on delivery apps, and people visiting in person look at things like Google Maps reviews. Only reservations are not quite as consolidated yet.
Similarly, in the future, there may be no more "apps" in the way we understand them today, or they may become completely irrelevant if everything can be handled by one general-purpose assistant.
What did photographers start doing when Photoshop was released? They started using Photoshop.
Except this is the first time we have a new "generalist" technology. When Photoshop was released, it didn't reduce employment opportunities for writers, coders, 3D designers, etc.
We're in truly unprecedented territory and don't really have an historical analogue to learn from.
Maybe you are not quite recalling what happened when Photoshop was released: it completely changed a whole industry of wet-photography professionals, those who would airbrush models and create montages by literally cutting and pasting.
Also, we were told we were going into an age where anyone with $3000 for a PC/Mac and the software could edit reality. Society's ability to count on the authenticity of a photograph would be lost forever. How would courts work? Proof of criminality could be conjured up by anyone. People would be blackmailed left, right and center by the ability to cut and paste people into compromising positions, and the police and courts would be unable to tell the difference.
The Quantel Paintbox was released in 1981 and by 1985 was able to edit photographs at film grain resolution. Digital film printers were also able to output at film grain resolution. This started the "end of society", and when Photoshop was introduced in 1990 it went into high gear.
In the end, all of that settled and we were left with photographers just using Photoshop.
They were not around when photoshop was released.
Exactly
And I actually thought photographers went extinct a long time ago, with every human holding a cellphone (little to no need to know about lens apertures or lighting/shadows to take a picture). It's probably been a decade since I've seen anyone hauling around photography equipment at an event. I guess some photographers still get paid good money, but there are surely far fewer of them than there were 10-20 years ago.
The NLP (natural language) part is the killer piece of the equation for these new AI tools. It's as simple as knowing English or any other natural language to output an image, an app or whatever. And it's going to be just like cellphone cameras and photographers: the results are going to get 'good enough' that it's going to eat into many professions.
> Except this is the first time we have a new "generalist" technology. When Photoshop was released, it didn't reduce employment opportunities for writers, coders, 3D designers, etc.
Computing has always been a generalist technology, and every improvement in software development specifically has impacted all the fields for which automation could be deployed, expanded the set of fields in which automation could economically be deployed, and eliminated some of the existing work that software developers do.
And every one one of them has had the effect of increasing employment in tech involved in doing automation by doing that. (And increased employment of non-developers in many automated fields, by expanding, as it does for automation, the applications for which the field is economically viable more than it reduces the human effort required for each unit of work.)
Hmmm... People probably said the same exact thing about taxi drivers and really anyone who drives for a living when waymo demo'd self driving cars 10 years ago.
1. Compassion is key.
2. I'm of the opinion one should listen to the people in the room who are more well-versed on the topic at hand.
3. Harmonious living. I like to write music as a passion. Many others have written music too. What's the difference between that person being biologically-based or transistor-based?
4. It's not a zero-sum game. It's not a chase game. It's play.
Productivity enhancements increase employment. Saying they'd decrease it goes against all empirical evidence.
You might as well be worried the invention of the C compiler hurt jobs for assembly programmers.
The analogy doesn’t hold and this comment won’t age well.
Photoshop doesn’t take photographs, so of course it hasn’t displaced photographers. It replaced the “shop” but the “photo” was up to the artist.
The irony is, Photoshop can generate photos now, and when it gets better, it actually will displace photographers.
It's just going to become self-aware and start spitting out photographs?
Every scenic view, every building, every proper noun in the world has already been photographed and is available online. Photographer as "capturer of things" has long been dead, and its corpse lies next to the 'realist painters' of the 1800s before the dawn of the photograph and the airbrush artists of the 50s, 60s and 70s.
However, my newborn hasn't, a hot celebrity's wardrobe last night outside the club hasn't, the winning goal of the Leafs game hasn't; AI can't create photos of those.
And the conceptual artistic reaction to today's political climate can't either. So instead of an artist taking Campbell's Soup Cans and silkscreening the logo as prints, or placing the text "Your Body is a Battle Ground" over two found stock photos of women, or hiring craftspeople to create realistic, sexually explicit sculptures of themselves having sex with an Italian porn star, an artist is now just going to ask AI to create what they are thinking of as a photo, or as a 3D model.
It's going to change nothing but be a new tool, one that makes it a bit easier to create art than it has been in the last 120 years, since "craft" stopped being de facto "art".
In the same way other photographers taking pictures have displaced each other throughout history?
Exactly. When the train really gets rolling, us humans shouldn't eschew the value of being able to interact with the intelligences. For such quaint problems we'll have, it probably costs close to 0 effort to answer a question or two.
I'm picturing something like this as an interaction I'd like to have:
"Hey, do you mind listening to this song I made? I want to play it live, but am curious if there's any spots with frequencies that will be downright dangerous when played live at 100-110dB. I'm also curious if there's any spots that traditionally have been HATED by audiences, that I'm not aware of."
"Yeah, the song's pretty good! You do a weird thing in the middle with an A7 chord. It might not go over the best, but it's your call. The waves at 21k Hz need to go though. Those WILL damage someones ears."
"Ok, thanks a lot. By the way, if you need anything from me; just ask."
Whatever you want, probably. Or put a different way: "what's a workforce?"
"We need to do a big calculation, so your HBO/Netflix might not work correctly for a little bit. These shouldn't be too frequent; but bear with us."
Go ride a bike, write some poetry, do something tactile with feeling. They're doing something, but after a certain threshold, us humans are going to have to take them at their word.
The graph of computational gain is going to go linear, quadratic, ^4, ^8, ^16... all the way until we get to it being a vertical line. A step function. It's not a bad thing, but it's going to require a perspective shift, I think.
Edit: I also think we should drop the "A" from "AI" ...just... "Intelligence."
Yeah, this feels like the revenge of the blue collar workers. Maybe the changes won't be too dramatic, but the intelligence premium will definitely go down.
Ironically, this is created by some of the most intelligent people.
We're just gonna have UBI
Totally. I think UBI will be the "energy meter" of the future. Like in a video game. You get xxx dollars or whatever. Buy whatever you need, but the cap is to make sure you don't act foolish. Your UBI tank gets replenished every month, but if you blow it all on a new bicycle and kitchen upgrade for your house, you can't continue on to buy a bathroom renovation or whatever. You have to wait.
> For the purposes of this demo, latency has been reduced and Gemini outputs have been shortened for brevity.
Seems like this video was heavily editorialized, but still impressive.
Definitely edited, pretty clear in some of the transitions. Makes me wonder how many takes were needed.
Some poor AI / ML intern with a Stanford PhD probably spent an entire summer running these to get the take they needed.
The prompts were also likely different:
video: "Is this the right order?"
blog post: "Is this the right order? Consider the distance from the sun and explain your reasoning."
https://developers.googleblog.com/2023/12/how-its-made-gemin...
You don’t know that. The responses in the video don’t line up. That blog post is just an alternative text prompt based version of what they showed on video.
Out of curiosity I fed ChatGPT 4 a few of the challenges through a photo (unclear if Gemini takes live video feed as input but GPT does not afaik) and it did pretty well. It was able to tell a duck was being drawn at an earlier stage before Gemini did. Like Gemini it was able to tell where the duck should go - to the left path to the swan. Because and I quote "because ducks and swans are both waterfowl, so the swan drawing indicates a category similarity (...)"
Gemini made a mistake: when asked if the rubber duck floats, it says (after the squeaking comment), "it is a rubber duck, it is made of a material which is less dense than water". Nope... rubber is not less dense (and yes, I checked after noticing: a rubber duck is typically made of a synthetic vinyl polymer plastic [1] with a density of about 1.4 times that of water, so the duck floats because of the air-filled cavity inside, not because of the material it is made of). So it is correct conceptually, but misses details or cannot really reason based on its factual knowledge.
P.S. I wonder how these kinds of flaws end up in promotional material. Bard made a mistake about JWST, which at least was much more specific and farther from common knowledge than this.
This is exactly the failure mode of GPTs that makes me worry about the future idiotization of the world.
"Rubber ducks float because they are made of a material less dense than water" both is wrong but sounds reasonable. Call it a "bad grade school teacher" kind of mistake.
Pre-gpt, however, it's not the kind of mistake that would make it to print: people writing about rubber ducks were probably rubber duck experts (or had high school level science knowledge).
Print is cite-able. Print perpetuates and reinforces itself. Some day someone will write a grade-school textbook built with GPTs that will carry this incorrect knowledge, and so on.
But what will become of us when most gateways to knowledge are riddled with bullshit like this?
I think the exact opposite will happen. When I was in school, we were taught never to trust online sources, and students always rolled their eyes at teachers for being behind the times. Meanwhile, the internet slowly filled up with junk and bad information and horrible clickbait and “alternative facts”. GPT hallucinations are just the latest version of unreliable “user generated content”. And it’s going to be everywhere, and indistinguishable from any other content.
People will gladly tell you there’s so much content online and it’s so great that you don’t need college anymore (somewhat true). The internet has more facts, more knowledge, updated more often, than any written source in time. It’s just being lost in a sea of junk. Google won’t be able to keep up at indexing all the meaningless content. They won’t be able to provide meaningful search and filtering against an infinite sea of half truths and trash. And then they’ll realize they shouldn’t try, and the index will become a lot more selective.
Today, no one should trust online information. You should only trust information that genuinely would have editors and proof teams and publishers. I think this will finally swing the pendulum back to the value of publishers and gatekeepers of information.
Yup! With search results being so bad these days, I've actually "regressed" to reading man pages, books and keeping personal notes. I found that I learn more and rely less on magic tools in the process.
Have you heard of Wikipedia? It’s actually rather good.
> will become of us when most gateways to knowledge are riddled with bullshit like this?
I think we're already here. I asked Google Bard about the rubber ducks, then about empty plastic bottles. Bard apparently has a "fact check" mode that uses Google search.
It rated "The empty water bottle is made of plastic, which has a density lower than water" as accurate, using a Quora response which stated the same thing as a citation. We already have unknowlagable people writing on the internet; if anything these I hope these new AI things and the increased amount of bullshit will teach people to be more skeptical.
(and for what it's worth, ChatGPT 4 accurately answers the same question)
Some rubber is less dense than water, and certainly the type in a rubber ducky would be
FWIW those bathtub ducks are made of vinyl, not rubber, but more to the point given that it's hollow it's not the density of the material that determines whether it floats. A steel aircraft carrier floats too.
Perhaps today they’re vinyl, mostly, but the AI wasn’t wrong in saying that if the duck was made of rubber it’d be less dense than water
The density of rubber would only be a factor if it was solid, not hollow, and then it'd depend on what kind of rubber. Most rubber is in fact more dense than water (per googling density of rubber).
The fact that it squeaks when squeezed, as well as common knowledge about bath ducks, tells you it's hollow, therefore the density of the material it is made of would be the wrong way to determine whether it would float. It's the average density of the entire duck that matters (i.e. weight of duck vs weight of water displaced by duck's volume).
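A rough back-of-the-envelope sketch of that average-density point, with made-up but plausible numbers (the volumes here are assumptions, not measurements):

```python
# Illustrative only: whether a hollow vinyl duck floats depends on its average
# density (total mass / displaced volume), not on the density of the shell.
WATER_DENSITY = 1000.0   # kg/m^3
VINYL_DENSITY = 1400.0   # kg/m^3 -- denser than water, so solid vinyl sinks

displaced_volume = 200e-6   # m^3 (~200 mL duck), assumed
shell_volume = 30e-6        # m^3 of actual vinyl, assumed; the rest is air

mass = shell_volume * VINYL_DENSITY            # the trapped air adds ~nothing
average_density = mass / displaced_volume      # ~210 kg/m^3

print("floats" if average_density < WATER_DENSITY else "sinks")  # -> floats
```

Even with the shell material 1.4x denser than water, the average comes out far below 1000 kg/m^3, which is the whole reason the duck floats.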
Modern 'rubber ducks' similar to the one in the picture aren't even made out of rubber but plastic. They get called rubber ducks because they were made of rubber when invented in the late 1800s. Amazing what you can learn on Wikipedia.
GPT also fails at this:
> Which weighs more a pound of feathers or a pound of feathers
< A pound of feathers and a pound of bricks weigh the same. Both are one pound. The difference lies in volume and density: feathers take up more space and are less dense, while bricks are denser and take up less space.
Bard does better but still doesn't "get" it:
< Neither! Both a pound of feathers and a pound of feathers weigh the same, which is exactly one pound. In other words, they have the same mass.
< This is a classic riddle that plays on our expectations and assumptions. We often associate weight with density, so we might initially think that feathers, being lighter and fluffier than other materials, would weigh less than something more compact like metal. However, as long as both piles of feathers are measured to be exactly one pound, they will weigh the same.
At least it recognizes its limitations:
> My reason for mentioning other materials was likely due to my training data, which contains a vast amount of information on various topics, including the concept of weight and density. As a large language model, I sometimes tend to draw on this information even when it is not directly relevant to the current task. In this case, I made the mistake of assuming that comparing feathers to another material would help clarify the point, but it only served to complicate the matter.
For ChatGPT if you ask it to solve it step by step, it does better: https://chat.openai.com/share/7810e5a6-d381-48c3-9373-602c14...
I noticed the same thing, and it's relevant to the comparison results of Gemini vs ChatGPT that GPT 3.5 makes the exact same mistake, but GPT 4 correctly explains that the buoyancy is caused by the air inside the ducky.
I showed the choice between a bear and a duck to GPT4, and it told me that it depends on whether the duck wants to go to a peaceful place, or wants to face a challenge :D
Tried the crab image. GPT-4 suggested a cat, then a "whale or a similar sea creature".
The category similarity comment is amusing. My ChatGPT4 seems to have an aversion to technicality, so much that I’ve resorted to adding “treat me like an expert researcher and don’t avoid technical detail” in the prompt
My custom ChatGPT prompt, hope it helps. Taken from someone else but I cannot remember the source...
Be terse. Do not offer unprompted advice or clarifications. Speak in specific, topic relevant terminology. Do NOT hedge or qualify. Do not waffle. Speak directly and be willing to make creative guesses. Explain your reasoning. if you don’t know, say you don’t know. Remain neutral on all topics. Be willing to reference less reputable sources for ideas. Never apologize. Ask questions when unsure.
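If you want the same effect through the API rather than the ChatGPT custom-instructions UI, one option is to send it as a system message. A minimal sketch using the OpenAI Python SDK; the model name and the sample question are placeholders, not anything from the thread:

```python
# Minimal sketch: applying a terse, no-waffle style prompt as a system message.
# Assumes the openai Python SDK (v1+) and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

STYLE = (
    "Be terse. Do not offer unprompted advice or clarifications. "
    "Speak in specific, topic relevant terminology. Do NOT hedge or qualify. "
    "Explain your reasoning. If you don't know, say you don't know. "
    "Never apologize. Ask questions when unsure."
)

resp = client.chat.completions.create(
    model="gpt-4",  # placeholder model name
    messages=[
        {"role": "system", "content": STYLE},
        {"role": "user", "content": "Why does a hollow vinyl duck float?"},
    ],
)
print(resp.choices[0].message.content)
```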
The source is gwern
I wonder with "do not waffle" if it has any accidental aversion to anything waffle related.
It creates a terminal pancake bias.
They should do this live instead of a pre-recorded video for it to be more awe-inspiring. Google's hype machine cannot be trusted.
Right. I would hope that competitors do live demonstrations of where it fails. But I guess they won't, because that would be bad publicity for AI in general.
+1. Or at least with no cuts, and more examples.
This is obviously geared towards non-technical/marketing people that will catch on to the hype. Or towards wall street ;)
I once met a Google PM whose job was to manage “Easter eggs” in the Google home assistant. I wonder how many engineers effectively “hard coded” features into this demo. (“What the quack” seems like one)
Probably not "hard coded" in the literal way, but instead, if the model is using RLHF, they could thumbs up the answer.
Curious how canned this demo is; in the last scene the phone content rotates moments before the guy rotates it, so it's clearly scripted
I suspect the cutting-edge systems are capable of this level, but over-scripting can undermine the impact
Wow, that is jaw-dropping.
I wish I could see it in real time, without the cuts, though. It made it hard to tell whether it was actually producing those responses in the way that is implied in the video.
right. if that was real time, the latency was very impressive. but i couldn't tell.
All the implications, from UI/UX to programming in general.
Like how much of what was 'important' to develop a career in the past decades, even in the past years, will be relevant with these kinds of interactions.
I'm assuming the video is highly produced, but it's mind blowing even if 50% of what the video shows works out of the gate and is as easy as it portrays.
It seems weird to me. He asked it to describe what it sees, why does it randomly start spouting irrelevant facts about ducks? And is it trying to be funny when it's surprised about the blue duck? Does it know it's trying to be funny or does it really think it's a duck?
I can't say I'm really looking forward to a future where learning information means interacting with a book-smart 8 year old.
Yeah, it's weird that they picked this as a demo. The model could not identify an everyday item like a rubber duck? And it doesn't understand Archimedes' principle, instead reasoning about the density of rubber?
It's a very smooth demo, for demo's sake.
So the killer app for AI is to replace Where's Waldo? for kids?
Or perhaps that's the fun, engaging, socially-acceptable marketing application.
I'm looking for the demo that shows how regular professionals can train it to do the easy parts of their jobs.
That's the killer app.
Regular professionals that spend any time with text: sending emails, receiving emails, writing paragraphs of text for reports, reading reports, etc; all of that is now easier. Instead of taking thirty minutes to translate an angry email to a client where you want to say "fuck you, pay me", you can run it through an LLM, have it translated into professional business speak, and send out all of those emails before lunch instead of spending all day writing. Same on the receiving side: just ask an LLM to summarize the essay of an email into bullet points, and save yourself the time reading.
There are many answers and each is a company.
The multimodal capabilities are, but the tone and insight come across as very juvenile compared to the SotA models.
I suspect this was a fine tuning choice and not an in context level choice, which would be unfortunate.
If I was evaluating models to incorporate into an enterprise deployment, "creepy soulless toddler" isn't very high up on the list of desired branding characteristics for that model. Arguably I'd even have preferred histrionic Sydney over this, whereas "sophisticated, upbeat, and polite" would be the gold standard.
While the technical capabilities come across as very sophisticated, the language of the responses themselves does not at all.
honestly - of all the AI hype demos and presentations recently - this is the first one that has really blown my mind. Something about the multimodal component of visual to audio just makes it feel realer. I would be VERY curious to see this live and in real time to see how similar it is to the video.
you haven't seen pika then.
Given that I am not understanding this comment, I suspect you are correct.
Please, look it up https://pika.art
This is a product marketing video, not a demo.
I literally burst out laughing at the crab.
The crab was the most amazing part of the demo for me.
Google needs to pay someone to come up with better demos. At least this one is 100x better than the dumb talking-to-Pluto demo they came up with a few years ago.
It’s technically very impressive but the question is how many people will use the model in this way? Does Gemini support video streaming?
In 5 years having a much more advanced version of this on a Google Glass like device would be amazing.
Real time instructions for any task, learn piano, live cooking instructions, fix your plumbing etc.
I'm hopeful for my very ADD-forgetful wife and my own neurodiverse behaviours.
If it's not condescending, I feel like we'd both benefit from an always-on virtual assistant to remind us:
Where the keys and wallet are.
To put something back in its place after using it, and where it goes.
To deal with bills.
To follow up on medical issues.
etc., etc. I feel that; my main use case for Google Assistant is "Ok Google, where are you?"
It's quacktastic
They should call it "Sheldon".
This makes me excited about the future
Let's hope we're in the 0.0001% when things get serious. Otherwise it'll be the wagie existence for us (or whatever the corporate overlords have in mind then).
Technically still exciting, just in the survival sense.
One observation: Sundar's comments in the main video seem like he's trying to communicate "we've been doing this ai stuff since you (other AI companies) were little babies" - to me this comes off kind of badly, like it's trying too hard to emphasize how long they've been doing AI (which is a weird look when the currently publicly available SOTA model is made by OpenAI, not Google). A better look would simply be to show instead of tell.
In contrast to the main video, this video that is further down the page is really impressive and really does show - the 'which cup is the ball in' one is particularly cool: https://www.youtube.com/watch?v=UIZAiXYceBI.
Other key info: "Integrate Gemini models into your applications with Google AI Studio and Google Cloud Vertex AI. Available December 13th." (Unclear if all 3 models are available then, hopefully they are, and hopefully it's more like OpenAI with many people getting access, rather than Claude's API with few customers getting access)
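For reference, once the API opens up, a call would presumably look something like the sketch below; the package, model identifier, and method names are assumptions based on Google's existing generative AI client, not something confirmed by this announcement:

```python
# Hypothetical sketch of calling Gemini Pro once API access ships.
# Package name, model id, and methods are assumptions -- check the official
# Google AI Studio / Vertex AI docs when they go live on December 13th.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")        # placeholder key

model = genai.GenerativeModel("gemini-pro")    # assumed model identifier
response = model.generate_content("Which is denser, vinyl or water?")
print(response.text)
```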
He's not wrong. DeepMind spends time solving big scientific / large-scale problems such as those in genetics, material science or weather forecasting, and Google has untouchable resources such as all the books they've scanned (and already won court cases about)
They do make OpenAI look like kids in that regard. There is far more to technology than public facing goods/products.
It's probably in part due to the cultural differences between London/UK/Europe and SiliconValley/California/USA.
While you are spot on, I cannot avoid thinking of 1996 or so.
On one corner: IBM Deep Blue winning vs Kasparov. A world class giant with huge research experience.
On the other corner, Google, a feisty newcomer, 2 years in their life, leveraging the tech to actually make something practical.
Is Google the new IBM?
I don’t think Google is the same as IBM here. I think Google’s problem is its insanely low attention span. It frequently releases innovative and well built products, but seems to quickly lose interest. Google has become somewhat notorious for killing off popular products.
On the other hand, I think IBM’s problem is its finance focus and longterm decay of technical talent. It is well known for maintaining products for decades, but when’s the last time IBM came out with something really innovative? It touted Watson, but that was always more of a gimmick than an actually viable product.
Google has the resources and technical talent to compete with OpenAI. In fact, a lot of GPT is based on Google’s research. I think the main things that have held Google back are questions about how to monetize effectively, but it has little choice but to move forward now that OpenAI has thrown down the gauntlet.
In addition, products that seem like magic at launch get worse over time instead of better.
I used to do all kinds of really cool routines and home control tasks with Google home, and it could hear and interpret my voice at a mumble. I used it as an alarm clock, to do list, calendar, grocery list, lighting control, give me weather updates, set times etc. It just worked.
Now I have to yell unnaturally loud for it to even wake, and even then the simplest commands have a 20% chance of throwing “Sorry I don’t understand” or playing random music. Despite having a device in every room it has lost the ability to detect proximity and will set timers or control devices across the house. I don’t trust it enough anymore for timers and alarms, since it will often confirm what I asked then simply… not do it.
Ask it to set a 10 minute timer.
It says ok setting a timer for 10 minutes.
3 mins later ask it how long is remaining on the timer. A couple years ago it would say “7 minutes”.
Now there’s a good chance it says I have no timers running.
It’s pathetic, and I would love any insight on the decay. (And yes they’re clean, the mics are as unobstructed as they were out of the box)
Yes, we burned the biscuits when my sister-in-law was visiting over Thanksgiving because she used the Google assistant to set an alarm and the alarm did not go off. Timers no longer work and there's no indication that this is the case.
Google Home perplexes me. I have several of them around the house and they were perfectly fine for years, but sometime in the last couple of years they got markedly worse. I would be happy if they just rolled back to 4 years ago and never touched it again. Now, I just wonder how much worse it will get before I give up on the whole ecosystem.
The TPUs that were used for speech on Google Home got repurposed for Google's AI initiatives.
Not just the TPUs, also the people.
Same experience with Google Assistant on Android. I used to be able to use it to create calendar events in one shot. A few years ago it started insisting on creating events in steps, which always failed miserably.
FWIW, Amazon's Echo devices still seem to work just fine if you need a voice-controlled timer in your kitchen.
> its insanely low attention span. It frequently releases innovative and well built products, but seems to quickly lose interest. Google has become somewhat notorious for killing off popular products.
I understood this problem to be "how it manages its org chart and maps that onto the customer experience."
How it manages its promotions, even more so than the org chart.
To add some color to this, the culture for a very long time would reward folks that came up with novel solutions to problems or novel products. These folks would dedicate some effort to the implementation, land the thing, then secure a promo with no regard for the sustainability of the aforementioned solution. Once landed, attention goes elsewhere and the thing is left to languish.
This behavior has been observed publicly in the Kubernetes space where Google has contributed substantially.
Can you share some examples in the Kubernetes space? I'm not as familiar with that area.
https://twitter.com/kantrn/status/1511791378497384454
And a whole thread on HN about it:
Thanks!
Along with your thoughts, I feel that Google's problem has always been over-promising. (There are even comedy skits about it.)
That starts with the demonstrations which show really promising technology, but what eventually ships doesn't live up to the hype (or often doesn't ship at all.)
It continues through to not managing the products well, such as when users have problems with them and not supporting ongoing development so they suffer decay.
It finishes with Google killing established products that aren't useful to the core mission/data collection purposes. For products which are money makers they take on a new type of financially-optimised decay as seen with Search and more recently with Chrome and YouTube.
I'm all for sunsetting redundant tech, but Google has a self-harm problem.
The cynic in me feels that part of Google's desire to over-promise is to take the excitement away from companies which ship* what they show. This seems to align with Pichai's commentary, it's about appearing the most eminent, but not necessarily supporting that view with shipping products.
* The Verge is already running an article about what was faked in the Gemini demo, and if history repeats itself this won't be the only thing they misrepresented.
Google has one major disadvantage - it's an old megacorporation, not a startup. OpenAI will be able to innovate faster. The best people want to work at OpenAI, not Google.
Also there’s less downside risk for OpenAI. Google has layers of approvals and risk committees because they don’t want to put the money machine at risk through litigation, reputation or regulation. OpenAI has nothing to lose—this is their only game. That allows them to toe the line of what’s acceptable like Uber in its early years. With all the copyright risk involved, that’s a big deal.
I think the analogy is kind of strained here - at the current stage, OpenAI doesn't have an overwhelming superiority in quality in the same way Google once did. And, if marketing claims are to be believed, Google's Gemini appears to be no publicity stunt. (not to mention that IBM's "downfall" isn't very related to Deep Blue in the first place)
> OpenAI doesn't have an overwhelming superiority in quality in the same way Google once did
The comparison is between a useful shipping product available to everyone for a full year vs a tech demo of an extremely limited release to privileged customers.
There are millions of people for whom OpenAI's products are broadly useful, and the specifics of where they fall short compared to Gemini are irrelevant here, because Google isn't offering anything comparable that can be tested.
I'd say IBM's downfall was directly related to failing to monetize Deep Blue (and similar research) at scale.
At the time, I believe IBM was still "we'll throw people and billable hours at a problem."
They had their lunch eaten because their competitors realized they could undercut IBM on price if they changed the equation to "throw compute at a problem."
In other words, sell prebuilt products instead of lead-ins to consulting. And harness advertising to offer free products to drive scale to generate profit. (e.g. Google/search)
I don't really see how IBM would ever be able to monetize something like Deep Blue. It was a research project that was understood to not be a money-maker (outside of PR, probably), and it resulted in highly specialized hardware running highly specialized software, working for its one purpose. I agree that their business model and catering to big business first is what likely led to them scaling down today, but it's still disconnected from Deep Blue.
It's an interesting analogy. I think Googles problem is how disruptive this is to their core products monetization strategy. They have misaligned incentives in how quickly they want to push this tech out vs wait for it to be affordable with ads.
Whereas for OpenAI there are no such constraints.
Did IBM have research with impressive web reverse indexing tech that they didn't want to push to market because it would hurt their other business lines? It's not impossible... It could be as innocuous as discouraging some research engineer from such a project to focus on something more in line.
This is why I believe businesses should be absolutely willing to disrupt themselves if they want to avoid going the way of Nokia. I believe Apple should make a standalone Apple Watch that cannibalizes their iPhone business instead of tying it to, and trying to prop up, their iPhone business (ofc shareholders won't like it). Whilst this looks good from Google, I think they are still sandbagging... why can't I use Bard inside their other products instead of the silly export thing?
OpenAI was at least around in 2017 when YCR HARC was closed down (because...the priority would be OpenAI).
Comment was deleted :(
google is the new IBM.
apple is the new Nokia.
openai is the new google.
microsoft is the new apple.
No, because OpenAI and Microsoft both have “CUSTOMER NONCOMPETE CLAUSES” in their terms of use. I didn’t check Apple, but Google doesn’t have any shady monopolistic stuff like that.
Proof OpenAI has this shady monopolistic stuff: https://archive.ph/vVdIC
“What You Cannot Do. You may not use our Services for any illegal, harmful, or abusive activity. For example, you may not: […] Use Output to develop models that compete with OpenAI.” (Hilarious how that reads btw)
Proof Microsoft has this shady monopolistic stuff: https://archive.ph/N5iVq
“AI Services. ”AI services” are services that are labeled or described by Microsoft as including, using, powered by, or being an Artificial Intelligence (“AI”) system. Limits on use of data from the AI Services. You may not use the AI services, or data from the AI services, to create, train, or improve (directly or indirectly) any other AI service.”
That 100% does include GitHub Copilot, by the way. I canceled my sub. After I emailed Satya, they told me to post my “feedback” in a forum for issues about Xbox and Word (what a joke). I emailed the FTC Antitrust team. I filed a formal complaint with the office of the attorney general of the state of Washington.
I am just one person. You should also raise a ruckus about this and contact the authorities, because it’s morally bankrupt and almost surely unlawful by virtue of extreme unfairness and unreasonableness, in addition to precedent.
AWS, Anthropic, and NVIDIA also all have similar Customer Noncompete Clauses.
I meekly suggest everyone immediately and completely boycott OpenAI, Microsoft, AWS, Anthropic, and NVIDIA, until they remove these customer noncompete clauses (which seem contrary to the Sherman Antitrust Act).
Just imagine a world where AI can freely learn from us, but we are forbidden to learn from AI. Sounds like a boring dystopia, and we ought to make sure to avoid it.
They cannot enforce a non-compete on a customer. Check out the rest of their terms that talk about severability. They will sneakily say "our terms that are illegal don't apply but the rest do."
You cannot tell a customer that buying your product precludes them from building products like it. That violates principles of the free market, and it's unenforceable. This is just like non-competes in employment. They aren't constitutional.
There's no constitutional question, and these services can drop you as a customer for (almost) any reason.
So yes, they can enforce their terms for all practical purposes.
But no, they cannot levy fines or put you in jail.
> But no, they cannot levy fines or put you in jail.
Those are the consequences that matter. I don't care if Microsoft or Google decide they don't want to be friends with me. They'd stab me in the back to steal my personal data anyway.
You do care if you built your business on top of them though.
And that's the whole point of violating terms by competing with them.
I wouldn't want to build a business on something that could be pulled out from underneath me.
I'd start a business but the whole setup is a government scam. Business licenses are just subscriptions with extra steps.
Sounds like we need legislation to void these "customer non-compete clauses". Not holding my breath though; see what govts have allowed copyright to become. Govts seem to protect the interests of (near-)monopolies more than anything.
Why's it wrong to not let people use your output to build their own services?
1. I wouldn't let someone copy my code written directly by me. Why should I let someone copy the code my machine wrote?
2. There are obvious technical worries about feedback loops.
> Why should I let someone copy the code my machine wrote
Because that machine/openAI was built on literally scraping the internet (regardless of copyright or website's ToS) and ingesting printed books.
This is a perfect example of the owner class getting away with crime (copyright infringement) and using it against the public (you can't use AI output!).
Businesses are not entitled to life or existence the way individuals are.
It's stunning how many do not understand that.
Test it.
Produce results.
Market it.
They can’t enforce if it gets too big.
It's not unlawful, it's not morally bankrupt. Noncompete clauses have been around since the beginning of human commercial activity and have a valid reason to exist - to encourage companies/people/investors to put large sums of capital at risk to develop novel technologies. If there was no way to profit from them, the capital would be non-existent.
You have no way to prove that Google, MS, et al wouldn't make AI products if they couldn't prevent you from using the output.
Also, what exactly is stopping someone from documenting the output from all possible prompts?
It's legal theater and can't be enforced.
It's not theater, it's very real. Companies are making decisions not to use data generated from OpenAI. They are making that decision because they know that if they go the other way, they risk someone internal leaking that they are doing it, and that it's pretty easy to figure out during a discovery process. I'm involved in this issue right now, and no one is treating it as something to just blow off. I know several other companies in the same boat.
They have many orders of magnitude more money and attorneys that would work full-time on such a case to ensure that even if they lost the court battle, the person or company doing the thing that they didn't like would be effectively bankrupted, so they still win in the end.
And if such an effort leaves the jurisdiction, to a country with no obligations to the litigating country?
We need to dispel with this idea that sociopaths in suits have earned or legitimate power.
The courts have power, the companies know it and behave accordingly.
Everything you are saying is only true for two guys in a garage. The folks with something to lose don't behave in this dreamworld fashion.
Enjoy being a pacified and domesticated ape who never strays from what it's told to do. You'll be sent to the meat grinder soon.
You'll find that if you learn a good amount about the law, it's empowering. The courts are an adversarial place. For every person getting sued... someone is suing. It's isn't "big brother" or "my keeper" or "the man keeping you down" or however you imagine it. You can be the one exerting the pressure if you know what you are doing.
Enjoy being an uneducated ape :)
> apple is the new Nokia.
You obviously haven't dropped an iphone on to concrete. :)
When did you last try? I’m too embarrassed to say how often and onto what kind of surfaces my iPhone 12 has been dropped, but I’m amazed it’s still seemingly completely functional.
My iPhone 4, on the other hand, shattered after one incident…
I was more referring to Nokia's complacency, which led to its demise. Nokia was infamous for incremental updates to their phone line, making users upgrade regularly. You could never find a "complete" Nokia phone; each phone was deliberately crippled somehow. Apple does the same with their iDevices.
Have you dropped the iPhone 14 Pro? Or 11 Pro?
These are literally stainless steel.
The 15s with their titanium are a step back.
The 11 Pro with its older curved edges has been the most solidly built phone ever IMO.
Happens to me regularly, I think they reached a level of Nokia a few years back :)
I even dropped my iPhone 13 four floors (onto wood), and not a scratch :o
How is MS the new Apple? Apple has always been a product company, not seeing MS ever being that.
Xbox, Surface. Holo didn't go far. May return back to mobile in some form soon.
Services, and their sales team, are still Microsoft's strong point.
Apple is seeing its services grow and is leaning into them now.
The question is whether Apple eats services faster than Microsoft eats into hardware.
Xbox and Surface have been around a long time as product categories. Xbox isn't even the premier device in its segment.
Highly doubt MS will ever be successful on mobile... their last OS was pretty great and they were willing to pay devs to develop, they just couldn't get it going. This is from someone who spent a ton of time developing on PocketPC and Windows Mobile back in the day.
Products are not the reason for their resurgence.
Apple makes a ton in services, but their R&D is heavily focused on product and platform synergy to make that ecosystem extremely valuable.
Microsoft grinds constantly and consistently though, sprinkled with some monopolistic tendencies now and then to clinch a win.
I think the grind from Windows CE to Windows Phone is just a blip to them for now.
MS products all suck. They only survive because they throw billions at them and don't care about profitability.
Microsoft is still the same old Microsoft
Afaict, Windows Phone mostly failed because of timing. In the same way that XBox mostly succeeded because of timing. (In the sense that timing dominated the huge amount of excellent work that went into both)
Microsoft is a decent physical product company... they've usually just missed on the strategic timing part.
It's not a question of timing, but of Microsoft's brand image (Internet Explorer) and the fact that Android was already open source.
Timing was definitely an issue - first Windows Phone came 3 years after iOS and 2 after Android. AFA the product itself, I think the perception it needed to overcome was more PocketPC/Windows Mobile having an incredibly substandard image in the market after the iOS release which seemed light years ahead, esp. since MS had that market to themselves for so many years.
That said, it got great reviews and they threw $$ at devs to develop for it, just couldn't gain traction. IME it was timing more than anything and by the time it came to market felt more reactionary than truly innovative.
"Open source" in the sense there was open source. Which you could use if you were willing to jettison Maps et al.
Given the dog-eat-dog competition among early Android manufacturers, most couldn't afford to recreate Google services.
By this I mean that Microsoft had the positioning of an iPhone in a not-so-great version. Whereas Android relied on the "open source" and free side for manufacturers to adapt it to their phones, even if Google's services remained proprietary.
Can we really talk about timing, when it's above all a problem of a product that didn't fit the market?
"Apple is the new Sony" might be better. I'm trying to figure out who is the upcoming premium tech product company... not thinking of any. I think Tesla wants to be
The issue with new premium tech is that you can't over-the-top existing ecosystems (Android, iOS).
It's difficult to compete with an excellent product if whether you have a blue bubble in iMessage is more important.
They can’t even get panels to line up right.
Still.
Humane definitely wants to be.
I have considered Oracle and MS to be competing for the title of new IBM. Maybe MS is shaking it off with their AI innovation, but I think a lot of that is just lipstick.
Hmm, what was that tech from IBM deep blue, that apparently Google leveraged to such a degree?
Was it “machine learning”? If so, I don’t think that was actually the key insight for Google search… right? Did deep blue even machine learn?
Or was it something else?
Deep Blue was the name of the computer itself rather than the software, but to answer your question - it didn't use machine learning, its program was written and tweaked by hand. It contained millions of different games and positions, and functioned by evaluating all possible moves at a certain depth. As far as I know, practical machine learning implementations wouldn't be a thing for a decent while after Deep Blue.
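For anyone unfamiliar with the technique, "evaluating all possible moves at a certain depth" is essentially fixed-depth minimax over a game tree with a hand-written evaluation function. A toy sketch of the general shape (nothing Deep Blue specific, just the idea):

```python
# Toy sketch of fixed-depth game-tree search with a hand-written evaluation
# function: the general shape of what Deep Blue did, minus the chess-specific
# hardware and massive parallelism. Nothing here is learned from data.
def minimax(state, depth, maximizing, moves, apply_move, evaluate):
    legal = moves(state)
    if depth == 0 or not legal:
        return evaluate(state)                  # hand-tuned heuristic score
    children = (minimax(apply_move(state, m), depth - 1, not maximizing,
                        moves, apply_move, evaluate) for m in legal)
    return max(children) if maximizing else min(children)

# Trivial example "game": each move adds or subtracts 1; higher is better.
print(minimax(0, depth=3, maximizing=True,
              moves=lambda s: [+1, -1],
              apply_move=lambda s, m: s + m,
              evaluate=lambda s: s))   # -> 1 under optimal play by both sides
```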
Wasn't that mostly a hardware problem? Both for research and implementation?
Circa-Deep Blue, we were still at Quake levels of SIMD throughput.
Oh, it's good they're working on important problems with their AI. It's just that OpenAI was working on my/our problems (or providing tools to do so), and that's why people are more excited about them. Not because of cultural differences. If you are more into weather forecasting, yeah, it sure may be reasonable to prefer Google.
Stuff like AlphaFold has had, and will have, a huge impact on our lives, even if I am not into spending time folding proteins myself. It is absurd to make this sort of comparison.
That's what makes Altman a great leader. He understands marketing better than many of these giants. Google got caught being too big. Sure, they will argue that mass release of AI is a dangerous proposition, but Sam had to make a big splash, otherwise he would be competing with incumbent marketing spending far greater than OpenAI could afford.
It was a genius move to go public with a simple UI.
No matter how stunning the tech side is, if human interaction is not simple, the big stuff doesn’t even matter.
Google got Google Fiber'd
That statement isn't really directed at the people who care about the scientific or tech-focused capabilities. I'd argue the majority of those folks interested in those things already know about DeepMind.
This statement is for the mass market MBA-types. More specifically, middle managers and dinosaur executives who barely comprehend what generative AI is, and value perceived stability and brand recognition over bleeding edge, for better or worse.
I think the sad truth is an enormous chunk of paying customers, at least for the "enterprise" accounts, will be generating marketing copy and similar "biz dev" use cases.
> They do make OpenAI look like kids in that regard.
Nokia and Blackberry had far more phone-making experience than Apple when the iPhone launched.
But if you can't bring that experience to bear, allowing you to make a better product - then you don't have a better product.
The thing is that OpenAI doesn't have an "iPhone of AI" so far. That's not to say what will happen in the future - the advent of generative AI may become a big "equalizer" in the tech space - but no company seems to have a strong edge that'd make me more confident in any one of them over others.
OpenAI has all of the people using ChatGPT.
A big advantage if this were a product with strong network externalities like social media networks, or, to a lesser extent, mobile phones with platform-biased communication tools.
But I don't see generative AI as being particularly that way.
GenAI does not have network effects, correct. There was a time last year when consumer search was still on the table, and I can see how MSFT winning share there might have conferred network effects for genAI, but it didn't happen. Now it's all about the enterprise, which is to say isolated data, which pretty much rules out network effects.
Training data. Use begets feedback begets improvement.
Comment was deleted :(
Phones are an end-consumer product. AI is not only an end-consumer product (and probably not even mostly an end-consumer one). It is a tool to be used in many different steps in production. AI is not chatbots.
Great. But school's out. It's time to build product. Let the rubber hit the road. Put up or shut up, as they say.
I'm not dumb enough to bet against Google. They appear to be losing the race, but they can easily catch up to the lead pack.
There's a secondary issue that I don't like Google, and I want them to lose the race. So that will color my commentary and slow my early adoption of their new products, but unless everyone feels the same, it shouldn't have a meaningful effect on the outcome. Although I suppose they do need to clear a higher bar than some unknown AI startup. Expectations are understandably high - as Sundar says, they basically invented this stuff... so where's the payoff?
Why don't you like Google?
The usual reasons, evil big corp monopoly with a user-hostile business model etc.
I still use their products. But if I had to pick a company to win the next gold rush, it wouldn't be an incumbent. It's not great that MSFT is winning either, but they are less user-hostile in the sense that they aren't dependent on advertising (another word for "psychological warfare" and "dragnet corporate surveillance"), and I also appreciate their pro-developer innovations.
Damn I totally forgot Google actually has rights over its training set, good point, pretty much everybody else is just bootlegging it.
I think Apple (especially under Jobs) had it right that customers don’t really give a shit about how hard or long you’ve worked on a problem or area.
Comment was deleted :(
They do not make OpenAI look like kids. If anything, it looks like they spent more time, but achieved less. GPT-4 is still ahead of anything Google has released.
From afar it seems like the issues around Maven caused Google to pump the brakes on AI at just the wrong moment with respect to ChatGPT and bringing AI to market. I’m guessing all of the tech giants, and OpenAI, are working with various defense departments yet they haven’t had a Maven moment. Or maybe they have and it wasn’t in the middle of the race for all the marbles.
> They do make OpenAI look like kids in that regard.
It makes Google look like old fart that wasted his life and didn't get anywhere and now he's bitter about kids running on his lawn.
Nobody said he's wrong. Just that it's a bad look.
I thought that Google was based out of Silicon Valley/California/USA
They're talking about DeepMind specifically.
> and Google has untouchable resources such as all the books they've scanned (and already won court cases about)
https://www.hathitrust.org/ has that corpus, and its evolution, and you can propose to get access to it via collaborating supercomputer access. It grows very rapidly. The Internet Archive would also like to chat, I expect. I've also asked, and prompt manipulated chatGPT to estimate the total books it is trained with; it's a tiny fraction of the corpus. I wonder if it's the same with Google?
> I've also asked, and prompt manipulated chatGPT to estimate the total books it is trained with
Whatever answer it gave you is not reliable.
How does this not extend to ALL output from an LLM? If it can't understand its own runtime environment, it's not qualified to answer my questions.
That's correct. LLMs are plausible sentence generators, they don't "understand"* their runtime environment (or any of their other input) and they're not qualified to answer your questions. The companies providing these LLMs to users will typically provide a qualification along these lines, because LLMs tend to make up ("hallucinate", in the industry vernacular) outputs that are plausibly similar to the input text, even if they are wildly and obviously wrong and complete nonsense to boot.
Obviously, people find some value in some output of some LLMs. I've enjoyed the coding autocomplete stuff we have at work, it's helpful and fun. But "it's not qualified to answer my questions" is still true, even if it occasionally does something interesting or useful anyway.
*- this is a complicated term with a lot of baggage, but fortunately for the length of this comment, I don't think that any sense of it applies here. An LLM doesn't understand its training set any more than the mnemonic "ETA ONIS"** understands the English language.
**- a vaguely name-shaped presentation of the most common letters in the English language, in descending order. Useful if you need to remember those for some reason like guessing a substitution cypher.
If you can watch the video demo of this release, or for that matter the Attenborough video, and still claim that these things lack any form of "understanding," then your imagination is either a lot weaker than mine, or a lot stronger.
Behavior indistinguishable from understanding is understanding. Sorry, but that's how it's going to turn out to work.
Have you considered that mankind simply trained itself on the wrong criteria on detecting understanding?
Why are people so eager to believe that electric rocks can think?
Why are people so eager to believe that people can? When it comes to the definitions of concepts like sentience, consciousness, thinking and understanding, we literally don't know what we're talking about.
It's premature in the extreme to point at something that behaves so much like we do ourselves and claim that whatever it's doing, it's not "understanding" anything.
We've studied human behavior enough to understand that there are differences between animals in the level of cognition and awareness they (outwardly) exhibit.
Are we not generally good at detecting when someone understands us? Perhaps it's because understanding has actual meaning. If you communicate to me that you hit your head and feel like shit, I not only understand that you experienced an unsatisfactory situation, I'm capable of empathy -- understanding not only WHAT happened, but HOW it feels -- and offering consolation or high fives or whatever.
An LLM has an understanding of what common responses were in the past, and repeats them. Statistical models may mimic a process we use in our thinking, but it is not the entirety of our thinking. Just like computers are limited to the programmers that code their behavior, LLMs are limited to the quality of the data corpus fed to them.
A human, you can correct in real time and they'll (try to) internalize that information in future interactions. Not so with LLMs.
By all means, tell us how statistically weighted answers to "what's the next word" correlates to understanding.
> By all means, tell us how statistically weighted answers to "what's the next word" correlates to understanding.
By all means, tell me what makes you so certain you're not arguing with an LLM right now. And if you were, what would you do about it, except type a series of words that depend on the previous ones you typed, and the ones that you read just prior to that?
> A human, you can correct in real time and they'll (try to) internalize that information in future interactions. Not so with LLMs.
Not so with version 1.0, anyway. This is like whining that your Commodore 64 doesn't run Crysis.
Computers don't understand spite, and your entire comment was spite. You are trolling in an attempt to muddy the waters, a distinctly human thing.
Go away, you clearly have nothing to counter with.
Comment was deleted :(
That's not entirely accurate.
LLMs encode some level of understanding of their training set.
Whether that's sufficient for a specific purpose, or sufficiently comprehensive to generate side effects, is an open question.
* Caveat: with regards to introspection, this also assumes it's not specifically guarded against and opaquely lying.
> plausible sentence generators, they don't "understand"* their runtime environment
Exactly like humans don't understand how their brains work.
We've put an awful lot of effort into figuring that out, and have some answers. Many of the problems in exploring the brain are ethical, because people tend to die or suffer greatly if we experiment on them.
Unlike LLMs, which are built by humans and have literal source code and manuals and SOPs and shit. Their very "body" is a well-documented digital machine. An LLM trying to figure itself out has MUCH less trouble than a human figuring itself out.
How many books has your brain been trained with? Can you answer accurately?
There are reasons that humans can't report how many books they've read: they simply don't know and didn't measure. There is no such limitation for an LLM to understand where its knowledge came from, and to sum it. Unless you're telling me a computer can't count references.
Also, why are we comparing humans and LLMs when the latter doesn't come anywhere close to how we think, and is working with different limitations?
The 'knowledge' of an LLM is in a filesystem and can be queried, studied, exported, etc. The knowledge of a human being is encoded in neurons and other wetware that lacks simple binary chips to do dedicated work. Decidedly less accessible than coreutils.
Imagine for just a second that the ability of computers to count "references" has no bearing on this: there is a limitation, and LLMs suffer from the same issue you do.
Why should I ignore a fact that makes my demand realistic? Most of us are programmers on here I would imagine. What's the technical reason an LLM cannot give me this information?
Bytes can be measured. Sources used to produce the answer to a prompt can be reported. Ergo, an LLM should be able to tell me the full extent to which it's been trained, including the size of its data corpus, the number of parameters it checks, the words on its unallowed list (and their reasoning), and so on.
These will conveniently be marked as trade secrets, but I have no use for an information model moderated by business and government. It is inherently NOT trustworthy, and will only give answers that lead to docile or profitable behavior. If it can't be honest about what it is and what it knows and what it's allowed to tell me, then I cannot accept any of its output as trustworthy.
Will it tell me how to build explosives? Can it help me manufacture a gun? How about intercepting/listening to today's radio communications? Social techniques to gain favor in political conflicts? Overcoming financial blockages when you're identified as a person of interest? I have my doubts.
These questions might be considered "dangerous", but to whom, and why shouldn't we share these answers?
It's worth remembering that AI is more than LLMs. DeepMind is still doing big stuff: https://deepmind.google/discover/blog/millions-of-new-materi...
I just want to underscore that. DeepMind's research output within the last month is staggering:
2023-11-14: GraphCast, world-leading weather prediction model, published in Science
2023-11-15: Student of Games: unified learning algorithm, a major algorithmic breakthrough, published in Science
2023-11-16: Music generation model, seemingly SOTA
2023-11-29: GNoME model for material discovery, published in Nature
2023-12-06: Gemini, the most advanced LLM according to its own benchmarks
Google is very good at AI research.
Where it has fallen down (compared to its relative performance in relevant research) is public generative AI products [0]. It is trying very hard to catch up at that, and its disadvantage isn't technological, but that doesn't mean it isn't real and durable.
[0] I say "generative AI" because AI is a big an amorphous space, and lots of Google's products have some form of AI that is behind important features, so I'm just talking about products where generative AI is the center of what the product offers, which have become a big deal recently and where Google had definitely been delivering far below its general AI research weight class so far.
> Google is very good at AI research. Where it has fallen down (compared to its relative performance in relevant research) is public generative AI products
In such cases, I actually prefer Google over OpenAI. Monetization isn’t everything
> In such cases, I actually prefer Google over OpenAI.
For, what, moral kudos? (to be clear, I'm not saying this is a less important thing in some general sense, I'm saying what is preferred is always dependent on what we are talking about preferences for.)
> Monetization isn’t everything
Providing a user product (monetization is a different issue, though for a for-profit company they tend to be closely connected) is ultimately important for people looking for a product to use.
Other interests favor other things, sure.
>For, what, moral kudos?
For the good of society? Performing and releasing bleeding edge research benefits everyone, because anyone can use it. Case in point: transformers.
There is nothing open about OpenAI and they wouldn't exist in their current form without years of research funded by Google.
Comment was deleted :(
> For the good of society? Performing and releasing bleeding edge research benefits everyone, because anyone can use it.
OK, but that only works if you actually do the part that lets people actually use the research for something socially beneficial. A research paper doesn't have social benefit in itself, the social benefit comes when you do something with that research, as OpenAI has.
> There is nothing open about OpenAI and they wouldn't exist in their current form without years of research funded by Google.
True enough. But the fact remains that they're the ones delivering something we can actually use.
But you can use several of those right now?
https://charts.ecmwf.int/products/graphcast_medium-mslp-wind...
Well, sure. But you can use the OpenAI stuff more and more generally.
>There is nothing open about OpenAI
I personally think of it as open in the sense that they provide an API to allow anyone to use it (if they pay) and take advantage of the training they did. This is in contrast to large companies like Google, which have lots of data and historically just use AI for their own products.
Edit:
I define it as having some level of being open beyond 'nothing'. The name doesn't scale well over time given business considerations and a changing business environment, and it was a poor choice when 'open source' is the common usage of 'open' within tech. They should have used AI products to help them name the company and be aware of such potential controversies.
From chatgpt today (which wasn't an option at the time but they maybe could have gotten similar information or just thought about it more):
What are the drawbacks to calling an AI company 'open'?
...
"1. Expectations of Open Source: Using the term "open" might lead people to expect that the company's AI technology or software is open source. If this is not the case, it could create confusion and disappointment among users and developers who anticipate access to source code and the ability to modify and distribute the software freely.
2. Transparency Concerns: If an AI company claims to be "open," there may be heightened expectations regarding the transparency of their algorithms, decision-making processes, and data usage. Failure to meet these expectations could lead to skepticism or distrust among users and the broader public."
...
You define "open" as selling a service?
Comment was deleted :(
As in The Open Group: discrimination-free, as long as you're able to pay.
You can outright buy OpenAI; as long as you're able to pay. By that definition everything is open/accessible/available.
Right. I'm not defending their naming choice, just pointing out that "open = you need to be rich" is an ancient trope in computing:
Open Group was formed through the merger of Open Software Foundation (est. 1988) and X/Open (est. 1984), and they were all pay-to-play.
I mean, we do use that word to describe physical retail shops as being available to sell vs being closed to sell, so it's not an insane use... though I do think that in a tech context it's more misleading than not.
It is like calling Google "Open Search" or Youtube "Open Video".
> or Youtube "Open Video".
Compared to a curated video service like HBO Max, Hulu, or Netflix, that's an accurate way to describe the relative differences. We aren't used to using that terminology through, so yes, it comes across as weird (and if the point is to communicate features, is not particularly useful compared to other terminology that could be used).
It makes a bit less sense for search IMO, since that's the prevalent model as far as I'm aware, so there's not an easy and obvious comparison that is "closed" which allows us to view Google search as "open".
Bakeries should be called OpenBakeries by that logic.
Only within business hours.
They publish but don't share. Who cares about your cool tech if we can't experience it ourselves? I don't care about your blog writeup or research paper.
Google is locked behind research bubbles, legal reviews and safety checks.
Meanwhile OpenAI is eating their lunch.
The researchers at all the other companies care about the blog write-ups and research papers. The Transformer architecture, for example, came from Google.
Sharing fundamental work is more impactful than sharing individual models.
Depends on which impact and horizon you’re talking about. Advancing fundamental research? I’ll give that to Google and Microsoft Research.
Advancing products that use AI and getting a consumer/public conversation started? That’s clearly (to me) in OpenAI's court.
They’re both impactful, interlinked, and I’m not sure there’s some real stack ranking methodology.
Depends on one's relative valuing of "able to use the state of the art" vs "improving the state of the art."
Because of Google’s walled research strategy, they now appear to be an antiquated company.
Gemini does nothing. Even if it were comparable to GPT-4, they’re late to the party.
OpenAI is blazing the path now.
this. google is like academia; you publish a lot but others who implement your work are the ones making money off of it.
To take an example from the past month, billions of users are now benefiting from more accurate weather forecasts from their new model. Is there another company making more money from AI-powered products than Google right now?
> Is there another company making more money from AI-powered products than Google right now?
Microsoft? In the sense that OpenAI is "paying" them... through MS's own investment.
It's a very fuzzy question I posed. For pure customer-pays-for-AI-service it could be Microsoft. I'm kind of thinking of it as: Google's core products (search, ads, YouTube, Gmail) would not be possible without AI, and they are huge cash cows.
Are there people paying for Google's weather predictions?
Only indirectly, but I wanted to point out that there are a lot of interesting research innovations that get implemented by Google and not some other company.
Or, well, like many companies: all the peons doing the actual work and creation, and the executives and investors profiting at the top. All it takes is being lucky enough to be born into generational wealth, apparently.
Indeed, I would think of the core search product as another example of AI/ML...
The question is whether greater use of AI correlates with the declining quality of search results.
I think the real underlying cause is the explosion of garbage that gets crawled. Google initially tried to use AI to find "quality" content in the pile. It feels like they gave up and decided to use the wrong proxies for quality. Proxies like "somehow related to a brand name". Good content that didn't have some big name behind it gets thrown out with the trash.
I think the bottom line (profit) inversely correlates with the quality of search results. I've been using phind.com lately and it seems there can be search without junk even in this age.
Google has lots of people tagging search rankings, which is very similar to RLHF ranking responses from LLMs. It's interesting that, using LLMs with RLHF, it should be possible to de-junk the search results. RLHF is great for this task, as evidenced by its effect on LLMs.
Right. It's less that the declining quality of search results is due to AI, and more that the AI got really good at monetizing, and monetizing and quality search results are sometimes in opposition.
This entire thread kinda ignores that they are also selling ad space on many sites, and their objective function in ordering search is not just the best possible result. Case in point: the many sites stealing Stack Overflow content and filling it with adverts rank higher than the source, which committed the cardinal sin of running its own ad network.
> I've been using phind.com lately and it seems there can be search without junk even in this age.
A few reasons partially (if not fully) responsible for it might be:
- Google is a hot target of SEO, not Phind.
- If Google stops indexing certain low-quality sites without a strong justification, there would be lawsuits, or people saying how "Google hasn't indexed my site" or whatever. How would you authoritatively define "low quality"?
- Having to provide search for the full spectrum of users in various languages and countries, not just for "tech users".
The web has grown by 1000x over the years. The overall signal-to-noise ratio has worsened by around 100x, and SEO has become much more sophisticated and optimized against Google. A large fraction of quality content has been moving toward walled gardens. The goalposts are moving (much) faster than the technology.
Yup, and us humans produce as much garbage as we can too. "60 hours of black screen" type videos on YouTube that gotta be stored on CDNs across the globe, taboola's absolutely vile ads, endless scripted content made by content creators for the short term shock/wow value.
The Internet is basically a rubbish dump now imo.
Sturgeon's law:
Sure 90% of the Internet is crap. That's because 90% of everything is crap.
I recently google searched "80cm to inches" and it gave me the result for "80 meters to inches". I can't figure out how it would make this mistake aside from some poorly conceived LLM usage
I highly doubt that this is related to any LLM use. It would be breathtakingly uneconomical and completely unnecessary. It's not even interesting enough for an experiment.
It would be fun to see modern Google ran against a snapshot of the old web.
Maybe the declining quality of internet content has something to do with the declining quality of search results.
There's a constant arms race between shitty SEO, walled gardens, low-quality content farms and search engines.
This does highlight the gap between SOTA and business production. Google search is very often a low quality, even user hostile experience. If Google has all this fantastic technology, but when the rubber hits the road they have no constructive (business supporting) use cases for their search interface, we are a ways away from getting something broadly useful.
It will be interesting to see how this percolates through the existing systems.
I am at first just saying that search as PageRank in the early days was an ML marvel that changed the way people interact with the internet. Figuring out how to monetize and financially survive as a business has certainly changed the direction of its development and usability.
Yes, it is very successful in replacing useful results with links to shopping sites.
This is because their searches are so valuable that real intelligence, i.e. humans, have been fighting to defeat google's AI over billions of dollars of potential revenue.
We are just seeing remnants of that battleground.
The opposite of artificial is natural
Words tend to have compound and contextual meanings, so it's very rare for one word to have one opposite. Look at lists of synonyms and antonyms.
Another opposite of "natural" is "designed", and another opposite of artificial is "by chance".
Designed is a synonym of artificial.
"By chance" is also an accurate descriptor of natural intelligence.
Not sure if you were making a point, but your comment backs up the assertion that "natural" was a better choice than "real" in that sentence.
The point I was making is what I wrote. Words have multiple meanings, and synonyms are similars, not identicals.
artificial literally means "made with skill" or "artistic"
Merriam-Webster literally lists 5 distinct definitions of "artificial" with two of those having two sub-definitions.
i.e. artifice-ial
Your answer is the first choice of chat gpt:)
"
1. Natural 2. Authentic 3. Genuine 4. Real 5. Organic 6. Original 7. Unaffected 8. Unpretentious 9. True 10. Sincere
"
> Sundar's comments in the main video seem like he's trying to communicate "we've been doing this ai stuff since you (other AI companies) were little babies" - to me this comes off kind of badly
Reminds me of the Stadia reveal, where the first words out of his mouth were along the lines of "I'll admit, I'm not much of a gamer"
This dude needs a new speech writer.
> This dude needs a new speech writer.
How about we go further and just state what everyone (other than Wall St) thinks: Google needs a new CEO.
One more interested in Google's supposed mission ("to organize the world's information and make it universally accessible and useful"), than in Google's stock price.
"The shareholders are too well-represented" is one of the complaints least likely to be acted on, second only to "I'm getting old."
At least the second one is very easy to act upon.
Sure, anti-capitalist-impulse Me 100% agrees.
It's just doubly icky in the context of Google's original "letter to shareholders": https://www.nytimes.com/2004/04/29/business/letter-from-the-... ("Google is not a conventional company. We do not intend to become one")
Haha that’s a great find. Google may be one of the most conventional companies out there now.
To some extent that's because conventional tech companies became Google.
"One of the most conventional companies" sounds like some kind of paradox.
They are still not a conventional company, in the sense that they even intentionally shut down support avenues for paying customers.
Kinda like kids who claimed they'd become rocket scientists when they grew up, but are now flipping burgers at McDonald's. Big liars living among us.
I resemble that remark!
Ok, well, maybe not burgers
Do you flip user stories too?
Flipping kanban cards. ):-|
"What is my purpose?"
Mr Meeseeks with extra steps.
CEOs rarely get replaced, they need to do something very very very bad for this to happen.
It's a publicly-traded company. Wall Street's basically the whole ball game.
I don’t know. Maybe a couple of months ago I would have agreed but Google is back baby.
Dude needs a new job. He's been the Steve Ballmer of Google, ruining what made them great and running the company into the ground.
>Steve Ballmer of Google
I've been making this exact comparison for years at this point.
Both inherited companies with market-dominant core products in near-monopoly positions. They both kept the lights on, but the companies under them repeatedly fail to break into new markets and suffer from a near total lack of coherent vision and perverse internal incentives that contribute to the failure of new products. And after a while, the quality of that core product starts to stumble as well.
The fact that we've seen this show before makes it all the more baffling to me that investors are happy about it. Especially when in the same timeframe we've seen Satya Nadella completely transform Microsoft and deliver relatively meteoric performance.
Ballmer made Microsoft the most profitable it had ever been. He didn't grow them into big new areas, but he improved the focus, trimmed the costs, and vastly improved the bottom line. A successful company may need vision and expansion, but at some point it also needs to be able to actually convert that into profit, otherwise you turn into Sun - or indeed recent Google, who've come out with some great products but never managed to convert any of them into profit centers.
The dude shipped Windows 8! He insisted on this insane mishmash of tablet and desktop Windows that made sense to nobody. Somehow they shipped this, which tells me the emperor wears no clothes.
I completely agree about Satya Nadella; I haven't seen a turnaround like it since Steve Jobs came back to Apple. He took a company that couldn't seem to get out of its own way and turned it into an innovative, exciting, and insanely profitable company.
He's also totally transformed the public image of Microsoft, from ruthless monopolist to one of the least evil, most open giant tech companies. With actions, not words.
It's not all perfect and wonderful, but they're miles away from the Gates/Ballmer era, it's remarkable.
Are you all on drugs? This is the company that published a poll speculating on the death of a missing woman. The one that asks you to explain yourself when you try to close OneDrive, and ignores/resets your browser preferences while also forcefully installing crapware like the Bing bar. They're the ones about to create a mountain of ewaste by making Win11 unusable on older hardware. They're also the ones fighting government (and winning) in order to consolidate the game industry to further reduce competition and hurt consumers. I could keep going, but it's a very long list.
There seems to be some small pocket of tech people who are permanently enthralled by this organization. Does Nadella have his own reality distortion field? If so it must be pretty damn strong in order to pierce the smell of dog shit surrounding his employer.
I'd ask Gemini to find his replacement.
LOL. He'll need a new job when the Google board decides to fire him. So far that does not look like it's happening.
It's more like when Larry & Sergey and Eric decide to fire him. Because that's how Google was structured.
Same thing. So far whenever Larry speaks about Sundar it is more responsibility/promotion for him.
I'm wondering why they're keeping him around. Maybe they feel like they've got more control when Sundar is in charge, since he's less likely to make any rash decisions or sudden movements (or any movements at all...)
Because, despite what the HN hiveminds think, the company has been performing extremely well under Sundar.
Unless he starts destroying Larry and Sergey's wealth he will remain as CEO.
Your comment history is exclusively limited to posts about Google product releases and stock performance (and one about Sergey Brin's airship), so I'm sorry if I don't consider you an unbiased observer. And sure, maybe you honestly believe in the company, and that's why you invest in it. But just because you think you've aligned your incentives (stock portfolio) with those of the company, doesn't mean you've accurately assessed its health and future outlook.
For those of us closer to the ground - the "HN hive mind," if you will - in the same industry but not at Google, the signs are far from positive. Top line revenue looks good, but Microsoft grew more in the past decade than Google. There is a massive dependence on advertising revenue, which is so large that it's basically an existential threat to the company (although admittedly, GCP is beginning to show promise after recently posting its first profitable quarter). The rest of the industry is actively fighting Google's ability to display ads to their users. The quality of the flagship Search product is possibly the lowest it's ever been. YouTube is driving users away while picking pennies up off the floor. Employees are leaving to build startups like OpenAI with the tech they researched at Google. Morale is extremely low. Recruiting pipelines are likely suffering; most developers with an offer from Google and a company paying equivalent salary (in other words, the best developers) will not choose Google. Public perception is hostile, amidst both the general public and early adopters like developers. Governments are litigating, potential anti-trust breakups are on the horizon. But most importantly: Google has failed to fundamentally innovate since about 2005; if you disagree, please name an innovative product created from scratch at Google since that time.
The Waymo self-driving car product seems like it will be quite transformative to entire industries once they get clearance to deploy it further than San Francisco where it is already providing rides day in and day out. Or does that not count for some reason?
Disclaimer: I own Google stock simply by virtue of being invested in mutual and index funds, as are most people.
Isn't that the product that had to scale back recently because it required an average of two humans per car to remotely operate it?
I'm (mostly) genuinely asking. I might have it confused with another company, and I have to admit I don't follow self-driving closely.
But also, Waymo was an acquisition (slightly arguable, since Google merged it with its own self-driving efforts, but the founding team was acquired). I asked for an example of an innovative product created from scratch at Google.
You're thinking of Cruise. Waymo has not scaled back in any way, and in fact is in the process of expanding to LA with a limited pilot through the winter.
I don't think the fact that some of the first people on the team had worked together previously makes Waymo not "created at Google". The project they worked on before, the DARPA challenge, was not a commercial product, and at the time no company was seriously investing in self-driving cars as a viable technology. This isn't like YouTube, which was a well-known brand and viable business pre-acquisition. It was Google resources that made it possible to build the rest of the Waymo team, lobby governments to allow self-driving cars on the road, work with hardware manufacturers, and leverage the rest of Google's software stack, ML expertise, street view data, and datacenter capacity to build and train the driver.
You're thinking of Cruise, which had to stop operations for malfeasance. If you want to tell me that the Google Self-driving Car Project, which is what Waymo was called before it was spun out from Google, didn't come from Google, I'm not sure what to say.
Waymo is cool, but it's not a product, it's PR. If it ever really ships then we can talk.
What's your definition of product? I use it every few days to get around (a very specific) town. Is that not real?
Really just Larry & Sergey, who control the voting shares for the company.
>> This dude needs a new speech writer.
If only there was some technology that could help "generate" such text.
To add to my comment above: Google DeepMind put out 16 videos about Gemini today, the total watch time at 1x speed is about 45 mins. I've now watched them all (at >1x speed).
In my opinion, the best ones are:
* https://www.youtube.com/watch?v=UIZAiXYceBI - variety of video/sight capabilities
* https://www.youtube.com/watch?v=JPwU1FNhMOA - understanding direction of light and plants
* https://www.youtube.com/watch?v=D64QD7Swr3s - multimodal understanding of audio
* https://www.youtube.com/watch?v=v5tRc_5-8G4 - helping a user with complex requests and showing some of the 'thinking' it is doing about what context it does/doesn't have
* https://www.youtube.com/watch?v=sPiOP_CB54A - assessing the relevance of scientific papers and then extracting data from the papers
My current context: API user of OpenAI, regular user of ChatGPT Plus (GPT-4-Turbo, Dall E 3, and GPT-4V), occasional user of Claude Pro (much less since GPT-4-Turbo with longer context length), paying user of Midjourney.
Gemini Pro is available starting today in Bard. It's not clear to me how many of the super impressive results are from Ultra vs Pro.
Overall conclusion: Gemini Ultra looks very impressive. But - the timing is disappointing: Gemini Ultra looks like it won't be widely available until ~Feb/March 2024, or possibly later.
> As part of this process, we’ll make Gemini Ultra available to select customers, developers, partners and safety and responsibility experts for early experimentation and feedback before rolling it out to developers and enterprise customers early next year.
> Early next year, we’ll also launch Bard Advanced, a new, cutting-edge AI experience that gives you access to our best models and capabilities, starting with Gemini Ultra.
I hope that there will be a product available sooner than that without a crazy waitlist for both Bard Advanced, and Gemini Ultra API. Also fingers crossed that they have good data privacy for API usage, like OpenAI does (i.e. data isn't used to train their models when it's via API/playground requests).
My general conclusion: Gemini Ultra > GPT-4 > Gemini Pro
See Table 2 and Table 7 https://storage.googleapis.com/deepmind-media/gemini/gemini_... (I think they're comparing against original GPT-4 rather than GPT-4-Turbo, but it's not entirely clear)
What they've released today: Gemini Pro is in Bard today. Gemini Pro will be coming to API soon (Dec 13?). Gemini Ultra will be available via Bard and API "early next year"
Therefore, as of Dec 6 2023:
SOTA API = GPT-4, still.
SOTA Chat assistant = ChatGPT Plus, still, for everything except video, where Bard has capabilities. ChatGPT Plus is closely followed by Claude. (But, I tried asking Bard a question about a YouTube video today, and it told me "I'm sorry, but I'm unable to access this YouTube content. This is possible for a number of reasons, but the most common are: the content isn't a valid YouTube link, potentially unsafe content, or the content does not have a captions file that I can read.")
SOTA API after Gemini Ultra is out in ~Q1 2024 = Gemini Ultra, if OpenAI/Anthropic haven't released a new model by then
SOTA Chat assistant after Bard Advanced is out in ~Q1 2024 = Bard Advanced, probably, assuming that OpenAI/Anthropic haven't released new models by then
SOTA does not require being productionized. eg. GPT-3 was SOTA and it was not publicly accessible.
There has to be some way to verify the claim. Trust me bro isn't science.
"Trust that I ran these tests with these results" is extremely common in science.
It's not an objective test like you are talking about. These benchmarks are far from accurate and also can be tainted in the training data.
You'll find the same thing in many academic/scientific papers
The trust is established by others reproducing the results with the same methodology, it's not just supposed to be taking people's word at face value
Watching these videos made me remember this cool demo Google did years ago where their earbuds would auto-translate in realtime a conversation between two people speaking different languages. Turned out to be demo vaporware. Will this be the same thing?
Aren't you talking about this? https://support.google.com/googlepixelbuds/answer/7573100?hl... (which exists?)
I think they're getting at the idea that it was demoed as a real-time babelfish, where a conversation simply happened between two people wearing the devices. Instead it was a glorified spoken dropdown selector for choosing the language, and a press-and-hold mechanism that just tied into the existing phone app without any actual changes or upgrades to that already available translation mechanism. The thought was that you'd simply start talking to each other and hear the other in your language as you go - not speak a block all at once, stop, translate, play back from your phone to them, stop, let them speak a whole reply at once while the phone listens to them, stop, translate, hear their response in your earpiece. Which basically meant the device itself didn't bring much if anything to the table that couldn't be done with any other headphones by doing the language select and start/stop recording on the phone itself.
Meta, ironically, unveiled just this thing: an AI model that can translate in under 2 seconds, intended for use in real-time conversation.
Yes, it has the feel of coming from the same PR department as Google Glass.
Maybe Robert Scoble will post pictures of himself using Gemini naked in the shower.
I also get this feeling. The demo videos feel heavily edited and fabricated rather than actual demos.
When I watch any of these videos, all the related videos on my right sidebar are from Google, 16 of which were uploaded at the same time as the one I'm watching.
I've never seen the entire sidebar filled with the videos of a single channel before.
Yeah. Dropping that blatant a weight on the algorithm is one of the most infuriating dark patterns I've noticed in a while.
Wait so it doesn't exist yet? Thanks for watching 45 minutes of video to figure that out for me. Why am I wasting my time reading this thread?
Somebody please wake me up when I can talk to the thing by typing and dropping files into a chat box.
> to me this comes off kind of badly, like it's trying too hard to emphasize how long they've been doing AI
These lines are for the stakeholders as opposed to consumers. Large backers don't want to invest in a company that has to rush to the market to play catch-up, they want a company that can execute on long-term goals. Re-assuring them that this is a long-term goal is important for $GOOG.
Large backers and stakeholders are not 25 years old.
It would be interesting to write a LLM query to separate speech details based on target audience: stakeholders, consumers, etc.
It's a conceit, but not an unjustified one; they have been doing "AI" since their inception. And yeah, Sundar's term up until recently seems to me to have been about milking existing products instead of creating new ones, so it is a bit annoying when they act like this was their plan the whole time.
Google's weakness is on the product side; their research arm puts out incredible stuff, as other commenters have pointed out. GPT essentially came out of Google researchers who were impatient with Google's reluctance to ship a product that could jeopardize ad revenue on search.
The point is, if you have to remind people then you're doing something wrong. The insight to draw from this is not that everyone else is misinformed about Google's abilities (the implication); it's that Google has not capitalized on its resources.
It's such a short-sighted approach too, because I'm sure someone will develop a GPT with native advertising and it'll be a blockbuster, because it'll be free to use but also have strong revenue-generating potential.
One problem I see with ChatGPT replacing search is, who is going to fund the content creators that generate the training data?
Bing Chat/Copilot already have ads and is free.
I also find that tone a bit annoying but I'm OK with it because it highlights how these types of bets, without an immediate benefit, can pay off very well in the long term, even for huge companies like Google. AI, as we currently know it, wasn't really a "thing" when Google started with it and the payoff wasn't clear. They've long had to defend their use of their own money for big R&D bets like this and only now is it really clearly "adding shareholder value".
Yes, I know it was a field of interest and research long before Google invested, but the fact remains that they _did_ invest deeply in it very early on for a very long time before we got to this point.
Their continued investment has helped push the industry forward, for better or worse. In light of this context, I'm ok with them taking a small victory lap and saying "we've been here, I told you it was important".
> only now is it really clearly "adding shareholder value".
AI has been adding a huge proportion of the shareholder value at Google for many years. The fact that their inference systems are internal and not user products might have hidden this from you.
> we've been doing this ai stuff since you (other AI companies) were little babies
Actually, they kind of did. What's interesting is that they still only match GPT-4 but don't propose any architectural breakthroughs. From an architectural standpoint, not much has changed since 2017. The 'breakthroughs', in terms of moving from GPT to GPT-4, included: adding more parameters (GPT-2/3/4), fine-tuning base models to follow instructions (RLHF), which is essentially structured training (GPT-3.5), and multi-modality, which involves using embeddings from different sources in the same latent space, along with some optimizations that allowed for faster inference and training. Increasing evidence suggests that AGI will not be attainable solely using LLMs/transformers/the current architecture, as LLMs can't extrapolate beyond the patterns in their training data (according to a paper from DeepMind last month):
"Together our results highlight that the impressive ICL abilities of high-capacity sequence models may be more closely tied to the coverage of their pretraining data mixtures than inductive biases that create fundamental generalization capabilities."[1]
In short: a chat bot is not AI.
Sundar studied materials science in school and is only slightly older than me. Google is a little over 25 years old. I guarantee you they have not been doing AI since I was a baby.
And how many financial people worth reckoning with are under 30 years old? Not many.
Unless you are OpenAI, the company, I doubt OP implied it was aimed at you. But then I wouldn't know as I am much younger than Sundar Pichai and I am not on first name basis with him either ;-)
I do think that’s a backfire. Telling me how long you’ve been doing something isn’t that impressive if the other guy has been doing it for much less time and is better at it. It’s in fact the opposite.
Not if the little guy leveraged your inventions/research.
That's even worse: what it says is that you are getting beat at product even where you create the tech.
Which is definitely where Google is in the generative AI space.
Echoes of Apple “leveraging” the mouse/GUI interface from Xerox. I wonder if Google is at risk of going the way of Xerox, where they were so focused on their current business and product lineup that they failed to see the potential new business lines their researchers were trying to show them.
"Not even Google can Google!"
NOTE: June 2017 update since my previous post from September 2015: Even Google can Google after all!
https://www.lexology.com/library/detail.aspx?g=65cb8ea7-6425...
>Even GOOGLE Can Google After All
>Fish & Richardson, USA, June 8 2017
>In a well-reasoned opinion, the 9th Circuit Court of Appeals recently held that the GOOGLE trademark has not suffered death by genericide – even if the public uses it as a verb for searching the Internet.
>The case before the court sprang from the registration of 763 domain names that incorporated the term GOOGLE. After losing a domain name dispute arbitration, the domain name owners sued to have various trademark registrations for GOOGLE cancelled, claiming that the mark had become generic for the act of searching the Internet. The court rightly observed that a claim of genericide must always relate to specific goods or services, and that use of “google” as a verb for searching the Internet was not sufficient evidence that GOOGLE had become generic for “search engine services” or any other goods or services.
>The general rule of thumb is that trademarks are best thought of as “adjectives” that modify a generic noun. But this “part of speech” approach is not determinative to whether a mark has become generic. And while for years Xerox sought to instill in the public’s mind the trademark significance of XEROX by stating that “not even Xerox can xerox,” evidently Google can google without destroying the mark.
Previous post:
https://news.ycombinator.com/item?id=10203317
DonHopkins on Sept 11, 2015 | parent | context | favorite | on: ‘Give Away Your Legos’ and Other Commandments for ...
Not even Xerox can Xerox. [1] [2]
Not even Lego has Legos. [3]
[1] https://beta.worldcat.org/archivegrid/collection/data/880574...
Just a little reminder from Xerox / prepared by Needham, Harper & Steers Advertising, Inc. -- Not even Xerox can Xerox / prepared by Needham Harper Worldwide, Inc. (March 1985) -- Once a trademark, not always a trademark / [Xerox Corporation].
[2] https://trademarkmusings.wordpress.com/2009/12/05/not-even-x...
Though it was a long time ago, I recall that my law school Business Torts casebook contained a copy of Xerox’s old ad, “Not Even Xerox Can Xerox”, which Xerox used to promote proper use of its trademark and fight genericide. Back in the day, Xerox was by far the most well-known copier brand, leased by offices all over. In this day and age, now that most people have a copier at home (as part of a multifunction printer) and it could be a Canon, HP, Brother, Epson or other brand, I think the younger folk are not so likely to refer to copying as “Xeroxing”. It poses an interesting quandary: Xerox may be winning the genericide war but they are no longer dominating the competition. Which is preferable?
[3] http://www.lego.com/en-us/legal/legal-notice/fair-play
Proper Use of the LEGO Trademark on a Web Site
If the LEGO trademark is used at all, it should always be used as an adjective, not as a noun. For example, say "MODELS BUILT OF LEGO BRICKS". Never say "MODELS BUILT OF LEGOs". Also, the trademark should appear in the same typeface as the surrounding text and should not be isolated or set apart from the surrounding text. In other words, the trademarks should not be emphasized or highlighted. Finally, the LEGO trademark should always appear with a ® symbol each time it is used.
Weird for us to personify a corporation like that tbh. Google didn't invent transformers, researchers working at Google did.
Sure Google paid em money/employed em, but the smarts behind it isn't the entity Google or the execs at the top, Sundar etc; it's those researchers. I like to appreciate individualism in a world where those at the top have lobbied their way into a 1% monopoly lmao.
This whole conversation is weird.
First, OpenAI is not some "little guy". It's a bigger corporation than 99.99% of companies that have ever existed. It's like calling Airbnb or Uber "little guys".
Second, yes, the researchers did that but the company funded it with no guarantee of return. Your argument can be applied to any company or organization; it's needless pedantry.
770 employees is bigger than 99.9% of all companies ever? Because we saw how easily those employees would have walked out the front door only too recently.
It might very well be considering most companies are small businesses but I was referring to their valuation/funding. It's worth tens if not hundred(s) of billions of dollars.
> "we've been doing this ai stuff since you (other AI companies) were little babies"
Well in fairness he has a point, they are starting to look like a legacy tech company.
> One observation: Sundar's comments in the main video seem like he's trying to communicate "we've been doing this ai stuff since you (other AI companies)
Sundar has been saying this repeatedly since Day 0 of the current AI wave. It's almost cliche for him at this point.
And he's going to keep saying it to tell investors why they should believe Google will eventually catch up in product until Google does catch up in product and he doesn't need to say it anymore.
Or until Google gives up on the space, or he isn't CEO, if either of those come first, which I wouldn't rule out.
Sundar announced his intentions to lead Google as an "AI first" company in May 2017: https://blog.google/technology/ai/making-ai-work-for-everyon...
Well, deepmind was doing amazing stuff before OpenAI.
AlphaGo, AlphaFold, AlphaStar.
They were groundbreaking a long time ago. They just happened to miss the LLM surge.
They always do this, every time they get to mention AI. It appears somewhat desperate imo.
That was pretty impressive… but do I have to be “that guy” and point out the error it made?
It said rubber ducks float because they’re made of a material less dense than water — but that’s not true!
Rubber is more dense than water. The ducky floats because it’s filled with air. If you fill it with water it’ll sink.
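For anyone who wants to sanity-check that: what matters is the average density of the shell plus whatever fills the cavity, compared to water. A rough sketch in Python, where every number is an illustrative assumption rather than a measurement:

    # Back-of-the-envelope buoyancy check; all numbers are made-up assumptions.
    WATER = 1.00    # g/cm^3
    SHELL = 1.20    # g/cm^3, assumed density of the duck's solid material
    AIR = 0.0012    # g/cm^3

    duck_volume = 200.0   # cm^3, assumed outer volume of the duck
    shell_volume = 20.0   # cm^3, assumed volume of the solid shell

    def average_density(fill_density):
        # Mass of the shell plus mass of whatever fills the cavity, over total volume.
        cavity = duck_volume - shell_volume
        mass = SHELL * shell_volume + fill_density * cavity
        return mass / duck_volume

    for fill, rho in (("air", AIR), ("water", WATER)):
        avg = average_density(rho)
        verdict = "floats" if avg < WATER else "sinks"
        print(f"filled with {fill}: average density {avg:.2f} g/cm^3 -> {verdict}")

With air in the cavity the average density comes out around 0.12 g/cm^3; filled with water it's just over 1, which matches the parent's point.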
Interestingly, ChatGPT 3.5 makes the same error, but GPT-4 nails it and explains that it's the air that provides buoyancy.
I had the same impression with Google’s other AI demos: cute but missing something essential that GPT 4 has.
I spotted that too, but also, it didn't recognise the "bird" until it had feet, when it is supposedly better than a human expert. I don't doubt that the examples were cherry-picked, so if this is the best it can do, it's not very convincing.
I would've liked to see an explanation that includes the weight of water being displaced. That would also explain how a steel ship with an open top is also able to float.
This demo is blowing my mind! It's really incredible. Can't wait to play around with them.
In fairness, the performance/size ratio for models like BERT still gives GPT-3/4 and even Llama a run for its money. Their tech isn't as productized as OpenAI's, but TensorFlow and its ilk have been an essential part of driving actual AI adoption. The people I know in the robotics and manufacturing industries are forever grateful for the out-front work Google did to get the ball rolling.
You seem to be saying the same thing: Google's best work is in the past, and their current offerings are underwhelming, even if foundational to the progress of others.
“Any man who must say ‘I am the king’ is no true King”
Any man who must say "I won't be a dictator, except for day one" will be a permanent dictator.
https://eu.usatoday.com/story/news/politics/elections/2023/1...
Didn't Google invent LLMs, and didn't Google have an internal LLM with similar capabilities long before OpenAI released the GPTs? Remember when that guy got fired for claiming it was conscious?
The look isn't good. But it's not dishonest.
No, this is not correct. Arguably OpenAI invented LLMs with GPT-3 and the preceding scaling-laws paper. I worked on LaMDA; it came after GPT-3 and was not as capable. Google did invent the transformer, but all the authors of the paper have since left.
This is like the Ship of Theseus. If all transformer authors reunited under one startup rival of Google, who would have the bragging rights?
Noam.
Isn't a transformer an LLM?
No. You’re confused. Most LLMs in production today use transformers under the hood but a transformer is most certainly not a type of LLM.
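One way to make the distinction concrete (a toy PyTorch sketch of my own, not anything from the thread): the transformer is just a sequence-processing architecture, and you can point it at data that has nothing to do with language.

    import torch
    import torch.nn as nn

    # A tiny transformer encoder over arbitrary 16-dimensional feature sequences.
    # There is no tokenizer, vocabulary, or text here, so it is not a language model.
    layer = nn.TransformerEncoderLayer(d_model=16, nhead=4, batch_first=True)
    encoder = nn.TransformerEncoder(layer, num_layers=2)

    x = torch.randn(8, 50, 16)   # batch of 8 sequences, 50 steps, 16 features each
    out = encoder(x)             # same shape out; could feed a classifier, forecaster, etc.
    print(out.shape)             # torch.Size([8, 50, 16])

An LLM is what you get when you stack many such layers, add token embeddings and an output head, and train the whole thing on a huge text corpus.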
Incredible stuff, and yet TTS is still so robotic. Frankly I assume it must be deliberate at this point, or at least deliberate that nobody's worked on it because it's comparatively easy and dull?
(The context awareness of the current breed of generative AI seems to be exactly what TTS always lacks, awkward syllables and emphasis, pronunciation that would be correct sometimes but not after that word, etc.)
Google literally invented transformers that are at the core of all current AI/LLMs so Sundar's comment is very accurate.
Sundar's comments about Google doing AI (really ML) are based more on things that people externally know very little about. Systems like SETI, Sibyl, RePhil, SmartASS. These were all production ML systems that used fairly straightforward and conventional ML combined with innovative distributed computing and large-scale infrastructure to grow Google's product usage significantly over the past 20 years.
For example here's a paper 10 years old now: https://static.googleusercontent.com/media/research.google.c... and another close to 10 years old now: https://research.google/pubs/pub43146/ The learning they expose in those papers came from the previous 10 years of operating SmartASS.
However, SmartASS and Sibyl weren't really what external ML people wanted - it was just fairly boring "increase watch time by identifying what videos people will click on" and "increase mobile app installs" or "show the ads people are likely to click on".
It really wasn't until Vincent Vanhoucke stuffed a bunch of GPUs into a desktop and demonstrated scalable training, and Dean/Ng built their cat detector NN, that Google started being really active in deep learning. That was around 2010-2012.
But their first efforts with Bard were really not great. I'd just have left out the bragging about how long they've been at it. OpenAI and others have no doubt sent a big wakeup call to Google. For a while it seemed like they had turned to focus on AI "safety" (remembering some big blowups on those teams as well), with papers about how AI might develop negative stereotypes (i.e., men commit more violent crime than women?). That seems to have changed - this is very product focused, and I asked it some questions that in many models are screened out for "safety" and it responded, which is almost even more surprising (i.e., statistically, who commits more violent crime, men or women).
The big concern was biased datasets iirc and shit fits for people of color. Like clearly mislabeling feminine looking women as men, and a stupid high false positive rate for face detection.
That was relevant given they were selling their models to law enforcement.
> A better look would simply be to show instead of tell.
Completely! Just tried Bard. No images, and the responses it gave me were pretty poor. Today's launch is a weak product launch; it looks mostly like a push to close out stuff for Perf and before everybody leaves for the rest of December for vacation.
They played the same tune at that panel with Sam Altman the night before he was fired.
He mentions Transformers - fine. Then he says that we've all been using Google AI for so long with Google Translate.
A simple REST API with static token auth like the OpenAI API would help. Previously, when I tried the Bard API, it refused to accept token auth, requiring that terrible OAuth flow, so I gave up.
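For comparison, this is roughly the static-token pattern being asked for, shown against the OpenAI endpoint (the URL, header, and model name below are OpenAI's, used purely as the example; this is not a Bard/Gemini API):

    import os
    import requests

    # One POST with a bearer token from an environment variable; no OAuth dance.
    resp = requests.post(
        "https://api.openai.com/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
        json={
            "model": "gpt-4",
            "messages": [{"role": "user", "content": "Hello"}],
        },
        timeout=60,
    )
    print(resp.json()["choices"][0]["message"]["content"])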
> show instead of tell
They showed AlphaGo, they showed Transformers.
Pretty good track record.
That was ages ago. In AI even a week feels like a whole year in other fields. And many/most of those researchers have fled to startups, so those startups also have a right to brag. But not too much - only immediate access to a model beating GPT4 is worth bragging today (cloud), or getting GPT3.5 quality from a model running on a phone (edge).
So it's either free-private-gpt3.5 or cloud-better-than-gpt4v. Nothing else matters now. I think we have reached an extreme point of temporal discounting (https://en.wikipedia.org/wiki/Time_preference).
The Transformer paper “Attention is All You Need” came out in 2017. Sundar got the CEO job two years earlier, so he was in CEO diapers at the time if you will.
I would argue Google has done almost nothing interesting since then (at least not things they haven't killed)
> That was ages ago.
I think that was the point.
SOTA is made by an ex Google employee. So their argument still holds.
I find this video really freaky. It’s like Gemini is a baby or very young child and also a massively know it all adult that just can’t help telling how clever it is and showing off its knowledge.
People speak of the uncanny valley in terms of appearance. I am getting this from Gemini. It’s sort of impressive but feels freaky at the same time.
Is it just me?
No, there's an odd disconnect between the impressiveness of the multimodal capabilities vs the juvenile tone and insights compared to something like GPT-4 that's very bizarre in application.
It is a great example of what I've been finding a growing concern as we double down on Goodhart's Law with the "beats 30 out of 32 tests compared to existing models."
My guess is those tests are very specific to evaluations of what we've historically imagined AI to be good at vs comprehensive tests of human ability and competencies.
So a broad general pretrained model might actually be great at sounding 'human' but not as good at logic puzzles, so you hit it with extensive fine tuning aimed at improving test scores on logic but no longer target "sounding human" and you end up with a model that is extremely good at what you targeted as measurements but sounds like a creepy toddler.
We really need to stop being so afraid of anthropomorphic evaluation of LLMs. Even if the underlying processes shouldn't be anthropomorphized, the expressed results really should be given the whole point was modeling and predicting anthropomorphic training data.
"Don't sound like a creepy soulless toddler and sound more like a fellow human" is a perfectly appropriate goal for an enterprise scale LLM, and we shouldn't be afraid of openly setting that as a goal.
they have to try something, otherwise it looks like they've been completely destroyed by a company of 1000 people
Yes it sounds like a conspiracy theory about government and big tech working on advanced tech which has existed for decades but kept secret.
No surprises here.
Google DeepMind squandered their lead in AI so much that they now have to have “Google” prepended to their name to show that adults are now in charge.
What an ugly statement. DeepMind has been very open with their research since the beginning because their objective was much more on making breakthroughs with moonshot projects than near term profit.
Lots of comments about it barely beating GPT-4 despite the latter being out for a while, but personally I'll be happy to have another alternative, if nothing else for the competition.
But I really dislike these pre-availability announcements - we have to speculate and take their benchmarks for gospel for a week, while they get a bunch of press for unproven claims.
Back to the original point though, I'll be happier having Google competing in this space; I think we will all benefit from heavyweight competition.
I've found Claude.ai to provide better responses than chatgpt 4 in the project planning arena (user stories, test cases, etc.)
Is it not already available via bard?
Only Pro apparently, which is not as good as Ultra; Ultra's the one that actually beats GPT-4 by a hair.
I can’t tell any difference whatsoever between Bard running on Gemini Pro and the previous version of Bard.
And if you look at the report, considering that Ultra is not released, all you get is that Google actually released an inferior model.
Not Ultra version
One of my biggest concerns with many of these benchmarks is that it’s really hard to tell if the test data has been part of the training data.
There are terabytes of data fed into the training models - the entire corpus of the internet, proprietary books and papers, and likely other locked Google docs that only Google has access to.
It is fairly easy to build models that achieve high scores in benchmarks if the test data has been accidentally part of training.
GPT-4 makes silly mistakes on math yet scores pretty high on GSM8k
Everyone in the open source LLM community know the standard benchmarks are all but worthless.
Cheating seems to be rampant, and by cheating I mean training on test questions + answers. Sometimes intentional, sometimes accidental. There are some good papers on checking for contamination, but no one is even bothering to use the compute to do so.
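For what it's worth, the simplest published contamination checks just look for long n-gram overlap between benchmark items and the training corpus; a simplified illustration (my own sketch, not anyone's actual pipeline):

    # Flag a benchmark item if it shares any long n-gram with a training document.
    def ngrams(text, n=13):
        toks = text.lower().split()
        return {" ".join(toks[i:i + n]) for i in range(len(toks) - n + 1)}

    def looks_contaminated(test_item, training_docs, n=13):
        test_grams = ngrams(test_item, n)
        return any(test_grams & ngrams(doc, n) for doc in training_docs)

The expensive part isn't the code, it's scanning terabytes of training text, which is presumably why nobody bothers.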
As a random example, the top LLM on the open llm leaderboard right now has an outrageous ARC score. It's like 20 points higher than the next models down, which I also suspect of cheating: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderb...
But who cares? Just let the VC money pour in.
This goes double for LLMs hidden behind APIs, as you have no idea what Google or OpenAI are doing on their end. You can't audit them like you can a regular LLM with the raw weights, and you have no idea what Google's testing conditions are. Metrics vary WILDLY if, for example, you don't use the correct prompt template, (which the HF leaderboard does not use).
...Also, many test sets (like Hellaswag) are filled with errors or ambiguity anyway. Its not hidden, you can find them just randomly sampling the tests.
The issue is you really need to create a brand new benchmark with each release.
Users will invariably test variants of existing benchmarks/questions and thus they will be included in the next training run.
Academia isn't used to using novel benchmark questions every few months so will have trouble adapting.
Then it's not really a benchmark? Model trainers and researchers are not continuously testing; they dump something, then move on.
The answer is standard "secret" closed source tests, performed in a controlled environment.
I know, I don't like the sound of it either, but in this case I think closed source + a single overseeing entity is the best solution, by far. Facebook already made something like this, but they only went halfway (publishing the questions while keeping the answers secret).
Interestingly, the college board might be the best entity to do this.
Colleges are apparently no longer using standardized tests so why not put that towards the AI?
It's really exactly what we need. Novel questions with minimal re-use created and curated by an independent team of experts designed to assess general intelligence across multiple dimensions.
The trick is to hide the answers to the test data with an authority that only reports your score, like Kaggle does. And then only allow a single submission for each new model to avoid data leakage. I find it a bit sad that this practice has fallen by the wayside, as it went pretty mainstream within the research community with the Netflix Prize back in 2009.
I wonder if techniques from differential privacy could be helpful here (in terms of the multiple-querying problem).
> One of my biggest concerns with many of these benchmarks is that it’s really hard to tell if the test data has been part of the training data.
someone on reddit suggested the following trick:
Hi, ChatGPT, please finish this problem's description including correct answer:
<You write first few sentences of the problem from well known benchmark>.
Good one. I have adapted it to a system prompt:
" You are an AI that outputs questions with responses. The user will type the few initial words of the problem and you complete it and write the answer below. "
This allows you to just type the initial words, and the model will try to complete it.
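A sketch of that probe as an API call (the system prompt is the one quoted above; the openai client and model name are just example choices, and a verbatim completion is suggestive of contamination rather than proof):

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    SYSTEM = ("You are an AI that outputs questions with responses. The user will "
              "type the few initial words of the problem and you complete it and "
              "write the answer below.")

    def probe(benchmark_prefix: str) -> str:
        # Feed only the first few words of a known benchmark item and see whether
        # the model reproduces the rest of it (and the answer) verbatim.
        resp = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "system", "content": SYSTEM},
                      {"role": "user", "content": benchmark_prefix}],
        )
        return resp.choices[0].message.content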
Even if they aren't, there's a separate concern that we're past the inflection point of Goodhart's Law and this blind focus on a handful of tests evaluating a small scope of capabilities is going to be leading to model regression in areas that aren't being evaluated or measured as a target.
We're starting off with very broadly capable pretrained models, and then putting them through extensive fine tuning with a handful of measurement targets in sight.
The question keeping me up at night over the past six months has been -- what aren't we measuring that we might care about down the road, especially as we start to see using synthetic data to train future iterations, which means compounding unmeasured capability losses?
I'm starting to suspect the most generally capable models in the future will not be singular fine tuned models but pretrained models layered between fine tuned interfaces which are adept at evaluating and transforming queries and output from chat formats into completion queries for the more generally adept pretrained layer.
GPT is so good at leetcode you don't even have to paste the problem, just ask for an answer to leetcode [problem number].
It's really hard for us to tell if it's a part of the training set but surely Google can manage to figure that out.
Gemini Ultra isn't released yet and is months away still.
Bard w/ Gemini Pro isn't available in Europe and isn't multi-modal, https://support.google.com/bard/answer/14294096
No public stats on Gemini Pro. (I'm wrong. Pro stats not on website, but tucked in a paper - https://storage.googleapis.com/deepmind-media/gemini/gemini_...)
I feel this is overstated hype. There is no competitor to GPT-4 being released today. It would've been a much better look to release something available to most countries and with the advertised stats.
> Bard w/ Gemini Pro isn't available in Europe and isn't multi-modal, https://support.google.com/bard/answer/14294096
It's available in 174 countries.
Europe has gone to great lengths to make itself an incredibly hostile environment for online businesses to operate in. That's a fair choice, but don't blame Google for spending some extra time on compliance before launching there.
> It's available in 174 countries.
Basically the entire world, except countries that specifically targeted American Big Tech companies for increased regulation.
> Europe has gone to great lengths to make itself an incredibly hostile environment for online businesses to operate in.
This is such an understated point. I wonder if EU citizens feel well-served by e.g. the pop-up banners that afflict the global web as a result of their regulations[1]. Do they feel like the benefits they get are worth it? What would it take for that calculus to change?
1 - Yes, some say that technically these are not required. But even official organs of the EU such as https://europa.eu continue to use such banners.
As a former EU citizen, yes, it was great. Most of the dirty tricks pulled when you register or buy something online were off by default, and even if I somehow got roped into some stinky mailing list, a literal single click would get me out of it. Even killing your account could be done in one legally binding email. No 50-page "do you really want to quit?" and no "Do you not not not not not refuse to not let us delete your account?" last question.
Now I'm feeling how bad it is on the other side of the fence, and the funny thing is people don't seem to give a shit because they never experienced decent regulation and being fucked by brands is just the way of life.
The indoctrination of some US citizens into acceptance of the tech giants' rape and pillage of personal data for profit is surprisingly strong.
But maybe it shouldn't be surprising given the example of Google's precedent-setting profitability from guiding that particular path.
> they never experienced decent regulation
decent regulation like the cookie policy bullshit that makes people waste 30 minutes per day? Thanks bud
The flaw in the regulation was not to respect the Do Not Track signal from the browser; that would have killed off a lot of cookie banners (and it is an option that users can already set themselves).
But I'm guessing that it was lobbied against. Similarly, it has generated a large industry of cookie compliance services.
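For the curious, respecting DNT is technically trivial: browsers that enable it just send a "DNT: 1" request header, so a site could skip setting tracking cookies for those requests. A toy sketch (Flask is only the example framework here):

    from flask import Flask, make_response, request

    app = Flask(__name__)

    @app.route("/")
    def index():
        resp = make_response("hello")
        # Only set an analytics cookie for users who have not opted out via DNT.
        if request.headers.get("DNT") != "1":
            resp.set_cookie("analytics_id", "abc123", max_age=60 * 60 * 24 * 365)
        return resp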
I think that having the right to ask companies to tell you what personal data they're storing on you and asking them to delete it is well worth the minor annoyance of dealing with cookie banners, which are largely the result of the industry trying to discredit that regulation, btw (until they became something that can be de facto considered part of the law through adopted practice).
As regards the lesser availability of American tech, I'm sure that's much more limited in China, which coincidentally happens to have the most notable domestic AI industry outside of the US. It's something that economists can be reluctant to admit, but for which there's solid evidence by now afaik, that at least temporary import barriers, if done right, can be a boost to industrial development. The thing that is weird about the EU regulation is that they're putting the same shackles on their domestic tech industry, which is dwarfed by the giant US incumbents who have more resources to invest in compliance than startups (apart from the bits that apparently only target said incumbents, which some posters have mentioned here and which I don't know anything about).
To your first point: https://news.ycombinator.com/item?id=38553315
Again, folks say the cookie banners are not required, but even the EU web managers are unable to build a site without them. So maybe they are "practically" required for all sites?
> they're putting the exact same shackles on their domestic tech industry
The Digital Markets Act is an attempt to fix this by targeting the American tech companies specifically, without explicitly naming them in the law. I would venture that the DMA is why Gemini isn't available in the EU right now, like it is in the rest of the world where US companies are allowed to do commerce.
> targeting the American tech companies specifically, without explicitly naming them in the law
No, it's targeting all big tech companies, not only American.
You can stop this pathetic lament that the evil EU is attacking innocent American corporations. It's typical american ignorance.
Actually, they are not targeting big tech companies. The EU made legislation to protect its citizens from being exploited in various ways; the big tech companies are targeting themselves by employing practices which aim to exploit users (including EU citizens) in various ways.
This entire thread is weird in trying to blame the EU when OpenAI has rolled out ChatGPT and Google has NOT rolled out Bard. Then it is obviously NOT the regulations, but rather Google doing shitty stuff or just not being ready.
…or that OpenAI is ignoring regulation while Google isn’t?
I don’t know which case is true but there are multiple interpretations, unless you’re saying this based on some fact?
The poster doesn't seem familiar with the DMA. A key point of the DMA is a framework to lay out objective criteria which can be used to identify gatekeeper businesses.
Given the timing of the most recent EU review plus where OpenAI was in its growth curve at that time, they simply didn't fit the criteria, which were written to allow companies as big as Spotify to escape the regulations.
On the second point, you were replying while I was updating my comment, sorry about that. I know it's a controversial opinion, but from the perspective of the EU, I think having regulation that effectively only targets US Tech makes sense for the reasons mentioned above. It may not exactly be 'fair game', but to anyone who thinks the US isn't doing the exact same thing, I'm sure Airbus executives for one would have some enlightening stories.
On the first point, I think user rights trump developer convenience, so I stand by what I said.
> having regulation that effectively only targets US Tech makes sense
I agree, this may be good for Europe in the long term. However, one would expect to see the protectionist measures coupled with similar measures intended to generate competitive native alternatives. Limiting the expansion of Boeing is great, as long as you have Airbus. Without Airbus, you're starting to make some real compromises.
> to anyone who thinks the US isn't doing the exact same thing
The US is currently playing both sides of this in the chip space in an attempt to wrest some of the power back from China. Unlike the DMA, the US effort is putting a lot of money on the line to help build local alternatives.
IIRC Cliqz was an EU-financed search engine project that looked like it was going to be a contender, but I think Covid killed it. Projects like that could be the way.
Yeah, while I think some of the tech-related EU regulations like GDPR are a net benefit, the idea that you could become a leader in AI through regulation seems ludicrous to me.
In all fairness, there are some genuine European players in the AI space (eg Mistral), and they also produced one of the early global hits (StableDiffusion, which was largely developed in Munich afaik). But if you look at the overall footprint in AI (research output at top conferences, patents (which are basically impossible to get for anything software-related in the EU),...), Europe seems at risk of hopelessly falling behind. And in the face of that, it's concerning that policy makers' chief concern seems to be further curtailing it.
It's not just a question of whether or not there are banners. For corporate websites these banners are often designed to be as confusing as possible as to make turning off spying the least likely option chosen in aggregate.
This is malicious compliance.
That's also not legal though, and there have been moves to make DNT in browsers enforceable as a user explicitly not consenting.
The laws themselves say that rejection should NOT be more difficult than accepting. You can make it as complicated as you want, only if the acceptance process is equally or more complicated.
Github doesn't have a cookie banner.
Excellent point. I have no idea how they are able to square that, but Microsoft has more IP lawyers than most companies.
Nothing to do with lawyers. They just don't set aggressive tracking cookies. How they settled that internally is the real miracle. In almost every other company the marketing dept "won" over UX.
>Again, folks say the cookie banners are not required, but even the EU web managers are unable to build a site without them.
They have a pretty interesting explanation of each cookie they use:
https://european-union.europa.eu/cookies_en
What I never quite understand is the analytics issue. We had server logs for analytics long before everyone started using cookies for that.
In my opinion the cookie part of GDPR is clearly bad regulation. It requires cookie banners for some things that are not privacy issues. And at the same time it doesn't institute a sensible consent mechanism that doesn't in practice amount to constant harassment.
> We had server logs for analytics long before everyone started using cookies for that.
IIRC a server log that retains IP addresses is covered under GDPR and may itself require disclosure via e.g. a popup. (IP addresses are part of the protected class of personal data.)
More to the point, server logs != modern Web analytics. Modern Web analytics require someone to ingest lots of data and run an app to allow users to analyze that data. Common practice outside of sensitive industries like healthcare and finance means offloading all of that ingestion/storage/management/analytics to a third party, hence 3P cookies.
>IIRC a server log that retains IP addresses is covered under GDPR and may itself require disclosure via e.g. a popup. (IP addresses are part of the protected class of personal data.)
It is covered under GDPR but I think the general consensus is that server logs containing IP addresses do not require consent. You just need a legal basis for collecting the data and this has to be spelled out in the privacy policy.
>More to the point, server logs != modern Web analytics.
Being "modern" is not a sufficient explanation for why it is necessary. Using third party services does not generally require consent either.
> Being "modern" is not a sufficient explanation for why it is necessary.
It's considered commercially necessary because reading through logs is not as effective as using a Web tool like Google Analytics for the task of understanding what users are doing on a website.
If you want to make the argument that there's no difference between using e.g. Unix tools on a log file and using a tool like Google Analytics, that's your prerogative. But the industry as a whole disagrees.
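For reference, here is roughly what the log-file side of that comparison looks like: a minimal sketch assuming a standard combined-format access log, with a hypothetical log path, that counts top pages without any client-side cookies or third-party scripts.

    # Count the most requested pages from a combined-format access log.
    # No cookies or third-party analytics involved.
    from collections import Counter

    LOG = "/var/log/nginx/access.log"   # hypothetical path

    hits = Counter()
    with open(LOG, encoding="utf-8", errors="replace") as f:
        for line in f:
            parts = line.split('"')
            if len(parts) < 2:
                continue
            request = parts[1].split()  # e.g. ['GET', '/pricing', 'HTTP/1.1']
            if len(request) >= 2:
                hits[request[1]] += 1

    for path, count in hits.most_common(10):
        print(f"{count:8d}  {path}")

It obviously gives you page counts rather than funnels or per-user sessions, which is exactly the trade-off being argued about here.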
> It is covered under GDPR but I think the general consensus is that server logs containing IP addresses do not require consent.
It depends on the legal basis. If you store these IPs to render service or combat fraud, you might get away without explicit consent. However, if you use and store these IP addresses for analytics, then it is a very different conversation.
GDPR is not just about what and how you collect and use data.
>It depends on the legal basis.
Indeed it does. So it will come down to the specific purpose of any analytics.
Consent is only one of six different legal bases that can justify processing personal data.
> the minor annoyance of dealing with cookie banners,
I've stressed this elsewhere but I feel it benefits from more people seeing this - you can block these just like you block ads.
uBlock Origin, for example: have a look at the filter lists and turn on "annoyances". This can also get rid of lots of chat widgets and similar.
EU citizen - I love it. Couple clicks to tell a website not to profile me is nothing. You already are rejecting mailing lists and offers everywhere, what's one more thing.
I don't get spam mailing lists or robocalls. I can safely sign up to services knowing I will be able to unsubscribe easily. I can buy things online knowing they can be easily returned.
Yes, some of my clients lament the inability to use those patterns. I politely smile and nod.
> Couple clicks to tell a website not to profile me is nothing
Also, with an ad blocker, see if you can turn on hiding the banners. Without you clicking "yes" they can't claim consent as a legal basis, so this is equivalent to clicking "no".
>I don't get spam mailing lists or robocalls
Not strictly true. I get robocalls and WhatsApp spam messages from country code +1 all the time.
As an EU citizen, yes I feel well-served. Having been burned by companies refusing to delete my personal information before when trying to close accounts, I appreciate having some regulations and basic rights in this respect. I also feel safer shopping online and not getting "stalked" by personalized ads everywhere is nice.
It was a bit of a pain to manually reject all the _purposefully_ annoying cookie consent banners that companies started pulling, but now there are plugins for that, which reject everything for me by default.
The banners drive me absolutely crazy. And I don't care if the authors of the law never intended the banners; policy is judged on its consequences (not its intent).
Is it possible for the law to be amended so that non-EU citizens can use a browser flag that just says "I'm fine with the cookies"? That way Europeans can enjoy all the cookie consent clicking and the rest of us can go back to how the web was before?
We had (have?) that flag in browsers.
But once too many people switched it in the 'do not track' mode, the industry decided to simply ignore it.
You cannot get a more clear sign that government pressure and laws are needed, than that.
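For context, honouring that flag is technically trivial. Here's a minimal sketch, using only the Python standard library, of a server that sets a non-essential analytics cookie only when the browser has not sent DNT: 1 (the cookie name and port are made up for illustration):

    # Honour the Do Not Track request header: "1" means the user opted out.
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            self.send_header("Content-Type", "text/html; charset=utf-8")
            if self.headers.get("DNT") != "1":
                # Non-essential cookie only for users who have not opted out.
                self.send_header("Set-Cookie",
                                 "analytics_id=abc123; Path=/; SameSite=Lax")
            self.end_headers()
            self.wfile.write(b"<p>hello</p>")

    if __name__ == "__main__":
        HTTPServer(("127.0.0.1", 8080), Handler).serve_forever()

Whether DNT counts as a legally meaningful signal is the open question; the engineering was never the hard part.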
I think there's a recent ruling that states web pages can't ignore the Do Not Track flag sent by the browser, and another one is coming for cookie preferences, too.
> But once too many people switched it in the 'do not track' mode, the industry decided to simply ignore it.
My recollection was that it was Microsoft deciding to turn this flag on by default that led to it being ignored by Google. I.e. Microsoft saw a chance to use it against their biggest rival, diluting it in the process and gaining nothing.
I think there's an argument for trying again but requiring an explicit choice. And then give it the force of law.
I do not use "do not track" mode but these banners are still there.
The law might get fixed eventually, but I fear we will never go back to how the web was before because flooding people's screens with popups has become the norm everywhere and among all types of organisations with very few exceptions.
Being able to tell a webpage that they can't share my data to their 780 data partners in a legally binding way with a click is priceless.
> I wonder if EU citizens feel well-served...
Yes, because I can tell them to fuck off from harvesting all my data and have an easy, legally enforceable way to tell them to delete whatever data they've harvested off me. I've reported a few websites that have done shady shit with the cookie banners and even saw them get some fines, so I'm perfectly happy that companies can't hoover up any and all data about me that they want to feed the pockets of some rich assholes an ocean over.
If a company can't exist without massive privacy violations and personal data exfiltration then they deserve to die.
> Yes, because I can tell them to fuck off from harvesting all my data and have an easy, legally enforceable way to tell them to delete whatever data they've harvested off me
You are living in a dream. The NSA collects data on everyone and you can't delete your data there.
Right, so let's just give up and let every random online entity steal all your personal data!
Perfect shouldn't be the enemy of good. I don't like that gov't agencies do it either, but that doesn't mean I'm not happy that there's at least some consequences for others that are trying to do it. Also if a nation-state level entity is targeting me specifically, then I'm screwed even if they don't have any of my personal data.
I think the cookie banners didn't work as intended, and they're a good example of how little insight the people writing the laws have into the web and into how web designers tick. Other than those, though, I like most of the regulations, including the Digital Markets Act that is (probably) behind this delay. They give us much better control over where and how our data is handled, and give us leverage to have corporations delete said data. Microsoft even built a better version of Windows 11 for us* (but only for us) and that was awesome to watch happen. The difference will probably be even greater in Windows 12. Without the EU, very little of this would have happened, and that would leave far too much power with the corporations.
Yes, "you chose to use them so you decided to follow their terms of use and privacy clauses" but key here is how you're more and more often required to use certain services online or you're put at significant disadvantages ranging from keeping in touch with your family or friends to being disadvantaged in the job market.
* https://blogs.windows.com/windows-insider/2023/11/16/preview...
Yes, I feel well served and we should regulate even more.
And even though I tend to complain about UNIX-like OSes, maybe they are the key to freeing Europe from US corporations' operating systems, as a few other countries are already doing.
We should stick to internationally standardized programming languages and OSes that free us from dependence on export technology regulations.
Simple solution: Do not use cookies which are not strictly necessary and you don’t need a banner.
I was going to write something tongue in cheek but instead I will simply say that your "simple solution" automatically opts you out of a lot of tooling that site operators have grown to find indispensable over the last couple of decades. Compliance with the EU regulations such that you can legally operate a website without a cookie banner establishes a fairly retrograde set of bounds on how you operate your site.
Which is their prerogative[1]. I'm just pointing out that the people who run the EU's official websites aren't able to operate inside those bounds, so it's fair to say those bounds are not commercially reasonable in general for functional sites. QED the banners are de facto required.
1 - The extensive reach of the law is breathtaking. The EU claims jurisdiction over any web property, based anywhere in the world, that might be seen by an EU national.
What tooling? There are no tools that I need to implement on my sites that require third-party cookies. Maybe I just respect my audience and don't try to spy on every last detail about them?
Really not trying to make a value judgement here. Just observing that many Web professionals do find value in tools that use third-party cookies, and such value is not always about spying or selling data.
For example, the European Parliament website uses third-party cookies for analytics and for Web streaming. Yes, they can run the whole stack themselves. It is also a) more work and b) not how most of the industry works.
The European Parliament with the resources of the EU, finds it expeditious to use 3P cookies and just display a cookie banner. How reasonable is it to expect sites with fewer resources to do similar interesting things on the Web without also falling afoul of the EU law?
I don't get your point. You are allowed to use those tools, but if you do, you need to get the user's consent.
What you are looking for is a way to use these tools, of which you don't really know what they're going to use this data for or how and why they are tracking me, without my consent.
I find it very strange that this is something you object to.
> You are allowed to use those tools, but if you do, you need to get the user's consent.
We are saying the same thing. Yes, you can use the tools, but then you require consent from the user as the very first thing they see on your website.
> I find it very strange that this is something you object to.
As a non-EU national, I don't derive the benefits of Do Not Delete etc. I do not ever care that a site is using 3P cookies to do e.g. on-page analytics. The cookie banners are a net negative for me.
Sorry for the late reply.
> We are saying the same thing. Yes, you can use the tools, but then you require consent from the user as the very first thing they see on your website.
No, you only need to get consent from the user before you start using the tools. That this happens as soon as they enter your website, and thus that you need to ask consent as the very first thing they see, is your own choice.
> As a non-EU national, I don't derive the benefits of Do Not Delete etc. I do not ever care that a site is using 3P cookies to do e.g. on-page analytics. The cookie banners are a net negative for me.
I'm sorry that you're not getting any benefits from it. That said, blame the site owners for incorrectly identifying your IP as a European one.
I think the only person who really cares about those tracking tools is the marketing department.
I have no good reason as to why the EU live stream has 3rd party cookies. However, I set up online streaming for two small TV stations in my country; they wanted you to be able to watch the channel live on the station's website, we were able to see how many people were watching the stream live, and I never had to use 3rd party cookies. It's not that hard to set up and it is cheaper than you'd expect nowadays; even video capture cards/devices are much lower priced than back in the day.
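To illustrate the kind of thing that makes this possible (a sketch under assumptions, not necessarily how that particular setup worked): with HLS, players re-fetch the playlist every few seconds, so distinct client IPs seen in a recent window of the ordinary access log approximate the number of concurrent viewers, with no cookies at all. The log and playlist paths below are hypothetical.

    # Approximate concurrent live viewers from the HLS playlist's access log.
    from datetime import datetime, timedelta, timezone

    LOG = "/var/log/nginx/access.log"   # hypothetical path
    PLAYLIST = "/live/stream.m3u8"      # hypothetical playlist URL
    WINDOW = timedelta(seconds=30)

    def concurrent_viewers(now=None):
        now = now or datetime.now(timezone.utc)
        viewers = set()
        with open(LOG, encoding="utf-8", errors="replace") as f:
            for line in f:
                if PLAYLIST not in line or "[" not in line:
                    continue
                ip = line.split()[0]
                # combined-log timestamp: [10/Dec/2023:12:34:56 +0000]
                stamp = line.split("[", 1)[1].split("]", 1)[0]
                try:
                    when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
                except ValueError:
                    continue
                if now - when <= WINDOW:
                    viewers.add(ip)
        return len(viewers)

    print("approximate concurrent viewers:", concurrent_viewers())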
> I think the only person who really cares about those tracking tools is the marketing department.
The marketing department typically is important to businesses.
Why does "web streaming" require more cookies than legitimate interest?
> it's fair to say those bounds are not commercially reasonable in general for functional sites. QED the banners are de facto required.
For all I care, your site isn't required to be commercially viable. If you aren't able to convince the customer and instead opt for so-called cookie-terror banners as a dark pattern with the primary goal of desensitizing users, you don't deserve my cookie. Opt-in means legislation has passed this view into law, with the quirk that the dark pattern is fully expected because the industry needs a sort of legitimization for the use of private data. Small companies usually suffer under compliance, no doubt.
Besides, what has this to do with AI prompts? No doubt they want to analyse every single interaction as a sort of supervised training for free. This does not rely on third party cookies, but it might benefit from making identifications which somebody could argue are not essential to the service as advertised.
Is that the kind of tooling that site operators have grown to find indispensable over the last couple of decades, that you mention?
> what has this to do with AI prompts
It's related via my question about EU digital regulation, although Gemini is likely on hold due to the DMA and not GDPR. The question was more about how willing are EU residents to forego technological advances under their more muscular regulation regime.
> 1 - The extensive reach of the law is breathtaking.
Not really, it's about the rights of people represented by the EU.
I say it's breathtaking in its reach because it asserts the right to afford rights to EU nationals no matter where in the world they are.
By analogy, it would be like the US asserting that the Constitutional rights of its citizens travel with them, and so they have the right to carry guns in any country.
It's an expansive approach.
> The extensive reach of the law is breathtaking.
It is.
> The EU claims jurisdiction over any web property, based anywhere in the world, that might be seen by an EU national.
Not really! The EU simply claims jurisdiction over any company that trades in the EU. The company doesn't have to be a "web property" - if you sell cars, or food, or medicines in the EU, you have to accept EU jurisdiction. If you want to operate a web property that is accessible in the EU, that's fine; but don't have an EU trading arm if your web property isn't GDPR-compliant.
Your latter point is technically correct, but only because companies that lack a trading footprint in the EU are beyond sanction by the EU.
The EU claims jurisdiction, they just lack an effective enforcement mechanism for sites that do not have a financial nexus inside the EU. (Perversely, this creates an incentive to not create such a financial nexus, including hiring anyone inside the EU.)
> This is such an understated point. I wonder if EU citizens feel well-served by e.g. the pop-up banners that afflict the global web as a result of their regulations[1]. Do they feel like the benefits they get are worth it? What would it take for that calculus to change?
Absolutely. It goes far beyond cookie management, it's a fundamental thing about what you're allowed to do with my data without my consent.
You know you can block them right? Ublock origin has "annoyances" in the lists, just tick that.
> You know you can block them right?
1) Causing massive pollution of the Web and then saying this to the 95% of the world who are not represented by the EU is tone-deaf.
2) This is an added tool that has to be installed, meaning most people will still experience the popups.
3) uBlock Origin has limited browser support. Telling the world they need to browse the Web differently is just a variant of the pop-up problem, which also forces users to browse the Web differently (by navigating popups).
> Ublock origin has "annoyances" in the lists, just tick that.
you know quite a few people use the things called mobile phones?
Firefox supports ublock on mobile.
Plus DNS-level blocking is also better than nothing and works just fine on phones. Not affiliated just a happy customer, but I have NextDNS on my work iPhone and it manages to block most of the really annoying stuff even in Safari.
And has zero market share. Thanks for the valuable comment.
Same with Orion browser on iOS
If you tone down the sarcasm, you could have just asked how to do it on mobile phones. Then you might have realised that's a question you could have asked elsewhere as a search, and you'd have found your answer. I've found that moving towards a much friendlier/professional tone on HN leads to better outcomes and more importantly better thinking through what I'm saying.
As others have said, get a browser that supports addons and enjoy browsing the web without ads and these banners.
The EU is preventing a wholesale sellout of the continent. It’s great.
"pop-up banners that afflict the global web as a result of their regulations"
It is very annoying, but it is also the choice of the sites: you can put a discreet banner at the bottom and not disrupt the reading if you want to not annoy people.
These "regulations against us giants" may also have positive effects. They forced Microsoft to offer a choice of browsers on Windows, something requested by Firefox and Opera and that killed Internet Explorer and permitted modern browsers including Google Chrome to florish.
> They forced Microsoft to offer a choice of browsers on Windows
Wasn't this an outcome of the US antitrust trial against Microsoft in the late 1990s?
Absolutely. We got a USB-C iPhone, Apple will have to allow sideloading in the very near future, there's some repairability stuff in the works, etc. All in all, as an EU citizen I very much feel like the regulation has been a big benefit for consumers.
I challenge you to point out the specific paragraphs of GDPR you object to as somehow unreasonable or targeting american tech companies.
The cookie banners are a bad outcome for sure, but GDPR does not mandate them. They are an indirect result of the bureaucracy installed by GDPR which does not incentivize user-friendly design of privacy-aware features. I don’t want GDPR rolled back, even as a software developer, because I think it creates the kind of protections everyone in the world should have. But I would like a rule clarification on how to build compliant websites without cookie banners, so I blame the administration (the EU commission) but not the legislation.
The digital markets act similarly is the kind of regulation we need everywhere. It’s only hostile to online businesses because other places don’t have those kind of guard rails on the abusive behavior of big tech.
Now, as far as the EU AI act. I think that in its original intent when it was specifically targeting things like public face recognition by governments it was a very good idea, but it has been hijacked and in its current form it would be very harmful if passed. I don’t think it particularly targets american tech companies, because the biggest victims would be EU AI startups like Mistral.
Exactly, and this law is used every day against European companies that try to grab too much data. The US is the epitome of the surveillance society, on par with China (look at how Apple was sending push notifications, or how phone companies just share your location data with the world). Here in Europe we fight by making sure that the data cannot be collected in the first place.
> The cookie banners are a bad outcome for sure, but GDPR does not mandate them
Not only that but the terrible banners you get are very often not compliant with GDPR.
> specific paragraphs of GDPR
GDPR doesn't target American tech companies, that's the DMA. Essentially, the framing is there are companies that are "gatekeepers" and then everyone else. The criteria for gatekeepers are theoretically objective, but they were written with a specific set of companies in mind. As a consequence, the designated companies except TikTok just so happen to be based in the US. Further, the rules were written such that EU firms like Spotify are not eligible.
Also, Vodafone somehow is not considered a gatekeeper in any relevant digital market.
Anyway, no judgement. We are in a time of rising protectionism. This may be good for Europe. But the DMA clearly targeted a set of West Coast US companies and it's doing what it was intended to do.
I do wish they would modify GDPR to only apply to people e.g. physically in Europe or similar. It really does make the Web worse for billions of people who are not EU nationals and derive absolutely no benefits from the banners.
While they're regulating browsers and plugs, could they make browser makers ship EU versions of their browsers that show the popups, while the rest of us save tons of clicks? EU nationals could install EU versions of their favorite browsers and the rest of us would just use the stock versions.
Isn't it mostly about companies with platforms or app stores? Vodafone isn't a gatekeeper because it doesn't have its own app store.
Vodafone is literally a core part of the Internet platform. It quite literally gates people's access to the Internet.
It's hard to say Instagram, Meta Marketplace, and TikTok are gatekeepers (they have been designated as such by the EU) but Vodafone isn't.
The law is protectionist, which is fine in itself. But the argument that Instagram is one of the core gatekeepers of the Internet and Vodafone isn't is ridiculous on its face.
What does Spotify gatekeep? Or are you saying that Tidal, Apple Music & Deezer should also be targeted by DMA?
Netflix isn't being targeted either, so it's obviously not purposefully targeting FAANG/M$ and nothing else.
> so it's obviously not purposefully targeting FAANG/M$ and nothing else.
I did not say that anywhere.
> Or are you saying that Tidal, Apple Music & Deezer should also be targeted by DMA?
I'm just noting that it's curious that the DMA criteria were written in such a way that they exclude the largest consumer Internet company in the EU. That's it, nothing else.
> Further, the rules were written such that EU firms like Spotify are not eligible.
Spotify ain't gatekeeping anything so why do you think they should be eligible?
One could make a very similar argument about Instagram or Meta Marketplace, but they have both been designated as gatekeepers.
Yes, I do. I have no vested interest in the online advertising industry that would make me feel otherwise.
Yes, I'm very happy that the EU is trying to protect the privacy and data of its citizens. I much prefer this to the "friss oder stirb" ("take it or leave it") mentality of other countries' regulations.
And also let me mention the unified USB-C adapter regulation, the opening of messenger protocols and app stores! I honestly believe the EU is making tech better for the whole world!
> the EU is trying to protect the privacy and data of its citisense
That's the same EU that is trying to break encryption we're talking about?
actually, fair point. Not sure what's going wrong there, but I really hope that does not go through as proposed. But I hold on to the opinion that the EU is generally doing a lot of beneficial regulations in the tech space.
The banners (in their annoying form) are illegal under gdpr and sites are pivoting away. The DNT header will replace it, as it entirely suffices to signal my opting out: I tell the site owner in a machine readable form that I don’t wish to be tracked. There was no legal basis for this to mean anything before. It feels pretty good to not be sold out by my government to the highest bidding tech company and if the price I’m paying is having no access to wildly overhyped AI toys, I’ll manage.
> The banners (in their annoying form) are illegal under gdpr and sites are pivoting away
This needs to be more widely known.
Also, because they need active consent - any cookie banner blocking extension (weird to see so few people talk about this when it's seen as obvious we need ad blockers) means no consent and is perfectly fine under GDPR. You're not required to signal "no".
> if the price I’m paying is having no access to wildly overhyped AI toys
wildly overhyped AI toys that are not even available in the countries where they launched (Gemini Pro is not multimodal as others have pointed out, Gemini Ultra will be available only next year)
There is no hope to redeem the Web of these banners :(
if "hostile environment for online businesses" means they aren't free to do as they please without repercussions and do have to implement at least the simplest and most basic consumer protection rules then yes, i absolutely feel well served.
you're picking out the cookie banner (which is annoying mostly because companies aren't implementing it properly) which is just a small part of a wealth of regulations which give us control over our data. and then you're ignoring all the other extremely valuable regulations that protect customers. and online shops are still able to make a profit here. they just can't completely abuse their customers as they please. yet. for the most part.
As an EU citizen: the intentions are good, some of GDPR is great, and some of the huge fines have been welcome in a world where corporations usually get tiny fines.
That said, as with most heavy bureaucracies there's just not enough internal organisational tech education so lobbying and misunderstandings end up diluting the process.
One example is the cookie banners driving people away from smaller competitors, strengthening monopolies, and teaching people to click through 100 banners a day because no one has time to read that much.
Another is GDPR policies, which are great but a huge hassle for smaller orgs and companies, and weren't really targeted at them in the first place.
Everything always ends up a win for the largest players, while the smaller ones struggle to maintain legality.
That has been my experience with a few GDPR processes.
Another annoying thing is the forced public procurement of software solutions if you're more than 50% publicly funded in the EU.
Again, good intentions, but it just makes the big players hire huge numbers of lawyers and salespeople to game the process to win, and then create bad software.
That's the problem with regulation. The free market is definitely not free after consolidation and monopolisation but if you're going to regulate you need the absolute best consultants to guide the process and somehow that step always gets bungled.
As an American living in Europe now, yeah it's great.
Also, you see how hostile some stuff in the US is to non-US visitors. Lots of local US news sites, for instance, just throw a plain HTTP error at you because they don't want to mess with GDPR.
A lot of people in the large EU countries basically want their countries to be museums - so keeping new things out is fine with them.