Tech Investments

AI Outlook & Nvidia

A tour of tech and AI

Tech Fund
Nov 23, 2025

This week we got what looks like a major new breakthrough with the release of Gemini-3. The model blew past all competition especially on the ARC-AGI-2 benchmark. This test measures fluid intelligence, i.e. the ability of a model to solve tasks and types of puzzles it has never seen before.

Similarly, Gemini-3’s performance on an advanced physics test was a clear outlier:

[Chart: Gemini-3 results on an advanced physics benchmark]

The main strength of LLMs so far has been memorized knowledge. So, they do very well on benchmarks that test what the AI already knows, such as SAT or bar exams, or writing smaller pieces of code (e.g. 100 to 500 lines). With Gemini-3’s more advanced reasoning capabilities, it should perform markedly better at more complex tasks such as understanding larger code bases, data sets and libraries of scientific work, with an ability to then generate novel and innovative hypotheses.

The other important part to focus on in the first ARC-AGI-2 chart is the x-axis, which illustrates that Gemini-3-Pro can do this more complex reasoning at a similar $ cost to Claude Sonnet or ChatGPT-5. Gemini-3-Deep-Think has even more advanced capabilities; however, its $ cost is currently also extremely high. So, for the moment, it will only be applied to more selective workloads that really need and can afford these capabilities. However, Gemini-3-Pro shows that pretty advanced reasoning is already widely accessible at reasonable cost.

Google also released its new IDE (coding editor), Antigravity, to integrate Gemini-3 directly into coding projects. We spent several days this week coding a native mobile app in Antigravity and it definitely has a lot of innovative features, although it’s also still fairly rough around the edges. Overall, the codebase it wrote is extremely well organized, the code is high quality and secure, and our MVP app is getting close to ready.

What’s really good is that Antigravity creates a coding plan at each step, which the developer can then correct or validate. This way, it avoids dumb setups being introduced into the codebase, which can cause a lot of problems later. The main problem is that Gemini-3 still regularly crashed, forcing us to restart it and try again. Occasionally, it also made a few dumb mistakes, but these could easily be corrected via the coding plan. In fact, all the feedback we gave to the coding agent could likely be automated by another smart coding agent. So, it’s fairly easy to see how in the coming months agents can start taking off, working independently for hours on coding projects and producing useful work.
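To make that concrete, here is a minimal sketch of what such an automated plan-review loop could look like. This is purely our own illustration of the idea, not Antigravity’s internals: `coder` and `reviewer` stand for any two LLM-backed callables (Gemini, Claude, or anything else you wire in).

```python
# Minimal sketch of an automated plan-review loop between two coding agents.
# Our own illustration, not Antigravity's actual workflow: `coder` and
# `reviewer` are placeholders for any two LLM-backed callables.

from typing import Callable

MAX_ROUNDS = 5

def run_plan_loop(task: str,
                  coder: Callable[[str], str],
                  reviewer: Callable[[str], str]) -> str:
    """Iterate until the reviewer approves the coder's plan, then return it."""
    feedback = ""
    for _ in range(MAX_ROUNDS):
        plan = coder(
            f"Task: {task}\nPrior feedback: {feedback}\n"
            "Write a step-by-step coding plan."
        )
        verdict = reviewer(
            f"Task: {task}\nProposed plan:\n{plan}\n"
            "Reply APPROVE if sound, otherwise list concrete corrections."
        )
        if verdict.strip().upper().startswith("APPROVE"):
            return plan  # hand the validated plan to the code-writing step
        feedback = verdict  # feed the critique back into the next draft
    raise RuntimeError("No approved plan within the round limit")
```

The point is simply that the human’s role in our workflow, reading a plan and either approving it or listing corrections, maps naturally onto a second agent in a loop like this.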

Overall, having used both Antigravity and OpenAI Codex over the past few weeks, our long term outlook on AI agents and the amount of useful work they can produce is definitely bullish.

We’ve always been fans of Google. We first invested in the company in 2010, so this has been roughly a 20-bagger. However, we’ve been reducing our position this year as 60%-plus of its profits stem from traditional Google Search, and it looks very likely to us that this business will gradually get disrupted in the coming years. The company’s bread-and-butter monopoly in retrieving 10 blue links will increasingly become a legacy business, while in next-gen AI interfaces the company now faces a number of strong competitors such as Meta Llama, Grok, ChatGPT, Claude, Perplexity, Qwen, DeepSeek, Mistral, etc.

Thus, as the stock was getting pricey again at close to 28x forward PE – despite the company facing disruption risks in its key business – we recently halved our position. However, this week, we got a real showcase of the bull case for Google. Obviously, the company has a deep bench of strong AI research talent, combined with some of the widest access to AI training compute. In addition, it’s the only company that has successfully built a full AI stack, including its own custom accelerator, the TPU. While estimates vary, Google should have a strong cost advantage thanks to its ability to train models on its custom TPU clusters. Firstly, Google doesn’t have to pay Nvidia’s 75% gross margin, and secondly, TPUs consume about 30% less power for equivalent workloads.
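To illustrate the magnitude, here’s a rough back-of-envelope calculation. Only the 75% gross margin and the ~30% power saving come from the figures above; the accelerator price, the overhead Google pays above raw silicon cost, and the lifetime power bill are purely illustrative assumptions.

```python
# Back-of-envelope illustration of the TPU cost argument above.
# Only the 75% gross margin and ~30% power saving come from the text;
# the $30k GPU price, the 1.3x overhead on raw silicon cost, and the
# $10k lifetime power bill are illustrative assumptions.

gpu_price = 30_000                      # assumed price of a comparable Nvidia accelerator
nvidia_gross_margin = 0.75              # from the text
silicon_cost = gpu_price * (1 - nvidia_gross_margin)   # ~$7,500 cost of goods

tpu_capex = silicon_cost * 1.3          # assume Google pays ~30% overhead above raw cost
gpu_power_cost = 10_000                 # assumed lifetime electricity per accelerator
tpu_power_cost = gpu_power_cost * (1 - 0.30)            # ~30% less power, from the text

gpu_tco = gpu_price + gpu_power_cost    # 40,000
tpu_tco = tpu_capex + tpu_power_cost    # 16,750

print(f"Illustrative TPU cost per equivalent workload: {tpu_tco / gpu_tco:.0%} of GPU")
# -> roughly 42% under these assumptions
```

Under these assumptions the TPU’s total cost per equivalent workload lands at well under half of the GPU’s, though the real figure obviously depends heavily on the inputs.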

Currently, it definitely looks like OpenAI is falling behind. ChatGPT-5 is a fine model but wasn’t really much of an improvement vs GPT-4. They basically added a simple LLM in front of it which then decides whether the query should be routed to the ‘fast’ or ‘thinking’ model. While some tried to hype it as a major breakthrough, frankly, it’s something we could have easily coded ourselves. At this moment, it doesn’t look like OpenAI has made much progress in the last year in terms of model improvements. And that might be fine; the real money will be made by the players who can successfully insert these LLMs into consumer apps or use them to automate workloads in the real economy. Being the best at solving science problems will likely be more of a niche application.
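For what it’s worth, a router like that really can be sketched in a handful of lines. The snippet below is our own illustration of the idea, not OpenAI’s implementation; the model ids are placeholders and the keyword heuristic stands in for the small classifier LLM.

```python
# Sketch of the kind of router described above: a cheap classifier decides
# whether a query goes to a fast model or a slower "thinking" model.
# Our own illustration, not OpenAI's implementation; model ids are
# placeholders and the keyword heuristic stands in for a small classifier LLM.

FAST_MODEL = "fast-model-v1"          # placeholder id for a low-latency model
THINKING_MODEL = "thinking-model-v1"  # placeholder id for a reasoning model

REASONING_HINTS = ("prove", "derive", "step by step", "debug", "optimize", "plan")

def needs_reasoning(query: str) -> bool:
    """Stand-in for a small, cheap LLM that labels the query FAST or THINK."""
    q = query.lower()
    return len(query) > 400 or any(hint in q for hint in REASONING_HINTS)

def route(query: str) -> str:
    """Return the model id the query should be sent to."""
    return THINKING_MODEL if needs_reasoning(query) else FAST_MODEL

print(route("What's the capital of France?"))                  # -> fast-model-v1
print(route("Derive the gradient of the loss step by step."))  # -> thinking-model-v1
```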

Anthropic is a good example here of how to turn an LLM into business success via a simple API. While the capabilities of its Claude model are not hugely impressive on benchmarks, it remains by far the most widely used model by enterprises to generate code. We use it as well, and its great attraction in our opinion is that it only implements the code the developer asks for. Gemini-2.5 and ChatGPT can’t really be trusted alone with a codebase as they might start changing all sorts of things that can introduce problems. Similarly, Grok regularly crashes or starts reasoning for 10 minutes at a time. Claude delivers reliable and good code quickly. So, we would rate Claude as the most ‘stable’ model for coding, and as a result its annual revenue run rate is now approaching $7 billion (vs $1 billion at the start of the year, a 7x increase), at healthy gross margins.
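That “simple API” point is worth underlining: generating code with Claude is essentially a few lines against Anthropic’s Messages API. A minimal sketch, with the caveat that the model id below is an assumption and should be checked against Anthropic’s current model list:

```python
# Minimal code-generation call against Anthropic's Messages API.
# Requires `pip install anthropic`; the model id is an assumption,
# check Anthropic's docs for the current names.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-sonnet-4-5",   # placeholder model id
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": "Write a Python function that parses ISO-8601 dates, with tests.",
    }],
)

print(message.content[0].text)   # the generated code
```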

Next, we’ll dive into the current market turmoil and our outlook for AI, Nvidia, and stocks we’d be taking profits in here.
