Thoughts before my next ten years

I am currently thinking about what I want to do next in my career. I've worked in the US federal government and big tech, founded a startup, worked on ChatGPT when OpenAI was scaling, and invested in companies. I'm sharing the following to document what I've done over the past ten years, in an attempt to explore the foundation I have been building for the next ten.

I took the less traditional path. While I was born and grew up in San Francisco, my family went through financial obstacles that led to bankruptcy and my subsequent juvenile rebellion. Street art and graffiti were my escape. I was expelled from high school and dropped out of college, twice. Eventually, I found myself living in Shanghai, China, where I began a career writing software. By 22, I had worked over 30 jobs, ranging from washing dishes at a stayover camp and working the night shift at a pharmacy to street canvassing for a non-profit and providing respite care for families with autistic children.

My first salaried job was in New York, where I worked as a software engineer at Condé Nast on The New Yorker. In practice, I migrated old custom Java-based publishing software to a "modern" custom WordPress implementation that supported the publishing process for the physical magazine and a website with tens of millions of monthly visitors. Over two years, I went from feeling excited about producing web content that was actually just advertisement inventory targeting a relatively high-net-worth audience to wanting to do something more societally useful. This was the first era of exciting civic technology I can recall, when public software projects like Healthcare.gov were being criticized in the news for running massively over budget. I sought out an opportunity to deploy better technology in the US federal government and joined 18F, an emerging "startup" inside the government, and later the General Services Administration's Technology Transformation Services. In 2016, this meant working on Ruby on Rails and Go applications, configuring Cloud Foundry infrastructure for large federal bureaus, and receiving thank-you letters from Navy generals. The organization was remote-first at a time when that was unusual, so I learned what a functional culture could look like when intentionally designed (especially compared to the Covid mess that followed).

In 2018, deep learning and computer vision were my way to get into "deep tech." I spent as much time as I could experimenting with projects that combined street art detection and data scraping. I left the federal government to join an early-stage startup building machine learning experiment management tooling. I later learned that its prospective market was inherently limited by the small number of machine learning engineers. At that time, even 100% adoption among employed and academic machine learning engineers in the US would have meant tens of thousands of users, compared to the millions (if not tens of millions) of software engineers using tools like GitHub.

I was looking for an opportunity working on software deployed at a much greater scale and began work on Google Fonts, an API with over 10 trillion requests a year (now over 120 trillion). I was there as a vendor on a two-year contract that didn't convert into a full-time role.

The Covid pandemic gave me the opportunity to go full-time on building a company around the corporate disruption coming from remote work and everything moving online. I started a (then machine-learning-driven) video-editing software company that helped marketing and sales teams repurpose the Zoom webinars they produced. The tool was designed around the premise that speech-to-text was getting good and video edits were generalizable using pre-made visual templates for social media sharing. I was admitted to Y Combinator in its remote-batch era, and gave it my all to find product-market fit during the rise of remote work. The product was good, but the problem wasn't urgent enough for customers to prioritize. I realized my cofounder and I would need to start over to keep growing. We'd built enough goodwill that a clean split was possible, so I decided it was time.

As the pandemic era wound down, I went looking for work with a bigger potential upside, and in 2022 I joined OpenAI. Studying Kurzweil in college and then watching deep learning solve steadily harder problems had convinced me that machine intelligence would eventually become a practical way to solve problems with raw compute. (In hindsight, it also became clear how well OpenAI had positioned itself to ride big tech's surge in high-margin cloud and infrastructure spending, each side's incentives reinforcing the other.) At the time, though, I was simply electrified by the DALL-E launch and ready to do anything. I joined a code-generation research team while GPT-4 was still in training. Going from a startup on Friday to a larger company on Monday turned out to be a perfect transition, given how fast OpenAI moved.

When I joined OpenAI, it had around 250 employees and a nascent product org. GitHub Copilot was the leading code-generation tool, and DALL-E 2 was the leading consumer image model. All interaction with language models went through the completion API: you would write text, set sampling parameters, prompt the model, and watch it stream tokens back into the same text box. Conversational interfaces weren't widely adopted yet.

The first team I joined was focused on developing long-running code generation and code execution environments. The concept was an extremely early version of an "AI computer," where the interface could generate code, run it, and determine the next steps needed. The underlying model and data collection efforts were constrained to exploration in Jupyter notebooks. My first task was to recreate the Jupyter interface and execution environment in a standalone application. I used a headless Jupyter server to manage the code interpreter sessions, which replaced the copy-and-paste workflow used for generating and executing code. It also established a single surface that gave better feedback and built intuition for the model's behavior.

Data collection across the company was fragmented, and every research effort carried its own operational overhead for sourcing contractors and maintaining quality. I was one of three people who started a team to centralize it, where I drove the product work as it grew until we brought on a dedicated PM. We built the task management, review, and reporting layer that most of the company's data collection ran through, spanning everything from code execution to computer-use environments. This taught me how much of model quality is actually an operations-and-quality problem.

The most influential effort I touched was WebGPT. Its "chat" interface, which guided the model through an instruction-following paradigm, would later become the basis of ChatGPT. The research effort had been in progress for over a year and a half, so most of us didn't register the interface's significance against the alternatives: the code-completion interface, the Jupyter-like code blocks, and the other modality surfaces. It also shaped a unifying data structure the rest of us converged on, which mattered for training a single model with many capabilities rather than many small ones.

When ChatGPT launched in November 2022, the rest of the company needed to adjust. Consumer usage was beyond any expectations, and the burden on the entire research organization was material as GPU capacity got reallocated. Everyone assumed the initial surge would settle. Instead it compounded week over week, and the whole organization bent around a GPU constraint that couldn't have been planned for at that scale.

I recognized that the ChatGPT user base at the time was far greater than any contractor force we could manage. If we could properly incentivize that user base to help with data collection, we could produce a much higher-quality "flywheel" for improving the models. In reality, there are numerous challenges to producing a clean data flywheel from end-users, but this gave me conviction that it was an important thread worth exploring. Since keeping ChatGPT online was an all-hands-on-deck effort across infrastructure, research, product, and customer support, my focus on finding the right way to gather meaningful data from users felt even more important. Through this, I formally joined the ChatGPT team and began contributing to the codebase and product roadmap.

As soon as 2023 began and the holiday code freeze concluded, my priorities shifted from data collection to executing on whatever needed to be done to make sure ChatGPT would be usable. Each day ChatGPT would suffer hours of downtime as a wave of traffic followed the busy working hours around the world. Traffic peaked when Asia, Europe, and the US East and West Coasts were all online simultaneously, and the hours leading up to and following these surges were committed to doing anything possible to reduce the pain. Databases were migrated, telemetry was improved, caching and traffic rules were established, and heroic efforts were made by a surprisingly small number of people to make the next day's surge less painful.

My first major product contributions were around launching ChatGPT's paid subscription. While the previous consumer-facing OpenAI paid product had required weeks of planning and development, the goal this time was to ship a paid product with zero downtime in single-digit days. This was an effort I eagerly jumped into. We started in February and launched ChatGPT Plus in March, publicly reaching $100M in ARR within days and continuing to grow far faster than anyone could have anticipated. By April, GPT-4 had launched, accelerating both demand and the challenges that came with it.

The subsequent year is a blur. ChatGPT had unquestionable product-market fit, constrained by a single variable: GPUs. Database IDs started wrapping, nearly every early infrastructure decision eventually broke and needed attention, and systems needed refactors. Even with careful planning, we were constantly making changes to improve stability and security. Surprisingly, for a product growing this fast, the biggest unexpected drains were the abuse and misuse we hadn't designed for.

The ChatGPT team, which began as fewer than 10 people, grew to over 200 dedicated contributors, not to mention the numerous behind-the-scenes infrastructure engineers and adjacent researchers. The company I'd joined at 250 employees a year before was on track to hit 2,000. It was an insane period of continually finding the most important bottleneck, finding any means to relieve it, and moving on to the next.

At the beginning of 2024, our first child was due, and I left OpenAI and joined a venture capital firm run by two people I deeply respected in the AI and startup ecosystem. Investing wasn't something I'd imagined I'd want to spend time on, but I valued the chance to work with people I admired and could learn from through osmosis, the same way I had at OpenAI.

I spent the next year and a half helping find promising technology startups, exploring prospective industries to invest in, working with existing portfolio companies, and finding new ways to learn from talented people. This period was the most "out of distribution" stretch of my decade of work, and easily the highest rate of non-technical learning I'd ever been through. Each day was spent going as deep as possible into a new sector: the financial implications of AI on real-world markets, or the obscure supply chains for energy and raw materials. Then I'd try to work out whether there was any action to take, whether that meant companies of consequence to meet or ways our firm could act on the opportunity.

In 2025, the firm was acquired, the founders moved on, and new investments were paused. The people I'd joined to work with were gone, so I chose to leave. Which brings me to now. I've had the opportunity to be part of a range of industries and organizational structures across a wide spectrum of growth, and I feel this is the time to find the problems and people I find important enough to spend the next ten years on, even after the excitement fades.

When I think about the role of AI in the economy, I keep coming back to an idea borrowed from economics. Economists use "velocity of money" to describe how quickly a dollar moves through an economy and turns over into new value. I've started thinking in terms of a "velocity of intelligence," or how quickly the distance between knowing something and acting on it collapses. AI compresses that distance, and as it does, the velocity of intelligence rises.

At OpenAI, I saw the friction collapse in real time as hundreds of millions of people discovered AI's utility in the post-ChatGPT wave, and the physics of software businesses shifted. Then, from the startup and venture side, I saw both halves of the unevenness. AI and infrastructure companies were compounding at a rate that was previously impossible, while a far larger set of existing enterprises and industries, where that same acceleration would matter even more, wouldn't see it arrive for years, held back by organizational constraints rather than any limit of the technology. The places where intelligence is cheap and fast today aren't the places where the gains would matter most.

That gap is where I want to spend the next decade: getting AI adopted in the places where a higher velocity of intelligence would be genuinely consequential but won't arrive without a push. I'm still working out the specifics, but having seen the acceleration from inside the labs and where it stalls from the investor's seat, I think I'm positioned to push on this in a way few others could. For now, I'm getting back to building.