
Remember Lenny

Writing online


Thoughts on my 33rd birthday

August 23, 2022 by rememberlenny

A few personal updates.

Today is my 33rd birthday, and in somewhat abrupt fashion, I am moving to San Francisco. It feels particularly significant for the reasons I’ll mention below. 

First, about three months ago, I began a conversation with a senior person at OpenAI, a company I had been following in the artificial general intelligence (AGI) space. Initially, my interest was in understanding how I could best position Milk Video, the company I had been building for the past two years, to capitalize on new innovations. Instead, I was deeply surprised by the rapid development and maturity of machine learning, and a month later I decided to leave the company I started and accept a full-time role.

One major factor in my decision making was a series of classes and books I had been engaging with for the past three years, in particular two courses Jason Crawford led on the history of technology and the narratives around societal progress. One point that resonated between those courses and my exposure to AGI was the realization of the unbounded upside in humanity cultivating digital capacity for ever more complex labor.

My personal impression from these materials is that much of the past 100, 10,000, or 1 million years of human economic value creation has been bottlenecked by a variety of external factors. For most of humanity's history, this has been due to limited access to valuable materials (i.e. food and shelter), a limited ability to cultivate the resources available, high mortality rates, coordination costs, limited access to educated individuals, and more. That said, the rapid technological development of the last 100 years alone has created compounding improvements in quality of life and general wellbeing for humanity as a whole. Even compared to this, I find that the creation of AGI offers an almost uncapped opportunity to translate ingenuity and intellectual capacity into a recurring, recursively value-creating vehicle for improving humanity.

Realizing this made my decision much clearer.

The second reason that makes today quite significant is that I am spending the day moving out of our apartment – and sending all of our "stuff" across the country. One condition of this job was working physically in San Francisco, so after nearly 10 years living in New York City, I – in my wife's eyes, abruptly – started the job and settled in California. It's been two months since I began, and while I had found a temporary place to live, I had not ended our lease in New York. As a result, today I am back in New York to properly relocate and, this time with my wife, begin a cross-country move.

For the curious, the company I started will continue to have a life of its own. My cofounder decided he would like to continue building the business, and with enough money raised and a non-trivial number of existing and growing customers, there is a lot of potential for the future. I strongly believe two years is far too short to truly see a company through, and I don't take lightly that numerous investors took a stake in making it possible to start and grow our company, but even with all of this, I do believe I'm making the right decision.

Moving forward, I have a number of personal projects I have kept on the back burner and want to kick off. Outside of work and personal explorations, my wife and I will be making a much more concerted effort to begin a family. Now that my waking hours aren't solely focused on company building, I have more mental cycles to spend on a wider range of ideas. Transitioning from founder to employee hasn't made a large difference in the quantity of hours worked, but the emotional investment isn't quite as gut-wrenching. More than that, I'm deeply appreciative of the opportunity to be working alongside people who inspire and challenge me in a fulfilling way I genuinely forgot I was missing.

I grew up in San Francisco, but I haven't lived there in over 20 years, so it's a familiar place where I still have a lot to discover. I suspect some of you will pass through the Bay Area, so I hope we can meet in person for a walk, coffee, or meal.

While the past year is filled with a lot to reflect on, leaving New York is easily the highest-variance event to reflect on today.

With that being said, I was inspired to write a short poem yesterday, which I'll share:

New York

Give me the sound of hammers,
The glint of light on a carbon steel beam,
Workers marching with coffee at dawn,
A city in constant motion.

Give me the serenity of new parks,
Blooming flowers visited by passersby,
Tended trees protecting visitors in the day,
A haven for creativity and new encounters.

Give me the collage of diverse faces,
In pursuit of individual wants and wishes,
Brought together by a force invisible yet shared,
Friendships emerge throughout the city in strangers alike.

Give me unending dramas of everyday people,
Each person deeply immersed in their daily lives,
Emerging to make a mark on their fellow citizens,
A history no one person can tell, but all alike share in.

Filed Under: Uncategorized

Second order effects of companies as content creators

February 13, 2021 by rememberlenny

TLDR: There’s going to be a shit ton of temporarily useful content; businesses need to have a quick way to make the worthwhile content stand out. (Thanks Joseph)

There are two ideas I have been thinking about over the past few months that I want to document here. The first has to do with disruption, and the second has to do with novelty vs usefulness.

The late Clay Christensen pointed out that major technology shifts take place when there is a massive change in the price or available quantity of something. When there was a 10x reduction in the price of computers, the personal computing revolution took place. When the price decreases, the potential applications for that technology significantly increase. When prices decrease and potential applications increase, the quantity of the technology also increases, and with it come new platforms for new tools and previously non-existent technologies.

I've heard this idea too many times to count, but something I never thought deeply about was why certain ideas within the new platform do or don't succeed. The specific area of interest for me is how the new platform creates opportunities that previously couldn't exist – and therefore everything that wasn't possible before is novel. When I use the term novel, I mean that because something is new, its significance seems highly valued relative to the past. That said, if the new idea is projected out into the new reality, the novelty eventually wears off and it is no longer as valuable as it once was.

Another way to think of it is the bundling and unbundling effect, which I believe comes from an American economist who studied the cross-country trucking industry and the standardization of freight trucks. Sadly, I don't know the original reference. The summary point is that when getting packages safely and cheaply across the country was difficult, the creation of the 18-wheeler semi-truck (actually, I'm pretty sure the first version was not an 18-wheeler, so I'm butchering the facts) was an innovation that solved a large problem. Once the trailer-hauling semi-truck was widely accepted as the best option for hauling freight, its machine parts were modularized so that trucking companies could delegate part production and reduce costs. At the same time, the process of modularizing the trucks allowed new entrants into the trucking space, which created differentiated offerings and options for cross-country freight hauling.

World War I trucks circa 1917, manufactured by White Motor Company

Once hauling freight was predictable, new industries that couldn't exist before emerged. The trucking industry created new modes of consumption with federated infrastructure. Big-box stores like Walmart or Costco (or their early-20th-century parallels), which couldn't exist before, were now possible. As the new modes of freight shipping made a new economic model possible, parking-lot-based stores formed and the urban landscape changed. I'm extrapolating here, but I imagine a number of previously non-existent suburban environments formed due to this single freight innovation.

The most exciting part of this to me is how the previously nascent technologies around freight positively exploded as a result of the new advancements.

In line with this, the new urban landscape fostered smaller environments that created markets of supply and demand that didn't previously exist. I imagine this also put stress on areas that never expected such high throughput. I'm sure new roads needed to be developed, along with new laws, and in the process more and more markets emerged.

Intermodal containers waiting to be transferred between ships, trains, and trucks are stacked in holding areas at a shipping port.

I mention all of this because I find an interesting parallel today in what I've been thinking about: the proliferation of video content. In relation to traditional disruption theory, we had a major event that increased the quantity of video production without necessarily changing the cost of production. The low cost of individual video creation had already transformed the way we communicate personally – as seen through social media – but the impact on businesses had been delayed. Given that the cost of video production had been low for quite a while, the recent societal forces around Covid turned nearly every business into a video producer.

A good parallel is to think of Zoom as the semi-truck and the need to interact with one another during a pandemic as the freight to haul. Naturally, as Zoom became the fix-all solution, the tool that started as a necessity became fraught with problems for specific use cases. For one, Zoom wasn't designed for highly interactive environments or for planning events ahead of time. As a result, a whirlwind of new applications emerged to complement the lacking areas, such as Slido for questions or Luma for event planning.

Zoom’s usage change over one year

The need for higher-fidelity interactions grew as the new normal of interacting over video was no longer a novelty. For one, people wanted to have fun. Games became a common way to spend time together, and beyond actual video games, an entire category of Zoom-based games emerged; things like GatherTown or Pluto Video came to light. Similarly, as the interaction model of being online together and participating in a shared experience was normalized, existing platforms like Figma were used for non-traditional purposes to create synchronous shared experiences.

I bring these examples up because I think the ones above are obviously novel ideas that won’t be valuable for extended periods of time. That being said, the seed of the novelty comes from a core experience that will likely reemerge in various places in the future.

Continuing with the Zoom thread, the explosion of video communication, and of alternative ways of communicating with video, has created a volume of video content that previously didn't exist. Of course, there is novelty around having this increased content – which I believe will only continue to grow. As with many things, the transition from in-person to digital was a one-way door for many industries, where the reduction in cost and surprising resilience of results has created a new normal.

I won't attempt an exhaustive analysis of video, but one area that is particularly interesting is the machine learning space. Over the past decade, the huge improvements in computer vision, deep learning, and speech-to-text research have been expensive to achieve, and the applications have been very specific.

There is an exciting intersection between the sheer quantity of content being produced and the widely accessible machine learning APIs that make it possible to analyze that content cheaply in a way that wasn't previously possible. The novelty effect here seems ripe for misapplication: the thing that wasn't generally possible and the thing that previously wasn't widely present are converging at the same time.

Real life example of an executive realizing how much money they are spending on novel tools they don’t actually use.

More specifically, there is a ton of video content being produced by individuals and businesses. Naturally, given the circumstances, applying the new technology to the ever-growing problem seems like a good idea. Creating tooling that helps organize the new video content, or improves reflection on and recall of that content, seems valuable. But is it just a novelty or a future necessity? If it's a necessity, will it become a commodity and common practice, or is it specialized enough to demand a variety of offerings?

Applying speech-to-text processing to video content is cheap, so analyzing everything creates a new possibility: previously non-machine-readable media becomes a resource that didn't exist before. Not only do we have video that didn't exist before, but we have a whole category of text content that was non-existent. At the basic level, we can search video. At a more advanced level, we can quantify qualities about the video at scale. The valuable applications for this have historically been in phone call analysis, but they now apply to the video calls of sales teams or the user interviews of product teams.
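To make the "search video" point concrete, here is a minimal TypeScript sketch: given speech-to-text output as timestamped segments (the segment shape here is a made-up example, not any particular API's format), a plain text filter turns a query into jump-to timestamps in the video.

```typescript
// Hypothetical shape of speech-to-text output: timestamped transcript segments.
interface TranscriptSegment {
  startSeconds: number; // where this utterance begins in the video
  endSeconds: number;   // where it ends
  speaker: string;      // e.g. "Speaker 1", if the service provides diarization
  text: string;         // the transcribed words
}

// Return the start time of every segment that mentions the query,
// so a player could jump straight to those moments.
function searchVideo(segments: TranscriptSegment[], query: string): number[] {
  const needle = query.toLowerCase();
  return segments
    .filter((segment) => segment.text.toLowerCase().includes(needle))
    .map((segment) => segment.startSeconds);
}

// Example: find every moment "pricing" comes up in a recorded sales call.
const hits = searchVideo(
  [
    { startSeconds: 12, endSeconds: 18, speaker: "Speaker 1", text: "Let's talk about pricing." },
    { startSeconds: 95, endSeconds: 102, speaker: "Speaker 2", text: "The onboarding flow was confusing." },
  ],
  "pricing"
);
console.log(hits); // [12]
```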

Again, this being the case, I find it falls into the trap of being a novel and immediately useful solution, but far from a long-term value. The way I think about the new machine-readable video is that it unlocks a new form of organizing. The ability to organize content is valuable on the surface, but the content being organized needs to be worth the effort.

For one, when video is being produced at scale, in the way that it is today, the shelf life of content is quite low. If you record a product interview call today, then when that product changes next month, the call is no longer valuable – or at least the value of the call declines in proportion to how much the product changes. Let's call this shelf life.

The sales calls and user interviews from last year

Interestingly, for individuals – in the social media space – the notion of shelf life was given a sexy term: ephemerality. Since there is no cost to produce content individually, the negatives of a short shelf life are outweighed by the positives, and among many other things, this became a positive differentiator.

For businesses though, the creation of media is often far from free. Not only is employees' time valuable, but the investment behind certain types of content is not immediately returned. So while short-shelf-life video is already widely in use, the question that comes to my mind is: what are the future industries in this new space that don't have a near-term end of the road?

My general take is that organizing content is a limited venture. Having immediate access to content is useful, but at some level it's a novelty unless the content has long-shelf-life value. Going back to the trucking analogy: while trucking is a commodity, there are industries that were enabled by trucking, like the Walmarts – and their equivalents are bound to emerge in this new Zoom-based freight model. While the shuttling of goods for commerce is important in freight, I imagine the point of value in the new ecosystem is going to be helping businesses increase the value of their already existing content.

Might I even say, helping businesses “milk” the value.

Filed Under: video Tagged With: bundling and unbundling, disruption theory, video editing, zoom

Text rendering stuff most people might not know

October 10, 2020 by rememberlenny

I was stuck on a problem that I wanted to write out. The problem I was trying to solve could be simplified to the following:

  1. I have a box in the browser with fixed dimensions.
  2. I have a large number of words, which vary in size, which will fill the box.
  3. If a full box was considered a "frame", then I wanted to know how many frames it would take to use up all the words.
  4. Similarly, I needed to know which frame a word would be rendered in.

This process is simple if the nodes are all rendered on a page, because the dimensions of the words can be individually calculated. Once each word has a width/height, it's just a matter of deciding how many fit in each row until the row is filled, and how many rows you can have before the box is filled.
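To make that concrete, here is a minimal sketch of the greedy packing just described, assuming each word's width and height have already been measured (the pixel units, uniform line height, and word-spacing default are placeholders):

```typescript
interface WordBox {
  text: string;
  width: number;  // measured width of the word, in px
  height: number; // measured height of the word (treated as the line height), in px
}

// Greedily fill a row until it is full, and rows until the box is full.
// Returns the frame index assigned to each word, in order.
// Assumes a uniform line height for simplicity.
function assignFrames(
  words: WordBox[],
  boxWidth: number,
  boxHeight: number,
  wordSpacing = 4 // px between words; placeholder value
): number[] {
  const frameIndexes: number[] = [];
  let frame = 0;
  let x = 0; // horizontal cursor within the current row
  let y = 0; // vertical offset of the current row within the frame

  for (const word of words) {
    // Wrap to a new row when the word no longer fits horizontally.
    if (x > 0 && x + word.width > boxWidth) {
      x = 0;
      y += word.height;
    }
    // Start a new frame when the next row no longer fits vertically.
    if (y + word.height > boxHeight) {
      frame += 1;
      x = 0;
      y = 0;
    }
    frameIndexes.push(frame);
    x += word.width + wordSpacing;
  }
  return frameIndexes;
}
```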

I learned this problem is similar to the knapsack problem, bin/rectangle packing, or the computer science text-justification problem.

The hard part was deciding how to gather the words' dimensions, considering the goal is to calculate this information before the content is rendered.

Surprisingly, due to my experience with fonts, I am quite suited to solving this problem – and I thought I would jot down notes for anyone else. When searching for the solution, I noticed a number of people in StackOverflow posts saying that this was a problem that could not be solved, giving a variety of correct-sounding, but wrong, reasons.

When it comes to text rendering in a browser, there are two main steps that take place, which can be emulated in JavaScript. The first is text shaping, and the second is layout.

The modern way of handling these is with C++ libraries called FreeType and HarfBuzz. Combined, the two libraries will read a font file, render the glyphs in a font, and then lay out the rendered glyphs. While this sounds trivial, it's important because behind the scenes a glyph is more or less a vector, which needs to determine how it will be displayed on a screen. Also, each glyph is laid out depending on its usage context: it will render differently based on which characters it's next to and where in a sentence or line it is located.

❓ This is a long shot: anyone willing to talk through a browser based graphics/computer science problem I am stuck on?

Feels like a common problem that maps to memory management, but I'm not quite sure how to think about it.

Zoom: https://t.co/9pGaqvk700 pic.twitter.com/15CZvRvckS

— Lenny Bogdonoff (@rememberlenny) October 10, 2020

There's a lot that can be said about the points above, which I am far from an expert on.

The key point to take away is that you can calculate the bounding box of a glyph/word/string given the font and the parameters for rendering the text.
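One practical way to do this in the browser, without inserting nodes into the page, is the Canvas 2D measureText API, which shapes the text with the given font and returns a TextMetrics object. A minimal sketch (the font string is just an example, and the font needs to be loaded for accurate results):

```typescript
// Measure a word's bounding box without rendering it into the page, using an
// off-screen <canvas>. The browser shapes the text with the same font engine
// it uses for layout, so the numbers match what a rendered node would get.
// Note: the font should already be loaded (e.g. via document.fonts.load),
// otherwise the browser measures a fallback font instead.
function measureWord(
  text: string,
  cssFont: string // any CSS font shorthand, e.g. '400 16px "Inter"'
): { width: number; height: number } {
  const canvas = document.createElement("canvas");
  const ctx = canvas.getContext("2d");
  if (!ctx) throw new Error("2D canvas context unavailable");
  ctx.font = cssFont;
  const metrics = ctx.measureText(text);
  // The actual bounding box ascent/descent give the ink extents above and
  // below the baseline; their sum approximates the rendered height.
  return {
    width: metrics.width,
    height: metrics.actualBoundingBoxAscent + metrics.actualBoundingBoxDescent,
  };
}

// Example: measure a word before deciding which row and frame it lands in.
const box = measureWord("Hello", '400 16px "Inter"');
console.log(box.width, box.height);
```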

I have to thank Rasmus Andersson for taking time to explain this to me.

Side note

Today, I had a problem that I couldn't figure out for the life of me. It may have been the repeated nights of not sleeping, but it was also a multi-layered problem that I only intuitively understood. I just didn't have a framework for breaking it apart and understanding how to approach it. In a broad attempt to get the internet's help, I posted a tweet with a Zoom link and called for help. Surprisingly, it was quite successful, and over a two-hour period I was able to find a solution.

I’m genuinely impressed by the experience, and highly encourage others to do the same.

One more note, this is a great StackOverflow answer: https://stackoverflow.com/questions/43140096/reproduce-bounding-box-of-text-in-browsers

Filed Under: programming Tagged With: fonts, typography

Why is video editing so horrible today?

September 15, 2020 by rememberlenny

In the last three months, I have done more video post-production than I had in the previous 12 years. Surprisingly, in those years, nothing seems to have changed. Considering how much media, both audio and visual, is now machine-analyzable, I'm surprised there aren't more patterns that make navigating and arranging video content faster. Beyond that, I'm surprised there isn't more of a process for programmatically composing video in a polished way that complements the existing manual methods of arranging.

In 1918, when the video camera was created, if you filmed something and wanted to edit it, you took your footage, cut it and arranged it according to how you wanted it to look. Today, if you want to edit a video, you have to import the source assets into a specialty program (such as Adobe Premiere), and then manually view each item to watch/listen for the portion that you want. Once you have the sections of each imported asset, you have to manually arrange each item on a timeline. Of course a ton has changed, but the general workflow feels the same.

Real life photo of me navigating my Premiere assets folders

How did video production and editing not get their own digital-first methods of creation? Computing power has skyrocketed. Access to storage is effectively infinite. And our computers are networked around the world. How is it that the workflow of import, edit, and export still takes so long?

The consumerization of video editing has simplified certain elements by abstracting away seemingly important but complicated components, such as the linearity of time. Things like TikTok seem to be the most dramatic shift in video creation, in that the workflow centers on immediate review and reshooting of video. Over the years, the iMovies of the world have moved timelines from a horizontal representation of elapsed time into general blocks of "scenes" or clips. Simplification through abstraction is important for the general consumer, but it reduces attention to detail. This creates an aesthetic of its own, which seems to be the result of the changing tools.

Where are all the things I take for granted in developer tools, like autocomplete or class-method search, in the video equivalent? What does autocomplete look like when editing a video clip? Where are the repeatable "patterns" I can write once and reuse everywhere? Why does each item on a video canvas seem to live in isolation, with no awareness of other elements or ability to interact with them?

My code editor searches my files and tries to "import" the methods when I start typing.

As someone who studied film and animation exclusively for multiple years, I'm generally surprised that the overall ways of producing content are largely the same as they were 10 years ago, and seemingly the same as they've been for the past 100.

I understand that the areas of complexity have become more niche, such as in VFX or multimedia. I have no direct experience with complicated 3D rendering and I haven't tried any visual editing for non-traditional video displays, so it's a stretch to say film hasn't changed at all. I've barely scratched the surface of new video innovation, but all things considered, I wish some basic things were much easier.

For one, when it comes to visual layout, I would love something like Figma's "autolayout" functionality. If I have multiple items on a canvas, I'd like them to self-arrange based on some kind of box model. There should be a way to assign the equivalent of styles as "classes", as with CSS, and multiple text elements should be able to inherit/share padding/margin definitions. Things like flexbox and relative/absolute positioning would make visual templates significantly easier and faster to develop for fresh video content.

Currently I make visual frames in Figma, then export them, because it's so much easier than fumbling through the 2D translations in Premiere.

I would love to have a "smarter" timeline that can surface "cues" that I may want to hook into for visual changes. The cues could make use of machine-analyzable features in the audio and video, based on what is detected in the available content. This is filled with hairy areas, and definitely sounds nicer than it might be in actuality. As a basic example, the timeline could look at the audio or a transcript and know when a certain speaker is talking. There are already services, such as Descript, that make seamless use of speaker detection; that should find some expression in video editing software. Even if the software itself doesn't detect this information, the metadata from other software should be put to use.
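As a rough sketch of what one such cue could be, assuming the editor can read a speaker-labeled transcript (the segment shape below is hypothetical, not any tool's actual export format), the timeline could mark every timestamp where the speaker changes:

```typescript
// Hypothetical speaker-labeled transcript segment (not any tool's real export format).
interface LabeledSegment {
  startSeconds: number; // when the utterance starts
  speaker: string;      // diarization label, e.g. "Speaker 2"
}

// Derive timeline "cues" at every speaker change: candidate points for cuts,
// lower-thirds, or layout switches.
function speakerChangeCues(segments: LabeledSegment[]): number[] {
  const cues: number[] = [];
  for (let i = 1; i < segments.length; i++) {
    if (segments[i].speaker !== segments[i - 1].speaker) {
      cues.push(segments[i].startSeconds);
    }
  }
  return cues;
}
```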

The two basic views in Zoom. Grid or speaker.

More advanced would be to know when certain exchanges between multiple people form a self-contained "point". Identifying when an "exchange" takes place, or when a "question" is "answered", would be useful for title slides or lower-thirds with complementary text.

Descript will identify speakers and color code the transcript.

If there are multiple shots of the same take, it would be nice to have the clips note where they begin and end based on lining up the audio. Reviewing content shouldn't have to be done in a linear fashion if there are ways to distinguish the content of a video/audio clip and compare it to itself or to other clips.

In line with “cues”, I would like to “search” my video in a much more comprehensive way. My iPhone photos app lets me search by faces or location. How about that in my video editor? All the video clips with a certain face or background?

Also, it would be nice to generate these "features" with some ease. I personally don't know what it would take to train a feature detector by viewing some parts of a clip, labeling them, and then using the labeled examples to find other instances of similar visual content. I do know it's possible, and it would be very useful for speeding up the editing process.

In my use case, I'm seeing a lot of video recordings of Zoom calls or webinars. This is another example of video content that generally looks the "same" and could be analyzed for certain content types. I would be able to quickly navigate through clips if I could filter video by when the screen shows many faces at once versus when only one speaker is featured.

All of this is to say, there are a lot of gaps in the tools available at the moment.

Filed Under: video Tagged With: film, post production, Programming, video editing

Making the variable fonts Figma plugin (part 1 – what is variable fonts [simple])

September 8, 2020 by rememberlenny

See the video summary at the bottom of the post.
Important update: The statement that Google Fonts only displays a single variable font axis was wrong. Google Fonts now has a variable font axis registry, which displays the number of non-weight axes that are available on their variable fonts. View the list here: https://fonts.google.com/variablefonts

Variable fonts are a new technology that allows a single font file to render a range of designs. A traditional font file normally corresponds to a single weight or font style (such as italics or small caps). If a user uses both a bold and a regular font weight, that requires two separate font files, one for each weight. Variable fonts allow a single font file to take a parameter and render various font weights. One font file can then render thin, regular, and bold based on the font variation settings used to invoke the font. Even more, variable font files can also render everything between those "static instances", allowing for intriguing expressibility.

At a high level, variable fonts aren't broadly "better" than static fonts, but they allow for tradeoffs that can potentially benefit an end user. For example, based on the font's underlying glyph designs, a single variable font file can actually be smaller in byte size than multiple static font files, while offering the same visual expressibility. While the size does depend on the font glyphs' "masters", another beneficial factor is that a single variable font requires fewer network requests to cover a wide design space.

Example how Figma canvas is rendering “Recursive” variable font with various axis values.

Outside of the technical benefits, variable fonts provide incredible potential for design flexibility that isn't possible with static instances alone. The example of the font weight axis was given above, but a variable font can have any number of font axes, based on the designer's wishes. Another common axis is the "slant" axis, which allows a glyph to move between italic and upright. Rather than being a boolean switch, in many cases the available design space is a range, which also opens up potential for intentional font animation and transitions.

Key terminology:

Design space: the range of visual ways in which a font file can be rendered, based on the font designer's explicit intention. Conceptually, this can be visualized as a multidimensional space, and a glyph's visual composition is a single point in that space.

Variable axis: A single parameter which can be declared to determine a font's design space. For example, the weight axis.

Variable font settings: The compilation of variable axis definitions, which are passed to a variable font and determine the selected design space. 

Static instances: An assigned set of font axis settings, often stored with a name that can be accessed from the font. For example, “regular 400” or “black 900”.

Importantly, variable fonts are supported across all major browsers. Simply load one in as a normal font, and use the font-variation-settings CSS property to explicitly declare the variable axis parameters.

Google Font’s variable fonts filter.

A normal font-weight or font-style declaration only selects a named instance, while a variable font style definition allows for a wider range of expression.
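For illustration, here is a minimal TypeScript sketch of the difference, setting styles through the CSSOM: a static font is limited to named weights and styles, while a variable font accepts explicit axis values (the element selector and axis values below are just examples).

```typescript
// Grab an element to style; the selector is just an example.
const heading = document.querySelector<HTMLElement>("h1");
if (heading) {
  // Static fonts: you pick from the named weights/styles that ship as
  // separate files, e.g. regular (400) or bold (700).
  heading.style.fontWeight = "700";
  heading.style.fontStyle = "italic";

  // Variable fonts: declare an arbitrary point in the design space.
  // "wght" is the weight axis and "slnt" the slant axis; 650 and -8 are
  // illustrative values, not special numbers.
  heading.style.setProperty("font-variation-settings", '"wght" 650, "slnt" -8');
}
```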

Google Fonts is currently a major web font service that makes using variable fonts extremely easy. Their font directory allows filtering for variable fonts, and the font specimen pages let you sample a font's static instances as well as its weight variable axis. While Google Fonts serves variable fonts, they currently limit their API to the single font weight axis (but see the correction at the top of this post).

Inter font’s weight and slant axis

One popular font, beloved by developers and designers alike, is Inter, designed by Rasmus Andersson. Inter contains a weight axis, as you can see from the Google Fonts specimen page. If you go directly to the Inter specimen website, you can see that it also contains a second font axis – the slant axis mentioned above.

From the specimen page, you can also see that assigning the weight and slant allows for use cases that invoke different feelings of seriousness, casualness, and legibility. Changing the font weight can make text easier to read, depending on the font size, and it can also be combined with colors (for example in dark mode) to stand out more in the page's visual hierarchy.

Another font to show as an example is Stephen Nixon's Recursive. Recursive can also be found on Google Fonts, but again, by going to the font's own specimen page, you can experiment with its full design space. Recursive contains three font axes that are unique to it: expression, cursive, and mono. Additionally, as you can see, certain glyphs in the font will change based on the combined assigned font axis values. Examples include the lowercase "a" and the lowercase "g".

Example of the “a” and “g” on the Recursive font’s glyph changes

For Recursive, some of the font axes are boolean switches, as opposed to ranges: the font is either mono or not. The range values can also be explicitly limited, such as with the cursive axis, which is either on, off, or auto.

Side note – with Inter, one thing that was glossed over is how changes along the font's weight axis actually change the width of the glyphs. For Recursive, which has a "mono" axis, the weight is explicitly not meant to adjust the width of a glyph. While not found in either of these two fonts, a very useful axis that is sometimes available is the "grade" axis, which allows glyphs to become thicker without expanding in width.

All of this is a quick overview, but if you are interested in learning more, do check out TypeNetwork’s variable font resource to see some interactive documentation.

Beyond the browser, major Adobe products as well as Sketch now render basic font axis sliders for customizing variable fonts. As someone who switches between code and design software, I was surprised to find that Figma was one of the few design tools that wasn't compatible with variable fonts and their variable font settings. That being said, Figma does have an incredible plugin API which lets someone hack together a temporary solution until they have time to implement variable fonts fully.

In the next blog post, I’ll go into how Figma’s plugin architecture lets you render variable fonts as SVG vector glyphs.

Filed Under: frontend, programming Tagged With: figma, typography, variable fonts

React Figma Plugin – How to get data from the canvas to your app

September 2, 2020 by rememberlenny

I had much too hard of a time grokking the Figma Plugin documentation, and thought I would leave a note for any brave souls who follow.

Figma has a great API and documentation around how to make a plugin on the desktop app. When writing a plugin, you have access to the entire Figma canvas of the active file, to which you can read/write content. You also have quite a lenient window API from which you can make external requests and do things such as download assets or OAuth into services.

All of this being said, you are best off learning about what you can do directly from Figma here.

If you are like me, working on a plugin you have decided to write in React, then you may want to receive callback events from the Figma canvas in your app. In my case, I wanted the React application to react to the user's updated selection, so that I could access the content of a TextNode and update the plugin content accordingly.

To do this, I struggled with the Figma Plugin examples to understand how to get data from the canvas into my app. The Figma Plugin examples, which can be found here, include a React application sample which sends data to the canvas, but not the other way around. While this is seemingly straightforward, I didn't immediately absorb the explanations from the Figma Plugin website.

In retrospect, the way to do this is quite simple.

First, the Figma Plugin API uses the Window postMessage API to transmit information. This is explained in the Plugin documentation with a clear diagram which you can see here:

The first thing to note from this diagram is the postMessage API, which I mentioned above. The second thing is that the postMessage API is bi-directional, and allows for data to go from the app to the canvas, and vice-versa.

Practically speaking, the React Figma Plugin demo shows this in its example code.

This is part of the React app, which is in the ui.tsx file
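Roughly, that pattern looks like the following sketch; the "resize-selection" message name and its payload are placeholders of my own, not taken from the demo.

```typescript
// ui.tsx — runs inside the plugin's iframe, alongside the React UI.
// Data sent *to* the canvas side goes through parent.postMessage, wrapped in
// a `pluginMessage` envelope that Figma forwards to code.ts.
function notifyPlugin() {
  parent.postMessage(
    { pluginMessage: { type: "resize-selection", width: 200 } },
    "*"
  );
}

// e.g. call notifyPlugin() from a button's onClick handler in the React UI.
```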

In the example, the postMessage API uses the window.parent object to send messages from the React app to the Figma canvas. Specifically, the plugin example has two files – code.ts and ui.tsx – which respectively contain the code that directly manages the Figma plugin API and the UI code for the plugin itself.

While the parent object is used to send data to the canvas, you need to do something different to receive data. You can learn about the window.parent API here. In short, iframes can speak to their parent windows, and since the Figma plugin UI runs in an iframe, this is how the postMessages are exchanged.

To receive data from the Figma canvas, you need to set up a postMessage from the code.ts file, which has access to the figma object.

In my case, I would like to access the latest selected items from the Figma canvas whenever the user selects something new. To do that, I have code which creates an event listener on the figma object and then broadcasts a postMessage containing that information.

This is happening from the code.ts file
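Here is a sketch of what that looks like, assuming we only care about the characters of any selected text nodes (the "selection-changed" message name is arbitrary, and the typings come from the standard @figma/plugin-typings package):

```typescript
// code.ts — runs on the canvas side and has access to the `figma` object
// (types from @figma/plugin-typings). Whenever the selection changes,
// forward the selected text nodes' contents to the plugin UI.
figma.on("selectionchange", () => {
  const texts = figma.currentPage.selection
    .filter((node): node is TextNode => node.type === "TEXT")
    .map((node) => node.characters);

  // "selection-changed" is an arbitrary message name; the UI listens for it.
  figma.ui.postMessage({ type: "selection-changed", texts });
});
```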

Once the figma object broadcasts the message, the React app can receive it. To do so from the React application, you can create a simple event listener for the message event.

Now the part that was unintuitive, given the example, was that the React app listens directly to the window object to receive the data broadcasted from the code.ts file. You can see an example below.

Event listener which can live anywhere in your React app (ie. ui.tsx)
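Here is a sketch of such a listener, wrapped in a React hook for convenience; the hook itself and the message shape are my own additions, matching the placeholder names used in the code.ts sketch above.

```typescript
// ui.tsx — inside the React app. Messages sent with figma.ui.postMessage
// arrive on the iframe's window as event.data.pluginMessage.
import { useEffect, useState } from "react";

function useSelectedTexts(): string[] {
  const [texts, setTexts] = useState<string[]>([]);

  useEffect(() => {
    const onMessage = (event: MessageEvent) => {
      const msg = event.data?.pluginMessage;
      // "selection-changed" matches the placeholder name used in the code.ts sketch.
      if (msg?.type === "selection-changed") {
        setTexts(msg.texts);
      }
    };
    window.addEventListener("message", onMessage);
    return () => window.removeEventListener("message", onMessage);
  }, []);

  return texts;
}
```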

As you can see, to listen for the event in the React application, window.addEventListener is used, as opposed to parent.addEventListener. This is because the React application is unable to set up event listeners on the parent due to cross-origin rules. Instead, you listen on the window object, and the postMessage API delivers the data that was broadcast from the code.ts file.

To summarize: to get data from the React application to the Figma canvas, you use parent.postMessage in your React code (the ui.tsx file). To get data from the Figma canvas into the React application, you broadcast a message using the figma.ui.postMessage method (from code.ts), which can then be listened for in the React application using window.addEventListener.

I hope this helps if you are looking to send data from the Figma Plugin to your React application!

Filed Under: programming Tagged With: figma

