GSoC 2021: Overview

Over the summer I worked on implementing the new screenshot UI for GNOME Shell as part of Google Summer of Code 2021. This post is an overview of the work I did and work still left to do.

The project was about adding a dedicated UI to GNOME Shell for taking screenshots and recording screencasts. The idea was to unify the related functionality in a discoverable and easy-to-use interface, while also improving on several aspects of the existing screenshot and screencast tools.

Over the summer, I implemented most of the functionality:

  • Capturing screen and window snapshots immediately, letting the user choose what to save later.
  • Area selection, which can be resized and dragged after the first selection.
  • Screen selection.
  • Window selection presenting an Overview-like view.
  • Mouse cursor capturing, which can be toggled on and off inside the UI.
  • Area and screen video recording.
  • Correct handling of HiDPI and mixed DPI setups.

I opened several merge requests:

I expect that the Mutter merge requests won’t require many further changes before merging. The screenshot UI, however, still has some work left that I will do after GSoC, detailed in the main merge request. This work includes adding window selection support for screen recording, ensuring all functionality is keyboard- and touch-accessible, and working with the designers to polish the final result. GNOME 41 is already past the UI freeze, but GNOME 42 seems to me like a realistic target for finishing and landing the screenshot UI.

For the purposes of GSoC, I additionally made two frozen snapshots of work done over the GSoC period that I will not update further: three commits in this mutter tag and 16 commits in this gnome-shell tag.

I also wrote several blog posts about my work on the screenshot UI:

Additionally, I gave a short presentation of my work at GUADEC, GNOME’s annual conference.

Over the course of this GSoC project I learned a lot about GNOME Shell’s UI internals, which will help me with GNOME Shell contributions in the future. I enjoyed working on an awesome upgrade to taking screenshots and screencasts in GNOME. For me, participating in the GNOME community is a fantastic experience, and I highly recommend that everyone come hang out and contribute.

I would like to once again thank my mentor Jonas Dreßler for answering my questions, as well as Tobias Bernard, Allan Day and Jakub Steiner for providing design feedback.

Screen recording in the new screenshot UI

GSoC 2021: Screenshots with Pointer

Over the summer I’m working on a new screenshot UI for GNOME Shell. Here’s my progress since the last post.

The new “Show Pointer” toggle in the screenshot UI

First of all, I made the window selection mode work across multiple screens and ensured that it works correctly with HiDPI and mixed DPI setups. Each screen gets its own Overview-like view of all the windows, letting you pick the one you need at your leisure.

In this and the following showcases, you can see GNOME Shell running with two virtual monitors: a regular DPI one on the left, and a high DPI (200% scaling) one on the right. Both virtual monitors use the same resolution, which is why the right one appears half the size.

Window selection working across two monitors

Next, I implemented the screen selection mode which lets you choose a full monitor to screenshot.

Screen selection with the primary monitor selected

Finally, I embarked on an adventure to add a “Show Pointer” toggle. Following the spirit of the screenshot UI, you should be able to hit your Print Screen key first and adjust the screenshot contents afterwards. That is, you should be able to show and hide the mouse pointer and see the result on the preview in real time.

But first things first: let’s figure out how to add a menu. There’s a handy PopupMenu class that you can inherit to make your own menu:

// Imports used by the snippets in this post
// (all of these come from GNOME Shell's built-in JS modules):
const { Clutter, Shell, St } = imports.gi;
const BoxPointer = imports.ui.boxpointer;
const Main = imports.ui.main;
const PopupMenu = imports.ui.popupMenu;
const Signals = imports.signals;

class UIMenu extends PopupMenu.PopupMenu {
    constructor(sourceActor) {
        // The third argument controls which side
        // the menu "points" to. Here the menu
        // will point to the left.
        super(sourceActor, 0, St.Side.LEFT);

        Main.uiGroup.add_actor(this.actor);
        this.actor.hide();
    }

    toggle() {
        if (this.isOpen)
            this.close(BoxPointer.PopupAnimation.FULL);
        else
            this.open(BoxPointer.PopupAnimation.FULL);
    }
}

To show the menu on a button press, we also need a PopupMenuManager:

let button = new St.Button();

let menu = new UIMenu(button);
let manager = new PopupMenu.PopupMenuManager(button);
manager.addMenu(menu);

button.connect('clicked', () => menu.toggle());

Let’s add a switch to our menu. PopupSwitchMenuItem is exactly what we need:

class UIMenu extends PopupMenu.PopupMenu {
    constructor(sourceActor) {
        // ...

        this._showPointerItem =
            new PopupMenu.PopupSwitchMenuItem(_("Show Pointer"), false);
        this._showPointerItem.connect(
            'toggled', (_item, state) => {
                this.emit('show-pointer-toggled', state);
            });
        this.addMenuItem(this._showPointerItem);
    }

    get showPointer() {
        return this._showPointerItem.state;
    }

    // ...
}
Signals.addSignalMethods(UIMenu.prototype);

Pay attention to the last line. Signals.addSignalMethods() does a bit of magic that lets you use GObject-style signal methods (connect() and emit()) on plain JavaScript classes. In this case I use it to thread through a signal for toggling the “Show Pointer” switch.
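
For illustration, here is a tiny standalone sketch of the pattern; the class and signal names are made up and not part of the screenshot UI:

const Signals = imports.signals;

class Counter {
    constructor() {
        this._count = 0;
    }

    increment() {
        this._count++;
        // emit() is available thanks to addSignalMethods() below.
        this.emit('count-changed', this._count);
    }
}
Signals.addSignalMethods(Counter.prototype);

let counter = new Counter();
// connect() works just like on a GObject.
counter.connect('count-changed',
    (_counter, count) => log(`count is now ${count}`));
counter.increment();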

The mouse cursor on the preview is just another St widget. Its visibility is connected to the state of the “Show Pointer” switch:

let cursor = new St.Widget();

menu.connect('show-pointer-toggled', (_menu, state) => {
    cursor.visible = state;
});

// Set the initial state.
cursor.visible = menu.showPointer;

When the screenshot UI captures a snapshot of the screen, it also snapshots the current cursor texture, position and scale. These values are used to configure the cursor widget so that it shows up in the same spot in the screenshot UI as it was on screen:

// Get a snapshot of the screen contents.
let [content, scale, cursorContent, cursorPoint, cursorScale] =
    await screenshot.to_content();

// Set the cursor texture.
cursor.set_content(cursorContent);
// Set the cursor position.
cursor.set_position(cursorPoint.x, cursorPoint.y);

// Get the cursor texture size.
let [, w, h] = cursorContent.get_preferred_size();

// Adjust it according to the cursor scale.
w *= cursorScale;
h *= cursorScale;

// Set the cursor size.
cursor.set_size(w, h);

The scale is needed mainly for HiDPI setups. Clutter operates in logical pixels, which means that, for example, on a monitor with 200% scaling, a widget with a size of 10×10 will occupy a 20×20 physical pixel area. Since get_preferred_size() returns a size in physical pixels, we need to multiply it by cursorScale to convert it to logical pixels.
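
As a made-up worked example that follows the convention above (the concrete numbers are mine, not from the actual implementation): on a monitor with 200% scaling, a HiDPI cursor texture reported as 48×48 physical pixels would come with a cursorScale of 0.5, yielding a 24×24 logical-pixel widget, which covers 48×48 physical pixels again when painted on that monitor:

// Illustrative numbers only, assuming the convention described above:
// a HiDPI cursor texture on a 200% monitor.
let [w, h] = [48, 48]; // physical pixels, from get_preferred_size()
let cursorScale = 0.5; // physical to logical on this setup

w *= cursorScale; // 24 logical pixels
h *= cursorScale; // 24 logical pixels
// The 24×24 logical-pixel widget covers 48×48 physical pixels when
// painted on the 200% monitor, matching the original texture size.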

With this, we have a working cursor preview in the screenshot UI:

How many layers of screenshot UI were used to take this picture?

When writing the final screenshot, we need to composite the cursor texture onto the screenshot image. To do this correctly, we need to take into account the scale of the screenshot texture, the scale of the cursor texture, the screen selection and the cursor coordinates:

Shell.Screenshot.capture_from_texture(
    // The screen texture.
    texture,
    // Selected area.
    x, y, w, h,
    // Scale of the screen texture.
    scale,
    // The cursor texture.
    cursorTexture,
    // Cursor coordinates in physical pixels.
    cursor.x * scale,
    cursor.y * scale,
    // Scale of the cursor texture.
    cursorScale,
    // ...
);

With this in place, cursor capturing works perfectly across mixed screen and cursor texture scales:

Previewing and capturing the cursor in various configurations

But we’re not done yet! Time for window selection.

In window selection mode, every window gets its own cursor preview sprite since the cursor can overlap multiple windows at once:

Overlapping cursor in screen selection and window selection modes

If you thought scale handling was complicated above, brace yourself, because window selection takes it a level further. Apart from the scale of the window buffer (the counterpart to the screenshot texture scale from before) and the scale of the cursor texture, there’s also the scale that the Overview-like window selection applies to windows to fit them all on screen. To handle all of this complex positioning, I overrode the allocate() virtual function of the window preview actor:

vfunc_allocate(box) {
    this.set_allocation(box);

    // Window buffer size in physical pixels.
    let [, windowW, windowH] =
        this.content.get_preferred_size();

    // Compute window scale.
    //
    // Divide by buffer scale to convert
    // from physical to logical pixels.
    let xScale =
        (box.x2 - box.x1) /
        (windowW / this._bufferScale);
    let yScale =
        (box.y2 - box.y1) /
        (windowH / this._bufferScale);

    let cursor = this.get_child();

    // Compute cursor size in logical pixels.
    let [, , w, h] =
        cursor.get_preferred_size();
    w *= this._cursorScale;
    h *= this._cursorScale;

    // The cursor position and size.
    let cursorBox = new Clutter.ActorBox({
        x1: this._cursorPoint.x,
        y1: this._cursorPoint.y,
        x2: this._cursorPoint.x + w,
        y2: this._cursorPoint.y + h,
    });

    // Rescale it to match the window scale.
    cursorBox.x1 *= xScale;
    cursorBox.x2 *= xScale;
    cursorBox.y1 *= yScale;
    cursorBox.y2 *= yScale;

    // Allocate the cursor.
    cursor.allocate(cursorBox);
}

Finally, we need to pass these values to the capture function in a similar fashion to what we did before:

Shell.Screenshot.capture_from_texture(
    // The window texture.
    texture,
    // Special values that mean
    // "record the whole texture".
    0, 0, -1, -1,
    // Scale of the window texture.
    window.bufferScale,
    // The cursor texture.
    cursorTexture,
    // Cursor coordinates in physical pixels.
    window.cursorPoint.x * window.bufferScale,
    window.cursorPoint.y * window.bufferScale,
    // Scale of the cursor texture.
    cursorScale,
    // ...
);

Phew! Now we can lean back and enjoy window screenshots with the cursor working perfectly across various screen, window and cursor scales. Don’t forget the cursor can be toggled on and off after the fact; this is what all the trouble was for!

Cursor capture on window selection

With pointer capturing implemented (although with some minor bugfixes still due), the next step is screen recording. You should be able to select an area, a monitor, or a window to record, optionally with a cursor, and start the recording. The design for what happens next is not finalized yet, but a natural place to put the recording indicator and the stop button seems to be the top-right menu on the panel.

Thanks for getting all the way through the post and see you in the next update! By the way, check out my GUADEC intern lightning talk about the new screenshot UI in this YouTube recording.

GSoC 2021: Selection Editing and Window Selection

This summer I’m implementing a new screenshot UI for GNOME Shell. In this post I’ll show my progress over the past two weeks.

The new screenshot UI in the area selection mode

I spent most of the time adding the four corner handles that allow you to adjust the selection. GNOME Shell’s drag-and-drop classes were mostly sufficient, save for a few minor things. In particular, I ended up extending the _Draggable class with a drag-motion signal emitted every time the dragged actor’s position changes. I used this signal to update the selection rectangle coordinates so it responds to dragging in real time without any lag, just as one would expect. Some careful handling was also required to allow dragging a handle past the selection edges, so that, for example, it’s possible to grab the top-left handle and move it to the right and to the bottom, making it a bottom-right handle.

Editing the selection by dragging the corner handles
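
The edge-crossing behavior essentially amounts to recomputing the selection rectangle from the dragged corner and the opposite, fixed corner on every drag-motion event. Here is a simplified sketch of the idea; it is not the actual Shell code, and the names are made up:

// fixedX/fixedY is the corner opposite to the handle being dragged;
// dragX/dragY is the pointer position reported on each 'drag-motion'.
function computeSelection(fixedX, fixedY, dragX, dragY) {
    return {
        x: Math.min(fixedX, dragX),
        y: Math.min(fixedY, dragY),
        width: Math.abs(dragX - fixedX),
        height: Math.abs(dragY - fixedY),
    };
}

// Dragging the top-left handle to the right of and below the fixed corner
// simply flips which coordinate ends up being the minimum, so the handle
// effectively becomes the bottom-right one without any special casing.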

I’ve also implemented a nicer animation for opening the screenshot UI. Now the screen instantly freezes when you press the Print Screen button and the screenshot UI fades in, without the awkward screenshot blend. Here’s a side-by-side comparison with the previous behavior:

Comparison of the old and new opening animation, slowed down 2×
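
The fade-in itself boils down to an opacity transition on the UI’s top-level actor. Here is a minimal sketch, assuming a top-level actor stored in this._screenshotUI; the name and the timing values are illustrative:

const Clutter = imports.gi.Clutter;

// The screen contents are already frozen in the captured snapshot,
// so the UI can simply fade in on top of them.
this._screenshotUI.opacity = 0;
this._screenshotUI.show();
this._screenshotUI.ease({
    opacity: 255,
    duration: 200,
    mode: Clutter.AnimationMode.EASE_OUT_QUAD,
});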

Additionally, I fixed X11 support for the new screenshot capturing. Whereas on Wayland the contents of the screen are readily available because GNOME Shell is responsible for all screen compositing, on X11 that’s not always the case: full-screen windows get unredirected, which means they bypass the compositor and go straight through the X server to the monitor. To capture a screenshot, then, GNOME Shell first needs to disable unredirection for one frame and paint the stage.
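
Conceptually, the X11 capture path wraps one frame like this (a rough sketch only; the real logic lives in Mutter and GNOME Shell internals, and the exact calls may differ):

const Meta = imports.gi.Meta;

// Force full-screen windows back through the compositor...
Meta.disable_unredirect_for_display(global.display);

// ...let the stage paint one frame and capture it here...

// ...then allow unredirection again.
Meta.enable_unredirect_for_display(global.display);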

This X11 capturing works just as well as on Wayland, including the ability to capture transient windows such as tooltips—a long-requested feature. However, certain right-click menus on X11 grab the input and prevent the screenshot UI hotkey (and other hotkeys such as Super to enter the Overview) from working. This has been a long-standing limitation of the X11 session; unfortunately, these menus cannot be captured on X11. On Wayland this is not a problem as GNOME Shell handles all input itself, so windows cannot block its hotkeys.

Finally, over the past few days I’ve been working on window selection. Similarly to full-screen screenshots, every window’s contents are captured immediately as you open the screenshot UI, allowing you to pick the right window at your own pace. To capture the window contents I use Robert Mader’s implementation, which I invoke for all windows from the current workspace when the screenshot UI is opening. I arrange these window snapshots in a grid similar to the Overview and let the user pick the right window.

Window selection in action

As usual, the design is nowhere near finished or designer-approved. Consider it an instance of my “programmer art”. 😁

My goal was to re-use as much of the Overview window layout code as possible. I ended up making my own copy of the WorkspaceLayout class (I was able to strip it down considerably because the original class has to deal with windows disappearing, re-appearing and changing size, whereas the screenshot UI window snapshots never change) and directly re-using the rest of the machinery. I also made my own widget compatible with WindowPreview that exposes the few functions used by the layout code, once again considerably simplified thanks to not having to deal with the ever-changing real windows.
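
To give an idea of what that compatible widget looks like, here is a heavily simplified sketch; the class name and the exact set of properties are illustrative rather than the real implementation:

const { GObject, St } = imports.gi;

var WindowSnapshot = GObject.registerClass(
class WindowSnapshot extends St.Widget {
    _init(content, boundingBox) {
        // The snapshot of the window contents is just a ClutterContent.
        super._init({ content });
        this._boundingBox = boundingBox;
    }

    // The layout code only needs a few window-geometry getters that
    // WindowPreview would normally provide.
    get boundingBox() {
        return this._boundingBox;
    }

    get windowCenter() {
        return {
            x: this._boundingBox.x + this._boundingBox.width / 2,
            y: this._boundingBox.y + this._boundingBox.height / 2,
        };
    }
});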

The next step is to put more work into the window selection to make sure it handles all the different setups and edge cases right: the current implementation is essentially the first working draft that only supports the primary monitor. Then I’ll need to add the ability to pick the monitor in the screen selection mode and make sure it works fine with different setups too. I also want to figure out capturing screenshots with a visible cursor, which is currently notably missing from the screenshot UI. After that I’ll tackle the screen recording half.

Also, unrelated to the screenshot UI, I’m happy to announce that my merge request for reducing input latency in Mutter has finally been merged and should be included in Mutter 41.alpha.

That’s it for this post, see you in the next update!

GSoC 2021: GNOME Shell Screenshot UI

Hello! I’m Ivan Molodetskikh, a computer science student from Moscow, Russia.

I’ve been involved in GNOME starting from my GSoC 2018 project to port librsvg filters to Rust. Throughout the last year in GNOME I’ve been doing some work to reduce input latency in Mutter, GNOME’s compositor (by implementing the presentation-time Wayland protocol and adding dynamic render time computation). I’ve also created two small apps, Video Trimmer and Identity.

As part of this year’s Google Summer of Code, I’m implementing a new screenshot UI in GNOME Shell.

Screenshot UI panel mock-up by the design team

The UI will make taking screenshots and recording screencasts more intuitive and discoverable. On a key press, GNOME Shell will capture a full screenshot, and you will be able to select the exact area you want. The screenshot is captured immediately, so it’s much easier to catch the right moment or capture open context menus.

Screencasts will get an upgrade too: you will be able to record areas of the screen or individual windows, just like you already can with screenshots.

Over the first few weeks I figured out how to add new UI elements to GNOME Shell: how to construct UI with GJS, how to style elements with CSS, the difference between Clutter actors and layouts and St (GNOME Shell’s toolkit) widgets, how to do transitions and handle input. I’ve been basing my work on the UI mock-up from the design team. Here’s a short demo of what I’ve implemented so far:

Demo of the parts of the mock-up that I’ve implemented thus far

Keep in mind this is very much a work-in-progress: I used stock icons instead of correct mock-up ones, I haven’t got any designer feedback yet, screen recording is not implemented and so on.
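
For a taste of what building UI in GNOME Shell with GJS and St looks like, here is a tiny self-contained sketch of adding a styled St widget to the Shell; everything in it (the style class, the label, where it is added) is purely illustrative and not the actual screenshot UI code:

const St = imports.gi.St;
const Main = imports.ui.main;

// A container widget styled through a CSS class from the Shell stylesheet.
let panel = new St.BoxLayout({
    style_class: 'my-example-panel',
    vertical: true,
});
panel.add_child(new St.Label({ text: 'Hello from the Shell UI' }));

// Add it as "chrome" so it is drawn above windows and receives input.
Main.layoutManager.addTopChrome(panel);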

Using Robert Mader’s texture actor implementation, I added a Mutter function to snapshot the screen contents into a GPU texture that can be shown on a GNOME Shell widget. This way I can instantly display the screenshot preview in the UI without doing a slow PNG encoding round-trip. The UI then allows you to select an area or a screen and capture it into an image by pressing the capture button. Currently, the image is copied to the clipboard. I paste the screenshot into Obfuscate to display it.
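
In GJS terms, the flow looks roughly like this (a simplified sketch; the real call, as the newer posts above show, returns several more values):

const { Shell, St } = imports.gi;

async function makePreview() {
    // Snapshot the screen contents into a GPU-backed content object.
    let screenshot = new Shell.Screenshot();
    let [content] = await screenshot.to_content();

    // Show it on a widget right away, with no PNG encoding round-trip.
    return new St.Widget({ content });
}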

When switching to the screencast mode, you simply see your desktop normally instead of the screen snapshot, because screen recording starts only when you press the capture button, not from an old screen snapshot.

The next step is to implement Window selection, which will arrange windows similarly to the Overview. Afterwards I’ll work on the screen recording part. I have also contacted the design team to get feedback and make sure the UI is the best it can be.

I’d like to thank my mentor, Jonas Dreßler (aka verdre), for keeping up with my questions. I’m excited to bring an awesome screenshot UI to GNOME, see you all in the next blog posts!

GSoC 2018: Overview

Introduction

Throughout the summer I worked on librsvg, a GNOME library for rendering SVG files to Cairo surfaces. This post is an overview of the work I did, with relevant links.

My Results

For the project I was to port the SVG filter infrastructure of librsvg from C to Rust, adding all missing filter tests from the SVG test suite along the way. I was also expected to implement abstractions to make the filter implementation more convenient, including Rust iterators over the surface pixels.

Here’s a list of all merge requests accepted into librsvg as part of my GSoC project:

Here’s a convenient link to see all of these merge requests in GitLab: https://gitlab.gnome.org/GNOME/librsvg/merge_requests?scope=all&utf8=%E2%9C%93&state=all&author_username=YaLTeR&label_name[]=GSoC%202018

All of this code was accepted into the mainline and will appear in the next stable release of librsvg.

I also wrote the following blog posts detailing some interesting things I worked on as part of the GSoC project:

Further Work

There are a couple of fixes which still need to be done for filters to be feature-complete:

  • Fixing filters operating on off-screen nodes. Currently all intermediate surfaces are limited to the original SVG view area, so anything off-screen is inaccessible to filters even when it should be accessible. This is blocked on some considerable refactoring in the main librsvg node drawing code which is currently underway.
  • Implementing the filterRes property. This property allows setting the pixel resolution for filter operations and is one of the ways of achieving more resolution-independent rendering results. While it can be implemented with the current code as is, it will be much more convenient to account for it while refactoring the code to fix the previous issue.
  • Implementing the enable-background property. The BackgroundImage filter input should adhere to this property when picking which nodes to include in the background image, whereas it currently doesn’t.