This is the multi-page printable view of this section. Click here to print.

Return to the regular view of this page.

Actions API

A low-level interface for providing virtualized device input actions to the web browser.

In addition to the high-level element interactions, the Actions API provides granular control over exactly what designated input devices can do. Selenium provides an interface for 3 kinds of input sources: a key input for keyboard devices, a pointer input for a mouse, pen or touch devices, and wheel inputs for scroll wheel devices (introduced in Selenium 4.2). Selenium allows you to construct individual action commands assigned to specific inputs and chain them together and call the associated perform method to execute them all at once.

Action Builder

In the move from the legacy JSON Wire Protocol to the new W3C WebDriver Protocol, the low level building blocks of actions became especially detailed. It is extremely powerful, but each input device has a number of ways to use it and if you need to manage more than one device, you are responsible for ensuring proper synchronization between them.

Thankfully, you likely do not need to learn how to use the low level commands directly, since almost everything you might want to do has been given a convenience method that combines the lower level commands for you. These are all documented in keyboard, mouse, pen, and wheel pages.

Pause

Pointer movements and Wheel scrolling allow the user to set a duration for the action, but sometimes you just need to wait a beat between actions for things to work correctly.

        WebElement clickable = driver.findElement(By.id("clickable"));
        new Actions(driver)
                .moveToElement(clickable)
                .pause(Duration.ofSeconds(1))
                .clickAndHold()
                .pause(Duration.ofSeconds(1))
                .sendKeys("abc")
                .perform();
    clickable = driver.find_element(By.ID, "clickable")
    ActionChains(driver)\
        .move_to_element(clickable)\
        .pause(1)\
        .click_and_hold()\
        .pause(1)\
        .send_keys("abc")\
        .perform()

Selenium v4.2

            IWebElement clickable = driver.FindElement(By.Id("clickable"));
            new Actions(driver)
                .MoveToElement(clickable)
                .Pause(TimeSpan.FromSeconds(1))
                .ClickAndHold()
                .Pause(TimeSpan.FromSeconds(1))
                .SendKeys("abc")
                .Perform();

Selenium v4.2

    clickable = @driver.find_element(id: 'clickable')
    @driver.action
           .move_to(clickable)
           .pause(duration: 1)
           .click_and_hold
           .pause(duration: 1)
           .send_keys("abc")
           .perform
    // Add Code
    // Add Code

Release All Actions

An important thing to note is that the driver remembers the state of all the input items throughout a session. Even if you create a new instance of an actions class, the depressed keys and the location of the pointer will be in whatever state a previously performed action left them.

There is a special method to release all currently depressed keys and pointer buttons. This method is implemented differently in each of the languages because it does not get executed with the perform method.

        ((RemoteWebDriver) driver).resetInputState();
    ActionBuilder(driver).clear_actions()
            ((WebDriver)driver).ResetInputState();
    @driver.action.release_actions
    // Add Code
    // Add Code

1 - Keyboard actions

A representation of any key input device for interacting with a web page.

There are only 2 actions that can be accomplished with a keyboard: pressing down on a key, and releasing a pressed key. In addition to supporting ASCII characters, each keyboard key has a representation that can be pressed or released in designated sequences.

Keys

In addition to the keys represented by regular unicode, unicode values have been assigned to other keyboard keys for use with Selenium. Each language has its own way to reference these keys; the full list can be found here.

Key down

        new Actions(driver)
                .keyDown(Keys.SHIFT)
                .sendKeys("a")
                .perform();
    ActionChains(driver)\
        .key_down(Keys.SHIFT)\
        .send_keys("abc")\
        .perform()
            new Actions(driver)
                .KeyDown(Keys.Shift)
                .SendKeys("a")
                .Perform();
    @driver.action
           .key_down(:shift)
           .send_keys('a')
           .perform
      await driver.actions()
        .keyDown(Key.SHIFT)
        .sendKeys('a')
        .perform();
    // Add Code

Key up

        new Actions(driver)
                .keyDown(Keys.SHIFT)
                .sendKeys("a")
                .keyUp(Keys.SHIFT)
                .sendKeys("b")
                .perform();
    ActionChains(driver)\
        .key_down(Keys.SHIFT)\
        .send_keys("a")\
        .key_up(Keys.SHIFT)\
        .send_keys("b")\
        .perform()
            new Actions(driver)
                .KeyDown(Keys.Shift)
                .SendKeys("a")
                .KeyUp(Keys.Shift)
                .SendKeys("b")
                .Perform();
    @driver.action
           .key_down(:shift)
           .send_keys('a')
           .key_up(:shift)
           .send_keys('b')
           .perform
    // Add Code
    // Add Code

Send keys

This is a convenience method in the Actions API that combines keyDown and keyUp commands in one action. Executing this command differs slightly from using the element method, but primarily this gets used when needing to type multiple characters in the middle of other actions.

Active Element

        new Actions(driver)
                .sendKeys("abc")
                .perform();
    ActionChains(driver)\
        .send_keys("abc")\
        .perform()
            new Actions(driver)
                .SendKeys("abc")
                .Perform();
    @driver.action
           .send_keys('abc')
           .perform
    // Add Code
    // Add Code

Designated Element

        WebElement textField = driver.findElement(By.id("textInput"));
        new Actions(driver)
                .sendKeys(textField, "Selenium!")
                .perform();
    text_input = driver.find_element(By.ID, "textInput")
    ActionChains(driver)\
        .send_keys_to_element(text_input, "abc")\
        .perform()
            IWebElement textField = driver.FindElement(By.Id("textInput"));
            new Actions(driver)
                .SendKeys(textField, "abc")
                .Perform();
    text_field = @driver.find_element(id: 'textInput')
    @driver.action
           .send_keys(text_field, 'Selenium!')
           .perform
    // Add Code
    // Add Code

Copy and Paste

Here’s an example of using all of the above methods to conduct a copy / paste action. Note that the key to use for this operation will be different depending on if it is a Mac OS or not. This code will end up with the text: SeleniumSelenium!

        Keys cmdCtrl = platformName.is(Platform.MAC) ? Keys.COMMAND : Keys.CONTROL;

        WebElement textField = driver.findElement(By.id("textInput"));
        new Actions(driver)
                .sendKeys(textField, "Selenium!")
                .sendKeys(Keys.ARROW_LEFT)
                .keyDown(Keys.SHIFT)
                .sendKeys(Keys.ARROW_UP)
                .keyUp(Keys.SHIFT)
                .keyDown(cmdCtrl)
                .sendKeys("xvv")
                .keyUp(cmdCtrl)
                .perform();
    cmd_ctrl = Keys.COMMAND if sys.platform == 'darwin' else Keys.CONTROL

    ActionChains(driver)\
        .send_keys("Selenium!")\
        .send_keys(Keys.ARROW_LEFT)\
        .key_down(Keys.SHIFT)\
        .send_keys(Keys.ARROW_UP)\
        .key_up(Keys.SHIFT)\
        .key_down(cmd_ctrl)\
        .send_keys("xvv")\
        .key_up(cmd_ctrl)\
        .perform()
            String cmdCtrl = platformName.Contains("mac") ? Keys.Command : Keys.Control;

            new Actions(driver)
                .SendKeys("Selenium!")
                .SendKeys(Keys.ArrowLeft)
                .KeyDown(Keys.Shift)
                .SendKeys(Keys.ArrowUp)
                .KeyUp(Keys.Shift)
                .KeyDown(cmdCtrl)
                .SendKeys("xvv")
                .KeyUp(cmdCtrl)
                .Perform();
    @cmd_ctrl = @driver.capabilities.platform_name.include?('mac') ? :command : :control
    @driver.action
           .send_keys('Selenium!')
           .send_keys(:arrow_left)
           .key_down(:shift)
           .send_keys(:arrow_up)
           .key_up(:shift)
           .key_down(@cmd_ctrl)
           .send_keys('xvv')
           .key_up(@cmd_ctrl)
           .perform
    // Add Code
    // Add Code

2 - Mouse actions

A representation of any pointer device for interacting with a web page.

There are only 3 actions that can be accomplished with a mouse: pressing down on a button, releasing a pressed button, and moving the mouse. Selenium provides convenience methods that combine these actions in the most common ways.

Click and hold

This method combines moving the mouse to the center of an element with pressing the left mouse button. This is useful for focusing a specific element:

        WebElement clickable = driver.findElement(By.id("clickable"));
        new Actions(driver)
                .clickAndHold(clickable)
                .perform();
    clickable = driver.find_element(By.ID, "clickable")
    ActionChains(driver)\
        .click_and_hold(clickable)\
        .perform()
            IWebElement clickable = driver.FindElement(By.Id("clickable"));
            new Actions(driver)
                .ClickAndHold(clickable)
                .Perform();
    clickable = @driver.find_element(id: 'clickable')
    @driver.action
           .click_and_hold(clickable)
           .perform
    // Add Code
    // Add Code

Click and release

This method combines moving to the center of an element with pressing and releasing the left mouse button. This is otherwise known as “clicking”:

        WebElement clickable = driver.findElement(By.id("click"));
        new Actions(driver)
                .click(clickable)
                .perform();
    clickable = driver.find_element(By.ID, "click")
    ActionChains(driver)\
        .click(clickable)\
        .perform()
            IWebElement clickable = driver.FindElement(By.Id("click"));
            new Actions(driver)
                .Click(clickable)
                .Perform();
    clickable = @driver.find_element(id: 'click')
    @driver.action
           .click(clickable)
           .perform
    // Add Code
    // Add Code

Alternate Button Clicks

There are a total of 5 defined buttons for a Mouse:

  • 0 — Left Button (the default)
  • 1 — Middle Button (currently unsupported)
  • 2 — Right Button
  • 3 — X1 (Back) Button
  • 4 — X2 (Forward) Button

Context Click

This method combines moving to the center of an element with pressing and releasing the right mouse button (button 2). This is otherwise known as “right-clicking”:

        WebElement clickable = driver.findElement(By.id("clickable"));
        new Actions(driver)
                .contextClick(clickable)
                .perform();
    clickable = driver.find_element(By.ID, "clickable")
    ActionChains(driver)\
        .context_click(clickable)\
        .perform()
            IWebElement clickable = driver.FindElement(By.Id("clickable"));
            new Actions(driver)
                .ContextClick(clickable)
                .Perform();
    clickable = @driver.find_element(id: 'clickable')
    @driver.action
           .context_click(clickable)
           .perform
    // Add Code
    // Add Code

Back Click

There is no convenience method for this, it is just pressing and releasing mouse button 3

        PointerInput mouse = new PointerInput(PointerInput.Kind.MOUSE, "default mouse");

        Sequence actions = new Sequence(mouse, 0)
                .addAction(mouse.createPointerDown(PointerInput.MouseButton.BACK.asArg()))
                .addAction(mouse.createPointerUp(PointerInput.MouseButton.BACK.asArg()));

        ((RemoteWebDriver) driver).perform(Collections.singletonList(actions));

Selenium v4.2

    action = ActionBuilder(driver)
    action.pointer_action.pointer_down(MouseButton.BACK)
    action.pointer_action.pointer_up(MouseButton.BACK)
    action.perform()

Selenium v4.2

            ActionBuilder actionBuilder = new ActionBuilder();
            PointerInputDevice mouse = new PointerInputDevice(PointerKind.Mouse, "default mouse");
            actionBuilder.AddAction(mouse.CreatePointerDown(MouseButton.Back));
            actionBuilder.AddAction(mouse.CreatePointerUp(MouseButton.Back));
            ((IActionExecutor)driver).PerformActions(actionBuilder.ToActionSequenceList());

Selenium v4.2

    @driver.action
           .pointer_down(:back)
           .pointer_up(:back)
           .perform
    // Add Code
    // Add Code

Forward Click

There is no convenience method for this, it is just pressing and releasing mouse button 4

        PointerInput mouse = new PointerInput(PointerInput.Kind.MOUSE, "default mouse");

        Sequence actions = new Sequence(mouse, 0)
                .addAction(mouse.createPointerDown(PointerInput.MouseButton.FORWARD.asArg()))
                .addAction(mouse.createPointerUp(PointerInput.MouseButton.FORWARD.asArg()));

        ((RemoteWebDriver) driver).perform(Collections.singletonList(actions));

Selenium v4.2

    action = ActionBuilder(driver)
    action.pointer_action.pointer_down(MouseButton.FORWARD)
    action.pointer_action.pointer_up(MouseButton.FORWARD)
    action.perform()

Selenium v4.2

            ActionBuilder actionBuilder = new ActionBuilder();
            PointerInputDevice mouse = new PointerInputDevice(PointerKind.Mouse, "default mouse");
            actionBuilder.AddAction(mouse.CreatePointerDown(MouseButton.Forward));
            actionBuilder.AddAction(mouse.CreatePointerUp(MouseButton.Forward));
            ((IActionExecutor)driver).PerformActions(actionBuilder.ToActionSequenceList());

Selenium v4.2

    @driver.action
           .pointer_down(:forward)
           .pointer_up(:forward)
           .perform
    // Add Code
    // Add Code

Double click

This method combines moving to the center of an element with pressing and releasing the left mouse button twice.

        WebElement clickable = driver.findElement(By.id("clickable"));
        new Actions(driver)
                .doubleClick(clickable)
                .perform();
    clickable = driver.find_element(By.ID, "clickable")
    ActionChains(driver)\
        .double_click(clickable)\
        .perform()
            IWebElement clickable = driver.FindElement(By.Id("clickable"));
            new Actions(driver)
                .DoubleClick(clickable)
                .Perform();
    clickable = @driver.find_element(id: 'clickable')
    @driver.action
           .double_click(clickable)
           .perform
    // Add Code
    // Add Code

Move to element

This method moves the mouse to the in-view center point of the element. This is otherwise known as “hovering.” Note that the element must be in the viewport or else the command will error.

        WebElement hoverable = driver.findElement(By.id("hover"));
        new Actions(driver)
                .moveToElement(hoverable)
                .perform();
    hoverable = driver.find_element(By.ID, "hover")
    ActionChains(driver)\
        .move_to_element(hoverable)\
        .perform()
            IWebElement hoverable = driver.FindElement(By.Id("hover"));
            new Actions(driver)
                .MoveToElement(hoverable)
                .Perform();
    hoverable = @driver.find_element(id: 'hover')
    @driver.action
           .move_to(hoverable)
           .perform
    // Add Code
    // Add Code

Move by offset

These methods first move the mouse to the designated origin and then by the number of pixels in the provided offset. Note that the position of the mouse must be in the viewport or else the command will error.

Offset from Element (Top Left Origin)

This method moves the mouse to the in-view center point of the element then attempts to move to the upper left corner of the element and then moves by the provided offset.

This will be removed as an option in Selenium 4.3, and only an offset from center of the element will be supported. As of Selenium 4.2, this is the default behavior for Ruby, .NET and Python in order to be backwards compatible with previous versions of Selenium. This approach does not work correctly when the element is not entirely inside the viewport.

Not Implemented in Selenium 4
    mouse_tracker = driver.find_element(By.ID, "mouse-tracker")
    ActionChains(driver)\
        .move_to_element_with_offset(mouse_tracker, 8, 11)\
        .perform()
            IWebElement tracker = driver.FindElement(By.Id("mouse-tracker"));
            new Actions(driver)
                .MoveToElement(tracker, 8, 11, MoveToElementOffsetOrigin.TopLeft)
                .Perform();
    mouse_tracker = @driver.find_element(id: 'mouse-tracker')
    @driver.action
           .move_to(mouse_tracker, 8, 11)
           .perform
    // Add Code
    // Add Code

Offset from Element (Center Origin)

This method moves to the in-view center point of the element, then moves the mouse by the provided offset

This is the default behavior in Java as of Selenium 4.0, and will be the default for the remaining languages as of Selenium 4.3.

        WebElement tracker = driver.findElement(By.id("mouse-tracker"));
        new Actions(driver)
                .moveToElement(tracker, 8, 11)
                .perform();
Coming in Selenium 4.3
            IWebElement tracker = driver.FindElement(By.Id("mouse-tracker"));
            new Actions(driver)
                .MoveToElement(tracker, 8, 11, MoveToElementOffsetOrigin.Center)
                .Perform();
Coming in Selenium 4.3
    // Add Code
    // Add Code

Offset from Viewport

This method moves the mouse from the upper left corner of the current viewport by the provided offset.

        PointerInput mouse = new PointerInput(PointerInput.Kind.MOUSE, "default mouse");

        Sequence actions = new Sequence(mouse, 0)
                .addAction(mouse.createPointerMove(Duration.ZERO, PointerInput.Origin.viewport(), 8, 12));

        ((RemoteWebDriver) driver).perform(Collections.singletonList(actions));
    action = ActionBuilder(driver)
    action.pointer_action.move_to_location(8, 12)
    action.perform()
            ActionBuilder actionBuilder = new ActionBuilder();
            PointerInputDevice mouse = new PointerInputDevice(PointerKind.Mouse, "default mouse");
            actionBuilder.AddAction(mouse.CreatePointerMove(CoordinateOrigin.Viewport,
                8, 12, TimeSpan.Zero));
            ((IActionExecutor)driver).PerformActions(actionBuilder.ToActionSequenceList());
    @driver.action
           .move_to_location(8, 12)
           .perform
    // Add Code
    // Add Code

Offset from Current Pointer Location

This method moves the mouse from its current position by the offset provided by the user. If the mouse has not previously been moved, the position will be in the upper left corner of the viewport. Note that the pointer position does not change when the page is scrolled.

        new Actions(driver)
                .moveByOffset(13, 15)
                .perform();
    ActionChains(driver)\
        .move_by_offset( 13, 15)\
        .perform()
            new Actions(driver)
                .MoveByOffset(13, 15)
                .Perform();
    @driver.action
           .move_by(13, 15)
           .perform
    // Add Code
    // Add Code

Drag and Drop on Element

This method firstly performs a click-and-hold on the source element, moves to the location of the target element and then releases the mouse.

        WebElement draggable = driver.findElement(By.id("draggable"));
        WebElement droppable = driver.findElement(By.id("droppable"));
        new Actions(driver)
                .dragAndDrop(draggable, droppable)
                .perform();
    draggable = driver.find_element(By.ID, "draggable")
    droppable = driver.find_element(By.ID, "droppable")
    ActionChains(driver)\
        .drag_and_drop(draggable, droppable)\
        .perform()
            IWebElement draggable = driver.FindElement(By.Id("draggable"));
            IWebElement droppable = driver.FindElement(By.Id("droppable"));
            new Actions(driver)
                .DragAndDrop(draggable, droppable)
                .Perform();
    draggable = @driver.find_element(id: 'draggable')
    droppable = @driver.find_element(id: 'droppable')
    @driver.action
           .drag_and_drop(draggable, droppable)
           .perform
    // Add Code
    // Add Code

Drag and Drop by Offset

This method firstly performs a click-and-hold on the source element, moves to the given offset and then releases the mouse.

        WebElement draggable = driver.findElement(By.id("draggable"));
        Rectangle start = draggable.getRect();
        Rectangle finish = driver.findElement(By.id("droppable")).getRect();
        new Actions(driver)
                .dragAndDropBy(draggable, finish.getX() - start.getX(), finish.getY() - start.getY())
                .perform();
    draggable = driver.find_element(By.ID, "draggable")
    start = draggable.location
    finish = driver.find_element(By.ID, "droppable").location
    ActionChains(driver)\
        .drag_and_drop_by_offset(draggable, finish['x'] - start['x'], finish['y'] - start['y'])\
        .perform()
            IWebElement draggable = driver.FindElement(By.Id("draggable"));
            Point start = draggable.Location;
            Point finish = driver.FindElement(By.Id("droppable")).Location;
            new Actions(driver)
                .DragAndDropToOffset(draggable, finish.X - start.X, finish.Y - start.Y)
                .Perform();
    draggable = @driver.find_element(id: 'draggable')
    start = draggable.rect
    finish = @driver.find_element(id: 'droppable').rect
    @driver.action
           .drag_and_drop_by(draggable, finish.x - start.x, finish.y - start.y)
           .perform
    // Add Code
    // Add Code

3 - Pen actions

A representation of a pen stylus kind of pointer input for interacting with a web page.

Chromium Only

A Pen is a type of pointer input that has most of the same behavior as a mouse, but can also have event properties unique to a stylus. Additionally, while a mouse has 5 buttons, a pen has 3 equivalent button states:

  • 0 — Touch Contact (the default; equivalent to a left click)
  • 2 — Barrel Button (equivalent to a right click)
  • 5 — Eraser Button (currently unsupported by drivers)

Using a Pen

Selenium v4.2

        WebElement pointerArea = driver.findElement(By.id("pointerArea"));
        new Actions(driver)
                .setActivePointer(PointerInput.Kind.PEN, "default pen")
                .moveToElement(pointerArea)
                .clickAndHold()
                .moveByOffset(2, 2)
                .release()
                .perform();

Selenium v4.2

    pointer_area = driver.find_element(By.ID, "pointerArea")
    pen_input = PointerInput(POINTER_PEN, "default pen")
    action = ActionBuilder(driver, mouse=pen_input)
    action.pointer_action\
        .move_to(pointer_area)\
        .pointer_down()\
        .move_by(2, 2)\
        .pointer_up()
    action.perform()
            IWebElement pointerArea = driver.FindElement(By.Id("pointerArea"));
            ActionBuilder actionBuilder = new ActionBuilder();
            PointerInputDevice pen = new PointerInputDevice(PointerKind.Pen, "default pen");
            
            actionBuilder.AddAction(pen.CreatePointerMove(pointerArea, 0, 0, TimeSpan.FromMilliseconds(800)));
            actionBuilder.AddAction(pen.CreatePointerDown(MouseButton.Left));
            actionBuilder.AddAction(pen.CreatePointerMove(CoordinateOrigin.Pointer,
                2, 2, TimeSpan.Zero));
            actionBuilder.AddAction(pen.CreatePointerUp(MouseButton.Left));
            ((IActionExecutor)driver).PerformActions(actionBuilder.ToActionSequenceList());

Selenium v4.2

    pointer_area = @driver.find_element(id: 'pointerArea')
    @driver.action(devices: :pen)
           .move_to(pointer_area)
           .pointer_down
           .move_by(2, 2)
           .pointer_up
           .perform
    // Add Code
    // Add Code

Adding Pointer Event Attributes

Selenium v4.2

        WebElement pointerArea = driver.findElement(By.id("pointerArea"));
        PointerInput pen = new PointerInput(PointerInput.Kind.PEN, "default pen");
        PointerInput.PointerEventProperties eventProperties = PointerInput.eventProperties()
                .setTiltX(-72)
                .setTiltY(9)
                .setTwist(86);
        PointerInput.Origin origin = PointerInput.Origin.fromElement(pointerArea);

        Sequence actionListPen = new Sequence(pen, 0)
                .addAction(pen.createPointerMove(Duration.ZERO, origin, 0, 0))
                .addAction(pen.createPointerDown(0))
                .addAction(pen.createPointerMove(Duration.ZERO, origin, 2, 2, eventProperties))
                .addAction(pen.createPointerUp(0));

        ((RemoteWebDriver) driver).perform(Collections.singletonList(actionListPen));
    pointer_area = driver.find_element(By.ID, "pointerArea")
    pen_input = PointerInput(POINTER_PEN, "default pen")
    action = ActionBuilder(driver, mouse=pen_input)
    action.pointer_action\
        .move_to(pointer_area)\
        .pointer_down()\
        .move_by(2, 2, tilt_x=-72, tilt_y=9, twist=86)\
        .pointer_up(0)
    action.perform()
            IWebElement pointerArea = driver.FindElement(By.Id("pointerArea"));
            ActionBuilder actionBuilder = new ActionBuilder();
            PointerInputDevice pen = new PointerInputDevice(PointerKind.Pen, "default pen");
            PointerInputDevice.PointerEventProperties properties = new PointerInputDevice.PointerEventProperties() {
                TiltX = -72,
                TiltY = 9,
                Twist = 86,
            };            
            actionBuilder.AddAction(pen.CreatePointerMove(pointerArea, 0, 0, TimeSpan.FromMilliseconds(800)));
            actionBuilder.AddAction(pen.CreatePointerDown(MouseButton.Left));
            actionBuilder.AddAction(pen.CreatePointerMove(CoordinateOrigin.Pointer,
                2, 2, TimeSpan.Zero, properties));
            actionBuilder.AddAction(pen.CreatePointerUp(MouseButton.Left));
            ((IActionExecutor)driver).PerformActions(actionBuilder.ToActionSequenceList());
    pointer_area = @driver.find_element(id: 'pointerArea')
    @driver.action(devices: :pen)
           .move_to(pointer_area)
           .pointer_down
           .move_by(2, 2, tilt_x: -72, tilt_y: 9, twist: 86)
           .pointer_up
           .perform
    // Add Code
    // Add Code

4 - Scroll wheel actions

A representation of a scroll wheel input device for interacting with a web page.

Selenium v4.2

Chromium Only

There are 5 scenarios for scrolling on a page.

Scroll to element

This is the most common scenario. Unlike traditional click and send keys methods, the actions class does not automatically scroll the target element into view, so this method will need to be used if elements are not already inside the viewport.

This method takes a web element as the sole argument.

Regardless of whether the element is above or below the current viewscreen, the viewport will be scrolled so the bottom of the element is at the bottom of the screen.

        WebElement iframe = driver.findElement(By.tagName("iframe"));
        new Actions(driver)
                .scrollToElement(iframe)
                .perform();
    iframe = driver.find_element(By.TAG_NAME, "iframe")
    ActionChains(driver)\
        .scroll_to_element(iframe)\
        .perform()
            IWebElement iframe = driver.FindElement(By.TagName("iframe"));
            new Actions(driver)
                .ScrollToElement(iframe)
                .Perform();
    iframe = @driver.find_element(tag_name: 'iframe')
    @driver.action
           .scroll_to(iframe)
           .perform
    Not implemented yet
    Not implemented yet

Scroll by given amount

This is the second most common scenario for scrolling. Pass in an delta x and a delta y value for how much to scroll in the right and down directions. Negative values represent left and up, respectively.

        WebElement footer = driver.findElement(By.tagName("footer"));
        int deltaY = footer.getRect().y;
        new Actions(driver)
                .scrollByAmount(0, deltaY)
                .perform();
    footer = driver.find_element(By.TAG_NAME, "footer")
    delta_y = footer.rect['y']
    ActionChains(driver)\
        .scroll_by_amount(0, delta_y)\
        .perform()
            IWebElement footer = driver.FindElement(By.TagName("footer"));
            int deltaY = footer.Location.Y;
            new Actions(driver)
                .ScrollByAmount(0, deltaY)
                .Perform();
    footer = @driver.find_element(tag_name: 'footer')
    delta_y = footer.rect.y
    @driver.action
           .scroll_by(0, delta_y)
           .perform
    Not implemented yet
    Not implemented yet

Scroll from an element by a given amount

This scenario is effectively a combination of the above two methods.

To execute this use the “Scroll From” method, which takes 3 arguments. The first represents the origination point, which we designate as the element, and the second two are the delta x and delta y values.

If the element is out of the viewport, it will be scrolled to the bottom of the screen, then the page will be scrolled by the provided delta x and delta y values.

        WebElement iframe = driver.findElement(By.tagName("iframe"));
        WheelInput.ScrollOrigin scrollOrigin = WheelInput.ScrollOrigin.fromElement(iframe);
        new Actions(driver)
                .scrollFromOrigin(scrollOrigin, 0, 200)
                .perform();
    iframe = driver.find_element(By.TAG_NAME, "iframe")
    scroll_origin = ScrollOrigin.from_element(iframe)
    ActionChains(driver)\
        .scroll_from_origin(scroll_origin, 0, 200)\
        .perform()
            IWebElement iframe = driver.FindElement(By.TagName("iframe"));
            WheelInputDevice.ScrollOrigin scrollOrigin = new WheelInputDevice.ScrollOrigin
            {
                Element = iframe
            };
            new Actions(driver)
                .ScrollFromOrigin(scrollOrigin, 0, 200)
                .Perform();
    iframe = @driver.find_element(tag_name: 'iframe')
    scroll_origin = Selenium::WebDriver::WheelActions::ScrollOrigin.element(iframe)
    @driver.action
           .scroll_from(scroll_origin, 0, 200)
           .perform
    Not implemented yet
    Not implemented yet

Scroll from an element with an offset

This scenario is used when you need to scroll only a portion of the screen, and it is outside the viewport. Or is inside the viewport and the portion of the screen that must be scrolled is a known offset away from a specific element.

This uses the “Scroll From” method again, and in addition to specifying the element, an offset is specified to indicate the origin point of the scroll. The offset is calculated from the center of the provided element.

If the element is out of the viewport, it first will be scrolled to the bottom of the screen, then the origin of the scroll will be determined by adding the offset to the coordinates of the center of the element, and finally the page will be scrolled by the provided delta x and delta y values.

Note that if the offset from the center of the element falls outside of the viewport, it will result in an exception.

        WebElement footer = driver.findElement(By.tagName("footer"));
        WheelInput.ScrollOrigin scrollOrigin = WheelInput.ScrollOrigin.fromElement(footer, 0, -50);
        new Actions(driver)
                .scrollFromOrigin(scrollOrigin,0, 200)
                .perform();
    footer = driver.find_element(By.TAG_NAME, "footer")
    scroll_origin = ScrollOrigin.from_element(footer, 0, -50)
    ActionChains(driver)\
        .scroll_from_origin(scroll_origin, 0, 200)\
        .perform()
            IWebElement footer = driver.FindElement(By.TagName("footer"));
            var scrollOrigin = new WheelInputDevice.ScrollOrigin
            {
                Element = footer,
                XOffset = 0,
                YOffset = -50
            };
            new Actions(driver)
                .ScrollFromOrigin(scrollOrigin, 0, 200)
                .Perform();
    footer = @driver.find_element(tag_name: 'footer')
    scroll_origin = Selenium::WebDriver::WheelActions::ScrollOrigin.element(footer, 0, -50)
    @driver.action
           .scroll_from(scroll_origin, 0, 200)
           .perform
    Not implemented yet
    Not implemented yet

Scroll from a offset of origin (element) by given amount

The final scenario is used when you need to scroll only a portion of the screen, and it is already inside the viewport.

This uses the “Scroll From” method again, but the viewport is designated instead of an element. An offset is specified from the upper left corner of the current viewport. After the origin point is determined, the page will be scrolled by the provided delta x and delta y values.

Note that if the offset from the upper left corner of the viewport falls outside of the screen, it will result in an exception.

        WheelInput.ScrollOrigin scrollOrigin = WheelInput.ScrollOrigin.fromViewport(10, 10);
        new Actions(driver)
                .scrollFromOrigin(scrollOrigin, 0, 200)
                .perform();
    ActionChains(driver)\
        .scroll_from_origin(scroll_origin, 0, 200)\
        .perform()
            var scrollOrigin = new WheelInputDevice.ScrollOrigin
            {
                Viewport = true,
                XOffset = 10,
                YOffset = 10
            };
            new Actions(driver)
                .ScrollFromOrigin(scrollOrigin, 0, 200)
                .Perform();
    @driver.action
           .scroll_from(scroll_origin, 0, 200)
           .perform
    Not implemented yet
    Not implemented yet