Hooray, Ferret v0.8 has been released!
It’s been a while since the last release, but we worked hard to bring you a new and better Ferret. This release has many exciting new features, but unfortunately, there are also some breaking changes.
You can find the full changelog here.
Ferret finally supports iframes.
When a page gets loaded, Ferret finds all available iframe elements and provides access to them via the .frames property of a page object.
Here’s an example of how to find a target frame:
Alternatively, you can filter frames by URL, or access a target iframe by index if you know its position.
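The three lookup strategies mentioned above (by name, by URL, by index) can be modeled in a few lines of Go. This is only a sketch of the selection logic: the `Frame` struct and the helper functions are hypothetical and are not Ferret types or part of its API.

```go
package main

import (
	"fmt"
	"strings"
)

// Frame is a hypothetical stand-in for a frame document (not a Ferret type).
type Frame struct {
	Name string
	URL  string
}

// FindByName returns the first frame with the given name.
func FindByName(frames []Frame, name string) (Frame, bool) {
	for _, f := range frames {
		if f.Name == name {
			return f, true
		}
	}
	return Frame{}, false
}

// FilterByURL returns every frame whose URL contains the given substring.
func FilterByURL(frames []Frame, part string) []Frame {
	var out []Frame
	for _, f := range frames {
		if strings.Contains(f.URL, part) {
			out = append(out, f)
		}
	}
	return out
}

// AtIndex returns the frame at position i, guarding against out-of-range access.
func AtIndex(frames []Frame, i int) (Frame, bool) {
	if i < 0 || i >= len(frames) {
		return Frame{}, false
	}
	return frames[i], true
}

func main() {
	// Example data invented for this sketch.
	frames := []Frame{
		{Name: "main", URL: "https://example.com/"},
		{Name: "result", URL: "https://example.com/embed/result"},
	}

	if f, ok := FindByName(frames, "result"); ok {
		fmt.Println("by name:", f.URL)
	}
	fmt.Println("by url:", len(FilterByURL(frames, "/embed/")))
	if f, ok := AtIndex(frames, 0); ok {
		fmt.Println("by index:", f.Name)
	}
}
```

Lookup by index is the most fragile of the three, since the frame order can change whenever the page layout changes.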
With this release, we introduce a new language feature: namespaces.
Namespaces allow library authors (and us) to isolate functions into dedicated subsections.
Here is an example:
To namespace a function, use the new namespace method. The namespace method is chainable:
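To illustrate how a chainable namespace method can work, here is a minimal, self-contained Go sketch. The `Namespace` type, its methods, the `Function` signature, and the `::` separator are all assumptions made for this illustration. They are not Ferret's actual API.

```go
package main

import (
	"fmt"
	"strings"
)

// Function is a hypothetical function signature used only in this sketch.
type Function func(args ...string) string

// Namespace is a hypothetical registry node. All nodes created from the
// same root share one function table and differ only in their path.
type Namespace struct {
	path  []string
	funcs map[string]Function
}

// NewRootNamespace creates an empty root namespace.
func NewRootNamespace() *Namespace {
	return &Namespace{funcs: make(map[string]Function)}
}

// Namespace returns a child namespace. Returning *Namespace is what makes
// the call chainable: root.Namespace("a").Namespace("b").
func (ns *Namespace) Namespace(name string) *Namespace {
	child := make([]string, 0, len(ns.path)+1)
	child = append(child, ns.path...)
	child = append(child, strings.ToUpper(name))
	return &Namespace{path: child, funcs: ns.funcs}
}

// RegisterFunction stores fn under its fully qualified name,
// for example "UTILS::STRINGS::SHOUT".
func (ns *Namespace) RegisterFunction(name string, fn Function) error {
	parts := append(append([]string{}, ns.path...), strings.ToUpper(name))
	full := strings.Join(parts, "::")
	if _, exists := ns.funcs[full]; exists {
		return fmt.Errorf("function already registered: %s", full)
	}
	ns.funcs[full] = fn
	return nil
}

// Lookup finds a function by its fully qualified name.
func (ns *Namespace) Lookup(full string) (Function, bool) {
	fn, ok := ns.funcs[strings.ToUpper(full)]
	return fn, ok
}

func main() {
	root := NewRootNamespace()

	// Chain two Namespace calls, then register a function in the result.
	err := root.Namespace("utils").Namespace("strings").RegisterFunction(
		"shout",
		func(args ...string) string {
			return strings.ToUpper(strings.Join(args, " ")) + "!"
		},
	)
	if err != nil {
		panic(err)
	}

	if fn, ok := root.Lookup("UTILS::STRINGS::SHOUT"); ok {
		fmt.Println(fn("hello", "ferret")) // HELLO FERRET!
	}
}
```

Because each call returns a new node that shares the same function table, nested namespaces never collide with functions registered at the root.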
A good web scraping tool needs XPath support, and Ferret finally has it!
Ferret provides a simple interface to the XPath engine for both drivers, CDP and HTTP.
It automatically detects the output value type and deserializes it accordingly.
These two queries will return two different types:
- Returns an array of serialized elements (their inner HTML)
- Returns a number indicating how many “div” elements are on the page.
Regular expression operator
This release provides a shorthand for using regexp assertions:
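Conceptually, a regexp assertion evaluates to a boolean: does the value on the left match the pattern on the right? The Go sketch below shows that evaluation, assuming matching semantics similar to Go's standard `regexp` package. The helper names `Matches` and `NotMatches` are hypothetical and are not Ferret functions.

```go
package main

import (
	"fmt"
	"regexp"
)

// Matches reports whether value matches pattern.
// An invalid pattern is returned as an error instead of causing a panic.
func Matches(value, pattern string) (bool, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return false, err
	}
	return re.MatchString(value), nil
}

// NotMatches is the negated form of Matches.
func NotMatches(value, pattern string) (bool, error) {
	ok, err := Matches(value, pattern)
	if err != nil {
		return false, err
	}
	return !ok, nil
}

func main() {
	ok, err := Matches("ferret", "^f[a-z]+t$")
	fmt.Println(ok, err) // true <nil>

	ok, err = NotMatches("ferret", "^[0-9]+$")
	fmt.Println(ok, err) // true <nil>
}
```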
New functions to manipulate DOM
There are some cases when you might need to change the existing DOM. To help with that, we added the
In this release, you can override the default viewport values in headless mode.
Better emulation of user interaction
This is a big change in how Ferret handles page interactions.
Now Ferret interacts with pages in a more advanced way: your script can scroll up or down to an element, move the mouse, focus, and type, all with random delays. Just like a real person!
There are many other small changes here and there, like adding
DECODE_URI_COMPONENT functions; improving performance; and changing the internal design of some parts of the system.
We try to maintain backwards compatibility, but some of the new features required serious design changes that break compatibility with previous versions. As we approach the v1.0 release, the API is becoming more stable and will require fewer dramatic changes.
Virtual DOM structure
iframe support required us to redesign the structure of the virtual DOM by introducing a top-level entity called
HTMLDocument contained the open page, but
iframe nodes introduce the need to have multiple documents representing each node. This led to a new entity in the structure.
Because of the changes in the Virtual DOM structure, the driver API has changed as well to stay consistent with it.
LoadDocumentParams are renamed to
As part of API stabilization and consistency, there are some other minor changes in vDOM elements, such as an extra return value (usually an error) or a
Get prefix in some methods.