Hello friends,
We are excited to announce a new Ferret release - Ferret v0.13.0.
This release brings new syntax, DOM API updates and some bug fixes. Let’s dive in!
What’s added
While loop
Finally! While loops have landed FQL!
Ferret query language inherently lacked while loops due to its origin from ArangoDB query language. But over time, it’s become obvious that this kind of looping is essential for data scraping.
While loops in FQL are implemented as an extension of FOR IN
loop and come in two modes: while-do
and do-while
:
FOR counter [DO] WHILE condition
RETURN value
while-do
is a traditional while loop where a condition is checked before every iteration. It might be not very useful in web scraping (e.g. pagination), thus we have implemented do-while
where condition checks start to occur on 2nd iteration i.e. there is always at least one iteration.
Since data in FQL is immutable, there are no direct mutations in code which while loops can depend on. Rather while loops depend on checks of external resources like existance of HTML elements or database records.
Ok, enough of theory! Let’s take a look at the query that gets a list of API managment solutions from the GitHub Marketplace:
In this while loop, we are checking if the next button exists (external mutation) but always execute the loop body at least once just in case there is only one page.
The i
variable represents a loop counter that gives information which iteration we are in.
Being an extension of FOR-IN
loop gives us all statements that can be used with regular FOR-IN
loop like FILTER
, LIMIT
and etc.
Computed styles
Until this version, CDP driver returned attribute based styles whenver you accessed them via Element.style
or Element.attributes.style
or STYLE_GET(Element)
.
Starting with this release, Element.style
returns an object containing computed styles while Element.attributes.style
returns an object containg styles defined in style
attribute of an element. STYLE_GET(Element)
returns computed styles as well.
Element’s parent and siblings
With this release, it’s possible to direclty access parents and siblings of elements:
What’s fixed
HTML escaping
Long lasting issue of HTML escaping has been finally fixed.
The problem was that Go’s JSON encoder from the standard library does escaping of HTML strings.