Splash Lua API Overview
Splash provides many methods, functions and properties; all of them are documented in Splash Scripts Reference, Available Lua Libraries, Element Object, Request Object, Response Object and Working with Binary Data. Here is a short description of the most commonly used ones:
Script as an HTTP API endpoint
Each Splash Lua script can be seen as an HTTP API endpoint with input arguments and a structured result value. For example, you can emulate the render.png endpoint with a Lua script, including all of its HTTP arguments.
- splash.args is how input data is passed to the script;
- splash:set_result_status_code changes the HTTP status code of the result;
- splash:set_result_content_type changes the Content-Type returned to the client;
- splash:set_result_header adds custom HTTP headers to the result;
- the Working with Binary Data section describes how to work with non-text data in Splash, e.g. how to return it to the client;
- the treat library lets you customize how data is serialized to JSON when the result is returned.
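As a rough illustration of the endpoint idea, the sketch below behaves a little like a render.html-style endpoint. The `url` and `wait` arguments, the `X-Final-Url` header name and the 400/502 status codes are assumptions made for this example, not something Splash prescribes.

```lua
-- Minimal sketch of a script used as an HTTP endpoint.
-- Assumes the client sends a `url` argument and, optionally, `wait`.
function main(splash, args)
  -- `args` holds the same data as splash.args.
  if not args.url then
    splash:set_result_status_code(400)
    return {error = "missing 'url' argument"}
  end

  local ok, reason = splash:go(args.url)
  if not ok then
    -- 502 is an arbitrary choice for this example.
    splash:set_result_status_code(502)
    return {error = reason}
  end

  -- `wait` may arrive as a string (query string) or a number (JSON body).
  splash:wait(tonumber(args.wait) or 0.5)

  -- Hypothetical custom header carrying the final URL.
  splash:set_result_header("X-Final-Url", splash:url())
  splash:set_result_content_type("text/html; charset=utf-8")
  return splash:html()
end
```

When a table is returned (the error branches here), the result is serialized to JSON; the string returned on success is sent with the content type set above.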
Delays
- splash:wait waits for a specified amount of time;
- splash:call_later schedules a task to run in the future;
- splash:wait_for_resume waits until a certain JS event happens;
- splash:with_timeout limits the time spent in a code block.
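The four delay helpers can be combined in one script. The following is only a sketch: the delays and timeouts are arbitrary values, and the JavaScript snippet simply resumes after a short `setTimeout` to stand in for a real page event.

```lua
-- Sketch combining the delay-related methods; all timings are arbitrary.
function main(splash, args)
  assert(splash:go(args.url))

  -- Schedule a callback; it fires while the script is waiting below.
  local timer_fired = false
  splash:call_later(function()
    timer_fired = true
  end, 0.2)

  -- Fixed delay, in seconds.
  assert(splash:wait(0.5))

  -- Pause until page JavaScript calls splash.resume(), for at most 3 seconds.
  local result, err = splash:wait_for_resume([[
    function main(splash) {
      setTimeout(function () { splash.resume("ready"); }, 100);
    }
  ]], 3)

  -- Give a block of Lua code at most 2 seconds to finish.
  local ok, title = splash:with_timeout(function()
    return splash:evaljs("document.title")
  end, 2)

  return {
    timer_fired = timer_fired,
    resumed_with = result and result.value,
    resume_error = err,
    finished_in_time = ok,
    title = ok and title or nil,
  }
end
```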
Extracting information from a page
- splash:html returns page HTML content, after it is rendered by a browser;
- splash:url returns current URL loaded in the browser;
- splash:evaljs and splash:jsfunc let you extract data from a page using JavaScript;
- splash:select and splash:select_all run CSS selectors against a page; they return Element objects, which have many methods useful for scraping and further processing (see Element Object);
- element:text returns text content of a DOM element;
- element:bounds returns bounding box of an element;
- element:styles returns computed styles of an element;
- element:form_values returns the values of a <form> element;
- many methods and attributes of DOM HTMLElement are supported - see DOM Methods and DOM Attributes.
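A short sketch of how these extraction methods might be used together is shown below. The `h1` and `a` selectors and the names of the returned keys are assumptions chosen for illustration.

```lua
-- Sketch: pull a few pieces of data out of a rendered page.
function main(splash, args)
  assert(splash:go(args.url))
  assert(splash:wait(0.5))

  -- Evaluate a JavaScript expression and get its value in Lua.
  local title = splash:evaljs("document.title")

  -- Wrap a JavaScript function so it can be called like a Lua function.
  local count_matches = splash:jsfunc([[
    function (selector) {
      return document.querySelectorAll(selector).length;
    }
  ]])

  -- splash:select returns nil when nothing matches.
  local heading = splash:select("h1")

  local link_texts = {}
  for i, link in ipairs(splash:select_all("a")) do
    link_texts[i] = link:text()
  end

  return {
    url = splash:url(),
    title = title,
    link_count = count_matches("a"),
    heading_text = heading and heading:text(),
    heading_bounds = heading and heading:bounds(),
    link_texts = link_texts,
  }
end
```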
Screenshots
- splash:png, splash:jpeg - take a PNG or JPEG screenshot;
- splash:set_viewport_full - resize the viewport to fit the whole page; call it before splash:png or splash:jpeg to get a screenshot of the whole page;
- splash:set_viewport_size - change size of the viewport;
- element:png and element:jpeg - take screenshots of individual DOM elements.
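The screenshot methods above could be combined roughly as follows. The viewport size, the JPEG quality and the `img` selector are arbitrary choices for this sketch.

```lua
-- Sketch: viewport, full-page and single-element screenshots.
function main(splash, args)
  assert(splash:go(args.url))
  assert(splash:wait(0.5))

  -- Screenshot of a fixed-size viewport.
  splash:set_viewport_size(1024, 768)
  local visible = splash:png()

  -- Grow the viewport to the whole page, then take a JPEG.
  splash:set_viewport_full()
  local full = splash:jpeg{quality = 80}

  -- Screenshot of one element, if the page has a matching one.
  local image = splash:select("img")

  return {
    visible_png = visible,
    full_jpeg = full,
    first_image_png = image and image:png(),
  }
end
```

Because the screenshots are returned inside a table, the client receives them base64-encoded in the JSON result.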
Interacting with a page
- splash:runjs, splash:evaljs and splash:jsfunc let you run arbitrary JavaScript in the page context;
- splash:autoload preloads JavaScript libraries or executes some JavaScript code at the beginning of each page render;
- splash:mouse_click, splash:mouse_hover, splash:mouse_press, splash:mouse_release send mouse events to specific coordinates on a page;
- element:mouse_click and element:mouse_hover send mouse events to specific DOM elements;
- splash:send_keys and splash:send_text send keyboard events to a page;
- element:send_keys and element:send_text send keyboard events to particular DOM elements;
- you can get the initial <form> values using element:form_values, change them in Lua code, fill the form with the updated values using element:fill and submit it using element:submit;
- splash.scroll_position lets you scroll the page;
- many methods and attributes of DOM HTMLElement are supported - see DOM Methods and DOM Attributes.
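The form workflow described above (read values, change them, fill, submit) might look like the sketch below. The `form` selector, the field name `q`, the `data-visited` attribute and the wait times are hypothetical and depend on the page being scripted.

```lua
-- Sketch: scroll, run JavaScript, then fill in and submit a form.
function main(splash, args)
  assert(splash:go(args.url))
  assert(splash:wait(0.5))

  -- Scroll the page by assigning to the property.
  splash.scroll_position = {x = 0, y = 200}

  -- Run JavaScript in the page context without returning a value.
  assert(splash:runjs("document.body.setAttribute('data-visited', '1')"))

  local form = splash:select("form")
  if not form then
    return {submitted = false, url = splash:url()}
  end

  -- Read the current values, change one, write them back and submit.
  local values = assert(form:form_values())
  values.q = "splash"          -- hypothetical field name
  assert(form:fill(values))
  assert(form:submit())
  assert(splash:wait(1.0))     -- give the submission time to load

  return {submitted = true, url = splash:url()}
end
```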
Making HTTP requests
- splash:http_get - send an HTTP GET request and get a response without loading the page in the browser;
- splash:http_post - send an HTTP POST request and get a response without loading the page in the browser.
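A minimal sketch of both request methods follows. Posting a JSON body back to the same `url` argument is done purely for illustration, and the body content is made up.

```lua
-- Sketch: plain HTTP requests that do not load a page in the browser.
function main(splash, args)
  local got = splash:http_get(args.url)

  local posted = splash:http_post{
    url = args.url,
    headers = {["Content-Type"] = "application/json"},
    body = '{"query": "splash"}',   -- made-up payload
  }

  -- Both calls return Response objects.
  return {
    get_status = got.status,
    get_ok = got.ok,
    post_status = posted.status,
    post_ok = posted.ok,
  }
end
```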
Inspecting network traffic
- splash:har returns all requests and responses in HAR format;
- splash:history returns information about redirects and pages loaded in the main browser window;
- splash:on_request lets you capture requests issued by a webpage and by the script;
- splash:on_response_headers lets you inspect (and optionally drop) responses once headers arrive;
- splash:on_response lets you inspect raw responses as they are received (including content of related resources);
- splash.response_body_enabled enables full response bodies in splash:har and splash:on_response;
- see Response Object and Request Object for more information about Request and Response objects.
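The callbacks above can be registered before loading a page, as in the following sketch. The filtering rules (dropping `.css` requests and image responses) are arbitrary examples, not recommendations.

```lua
-- Sketch: observe and filter network traffic while a page loads.
function main(splash, args)
  -- Keep response bodies so they appear in splash:har and on_response.
  splash.response_body_enabled = true

  -- Example rule: do not request stylesheets at all.
  splash:on_request(function(request)
    if string.find(request.url, ".css", 1, true) then
      request:abort()
    end
  end)

  -- Example rule: drop image responses as soon as headers arrive.
  splash:on_response_headers(function(response)
    local content_type = response.headers["Content-Type"]
    if content_type and string.find(content_type, "image/", 1, true) == 1 then
      response:abort()
    end
  end)

  -- Record the URL and status of every response that is received.
  local seen = {}
  splash:on_response(function(response)
    seen[#seen + 1] = {url = response.url, status = response.status}
  end)

  assert(splash:go(args.url))
  assert(splash:wait(0.5))

  return {
    responses = seen,
    history = splash:history(),
    har = splash:har(),
  }
end
```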
Browsing Options
- splash.js_enabled lets you turn JavaScript support OFF;
- splash.private_mode_enabled lets you turn Private Mode OFF (this is required for some websites because WebKit doesn’t have localStorage available in Private Mode);
- splash.images_enabled lets you turn OFF downloading of images;
- splash.plugins_enabled lets you enable plugins (in the default Docker image it enables Flash);
- splash.resource_timeout lets you drop slow or hanging requests to related resources after a timeout;
- splash.indexeddb_enabled lets you turn IndexedDB ON;
- splash.webgl_enabled lets you turn WebGL OFF;
- splash.html5_media_enabled lets you turn ON HTML5 media (e.g. playback of <video> tags);
- splash.media_source_enabled lets you turn OFF Media Source Extensions API support.
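These options are plain properties on the `splash` object. The sketch below sets a few of them before loading a page; the particular combination and the 10-second timeout are arbitrary choices for illustration.

```lua
-- Sketch: adjust browsing options before loading a page.
function main(splash, args)
  splash.images_enabled = false        -- skip image downloads
  splash.private_mode_enabled = false  -- make localStorage available
  splash.indexeddb_enabled = true      -- turn IndexedDB on
  splash.resource_timeout = 10.0       -- seconds; drop slower resource requests

  assert(splash:go(args.url))
  assert(splash:wait(0.5))
  return splash:html()
end
```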