Working with Binary Data
Motivation
Splash assumes that most strings in a script are encoded in UTF-8. This holds for HTML content: even if the original response was not UTF-8, the browser works with UTF-8 internally, so the splash:html result is always UTF-8.
When you return a Lua table from the main function, Splash encodes it to JSON. JSON is a text format that can’t carry arbitrary binary data, so Splash assumes all strings are UTF-8 when returning a JSON result.
Sometimes, however, a script needs to work with binary data: for example, raw image data returned by splash:png or the response body of a non-UTF-8 page returned by splash:http_get.
Binary Objects
To pass non-UTF-8 data to Splash (returning it as a result of main or passing it as an argument to splash methods), a script may mark it as a binary object using the treat.as_binary function.
Some Splash functions already return binary objects: splash:png and splash:jpeg. The response.body attribute is also a binary object.
A binary object can be returned as the main result directly. This is why the following example works (a basic render.png implementation in Lua):
-- basic render.png emulation
function main(splash)
    assert(splash:go(splash.args.url))
    return splash:png()
end
All binary objects have a content-type attached. For example, the splash:png result has content-type image/png.
When returned directly, the binary object’s data is used as-is for the response body, and the Content-Type HTTP header is set to the content-type of the binary object. So in the previous example the result is a PNG image with a proper Content-Type header.
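The same direct-return behaviour applies to other binary objects. As a minimal sketch, assuming splash.args.url points at a resource to fetch (as in the example above), a script could hand back a response.body obtained through splash:http_get without converting it:

```lua
-- Sketch: return a fetched response body unchanged.
-- Assumes the caller passes a `url` argument, as in the render.png example.
function main(splash)
    local response = splash:http_get(splash.args.url)
    -- response.body is already a binary object, so it can be returned
    -- directly without wrapping it in treat.as_binary
    return response.body
end
```

Because the body is never treated as a UTF-8 string, this should also work for pages in other encodings and for non-text resources.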
To construct your own binary objects, use the treat.as_binary function. For example, let’s return a 1x1px black GIF image as a response:
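local treat = require("treat")
local base64 = require("base64")

function main(splash)
    -- base64 of a 1x1 GIF; the leading "R0lGODlh" decodes to the "GIF89a" signature
    local gif_b64 = "R0lGODlhAQABAIAAAAAAAAAAACH5BAAAAAAALAAAAAABAAEAAAICTAEAOw=="
    local gif_bytes = base64.decode(gif_b64)
    -- wrap the raw bytes so they are returned as-is with Content-Type: image/gif
    return treat.as_binary(gif_bytes, "image/gif")
end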
When the main result is returned, the binary object’s content-type takes priority over a value set by splash:set_result_content_type. To override the content-type of a binary object, create another binary object with the required content-type:
local treat = require("treat")

function main(splash)
    -- ...
    local img = splash:png()
    return treat.as_binary(img, "image/x-png")  -- default was "image/png"
end
When a binary object is serialized to JSON, it is automatically encoded to base64 first. For example, this happens when a table is returned as the main function result:
function main(splash)
    assert(splash:go(splash.args.url))
    -- result is a JSON object {"png": "...base64-encoded image data"}
    return {png=splash:png()}
end