Si Wei's Portfolio

Sharing JSON data without SQL

18 min read

Introduction

In June, I built a route planner and sharing webapp, inspired by Strava's premium Routes feature. Users can add waypoints while the system finds the shortest paths between them and calculates the total distance. Locally, users can create multiple routes, then manage and edit them however they like.

I wanted to add a sharing feature so that users can send their pre-created routes to friends. The shared route must have the exact same markers and path as the original. To keep things simple, I didn't want to implement an account system and a database to store all this information.

Through some fiddling, I managed to achieve this with a URL shortlink and just a Redis database! The result is an incredibly efficient and fast process for creating and retrieving shareable JSON data. Here is how I did it.

Example JSON

Within my app, here is how each route is represented on the frontend:

interface RoutePrimitive {
  id: string;
  name: string;
  latlngs: [number, number][]; // LatLng of a specific point in a generated polyline path
  markerPos: [number, number][]; // LatLng of a marker placed by user
  distance: number;
  type: "bicycle" | "foot" | "mixed";
  short: string;
}

An example route that has 3 points would look like this:

{
  id: "123",
  name: "Example Route",
  latlngs: [
    [1.0, 2.0],
    [2.0, 3.0],
    [3.0, 4.0],
  ],
  markerPos: [
    [1.0, 2.0],
    [3.0, 4.0],
  ],
  distance: 1230,
  type: "foot",
  short: "",
}

I can shrink the shared data down to 4 properties: latlngs, markerPos, distance, and type. Since these properties require a third-party API to calculate, recomputing them would incur additional latency each time someone accesses the shared link. It is more efficient to encode this data into a string that can be decoded whenever it is requested. The rest of the data, like name, is trivial and can be generated on the fly by my own backend.

Final JSON to be encoded/decoded throughout my entire application:

{
  latlngs: [
    [1.0, 2.0],
    [2.0, 3.0],
    [3.0, 4.0],
  ],
  markerPos: [
    [1.0, 2.0],
    [3.0, 4.0],
  ],
  distance: 1230,
  type: "foot",
}

Encoding with Rust

From here on, we assume that the JSON has been sent to my Rust backend through a POST request. json_obj will refer to that piece of data above.

To keep the code simple, I'm not doing any error handling here and just unwrapping every value, but proper error handling is highly recommended.

use serde::Deserialize;

#[derive(Deserialize, Debug)]
#[serde(rename_all = "camelCase")] // maps the JSON's markerPos key to marker_pos
struct Route {
    latlngs: Vec<(f64, f64)>,
    marker_pos: Vec<(f64, f64)>,
    #[serde(rename = "type")] // `type` is a reserved keyword in Rust
    route_type: String,
    distance: f64,
}

let json_obj = Route {
    latlngs: vec![(1.0, 2.0), (2.0, 3.0), (3.0, 4.0)],
    marker_pos: vec![(1.0, 2.0), (3.0, 4.0)],
    route_type: String::from("foot"),
    distance: 1230.0,
};

The intuitive way would be to stringify the JSON, send it back, and use that as our shortlink. However, when decoding shortlinks, parsing the string as one entire JSON object invites errors such as:

  1. An additional comma at the end of an array field
  2. Use of single quotes instead of double quotes

Therefore, I prefer to destructure the JSON into individual fields and encode each of them separately. Then, I join the encoded strings together and return the result as a shortlink. When I want to decode, I simply decode each field and build the JSON up again. This is possible since I know all the JSON keys beforehand.

let latlngs = json_obj.latlngs;
let marker_pos = json_obj.marker_pos;
let route_type = json_obj.route_type;
let distance = json_obj.distance;

In this example, I'm using Base64 over URL encoding, since it results in a shorter and more elegant encoding.

use base64::{alphabet, engine::{general_purpose, GeneralPurpose}, Engine};

const B64_ENGINE: GeneralPurpose =
    GeneralPurpose::new(&alphabet::URL_SAFE, general_purpose::NO_PAD);
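
For a quick sense of the difference, here is a comparison on the sample coordinate string we'll build in a moment; the urlencoding crate is just one way to percent-encode and is an assumption on my part:

// Percent-encoding inflates every comma and semicolon into three characters,
// while Base64 grows the whole payload by a flat ~33%.
let raw = "1,2;2,3;3,4";
let percent_encoded = urlencoding::encode(raw); // 1%2C2%3B2%2C3%3B3%2C4 (21 chars)
let base64_encoded = B64_ENGINE.encode(raw);    // MSwyOzIsMzszLDQ (15 chars)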

The easier fields to encode are strings and numbers, since they have no special formatting to conform to. To encode in Base64, the data has to be representable as bytes.

let route_type_bytes = route_type.as_bytes(); // [102, 111, 111, 116]
let distance_bytes = distance.to_le_bytes(); // [0, 0, 0, 0, 0, 56, 147, 64]
let encoded_type = B64_ENGINE.encode(route_type_bytes); // Zm9vdA
let encoded_distance = B64_ENGINE.encode(distance_bytes); // AAAAAAA4k0A

For the coordinate fields, latlngs and marker_pos, it is slightly trickier. The tuples within the vectors are not directly representable as bytes (or rather, there isn't a built-in implementation in Rust).

Therefore, we need to transform the coordinates fields into a byte-representable data type, such as a string, before encoding that.

Within a tuple (in this case of fixed size 2), we can join the values with a comma, then join the tuples with a semicolon.

[(1.0, 2.0), (2.0, 3.0)] -> "1,2;2,3" (Rust's Display for f64 drops the trailing .0)

In code, this will look something like:

// Converts a vector of (f64, f64) into a Base64 string of latlngs in the format `1,2;3,4`
fn encode_coordinates(ll: &Vec<(f64, f64)>) -> String {
    let coordinate_str = ll.iter()
                           .map(|(lat, lng)| format!("{},{}", lat, lng))
                           .collect::<Vec<_>>()
                           .join(";");
    let encoded_data = B64_ENGINE.encode(coordinate_str);
    encoded_data
}

let encoded_latlngs = encode_coordinates(&latlngs); // MSwyOzIsMzszLDQ

This is still not enough. Until now, we have been using simple examples: a coordinate vector with two or three tuples. In reality, there would be thousands of coordinates, and each value in the tuple would carry at least 5 decimal places.

let actual_latlngs = latlngs.repeat(5000); // Expanded 5000 times
let encoded_latlngs = encode_coordinates(&latlngs);
let encoded_actual_latlngs = encode_coordinates(&actual_latlngs);

println!("{}", encoded_latlngs.len()); // 15
println!("{}", encoded_actual_latlngs.len()); // 79999

When the coordinate vector grows, the encoding grows proportionally with it. This does not scale, since URLs are only reliably supported up to 2048 characters.

We can use a compression library to shrink the string before we encode it.

use std::io::Write;

use flate2::{write::ZlibEncoder, Compression};

fn encode_coordinates(coordinates: &Vec<(f64, f64)>) -> String {
    // Convert coordinates to a string
    let coordinate_str = coordinates
        .iter()
        .map(|(lat, lng)| format!("{},{}", lat, lng))
        .collect::<Vec<_>>()
        .join(";");

    // Compress the string with Zlib
    let mut encoder = ZlibEncoder::new(Vec::new(), Compression::default());
    encoder.write_all(coordinate_str.as_bytes()).unwrap();
    let compressed_data = encoder.finish().unwrap();

    // Base64-encode the compressed bytes
    let encoded_data = B64_ENGINE.encode(compressed_data);

    encoded_data
}

Now, the encoding should be much smaller.

let encoded_latlngs = encode_coordinates(&latlngs);
let encoded_actual_latlngs = encode_coordinates(&actual_latlngs);

println!("{}", encoded_latlngs.len()); // 26
println!("{}", encoded_actual_latlngs.len()); // 200

Although the short string increased in length (from 15 to 26 characters), the difference is negligible. More importantly, when the data was expanded 5000 times, the compressed encoding grew less than 10 times (from 26 to 200 characters).

Of course, this still isn't fully scalable and cannot be used directly as a shortlink (in my case at least). But more on that in a bit...

Now, we can just join all the encoded fields together as a final string.

let encoded_type = B64_ENGINE.encode(route_type_bytes); // Zm9vdA
let encoded_distance = B64_ENGINE.encode(distance_bytes); // AAAAAAA4k0A
let encoded_latlngs = encode_coordinates(&latlngs); // eJwz1DGyNtIxtjbWMQEADO4CKg
let encoded_markers = encode_coordinates(&marker_pos); // eJwz1DGyNtYxAQAFcQFe

let result = vec![encoded_markers, encoded_latlngs, encoded_distance, encoded_type].join("&");
// eJwz1DGyNtYxAQAFcQFe&eJwz1DGyNtIxtjbWMQEADO4CKg&AAAAAAA4k0A&Zm9vdA

Each of the 4 fields is separated by an & character. The ampersand works as a separator because it never appears in the URL-safe Base64 alphabet, so splitting on it is unambiguous (just remember it must be percent-encoded as %26 when placed inside a query string).
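
Putting everything together, we can wrap the steps above into a single encode_route function, which I'll reuse in the Redis section later. A minimal sketch, assuming the B64_ENGINE constant and the compressing encode_coordinates from earlier:

// Composes the per-field encoders into one shortlink-ready string.
// Field order is markers, latlngs, distance, type — matching the decoder below.
fn encode_route(route: &Route) -> String {
    let encoded_markers = encode_coordinates(&route.marker_pos);
    let encoded_latlngs = encode_coordinates(&route.latlngs);
    let encoded_distance = B64_ENGINE.encode(route.distance.to_le_bytes());
    let encoded_type = B64_ENGINE.encode(route.route_type.as_bytes());

    [encoded_markers, encoded_latlngs, encoded_distance, encoded_type].join("&")
}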

Decoding with Rust

Now that we have an encoded string, we can decode it anytime we want and get the original JSON back. The decode function is just the reverse.

We split the string into 4 parts that represent the individual fields, and decode them separately.

let components = to_decode.split("&").collect::<Vec<_>>();
let decode_markers = components.get(0).unwrap();
let decode_latlngs = components.get(1).unwrap();
let decode_distance = components.get(2).unwrap();
let decode_type = components.get(3).unwrap();

For the distance and type, we applied no transformation beyond Base64, so decoding them is simple, with just a few extra steps.

fn decode_f64(f64_str: &str) -> f64 {
    let decoded_bytes_vec = B64_ENGINE.decode(f64_str).unwrap();
    let decoded_bytes_arr: [u8; 8] = decoded_bytes_vec[..8].try_into().unwrap();
    let result = f64::from_le_bytes(decoded_bytes_arr);

    result
}

The decode method of the engine returns a vector of u8 integers. We only want the first 8 bytes, since our distance is represented in 64 bits. Then we parse them in little-endian byte order.

fn decode_str(r_str: &str) -> String {
    let decoded_str = B64_ENGINE.decode(r_str).unwrap();
    let result = String::from_utf8(decoded_str).unwrap();

    result
}

Decoding the string field works the same way.

Now, decoding a coordinate string is much more complex, since we transformed it quite a bit. To recap, here's what we did to encode it.

  1. Convert the vector of coordinate tuples into a string.
  2. Compress the string using Zlib (or any compression library) into a vector of u8.
  3. Encode the compressed bytes using Base64.

So, to decode it, we just have to reverse the process.

  1. Decode the Base64 string back into compressed bytes.
  2. Decompress the bytes using Zlib into a string.
  3. Transform the string into a vector of coordinate tuples.

use std::io::Read;

use flate2::read::ZlibDecoder;

fn decode_coordinates(ll: &str) -> Vec<(f64, f64)> {
    // Step 1:
    let compressed_bytes = B64_ENGINE.decode(ll).unwrap();

    // Step 2:
    let mut decoder = ZlibDecoder::new(&compressed_bytes[..]);
    let mut decompressed_bytes = Vec::new();
    decoder.read_to_end(&mut decompressed_bytes).unwrap();
    let coord_str = String::from_utf8(decompressed_bytes).unwrap();

    // Step 3:
    let result = coord_str
        .split(";")
        .filter_map(|x| {
            let parts = x.split(",").collect::<Vec<_>>();
            if parts.len() != 2 {
                return None;
            }
            let lat = parts[0].parse::<f64>().ok()?;
            let lng = parts[1].parse::<f64>().ok()?;
            Some((lat, lng))
        })
        .collect::<Vec<(f64, f64)>>();

    result
}

Once we decode all fields, we rebuild the JSON from the ground up.

fn decode_route(to_decode: &str) -> Route {
    let components = to_decode.split("&").collect::<Vec<_>>();

    let decode_markers = components.get(0).unwrap();
    let decode_latlngs = components.get(1).unwrap();
    let decode_distance = components.get(2).unwrap();
    let decode_type = components.get(3).unwrap();

    let marker_pos = decode_coordinates(decode_markers);
    let latlngs = decode_coordinates(decode_latlngs);
    let distance = decode_f64(decode_distance);
    let route_type = decode_str(decode_type);

    Route {
        marker_pos,
        latlngs,
        distance,
        route_type,
    }
}

So, to share routes, we can send a GET request to the server with the encoded string as a query parameter. The server then decodes it and sends the Route JSON back in the response.
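
To make this concrete, here is what such an endpoint could look like. This is only a sketch using axum (the framework choice is an assumption on my part, as is the data parameter name), and it assumes Route also derives serde::Serialize:

use axum::{extract::Query, routing::get, Json, Router};
use serde::Deserialize;

#[derive(Deserialize)]
struct ShareQuery {
    data: String, // hypothetical parameter: GET /share?data=<encoded string>
}

// The & separators inside the encoding must arrive percent-encoded as %26;
// the Query extractor decodes them back into the raw string automatically.
async fn share(Query(q): Query<ShareQuery>) -> Json<Route> {
    Json(decode_route(&q.data))
}

let app: Router = Router::new().route("/share", get(share));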

The Case for a Redis Database

Earlier, I mentioned that encoding coordinates alone is not scalable and not feasible for my application. The problem shows up with large JSON data, where the encoding exceeds 2048 characters. However, if you are sure your JSON data is small and its encoding is guaranteed not to exceed the limit, you can stop at the previous step.

In my application, users are free to add any number of coordinates and share their routes. Even though compression has reduced the size of the encoding significantly, there will be cases where it still hits the URL limit of 2048 characters.

This is when we need to bring in a Redis store. What is Redis and how does it help?

Redis is an in-memory key-value data store that is incredibly fast at setting and retrieving data. It functions like a giant hashmap that outlives any single request, and we are going to take advantage of this to shorten our encoding further.

Essentially, after encoding a given JSON, we generate a fixed-length shortlink that corresponds to that encoding, and cache the mapping in Redis.

Then, whenever the server receives a request with that shortlink as the payload, it queries the Redis store to retrieve the associated encoding. There isn't an algorithm that shortens the encoding; we simply define a shortlink for it ourselves, one that is exclusive to our application.

To generate a reliable shortlink with a negligible chance of collisions, we can use a UUID.

We instantiate a client instance that can be reused. On a real server, it would be better to use a pool of connections to handle multiple requests concurrently; the code below simply demonstrates the basics of using Redis.

use redis;

let client = redis::Client::open("redis://127.0.0.1/").unwrap();
let mut con = client.get_connection().unwrap();

The con variable can now issue the essential SET and GET commands of Redis.

use redis::Commands;
use uuid::Uuid;

let id = Uuid::new_v4().to_string();
let encoding = encode_route(&json_obj);
let _: () = con.set(&id, encoding).unwrap();

We generate a unique identifier for the corresponding encoding and cache the pair in the Redis store using con.set(). This tells our application: from now on, any request that refers to this id translates to this encoding.

To retrieve this encoding after setting it, we can do this:

let encoding: String = con.get(&id).unwrap();
let route: Route = decode_route(&encoding);
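
One more note: a plain SET keeps the key around indefinitely. If you would rather have shared routes expire eventually, the redis crate also exposes SET with an expiry; a minimal sketch, where the 30-day TTL is an arbitrary choice of mine:

use redis::Commands;

// SETEX: like SET, but Redis deletes the key automatically after the given
// number of seconds — use this in place of the plain set() call above.
let _: () = con.set_ex(&id, &encoding, 60 * 60 * 24 * 30).unwrap();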

Conclusion

And there you have it! The ability to transmit JSON data through a server without provisioning an SQL database, eliminating unnecessary overhead and latency!

In this post, we went through the many ways we can achieve this.

Firstly, we could directly convert the JSON to a URL-encoded string; however, that is prone to parsing errors. So we opted to deconstruct the JSON into individual fields and encode them instead. Whenever we want the JSON back, we decode the fields and reconstruct it from the ground up.

Then, we realised that the encoding, originally meant to be passed in a URL, could be too large to fit within the URL length limit. This is because the data is freely formed by our users, and there is no way to bound its size ahead of time.

Therefore, we made use of a Redis key-value store to map fixed-length shortlinks to our large encodings. This ensures that every shortlink we send back has a fixed, URL-safe length.

This may not be necessary when the shared data is small enough. Such cases include:

  1. A known number of fields containing primitive data types
  2. A known number of fields containing non-primitive data types that have a fixed or maximum length

At the same time, you could use the RedisJSON module, which would eliminate the need for all this encoding and decoding. It does, however, carry a bit more performance overhead than vanilla Redis. I would definitely recommend it if your JSON data has complex nested objects.
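
For completeness, here is a minimal sketch of that approach using the redis crate's json feature; it assumes Route also derives serde::Serialize and that the Redis server has the RedisJSON module loaded:

use redis::JsonCommands;

// Stores the struct as native JSON under the root path "$".
let _: () = con.json_set(&id, "$", &json_obj).unwrap();
// JSON.GET with the root path returns the value wrapped in a one-element JSON array.
let raw: String = con.json_get(&id, "$").unwrap();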