3D Renderer

TL;DR

  • DeckGL is a great high-level abstraction layer over WebGL and is worth considering for visualizing reasonably large datasets.

  • Processing the data should be offloaded to web workers to avoid blocking the main thread.

  • It is easy to integrate Rust into a web development project using wasm-bindgen.

There are existing 3D renderers. The one most often used in Python-related projects is Plotly. However, not all of the functionality available in Plotly's SVG-based implementation is also available in its WebGL-based implementation. (For reference, WebGL and WebGPU are the browser APIs that enable fast rendering.) Another common library for visualizing data is matplotlib, which can record videos of 3D data. However, I wanted something dynamic, simple and animatable.

Suggested Solutions

The top results when searching for software that does just that are not easily accessible via the browser. ChatGPT suggested Potree, 3D Tiles and Sketchfab, and while those can render 3D points, they did not include the functionality I needed.

So I created my own.

A 3D Renderer Tool for Animations

The web rendering site I created is based on DeckGL. DeckGL was open-sourced by Uber and serves as a great higher-level abstraction layer over WebGL.

The current implementation can be found here.
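GPU-oriented libraries like deck.gl work most efficiently with flat typed arrays rather than arrays of objects, and the WASM clustering described below also takes a flat Float32Array. As a minimal sketch (the helper name and Point3D type are mine, not from the project), packing points might look like this:

```typescript
// Hypothetical helper: flatten an array of {x, y, z} points into the
// interleaved Float32Array layout that deck.gl and the WASM module consume.
type Point3D = { x: number; y: number; z: number };

function packPoints(points: Point3D[]): Float32Array {
  const packed = new Float32Array(points.length * 3);
  points.forEach((p, i) => {
    packed[i * 3] = p.x;
    packed[i * 3 + 1] = p.y;
    packed[i * 3 + 2] = p.z;
  });
  return packed;
}

// A Float32Array also transfers to Web Workers without per-point conversion.
const packed = packPoints([
  { x: 1, y: 2, z: 3 },
  { x: 4, y: 5, z: 6 },
]);
console.log(packed.length); // 6
```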

WebAssembly - Clustering and Convex Hull creation

To show clusters forming in the data I used WebAssembly compiled from Rust. WebAssembly can run significantly faster on complex tasks, and clustering thousands or millions of points is quite resource intensive. The main thread sends the request to WebAssembly through Web Workers inside a React hook. Web Workers run in separate threads, which keeps the UI responsive.

import { useEffect, useRef, useState } from "react";

const useClusteringWorkers = (clusteringData: InstancesData, minClusterSize: number) => {
  const [clusteringResults, setClusteringResults] = useState<ClustersData | undefined>(undefined);
  const [isClusteringInProgress, setIsClusteringInProgress] = useState<boolean>(false);
  const clusteringResultsRef = useRef<ClustersData | undefined>(clusteringResults);
  const cacheRef = useRef<Map<number, ClustersData | undefined>>(new Map());

  useEffect(() => {
    const cachedResults = cacheRef.current.get(minClusterSize);
    if (cachedResults) {
      setClusteringResults(cachedResults);
    } else {
      setIsClusteringInProgress(true);
      const workerInputs = prepareDataForWorkers(clusteringData, minClusterSize, 60);
      const initializedWorkers: Array<Worker> = workerInputs.map((input, index) => {
        const worker = new Worker(new URL("./clusteringWorker.ts", import.meta.url));
        worker.onmessage = (event: MessageEvent) => {
          const { status, result, message } = event.data;
          if (status === "success" && result) {
            const clusteringOutput: ClusteringOutput[] = mapToClusteringOutput(result);
            const clusterData = convertClusteringResultsToClusterData(clusteringOutput);
            setClusteringResults((prevResults) => {
              const newResults = prevResults ? [...prevResults] : [];
              newResults[index] = clusterData;
              // Cache the freshly computed results inside the updater; reading
              // clusteringResultsRef here would store a stale snapshot.
              cacheRef.current.set(minClusterSize, newResults);
              return newResults;
            });
            setIsClusteringInProgress(false);
          } else if (message) {
            console.error(`Error from worker:`, message);
          }
        };

        worker.onerror = (error: ErrorEvent) => {
          console.error(`Worker error:`, error.message);
          setIsClusteringInProgress(false);
        };

        worker.postMessage(input);
        return worker;
      });

      return () => {
        initializedWorkers.forEach((worker) => worker.terminate());
      };
    }
  }, [clusteringData, minClusterSize]);

  useEffect(() => {
    clusteringResultsRef.current = clusteringResults;
  }, [clusteringResults]);

  return { clusteringResults, clusteringResultsRef, isClusteringInProgress };
};

// Maps the raw tuple array returned by the WASM module into named fields.
function mapToClusteringOutput(result: ClusteringOutputArray): ClusteringOutput[] {
  return result.map((r) => ({
    clusterLabel: r[0],
    clusterBoundaries: r[1],
    clusterBoundariesIndices: r[2],
    clusterInstances: r[3],
  }));
}
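The hook calls prepareDataForWorkers, which is not shown above. A plausible sketch under my own assumptions (that InstancesData is a flat Float32Array of coordinate triples, and that the third argument is the number of chunks to fan out) could look like this; the real implementation may differ:

```typescript
// Hypothetical sketch of prepareDataForWorkers: split a flat point buffer
// into chunks so that each Web Worker clusters one slice independently.
function prepareDataForWorkers(
  data: Float32Array,
  minClusterSize: number,
  numChunks: number
): Array<{ points: Float32Array; min_cluster_size: number; min_samples: number }> {
  // Round the chunk size up to a whole number of 3-component points.
  const valuesPerChunk = Math.ceil(data.length / 3 / numChunks) * 3;
  const inputs = [];
  for (let offset = 0; offset < data.length; offset += valuesPerChunk) {
    inputs.push({
      points: data.slice(offset, offset + valuesPerChunk),
      min_cluster_size: minClusterSize,
      min_samples: minClusterSize, // assumption: reuse minClusterSize as min_samples
    });
  }
  return inputs;
}
```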

The worker forwards the request to WebAssembly.

// Type-only import so `typeof cluster_hdbscan` can be referenced below
import type { cluster_hdbscan } from "../../clustering/pkg/clustering";

// The cluster function stays null until the WASM module is initialized
let cluster: typeof cluster_hdbscan | null = null;

// Function to execute clustering with provided data
function executeClustering(data: {
  points: Float32Array;
  min_cluster_size: number;
  min_samples: number;
}) {
  if (!cluster) {
    console.error("Cluster function not initialized.");
    self.postMessage({ status: "error", message: "Cluster function not initialized." });
    return;
  }

  const { points, min_cluster_size, min_samples } = data;
  try {
    const result = cluster(points, min_cluster_size, min_samples);
    self.postMessage({ status: "success", result });
  } catch (error) {
    console.error("Clustering error:", error);
    self.postMessage({ status: "error", message: "Clustering failed." });
  }
}

// Listener for incoming messages to the worker
self.onmessage = (e) => {
  // Check if the cluster function is already initialized
  if (cluster) {
    // Execute clustering with the provided data
    executeClustering(e.data);
  } else {
    // Dynamically import and initialize the WASM module
    import("../../clustering/pkg/clustering")
      .then(({ default: init, cluster_hdbscan }) => {
        init().then(() => {
          cluster = cluster_hdbscan;
          // After initialization, execute clustering with the provided data
          executeClustering(e.data);
        });
      })
      .catch((error) => {
        console.error("Error loading WASM module:", error);
        self.postMessage({ status: "error", message: "Failed to load WASM module." });
      });
  }
};

The clustering works with either two- or three-dimensional data. Once the clusters are formed, a convex hull is created for each one. A convex hull is the smallest convex polygon that encloses all given points with straight line segments, in this case all points in a cluster. This is different from a concave hull, which can include inward-curving segments. wasm-bindgen was used to generate the JavaScript bindings.
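The Rust code below delegates hull construction to the chull crate, but the idea is easy to see in two dimensions. A minimal sketch of Andrew's monotone chain algorithm (in TypeScript here, purely for illustration):

```typescript
// Minimal 2D convex hull via Andrew's monotone chain.
// Returns hull vertices in counter-clockwise order; interior points are dropped.
type Pt = [number, number];

// Cross product of vectors OA and OB; positive means a counter-clockwise turn.
function cross(o: Pt, a: Pt, b: Pt): number {
  return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0]);
}

function convexHull2D(points: Pt[]): Pt[] {
  const pts = [...points].sort((p, q) => p[0] - q[0] || p[1] - q[1]);
  if (pts.length <= 2) return pts;
  // Build one half-hull by discarding points that would create a clockwise turn.
  const build = (input: Pt[]): Pt[] => {
    const hull: Pt[] = [];
    for (const p of input) {
      while (hull.length >= 2 && cross(hull[hull.length - 2], hull[hull.length - 1], p) <= 0) {
        hull.pop();
      }
      hull.push(p);
    }
    hull.pop(); // last point equals the first point of the other half
    return hull;
  };
  // Lower hull (left to right) plus upper hull (right to left).
  return [...build(pts), ...build([...pts].reverse())];
}

// The four corners of a square enclose the interior point:
console.log(convexHull2D([[0, 0], [2, 0], [2, 2], [0, 2], [1, 1]]).length); // 4
```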

use chull::ConvexHullWrapper;
use hdbscan::{Hdbscan, HdbscanHyperParams};
use serde_wasm_bindgen::to_value;
use std::collections::HashMap;
use wasm_bindgen::prelude::*;
use web_sys::console;

#[wasm_bindgen]
pub fn cluster_hdbscan(
    data: Vec<f32>,
    min_cluster_size: usize,
    min_samples: usize,
) -> Result<JsValue, JsValue> {
    console::time_with_label("Total Processing Time");

    // Infer dimensionality from the flat data length. Note: a length divisible
    // by both 2 and 3 (e.g. 12) is ambiguous and is treated as 3D here.
    let data_len = data.len();
    let dimensions = if data_len % 3 == 0 {
        3
    } else if data_len % 2 == 0 {
        2
    } else {
        return Err(JsValue::from_str(
            "Invalid data length; data must be divisible by 2 or 3.",
        ));
    };

    // Split the data into points based on the determined dimensions
    let points: Vec<Vec<f32>> = data
        .chunks(dimensions)
        .map(|chunk| chunk.to_vec())
        .collect();

    let hyper_params = HdbscanHyperParams::builder()
        .min_cluster_size(min_cluster_size)
        .max_cluster_size(points.len() / 2)
        .min_samples(min_samples)
        .build();

    let clusterer = Hdbscan::new(&points, hyper_params);
    let clustering_result: Vec<i32> = clusterer
        .cluster()
        .map_err(|e| JsValue::from_str(&e.to_string()))?;

    // Group points by cluster label for convex hull calculation
    let mut clusters: HashMap<i32, Vec<Vec<f64>>> = HashMap::new();
    let mut cluster_point_indices: HashMap<i32, Vec<usize>> = HashMap::new();

    for (i, &label) in clustering_result.iter().enumerate() {
        if label >= 0 {
            clusters
                .entry(label)
                .or_insert_with(Vec::new)
                .push(points[i].iter().map(|&x| x as f64).collect());
            cluster_point_indices
                .entry(label)
                .or_insert_with(Vec::new)
                .push(i); // Store the index of each point in its respective cluster
        }
    }

    let hulls_info: Vec<_> = clusters
        .into_iter()
        .map(|(label, cluster_points)| {
            match ConvexHullWrapper::try_new(&cluster_points, None) {
                Ok(hull) => {
                    let (vertices, indices) = hull.vertices_indices();
                    let point_indices = cluster_point_indices
                        .get(&label)
                        .cloned()
                        .unwrap_or_default();
                    Some((label, vertices, indices, point_indices)) // Include point indices in the output
                }
                Err(e) => {
                    let error_message = format!("Error calculating convex hull: {:?}", e);
                    console::error_1(&JsValue::from_str(&error_message));
                    None
                }
            }
        })
        .flatten()
        .collect();

    console::time_end_with_label("Total Processing Time");
    // Serialize hulls_info, which now includes point indices, to JsValue to return
    to_value(&hulls_info).map_err(|e| JsValue::from_str(&e.to_string()))
}

The final result is a hook that can simply be called from any React component.

  const { clusteringResults, clusteringResultsRef, isClusteringInProgress } = useClusteringWorkers(
    data.current,
    minClusterSize
  );