Include a basic Rust wrapper for Equi-X and HashX

The idea behind this is that we may want to start exporting more pieces of c-tor as Rust crates so that Arti can perform cross compatibility and comparison testing using Rust tooling.

This turns the 'tor' repo into a Cargo workspace, and adds one crate to start with: "tor-c-equix", rooted in src/ext/equix. This actually includes both Equi-X itself and HashX, since there's less overall duplication if we package these together instead of packaging HashX separately.

This patch adds a basic safe Rust interface, but doesn't expose any additional internals for testing purposes.

No changes to the C code here or the normal Tor build system.

Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2024-09-20 04:12:13 +02:00 · 2023-07-25 19:28:06 -07:00
commit 95bcd17705
parent 1e3b5c94ab
6 changed files with 425 additions and 0 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -47,6 +47,7 @@ core.*
 /.cache

 # /
+/Cargo.lock
 /Makefile
 /Makefile.in
 /aclocal.m4
@@ -71,6 +72,7 @@ core.*
 /stamp-h1
 /tags
 /TAGS
+/target
 /test-driver
 /tor.sh
 /tor.spec
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -0,0 +1,14 @@
+# See doc/HACKING/Rust.md
+#
+# There is no plan to offer a stable Rust API to the C implementation of Tor.
+# This workspace is for wrapper crates that are used internally by Arti for
+# cross-compatibility and comparison testing.
+
+[workspace]
+
+members = [
+    "src/ext/equix",
+]
+
+resolver = "2"
+
--- a/doc/HACKING/Rust.md
+++ b/doc/HACKING/Rust.md
@@ -0,0 +1,43 @@
+# Rust support in C Tor
+
+The [Arti project](https://gitlab.torproject.org/tpo/core/arti) is the team's
+ongoing effort to write a pure-Rust implementation of Tor.
+
+Arti is not yet feature complete but it's in active development. That's where
+you want to go if you're interested in Tor and Rust together.
+
+This document describes something with niche interest: the C implementation of
+Tor can expose Rust crates which are used for internal testing, benchmarking,
+comparison, fuzzing, and so on. This could be useful for comparing the C
+implementation against new Rust implementations, or for simply using Rust
+tooling for writing tests against C.
+
+## Crates
+
+Right now we are only using this mechanism for one crate:
+
+- `tor-c-equix` -- Wraps the `src/ext/equix` module,
+  containing Equi-X and HashX algorithms.
+
+## Stability
+
+This is not a stable API and we have no plans to develop a stable Rust interface
+to the C implementation of Tor.
+
+## Files
+
+We use only a few of the standard Rust file types in order to build our
+wrapper crates. Here's a summary:
+
+- `Cargo.toml` in the repository root defines a Cargo *workspace*. It will
+  list all subdirectories that contain crates with their own `Cargo.toml`.
+- A per-crate `Cargo.toml` defines metadata and dependencies. These crates
+  should all be marked `publish = false`.
+- `build.rs` implements a simple build system that does not interact with
+  autotools. It uses the `cc` and `bindgen` crates to get from `.c`/`.h`
+  files to a static library and matching auto-generated bindings. Prefer to
+  include bindgen wrapper headers inline within `build.rs` instead of adding
+  `.h` files that are only used by the Rust bindings.
+- `lib.rs` publishes the low-level `ffi` interface produced with `cc` and
+  `bindgen`. This is also where we can add any wrappers or additions we want
+  for making the Rust API more convenient.
--- a/src/ext/equix/Cargo.toml
+++ b/src/ext/equix/Cargo.toml
@@ -0,0 +1,23 @@
+# See doc/HACKING/Rust.md
+#
+# This is a low-level Rust wrapper around Equi-X and its embedded copy of
+# HashX, provided for cross-compatibility testing within Arti.
+# This module does not make API stability guarantees.
+
+# Copyright (c) 2020 tevador <tevador@gmail.com>
+# See LICENSE for licensing information
+
+[package]
+name = "tor-c-equix"
+version = "0.1.0"
+edition = "2021"
+license = "LGPL-3.0-only"
+
+publish = false
+
+[build-dependencies]
+bindgen = "0.66.1"
+cc = { version = "1.0", features = ["parallel"] }
+
+[dev-dependencies]
+hex-literal = "0.4.1"
--- a/src/ext/equix/build.rs
+++ b/src/ext/equix/build.rs
@@ -0,0 +1,50 @@
+fn main() {
+    cc::Build::new()
+        .files(vec![
+            "src/context.c",
+            "src/equix.c",
+            "src/solver.c",
+            "hashx/src/blake2.c",
+            "hashx/src/compiler.c",
+            "hashx/src/compiler_a64.c",
+            "hashx/src/compiler_x86.c",
+            "hashx/src/context.c",
+            "hashx/src/hashx.c",
+            "hashx/src/program.c",
+            "hashx/src/program_exec.c",
+            "hashx/src/siphash.c",
+            "hashx/src/siphash_rng.c",
+            "hashx/src/virtual_memory.c",
+        ])
+        // Equi-X always uses HashX size 8 (64-bit output)
+        .define("HASHX_SIZE", "8")
+        // Avoid shared library API declarations, link statically
+        .define("HASHX_STATIC", "1")
+        .define("EQUIX_STATIC", "1")
+        .includes(vec!["include", "src", "hashx/include", "hashx/src"])
+        .compile("equix");
+
+    // Run bindgen to automatically extract types and functions. This time set
+    // HASHX_SHARED and EQUIX_SHARED, so the function symbols are not hidden.
+    let out_path = std::path::PathBuf::from(std::env::var("OUT_DIR").unwrap());
+    bindgen::Builder::default()
+        .header_contents(
+            "wrapper.h",
+            r#"
+                #define HASHX_SIZE 8
+                #define HASHX_SHARED 1
+                #define EQUIX_SHARED 1
+                #include "hashx/include/hashx.h"
+                #include "include/equix.h"
+            "#,
+        )
+        .parse_callbacks(Box::new(bindgen::CargoCallbacks))
+        .default_enum_style(bindgen::EnumVariation::Rust {
+            non_exhaustive: true,
+        })
+        .bitfield_enum(".*_flags")
+        .generate()
+        .unwrap()
+        .write_to_file(out_path.join("bindings.rs"))
+        .unwrap();
+}
--- a/src/ext/equix/src/lib.rs
+++ b/src/ext/equix/src/lib.rs
@@ -0,0 +1,293 @@
+//! Rust wrapper for Equi-X and HashX
+//!
+//! This is a Rust wrapper for the original C implementation of Equi-X and
+//! HashX, as used by the C implementation of Tor. For cross-compatibility
+//! testing conducted by Arti.
+//!
+//! The wrapper statically links with a modified version of the original
+//! implementation by tevador, covered by the LGPL version 3. This modified
+//! codebase is maintained as an ext module within the tor source distribution.
+//!
+//! Equi-X and HashX are `Copyright (c) 2020 tevador <tevador@gmail.com>`.
+//! See `LICENSE` for licensing information.
+//!
+
+pub mod ffi {
+    //! Low-level access to the C API
+
+    #![allow(non_upper_case_globals)]
+    #![allow(non_camel_case_types)]
+    #![allow(non_snake_case)]
+
+    include!(concat!(env!("OUT_DIR"), "/bindings.rs"));
+}
+
+/// Type parameter for [`HashX::new()`]
+pub type HashXType = ffi::hashx_type;
+
+/// Result codes for HashX
+pub type HashXResult = ffi::hashx_result;
+
+/// Configured size of the HashX output. Always 8 in this implementation.
+pub const HASHX_SIZE: usize = ffi::HASHX_SIZE as usize;
+
+/// Output value obtained by executing a HashX hash function
+pub type HashXOutput = [u8; HASHX_SIZE];
+
+/// Safe wrapper around a HashX context
+pub struct HashX(*mut ffi::hashx_ctx);
+
+impl HashX {
+    /// Allocate a new HashX context
+    pub fn new(ht: HashXType) -> Self {
+        let ctx = unsafe { ffi::hashx_alloc(ht) };
+        if ctx.is_null() {
+            panic!("out of memory in hashx_alloc");
+        }
+        Self(ctx)
+    }
+
+    /// Create a new hash function within this context, using the given seed
+    ///
+    /// May fail if the seed is unusable or if a runtime compiler
+    /// error occurs while the interpreter is disabled.
+    #[inline(always)]
+    pub fn make(&mut self, seed: &[u8]) -> HashXResult {
+        unsafe { ffi::hashx_make(self.0, seed.as_ptr() as *const std::ffi::c_void, seed.len()) }
+    }
+
+    /// Check which implementation was selected by `make`
+    #[inline(always)]
+    pub fn query_type(&mut self) -> Result<HashXType, HashXResult> {
+        let mut buffer = HashXType::HASHX_TYPE_INTERPRETED; // Arbitrary default
+        let result = unsafe { ffi::hashx_query_type(self.0, &mut buffer as *mut ffi::hashx_type) };
+        match result {
+            HashXResult::HASHX_OK => Ok(buffer),
+            e => Err(e),
+        }
+    }
+
+    /// Execute the hash function for a given input
+    #[inline(always)]
+    pub fn exec(&mut self, input: u64) -> Result<HashXOutput, HashXResult> {
+        let mut buffer: HashXOutput = Default::default();
+        let result = unsafe {
+            ffi::hashx_exec(
+                self.0,
+                input,
+                &mut buffer as *mut u8 as *mut std::ffi::c_void,
+            )
+        };
+        match result {
+            HashXResult::HASHX_OK => Ok(buffer),
+            e => Err(e),
+        }
+    }
+}
+
+impl Drop for HashX {
+    fn drop(&mut self) {
+        let ctx = std::mem::replace(&mut self.0, std::ptr::null_mut());
+        unsafe {
+            ffi::hashx_free(ctx);
+        }
+    }
+}
+
+/// Option flags for [`EquiX::new()`]
+pub type EquiXFlags = ffi::equix_ctx_flags;
+
+/// A single Equi-X solution
+pub type EquiXSolution = ffi::equix_solution;
+
+/// Flags with additional information about solutions
+pub type EquiXSolutionFlags = ffi::equix_solution_flags;
+
+/// A buffer with space for several Equi-X solutions
+pub type EquiXSolutionsBuffer = ffi::equix_solutions_buffer;
+
+/// Number of indices in a single Equi-X solution
+pub const EQUIX_NUM_IDX: usize = ffi::EQUIX_NUM_IDX as usize;
+
+/// Maximum number of Equi-X solutions we will return at once
+pub const EQUIX_MAX_SOLS: usize = ffi::EQUIX_MAX_SOLS as usize;
+
+impl Default for EquiXSolutionsBuffer {
+    fn default() -> Self {
+        Self {
+            count: 0,
+            flags: ffi::equix_solution_flags(0),
+            sols: [EquiXSolution {
+                idx: [0; EQUIX_NUM_IDX],
+            }; EQUIX_MAX_SOLS],
+        }
+    }
+}
+
+/// Result codes for Equi-X
+pub type EquiXResult = ffi::equix_result;
+
+/// Safe wrapper around an Equi-X context
+pub struct EquiX(*mut ffi::equix_ctx);
+
+impl EquiX {
+    /// Allocate a new Equi-X context
+    pub fn new(flags: EquiXFlags) -> Self {
+        let ctx = unsafe { ffi::equix_alloc(flags) };
+        if ctx.is_null() {
+            panic!("out of memory in equix_alloc");
+        }
+        Self(ctx)
+    }
+
+    /// Verify an Equi-X solution against a particular challenge
+    #[inline(always)]
+    pub fn verify(&mut self, challenge: &[u8], solution: &EquiXSolution) -> EquiXResult {
+        unsafe {
+            ffi::equix_verify(
+                self.0,
+                challenge.as_ptr() as *const std::ffi::c_void,
+                challenge.len(),
+                solution as *const ffi::equix_solution,
+            )
+        }
+    }
+
+    /// Run the solver, returning a variable number of solutions for a challenge
+    #[inline(always)]
+    pub fn solve(&mut self, challenge: &[u8], buffer: &mut EquiXSolutionsBuffer) -> EquiXResult {
+        unsafe {
+            ffi::equix_solve(
+                self.0,
+                challenge.as_ptr() as *const std::ffi::c_void,
+                challenge.len(),
+                buffer as *mut ffi::equix_solutions_buffer,
+            )
+        }
+    }
+}
+
+impl Drop for EquiX {
+    fn drop(&mut self) {
+        let ctx = std::mem::replace(&mut self.0, std::ptr::null_mut());
+        unsafe {
+            ffi::equix_free(ctx);
+        }
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use crate::*;
+    use hex_literal::hex;
+
+    #[test]
+    fn equix_context() {
+        let _ = EquiX::new(EquiXFlags::EQUIX_CTX_TRY_COMPILE | EquiXFlags::EQUIX_CTX_SOLVE);
+        let _ = EquiX::new(EquiXFlags::EQUIX_CTX_SOLVE);
+        let _ = EquiX::new(EquiXFlags::EQUIX_CTX_VERIFY);
+    }
+
+    #[test]
+    fn equix_verify_only() {
+        let mut ctx = EquiX::new(EquiXFlags::EQUIX_CTX_TRY_COMPILE | EquiXFlags::EQUIX_CTX_VERIFY);
+
+        assert_eq!(
+            ctx.verify(
+                b"a",
+                &EquiXSolution {
+                    idx: [0x2227, 0xa173, 0x365a, 0xb47d, 0x1bb2, 0xa077, 0x0d5e, 0xf25f]
+                }
+            ),
+            EquiXResult::EQUIX_OK
+        );
+        assert_eq!(
+            ctx.verify(
+                b"a",
+                &EquiXSolution {
+                    idx: [0x1bb2, 0xa077, 0x0d5e, 0xf25f, 0x2220, 0xa173, 0x365a, 0xb47d]
+                }
+            ),
+            EquiXResult::EQUIX_FAIL_ORDER
+        );
+        assert_eq!(
+            ctx.verify(
+                b"a",
+                &EquiXSolution {
+                    idx: [0x2220, 0xa173, 0x365a, 0xb47d, 0x1bb2, 0xa077, 0x0d5e, 0xf25f]
+                }
+            ),
+            EquiXResult::EQUIX_FAIL_PARTIAL_SUM
+        );
+    }
+
+    #[test]
+    fn equix_solve_only() {
+        let mut ctx = EquiX::new(EquiXFlags::EQUIX_CTX_TRY_COMPILE | EquiXFlags::EQUIX_CTX_SOLVE);
+        let mut buffer = Default::default();
+        assert_eq!(
+            ctx.solve(b"01234567890123456789", &mut buffer),
+            EquiXResult::EQUIX_OK
+        );
+        assert_eq!(buffer.count, 5);
+        assert_eq!(
+            buffer.sols[0].idx,
+            [0x4803, 0x6775, 0xc5c9, 0xd1b0, 0x1bc3, 0xe4f6, 0x4027, 0xf5ad,]
+        );
+        assert_eq!(
+            buffer.sols[1].idx,
+            [0x5a8a, 0x9542, 0xef99, 0xf0b9, 0x4905, 0x4e29, 0x2da5, 0xfbd5,]
+        );
+        assert_eq!(
+            buffer.sols[2].idx,
+            [0x4c79, 0xc935, 0x2bcb, 0xcd0f, 0x0362, 0x9fa9, 0xa62e, 0xf83a,]
+        );
+        assert_eq!(
+            buffer.sols[3].idx,
+            [0x5878, 0x6edf, 0x1e00, 0xf5e3, 0x43de, 0x9212, 0xd01e, 0xfd11,]
+        );
+        assert_eq!(
+            buffer.sols[4].idx,
+            [0x0b69, 0x2d17, 0x01be, 0x6cb4, 0x0fba, 0x4a9e, 0x8d75, 0xa50f,]
+        );
+    }
+
+    #[test]
+    fn hashx_context() {
+        // Context creation should always succeed
+        let _ = HashX::new(HashXType::HASHX_TYPE_INTERPRETED);
+        let _ = HashX::new(HashXType::HASHX_TYPE_COMPILED);
+        let _ = HashX::new(HashXType::HASHX_TRY_COMPILE);
+    }
+
+    #[test]
+    fn bad_seeds() {
+        // Some seed values we expect to fail (and one control).
+        // Also tests query_type while we're here.
+        let mut ctx = HashX::new(HashXType::HASHX_TYPE_INTERPRETED);
+        assert_eq!(ctx.query_type(), Err(HashXResult::HASHX_FAIL_UNPREPARED));
+        assert_eq!(ctx.make(b"qfjsfv"), HashXResult::HASHX_FAIL_SEED);
+        assert_eq!(ctx.query_type(), Err(HashXResult::HASHX_FAIL_UNPREPARED));
+        assert_eq!(ctx.make(b"llompmb"), HashXResult::HASHX_OK);
+        assert_eq!(ctx.query_type(), Ok(HashXType::HASHX_TYPE_INTERPRETED));
+        assert_eq!(ctx.make(b"mhelht"), HashXResult::HASHX_FAIL_SEED);
+        assert_eq!(ctx.query_type(), Err(HashXResult::HASHX_FAIL_UNPREPARED));
+    }
+
+    #[test]
+    fn hash_values() {
+        // Some sample hash values
+        let mut ctx = HashX::new(HashXType::HASHX_TRY_COMPILE);
+        assert_eq!(ctx.make(b"ebrazua"), HashXResult::HASHX_OK);
+        assert_eq!(ctx.exec(0xebc19ba9cafb0863), Ok(hex!("41cb0b4b24551d26")));
+        assert_eq!(ctx.make(b"This is a test\0"), HashXResult::HASHX_OK);
+        assert_eq!(ctx.exec(0), Ok(hex!("2b2f54567dcbea98")));
+        assert_eq!(ctx.exec(123456), Ok(hex!("aebdd50aa67c93af")));
+        assert_eq!(
+            ctx.make(b"Lorem ipsum dolor sit amet\0"),
+            HashXResult::HASHX_OK
+        );
+        assert_eq!(ctx.exec(123456), Ok(hex!("ab3d155bf4bbb0aa")));
+        assert_eq!(ctx.exec(987654321123456789), Ok(hex!("8dfef0497c323274")));
+    }
+}
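
The `HashX` and `EquiX` wrappers in `lib.rs` share one pattern: a newtype owns the raw context pointer, `new()` panics on a null allocation, C result codes are mapped into `Result`, and `Drop` frees the context. The sketch below illustrates that pattern in isolation. It is not part of this commit: `MockCtx`, `MockResult`, `SafeCtx`, `LIVE`, and the `mock_*` functions are hypothetical stand-ins written in Rust so that the example runs without the C sources, and the XOR "hash" is a placeholder, not HashX.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical stand-in for an opaque C context (not Equi-X or HashX).
pub struct MockCtx {
    seed: Option<u64>,
}

/// Hypothetical result codes, in the style of a C enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MockResult {
    Ok,
    FailUnprepared,
}

/// Number of live mock contexts, used to show that `Drop` frees them.
static LIVE: AtomicUsize = AtomicUsize::new(0);

// The next four functions imitate a C alloc/make/exec/free API.
unsafe fn mock_alloc() -> *mut MockCtx {
    LIVE.fetch_add(1, Ordering::SeqCst);
    Box::into_raw(Box::new(MockCtx { seed: None }))
}

unsafe fn mock_make(ctx: *mut MockCtx, seed: u64) {
    unsafe { (*ctx).seed = Some(seed) };
}

unsafe fn mock_exec(ctx: *mut MockCtx, input: u64, out: *mut u8) -> MockResult {
    let seed = unsafe { (*ctx).seed };
    let seed = match seed {
        Some(seed) => seed,
        None => return MockResult::FailUnprepared,
    };
    // Placeholder computation: write 8 bytes into the caller's buffer.
    let bytes = (seed ^ input).to_le_bytes();
    unsafe { std::ptr::copy_nonoverlapping(bytes.as_ptr(), out, bytes.len()) };
    MockResult::Ok
}

unsafe fn mock_free(ctx: *mut MockCtx) {
    if !ctx.is_null() {
        drop(unsafe { Box::from_raw(ctx) });
        LIVE.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Safe wrapper that owns the raw context pointer.
pub struct SafeCtx(*mut MockCtx);

impl SafeCtx {
    /// Allocate a context, panicking if the allocator returns null.
    pub fn new() -> Self {
        let ctx = unsafe { mock_alloc() };
        if ctx.is_null() {
            panic!("out of memory in mock_alloc");
        }
        Self(ctx)
    }

    /// Prepare the context with a seed.
    pub fn make(&mut self, seed: u64) {
        unsafe { mock_make(self.0, seed) }
    }

    /// Run the placeholder function, mapping the result code into `Result`.
    pub fn exec(&mut self, input: u64) -> Result<[u8; 8], MockResult> {
        let mut buffer = [0u8; 8];
        match unsafe { mock_exec(self.0, input, buffer.as_mut_ptr()) } {
            MockResult::Ok => Ok(buffer),
            e => Err(e),
        }
    }
}

impl Drop for SafeCtx {
    fn drop(&mut self) {
        // Null out the stored pointer before freeing, as the wrappers above do.
        let ctx = std::mem::replace(&mut self.0, std::ptr::null_mut());
        unsafe { mock_free(ctx) };
    }
}

fn main() {
    {
        let mut ctx = SafeCtx::new();
        assert_eq!(ctx.exec(1), Err(MockResult::FailUnprepared));
        ctx.make(0x10);
        assert_eq!(ctx.exec(1), Ok(0x11u64.to_le_bytes()));
        assert_eq!(LIVE.load(Ordering::SeqCst), 1);
    }
    // The context was freed when `ctx` went out of scope.
    assert_eq!(LIVE.load(Ordering::SeqCst), 0);
}
```

Taking `&mut self` on every method, as the wrappers in this commit do, means the borrow checker rules out concurrent use of one context through the safe API.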