pub struct Dataset {
    name: String,
    annotation: String,
    observations: Vec<Observation>,
    variables: Vec<VarId>,
    index_map: HashMap<ObservationId, usize>,
}

An ordered list of observations for given variables. The order is important for some datasets, for example, to be able to capture time series.

Dataset provides a classical Rust API for modifications. It also manages its observations through an event-based API. However, that API is limited and serves only as an extension to that of the ObservationManager.

Fields§

§name: String

§annotation: String

§observations: Vec<Observation>

List of binarized observations.

§variables: Vec<VarId>

Variables captured by the observations.

§index_map: HashMap<ObservationId, usize>

Index map from observation IDs to their index in vector, for faster searching.
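
The interplay between the ordered observations vector and index_map can be illustrated with a minimal, self-contained sketch. The MiniDataset type, its methods, and the plain String IDs are hypothetical stand-ins chosen for illustration; they are not the crate's actual types or implementation.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a dataset: an ordered list of observation IDs
/// plus a map from each ID to its position in that list.
pub struct MiniDataset {
    observations: Vec<String>,
    index_map: HashMap<String, usize>,
}

impl MiniDataset {
    pub fn new() -> Self {
        MiniDataset {
            observations: Vec::new(),
            index_map: HashMap::new(),
        }
    }

    /// Append an observation ID at the end, rejecting duplicates.
    pub fn push(&mut self, id: &str) -> Result<(), String> {
        if self.index_map.contains_key(id) {
            return Err(format!("Observation `{id}` already exists."));
        }
        // The new element lands at the current length of the vector.
        self.index_map.insert(id.to_string(), self.observations.len());
        self.observations.push(id.to_string());
        Ok(())
    }

    /// Look up the position of an ID through the map, without scanning the vector.
    pub fn index_of(&self, id: &str) -> Option<usize> {
        self.index_map.get(id).copied()
    }
}
```

The vector preserves insertion order (which matters for time series), while the map avoids a linear scan when an observation is addressed by ID.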

Implementations§

source§

impl Dataset

Creating new Dataset instances.

source

pub fn new( name: &str, observations: Vec<Observation>, var_names: Vec<&str>, ) -> Result<Self, String>

Create new dataset from a list of observations and variables.

The length of each observation must match the number of variables. Observation IDs must be valid identifiers and must be unique. The annotation is left empty (for an annotated version, see Dataset::new_annotated).
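
The two structural constraints above (matching lengths and unique IDs) could be checked roughly as follows. This is only a sketch under stated assumptions: the Obs struct and the validate function are hypothetical, values are simplified to u8, and the check that IDs are valid identifiers is left out.

```rust
use std::collections::HashSet;

/// Hypothetical observation: an ID plus one value per variable.
pub struct Obs {
    pub id: String,
    pub values: Vec<u8>,
}

/// Check that every observation has one value per variable
/// and that no observation ID occurs twice.
pub fn validate(observations: &[Obs], var_names: &[&str]) -> Result<(), String> {
    let mut seen = HashSet::new();
    for obs in observations {
        if obs.values.len() != var_names.len() {
            return Err(format!(
                "Observation `{}` has {} values, expected {}.",
                obs.id,
                obs.values.len(),
                var_names.len()
            ));
        }
        // `insert` returns false when the ID was already in the set.
        if !seen.insert(obs.id.as_str()) {
            return Err(format!("Duplicate observation ID `{}`.", obs.id));
        }
    }
    Ok(())
}
```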

source

pub fn new_annotated( name: &str, annotation: &str, observations: Vec<Observation>, var_names: Vec<&str>, ) -> Result<Self, String>

Create new annotated dataset from a list of observations and variables.

The length of each observation must match the number of variables. Observation IDs must be valid identifiers and must be unique.

source

pub fn new_empty(name: &str, var_names: Vec<&str>) -> Result<Self, String>

Shorthand to create new empty dataset over given variables, with empty annotation.

source

pub fn default(name: &str) -> Dataset

Default dataset instance with no Variables or Observations, and with an empty annotation.

source

fn try_convert_vars(var_names: &[&str]) -> Result<Vec<VarId>, String>

(internal) Try converting variable name string slices into VarIds.

source§

impl Dataset

Editing Dataset instances.

source

pub fn set_name(&mut self, name: &str) -> Result<(), String>

Set dataset’s name.

source

pub fn set_annotation(&mut self, annotation: &str)

Set dataset’s annotation string.

source

pub fn push_obs(&mut self, obs: Observation) -> Result<(), String>

Add observation at the end of the dataset.

The observation must have the same length as the number of the dataset’s variables, and its ID must not already be present in the dataset.

source

pub fn pop_obs(&mut self)

Remove observation from the end of the dataset. If there are no observations, nothing happens.

source

pub fn remove_obs(&mut self, id: &ObservationId) -> Result<(), String>

Remove observation with given ID from the dataset. The ID must be valid.

This operation might be very costly, as we must reindex all subsequent observations.
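
To see where the reindexing cost comes from, consider a simplified sketch that keeps an ordered vector and an ID-to-index map in sync. The function name and the use of plain String IDs are assumptions made for illustration, not the crate's implementation.

```rust
use std::collections::HashMap;

/// Remove the entry with the given ID from an ordered list and repair the
/// ID-to-index map. Every entry after the removed one shifts left by one,
/// so its stored index has to be decremented.
pub fn remove_and_reindex(
    items: &mut Vec<String>,
    index_map: &mut HashMap<String, usize>,
    id: &str,
) -> Result<(), String> {
    let idx = index_map
        .remove(id)
        .ok_or_else(|| format!("Observation `{id}` not found."))?;
    items.remove(idx);
    // Linear pass over the tail of the vector: this is the costly part.
    for item in &items[idx..] {
        if let Some(i) = index_map.get_mut(item) {
            *i -= 1;
        }
    }
    Ok(())
}
```

Removing the last element touches no other index, while removing the first one updates the index of every remaining element.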

source

pub fn insert_obs( &mut self, index: usize, obs: Observation, ) -> Result<(), String>

Insert observation at a given index in the dataset.

This operation might be very costly, as we must reindex all subsequent observations.

source

pub fn remove_var(&mut self, var_id: &VarId) -> Result<(), String>

Remove variable and all the values corresponding to it (decrementing dimension of the dataset in process).

source

pub fn remove_var_by_str(&mut self, id: &str) -> Result<(), String>

Remove variable and all the values corresponding to it (decrementing dimension of the dataset in process).

source

pub fn add_var_default( &mut self, var_id: VarId, index: usize, ) -> Result<(), String>

Add variable at a specific index, and fill its values in all observations with “*” placeholders.
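
A rough sketch of the column insertion, assuming a hypothetical three-valued Value enum whose Any variant plays the role of the “*” placeholder. The free function, its parameters, and the row representation are illustrative only, and every row is assumed to have exactly one value per existing variable.

```rust
/// Hypothetical three-valued entry; `Any` corresponds to the "*" placeholder.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Value {
    True,
    False,
    Any,
}

/// Insert a new variable at `index` and pad every row with `Value::Any`
/// at the same position, so rows and variables stay aligned.
pub fn add_var_default(
    variables: &mut Vec<String>,
    rows: &mut [Vec<Value>],
    name: &str,
    index: usize,
) -> Result<(), String> {
    if index > variables.len() {
        return Err(format!("Index {index} is out of range."));
    }
    if variables.iter().any(|v| v.as_str() == name) {
        return Err(format!("Variable `{name}` already exists."));
    }
    variables.insert(index, name.to_string());
    for row in rows.iter_mut() {
        row.insert(index, Value::Any);
    }
    Ok(())
}
```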

source

pub fn add_var_default_by_str( &mut self, id: &str, index: usize, ) -> Result<(), String>

Add variable at a specific index, and fill its values in all observations with “*” placeholders.

source

pub fn set_observation_raw( &mut self, id: &ObservationId, obs: Observation, ) -> Result<(), String>

Swap the whole observation data for given ID.

source

pub fn swap_obs_values( &mut self, id: &ObservationId, new_values: Vec<VarValue>, ) -> Result<(), String>

Swap value vector for an observation with given ID. The new vector of values must be of the same length as the original.

source

pub fn set_var_id( &mut self, original_id: &VarId, new_id: VarId, ) -> Result<(), String>

Set the id of variable with original_id to new_id.

source

pub fn set_var_id_by_str( &mut self, original_id: &str, new_id: &str, ) -> Result<(), String>

Set the id of variable given by string original_id to new_id.

source

pub fn set_all_variables( &mut self, new_variables_list: Vec<VarId>, ) -> Result<(), String>

Set the list of all variable IDs (essentially renaming some/all of them). The length of the new list must be the same as existing one (only renaming, not adding/removing variables).
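
The length constraint can be pictured with a small sketch. The rename_all function and the String-based variable list are hypothetical; the real method works with VarId values and may perform further checks.

```rust
/// Replace the whole list of variable names, but only if the new list
/// has exactly as many entries as the old one (renaming, not resizing).
pub fn rename_all(variables: &mut Vec<String>, new_names: Vec<String>) -> Result<(), String> {
    if new_names.len() != variables.len() {
        return Err(format!(
            "Expected {} names, got {}.",
            variables.len(),
            new_names.len()
        ));
    }
    *variables = new_names;
    Ok(())
}
```

On a length mismatch the original list is left untouched, so the observations never end up with a different number of columns than there are variables.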

source

pub fn set_all_variables_by_str( &mut self, new_variables_list: Vec<&str>, ) -> Result<(), String>

Set the list with all variable IDs (essentially renaming some/all of them) using string names. The length of the new list must be the same as existing one (only renaming, not adding/removing variables).

source

pub fn set_obs_id( &mut self, original_id: &ObservationId, new_id: ObservationId, ) -> Result<(), String>

Set the id of an observation with original_id to new_id.

source

pub fn set_obs_id_by_str( &mut self, original_id: &str, new_id: &str, ) -> Result<(), String>

Set the id of observation given by string original_id to new_id.

source

pub fn set_obs_name( &mut self, id: &ObservationId, new_name: &str, ) -> Result<(), String>

Set name of a given observation.

source

pub fn set_obs_annot( &mut self, id: &ObservationId, new_annot: &str, ) -> Result<(), String>

Set annotation of a given observation.

source§

impl Dataset

Observing Dataset instances.

source

pub fn get_name(&self) -> &str

Name of the dataset.

source

pub fn get_annotation(&self) -> &str

Annotation string of the dataset.

source

pub fn num_observations(&self) -> usize

Number of observations in the dataset.

source

pub fn num_variables(&self) -> usize

Number of variables tracked by the dataset.

source

pub fn is_valid_variable(&self, var: &VarId) -> bool

Check if variable is tracked in this dataset.

source

pub fn is_valid_obs(&self, id: &ObservationId) -> bool

Check if observation is present in this dataset.

source

pub fn get_obs_on_idx(&self, index: usize) -> Result<&Observation, String>

Observation on given index (indexing starts at 0).

source

pub fn get_obs(&self, id: &ObservationId) -> Result<&Observation, String>

Observation with given ID.

source

pub fn get_obs_id(&self, index: usize) -> &ObservationId

ID of an observation on given index.

source

pub fn get_obs_id_by_str(&self, id: &str) -> Result<ObservationId, String>

Get ObservationId for a corresponding string identifier, if it is valid.

source

pub fn get_obs_index(&self, id: &ObservationId) -> Result<usize, String>

Get index of given observation, or an error (if not present). Indexing starts at 0.

source

pub fn observations(&self) -> &Vec<Observation>

Vector of all observations.

source

pub fn variables(&self) -> &Vec<VarId>

Vector of all variables.

source

pub fn variable_names(&self) -> Vec<String>

Vector of all variable names.

source

pub fn get_var_id(&self, id: &str) -> Result<&VarId, String>

Get VarId for a corresponding string identifier, if it is valid.

source

pub fn get_var_on_idx(&self, index: usize) -> Result<&VarId, String>

Variable on given index.

source

pub fn get_idx_of_var(&self, var_id: &VarId) -> Result<usize, String>

Index of given variable.

source

pub fn to_debug_string(&self, list_all: bool) -> String

Make a string describing this Dataset in a human-readable format. If list_all is set to true, all observation vectors are listed. Otherwise, just a summary is given (number of observations).

This is mainly for debugging purposes, as it differs from classical string serialization.

source

fn assert_no_obs(&self, id: &ObservationId) -> Result<(), String>

(internal) Utility method to ensure there is no observation with given ID yet.

source

fn assert_valid_obs(&self, id: &ObservationId) -> Result<(), String>

(internal) Utility method to ensure there is an observation with given ID.

source

fn assert_no_variable(&self, var_id: &VarId) -> Result<(), String>

(internal) Utility method to ensure there is no variable with given Id yet.

source

fn assert_valid_variable(&self, var_id: &VarId) -> Result<(), String>

(internal) Utility method to ensure there is a variable with given Id.

source§

impl Dataset

Implementation of events related to modifying observations in a particular Dataset. Dataset does not implement the SessionState trait directly. Instead, it only offers methods to perform certain events after the preprocessing is done by the ObservationManager.

source

pub(in sketchbook::observations) fn event_push_observation( &mut self, event: &Event, dataset_id: DatasetId, ) -> Result<Consumed, DynError>

Perform event of adding a new observation to the end of this Dataset.

source

pub(in sketchbook::observations) fn event_push_empty_observation( &mut self, event: &Event, dataset_id: DatasetId, ) -> Result<Consumed, DynError>

Perform event of adding a completely new “empty” observation to the end of this Dataset.

All its values are unspecified and its Id is newly generated.

source

pub(in sketchbook::observations) fn event_pop_observation( &mut self, event: &Event, dataset_id: DatasetId, ) -> Result<Consumed, DynError>

Perform event of removing the last observation from this Dataset.

source

pub(in sketchbook::observations) fn event_modify_observation( &mut self, event: &Event, action: &str, dataset_id: DatasetId, obs_id: ObservationId, ) -> Result<Consumed, DynError>

source§

impl Dataset

Methods for safely generating new valid (unique) instances of identifiers for the current Dataset.

source

pub fn generate_obs_id( &self, ideal_id: &str, start_index: Option<usize>, ) -> ObservationId

Generate valid ObservationId that’s currently not used by any observation in this Dataset.

First, the given ideal_id and its transformation by replacing invalid characters are tried. If they are both invalid (non-unique), a numerical identifier is added at the end. By specifying start_index, the index search starts directly at that number (e.g., when ideal ID is “obs” and start index is 3, search for ID starts with “obs_3”, “obs_4”, …)

Warning: Do not use this to pre-generate more than one id at a time, as the process is deterministic and might generate the same IDs. Always generate an Id, add that observation, and then repeat for other observations.
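
The search procedure and the reason for the warning can be sketched as follows. This is a simplified illustration, not the crate's algorithm: the sanitization rule, the default start index of 1, and the use of a HashSet of taken IDs are all assumptions.

```rust
use std::collections::HashSet;

/// Replace every character that is not ASCII alphanumeric or `_` with `_`.
/// (Assumed sanitization rule, for illustration only.)
fn sanitize(id: &str) -> String {
    id.chars()
        .map(|c| if c.is_ascii_alphanumeric() || c == '_' { c } else { '_' })
        .collect()
}

/// Return an ID not present in `taken`: the sanitized `ideal_id` if it is
/// free, otherwise the first free `{id}_{n}`, counting up from `start_index`
/// (assumed to default to 1).
pub fn generate_id(taken: &HashSet<String>, ideal_id: &str, start_index: Option<usize>) -> String {
    let base = sanitize(ideal_id);
    if !taken.contains(&base) {
        return base;
    }
    let mut n = start_index.unwrap_or(1);
    loop {
        let candidate = format!("{base}_{n}");
        if !taken.contains(&candidate) {
            return candidate;
        }
        n += 1;
    }
}
```

Because the function only reads the set of taken IDs, calling it twice in a row without registering the first result yields the same ID both times, which is exactly the situation the warning describes.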

source

pub fn generate_var_id( &self, ideal_id: &str, start_index: Option<usize>, ) -> VarId

Generate valid VarId that’s currently not used by any variable in this Dataset.

First, the given ideal_id and its transformation by replacing invalid characters are tried. If they are both invalid (non-unique), a numerical identifier is added at the end. By specifying start_index, the index search starts directly at that number (e.g., when ideal ID is “var” and start index is 3, search for ID starts with “var_3”, “var_4”, …)

Warning: Do not use this to pre-generate more than one id at a time, as the process is deterministic and might generate the same IDs. Always generate an Id, add that variable, and then repeat for other variables.

Trait Implementations§

source§

impl Clone for Dataset

source§

fn clone(&self) -> Dataset

Returns a copy of the value.
1.0.0 · source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.
source§

impl Debug for Dataset

source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.
source§

impl Manager for Dataset

source§

fn generate_id<T>( &self, ideal_id: &str, is_taken: &dyn Fn(&Self, &T) -> bool, num_indices: usize, start_index: Option<usize>, ) -> T
where T: FromStr, <T as FromStr>::Err: Debug,

Generate an ID of type T for a certain component of a manager (e.g., generate a VariableId for a Variable in a ModelState).
source§

fn assert_ids_unique_and_used<T>( &self, id_list: &Vec<&str>, assert_id_is_managed: &dyn Fn(&Self, &T) -> Result<(), String>, ) -> Result<(), String>
where T: Eq + Hash + Debug + FromStr, <T as FromStr>::Err: Debug,

Check that the list of (typesafe or string) IDs contains only unique IDs (no duplicates), and check that all of the IDs are already managed by the manager instance (this is important, for instance, when we need to change already existing elements).
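
The combined check (no duplicates, and every ID already managed) might look roughly like this. The function name and the HashSet of known IDs are hypothetical simplifications; the trait method itself is generic over the ID type and takes a callback instead.

```rust
use std::collections::HashSet;

/// Fail if `id_list` contains a duplicate or an ID that is not in `known`.
pub fn assert_unique_and_known(id_list: &[&str], known: &HashSet<&str>) -> Result<(), String> {
    let mut seen = HashSet::new();
    for id in id_list {
        // `insert` returns false when the ID was already seen.
        if !seen.insert(*id) {
            return Err(format!("ID `{id}` is duplicated."));
        }
        if !known.contains(id) {
            return Err(format!("ID `{id}` is not managed."));
        }
    }
    Ok(())
}
```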
source§

fn assert_ids_unique_and_new<T>( &self, id_list: &Vec<&str>, assert_id_is_new: &dyn Fn(&Self, &T) -> Result<(), String>, ) -> Result<(), String>
where T: Eq + Hash + Debug + FromStr, <T as FromStr>::Err: Debug,

Check that the list of (typesafe or string) IDs contains only unique IDs (no duplicates), and check that all of the IDs are NOT yet managed by the manager instance, i.e., they are fresh new values (this is important, for instance, when we need to add several new elements).
source§

impl PartialEq for Dataset

source§

fn eq(&self, other: &Dataset) -> bool

Tests for self and other values to be equal, and is used by ==.
1.0.0 · source§

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.
source§

impl SessionHelper for Dataset

source§

fn starts_with<'a, 'b>( prefix: &str, at_path: &'a [&'b str], ) -> Option<&'a [&'b str]>

A utility function which checks if at_path starts with a specific first segment. If yes, returns the remaining part of the path.
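
A function with this signature could plausibly be written with slice::split_first, as in the sketch below. It is an illustration of the described behavior, not necessarily how the trait implements it.

```rust
/// If the first segment of `at_path` equals `prefix`, return the rest of the path.
pub fn starts_with<'a, 'b>(prefix: &str, at_path: &'a [&'b str]) -> Option<&'a [&'b str]> {
    match at_path.split_first() {
        Some((first, rest)) if *first == prefix => Some(rest),
        _ => None,
    }
}
```

Returning the remaining slice lets a caller peel off one path segment at a time while routing an event to the right component.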
source§

fn matches(expected: &[&str], at_path: &[&str]) -> bool

A utility function which checks if at_path is exactly
source§

fn invalid_path_error_generic<T>(at_path: &[&str]) -> Result<T, DynError>

A utility function which emits a generic “invalid path” error.
source§

fn invalid_path_error_specific<T>( path: &[&str], component: &str, ) -> Result<T, DynError>

A utility function which emits an “invalid path” error mentioning a specific component of the state.
source§

fn clone_payload_str(event: &Event, component: &str) -> Result<String, DynError>

A utility function to get and clone a payload of an event. Errors if payload is empty.
source§

fn assert_path_length( path: &[&str], length: usize, component: &str, ) -> Result<(), DynError>

A utility function to assert that path has a given length, or emit a DynError otherwise.
source§

fn assert_payload_empty(event: &Event, component: &str) -> Result<(), DynError>

A utility function to assert that payload is empty - otherwise, DynError is emitted.
source§

impl Eq for Dataset

source§

impl StructuralPartialEq for Dataset

Auto Trait Implementations§

Blanket Implementations§

source§

impl<T> Any for T
where T: 'static + ?Sized,

source§

fn type_id(&self) -> TypeId

Gets the TypeId of self.
source§

impl<T> Borrow<T> for T
where T: ?Sized,

source§

fn borrow(&self) -> &T

Immutably borrows from an owned value.
source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.
source§

impl<T> CloneToUninit for T
where T: Clone,

source§

unsafe fn clone_to_uninit(&self, dst: *mut T)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dst.
§

impl<Q, K> Equivalent<K> for Q
where Q: Eq + ?Sized, K: Borrow<Q> + ?Sized,

§

fn equivalent(&self, key: &K) -> bool

Checks if this value is equivalent to the given key.
§

impl<Q, K> Equivalent<K> for Q
where Q: Eq + ?Sized, K: Borrow<Q> + ?Sized,

§

fn equivalent(&self, key: &K) -> bool

Compare self to key and return true if they are equal.
source§

impl<Q, K> Equivalent<K> for Q
where Q: Eq + ?Sized, K: Borrow<Q> + ?Sized,

source§

fn equivalent(&self, key: &K) -> bool

Compare self to key and return true if they are equal.
source§

impl<T> From<T> for T

source§

fn from(t: T) -> T

Returns the argument unchanged.

source§

impl<T, U> Into<U> for T
where U: From<T>,

source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

§

impl<T> Pointable for T

§

const ALIGN: usize = _

The alignment of pointer.
§

type Init = T

The type for initializers.
§

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.
§

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.
§

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.
§

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.
source§

impl<T> Same for T

source§

type Output = T

Should always be Self
source§

impl<T> ToOwned for T
where T: Clone,

source§

type Owned = T

The resulting type after obtaining ownership.
source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.
source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.
source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

source§

type Error = Infallible

The type returned in the event of a conversion error.
source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

§

fn vzip(self) -> V

§

impl<T> ErasedDestructor for T
where T: 'static,

§

impl<T> MaybeSendSync for T

§

impl<T> UserEvent for T
where T: Debug + Clone + Send + 'static,